- 437: data processing, Site russter, Wed 23-May-2007 16:32:21 MDT, covar process dies.
* Mon Apr 30 08:55:47 MDT 2007
reran the covar files starting at the date when the process died (25
April 2007). here's what i did with covar_redo (change the date
to the date you want "covar_redo" to restart from), ie:
cd /usr/local/isff/aster/projects/NIWOT/scripts/
emacs -nw covar_redo
in covar_redo make the line with "set begin" to be: set begin = "2007 apr 25 00:00"
the covar files (before doing anything) are:
dir /usr/local/aster/projects/NIWOT/results/covar/
total 6440
-rw-rw-r-- 1 sburns aster 280488 Apr 23 18:10 nwt.070423.nc
-rw-rw-r-- 1 sburns aster 281892 Apr 25 18:10 nwt.070424.nc
-rw-rw-r-- 1 sburns aster 281860 Apr 26 14:17 nwt.070425.nc
-rw-rw-r-- 1 sburns aster 244196 Apr 26 14:17 nwt.070426.nc
[and then it stops].
so i ran "covar_redo"; now with "check_aster" i get:
-------------- Covar calcs --------------
host user pid start exectime process
russter2 sburns 19971 08:58 pt00:00:05 covar -S -a quacker -f -B 2007 apr
and the files are being created, ie:
-rw-rw-r-- 1 sburns aster 281860 Apr 23 18:10 nwt.070422.nc
-rw-rw-r-- 1 sburns aster 280488 Apr 23 18:10 nwt.070423.nc
-rw-rw-r-- 1 sburns aster 281892 Apr 25 18:10 nwt.070424.nc
-rw-rw-r-- 1 sburns aster 281860 Apr 30 08:58 nwt.070425.nc
-rw-rw-r-- 1 sburns aster 281860 Apr 30 08:59 nwt.070426.nc
-rw-rw-r-- 1 sburns aster 281860 Apr 30 09:00 nwt.070427.nc
-rw-rw-r-- 1 sburns aster 281860 Apr 30 09:00 nwt.070428.nc
-rw-rw-r-- 1 sburns aster 281860 Apr 30 09:00 nwt.070429.nc
-rw-rw-r-- 1 sburns aster 190268 Apr 30 09:00 nwt.070430.nc
now just need to restart the "normal" covar...which is "covar_quacker". So
it shows,
-------------- Covar calcs --------------
host user pid start exectime process
russter2 sburns 32295 14:12 pt00:00:00 covar -S -a quacker
then i get,
-rw-rw-r-- 1 sburns aster 281860 Apr 30 09:00 nwt.070429.nc
-rw-rw-r-- 1 sburns aster 190268 Apr 30 09:10 nwt.070430.nc
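
The restart date for the "set begin" line can be cross-checked against the daily file names. A minimal sketch in Python, assuming the files are always named nwt.YYMMDD.nc and that the begin string has the "2007 apr 25 00:00" form shown above. The helper names (file_date, missing_days, begin_string) are made up for illustration and are not part of the aster software:

```python
import re
from datetime import date, timedelta

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

def file_date(name):
    """Return the date encoded in a name like nwt.070425.nc, or None."""
    m = re.fullmatch(r"nwt\.(\d{2})(\d{2})(\d{2})\.nc", name)
    if m is None:
        return None
    yy, mm, dd = (int(g) for g in m.groups())
    # assumption: two-digit years all fall in 2000-2099
    return date(2000 + yy, mm, dd)

def missing_days(names, last_day):
    """List days with no covar file between the first file and last_day."""
    have = {d for d in map(file_date, names) if d is not None}
    if not have:
        return []
    day = min(have)
    gaps = []
    while day <= last_day:
        if day not in have:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps

def begin_string(day):
    """Format a date the way the "set begin" line is written in covar_redo."""
    return f"{day.year} {MONTHS[day.month - 1]} {day.day:02d} 00:00"

# file names from the listing taken before the rerun
names = ["nwt.070423.nc", "nwt.070424.nc", "nwt.070425.nc", "nwt.070426.nc"]
gaps = missing_days(names, date(2007, 4, 30))
print([g.isoformat() for g in gaps])
print(begin_string(date(2007, 4, 25)))
```

A missing-file check alone would only flag 27-30 April here, since nwt.070425.nc and nwt.070426.nc already existed; the begin date was still set by hand to the day the process died.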
- 341: data processing, Site russter, Wed 11-Jan-2006 16:28:32 MST, Redo covar data on Nov 15th, 2005
Tuesday, November 15th Visit:
-----------------------------
* NOW, i noticed that i had the wrong number of items in the "units"
section of prep.config, ie:
/usr/local/isff/aster/projects/NIWOT/ops1[95]: diff prep.config prep.config~
66c66
< s=quacker:202 c=camp_21x_bin("mv","mv","mv","mv","kOhms","kOhms","sec","percent","C")
---
> s=quacker:202 c=camp_21x_bin("mv","mv","mv","mv","mv","kOhms","kOhms","sec","percent","C")
-rw-rw-r-- 1 sburns aster 6643 Nov 15 10:51 prep.config
-rw-rw-r-- 1 sburns aster 6648 Oct 28 14:20 prep.config~
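
A quick way to confirm the unit count on each side of the diff. This is a minimal sketch in Python that only counts the quoted strings inside camp_21x_bin(...); it does not know how many channels the logger really reports, so the "right" number still has to come from the logger program. The function name count_units is made up for illustration:

```python
import re

def count_units(config_line):
    """Count the quoted unit strings inside camp_21x_bin(...)."""
    m = re.search(r"camp_21x_bin\((.*)\)", config_line)
    if m is None:
        raise ValueError("no camp_21x_bin(...) call found")
    return len(re.findall(r'"[^"]*"', m.group(1)))

# the two lines from the diff above: prep.config (new) and prep.config~ (old)
new_line = 's=quacker:202 c=camp_21x_bin("mv","mv","mv","mv","kOhms","kOhms","sec","percent","C")'
old_line = 's=quacker:202 c=camp_21x_bin("mv","mv","mv","mv","mv","kOhms","kOhms","sec","percent","C")'
print(count_units(new_line), count_units(old_line))
```

This gives 9 units for the current prep.config and 10 for the older prep.config~, which matches the 5-byte difference in file size (one extra "mv", entry).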
* for some reason the chan202 parameters were all coming out as NaN in
the 5-min covar data files (even for days when the communication was
working). So, i tried the following:
Tue Nov 15 10:30:52 MST 2005
cd /usr/local/isff/aster/projects/NIWOT/scripts/
more covar_redo
#!/bin/csh -f
#
# Rerun covar
#
#
#nice +20
set script = $0
set script = $script:t
setenv OPS `getops now`
if ( ! $?PROJECT || ! $?OPS ) then
echo "Error $0 : PROJECT/OPS environment variables are not set!"
echo "Do set_project."
exit 1
endif
# setenv COVARDIR $ASTER/projects/$PROJECT/results/covar
echo "PROJECT=$PROJECT OPS=$OPS"
set begin = "2005 jun 10 00:00"
covar -S -a quacker -f -B "$begin" &
/usr/local/isff/aster/projects/NIWOT/scripts[61]: covar_redo
PROJECT=NIWOT OPS=ops1
[1] 17691
/usr/local/isff/aster/projects/NIWOT/scripts[62]: OPS=ops1
- 290: data system, Site russter, Fri 24-Jun-2005 16:10:16 MDT, How to restart covar process on russter2.
% for some reason the covar process stopped on May 1...here's what I see:
dir /usr/local/aster/projects/NIWOT/results/covar/
total 35224
drwxrwsr-x 2 maclean aster 20480 Apr 30 18:10 ./
drwxrwsr-x 3 maclean aster 4096 Sep 14 1998 ../
-rw-rw-r-- 1 sburns aster 286828 Jan 2 17:15 nwt.050101.nc
-rw-rw-r-- 1 sburns aster 286828 Jan 3 17:10 nwt.050102.nc
-rw-rw-r-- 1 sburns aster 286828 Jan 4 17:05 nwt.050103.nc
-rw-rw-r-- 1 sburns aster 286828 Jan 5 17:10 nwt.050104.nc
-rw-rw-r-- 1 sburns aster 286828 Jan 6 15:17 nwt.050105.nc
-rw-rw-r-- 1 sburns aster 286828 Jan 6 17:10 nwt.050106.nc
-rw-rw-r-- 1 sburns aster 286828 Jan 8 17:15 nwt.050107.nc
etc, etc
-rw-rw-r-- 1 sburns aster 296024 Apr 24 18:10 nwt.050423.nc
-rw-rw-r-- 1 sburns aster 296024 Apr 25 18:15 nwt.050424.nc
-rw-rw-r-- 1 sburns aster 296024 Apr 26 18:15 nwt.050425.nc
-rw-rw-r-- 1 sburns aster 296024 Apr 27 15:02 nwt.050426.nc
-rw-rw-r-- 1 sburns aster 296024 Apr 27 18:10 nwt.050427.nc
-rw-rw-r-- 1 sburns aster 296024 Apr 28 18:10 nwt.050428.nc
-rw-rw-r-- 1 sburns aster 296024 Apr 29 18:05 nwt.050429.nc
-rw-rw-r-- 1 sburns aster 296024 May 1 15:52 nwt.050430.nc
-rw-rw-r-- 1 sburns aster 273524 May 1 15:52 nwt.050501.nc
so i did:
mv /usr/local/aster/projects/NIWOT/results/covar/nwt.050501.nc .
then, cd /usr/local/isff/aster/projects/NIWOT/scripts/
e covar_redo
set begin = "2005 may 01 00:00"
then i did:
covar_redo
PROJECT=NIWOT OPS=ops1
and see:
dir /usr/local/aster/projects/NIWOT/results/covar/ | grep May
drwxrwsr-x 2 maclean aster 20480 May 11 08:49 ./
-rw-rw-r-- 1 sburns aster 296024 May 1 15:52 nwt.050430.nc
-rw-rw-r-- 1 sburns aster 294784 May 11 08:49 nwt.050501.nc
-rw-rw-r-- 1 sburns aster 294784 May 11 08:49 nwt.050502.nc
-rw-rw-r-- 1 sburns aster 154384 May 11 08:50 nwt.050503.nc
etc, etc.
then i ftped these files to specialk and restarted, "fluxcalcs.csh".
back on russter2 i did "check_aster"...ie:
check_aster
Wed May 11 08:59:24 MDT 2005
------------- Environment -------------
PROJECT = NIWOT, OPS=ops1
------------- Server tasks -------------
host user pid start exectime process
russter2 sburns 3734 Apr29 ? 00:00:00 adamserver
russter2 sburns 3736 Apr29 ? 00:00:17 nc_server
------------- Ingest tasks -------------
host user pid start exectime process
russter2 sburns 3738 Apr29 ? 00:00:00 ingest
russter2 sburns 4579 Apr29 ? 00:01:15 quacker
------------- Archive tasks -------------
host user pid start exectime process
russter2 root 4590 Apr29 ? 00:00:27 archive quacker .
-------------- Covar calcs --------------
host user pid start exectime process
-------------- X processes --------------
host user pid start exectime process
russter2 sburns 28967 07:40 pt00:00:02 xcockpit -a quacker -xrm *Monitor.f
-------------- Ingest Statistics ----------
station port status up since sample/sec serialErrs
quacker 1074 open Apr 29 11:37 36.00 0
-------------- Living adams --------------
quacker
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda3 5162828 753656 4146912 16% /usr/local
/dev/hda8 51088492 34082296 14410900 71% /data
-----------------------------------------
Wed May 11 08:59:25 MDT 2005
from "check_aster" it looks like the covar process is not running...so
i restarted the covar process on russter2, ie:
/usr/local/isff/aster/projects/NIWOT/scripts[81]: covar_quacker
PROJECT=NIWOT OPS=ops1
[1] 29483
now check_aster shows:
-------------- Covar calcs --------------
host user pid start exectime process
russter2 sburns 29483 09:01 pt00:00:00 covar -S -a quacker
ps -gaxu | grep cov
root 9 0.0 0.0 0 0 ? SW Apr29 0:00 [mdrecoveryd]
sburns 29483 0.4 0.6 3076 1664 pts/0 S 09:01 0:00 covar -S -a quacker
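
The "is covar running?" check can be done on saved check_aster output instead of by eye. A minimal sketch in Python, assuming the output keeps the layout pasted above (a dashed title line, then a "host user pid ..." header, then one line per process); the real spacing may differ, and section_rows is a made-up helper, not an aster tool:

```python
def section_rows(report, title):
    """Return the process rows listed under a check_aster section title."""
    rows = []
    inside = False
    for line in report.splitlines():
        text = line.strip()
        if text.startswith("---"):
            # a dashed line starts a new section; note whether it is ours
            inside = text.strip("- ") == title
            continue
        if inside and text and not text.startswith("host "):
            rows.append(text)
    return rows

# excerpts copied from the two check_aster runs above
stopped = """-------------- Covar calcs --------------
host user pid start exectime process
-------------- X processes --------------
host user pid start exectime process
russter2 sburns 28967 07:40 pt00:00:02 xcockpit -a quacker -xrm *Monitor.f
"""
running = """-------------- Covar calcs --------------
host user pid start exectime process
russter2 sburns 29483 09:01 pt00:00:00 covar -S -a quacker
"""
for name, report in (("before", stopped), ("after", running)):
    rows = section_rows(report, "Covar calcs")
    print(name, "covar running" if rows else "covar NOT running")
```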
- 268: data system, Site russter, Mon 21-Mar-2005 15:12:45 MST, Data Archiving stopped on Feb 5, 2005
Sat, Feb 5 at 12:30 MST
-----------------------
the archiving of the hi-rate data had stopped..but it looks
like the covar data are still there...here's what I see:
-rw-rw-r-- 1 root aster 33921934 Feb 4 09:00 nwt050204.080000
-rw-rw-r-- 1 root aster 34267078 Feb 4 17:00 nwt050204.160000
-rw-rw-r-- 1 root aster 1409028 Feb 4 17:20 nwt050205.000000
-rw-rw-r-- 1 root aster 593920 Feb 5 12:40 nwt050205.193153
(it stopped at 17:20 yesterday...which is exactly the time i logged out!)...
sburns pts/0 toyon.mmm.ucar.e Fri Feb 4 08:16 - 17:20 (09:03)
to get things going again i did: "ndaqrestart quacker" on russter2
more info:
tail -100 /var/log/local/aster.log | more
Feb 3 16:24:59 russter2 ingest(quacker)[5132]: quacker russter2.32776 socket closed, 5 active connections
Feb 3 16:25:02 russter2 ingest(quacker)[5132]: quacker russter2.32778 socket closed, 4 active connections
Feb 3 16:25:11 russter2 ingest(quacker)[5132]: quacker russter2.32774 socket closed, 3 active connections
Feb 4 00:00:15 russter2 archive(quacker)[5143]: Opened: ./all/nwt050204.000000
Feb 4 00:00:15 russter2 covar[5153]: quacker@russter2: midnight rollover. Sample time: 2005 Feb 04 j035 00:00:00
Feb 4 00:10:15 russter2 nc_server[4348]: Created: /usr/local/aster/projects/NIWOT/results/covar/nwt.050204.nc
Feb 4 08:00:15 russter2 archive(quacker)[5143]: Opened: ./all/nwt050204.080000
Feb 4 08:16:36 russter2 ingest(quacker)[5132]: russter2 port 32780: setsockopt SO_SNDBUF=16384
Feb 4 08:16:36 russter2 ingest(quacker)[5132]: quacker DGRAM socket connected to russter2:32780, 3 active connections
Feb 4 16:00:15 russter2 archive(quacker)[5143]: Opened: ./all/nwt050204.160000
Feb 5 00:00:15 russter2 archive(quacker)[5143]: Opened: ./all/nwt050205.000000
Feb 5 00:00:15 russter2 covar[5153]: quacker@russter2: midnight rollover. Sample time: 2005 Feb 05 j036 00:00:00
Feb 5 00:10:20 russter2 nc_server[4348]: Created: /usr/local/aster/projects/NIWOT/results/covar/nwt.050205.nc
Feb 5 00:10:20 russter2 nc_server[4348]: Closing: /usr/local/aster/projects/NIWOT/results/covar/nwt.050203.nc
Feb 4 17:20:07 russter2 ingest(quacker)[5132]: quacker russter2.32780 socket closed, 3 active connections
Feb 5 00:20:07 russter2 archive(quacker)[5143]: EOF received on input from quacker@russter2
Feb 5 00:20:07 russter2 archive(quacker)[5143]: archiving stopped for quacker
Feb 4 17:20:07 russter2 ingest(quacker)[5132]: quacker russter2.32780 socket closed, 2 active connections
Feb 5 09:51:25 russter2 ingest[4350]: Unknown ADAM: 61.152.96.211
Feb 5 12:31:49 russter2 ingest[4350]: quacker ingest restarted, sending hangup to previous ingest, pid 5132
Feb 5 12:31:49 russter2 ingest(quacker)[5132]: quacker: signal SIGHUP (1) received
Feb 5 12:31:49 russter2 ingest(quacker)[5132]: quacker russter2.32784 socket closed, 1 active connections
Feb 5 12:31:49 russter2 ingest(quacker)[5132]: quacker shutting down
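
The log lines above are not in strict time order: the ingest "socket closed" lines are stamped Feb 4 17:20:07, while the archive "EOF received" lines next to them are stamped Feb 5 00:20:07. One reading, and it is only an assumption, is that ingest logs in local time (MST, UTC-7) while archive, covar and nc_server log in UTC, so that both stamps are the same instant. The file listing seems consistent with this (nwt050205.000000 was last written at 17:20 local). A quick check of the arithmetic in Python:

```python
from datetime import datetime, timedelta, timezone

MST = timezone(timedelta(hours=-7), "MST")  # assumed offset (February, no DST)

# ingest line: "Feb 4 17:20:07 ... socket closed" (assumed to be local time)
logout_local = datetime(2005, 2, 4, 17, 20, 7, tzinfo=MST)

# archive line: "Feb 5 00:20:07 ... EOF received" (assumed to be UTC)
eof_utc = datetime(2005, 2, 5, 0, 20, 7, tzinfo=timezone.utc)

print(logout_local.astimezone(timezone.utc))
print(eof_utc - logout_local)
```

If that reading is right, the two differ by zero seconds, i.e. archiving stopped at the moment of the logout, which matches the 17:20 noted above.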
- 158: data processing, Site russter, Sun 04-Jan-2004 14:43:26 MST, How to redo covar process on russter.
October 31st:
needed to update covar.config since the covar process was not
running on the quacker. See:
[/usr/local/aster/projects/NIWOT/ops1] tail /var/adm/messages
Oct 29 18:07:00 russter ntpdate[15860]: adjust time server 128.138.82.228 offset 0.3524125
Oct 29 23:50:36 russter inetd[156]: config: 100068/rpc/udp still active and was not reconfigured.
Oct 29 23:50:36 russter inetd[156]: config: 100083/rpc/tcp still active and was not reconfigured.
Oct 30 06:07:01 russter ntpdate[17315]: adjust time server 128.138.82.228 offset 0.3480103
Oct 30 13:10:05 russter covar[18133]: Can't find c12o2
Oct 30 18:07:01 russter ntpdate[18854]: adjust time server 128.138.82.228 offset 0.3267275
Oct 30 23:48:23 russter inetd[156]: config: 100068/rpc/udp still active and was not reconfigured.
Oct 30 23:48:23 russter inetd[156]: config: 100083/rpc/tcp still active and was not reconfigured.
this is because c12o2 was removed from prep.config, but not covar.config.
/usr/local/aster/projects/NIWOT/scripts/covar_redo*
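
A check along these lines could catch this kind of mismatch before covar is restarted. It is only a sketch in Python: it assumes the variable names have already been pulled out of the two config files into plain lists (the real file formats are not parsed here), and every name other than c12o2 is made up for illustration:

```python
def not_in_prep(covar_vars, prep_vars):
    """Return covar variables that prep.config no longer defines, in order."""
    known = set(prep_vars)
    return [v for v in covar_vars if v not in known]

# hypothetical variable lists; only c12o2 comes from the log above
prep_vars = ["u", "v", "w"]
covar_vars = ["u", "v", "w", "c12o2"]
print(not_in_prep(covar_vars, prep_vars))
```

Any name it returns would need to be removed from covar.config (or restored in prep.config) to avoid another "Can't find ..." message.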
- 97: site visit, Site russter, Wed 13-Aug-2003 16:00:30 MDT, Aug. 13, 2003
Aug. 13, 2003
Weather - sunny, hot.
(1) Downloaded laser data. Seemed to work OK. CD did burn properly (checked by looking at it with another laptop). Laser restarted at ~ 11:35 AM. All the parameters seemed fine except the flows through the Nafion driers were low (1.2 Lpm and 0.4 Lpm).
(2) Downloaded both the east and west robots. Both were done (done = 1). Switched out the flasks in both.
(3) russter is still off the network, but still collecting data. No missed data since Aug. 8. No breakers popped in the last 3 days either. Mark Losleben is on vacation and Steve Seibold knows of no network problem. Will check with Andy O'Reilly to see what he knows. If the network is functioning fine, we may need to consult with UnixOps to see what to do. I did check the ethernet cable going across the road and saw no evidence of damage, but it could still be the problem.
- 96: data system, Site russter, Wed 13-Aug-2003 15:57:52 MDT, Tuesday, Aug. 12, 2003
Aug. 12, 2003
Noticed that data had not been sent down from russter since Aug. 8.
Russter was not on the network (couldn't ping).
However, I was up at the site on Aug. 10th and russter was running fine, although it had hung when I logged in. This was most likely due to being off the network, as this is typical behavior when russter is off-line.
Will check with MRS to see if there are any known network problems.