NIWOT: Logbook Entries

NIWOT: Site quacker Messages, 4 Entries

Entry  Date             Title                              Site     Author  #Graphics
166    Wed 03-Mar-2004  Quacker Shutdown on night of 2/11  quacker  sburns
152    Fri 12-Dec-2003  Opto 22 Commands for Hydra         quacker  sburns
137    Tue 11-Nov-2003  Rash of 4am Shutdowns.             quacker  sburns
33     Fri 16-May-2003  cron files on the quacker.         quacker  sburns


166: data system, Site quacker, Wed 03-Mar-2004 14:48:55 MST, Quacker Shutdown on night of 2/11
quacker crashed on the night of 2/11...brought back up the next
morning at ~9am.  (This looks similar to the problem back in Nov, when
the temperature was extremely cold.)

[root@quacker cuff]# /sbin/lsusb

Bus 001 Device 001: ID 0000:0000 Virtual Hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         8
  idVendor           0x0000 Virtual
  idProduct          0x0000 Hub
  bcdDevice            0.00
  iManufacturer           0
  etc, etc...

Feb 11 21:44:02 quacker timeupdate[15026]: Adjusted clock 199 milliseconds forward. RPC calls took 4 milliseconds
Feb 11 22:44:03 quacker timeupdate[15026]: Adjusted clock 198 milliseconds forward. RPC calls took 4 milliseconds
Feb 11 22:45:37 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 22:50:04 quacker chemcontrol: chemcontrol reset: Wed Feb 11 22:50:00 2004
Feb 11 22:53:30 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 22:56:05 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:01:04 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:05:14 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:14:35 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:29:03 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:37:04 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:40:02 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:44:04 quacker timeupdate[15026]: Adjusted clock 758 milliseconds forward. RPC calls took 4 milliseconds
Feb 11 23:52:06 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:52:06 quacker kernel: usb.c: USB disconnect on device 00:07.2-1.1 address 3
Feb 11 23:52:06 quacker kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000998
Feb 11 23:52:06 quacker kernel:  printing eip:
Feb 11 23:52:06 quacker kernel: c8866e46
Feb 11 23:52:06 quacker kernel: *pde = 00000000
Feb 11 23:52:06 quacker kernel: Oops: 0002
Feb 11 23:52:06 quacker kernel: CPU:    0
Feb 11 23:52:06 quacker kernel: EIP:    0010:[]    Not tainted
Feb 11 23:52:06 quacker kernel: EFLAGS: 00010246
Feb 11 23:52:06 quacker kernel: eax: 00000000   ebx: 00000000   ecx: 00000000   edx: c797781c
Feb 11 23:52:06 quacker kernel: esi: c797781c   edi: 00000000   ebp: c7977800   esp: c79fdf40
Feb 11 23:52:06 quacker kernel: ds: 0018   es: 0018   ss: 0018
Feb 11 23:52:06 quacker kernel: Process khubd (pid: 82, stackpage=c79fd000)
Feb 11 23:52:06 quacker kernel: Stack: c7977874 c7977874 c8868780 c8868760 c7b11ce4 c7f87c00 c883c0e5 c7f87c00
Feb 11 23:52:06 quacker kernel:        c7977800 00000000 00000000 c7f87b0c 00000000 00000100 c7f86c8c c883e190
Feb 11 23:52:06 quacker kernel:        c7f87b0c c7f87a0c c7f87a00 c7f86ca8 c7f86cb4 c7f87a00 c7f86c8c c883e81b
Feb 11 23:52:06 quacker kernel: Call Trace:    [] [] [] [] []
Feb 11 23:52:06 quacker kernel:   [] [] []
Feb 11 23:52:06 quacker kernel:
Feb 11 23:52:06 quacker kernel:
Feb 11 23:52:06 quacker kernel: Code: 89 98 98 09 00 00 8b 4c 24 04 ff 46 58 0f 8e cb 03 00 00 83
Feb 11 23:59:46 quacker kernel:  <4>usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:02:48 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:04:33 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:11:31 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:13:11 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:17:19 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:19:40 quacker last message repeated 2 times
Feb 12 00:22:32 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:24:59 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:26:40 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:28:51 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:30:25 quacker last message repeated 2 times
Feb 12 00:32:59 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:34:00 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
etc, etc...

eventually the data system shuts down...

Feb 12 09:05:02 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 811
Feb 12 09:05:06 quacker sshd[5855]: Accepted password for cuff from 10.0.0.1 port 37913 ssh2
Feb 12 09:05:07 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 1725
Feb 12 09:05:07 quacker sshd(pam_unix)[5857]: session opened for user cuff by (uid=500)
Feb 12 09:05:12 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 591
Feb 12 09:05:17 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 1505
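
To see how the rate of these usb-uhci messages changes over the night, it
helps to tally them per hour. The following is only a sketch: the function
name uhci_per_hour is made up for this note, and it assumes the standard
syslog line layout shown above (month, day, hh:mm:ss, host, ...). Lines of
the form "last message repeated N times" are not counted.

```shell
#!/bin/sh
# uhci_per_hour: hypothetical helper, not part of the data system.
# Reads syslog lines on stdin and prints "<Mon> <day> <hour> <count>"
# for every hour that contains usb-uhci interrupt messages.
uhci_per_hour() {
    awk '/usb-uhci\.c: interrupt/ {
             hour = substr($3, 1, 2)          # hh from hh:mm:ss
             key = $1 " " $2 " " hour
             if (!(key in count)) order[++n] = key   # keep first-seen order
             count[key]++
         }
         END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }'
}
```

Possible usage, assuming the kernel messages are in /var/log/messages:

  uhci_per_hour < /var/log/messages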

...couldn't kill the sio process:

[cuff@quacker adam]$ ./stop_adam
Waiting for sio to die ... 15030
6015
.15030
6015
.15030
6015
.15030
6015
.15030
6015
.15030
6015
.15030
6015
.15030
6015
. failed, still running
[cuff@quacker adam]$ ps -elf | grep sio
000 D root     15030     1  0  65 -10    -   340 down   Jan02 ?        01:52:56 sio
000 D root      6015     1  0  65 -10    -   340 down   09:08 pts/1    00:00:00 sio
000 S cuff      6055  5858  0  73   0    -   440 pipe_w 09:15 pts/1    00:00:00 grep sio

eventually had to reboot to get it working again...the reboot was
fine.
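
Both sio processes in the ps listing above are in state D (second column),
which on Linux means uninterruptible sleep. A process in that state does not
act on signals until it wakes up, which is probably why stop_adam could not
kill them. A small sketch for picking such processes out of "ps -elf" output
(the name stuck_procs is made up for this note, and it assumes the column
layout shown above, with the state in field 2 and the PID in field 4):

```shell
#!/bin/sh
# stuck_procs: hypothetical helper. Reads `ps -elf` output on stdin and
# prints "<pid> <last word of command>" for processes in state D.
stuck_procs() {
    awk '$2 == "D" { print $4, $NF }'
}
```

Possible usage:

  ps -elf | stuck_procs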


152: hydra, Site quacker, Fri 12-Dec-2003 14:35:20 MST, Opto 22 Commands for Hydra
Here are some of the basic concepts for controlling the opto22/hydra:

The programs which control the hydra on the quacker:

To stop/start the opto22 do:

/usr/local/cuff/src/adam/stop_opto
/usr/local/cuff/src/adam/start_opto

the files which will be run using "start_opto" are:

-rw-rw-r--    1 cuff     cuff         6267 Nov  7 15:57 /usr/local/cuff/aster/projects/NIWOT/src/opto22/chemcontrol.cxx
-rw-rw-r--    1 cuff     cuff         6266 Nov  5 14:24 /usr/local/cuff/aster/projects/NIWOT/src/opto22/chemcontrol.cxx~
-rw-rw-r--    1 cuff     cuff         3854 Nov 21 12:58 /usr/local/cuff/aster/projects/NIWOT/src/opto22/cmds.c

Also helpful is:

 /usr/local/cuff/src/adam/command_opto

(use it to send single commands to the opto22). Usage is:

  /usr/local/cuff/src/adam/command_opto
  Usage: command_opto cmd secs


A few more details:


               (8 4 2 1)
Hydra Commands:  0010  0000  0000  0001
   1st4bits:  AspII AspI Distrib Sample  (AspI=snow, AspII=air, D=1 is A, D=0 is B)
   2nd4bits:  cal3 cal2 cal1  s9
   3rd4bits:  s8 s7 s6 s5
   last4bits:  s4 s3 s2 s1
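
To make the bit layout concrete, here is a rough shell sketch that decodes a
command word written as four hex digits (the form used in the command arrays,
e.g. binary 0010 0000 0000 0001 is "2001"). The function name decode_hydra and
its output format are made up for this note; it is not part of cmds.c or
command_opto.

```shell
#!/bin/sh
# decode_hydra: hypothetical helper, not part of the opto22 software.
# Prints the names of the bits set in a 4-hex-digit hydra command word,
# following the layout above.
decode_hydra() {
    word=$((0x$1))
    out=""
    # 1st 4 bits: AspII=8, AspI=4, Distrib=2, Sample=1
    [ $(( $word & 0x8000 )) -ne 0 ] && out="$out AspII"
    [ $(( $word & 0x4000 )) -ne 0 ] && out="$out AspI"
    [ $(( $word & 0x2000 )) -ne 0 ] && out="$out Distrib"
    [ $(( $word & 0x1000 )) -ne 0 ] && out="$out Sample"
    # 2nd 4 bits, upper three: cal3=8, cal2=4, cal1=2
    [ $(( $word & 0x0800 )) -ne 0 ] && out="$out cal3"
    [ $(( $word & 0x0400 )) -ne 0 ] && out="$out cal2"
    [ $(( $word & 0x0200 )) -ne 0 ] && out="$out cal1"
    # remaining nine bits: s1 (0x0001) up to s9 (0x0100)
    n=1
    while [ "$n" -le 9 ]; do
        [ $(( $word & (1 << ($n - 1)) )) -ne 0 ] && out="$out s$n"
        n=$(($n + 1))
    done
    echo "${out# }"
}
```

For example, "decode_hydra 2001" prints "Distrib s1" and "decode_hydra 9020"
prints "AspII Sample s6".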

Example of normal cycle:

struct command airsnow20min[] =
{
  /* air and snow sampling, pumps running, 20 minutes (actually 30min) */
  "4000", 240,  /* ASP1 ON, everything else OFF */
  "3001", 80,   /* DISTRIB ON, sample #1, A1 */
  "3002", 80,
  "3004", 80,
  "3008", 80,
  "3010", 80,
  "3020", 80,
  "3040", 80,
  "3080", 80,
  "3100", 80,
  "1001", 80,   /* DISTRIB OFF, sample #1, B1 */
  "1002", 80,
  "1004", 80,
  "1008", 80,
  "1010", 80,
  "9020", 80,   /* ASP2 ON, DISTRIB OFF, SAMPLE ON, sample #6, B6 */
  "9040", 80,   /* ASP2 ON, DISTRIB OFF, SAMPLE ON, sample #7, B7 */
  "9080", 80,   /* ASP2 ON, DISTRIB OFF, SAMPLE ON, sample #8, B8 */
  "9100", 80,   /* ASP2 ON, DISTRIB OFF, SAMPLE ON, sample #9, B9 */
  "0000", 120,  /* everything OFF */
  0,0,          /* end of command array */
};
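
The "(actually 30min)" remark in the array comment can be checked by adding up
the seconds column. A sketch, assuming entries are written one per line in the
form  "3001", 80,  as above (the name cycle_seconds is made up for this note):

```shell
#!/bin/sh
# cycle_seconds: hypothetical helper. Reads the lines of a command array
# on stdin and prints the total duration in seconds. Lines without a
# quoted command (such as the 0,0 terminator) are ignored.
cycle_seconds() {
    sed -n 's/^[[:space:]]*"[0-9A-Fa-f]*",[[:space:]]*\([0-9][0-9]*\),.*/\1/p' |
        awk '{ total += $1 } END { print total + 0 }'
}
```

Fed the twenty entries of airsnow20min, this should print 1800
(240 + 18*80 + 120 seconds), i.e. 30 minutes.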



137: data system, Site quacker, Tue 11-Nov-2003 10:12:58 MST, Rash of 4am Shutdowns.
November 11, 2003.

In the past week there have been many 4am shutdowns.

Here's a list of the days with 4am shutdowns:

-rw-rw-r--   1 aturnip  aster    9986888 Oct 22 04:06 nwt031022.080000
-rw-rw-r--   1 aturnip  aster    14813400 Oct 26 04:04 nwt031026.080000
-rw-rw-r--   1 aturnip  aster    15019180 Nov  1 04:21 nwt031101.080000
-rw-rw-r--   1 aturnip  aster    14981538 Nov  2 04:20 nwt031102.080000
-rw-rw-r--   1 aturnip  aster    14931314 Nov  4 04:20 nwt031104.080000
-rw-rw-r--   1 aturnip  aster    15140696 Nov  5 04:22 nwt031105.080000
-rw-rw-r--   1 aturnip  aster    14788922 Nov  8 04:03 nwt031108.080000
-rw-rw-r--   1 aturnip  aster    14998778 Nov 11 04:06 nwt031111.080000

Gordon fixed sendmail on Nov 6th....so, note that the shutdowns after
Nov 5th occur at 04:0x, while the ones on Nov 1st-5th are at 04:2x
(the two October shutdowns were also at 04:0x).


33: data system, Site quacker, Fri 16-May-2003 15:44:14 MDT, cron files on the quacker.
Fri, May 16th.

Talked to Gordon...we were looking at the cron jobs on the quacker...
apparently it is doing a bunch of stuff which it doesn't need to
do...these jobs all start at around 4am.  Here are details on what
I did:

Cron Logfile:

more /var/log/cron
May 11 04:05:00 quacker CROND[25970]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:10:00 quacker CROND[26029]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:10:00 quacker CROND[26030]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:15:00 quacker CROND[26033]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:20:00 quacker CROND[26036]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:20:00 quacker CROND[26037]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:22:00 quacker CROND[26040]: (root) CMD (run-parts /etc/cron.weekly)
May 11 04:22:00 quacker anacron[26044]: Updated timestamp for job `cron.weekly' to 2003-05-11
May 11 04:25:00 quacker CROND[2017]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:30:00 quacker CROND[2021]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:30:00 quacker CROND[2022]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:35:00 quacker CROND[2025]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:40:00 quacker CROND[2028]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:40:00 quacker CROND[2029]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:45:00 quacker CROND[2033]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:50:00 quacker CROND[2036]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:50:00 quacker CROND[2037]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:55:00 quacker CROND[2040]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 05:00:00 quacker CROND[2043]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
..etc..etc...

Note that this "mrtg" job is running every 5 minutes.

For more info about mrtg see the webpage:

  http://www.ntop.org/Monitoring.html

(It looks like this is useful software for monitoring bandwidth on a
network...something we don't need at all!)

Here's the modified crontab file (where Gordon commented out the mrtg stuff):


---------------------------
more /etc/crontab

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

# 0-59/5 * * * * root /usr/bin/mrtg /etc/mrtg/mrtg.cfg
---------------------------
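
The commented-out mrtg line uses 0-59/5 in the minute field, meaning every
5th minute from 0 through 55, which is the 5-minute spacing seen in
/var/log/cron. As a rough illustration, this sketch expands such a field into
the list of minutes. The name cron_minutes is made up for this note, and it
only handles the simple "A-B/S" form, not the full crontab syntax.

```shell
#!/bin/sh
# cron_minutes: hypothetical helper. Expands a crontab field of the form
# "A-B/S" (range with step) into the values it matches.
cron_minutes() {
    range=${1%/*}      # part before the slash, e.g. "0-59"
    step=${1#*/}       # part after the slash, e.g. "5"
    m=${range%-*}      # start of the range
    end=${range#*-}    # end of the range
    out=""
    while [ "$m" -le "$end" ]; do
        out="$out $m"
        m=$(($m + $step))
    done
    echo "${out# }"
}
```

"cron_minutes 0-59/5" prints: 0 5 10 15 20 25 30 35 40 45 50 55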


Using the linconfig7.* files in /net/adm/linux/ (on the atd computer
syrah) the following cron files on the quacker were changed to
non-executable:

Cron files to disable:
-rwxr-xr-x    1 root     root         1769 Jun 25  2002 /etc/cron.daily/dbbackup
-rwxr-xr-x    1 root     root          418 Mar 25  2002 /etc/cron.daily/makewhatis.cron
-rwxr-xr-x    1 root     root          197 May 24  2002 /etc/cron.daily/texpire
-rwxr-xr-x    1 root     root          315 Feb 26  2002 /etc/cron.daily/tripwire-check
-rwxr-xr-x    1 root     root          100 Apr 12  2002 /etc/cron.daily/tetex.cron
-rwxr-xr-x    1 root     root          197 May 24  2002 /etc/cron.daily/texpire
-rwxr-xr-x    1 root     root          414 Mar 25  2002 /etc/cron.weekly/makewhatis.cron
-rwxr-xr-x    1 root     root           40 May 23  2002 /etc/cron.weekly/wwwoffle-purge

chmod -x /etc/cron.daily/dbbackup
chmod -x /etc/cron.daily/tripwire-check
chmod -x /etc/cron.daily/makewhatis.cron
chmod -x /etc/cron.daily/texpire
chmod -x /etc/cron.daily/tetex.cron
chmod -x /etc/cron.daily/texpire
chmod -x /etc/cron.weekly/wwwoffle-purge
chmod -x /etc/cron.weekly/makewhatis.cron

-rw-r--r--    1 root     root         1769 Jun 25  2002 /etc/cron.daily/dbbackup
-rw-r--r--    1 root     root          418 Mar 25  2002 /etc/cron.daily/makewhatis.cron
-rw-r--r--    1 root     root          197 May 24  2002 /etc/cron.daily/texpire
-rw-r--r--    1 root     root          315 Feb 26  2002 /etc/cron.daily/tripwire-check
-rw-r--r--    1 root     root          100 Apr 12  2002 /etc/cron.daily/tetex.cron
-rw-r--r--    1 root     root          197 May 24  2002 /etc/cron.daily/texpire
-rw-r--r--    1 root     root          414 Mar 25  2002 /etc/cron.weekly/makewhatis.cron
-rw-r--r--    1 root     root           40 May 23  2002 /etc/cron.weekly/wwwoffle-purge

ls -lag /etc/cron.daily/dbbackup /etc/cron.daily/tripwire-check /etc/cron.daily/makewhatis.cron /etc/cron.daily/texpire
ls -lag /etc/cron.daily/tetex.cron /etc/cron.daily/texpire /etc/cron.weekly/wwwoffle-purge /etc/cron.weekly/makewhatis.cron
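
Removing the execute bit works because run-parts only runs the files in a
directory that are executable. A quick way to see which jobs are still
enabled is to list just the executable files. This is a sketch only; the
name enabled_jobs is made up for this note.

```shell
#!/bin/sh
# enabled_jobs: hypothetical helper. Prints the names of the files in
# directory $1 that are regular files (or symlinks to them) and executable.
enabled_jobs() {
    for f in "$1"/*; do
        if [ -f "$f" ] && [ -x "$f" ]; then
            echo "${f##*/}"
        fi
    done
}
```

Possible usage:

  enabled_jobs /etc/cron.daily
  enabled_jobs /etc/cron.weekly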


here is a complete listing of the remaining cron files (some others may also
be unnecessary):

listing of cron files:

ls -lag /etc/cron*/*
lrwxrwxrwx    1 root     root           28 Sep  5  2002 /etc/cron.daily/00-logwatch -> ../log.d/scripts/logwatch.pl
-rwxr-xr-x    1 root     root          135 Apr 17  2002 /etc/cron.daily/00webalizer
-rwxr-xr-x    1 root     root          276 Jun 24  2001 /etc/cron.daily/0anacron
-rwxr-xr-x    1 root     root           51 Apr 15  2002 /etc/cron.daily/logrotate
-rwxr-xr-x    1 root     root          104 Apr 18  2002 /etc/cron.daily/rpm
-rwxr-xr-x    1 root     root          132 Jun 25  2001 /etc/cron.daily/slocate.cron
-rwxr-xr-x    1 root     root          193 Apr 13  2002 /etc/cron.daily/tmpwatch
-rwxr-xr-x    1 root     root          188 Apr 12  2002 /etc/cron.d/sysstat
-rwxr-xr-x    1 root     root          278 Jun 24  2001 /etc/cron.monthly/0anacron
-rwxr-xr-x    1 root     root          277 Jun 24  2001 /etc/cron.weekly/0anacron

-rw-r--r--    1 root     root         1769 Jun 25  2002 /etc/cron.daily/dbbackup
-rw-r--r--    1 root     root          418 Mar 25  2002 /etc/cron.daily/makewhatis.cron
-rw-r--r--    1 root     root          100 Apr 12  2002 /etc/cron.daily/tetex.cron
-rw-r--r--    1 root     root          197 May 24  2002 /etc/cron.daily/texpire
-rw-r--r--    1 root     root          315 Feb 26  2002 /etc/cron.daily/tripwire-check
lrwxrwxrwx    1 root     root           65 Sep  5  2002 /etc/cron.daily/wwwoffle-full-index -> /var/spool/wwwoffle/html/search/htdig/scripts/wwwoffle-htdig-full
-rw-r--r--    1 root     root          414 Mar 25  2002 /etc/cron.weekly/makewhatis.cron
-rw-r--r--    1 root     root           40 May 23  2002 /etc/cron.weekly/wwwoffle-purge