| Entry | Date | Title | Site | Author | #Graphics |
|---|---|---|---|---|---|
| 166 | Wed 03-Mar-2004 | Quacker Shutdown on night of 2/11 | quacker | sburns | |
| 152 | Fri 12-Dec-2003 | Opto 22 Commands for Hydra | quacker | sburns | |
| 137 | Tue 11-Nov-2003 | Rash of 4am Shutdowns. | quacker | sburns | |
| 33 | Fri 16-May-2003 | cron files on the quacker. | quacker | sburns | |
quacker crash on the night of 2/11...brought back up the next morning at ~9am. (this looks similar to the problem back in Nov, when temperature was extremely cold.)

[root@quacker cuff]# /sbin/lsusb
Bus 001 Device 001: ID 0000:0000 Virtual Hub
Device Descriptor:
  bLength 18
  bDescriptorType 1
  bcdUSB 1.00
  bDeviceClass 9 Hub
  bDeviceSubClass 0
  bDeviceProtocol 0
  bMaxPacketSize0 8
  idVendor 0x0000 Virtual
  idProduct 0x0000 Hub
  bcdDevice 0.00
  iManufacturer 0
etc, etc...

Feb 11 21:44:02 quacker timeupdate[15026]: Adjusted clock 199 milliseconds forward. RPC calls took 4 milliseconds
Feb 11 22:44:03 quacker timeupdate[15026]: Adjusted clock 198 milliseconds forward. RPC calls took 4 milliseconds
Feb 11 22:45:37 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 22:50:04 quacker chemcontrol: chemcontrol reset: Wed Feb 11 22:50:00 2004
Feb 11 22:53:30 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 22:56:05 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:01:04 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:05:14 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:14:35 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:29:03 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:37:04 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:40:02 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:44:04 quacker timeupdate[15026]: Adjusted clock 758 milliseconds forward. RPC calls took 4 milliseconds
Feb 11 23:52:06 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 11 23:52:06 quacker kernel: usb.c: USB disconnect on device 00:07.2-1.1 address 3
Feb 11 23:52:06 quacker kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000998
Feb 11 23:52:06 quacker kernel: printing eip:
Feb 11 23:52:06 quacker kernel: c8866e46
Feb 11 23:52:06 quacker kernel: *pde = 00000000
Feb 11 23:52:06 quacker kernel: Oops: 0002
Feb 11 23:52:06 quacker kernel: CPU: 0
Feb 11 23:52:06 quacker kernel: EIP: 0010:[] Not tainted
Feb 11 23:52:06 quacker kernel: EFLAGS: 00010246
Feb 11 23:52:06 quacker kernel: eax: 00000000 ebx: 00000000 ecx: 00000000 edx: c797781c
Feb 11 23:52:06 quacker kernel: esi: c797781c edi: 00000000 ebp: c7977800 esp: c79fdf40
Feb 11 23:52:06 quacker kernel: ds: 0018 es: 0018 ss: 0018
Feb 11 23:52:06 quacker kernel: Process khubd (pid: 82, stackpage=c79fd000)
Feb 11 23:52:06 quacker kernel: Stack: c7977874 c7977874 c8868780 c8868760 c7b11ce4 c7f87c00 c883c0e5 c7f87c00
Feb 11 23:52:06 quacker kernel: c7977800 00000000 00000000 c7f87b0c 00000000 00000100 c7f86c8c c883e190
Feb 11 23:52:06 quacker kernel: c7f87b0c c7f87a0c c7f87a00 c7f86ca8 c7f86cb4 c7f87a00 c7f86c8c c883e81b
Feb 11 23:52:06 quacker kernel: Call Trace: [ ] [ ] [ ] [ ] [ ]
Feb 11 23:52:06 quacker kernel: [ ] [ ] [ ]
Feb 11 23:52:06 quacker kernel:
Feb 11 23:52:06 quacker kernel:
Feb 11 23:52:06 quacker kernel: Code: 89 98 98 09 00 00 8b 4c 24 04 ff 46 58 0f 8e cb 03 00 00 83
Feb 11 23:59:46 quacker kernel: <4>usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:02:48 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:04:33 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:11:31 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:13:11 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:17:19 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:19:40 quacker last message repeated 2 times
Feb 12 00:22:32 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:24:59 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:26:40 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:28:51 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:30:25 quacker last message repeated 2 times
Feb 12 00:32:59 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
Feb 12 00:34:00 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 576
etc, etc...

eventually the data system shuts down...

Feb 12 09:05:02 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 811
Feb 12 09:05:06 quacker sshd[5855]: Accepted password for cuff from 10.0.0.1 port 37913 ssh2
Feb 12 09:05:07 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 1725
Feb 12 09:05:07 quacker sshd(pam_unix)[5857]: session opened for user cuff by (uid=500)
Feb 12 09:05:12 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 591
Feb 12 09:05:17 quacker kernel: usb-uhci.c: interrupt, status 3, frame# 1505

...couldn't kill the sio process:

[cuff@quacker adam]$ ./stop_adam
Waiting for sio to die ... 15030 6015 .15030 6015 .15030 6015 .15030 6015 .15030 6015 .15030 6015 .15030 6015 .15030 6015 . failed, still running
[cuff@quacker adam]$ ps -elf | grep sio
000 D root 15030 1 0 65 -10 - 340 down Jan02 ? 01:52:56 sio
000 D root 6015 1 0 65 -10 - 340 down 09:08 pts/1 00:00:00 sio
000 S cuff 6055 5858 0 73 0 - 440 pipe_w 09:15 pts/1 00:00:00 grep sio

eventually had to reboot to get it working again...the reboot was fine.
Here are some of the basic concepts for controlling the opto22/hydra:
The programs which control the hydra on the quacker:
To stop/start the opto22 do:
/usr/local/cuff/src/adam/stop_opto
/usr/local/cuff/src/adam/start_opto
the files which will be run using "start_opto" are:
-rw-rw-r-- 1 cuff cuff 6267 Nov 7 15:57 /usr/local/cuff/aster/projects/NIWOT/src/opto22/chemcontrol.cxx
-rw-rw-r-- 1 cuff cuff 6266 Nov 5 14:24 /usr/local/cuff/aster/projects/NIWOT/src/opto22/chemcontrol.cxx~
-rw-rw-r-- 1 cuff cuff 3854 Nov 21 12:58 /usr/local/cuff/aster/projects/NIWOT/src/opto22/cmds.c
Also helpful is:
/usr/local/cuff/src/adam/command_opto
(use to send single commands to the opto22)...usage is:
/usr/local/cuff/src/adam/command_opto
Usage: command_opto cmd secs
A few more details:
(8 4 2 1)
Hydra Commands: 0010 0000 0000 0001
1st4bits: AspII AspI Distrib Sample (AspI=snow, AspII=air, D=1 is A, D=0 is B)
2nd4bits: cal3 cal2 cal1 s9
3rd4bits: s8 s7 s6 s5
last4bits: s4 s3 s2 s1
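The bit layout above can be unpacked mechanically. The following is a minimal sketch, assuming each command is a 4-digit hex string as in the cycle tables (e.g. "3001"); the struct and function names are hypothetical and do not come from cmds.c or chemcontrol.cxx.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical decoded view of one 16-bit Hydra command word. */
struct hydra_bits {
    int asp2;     /* weight 8 of 1st 4 bits: AspII (air)       */
    int asp1;     /* weight 4 of 1st 4 bits: AspI (snow)       */
    int distrib;  /* weight 2 of 1st 4 bits: 1 = A, 0 = B      */
    int sample;   /* weight 1 of 1st 4 bits                    */
    int cal;      /* cal3..cal1 as a 3-bit mask (bit 0 = cal1) */
    int s_mask;   /* s9..s1 as a 9-bit mask (bit 0 = s1)       */
};

/* Decode a hex command string such as "3001" (assumed format). */
struct hydra_bits hydra_decode(const char *cmd)
{
    struct hydra_bits b;
    unsigned long w;

    assert(cmd != NULL);
    w = strtoul(cmd, NULL, 16) & 0xFFFFul;

    b.asp2    = (int)((w >> 15) & 1ul);
    b.asp1    = (int)((w >> 14) & 1ul);
    b.distrib = (int)((w >> 13) & 1ul);
    b.sample  = (int)((w >> 12) & 1ul);
    b.cal     = (int)((w >> 9) & 0x7ul);  /* bits 11..9: cal3 cal2 cal1 */
    b.s_mask  = (int)(w & 0x1FFul);       /* bits 8..0:  s9 .. s1       */
    return b;
}
```

With this sketch, hydra_decode("3001") reports Distrib and Sample set with s1 selected (A1), and hydra_decode("9100") reports AspII and Sample set with s9 selected (B9).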
Example of normal cycle:
struct command airsnow20min[] =
{
/* air and snow sampling, pumps running, 20 minutes (actually 30min) */
"4000", 240, /* ASP1 ON, everything else OFF */
"3001", 80, /* DISTRIB ON, sample #1, A1 */
"3002", 80,
"3004", 80,
"3008", 80,
"3010", 80,
"3020", 80,
"3040", 80,
"3080", 80,
"3100", 80,
"1001", 80, /* DISTRIB OFF, sample #1, B1 */
"1002", 80,
"1004", 80,
"1008", 80,
"1010", 80,
"9020", 80, /* ASP2 ON, DISTRIB OFF, sample ON, sample #6, B6 */
"9040", 80, /* ASP2 ON, DISTRIB OFF, sample ON, sample #7, B7 */
"9080", 80, /* ASP2 ON, DISTRIB OFF, sample ON, sample #8, B8 */
"9100", 80, /* ASP2 ON, DISTRIB OFF, sample ON, sample #9, B9 */
"0000", 120, /* everything OFF */
0,0, /* end of command array */
};
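The "(actually 30min)" remark in the comment can be checked by adding up the second column. This is a minimal sketch, assuming those values are seconds (as the `command_opto cmd secs` usage suggests) and assuming a two-field layout for `struct command`; the real definition in cmds.c is not shown here, and `cycle_seconds` and `airsnow_demo` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout; the real struct command in cmds.c may differ. */
struct command {
    const char *cmd;  /* hex command word; a null pointer ends the array */
    int secs;         /* hold time, assumed to be in seconds             */
};

/* Sum the hold times of an array terminated by a {0, 0} entry. */
int cycle_seconds(const struct command *c)
{
    int total = 0;

    assert(c != NULL);
    for (; c->cmd != NULL; c++)
        total += c->secs;
    return total;
}

/* Same command words and times as the airsnow20min cycle above. */
const struct command airsnow_demo[] = {
    {"4000", 240},
    {"3001", 80}, {"3002", 80}, {"3004", 80}, {"3008", 80}, {"3010", 80},
    {"3020", 80}, {"3040", 80}, {"3080", 80}, {"3100", 80},
    {"1001", 80}, {"1002", 80}, {"1004", 80}, {"1008", 80}, {"1010", 80},
    {"9020", 80}, {"9040", 80}, {"9080", 80}, {"9100", 80},
    {"0000", 120},
    {0, 0}  /* end of command array */
};
```

The sum is 240 + 18 x 80 + 120 = 1800, so under the seconds assumption the cycle lasts 30 minutes rather than 20.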
November 11, 2003. In the past week there have been many 4am shutdowns. Here's a list of the days with 4am shutdowns:

-rw-rw-r-- 1 aturnip aster 9986888 Oct 22 04:06 nwt031022.080000
-rw-rw-r-- 1 aturnip aster 14813400 Oct 26 04:04 nwt031026.080000
-rw-rw-r-- 1 aturnip aster 15019180 Nov 1 04:21 nwt031101.080000
-rw-rw-r-- 1 aturnip aster 14981538 Nov 2 04:20 nwt031102.080000
-rw-rw-r-- 1 aturnip aster 14931314 Nov 4 04:20 nwt031104.080000
-rw-rw-r-- 1 aturnip aster 15140696 Nov 5 04:22 nwt031105.080000
-rw-rw-r-- 1 aturnip aster 14788922 Nov 8 04:03 nwt031108.080000
-rw-rw-r-- 1 aturnip aster 14998778 Nov 11 04:06 nwt031111.080000

Gordon fixed sendmail on Nov 6th, so note that the shutdowns after Nov 5th occur at 04:0x, while the ones from Nov 1st to Nov 5th are at 04:2x (the two October shutdowns were also at 04:0x).
Fri, May 16th. Talked to Gordon...we were looking at the cron jobs on the quacker...apparently it is doing a bunch of stuff which it doesn't need to do...these jobs all start at around 4am. Here are details on what I did:

Cron Logfile:

more /var/log/cron
May 11 04:05:00 quacker CROND[25970]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:10:00 quacker CROND[26029]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:10:00 quacker CROND[26030]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:15:00 quacker CROND[26033]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:20:00 quacker CROND[26036]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:20:00 quacker CROND[26037]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:22:00 quacker CROND[26040]: (root) CMD (run-parts /etc/cron.weekly)
May 11 04:22:00 quacker anacron[26044]: Updated timestamp for job `cron.weekly' to 2003-05-11
May 11 04:25:00 quacker CROND[2017]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:30:00 quacker CROND[2021]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:30:00 quacker CROND[2022]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:35:00 quacker CROND[2025]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:40:00 quacker CROND[2028]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:40:00 quacker CROND[2029]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:45:00 quacker CROND[2033]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:50:00 quacker CROND[2036]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 04:50:00 quacker CROND[2037]: (root) CMD (/usr/lib/sa/sa1 1 1)
May 11 04:55:00 quacker CROND[2040]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
May 11 05:00:00 quacker CROND[2043]: (root) CMD (/usr/bin/mrtg /etc/mrtg/mrtg.cfg)
..etc..etc...

Note that this "mrtg" job is running every 5 minutes.
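The five-minute spacing in the log corresponds to a crontab minute field of the form `0-59/5`, which is how the mrtg line in /etc/crontab is written. As a rough illustration of how such a range/step field selects minutes, here is a minimal sketch; the function names are hypothetical, and only this one field form is handled (real cron accepts lists, `*`, and more).

```c
#include <assert.h>
#include <stdio.h>

/* Return 1 if `minute` is selected by a crontab minute field of the
   form "lo-hi/step" (for example "0-59/5"), otherwise 0. */
int minute_matches(const char *field, int minute)
{
    int lo, hi, step;

    assert(field != NULL);
    if (sscanf(field, "%d-%d/%d", &lo, &hi, &step) != 3 || step <= 0)
        return 0;  /* not in the lo-hi/step form handled here */
    if (minute < lo || minute > hi)
        return 0;
    return (minute - lo) % step == 0;
}

/* Count how many minutes of one hour the field selects. */
int runs_per_hour(const char *field)
{
    int m, n = 0;

    for (m = 0; m < 60; m++)
        n += minute_matches(field, m);
    return n;
}
```

For `0-59/5` this selects minutes 0, 5, 10, ... 55, i.e. 12 runs per hour, which lines up with the 04:05, 04:10, 04:15 ... mrtg entries in the cron log.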
Here's the modified crontab file (where Gordon commented out the mrtg stuff). For more info about mrtg see the webpage: http://www.ntop.org/Monitoring.html (it looks like this is useful software for monitoring bandwidth on a network...something we don't need at all!).

---------------------------
more /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
# 0-59/5 * * * * root /usr/bin/mrtg /etc/mrtg/mrtg.cfg
---------------------------

Using the linconfig7.* files in /net/adm/linux/ (on the atd computer syrah) the following cron files on the quacker were changed to non-executable:

Cron files to disable:
-rwxr-xr-x 1 root root 1769 Jun 25 2002 /etc/cron.daily/dbbackup
-rwxr-xr-x 1 root root 418 Mar 25 2002 /etc/cron.daily/makewhatis.cron
-rwxr-xr-x 1 root root 197 May 24 2002 /etc/cron.daily/texpire
-rwxr-xr-x 1 root root 315 Feb 26 2002 /etc/cron.daily/tripwire-check
-rwxr-xr-x 1 root root 100 Apr 12 2002 /etc/cron.daily/tetex.cron
-rwxr-xr-x 1 root root 197 May 24 2002 /etc/cron.daily/texpire
-rwxr-xr-x 1 root root 414 Mar 25 2002 /etc/cron.weekly/makewhatis.cron
-rwxr-xr-x 1 root root 40 May 23 2002 /etc/cron.weekly/wwwoffle-purge

chmod -x /etc/cron.daily/dbbackup
chmod -x /etc/cron.daily/tripwire-check
chmod -x /etc/cron.daily/makewhatis.cron
chmod -x /etc/cron.daily/texpire
chmod -x /etc/cron.daily/tetex.cron
chmod -x /etc/cron.daily/texpire
chmod -x /etc/cron.weekly/wwwoffle-purge
chmod -x /etc/cron.weekly/makewhatis.cron

-rw-r--r-- 1 root root 1769 Jun 25 2002 /etc/cron.daily/dbbackup
-rw-r--r-- 1 root root 418 Mar 25 2002 /etc/cron.daily/makewhatis.cron
-rw-r--r-- 1 root root 197 May 24 2002 /etc/cron.daily/texpire
-rw-r--r-- 1 root root 315 Feb 26 2002 /etc/cron.daily/tripwire-check
-rw-r--r-- 1 root root 100 Apr 12 2002 /etc/cron.daily/tetex.cron
-rw-r--r-- 1 root root 197 May 24 2002 /etc/cron.daily/texpire
-rw-r--r-- 1 root root 414 Mar 25 2002 /etc/cron.weekly/makewhatis.cron
-rw-r--r-- 1 root root 40 May 23 2002 /etc/cron.weekly/wwwoffle-purge

ls -lag /etc/cron.daily/dbbackup /etc/cron.daily/tripwire-check /etc/cron.daily/makewhatis.cron /etc/cron.daily/texpire
ls -lag /etc/cron.daily/tetex.cron /etc/cron.daily/texpire /etc/cron.weekly/wwwoffle-purge /etc/cron.weekly/makewhatis.cron

Here is a complete listing of the remaining cron files (some others may also be unnecessary):

listing of cron files:
ls -lag /etc/cron*/*
lrwxrwxrwx 1 root root 28 Sep 5 2002 /etc/cron.daily/00-logwatch -> ../log.d/scripts/logwatch.pl
-rwxr-xr-x 1 root root 135 Apr 17 2002 /etc/cron.daily/00webalizer
-rwxr-xr-x 1 root root 276 Jun 24 2001 /etc/cron.daily/0anacron
-rwxr-xr-x 1 root root 51 Apr 15 2002 /etc/cron.daily/logrotate
-rwxr-xr-x 1 root root 104 Apr 18 2002 /etc/cron.daily/rpm
-rwxr-xr-x 1 root root 132 Jun 25 2001 /etc/cron.daily/slocate.cron
-rwxr-xr-x 1 root root 193 Apr 13 2002 /etc/cron.daily/tmpwatch
-rwxr-xr-x 1 root root 188 Apr 12 2002 /etc/cron.d/sysstat
-rwxr-xr-x 1 root root 278 Jun 24 2001 /etc/cron.monthly/0anacron
-rwxr-xr-x 1 root root 277 Jun 24 2001 /etc/cron.weekly/0anacron
-rw-r--r-- 1 root root 1769 Jun 25 2002 /etc/cron.daily/dbbackup
-rw-r--r-- 1 root root 418 Mar 25 2002 /etc/cron.daily/makewhatis.cron
-rw-r--r-- 1 root root 100 Apr 12 2002 /etc/cron.daily/tetex.cron
-rw-r--r-- 1 root root 197 May 24 2002 /etc/cron.daily/texpire
-rw-r--r-- 1 root root 315 Feb 26 2002 /etc/cron.daily/tripwire-check
lrwxrwxrwx 1 root root 65 Sep 5 2002 /etc/cron.daily/wwwoffle-full-index -> /var/spool/wwwoffle/html/search/htdig/scripts/wwwoffle-htdig-full
-rw-r--r-- 1 root root 414 Mar 25 2002 /etc/cron.weekly/makewhatis.cron
-rw-r--r-- 1 root root 40 May 23 2002 /etc/cron.weekly/wwwoffle-purge