BOUGHT:   19 Dec 87 Symmetric Computer Systems Inc (Calif.) (Bankrupt ~1989).
HARDWARE: NS32016 11.0592 MHz, Ethernet, 3 (+1) * 70M ST506 Discs,
          5.25"+3.5" Floppies, Parallel printer, 4 ttys,
          50M SCSI Cassette Streamer.
BINUTILS NEVER COMPILED (gprof ld size nm strip ar ranlib)

SYMMETRIC TECHNICAL HISTORY by JHS
--------------------------------------------------------------------------------
88-01-28
        Symmetric Model 375, Serial 375-8712-10134, delivered by Federal Express.
        +49 89 3113026 said I'll have to pay tax on $8970: 4.9% + 14%
        floppy drive mains cable: white goes to fuse, then switch
--------------------------------------------------------------------------------
FIXED:
        Bad blocks in original /lib/libc.a & /vmunix symbol table,
        pjc replaced these disc files from tape: 1989 approx
        * b 2 0 3 0
                gets to SWAP 0.0M, then reads disc forever, & kills floppy
                sn 5 bn 5, fixed with
        # disklabel /dev/rfdc0 floppy
                fd0a: 5 * 1024 63 labeloffset=800
        /usr/spool/lpj created
        mode & owner: /dev/rmt1 & nrmt1
        chmod < 755 > 6755 /etc/shutdown
        ln -s /usr/p/src/public/pdtar/tar.1 /usr/man/man1/pdtar.1
        ln -s /usr/p/src/public/pdtar/tar.5 /usr/man/man5/pdtar.5
--------------------------------------------------------------------------------
88-05-13
        I booted /vmunix.new
        su
        cd /usr
        nohup tar c . &
        ps lax /vmunix
                panic ......
                dumping to dev1 offset 1904
        (savecore count field dev 0,196 this error may have arisen earlier)
                dump device bad
                loading /boot
        I rebooted standard kernel, couldn't access tape drive with tar.
        Reboot with reset button, hung after printing sba0: book indicated
        this was calculating scsi address, assumed scsi hardware locked in
        bad state, powered off. Standard automatic reboot via power on,
        once again tape drive usable.
        With our own kernel we generated 3-89, when used tape drive got
                intlevelabort lppl 4 pc 4d91 addr 0 msr 1e70809Q
--------------------------------------------------------------------------------
90-04-02
        Sym wouldn't get beyond saying 4.3M swap,
        fsck wd0a I=1348
        fsck wd0c I=1348
        tarred root to tape 5 & verified
        insert my boot tape
        cd /mnt/tmp
        skip files (not using count=1)
        restore xvfh /dev/rmt8 stand etc
                only created dirs, not files
        restore xvf /dev/rmt8 stand etc
                only created etc/nulib dir, not files
        restore tvf /dev/rmt8
                shows files are on tape
        rm /mnt/restoresymtab
        rmdir stand etc/nulib etc
        restore xvf /dev/rmt8 stand etc
--------------------------------------------------------------------------------
notes on visit to Frank's
        ethernet 137.2 0x2000089 qa2
        Class A,b,c b==128,c==192
        Odysseus is Frank's leased Sun running 4.3
        /dev/rst0 for bsd 4.3 tapes
        /dev/rst8 for Apollo tapes
        login fl
        /usr/julian/src 17.8M uncompressed, 10.4M compressed
        Frank was using tar cvfBb20
        ftp 368641 bytes to Sym resulted in 1 more byte on Sym than Sun ARGH!
--------------------------------------------------------------------------------
90-10-11
        My reconfigurable kernel with correct timezone and 1.4M drive support
        has problems with tape driver: testblock -v, + find / results in
        1) /dev/console not echoing a couple of characters,
        2) then system hanging, HEXLED=6, necessitating a cold boot.
================================================================================
JHS Symmetric log typed in 901122
Dates start in 1st column.
Input commands to monitor or shell start in 2nd column, as do status
changes such as connect/disconnect, power on/off, dil strap changes;
shell commands are indicated with a #, monitor commands with a *.
Output from computer starts in 3rd column.
Where a significant conclusion/lesson is learnt, it is marked "LESSON:"

DATES | INPUT | OUTPUT FROM COMPUTER
901118  FAULT DEBUG AFTER CONNECTING JORDAN'S MICROPOLIS DISC TO SYMMETRIC
        WITH MY MINISCRIBE AS INTERNAL DRIVE 0 (wrongly, with attenuators on
        both; cable screw-ups + disconnects caused bad writes to be reported
        on /dev/console).
        Old winchester (Miniscribe 85) on external cable, Micropolis not connected.
                MON 1.11
                boot error (no_id)
                E UND
                *
        Old winchester moved back to internal ribbon.
        Normal boot:
                sba0
                sba0:
                swap 4.3M
                wd0a: hard read error sn 2348 bn 4696 status 59
                dev = 0x0 ino=0 fs=/
        Boot Single User From Floppy
        * b 2 0 3 0
        # bad144 -s /dev/wd0a
                wd0c: sn 2348 bn 4696 status 59 error 40
                2348 2349
                wd0c: sn 34896 bn 69792 status 59 error 40
                34896 34897
        # ls
                vmunix, vmunix.scs, vmunix.jhs
        exit
        # bad144 -s /dev/rwd0h
                2348 34896
        # cat /mnt/etc/showbadblocks
                #!/bin/sh
                for m in wd0a wd0h
                do
                        b=`bad144 $m`
                        i=`icheck -bn $b $m`
                        echo "Files on "$m" overlaying current bad blocks"
                        ncheck -i $i $m
                done
901118
        # icheck -b 2348 2349 34896 34897 /dev/wd0a
        (sector nos. above correspond to block nos.
such as 4696 41568) # bad144 -s /dev/wd0h wd0c:hard read error, sn 2348 bn 4696 status 59 error 40 2348 wd0c:hard read error, sn 2348 bn 4696 status 59 error 40 2349 wd0c:hard read error, sn 11825 bn 23650 status 59 error 40 11824 11825 wd0c:hard read error, sn 34896 bn 69792 status 59 error 40 34896 34897 # bad144 -s /dev/rwd0b 2348 11825 34896 901119 # bad144 -S -f /dev/rwd0c wd0c:hard read error, sn 2348 bn 4696 status 59 error 40 2348 34896 # bad144 -S /dev/rwd0c 2348 34896 # bad144 -S /dev/rwd0c 2348 34896 LESSON: bad144 -S does not automatically remove bad sectors, despite what manual says. # bad144 -s -a -v -S /dev/rwd0c hard read error sn 2348 bn 4696 2348 hard read error sn 34896 bn 69792 34896 # bad144 /dev/rwd0c 814 860 14252 14540 14680 14684 60454 60598 84854 901119 Cold Reboot # bad144 /dev/rwd0c Tested to see if disc was actually writing # mount /dev/wd0a /mnt # ls /etc > /mnt/write_test # sync # halt -n * b 2 0 3 0 # mount /dev/wd0a /mnt # cat /mnt/write_test Concluded writing OK # bad144 -a -v /dev/rwd0c 2348 34896 Had 9 sectors, adding 2 copying 73710 to 73708 zeroing 73709 copying 73711 to 73710 copying 73712 to 73711 copying 73713 to 73712 copying 73714 to 73713 copying 73715 to 73714 copying 73716 to 73715 zeroing 73716 write badsect file at 73719 73721 73723 73725 73727 901119 # bad144 /dev/rwd0c 814 860 4696 bn, corresponds to sn 2348 14252 14540 14680 14684 60454 60598 69792 bn, corresponds to sn 34896 84854 LESSON: -S does not work, must specify sectors manually on the command line # bad144 -s /dev/rwd0c no errors reported # icheck -bn 4696 69792 /dev/wd0a # mount /dev/wd0a /mnt # cd /mnt/etc # showbadblocks > /tmp/show.rslt & # sync # fsck /dev/rwd0a I=480 NAME=/bin/ar SIZE=0 REMOVE ? y I= NAME=/bin/true SIZE=0 REMOVE ? y I= NAME=/bin/cat SIZE=0 REMOVE ? y I= NAME=/bin/cc SIZE=0 REMOVE ? y I= NAME=/bin/chmod SIZE=0 REMOVE ? y I= NAME=/bin/chgrp SIZE=0 REMOVE ? y I= NAME=/bin/cmp SIZE=0 REMOVE ? y I= NAME=/bin/date SIZE=0 REMOVE ? 
y Free inode count wrong in superblock fix y bad cylinder groups salvage y # fsck /dev/rwd0a # fsck /dev/rwd0h size = 0 remove y /genix/src/cmd/sccs/lib/sid_ab.c.Z /p/src/vsl/lft/code/tools link count dir i=824 count=7 should be 6 unref file=4680 recon several more free inode count adjust bad cyl groups salvage y # fsck /dev/rwd0h bin/cmp vmunix vmunix.scs dev = 0x0 ino=38, fs=mnt panic ifree freeing free inode syncing disc This crash was probably because floppy fs holding cmp was damaged. Auto reboot Manually restored from root dump: /bin/ar,true,cat,cc,chmod,chgrp,cmp,date, 901120 Attempt to format Micropolis Miniscribe diconnected (except power & LED) Micropolis on external socket with earth strap Autoloading /vmunix boot error (no_id) E UND * b 2 0 3 0 boot error (no_id) E UND Micropolis power off * b 2 0 3 0 booted up single user ok LESSON: leave power off as monitor tries to access winchester before floppy, getting confused. # dd if=/dev/rwd1a of=/dev/null count=10 wd1: read label status 59 error 10 # format drive 1 cylinders 1024 tracks/cylinder 8 sectors/track 9 (Ive tried more this is max) bytes/sector 1024 pre comp CR size of a 50 offset a 0 size b 0 c 0 d 0 e 0 f 0 g 0 h 0 interleave CR head switch,sectors(3):CR (what is this ?) format or verify f start cyl 0 end cyl 99 start track 0 en track 7 install bootstrap n (cos. read error on bootstrap with y) pack label jordan drive serial no 9999 use ecc y bytes in gap LESSON: from chuck: each winch has its own data lines, not daisy chained ! (only control is daisy) need to connect correct data lines. Got Miniscribe working on /dev/wd1a external socket, mounted filesystem, powered with new pack, & earth lead. Jordan says drive last used as 8 head 1024 b/sec by michael. 901120 18:00 Tried to format Micropolis on I.'s AT, he thought he'd done it, but on returning I saw the strap on the DS2 selector, so cant have! (also wasnt slow enough). Tried to use dd & disklabel - failed. 
901121 * b 3 0 3 wd3a:/stand/format Micropolis power on., no W1 or W2 straps in place, DS2 strap in. format drive 1 cylinders 1024 tracks/cylinder 8 sectors/track 9 (Ive tried more this is max) bytes/sector 1024 pre comp CR size of a 50 offset a 0 size b 0 c 0 d 0 e 0 f 0 g 0 h 0 interleave CR head switch,sectors(3):CR (what is this ?) format or verify f start cyl 0 end cyl CR start track 0 en track 7 install bootstrap n (cos. read error on bootstrap with y) pack label "jhs 901121 12:48" drive serial no 1 use ecc y bytes in gap (3) CR Cylinder 1023 writing empty bad-sector tables at sector 73719 #wd3a/vmunix LESSON: learnt that stand alone formatter works ok # disklabel /dev/rwd1c error 0 # bad144 /dev/rwd1c bad pack magic number (pack is unlabelled) # disklabel wd1 miniscribe85 jordan /stand/bootwd write error # disklabel /dev/rwd1a miniscribe85 jordan /stand/bootwd # disklabel /dev/rwd1a lists plausible information # bad144 /dev/rwd1c lists nothing # bad144 -s /dev/rwd1c no errors reported Top of drive visible list: Hd Cyl BFIND 0 1022 8982.. 5 978 9981 1 877 9449.. 5 1023 6366 2 1015 455.. 6 1021 3901 3 619 7744. 7 321 8032.. 5 595 5102 7 394 8840.. # dd if=/dev/wd1c of=/dev/null 147456+0 records in & out note 8*9*1024=73728, 73728 * 2 = 147456 # sync # halt Jordans disc connected as external,+ scs/miniscribe on internal Booted _____r (and possibly write errors) disconnected jordans drive wd0: read error sector 92 status 0x59 sector 0x1 bn 184 directory read error Boot Symmetric 32000 Monitor (1.11) 375(root) 1/11/88 23:34:34 E SVC b 2 0 3 0 boot error (abort) E UND Disconnected my winchester, couldnt get * prompt with reset. 
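The record count noted for the dd read of /dev/wd1c (8*9*1024=73728, 73728 * 2 = 147456) can be rechecked with a minimal sketch. It assumes the geometry typed into the formatter (1024 cylinders, 8 tracks/cylinder, 9 sectors/track, 1024 bytes/sector) and dd's default 512-byte record; the function name is made up for illustration.

```python
def dd_records(cylinders, tracks_per_cyl, sectors_per_track,
               bytes_per_sector=1024, dd_block=512):
    """Return (total sectors, dd records) for a whole-disc read."""
    total_sectors = cylinders * tracks_per_cyl * sectors_per_track
    total_bytes = total_sectors * bytes_per_sector
    # dd counts records in units of its block size (512 bytes assumed here)
    return total_sectors, total_bytes // dd_block


sectors, records = dd_records(1024, 8, 9)
print(sectors, records)
```

A full "records in" count equal to the second number suggests the whole c partition was read from end to end.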
Connected Jordan's, left mine disconnected, gto * prompt * b 2 0 3 0 loading wd3a:/boot Boot:/vmunix wd3:read error:sector 8, status 0x59, error 0x10 # bad144 -s /dev/rwd0c wd0c hard read error sn 92 bn 184 error1 92 # bad144 -a -v /dev/rwd0c 92 # fsck /dev/rwd0a Inode 2 Dir Salvage Inode 3 Dir unreferenced reconnect y SORRY NO lost+found directory 1345 896 fsck then reported lots more damage on root # fsck /dev/rwd0h - all OK # fsck /dev/rwd0a - HUNDREDS OF FIXES !, no lost+found directory, so everything lost Emptied the root ! Removed Jordan's attenuator, Started to rebuild root # newfs -N -v /dev/rwd0a miniscribe85 # mount /dev/wd0h /mnt # mount /dev/wd0a /mnt2 # rm /tmp ; ln -s /mnt/tmp /tmp cd /mnt2 /mnt/ucb/zcat /mnt/p/adm/distrib_tape/root_fs.Z | restore rf - sync fsck /dev/rwd0a OK halt RESET loading /boot Bad Format *b 3 0 3 wd0a:/vmunix ^D login root cd /usr/p/adm/bak tar cf - . | ( cd / ; tar xf - ) reboot cd /dev MAKEDEV all labels disklabel /dev/rwd0c miniscribe85 1 /stand/bootwd THIS SUCCEEDED IN INSTALLING BOOT ON WD reboot mv hosts.equiv hosts.equiv.orig mv termcap termcap.orig termcap 90-11-22 Changed jordan's strap to ds2 from ds1 observed no straps on w1 & w2 removed resistor pack on Jordans drive Jordan power on, Sym power on booted no problems. mkdir /p2a /p2b /p2h mount /dev/wd1a /p2a error from winch cant remember halt b swap 4.3M wdoa: hard read error sn 6988 bn 13976 status 59 error 40 Printer daemon started wd0h: hard read error sn 55664 bn 83104 dev=0x7,ion=0 fs=/asr panic ifree: freeing inode. syncing disk loadimg wd0a:/boot wd0:read error sector 28 status 0x59 error 0x40 cant read root inode boot failed. 
Tried to boot of floppy b 2 0 3 0 Failed disconnected Jordan's floppy Tried to boot of floppy, failed Disconnected my winchester control plug * b 2 0 3 0 1 # bad144 -s /dev/rwd0cA hard read error sn 28 bn 56 28 sn 6988 bn 13976 6988 sn 55664 bn 111328 55664 sn 64912 bn 129824 64912 # mount /dev/wd0a /mnt # cd /mnt/etc wd0a: hard read error sn 28 bn 56 panic Reset * b 2 0 3 0 # bad144 -a -v /dev/rwd0c 28 6988 55664 64912 # fsck /dev/rwd0a Phase 2 ROOT INODE UNALLOCATED TERMINATING # fsck /dev/rwd0h unallocated inode SIZE=0 Remove y /bin/minix1.1 /genix/src/cmd/ddt/bpt.h.Z parse.c.Z atrun.c.Z basename.c.Z /lib/libsrc/f..cmp.c.Z /libplot/t450 /p/src/local/nroff/cmdtab.c.Z /p/src/vsl/lft/doc/lft.mac.Z lfta.rof.Z lftie.rof.Z lfts.rof.Z lots of files reconnected Reconnected Jordans Winch # halt -n * b 2 0 3 0 hung disconnected jordan's winch * b 2 0 3 0 # mount /dev/wd0a /mnt # ls /mnt - empty ! # halt -n * b 2 0 3 0 # newfs -N -v /dev/rwd0 miniscribe85 # mount /dev/wd0a /mnt # mount /dev/wd0h /mnt2 # ls /mnt /ucb /z* panic ifree freeing free inodes boot failed * b 2 0 3 0 # fsck /dev/rfd0 # halt -n * b 2 0 3 0 # cp /dev/null /etc/mtab # mount /dev/wd0a /mnt ; mount /dev/wd0h /mnt2 # cd /mnt dev=0x0 ino=2 fs=/mnt panic ifree freeing free inodes * b 2 0 3 0 # fsck /dev/rfd0 - No errors # newfs -v /dev/rwd0a miniscribe85 90-11-22 14:57 # cp /dev/null /etc/mtab # mount /dev/wd0a /mnt # mount /dev/wd0h /mnt2 # cd /mnt ; ls # ln -s /mnt2/tmp /tmp # /mnt2/ucb/zcat /mnt2/p/adm/dis*/roo*Z | restore xf - # halt * b Auto reboot in progress /dev/wd0a: 304 files wd0h: hard read error sn 18768 bn 9312 /dev/rwd0h: cannot read blk 9312 unexpected incosistency, run fsck manually # bad144 -s /dev/rwd0c sn 18768 bn 37536 18768 # halt * b 2 0 3 0 # icheck /dev/rwd0h -bn 18768 didnt understand what it said # bad144 -a /dev/rwd0c 18768 # fsck /dev/rwd0h cg 2:bad magic number, 4 blocks missing, bad cyl groups, salvage y # fsck /dev/rwd0h OK no errs # mount /dev/wd0h /mnt ; cd /mnt # tar cBf 
/dev/rcst2 . & to tape 12 when power to jordans drive off, all accesses to miniscribe winch fail. tar failed to stat many files, (even though superuser) # halt * b 2 0 3 0 # disklabel /dev/rwd0c OK # disklabel /dev/rwd1c hard read error sn 0 bn 0 # dd if=/dev/rwd1c of=null bs=1024 hard read error ... Power OFF, ON, Jordans on 0.5 secs after scs Autoboot RC/fsck said Cannot read block 56 run fsck manually # halt * b Cannot read block 56 run fsck manually The 12V supply to the microscribe is 12.65 V (the new supply is 10.85V) # fsck -y /dev/rwd0h loads of errors - Phase 2 unallocated root inode * b 3 0 3 Formatted my drive just like jordans, except pack label "901122 julian" drive serial number 0 started at 21:00, finished before 22:35 # disklabel /dev/rwd0c miniscribe85 "901122 julian" /stand/bootwd write error 0 # disklabel /dev/rwd0c miniscribe85 "901122 julian" /stand/bootwd # sync # bad144 -s /dev/rwd0c bad pack magic number (pack is unlabelled) # disklabel /dev/wd0a miniscribe85 0 /stand/bootwd # disklabel /dev/wd0a info shown # disklabel /dev/wd0c info shown powered up & connected jordans drive # disklabel /dev/wd0h pack is unlabelled # disklabel /dev/wd0h miniscribe85 # disklabel /dev/wd1a hard read error # disklabel /dev/wd1a miniscribe85 1 /stand/boot wd1a: hard write error ^C halt * b 3 0 3 wd3a:/stand/format on jordans first 5 cylinders # disklabel /dev/wd1a miniscribe85 1 /stand/bootwd wd3a:/vmunix # disklabel /dev/wd1a miniscribe85 1 /stand/bootwd # disklabel /dev/wd1h miniscribe85 wd1h: hard write error. 
# bad144 -s /dev/rwd0c wd0c soft ecc error sn 30227 bn 60454 sn 30299 bn 60598 # bad144 showed nothing < more dosklabels, newfs, restore in this area> # dd if=/dev/rwd0c of=/dev/null bs=512 count=100 skip=60450 invalid argument, takes a long time # dd if=/dev/rwd0c of=/dev/null bs=1024 count=100 skip=30210 # bad144 -a -v /dev/rwd0c 30227 30299 # dd as above no error reported thus proved not necessary to reboot, despite man 8 bad144 saying it is necessary to reboot before it takes effect # newfs -v -S 1024 -m 5 /dev/wd0a miniscribe 85 wd0a hard write error sn 9791 bn 19582 # bad144 -s /dev/rwd0a OK, no errors # newfs -v -S 1024 -m 5 /dev/wd0a miniscribe 85 wd0a hard read error sn 24 bn 48 Jordans power off # newfs -v -S 1024 -m 5 /dev/wd0a miniscribe 85 wd0a hard write error sn 9791 bn 19582 disconnected jordans drive # newfs -v -S 1024 -m 5 /dev/wd0a miniscribe 85 ran ok no errors # halt * b 3 0 3 wd3a:/stand/format connected micropolis & powered up drive: 1 install bootstrap:n pack label:"001123 02:11" drive serial no:1 ecc:y cylinder 1023 writing bad sector tables at sector 73719 * wd3a:/vmunix # disklabel /dev/wd1a miniscribe85 /stand/bootwd # disklabel /dev/wd1b miniscribe85 # disklabel /dev/wd1c miniscribe85 shows plausible table # disklable /dev/wd1h Bad pack magic number (pack is unlabeled) # disklable /dev/wd1h miniscribe85 # bad144 -s /dev/wd1c wd1c: soft ecc sn 61273 bn 122546 # dd if=/dev/rwd1c of=/dev/null bs=1024 skip=61260 count=30 wd1c: soft ecc sn 61273 bn 122546 # bad144 -a -v /dev/rwd1c 61273 had 0 bad sectors, adding 1 zeroing 73718 write badsect file at 73719 73721 73723 73725 73727 # bad144 -s /dev/rwd0c no errors reported Removed micropolis, added W2 strap # dd if=/dev/rwd0a bs=1024 of=/dev/null wd0: read label: status 59 error 10 wd0a: hard read error sn 0 bn 0 status 59 error10 read: I/O error 0+0 records in 0+0 records out # dd if=/dev/rwd1a bs=1024 of=/dev/null wd0: recal: status 1 error 4 wd1a:hard read error sn 0 bn 0 status 59 error10 
read: I/O error 0+0 records in 0+0 records out # halt Symmetric Power Off Discovered I had forgotten micropolis power, connected it * b 2 0 3 0 # dd if=/dev/rwd0a bs=1024 of=/dev/null count=100 100+0 records in 100+0 records out # dd if=/dev/rwd1a bs=1024 of=/dev/null count=500 500+0 records in 500+0 records out # dd if=/dev/rwd0a of=/dev/rwd1a bs=1024 9792+0 records in 9792+0 records out Examined tape headers tape 3 Sat nov 17 03:00 tape 2 fri nov 16 03:00 tape 1 tue nov 13 16:47 # dd if=/dev/nrcst2 # newfs -v -S 1024 -m 5 /dev/wd1h miniscribe85 # mount /dev/wd1h /mnt2 # cd /mnt2 # tar xfB /dev/rcst2 & ../julian files couldnt be restored (because many megabytes cant fit on a floppy) # newfs -v -S 1024 -m 5 /dev/wd1a miniscribe85 # newfs -v -S 1024 -m 5 /dev/wd1b miniscribe85 - - - - - - - - - - # cd / # restorefmtape (1) Init wd0a seemed to run OK (2) 1804+7 records in 1804+0 records out <------ SUSPICIOUS Warning: ./lost+found: File exists (3) Init wd0h seemed to run OK (4) Extract usr fs from tape Mount vol 2 then type return CR Mount vol 3 then type return CR <--- TAPE CONTENT != WHAT SH SCRIPT EXPECTS # halt * b 2 Running on miniscribe root # disklabel /dev/wd0a display looks plausible # disklabel /dev/wd0b display looks plausible # disklabel /dev/wd0c display looks plausible # disklabel /dev/wd0h display looks plausible # disklabel /dev/wd1a display looks plausible # disklabel /dev/wd1b display looks plausible # disklabel /dev/wd1c display looks plausible # disklabel /dev/wd1h display looks plausible Later /dev/wd0b currently containing root, use tunefs to 5% # newfs -v -m 5 /dev/wd0b miniscribe85 # newfs -v -m 5 /dev/wd0h miniscribe85 # newfs -v -m 5 /dev/wd1a miniscribe85 # newfs -v -m 5 /dev/wd1b miniscribe85 Later /dev/wd1h currently containing bits of /p, use tunefs to 5% dd if=/dev/nrcst2 of=/dev/null files=9 9542+10 records in 9543+1 records out restor rf /dev/rcst2 - - - - - - - - - - 901127 17:30 (aprox) Installed a 3 disc system that works (at last) 
        Previously I had installed a DS3 data cable that shorted all the
        pins between the WD controller and a micropolis drive (I had not
        realised there was an earth plane on the back of the ribbon);
        once the earth plane was removed, everything worked OK.
        bad144 -s /dev/wd[02][abh]
        started 901127 18:34
        bad144 -s /dev/rwd[0-2]A[abh]
                wd0c: soft ecc sn 42427 bn 84854
                wd0c: soft ecc sn 3551 bn 7102
                wd0c: soft ecc sn 42427 bn 84854
                wd0h: soft ecc sn 42427 bn 56630
                wd0h: soft ecc sn 42427 bn 56630
                wd0c: soft ecc sn 42427 bn 84854
                wd0h: soft ecc sn 42427 bn 56630
                wd0h: soft ecc sn 42427 bn 56630
        with drive 2 mounted:
        newfs /dev/wd2a miniscribe85 /stand/bootwd
        newfs /dev/wd2b miniscribe85
        newfs /dev/wd2h miniscribe85
        /root (all bar dev) tar copied to /wd2a
        /usr tar copied to /wd2h
901128  Normal operation (restoring damaged files, checking everything,
        shuffling files about).
                wd0a: soft ecc sn 3551 bn 7102
901128 17:59
                wd0a: soft ecc sn 3551 bn 7102
       22:00
                wd0h: soft ecc sn 42427 bn 56630
================================================================================
901129 ~22:30
        Preparing to remove Jordan's drive (with bulk disc data erase),
        and replace by one of mine:
        umount of wd1[abh]
        newfs wd1[abh]
        mount of wd1[abh]
        testblock of /wd1[abh]
        removal of Jordan's drive, replacement with seagate,
        which reported bad sector table scrambled,
        LATER: a reformat should clear this sometime.
901129 23:50 reboot Discovered (by using mount) that my badly edited new /etc/fstab had mounted wd0h, wd1h, wd2h, all succesively on top of each other not sure if ive done a reboot allowing this fstab to work its ruin, but for the record, the datestamp of /etc/fstab was -rw-r--r-- 1 root 169 Nov 28 11:04 /etc/fstab.nasty and contents were: /dev/wd0a:/:rw:60:1 /dev/wd0h:/usr:rw:30:2 /dev/wd1a:/:rw:60:3 /dev/wd1b:/:rw:60:4 /dev/wd1h:/usr:rw:30:2 /dev/wd2a:/:rw:60:3 /dev/wd2b:/:rw:60:4 /dev/wd2h:/usr:rw:30:2 Mount did not show multiple 1a or 2a mounted on root (thank goodness) I also did newfs /dev/rwd1a miniscribe85 /stand/bootwd newfs /dev/rwd1b miniscribe85 newfs /dev/rwd1h miniscribe85 and discovered newfs doesnt complain about unreadable boot files, so /dev/wd2a needs testing, as it may not actually have a valid primary boot, wrong syntax not detected was newfs /dev/rwd1a miniscribe85 /stand/wdboot bad144 -s /dev/rwd1c > ~julian/tmp/rwd1c_bad144.log was started log file after: 0 size rwd1c: soft ecc error sn23033 bn 46066 rwd1c: soft ecc error sn41985 bn 83970 rwd1c: soft ecc error sn52208 bn104416 rwd1c: soft ecc error sn67756 bn135512 wd1c: soft ecc error sn52208 bn104416 sh: bad144 -s /dev/rwd1c 2>> ~julian/tmp/rwd1c_bad144.log was started log size still 0 To try and sort out the file overlay mess, the following was done: cd /wd1h ; tar x (with tape 12) read 5M then failed, try again later cd /wd1h/tape_12 ; tar xBf /dev/rmt8 (with tape 12) cd /wd2h ; find . -type f -exec \ /usr/p/julian/bin/equ_rm /usr . {} \; cd /wd2a ; find . -type f -exec \ /usr/p/julian/bin/equ_rm / . {} \; bad144 -a /dev/rwd1c 23033 41985 52208 67756 bad144 /dev/rwd1c 46066 83970 104416 135512 newfs /dev/rwd1a miniscribe85 /stand/bootwd newfs /dev/rwd1b miniscribe85 newfs /dev/rwd1h miniscribe85 mount /dev/wd1a /wd1a mount /dev/wd1b /wd1b mount /dev/wd1h /wd1h wd0h: soft ecc error sn42427 bn56630 diff /usr/adm/shutdownlog shutdownlog > t < 22:52 Thu Nov 29, 1990. Halted. 
> 23:37 Thu Nov 29, 1990. Halted. ie wd0ah log did not know of the 23:37 shutdown previous entry for both wd0h and wd2h was 07:07 Tue Nov 27, 1990. Halted. wd2h:~adm/txt/technical.log held nothing that this file did not also hold wd0h: soft ecc error sn42427 bn56630 wd1h: soft ecc error sn41985 bn55746 901204 wd0h: soft ecc error sn42427 bn56630 901212 halt power on symmetric left on extension chassis power off drive 1 bearing contact bent to avoid squeak extension chassis power on b /dev/rwd1h: unref dir i=11672 owner=julian mode=40755 size=512 mtime=dec 8 11:51 1990 (reconnected) dir i=11672 connected. parent was i=4617 link count dir i=4617 owner=root mode=40755 size=512 mtime=dec 9 22:39 1990 count 3 should be 2 (adjusted) link count dir i=11672 owner=julian mode=40755 size=512 mtime=dec 8 11:51 1990 count 1 should be 2 unexpected inmconsistency; run fsck manually halt b /dev/rwd1h BAD INODE NUMBER FOR '..' I=11672 OWNER=julian MODE=40755 SIZE=512 MTIME=Dec 8 11:51 1990 DIR=/lost+found/#11672 cd /etc ; mv fstab fstab.moved ; cp fstab.single fstab halt b Up multi user on one disc. fsck /dev/rwd1h Phase 2 - Check Pathnames BAD INODE NUMBER FOR '..' I=11672 OWNER=julian MODE=40755 SIZE=512 MTIME=Dec 8 11:51 1990 DIR=/lost+found/#11672 FIX y DIRECTORY CORRUPTED I=926 OWNER=root MODE=40755 SIZE=12288 MTIME=Dec 9 11:30 1990 DIR=/lost+found SALVAGE y Phase 4 - Check Reference Counts LINK COUNT DIR I=11672 OWNER=julian MODE=40755 SIZE=512 MTIME=Dec 8 11:51 1990 COUNT 1 SHOULD BE 2 ADJUST y fsck /dev/rwd1h DIRECTORY CORRUPTED I=926 OWNER=root MODE=40755 SIZE=12288 MTIME=Dec 9 11:30 1990 DIR=/lost+found SALVAGE y fsck /dev/rwd1h DIRECTORY CORRUPTED I=926 OWNER=root MODE=40755 SIZE=12288 MTIME=Dec 9 11:30 1990 DIR=/lost+found SALVAGE y newfs /dev/rwd2h miniscribe85 mount /dev/wd1a /wd1a mount /dev/wd1b /wd1b mount /dev/wd1h /wd1h mount /dev/wd2a /wd2a mount /dev/wd2b /wd2b mount /dev/wd2h /wd2h (cd /wd1h ; ~local/pdtar -c -b 4 -f - . 
) | ( cd /wd2h ; tar xplf - ) du -s /wd1h /wd2h wd1h 38435 wd2h 38426 cd /wd1h find . -type f -exec ~local/cmp -d {} /wd2h \; find . -type l -exec rm {} \; find . -type d -exec rmdir {} \; several times nothing left (as expected) umount /dev/wd1h umount /dev/wd1b umount /dev/wd1a newfs -v -m 5 /dev/rwd1h miniscribe85 newfs -v -m 5 /dev/rwd1b miniscribe85 newfs -v -m 5 /dev/rwd1a miniscribe85 /stand/bootwd bad144 -a /dev/rwd1c 16522 23033 24039 41985 52208 53553 67756 72355 bad144 /dev/rwd1c 33044 46066 46066 48078 83970 83970 104416 104416 107106 135512 135512 144710 bad144 /dev/rwd1c 123456789 901226 ~22:00 bad144 /dev/rwd1c just next prompt came, thus this failed to write 123456789 as drive serial number bad144 /dev/rwd1h bad144 -s /dev/rwd1c nothing reported on control screen, but dev/console received wd1c: soft ecc sn 24039 bn 48078 bad144 -a -f -c /dev/rwd1c 24039 bad144 /dev/rwd1c 48078 fsck /dev/wd1h OK fsck /dev/wd1b OK fsck /dev/wd1a OK cat /etc/showbadblocks cd /dev sh b=`bad144 wd0h` echo $b 32230 32374 i=`icheck -bn $b wd0h` 7196 32230 arg; frag 3 of 4, inode=0, class=free block ncheck -i 0 wd0h <-- syntax from 4.3 book size manual ncheck: cannot open /dev/r0 full list of all inodes on /usr followed ! ncheck -i wd0h 0 long wait full list of all inodes on /usr followed ! 
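The sn/bn pairs in the console messages and the bad144 listings so far fit one simple pattern: the bn is twice the sn, minus the start of the partition. The sketch below assumes 1024-byte sectors counted in 512-byte blocks. The h partition offset of 28224 blocks is inferred from the logged pairs (for example wd0c sn 42427 bn 84854 against wd0h sn 42427 bn 56630), not read from /etc/disktab, and all names are hypothetical.

```python
BLOCKS_PER_SECTOR = 2  # assumed: 1024-byte sector / 512-byte block

# Partition start in 512-byte blocks; the "h" value is inferred from the log.
PART_OFFSET_BLOCKS = {"a": 0, "c": 0, "h": 28224}


def sn_to_bn(sn, partition="c"):
    """Convert a whole-disc sector number to a partition-relative block number."""
    return sn * BLOCKS_PER_SECTOR - PART_OFFSET_BLOCKS[partition]


for sn in (16522, 23033, 24039):
    print(sn, sn_to_bn(sn))          # compare with the bad144 /dev/rwd1c listing
print(sn_to_bn(42427, "h"))          # compare with the wd0h soft ecc messages
```

Under the same assumption, the two numbers printed by `bad144 wd0h` (32230 32374) are the h-relative forms of the whole-disc entries 60454 and 60598.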
901213 ~julian/bin/u2 newfs -v -m 5 /dev/rwd2h miniscribe85 newfs -v -m 5 /dev/rwd2b miniscribe85 newfs -v -m 5 /dev/rwd2a miniscribe85 /stand/bootwd ~julian/bin/m2 910101 # umount /dev/wd1a # umount /dev/wd1b # umount /dev/wd1h # umount /dev/wd2a # umount /dev/wd2b # umount /dev/wd2h # tunefs -m 0 /dev/wd1a # tunefs -m 0 /dev/wd1b # tunefs -m 0 /dev/wd1h # tunefs -m 0 /dev/wd2a # tunefs -m 0 /dev/wd2b # tunefs -m 0 /dev/wd2h # halt * b 2 boot single user # tunefs -m 0 /dev/rwd0h # tunefs -m 0 /dev/rwd0a # halt -n halt no flush of superblock * b # halt Installed Seagate drive in place of /dev/wd1 * b 2 bad144 /dev/rwd1c scrambled table reported bad144 -f /dev/rwd1c 100 format errors in bad block area * b 3 Boot: wd0a:/stand/format Drive 1 Cylinders 1024 Tracks/cyl 9 Secs 9 Bytes/Sec 1024 Precomp (1023 if none) 1023 Partition A Size 0 same for b-h Interleave factor, n:1 (3): CR head switch, sectors (3): CR format or verify: f verify after y Install bootstrap: y read error on bootstrap pack label (up to 16 chars): "Julian H. 
Stacey" driver serial number 72696739 use ECC y format y/n y started 17:00 cabinet tempreature 26 centigrade Cylinder 810 wd1: read error: sector 65795 status 0x59 error 0x40 Cylinder 812, track 2, read error Cylinder 1023 1 track with errors writing bad sector table at sector 82935 finished 18:30 Power Off, term resistor pack removed, earth strap added Drive 1 Cylinders 1024 Tracks/cyl 9 Secs 9 Bytes/Sec 1024 Precomp (1023 if none) CR Partition A Size CR same for b-h Interleave factor, n:1 (3): CR head switch, sectors (3): CR format or verify: f verify after y Install bootstrap: n pack label (up to 16 chars): "ST4096 Stacey" driver serial number 72696739 use ECC y format y/n y Cylinder 1023 writing bad sector table at sector 82935 Drive -1 wd0a:/vmunix # disklabel /dev/wd1c seagate # disklabel /dev/wd1a seagate /stand/bootwd # newfs -v -m 0 /dev/rwd1a seagate /stand/bootwd For some reason couldnt write the disc pack (even after power off), so reformatted 1st 10 cylinders # newfs -v -m 0 /dev/rwd1a jhs /stand/bootwd # disklabel /dev/rwd1a jhs /stand/bootwd jhs|Seagate ST4096 96Mb:ty=st506:se#1024:nt#9:ns#9:nc#1024:\ :pa#82701:oa#0:\ :pb#0:ob#0:\ :pc#82944:oc#0:\ :ph#0:oh#0: see /etc/disktab file for more observations - - - - - - - - - 900102 Removal of bad blocks (ecc detected over the last month or two) /usr/adm/messages preened to produce:- Occurences Error 11 wd0a: soft ecc sn 3551 bn 7102 15 wd0h: soft ecc sn 42427 bn 56630 3 wd1h: hard read error, sn 72867 bn 117510 2 wd1h: soft read error, sn 71932 bn 115640 5 wd1c: hard read error, sn 65795 bn 131590 2 wd1c: soft ecc sn 67756 bn 135512 2 wd1h: soft ecc sn 53553 bn 78882 2 wd1h: soft read error, sn 73224 bn 118224 4 wd1h: soft read error, sn 71715 bn 115206 2 wd1c: soft ecc sn 23033 bn 46066 2 wd1h: soft read error, sn 73155 bn 118086 3 wd1c: soft ecc sn 24039 bn 48078 2 wd2h: soft ecc sn 22945 bn 17666 Seagate temporary disc: 2 wd1a: soft ecc sn 55043 bn 110086 2 wd1c: soft read error, sn 65795 bn 131590 - - 
- - - - # icheck -bn 7102 /dev/rwd0a # ncheck -i 509 /dev/rwd0a # cp /bin/passwd /bin/passwd.jhs bn 7102 error appears on /dev/console # halt * b 2 # bad144 -a -v -c /dev/rwd0a 3551 had 2 bad sectors, adding 1 copying 73717 to 73716 copying 73718 to 73717 copying 3551 to 73718 writing bad sector file at 73719 73721 73723 73725 73727 # halt -n * b fsck /dev/wd0a: Unreferenced file I=633 Mode=100600, Owner=Root, removed I looked in /lost+found, not there, - check root inode 633 from tape, /dev/wd0g, I probably deleted this when rebuilding system after last crash (11/12-90), so this inode 633 is just a random file .... forget it. - - - - - - # icheck -bn 56630 /dev/rwd0h # ncheck -i 12453 /dev/rwd0h # cat /usr/man/man2/sigvec.2 > /dev/null no errors on /dev/console, so assumed OK - - - - - - # icheck -bn 17666 /dev/rwd2h 23594 # ncheck -i 23594 /dev/rwd2h /wd2h/GNU/epoch/epoch-3.1.tar.Z # cp /wd2h/GNU/epoch/epoch-3.1.tar.Z /usr/tmp/poch31 no errors detected, so assumed OK - - - - - - # icheck -bn 110086 /dev/rwd1a free block # bad144 -a -v /dev/rwd1a 55043 had 0 bad sectors, adding 1 zeroing 82934 write badsect file at 82935 82937 82939 82941 82943 - - - - - - # icheck -bn 131590 /dev/rwd1a free block # bad144 -a -v /dev/rwd1a 65795 had 1 bad sectors, adding 1 zeroing 82933 write badsect file at 82935 82937 82939 82941 82943 - - - - - - 910103 The seagate drive is now ready for use, its empty, with bad blocks mapped out. - - - - - - # date > fastboot # halt Seagate removed, micropolis installed on wd1. 
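The "write badsect file at ..." and "zeroing ..." lines logged by bad144 -a are consistent with the bad sector table being kept as five copies on alternate sectors of the last track, with replacement sectors handed out downwards from just below that track. The sketch below works under that assumption only; the geometry values are those typed into the formatter and the function name is hypothetical.

```python
def bad_sector_layout(cylinders, tracks_per_cyl, sectors_per_track, copies=5):
    """Return (sectors holding the table copies, first replacement sector)."""
    total = cylinders * tracks_per_cyl * sectors_per_track
    last_track_start = total - sectors_per_track
    # assumed: one table copy on every second sector of the last track
    table = [last_track_start + 2 * i for i in range(copies)]
    # assumed: replacements are allocated downwards from below the last track
    first_replacement = last_track_start - 1
    return table, first_replacement


print(bad_sector_layout(1024, 8, 9))   # miniscribe85 geometry
print(bad_sector_layout(1024, 9, 9))   # Seagate ST4096 geometry as formatted
```

With 8 tracks/cylinder this gives the 73719..73727 table and the 73718 replacement seen on the Miniscribe and Micropolis; with 9 tracks/cylinder it gives 82935..82943 and 82934 as seen on the Seagate.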
        * b
        # fsck /dev/wd[12][abh]
                OK
        # ln /etc/fstab.triple /etc/fstab
        # bad144 -s /dev/rwd1c
910111  Removing 2 troublesome sectors
        /usr/adm/messages excerpt appended:
                Jan  5 01:40 wd0h: soft ecc sn 42427 bn 56630
                Jan  8 14:10 wd0h: soft ecc sn 42427 bn 56630
                Jan  8 16:40 wd0h: soft ecc sn 58574 bn 88924
                Jan 11 11:10 wd0h: soft ecc sn 58574 bn 88924
        # icheck -bn 56630 88924 /dev/wd0h
        # ncheck -i 12389 16565 /dev/wd0h
        # cp /julian/delivery/exe/old/rm.exe ~julian/tmp
        # cp /usr/julian/delivery/exe/old/rm.exe ~julian/tmp
        # cp /usr/man/man2/fsync.2 ~julian/tmp
        # halt
        * b
        % cmp ~julian/tmp/* appropriate_files ...
                all ok, no differences
910126  /etc/myname content changed from qa2 to vsl
                Jan 16 11:20 wd1h: soft ecc sn 41985 bn 55746
                Jan 28 01:40 wd1h: soft ecc sn 53553 bn 78882
                Feb 18 13:00 wd0h: soft ecc sn 30371 bn 32518
910214  seagate as drive 1, to transfer data via jkh
        While moving drives around, I believe a data socket with short pins
        on wd2 caused disc hard errors to accumulate.
910216
        # bad144 -s /dev/rwd0c
                7047 7048 7049 7050 7051 7052 7053 7054 7055
        (/dev/console reported hard read errs on sn 7047..7055, &
        bn 14094,14096,14098,14100,14102,14104,14106,14108,14110)
        /etc/disktab (miniscribe85) specifies that there are 9 sectors/track
        & 8 tracks/cylinder.
        Assuming cylinders are numbered 0 to 1023, & tracks 0 to 7:
        rwd0c   7047 / 9 = 783;
                783 / 8 = 97.875;
                97 * 8 * 9 = 6984;
                7047 - 6984 = 63;
                63 / 9 = 7;
        Thus need to format cylinder 97 track 7.
- - - - - - - - -
        # bad144 -s /dev/rwd2c
                30 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
        rwd2c cylinder 0 tracks 6 & 7 need reformatting, & sn 30 bn 60 to be
        patched out.
                sn 8 bn 16
                sn 9280 bn 18560
                sn 73719 bn 147438
                sn 73721 bn 147442
                sn 73723 bn 147446
                sn 73725 bn 147450
                sn 73727 bn 147454
        Note bad sector replacements are stored in 73719 73721 73723 73725 73727
        so now these are bad, it seems safest to dump data to tape, reformat,
        restore; I guess patching out could be dodgy.
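The cylinder/track arithmetic worked by hand above (sn 7047 on rwd0c, and sn 30 and 54..71 on rwd2c) can be repeated for any sector with a short sketch. It assumes the miniscribe85 figures quoted from /etc/disktab (9 sectors/track, 8 tracks/cylinder) and numbering from 0; the function name is hypothetical.

```python
def locate(sn, sectors_per_track=9, tracks_per_cyl=8):
    """Split an absolute sector number into (cylinder, track, sector)."""
    cylinder, rest = divmod(sn, sectors_per_track * tracks_per_cyl)
    track, sector = divmod(rest, sectors_per_track)
    return cylinder, track, sector


for sn in (7047, 7055, 30, 54, 71):
    print(sn, locate(sn))
```

Sectors 7047..7055 all land on cylinder 97 track 7, and 54..71 cover cylinder 0 tracks 6 and 7, which agrees with the conclusions reached by hand.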
# bad144 -a -v /dev/rwd2c 30 bad144: /dev/rwd2c: cant read bad block info # halt * b 2 ^D (reboot) # format drive 2 Install bootstrap ? y read error on bootstrap # disklabel /dev/rwd2c micropolis /stand/bootwd # disklabel /dev/rwd2h micropolis # disklabel /dev/rwd2h # newfs -v -m 0 /dev/rwd2h micropolis # bad144 -s /dev/rwd2c I seemed to lose space using the micropolis disktab entry, I only got slightly more for the whole disc, than for the H partition of a miniscribe, so for now ive reverted to miniscribe. # disklabel /dev/rwd2c miniscribe85 /stand/bootwd # disklabel /dev/rwd2a miniscribe85 # disklabel /dev/rwd2b miniscribe85 # disklabel /dev/rwd2h miniscribe85 # disklabel /dev/rwd2h # newfs -v -m 0 /dev/rwd2a miniscribe85 # newfs -v -m 0 /dev/rwd2b miniscribe85 # newfs -v -m 0 /dev/rwd2h miniscribe85 Alternate way of finding used bad sectors: # sh # nohup find / -type f -print -exec cp {} /dev/null \; Read nohup.out, confirmed manually: -rwxr-xr-x 1 root 1207 Apr 7 1987 nu.cf.debug* # cp /etc/nulib/nu.cf.debug /dev/null wd0a: hard read error sn 7052 bn 14104 -rwxr-xr-x 1 root 1274 Aug 9 1987 nu.cf.real* # cp /etc/nulib/nu.cf.real /dev/null wd0a: hard read error sn 7054 bn 14108 910226 # bad144 -l /dev/rwd0c 7102 60454 60598 84854 117148 910226 --------------------------------- # /etc/icheck -b numbers -bn numbers /dev/rwd0a # reported in /usr/adm/messages (produced by bad144 -s /dev/rwd0c): # err on sn 7047 7048 7049 7050 7051 7052 7053 7054 7055 # err on bn 14094 14096 14098 14100 14102 14104 14106 14108 14110 # icheck -bn 7047 7048 7049 7050 7051 7052 7053 7054 7055 /dev/rwd0a 505 505 505 505 505 505 505 505 506 # icheck -bn 14094,14096,14098,14100,14102,14104,14106,14108,14110 /dev/rwd0a 1442 # icheck -bn 14095,14097,14099,14101,14103,14105,14107,14109,14111 /dev/rwd0a 1442 # icheck -b 7047 7048 7049 7050 7051 7052 7053 7054 7055 /dev/rwd0a /dev/rwd0a: 7048 arg; frag 0 of 4, inode=505, class=logical data block 2 7049 arg; frag 0 of 4, inode=505, class=logical 
data block 2 7050 arg; frag 1 of 4, inode=505, class=logical data block 2 7051 arg; frag 1 of 4, inode=505, class=logical data block 2 7052 arg; frag 2 of 4, inode=505, class=logical data block 2 7053 arg; frag 2 of 4, inode=505, class=logical data block 2 7054 arg; frag 3 of 4, inode=505, class=logical data block 2 7055 arg; frag 3 of 4, inode=505, class=logical data block 2 7047 arg; frag 3 of 4, inode=506, class=logical data block 1 bad mode 1359 bad mode 1360 files 719 (r=353,d=32,b=54,c=85,sl=195) used 8124 (i=20,ii=0,b=1841,f=680) free 1311 (b=319,f=35) missing 0 # icheck -b 14094,14096,14098,14100,14102,14104,14106,14108,14110 /dev/rwd0a /dev/rwd0a: bad mode 1359 bad mode 1360 14094 arg; frag 0 of 1, inode=1442, class=logical data block 0 files 719 (r=353,d=32,b=54,c=85,sl=195) used 8124 (i=20,ii=0,b=1841,f=680) free 1311 (b=319,f=35) missing 0 # icheck -b 14095,14097,14099,14101,14103,14105,14107,14109,14111 /dev/rwd0a /dev/rwd0a: bad mode 1359 bad mode 1360 14095 arg; frag 0 of 1, inode=1442, class=logical data block 0 files 719 (r=353,d=32,b=54,c=85,sl=195) used 8124 (i=20,ii=0,b=1841,f=680) free 1311 (b=319,f=35) missing 0 910228 restored /etc/nulib/ nu.cf.debug & nu.cf.real from tape distribution, rm old copies , that contained 3 512 byte blocks with my BAD_BLOCK_DIAGNOSTIC_CREATED_BY_JHS diagnostic in each of the 2 files (2*3=6 ==> the rest were unallocated) 910301 wd2h restored from tape, wd2b & wd2a still to consider 910712 wanted large disc to play in, using seagate, noticed no lost+found, also it was wd1a, preferred wd1h, # disklabel /dev/rwd1c jhs2 /stand/bootwd # disklabel /dev/rwd1h jhs2 # newfs -v -m 0 /dev/rwd1h jhs2 /stand/bootwd 910729 Copying all wd2 trees to seagate, to save space by repartitioning # wd2 to one big parttion, instead of 3 # cptree /wd2[abh] seagate # find .. 
cmp -d
# disklabel /dev/rwd2c micropolis_a /stand/bootwd
# disklabel /dev/rwd2a micropolis_a
reboot (as suggested in disklabel manual, presumably cos kernel only
loads partition table when it boots, not when it mounts)
before rebooting I couldn't convince kernel that wd2a was 70M,
& not 10M, (as it used to be).
# newfs -v -m 0 /dev/rwd2a micropolis_a /stand/bootwd
# bad144 -s /dev/rwd2c
no errors reported.
910729 Copying all wd1 trees to seagate, to save space by repartitioning
tried to reset seagate address to 2, screwed up, narled the disc,
reformatted first 2 tracks
# newfs -v -m 0 /dev/rwd2a jhs /stand/bootwd
sector error, reformatted whole disc
	type: jhs
	interleave: 3
	head switch (sectors): 3
	verify after format: y
	cyls 0 - 1023
	tracks 0 - 8
	install bootstrap cr
	pack label: seagate julian
	drive serial: 72696739
	ecc: y
	bad sector table written at 82935
# newfs -v -m 0 /dev/rwd2a jhs /stand/bootwd
warning 2 sectors in last cylinder unallocated.
# bad144 -s /dev/rwd2c
no errors reported.
# disklabel /dev/rwd1c micropolis_a /stand/bootwd
# disklabel /dev/rwd1a micropolis_a /stand/bootwd
# newfs -v -m 0 /dev/rwd1a micropolis_a /stand/bootwd
--------------------------------------------------------------------------------
910912 Received chuck's symmetric pcb's
910914 Repartitioned seagate ready to boot chuck's symmetric.
Created a new /etc/disktab entry:
# Cylinder usage: root=120 + swap=54 + usr=847
seaboot|Seagate ST4096 96Mb bootable made by jhs for chucks symmetric\
	:ty=st506:se#1024:nt#9:ns#9:nc#1024:\
	:pa#9720:oa#0:\
	:pb#4374:ob#120:\
	:pc#82944:oc#0:\
	:pd#0:od#0:\
	:pe#0:oe#0:\
	:pf#0:of#0:\
	:pg#0:og#0:\
	:ph#68607:oh#174:
disklabel /dev/rwd2c seaboot /stand/bootwd
disklabel /dev/rwd2b (tried for interest) showed errors
disklabel /dev/rwd2a seaboot
disklabel /dev/rwd2b seaboot
disklabel /dev/rwd2h seaboot
Reboot
newfs -v -m 0 /dev/rwd2a seaboot /stand/bootwd
newfs -v -m 0 /dev/rwd2h seaboot /stand/bootwd
/usr/gnu/bin/tar -c -b 4 -f - .
| ( cd /wd2h ; /usr/gnu/bin/tar -x -f - -p -b 4 ) mkdir /wd2a/dev cd /wd2a/dev /dev/MAKEDEV standard /dev/MAKEDEV wd0 /dev/MAKEDEV wd0 /dev/MAKEDEV wd0 /dev/MAKEDEV wd1 /dev/MAKEDEV wd2 /dev/MAKEDEV wd_names /dev/MAKEDEV fd0 /dev/MAKEDEV fd0 /dev/MAKEDEV fd0 /dev/MAKEDEV fd1 /dev/MAKEDEV fd2 /dev/MAKEDEV fd_long0 /dev/MAKEDEV fd_long1 /dev/MAKEDEV fd_long2 /dev/MAKEDEV pty0 /dev/MAKEDEV cst2 cd / /usr/gnu/bin/tar -c -b 4 -f - .[a-z]* [a-c]* delivery [e-t]* v* | ( cd /wd2a ; /usr/gnu/bin/tar -x -f - -p -b 4 ) & -------------------------------------------------------------------------------- 910918 copied my eproms 1.11 1/11/88 to chucks system, removed his 1.9 1/6/86 eproms -------------------------------------------------------------------------------- 910920 edited /etc/myname for 3 machines: old 3 disc 375 skier new chuck 375 biker pc532 surfer on skier & biker # /usr/lib/sendmail -bz -------------------------------------------------------------------------------- 911006 created disktab fuji_jhs entry, formatted 2 discs from jkh, added to biker # umount /dev/wd1a # bad144 -l /dev/rwd1c # bad144 -l /dev/rwd1a # bad144 -a -v -f -c /dev/rwd1c 39127 39226 # bad144 -l /dev/rwd1c # bad144 -l /dev/rwd1a # newfs -v -m 0 /dev/rwd1a fuji_jhs # bad144 -s /dev/rwd1c reported 39127 39226 # bad144 -s /dev/rwd1a 911014 BIKER # bad144 -s /dev/rwd0c wd0c: hard read error, sn 65795 bn 131590 # bad144 -s /dev/rwd1c wd1c: soft ecc sn 4057 bn 8114 wd1c: soft ecc sn 42366 bn 84732 wd1c: soft ecc sn 47316 bn 94632 wd1c: soft ecc sn 69193 bn 138386 wd1c: soft ecc sn 71858 bn 143716 # bad144 -s /dev/rwd2c OK 911113 SKIER /usr/adm/messages had many errors in last 2 months on wd1a sn 37577 bn 75154 # bad144 -l /dev/rwd1a 48078 # bad144 -a -v -c /dev/rwd1a 37577 wrote bad sector tables at 73719,73721,73723,73725,73727 # bad144 -l /dev/rwd1a 48078 75154 911113 BIKER /usr/adm/messages had some errors in last 2 months sn 29679 bn 59358 # bad144 -a -v -c /dev/rwd1c 29679 sn 65795 bn 131590 # 
bad144 -a -v -c /dev/rwd0c 65795
# halt -n
911202
# chmod 755 /bin/login (removed s bit)
920205 SKIER /usr/adm/messages had some repetitive soft ecc errors .....
# bad144 -l /dev/rwd0c
7102 60454 60598 84854 117148
# bad144 -l /dev/rwd1c
48078 75154
# bad144 -l /dev/rwd2c
nothing
# sync
# kill -1 1
# sync
# bad144 -a -v -c /dev/rwd0c 4013 30371
# bad144 -a -v -c /dev/rwd1c 53553
# bad144 -a -v -c /dev/rwd2c 22945
# reboot -n
# bad144 -l /dev/rwd0c
7102 8026 60454 60598 60742 84854 117148
# bad144 -l /dev/rwd1c
48078 75154 107106
# bad144 -l /dev/rwd2c
45890
920219 SKIER /usr/adm/messages had some repetitive
wd0h: soft ecc sn 28070 bn 27916
# bad144 -l /dev/rwd0c
7102 8026 60454 60598 60742 84854 117148
# halt
# bad144 -a -v -c /dev/rwd0c 28070
# halt -n
* b
# bad144 -l /dev/rwd0c
7102 8026 56140 60454 60598 60742 84854 117148
BIKER 920319
# bad144 -a -v -c /dev/rwd1c 41716
hard read error sn 41716 bn 83432
# icheck -b 83432 /dev/rwd1a
# ncheck -i 12692 /dev/rwd1a
/pc532/rescue/dev_root_911126.Z
# mount /dev/wd1a /usr1
# cd /usr1/pc532/rescue
# zcat dev_root* > /dev/null
(ie testing to see if compress detects an error)
SKIER 920509
OBJECT: to test Stuart's control card, (Data saved via enet to driver)
2 MISTAKES MADE:
- Removed /wd2, Reinstalled with wrong drive id straps
  (card thought it was drive 2 (of 4), instead of 3 (as original card was set to))
- Also turned off power to wd1 while unix running,
Result 1:
wd0a: hard write error sn 4672 bn 9344
wd0a: hard write error sn 6976 bn 13952
Result 2:
boot error (no_id) E UND
System booted ok when wd2 removed,
(I may not have had power to wd2 turned on,
& data bus for wd2 on back of skier may have been loose)
format drive=2 type=micropolis_a label=skier_wd2 serial=229563
SKIER 920510
newfs -v -m 0 /dev/rwd1a micropolis_a
This was wrong drive !
IVE NOW LOST ALL OF WD1 DATA umount /dev/wd1a newfs -v -m 0 /dev/rwd1a micropolis_a newfs -v -m 0 /dev/rwd2a micropolis_a ------------------ RELOADING FROM TAPE ------------------ icheck -b 9344 13952 /dev/rwd0a /dev/rwd0a: 9344 arg; frag 0 of 4, inode=992, class=inodes 992-1024 13952 arg; frag 0 of 4, inode=1344, class=inodes 1344-1376 bad mode 1359 bad mode 1360 files 614 (r=256,d=21,b=54,c=85,sl=198) used 5709 (i=18,ii=0,b=1269,f=561) fre icheck -bn 9344 13952 /dev/rwd0a ncheck -i 992 1344 /dev/rwd0a 1344 /etc/nulib/. 992 /etc/hosts.en SKIER 920618 Disconnected 220V fan from an adjacent power block, Discs made unusual noise, superblocks corrupt - repaired. Also decided to fix wd2 problem: wd2a: hard read error, sn 32366 bn 64732 wd2c: hard read error, sn 32367 bn 64734 status 59 error 40 bad144 -s /dev/rwd2c 32366 32367 time icheck -bn 64732 64733 64734 64735 /dev/rwd2c 36.4u 6.4s 2:04 34% 7+95k 2+6io 23pf+0w time icheck -bn 64732 64734 /dev/rwd2c 36.0u 5.8s 2:07 32% 7+95k 3+7io 21pf+0w Also printed same list 2 times time icheck -b 64732 64733 64734 64735 /dev/rwd2c /dev/rwd2c: ncheck -i 14400 14401 14402 14403 14404 14405 14406 14407 /dev/rwd2a /dev/rwd2a: ncheck: read error 64712 14404 /910720_cass15/genix_tree/src/cmd/c2 ncheck: read error 64712 ncheck -i 14408 14409 14410 14411 14412 14413 14414 14415 /dev/rwd2a /dev/rwd2a: ncheck: read error 64712 14412 /910720_cass15/genix_tree/src/cmd/eqn ncheck -i 14416 14417 14418 14419 14420 14421 14422 14423 /dev/rwd2a /dev/rwd2a: ncheck: read error 64712 ncheck -i 14424 14425 14426 14427 14428 14429 14430 14431 /dev/rwd2a /dev/rwd2a: ncheck: read error 64712 /usr2/910720_cass15/genix_tree/src/cmd/c2 /dev/null cp: /usr2/910720_cass15/genix_tree/src/cmd/c2: I/O error /usr2/910720_cass15/genix_tree/src/cmd/eqn /dev/null cp: /usr2/910720_cass15/genix_tree/src/cmd/eqn: I/O error cd /usr2/910720_cass15/genix_tree/src/cmd ; ls -l c2 not found eqn not found bad144 -a -v -c /dev/rwd2c 32366 32367 Had 0 bad sectors, adding 2 
bad144: can't read sector, 32367 zeroing 73717 bad144: can't read sector, 32366 zeroing 73718 write badsect file at 73719 73721 73723 73725 73727 bad144 -s /dev/rwd2c fsck /dev/rwd2c ** /dev/rwd2c NAME=/910720_cass15/genix_tree/src/cmd/eqn/glob.c.Z REMOVE? y UNALLOCATED I=14420 OWNER=root MODE=0 SIZE=0 MTIME=Dec 31 16:00 1969 /910720_cass15/genix_tree/src/cmd/eqn/io.c.Z /910720_cass15/genix_tree/src/cmd/eqn/lex.c.Z /910720_cass15/genix_tree/src/cmd/eqn/lookup.c.Z /910720_cass15/genix_tree/src/cmd/eqn/mark.c.Z /910720_cass15/genix_tree/src/cmd/eqn/matrix.c.Z /910720_cass15/genix_tree/src/cmd/eqn/move.c.Z /910720_cass15/genix_tree/src/cmd/eqn/over.c.Z /910720_cass15/genix_tree/src/cmd/eqn/paren.c.Z /910720_cass15/genix_tree/src/cmd/eqn/pile.c.Z /910720_cass15/genix_tree/src/cmd/eqn/shift.c.Z /910720_cass15/genix_tree/src/cmd/eqn/sqrt.c.Z /910720_cass15/genix_tree/src/cmd/eqn/text.c.Z ** Phase 4 - Check Reference Counts FREE INODE COUNT WRONG IN SUPERBLK FIX? y ** Phase 5 - Check Cyl groups 26 BLK(S) MISSING BAD CYLINDER GROUPS These were not found in lost+found ! 920721 BIKER /usr/adm/messages Mar 20 02:20 wd1a: hard read error, sn 41716 bn 83432 icheck -bn 83432 /dev/rusr1 12692 ncheck -i 12692 icheck -b 83432 /dev/rusr1 /dev/rusr1: 83432 arg; frag 0 of 4, inode=12692, class=logical data block 1066 files 941 (r=813,d=128,b=0,c=0,sl=0) used 48508 (i=72,ii=1,b=11777,f=1108) free 20619 (b=5080,f=299) missing 0 bad144 -l /dev/rwd1c 8114 59358 78254 78452 83432 84732 94632 138386 143716 bad144 -a -v -c /dev/rwd1c 41716 Had 9 bad sectors, adding 1 copying 74628 to 74627 copying 74629 to 74628 copying 74630 to 74629 copying 74631 to 74630 copying 74632 to 74631 copying 41716 to 74632 write badsect file at 74637 74639 74641 74643 74645 bad144 -l /dev/rwd1c 8114 59358 78254 78452 83432 83432 84732 94632 138386 143716 ^ ^ 920901 BIKER Removed both adjacent dip headers while multiuser, rebooted, keeps doing fsck, then panic: bread: size 0 syncing disks... 
ps = 8400020 pc = 1335c ipl=0 trap type 6 (dvz)
921019 23:00 BIKER Suspect overheating caused mega disc damage.
# bad144 -s /dev/rwd0c /dev/rwd0a /dev/rwd0h /dev/rwd0b \
	/dev/rwd1c /dev/rwd2c /dev/rwd1a /dev/rwd2a
A whole batch of errors occurred, separated according to type as follows,
wd0c: hard read error, sn 13924 bn 27848 status 59 error 40
sn 13924 bn 27848
sn 14571 bn 29142
sn 24696 bn 49392
sn 24939 bn 49878
sn 24940 bn 49880
sn 24941 bn 49882
sn 25263 bn 50526
sn 25264 bn 50528
sn 25270 bn 50540
sn 25352 bn 50704
sn 25587 bn 51174
sn 25590 bn 51180
sn 25591 bn 51182
sn 25911 bn 51822
sn 25912 bn 51824
sn 25914 bn 51828
sn 25915 bn 51830
sn 25917 bn 51834
sn 25919 bn 51838
sn 26235 bn 52470
sn 26236 bn 52472
sn 26242 bn 52484
sn 26243 bn 52486
sn 27213 bn 54426
sn 40819 bn 81638
soft read errors
sn 14572 bn 29144
sn 14573 bn 29146
sn 22108 bn 44216
sn 22347 bn 44694
sn 22349 bn 44698
sn 24697 bn 49394
sn 25020 bn 50040
sn 25251 bn 50502
sn 25268 bn 50536
sn 25271 bn 50542
sn 25350 bn 50700
sn 25594 bn 51188
sn 25595 bn 51190
sn 27215 bn 54430
sn 27294 bn 54588
sn 27296 bn 54592
sn 41139 bn 82278
sn 64147 bn 128294
sn 74516 bn 149032
bad144 -l /dev/rwd0c
131590
bad144 -a -v -c /dev/rwd0c 14572 14573 22108 22347 \
	22349 24697 25020 25251 25268 25271 25350 \
	25594 25595 27215 27294 27296 41139 64147 74516
Had 1 bad sectors, adding 19
copying 74516 to 82915
copying 82934 to 82916
copying 64147 to 82917
copying 41139 to 82918
copying 27296 to 82919
copying 27294 to 82920
copying 27215 to 82921
copying 25595 to 82922
copying 25594 to 82923
copying 25350 to 82924
copying 25271 to 82925
copying 25268 to 82926
copying 25251 to 82927
copying 25020 to 82928
wd0c: soft ecc sn 24697 bn 49394
copying 24697 to 82929
copying 22349 to 82930
wd0c: soft read error, sn 22347 bn 44694 error 40 retries 3
copying 22347 to 82931
copying 22108 to 82932
copying 14573 to 82933
copying 14572 to 82934
write badsect file at 82935 82937 82939 82941 82943
Only 2 had read errors, so maybe now
lid is up disk is cooler, so did another bad144 -s /dev/rwd0c to look for errors # bad144 -l /dev/rwd0c 29144 29146 44216 44694 44698 49394 50040 50502 50536 50542 50700 51188 51190 54430 54588 54592 82278 128294 131590 149032 # bad144 -s /dev/rwd0c soft sn 13924 bn 27848 sn 19110 bn 38220 sn 22109 bn 44218 sn 22112 bn 44224 sn 22348 bn 44696 sn 22352 bn 44704 sn 22355 bn 44710 sn 24696 bn 49392 sn 24939 bn 49878 sn 25263 bn 50526 sn 25264 bn 50528 sn 25352 bn 50704 sn 25668 bn 51336 sn 25912 bn 51824 sn 25914 bn 51828 sn 26235 bn 52470 sn 26243 bn 52486 sn 27213 bn 54426 sn 27295 bn 54590 hard sn 14571 bn 29142 sn 24940 bn 49880 sn 24941 bn 49882 sn 25270 bn 50540 sn 25915 bn 51830 sn 25917 bn 51834 sn 40819 bn 81638 sn 73790 bn 147580 # bad144 -a -v -c /dev/rwd0c 13924 19110 22109 22112 22348 22352 22355 24696 24939 25263 25264 25352 25668 25912 25914 26235 26243 27213 27295 i=`icheck -bn 29142 49880 49882 50540 51830 51834 81638 147580 /dev/rwd0c` ncheck -i $i /dev/rwd0c BIKER 921022 Oct 21 10:50 wd0h: soft write error, sn 19113 bn 10038 error 10 retries 3 Oct 21 10:50 wd0h: soft read error, sn 14653 bn 1118 error 40 retries 1 Oct 22 22:30 wd0h: hard read error, sn 25269 bn 22350 status Oct 22 22:30 wd0h: hard read error, sn 25587 bn 22986 status icheck -bn 22350 /dev/rwd0h 4647 4658 30045 10038 arg; frag 3 of 4, inode=0, class=free block ncheck -i 4647 4658 30045 10038 /dev/rwd0h 4647 /bin/join 4658 /bin/ranlib 30045 /ucb/newaliases 30045 /ucb/mailq 30045 /lib/sendmail icheck -bn 22986 /dev/rwd0h 4658 4658 /bin/ranlib # bad144 -a -v -c /dev/rwd0c 25269 25587 921208 BIKER # bad144 -s /dev/rwd0h 14897 40821 41146 64149 64224 # bad144 -a -v -c /dev/rwd0c 13927 19113 22105 22106 22107 22353 25185 25913 40823 41144 41147 41223 64467 wd0c: hard read error, sn 40821 bn 81642 sn 41146 bn 82292 sn 64149 bn 128298 sn 64224 bn 128448 sn 14897 bn 29794 # bad144 -s /dev/rwd0h 22354 40821 # bad144 -s /dev/rwd0c 14575 14897 14899 36280 40821 41146 64149 64794 bad144 -s 
/dev/rwd0c
14897 40821
921212 BIKER removed all data of root drive,
formatted seagate as seaboot, writing bad sector table at sector 82935
hit reset, system locked, on winch, wouldn't respond,
inverted floppy cable
system no longer hung on hard disc, but wouldn't boot,
inverted to normal
b 2 0 3 0
disklabel /dev/rwd0c seaboot
disklabel /dev/rwd0a seaboot
disklabel /dev/rwd0h seaboot
newfs /dev/rwd0a seaboot
newfs /dev/rwd0h seaboot
920223 Biker overheating,
Turned on side, with blue room fan impinging on cpu,
ran ok from 4pm - 10am next day,
turned fan off, system stopped within 5 mins
(note this was not a crash, but merely the system froze,
left led displaying Interrupt mask set to 3),
when fan was turned on, system resumed running within a few seconds
(ie no reboot sequence was executed).
No chips on disc board were hot except wdc 1014 a bit hot,
however NSC cpu chip set on main cpu board too hot to touch.
When I turned fan on to cpu & obscured draught from disc pcb,
(carefully not jogging any possibly bad contacts)
this proved it was cpu chip set overheating that was the problem.
Room ambient temp. was 26 deg. centigrade (79 fahrenheit).
930312~ Biker
Extended chassis slots under cpu, obscured other slots, didn't help.
Appended fan under box, didn't help.
930315 Biker:
Attached heat sinks to cpu chip set, didn't help.
Converted all cpu chip set sockets to new gold contact sockets
(only cpu was before, & that was twisted), didn't help.
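The seaboot figures used in the 921212 entry can be cross-checked with a
little arithmetic. The sketch below is an illustration only: the function
names are made up, it assumes a POSIX sh with $(( )) arithmetic, and the
geometry (nt#9, ns#9, nc#1024) and cylinder split (root=120, swap=54,
usr=847) are taken from the seaboot disktab entry. That the bad sector
table starts at the first sector of the last track is inferred from the
numbers in this log (82935 on the 9 head seagate, 73719 on the 8 head
discs), not checked against the bad144 source.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Sectors occupied by a number of whole cylinders.
# Usage: cyl_to_sectors CYLINDERS TRACKS_PER_CYL SECTORS_PER_TRACK
cyl_to_sectors() {
    echo $(( $1 * $2 * $3 ))
}

# First sector of the last track of the disc, which is where this log
# shows the bad sector table being written.
# Usage: badsect_start CYLINDERS TRACKS_PER_CYL SECTORS_PER_TRACK
badsect_start() {
    echo $(( $1 * $2 * $3 - $3 ))
}

nt=9    # nt#9 from the seaboot entry
ns=9    # ns#9 from the seaboot entry
nc=1024 # nc#1024 from the seaboot entry

echo "pa (root, 120 cyl)   $(cyl_to_sectors 120 $nt $ns)"
echo "pb (swap,  54 cyl)   $(cyl_to_sectors 54 $nt $ns)"
echo "ph (usr,  847 cyl)   $(cyl_to_sectors 847 $nt $ns)"
echo "pc (whole disc)      $(cyl_to_sectors $nc $nt $ns)"
echo "oh (start cyl of h)  $(( 120 + 54 ))"
echo "bad sector table at  $(badsect_start $nc $nt $ns)"
```

The sizes reproduce pa#9720, pb#4374, ph#68607 and pc#82944, the h offset
reproduces oh#174, and the last line reproduces 82935. Only 1021 of the
1024 cylinders are allocated to a, b and h, so the replacement sectors and
bad sector table at the end of the disc lie outside those partitions.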
930316 Biker: Jogged console socket, STRAY INTERRUPT #f psr= 8400020, pc= 7bf, ipl= 1 930409 Skier: bad144 -s /dev/rwd2c 4663 6978 bad144 -s /dev/rwd2a 4663 6978 bad144 -s /dev/rwd1c /usr/adm/messages: wd2c: hard read error, sn 4663 bn 9326 status 59 error 40 wd2c: hard read error, sn 6978 bn 13956 status 59 error 40 icheck -bn 9326 /dev/rwd2c & 2a give same result: 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 icheck -bn 13956 /dev/rwd2c 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 ncheck -i 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 /dev/rwd2a /dev/rwd2a: ncheck: read error 9320 ncheck: read error 13952 ncheck: read error 9320 ncheck: read error 13952 2060 /ingres/doc/error 2049 /local/commands/valid/stuart ncheck: read error 9320 ncheck: read error 13952 2048 /local/commands/mailname.fail.examples 2073 /local/basic/pdp11 ncheck -i 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 /dev/rwd2a /dev/rwd2a: ncheck: read error 9320 ncheck: read error 13952 ncheck: read error 9320 ncheck: read error 13952 3073 /adm/bak/usr/lib/v ncheck: read error 9320 ncheck: read error 13952 3075 /uucp/save 3072 /local/basic/pyramid bad144 /dev/rwd2c 64732 64734 bad144 -a -v -c /dev/rwd2c 4663 6978 Had 2 bad sectors, adding 2 copying 73717 to 73715 copying 73718 to 73716 bad144: can't read sector, 6978 zeroing 73717 bad144: can't read sector, 4663 zeroing 73718 write badsect file at 73719 write badsect file at 73721 write badsect file at 73723 write badsect file at 73725 write badsect file at 73727 bad144 /dev/rwd2c 9326 13956 64732 
64734 bad144 /dev/rwd1c 48078 75154 107106 bad144 /dev/rwd0c 7102 8026 56140 60454 60598 60742 84854 117148 Bike: Put in a new power supply, then it finally ran stable. 2006.05.02 Skyr: bad144 -s wd0a 4423 wd0c hard read error sn 4423 bn 8846 wd0c soft read error sn 16160 bn 32320 /sbin/icheck -bn 8846 wd0a > /jhs/8846 /sbin/icheck -bn 8847 wd0a > /jhs/8847 diff -c /jhs/* # no difference cat /jhs/8846 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 /sbin/ncheck -i `cat /jhs/8846` wd0a > /jhs/ncheck # more read error, probably names of bad files also within # that sector. /dev/rwd0a: 2174 /etc/README.scs 2168 /etc/fstab.dual.scs 2169 /etc/gettytab.scs 2170 /etc/hosts.en.scs 2173 /etc/hosts.equiv.scs 2171 /etc/hosts.no_en.scs 2172 /etc/inetd.conf.scs 2167 /etc/nu.cf.scs 2157 /etc/rc.scs 2175 /etc/rc.local.en.scs /sbin/bad144 -a -v wd0a 4423 # reports non bad blocks, but now adding this one /sbin/bad144 -s wd0a # reports error still there, so I guess I may have zapped # a good block & should have done # /sbin/bad144 -a -v wd0c 4423 Had 0 bad sectors, adding 1 zeroing 138230 write badsect file at 138231 write badsect file at 138233 write badsect file at 138235 write badsect file at 138237 write badsect file at 138239 Reboot /sbin/bad144 -s /dev/rwd0c # soft ecc errors: sn 5080 bn 10160 # sn 16160 bn 32320 Had 1 bad sectors, adding 1 copying 138230 to 138229 zeroing 138230 write badsect file at 138231 write badsect file at 138233 write badsect file at 138235 write badsect file at 138237 write badsect file at 138239 Replaced any missing /etc/files from /alt.root
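Throughout these notes the console reports each error as a sector number
(sn) counted from the start of the disc plus a block number (bn) counted
from the start of the partition, and bad144 -a is given the sn. From the
figures logged, bn looks like a count of 512 byte blocks and sn a count of
1024 byte sectors (se#1024 in the seaboot disktab entry), so that
sn = partition offset + bn / 2. The sketch below is only a cross-check of
that reading: the function name is made up, it assumes a POSIX sh with
$(( )) arithmetic, and the offset of 174 cylinders is oh#174 from the
seaboot entry.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original log): turn a partition
# relative block number from a console message into the absolute sector
# number that bad144 -a expects.
# Usage: bn_to_sn BN OFFSET_CYL TRACKS_PER_CYL SECTORS_PER_TRACK
bn_to_sn() {
    bn=$1
    off_cyl=$2
    nt=$3
    ns=$4
    # partition start in sectors, plus bn halved (two 512 byte blocks
    # per 1024 byte sector, as the logged figures suggest)
    echo $(( off_cyl * nt * ns + bn / 2 ))
}

# a and c partitions start at cylinder 0, so geometry does not matter
bn_to_sn 8846 0 9 9       # 2006 entry: sn 4423 bn 8846

# biker wd0h under seaboot: oh#174, nt#9, ns#9
bn_to_sn 10038 174 9 9    # 921022 entry: sn 19113 bn 10038
bn_to_sn 22350 174 9 9    # 921022 entry: sn 25269 bn 22350
```

All three calls reproduce the sn values the console reported. For the a
partition the offset is 0, which is why halving bn was enough in the
wd0a entries.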