LOCKSS is compatible with most standard PC hardware and usually installs without trouble. In the event something does go wrong, we'll work with you to try to determine the problem. This page describes the troubleshooting procedure we initially follow. If you're so inclined, this may allow you to diagnose the problem yourself, or gather the information we will need to diagnose it. To contact the LOCKSS team, send email to
.
The page is organized into sections. The sequence of sections follows the sequence of actions the system takes during the initial installation.
CD Fails to Boot
The boot process starts when the computer's BIOS reads the boot block from the CD and executes it, which reads the 1.5-stage bootstrap from the CD and executes it. This culminates with the boot> prompt. If you do not see this prompt, this is the section for you.
- Did the lights on the CD drive flash? If not, it is likely that your BIOS does not have the CD first in its list of devices to boot from, and is trying to boot from some other device. Re-configure your BIOS to place the CD drive first in the list of boot devices, and remove all other devices from the list. The LOCKSS system only ever boots from the CD.
- If the CD lights did flash, your computer is trying to read the CD boot loader and failing. The most likely cause is a bad CD burn. Burn a new CD and try again.
- If you cannot burn a LOCKSS CD that works, try booting a live CD for some other operating system (e.g. Knoppix Linux or Freesbie FreeBSD):
  - If the other system fails, there is probably something wrong with your hardware. Debug this using the live CD then re-try installing LOCKSS.
  - If the other system succeeds, please collect the kernel configuration information from the live CD and report it to the LOCKSS team.
OpenBSD Kernel Fails to Load
If you do not respond in a few seconds to the boot> prompt, the boot loader will load the OpenBSD kernel from the CD and execute it. Arcane messages in blue will scroll past on your screen as the kernel discovers the computer's hardware configuration and initializes the appropriate drivers. This culminates with a message about rootdev. If you did see the boot> prompt but don't see the blue rootdev message this is the section for you.
- Did you see any arcane messages in blue? If not:
  - Did you see messages from the boot loader saying that it could not find various permutations of bsd, obsd and so on? If so, the likely cause is a bad CD burn. Try mounting the CD on another computer. In the root directory of the CD is a file sha1sums containing the SHA1 of every other file on the CD. Try verifying them. If they don't match, burn a new CD and try again. Otherwise, try replacing the CD drive in the LOCKSS box. If neither succeeds, contact the LOCKSS team.
  - If you saw neither arcane messages in blue nor complaints from the boot loader, contact the LOCKSS team.
- If you saw some arcane messages in blue but they stopped unexpectedly before the rootdev message, try booting a live CD for some other operating system (e.g. Knoppix Linux or Freesbie FreeBSD):
  - If the other system fails, there is probably something wrong with your hardware. Debug this using the live CD then re-try installing LOCKSS.
  - If the other system succeeds, please collect the kernel configuration information from the live CD and report it to the LOCKSS team.
System Does not Find Disks
Once the kernel is loaded and running, it starts the user-level processes by running the /etc/rc scripts. These do some housekeeping, then look for the available disks. The first time you install a LOCKSS box it has to initialize the disks. Since this erases all pre-existing data, the script asks you before doing it.
If on the first installation (not subsequent installations) you do not get asked for permission to initialize each disk in the system, this is the section for you.
- Did you get asked for permission to initialize some disks but not all? If you did, please follow the instructions in the section Reporting Your System's Hardware below.
- If you didn't get asked for permission to initialize any of the disks in the system:
  - Try the instructions in the section Initializing Disk Labels below.
  - If those instructions do not fix the problem, please follow the instructions in the section Reporting Your System's Hardware below.
Errors during Disk Initialization
Initializing a large disk takes a while, during which a long sequence of numbers scrolls past on your screen. If all is well the system proceeds to ask a question about your keyboard. If, instead, the scrolling numbers stop with an alarming error message this is the section for you.
- A likely cause of errors at this stage is a bad disk, or bad disk blocks. Try using Knoppix or Freesbie to re-partition the disk and build a file system filling the whole of it. If this also produces errors, replace the drive.
- Otherwise, please follow the instructions in the section Reporting Your System's Hardware below.
Problems Configuring Your Keyboard
The system asks you to specify the type of your keyboard. Keyboard types are two-letter country codes such as us or uk. The default is us.
- If you input your country code and you get an error message, the keyboard type will have been reset to the default us. The probable causes are either that the system doesn't support your country code, or that you mis-typed it.
Network Configuration Fails
After the keyboard question, the system asks a set of questions about network configuration, then performs a simple test to validate the answers. If the test fails it will ask whether you want to re-enter the configuration. If you get this question, this is the section for you.
- The first thing to try is to answer Y to re-enter the configuration details and make sure that there were no typos in your input.
- If you are sure your input was perfect, answer N and you will get a shell prompt. Follow the instructions in the section Testing Your Network Configuration below.
- If the manual test succeeds but the automated test fails, timing issues during NIC initialization are a likely cause. Re-try the configuration process with an Ethernet hub between the LOCKSS box and the switch. If this fixes the problem, please follow the instructions in the section Reporting Your System's Hardware below.
Disk Full during Installation
If your LOCKSS box installation grinds to a halt part-way through with a series of messages about disk full and the root shell # prompt, this is the section for you.
- If this is an upgrade to an existing LOCKSS box that has a full disk, contact us.
- Were you asked for permission to initialize any disks?
  - If not, you should have been. Follow the instructions in the section System Does not Find Disks above.
  - If you were asked, and you didn't see endless streams of mysterious numbers scroll by as the system initialized the disks, follow the instructions in the section System Does not Find Disks above.
  - If you were asked for permission, and you did see the stream of numbers, please type the command df and follow the instructions in the section Reporting Your System's Hardware below, including the df output.
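Before sending the df output, it can help to pick out which file systems are actually full. The following is only a sketch: the function name full_filesystems is made up for illustration, and it assumes the usual df column layout with the capacity percentage in the fifth column and the mount point in the sixth.

```shell
#!/bin/sh
# Read df output on standard input and print the mount point and capacity
# of every file system at or above a limit (default 100 percent).
# Assumption: column 5 is the capacity ("100%") and column 6 the mount point.
full_filesystems() {
    awk -v limit="${1:-100}" '
        NR > 1 {                      # skip the header line
            cap = $5
            sub(/%$/, "", cap)        # strip the trailing percent sign
            if (cap + 0 >= limit + 0) print $6, $5
        }
    '
}

# Intended use at the root prompt (not run here):
#   df | full_filesystems
```

An optional argument lowers the limit, for example `df | full_filesystems 90`. The complete df output is still what the LOCKSS team needs.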
Package Signature Check Fails
During the initialization process the system verifies the signatures on files containing the SHA1 checksums of the files to be installed, then verifies the files themselves against these checksums. If you get messages about no good signature or no valid checksum, this is the section for you.
- First, verify that your CD is good by following the instructions in the section Verifying Your CD below.
- Then, verify that the keys used to check the signatures were copied correctly from the CD to your configuration floppy, USB disk or CD:
# grep -s mnt2 /etc/fstab && mount /mnt2
# mount /cdrom
# cmp /mnt2/pkgchk.tgz /cdrom/LOCKSS/2.0/floppy/pkgchk.tgz
# umount /cdrom
# grep -s mnt2 /etc/fstab && umount /mnt2
  - If there is no output from the cmp command, your configuration device is good. Please follow the instructions in the section Reporting Your System's Hardware below.
  - Otherwise, try re-formatting or replacing it and restarting the configuration process from scratch.
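Because cmp reports success by printing nothing, a silent result is easy to misread. As a minimal sketch (check_copy is a hypothetical helper name, not part of LOCKSS), the comparison can be wrapped so that it always prints an explicit verdict:

```shell
#!/bin/sh
# Report whether two files are byte-for-byte identical.
# cmp -s prints nothing and signals the result through its exit status.
check_copy() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different or unreadable"
        return 1
    fi
}

# Intended use once /mnt2 and /cdrom are mounted (not run here):
#   check_copy /mnt2/pkgchk.tgz /cdrom/LOCKSS/2.0/floppy/pkgchk.tgz
```

A result of "identical" corresponds to the no-output case described above.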
Cannot Log In as root
If the login: prompt appears, but you cannot log in as root with the password you chose during configuration, or if you were never asked for a password during configuration, this is the section for you.
- The most likely cause of this problem is that the floppy or USB disk used to store the configuration was not blank, as required, but had a pre-existing configuration on it. Try re-formatting the floppy or USB disk and re-starting the configuration process from the beginning.
Collecting Diagnostic Information
Here are some sections describing procedures to follow during diagnosis.
Booting the System in Single-User Mode
In some cases you will need to boot the system in single-user mode. To do so, wait for the boot> prompt with your finger on the b key. As soon as the prompt appears, press the key. Then, at your leisure, type the rest of the response boot -s and press Enter. The arcane messages in blue should scroll by, then you should see the single-user root prompt #.
At this point you can type commands. In most cases the first command you want to type is mount -w /, which re-mounts the root file system making it writable. Many commands will fail if they cannot write to /tmp.
Verifying Your CD
To verify that your CD is good, follow the instructions in the section Booting the System in Single-User Mode above then:
# mount /cdrom
# ( cd /cdrom ; sha1sum -c sha1sums )
# umount /cdrom
Unless you see OK for every one of the files listed, your CD is bad.
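With many files on the CD, a single failure can scroll past unnoticed. The sketch below summarises the check instead. It assumes the checker prints one "name: STATUS" line per file with STATUS equal to OK for a good file, and summarise_sha1_check is a name invented for this example.

```shell
#!/bin/sh
# Summarise `sha1sum -c` output read from standard input.
# Assumption: one "name: STATUS" line per file, STATUS being OK when it matches.
summarise_sha1_check() {
    awk '
        /: OK$/ { good++; next }              # file matched its checksum
        /: /    { bad++; print "BAD: " $0 }   # any other status counts as bad
        END     { printf "%d ok, %d not ok\n", good, bad; exit (bad > 0) }
    '
}

# Intended use with the CD mounted (not run here):
#   ( cd /cdrom ; sha1sum -c sha1sums ) | summarise_sha1_check
```

The function exits with a non-zero status if any file is not OK, so it can also be used in a shell conditional.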
Reporting Your System's Hardware
Including full details of your hardware configuration in any mail you send the LOCKSS team will greatly assist diagnosis. To collect this information, at any root prompt type:
# sysctl -a >/tmp/config.out
The file /tmp/config.out can be mailed directly using mail, copied somewhere more convenient using scp, or copied to a floppy by:
# mount -t msdos /dev/fd0c /mnt
# cp /tmp/config.out /mnt
# umount /mnt
Initializing Disk Labels
To initialize a disk label to empty so that the installation process will treat it as a new disk to be fully initialized, follow the instructions in the section Booting the System in Single-User Mode above. Now you need to know the name OpenBSD has assigned to the disk, by typing:
# sysctl hw.disknames
hw.disknames=cd0,wd0,fd0,rd0
There will be entries for cd0 and rd0, respectively the CD drive and the RAM disk, and possibly fd0 for the floppy. The other entries are the actual disks.
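The filtering rule just described can be sketched as a small shell function. This is an illustration only: real_disks is a made-up name, it assumes cd, rd and fd are the only non-disk prefixes, and it strips any ":suffix" that some OpenBSD releases append to each name.

```shell
#!/bin/sh
# Print the entries of an hw.disknames value that are actual disks.
# Assumption: cd*, rd* and fd* are the only entries that are not real disks.
real_disks() {
    printf '%s\n' "$1" | tr ',' '\n' | while read -r entry; do
        name=${entry%%:*}                       # drop any ":suffix" after the name
        case $name in
            ''|cd[0-9]*|rd[0-9]*|fd[0-9]*) ;;   # CD, RAM disk, floppy: skip
            *) printf '%s\n' "$name" ;;
        esac
    done
}

# Intended use at the root prompt (not run here):
#   real_disks "$(sysctl -n hw.disknames)"
```

For the example value cd0,wd0,fd0,rd0 this prints wd0.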
If you are really sure you want to erase the disk, use fdisk's e command to set the type code of all partitions to zero, substituting the name of the disk (wd0 in this example) for ${DISK}:
# fdisk -e ${DISK}
fdisk: 1> p
Disk: wd0 geometry: 3736/255/63 [60018840 Sectors]
Offset: 0 Signature: 0xAA55
Starting Ending LBA Info:
#: id C H S - C H S [ start: size ]
------------------------------------------------------------------------
0: 00 0 0 0 - 0 0 0 [ 0: 0 ] unused
1: 00 0 0 0 - 0 0 0 [ 0: 0 ] unused
2: 00 0 0 0 - 0 0 0 [ 0: 0 ] unused
*3: A6 0 1 1 - 3735 254 63 [ 63: 60018777 ] OpenBSD
fdisk: 1> e 3
Starting Ending LBA Info:
#: id C H S - C H S [ start: size ]
------------------------------------------------------------------------
*3: A6 0 1 1 - 3735 254 63 [ 63: 60018777 ] OpenBSD
Partition ID ('0' to disable) [0 - FF]: [A6] (? for help) 0
Partition 3 is disabled
fdisk: 1> w
Writing MBR at offset 0.
wd0 no disk label
fdisk: 1> q
#
Now start the installation process from the beginning. You should be asked for permission to initialize ${DISK}. If you aren't, contact the LOCKSS team.
Testing Your Network Configuration
To test your network configuration from a shell prompt, either because you followed the Booting the System in Single-User Mode instructions above or because configuration failed and gave you a shell, first you need to know the name OpenBSD has assigned to your network interface. Type:
ifconfig -a
You should see something like:
# ifconfig -a
lo0: flags=8049<LOOPBACK,MULTICAST> mtu 33224
groups: lo
fxp0: flags=8843<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:07:e9:0a:3b:ae
media: Ethernet autoselect (100baseTX full-duplex)
status: active
pflog0: flags=0<> mtu 33224
pfsync0: flags=0<> mtu 1460
enc0: flags=0<> mtu 1536
The network interface of interest is the one that includes a MAC address (lladdr) or, to put it another way, the one that is not lo0, pflog0, pfsync0 or enc0. In this example the network interface name is fxp0.
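The selection rule above (the interface that reports an lladdr) can be sketched with awk. The function name nics_with_lladdr is invented for this example, and it assumes each interface starts with a "name: flags=" header line as in the sample output.

```shell
#!/bin/sh
# Read `ifconfig -a` output on standard input and print the name of every
# interface that reports a MAC address (an "lladdr" line).
nics_with_lladdr() {
    awk '
        # Header line such as "fxp0: flags=...": remember the interface name.
        /^[a-z]+[0-9]+: flags=/ { sub(/:$/, "", $1); nic = $1 }
        # An lladdr line belongs to the most recently seen interface.
        $1 == "lladdr" && nic != "" { print nic }
    '
}

# Intended use at the shell prompt (not run here):
#   ifconfig -a | nics_with_lladdr
```

If more than one name is printed, the box has several network interfaces and you will need to choose the one that is connected.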
Then follow the instructions below, substituting the appropriate value for each of the following parameters (replacing the dollar sign and braces as well), where:
- ${NIC} - the name of your network interface,
- ${IP_ADDRESS} - the IP address assigned to your LOCKSS box,
- ${NETMASK} - the netmask,
- ${GATEWAY} - the IP address of the default router,
- ${FQDN} - the fully qualified domain name of your LOCKSS box (e.g. lockss.generic.edu),
- ${NAMESERVER} - the IP address of the DNS server:
# ifconfig ${NIC} inet ${IP_ADDRESS} netmask ${NETMASK} up
# ifconfig ${NIC}
${NIC}: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:07:e9:0a:3b:ae
groups: egress
media: Ethernet autoselect (100baseTX full-duplex)
status: active
inet6 fe80::207:e9ff:fe0a:3bae%fxp0 prefixlen 64 scopeid 0x1
inet 192.168.1.85 netmask 0xffffff00 broadcast 192.168.1.255
# route add default ${GATEWAY}
# ping -c 5 ${GATEWAY}
# host ${FQDN} ${NAMESERVER}
If all is well these commands will succeed, but if all were well you wouldn't be following these instructions. So report to the LOCKSS team exactly what you saw. To reach the LOCKSS team, send email to
.
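When checking the ifconfig output before you report it, note that the netmask is printed in hexadecimal (0xffffff00 in the example above). If you entered ${NETMASK} in dotted-quad form, a conversion such as the following sketch makes the two easy to compare. The function name hex_to_dotted is invented for this example.

```shell
#!/bin/sh
# Convert a hexadecimal netmask such as 0xffffff00 to dotted-quad notation.
hex_to_dotted() {
    n=$(( $1 ))     # shell arithmetic accepts a 0x-prefixed hexadecimal constant
    printf '%d.%d.%d.%d\n' \
        $(( (n >> 24) & 255 )) $(( (n >> 16) & 255 )) \
        $(( (n >> 8) & 255 ))  $(( n & 255 ))
}

hex_to_dotted 0xffffff00    # prints 255.255.255.0
```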