# Kernel keeps renaming root device [UNSOLVABLE]

## exhausted

I can no longer boot because the root device I specify is never correct.  The root device is supposed to be /dev/sda5.  However, it appears that the kernel is now giving it a different name every time I try to boot.

When I try to boot, I almost always get a kernel panic because the kernel has named the device "/dev/sde" or "/dev/sdg" or anything BUT /dev/sda.

If the kernel names the boot device /dev/sdc, I'll edit the line in grub's menu.lst to "root=/dev/sdc5"--but then the kernel switches the name on me again and names the boot device something else.  There is no way for me to guess what the kernel is going to name the boot device and I can't find a way to make it stop.

I've tried specifying the root device by UUID in menu.lst.  I found some instructions online for doing this.  It doesn't work--grub doesn't seem to be able to do that sort of thing.  (Then why are there instructions for it?)

I've tried creating udev rules to try to keep the device names from changing, but that didn't work either.

What can I do to keep the device names from changing so I can boot?

*Last edited by exhausted on Thu Apr 03, 2014 1:48 am; edited 2 times in total*

----------

## exhausted

Specifying the UUID for / in /etc/fstab doesn't seem to help either.

----------

## NeddySeagoon

exhausted,

Explain the storage devices attached to your system and how they are connected.

For finding root by UUID, you need an initrd.  It works for me.

----------

## exhausted

Thanks for the quick reply!

Gah!  I forgot that the kernel can't interpret UUIDs passed directly to it via a bootloader.  No wonder my attempt at getting Grub to work by specifying the UUID didn't work.  Since I'm not using an initramfs and would prefer not to, maybe instead of trying to specify the root device by UUID, I should go back to my original plan: Use a rule to force udev to use the device names I specify.

The devices are two solid state drives connected via SATA and a couple of external hard drives connected via USB.  The root device is a SATA SSD.

----------

## exhausted

Okay, here's what I've done:

I've changed the relevant line in menu.lst back to specifying "root=/dev/sda5".

I created a udev rule file, /etc/udev/rules.d/20-persistent_disk_name.rules with the following rule:

```
SUBSYSTEM=="scsi", ATTRS{model}=="INTEL SSDSA2CW08", KERNEL=="sd*", SYMLINK+="sda%n"
```

I still can't boot because the kernel is still changing the name of the boot device.  The udev rule doesn't appear to do anything.

Edit: Maybe the udev rules are useless for forcing a specific device name for a boot device?  Is it not possible to force the correct root device name using a udev rule?

----------

## exhausted

I have also tried specifying the root device in /etc/fstab by UUID:

```
UUID=ebbc6ab0-0c0f-4d21-98e6-63ac2ee4d84d      /      reiserfs   defaults,noatime,data=ordered,notail   1 2
```

This doesn't seem to work either.

----------

## VoidMage

If your disk has a GPT partition table, you could boot by PARTUUID (well, if your kernel is recent enough, you could even do it with an MBR partition table, though it's a bit quirky there).

----------

## exhausted

I'm still using the old MBR partition scheme.

As for the kernel, I'm using 3.8.13-gentoo.

[rant]

This just seems patently insane.  The way device names behaved in Linux has worked for many years.  The device names were predictable.  They didn't just change at fate's whimsy every time a system booted.  The new behavior makes it impossible to know what the device names are going to be from one boot to another without--and this is my major gripe--providing an option to retain the old behavior and without providing a practical way to keep the names from changing.  I'm all for progress and changes for the better, but this...  this is insane!  I can't even boot because I don't know what the name of my boot device is going to be!

[/rant]

----------

## PaulBredbury

PARTUUID does not need an initrd. Works fine in syslinux:

```
LABEL Current
LINUX /boot/3.9.10-x86_64
APPEND root=PARTUUID=00020ed2-01 rootfstype=ext4 usbhid.mousepoll=2 apparmor=1 blah blah
```

The kernel shows the PARTUUID values on the right-hand side, during bootup.
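For an MBR (MSDOS) disk, the PARTUUID is, as far as I understand the kernel's `root=` documentation, the 32-bit disk signature stored at byte offset 440 of the MBR, printed as eight hex digits, followed by a hyphen and the 1-based partition number as two hex digits. A minimal Python sketch of that derivation follows. The function name `mbr_partuuid` and the fake MBR buffer are made up for illustration; on a real system you would read the first 512 bytes of the disk (as root) instead.

```python
import struct

MBR_SIGNATURE_OFFSET = 440  # 4-byte little-endian disk signature in the MBR


def mbr_partuuid(mbr: bytes, partition_number: int) -> str:
    """Return the PARTUUID string for a 1-based partition on an MBR disk."""
    if len(mbr) < 512:
        raise ValueError("need the full 512-byte MBR sector")
    (signature,) = struct.unpack_from("<I", mbr, MBR_SIGNATURE_OFFSET)
    return f"{signature:08x}-{partition_number:02x}"


# Illustration only: a fake MBR whose signature matches the syslinux example above.
fake_mbr = bytearray(512)
struct.pack_into("<I", fake_mbr, MBR_SIGNATURE_OFFSET, 0x00020ED2)
print(mbr_partuuid(bytes(fake_mbr), 1))  # 00020ed2-01
print(mbr_partuuid(bytes(fake_mbr), 5))  # 00020ed2-05
```

Under that assumption, a root filesystem on the fifth partition would be passed to the kernel as `root=PARTUUID=<signature>-05`.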

Edit: Hopefully removed confusion of UUID with PARTUUID.

*Last edited by PaulBredbury on Sun Jul 21, 2013 4:15 am; edited 1 time in total*

----------

## exhausted

Now I'm confused as all hell.  I've read a lot of documentation that specifically states that you can't just pass the UUID of a partition to the Linux kernel as a boot parameter because the kernel can't interpret it.  This explains why it doesn't work with grub.

I don't understand how you're getting it to work.

If I try to specify the root device in menu.lst by UUID, it does not work.

How are you getting it to work?

I suspect that you might be confusing UUID with PARTUUID.

----------

## The Doctor

Observation: I don't think writing a udev rule is going to do anything, because if your kernel can't mount your root partition, udev and your rule will not even be loaded, as they reside on your root partition. The only way udev will play any role in this is if you are using an initramfs with udev, in which case you may as well mount your root partition directly.

Short term possibility: Disconnect your external drives to see if that helps. If the names are still switching randomly, at least you will have a 50% chance of booting.

Oh, and PaulBredbury is using syslinux instead of grub. It may be worth trying a different boot loader to see if that is the problem. Syslinux doesn't have as many features as the new grub, which I find to be a distinct advantage because it makes it much easier to use.

----------

## PaulBredbury

 *exhausted wrote:*   

> confusing UUID with PARTUUID.

 

I suppose I am  :Embarassed: 

So, why don't you forget about udev rules (which run too late to be helpful) and use PARTUUID  :Wink: 

I know that PARTUUID works, because my USB-connected phone steals the sda name if it's plugged in during boot  :Shocked:  What is Linus thinking??

----------

## Hu

 *exhausted wrote:*   

> I'm still using the old MBR partition scheme.
> 
> As for the kernel, I'm using 3.8.13-gentoo.
> 
> [rant]
> ...

 When did this break for you?  I am not aware of any recent changes in the kernel rules for how to name SCSI/SATA devices.  However, some systems have been known to exhibit a random discovery order, particularly when using external USB devices.

----------

## NeddySeagoon

exhausted,

You didn't explain the storage devices attached to your system and how they are connected.

Your lspci output would be useful too.

----------

## dwbowyer

Not sure this helps OP, but might point in the right direction:

On some systems, mixed PATA (legacy IDE) and SATA internal drives can exhibit this behavior too, if you have unplugged and replugged one of them. It's not random though, as the names just swap. I've had to unplug all drives and plug them back in, in the order I wanted them named. It's also why it's not advised to mix CONFIG_IDE in the kernel along with the SATA drivers.

----------

## exhausted

Everything was perfect until an update.  I believe that it was either a kernel update or a udev update that caused the problem.  The kernel is no longer assigning the name /dev/sda to the boot device.  I must be able to specify the boot partition by device name; I can't use UUID or anything else.  If the kernel doesn't assign the correct name to the boot device, I can't boot.

 *NeddySeagoon wrote:*   

> You didn't explain the storage devices attached to your system and how they are connected.

 

My apologies.  My storage devices are two solid state drives connected via SATA and a couple of external hard drives connected via USB. The root device is a SATA SSD which has always been named sda.

Here's my lspci output:

```
00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07) (prog-if 00 [Normal decode])

   Flags: bus master, 66MHz, medium devsel, latency 32

   Bus: primary=00, secondary=01, subordinate=01, sec-latency=32

   I/O behind bridge: 0000a000-0000bfff

   Memory behind bridge: fa400000-fa5fffff

   Capabilities: [c0] HyperTransport: Slave or Primary Interface

   Capabilities: [f0] HyperTransport: Interrupt Discovery and Configuration

00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)

   Subsystem: Advanced Micro Devices [AMD] AMD-8111 LPC

   Flags: bus master, 66MHz, medium devsel, latency 0

00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03) (prog-if 8a [Master SecP PriP])

   Subsystem: Advanced Micro Devices [AMD] AMD-8111 IDE

   Flags: medium devsel

   [virtual] Memory at 000001f0 (32-bit, non-prefetchable) [size=8]

   [virtual] Memory at 000003f0 (type 3, non-prefetchable)

   [virtual] Memory at 00000170 (32-bit, non-prefetchable) [size=8]

   [virtual] Memory at 00000370 (type 3, non-prefetchable)

   I/O ports at ffa0 [size=16]

00:07.2 SMBus: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0 (rev 02)

   Subsystem: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0

   Flags: medium devsel, IRQ 9

   I/O ports at c480 [size=32]

00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)

   Subsystem: Advanced Micro Devices [AMD] AMD-8111 ACPI

   Flags: medium devsel

00:07.5 Multimedia audio controller: Advanced Micro Devices [AMD] AMD-8111 AC97 Audio (rev 03)

   Subsystem: Tyan Computer Device 2885

   Flags: bus master, medium devsel, latency 32, IRQ 17

   I/O ports at c800 [size=256]

   I/O ports at cc00 [size=64]

   Kernel driver in use: snd_intel8x0

00:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])

   Flags: bus master, 66MHz, medium devsel, latency 32

   Bus: primary=00, secondary=02, subordinate=04, sec-latency=32

   Memory behind bridge: fa600000-fa8fffff

   Prefetchable memory behind bridge: 00000000ca000000-00000000ca1fffff

   Capabilities: [a0] PCI-X bridge device

   Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration

   Capabilities: [c0] HyperTransport: Slave or Primary Interface

00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])

   Subsystem: Advanced Micro Devices [AMD] Device 36c0

   Flags: bus master, medium devsel, latency 0

   Memory at fa9ff000 (64-bit, non-prefetchable) [size=4K]

00:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])

   Flags: bus master, 66MHz, medium devsel, latency 32

   Bus: primary=00, secondary=05, subordinate=05, sec-latency=32

   Capabilities: [a0] PCI-X bridge device

   Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration

00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])

   Subsystem: Advanced Micro Devices [AMD] Device 36c0

   Flags: bus master, medium devsel, latency 0

   Memory at fa9fe000 (64-bit, non-prefetchable) [size=4K]

00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration

   Flags: fast devsel

   Capabilities: [80] HyperTransport: Host or Secondary Interface

   Capabilities: [a0] HyperTransport: Host or Secondary Interface

   Capabilities: [c0] HyperTransport: Host or Secondary Interface

00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map

   Flags: fast devsel

00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller

   Flags: fast devsel

00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control

   Flags: fast devsel

   Kernel driver in use: k8temp

01:00.0 USB controller: Advanced Micro Devices [AMD] AMD-8111 USB OHCI (rev 0b) (prog-if 10 [OHCI])

   Subsystem: Advanced Micro Devices [AMD] AMD-8111 USB OHCI

   Flags: bus master, medium devsel, latency 32, IRQ 19

   Memory at fa5fd000 (32-bit, non-prefetchable) [size=4K]

   Kernel driver in use: ohci_hcd

01:00.1 USB controller: Advanced Micro Devices [AMD] AMD-8111 USB OHCI (rev 0b) (prog-if 10 [OHCI])

   Subsystem: Advanced Micro Devices [AMD] AMD-8111 USB OHCI

   Flags: bus master, medium devsel, latency 32, IRQ 19

   Memory at fa5fe000 (32-bit, non-prefetchable) [size=4K]

   Kernel driver in use: ohci_hcd

01:0a.0 Multimedia audio controller: VIA Technologies Inc. ICE1712 [Envy24] PCI Multi-Channel I/O Controller (rev 02)

   Subsystem: VIA Technologies Inc. M-Audio Delta 1010

   Flags: bus master, medium devsel, latency 32, IRQ 16

   I/O ports at b080 [size=32]

   I/O ports at b000 [size=16]

   I/O ports at ac00 [size=16]

   I/O ports at a880 [size=64]

   Capabilities: [80] Power Management version 1

   Kernel driver in use: snd_ice1712

01:0b.0 Mass storage controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)

   Subsystem: Silicon Image, Inc. SiI 3114 SATALink Controller

   Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17

   I/O ports at bc00 [size=8]

   I/O ports at b880 [size=4]

   I/O ports at b800 [size=8]

   I/O ports at b480 [size=4]

   I/O ports at b400 [size=16]

   Memory at fa5ffc00 (32-bit, non-prefetchable) [size=1K]

   Expansion ROM at fa500000 [disabled] [size=512K]

   Capabilities: [60] Power Management version 2

   Kernel driver in use: sata_sil

02:07.0 PCI bridge: Hint Corp HB6 Universal PCI-PCI bridge (non-transparent mode) (rev 15) (prog-if 00 [Normal decode])

   Flags: bus master, medium devsel, latency 32

   Bus: primary=02, secondary=03, subordinate=03, sec-latency=32

   Memory behind bridge: fa600000-fa6fffff

   Capabilities: [80] Power Management version 2

   Capabilities: [90] CompactPCI hot-swap <?>

   Capabilities: [a0] Vital Product Data

02:08.0 PCI bridge: Pericom Semiconductor Device e111 (rev 02) (prog-if 00 [Normal decode])

   Flags: bus master, 66MHz, medium devsel, latency 32

   Bus: primary=02, secondary=04, subordinate=04, sec-latency=0

   Memory behind bridge: fa700000-fa7fffff

   Prefetchable memory behind bridge: 00000000ca000000-00000000ca0fffff

   Capabilities: [80] PCI-X bridge device

   Capabilities: [a8] Subsystem: Device 0000:0000

   Capabilities: [b0] Express PCI/PCI-X to PCI-Express Bridge, MSI 00

   Capabilities: [d8] Vital Product Data

   Capabilities: [f0] MSI: Enable- Count=1/1 Maskable- 64bit+

02:09.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5703X Gigabit Ethernet (rev 02)

   Subsystem: Tyan Computer Device 2885

   Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 24

   Memory at fa8e0000 (64-bit, non-prefetchable) [size=64K]

   Expansion ROM at fa8b0000 [disabled] [size=64K]

   Capabilities: [40] PCI-X non-bridge device

   Capabilities: [48] Power Management version 2

   Capabilities: [50] Vital Product Data

   Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+

   Kernel driver in use: tg3

03:00.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01) (prog-if 10 [OHCI])

   Subsystem: AFAVLAB Technology Inc Device 702a

   Flags: bus master, medium devsel, latency 32, IRQ 15

   Memory at fa6ff800 (32-bit, non-prefetchable) [size=2K]

   Memory at fa6f8000 (32-bit, non-prefetchable) [size=16K]

   Capabilities: [44] Power Management version 2

03:01.0 USB controller: NEC Corporation OHCI USB Controller (rev 43) (prog-if 10 [OHCI])

   Subsystem: Siig Inc Device 131f

   Flags: bus master, medium devsel, latency 32, IRQ 27

   Memory at fa6fd000 (32-bit, non-prefetchable) [size=4K]

   Capabilities: [40] Power Management version 2

   Kernel driver in use: ohci_hcd

03:01.1 USB controller: NEC Corporation OHCI USB Controller (rev 43) (prog-if 10 [OHCI])

   Subsystem: Siig Inc Device 131f

   Flags: bus master, medium devsel, latency 32, IRQ 24

   Memory at fa6fe000 (32-bit, non-prefetchable) [size=4K]

   Capabilities: [40] Power Management version 2

   Kernel driver in use: ohci_hcd

03:01.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04) (prog-if 20 [EHCI])

   Subsystem: Siig Inc Device 00e0

   Flags: bus master, medium devsel, latency 32, IRQ 25

   Memory at fa6ff400 (32-bit, non-prefetchable) [size=256]

   Capabilities: [40] Power Management version 2

   Kernel driver in use: ehci_hcd

04:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])

   Subsystem: NEC Corporation uPD720200 USB 3.0 Host Controller

   Flags: bus master, fast devsel, latency 0, IRQ 27

   Memory at fa7fe000 (64-bit, non-prefetchable) [size=8K]

   Capabilities: [50] Power Management version 3

   Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+

   Capabilities: [90] MSI-X: Enable- Count=8 Masked-

   Capabilities: [a0] Express Endpoint, MSI 00

   Kernel driver in use: xhci_hcd

06:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-8151 System Controller (rev 14)

   Subsystem: Advanced Micro Devices [AMD] AMD-8151 System Controller

   Flags: bus master, medium devsel, latency 0

   Memory at <ignored> (32-bit, prefetchable) [size=128M]

   Capabilities: [a0] AGP version 3.0

   Capabilities: [c0] HyperTransport: Slave or Primary Interface

   Kernel driver in use: agpgart-amd64

06:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8151 AGP Bridge (rev 14) (prog-if 00 [Normal decode])

   Flags: bus master, 66MHz, medium devsel, latency 32

   Bus: primary=06, secondary=07, subordinate=07, sec-latency=32

   Memory behind bridge: faa00000-feafffff

   Prefetchable memory behind bridge: ca300000-ea2fffff

07:00.0 VGA compatible controller: NVIDIA Corporation NV40 [GeForce 6800 Ultra] (rev a1) (prog-if 00 [VGA controller])

   Flags: bus master, 66MHz, medium devsel, latency 248, IRQ 16

   Memory at fd000000 (32-bit, non-prefetchable) [size=16M]

   Memory at d0000000 (32-bit, prefetchable) [size=256M]

   Memory at fc000000 (32-bit, non-prefetchable) [size=16M]

   [virtual] Expansion ROM at feae0000 [disabled] [size=128K]

   Capabilities: [60] Power Management version 2

   Capabilities: [44] AGP version 3.0

   Kernel driver in use: nvidia

   Kernel modules: nvidia
```

----------

## exhausted

I've been working on this off and on for so long, I'm probably not far away from giving up.  I suspect that there are several courses of action I could try:

Try downgrading udev and/or the kernel.  This can't be a very good option.  I can't hang on to some old version of udev and/or kernel forever.

Reinstall from scratch.  I really don't want to do that.  Even if I reinstall from scratch, what would prevent this exact same problem from happening again?

Try upgrading from a backup.  I have a complete backup of my system.  Unfortunately, that backup is a year old.  (I'm currently running that backup at the moment.)  Would it be practical to reload from a year-old backup and try updating it?  There's also the risk that I'd run into the same problem after updating the kernel and/or udev, whichever it was that screwed up my system.

Do any of those options seem like a good idea?

----------

## The Doctor

 *Quote:*   

> Do any of those options seem like a good idea?

 

Not really. You can't update a year-old install. Installing from scratch probably won't do it since it won't fix the problem. Downgrading the kernel may help. As I pointed out before, udev isn't a player if you can't mount your root since it resides there.

Better: unplug your external drives and see if that helps.

Or: Play with using a PARTUUID for root. As PaulBredbury said, it works and you can boot with it. You can use UUID for everything else.

----------

## NeddySeagoon

exhausted,

Nope, none of those are good ideas, for the reasons you listed.  

Updating a one year old Gentoo is an interesting intellectual exercise but a reinstall would be faster.

Explain what storage devices you have attached to your system and the physical attachment, e.g. USB, SATA, PATA, SCSI ...

Also post the output of /sbin/blkid

----------

## ulenrich

Truly, this is an exhausting story. If your issue were correctly diagnosed, you would probably laugh out loud about it!

The cause could be a very minor behavioral error on your side: if, for example, you used some backup method on the partition level and duplicated UUIDs and labels by restoring to another partition, or something like that...
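One way to check for duplicated identifiers is to scan the output of `/sbin/blkid` for UUIDs or labels that appear on more than one device. A rough Python sketch, assuming the usual `device: KEY="value" ...` line format. The sample text and its device names are invented, and the function name `find_duplicates` is hypothetical:

```python
import re
from collections import defaultdict

PAIR = re.compile(r'(\w+)="([^"]*)"')


def find_duplicates(blkid_output: str, keys=("UUID", "LABEL")):
    """Map (key, value) -> devices, keeping only values seen on 2+ devices."""
    seen = defaultdict(list)
    for line in blkid_output.splitlines():
        device, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "device: KEY=..." line
        for key, value in PAIR.findall(rest):
            if key in keys:
                seen[(key, value)].append(device.strip())
    return {k: v for k, v in seen.items() if len(v) > 1}


# Invented sample; paste real `blkid` output here instead.
sample = '''\
/dev/sda5: UUID="ebbc6ab0-0c0f-4d21-98e6-63ac2ee4d84d" TYPE="reiserfs"
/dev/sdb1: UUID="ebbc6ab0-0c0f-4d21-98e6-63ac2ee4d84d" TYPE="reiserfs"
/dev/sdc1: LABEL="backup" UUID="1111-2222" TYPE="vfat"
'''
for (key, value), devices in find_duplicates(sample).items():
    print(f"{key}={value} appears on {', '.join(devices)}")
```

If two partitions do share a UUID or label, anything that looks up a filesystem by that identifier may pick either one, which would look a lot like a "random" root device.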

----------

## exhausted

 *The Doctor wrote:*   

> unplug your external drives and see if that helps.

 

That's a great suggestion.  I did try it, though, to no avail.

 *The Doctor wrote:*   

> Play with using a PARTUUID for root.

 

If I understand correctly, I can try the following:

    1. Recompile the 3.3.8 kernel on my working year-old system (installed from backup) to support PARTUUIDs.

    2. Boot the recompiled kernel to find out what the PARTUUIDs are.

    3. Chroot into the broken installation, recompile the broken installation's kernel to support PARTUUIDs.

    4. Edit the broken installation's fstab to specify / by PARTUUID.  (This step isn't necessary at all, is it?)

    5. Edit grub's menu.lst to pass 

```
root=whatever the PARTUUID turns out to be
```

 to the kernel via GRUB.

I'll wait a bit to see if anybody sees any flaws in this plan and then I'll try it, probably tomorrow morning.
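Step 5 amounts to replacing the `root=/dev/sdXN` argument on the `kernel` line of menu.lst with `root=PARTUUID=...`. A small Python sketch of that substitution, offered as an illustration rather than a tested recipe. The title, kernel path, PARTUUID value, and the function name `set_root_partuuid` are all placeholders, and whether the kernel accepts the result still depends on its PARTUUID support for MBR disks:

```python
import re


def set_root_partuuid(menu_lst: str, partuuid: str) -> str:
    """Replace the root= argument on GRUB legacy 'kernel' lines."""
    out = []
    for line in menu_lst.splitlines():
        # Only touch 'kernel' lines; GRUB's own 'root (hdX,Y)' line is left alone.
        if line.lstrip().startswith("kernel"):
            line = re.sub(r"\broot=\S+", f"root=PARTUUID={partuuid}", line)
        out.append(line)
    return "\n".join(out)


# Placeholder entry; the title, kernel path and PARTUUID are hypothetical.
entry = """title Gentoo
root (hd0,4)
kernel /boot/vmlinuz root=/dev/sda5 ro"""
print(set_root_partuuid(entry, "0123abcd-05"))
```

Only the kernel's `root=` argument changes; arguments such as `rootfstype=` are not matched by the pattern.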

 *Quote:*   

> Explain what storage devices you have attached to your system and the physical attachment, e.g. USB, SATA, PATA, SCSI ...

 

I must be really thick.  I've stated that I have two SSDs attached via SATA and two external HDDs attached via USB.  This isn't the information you're asking for, is it?  (I greatly appreciate your patience.  I'm actually a computing technology veteran, but I'm definitely not the sharpest tool in the shed.)

----------

## exhausted

 *ulenrich wrote:*   

> Truly, this is an exhausting story. If your issue were correctly diagnosed, you would probably laugh out loud about it!

 

Yes, I suspect that you are quite right--this problem might very well turn out to have a truly ridiculous cause.

----------

## exhausted

It's got to be the kernel.  Obviously, (as The Doctor has already pointed out) udev has nothing to do with this.  It's got everything to do with how the newer kernel deals with the hardware.  There were no hardware or firmware changes.  I am absolutely certain of that.  The kernel just isn't behaving the same when it comes to assigning bus names.

UPDATE: 

 I chrooted into the broken installation and compiled a 3.3.8-gentoo kernel for it. I built and installed the kernels and modules. I was able to boot the broken installation using the 3.3.8 kernel! 

 Everything seemed perfectly fine until about five or six minutes later: The system spontaneously rebooted. ARGH! 

 I have verified that I can boot the broken installation using an older kernel. Older kernels assign the expected /dev/sda5 bus name to the root partition. However, the system is apparently unstable when booted using an older kernel. It will work for a few minutes, then spontaneously reboot.

I checked my /var/log/messages file (I'm using syslog-ng).  Everything looks normal to me except for a machine check error.  Here are the last several lines of the log:

```
Aug  4 00:13:02 amd64-at login[2833]: ROOT LOGIN  on '/dev/tty2'
Aug  4 00:14:21 amd64-at acpid: client connected from 2870[0:0]
Aug  4 00:14:21 amd64-at acpid: 1 client rule loaded
Aug  4 00:15:03 amd64-at ntpd_intres[2751]: host name not found: 4.ntp.bytestacker.com
Aug  4 00:15:03 amd64-at ntpd_intres[2751]: host name not found: 5.ticker.cis.sac.accd.edu
Aug  4 00:15:04 amd64-at ntpd_intres[2751]: host name not found: 6.sundial.cis.sac.accd.edu
Aug  4 00:15:04 amd64-at ntpd_intres[2751]: host name not found: 7.ntppub.tamu.edu
Aug  4 00:15:05 amd64-at ntpd_intres[2751]: host name not found: 8.chrono.cis.sac.accd.edu
Aug  4 00:15:05 amd64-at ntpd_intres[2751]: host name not found: 9.tick.jpunix.net
Aug  4 00:15:06 amd64-at ntpd_intres[2751]: host name not found: 10.ntp.tmc.edu
Aug  4 00:15:06 amd64-at ntpd_intres[2751]: host name not found: 11.ac-ntp1.net.cmu.edu
Aug  4 00:15:07 amd64-at ntpd_intres[2751]: host name not found: 12.ac-ntp0.net.cmu.edu
Aug  4 00:15:07 amd64-at ntpd_intres[2751]: host name not found: 13.ac-ntp2.net.cmu.edu
Aug  4 00:15:47 amd64-at kernel: mtrr: no MTRR for d0000000,10000000 found
Aug  4 00:15:58 amd64-at acpid: client 2870[0:0] has disconnected
Aug  4 00:15:58 amd64-at acpid: client connected from 2981[0:0]
Aug  4 00:15:58 amd64-at acpid: 1 client rule loaded
Aug  4 00:16:07 amd64-at ntpd_intres[2751]: parent died before we finished, exiting
Aug  4 00:17:26 amd64-at kernel: [Hardware Error]: Machine check events logged
Aug  4 00:18:15 amd64-at acpid: client 2981[0:0] has disconnected
Aug  4 00:18:15 amd64-at acpid: client connected from 3017[0:0]
Aug  4 00:18:15 amd64-at acpid: 1 client rule loaded
Aug  4 00:19:01 amd64-at acpid: client 3017[0:0] has disconnected
Aug  4 00:19:01 amd64-at acpid: client connected from 3044[0:0]
Aug  4 00:19:01 amd64-at acpid: 1 client rule loaded
Aug  4 00:19:38 amd64-at acpid: client 3044[0:0] has disconnected
Aug  4 00:19:38 amd64-at acpid: client connected from 3071[0:0]
Aug  4 00:19:38 amd64-at acpid: 1 client rule loaded
Aug  4 00:23:53 amd64-at kernel: mtrr: no MTRR for d0000000,10000000 found
Aug  4 00:24:00 amd64-at acpid: client 3071[0:0] has disconnected
Aug  4 00:24:00 amd64-at acpid: client connected from 3159[0:0]
Aug  4 00:24:00 amd64-at acpid: 1 client rule loaded
Aug  4 00:31:43 amd64-at acpid: client 3159[0:0] has disconnected
Aug  4 00:31:43 amd64-at acpid: client connected from 3186[0:0]
Aug  4 00:31:43 amd64-at acpid: 1 client rule loaded
Aug  4 00:32:50 amd64-at acpid: client 3186[0:0] has disconnected
Aug  4 00:32:50 amd64-at acpid: client connected from 3214[0:0]
Aug  4 00:32:50 amd64-at acpid: 1 client rule loaded
```

What the heck?  A machine check exception?  I'm inclined to believe that this is not actually a hardware fault.  This system has run nonstop for about a week using my backup Gentoo installation with no sign of any hardware problems.

----------

## NeddySeagoon

exhausted,

Your PARTUUID plan is sound, provided that 3.3.8 supports PARTUUIDs for anything other than GPT.

It's fairly new for MSDOS partition tables.

----------

## exhausted

After many months of trying to solve this problem, I deem this unsolvable.

It's truly a freakish problem.  The Linux kernel appears to be assigning device names unpredictably and changing up the names with every boot.  Every boot, it's essentially a roll of the dice.  This only affects newer versions of the kernel.

I was forced to wipe the SSD and install Gentoo from scratch, which probably turned out to be a great idea.  The previous installation was from 2005 and had built up a great deal of cruft.  There were lots of configuration files that are no longer used, different files used for some things, the locations of some files have changed--there's just been a lot that's happened since 2005.  Installing from scratch got me a much cleaner system.

The new installation uses the latest stable kernel from gentoo-sources with no problems.  sda is always sda, sdb is always sdb, et cetera.

Many thanks to everybody for their help with this weird issue.

----------

## roarinelk

Try and disable "Asynchronous SCSI scanning" in the kernel (CONFIG_SCSI_SCAN_ASYNC).  This gets you predictable sdX device nodes, but increases boot time a bit depending on how many scsi/sata/ide drives you have because they're scanned for one after the other.
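To see how that option is currently set, you can look it up in the kernel's `.config`. A minimal Python sketch, assuming the standard `.config` syntax (`CONFIG_FOO=y` or `# CONFIG_FOO is not set`). The function name `config_option_state` is hypothetical and the excerpt below is invented; on Gentoo the real file is typically `/usr/src/linux/.config`:

```python
def config_option_state(config_text: str, option: str) -> str:
    """Return the option's value ('y', 'm', ...) or 'not set'."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "not set"
        if line.startswith(f"{option}="):
            return line.split("=", 1)[1]
    return "not set"  # option absent from this config


# Invented excerpt; read the real .config on an actual system.
sample_config = """\
CONFIG_SCSI=y
# CONFIG_SCSI_SCAN_ASYNC is not set
CONFIG_BLK_DEV_SD=y
"""
print(config_option_state(sample_config, "CONFIG_SCSI_SCAN_ASYNC"))  # not set
```

A result of `y` would mean asynchronous scanning is compiled in, which is the case this suggestion is aimed at.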

----------

## exhausted

@roarinelk: That's an excellent suggestion!  However, I'd already tried it.  (I've had asynchronous SCSI scanning disabled for years.)

Even using known good kernel configurations (same configuration file for the same version of the kernel on a different installation on the same machine) did not solve the problem.  I suppose I'll never know what caused it, but that's okay.  A fresh, clean installation from scratch worked a treat.

----------

