# Kernel Panic, Default Make Selections, and more...

## Gentoo4Work

The headache is that right now I can't get my kernel to boot.  I'm getting kernel panic when it tries to boot, 'cannot mount root at unknown block'.  No message about filesystem, specifying proper root entries, etc., that I've seen posted elsewhere.  The root device *is* on an EFI / GPT formatted disk with non-legacy geometry... but chrooting via livecd has shown that other kernels don't seem to have any problem mounting the device, it's only my kernel(s) that want to hang on its account during boot.  The major/minor is showing the correct drive/partition (sda3).  Switching between IDE/compatible, IDE/enhanced, and AHCI at the BIOS level hasn't made any difference.  FSTAB is as it should be.

I even unmasked gentoo-sources-2.6.32-r7 and compiled that, which is supposed to have a fix for the problem of LBA > 512 bytes (and in bugzilla the ~amd64 maintainer comments that it's amd64 stable in that regard), but still the same exact error.

What on earth determines the default kernel configuration options for gentoo-sources?  I ask because every time I've done a make menuconfig I've been presented with all kinds of bizarre options pre-selected as 'yes'.  E.g., it wants me to hard-install support for Eee PC -- why?  I'm on a high-end workstation... The only thing separating this from an idle annoyance is that, as I try to learn the ins-and-outs of kernel compilation, going through the options and searching for information to understand what they mean, I'm coming across a lot of pre-selected options that don't seem to be required by my hardware, and have a description that reads something like, "you almost definitely do NOT need this."  At the same time, a lot of options are NOT selected and have descriptions that read, "not only is it safe to say YES, but you probably should for compatibility reasons."  Also, I have hardware pre-selected that I don't have on my machine, hardware non-selected that IS on my machine... etc.  My inclination is to get rid of all of the stuff that isn't absolutely necessary (esp. everything marked 'experimental', 'NEW!', or 'dangerous'), but, as you can see, right now I'm having compatibility issues, so...

I could be over-complicating things with my setup, but I can't get over the notion that there must be something simple that I'm missing, since the drive does mount r/w from LiveCD's without a hitch.  What I *want* to do is boot grub-9999 (which has worked flawlessly for me), and have a relatively large boot partition that includes, among other things, a variety of different LiveCD images for diagnostic, testing, and snapshot / reversion purposes (which is why I'm booting from a 2TB single disk in the first place... it's basically an archive drive).

Right now, /boot and / are one and the same.  The reason for that is 1) I wanted to install grub-9999 directly with gentoo, rather than some other distro, and 2) I ultimately want to run / on my RAID10, but wanted to make sure that the kernel was comfortable r/w'ing from it before installing everything there.

I'm also running ACPI3.0, if that matters (though switching between that, 2.0, 1.0, and disabled in the BIOS hasn't made a lick of difference).

Everyone else has posted nice, neat print-outs of their outputs and configuration screens... I suppose that I could boot from a liveCD, open the logs from the drive, and cut-and-paste them here, if any helpful individuals could point me in the right direction of what I need to post to get this thing working.

----------

## Jaglover

See kernel README for possible targets.

Kernel options depend on each other (this is why we do not edit .config directly), so sometimes it takes a little time to figure out why you are unable to disable some unnecessary feature. You may need to go through menuconfig twice, or even three times, to get it cleaned up (disabling the dependents during the first round).

A kernel panic while the major/minor numbers are correct usually means that root filesystem support is missing, or is built as a module.
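To make the major/minor point concrete, here is a minimal sketch that decodes the `unknown-block(major,minor)` pair from a panic line into a device name. The exact message text is an assumption (the original post only paraphrases it), and the helper name `decode_unknown_block` is made up for illustration. It relies only on the standard Linux numbering for `sd` disks: major 8, with 16 consecutive minors per disk.

```python
import re

SD_MAJOR = 8          # block major for the first 16 sd disks (sda..sdp)
MINORS_PER_DISK = 16  # each sd disk owns 16 consecutive minor numbers


def decode_unknown_block(message):
    """Return (major, minor, guessed_name) from a panic line, or None.

    guessed_name is only filled in for major 8; other majors are left
    as None rather than guessed.
    """
    match = re.search(r"unknown-block\((\d+),(\d+)\)", message)
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    name = None
    if major == SD_MAJOR:
        disk, part = divmod(minor, MINORS_PER_DISK)
        name = "sd" + chr(ord("a") + disk) + (str(part) if part else "")
    return major, minor, name


# Assumed wording of the panic line; only the (8,3) pair matters here.
line = "VFS: Unable to mount root fs on unknown-block(8,3)"
print(decode_unknown_block(line))  # prints (8, 3, 'sda3')
```

If the pair decodes to the partition you expect, the kernel found the right device number and the remaining suspects are the disk controller driver, partition table support, and the root filesystem driver.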

----------

## Hu

Please post the output of `nl /etc/fstab ; nl /boot/grub/grub.conf` and the last 10 lines of kernel output before the panic.

 *Gentoo4Work wrote:*   

> What on earth determines the default kernel configuration options for gentoo-sources?

 

This is controlled by the defconfig files shipped in the kernel sources.  These files are maintained by upstream.  Gentoo may patch the choices on occasion, but most of what you are seeing is from upstream.  The menuconfig process does not attempt to match the settings to your current hardware.

The options where the description text and the defaults disagree are unfortunate.  In some cases, they may be outright mistakes, or the result of behavior changing over time.  For example, an option may have been added with help text that matched its default, then had the default changed when the underlying code matured.

----------

## kkretsch

 *Gentoo4Work wrote:*   

> I'm getting kernel panic when it tries to boot, 'cannot mount root at unknown block'.  No message about filesystem, specifying proper root entries, etc., that I've seen posted elsewhere.  The root device *is* on an EFI / GPT formatted disk with non-legacy geometry... but chrooting via livecd has shown that other kernels don't seem to have any problem mounting the device, it's only my kernel(s) that want to hang on its account during boot.  The major/minor is showing the correct drive/partition (sda3).

 

Is it possible that your kernel does not have GPT support configured? Please look for CONFIG_EFI_PARTITION in the .config file. If it is not set, you can find it in menuconfig under File systems -> Partition Types -> Advanced partition selection -> EFI GUID Partition support. This would explain the "unknown block" message because the kernel doesn't find your partitions.
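A quick way to check this without paging through menuconfig is to scan the `.config` text for the options that must be built in. The sketch below is illustrative only: the helper names `option_state` and `boot_blockers` are made up, the sample config is invented, and `CONFIG_EXT4_FS` assumes an ext4 root (the thread does not say which filesystem is in use). An option set to `m` is flagged too, since a module cannot be loaded before root is mounted unless an initramfs is used.

```python
def option_state(config_text, option):
    """Return the value of one option in .config text, or 'unset'."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_FOO is not set" and are
        # skipped here because they do not start with "CONFIG_FOO=".
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return "unset"


def boot_blockers(config_text, required):
    """List the required options that are not built in (value other than 'y')."""
    return [opt for opt in required if option_state(config_text, opt) != "y"]


# Invented sample; in practice read /usr/src/linux/.config instead.
SAMPLE_CONFIG = """\
CONFIG_PARTITION_ADVANCED=y
# CONFIG_EFI_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_EXT4_FS=m
"""

# Assumed requirements for a GPT disk with an ext4 root.
REQUIRED = ["CONFIG_EFI_PARTITION", "CONFIG_MSDOS_PARTITION", "CONFIG_EXT4_FS"]

print(boot_blockers(SAMPLE_CONFIG, REQUIRED))
# prints ['CONFIG_EFI_PARTITION', 'CONFIG_EXT4_FS']
```

To run it against a real tree, replace `SAMPLE_CONFIG` with the contents of the kernel's `.config` and adjust `REQUIRED` to the disk controller, partition scheme, and filesystem actually in use.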

By the way, the released grub 1.* works just fine for me. There is probably no need for using the dev snapshot.

----------

## Gentoo4Work

 *kkretsch wrote:*   

>  *Gentoo4Work wrote:*   I'm getting kernel panic when it tries to boot, 'cannot mount root at unknown block'.  No message about filesystem, specifying proper root entries, etc., that I've seen posted elsewhere.  The root device *is* on an EFI / GPT formatted disk with non-legacy geometry... but chrooting via livecd has shown that other kernels don't seem to have any problem mounting the device, it's only my kernel(s) that want to hang on its account during boot.  The major/minor is showing the correct drive/partition (sda3). 
> 
> Is it possible that your kernel does not have GPT support configured? Please look for CONFIG_EFI_PARTITION in the .config file. If it is not set, you can find it in menuconfig under File systems -> Partition Types -> Advanced partition selection -> EFI GUID Partition support. This would explain the "unknown block" message because the kernel doesn't find your partitions.
> 
> By the way, the released grub 1.* works just fine for me. There is probably no need for using the dev snapshot.

 

Honestly, the dev-snapshot choice wasn't a matter of INEEDTOBEBLEEDINGEDGE, but rather that it was the only thing listed in the gentoo/grub2 wiki, and a cursory google search mentioned that the developers themselves were recommending it.  I actually tried FreeBSD before coming to Gentoo, but the hardware limitations were just too restrictive.  My ideal end-goal is sort of an IT-dept-in-a-box, something stable, but automated and auditable in terms of maintenance and self-monitoring.  That's why I went to Gentoo... I figure that a week of trial-error-research-and-trial-again is infinitely better than something like Ubuntu.  Ultimately I intend to use this thing for HPC work, so it's best that I learn the ins-and-outs anyway.

It seems like the problem was actually related to APIC ACPI over-determining interrupts during boot.  I got the real clue when I plugged in a USB drive to seed the grub-boot partition with some rescue disk images, and when I tried to boot the SysRescueCD (from the disk) while the USB drive was plugged in, I got the same kernel panic as I did when I was booting straight from the HD... which obviously shouldn't happen when booting from a LiveCD that had previously worked, unless something else had changed at the BIOS level.  Nothing had, so my conclusion is that hotplugging the USB drive disrupted the process.  Which means that my original problem was that, when I compiled the new kernel (which included support for my SAS RAID array, whereas it hadn't before), the APIC ACPI IRQ handler threw off the whole numbering scheme... so the major-minors probably weren't right after all -- they were right before I rebooted.  Then I went and accidentally hosed my stage4 using a different liveCD, because I hadn't realized the numbering scheme had changed yet.

Unfortunately I've got to get some paying work done before I can devote too much more time to this... there will be updates later.  I really appreciate everyone's help.

----------

## Gentoo4Work

 *Hu wrote:*   

> Please post the output of nl /etc/fstab ; nl /boot/grub/grub.conf and the last 10 lines of kernel output before the panic.
> 
>  *Gentoo4Work wrote:*   What on earth determines the default kernel configuration options for gentoo-sources? 
> 
> This is controlled by the defconfig files shipped in the kernel sources.  This is maintained by upstream.  Gentoo may patch the choices on occasion, but most of what you are seeing is from upstream.  The menuconfig process does not attempt to match the settings to your current hardware.
> ...

 

This is GREAT info, and IMO should be added to the official Gentoo handbook.  Especially because so many of the modules have very esoteric descriptions.  I must have re-compiled two or three times before realizing that the autoselections weren't tailored for my hardware, and that I needed to probe the internals on my own before searching through the options by hand.

----------

## Gentoo4Work

Okay, this time I think I'm much closer.  Re-did everything, got the kernel past the panic it was having before, all the way up to the point at which it loaded the kernel into memory.  First got an error that it couldn't open the console, then I got several modprobe errors about there not being modules where they should be... but when I jump into /lib/modules/<kernel>/, and examine modules.dep, everything matches up with my find ... | less.  Why would that happen?

----------

## Hu

I am not sure what would cause that behavior.  I only use modules when necessary.  Could you post the exact output of the error text and of the find command?
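For the comparison itself, something along these lines could replace eyeballing `modules.dep` against the `find` output. It is a rough sketch: the helper name `missing_modules` is hypothetical, and it assumes each `modules.dep` line has the form `path/to/module.ko: dependencies...`, with the path either absolute or relative to the module directory. The demo builds a throwaway directory so that it runs anywhere.

```python
import os
import tempfile


def missing_modules(module_dir):
    """Return modules.dep entries whose module file is absent on disk."""
    missing = []
    with open(os.path.join(module_dir, "modules.dep")) as dep_file:
        for line in dep_file:
            # The part before the first colon is the module being described.
            target = line.split(":", 1)[0].strip()
            if not target:
                continue
            if os.path.isabs(target):
                path = target
            else:
                path = os.path.join(module_dir, target)
            if not os.path.exists(path):
                missing.append(target)
    return missing


# Demo on a temporary tree standing in for /lib/modules/<kernel>/.
with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, "kernel", "drivers"))
    open(os.path.join(demo, "kernel", "drivers", "present.ko"), "w").close()
    with open(os.path.join(demo, "modules.dep"), "w") as dep:
        dep.write("kernel/drivers/present.ko:\n")
        dep.write("kernel/drivers/absent.ko: kernel/drivers/present.ko\n")
    print(missing_modules(demo))  # prints ['kernel/drivers/absent.ko']
```

An empty list on the real module directory would suggest the files are where `modules.dep` says they are, which points toward the booted kernel looking in a different directory (for example a version string mismatch) or at a different root filesystem.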

----------

## Gentoo4Work

Well, I re-built the system, and I think I made some progress.  First, I switched over to a normal MBR / msdos formatted disk for the boot partition.  I put everything else on the RAID, which is apparently the problem; I got past mounting root, etc., but it hung at "failed to initialize console.  Try specifying an init..."

From what I can piece together through Google, the problem is that it's a quasi-RAID, neither hardware nor software.  The controller is an LSI SAS2 2008, and from what I gather the kernel driver is supposed to be mpt2fusion (at least, that's what gets auto-detected and allowed me to r/w to the array in the first place).

The thing that makes it really odd is that the array shows up as sde when I boot from a livecd, but when I try to boot straight from the machine it seems to supervene on the normal disk ordering and throw everything else off; I think that's why the kernel was panicking, but I'm not really sure what to do about it.  How does Gentoo understand this array?  Do I need mdadm or dmraid, or should it be transparent?  I don't know what to do at this point.

----------

## sirlark

I got bitten by what sounds like the same problem. I upgraded to 2.6.33, and moved my entire root partition from an old IDE drive (reiserfs) to a new SATA drive with ext4. I took great care to make sure that grub pointed to the right place, that my fstab was corrected for the new hdd scheme, and that my kernel had the correct drivers for SATA and ext4. Panic on boot, same error message as OP.

Turns out that for me it was some VERY weird default choice that selected Apple MAC partition schemes by default instead of MSDOS partition schemes. This was a new option for me (I hadn't upgraded the kernel in about 6 months). I removed MAC partition support and replaced it with MSDOS partition support, and voila! she boots.

----------

