# [solved] Kernel freezing on boot after CPU/BIOS upgrade

## haarp

Hi there. Got a huge problem.

I installed a Gentoo machine. New board and everything. Popped an old Sempron CPU on there that I had lying around and proceeded to install Gentoo (AMD64). So far so good, machine works great with a gentoo-2.6.32-r1 kernel.

A few days ago, my new Phenom II 955 arrived. Really nice CPU. So I updated the BIOS to the newest version and put the Phenom into the board. Suddenly, Gentoo does not boot anymore. More specifically, the kernel freezes during bootup. These are the last lines I see:

http://img696.imageshack.us/img696/3029/img4586d.jpg

I waited about 2 minutes to see if it was just stalling somewhere. I tried disabling HPET in the kernel and BIOS to see if that was the problem, but that only makes the last line go away. I disabled Cool'n'Quiet in the BIOS and manually downclocked the CPU to about 800MHz. I removed everything from the kernel command line except for the "root" option. I tried the pci=routeirq, pci=nomsi, noapic and noacpi options. I compiled a new kernel, then one with default options, then one with defconfig options. All to no avail.

What boggles my mind though is the fact that Windows, Linux LiveCDs and even the Gentoo Minimal Install CD boot up just fine! I even tried using the same kernel as on the Install CD (currently gentoo-2.6.31-r6) to no avail. I really ran out of ideas. I just can't get this kernel to boot.

The problem must be with either the new CPU or the new BIOS, neither of which I can replace currently. But due to the fact that other OSes work fine, I suspect some kernel option is borked and only the combination of both makes it freeze. No idea...

Last edited by haarp on Wed Jan 13, 2010 10:37 am; edited 2 times in total

----------

## nativemad

Hi, 

Have you used some generic CFLAGS during the Sempron install? Not that it just segfaults in the background with init or something like that.

Do you use genkernel to compile the kernel-sources!? 

Have you also tried taking the kernel + initramfs directly from the minimal CD, instead of just compiling the same version? (Just copy them over to your /boot and add an additional boot entry for that kernel. Even if you don't have the appropriate /lib/modules dir, it should at least reach init!)

----------

## haarp

Thanks for answering!

 *nativemad wrote:*   

> Hi, 
> 
> have you used some generic CFLAGS during the sempron-install!? -Not that it just segfaults in the background with init or something like that.

 

Nope. I didn't specify any CFLAGS during the kernel build. (I even doubt that's possible; the only CFLAG that can be changed is -Os/-O2 in menuconfig.)

Besides, I've never seen the Kernel segfaulting :O

 *Quote:*   

> Do you use genkernel to compile the kernel-sources!? 

 

Nope.

 *Quote:*   

> Have you also tried to take the Kernel + initramfs directly from the minimal-cd, instead of just trying to compile the same version.... (Just copy them over to your /boot and add an additional boot-entry for that Kernel... Even if you don't have the appropriate /lib/modules -dir, it should at least reach init!)

 

Good idea! I'll try that when I get home.

----------

## haarp

Sorry, no dice. I can't find the kernels anywhere on any LiveCD. There are /vmlinuz symlinks, but they point to nothing...

----------

## NeddySeagoon

haarp,

Post your lspci output, and while you are waiting for us to look it over, boot with a LiveCD and try one of Pappy's seeds.

The contents of make.conf are not used to build kernels. You use the kernel's own build system. If you want parallel make, you need to pass make the -j4 option for 4 instances of gcc.

----------

## haarp

Hi,

I managed to get my hands on a random Ubuntu kernel. It successfully booted my system. Of course, all the modules are missing, but it boots...

Next I tried an old kernel built by another AMD machine (gentoo-2.6.28-r5) that I know is ok. It won't boot, same problem. Then I tried compiling a vanilla kernel 2.6.31 with default settings (fresh .config). This one wouldn't boot either.

Since the Ubuntu kernel boots, this problem can't be caused by the CPU/BIOS change alone. But since the kernel from the other AMD machine works fine on that machine right now, yet not on this one, it can't be the kernel alone either.

It must be a combination of both kernel and CPU/BIOS. There's no other explanation. So far, only the kernels that *I* built seem to fail. Could this be related to the GCC version I use? Both my machines use GCC 4.3.4 and built their kernels with that. Maybe this GCC is somehow not compatible with my setup? I know it sounds unlikely, but meh. What GCC version is the AMD64 Minimal Install CD using for its kernel build? The only other idea I have is that there's some kernel option that all the distros usually deactivate in their kernel builds, but that happens to be active in both my config and the default kernel config.

Neddy, here's lspci -v as seen from an Ubuntu LiveCD:

http://pastebin.com/m2f884b9e

Last edited by haarp on Mon Jan 11, 2010 9:02 pm; edited 1 time in total

----------

## haarp

Ok, I directly used pappy's seed for 2.6.32-gentoo-r1. It built, but again, fails to boot.

Are there any options to make the kernel log on boot more verbose?

----------

## nativemad

If config export is enabled in the working Ubuntu kernel, then you could run `cat /proc/config.gz | gunzip >/root/ubuntuconfig` to get the config!

----------

## haarp

 *nativemad wrote:*   

> If its enabled in the working ubuntu kernel, then you could `cat /proc/config.gz | gunzip >/root/ubuntuconfig` to get the config!

 

Nope. Ubuntu apparently does not export the config into /proc.
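For reference, a minimal sketch of a second place to look: distribution kernels (Ubuntu's packaged ones included) usually install their build config as `/boot/config-<release>` next to the kernel image, even when `/proc/config.gz` is not exported. The function name `show_kernel_config` and its optional root-prefix argument are made up for illustration, and the file only exists where the kernel package itself was installed.

```shell
# Print a kernel build config, trying the usual locations in order.
# $1: optional root prefix (hypothetical, e.g. a mounted filesystem); default is the live system
# $2: optional kernel release string; default is the running kernel's
show_kernel_config() {
    root="${1:-}"
    release="${2:-$(uname -r)}"
    if [ -r "$root/proc/config.gz" ]; then
        # Exported by the kernel itself when built with CONFIG_IKCONFIG_PROC
        gunzip -c "$root/proc/config.gz"
    elif [ -r "$root/boot/config-$release" ]; then
        # Plain-text copy shipped alongside many distribution kernels
        cat "$root/boot/config-$release"
    else
        echo "no kernel config found for $release" >&2
        return 1
    fi
}
```

Running `show_kernel_config > /root/ubuntuconfig` while booted into the working kernel would save whichever copy is found first.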

----------

## NeddySeagoon

haarp,

Here are a few of my kernel config files for a 64-bit kernel on a Phenom II on almost the same hardware.

I would expect them to boot, provided you only need ext2 and ext3. If you need other filesystems, you need to add them.

My network interfaces are different - but you don't need networking to get booted.

Daft question time: are you sure you are installing your kernel properly, e.g. mounting /boot and getting the name right for grub?

----------

## haarp

In fact, I don't even need filesystem drivers at all to see if it works. The kernel freezes long before any filesystem gets mounted.

As for the kernel installs, yes, I'm sure that they work properly. After all, I use grub's integrated command-line editing and tab-complete the kernel filenames to be sure I boot the one I want.

----------

## nativemad

Two things that I can see from the screenshot...

First, you pass video=uvesafb as a boot option, but "No AGP bridge found". Maybe you'll have more luck without the video= option or with agpgart built into the kernel.

Second, "iommu=noaperture" as a boot option could get rid of the "Your BIOS doesn't leave a aperture memory hole" message.

...I would bet on the first one, as the two "Console" messages at the bottom could indicate that it is trying to switch over to the framebuffer...

----------

## haarp

 *nativemad wrote:*   

> Two things that i can see from the screenshot...
> 
> First, you pass video=uvesafb as bootoption, but "No AGP bridge found". Maybe you get more luck without the video= or with agpgart fixed compiled into the Kernel.

 

 *haarp wrote:*   

> I removed the kernel command line except for the "root" option.

 

Doesn't matter anyway. Uvesafb fails for me, see here: https://forums.gentoo.org/viewtopic-t-799330-highlight-uvesafb.html

But that's unrelated.

 *Quote:*   

> second: "iommu=noaperture" as bootoption could prevent you from the "Your BIOS doesn't leave a aperture memory hole" message. 

 

Thanks. That stupid message has been haunting me for a while now.

----------

## haarp

 *NeddySeagoon wrote:*   

> haarp,
> 
> Here's a few of my kernel config files for a 64 bit kernel on a Phenom II on almost the same hardware.

 

Wow, Neddy, you're awesome! When compiled, your .config boots!

I made a diff between your settings and mine (It goes from working condition to broken). If anyone could skim over this and see if they spot any obvious culprits, I'd really appreciate it!

http://pastebin.com/m17a6f762
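A rough sketch of one way to get a less noisy comparison than a plain `diff` of two `.config` files: sort both and compare only the options that are actually set, so reordered lines and `# ... is not set` comments drop out. `config_diff` and the file names in the usage note are hypothetical.

```shell
# List options that are set in only one of two kernel .config files.
# Column 1: set only in the first file. Column 2 (tab-indented): set only in the second.
config_diff() {
    first="$1"
    second="$2"
    a=$(mktemp)
    b=$(mktemp)
    # Keep only "CONFIG_FOO=value" lines; fix the locale so sort and comm agree on ordering
    grep '^CONFIG_' "$first"  | LC_ALL=C sort > "$a"
    grep '^CONFIG_' "$second" | LC_ALL=C sort > "$b"
    # -3 suppresses lines common to both files
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}
```

Called as `config_diff working.config broken.config`, an option that changed value shows up twice: once in each column with its respective setting.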

----------

## NeddySeagoon

haarp,

This fragment (line 1640) is terminal.

```
-# CONFIG_RTC_INTF_SYSFS is not set
-# CONFIG_RTC_INTF_PROC is not set
+CONFIG_RTC_INTF_SYSFS=y
+CONFIG_RTC_INTF_PROC=y
```

PROC is a must-have to boot. I'm not sure about SYSFS, but lots of things expect it to be there.

----------

## haarp

Hi,

I've got it figured out now. The option that causes this problem is....*drumroll*

TIMER_FREQUENCY! Neddy's 250Hz works just fine, whereas 1000Hz makes it freeze on boot. I think it's quite possible that I found a bug that only shows up in combination with my hardware. What do you think?

And thanks everybody for your help!

edit: Found a related Mailing List entry:

http://lkml.indiana.edu/hypermail/linux/kernel/0901.2/00830.html
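For anyone wanting to check their own setup: the timer frequency chosen in menuconfig ends up in the `.config` as a `CONFIG_HZ_<n>=y` choice plus a numeric `CONFIG_HZ=<n>` line. A small sketch for reading it back; `kernel_hz` is a made-up helper name, and `/usr/src/linux/.config` in the usage note is only the usual Gentoo location, not something taken from this thread.

```shell
# Print the numeric timer frequency (in Hz) recorded in a kernel .config file.
# Prints nothing if the file has no CONFIG_HZ=<number> line.
kernel_hz() {
    sed -n 's/^CONFIG_HZ=\([0-9][0-9]*\)$/\1/p' "$1"
}
```

For example, `kernel_hz /usr/src/linux/.config` would print `250` for a config like Neddy's and `1000` for the one that froze here.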

----------

## NeddySeagoon

haarp,

Hmm, I suppose it's possible that at 1000Hz you can get a timer IRQ before the IRQ service routine is set up, but IRQs should be masked while the service routine isn't ready.

That would be a major bug and lots of people would hit it. There would be gnashing and wailing all over the internet.

Another subtle mechanism is that at 1000Hz there may always be a pending timer IRQ, so the CPU exits the IRQ handler and goes straight back in again.

A millisecond is a long time for the CPU to accept the IRQ, decide there is nothing to do and exit the interrupt routine, so I doubt it's that either.

For this to be real, your CPU would probably need to sleep in the interrupt service routine (that's a no-no) or be seriously underclocked.
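To put "a millisecond is a long time" into numbers: the cycles available between two timer ticks are roughly the clock rate divided by HZ. A back-of-the-envelope sketch, using the 800MHz downclock mentioned earlier in the thread as the slow case; `cycles_per_tick` is a hypothetical helper, and the figure ignores anything that stalls the CPU.

```shell
# Approximate CPU cycles available between two timer interrupts.
# $1: CPU clock in MHz, $2: timer frequency (HZ)
cycles_per_tick() {
    mhz="$1"
    hz="$2"
    echo $(( mhz * 1000000 / hz ))
}

cycles_per_tick 800 1000   # prints 800000
cycles_per_tick 800 250    # prints 3200000
```

Even at 800MHz and 1000Hz that leaves hundreds of thousands of cycles per tick, which is the basis for doubting that the handler simply cannot keep up.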

Disregard my last post. What I said about PROC is true, but CONFIG_RTC_INTF_PROC isn't the kernel symbol for /proc.

I'm glad you have it fixed.

----------

## haarp

Quite the opposite actually, the CPU is usually slightly overclocked. But I tried all kinds of clocks with the 1000Hz kernel, none of which worked.

The mailing list entry I posted mentions possibly missing BIOS support for the CPU. I suspect that my hardware may have exactly the same problem. While the CPU works without problems in most cases, one quirk I noticed is that setting the VID (Vcore) does not have any effect and that Cool'n'Quiet doesn't work. To me, this indicates incomplete support by the BIOS.

But since the BIOS is already the newest available (Dec 09, even), I assume that Sapphire is either not interested in implementing support, or not able to.

I, however, am interested in getting this bug fixed, but have no idea where to start. I fear that I'll just end up in an "update BIOS or we can't help you" scenario.

----------

