# ACPI causes kernel to hang at kernel_thread_helper+0x6/0x10

## abrondz

Hi

I recently updated the UEFI on my brand new ASRock Z68 Pro3-M mainboard which is running Gentoo. After the update, I can no longer seem to run the latest gentoo-sources with ACPI enabled as it causes the system to hang at bootup with a kernel_thread_helper+0x6/0x10 error. The exact error message is quite long and I have no way of pasting it as this occurs before any logging services have been started. dmesg shows nothing either. However, the message is similar to one described here and here.

The error itself is quite vague, but after some research I found that booting the system with acpi=off fixes the problem. Unfortunately, it's a poor fix, as disabling ACPI also disables CPU features such as hyperthreading. The only good solution so far has been to downgrade to vanilla-sources 2.6.38.8, which works like a charm, but I'd like to use gentoo-sources as that is the recommended kernel.

I've also tried pf-sources which seems to help with the problem at hand, but has its own issues.

I'm wondering what kind of issue this is: a configuration issue on my part, a bug in gentoo-sources, or maybe even a bug in the upstream kernel (the latest vanilla-sources show the exact same behaviour). What's also interesting is that all the descriptions of this problem are quite old, at least two years, meaning that this problem has been around longer than the 3.x kernel series. All in all, it leaves me very puzzled.

Anyone have any ideas on what this could be?

----------

## khayyam

 *abrondz wrote:*   

> seem to run the latest gentoo-sources with ACPI enabled as it causes the system to hang at bootup with a kernel_thread_helper+0x6/0x10 error.

 

abrondz ... without the full "Call Trace" it's impossible to say what the above kthread is referring to. Suffice to say, this isn't the error, it's just part of a trace. So, with that in mind, your error may be unrelated to those linked, but ACPI obviously plays some part in this, so it might be a good idea to enable CONFIG_ACPI_DEBUG and to pastebin your dmesg and .config.

Also, as your problem occurred with a UEFI update, is your grub a 64bit or 32bit efi ... in fact, are you booting in EFI mode or MBR?

best ... khay

----------

## abrondz

I would love to post the dmesg output, but as I mentioned, I'm not sure how to do it as the system crashes before the dmesg log gets written to disk. I'm not an expert user, so if I've missed something obvious, let me know :)

My .config can be found here: http://pastebin.com/FbD6gzCr

I'm using a 32 bit grub-legacy and a 32 bit system. I'm assuming it's MBR mode as I haven't configured grub for anything else. Again, I'm not an expert in this, I simply followed the Gentoo handbook on this one.

----------

## khayyam

 *abrondz wrote:*   

> I would love to post the dmesg output, but as I mentioned, I'm not sure how to do it as the system crashes before the dmesg log gets written to disk. I'm not an expert user, so if I've missed something obvious, let me know

 

abrondz ... sorry, I wasn't thinking. There are ways of getting more information about the crash via the Magic SysRq key, but this needs to be enabled in the kernel and you need to have a SysRq key.

 *abrondz wrote:*   

> My .config can be found here: 

 

I should have said to post both the 3.2.18-gentoo and the 2.6.38.8 ... this way they can be compared for possible causes. Anyhow, looking at the 3.2.18-gentoo config, the first thing I noticed is you have various EFI options enabled when you are booting MBR ...

```
CONFIG_EFI=y
CONFIG_FB_EFI=y
CONFIG_EFI_VARS=y
CONFIG_EFI_PARTITION=y
```

These may cause issues. In fact, Documentation/fb/efifb.txt states "efifb is only for EFI booted Intel Macs", so this should probably be disabled. I imagine that your partition scheme is some hybrid MBR/GPT (otherwise you wouldn't be able to boot via regular MBR methods), so you're not really booting EFI, yet the kernel may be expecting that you are due to CONFIG_EFI=y. Though for the most part I'm speculating, this would tie in with the fact that ACPI is hit, as ACPI information is gleaned from EFI/BIOS and the methods of doing this differ, so it might be a good idea to take a close look at your EFI firmware settings and see what kind of "modes" or "compatibility" are enabled/disabled.
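Whether a running system was actually booted in EFI mode can be checked from userspace. The following is a minimal sketch, assuming the system can be brought up at all (for example with acpi=off or the working 2.6.38.8 kernel) and that the kernel has EFI support built in; it relies on the `/sys/firmware/efi` directory, which Linux only exposes after an EFI boot. The function name `booted_via_efi` and its `sysfs_root` parameter are made up for illustration.

```python
import os


def booted_via_efi(sysfs_root="/sys"):
    """Return True if the running kernel exposes EFI data under sysfs.

    sysfs_root is a hypothetical parameter, only there so the check
    can be pointed at a different directory for testing.
    """
    return os.path.isdir(os.path.join(sysfs_root, "firmware", "efi"))


if __name__ == "__main__":
    if booted_via_efi():
        print("Boot mode: EFI")
    else:
        # Also reached when the kernel was built without EFI support.
        print("Boot mode: legacy BIOS/MBR (or no EFI support in this kernel)")
```

A negative result is only conclusive when CONFIG_EFI is enabled in the running kernel, so it is worth reading together with the .config.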

 *abrondz wrote:*   

> I'm using a 32 bit grub-legacy and a 32 bit system. I'm assuming it's MBR mode as I haven't configured grub for anything else. Again, I'm not an expert in this, I simply followed the Gentoo handbook on this one.

 

I see ... well, you're kind of caught in that twilight world that exists for little reason other than backward compatibility. Your system is totally capable of EFI booting and GPT partitioning, but because Windows doesn't currently boot EFI the OEMs ship the machines with stupid workarounds to allow it to boot MBR. This isn't helped by the fact that although Linux fully supports GPT/EFI it somewhat follows suit so as to maintain that compatibility for dual booting Windows. So, unless you explicitly format the disk as GPT, adopt EFI boot methods, and use an EFI bootloader, you get stuck with these 'workarounds'. You may have a dual booting system and so need MBR compatibility, but if not you should be booting natively. Unfortunately, to do this you have to format the disk in GPT format with an ESP (EFI System Partition) and adjust your EFI firmware to be "non-compatible" (or some such). This makes the process of booting EFI difficult, because all this should really be done prior to install, and added to this the various EFI implementations differ, so I can provide little or no advice about how yours may be currently configured or how to go about changing it. It's the twilight zone .. na,na,na,na .. na,na,na,na ...

Anyhow ... I'm not entirely sure your issue is related to EFI, but it's a prime suspect based on the fact that much of the EFI support was introduced during the later period of 2.6.x .. a look at your .config for the bootable kernel may give some better clues.

HTH & best ...

khay

----------

## abrondz

Ah, I actually have that enabled and I do have a SysRq key, but looking at the Wikipedia link and trying out those key combinations gave little result as the system seems to be completely unresponsive.

 *Quote:*   

> I should have said to post both the 3.2.18-gentoo and the 2.6.38.8 ... this way they can be compared for possible causes. 

 

That sounds like a good idea, here's the 2.6.38.8 config: http://pastebin.com/jPdufctK

I disabled all the EFI-related options, but the system still crashes. Thanks for the tip anyway :)

My main reason for using grub-legacy is its simplicity. I've briefly tried grub2 with some other distributions and the amount of configuration is confusing. Also, the Gentoo handbook doesn't even mention what you just said. Or maybe it does, but it's more like a "by the way", otherwise I would've paid more attention to this.

I see there is a lot I haven't caught up on. Back when I had a good idea of how these things worked, we were all still using IDE drives :)

----------

## khayyam

 *abrondz wrote:*   

> Ah, I actually have that enabled and I do have a SysRq key, but looking at the Wikipedia link and trying out those key combinations gave little result as the system seems to be completely unresponsive.

 

abrondz ... I'm not sure at what point of the boot process SysRq becomes available, the crash may be prior to this. ACPI happens very early on, so it could be it's simply not available yet. You do have a SysRq key?

 *abrondz wrote:*   

> That sounds like a good idea, here's the 2.6.38.8 config:

 

OK, diff'ing the two shows very little having changed. The items that stand out are CONFIG_ACPI_VIDEO, which for the 2.6.38.8 is =n and for the 3.2.12-gentoo =y. Also, CONFIG_DRM_I915 and CONFIG_DRM_KMS_HELPER (Kernel Mode Setting for the inteldrmfb) are currently =y and don't seem to have been enabled for 2.6.38.8 (it looks like you were using VESA). So, what graphics hardware do you have? Also, does booting with 'i915.modeset=0' as a kernel parameter have any effect?
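For anyone repeating this comparison, a plain `diff` of two .config files is noisy because options move around between kernel versions. Below is a rough sketch of an option-by-option comparison instead. It assumes the usual .config syntax (`CONFIG_FOO=y` for set options, `# CONFIG_FOO is not set` for disabled ones); the function names `parse_config` and `diff_configs` are invented for this example and are not part of any kernel tooling.

```python
import re

# Lines such as "CONFIG_ACPI=y" or "CONFIG_HZ=1000"
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
# Lines such as "# CONFIG_ACPI_VIDEO is not set"
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option name to its value; disabled options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(old_text, new_text):
    """Return {option: (old_value, new_value)} for every option that differs.

    An option missing from one file is reported as 'absent' there.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name, "absent")
        after = new.get(name, "absent")
        if before != after:
            changes[name] = (before, after)
    return changes
```

Feeding it the contents of the two pastebinned configs would list entries like `CONFIG_ACPI_VIDEO: ('n', 'y')`. The kernel source tree also ships a `scripts/diffconfig` helper that does a similar job.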

 *abrondz wrote:*   

> I disabled all the EFI related options, but the system still crashes, but thanks for the tip anyway

 

OK, but you might want to re-enable them, at least all bar FB_EFI, as I can't be sure what your EFI firmware is doing. At least we know that the problem occurs whether these are enabled or disabled.

 *abrondz wrote:*   

> My main reason for using grub-legacy is its simplicity. I've briefly tried grub2 with some other distributions and the amount of configuration is confusing. Also, the Gentoo handbook doesn't even mention what you just said. Or maybe it does, but it's more like a "by the way", otherwise I would've paid more attention to this.

 

Well, grub2 isn't required ... you can boot the kernel directly via EFI without a bootloader (at least for kernels > 3.3), and there is also rEFInd, and efibootmgr. Grub2 can be built as an EFI executable, placed in the ESP, and set as the bootloader via efibootmgr. No need to write to the MBR, it's a lot simpler than the MBR (grub-legacy) method.

As I said, part of the reason it's not covered in any depth is because there is the tendency to want to maintain compatibility with Windows, the assumption being that people will dual boot. Plus, the Linux world is familiar with MBR, so there is a change of method that comes with EFI. It's also currently difficult to support because the various EFI firmwares that exist don't all provide similar functionality. With all this the 'easy option' is generally presented, but easy doesn't always equal the best option. In my case I couldn't find anything relating to booting EFI natively on my machine (or macbooks in general), I had to figure it out by trial and error.

 *abrondz wrote:*   

> I see there is a lot I haven't caught up on. Back when I had a good idea of how these things worked, we were all still using IDE drives

 

I still am, in some places, but I'm pretty much done with using 5.25" floppy disks :)

best ... khay

----------

## abrondz

Yeah, Alt + Print Screen *should* be SysRq, but it seems as if this is occurring way too early in the boot process, as you suggested.

 *Quote:*   

> OK, diff'ing the two shows very little having changed. The items that stand out are CONFIG_ACPI_VIDEO which for the 2.6.38.8 is =n and the 3.2.12-gentoo =y. Also, CONFIG_DRM_I915 and CONFIG_DRM_KMS_HELPER (Kernel Mode Setting for the inteldrmfb) is currently =y and doesn't seem to have been available for 2.6.38.8 (it looks like you were using VESA). So, what graphics hardware do you have? Also does booting with 'i915.modeset=0' as a kernel parameter have any effect?

 

Tried getting rid of those, no change. The parameter didn't help the situation either. I have a pretty recent nVidia card, a GTX 560 Ti to be precise and I'm using the VESA framebuffer driver.

I'll keep EFI and Linux in mind for the next time I do a system upgrade or reinstall. It sure sounds interesting, but I'm not too keen on a complete reinstall just yet :)

Also, what are my "risks" in running the vanilla 2.6.38.8? I know I'll be missing useful patches, but if the 2.6 kernel tree is still maintained, I might just get a stable system that "just works".

Thanks for all the help so far :)

----------

## khayyam

 *abrondz wrote:*   

> Tried getting rid of those, no change. The parameter didn't help the situation either.

 

abrondz ... the parameter only affects the inteldrmfb KMS ... so if I915 isn't available then of course it has nothing to configure.

 *abrondz wrote:*   

> I have a pretty recent nVidia card, a GTX 560 Ti to be precise and I'm using the VESA framebuffer driver.

 

OK, well that has me wondering why the i915 driver was enabled, considering it was disabled for 2.6.x .. you know that KMS has issues if a framebuffer driver is also enabled? Anyhow, that's obviously not the issue, as the crash occurs when they are removed. I assume there is a reason you're using VESA instead of CONFIG_AGP_NVIDIA and CONFIG_FB_NVIDIA? Have you tried booting with another framebuffer? I'm clutching at straws here, because I have so little to go on.

Also, when you built the kernel, did you start from a clean slate?

 *abrondz wrote:*   

> Also, what are my "risks" in running the vanilla 2.6.38.8? I know I'll be missing useful patches, but if the 2.6 kernel tree is still maintained, I might just get a stable system that "just works".

 

Well, it depends on what features you need ... if everything works with 2.6.x then there is little reason to move, but by moving you get access to drivers, etc, that may not be backported. You also get an upgrade path, as you will have to migrate to 3.x at some point, so it's better to be on that upgrade path than to have to stay at 2.6.x due to 3.x crashing.

 *abrondz wrote:*   

> Thanks for all the help so far

 

You're welcome ... hopefully we can get a better idea of what exactly is causing the crash, right now it's really a question of guessing.

best ... khay

----------

## abrondz

Yeah, you're right, I didn't think that through..

I have to admit that I have been importing the kernel config from version to version. I didn't give it much thought at the time, but now that you mention it, maybe I should've started from scratch. Interestingly enough, I did configure the 3.x kernel first, then imported the .config into the 2.x kernel, meaning that whatever was breaking the 3.x kernel doesn't exist in the 2.6.x kernel.

I'm not using the nVidia framebuffer because it conflicts with the nvidia X11 driver, as mentioned here: http://www.gentoo.org/doc/en/nvidia-guide.xml#doc_chap3

I do understand that there is very little to go on here, in fact, the very reason I came here is because Google gave me nothing and the entire problem is leaving me very puzzled. That's also why I'm considering just sticking to 2.6.x, at least for the time being, as it seems to have everything I need. I will upgrade to 3.x as soon as possible, maybe this is something in the kernel itself, as mentioned earlier.

For the record, I did a few other tests of my own. I made a separate partition and did a quick Gentoo install from scratch, no old .config or anything, same problem. Both Arch Linux and Sabayon failed to boot, but interestingly, they crashed when attempting to launch a framebuffer. Ubuntu booted fine, but no virtual consoles were available, again suggesting a framebuffer issue. Finally, I tried using Ubuntu's .config in Gentoo, just for the heck of it, but it crashed with the same error as before (the one I have only been seeing in Gentoo).

What I can't make any sense of is why Gentoo crashes on ACPI initialization while all the other distributions crash much later, when the kernel was already fully loaded.

Sorry for the long, and probably very confusing, post. I'm quite confused myself..

----------

## khayyam

 *abrondz wrote:*   

> I have to admit that I have been importing the kernel config from version to version. I didn't give it much thought at the time, but now that you mention it, maybe I should've started from scratch. Interestingly enough, I did configure the 3.x kernel first, then imported the .config into the 2.x kernel, meaning that whatever was breaking the 3.x kernel doesn't exist in the 2.6.x kernel.

 

abrondz ... I see, but that assumption may not be true, you also have the actual code to consider. The drivers may be the same in .config terms but not necessarily in terms of the actual code. In any case, with a major revision it's always best to start with a clean config, or a seed.

 *abrondz wrote:*   

> I'm not using the nVidia framebuffer because it conflicts with the nvidia X11 driver

 

I suspected this was the reason. The problem right now is I'm not even sure what specific hardware/driver is at issue: it could be the graphics card itself, the framebuffer driver, ACPI, EFI .. being able to isolate the problem would really help at this point. Even the fact that disabling ACPI allows it to boot doesn't necessarily mean ACPI is the cause, it may simply be crashing due to some related factor (initialisation, or what-have-you). Perhaps VESA doesn't get the card's power status/capabilities correctly from ACPI, or the information about hardware passed from EFI to ACPI is incorrect. This is why I asked about the framebuffer/nvidia, because swapping one for another will allow us to narrow down the focus somewhat.

 *abrondz wrote:*   

> For the record, I did a few other tests of my own. I made a separate partition and did a quick Gentoo install from scratch, no old .config or anything, same problem. Both Arch Linux and Sabayon failed to boot, but interestingly, they crashed when attempting to launch a framebuffer. Ubuntu booted fine, but no virtual consoles were available, again suggesting a framebuffer issue. Finally, I tried using Ubuntu's .config in Gentoo, just for the heck of it, but it crashed with the same error as before (the one I have only been seeing in Gentoo).

 

Careful, these could all be unrelated, and what do you mean by a crash? The only one that seems to have any correlation is using Ubuntu's .config.

 *abrondz wrote:*   

> What I can't make any sense of is why Gentoo crashes on ACPI initialization while all the other distributions crash much later, when the kernel was already fully loaded.

 

Speculation: the others have, say, the cpufreq driver as a module. Which brings me to another difference between the 2.x and 3.x configs: the former has CONFIG_X86_PCC_CPUFREQ and CONFIG_X86_ACPI_CPUFREQ both =n, the latter both =y.

 *abrondz wrote:*   

> Sorry for the long, and probably very confusing, post. I'm quite confused myself..

 

No problem ...

best ... khay

----------

## abrondz

Ah, I'll know that for next time, then.

I tried switching over to the nvidia framebuffer, but no change, even disabling the framebuffer altogether does nothing. The only thing that has an effect on the situation is disabling ACPI in the kernel. I do understand that it doesn't mean that ACPI is to blame, but I'm long out of ideas on what else to do here.

What I mean is that the system froze up completely, no response. The screen goes blank when launching the framebuffer and that's it. This is not what's happening in Gentoo: although the screen goes blank, it's only because the system has frozen up before anything gets displayed in the framebuffer. Removing the vga= in grub lets me see the actual error.

I also tried getting rid of everything in the ACPI submenu, leaving only CONFIG_ACPI=y, but the problem was still there. I know that ACPI also enables certain other features, such as hyperthreading, maybe disabling these in turn could shed some light on the matter. Unfortunately, I don't know what all these features are.

----------

## khayyam

 *abrondz wrote:*   

> I tried switching over to the nvidia framebuffer, but no change, even disabling the framebuffer altogether does nothing. The only thing that has an effect on the situation is disabling ACPI in the kernel.

 

abrondz ... well, ACPI is also dealing with registering power management features of various hardware (processor, PCI, video card). Did you also try disabling cpufreq?

 *abrondz wrote:*   

> What I mean, the system froze completely up, no response. The screen goes blank when launching the framebuffer and that's it. This is not what's happening in Gentoo, although the screen goes blank, it's only because the system has frozen up before anything gets displayed in the framebuffer. Removing the vga= in grub lets me see the actual error.

 

OK, but the screen blanking (with the Arch and Sabayon install CDs) may simply be due to them having KMS.

 *abrondz wrote:*   

> I also tried getting rid of everything in the ACPI submenu, leaving only CONFIG_ACPI=y, but the problem was still there. I know that ACPI also enables certain other features, such as hyperthreading, maybe disabling these in turn could shed some light on the matter. Unfortunately, I don't know what all these features are.

 

It might be an idea to start from a clean kernel source (3.4.4 would probably be a good idea), enabling just those things required to boot and then adding options individually. Also, if you have another video card, perhaps try swapping out the nvidia card.
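Adding options one at a time gets slow when many options differ from the minimal config. One way to cut down the number of rebuilds is to halve the candidate list each round. The sketch below illustrates that idea only, under strong assumptions: exactly one option is responsible, and options can be toggled independently, which real Kconfig dependencies often prevent. `find_culprit` and the `boots` callback are hypothetical names; in practice `boots` would be a manual step (build a kernel with those options, reboot, report the result).

```python
def find_culprit(candidates, boots):
    """Binary-search for the single option that breaks the boot.

    candidates: option names added on top of a minimal, booting config.
    boots: callable taking a list of enabled options and returning True
           if a kernel built with them boots (hypothetical, likely manual).
    Returns the offending option, or None if everything boots together.
    """
    suspects = list(candidates)
    if boots(suspects):
        return None
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        if boots(half):
            # The first half is fine, so the culprit is in the remainder.
            suspects = suspects[len(half):]
        else:
            suspects = half
    return suspects[0]
```

With, say, 64 differing options this needs about seven test boots instead of up to 64, provided the assumptions above hold.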

best ... khay

----------

## abrondz

Upgrading to gentoo-sources-3.4.4 actually fixed the problem, the system booted just fine. Unfortunately, nvidia-drivers did not compile under that kernel. When I tried gentoo-sources-3.3.8-r1, both the system and nvidia-drivers worked. All in all, a simple upgrade fixed the problem, it seems, and I now have ACPI under 3.x.

I don't quite know what happened in 3.2.x, but at least I have a working system now.

Thanks for all the help and suggestions :)

----------

## khayyam

abrondz ...

ahh good. As far as the nvidia-drivers are concerned, you could try those in ~arch or 302.17 which is package masked.

Again ... you're welcome

best ... khay

----------

