# [SOLVED] Vague NMI errors.

## w0rd88

Hello,

I'm getting the following error on a P4SCT motherboard.

model name	: Intel(R) Pentium(R) 4 CPU 3.00GHz

kernel: 3.3.0-gentoo

---

Uhhuh. NMI received for unknown reason 3d on CPU 0.

Do you have a strange power saving mode enabled?

Dazed and confused, but trying to continue

---
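For reference, the `3d` in that message is a hexadecimal status byte. On x86, as far as I understand the kernel's NMI handler, it is read from I/O port 0x61, where bit 7 flags a PCI SERR / memory parity error and bit 6 flags an I/O channel check. The sketch below decodes the byte under that assumption; the function and constant names are illustrative, not the kernel's API.

```python
# Assumed bit layout of the NMI status byte (I/O port 0x61 on PC-compatible hardware).
NMI_REASON_SERR = 0x80   # bit 7: PCI SERR / memory parity error
NMI_REASON_IOCHK = 0x40  # bit 6: I/O channel check error


def decode_nmi_reason(reason):
    """Report which of the two known NMI sources are flagged in the reason byte."""
    serr = bool(reason & NMI_REASON_SERR)
    iochk = bool(reason & NMI_REASON_IOCHK)
    return {
        "serr": serr,
        "iochk": iochk,
        # Neither bit set: the source is something the status byte cannot name.
        "unknown": not (serr or iochk),
    }


if __name__ == "__main__":
    print(decode_nmi_reason(0x3D))
```

For `0x3d` neither bit 7 nor bit 6 is set, which is consistent with the kernel calling the reason "unknown": the NMI came from some source other than the two this byte can report.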

I've read almost everything I can find on this and tried several of the fixes I've seen on Google and the Gentoo forums, but the error persists.

Appending any or all of the following to the kernel line doesn't do anything.  These are the most common fixes I see.  I've also turned off ACPI in the BIOS and disabled basically every power-saving option in it, to no avail.

nmi_watchdog=0 nohpet acpi=off

After researching, I've found people fixing this with video driver updates, sound driver updates, hotplug stuff, kernel options, etc.  This error message doesn't really seem to be all that helpful in diagnosing where it's coming from.  

It happens when I put load on the system.  Anyone know how to debug this or identify what is actually causing this message?  Perhaps I'm just missing something to debug this particular issue.  
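One way to narrow this down is to watch the per-CPU NMI counters in `/proc/interrupts` before and after applying load, to see how often the interrupt fires and whether it tracks the load. This is a minimal sketch; the helper names are made up for illustration, and it assumes the usual `NMI:` row is present in that file.

```python
def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counts found in the text of /proc/interrupts."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            # Counts follow the label; the trailing description is non-numeric.
            return [int(f) for f in fields[1:] if f.isdigit()]
    return []


def read_nmi_counts(path="/proc/interrupts"):
    """Read the live counters from the running kernel."""
    with open(path) as f:
        return nmi_counts(f.read())


if __name__ == "__main__":
    print(read_nmi_counts())
```

Running it once at idle and again during a compile, then comparing the two lists, shows whether the count climbs steadily (as a periodic source such as a timer would cause) or only in bursts under load.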

Thanks,

-d

Last edited by w0rd88 on Sun Apr 01, 2012 9:13 pm; edited 1 time in total

----------

## Hu

Although some broken systems may require you to disable some or all of ACPI, it is generally a bad idea to disable it if you can make it work.  Is this a regression in the 3.3 kernel or has your system always had this problem?  How much load is required?  What kind of load are you using (CPU, memory, I/O, ...)?

----------

## w0rd88

Thank you for the help.  I finally figured it out.  This is a new install of Gentoo, but was a stable system elsewhere that was upgraded to a newer system, so I know the HW works.  I upgraded the memory though, so I did run a few iterations of memtest before posting; I forgot to mention that.  :p  It ran stably for the last 10 days or so, and I did a bunch of compiling.  That silly message was the only error in all that time, and it happened a bunch.

On this motherboard, there is actually a jumper for the watchdog timer.  Here is the description from the manual:

---

"Watch Dog Enable/Disable JP8 enables the Watch Dog function. Watch Dog is a system monitor that can reboot the system when a software application is "hung up". Pins 1-2 will cause WD to reset the system if an application is "hung up". Pins 2-3 will generate a non-maskable interrupt signal for the application that is "hung up". See the table on the right for jumper settings. Watch Dog can also be enabled via BIOS. (*Note, when enabled, the user needs to write his own application software in order to disable the Watch Dog Timer.)"

---
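The "write his own application software" note refers to a keep-alive program that services the timer so it never expires. Whether this board's watchdog is exposed through a Linux driver at all is an assumption here, but where a driver does provide `/dev/watchdog`, the standard Linux watchdog interface treats any write as a keep-alive and a written `V` as the "magic close" that stops the timer on close (if the driver supports it and `nowayout` is not set). A minimal sketch, with hypothetical function names and arbitrary interval values:

```python
import time


def pet(dev):
    """Send one keep-alive: any write to the watchdog device resets its timer."""
    dev.write(b"\0")
    dev.flush()


def disarm(dev):
    """Write the 'magic close' character so the driver stops the timer on close."""
    dev.write(b"V")
    dev.flush()


def keep_alive(path="/dev/watchdog", interval=10.0, beats=6):
    """Service the watchdog `beats` times, then disarm it and exit."""
    # Opening the device arms the timer; buffering=0 sends each write immediately.
    with open(path, "wb", buffering=0) as dev:
        for _ in range(beats):
            pet(dev)
            time.sleep(interval)
        disarm(dev)
```

Calling `keep_alive()` needs root and a loaded watchdog driver, and the interval has to be shorter than the hardware timeout. If the process dies without disarming, the timer expires and the board does whatever the jumper selects: reset on pins 1-2, NMI on pins 2-3.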

It was enabled (pins 1-2).   I pulled it off for now.  I'll try 2-3 instead to see if it still complains.   Nice of them to put a 'disable watchdog' option in the BIOS that doesn't really disable watchdog.  :/  Why would one reset the system for a hung app anyways?!  

I'm running with ACPI enabled again, and all my modifications made while troubleshooting rolled back, and it's been happy for a few hours now.  I've compiled a few things to put a little load on it as well. 

-d

----------

## Hu

 *w0rd88 wrote:*   

> Nice of them to put a 'disable watchdog' option in the BIOS that doesn't really disable watchdog.  :/  Why would one reset the system for a hung app anyways?!

 Certain setups that require minimum downtime will use a watchdog to reset the system in the hope that the system does not proceed to hang again after a reset.  Particularly when dealing with closed systems, it is unlikely that the administrator will do anything more than reset the machine anyway, so providing an automated reset is considered a feature.

----------

