# [SOLVED] CPU 0: Machine Check Exception:   4 BANK 4:

## mounty1

The error in the title occurs shortly after /usr has been mounted during the first boot after a fresh installation.  The installation itself proceeded exactly according to the instructions in the handbook.  One can use the mcelog utility to diagnose the error:

> CPU 0: Machine Check Exception:             4 BANK 4: b200000000070f0f
> 
> TSC 4ed8aec0aa
> 
> HARDWARE ERROR. This is *NOT* a software problem!
> ...

 I typed the first two lines, and mcelog output the rest.  It happens with both the AMD64 and the i686 installations.  The machine has an ASRock motherboard with an AMD64 processor and 1 GiB of RAM.

The problem does not occur when booting the (minimal) installation CD, and it does not occur with Kubuntu, which the machine is now running, apparently successfully.  It also used to run Mandriva 2006 64-bit, again without problems.

I could just keep Kubuntu, but I'd rather stick with what I know.  Could anyone suggest or guess at how to get Gentoo running on the machine ?  Maybe build a 32 bit system ?

I know this error is mentioned elsewhere in the forums (b200000000070f0f occurs six times (seven now)) but I couldn't see anything applicable in my case and it's not exactly an FAQ (yet).

----------

## PMcCauley

One thing you could do is compare the kernel options used for the live CD or one of the other Linux installs with the ones you are using.  From the live CD the kernel config is stored in /proc/config.gz.  If this doesn't solve it for you, try posting the kernel config(s) here (remove comments https://forums.gentoo.org/viewtopic-t-160179.html).   It seems odd to me that it is occurring after mounting /usr; perhaps there is something wrong with the disk.  Maybe check dmesg for anything related, and run badblocks and smartctl on the drive.
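A minimal sketch of that comparison, assuming both configs are available as plain-text files. The helper names `normalise_config` and `compare_configs` and the file paths in the usage comment are made up for illustration. Options that are "not set" are dropped, so an option enabled in only one config shows up as belonging to that config only.

```shell
#!/bin/sh
# Keep only the "CONFIG_X=value" lines of a kernel config and sort them,
# so that two configs can be compared option by option.
normalise_config() {
    grep -E '^CONFIG_[A-Za-z0-9_]+=' "$1" | LC_ALL=C sort
}

# compare_configs WORKING FAILING (hypothetical helper):
# print the settings that appear in only one of the two configs.
compare_configs() {
    a=$(mktemp); b=$(mktemp)
    normalise_config "$1" > "$a"
    normalise_config "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b" | sed 's/^/working only: /'
    LC_ALL=C comm -13 "$a" "$b" | sed 's/^/failing only: /'
    rm -f "$a" "$b"
}

# Possible use from the live CD (paths are assumptions):
#   zcat /proc/config.gz > /tmp/config-OK
#   compare_configs /tmp/config-OK /usr/src/linux/.config
```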

Patrick

----------

## krinn

well, i'm not sure it's what you want, but the kernel parameters for mce are listed below

the fact that other kernels don't report the error could just mean the options are disabled in those kernels

```
cat /usr/src/linux/.config | grep X86_MCE
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
```

```
Check for non-fatal errors on AMD Athlon/Duron / Intel Pentium 4 X86_MCE_NONFATAL

Enabling this feature starts a timer that triggers every 5 seconds which
will look at the machine check registers to see if anything happened.
Non-fatal problems automatically get corrected (but still logged).
Disable this if you don't want to see these messages.
Seeing the messages this option prints out may be indicative of dying hardware,
or out-of-spec (ie, overclocked) hardware.
This option only does something on certain CPUs.
(AMD Athlon/Duron and Intel Pentium 4)
```
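To check the same options on a running kernel rather than in the source tree, something like the following might do. It is only a sketch: `mce_options` is a hypothetical helper name, and the example paths assume the kernel exposes /proc/config.gz.

```shell
#!/bin/sh
# mce_options FILE (hypothetical helper): list the machine-check options
# in a kernel config, whether FILE is gzip-compressed or plain text.
mce_options() {
    # gzip -cdf decompresses gzip input and passes plain text through unchanged.
    gzip -cdf "$1" | grep 'X86_MCE'
}

# Possible use (paths are assumptions):
#   mce_options /proc/config.gz
#   mce_options /usr/src/linux/.config
```

Lines of the form `# CONFIG_X86_MCE... is not set` are printed as well, which makes it easy to see when another kernel simply has the checks disabled.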

----------

## mounty1

 *PMcCauley wrote:*   

> One thing you could do is compare the kernel options used for the live CD or one of the other Linux installs with the ones you are using.  From the live CD the kernel config is stored in /proc/config.gz.  If this doesn't solve it for you, try posting the kernel config(s) here (remove comments https://forums.gentoo.org/viewtopic-t-160179.html).   It seems odd to me that it is occurring after mounting /usr; perhaps there is something wrong with the disk.  Maybe check dmesg for anything related, and run badblocks and smartctl on the drive.

 Right, I did as you suggested, i.e., built the kernel with the stock config, and now it boots.  Curiously, it now doesn't recognise either network interface.  That is, ifconfig -a just shows lo0, although the installation CD sees them both (incidentally, the NS DP83820 driver seems to be broken).

The config files are quite big and I don't really know where to start.  Any chance that anyone could spot something obvious ?  Maybe all that ACPI stuff ?  The files are (with line counts):

- http://www.landcroft.co.uk/Gentoo-config/config-OK (860) The working config set, taken from gunzip -c </proc/config.gz on the running system.
- http://www.landcroft.co.uk/Gentoo-config/config-fail (401) My ideal config, which causes the MCE.
- http://www.landcroft.co.uk/Gentoo-config/config-diffs (944) The differences, output from diff config-OK config-fail.

It seems to me that any of those funny CPU tweaks, switches and timing things could be causing the problem.

----------

## mounty1

For anyone searching:  on the ASRock K8 Combo-Z motherboard, you need to enable CONFIG_BLK_DEV_ALI15X3, since the IDE controller belongs to the ALi 1563 chipset.

Presumably, in general, one must enable the block device driver for one's IDE chipset.

How did I find out ?  I took the advice given above:

1. boot from the installation CD and gunzip -c /proc/config.gz to obtain a working configuration.
2. perform a chroot within the installation environment and rebuild the kernel.
3. boot this kernel to make sure it works.
4. incrementally, diff the working config with my desired config and change the working config a few items at a time, rebuild the kernel and test it.
5. repeat until my desired config works.

Tedious, but it did the job.
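The "a few items at a time" step can be partly automated. The following is a rough sketch, not the procedure actually used in this thread: `apply_some_changes` is a hypothetical helper that copies at most N differing settings from the desired config into the working one. It only touches options that already appear in the working config (set or "not set"), so running `make oldconfig` on the result is still advisable before rebuilding.

```shell
#!/bin/sh
# apply_some_changes WORKING DESIRED N (hypothetical helper):
# print WORKING with at most N differing options switched to the
# setting they have in DESIRED; all other lines are left untouched.
apply_some_changes() {
    awk -v limit="$3" '
        # Option name from "CONFIG_X=..." or "# CONFIG_X is not set", else "".
        function name(line,    m) {
            m = line
            if (line ~ /^CONFIG_[A-Za-z0-9_]+=/) { sub(/=.*/, "", m); return m }
            if (line ~ /^# CONFIG_[A-Za-z0-9_]+ is not set$/) {
                sub(/^# /, "", m); sub(/ is not set$/, "", m); return m
            }
            return ""
        }
        # First file (DESIRED): remember the wanted line for each option.
        NR == FNR { n = name($0); if (n != "") desired[n] = $0; next }
        # Second file (WORKING): swap in wanted lines until the limit is hit.
        {
            n = name($0)
            if (n != "" && (n in desired) && desired[n] != $0 && changed < limit) {
                print desired[n]; changed++
            } else {
                print
            }
        }
    ' "$2" "$1"
}

# Possible use (file names are assumptions):
#   apply_some_changes config-OK config-fail 5 > .config
```

If the new kernel still boots, the result becomes the new working config and the step is repeated; if it fails, the culprit is among the handful of options just changed.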

----------

## obrut<-

do you still have mce enabled?

----------

## mounty1

 *obrut<- wrote:*   

> do you still have mce enabled?

 As far as I remember, it was, but I can't check now as the machine was reduced to a lump of inert scrap owing to the inability of the South Australian electricity company to regulate its supply.

----------

## obrut<-

ah, i see. so it's in e-heaven.

----------

## kaminari

I had the same situation. I solved it by changing the following in the kernel configuration:

```
Processor type and features --->

  "Preemption Model" to "Voluntary Kernel Preemption (Desktop)" 

and

  "Timer frequency" to "300 HZ"
```
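To confirm that these two menu choices ended up in the config, one could grep for the corresponding symbols. As far as I know they are `CONFIG_PREEMPT_VOLUNTARY`, `CONFIG_HZ_300` and `CONFIG_HZ`, but that mapping is my assumption and is not stated in the post; `show_preempt_hz` is a hypothetical helper name.

```shell
#!/bin/sh
# show_preempt_hz FILE (hypothetical helper): print the enabled
# preemption-model and timer-frequency settings from a kernel config.
show_preempt_hz() {
    grep -E '^CONFIG_(PREEMPT[A-Z_]*|HZ(_[0-9]+)?)=' "$1"
}

# Possible use (path is an assumption):
#   show_preempt_hz /usr/src/linux/.config
```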

----------

## mounty1

I don't see how changing a kernel parameter will stop the machine from being cooked by a mains spike.

 :P  <--- for those at the back not paying full attention.

----------

