# What would cause kernel taints

## albright

An emerge failed this morning with the log messages below.

Is this failing hardware? Memory? Motherboard? 

EDIT: The system would not reboot except via SysRq-b.

It's a bit worrying   :Sad: 

 *Quote:*   

> Sep  7 08:40:31 olorin kernel: CPU: 1 PID: 18927 Comm: mv Tainted: P      D W  O  3.16.1-gentoo #1
> 
> Sep  7 08:40:31 olorin kernel: Hardware name: System manufacturer System Product Name/P8P67 REV 3.1, BIOS 3602 11/01/2012
> 
> Sep  7 08:40:31 olorin kernel: task: ffff88009f912be0 ti: ffff880109980000 task.ti: ffff880109980000
> ...

 

----------

## eccerr0r

The most common cause of tainted dumps is P: proprietary kernel modules were loaded (Nvidia, ati-drivers, Compaq RAID array, etc.)

However, you have other, more serious problems. D means the kernel oopsed or hit a BUG earlier and tried to carry on, so there was another oops before this one. W means there was an earlier warning. O means an out-of-tree module was built and loaded into the kernel. Kernel debuggers don't like the P and O flags because they don't know what code they may be dealing with, and the D and W flags could mean secondary corruption.

You'll need to find the first oops and debug that first. Debugging secondary corruption tends to be fruitless, as it may have been caused by the first problem.

Judging by this oops, you badly need a reboot; your kernel is in very bad shape right now.
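As a rough illustration of the flag meanings above, here is a minimal Python sketch that pulls the taint letters out of a dump header line like the one quoted in the first post. The function names and the regular expression are made up for this example, and the table only covers the letters mentioned in this thread.

```python
import re

# Meanings paraphrased from this thread; letters not discussed here are omitted.
TAINT_LETTERS = {
    "P": "proprietary module loaded",
    "D": "kernel oopsed or hit a BUG earlier",
    "W": "kernel issued a warning earlier",
    "O": "out-of-tree module loaded",
    "S": "SMP kernel on hardware not certified for SMP",
}

# The flags field is a run of capital letters and spaces after "Tainted: ";
# it ends where the kernel version (which starts with a digit) begins.
TAINTED_RE = re.compile(r"Tainted: ([A-Z ]+)")


def taint_letters(log_line):
    """Return the taint letters found in one dump header line, in order."""
    match = TAINTED_RE.search(log_line)
    if match is None:
        return []
    return [char for char in match.group(1) if char != " "]


def describe(log_line):
    """Pair each taint letter with its meaning from the table above."""
    return [
        (letter, TAINT_LETTERS.get(letter, "not covered in this sketch"))
        for letter in taint_letters(log_line)
    ]
```

Feeding it the `Comm: mv` line from the first post gives the letters P, D, W and O, matching the reading above.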

----------

## N8Fear

Tainted, in conjunction with the kernel, means that non-GPL modules are loaded, e.g. nvidia, zfs, or VirtualBox drivers.

This likely has nothing to do with the call trace you get.

Can you reproduce this issue (by e.g. loading a certain module or by running a certain program)?

----------

## albright

thanks for the replies

If I look through /var/log/messages-2014* I see these from yesterday:

 *Quote:*   

> messages-20140907:Sep  6 16:56:23 olorin kernel: CPU: 1 PID: 1009 Comm: khubd Tainted: P        W  O  3.16.1-gentoo #1
> 
> messages-20140907:Sep  6 16:56:54 olorin kernel: CPU: 1 PID: 4738 Comm: upowerd Tainted: P      D W  O  3.16.1-gentoo #1
> 
> messages-20140907:Sep  6 17:01:02 olorin kernel: CPU: 1 PID: 16877 Comm: offlineimap Tainted: P      D W  O  3.16.1-gentoo #1
> ...

 

It looks like the khubd error was the first ...
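To check which dump first shows each flag, something like the following Python sketch could be run over the matching log lines. It assumes the lines are already in chronological order; `first_seen` is a name invented for this example.

```python
import re

# Same idea as reading the header by eye: capital letters and spaces
# following "Tainted: " are the flags field.
TAINTED_RE = re.compile(r"Tainted: ([A-Z ]+)")


def first_seen(lines):
    """Map each taint letter to the first line (in input order) showing it.

    Assumes `lines` is in chronological order, as a single syslog file is.
    """
    seen = {}
    for line in lines:
        match = TAINTED_RE.search(line)
        if match is None:
            continue
        for letter in match.group(1).replace(" ", ""):
            # setdefault keeps the earliest line and ignores later repeats.
            seen.setdefault(letter, line)
    return seen
```

On the three lines quoted above, W first shows up on the khubd line and D first shows up on the upowerd line. The header reflects the taint state when the dump was printed, so whatever set a flag happened at or before the first line that shows it.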

This machine has been suffering random lockups for the last two months, but they seemed to be the result of a failing hard drive which was replaced recently. Maybe that is not the only or most basic problem with this machine ...

----------

## Hu

N8Fear is inaccurate.  Refer to eccerr0r's post instead, since there are ways to get a tainted kernel even without the ability to load modules.

OP: what kernel modules do you load on this system?  We should start with accounting for why you have P+O, then deal with the warnings if those are independent of using the out-of-tree modules.
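On the point that taint does not require loadable modules: the kernel also exposes the current taint state as a bitmask in `/proc/sys/kernel/tainted`, and several bits have nothing to do with modules. The Python sketch below decodes that number. The bit positions follow the kernel's tainted-kernels documentation as best I recall it, so treat the table as an assumption to verify against your kernel version; bits above 12 are left out, and the function names are invented for this example.

```python
# (bit, letter, meaning) for bits 0 through 12; higher bits are omitted here.
TAINT_BITS = [
    (0, "P", "proprietary module loaded"),
    (1, "F", "module was force loaded"),
    (2, "S", "SMP kernel on hardware not certified for SMP (out of spec)"),
    (3, "R", "module was force unloaded"),
    (4, "M", "processor reported a machine check exception"),
    (5, "B", "bad page referenced or unexpected page flags"),
    (6, "U", "taint requested by userspace"),
    (7, "D", "kernel died recently (oops or BUG)"),
    (8, "A", "ACPI table overridden by user"),
    (9, "W", "kernel issued a warning"),
    (10, "C", "staging driver loaded"),
    (11, "I", "workaround for a platform firmware bug applied"),
    (12, "O", "out-of-tree module loaded"),
]


def decode_taint(value):
    """Return (letter, meaning) for every known bit set in `value`."""
    return [
        (letter, meaning)
        for bit, letter, meaning in TAINT_BITS
        if value & (1 << bit)
    ]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the running kernel's taint bitmask (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())
```

Under that table, the P, D, W and O flags together correspond to 1 + 128 + 512 + 4096 = 4737, and flags such as M, B, D and W can be set on a kernel with no module support at all.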

----------

## albright

I think the only proprietary module is nvidia

Here's the full list:

 *Quote:*   

> xt_REDIRECT             1686  1 
> 
> xt_statistic            1167  0 
> 
> xt_CT                   3250  2 
> ...

 

----------

## Hu

If you blacklist the nVidia module and reboot, can you reproduce any of the failures?

----------

## eccerr0r

I highly doubt nvidia is causing the problem, but as stated above, taking this variable out of the equation makes things much easier, so removing it is a good test. The reason: if the nvidia driver contained a function call like "wipe_out_random_memory_location(x)", we could not see it because the source is closed, and that would be the real problem rather than whatever the oops indicates.

As stated earlier a WARNING could cause taint.  Do you see WARNING (in all caps) show up in your kernel logfiles?
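A quick way to answer that is to scan the log for the all-caps marker. The Python sketch below is case-sensitive on purpose so that ordinary lowercase "warning" text from other programs is skipped; `find_warnings` is a made-up name, and the log path in the usage note is only the one mentioned earlier in this thread.

```python
def find_warnings(lines):
    """Return (line_number, text) for each line containing all-caps WARNING."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Plain substring test: case-sensitive, so "warning" does not match.
        if "WARNING" in line:
            hits.append((number, line.rstrip("\n")))
    return hits
```

It can be used as `find_warnings(open("/var/log/messages"))`, or on any of the rotated files. The earliest hit is the one worth reading first.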

----------

## albright

No errors have occurred in the last 30 hours or so.

I have a suspicion that the last error was caused when I plugged in a bad USB drive (hence the khubd error). It was the same drive that started the problem in the first place, which I had put in a USB case to see if I could recover anything. The drive was unreadable ...

Since then the system has been running perfectly.

If the problem recurs, I'll try with the nvidia module blacklisted.

----------

## eccerr0r

Something really bad must have happened to get the "D" taint. Perhaps that first W led to the oops; I don't know.

Kind of funny: these taints exist just to help the LKML and debuggers decide whether to start looking at a problem. It looks like many of the flags were added after the one for proprietary modules. Though you don't have it, I'm curious about the "S" taint, where the kernel detects SMP-incompatible CPUs installed. I used to run a dual Celeron machine that should have qualified for the "S" taint.

In recent times I've never had hard drives cause system freeze-ups. For me they cause slowdowns from retries, and I/O errors when they get taken offline. You may have to look into other hardware issues; most of the system freezes I've had were due to bad motherboard devices.

----------

## albright

 *Quote:*   

> You may have to look into other hardware issues, most of the system freezes I've had were due to bad motherboard devices.

 

Yes, I fear you are right; I've changed a SATA cable and moved to a different port in the hopes of placating the hardware gods  :Wink: 

----------

