# IRQ errors on new computer [worked around]

## albright

Ever since building a new computer with an ASUS P8P67 motherboard, I've been getting IRQ errors from PCI Ethernet cards (I've tried a couple of different ones). Usually the network stays up, but it seems to slow down after the error.

I've upgraded the BIOS to the latest version.

I tried an old 2.6.32 kernel: same problem as on 3.2.5.

I *am* booting with the irqpoll option, which seems to help but does not eliminate the problem.

Here's what the errors look like:

```
Feb 22 11:11:13 olorin kernel: irq 19: nobody cared (try booting with the "irqpoll" option)
Feb 22 11:11:13 olorin kernel: Pid: 12423, comm: hsgamma_FGRP1_0 Tainted: P           O 3.2.4-pf #2
Feb 22 11:11:13 olorin kernel: Call Trace:
Feb 22 11:11:13 olorin kernel: <IRQ>  [<ffffffff8106bd9f>] ? __report_bad_irq+0x2f/0xd0
Feb 22 11:11:13 olorin kernel: [<ffffffff8106c057>] ? note_interrupt+0x177/0x220
Feb 22 11:11:13 olorin kernel: [<ffffffff81069b7a>] ? handle_irq_event_percpu+0x7a/0x140
Feb 22 11:11:13 olorin kernel: [<ffffffff810376ce>] ? __do_softirq+0xae/0x120
Feb 22 11:11:13 olorin kernel: [<ffffffff81069c81>] ? handle_irq_event+0x41/0x70
Feb 22 11:11:13 olorin kernel: [<ffffffff8106c974>] ? handle_fasteoi_irq+0x54/0xd0
Feb 22 11:11:13 olorin kernel: [<ffffffff81003c45>] ? handle_irq+0x15/0x20
```

```
Feb 22 11:18:13 olorin kernel: irq 19: nobody cared (try booting with the "irqpoll" option)
Feb 22 11:18:13 olorin kernel: Pid: 3104, comm: kwin Tainted: P           O 3.2.4-pf #2
Feb 22 11:18:13 olorin kernel: Call Trace:
Feb 22 11:18:13 olorin kernel: <IRQ>  [<ffffffff8106bd9f>] ? __report_bad_irq+0x2f/0xd0
Feb 22 11:18:13 olorin kernel: [<ffffffff8106c057>] ? note_interrupt+0x177/0x220
Feb 22 11:18:13 olorin kernel: [<ffffffff81069b7a>] ? handle_irq_event_percpu+0x7a/0x140
Feb 22 11:18:13 olorin kernel: [<ffffffff810376ce>] ? __do_softirq+0xae/0x120
Feb 22 11:18:13 olorin kernel: [<ffffffff81069c81>] ? handle_irq_event+0x41/0x70
Feb 22 11:18:13 olorin kernel: [<ffffffff8106c974>] ? handle_fasteoi_irq+0x54/0xd0
Feb 22 11:18:13 olorin kernel: [<ffffffff81003c45>] ? handle_irq+0x15/0x20
Feb 22 11:18:13 olorin kernel: [<ffffffff81003b53>] ? do_IRQ+0x53/0xd0
Feb 22 11:18:13 olorin kernel: [<ffffffff8137c52b>] ? common_interrupt+0x6b/0x6b
Feb 22 11:18:13 olorin kernel: <EOI> 
Feb 22 11:18:13 olorin kernel: handlers:
Feb 22 11:18:13 olorin kernel: [<ffffffffa0053b70>] e1000_intr
Feb 22 11:18:13 olorin kernel: Disabling IRQ #19
```
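For anyone sifting through a long syslog for reports like these, here is a minimal sketch that pulls out the IRQ number and the registered handler names. It assumes the log layout shown above (an `irq N: nobody cared` line, later a `handlers:` line followed by one handler per line); the function name `summarize_bad_irqs` is made up for this illustration.

```python
import re

# Matches the kernel's "irq N: nobody cared" complaint.
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
# Matches a handler line such as "[<ffffffffa0053b70>] e1000_intr".
HANDLER = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\s*$")


def summarize_bad_irqs(log_text):
    """Return {irq_number: sorted handler names} for each report found."""
    result = {}
    current_irq = None
    in_handlers = False
    for line in log_text.splitlines():
        report = NOBODY_CARED.search(line)
        if report:
            current_irq = int(report.group(1))
            result.setdefault(current_irq, set())
            in_handlers = False
            continue
        if current_irq is None:
            continue  # nothing reported yet, ignore unrelated lines
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            handler = HANDLER.search(line)
            if handler:
                result[current_irq].add(handler.group(1))
            else:
                in_handlers = False  # e.g. the "Disabling IRQ" line
    return {irq: sorted(names) for irq, names in result.items()}
```

Call-trace lines are skipped on purpose: only the lines after `handlers:` name the drivers that were registered for the IRQ.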

----------

## NeddySeagoon

albright,

Please post your `lspci` output, your `dmesg` output right from boot if possible, and your `/proc/interrupts`.

irqpoll is a workaround, not a fix.

emerge wgetpaste, then use it to put your dmesg on a pastebin, as it won't fit into a post.

I may ask for your kernel .config too, so may as well put that on a pastebin just now too.
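One thing worth looking for in `/proc/interrupts` is whether the troublesome IRQ is shared between several devices. Below is a rough sketch of that check. It assumes the layout used by 3.x-era kernels (per-CPU counters, one controller token such as `IO-APIC-fasteoi`, then comma-separated device names); the sample rows are invented for illustration and are not taken from the pastebin in this thread.

```python
# Hypothetical sample, NOT the poster's actual /proc/interrupts.
SAMPLE = (
    "           CPU0       CPU1\n"
    "  0:         45          0   IO-APIC-edge      timer\n"
    " 19:       1000          5   IO-APIC-fasteoi   eth1, ehci_hcd:usb1\n"
    "NMI:          0          0   Non-maskable interrupts\n"
)


def shared_irqs(interrupts_text):
    """Map IRQ number -> device names, keeping only IRQs with 2+ devices."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # skip NMI:, LOC:, ERR: and similar summary rows
        fields = rest.split()
        # Skip the per-CPU counters and the single controller token.
        names = " ".join(fields[ncpu + 1:])
        devices = [n.strip() for n in names.split(",") if n.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared
```

On a real system the text would come from reading `/proc/interrupts`; newer kernels print the controller column differently, so the field offset may need adjusting.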

----------

## albright

thanks for the reply:

here's lspci:

```
00:00.0 Host bridge: Intel Corporation Device 0100 (rev 09)
00:01.0 PCI bridge: Intel Corporation Device 0101 (rev 09)
00:16.0 Communication controller: Intel Corporation Cougar Point HECI Controller #1 (rev 04)
00:1a.0 USB Controller: Intel Corporation Cougar Point USB Enhanced Host Controller #2 (rev 05)
00:1b.0 Audio device: Intel Corporation Cougar Point High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 1 (rev b5)
00:1c.1 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 2 (rev b5)
00:1c.2 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 3 (rev b5)
00:1c.3 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 5 (rev b5)
00:1c.5 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 6 (rev b5)
00:1c.6 PCI bridge: Intel Corporation 82801 PCI Bridge (rev b5)
00:1d.0 USB Controller: Intel Corporation Cougar Point USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation Device 1c46 (rev 05)
00:1f.2 SATA controller: Intel Corporation Cougar Point 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation Cougar Point SMBus Controller (rev 05)
01:00.0 VGA compatible controller: nVidia Corporation G94 [GeForce 9600 GT] (rev a1)
03:00.0 USB Controller: Device 1b21:1042
06:00.0 USB Controller: Device 1b21:1042
07:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
08:00.0 PCI bridge: Device 1b21:1080 (rev 01)
09:01.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
```

/proc/interrupts: http://pastebin.ca/2120877

dmesg: http://pastebin.ca/2120874

.config: http://pastebin.ca/2120879

----------

## NeddySeagoon

albright,

```
r8169 0000:07:00.0: eth0: unable to load firmware patch rtl_nic/rtl8168e-3.fw (-2)
```

That is a firmware error from the Realtek card, but the IRQ you are having problems with is IRQ 19, which is eth1, the Intel card.

That missing firmware is a bugfix for the r8169 driver and your hardware. It probably won't fix your overall problem, but it's worth a try in case it's some interaction.

Your kernel .config shows `CONFIG_R8169=m`, so the driver is a module.

Make a directory called /lib/firmware and copy the directory /usr/src/linux/firmware/rtl_nic to /lib/firmware. The firmware will be loaded every time the r8169 module loads.

You should notice that the firmware loading error is gone from dmesg.
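The `(-2)` at the end of the r8169 message is `-ENOENT`, i.e. the file was not found. As a quick sanity check before rebooting, a sketch along these lines lists the firmware files that dmesg complained about and whether each now exists under /lib/firmware. The regular expression is modelled only on the one message quoted above, and both helper names are hypothetical.

```python
import os
import re

# Modelled on: "... unable to load firmware patch rtl_nic/rtl8168e-3.fw (-2)"
FIRMWARE_ERROR = re.compile(r"unable to load firmware patch (\S+)")


def requested_firmware(dmesg_text):
    """Return the firmware paths named in 'unable to load' messages."""
    return FIRMWARE_ERROR.findall(dmesg_text)


def firmware_present(relative_path, firmware_root="/lib/firmware"):
    """Return True if the firmware file exists under firmware_root."""
    return os.path.isfile(os.path.join(firmware_root, relative_path))
```

For the message in this thread, `requested_firmware` yields `rtl_nic/rtl8168e-3.fw`, which can then be passed to `firmware_present`.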

If that has no impact, try a testing gentoo-sources kernel.

The `lspci -n` output for the line starting 09:01.0 will be useful too. That gives the PCI vendor and device IDs of your Intel network card.

With that information, we may be able to find some bugs/fixes.
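As an illustration of what to look for, here is a small sketch that extracts the vendor and device IDs for one slot from `lspci -n` text. The sample line is reconstructed from the bridge entry at 08:00.0 in the `lspci` listing earlier in the thread (`1b21:1080`), assuming the usual `slot class: vendor:device` layout and the standard class code 0604 for a PCI-to-PCI bridge. The IDs of the Intel card at 09:01.0 were not posted, so they are deliberately not filled in.

```python
import re

# Usual "lspci -n" layout: "<slot> <class>: <vendor>:<device> (rev xx)"
LSPCI_N = re.compile(
    r"^(?P<slot>\S+)\s+(?P<cls>[0-9a-f]{4}):\s+"
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
)

# Reconstructed from the thread's lspci listing; class code is an assumption.
SAMPLE_LINE = "08:00.0 0604: 1b21:1080 (rev 01)"


def pci_ids(lspci_n_text, slot):
    """Return (vendor, device) hex strings for the slot, or None."""
    for line in lspci_n_text.splitlines():
        match = LSPCI_N.match(line.strip())
        if match and match.group("slot") == slot:
            return match.group("vendor"), match.group("device")
    return None
```

The resulting pair is what bug reports and kernel driver ID tables are usually keyed on.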

----------

## albright

Not starting out too well

/usr/src/linux/firmware/rtl_nic does not exist

----------

## Gusar

Install the linux-firmware package. Yeah it installs a lot of stuff, but among them is the rtl_nic firmware.

----------

## albright

 *Quote:*   

> linux-firmware package

 

OK - sorry about that; rebooted without the rtl firmware error message ...

----------

## albright

I suppose to no one's surprise the problem persists,

exactly the same:

```
Feb 22 23:04:36 olorin kernel: irq 19: nobody cared (try booting with the "irqpoll" option)
Feb 22 23:04:36 olorin kernel: Pid: 5025, comm: primegrid_gcwsi Tainted: P           O 3.2.4-pf #3
Feb 22 23:04:36 olorin kernel: Call Trace:
Feb 22 23:04:36 olorin kernel: <IRQ>  [<ffffffff8106bd9f>] ? __report_bad_irq+0x2f/0xd0
Feb 22 23:04:36 olorin kernel: [<ffffffff8106c057>] ? note_interrupt+0x177/0x220
Feb 22 23:04:36 olorin kernel: [<ffffffff81069b7a>] ? handle_irq_event_percpu+0x7a/0x140
Feb 22 23:04:36 olorin kernel: [<ffffffff810376ce>] ? __do_softirq+0xae/0x120
Feb 22 23:04:36 olorin kernel: [<ffffffff81069c81>] ? handle_irq_event+0x41/0x70
Feb 22 23:04:36 olorin kernel: [<ffffffff8106c974>] ? handle_fasteoi_irq+0x54/0xd0
Feb 22 23:04:36 olorin kernel: [<ffffffff81003c45>] ? handle_irq+0x15/0x20
Feb 22 23:04:36 olorin kernel: [<ffffffff81003b53>] ? do_IRQ+0x53/0xd0
Feb 22 23:04:36 olorin kernel: [<ffffffff8137c52b>] ? common_interrupt+0x6b/0x6b
Feb 22 23:04:36 olorin kernel: <EOI> 
Feb 22 23:04:36 olorin kernel: handlers:
Feb 22 23:04:36 olorin kernel: [<ffffffffa0045b70>] e1000_intr
Feb 22 23:04:36 olorin kernel: Disabling IRQ #19
```

I found a discussion of what seems likely to also be my problem in the LKML archive:

 *Quote:*   

> Interrupt handling for *PCI boards with ASUS Sandybridge motherboards* 
> 
> seems to be broken. 
> 
> It has been seen with network and non-network PCI boards. PCIx network 
> ...

 

http://groups.google.com/group/kernelarchive/browse_thread/thread/b24259e62f270163/21e66c4f7a50717e?show_docid=21e66c4f7a50717e

I'll try the ~amd64 masked acpid ...

Edit: there's a fair bit of activity about this problem on LKML and the consensus seems to be that the ASM1083 PCIe-PCI bridge chipset is crap hardware. Too bad for me. I think I'll just try a usb ethernet card   :Confused: 

Edit 2: sorry to keep adding to this post, but I was unaware of how extensive the discussion of this problem was (Linus himself has weighed in). It appears that it may be a deeper problem than just with the chipset mentioned above. But a possible solution is a pci-x ethernet card ...

----------

## swanson

Hmm, that is interesting. I have an Asus E45M1-M Pro motherboard (i.e. not Sandybridge) with an ASM1083 PCIE to PCI chip and have been getting IRQ nobody cared which kills ath9k on PCIE and the onboard firewire using that IRQ. The PCI slots only have a Pulsar ADSL card installed but no driver loaded.

Wonder if it's ASUS's BIOS/ACPI which is badly programmed. They aren't known for greatness.

EDIT: A wee thread search on kernel list confirms that the ASM1083 is crap. [PATCH] Unhandled IRQs on AMD E-450: temporarily switch to low-performance polling IRQ mode

----------

## albright

 *Quote:*   

> A wee thread search on kernel list confirms that the ASM1083 is crap

 

that is a sad post to have to read - and the patch does not seem all that successful

I've put in a pci-x ethernet card and everything is working perfectly.

Curiously, my cpu now runs (under identical load, from boinc) a couple of degrees cooler.

Does that make any sense to anyone?

I'm no longer forced to use irqpoll; could that affect the processor load?

----------

## NeddySeagoon

albright,

irqpoll does what its name implies.

It polls the potential source of IRQs to see if it has anything to do. The polling rate is driver dependent.

Most of the time, when the kernel asks the device if it has anything for it to do, it gets told no thanks. This polling is not free. It needs CPU time and CPU cycles.

If the core/CPU would otherwise have been in a low power idle state, it gets woken up. Once woken up for the polling, the CPU does not go back to its low power state immediately. Hence the small temperature rise you observe.

Worse, if the CPU/core was busy, the polling would interrupt whatever was going on so the poll could run. The context switch (twice) is not free either. It takes time and CPU cycles to do the context switch, which does nothing for the progress of your applications. Hence irqpoll is a workaround, not a fix.
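To put a rough number on "not free", here is a back-of-the-envelope sketch of the CPU time spent on polls that find nothing to do. Every input is a made-up illustrative value, not a measurement from this machine, and the model deliberately ignores the extra cost of waking a core from a low power idle state and of the context switches described above.

```python
def wasted_poll_fraction(polls_per_second, cost_us_per_poll, useful_fraction):
    """Fraction of one CPU's time spent on polls that found no work.

    polls_per_second: how often the device is polled (hypothetical value)
    cost_us_per_poll: CPU time per poll in microseconds (hypothetical value)
    useful_fraction:  share of polls that actually found work, 0.0 to 1.0
    """
    wasted_polls = polls_per_second * (1.0 - useful_fraction)
    return wasted_polls * cost_us_per_poll / 1_000_000.0
```

For example, 1000 polls per second at 5 microseconds each with no useful polls comes to 0.5% of one core, which is small on its own; the wakeups from idle are likely the larger contributor to the temperature difference.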

----------

