# irq 23: nobody cared

## alex.blackbit

```
[40431.205618] irq 23: nobody cared (try booting with the "irqpoll" option)

[40431.205624] Pid: 31225, comm: cc1plus Tainted: G        W  2.6.30-gentoo-r2-blackbit #1

[40431.205627] Call Trace:

[40431.205629]  <IRQ>  [<ffffffff8025dad5>] ? __report_bad_irq+0x30/0x7d

[40431.205640]  [<ffffffff8025dc27>] ? note_interrupt+0x105/0x16e

[40431.205645]  [<ffffffff8023abe3>] ? __do_softirq+0x12b/0x168

[40431.205648]  [<ffffffff8025e20c>] ? handle_fasteoi_irq+0x8e/0xaf

[40431.205652]  [<ffffffff8020d07f>] ? handle_irq+0x17/0x1d

[40431.205654]  [<ffffffff8020c8bc>] ? do_IRQ+0x57/0xbf

[40431.205658]  [<ffffffff8020b7d3>] ? ret_from_intr+0x0/0xa

[40431.205660]  <EOI> <3>handlers:

[40431.205662] [<ffffffff805a395e>] (ahci_interrupt+0x0/0x41a)

[40431.205667] [<ffffffff805eaea0>] (usb_hcd_irq+0x0/0x64)

[40431.205672] Disabling IRQ #23
```

anybody seen this before?

i wonder if this is a kernel or hardware problem.

hopefully _not_ the hardware.

the system becomes really slow after this.

----------

## DaggyStyle

 *alex.blackbit wrote:*   

> 
> 
> ```
> [40431.205618] irq 23: nobody cared (try booting with the "irqpoll" option)
> ...
> ```

 

yes, it is a kernel<->hardware communication issue. like it says, just add irqpoll to the end of your kernel boot command in grub.conf
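a rough sketch of that edit, assuming legacy GRUB where each boot entry has a line starting with `kernel`. the function name `add_kernel_option` is made up for illustration; it only prints a modified copy and never touches the real file:

```sh
#!/bin/sh
# Append a boot option to every "kernel" line of a legacy-GRUB grub.conf
# read from stdin, unless that line already carries the option.
# The result goes to stdout; the input file is left untouched.
add_kernel_option() {
    awk -v opt="$1" '
        $1 == "kernel" {
            present = 0
            for (i = 2; i <= NF; i++)
                if ($i == opt) present = 1
            if (!present) $0 = $0 " " opt
        }
        { print }
    '
}
```

something like `add_kernel_option irqpoll < /boot/grub/grub.conf > /tmp/grub.conf.new` lets you diff the result before copying it over; after the reboot, `cat /proc/cmdline` shows whether the option was really passed.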

----------

## alex.blackbit

i already did.

system is currently running with "irqpoll".

and i run the same compilation as before.

could that have anything to do with using tmpfs for /var/tmp/portage ?

----------

## depontius

This looks to me like a hardware or hardware setup problem.  My guess is that some piece of hardware is asserting irq23, and no handler is fielding that irq.

Have you tried booting with "irqpoll" as they suggest?

Since this is irq23, it's clearly not standard xt-pic.  You might also try "noapic" or "nolapic" to change the irq routing.  It's possible that an irq is getting routed to irq23 in hardware, but the driver is waiting for its interrupt elsewhere instead of there.  In other words, things have been set up improperly.  Or it might be hardware.

Take a quick look in /usr/src/linux/Documentation/kernel-parameters.txt, for starters.  There might be further suggestions in there.

----------

## alex.blackbit

i am waiting for the thing to happen again, irqpoll is enabled.

this makes me quite worried, because it only started happening recently; the machine has been running for a year.

if it's software related, it must have been introduced in kernel 2.6.30.

----------

## think4urs11

Moved from Portage & Programming to Kernel & Hardware as requested by thread starter

----------

## albright

Do you have MSI (that's Message Signalled Interrupts) enabled in the kernel? I needed to do that to get rid of the interrupt problem.

----------

## alex.blackbit

thanks for the hint, but PCI_MSI is enabled in my kernel.

----------

## DaggyStyle

had that on two different laptops; only kernel upgrades fixed it

----------

## alex.blackbit

DaggyStyle, may i ask which versions produced the error for you and which did not?

----------

## alex.blackbit

okay, until now the problem did not appear again. i'll have to wait.

----------

## alex.blackbit

the problem just re-appeared.

```
[ 5791.555753] irq 23: nobody cared (try booting with the "irqpoll" option)

[ 5791.555759] Pid: 4718, comm: cc1 Tainted: G        W  2.6.30-gentoo-r4-blackbit #1

[ 5791.555761] Call Trace:

[ 5791.555763]  <IRQ>  [<ffffffff8025dc85>] ? __report_bad_irq+0x30/0x7d

[ 5791.555774]  [<ffffffff8025ddd7>] ? note_interrupt+0x105/0x16e

[ 5791.555777]  [<ffffffff8025e3bc>] ? handle_fasteoi_irq+0x8e/0xaf

[ 5791.555780]  [<ffffffff8020d07f>] ? handle_irq+0x17/0x1d

[ 5791.555783]  [<ffffffff8020c8bc>] ? do_IRQ+0x57/0xbf

[ 5791.555787]  [<ffffffff8020b7d3>] ? ret_from_intr+0x0/0xa

[ 5791.555788]  <EOI> <3>handlers:

[ 5791.555790] [<ffffffff805a43aa>] (ahci_interrupt+0x0/0x41a)

[ 5791.555794] [<ffffffff805eb9bc>] (usb_hcd_irq+0x0/0x64)

[ 5791.555799] Disabling IRQ #23
```

the kernel is running with the irqpoll option

```
# cat /proc/cmdline 

root=/dev/sda1 video=uvesafb:1280x1024-32,mtrr:3,ywrap irqpoll

#
```

what can i do now?

----------

## gringo

so what do you have attached to irq 23 ? can we have a look at your /proc/interrupts table please ?

is this a multicore machine ? or did you upgrade the bios or change sth. relevant in your kernel config lately ?

cheers

----------

## alex.blackbit

yes, this is a multicore machine, 1 intel xeon E5420, 4 cores @ 2.50GHz.

```
# cat /proc/interrupts 

           CPU0       CPU1       CPU2       CPU3       

  0:         63         26         25         25   IO-APIC-edge      timer

  1:          0          0          1          1   IO-APIC-edge      i8042

  3:          0          1          1          0   IO-APIC-edge    

  4:          1          1          6          3   IO-APIC-edge      serial

  6:          2          1          0          2   IO-APIC-edge      floppy

  9:          0          0          0          0   IO-APIC-fasteoi   acpi

 12:          3          0          1          0   IO-APIC-edge      i8042

 14:          0          0          0          0   IO-APIC-edge      ata_piix

 15:          0          0          0          0   IO-APIC-edge      ata_piix

 16:        187        188        199        176   IO-APIC-fasteoi   radeon@pci:0000:05:00.0

 17:         52         49         50         48   IO-APIC-fasteoi   HDA Intel

 20:       2110       2112       2106       2106   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2

 21:          8          6          7          8   IO-APIC-fasteoi   uhci_hcd:usb3

 22:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4

 23:      14199      14175      14162      14218   IO-APIC-fasteoi   ahci, uhci_hcd:usb5

 25:          0          0          0          0   IO-APIC-fasteoi   he

 54:       6683       6707       6711       6680   PCI-MSI-edge      eth0

NMI:          0          0          0          0   Non-maskable interrupts

LOC:     394585     137772     391873     514501   Local timer interrupts

SPU:          0          0          0          0   Spurious interrupts

RES:       1007       1220       1034       1089   Rescheduling interrupts

CAL:         64        311        338        334   Function call interrupts

TLB:       1418       2346       1416       2131   TLB shootdowns

TRM:          0          0          0          0   Thermal event interrupts

THR:          0          0          0          0   Threshold APIC interrupts

ERR:          0

MIS:          0

#
```
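for reading a table like this one, a small awk sketch can pick out the lines where more than one driver hangs off the same IRQ. it assumes the column layout shown above (one count per CPU, then the chip type, then a comma-separated device list); newer kernels add columns, so treat the field arithmetic as an assumption. `list_shared_irqs` is just an illustrative name:

```sh
#!/bin/sh
# Print the numbered IRQ lines of /proc/interrupts-style input (stdin)
# whose device list names more than one handler.
list_shared_irqs() {
    awk '
        /CPU0/ { ncpu = NF; next }           # header row: one field per CPU
        $1 ~ /^[0-9]+:$/ {
            devs = ""
            # fields: irq, ncpu counters, chip type, then the device names
            for (i = ncpu + 3; i <= NF; i++)
                devs = devs (devs == "" ? "" : " ") $i
            if (index(devs, ",") > 0) {
                irq = $1
                sub(/:$/, "", irq)
                print "irq " irq ": " devs
            }
        }
    '
}
```

run as `list_shared_irqs < /proc/interrupts`. on the table above it would report irq 20 (ehci_hcd:usb1, uhci_hcd:usb2) and irq 23 (ahci, uhci_hcd:usb5).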

----------

## krinn

 *alex.blackbit wrote:*   

> 
> 
> [ 5791.555790] [<ffffffff805a43aa>] (ahci_interrupt+0x0/0x41a)
> 
> [ 5791.555794] [<ffffffff805eb9bc>] (usb_hcd_irq+0x0/0x64)
> ...

 

irq conflict between some usb device and ahci (the sata driver). that could explain your slowdown (you could run hdparm -t on a drive, then redo it after the crash to compare speed)

if your bios is set to pnp os, turn it off so the bios assigns the irqs itself, and do the opposite if it's already off ^^

you could also disable some usb ports (if not in use)

can also try to just remove the device on that usb port

you can do that too for your sata controller, but losing some usb devices isn't as painful as losing a sata controller :p
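a minimal sketch of the hdparm comparison, assuming the usual `Timing buffered disk reads: ... = NN.NN MB/sec` line that `hdparm -t` prints. the helper names `hdparm_rate` and `slowdown` are made up for this example:

```sh
#!/bin/sh
# Pull the MB/sec figure out of `hdparm -t` output read from stdin.
hdparm_rate() {
    awk '/Timing buffered disk reads/ { print $(NF - 1) }'
}

# Print how many times slower the second rate is than the first.
slowdown() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        if (after <= 0) { print "n/a"; exit }
        printf "%.1f\n", before / after
    }'
}
```

as root, save `hdparm -t /dev/sda | hdparm_rate` once on a fresh boot and once after the "Disabling IRQ #23" message, then feed both numbers to `slowdown`.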

----------

## pappy_mcfae

Are the uhci drivers installed in the kernel, or set as modules?

Blessed be!

Pappy

----------

## alex.blackbit

everything is in the kernel, the only module is scsi_wait_scan.
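for anyone who wants to check the same thing, here is a hedged sketch that reports whether a config symbol is built in, a module, or unset. it reads a kernel .config from stdin; `config_state` is an illustrative name, and the symbols in the usage line (USB_UHCI_HCD, SATA_AHCI, PCI_MSI) are the ones that matter in this thread:

```sh
#!/bin/sh
# Report how a kernel config symbol is set, reading a .config from stdin.
# Prints "built-in", "module", "not set", or the raw value for other settings.
config_state() {
    awk -v sym="CONFIG_$1" '
        index($0, sym "=") == 1 {
            val = substr($0, length(sym) + 2)
            if (val == "y") state = "built-in"
            else if (val == "m") state = "module"
            else state = val
        }
        END { print (state == "" ? "not set" : state) }
    '
}
```

e.g. `config_state USB_UHCI_HCD < /usr/src/linux/.config`, or `zcat /proc/config.gz | config_state SATA_AHCI` if the running kernel exposes its config there.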

----------

## gringo

 *Quote:*   

> if your bios is set to pnp os, turn it off so the bios assigns the irqs itself, and do the opposite if it's already off

 

yes, that helped me a few times too, especially if the bios is known to be buggy. I mean enabling it and letting the kernel do the job.

I think this is only available for 32bit setups, but do you have pci access mode set to bios or any in your kernel config ? 

And a wild guess : do you really need ahci support in this board ?

cheers

----------

## alex.blackbit

i'll check the bios settings when i come home.

ahci is not really needed, but i'd like to keep it unless i can't find a better solution.

----------

## pappy_mcfae

Depending on the kernel version, you can move around the IRQ's. "Reroute for broken boot IRQ's" would probably be a good place to start. If that doesn't get it, then post your .config, the results of lspci -n and cat /proc/cpuinfo as well as your /etc/fstab file, and I'll see if your issue is kernel related.

Blessed be!

Pappy

----------

## alex.blackbit

sounds like a plan.

this is what i'll try first.

the next time i boot, i'll have the rerouting feature in the kernel.

i'll wait a few days to see if the problem comes back.

----------

## alex.blackbit

"Reroute for broken boot IRQ's" was enabled before, today the error came back, during a full 1200 package re-emerge of my installation for gcc-4.4.1.

i now entered the bios and changed the "installed O/S" from "other O/S" to "Windows". o.O

let's see what happens.

config.gz

```
# lspci -n

00:00.0 0600: 8086:4003 (rev 20)

00:01.0 0604: 8086:4021 (rev 20)

00:05.0 0604: 8086:4025 (rev 20)

00:09.0 0604: 8086:4029 (rev 20)

00:0f.0 0880: 8086:402f (rev 20)

00:10.0 0600: 8086:4030 (rev 20)

00:10.1 0600: 8086:4030 (rev 20)

00:10.2 0600: 8086:4030 (rev 20)

00:10.3 0600: 8086:4030 (rev 20)

00:10.4 0600: 8086:4030 (rev 20)

00:11.0 0600: 8086:4031 (rev 20)

00:15.0 0600: 8086:4035 (rev 20)

00:15.1 0600: 8086:4035 (rev 20)

00:16.0 0600: 8086:4036 (rev 20)

00:16.1 0600: 8086:4036 (rev 20)

00:1b.0 0403: 8086:269a (rev 09)

00:1c.0 0604: 8086:2690 (rev 09)

00:1d.0 0c03: 8086:2688 (rev 09)

00:1d.1 0c03: 8086:2689 (rev 09)

00:1d.2 0c03: 8086:268a (rev 09)

00:1d.3 0c03: 8086:268b (rev 09)

00:1d.7 0c03: 8086:268c (rev 09)

00:1e.0 0604: 8086:244e (rev d9)

00:1f.0 0601: 8086:2670 (rev 09)

00:1f.1 0101: 8086:269e (rev 09)

00:1f.2 0106: 8086:2681 (rev 09)

00:1f.3 0c05: 8086:269b (rev 09)

05:00.0 0300: 1002:71c6

05:00.1 0380: 1002:71e6

09:00.0 0604: 8086:3500 (rev 01)

09:00.3 0604: 8086:350c (rev 01)

0a:00.0 0604: 8086:3510 (rev 01)

0a:02.0 0604: 8086:3518 (rev 01)

0f:00.0 0200: 8086:1096 (rev 01)

0f:00.1 0200: 8086:1096 (rev 01)

10:0a.0 0203: 1127:0400 (rev 01)

#
```
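to make the numeric listing easier to read for this problem, a small filter can keep only the storage and USB controllers. this is a sketch that assumes the `slot class: vendor:device` layout of `lspci -n` shown above and uses the standard PCI class codes 0101 (IDE interface), 0106 (SATA controller) and 0c03 (USB controller); `storage_and_usb` is a made-up name:

```sh
#!/bin/sh
# Keep only IDE, SATA and USB controllers from `lspci -n` output (stdin)
# and print slot, class name and vendor:device id.
storage_and_usb() {
    awk '
        BEGIN {
            name["0101"] = "IDE interface"
            name["0106"] = "SATA controller"
            name["0c03"] = "USB controller"
        }
        {
            class = $2
            sub(/:$/, "", class)
            if (class in name) print $1, name[class], $3
        }
    '
}
```

`lspci -n | storage_and_usb` on this box would list the five 00:1d.x USB functions plus 00:1f.1 (IDE) and 00:1f.2 (SATA), which are the devices worth matching against the irq 23 line in /proc/interrupts.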

```

# grep -Ev "^($|#)" /etc/fstab 

/dev/sda1               /               xfs             noatime         0 1

/dev/sda2               /boot           xfs             noatime         0 1

/dev/sda3               none            swap            sw              0 0

/dev/sda5               /usr            xfs             noatime         0 0

/dev/sda6               /var            xfs             noatime         0 0

/dev/sda7               /opt            xfs             noatime         0 0

/dev/sda8               /home           xfs             noatime         0 0

/dev/sda9               /usr/portage    reiserfs        noatime         0 0

/dev/sda10              /usr/portage/distfiles  btrfs   noatime         0 0

/dev/sdb1               /data           xfs             noatime         0 0

/dev/cdrom1             /mnt/cdrom      auto            user,noauto,ro  0 0

/dev/fd0                /mnt/floppy     auto            user,noauto     0 0

//axp/export            /mnt/axp        cifs            noauto,rw,user,credentials=/home/ahuemer/.cifspasswd    0 0

shm                     /dev/shm        tmpfs           nodev,nosuid,noexec     0 0

#
```

----------

## pappy_mcfae

And here I am. The email freak-outs have yet to put me down. They've been pissing me off, but life is like that sometimes.

Anyway, I took a look at your .config, and decided to go with a seed and give you the Pappy touch. If you continue to have issues, try moving the cards to different slots (if available/applicable), see if you can get the BIOS to do an IRQ remap. If this is an available option, it will be under PCI configurations, or some such. If not that, I'm sort of out of ideas. So, let's get those fingers crossed!!!

I set you up with the standard VESA frame buffer. I know nothing about the proper setup for ATI cards, but I do know that at least the one I have works just fine with the VESA framebuffer, when I attach a monitor to it, anyway. As for ATI and X, there are others who know that one better than I.

Click here for your new .config. Compile as is.

For the best results, please do the following:

1) Move your .config file out of your kernel source directory (/usr/src/linux-2.6.30-gentoo-r4 ).

2) Issue the command make mrproper. This is a destructive step. It returns the source to pristine condition. Unmoved .config files will be deleted!

3) Copy my .config into your source directory.

4) Issue the command make && make modules_install.

5) Install the kernel as you normally would, and reboot.

6) Once it boots, please post /var/log/dmesg so I can see how things loaded.
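Steps 1 through 4 can be sketched as a small script. This is only an outline under stated assumptions: the function names `run` and `rebuild_kernel` are invented, the paths are whatever you pass in, and it prints the commands instead of running them unless you set DRY_RUN=0, because make mrproper is destructive. Installing the kernel (step 5) is left out since that differs per setup.

```sh
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 to really run the steps.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# $1 = kernel source directory
# $2 = path to the new .config (wherever you saved it)
# $3 = where to park the old .config
rebuild_kernel() {
    run cd "$1" || return 1
    run mv .config "$3" || return 1       # step 1: move the old .config away
    run make mrproper || return 1         # step 2: destructive cleanup
    run cp "$2" .config || return 1       # step 3: drop in the new .config
    run make || return 1                  # step 4: build ...
    run make modules_install || return 1  # ... and install the modules
}
```

For example, `rebuild_kernel /usr/src/linux-2.6.30-gentoo-r4 ~/new.config ~/old.config` prints the six commands; rerun with `DRY_RUN=0` in front once they look right.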

Definitely keep me posted on this one. I want to know if I cured the problem, and set up ATM properly.

Blessed be!

Pappy

----------

## alex.blackbit

the problem just re-appeared.

```
[451920.769253] irq 23: nobody cared (try booting with the "irqpoll" option)

[451920.769258] Pid: 0, comm: swapper Tainted: G        W  2.6.30-gentoo-r4-blackbit #4

[451920.769261] Call Trace:

[451920.769263]  <IRQ>  [<ffffffff8025cebf>] ? __report_bad_irq+0x30/0x7d

[451920.769274]  [<ffffffff8025d013>] ? note_interrupt+0x107/0x170

[451920.769277]  [<ffffffff8025d600>] ? handle_fasteoi_irq+0x8a/0xaa

[451920.769281]  [<ffffffff8020cffb>] ? handle_irq+0x17/0x1d

[451920.769283]  [<ffffffff8020c844>] ? do_IRQ+0x54/0xbb

[451920.769287]  [<ffffffff8020b753>] ? ret_from_intr+0x0/0xa

[451920.769289]  <EOI>  [<ffffffff80210b9f>] ? mwait_idle+0xaa/0xdb

[451920.769296]  [<ffffffff8078d865>] ? notifier_call_chain+0x2e/0x5b

[451920.769299]  [<ffffffff8020a1d3>] ? cpu_idle+0x4a/0x8d

[451920.769301] handlers:

[451920.769302] [<ffffffff8059d1b9>] (ahci_interrupt+0x0/0x426)

[451920.769307] [<ffffffff805e33b6>] (usb_hcd_irq+0x0/0x5d)

[451920.769312] Disabling IRQ #23
```

when i find the time i'll try out your .config, pappy_mcfae.

----------

## pappy_mcfae

OK. Keep me informed.

Blessed be!

Pappy

----------

## energyman76b

usually problems like this are caused by bad hardware or buggy bios... if you can't get a bios update, and support of your vendor is telling you 'we don't support linux' tell them, that this is nice and you won't buy their stuff anymore...

what you can try: instead of attaching that usb 1.1 device directly, get a 2.0 hub and connect it there, so the uhci-usb controller is not needed anymore. Remove the driver from the kernel and retry.

----------

## DaggyStyle

ok, update: I've switched the way my hd connects from ata to ahci and the similar error went away. adding irqpoll actually slows my system down.

----------

## alex.blackbit

 *energyman76b wrote:*   

> usually problems like this are caused by bad hardware or buggy bios... if you can't get a bios update, and support of your vendor is telling you 'we don't support linux' tell them, that this is nice and you won't buy their stuff anymore...
> 
> what you can try: instead of attaching that usb 1.1 device directly, get a 2.0 hub and connect it there, so the uhci-usb controller is not needed anymore. Remove the driver from the kernel and retry.

 

i removed the UHCI driver, connected the devices to the EHCI hub.

no change, the error came back yesterday.

is there really no other option than disabling AHCI for sata?

----------

## energyman76b

well, you should go to lkml and report your problem and findings there.

----------

## alex.blackbit

current dmesg without UHCI driver:

```
[157152.418524] irq 23: nobody cared (try booting with the "irqpoll" option)

[157152.418530] Pid: 1359, comm: cc1plus Tainted: G        W 2.6.31-gentoo-blackbit #2

[157152.418532] Call Trace:

[157152.418534]  <IRQ>  [<ffffffff81066e3f>] ? __report_bad_irq+0x30/0x7d

[157152.418544]  [<ffffffff81066f93>] ? note_interrupt+0x107/0x170

[157152.418547]  [<ffffffff81067580>] ? handle_fasteoi_irq+0x8a/0xaa

[157152.418551]  [<ffffffff8100d1cf>] ? handle_irq+0x17/0x1d

[157152.418554]  [<ffffffff8100c84b>] ? do_IRQ+0x54/0xb2

[157152.418558]  [<ffffffff8100b6d3>] ? ret_from_intr+0x0/0xa

[157152.418559]  <EOI>

[157152.418560] handlers:

[157152.418562] [<ffffffff813d2a6f>] (ahci_interrupt+0x0/0x426)

[157152.418566] Disabling IRQ #23
```

problem report on LKML: [1] [2] [3]
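
to compare the traces posted in this thread, a short awk sketch can boil each "nobody cared" report down to the irq number and the handlers registered on it. it assumes the log layout seen above (a `handlers:` line, one `(function+offset/size)` line per handler, then `Disabling IRQ`); `irq_report` is an illustrative name:

```sh
#!/bin/sh
# Summarise "irq N: nobody cared" reports from kernel log text on stdin:
# one output line per report, listing the handler functions on that IRQ.
irq_report() {
    awk '
        /nobody cared/ {
            for (i = 1; i < NF; i++)
                if ($i == "irq") { irq = $(i + 1); sub(/:$/, "", irq) }
            in_handlers = 0
        }
        /handlers:/ { in_handlers = 1; next }
        /Disabling IRQ/ {
            print "irq " irq ":" handlers
            handlers = ""
            in_handlers = 0
        }
        in_handlers && match($0, /\([A-Za-z0-9_]+\+/) {
            # strip the leading "(" and the trailing "+"
            handlers = handlers " " substr($0, RSTART + 1, RLENGTH - 2)
        }
    '
}
```

`dmesg | irq_report` on the earlier traces gives `irq 23: ahci_interrupt usb_hcd_irq`, and on this last one only `irq 23: ahci_interrupt`, so the irq still gets disabled with ahci as the only registered handler.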

----------

