# IRQ & APIC troubles

## DiskDoc

Hi! Since I got my Clevo D870 laptop (P4 HT) one and a half years ago I've had to use the noapic kernel parameter to prevent random lockups.

Recently, I realised some of my system slowdowns could be due to overloaded interrupts. Sure enough, a lot of devices were sharing interrupts, notably the hard drive controller and eth0.

I first tried solving this by disabling some unneeded devices (serial,parallel) in the BIOS to free up some IRQs. That didn't really help much. Next I decided to try and remove "noapic" from the kernel line in Grub - and whaddaya know! The devices spread nicely among the interrupts and my system instantly became more responsive!

I thought everything was wonderful, but after a couple of days I've now had a couple of those characteristic lockups that noapic cures, where the system slowly grinds to a halt. So I'm back to using noapic :(

This is what my interrupt list looks like now:

```
           CPU0       CPU1
  0:     310180          0          XT-PIC  timer
  1:       5277          0          XT-PIC  i8042
  2:          0          0          XT-PIC  cascade
  5:         84          0          XT-PIC  ehci_hcd:usb1, uhci_hcd:usb4, Intel ICH5
  8:          2          0          XT-PIC  rtc
  9:       1494          0          XT-PIC  acpi
 11:      55826          0          XT-PIC  ide2, ohci1394, yenta, uhci_hcd:usb2, uhci_hcd:usb3, uhci_hcd:usb5, eth0
 12:        110          0          XT-PIC  i8042
 14:       2548          0          XT-PIC  ide0
NMI:          0          0
LOC:     310140     310139
ERR:          0
MIS:          0
```

I'd like ide0, ide2, ICH5 and eth0 to have their own interrupts, at least. I know there are some other kernel parameters to try, but I'm unsure which to try first. Perhaps noacpi is a bad idea for a laptop? Overheating protection and such...
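For anyone who wants to check their own machine for the same kind of sharing, here is a minimal Python sketch that lists the IRQ lines carrying more than one device. It assumes the /proc/interrupts layout shown in this thread (per-CPU counters, a one-word controller type such as XT-PIC, then a comma-separated device list); newer kernels print extra columns, so treat it as a starting point. The function name `shared_irqs` is just an illustration.

```python
def shared_irqs(interrupts_text):
    """Return {irq_number: [device, ...]} for IRQ lines that list more than one device.

    Assumes the 2.6-era layout seen in this thread:
    "<irq>: <count per CPU>... <controller type> <dev>, <dev>, ..."
    """
    shared = {}
    for line in interrupts_text.splitlines():
        head, sep, rest = line.partition(":")
        irq = head.strip()
        if not sep or not irq.isdigit():
            continue  # CPU header, blank lines, and NMI/LOC/ERR/MIS rows
        fields = rest.split()
        # Drop the leading per-CPU counters.
        while fields and fields[0].isdigit():
            fields.pop(0)
        if not fields:
            continue
        # fields[0] is the controller type; the remainder is the device list.
        devices = [d.strip() for d in " ".join(fields[1:]).split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(irq)] = devices
    return shared

# Usage on a live Linux system:
#   with open("/proc/interrupts") as f:
#       for irq, devices in sorted(shared_irqs(f.read()).items()):
#           print(irq, ", ".join(devices))
```

Run against the noapic listing above, this would report IRQ 5 and IRQ 11 as the shared lines.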

Interrupts without noapic:

```
           CPU0       CPU1
  0:      48570          0    IO-APIC-edge  timer
  1:        495          0    IO-APIC-edge  i8042
  8:          2          0    IO-APIC-edge  rtc
  9:        151          0   IO-APIC-level  acpi
 12:        116          0    IO-APIC-edge  i8042
 14:        509          0    IO-APIC-edge  ide0
 16:        954          0   IO-APIC-level  yenta, uhci_hcd:usb2, uhci_hcd:usb5
 17:         69          0   IO-APIC-level  Intel ICH5
 18:       1518          0   IO-APIC-level  eth0
 19:          0          0   IO-APIC-level  uhci_hcd:usb4
 20:      10020          0   IO-APIC-level  ide2, uhci_hcd:usb3
 21:          3          0   IO-APIC-level  ohci1394
 22:         15          0   IO-APIC-level  ehci_hcd:usb1
NMI:          0          0
LOC:      48506      48505
ERR:          0
MIS:          0
```

The best thing would be to get the APIC working properly, but I can't seem to find any solution on the web. Do these problems sound familiar to anyone?

----------

## augury

There is a howto on detecting and recompiling APICs that were compiled with the Microsoft compiler instead of the Intel compiler or are otherwise buggy. Are you running irqbalance? Anything in dmesg?

----------

## DiskDoc

Are you sure about that APIC recompile howto? I thought the APIC was a chip/controller; can you update software on it somehow? I tried googling, but all I found was an ACPI howto in the Gentoo wiki.

In dmesg it says:

> Starting balanced_irq

so I guess I'm using it.

I can't remember now if I've ever been able to read dmesg before the system failed completely. It's overwritten with every new boot, right? So I can't view old entries. I'm guessing syslog-ng needs to be reconfigured somehow.
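The kernel ring buffer that dmesg prints lives in RAM, so it does start fresh on every boot. One way to keep older copies around is to dump it to a timestamped file at intervals (from cron, say). Below is a rough Python sketch of that idea; the function names and the target directory are made up for illustration, and a hard lockup can still swallow whatever was logged after the last snapshot.

```python
import subprocess
import time
from pathlib import Path

def save_log_snapshot(text, directory, stamp=None):
    """Write text to <directory>/dmesg-<stamp>.log and return the resulting path."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"dmesg-{stamp}.log"
    path.write_text(text)
    return path

def read_dmesg():
    """Return the current kernel ring buffer as text (may require root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout

# Usage (the directory is a hypothetical choice, not a standard location):
#   save_log_snapshot(read_dmesg(), "/var/log/dmesg-history")
```

Since each snapshot gets its own file name, a log from before a freeze survives the next boot as long as it reached the disk.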

Here's the beginning of my dmesg booted WITH APIC support, just in case:

```
Linux version 2.6.13-gentoo-r5 (root@localhost) (gcc version 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)) #1 SMP Tue Nov 1 00:42:55 EET 2005

BIOS-provided physical RAM map:

 BIOS-e820: 0000000000000000 - 000000000009e800 (usable)

 BIOS-e820: 000000000009e800 - 00000000000a0000 (reserved)

 BIOS-e820: 00000000000d2000 - 00000000000d4000 (reserved)

 BIOS-e820: 00000000000d8000 - 0000000000100000 (reserved)

 BIOS-e820: 0000000000100000 - 000000001ff70000 (usable)

 BIOS-e820: 000000001ff70000 - 000000001ff7b000 (ACPI data)

 BIOS-e820: 000000001ff7b000 - 000000001ff80000 (ACPI NVS)

 BIOS-e820: 000000001ff80000 - 0000000020000000 (reserved)

 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)

 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)

 BIOS-e820: 00000000ff800000 - 00000000ffc00000 (reserved)

 BIOS-e820: 00000000fffffc00 - 0000000100000000 (reserved)

511MB LOWMEM available.

found SMP MP-table at 000f65e0

On node 0 totalpages: 130928

  DMA zone: 4096 pages, LIFO batch:1

  Normal zone: 126832 pages, LIFO batch:31

  HighMem zone: 0 pages, LIFO batch:1

DMI present.

ACPI: RSDP (v000 PTLTD                                 ) @ 0x000f6640

ACPI: RSDT (v001 PTLTD    RSDT   0x06040000  LTP 0x00000000) @ 0x1ff76d7a

ACPI: FADT (v001 Clevo  SPDG     0x06040000 PTL  0x00000003) @ 0x1ff7aecf

ACPI: MADT (v001 PTLTD           APIC   0x06040000  LTP 0x00000000) @ 0x1ff7af43

ACPI: BOOT (v001 PTLTD  $SBFTBL$ 0x06040000  LTP 0x00000001) @ 0x1ff7afa1

ACPI: SSDT (v001 PTLTD  ACPIHT   0x06040000  LTP 0x00000001) @ 0x1ff7afc9

ACPI: DSDT (v001  INTEL SPRGDLEG 0x06040000 MSFT 0x0100000e) @ 0x00000000

ACPI: Local APIC address 0xfee00000

ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)

Processor #0 15:2 APIC version 20

ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)

Processor #1 15:2 APIC version 20

ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])

ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])

ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])

IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23

ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)

ACPI: IRQ9 used by override.

Enabling APIC mode:  Flat.  Using 1 I/O APICs

Using ACPI (MADT) for SMP configuration information

Allocating PCI resources starting at 20000000 (gap: 20000000:dec00000)

Built 1 zonelists

Kernel command line: root=/dev/hde3 video=radeon,mtrr,ywrap splash=verbose,theme:livecd-2005.1 CONSOLE=/dev/tty1

mapped APIC to ffffd000 (fee00000)

mapped IOAPIC to ffffc000 (fec00000)

Initializing CPU#0

PID hash table entries: 2048 (order: 11, 32768 bytes)

Detected 2600.339 MHz processor.

Using tsc for high-res timesource

Console: colour VGA+ 80x25

Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)

Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)

Memory: 511920k/523712k available (3291k kernel code, 11256k reserved, 1250k data, 232k init, 0k highmem)

Checking if this processor honours the WP bit even in supervisor mode... Ok.

Calibrating delay using timer specific routine.. 5207.66 BogoMIPS (lpj=10415324)

Mount-cache hash table entries: 512

CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000

CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000

CPU: Trace cache: 12K uops, L1 D cache: 8K

CPU: L2 cache: 512K

CPU: Physical Processor ID: 0

CPU: After all inits, caps: bfebfbff 00000000 00000000 00000080 00004400 00000000 00000000

Intel machine check architecture supported.

Intel machine check reporting enabled on CPU#0.

CPU0: Intel P4/Xeon Extended MCE MSRs (12) available

CPU0: Thermal monitoring enabled

mtrr: v2.0 (20020519)

Enabling fast FPU save and restore... done.

Enabling unmasked SIMD FPU exception support... done.

Checking 'hlt' instruction... OK.

CPU0: Intel(R) Pentium(R) 4 CPU 2.60GHz stepping 09

Booting processor 1/1 eip 2000

Initializing CPU#1

Calibrating delay using timer specific routine.. 5200.57 BogoMIPS (lpj=10401146)

CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000

CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000

CPU: Trace cache: 12K uops, L1 D cache: 8K

CPU: L2 cache: 512K

CPU: Physical Processor ID: 0

CPU: After all inits, caps: bfebfbff 00000000 00000000 00000080 00004400 00000000 00000000

Intel machine check architecture supported.

Intel machine check reporting enabled on CPU#1.

CPU1: Intel P4/Xeon Extended MCE MSRs (12) available

CPU1: Thermal monitoring enabled

CPU1: Intel(R) Pentium(R) 4 CPU 2.60GHz stepping 09

Total of 2 processors activated (10408.23 BogoMIPS).

ENABLING IO-APIC IRQs

..TIMER: vector=0x31 pin1=0 pin2=-1

checking TSC synchronization across 2 CPUs: passed.

Brought up 2 CPUs

checking if image is initramfs... it is

Freeing initrd memory: 1235k freed

NET: Registered protocol family 16

ACPI: bus type pci registered

PCI: PCI BIOS revision 2.10 entry at 0xfd972, last bus=3

PCI: Using configuration type 1

ACPI: Subsystem revision 20050408

ACPI: Interpreter enabled

ACPI: Using IOAPIC for interrupt routing

ACPI: PCI Root Bridge [PCI0] (0000:00)

PCI: Probing PCI hardware (bus 00)

ACPI: Assume root bridge [\_SB_.PCI0] segment is 0

PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1

Boot video device is 0000:01:00.0

PCI: Transparent bridge - 0000:00:1e.0

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGP_._PRT]

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIB._PRT]

ACPI: PCI Interrupt Link [LNKA] (IRQs *11)

ACPI: PCI Interrupt Link [LNKB] (IRQs *5)

ACPI: PCI Interrupt Link [LNKC] (IRQs *5)

ACPI: PCI Interrupt Link [LNKD] (IRQs *11)

ACPI: PCI Interrupt Link [LNKE] (IRQs 5) *0, disabled.

ACPI: PCI Interrupt Link [LNKF] (IRQs *11)

ACPI: PCI Interrupt Link [LNKG] (IRQs *11)

ACPI: PCI Interrupt Link [LNKH] (IRQs *5)

ACPI: Embedded Controller [EC] (gpe 28)

Linux Plug and Play Support v0.97 (c) Adam Belay

pnp: PnP ACPI init

pnp: PnP ACPI: found 9 devices

SCSI subsystem initialized

usbcore: registered new driver usbfs

usbcore: registered new driver hub

PCI: Using ACPI for IRQ routing

PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report

PCI: Bridge: 0000:00:01.0

  IO window: 3000-3fff

  MEM window: d0100000-d01fffff

  PREFETCH window: e0000000-efffffff

.

.

.
```

----------

## DiskDoc

I've tried running with nolapic, pci=noacpi & acpi=noirq and nolapic_irq_nobalance, but to no avail. I couldn't find anything special in my logfiles when the crash happened. This really bothers me :(
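When cycling through this many boot parameters it is easy to lose track of what a given boot actually used. The kernel exposes the command line it booted with in /proc/cmdline, so a small Python sketch like the one below can note the interrupt-related options next to each test result. The `WATCHED` list only holds names mentioned in this thread, both function names are illustrative, and quoted values containing spaces are not handled.

```python
# Option names taken from this thread; extend the tuple as needed.
WATCHED = ("noapic", "nolapic", "acpi", "pci")

def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None.

    Simplification: values quoted with embedded spaces are not supported.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params

def irq_related(cmdline):
    """Return only the watched APIC/ACPI/PCI options present on the command line."""
    params = parse_cmdline(cmdline)
    return {name: params[name] for name in WATCHED if name in params}

# Usage on a live Linux system:
#   with open("/proc/cmdline") as f:
#       print(irq_related(f.read()))
```

For the command line in the dmesg output earlier in the thread this returns an empty dict, which matches that boot having been done without noapic.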

----------

## PrakashP

I would report (via detailed bug report) to lkml and try to get this fixed.

----------

## augury

Yes, you're correct. Sorry, it's alphabet soup to me. irqbalance runs as a daemon (# emerge irqbalance; rc-update add irqbalance default). It runs in the background and does very little, but it may correct your problem; I don't know. Looking at dmesg, it says:

```
PCI: Using ACPI for IRQ routing

PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report

PCI: Bridge: 0000:00:01.0 
```

It's a stab in the dark, but the exact problem isn't immediately evident. It's not something that should be happening, though.

----------

## DiskDoc

Thanks for the tip on using irqbalance! Together with nolapic it seems to have increased stability... but I still had occasional system freezes :( Now I'm back to using noapic and clogged IRQs; I don't want to mess up my filesystems with too many lockups.

Yes, I suppose this needs to be reported. It won't be easy, though. I'll have to save it for some day when I'm ready to do all it takes to get enough details to the developers.

----------

## DiskDoc

Updated to the latest BIOS today; I'm also running the stable Gentoo 2.6.15-r1 kernel. I tried changing "noapic" to "nolapic" on the kernel boot line and the usual happened: the system booted, seemed to run happily and snappily, and gradually froze as I started to tax it. The mouse pointer moved, but everything else was frozen - unless I clicked the mouse button. Perhaps the click triggered an interrupt somehow, so everything in the system could go one "step" forward and then freeze again. Finally the system froze completely.

I found a discussion on the Mandriva forums where they seem to be wrestling with the same problem:

http://forum.mandrivaclub.com/viewtopic.php?t=40909

----------

