# nvidia chipsets + Core 2 Duo = uniprocessor [solved]

## jesnow

Some nvidia NFORCE chipsets appear to be pretty thoroughly broken for use with SMP under linux. Probably under windows too, but of course the problem is better hidden there. I have an ASUS P5NSLI board that shows this problem, and it appears that other boards show it too.

Symptom 1: During boot you get messages like:

```
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...  failed.
...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ... works.
checking TSC synchronization across 2 CPUs: passed.
```

Symptom 2: You then find that only CPU0 is allowed to process interrupts: 

```
jesnow@Merckx ~ $ cat /proc/interrupts
           CPU0       CPU1
  0:     548400          0    XT-PIC-XT        timer
  1:       1328          0   IO-APIC-edge      i8042
  6:          3          0   IO-APIC-edge      floppy
  9:          0          0   IO-APIC-fasteoi   acpi
 12:          4          0   IO-APIC-edge      i8042
 14:         50          0   IO-APIC-edge      ide0
 16:       5672          0   IO-APIC-fasteoi   eth0
 17:      15399          0   IO-APIC-fasteoi   libata, ohci_hcd:usb2
 18:          0          0   IO-APIC-fasteoi   libata
 19:        302          0   IO-APIC-fasteoi   ehci_hcd:usb1
 20:       6670          0   IO-APIC-fasteoi   HDA Intel
 21:      74848          0   IO-APIC-fasteoi   nvidia
NMI:          0          0
LOC:     536748     537048
ERR:          1
MIS:          0
```
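
For anyone who wants to check for Symptom 2 without eyeballing the table, here is a minimal Python sketch. It assumes the `/proc/interrupts` layout shown above (a header row of CPU names, then one row per IRQ). The function names `parse_interrupts` and `cpu0_only` are made up for illustration, and `SAMPLE` is a shortened copy of the listing above.

```python
SAMPLE = """\
           CPU0       CPU1
  0:     548400          0    XT-PIC-XT        timer
  9:          0          0   IO-APIC-fasteoi   acpi
 16:       5672          0   IO-APIC-fasteoi   eth0
 21:      74848          0   IO-APIC-fasteoi   nvidia
LOC:     536748     537048
ERR:          1
"""


def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: [count per CPU]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        # Rows such as ERR/MIS carry one global count, not one per CPU: skip them.
        if len(counts) == ncpus:
            table[label.strip()] = counts
    return table


def cpu0_only(table):
    """Numbered IRQs that have fired, but never on any CPU other than CPU0."""
    return [
        irq
        for irq, counts in table.items()
        if irq.isdigit() and counts[0] > 0 and not any(counts[1:])
    ]


print("IRQs seen only on CPU0:", ", ".join(cpu0_only(parse_interrupts(SAMPLE))))
```

On a live system, the text read from `/proc/interrupts` can be passed in place of `SAMPLE`. The sketch only reports where interrupts were delivered; it says nothing about why.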

Nvidia themselves have this to say about it:

http://download.nvidia.com/XFree86/nforce/1.0-0301/KnownProblems.html

 *Quote:*   

> 
> 
>  Network and other devices randomly stop working when ACPI is enabled
> 
>  This problem may be caused by an incorrect ACPI table entry that causes the timer interrupt to be incorrectly configured. 
> ...

 

as well as

http://download.nvidia.com/XFree86/Linux-x86/1.0-8774/README/appendix-l.html

 *Quote:*   

> 
> 
> Appendix L. Known Issues
> 
> The following problems still exist in this release and are in the process of being resolved.
> ...

 

The 'acpi_skip_timer_override' boot-time option did something for a while, but I had stability issues and stopped using it. 'noapic' makes you a uniprocessor machine, and kills the sound. In the meantime, I have replicated this behavior on all recent kernels and with various combinations of the relevant boot parameters. My conclusion is that, until somebody takes an interest and fixes this problem, motherboards with the affected chipsets (I don't know which these are) are seriously degraded in performance for most tasks. My mobo has the NForce MCP55 chipset. 

Anybody who might know more about this, or a possible workaround, please post!

Cheers, 

Jon.

Last edited by jesnow on Sat Dec 22, 2007 10:30 pm; edited 3 times in total

----------

## ntrl

Hi.

try  pci=bios

----------

## jesnow

 *ntrl wrote:*   

> Hi.
> 
> try  pci=bios

 

Tried that, and tried acpi_skip_timer_override, which worked on some kernels a while ago. Nada. 

jon

----------

## ntrl

What MB? Asus, Gigabyte? Maybe flash a new BIOS?

----------

## jesnow

 *ntrl wrote:*   

> What MB? Asus, Gigabyte? Maybe flash a new BIOS?

 

Yes, an ASUS P5NSLI, newest available BIOS. I'm really at my wit's end on this. 

Jon.

----------

## jesnow

I have submitted a bug report on kernel bugzilla, will report back. 

Jon.

----------

## jesnow

Someone in Kernel bugzilla had this to say: 

 *Quote:*   

> 
> 
> ------- Additional Comment #1 From Dave Jones 2007-03-20 14:53 -------  
> 
> CPU1 won't handle interrupts unless irqbalance is run.
> ...

 

So I tried that and got: 

```
Merckx jesnow # equery list irqbalance
[ Searching for package 'irqbalance' in all categories among: ]
 * installed packages
[I--] [  ] sys-apps/irqbalance-0.55 (0)
Merckx jesnow # /etc/init.d/irqbalance start
 * irqbalance: your machine lacks different physical processors; not enabling
```

Which is strange, since both cpus are recognized everywhere else, like in /proc and so forth.
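
As background on what irqbalance adjusts: Linux exposes the set of CPUs allowed to service each IRQ as a hexadecimal bitmask in `/proc/irq/<n>/smp_affinity` (bit 0 is CPU0, bit 1 is CPU1). The sketch below only converts between CPU lists and that mask format. The function names are made up for illustration, nothing is written to `/proc`, and whether changing the mask helps on the affected boards is not something established here.

```python
def cpus_to_mask(cpus):
    """Turn a list of CPU numbers into a hex mask string, e.g. [0, 1] -> "3"."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def mask_to_cpus(mask_text):
    """Turn a hex mask string (as read from smp_affinity) into CPU numbers."""
    # Large masks are printed in comma-separated groups; drop the commas.
    value = int(mask_text.strip().replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


print("both cores:", cpus_to_mask([0, 1]))
print("CPU1 only: ", cpus_to_mask([1]))
print("mask 00000003 means CPUs", mask_to_cpus("00000003"))
```

As root, a mask produced this way could be written to the `smp_affinity` file of a given IRQ, for example `2` to steer that IRQ to CPU1.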

----------

## ZeroDivide

I was having the same problems with an Epox 9NPA3-SLI motherboard.

Eventually I found that disabling ACPI caused APIC to work properly.  So I just have to pass "acpi=no" as a kernel parameter and smp appears to work fine.

----------

## jesnow

 *ZeroDivide wrote:*   

> I was having the same problems with an Epox 9NPA3-SLI motherboard.
> 
> Eventually I found that disabling ACPI caused APIC to work properly.  So I just have to pass "acpi=no" as a kernel parameter and smp appears to work fine.

 

I tried this, and interestingly, it had no effect at all -- that is, SMP continued to be broken as before, and there are lots of ACPI messages in dmesg.

----------

## Rikai

If there's a bad ACPI table entry, you might try fixing it yourself with a custom DSDT?

At the very least, I don't think it can hurt.

----------

## jesnow

 *Rikai wrote:*   

> If there's a bad ACPI table entry, you might try fixing it yourself with a custom DSDT?
> 
> At the very least, I don't think it can hurt.

 

Wow, this looks daunting, but I can give it a try. I could find no references to DSDT in connection with my MB on the web, so there is no indication that this has worked for anyone else. 

The procedure outlined seems so simplistic that I can't imagine it working for me except by pure chance without specific knowledge of ACPI programming.  Worth a try though. Thanks!

----------

## EasterParade

I haven't looked that hard at the boot messages. My mobo is an ASUS P5N32-E SLI with the nForce 680 chipset and a Pentium 4.

I don't know whether it is related to that problem but I had to give the kernel pci=nomsi on boot to make net.eth0 actually work. I've tried several kernels and for each of them I had to write that into grub.

Does anyone know what pci=nomsi does and how it affects other pci traffic on the board?

I have a beta bios and intend to flash it to the same bios rev. because it just got out of beta status. The system is all in all pretty fast and has no stability problems but I haven't pushed it to its limits yet. It is only 4 weeks old.

My /proc/interrupts looks like this

```
cat /proc/interrupts

           CPU0       CPU1

  0:    1382981          0   IO-APIC-edge      timer

  1:          2          0   IO-APIC-edge      i8042

  6:          3          0   IO-APIC-edge      floppy

  8:          1          0   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:          4          0   IO-APIC-edge      i8042

 14:      12293          0   IO-APIC-edge      ide0

 16:      83499          0   IO-APIC-fasteoi   nvidia

 17:          7          0   IO-APIC-fasteoi   Bt87x audio, bttv0

 20:      21919          0   IO-APIC-fasteoi   ohci_hcd:usb1

 21:        215          0   IO-APIC-fasteoi   libata, HDA Intel

 22:     141894          0   IO-APIC-fasteoi   libata, eth0

 23:       8632          0   IO-APIC-fasteoi   libata, ehci_hcd:usb2

NMI:        201        149

LOC:    1382974    1382973

ERR:          0

```

Lioba

----------

## jesnow

 *Rikai wrote:*   

> If there's a bad ACPI table entry, you might try fixing it yourself with a custom DSDT?
> 
> At the very least, I don't think it can hurt.

 

OK, I did that, and got only warnings, no errors:

```

Intel ACPI Component Architecture

ASL Optimizing Compiler version 20060912 [Apr  5 2007]

Copyright (C) 2000 - 2006 Intel Corporation

Supports ACPI Specification Revision 3.0a

dsdt.dsl   386:     Method (\_WAK, 1, NotSerialized)

Warning  1079 -                 ^ Reserved method must return a value (_WAK)

dsdt.dsl  2310:                     Method (GFSB, 0, NotSerialized)

Warning  1086 -                                ^ Not all control paths return a value (GFSB)

dsdt.dsl  4629:                 Method (RVLT, 1, NotSerialized)

Warning  1086 -                            ^ Not all control paths return a value (RVLT)

dsdt.dsl  4749:                 Method (RTMP, 1, NotSerialized)

Warning  1086 -                            ^ Not all control paths return a value (RTMP)

dsdt.dsl  4901:                     Store (GFSB (), Local0)

Warning  1091 -                               ^ Called method may not always return a value

dsdt.dsl  4927:                 Method (OCOP, 1, NotSerialized)

Warning  1086 -                            ^ Not all control paths return a value (OCOP)

dsdt.dsl  4943:                             Subtract (Local1, GFSB (), Local1)

Warning  1091 -                                            ^ Called method may not always return a value

dsdt.dsl  5206:                             Multiply (GFSB (), Local1, Local1)

Warning  1091 -                                          ^ Called method may not always return a value

dsdt.dsl  5213:                             Subtract (Local1, GFSB (), Local1)

Warning  1091 -                                            ^ Called method may not always return a value

ASL Input:  dsdt.dsl - 8318 lines, 261914 bytes, 3426 keywords

AML Output: dsdt.aml - 27191 bytes 1073 named objects 2353 executable opcodes

Compilation complete. 0 Errors, 9 Warnings, 0 Remarks, 939 Optimizations

```

But it looks to me like these are not warnings that are relevant to the timer binding. At least I can't make heads or tails of them, and anyway, they're warnings.

What now?

Cheers, 

Jon.

----------

## jesnow

I'm a little upset that this thread is petering out and that there is no action on the kernel.org bug I submitted. I think this may go back to the general linux mentality of "it boots, it works". There have to be a lot of people out there with broken SMP who don't know it, and they wonder why their machine is less responsive under linux than under that other OS. 

Is it really OK for a large fraction of machines out there to not process interrupts on both CPUs? What happens when the primary advance in computing power is by adding cores? That's happening now. 

At the very least, can people with Core 2 processors run cat /proc/interrupts and post here whether their machine works (say what your mobo and chipset are)?

Cheers, 

Jon.

----------

## arkhan_jg

It may also be that it's not strictly a fault in the kernel; if the ACPI table is broken on the motherboard, i.e. ASUS only tested that their BIOS worked with the incredibly out-of-spec tolerating windows, and the nforce chipset drivers aren't happy with a badly implemented ACPI table under linux...

I presume you've tried disabling ACPI in the BIOS directly rather than just passing ACPI=no in the kernel arguments? APM will do the job for a desktop; IMO ACPI causes more problems than it solves for desktops because the implementations are usually so badly written, which is a shame as it does solve some IRQ issues.

I don't have linux installed on any of my core 2 at home, but I have ubuntu on my nforce 4 sli x16 (asus p5n32-sli se deluxe) core 2 at work; I'll have a look at that one when I come back off holiday.

Also, I thought the MCP55 chipset was an athlon AM2 one? The P5NSLI normally has an nforce 570.

----------

## xanas3712

I'll try to check on this when I get home, but I'm not running nforce and both cores seem to work fine, but I'm curious of what the output is supposed to look like by comparison since I've not looked into this before.

```

           CPU0       CPU1       

  0:   22889202          0   IO-APIC-edge      timer

  1:          2          0   IO-APIC-edge      i8042

  6:          5          0   IO-APIC-edge      floppy

  7:          0          0   IO-APIC-edge      parport0

  8:          0          0   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:          4          0   IO-APIC-edge      i8042

 16:   10362546          0   IO-APIC-fasteoi   uhci_hcd:usb3, eth0, libata, nvidia

 18:          3          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb7

 19:     954920          0   IO-APIC-fasteoi   uhci_hcd:usb6, libata, libata, libata

 21:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4

 22:     536054          0   IO-APIC-fasteoi   HDA Intel

 23:     643674          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb5

NMI:      14487      13645 

LOC:   22706729   22706709 

ERR:          0

```

I guess this would mean I am not working quite right either?

irqbalance didn't have the problem for me that it did for you; it seems to start anyhow, but maybe it won't assign interrupts until I reboot?

 *Quote:*   

> 
> 
>          CPU0       CPU1       
> 
>   0:   25065078          0   IO-APIC-edge      timer
> ...

 

It changed some while running I noticed

----------

## jesnow

 *xanas3712 wrote:*   

> I'll try to check on this when I get home, but I'm not running nforce and both cores seem to work fine, but I'm curious of what the output is supposed to look like by comparison since I've not looked into this before.
> 
> ```
> 
>            CPU0       CPU1       
> ...

 

Not running nforce? Maybe this problem is more widespread than I thought! How many "happy" linux users out there have broken SMP and don't know it? Your luck with irqbalance is encouraging, but it looks like you have only 0.1% use of CPU1 on two interrupts. I guess this is better than I have, but still not impressive. And why did you need it at all? Isn't the linux kernel supposed to handle multiprocessing all by itself? Who knows the answer to this?

Jon.

----------

## aceFruchtsaft

My output is also similar and I have an Asus P5B Deluxe with an Intel P965 Chipset:

```

# cat /proc/interrupts 

           CPU0       CPU1       

  0:    3986155          0   IO-APIC-edge      timer

  1:          2          0   IO-APIC-edge      i8042

  6:          3          0   IO-APIC-edge      floppy

  8:          1          0   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:          4          0   IO-APIC-edge      i8042

 16:     284222          0   IO-APIC-fasteoi   stex, uhci_hcd:usb3, nvidia

 17:      78225          0   IO-APIC-fasteoi   libata, uhci_hcd:usb4

 18:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb7

 19:      24421          0   IO-APIC-fasteoi   uhci_hcd:usb6

 21:          3          0   IO-APIC-fasteoi   ohci1394

 22:       1267          0   IO-APIC-fasteoi   EMU10K1

 23:          6          0   IO-APIC-fasteoi   ohci1394, ehci_hcd:usb2, uhci_hcd:usb5

314:      48582          0   PCI-MSI-edge      eth0

315:      17035          0   PCI-MSI-edge      libata

NMI:        395        252 

LOC:    3906513    3906431 

ERR:          0

```

Irqbalance seems to change this a bit:

```

# cat /proc/interrupts 

           CPU0       CPU1       

  0:    4215430          0   IO-APIC-edge      timer

  1:          2          0   IO-APIC-edge      i8042

  6:          3          0   IO-APIC-edge      floppy

  8:          1          0   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:          4          0   IO-APIC-edge      i8042

 16:     285262      13461   IO-APIC-fasteoi   stex, uhci_hcd:usb3, nvidia

 17:      78613       4471   IO-APIC-fasteoi   libata, uhci_hcd:usb4

 18:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb7

 19:      24464       2917   IO-APIC-fasteoi   uhci_hcd:usb6

 21:          3          0   IO-APIC-fasteoi   ohci1394

 22:       1267          0   IO-APIC-fasteoi   EMU10K1

 23:          6          0   IO-APIC-fasteoi   ohci1394, ehci_hcd:usb2, uhci_hcd:usb5

314:      48597        409   PCI-MSI-edge      eth0

315:      17955          0   PCI-MSI-edge      libata

NMI:        401        261 

LOC:    4131093    4131011 

ERR:          0

```

----------

## drescherjm

Here is my result for a dual processor Opteron on a 2.6.19 vserver sources kernel:

```

# cat /proc/interrupts

           CPU0       CPU1

  0:     180589   54482577   IO-APIC-edge      timer

  1:        385      96788   IO-APIC-edge      i8042

  8:      12110    1348081   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:      10010    1907253   IO-APIC-edge      i8042

 14:          1        455   IO-APIC-edge      ide0

 15:        212      35128   IO-APIC-edge      ide1

 16:      17716   18720723   IO-APIC-fasteoi   ivtv0, nvidia

 17:      17442    2716677   IO-APIC-fasteoi   libata, AMD AMD8111, ivtv1

 19:          0          3   IO-APIC-fasteoi   ohci_hcd:usb1, ohci_hcd:usb2, ohci1394

 24:      53393    7817791   IO-APIC-fasteoi   eth0

NMI:       2358       2773

LOC:   54662053   54661795

ERR:          0

```

It looks like interrupts favor CPU1 for me.

----------

## DirtyHairy

Sorry folks, I might be wrong on that one, but imho balancing interrupts is irrelevant where speed is concerned (at least for a usual desktop configuration). The speed improvement with multicore systems comes from the ability to balance multiple threads between several physical CPUs, nothing else. If this works (and you can check this as easily as firing up several time-consuming processes and then looking at top output), your SMP is working fine. For example, 

```
cat /proc/interrupts
```

 gives

```
  0:   29318702          0   IO-APIC-edge      timer

  1:      39862          0   IO-APIC-edge      i8042

  3:     136564          0   IO-APIC-edge      serial

  8:         70          0   IO-APIC-edge      rtc

  9:      64191          0   IO-APIC-fasteoi   acpi

 12:    5859246          0   IO-APIC-edge      i8042

 14:     401980          0   IO-APIC-edge      libata

 15:        181          0   IO-APIC-edge      libata

 16:    2596924          0   IO-APIC-fasteoi   yenta, uhci_hcd:usb2, fglrx

 21:          1          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5

 22:    1546273          0   IO-APIC-fasteoi   uhci_hcd:usb3, ipw3945, HDA Intel

 23:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4

218:      29486          0   PCI-MSI-edge      eth0

NMI:          0          0

LOC:   29318542   29145006

ERR:          0

MIS:          0

```

on my machine (CoreDuo) right now, but SMP is working perfectly...
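
The check described above (load both cores, then look at `top`) can also be done from two snapshots of `/proc/stat`. The sketch below is only an illustration: it assumes the usual per-CPU row layout (`user nice system idle iowait irq softirq steal`), the helper names `cpu_times` and `utilization` are invented here, and the two snapshots are made-up numbers, not measurements from any machine in this thread.

```python
BEFORE = """\
cpu  200 0 100 1650 10 20 20 0
cpu0 100 0 50 800 10 20 20 0
cpu1 100 0 50 850 0 0 0 0
"""

AFTER = """\
cpu  300 0 100 1750 10 20 20 0
cpu0 200 0 50 800 10 20 20 0
cpu1 100 0 50 950 0 0 0 0
"""


def cpu_times(stat_text):
    """Return {cpu_name: (busy_ticks, total_ticks)} for each per-CPU row."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-CPU rows are named cpu0, cpu1, ...; the bare "cpu" row is the sum.
        if len(fields) < 5 or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(value) for value in fields[1:9]]
        not_busy = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)  # idle + iowait
        total = sum(ticks)
        times[fields[0]] = (total - not_busy, total)
    return times


def utilization(before, after):
    """Fraction of elapsed ticks each CPU spent busy between two snapshots."""
    usage = {}
    for cpu, (busy_after, total_after) in after.items():
        busy_before, total_before = before[cpu]
        elapsed = total_after - total_before
        usage[cpu] = (busy_after - busy_before) / elapsed if elapsed else 0.0
    return usage


for cpu, share in utilization(cpu_times(BEFORE), cpu_times(AFTER)).items():
    print(f"{cpu}: {share:.0%} busy")
```

With real snapshots taken a few seconds apart while several CPU-bound processes run, both cores showing high utilization would suggest the scheduler is spreading work across them.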

----------

## Cyker

Just for kicks, here's my S3 VirgeDX-powered headless server :)

```
           CPU0       CPU1       

  0:  334558800  368774720   IO-APIC-edge      timer

  1:   16698570          9   IO-APIC-edge      i8042

  8:          1          1   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:          0          4   IO-APIC-edge      i8042

 14:     634681     436830   IO-APIC-edge      ide0

 15:    1227436     718240   IO-APIC-edge      ide1

 16:   86082431   59423925   IO-APIC-fasteoi   eth0

 17:  339155214  320390828   IO-APIC-fasteoi   ohci_hcd:usb2, eth1

 18:          0          0   IO-APIC-fasteoi   libata

 19:   18419792   15351524   IO-APIC-fasteoi   libata

 20:   15879395   16773603   IO-APIC-fasteoi   libata

 21:     767036    4517244   IO-APIC-fasteoi   ehci_hcd:usb1

NMI:          0          0 

LOC:  703332108  703332107 

ERR:          1

MIS:          0
```

I'm not sure about the workarounds for the interrupt stuff; having only one CPU handle interrupts will bottleneck the system quite badly, especially during heavy I/O.

We really need to assemble a list of 'Linux-approved' motherboards and hardware in general...

----------

## xanas3712

I'm not sure honestly.  I do know the system runs better even before the interrupt change than the single cores here.  And I've not noticed an issue until it was brought up about it.

Anyhow, I didn't say my chipset earlier and forgot what motherboard I was using for some strange reason.  It's an intel 965P.  Gigabyte-965P-S3 if I recall correctly.

----------

## DirtyHairy

Running irqbalance changed things for me a bit

```
           CPU0       CPU1

  0:   36419287          0   IO-APIC-edge      timer

  1:      43461       3358   IO-APIC-edge      i8042

  3:     141920      27782   IO-APIC-edge      serial

  8:         70          0   IO-APIC-edge      rtc

  9:      66966       3981   IO-APIC-fasteoi   acpi

 12:    5993784     920334   IO-APIC-edge      i8042

 14:     472453     142750   IO-APIC-edge      libata

 15:        181          0   IO-APIC-edge      libata

 16:    2690653     499048   IO-APIC-fasteoi   yenta, uhci_hcd:usb2, fglrx

 21:          1          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5

 22:    1546324       5800   IO-APIC-fasteoi   uhci_hcd:usb3, ipw3945, HDA Intel

 23:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4

218:      34126      30765   PCI-MSI-edge      eth0

NMI:          0          0

LOC:   36419127   36245351

ERR:          0

MIS:          0

```

but I really didn't notice any performance differences... I suppose it is the case as with many optimizations: there ARE configurations where they may pay off, but for an ordinary user using an ordinary desktop configuration, the difference will be barely noticeable. I have been doing scientific calculations on that machine which involved much number crunching as well as heavy data transfers, and never noticed any sign of the system becoming laggy or unresponsive (without irqbalance, which I just tried out now); the system was very snappy even with both cores 100% loaded :)

----------

## Cyker

Yeah... Linux already has a fantastic scheduler, and most Linux users don't really need two cores anyway so I think for the most part the performance difference will be negligible.

If you're doing heavy processing (like this Folding@Home thing which seems to be popular ;) ), then the interrupts used will not be as bad since it actually does only very little I/O.

I suspect on my system it might have more of an effect because I *am* doing ridiculous amounts of I/O, especially between the RAID array and the two gigabit cards; and on systems that heavily use both cores for real-time stuff (I'm thinking FPS games and such) it could also suck pretty bad.

----------

## xanas3712

Maybe newer games, I'll only be able to test it maybe when unreal 2007 comes out.  With an 8800 GTS quake 4 runs flawlessly, but there aren't too many other linux games with high end graphics (and that one is aging quickly) to really know for sure.

----------

## Cyker

Yeah, true sadly :(

Damnit, if only NWN2 had been written using OpenGL instead of D3D!!!! >:(

This is all very speculative - I'm mostly talking out my proverbial because I haven't experienced any of this stuff; Only the people that have can really say...

----------

## DirtyHairy

To give my final 5 cents on this topic :) : I just tried out the doom3 demo and didn't notice any performance change compared to the unbalanced case; honestly, I also didn't expect to. Hardware interrupts are (as the name suggests and contrary to software interrupts) triggered by hardware which wants to signal the OS. The processor then interrupts whatever it is currently doing and jumps to an interrupt handler, which, after doing its work, transfers control back to the interrupted task. Typically, interrupts are generated when some peripheral has data waiting to be processed by the OS --- if you move a PS/2 mouse, you will see interrupts appearing in /proc/interrupts. However, most hardware handling huge chunks of data (e.g. harddisk controllers) writes data directly into RAM via DMA and only raises an interrupt now and then to signal that the buffer is full and waiting to be flushed. Therefore, on a non-pathological system, the processor will spend only a very small fraction of time handling interrupts, and so splitting interrupts between multiple CPUs shouldn't give a significant performance increase in most cases (although there are other things like cache invalidation which may make interrupt balancing interesting, but still, this shouldn't change the general picture).

To put emphasis on this once more: SMP is not directly connected to IRQ balancing, and any program that splits its computations between several processes or threads will benefit from a SMP system regardless of how IRQs are distributed between the cores. This also means that a program which does all its stuff in one thread won't run faster on a multicore system than it would on a single core. Also, a multicore system as a whole will be faster and more responsive because the tasks are distributed between the cores...
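
The claim above, that a healthy system spends only a small fraction of its time in interrupt handlers, can be checked per core from the `irq` and `softirq` columns of `/proc/stat`. This is a rough sketch under the same column-layout assumption (`user nice system idle iowait irq softirq steal`); `irq_ticks` and `irq_share` are hypothetical helper names, and the snapshots are invented numbers chosen to show a "pathological" CPU0.

```python
BEFORE = """\
cpu0 100 0 50 800 10 20 20 0
cpu1 100 0 50 850 0 0 0 0
"""

AFTER = """\
cpu0 150 0 60 820 10 60 60 0
cpu1 200 0 50 950 0 0 0 0
"""


def irq_ticks(stat_text):
    """Return {cpu_name: (irq + softirq ticks, total ticks)} per CPU row."""
    result = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep only rows named cpuN that have at least the first seven columns.
        if len(fields) < 8 or not fields[0].startswith("cpu"):
            continue
        if not fields[0][3:].isdigit():
            continue
        ticks = [int(value) for value in fields[1:9]]
        result[fields[0]] = (ticks[5] + ticks[6], sum(ticks))
    return result


def irq_share(before_text, after_text):
    """Fraction of elapsed ticks each CPU spent servicing interrupts."""
    before, after = irq_ticks(before_text), irq_ticks(after_text)
    share = {}
    for cpu, (irq_after, total_after) in after.items():
        irq_before, total_before = before[cpu]
        elapsed = total_after - total_before
        share[cpu] = (irq_after - irq_before) / elapsed if elapsed else 0.0
    return share


for cpu, share in irq_share(BEFORE, AFTER).items():
    print(f"{cpu}: {share:.0%} of elapsed time in interrupt handling")
```

A share of a few percent or less would match the "non-pathological" picture; a large share on one core would point to an interrupt problem rather than ordinary load.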

----------

## jesnow

 *DirtyHairy wrote:*   

> To give my final 5 cents on this topic :) : I just tried out the doom3 demo and didn't notice any performance change compared to the unbalanced case; honestly, I also didn't expect to. Hardware interrupts are (as the name suggests and contrary to software interrupts) triggered by hardware which wants to signal the OS. The processor then interrupts whatever it is currently doing and jumps to an interrupt handler, which, after doing its work, transfers control back to the interrupted task. Typically, interrupts are generated when some peripheral has data waiting to be processed by the OS --- if you move a PS/2 mouse, you will see interrupts appearing in /proc/interrupts. However, most hardware handling huge chunks of data (e.g. harddisk controllers) writes data directly into RAM via DMA and only raises an interrupt now and then to signal that the buffer is full and waiting to be flushed. Therefore, on a non-pathological system, the processor will spend only a very small fraction of time handling interrupts, and so splitting interrupts between multiple CPUs shouldn't give a significant performance increase in most cases (although there are other things like cache invalidation which may make interrupt balancing interesting, but still, this shouldn't change the general picture).
> 
> To put emphasis on this once more: SMP is not directly connected to IRQ balancing, and any program that splits its computations between several processes or threads will benefit from a SMP system regardless of how IRQs are distributed between the cores. This also means that a program which does all its stuff in one thread won't run faster on a multicore system than it would on a single core. Also, a multicore system as a whole will be faster and more responsive because the tasks are distributed between the cores...

 

You don't have the problem I was originally complaining about, as you don't have an nforce chipset. In my case, I can have a completely unresponsive machine with only one core maxed out because that is the core that is servicing interrupts. It is a pathology. One that is probably more widespread than you think, but your machine apparently doesn't have it. 

Cheers, 

Jon.

----------

## DirtyHairy

I just pointed out that not having IRQ balancing is no problem and should not hurt performance as severely as you describe it. You say one core is maxed out: what is your top output?

----------

## Akkara

Somewhat older Abit μGURU AN8-ultra (socket-939) seems to work.

cat /proc/interrupts

```
           CPU0       CPU1       

  0:    1510969  732269134   IO-APIC-edge      timer

  1:        692     254246   IO-APIC-edge      i8042

  8:          0          0   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:      56138   13966507   IO-APIC-edge      i8042

 14:          3        234   IO-APIC-edge      ide0

 15:          0         37   IO-APIC-edge      ide1

 16:      19574    7282662   IO-APIC-fasteoi   ICE1712

 18:      61134   21540577   IO-APIC-fasteoi   nvidia

 19:          0          3   IO-APIC-fasteoi   ohci1394

 20:          0          2   IO-APIC-fasteoi   ehci_hcd:usb1

 21:          0          0   IO-APIC-fasteoi   libata

 22:      14708    4402811   IO-APIC-fasteoi   libata

 23:     307512  154372258   IO-APIC-fasteoi   eth0

NMI:      28520      33686 

LOC:  733656959  733656887 

ERR:          0

```

lspci

```
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)

00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)

00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)

00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)

00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)

00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev a2)

00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev a3)

00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev a3)

00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)

00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)

00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)

00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)

00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)

00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)

00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration

00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map

00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller

00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control

01:00.0 VGA compatible controller: nVidia Corporation G70 [GeForce 7800 GT] (rev a1)

05:07.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)

05:08.0 Multimedia audio controller: VIA Technologies Inc. ICE1712 [Envy24] PCI Multi-Channel I/O Controller (rev 02)

```

cat /proc/cpuinfo

```
processor       : 0

vendor_id       : AuthenticAMD

cpu family      : 15

model           : 35

model name      : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+

stepping        : 2

cpu MHz         : 2200.000

cache size      : 1024 KB

physical id     : 0

siblings        : 2

core id         : 0

cpu cores       : 2

fpu             : yes

fpu_exception   : yes

cpuid level     : 1

wp              : yes

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy

bogomips        : 4421.67

TLB size        : 1024 4K pages

clflush size    : 64

cache_alignment : 64

address sizes   : 40 bits physical, 48 bits virtual

power management: ts fid vid ttp

processor       : 1

(...like above, omitted...)
```

----------

## jesnow

Maybe this is unique to Core 2 processors on nforce chipsets?

Jon.

----------

## onlinepancakes

Thanks for the info... Just tried cat /proc/interrupts and my CPU1 was all 0 except LOC... Did it on a live-cd, but still, shouldn't matter...

Hardware:

Intel E6600 at 2.4 GHz

Asus P5N32 E SLI Plus

Was gonna reinstall Gentoo but seeing that my second core won't even be touched why bother...

----------

## jesnow

 *onlinepancakes wrote:*   

> Thanks for the info... Just tried cat /proc/interrupts and my CPU1 was all 0 except LOC... Did it on a live-cd, but still, shouldn't matter...
> 
> Hardware:
> 
> Intel E6600 at 2.4ghz
> ...

 

I don't think a reinstall will help. In fact it's my opinion that the BIOS is broke, maybe not terminally so, but not working properly. When I do 'emerge --sync' on my machine, CPU0 goes to 100%, CPU1 to 0% and the machine is mostly unusable until it's done. When I do 'emerge -DNu world' both processors light up like a Christmas tree, but the machine is usable because gcc understands multiple threads on multiple cores. 

Of course I can use 'nice' but the point is, it's broke. 

J.

----------

## drescherjm

 *Quote:*   

> When I do 'emerge --sync' on my machine, CPU0 goes to 100%, CPU1 to 0% 

 

The weird part here is at least for me emerge sync is mostly disk bound so the cpu usage is like 30% max on a single processor.

 *Quote:*   

> and the machine is mostly unusable until it's done. 

 

Do you have low latency desktop selected in your kernel config?

Are you using raid of any type?

----------

## JeliJami

 *drescherjm wrote:*   

>  *Quote:*   When I do 'emerge --sync' on my machine, CPU0 goes to 100%, CPU1 to 0%  
> 
> The weird part here is at least for me emerge sync is mostly disk bound so the cpu usage is like 30% max on a single processor.
> 
> 

 

In my case, I get 100% on one CPU also, but with /usr/portage mounted in RAM (see TIP: Compressing portage using squashfs: initscript method). No I/O latency from any disk!

----------

## Cadynum

"cat /proc/interrupts" gave me 0's on CPU1 before installing irqbalance. Now it looks like this:

```

           CPU0       CPU1       

  0:    1623924          0   IO-APIC-edge      timer

  1:          2          0   IO-APIC-edge      i8042

  2:          0          0    XT-PIC-XT        cascade

 10:     123681      76683   IO-APIC-fasteoi   libata, ohci_hcd:usb2, ohci1394, nvidia

 11:      21295      70148   IO-APIC-fasteoi   libata, libata, ehci_hcd:usb1, HDA Intel

 12:          4          0   IO-APIC-edge      i8042

1277:     110962      64886   PCI-MSI-edge      eth_wan

1278:     105899      53638   PCI-MSI-edge      eth_lan

NMI:          0          0 

LOC:    1623618    1623582 

ERR:          0

```

Notice the first line: still 0 on CPU1. Is it supposed to be like that?

I can't say I noticed any performance improvement.

System:

Core 2 Duo

Evga 680i
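For context on what irqbalance changes: the kernel exposes each IRQ's allowed CPUs as a hexadecimal bitmask in `/proc/irq/<N>/smp_affinity`, and irqbalance works by adjusting those masks. The sketch below only converts between a mask string and a list of CPU numbers; the helper names `mask_to_cpus` and `cpus_to_mask` are made up for illustration, and whether a given chipset actually honors the mask is a separate question.

```python
# Minimal sketch: convert between an smp_affinity-style hex mask and CPU numbers.
# Helper names are illustrative, not part of any existing tool.

def mask_to_cpus(mask):
    """Return the CPU numbers whose bits are set in a hex affinity mask.

    Large systems print the mask as comma-separated 32-bit groups,
    most significant group first, so the commas are simply dropped.
    """
    value = int(mask.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def cpus_to_mask(cpus):
    """Build the hex mask string that allows exactly the given CPUs."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    return format(value, "x")


if __name__ == "__main__":
    print(mask_to_cpus("3"))   # both CPUs of a dual core allowed
    print(cpus_to_mask([1]))   # mask that allows only CPU1
```

Reading the current mask for, say, IRQ 10 would be `open("/proc/irq/10/smp_affinity").read()`; writing a new mask needs root.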

----------

## onlinepancakes

I have been doing some research on this issue, and from what I have read it has only been affecting Intel systems since 2003/2004. It started with Xeons on Intel's own chipsets and is now hitting the Core 2 Duo line on nForce 5 and 6 under Linux. From what I have seen, it started when the 2.6 kernel came out. One article that I found on Google showed this same thing happening on the 2.6 kernel with dual Xeons, yet on the 2.4 kernel it worked fine. It has also been happening on Pentium 4s with HT support: many P4 people complained that, with HT on, interrupts were not spread across CPUs without the use of an IRQ balancing daemon. So far this has been reported on all Intel processors except the Pentium 3s and Pentium Ds on the 2.6 kernel.

So from what I have read, this seems to be an issue only with Intel and the 2.6 kernel. Kernel.org needs to get on this, because if the article I read is right and it worked on the 2.4 kernel but not on the 2.6 kernel, then this is a big 2.6 kernel bug that needs to be fixed. It's not a new bug; it's an old bug that's been following 2.6 since it was released.

----------

## momerath

I'm spec'ing out a dual (maybe even quad) core system right now. This thread has me a bit scared.  On the one hand we seem to have at least one person (the thread originator) with what I would consider a 'problem' if it happened on my new system.  Onlinepancakes seems to have found evidence of this problem; my initial searching hasn't turned anything up.  Could you (pancakes) post some links?  Could anyone else with a core 2 duo (or better yet quad) on any chipset post their interrupts output and speak to your experiences in high-load situations?

----------

## jesnow

Yes, onlinepancakes, by all means please post the links, we need them. It's hard to tell what the scope of the problem really is, because so few people have the wherewithal to find out how interrupts are spread out over their processors. I recommend xosview, which shows both cpu load and interrupts for both cores. 

That said, it *seems* like an issue limited to certain nvidia chipsets -- anybody who can post the output of

cat /proc/interrupts where it's working is certainly invited to do so. 

But even still, it only affects the distribution of interrupt handling cycles -- programs like gcc that explicitly make use of both cores still get the full benefit of both. 

J
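For anyone who wants a quick number instead of eyeballing the table, here is a minimal sketch that sums the per-CPU counts of the numbered (device) IRQs in `/proc/interrupts`-style text. The function names are made up for illustration, and the embedded sample is a shortened copy of the output from the first post; rows such as `LOC` and `NMI` are parsed but left out of the device totals.

```python
# Minimal sketch: per-CPU totals for numbered IRQs in /proc/interrupts text.
# Function names are illustrative; SAMPLE is abbreviated from the first post.

SAMPLE = """\
           CPU0       CPU1
  0:     548400          0    XT-PIC-XT        timer
  1:       1328          0   IO-APIC-edge      i8042
 16:       5672          0   IO-APIC-fasteoi   eth0
NMI:          0          0
LOC:     536748     537048
ERR:          1
"""


def parse_interrupts(text):
    """Map each row label ('0', '16', 'LOC', ...) to its per-CPU counts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        counts = rest.split()[:ncpu]
        # Rows like ERR/MIS carry a single global counter; skip them.
        if len(counts) < ncpu or not all(c.isdigit() for c in counts):
            continue
        table[label.strip()] = [int(c) for c in counts]
    return table


def device_irq_totals(table):
    """Sum counts per CPU over numbered IRQs only (no LOC, NMI, ...)."""
    if not table:
        return []
    totals = [0] * len(next(iter(table.values())))
    for label, counts in table.items():
        if label.isdigit():
            for cpu, count in enumerate(counts):
                totals[cpu] += count
    return totals


def shares(totals):
    """Fraction of all device interrupts handled by each CPU."""
    grand = sum(totals)
    return [t / grand if grand else 0.0 for t in totals]


if __name__ == "__main__":
    totals = device_irq_totals(parse_interrupts(SAMPLE))
    for cpu, (count, share) in enumerate(zip(totals, shares(totals))):
        print(f"CPU{cpu}: {count} device interrupts ({share:.1%})")
```

On a live system, pass `open("/proc/interrupts").read()` instead of `SAMPLE`.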

----------

## BonesToo

Is this thread dead now?  I'm looking into getting a Core 2 Duo with an nforce 680 mobo.  No replies on this topic for a month now. Has this been solved, or is it just a dead topic?  :(

----------

## drescherjm

I believe there is no solution, but there are ways to reduce the effect. It's not just Core 2 and not just nvidia: I have neither, and my dual-processor Opteron with an AMD chipset significantly favors one CPU over the other when it comes to hardware interrupts. I am not convinced that perfect balance would bring a large visible performance improvement, so I would say don't let this be the reason not to upgrade.

----------

## jesnow

I have an nforce 570 motherboard, and I can't seem to get much traction in getting other nforce users to type 'cat /proc/interrupts' for me. It seems to be an esoteric bug, and thus unlikely to be fixed anytime soon.

But I have no idea if it applies to your situation.  

This one is in the "definition of 'works'" category -- some people will say it 'works' and give it a gold star as long as it boots and they can do their thing. Even if it means using VESA mode, no dma on the disk drives, all the interrupts served by one cpu, etc. I call it a bug if (as is happening right now as I write) one cpu is maxed and one is idling because of some interrupt-heavy thing that's going on. 

Nothing much we can do about that. Try to get someone who has the configuration you want to check their interrupt distribution across processors and report back here!

Cheers, 

Jon.

----------

## drescherjm

I don't have any 570s with dual processors (or cores) running Linux. I do have single-CPU configs and duals (and quads) running Windows, but that won't help... Here is another AMD chipset dual Opteron:

# cat /proc/interrupts

           CPU0       CPU1

  0:    1960323  640900007   IO-APIC-edge      timer

  1:       1021     145473   IO-APIC-edge      i8042

  8:          0          0   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:       7910    1150622   IO-APIC-edge      i8042

 15:      13105    9216776   IO-APIC-edge      ide1

 16:      48820    7394848   IO-APIC-fasteoi   libata

 18:     135917   27847490   IO-APIC-fasteoi   eth0

 19:      23495    6395321   IO-APIC-fasteoi   libata, ohci_hcd:usb1, ohci_hcd:usb2

 24:   19188669 2960769151   IO-APIC-fasteoi   aic79xx, eth1

 25:          0         17   IO-APIC-fasteoi   aic79xx, eth2

 28:     531451   58196065   IO-APIC-fasteoi   sx8

 29:     364810   38256146   IO-APIC-fasteoi   sx8

NMI:      33373     133471

LOC:  642626810  642626216

ERR:          0

I believe the motherboard is a TYAN Thunder K8S 2882.

----------

## jesnow

Thanks, of course, but the thread is about nvidia + Core 2...

J.

----------

## Genewb

 *momerath wrote:*   

> I'm spec'ing out a dual (maybe even quad) core system right now. This thread has me a bit scared.  On the one hand we seem to have at least one person (the thread originator) with what I would consider a 'problem' if it happened on my new system.  Onlinepancakes seems to have found evidence of this problem; my initial searching hasn't turned anything up.  Could you (pancakes) post some links?  Could anyone else with a core 2 duo (or better yet quad) on any chipset post their interrupts output and speak to your experiences in high-load situations?

 

In case this is useful:

```

           CPU0       CPU1       CPU2       CPU3       

  0:   14121462          0          0          0   IO-APIC-edge      timer

  1:      18200          0          0       2112   IO-APIC-edge      i8042

  9:          0          0          0          0   IO-APIC-fasteoi   acpi

 12:          6          0          0          0   IO-APIC-edge      i8042

 16:     745809          0          0     109136   IO-APIC-fasteoi   nvidia

 20:          2          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1

 21:     704860          0          0     103744   IO-APIC-fasteoi   sata_nv, HDA Intel

 22:    1389678          0          0     190090   IO-APIC-fasteoi   sata_nv, eth1

 23:     284555          0          0      41198   IO-APIC-fasteoi   sata_nv, ohci_hcd:usb2

NMI:          0          0          0          0 

LOC:   14101945   14106701   14101859   14106525 

ERR:          0

```

As for I/O load, it's never been under heavy I/O load, so I wouldn't know. Q6600 with a Striker Extreme (680i - C55 + MCP55), fyi.

----------

## jspaces

Hi, here is my system's output from an ASUS M2N SLI Deluxe motherboard with the nForce 570 SLI MCP chipset.

The kernel is the latest amd64 version available in portage at this moment, "2.6.20-gentoo-r8".

The cpu information >

# cat /proc/cpuinfo

processor       : 0

vendor_id       : AuthenticAMD

cpu family      : 15

model           : 67

model name      : AMD Athlon(tm) 64 X2 Dual Core Processor 5000+

stepping        : 2

cpu MHz         : 2612.037

cache size      : 512 KB

physical id     : 0

siblings        : 2

core id         : 0

cpu cores       : 2

fpu             : yes

fpu_exception   : yes

cpuid level     : 1

wp              : yes

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm cr8_legacy

bogomips        : 5228.45

TLB size        : 1024 4K pages

clflush size    : 64

cache_alignment : 64

address sizes   : 40 bits physical, 48 bits virtual

power management: ts fid vid ttp tm stc

processor       : 1

vendor_id       : AuthenticAMD

cpu family      : 15

model           : 67

model name      : AMD Athlon(tm) 64 X2 Dual Core Processor 5000+

stepping        : 2

cpu MHz         : 2612.037

cache size      : 512 KB

physical id     : 0

siblings        : 2

core id         : 1

cpu cores       : 2

fpu             : yes

fpu_exception   : yes

cpuid level     : 1

wp              : yes

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm cr8_legacy

bogomips        : 5224.16

TLB size        : 1024 4K pages

clflush size    : 64

cache_alignment : 64

address sizes   : 40 bits physical, 48 bits virtual

power management: ts fid vid ttp tm stc

Code without irqbalance installed >

# cat /proc/interrupts

           CPU0       CPU1       

  0:     118957   53056369   IO-APIC-edge      timer

  1:        137      35368   IO-APIC-edge      i8042

  6:          0          5   IO-APIC-edge      floppy

  7:          0          0   IO-APIC-edge      parport0

  8:          0          1   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:          0          4   IO-APIC-edge      i8042

 14:       1612     541310   IO-APIC-edge      ide0

 16:       9521    4633677   IO-APIC-fasteoi   nvidia

 18:        336      88904   IO-APIC-fasteoi   EMU10K1

 19:          0          6   IO-APIC-fasteoi   ohci1394, ohci1394

 20:          0          0   IO-APIC-fasteoi   libata

 21:          0          0   IO-APIC-fasteoi   libata

 22:      12717    4170287   IO-APIC-fasteoi   libata, ohci_hcd:usb2

 23:          0          2   IO-APIC-fasteoi   ehci_hcd:usb1

313:      25708    9099180   PCI-MSI-edge      eth0

NMI:      11399      12101 

LOC:   53178761   53178264 

ERR:          0

Code with irqbalance running >

# cat /proc/interrupts

           CPU0       CPU1

  0:     119446   53718365   IO-APIC-edge      timer

  1:        229      35555   IO-APIC-edge      i8042

  6:          0          5   IO-APIC-edge      floppy

  7:          0          0   IO-APIC-edge      parport0

  8:          0          1   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 12:          0          4   IO-APIC-edge      i8042

 14:       3933     544947   IO-APIC-edge      ide0

 16:      64860    4635815   IO-APIC-fasteoi   nvidia

 18:       3669      88905   IO-APIC-fasteoi   EMU10K1

 19:          0          6   IO-APIC-fasteoi   ohci1394, ohci1394

 20:          0          0   IO-APIC-fasteoi   libata

 21:          0          0   IO-APIC-fasteoi   libata

 22:      13152    4180508   IO-APIC-fasteoi   libata, ohci_hcd:usb2

 23:          0          2   IO-APIC-fasteoi   ehci_hcd:usb1

313:      25710    9209566   PCI-MSI-edge      eth0

NMI:      11421      12132

LOC:   53841289   53840792

ERR:          0
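The raw totals hide what irqbalance actually did, because most of the counts piled up before it was started. Subtracting the first snapshot from the second shows where the new interrupts landed. This is a minimal sketch, assuming both snapshots come from the same boot (the counters only grow between them); the figures are copied from the two listings above, and the helper names are made up for illustration.

```python
# Minimal sketch: where did interrupts land between two snapshots?
# (CPU0, CPU1) counts copied from the two /proc/interrupts listings above;
# assumes both were taken during the same boot.

BEFORE = {
    "timer":   (118957, 53056369),
    "ide0":    (1612, 541310),
    "nvidia":  (9521, 4633677),
    "EMU10K1": (336, 88904),
    "eth0":    (25708, 9099180),
}

AFTER = {
    "timer":   (119446, 53718365),
    "ide0":    (3933, 544947),
    "nvidia":  (64860, 4635815),
    "EMU10K1": (3669, 88905),
    "eth0":    (25710, 9209566),
}


def deltas(before, after):
    """Per-CPU increase for every source present in both snapshots."""
    return {
        name: tuple(a - b for a, b in zip(after[name], before[name]))
        for name in before
        if name in after
    }


def cpu0_share(delta):
    """Fraction of the new interrupts that were handled by CPU0."""
    total = sum(delta)
    return delta[0] / total if total else 0.0


if __name__ == "__main__":
    for name, delta in deltas(BEFORE, AFTER).items():
        print(f"{name:8s} CPU0 +{delta[0]:<8d} CPU1 +{delta[1]:<8d} "
              f"CPU0 share {cpu0_share(delta):.1%}")
```

By these numbers, nvidia and EMU10K1 moved almost entirely to CPU0 once irqbalance was running, while the timer and eth0 stayed on CPU1.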

----------

## drescherjm

 *Quote:*   

> Thanks, of course, but the thread is about nvidia + Core 2...

 

My point was to show the problem is not limited to nvidia or core 2.

----------

## jesnow

Finally, ASUS came out with a BIOS update, after which everything works, and irqbalance does what it is supposed to:

 # cat /proc/interrupts

           CPU0       CPU1

  0:        254          0   IO-APIC-edge      timer

  1:          9      20224   IO-APIC-edge      i8042

  6:          5          0   IO-APIC-edge      floppy

  7:          0          0   IO-APIC-edge      parport0

  8:          2          0   IO-APIC-edge      rtc

  9:          0          0   IO-APIC-fasteoi   acpi

 14:     323048          0   IO-APIC-edge      libata

 15:        394    1861136   IO-APIC-edge      ide1

 16:          0          0   IO-APIC-fasteoi   uhci_hcd:usb2

 17:         36    1127517   IO-APIC-fasteoi   uhci_hcd:usb5, HDA Intel

 18:      70948          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb3

 19:        157     713538   IO-APIC-fasteoi   uhci_hcd:usb4

 20:         93     174004   IO-APIC-fasteoi   eth3

NMI:          0          0

LOC:   49283889   52025358

ERR:          0

MIS:          0

This isn't perfect: note that libata is still only handled on CPU0, but that's all right.

Cheers, 

Jon.

----------

