# nFORCE2 + 2.6.x == system lock-ups

## semiSfear

Hello all

I know there is a reason why gentoo-dev-sources are called development sources. But tonight I lost a whole day's worth of very important work.

I know there have been a lot of problems with nFORCE mobos and 2.6 kernels, but those are easily resolved by passing these kernel options: noapic nolapic

Still, sometimes my system freezes. When I'm in X everything freezes: keyboard, mouse, etc. But if I'm playing music, the music keeps on playing. (?)

With 2.6.1-gentoo-r1 this only happened when I was copying large amounts of data to or from an SMB server.

Then I switched to 2.6.3-gentoo-r1. I left the computer running overnight, and when I woke up to resume work the computer had frozen. The same day I switched to the new kernel! The jobs I left running during the night were some compilation, paperwork (OO.org) and XMMS playing music.

My question is: has anyone else had similar or exactly the same problems? This is what my log says:

```
Mar 24 06:28:37 mist Badness in pci_find_subsys at drivers/pci/search.c:167
Mar 24 06:28:37 mist Call Trace:
Mar 24 06:28:37 mist [<c01e5d08>] pci_find_subsys+0xe8/0xf0
Mar 24 06:28:37 mist [<c01e5d3f>] pci_find_device+0x2f/0x40
Mar 24 06:28:37 mist [<c01e5b48>] pci_find_slot+0x28/0x50
Mar 24 06:28:37 mist [<e1b1303d>] os_pci_init_handle+0x35/0x62 [nvidia]
Mar 24 06:28:37 mist [<e1b258ff>] __nvsym00057+0x1f/0x24 [nvidia]
Mar 24 06:28:37 mist [<e1c35e58>] __nvsym04875+0xf8/0x170 [nvidia]
Mar 24 06:28:37 mist [<e1c35c2a>] __nvsym00780+0x21a/0x224 [nvidia]
Mar 24 06:28:37 mist [<e1bcd824>] __nvsym03928+0x70/0x98 [nvidia]
Mar 24 06:28:37 mist [<e1bcd3b5>] __nvsym00610+0x67d/0x954 [nvidia]
Mar 24 06:28:37 mist [<e1bfc80a>] __nvsym00688+0x16a/0x338 [nvidia]
Mar 24 06:28:37 mist [<e1b28009>] __nvsym00827+0xd/0x1c [nvidia]
Mar 24 06:28:37 mist [<e1b296a4>] rm_isr_bh+0xc/0x10 [nvidia]
Mar 24 06:28:37 mist [<c0123ef6>] tasklet_action+0x46/0x70
Mar 24 06:28:37 mist [<c0123d10>] do_softirq+0x90/0xa0
Mar 24 06:28:37 mist [<c010d0ad>] do_IRQ+0xfd/0x130
Mar 24 06:28:37 mist [<c010b3f4>] common_interrupt+0x18/0x20
Mar 24 06:28:37 mist [<c01e3990>] __copy_to_user_ll+0x0/0x70
Mar 24 06:28:37 mist [<c0123403>] sys_gettimeofday+0x53/0xb0
Mar 24 06:28:37 mist [<c010aa87>] syscall_call+0x7/0xb
Mar 24 06:28:37 mist
Mar 24 06:28:37 mist Badness in pci_find_subsys at drivers/pci/search.c:167
Mar 24 06:28:37 mist Call Trace:
Mar 24 06:28:37 mist [<c01e5d08>] pci_find_subsys+0xe8/0xf0
Mar 24 06:28:37 mist [<c01e5d3f>] pci_find_device+0x2f/0x40
Mar 24 06:28:37 mist [<c01e5b48>] pci_find_slot+0x28/0x50
Mar 24 06:28:37 mist [<e1b1303d>] os_pci_init_handle+0x35/0x62 [nvidia]
Mar 24 06:28:37 mist [<e1b258ff>] __nvsym00057+0x1f/0x24 [nvidia]
Mar 24 06:28:37 mist [<e1bb9f32>] __nvsym03763+0x72/0xe0 [nvidia]
Mar 24 06:28:37 mist [<e1bfea41>] __nvsym04466+0x15/0x78 [nvidia]
Mar 24 06:28:37 mist [<e1c35e87>] __nvsym04875+0x127/0x170 [nvidia]
Mar 24 06:28:37 mist [<e1c35c2a>] __nvsym00780+0x21a/0x224 [nvidia]
Mar 24 06:28:37 mist [<e1bcd824>] __nvsym03928+0x70/0x98 [nvidia]
Mar 24 06:28:37 mist [<e1bcd3b5>] __nvsym00610+0x67d/0x954 [nvidia]
Mar 24 06:28:37 mist [<e1bfc80a>] __nvsym00688+0x16a/0x338 [nvidia]
Mar 24 06:28:37 mist [<e1b28009>] __nvsym00827+0xd/0x1c [nvidia]
Mar 24 06:28:37 mist [<e1b296a4>] rm_isr_bh+0xc/0x10 [nvidia]
Mar 24 06:28:37 mist [<c0123ef6>] tasklet_action+0x46/0x70
Mar 24 06:28:37 mist [<c0123d10>] do_softirq+0x90/0xa0
Mar 24 06:28:37 mist [<c010d0ad>] do_IRQ+0xfd/0x130
Mar 24 06:28:37 mist [<c010b3f4>] common_interrupt+0x18/0x20
Mar 24 06:28:37 mist [<c01e3990>] __copy_to_user_ll+0x0/0x70
Mar 24 06:28:37 mist [<c0123403>] sys_gettimeofday+0x53/0xb0
Mar 24 06:28:37 mist [<c010aa87>] syscall_call+0x7/0xb
Mar 24 06:28:37 mist
```
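To see at a glance which module a trace like this passes through, the frames can be tallied by their `[module]` tag. This is only a rough sketch: the function name `frames_by_module` is made up for illustration, and the regex assumes the 2.6-style `[<addr>] symbol+0xoff/0xlen [module]` layout seen in the log above.

```python
import re
from collections import Counter

# Matches call-trace frames such as
#   [<e1b1303d>] os_pci_init_handle+0x35/0x62 [nvidia]
# The trailing [module] tag is optional; core kernel frames have none.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\][ \t]+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?:[ \t]+\[(\w+)\])?"
)

def frames_by_module(log_text):
    """Count call-trace frames per module; untagged frames count as 'kernel'."""
    counts = Counter()
    for match in FRAME.finditer(log_text):
        counts[match.group(2) or "kernel"] += 1
    return counts

# Four lines copied from the trace above.
sample = """\
Mar 24 06:28:37 mist [<c01e5b48>] pci_find_slot+0x28/0x50
Mar 24 06:28:37 mist [<e1b1303d>] os_pci_init_handle+0x35/0x62 [nvidia]
Mar 24 06:28:37 mist [<e1b296a4>] rm_isr_bh+0xc/0x10 [nvidia]
Mar 24 06:28:37 mist [<c0123ef6>] tasklet_action+0x46/0x70
"""
print(dict(frames_by_module(sample)))
```

A tally like this only says where the warning was raised from; it does not prove that the same code path causes the freeze.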

My system specs:

Abit NF7 nFORCE2 mobo

NVIDIA GeForce4 Ti4600

512 MB DDR

AMD Athlon XP 2200+

Also worth noting: I don't use the nvidia-net drivers but the so-called reverse-engineered network driver for my nFORCE2 on-board net adapter.

Thanks in advance

----------

## semiSfear

I google'd for this problem and found this:

 *Quote:*   

> This is caused by the closed source nvidia driver. Only nvidia can debug this problem.

 

Which is a bummer, since I never had this problem before with the 2.4.x kernel series.

If someone can find more information on this matter it would be great.

This feels so bad. I mean, what's the point in boasting that Linux is stable when I can't even leave my computer on overnight without losing important work   :Confused: 

Now, I'm not saying this happens often, but on the very same day I upgrade my kernel?   :Mad: 

It kinda sucks...

----------

## stahlsau

Try disabling "preemptible kernel" in your .config.

Worked for me. If it doesn't work, try using the "nv" driver in your XF86Config and wait to see if it locks up again...
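A quick way to double-check what the .config really says about preemption, sketched in Python. The helper name `kconfig_value` is made up, and I'm assuming the option is stored as `CONFIG_PREEMPT`, with a disabled option written as a `# ... is not set` comment.

```python
import re

def kconfig_value(config_text, symbol):
    """Return the value of a .config symbol ('y', 'm', ...) or None if it is not set."""
    pattern = re.compile(r"^%s=(.*)$" % re.escape(symbol), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None

# Made-up .config excerpt for illustration.
sample_config = """\
CONFIG_X86=y
# CONFIG_PREEMPT is not set
CONFIG_MK7=y
"""
print(kconfig_value(sample_config, "CONFIG_PREEMPT"))  # None means preemption is off

# On a real box (path assumed, adjust to your kernel tree):
# print(kconfig_value(open("/usr/src/linux/.config").read(), "CONFIG_PREEMPT"))
```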

----------

## jonnevers

I have a similar setup (nf7-s, 2400+, nvidia gfx) and I have gotten great results from using love-sources for the kernel and, recently, the 5336 Nvidia drivers. When I first got the board (some time ago) love-sources was the only kernel that had all the correct nforce/amd drivers... it also has (or used to have) APIC patches specifically for nforce boards.

There are a bunch of kernel options related to nforce and/or amd that need to be checked. 

(and i have preempt turned on)

-Jon

----------

## semiSfear

After reading a very interesting thread on the nvnews.net forum I realize this problem is common, and there is no "real" solution to it. It seems to be due to a combination of different factors:

1. Hardware, what mobo you use, nFORCE or no nFORCE

2. CPU, 64bit or 32bit CPU

3. Framebuffer support in kernel, vesa or other.

4. APIC support, enabled or disabled depending on what CPU you've got (64 or 32), and on whether you've got an nFORCE mobo or not.

5. Kernel series, 2.6 or 2.4

Kernel developers say that this isn't a kernel bug, and NVIDIA developers haven't said anything. There is no straight answer or workaround to this problem. It's a trade-off: disable some options and hope the problem gets less frequent.

For example, like you say, stahlsau, I can try using the nv driver instead of the nvidia driver/module. That may be more stable, but performance will get worse. The same goes for the preemptible kernel. Disabling the framebuffer in the kernel costs less, though, since the framebuffer is a cosmetic thing at boot (worthless but pretty). But that may not have any effect on the problem. Some have experienced less frequent lock-ups after disabling the framebuffer, others have not.

I guess I can only try to find a good balance between disabling some options to make the problem less frequent and still having some speed. So I shout out to the NVIDIA developers, "Fix your damn drivers!!"   :Laughing:   :Laughing: 

----------

## semiSfear

 *jonnevers wrote:*   

> 
> 
> There are a bunch of kernel options related to nforce and/or amd that need to be checked. 
> 
> -Jon

 

Could you be more specific? Which options, and what are they called?

----------

## stahlsau

hi

Since I disabled the preemptible kernel I haven't noticed anything going slower than before, and I tested lots of options about nforce, nvidia, apic, acpi... so switching preemptible kernel to "no" is afaik the only option that works for me. No more lockups, no slowdown... can't think of anything better  :Smile: 

----------

## _Nomad_

First of all, I too have a similar setup, and I don't have any lockups at all...

Mainly this is due to the nforce2-apic and nforce2-disconnect-quirk patches that are floating around... These have on some occasions been included in love-sources, and that is why most ppl have had good results using it. 

As far as I know, the acpi (note, NOT APIC) issues that some ppl have claimed to be experiencing are also related to this. And disabling preempt has been known to work in some cases, although I've never had any problems using it. 

Now, I don't know if these patches are included in the latest love-sources release and I don't seem to remember where to get them. However if you would like to patch your kernel manually, just PM me and I'll mail them to you...

Ohh... and also this is a great resource for nforce mobo's

----------

## semiSfear

Thanks for the tip, man. I am seriously considering trying out love-sources because everyone seems to talk about it   :Very Happy: 

Just have to decide which one of them:

linux-2.6.5-rc1-love1 aka "Endless Love"

or

linux-2.6.4-love1 aka "Angry Love Aardvark"

----------

## _Nomad_

why not 2.6.5_rc2-love1 aka "I Want To Do The Latin Hustle"   :Laughing: 

----------

## jonnevers

Processor type and features  --->        

    Processor family (Athlon/Duron/K7)  --->

        (X) Athlon/Duron/K7

 ATA/ATAPI/MFM/RLL support  --->

         <*>         AMD and nVidia IDE support

Networking support  ---> 

    Ethernet (10 or 100Mbit)  --->

        <*>   Reverse Engineered nForce Ethernet support 

Device Drivers  ---> 

    Character devices  --->

        <*>   NVIDIA nForce/nForce2 chipset support 

I think those are all of the important ones for nforce..
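For anyone who would rather grep the .config than click through menuconfig, here is a rough Python sketch. The menu-label to symbol mapping is my assumption from 2.6-era kernels (check the Help entry of each option in menuconfig to confirm), and `missing_options` is just a made-up helper name.

```python
# Menu label -> .config symbol. These symbol names are an assumption,
# not something taken from this thread; verify them in menuconfig.
NFORCE_OPTIONS = {
    "Athlon/Duron/K7": "CONFIG_MK7",
    "AMD and nVidia IDE support": "CONFIG_BLK_DEV_AMD74XX",
    "Reverse Engineered nForce Ethernet support": "CONFIG_FORCEDETH",
    "NVIDIA nForce/nForce2 chipset support": "CONFIG_AGP_NVIDIA",
}

def missing_options(config_text, options=NFORCE_OPTIONS):
    """Return the menu labels whose symbol is neither =y nor =m in the .config text."""
    enabled = set()
    for line in config_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and value in ("y", "m"):
            enabled.add(name)
    return [label for label, symbol in options.items() if symbol not in enabled]

# Made-up .config excerpt for illustration.
sample_config = """\
CONFIG_MK7=y
CONFIG_FORCEDETH=m
# CONFIG_AGP_NVIDIA is not set
"""
print(missing_options(sample_config))
```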

----------

## bushwakko

I had lockups in 2.6 when I got my new nforce2 motherboard. I disabled APIC and it works like a rock! :)

...stable as a rock. works like a computer should ;)
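If you disable APIC with the boot options from the first post (noapic nolapic), it is worth confirming they actually reached the kernel. A small sketch, assuming you feed it the contents of /proc/cmdline; the function name `missing_boot_options` is made up for illustration.

```python
def missing_boot_options(cmdline, wanted=("noapic", "nolapic")):
    """Return the wanted options that are not whole words on the kernel command line."""
    present = set(cmdline.split())
    return [opt for opt in wanted if opt not in present]

# Made-up command line for illustration.
print(missing_boot_options("root=/dev/hda3 noapic"))

# On a real box:
# print(missing_boot_options(open("/proc/cmdline").read()))
```

An empty list means both options were passed at boot.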

----------

## mattjgalloway

Check that you don't have too many things sharing IRQs

I had LOTS of lockups using an nForce 2 mobo with any kernel. The sound, ethernet and USB mouse would freeze, but everything else would carry on. My problem? The controllers for these 3 things were sharing one IRQ. I'd check this first.

Do:

```
lspci -v
```

or

```
cat /proc/pci
```

to see what is taking which IRQ. In your case look for the ethernet card and maybe the mouse or something. Basically whatever locks up. Then try fiddling with the BIOS to see if you can force them onto different IRQs. I fixed mine by disabling the parallel port and 1 serial port. (I wasn't using them anyway.)
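The check can also be scripted against /proc/interrupts, where devices sharing an IRQ show up comma-separated on the same line. A minimal sketch, assuming the 2.6-style layout (IRQ number, one count per CPU, controller type, device list); `shared_irqs` is a made-up name and the sample text is invented, not taken from a real machine.

```python
def shared_irqs(interrupts_text):
    """Map IRQ number -> list of devices, for IRQs used by more than one device."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        # Fields: "N:", one count per CPU, controller type, then the device list.
        fields = line.split(None, cpu_count + 2)
        irq = fields[0].rstrip(":") if fields else ""
        if len(fields) < cpu_count + 3 or not irq.isdigit():
            continue  # skips NMI/LOC/ERR rows and blank lines
        devices = [d.strip() for d in fields[-1].split(",")]
        if len(devices) > 1:
            shared[int(irq)] = devices
    return shared

# Invented example with three devices on IRQ 5.
sample = """\
           CPU0
  0:      71456          XT-PIC  timer
  5:       9021          XT-PIC  eth0, ohci_hcd, NVidia nForce2
 14:       2256          XT-PIC  ide0
NMI:          0
"""
print(shared_irqs(sample))

# On a real box:
# print(shared_irqs(open("/proc/interrupts").read()))
```

An empty result does not rule out other causes; it only says nothing is doubled up on one line.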

----------

## Admiral LSD

My EPoX 8RDA+, also nForce2 based, hasn't had this lock up issue since 2.4.22  :Razz: 

edit: also forgot to mention that I'm running ACPI, APIC and gentoo-dev-sources  :Wink: 

----------

## boorad

It may be a combination of things, like the top of the thread says, but I have had success compiling all sound as modules, not into the kernel.

```
Device Drivers --->
    Character devices --->
        <M> Enhanced Real Time Clock Support
    Sound --->
        <*> Sound card support
            Advanced Linux Sound Architecture --->
                <M> Advanced Linux Sound Architecture
                <M> Sequencer support
                < >   Sequencer dummy client
                <M> OSS Mixer API
                <M> OSS PCM (digital audio) API
                [*] OSS Sequencer API
                <M> RTC Timer support
```

and in your /etc/modules.autoload.d/kernel-2.6 file, load the modules:

```
# sound modules
snd-ac97-codec
snd-intel8x0
snd-mixer-oss
snd-pcm-oss
snd-seq-oss
snd-seq-midi
snd-rtctimer
```

I have had a freeze-free day after doing this.

"tsigo" got me on the right track in this thread: 

https://forums.gentoo.org/viewtopic.php?t=135549
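To confirm the sound modules from the autoload list really got loaded after a reboot, they can be compared against /proc/modules. A rough sketch: `not_loaded` is a made-up name, the sample text is invented, and I'm assuming the module name is the first field of each /proc/modules line. Hyphens and underscores are treated as the same character, since the two spellings get mixed.

```python
def not_loaded(wanted, proc_modules_text):
    """Return the wanted module names that do not appear in /proc/modules text."""
    loaded = {
        line.split()[0].replace("-", "_")
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in wanted if name.replace("-", "_") not in loaded]

# Invented /proc/modules excerpt for illustration.
sample = """\
snd_intel8x0 30120 1 - Live 0xe0a1c000
snd_ac97_codec 62340 1 snd_intel8x0, Live 0xe0a0b000
"""
print(not_loaded(["snd-ac97-codec", "snd-intel8x0", "snd-mixer-oss"], sample))

# On a real box:
# print(not_loaded(["snd-intel8x0"], open("/proc/modules").read()))
```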

----------

## _Adik_

I have quite a similar problem; I've described it here: 

https://forums.gentoo.org/viewtopic.php?t=153893

I have a Gigabyte 7N400 Pro2 motherboard. I've tried everything, but the lockups still occur...

- love-sources does not help much

- disabling ACPI and APIC does not help much

- disabling the preemptible kernel does not help much

- changing the cooler to a Volcano 12 does not help much

- updating the BIOS to the newest version does not help much

Any ideas?

----------

## mattjgalloway

Seriously - check shared IRQ things. Try disabling as much as you can in the BIOS, e.g. onboard sound, LAN, parallel port, serial ports, etc. Then see if it still locks. If it doesn't, then you know it's a shared IRQ problem. Well, at least it's likely to be...

----------

## _Adik_

 *mattjgalloway wrote:*   

> Seriously - check shared IRQ things. Try disabling as much as you can in the BIOS, e.g. onboard sound, LAN, parallel port, serial ports, etc. Then see if it still locks. If it doesn't, then you know it's a shared IRQ problem. Well, at least it's likely to be...

 

You may be right, because when I turn off my integrated rtl8169 gigabit NIC my system is stable as a rock. But there is a question... WHAT DO I HAVE TO DO NOW? Is there a solution for this, or do I have to buy a new NIC?

----------

## mattjgalloway

Go into your BIOS and try fiddling with your settings. If you have a Giga-Byte mobo, then you'll probably need to press Ctrl-F1 when you get into the BIOS settings to do anything good.

Now look around in the Integrated Peripherals section and try altering some of the IRQs. I really can't determine what you EXACTLY need to do, because I can't see your settings currently. For me, I disabled my parallel port and one serial port as I didn't need them. Basically - disable what you don't need. Then try changing some IRQs if you still get lockups.

----------

## _Adik_

 *mattjgalloway wrote:*   

> Go into your BIOS and try fiddling with your settings. If you have a Giga-Byte mobo, then you'll probably need to press Ctrl-F1 when you get into the BIOS settings to do anything good.
> 
> Now look around in the Integrated Peripherals section and try altering some of the IRQs. I really can't determine what you EXACTLY need to do, because I can't see your settings currently. For me, I disabled my parallel port and one serial port as I didn't need them. Basically - disable what you don't need. Then try changing some IRQs if you still get lockups.

 

I've been trying to set the IRQs but I cannot get it to be stable; cat /proc/interrupts always shows me errors...

----------

## mattjgalloway

What exactly is your /proc/interrupts saying?

----------

## _Adik_

 *mattjgalloway wrote:*   

> What exactly is your /proc/interrupts saying?

 

This is what I get when everything with IRQs is set to auto in the BIOS:

```
           CPU0       
  0:      71456          XT-PIC  timer
  1:        213          XT-PIC  i8042
  2:          0          XT-PIC  cascade
 12:         84          XT-PIC  i8042
 14:       2256          XT-PIC  ide0
 15:        453          XT-PIC  eth0
NMI:          0 
LOC:      71412 
ERR:         93
```

and something like this when I set the IRQs manually:

```
           CPU0       
  0:      76810          XT-PIC  timer
  1:        109          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  3:        522          XT-PIC  eth0
 12:         84          XT-PIC  i8042
 14:       2740          XT-PIC  ide0
NMI:          0 
LOC:      76766 
ERR:         91
```

any ideas?

P.S. I've disabled everything I don't need in the BIOS...

----------

## mattjgalloway

Umm, looks fine to me... nothing's being shared...

----------

## _Adik_

 *mattjgalloway wrote:*   

> Umm, looks fine to me... nothing's being shared...

 

yeah, but it still locks...

and when the onboard NIC is disabled the system is stable as a rock...

----------

## mattjgalloway

Hmmm

What is your onboard NIC? Maybe try upgrading drivers?

----------

## _Adik_

 *mattjgalloway wrote:*   

> Hmmm
> 
> What is your onboard NIC? Maybe try upgrading drivers?

 

It's a Realtek 8169 gigabit ethernet.

Now I'm trying to compile the latest kernel ( 2.6.5-rc3-love1 ) to see if this changes anything...

----------

## PrakashP

Try Nvidia driver 53.41. It seems to fix lock-ups. (Nforce2 has various patterns of lock-ups...) It seems my system is finally rock-stable. (nforce2 and gf4 ti 4200).

----------

## _Adik_

 *PrakashKC wrote:*   

> Try Nvidia driver 53.41. It seems to fix lock-ups. (Nforce2 has various patterns of lock-ups...) It seems my system is finally rock-stable. (nforce2 and gf4 ti 4200).

 

It's not a graphics card problem but the NIC...

----------

## PrakashP

Sorry, I was rather referring to the thread starter.  :Smile: 

----------

