# 3COM 3c905B stops working with 2.6.22-r9 after some time

## leosgb

Hi,

I was using 2.6.20-gentoo-r8 and recently upgraded to 2.6.22-gentoo-r9. I use a 3Com card in this computer and it has always been rock stable. It ran continuously 24/7 for almost 4 months with the older kernel, but since I upgraded my kernel I have to reboot the computer every other day or so.

Is there any known problem with the new kernel and this card? This is my lspci:

```
04:09.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 34)
        Subsystem: 3Com Corporation 3C905B Fast Etherlink XL 10/100
        Flags: bus master, medium devsel, latency 64, IRQ 18
        I/O ports at c880 [size=128]
        Memory at faaff400 (32-bit, non-prefetchable) [size=128]
        Expansion ROM at 40000000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 1
```

I already tried to stop it using

```
/etc/init.d/net.eth0 stop
```

and also tried to restart the card, but I didn't have any success. The only way it works is when I reboot the system.

Any help is welcome. Thank you for reading my post.

----------

## Peach

 *leosgb wrote:*   

> The only way it works is when I reboot the system.

 

Please compile the card's driver as a module, then try to rmmod it and modprobe it again, and report the output of dmesg, checking it for errors as well.
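The unload/reload cycle described above could be scripted roughly as follows. This is only a sketch in Python: it assumes the card is driven by the `3c59x` module (the driver name mentioned later in this thread), and the helper names `reload_steps` and `run_steps` are made up for illustration. By default it only prints the commands; actually running them needs root.

```python
import subprocess


def reload_steps(module):
    """Return the commands for unloading a module, reloading it, and dumping dmesg."""
    return [
        ["rmmod", module],
        ["modprobe", module],
        ["dmesg"],
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False (requires root)."""
    for argv in steps:
        print("+", " ".join(argv))
        if not dry_run:
            # check=False so a failing rmmod does not hide the later dmesg output
            subprocess.run(argv, check=False)


# Assumption: 3c59x is the module name for this card on your kernel.
run_steps(reload_steps("3c59x"))
```

Calling `run_steps(reload_steps("3c59x"), dry_run=False)` as root would execute the three commands in order.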

----------

## leosgb

Ok. I will do it tonight and post back any messages. Thank you!

----------

## Cyker

Are you getting NETDEV WATCHDOG..... errors in dmesg when it stops working?

I'm wondering if there is some sort of problem with the older network drivers in more recent kernels; I know first-hand that the ne2k-pci driver has issues in .22 and .23...

----------

## leosgb

Is there any way that I can retrieve my old dmesg messages? This computer is my home server and I had to roll back the kernel to 2.6.20 because it needs to be stable. I will reboot and switch to .23 tonight, and in a few days I should have something to report, but it would be great if I could just read a "dmesg history" somewhere and get to the bottom of this. Thanks!

----------

## leosgb

Hi guys,

I decided to try to find my "dmesg history" and I guess it is called /var/log/messages. That said, this is what I found in my messages file:

This is from one of the days when my network failed.

```
Nov  9 13:51:45 Tuxedo NETDEV WATCHDOG: eth0: transmit timed out
Nov  9 13:51:45 Tuxedo eth0: transmit timed out, tx_status 00 status e601.
Nov  9 13:51:45 Tuxedo diagnostics: net 0cda media 8880 dma 0000003a fifo 8000
Nov  9 13:51:45 Tuxedo eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
Nov  9 13:51:45 Tuxedo Flags; bus-master 1, dirty 82337(1) current 82337(1)
Nov  9 13:51:45 Tuxedo Transmit list 00000000 vs. ffff8100362652a0.
Nov  9 13:51:45 Tuxedo 0: @ffff810036265200  length 8000002a status 8001002a
Nov  9 13:51:45 Tuxedo 1: @ffff8100362652a0  length 00000036 status 0c0105e2
Nov  9 13:51:45 Tuxedo 2: @ffff810036265340  length 00000036 status 0c0105e2
Nov  9 13:51:45 Tuxedo 3: @ffff8100362653e0  length 00000036 status 0c0105e2
Nov  9 13:51:45 Tuxedo 4: @ffff810036265480  length 00000036 status 0c0105e2
Nov  9 13:51:45 Tuxedo 5: @ffff810036265520  length 00000036 status 0c0105e2
Nov  9 13:51:45 Tuxedo 6: @ffff8100362655c0  length 00000036 status 0c0105e2
Nov  9 13:51:45 Tuxedo 7: @ffff810036265660  length 80000036 status 00010036
Nov  9 13:51:45 Tuxedo 8: @ffff810036265700  length 80000036 status 00010036
Nov  9 13:51:45 Tuxedo 9: @ffff8100362657a0  length 80000036 status 00010036
Nov  9 13:51:45 Tuxedo 10: @ffff810036265840  length 80000036 status 00010036
Nov  9 13:51:45 Tuxedo 11: @ffff8100362658e0  length 00000036 status 0c0105e2
Nov  9 13:51:45 Tuxedo 12: @ffff810036265980  length 00000036 status 0c0105e2
Nov  9 13:51:45 Tuxedo 13: @ffff810036265a20  length 8000002a status 0001002a
Nov  9 13:51:45 Tuxedo 14: @ffff810036265ac0  length 8000002a status 0001002a
Nov  9 13:51:45 Tuxedo 15: @ffff810036265b60  length 8000002a status 8001002a
Nov  9 13:51:45 Tuxedo eth0: Resetting the Tx ring pointer.
Nov  9 14:26:10 Tuxedo NETDEV WATCHDOG: eth0: transmit timed out
Nov  9 14:26:10 Tuxedo eth0: transmit timed out, tx_status 00 status e681.
Nov  9 14:26:10 Tuxedo diagnostics: net 0cda media 8880 dma 0000003a fifo 8000
Nov  9 14:26:10 Tuxedo eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
Nov  9 14:26:10 Tuxedo Flags; bus-master 1, dirty 82353(1) current 82353(1)
Nov  9 14:26:10 Tuxedo Transmit list 00000000 vs. ffff8100362652a0.
Nov  9 14:26:10 Tuxedo 0: @ffff810036265200  length 80000106 status 8c010106
Nov  9 14:26:10 Tuxedo 1: @ffff8100362652a0  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 2: @ffff810036265340  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 3: @ffff8100362653e0  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 4: @ffff810036265480  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 5: @ffff810036265520  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 6: @ffff8100362655c0  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 7: @ffff810036265660  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 8: @ffff810036265700  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 9: @ffff8100362657a0  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 10: @ffff810036265840  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 11: @ffff8100362658e0  length 8000002a status 0001002a
Nov  9 14:26:10 Tuxedo 12: @ffff810036265980  length 80000106 status 0c010106
Nov  9 14:26:10 Tuxedo 13: @ffff810036265a20  length 800000f9 status 0c0100f9
Nov  9 14:26:10 Tuxedo 14: @ffff810036265ac0  length 80000106 status 0c010106
Nov  9 14:26:10 Tuxedo 15: @ffff810036265b60  length 800000f9 status 8c0100f9
Nov  9 14:26:10 Tuxedo eth0: Resetting the Tx ring pointer.
Nov  9 14:26:25 Tuxedo NETDEV WATCHDOG: eth0: transmit timed out
Nov  9 14:26:25 Tuxedo eth0: transmit timed out, tx_status 00 status e201.
Nov  9 14:26:25 Tuxedo diagnostics: net 0cda media 8880 dma 00000032 fifo 8000
Nov  9 14:26:25 Tuxedo eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
Nov  9 14:26:25 Tuxedo Flags; bus-master 1, dirty 82369(1) current 82369(1)
Nov  9 14:26:25 Tuxedo Transmit list 00000000 vs. ffff8100362652a0.
Nov  9 14:26:25 Tuxedo 0: @ffff810036265200  length 80000042 status 80010042
Nov  9 14:26:25 Tuxedo 1: @ffff8100362652a0  length 800000f9 status 0c0100f9
Nov  9 14:26:25 Tuxedo 2: @ffff810036265340  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 3: @ffff8100362653e0  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 4: @ffff810036265480  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 5: @ffff810036265520  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 6: @ffff8100362655c0  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 7: @ffff810036265660  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 8: @ffff810036265700  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 9: @ffff8100362657a0  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 10: @ffff810036265840  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 11: @ffff8100362658e0  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 12: @ffff810036265980  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 13: @ffff810036265a20  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 14: @ffff810036265ac0  length 80000042 status 00010042
Nov  9 14:26:25 Tuxedo 15: @ffff810036265b60  length 80000042 status 80010042
Nov  9 14:26:25 Tuxedo eth0: Resetting the Tx ring pointer.
Nov  9 14:27:05 Tuxedo NETDEV WATCHDOG: eth0: transmit timed out
Nov  9 14:27:05 Tuxedo eth0: transmit timed out, tx_status 00 status e201.
Nov  9 14:27:05 Tuxedo diagnostics: net 0cda media 8880 dma 00000032 fifo 8000
Nov  9 14:27:05 Tuxedo eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
Nov  9 14:27:05 Tuxedo Flags; bus-master 1, dirty 82385(1) current 82385(1)
Nov  9 14:27:05 Tuxedo Transmit list 00000000 vs. ffff8100362652a0.
Nov  9 14:27:05 Tuxedo 0: @ffff810036265200  length 80000042 status 80010042
Nov  9 14:27:05 Tuxedo 1: @ffff8100362652a0  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 2: @ffff810036265340  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 3: @ffff8100362653e0  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 4: @ffff810036265480  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 5: @ffff810036265520  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 6: @ffff8100362655c0  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 7: @ffff810036265660  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 8: @ffff810036265700  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 9: @ffff8100362657a0  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 10: @ffff810036265840  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 11: @ffff8100362658e0  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 12: @ffff810036265980  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 13: @ffff810036265a20  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 14: @ffff810036265ac0  length 80000042 status 00010042
Nov  9 14:27:05 Tuxedo 15: @ffff810036265b60  length 80000042 status 80010042
Nov  9 14:27:05 Tuxedo eth0: Resetting the Tx ring pointer.
```

I also have this:

```
Nov 11 11:48:22 Tuxedo 0000:04:09.0: 3Com PCI 3c905B Cyclone 100baseTx at ffffc200004e2400.
.
.
.
Nov 12 11:46:32 Tuxedo NETDEV WATCHDOG: eth0: transmit timed out
Nov 12 11:46:32 Tuxedo eth0: transmit timed out, tx_status 00 status e601.
Nov 12 11:46:32 Tuxedo diagnostics: net 0cda media 8880 dma 0000003a fifo 8000
Nov 12 11:46:32 Tuxedo eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
Nov 12 11:46:32 Tuxedo Flags; bus-master 1, dirty 170233(9) current 170233(9)
Nov 12 11:46:32 Tuxedo Transmit list 00000000 vs. ffff810036c417a0.
Nov 12 11:46:32 Tuxedo 0: @ffff810036c41200  length 00000036 status 0c0105b8
Nov 12 11:46:32 Tuxedo 1: @ffff810036c412a0  length 8000002a status 0001002a
Nov 12 11:46:32 Tuxedo 2: @ffff810036c41340  length 8000002a status 0001002a
Nov 12 11:46:32 Tuxedo 3: @ffff810036c413e0  length 8000002a status 0001002a
Nov 12 11:46:32 Tuxedo 4: @ffff810036c41480  length 8000002a status 0001002a
Nov 12 11:46:32 Tuxedo 5: @ffff810036c41520  length 8000002a status 0001002a
Nov 12 11:46:32 Tuxedo 6: @ffff810036c415c0  length 8000002a status 0001002a
Nov 12 11:46:32 Tuxedo 7: @ffff810036c41660  length 8000002a status 8001002a
Nov 12 11:46:32 Tuxedo 8: @ffff810036c41700  length 8000002a status 8001002a
Nov 12 11:46:32 Tuxedo 9: @ffff810036c417a0  length 00000036 status 0c0105b8
Nov 12 11:46:32 Tuxedo 10: @ffff810036c41840  length 00000036 status 0c0105b8
Nov 12 11:46:32 Tuxedo 11: @ffff810036c418e0  length 00000036 status 0c0105b8
Nov 12 11:46:32 Tuxedo 12: @ffff810036c41980  length 00000036 status 0c0105b8
Nov 12 11:46:32 Tuxedo 13: @ffff810036c41a20  length 80000036 status 00010036
Nov 12 11:46:32 Tuxedo 14: @ffff810036c41ac0  length 80000036 status 00010036
Nov 12 11:46:32 Tuxedo 15: @ffff810036c41b60  length 00000036 status 0c0105b8
Nov 12 11:46:32 Tuxedo eth0: Resetting the Tx ring pointer.
```

This NETDEV WATCHDOG list goes on forever on both days. It stopped once I changed my kernel to 2.6.20. Any ideas?

Thank you for the suggestions.
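For anyone wanting to see how often the watchdog fires per day without scrolling through the whole log, here is a rough Python sketch. It assumes syslog lines in the same format as the excerpts in this post; the function name `count_watchdog_timeouts` is made up for illustration, and the sample lines are copied from the log above.

```python
from collections import Counter


def count_watchdog_timeouts(lines, iface="eth0"):
    """Count 'NETDEV WATCHDOG' transmit timeouts per day for one interface."""
    needle = f"NETDEV WATCHDOG: {iface}: transmit timed out"
    counts = Counter()
    for line in lines:
        if needle in line:
            # Syslog lines start with "Mon DD HH:MM:SS host ..."
            month, day = line.split()[:2]
            counts[f"{month} {day}"] += 1
    return counts


# Sample lines copied from the excerpts in this post
sample = [
    "Nov  9 13:51:45 Tuxedo NETDEV WATCHDOG: eth0: transmit timed out",
    "Nov  9 13:51:45 Tuxedo eth0: transmit timed out, tx_status 00 status e601.",
    "Nov  9 14:26:10 Tuxedo NETDEV WATCHDOG: eth0: transmit timed out",
    "Nov 12 11:46:32 Tuxedo NETDEV WATCHDOG: eth0: transmit timed out",
]
counts = count_watchdog_timeouts(sample)
for day, n in counts.items():
    print(day, n)
```

To run it against the real log, pass an open file instead of the sample list, for example `count_watchdog_timeouts(open("/var/log/messages"))`; reading that file usually requires root.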

----------

## Cyker

Yes, that is the exact same problem I have had!

Can you "cat /proc/interrupts"?

I'm curious to see what else is sharing the IRQ your 3Com is using...

But I think it's a kernel bug with the older network drivers. I got the same error with the ne2k-pci driver, and also with the 3com (although I suspect that one was caused by known problems with the sky2 driver).

----------

## leosgb

This is my /proc/interrupts:

```
           CPU0
  0:    8095907   IO-APIC-edge      timer
  1:       4014   IO-APIC-edge      i8042
  2:          0    XT-PIC-XT        cascade
  5:     140654   IO-APIC-fasteoi   ohci1394, ehci_hcd:usb1, ohci_hcd:usb2, HDA Intel, eth0
  8:          0   IO-APIC-edge      rtc
 10:      61970   IO-APIC-fasteoi   sata_nv
 11:    2539368   IO-APIC-fasteoi   sata_nv, nvidia
 12:          3   IO-APIC-edge      i8042
 14:      97209   IO-APIC-edge      ide0
NMI:          0
LOC:    8096257
ERR:          0
```

So it shares the IRQ with my 1394, USB and sound, huh? Can I change that?

Thank you.
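The "what shares the IRQ with eth0" question can also be answered mechanically. Below is a minimal Python sketch that assumes the `/proc/interrupts` layout shown in this post (one count column per CPU, then a single controller-type word, then a comma-separated device list); other kernels may lay the columns out differently. The helper names `parse_interrupts` and `sharing_with` are made up for illustration, and the sample text is a trimmed copy of the listing above.

```python
def parse_interrupts(text):
    """Map each numeric IRQ to the list of device names registered on it."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip NMI, LOC, ERR and similar summary rows
        # per-CPU counts, then the controller type, then the device list
        fields = rest.split(None, ncpus + 1)
        devices = fields[ncpus + 1] if len(fields) > ncpus + 1 else ""
        table[int(label)] = [d.strip() for d in devices.split(",") if d.strip()]
    return table


def sharing_with(table, device):
    """Return (irq, other devices on that irq), or (None, []) if not found."""
    for irq, devices in table.items():
        if device in devices:
            return irq, [d for d in devices if d != device]
    return None, []


# Trimmed copy of the listing in this post
sample = """\
           CPU0
  0:    8095907   IO-APIC-edge      timer
  5:     140654   IO-APIC-fasteoi   ohci1394, ehci_hcd:usb1, ohci_hcd:usb2, HDA Intel, eth0
 11:    2539368   IO-APIC-fasteoi   sata_nv, nvidia
NMI:          0
"""
irq, others = sharing_with(parse_interrupts(sample), "eth0")
print(irq, others)
```

On a live system the same functions could be fed `open("/proc/interrupts").read()` instead of the sample string.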

----------

## Cyker

Bizarrely, there is apparently NO WAY to manually assign IRQs to devices in Linux unless you're configuring ISA cards (Then you can use isapnptools).

You can't even disable IRQ sharing!

However, I notice your IRQs only go up to 14, which suggests your IOAPIC isn't being used. If you can enable IOAPIC or similar in your BIOS, that can potentially open up 255 IRQs ( \o/  (I think...)), which Linux may or may not use to spread the other devices around...

(On mine, IOAPIC gives me another 8 IRQs - 16 to 23, but for some FUBAR reason they are the ONLY IRQs Linux assigns things to, so half my NICs are still sharing an IRQ!  :Sad:  )
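The "how high do the IRQ numbers go" check can be sketched in Python as well. This assumes the same `/proc/interrupts` layout as in leosgb's post, with the controller type in the column right after the per-CPU counts; the function name `irq_summary` is made up, and the sample is a trimmed copy of that listing. It reports the highest numeric IRQ together with the controller-type labels, so both can be looked at side by side.

```python
def irq_summary(text):
    """Return (highest numeric IRQ, set of controller-type labels)."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    highest, controllers = -1, set()
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip NMI, LOC, ERR rows
        highest = max(highest, int(label))
        fields = rest.split()
        if len(fields) > ncpus:
            controllers.add(fields[ncpus])  # column after the per-CPU counts
    return highest, controllers


# Trimmed copy of the listing from leosgb's post
sample = """\
           CPU0
  0:    8095907   IO-APIC-edge      timer
  2:          0    XT-PIC-XT        cascade
  5:     140654   IO-APIC-fasteoi   ohci1394, ehci_hcd:usb1, ohci_hcd:usb2, HDA Intel, eth0
 14:      97209   IO-APIC-edge      ide0
LOC:    8096257
"""
highest, controllers = irq_summary(sample)
print("highest IRQ:", highest)
print("controller types:", sorted(controllers))
```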

----------

## leosgb

It is interesting, though, that with the IRQ sharing (I don't think it would change from 2.6.20 to 2.6.23) my card only hangs with .23! It must be some weird and obscure kernel driver bug... I recompiled my kernel as 2.6.23 with ACPI disabled, following some hints I saw on LinuxQuestions. I just checked and my card is down again. So whatever is causing this, it must be in the driver, not in how the kernel was compiled. I will roll back to 2.6.20 and wait for the next release to see if they have fixed this problem.

Thanks!

----------

## Cyker

The NIC drivers haven't been updated in *years*, so I think some of the more recent kernel changes have been having an effect.

As long as it's not under heavy load, the NICs seem fine, but if I hammer the sky2 and 3c59x drivers at the same time I get the lost IRQ and then everything goes FOOM with more or less the same error you get.

Unfortunately I can't regress because of bugs in the sky2 driver, which is being actively worked on. .23 has been the most stable with it so far, and since the 3c59x is just a coax bridge which doesn't get used much, I have to put up with it  :Sad: 

I just wish I could change the IRQ it uses!!

----------

## donjames

Hi,

I'm having the same problem with 3com 905b and 905c network cards.

They just quit working.  I'm running 2.6.20-gentoo-r8.

They worked fine for about 2 years and then just quit working.

Could someone recommend a brand of network card that does work?

Thanks,

Don James

----------

## hpeters

I'm using a 3com 3c905B and the 2.6.23.1 vanilla kernel without any problems.

The card gets very heavy use, as it's used with multiple HDTV video streams.

The system is an older x86 Pentium 4 with just the bare minimum selected under networking in the kernel.

The driver is compiled as a module and ioapic is turned on, though it is still sharing an interrupt with the libata and usb drivers.

Don't know what your problem is, but it works perfectly here.

Harley

----------

## Cyker

The problem is not universal; There seems to be some combination of kernel and hardware config that will trip it more than others.

Buggy/Crap ACPI and BIOS are suspects, and high IRQ loading also seems to aggravate the problem, especially synchronous loading (e.g. disk and NIC IRQs being triggered at the same time for a bulk disk copy over the network, or NIC bridges passing traffic to each other).

However, the biggest contributor seems to be multi-core systems; I've seen very few problems like this on single-core systems (inc. P4 - HT still counts as single-core here) - it seems to affect dual-core and multi-CPU systems the most.

Enabling/disabling kernel/userland irqbalancing has (apparently) helped mitigate the problem with some people, but it's so random that I haven't found any patterns yet.

As for card recommendations, pretty much anything based on newer silicon seems to work better, probably because the drivers were written more recently.

The most stable NICs I've used thus far have been based on Intel/Broadcom chips, and also the nForce/forcedeth chip. More recent Realtek chips are also better, 'tho avoid them for GigE as they have a reputation for being IRQ spammers  :Wink: 

----------

## donjames

Hi folks,

Thanks for the replies.

I am still working on the problem.  

Just wondered what "ioapic" is.

Can this be turned on in the cmos setup?

Thanks,

Don James

Henderson, Texas USA

----------

## Cyker

If you've got a very recent 'board, it's probably forced on.

Otherwise, there will be a BIOS option somewhere mentioning APIC (Not ACPI! Be careful if you have dyslexia; They both do very different things!  :Wink: )

If the 'board is pretty old - I think the cutoff is 2nd gen Athlon or older, or PIII or older - it might not have APIC support.

If it's a dual-core capable 'board it almost certainly will 'tho.

----------

