# Two network cards, eth1 suddenly stops working

## Arny

For months now I've had a server running with eth0 manually set to 192.168.1.1 and eth1 set to connect to adsl. This configuration has worked flawlessly on several machines and for several months at a time, so I'm pretty comfortable with it. However, in the last couple of days I was running some updates including the rather important gcc 3.4.4, and after emerging my system and world a couple of times my connection died.

After poking around a bit I came to the conclusion that my network card had died - there was nothing showing up in lspci for the second card at all. I ripped it out and threw in another one; same problem. Then I put the original one back in, but in a different PCI slot, and it seemed to work. However, after about 5 minutes the connection died again, and once again the network card seems to have disappeared from my machine's awareness.

So, does anyone have any pointers for troubleshooting this? I've already spent a fair amount of time browsing through here and trying things, but nothing seems to help. If you need any extra info I'll gladly provide it. Many thanks in advance.

----------

## neilhwatson

Error messages in the system logs?  Have you changed your kernel?

----------

## Arny

The kernel is the same one I was using before the updates; I was going to update it later this week. The following messages are in /var/log/messages at the time I originally lost connection:

```
Dec 19 22:28:52 Marvin spurious 8259A interrupt: IRQ7.
Dec 19 22:29:02 Marvin NETDEV WATCHDOG: eth1: transmit timed out
Dec 19 22:29:02 Marvin eth1: Transmit timeout, status ff ffff ffff media ff.
Dec 19 22:29:02 Marvin eth1: Tx queue start entry 260915  dirty entry 260911.
Dec 19 22:29:02 Marvin eth1:  Tx descriptor 0 is ffffffff.
Dec 19 22:29:02 Marvin eth1:  Tx descriptor 1 is ffffffff.
Dec 19 22:29:02 Marvin eth1:  Tx descriptor 2 is ffffffff.
Dec 19 22:29:02 Marvin eth1:  Tx descriptor 3 is ffffffff. (queue head)
Dec 19 22:29:02 Marvin eth1: link up, 100Mbps, full-duplex, lpa 0xFFFF
Dec 19 22:29:14 Marvin NETDEV WATCHDOG: eth1: transmit timed out
Dec 19 22:29:14 Marvin eth1: Transmit timeout, status ff ffff ffff media ff.
Dec 19 22:29:14 Marvin eth1: Tx queue start entry 4  dirty entry 0.
```

The above repeats several times before this:

```
Dec 19 22:31:53 Marvin pppd[8398]: pppd 2.4.2 started by root, uid 0
Dec 19 22:31:53 Marvin pppd[8398]: Using interface ppp1
Dec 19 22:31:53 Marvin pppd[8398]: Connect: ppp1 <--> /dev/pts/2
Dec 19 22:31:58 Marvin pppd[5694]: Connection terminated.
Dec 19 22:31:58 Marvin pppd[5694]: Connect time 71581019.5 minutes.
Dec 19 22:31:58 Marvin pppd[5694]: Sent 84833207 bytes, received 182079385 bytes.
Dec 19 22:31:59 Marvin pppoe[5761]: read (asyncReadFromPPP): Session 7488: Input/output error
Dec 19 22:31:59 Marvin pppoe[5761]: Sent PADT
Dec 19 22:31:59 Marvin pppd[5694]: Connect time 71581019.5 minutes.
Dec 19 22:31:59 Marvin pppd[5694]: Sent 84833207 bytes, received 182079385 bytes.
Dec 19 22:31:59 Marvin pppd[5694]: Exit.
Dec 19 22:32:05 Marvin NETDEV WATCHDOG: eth1: transmit timed out
Dec 19 22:32:05 Marvin eth1: Transmit timeout, status ff ffff ffff media ff.
Dec 19 22:32:05 Marvin eth1: Tx queue start entry 4  dirty entry 0.
Dec 19 22:32:05 Marvin eth1:  Tx descriptor 0 is ffffffff. (queue head)
Dec 19 22:32:05 Marvin eth1:  Tx descriptor 1 is ffffffff.
Dec 19 22:32:05 Marvin eth1:  Tx descriptor 2 is ffffffff.
Dec 19 22:32:05 Marvin eth1:  Tx descriptor 3 is ffffffff.
Dec 19 22:32:05 Marvin eth1: link up, 100Mbps, full-duplex, lpa 0xFFFF
Dec 19 22:32:25 Marvin pppd[8398]: LCP: timeout sending Config-Requests
Dec 19 22:32:25 Marvin pppd[8398]: Connection terminated.
Dec 19 22:32:26 Marvin pppoe[8399]: recv (receivePacket): Network is down
Dec 19 22:32:46 Marvin pppoe[8399]: Timeout waiting for PADO packets
Dec 19 22:32:46 Marvin pppd[8398]: Exit.
```
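For anyone reading along with a similar log, here is a rough sketch for tallying how often the watchdog fires per interface. The function name `count_tx_timeouts` is made up for illustration, and it assumes the syslog line layout shown above. As a side note, register reads that come back as all `ff` usually suggest the driver is reading from a device that has stopped responding on the bus, though that is an inference rather than something the log states outright.

```shell
# Count "NETDEV WATCHDOG ... transmit timed out" lines per interface.
# count_tx_timeouts is a hypothetical helper name; the log path is an
# assumption and defaults to /var/log/messages.
count_tx_timeouts() {
    awk '/NETDEV WATCHDOG: .*transmit timed out/ {
             # The interface name is the field right after "WATCHDOG:".
             for (i = 1; i <= NF; i++) {
                 if ($i == "WATCHDOG:") {
                     iface = $(i + 1)
                     sub(/:$/, "", iface)   # strip the trailing colon
                     n[iface]++
                 }
             }
         }
         END { for (k in n) print k, n[k] }' "${1:-/var/log/messages}"
}
```

Calling `count_tx_timeouts /var/log/messages` prints one line per interface with its timeout count.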

Any thoughts?

----------

## neilhwatson

Are you sure it's your ethernet card and not your ISP?  Configure your system so that your other card connects to the ADSL modem.  Or, try connecting the card to another computer.  Run ping for an hour.

----------

## Arny

Definitely not the ISP - I've tried using the other card to connect, no problems there, but then of course the server is inaccessible because the second network card is still not working. I'm currently connected using my router but I prefer having everything going through my server.

----------

## Jrauch

Have you tried a different ethernet cable?  I've had issues in the past that were caused by cables going bad after moving them, even though they had worked perfectly for long periods before that.

Usually that will show up as carrier errors in ifconfig.  Do you have any of those, or any other errors there for that matter?
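If ifconfig output is awkward to read, the same counters are also exposed under `/sys/class/net/<iface>/statistics` on 2.6 and later kernels. A minimal sketch follows, assuming sysfs is mounted; `show_iface_errors` is a made-up helper name, and the second argument exists only so the function can be pointed at a different directory.

```shell
# Print every non-zero *errors* counter for one interface.
# show_iface_errors is a hypothetical helper; $1 is the interface
# (default eth1), $2 is the sysfs net root (default /sys/class/net).
show_iface_errors() {
    iface="${1:-eth1}"
    root="${2:-/sys/class/net}"
    dir="$root/$iface/statistics"
    if [ ! -d "$dir" ]; then
        echo "$iface: no such interface" >&2
        return 1
    fi
    for f in "$dir"/*errors*; do
        [ -f "$f" ] || continue
        val=$(cat "$f")
        if [ "$val" != "0" ]; then
            echo "$(basename "$f") $val"
        fi
    done
    return 0
}
```

A non-zero `tx_carrier_errors` would point towards the cable or link, while an interface that is missing altogether points elsewhere.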

----------

## T|TaN

I have had the same issue as Arny.

It is outlined in this post:

https://forums.gentoo.org/viewtopic-t-408533-highlight-.html

----------

## Arny

Jrauch, I haven't tried a different cable but considering the fact that the only connection I'm having trouble with is whichever one is connected to eth1, I don't see how it could be a problem with the cable. For example, I'm currently using the same cable to connect to the net via my router.

T|TaN, thanks for the thought but it doesn't look like we're having the same problem. Your system still seems to be recognising both network cards whereas mine has suddenly decided that there is only one card.

EDIT: I don't think I mentioned this above, but booting into the Live CD presents the same problem: sometimes the network card is detected and works fine, other times it isn't detected at all.

----------

## neilhwatson

IRQ conflict?  What does the following reveal?

```
cat /proc/interrupts
```

----------

## Arny

```
Marvin ~ # cat /proc/interrupts
           CPU0
  0:    8865912          XT-PIC  timer
  2:          0          XT-PIC  cascade
  7:          0          XT-PIC  parport0
  9:          1          XT-PIC  acpi
 11:     228382          XT-PIC  eth0
 12:          0          XT-PIC  uhci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb3
 14:     117116          XT-PIC  ide0
 15:         25          XT-PIC  ide1
NMI:          0
ERR:          0
```

which means precisely bugger all to me - any help there?

----------

## neilhwatson

I see eth0, but where is eth1?

----------

## Arny

It doesn't appear anymore, that's kind of the problem...

It was there and then suddenly decided to disappear. It no longer even shows up on lspci...

```
Marvin ~ # lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333]
00:01.0 PCI bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333 AGP]
00:08.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8233 PCI to ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:11.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1b)
00:11.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1b)
00:11.4 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1b)
01:00.0 VGA compatible controller: nVidia Corporation NV5M64 [RIVA TNT2 Model 64/Model 64 Pro] (rev 15)
```

That's the major issue - why would it be there one second and then not work at all the next? If it was there I could figure this out...
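To catch the moment a card drops off the bus, one option is to count the Ethernet controllers lspci reports and log that number periodically. This is only a sketch: `count_nics` is a made-up name, and it assumes the device class is printed as "Ethernet controller", as in the output above.

```shell
# Count Ethernet controllers in lspci-style output read from stdin.
# count_nics is a hypothetical helper name. grep -c exits non-zero when
# nothing matches, so "|| true" keeps the function's status at 0.
count_nics() {
    grep -c 'Ethernet controller' || true
}

# Example use (left as a comment because it needs lspci and real hardware):
#   while true; do echo "$(date) nics=$(lspci | count_nics)"; sleep 10; done
```

On this box the count should be 2 while both cards are visible, so a drop to 1 in the log would timestamp the disappearance.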

----------

## neilhwatson

How many other PCI cards do you have?  Perhaps remove some and see if the card returns to service.

----------

## Arny

The only PCI cards I have in this system are the network cards, so no luck there...

I'm totally out of ideas here :/

----------

## neilhwatson

Since you have the same problem with a Knoppix disk, it has to be hardware: either the NIC or the motherboard.

----------

## Arny

I'm kinda leaning towards it being a motherboard problem myself, which could result in rebuilding the system. What a pain.

Thanks for all of your help so far, if you think of anything else please let me know.

----------

## Arny

Seems this server has a whole host of problems suddenly. Now I'm getting CRC errors when trying to boot into any of my kernels and memtest86 is returning plenty of errors on the RAM. Looks like it's just this machine's time...

----------

## T|TaN

Hmmm yes, different problem.

Don't throw it out yet! You may just have bad RAM, which can cause all kinds of issues without any rhyme or reason.  Can you try some fresh RAM or have the RAM tested?

But yes, I agree: problems while running the Live CD do point to hardware issues.

----------

## Arny

Yep, it ended up being the RAM - there was a heap of errors within the first MB of both sticks, meaning the kernel couldn't load itself into memory cleanly, thus the continuous failures. I wish I knew why the RAM just decided to die all of a sudden. Ah well, I'm back up and running now :)

----------

## T|TaN

Awesome!  Yeah, RAM can just die. Not a whole lot to do with it other than make keychains out of it when it does, hehe.

----------

## Gentree

Bit odd, both sticks failing at the same time in the same way. Could it be dirty contacts? Wipe them over with contact cleaner and try different slots before binning them.

If they're of a decent size try them in a slower machine with a good long memtest.

If all else fails, sell 'em on eBay :D

----------

