# eth0 'link down' every 2 - 48 hours

## Cygon

Some weeks ago, my home server's network (which had run fine for >2 years) started to have outages. I don't know how to track down this issue; maybe someone more knowledgeable than me could give me some advice?

Here's what I observed when the outages occur:

- My server becomes completely unreachable (I can neither ping my server's IP from the outside nor can I ping another LAN IP from the server).
- The LEDs on the network adapter and switch still show a connection.
- Shutting down eth0 results in the switch no longer showing a connection. Upon restarting, the connection LED is green again, but eth0 remains bricked.
- After rebooting, everything works fine again.
- I have another network adapter in this server. Suspecting a hardware issue, I flipped the adapters via udev rules, but the outages still occurred.
- It might be that higher bandwidth usage increases the likelihood of an outage.
- The issue seems to resolve itself after a few hours.

This morning, I had these lines in my syslog:

```
Mar  7 07:31:23 tiamat kernel: r8169 0000:03:00.0: eth0: link down
Mar  7 07:31:26 tiamat kernel: r8169 0000:03:00.0: eth0: link up
Mar  7 09:28:07 tiamat kernel: r8169 0000:03:00.0: eth0: link down
Mar  7 09:28:10 tiamat kernel: r8169 0000:03:00.0: eth0: link up
```

I don't see them for the other 2 outages I had since, so I'm not sure if it's related. I can't find anything else in the logs. Since using a different network adapter didn't have any effect, I now believe this is a software issue.
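To check whether link flaps like the ones in the syslog excerpt line up with outages, the log can be scanned mechanically. The following is only a minimal sketch: the function names `parse_link_events` and `down_durations` are made up for illustration, the regular expression assumes the exact line layout shown in the excerpt, and the `year` argument is an assumption because syslog timestamps do not carry a year.

```python
import re
from datetime import datetime

# Matches lines shaped like the syslog excerpt:
#   "Mar  7 07:31:23 tiamat kernel: r8169 0000:03:00.0: eth0: link down"
LINK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"\S+ \S+ (?P<iface>\w+): link (?P<state>up|down)$"
)


def parse_link_events(lines, year):
    """Return (timestamp, interface, state) for every link up/down line.

    `year` must be supplied by the caller (assumption: all lines belong
    to the same year), since classic syslog timestamps omit it.
    """
    events = []
    for line in lines:
        match = LINK_RE.match(line.strip())
        if match is None:
            continue  # not a link state message
        when = datetime.strptime(
            f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S"
        )
        events.append((when, match["iface"], match["state"]))
    return events


def down_durations(events):
    """Pair each 'down' with the next 'up' on the same interface.

    Returns (interface, time the link went down, seconds until it came up).
    """
    pending = {}
    durations = []
    for when, iface, state in events:
        if state == "down":
            pending[iface] = when
        elif iface in pending:
            started = pending.pop(iface)
            durations.append((iface, started, (when - started).total_seconds()))
    return durations
```

Feeding it the contents of the syslog file (the path differs between setups) and comparing the returned timestamps with the times the server became unreachable would show whether the two are actually related.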

I'd be grateful for any help in finding out what's going on!

----------

## gentoo_ram

I was having weird problems with my ethernet link going up and down to my cable modem.  Tried all kinds of stuff, nothing worked... until I swapped the ethernet cable.  Problem solved.

----------

## chiefbag

If it isn't a physical hardware issue like the one mentioned above, keep in mind that these Realtek chipsets are notorious for causing trouble.

Have you recently performed any system/kernel upgrades?

See the thread below for just one example.

https://forums.gentoo.org/viewtopic-t-908102-highlight-r8169.html

----------

## JC99

I'm using Realtek chipsets on my server and am having a similar problem. I have had this server running for almost a year but only ran into this problem over the last little while. I reboot the server and everything works again. I checked my logs and found the following, same as what you are seeing...

```
Mar  8 13:55:36 penguin kernel: r8169 0000:03:00.0: eth1: link down
Mar  8 13:55:37 penguin kernel: r8169 0000:01:00.0: eth0: link up
```

I'll try different network cards and see if that makes a difference.

----------

## Cygon

Thanks for the tips. After my last post, it worked straight for almost 72 hours, so I held back on any changes in order to make sure I'm not jumping to conclusions. Today, I had 4 outages in the last 6 hours again, so here goes:

- The second outage was a kernel panic. I changed the cables while I checked it, but no joy.
- During the third outage I was connected via SSH and noticed that responses got slower and slower (pings lost or >3 seconds, the screen from 'top' was sent in two packets, I had to look at half a console window for several seconds until the other half got through).
- Before rebooting my home server, I tried rebooting my switch. No change.
- The fourth time it happened, I took down eth0 and eth1 (eth1 has nothing connected to it). Unlike before, after bringing eth0 back up, pings got through again!

Here's ifconfig after eth0 recovered:

```
eth0      Link encap:Ethernet  HWaddr 00:1e:2a:d2:89:5e
          inet addr:192.168.124.1  Bcast:192.168.124.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:294470 errors:19 dropped:1360 overruns:0 frame:85
          TX packets:433366 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:124079469 (118.3 MiB)  TX bytes:453476682 (432.4 MiB)
          Interrupt:19 Base address:0xac00
```
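The RX error, dropped and frame counters in that output are cumulative, so a single snapshot doesn't say when they went up. One way to find out is to sample the same counters from sysfs at intervals and compare. This is a rough sketch, assuming a Linux kernel that exposes `/sys/class/net/<iface>/statistics/`; the helper names `read_counters` and `counter_deltas` and the selection in `COUNTERS` are my own choices for illustration.

```python
from pathlib import Path

# Counter files under /sys/class/net/<iface>/statistics/ worth watching here.
COUNTERS = (
    "rx_packets",
    "rx_errors",
    "rx_dropped",
    "rx_frame_errors",
    "tx_packets",
    "tx_errors",
)


def read_counters(iface, root="/sys/class/net"):
    """Read the selected cumulative counters for one interface.

    `root` is a parameter only so the function can be pointed at a
    test directory; on a real system the default applies.
    """
    stats_dir = Path(root) / iface / "statistics"
    return {name: int((stats_dir / name).read_text()) for name in COUNTERS}


def counter_deltas(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in before}
```

Calling `read_counters("eth0")` once a minute from a cron job or a small loop and logging the result of `counter_deltas` with a timestamp would show whether the errors creep up steadily or jump right before an outage.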

I'm still no smarter than before, but the fact that I've had two kernel panics on a server that ran rock solid for 2 years makes me think that maybe, just maybe, this could be a weird hardware issue after all. Maybe a bad capacitor that is about to give out and is causing random issues, or something like that.

----------

## Cygon

I had another 2 kernel panics yesterday. Either the built-in r8169 driver is seriously messed up (doubtful) or my server is experiencing a hardware failure.

Just ordered a new mainboard, CPU and RAM. I'll report back whether this fixes the issue. Otherwise I've got no idea what I'll do.

----------

## JC99

I tried 2 new network cards (Intel Pro 1000 GT) and it didn't make a difference; I still had the internet go down.

Did the new CPU/Mobo/RAM fix the problem for you?

----------

## Cygon

I replaced my server's mainboard, CPU, RAM last week, leaving everything else the same. The server is now rock solid again.

So it was a hardware issue after all.

There were no visibly blown caps on the old board, leaving me a bit in the dark as to what might have failed. My PSU was also fine (I checked it just in case it might have gone below the minimum voltage on any line; I had that experience with another PC ages ago).

----------

## Cygon

Looks like the story isn't over yet.

A few days ago the same issues began happening again. Absolutely nothing in the logs, but every few days networking just stops working with no packets going in or out of eth0 on my server. This really got me confused since I had replaced my server hardware, cabling and even bought a different network adapter for my workstation.

Completely out of ideas, I unplugged my entire home server including switch, router and modem to let it cool down a bit (since the last time it was off for a few hours was during the hardware replacement). Yep, pretty desperate, but I was running out of ideas and my only hope was to find some kind of system in this madness.

Upon powering up again, I observed my switch displaying a connection on port 2, then 3, then 4, then only 3, then it jumped between 2 and 4 a bit, then 2, 3 and 4 at the same time, then only 3 again... that didn't seem normal, especially when it kept going on and on with no sign of settling down. Two of those three devices are my Squeezebox and another switch; clearly they shouldn't come up and lose connection again all the time. Nothing like that was happening before I powered down the switch, so this was the first time that switch was brought to my attention.

So now I took that switch out of the loop and I'm using my router's built-in switch. I don't know whether the issues will return, but removing that switch already had a very positive effect on another issue I was suffering from: before, there were lots of transmission errors visible in ifconfig and my workstation could upload files at no more than 150-200 KiB/s. Now I easily get 25 MiB/s upload and zero errors.

I now believe it's likely that the switch was the culprit all along, and that the kernel panics were a separate, unrelated hardware issue. All I can do now is hope that this is the end of it. Unless my hopes are crushed once again, I'll keep things as they are right now and I'll try to remember to revisit this thread in a few weeks to report that everything is finally alright :)

----------

## JC99

I switched to using rp-pppoe to connect to the internet instead of ppp entries in /etc/conf.d/net and I have been up for 15 days without any problems and my box hasn't disconnected from the internet in all that time.

----------

