# NFS hangs with kernel 3.3 [SOLVED]

## sidamos

After switching from kernel 3.2.12 to 3.3.8, I get NFS hangs when copying larger files (>10 MB) from another machine (NFS server) to this machine. When it happens, NFS is completely dead (df hangs, too).

I get this in the log of the client:

```
kernel: nfs: server <servername> not responding, still trying
```

dmesg | grep eth0 (kernel 3.2.12)

```
r8169 0000:05:00.0: eth0: link down
r8169 0000:05:00.0: eth0: link down
ADDRCONF(NETDEV_UP): eth0: link is not ready
r8169 0000:05:00.0: eth0: link up
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
eth0: no IPv6 routers present
```

dmesg | grep eth0 (kernel 3.3.8)

```
r8169 0000:05:00.0: eth0: RTL8168evl/8111evl at 0xf919c000, 00:25:22:bf:52:35, XID 0c900800 IRQ 42
r8169 0000:05:00.0: eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
r8169 0000:05:00.0: eth0: link down
r8169 0000:05:00.0: eth0: link down
ADDRCONF(NETDEV_UP): eth0: link is not ready
r8169 0000:05:00.0: eth0: link up
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
```

ifconfig eth0 (both kernels)

```
eth0      Link encap:Ethernet  HWaddr 00:25:22:bf:52:35
          inet addr:192.168.0.2  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::225:22ff:febf:5235/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:66167 errors:0 dropped:0 overruns:0 frame:0
          TX packets:44335 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:58669686 (55.9 MiB)  TX bytes:9089928 (8.6 MiB)
          Interrupt:42 Base address:0xe000
```

lspci -v (both kernels)

```
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
        Subsystem: ASRock Incorporation Motherboard (one of many)
        Flags: bus master, fast devsel, latency 0, IRQ 42
        I/O ports at c000 [size=256]
        Memory at e0004000 (64-bit, prefetchable) [size=4K]
        Memory at e0000000 (64-bit, prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 01
        Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
        Capabilities: [d0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel
        Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00
        Kernel driver in use: r8169
        Kernel modules: r8169
```

I am loading the suggested firmware patch for this chip:

```
CONFIG_EXTRA_FIRMWARE="rtl_nic/rtl8168e-3.fw"
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"
```

So dmesg reports "jumbo features" with 3.3.8, even though I use MTU 1500.
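The "jumbo features" line appears to describe what the chip can do (frames of up to 9200 bytes) rather than the MTU that is actually configured, so the two values can be compared directly. A minimal sketch, assuming the usual Linux sysfs path `/sys/class/net/<iface>/mtu`; the helper names `jumbo_max_from_dmesg` and `configured_mtu` are made up for illustration:

```python
import re
from pathlib import Path

# Matches the r8169 probe message quoted above, e.g.
# "... eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]"
JUMBO_RE = re.compile(r"jumbo features \[frames: (\d+) bytes")


def jumbo_max_from_dmesg(line):
    """Return the advertised maximum frame size in bytes, or None if absent."""
    match = JUMBO_RE.search(line)
    return int(match.group(1)) if match else None


def configured_mtu(iface, sysfs_root="/sys/class/net"):
    """Read the MTU currently set on an interface (assumed sysfs layout)."""
    return int(Path(sysfs_root, iface, "mtu").read_text().strip())


if __name__ == "__main__":
    line = ("r8169 0000:05:00.0: eth0: jumbo features "
            "[frames: 9200 bytes, tx checksumming: ko]")
    print("advertised max frame:", jumbo_max_from_dmesg(line))
    try:
        print("configured MTU:", configured_mtu("eth0"))
    except OSError:
        # The interface may have a different name on other machines.
        print("eth0 not found under /sys/class/net")
```

If the configured MTU is still 1500, as in the ifconfig output, jumbo frames are not in use even though the driver advertises them.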

Any help/hints would be greatly appreciated.

Last edited by sidamos on Wed Nov 07, 2012 7:04 am; edited 1 time in total

----------

## Hu

Does this still happen in newer kernels, such as 3.5?

----------

## sidamos

I have not tried any unstable kernels yet.

----------

## slick

I can confirm this. Since I have been running kernels >= 3.3, I get the same trouble. When I use NFS more heavily than usual (like big copy jobs), the connection freezes completely.

Logs are like this ("filer" is the hostname) :

```
Sep 16 16:15:01 [kernel] nfs: server filer not responding, still trying
Sep 16 16:16:08 [kernel] nfs: server filer OK
                - Last output repeated 82 times -
```
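The length of each stall can be read off pairs of lines like these. A rough sketch, assuming the log format shown above (timestamp, `[kernel]`, then the NFS client message) and an English locale for the month names; the function name `stall_durations` and the `year` parameter are made up for illustration, since the log lines themselves carry no year:

```python
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d\d:\d\d:\d\d) \[kernel\] "
    r"nfs: server (?P<server>\S+) (?P<state>not responding|OK)"
)


def stall_durations(lines, year=2012):
    """Yield (server, seconds) for each 'not responding' -> 'OK' pair.

    `year` is an assumption supplied by the caller; it only matters
    for stalls that span a year boundary.
    """
    pending = {}  # server name -> time of the first unanswered report
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # e.g. the "Last output repeated" summary line
        when = datetime.strptime(
            f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S"
        )
        server = match["server"]
        if match["state"] == "OK":
            started = pending.pop(server, None)
            if started is not None:
                yield server, (when - started).total_seconds()
        else:
            # Keep the earliest report if the message repeats.
            pending.setdefault(server, when)


log = [
    "Sep 16 16:15:01 [kernel] nfs: server filer not responding, still trying",
    "Sep 16 16:16:08 [kernel] nfs: server filer OK",
]
for server, seconds in stall_durations(log):
    print(f"{server}: stalled for {seconds:.0f} s")
```

For the two lines quoted above this prints `filer: stalled for 67 s`. The "repeated 82 times" summary line is skipped, so repeated stalls hidden behind it are not counted.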

I can still access the file server during this time (ssh/samba/...) and ping it. It's a LAN, and there is no network trouble. Before, it worked fine. On other machines (with other/older kernels, like 3.0.x) I do not have this problem, and I can still access the server from those clients while my NFS connection is frozen.

I guess there is really a problem.
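To separate a dead network from a stuck NFS layer, one can also test the NFS port itself while the mount is hanging. A minimal sketch, assuming the server listens on the standard NFS TCP port 2049 (mounts using UDP or a non-default port would need a different check); `tcp_port_open` is a hypothetical helper name:

```python
import socket

NFS_PORT = 2049  # standard NFS port; assumed here, not taken from the thread


def tcp_port_open(host, port=NFS_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `tcp_port_open("filer")` during a hang shows whether the server still accepts connections on that port. A successful connect only proves the port is reachable, not that the NFS service is answering requests.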

----------

## sidamos

Still happens with 3.5.7. :(

----------

## sidamos

I have now updated the server from kernel 3.1.10 to 3.5.7, and the issue is gone!

It seems that the combination of 3.1.10 on the server and 3.3 or later on the client has issues.

----------

