# NFS lockups when listing or saving files

## GreatEmerald

I am experiencing a rather odd issue with NFS. Whenever an NFS client tries to list a directory that contains at least one file, it has to wait exactly 30 seconds for the files to appear. During that time, the program that is supposed to be listing them (I tried Dolphin and Konqueror) appears frozen. Once it unfreezes, I can move through other directories without the issue for some time, but if the NFS share sits idle, the next attempt to list directory contents causes another lockup. Directories that contain only subdirectories list without problems, no matter how many subdirectories there are. Also, the lockup is not triggered by the command-line ls tool, only by a graphical file manager such as Dolphin.

The same thing happens when saving (or otherwise committing, e.g. sending) files over NFS. If I open a file, edit it, and save quickly, the performance is good. But if I take some time to edit the file, pressing the save button makes the program freeze for 30 seconds. If I upload a large file, the upload speed is very good, but at the end of the upload it also stalls for 30 seconds before reporting that the upload is complete.

From my testing, the directory or file has to go unrefreshed for about a minute and a half before the lockup is triggered.

There are two error messages that appear in dmesg on the server. The first one is shown once on every boot:

```
NFSD: starting 90-second grace period
NFSD: Unable to end grace period: -110
```

The second one is the more important, though, as it is printed to dmesg at exactly the moment the program listing or saving a file unfreezes:

```
NFSD: Unable to create client record on stable storage: -110
```

I tried searching for this error, but I could not find anything about it on the net. It seems to be a recent addition to nfsd.

So, any ideas about what all this could mean? Anyone else had such issues? Should I submit a bug somewhere?

The server is using net-fs/nfs-utils-1.2.6, USE flags: "ipv6 nfsv4 nfsv41 tcpd -caps -kerberos -nfsdcld -nfsidmap (-selinux)", kernel 3.6.11. The shared directory is using NFSv4.

Here is the nfsstat output for the client:

```
Client rpc stats:
calls      retrans    authrefrsh
2360       0          2360    

Client nfs v4:
null         read         write        commit       open         open_conf    
0         0% 17        0% 35        1% 0         0% 372      15% 12        0% 
open_noat    open_dgrd    close        setattr      fsinfo       renew        
0         0% 0         0% 358      15% 2         0% 6         0% 42        1% 
setclntid    confirm      lock         lockt        locku        access       
2         0% 2         0% 0         0% 0         0% 0         0% 150       6% 
getattr      lookup       lookup_root  remove       rename       link         
661      28% 67        2% 2         0% 3         0% 0         0% 0         0% 
symlink      create       pathconf     statfs       readlink     readdir      
0         0% 0         0% 4         0% 582      24% 0         0% 31        1% 
server_caps  delegreturn  getacl       setacl       fs_locations rel_lkowner  
10        0% 0         0% 0         0% 0         0% 0         0% 0         0% 
secinfo      exchange_id  create_ses   destroy_ses  sequence     get_lease_t  
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
reclaim_comp layoutget    getdevinfo   layoutcommit layoutreturn getdevlist   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
(null)       
0         0%
```

And the server:

```
Server rpc stats:
calls      badcalls   badclnt    badauth    xdrcall
2366       0          0          0          0       

Server nfs v4:
null         compound     
2         0% 2364     99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 154       2% 358       5% 0         0% 
create       delegpurge   delegreturn  getattr      getfh        link         
0         0% 0         0% 0         0% 2583     36% 407       5% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 67        0% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
372       5% 0         0% 12        0% 0         0% 2315     32% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
4         0% 17        0% 31        0% 0         0% 3         0% 0         0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
43        0% 358       5% 372       5% 0         0% 2         0% 2         0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
2         0% 0         0% 35        0% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0%
```

----------

## count0

Try the following:

```
mkdir /var/lib/nfs/v4recovery
```

The path must match the one listed in /proc/fs/nfsd/nfsv4recoverydir.
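If you would rather not hard-code the path, something along these lines should work. It is only a sketch: the function name ensure_recovery_dir is made up for illustration, and it assumes the nfsd filesystem is mounted so that the /proc entry is readable. The optional argument exists only so the function can be pointed at another file for testing.

```shell
#!/bin/sh
# Hypothetical helper (not part of nfs-utils): create the directory that
# nfsd reports as its NFSv4 recovery directory.
ensure_recovery_dir() {
    # Default to the real proc entry; an alternative file may be passed in.
    procfile="${1:-/proc/fs/nfsd/nfsv4recoverydir}"

    if [ ! -r "$procfile" ]; then
        echo "cannot read $procfile (is nfsd loaded?)" >&2
        return 1
    fi

    # The file holds a single path, e.g. /var/lib/nfs/v4recovery
    recdir=$(cat "$procfile")
    if [ -z "$recdir" ]; then
        echo "$procfile is empty" >&2
        return 1
    fi

    # -p makes this a no-op if the directory already exists
    mkdir -p "$recdir" && echo "recovery directory: $recdir"
}
```

Run it as root on the server by calling ensure_recovery_dir with no arguments.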

----------

## goteguru

Many thx count0, that was the solution. Maybe the ebuild should create this dir?

This was a very annoying bug, and I cannot even imagine how you figured this out.

----------

