# Stale NFS file handle

## SeeksTheMoon

On my server I have three RAID arrays, let's call them A, B and C, which are mounted at /mnt/A, /mnt/B and /mnt/C.

For easy access I bind-mounted them together into one tree under /mnt/storage.

I mounted this tree with sshfs, which works like a charm, but I want to try NFS to see if it works better.

So these are my mounts:

```
/mnt/A/Projects on /mnt/storage/Projects type none (rw,bind)
/mnt/B/Stuff on /mnt/storage/Stuff type none (rw,bind)
/mnt/C/Specialstuff on /mnt/storage/Stuff/Specialstuff type none (rw,bind)
```

Notice that a directory from C is bind-mounted into Specialstuff, a subdirectory of the already bind-mounted B/Stuff.
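For reference, the bind mounts above correspond to fstab entries along these lines (a sketch, not the actual fstab from the post; order matters, since Stuff must be mounted before anything can be mounted inside it):

```
/mnt/A/Projects      /mnt/storage/Projects            none  bind  0 0
/mnt/B/Stuff         /mnt/storage/Stuff               none  bind  0 0
/mnt/C/Specialstuff  /mnt/storage/Stuff/Specialstuff  none  bind  0 0
```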

So I created my /etc/exports:

```
/mnt/storage                    192.168.0.2(rw,async,fsid=0,nohide,no_subtree_check,insecure)
/mnt/storage/Projects           192.168.0.2(rw,async,nohide,no_subtree_check,insecure)
/mnt/storage/Stuff              192.168.0.2(rw,async,nohide,no_subtree_check,insecure)
/mnt/storage/Stuff/Specialstuff 192.168.0.2(rw,async,nohide,no_subtree_check,insecure)
```

After I mounted server:/mnt/storage on my client at /mnt/storage, I can access /mnt/storage and /mnt/storage/Projects, but it is impossible to access /mnt/storage/Stuff in any way:

```
LANG=C ls -l /mnt/storage/Stuff
ls: cannot access /mnt/storage/Stuff: Stale NFS file handle
```

Searching for that message turns up only unhelpful advice:

- mount again (this obviously does not solve the problem)
- restart NFS (ditto)
- try the `noac` option (it is mentioned once in the manpage but never explained, and it does not work either)
- flush the export table (so essentially restarting NFS without restarting it, i.e. not working; `exportfs -v` tells me that everything is exported exactly as I wrote it in /etc/exports)
- use `-O` for an overlay mount if the device is busy (it is not busy, and I can access the files directly on the server as well as after mounting them with sshfs, so exporting the bind-mount points should suffice)

*Quote:*

> A filehandle becomes stale whenever the file or directory referenced by the handle is removed by another host while your client still holds an active reference to the object. A typical example occurs when the current directory of a process running on your client is removed on the server (either by a process running on the server or on another client).

Not happening here. One server, one client, no changes on the disks, and certainly no removals.

*Quote:*

> kill processes with open files on the partition, kill processes that have cd'ed to the partition, kill all of the users, reboot

You must be joking, right? Even if there were such processes or users (there are not), those "solutions" are not going to happen. There is no reason to kill random processes just because the client cannot access one directory via NFS. Everything works over SSH, FTP and locally, so no aggressive special treatment like that is needed.

Obviously I will not change my disk layout, the RAID arrays, etc. So... I am out of options. What should I do about the stale file handle?
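One detail worth noting about the exports above: exports(5) says that filesystems without their own device UUID, which includes bind mounts, may need an explicit `fsid=` so the server can build stable file handles for them. Only the root export has one here. A variant worth trying (an untested sketch; the numbers are arbitrary but must be unique):

```
/mnt/storage                    192.168.0.2(rw,async,fsid=0,nohide,no_subtree_check,insecure)
/mnt/storage/Projects           192.168.0.2(rw,async,fsid=1,nohide,no_subtree_check,insecure)
/mnt/storage/Stuff              192.168.0.2(rw,async,fsid=2,nohide,no_subtree_check,insecure)
/mnt/storage/Stuff/Specialstuff 192.168.0.2(rw,async,fsid=3,nohide,no_subtree_check,insecure)
```

After editing, `exportfs -ra` reloads the export table without restarting NFS.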

----------

## eccerr0r

Bumping because I've been having the same issue. I have a feeling there's a race condition in the kernel that makes NFS with bind mounts fail.

As a workaround I got rid of all bind mounts and used symlinks instead, and suddenly it started working.
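The workaround amounts to something like this (paths are illustrative and sketched in a scratch directory, not eccerr0r's actual layout):

```shell
# Sketch of the symlink workaround: instead of bind-mounting each array
# into the export tree, put a symlink there. Illustrative paths only.
set -e
root=$(mktemp -d)                     # stand-in for /mnt on a real system
mkdir -p "$root/A/Projects" "$root/B/Stuff" "$root/storage"
ln -s "$root/A/Projects" "$root/storage/Projects"
ln -s "$root/B/Stuff"    "$root/storage/Stuff"
# The links resolve to the same places the bind mounts pointed at:
readlink -f "$root/storage/Projects"
```

One caveat: NFS clients resolve symlinks on the client side, so the link targets must also be reachable from the client, which is presumably why rearranging the file structure was needed as well.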

----------

## thegeezer

i vaguely recall that when you export an NFS share, you will have issues if you mount anything under it, even a bind mount

i tried to find the clarifying web link, but the following is the best i can find; it basically says you have to list additional mounts in your exports... but looking closer, you have done this.

http://unix.stackexchange.com/questions/42131/how-to-properly-export-and-import-nfs-shares-that-have-subdirectories-as-mount-p

one thing that is not clear is "I can access /mnt/storage and /mnt/storage/Projects but it is impossible to access /mnt/storage/Stuff"   is this on the server or remotely over the NFS mounts ?

----------

## SeeksTheMoon

*thegeezer wrote:*

> one thing that is not clear is "I can access /mnt/storage and /mnt/storage/Projects but it is impossible to access /mnt/storage/Stuff"   is this on the server or remotely over the NFS mounts ?

This access problem happens on the client side. It works with sshfs (on the client), and there is no problem on the server directly: I can browse my directories and create, edit and delete files. But if I try to cd or ls into this directory on the NFS mount on the client, I get the "Stale NFS file handle" message.

Unless I overlooked something, the link tells me to do what I already did. The problem is somewhere inside this "submount". If it really is a race condition, I guess I will have to wait for a kernel where it is fixed, as that does not sound trivial.

The problem is that I cannot use symlinks here, because an FTP daemon serves parts of this directory structure and its chroot jail would not work with symlinks. I could create an additional export directory with symlinks just for NFS, but I don't like the idea of a workaround :D

I guess I will check out http://linux-nfs.org/ for additional information and maybe file a bug there if nobody somehow knows a solution.

----------

## eccerr0r

Yeah, I think there is a bug to be filed for bind mounts with no_subtree_check. This *should* work, but the kernel has a bug, and I doubt the fix is trivial.

I did away with it because of worries about the computational cost of keeping coherency, though it *should* be able to figure it out. The server doing the binding should announce changes to both sides, especially if one of them is NFS.

It's a can of worms I didn't want to open, so I did away with bind mounts and rearranged my file structure to accommodate.

Incidentally, the issue I was having is that I moved a machine from physical to virtual. I had users' home directories on the old physical machine served via NFS but bind-mounted for local use. To reduce overhead I moved all the homes from the now-virtual machine to the physical machine, but the virtual machine still bind-mounted things for its NFS exports, and I kept getting stale mounts. I got rid of the NFS exports and all bind mounts on the virtual machine, and the problem went away. Unfortunately this changes my backup plans: I was using NFS to back up that machine, and now I have had to resort to ssh/rsync.

----------

## MrUlterior

FWIW, I came across this problem after upgrading to NFSv4. For me the solution was to execute `umount -f <NFS DIR>` and then `mount <NFS DIR>`. Note that I had to run `umount -f` even though `mount` on the client was not reporting it as mounted.
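Spelled out, with the server and mount point from earlier in the thread standing in as placeholders, the sequence is:

```
umount -f /mnt/storage                  # force-unmount the stale handle
mount server:/mnt/storage /mnt/storage  # mount it fresh
```

If `-f` still refuses, `umount -l` (lazy unmount) detaches the mount point immediately and cleans up once it is no longer busy.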

----------

