umount bind of stale NFS
- by Paul Eisner
I have a problem removing mounts created with mount -o bind from a locally mounted NFS folder. Assume the following mount structure:
NFS mounted directory:
$ mount -o rw,soft,tcp,intr,timeo=10,retrans=2,retry=1 \
10.20.0.1:/srv/source /srv/nfs-source
Bound directory:
$ mount -o bind /srv/nfs-source/sub1 /srv/bind-target/sub1
This results in the following mount table:
$ mount
/dev/sda1 on / type ext3 (rw,errors=remount-ro)
# ...
10.20.0.1:/srv/source on /srv/nfs-source type nfs (rw,soft,tcp,intr,timeo=10,retrans=2,retry=1,addr=10.20.0.100)
/srv/nfs-source/sub1 on /srv/bind-target/sub1 type none (rw,bind)
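For scripting against a layout like this, it can help to list every mount point below a directory in an order that is safe for unmounting (nested mounts first). The following is only a minimal sketch: the function name mounts_below is hypothetical, and it assumes a mount table in the /proc/mounts format, where the second field is the mount point.

```shell
# Print mount points at or below a directory, deepest first.
#   $1: directory prefix, e.g. /srv
#   $2: mount table in /proc/mounts format (assumed default: /proc/mounts)
mounts_below() {
    # Field 2 of each line is the mount point; keep exact matches and
    # anything strictly below the prefix. Reverse byte-order sorting puts
    # a nested mount point before its parent.
    awk -v p="$1" '$2 == p || index($2, p "/") == 1 { print $2 }' \
        "${2:-/proc/mounts}" | LC_ALL=C sort -r
}

# Example call (prints one mount point per line):
#   mounts_below /srv
```

Reading the table from the kernel rather than from the mount command avoids depending on /etc/mtab, which may be out of date after a failed umount.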
If the server (10.20.0.1) goes down (e.g. ifdown eth0), the handles become stale, which is expected.
I can now unmount the NFS mount with force:
$ umount -f /srv/nfs-source
This takes a few seconds but works without any problems. However, I cannot unmount the bound directory at /srv/bind-target/sub1. The forced umount results in:
$ umount -f /srv/bind-target/sub1
umount2: Stale NFS file handle
umount: /srv/bind-target/sub1: Stale NFS file handle
umount2: Stale NFS file handle
Here is a trace: http://pastebin.com/ipvvrVmB
I've tried unmounting the sub-directories beforehand and looked for processes accessing anything within the NFS or bind mounts (there are none).
lsof also complains:
$ lsof -n
lsof: WARNING: can't stat() nfs file system /srv/nfs-source
Output information may be incomplete.
lsof: WARNING: can't stat() nfs file system /srv/bind-target/sub1 (deleted)
Output information may be incomplete.
lsof: WARNING: can't stat() nfs file system /srv/bind-target/
Output information may be incomplete.
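Since lsof cannot stat() the stale file systems, one way to double-check for processes is to read the symlinks under /proc directly. The sketch below is an assumption-laden illustration, not a verified fix: the function name pids_using_path is hypothetical, and it relies on readlink only reading the link text of /proc/PID/cwd, /proc/PID/root and /proc/PID/fd/*, so it should not need to touch the stale mount itself. Run it as root to see all processes.

```shell
# Print PIDs whose cwd, root, or open file descriptors point at or below a path.
#   $1: path to look for, e.g. /srv/nfs-source
#   $2: proc directory (assumed default: /proc)
pids_using_path() {
    target=$1
    procdir=${2:-/proc}
    for link in "$procdir"/[0-9]*/cwd "$procdir"/[0-9]*/root "$procdir"/[0-9]*/fd/*; do
        # readlink returns the link text without resolving or stat()ing the target;
        # unreadable or unmatched entries are skipped.
        dest=$(readlink "$link" 2>/dev/null) || continue
        case $dest in
            "$target"|"$target"/*)
                pid=${link#"$procdir"/}   # strip the proc directory prefix
                echo "${pid%%/*}"         # keep only the PID component
                ;;
        esac
    done | sort -un
}

# Example calls:
#   pids_using_path /srv/nfs-source
#   pids_using_path /srv/bind-target/sub1
```

An empty result here is consistent with the observation that no process holds the mounts open, though memory-mapped files and kernel-internal references would not show up this way.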
I've tried the recent stable Linux kernels 3.2.17, 3.2.19 and 3.3.8. I cannot use 3.4.x because I need the grsecurity patch, which does not support it yet (grsecurity was not patched in for the tests above).
My nfs-utils version is 1.2.2 (Debian stable).
Does anybody have an idea how I can either:
force the unmount some other way? (Any dirty trick is welcome; data loss or damage is negligible at this point.)
use something else instead of mount -o bind? (I cannot use soft links because the mounted directories will be used in a chroot; bindfs via FUSE is far too slow to be an option.)
Thanks,
Paul
Update 1
With 2.6.32.59, unmounting the (stale) sub-mounts works just fine. It seems to be a kernel regression.
The above tests were with NFSv3. Additional tests with NFSv4 showed no change.
Update 2
We have now tested multiple 2.6 and 3.x kernels and are sure that this was introduced in 3.0.x. We will file a bug report; hopefully they figure it out.