Daemons die with bus error when their binaries live on NFS
- by mbac32768
We have some daemons executing on a number of hosts.
The daemon executables are very large binaries hosted on NFS.
When the binaries are updated on the NFS server, the daemons that were already running sometimes drop dead with a bus error. My assumption is that the NFS server replaces the binaries in a way that's invisible to the VFS layer on the NFS clients, so the clients end up paging in from the updated binary, which of course leads to madness.
We tried moving the new binaries into place with mv instead of overwriting the old ones with cp, but that doesn't seem to fix it.
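To be concrete about what "moving into place" means, here is a minimal sketch in Python. The function name install_atomically and the ".install-" temp prefix are made up for illustration and are not our actual deploy tooling. It assumes the temp file is created in the same directory as the target, so the final step is a real rename() rather than a copy across filesystems. On a local filesystem this leaves the old inode alive for anything that still has it open; over NFS that may not hold, since the server can drop the old inode without knowing a remote client still has it mapped, which would be consistent with what we're seeing.

```python
import os
import shutil
import tempfile


def install_atomically(src, dest):
    """Replace dest with a copy of src via a same-directory rename.

    Hypothetical helper: the new contents get a fresh inode, and the
    old inode is never written to.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Temp file must live next to dest so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".install-")
    try:
        with os.fdopen(fd, "wb") as tmp, open(src, "rb") as new:
            shutil.copyfileobj(new, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before the rename
        shutil.copymode(src, tmp_path)  # keep the executable bits
        os.replace(tmp_path, dest)  # atomic rename over the old name
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A plain cp over the existing file, by contrast, rewrites the old inode in place, so any process still mapping it sees the contents change underneath it.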
I'm considering simply mlock()'ing the binary in the daemon startup script, but surely there are magic NFS options or semantics that we should be abusing. Is there a better way to fix this?
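For the mlock() idea, this is roughly what I have in mind, sketched in Python with ctypes for brevity (the real daemons would make the same call from their own startup code). The helper name lock_current_pages is made up, and MCL_CURRENT = 1 is the Linux value on x86 and ARM, which is an assumption about the target hosts. mlockall(MCL_CURRENT) pins every page currently mapped into the process, including the file-backed text pages, so it locks a lot more than just the binary, and it fails unless the process has CAP_IPC_LOCK or a large enough RLIMIT_MEMLOCK.

```python
import ctypes
import errno

# Assumed value from Linux <sys/mman.h> on x86/ARM; some architectures differ.
MCL_CURRENT = 1


def lock_current_pages():
    """Try to pin all currently mapped pages of this process in RAM.

    Returns (True, 0) on success, or (False, errno_value) on failure.
    """
    libc = ctypes.CDLL(None, use_errno=True)  # handle to the already-loaded libc
    if libc.mlockall(MCL_CURRENT) == 0:
        return True, 0
    return False, ctypes.get_errno()


if __name__ == "__main__":
    ok, err = lock_current_pages()
    if ok:
        print("pages locked")
    else:
        print("mlockall failed:", errno.errorcode.get(err, err))
```

Once the pages are resident and locked, the process should no longer need to fault them in from the NFS-backed file, which is the whole point; whether that is sane for very large binaries is another matter.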