Stale NFS File Handle
By Julien Gabel on Wednesday 7 March 2007, 19:29 - Solaris
How do you decode this cryptic NFS write error on a Solaris client system, found in
the /var/adm/messages log file?
Feb 26 11:54:26 clssunp120 nfs: [ID 626546 kern.notice] NFS write error on \
  host nfs1_prd: Stale NFS file handle.
Feb 26 11:54:26 clssunp120 nfs: [ID 702911 kern.notice] (file handle: 40260001 \
  ffffffff a0000 1dc24 d5 a0000 2 0)
The difficulty is that the file handle format is used and interpreted internally by the NFS client subsystem, and it has evolved with each release of the operating system. Luckily, we need only two of the eight fields printed in the record. The first is the file system id (or device id), which is generally the first number of the file handle. The second is the inode number, which is found in the fourth or fifth field of the file handle (note that the inode is reported in hexadecimal).
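To make the field positions concrete, here is a minimal sketch that pulls both values out of a log record. It assumes the record has been joined back into a single line (without the `\` continuation) and that the inode sits in the fourth field, as in the example above. The function name `fh_fields` is hypothetical, not a Solaris tool.

```shell
#!/bin/sh
# fh_fields: print the file system id and the decimal inode number found in
# a "(file handle: ...)" record. Hypothetical helper for illustration only.
# Assumption: the inode is the fourth word of the handle.
fh_fields() {
    # Keep only the words between "file handle:" and the closing parenthesis.
    fh=`echo "$1" | sed -n 's/.*file handle: *\([^)]*\)).*/\1/p'`
    [ -n "$fh" ] || return 1

    # Split the handle into positional parameters: $1 = fsid, $4 = inode (hex).
    set -- $fh
    inum=`printf '%d' "0x$4"`
    echo "fsid=$1 inode=$inum"
}

# prints: fsid=40260001 inode=121892
fh_fields "Feb 26 11:54:26 clssunp120 nfs: [ID 702911 kern.notice] (file handle: 40260001 ffffffff a0000 1dc24 d5 a0000 2 0)"
```

On a release where the inode is the fifth field instead, `$4` would have to become `$5`.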
So, consider the following file handle:
40260001 ffffffff a0000 1dc24 d5 a0000 2 0
We can now identify which file system is the culprit:
# grep 40260001 /etc/mnttab
nfssrv:/t/tools/SunOS/isa /tools/isa nfs rw,intr,soft,dev=40260001 1172569690
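A plain grep can also match the id elsewhere on a line. As a rough sketch, the lookup could be restricted to the `dev=` mount option and print only the mount point. The helper name `mount_for_dev` and its optional second argument (a table file other than /etc/mnttab) are assumptions made for illustration; on Solaris the default awk may not accept `-v`, in which case nawk or /usr/xpg4/bin/awk would be needed.

```shell
#!/bin/sh
# mount_for_dev: print the mount point whose option field contains dev=<id>.
# Hypothetical helper; $2 optionally names a table file (default /etc/mnttab)
# so the function can be tried on a copy.
mount_for_dev() {
    awk -v id="$1" '
        {
            # Field 4 holds the comma-separated mount options.
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++)
                if (opt[i] == "dev=" id) { print $2; exit }
        }' "${2:-/etc/mnttab}"
}

# Example: mount_for_dev 40260001
```

With the mnttab entry shown above, this would print /tools/isa. Mount points containing spaces are not handled by this sketch.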
Translate the inode number to decimal, to be used in conjunction with the
find(1) utility:
# echo "ibase=16; `echo 1dc24 | tr '[:lower:]' '[:upper:]'`" | bc
121892
Finally, the name of the file that a process is currently trying to reference can be found with:
# find /tools/isa -mount -inum 121892 -print 2> /dev/null
Search for info doc 73152 on the SunSolve web site for more information on this subject (you must have a registered Sun support customer account to view this document).

Comments
#!/bin/sh
#
# fhfind: takes the expanded filehandle string from an
# NFS write error or stale filehandle message and maps
# it to a pathname on the server.
#
# The device id in the filehandle is used to locate the
# filesystem mountpoint. This is then used as the starting
# point for a find for the file with the inode number
# extracted from the filehandle.
#
# If the filesystem is big - the find can take a long time.
# Since there's no way to terminate the find upon finding
# the file, you might need to kill fhfind after it prints
# the path.
#
if [ $# -ne 8 ]; then
    echo
    echo "Usage: fhfind <filehandle> e.g."
    echo
    echo "    fhfind 1540002 2 a0000 4df07 48df4455 a0000 2 25d1121d"
    exit 1
fi

# Filesystem ID
FSID1=$1
FSID2=$2

# FID for the file
FFID1=$3
FFID2=`echo "$4" | tr '[a-z]' '[A-Z]'` # uppercase for bc
FFID3=$5

# FID for the export point (not used)
EFID1=$6
EFID2=$7
EFID3=$8

# Use the device id to find the /etc/mnttab
# entry and thus the mountpoint for the filesystem.
E=`grep "$FSID1" /etc/mnttab`
if [ "$E" = "" ]; then
    echo
    echo "Cannot find filesystem for devid $FSID1"
    exit 1
fi

# Split the mnttab entry into words; the second one is the mountpoint.
set - $E
MNTPNT=$2

INUM=`echo "ibase=16; $FFID2" | bc` # hex to decimal for find

echo
echo "Now searching $MNTPNT for inode number $INUM"
echo
find "$MNTPNT" -mount -inum "$INUM" -print 2>/dev/null
> there's no way to terminate the find upon finding the file
I am not sure about Solaris tools, but GNU head will close its stdin after completion, which will result in a SIGPIPE, which kills the GNU find.
So, the following idiom should do the trick:
find ... | head -1
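One caveat with the pipe: find only receives SIGPIPE when it next writes to the closed pipe, so if the inode has a single link and nothing else matches, find may still walk the rest of the tree. Where GNU find is available (an assumption; it is not the stock Solaris find), its `-quit` action stops at the first match. A minimal sketch, with the hypothetical helper name `first_by_inum`:

```shell
#!/bin/sh
# first_by_inum: print the first path under directory $1 whose inode number
# is $2 (decimal), then stop searching.
# Assumption: find is GNU find; -quit is a GNU extension.
first_by_inum() {
    find "$1" -mount -inum "$2" -print -quit 2>/dev/null
}

# Example: first_by_inum /tools/isa 121892
```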
Thanks for all this useful info.
I face a quirk: my file systems are automounted.
The device id I get seems constant, but doesn't match anything in /etc/mnttab: 62000000.
Is this to be expected? Can I get another clue of where to look for my inodes?
> Is this to be expected?
Sadly yes, it is a common case, too.
> Can I get another clue of where to look for my inodes?
Well, in these cases I generally tend to use pfiles or dtrace (and possibly lsof), but it is relatively tedious work... good luck.
Thanks for the info. Here's a tip:
$ printf "%d\n" 0x1dc24
121892
$ printf "%x\n" 121892
1dc24
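Following the tip above, the two conversions could be wrapped as small helpers, which avoids the tr and bc pipeline entirely. The function names `hex2dec` and `dec2hex` are hypothetical; the sketch assumes a printf that accepts a 0x prefix, as the transcript above shows.

```shell
#!/bin/sh
# hex2dec: convert a hexadecimal inode number (no 0x prefix) to decimal.
hex2dec() {
    printf '%d\n' "0x$1"
}

# dec2hex: convert a decimal inode number back to lowercase hexadecimal.
dec2hex() {
    printf '%x\n' "$1"
}

hex2dec 1dc24    # prints 121892
dec2hex 121892   # prints 1dc24
```

Unlike bc with ibase=16, printf accepts the hexadecimal digits in either case, so no uppercase conversion is needed.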