blog'o thnet


Saturday 18 October 2008

Discrepancies Between df And du Outputs

As an SA, it is not uncommon to regularly receive questions about big differences between the du and df outputs on a UFS file system. (For ZFS-specific considerations, please see the ZFS FAQ.)

The du utility reports the sum of space allocated to all files in the file hierarchy rooted in the directory plus the space allocated to the directory itself. The df utility reports the amount of disk space occupied by a mounted file system.

When a file is removed from the file system, i.e. unlinked (its hard link count goes to zero), it no longer has a directory entry, so du stops counting its space. The blocks are not actually freed, however, until all references to the file (open file descriptors) are closed, so df keeps reporting the space as used. To find the guilty process, one can follow the information found in the SunManagers Frequently Asked Questions. The walkthrough in the rest of this post uses a slightly different method to identify the process currently holding the open descriptor to the deleted file.
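The unlink-while-open behavior can be reproduced in isolation with a few lines of portable shell. This is only a minimal sketch: the function name `demo_unlinked` is made up for illustration, and it assumes `mktemp` is available on the system.

```shell
# Hypothetical demo: a file stays alive after unlink while a descriptor is open.
# The body runs in a subshell so descriptor 3 of the caller is left untouched.
demo_unlinked() (
    tmp=$(mktemp) || exit 1
    printf 'still here\n' > "$tmp"
    exec 3< "$tmp"          # keep descriptor 3 open on the file
    rm -f "$tmp"            # unlink: the directory entry disappears
    if [ -e "$tmp" ]; then
        echo "path still exists"
    else
        echo "path gone"
    fi
    cat <&3                 # the data is still readable through the descriptor
    exec 3<&-               # closing the last reference lets the blocks be freed
)
```

Calling `demo_unlinked` prints `path gone` followed by `still here`: the name is gone (so du cannot see the file), but the data, and therefore the space, remains until the descriptor is closed.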

Find the file which has been unlinked through the procfs interface:

# find /proc/*/fd \( -type f -a ! -size 0 -a -links 0 \) -print | xargs \ls -li
 415975 --w-------   0 user  group  2125803025 Oct 15 23:59 /proc/1252/fd/3

Optionally, get more details about the process holding it:

# pargs -c 1252
1252:   rvd.basic -reliability 5 -listen tcp:9876 -logfile /path/to/log/rvd_9876.l
argv[0]: rvd.basic
argv[1]: -reliability
argv[2]: 5
argv[3]: -listen
argv[4]: tcp:9876
argv[5]: -logfile
argv[6]: /path/to/log/rvd_9876.log

Check whether the content of the unlinked file tells you what it is:

# tail /proc/1252/fd/3
-------------------------------------------------------------------------------
2008-10-15 23:59:32.002116 - [MSG] BBG_Transmitter_class.cc, line 792 (thread 25087:4)
[4060] Sent a heartbeat
-------------------------------------------------------------------------------
BBG_Transmitter_class.cc: [4111] No activity detected. Send a Heartbeat message
-------------------------------------------------------------------------------
2008-10-15 23:59:32.134829 - [MSG] BBG_Transmitter_class.cc, line 1138 (thread 25087:4)
[4065] Heartbeat acknowledged by Bloomberg

You can correlate the size of the removed, but still referenced, file with the gap between the space reported by the du and df tools:

# df -k /path/to
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d5       6017990 5874592   83219    99%    /path/to
# du -sk /path/to
3791632 /path/to
# echo "(5874616-3791632)*1024" | bc
2132975616
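The same correlation can be wrapped in two small helpers. This is a sketch only: the names `hidden_bytes` and `df_used_kb` are hypothetical, and the parser assumes the `df -k` output fits on a header line plus one data line (long device names can make df wrap the line, which this does not handle).

```shell
# Hypothetical helper: bytes that df counts as used but du cannot see.
# Arguments: "used" KB from df -k, then total KB from du -sk.
hidden_bytes() {
    df_used=$1
    du_total=$2
    echo $(( (df_used - du_total) * 1024 ))
}

# Hypothetical helper: extract the "used" column (3rd field of the 2nd line)
# from `df -k` output read on standard input.
df_used_kb() {
    awk 'NR == 2 { print $3 }'
}
```

A possible use would be `hidden_bytes "$(df -k /path/to | df_used_kb)" "$(du -sk /path/to | awk '{ print $1 }')"`. On a busy file system the result is only approximate, because the two commands are not run at the same instant.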

So we have found the ~2GB log file that was still held open (in use) by a process. There are two ways to get the space back:

  1. Truncate the unlinked file (quick workaround).
  2. Restart the corresponding program cleanly (better option).
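For the first option, truncation amounts to opening the file for writing with truncation, which the shell's `>` redirection does. The helper below is a minimal sketch with a made-up name, `truncate_path`. On Solaris one would presumably pass it the procfs path found earlier (here `/proc/1252/fd/3`), assuming the caller is allowed to open that entry for writing.

```shell
# Hypothetical helper: truncate the file behind the given path to zero length.
# ":" is the shell no-op; the redirection alone performs the truncation.
truncate_path() {
    : > "$1"
}
```

One caveat: if the owning process did not open its log in append mode, it keeps writing at its old offset, so the file becomes sparse and its apparent size stays large even though the blocks have been released.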

Use whichever solution best fits the needs of your environment.

Monday 5 February 2007

Available Space Count Using Numerous Little Files

Using a big bundled software suite such as the IBM WebSphere Java Application Server can sometimes lead to confusion when determining how much space is actually available.

In fact, we hit a particular case where du(1) and df(1m) both said that some space was not in use but, when trying to allocate it, we simply could not. Here is the description of this curious behavior:

  • Solaris 9 (Generic_112233-11) using SVM with soft partitions.
  • The information from df(1m) looks wrong, as we can't use the reported free space: 1GB of 8GB seems free... but is not usable.
  • du(1) and df(1m) agree with each other, and their results are very similar (note: minfree is set to 1%).

Some notes now:

  • File system is consistent (passes fsck(1m) happily).
  • There is no file descriptor currently open on this file system.
  • No data is stored under the directory on which the problematic file system is mounted.
  • Tested with and without disk quota, and with and without logging options.
  • The file system seems very fragmented, see below.

Here is a test case showing the difference between what du(1) and df(1m) report and the reality. First, create and configure the test file system, and populate it with appropriate (problematic) data:

# metainit d107 -p d7 100m
# newfs d7 -m d17 d27 1
# grep d107 /etc/vfstab
/dev/md/dsk/d107 /dev/md/rdsk/d107 /t/data/WebSphere ufs 1 yes logging
# mount /t/data/WebSphere
# tunefs -m 1 /t/data/WebSphere
# cd /t/data/WebSphere
# gzip -dc /tmp/testcasedata.tar.gz | tar xf -
# rm testcasedata.tar.gz && lockfs -af

Now we can look at what the common UNIX utilities report, and calculate the exact available space with the help of the fstyp(1m) command:

# df -k /t/data/WebSphere
Filesystem        kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d107   95207   28416   65839    31%    /t/data/WebSphere
# du -sk /t/data/WebSphere
27384   /t/data/WebSphere
#
# fstyp -v /dev/md/dsk/d107 | head -15
ufs
magic   11954   format  dynamic time    Fri Jan 14 14:47:00 2005
sblkno  16      cblkno  24      iblkno  32      dblkno  2408
sbsize  2048    cgsize  8192    cgoffset 216    cgmask  0xffffffe0
ncg     3       size    102400  blocks  95207
bsize   8192    shift   13      mask    0xffffe000
fsize   1024    shift   10      mask    0xfffffc00
frag    8       shift   3       fsbtodb 1
minfree 1%      maxbpg  2048    optim   time
maxcontig 16    rotdelay 0ms    rps     167
csaddr  2408    cssize  1024    shift   9       mask    0xfffffe00
ntrak   24      nsect   424     spc     10176   ncyl    21
cpg     8       bpg     5088    fpg     40704   ipg     19008
nindir  2048    inopb   64      nspf    2
nbfree  5024    ndir    151     nifree  50438   nffree  26599
#
# echo "(26599*100)/95207" | bc
27
# echo "8*5024" | bc
40192

Well. Now, we can say:

  1. The fragmentation ratio for this file system is pretty high: the free fragments (nffree) amount to 27% of its size.
  2. Although it seems that only ~40MB (the nbfree whole 8KB blocks) are really available for use, the df(1m) utility reports ~65MB. That is an overestimation of about 25MB, roughly 27% of the file system size, in this (not very high volume) test case! Wow...
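The figures above can be recomputed from the fstyp(1m) fields in one place. The sketch below uses a hypothetical function name, `ufs_space`, and rests on two assumptions: that fsize is 1024 bytes, so fragment counts are directly in KB, and that df derives "avail" as free whole blocks plus free fragments minus the minfree reserve. That second assumption is not taken from the documentation, but it does reproduce the 65839 figure shown by df here.

```shell
# Hypothetical helper. Arguments: nbfree nffree frag blocks minfree(%)
# Assumes 1KB fragments (fsize 1024), so all results are in KB.
ufs_space() {
    nbfree=$1
    nffree=$2
    frag=$3
    blocks=$4
    minfree=$5
    whole=$(( nbfree * frag ))               # KB available as whole blocks
    reserve=$(( blocks * minfree / 100 ))    # KB held back by minfree
    avail=$(( whole + nffree - reserve ))    # assumed df "avail" derivation
    frag_pct=$(( nffree * 100 / blocks ))    # free fragments vs. total size
    echo "whole_blocks_kb=$whole df_avail_kb=$avail frag_pct=$frag_pct"
}
```

With the values from this test case, `ufs_space 5024 26599 8 95207 1` prints `whole_blocks_kb=40192 df_avail_kb=65839 frag_pct=27`, which suggests that the gap between the two figures is made up of the scattered free fragments.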

The bad news is that the problem is due to a high number of very small files shipped with the third-party software from IBM; they are locale files, each smaller than 1KB. And because this is a third-party component, we can't do anything about that. For reference, the size of a single file system block is 8192 bytes, at least on the sun4u processor architecture (see the mkfs_ufs(1m) documentation for more details).

The good news is that the problem may be worked around by changing the file system's optimization tunable from time to space (please refer to the tunefs(1m) manual page for more information). The small downside is that the data must be rewritten in order to benefit from this modification, for example using a ufsdump(1m) and ufsrestore(1m) cycle.
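To gauge how exposed a directory tree is to this small-file effect, one can count the files that fit in a single 1KB fragment. This is a rough sketch with a made-up function name, `count_small_files`. It relies on the standard find(1) `-size` primary, which counts in 512-byte units rounded up, so `-size -3` matches files of at most 1024 bytes. File names containing newlines would be miscounted.

```shell
# Hypothetical helper: count regular files of at most 1024 bytes under a directory.
count_small_files() {
    # -size -3: fewer than three 512-byte units, i.e. at most 1024 bytes.
    # tr strips the padding some wc implementations add around the count.
    find "$1" -type f -size -3 | wc -l | tr -d ' '
}
```

For the test case above, this would be run as `count_small_files /t/data/WebSphere`.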