For several months, I have been encrypting the swap device (which is a ZFS volume), and thus getting an encrypted /tmp. This worked fine with Solaris 11 Express, but I ran into a strange behavior in Solaris 11 EA: the swap device simply... well, disappears.

Here is what I found after two boots, on several machines:

# swap -l
No swap devices configured

# zfs list -t volume
NAME         USED  AVAIL  REFER  MOUNTPOINT
rpool/dump  32.8G   240G  31.8G  -

# grep swap /etc/vfstab
swap            -               /tmp            tmpfs   -       yes     -
/dev/zvol/dsk/rpool/swap        -               -               swap    -       no      encrypted

So, the rpool/swap dataset has disappeared. I am sure I did not destroy it myself, especially since this happened on multiple servers. Nevertheless, I found this in the history of the zpool command:

# zpool history | grep destroy
[...]
2011-10-05.10:22:49 zfs destroy rpool/swap

# last reboot | head -2
reboot    system boot                   Wed Oct  5 10:23
reboot    system down                   Wed Oct  5 10:20

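So, this problem seems to be related to something that happens at boot time: the destroy at 10:22:49 falls between the "system down" record at 10:20 and the "system boot" record at 10:23. What do the logs of the SMF services have to say about that?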

# find /var/svc/log -print | xargs grep -i swap
/var/svc/log/system-filesystem-usr:default.log:cannot create 'rpool/swap': pool must be upgraded to set this property or value
/var/svc/log/system-filesystem-usr:default.log:cannot open 'rpool/swap': dataset does not exist
/var/svc/log/system-filesystem-usr:default.log:cannot create 'rpool/swap': pool must be upgraded to set this property or value
/var/svc/log/system-filesystem-usr:default.log:cannot open 'rpool/swap': dataset does not exist

# tail -3 /var/svc/log/system-filesystem-usr:default.log
[ Oct  5 12:00:05 Executing start method ("/lib/svc/method/fs-usr"). ]
cannot create 'rpool/swap': pool must be upgraded to set this property or value
[ Oct  5 12:00:13 Method "start" exited with status 0. ]

Ouch, what happened here? The message is interesting, but a little misleading: these are fresh Solaris 11 EA installations, so the pools and datasets are all up to date:

# zpool upgrade && zfs upgrade
This system is currently running ZFS pool version 33.
All pools are formatted using this version.
This system is currently running ZFS filesystem version 5.
All filesystems are formatted with the current version.

So, it seems that the rpool/swap device is re-created at boot time, and that for some reason this does not work as expected. Here is an attempt to find out where the device is re-created and why it fails.

# find /lib/svc/method -print | xargs grep -i sbin/swapadd
/lib/svc/method/fs-usr:/usr/sbin/swapadd -1
/lib/svc/method/nfs-client:     /usr/sbin/swapadd
/lib/svc/method/fs-local:/usr/sbin/swapadd >/dev/null 2>&1

# grep "zfs destroy" /usr/sbin/swapadd
                zfs destroy $zvol > /dev/null 2>&1

# sed -n '/zfs create/,/\$zvol/p' /usr/sbin/swapadd
        zfs create -V $volsize -o volblocksize=`/usr/bin/pagesize` \
            -o primarycache=$primarycache -o secondarycache=$secondarycache \
            -o encryption=$encryption -o keysource=raw,file:///dev/random $zvol

So, the re-creation of rpool/swap at boot time only happens when using an encrypted volume. After a bit of digging, here is what I found. At the first boot, this is the command used to create the encrypted volume:

zfs create -V 4G -o volblocksize=8192 -o primarycache=metadata -o secondarycache=all -o encryption=on -o keysource=raw,file:///dev/random rpool/swap

But on the second boot, a slightly different command is used:

zfs create -V 4G -o volblocksize=8192 -o primarycache=metadata -o secondarycache=all -o encryption=aes-128-ctr -o keysource=raw,file:///dev/random rpool/swap
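To make the difference between the two boots easier to see, here is a small dry-run sketch. It only prints what the `zfs create` template found in swapadd expands to for a given set of values, and never calls zfs itself. The function name and its argument order are my own, chosen for illustration; they are not taken from swapadd.

```shell
# build_swap_create_cmd VOLSIZE BLOCKSIZE PRIMARYCACHE SECONDARYCACHE ENCRYPTION ZVOL
# Hypothetical helper (not part of swapadd): prints the expanded command
# line and executes nothing.
build_swap_create_cmd() {
    printf 'zfs create -V %s -o volblocksize=%s -o primarycache=%s -o secondarycache=%s -o encryption=%s -o keysource=raw,file:///dev/random %s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}

# Values observed at the first boot.
build_swap_create_cmd 4G 8192 metadata all on rpool/swap

# Values observed at the second boot: only the encryption value differs.
build_swap_create_cmd 4G 8192 metadata all aes-128-ctr rpool/swap
```

The two printed lines match the two commands shown above, which makes it clear that the template is the same and only the value substituted for the encryption property changes.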

This is because the arguments passed to the command are saved from the settings of the volume just before it is deleted, and then restored. As mentioned in the zfs(1m) manual page, only the following values are supported for the encryption property, so the one being set is not valid (the error message saying that the pool must be upgraded to set this property or value is a little clearer now):

encryption=off | on | aes-128-ccm | aes-192-ccm | aes-256-ccm | aes-128-gcm | aes-192-gcm | aes-256-gcm
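As a quick sanity check, the list above can be turned into a small membership test. This is only a sketch: the variable and function names are hypothetical, and the list of values is copied from the manual page excerpt above rather than queried from the system.

```shell
# Values accepted for the encryption property, copied from the
# zfs(1m) excerpt above (assumption: the list is complete for this release).
SUPPORTED_ENCRYPTION="off on aes-128-ccm aes-192-ccm aes-256-ccm aes-128-gcm aes-192-gcm aes-256-gcm"

# is_supported_encryption VALUE
# Hypothetical helper: returns 0 if VALUE is in the list, 1 otherwise.
is_supported_encryption() {
    for candidate in $SUPPORTED_ENCRYPTION; do
        if [ "$candidate" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Check the value used at the first boot and the one used at the second boot.
for value in on aes-128-ctr; do
    if is_supported_encryption "$value"; then
        echo "$value: accepted"
    else
        echo "$value: rejected"
    fi
done
```

With the list as given, `on` is accepted and `aes-128-ctr` is rejected, which is consistent with the failure seen in the SMF log.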

The question is: how can this happen? Where does this algorithm come from? The answer is simple: it seems to be the swap(1m) command that alters some properties of the rpool/swap volume:

# swap -d /dev/zvol/dsk/rpool/swap
# zfs destroy rpool/swap
# zfs create -V 4G -o volblocksize=8192 -o primarycache=metadata -o secondarycache=all -o encryption=on -o keysource=raw,file:///dev/random rpool/swap
# zfs list -H -o type,volsize,volblocksize,encryption rpool/swap
volume  4G      8K      on
# swap -1 -a /dev/zvol/dsk/rpool/swap
# zfs list -H -o type,volsize,volblocksize,encryption rpool/swap
volume  4G      1M      aes-128-ctr
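To spot this kind of change without comparing the two lines by eye, the before and after output can be diffed field by field. The following is a minimal sketch, assuming a POSIX awk (nawk on Solaris) and the same four columns as in the `zfs list -H -o type,volsize,volblocksize,encryption` calls above; the function name is hypothetical.

```shell
# report_property_drift BEFORE AFTER
# BEFORE and AFTER are single lines as printed by:
#   zfs list -H -o type,volsize,volblocksize,encryption rpool/swap
# Hypothetical helper: prints one line per property whose value changed.
report_property_drift() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        BEGIN { split("type volsize volblocksize encryption", name, " ") }
        # First line: remember every field of the "before" snapshot.
        NR == 1 { for (i = 1; i <= NF; i++) before[i] = $i; n = NF }
        # Second line: compare as strings and report the differences.
        NR == 2 {
            for (i = 1; i <= n; i++)
                if (($i "") != (before[i] ""))
                    printf "%s: %s -> %s\n", name[i], before[i], $i
        }
    '
}

# Usage with the two lines captured above (no zfs call is made here).
report_property_drift "volume 4G 8K on" "volume 4G 1M aes-128-ctr"
```

On a live system, the two arguments would be captured with the `zfs list` command before and after running `swap -1 -a`.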

Not only is the algorithm changed to something not supported (yet?), but the volblocksize property is modified as well, from 8K to 1M. This was not the case on Solaris 11 Express 2010.11.

I hope someone can help me with this, and that this is a known bug which is already (or will quickly be) addressed, in particular for Solaris 11 GA. I have already posted a comment on Darren Moffat's blog, just in case it helps a bit.