blog'o thnet


Tag - Bug


Thursday 22 March 2012

Problem with the beadm utility inside a Zone

Although the beadm utility is now supported inside a non-global zone, I found a case where it does not seem to work as expected. Connected inside a zone (say, myzone), I can create a new BE (say, solaris-1), activate it, and reboot into it.

ZG# zoneadm list -vc
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              solaris  shared
   4 myzone           running    /zones/myzone                  solaris  excl

ZG# zlogin myzone
[Connected to zone 'myzone' pts/7]
Oracle Corporation      SunOS 5.11      11.0    February 2012

ZNG# beadm list
BE      Active Mountpoint Space   Policy Created
--      ------ ---------- -----   ------ -------
solaris NR     /          917.06M static 2012-03-21 14:04

ZNG# beadm create solaris-1

ZNG# beadm activate solaris-1

ZNG# beadm list
BE        Active Mountpoint Space   Policy Created
--        ------ ---------- -----   ------ -------
solaris   N      /          43.0K   static 2012-03-21 14:04
solaris-1 R      -          917.19M static 2012-03-21 16:48

ZNG# init 6
[Connection to zone 'myzone' pts/9 closed]

ZG# zlogin myzone
[Connected to zone 'myzone' pts/7]
Oracle Corporation      SunOS 5.11      11.0    February 2012

ZNG# beadm list
BE        Active Mountpoint Space   Policy Created
--        ------ ---------- -----   ------ -------
solaris   -      -          3.06M   static 2012-03-21 14:04
solaris-1 NR     /          979.46M static 2012-03-21 17:56

Everything works very well: I did not hit any problem and can do whatever I want after that: fall back to the other BE, keep using this one and install new packages, create more BEs, and so on.
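
Falling back, for instance, is just a matter of re-activating the original BE and rebooting the zone (a minimal sketch, reusing the BE names above):

ZNG# beadm activate solaris
ZNG# init 6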

But if I try to create a new BE automagically from the pkg utility, the resulting BE does not seem to have everything it should.

ZNG# pkg install --require-new-be site/application/testpkg
           Packages to install:   1
       Create boot environment: Yes
Create backup boot environment:  No

DOWNLOAD                                  PKGS       FILES    XFER (MB)
Completed                                  1/1       20/20      0.0/0.0

PHASE                                        ACTIONS
Install Phase                                  73/73

PHASE                                          ITEMS
Package State Update Phase                       1/1
Image State Update Phase                         2/2
pkg: '/sbin/bootadm update-archive -R /tmp/tmpCqUVIT' failed.
with a return code of 1.

A clone of solaris exists and has been updated and activated.
On the next boot the Boot Environment solaris-1 will be
mounted on '/'.  Reboot when ready to switch to this updated BE.

ZNG# beadm list
BE        Active Mountpoint Space   Policy Created
--        ------ ---------- -----   ------ -------
solaris   N      /          102.0K  static 2012-03-21 14:04
solaris-1 R      -          950.52M static 2012-03-21 17:39

So a new BE was created, but this time something is wrong. Let's see what is missing (here on another run of the same test, where the pkg-created BE is named newsolaris):

ZNG# beadm list
BE         Active Mountpoint Space   Policy Created
--         ------ ---------- -----   ------ -------
newsolaris R      -          864.50M static 2012-03-21 17:52
solaris    N      /          80.84M  static 2012-03-21 14:04
ZNG# beadm mount newsolaris /mnt

ZNG# bootadm  update-archive -vn -R /mnt
file not found: /mnt//boot/solaris/bin/create_ramdisk
/mnt/: not a boot archive based Solaris instance

ZNG# ls -l /mnt/boot/solaris/bin/create_ramdisk
/mnt/boot/solaris/bin/create_ramdisk: No such file or directory

ZNG# ls -l /mnt/boot
/mnt/boot: No such file or directory

ZNG# ls -l /mnt
total 72
lrwxrwxrwx   1 root     root           9 Mar 21 14:17 bin -> ./usr/bin
drwxr-xr-x  17 root     sys           18 Mar 21 17:18 dev
drwxr-xr-x   2 root     root           2 Mar 21 14:26 dpool
drwxr-xr-x  48 root     sys          114 Mar 21 17:52 etc
drwxr-xr-x   2 root     sys            2 Mar 21 14:11 export
dr-xr-xr-x   2 root     root           2 Mar 21 14:11 home
drwxr-xr-x  12 root     bin          185 Mar 21 14:17 lib
drwxr-xr-x   2 root     sys            2 Mar 21 14:11 mnt
dr-xr-xr-x   2 root     root           2 Mar 21 14:26 net
dr-xr-xr-x   2 root     root           2 Mar 21 14:26 nfs4
drwxr-xr-x   2 root     sys            2 Mar 21 14:11 opt
dr-xr-xr-x   2 root     root           2 Mar 21 14:11 proc
drwx------   2 root     root           5 Mar 21 16:50 root
drwxr-xr-x   2 root     root           2 Mar 21 14:26 rpool
lrwxrwxrwx   1 root     root          10 Mar 21 14:17 sbin -> ./usr/sbin
drwxr-xr-x   5 root     root           5 Mar 21 14:11 system
drwxrwxrwt   2 root     sys            2 Mar 21 17:19 tmp
drwxr-xr-x   2 root     root           2 Mar 21 14:26 tools
drwxr-xr-x  22 root     sys           32 Mar 21 14:26 usr
drwxr-xr-x  28 root     sys           29 Mar 21 14:17 var

Well, I don't know why there is a difference between those two BEs, but the differences are significant enough to be a problem.
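
To pin down what is missing, one way is to mount a BE created directly with beadm next to the pkg-created one and diff their top-level contents (a sketch; solaris-2, the /a mount point and the temporary file names are made up for the comparison):

ZNG# beadm create solaris-2
ZNG# beadm mount solaris-2 /a
ZNG# ls /a > /tmp/beadm-be.out; ls /mnt > /tmp/pkg-be.out
ZNG# diff /tmp/beadm-be.out /tmp/pkg-be.out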

Comments welcome!

Monday 10 October 2011

Encrypted SWAP Device Just Disappeared In Solaris 11 EA

For some months, I have been encrypting the swap device (which is a ZFS volume) and thus getting an encrypted /tmp as well. This worked fine with Solaris 11 Express, but I encountered a strange behavior in Solaris 11 EA which leads to the swap device... well, just disappearing.
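
For reference, the setup is roughly the following (a sketch based on the commands and the vfstab entry shown later in this entry: a throw-away key read from /dev/random, plus the encrypted mount option on the swap line in /etc/vfstab so that the volume is re-created at each boot):

# swap -d /dev/zvol/dsk/rpool/swap
# zfs destroy rpool/swap
# zfs create -V 4G -o encryption=on -o keysource=raw,file:///dev/random rpool/swap
# swap -a /dev/zvol/dsk/rpool/swap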

Here is what I found after two boots, and on several machines:

# swap -l
No swap devices configured

# zfs list -t volume
NAME         USED  AVAIL  REFER  MOUNTPOINT
rpool/dump  32.8G   240G  31.8G  -

# grep swap /etc/vfstab
swap            -               /tmp            tmpfs   -       yes     -
/dev/zvol/dsk/rpool/swap        -               -               swap    -       no      encrypted

So, the rpool/swap dataset has disappeared. I am sure I did not destroy it, especially since this happens on multiple servers. Nevertheless, I found this in the history of the zpool command:

# zpool history | grep destroy
[...]
2011-10-05.10:22:49 zfs destroy rpool/swap

# last reboot | head -2
reboot    system boot                   Wed Oct  5 10:23
reboot    system down                   Wed Oct  5 10:20

So, this problem seems to be related to some action at boot time. What do the SMF service logs have to say about it?

# find /var/svc/log -print | xargs grep -i swap
/var/svc/log/system-filesystem-usr:default.log:cannot create 'rpool/swap': pool must be upgraded to set this property or value
/var/svc/log/system-filesystem-usr:default.log:cannot open 'rpool/swap': dataset does not exist
/var/svc/log/system-filesystem-usr:default.log:cannot create 'rpool/swap': pool must be upgraded to set this property or value
/var/svc/log/system-filesystem-usr:default.log:cannot open 'rpool/swap': dataset does not exist

# tail -3 /var/svc/log/system-filesystem-usr:default.log
[ Oct  5 12:00:05 Executing start method ("/lib/svc/method/fs-usr"). ]
cannot create 'rpool/swap': pool must be upgraded to set this property or value
[ Oct  5 12:00:13 Method "start" exited with status 0. ]

Ouch, what happened here? The message is interesting but a little misleading: these are fresh Solaris 11 EA installations, so the pools and datasets are all up to date:

# zpool upgrade && zfs upgrade
This system is currently running ZFS pool version 33.
All pools are formatted using this version.
This system is currently running ZFS filesystem version 5.
All filesystems are formatted with the current version.

So, it seems that the rpool/swap device is re-created at boot time, and for some reason this does not work as expected. Here is an attempt to discover where the device is re-created and why it fails.

# find /lib/svc/method -print | xargs grep -i sbin/swapadd
/lib/svc/method/fs-usr:/usr/sbin/swapadd -1
/lib/svc/method/nfs-client:     /usr/sbin/swapadd
/lib/svc/method/fs-local:/usr/sbin/swapadd >/dev/null 2>&1

# grep "zfs destroy" /usr/sbin/swapadd
                zfs destroy $zvol > /dev/null 2>&1

# sed -n '/zfs create/,/\$zvol/p' /usr/sbin/swapadd
        zfs create -V $volsize -o volblocksize=`/usr/bin/pagesize` \
            -o primarycache=$primarycache -o secondarycache=$secondarycache \
            -o encryption=$encryption -o keysource=raw,file:///dev/random $zvol

So, the re-creation of rpool/swap at boot time happens only when an encrypted volume is used. After a bit of digging, here is what I found. At the first boot, this is the command used to create the encrypted volume:

zfs create -V 4G -o volblocksize=8192 -o primarycache=metadata -o secondarycache=all -o encryption=on -o keysource=raw,file:///dev/random rpool/swap

But on a second boot, here is the slightly different command used this time:

zfs create -V 4G -o volblocksize=8192 -o primarycache=metadata -o secondarycache=all -o encryption=aes-128-ctr -o keysource=raw,file:///dev/random rpool/swap

This is because the arguments passed to the command are backed up from the volume's settings just before it is destroyed, then restored. As mentioned in the zfs(1M) manual page, only the following encryption algorithms are supported... so the one that gets set is not valid (which makes the error message saying that the pool must be upgraded to set this property or value a little clearer now).

encryption=off | on | aes-128-ccm | aes-192-ccm | aes-256-ccm | aes-128-gcm | aes-192-gcm | aes-256-gcm
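
As a quick sanity check, trying by hand to create a volume with the algorithm that swapadd ends up using should fail with the very same message seen in the SMF log (a sketch; rpool/testvol is a throw-away dataset name of my own):

# zfs create -V 1G -o encryption=aes-128-ctr -o keysource=raw,file:///dev/random rpool/testvol
cannot create 'rpool/testvol': pool must be upgraded to set this property or value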

The question is: how can this happen? Where does this algorithm come from? The answer is simple: it seems that the swap(1M) command itself alters some properties of the rpool/swap volume:

# swap -d /dev/zvol/dsk/rpool/swap
# zfs destroy rpool/swap
# zfs create -V 4G -o volblocksize=8192 -o primarycache=metadata -o secondarycache=all -o encryption=on -o keysource=raw,file:///dev/random rpool/swap
# zfs list -H -o type,volsize,volblocksize,encryption rpool/swap
volume  4G      8K      on
# swap -1 -a /dev/zvol/dsk/rpool/swap
# zfs list -H -o type,volsize,volblocksize,encryption rpool/swap
volume  4G      1M      aes-128-ctr

Not only is the algorithm changed to something not supported (yet?), but the volblocksize property is touched as well. This was not the case on Solaris 11 Express 2010.11.

I hope someone can help me with this, and that it is a known bug which is already (or will quickly be) addressed, in particular for the Solaris 11 GA release. I have already posted a comment on Darren Moffat's blog, just in case it can help a bit.

Sunday 13 March 2011

Customized Solaris installation and patching experience

I recently faced a curious problem when trying to patch an Alternate Boot Environment created with Live Upgrade on Solaris 10. Although I initially thought it was an LU problem, the solution is ultimately related to the patches being applied and the way Solaris was installed.

Assuming the ABE is named s10u9, I first tried to apply the Critical Patch Updates the new way, i.e. through the installcluster script, which quickly failed like this:

# cd /net/jumpstart/export/media/patch/cpu/10_Recommended_CPU_2010-10
# ./installcluster --apply-prereq --s10cluster
[...]
# ./installcluster -B s10u9 --s10cluster
ERROR: Patch set cannot be installed from a live boot environment without zones
       support, to a target boot environment that has zones support.

So, I tried to apply the patches using the luupgrade command, but it failed with a very similar message:

# ptime luupgrade -t -n s10u9 -s /net/jumpstart/export/media/patch/cpu/10_Recommended_CPU_2010-10/patches `cat patch_order`
Validating the contents of the media .
The media contains 198 software patches that can be added.
All 198 patches will be added because you did not specify any specific patches to add.
Mounting the BE .
ERROR: The boot environment  supports non-global zones. The current boot
environment does not support non-global zones. Releases prior to Solaris 10 cannot be
used to maintain Solaris 10 and later releases that include support for non-global zones.
You may only execute the specified operation on a system with Solaris 10 (or later)
installed.

The fact is, the Primary Boot Environment is a Solaris 10 installation. So, why complain that the PBE is an older release? Looking at the OTN discussion forums and at the README file which comes with the Critical Patch Updates release, there is a known bug which can end this way. It occurs when /etc/zones/index in the inactive boot environment has an incorrect setting for the state of the global zone; the correct setting is installed. So, let's check this:

# lumount -n s10u9
/.alt.s10u9
# grep "^global:configured:" /.alt.s10u9/etc/zones/index
# luumount -n s10u9
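
For comparison, the entry for the global zone in a healthy index file is expected to look something like this (a sketch; the format is zonename:state:zonepath, and some releases append a UUID field):

# grep '^global:' /.alt.s10u9/etc/zones/index
global:installed:/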

So no luck here. But wait: since the PBE is a customized Solaris 10 installation, it may be that the installed packages are missing the zones feature, which installcluster and luupgrade -t seem to require in order to decide whether the PBE is a proper (usable) Solaris 10 installation. So, I just installed the missing packages from the install media...

# mount -r -F hsfs `lofiadm -a /net/jumpstart/export/media/iso/sol-10-u9-ga-sparc-dvd.iso` /mnt
# pkginfo -d /mnt/Solaris_10/Product | nawk '$2 ~ /zone/ || $2 ~ /pool$/ {print $0}'
application SUNWluzone                       Live Upgrade (zones support)
system      SUNWpool                         Resource Pools
system      SUNWzoner                        Solaris Zones (Root)
system      SUNWzoneu                        Solaris Zones (Usr)
# yes | pkgadd -d /mnt/Solaris_10/Product SUNWluzone SUNWzoner SUNWzoneu SUNWpool
[...]
# umount /mnt
# lofiadm -d /dev/lofi/1

... and it should be OK now:

# ./installcluster -B s10u9 --s10cluster
Setup ...
CPU OS Cluster 2010/10 Solaris 10 SPARC (2010.10.06)
Application of patches started : 2011.02.07 11:17:08

Applying 120900-04 (  1 of 198) ... skipped
[...]
Installation of patch set to alternate boot environment complete.

Please remember to activate boot environment s10u9 with luactivate(1M)
before rebooting.
Install log files written :
  /.alt.s10u9/var/sadm/install_data/s10s_rec_cluster_short_2011.02.07_11.17.08.log
  /.alt.s10u9/var/sadm/install_data/s10s_rec_cluster_verbose_2011.02.07_11.17.08.log

And it is... The question remains: why is the zones feature necessary and mandatory in this case?

Saturday 19 February 2011

Solaris 11 Express: Problem #6

In this series, I will report the bugs or problems I find when running the Oracle Solaris 11 Express distribution. I hope this will give these PRs more visibility at Oracle so they can be corrected before the release of Solaris 11 next year.

Well, this one is not really a bug per se, but it is really annoying nonetheless. The new interface where most network settings are managed through dladm and ipadm is really interesting, because it leads to a single point of management for the IP stack. Most things work pretty well, except setting the mode and speed of an interface, where we still need to use the old ndd command, the system kernel configuration file, or parameters in the interface-specific configuration file, as in the old days on older Solaris releases.

Here is an example of this annoying behavior, in a case where hard-setting the mode and speed of an interface is required:

# dladm show-link
LINK        CLASS     MTU    STATE    BRIDGE     OVER
eri0        phys      1500   up       --         --

# dladm show-phys
LINK         MEDIA                STATE      SPEED  DUPLEX    DEVICE
eri0         Ethernet             up         100    half      eri0

# dladm show-linkprop -p duplex eri0
LINK         PROPERTY        PERM VALUE          DEFAULT        POSSIBLE
eri0         duplex          r-   full           full           half,full

As you can see, auto-negotiation leads to an unusual 100Mbps half-duplex mode. And because the associated link property is read-only, the only way to put the link in 100Mbps full-duplex mode is to set it the old way, as before Solaris 11:

# ndd -set /dev/eri0 adv_100fdx_cap 1
# ndd -set /dev/eri0 adv_autoneg_cap 0

# cat << EOF >> /etc/system
set eri:adv_autoneg_cap=0
set eri:adv_100fdx_cap=1
set eri:adv_100hdx_cap=0
set eri:adv_10fdx_cap=0
set eri:adv_10hdx_cap=0
EOF

As I said previously, it is not a bug, but it is really a pity that it is not possible to use the new way to set these parameters.
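
For the record, here is the kind of command I would like to be able to use instead, if the corresponding link properties were writable on this driver (purely hypothetical, since this is precisely what does not work here; the adv_autoneg_cap and en_100fdx_cap property names are my assumption):

# dladm set-linkprop -p adv_autoneg_cap=0 eri0
# dladm set-linkprop -p en_100fdx_cap=1 eri0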

Tuesday 1 February 2011

Solaris 11 Express: Problem #5

In this series, I will report the bugs or problems I find when running the Oracle Solaris 11 Express distribution. I hope this will give these PRs more visibility at Oracle so they can be corrected before the release of Solaris 11 next year.

A few builds before Oracle decided to stop providing binary distributions of Solaris Next (build snv_124 if I recall correctly), virtual consoles were introduced. This is a well-known feature for Linux and BSD people, but Solaris 11 Express is the first supported release of Solaris where we can run multiple virtual terminals on the console.

This long-missing feature is not enabled by default, though. Here are the steps to enable it:

$ pfexec svccfg -s vtdaemon setprop options/secure=false
$ pfexec svccfg -s vtdaemon setprop options/hotkeys=true
$ pfexec svcadm enable vtdaemon
$ pfexec svcadm enable console-login:vt2
$ pfexec svcadm enable console-login:vt3
$ pfexec svcadm enable console-login:vt4
$ pfexec svcadm enable console-login:vt5
$ pfexec svcadm enable console-login:vt6

Using Control-Alt-F1 to Control-Alt-F6, one can now switch to the virtual consoles, and using Control-Alt-F7, one can switch back to the X server. Note: leaving the options/secure property set to true will automatically lock the X server screen when switching away.

Although switching to the virtual consoles works as expected, getting back to the X server is not that easy. In my experience, with the options/secure property set to true, using Control-Alt-F7 gets me to a new login prompt, bypassing my (still?) logged-in session. With the options/secure property set to false, using Control-Alt-F7 leaves me with a black screen and a blinking _ cursor... and there is nothing I can do without remote access.
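
If this happens to you, the configuration can at least be reverted from a remote session by disabling what was enabled above (a minimal sketch):

$ pfexec svcadm disable console-login:vt2 console-login:vt3 console-login:vt4 console-login:vt5 console-login:vt6
$ pfexec svcadm disable vtdaemon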

FYI, this problem is covered by Bug ID 7001741. Note that you can add yourself to the interest list at the bottom of the bug report page.

Monday 17 January 2011

Solaris 11 Express: Problem #4

In this series, I will report the bugs or problems I find when running the Oracle Solaris 11 Express distribution. I hope this will give these PRs more visibility at Oracle so they can be corrected before the release of Solaris 11 next year.

Here is one way to create a local copy of the Oracle Solaris 11 Express IPS package repository. It is based on the Oracle document covering this exact subject, and on the README file provided as part of the ISO media which contains the entire IPS package repository.

dpool/store/data is a compressed ZFS dataset. Here, we duplicate the content of the Oracle Solaris 11 Express IPS package repository from the ISO media. Then, the pkg/server SMF service is enabled.

# mkdir /dpool/store/data/repo_support
# mount -F hsfs `lofiadm -a /dpool/store/iso/sol-11-exp-201011-repo-full.iso` /mnt
# rsync -aP /mnt/repo /dpool/store/data/repo_support
# umount /mnt
# lofiadm -d /dev/lofi/1
# svccfg -s pkg/server setprop pkg/inst_root=/dpool/store/data/repo_support/repo
# svcadm refresh pkg/server
# svcadm enable pkg/server

Last, simply reset the origin for the solaris publisher URI, and verify the new configuration:

# pkg set-publisher -G http://pkg.oracle.com/solaris/release -g http://unic.thilelli.net/ solaris
# pkg publisher
PUBLISHER                             TYPE     STATUS   URI
solaris                  (preferred)  origin   online   http://unic.thilelli.net/

Then, try some listing and searching with the new local repository:

# pkg list -a | grep constructor
install/distribution-constructor              0.5.11-0.151.0.1 installed  -----
# pkg search constructor
pkg: Some repositories failed to respond appropriately:
solaris:
http protocol error: code: 503 reason: Service Unavailable
URL: 'http://unic.thilelli.net/solaris/search/1/False_2_None_None_%3A%3A%3Aconstructor'

Ouch, the newly created repository is not searchable... And in fact there is no index directory in the /dpool/store/data/repo_support/repo/publisher/solaris sub-directory. So, in order to create one, we need to force a refresh of the repository.

# svccfg -s pkg/server setprop pkg/readonly=false
# svcadm refresh pkg/server
# pkgrepo refresh -s http://unic.thilelli.net/
Repository refresh initiated.

But just after initiating the refresh of the repository, I started seeing a very high load on the disk on which the dpool is built, and which now hosts the local repository.

# iostat -xnz 2
[...]
                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   92.0    0.0  195.0    0.0  3.1  1.0   33.3   10.8  89 100 c9t1d0

So, what is causing this huge I/O workload? Fire up iosnoop and iotop from the DTraceToolkit to observe the activity and find out who is doing this I/O:

# /opt/DTT/Bin/iosnoop
^C
    0   747 R 261342101   3072 pkg.depotd 
    0   747 R 266583819   3072 pkg.depotd 
    0   747 R 55676891    1536 pkg.depotd 
    0   747 R 260418077   2048 pkg.depotd 

# /opt/DTT/Bin/iotop
Tracing... Please wait.

2011 Jan 10 19:26:36,  load: 0.58,  disk_r:    910 KB,  disk_w:  12667 KB

  UID    PID   PPID CMD              DEVICE  MAJ MIN D            BYTES
    0      0      0                  sd3      94 192               5632
    0      5      0 zpool-rpool      sd2      94 128 W           215552
    0    312      0 zpool-dpool      sd3      94 192 R           401408
    0    747      9 pkg.depotd       sd3      94 192 R           530944
    0    312      0 zpool-dpool      sd3      94 192 W         12755968

OK, so it seems that the local repository is the culprit. How are the process and its LWPs behaving at the same time?

# prstat -cZmL -p `pgrep pkg.depotd` 2
Please wait...
[...]
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 
   747 root     0.1 0.6 0.0 0.0 0.0 0.0  99 0.0 118   2 254   0 pkg.depotd/2
   747 root     0.1 0.0 0.0 0.0 0.0 0.0 100 0.0 103   0 103   0 pkg.depotd/3
   747 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  80   1  48   0 pkg.depotd/1
   747 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  10   0   5   0 pkg.depotd/4
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/15
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/14
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/13
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/12
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/11
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/10
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/9
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/8
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/7
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/6
   747 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 pkg.depotd/5
ZONEID     NLWP  SWAP   RSS MEMORY      TIME  CPU ZONE                        
     0       64  635M  736M    36%   0:01:22 0.4% global                      
Total: 1 processes, 64 lwps, load averages: 0.29, 0.31, 0.25

OK, LWP #2 of the pkg.depotd process seems hot here. What does the running stack trace of this particular LWP look like?

# pstack 747/2
747:    /usr/bin/python2.6 /usr/lib/pkg.depotd --cfg svc:/application/pkg/serv
-----------------  lwp# 2 / thread# 2  --------------------
[...]
   [ /usr/lib/python2.6/vendor-packages/pkg/indexer.py:692 (_generic_update_index) ]
 fed57e48 PyEval_EvalCodeEx (87c0530, 87bb8ac, 0, 8bc2d7c, 4, 8bc2d8c) + 91c
 fed594f6 fast_function (87e5df4, fcf1de3c, 4, 4, 0, fef70000) + 176
 fed58fea call_function (fcf1de3c, 3, 629a7d59, fed5337e) + ee
 fed56399 PyEval_EvalFrameEx (8bc2c34, 0, 87bb8ac, 0) + 3029
   [ /usr/lib/python2.6/vendor-packages/pkg/indexer.py:790 (server_update_index) ]
 fed57e48 PyEval_EvalCodeEx (87c05c0, 87bb8ac, 0, 8bc2bfc, 2, 8bc2c04) + 91c
 fed594f6 fast_function (87e5e64, fcf1e00c, 2, 2, 0, fedcc7f4) + 176
 fed58fea call_function (fcf1e00c, 1, 88e7a3ae, fed5337e) + ee
 fed56399 PyEval_EvalFrameEx (8bc2ab4, 0, 880a604, 0) + 3029
   [ /usr/lib/python2.6/vendor-packages/pkg/server/repository.py:1003 (__update_searchdb_unlocked) ]
 fed59488 fast_function (8898a04, fcf1e14c, 2, 2, 0, fedcc7f4) + 108
 fed58fea call_function (fcf1e14c, 1, 88e7a3ae, fed5337e) + ee
 fed56399 PyEval_EvalFrameEx (875ca24, 0, 880a604, 0) + 3029
   [ /usr/lib/python2.6/vendor-packages/pkg/server/repository.py:1431 (__run_update_index) ]
 fed59488 fast_function (8898ed4, fcf1e28c, 1, 1, 0, fedcc7f4) + 108
 fed58fea call_function (fcf1e28c, 0, 88e7a3ae, fed5337e) + ee
 fed56399 PyEval_EvalFrameEx (8a58904, 0, 880a604, 0) + 3029
   [ /usr/lib/python2.6/vendor-packages/pkg/server/repository.py:687 (__refresh_index) ]
 fed59488 fast_function (889880c, fcf1e3cc, 1, 1, 0, fedcc7f4) + 108
 fed58fea call_function (fcf1e3cc, 0, 55, 0) + ee
 fed56399 PyEval_EvalFrameEx (8a5876c, 0, 880a604, 0) + 3029
[...]

It is now pretty clear that refreshing the repository is at the heart of the load. When the index update goes well, it lasts five to ten minutes, and the log looks like this:

# tail -f /var/svc/log/application-pkg-server:default.log
  [04/Nov/2008:09:46:45] ENGINE Listening for SIGTERM.
  [04/Nov/2008:09:46:45] ENGINE Listening for SIGUSR1.
  [04/Nov/2008:09:46:45] ENGINE Bus STARTING
  [04/Nov/2008:09:46:45] ENGINE Started monitor thread '_TimeoutMonitor'.
  [04/Nov/2008:09:46:46] ENGINE Serving on 0.0.0.0:80
  [04/Nov/2008:09:46:46] ENGINE Bus STARTED
  [04/Nov/2008:09:46:51] INDEX Search Available
  [04/Nov/2008:09:46:56] INDEX Updating search indexes
  [04/Nov/2008:09:50:15] INDEX Search indexes updated and available.

The question is: why is it so hot, and especially so long, with this repository? The update generates more than 1.5 billion files under the index/TMP directory before dying, and then tries again... indefinitely:

# tail -f /var/svc/log/application-pkg-server:default.log
[...]
[ Jan 12 04:16:10 Stopping because all processes in service exited. ]
[ Jan 12 04:16:18 Executing start method ("//lib/svc/method/svc-pkg-depot start"). ]
Dropping net_privaddr privilege.
ppriv -s A=basic,-file_link_any,-proc_info,-proc_session,net_privaddr -e /usr/lib/pkg.depotd --cfg svc:/application/pkg/server:default
[12/Jan/2011:04:16:26] ENGINE Listening for SIGHUP.
[12/Jan/2011:04:16:26] ENGINE Listening for SIGTERM.
[12/Jan/2011:04:16:26] ENGINE Listening for SIGUSR1.
[12/Jan/2011:04:16:26] ENGINE Bus STARTING
[12/Jan/2011:04:16:26] INDEX Checking for updated package data.
[12/Jan/2011:04:16:26] ENGINE Started monitor thread '_TimeoutMonitor'.
[12/Jan/2011:04:16:27] ENGINE Serving on 0.0.0.0:80
[12/Jan/2011:04:16:27] ENGINE Bus STARTED
[12/Jan/2011:04:16:27] INDEX Updating search indexes
[ Jan 12 07:36:09 Stopping because all processes in service exited. ]
[ Jan 12 07:36:17 Executing start method ("//lib/svc/method/svc-pkg-depot start"). ]
Dropping net_privaddr privilege.
ppriv -s A=basic,-file_link_any,-proc_info,-proc_session,net_privaddr -e /usr/lib/pkg.depotd --cfg svc:/application/pkg/server:default
[12/Jan/2011:07:36:28] ENGINE Listening for SIGHUP.
[12/Jan/2011:07:36:28] ENGINE Listening for SIGTERM.
[12/Jan/2011:07:36:28] ENGINE Listening for SIGUSR1.
[12/Jan/2011:07:36:28] ENGINE Bus STARTING
[12/Jan/2011:07:36:28] INDEX Checking for updated package data.
[12/Jan/2011:07:36:28] ENGINE Started monitor thread '_TimeoutMonitor'.
[12/Jan/2011:07:36:28] ENGINE Serving on 0.0.0.0:80
[12/Jan/2011:07:36:28] ENGINE Bus STARTED
[12/Jan/2011:07:36:29] INDEX Updating search indexes
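
To get a rough idea of the churn before each failure, one can simply count the temporary index files (a sketch; the path assumes the index is built under the publisher sub-directory mentioned above):

# find /dpool/store/data/repo_support/repo/publisher/solaris/index/TMP -type f | wc -l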

So, I am asking myself why the index is not provided as part of the ISO media... Worse, I can no longer enable the pkg/server SMF service without pkg.depotd trying to update the index... Too bad. So, I decided to reinstall the depot from the beginning, but how can I properly create an index for this release from the ISO media?

Update #1 (2011-01-19): Well, I think I found a better way (the proper way?) to achieve this without problems: use the file path as an argument to the pkgrepo command.

# svccfg -s pkg/server setprop pkg/readonly=true
# svcadm refresh pkg/server
# svcadm restart pkg/server
# pkgrepo -s /dpool/store/data/repo_support/repo refresh
Repository refresh initiated.
# tail /var/svc/log/application-pkg-server\:default.log 
[...]
[17/Jan/2011:21:51:37]  pkg://solaris/SUNWdcopy@0.5.11,5.11-0.133:20101027T183617Z
[17/Jan/2011:21:51:37]  pkg://solaris/driver/x11/xsvc@0.5.11,5.11-0.151.0.1:20101104T233133Z
[17/Jan/2011:21:51:37]  pkg://solaris/SUNWdedis@0.5.11,5.11-0.130:20101027T183621Z
[17/Jan/2011:21:51:40] INDEX Checking for updated package data.
# pkg search -s http://unic.thilelli.net/ constructor
INDEX           ACTION VALUE                                                       PACKAGE
pkg.description set    Distribution Constructor libraries, commands and data files pkg:/install/distribution-constructor@0.5.11-0.151.0.1

Monday 10 January 2011

Solaris 11 Express: Problem #3

In this series, I will report the bugs or problems I find when running the Oracle Solaris 11 Express distribution. I hope this will give these PRs more visibility at Oracle so they can be corrected before the release of Solaris 11 next year.

I recently switched from the official Oracle release repository to the support repository for Solaris 11 Express. Before the switch, one non-global zone had been created. Since there were some updates in this repository, I ran pkg update, rebooted into the new boot environment, and tried to update the non-global zone:

# beadm list                                                                         
BE        Active Mountpoint Space Policy Created          
--        ------ ---------- ----- ------ -------          
solaris   -      -          9.88M static 2010-12-01 09:32 
solaris-1 NR     /          5.44G static 2011-01-03 19:35 

# zoneadm list -vc                                                                   
  ID NAME             STATUS     PATH                           BRAND    IP    
   0 global           running    /                              ipkg     shared
   - zone1            installed  /dpool/store/zone/zone1        ipkg     shared

# zoneadm -z zone1 detach

# zoneadm -z zone1 attach -u
Log File: /var/tmp/zone1.attach_log.lfa49e
Attaching...

preferred global publisher: solaris
       Global zone version: entire@0.5.11,5.11-0.151.0.1.1:20101222T214417Z
   Non-Global zone version: entire@0.5.11,5.11-0.151.0.1:20101105T054056Z

                     Cache: Using /var/pkg/download.
  Updating non-global zone: Output follows
Creating Plan                          
ERROR: Could not update attaching zone
                    Result: Attach Failed.

# cat /var/tmp/zone1.attach_log.lfa49e
[Monday, January  3, 2011 08:42:24 PM CET] Log File: /var/tmp/zone1.attach_log.lfa49e
[Monday, January  3, 2011 08:42:25 PM CET] Attaching...
[Monday, January  3, 2011 08:42:25 PM CET] existing
[Monday, January  3, 2011 08:42:25 PM CET] 
[Monday, January  3, 2011 08:42:25 PM CET]   Sanity Check: Passed.  Looks like an OpenSolaris system.
[Monday, January  3, 2011 08:42:31 PM CET] preferred global publisher: solaris
[Monday, January  3, 2011 08:42:32 PM CET]        Global zone version: entire@0.5.11,5.11-0.151.0.1.1:20101222T214417Z
[Monday, January  3, 2011 08:42:32 PM CET]    Non-Global zone version: entire@0.5.11,5.11-0.151.0.1:20101105T054056Z

[Monday, January  3, 2011 08:42:32 PM CET]                      Cache: Using /var/pkg/download.
[Monday, January  3, 2011 08:42:32 PM CET]   Updating non-global zone: Output follows
pkg set-publisher: 
Unable to locate certificate '/dpool/store/zone/zone1/root/dpool/store/zone/zone1/root/var/pkg/ssl/Oracle_Solaris_11_Express_Support.certificate.pem' needed to access 'https://pkg.oracle.com/solaris/support/'.
pkg unset-publisher: 
Removal failed for 'za23954': The preferred publisher cannot be removed.

pkg: The following pattern(s) did not match any packages in the current catalog.
Try relaxing the pattern, refreshing and/or examining the catalogs:
        entire@0.5.11,5.11-0.151.0.1.1:20101222T214417Z
[Monday, January  3, 2011 08:44:04 PM CET] ERROR: Could not update attaching zone
[Monday, January  3, 2011 08:44:06 PM CET]                     Result: Attach Failed.

FYI, this problem is covered by Bug ID 13000, but it is still present at this time, at least on Solaris 11 Express 2010.11.

So, it seems that the change of repository for the solaris publisher is not handled well by the non-global zone update mechanism. Just to be sure, I tried to create a new non-global zone in the new boot environment, but the problem shows up in this case too:

# zoneadm -z zone2 install
A ZFS file system has been created for this zone.
   Publisher: Using solaris (https://pkg.oracle.com/solaris/support/ ).
   Publisher: Using opensolaris.org (http://pkg.opensolaris.org/dev/).
       Image: Preparing at /dpool/store/zone/zone2/root.
 Credentials: Propagating Oracle_Solaris_11_Express_Support.key.pem
 Credentials: Propagating Oracle_Solaris_11_Express_Support.certificate.pem
Traceback (most recent call last):
  File "/usr/bin/pkg", line 4225, in handle_errors
    __ret = func(*args, **kwargs)
  File "/usr/bin/pkg", line 4156, in main_func
    ret = image_create(pargs)
  File "/usr/bin/pkg", line 3836, in image_create
    variants=variants, props=set_props)
  File "/usr/lib/python2.6/vendor-packages/pkg/client/api.py", line 3205, in image_create
    uri=origins[0])
TypeError: 'set' object does not support indexing

pkg: This is an internal error.  Please let the developers know about this
problem by filing a bug at http://defect.opensolaris.org and including the
above traceback and this message.  The version of pkg(5) is '052adf36c3f4'.
ERROR: failed to create image

FYI, this problem is covered by Bug ID 17653.

Well, no luck here. I have not seen a Solaris IPS update for these problems yet, which are very annoying, to say the least.

Update #1 (2011-02-03): The problem is now fixed in the latest support pkg repository. Installing or attaching a non-global ipkg-branded zone now works as expected:

# zoneadm -z zone1 install
A ZFS file system has been created for this zone.
   Publisher: Using solaris (https://pkg.oracle.com/solaris/support/ ).
   Publisher: Using opensolaris.org (http://pkg.opensolaris.org/dev/).
   Publisher: Using sunfreeware (http://pkg.sunfreeware.com:9000/).
       Image: Preparing at /dpool/export/zone/zone1/root.
 Credentials: Propagating Oracle_Solaris_11_Express_Support.key.pem
 Credentials: Propagating Oracle_Solaris_11_Express_Support.certificate.pem
       Cache: Using /var/pkg/download. 
Sanity Check: Looking for 'entire' incorporation.
  Installing: Core System (output follows)
               Packages to install:     1
           Create boot environment:    No
[...]
        Note: Man pages can be obtained by installing SUNWman
 Postinstall: Copying SMF seed repository ... done.
 Postinstall: Applying workarounds.
        Done: Installation completed in 332.525 seconds.

  Next Steps: Boot the zone, then log into the zone console (zlogin -C)
              to complete the configuration process.

And:

# zoneadm -z zone1 detach
# zoneadm -z zone1 attach -u
Log File: /var/tmp/zone1.attach_log.PhaOKf
Attaching...

preferred global publisher: solaris
       Global zone version: entire@0.5.11,5.11-0.151.0.1.2:20110127T225841Z
   Non-Global zone version: entire@0.5.11,5.11-0.151.0.1.2:20110127T225841Z

                     Cache: Using /var/pkg/download.
  Updating non-global zone: Output follows
No updates necessary for this image.   
  Updating non-global zone: Zone updated.
                    Result: Attach Succeeded.
# zoneadm list -vc
  ID NAME             STATUS     PATH                           BRAND    IP    
   0 global           running    /                              ipkg     shared
   - zone1            installed  /dpool/export/zone/zone1       ipkg     shared

Monday 3 January 2011

Solaris 11 Express: Problem #2

In this series, I will report the bugs or problems I find when running the Oracle Solaris 11 Express distribution. I hope this will give these PRs more visibility at Oracle so they can be corrected before the release of Solaris 11 next year.

For some customers, I am used to cloning non-global zones from a template zone. In order to save some space, I generally use the ability to pass an existing ZFS snapshot as input to the clone, which avoids creating a new snapshot each time a clone is created.

It seems that this capability is not usable anymore on Solaris 11 Express at this time:

# zoneadm -z zone3 clone -s dpool/store/zone/zone1/ROOT/zbe@zone2_snap zone1
/usr/lib/brand/ipkg/clone: -s: unknown option

Nevertheless, this functionality is still described in the manual page:

brand-specific usage: clone {sourcezone}
usage: clone [-m method] [-s ] [brand-specific args] zonename
    Clone the installation of another zone.  The -m option can be used to
    specify 'copy' which forces a copy of the source zone.  The -s option
    can be used to specify the name of a ZFS snapshot that was taken from
    a previous clone command.  The snapshot will be used as the source
    instead of creating a new ZFS snapshot.  All other arguments are
    passed to the brand clone function; see brands(5) for more information.

No luck here. Even if the space concern may be mitigated by the ZFS deduplication feature in Solaris 11 Express, deduplication is not always appropriate or usable: on small servers, for example.
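
In the meantime, the fallback is to let zoneadm take its own ZFS snapshot at clone time, which is exactly the extra snapshot I was trying to avoid (a minimal sketch, reusing the zones above):

# zoneadm -z zone3 clone zone1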

FYI, this problem is covered by Bug ID 6383119. Note that you can add yourself to the interest list at the bottom of the bug report page.

Sunday 19 December 2010

Solaris 11 Express: Problem #1

In this series, I will report the bugs or problems I find when running the Oracle Solaris 11 Express distribution. I hope this will give these PRs more visibility at Oracle so they can be corrected before the release of Solaris 11 next year.

After a fresh installation of Solaris 11 Express from the LiveUSB media, with the default set of packages provided by this media (i.e. the slim_install profile), I immediately ran into the following error message on a secure-shell login:

invalid UTF-8 sequence: Cannot convert UTF-8 strings to the local codeset

In fact, this problem is not new: it was caused by the integration of locale fix 6740240, which added a dependency of libssh on the locale framework. This generates error messages on systems where the iconv data for UTF-8 is not available. It causes no problem for the proper execution of SSH per se, but I find it a little annoying on a freshly installed system. My point here is that the dependency is not properly handled by the Solaris 11 Express installation.

FYI, this problem is covered by Bug ID 6872504. Note that you can add yourself to the interest list at the bottom of the bug report page.

Sunday 22 March 2009

Finding The Process Responsible For Crashing A System

Recently, we encountered a wave of node suicides on most of the nodes forming some Oracle RAC clusters, running on lots of Sun M5000 domains. Although the logs found on the Oracle RAC side were interesting, they did not help us determine precisely the origin of the crashes. Since the domains panicked, we were able to analyze the crash dumps generated at panic time to identify the process which initiated the panics. Here is how to do so.

First, be sure to have a proper and usable crash dump on persistent storage:

# cd /var/crash/nodename
# file *.0
unix.0:         ELF 64-bit MSB executable SPARCV9 Version 1, UltraSPARC1 Extensions Required, statically linked, not stripped, no debugging information available
vmcore.0:       SunOS 5.10 Generic_127111-11 64-bit SPARC crash dump from 'nodename'

Then, extract useful information using MDB dcmds such as ::status, ::showrev and ::panicinfo, which give us the exact panic message and the thread responsible for the system crash:

# mdb -k unix.0 vmcore.0
Loading modules: [ unix krtld genunix specfs dtrace ufs sd mpt px ssd fcp fctl md ip qlc hook neti sctp arp usba nca zfs random logindmux ptm cpc sppp crypto wrsmd fcip nfs ipc ]
> ::status
debugging crash dump vmcore.0 (64-bit) from nodename
operating system: 5.10 Generic_127111-11 (sun4u)
panic message: forced crash dump initiated at user request
dump content: kernel pages only
> ::showrev
Hostname: nodename
Release: 5.10
Kernel architecture: sun4u
Application architecture: sparcv9
Kernel version: SunOS 5.10 sun4u Generic_127111-11
Platform: SUNW,SPARC-Enterprise
> ::panicinfo
             cpu                0
          thread      300171c7300
         message forced crash dump initiated at user request
          tstate       4400001606
              g1                b
              g2                0
              g3          11c13e0
              g4              6e0
              g5         88000000
              g6                0
              g7      300171c7300
              o0          1208020
              o1      2a10176b9e8
              o2                1
              o3                0
              o4 fffffffffffffff5
              o5             1000
              o6      2a10176b0b1
              o7          10626a4
              pc          1044d8c
             npc          1044d90
               y                0

Well. Now that we have the exact thread address (thread ID), we can find the corresponding UNIX process with the help of the following script:

# cat /var/tmp/findstack.vmcore.sh
#!/usr/bin/env sh
#
# Dump the kernel stack of every thread of every process found in the
# crash dump into /tmp/core.$$, prefixed by the process name.

# Build a list of "ADDR NAME" pairs from ::ps, skipping the header line.
echo "::ps" | mdb -k unix.0 vmcore.0 | \
 nawk '$8 !~ /ADDR/ {print $8" "$NF}' > /tmp/.core.$$

cat /dev/null > /tmp/core.$$

# For each process address, walk its threads and print their stacks.
while read ps; do
  echo "process name: `echo ${ps} | nawk '{print $2}'`" >> /tmp/core.$$
  echo ${ps} | nawk '{print $1"::walk thread | ::findstack"}' | \
   mdb unix.0 vmcore.0 >> /tmp/core.$$
  echo >> /tmp/core.$$
done < /tmp/.core.$$

\rm /tmp/.core.$$

exit 0

Now, just find the lines for the guilty process in the output file. In our case, it is the oprocd.bin process:

# vi /tmp/core.*
[...]
process name: oprocd.bin
stack pointer for thread 300171c7300: 2a10176b0b1
  000002a10176b161 kadmin+0x4a4()
  000002a10176b221 uadmin+0x11c()
  000002a10176b2e1 syscall_trap+0xac()
[...]

This process is locked in memory to monitor the cluster and provide I/O fencing. oprocd.bin performs its check, stops running, and if the wake up is beyond the expected time, then it resets the processor and reboots the node. An oprocd.bin failure results in Oracle Clusterware restarting the node. Please read the Oracle Clusterware and Oracle Real Application Clusters documentation for more information.

Although the incident is still under investigation, it seems the nodes were impacted by the leap second that was added at the end of 2008...

Update #1 (2010-10-09): Here is a very interesting French blog entry about the same problem.
