blog'o thnet

Tag - live upgrade

Sunday 13 March 2011

Customized Solaris installation and patching experience

I recently faced a curious problem when trying to patch an Alternate Boot Environment (ABE) created with Live Upgrade (LU) on Solaris 10. Although I initially thought it was an LU problem, the solution turned out to be related to the patches being applied and to the way Solaris was installed.

Assuming the ABE is named s10u9, I first tried to apply the Critical Patch Update (CPU) the new way, i.e. by passing the -B switch to the installcluster script, which quickly failed like this:

# cd /net/jumpstart/export/media/patch/cpu/10_Recommended_CPU_2010-10
# ./installcluster --apply-prereq --s10cluster
[...]
# ./installcluster -B s10u9 --s10cluster
ERROR: Patch set cannot be installed from a live boot environment without zones
       support, to a target boot environment that has zones support.

So, I tried to apply the patches using the luupgrade command, but it failed with a very similar message:

# ptime luupgrade -t -n s10u9 -s /net/jumpstart/export/media/patch/cpu/10_Recommended_CPU_2010-10/patches `cat patch_order`
Validating the contents of the media .
The media contains 198 software patches that can be added.
All 198 patches will be added because you did not specify any specific patches to add.
Mounting the BE .
ERROR: The boot environment  supports non-global zones. The current boot
environment does not support non-global zones. Releases prior to Solaris 10 cannot be
used to maintain Solaris 10 and later releases that include support for non-global zones.
You may only execute the specified operation on a system with Solaris 10 (or later)
installed.

The fact is, the Primary Boot Environment (PBE) is a Solaris 10 installation. So why does it complain that the PBE is an older release? According to the OTN discussion forums and the README file shipped with the Critical Patch Update release, a known bug can end this way: it occurs when /etc/zones/index in the inactive boot environment has an incorrect state for the global zone. The correct state is installed. So, let's check this one:

# lumount -n s10u9
/.alt.s10u9
# grep "^global:configured:" /.alt.s10u9/etc/zones/index
# luumount -n s10u9
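
The same check can be wrapped in a small function that prints whatever state is recorded for the global zone, so the result can be compared to installed from a script. This is only a sketch: the function name is made up for illustration, it assumes the index file uses colon-separated zonename:state:path records, and it uses $(...), so it needs ksh or bash rather than the old Bourne shell. On Solaris 10, nawk may be a safer choice than awk.

```shell
# Print the state recorded for the global zone in a zones index file.
# Assumption: records look like "zonename:state:path", one per line.
# $1: path to the index file, e.g. /.alt.s10u9/etc/zones/index
global_zone_state() {
    awk -F: '$1 == "global" { print $2; exit }' "$1"
}

# Possible use, with the ABE mounted by lumount as shown above:
#   state=$(global_zone_state /.alt.s10u9/etc/zones/index)
#   [ "$state" = "installed" ] || echo "unexpected global zone state: $state"
```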

So no luck here. But wait: if the PBE is a customized Solaris 10 installation, the installed packages may lack the Zones feature, which installcluster and luupgrade -t apparently require in order to decide whether the PBE is a proper (usable) Solaris 10 installation. So, I just installed the missing packages from the install media...

# mount -r -F hsfs `lofiadm -a /net/jumpstart/export/media/iso/sol-10-u9-ga-sparc-dvd.iso` /mnt
# pkginfo -d /mnt/Solaris_10/Product | nawk '$2 ~ /zone/ || $2 ~ /pool$/ {print $0}'
application SUNWluzone                       Live Upgrade (zones support)
system      SUNWpool                         Resource Pools
system      SUNWzoner                        Solaris Zones (Root)
system      SUNWzoneu                        Solaris Zones (Usr)
# yes | pkgadd -d /mnt/Solaris_10/Product SUNWluzone SUNWzoner SUNWzoneu SUNWpool
[...]
# umount /mnt
# lofiadm -d /dev/lofi/1
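
To find out beforehand whether a PBE lacks these packages, something like the following could be used. It is a sketch under a few assumptions: the function and variable names are mine, the package list is simply the four packages found on the media above, and the input is expected in the plain pkginfo listing format (category, package name, description). It needs a shell that understands $(...), such as ksh or bash.

```shell
# Packages that were missing on my customized PBE (from the listing above).
REQUIRED_PKGS="SUNWluzone SUNWpool SUNWzoner SUNWzoneu"

# Read a pkginfo-style listing on stdin and print every required package
# that does not appear in its second column.
missing_zone_pkgs() {
    # Build a space-delimited list such as " SUNWpool SUNWzoneu ".
    installed=" $(awk '{ printf "%s ", $2 }')"
    for pkg in $REQUIRED_PKGS; do
        case "$installed" in
            *" $pkg "*) ;;          # present, nothing to report
            *) echo "$pkg" ;;       # absent from the listing
        esac
    done
}

# Possible use on the PBE:
#   pkginfo | missing_zone_pkgs
```

An empty output means all four packages are already there.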

... and it should work now:

# ./installcluster -B s10u9 --s10cluster
Setup ...
CPU OS Cluster 2010/10 Solaris 10 SPARC (2010.10.06)
Application of patches started : 2011.02.07 11:17:08

Applying 120900-04 (  1 of 198) ... skipped
[...]
Installation of patch set to alternate boot environment complete.

Please remember to activate boot environment s10u9 with luactivate(1M)
before rebooting.
Install log files written :
  /.alt.s10u9/var/sadm/install_data/s10s_rec_cluster_short_2011.02.07_11.17.08.log
  /.alt.s10u9/var/sadm/install_data/s10s_rec_cluster_verbose_2011.02.07_11.17.08.log

And it does... The question is, why is the Zones feature mandatory in this case?

Thursday 7 October 2010

Live Upgrading To Solaris 10 9/10

If you try to update to the latest Solaris 10 Update (U9), one new step is now required to luupgrade successfully to the desired Update. As mentioned in the Oracle Solaris 10 9/10 Release Notes, a new Auto Registration mechanism has been added to this release to facilitate registering the system using your Oracle support credentials.

So, if you try the following classic luupgrade incantation, it fails with the message shown:

# luupgrade -u -n s10u9 -s /mnt -j /var/tmp/profile
System has findroot enabled GRUB
No entry for BE  in GRUB menu
Copying failsafe kernel from media.
61364 blocks
miniroot filesystem is 
Mounting miniroot at 
ERROR: The auto registration file <> does not exist or incomplete.
       The auto registration file is mandatory for this upgrade.
       Use -k  argument along with luupgrade command.

So, you now need to pass the Auto Registration choice as a mandatory parameter. Here is what it looks like now:

# echo "auto_reg=disable" > /var/tmp/sysidcfg
# luupgrade -u -n s10u9 -s /mnt -j /var/tmp/profile -k /var/tmp/sysidcfg
System has findroot enabled GRUB
No entry for BE  in GRUB menu
Copying failsafe kernel from media.
61364 blocks
miniroot filesystem is 
Mounting miniroot at 
#######################################################################
 NOTE: To improve products and services, Oracle Solaris communicates
 configuration data to Oracle after rebooting.

 You can register your version of Oracle Solaris to capture this data
 for your use, or the data is sent anonymously.

 For information about what configuration data is communicated and how
 to control this facility, see the Release Notes or
 www.oracle.com/goto/solarisautoreg.

 INFORMATION: After activated and booted into new BE ,
 Auto Registration happens automatically with the following Information

autoreg=disable
#######################################################################
Validating the contents of the media .
The media is a standard Solaris media.
The media contains an operating system upgrade image.
The media contains  version <10>.
Constructing upgrade profile to use.
Locating the operating system upgrade program.
Checking for existence of previously scheduled Live Upgrade requests.
Creating upgrade profile for BE .
Checking for GRUB menu on ABE .
Saving GRUB menu on ABE .
Checking for x86 boot partition on ABE.
Determining packages to install or upgrade for BE .
Performing the operating system upgrade of the BE .
CAUTION: Interrupting this process may leave the boot environment unstable
or unbootable.
Upgrading Solaris: 100% completed
Installation of the packages from this media is complete.
Restoring GRUB menu on ABE .
Updating package information on boot environment .
Package information successfully updated on boot environment .
Adding operating system patches to the BE .
The operating system patch installation is complete.
ABE boot partition backing deleted.
PBE GRUB has no capability information.
PBE GRUB has no versioning information.
ABE GRUB is newer than PBE GRUB. Updating GRUB.
GRUB update was successfull.
INFORMATION: The file  on boot
environment  contains a log of the upgrade operation.
INFORMATION: The file  on boot
environment  contains a log of cleanup operations required.
INFORMATION: Review the files listed above. Remember that all of the files
are located on boot environment . Before you activate boot
environment , determine if any additional system maintenance is
required or if additional media of the software distribution must be
installed.
The Solaris upgrade of the boot environment  is complete.
Creating miniroot device
Configuring failsafe for system.
Failsafe configuration is complete.
Installing failsafe
Failsafe install is complete.

I am not sure this will ease the upgrade path to this Update, even though there is nothing really wrong with it. I just think it could have been less intrusive.
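
Since the -k file is now mandatory, a wrapper script could check it before launching a long upgrade. Here is a minimal sketch: the function name is hypothetical, and it only verifies that the file is readable and contains an auto_reg= line, as in the one-line file created above. It does not validate the value.

```shell
# Succeed only if the given file is readable and sets the auto_reg keyword.
# $1: path to the file that will be passed to luupgrade -k
has_auto_reg() {
    [ -r "$1" ] && grep '^auto_reg=' "$1" >/dev/null
}

# Possible use:
#   if has_auto_reg /var/tmp/sysidcfg; then
#       luupgrade -u -n s10u9 -s /mnt -j /var/tmp/profile -k /var/tmp/sysidcfg
#   else
#       echo "no auto_reg setting in /var/tmp/sysidcfg" >&2
#   fi
```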

Sunday 2 May 2010

Live Upgrading When Diagnostics Mode Is Enabled

Recently, we faced an interesting problem when using Live Upgrade on some of our SPARC servers (with lots of non-global zones hosted on SAN devices). Here are the basic steps we generally follow when using LU:

  1. Update the Live Upgrade functionality according to the Article ID #1004881.1, Solaris Live Upgrade Software: Patch Requirements.
  2. Create the ABE.
  3. Upgrade the ABE with an operating system image (and test the upgrade according to a JumpStart profile).
  4. Apply a given Recommended Patch Cluster to the ABE.
  5. Activate the ABE to be the next booted BE.
  6. Reboot into the new BE, and perform post-configuration steps if needed.

In some circumstances, even though all the steps went well and the activation of the new BE was OK (we traced its activity), the system rebooted into the old BE:

# lustatus
Boot Environment     Is       Active Active    Can    Copy
Name                 Complete Now    On Reboot Delete Status
-------------------- -------- ------ --------- ------ -------
s10u4                yes      yes    yes       no     -
s10u8                yes      no     no        yes    -
# lucurr
s10u4
# luactivate -n s10u8
[...]
# lustatus
Boot Environment     Is       Active Active    Can    Copy
Name                 Complete Now    On Reboot Delete Status
-------------------- -------- ------ --------- ------ -------
s10u4                yes      yes    no        no     -
s10u8                yes      no     yes       no     -
# shutdown -y -g 0 -i 6
[...]
# lucurr
s10u4

Ouch. After a bit of digging, and seeing nothing wrong on the console via the Service Processor, we found the following message in the log of the legacy init script that LU runs when rebooting (at shutdown time, more precisely):

# cat /var/svc/log/rc6.log
[...]
Executing legacy init script "/etc/rc0.d/K62lu".
Live Upgrade: Deactivating current boot environment <s10u4>.
zlogin: login allowed only to running zones (zonename1 is 'installed').
zlogin: login allowed only to running zones (zonename2 is 'installed').
Live Upgrade: Executing Stop procedures for boot environment <s10u4>.
Live Upgrade: Current boot environment is <s10u4>.
Live Upgrade: New boot environment will be <s10u8>.
Live Upgrade: Activating boot environment <s10u8>.
Creating boot_archive for /.alt.tmp.b-9Tb.mnt
updating /.alt.tmp.b-9Tb.mnt/platform/sun4v/boot_archive
Live Upgrade: The boot device for boot environment <s10u8> is
</dev/dsk/c1t0d0s4>.
/etc/lib/lu/lubootdev: ERROR: Unable to get current boot devices.
/etc/lib/lu/lubootdev: INFORMATION: The system is running with the system
boot PROM diagnostics mode enabled. When diagnostics mode is
enabled, Live Upgrade is unable to access the system boot
device list, causing certain features of Live Upgrade (such
as changing the system boot device after activating a boot
environment) to fail. To correct this problem, please run
the system in normal, non-diagnostic mode. The system might
have a key switch or other external means of booting the
system in normal mode. If you do not have such a means, you
can set one or both of the EEPROM parameters 'diag-switch?'
or 'diagnostic-mode?' to 'false'.  After making a change,
either through external means or by changing an EEPROM
parameter, retry the Live Upgrade operation or command.
ERROR: Live Upgrade: Unable to change primary boot device to boot
environment <s10u8>.
ERROR: You must manually change the system boot prom to boot the system
from device </pci@0/pci@0/pci@2/scsi@0/sd@0,0:e>.
Live Upgrade: Activation of boot environment <s10u8> completed.
Legacy init script "/etc/rc0.d/K62lu" exited with return code 0.
[...]

Well, pretty explicit in fact, but very unexpected when the activation had gone so well beforehand. So, check the EEPROM setting, and change it back if necessary:

# eeprom diag-switch?
diag-switch?=true
# eeprom diag-switch?=false

Everything returned to normal after activating again and rebooting. Although this case is self-explanatory in the corresponding log file, and is described in Bug ID #6949588, I think it should be made more visible to the system administrator, for example by checking the EEPROM configuration at BE activation time (in the luactivate command).
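
Until LU does this by itself, the check can be done by hand just before luactivate. The sketch below is only an illustration: the function name is mine, and it assumes eeprom prints one name=value line per parameter, as in the output above. It looks at the two parameters named in the LU message, diag-switch? and diagnostic-mode?, and prints those that are set to true.

```shell
# Read eeprom output ("name=value" lines) on stdin and print the names of
# the diagnostics-related parameters that are currently set to true.
diag_params_enabled() {
    awk -F= '($1 == "diag-switch?" || $1 == "diagnostic-mode?") && $2 == "true" { print $1 }'
}

# Possible use before activating the new BE:
#   if [ -n "$(eeprom | diag_params_enabled)" ]; then
#       echo "diagnostics mode is enabled, luactivate may not switch the boot device" >&2
#   fi
```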

Tuesday 16 October 2007

Error While Patching A New Boot Environment

After creating a new boot environment (BE) named beastie (see below) to upgrade a system running Solaris 8 to Solaris 10 11/06, as I have done many times in the past without a hiccup, I encountered a problem when I tried to apply the appropriate Recommended patch cluster to the new BE, with this message:

# luupgrade -t -n beastie -s /var/tmp/10_Recommended

Validating the contents of the media .
The media contains 76 software patches that can be added.
All 76 patches will be added because you did not specify any specific
patches to add.
Mounting the BE .
ERROR: The boot environment  supports non-global
zones.The current boot environment does not support non-global zones.
Releases prior to Solaris 10 cannot be used to maintain Solaris 10 and
later releases that include support for non-global zones. You may only
execute the specified operation on a system with Solaris 10 (or later)
installed.

I couldn't find any reference to a known bug or problem after searching SunSolve, Sun Support, and Google. Has anyone already seen this error, and solved it The Right Way? As for me, I had to boot from the BE and apply the patch cluster there. This was a pain, since this bundle includes the -36 kernel patch, which is known to be relatively disruptive: it needs two reboots to apply the entire patch cluster (it contains a new version of the kernel).

Update #1 (2009-03-22): Seems to be explained in this excellent BigAdmin article.

Sunday 21 May 2006

Upgrading from snv_38 to snv_39 Using Solaris Live Upgrade

After writing about how to patch (or upgrade) a running system using a mirrored OpenSolaris SVM setup, here is a little step-by-step how-to on upgrading (or patching, etc.) a live system using the Live Upgrade feature.

Before installing or running Live Upgrade, you are required to install a limited set of patch revisions. Make sure you have the most recently updated patch list by consulting sunsolve.sun.com. Search for the info doc 72099 on the SunSolve web site (you must have a registered Sun support customer account to be able to view this document).

Note: In the following procedure, we will assume that everything we want (and need) to upgrade to is provided on one large DVD ISO image.

If all seems OK, you must begin by updating the currently running system with the appropriate LU packages, i.e. those provided for the targeted OS revision. You can either use the provided tool:

# /cdrom/cdrom0/Solaris_11/Tools/Installers/liveupgrade20

Or do it yourself:

# pkgrm SUNWluu SUNWlur
# pkgadd -d /cdrom/cdrom0/Solaris_11/Product SUNWlur SUNWluu

Since the current OS is installed entirely on the first slice of the first disk (c1d0s0), and slice six (c1d0s6) is exactly the same size as the first one, we will use the latter as the device for the ABE and create the corresponding Boot Environment.

# lucreate -c snv_38 -n snv_39 -m /:/dev/dsk/c1d0s6:ufs
/* If the snv_38 BE already exists, just create the new one for snv_39. */
# lucreate -n snv_39 -m /:/dev/dsk/c1d0s6:ufs
Discovering physical storage devices
Discovering logical storage devices
Cross referencing storage devices with boot environment configurations
Determining types of file systems supported
Validating file system requests
Preparing logical storage devices
Preparing physical storage devices
Configuring physical storage devices
Configuring logical storage devices
Analyzing system configuration.
Comparing source boot environment <snv_38> file systems with the file
system(s) you specified for the new boot environment. Determining which
file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Searching /dev for possible boot environment filesystem devices

Updating system configuration files.
The device </dev/dsk/c1d0s6> is not a root device for any boot environment.
Creating configuration for boot environment <snv_39>.
Source boot environment is <snv_38>.
Creating boot environment <snv_39>.
Checking for GRUB menu on boot environment <snv_39>.
The boot environment <snv_39> does not contain the GRUB menu.
Creating file systems on boot environment <snv_39>.
Creating <ufs> file system for </> on </dev/dsk/c1d0s6>.
Mounting file systems for boot environment <snv_39>.
Calculating required sizes of file systems for boot environment <snv_39>.
Populating file systems on boot environment <snv_39>.
Checking selection integrity.
Integrity check OK.
Populating contents of mount point </>.
Copying.
Creating shared file system mount points.
Creating compare databases for boot environment <snv_39>.
Creating compare database for file system </>.
Updating compare databases on boot environment <snv_39>.
Making boot environment <snv_39> bootable.
Updating bootenv.rc on ABE <snv_39>.
Population of boot environment <snv_39> successful.
Creation of boot environment <snv_39> successful.

Verify that the file systems are assigned correctly, in particular which ones are cloned (those required by a Solaris installation, such as /, /var, /usr, and /opt) and which ones are shared (such as /export).

# lufslist -n snv_38
               boot environment name: snv_38
               This boot environment is currently active.
               This boot environment will be active on next system boot.

Filesystem           fstype    device size Mounted on     Mount Options
-------------------- -------- ------------ -------------- --------------
/dev/dsk/c1d0s1      swap       4301821440 -              -
/dev/dsk/c1d0s0      ufs        8595417600 /              -
/dev/dsk/c1d0s7      ufs       58407713280 /export        -
#
# lufslist -n snv_39
               boot environment name: snv_39

Filesystem           fstype    device size Mounted on     Mount Options
-------------------- -------- ------------ -------------- --------------
/dev/dsk/c1d0s1      swap       4301821440 -              -
/dev/dsk/c1d0s6      ufs        8595417600 /              -
/dev/dsk/c1d0s7      ufs       58407713280 /export        -
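
If you prefer to compare the two listings from a script rather than by eye, the size of a given file system can be pulled out of the lufslist output. This is a rough sketch: the function name is invented, it assumes the column layout shown above (device, fstype, size, mount point, options), and it relies on awk -v, so on Solaris nawk is probably needed instead of the old awk.

```shell
# Read lufslist output on stdin and print the device size of the file
# system mounted on the given mount point.
# $1: mount point, e.g. / or /export
fs_size() {
    awk -v mp="$1" '$4 == mp { print $3 }'
}

# Possible use, to confirm both root slices have the same size:
#   old=$(lufslist -n snv_38 | fs_size /)
#   new=$(lufslist -n snv_39 | fs_size /)
#   [ "$old" = "$new" ] || echo "root slice sizes differ: $old vs $new"
```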

You then just need to upgrade the second BE using the installation media of the desired release or revision.

# luupgrade -u -n snv_39 -s /cdrom/cdrom0

Install media is CD/DVD. </cdrom/cdrom0>.
Waiting for CD/DVD media </cdrom/cdrom0> ...
Copying failsafe multiboot from media.
Uncompressing miniroot
Creating miniroot device
miniroot filesystem is <ufs>
Mounting miniroot at </cdrom/cdrom0/Solaris_11/Tools/Boot>
Validating the contents of the media </cdrom/cdrom0>.
The media is a standard Solaris media.
The media contains an operating system upgrade image.
The media contains <Solaris> version <11>.
Constructing upgrade profile to use.
Locating the operating system upgrade program.
Checking for existence of previously scheduled Live Upgrade requests.
Creating upgrade profile for BE <snv_39>.
Checking for GRUB menu on ABE <snv_39>.
Checking for x86 boot partition on ABE.
Determining packages to install or upgrade for BE <snv_39>.
Performing the operating system upgrade of the BE <snv_39>.
CAUTION: Interrupting this process may leave the boot environment unstable
or unbootable.
Upgrading Solaris: 100% completed
Installation of the packages from this media is complete.
Deleted empty GRUB menu on ABE <snv_39>.
Adding operating system patches to the BE <snv_39>.
The operating system patch installation is complete.
ABE boot partition backing deleted.
Configuring failsafe for system.
Failsafe configuration is complete.
INFORMATION: The file </var/sadm/system/logs/upgrade_log> on boot
environment <snv_39> contains a log of the upgrade operation.
INFORMATION: The file </var/sadm/system/data/upgrade_cleanup> on boot
environment <snv_39> contains a log of cleanup operations required.
INFORMATION: Review the files listed above. Remember that all of the files
are located on boot environment <snv_39>. Before you activate boot
environment <snv_39>, determine if any additional system maintenance is
required or if additional media of the software distribution must be
installed.
The Solaris upgrade of the boot environment <snv_39> is complete.
Installing failsafe
Failsafe install is complete.

If something went wrong during the upgrade of the new Boot Environment snv_39, you can always start over with a fresh one using the lumake -n snv_39 command. If all went smoothly, you can now check and compare the newly created BE:

# lucompare -t snv_39 -o /tmp/lucompare.snv_39
# lumount -n snv_39
/.alt.snv_39
# mount -p | grep snv_39
/dev/dsk/c1d0s6 - /.alt.snv_39 ufs - no rw,intr,largefiles,logging,xattr,onerror=panic
# luumount -n snv_39
#
# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_38                     yes      yes    yes       no     -
snv_39                     yes      no     no        yes    -

Since the overall upgrade is OK, you just need to activate the fresh BE snv_39, export your data zpool, and perform a clean reboot with shutdown; otherwise the new environment will not be activated. Do not use the uadmin, halt, or reboot commands!

# luactivate snv_39
# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_38                     yes      yes    no        no     -
snv_39                     yes      no     yes       no     -
#
# zpool export datazp
# shutdown -y -g 0 -i 6

Et voilà! After the reboot, you should see something similar to:

# uname -a
SunOS unic 5.11 snv_39 i86pc i386 i86pc
#
# cat /etc/release
                            Solaris Nevada snv_39 X86
           Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                              Assembled 01 May 2006
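
To confirm from a script that the system really came up on the BE that was activated, the BE flagged "Active On Reboot" can be saved before the reboot and compared with the output of lucurr afterwards. The function below is a sketch only: its name and the file used in the usage comment are mine, and it assumes the lustatus column layout shown above, with BE names that contain no spaces.

```shell
# Read lustatus output on stdin and print the name of the BE whose
# "Active On Reboot" column (the fourth one) says yes.
be_active_on_reboot() {
    awk '$4 == "yes" { print $1 }'
}

# Possible use:
#   before the reboot:  lustatus | be_active_on_reboot > /var/tmp/expected_be
#   after the reboot:   [ "$(lucurr)" = "$(cat /var/tmp/expected_be)" ] ||
#                           echo "booted on the wrong BE" >&2
```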

Last, please find some invaluable documentation on the subject below: