blog'o thnet


Tag - Upgrade


Wednesday 18 July 2012

Update the HBA firmware on Oracle-branded HBAs

Updating the emlxs driver will no longer automatically update the HBA firmware on Oracle-branded HBAs. If an HBA firmware update is required on an Oracle-branded HBA, a WARNING message will be placed in the /var/adm/messages file, such as this one:

# grep emlx /var/adm/messages
[...]
Jul 18 02:37:11 beastie emlxs: [ID 349649 kern.info] [ 1.0340]emlxs0:WARNING:1540: Firmware update required. (A manual HBA reset or link reset (using luxadm or fcadm) is required.)
Jul 18 02:37:15 beastie emlxs: [ID 349649 kern.info] [ 1.0340]emlxs1:WARNING:1540: Firmware update required. (A manual HBA reset or link reset (using luxadm or fcadm) is required.)
[...]

If present, this message indicates that the emlxs driver has determined that the firmware kernel component needs to be updated. To perform this update, execute luxadm -e forcelip on Solaris 10 (or fcadm force-lip on Solaris 11) against each emlxs instance that reports the message. As stated in the documentation:

This procedure, while disruptive, will ensure that both driver and firmware are current. The force lip will temporarily disrupt I/O on the port. The disruption and firmware upgrade takes approximately 30-60 seconds to complete as seen from the example messages below. The example shows an update is needed for emlxs instance 0 (emlxs0) and emlxs instance 1 (emlxs1), which happens to correlate to the c1 and c2 controllers in this case.

# fcinfo hba-port
HBA Port WWN: 10000000c9e43860
        OS Device Name: /dev/cfg/c1
        Manufacturer: Emulex
        Model: LPe12000-S
        Firmware Version: 1.00a12 (U3D1.00A12)
        FCode/BIOS Version: Boot:5.03a0 Fcode:3.01a1
        Serial Number: 0999BT0-1136000725
        Driver Name: emlxs
        Driver Version: 2.60k (2011.03.24.16.45)
        Type: N-port
        State: online
        Supported Speeds: 2Gb 4Gb 8Gb
        Current Speed: 8Gb
        Node WWN: 20000000c9e43860
HBA Port WWN: 10000000c9e435fe
        OS Device Name: /dev/cfg/c2
        Manufacturer: Emulex
        Model: LPe12000-S
        Firmware Version: 1.00a12 (U3D1.00A12)
        FCode/BIOS Version: Boot:5.03a0 Fcode:3.01a1
        Serial Number: 0999BT0-1136000724
        Driver Name: emlxs
        Driver Version: 2.60k (2011.03.24.16.45)
        Type: N-port
        State: online
        Supported Speeds: 2Gb 4Gb 8Gb
        Current Speed: 8Gb
        Node WWN: 20000000c9e435fe
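On Solaris 11, the same reset can be issued with fcadm instead, which takes the HBA port WWN reported by fcinfo above. A minimal sketch, not part of the original run (syntax per fcadm(1M)):

# fcadm force-lip 10000000c9e43860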

To avoid interrupting service, and because MPxIO (native multipathing I/O) is in use, each emlxs instance will be updated one after the other.
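Before resetting each port, it may also be worth confirming that every LUN really has two operational paths. A quick sketch, assuming the standard mpathadm utility is available:

# mpathadm list lu   /* Each LUN should report an Operational Path Count of 2. */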

# date
Wed Jul 18 09:34:11 CEST 2012

# luxadm -e forcelip /dev/cfg/c1

# grep emlx /var/adm/messages
[...]
Jul 18 09:35:48 beastie emlxs: [ID 349649 kern.info] [ 5.0334]emlxs0: NOTICE: 710: Link down.
Jul 18 09:35:53 beastie emlxs: [ID 349649 kern.info] [13.02C0]emlxs0: NOTICE: 200: Adapter initialization. (Firmware update needed. Updating. id=67 fw=6)
Jul 18 09:35:53 beastie emlxs: [ID 349649 kern.info] [ 3.0ECB]emlxs0: NOTICE:1520: Firmware download. (AWC file: KERN: old=1.00a11  new=1.10a8  Update.)
Jul 18 09:35:53 beastie emlxs: [ID 349649 kern.info] [ 3.0EEB]emlxs0: NOTICE:1520: Firmware download. (DWC file: TEST:             new=1.00a4  Update.)
Jul 18 09:35:53 beastie emlxs: [ID 349649 kern.info] [ 3.0EFF]emlxs0: NOTICE:1520: Firmware download. (DWC file: STUB: old=1.00a12  new=2.00a3  Update.)
Jul 18 09:35:53 beastie emlxs: [ID 349649 kern.info] [ 3.0F1D]emlxs0: NOTICE:1520: Firmware download. (DWC file: SLI2: old=1.00a12  new=2.00a3  Update.)
Jul 18 09:35:53 beastie emlxs: [ID 349649 kern.info] [ 3.0F2C]emlxs0: NOTICE:1520: Firmware download. (DWC file: SLI3: old=1.00a12  new=2.00a3  Update.)
Jul 18 09:36:01 beastie emlxs: [ID 349649 kern.info] [ 3.0143]emlxs0: NOTICE:1521: Firmware download complete. (Status good.)
Jul 18 09:36:06 beastie emlxs: [ID 349649 kern.info] [ 5.055E]emlxs0: NOTICE: 720: Link up. (8Gb, fabric, initiator)

# date
Wed Jul 18 09:39:51 CEST 2012

# luxadm -e forcelip /dev/cfg/c2

# grep emlx /var/adm/messages
[...]
Jul 18 09:41:35 beastie emlxs: [ID 349649 kern.info] [ 5.0334]emlxs1: NOTICE: 710: Link down.
Jul 18 09:41:40 beastie emlxs: [ID 349649 kern.info] [13.02C0]emlxs1: NOTICE: 200: Adapter initialization. (Firmware update needed. Updating. id=67 fw=6)
Jul 18 09:41:40 beastie emlxs: [ID 349649 kern.info] [ 3.0ECB]emlxs1: NOTICE:1520: Firmware download. (AWC file: KERN: old=1.00a11  new=1.10a8  Update.)
Jul 18 09:41:40 beastie emlxs: [ID 349649 kern.info] [ 3.0EEB]emlxs1: NOTICE:1520: Firmware download. (DWC file: TEST:             new=1.00a4  Update.)
Jul 18 09:41:40 beastie emlxs: [ID 349649 kern.info] [ 3.0EFF]emlxs1: NOTICE:1520: Firmware download. (DWC file: STUB: old=1.00a12  new=2.00a3  Update.)
Jul 18 09:41:40 beastie emlxs: [ID 349649 kern.info] [ 3.0F1D]emlxs1: NOTICE:1520: Firmware download. (DWC file: SLI2: old=1.00a12  new=2.00a3  Update.)
Jul 18 09:41:40 beastie emlxs: [ID 349649 kern.info] [ 3.0F2C]emlxs1: NOTICE:1520: Firmware download. (DWC file: SLI3: old=1.00a12  new=2.00a3  Update.)
Jul 18 09:41:48 beastie emlxs: [ID 349649 kern.info] [ 3.0143]emlxs1: NOTICE:1521: Firmware download complete. (Status good.)
Jul 18 09:41:53 beastie emlxs: [ID 349649 kern.info] [ 5.055E]emlxs1: NOTICE: 720: Link up. (8Gb, fabric, initiator)

That's it. Lastly, the documentation says:

At this point, the firmware upgrade is complete as indicated by the Status good message above. A reboot is not strictly necessary to begin using the new firmware. But the fcinfo hba-port command may still report the old firmware version. This is only a reporting defect that does not affect firmware operation and will be corrected in a later version of fcinfo. To correct the version shown by fcinfo, a second reboot is necessary. On systems capable of DR, you can perform dynamic reconfiguration on the HBA (via cfgadm unconfigure/configure) instead of rebooting.

For my part, I tried to unconfigure/configure each emlxs instance using cfgadm without a reboot, but this did not work as expected on Solaris 10: the fcinfo utility still reports the old firmware version, seemingly until the next reboot.
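For reference, the DR sequence attempted looked like the following sketch, using the controller names from the example above:

# cfgadm -c unconfigure c1
# cfgadm -c configure c1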

Saturday 16 July 2011

About The Oracle Solaris 11 Express Support Repository Updates

With the latest update to the Oracle Solaris 11 Express 2010.11 SRU, released last week (5 July 2011), Oracle introduced the number of the repository update in the output of the uname command.

For Solaris up to version 10, the uname command just displayed the kernel revision number, which is anything but explicit unless you are intimate with kernel PatchIDs.
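For example, on Solaris 10 the output looks like this (with a hypothetical kernel patch level):

$ uname -v
Generic_142910-17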

If you wanted to be confident about which Update of the OS you were running, a better way was to look at the /etc/release file, which is more accurate in terms of operating system baseline information, because it was updated by a system update or by applying the Oracle Solaris Patch Update Bundle for a given Update.

It seems that things are now evolving, certainly because of the way IPS works. The uname output now shows the exact update of the SRU, reflecting very precisely the update the system is running, but the /etc/release file is currently stuck at the build of the Solaris release, say snv_151a in the case of Oracle Solaris 11 Express 2010.11:

$ pkg search -p entire
PACKAGE                        PUBLISHER
pkg:/entire@0.5.11-0.151.0.1.8 solaris
$ uname -v
151.0.1.8
$ cat /etc/release
                      Oracle Solaris 11 Express snv_151a X86
     Copyright (c) 2010, Oracle and/or its affiliates.  All rights reserved.
                           Assembled 04 November 2010
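Another way to read the SRU level is to ask IPS directly. A small sketch, assuming the support repository is configured as above; the trailing digit of the branch, here 8, is the SRU number:

$ pkg info entire | grep Branch
        Branch: 0.151.0.1.8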

Monday 10 January 2011

Solaris 11 Express: Problem #3

In this series, I report the bugs and problems I find when running the Oracle Solaris 11 Express distribution. I hope this will give these PRs more visibility at Oracle so they can be corrected before the release of Solaris 11 next year.

I recently switched from the official Oracle release repository to the support repository for Solaris 11 Express. Before the switch, one non-global zone had been created. Since there were some updates in this repository, I ran pkg update, rebooted into the new boot environment, and tried to update the non-global zone:

# beadm list                                                                         
BE        Active Mountpoint Space Policy Created          
--        ------ ---------- ----- ------ -------          
solaris   -      -          9.88M static 2010-12-01 09:32 
solaris-1 NR     /          5.44G static 2011-01-03 19:35 

# zoneadm list -vc                                                                   
  ID NAME             STATUS     PATH                           BRAND    IP    
   0 global           running    /                              ipkg     shared
   - zone1            installed  /dpool/store/zone/zone1        ipkg     shared

# zoneadm -z zone1 detach

# zoneadm -z zone1 attach -u
Log File: /var/tmp/zone1.attach_log.lfa49e
Attaching...

preferred global publisher: solaris
       Global zone version: entire@0.5.11,5.11-0.151.0.1.1:20101222T214417Z
   Non-Global zone version: entire@0.5.11,5.11-0.151.0.1:20101105T054056Z

                     Cache: Using /var/pkg/download.
  Updating non-global zone: Output follows
Creating Plan                          
ERROR: Could not update attaching zone
                    Result: Attach Failed.

# cat /var/tmp/zone1.attach_log.lfa49e
[Monday, January  3, 2011 08:42:24 PM CET] Log File: /var/tmp/zone1.attach_log.lfa49e
[Monday, January  3, 2011 08:42:25 PM CET] Attaching...
[Monday, January  3, 2011 08:42:25 PM CET] existing
[Monday, January  3, 2011 08:42:25 PM CET] 
[Monday, January  3, 2011 08:42:25 PM CET]   Sanity Check: Passed.  Looks like an OpenSolaris system.
[Monday, January  3, 2011 08:42:31 PM CET] preferred global publisher: solaris
[Monday, January  3, 2011 08:42:32 PM CET]        Global zone version: entire@0.5.11,5.11-0.151.0.1.1:20101222T214417Z
[Monday, January  3, 2011 08:42:32 PM CET]    Non-Global zone version: entire@0.5.11,5.11-0.151.0.1:20101105T054056Z

[Monday, January  3, 2011 08:42:32 PM CET]                      Cache: Using /var/pkg/download.
[Monday, January  3, 2011 08:42:32 PM CET]   Updating non-global zone: Output follows
pkg set-publisher: 
Unable to locate certificate '/dpool/store/zone/zone1/root/dpool/store/zone/zone1/root/var/pkg/ssl/Oracle_Solaris_11_Express_Support.certificate.pem' needed to access 'https://pkg.oracle.com/solaris/support/'.
pkg unset-publisher: 
Removal failed for 'za23954': The preferred publisher cannot be removed.

pkg: The following pattern(s) did not match any packages in the current catalog.
Try relaxing the pattern, refreshing and/or examining the catalogs:
        entire@0.5.11,5.11-0.151.0.1.1:20101222T214417Z
[Monday, January  3, 2011 08:44:04 PM CET] ERROR: Could not update attaching zone
[Monday, January  3, 2011 08:44:06 PM CET]                     Result: Attach Failed.

FYI, this problem was covered by Bug ID 13000, but it is still present at this time, at least for Solaris 11 Express 2010.11.

So, it seems that the change of repository for the solaris publisher was not well handled by the non-global zone update mechanism. Just to be sure, I tried to create a new non-global zone in the new boot environment, but the problem exists in this case, too:

# zoneadm -z zone2 install
A ZFS file system has been created for this zone.
   Publisher: Using solaris (https://pkg.oracle.com/solaris/support/ ).
   Publisher: Using opensolaris.org (http://pkg.opensolaris.org/dev/).
       Image: Preparing at /dpool/store/zone/zone2/root.
 Credentials: Propagating Oracle_Solaris_11_Express_Support.key.pem
 Credentials: Propagating Oracle_Solaris_11_Express_Support.certificate.pem
Traceback (most recent call last):
  File "/usr/bin/pkg", line 4225, in handle_errors
    __ret = func(*args, **kwargs)
  File "/usr/bin/pkg", line 4156, in main_func
    ret = image_create(pargs)
  File "/usr/bin/pkg", line 3836, in image_create
    variants=variants, props=set_props)
  File "/usr/lib/python2.6/vendor-packages/pkg/client/api.py", line 3205, in image_create
    uri=origins[0])
TypeError: 'set' object does not support indexing

pkg: This is an internal error.  Please let the developers know about this
problem by filing a bug at http://defect.opensolaris.org and including the
above traceback and this message.  The version of pkg(5) is '052adf36c3f4'.
ERROR: failed to create image

FYI, this problem is covered by Bug ID 17653.

Well, no luck here. I have not yet seen a Solaris IPS update for these problems, which are very annoying, to say the least.

Update #1 (2011-02-03): The problem is now fixed in the latest support pkg repository. Installing or attaching a non-global ipkg-branded zone now works as expected:

# zoneadm -z zone1 install
A ZFS file system has been created for this zone.
   Publisher: Using solaris (https://pkg.oracle.com/solaris/support/ ).
   Publisher: Using opensolaris.org (http://pkg.opensolaris.org/dev/).
   Publisher: Using sunfreeware (http://pkg.sunfreeware.com:9000/).
       Image: Preparing at /dpool/export/zone/zone1/root.
 Credentials: Propagating Oracle_Solaris_11_Express_Support.key.pem
 Credentials: Propagating Oracle_Solaris_11_Express_Support.certificate.pem
       Cache: Using /var/pkg/download. 
Sanity Check: Looking for 'entire' incorporation.
  Installing: Core System (output follows)
               Packages to install:     1
           Create boot environment:    No
[...]
        Note: Man pages can be obtained by installing SUNWman
 Postinstall: Copying SMF seed repository ... done.
 Postinstall: Applying workarounds.
        Done: Installation completed in 332.525 seconds.

  Next Steps: Boot the zone, then log into the zone console (zlogin -C)
              to complete the configuration process.

And:

# zoneadm -z zone1 detach
# zoneadm -z zone1 attach -u
Log File: /var/tmp/zone1.attach_log.PhaOKf
Attaching...

preferred global publisher: solaris
       Global zone version: entire@0.5.11,5.11-0.151.0.1.2:20110127T225841Z
   Non-Global zone version: entire@0.5.11,5.11-0.151.0.1.2:20110127T225841Z

                     Cache: Using /var/pkg/download.
  Updating non-global zone: Output follows
No updates necessary for this image.   
  Updating non-global zone: Zone updated.
                    Result: Attach Succeeded.
# zoneadm list -vc
  ID NAME             STATUS     PATH                           BRAND    IP    
   0 global           running    /                              ipkg     shared
   - zone1            installed  /dpool/export/zone/zone1       ipkg     shared

Thursday 7 October 2010

Live Upgrading To Solaris 10 9/10

If you try to update to the latest Solaris 10 Update (U9), one new step is now required in order to successfully luupgrade to the desired Update. As mentioned in the Oracle Solaris 10 9/10 Release Notes, a new Auto Registration mechanism has been added to this release to facilitate registering the system using your Oracle support credentials.

So, if you try the classical luupgrade incantation that follows, it will fail with the reported message:

# luupgrade -u -n s10u9 -s /mnt -j /var/tmp/profile
System has findroot enabled GRUB
No entry for BE <s10u9> in GRUB menu
Copying failsafe kernel from media.
61364 blocks
miniroot filesystem is <ufs>
Mounting miniroot at 
ERROR: The auto registration file <> does not exist or incomplete.
       The auto registration file is mandatory for this upgrade.
       Use -k <autoreg_file> argument along with luupgrade command.

So, you now need to pass the Auto Registration choice as a mandatory parameter. Here is what it looks like right now:

# echo "auto_reg=disable" > /var/tmp/sysidcfg
# luupgrade -u -n s10u9 -s /mnt -j /var/tmp/profile -k /var/tmp/sysidcfg
System has findroot enabled GRUB
No entry for BE  in GRUB menu
Copying failsafe kernel from media.
61364 blocks
miniroot filesystem is 
Mounting miniroot at 
#######################################################################
 NOTE: To improve products and services, Oracle Solaris communicates
 configuration data to Oracle after rebooting.

 You can register your version of Oracle Solaris to capture this data
 for your use, or the data is sent anonymously.

 For information about what configuration data is communicated and how
 to control this facility, see the Release Notes or
 www.oracle.com/goto/solarisautoreg.

 INFORMATION: After activated and booted into new BE <s10u9>,
 Auto Registration happens automatically with the following Information

autoreg=disable
#######################################################################
Validating the contents of the media </mnt>.
The media is a standard Solaris media.
The media contains an operating system upgrade image.
The media contains <Solaris> version <10>.
Constructing upgrade profile to use.
Locating the operating system upgrade program.
Checking for existence of previously scheduled Live Upgrade requests.
Creating upgrade profile for BE <s10u9>.
Checking for GRUB menu on ABE <s10u9>.
Saving GRUB menu on ABE <s10u9>.
Checking for x86 boot partition on ABE.
Determining packages to install or upgrade for BE <s10u9>.
Performing the operating system upgrade of the BE <s10u9>.
CAUTION: Interrupting this process may leave the boot environment unstable
or unbootable.
Upgrading Solaris: 100% completed
Installation of the packages from this media is complete.
Restoring GRUB menu on ABE <s10u9>.
Updating package information on boot environment <s10u9>.
Package information successfully updated on boot environment <s10u9>.
Adding operating system patches to the BE <s10u9>.
The operating system patch installation is complete.
ABE boot partition backing deleted.
PBE GRUB has no capability information.
PBE GRUB has no versioning information.
ABE GRUB is newer than PBE GRUB. Updating GRUB.
GRUB update was successfull.
INFORMATION: The file </var/sadm/system/logs/upgrade_log> on boot
environment <s10u9> contains a log of the upgrade operation.
INFORMATION: The file </var/sadm/system/data/upgrade_cleanup> on boot
environment <s10u9> contains a log of cleanup operations required.
INFORMATION: Review the files listed above. Remember that all of the files
are located on boot environment <s10u9>. Before you activate boot
environment <s10u9>, determine if any additional system maintenance is
required or if additional media of the software distribution must be
installed.
The Solaris upgrade of the boot environment <s10u9> is complete.
Creating miniroot device
Configuring failsafe for system.
Failsafe configuration is complete.
Installing failsafe
Failsafe install is complete.

I am not sure this will ease the upgrade path to this Update, even if there is nothing really wrong with it. It could just have been made less intrusive, I think.

Sunday 2 May 2010

Live Upgrading When Diagnostics Mode Is Enabled

Recently, we faced an interesting problem when using Live Upgrade on some of our SPARC servers (with lots of non-global zones hosted on SAN devices). Here are the basic steps we generally follow when using LU:

  1. Update the Live Upgrade functionality according to the Article ID #1004881.1, Solaris Live Upgrade Software: Patch Requirements.
  2. Create the ABE.
  3. Upgrade the ABE with an operating system image (and test the upgrade according to a JumpStart profile).
  4. Apply a determined Recommended Patch Cluster to the ABE.
  5. Activate the ABE to be the next booted BE.
  6. Reboot on the new BE, and perform post-configuration steps if needed (see the command sketch after this list).
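A minimal command-level sketch of steps 2 through 5, assuming an ABE named s10u8 built on slice c1t0d0s4 (matching the example below), a JumpStart profile in /var/tmp/profile, and a patch cluster unpacked under /var/tmp/10_Recommended (hypothetical paths and PatchIDs):

# lucreate -n s10u8 -m /:/dev/dsk/c1t0d0s4:ufs
# luupgrade -u -n s10u8 -s /mnt -j /var/tmp/profile
# luupgrade -t -n s10u8 -s /var/tmp/10_Recommended 119254-90 [...]
# luactivate s10u8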

In some circumstances, even though all the steps went pretty well and the activation of the new BE was OK (we traced its activities), we rebooted onto the old BE:

# lustatus
Boot Environment     Is       Active Active    Can    Copy
Name                 Complete Now    On Reboot Delete Status
-------------------- -------- ------ --------- ------ -------
s10u4                yes      yes    yes       no     -
s10u8                yes      no     no        yes    -
# lucurr
s10u4
# luactivate -n s10u8
[...]
# lustatus
Boot Environment     Is       Active Active    Can    Copy
Name                 Complete Now    On Reboot Delete Status
-------------------- -------- ------ --------- ------ -------
s10u4                yes      yes    no        no     -
s10u8                yes      no     yes       no     -
# shutdown -y -g 0 -i 6
[...]
# lucurr
s10u4

Ouch. After a bit of digging, and seeing nothing wrong from the console via the Service Processor, we hit the following message in the log of the SMF legacy script run by LU when rebooting (at shutdown time, more precisely):

# cat /var/svc/log/rc6.log
[...]
Executing legacy init script "/etc/rc0.d/K62lu".
Live Upgrade: Deactivating current boot environment <s10u4>.
zlogin: login allowed only to running zones (zonename1 is 'installed').
zlogin: login allowed only to running zones (zonename2 is 'installed').
Live Upgrade: Executing Stop procedures for boot environment <s10u4>.
Live Upgrade: Current boot environment is <s10u4>.
Live Upgrade: New boot environment will be <s10u8>.
Live Upgrade: Activating boot environment <s10u8>.
Creating boot_archive for /.alt.tmp.b-9Tb.mnt
updating /.alt.tmp.b-9Tb.mnt/platform/sun4v/boot_archive
Live Upgrade: The boot device for boot environment <s10u8> is
</dev/dsk/c1t0d0s4>.
/etc/lib/lu/lubootdev: ERROR: Unable to get current boot devices.
/etc/lib/lu/lubootdev: INFORMATION: The system is running with the system
boot PROM diagnostics mode enabled. When diagnostics mode is
enabled, Live Upgrade is unable to access the system boot
device list, causing certain features of Live Upgrade (such
as changing the system boot device after activating a boot
environment) to fail. To correct this problem, please run
the system in normal, non-diagnostic mode. The system might
have a key switch or other external means of booting the
system in normal mode. If you do not have such a means, you
can set one or both of the EEPROM parameters 'diag-switch?'
or 'diagnostic-mode?' to 'false'.  After making a change,
either through external means or by changing an EEPROM
parameter, retry the Live Upgrade operation or command.
ERROR: Live Upgrade: Unable to change primary boot device to boot
environment <s10u8>.
ERROR: You must manually change the system boot prom to boot the system
from device </pci@0/pci@0/pci@2/scsi@0/sd@0,0:e>.
Live Upgrade: Activation of boot environment <s10u8> completed.
Legacy init script "/etc/rc0.d/K62lu" exited with return code 0.
[...]

Well, pretty explicit in fact, but very unexpected when the activation had gone so well beforehand. So, check the EEPROM, and change it back if necessary:

# eeprom diag-switch?
diag-switch?=true
# eeprom diag-switch?=false
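The log message mentions a second parameter as well; both can be checked in one call, and diagnostic-mode? reset the same way if it turns out to be true (a sketch):

# eeprom diag-switch? diagnostic-mode?
# eeprom diagnostic-mode?=false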

And everything returned to normal when activating again and rebooting. Although this case is self-explanatory in the corresponding log file, and is described in Bug ID #6949588, I think it could be made more visible to the system administrator, for example by checking the EEPROM configuration during BE activation (in the luactivate command).

Sunday 21 May 2006

Upgrading from snv_38 to snv_39 Using Solaris Live Upgrade

After writing about how to patch (or upgrade) a running system playing with a mirrored OpenSolaris SVM, here is a little step-by-step how-to on upgrading (or patching, etc.) a live system using the Live Upgrade feature.

Before installing or running Live Upgrade, you are required to install a limited set of patch revisions. Make sure you have the most recently updated patch list by consulting sunsolve.sun.com. Search for Info Doc 72099 on the SunSolve web site (you must have a registered Sun support customer account to view this document).
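A quick way to check whether a given patch revision from that list is already installed, using a hypothetical PatchID:

# showrev -p | grep 121430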

Note: In the following procedure, we will assume that everything we want (and need) to upgrade to is provided via one large DVD ISO image.
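If that image is a file rather than physical media, it can be loopback-mounted first. A sketch with a hypothetical image path; in that case, substitute the chosen mount point for /cdrom/cdrom0 in the commands below:

# lofiadm -a /export/iso/sol-nv-b39.iso
/dev/lofi/1
# mount -F hsfs -o ro /dev/lofi/1 /mnt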

If all seems OK, you must first update the currently running system with the appropriate LU packages, i.e. those provided for the targeted OS revision. You can either use the provided tools:

# /cdrom/cdrom0/Solaris_11/Tools/Installers/liveupgrade20

Or do it yourself:

# pkgrm SUNWluu SUNWlur
# pkgadd -d /cdrom/cdrom0/Solaris_11/Product SUNWlur SUNWluu

Since the current OS is entirely installed on the first slice of the first disk (c1d0s0), and since slice six (c1d0s6) is exactly the same size as the first one, we will use it as the device for the second ABE and create the corresponding Boot Environment.

# lucreate -c snv_38 -n snv_39 -m /:/dev/dsk/c1d0s6:ufs
/* If the snv_38 BE already exists, just create the new one for snv_39. */
# lucreate -n snv_39 -m /:/dev/dsk/c1d0s6:ufs
Discovering physical storage devices
Discovering logical storage devices
Cross referencing storage devices with boot environment configurations
Determining types of file systems supported
Validating file system requests
Preparing logical storage devices
Preparing physical storage devices
Configuring physical storage devices
Configuring logical storage devices
Analyzing system configuration.
Comparing source boot environment <snv_38> file systems with the file
system(s) you specified for the new boot environment. Determining which
file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Searching /dev for possible boot environment filesystem devices

Updating system configuration files.
The device </dev/dsk/c1d0s6> is not a root device for any boot environment.
Creating configuration for boot environment <snv_39>.
Source boot environment is <snv_38>.
Creating boot environment <snv_39>.
Checking for GRUB menu on boot environment <snv_39>.
The boot environment <snv_39> does not contain the GRUB menu.
Creating file systems on boot environment <snv_39>.
Creating <ufs> file system for </> on </dev/dsk/c1d0s6>.
Mounting file systems for boot environment <snv_39>.
Calculating required sizes of file systems for boot environment <snv_39>.
Populating file systems on boot environment <snv_39>.
Checking selection integrity.
Integrity check OK.
Populating contents of mount point </>.
Copying.
Creating shared file system mount points.
Creating compare databases for boot environment <snv_39>.
Creating compare database for file system </>.
Updating compare databases on boot environment <snv_39>.
Making boot environment <snv_39> bootable.
Updating bootenv.rc on ABE <snv_39>.
Population of boot environment <snv_39> successful.
Creation of boot environment <snv_39> successful.

Verify the correct assignment of the different file systems, in particular between those which are cloned (required by a Solaris installation, such as /, /var, /usr, and /opt) and those which are shared (such as /export).

# lufslist -n snv_38
               boot environment name: snv_38
               This boot environment is currently active.
               This boot environment will be active on next system boot.

Filesystem           fstype    device size Mounted on     Mount Options
-------------------- -------- ------------ -------------- --------------
/dev/dsk/c1d0s1      swap       4301821440 -              -
/dev/dsk/c1d0s0      ufs        8595417600 /              -
/dev/dsk/c1d0s7      ufs       58407713280 /export        -
#
# lufslist -n snv_39
               boot environment name: snv_39

Filesystem           fstype    device size Mounted on     Mount Options
-------------------- -------- ------------ -------------- --------------
/dev/dsk/c1d0s1      swap       4301821440 -              -
/dev/dsk/c1d0s6      ufs        8595417600 /              -
/dev/dsk/c1d0s7      ufs       58407713280 /export        -

You then just need to upgrade the second BE using the installation media of the desired release or revision.

# luupgrade -u -n snv_39 -s /cdrom/cdrom0

Install media is CD/DVD. </cdrom/cdrom0>.
Waiting for CD/DVD media </cdrom/cdrom0> ...
Copying failsafe multiboot from media.
Uncompressing miniroot
Creating miniroot device
miniroot filesystem is <ufs>
Mounting miniroot at </cdrom/cdrom0/Solaris_11/Tools/Boot>
Validating the contents of the media </cdrom/cdrom0>.
The media is a standard Solaris media.
The media contains an operating system upgrade image.
The media contains <Solaris> version <11>.
Constructing upgrade profile to use.
Locating the operating system upgrade program.
Checking for existence of previously scheduled Live Upgrade requests.
Creating upgrade profile for BE <snv_39>.
Checking for GRUB menu on ABE <snv_39>.
Checking for x86 boot partition on ABE.
Determining packages to install or upgrade for BE <snv_39>.
Performing the operating system upgrade of the BE <snv_39>.
CAUTION: Interrupting this process may leave the boot environment unstable
or unbootable.
Upgrading Solaris: 100% completed
Installation of the packages from this media is complete.
Deleted empty GRUB menu on ABE <snv_39>.
Adding operating system patches to the BE <snv_39>.
The operating system patch installation is complete.
ABE boot partition backing deleted.
Configuring failsafe for system.
Failsafe configuration is complete.
INFORMATION: The file </var/sadm/system/logs/upgrade_log> on boot
environment <snv_39> contains a log of the upgrade operation.
INFORMATION: The file </var/sadm/system/data/upgrade_cleanup> on boot
environment <snv_39> contains a log of cleanup operations required.
INFORMATION: Review the files listed above. Remember that all of the files
are located on boot environment <snv_39>. Before you activate boot
environment <snv_39>, determine if any additional system maintenance is
required or if additional media of the software distribution must be
installed.
The Solaris upgrade of the boot environment <snv_39> is complete.
Installing failsafe
Failsafe install is complete.

If something went wrong during the upgrade of the new Boot Environment snv_39, you can always start over with a fresh one using the lumake -n snv_39 command. If all went smoothly, you can now check and compare the newly created BE:

# lucompare -t snv_39 -o /tmp/lucompare.snv_39
# lumount -n snv_39
/.alt.snv_39
# mount -p | grep snv_39
/dev/dsk/c1d0s6 - /.alt.snv_39 ufs - no rw,intr,largefiles,logging,xattr,onerror=panic
# luumount -n snv_39
#
# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_38                     yes      yes    yes       no     -
snv_39                     yes      no     no        yes    -

Since the overall upgrade is OK, you just need to activate the fresh BE snv_39, export your data zpool and perform a clean reboot, otherwise the new environment will not be activated. Do not use the uadmin, halt, or reboot commands!

# luactivate snv_39
# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_38                     yes      yes    no        no     -
snv_39                     yes      no     yes       no     -
#
# zpool export datazp
# shutdown -y -g 0 -i 6

Et voilà! After the reboot, you should see something similar to:

# uname -a
SunOS unic 5.11 snv_39 i86pc i386 i86pc
#
# cat /etc/release
                            Solaris Nevada snv_39 X86
           Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                              Assembled 01 May 2006


Monday 1 May 2006

How to Patch a Live System Mirrored with SVM

Aim of this memo

The main purpose of this technical note is to demonstrate how to patch a running (live) system currently mirrored using SVM, minimizing the downtime as far as possible.

The idea is simple: detach one side of the mirror, apply the patch cluster against it, and reboot on it. If all seems OK, re-encapsulate the system. This achieves a goal similar to the Live Upgrade feature of the Solaris OS (see live_upgrade(5)), with less complexity and different requirements (LVM RAID-1 vs. a spare disk or free slice).

Using this solution, the downtime ranges from 10 to 30 minutes of service unavailability (depending on the hardware POST), and a maximum of two reboots is required, whatever the number of patches to apply.

Here it is

Here is a system encapsulated using SDS 4.x or SVM 1.x, and the associated SVM encapsulation configuration:

# metastat -p
d3 -m d13 d23 1
d13 1 1 c0t0d0s3
d23 1 1 c0t1d0s3
d1 -m d11 d21 1
d11 1 1 c0t0d0s1
d21 1 1 c0t1d0s1
d0 -m d10 d20 1
d10 1 1 c0t0d0s0
d20 1 1 c0t1d0s0
#
# cat /etc/vfstab
#device         device          mount   FS      fsck    mount   mount
#to mount       to fsck         point   type    pass    at boot options
#
fd      -       /dev/fd fd      -       no      -
/proc   -       /proc   proc    -       no      -
/dev/md/dsk/d3  -       -       swap    -       no      -
/dev/md/dsk/d0  /dev/md/rdsk/d0 /       ufs     1       no      -
/dev/md/dsk/d1  /dev/md/rdsk/d1 /var    ufs     1       no      -
swap    -       /tmp    tmpfs   -       yes     -

Run an explorer and generate a patch cluster, based on tools provided by the OSE for example, if you are lucky enough to have one included with your support plan (or just pick one provided at SunSolve).

Then, make sure you are able to boot from both disks, just in case:

# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0
# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t1d0s0

The next step is to voluntarily detach one side of the mirror: take the first one for the sake of simplicity (i.e. c0t0d0). Indeed, in this case we are pretty sure that its alias name at the OBP is disk.

Note: You can always create it at the OBP (using the usual set of commands, such as show-disks, devalias, etc.) if you want. That is just a matter of personal preferences.
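A sketch of what this looks like at the OBP, with a hypothetical device path:

ok show-disks
ok nvalias mirror2 /pci@1f,700000/scsi@2/disk@1,0
ok devalias mirror2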

# lockfs -af /* Just to minimize the fs inconsistencies at next fsck(1m). */
#
# metadetach d0 d10
# metadetach d1 d11
# metadetach d3 d13
#
# metaclear d10
# metaclear d11
# metaclear d13

Check and repair the file systems if necessary, since we will boot on them the next time:

# fsck /dev/dsk/c0t0d0s0
# fsck /dev/dsk/c0t0d0s1

The next steps consist of mounting the recently detached file systems and preparing the first disk to boot without SVM encapsulation:

# mkdir /mirror
# mount /dev/dsk/c0t0d0s0 /mirror
# mount /dev/dsk/c0t0d0s1 /mirror/var
#
# cat << EOF > /mirror/etc/vfstab
#device         device          mount   FS      fsck    mount   mount
#to mount       to fsck         point   type    pass    at boot options
#
fd      -       /dev/fd fd      -       no      -
/proc   -       /proc   proc    -       no      -
/dev/dsk/c0t0d0s3       -       -       swap    -       no      -
/dev/dsk/c0t0d0s0       /dev/rdsk/c0t0d0s0      /       ufs     1       no      -
/dev/dsk/c0t0d0s1       /dev/rdsk/c0t0d0s1      /var    ufs     1       no      -
swap    -       /tmp    tmpfs   -       yes     -
EOF
#
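/* Comment out the rootdev entry so the kernel no longer mounts root from the metadevice. */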
# cp /mirror/etc/system /mirror/etc/system.orig
# sed -e 's;rootdev:/pseudo/md@0:0,0,blk;*rootdev:/pseudo/md@0:0,0,blk;' \
   /mirror/etc/system.orig > /mirror/etc/system

Last, install the patches against the first disk, clean things up a little, and reboot if the install procedure went smoothly:

# ./install_all_patches -R /mirror
#
# umount /mirror/var
# umount /mirror
# rmdir /mirror
#
# shutdown -y -g 0 -i 6

After rebooting, carefully review the behavior of the freshly patched system. If all seems well, don't forget to re-encapsulate the second disk. Here is a quick and easy way to do this:

/* Recreate the metadb. */
# metadb -d c0t0d0s4 c0t1d0s4
# metadb -a -c3 -f c0t0d0s4 c0t1d0s4
#
/* Clear the system metadevices still present. */
# metaclear d0
# metaclear d1
# metaclear d3
# metaclear d20
# metaclear d21
# metaclear d23
#
/* Re-create them as part of a mirror. */
# metainit -f d10 1 1 c0t0d0s0
# metainit d0 -m d10
# metainit -f d11 1 1 c0t0d0s1
# metainit d1 -m d11
# metainit -f d13 1 1 c0t0d0s3
# metainit d3 -m d13
#
/* Be able to boot on the new metadevices. */
# metaroot d0
#
/* Reboot, and create the second side of the mirror. */
# shutdown -y -g 0 -i 6
[...]
# metainit d20 1 1 c0t1d0s0
# metattach d0 d20
# metainit d21 1 1 c0t1d0s1
# metattach d1 d21
# metainit d23 1 1 c0t1d0s3
# metattach d3 d23

For a more detailed explanation about encapsulating the system using SVM on Sun Solaris, please refer to the dedicated entry in this blog.

Last, it must be mentioned that this documentation was written by our OSE, and that this procedure was officially marked as supported by Sun Microsystems.