blog'o thnet

Tag - JumpStart

Sunday 29 August 2010

Apropos Solaris

John Fowler (Oracle Executive Vice President for Server and Storage Systems) held an on-line webcast on August 10 on the strategy for hardware servers based on SPARC and x86, and the formalization of the upcoming release of Solaris 11 in 2011.

This post only aims to summarize the main points; the complete slides of the presentation are available on the Oracle web site.

  1. Message #1: SPARC is alive and will continue. Solaris is alive and will continue. Both will be actively developed.
  2. Message #2: What is interesting here is that these are not only intentions: there is a real roadmap, covering up to five years, for the well-known ex-Sun products. Oracle clearly has strong plans for Solaris and for the SPARC and x86 platforms, and has just begun to speak publicly about them. We will probably see more about all of them at Oracle OpenWorld in a few weeks.

The points are:

  • A roadmap for SPARC and Solaris up to 2015.
  • SPARC performance will double every two years. Targets at the end of the roadmap:
    • Cores: 128 (32 in 2010).
    • Threads: 16384 (512 in 2010).
    • Memory capacity: 64TB (4TB in 2010).
    • Logical Domains: 256 (128 in 2010).
    • Java Ops per second: 50000 (5000 in 2010).
  • Very SPARC oriented: it seems that there will only be one SPARC brand at the end of 2015.
  • Two big families of SPARC servers: lots of threads known as the T-Series, lots of sockets known as M-Series.
  • At least one update to Solaris 10 around 2010Q3, a beta program for Solaris 11 known as Solaris 11 Express due in late 2010, then Solaris 11 itself due in 2011, with the roadmap running up to 2015.

Solaris 11 will be based on the now closed OpenSolaris distribution, and will include:

  • Image Packaging System (IPS): a totally new packaging system, fully integrated with ZFS and Boot Environment administration (aimed at replacing Live Upgrade).
  • Crossbow network virtualization stack.
  • ZFS de-duplication, and lots of recent optimizations and functionalities.
  • CIFS file services: in-kernel implementation of CIFS.
  • Enhanced Gnome user environment.
  • Updated installer and automated network installer ("AI", aimed at replacing JumpStart).
  • Network Automagic configuration.
  • And many more (I heard Solaris 10 BrandZ...).

Sunday 2 May 2010

Live Upgrading When Diagnostics Mode Is Enabled

Recently, we faced an interesting problem when using Live Upgrade on some of our SPARC servers (with lots of non-global zones hosted on SAN devices). Here are the basic steps we generally follow when using LU:

  1. Update the Live Upgrade functionality according to Article ID #1004881.1, Solaris Live Upgrade Software: Patch Requirements.
  2. Create the ABE.
  3. Upgrade the ABE with an operating system image (and test the upgrade according to a JumpStart profile).
  4. Apply a specific Recommended Patch Cluster to the ABE.
  5. Activate the ABE to be the next booted BE.
  6. Reboot into the new BE and, if needed, perform the post-configuration steps.
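
As a rough sketch, steps 2 to 6 map onto the Live Upgrade commands more or less as follows. The function only prints the commands (a dry run) so they can be reviewed before anything is executed. The `lu_plan` name, the image path and the patch directory are placeholders of mine, and the extra `lucreate` options (file systems, zones) depend on the disk layout, so they are left out.

```shell
#!/bin/sh
# Dry-run sketch: print the Live Upgrade commands for a given ABE name.
# Step 1 (updating the LU packages and patches) is not covered here.
# The image and patch paths are hypothetical placeholders; adapt them.
lu_plan() {
    be=$1        # name of the alternate boot environment, e.g. s10u8
    image=$2     # path to the OS installation image (placeholder)
    patches=$3   # directory holding the unpacked patch cluster (placeholder)

    echo "lucreate -n $be"                  # step 2 (add -m options as needed)
    echo "luupgrade -u -n $be -s $image"    # step 3: upgrade the ABE
    echo "luupgrade -t -n $be -s $patches"  # step 4: patch IDs are listed after
    echo "luactivate -n $be"                # step 5: same form as in the session below
    echo "shutdown -y -g 0 -i 6"            # step 6: reboot into the new BE
}

lu_plan s10u8 /path/to/os_image /path/to/patches
```

Printing the plan first is a deliberate choice: each of these commands takes a while and changes the ABE, so it is worth reading the exact command lines before running them by hand.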

In some circumstances, even though all the steps went well and the activation of the new BE was fine (we traced its activities), the server rebooted into the old BE:

# lustatus
Boot Environment     Is       Active Active    Can    Copy
Name                 Complete Now    On Reboot Delete Status
-------------------- -------- ------ --------- ------ -------
s10u4                yes      yes    yes       no     -
s10u8                yes      no     no        yes    -
# lucurr
s10u4
# luactivate -n s10u8
[...]
# lustatus
Boot Environment     Is       Active Active    Can    Copy
Name                 Complete Now    On Reboot Delete Status
-------------------- -------- ------ --------- ------ -------
s10u4                yes      yes    no        no     -
s10u8                yes      no     yes       no     -
# shutdown -y -g 0 -i 6
[...]
# lucurr
s10u4

Ouch. After a bit of digging, and seeing nothing wrong on the console via the Service Processor, we found the following message in the log of the SMF legacy script run by LU when rebooting (at shutdown time, more precisely):

# cat /var/svc/log/rc6.log
[...]
Executing legacy init script "/etc/rc0.d/K62lu".
Live Upgrade: Deactivating current boot environment <s10u4>.
zlogin: login allowed only to running zones (zonename1 is 'installed').
zlogin: login allowed only to running zones (zonename2 is 'installed').
Live Upgrade: Executing Stop procedures for boot environment <s10u4>.
Live Upgrade: Current boot environment is <s10u4>.
Live Upgrade: New boot environment will be <s10u8>.
Live Upgrade: Activating boot environment <s10u8>.
Creating boot_archive for /.alt.tmp.b-9Tb.mnt
updating /.alt.tmp.b-9Tb.mnt/platform/sun4v/boot_archive
Live Upgrade: The boot device for boot environment <s10u8> is
</dev/dsk/c1t0d0s4>.
/etc/lib/lu/lubootdev: ERROR: Unable to get current boot devices.
/etc/lib/lu/lubootdev: INFORMATION: The system is running with the system
boot PROM diagnostics mode enabled. When diagnostics mode is
enabled, Live Upgrade is unable to access the system boot
device list, causing certain features of Live Upgrade (such
as changing the system boot device after activating a boot
environment) to fail. To correct this problem, please run
the system in normal, non-diagnostic mode. The system might
have a key switch or other external means of booting the
system in normal mode. If you do not have such a means, you
can set one or both of the EEPROM parameters 'diag-switch?'
or 'diagnostic-mode?' to 'false'.  After making a change,
either through external means or by changing an EEPROM
parameter, retry the Live Upgrade operation or command.
ERROR: Live Upgrade: Unable to change primary boot device to boot
environment <s10u8>.
ERROR: You must manually change the system boot prom to boot the system
from device </pci@0/pci@0/pci@2/scsi@0/sd@0,0:e>.
Live Upgrade: Activation of boot environment <s10u8> completed.
Legacy init script "/etc/rc0.d/K62lu" exited with return code 0.
[...]
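
A quick way to spot this kind of failure after a reboot is to pull the ERROR lines out of the shutdown log. This is only a minimal sketch, assuming the log lives at the path shown above and that Live Upgrade keeps tagging its failures with "ERROR:"; the `lu_errors` function name is mine.

```shell
#!/bin/sh
# Print the error lines from a shutdown log.
# $1: log file to scan (defaults to the rc6 log shown above).
# Exits non-zero when no ERROR line is found, like grep itself.
lu_errors() {
    grep 'ERROR:' "${1:-/var/svc/log/rc6.log}"
}

# Example, after rebooting:
#   lu_errors && echo "Live Upgrade reported errors at shutdown"
```

Note that the legacy script itself exited with return code 0 in our case, so its exit status cannot be relied upon; the text of the log has to be checked.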

Well, pretty explicit in fact, but very unexpected when the activation went so well beforehand. So, check the EEPROM, and change the setting back if necessary:

# eeprom diag-switch?
diag-switch?=true
# eeprom diag-switch?=false

Everything returned to normal after activating again and rebooting. Although this case is self-explanatory in the corresponding log file, and is described in Bug ID #6949588, I think it should be made more visible to the system administrator, for example by checking the EEPROM configuration during BE activation (in the luactivate command).
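
Until something like that exists, the check can be done by hand before activating. Here is a small sketch of such a pre-flight test. It assumes that `eeprom <parameter>` prints `<parameter>=<value>`, as in the session above; the `is_true` and `diag_status` names are mine and are not part of Live Upgrade.

```shell
#!/bin/sh
# Pre-flight check before luactivate: is PROM diagnostics mode enabled?

# Succeeds (exit 0) when an eeprom output line reports the value "true".
is_true() {
    case "$1" in
        *=true) return 0 ;;
        *)      return 1 ;;
    esac
}

# Prints "blocked" if any of the given eeprom output lines is true,
# otherwise "ok". The lines are passed as arguments, so the function
# can be exercised without access to a real PROM.
diag_status() {
    for line in "$@"; do
        if is_true "$line"; then
            echo "blocked"
            return 0
        fi
    done
    echo "ok"
}

# Usage on a real system, with the two parameters named in the log above:
#   status=$(diag_status "$(eeprom 'diag-switch?' 2>/dev/null)" \
#                        "$(eeprom 'diagnostic-mode?' 2>/dev/null)")
#   [ "$status" = "ok" ] || echo "Diagnostics mode is on: fix it before luactivate"
```

Anything that is not an explicit `=true` (including a parameter that is not available on the platform) is treated as "not enabled", which matches the wording of the Live Upgrade message.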

Thursday 22 March 2007

Patching an x86 Miniroot Image for the Solaris OS

Generally speaking, BigAdmin is a great and valuable resource for Sun system administrators. Here is an awesome article describing how to patch (update) the kernel used during an installation or system upgrade, known as the miniroot, on x86-based Solaris platforms.

At work, we encountered exactly such a bug between Solaris 6/06 and the bundled nVidia driver, which prevented us from jumpstarting a Sun Fire X4100 M2 Server. The support team said we could apply specific patches, already present in Solaris 11/06 at that time. Because we didn't really know the exact procedure to follow to update the miniroot accordingly, and because these machines had to be provisioned very quickly, we didn't investigate much further in that direction (and ended up installing them from DVD-ROMs, in the server room). Now, after reading the article, we will certainly take the time to do so... if we know how to get the proper bundle of patches to fix our bug.