blog'o thnet

Wednesday 3 December 2008

GRUB Boot Archive With SVM, A Better Approach

In a previous discussion about the GRUB boot archive and how it can be regenerated in Failsafe mode, I mentioned that this is not as easy as it could be when the root file system uses the md driver. I previously showed a method that requires unmirroring one or more file systems when the root file system is built upon an SVM mirror. This was far from optimal: many manipulations are involved, which invites human error and may seem a little complicated.

The better approach shown here is based on Performing System Recovery, from the official Solaris Volume Manager documentation, which showed up last month on the Sun-Managers mailing list.

Note: Although this test case was done using Solaris 10 10/08 in a virtual machine running under VirtualBox on the latest OpenSolaris release, the instructions should be valid for Solaris 10 1/06 and later.

Initial setup

As we saw before, the system uses only a root file system and a swap device. Both are encapsulated with SVM:

# df -k -F ufs
Filesystem     kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d0 6147798 3455578 2630743      57%  /
# swap -l
swapfile             dev  swaplo blocks   free
/dev/md/dsk/d1      85,1       8 4194288 4194288
# metastat -c d0 d1
d0               m  6.0GB d10 d20
    d10          s  6.0GB c0d0s0
    d20          s  6.0GB c1d1s0
d1               m  2.0GB d11 d21
    d11          s  2.0GB c0d0s1
    d21          s  2.0GB c1d1s1
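
When working from Failsafe mode, it helps to know which physical slices sit under a given mirror. Here is a minimal sketch that extracts them from `metastat -c` output read on standard input. The function name `mirror_slices` is my own, and it assumes the compact output layout shown above and an awk that accepts `-v` (nawk on Solaris).

```shell
# Sketch only: print the slices under one mirror, reading `metastat -c`
# output on stdin. $1 = mirror name (e.g. d0).
# Assumes the compact layout shown above: "m" marks a mirror line,
# "s" marks a submirror (stripe) line whose 4th field is the slice.
mirror_slices() {
    awk -v mirror="$1" '
        $2 == "m" { in_mirror = ($1 == mirror); next }
        in_mirror && $2 == "s" { print $4 }
    '
}
```

Typical use would be `metastat -c d0 d1 | mirror_slices d0`.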

Regenerate the GRUB boot archive

The idea is to boot into the GRUB Failsafe mode, get the md configuration from the local root file system, and manually load the md module, which is then properly configured. The main advantage is that everything is done from within Failsafe mode, without manipulating SVM more than necessary, and in particular without breaking the mirror and losing redundancy for a time.

[...]
Booting to milestone "milestone/single-user:default".
Configuring devices.
Searching for installed OS instances...
/dev/dsk/c0d0s0 is under md control, skipping.
/dev/dsk/c1d1s0 is under md control, skipping.
No installed OS instance found.

Starting shell.
# mount -F ufs -o ro /dev/dsk/c0d0s0 /a
# cp -p /a/kernel/drv/md.conf /kernel/drv
# umount /a
# update_drv -f md
devfsadm: mkdir failed for /dev 0x1ed: Read-only file system
# metainit -r
# metasync d0
# fsck /dev/md/rdsk/d0
# mount -F ufs /dev/md/dsk/d0 /a
# bootadm update-archive -R /a
# umount /a
# reboot
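
The session above can be wrapped in a small function to reduce typing mistakes. This is only a sketch of the same sequence, not an official procedure: the names `run`, `rebuild_archive_on_md_root` and the `DRY_RUN` variable are hypothetical, and I assume the devfsadm message printed by `update_drv` is harmless, as it was in my session. Note that `fsck` may still ask questions interactively.

```shell
# Sketch only. With DRY_RUN=1 (hypothetical variable) the commands are
# printed instead of executed, which allows a review before the real run.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = one slice of the root mirror (e.g. c0d0s0), $2 = root metadevice (e.g. d0)
rebuild_archive_on_md_root() {
    slice=$1
    md=$2

    # Fetch the md configuration from the local root file system, read-only.
    run mount -F ufs -o ro "/dev/dsk/$slice" /a || return 1
    run cp -p /a/kernel/drv/md.conf /kernel/drv || return 1
    run umount /a || return 1

    # Reload the md driver; its exit status is ignored on purpose
    # (assumption: the devfsadm error seen in Failsafe mode is harmless).
    run update_drv -f md

    # Rebuild the metadevices, then work on the mirror itself.
    run metainit -r || return 1
    run metasync "$md" || return 1
    run fsck "/dev/md/rdsk/$md" || return 1
    run mount -F ufs "/dev/md/dsk/$md" /a || return 1
    run bootadm update-archive -R /a || return 1
    run umount /a
}
```

For example, `DRY_RUN=1 rebuild_archive_on_md_root c0d0s0 d0` prints the ten commands without touching the system. The final reboot is left out deliberately.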

Really interesting!

Thursday 13 March 2008

Update A Corrupted GRUB Boot Archive, With SVM

In a previous discussion about the GRUB boot archive and how it can be regenerated, I mentioned that this is not as easy as it could be when the root file system uses the md driver. I will now show two different methods to do the same thing when the root file system is built upon an SVM mirror (RAID-1):

  1. Unmirror the root file system only.
  2. Unmirror the entire system, i.e. all devices.

Note: Although this test case was done using Solaris 10 8/07 in a virtual machine running under VirtualBox on the latest Solaris Express Community Edition, the instructions should be valid for Solaris 10 1/06 and later.

Initial setup

As we can see, the system uses only a root file system and a swap device. Both are encapsulated with SVM.

# df -k -F ufs
Filesystem     kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d0 6147798 3455578 2630743      57%  /
# swap -l
swapfile             dev  swaplo blocks   free
/dev/md/dsk/d1      85,1       8 4194288 4194288
# metastat -c d0 d1
d0               m  6.0GB d10 d20
    d10          s  6.0GB c0d0s0
    d20          s  6.0GB c1d1s0
d1               m  2.0GB d11 d21
    d11          s  2.0GB c0d0s1
    d21          s  2.0GB c1d1s1

Unmirror the root file system only

The idea is to boot into the GRUB Failsafe mode, select the first side of the mirror, and modify the system and vfstab configuration files to use the correct device path. For the system file, this means actually removing the rootdev:/pseudo/md@0:0,0,bl entry, not just commenting it out. For the vfstab file, this means replacing the root file system metadevice path /dev/md/[r]dsk/d0 with the first underlying device path, i.e. /dev/[r]dsk/c0d0s0. Last, regenerate the boot archive on the alternate root path.

[...]
Booting to milestone "milestone/single-user:default".
Configuring devices.
Searching for installed OS instances...
/dev/dsk/c0d0s0 is under md control, skipping.
/dev/dsk/c1d1s0 is under md control, skipping.
No installed OS instance found.

Starting shell.
# fsck /dev/rdsk/c0d0s0
# mount -F ufs /dev/dsk/c0d0s0 /a
# cp /a/etc/system /a/etc/system.bckp
# cp /a/etc/vfstab /a/etc/vfstab.bckp
# TERM=vt100 vi /a/etc/system
# TERM=vt100 vi /a/etc/vfstab
# bootadm update-archive -R /a
# umount /a
# fsck /dev/rdsk/c0d0s0
# reboot
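
The two vi edits in the session above could also be scripted, which avoids typos in the device paths. The following is a minimal sketch under my own assumptions: the function names `strip_rootdev` and `unmirror_vfstab_entry` are hypothetical, the awk must accept `-v` (nawk on Solaris), and I have not checked which tools are actually available in the Failsafe environment. Make the backup copies first, as in the session.

```shell
# Sketch only: remove the rootdev entry from a system file. $1 = file path.
strip_rootdev() {
    sed '/^rootdev:/d' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Sketch only: point a vfstab entry at a plain slice instead of a metadevice.
# $1 = vfstab path, $2 = metadevice (e.g. d0), $3 = underlying slice (e.g. c0d0s0).
# Only exact matches on the first two fields are rewritten, so d0 does not
# match d01. Rewritten lines are re-joined with tabs; other lines are untouched.
unmirror_vfstab_entry() {
    awk -v md="$2" -v slice="$3" '
        BEGIN { OFS = "\t" }
        /^#/ { print; next }
        $1 == "/dev/md/dsk/" md  { $1 = "/dev/dsk/" slice }
        $2 == "/dev/md/rdsk/" md { $2 = "/dev/rdsk/" slice }
        { print }
    ' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

With the root file system mounted on /a, this would be called as `strip_rootdev /a/etc/system` and `unmirror_vfstab_entry /a/etc/vfstab d0 c0d0s0`.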

Then, boot into the milestone/multi-user:default level and detach the second half of the mirror, since the first half corresponds to the valid and updated underlying device. Next, restore the original configuration files, which refer to the encapsulated metadevices, and reboot.

# df -k -F ufs
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0d0s0      6147798 3458810 2627511    57%    /
# swap -l
swapfile             dev  swaplo blocks   free
/dev/md/dsk/d1      85,1       8 4194288 4194288
# metastat -c d0
d0               m  6.0GB d10 d20
    d10          s  6.0GB c0d0s0
    d20          s  6.0GB c1d1s0
# metadetach d0 d20
d0: submirror d20 is detached
# metastat -c d0
d0               m  6.0GB d10
    d10          s  6.0GB c0d0s0
# cp /etc/system.bckp /etc/system
# cp /etc/vfstab.bckp /etc/vfstab
# shutdown -y -i 6 -g 0

After the reboot, just reattach the second half of the mirror, and wait for the synchronization to complete in order to be fully redundant again.

# df -k -F ufs
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d0       6147798 3458714 2627607    57%    /
# swap -l
swapfile             dev  swaplo blocks   free
/dev/md/dsk/d1      85,1       8 4194288 4194288
# metattach d0 d20
d0: submirror d20 is attached
# metastat -c d0
d0               m  6.0GB d10 d20 (resync-29%)
    d10          s  6.0GB c0d0s0
    d20          s  6.0GB c1d1s0
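
Rather than rerunning metastat by hand, the resync progress can be polled. Here is a minimal sketch, assuming the `(resync-NN%)` suffix shown above is what `metastat -c` prints while a resync is running. The function names and the 60-second interval are my own choices.

```shell
# Sketch only: extract the resync percentage from `metastat -c` output
# on stdin. Prints nothing when no resync is in progress.
resync_percent() {
    sed -n 's/.*(resync-\([0-9][0-9]*\)%).*/\1/p'
}

# Sketch only: poll a mirror until the resync marker disappears.
# $1 = mirror name (e.g. d0).
wait_for_resync() {
    while pct=$(metastat -c "$1" | resync_percent); [ -n "$pct" ]; do
        echo "$1: resync at ${pct}%"
        sleep 60
    done
    echo "$1: no resync in progress"
}
```

Calling `wait_for_resync d0` right after the metattach returns only once the mirror no longer reports a resync.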

Unmirror the entire system, i.e. all devices

The idea is exactly the same as for unmirroring the root file system only, except that the vfstab file must also be adapted to change the swap entry. (So, I didn't reproduce the code listing here.)

Then, boot into the milestone/single-user:default level by modifying the corresponding GRUB entry as follows: kernel /platform/i86pc/multiboot -s. Completely delete all the metadevices and metadb configurations to clear the SVM settings. Last, continue into the milestone/multi-user:default level to boot unmirrored.

# metaclear -f -r d0 d1
# metadb -f -d  c1d0s4 c1d0s4
# ^D

Now, the system must be fully encapsulated by SVM again. Please refer to the online Sun documentation, or to some past entries on this subject, depending on the system's architecture: SPARC systems, or x86 platforms.

Sunday 9 March 2008

Update A Corrupted GRUB Boot Archive, Without SVM

Solaris 10 systems on the x86 architecture use the GNU GRand Unified Bootloader (GRUB), which is the boot loader responsible for loading a boot archive into a system's memory. The boot archive is a collection of critical files (kernel modules and configuration files) that are required to boot the Solaris OS. As stated in the Sun documentation:

These files are needed during system startup before the root file system is mounted. Two boot archives are maintained on a system:

  • The boot archive that is used to boot the Solaris OS on a system. This boot archive is sometimes called the primary boot archive.
  • The boot archive that is used for recovery when the primary boot archive is damaged. This boot archive starts the system without mounting the root file system. On the GRUB menu, this boot archive is called failsafe. The archive's essential purpose is to regenerate the primary boot archive, which is usually used to boot the system.

The Solaris OS generally keeps the boot archive properly synchronized on its own. Sometimes, the boot archive gets corrupted, for example when (bad) patches are applied, or when the operating system has crashed. In these cases, the boot archive must be regenerated. This is easily accomplished by following the Sun documentation: x86: How to Boot the Failsafe Archive for Recovery Purposes, and x86: How to Boot the Failsafe Archive to Forcibly Update a Corrupt Boot Archive. The main drawback arises when the system is encapsulated under an SVM mirror (RAID-1), since the md driver is not handled in failsafe mode. Please refer to this blog entry on this subject, if needed.

Friday 30 March 2007

ZFS Recent News

Well. More than a real blog entry, this post is about keeping up with some recent additions in the ZFS area. First, you can read the ZFS Overview and Guide just published on BigAdmin. Second, you must watch the excellent Thumper do it yourself, which is a very nice showcase of ZFS use. Third, a great listing of recent additions to the latest SXCE builds is available at Robert Milkowski's blog.

Last, be sure to check Tim Foster's explanations about the recently announced ZFS Boot support in build 62, for the x86 platform. All the interesting links are included, and so is his script to set up a ZFS root automatically! (Since not all bits are well integrated yet...)

Monday 11 July 2005

Memo About Some Very Interesting CLI Tools

Boot disk configuration

After verifying there are two disks in the boot list...

# bootlist -m normal -o
hdisk0
hdisk1

... verify and create a boot image on the second mirrored boot disk:

# bosboot -vd hdisk1 && bosboot -ad hdisk1

How to know which disk the OS has booted from (bootblock used and kernel loaded):

# bootinfo -b
hdisk0

How to know in which mode the OS has booted (32-bit or 64-bit kernel):

# bootinfo -K
64

If there is a problem booting from one disk, make sure that the corresponding raw device is the same device as ipldevice:

# bootinfo -b
hdisk0
#
# ls -ilF /dev/ipldevice /dev/rhdisk0
 8231 crw-------   2 root     system       17,  0 Apr 21 14:37 /dev/ipldevice
 8231 crw-------   2 root     system       17,  0 Apr 21 14:37 /dev/rhdisk0
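
The check above compares the inode numbers by eye. It can be scripted with a small helper, sketched below. The function name `same_inode` is my own, and comparing inode numbers is only meaningful because both entries live in the same /dev file system.

```shell
# Sketch only: succeed (status 0) when both paths have the same inode number.
# Returns 2 when either path does not exist. Meaningful only for two entries
# on the same file system, as is the case under /dev.
same_inode() {
    [ -e "$1" ] && [ -e "$2" ] || return 2
    a=$(ls -id "$1" | awk '{ print $1 }')
    b=$(ls -id "$2" | awk '{ print $1 }')
    [ "$a" = "$b" ]
}
```

For example: `same_inode /dev/ipldevice /dev/r$(bootinfo -b) || echo "ipldevice does not match the boot disk"`.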

LVM information vs. ODM information

Assuming the following mounted file system:

# mount
  node       mounted        mounted over    vfs       date        options      
-------- ---------------  ---------------  ------ ------------ --------------- 
[...]
         /dev/fslv07      /files/tmpcdinst jfs2   Jun 27 10:14 rw,log=/dev/loglv01

Here is the corresponding information found in the ODM:

# odmget -q "name=fslv07 and attribute=type" CuAt 

CuAt:
        name = "fslv07"
        attribute = "type"
        value = "jfs2"
        type = "R"
        generic = "DU"
        rep = "s"
        nls_index = 639

This can be compared with the status returned by the lsvg command:

# lsvg -l colombvg
colombvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
[...]
fslv07              jf2        280   280   2    open/syncd    /files/tmpcdinst

More on this particular subject in the story Export and Import a Volume Group... When Things Goes the Wrong Way.
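
The comparison between the ODM and lsvg views can be automated for a given logical volume. This is a minimal sketch that parses the two output formats shown above. The function names are hypothetical, and the parsing assumes the stanza and column layouts are exactly as listed here.

```shell
# Sketch only: print the value field of an ODM stanza read on stdin,
# e.g. from: odmget -q "name=fslv07 and attribute=type" CuAt
odm_attr_value() {
    sed -n 's/^[[:space:]]*value = "\([^"]*\)".*/\1/p'
}

# Sketch only: print the TYPE column for one logical volume, reading
# `lsvg -l <vg>` output on stdin. $1 = LV name.
lsvg_lv_type() {
    awk -v lv="$1" '$1 == lv { print $2 }'
}

# Sketch only: report whether both views agree. $1 = volume group, $2 = LV name.
check_lv_type() {
    odm=$(odmget -q "name=$2 and attribute=type" CuAt | odm_attr_value)
    lvm=$(lsvg -l "$1" | lsvg_lv_type "$2")
    if [ "$odm" = "$lvm" ]; then
        echo "$2: type '$odm' is consistent"
    else
        echo "$2: ODM says '$odm', lsvg says '$lvm'"
    fi
}
```

With the listings above, `check_lv_type colombvg fslv07` would compare the `jfs2` value from the ODM with whatever lsvg prints in its TYPE column.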

Get the ODM volume group information for a given disk:

# lqueryvg -p hdisk0 -Avt
Max LVs:        256
PP Size:        27
Free PPs:       0
LV count:       12
PV count:       2
Total VGDAs:    3
Conc Allowed:   0
MAX PPs per PV  1016
MAX PVs:        32
Conc Autovaryo  0
Varied on Conc  0
Logical:        00ce3a0c00004c000000010364ba67bc.1   hd5 1  
                00ce3a0c00004c000000010364ba67bc.2   hd6 1  
                00ce3a0c00004c000000010364ba67bc.3   hd8 1  
                00ce3a0c00004c000000010364ba67bc.4   hd4 1  
                00ce3a0c00004c000000010364ba67bc.5   hd2 1  
                00ce3a0c00004c000000010364ba67bc.6   hd9var 1  
                00ce3a0c00004c000000010364ba67bc.7   hd3 1  
                00ce3a0c00004c000000010364ba67bc.8   hd1 1  
                00ce3a0c00004c000000010364ba67bc.9   hd10opt 1  
                00ce3a0c00004c000000010364ba67bc.10  loglv00 1  
                00ce3a0c00004c000000010364ba67bc.11  fslv00 1  
                00ce3a0c00004c000000010364ba67bc.12  fslv04 1  
Physical:       00ce3a0c64ba5da3                2   0  
                00ce3a0c8df2265d                1   0  
VGid:           00ce3a0c00004c000000010364ba67bc
Total PPs:      158
LTG size:       128
HOT SPARE:      0
AUTO SYNC:      0
VG PERMISSION:  0
SNAPSHOT VG:    0
IS_PRIMARY VG:  0
PSNFSTPP:       4352
VARYON MODE:    0
VG Type:        0
Max PPs:        32512

Operating system general status and information

Gather system configuration information:

# snap -r    # Remove snap command output from the /tmp/ibmsupt directory.
# snap -ac   # Create a compressed pax image (snap.pax.Z file) of all files
             # in the /tmp/ibmsupt directory.

This tool can be compared to the explorer (known as the SUNWexplo package) on Sun Solaris OE.

About starting services at boot time

List the content of the inittab file:

# lsitab -a   # Use this command instead of `cat /etc/inittab`.

Create a new file system

Create a new Enhanced Journaled File System in the colombvg volume group, 5 gigabytes in size, in read-write mode, using the mount point /files/ddaeurd1/DATA, and automatically mounted at boot time:

# crfs -v jfs2 -g colombvg -a size=5G -m /files/ddaeurd1/DATA -p rw -A yes