RAID-1 Volume From the root File System Using SVM on x86 Platform
By Julien Gabel on Wednesday 30 May 2007, 20:12 - OpenSolaris
Here is a short step-by-step guide to creating a software mirror of the
root file system, also known as encapsulating the system disk.
This provides full protection against a single disk failure, with complete
redundancy of the system disk. At the same time, it speeds up read requests
(since multiple backing devices host the same data), although write
performance is generally degraded. First, know your running system:
in particular, which disk it is currently installed on and which other device is
available for the second side of the mirror.
# df -hF ufs
Filesystem size used avail capacity Mounted on
/dev/dsk/c1d0s0 7.9G 5.2G 2.6G 67% /
# swap -lh
swapfile dev swaplo blocks free
/dev/dsk/c1d0s1 102,65 4K 4.0G 4.0G
#
# echo | format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c1d0
/pci@0,0/pci-ide@8/ide@0/cmdk@0,0
1. c2d0
/pci@0,0/pci-ide@8/ide@1/cmdk@0,0
[...]
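Since every later step is expressed in terms of a disk name (c1d0, c2d0) and a slice number, it can help to derive both from the device paths reported above. This is only a minimal sketch: disk_of and slice_of are hypothetical helper names, not Solaris commands, and they assume the usual cNdNsN (or cNtNdNsN) naming.

```shell
# Hypothetical helpers (not part of Solaris): split a slice path such as
# /dev/dsk/c1d0s0 into its disk name and its slice number.
disk_of() {
    dev=${1##*/}                     # drop the directory part  -> c1d0s0
    printf '%s\n' "${dev%s[0-9]*}"   # drop the trailing sN     -> c1d0
}

slice_of() {
    dev=${1##*/}                     # drop the directory part  -> c1d0s0
    printf '%s\n' "${dev##*s}"       # keep what follows the s  -> 0
}

disk_of /dev/dsk/c1d0s0    # prints c1d0
slice_of /dev/dsk/c1d0s1   # prints 1
```

The same slice numbers are reused on the second disk, which is why the label is copied verbatim later on.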
We will use c2d0 as the second submirror. First, create a default Solaris
fdisk partition that spans the whole disk (fdisk -B) and make the disk
bootable (we are using GRUB in this case). The slice for the second submirror
must have a slice tag of root, and the root slice must be slice
0, so we simply copy the label (VTOC) from the boot disk to
the mirror disk.
# fdisk -B /dev/rdsk/c2d0p0
# fdisk /dev/rdsk/c2d0p0
Total disk size is 36483 cylinders
Cylinder size is 16065 (512 byte) blocks
Cylinders
Partition Status Type Start End Length %
========= ====== ============ ===== === ====== ===
1 Active Solaris2 1 36482 36482 100
SELECT ONE OF THE FOLLOWING:
1. Create a partition
2. Specify the active partition
3. Delete a partition
4. Change between Solaris and Solaris2 Partition IDs
5. Exit (update disk configuration and exit)
6. Cancel (exit without updating disk configuration)
Enter Selection:
#
# prtvtoc /dev/rdsk/c1d0s2 | fmthard -s - /dev/rdsk/c2d0s2
fmthard: New volume table of contents now in place.
#
# /sbin/installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c2d0s0
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 260 sectors starting at 50 (abs 16115)
Create replicas of the metadevice state database:
# metadb -a -c 3 -f c1d0s4 c2d0s4
# metadb
flags first blk block count
a u 16 8192 /dev/dsk/c1d0s4
a u 8208 8192 /dev/dsk/c1d0s4
a u 16400 8192 /dev/dsk/c1d0s4
a u 16 8192 /dev/dsk/c2d0s4
a u 8208 8192 /dev/dsk/c2d0s4
a u 16400 8192 /dev/dsk/c2d0s4
The -f flag is required because this is the first invocation of
metadb(1M): no state database exists yet. The -c 3 option creates three
replicas on each of the two slices.
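A word of caution on this two-disk layout. As far as we understand the SVM rules, the system keeps running as long as at least half of the replicas are available, panics with fewer than half, and needs a majority (half plus one) to reboot into multi-user mode. The sketch below only encodes that arithmetic; replica_state is a hypothetical helper, and the thresholds are our reading of the documentation, not something checked against the SVM sources.

```shell
# Hypothetical helper: classify the situation when only $2 out of $1
# state database replicas are still available.
# Thresholds are an assumption based on our reading of the SVM documentation.
replica_state() {
    total=$1
    alive=$2
    if [ $((alive * 2)) -gt "$total" ]; then
        echo "majority"   # runs, and can reboot into multi-user mode
    elif [ $((alive * 2)) -eq "$total" ]; then
        echo "half"       # keeps running, but no majority for a normal reboot
    else
        echo "panic"      # fewer than half of the replicas are left
    fi
}

replica_state 6 6   # both disks present
replica_state 6 3   # one of the two disks lost
```

With three replicas on each of two disks, losing one disk leaves exactly half of the replicas, so the dead replicas would have to be deleted (metadb -d) before the system can come back up in multi-user mode.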
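Set up the RAID-0 metadevices (stripe or concatenation volumes)
corresponding to the / file system and the swap
space, then let metaroot(1M) configure the system files (/etc/vfstab and
/etc/system) for the root metadevice. Since metaroot only handles the root
file system, the swap entry in /etc/vfstab is updated by hand with sed.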
# metainit -f d10 1 1 c1d0s0
d10: Concat/Stripe is setup
# metainit -f d11 1 1 c1d0s1
d11: Concat/Stripe is setup
# metainit d20 1 1 c2d0s0
d20: Concat/Stripe is setup
# metainit d21 1 1 c2d0s1
d21: Concat/Stripe is setup
# metainit d0 -m d10
d0: Mirror is setup
# metainit d1 -m d11
d1: Mirror is setup
#
# cp /etc/vfstab /etc/vfstab.beforesvm
# sed -e 's@/dev/dsk/c1d0s1@/dev/md/dsk/d1@' /etc/vfstab.beforesvm > /etc/vfstab
# metaroot d0
# diff /etc/vfstab /etc/vfstab.beforesvm
6,7c6,7
< /dev/md/dsk/d1 - - swap - no -
< /dev/md/dsk/d0 /dev/md/rdsk/d0 / ufs 1 no -
---
> /dev/dsk/c1d0s1 - - swap - no -
> /dev/dsk/c1d0s0 /dev/rdsk/c1d0s0 / ufs 1 no -
The -f flag is needed for d10 and d11 because the slices they are built on
are currently in use: the root file system is mounted and the swap slice is
active.
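Before overwriting /etc/vfstab, the sed substitution used above can be previewed on sample text. This is a minimal sketch: rewrite_swap is a hypothetical wrapper name, and the two sample lines simply mimic the entries shown in the diff.

```shell
# Hypothetical wrapper around the sed expression used above: it reads vfstab
# lines on stdin and points the swap slice to the d1 mirror.
rewrite_swap() {
    sed -e 's@/dev/dsk/c1d0s1@/dev/md/dsk/d1@'
}

# Sample lines mimicking the original vfstab entries (not read from the system).
sample='/dev/dsk/c1d0s1 - - swap - no -
/dev/dsk/c1d0s0 /dev/rdsk/c1d0s0 / ufs 1 no -'

preview=$(printf '%s\n' "$sample" | rewrite_swap)
printf '%s\n' "$preview"
```

Only the swap line changes here; the root line is left alone on purpose, since it is metaroot that rewrites it.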
Reboot onto the metadevices: the operating system will now boot encapsulated, on a one-sided mirror. Then attach the second half of each mirror and adapt the system dump configuration.
# lockfs -af && shutdown -y -g 0 -i 6
[...]
# metattach d0 d20
d0: submirror d20 is attached
# metattach d1 d21
d1: submirror d21 is attached
#
# metastat -p
d1 -m /dev/md/rdsk/d11 /dev/md/rdsk/d21 1
d11 1 1 /dev/rdsk/c1d0s1
d21 1 1 /dev/rdsk/c2d0s1
d0 -m /dev/md/rdsk/d10 /dev/md/rdsk/d20 1
d10 1 1 /dev/rdsk/c1d0s0
d20 1 1 /dev/rdsk/c2d0s0
# metastat | grep %
Resync in progress: 41 % done
Resync in progress: 46 % done
#
# rmdir /var/crash/*
# mkdir /var/crash/`hostname`
# chmod 700 /var/crash/`hostname`
# dumpadm -s /var/crash/`hostname` -d /dev/md/dsk/d1
Dump content: kernel pages
Dump device: /dev/md/dsk/d1 (swap)
Savecore directory: /var/crash/bento
Savecore enabled: yes
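The mirrors are not fully redundant until the resync reported by "metastat | grep %" has finished. Here is a minimal sketch for watching it; resync_running and resync_percent are hypothetical helper names that only parse text in the format shown above, read from standard input.

```shell
# Hypothetical helper: exit 0 while the text on stdin still reports a resync.
resync_running() {
    grep -q 'Resync in progress'
}

# Hypothetical helper: print the percentage of each resync line on stdin.
resync_percent() {
    sed -n 's/.*Resync in progress: *\([0-9]*\) *% done.*/\1/p'
}

# Possible use on the real system (assumes metastat is in the PATH):
#   while metastat | resync_running; do sleep 60; done
echo 'Resync in progress: 41 % done' | resync_percent   # prints 41
```

Reading from standard input keeps the helpers usable on saved metastat output as well as on a live system.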
Finally, define the alternate boot path in the GRUB menu.lst
configuration file: (hd1,0,a) is the Solaris/BSD slice 0 (a) in the first
fdisk partition (0) of the second BIOS disk (hd1).
# cat << 'EOF' >> /boot/grub/menu.lst
title Solaris Nevada snv_65 X86 (Alternate Boot Path)
root (hd1,0,a)
kernel$ /platform/i86pc/kernel/$ISADIR/unix
module$ /platform/i86pc/$ISADIR/boot_archive
EOF
#
# bootadm list-menu
The location for the active GRUB menu is: /boot/grub/menu.lst
default 0
timeout 10
0 Solaris Nevada snv_65 X86
1 Solaris failsafe
2 Solaris Nevada snv_65 X86 (Alternate Boot Path)
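To make the (hd1,0,a) notation less opaque, here is a minimal sketch that builds such a string from its three components. grub_root is a hypothetical helper, and it assumes that Solaris slices 0 to 7 map to the letters a to h in order.

```shell
# Hypothetical helper: build a legacy GRUB root string from
#   $1 = BIOS disk index (zero-based: the second disk is 1)
#   $2 = fdisk partition number (one-based, as listed by fdisk)
#   $3 = Solaris slice number (0-7, assumed to map to a-h)
grub_root() {
    letter=$(printf '%s' abcdefgh | cut -c $(( $3 + 1 )))
    printf '(hd%d,%d,%s)\n' "$1" $(( $2 - 1 )) "$letter"
}

grub_root 1 1 0   # second BIOS disk, first fdisk partition, slice 0
```

This prints (hd1,0,a), the root used in the alternate boot entry.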
For further (and deeper) information on this subject, please refer to the excellent Sun Microsystems Documentation on Solaris Volume Manager, and particularly x86: Creating a RAID-1 Volume From the root (/) File System.





Comments
Heya Julien,
you can even now use some nifty tools to get a graphical view of your disks.
See http://search.cpan.org/~jfenal/ or a CPAN mirror near you.
Regards,
J.
Hi Jérôme,
I am happy to hear from you after so many months (yes, I had some news from Christophe Pages recently). As for your tool, I gave it a try in the past and found it very interesting. I have not managed SVM configurations as complicated as the one shown on your web site, but it gives a very good idea of the actual layout nonetheless.
What did you do to come up with the slice 4 info in the initial metadb initialization command? That's a bit opaque.
Although it may not be obvious, the VTOC layout on the first disk is assumed to have been in place since installation time. Slice c1d0s4 was created then, in preparation for the later use of SVM, as a mandatory prerequisite. Please refer to the online Sun documentation (http://docs.sun.com/app/docs/doc/81...) for more information about the creation and sizing of State Database Replicas.