by Bernd Schemmer, November 2008
14.07.2012/bs: Links checked and corrected
Homepage: http://www.bnsmb.de/
This article describes how to use a ramdisk as a 3rd submirror for an SVM mirror to improve the read performance of the mirror.
This article only describes how to configure an SVM submirror on a ramdisk; it does not discuss the concept of a volume manager like SVM in general. It also does not suggest a practical use for this configuration -- in fact, there may be none.
It is only meant as a proof of concept.
A SPARC or x86 machine running Solaris 10 with enough memory to create the ramdisk.
The environment for the example below was Solaris 10 Update 6 running in a QEMU virtual machine with 1 GB memory and three virtual disks.
All commands must be executed by the user root or a user with the appropriate rights.
Talking about RAM disks in the Solaris OS | http://www.bnsmb.de/http://wiki.qemu.org/Index.html |
QEMU Homepage | http://wiki.qemu.org/Index.html |
QEMU Project at OpenSolaris | http://www.opensolaris.org/os/project/qemu/ |
First create the SVM mirror on the disks in the normal way, e.g.:
bash-3.00# metainit d11 1 1 c0d1s0
d11: Concat/Stripe is setup
bash-3.00# metainit d12 1 1 c1d0s0
d12: Concat/Stripe is setup
bash-3.00# metainit d10 -m d11
d10: Mirror is setup
bash-3.00# metattach d10 d12
d10: submirror d12 is attached
bash-3.00# metastat -p d10
d10 -m d11 d12 1
d11 1 1 c0d1s0
d12 1 1 c1d0s0
bash-3.00# newfs /dev/md/rdsk/d10
/dev/md/rdsk/d10: Unable to find Media type. Proceeding with system determined parameters.
newfs: construct a new file system /dev/md/rdsk/d10: (y/n)? y
Warning: inode blocks/cyl group (306) >= data blocks (59) in last
cylinder group. This implies 944 sector(s) cannot be allocated.
/dev/md/rdsk/d10: 529200 sectors in 35 cylinders of 240 tracks, 63 sectors
258.4MB in 7 cyl groups (5 c/g, 36.91MB/g, 17536 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 75696, 151360, 227024, 302688, 378352, 454016,
bash-3.00# echo "/dev/md/dsk/d10 /dev/md/rdsk/d10 /mnt ufs 2 yes logging" >>/etc/vfstab
bash-3.00# mount /mnt
bash-3.00# df -k /mnt
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d10       249127    1041  223174     1%    /mnt
The next step is to create the ramdisk for the 3rd submirror using ramdiskadm:
bash-3.00# ramdiskadm -a testdisk 230m
/dev/ramdisk/testdisk
Note: The ramdisk must be at least as big as the other submirrors.
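The required size can be estimated with a little arithmetic. The following is a minimal sketch, assuming that metastat reports sizes in 512-byte blocks and that the m suffix of ramdiskadm means 1024 * 1024 bytes; the function name ramdisk_mb_for_blocks is made up for this illustration. It may be wise to add some headroom, because the size metastat reports for the ramdisk metadevice in this example (471000 blocks) is slightly smaller than the 230 MB requested.

```shell
#!/bin/sh
# Sketch: smallest whole number of megabytes that holds a given number of
# 512-byte blocks (assumption: 1 MB = 1024 * 1024 bytes).
# ramdisk_mb_for_blocks is a hypothetical helper, not an SVM command.
ramdisk_mb_for_blocks() {
    blocks=$1
    # Round up so that the ramdisk is never smaller than the submirror.
    echo $(( (blocks * 512 + 1048575) / 1048576 ))
}

# Example: the disk submirrors in this article are 465885 blocks in size.
ramdisk_mb_for_blocks 465885    # prints 228
```

With the 230m used above, the ramdisk is therefore a little larger than the two disk submirrors.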
Due to a restriction in the SVM commands (metainit, metastat, etc) you must create symbolic links for the ramdisk in the /dev/dsk and /dev/rdsk directories:
bash-3.00# ln -s /dev/ramdisk/testdisk /dev/dsk/c5t0d0s0
bash-3.00# ln -s /dev/rramdisk/testdisk /dev/rdsk/c5t0d0s0
Note: Use a controller number (c5 in this example) which is not already in use.
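Finding an unused controller number (the c5 part of the link name) can be scripted. This is only a sketch, assuming that all disk device links in the directory are named cN followed by "t" or "d"; the helper name first_free_controller is hypothetical. It returns the lowest free number, which need not be the 5 chosen in this example.

```shell
#!/bin/sh
# Sketch: print the lowest controller number N for which the directory
# (default: /dev/dsk) contains no entry named cN followed by "t" or "d".
# first_free_controller is a hypothetical helper name.
first_free_controller() {
    dir=${1:-/dev/dsk}
    n=0
    # The loop continues while grep finds an entry for controller n.
    while ls "$dir" 2>/dev/null | grep "^c${n}[td]" >/dev/null; do
        n=$((n + 1))
    done
    echo "$n"
}

# Usage on a Solaris system (not run here):
#   first_free_controller /dev/dsk
```

The same number should then be used for both links, the one in /dev/dsk and the one in /dev/rdsk.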
Now create a metadevice on the ramdisk
bash-3.00# metainit d13 1 1 c5t0d0s0
d13: Concat/Stripe is setup
bash-3.00# metastat d13
d13: Concat/Stripe
Size: 471000 blocks (229 MB)
Stripe 0:
Device Start Block Dbase Reloc
c5t0d0s0 0 No No
Device Relocation Information:
Device Reloc Device ID
c5t0d0 No -
And attach the new metadevice to the existing mirror:
bash-3.00# metattach d10 d13
d10: submirror d13 is attached
bash-3.00# metastat -p d10
d10 -m d11 d12 d13 1
d11 1 1 c1d0s0
d12 1 1 c0d1s0
d13 1 1 c5t0d0s0
That's it -- now we have an SVM mirror with one submirror on a ramdisk.
But as can be seen in the iostat output below, this configuration does not really improve the read performance of the mirror. For this we must change the read policy for the mirror.
The default read policy for SVM mirror devices is round robin:
bash-3.00# metaparam d10
d10: Mirror current parameters are:
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
We want to change it so that reads are served from our ramdisk-based submirror. Unfortunately there is no option to select a specific submirror as the primary submirror for read requests:
bash-3.00# metaparam
usage: metaparam [-s setname] [options] concat/stripe | RAID
metaparam [-s setname] [options] mirror
Concat/Stripe or RAID options:
-h hotspare_pool | "none"
Mirror options:
-r roundrobin | geometric | first
-w parallel | serial
-p 0-9
Therefore we must first destroy and recreate the mirror (without losing the data, of course):
Stop all applications accessing the mirror and ensure that all three submirrors are in sync:
bash-3.00# metastat | grep sync
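The grep above only shows lines that mention a resync. A stricter check could look at every State: line instead. The following is a minimal sketch, assuming that metastat prints a line of the form "State: Okay" for every healthy mirror and submirror (as in the outputs in this article) and some other value while a submirror is resyncing or needs maintenance; the function name all_states_okay is hypothetical.

```shell
#!/bin/sh
# Sketch: read metastat output on stdin and succeed only if every
# "State:" line reports "Okay".
# all_states_okay is a hypothetical helper name.
all_states_okay() {
    awk '
        BEGIN { bad = 0 }
        # awk ignores leading blanks, so indented State: lines match too.
        $1 == "State:" && $2 != "Okay" { bad = 1 }
        END { exit bad }
    '
}

# Usage on a Solaris system (not run here):
#   metastat d10 | all_states_okay && echo "d10 is in sync"
```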
Now unmount the mirror and clear the SVM metadevice for the mirror
bash-3.00# umount /mnt
bash-3.00# metaclear d10
and recreate the mirror using the ramdisk as first submirror:
bash-3.00# metainit d10 -m d13
bash-3.00# metattach d10 d12
bash-3.00# metattach d10 d11
bash-3.00# metastat -p d10
d10 -m d13 d12 d11 1
d13 1 1 c5t0d0s0
d11 1 1 c1d0s0
d12 1 1 c0d1s0
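Because the read policy "first" sends all reads to the first submirror, it is worth checking the submirror order before changing the policy. A minimal sketch, assuming the "metastat -p" line format shown above (mirror name, -m, then the submirrors in order); first_submirror is a hypothetical helper name, and on Solaris 10 nawk may be needed instead of awk for the -v option.

```shell
#!/bin/sh
# Sketch: print the first submirror of a mirror from "metastat -p" output
# read on stdin. Mirror lines look like: d10 -m d13 d12 d11 1
# first_submirror is a hypothetical helper name.
first_submirror() {
    awk -v mirror="$1" '$1 == mirror && $2 == "-m" { print $3; exit }'
}

# Example with the output shown in this article:
echo "d10 -m d13 d12 d11 1" | first_submirror d10    # prints d13
```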
So now we can change the read policy for the mirror to "first":
bash-3.00# metaparam -r first d10
bash-3.00# metaparam d10
d10: Mirror current parameters are:
Pass: 1
Read option: first (-r)
Write option: parallel (default)
This configuration now improves the read performance for the mirror a lot (see the iostat output below).
There is a small performance impact for writing to the mirror (about 19% more elapsed time in the dd test below), but that should not be a problem.
Examples:
Writing to a mirror with two submirrors on disk
bash-3.00# time dd if=/dev/zero of=/mnt/testfile bs=512
write: No space left on device
429777+0 records in
429777+0 records out

real 0m28.010s
user 0m0.249s
sys 0m23.197s
Writing to a mirror with two submirrors on disk and one submirror on a ramdisk:
bash-3.00# time dd if=/dev/zero of=/mnt/testfile bs=512
write: No space left on device
429777+0 records in
429777+0 records out

real 0m33.227s
user 0m0.247s
sys 0m26.602s
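To put a number on the write penalty, the two elapsed ("real") times above can be compared. This is just the arithmetic, sketched in shell with awk; slowdown_percent is a hypothetical helper name, and a single dd run per configuration is of course not a careful benchmark.

```shell
#!/bin/sh
# Sketch: relative slowdown in percent between two elapsed times (seconds).
# slowdown_percent is a hypothetical helper name.
slowdown_percent() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# 28.010 s with two submirrors, 33.227 s with the additional ramdisk submirror.
slowdown_percent 28.010 33.227    # prints 18.6
```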
One open issue for this configuration is the behaviour of the mirror after a reboot:
After the machine is rebooted, the ramdisk does not exist and therefore the submirror on the ramdisk also does not exist and the mirror needs maintenance:
# bash
bash-3.00# metastat
d10: Mirror
Submirror 0: d13
State: Needs maintenance
Submirror 1: d11
State: Okay
Submirror 2: d12
State: Okay
Pass: 1
Read option: first (-r)
Write option: parallel (default)
Size: 460800 blocks (225 MB)
d13: Submirror of d10
State: Needs maintenance
Invoke: metareplace d10 c5t0d0s0 <new device>
Size: 460800 blocks (225 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c5t0d0s0 0 No Maintenance No
d11: Submirror of d10
State: Okay
Size: 465885 blocks (227 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c0d1s0 0 No Okay Yes
d12: Submirror of d10
State: Okay
Size: 465885 blocks (227 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c1d0s0 0 No Okay Yes
Device Relocation Information:
Device Reloc Device ID
c1d0 Yes id1,cmdk@AQEMU_HARDDISK=QM00003
c5t0d0 No -
c0d1 Yes id1,cmdk@AQEMU_HARDDISK=QM00002
This is not a big problem because the data still exists on the existing submirrors and the mirror also works without the 3rd submirror - but without the read performance improvement, of course.
To repair the mirror recreate the ramdisk
bash-3.00# ramdiskadm -a testdisk 230m
and reattach the submirror on the ramdisk
bash-3.00# metareplace -e d10 c5t0d0s0
d10: device c5t0d0s0 is enabled
bash-3.00# metastat d10
d10: Mirror
Submirror 0: d13
State: Okay
Submirror 1: d11
State: Okay
Submirror 2: d12
State: Okay
Pass: 1
Read option: first (-r)
Write option: parallel (default)
Size: 460800 blocks (225 MB)
d13: Submirror of d10
State: Okay
Size: 460800 blocks (225 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c5t0d0s0 0 No Okay No
d11: Submirror of d10
State: Okay
Size: 465885 blocks (227 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c0d1s0 0 No Okay Yes
d12: Submirror of d10
State: Okay
Size: 465885 blocks (227 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c1d0s0 0 No Okay Yes
Device Relocation Information:
Device Reloc Device ID
c5t0d0 No -
c0d1 Yes id1,cmdk@AQEMU_HARDDISK=QM00002
c1d0 Yes id1,cmdk@AQEMU_HARDDISK=QM00003
That's it - now the 3rd submirror on the ramdisk is again in place and after the synchronisation is done we have a mirror with faster read performance again.
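The two repair steps lend themselves to a small script. The following sketch only prints the commands used above instead of running them, so it can be reviewed first; the function name print_repair_commands and its argument order are made up for this illustration, and running such a script automatically at boot time is an idea that was not tested for this article.

```shell
#!/bin/sh
# Sketch: print the commands that repair the mirror after a reboot.
# The commands are only echoed; pipe the output to sh to execute them.
# print_repair_commands and its argument order are hypothetical.
print_repair_commands() {
    ramdisk=$1   # ramdisk name, e.g. testdisk
    size=$2      # ramdisk size, e.g. 230m
    mirror=$3    # mirror metadevice, e.g. d10
    slice=$4     # symbolic link name of the ramdisk, e.g. c5t0d0s0
    echo "ramdiskadm -a $ramdisk $size"
    echo "metareplace -e $mirror $slice"
}

print_repair_commands testdisk 230m d10 c5t0d0s0
```

On a Solaris system the output could be executed as root with "print_repair_commands testdisk 230m d10 c5t0d0s0 | sh".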
bash-3.00# metastat -p d10
d10 -m d11 d12 d13 1
d11 1 1 c1d0s0
d12 1 1 c0d1s0
d13 1 1 c5t0d0s0
d10 is the mirror, d11 and d12 are the submirrors on two virtual harddisks, and d13 is the submirror on the ramdisk.
Reading from the mirror with two submirrors on disk:
bash-3.00# dd if=/dev/md/rdsk/d10 of=/dev/null bs=512
bash-3.00# iostat -xn 1 5555
                    extended device statistics
     r/s     w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
     0.0     0.0     0.0     0.0  0.0  0.0    0.0    0.0   0   0 c0d0
  1905.7     0.0   952.8     0.0  0.0  0.4    0.0    0.2   0  41 c0d1
  1904.7     0.0   952.3     0.0  0.0  0.4    0.0    0.2   0  42 c1d0
  3810.3     0.0  1905.2     0.0  0.0  0.8    0.0    0.2   0  85 md/d10
  1905.7     0.0   952.8     0.0  0.0  0.4    0.0    0.2   0  42 md/d11
  1904.7     0.0   952.3     0.0  0.0  0.4    0.0    0.2   0  42 md/d12
Reading from the mirror with two submirrors on disk and one submirror on the ramdisk, read policy roundrobin:
bash-3.00# dd if=/dev/md/rdsk/d10 of=/dev/null bs=512
bash-3.00# iostat -xn 1 5555
                    extended device statistics
     r/s     w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
     0.0     0.0     0.0     0.0  0.0  0.0    0.0    0.0   0   0 c0d0
  1643.0     0.0   821.5     0.0  0.0  0.4    0.0    0.2   0  36 c0d1
  1644.0     0.0   822.0     0.0  0.0  0.4    0.0    0.2   0  36 c1d0
  4930.1     0.0  2465.1     0.0  0.0  0.8    0.0    0.2   0  82 md/d10
  1644.0     0.0   822.0     0.0  0.0  0.4    0.0    0.2   0  37 md/d11
  1643.0     0.0   821.5     0.0  0.0  0.4    0.0    0.2   0  37 md/d12
  1643.0     0.0   821.5     0.0  0.0  0.1    0.0    0.0   0   8 md/d13
  1643.0     0.0   821.5     0.0  0.0  0.1    0.0    0.0   0   6 ramdisk1
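With the round-robin policy every submirror serves about one third of the reads, so the slow disks still limit the throughput. The share per submirror can be computed from the md/ rows of the iostat output. This is a minimal sketch, assuming a single iostat interval on stdin with r/s in the first column and the device name in the last; read_share is a hypothetical helper name, and on Solaris 10 nawk may be needed instead of awk.

```shell
#!/bin/sh
# Sketch: read the rows of one "iostat -xn" interval on stdin and print,
# for each md/ device except the mirror itself, its share of the reads
# per second in percent.
# read_share is a hypothetical helper; its argument is the mirror name.
read_share() {
    awk -v mirror="md/$1" '
        $NF ~ /^md\// && $NF != mirror {
            r[$NF] = $1          # reads per second of this submirror
            total += $1
            order[++n] = $NF     # remember the input order
        }
        END {
            if (total == 0) exit 1
            for (i = 1; i <= n; i++)
                printf "%s %.1f\n", order[i], r[order[i]] / total * 100
        }
    '
}

# Usage with rows saved in a file (hypothetical file name):
#   read_share d10 < iostat_rows.txt
```

For the rows above this prints 33.3 for each of md/d11, md/d12 and md/d13.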
Reading from the mirror with two submirrors on disk and one submirror on the ramdisk, read policy first:
bash-3.00# dd if=/dev/md/rdsk/d10 of=/dev/null bs=512
bash-3.00# iostat -xn 1 5555
                    extended device statistics
     r/s     w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
     0.0     0.0     0.0     0.0  0.0  0.0    0.0    0.0   0   0 c0d0
     0.0     0.0     0.0     0.0  0.0  0.0    0.0    0.0   0   0 c0d1
     0.0     0.0     0.0     0.0  0.0  0.0    0.0    0.0   0   0 c1d0
 16022.1     0.0  8011.0     0.0  0.0  0.6    0.0    0.0   1  62 md/d10
     0.0     0.0     0.0     0.0  0.0  0.0    0.0    0.0   0   0 md/d11
     0.0     0.0     0.0     0.0  0.0  0.0    0.0    0.0   0   0 md/d12
 16022.1     0.0  8011.0     0.0  0.0  0.6    0.0    0.0   0  58 md/d13
 16022.1     0.0  8011.0     0.0  0.0  0.4    0.0    0.0   0  41 ramdisk1
noname:/# metastat -p d80
d80 -m d81 d82 d83 1
d81 1 1 c3t2d0s0
d82 1 1 c3t3d0s0
d83 1 1 c7t0d0s0
d80 is the mirror, d81 and d82 are the submirrors on two harddisks, and d83 is the submirror on the ramdisk
Reading from the mirror with two submirrors on disk:
noname:/var# time dd if=/dev/md/rdsk/d80 of=/dev/null bs=512
522000+0 records in
522000+0 records out

real 1m13.890s
user 0m1.084s
sys 0m13.530s

                    extended device statistics
     r/s     w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
  7170.6     0.0  3585.3     0.0  0.0  0.9    0.0    0.1   1  90 md/d80
  3585.8     0.0  1792.9     0.0  0.0  0.4    0.0    0.1   0  43 md/d81
  3584.8     0.0  1792.4     0.0  0.0  0.4    0.0    0.1   0  43 md/d82
  3585.8     0.0  1792.9     0.0  0.0  0.4    0.0    0.1   3  36 c3t2d0
  3584.8     0.0  1792.4     0.0  0.0  0.4    0.0    0.1   3  36 c3t3d0
Reading from the mirror with two submirrors on disk and one submirror on the ramdisk, read policy roundrobin:
noname:/var# time dd if=/dev/md/rdsk/d80 of=/dev/null bs=512
522000+0 records in
522000+0 records out

real 0m54.764s
user 0m1.098s
sys 0m13.353s

                    extended device statistics
     r/s     w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
  9765.7     0.0  4882.8     0.0  0.0  0.9    0.0    0.1   1  88 md/d80
  3254.9     0.0  1627.4     0.0  0.0  0.4    0.0    0.1   0  40 md/d81
  3255.9     0.0  1627.9     0.0  0.0  0.4    0.0    0.1   0  40 md/d82
  3255.9     0.0  1627.9     0.0  0.0  0.0    0.0    0.0   0   4 md/d83
  3255.9     0.0  1627.9     0.0  0.0  0.0    0.0    0.0   0   2 ramdisk1
  3254.9     0.0  1627.5     0.0  0.0  0.3    0.0    0.1   2  34 c3t2d0
  3255.9     0.0  1628.0     0.0  0.0  0.3    0.0    0.1   2  34 c3t3d0
Reading from the mirror with two submirrors on disk and one submirror on the ramdisk, read policy first:
noname:/var# time dd if=/dev/md/rdsk/d80 of=/dev/null bs=512
522000+0 records in
522000+0 records out

real 0m9.799s
user 0m1.004s
sys 0m8.773s

                    extended device statistics
     r/s     w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
 54466.5   149.3 36656.3  9497.7  0.0  1.6    0.0    0.0   3 100 md/d80
 54467.5     0.0 36656.8     0.0  0.0  0.5    0.0    0.0   2  47 md/d83
 54467.5     0.0 36656.8     0.0  0.0  0.3    0.0    0.0   0  24 ramdisk1
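The three "real" times of the dd runs on this machine can be turned into speedup factors. The following sketch assumes the time format printed by the bash time keyword (minutes, "m", seconds, "s"); the helper names to_seconds and speedup are made up for this illustration.

```shell
#!/bin/sh
# Sketch: convert a "real" time as printed by the bash time keyword
# (for example 1m13.890s) to seconds.
# to_seconds and speedup are hypothetical helper names.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Speedup factor: first time divided by second time.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

speedup 1m13.890s 0m54.764s   # two disks vs. round robin with ramdisk: prints 1.3
speedup 1m13.890s 0m9.799s    # two disks vs. read policy first: prints 7.5
```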
On the one hand, using a ramdisk as 3rd submirror for an SVM mirror significantly improves read performance: in the second test above, the dd read time dropped from 1m13.890s to 0m9.799s. On the other hand, it is somewhat complicated to configure, the SVM mirror must be repaired after each reboot, and the ramdisk needs a lot of memory.
So it's possible to use this approach but it's only useful if an application really needs it and it's not possible to get the necessary read performance with other methods (like tuning the cache, for example).