Building servers - Ed's journal
sobrique
Building servers
One thing I really rather enjoy is installing and configuring servers.

In that 'fluffy' time just before they enter production, they're like new children. Filled with potential, and with disks that may be freely repartitioned, restarted or just rebuilt entirely.

You can fiddle around with the disk slice layout to make an aesthetic picture, and install software on it without having to fear 'unintended consequences'. Compilers are a real pain for triggering cascades of '$Program depends on libgcc.so.1 version 3.0.1 32bit, and now won't work'.

One of the fun things to play with is disksuite. It's disk mirroring software that comes free with the OS. Often, contact with disksuite (and indeed, any disk management software) comes at the worst possible time: it's broke. Fix NOW.

So the opportunity to 'play' in times where the worst that will happen is an 'oops, guess I get to rebuild again' is a delight.

So for my reference, here's how to configure disksuite:


Loosely assuming that your two OS mirror disks are c0t0d0 and c0t1d0. If they're not, amend appropriately :)

Ideally, before installing the OS, you will prepare by creating a 50MB 'metadb' slice. If you don't, then you'll either have to re-partition later, or 'append' the metadbs onto existing slices.

Step 0: Install OS.

Step 1: Install disksuite. (duh). 4.2.1 may be downloaded from Sun; searching on www.sun.com should do the job.
Download, unpack, and then install with pkgadd. Really important (tm) that you reboot straight after, so don't do this if you're not going to be able to.

# for i in SUNWmdr SUNWmdu SUNWlvmr SUNWlvma SUNWlvmg SUNWmdg SUNWmdx
> do
>   pkgadd -d Packages $i
> done
# shutdown -y -g 0 -i 6


Step 2: Once rebooted, partition your disks. If you're mirroring your root disk, then by _far_ the easiest way of doing this (assuming, that you have two identical disks) is using prtvtoc and fmthard. RTFM before doing this, because it's direct partition table messing.

# prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2


This will copy the partition table from c0t0d0 to c0t1d0.
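
If you want a sanity check after the copy, something like the sketch below could compare the two tables. It assumes bash or ksh (for the $(...) syntax) and that prtvtoc prefixes its header lines with '*'. vtoc_rows and same_vtoc are made-up helper names, not Solaris commands.

```shell
# Hypothetical helpers, not part of Solaris.

# Drop the '*' comment header that prtvtoc prints, keeping only slice rows.
vtoc_rows() {
    grep -v '^\*'
}

# Succeeds if both disks report identical slice rows.
# Arguments are raw s2 device paths, e.g. /dev/rdsk/c0t0d0s2.
same_vtoc() {
    src=$(prtvtoc "$1" | vtoc_rows) || return 2
    dst=$(prtvtoc "$2" | vtoc_rows) || return 2
    [ "$src" = "$dst" ]
}
```

Usage would be along the lines of: same_vtoc /dev/rdsk/c0t0d0s2 /dev/rdsk/c0t1d0s2 && echo "tables match"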

Step 3: Create metadb replicas. The metadb is the database which keeps track of the disk layout. So you need several, because the loss of this db means your system is fucked. As a 'protective' measure, Solaris won't boot unless your metadbs are 'quorate', which means that 'more than half' match up.
The smart arses out there will quickly realise that this means that a two disk system won't be quorate if a disk fails, and they'd be right. I'll deal with that in a minute.

So,
# metadb -a -f -c 2 c0t0d0s5
(assuming that s5 is your 'db' partition as above). -a for add, -f for force (because there aren't any yet) and -c to say 'create two copies'. You don't _need_ two, but it's strongly recommended if you have fewer than 4 disks in your system. (both for quoracy, and because these quite literally are the crown jewels that'll destroy your system if they're corrupt)
# metadb -a -c 2 c0t1d0s5
(slice 5 on the other disk)

List 'em with 'metadb' and you should see 4 entries.
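
The 'more than half' rule is easy to get wrong in your head, so here's a minimal sketch of the arithmetic. quorate is a made-up function name, and the rule it encodes is just the one described above, not something lifted from the disksuite source.

```shell
# quorate TOTAL AVAILABLE
# Succeeds when more than half of the replicas are available.
quorate() {
    total=$1
    avail=$2
    [ $((avail * 2)) -gt "$total" ]
}

# Two disks with two replicas each, one disk lost: 2 of 4 left.
if quorate 4 2; then echo "quorate"; else echo "not quorate"; fi
# Four disks with one replica each, one disk lost: 3 of 4 left.
if quorate 4 3; then echo "quorate"; else echo "not quorate"; fi
```

The first case is the two disk problem: exactly half is not more than half.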

Step 4: Decide on how you're going to name your disks.
I like to use:
d10y for 'volume y'
dxy for 'submirror x of volume y'

I don't tend to use disksuite logging, because IMO UFS logging is better. (again, I'll get to that later)

Step 5: Create the volumes.
The procedure is slightly different for slices that can't be unmounted, such as '/' and '/usr'. This will work just fine for any disk geometry, but because the volumes have to resync it's nowhere near as quick as just sticking the volume together and then running newfs.

# metainit -f d00 1 1 c0t0d0s0 (make a submirror from the root slice)

# metainit d100 -m d00 (create mirror d100 with d00 as its first submirror)

# metaroot d100 (update vfstab and /etc/system to boot from d100)

# lockfs -fa (flush io buffers)
# reboot (check that it boots from the right device etc.)

# metainit d10 1 1 c0t1d0s0 (make a submirror from disk 1)
# metattach d100 d10 (attach the submirror to d100; you'll need to wait for it to resync)
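
If you'd rather not keep re-running metastat by hand while it resyncs, a polling loop along these lines might do. It assumes metastat prints a line containing 'Resync in progress' while the mirror is syncing; resync_running and wait_for_resync are made-up names.

```shell
# Hypothetical helper: succeeds while the metastat text on stdin
# still reports a resync in progress.
resync_running() {
    grep 'Resync in progress' >/dev/null
}

# Poll a mirror until its resync has finished.
# $1 is the mirror name (e.g. d100), $2 the poll interval in seconds.
wait_for_resync() {
    while metastat "$1" | resync_running; do
        sleep "${2:-60}"
    done
}
```

Something like 'wait_for_resync d100 30 && echo done' would then sit there until the mirror is back in sync.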

Repeat for /var, /opt, /usr and swap (if separate slices).
Modify /etc/vfstab to use the 'md' devices ('/dev/md/dsk/..') rather than '/dev/dsk/..'.
Add the 'logging' option (that's the UFS logging) to the UFS filesystems in vfstab (last column).
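
To save typing when repeating this per slice, a dry-run generator like the sketch below could print the commands for you to eyeball first. It runs nothing. mirror_cmds is a made-up name, the device names follow the ones used for root above (d100, d00, d10), and it assumes the slice number doubles as the volume number, which is my convention here rather than anything disksuite requires.

```shell
# Hypothetical dry-run helper: prints commands, runs nothing.
# $1 is the slice number; the disks are the c0t0d0/c0t1d0 pair assumed above.
mirror_cmds() {
    n=$1
    echo "metainit -f d0$n 1 1 c0t0d0s$n"   # submirror on the first disk
    echo "metainit d10$n -m d0$n"           # one-way mirror on top of it
    echo "metainit d1$n 1 1 c0t1d0s$n"      # submirror on the second disk
    echo "metattach d10$n d1$n"             # attach it and let it resync
}

mirror_cmds 0
```

For slice 0 this prints the same four metainit/metattach lines as the root walkthrough; the metaroot, vfstab and reboot steps still have to be done by hand in between.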

Starting with DiskSuite 4.2.1, an optional /etc/system parameter exists which allows DiskSuite to boot with just 50% of the state database replicas online. For example, if one of the two boot disks were to fail, just two of the four state database replicas would be available. Without this /etc/system parameter (or with older versions of DiskSuite), the system would complain of "insufficient state database replicas", and manual intervention would be required on bootup. To enable the "50% boot" behaviour with DiskSuite 4.2.1, execute the following command:

# echo "set md:mirrored_root_flag=1" >> /etc/system

This is only necessary if you've only got 2 disks.
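
Running that echo twice leaves the line in /etc/system twice. A slightly more careful version might look like the sketch below. add_line_once is a made-up helper, and it assumes a POSIX grep (on Solaris that probably means /usr/xpg4/bin/grep rather than /usr/bin/grep).

```shell
# Hypothetical helper: append LINE to FILE unless an identical line exists.
add_line_once() {
    line=$1
    file=$2
    if grep -Fx -e "$line" "$file" >/dev/null 2>&1; then
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
}

# Example against a scratch copy rather than the live /etc/system:
# add_line_once "set md:mirrored_root_flag=1" /tmp/system.test
```

Trying it on a copy first is cheap insurance, since a broken /etc/system is another good way to end up with a box that won't boot.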

Set eeprom devaliases (this is typically easier to do on a cold system, but if you're like me, you can't be arsed to go sit at a terminal in the computer room). This is necessary to ensure your system comes up if one of the disks has had a "ping-fuckit" moment. If it's the disk referenced by the 'disk' alias, then the system won't boot. So you specify to boot off disk0 first, disk1 second, and then fail.

"eeprom nvramrc" to print ('data not available' typically means 'none set')

Run 'format' and note down the 'device file' for the disk volume you want to boot from
On the server I tested this on, I got
AVAILABLE DISK SELECTIONS:
0. c0t0d0 
/pci@1f,4000/scsi@3/sd@0,0
1. c0t1d0 
/pci@1f,4000/scsi@3/sd@1,0
Specify disk (enter its number): ^C^D
bash-2.03# eeprom "nvramrc=devalias disk0 /pci@1f,4000/scsi@3/disk@0,0
> devalias disk1 /pci@1f,4000/scsi@3/disk@1,0"
bash-2.03# eeprom use-nvramrc?
use-nvramrc?=false
bash-2.03# eeprom use-nvramrc?=true
bash-2.03# eeprom boot-device
boot-device=disk net
bash-2.03# reboot -- disk1


Note A: Change 'sd' at the end of the path to 'disk', so the OS gets to decide.
Note B: DON'T CHANGE THE BOOT DEVICE UNTIL YOU'VE TESTED YOUR DEVALIASES.

Then you can use "eeprom 'boot-device=disk0 disk1'" to permanently change your boot sequence.
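
The 'sd' to 'disk' rewrite from Note A is just a substitution on the last path component, so it can be sketched with sed. to_alias_path is a made-up name, and the path is the one from the test server above.

```shell
# Hypothetical helper: rewrite the trailing 'sd@...' node of a device
# path from 'format' into the 'disk@...' form used in the devalias.
to_alias_path() {
    echo "$1" | sed 's|/sd@\([^/]*\)$|/disk@\1|'
}

to_alias_path /pci@1f,4000/scsi@3/sd@0,0
# prints /pci@1f,4000/scsi@3/disk@0,0
```

Only the final component is touched, so a path that doesn't end in sd@something comes back unchanged.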

Sunsolve reference to devaliases on a hot system

Comments
zaitan From: zaitan Date: July 29th, 2004 06:38 am (UTC) (Link)
Thank you for this, I will be using it next week sometime when we will be building our nice new server. The root filesystem will be on two of the four internal disks. I thought the OS came with DiskSuite, or does it usually come with an older version?

After 3.5 days of my admin course, I understand more than 50% of the commands you referred to :)
sobrique From: sobrique Date: July 29th, 2004 06:39 am (UTC) (Link)
Depends which install level of the OS you use, and which flavour of slowlaris.
zaitan From: zaitan Date: July 29th, 2004 07:46 am (UTC) (Link)
We will probably be doing a custom build, based on something fairly cut down with all the stuff we will probably need but without all the rubbish that we don't.
sobrique From: sobrique Date: July 29th, 2004 07:49 am (UTC) (Link)
I used to do that. And then I got pissed off with installing stuff afterwards ("whadda ya mean I need SUNWhea as a pre-requisite") and so now do a 'full' and then clobber stuff I _don't_ want.

Lazy I know, but ... ;p