Mar 10, 2010

Here is all the information you will need to create a RAID-5 under Linux and put an ext4 partition on it, with or without LVM. I will also include all the commands to create a Volume Group (LVM).

More information: I should first mention that I use an RHEL 5.4-like distribution. RHEL 5.4 means “Red Hat Enterprise Linux 5.4”; it can also be “Scientific Linux 5.4” or “CentOS 5.4”, since both of those distributions are essentially identical to RHEL, only recompiled from its freely available source. RHEL is not free, since you need to pay for a license and/or support, while the other two are free. Under the GPL, Red Hat has to make its source available. This procedure was also successfully tested on RHEL 5.5.

Plain and simple: you will first need to select a computer and add several disks of your choice. I used 1.5 TB SATA drives with a 3 Gb/s (300 MB/s) interface. I installed them in a Dell T610 server, which has hot-swappable disk brackets on the front.

Anecdote: When I ordered the server I selected just one 500 GB disk, because additional terabyte-class hard disks were almost $2000 each at Dell (totally crazy), while we could find the same disks on the market at about $140 each. My disks are Seagate, with these specs: SATA 3 Gb/s, 1.5 TB, 7200 rpm, 32 MB cache. At first I was missing the disk trays (Dell p/n F238F), since Dell only ships an empty filler, which is useless. I asked them to sell me the disk trays… at $55 each, which is quite expensive for simple trays, so I finally found them at ServerNexus.com for $23 each. (I paid $138 for 6 trays instead of $330 at Dell’s.)

Once the disks are installed, you should be able to see them with “parted -l”. The GNU partition manager parted should be used over fdisk or sfdisk, since those last two do not support GPT partitions.
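One optional precaution, which is my own suggestion and not part of the original setup: if any of the disks were ever members of another md array, you can wipe their old RAID superblocks before building the new array so mdadm is not confused by stale metadata. Only do this on disks whose contents you intend to destroy:

mdadm --zero-superblock /dev/sd*[b-g]   # clears any leftover md metadata on sdb through sdg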

This is how one can do software RAID-5 under Linux and use LVM to create several partitions, formatting them with the ext4 filesystem. The commands are:

1 ) To create an MD device md0 as RAID-5 using 6 disks, /dev/sdb to /dev/sdg, use:

mdadm --create /dev/md0 --level=5 --raid-devices=6 --spare-devices=0 --force /dev/sd*[b-g]
cat /proc/mdstat

This last command will reveal that the RAID-5 is in the process of constructing its parity across all disks. Even now, before the disks are formatted with any filesystem, MD is already at work creating the parity that will later provide the RAID-5 protection we need. This is the command output, and if you want to watch it grow, use “watch cat /proc/mdstat”.

Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdb[0] sdg[5] sdf[4] sde[3] sdd[2] sdc[1]
7325692480 blocks level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]
[=======>.............]  resync = 38.0% (557434240/1465138496) finish=136.7min speed=110616K/sec

In this output you can see which disks are used (sdb sdc sdd sde sdf sdg) and the progress, at 38%, of the resync of the RAID-5. Pay attention to [UUUUUU]: each U represents one disk that is Up. If instead you see something like [UU_UUU], it signifies the 3rd disk is down, dead or unplugged.
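If the initial resync is too slow, or conversely is eating all your disk bandwidth, the kernel exposes md rebuild speed limits that you can tune while it runs. This is just a sketch; the values below are arbitrary examples, not the ones I used:

cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max   # current limits, in KB/s per device
echo 50000  > /proc/sys/dev/raid/speed_limit_min    # raise the floor to finish the resync sooner
echo 100000 > /proc/sys/dev/raid/speed_limit_max    # or lower the ceiling to keep the server responsive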

At this point, if you don’t need LVM, you can directly create an ext4 filesystem on this meta device, md0. The command would be:

  • mkfs.ext4 -L test /dev/md0

If you don’t have mkfs.ext4, just install mdadm and e4fsprogs, usually with:

  • yum install mdadm e4fsprogs
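As a side note, ext4 can be told about the RAID geometry so it aligns its allocations with the stripes. This is only a sketch based on the array above (64 KB chunk, 4 KB filesystem blocks, 5 data disks out of 6): stride = 64 KB / 4 KB = 16, and stripe-width = 16 x 5 = 80. Adjust the numbers if your chunk size or disk count differs:

  • mkfs.ext4 -L test -E stride=16,stripe-width=80 /dev/md0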

2 ) After you have created a Meta Device (MD) and it is working, you will need to create a configuration file called /etc/mdadm.conf. This is how to do it:

echo 'DEVICE /dev/hd[a-z] /dev/sd*[a-z]' > /etc/mdadm.conf
mdadm --examine --scan --config=/etc/mdadm.conf >> /etc/mdadm.conf

It should look like this:

DEVICE /dev/sd*[b-g]
ARRAY /dev/md0 level=raid5 num-devices=6 UUID=be242eb0:3fe5ec86:4b698eb2:c9f1759e
# Add this last line to receive alerts from mdmonitor by email
MAILADDR you@YourCompany.com

Check if the md monitor service is working:

  • service mdmonitor status
  • service mdmonitor start
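To make sure the alert mail actually goes out (assuming the host can already send mail), mdadm can generate a test message for every array it finds; a quick one-shot check looks like this:

  • mdadm --monitor --scan --oneshot --test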

3 ) Creating a Logical Volume Group (LVM) will provide the ability to create several virtual partitions that can grow as needed. Even now, as we are ready to create a volume, you don’t need to wait for md to finish its parity calculation. This is the command to initialize a disk for use by LVM:

  • pvcreate /dev/md0
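You can confirm the physical volume was initialized correctly with pvdisplay (the pvscan output shown in step 9 gives the same information in a shorter form):

  • pvdisplay /dev/md0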

4 ) Create a volume group

  • vgcreate vg01 /dev/md0

5 ) Create a logical volume called test with a data size of 100 GB

  • lvcreate -L 100G  -n test vg01
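As a variation, if you would rather give a volume all the remaining free space in the group instead of a fixed 100 GB, lvcreate also accepts a percentage of the free extents; the volume name data below is just an example:

  • lvcreate -l 100%FREE -n data vg01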

6 ) Create an ext4 FS with the label test on the previously created logical volume

  • mkfs.ext4 -L test /dev/vg01/test

7 ) At a later time you may need to extend the size of your volume; this is how. Note that you also need to resize the ext4 filesystem with resize4fs before you can see the extra space with tools such as df:

  • lvextend -L +50G /dev/vg01/test
  • resize4fs /dev/vg01/test

8 ) To delete the logical volume test we just created:

  • umount /dev/vg01/test
  • lvremove /dev/vg01/test

Do you really want to remove active logical volume test? [y/n]: y
Logical volume “test” successfully removed

9 ) Other useful commands are vgdisplay, pvscan, vgscan and lvscan:

  • vgdisplay

--- Volume group ---
VG Name               vg01
System ID
Format                lvm2
Metadata Areas        1
Metadata Sequence No  8
VG Access             read/write
VG Status             resizable
MAX LV                0
Cur LV                1
Open LV               1
Max PV                0
Cur PV                1
Act PV                1
VG Size               6.82 TB
PE Size               4.00 MB
Total PE              1788499
Alloc PE / Size       25600 / 100.00 GB
Free  PE / Size       1762899 / 6.72 TB
VG UUID               8uQLL7-VvTg-ekll-70yy-JdTK-lOsp-E4n5uf

  • pvscan

PV /dev/md0   VG vg01   lvm2 [6.82 TB / 6.72 TB free]
Total: 1 [842.32 GB] / in use: 1 [842.32 GB] / in no VG: 0 [0   ]

  • vgscan

Reading all physical volumes.  This may take a while…
Found volume group "vg01" using metadata type lvm2

  • lvscan

ACTIVE            ‘/dev/vg01/test’ [100.00 GB] inherit


10 ) And voilà! Mount it and use it.

  • mkdir -p /share/test
  • mount -L test /share/test
  • df -h /share/test

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg01-test 99G  1.2G   93G   2% /share/test

You may also want to add this line to /etc/fstab in order to automatically mount the new volume on the directory /share/test at Linux boot time…

LABEL=test              /share/test     ext4    defaults        0  0
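A quick way to make sure the fstab entry is good without rebooting is to unmount the volume and let mount -a pick it up again:

umount /share/test     # unmount the volume we mounted by hand above
mount -a               # mounts everything in /etc/fstab; an error here means the new line is wrong
df -h /share/test      # confirm the volume came back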

11 ) Afterward, I wanted to be safer, so I decided to add a spare device to the MD RAID-5 array…

mdadm --manage /dev/md0 --add /dev/sdh
mdadm -Q --detail /dev/md0

/dev/md0:
Version : 0.90
Creation Time : Wed Mar 10 17:40:12 2010
Raid Level : raid5
Array Size : 7325692480 (6986.32 GiB 7501.51 GB)
Used Dev Size : 1465138496 (1397.26 GiB 1500.30 GB)
Raid Devices : 6
Total Devices : 7
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Mon Mar 15 16:50:42 2010
State : clean
Active Devices : 6
Working Devices : 7
Failed Devices : 0
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 64K

UUID : be242eb0:3fe5ec86:4b698eb2:c9f1759e
Events : 0.10

Number   Major   Minor   RaidDevice State
0       8       16        0      active sync   /dev/sdb
1       8       32        1      active sync   /dev/sdc
2       8       48        2      active sync   /dev/sdd
3       8       64        3      active sync   /dev/sde
4       8       80        4      active sync   /dev/sdf
5       8       96        5      active sync   /dev/sdg

6       8      112        -      spare   /dev/sdh
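If you want to see the spare actually take over, you can deliberately fail one member; only try this on a test array, since it triggers a full multi-hour rebuild. A sketch, using sdc as the victim:

mdadm --manage /dev/md0 --fail /dev/sdc      # mark sdc as faulty; sdh should start rebuilding immediately
cat /proc/mdstat                             # watch the recovery progress
mdadm --manage /dev/md0 --remove /dev/sdc    # remove the failed member from the array
mdadm --manage /dev/md0 --add /dev/sdc       # once the rebuild is done, add it back; it becomes the new spare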

12 ) And then, how do we delete the test volume we just created…

% lvscan
ACTIVE ‘/dev/vg01/devel’ [1.00 TB] inherit
ACTIVE ‘/dev/vg01/test’ [1.00 TB] inherit
ACTIVE ‘/dev/vg01/locals’ [1.00 TB] inherit
ACTIVE ‘/dev/vg01/web’ [10.00 GB] inherit

% lvremove /dev/vg01/test
Do you really want to remove active logical volume test? [y/n]: y
Logical volume “test” successfully removed

Special note: If you don’t specifically need to use Linux but want a file server, you may want to consider Solaris or any OS that supports ZFS (such as FreeBSD or Illumian), simply because of the ability to use ZFS. In my opinion ZFS is a better filesystem than any other made so far. With Solaris and ZFS, all of this text could be replaced by:

  • zpool create pool raidz sdb sdc sdd sde sdf sdg spare sdh
  • zfs create pool/test
  • zfs set quota=100g pool/test
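And to mirror the mount/fstab step from the Linux side: ZFS manages its own mount points, so the equivalent of the /share/test mount above would simply be (a sketch, using the same hypothetical pool/test dataset):

  • zfs set mountpoint=/share/test pool/test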

And that is another story… :) but frankly, ZFS on Linux would be super cool; however, Sun/Oracle has made no plans to do this. Oracle was working on Btrfs (a ZFS-like filesystem for Linux) before they bought Sun Microsystems, but now they own both Btrfs and ZFS, so who knows what will happen… In the meantime Btrfs is going strong on Linux and testing releases are available. A final release is due soon.

Hope this helps someone; please share or comment below…

Réjean.

3 Responses to “Creating Raid5 under Linux RHEL5.x using md, lvm and ext4 filesystem.”

  1. Rejean says:

    And note that performance seems very nice… between 101 MB/s and 478 MB/s on write… I even saw 565 MB/s.

    % dd if=/dev/zero of=ttttt bs=1024 count=102400
    102400+0 records in
    102400+0 records out
    104857600 bytes (105 MB) copied, 0.219311 seconds, 478 MB/s

    % dd if=/dev/zero of=ttttt bs=1024 count=1024000
    1024000+0 records in
    1024000+0 records out
    1048576000 bytes (1.0 GB) copied, 10.3743 seconds, 101 MB/s
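    For what it’s worth, short dd runs like these mostly measure the Linux page cache rather than the disks; adding conv=fdatasync forces the data to be flushed before dd reports, which gives a more conservative number:

    % dd if=/dev/zero of=ttttt bs=1024 count=1024000 conv=fdatasync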

  2. sys01admin says:

    FYI, when using the long form to specify parameters you’ll need the double dash: mdadm --create, not mdadm -create. Maybe a WordPress thing. Thanks for the write-up, very informative.

  3. Rejean says:

    Yes, that is right, it’s --create… And yes, it was an unseen WordPress problem. I have replaced the UL and LI list items with a pre (pre-formatted) list of commands… so now the double dash appears OK. Thanks a lot.
