
There are two kinds of RAID (Redundant Array of Inexpensive Disks) available under Linux: hardware RAID, in which a dedicated controller presents an array of disks as a single device, and software RAID, in which the Linux kernel conspires with the mdadm tool to present an array of devices as a single device. In addition to combining multiple devices, RAID offers various modes that use the component hardware for improved throughput (striping) and for improved data reliability through redundancy (parity and mirroring). Hardware and software RAID may also be combined to get the advantages of both.
The RAID levels are explained in http://inferno.slug.org/HOWTO/Software-RAID-HOWTO-1.html#ss1.4
The 3ware 7506-8 controller is a bargain at about $350 (July 2005). We have a total of 20 of them in 10 rackmount RAID boxes.
Hit Alt-3 during boot to access the 3ware BIOS configuration utility.
~~~ under construction ~~~
The 3ware RAID hardware contains a BIOS extension that permits configuring the hardware array independently of any software installed on the machine. This is a very nice feature!
The 7XXX series 3ware controllers have a hard limit of 2 TB on the size of a single array. This can be overcome by configuring multiple arrays on the controller, with some loss of total capacity because each additional RAID 5 array gives up one more drive's worth of space to parity.
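The capacity cost of splitting one array into several can be estimated with a little arithmetic. The sketch below assumes RAID 5 (one drive's worth of parity per array) and uses hypothetical 400 GB drives, chosen only because eight of them in a single array would exceed the 2 TB limit; the function name raid5_usable is made up for this example.

```shell
#!/bin/sh
# Usable capacity of one RAID 5 array: one drive's worth of space goes to parity.
# Arguments: number of drives, size of each drive (result is in the same unit).
raid5_usable() {
    drives=$1
    size=$2
    echo $(( (drives - 1) * size ))
}

# Hypothetical example: eight 400 GB drives on one controller.
one_array=$(raid5_usable 8 400)               # single array, too big for a 7XXX
two_arrays=$(( 2 * $(raid5_usable 4 400) ))   # two four-drive arrays instead

echo "one 8-drive array:  ${one_array} GB usable"
echo "two 4-drive arrays: ${two_arrays} GB usable"
```

With these assumed numbers the split costs one extra drive (400 GB) of usable space, but each resulting array stays under the limit.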
Note that the last firmware upgrade for the 7XXX series controllers that is offered on the 3ware downloads page is version 7.1.1, dated July 2004, which contains these actual firmware versions:
Monitor Version: ME7X 1.01.00.040
Firmware Version: FE7X 1.05.00.068
BIOS Version: BE7X 1.08.00.048
The 9500 series controllers can configure 4 TB arrays.
The command-line and web-based admin utilities that 3ware provides for use with their 9000 series controllers will work with the 7XXX and 8XXX series controllers as well. The older utilities for the 7XXX and 8XXX series will not work under 2.6.XX series Linux kernels. Download the development versions of the tools from http://www.3ware.com/support/download.asp
~~~ snip ~~~
Here is how the hardware appears to the kernel in dmesg (we have two of these controllers in a 4U rack with 16 hot-swap drive bays):
3ware Storage Controller device driver for Linux v1.26.02.001.
ACPI: PCI Interrupt 0000:02:01.0[A] -> GSI 48 (level, low) -> IRQ 177
scsi0 : 3ware Storage Controller
3w-xxxx: scsi0: Found a 3ware Storage Controller at 0x3000, IRQ: 177.
Vendor: 3ware Model: Logical Disk 0 Rev: 1.2
Type: Direct-Access ANSI SCSI revision: 00
SCSI device sda: 2241197056 512-byte hdwr sectors (1147493 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 2241197056 512-byte hdwr sectors (1147493 MB)
SCSI device sda: drive cache: write back
sda: sda1
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
ACPI: PCI Interrupt 0000:06:01.0[A] -> GSI 72 (level, low) -> IRQ 185
scsi1 : 3ware Storage Controller
3w-xxxx: scsi1: Found a 3ware Storage Controller at 0x5000, IRQ: 185.
Vendor: 3ware Model: Logical Disk 0 Rev: 1.2
Type: Direct-Access ANSI SCSI revision: 00
SCSI device sdb: 2241197056 512-byte hdwr sectors (1147493 MB)
SCSI device sdb: drive cache: write back
SCSI device sdb: 2241197056 512-byte hdwr sectors (1147493 MB)
SCSI device sdb: drive cache: write back
sdb: sdb1
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
What we are looking at here are two 8-channel Escalade RAID controllers with eight 160 GB drives attached to each, for a total of about 2.2 TB of disk storage after each hardware RAID 5 array gives up one drive's worth of space to parity.
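The size that the kernel prints for each logical disk can be cross-checked from the sector count in the dmesg output. This is a small sketch; the function name sectors_to_mb is made up for this example, and it assumes a shell with 64-bit integer arithmetic.

```shell
#!/bin/sh
# Convert a count of 512-byte sectors to decimal megabytes (10^6 bytes),
# rounded to the nearest megabyte.
sectors_to_mb() {
    echo $(( ($1 * 512 + 500000) / 1000000 ))
}

# Sector count reported for sda and sdb in the dmesg output.
sectors_to_mb 2241197056
```

The result agrees with the 1147493 MB that the kernel reports for each array.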
When the server is booted up, the two hardware configured arrays are available via the SCSI emulation layer as /dev/sda and /dev/sdb, with the partitions /dev/sda1 and /dev/sdb1 mountable.
The mkraid utility can be used to combine the devices /dev/sda1 and /dev/sdb1 into /dev/md0:
/etc/raidtab:
raiddev /dev/md0
raid-level 0
nr-raid-disks 2
persistent-superblock 1
chunk-size 4
device /dev/sda1
raid-disk 0
device /dev/sdb1
raid-disk 1
Then run mkraid /dev/md0 to initialize the array.
See /proc/mdstat after doing so for the gory details!
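For a quick summary rather than the full details, the array lines can be picked out of /proc/mdstat. This is a minimal sketch that assumes the usual line format "md0 : active raid0 sdb1[1] sda1[0]"; the function name md_arrays is made up for this example.

```shell
#!/bin/sh
# Print the name, state and RAID level of each md array listed in an
# mdstat-format file (defaults to /proc/mdstat when no file is given).
md_arrays() {
    awk '/^md[0-9]+ :/ { print $1, $3, $4 }' "${1:-/proc/mdstat}"
}
```

Calling md_arrays with no argument on the RAID box should print one line per array.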
Or, for instance when the mkraid utility is not on the system, use mdadm instead:
mdadm --create --verbose /dev/md0 \
--chunk=4 \
--level=0 \
--raid-devices=2 \
/dev/sda1 /dev/sdb1
mdadm --examine --scan >/etc/mdadm.conf
mkreiserfs /dev/md0
mkdir /raid
mount /dev/md0 /raid
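To see what the chunk size means in practice, the sketch below works out which component device holds a given offset in the array. It assumes equal-sized components and plain round-robin striping, with offsets and chunk size in KB; the function name stripe_device is made up for this example.

```shell
#!/bin/sh
# Which component device holds a given offset in a RAID 0 array,
# assuming equal-sized components and simple round-robin striping.
# Arguments: offset in KB, chunk size in KB, number of devices.
stripe_device() {
    offset_kb=$1
    chunk_kb=$2
    ndev=$3
    echo $(( (offset_kb / chunk_kb) % ndev ))
}

# With a 4 KB chunk and two devices, consecutive chunks alternate.
for off in 0 4 8 12; do
    echo "offset ${off} KB -> raid-disk $(stripe_device "$off" 4 2)"
done
```

Consecutive 4 KB chunks land on raid-disk 0 and raid-disk 1 in turn, which is why large sequential transfers can keep both hardware arrays busy at once.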
To get the array to persist after rebooting, two strategies are available. One is to use fdisk to set the type of the component partitions (/dev/sda1 and /dev/sdb1 in this case) to 0xfd (Linux raid autodetect); the other, which can be used alone or in addition, is to create the file /etc/mdadm.conf:
# These are the two raid 5 devices associated with the
# 3ware 7500-8 controllers
DEVICE /dev/sda1 /dev/sdb1
# I used the following ARRAY declaration followed by a call to
# mdadm --examine --scan to construct the correct ARRAY declaration
#ARRAY /dev/md0
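# The auto=md option creates a standard (non-partitionable) array
# device at /dev/md0. auto=mdp4, for example, would instead create a
# partitionable array with device nodes for 4 partitions; partition
# sizes come from the partition table, not from mdadm.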
ARRAY /dev/md0 level=raid0 num-devices=2 auto=md UUID=39a09508:36be9972:854a3968:15dcdfbe
MAILADDR pehrens@ligo.caltech.edu
# If we decide that we should handle mdadm events:
#PROGRAM /ldcg_admin/mdadm-event-handler
Adding software RAID 0 on top of the hardware RAID 5 increased the buffered disk read rate as reported by hdparm -t from 97 MB/s to 137 MB/s, a boost of about 41%.
Performance Matrix:
time dd if=/dev/zero of=/raid/test.16gb_file count=16384 bs=1024k
time dd if=/raid/test.16gb_file of=/dev/null count=16384 bs=1024k
iostat -kd 5 /dev/md0 # can be educational
The numbers below were calculated from the timed dd commands above. These commands write and then read a 16 GB file in 1 MB blocks, so dividing 16384 by the runtime in seconds gives MB/s. On an otherwise unloaded machine the results are repeatable to within a couple of percent.
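The division can be scripted so that each timed run is converted the same way. This is a small sketch; the function name mb_per_sec is made up for this example, and the 128 second runtime is a hypothetical value, not one of the measurements.

```shell
#!/bin/sh
# Throughput for a dd run that moved `blocks` 1 MB blocks in `seconds`
# seconds, rounded to the nearest whole number.
mb_per_sec() {
    awk -v blocks="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", blocks / secs }'
}

# Hypothetical example: a 16384-block run that took 128 seconds of real time.
mb_per_sec 16384 128
```

Use the "real" figure printed by time as the runtime, since that is the wall-clock duration of the transfer.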
| Filesystem | Read Rate (MB/s) | Write Rate (MB/s) |
| Reiserfs 3.6 | 114 | 43 |
| Ext3 | 93 | 24 |
| XFS | 96 | 24 |
| JFS | 90 | 25 |
Across filesystems via NFS (Read FROM or Write TO a Solaris exported filesystem):
| Filesystem | Read Rate (MB/s) | Write Rate (MB/s) |
| Reiserfs 3.6 | 12 | 9 |
| Ext3 | 13 | 10 |
| XFS | 14 | 9 |
| JFS | 12 | 9 |
Note that it was necessary to switch to an 8 GB file for the NFS tests due to disk space limitations on the Solaris server.
To debug NFS issues on Solaris, kill the mountd process and restart it in verbose (debug) mode with /usr/lib/nfs/mountd -v, which logs to /var/adm/messages. Solaris 10 NFS exports are defined in /etc/dfs/dfstab and look like:
share -F nfs -o log,rw=datacache10 /usr2/test
################################################
#/etc/smartd.conf
#
# Monitor /dev/hda and 16 ide drives on two
# 3ware 7X00-8 raid controllers
#
# /dev/hda
# Run long self-test on boot disk Sunday at 3 a.m.
#
/dev/hda -l error -f -s L/../../7/03 -m pehrens,anderson_s,kozak_d
#
# 8 ATA disks on each of two 3ware 7X00 controllers.
# Start staggered short self-tests daily
#
# Attribute 176 is unknown to smartd - ??
#
# Note that it was necessary to make the devices twe0 and twe1
# by hand, since I am unable to run the tw_cli from
# http://www.3ware.com/support/download.asp tw_cli
# (3ware commandline interface - it crashes!!):
#
# mknod /dev/twe0 u 164 0
# mknod /dev/twe1 u 164 1
#
/dev/twe0 -d removable -d 3ware,0 -i 176 -l error -f -s S/../.././00 -m pehrens,anderson_s,kozak_d
/dev/twe0 -d removable -d 3ware,1 -i 176 -l error -f -s S/../.././01 -m pehrens,anderson_s,kozak_d
/dev/twe0 -d removable -d 3ware,2 -i 176 -l error -f -s S/../.././02 -m pehrens,anderson_s,kozak_d
/dev/twe0 -d removable -d 3ware,3 -i 176 -l error -f -s S/../.././03 -m pehrens,anderson_s,kozak_d
/dev/twe0 -d removable -d 3ware,4 -i 176 -l error -f -s S/../.././04 -m pehrens,anderson_s,kozak_d
/dev/twe0 -d removable -d 3ware,5 -i 176 -l error -f -s S/../.././05 -m pehrens,anderson_s,kozak_d
/dev/twe0 -d removable -d 3ware,6 -i 176 -l error -f -s S/../.././06 -m pehrens,anderson_s,kozak_d
/dev/twe0 -d removable -d 3ware,7 -i 176 -l error -f -s S/../.././07 -m pehrens,anderson_s,kozak_d
/dev/twe1 -d removable -d 3ware,0 -i 176 -l error -f -s S/../.././00 -m pehrens,anderson_s,kozak_d
/dev/twe1 -d removable -d 3ware,1 -i 176 -l error -f -s S/../.././01 -m pehrens,anderson_s,kozak_d
/dev/twe1 -d removable -d 3ware,2 -i 176 -l error -f -s S/../.././02 -m pehrens,anderson_s,kozak_d
/dev/twe1 -d removable -d 3ware,3 -i 176 -l error -f -s S/../.././03 -m pehrens,anderson_s,kozak_d
/dev/twe1 -d removable -d 3ware,4 -i 176 -l error -f -s S/../.././04 -m pehrens,anderson_s,kozak_d
/dev/twe1 -d removable -d 3ware,5 -i 176 -l error -f -s S/../.././05 -m pehrens,anderson_s,kozak_d
/dev/twe1 -d removable -d 3ware,6 -i 176 -l error -f -s S/../.././06 -m pehrens,anderson_s,kozak_d
/dev/twe1 -d removable -d 3ware,7 -i 176 -l error -f -s S/../.././07 -m pehrens,anderson_s,kozak_d
#
################################################
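The sixteen per-drive lines above differ only in the controller device, the port number and the hour of the short self-test, so they can be generated rather than typed. This is a minimal sketch; the function name gen_smartd_lines is made up for this example, and the options and mail recipients are copied from the listing above.

```shell
#!/bin/sh
# Emit one smartd.conf line per drive: two controllers (twe0, twe1) with
# eight ports each. The short self-test hour is staggered by port number
# (00 through 07).
gen_smartd_lines() {
    for ctrl in 0 1; do
        for port in 0 1 2 3 4 5 6 7; do
            printf '/dev/twe%d -d removable -d 3ware,%d -i 176 -l error -f -s S/../.././%02d -m pehrens,anderson_s,kozak_d\n' \
                "$ctrl" "$port" "$port"
        done
    done
}

gen_smartd_lines
```

Review the output before appending it to /etc/smartd.conf, since a box with a different number of controllers or ports needs the loop bounds changed.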
Fedora Core 4 users will want to do this:
yum -y install smartmontools
*** create the /etc/smartd.conf ***
chkconfig --add smartd
chkconfig smartd on
/etc/init.d/smartd start
Then check /var/log/messages to see if it worked.
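A quick way to do that check is to pull only the smartd entries out of the log. This is a small sketch that assumes smartd tags its syslog lines with the string "smartd"; the function name smartd_log_lines is made up for this example.

```shell
#!/bin/sh
# Show the most recent smartd entries from a syslog file
# (defaults to /var/log/messages when no file is given).
smartd_log_lines() {
    grep 'smartd' "${1:-/var/log/messages}" | tail -n 20
}
```

Run smartd_log_lines as root shortly after starting the service; no output at all suggests that smartd did not start or is logging elsewhere.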
Raidtools: http://people.redhat.com/mingo/raidtools/
Raidtools2: http://packages.qa.debian.org/r/raidtools2.html
Mdadm: http://www.cse.unsw.edu.au/~neilb/source/mdadm/