# Software RAID 5, SATA, and other performance factors

## bssteph

WAS: (Significant?) performance hit using software RAID 5?

I have four disks in a software RAID 5 on one of my boxes at home, and after finally sorting out my backup schemes, the ability of the array to safely lose one disk isn't as important (still nice though).

My question is a simple one. Right now I have two SATA and two IDE disks in this array. What kind of performance hit am I taking with this setup? I've read that writes to the array will be slower because of calculation of the parity information, but I've also read that it should be relatively insignificant in modern hardware (which it is, it's an AMD64 X2). Is it worth losing the parity to consider some other alternative? Should I replace the IDE drives? It's a fairly normal desktop load (KDE, Azureus, some games, the usual) but also with MythTV recording TV (say, I should try to move that off the raid, probably).

In short I've never been terribly impressed with the "feel-good" performance factor of the array, and I just wonder if I'm missing something big or just throwing the wrong usage at it, or if it's all in my head. Any thoughts welcome.

----------

## NeddySeagoon

bssteph,

With a dual core CPU, I would expect the bottleneck to be the drives themselves, not the ability to do the parity calculations fast enough.

That is, there will be a CPU load associated with the raid, but you won't notice any data transfer rate change by shifting to a single drive.

A single drive may even be worse.

What read speeds do you get with 

```
hdparm -tT /dev/md...
```

and for each drive alone?

Now, raid0 is faster, but do you need the speed?

Lose one drive and it's all gone.

----------

## bssteph

Hi NeddySeagoon.

First, your sig is fitting for the discussion, and I agree, losing the parity would stink, but then again, things don't seem entirely right here. Which I might have partially figured out...

Those hdparms:

```
mal ~ # hdparm -Tt /dev/hda

/dev/hda:

 Timing cached reads:   1840 MB in  2.00 seconds = 920.14 MB/sec

 Timing buffered disk reads:  174 MB in  3.02 seconds =  57.63 MB/sec

mal ~ # hdparm -Tt /dev/hdb

/dev/hdb:

 Timing cached reads:   1848 MB in  2.00 seconds = 924.92 MB/sec

 Timing buffered disk reads:  170 MB in  3.01 seconds =  56.39 MB/sec

mal ~ # hdparm -Tt /dev/sda

/dev/sda:

 Timing cached reads:   1822 MB in  2.00 seconds = 911.40 MB/sec

 Timing buffered disk reads:  202 MB in  3.01 seconds =  67.02 MB/sec

mal ~ # hdparm -Tt /dev/sdb

/dev/sdb:

 Timing cached reads:   1834 MB in  2.00 seconds = 917.93 MB/sec

 Timing buffered disk reads:  202 MB in  3.01 seconds =  67.15 MB/sec

mal ~ # hdparm -Tt /dev/md0

/dev/md0:

 Timing cached reads:   1874 MB in  2.00 seconds = 936.99 MB/sec

 Timing buffered disk reads:  490 MB in  3.01 seconds = 162.75 MB/sec
```

I'm not really sure if those numbers are normal or not. The buffered disk reads make sense for the array, I guess (the speed is that of 3 of the 4 disks combined, give or take); for the cached reads I don't know what to expect. The SATA disks and the board are 3.0 Gb/s, but the kernel says:

```
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xC400 irq 225

ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xC408 irq 225

scsi0 : sata_nv

ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

ata1.00: ATA-7, max UDMA/133, 488397168 sectors: LBA48 NCQ (depth 0/32)

ata1.00: ata1: dev 0 multi count 16

ata1.00: configured for UDMA/133

scsi1 : sata_nv

ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

ata2.00: ATA-7, max UDMA/133, 488397168 sectors: LBA48 NCQ (depth 0/32)

ata2.00: ata2: dev 0 multi count 16

ata2.00: configured for UDMA/133

  Vendor: ATA       Model: HDT722525DLA380   Rev: V44O

  Type:   Direct-Access                      ANSI SCSI revision: 05

  Vendor: ATA       Model: HDT722525DLA380   Rev: V44O

  Type:   Direct-Access                      ANSI SCSI revision: 05
```

Of note is "link up 1.5 Gbps", so that would be a problem. I think the kernel is configured properly. When I get home tonight I'm going to check the physical board and the cables (the board in particular, one revision of the board has some connectors which are only SATA 1.5 Gbps, I guess). Fixing that would be nice, but I would assume that the PATA drives would still slow down the array somewhat? I'm thinking of replacing them with some more SATA drives.

So I'm thinking this isn't very RAID related anymore. I don't suppose there's a way to do an hdparm-like write test on the disks without deconstructing the array?

----------

## NeddySeagoon

bssteph,

For 7200rpm drives the head/platter data rate is around 60 MB/sec, which is approximately what you get for the sustained data rate of your single drives. It depends where on the platter the test runs; outer tracks have more sectors per track than inner ones, so the data rate is higher there.

The connection speed only helps in getting data into and out of the drive cache; it plays no part in the sustained data rate.

The buffered read from /dev/md0, at 163 MB/sec, shows you are up against the head/platter limit again, and the raid 5 data shuffling plays little or no part in the data rate limit.
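As a sanity check, a four-disk raid5 reads from three data chunks per stripe, so the ceiling is roughly three times the slowest member's rate. A one-liner using the buffered-read figures from the hdparm runs above (56.39 MB/sec for the slowest drive):

```
# Rough raid5 read ceiling: (n - 1) data disks striped in parallel,
# each limited by the slowest member's head/platter rate.
awk 'BEGIN { printf "%.0f MB/sec\n", 3 * 56.39 }'
```

That lands at 169 MB/sec, close to the measured 162.75 MB/sec, which is what you'd expect if the heads, not the parity maths, are the limit.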

As you suspected, you cannot write-test the drives individually without deconstructing the array, but the write speed will be about the same as the read speed, for the same reason: the head/platter data rate. You can write a large file to /dev/md0 to test its write rate.
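A sketch of that large-file write test with dd, using conv=fdatasync so the page cache doesn't inflate the number. TARGET is a hypothetical mount point for the filesystem on /dev/md0; it defaults to /tmp here only so the sketch runs anywhere:

```
# Sequential write test: dd reports the effective write rate when done.
# Substitute the array's real mount point for TARGET.
TARGET=${TARGET:-/tmp}
dd if=/dev/zero of="$TARGET/ddtest" bs=1M count=256 conv=fdatasync
rm -f "$TARGET/ddtest"
```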

Seeing those numbers, I would stick with the raid5.

----------

## linuxtuxhellsinki

```
/dev/hda:

/dev/hdb: 
```

I think you should consider moving that second drive /dev/hdb to another IDE channel, because right now they're waiting for each other all the time. (I don't know if it's going to help very much speed-wise, though?)

----------

## bssteph

linuxtuxhellsinki, I think I am going to buy some SATA drives, but thanks.

NeddySeagoon, thank you very much for the extended info. I indeed think now that I will stick with the RAID 5 (putting MythTV recordings on a RAID 0 when I get to that point).

Unrelated to the decision, but in case anyone goes looking around, I found out why my SATA disks were only using 1.5 Gbps. Those disks are from Hitachi, and there is a little DOS application downloadable as a floppy/CD image called "Feature Tool" which was necessary to enable 3.0 Gbps on those disks.

----------

## NeddySeagoon

bssteph,

The speed increase you will see by operating the SATA interface at 3.0 Gbit/sec instead of 1.5 Gbit/sec is close to zero.

The interface is already capable of roughly 3x the head/platter data rate, so 6x won't move the bottleneck.

Now, if the platter speed increased by 6x, to 43,200 rpm, or the data bits got smaller, so that the head/platter data rate improved, you would see some gain.

That's partly why faster disks are better.

----------

## linuxtuxhellsinki

...maybe some 15K scsi (+90MB/s)?

----------

## tgh

 *Quote:*   

> I'm not really sure if those numbers are normal or not. buffered disk reads makes sense for the array I guess (speed is 3 of the 4 disks combined, give or take), the cached reads I don't know what to expect. 

 

HDParm cached / buffered read speed is directly related to how fast your CPU is and/or how fast it can pull data from RAM.

And ~925MB/s seems a little slow for anything made this year.  Our Athlon64 X2 3800/4200/4600 boxes with DDR2 5300 RAM pull values in the 1900-2200 MB/s range and my old 939 Athlon64 3200 with DDR 400 RAM pulled in a value around 1600 MB/s.

RAID5's two downsides are: (a) slower write and re-write speeds, due to having to touch at least 2 disks in the array for every write, and (b) rebuild times, which can be lengthy.  If performance is a concern (over capacity), your next step is RAID10, which offers more balanced read/write performance and does better when the array is extremely busy.
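The write-speed point comes down to simple I/O counting; a back-of-envelope sketch (the 4-vs-2 I/O figures are the textbook small-write case, not measurements from this box):

```
# A sub-stripe raid5 write costs 4 disk I/Os (read old data, read old
# parity, write new data, write new parity); a raid10 write costs 2
# (one per mirror side). Efficiency = application writes per disk I/O.
awk 'BEGIN { printf "raid5  small-write efficiency: %.2f\n", 1/4;
             printf "raid10 small-write efficiency: %.2f\n", 1/2 }'
```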

And I'm not sure that with modern 80-pin cabling having 2 drives on the same IDE cable really matters anymore.  I still break mine out to separate cables when at all possible, but that's more out of habit than due to any hard numbers.  (Unless... the bandwidth of the 2 drives combined is such that they exceed the capacity of the IDE chipset.  Which all depends on what chip was used for the IDE controller.  Or maybe the 2 drives combined exceed the capacity of the link between the motherboard chipset and the IDE controller chip.)

Shouldn't be any performance hit mixing IDE/SATA drives within a software RAID, unless you run into a bus bottleneck somewhere (unlikely with modern PCIe chipsets).  Physically, a 7200 rpm IDE drive is identical to the 7200 rpm SATA drive from the same manufacturer; they just have different I/O chips bolted on.  And a good IDE controller can easily handle the 65-75 MB/s coming off a modern drive.

(That's one big advantage of Software RAID... mix-n-match whatever you can lay your hands on when building the array.)

----------

## bssteph

Thanks for the info, tgh.

 *tgh wrote:*   

> HDParm cached / buffered read speed is directly related to how fast your CPU is and/or how fast it can pull data from RAM.
> 
> And ~925MB/s seems a little slow for anything made this year.  Our Athlon64 X2 3800/4200/4600 boxes with DDR2 5300 RAM pull values in the 1900-2200 MB/s range and my old 939 Athlon64 3200 with DDR 400 RAM pulled in a value around 1600 MB/s.

 

Seeing this made me check a LiveCD, and sure enough, I'm getting much better numbers on it: ~1900 MB/sec on all the drives. I'm comparing dmesg and kernel configs now. Any guesses? The hardware is an X2 3800+ on a Gigabyte nForce4 board with 2x1 GB DDR400 RAM.

I've changed the title to represent the different path this thread is going.

----------

## NeddySeagoon

tgh,

 *tgh wrote:*   

> HDParm cached / buffered read speed is directly related to how fast your CPU is and/or how fast it can pull data from RAM.

This is only true for PIO modes, where the CPU actually does the reading. In DMA modes, the CPU only loads the DMA controller; then it's down to memory speed and what else is sharing the memory bandwidth.

 *tgh wrote:*   

> I'm not sure that with modern 80-pin cabling that having 2 drives on the same IDE cable really matters anymore.

It depends on access patterns, because transactions to the two drives on the same IDE cable cannot be overlapped. The 80-conductor cable only increases the bandwidth. For a raid, one drive per IDE channel is best.

linuxtuxhellsinki,

15k drives provide lower latency and higher data rates, but your 90MB/sec is well within the capability of IDE interfaces; SCSI no longer has the performance edge it once did.

----------

## tgh

I don't really have any guesses as to why you were seeing lower cached hdparm values.  Comparing dmesg and .config is what I would do as well.

Makes sense about PIO vs DMA controlling the overall speed.  I know that my old VIA C3 unit has a lot lower bandwidth than my Athlon64, but the C3 uses PC133 RAM instead of DDR400.  That machine only gets 137MB/s buffered (probably due to the older, slower memory architecture).

I'm with you on SATA vs SCSI.  I think the cost difference is either patent/chipset fees or that (supposedly) SCSI drives are individually tested while SATA/PATA are sample (batch) tested.

The other place that SCSI supposedly still shines is under heavy loads where the SCSI controller can re-order the request queue to service things out of order to speed up access.  SATA-II (and some SATA-I) have NCQ capability now, but it's still pretty fresh.

For low-end, small workgroup servers (< 10 people), SATA 7200rpm or 10k rpm drives compete very well with SCSI.  I'd even say you could build a mid-level server using SATA as well, since the price of SATA drives lets you have twice the number of spindles for the same cost.  More spindles means you can separate work loads (or at least spread the load across more drives).  Plus the $/GB is a lot lower for SATA than SCSI.  Heck, even some of the low/mid-range SAN gear by the big names now uses SATA drives.

SATA drives were also designed for hot-plug.  The form factor seems to be standardized as well, because on the SATA backplane units that I've played with there's no separate interface for the drive.  Instead, the back of the SATA drive mates directly with the backplane.  (I think SAS drives are the same way, but I haven't had a chance to muck with those yet.  SATA backplanes and SATA drives were good enough for our purposes, especially once we went with RAID10.)

----------

## bssteph

Argh:

```
(AMD64 minimal 2006.1 LiveCD bootup + messages)

livecd ~ # hdparm -Tt /dev/sda

/dev/sda:

 Timing cached reads:   3908 MB in  2.00 seconds = 1954.81 MB/sec

 Timing buffered disk reads:  204 MB in  3.02 seconds =  67.48 MB/sec

livecd ~ # evms_activate 

(normal evms messages about LVM2/MD container headers)

livecd ~ # hdparm -Tt /dev/sda

/dev/sda:

 Timing cached reads:   3844 MB in  2.00 seconds = 1922.67 MB/sec

 Timing buffered disk reads:  202 MB in  3.00 seconds =  67.31 MB/sec

livecd ~ # mount /dev/evms/root /mnt/gentoo/

livecd ~ # hdparm -Tt /dev/sda

/dev/sda:

 Timing cached reads:   3864 MB in  2.00 seconds = 1933.59 MB/sec

 Timing buffered disk reads:  202 MB in  3.01 seconds =  67.12 MB/sec

livecd ~ # /mnt/gentoo/sbin/hdparm -Tt /dev/sda

/dev/sda:

 Timing cached reads:   1910 MB in  2.00 seconds = 955.74 MB/sec

 Timing buffered disk reads:  202 MB in  3.01 seconds =  67.12 MB/sec

livecd ~ #
```

So, there's no problem, but my hdparm was reporting nonsense!? It's hdparm 6.3 on the LiveCD, 6.9 on my system. I downgraded hdparm on my system to 6.3 and it's seeing ~1900 MB/sec now.

----------

## linuxtuxhellsinki

So there's a bug in hdparm-6.9?

tgh & neddy

I agree that sata is really good and the best choice for desktop usage with its good price/performance factor, but those 15K scsi drives still shine with very low latency and seek times (like this cheetah).

And yes, those SAS drives are using the same interface as SATA, and there are 2.5" versions of 'em with quite nice specs.

Edit: a SATA connector fits into SAS, but not the other way around.

----------

## bssteph

 *linuxtuxhellsinki wrote:*   

> So there's a bug in hdparm-6.9?

 

Or 6.3, and the lower numbers are right. For the moment I'm too frustrated to find out which.

----------

## tgh

One way to cross-check the number is memtest or memtest86+, which gives you a running readout of the memory bandwidth on your system.

For instance, on my M2N-E (AM2 socket) motherboard with an Athlon64 X2 4600+, I get timed cached reads of 2400 MB/s in hdparm.  I have 4GB of ECC DDR2 RAM installed.  Memtest86+ gives the following values:

L1 - 19763MB/s

L2 - 4709MB/s

Mem - 3040MB/s

So hdparm 6.6 is reporting a value that is 78% of my overall memory bandwidth on my AMD64 system.
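That percentage is just the two figures above divided out:

```
# hdparm cached reads (2400 MB/s) as a share of the memtest86+
# main-memory bandwidth (3040 MB/s).
awk 'BEGIN { printf "%.1f%%\n", 100 * 2400 / 3040 }'
```

which prints 78.9%, i.e. the ~78% quoted above.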

PS: Interestingly, there's a bug at the end of the memtest86+ install script.  It tells you to add the "root (hd3,-1)" line to your grub.conf file, but this is incorrect (and will prevent the memtest86+ kernel from booting).  You only need the title and kernel lines added to grub.conf.
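For reference, a minimal working grub (legacy) stanza is just the two lines below; the kernel path is an assumption on my part, so adjust it to wherever your memtest86+ image actually landed:

```
# Minimal memtest86+ entry: only title and kernel lines are needed.
title memtest86+
kernel /boot/memtest86plus/memtest.bin
```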

----------

## bssteph

 *tgh wrote:*   

> One way to cross-check the number is memtest or memtest86+.  Which gives you a running detail of the memory bandwidth on your system.
> 
> For instance, on my M2N-E (AM2 socket) motherboard with an Athlon64 X2 4600+ w/  I get a timed cached reads of 2400 MB/s in hdparm.  I have 4GB of ECC DDR2 RAM installed.  Memtest86+ gives the following values:
> 
> L1 - 19763MB/s
> ...

 

Wow, thanks very much for the tip. memtest86+ reports a memory bandwidth of 2243 MB/s, so my hdparm 6.3 figure is about 86% of the memory bandwidth.

Now I can figure out if it's something with hdparm 6.9 that's worth reporting while I read the rest of the thread. ;)

----------

## someone19

 *tgh wrote:*   

> And I'm not sure that with modern 80-pin cabling having 2 drives on the same IDE cable really matters anymore.  I still break mine out to separate cables when at all possible, but that's more out of habit than due to any hard numbers.  (Unless... the bandwidth of the 2 drives combined is such that they exceed the capacity of the IDE chipset.  Which all depends on what chip was used for the IDE controller.  Or maybe the 2 drives combined exceed the capacity of the link between the motherboard chipset and the IDE controller chip.)

The PATA (IDE) bus uses 40-pin connectors; the 80-conductor cables exist to filter interference, much like ethernet CAT 3/5/5e/6 cabling only uses two of the four pairs available.  The best performance on an IDE interface is with one drive per channel.  If you have two drives set up as master/slave, only one drive can read/write at a time.  The 'parallel' in PATA refers to the data path to/from one drive; the interface is serial in its access of the two drives.  So in a RAID setup where you have two drives on one cable, you can't get double the throughput, because you can only read from one drive at a time.  The early IDE raid controller cards were set up so that you could only put one drive on each cable; the controllers wouldn't recognise a slave drive, because the makers knew it would kill the performance.  It isn't a matter of the chipset, merely a limitation of the original IDE pinout designed... 20 years ago or so?  <shudder>

This was just some info I wanted to see in this discussion in case somebody finds this thread in the future.

----------

