# RAID performance

## nadir-san

I've been getting into RAID.

I have two setups, but the performance doesn't seem to make sense; perhaps someone can shed some light on the situation.

My first setup is dual 72GB WD Raptors in RAID 1.

```
tomoe ~ # hdparm -tT /dev/md3

/dev/md3:

 Timing cached reads:   2182 MB in  2.00 seconds = 1092.01 MB/sec

 Timing buffered disk reads:  204 MB in  3.03 seconds =  67.38 MB/sec

tomoe ~ # cat /proc/mdstat 

Personalities : [linear] [raid0] [raid1] [raid10] [multipath] [faulty] 

md3 : active raid1 sda4[0]

      57817856 blocks [2/1] [U_]

      

md0 : active raid1 sdb1[1] sda1[0]

      104320 blocks [2/2] [UU]

```

The second setup is 4 x 500GB WD Caviars.

```
miyu raid # hdparm -tT /dev/md0

/dev/md0:

 Timing cached reads:   814 MB in  2.00 seconds = 407.24 MB/sec

 Timing buffered disk reads:  254 MB in  3.01 seconds =  84.45 MB/sec

miyu raid # cat /proc/mdstat 

Personalities : [raid6] [raid5] [raid4] 

md0 : active raid5 sda1[0] sdd1[4] sdc1[2] sdb1[1]

      1465151808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

      [>....................]  recovery =  2.6% (12993920/488383936) finish=303.9min speed=26069K/sec

```

Now... unless I'm seeing things, the RAID 5 array is faster?

That doesn't make sense. Those Raptors should be outperforming it.

Note: for the Raptors it says

```
md3 : active raid1 sda4[0]
      57817856 blocks [2/1] [U_]
```

but there is no mention of the second disk, sdb4?

Strange, even though I have it in raidtab:

```

tomoe ~ # cat /etc/raidtab 

# /boot (RAID 1)

raiddev                 /dev/md0

raid-level              1

nr-raid-disks           2

chunk-size              32

persistent-superblock   1

device                  /dev/sda1

raid-disk               0

device                  /dev/sdb1

raid-disk               1

# / (RAID 1)

raiddev                 /dev/md3

raid-level              1

nr-raid-disks           2

chunk-size              32

persistent-superblock   1

device                  /dev/sda4

raid-disk               0

device                  /dev/sdb4

raid-disk               1

```

```

tomoe ~ # mdadm --monitor /dev/md3

Apr  8 01:41:38: DegradedArray on /dev/md3 unknown device

```

 :Surprised: 

----------

## Keruskerfuerst

1. md3 is degraded: 57817856 blocks [2/1] [U_] shows only one of its two mirrors present.

----------

## HeissFuss

From the `cat /proc/mdstat` output, the RAID 5 parity was still being built.

For the other array, as the post above says, md3 is degraded because it is missing its second partition.
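As a sketch of the usual fix (assuming /dev/sdb4 itself is healthy), the missing mirror member can be re-added and md3 will resync in the background:

```shell
# Confirm which member is missing from the mirror
mdadm --detail /dev/md3

# Re-add the absent partition; the kernel resyncs it automatically
mdadm /dev/md3 --add /dev/sdb4

# Watch the resync progress
cat /proc/mdstat
```

If sdb4 was kicked out because the drive is failing, check the logs first rather than blindly re-adding it.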

----------

## eccerr0r

When you see '_' in '[UUU_]' it means one disk is down or inactive, and the RAID is in degraded mode...  If it were up there would be a 'U' there.

```
doujima:~$ sudo hdparm -t /dev/md1

/dev/md1:

 Timing buffered disk reads:  420 MB in  3.01 seconds = 139.42 MB/sec

doujima:~$ cat /proc/mdstat

Personalities : [raid1] [raid6] [raid5] [raid4]

md1 : active raid5 hdg2[3] hde2[2] hdc2[1] hda2[0]

      23470656 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]

```

Not necessarily the fastest RAID, but it does work... however, I've found that I needed my faster Athlon XP 2200+ rather than my old Celeron-1364 to get that speed; the Celeron wouldn't get anywhere close to the burst speed seen here. These are random old 120GB disks (Seagates and Maxtors, mixed lots).

----------

## Mad Merlin

RAID 1 does not normally increase the speed of reading or writing; it only provides tolerance of n-1 dead hard drives. RAID 5 does increase read speed roughly in proportion to the number of disks you have, but write speeds can suffer.

----------

## Keruskerfuerst

RAID 1 doubles the read speed and decreases the write speed (by up to 20%).

----------

## Mad Merlin

 *Keruskerfuerst wrote:*   

> RAID 1 doubles the read speed and decreases the write speed (by up to 20%).

 

In theory it seems like it should, but in practice it normally does not.

4 disk RAID 1:

```

dd if=/boot/bigfile of=/dev/null

819200+0 records in

819200+0 records out

419430400 bytes (419 MB) copied, 8.46766 s, 49.5 MB/s

```

4 disk RAID 5:

```

dd if=bigfile of=/dev/null

4194304+0 records in

4194304+0 records out

2147483648 bytes (2.1 GB) copied, 23.9611 s, 89.6 MB/s

```

4 disk RAID 0:

```

dd if=/tmp/bigfile of=/dev/null

4194304+0 records in

4194304+0 records out

2147483648 bytes (2.1 GB) copied, 16.7629 s, 128 MB/s

```

Each disk in the RAID maxes out around 60MB/s, and they're not completely idle right now, but you get the idea. RAID 1 is the slowest, but the most reliable, of RAID 0, 1, and 5.

----------

## Keruskerfuerst

You should use bonnie++ or a similar I/O benchmark to measure the performance of a RAID array.
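For example (a sketch; the mount point and size are placeholders), bonnie++ can be pointed at a directory on the array, with a test size larger than RAM so the page cache doesn't inflate the numbers:

```shell
# Run against a directory on the RAID as an unprivileged user;
# -s should be at least 2x installed RAM so reads actually hit the disks
bonnie++ -d /mnt/raid/tmp -s 4g -u nobody
```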

----------

## drescherjm

 *Quote:*   

> Now... unless I'm seeing things, the RAID 5 array is faster?
> 
> That doesn't make sense. Those Raptors should be outperforming it. 

 

No, they should not, for several reasons. The Raptors have a better seek time, so they are better for random reads, but they are several generations behind your 500GB drives, so they have a slower sustained transfer rate (which is what hdparm reports).

If you used hdparm to benchmark a single Raptor against a single current-generation 500GB SATA drive, I'd bet the 500GB drive comes out on top. The reason is simple: although the Raptor spins faster than the 500GB SATA drive, the 500GB drive packs the bits significantly (more than 2x) closer together, so to read a large amount of data the Raptor has to travel a much longer distance around the platter. Even though it spins at 1.5 times the speed, the 500GB drive will win.

A second reason is that, for reading, RAID 5 is faster than RAID 1 in most cases: it behaves roughly like RAID 0 across n-1 drives, provided you have a modern CPU or a real hardware RAID controller made within the last 3 years. At work I have a couple of 6-drive RAID 6 arrays using 330GB Seagate 7200.10 drives that net a 267MB/s hdparm score on uncached reads. For RAID 6 the number should be around n-2 drives' worth, which in my case should be around 300MB/s (4 x 75MB/s), but there is some overhead that robs me of the extra 33MB/s.

----------

## Keruskerfuerst

hdparm is not well suited to benchmarking RAID arrays.

----------

## drescherjm

 *Quote:*   

> hdparm is not well suited to benchmarking RAID arrays.

 

Agreed, as it only measures sustained transfer rate, and that has limited value unless you have tons of memory, only sequentially load large files in one chunk, and have no file fragmentation...
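A rough alternative for sequential reads, as a sketch, is dd with O_DIRECT so the page cache is bypassed entirely (the device name and sizes here are just examples):

```shell
# Read 1GB straight off the md device, bypassing the page cache;
# divide the reported time into the size for a raw sequential figure
dd if=/dev/md0 of=/dev/null bs=1M count=1024 iflag=direct
```

This still only measures sequential throughput, of course; random I/O needs a tool like bonnie++ or iozone.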

----------

## nadir-san

Thanks for the quick response.

But there seems to be something badly wrong.

```

miyu ~ # cat /proc/mdstat

Personalities : [raid6] [raid5] [raid4]

md1 : active raid5 sda1[4](F) sdb1[1] sdd1[5](S) sdc1[2]

      1465151808 blocks level 5, 64k chunk, algorithm 2 [4/2] [_UU_]

     

unused devices: <none> 

```

I rebuilt the array again, for the 4th time; this morning when I woke up I saw _UU_.

Kinda disappointing. It just doesn't want to stay alive.

I'm restarting again now, even though I know it will fail again. Here is the build report from mdadm.

```

miyu ~ # mdadm --create /dev/md1 --level=5 --raid-disks=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 --spare-devices=0

mdadm: /dev/sda1 appears to be part of a raid array:

    level=raid5 devices=4 ctime=Mon Apr  9 13:06:02 2007

mdadm: /dev/sdb1 appears to be part of a raid array:

    level=raid5 devices=4 ctime=Mon Apr  9 13:06:02 2007

mdadm: /dev/sdc1 appears to be part of a raid array:

    level=raid5 devices=4 ctime=Mon Apr  9 13:06:02 2007

mdadm: /dev/sdd1 appears to be part of a raid array:

    level=raid5 devices=4 ctime=Mon Apr  9 13:06:02 2007

Continue creating array? y

mdadm: array /dev/md1 started.

miyu ~ # cat /proc/mdstat 

Personalities : [raid6] [raid5] [raid4] 

md1 : active raid5 sdd1[4] sdc1[2] sdb1[1] sda1[0]

      1465151808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

      [>....................]  recovery =  0.0% (85248/488383936) finish=286.2min speed=28416K/sec

      

unused devices: <none>

miyu ~ # mdadm --detail /dev/md1 

/dev/md1:

        Version : 00.90.03

  Creation Time : Tue Apr 10 19:36:32 2007

     Raid Level : raid5

     Array Size : 1465151808 (1397.28 GiB 1500.32 GB)

  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)

   Raid Devices : 4

  Total Devices : 4

Preferred Minor : 1

    Persistence : Superblock is persistent

    Update Time : Tue Apr 10 19:36:32 2007

          State : clean, degraded, recovering

 Active Devices : 3

Working Devices : 4

 Failed Devices : 0

  Spare Devices : 1

         Layout : left-symmetric

     Chunk Size : 64K

 Rebuild Status : 0% complete

           UUID : 40b869c3:eaf7f095:64550e9c:fbb46ff4

         Events : 0.1

    Number   Major   Minor   RaidDevice State

       0       8        1        0      active sync   /dev/sda1

       1       8       17        1      active sync   /dev/sdb1

       2       8       33        2      active sync   /dev/sdc1

       4       8       49        3      spare rebuilding   /dev/sdd1

```

It's already listing one device as a spare, even though I explicitly stated --spare-devices=0; I noticed this before too. These drives are brand new and I checked each one out; they seem to be fine. It's sda1 that fails over and over again. I notice this dude is having the same problem.

This could be bugworthy.

----------

## HeissFuss

It looks like the disk is failing while the parity is being built, since the 4th disk hasn't been activated before the first drive fails. Are you sure that device is really error-free? Are there any messages in dmesg or /var/log/messages (hint: there will be)? Did you run badblocks on the drive?

If you don't believe that disk is bad, try building the array in a different order and see if that disk (and not just the first one in the array) fails again.
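A sketch of those checks (read-only options on purpose; device names are examples):

```shell
# Non-destructive read-only surface scan of the suspect partition
badblocks -sv /dev/sda1

# SMART health status and the drive's error log (needs smartmontools)
smartctl -H /dev/sda
smartctl -l error /dev/sda

# Kernel messages mentioning the drive or its ATA link
dmesg | grep -i 'sda\|ata'
```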

----------

## Cyker

A quick FYI:

```
    Number   Major   Minor   RaidDevice State 

       0       8        1        0      active sync   /dev/sda1 

       1       8       17        1      active sync   /dev/sdb1 

       2       8       33        2      active sync   /dev/sdc1 

       4       8       49        3      spare rebuilding   /dev/sdd1
```

This is NORMAL when building an mdadm RAID 5 array; it's just that nobody tells you!  :Evil or Very Mad: 

Whenever you make an array, it will pretend one of the disks is a spare, quickly assemble the others in degraded mode (and from there you can use it!), and then begin reconstructing the final disk in the background using the 'spare'.

Apparently it's faster doing it this way. I don't really understand why, but at least you can start using the array right away...
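To follow that background construction, a couple of read-only commands (a sketch; md1 is the device from the posts above) show the member flipping from 'spare rebuilding' to 'active sync' when it completes:

```shell
# Refresh the rebuild progress bar every 2 seconds
watch -n2 cat /proc/mdstat

# Per-member state; the 'spare rebuilding' entry becomes
# 'active sync' once the parity build finishes
mdadm --detail /dev/md1
```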

----------

## nadir-san

Is it 'fake' information then? Is one disk actually holding more parity or something? That wouldn't really make sense; each one would need the same redundancy factor for it to be really stable, I would imagine.

I have reconstructed the array in reverse order, so now it says:

```

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] 

md0 : active raid5 sda1[4] sdb1[2] sdc1[1] sdd1[0]

      1465151808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

      [==>..................]  recovery = 14.7% (72069760/488383936) finish=262.0min speed=26478K/sec

      

unused devices: <none>

```

```

miyu ~ # mdadm --detail /dev/md0

/dev/md0:

        Version : 00.90.03

  Creation Time : Tue Apr 10 22:13:18 2007

     Raid Level : raid5

     Array Size : 1465151808 (1397.28 GiB 1500.32 GB)

  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)

   Raid Devices : 4

  Total Devices : 4

Preferred Minor : 0

    Persistence : Superblock is persistent

    Update Time : Tue Apr 10 22:15:17 2007

          State : clean, degraded, recovering

 Active Devices : 3

Working Devices : 4

 Failed Devices : 0

  Spare Devices : 1

         Layout : left-symmetric

     Chunk Size : 64K

 Rebuild Status : 14% complete

           UUID : 2a2c294e:5f2caa45:c7488c4f:7f3942e6

         Events : 0.16

    Number   Major   Minor   RaidDevice State

       0       8       49        0      active sync   /dev/sdd1

       1       8       33        1      active sync   /dev/sdc1

       2       8       17        2      active sync   /dev/sdb1

       4       8        1        3      spare rebuilding   /dev/sda1

```

So, like you say, I'm assuming '... spare rebuilding   /dev/sda1' is actually just the same as any other disk in the array.

also

 *Quote:*   

> and then begin constructing the rest of the array in the background using the 'spare'.

 

Do you mean by this that when it's finished it should list UUUU, or will it still show UUU_?

It's quite confusing; even TLDP doesn't explain it.

BTW, there have been some fixes merged into 2.6.20 regarding raid456, so this new attempt is using vanilla 2.6.20-r6.

----------

## Cyker

I know! It's such a trivial thing, but (almost) none of the mdadm docs mention it!

You are correct 'tho, and yes, when the array finishes rebuilding, it will be all [UUUU]  :Mr. Green: 

You can format and mount the array (make sure you format /dev/md0 and not /dev/sda or sda1!  :Embarassed: ) while the rebuild is happening 'tho.
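A minimal sketch of that (the filesystem choice and mount point are placeholders; note that everything targets the md device, never a member disk):

```shell
# Format the array itself, NOT the underlying sdX members
mkfs.ext3 /dev/md0

# Mount it; the background rebuild keeps running while you copy data
mkdir -p /mnt/raid
mount /dev/md0 /mnt/raid
```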

If you're in a hurry, it might be a good idea. I was waiting at first, but then it took several hours for the conversion to finish, and in that time I'd already copied over all 200+GB of the old /home data from the old system!!!  :Shocked:   :Mr. Green: 

If you set up the kernel and partitions right, you can even reboot the compy and the rebuild will resume where it left off  :Mr. Green: 

(Given your luck so far 'tho, I'd not try that just yet!  :Wink: )

NB: If the array rebuilds and it STILL fails, it might actually be that sda is not working!

----------

## nadir-san

OK, cool, thanks for clearing up those issues for me, Cyker. Yeah, I already 'moved' some data over and lost it the first time I tried building  :Sad: . Hopefully this new kernel will work, or at least I'll be able to identify what's actually going wrong if it breaks. Sound, thanks  :Smile: . I'll post my results tomorrow XD.

----------

## drescherjm

 *Quote:*   

> hopefully this new kernel will work, or at least I will be able to identify whats actually going wrong if it breaks.

 

It is unlikely that the kernel is at fault; it is more likely a hardware problem. At work, about 10% of the brand-new drives I get are DOA, die, or experience some type of failure within the first 3 months, and I have bought over 100 in the last 10 years. It can also be cabling (a loose power connector), the driver for your SATA or IDE card, or a bad power supply. At work, where I have over 7TB on Linux software RAID, I have seen all of these hardware issues cause a RAID to kick out a drive, putting the array into degraded mode.

----------

## HeissFuss

Be aware that copying data to the RAID 5 array, or reading from it, while it's syncing will lengthen the sync time significantly (just watch the sync speed drop when you start copying).
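If the resync is crawling because of concurrent I/O, the md resync throttles can be raised; a sketch (values are in KB/s, and these numbers are just examples):

```shell
# Current floor and ceiling for md resync speed, in KB/s
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

# Raise the floor so the resync keeps moving even under load
# (trade-off: foreground I/O gets slower while it rebuilds)
echo 50000 > /proc/sys/dev/raid/speed_limit_min
```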

----------

## nadir-san

OK, well, just to let you guys know, it looks like it WAS the kernel at fault. After upgrading, it seems fine; it's been running for about a week now (rebooted about 10 times, took a disk out and added it back in), and it looks good. Here's hoping it stays like that. It seems pretty damn cool actually, happy happy.

----------

## Cyker

What kernel are you using now?

I just got 2.6.20-r6 from the 'stable tree', and the boot-time benchmarks rate slightly slower than .19 (which in turn was slower than .18; not a trend that continues, I hope!)

Curiously, the array actually seems faster and less I/O-wait-bound during heavy load, although I'm thinking this is because .20 also has NCQ support for sata_nv.

----------

## nadir-san

Yes, that's the kernel I'm using. I'm not so worried about speed on that machine; I just want it to be stable. I'm also running the same kernel on my RAID 1 array on the Raptors.

I do want performance from that one... I haven't noticed any difference; it's still damn fast.

----------

