# very poor mdadm raid 0 performance

## NoControl

I just installed Gentoo on an old PC I got my hands on. It has two 20 GB IDE hard drives (5400 rpm). I believe I've applied all the correct hdparm settings, and they are working (see output below). Tested individually, the drives show performance I believe is normal for drives of this type (around 20 MB/s, see below).

I used mdadm to put them in a RAID 0 configuration (details below). The drives are on separate IDE channels with no other devices on them. Yet the performance of the RAID device (md0) is even poorer (21 MB/s) than that of each drive separately! Any ideas how this could happen? Could it be the chunk size of the array? I set it to 16 KiB (see below) because the array will be dealing with a lot of small files, but mdadm's default is 64 KiB. Or maybe the file system (JFS) is to blame?
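In case the chunk size is the culprit, one experiment would be to recreate the array with the 64 KiB default and re-run the benchmark. A sketch only — it destroys the array's contents, and assumes the member partitions shown below:

```shell
# WARNING: recreating a RAID 0 destroys everything on it -- back up first.
mdadm --stop /dev/md0
mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=64 \
    /dev/hda3 /dev/hdc2

# Re-run the same read benchmark for comparison.
hdparm -t /dev/md0
```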

```
wigwam ~ # hdparm -mcudat /dev/hda /dev/hdc

/dev/hda:
 multcount    = 16 (on)
 IO_support   =  1 (32-bit)
 unmaskirq    =  0 (off)
 using_dma    =  1 (on)
 readahead    = 256 (on)
 Timing buffered disk reads:   72 MB in  3.07 seconds =  23.43 MB/sec

/dev/hdc:
 multcount    = 16 (on)
 IO_support   =  1 (32-bit)
 unmaskirq    =  0 (off)
 using_dma    =  1 (on)
 readahead    = 256 (on)
 Timing buffered disk reads:   68 MB in  3.08 seconds =  22.09 MB/sec

wigwam ~ # cfdisk -P s /dev/hda
Partition Table for /dev/hda

               First       Last
 # Type       Sector      Sector   Offset    Length   Filesystem Type (ID) Flag
-- ------- ----------- ----------- ------ ----------- -------------------- ----
 1 Primary           0       16064     63       16065 Linux (83)           Boot
 2 Primary       16065      289169      0      273105 Linux swap / So (82) None
 3 Primary      289170    39873329      0    39584160 Linux raid auto (FD) None

wigwam ~ # cfdisk -P s /dev/hdc
Partition Table for /dev/hdc

               First       Last
 # Type       Sector      Sector   Offset    Length   Filesystem Type (ID) Flag
-- ------- ----------- ----------- ------ ----------- -------------------- ----
 1 Primary           0      250991     63      250992 Linux swap / So (82) None
 2 Primary      250992    39862367      0    39611376 Linux raid auto (FD) None

wigwam ~ # mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Sat May 12 00:31:11 2007
     Raid Level : raid0
     Array Size : 39597568 (37.76 GiB 40.55 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun May 13 11:09:39 2007
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 16K

           UUID : c582a4d1:20232912:c40b0e12:0e5ff0e8
         Events : 0.3

    Number   Major   Minor   RaidDevice State
       0       3        3        0      active sync   /dev/hda3
       1      22        2        1      active sync   /dev/hdc2

wigwam ~ # hdparm -t /dev/md0

/dev/md0:
 Timing buffered disk reads:   64 MB in  3.03 seconds =  21.09 MB/sec
```

Any clues, anyone?

----------

## widan

 *NoControl wrote:*   

> The drives are on separate IDE channels with no other devices on them. However, the performance of the raid device (md0) is even poorer (21 MByte/s) than each of the drives separately! Any ideas how this could happen?

 

Which IDE controller (in lspci)? Some old controllers can't run DMA on both channels at the same time, because they share a single DMA engine (or some other internal resource, or because of buggy chip revisions), so accesses need to be serialized across both channels.

What happens if you run two instances of the hdparm test (one on each drive) at the same time? Does the speed of each drive drop significantly, or is it similar to what you get when testing one drive at a time?
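For what it's worth, redirecting each run to its own file keeps the two outputs from interleaving — a minimal sketch of the same concurrent test:

```shell
# Start both reads at the same time, one output file per drive.
hdparm -t /dev/hda > /tmp/hda.out 2>&1 &
hdparm -t /dev/hdc > /tmp/hdc.out 2>&1 &
wait    # block until both benchmarks finish

# Now the two results can be read cleanly.
cat /tmp/hda.out /tmp/hdc.out
```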

----------

## NoControl

I'm afraid you were completely right in your guess:

```
wigwam ~ # hdparm -t /dev/hda & hdparm -t /dev/hdc
[1] 5601

/dev/hda:

/dev/hdc:
 Timing buffered disk reads:   Timing buffered disk reads:   34 MB in  3.03 seconds =  11.22 MB/sec
 42 MB in  3.06 seconds =  13.75 MB/sec
```

The output is obviously jumbled, but the results are clear (and consistent).

It's an old motherboard using an Intel 440BX, more specifically (according to lspci): "IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)".

Is there a way around this? Before this I had a motherboard with an Intel 440LX chipset (which is even older, I believe), and that chipset had no problem running DMA on both channels at once. I suspect putting both drives on the same IDE channel won't make a difference, since they'd still only be able to transfer data one at a time?

----------

## widan

I don't know if there is a workaround, but you're not the only one to notice this problem with that particular chipset. You can look at these posts on LKML:

SW-RAID0 Performance problems

Multiple Disk IDE performance problems...

Both involve the BX chipset. Unfortunately neither mentions a solution.

----------

## NoControl

This really is a pity. I'm going to try moving both disks to one IDE channel. That isn't optimal, either, but if it is possible to use DMA on two disks on one channel, I think the performance hit will be less than it is now. That'll have to wait until Friday, though.

Thank you, widan, for your answers!

----------

## eccerr0r

I've seen the same thing on my PIIX4/440BX board (Celeron-1200 @ 1364), except in my case with RAID 5. Since I was running RAID 5 I had to have an off-board controller, but either way — whether I ran everything off the PCI controller or spread the disks among the two — the total disk bandwidth simply would not exceed that of a single drive, and was usually less, even though the disks were on separate channels.

I think my 815 board might have been slightly faster, but the CPU in it (P3-1000) is slower. Too bad I can't stick my Tualatin in the 815 board to test with all other things being equal.

The same RAID, moved to an even faster Athlon (Athlon XP 2200) with its crappy SiS735 onboard IDE plus the same off-board controller, finally produced disk speeds approaching the theoretical maximum.

----------

## widan

 *eccerr0r wrote:*   

> whether I ran everything off the PCI controller or spread the disks among the two, the total disk bandwidth simply would not exceed that of a single drive, and was usually less, even though the disks were on separate channels.

 

If it does that also with a PCI controller, then the PIIX4 isn't the culprit (the devices inside PIIX4 are PCI devices themselves), it points more towards a PCI bandwidth issue.

Apparently there is a problem with 440-series chipsets and PCI... According to the first post in that thread it isn't possible to get 33 MB/s on PCI (so the available bandwidth is probably even less), and the second post indicates that the northbridge can't handle bursts of more than 8 dwords (i.e. DMA transfers to/from sequential addresses with address auto-increment), so DMA transfers of large amounts of data are not very efficient.
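As a purely illustrative back-of-the-envelope calculation (the real per-burst overhead on this northbridge isn't documented here — the ~3 idle cycles per burst is an assumption, and the true cost with multiple masters arbitrating is likely far higher):

```shell
# 33 MHz x 32-bit PCI peaks at 132 MB/s. If every 8-dword burst pays
# roughly 3 cycles of arbitration/address overhead, only 8 of every
# 11 bus cycles move data, so sustained throughput drops accordingly.
awk 'BEGIN {
  peak = 33 * 4                # MB/s: 33 MHz bus x 4 bytes per transfer
  eff  = peak * 8 / (8 + 3)    # 8 data cycles out of every 11 bus cycles
  printf "peak: %d MB/s, short-burst estimate: %d MB/s\n", peak, eff
}'
```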

----------

## eccerr0r

Likely it is the 440BX having trouble dealing with multiple masters. It _is_ possible to get 33 MB/sec through PCI: with individual disks I've easily gotten in excess of 50 MB/sec through my Promise Ultra66 PCI board. It's just that when there's more than one controller, it seems to have trouble filling the remainder of the theoretical 133 MB/sec with something useful. At least it's apparently stable and no corruption occurs, unlike the VIA KT133...

I'll need to see what happens with RAID 0, but IIRC I did see something weird when trying to run two hdparms on two disks on two channels: the total bandwidth doesn't add up.


----------

## eccerr0r

Heh. I tested two old Conner 400 MB disks here, with the two disks put together in RAID 0:

```
$ cat /proc/mdstat
Personalities : [raid0] [raid1]
md1 : active raid1 hdb1[0] hdd1[1]
      4352 blocks [2/2] [UU]
md0 : active raid0 hdb2[0] hdd2[1]
      816768 blocks 64k chunks

unused devices: <none>

$ hdparm -t /dev/hdb2 /dev/hdd2 /dev/md0

/dev/hdb2:
 Timing buffered disk reads:   4 MB in  3.76 seconds =  1.06 MB/sec

/dev/hdd2:
 Timing buffered disk reads:  10 MB in  3.17 seconds =  3.15 MB/sec

/dev/md0:
 Timing buffered disk reads:   8 MB in  3.28 seconds =  2.44 MB/sec
```

Both my 440BX and 440LX boards show similar results (both use a PIIX4-family 82371 southbridge).

Something really weird here...

----------

## NoControl

 *NoControl wrote:*   

> I'm going to try moving both disks to one IDE channel. That isn't optimal, either, but if it is possible to use DMA on two disks on one channel, I think the performance hit will be less than it is now.

 

Well, as expected, that didn't deliver better performance either. Before I could test it with one drive on DMA and the other on PIO (god forbid :o ), one of the two drives began showing signs that it was nearing its last spin-up. One of the disadvantages of old hardware, I guess :)

----------

