# [solved] Best practice with RAID-5, LVM, and EXT3?

## shepmaster

Hey all, I have 3x750 GB drives in the mail. My plan is to set them up in a RAID-5 and put that drive into LVM. I will then section out a few chunks of the LVM drive for things like /home and /media.

I have read a few differing accounts as to ext3 stripe size; some say it's crucial, some say it has no effect.

I was curious to get some feedback from people who have a similar setup, and, if possible, actual commands and calculations needed.

Thanks!

Last edited by shepmaster on Mon Dec 15, 2008 4:14 pm; edited 1 time in total

----------

## richard.scott

Does this thread help:

http://www.linuxforums.org/forum/coffee-lounge/121795-stripe-size-should-i-use-best-performance.html

----------

## redgsturbo

 *shepmaster wrote:*   

> Hey all, I have 3x750 GB drives in the mail. My plan is to set them up in a RAID-5 and put that drive into LVM. I will then section out a few chunks of the LVM drive for things like /home and /media.
> 
> I have read a few differing accounts as to ext3 stripe size; some say it's crucial, some say it has no effect.
> 
> I was curious to get some feedback from people who have a similar setup, and, if possible, actual commands and calculations needed.
> ...

 

I've run (and am still running) the same setup with ReiserFS v3 and XFS. No complaints here.

----------

## shickapooka800

Since you would be using LVM on top of RAID, would the chunk sizes translate the same as for raw partitions?

----------

## shepmaster

 *redgsturbo wrote:*   

> I've run/running the same with reiserfs v3, and xfs.  No complaints here

 

Thanks. Did you do any tuning of the RAID / LVM / FS layers? I guess that's what my real question is. :)

----------

## shepmaster

 *shickapooka800 wrote:*   

> Since you would be using LVM on top of RAID, would the chunk sizes translate the same as for raw partitions?

 

That's one of the things I am trying to figure out. I've read various things that seem to indicate you need to do complicated trickery to account for how LVM's byte 0 is offset from the RAID's byte 0.
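For what it's worth, the alignment concern boils down to simple arithmetic: the offset of the first physical extent (which you can read with `pvs -o+pe_start`) should be a multiple of the RAID chunk size. A rough sketch of the check, using hypothetical values matching this setup:

```shell
# Check whether the first LVM physical extent is aligned to the RAID chunk.
# pe_start_kb is a hypothetical value you would read from `pvs -o+pe_start`.
pe_start_kb=256
chunk_kb=128    # RAID chunk size used in this thread

if [ $((pe_start_kb % chunk_kb)) -eq 0 ]; then
    echo "aligned"
else
    echo "misaligned"
fi
# aligned
```

If it comes out misaligned, writes that the filesystem believes are stripe-aligned will actually straddle chunk boundaries in the array.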

----------

## shepmaster

 *richard.scott wrote:*   

> Does this thread help:
> 
> http://www.linuxforums.org/forum/coffee-lounge/121795-stripe-size-should-i-use-best-performance.html

 

Kind of - that poster is asking similar questions, and doesn't seem to have an answer one way or the other. The link in that post goes to a paragraph about FS block sizes, which is an important consideration, but one that I think is orthogonal to the RAID / LVM / FS chunk and striping size.

----------

## redgsturbo

 *shepmaster wrote:*   

>  *redgsturbo wrote:*   I've run/running the same with reiserfs v3, and xfs.  No complaints here 
> 
> Thanks. Did you do any tuning of the RAID / LVM / FS layer? I guess that's what my real question is.

 

Nope. I just used the defaults for everything.

----------

## overkll

I remember finding an article on this...  Let's see if I can google it up... Yup, here it is:

Stripe Width and Stripe Size

----------

## shepmaster

 *overkll wrote:*   

> I remember finding an article on this...  Let's see if I can google it up... Yup, here it is:
> 
> Stripe Width and Stripe Size

 

Thanks for that link, it helps with setting the RAID parameters. However, I am also interested in how to tune LVM and the filesystem to best take advantage of the underlying RAID configuration.

----------

## shepmaster

 *shepmaster wrote:*   

> ...I have read a few differing accounts...

 

I wanted to share some of the things I have found about choosing the appropriate settings for the filesystem:

ext3

- Post on settings needed all the way from RAID to FS.
- Post on the math to calculate stride values.
- Calculator that does the stride sizes for you, shows command to run.

xfs

- Post on how to set stride size.

----------

## shepmaster

Here is what I ended up doing:

I have 3x750 GB drives, all SATA.

Partitions

I used this partition scheme for each drive:

- 256 MB - RAID 1 - /boot
- 1 GB - swap
- Remainder of drive - RAID 5 - LVM physical volume

Create RAID devices

```
$ mdadm --create /dev/md0 --chunk=128 --level=1 --raid-devices=3 /dev/sd[abc]1 

$ mdadm --create /dev/md1 --chunk=128 --level=5 --raid-devices=3 /dev/sd[abc]3
```

Boot and swap drives

```
$ mke2fs /dev/md0

$ mkswap /dev/sda2
$ mkswap /dev/sdb2
$ mkswap /dev/sdc2

$ swapon -v -p 1 /dev/sda2
$ swapon -v -p 1 /dev/sdb2
$ swapon -v -p 1 /dev/sdc2

$ cat /proc/swaps
```

Using -p 1 ensures that the swap devices are all set to the same priority, which will cause the kernel to use them in a round-robin fashion, similar to a RAID0.
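To make the round-robin swap survive a reboot, the same priority can be set in /etc/fstab. A sketch assuming the partition layout above (adjust device names to taste):

```
/dev/sda2   none   swap   sw,pri=1   0 0
/dev/sdb2   none   swap   sw,pri=1   0 0
/dev/sdc2   none   swap   sw,pri=1   0 0
```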

LVM volumes

```
$ pvcreate /dev/md1

$ vgcreate --physicalextentsize 128M main /dev/md1 

$ lvcreate --size 300G --name media main
```

Repeat the last command for each volume you want - I had ones for /home, /mail, etc.

XFS

```
$ mkfs.xfs -d su=128k -d sw=2 -l su=128k /dev/main/media
```

This was the trickiest part to figure out. Use -d su to set the stripe unit (in this case 128k, the chunk size set when creating the RAID). Use -d sw to set the number of data disks; RAID-5 has n-1 data disks, so I used 2. You can use -l su to set the log stripe unit to the same value as the data stripe unit.

EXT3

```
$ mke2fs -b 4096 -j -O sparse_super -E stride=32,stripe-width=64 /dev/main/name
```

I found it easiest to use the calculator to get the right filesystem settings. The short version is that stride should be the RAID chunk size divided by the filesystem block size. The stripe-width is the stride multiplied by the number of data disks.
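The arithmetic is simple enough to check by hand; a quick sketch with the numbers used above:

```shell
# ext3 stride/stripe-width from the RAID parameters used in this setup
chunk_kb=128     # RAID chunk size (set at mdadm --create time)
block_kb=4       # ext3 block size (4096 bytes)
data_disks=2     # a 3-drive RAID-5 has n-1 = 2 data disks

stride=$((chunk_kb / block_kb))
stripe_width=$((stride * data_disks))
echo "stride=$stride stripe-width=$stripe_width"
# stride=32 stripe-width=64
```

Those are exactly the values passed to mke2fs above.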

Miscellaneous tuning

I played around a bit with different RAID chunk sizes, XFS vs EXT3, and various system settings, testing each combination with iozone. A chunk size of 128K coupled with XFS seemed best for my workload. Internet wisdom says you should test with your own workload. You will probably get decent performance if you don't, however.

While you are writing to the RAID device, take a look at /sys/block/mdX/md/stripe_cache_active and compare the value to /sys/block/mdX/md/stripe_cache_size. If stripe_cache_active is continuously maxed out during writes, increase stripe_cache_size:

```
$ echo 8192 > /sys/block/mdX/md/stripe_cache_size
```

----------

## piwacet

Thanks for this, it is very helpful.

One question.  For this stride and stripe calculator,

http://busybox.net/~aldot/mkfs_stride.html

How do you figure out the value that should go into "number of filesystem blocks (in KiB)"?

----------

## shepmaster

 *piwacet wrote:*   

> How do you figure out the value that should go into "number of filesystem blocks (in KiB)"?

 

I don't think I changed that value, but I don't honestly remember. I think that it is mislabeled - I think it really means "size of a filesystem block". 4 KiB is a size and unit that doesn't make any sense for "number of blocks", but it does make sense for "size of a block".

----------

## piwacet

Thanks, I've been thinking and reading about this, and I believe you're right that it's mislabeled. I'll just leave it at 4.

Thanks!

----------

