# USB Flash Drive RAID aka "Stupid RAID Tricks"

## eno2001

OK, I have to admit up front that what I'm doing is completely idiotic.  The only reason I'm doing it is... "because I can".  But I need a little info first.  I have four 2 GB flash drives that I got relatively inexpensively.  I've been experimenting with setting them up as a RAID the past few days.  They seem to work fine at the RAID0, RAID1 and RAID10 levels.  RAID5 performs poorly, but that comes as no surprise due to several factors that many of you are probably aware of, the most obvious being the single shared USB bus (which impacts all the RAID levels, but 5 seems hardest hit).
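For anyone curious, a minimal sketch of how arrays like these are created with mdadm (the device names are from my setup and are assumptions for anyone else; check `dmesg` after plugging the sticks in):

```shell
# Assumed device names for the four sticks -- substitute your own.
# RAID0 (plain striping, no redundancy):
mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sd[defg]

# RAID1 (all four mirrored):
mdadm --create /dev/md0 --level=1 --raid-devices=4 /dev/sd[defg]

# RAID10 (striped mirrors):
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[defg]
```

All of these need root, and each `--create` wipes whatever was on the sticks before.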

But here's where I'm not sure what's going on.  If I do a RAID0 array, the flash drives only "blink" their activity lights when there is disk read or write access.  But when I set up RAID1, RAID5 or RAID10, all four drives are constantly blinking regardless of whether they are being accessed.  So my questions:

1. Is all this access normal for RAID1, RAID5 and RAID10?

2. If so, what is it that they are constantly doing in the background since there is no data currently on the RAID yet?

3. Is this mostly read access or write access?

I ask the last question because flash drives have a limited number of write cycles.  If there is a lot of write activity from the software RAID, then the lifetime of the flash drive will be considerably shorter than with normal usage.  (Hence the stupidity of this whole endeavor  :Smile:  )  If it's just read activity, I think I can live with that.  My end goal is to use this array to carry a QEMU virtual machine around with me on a small but redundant device.  I don't actually expect the flash drives to really survive my bashing them in the end, but I'm having fun, and that's the main point.  Isn't it?    :Twisted Evil: 

----------

## eno2001

I went ahead and set up a RAID10 configuration.  After a while, the activity lights just stopped.  Perhaps there is some prep work the software RAID does as it is setting the array up?  I should also note that I shifted from ext3 to ReiserFS based on some reading I did about the performance of various file systems on RAID.  So it could also have been something specific to ext3?  Right now, the virtual machine that is using this RAID is up and running, but the flash drives appear to be quiesced.  So, I assume this means my drives CAN be set up as RAID10 with few repercussions.  Now we'll just see how long this VM continues to run before the flash drives die.  I plan to use the VM on a daily basis until it just goes.   :Smile: 

----------

## ziggysquatch

This is a very interesting project, and thanks for posting it.  If I get some free time I think I'm gonna try this.  I would like to know how long your flash drives last.

----------

## eno2001

I discovered that the drive activity lights were active because the array was being initialized.  I was unaware of the process before, but in looking at how to set Gentoo up on a RAID box I found this:

```
watch -n1 'cat /proc/mdstat'
```

That shows you a "progress bar" that tells you what your RAID is doing.  Specifically at the beginning it indicates when a RAID is being initialized.  That is why when I came in the next morning, the drive activity lights had stopped.  The initialization was complete.  So, it appears that normal use of the USB flash drive array isn't much different than utilizing a single USB flash drive other than a slight speed increase and fault tolerance.  So far, I'm enjoying this RAID10 array as the host of my VM.  To my VM it looks like /dev/hda1 and /dev/hdb1 thanks to the way that QEMU presents the logical volumes.  I'll try to explain the layering below:

Host OS USB flash drives:

- /dev/sdd
- /dev/sde
- /dev/sdf
- /dev/sdg

1. Created /dev/md0 as RAID10 using the four drives listed above

2. Created LVM2 volume group /dev/shadow on the RAID device /dev/md0

3. Created LVM2 logical volume /dev/shadow/boot (100 MB)

4. Created LVM2 logical volume /dev/shadow/root (the remaining space, nearly 4 GB)

Using QEMU, I pass /dev/shadow/boot to the -hda option and /dev/shadow/root to the -hdb option.  This means that the QEMU VM sees the 100 MB logical volume as /dev/hda and the 4 GB logical volume as /dev/hdb.  Within the VM, I then did a pretty standard Gentoo installation and partitioned /dev/hda and /dev/hdb.  I also pass /tmp/shadow.swap (a 512 MB file created with 'qemu-img') as -hdc so that my swap file is NOT on the USB array.  That way I can avoid killing the flash drives even faster!  As far as I'm aware, /proc and /sys are resident in RAM and don't hit the drives.  /var and /tmp can get busy I suppose, but I don't think as much as swap.
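Assuming the /dev/md0 array already exists, the LVM and QEMU layers described above look roughly like this (a sketch rather than my verbatim shell history; the names and sizes are the ones mentioned in the post):

```shell
# LVM2 on top of the RAID10 array:
pvcreate /dev/md0                      # mark the array as an LVM physical volume
vgcreate shadow /dev/md0               # volume group "shadow" (appears as /dev/shadow)
lvcreate -L 100M -n boot shadow        # becomes /dev/shadow/boot
lvcreate -l 100%FREE -n root shadow    # /dev/shadow/root gets the rest

# Swap lives in a plain file on the host, NOT on the flash array:
qemu-img create -f raw /tmp/shadow.swap 512M

# Hand everything to the VM:
qemu -hda /dev/shadow/boot -hdb /dev/shadow/root -hdc /tmp/shadow.swap
```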

I think most folks would question the performance of so many layers, but it works pretty well on this P4, and I've also proven it to be portable between my Gentoo boxes as long as I have RAID and LVM support built into the kernel and the 'mdadm' and 'lvm2' userspace utils installed.  Oh... and QEMU.  So far I'm liking this quite a bit.   :Razz: 
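Moving the array between boxes is then mostly a matter of letting the userspace tools rediscover it; a hedged sketch, assuming the kernel on the other box has md and device-mapper support built in:

```shell
mdadm --assemble --scan    # find and start the array from its superblocks
vgscan                     # rescan for LVM volume groups
vgchange -ay shadow        # activate the "shadow" volume group
qemu -hda /dev/shadow/boot -hdb /dev/shadow/root
```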

----------

## Cyker

ROFL! This is almost as cool as that Mac guy who tried to RAID a bunch of USB floppy disks together!  :Very Happy: 

----------

## syadnom

In theory, if you have multiple USB buses you could get a very fast drive setup.  I am going to give this a shot: I have 2 PCI USB2 cards and 2 USB chipsets on my motherboard.  I suspect that a dual-core processor is a must, since USB uses CPU time and so does software RAID.  I would have to say that RAID5 is not useful on flash keys, as they are prone to block failure rather than spindle failure, and the drives will have fault remapping on the chip.  Also, I would assume that using this on a production system is a bad idea, so please no comments on that.

I currently run a CF array with 4 CF cards @ 2 GB each on IDE->CF adapters.  I have 2 adapters with 2 CF slots each, so just 2 IDE channels and 2 cables.

The CF cards I have are the Crucial 2 GB ones.  With these in RAID0 on Ubuntu ( / ), I get 7.8 GB formatted, write performance of about 20-22 MB/s, and reads at 30-34 MB/s.  These are not the fastest CF cards for sure, probably just 20x.  My Lexar flash key is faster.
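If anyone wants to reproduce numbers like these, a rough benchmarking sketch (the mount point is hypothetical; be careful, writes to the raw device destroy the filesystem):

```shell
# Sequential read from the raw array device (read-only, safe):
hdparm -t /dev/md0
dd if=/dev/md0 of=/dev/null bs=1M count=512 iflag=direct

# Write test via a scratch file on the MOUNTED filesystem, not the raw device:
dd if=/dev/zero of=/mnt/flashraid/testfile bs=1M count=256 oflag=direct conv=fsync
rm /mnt/flashraid/testfile
```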

I guess the question is whether faster access with more drives on CPU-intensive USB2 buses will beat faster interfaces with fewer, slower devices.

----------

## JustinClift

Hiyas, was toying with this idea earlier today.

Apparently these days, read speeds of 30 MB/s are pretty common, so raiding a few together (RAID 0) could be interesting.

Guessing it depends upon the USB bus connections and paths internally.  If they're all using one or two USB buses, that could be a choke point.  :Sad: 

Also wondering if there are flash drives already around with eSATA as the connection?  (THAT could prove far more interesting!)

Actually did a search for that first, and all that turned up was this ADATA product in development:

http://www.engadget.com/2008/01/08/a-data-shows-off-badly-designed-esata-flash-drive/

Though personally I'm not bothered about the looks.  More interested in seeing if there's a "daily usage" performance boost.  :Wink: 

----------

## syadnom

I have done quite a bit of testing on this.  I did not do it on Gentoo, as I have moved on to other systems.

I have done it on Ubuntu with Linux md RAID0.

You MUST get ReadyBoost-capable or equivalent-speed drives.  Slower drives and RAID don't mix well: with 10x 5 MB/s flash drives you gain only something like 20% of the slowest drive's speed per drive, so 10x 5 MB/s drives is only around 15 MB/s.  Faster drives benefit much, much more; you can get the regular 50-70% out of them.  I have 5 SanDisk micro 2 GB flash keys, which can do about 20 MB/s read speed per drive.  This turns into a pretty reasonable 60 MB/s, BUT you have to spread those drives out over different USB controllers.  I have 2 onboard controllers plus a PCI card with 2 controllers; I put 2 drives on the PCI card, as it performs better than the 2 onboard.
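The scaling for slow drives works out like this (a toy model based on the rough figures above, not a measurement):

```shell
# Toy model: each drive beyond the first adds only ~20% of the
# slowest drive's speed when the drives themselves are slow.
per_drive=5    # MB/s, assumed speed of the slowest drive
drives=10
total=$(( per_drive + (drives - 1) * per_drive * 20 / 100 ))
echo "${total} MB/s aggregate"    # prints "14 MB/s aggregate", close to the ~15 MB/s above
```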

I have also tested this on Solaris with ZFS.  I know this is a Linux board, but I am just giving an idea of what the hardware can do, and that knowledge comes via Solaris and ZFS.

I use ZFS because it is a great, high-performance, easy-to-manage filesystem.  I watched the video (you can find it on YouTube or Google) about running a bunch of flash keys with ZFS and got the idea to try it out.  The great thing about ZFS is that it caches writes, which are what really slow down USB drives.  ZFS 'hides' the added IO processing time behind its caching system, and from the 5-disk SanDisk 'array' I can pull nearly 80 MB/s even though the drives are on USB!
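Setting up a striped pool like that in ZFS is just a couple of commands; a sketch with hypothetical Solaris-style device names (yours will differ):

```shell
# Stripe the five flash keys into one pool (no redundancy, like RAID0):
zpool create flashpool c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0
zpool status flashpool     # verify all five vdevs are online
zfs create flashpool/vm    # a dataset for the VM images
```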

The real solution here is to get some CF->IDE->SATA adapters.  I have made some just by gluing a CF->IDE adapter to an IDE->SATA adapter.  I have some SanDisk ULTRA III cards (2 GB) that I retired from my Canon XTi in favor of an ULTRA IV 16 GB card.  These Ultra III cards can easily sustain 20+ MB/s write and 25 MB/s read; 3 of them with these adapters in RAID0 can benchmark out at 60+ MB/s read using Linux md.  That is really about the same as a modern SATA disk, but the access times are so low that the system feels like it is running on nitrous.

The nice thing about CF cards is that they already handle the logic of spreading the write load over all sectors, minimizing sectors going bad from too many writes.  That makes them more reliable than SD or USB key drives, which lack that logic and would need a special filesystem (JFFS, YAFFS) to handle the sector load balancing.  With CF, you just format it ext3 or reiserfs and be done!

----------

## JustinClift

Heh, cool.

That all sounds pretty nifty, but what's the actual responsiveness of your system like when you've got it up and running?

What sort of CPU/RAM, etc do you have in your system?

I'm looking to get a new box in the near future (next week or two), with probably either a Q6600 or Q9450 (if it arrives), 8 GB RAM, etc.

Not a gaming PC as I have a PS3, more of a development workstation as I do a reasonable amount of coding, and I also use VMware Workstation a fair bit.

Wondering if there's reasonable boot and daily usage performance from having RAID 0 flash like this (CF may be the way to go like you mention), plus some RAID 0 (2 x 250GB?) hdd storage for swap, var, tmp, and VMware partitions.

Thinking of it as a "cheap man's SSD" until they're more reasonably priced?  :Smile: 

----------

## JustinClift

Looks like the CompactFlash Association (the CFA) started developing a spec for CF with a SATA interface in the middle of last year:

  "The CFA is Developing a Specification for a CompactFlash Card with a SATA Interface"

http://www.compactflash.org/pr/070717.pdf

----------

## JustinClift

And here's a SATA to compactflash adapter:

http://www.darkwire.com.au/html/sata_to_cf_card_adapter.html

Interesting eh? ;->

----------

## DeadlyNinja

We raided 6 flash drives and got some pretty good performance.

Flash Raid Experiment

----------

## JustinClift

Hey, good work there!   :Very Happy: 

Haven't gotten around to getting a new system yet - other priorities at the moment like getting Salasaga's first full release out the door.  :Smile: 

However, when I do, I'm really thinking about doing some raid benchmarks with PostgreSQL.  Thinking that with 4+ compact flash adapters, the transactions per second could actually be fairly "insane" from a price/performance perspective.

If that turns out to be true, then also thinking of doing some reliability benchmarking with raid 5 and raid 6.  Also interested in seeing just how easy it is to live upgrade a system (i.e. 4 x 8 GB -> 4 x 16GB) that's using CompactFlash and raid 6.  Given the reputation/nature of flash, raid 6 would probably be the only truly "production acceptable" starting point. Heh.  :Wink: 
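For the live-upgrade part, md can grow an array in place once every member has been swapped for a bigger one; a hedged sketch, assuming a 4-drive RAID6 at /dev/md0 with an ext3 filesystem directly on it (the device name is hypothetical):

```shell
# Repeat for each of the four cards, one at a time (RAID6 survives this):
mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1   # retire an 8 GB card
# ...physically swap in a 16 GB card, partition it, then:
mdadm /dev/md0 --add /dev/sdb1                       # let the array rebuild onto it

# Once all four members are 16 GB:
mdadm --grow /dev/md0 --size=max                     # claim the new capacity
resize2fs /dev/md0                                   # grow the filesystem to match
```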

----------

