# How would you build dedup'ed, networked, mirrored storage?

## DingbatCA

Like most of these problems, I have little to no budget, and management would like me to pull off an act of God. The good news: I love these problems.

VMs... Lots of VMs. Starting around 100, and growing with time/money to 5000+. Just think web farm. I need to build up the storage infrastructure behind it all.

Starting simple with two storage nodes, each with a single 80GB disk. Later on the storage nodes will be monster RAID systems. I don't need a truly distributed network file system, just a mirror in an active/active configuration. I need to be able to spread the load across many (2+) storage servers.

So here is what I tried...

http://www.batbuilds.com/~adam/layout.png

Turns out GlusterFS sucks. Even when directly mounted (single storage node, going to localhost) on a local RAM drive, I could never get speeds past 1.2MB/s... Spent MANY hours working with Gluster support to get around this. It also will not function on top of LessFS. Two strikes against Gluster, so I give up. Is there a better way? I would really like to keep the compression/deduplication given by LessFS, but it is not a requirement.
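Whether dedup is worth fighting for depends on how much duplicate data the VM images actually contain. A rough way to check is to hash fixed-size blocks and compare total blocks to unique blocks. The sketch below is only an estimate under stated assumptions: the 4096-byte block size and SHA-256 are picked for illustration, and the names `read_blocks` and `dedup_ratio` are made up here. LessFS's own block size and hashing may differ.

```python
import hashlib
from typing import Iterable, Iterator

BLOCK_SIZE = 4096  # assumed block size, not taken from LessFS


def read_blocks(path: str, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield a file's contents as consecutive fixed-size blocks."""
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            yield block


def dedup_ratio(blocks: Iterable[bytes]) -> float:
    """Return total blocks divided by unique blocks (1.0 means no savings)."""
    unique = set()
    total = 0
    for block in blocks:
        # Identical blocks hash to the same digest, so the set keeps one copy.
        unique.add(hashlib.sha256(block).digest())
        total += 1
    return total / len(unique) if unique else 1.0
```

Feeding it the blocks of several VM images cloned from the same template should give a ratio well above 1.0; a ratio near 1.0 suggests dedup would buy little for that data.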

Ideas?

----------

## BitJam

The folks at BackBlaze have put up instructions on how they build their storage servers: Petabytes on a budget: How to build cheap cloud storage.

The bottom line is:  67 terabyte 4U servers for $7,867 each.

----------

## DingbatCA

That's just cool!

But they fail to talk about how they pull it all together. "Custom software" does not do me much good. Really cool, but I'm still looking for something I can build in-house.

----------

## trepanne

you might want to check out OpenSolaris.  IIRC they've open-sourced the clustering code, and ZFS now includes deduplication.

try googling around for presentations by textdrive/joyent around the time they were migrating from FreeBSD to OpenSolaris.  i think you want a similar infrastructure (they run a lot of VMs virtualized across lots of hosts, including a metric assload of hard disks)

----------

## Mad Merlin

I'm constantly disappointed by the state of networked filesystems. They all seem to have major drawbacks. What I truly want is basically network RAID 10 without a single point of failure: increased reliability, performance and storage space, just like local RAID 10. It would be a no-brainer to implement (provided you have the hardware). Unfortunately, it doesn't exist (or it's hiding from me very well).

The good news is that Ceph looks to have exactly this in its crosshairs, and it's already merged into the vanilla kernel. The bad news is that it's still very experimental code; the authors strongly recommend that you not use it in production yet.

A not particularly close second is Lustre (not Gluster), but the fact that it does not mirror data and requires a custom kernel (heavily patched RHEL 5... outrageously ancient by today's standards) makes it a complete non-starter for me.

For small scale systems, you can use GFS2 or OCFS2 on top of an active/active DRBD, but DRBD only supports exactly two nodes. The plus is that GFS2, OCFS2 and DRBD are all in the vanilla kernel.

Data deduplication is a very minor concern for me, so I haven't looked into that at all, but if LessFS is just another layer on top of the filesystem, it seems like it should work with any of the above options.

----------

## DingbatCA

So it turns out that GlusterFS wants SETXATTR support, which LessFS does not provide. I am no programmer, but there are days I wish I was. I asked the LessFS folks for help. Perhaps they will be kind and throw me a bone.

```
unique: 6, opcode: LOOKUP (1), nodeid: 1, insize: 44

LOOKUP /VMs

getattr /VMs

NODEID: 2

unique: 6, success, outsize: 144

unique: 7, opcode: SETXATTR (21), nodeid: 2, insize: 79

unique: 7, error: -38 (Function not implemented), outsize: 16

```
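Error -38 in the trace above is `ENOSYS` on Linux, meaning the filesystem never implemented the call. A quick way to check any mount point for the same problem before layering something on top of it is to try setting an extended attribute on a scratch file. This is a minimal sketch, not how GlusterFS itself probes: the function name `supports_setxattr` and the attribute name `user.xattr_probe` are made up for illustration, and it tests the `user.` namespace, which may not be the namespace a given application writes to.

```python
import errno
import os
import tempfile
from typing import Optional

PROBE_ATTR = "user.xattr_probe"  # hypothetical attribute name, used only for this probe


def supports_setxattr(directory: str) -> Optional[bool]:
    """Return True if setxattr works in `directory`, False if the filesystem
    reports it as unimplemented/unsupported, None if the probe is inconclusive."""
    if not hasattr(os, "setxattr"):
        return None  # os.setxattr is only available on Linux
    unsupported = {errno.ENOSYS, errno.ENOTSUP, errno.EOPNOTSUPP}
    try:
        fd, path = tempfile.mkstemp(dir=directory)
    except OSError:
        return None  # could not even create a scratch file here
    os.close(fd)
    try:
        os.setxattr(path, PROBE_ATTR, b"1")
        return True
    except OSError as exc:
        if exc.errno in unsupported:
            return False  # e.g. ENOSYS (38), as in the trace above
        return None  # some other failure (permissions, quota, ...)
    finally:
        os.unlink(path)  # always remove the scratch file
```

Calling `supports_setxattr` on the LessFS mount point (whatever path that is on your box) should return `False` if the mount behaves as in the trace.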

Ceph looks VERY cool; I've been watching it for the last few months. Wish it was a bit farther along.

----------

## linuxtuxhellsinki

 *trepanne wrote:*   

> you might want to check out opensolaris.  

 

It's dead.

----------

## DingbatCA

Has anyone looked into TwistedStorage? http://twistedstorage.sourceforge.net/index.html

----------

