# Scalable Storage Clustering with redundancy? What to use?

## humbletech99

I'm looking at implementing a large, scalable storage solution to replace lots of file servers. It needs to meet the following requirements:

1. Redundancy

2. Scalability/Expandability

3. Performance

4. Single Directory tree for all storage

5. Preferably runs on Linux, my favourite.

6. Preferably on commodity hardware that is easy to replace/add, although not necessary

Ideally I'd like something like a global filesystem such as AFS or Microsoft's DFS, where everything is organized into one directory tree, but with the redundancy and data security of a cluster filesystem, so that if one or more nodes break it still works and we just put in new nodes. I also need to be able to extend it by adding new nodes so that the storage can grow indefinitely.

At the moment my storage requirements are 15-30TB, but this will increase, so I need to be able to add more space by adding more nodes.
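To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 4TB of raw disk per node, the two copies of every file and the 80% fill limit are all made-up assumptions for illustration, not figures from any particular product.

```python
import math


def nodes_needed(usable_tb, raw_tb_per_node, replicas, fill_ratio=0.8):
    """Estimate how many storage nodes are needed for usable_tb of data.

    All inputs are planning assumptions:
      raw_tb_per_node - raw disk capacity of one commodity node
      replicas        - number of copies kept of every file
      fill_ratio      - how full the cluster is allowed to get (0-1)
    """
    raw_needed = usable_tb * replicas / fill_ratio
    return math.ceil(raw_needed / raw_tb_per_node)


# Hypothetical sizing for the 15-30TB range mentioned above.
for usable in (15, 30):
    count = nodes_needed(usable, raw_tb_per_node=4, replicas=2)
    print(f"{usable}TB usable -> {count} nodes")
```

Changing `replicas` or `raw_tb_per_node` shows how quickly the node count moves, which is why being able to add nodes later matters more than getting the initial size exactly right.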

What I think I am really after is a Cluster FileSystem, something like the Google File System, except that is not available cos Google are evil and eat up lots of talented open source people and keep cool stuff like that to themselves.

Ideally the solution should also be fast and reliable, so that I can crunch data on it from other servers.

Of course it doesn't __have__ to be open source or unixy or anything, but it would be nice...

Any ideas?

----------

## alex.blackbit

hi,

In a German IT magazine called iX there was quite a good article on this topic some time ago, but since you live in London I guess you won't be able to read it.

DFS is, well... okay, but the underlying CIFS protocol cannot be clustered in the current implementations from Microsoft and Samba, BUT Samba is currently working on that in a separate CVS tree. AFAIK AFS is the best solution for large systems. Clients exist for all important systems and the servers can be clustered. I never touched it myself, but I read only good things about it.

----------

## humbletech99

thanks, I had started reading up on AFS, although I think Coda is newer... I'll keep reading.

----------

## richard.scott

How did you get on with your research? 

Have you started to use Coda or OpenAFS?

----------

## humbletech99

no, it looks like we'll have to buy a proprietary solution since I'm not sure any of the open source options are rock solid enough...

----------

## sf_alpha

I am not sure, but I think GFS + GNBD is suitable.

The GFS client layer itself provides fail-over if any GNBD server fails.

But the GNBD servers must all be able to access the same storage device (via SAN, multipath, etc.).

GNBD servers that share the same target maintain consistency using fencing, but it must be properly configured.

----------

## humbletech99

yes, but by the time you have a SAN, you've paid for the proprietary solution...

I wanted a clustered filesystem that could just scale out on white box hardware...

----------

## nwmcsween

Ceph is what you're looking for.

----------

## gentoo-freak

I was wondering whether any information about Ceph on Gentoo has been posted here or somewhere else... I found only a few mailing lists and no good information about it...

Has anyone played around with Ceph on Gentoo?

greetz nerds

----------

## vaxbrat

Just finished setting up ceph on top of btrfs on a few servers as an experiment.  However their site is all gung-ho about using ceph-deploy, but the zip I pulled today doesn't want to work under either python 2.7 or python 3.2

----------

## gentoo-freak

 *vaxbrat wrote:*   

> Just finished setting up ceph on top of btrfs on a few servers as an experiment.  However their site is all gung-ho about using ceph-deploy, but the zip I pulled today doesn't want to work under either python 2.7 or python 3.2

 

do you have an update on this topic?

cheers

----------

