# What are my best options for setting up a cluster?

## Naughtyus

I need to set up a cluster system so that I can offer high availability for my servers.  The hardware I have is as follows:

2 x VMware servers

2 x 3TB RAID arrays (Promise VTrak M210p)

The SCSI RAID arrays are external enclosures which hold 8 drives.  Array 1 is attached to Server 1, and Array 2 is attached to Server 2.  Ideally I would like Array 2 to be an identical copy of Array 1.  I don't know if this is the best option, as I could also connect both Server 1 and Server 2 directly to both Array 1 and Array 2.

My end goal is to be able to disconnect any one component of the system and still have my users able to work as normal.  For example, if 5 hard drives in Array 1 were to die at once, I want my users to be able to keep working, with everything transparently running on Array 2 in the background.  Repeat for each of the components...

I will be doing the clustering at the virtual machine level, where each server will have a VMware client machine running Linux and several applications which require access to the RAID array.  So if I have VMware Client 1 on Server 1, I would like it to be in a cluster with VMware Client 2 on Server 2, and so on.

I've read several articles on open-source clustering options, and am still confused as to whether what I want to do is possible (or whether it is even a common setup).  From what I can tell my best course of action is to use OpenMosix, but I could really use some advice on how best to proceed in setting up the above system.

Thank you for any help you can offer.

----------

## Naughtyus

Any thoughts?

----------

## erik258

I too am looking to build a smallish cluster of old PIIs and PIIIs, and maybe some older Athlons.  I don't really need it for anything, but would like to get these older computers doing something... so I too look forward to hearing some ideas.

Naughtyus, I subscribed to the gentoo-cluster mailing list and although it's almost dead, the people there are really running clusters and seem very knowledgeable.  They were happy to give me some ideas.

It seems to me that OpenMosix is a poor choice for me, for this reason: I can't build 2.4 kernels with gcc >= 4.1, so I am stuck with 2.6 kernels, but the 2.6 kernel and userland utils are highly experimental for OpenMosix.  I would like to use OpenMosix, but not if it's poorly supported and uselessly unstable.  The nice thing about Mosix is (I think; it's been a while since I was running it) that it's easy to have your cluster help with odd jobs.

There's also general "high performance computing"; HPC, as it's known, is useful if you want to schedule tasks on different processors, but it looks like a lot of configuration and setup, and then you have to manually decide which jobs can be shared with which other hosts in the 'cluster'.  That doesn't even seem like a true cluster, more of a distributed computing paradigm.

Well there, I've said a whole bunch of unsupported stuff that isn't well grounded and may be wrong.  That's usually a good way to have someone more knowledgeable than myself chime in :)

----------

## Naughtyus

I wonder, would it make sense for me to set up my two RAID arrays in a mirror?  I'm getting the impression that this would make things a lot easier to set up.

----------

## erik258

I think the answer to a lot of these questions is dependent on what the cluster is going to be doing.  Any ideas?

----------

## Naughtyus

The Linux portion of it will be running Xythos, which is essentially a file-sharing service managed by Tomcat.  I may also choose to use Linux to serve several Samba shares.  Windows is currently used for Active Directory services, but this generally doesn't need to be attached to the RAID arrays.

----------

## zeek

 *Naughtyus wrote:*   

> The end goal I would like is to be able to disconnect any one of the components of the system and still have my users able to work as normal.  For example if 5 hard drives in Array 1 were to die at once, I want my users to be able to keep working, but in the background everything should be working on Array 2.  Repeat for each of the components...
 

This kind of failover requires cluster-aware applications.  OpenMosix is junk, btw.

Newer versions of VMware can migrate VMs between physical boxes when using shared storage.  Can you connect both servers to each disk array (e.g. over Fibre Channel)?

----------

## erik258

Everything could access the arrays if they were all on one box, running a NAS service with a fast (bonded?) Ethernet connection.
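
If you went that route, channel bonding on the NAS box is done with the kernel bonding driver.  A rough sketch — the interface names and address are placeholders, and the exact setup varies by distro:

```shell
# Load the bonding driver in round-robin mode, checking link state every 100 ms
modprobe bonding mode=balance-rr miimon=100

# Bring up the bond interface, then enslave two physical NICs to it
ifconfig bond0 10.0.0.10 netmask 255.255.255.0 up
ifenslave bond0 eth0 eth1
```

Whether you actually get ~2x throughput depends on your switch supporting it, so treat this as a starting point, not a guarantee.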

----------

## hoka

If you want the contents to be identical at the block-device level, you can use the Distributed Replicated Block Device (DRBD, http://www.drbd.org/) and set up something like IP failover.  Are you strictly attempting to have high-availability storage without dumping money into FC?  Are there other services you are trying to make HA?  Are you looking for an active/active or active/passive solution for these needs?
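
A minimal two-node DRBD resource looks roughly like this — the hostnames, IPs, and disk devices below are placeholders for your setup, and the exact syntax depends on your DRBD version:

```
# /etc/drbd.conf -- sketch of a single replicated resource
resource r0 {
  protocol C;               # synchronous: a write completes only after both nodes have it
  on server1 {
    device    /dev/drbd0;   # the replicated device you actually put a filesystem on
    disk      /dev/sda1;    # backing store: the local RAID array
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on server2 {
    device    /dev/drbd0;
    disk      /dev/sda1;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

You mount the filesystem on whichever node is currently primary, and let something like Heartbeat move a service IP between the nodes on failure.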

OpenMosix is used for process load balancing/availability.  So say you have a bunch of processes running some intensive computation; OM should be able to push the processes to unloaded machines, given certain constraints (IIRC the process has to run for ~5s or longer, among other things).

----------

## Naughtyus

The RAID arrays each have four SCSI connections, so each can be directly connected to up to four physical servers.  I don't particularly need the load balancing or clustering; I was just under the impression that this was the best way to go about creating the sort of environment I require (where a server can be shut off without having an effect on anything).

----------

## erik258

 *Quote:*   

> The raid arrays each have four scsi connections, so can each be directly connected to up to four physical servers.

 

I didn't realize that was possible.  Do you really want four different systems sharing disks?  It seems like that would be a data-integrity nightmare, but I guess it would be no different from a multi-processor system accessing the filesystem at the same time.

Just to be clear, your goal here is high-availability file storage, where all the servers can help when under heavy load, but when things are quiet you can turn some off or suspend them and only run what you need, and if there are any problems, some sort of 'roll-over' policy automatically goes into effect.  Right?

----------

## Naughtyus

Essentially, yes.  The load balancing and shared processing aren't explicitly needed.  The main thing I would like to achieve is for any one part of the system to be able to fail without having any effect on the overall system.

----------

## linuxtuxhellsinki

I think you've already read these, but here are the links anyway:

http://www.linuxjournal.com/article/9074

http://www.linux-ha.org/PressRoom

----------

## Naughtyus

Thank you for the links!

I've been doing some more research, and I'm trying to figure out whether Red Hat's GFS might apply to my situation.  Again, I'm finding it hard to tell from the docs whether GFS has been designed with redundant data stores in mind.  Different things in the docs make me think both that it will work and that it will not...

GFS was (at least) designed with the idea that each server will be directly connected to the data store, which is what I will be doing.  That at least is hopeful.  I am downloading a CentOS CD right now to see if I can answer my questions by playing with the software itself.  If anyone knows offhand whether GFS and Red Hat's cluster management might be a good fit for me, I'd appreciate any input.

DRBD also looks like a promising alternative.  My worry with it is that if I want to have each server responsible for a single task (DB, SMB, NFS, etc.), I'm under the impression that I will have to make a fixed partition size for each of these servers, and will be unable to have them all access the same total pool.  Is this correct?

----------

## mauricev

 *Quote:*   

> I'm under the impression that I will have to make a fixed partition size for each of these servers, and will be unable to have them all access the same total pool. Is this correct?

 

No, I don't think it is.  I am not using a shared cluster, but I have DRBD configured to match the partitions created on the hardware RAID array.  Those are all thrown into one big LVM physical volume.  I then make logical volumes out of that in the sizes I need, with ext3 on top.  The ext3 filesystems and logical volumes can be resized, pulling in capacity from the one big physical volume.  I believe that LVM is cluster-aware, so I think you'd just have to substitute GFS for ext3.
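
In other words, the layering is DRBD -> LVM -> filesystem.  A sketch of the commands, assuming a single replicated device /dev/drbd0 — the volume names and sizes are just examples:

```shell
# On the primary node: turn the replicated DRBD device into LVM storage
pvcreate /dev/drbd0
vgcreate vg0 /dev/drbd0

# Carve a per-service volume out of the shared pool, leaving free space behind
lvcreate -L 200G -n smb vg0
mkfs.ext3 /dev/vg0/smb

# Later: grow the volume from the remaining pool, then resize the filesystem
lvextend -L +50G /dev/vg0/smb
resize2fs /dev/vg0/smb
```

So the servers don't each get a fixed slice carved in stone; you resize logical volumes out of the common pool as needs change.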

----------

