# software raid over iscsi

## Cinquero

Does anyone have experience with an mdraid-over-iSCSI setup?

Scenario: two storage servers each export their storage via iSCSI (using iscsi-target). A third server then combines these two iSCSI devices into a software RAID1 array (/dev/md0, for example).

That actually works quite well -- at least until one of the storage servers dies, at which point the RAID1 simply hangs.

Is there any way to have the iSCSI device fail automatically so the RAID1 can continue with one failed disk?

If that is possible, how? And how would one proceed afterwards, i.e. how do you integrate a new iSCSI disk from another host into the RAID1 array as a replacement during live operation?
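For reference, the setup looks roughly like this (a sketch only -- the IQNs, portal addresses, and device names are examples, not my real configuration):

```
# Log in to the two iSCSI targets, one per storage server
# (example IQNs/portals)
iscsiadm -m node -T iqn.2007-01.example.com:store1 -p 192.168.1.10 --login
iscsiadm -m node -T iqn.2007-01.example.com:store2 -p 192.168.1.11 --login

# Assuming the LUNs show up as /dev/sdb and /dev/sdc, mirror them:
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
```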

Best regards,

Mark

----------

## jschellhaass

Have you looked at using sys-fs/multipath-tools?

jeff

----------

## Cinquero

No, I'm not really looking for multipath. I want to be able to replace storage devices completely, not just use a fallback route or replication in the backend.

----------

## HeissFuss

I would have mentioned DRBD, but you don't want backend replication or multipath/clustering (what's the reasoning for that, by the way?). From the mailing lists, it seems there are issues with MD RAID over iSCSI. You may want to try an LVM mirror instead; it may handle failure more gracefully.

----------

## Cinquero

I do not like DRBD because it wastes space just like RAID1 -- RAID5 would be better :).

Additionally, I'd like to be able to move the storage between hosts during live operation, without downtime... can you detach one DRBD side, destroy it, create a new slave on another host, and then re-sync it to the master during live operation?

----------

## HeissFuss

 *Quote:*   

> I do not like DRBD because it wastes space like RAID1

 

I thought the topic was about RAID1?

 *Quote:*   

> Additionally, I'd like to be able to move the storage around between hosts during live operation and without downtime... can you detach one drbd side, destroy it, create a new slave on another host, and then re-sync that one to the master during live operation?

 

Maybe I'm just not understanding the big picture of your setup.  When you're referring to moving storage, do you mean moving the target or moving the initiator?

I'm not going to pretend to know exactly how DRBD works, but my understanding is that it syncs its local storage with a remote server in a primary/failover fashion. If you made the two servers into a cluster, you'd publish the iSCSI target from the primary and connect to the HA IP from the client. In that case, removing one of the servers would be nearly transparent (there would be a brief period of inaccessibility when failing over to the secondary). That setup is fairly complex, though.

Another option would be to make both servers "primary", publish both devices over iSCSI, and use multipath on the client to connect. However, you'd need a cluster file system such as GFS for that, unless you were sure you'd only ever write to one at a time. In any of these cases you should be able to remove a node online.

Also, one of the large advantages I see with DRBD (besides it being designed for storage over IP) is that a mirror resync is not a full resync but transfers only the changes (I don't know how they'd manage that unless they keep a large transaction log pool -- it's worth looking into).

----------

## Cinquero

The big picture is an HA Xen VM with cheap storage that can be moved almost arbitrarily between machines during live operation.

Imagine a Xen VM with two (or more) legs -- unlike a human, with extra legs to spare: it can walk around, doing live migration of the VM *and* of its storage. You lift one leg (disconnect one iSCSI target) and step onto another machine by connecting to a new iSCSI target there and resyncing it.

That's basically what I'm wondering: is it possible at all?
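Concretely, the leg-swap I have in mind would look something like this (hypothetical IQNs and device names; whether the failed-member removal actually works is exactly the open question):

```
# 1. Lift one leg: drop one mirror half and log out of its target
mdadm /dev/md0 --fail /dev/sdb
mdadm /dev/md0 --remove /dev/sdb
iscsiadm -m node -T iqn.2007-01.example.com:store1 --logout

# 2. Step onto another machine: attach a LUN from a new storage host
iscsiadm -m node -T iqn.2007-01.example.com:store3 -p 192.168.1.12 --login

# 3. Add it to the array; md resyncs it against the surviving leg
mdadm /dev/md0 --add /dev/sdd
```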

----------

## Cinquero

I'm now facing the problem that device nodes of failed iscsi devices may disappear -- causing mdadm to throw the following error:

```
$ mdadm /dev/md0 --remove /dev/sdb
mdadm: cannot find /dev/sdb: No such file or directory
```

Any idea how to fix that?

--

UPDATE: from the mdadm ChangeLog:

```
Changes Prior to 2.6.2 release
    -   --fail detached and --remove faulty can be used to fail and
        remove devices that are no longer physically present.
```
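So with mdadm 2.6.2 or later, the vanished member can be dropped without naming its (now missing) device node -- a sketch, assuming a replacement iSCSI LUN shows up as /dev/sdd:

```
# Fail any members whose device nodes have disappeared,
# then remove the faulty ones (mdadm >= 2.6.2 keywords)
mdadm /dev/md0 --fail detached
mdadm /dev/md0 --remove faulty

# Hot-add a replacement iSCSI disk; md resyncs it
mdadm /dev/md0 --add /dev/sdd
```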

----------

