# RAID-5 crash recovery (multiple disks failed)

## tpm

Hello,

My home server runs - sorry, now I must say ran - a RAID-5 consisting of six WD2000JD SATA disks (/dev/md2).

A few days ago one drive (/dev/sdd) began acting up (although apparently not broken), so I disconnected it from the controller card (Promise SATAII150 TX4).

On the next boot-up the RAID-5 started in degraded mode. But then the worst happened: another drive (/dev/sdb) suddenly failed while the file system (reiserfs) on top of the RAID-5 was mounted. I immediately turned the machine off.

I reconnected /dev/sdd (which surprisingly now works fine) and booted up again. /dev/md2 was offline and could not be started any more.

My idea was to use raidreconf to rebuild the RAID-5 with my old raidtab file (-o /etc/raidtab; see below) and a slightly modified new one (-n /etc/raidtab.new), in which I replaced "raid-disk 3" (which is /dev/sdd) with "failed-disk 3":

```
raiddev /dev/md2
 raid-level             5
 nr-raid-disks          6
 nr-spare-disks         0
 persistent-superblock  1
 chunk-size             64
 device                 /dev/sda4
 raid-disk              0
 device                 /dev/sdb4
 raid-disk              1
 device                 /dev/sdc4
 raid-disk              2
 device                 /dev/sdd4
 raid-disk              3
 device                 /dev/sde4
 raid-disk              4
 device                 /dev/sdf4
 raid-disk              5
```

The reconf process ran for ~14 hours and was nearly finished when, ouch - the system hung!

So I did a bit more investigation and found the lsraid tool. Here is the relevant output of 'lsraid -p':

```
[dev   9,   2] /dev/md/2        59B311A6.7BBEB7A2.1ECF6966.495DCEC6 offline
[dev   8,   4] /dev/sda4        59B311A6.7BBEB7A2.1ECF6966.495DCEC6 good
[dev   8,  36] /dev/sdc4        59B311A6.7BBEB7A2.1ECF6966.495DCEC6 good
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
[dev   8,  52] /dev/sdd4        59B311A6.7BBEB7A2.1ECF6966.495DCEC6 good
[dev   8,  68] /dev/sde4        59B311A6.7BBEB7A2.1ECF6966.495DCEC6 good
[dev   8,  20] (unknown)        59B311A6.7BBEB7A2.1ECF6966.495DCEC6 unknown
[dev   8,  20] /dev/sdb4        59B311A6.7BBEB7A2.1ECF6966.495DCEC6 unbound
[dev   8,  84] /dev/sdf4        59B311A6.7BBEB7A2.1ECF6966.495DCEC6 unbound
```

As you see, lsraid wants to tell me that /dev/sdd4 (the first drive that failed) is "good", while /dev/sdb4 (the second drive that failed) _and_ /dev/sdf4 (oh, it failed too?!) are "unbound". I also don't know why "[dev   8,  20]", which is obviously /dev/sdb4, is listed twice ("unknown" and "unbound"), or why it says "/dev/md/2" instead of "/dev/md2" - but in any case the two "missing" entries don't look good.
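For reference, the classic kernel numbering for SCSI/SATA disks makes those [dev major, minor] pairs decodable by hand: under major 8, each disk occupies 16 consecutive minors, and minor = 16 * disk_index + partition. A small sketch of that mapping (plain Python, nothing array-specific assumed):

```python
# Decode classic [major, minor] pairs for sd devices (major 8):
# each disk gets 16 consecutive minors; minor 0 of a disk is the
# whole device, minors 1-15 are its partitions.
def sd_name(major, minor):
    if major != 8:
        raise ValueError("only the first sd major (8) is handled here")
    disk, part = divmod(minor, 16)
    name = "/dev/sd" + chr(ord("a") + disk)
    return name if part == 0 else name + str(part)

for pair in [(8, 4), (8, 20), (8, 36), (8, 52), (8, 68), (8, 84)]:
    print(pair, "->", sd_name(*pair))
# (8, 20) -> /dev/sdb4, (8, 52) -> /dev/sdd4, and so on.
```

So "[dev 8, 20]" in the listing is /dev/sdb4, and "[dev 8, 52]" is /dev/sdd4.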

'lsraid -R -p' outputs the following (excerpt):

```
# md device [dev 9, 2] /dev/md/2 queried offline
# Authoritative device is [dev 8, 4] /dev/sda4
raiddev /dev/md/2
raid-level              5
nr-raid-disks           6
nr-spare-disks          0
persistent-superblock   1
chunk-size              64
device                  /dev/sda4
raid-disk               0
device                  /dev/sdc4
raid-disk               1
device                  /dev/sdd4
raid-disk               4
device                  /dev/sde4
raid-disk               5
device                  /dev/null
failed-disk             2
device                  /dev/null
failed-disk             3
```

Ouch again! My raidreconf attempt probably mixed things up even more. Since further partitions of /dev/sdd (the first disk to fail) and (!) /dev/sdb (the second disk to fail) are part of other (working) RAID devices, these disks aren't broken. So I wonder why the md module kicked them out of the arrays at boot time, but is now willing to take /dev/sdd4 back...

My question is: How can I rebuild that RAID-5 without losing too much (or all) data? The approach described at http://software.cfht.hawaii.edu/linuxpc/RAID_recovery.html#mutiple looks good, but I'm not sure about marking /dev/sdd4 as failed in /etc/raidtab (which I already tried using raidreconf) - although it surely failed at first, lsraid now says /dev/sdd4 is just fine, in contrast to /dev/sdb4 and /dev/sdf4... Any ideas are welcome.

Kind regards and thanks in advance

----------

## groovin

Sorry, this isn't a solution to your problem, but I just thought I'd ask: could your SATA controllers be toast? Or running old firmware that doesn't mesh with your setup?

----------

## cummings66

I'm curious as to whether you can save anything, because as I understand it, RAID-5 is great at saving your data if one drive fails, but if two fail you're out of luck and will lose stuff.
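That single-failure limit follows directly from the arithmetic: RAID-5 keeps one parity chunk per stripe, which is just the XOR of the data chunks, so exactly one missing chunk per stripe can be recomputed. A toy illustration (not tied to md's actual on-disk layout):

```python
from functools import reduce

# Toy RAID-5 stripe: parity is the XOR of all data chunks, so any
# ONE missing chunk can be rebuilt by XOR-ing everything that's left.
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE"]   # 5 data chunks
parity = reduce(xor, data)                             # the 6th chunk

# Lose chunk 2, then rebuild it from the survivors plus parity:
survivors = [c for i, c in enumerate(data) if i != 2] + [parity]
rebuilt = reduce(xor, survivors)
print(rebuilt)  # b'CCCC'
# Lose TWO chunks and the single XOR equation can no longer
# distinguish them - that's the two-drive-failure wall.
```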

My personal thoughts are that you're out of luck because you had 2 drives go offline at once, plus you then messed with the config and then had it crash on you during a rebuild.  I can't think of anything worse that could happen that would cost you your data.

For those naysayers who say you should have had hardware raid, let me start by saying that I've had a hardware raid system go belly up and also lock up.  In that case what happened was the controller started dying, it sounds very familiar as to what you just described, so maybe the previous poster has a thought that's on target.

----------

## tpm

@groovin: A good thought. I visited the Promise download page and although the controller cards are fairly new, there is a firmware update available (which I applied just now), provided with the interesting note "Enhanced the scheme to prevent unknown plug/unplug interrupt."... Furthermore, it's rather unlikely that two (or even more) pretty fresh hard disks die at the same time, especially since they work "again" and don't seem to be really broken or damaged... It really does look like the controllers are the culprit; I hope the firmware update pacified them a bit :/...

@cummings66: In principle you are right: more than one failed drive is, to understate it, "not really good" for a RAID-5. But seeing that the drives are not physically damaged (at least I think so), there is still a spark of hope in me. The underlying RAID structure (superblocks et cetera) is messed up, as one can see, though in theory the file system need not be totally broken - even if "If an error occurs during reconfiguration, a power failure for example, restore from backup (you DID make a backup, right?), and try again." from the raidreconf manual doesn't sound reassuring... At least some of the (sadly not unimportant) data must be recoverable; giving up is not an option for me...

Either way: Thanks for the thoughts!

----------

## tpm

I ran 'mdadm --examine' on each drive and the output showed what I expected:

On /dev/sd[acef]4 it looked like:

```
          Magic : a92b4efc
        Version : 00.90.01
           UUID : 59b311a6:7bbeb7a2:1ecf6966:495dcec6
  Creation Time : Tue Apr 12 07:20:24 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 5
Preferred Minor : 2
    Update Time : Tue Jun  7 14:42:14 2005
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 3
  Spare Devices : 0
       Checksum : 76f2fa3e - correct
         Events : 0.249929
         Layout : left-asymmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8        4        0      active sync   /dev/sda4
   0     0       8        4        0      active sync   /dev/sda4
   1     1       8       36        1      active sync   /dev/sdc4
   2     2       0        0        2      faulty removed
   3     3       0        0        3      faulty removed
   4     4       8       52        4      active sync   /dev/sdd4
   5     5       8       68        5      active sync   /dev/sde4
```

... on /dev/sdd4:

```
          Magic : a92b4efc
        Version : 00.90.01
           UUID : 59b311a6:7bbeb7a2:1ecf6966:495dcec6
  Creation Time : Tue Apr 12 07:20:24 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 2
    Update Time : Wed May 18 17:09:49 2005
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 76d82624 - correct
         Events : 0.230320
         Layout : left-asymmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       52        3      active sync   /dev/sdd4
   0     0       8        4        0      active sync   /dev/sda4
   1     1       8       36        1      active sync   /dev/sdc4
   2     2       8       20        2      active sync   /dev/sdb4
   3     3       8       52        3      active sync   /dev/sdd4
   4     4       8       68        4      active sync   /dev/sde4
   5     5       8       84        5      active sync   /dev/sdf4
```

... and on /dev/sdb4:

```
          Magic : a92b4efc
        Version : 00.90.01
           UUID : 59b311a6:7bbeb7a2:1ecf6966:495dcec6
  Creation Time : Tue Apr 12 07:20:24 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 5
Preferred Minor : 2
    Update Time : Tue Jun  7 14:27:14 2005
          State : active
 Active Devices : 5
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 76ef2670 - correct
         Events : 0.249925
         Layout : left-asymmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       20        2      active sync   /dev/sdb4
   0     0       8        4        0      active sync   /dev/sda4
   1     1       8       36        1      active sync   /dev/sdc4
   2     2       8       20        2      active sync   /dev/sdb4
   3     3       0        0        3      faulty removed
   4     4       8       52        4      active sync   /dev/sdd4
   5     5       8       68        5      active sync   /dev/sde4
```
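The Events counters in those superblocks are the telling part: md bumps the counter on every array state change, so the members with the highest count carry the freshest view of the array, and anything lower is stale. A sketch of that comparison, with the counts copied from the dumps above:

```python
# Events counters from the 'mdadm --examine' dumps above; the members
# with the highest count saw the array's most recent state.
events = {
    "/dev/sda4": (0, 249929),  # same value on sdc4, sde4, sdf4
    "/dev/sdb4": (0, 249925),  # kicked out a few events before the end
    "/dev/sdd4": (0, 230320),  # stale: last updated Wed May 18
}

freshest = max(events.values())
stale = sorted(dev for dev, ev in events.items() if ev < freshest)
print("freshest:", freshest)
print("stale members:", stale)  # sdb4 is slightly behind, sdd4 badly so
```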

So I ran 'mkraid --force /dev/md2'. However, that simply didn't work.

'mdadm --create /dev/md2 -l5 -n6 /dev/sda4 /dev/sdb4 /dev/sdc4 missing /dev/sde4 /dev/sdf4' worked quite well (although there were warnings that two of the partitions would not only be valid raid devices but would also contain reiserfs file systems :/): /dev/md2 appeared in /proc/mdstat again.
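One caution with re-creating an array over live data like this: the new superblocks must describe exactly the old geometry (device order, chunk size, and layout - here "left-asymmetric" per the dumps above), otherwise the data/parity mapping shifts and the file system on top becomes unreadable. A sketch of where left-asymmetric RAID-5 places parity, to illustrate why the device order in 'mdadm --create' matters (my reading of the md layout algorithm, not authoritative):

```python
# RAID-5 "left-asymmetric" layout: the parity chunk rotates from the
# last device downward; data chunks fill the remaining slots in plain
# device order for every stripe.
def stripe_layout(stripe, ndisks):
    parity = (ndisks - 1) - (stripe % ndisks)
    slots, d = [], 0
    for disk in range(ndisks):
        if disk == parity:
            slots.append("P")
        else:
            slots.append("D%d" % d)
            d += 1
    return slots

for s in range(3):
    print(s, stripe_layout(s, 6))
# Shuffle the device order (or change the layout) in 'mdadm --create'
# and every Dn/P chunk lands on the wrong disk.
```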

Thus I started 'fsck.reiserfs --check /dev/md2'; unfortunately that tool did not find any reiserfs file system. Hence I first ran 'fsck.reiserfs --rebuild-sb /dev/md2', which advised running 'fsck.reiserfs --rebuild-tree /dev/md2' afterwards. At the latest before rebuilding the whole file system tree, one should certainly make a backup. That's all very well to say, but how do you easily back up ~900 GB?

'fsck.reiserfs --rebuild-tree /dev/md2' ran well for some hours... Then it suddenly stopped with the message '... bad sectors ... hardware problem ...'. Because 'badblocks -v /dev/sd[abcdef]4' reported not a single bad block, I checked /proc/mdstat - Doh! One drive had "failed" again in the meantime, although the drive itself is in fact not defective. Mean seedy SATA controllers...

----------

## groovin

Ouch, I'm sorry to hear that. If you have 900 GB of data, perhaps you should invest in a reliable hardware SATA RAID card like a 3ware Escalade? Those things are great; I run them all over the place, and as long as you keep 3DM and the firmware updated, they are solid.

----------

