# [solved] RAID array crashed, data is clean. Maybe not so much

## tipp98

Hi, I have a 4-disk RAID5 array running on PMP hardware. I think the mistake I made was turning on the external enclosure while the host system was running, resulting in one of the drives not being added to the array due to slow spin-up. Two minutes later another drive lost communication due to a failure to read SCR. I am still trying to figure out what that means, but I am hopeful that it was also caused by the poor startup. Some of my logs can be found at the bottom of Post #7084980.

I know the file system is clean because the array went down cleanly and no data was written before it collapsed.

Here is the current status:

```
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md5 : inactive sdh1[4](S) sdg1[1](S) sdf1[0](S) sde1[2](S)
      3875034428 blocks super 1.2

unused devices: <none>
```

/dev/sde1:

```
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 3812f2b7:c88db9b1:d640aab4:2f3d478a
           Name : Falcon:5  (local to host Falcon)
  Creation Time : Fri Jul 22 13:11:48 2011
     Raid Level : raid5
   Raid Devices : 4
 Avail Dev Size : 1937517214 (923.88 GiB 992.01 GB)
     Array Size : 5812550400 (2771.64 GiB 2976.03 GB)
  Used Dev Size : 1937516800 (923.88 GiB 992.01 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 515be01f:5c7ad9a5:6974d3f1:25eab899
    Update Time : Mon Jul  9 19:04:16 2012
       Checksum : c0144eed - correct
         Events : 20885
         Layout : left-symmetric
     Chunk Size : 128K
   Device Role : Active device 2
   Array State : .AAA ('A' == active, '.' == missing)
```

/dev/sdf1:

```
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 3812f2b7:c88db9b1:d640aab4:2f3d478a
           Name : Falcon:5  (local to host Falcon)
  Creation Time : Fri Jul 22 13:11:48 2011
     Raid Level : raid5
   Raid Devices : 4
 Avail Dev Size : 1937517214 (923.88 GiB 992.01 GB)
     Array Size : 5812550400 (2771.64 GiB 2976.03 GB)
  Used Dev Size : 1937516800 (923.88 GiB 992.01 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : ed9b6881:06e50a9c:0f3dae3a:f1103cd7
    Update Time : Mon Jul  9 19:00:29 2012
       Checksum : 9e2ae8b4 - correct
         Events : 20871
         Layout : left-symmetric
     Chunk Size : 128K
   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)
```

/dev/sdg1:

```
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 3812f2b7:c88db9b1:d640aab4:2f3d478a
           Name : Falcon:5  (local to host Falcon)
  Creation Time : Fri Jul 22 13:11:48 2011
     Raid Level : raid5
   Raid Devices : 4
 Avail Dev Size : 1937517214 (923.88 GiB 992.01 GB)
     Array Size : 5812550400 (2771.64 GiB 2976.03 GB)
  Used Dev Size : 1937516800 (923.88 GiB 992.01 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : d91c2f83:d3037a3c:a8c3af41:35a3a3db
    Update Time : Mon Jul  9 19:21:09 2012
       Checksum : 4bcba62e - correct
         Events : 20888
         Layout : left-symmetric
     Chunk Size : 128K
   Device Role : Active device 1
   Array State : .A.A ('A' == active, '.' == missing)
```

/dev/sdh1:

```
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 3812f2b7:c88db9b1:d640aab4:2f3d478a
           Name : Falcon:5  (local to host Falcon)
  Creation Time : Fri Jul 22 13:11:48 2011
     Raid Level : raid5
   Raid Devices : 4
 Avail Dev Size : 1937517214 (923.88 GiB 992.01 GB)
     Array Size : 5812550400 (2771.64 GiB 2976.03 GB)
  Used Dev Size : 1937516800 (923.88 GiB 992.01 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 0ee2d136:dcee340b:504fff1b:6db663cd
    Update Time : Mon Jul  9 19:21:09 2012
       Checksum : 9a38f54e - correct
         Events : 20888
         Layout : left-symmetric
     Chunk Size : 128K
   Device Role : Active device 3
   Array State : .A.A ('A' == active, '.' == missing)
```

grep Events

```
         Events : 20885
         Events : 20871
         Events : 20888
         Events : 20888
```

grep State

```
          State : clean
   Array State : .AAA ('A' == active, '.' == missing)
          State : active
   Array State : AAAA ('A' == active, '.' == missing)
          State : clean
   Array State : .A.A ('A' == active, '.' == missing)
          State : clean
   Array State : .A.A ('A' == active, '.' == missing)
```

The devices are a bit jumbled, but in the array they are ordered f,g,e,h. sdf1 was the first to fail and sde1 followed. EDIT: I was slightly mistaken about sdf1 not coming up. Seeing the state "active" and looking at the log, it was added but failed before the array was mounted.

As is evident from the grep summary, two devices are behind, but three say they are clean. I do not expect --re-add to work with an array that is in pieces, so it looks like I have to recreate the array with --assume-clean. Or is there a better way?

----------

## NeddySeagoon

tipp98,

The array is not clean, or the event count on each drive would be identical. sdf1 is the odd one out with Events : 20871.

If you are confident about everything, the Events : 20885 on the other three drives will be the same events.

Assemble and run the raid using the three drives with the identical event counts, then re-add sdf1. It will resync.
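In mdadm terms, that advice sketches out roughly as follows (a sketch only, using the member device names from this thread; verify the event counts with `mdadm -E` before running anything, and note that `--force` lets mdadm assemble members whose event counts differ slightly):

```shell
# Sketch: assemble from the three freshest members, then re-add the stale one.
# --force accepts members with small event-count differences.
mdadm --assemble --force /dev/md5 /dev/sde1 /dev/sdg1 /dev/sdh1
mdadm /dev/md5 --re-add /dev/sdf1   # stale member; triggers a resync
```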

----------

## tipp98

Hey Neddy,

The problem is that there are not three devices with the same event count; sde1 is three events behind at 20885.

I tried the following with no luck:

`mdadm -v --assemble --force --update=summaries /dev/md5`

```
mdadm: looking for devices for /dev/md5
mdadm: no RAID superblock on /dev/sde2
mdadm: /dev/sde2 has wrong uuid.
mdadm: cannot open device /dev/sde1: Device or resource busy
mdadm: /dev/sde1 has wrong uuid.
mdadm: cannot open device /dev/sde: Device or resource busy
mdadm: /dev/sde has wrong uuid.
mdadm: cannot open device /dev/sdh1: Device or resource busy
mdadm: /dev/sdh1 has wrong uuid.
mdadm: cannot open device /dev/sdh: Device or resource busy
mdadm: /dev/sdh has wrong uuid.
mdadm: no RAID superblock on /dev/sdg2
mdadm: /dev/sdg2 has wrong uuid.
mdadm: cannot open device /dev/sdg1: Device or resource busy
mdadm: /dev/sdg1 has wrong uuid.
mdadm: cannot open device /dev/sdg: Device or resource busy
mdadm: /dev/sdg has wrong uuid.
mdadm: no RAID superblock on /dev/sdf2
mdadm: /dev/sdf2 has wrong uuid.
mdadm: cannot open device /dev/sdf1: Device or resource busy
mdadm: /dev/sdf1 has wrong uuid.
mdadm: cannot open device /dev/sdf: Device or resource busy
mdadm: /dev/sdf has wrong uuid.
mdadm: no RAID superblock on /dev/sdb4
mdadm: /dev/sdb4 has wrong uuid.
mdadm: no RAID superblock on /dev/sdb3
mdadm: /dev/sdb3 has wrong uuid.
mdadm: no RAID superblock on /dev/sdb2
mdadm: /dev/sdb2 has wrong uuid.
mdadm: no RAID superblock on /dev/sdb1
mdadm: /dev/sdb1 has wrong uuid.
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sdb has wrong uuid.
mdadm: cannot open device /dev/sda2: Device or resource busy
mdadm: /dev/sda2 has wrong uuid.
mdadm: no RAID superblock on /dev/sda1
mdadm: /dev/sda1 has wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda has wrong uuid.
```

On all of the member devices it says that they are busy and that they have the wrong UUID. I must be doing it wrong; I would at least expect it to be able to open the devices and find valid UUIDs, even if it still fails due to being out of sync. But then, I don't understand what --force would be good for. Please tell me I am doing something wrong.

Thanks,

Kyle

----------

## tipp98

Yeah, I needed to stop the array, which I didn't think to do because it was marked inactive and was reporting all disks as spares in mdstat.

I re-added the disk and it is now doing a full resync, probably overkill, but I'm ok with that. Thanks Neddy.
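For anyone else hitting the "Device or resource busy" errors above: an inactive array still holds its member devices open, so it has to be stopped before it can be re-assembled. A minimal sketch:

```shell
mdadm --stop /dev/md5               # releases the members held by the inactive array
mdadm --assemble --force /dev/md5   # now the devices can be opened
```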

----------

## tipp98

So, the re-add got to 98.5% complete when sdh1 spit out a read error, which I verified with an offline SMART test. Since I am confident that the data on /dev/sdf1 (the one being resynced) is good, especially now that it is 98.5% resynced, is there a way to clear the recovery process, allowing it to start on sdh1?

I am wondering if --update=resync would reset the recovery offset to zero, along with all the other drives, or would I just be digging under my feet? I have a feeling I would need to hex-edit the drive to make mdadm forget about it, and that sounds like many hours of research.

`# mdadm -E /dev/sd[efgh]1 | grep -E 'dev|Recovery|State|Events'`

```
/dev/sde1:
          State : clean
         Events : 20915
   Array State : AAA. ('A' == active, '.' == missing)
/dev/sdf1:
Recovery Offset : 1910096384 sectors
          State : clean
         Events : 20915
   Array State : AAA. ('A' == active, '.' == missing)
/dev/sdg1:
          State : clean
         Events : 20915
   Array State : AAA. ('A' == active, '.' == missing)
/dev/sdh1:
          State : clean
         Events : 20914
   Array State : AAAA ('A' == active, '.' == missing)
```

----------

## NeddySeagoon

tipp98,

Check the warranty status on your drive. If it's still in warranty, have a new one shipped before you return the old one.

You can try to image the old drive to the new drive, or you can assemble the raid on the remaining drives, even though the sync has not completed.

I think that's what --assume-clean does.

When your new drive arrives you can add it to the raid set.

If you really want to continue to use the old drive, look at the SMART error log now. Save a copy to a disk file.
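A sketch of saving that log, assuming the failing drive is /dev/sdh as in this thread:

```shell
# Full SMART report (attributes, error log, self-test log) to a file
smartctl -a /dev/sdh > smart-sdh-before-wipe.txt
# Or just the error log and self-test log sections
smartctl -l error -l selftest /dev/sdh > smart-sdh-logs.txt
```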

Use dd to write to the entire drive with something like 

```
dd if=/dev/zero of=/dev/sd...  bs=4096
```

It's essential you get that command right.

Larger block sizes may be faster, but use an integer multiple of 4096. When/if that completes, check that it actually got to the end of the drive.
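One way to confirm dd reached the end (a sketch; /dev/sdX is a placeholder for the real drive): compare the byte count in dd's final summary against the device size.

```shell
# dd exits with "No space left on device" at the end of a raw disk; the
# "bytes copied" figure in its summary should equal the size reported here.
blockdev --getsize64 /dev/sdX
```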

Look at the SMART log again. What has happened to the Reallocated Sector Count and the Pending Sectors?

Reallocated should have increased and Pending should be zero.

If the drive is still under warranty, it's not worth messing with.

----------

## tipp98

Good call, the warranty is still good until the end of next month. Since I figured a couple of days ago that I'd have to recreate the array, I started that yesterday.

What I did was zero the superblock and run the following:

```
~ # mdadm --create /dev/md5 --chunk=128 --level=5 --raid-devices=4 --assume-clean /dev/sdf1 /dev/sdg1 /dev/sde1 /dev/sdh1
mdadm: partition table exists on /dev/sdf1 but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sde1 but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sdh1 but will be lost or
       meaningless after creating array
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md5 started.
```

Those warnings gave me cause for concern (did I need to specify the K with the chunk size, and was the order correct?), but I verified everything and, sure enough, dumpe2fs -h reported the file system.

In order to lower the chances of the same thing happening on another drive at 99.5%, I decided to include sdh1 instead of using "missing", followed by a little data scrub to fix the last 1.5%.

```
~ # echo check >> /sys/block/md5/md/sync_action  
```
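Progress and results of the check can be watched through the same md sysfs/procfs interface (a sketch):

```shell
cat /proc/mdstat                     # shows check progress and estimated finish time
cat /sys/block/md5/md/mismatch_cnt   # non-zero after the check means inconsistent sectors were found
```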

I saw your message at about 35% through the check and ...

```
  5 Reallocated_Sector_Ct   0x0033   194   194   140    Pre-fail  Always       -       41
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       20
```

by the end ...

```
  5 Reallocated_Sector_Ct   0x0033   192   192   140    Pre-fail  Always       -       59
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       156
```

The check completed without kicking sdh back out, which is what I was thinking should happen. But, and I never would have caught this if it wasn't for your wisdom, the Current_Pending_Sector count continued to increase after the sync completed and the array was idle. That caused me to scratch my head, and I was going to shut the array down for the night. But I still had data to back up to it and the remaining drives were healthy, so I started the backup followed by a long SMART test. This morning, sdh1 was kicked back out and

```
  5 Reallocated_Sector_Ct   0x0033   191   191   140    Pre-fail  Always       -       65
197 Current_Pending_Sector  0x0032   196   196   000    Old_age   Always       -       1197
```

After all of this I feel like I have increased my RAID knowledge and best practices threefold. In addition to going through all of that for the first time, I discovered that three out of four of my drives support ERC, so I turned that on with smartctl. I wonder if there is a jumper setting for that.... This array stays mostly offline, but I shall be turning it on periodically for a little data scrub, and if I ever create another array for backups it will be something with checksumming for that extra assurance, probably ZFS or Btrfs.
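For reference, ERC is queried and set through smartctl's SCT interface rather than a jumper. A sketch (/dev/sdX is a placeholder; the values are tenths of a second, and on most drives the setting does not survive a power cycle, so it must be reapplied at boot):

```shell
smartctl -l scterc /dev/sdX          # show current ERC read/write timeouts
smartctl -l scterc,70,70 /dev/sdX    # set both timeouts to 7.0 seconds
```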

It's funny, I fired this array up because I had a partition on a 500G drive that was automatically remounted ro, and needed the space to dump the disk before attempting to fix it. When it rains it pours.... But, you need rain to grow.

Thanks again, time to fill out my RMA request. 

P.S. After snooping in the contents of /sys/block/md5/md/dev-sdX, it looks as though I may have been able to cancel the recovery on sdf1 after all, but I've not read any documentation on that as I was past that point of opportunity.

----------

