# RAID6: emergency help?

## RayDude

I created a raid6 array from six drives and copied all my old data onto it. I still have the old drives but ... setting them up and copying them over would be a rather large task.

The first time I booted the machine with the new raid6 array it worked perfectly.

Then I powered it off, put the cover back on, rebooted, and one of the drives became faulty.

Then, while I was trying to re-add it to the array so it would rebuild, another drive went faulty.

Then, after one more mdadm -A, another drive went away, and it looks like all the data is lost.

What am I doing wrong?

Here's what mdadm currently says:

```
server ~ # mdadm --detail /dev/md127
/dev/md127:
        Version : 1.2
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
     Array Size : 11720534016 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 2930133504 (2794.39 GiB 3000.46 GB)
   Raid Devices : 6
  Total Devices : 5
    Persistence : Superblock is persistent
    Update Time : Fri Dec  7 00:10:39 2012
          State : clean, FAILED
 Active Devices : 3
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 1
         Layout : left-symmetric
     Chunk Size : 512K
           Name : SparePC:soulstorage
           UUID : bfc07787:5075d763:c70b65ac:687f7544
         Events : 61
    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       0        0        3      removed
       4       0        0        4      removed
       5       0        0        5      removed
       3       8       97        -      faulty spare   /dev/sdg1
       6       8       81        -      spare   /dev/sdf1
```

Here's what it said after the previous failure:

```
server ~ # mdadm --detail /dev/md127
/dev/md127:
        Version : 1.2
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
  Used Dev Size : -1
   Raid Devices : 6
  Total Devices : 5
    Persistence : Superblock is persistent
    Update Time : Fri Dec  7 00:04:04 2012
          State : active, degraded, Not Started
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1
         Layout : left-symmetric
     Chunk Size : 512K
           Name : SparePC:soulstorage
           UUID : bfc07787:5075d763:c70b65ac:687f7544
         Events : 56
    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8       97        3      active sync   /dev/sdg1
       4       0        0        4      removed
       5       0        0        5      removed
       6       8       81        -      spare   /dev/sdf1

```

It is interesting to note that the failed drives are all connected to a RAID card (with its hardware RAID disabled) that had, until now, been working fine.

Can someone help me? I have no idea what's failing or why...

Update: It looks like they are hard errors....

```
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh] CDB:
end_request: I/O error, dev sdh, sector 264200
sd 9:0:0:0: [sdh] Unhandled sense code
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh] CDB:
end_request: I/O error, dev sdh, sector 265224
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg] CDB:
end_request: I/O error, dev sdg, sector 264200
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:1, dev:sdg1
 disk 4, o:1, dev:sdh1
 disk 5, o:1, dev:sdf1
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:1, dev:sdg1
 disk 4, o:1, dev:sdh1
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh] CDB:
end_request: I/O error, dev sdh, sector 265224
md/raid:md127: read error NOT corrected!! (sector 263176 on sdh1).
md/raid:md127: Disk failure on sdh1, disabling device.
md/raid:md127: read error not correctable (sector 263184 on sdh1).
md/raid:md127: read error not correctable (sector 263192 on sdh1).
md/raid:md127: read error not correctable (sector 263200 on sdh1).
md/raid:md127: read error not correctable (sector 263208 on sdh1).
md/raid:md127: read error not correctable (sector 263216 on sdh1).
md/raid:md127: read error not correctable (sector 263224 on sdh1).
md/raid:md127: read error not correctable (sector 263232 on sdh1).
md/raid:md127: read error not correctable (sector 263240 on sdh1).
md/raid:md127: read error not correctable (sector 263248 on sdh1).
md/raid:md127: read error not correctable (sector 263256 on sdh1).
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:1, dev:sdg1
 disk 4, o:0, dev:sdh1
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:1, dev:sdg1
nfsd: last server has exited, flushing export cache
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdh1>
md: export_rdev(sdh1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdb1>
md: export_rdev(sdb1)
md: bind<sdc1>
md: bind<sdd1>
md: bind<sdg1>
md: bind<sdh1>
md: bind<sdf1>
md: bind<sdb1>
md: kicking non-fresh sdh1 from array!
md: unbind<sdh1>
md: export_rdev(sdh1)
md/raid:md127: device sdb1 operational as raid disk 0
md/raid:md127: device sdg1 operational as raid disk 3
md/raid:md127: device sdd1 operational as raid disk 2
md/raid:md127: device sdc1 operational as raid disk 1
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:1, dev:sdg1
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: bind<sdc1>
md: bind<sdd1>
md: bind<sdg1>
md: bind<sdh1>
md: bind<sdf1>
md: bind<sdb1>
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdh1>
md: export_rdev(sdh1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: bind<sdc1>
md: bind<sdd1>
md: bind<sdg1>
md: bind<sdh1>
md: bind<sdf1>
md: bind<sdb1>
md: kicking non-fresh sdh1 from array!
md: unbind<sdh1>
md: export_rdev(sdh1)
md/raid:md127: device sdb1 operational as raid disk 0
md/raid:md127: device sdg1 operational as raid disk 3
md/raid:md127: device sdd1 operational as raid disk 2
md/raid:md127: device sdc1 operational as raid disk 1
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:1, dev:sdg1
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:1, dev:sdg1
 disk 4, o:1, dev:sdf1
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg] CDB:
end_request: I/O error, dev sdg, sector 264192
md/raid:md127: read error not correctable (sector 262144 on sdg1).
md/raid:md127: Disk failure on sdg1, disabling device.
md/raid:md127: read error not correctable (sector 262152 on sdg1).
md/raid:md127: read error not correctable (sector 262160 on sdg1).
md/raid:md127: read error not correctable (sector 262168 on sdg1).
md/raid:md127: read error not correctable (sector 262176 on sdg1).
md/raid:md127: read error not correctable (sector 262184 on sdg1).
md/raid:md127: read error not correctable (sector 262192 on sdg1).
md/raid:md127: read error not correctable (sector 262200 on sdg1).
md/raid:md127: read error not correctable (sector 262208 on sdg1).
md/raid:md127: read error not correctable (sector 262216 on sdg1).
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:0, dev:sdg1
 disk 4, o:1, dev:sdf1
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:0, dev:sdg1
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
 disk 3, o:0, dev:sdg1
 disk 0, o:1, dev:sdb1
 disk 1, o:1, dev:sdc1
 disk 2, o:1, dev:sdd1
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: bind<sdc1>
md: bind<sdd1>
md: bind<sdg1>
md: bind<sdh1>
md: bind<sdf1>
md: bind<sdb1>
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdh1>
md: export_rdev(sdh1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdc1>
md: export_rdev(sdc1)

```

----------

## NeddySeagoon

RayDude,

Don't do anything that may involve writes.  Post the output of 

```
mdadm -E /dev/sd[abcdef]1
```

What you hope to find is four members of the set with the same event count so you can assemble the raid in degraded mode.

I've just been through this with my 4-spindle RAID 5.

It's also worth installing smartmontools and looking at the drives' internal error logs.

If you saved the dmesg output with the error reports that show why the drives were kicked out of the array, that would be good too.
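The event-count check can be scripted. A minimal sketch: pick each member's Events value out of `mdadm -E` output so the drives that still agree stand out. The heredoc below is a made-up fragment; on the real system you would pipe in the actual output (e.g. `mdadm -E /dev/sd[bcdfgh]1`) instead.

```shell
# List each member's event count from mdadm -E output.
# The heredoc is an invented sample standing in for the real command output.
awk '
  /^\/dev\// { dev = $1 }          # header line names the member
  /Events/   { print dev, $NF }    # emit "member: events"
' <<'EOF'
/dev/sdb1:
         Events : 63
/dev/sdc1:
         Events : 63
/dev/sdg1:
         Events : 57
EOF
```

Members printed with the same count are the ones you can hope to assemble together in degraded mode.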

----------

## RayDude

Thanks Neddy!

I think the raid controller I used for my external SATA box is not compatible with these drives... Because the three drives that failed are all attached to it. (two inside, one outside)

It looks like b, c, d, and f are all at the same event count, which means I might be able to recover this. I bought two new dual controllers (DGMS; Fry's sucks) to see if they will work with the drives. I think b, c, and d are okay because they are plugged into the motherboard.

It would be so awesome if I could get the array to rebuild itself. It takes days to copy 8 TB over gigabit.

Here's what mdadm said:

```
server ~ # mdadm -E /dev/sd[bcdfgh]1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 6b6e4f45:5d52d5ea:c6ff4d3a:ebf8b515
    Update Time : Fri Dec  7 00:10:47 2012
       Checksum : ed444042 - correct
         Events : 63
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 0
    Array State : AAA... ('A' == active, '.' == missing)
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : dc35e656:81f9e617:e9a6eafe:00cf70d6
    Update Time : Fri Dec  7 00:10:47 2012
       Checksum : b1442c11 - correct
         Events : 63
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 1
    Array State : AAA... ('A' == active, '.' == missing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 59a13e07:c83416e1:4c96063b:6ca6bbb3
    Update Time : Fri Dec  7 00:10:47 2012
       Checksum : 443299a5 - correct
         Events : 63
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 2
    Array State : AAA... ('A' == active, '.' == missing)
/dev/sdf1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 374073a6:b52ff21f:01661c22:cc5f2acb
    Update Time : Fri Dec  7 00:10:47 2012
       Checksum : 20c7bc89 - correct
         Events : 63
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : spare
    Array State : AAA... ('A' == active, '.' == missing)
/dev/sdg1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 0445633e:5bf226e9:50f860b0:4784965b
    Update Time : Fri Dec  7 00:10:36 2012
       Checksum : a0a13abe - correct
         Events : 57
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 3
    Array State : AAAAA. ('A' == active, '.' == missing)
/dev/sdh1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 697e05c1:644d35c8:1cb2b136:97a14f39
    Update Time : Thu Dec  6 23:58:39 2012
       Checksum : 665ba374 - correct
         Events : 48
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 4
    Array State : AAAAA. ('A' == active, '.' == missing)

```

----------

## frostschutz

If you have solved the drive failure problem (by hooking the drives up through some other card), and they then work reliably, you should be able to reassemble the RAID. By your output, the first four drives are good (same timestamp and event count), whereas the latter two are out of date. So assemble with only the first four drives (using --force if you must), and once the array is up and running, re-add the other two. Since RAID 6 tolerates two drive failures, it will resync. As long as the first four drives aren't bad, the sync should succeed and you are back in the game with no additional data loss since the md failure.

Good luck.

If it does not work out you may have to resort to your backup after all.
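The sequence above can be sketched as a dry run. Everything here is echoed rather than executed so it can be reviewed first; the member list is a placeholder based on this thread's device names, and which partitions actually belong in the degraded set has to come from your own `mdadm -E` output.

```shell
# Dry-run sketch of assemble-degraded-then-re-add. All commands are
# printed, not run; the member list is a placeholder -- check it against
# your own `mdadm -E` output before removing the echos.
echo mdadm --stop /dev/md127
# Assemble degraded from the members whose event counts agree;
# --force lets mdadm paper over a small event-count gap.
echo mdadm --assemble --force /dev/md127 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdg1
# Once the array is running, re-add the stale members; with two disks of
# parity, RAID 6 can rebuild both.
echo mdadm --manage /dev/md127 --add /dev/sdh1 /dev/sdf1
```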

----------

## RayDude

Thanks!

I removed the old SATA card and added the two new SIL3132 boards. Unfortunately, only one of them is recognized, and I don't know why. The one plugged into the x16 slot is not initialized: no BIOS, nothing. So I can only see five drives.

I had a four-port PCI SATA RAID card in my hand, but I couldn't remember whether this mobo had a PCI slot, so I got the PCIe cards...

Now I'm stuck buying a four-port card from Newegg because Fry's didn't have any four-port PCIe cards in stock (PCI is probably too slow anyway), or I just have to bite the bullet, assemble with the four good drives, and hope there are no write errors until I find a way to hook up the sixth drive.

Man, I wish I'd planned this better. I forgot this mobo only has four SATA ports.

Update: Well, I let it rebuild the drives and it went to active sync.

Then I added the last drive and it's currently rebuilding. So I'll have one drive of redundancy until I can find a solution that gives me four more SATA ports.

Any suggestions for a good card that's less than a hundred?

Thanks again guys!

----------

## NeddySeagoon

RayDude,

Write errors are actually fairly safe. The drive will realise the write failed and reallocate the failed sector.

It's read errors that are the problem. When a drive has trouble with a read but is still ultimately successful, the data will be moved to a spare sector.

When a read fails outright, the data is lost and can't be moved. That's a fairly simplistic explanation, anyway.

----------

## Mad Merlin

 *RayDude wrote:*   

> Any suggestions for a good card that's less than a hundred?

 

As far as I know, such a thing doesn't exist. You can grab an LSI 9211-4i for ~$200 or a 9211-8i for ~$250, which are 4-port (1x SFF-8087) and 8-port (2x SFF-8087), respectively. These are barebones HBA cards, meant for passing the disks through to the host OS rather than doing any RAID themselves, and they work well.

You can often find rebrands and/or used cards for less; have a look at this list. The 9211 uses the SAS2008 chipset.

I can't really recommend anything less expensive than that; I've tried a couple of cheaper cards and have been burned more than once.

----------

## frostschutz

I'm not sure what they cost in $, but the Lian Li IB-01 and the Dawicontrol DC-624e are around 80€, so they shouldn't be >$100.

The Lian Li is just a port multiplier, though, and the Dawicontrol may need some extra work ( http://theangryangel.co.uk/blog/marvell-88se9172-sata3-under-linux-as-of-320 ); it may also work out of the box in newer kernels.

Since you mentioned you used internal and external ports on your card: there are lots of cards where the external port is shared, i.e. you can use either the external or the internal connector, but not both at the same time.

----------

## RayDude

mdadm help again. I bought a cheap Marvell-based RAID III card from Newegg.

It worked from the start, and drive six of the array started rebuilding.

At some point, after about 20 hours, the new controller died and took three hard drives with it... again...

I guess the old adage applies here: you get what you pay for.

Anywho, I bought a PCI RAID card, SIL3124-based, and it's up and running. All six drives read as clean, but three of them have smaller event counts. How do I rebuild this with minimal damage? Is it possible?

```
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 6b6e4f45:5d52d5ea:c6ff4d3a:ebf8b515
    Update Time : Wed Dec 12 16:23:17 2012
       Checksum : ed4c7e51 - correct
         Events : 49889
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 0
    Array State : AAA... ('A' == active, '.' == missing)
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : dc35e656:81f9e617:e9a6eafe:00cf70d6
    Update Time : Wed Dec 12 16:23:17 2012
       Checksum : b14c6a20 - correct
         Events : 49889
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 1
    Array State : AAA... ('A' == active, '.' == missing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 59a13e07:c83416e1:4c96063b:6ca6bbb3
    Update Time : Wed Dec 12 16:23:17 2012
       Checksum : 443ad7b4 - correct
         Events : 49889
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 2
    Array State : AAA... ('A' == active, '.' == missing)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 0445633e:5bf226e9:50f860b0:4784965b
    Update Time : Wed Dec 12 12:34:15 2012
       Checksum : a0b11323 - correct
         Events : 49876
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 3
    Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdf1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x2
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
Recovery Offset : 4761467048 sectors
          State : clean
    Device UUID : b74d0c97:082efe8c:f85ebd68:2e5d8734
    Update Time : Wed Dec 12 12:34:15 2012
       Checksum : 4a4bffb9 - correct
         Events : 49876
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 5
    Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdg1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bfc07787:5075d763:c70b65ac:687f7544
           Name : SparePC:soulstorage
  Creation Time : Thu Nov 29 09:03:33 2012
     Raid Level : raid6
   Raid Devices : 6
 Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
     Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
  Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 45527180:c644c76d:c1b4c0f2:d2f7ed4a
    Update Time : Wed Dec 12 12:34:15 2012
       Checksum : 9915d2c6 - correct
         Events : 49876
         Layout : left-symmetric
     Chunk Size : 512K
    Device Role : Active device 4
    Array State : AAAAAA ('A' == active, '.' == missing)
```

----------

## RayDude

Update: I just typed 'mdadm -A --force /dev/md127' and it assembled and began rebuilding disk 6 again.

Then I ran fsck.ext4 on /dev/md127 and let it delete a few bad inodes.

Unfortunately, as I suspected, the PCI card is too slow: the rebuild of drive 6 hasn't moved a percent in several hours.

I'm still without a solution.
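For watching a rebuild that may have stalled, the recovery line in /proc/mdstat is the thing to poll. A sketch, run here against an invented mdstat fragment; on a live system you would read /proc/mdstat itself. If the bus is slow, the kernel's rebuild throttles can also be inspected (and the minimum raised) with `sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max`.

```shell
# Extract the recovery percentage from /proc/mdstat to see whether a
# rebuild is moving. The heredoc is an invented sample; on a real system
# use the file directly (e.g. watch cat /proc/mdstat).
grep -o 'recovery = [0-9.]*%' <<'EOF'
md127 : active raid6 sdb1[0] sdc1[1] sdd1[2] sdg1[3] sdf1[6]
      11720534016 blocks level 6, 512k chunk, algorithm 2 [6/4] [UUUU__]
      [==>..................]  recovery = 12.6% (369412096/2930133504)
EOF
```

Comparing two readings a few minutes apart tells you whether the rebuild is crawling or genuinely stuck.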

----------

## NeddySeagoon

RayDude,

If all your raid6 was doing was rebuilding - i.e. you were not writing anything to it, you might be lucky.

If there were files open for writing when your three drives went offline, you can expect those files to be in a mess.

If directory writes were in progress, the contents of those directories may be lost.

This wiki article works. The advantage of --create in degraded mode over --force is that you can try all the combinations to see whether one degraded combination is better than another.

At first sight there is no reason why one should be, but the drives are not all written concurrently, so a failure such as yours will leave them in slightly different states. Degraded mode for you means four out of six drives. You need to think carefully before you mount the filesystem, even read-only, as journal replays and the resulting writes will still happen. I think you can avoid the journal replay if you want to, but I don't know how.
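On the journal-replay point left open above: for ext3/ext4, the commonly cited way to mount read-only without replaying the journal is the `noload` mount option. A dry-run sketch; the command is echoed rather than executed, the mount point is hypothetical, and skipping the replay can present an inconsistent view of the filesystem, so treat it strictly as a look-don't-touch measure.

```shell
# Mounting ext3/ext4 without journal replay: `noload` skips the replay
# that even a plain read-only mount would otherwise perform. Echoed, not
# executed; /mnt/rescue is a hypothetical mount point.
echo mount -o ro,noload /dev/md127 /mnt/rescue
```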

----------

## frostschutz

--create is also dangerous, though: if you get the command wrong, that's bye-bye to your data. Still, your top priority is the hardware issue. It just won't do to have three drives vanish in one go. If you have this many controllers failing, maybe you should check for short circuits in your PSU/case, or any other cables for that matter. I've used cheap controllers myself and never had a failure, so your problems seem fishy to me somehow.

----------

## RayDude

Thanks guys.

There were only three bad inodes after the fsck, so the array seems fine.

With the PCI card it's still rebuilding the final drive. It is at 94% and will hopefully be done before morning...

I have ordered an LSI Logic RAID card from Amazon; it will arrive tomorrow. Hopefully it will be reliable in my motherboard. Since I'll likely be able to boot my SSD off it, I'll connect four drives to the motherboard; that way, if I have problems, the most I can lose is two drives.

This sure has been an experience though, wow.

----------

## RayDude

Just Venting...

I bought a SAS controller from LSI.

It didn't appear to support 3TB drives, so I upgraded the firmware from a Kubuntu boot USB drive.

It still doesn't appear to support 3TB drives.

What do I have to do to get a working four port SATA card?

*exasperated*

----------

## NeddySeagoon

RayDude,

How do you mean this?

 *RayDude wrote:*   

> didn't appear to support 3TB drives

What happens when you connect a 3TB drive? The only difference is 48-bit LBA or not, and without it you max out at about 137GB.

A lot of bolt-on goodies claim a 2TB limit so that Windows users with MSDOS partition tables aren't surprised when they hit one, but it's not hardware related.

Try it. I will be surprised if it doesn't 'just work'.
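Both ceilings mentioned here fall straight out of the sector-address width times the 512-byte sector size:

```shell
# Addressable capacity = 2^(address bits) * 512-byte sectors.
echo "28-bit LBA : $(( (1 << 28) * 512 / 1000000000 )) GB"      # the ~137GB wall
echo "32-bit count: $(( (1 << 32) * 512 / 1000000000000 )) TB"  # the ~2TB wall
```

The first is the pre-48-bit-LBA drive limit; the second is what you hit when any 32-bit sector count is involved (MSDOS partition tables, or firmware that keeps sector numbers in 32-bit fields).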

----------

## RayDude

Thanks Neddy.

I already have working 3TB GPT partitions on the drives. When I connect them to the LSI card, they are reported as 2048 GB and the GPT partitions are not present when the machine is booted.

I've found some forums complaining about this problem with LSI but I haven't found any solutions. I've emailed tech support.

I should point out that I removed the BD-ROM from the PCI RAID card and the RAID performance doubled to 135 MB/second... Pretty interesting.
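As an aside, that ~135 MB/s plateau is right at the theoretical bandwidth of a classic 32-bit/33MHz PCI slot, which is shared by every device on the bus, so removing the BD-ROM freeing up throughput is plausible:

```shell
# Classic PCI moves 4 bytes per 33 MHz clock, shared across the whole bus.
echo "PCI 32-bit @ 33 MHz: $(( 33333333 * 4 / 1000000 )) MB/s"
```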


----------

## NeddySeagoon

RayDude,

GPT keeps two copies of the partition table, one at the start of the drive and one at the end. It gets really upset if the two don't match.

With a 2TB limit, the copy at the end of the drive can't be read.

dmesg probably has errors about attempts to read beyond the end of the device.

----------

## RayDude

Thanks. I'm sure that's why the partition table seems empty when I attempt to look at it. Now it's up to LSI.

I guess I'll have to break down and buy another RAID card. The reviews on Promise and HighPoint cards look bad. I might just buy another Silicon Image board, but a PCI Express one instead of PCI... I don't want to, because the one I used for several years died...

----------

## Mad Merlin

You didn't mention which model of LSI card you ended up with. However, LSI has an article on the issue here: http://webcache.googleusercontent.com/search?q=cache:6MN0yCPeVn0J:http://kb.lsi.com/Print16399.aspx%2Blsi+2TB&oe=UTF-8&hl=en&ct=clnk

It looks like the 6Gbit/s cards (such as the 9211 I suggested above) support >= 3TB drives while the older ones do not (unless you have SAS drives, which you likely don't). However, I've personally only used the 9211 with SSDs (which are, sadly, still smaller than 2TB).

----------

## RayDude

Final update: my older card would not recognize 3TB drives even with the IT firmware. I sent it back and ordered a Marvell SAS card. It works okay, but the performance is somewhat erratic. For example, the SSD plugged into the mvsas card is about half as fast as it was on the motherboard, but it wasn't always this slow... I'm having trouble figuring out how to get full performance out of it.

----------

