# Questioning the integrity and safety of Raid 5...

## RaceTM

Hey all,

I am currently running a couple of RAID 5 arrays, and I recently experienced some mental anguish when one of my brand new drives was marked as bad and kicked out of the array.  I noticed a week later, so I went back and checked the system log.  Apparently, a bad sector was encountered, so the entire drive was marked as faulty.

Now, this scares the crap out of me, for two reasons:

1) I don't think I have ever had a hard drive for more than a year or two which didn't develop at least SOME bad sectors

2) If mdadm is so precise and finicky as to kick out a hard drive with ANY bad sectors whatsoever, then so be it; I know that my array will have very good integrity.  However, what is to stop a drive in the array from developing bad sectors in areas which are not frequently read?  Hopefully you see where this is leading.

So here is a scenario which will help explain my concerns:

Let's say I have a RAID 5 array made up of three devices: sda1, sdb1, and sdc1.  Now, let's also assume that sda1 has developed a bad sector in an area of the drive which is rarely used.  Since this sector is never read, mdadm doesn't notice, so the drive is marked as active and healthy.  Now, assume that one day, while reading a file, a bad sector is discovered on sdc1, so that drive is kicked out of the array.  OK, no problem: insert a new drive and start the rebuild process.  During the rebuild, however, the faulty sector on sda1 will be discovered, and all of a sudden we are left with two disabled drives and a very broken array.

I understand the rationale behind kicking a drive out of the array as soon as it develops a bad sector, since the drive will no longer have the same amount of usable space as the other drives.

I guess this leads me to my questions.  Are my concerns valid? Is there any way to work around this or otherwise force mdadm to actively check for bad sectors on all drives very frequently?  Any other thoughts?

Thanks!

----------

## widan

 *RaceTM wrote:*   

> During the rebuild, however, the faulty sector on sda1 will be discovered, and all of a sudden we are left with two disabled drives and a very broken array.

 

It can happen, though it's not very common. Some hardware RAID controllers (at least LSI Logic) have "patrol read" features that try to read all sectors of all drives in the background when there is no other I/O, to detect failing disks before they fail completely (and maybe also try to "fix" the bad sector by rewriting the correct data to it).

 *RaceTM wrote:*   

> I understand the rationale behind kicking a drive out of the array as soon as it develops a bad sector, since the drive will no longer have the same ammount of usable space as the other drives.

 

Actually hard drives have spare sectors and can do sector reallocation, so they don't shrink as bad sectors develop (until the spare pool is empty). A write to a bad sector will trigger automatic reallocation and you usually won't even see it (unless you look at the reallocation count in the SMART data). A read to a bad sector will give you an error. If you force a write to a bad sector with dd, it will force reallocation and the bad sector will disappear.
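If you want to try the dd trick, here is a sketch of the incantation, practiced on a plain file instead of a real disk so nothing gets hurt. To do it for real you would point it at the device and the bad LBA from the kernel log (both are placeholders here), and triple-check the `seek` value, since this destroys whatever is stored at that sector:

```shell
# Create a 1 MiB file standing in for a disk (2048 "sectors" of 512 bytes).
dd if=/dev/urandom of=fakedisk bs=512 count=2048 2>/dev/null

# Rewrite only "sector" 1000, the way you would rewrite a pending bad
# sector to force reallocation: seek to it, write one sector, don't truncate.
# On real hardware:  dd if=/dev/zero of=/dev/sdX bs=512 seek=LBA count=1
dd if=/dev/zero of=fakedisk bs=512 seek=1000 count=1 conv=notrunc 2>/dev/null
```

The important part is `conv=notrunc`, without which dd would cut the file (or confuse things) instead of overwriting in place.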

 *RaceTM wrote:*   

> Is there any way to work around this or otherwise force mdadm to actively check for bad sectors on all drives very frequently?

 

You can run the long SMART self-test; it will detect bad sectors, but it takes time: more than an hour with no other I/O to the disk, and much longer if there is a lot of concurrent I/O, as the test only progresses when the disk is idle so as not to kill performance.
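For reference, that test is driven with smartctl from the smartmontools package; something like the following, with /dev/sda standing in for whichever drive you want to test:

```shell
# Start the long (extended) offline self-test on the drive.
smartctl -t long /dev/sda

# Later, check the self-test log: it shows "Completed without error"
# or the LBA of the first failure.
smartctl -l selftest /dev/sda

# Reallocated/pending sector counts live in the SMART attribute table.
smartctl -A /dev/sda
```

The `-A` output is worth watching over time even without self-tests: a climbing reallocated-sector count is the early warning everyone in this thread is worried about.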

----------

## eccerr0r

I do a check/repair on my raid5 every so often for the sole reason of making sure all sectors are still readable.

```
echo repair >/sys/block/md2/md/sync_action
```

I do this in crontab every month or so.
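For the record, the crontab entry for that looks something like this (schedule and md2 are just my setup; adjust to taste):

```shell
# /etc/crontab fragment: scrub the array at 3am on the first of the month.
# "repair" rewrites mismatched parity; use "check" to only count mismatches.
0 3 1 * *  root  echo repair > /sys/block/md2/md/sync_action
```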

Anyway, as far as I know, Linux SW RAID does not support sparing at the OS level if a block dies; it depends on the drive's hardware sparing mechanism.  By the time Linux reports a sector as bad, the drive has already retried it many times.  Since there's no software-level sparing, there's no way to ensure the disk is still in sync, so it tosses the drive.

Remember, RAID is not a replacement for backups.  You still need to back up your array.

----------

## HeissFuss

I've rarely seen just "one" bad sector.  Usually that means "prepare to replace drive."

I run a weekly check of my array.  

```
echo check >> /sys/block/mdX/md/sync_action
```

Take a look at the data scrubbing section of the gentoo wiki raid guide.

http://gentoo-wiki.com/HOWTO_Install_on_Software_RAID
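Once the check pass finishes, you can read the result back out of sysfs; a sketch, assuming the array is mdX as above:

```shell
# Watch the scrub progress while it runs.
cat /proc/mdstat

# After it completes, read how many inconsistent stripes were found
# (0 means everything matched; nonzero means parity disagreed somewhere).
cat /sys/block/mdX/md/mismatch_cnt
```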

----------

## widan

 *HeissFuss wrote:*   

> I've rarely seen just "one" bad sector.  Usually that means "prepare to replace drive."

 

It depends. Drives can develop isolated bad sectors as a result of a power failure while writing.

----------

## RaceTM

 *widan wrote:*   

> Actually hard drives have spare sectors and can do sector reallocation, so they don't shrink as bad sectors develop (until the spare pool is empty). A write to a bad sector will trigger automatic reallocation and you usually won't even see it (unless you look at the reallocation count in the SMART data). A read to a bad sector will give you an error. If you force a write to a bad sector with dd, it will force reallocation and the bad sector will disappear.

 

Really?  I thought this was only the case with SCSI drives?

----------

## RaceTM

 *eccerr0r wrote:*   

> I do a check/repair on my raid5 every so often for the sole reason of making sure all sectors are still readable
> 
> ```
> echo repair >/sys/block/md2/md/sync_action
> ```
> ...

 

Thanks, I will give this a shot.  Is there any potential harm in running repair on a working array?

----------

## widan

 *RaceTM wrote:*   

> Really?  I thought this was only the case with SCSI drives?

 

IDE hard drives can reallocate bad sectors too.

----------

## eccerr0r

 *RaceTM wrote:*   

> Thanks, I will give this a shot.  Is there any potential harm in running repair on a working array?

 

There is a risk, but I figure that if there is an inconsistency problem I'm hosed anyway, so I might as well have it all synced up.

If there is no inconsistency, then repairing a good array poses no problem.

And yeah, all "recent" drives (pretty much all IDE HDDs over 1GB or so) have spare sectors and remap on the fly.  But I've rarely seen a disk develop one bad sector and then continue to work forever...  Usually a bad sector is a bad omen.  I also think most HDDs have enough "backup capacitance" to write out complete sectors if power is suddenly lost, but I've never tested that...

----------

## Cyker

Most modern IDE drives (I stress *most*; Smeg knows what cost-cutting measures these drive makers will stoop to these days) have a pool of unused sectors that they remap onto bad sectors.

Note - EVERY SINGLE DRIVE currently in circulation has a load of bad sectors on it - These are all re-mapped out in the factory so that all the drives have a consistent size (Anyone old enough to remember the days when you'd read the error list on an MFM drive to try and find the 10MB drive with the biggest capacity will understand why this is a Good Thing  :Wink: )

Because these bad sectors are due to 'soft' things, e.g. slightly weak magnetics in a certain spot, they are generally quite stable.

Sometimes you get a crappy batch where the magnetic surface somehow wears out over time, slowly becoming less and less usable - The drive's electronics will (should?) map these areas out as they drop below a certain threshold of readability until there are no spare sectors left, at which point the bad sectors become visible.

Alternatively, if the drive platters are physically damaged somehow, bad sectors will also often appear regardless of spares available, but such drives need to be replaced ASAP since physical damage is often something that can spread.

In general, I usually recommend any drive that starts to develop bad sectors should be replaced ASAP.

You can try using the manufacturer's utilities (<rant>Why the hell do they still use floppies and DOS FFS?!?!</rant>) to 'recondition' the drive - which basically just zero-fills it and forces it to re-do the sector remapping - but if it still develops bad sectors after that, you're gonna want to replace that drive anyway (especially since it'd still be under warranty!)

RAID5 is VERY VERY pedantic for a GOOD REASON.

Remember - If a RAID5 array FAILS, you have *NO HOPE* of recovering the data on it AT ALL! (Well, unless you're rich and desperate enough...)

So RAID5 *has* to be very careful with its data integrity.

What I'm basically saying is: The drive is new and has bad sectors - Get off your arse and RMA it!!!  :Mr. Green:   :Wink: 

----------

## widan

 *eccerr0r wrote:*   

> But I've rarely seen the case where a disk develops one bad sector and continues to work forever...

 

I have one that developed its only bad sector as a result of a loose power connector. After connecting it to another plug that fit more tightly, and nuking the bad sector by writing to it with dd, the bad sector got remapped, and the drive still works fine to this day (the incident was 2 years ago or so).

But it's true that in most cases it's not a good sign.

 *eccerr0r wrote:*   

> I also think most hdd's have enough "backup capacitance" to write out complete sectors in case power is suddenly lost as well, but I've never tested that...

 

Maybe recent drives do, but the Maxtor from 2 years ago that was involved in the above incident did not.

 *Cyker wrote:*   

> <rant>Why the hell do they still use floppies and DOS FFS?!?!</rant>

 

It's still better than a Windows utility...

----------

## RaceTM

Thanks guys, I have added a check to my crontab.  What would be the best method to test the drive which was kicked out?  It has been put back into the array as a functional spare (don't ask me how; I definitely didn't put it back in), so I would like to check it again before deciding whether or not to RMA it.

----------

## HeissFuss

The simplest way is to run badblocks on the device.  It will print out all of the bad blocks.
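A sketch of such a badblocks run (sdX is a placeholder; point it at the suspect drive). The default mode is read-only; the read-write mode rewrites every block, so only use it on a drive that is *not* an active array member:

```shell
# Read-only scan of the whole device; prints the number of each
# bad block it finds, with progress (-s) and verbose output (-v).
badblocks -sv /dev/sdX

# Non-destructive read-write test: reads, rewrites, and verifies every
# block. Much slower; run it only on an idle, non-mounted drive.
badblocks -nsv /dev/sdX
```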

----------

