# ext4, mdadm read errors [solved]

## ascendant

Hi everyone. This looks to be the right forum, though I'm not actually convinced yet that this is a hardware problem.

Basically, I've got a 4-disk raid 5 with each disk passing badblocks, as well as the raid volume itself.  However, when I format the volume with ext4 and copy files to it, one of the disks (sdc) starts barfing out read errors and mdadm ejects it.

```
smartctl -i /dev/sdc
Model Family:     Western Digital Caviar Green (Adv. Format)
Device Model:     WDC WD20EARX-00PASB0
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical

smartctl -i /dev/sdd
Model Family:     Western Digital Caviar Green (Adv. Format)
Device Model:     WDC WD20EARX-008FB0
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical

smartctl -i /dev/sde
Model Family:     Western Digital Caviar Green (Adv. Format)
Device Model:     WDC WD20EARX-008FB0
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical

(parted) select /dev/sdc
Using /dev/sdc
(parted) print
Model: ATA WDC WD20EARX-00P (scsi)
Disk /dev/sdc: 3907029168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start  End          Size         File system  Name  Flags
 1      2048s  3907028991s  3907026944s               dev1  raid

Disk /dev/sdd: 3907029168s
Number  Start  End          Size         File system  Name  Flags
 1      2048s  3907028991s  3907026944s               dev2  raid

Disk /dev/sde: 3907029168s
Number  Start  End          Size         File system  Name  Flags
 1      2048s  3907028991s  3907026944s               dev3  raid
```

```
/dev/md1:
        Version : 1.2
  Creation Time : Fri Nov  2 17:53:04 2012
     Raid Level : raid5
     Array Size : 5860535808 (5589.04 GiB 6001.19 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 4
  Total Devices : 3
    Persistence : Superblock is persistent
  Intent Bitmap : /bitmap
    Update Time : Fri Nov  2 18:04:07 2012
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 512K
           Name : roar:raid  (local to host roar)
           UUID : 12c57c99:e30ff349:728630cc:63c27604
         Events : 12

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8       65        3      active sync   /dev/sde1
```

```
          State : clean, FAILED
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
         Events : 4337

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       0        0        1      removed
       2       8       49        2      active sync   /dev/sdd1
       3       8       65        3      active sync   /dev/sde1

       1       8       33        -      faulty spare   /dev/sdc1
```

As you may have noticed, one drive is missing; this is intentional, and it will be added to the array once I can get everything working reliably.  That disk is the same size as the rest and currently holds the data that will be moved onto the array.  Here are the relevant commands I have used:

```
# checking the existing drives:
badblocks -b 4096 -o badblocks.txt -p 2 -s -t random -vw /dev/sd?

# creating the array:
mdadm --verbose --create /dev/md1 --name=raid --level=5 --raid-devices=4 missing /dev/sdc1 /dev/sdd1 /dev/sde1 --bitmap=/bitmap

# formatting the array:
mkfs.ext4 -v -m 0 -b 4096 -E stride=128,stripe-width=384 -L raid -O dir_index,uninit_bg /dev/md1

# checking for duplicate failed sectors returns nothing; a different set of sectors fails every time:
grep "sdc, sector" /var/log/messages | awk '{ print $NF }' | sort | uniq -d
```
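For what it's worth, the stride and stripe-width values in the mkfs.ext4 line follow directly from the array geometry: stride = chunk size / block size = 512 KiB / 4 KiB = 128, and stripe-width = stride × data disks, where a 4-disk RAID 5 has 3 data disks. A quick sanity check with the numbers from the array above:

```shell
# Derive ext4 stride/stripe-width from the mdadm geometry shown above
chunk_kib=512   # mdadm chunk size
block_kib=4     # ext4 block size
data_disks=3    # a 4-disk RAID 5 has n-1 data disks

stride=$((chunk_kib / block_kib))
stripe_width=$((stride * data_disks))
echo "stride=$stride stripe-width=$stripe_width"   # prints stride=128 stripe-width=384
```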

```
Linux roar 3.5.0-gentoo #9 SMP Fri Nov 2 05:11:25 CDT 2012 x86_64 Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz GenuineIntel GNU/Linux

[I] sys-kernel/gentoo-sources
    Installed versions:  3.5.0(3.5.0)^bs(06:03:35 AM 08/10/2012)(-build -deblob -symlink)
[I] sys-fs/mdadm
    Installed versions:  3.1.4^t(02:58:24 AM 10/03/2011)(-static)
[I] sys-fs/e2fsprogs
    Installed versions:  1.42(09:33:36 PM 02/23/2012)(nls -elibc_FreeBSD -static-libs)
```

Update:

smartctl -x on the bad drive: http://dpaste.com/823331/

I have tried another SATA cable, and one of the other ports.

----------

## lagalopex

What does dmesg give you when the disk fails?

Any reason for an external bitmap? What filesystem is root?

And the greens are not so well suited for a raid... reds are.

----------

## ascendant

Thanks for the interest!

I realize that WD Greens are risky in a hardware raid configuration due to their lack of TLER, but they are known to work in software raid.  WD Reds are also rather more costly.

The external bitmap is there to reduce resync times in the event of a drive failure.  The root filesystem is ext4.

Additionally, after more testing, I have discovered that the array will still fail even with no filesystem at all, just badblocks running against it, although it takes hours rather than seconds to do so.  (Heh, it also turns out mdadm can write RAID 5 at up to 1.1 GB/s when it is no longer limited by functioning hardware.)

I will run more tests on the offending drive.  With any luck, the previous tests are now invalid and it will error enough that I can return it.

Here are excerpts from messages:

```
Nov  2 18:48:45 roar kernel: ata3: EH in SWNCQ mode,QC:qc_active 0x7FFC7FFF sactive 0x7FFC7FFF
Nov  2 18:48:45 roar kernel: ata3: SWNCQ:qc_active 0x40000 defer_bits 0x7FF87FFF last_issue_tag 0x12
Nov  2 18:48:45 roar kernel: dhfis 0x40000 dmafis 0x0 sdbfis 0x0
Nov  2 18:48:45 roar kernel: ata3: ATA_REG 0x41 ERR_REG 0x4
Nov  2 18:48:45 roar kernel: ata3: tag : dhfis dmafis sdbfis sactive
Nov  2 18:48:45 roar kernel: ata3: tag 0x12: 1 0 0 1
Nov  2 18:48:45 roar kernel: ata3.00: exception Emask 0x1 SAct 0x7ffc7fff SErr 0x0 action 0x6 frozen
Nov  2 18:48:45 roar kernel: ata3.00: Ata error. fis:0x41
Nov  2 18:48:45 roar kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Nov  2 18:48:45 roar kernel: ata3.00: cmd 61/08:00:e8:2c:1d/00:00:01:00:00/40 tag 0 ncq 4096 out
Nov  2 18:48:45 roar kernel: res 41/04:00:00:00:00/04:00:00:00:00/00 Emask 0x1 (device error)
Nov  2 18:48:45 roar kernel: ata3.00: status: { DRDY ERR }
Nov  2 18:48:45 roar kernel: ata3.00: error: { ABRT }
<snip duplicates>
Nov  2 18:48:45 roar kernel: ata3: hard resetting link
Nov  2 18:48:45 roar kernel: ata3: nv: skipping hardreset on occupied port
Nov  2 18:48:46 roar kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov  2 18:48:46 roar kernel: ata3.00: configured for UDMA/133
Nov  2 18:48:46 roar kernel: sd 2:0:0:0: [sdc]
Nov  2 18:48:46 roar kernel: Result: hostbyte=0x00 driverbyte=0x08
Nov  2 18:48:46 roar kernel: sd 2:0:0:0: [sdc]
Nov  2 18:48:46 roar kernel: Sense Key : 0xb [current] [descriptor]
Nov  2 18:48:46 roar kernel: Descriptor sense data with sense descriptors (in hex):
Nov  2 18:48:46 roar kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Nov  2 18:48:46 roar kernel: 00 00 00 00
Nov  2 18:48:46 roar kernel: sd 2:0:0:0: [sdc]
Nov  2 18:48:46 roar kernel: ASC=0x0 ASCQ=0x0
Nov  2 18:48:46 roar kernel: sd 2:0:0:0: [sdc] CDB:
Nov  2 18:48:46 roar kernel: cdb[0]=0x2a: 2a 00 01 1d 2c e8 00 00 08 00
Nov  2 18:48:46 roar kernel: end_request: I/O error, dev sdc, sector 18689256
Nov  2 18:48:46 roar kernel: sd 2:0:0:0: [sdc]
Nov  2 18:48:46 roar kernel: md/raid:md1: Disk failure on sdc1, disabling device.
Nov  2 18:48:46 roar kernel: md/raid:md1: Operation continuing on 2 devices.
<snip duplicates>
Nov  2 18:48:46 roar kernel: end_request: I/O error, dev sdc, sector 18690048
Nov  2 18:48:46 roar kernel: md/raid:md1: read error not correctable (sector 18688000 on sdc1).
Nov  2 18:48:46 roar kernel: md/raid:md1: read error not correctable (sector 18688008 on sdc1).
Nov  2 18:48:46 roar kernel: md/raid:md1: read error not correctable (sector 18688016 on sdc1).
Nov  2 18:48:46 roar kernel: md/raid:md1: read error not correctable (sector 18688024 on sdc1).
Nov  2 18:48:46 roar kernel: md/raid:md1: read error not correctable (sector 18688032 on sdc1).
Nov  2 18:48:46 roar kernel: md/raid:md1: read error not correctable (sector 18688040 on sdc1).
Nov  2 18:48:46 roar kernel: md/raid:md1: read error not correctable (sector 18688048 on sdc1).
Nov  2 18:48:46 roar kernel: md/raid:md1: read error not correctable (sector 18688056 on sdc1).
Nov  2 18:48:46 roar kernel: md/raid:md1: read error not correctable (sector 18688064 on sdc1).
<snip>
Nov  2 18:48:46 roar kernel: ata3: EH complete
Nov  2 18:48:46 roar kernel: RAID conf printout:
Nov  2 18:48:46 roar kernel: --- level:5 rd:4 wd:2
Nov  2 18:48:46 roar kernel: disk 1, o:0, dev:sdc1
Nov  2 18:48:46 roar kernel: disk 2, o:1, dev:sdd1
Nov  2 18:48:46 roar kernel: disk 3, o:1, dev:sde1
Nov  2 18:48:46 roar kernel: RAID conf printout:
Nov  2 18:48:46 roar kernel: --- level:5 rd:4 wd:2
Nov  2 18:48:46 roar kernel: disk 2, o:1, dev:sdd1
Nov  2 18:48:46 roar kernel: disk 3, o:1, dev:sde1
Nov  2 18:48:46 roar kernel: Buffer I/O error on device md1, logical block 7006848
Nov  2 18:48:46 roar kernel: Buffer I/O error on device md1, logical block 7006849
Nov  2 18:48:46 roar kernel: Buffer I/O error on device md1, logical block 7006850
<snip>
Nov  2 18:48:46 roar kernel: EXT4-fs warning (device md1): ext4_end_bio:250: I/O error writing to inode 12 (offset 26737115136 size 524288 starting block 7006848)
```
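As a side note, the failed NCQ write and the end_request line are consistent with each other: the LBA bytes in the cmd field (e8:2c:1d, high byte 01, low-to-high order) decode to exactly the sector the kernel then reports as failed:

```shell
# cmd 61/08:00:e8:2c:1d/00:00:01:00:00/40 -> LBA bytes e8 2c 1d 01 (low to high) = 0x011d2ce8
printf '%d\n' 0x011d2ce8   # prints 18689256, the sector in the end_request line
```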

----------

## ascendant

This issue was resolved by disabling native command queueing on the SATA controller:

```
#!/bin/bash
# Set the queue depth to 1 (i.e. disable NCQ) for every WD disk in the array.

devs=$(ls -1 /dev/disk/by-id/ata-WDC* | grep -v part)

# Extract the kernel device name (sdc, sdd, ...) from the symlink listing
get_dev() {
    ls -l "$1" | grep -Po '(?<=\/)sd.'
}

echo "$devs" | while read -r target; do
    dev=$(get_dev "$target")
    echo 1 > "/sys/block/$dev/device/queue_depth"
done
```
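As an aside for anyone adapting this: the `get_dev` helper scrapes `ls -l` output to find the symlink target. A sketch of an equivalent that resolves the link directly (same result, just a different approach than the script I actually ran):

```shell
# Resolve a /dev/disk/by-id symlink to its kernel device name (e.g. sdc)
get_dev() {
    basename "$(readlink -f "$1")"
}
```

So `get_dev /dev/disk/by-id/ata-Example-Disk` would print `sdc` if that link points at /dev/sdc (the link name here is made up for illustration).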

----------

## miroR

 *ascendant wrote:*   

> This issue was resolved by disabling native command queueing on the SATA controller:
> 
> ```
> #!/bin/bash
> 
> ...

 

I see...

Hmm... Is it worth the cheap money (I might go for the 2TB capacity because higher end equivalents are too expensive...)?

These HDDs you have work fine with these workarounds?

Can you recommend them?

Thanx!

----------

## ascendant

Nobody should recommend WD Greens for RAID.  You should decide for yourself whether you accept the consequences of using WD Greens in an array.

Also, the NCQ workaround is specific to my machine, and you should not apply it to your machine unnecessarily.

You should also definitely not run that script without understanding exactly what effect it will have on your system.  In the end, the array on my system is now stable.  It includes the three WD Greens and one Samsung HD203WI.

I chose the 2TB size because larger drives are ridiculously unreliable, while 2TB drives are only moderately unreliable.

----------

