# [solved] Slow RAID1 resync (adjusting min_speed no effect)

## Prospero

About a week ago an old drive (Maxtor) in my box's RAID1 array failed. It had been active for almost 4 years, so I simply considered it broken, removed it and ordered a new drive (Seagate of twice the size - company didn't have the old size in stock and this one only cost €1 more).

So this morning I installed the new drive, partitioned it, and attempted to add it to the raid array. Here's the setup:

```
# fdisk -l /dev/sdb

Disk /dev/sdb: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1           9       72261   fd  Linux raid autodetect
/dev/sdb2              10         132      987997+  fd  Linux raid autodetect
/dev/sdb3             133        9729    77087902+   5  Extended
/dev/sdb5             133         376     1959898+  fd  Linux raid autodetect
/dev/sdb6             377        9729    75127941   fd  Linux raid autodetect

# fdisk -l /dev/sda

Disk /dev/sda: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1           9       72261   fd  Linux raid autodetect
/dev/sda2              10       19457   156216060    5  Extended
/dev/sda5              10         133      995998+  fd  Linux raid autodetect
/dev/sda6             134         378     1967931   fd  Linux raid autodetect
/dev/sda7             379        9733    75144006   fd  Linux raid autodetect
/dev/sda8            9734        9857      995998+  fd  Linux raid autodetect
/dev/sda9            9858       10102     1967931   fd  Linux raid autodetect
/dev/sda10          10103       19457    75144006   fd  Linux raid autodetect
```

Here's what I had in mind:

```
/dev/md1:    /dev/sdb1  /dev/sda1
/dev/md2:    /dev/sdb2  /dev/sda5  /dev/sda8
/dev/md5:    /dev/sdb5  /dev/sda6  /dev/sda9
/dev/md6:    /dev/sdb6  /dev/sda7  /dev/sda10
```

All of them RAID1 (and please don't lecture me about putting two partitions from the same drive in a RAID1, I know it's more or less pointless but the space is unused anyway if I don't do this).

So, my problem is this: I booted into an old LiveCD environment (2006.1 I think) and am in the process of recreating the arrays. Constructing most of the arrays went just fine, except for adding /dev/sda10 to /dev/md6: this is taking ages!

I already searched around, and the best suggestion I found, adjusting /proc/sys/dev/raid/speed_limit_min, has no effect!

```
# cat /proc/mdstat
Personalities : [raid1]
md6 : active raid1 sda10[3] sda7[1] sdb6[0]
      75127872 blocks [3/2] [UU_]
      [=======>.............]  recovery = 39.2% (29472512/75127872) finish=5753.3min speed=132K/sec

md5 : active raid1 sda9[2] sda6[1] sdb5[0]
      1959808 blocks [3/3] [UUU]

md2 : active raid1 sda8[2] sda5[1] sdb2[0]
      987904 blocks [3/3] [UUU]

md1 : active raid1 sda1[1] sdb1[0]
      72192 blocks [2/2] [UU]

unused devices: <none>
```

```
# cat /proc/sys/dev/raid/speed_limit_min
50000
```

```
# cat /proc/sys/dev/raid/speed_limit_max
200000
```

```
# mdadm --version
mdadm - v2.5.2 -  27 June 2006
```

And the only relevant dmesg output:

```
md: syncing RAID array md6
md: minimum _guaranteed_ reconstruction speed: 50000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 75127872 blocks.
```

Anybody have an idea what might be wrong?

Last edited by Prospero on Wed Jun 11, 2008 1:17 pm; edited 1 time in total
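The figures in the mdstat recovery line can be cross-checked with a little shell arithmetic. This is only a sketch: `recovery_report` is a made-up helper name, and it assumes mdstat counts 1 KiB blocks and reports the speed in K/sec. Keep in mind that `speed_limit_min` is a floor the md driver tries to keep when other I/O competes with the resync; it cannot make a device deliver data faster than it is able to, so an observed speed far below the minimum points at the hardware rather than at the tunable.

```shell
#!/bin/sh
# Sketch: parse one "recovery = ..." line from /proc/mdstat, estimate the
# remaining time, and compare the observed speed with speed_limit_min.
# recovery_report is a hypothetical helper, not part of mdadm.
recovery_report() {
    line=$1      # the recovery line, copied from /proc/mdstat
    min_kb=$2    # contents of /proc/sys/dev/raid/speed_limit_min (KB/sec)

    # "(done/total)" are block counts; "speed=NNNK/sec" is the current rate.
    done_blocks=$(printf '%s\n' "$line" | sed -n 's|.*(\([0-9]*\)/\([0-9]*\)).*|\1|p')
    total_blocks=$(printf '%s\n' "$line" | sed -n 's|.*(\([0-9]*\)/\([0-9]*\)).*|\2|p')
    speed_kb=$(printf '%s\n' "$line" | sed -n 's|.*speed=\([0-9]*\)K/sec.*|\1|p')

    # Bail out if the line carries no usable speed figure.
    if [ -z "$speed_kb" ] || [ "$speed_kb" -eq 0 ]; then
        echo "no recovery speed found"
        return 1
    fi

    # Remaining KiB divided by KiB/sec gives seconds; /60 gives minutes.
    eta_min=$(( (total_blocks - done_blocks) / speed_kb / 60 ))

    if [ "$speed_kb" -lt "$min_kb" ]; then
        below=yes
    else
        below=no
    fi
    echo "speed=${speed_kb}K/sec eta_min=${eta_min} below_min=${below}"
}

# Values taken from the mdstat output in this post.
recovery_report '      [=======>.............]  recovery = 39.2% (29472512/75127872) finish=5753.3min speed=132K/sec' 50000
# prints: speed=132K/sec eta_min=5764 below_min=yes
```

The result is within a few minutes of the 5753.3min that mdstat printed; the small gap is presumably because the displayed speed is rounded.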

----------

## NeddySeagoon

Prospero,

```
/dev/md2:    /dev/sdb2  /dev/sda5  /dev/sda8 
```

will be a speed problem.

The heads have to seek back and forth between /dev/sda5 and /dev/sda8, and no data can be transferred during a seek, so it will be very slow.

The idea of RAID is multiple copies on separate spindles. The kernel implementation doesn't check that the members are on separate spindles, so you can build any RAID level you like on a single drive. That's not good for reliability, though.

----------

## Prospero

Well, strangely, the combination of 5 and 8 isn't giving me a problem, probably because it's not the biggest partition. I understand your point, though: with RAID writing everything to both partitions, it will slow down tremendously, since it needs to switch positions for every byte I send. I guess I'll just remove the extra partitions.

But still, it doesn't explain why sda10 is taking so long when all other partitions manage to keep syncing at 50MB/s. It's been a while since I took classes on hardware architecture (which included hard drives and RAID), so I hope the following doesn't sound stupid:

sda10 is a copy of sdb6. It is also the largest partition. However, when syncing sdb6 to sda7, those partitions have the same "offset", so the transfer goes smoothly. In the other cases, the partition is small enough to complete before any serious slowdown takes place. But with sda10 and sdb6 we have both a large amount of data and a huge difference in location.

Could that explain the slowdown? Or am I talking nonsense now? The thing is, I just bought that drive, I am seriously hoping it can be explained by foolishness on my part, and not a faulty drive.

----------

## NeddySeagoon

Prospero,

Drives are 'zoned'. This means that the number of sectors per track is reduced nearer the spindle, so the head/platter data rate is lower there. A single drive may give you 60 MB/sec in the outer zone and only half that in the inner zone.

Neither partition 6 nor partition 10 is close to the outside of the drive.
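One way to see the zoning effect is to time a raw read at a few positions across the drive. The sketch below only prints the `dd` commands rather than running them; `zone_read_cmd` is a made-up helper name, the 256 MiB sample size is an arbitrary choice, and it assumes GNU dd (for `iflag=direct`) and that low block addresses sit on the outer tracks, as they do on most drives.

```shell
#!/bin/sh
# Sketch: build a read-only dd command that samples throughput at a given
# percentage of the way into a drive. zone_read_cmd is a hypothetical helper.
zone_read_cmd() {
    dev=$1          # device to read from, e.g. /dev/sda
    size_bytes=$2   # drive size in bytes, as printed by fdisk -l
    percent=$3      # 0 = start of the drive, 100 = end

    size_mib=$(( size_bytes / 1048576 ))
    skip=$(( size_mib * percent / 100 ))   # offset in 1 MiB blocks

    # Reads 256 MiB; iflag=direct (GNU dd) bypasses the page cache so the
    # rate dd reports reflects the disk rather than memory.
    echo "dd if=$dev of=/dev/null bs=1M count=256 skip=$skip iflag=direct"
}

# Drive size taken from the fdisk output for /dev/sda earlier in the thread.
for pct in 0 50 90; do
    zone_read_cmd /dev/sda 160041885696 "$pct"
done
```

Running the printed commands as root and comparing the rates dd reports would show roughly how much the transfer rate drops toward the inner zone on a healthy drive.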

----------

## Prospero

Ok that can't be it then...

How about connectors? Could an improperly secured SATA connector cause localized slowdown problems such as these? I'm currently "testing" the drive:

```
dd if=/dev/zero of=/dev/sda
```

If that works out I'll just check the connectors, make sure they're properly secured, and repartition it, this time with just the single mirror on the /dev/sda drive for each partition.

Edit: OK, it seems that the new drive wasn't detected on the last reboot, so I have now overwritten the original data... Hopefully one of the RAID syncs from last night went all right; I'll try salvaging that first before I try anything new.

----------

## NeddySeagoon

Prospero,

Poor connections will cause retries all over the drive.

dmesg will be full of error messages too.

----------

## Prospero

Well, dmesg was mostly clean, up until the dd command terminated with an error. Here's what I got:

```
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: (BMDMA stat 0x0)
ata1.00: tag 0 cmd 0xc8 Emask 0x9 stat 0x51 err 0x40 (media error)
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: (BMDMA stat 0x0)
ata1.00: tag 0 cmd 0xc8 Emask 0x9 stat 0x51 err 0x40 (media error)
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: (BMDMA stat 0x0)
ata1.00: tag 0 cmd 0xc8 Emask 0x9 stat 0x51 err 0x40 (media error)
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: (BMDMA stat 0x0)
ata1.00: tag 0 cmd 0xc8 Emask 0x9 stat 0x51 err 0x40 (media error)
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: (BMDMA stat 0x0)
ata1.00: tag 0 cmd 0xc8 Emask 0x9 stat 0x51 err 0x40 (media error)
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: (BMDMA stat 0x0)
ata1.00: tag 0 cmd 0xc8 Emask 0x9 stat 0x51 err 0x40 (media error)
sd 0:0:0:0: SCSI error: return code = 0x08000002
sda: Current: sense key=0x3
    ASC=0x11 ASCQ=0x4
end_request: I/O error, dev sda, sector 192265944
ata1: EH complete
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
```

Now I'm no expert, but this seems bad. I'm also noticing that whatever operation I do on the hard drive (scan on LiveCD boot, hdparm -I, fdisk -l), it first stalls for a few seconds.

So... broken hard drive?
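To place the failing sector on the disk, the sector number from the `end_request` line can be converted to an fdisk-style cylinder and looked up in the partition table. This is a rough sketch: the helper names are made up, and it assumes the geometry of 16065 sectors per cylinder and the /dev/sda partition layout from the fdisk listing earlier in the thread.

```shell
#!/bin/sh
# Sketch: map an absolute 512-byte sector to an fdisk-style (1-based)
# cylinder, then find which partition's cylinder range contains it.
# Both function names are hypothetical helpers.
sector_to_cylinder() {
    sector=$1
    sectors_per_cyl=$2   # 16065 here: 255 heads * 63 sectors/track
    echo $(( sector / sectors_per_cyl + 1 ))
}

# Rows are "name start_cyl end_cyl", copied from the fdisk -l /dev/sda
# output earlier in the thread (the extended container sda2 is left out).
partition_for_cylinder() {
    cyl=$1
    while read -r name start end; do
        if [ "$cyl" -ge "$start" ] && [ "$cyl" -le "$end" ]; then
            echo "$name"
        fi
    done <<EOF
sda1 1 9
sda5 10 133
sda6 134 378
sda7 379 9733
sda8 9734 9857
sda9 9858 10102
sda10 10103 19457
EOF
}

# Sector number taken from the "end_request: I/O error" line above.
cyl=$(sector_to_cylinder 192265944 16065)
echo "cylinder $cyl is in $(partition_for_cylinder "$cyl")"
# prints: cylinder 11969 is in sda10
```

Under those assumptions the reported sector lies inside the range that sda10 occupied, the same partition whose resync was crawling. That is suggestive, not proof that this one sector caused the slowdown.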

----------

## NeddySeagoon

Prospero,

It's probably failed, but it may be just one bad sector. Try to make an image with ddrescue, then write the entire drive with dd.

If the write succeeds, the bad sector(s) will have been remapped to spare sectors/tracks.

It's worth looking at the drive's own error log with smartmontools too.
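The sequence above could look roughly like the following. It is a dry-run sketch that only prints the commands, because the final `dd` destroys everything on the target device. `plan_rescue` is a made-up helper name and the image path is a placeholder; the commands themselves are the standard `smartctl` (from smartmontools), GNU `ddrescue`, and `dd` invocations.

```shell
#!/bin/sh
# Sketch: print the inspect / image / overwrite sequence for a suspect
# drive without executing anything. plan_rescue is a hypothetical helper.
plan_rescue() {
    dev=$1      # the suspect drive, e.g. /dev/sda
    image=$2    # where the rescued image should go (placeholder path)

    # 1. Dump SMART health data and the drive's own error log.
    echo "smartctl -a $dev"
    # 2. Copy whatever is readable; the third argument is ddrescue's map
    #    file, which lets an interrupted run be resumed.
    echo "ddrescue $dev $image $image.map"
    # 3. Overwrite the whole drive so it can remap bad sectors.
    #    DESTRUCTIVE: double-check the device name before running this.
    echo "dd if=/dev/zero of=$dev bs=1M"
}

plan_rescue /dev/sda /mnt/backup/sda.img
```

Given how this thread went, confirming which physical drive a device name currently refers to (for example from the model line in the `smartctl -a` output) before the `dd` step is worth the extra minute.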

----------

## Prospero

Well, one other thing I noticed was that after a reboot, the drive had "forgotten" its partition table, and this was before I ever did anything with dd, so I'm strongly inclined to think it's broken.

However, I downloaded Seagate's diagnostic tool, and I'll run that first - I'll also try your suggestions if no errors come up there.

In any case, thanks for all your help

Edit:

It took me 3 hours to get the diagnostic tool to work, as it couldn't detect the disc; surprisingly, that was also the case when I tried with the Gentoo LiveCD and an Ubuntu CD I had lying around. After plugging the drive into another computer it finally started up, and I was able to test the drive.

Short test: Jumps to 10% complete within 30 seconds, and then hangs. I left it running for an hour with no progress.

Long test: Finishes but finds about 90 errors, which it can fix. Upon reboot, drive not detected.

I'm sending it back; it's obviously broken.

Edit2: Sent back the drive, vendor agreed it was broken, got a new drive - putting solved in thread title

----------

