# SOLVED: Broken mirror - mdadm raid1 problem

## Woolong

Hi,

It's a simple two-disk RAID1 setup. Each disk has 7 partitions, the same sizes on both disks; sda4 and sdb4 are the extended "container" partitions that hold partitions 5-8. I followed the Gentoo softraid guide and used mdadm to create the RAID1 arrays.

```

cat /etc/mdadm.conf

DEVICE /dev/sda* /dev/sdb*

ARRAY /dev/md0 level=raid1 num-devices=2 devices=/dev/sda1,/dev/sdb1

ARRAY /dev/md1 level=raid1 num-devices=2 devices=/dev/sda2,/dev/sdb2

ARRAY /dev/md2 level=raid1 num-devices=2 devices=/dev/sda3,/dev/sdb3

ARRAY /dev/md3 level=raid1 num-devices=2 devices=/dev/sda5,/dev/sdb5

ARRAY /dev/md4 level=raid1 num-devices=2 devices=/dev/sda6,/dev/sdb6

ARRAY /dev/md5 level=raid1 num-devices=2 devices=/dev/sda7,/dev/sdb7

ARRAY /dev/md6 level=raid1 num-devices=2 devices=/dev/sda8,/dev/sdb8

```

The system boots and runs okay, but for some reason the sdb* partitions are not "up".

```

cat /proc/mdstat

Personalities : [raid1]

md1 : active raid1 sda2[0]

      1003968 blocks [2/1] [U_]

md2 : active raid1 sda3[0]

      505920 blocks [2/1] [U_]

md3 : active raid1 sda5[0]

      60556864 blocks [2/1] [U_]

md4 : active raid1 sda6[0]

      8008256 blocks [2/1] [U_]

md5 : active raid1 sda7[0]

      2008000 blocks [2/1] [U_]

md6 : active raid1 sda8[0]

      449664 blocks [2/1] [U_]

md0 : active raid1 sda1[0]

      72192 blocks [2/1] [U_]

```
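Any underscore in the `[UU]` column above means a missing mirror, so degraded arrays can be spotted mechanically. A small sketch, assuming the /proc/mdstat layout shown above (the degraded_arrays name is just for illustration):

```shell
# degraded_arrays: read /proc/mdstat-style text on stdin and print the
# name of every array with a missing mirror (an "_" in the [UU] field).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }      # remember the current array name
        /blocks/ && /_/ { print name }   # status like [U_] or [_U]
    '
}

# Usage: degraded_arrays < /proc/mdstat
```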

The md6 array is supposed to have both sda8 and sdb8, but the detail output below lists the second device as "removed".

```

mdadm /dev/md6

/dev/md6: 439.13MiB raid1 2 devices, 0 spares. Use mdadm --detail for more detail.

/dev/md6: No md super block found, not an md component.

```

```

mdadm --detail /dev/md6

/dev/md6:

        Version : 00.90.01

  Creation Time : Tue Oct 25 16:36:04 2005

     Raid Level : raid1

     Array Size : 449664 (439.20 MiB 460.46 MB)

    Device Size : 449664 (439.20 MiB 460.46 MB)

   Raid Devices : 2

  Total Devices : 1

Preferred Minor : 6

    Persistence : Superblock is persistent

    Update Time : Wed Oct 26 17:00:45 2005

          State : clean, degraded

 Active Devices : 1

Working Devices : 1

 Failed Devices : 0

  Spare Devices : 0

           UUID : e4dd3202:3e737875:5914cfa4:71d8b905

         Events : 0.731

    Number   Major   Minor   RaidDevice State

       0       8        8        0      active sync   /dev/sda8

       1       0        0        -      removed

```

I checked the man page, and tried to get as much info as possible...

```

mdadm --examine /dev/sdb8

/dev/sdb8:

          Magic : a92b4efc

        Version : 00.90.01

           UUID : e4dd3202:3e737875:5914cfa4:71d8b905

  Creation Time : Tue Oct 25 16:36:04 2005

     Raid Level : raid1

   Raid Devices : 2

  Total Devices : 2

Preferred Minor : 6

    Update Time : Tue Oct 25 17:58:53 2005

          State : clean

 Active Devices : 1

Working Devices : 2

 Failed Devices : 1

  Spare Devices : 1

       Checksum : 1e2c541b - correct

         Events : 0.555

      Number   Major   Minor   RaidDevice State

this     2       8       24        2      spare   /dev/sdb8

   0     0       8        8        0      active sync   /dev/sda8

   1     1       0        0        1      faulty removed

   2     2       8       24        2      spare   /dev/sdb8

```

From dmesg, I realized all the sdb* partitions are bound to their arrays at first, then get kicked out for being "non-fresh".

```

md: Autodetecting RAID arrays.

md: autorun ...

md: considering sdb8 ...

md:  adding sdb8 ...

md: sdb7 has different UUID to sdb8

md: sdb6 has different UUID to sdb8

md: sdb5 has different UUID to sdb8

md: sdb3 has different UUID to sdb8

md: sdb2 has different UUID to sdb8

md: sdb1 has different UUID to sdb8

md:  adding sda8 ...

md: sda7 has different UUID to sdb8

md: sda6 has different UUID to sdb8

md: sda5 has different UUID to sdb8

md: sda3 has different UUID to sdb8

md: sda2 has different UUID to sdb8

md: sda1 has different UUID to sdb8

md: created md6

md: bind<sda8>

md: bind<sdb8>

md: running: <sdb8><sda8>

md: kicking non-fresh sdb8 from array!

md: unbind<sdb8>

md: export_rdev(sdb8)

raid1: raid set md6 active with 1 out of 2 mirrors

md: considering sdb7 ...

md:  adding sdb7 ...

md: sdb6 has different UUID to sdb7

md: sdb5 has different UUID to sdb7

md: sdb3 has different UUID to sdb7

md: sdb2 has different UUID to sdb7

md: sdb1 has different UUID to sdb7

md:  adding sda7 ...

md: sda6 has different UUID to sdb7

md: sda5 has different UUID to sdb7

md: sda3 has different UUID to sdb7

md: sda2 has different UUID to sdb7

md: sda1 has different UUID to sdb7

md: created md5

md: bind<sda7>

md: bind<sdb7>

md: running: <sdb7><sda7>

md: kicking non-fresh sdb7 from array!

md: unbind<sdb7>

md: export_rdev(sdb7)

raid1: raid set md5 active with 1 out of 2 mirrors

md: considering sdb6 ...

md:  adding sdb6 ...

md: sdb5 has different UUID to sdb6

md: sdb3 has different UUID to sdb6

md: sdb2 has different UUID to sdb6

md: sdb1 has different UUID to sdb6

md:  adding sda6 ...

md: sda5 has different UUID to sdb6

md: sda3 has different UUID to sdb6

md: sda2 has different UUID to sdb6

md: sda1 has different UUID to sdb6

md: created md4

md: bind<sda6>

md: bind<sdb6>

md: running: <sdb6><sda6>

md: kicking non-fresh sdb6 from array!

md: unbind<sdb6>

md: export_rdev(sdb6)

raid1: raid set md4 active with 1 out of 2 mirrors

md: considering sdb5 ...

md:  adding sdb5 ...

md: sdb3 has different UUID to sdb5

md: sdb2 has different UUID to sdb5

md: sdb1 has different UUID to sdb5

md:  adding sda5 ...

md: sda3 has different UUID to sdb5

md: sda2 has different UUID to sdb5

md: sda1 has different UUID to sdb5

md: created md3

md: bind<sda5>

md: bind<sdb5>

md: running: <sdb5><sda5>

md: kicking non-fresh sdb5 from array!

md: unbind<sdb5>

md: export_rdev(sdb5)

raid1: raid set md3 active with 1 out of 2 mirrors

md: considering sdb3 ...

md:  adding sdb3 ...

md: sdb2 has different UUID to sdb3

md: sdb1 has different UUID to sdb3

md:  adding sda3 ...

md: sda2 has different UUID to sdb3

md: sda1 has different UUID to sdb3

md: created md2

md: bind<sda3>

md: bind<sdb3>

md: running: <sdb3><sda3>

md: kicking non-fresh sdb3 from array!

md: unbind<sdb3>

md: export_rdev(sdb3)

raid1: raid set md2 active with 1 out of 2 mirrors

md: considering sdb2 ...

md:  adding sdb2 ...

md: sdb1 has different UUID to sdb2

md:  adding sda2 ...

md: sda1 has different UUID to sdb2

md: created md1

md: bind<sda2>

md: bind<sdb2>

md: running: <sdb2><sda2>

md: kicking non-fresh sdb2 from array!

md: unbind<sdb2>

md: export_rdev(sdb2)

raid1: raid set md1 active with 1 out of 2 mirrors

md: considering sdb1 ...

md:  adding sdb1 ...

md:  adding sda1 ...

md: created md0

md: bind<sda1>

md: bind<sdb1>

md: running: <sdb1><sda1>

md: kicking non-fresh sdb1 from array!

md: unbind<sdb1>

md: export_rdev(sdb1)

raid1: raid set md0 active with 1 out of 2 mirrors

md: ... autorun DONE.

```

I tried commands like mdadm /dev/md6 -a /dev/sdb8, but it wouldn't bind the partitions on sdb.

Any idea? Please help!
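For reference, the usual mdadm-only sequence for a member that keeps getting kicked as non-fresh is: remove it, wipe its stale superblock, re-add it. A hedged sketch using this thread's device names; it defaults to only printing the commands, since actually running them requires root and a healthy disk:

```shell
# Sketch only: re-add a kicked RAID1 member (md6/sdb8 from this thread).
# DRYRUN=1 (the default) just prints each command instead of running it;
# set DRYRUN=0 and run as root to actually do it.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = 1 ]; then echo "$@"; else "$@"; fi; }

run mdadm /dev/md6 --remove /dev/sdb8   # drop the stale member if still listed
run mdadm --zero-superblock /dev/sdb8   # wipe the old "non-fresh" superblock
run mdadm /dev/md6 --add /dev/sdb8      # re-add; a full resync should start
```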

----------

## bLanark

I've just encountered this too. I thought I'd give this a bump and see if anyone notices it...

----------

## bLanark

I've just discovered this little snippet: 

```

raidhotadd /dev/md0 /dev/hdg1

```

Of course, you'll need to use your own device names in this command.

The rebuild takes some time; you can check progress:

```

# cat /proc/mdstat

Personalities : [raid1] [raid5] [raid6] [raid10]

md0 : active raid1 hdg1[2] hde1[0]

      245111616 blocks [2/1] [U_]

      [>....................]  recovery =  4.1% (10191616/245111616) finish=66.2min speed=59105K/sec

```

I guess I'll know in the morning how it went.

If more than one array has failed, then I'd advise you to rebuild your arrays one at a time, especially if they share physical drives.
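If you want that percentage programmatically, the (done/total) pair in the recovery line can be parsed; a sketch, assuming the format shown above (the kernel truncates its own figure, so the result may differ from mdstat's percentage by 0.1):

```shell
# recovery_pct: pull the (done/total) block pair out of an mdstat
# recovery line on stdin and print the completed percentage.
recovery_pct() {
    awk 'match($0, /\([0-9]+\/[0-9]+\)/) {
        split(substr($0, RSTART + 1, RLENGTH - 2), p, "/")
        printf "%.1f%%\n", 100 * p[1] / p[2]
    }'
}

# Usage: grep recovery /proc/mdstat | recovery_pct
```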

----------

## bLanark

Well, that worked well, and now I have:

```

# cat /proc/mdstat

Personalities : [raid1] [raid5] [raid6] [raid10]

md0 : active raid1 hdg1[1] hde1[0]

      245111616 blocks [2/2] [UU]

```

Yay! 

(Hopefully I'll survive the next reboot too)

----------

## Woolong

Thanks for the reply. After waiting a couple of days without any replies, I gave up on the forums and turned to the Software-RAID HOWTO: http://tldp.org/HOWTO/Software-RAID-HOWTO.html

I don't have raidtools installed, only mdadm. I think the command you listed above is pretty much equivalent to this one:

```

mdadm /dev/md6 --add /dev/sdb8

```

It failed to bind the second device to the mirror.

I thought maybe there was something wrong with the superblock, so I tried to wipe the whole drive with fdisk: I deleted all the partitions and repartitioned the disk the same way as the first one. After a reboot I tried the command above again, and it still wouldn't bind. To my surprise, when I mounted the partitions on sdb, some of the old data was still there!

I had assumed deleting all the partitions with fdisk would wipe the drive. It doesn't: fdisk only rewrites the partition table, so the old data survives.

I'd been presuming the drive was good because it's brand new. Since I have some spare drives, it wouldn't hurt to give them a try. After replacing the drive, the mirror array finally accepted the second device. I haven't had time to verify whether it was a software configuration issue or a hardware one...

I'd like to point out that the Gentoo softraid guide http://www.gentoo.org/doc/en/gentoo-x86-tipsntricks.xml#software-raid does mention you need to install the MBR on both drives if you are mirroring, but the actual instructions aren't there (or I'm too ignorant to find them). Either way, here's what I learnt from the more complete Linux Software-RAID HOWTO:

```

grub

grub>device (hd0) /dev/sdb        (sdb is the second drive of my mirror)

grub>root (hd0,0)       

grub>setup (hd0)

```

----------

## bol

 *Woolong wrote:*   

> 
> 
> I don't have raidtool, only emerged mdadm. I think the command you listed above is pretty much the same as this one:
> 
> ```
> ...

 

I have exactly the same problem: the secondary disc isn't syncing.

Well, one partition syncs, but none of the rest.

Wish me luck...

Thanks

----------

## bol

Now I'm there again...

The discs are not syncing.

And smartctl gives this output:

```
tw0t ~ # smartctl -H /dev/hdc

smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen

Home page is http://smartmontools.sourceforge.net/

SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
```

mdstat:

```
tw0t ~ # cat /proc/mdstat

Personalities : [raid1]

md1 : active raid1 hdc1[2](F) hda1[0]

      56128 blocks [2/1] [U_]

md3 : active raid1 hdc3[2](F) hda3[0]

      5855616 blocks [2/1] [U_]

md5 : active raid1 hdc5[2](F) hda5[0]

      19534912 blocks [2/1] [U_]

md6 : active raid1 hdc6[2](F) hda6[0]

      39061952 blocks [2/1] [U_]

md7 : active raid1 hdc7[2](F) hda7[0]

      11679104 blocks [2/1] [U_]

md8 : active raid1 hdc8[2](F) hda8[0]

      979840 blocks [2/1] [U_]

```

And dmesg is giving this:

```

end_request: I/O error, dev hdc, sector 112191

raid1: hdc1: rescheduling sector 112128

end_request: I/O error, dev hdc, sector 112193

raid1: hdc1: rescheduling sector 112130

end_request: I/O error, dev hdc, sector 112195

raid1: hdc1: rescheduling sector 112132

end_request: I/O error, dev hdc, sector 112197

raid1: hdc1: rescheduling sector 112134

end_request: I/O error, dev hdc, sector 112191

end_request: I/O error, dev hdc, sector 112191

raid1: Disk failure on hdc1, disabling device.

        Operation continuing on 1 devices

raid1: hda1: redirecting sector 112128 to another mirror

raid1: hda1: redirecting sector 112130 to another mirror

raid1: hda1: redirecting sector 112132 to another mirror

raid1: hda1: redirecting sector 112134 to another mirror

RAID1 conf printout:

 --- wd:1 rd:2

 disk 0, wo:0, o:1, dev:hda1

 disk 1, wo:1, o:0, dev:hdc1

RAID1 conf printout:

 --- wd:1 rd:2

 disk 0, wo:0, o:1, dev:hda1

end_request: I/O error, dev hdc, sector 112319

Buffer I/O error on device hdc1, logical block 14032

end_request: I/O error, dev hdc, sector 112319

Buffer I/O error on device hdc1, logical block 14032

end_request: I/O error, dev hdc, sector 112439

end_request: I/O error, dev hdc, sector 112439

printk: 2 messages suppressed.

Buffer I/O error on device hdc2, logical block 244960

end_request: I/O error, dev hdc, sector 2072135

end_request: I/O error, dev hdc, sector 2072367

end_request: I/O error, dev hdc, sector 2072367

printk: 3 messages suppressed.

Buffer I/O error on device hdc3, logical block 11711232

end_request: I/O error, dev hdc, sector 13783618

Buffer I/O error on device hdc3, logical block 11711233

end_request: I/O error, dev hdc, sector 13783619

Buffer I/O error on device hdc3, logical block 11711234

end_request: I/O error, dev hdc, sector 13783620

Buffer I/O error on device hdc3, logical block 11711235

end_request: I/O error, dev hdc, sector 13783621

Buffer I/O error on device hdc3, logical block 11711236

end_request: I/O error, dev hdc, sector 13783622

Buffer I/O error on device hdc3, logical block 11711237

end_request: I/O error, dev hdc, sector 13783623

Buffer I/O error on device hdc3, logical block 11711238

end_request: I/O error, dev hdc, sector 13783624

end_request: I/O error, dev hdc, sector 13783617

end_request: I/O error, dev hdc, sector 13783618

end_request: I/O error, dev hdc, sector 13783619

........
```

I guess the disk is f*cked, right?
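Counting how many distinct sectors fail per device gives a rough feel for whether this is a few relocatable blocks or a dying drive. A sketch that tallies end_request lines like the ones above (the bad_sectors name is illustrative):

```shell
# bad_sectors: count distinct failing sectors per device from
# "end_request: I/O error, dev X, sector N" lines on stdin.
bad_sectors() {
    awk '/end_request: I\/O error/ {
        dev = $5; sub(/,$/, "", dev)            # "hdc," -> "hdc"
        key = dev SUBSEP $7
        if (!(key in seen)) { seen[key] = 1; n[dev]++ }
    }
    END { for (d in n) print d, n[d], "distinct bad sectors" }'
}

# Usage: dmesg | bad_sectors
```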

----------

## bol

*bump*

----------

