# Software RAID6, not finding 2 of the 6 disks.  Ideas?

## DingbatCA

Have a RAID 6 array, running 6 drives.  When the system boots, it only finds 4 of the 6 drives. It is missing sde1 and sdf1.  The hardware is in good shape with no errors.

```
cat /proc/mdstat
md3 : active raid6 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1171909120 blocks level 6, 64k chunk, algorithm 2 [6/4] [UUUU__]
```

I have to stop the array and restart it with all 6 disks.

```
mdadm --stop /dev/md3
mdadm --assemble /dev/md3 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
mdadm: /dev/md3 has been started with 6 drives.
cat /proc/mdstat
md3 : active raid6 sda1[0] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1]
      1171909120 blocks level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
```

I have tried a resync, and removing and re-adding the disks.  The partitions are all set up as "Linux raid autodetect".  When I look at the logs after a boot, I get this:

```
md: considering sdf1 ...
md:  adding sdf1 ...
md:  adding sde1 ...
md: md3 already running, cannot run sdf1
md: export_rdev(sde1)
md: export_rdev(sdf1)
md: ... autorun DONE.
```

What am I missing?  How do I fix this?
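
For reference, the `[6/4]` in the mdstat output above means the array expects 6 members and only 4 are active. A minimal sketch for spotting that state after a boot follows; the function name `degraded_arrays` is hypothetical, and it only assumes the mdstat layout shown in this post.

```shell
# degraded_arrays (hypothetical helper): read /proc/mdstat-style text on stdin
# and print the name of every array with fewer active members than expected.
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md3 : active raid6 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The status line carries "[expected/active]", e.g. "[6/4]"
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name
        }
    '
}
```

Usage would be `degraded_arrays < /proc/mdstat`; with the output above it prints `md3`.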

----------

## drescherjm

What do you have in your mdadm.conf?

----------

## DingbatCA

yes, and mdadm is part of the init scripts.  The problem is before that.  It happens during the kernel load.

```
mdadm --examine --scan > /etc/mdadm.conf
```

----------

## anonybosh

Did you not include sde1 and sdf1 when you originally created the array? If so, I think that all you need to do is 'hotadd' them to the array:

`# mdadm --manage --add /dev/md3 /dev/sde1 /dev/sdf1`

----------

## drescherjm

*DingbatCA wrote:*

> yes, and mdadm is part of the init scripts.  The problem is before that.  It happens during the kernel load.
>
> ```
> mdadm --examine --scan > /etc/mdadm.conf
> ...
> ```

Did you add this line yourself, or is it there by default? I have never knowingly done that.

I know my mdadm.conf (on several systems at work) is not being overwritten on each boot.

----------

## DingbatCA

The examine line was used once to build my mdadm.conf.  mdadm is part of the rc's "mdadm | boot"

I just finished a rebuild trying the "mdadm --manage --add /dev/md3 /dev/sde1 /dev/sdf1"

Rebooted and still have the same issue, two disks missing.  And I still have the message in the logs.

```
md: considering sdf1 ...
md:  adding sdf1 ...
md:  adding sde1 ...
md: md3 already running, cannot run sdf1
md: export_rdev(sde1)
md: export_rdev(sdf1)
md: ... autorun DONE.
```

I can stop the array and restart it with all the disks.  No problems.  This array was built over a year ago, and this is the first time I have had any issues.

----------

## drescherjm

 *Quote:*   

> The examine line was used once to build my mdadm.conf. 

 

So does the DEVICE line in mdadm.conf have all the drives?

Here is my complete mdadm.conf from one server (well, minus the comment at the top):

```
DEVICE /dev/sd[abcdef]1
DEVICE /dev/sd[abcdef]3
```

Also, are these drives on a different controller than the other drives? Is the driver for that other controller compiled into the kernel or built as a module? I had a problem in the past where mdadm would start before the driver for the SATA controller was loaded.

----------

## DingbatCA

The mdadm.conf does not matter.  All of this is happening before the system reaches init.  This is all kernel level stuff.  But, yes, the device line is in there. I am stuck on this one.

----------

## drescherjm

 *Quote:*   

> The mdadm.conf does not matter.

 

I believe it can if you built your kernel with genkernel and mdadm support, which puts your mdadm.conf in your initrd.

What about your drive controller? All 6 drives on the same controller?

----------

## DingbatCA

6 drives on 2 controllers.  2 and 4.  The two problem drives are on the 4 port controller.  It does not look to be a hardware issue. 

Kernel is built by hand.  No genkernel, and no initrd.

----------

## drescherjm

Are all the drives listed when you do an mdadm -E on ALL array members?

```
# mdadm -E /dev/sda3
/dev/sda3:
          Magic : a92b4efc
        Version : 00.90.03
           UUID : 05237c00:b934f79e:db41f949:13571be1
  Creation Time : Thu Jun 15 00:13:10 2006
     Raid Level : raid6
  Used Dev Size : 11727360 (11.18 GiB 12.01 GB)
     Array Size : 46909440 (44.74 GiB 48.04 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 1
    Update Time : Wed Apr 23 14:13:01 2008
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : e38ab4e9 - correct
         Events : 0.739150
     Chunk Size : 256K
      Number   Major   Minor   RaidDevice State
this     0       8        3        0      active sync   /dev/sda3
   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
   3     3       8       51        3      active sync   /dev/sdd3
   4     4       8       67        4      active sync   /dev/sde3
   5     5       8       83        5      active sync   /dev/sdf3
```
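
To compare all members quickly, the UUID and Events fields can be pulled out of each `mdadm -E` report. This is only a rough sketch: `examine_summary` is a hypothetical helper, and it assumes the `Field : value` layout shown in the output above.

```shell
# examine_summary (hypothetical helper): read `mdadm -E` output on stdin and
# print the array UUID and the Events counter on one line.
examine_summary() {
    awk -F' : ' '
        # Trim the padding around the field name, e.g. "         Events"
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }
        key == "UUID"   { uuid = $2 }
        key == "Events" { events = $2 }
        END { print uuid, events }
    '
}
```

Something like `for d in /dev/sd[abcdef]1; do printf '%s ' "$d"; mdadm -E "$d" | examine_summary; done` would then print one line per member, and a member with a different UUID or a lagging Events value stands out.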

----------

## DingbatCA

Explain this one!?

```
[root@blackqueen ~]# mdadm -E /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 21721e25:f5b01083:71c26e11:8475d0d8
  Creation Time : Thu Apr 17 10:04:59 2008
     Raid Level : raid6
    Device Size : 292977280 (279.40 GiB 300.01 GB)
     Array Size : 1171909120 (1117.62 GiB 1200.03 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 3
    Update Time : Wed Apr 23 12:51:17 2008
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 5814a095 - correct
         Events : 0.10384
     Chunk Size : 64K
      Number   Major   Minor   RaidDevice State
this     0       8        1        0      active sync   /dev/sda1
   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       17        1      active sync   /dev/sdb1
   2     2       8       33        2      active sync   /dev/sdc1
   3     3       8       49        3      active sync   /dev/sdd1
   4     4       8       65        4      active sync   /dev/sde1
   5     5       8       81        5      active sync   /dev/sdf1
[root@blackqueen ~]# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md0 : active raid1 hdc1[1] hda1[0]
      125376 blocks [2/2] [UU]

md1 : active raid1 hdc2[1] hda2[0]
      1000320 blocks [2/2] [UU]

md3 : active raid6 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1171909120 blocks level 6, 64k chunk, algorithm 2 [6/4] [UUUU__]

md2 : active raid1 hdc3[1] hda3[0]
      10000256 blocks [2/2] [UU]

unused devices: <none>
```

Last edited by DingbatCA on Wed Apr 23, 2008 6:31 pm; edited 1 time in total

----------

## drescherjm

Is that the same for the rest of the drives? I mean do all drives show that all 6 are active?

 *Quote:*   

> Explain this one!?

 

I do not know...

Do you have an older version of mdadm? What kernel are you using?

In this case I am using a 2.6.21 xen kernel and mdadm-2.6.4

```
# uname -a
Linux datastore1 2.6.21-xen #3 SMP Mon Mar 31 16:17:04 EDT 2008 x86_64 AMD Athlon(tm) 64 Processor 3200+ AuthenticAMD GNU/Linux
# equery list mdadm
[ Searching for package 'mdadm' in all categories among: ]
 * installed packages
[I--] [  ] sys-fs/mdadm-2.6.4 (0)
```

----------

## DingbatCA

`Linux blackqueen 2.6.24-gentoo-r5 #6 SMP Wed Apr 16 22:30:32 PDT 2008 x86_64 x86_64 x86_64 GNU/Linux`

And I have an older version of mdadm:

`mdadm-2.5.4-3`

----------

## DingbatCA

Well... I fixed it...

I had to open up the box to figure out whether the 2 port controller was running the first two or the last two drives.  It was running the last two (sde, sdf).  The driver for the 2 port sil card was built as a module, not compiled into the kernel, so those two drives were presumably not visible yet when the kernel autodetected md3 at boot.  Recompiled with the driver built into the kernel, no more problems...  I feel like such a n00b!
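
For anyone hitting the same thing, here is a rough sketch of how to check whether a controller driver is built in or a module. The helper name `driver_build_mode` is hypothetical, and `CONFIG_SATA_SIL` is only an assumed symbol: the thread does not say which Silicon Image driver the card uses (it could just as well be `CONFIG_SATA_SIL24`), and the `.config` path depends on where the kernel source lives.

```shell
# driver_build_mode (hypothetical helper): read a kernel .config on stdin and
# report how the given symbol is configured.
# $1: config symbol, e.g. CONFIG_SATA_SIL (assumed; check which driver your card uses)
driver_build_mode() {
    sym=$1
    awk -F= -v sym="$sym" '
        $1 == sym { mode = $2 }
        END {
            if (mode == "y")      print "built-in"
            else if (mode == "m") print "module"
            else                  print "not set"
        }
    '
}
```

For example, `driver_build_mode CONFIG_SATA_SIL < /usr/src/linux/.config` prints `module` when the driver is built as a module. With no initrd, kernel-level RAID autodetect can only see disks whose controller driver reports `built-in`.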

----------

