# [Half-Solved] Raid not found at boot, but "mdadm -A -s" ok

## sglorz

Hello,

I have 3 disks on LSI SAS card in software RAID5.

Here is a sample of dmesg:

```
[    1.577026] ioc0: LSISAS1068 B0: Capabilities={Initiator}
[    3.994315] scsi6 : ioc0: LSISAS1068 B0, FwRev=000a3300h, Ports=1, MaxQ=286, IRQ=16
[    4.032140] mptsas: ioc0: attaching sata device: fw_channel 0, fw_id 1, phy 1, sas_addr 0x1221000001000000
[    4.040211] scsi 6:0:0:0: Direct-Access     ATA      WDC WD15EARS-00Z 0A80 PQ: 0 ANSI: 5
[    4.042340] sd 6:0:0:0: Attached scsi generic sg3 type 0
[    4.042797] sd 6:0:0:0: [sdd] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
[    4.043975] mptsas: ioc0: attaching sata device: fw_channel 0, fw_id 2, phy 2, sas_addr 0x1221000002000000
[    4.051387] scsi 6:0:1:0: Direct-Access     ATA      WDC WD15EARS-00Z 0A80 PQ: 0 ANSI: 5
[    4.053469] sd 6:0:1:0: Attached scsi generic sg4 type 0
[    4.053927] sd 6:0:1:0: [sde] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
[    4.055107] mptsas: ioc0: attaching sata device: fw_channel 0, fw_id 3, phy 3, sas_addr 0x1221000003000000
[    4.059475] sd 6:0:0:0: [sdd] Write Protect is off
[    4.059667] sd 6:0:0:0: [sdd] Mode Sense: 73 00 00 08
[    4.062528] scsi 6:0:2:0: Direct-Access     ATA      WDC WD15EARS-00M AB51 PQ: 0 ANSI: 5
[    4.064729] sd 6:0:2:0: Attached scsi generic sg5 type 0
[    4.065186] sd 6:0:2:0: [sdf] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
[    4.066547] sd 6:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    4.066702] ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
[    4.066714] firewire_ohci 0000:01:07.0: PCI INT A -> Link[APC3] -> GSI 18 (level, low) -> IRQ 18
[    4.069354] sd 6:0:1:0: [sde] Write Protect is off
[    4.069546] sd 6:0:1:0: [sde] Mode Sense: 73 00 00 08
[    4.076283] sd 6:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    4.081242] sd 6:0:2:0: [sdf] Write Protect is off
[    4.081441] sd 6:0:2:0: [sdf] Mode Sense: 73 00 00 08
[    4.088100] sd 6:0:2:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
```

At boot time, the kernel cannot find the RAID5:

```
[    4.599466] md: Waiting for all devices to be available before autodetect
[    4.599655] md: If you don't use raid, use raid=noautodetect
[    4.600034] md: Autodetecting RAID arrays.
[    4.600219] md: Scanned 0 and added 0 devices.
[    4.600403] md: autorun ...
[    4.600586] md: ... autorun DONE.
```

But at the end of boot, mdadm is fine:

```
bach / # mdadm -A -s
mdadm: /dev/md0 has been started with 3 drives.
```

```
bach / # cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid5 sdd[0] sdf[2] sde[1]
      2930276992 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>
```

And here is mdadm.conf:

```
bach / # cat /etc/mdadm.conf
ARRAY /dev/md0 level=raid5 num-devices=3 metadata=0.90 UUID=0d357e33:2a6bebb6:3f069d62:12d82675
   devices=/dev/sdd,/dev/sde,/dev/sdf
```

How can I resolve this?

Thanks

*Last edited by sglorz on Sat Jun 11, 2011 9:17 pm; edited 1 time in total*

----------

## BradN

Are your partitions containing the raid data set to the raid auto-detect partition type (value FD)?
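For an MBR-labelled disk, the type byte can be checked and changed from the shell. A sketch using the old sfdisk syntax; the device and partition number are just examples:

```shell
# Print the partition type of partition 1 on /dev/sdd
# ("fd" = Linux raid autodetect)
sfdisk --print-id /dev/sdd 1

# Change it to fd -- only the type byte in the partition table is
# rewritten; the data inside the partition is untouched
sfdisk --change-id /dev/sdd 1 fd
```

(GPT disks have no MBR type byte, so this does not apply to them.)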

----------

## sglorz

That's what I was thinking of.

My disks have GPT partition tables.

parted reports this:

```
bach ~ # parted -l
Error: /dev/sdd: unrecognised disk label
Error: /dev/sde: unrecognised disk label
Error: /dev/sdf: unrecognised disk label
Error: partition length of 5860553984 sectors exceeds the loop-partition-table-imposed maximum of 4294967295
```

So I don't know whether the raid (fd) flag is set, and I don't like the "unrecognised disk label", but since the disks are RAID members it may be OK.

----------

## NeddySeagoon

sglorz,

A piece of my /proc/mdstat reads

```
md5 : active raid5 sdd5[3] sdb5[1] sdc5[2] sda5[0]
      15759360 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
```

Yours says

```
md0 : active raid5 sdd[0] sdf[2] sde[1]
      2930276992 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
```

You have donated whole unpartitioned drives to the raid set.  There is no partition table to hold the type FD that asks the kernel to auto-assemble the raid set.  You must use an initrd with mdadm if your root is on that raid set.

Traps for the unwary: the kernel will only auto-assemble raid sets with raid superblock version 0.90.  The default now is version 1.2.

If your /boot is raided, it must be raid1 with superblock version 0.90 or grub will not start.
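If you do want kernel auto-assembly, the array has to be created with the old superblock explicitly. A sketch (destructive: --create writes new superblocks to the named partitions, so only when building a new array):

```shell
# Create a 3-disk raid5 with the 0.90 superblock, which the kernel
# autodetect code (and grub, for a /boot array) can recognise
mdadm --create /dev/md0 --metadata=0.90 --level=5 --raid-devices=3 \
        /dev/sdd1 /dev/sde1 /dev/sdf1

# Check which superblock version an existing member carries
mdadm --examine /dev/sdd1 | grep Version
```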

----------

## sglorz

My /boot is not raided and is on another disk (/dev/sdc1).

So, do I have to use an initrd, or maybe call "mdadm -A -s" at boot time?

----------

## NeddySeagoon

sglorz,

Provided your /etc/mdadm.conf describes your raid sets, just add mdadm to the boot runlevel. It will start the raid before /etc/fstab is fed to mount to mount your filesystems.

```
rc-update add mdadm boot
```

----------

## sglorz

I already did that, but it doesn't work   :Crying or Very sad: 

```
bach e-sata # rc-update
mdadm |         boot
```

Moreover, the /etc/init.d/mdadm script seems to launch the mdadm monitor rather than assemble the raid arrays:

```
bach etc # cat /etc/init.d/mdadm
#!/sbin/runscript
# Copyright 1999-2006 Gentoo Technologies, Inc.
# Distributed under the terms of the GNU General Public License v2
# $Header: /var/cvsroot/gentoo-x86/sys-fs/mdadm/files/mdadm.rc,v 1.2 2006/04/25 05:41:51 vapier Exp $

depend() {
        use logger dns net
}

start() {
        ebegin "Starting mdadm monitor"
        mdadm --monitor --scan \
                --daemonise \
                --pid-file /var/run/mdadm.pid \
                ${MDADM_OPTS}
        eend $?
}

stop() {
        local ret
        ebegin "Stopping mdadm monitor"
        start-stop-daemon --stop --pidfile /var/run/mdadm.pid
        ret=$?
        rm -f /var/run/mdadm.pid
        eend ${ret}
}
```

So, should I add an mdadm-assemble script, or is there a nicer way to do that?

----------

## NeddySeagoon

sglorz,

--scan says to assemble your raid using  /etc/mdadm.conf

Please post your  /etc/mdadm.conf

----------

## sglorz

Here is my mdadm.conf:

```
bach / # cat /etc/mdadm.conf
# mdadm configuration file
#
# mdadm will function properly without the use of a configuration file,
# but this file is useful for keeping track of arrays and member disks.
# In general, a mdadm.conf file is created, and updated, after arrays
# are created. This is the opposite behavior of /etc/raidtab which is
# created prior to array construction.
#
#
# the config file takes two types of lines:
#
#       DEVICE lines specify a list of devices of where to look for
#         potential member disks
#
#       ARRAY lines specify information about how to identify arrays so
#         so that they can be activated
#
# You can have more than one device line and use wild cards. The first
# example includes SCSI the first partition of SCSI disks /dev/sdb,
# /dev/sdc, /dev/sdd, /dev/sdj, /dev/sdk, and /dev/sdl. The second
# line looks for array slices on IDE disks.
#
#DEVICE /dev/sd[bcdjkl]1
#DEVICE /dev/hda1 /dev/hdb1
#
# If you mount devfs on /dev, then a suitable way to list all devices is:
#DEVICE /dev/discs/*/*
#
#
# The AUTO line can control which arrays get assembled by auto-assembly,
# meaing either "mdadm -As" when there are no 'ARRAY' lines in this file,
# or "mdadm --incremental" when the array found is not listed in this file.
# By default, all arrays that are found are assembled.
# If you want to ignore all DDF arrays (maybe they are managed by dmraid),
# and only assemble 1.x arrays if which are marked for 'this' homehost,
# but assemble all others, then use
#AUTO -ddf homehost -1.x +all
#
# ARRAY lines specify an array to assemble and a method of identification.
# Arrays can currently be identified by using a UUID, superblock minor number,
# or a listing of devices.
#
#       super-minor is usually the minor number of the metadevice
#       UUID is the Universally Unique Identifier for the array
# Each can be obtained using
#
#       mdadm -D <md>
#
#ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
#ARRAY /dev/md1 super-minor=1
#ARRAY /dev/md2 devices=/dev/hda1,/dev/hdb1
#
# ARRAY lines can also specify a "spare-group" for each array.  mdadm --monitor
# will then move a spare between arrays in a spare-group if one array has a failed
# drive but no spare
#ARRAY /dev/md4 uuid=b23f3c6d:aec43a9f:fd65db85:369432df spare-group=group1
#ARRAY /dev/md5 uuid=19464854:03f71b1b:e0df2edd:246cc977 spare-group=group1
#
# When used in --follow (aka --monitor) mode, mdadm needs a
# mail address and/or a program.  This can be given with "mailaddr"
# and "program" lines to that monitoring can be started using
#    mdadm --follow --scan & echo $! > /var/run/mdadm
# If the lines are not found, mdadm will exit quietly
#MAILADDR root@mydomain.tld
#PROGRAM /usr/sbin/handle-mdadm-events

ARRAY /dev/md0 level=raid5 num-devices=3 metadata=0.90 UUID=0d357e33:2a6bebb6:3f069d62:12d82675
   devices=/dev/sdd,/dev/sde,/dev/sdf
```

----------

## NeddySeagoon

sglorz, 

```
ARRAY /dev/md0 level=raid5 num-devices=3 metadata=0.90 UUID=0d357e33:2a6bebb6:3f069d62:12d82675
   devices=/dev/sdd,/dev/sde,/dev/sdf
```

That's correct if it's a single line, but from your post, it's not clear.
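For what it's worth, the mdadm.conf parser treats a line that begins with whitespace as a continuation of the previous line, so the two-line form should parse as one. A quick sketch of that joining rule (the temp file path is just for illustration):

```shell
# Write the two-line ARRAY entry to a scratch file
cat > /tmp/mdadm.conf.test <<'EOF'
ARRAY /dev/md0 level=raid5 num-devices=3 metadata=0.90 UUID=0d357e33:2a6bebb6:3f069d62:12d82675
   devices=/dev/sdd,/dev/sde,/dev/sdf
EOF

# Join continuation lines (leading whitespace) onto the line before them,
# the same way mdadm's config parser reads them
awk '/^[ \t]/ { sub(/^[ \t]+/, ""); printf " %s", $0; next }
     NR > 1  { print "" }
     { printf "%s", $0 }
     END { print "" }' /tmp/mdadm.conf.test
```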

----------

## BradN

 *Quote:*   

> Traps for the unwary. The kernel will only auto assemble raid sets with raid superblock version 0.90. The default now is version 1.2
> 
> If your /boot is raided, it must be raid1 with superblock version 0.90 or grub will not start.

 

Oh nice, another pointless issue to deal with... Why do they make a new superblock version the default without the kernel's autodetection and grub knowing about it?  I can understand having it as an option for testing, but damn, people don't think things like that through as often or as thoroughly as they should.

Well, OK, maybe we can ignore grub in the event that newer grub versions do handle it (I have no idea); after all, not all Linux is Gentoo, and lots use newer grub or maybe even other bootloaders.  But the kernel?  I mean, really?  If I were in charge of the mdadm ebuild I would patch in a 0.90 default and print a warning message; that way most people would know what's up and the terminally unaware wouldn't hit as many problems.

----------

## sglorz

 *NeddySeagoon wrote:*   

> sglorz,
> 
> --scan says to assemble your raid using  /etc/mdadm.conf
> 
> Please post your  /etc/mdadm.conf

 

No, the --scan is not for scanning disks:

```
bach ~ # mdadm --monitor --help
Usage: mdadm --monitor options devices
[...]
Options that are valid with the monitor (-F --follow) mode are:
[...]
  --scan        -s   : find mail-address/program in config file
[...]
```

----------

## sglorz

 *NeddySeagoon wrote:*   

> sglorz, 
> 
> ```
> ARRAY /dev/md0 level=raid5 num-devices=3 metadata=0.90 UUID=0d357e33:2a6bebb6:3f069d62:12d82675
> ...
> ```

 

This line is coming from  

```
mdadm --detail --scan --verbose
```

But, I may have a solution:

```
mdadm --daemonise /dev/md0
```

I'll tell you if it helps me.

----------

## sglorz

OK, it does not work.

So I've added an init script until I find a better solution.

Here is the script:

```
bach ~ # cat /etc/init.d/mdadm-assemble
#!/sbin/runscript
# Distributed under the terms of the GNU General Public License v2

depend() {
        before localmount
}

start() {
        ebegin "Assembling all RAID"
        mdadm --assemble -s
        eend $?
}

stop() {
        ebegin "Stopping all RAID"
        mdadm --stop -s
        eend $?
}
```

Added at boot time:

```
rc-update add mdadm-assemble boot 
```

----------

## tox2ik

I experienced a similar problem after upgrading the kernel from 2.6.31 to 2.6.38 (gentoo-sources amd64). During boot, the kernel or the system (I suspect the kernel) tried to assemble my raid0 array without luck. What happened was that it started a different /dev/md* device for every member disk, with the result that none of them would start. This was clearly evident after a quick look at /proc/mdstat. In a naive attempt to solve the problem I tried disabling the kernel option CONFIG_MD_AUTODETECT. This did not help: after booting into single or multiuser mode the arrays were still partially assembled and member disks were assigned to different /dev/md* devices. I also tried disabling the various boot-time options, including domdadm, dodmraid and dolvm (because I used them before with an initramfs), to make sure the genkernel-included linuxrc was not causing the problem. I'm booting without an initramfs, mind you.

On the bright side:

I was able to assemble the array with mdadm -As after stopping all partially detected arrays with mdadm -S /dev/md*. This works well because my /etc/mdadm.conf contains correct information about the array. 

And to force the array to be assembled during boot I added a line to /lib/rcscripts/addons/raid-start.sh:

```
# Start software raid with mdadm
if [ -x /sbin/mdadm ] ; then
        ebegin "Starting up RAID devices"
--->    mdadm -S /dev/md*
        output=$(mdadm -As 2>&1)
        ret=$?
        [ ${ret} -ne 0 ] && echo "${output}"
        eend ${ret}
fi
```

I'm not going to recommend this solution to anyone, but it works for me at least. I'd be glad if anyone could shed some light on this and explain why the kernel (or whatever) fails to detect the arrays. Should I perhaps set the partition type to FD as described above? It seems weird that I would need to, because I never needed it before.

----------

## NeddySeagoon

Kernel raid auto assembly is going away some time soon.

We will all have to move to the initrd solution if we have root on raid.

For new installs, I've moved to GPT, a raid1 /boot and everything else in a lvm2 volume.

I'm still avoiding grub2 though.

Other than raid, the same thing works for single drive installs.

----------

## BradN

 *NeddySeagoon wrote:*   

> Kernel raid auto assembly is going away some time soon.
> 
> We will all have to move to the initrd solution if we have root on raid.

 

Aha, so this explains why the kernel was never updated for the new superblock format.

----------

## dahoste

Just posting a 'me too' reply.   I got burnt on a recent massive update (triggered by the openrc upgrade).  Upon reboot, all of my 0.90 arrays came online, but I had one 1.0 array that behaved as described by tox2ik.  I recovered by using his 'not recommended' hack on raid-start.sh.

I'd love to know if there's a more appropriate (upstream compliant) means of addressing this issue.  I'm going to be using a lot more 1.*-based arrays, and would very much prefer to comply with a best-practice solution for bootstrapping them in gentoo.  Thanks!

----------

## NeddySeagoon

dahoste,

Upstream want you to use mdadm in an initrd.  That works even for root on raid.

----------

## dahoste

Hey NeddySeagoon, thanks for the reply.

But what exactly does that mean, in gentoo-land?   I already had (what I thought was) a thoroughly specified mdadm.conf.

And with baselayout2, mdraid is hooked into the boot runlevel (that's what actually invokes the 'mdadm -As' step).

My confusion stems from the fact that I thought mdraid (via raid-start.sh) was indeed the source of array activation, and thus I didn't (and still don't) understand why my 1.0-based array didn't activate properly, despite being declared in mdadm.conf.

Any insight you can provide will be most welcome.

----------

## reavertm

On a somewhat unrelated and funny (or less funny, for me) note: while my two 0.90 RAID1 arrays are assembled fine with the 2.6.38 kernel, they stop working as soon as I switch to wireless-testing kernel 3.0.0-rc4.

Nearly the same kernel config, both built using genkernel (MDADM="yes" LVM="yes"), both booting from initrd with 'domdadm dolvm' params.

In the kernel-3.0.0-rc4 case, running mdadm -A (by the initrd's domdadm itself, or manually in busybox) ends up with 'Invalid argument' (mind you, with exactly the same /etc/mdadm.conf).

May I kindly ask, WTF? :)

Was 0.90 metadata support silently removed, or what?

----------

## dahoste

Sorry, NeddySeagoon, disregard my question.  I wasn't reading closely enough.  I see that an initrd-based boot is a completely different beast, and something about which I'll need to go educate myself thoroughly, before attempting.

[sigh]...  you'd think system setup would get *easier* over time, but this stuff seems more convoluted than ever.

----------

## Hu

 *dahoste wrote:*   

> [sigh]...  you'd think system setup would get *easier* over time, but this stuff seems more convoluted than ever.

 There is a general push to move as much policy out of the kernel as possible.  This makes it much easier to do fancy things without the kernel guessing incorrect defaults.  However, one of the unfortunate consequences of this is that anyone who relied on the automatic defaults needs to take action to keep the system working.  For what it is worth, you should be able to write an initramfs that works with the future kernel and also works with the auto-detection enabled kernels you have previously used.
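For what it is worth, the initramfs itself can be small. A sketch of a root-on-raid /init, assuming busybox, mdadm and mdadm.conf have been copied into the image and root lives on /dev/md0 (all of these names are assumptions; tools like genkernel automate this):

```shell
#!/bin/busybox sh
# Minimal initramfs /init sketch for root on raid.

/bin/busybox --install -s /bin        # populate /bin with applet symlinks
mount -t proc  proc  /proc            # mdadm and the kernel need these
mount -t sysfs sysfs /sys

mdadm --assemble --scan               # assemble everything in /etc/mdadm.conf

mount -o ro /dev/md0 /newroot         # mount the raid as the new root
umount /sys /proc
exec switch_root /newroot /sbin/init  # hand over to the real init
```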

----------

## dahoste

Hey Hu,

I'm certainly not objecting to the migration of general system configuration *out* of the kernel.

It's the "write an initramfs" part that frustrates.  The consequence is that now instead of just having to understand mdadm, the casual sysadmin has to completely refactor how their system boots, just to be able to enjoy the benefits of raid.   Oh well.    :Sad: 

----------

## richard.scott

If you're using the latest genkernel (v3.4.16), the old way of booting root on RAID won't work anymore    :Shocked: 

You need to do this to get it all to work:

```
# mdadm --detail --scan >> /etc/mdadm.conf
# genkernel --mdadm --mdadm-config=/etc/mdadm.conf ramdisk
```

to build your initramfs with mdadm support.

Oh, and don't forget to add the domdadm option to your kernel boot args, as follows:

```
title=Gentoo (2.6.39-hardened-r7)
root (hd0,0)
kernel /kernel-genkernel-x86_64-2.6.39-hardened-r7 root=/dev/ram0 init=/linuxrc real_root=/dev/md2 vga=792 domdadm
initrd /initramfs-genkernel-x86_64-2.6.39-hardened-r7
```

Hope this helps someone.

Rich

----------

## sglorz

Yes it helps! Thanks!

----------

## wildbug

 *NeddySeagoon wrote:*   

>  The kernel will only auto assemble raid sets with raid superblock version 0.90.  The default now is version 1.2
> 
> If your /boot is raided, it must be raid1 with superblock version 0.90 or grub will not start.

 

I don't think this is true.  I know I've read that, too (/usr/src/linux/Documentation/md.txt, IIRC), but I have 0.90, 1.1, and 1.2 arrays on my system, including unpartitioned volumes (i.e., /dev/sdk, not /dev/sdk1, so no 0xFD), and all are auto-assembled.

----------

## wildbug

Hmm, I might be wrong about my previous post.  md reports that it's looking for 0.90 superblocks when it assembles my root device; the other devices aren't assembled until about 11 seconds later (dmesg).  Is OpenRC responsible for that?  I'm not using /etc/init.d/mdadm.

Anyway...

There's another way to start md devices at boot other than by autodetection.  You can supply the md= kernel parameter.  I.e.,

```
kernel /linux-2.6.39 root=/dev/md127 md=127,/dev/sda,/dev/sdb,/dev/sdc raid=noautodetect
```

I think this works with all superblock versions.  See http://www.kernel.org/doc/Documentation/md.txt


