# [Solved] Raid devices suddenly won't assemble

## grooveman

Hi, I did a very minor world update last night; I rebooted, and suddenly my system won't boot anymore.

My system has been working well for over a year with md RAID in a mirror, with two volume groups.  One of those VGs holds / .

Now when I boot, I get:

> No volume groups found.
> 
> block device /dev/vg/slash is not a valid root device.
> 
> could not find the root block device in .

My initramfs hasn't changed in weeks, neither has my grub.conf or mdadm.conf.

I can enter the shell (ash), and confirm that the md devices have not been assembled.  I can assemble them there manually.  However, there do not seem to be any vg commands available in that limited environment, so I cannot assemble the volume groups.

I'm using baselayout 2.0.3, and I do have lvm set to start at boot.

When I boot to a live cd, I can assemble and mount everything without a problem.

Any help is appreciated, thank you.

(Sorry if this post is terse or stilted, but my isp is down and I am posting this with my phone.)

----------

## NeddySeagoon

grooveman,

If you have root on lvm on raid, your initrd must both assemble your raid and start lvm.

You may be able to use kernel autoassembly to bring up the raid sets, if you were careful how you used mdadm to create the raid, but you still need an initrd to run lvm before you can mount root.
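As a concrete sketch of that constraint: in-kernel autoassembly only recognises the old 0.90 superblock format (and, for partitions, the 0xFD "Linux raid autodetect" type). The device names below are hypothetical; adjust for your own disks.

```shell
# Create a RAID1 mirror with 0.90 metadata so the kernel can autoassemble it.
# /dev/md0, /dev/sda1 and /dev/sdb1 are example names only.
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      --metadata=0.90 /dev/sda1 /dev/sdb1
```

With newer 1.x metadata (the mdadm default), the kernel will not autoassemble, and the initrd must run mdadm itself.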

----------

## grooveman

Neddy,

Yes, I know that is the case. I have an initramfs, and it has been working for a very long time; it just suddenly stopped working.  I haven't recompiled a kernel or touched any lvm or mdadm configuration for some time.

My initramfs was made with genkernel using the lvm, mdadm and mdadm-config flags.  I have tried rebuilding it, but as I expected, that made no difference.  I have likewise tried booting older kernel/initramfs combinations that always worked before (I keep them in my grub.conf in case of emergencies).

It is my understanding that genkernel puts everything you need in the initramfs when you use these flags; it always has.  And, like I said, I haven't touched it in weeks, and it always worked.

----------

## NeddySeagoon

grooveman,

I don't understand this:  *grooveman wrote:*   

> I can enter the shell (ash), and confirm that the md devices have not been assembled. I can assemble them there manually. However, there do not seem to be any vg commands available in that limited environment, so I cannot assemble the volume groups. 

 

The lvm commands must be there, as the 'limited environment' is the root filesystem provided by your initrd.  I'm also surprised the shell is ash, as genkernel uses busybox to provide its shell and shell commands.  Busybox does not include mdadm or lvm, so they should be built statically and included in the initrd.

When you are in the initrd shell, can you use ls to see mdadm and lvm?

----------

## grooveman

Sorry Neddy, I was on my phone for those posts (my internet was down, not related at all to this problem).  Posting was cumbersome.

Yes, it is BusyBox; ash is the built-in shell for BusyBox.  

In the initramfs I find my /etc/mdadm.conf, I have the mdadm command, and I have my /etc/lvm/lvm.conf file as well (though I have made no alterations to it).  I do not have any md devices listed in /dev, and neither do I have any of my volume groups.  As I was trying to say before, I can manually assemble the raid devices, because the mdadm command is in the initramfs.  The lvm command is, in fact, present in the initramfs as well (thank you for that tip), and with it I was able to activate the volume groups.  The fact that I can assemble the raid and activate the volume groups is evidence that there is no hardware malfunction.  I can even exit the shell and continue the boot now.

So, if I can assemble and activate these devices in the initramfs environment, why can't it?  All the tools are clearly there, but the system is just not using them...

I'm not entirely sure how the initrd works as far as handing over to the real root in its boot sequence.  However, until a couple of days ago it always worked, and I made no changes whatsoever to the initrd, or to my mdadm, grub or lvm configuration.  There have been no updates to lvm, grub or mdadm in at least a month.  And I have been using the same kernel for two weeks.  It just stopped working.

----------

## NeddySeagoon

grooveman,

As it works manually but not with the init script in your initrd, it must be a timing issue.

The process goes like this ...

Grub loads the kernel and the initrd and passes control to the kernel.

The kernel initialises and mounts the initrd as its root filesystem then executes the init script it finds there.

The init script does whatever is needed to mount your real root including running mdadm and lvm, at least one of which fails.

This failure drops you to the busybox shell.

You can run mdadm and lvm by hand and continue the boot. 

Does your kernel contain CONFIG_DEVTMPFS=y ?

That makes the /dev entries needed by the initrd.
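A quick way to check that setting (assuming the kernel was built with /proc/config.gz support, or that the build tree with its .config is still around):

```shell
# Either inspect the running kernel's config, if it exports one ...
zcat /proc/config.gz | grep DEVTMPFS
# ... or the config in the kernel build tree:
grep DEVTMPFS /usr/src/linux/.config
```

You are looking for `CONFIG_DEVTMPFS=y`; `=m` or unset would fit the missing-/dev-nodes theory below.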

I use lvm over raid5 for my systems with multiple drives, but I hand-roll the initrd using that guide.  You only need the initrd part.

I do not use an mdadm.conf, as everything is hard-coded in the mdadm commands.  My /dev is static too.  The md nodes and vg nodes are not required; neither is CONFIG_DEVTMPFS=y.

The requirement for CONFIG_DEVTMPFS=y appeared when the dust settled after the problems with stage3 autobuilds following the baselayout2 stabilisation.

This does not explain your issue yet.  If your /dev nodes were missing, they are not just going to appear ... at least I hope not.

Hmm, maybe they could appear late if CONFIG_DEVTMPFS=m, as the module would have to be loaded first.

There are a few things for you to poke at there.

----------

## grooveman

I found it!!

Though I am perplexed as to how something like this can happen...

After being able to manually assemble and activate my drives (and boot my system), it got me thinking...  I checked my mdadm.conf (which I had been using for over a year now) against what I pulled from my manually assembled drives -- and lo and behold, the UUIDs have changed!  All of them are different from what they were two days ago, inexplicably regenerated anew.

So I put the new UUIDs in my mdadm.conf, and rebuilt my initramfs, and my system now boots normally.  I am still dumbfounded... but I am grateful to finally have this sorted out!
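For anyone hitting the same symptom, the fix described above amounts to regenerating the ARRAY lines from the live arrays and rebuilding the initramfs (the genkernel flag names follow the earlier post; double-check them against your genkernel version):

```shell
# Append ARRAY lines with the arrays' current UUIDs, then
# manually prune the stale ARRAY lines from the file.
mdadm --detail --scan >> /etc/mdadm.conf

# Rebuild the initramfs so it embeds the updated config.
genkernel --lvm --mdadm --mdadm-config=/etc/mdadm.conf initramfs
```

Comparing `mdadm --detail --scan` output against the existing mdadm.conf is also a quick way to spot a UUID mismatch before it bites.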

I appreciate your input on this, Neddy.  You are always there for the tough posts.  Thank you.

-G

----------

