# [SOLVED] How to properly boot a custom initramfs?

## cami

I've been trying to figure out how to boot with initramfs for weeks, but I always end up with a kernel panicking in mount_root. The thing is I don't want it to attempt mounting the (real) root filesystem, I want it to mount initramfs and run its magic. I can't find any documentation on the correct bootloader / kernel command line for initramfs, only old initrd stuff which doesn't work.

- Should I compile initramfs into the kernel or use an external image?

- Do I need "initrd path/to/external/image" in grub.cfg if I compiled initramfs into bzImage?

- Can grub2 load a kernel from a RAID-1 volume?

- What's the correct root= parameter?

- Do I need rootfstype?

- Do I need rootdelay/rootwait?

Final goal is the following setup:

- (OK) BIOS assembles RAID1 mirror "Volume0_0" on SATA disks sdb and sdc (full-disk raid)

- (OK) BIOS loads grub2 from "Volume0"

- (OK?) Grub2 loads kernel from sdb1 (or sdc1 - or Volume0_0p1?)

- (FAIL) Kernel loads initramfs

- (OK*) initramfs assembles /dev/Volume0_0

- (OK*) initramfs mounts /dev/Volume0_0p3 over / using xfs

- (?) initramfs runs /sbin/init

*) only tested using chroot

Last edited by cami on Mon Jun 20, 2016 10:38 am; edited 2 times in total

----------

## NeddySeagoon

cami,

 *Quote:*   

> - Should I compile initramfs into the kernel or use an external image?

 

Both ways work. If your kernel is an EFI stub, the initrd must be built in, as only one file is loaded.

 *Quote:*   

> - Do I need "initrd path/to/external/image" in grub.cfg if I compiled initramfs into bzImage?

 

Maybe. You can pass a ready-made initramfs, or have the kernel build system make it for you.

 *Quote:*   

> - Can grub2 load a kernel from a RAID-1 volume?

 

Yes but you mention full disc raid.  That's an extra complication.

 *Quote:*   

> - What's the correct root= parameter?

 

That's harder without knowing your disc layout.

 *Quote:*   

> - Do I need rootfstype?

 

No.  It just saves a few milliseconds mounting root as the kernel does not need to try all the filesystems it knows about.

 *Quote:*   

> - Do I need rootdelay/rootwait? 

 

For SATA HDD, no.  For root on USB and the like, yes.

 *Quote:*   

> Final goal is the following setup:
> 
> - (OK) BIOS assembles RAID1 mirror "Volume0" on SATA disks sdb and sdc (full-disk raid) 

 

What sort of raid do you have? Fakeraid, mdadm raid or real hardware raid?

 *Quote:*   

> - (OK) BIOS loads grub2 from "Volume0"
> 
> - (OK?) Grub2 loads kernel from sdb1 (or sdc1 - or Volume0p1?)

 

Yep. It also loads the initrd if it has been told to.

Post the init script from your initrd.

Are we really talking BIOS or is it UEFI?

----------

## cami

 *NeddySeagoon wrote:*   

>  *cami wrote:*   - Should I compile initramfs into the kernel or use an external image? 
> 
> - Do I need "initrd path/to/external/image" in grub.cfg [...]? 
> 
> Both ways work.  [...] You can [...] have the kernel build system make it for you.

 

I'd prefer to compile initramfs into linux, then, and have the kernel wake it. I'd like the bootloader to do as little as possible.

How do I tell grub2 to load from the RAID-1 (instead of its members)?

The RAID members are (hd1) and (hd2). The boot partition is on (hdX,msdos1). 

The final root fs is on device /dev/md/Volume0_0p3. Should I use root=/dev/md/Volume0_0p3 ? But, this device is created and mounted by /init in initramfs, I don't want the kernel to try to mount it automatically unless that is necessary.

 *NeddySeagoon wrote:*   

> Fakeraid, mdadm raid or real hardware raid

 

I think it's FakeRaid. The array was created using BIOS (EFI?) menus. When Linux is running, I'm using mdadm to assemble it. It's definitely not hardware RAID.

 *NeddySeagoon wrote:*   

> Are we really talking BIOS or is it UEFI?

 

This is an unmodded ASRock X58 Extreme board from 2009, Intel X58 chipset. It might be EFI; if I had to guess I'd say it is not, but I really don't know.

 *NeddySeagoon wrote:*   

> Post the init script from your initrd.

 

It's just a stub so far, as I want to get the earlier parts of the boot process working before elaborating on it. (Sidenote: I'm using initramfs, not initramdisk, but I guess initrd is used as shorthand for both.)

Nevertheless, I posted the init stub from my initramfs on pastebin.

----------

## NeddySeagoon

cami,

The init script is normally a shell script.

Building the initrd into the kernel makes for time-consuming updates.

To change anything you need to rebuild the kernel.

You will have Intel Storage Manager fakeraid.

----------

## cami

Thanks for the warning. I change the kernel much more frequently than the initramfs, however. It's only there to assemble the RAID.

I finally got initramfs to boot - it seems the /dev/console node needs to be present in initramfs (maybe initramfs was even executed and I just couldn't see it). I set root=/dev/Volume0_0p3 for now.

I know that init is normally a script, but at least for now, I am using an executable so I don't need to install a shell in initramfs (one less source of problems). This might change in the future.

The system gets very far in booting; only the final call to mount fails because /dev/md/Volume0_0p3 does not exist. I'd expect it to be there when mdadm exits. Maybe I need to insert a brief wait or sync?
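
If the init were a shell script, the "brief wait" could be sketched as a small polling loop. This is only an illustration: wait_for_path is a made-up helper name, and the one-second interval and retry count are arbitrary assumptions rather than anything mdadm requires.

```sh
# wait_for_path PATH [TRIES]
# Poll once per second until PATH exists, giving up after TRIES polls.
# Hypothetical helper for illustration; not part of mdadm or util-linux.
wait_for_path() {
    path=$1
    tries=${2:-10}              # assumed default: up to 10 polls
    while [ "$tries" -gt 0 ]; do
        [ -e "$path" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    [ -e "$path" ]              # final check decides the exit status
}

# Possible use in /init, right before the mount call:
#   wait_for_path /dev/md/Volume0_0p3 10 || echo "root device never appeared"
```

A loop like this only helps if the node shows up late; if nothing in the initramfs creates it at all, waiting longer will not change the outcome.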

The messages from initramfs /init don't appear in netconsole, I guess because they are sent directly to the terminal and not to the kernel logger. Would using klogctl() instead of printf() make them appear?
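
As far as I can tell, klogctl() only reads and controls the kernel ring buffer and has no command for appending to it; the usual route for userspace is to write a line to /dev/kmsg. A minimal shell sketch of that idea follows. The klog name and the "init:" tag are assumptions for illustration, and it presumes a /dev/kmsg node (character device 1, 11) exists in the initramfs.

```sh
# Target of the log line; overridable so the helper can be tried
# outside an initramfs. /dev/kmsg is the real kernel log device.
KMSG=${KMSG:-/dev/kmsg}

# klog MESSAGE...
# Write one line to the kernel log. "<6>" is the priority prefix
# for KERN_INFO; "init:" is just an illustrative tag.
klog() {
    printf '<6>init: %s\n' "$*" > "$KMSG"
}

# Possible use in /init:
#   klog "assembling RAID"
```

Whether such lines then show up on netconsole should depend on the console log level, the same as for ordinary kernel messages.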

----------

## NeddySeagoon

cami,

There are three nodes needed in /dev to get started, but you get them for free if you enable DEVTMPFS and automounting of DEVTMPFS in your kernel config.

You need console, null and one other, I forget.
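
A quick way to check both options in a kernel config could look like the sketch below. The helper names kconfig_enabled and check_devtmpfs are made up, and the default path assumes the /usr/src/linux symlink points at the kernel in use. As far as I know, the Kconfig help for DEVTMPFS_MOUNT says the automount does not apply to initramfs boots, so an initramfs /init would still have to mount devtmpfs on /dev itself, and /dev/console still has to be present in the image.

```sh
# kconfig_enabled FILE SYMBOL
# Succeed if CONFIG_SYMBOL=y appears in FILE.
kconfig_enabled() {
    grep -q "^CONFIG_$2=y\$" "$1"
}

# check_devtmpfs [FILE]
# Report whether devtmpfs support is built in. Both function names
# are hypothetical; the default path is an assumption.
check_devtmpfs() {
    cfg=${1:-/usr/src/linux/.config}
    if ! kconfig_enabled "$cfg" DEVTMPFS; then
        echo "CONFIG_DEVTMPFS is not enabled in $cfg"
        return 1
    fi
    if ! kconfig_enabled "$cfg" DEVTMPFS_MOUNT; then
        echo "CONFIG_DEVTMPFS_MOUNT is not enabled (matters only without an initramfs)"
    fi
    return 0
}
```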

To get net console, you may need a few items on the kernel command line. 

Did you find /usr/src/linux/Documentation/networking/netconsole.txt ?

----------

## cami

Yes, I have netconsole running, I can see all the kernel messages, just not the init messages. I'm not sure, but I think netconsole only forwards the kernel messages, not process output. That's the reason for my plan to use klogctl.

Maybe /dev/zero or /dev/u?random. At the moment it doesn't seem as if they're necessary. The default initramfs only contains /dev/console.

----------

## szatox

 *Quote:*   

> You need console, null and one other, I forget. 

 You need console and null. /dev/forget is not necessary.

Cami, if you're using a genkernel-generated initramfs, you can provide the "debug" flag on your boot line. It will make the default init stop twice during the early boot process, dropping you to a shell. Handy tool when things go wrong and you need to have a closer look at them. Or when you /dev/forget your password.

You can also build custom initramfs manually if you feel like doing some magic yourself.

You can do that using 2 or 3 packages:

baselayout (USE="build")

busybox (USE="make-symlinks") - simplified, but usable replacement for coreutils

glibc - you only need it if you want to run programs not included in busybox (e.g. you need it if you want to run the _real_ init, or set up LVM. Of course the latter requires you to install LVM inside initramfs too)

Of course you need /dev/{console,null} and /init (it must be at least a shell script starting sh, a link to /bin/sh will not work, unless you add more scripts to make it work)

Once you're done setting it up, just pack it up with find . | cpio -H newc -o | gzip > ../initramfs.gz, copy it to boot and update the bootloader so it knows where the initramfs is.
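
The packing step above can be wrapped in a small function so the working directory does not matter. This is a sketch only: pack_initramfs is a made-up name, and it assumes cpio and gzip are installed and that the output file lives outside the source tree.

```sh
# pack_initramfs SRC_DIR OUT_FILE
# Pack SRC_DIR as a gzip-compressed cpio archive in newc format.
# Hypothetical helper; OUT_FILE should not be inside SRC_DIR.
pack_initramfs() {
    src=$1
    out=$2
    # The subshell keeps the cd local, so archive paths stay relative
    # to SRC_DIR while OUT_FILE is resolved from the caller's directory.
    ( cd "$src" && find . | cpio -H newc -o | gzip -9 ) > "$out"
}

# Possible use:
#   pack_initramfs /usr/src/initramfs /boot/initramfs.gz
# To list what ended up in the image:
#   gzip -dc /boot/initramfs.gz | cpio -t
```

Files are archived with the ownership they have on disk, so packing as a regular user would leave that user's uid in the image rather than root's.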

Oh, and one more hint: multidevice (mdraid) has 2 metadata formats: 1.2 and the deprecated 0.9. The deprecated 0.9 is supported by the kernel directly and can be autodetected. Just make sure - like in REALLY sure - you don't have both at the same time on the same device. It is possible to do that, and if you allow it, it will make your system fail to boot if you're lucky, and trash your data if you're not. If you want root on LVM (device mapper), there is no way to dodge initramfs though... At least I never heard about it.

----------

## cami

Not using genkernel or busybox (yet). The more automation, the more possible sources of oddities. Thanks for the hint anyway; busybox is definitely my #1 candidate for a more standard early init.

```
# base system
dir     /dev    755 0 0
dir     /dev/md 755 0 0
nod     /dev/console    600 0 0 c 5 1
nod     /dev/sdb        600 0 0 b 8 16
nod     /dev/sdc        600 0 0 b 8 32
dir     /etc    755 0 0
dir     /lib64  755 0 0
file    /lib64/libc.so.6                /lib64/libc.so.6   755 0 0
file    /lib64/libgcc_s.so.1            /usr/lib/gcc/x86_64-pc-linux-gnu/4.9.3/libgcc_s.so.1    755 0 0
file    /lib64/ld-linux-x86-64.so.2     /lib64/ld-linux-x86-64.so.2   755 0 0
dir     /proc   755 0 0
dir     /root   700 0 0
dir     /run    755 0 0
dir     /sbin   755 0 0
dir     /sys    755 0 0

# init
file    /init           /usr/src/initramfs/init 755 0 0

# mdadm
file    /sbin/mdadm     /sbin/mdadm     755 0 0
file    /etc/mdadm.conf /etc/mdadm.conf 644 0 0
dir     /run/mdadm      755 0 0
```

----------

## NeddySeagoon

cami,

With full disk raid you cannot use raid metadata 0.9 and kernel autoassembly.

Kernel autoassembly depends on the partition type being fd in the partition table.

With full disk raid, you don't have a partition table, so it's not an option for you.

----------

## cami

Thanks for the hint. I think I read that somewhere, too; that's why I went for mdadm / initramfs.

So, funny story, I made it work by accident on Friday evening! As you have certainly noticed, the RAID mirror uses disks sdb and sdc. You might also have guessed already that sda is the "old" system. It might also be worth noting that I installed Grub 0.97 on sda and normally used it for booting the old system. So on Friday, I was running some commands in GRUB2 (which is on the RAID), and noticing that they wouldn't work, decided not to attempt to boot the RAID. I also thought it'd be quicker to boot sda directly from GRUB2 instead of a full reboot. The system came up nicely, but when I tried to mount the RAID it failed - already mounted. What? Well, it appeared that in GRUB2, (hd0) is the FakeRAID, not sda, and I had booted the RAID successfully by accident. Oh, the irony! I'd have noticed that long before if (hd1) had been sda, but it isn't. It is something weird that looks like sdb/sdc but produces odd kernel behavior.

Lesson learned: don't rely on device numbers. That's why we have labels and UUID. I was planning to use them in a later stage, oh well.
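
To illustrate the UUID approach: scripts under /etc/grub.d print menu entries on stdout when grub.cfg is generated, so an entry that locates the boot partition by filesystem UUID instead of (hdX) could be emitted roughly like this. Everything specific here is an assumption: emit_entry is a made-up function, the entry title and the /bzImage path are placeholders, and the UUID would come from something like blkid. There is no initrd line because the initramfs is assumed to be built into the kernel.

```sh
# emit_entry BOOT_FS_UUID ROOT_DEVICE
# Print a GRUB2 menu entry that finds the boot filesystem by UUID
# rather than by BIOS drive number. Title and kernel path are
# placeholders, not taken from a real setup.
emit_entry() {
    cat <<EOF
menuentry 'Gentoo (root on RAID)' {
    insmod part_msdos
    search --no-floppy --fs-uuid --set=root $1
    linux /bzImage root=$2
}
EOF
}

# Possible use, with a placeholder UUID:
#   emit_entry 0123-ABCD /dev/md/Volume0_0p3
```

As far as I understand, when the initramfs provides /init the kernel does not try to mount root= itself, so the value only matters if /init (or the real init later) reads it.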

For reference, some additional complications I encountered:

Used the Linux socket API to send error messages as UDP datagrams in addition to printing them out, so I could see them like the netconsole messages. Had to use bind() to enforce the same source port, otherwise netcat only shows the first message.

Can't use the same ports for init messages (socket API) and kernel (netconsole). Well, not really surprising.

Read-write access to the array required mdmon. I decided to add it to the initramfs, because once the main filesystem is mounted the array is already being used and cannot be reassembled.

I added mdmon --takeover --all to /etc/init.d/mdraid. It runs successfully, but the initramfs mdmon doesn't go away. I shrugged and let it live.

Over-mounting / with xfs on /dev/md/Volume0_0p3 failed for reasons I didn't bother to investigate. I was planning to use the switch_root utility from sys-apps/util-linux anyway, as it is specialized for this task and does some other nice things. I'm now mounting /dev/md/Volume0_0p3 on /mnt and letting switch_root do all the remaining work, which results in a more standard init process.

If interested, you can find my current init.c and file list for gen_initramfs_list.sh on pastebin. The socket code has already been removed; if anyone wants to see it, just poke me. And sorry, I didn't bother to install a shell yet, but an equivalent init script would be:

```
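#!/bin/sh

# mount /proc and /sys
mount -t proc proc /proc
mount -t sysfs sysfs /sys

# run mdadm to assemble the array as a partitionable device
mdadm --assemble --scan --auto=mdp

# mount the real root filesystem on /mnt
mount -t xfs /dev/md/Volume0_0p3 /mnt

# run /sbin/init; exec keeps PID 1 so switch_root can hand over
exec /sbin/switch_root /mnt /sbin/init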
```

----------

