# UEFI, GRUB, PARTUUID

## dbishop

This is likely not a very standard challenge.  Part of my problem seems to be grub, the other part lies with system bios.

I have a platform that has ten hot swap disks and one internal disk port designed for a directly plugged SSD-on-a-chip, limited to 64GB.  So it is useful only for failsafe recovery, almost like a permanently mounted sysrescuecd (which is what it will be, albeit modded somewhat).

I am trying to allow booting from any of the hot swap disk positions and finally, failing all else, boot from the internal module.  And, as luck would have it, the internal module gets enumerated last.  If there are 10 disks plugged in, it is seen as sdk, if two, it becomes sdc, with no other disks it is sda.  This is an industrial/scientific application not a server/desktop application. The system is (re)started all the time with entirely different disk environments virtually every time.

As the hot-swap disk environment changes, the normal drive enumeration in the kernel changes -- sda,b,c...,k never point to the same physical drive.  Since the bulk of these disks are for raw storage (often organized into RAID volumes) they are not even the same physical disks.

UUID identification seems a likely way forward but this requires an initrd, which I am trying to avoid.  It seems that PARTUUID does not, but grub2 seems blissfully unaware of PARTUUIDs. I have seen a few hacks that allow it to see them, but I really don't want to fork grub2 if I can help it.  Grub2 is forked enough as it is (-;

I can get this to work if I only ever boot from the internal module.  I use 'LABEL=' in fstab and use unique disk labels on all disks.  That is effective at getting around the sd{..k} issue.

What I cannot seem to do is get the UEFI bootloader and grub2 working on each disk so that they are fully independent and so that the Aptio BIOS will scan each disk, find all the ESPs and boot from the first one it finds. I can create a naming convention to help the system BIOS but I just can't seem to get all the pieces to work at once.

I understand this is an unusual problem, and I am probably not even stating it clearly. Not even sure if grub is the way to go or if something like the kernel stub or refind or something else will do better.

Any help will be wonderful.

----------

## chithanh

PARTUUID= is just a kernel parameter that you pass in grub. If you manage your boot menu manually then you can ignore whatever grub installation scripts support.
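
To illustrate (a rough sketch, not tested on this hardware): the value can be read with `blkid` and pasted into a hand-written `linux` line. The helper name `root_arg_from_blkid` and the device `/dev/sda4` are made up for the example; `blkid -o export` prints `KEY=value` lines, one of which is `PARTUUID=`.

```shell
#!/bin/sh
# Hypothetical helper: turn `blkid -o export` output (KEY=value lines on stdin)
# into the root= argument for the kernel command line.
root_arg_from_blkid() {
    # Only a line starting with PARTUUID= matches; UUID= and PARTLABEL= do not.
    sed -n 's/^PARTUUID=\(.*\)$/root=PARTUUID=\1/p' | head -n 1
}

# Example use on a live system (the device name is an assumption):
#   blkid -o export /dev/sda4 | root_arg_from_blkid
```

The kernel resolves `root=PARTUUID=` on its own for GPT partitions, which is why this form does not need an initrd.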

If the root volume is driven by a different driver than the other storage devices, another workaround may be to make the root device driver built-in and the others modules. That way the race for what becomes sda is avoided.

----------

## dbishop

I don't believe it is a "race" for what becomes sda that is the issue.

My understanding of the process is that under EFI  the hardware system nvmem gets updated with valid boot devices so that the BIOS knows what to do. For example, here is the output of efibootmgr for the machine in question:

```
gilmour EFI # efibootmgr -v
BootCurrent: 0004
Timeout: 10 seconds
BootOrder: 0000,0004,0006,0005
Boot0000* gentoo   HD(1,GPT,5e2d7fa0-1791-4a14-b1a9-4eb9ec1f7812,0x800,0x1000)/File(\EFI\gentoo\grubx64.efi)
Boot0004* UEFI: Built-in EFI Shell   VenMedia(5023b95c-db26-429b-a648-bd47664c8012)..BO
Boot0005  Hard Drive    BBS(HD,,0x0)..GO..NO........o.S.A.T.A. .S.S.D....................A...........................>..Gd-.;.A..MQ..L.F.A.4.3.7.0.A.5.D.1.4.2.0.0.6.1.9.6.6.8........BO
Boot0006  USB CD   BBS(10,,0x0)..GO..NO........a.O.p.t.i.a.r.c. .D.V.D. .R.W. .A.D.-.7.5.6.0.S. .S.B.0.1....................A................................Gd-.;.A..MQ..L.0.0.0.0.0.0.0.0.0.2.C.0........BO
```

The Boot0000* line tells the whole story -- HD(1) is a GPT volume with a named resource identifier and a file to run. But when I plug in other disks (or remove them) after grub-install was run, the definition of HD(1) changes. Therein lies the problem -- I think, anyway.
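
To check on a given boot whether the partition named by an entry is actually present, the GUID can be pulled out of the `HD(...)` node and compared with what udev lists. A minimal sketch, assuming the `efibootmgr -v` format shown above; the function name `esp_guid_of_entry` is made up. For what it's worth, the first field of `HD(1,GPT,...)` is the partition number on its disk, and the third field is that partition's GUID.

```shell
#!/bin/sh
# Hypothetical helper: read `efibootmgr -v` lines on stdin and print the
# partition GUID from any HD(<partno>,GPT,<guid>,<start>,<size>) node.
esp_guid_of_entry() {
    # In sed's basic regex syntax "(" is literal and "\(...\)" is the capture group.
    sed -n 's/.*HD([0-9]*,GPT,\([0-9a-fA-F-]*\),.*/\1/p'
}

# Example use on a live system:
#   guid=$(efibootmgr -v | grep '^Boot0000' | esp_guid_of_entry)
#   ls -l "/dev/disk/by-partuuid/$guid"   # shows which sdX it is this time
```

If the symlink exists, the ESP the entry refers to is plugged in, whatever letter the kernel gave the disk on this boot.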

The thing about using PARTUUID is so that when the bootloader is running, the kernel can track the correct partitions without me having to come up with my own unique labels all the time.  For example, if I have two disks with the same labels, udev will link them to the last disk mounted with these labels in /dev/disk/by-label/  -- using PARTUUID will allow me to sidestep that, but grub can't deal with PARTUUID at the moment, only UUID.

The reason I am dealing with EFI and grub in the same subject is that it is not just one or the other in this scenario.  Both are involved and fixing one will not fix the other -- I have tried many different ways but none have worked yet.

The only way I can get a reliable boot disk -- even one -- is to make sure that it appears first in the enumeration done by the system BIOS and also by the kernel -- these must be synchronized as far as I can tell.  I hope that there is a way to break this interdependence.  Otherwise I have to plug in a disk in a specific spot and neither move it nor add another with common label names.

Does this make sense?

----------

## VoidMage

Well, till grub gets full PARTUUID support, this is how I'm doing things:

```
menuentry "Linux 4.4.1" {
    set root=(hd1,gpt4)
    search --no-floppy --fs-uuid --set=root <UUID>
    linux /boot/linux-4.4.1-0 root=PARTUUID=<PARTUUID>
}
```

It's not fully correct (because a failure isn't really handled here), but it works reasonably well.

----------

## dbishop

Yes, I know what you presented can be done, but according to my understanding -- which may be flawed -- where you have it is too late in the process for everything I am trying to do.

This line requires that the disk get enumerated by system BIOS (UEFI in this case) consistently. In my case it will not work out that way because the number of disks will vary from boot to boot, as will the SATA ports that actually get occupied:

```
set root=(hd1,gpt4)
```

That particular set root statement could be a UUID since it resides on a specific disk, but if it was a PARTUUID it would be better.  To do this grub would somehow have to step through the disks rather than just use a straight lookup (this may not be an accurate portrayal, I have no idea about the inner workings of grub).

My imagined idea of what it should look like would be:

```
set root=(<UUID>,gpt4)      <-- grub should search for the target HDD by matching the UUID, then the partition
```

or perhaps

```
set root=(<PARTUUID>)      <-- or grub could simply seek the matching PARTUUID and get both questions answered at once, which disk and which partition
```

Then these lines would work just fine:

```
search --no-floppy --fs-uuid --set=root <UUID>
linux /boot/linux-4.4.1-0 root=PARTUUID=<PARTUUID>
```

And recall that PARTUUID only helps avoid glitches when two disks have common label names; it is not necessary if I keep good records and name them, say, sequentially (e.g., boot001, boot002, ...).

THEN THERE IS BIOS TO TACKLE!

The system BIOS needs to be told what disks MIGHT be present and then remember the order to use them in.  And this needs to be persistent, and the BIOS must not auto-delete entries it cannot find.  This is the part that I really have no idea about. It seems like it should be possible, but the trend towards one-size-fits-all and "automagic" configuring continues to trade control and freedom for "simplicity".

----------

## NeddySeagoon

dbishop,

Here's a bit of a hack.  It's based on the idea of how an initrd works.

1) You always boot to the rescue 64G drive if you don't know what's loaded.

2) This looks around to see what's there and rewrites the boot menu, including the rescue entry.

3) You reboot to the rescue 64G drive and choose a menu option.
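
Step 2 could be sketched roughly like this. It is only an outline under assumptions: the function name `emit_menu` and the one-line-per-disk input format are invented here, labels are single words such as boot001, and how the list is gathered (blkid, lsblk, a naming convention) is left open.

```shell
#!/bin/sh
# Hypothetical generator for step 2. Reads one line per bootable disk on stdin:
#   <label> <boot-fs-uuid> <root-partuuid> <kernel-path>
# and prints one GRUB menu entry per line, never naming hdN or sdX.
emit_menu() {
    while read -r label fs_uuid partuuid kernel; do
        # Skip blank or incomplete lines.
        [ -n "$kernel" ] || continue
        printf 'menuentry "%s" {\n' "$label"
        printf '    search --no-floppy --fs-uuid --set=root %s\n' "$fs_uuid"
        printf '    linux %s root=PARTUUID=%s\n' "$kernel" "$partuuid"
        printf '}\n'
    done
}
```

The output would be written into whatever file the rescue drive's grub.cfg sources, so the menu reflects the disks found on that boot.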

----------

## dbishop

I like the idea, but I don't think it can work: the DOM disk (the 'rescue disk') is really the last device to get enumerated by UEFI BIOS and the kernel.  So if nothing else is there, it becomes sda. With one other drive, it becomes sdb; with two others, sdc... you get the picture.

So whenever I add or remove ANY disks, the drive's enumeration changes and everything breaks. This is the behaviour I want to fix.

My confusion is this:

There are a number of places where these statements appear in /boot/grub/grub.cfg:

```
set root='hd0,gpt2'        <-- this is the /boot partition
```

or

```
set root='hd0,gpt4'         <-- this is the / partition
```

and I believe that's where it all falls apart.

Can these statements get replaced with this?

```
set root=UUID=ecadf334-2292-4a21-8447-1e80a6d36680
```

or perhaps this?

```
search --no-floppy --fs-uuid --set=root b8ca7b24-17e6-461f-a3dd-d09e06f7ac75
```

Otherwise the system loads the kernel from the /boot partition on the disk, regardless of how many disks are plugged in, but cannot find the / (root) partition and just hangs when it tries to run init.
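
On the two candidates above: `search --no-floppy --fs-uuid --set=root <uuid>` is a real GRUB command that sets `root` to whichever device holds the filesystem with that UUID, whereas `set root=UUID=...` is, as far as I can tell, not a form GRUB understands. A sketch of making the swap mechanically, assuming the hard-coded lines look like the ones quoted above; the function name `replace_set_root` is made up, and it matches on the partition part only (e.g. `gpt2`), ignoring the disk number.

```shell
#!/bin/sh
# Hypothetical filter: grub.cfg on stdin, rewritten grub.cfg on stdout.
#   $1 = partition part of the old line, e.g. gpt2
#   $2 = filesystem UUID of that partition
# Lines that do not match pass through unchanged; indentation is kept.
replace_set_root() {
    sed "s/^\([[:space:]]*\)set root='hd[0-9]*,$1'.*/\1search --no-floppy --fs-uuid --set=root $2/"
}

# Example use (back up the original first; paths are assumptions):
#   replace_set_root gpt2 b8ca7b24-17e6-461f-a3dd-d09e06f7ac75 \
#       < /boot/grub/grub.cfg > /boot/grub/grub.cfg.new
```

This only changes where GRUB looks for the kernel. The `root=` argument on the `linux` line still has to name the / partition in a way the kernel can resolve without sdX, for example `root=PARTUUID=...`.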

----------

