# Messing with bios and efi-boot, system broken.

## 1clue

Hi,

I have this: http://www.supermicro.com/products/motherboard/atom/x10/a1srm-ln7f-2758.cfm with 16gb RAM. It has been running Gentoo for years, but it has been booting UEFI into grub, and I was trying to get rid of the grub part.

I built a kernel (4.15.12) with what I thought would be the correct parameters per https://wiki.gentoo.org/wiki/EFI_stub_kernel and I installed it using 'make install' so it went into /boot. I then copied the kernel into EFI/BOOT/BOOTX64.efi or whatever it is.

I reset and went into the BIOS to setup my EFI boot options, added the kernel and then saved and reset. I got it going apparently into the kernel I wanted, then reset again and deleted the old junk because my boot partition was getting full.

System won't boot. It goes into grub, which is good, and then goes into the kernel that's left, which is good, but finally it freezes there. No output. No apparent panic.

I reset again, and pushed 'del' to get into the bios. It says 'entering setup' but then goes straight into grub. I can't stop it.

I apparently have a bad kernel. I can't boot into system rescue cd, can't configure it to do so because I can't get into the bios.

Somebody have an idea which does not involve crossing jumpers on the motherboard?

Thanks.

----------

## bammbamm808

Try a complete shutdown instead of a reboot. On my asus mainboard, rebooting often gets me right back to rEFInd instead of my efi firmware.

----------

## 1clue

Tried it, no joy.

----------

## 1clue

I also reset the BMC and restarted the IPMI client.

----------

## 1clue

That's odd, the sensors through IPMI are all reading zero. The fan speed sensors are missing from the panel. Changing fan speed mode does nothing to the fans.

Not looking good.

----------

## bammbamm808

New CMOS battery?

----------

## Ant P.

Have you tried the CMOS reset jumper yet? I had a nightmare of an EFI problem a while back (all bootloader config changes were silently failing), and that was the only thing that got it back into a sane state.

----------

## bammbamm808

This does sound as though it could be some UEFI firmware voodoo.

----------

## krinn

http://www.supermicro.com/manuals/motherboard/Atom_on-chip/MNL-1617.pdf

 *Quote:*   

> C-3  
> 
> To Recover the Main BIOS Block Using a USB-Attached Device
> ...

 

(actually it's in page C1)

ps: yep, you can download the needed bios image from their www

ps2: you said "no jumper", none involved there  :Smile: 

ps3: the fact that your del key is not entering the bios, plus your sensor weirdness, suggests you've done "bad" with your bios

ps4: hey, even if you don't really do it, actually the ctrl+home key (you'll read that) will enter the bios in their "restore bios" mode, which is in the bios (so even if you don't need to flash it, you should be able to use ctrl+home instead of del to enter it).

ps5: (because i love to ps like mad): good luck

----------

## P.Kosunen

 *1clue wrote:*   

> It has been running Gentoo for years, but it has been booting UEFI into grub, and I was trying to get rid of the grub part.

 

Why?

----------

## 1clue

 *P.Kosunen wrote:*   

>  *1clue wrote:*   It has been running Gentoo for years, but it has been booting UEFI into grub, and I was trying to get rid of the grub part. 
> 
> Why?

 

Because I can, more than anything else.  This box isn't in production and I thought it would be good to figure this out.

@bammbamm808, I just checked the event log and there are warnings about the battery. I'll try that.

@Ant P, Trying to avoid the jumper but if it's a battery issue then my cmos is gone anyway. This box has a huge number of bios settings, it's such a drag to have to go through all that crap again.

@krinn, for a minute there I thought you were listing your game consoles.   :Very Happy:   At any rate I have the manual both printed on paper and electronic, but thanks. Was trying to avoid the CMOS reset. I missed the ctrl+home bit, thanks.

----------

## 1clue

I think this is something to do with my boot disk.

If I unplug the drives, then I can boot into BIOS.

If I put in a system rescue cd with the drives unplugged I can boot to the cd or get BIOS.

Same as above for boot menu.

I plug in the drives and hit reset or cold boot, it goes straight to grub and then freezes on whatever kernel I select.

I have 4 kernels in the list, at least 3 of which worked perfectly before I started this crap.

I can disconnect the drives, plug in the sysrescuecd, boot to the grub menu on that, then plug in my drives and continue the boot, and I get drives.

So I did the chroot, I did grub-mkconfig and rebooted again. Nothing.

I did efibootmgr and it shows what I expected.

I reboot and it locks up on any kernel I select.

I took the drive out and did an fsck from another box. I checked file names and locations, everything seems to check out. Not so sure about grub config files, but grub is the only way this thing boots right now.

These lockups are a blank screen with:

```
Loading Linux 4.12.12-gentoo-k3 ...
```

The k3 is my suffix, if I recompile the kernel I'll bump the suffix.

So the thing is, there's nothing in the logs now for the system.  There's no kernel panic, nothing after loading Linux .....

I've left it sit for the better part of an hour on the off chance it's just waiting on a timeout.

----------

## bammbamm808

Not sure if this applies to AMD gpus, but sometimes nvidia graphics need "nomodeset" passed to kernel to boot.

----------

## 1clue

 *bammbamm808 wrote:*   

> Not sure if this applies to AMD gpus, but sometimes nvidia graphics need "nomodeset" passed to kernel to boot.

 

Mine is all intel, and no nvidia on the system at all. The video card is an ASPEED chip made specifically for headless servers, driver built into the kernel. And I have never had to pass anything to the kernel on this box.

I have kernels on this system which have been used for at least weeks at a time. They booted successfully before, but no longer.

Maybe I need to read more of the handbook surrounding the boot loaders.

----------

## grumblebear

First, correctly configure the new kernel for booting efi stub. You mentioned, you have done that.

Then place the kernel in the correct location of the ESP partition. I assume you have your boot disk GPT partitioned.

What I think you missed, is to configure the boot priorities for UEFI in the BIOS. Also, if you get warnings about the CMOS battery, you will have to replace it anyway sooner or later, else you will have to reconfigure your entire BIOS more than once.

----------

## 1clue

Right at the moment my system is setup to boot either bios or efi, and it is somehow getting to grub. It's possible to get to grub either through BIOS boot or through efi boot, as that was my original efi setup to use grub.

My battery was changed out yesterday.

I've been fiddling with EFI boot priorities and have tried pretty much anything I can think of. It ALWAYS hits grub, and it ALWAYS launches one of the kernels (the latest mostly-untested or one of the older ones which have been successfully used for months before I started tinkering with what already worked) and then it ALWAYS prints that single line and stops. No panic, no other messages.

----------

## Tony0945

Try setting a longer timeout in grub.cfg or grub.conf, whichever one you have.

----------

## mir3x

You said that you followed https://wiki.gentoo.org/wiki/EFI_stub_kernel .

But have you used a GPT partition table?

Did you create an EFI partition?

Check https://wiki.gentoo.org/wiki/Handbook:AMD64/Installation/Disks

----------

## bammbamm808

An experience like this led me to rEFInd. I've used it for years, and once configured, it works well and predictably. I've even got some customizations to the look of my rEFInd menu screen. Setting it up to boot Windows 8.1 was fairly easy as well.

----------

## 1clue

 *mir3x wrote:*   

> You said that you followed https://wiki.gentoo.org/wiki/EFI_stub_kernel .
> 
> But have you used a GPT partition table?
> 
> Did you create an EFI partition?
> ...

 

Yes, it's GPT partition table and I have an EFI partition. It has worked booting EFI-only for years before I tried to get it to boot without using grub. Partition types are correct, the filesystem is fat32, all that.  My BIOS, when I can get to it, sees the files in my EFI partition because I can make entries for it in there.

----------

## mir3x

So maybe you forgot to add 

GRUB_PLATFORMS="efi-64" to  /etc/portage/make.conf ?

( and reemerge grub)

----------

## 1clue

I rechecked the entire handbook out to the reboot.

Remember that this has been an EFI-boot system for years now. Yes, /etc/portage/make.conf GRUB_PLATFORMS="efi-64" and I just got done rebuilding it.

I also did 'grub-install --target=x86_64-efi --efi-directory=/boot' again too. My EFI partition is mounted directly to /boot, it's not a separate folder inside of /boot.

----------

## mir3x

So maybe add earlyprintk=vga to the kernel params? (Press e in the grub menu to edit the entry and add it quickly, AFAIR.)

If it changes nothing, then the kernel is probably not loading at all (try rEFInd; it has a nice feature to detect the latest kernel, so you would never have to update the config again).

----------

## skellr

I would delete the kernel you put into /EFI/BOOT.  /EFI/BOOT is a fallback directory to use if all else fails. If the kernel that's there is broken, and things are failing before it gets to that point, you probably won't see any meaningful error message. It may be falling back to that image and causing grief.

----------

## Tony0945

Concur with skellr. Perhaps try using the kernel from sysrescuecd.

As mir3x said, with refind, refind will find all available kernels and boot the latest after your programmable timeout. With grub, it only takes a slight mis-typing to make your system unbootable.

----------

## Zucca

Ok. My time to guess.

For some reason the UEFI still fails to locate the "efistubbed" kernel. I've been there with one AMD APU UEFI laptop.

If UEFI cannot locate or boot the OS loader (the efistub kernel in this case), or does not recognize it, it then searches the ESP partition for "\EFI\BOOT\BOOTX64.EFI" and tries to boot it.

I managed to properly configure the UEFI to boot into the efistub kernel by downloading some UEFI shell application and placing it at "\EFI\BOOT\BOOTX64.EFI". Then via the shell I manually created a startup file for the UEFI shell. The startup file only had a timeout (to enter the setup) and then a command to load the efistub kernel. I never got the UEFI to boot the efistub kernel directly.  :Sad:  But at least it was still UEFI which loaded the kernel.

If you're giving paths for UEFI directly, always remember to use backslash as the directory separator. I think even Windows supports slash nowadays as a separator, but afaik most UEFI implementations do not.
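The separator advice can be illustrated with a small helper. This is only a sketch: the function name `to_uefi_path` and the sample path are made up for illustration, and it assumes the input path is given relative to the root of the ESP.

```shell
#!/bin/sh
# Convert a path relative to the ESP root (POSIX style) into the
# backslash-separated form used in UEFI boot entries.
# The function name and the sample path are illustrative only.
to_uefi_path() {
    # Force exactly one leading slash, then turn every "/" into "\".
    printf '%s\n' "/${1#/}" | tr '/' '\\'
}

to_uefi_path "EFI/gentoo/vmlinuz.efi"
```

The result is the form to paste into a firmware boot entry or a UEFI shell startup script.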

If the UEFI loads GRUB after whatever you try to do, then it might be reading the OS loader from disk offsets, because all other methods have failed? Yeah. I know. Wild guess. I'd try to zero the whole ESP (using dd) and then mkfs it and put the files back.

It's funny how I've also managed to get UEFI GRUB working ok, but efistub kernels have always failed in a way or another.  :Rolling Eyes: 

And as for why GRUB can no longer boot the kernel properly... Well, GRUB is a complex beast. I think someone will eventually run linux on grub shell to boot linux.

----------

## P.Kosunen

 *1clue wrote:*   

> Because I can, more than anything else.  This box isn't in production and I thought it would be good to figure this out.

 

Good to practise, but stub kernel just makes things more difficult IMO.

If that stub kernel does not print anything to the screen, there might be something missing; I remember having the same problem when missing some framebuffer driver.

----------

## Zucca

 *P.Kosunen wrote:*   

> Good to practise, but stub kernel just makes things more difficult IMO.

 It's mostly simpler, but you need to store the kernel command line inside the kernel, which kinda ruins the simplicity in some cases because to change the command line you need to recompile the kernel.

There might be exceptions if the UEFI implementation somehow allows passing arguments to the kernel... In fact, I think I've seen such UEFIs...

----------

## grumblebear

No, you do not need to compile the command line arguments into the kernel, the exception being the default bootx64.efi if you want to use that.

Other kernels can be configured with efibootmgr, including the command line. In fact, that is the simplest setup in my opinion. Why bother with grub when the UEFI BIOS has its own bootloader? I cannot understand why the wiki still seems to favor grub for first-time installs. There are dozens of threads here about struggling with grub and boot partitions.
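As a rough sketch of what such an entry could look like, the script below only prints an efibootmgr command rather than running it. The disk, partition number, label, loader path and kernel arguments are all placeholders, not values from this thread; the options used (`--create`, `--disk`, `--part`, `--label`, `--loader`, `--unicode`) are standard efibootmgr options.

```shell
#!/bin/sh
# Print (do not run) an efibootmgr command that registers an EFI stub
# kernel together with its command line. All values are placeholders.
DISK=/dev/sda                 # assumed disk holding the ESP
PART=1                        # assumed ESP partition number
LABEL="Gentoo stub"           # hypothetical menu label
LOADER='\vmlinuz.efi'         # hypothetical kernel path on the ESP
CMDLINE='root=/dev/sda3 ro'   # hypothetical kernel arguments

build_entry_cmd() {
    printf "efibootmgr --create --disk %s --part %s --label '%s' --loader '%s' --unicode '%s'\n" \
        "$DISK" "$PART" "$LABEL" "$LOADER" "$CMDLINE"
}

build_entry_cmd
```

Printing first makes it easy to check the loader path and arguments before touching the firmware's boot variables; the printed line can then be run as root once it looks right.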

----------

## Zucca

 *grumblebear wrote:*   

> Other kernels can be configured with efibootmgr, including the command line.

 So this is common among UEFI implementations?

And yes. You're absolutely right about UEFI being a capable bootloader, so why have another there...

However, I have at least one reason to use a separate bootloader: The ability to boot rescue image without actually having physical media for it. This, of course, assumes you don't break your bootloader installation.  :Wink: 

----------

## 1clue

At this point I'm thinking I need to:

chroot into the system

Completely remove the new unknown kernel

emerge @system (after update of course)

update grub config

Reinstall grub itself to /EFI/boot/bootx64.efi

Or maybe switch over to REFInd?
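The chroot and grub parts of that plan might look roughly like the dry run below, which prints the commands instead of executing them. The device names are assumptions and not taken from this thread; the mount point follows the Gentoo handbook, and the ESP is mounted at /boot as described earlier. `--removable` is the grub-install option that writes to the fallback path EFI/BOOT/BOOTX64.EFI. Removing the kernel and the emerge step are left out.

```shell
#!/bin/sh
# Dry run: print the recovery plan as commands instead of executing them.
# Device names are assumptions, not taken from the thread.
ROOT_DEV=/dev/sda3   # hypothetical root partition
ESP_DEV=/dev/sda1    # hypothetical EFI system partition
MNT=/mnt/gentoo      # mount point used in the Gentoo handbook

print_plan() {
    cat <<EOF
mount $ROOT_DEV $MNT
mount $ESP_DEV $MNT/boot
mount --types proc /proc $MNT/proc
mount --rbind /sys $MNT/sys
mount --rbind /dev $MNT/dev
chroot $MNT /bin/bash
grub-mkconfig -o /boot/grub/grub.cfg
grub-install --target=x86_64-efi --efi-directory=/boot --removable
EOF
}

print_plan
```

The last two commands are meant to be run inside the chroot, from rescue media booted in UEFI mode.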

----------

## Zucca

 *1clue wrote:*   

> Or maybe switch over to REFInd?

 It sure should be much simpler than GRUB.

It seems that rEFInd expects to find kernel and initrd from the ESP.

If you get it working and being able to recognize (find) kernels automatically, please report here.

I have a love-hate affair with GRUB and I feel like having a little adventure.

----------

## 1clue

 *Zucca wrote:*   

>  *1clue wrote:*   Or maybe switch over to REFInd? It sure should be much simpler than GRUB.
> 
> It seems that rEFInd expects to find kernel and initrd from the ESP.
> 
> If you get it working and being able to recognize (find) kernels automatically, please report here.
> ...

 

My affair with grub is hate/hate. At heart I'm a lilo guy. I have NEVER managed to figure out grub, except to get it working and then blindly follow the same procedure over and over again after that. I initially installed this system as grub because I needed it up and running, and then afterward switched to UEFI by what I thought to be the simplest means possible.

I think when this settles down I hope to run rEFInd.

Status:

I disconnected all drives and halted.

I removed the battery and reset the CMOS, for far longer than the recommended 5 seconds.  It worked this time, mostly.

I reinstalled the battery and booted the box, again with no drives.

It got boot errors, but after multiple resets and alternating function keys on the screen, I got back into BIOS.

I also completely reset the BMC too.

I STILL had crap from my attempts at boots in the bios.

I deleted everything I could find.

I configured boot order for EFI-only and went with basic drive priorities (cdrom first, etc)

I reattached the drives

I saved and reset the bios

Grub came up, I selected 1 kernel back, and it booted.

I did an update (emerge-webrsync...) which is still going.

I removed the questionable kernel and everything related to it.

I did grub-mkconfig > /boot/grub/grub.cfg

I have high hopes that next boot will be a hands-off boot.

At that point I'll switch over to getting rEFInd to work.

----------

## 1clue

Crap.

Now whatever I do, when I power up or reset, the screen says "System initializing..." and then on the lower right, the BIOS post codes go by until it gets to "ED" and then it hangs. For hours.

I've reset everything that can reset. I've unplugged every wire, popped the battery, reset the CMOS, cold reset the BMC. Whether it has drives plugged in or not, it always gets to the ED post code and stops. It doesn't even get to the point where I can press a key to enter the BIOS.

Also, AFAICT, there is no ED bios post code. I'm on AMI Bios 1.76.

----------

## bammbamm808

 *Zucca wrote:*   

>  *1clue wrote:*   Or maybe switch over to REFInd? It sure should be much simpler than GRUB.
> 
> It seems that rEFInd expects to find kernel and initrd from the ESP.
> 
> If you get it working and being able to recognize (find) kernels automatically, please report here.
> ...

 

I've been using rEFInd for years. It finds the efistub kernels reliably and requires minimal fiddling and configuration. I've also used it to boot Win 8.1 flawlessly. It does memtest and any other boot alternatives you might want. It's easy to skin and theme as well.

----------

## P.Kosunen

 *1clue wrote:*   

> Crap.

 

support@supermicro.com

Try contacting Supermicro's tech support.

https://www.anandtech.com/show/11110/semi-critical-intel-atom-c2000-flaw-discovered

There is also this.

----------

## 1clue

@P.Kosunen,

What a spectacularly craptastic event.

I'm trying really hard to tell you I appreciate your help, but it's difficult given the ramifications.

Maybe I can get another motherboard? I wonder if they'll let me upgrade to a c3958?

More likely I'll be up the creek without a paddle and get to buy my own board.

----------

## P.Kosunen

https://forum.pfsense.org/index.php?topic=126783.0

It sounds like Supermicro will give you a new or fixed board even if the warranty has expired; just mention the Atom C2000 bug. You can ask about an upgrade. I would guess it is not possible with the manufacturer, but a reseller/store might be able to do it when you return a broken product.

----------

## 1clue

That, at least, is really good news. I've already emailed the supermicro support center. If they don't offer to replace it I'll mention the bug.

Your google voodoo is much stronger than mine.

Thanks.

----------

## 1clue

I have an RMA request in, at the direction of the SuperMicro support guy. I should have my RMA number within 24 hours.

Since I got this board I've been impressed with SuperMicro and how they build things. Every piece of hardware is first-rate and performance is fantastic.

When I started having problems, at first I thought it was me or my messing up Gentoo. 

When the system failed to even get to the point where I can choose to enter the BIOS I had a sudden dislike of SuperMicro. It no longer seemed possible that this was my doing. It was maybe stronger because I had thought them so rock-solid before. Every piece of consumer hardware I've ever bought has turned out to have some sort of irritating flaw, whether or not it seriously impacted my intended use of the product.

In truth it may have been me messing with a direct-to-kernel boot that caused this failure, based on the link P.Kosunen provided. I have been booting and resetting frequently to get this thing to work, and that actually seems to be something that may exacerbate the aging of that resistor.

At this point, I've had pretty rapid response and instruction to RMA the board from SuperMicro. As well, it's not SuperMicro's fault that the flaw exists, but Intel's. I'm back in love with SuperMicro again. When it comes time to buy new desktop hardware they will definitely be on my short list of suppliers, along with system76 even though it will almost certainly be a gentoo box in short order.

----------

## Goverp

I've just entered this swamp myself, a new HP laptop, free up 600 GB disk for Gentoo (hey, one place where Windows 10 is better than Vista!), install via LiveDVD in a USB stick (only took 1hr to find out how to boot that!  Thanks Sakaki), and now fighting with rEFInd and HP's and Windows boot managers/loaders.

A relevant point to the above discussion - rEFInd found my Gentoo kernel (in the EFI partition), but the screen went black when I booted into it.  I'm pretty sure the issue is that it missed the refind_linux.conf file that specified "root=/dev/sda7" to boot the kernel, and I'd omitted to include said parameter within the kernel.  I think that may be because I called my kernel "gentoo.efi", but reading the rEFInd documentation, there's special handling of files labelled "vmlinux*.efi", so it probably wasn't looking.  I also noticed a comment about bootx64.efi not picking up external parameters, so perhaps it's the same problem - the kernel doesn't know where the root partition is.
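For reference, the per-kernel options file (called `refind_linux.conf` in the rEFInd documentation) holds one line per boot option: a quoted label followed by the quoted kernel arguments, and it lives in the same directory as the kernel. A minimal sketch that prints such a line follows; the helper name and the label are made up, and only `root=/dev/sda7` comes from the post above.

```shell
#!/bin/sh
# Print one refind_linux.conf line: a quoted menu label followed by the
# quoted kernel options. The helper name and label are illustrative.
refind_linux_line() {
    printf '"%s" "%s"\n' "$1" "$2"
}

# root=/dev/sda7 is the value mentioned in the post above.
refind_linux_line "Boot Gentoo" "root=/dev/sda7"
```

Redirecting the output into a file named refind_linux.conf next to the kernel on the ESP would be the usual next step.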

As an aside, I've been digging into the Windows boot manager configuration (BCDEdit et al).  Its default listing hides the firmware boot loader; you need to run "BCDEdit /enum all" to see the {fwbootmgr} entry, which has timeout=0 and the Windows boot manager first on its list.  Perhaps that's why all attempts to start, say, rEFInd instead of the Windows manager fail.  I may try fiddling with it in stages - I get seriously worried about bricking my system when playing in the EFI space. ...

----------

## P.Kosunen

 *Goverp wrote:*   

> I've just entered this swamp myself, a new HP laptop, free up 600 GB disk for Gentoo (hey, one place where Windows 10 is better than Vista!), install via LiveDVD in a USB stick (only took 1hr to find out how to boot that!  Thanks Sakaki), and now fighting with rEFInd and HP's and Windows boot managers/loaders.

 

I remember having problems with some old HP ProBook with Win7 Pro; it always defaulted to Windows after reboot even when the efibootmgr settings looked good. I ended up moving /Microsoft/Boot/bootmgfw.efi to /Microsoft/bootmgfw.efi and copying refind.efi and its config to /Microsoft/Boot/bootmgfw.efi + /Microsoft/Boot/refind.conf.

Windows 7 entry in refind.conf:

```
menuentry winseven {
   icon /EFI/refind/icons/os_win.icns
   loader /EFI/Microsoft/bootmgfw.efi
}
```

----------

## 1clue

Status update:

My board is on the way back home. I don't know if it's the original board or a different one, but considering the short time my board was at their site before the return was on its way, I suspect the serial number will not match the one I bought.

No problem for me, I'd rather have quick turnaround than the same exact piece of hardware.

----------

## 1clue

Another status update:

The board is back. They didn't get my weird BIOS post code of ED. They updated all the firmware, did the hardware fix and sent it back.

Something is wrong with the system. I've ordered some new cables for everything inside but at the moment it's actually running.

I had a bit of a scare, I was messing around with the IPMI software and set the IPMI interface to "dedicated." The problem is, that evidently changes it to the serial port and not a NIC. I don't have a serial cable, and nothing I could do would bring it back. I ordered a usb-to-serial-rj45 cable as well.

I finally found sys-apps/ipmicfg, which is a command-line utility for messing with supermicro firmware. I found out how to reset the BMC network settings from there. It's a pretty handy utility if you have a supermicro box.

So, at the moment I'm running in UEFI boot, with grub as a boot loader. Again.

I might give rEFInd a try. Or maybe I'll try the stub kernels again?  IDK right now.

----------

## Zucca

 *1clue wrote:*   

> So, at the moment I'm running in UEFI boot, with grub as a boot loader. Again.
> 
> I might give rEFInd a try. Or maybe I'll try the stub kernels again?  IDK right now.

 As you may have noticed, I recently moved to UEFI and rEFInd. After the installation and little configuration it... it just works. Just remember rEFInd is "OS loader" not boot loader. So you need to use "efistubbed" kernels with it.

I don't see much advantage with a plain efistubbed kernel vs. letting rEFInd choose one.

----------

## 1clue

I was watching your thread. It was one of the reasons I'm considering using rEFInd.

----------

