# NVME ssd not found when UEFI boot is enabled [SOLVED]

## muhlemmer

Hi, I bought a new laptop and decided to go for performance. It's an Intel Skylake / Sunrise Point based system with a Core i7 processor (i915, iwlwifi, etc.). The root drive is an NVMe SSD (Samsung 950) and I have a second SSD on SATA. I compiled my own custom-configured kernel, as I have been doing for years on my old Intel-based laptop. I am running gentoo-sources-4.4.4.

The issues

When booting with UEFI enabled, the NVMe SSD is not found. I started using Dracut to do some fault-finding in a shell, but the /dev/nvme0n1* device nodes are missing entirely. During early boot there is also an error in dmesg:

```
i801_smbus 0000:00:1f.4: Failed to allocate irq 255: -22
```

I can boot with classic BIOS mode, but only from my SATA SSD

Actions taken until now

Boot with sysrescuecd in UEFI mode. The device nodes show up and device can be mounted (using alterkern64, which is ver. 4.1.15)

Used the sysrescuecd Grub bootloader to load my kernel; the problems show up identically again

Booting the kernel in EFI Stub mode, stand alone or using rEFInd, problems identical

Disable EFI support under processor types, it boots, it finds and mounts my NVME root partitions, but I cannot reboot the system anymore. I still get similar, but more IRQ errors in dmesg.

To rule out misconfiguration on my side, I used Genkernel to compile my kernel, but it "forgets" to select the NVMe block device, so I added that one myself in the menuconfig phase. Anyway, same issues using the same kernel version (4.4.4).

To rule out regression bugs, I installed kernel 4.1.15, but same problems

Tried a lot of different kernel configs regarding PCI(e), SMbus, EFI etc.

Now, to me this sounds like either a kernel bug (such as missing hardware support) or a BIOS firmware bug. But maybe someone has a last bit of advice before I open a bug report, and I would like some input on what kind of bug tracing is required when opening one (kernel bugs are new to me). What I do wonder: why is the sysrescuecd kernel able to perform the miracle of finding my NVMe disk? Do they use some tweaks I didn't find? According to their website they use the Fedora Core patch set. Might it be possible that the guys at Fedora killed this bug, but the fix didn't make it upstream?
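On the question of what to attach to a bug report: below is a minimal sketch of a collection script, assuming a POSIX shell and that `lspci` (from pciutils) is installed. The function name `collect_boot_diagnostics` and the output file names are made up for illustration, and what the maintainers actually ask for may differ, so treat it as a starting point only.

```shell
#!/bin/sh
# Sketch: gather information that is commonly useful in a kernel bug report.
# The directory and file names are arbitrary choices, not a required layout.
collect_boot_diagnostics() {
    outdir="${1:-nvme-bug-report}"
    mkdir -p "$outdir" || return 1

    # Running kernel version and its command line.
    uname -a > "$outdir/uname.txt"
    if [ -r /proc/cmdline ]; then
        cat /proc/cmdline > "$outdir/cmdline.txt"
    fi

    # Kernel log; may need root on systems that restrict dmesg.
    dmesg > "$outdir/dmesg.txt" 2>/dev/null

    # PCI devices with numeric IDs and bound drivers, if lspci is available.
    if command -v lspci >/dev/null 2>&1; then
        lspci -nnk > "$outdir/lspci.txt" 2>/dev/null
    fi

    # Kernel configuration, only if the kernel exposes it (CONFIG_IKCONFIG_PROC).
    if [ -r /proc/config.gz ]; then
        gzip -dc /proc/config.gz > "$outdir/kernel-config.txt"
    fi

    echo "$outdir"
}

# Example: collect_boot_diagnostics ./report-working-kernel
```

Running it once under the working sysrescuecd kernel and once under the failing kernel (as far as the tools are present there) gives two sets of files that can be compared side by side.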

Why enable UEFI?

I know that people think UEFI is evil, so I'd like to explain my motivation a bit.

Classic BIOS mode is slow as hell on this laptop. Loading Grub and kernel seems to take ages (At least 5 seconds all together)

I bought a new expensive laptop, because I want speed and fast boot times.

There might be users out there with the same laptop who did not purchase a second SSD or HDD next to the NVMe one. They cannot boot in classic BIOS mode like this and are stuck with UEFI.

It upsets me: I deliberately selected a laptop without nVidia or ATI crap, all Intel, because I had the idea that Intel just works (since they are quite involved in kernel development)

----------

## Wallsandfences

I have a current intel nuc and I needed to upgrade my bios to make the 950 work. It now boots flawlessly and fast in uefi mode.

You will most probably encounter some trouble with skylake vga and firmware, i fear...

Rüdiger

----------

## Roman_Gruber

First thing is a BIOS update. There is no way around UEFI; it is the new standard, I suppose.

As with any new fancy stuff, you are like a beta tester: lots of issues until it works. That is my main reason for buying a 3-year-old laptop, and guess what, I still have issues with the Linux kernel.

You may try to point your UEFI to the SATA disk's grub, and then use that grub to boot from your NVMe disk.

I am not sure, but I think I have read that even Windows users have problems booting Windows from that NVMe thing.

You may look up how these users solved it and try to apply the fix to your case too.

----------

## muhlemmer

No bios Update available yet. Pulled in git-sources-4.5-rc7 since there was a lot of work on NVMe for the 4.5 release. But with that version, the device is not showing anymore at all. Tried to obtain a membership in the nvme-linux mailing list to raise a bug, but until now I didn't receive subscription confirmation. Probably I'll post to the kernel bugzilla instead.

----------

## olejseba

Sorry for my English.

But:

1) Run sysrescuecd in EFI mode.

2) Give us the result of fdisk -l (lowercase L).

3) If you have something like this:

```

/dev/nvme0n1p1     2048   1050623   1048576    512M EFI System
/dev/nvme0n1p2  1050624  68159487  67108864     32G Linux filesystem
/dev/nvme0n1p3 68159488 500118158 431958671    206G Linux LVM

```

4) Then, with /dev/nvme0n1p2 as the root partition:

```

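# Mount the root partition and bind the live system's /dev and /sys into it
mount /dev/nvme0n1p2 /mnt/gentoo
mount --rbind /dev /mnt/gentoo/dev
mount --rbind /sys /mnt/gentoo/sys
mount -t proc none /mnt/gentoo/proc
chroot /mnt/gentoo /bin/bash
# Inside the chroot: mount the EFI system partition
mount /dev/nvme0n1p1 /boot/efi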

```

Now you need to decide whether to boot with or without grub2.

With grub2:

```

grub2-install --target=x86_64-efi --recheck --debug

```

Without grub, something like this:

```

# efibootmgr --create --part 1 --label "Gentoo_nvme" --disk /dev/nvme0n1 --loader '\EFI\gentoo\bootx64.efi' -u 'init=/usr/lib/systemd/systemd root=/dev/nvme0n1p2 rootfstype=ext4 raid=noautodetect'

```


Remember:

1) Set the EFI USE flags and rebuild the system, including grub.

2) Enable FAT (DOS/FAT32) filesystem support and EFI support in the kernel.

----------

## muhlemmer

So, if I start sysrescuecd and run fdisk -l, I get (I removed my other disks to make things shorter):

```
Disk /dev/nvme0n1: 238.5 GiB, 256060514304 bytes, 500118192 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 57BFF26A-281C-4C94-B88D-AC56D6010AB6

Device             Start       End   Sectors   Size Type
/dev/nvme0n1p1      4096   1048575   1044480   510M EFI System
/dev/nvme0n1p2   1048576  70311935  69263360    33G Linux filesystem
/dev/nvme0n1p3  70311936 117186559  46874624  22.4G Linux swap
/dev/nvme0n1p4 117186560 500118158 382931599 182.6G Linux filesystem
```

If I run lspci -k, I get:

```
00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 07)
        Subsystem: CLEVO/KAPOK Computer Sky Lake Host Bridge/DRAM Registers
00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 06)
        Subsystem: CLEVO/KAPOK Computer Sky Lake Integrated Graphics
        Kernel modules: i915
00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
        Subsystem: CLEVO/KAPOK Computer Sunrise Point-H USB 3.0 xHCI Controller
        Kernel driver in use: xhci_hcd
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-H Thermal subsystem (rev 31)
        Subsystem: CLEVO/KAPOK Computer Sunrise Point-H Thermal subsystem
00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
        Subsystem: CLEVO/KAPOK Computer Sunrise Point-H CSME HECI
00:17.0 SATA controller: Intel Corporation Device a102 (rev 31)
        Subsystem: CLEVO/KAPOK Computer Device 3568
        Kernel driver in use: ahci
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #3 (rev f1)
        Kernel driver in use: pcieport
        Kernel modules: shpchp
00:1c.3 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #4 (rev f1)
        Kernel driver in use: pcieport
        Kernel modules: shpchp
00:1c.5 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #6 (rev f1)
        Kernel driver in use: pcieport
        Kernel modules: shpchp
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #13 (rev f1)
        Kernel driver in use: pcieport
        Kernel modules: shpchp
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
        Subsystem: CLEVO/KAPOK Computer Sunrise Point-H LPC Controller
00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
        Subsystem: CLEVO/KAPOK Computer Sunrise Point-H PMC
00:1f.3 Audio device: Intel Corporation Sunrise Point-H HD Audio (rev 31)
        Subsystem: CLEVO/KAPOK Computer Sunrise Point-H HD Audio
00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
        Subsystem: CLEVO/KAPOK Computer Sunrise Point-H SMBus
        Kernel driver in use: i801_smbus
        Kernel modules: i2c_i801
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V (rev 31)
        Subsystem: CLEVO/KAPOK Computer Ethernet Connection (2) I219-V
        Kernel driver in use: e1000e
        Kernel modules: e1000e
01:00.0 Network controller: Intel Corporation Wireless 3165 (rev 81)
        Subsystem: Intel Corporation Dual Band Wireless AC 3165
        Kernel modules: iwlwifi
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5229 PCI Express Card Reader (rev 01)
        Subsystem: CLEVO/KAPOK Computer RTS5229 PCI Express Card Reader
        Kernel driver in use: rtsx_pci
        Kernel modules: rtsx_pci
04:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a802 (rev 01)
        Subsystem: Samsung Electronics Co Ltd Device a801
        Kernel driver in use: nvme
        Kernel modules: nvme
```

Last entry shows that my nvme device is found and a driver is used for it.

My first try was to use grub, but it can't load the nvme disk either and drops me to the grub shell (when BIOS is in EFI mode). Now, I compiled my kernel with EFI stub support and started to use rEFInd as bootloader. This is to attach a dracut initramfs with a shell and some basic tools for fault-finding. This would not be possible if I use efibootmgr alone. (Anyway, the same error occurred in either set-up).

So, when I am in the dracut emergency shell and execute fdisk -l:

```
Disk /dev/sda: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x85f1faaa

Device     Boot   Start       End   Sectors   Size Id Type
/dev/sda1          2048   1050623   1048576   512M  b W95 FAT32
/dev/sda2       1050624 488397167 487346544 232.4G 83 Linux

Disk /dev/sdb: 698.7 GiB, 750156374016 bytes, 1465149168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
```

sda = (second) SATA SSD

sdb = HDD with btrfs

nvme0n1 doesn't show

When I execute lspci -k, you can see that device 04:00.0 is no longer there. (Note that dracut did not include the PCI ID database, so lspci prints the raw vendor:device IDs instead of names.)

```
00:00.0 Class 0600: Device 8086:191f (rev 07)
        Subsystem: Device 1558:3568
00:02.0 Class 0300: Device 8086:1912 (rev 06)
        Subsystem: Device 1558:3568
        Kernel driver in use: i915
00:14.0 Class 0c03: Device 8086:a12f (rev 31)
        Subsystem: Device 1558:3568
        Kernel driver in use: xhci_hcd
00:14.2 Class 1180: Device 8086:a131 (rev 31)
        Subsystem: Device 1558:3568
00:16.0 Class 0780: Device 8086:a13a (rev 31)
        Subsystem: Device 1558:3568
00:17.0 Class 0106: Device 8086:a102 (rev 31)
        Subsystem: Device 1558:3568
        Kernel driver in use: ahci
00:1c.0 Class 0604: Device 8086:a112 (rev f1)
        Kernel driver in use: pcieport
00:1c.3 Class 0604: Device 8086:a113 (rev f1)
        Kernel driver in use: pcieport
00:1c.5 Class 0604: Device 8086:a115 (rev f1)
        Kernel driver in use: pcieport
00:1d.0 Class 0604: Device 8086:a11c (rev f1)
        Kernel driver in use: pcieport
00:1f.0 Class 0601: Device 8086:a144 (rev 31)
        Subsystem: Device 1558:3568
00:1f.2 Class 0580: Device 8086:a121 (rev 31)
        Subsystem: Device 1558:3568
00:1f.3 Class 0403: Device 8086:a170 (rev 31)
        Subsystem: Device 1558:3568
00:1f.4 Class 0c05: Device 8086:a123 (rev 31)
        Subsystem: Device 1558:3568
        Kernel driver in use: i801_smbus
00:1f.6 Class 0200: Device 8086:15b8 (rev 31)
        Subsystem: Device 1558:3568
01:00.0 Class 0280: Device 8086:3165 (rev 81)
        Subsystem: Device 8086:4010
02:00.0 Class ff00: Device 10ec:5229 (rev 01)
        Subsystem: Device 1558:3568
```
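Spotting the missing device by eye is error-prone with listings this long. Here is a small sketch that compares two saved `lspci -k` dumps and prints the PCI addresses present in the first but absent from the second. The function names are hypothetical, and it assumes addresses without a domain prefix, as in the outputs in this post.

```shell
#!/bin/sh
# Extract the PCI addresses (e.g. 04:00.0) from a saved lspci dump.
pci_addresses() {
    # Device lines start with bus:device.function; detail lines are indented.
    grep -Eo '^[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' "$1" | sort -u
}

# Print addresses found in dump $1 (working boot) but not in dump $2 (failing boot).
missing_pci_devices() {
    good="$(mktemp)" && bad="$(mktemp)" || return 1
    pci_addresses "$1" > "$good"
    pci_addresses "$2" > "$bad"
    # comm -23 keeps only the lines unique to the first sorted file.
    comm -23 "$good" "$bad"
    rm -f "$good" "$bad"
}

# Example: missing_pci_devices lspci-sysrescuecd.txt lspci-dracut.txt
```

Applied to the two listings in this post, the only address it should report is 04:00.0, the Samsung NVMe controller.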

----------

## olejseba

Please also show us the output of efibootmgr -v.

----------

## muhlemmer

 *olejseba wrote:*   

> Please also show us the output of efibootmgr -v.

 

```
BootCurrent: 0003
Timeout: 1 seconds
BootOrder: 0003
Boot0003* UEFI OS       HD(1,GPT,17417a81-3483-4f6d-b99d-6ab9520e53b0,0x1000,0xff000)/File(\EFI\BOOT\BOOTX64.EFI)..BO
```

When executed in sysrescuecd, I get the same result plus an entry for the sysrescuecd USB stick.

I don't think the problem resides in the boot entry, since loading the kernel works. I think the problem lies somewhere in the PCIe stack, which no longer finds the NVMe device.

----------

## olejseba

Your EFI partition is not configured correctly.

After the chroot, mount /boot/efi and run:

```

efibootmgr --create --part 1 --label "Gentoo_nvme" --disk /dev/nvme0n1 --loader '\EFI\gentoo\bootx64.efi' -u 'init=/usr/lib/systemd/systemd root=/dev/nvme0n1p2 rootfstype=ext4 raid=noautodetect'

```

Copy the kernel image to /boot/efi/EFI/gentoo/bootx64.efi.

Then please show us the output of efibootmgr -v again.

The nvme0n1 disk must appear in the output of the efibootmgr command. Without that entry the UEFI firmware does not see it as a boot disk. I have a configuration without grub on the same NVMe drive, but on an ASUS X99 PRO.

----------

## saellaven

keep in mind, if you aren't using systemd, and likely even if you ARE, that particular efibootmgr command isn't going to work on your system (likely wrong disk/wrong root and possibly wrong init options)

----------

## muhlemmer

 *olejseba wrote:*   

> Your EFI partition is not configured correctly.
> 
> after chroot mount /boot/efi
> 
> ```
> ...

 

Okay, I removed the rEFInd entry that was not correct according to you, and deleted rEFInd completely from the EFI partition to start clean. I did all the steps you mentioned and checked that the boot parameters match my setup. Again, on boot the kernel loads but cannot find the NVMe device. The difference is that I now get a kernel panic, and I cannot check lspci because I no longer have a shell from dracut.

Output from efibootmgr -v, issued from sysrescuecd after reboot:

```
BootCurrent: 0001
Timeout: 1 seconds
BootOrder: 0000,0001
Boot0000* Gentoo-4.4.5  HD(1,GPT,17417a81-3483-4f6d-b99d-6ab9520e53b0,0x1000,0xff000)/File(\efi\4.4.5-gentoo\bootx64.efi)i.n.i.t.=./.u.s.r./.l.i.b./.s.y.s.t.e.m.d./.s.y.s.t.e.m.d. .r.o.o.t.=./.d.e.v./.n.v.m.e.0.n.1.p.2. .r.o. .r.o.o.t.f.s.t.y.p.e.=.e.x.t.4. .q.u.i.e.t.
Boot0001* UEFI: General UDisk 5.00, Partition 1 PciRoot(0x0)/Pci(0x14,0x0)/USB(0,0)/HD(1,MBR,0x4294967212,0x1,0x3b5fff)..BO
```
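The dotted kernel arguments in the Boot0000 entry are hard to read. Assuming the dots are how `efibootmgr -v` renders the zero byte that follows each character of the UCS-2 encoded arguments, a one-line filter can make them legible. The function name `decode_efibootmgr_args` is made up for this sketch.

```shell
#!/bin/sh
# Turn "r.o.o.t.=./.d.e.v." back into "root=/dev" by dropping
# the "." that follows every character.
decode_efibootmgr_args() {
    printf '%s\n' "$1" | sed 's/\(.\)\./\1/g'
}

# Example:
# decode_efibootmgr_args 'r.o.o.t.=./.d.e.v./.n.v.m.e.0.n.1.p.2. .r.o.'
```

This only makes the stored arguments easier to proofread; it does not change the boot entry itself.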

olejseba, could you be so kind as to tell me which kernel sources and version you are using? Maybe you can also send me your working .config file so I can see if I forgot some settings.

----------

## olejseba

Super. Now you need to add the BLK_DEV_NVME option to the kernel (built in, not as a module). I will send you the .config for my latest kernel, linux-4.4.2-gentoo, via private message. It also worked on the last stable 4.1 kernel.

```

NVM Express block device (BLK_DEV_NVME)

CONFIG_BLK_DEV_NVME:

The NVM Express driver is for solid state drives directly
connected to the PCI or PCI Express bus. If you know you
don't have one of these, it is safe to answer N.

To compile this driver as a module, choose M here: the
module will be called nvme.

Symbol: BLK_DEV_NVME [=m]
Type : tristate
Prompt: NVM Express block device
Location:
-> Device Drivers
Defined at drivers/nvme/host/Kconfig:1
Depends on: PCI [=y] && BLOCK [=y]

```
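To check how an option such as BLK_DEV_NVME is actually set in a given .config, something like the following sketch could be used. The function name `kconfig_state` is hypothetical, and the example path `/usr/src/linux/.config` is the usual Gentoo location but an assumption here.

```shell
#!/bin/sh
# Report whether a kernel option is built in, a module, or not set.
# $1 = option name without the CONFIG_ prefix, $2 = path to a .config file
kconfig_state() {
    value="$(sed -n "s/^CONFIG_$1=\(.*\)$/\1/p" "$2" | head -n 1)"
    case "$value" in
        y)  echo "built-in" ;;
        m)  echo "module" ;;
        "") echo "not set" ;;   # covers "# CONFIG_X is not set" and absent options
        *)  echo "$value" ;;    # string or numeric options
    esac
}

# Example: kconfig_state BLK_DEV_NVME /usr/src/linux/.config
```

A result of "module" means the driver has to be available in the initramfs for an NVMe root device to be found at boot.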

----------

## nativemad

It shouldn't matter if nvme is compiled as a module, as long as the module is within the initramfs - Dracut should include it... But sure, it is easier to just compile it in and not depend on other tools to do the right thing for you! ;)

I use efi-stub on my skylake based xps13 with nvme as root without any initramfs. You could also have my .config if you like... just PM me.

I guess the problem really boils down to the IRQ issue! I mean, you also don't see the drive when you have booted successfully from the normal SSD, as you said.

Maybe it is just a kernel config issue with how IRQs are handled!? Maybe you could also influence the outcome via a BIOS option...

HTH, cheers

----------

## muhlemmer

I've had the NVMe option compiled in from day 1. It's just that the PCI device is not showing in lspci, and my prime suspect is the IRQ error from the SMBus driver. I think that one kicks out the NVMe device. In principle I prefer to boot without any initramfs. I only started using one to have a shell at my disposal after the root device is not found. I received the config from olejseba. During the weekend I will run a diff of mine against his, to see if I overlooked something. nativemad, I will get in touch with you also to do the same.
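A plain `diff` of two .config files tends to be noisy because of comments and ordering. As a rough sketch, the comparison could be restricted to the option lines themselves; the function name `kconfig_diff` and the file names in the example are hypothetical.

```shell
#!/bin/sh
# Show only the options that differ between two kernel .config files.
kconfig_diff() {
    a="$(mktemp)" && b="$(mktemp)" || return 1
    pattern='^(CONFIG_[A-Za-z0-9_]+=|# CONFIG_[A-Za-z0-9_]+ is not set)'
    # Keep lines that set an option or explicitly mark it as not set.
    grep -E "$pattern" "$1" | sort > "$a"
    grep -E "$pattern" "$2" | sort > "$b"
    # "<" lines exist only in the first config, ">" lines only in the second.
    diff "$a" "$b" | grep '^[<>]'
    rm -f "$a" "$b"
}

# Example: kconfig_diff my.config olejseba.config | grep -E 'NVME|EFI|IRQ|PCI'
```

The kernel source tree also ships a `scripts/diffconfig` helper for this purpose, which is probably the better tool when it is at hand.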

----------

## Tony0945

 *muhlemmer wrote:*   

>  It's just that the PCI device is not showing in lspci 

 

Just a shot in the dark, but I had trouble with a PCI-e TV card until I pushed the card a little harder into the socket. No discernible movement, but it solved the recognition problem.

----------

## olejseba

Sorry for my English.

I think the IRQ is not the problem in this case. If it were, booting from sysrescuecd would not give access to the disk either.

This is my IRQ kernel config:

```

CONFIG_IRQ_WORK=y
# IRQ subsystem
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
CONFIG_INLINE_READ_UNLOCK_IRQ=y
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y
CONFIG_HT_IRQ=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
CONFIG_SERIAL_8250_DETECT_IRQ=y
CONFIG_UIO_PDRV_GENIRQ=m
CONFIG_UIO_DMEM_GENIRQ=m
CONFIG_IRQ_BYPASS_MANAGER=m
CONFIG_IRQ_REMAP=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_IRQ_BYPASS=y

```

These entries are also very important:

```

CONFIG_EFI_PARTITION=y
CONFIG_EFI=y
CONFIG_EFI_STUB=y
CONFIG_EFI_MIXED=y
CONFIG_FB_EFI=y
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
CONFIG_EFI_VARS=y
CONFIG_EFI_ESRT=y
CONFIG_EFI_RUNTIME_MAP=y
CONFIG_EFI_RUNTIME_WRAPPERS=y
CONFIG_CACHEFILES=m
CONFIG_EFIVAR_FS=m
CONFIG_EARLY_PRINTK_EFI=y

```

----------

## muhlemmer

Hi. I received some kernel configs from olejseba and nativemad. Thanks for that. But because of the many differences in hardware and module configuration, they did not give me any meaningful answers. Cleaning up drivers and selecting the ones I need was just eating too much time (I couldn't get those kernels to boot my system, even in BIOS mode). As a last resort I experimented a bit with the sysrescuecd/Fedora config in the following set-ups:

sys-kernel/vanilla-sources:4.1.19 -> make oldconfig -> make && make modules_install -> cp arch/x86/boot/bzImage /boot/efi/gentoo/bootx64.efi : Boots, finds and mounts my NVME device

sys-kernel/vanilla-sources:4.1.19 -> make localmodconfig -> make && make modules_install -> cp arch/x86/boot/bzImage /boot/efi/gentoo/bootx64.efi : Boots, finds and mounts my NVME device

sys-kernel/vanilla-sources:4.4.5 -> make oldconfig -> select all as default -> make && make modules_install -> cp arch/x86/boot/bzImage /boot/efi/gentoo/bootx64.efi : Boots and doesn't find my NVME device. Same symptoms as before.

I do think by now I'm hitting a very nasty kernel bug, which seems to be showing on my specific system setup... The biggest change I can see is that in 4.1.* kernels, the NVME device is listed under block devices. At versions 4.4.* it seems to be moved one level up, device drivers. After I gather some new patience and time, I will try to write up a kernel bug report. Never done that and I have no idea how to provide meaningful information regarding this problem. Guess for now I will have to boot my system in classic BIOS mode.

Thanks for all your help and suggestions, but for now I'm bailing out.

PS: SMBus was compiled as a module in the last setup, so no IRQ allocation errors were raised by it. Still, the NVMe device was not found.

PS2: I think I forgot to mention that I also tried sys-kernel/git-sources:4.5-rc7, since there were massive contributions to the NVMe code. But with that kernel version the NVMe device is not found, even in classic BIOS mode. Seems things are getting worse :(

----------

## muhlemmer

It's a bit late to notify, but the issues were solved in kernel 4.5-rc after I filed a bug report on the NVMe dev mailing list. For people with NVMe devices in a laptop, I recommend using at least kernel version 4.9, as this is the first Gentoo stable kernel in the tree after 4.5.

The SMBus IRQ issues mentioned in this topic had nothing to do with the problem.
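For anyone who wants to script the "at least 4.9" check, here is a minimal sketch. It assumes a `sort` that supports `-V` (version sort, as in GNU coreutils), and the function name `version_at_least` is made up for illustration.

```shell
#!/bin/sh
# Succeed when version $1 is greater than or equal to version $2.
version_at_least() {
    # Version-sort both values; $2 must come out first (or be equal).
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

# Example:
# version_at_least "$(uname -r)" 4.9 || echo "kernel older than 4.9"
```

The comparison is purely on the version string, so it says nothing about whether a distribution backported the fix to an older kernel.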

----------

## Fulgurance

Hello, sorry for my bad English, I'm French. Have you enabled the minimal kernel configuration for SSD devices?

https://wiki.gentoo.org/wiki/NVMe

----------

## muhlemmer

 *Fulgurance wrote:*   

> Hello, sorry for my bad English, I'm French. Have you enabled the minimal kernel configuration for SSD devices?
> 
> https://wiki.gentoo.org/wiki/NVMe

 

Yes, of course. But as I said, and as the modified subject shows, this was solved a long time ago. I just forgot to update the thread.

----------

