# My VPS won't boot with kernel >=4.15.0

## Elleni

My virtual private server is set up with an encrypted partition, including boot. I used the following guide for the setup:

http://blog.guya.de/linux-gentoo-encrypted-boot-partition/

For more than a year it was no problem to build updated kernels. I just copied my .config from the previous kernel and then continued with make menuconfig && make -j9 && make install && genkernel --luks initramfs && grub-mkconfig -o /boot/grub/grub.cfg

But for some reason this no longer works with the 4.15.0 kernel. Very early in the boot process the VNC window is closed, and on my hoster's web interface I can see that the server is offline. Does anyone have an idea what could have changed in kernels >=4.15.0 that would explain this behaviour? Could it be related to some new kernel options, e.g. retpoline or PTI, and/or is there a problem with the way cryptsetup with a key is set up according to the above link?

Here are two short videos showing how it looks now and how it looked before (up to kernel 4.14.x). Currently I am running 4.9.76-gentoo-r1, the latest stable kernel.

Failed boot: 

https://videobin.org/+po7/u7q.html

Booting ok: 

https://videobin.org/+po8/u7r.html

I don't even know where to start troubleshooting this, as it happens so early in the boot process that I have no clue where I could find any useful error messages. Does anyone have a hint for me on where to start?

----------

## Elleni

Please give me something to work with, a starting point for finding the issue. I can't find anything in /var/log/messages, and I think this is because the error happens too early in the boot process. So how do I debug this?

----------

## NeddySeagoon

Elleni,

Wild guess as there is little to go on.

It's an Intel CPU and you are using a hardened profile.

If that's not true, we need to do some analysis, starting with your lspci output, your kernel .config (put that on a pastebin) and a description of how you migrate from one kernel to another.

Using an encrypted disk on a VPS is fairly pointless. It only protects against data theft when the system is off.

That makes it useful for portable equipment and less so for fixed systems.

-- edit -- 

"It's an Intel CPU and you are using a hardened profile" was a feature of 4.14.x.

It's fixed in 4.15.0.

----------

## grumblebear

Just copying the old .config will rarely work. At least do a "make oldconfig" before building the new kernel.

----------

## szatox

That video is unreadable for me.

It seems to have loaded the kernel; perhaps you're missing some modules, either built in or in the initramfs. It's a shot in the dark, I just can't read anything besides the loading dots. Can you just boot the old kernel again and then try to rebuild the new one?

You could reuse config from the working kernel, just copy it and repair with oldconfig.
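A minimal sketch of that copy-and-repair step, assuming both kernel trees live under /usr/src. The helper name `carry_config` and the `MAKE` override are hypothetical and only there for illustration; `make oldconfig` itself is the standard kernel build target, which keeps the existing answers and asks only about options the old config does not know.

```shell
# carry_config OLD_TREE NEW_TREE  (hypothetical helper, not part of any tool)
# Copies the .config of a known-good kernel tree into a new tree and lets the
# kernel build system bring it up to date.
carry_config() {
    old_tree=$1
    new_tree=$2
    cp "$old_tree/.config" "$new_tree/.config" || return 1
    # "make oldconfig" keeps every existing answer and prompts only for
    # symbols that are new to this config. MAKE can be overridden for dry runs.
    "${MAKE:-make}" -C "$new_tree" oldconfig
}

# Example call; the directory names are illustrative guesses based on the
# versions mentioned in this thread:
# carry_config /usr/src/linux-4.14.20-gentoo /usr/src/linux-4.15.4-gentoo
```

Answering the new questions one by one makes it easier to see afterwards which options were introduced by the newer kernel.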

----------

## Elleni

 *NeddySeagoon wrote:*   

> Elleni,
> 
> Wild guess as there is little to go on.
> 
> Its an Intel CPU and you are using a hardened profile.
> ...

 

Hello Neddy, yes, you are correct. It's a hardened profile and a virtual Intel CPU. But what does that mean? I build the kernel with:

```
make menuconfig && make -j9 && make install && genkernel --luks initramfs && grub-mkconfig -o /boot/grub/grub.cfg
```

grumblebear, part of make after menuconfig is

```
scripts/kconfig/conf  --silentoldconfig Kconfig
```

so the kernel options from the previous working config are carried over.

Using an encrypted disk was more of a learning experience. Maybe it protects the data if the provider sells old disks on Amazon; I know that it's not really very meaningful. But this procedure has worked for more than a year now, and suddenly it does not anymore with kernel 4.15 and newer.

szatox, this .config works with older kernels before 4.15. The video is only meant to show how early the VNC window is closed and the server goes offline, even before the first services come up.

Last edited by Elleni on Wed Feb 07, 2018 10:22 pm; edited 1 time in total

----------

## Elleni

Neddy, does that mean changing profile from 

```
default/linux/amd64/17.0/no-multilib/hardened
```

to 

```
default/linux/amd64/17.0/no-multilib
```

will do the trick?

I'll try. Thanks.

Edit: Will emerge -uDNav --with-bdeps=y be enough after the profile switch, or should I do emerge -e system followed by emerge -e world? And should I rebuild the kernel after the profile switch?

----------

## NeddySeagoon

Elleni,

There was a problem with the gentoo hardened profile and the 4.14 kernel that prevented the kernel booting on Intel CPUs.

That's why the 4.14 stable gentoo kernel was masked while it was investigated.

The most obvious symptom was an early panic. It's been fixed now by a change to the CFLAGS used for building the kernel, and only 4.14 was affected as far as I know.

Whatever, it's fixed in 4.15, so it's not that unless you are editing the kernel build-time CFLAGS.

Now, what of your lspci and kernel .config?

There is at least one new option added to the kernel recently that was defaulted to off. It was a very bad thing as it disabled USB interfaces connected to a PCI bus.

That's fixed now too.  It may not matter on a VPS.

----------

## Elleni

Oh, ok, I am doing the profile switch and emerge -uDNav --with-bdeps=y anyways 

lspci

```
00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82G35 Express PCI Express Root Port (rev 02)
00:03.0 Unassigned class [ff00]: Parallels, Inc. Virtual Machine Communication Interface
00:05.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper)
00:0a.0 PCI bridge: Digital Equipment Corporation DECchip 21150
00:0e.0 RAM memory: Red Hat, Inc Virtio memory balloon
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
01:00.0 VGA compatible controller: Parallels, Inc. Accelerated Virtual Video Adapter
```

cat /usr/src/linux/.config |wgetpaste

https://paste.pound-python.org/show/UNKxgnItCabka7K4G2Hs/

emerge --info

```
Portage 2.3.19 (python 2.7.14-final-0, default/linux/amd64/17.0/no-multilib, gcc-6.4.0, glibc-2.25-r9, 4.9.76-gentoo-r1 x86_64)
=================================================================
System uname: Linux-4.9.76-gentoo-r1-x86_64-Intel-R-_Xeon-R-_CPU_E5-2620_v3_@_2.40GHz-with-gentoo-2.4.1
KiB Mem:     6116180 total,   2555680 free
KiB Swap:    4194300 total,   4194300 free
Timestamp of repository gentoo: Wed, 07 Feb 2018 22:00:01 +0000
Head commit of repository gentoo: 471d2fa0870254bcc6557cef8f429d85cc512e71
sh bash 4.4_p12
ld GNU ld (Gentoo 2.29.1 p3) 2.29.1
app-shells/bash:          4.4_p12::gentoo
dev-lang/perl:            5.24.3::gentoo
dev-lang/python:          2.7.14-r1::gentoo, 3.5.4-r1::gentoo
dev-util/cmake:           3.9.6::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.4.1-r2::gentoo
sys-apps/openrc:          0.34.11::gentoo
sys-apps/sandbox:         2.12::gentoo
sys-devel/autoconf:       2.69-r4::gentoo
sys-devel/automake:       1.15.1-r1::gentoo
sys-devel/binutils:       2.29.1-r1::gentoo
sys-devel/gcc:            6.4.0-r1::gentoo
sys-devel/gcc-config:     1.8-r1::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1::gentoo
sys-kernel/linux-headers: 4.13::gentoo (virtual/os-headers)
sys-libs/glibc:           2.25-r9::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-extra-opts: 

x-portage
    location: /usr/local/portage
    masters: gentoo
    priority: 0

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/easy-rsa /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.6/ext-active/ /etc/php/apache2-php7.1/ext-active/ /etc/php/cgi-php5.6/ext-active/ /etc/php/cgi-php7.1/ext-active/ /etc/php/cli-php5.6/ext-active/ /etc/php/cli-php7.1/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs candy config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="de_CH.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="de el en fr it tr"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="3dnow 3dnowext acl amd64 apache2 authdaemond berkdb bzip2 cgi clamav clamdtop cli crypt cryptsetup curl cxx device-mapper dkim dovecot-sasl dri fam fontconfig fortran fpm gd gdbm geoip iconv imap jpeg libmysqlclient maildir mmx mmxext modules mysql mysqli ncurses nls nptl openmp pam pcntl pcre pdo png popcnt readline seccomp sockets spamassassin spell sqlite sse sse2 sse3 sse4_1 sse4a ssl symlink tcpd truetype unicode vhosts xattr xmlwriter zip zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_core authn_dbm authn_file authz_core authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation proxy proxy_http proxy_wstunnel rewrite setenvif socache_shmcb speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev" KERNEL="linux" L10N="de" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-1" POSTGRES_TARGETS="postgres9_5" PYTHON_SINGLE_TARGET="python3_5" PYTHON_TARGETS="python2_7 python3_5" RUBY_TARGETS="ruby22 ruby23" USERLAND="GNU" 
VIDEO_CARDS="parallels vesa vga" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"

Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
```

----------

## Elleni

Still the same situation after switching the profile and re-emerging world with -uDNav --with-bdeps=y, followed by a kernel rebuild.

----------

## Elleni

Is there anything else I could provide to help find a solution?

----------

## Elleni

Could that be a limitation on the host side in the end? I have no clue what to do here to find out more  :Crying or Very sad: 

----------

## Hu

Linus has a zero-regressions policy.  That does not prevent regressions from happening, but it does generally require a very compelling reason to allow a known regression to remain.  If your old kernel worked, and you did not misconfigure the new kernel, then the new kernel should work.  If it does not, that sounds like a possible regression.  We would need to rule out configuration error before declaring it to be a regression.  Can you find the specific patch that breaks it?  A bisection is likely to be required.  Unfortunately, the videos are unusable for me, so I cannot help with your actual problem.

----------

## Elleni

Hello Hu, and thank you for your reply. I am willing to provide anything needed to find this error. What I can confirm is that it only happens with kernels >=4.15.0: I have just booted gentoo-sources-4.14.20, which I had just compiled with the same procedure I always use. Once misconfiguration is ruled out, I am willing to redo the video in better quality if necessary, or provide whatever other information might help to find out what's going on. My procedure is:

- emerge gentoo-sources-version

- menuconfig, where I enable a random kernel option and undo it immediately afterwards, to make sure that the config file is saved when menuconfig exits

- make -j9

- make install 

- genkernel --luks initramfs

- grub-mkconfig -o /boot/grub/grub.cfg

With 4.14.20 this boots successfully. I tried the same procedure with the new 4.15.4 and it does not boot, so I made new videos, which can be seen here:

https://cloud.tsarouchas.ch/index.php/s/J9Yfe6kG7LtdnXw

Thanks in advance for any help  :Very Happy: 

----------

## bunder

is it possible to set your hypervisor to not close the window when the system panics?  all i see is normal boot then the window closing.

or better yet, could we get a screenshot of the panic?
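One way to get more text on screen before the window closes is to make the kernel print as early and as verbosely as it can. The fragment below is a sketch under assumptions: it supposes that GRUB reads /etc/default/grub, that the line is placed after any existing GRUB_CMDLINE_LINUX assignment (so the LUKS-related parameters are kept), and that the kernel was built with early printk support. `earlyprintk=vga` and `ignore_loglevel` are documented kernel parameters; whether they reveal anything on this particular hypervisor is not known.

```shell
# Fragment for /etc/default/grub (assumed location of the GRUB2 defaults).
# It appends to whatever GRUB_CMDLINE_LINUX already contains instead of
# replacing it, so existing boot parameters stay in place.
#   earlyprintk=vga   print kernel messages to the VGA console very early
#   ignore_loglevel   show all kernel messages regardless of log level
GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX} earlyprintk=vga ignore_loglevel"
```

After changing the file, grub.cfg has to be regenerated with grub-mkconfig -o /boot/grub/grub.cfg, as in the build procedure described earlier in the thread. Removing a "quiet" parameter, if one is set, serves the same goal.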

----------

## Elleni

I have asked my hoster's support for this and will see what they say.

----------

## Elleni

The hoster's support technician was willing to take a screenshot, but tells me that he also only sees the booting kernel for a short moment before his hypervisor software reports the virtual machine as down  :Sad: 

Could this be a problem with the Parallels hypervisor? I asked them to confirm that they are able to boot 4.15 kernels and am waiting for an answer.   :Shocked: 

Could it be worth a try to boot a kernel from sources other than gentoo-sources?

Last edited by Elleni on Wed Feb 21, 2018 12:00 am; edited 1 time in total

----------

## Elleni

Out of curiosity I tried booting vanilla-sources-4.15.4 and git-sources-4.16_rc2, with the same result, so I am really out of ideas. One thing that came to mind: I had earlier tried to update the intel-microcode on this virtual server. I followed the wiki variant "New method without initram-fs/disk", but verifying with dmesg | grep microcode showed nothing, so I thought that maybe this is not meant to work on a virtual server anyway. When compiling git-sources I removed the corresponding kernel options:

```
Processor type and features  --->

    <*> CPU microcode loading support

    [*]   Intel microcode loading support

Device Drivers  --->

  Generic Driver Options  --->

    [*]   Include in-kernel firmware blobs in kernel binary

    (intel-ucode/06-3c-03) External firmware blobs to build into the kernel binary

    (/lib/firmware) Firmware blobs root directory
```

But that did not do the trick, so I am really out of ideas and am afraid that I have to stick with the 4.14 kernel series. Again, could this possibly be a problem with the Parallels hypervisor?

What else can I try? genkernel all? I have never used that, though.

[Moderator edit: added [code] tags to preserve output layout. -Hu]

----------

## Hu

If it is a kernel regression, and it is present in v4.15 and in v4.16-rc2, then it is likely that no one has reported it to the correct maintainer.  Please try a bisection to identify the specific commit which breaks v4.15 for you.
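For readers unfamiliar with the term: a bisection is a binary search over the commits between a known-good and a known-bad version. `git bisect` does the bookkeeping (`git bisect start`, `git bisect good <rev>`, `git bisect bad <rev>`), but each step still means building and booting the kernel it checks out. The sketch below only illustrates the search logic in plain shell; the position numbers and the `boots_ok` function are hypothetical stand-ins, not real kernel commits or a real boot test.

```shell
# first_bad GOOD BAD TEST_CMD
# GOOD and BAD are integer positions in a linear history (GOOD < BAD), where
# GOOD is known to boot and BAD is known to fail. TEST_CMD is called with one
# position and must return 0 if that revision boots.
first_bad() {
    good=$1
    bad=$2
    test_cmd=$3
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if "$test_cmd" "$mid"; then
            good=$mid    # midpoint boots: the breakage came later
        else
            bad=$mid     # midpoint fails: the breakage is here or earlier
        fi
    done
    echo "$bad"
}

# Hypothetical stand-in for "build, install, reboot, check if it comes up".
# Here every position from 1234 on is pretended to be broken.
boots_ok() { [ "$1" -lt 1234 ]; }

first_bad 0 16000 boots_ok    # prints 1234
```

Because the range is halved on every step, about 14 test boots are enough for 16000 positions, which is why bisection stays practical even across a full kernel release.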

----------

## Elleni

I don't know what a bisection is, and thus am not able to do one.

----------

## Elleni

I will ask my hoster for a test VM to do a new minimal Gentoo setup, and I will also ask them to set up a Fedora 27 or openSUSE Tumbleweed VM to see whether their 4.15.x kernel is able to boot on their Parallels hypervisor and, if so, to provide me with their kernel config. The hoster's support has already told me that they are planning to upgrade their hypervisor within the next 2 weeks and asked for patience. Maybe this will enable my kernel to boot. 

Is there a way to set up a 4.15 kernel with the same settings as the 4.14.x one that boots successfully, and deactivate any kernel options that are new in 4.15 and did not exist in the older kernel? That way I could prove that one of the new kernel options is causing this problem.

I will also ask them if they also are able to provide a vm on kvm/qemu instead of parallels. 

If nothing helps, I will probably migrate to another provider who has vms on qemu/kvm hypervisor.
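On the question of newly added options: the kernel build system can list them itself with `make listnewconfig`, run in the new tree after copying the old .config there. As a rough illustration of the same idea, the sketch below compares two config files directly. The function names `config_symbols` and `new_symbols` are hypothetical helpers, and the approach only looks at symbol names, not at changed defaults or dependencies.

```shell
# config_symbols FILE: print every CONFIG_ symbol mentioned in FILE,
# whether it is set ("CONFIG_X=y") or explicitly unset ("# CONFIG_X is not set").
config_symbols() {
    sed -n -e 's/^\(CONFIG_[A-Za-z0-9_]*\)=.*/\1/p' \
           -e 's/^# \(CONFIG_[A-Za-z0-9_]*\) is not set$/\1/p' "$1" | sort -u
}

# new_symbols OLD NEW: print symbols that appear in NEW but are unknown to OLD.
new_symbols() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    config_symbols "$1" > "$old_list"
    config_symbols "$2" > "$new_list"
    # comm -13 keeps only the lines that are unique to the second file
    comm -13 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}

# Example call; the paths are illustrative guesses:
# new_symbols /usr/src/linux-4.14.20-gentoo/.config /usr/src/linux-4.15.4-gentoo/.config
```

The resulting list is a starting point for switching off new options one at a time in menuconfig, as far as their dependencies allow.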

----------

## gengreen

 *NeddySeagoon wrote:*   

> Elleni,
> 
> There was a problem with the gentoo hardened profile and the 4.14 kernel that prevented the kernel booting on Intel CPUs.
> 
> That's why the 4.14 stable gentoo kernel was masked while it was investigated.
> ...

 

Hello,

I'm not familiar with all kinds of VPS, but don't they have a bootloader?

Using an encrypted partition on a dedicated server (and on any personal computer, I would say) is a must-have; it can prevent a single-user boot...

----------

## Elleni

Just for information: my provider informed me that kernels 4.15 and newer are not supported, so apparently it is not a problem of kernel compilation but of the host system, which does not support it. Thank you to everyone who helped with this issue.

----------

## rubeelam

You are not alone; this problem is very common with such servers. I generally do not advise anyone to use these. Nevertheless, the choice of hosting must be approached wisely and with proper training and knowledge. When I came into this business, I did not know anything about VPS storage or how it all works. Only over the years did I realize how to choose the right hosting and how not to fall for a bad one. The most important thing is the experience that you gain over time.

----------

