# Newly built kernels are unbootable, provoke power cycle

## escozzia

I'm running a Linux kernel, version 4.1.12, that I built back in November 2015, and I'm trying to update to 4.1.15-r1.

I emerged the package, emerged @module-rebuild, used make oldconfig and then make && make modules_install, copied over the kernel into my /boot and reran the grub configuration maker, all as per usual. Incidentally, the .config for the 4.1.15 kernel and the /proc/config for my running 4.1.12 kernel are identical (except for the header, obviously).

However, I can't boot into the new kernel at all. As soon as I select it in grub, the machine resets (hardware down and hardware up, as though the power had been cycled). No panic seems to happen, no messages are visible on the screen (beyond what grub itself echoes), and nothing is written to /var/log/kern.log; it is as though the failed boot never happened. Old kernels, be it 4.1.12 or previously built kernels I have lying around, all work. Booting into Windows works too.

I tried booting into the new kernel via kexec to see if it was a grub problem, but as soon as I do kexec -e the exact same reset happens. Again, old kernels work just fine with kexec, and again the last thing written to the kern.log are the shutdown messages from the working 4.1.12 kernel.

I've tried with linux 4.4.6 to see if it was something funny with 4.1.15, but the same problem persists. This leads me to suspect that it's something in how the kernel is being built, but I can't figure out what. cat /proc/version for the 4.1.12 kernel shows:

```
Linux version 4.1.12-gentoo (root@Silence) (gcc version 4.9.3 (Gentoo 4.9.3 p1.2, pie-0.6.3) ) #1 SMP Tue Nov 3 18:44:09 EST 2015
```

While the gcc version shows:

```
gcc (Gentoo 4.9.3 p1.5, pie-0.6.4) 4.9.3
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```

So they look to be by and large the same versions of gcc, so that's probably not the problem.

I'm running the commercial ati video drivers, but I usually don't have trouble with those beyond gui applications crashing and what have you. I kinda doubt they're the cause here, just because it seems as though a video driver issue would at least cause the kernel to panic.

I'm wondering if someone around here can lend a hand, cause I'm really at my wits' end here.

Thanks!

----------

## Syl20

 *escozzia wrote:*   

> I emerged the package, emerged @module-rebuild, used make oldconfig and then make && make modules_install, copied over the kernel into my /boot and reran the grub configuration maker, all as per usual.

 

I don't know if that will solve your problem, but you should emerge @module-rebuild after installing your newly built kernel.

If not, could you pastebin your .config and give us more information about your hardware?

----------

## escozzia

 *Syl20 wrote:*   

> 
> 
> I don't know if that will solve your problem, but you should emerge @module-rebuild after installing your newly built kernel.

 

I did not know that, but unfortunately doing that didn't solve the problem.

 *Syl20 wrote:*   

> 
> 
> If not, could you pastebin your .config and give us more information about your hardware?

 

Sure, the .config is up here.

My hardware is pretty prosaic: I'm running a four-core i5-3570K on an Asus P8 Z77-V LX mobo, the graphics card is an AMD Radeon HD 7850, and I'm using 8GB of RAM.

Here's what lspci has to say:

```
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.4 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 5 (rev c4)
00:1c.5 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Z77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn PRO [Radeon HD 7850 / R7 265 / R9 270 1024SP]
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
04:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 03)
```

And here's what lscpu says:

```
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Model name:            Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz
Stepping:              9
CPU MHz:               3400.000
CPU max MHz:           3401.0000
CPU min MHz:           1600.0000
BogoMIPS:              6820.38
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K
NUMA node0 CPU(s):     0-3
```

----------

## Maitreya

 *Quote:*   

> 
> 
> used make oldconfig
> 
> 

 

Although it is a very nice helper, it is far from perfect.

Consider using menuconfig to check if everything you need is still there and stuff you don't need is gone.

----------

## krinn

Can you post the output of emerge --info?

----------

## NeddySeagoon

escozzia,

I suspect that you have made the kernel for the wrong CPU.

The kernel does not always detect this.  What happens then is that you get an illegal instruction exception before the exception handler is set up, so the system resets.

Your CPU is 

```
Model name:            Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz
```

Your 

```
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.1.15-gentoo-r1 Kernel Configuration
```

.config looks OK in the CPU department.

What about your 4.4.6 .config?

----------

## escozzia

 *krinn wrote:*   

> 
> 
> Can you post the output of emerge --info?
> 
> 

 

Sure thing

 *NeddySeagoon wrote:*   

> escozzia,
> 
> I suspect that you have made the kernel for the wrong CPU.
> 
> The kernel does not always detect this.  What happens then is that you get an illegal instruction exception before the exception handler is set up, so the system resets.
> ...

 

Hmm, that makes sense, but the kernel that I have working is 4.1.12, with both 4.1.15 and 4.4.6 giving me trouble.

The particularly weird thing, which also gets to Maitreya's point about make oldconfig is that the working 4.1.12 and the non working 4.1.15 are basically identical:

```
Silence src # diff linux-4.1.15-gentoo-r1/.config linux-4.1.12-gentoo/.config
3c3
< # Linux/x86 4.1.15-gentoo-r1 Kernel Configuration
---
> # Linux/x86 4.1.12-gentoo Kernel Configuration
Silence src # 
```

(Of course, there's lots of differences between the working 4.1.12 and the non working 4.4.6, but from what I can tell most of those look like normal kernel upgrade stuff.)
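
A comparison like this can also be run with the version header ignored, so that an empty result really means the two configurations match. A minimal sketch, assuming GNU diff (for `-I`); `same_config` is a hypothetical helper name, not an existing tool:

```shell
# Compare two kernel .config files while ignoring the generated header line
# ("# Linux/x86 <version> Kernel Configuration"), which always differs
# between kernel versions. Assumes GNU diff, which provides -I.
same_config() {
    diff -I '^# Linux/.* Kernel Configuration$' "$1" "$2" >/dev/null
}

# Example (paths taken from the thread):
# same_config linux-4.1.15-gentoo-r1/.config linux-4.1.12-gentoo/.config \
#     && echo "configs match apart from the header"
```

To compare against the running kernel instead, the output of `zcat /proc/config.gz` can be piped in, with `-` given as one of the two file arguments.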

Edit:

the 4.4.6 also looks okay for my CPU:

```
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
```

----------

## NeddySeagoon

escozzia,

Those aren't the CPU kernel settings I had in mind.

It's

```
  │ │        Processor family (Opteron/Athlon64/Hammer/K8)  --->                  │ │  

  │ │    [*] Supported processor vendors  --->
```

These menu items.

Use wgetpaste to put your entire .config on a pastebin.
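
The two menu entries above correspond to symbols in .config, so they can also be checked without opening menuconfig. A rough sketch, assuming an x86_64 4.x kernel tree where the processor family choices are `CONFIG_MK8`, `CONFIG_MPSC`, `CONFIG_MCORE2`, `CONFIG_MATOM` and `CONFIG_GENERIC_CPU`, and the vendor options are the `CONFIG_CPU_SUP_*` symbols; the helper name `cpu_config_lines` is made up for illustration:

```shell
# Print the enabled processor-family and vendor-support options of a .config.
# The symbol list is an assumption based on the x86_64 Kconfig of 4.x kernels.
cpu_config_lines() {
    grep -E '^CONFIG_(MK8|MPSC|MCORE2|MATOM|GENERIC_CPU|PROCESSOR_SELECT|CPU_SUP_[A-Z0-9_]+)=y' "$1"
}

# Example:
# cpu_config_lines /usr/src/linux/.config
```

For an Intel Core CPU such as the i5-3570K, one would expect to see `CONFIG_MCORE2=y` or `CONFIG_GENERIC_CPU=y`, together with `CONFIG_CPU_SUP_INTEL=y`.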

----------

## escozzia

 *NeddySeagoon wrote:*   

> escozzia,
> 
> That's not the CPU kernel settings I had in mind.
> 
> It's
> ...

 

Sure, here it is: http://bpaste.net/show/57ecfdf0fa4d

----------

## NeddySeagoon

escozzia,

That looks mostly harmless.  

Please explain how you configured your 4.4.6 kernel.

----------

## escozzia

 *NeddySeagoon wrote:*   

> escozzia,
> 
> That looks mostly harmless.  
> 
> Please explain how you configured your 4.4.6 kernel.

 

I copied /proc/config.gz (this one: http://bpaste.net/show/86df5ef2838c) into .config, then ran make oldconfig (saying N to most of the new stuff as prompted), then make && make modules_install. Lastly, I copied the kernel into /boot/ and ran grub2-mkconfig -o /boot/grub/grub.cfg.
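
The sequence described here, together with Syl20's earlier advice to run @module-rebuild after installing the kernel, could be sketched as the dry-run script below. The image name under /boot and the helper names `run` and `upgrade_kernel` are assumptions for illustration; nothing is executed unless `DRY_RUN` is changed to 0.

```shell
#!/bin/sh
# Dry-run sketch of the upgrade steps from this thread.
# DRY_RUN=1 only prints each command; set it to 0 (as root) to execute them.
DRY_RUN=1

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

upgrade_kernel() {
    src=$1     # kernel source tree, e.g. /usr/src/linux-4.4.6-gentoo
    image=$2   # hypothetical destination name for the image under /boot

    # Start from the running kernel's configuration.
    run sh -c "zcat /proc/config.gz > '$src/.config'" || return 1
    run make -C "$src" oldconfig || return 1
    run make -C "$src" || return 1
    run make -C "$src" modules_install || return 1
    run cp "$src/arch/x86/boot/bzImage" "$image" || return 1
    run grub2-mkconfig -o /boot/grub/grub.cfg || return 1
    # Rebuild out-of-tree modules after the new kernel is installed.
    run emerge @module-rebuild || return 1
}

upgrade_kernel /usr/src/linux-4.4.6-gentoo /boot/kernel-4.4.6-test
```

Printing the commands first makes it easy to compare the steps against what was actually typed before running anything as root.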

----------

## krinn

How can you use a kernel without FHANDLE and have a working system? Don't you use (e)udev?

However, that should not disturb the kernel's early boot; your symptoms seem more like the CPU is refusing or cannot understand your kernel.

What does a simple "file /boot/kernel_4.4.6.binary" output?

----------

## NeddySeagoon

escozzia,

That's the right answer, so I'm beginning to suspect it's not the .config.

Can you rebuild your 4.1.12 and see if it still works?

If you edit the Makefile, at the top it will have something like

```
VERSION = 4
PATCHLEVEL = 6
SUBLEVEL = 0
EXTRAVERSION = -gentoo
```

Change the EXTRAVERSION so that everything is kept separate.

Take care that EXTRAVERSION does not end with whitespace.
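
A trailing space or tab after EXTRAVERSION is invisible in most editors, so it can help to check for it mechanically. A small sketch; `extraversion_has_trailing_space` is a hypothetical helper, not part of the kernel build system:

```shell
# Succeeds (exit status 0) if the EXTRAVERSION line of the given Makefile
# ends in a space or tab.
extraversion_has_trailing_space() {
    grep -Eq '^EXTRAVERSION[[:space:]]*=.*[[:space:]]$' "$1"
}

# Example:
# extraversion_has_trailing_space /usr/src/linux/Makefile \
#     && echo "EXTRAVERSION ends with whitespace, fix it before building"
```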

Start the build with 

```
make clean
```

to force everything to be rebuilt.

Did you ever edit .config with $EDITOR?

----------

## escozzia

 *krinn wrote:*   

> How can you use a kernel without FHANDLE and have a working system? Don't you use (e)udev?

 

I do use udev, and while it does complain about the missing FHANDLE when I emerge it, it's never actually given me trouble for some reason. Luck of the ignorant I suppose.

 *krinn wrote:*   

> However, that should not disturb the kernel's early boot; your symptoms seem more like the CPU is refusing or cannot understand your kernel.
> 
> What does a simple "file /boot/kernel_4.4.6.binary" output?

 

```
arch/x86/boot/bzImage: Linux kernel x86 boot executable bzImage, version 4.4.6-gentoo (root@Silence) #4 SMP Thu Aug 4 23:14:18 EDT 2016, RO-rootFS, swap_dev 0x4, Normal VGA
```

 *NeddySeagoon wrote:*   

> 
> 
> escozzia, 
> 
> That's the right answer, so I'm beginning to suspect it's not the .config.
> ...

 

Unfortunately I cannot - I've since (stupidly) depcleaned gentoo-sources-4.1.12 away, and the ebuild is gone:

```
Silence linux-4.1.12-gentoo # ls /usr/portage/sys-kernel/gentoo-sources/gentoo-sources-4.1.*
/usr/portage/sys-kernel/gentoo-sources/gentoo-sources-4.1.15-r1.ebuild  /usr/portage/sys-kernel/gentoo-sources/gentoo-sources-4.1.29.ebuild
/usr/portage/sys-kernel/gentoo-sources/gentoo-sources-4.1.27.ebuild
```

Do you know if there's somewhere I can get an archival gentoo-sources-4.1.12 ebuild?

 *NeddySeagoon wrote:*   

> 
> 
> Did you ever edit .config with $EDITOR?
> 
> 

 

Nope, always via the make *config commands

----------

## NeddySeagoon

escozzia,

The ebuild will be in git but I don't know how to dig it out.

There are a few other threads on the forums on getting a single file out of git.

The sources will still be available.  You may even have them unless you have cleaned your distfiles.
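
One way to dig a removed file out of git, assuming a local clone of the Gentoo ebuild repository with full history. The clone path in the example is hypothetical and `recover_deleted_file` is just an illustrative name:

```shell
# Print the last committed contents of a file that was later deleted.
# $1 = path to the git clone, $2 = path of the file inside the repository.
recover_deleted_file() {
    repo=$1
    path=$2
    # Most recent commit that touched the path; for a removed file this is
    # the commit that deleted it.
    commit=$(git -C "$repo" rev-list -n 1 HEAD -- "$path") || return 1
    [ -n "$commit" ] || return 1
    # The file still exists in the parent of that commit.
    git -C "$repo" show "${commit}^:${path}"
}

# Example (hypothetical clone location):
# recover_deleted_file "$HOME/gentoo" \
#     sys-kernel/gentoo-sources/gentoo-sources-4.1.12.ebuild \
#     > gentoo-sources-4.1.12.ebuild
```

Note that this only makes sense for a file that has actually been removed; for a file that still exists it would print the version before its latest change.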

----------

## krinn

I still have gentoo-sources-4.1.12.ebuild:

```
# Copyright 1999-2015 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Id$

EAPI="5"
ETYPE="sources"
K_WANT_GENPATCHES="base extras experimental"
K_GENPATCHES_VER="16"
K_DEBLOB_AVAILABLE="0"
K_KDBUS_AVAILABLE="1"
inherit kernel-2
detect_version
detect_arch

KEYWORDS="alpha amd64 ~arm ~arm64 -hppa ia64 ~mips ppc ppc64 ~s390 ~sh sparc x86"
HOMEPAGE="https://dev.gentoo.org/~mpagano/genpatches"
IUSE="experimental"

DESCRIPTION="Full sources including the Gentoo patchset for the ${KV_MAJOR}.${KV_MINOR} kernel tree"
SRC_URI="${KERNEL_URI} ${GENPATCHES_URI} ${ARCH_URI}"

pkg_postinst() {
        kernel-2_pkg_postinst
        einfo "For more info on this patchset, and how to report problems, see:"
        einfo "${HOMEPAGE}"
}

pkg_postrm() {
        kernel-2_pkg_postrm
}
```

You might also have a copy in /var/db/pkg/sys-kernel/gentoo-sources-4.1.12

----------

## escozzia

Ok, I was able to emerge gentoo-sources-4.1.12 and build a new 4.1.12 kernel, with the exact same config as my working kernel:

```
Silence linux-4.1.12-gentoo # !diff
diff config .config
3c3
< # Linux/x86 4.1.12-gentoo Kernel Configuration
---
> # Linux/x86 4.1.12-test-kernel Kernel Configuration
```

But I can't boot into the newly built 4.1.12 either, same reset problem.

----------

## Tony0945

What does your grub kernel configuration file look like?

----------

## escozzia

 *Tony0945 wrote:*   

> What does your grub kernel configuration file look like?

 

http://bpaste.net/show/7350d1717f41

----------

## NeddySeagoon

escozzia,

That's progress. It confirms that it's not the kernel .config.

What does 

```
gcc-config -l
```

 show?

I am beginning to suspect either hardware or your toolchain.

----------

## escozzia

 *NeddySeagoon wrote:*   

> escozzia,
> 
> That's progress. It confirms that it's not the kernel .config.
> 
> What does 
> ...

 

```
Silence ~ # gcc-config -l
 [1] x86_64-pc-linux-gnu-4.9.3 *
```

Which I think looks like the same gcc that built the working kernel:

```
Silence ~ # cat /proc/version 
Linux version 4.1.12-gentoo (root@Silence) (gcc version 4.9.3 (Gentoo 4.9.3 p1.2, pie-0.6.3) ) #1 SMP Tue Nov 3 18:44:09 EST 2015
```

----------

## NeddySeagoon

escozzia,

Humour me a little ...

Here's what I've done:

```
emerge =gentoo-sources-4.4.6 -1
cd /usr/src/linux-4.4.6-gentoo/
wget https://bpaste.net/raw/57ecfdf0fa4d
cp 57ecfdf0fa4d .config
make oldconfig
make -j10
scp arch/x86/boot/bzImage neddyseagoon@dev.gentoo.org:/home/neddyseagoon/public_hlml
```

Then I logged in, sorted out the mess and renamed the kernel to escozzia_bzImage

Oh, make oldconfig did nothing, which was expected.

That's gentoo-sources-4.4.6 built on my system with gcc-5.4.0. It's only the bzImage file to put into /boot.

You can have the modules if you want but they are not needed for this test.

Put that file into /boot. Tell grub about it, then try to boot it.  What happens?

----------

## krinn

 *escozzia wrote:*   

>  *NeddySeagoon wrote:*   
> 
> I am beginning to suspect either hardware or your toolchain. 
> 
> Which I think looks like the same gcc that built the working kernel:
> ...

 

That's not the same gcc, that's just the same version.

From your post #1, your current gcc is gcc (Gentoo 4.9.3 p1.5, pie-0.6.4) 4.9.3, while the working kernel was built with Gentoo 4.9.3 p1.2, pie-0.6.3.
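
The difference is easy to miss because only the patch level inside the parentheses changes. A small sketch that pulls that build string out of both outputs for comparison; the `Gentoo ...` pattern is specific to Gentoo-built compilers, and `gcc_build_id` is a hypothetical helper:

```shell
# Read text on stdin and print the Gentoo build string of the compiler,
# e.g. "Gentoo 4.9.3 p1.2, pie-0.6.3".
gcc_build_id() {
    sed -n 's/.*(\(Gentoo [^)]*\)).*/\1/p' | head -n 1
}

# Strings copied from this thread; on a live system they would come from
# /proc/version and `gcc --version`.
kernel_built_with=$(printf '%s\n' 'Linux version 4.1.12-gentoo (root@Silence) (gcc version 4.9.3 (Gentoo 4.9.3 p1.2, pie-0.6.3) ) #1 SMP Tue Nov 3 18:44:09 EST 2015' | gcc_build_id)
installed_now=$(printf '%s\n' 'gcc (Gentoo 4.9.3 p1.5, pie-0.6.4) 4.9.3' | gcc_build_id)

if [ "$kernel_built_with" = "$installed_now" ]; then
    echo "same gcc build"
else
    echo "different gcc builds: '$kernel_built_with' vs '$installed_now'"
fi
```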

----------

## escozzia

 *krinn wrote:*   

> 
> 
> That's not the same gcc, that's just the same version. 
> 
> From your post #1, your current gcc is gcc (Gentoo 4.9.3 p1.5, pie-0.6.4) 4.9.3
> ...

 

Whoop, you're right, my bad - so there's definitely been a gcc change since.

 *NeddySeagoon wrote:*   

> escozzia,
> 
> Humour me a little ...
> 
> Here's what I've done
> ...

 

Ah, yours booted perfectly!

I think that points to a toolchain issue.

----------

## NeddySeagoon

escozzia,

Looks like it.  You cannot rely on your toolchain to (re)build itself correctly as it seems to be broken.

It's worth booting into memtest to see what that says about your hardware.

----------

## escozzia

 *NeddySeagoon wrote:*   

> escozzia,
> 
> Looks like it.  You cannot rely on your toolchain to (re)build itself correctly as it seems to be broken.
> 
> It's worth booting into memtest to see what that says about your hardware.

 

Hmm I just did a memtest pass without any errors, so the hardware is probably okay.

I guess that means it's a toolchain rebuild for me. Do you have any pointers on the best way to do that? Obviously just re-emerging everything won't work, but it seems as though just downloading binaries and replacing the installed ones would be a bit crazy.

----------

## NeddySeagoon

escozzia,

You need a setup as described in fix my Gentoo

Instead of using emerge to generate the required binaries, there is a script.

When you get to the part that reads

```
More Packages

emerge whatever you need. You don't need to do it all in one go. /home/rescue will stay around until you delete it and its only a chroot away. When you come back, don't forget the /usr/portage bind mount. 
```

Do 

```
/usr/portage/scripts/bootstrap.sh
```

That's actually stage1 of a stage1 install. It builds your toolchain. This script does not support resume.

You can probably install all the binaries you get, into your main system, with emerge -K too.

----------

## escozzia

 *NeddySeagoon wrote:*   

> escozzia,
> 
> You need a setup as described in fix my Gentoo
> 
> Instead of using emerge to generate the required binaries, there is a script.
> ...

 

Hmm, I followed that guide, and then did

```
emerge '=sys-kernel/gentoo-sources-4.1.15-r1'
```

inside of the chrooted system, followed by

```
emerge -K sys-apps/baselayout sys-apps/texinfo sys-devel/binutils sys-devel/gcc sys-devel/gettext sys-libs/glibc sys-libs/zlib virtual/libc sys-devel/bc
source /etc/profile
env-update
```

inside the parent system, then rebuilt 4.1.12. However, it still doesn't boot.

I did try building the emerged 4.1.15 from inside of the chrooted system, and that one does boot, so it's looking likely that I just didn't emerge the package that's at fault.

I'll just try to

```
emerge -e world
```

from the child and then install every single binary package into the parent, which might take a while.

----------

## NeddySeagoon

escozzia,

Your parent world file will be polluted with @system packages.

That 

```
emerge -K sys-apps/baselayout sys-apps/texinfo sys-devel/binutils sys-devel/gcc sys-devel/gettext sys-libs/glibc sys-libs/zlib virtual/libc sys-devel/bc
```

should have been -K1 or -K --oneshot.

There is an emerge option to remove entries from the world file without uninstalling the package.
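
The option in question is `emerge --deselect`, which drops atoms from the world file while leaving the packages installed. A sketch for finding which of the atoms from the command above ended up in the world file; it assumes the default location /var/lib/portage/world, and `polluting_atoms` is a hypothetical helper:

```shell
# Print each given atom that appears as a whole line in the world file.
# $1 = world file, remaining arguments = category/package atoms to look for.
polluting_atoms() {
    world=$1
    shift
    for atom in "$@"; do
        if grep -Fxq -- "$atom" "$world"; then
            printf '%s\n' "$atom"
        fi
    done
    return 0
}

# Example: list the atoms, then remove them from world by hand:
# polluting_atoms /var/lib/portage/world sys-devel/gcc sys-libs/glibc sys-libs/zlib
# emerge --deselect sys-devel/gcc sys-libs/glibc sys-libs/zlib
```

Entries recorded with a slot (for example `sys-devel/gcc:4.9`) would not be caught by this whole-line match, so the world file is still worth reading by eye.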

----------

## escozzia

Hurmm, still no good. I did the

```
emerge -e world
```

in the child, followed by a

```
quickpkg '*/*'
```

just to make sure I got everything, then went back to the parent and did

```
emerge -K1 $(find /usr/portage/packages/ -type f -name '*.tbz2' | egrep -v 'udev|sandbox' | sed -e 's/.*packages\/\(.*\)-[0-9]\+.*.tbz2/\1/g' | xargs)
```

I rebooted, rebuilt the kernel and still no good. The child kernel still works okay though. I'm stumped - that command reinstalled just about everything I can imagine could be the source (except for udev and sandbox - could it be those? udev was bizarrely giving me a conflict, and reinstalling sandbox caused every build to fail)
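
For reference, the sed expression in the command above reduces each binary package path to a category/package atom without the version, which leaves the choice of binary version to emerge. A small demonstration on made-up paths; it writes `[0-9][0-9]*` in place of the GNU-only `[0-9]\+`, and `tbz2_to_atom` is a hypothetical name:

```shell
# Turn binary package paths read from stdin into category/package atoms.
tbz2_to_atom() {
    sed -e 's/.*packages\/\(.*\)-[0-9][0-9]*.*\.tbz2/\1/'
}

# Illustrative paths only; the versions are not taken from the thread.
printf '%s\n' \
    /usr/portage/packages/sys-devel/gcc-4.9.3.tbz2 \
    /usr/portage/packages/sys-libs/glibc-2.22-r4.tbz2 \
    | tbz2_to_atom
```

The greedy `\(.*\)` stops at the last hyphen that is followed by a digit, which is why a revision suffix such as `-r4` is stripped together with the version.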

----------

