# kernel stuck in boot process when NR_CPUS is increased to 64

## oblivion_vr

I have a machine with 2 CPU sockets, each with 12 cores. With hyperthreading enabled, I expect 48 logical CPUs to be up and running.

I built the kernel with the default configuration. After boot, I see in the dmesg log that CPUs 32-47 were ignored because the maximum CPU limit of 32 was reached. I checked NR_CPUS in the kernel config and it was set to 32, which explains why CPUs 32 to 47 were left out.

I increased NR_CPUS to 64 in the kernel config and rebuilt the kernel. The new kernel image does not finish booting; it hangs partway through the boot process.

Has anyone faced a similar issue? Does the CONFIG_NR_CPUS parameter depend on other options, such as power management settings?
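To compare what the kernel considers possible against what is actually online, the CPU lists under /sys/devices/system/cpu can be counted. The following is a minimal sketch; the function name `count_cpu_list` is hypothetical, and it assumes the usual sysfs list format such as `0-31` or `0-3,8-11`.

```shell
# Count the CPUs named in a kernel CPU list such as "0-31" or "0-3,8-11".
# count_cpu_list is a hypothetical helper name, not an existing tool.
count_cpu_list() {
    local list="$1" total=0 part lo hi
    local IFS=,
    for part in $list; do
        lo=${part%-*}   # text before the dash (or the whole item if no dash)
        hi=${part#*-}   # text after the dash (or the whole item if no dash)
        total=$(( total + hi - lo + 1 ))
    done
    echo "$total"
}

# Usage on a running system:
#   count_cpu_list "$(cat /sys/devices/system/cpu/possible)"
#   count_cpu_list "$(cat /sys/devices/system/cpu/online)"
```

On the machine described here, a kernel that brings up every hardware thread would be expected to report 2 x 12 x 2 = 48.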

Finer details:

Gauss ~ # grep NR_CPUS /usr/src/linux/.config

CONFIG_NR_CPUS=64

dmesg log when booting the kernel built with CONFIG_NR_CPUS=32:

Gauss ~ # dmesg | grep cpu

[    0.000000] Initializing cgroup subsys cpuset

[    0.000000] Initializing cgroup subsys cpu

[    0.000000] Initializing cgroup subsys cpuacct

[    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 32/0x9 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 33/0x29 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 34/0xb ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 35/0x2b ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 36/0x11 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 37/0x31 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 38/0x13 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 39/0x33 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 40/0x15 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 41/0x35 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 42/0x17 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 43/0x37 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 44/0x19 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 45/0x39 ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 46/0x1b ignored.

[    0.000000] ACPI: NR_CPUS/possible_cpus limit of 32 reached.  Processor 47/0x3b ignored.

[    0.000000] setup_percpu: NR_CPUS:32 nr_cpumask_bits:32 nr_cpu_ids:32 nr_node_ids:2

[    0.000000] PERCPU: Embedded 25 pages/cpu @ffff880627c00000 s73344 r8192 d20864 u131072

[    0.000000] pcpu-alloc: s73344 r8192 d20864 u131072 alloc=1*2097152

[    0.000000] pcpu-alloc: [0] 00 02 04 06 08 10 12 14 16 18 20 22 24 26 28 30 

[    0.000000] pcpu-alloc: [1] 01 03 05 07 09 11 13 15 17 19 21 23 25 27 29 31 

[    0.925915] cpuidle: using governor ladder

[    0.926447] cpuidle: using governor menu
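The log above can be summarised mechanically. This is a rough sketch with hypothetical function names; it assumes the message wording shown in this log, which may differ between kernel versions.

```shell
# Both helpers read kernel log text on stdin. The names are hypothetical,
# and the patterns assume the message wording shown in this thread.

# Print how many processors were ignored because of the NR_CPUS limit.
count_ignored_cpus() {
    grep -c 'NR_CPUS/possible_cpus limit of [0-9]* reached'
}

# Print the NR_CPUS value from the kernel's setup_percpu line.
reported_nr_cpus() {
    sed -n 's/.*setup_percpu: NR_CPUS:\([0-9]*\) .*/\1/p'
}

# Usage on a running system:
#   dmesg | count_ignored_cpus
#   dmesg | reported_nr_cpus
```

For the log above, the first count is 16 (processors 32 to 47) and the reported value is 32. If a rebuilt kernel still reports NR_CPUS:32, the image that booted is probably not the one that was just built.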

----------

## Hu

Assuming you changed it through the menu system and allowed the kernel to do a clean build, as opposed to manually editing the .config, this should work fine.  As far as I know, Linux has been good for up to 1024 CPUs for quite a while.  In recent versions, this limit is even higher.  It appears to be 8192 for v3.14 on x86_64.

Please post the last 20 lines or so of output before the hang.

----------

## NeddySeagoon

Hu,

You mean starting with make clean?

----------

## Hu

I was mainly thinking of ensuring that the OP was not mixing modules from one build with core kernel from another, but yes, a full clean never hurts when a user reports something weird.

----------

## oblivion_vr

 *Hu wrote:*   

> I was mainly thinking of ensuring that the OP was not mixing modules from one build with core kernel from another, but yes, a full clean never hurts when a user reports something weird.

 

I tried make clean and then make -j && make -j modulesinstall. It did not work. 

Below are the lines that I see on the boot screen. All the steps pass [OK]. The boot process gets stuck after Autoloaded 0 Modules.

* Mounting /proc

* Mounting /run ..

* /run/openrc: creating directory

* /run/lock: creating directory

* /run/lock: correcting owner 

[    3.894802] usb 1-1.6.1: new high speed USB device number 4 using ehci-pci

* Using /dev mounted from kernel ..

* Mounting /dev/mqueue

* Mounting /dev/pts

* Mounting /dev/shm 

* Creating list of required static device nodes for the current kernel ...

* Mounting /sys

* Mounting debug filesystem 

* Mounting cgroup filesystem 

* setting up tmpfiles.d entries for /dev

* Starting udev

* Generating a rule to create a /dev/root symlink 

* Populating /dev with existing devices through uevents 

* Waiting for uevents to be processed

* Device initiated service 

* Setting system clock using the hardware clock 

* Autoloaded 0 Modules

----------

## NeddySeagoon

oblivion_vr,

make clean and then make -j && make -j modulesinstall does not install your new kernel but it does install the new modules.

The old kernel and new modules will not work together.
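One way to spot such a mismatch is to check whether a modules directory exists for the kernel release in question. A minimal sketch follows; `has_modules_for` is a hypothetical helper, and /lib/modules is assumed as the modules root.

```shell
# Succeed if a modules directory exists for the given kernel release.
# has_modules_for is a hypothetical helper name.
# $1: kernel release string, e.g. the output of `uname -r`
# $2: modules root (assumed default: /lib/modules)
has_modules_for() {
    local release="$1"
    local root="${2:-/lib/modules}"
    [ -d "$root/$release" ]
}

# Usage on the running system:
#   has_modules_for "$(uname -r)" || echo "no modules for $(uname -r)"
# From the source tree, `make -s kernelrelease` prints the release string
# that the freshly built image will report.
```

A missing directory only shows that the modules were not installed for that release; a present one does not prove that the image in /boot came from the same build.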

----------

## oblivion_vr

 *NeddySeagoon wrote:*   

> oblivion_vr,
> 
> make clean and then make -j && make -j modulesinstall does not install your new kernel but it does install the new modules.
> 
> The old kernel and new modules will not work together.

 

What I meant is that I followed these steps:

make clean

make -j && make -j modulesinstall

grub2-install /dev/sda

grub2-mkconfig -o /boot/grub/grub.cfg

I am still facing the issue: the kernel hangs during the boot process.

----------

## lagalopex

Isn't it "make modules_install"?

And you still did not install the kernel itself...

----------

## oblivion_vr

 *lagalopex wrote:*   

> Isn't it "make modules_install"?
> 
> And you still did not install the kernel itself...

 

That was a typo. I did execute make -j modules_install.

I forgot to mention that after installing the modules, I copied the kernel image from /usr/src/linux/arch/x86/boot/bzImage to /boot, following the usual versioned naming convention. The grub2 commands then detect the kernel images and update the grub config accordingly. Isn't this enough to install the kernel?

Could you list the steps for installing the kernel after compiling it?
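For reference, a typical manual sequence can be sketched as below. This is an outline under stated assumptions, not a definitive procedure: the source directory /usr/src/linux and the grub config path are the ones used in this thread, the function name and the default job count of 4 are hypothetical, and the function only prints the commands so they can be reviewed before being run as root.

```shell
# Print (do not run) a typical manual install sequence for a built kernel.
# kernel_install_plan is a hypothetical helper name.
# $1: kernel source directory (assumed default: /usr/src/linux)
# $2: number of parallel build jobs (hypothetical default: 4)
kernel_install_plan() {
    local src="${1:-/usr/src/linux}"
    local jobs="${2:-4}"
    cat <<EOF
cd $src
make -j$jobs
make modules_install
make install
grub2-mkconfig -o /boot/grub/grub.cfg
EOF
}

# Usage: review the output first, then run the commands as root.
#   kernel_install_plan /usr/src/linux 8
# If /boot is a separate partition, it must be mounted before
# `make install`, otherwise the image lands on the root filesystem
# where the bootloader will not find it.
```

A few notes on the choices: `make -j` without a number puts no limit on the number of parallel jobs, so an explicit count is safer. `make install` normally copies the image and System.map into /boot, with the exact file names depending on the installkernel script present on the system. Copying bzImage by hand is equivalent as long as it ends up on the partition the bootloader actually reads. `grub2-install` is only needed when installing the bootloader itself, not after every kernel build.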

----------

