# Only one core using kernel 4.9

## donmartio

I recently upgraded to kernel 4.9 and noticed a huge performance hit.

During an emerge job I ran htop and saw that only one CPU was busy, at 100%.

Since I have an AMD FX(tm)-4100 Quad-Core Processor, this was pretty weird (and annoying, so to speak).

After switching back to 4.8.10 I have my 4 cores back.

Over the last few years I was pretty lazy and just copied the .config from kernel to kernel (the first one was built from pappy's kernel seed).

This worked most of the time. Sometimes I had to change a few things, but only because some software needed it.

And now it seems I missed something.

Skimming what I found on the net didn't give me a clue what it is.

Does anyone use 4.9 on an AMD system? Or could someone point me in the right direction to look for this problem?

----------

## Keruskerfuerst

Relevant part of kernel config? 

SMP support.

----------

## Roman_Gruber

You should start adapting that kernel config.

It takes half an hour, but then it's fixed.

Just read and set Y / N / M ...

Everything has a description. When things are unclear, N most likely fits. Also check with a web browser for additional hints.

When you want performance, make your own kernel. Set the experimental flag, set the processor architecture ... that took a few minutes off compiling LibreOffice here on every run.

----------

## donmartio

Thanks for the reply.

SMP is set since it works with the sources of 4.8.10.

I'll try different kernel config stuff.

----------

## wrc1944

Under Processor type and features you might try setting: 

CONFIG_NR_CPUS=8

IIRC, I read somewhere it works in conjunction with SMP

Also, check the MAKEOPTS="-jN" number in /etc/make.conf.

For my amd 8320  8 core cpu, mine is:  MAKEOPTS="-j9" 

If none of that works, maybe try the  "threads" global USE flag in make.conf

----------

## donmartio

Hi, thanks for the reply.

I have CONFIG_NR_CPUS=64 in my kernel config and -j5 in my make.conf.

As far as I understand it, lowering NR_CPUS just reduces the size of the kernel.

The point is... everything works perfectly with kernel 4.8.10 and below.

I tried various new things with no success.

dmesg | grep -i SMP

gives me different output for the two kernels:

Output with kernel 4.9.0

[    0.000000] Linux version 4.9.0-gentoo (root@rabatz) (gcc version 4.9.3 (Gentoo 4.9.3 p1.5, pie-0.6.4) ) #12 SMP PREEMPT Mon Dec 26 21:45:23 CET 2016

[    0.000000] Using ACPI (MADT) for SMP configuration information

[    0.000000] smpboot: Allowing 4 CPUs, 0 hotplug CPUs

[    0.002340] Freeing SMP alternatives memory: 28K (ffffffff81b43000 - ffffffff81b4a000)

[    0.005658] smpboot: APIC(10) Converting physical 1 to logical package 0

[    0.005658] smpboot: Max logical packages: 4

[    0.117517] smpboot: CPU0: AMD FX(tm)-4100 Quad-Core Processor (family: 0x15, model: 0x1, stepping: 0x2)

[    0.127546] smpboot: Total of 1 processors activated (7185.02 BogoMIPS)

Output with kernel 4.8.10

[    0.000000] Linux version 4.8.10-gentoo (root@rabatz) (gcc version 4.9.3 (Gentoo 4.9.3 p1.5, pie-0.6.4) ) #4 SMP PREEMPT Sun Dec 25 17:08:46 CET 2016

[    0.000000] Using ACPI (MADT) for SMP configuration information

[    0.000000] smpboot: Allowing 4 CPUs, 0 hotplug CPUs

[    0.002494] Freeing SMP alternatives memory: 28K (ffffffff81b4a000 - ffffffff81b51000)

[    0.005827] smpboot: APIC(10) Converting physical 1 to logical package 0

[    0.005828] smpboot: Max logical packages: 4

[    0.117529] smpboot: CPU0: AMD FX(tm)-4100 Quad-Core Processor (family: 0x15, model: 0x1, stepping: 0x2)

[    0.128562] x86: Booting SMP configuration:

[    0.322556] smpboot: Total of 4 processors activated (28734.44 BogoMIPS)

The main difference is this:

[    0.128562] x86: Booting SMP configuration:

----------

## Hu

I think the following line is a bit more important:

```
[ 0.322556] smpboot: Total of 4 processors activated (28734.44 BogoMIPS)
```

```
[ 0.127546] smpboot: Total of 1 processors activated (7185.02 BogoMIPS)
```

That says num_online_cpus() is returning 1 instead of 4.  Some of the other messages suggest that you probably have SMP=y in both kernels, which is good.  Please pastebin the full dmesg of both kernels and the full configuration of both kernels.
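
To compare the two boots quickly, the count can be pulled straight out of the log. A minimal sketch, assuming the smpboot message keeps the wording quoted above; `activated_cpus` is a made-up helper name, not an existing tool:

```shell
# Hypothetical helper: print the number of CPUs the kernel reports as
# activated. Reads dmesg-style text on stdin and assumes the
# "smpboot: Total of N processors activated" wording quoted in this thread.
activated_cpus() {
    sed -n 's/.*smpboot: Total of \([0-9][0-9]*\) processors activated.*/\1/p'
}

# Demo on the two lines quoted above.
# On a live system one would run: dmesg | activated_cpus
printf '%s\n' '[    0.322556] smpboot: Total of 4 processors activated (28734.44 BogoMIPS)' | activated_cpus
printf '%s\n' '[    0.127546] smpboot: Total of 1 processors activated (7185.02 BogoMIPS)' | activated_cpus
```

If the helper prints nothing at all, the message was not found, which would itself be worth noting in a report.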

----------

## Keruskerfuerst

And this difference:

kernel 4.8.10

[ 0.128562] x86: Booting SMP configuration: 

kernel 4.9 does not contain this line

----------

## donmartio

Hi,

thanks for the reply.

config-kernel-4.8.10

http://pastebin.com/3XAnHqRc

dmesg-kernel-4.8.10

http://pastebin.com/9uEJmKDg

config-kernel-4.9

http://pastebin.com/V4DGTpeV

dmesg-kernel-4.9

http://pastebin.com/jwmps3pk

This is a lot of stuff. 

Thanks for helping me in this matter.

DonMartio

----------

## Keruskerfuerst

To switch over from kernel 4.8.10 to 4.9.0 you can use the old config, but after that you must use make oldconfig.

----------

## jburns

In your dmesg-kernel-4.9 you have 

```
[    0.002091] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: 10 CPUID: 0

[    0.002095] [Firmware Bug]: CPU0: Using firmware package id 1 instead of 0
```

 which is not in your dmesg-kernel-4.8.10.  This could be causing the kernel to ignore the CPU information.  This is a bug in the kernel.  See https://m.reddit.com/r/linuxquestions/comments/5j1min/cryptic_error_message_on_boot_with_otherwise/ and https://patchwork.kernel.org/patch/9470341/

----------

## donmartio

@Keruskerfuerst : Thanks for the hint (never used this before). Tried that but with no success.

@jburns: Well, this sounds reasonable, thanks. I think I'll stick with 4.8.10 and wait for an update.

----------

## Tony0945

On my Kaveri 4.9.0 I get

```
[    0.000000] Linux version 4.9.0-gentoo (root@gentoo) (gcc version 4.9.4 (Gentoo 4.9.4 p1.0, pie-0.6.4) ) #1 SMP Tue Dec 13 17:35:00 CST 2016

[    0.000000] found SMP MP-table at [mem 0x000fd720-0x000fd72f] mapped at [ffff8800000fd720]

[    0.000000] Using ACPI (MADT) for SMP configuration information

[    0.000000] smpboot: Allowing 4 CPUs, 0 hotplug CPUs

[    0.003507] Freeing SMP alternatives memory: 32K (ffffffff81b58000 - ffffffff81b60000)

[    0.007744] smpboot: APIC(10) Converting physical 1 to logical package 0

[    0.007749] smpboot: Max logical packages: 2

[    0.223301] smpboot: CPU0: AMD A8-7600 Radeon R7, 10 Compute Cores 4C+6G (family: 0x15, model: 0x30, stepping: 0x1)

[    0.223556] x86: Booting SMP configuration:

[    0.675435] smpboot: Total of 4 processors activated (24754.85 BogoMIPS)

```

My config is here: https://paste.pound-python.org/show/B0hkf6313viXdyhBQP99/

You might want to compare it to yours. I don't think anything is wrong with 4.9. I'm using it on the Kaveri and an old Athlon II. No such problems.

----------

## donmartio

Hi Tony0945,

thanks, but there are a lot of differences between our configurations.

The SMP and APIC stuff is the same so this won't help.
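
When two configs differ in many places, one way to cut through the noise is to compare only the options of interest. A rough sketch: `config_diff` is a hypothetical helper, and the file names in the usage note are placeholders for wherever the two configs are saved.

```shell
# Hypothetical helper: list options matching a pattern that differ between
# two kernel configs. Lines found only in the first file are printed flush
# left; lines found only in the second file are indented by a tab
# (this is the layout produced by comm -3).
config_diff() {
    pattern=$1
    a=$(mktemp) && b=$(mktemp) || return 1
    # Keep "CONFIG_FOO=..." and "# CONFIG_FOO is not set" lines alike.
    { grep -E "^(# )?CONFIG_[A-Za-z0-9_]*($pattern)" "$2" || true; } | LC_ALL=C sort > "$a"
    { grep -E "^(# )?CONFIG_[A-Za-z0-9_]*($pattern)" "$3" || true; } | LC_ALL=C sort > "$b"
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}
```

Usage would look like `config_diff 'SMP|APIC|NR_CPUS' config-kernel-4.8.10 config-kernel-4.9`. Empty output means the selected options are identical in both files.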

----------

## donmartio

Just out of curiosity I made a copy of my kernel sources,

cp -aR linux-4.9.0-gentoo linux-4.9.0-gentoo-r1

applied the patch jburns mentioned,

patch -p1 < apic.patch

and rebuilt the kernel

make -j5 && make modules_install; cp arch/x86_64/boot/bzImage /boot/kernel-menuconfig-4.9.0-gentoo-r1

but that had no effect either.

----------

## Hu

Your problem is beyond what I can handle.  After rereading your dmesg from both kernels and examining the corresponding kernel source, I believe my earlier analysis about num_online_cpus() being low on the bad kernel is correct, but I lack the background to determine why this happens for you.  I think you need help from someone comfortable with developing this part of the kernel.

----------

## donmartio

Thanks for the investigation. Tried kernel 4.9.1 today and there is the same problem.

Really strange.

----------

## Ant P.

Reading this, it seems like a bug in 4.9 that affects AMD CPUs: https://patchwork.kernel.org/patch/9470341/

I get the same on a Phenom II CPU, but it doesn't affect which cores work or not there:

```
[  +0.032006] x86: Booting SMP configuration:

[  +0.000002] .... node  #0, CPUs:      #1 #2

[  +0.082679] [Firmware Bug]: CPU2: APIC id mismatch. Firmware: 2 CPUID: 3

[  +0.079436]  #3

[  +0.000584] [Firmware Bug]: CPU3: APIC id mismatch. Firmware: 3 CPUID: 2

[  +0.079307] x86: Booted up 1 node, 4 CPUs
```

At a guess, you could try booting 4.9 with "noapic" but I have no idea if it's the source of the problem.

----------

## theotherjoe

Ant, thanks for posting the link to the patch.

I got the same firmware bug messages on an FX CPU, also without serious effects.

Actually, I expected a fix in the latest 4.9.1 release, but no such thing.

The patch applied to the 4.9.1 tree without problems, and so far everything is well with the new kernel.

So, thanks again for the pointer.

----------

## saboya

I had a bad experience with kernel 4.9 as well, but it was a kernel panic when trying to use wifi.

I'll stick to 4.8 for now.

----------

## Jaglover

FWIW, there is a report in another forum of an SMP kernel 4.8.16 running only one core on an AMD E-450.

----------

## theotherjoe

The release source of kernel-4.9.1 shows no problem on an AMD E-450, not even the firmware bug messages at the smpboot stage.

```
Linux localhost 4.9.1-kms #1 SMP PREEMPT Mon Jan 9 08:39:29 CET 2017 x86_64 AMD E-450 APU with Radeon(tm) HD Graphics AuthenticAMD GNU/Linux

```

```
 ....

[    0.027076] CPU: Physical Processor ID: 0

[    0.027253] CPU: Processor Core ID: 0

[    0.027348] mce: CPU supports 6 MCE banks

[    0.027466] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 4

[    0.027609] Last level dTLB entries: 4KB 512, 2MB 8, 4MB 4, 1GB 0

[    0.028745] Freeing SMP alternatives memory: 24K (ffffffff81f4a000 - ffffffff81f50000)

[    0.033604] smpboot: APIC(0) Converting physical 0 to logical package 0

[    0.033803] smpboot: Max logical packages: 1

[    0.034507] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1

[    0.469204] smpboot: CPU0: AMD E-450 APU with Radeon(tm) HD Graphics (family: 0x14, model: 0x2, stepping: 0x0)

[    0.469521] Performance Events: AMD PMU driver.

[    0.469711] ... version:                0

[    0.469854] ... bit width:              48

[    0.469992] ... generic registers:      4

[    0.470133] ... value mask:             0000ffffffffffff

[    0.470259] ... max period:             00007fffffffffff

[    0.470355] ... fixed-purpose events:   0

[    0.470495] ... event mask:             000000000000000f

[    0.487423] x86: Booting SMP configuration:

[    0.487625] .... node  #0, CPUs:      #1

[    0.490258] x86: Booted up 1 node, 2 CPUs

[    0.490450] smpboot: Total of 2 processors activated (6586.51 BogoMIPS)

 ....

```

```

~ # cat /proc/cpuinfo 

processor       : 0

vendor_id       : AuthenticAMD

cpu family      : 20

model           : 2

model name      : AMD E-450 APU with Radeon(tm) HD Graphics

stepping        : 0

microcode       : 0x5000101

cpu MHz         : 1320.000

cache size      : 512 KB

physical id     : 0

siblings        : 2

core id         : 0

cpu cores       : 2

apicid          : 0

initial apicid  : 0

fpu             : yes

fpu_exception   : yes

cpuid level     : 6

wp              : yes

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt hw_pstate vmmcall arat npt lbrv svm_lock nrip_save pausefilter

bugs            : fxsave_leak sysret_ss_attrs null_seg

bogomips        : 3293.25

TLB size        : 1024 4K pages

clflush size    : 64

cache_alignment : 64

address sizes   : 36 bits physical, 48 bits virtual

power management: ts ttp tm stc 100mhzsteps hwpstate

processor       : 1

vendor_id       : AuthenticAMD

cpu family      : 20

model           : 2

model name      : AMD E-450 APU with Radeon(tm) HD Graphics

stepping        : 0

microcode       : 0x5000101

cpu MHz         : 1320.000

cache size      : 512 KB

physical id     : 0

siblings        : 2

core id         : 1

cpu cores       : 2

apicid          : 1

initial apicid  : 1

fpu             : yes

fpu_exception   : yes

cpuid level     : 6

wp              : yes

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt hw_pstate vmmcall arat npt lbrv svm_lock nrip_save pausefilter

bugs            : fxsave_leak sysret_ss_attrs null_seg

bogomips        : 3293.25

TLB size        : 1024 4K pages

clflush size    : 64

cache_alignment : 64

address sizes   : 36 bits physical, 48 bits virtual

power management: ts ttp tm stc 100mhzsteps hwpstate

```

There are other issues with the ACPI table handling showing firmware bug messages.

But the kernel image itself seems to behave well from the short test I ran.

----------

## donmartio

Tried 4.9.2 and the firmware error message is gone, but the problem remains.

Just one core is recognized.

I think I'm missing something here, but I have gone through the kernel config three times now and can't imagine what it is.

----------

## Keruskerfuerst

Have you changed the kernel config?

----------

## saboya

My issue with wifi seems solved with 4.9.2 (ath9k), but now I have a bluetooth issue (ath3k). I don't like this kernel =P

----------

## donmartio

I copied the .config from my running 4.8.10 kernel and then ran make oldconfig.

I mostly used the default answers, except for some Amazon stuff I don't need.

I've pasted the dmesg-kernel-4.9.2.txt

http://pastebin.com/9x0nf1CQ

and the config-kernel-4.9.2.txt

http://pastebin.com/PEhPM0hQ

The error is gone, as you can see.

----------

## donmartio

The interesting part is now this:

4.9.2

[    0.117526] smpboot: CPU0: AMD FX(tm)-4100 Quad-Core Processor (family: 0x15, model: 0x1, stepping: 0x2)

[    0.117529] Performance Events: Fam15h core perfctr, AMD PMU driver.

[    0.117533] ... version:                0

[    0.117534] ... bit width:              48

[    0.117534] ... generic registers:      6

[    0.117535] ... value mask:             0000ffffffffffff

[    0.117536] ... max period:             00007fffffffffff

[    0.117536] ... fixed-purpose events:   0

[    0.117537] ... event mask:             000000000000003f

[    0.125577] MCE: In-kernel MCE decoding enabled.

[    0.127538] x86: Booted up 1 node, 1 CPUs

[    0.127540] smpboot: Total of 1 processors activated (7184.53 BogoMIPS)

4.8.10

[    0.117520] smpboot: CPU0: AMD FX(tm)-4100 Quad-Core Processor (family: 0x15, model: 0x1, stepping: 0x2)

[    0.117523] Performance Events: Fam15h core perfctr, AMD PMU driver.

[    0.117527] ... version:                0

[    0.117527] ... bit width:              48

[    0.117528] ... generic registers:      6

[    0.117529] ... value mask:             0000ffffffffffff

[    0.117529] ... max period:             00007fffffffffff

[    0.117530] ... fixed-purpose events:   0

[    0.117530] ... event mask:             000000000000003f

[    0.123570] MCE: In-kernel MCE decoding enabled.

[    0.128552] x86: Booting SMP configuration:

[    0.128553] .... node  #0, CPUs:      #1 #2 #3

[    0.322534] x86: Booted up 1 node, 4 CPUs

[    0.322537] smpboot: Total of 4 processors activated (28734.89 BogoMIPS)

----------

## wrc1944

Something appears wrong in your 4.9.2 kernel config file (4.9.2 is missing "x86: Booting SMP configuration:").  Try this:

1.  Remove the  /usr/src/linux-4.9.2-gentoo directory, and then get a fresh new source directory: 

```
 emerge =sys-kernel/gentoo-sources-4.9.2 
```

Remove (if any) .config file in the new source directory.  NOT the .cocciconfig file

2.  Then copy the working kernel .config file, the one that does enable all 4 cores, from your 4.8.10 directory into your new linux-4.9.2-gentoo directory. Make sure it's named just .config (nothing else). Forget menuconfig, or oldconfig, etc.

3. Then cd into the new 4.9.2 and run: 

```
 make bzImage && make modules && make modules_install 
```

4. Then

```
cp arch/x86/boot/bzImage /boot/kernel-4.9.2-gentoo
```

5. Update grub

```
 grub-mkconfig -o /boot/grub/grub.cfg 
```

Reboot into the new 4.9.2, and all four cpu cores should be recognized, and active. If it works, you can basically use it (the .config file) on this hardware for future kernels.

You should also weed out your .config for your hardware, only enabling what you actually need. Then you can just use a nice GUI like make gconfig or make xconfig and, if need be, easily check for any new features/drivers.  The kernel config GUIs really lay out everything, so it's quick and easy to load and save your custom file into the new kernel, and to deeply explore/edit all the config options if you wish.

----------

## Tony0945

 *Quote:*   

> Then copy the working kernel .config file from your 4.8.10 directory into the your new linux-4.9.2-gentoo. Make sure it's just named .config (nothing else) Forget menuconfig, or oldconfig, etc.
> 
> 3. Then cd into the new 4.9.2 and run:
> 
> Code:	
> ...

 NO!  He MUST run make oldconfig first or new flags for 4.9 will not be set!

He should run make oldconfig first taking all the default choices.
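
Taking every default without being prompted is what the kernel's `olddefconfig` make target does, so the two steps can be sketched as below. `carry_config` and the `CARRY_CONFIG_RUN` switch are hypothetical names made up for this illustration, and the paths are just the ones mentioned in this thread; by default the helper only prints the commands so they can be reviewed first.

```shell
# Hypothetical helper: carry an old .config into a new kernel source tree
# and let the build system fill in defaults for options that are new.
# It only prints the commands unless CARRY_CONFIG_RUN=1 is set.
carry_config() {
    old_config=$1
    new_tree=$2
    for cmd in \
        "cp $old_config $new_tree/.config" \
        "make -C $new_tree olddefconfig"
    do
        if [ "${CARRY_CONFIG_RUN:-0}" = 1 ]; then
            $cmd || return 1   # plain word splitting: paths must not contain spaces
        else
            printf '%s\n' "$cmd"
        fi
    done
}

# Example paths from this thread; adjust them to the actual source trees.
carry_config /usr/src/linux-4.8.10-gentoo/.config /usr/src/linux-4.9.2-gentoo
```

Running plain `make oldconfig` instead asks about each new option interactively, which is useful when the new options should be reviewed one by one.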

----------

## wrc1944

Tony0945,

I've done this many hundreds of times over the last 15-20 years, and it never fails. And yes, the options selected in any .config file you place in a fresh source directory, after which you manually run 

```
 make bzImage && make modules && make modules_install 
```

 are recognized and take effect on the new kernel. 

I assume by "new flags for 4.9 will not be set! " you mean the kernel options, correct?

Of course I don't usually do kernels this way. This was just to be sure donmartio quickly got a kernel recognizing all 4 cores, because he seems to be having problems with running make oldconfig/menuconfig. We know his 4.8.10 config file is getting all cores recognized, and since he's compiling the 4.9.2 kernel on the same hardware, that 4.8.10 config file will work fine. I think that while going through the oldconfig options he probably left out a needed option.  If he uses that 4.8.10 .config, I'll be very surprised if all 4 cores AREN'T then being utilized.

I always use the GUI make xconfig or make gconfig on desktop systems. Compared to their ease for loading custom .config files, and complete control and immediate search/help access for any actual new features, menuconfig or oldconfig (while eventually accomplishing the same thing) are to me awkward and more prone to mistakes when selecting options, not to mention far more time consuming.  Unless I need to add new hardware, change file systems, or enable new support or modules, I can just use the same .config for many, many subsequent kernel updates. No big deal.

All such kernel options like drivers, kernel updates, etc. are automatically picked up from the new versions in the new kernel sources, even if you are still using your custom .config file.

Or, donmartio could try building a full 4.9.2 with "genkernel all" and worry about weeding out all unnecessary hardware, driver, etc. options in the .config file later.

Of course If you think anything I've mentioned is bad practice, I'm all ears, and welcome any feedback.    :Wink: 

----------

## donmartio

Thanks for the replies.

I tried what you suggested with kernel 4.9.4, with no success.

Maybe I'll try 'genkernel all' just to verify whether that gets the kernel to recognize my cores.

----------

## Tony0945

Donmartio, what is the output of "lscpu"?

----------

## donmartio

Hi Tony0945,

with kernel 4.8.10:

```
Architecture:          x86_64

CPU op-mode(s):        32-bit, 64-bit

Byte Order:            Little Endian

CPU(s):                4

On-line CPU(s) list:   0-3

Thread(s) per core:    2

Core(s) per socket:    2

Socket(s):             1

Vendor ID:             AuthenticAMD

CPU family:            21

Model:                 1

Model name:            AMD FX(tm)-4100 Quad-Core Processor

Stepping:              2

CPU MHz:               3600.000

CPU max MHz:           3600,0000

CPU min MHz:           1400,0000

BogoMIPS:              7184.37

Virtualization:        AMD-V

L1d cache:             16K

L1i cache:             64K

L2 cache:              2048K

L3 cache:              8192K

Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core perfctr_nb cpb hw_pstate vmmcall arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

```

----------

## donmartio

With kernel-4.9.4

```
Architecture:          x86_64

CPU op-mode(s):        32-bit, 64-bit

Byte Order:            Little Endian

CPU(s):                1

On-line CPU(s) list:   0

Thread(s) per core:    1

Core(s) per socket:    1

Socket(s):             1

Vendor ID:             AuthenticAMD

CPU family:            21

Model:                 1

Model name:            AMD FX(tm)-4100 Quad-Core Processor

Stepping:              2

CPU MHz:               3600.000

CPU max MHz:           3600,0000

CPU min MHz:           1400,0000

BogoMIPS:              7184.73

Virtualization:        AMD-V

L1d cache:             16K

L1i cache:             64K

L2 cache:              2048K

L3 cache:              8192K

Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core perfctr_nb cpb hw_pstate vmmcall arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

```
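
Besides lscpu, the running kernel exposes its own view of the CPUs in sysfs, which helps tell "never detected" apart from "detected but not brought online". A small sketch: `count_cpu_list` is a hypothetical helper, and the loop simply skips any file that is not readable on the system at hand.

```shell
# Hypothetical helper: count the CPUs in a kernel CPU list such as "0-3"
# or "0,2-3" (the format used by the sysfs files below and by lscpu's
# "On-line CPU(s) list" line).
count_cpu_list() {
    total=0
    old_ifs=$IFS
    IFS=,
    for part in $1; do
        case $part in
            *-*) total=$((total + ${part#*-} - ${part%-*} + 1)) ;;  # range a-b
            ?*)  total=$((total + 1)) ;;                            # single id
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

# Print what the running kernel considers possible, present and online.
for mask in possible present online; do
    file=/sys/devices/system/cpu/$mask
    if [ -r "$file" ]; then
        list=$(cat "$file")
        echo "$mask: $list ($(count_cpu_list "$list") CPUs)"
    fi
done
```

Comparing these three lines between a good and a bad kernel shows at which stage the extra cores go missing.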

----------

## Tony0945

with 4.9.3 on one machine

```
Architecture:          x86_64

CPU op-mode(s):        32-bit, 64-bit

Byte Order:            Little Endian

CPU(s):                3

On-line CPU(s) list:   0-2

Thread(s) per core:    1

Core(s) per socket:    3

Socket(s):             1

Vendor ID:             AuthenticAMD

CPU family:            16

Model:                 5

Model name:            AMD Athlon(tm) II X3 440 Processor

Stepping:              2

CPU MHz:               800.000

CPU max MHz:           3000.0000

CPU min MHz:           800.0000

BogoMIPS:              6000.14

Virtualization:        AMD-V

L1d cache:             64K

L1i cache:             64K

L2 cache:              512K

```

 and on another

```
Architecture:          x86_64

CPU op-mode(s):        32-bit, 64-bit

Byte Order:            Little Endian

CPU(s):                4

On-line CPU(s) list:   0-3

Thread(s) per core:    2

Core(s) per socket:    2

Socket(s):             1

Vendor ID:             AuthenticAMD

CPU family:            21

Model:                 48

Model name:            AMD A8-7600 Radeon R7, 10 Compute Cores 4C+6G

Stepping:              1

CPU MHz:               1400.000

CPU max MHz:           3100.0000

CPU min MHz:           1400.0000

BogoMIPS:              6188.74

Virtualization:        AMD-V

L1d cache:             16K

L1i cache:             96K

L2 cache:              2048K

```

It seems like there is a bug involving only the FX series. I would file a kernel bug.

----------

## Tony0945

OK, I've taken your pastebin 4.9.2 config and only appended local version "donmartio" so that it does not overlay my kernel. I built it without "make oldconfig"

I used " make -j4 && make -j4 modules_install && make -j4 install" from my script on /usr/src/linux pointed to 4.9.3  Tomorrow I will go to the local machine and boot that kernel, since I probably can't ssh in due to hardware differences. I'll boot non-X and see what lscpu says. If it shows all three CPU's then something is definitely different for FX-4100 than for Athlon II X3.

Watch this space!

UPDATE1:  Build failed with "*** No rule to make target '/lib/firmware/rtl_nic//rtl8168e-3.fw', needed by 'firmware/rtl8168e-3.fw.gen.o"

I had to remove CONFIG_EXTRA_FIRMWARE && CONFIG_EXTRA_FIRMWARE_DIR to progress. I don't have this hardware anyway.

That was the only problem building. Added the lines

```
title=test

root (hd0,0)

kernel /boot/vmlinuz-4.9.3donmartio-gentoo  root=/dev/sda3 vga=0x375
```

 to /boot/grub/grub.conf and will test it tomorrow.

----------

## wrc1944

donmartio,

Hmmm... one other idea:   Have you tried booting the Gentoo live cd (or any other live cd) and  seeing what 

```
cat /proc/cpuinfo
```

reports? 

A live cd would supposedly detect and properly configure its kernel for the AMD FX-4100 Quad-Core Processor . You could then look at the .config file it produced, and use it, or edit yours with what you found in the live cd kernel.

Then again, Tony0945 might very well be right about a weird undiscovered FX series kernel bug, somehow only affecting the FX-4100 Quad on certain motherboards, or who knows?

My AMD FX-8320 Eight-Core Processor is also family 21, and never had this problem over countless kernel versions, on different Gentoo installations and several other distros, all on my multi-boot testing machine.  That fact still makes me think we're missing something in the .config file.    :Confused: 

----------

## wrc1944

donmartio,

I noticed your .config file has CONFIG_MK8=y set, which is enabling only the old AMD 64 3000+ flags (Venice core type cpus, circa 2006).

The amd FX 4100 quad is the vishera core (i.e. piledriver, as my FX 8320 is). EDIT:  Actually, the FX-4100 Quad is zambezi/bulldozer, NOT piledriver- Sorry.   :Embarassed: 

My thinking is that while mk8=y seems to still work with kernel 4.8.10, it might not be adequate for the 4.9 series?

You will only get the expanded list of possible gcc CPU optimizations by using the "experimental" USE flag for gentoo-sources, enabled by placing  *Quote:*   

> sys-kernel/gentoo-sources experimental

  in /etc/portage/package.use. It also adds a few more patches like BFQ, but they aren't enabled by default, so no worries there. 

Or, you could get the latest gcc opts patch and apply it yourself from https://github.com/graysky2/kernel_gcc_patch

FWIW, here's my relevant .config file section. When I first got this cpu, I set piledriver, but then went with CONFIG_MNATIVE=y which lets gcc itself detect and set a few more flags the specific cpu is designed for than setting any of the actual core names. I confirmed this when I made the change. Notice I also set CONFIG_NR_CPUS=8, the exact number of my cpu cores, while you have CONFIG_NR_CPUS=64. Not sure, but maybe in kernel 4.9.x setting way more than the actual cores might cause it to default to one core because it fails to find the number of cores the  .config  setting allows?  Who knows? 

Have you ever run across this? 

```
#error "CONFIG_NR_CPUS is too large, please lower it."
```

 Trying to figure this out, I ran across it several times. I would think CONFIG_NR_CPUS=64 would only apply for server cpus, which the amd FX4100 is not. At this point, I'm running out of ideas.   :Rolling Eyes: 

 *Quote:*   

> # Processor type and features
> 
> #
> 
> CONFIG_ZONE_DMA=y
> ...

 

----------

## Tony0945

I booted your kernel. First I had a kernel panic because your kernel renamed /dev/sr0 to /dev/sda and /dev/sda to /dev/sdb. After fixing the kernel parameter "root=/dev/sdb3", it booted.

Of course, because of the hardware differences many things failed like net.eth0 and X, but I was able to run lscpu

```
Architecture:          x86_64

CPU op-mode(s):        32-bit, 64-bit

Byte Order:            Little Endian

CPU(s):                3

On-line CPU(s) list:   0-2

Thread(s) per core:    1

Core(s) per socket:    3

Socket(s):             1

Vendor ID:             AuthenticAMD

CPU family:            16

Model:                 5

Model name:            AMD Athlon(tm) II X3 440 Processor

Stepping:              2

CPU MHz:               3000.000

CPU max MHz:           3000.0000

CPU min MHz:           800.0000

BogoMIPS:              6000.22

Virtualization:        AMD-V

L1d cache:             64K

L1i cache:             64K

L2 cache:              512K

```

Along with wrc1944, I had noticed that you had selected the processor type as basic K8. It's a shot in the dark, but try selecting BULLDOZER or NATIVE.

I always build NATIVE.

----------

## donmartio

Wow, thanks for your suggestions.

I'll try that immediately.

----------

## donmartio

No success...  :Sad: 

I tried 

USE="experimental" emerge =sys-kernel/gentoo-sources-4.9.4

and selected NATIVE.

I'll try Bulldozer now and re-emerge the kernel sources to compile the kernel from 'clean' sources.

----------

## Jaglover

If everything fails I'd bisect and file a bug. Although it is possible you got "lucky" and your CPU has some rare hardware quirk.

----------

## donmartio

Well, no... no success...

Since the kernel config runs on your machine, Tony0945, it would be strange if it caused the problem.

It seems I'm running out of options now. 

How and where do I file a kernel bug?

----------

## Jaglover

You should bisect first, this gives your bug filing more weight. 

How to file a Linux kernel bug.

----------

## donmartio

Hi Jaglover,

thanks... I'll bisect. Never heard of that before. Sounds reasonable.

----------

## donmartio

Thanks again for this tip. I would have done this earlier if I had known about this cool git feature.

I started git bisect with 4.8.17 and 4.9.0.

It took quite a while, but here is the result:

```
8f54969dc8d6704632b42cbb5e47730cd75cc713 is the first bad commit
commit 8f54969dc8d6704632b42cbb5e47730cd75cc713
Author: Gu Zheng <guz.fnst@cn.fujitsu.com>
Date:   Thu Aug 25 16:35:16 2016 +0800

    x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping

    The whole patch-set aims at making cpuid <-> nodeid mapping persistent. So that,
    when node online/offline happens, cache based on cpuid <-> nodeid mapping such as
    wq_numa_possible_cpumask will not cause any problem.
    It contains 4 steps:
    1. Enable apic registeration flow to handle both enabled and disabled cpus.
    2. Introduce a new array storing all possible cpuid <-> apicid mapping.
    3. Enable _MAT and MADT relative apis to return non-present or disabled cpus' apicid.
    4. Establish all possible cpuid <-> nodeid mapping.

    This patch finishes step 2.

    In this patch, we introduce a new static array named cpuid_to_apicid[],
    which is large enough to store info for all possible cpus.

    And then, we modify the cpuid calculation. In generic_processor_info(),
    it simply finds the next unused cpuid. And it is also why the cpuid <-> nodeid
    mapping changes with node hotplug.

    After this patch, we find the next unused cpuid, map it to an apicid,
    and store the mapping in cpuid_to_apicid[], so that cpuid <-> apicid
    mapping will be persistent.

    And finally we will use this array to make cpuid <-> nodeid persistent.

    cpuid <-> apicid mapping is established at local apic registeration time.
    But non-present or disabled cpus are ignored.

    In this patch, we establish all possible cpuid <-> apicid mapping when
    registering local apic.

    Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
    Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
    Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
    Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
    Acked-by: Ingo Molnar <mingo@kernel.org>
    Cc: mika.j.penttila@gmail.com
    Cc: len.brown@intel.com
    Cc: rafael@kernel.org
    Cc: rjw@rjwysocki.net
    Cc: yasu.isimatu@gmail.com
    Cc: linux-mm@kvack.org
    Cc: linux-acpi@vger.kernel.org
    Cc: isimatu.yasuaki@jp.fujitsu.com
    Cc: gongzhaogang@inspur.com
    Cc: tj@kernel.org
    Cc: izumi.taku@jp.fujitsu.com
    Cc: cl@linux.com
    Cc: chen.tang@easystack.cn
    Cc: akpm@linux-foundation.org
    Cc: kamezawa.hiroyu@jp.fujitsu.com
    Cc: lenb@kernel.org
    Link: http://lkml.kernel.org/r/1472114120-3281-4-git-send-email-douly.fnst@cn.fujitsu.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
```

I'll try to look into this later, but it definitely looks like they are missing a mapping here.

----------

## donmartio

Just for the record, i followed this howto:

https://wiki.gentoo.org/wiki/Kernel_git-bisect
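
For anyone who has not used it: git bisect is essentially a binary search over the commit history between a known-good and a known-bad version. The sketch below only models that search in Python. The commit list and the `boots_with_one_core` test are made up for illustration, and the real tool also copes with merges and skipped commits, which this does not.

```python
def first_bad_commit(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is
    known good, the last one is known bad, and once a commit is bad
    all later ones are bad too.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):   # e.g. "boots with only one core"
            bad = mid
        else:
            good = mid
    return commits[bad]


# Hypothetical history: 1000 commits between the two releases,
# with the regression introduced at index 640.
history = [f"commit-{i}" for i in range(1000)]
tested = []

def boots_with_one_core(commit):
    tested.append(commit)          # every call stands for one test build
    return int(commit.split("-")[1]) >= 640

culprit = first_bad_commit(history, boots_with_one_core)
print(culprit, "found after", len(tested), "test builds")
```

Each round halves the remaining range, so the number of kernels to build and boot grows only with the logarithm of the number of commits.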

----------

## Tony0945

 *Quote:*   

> which is large enough to store info for all possible cpus. 

 Which explains why there is no problem with my old Athlon II nor my newer but very common A8. I would have thought that FX-4100 would be covered as well, but it was very silly to assume that a table held ALL possible CPUs since new versions and steppings are released all the time. In fact, FX-4100 may even be in the table but your stepping is slightly different.

Thanks very much for the bisect link.

----------

## wrc1944

Guess I've been lucky to never have run into something like this, or else it's just very rare. 

I've literally used every amd cpu family and all the cpus  within a family released since the old amd k2 & k3 days, including most of the steppings of each cpu, if available. 

Plus, all while testing every linux kernel version from 2.6.x -> 4.9.4 as they were released over the years.  The only one I ever skipped was the first Bulldozer FX release, after looking at all the benches and reviews about its questionable single thread performance. Almost went with Intel. Finally got a Piledriver/AM3+ rig to hold me over, anticipating the promised upgrade to Steamroller or Excavator, which never were released.   :Rolling Eyes:   Looks like Zen is going to pan out, so if the benches and reviews look good that's the next one for me. 

Many thanks for the great tip & info about bisecting!

----------

## Tony0945

Just learned that 4.9.5 has this fix in it. *Quote:*   

> Thomas Gleixner (1):
> 
>       x86/bugs: Separate AMD E400 erratum and C1E bug
> 
> 

 

Might be worth a try.

----------

## donmartio

Already tried 4.9.5 with the same result.

I sent a bug report to the specified mailing list:

http://www.spinics.net/lists/linux-acpi/

I poked a little bit in the sources with the state i created with bisect and tried to understand what they did, with no success either.

I think i was lucky too so far. I compiled a lot of kernels since 2004 and never had such an issue.

But what's really great is the support here. 

Thanks for your help and suggestions so far.

----------

## donmartio

I fiddled around a little bit and added some logging.

This shows a strange thing:

[    0.000000] DONMARTIO apicid 5

[    0.000000] DONMARTIO physical_apicid 16

[    0.000000] DONMARTIO logical cpuids 1

[    0.000000] DONMARTIO disabled cpus 1

[    0.000000] DONMARTIO cpu 1

[    0.000000] DONMARTIO apicid 6

[    0.000000] DONMARTIO physical_apicid 16

[    0.000000] DONMARTIO logical cpuids 2

[    0.000000] DONMARTIO disabled cpus 2

[    0.000000] DONMARTIO cpu 2

[    0.000000] DONMARTIO apicid 7

[    0.000000] DONMARTIO physical_apicid 16

[    0.000000] DONMARTIO logical cpuids 3

[    0.000000] DONMARTIO disabled cpus 3

[    0.000000] DONMARTIO cpu 3

[    0.000000] DONMARTIO apicid 8

[    0.000000] DONMARTIO physical_apicid 16

[    0.000000] DONMARTIO logical cpuids 4

[    0.000000] DONMARTIO disabled cpus 4

[    0.000000] DONMARTIO cpu 4

[    0.000000] DONMARTIO apicid 16

[    0.000000] DONMARTIO physical_apicid 16

[    0.000000] DONMARTIO disabled cpus 4

[    0.000000] DONMARTIO cpu 0

[    0.000000] DONMARTIO apicid 17

[    0.000000] DONMARTIO physical_apicid 16

[    0.000000] DONMARTIO logical cpuids 5

[    0.000000] DONMARTIO disabled cpus 4

[    0.000000] DONMARTIO cpu 5

[    0.000000] DONMARTIO apicid 18

[    0.000000] DONMARTIO physical_apicid 16

[    0.000000] DONMARTIO logical cpuids 6

[    0.000000] DONMARTIO disabled cpus 4

[    0.000000] DONMARTIO cpu 6

[    0.000000] DONMARTIO apicid 19

[    0.000000] DONMARTIO physical_apicid 16

[    0.000000] DONMARTIO logical cpuids 7

[    0.000000] DONMARTIO disabled cpus 4

[    0.000000] DONMARTIO cpu 7

The system checks eight CPUs. The first four are disabled.

If i add this line 

if(apicid < 16) return -ENODEV;

at the beginning of the function __generic_processor_info, i get all 4 cores.

(I chose 16 because that is the lowest apicid in the kernel 4.8.17 output.)

I wonder if it's possible that i have an amd with eight cores labeled as quad core?

I'm really curious what's going wrong here.

----------

## ct85711

 *donmartio wrote:*   

> I wonder if it's possible that i have an amd with eight cores labeled as quad core?

If you have an APU, that will explain why you would see 8 cores total, but only 4 you can use as the others are dedicated directly for the GPU that is built in.

----------

## Tony0945

 *donmartio wrote:*   

> I wonder if it's possible that i have an amd with eight cores labeled as quad core?

  That is indeed possible as AMD has a history of such things. When they don't have enough low end chips to satisfy demand, Intel will physically cripple higher performance chips and sell them as low performers. AMD will typically just label them as lower speed/lower core count.  I think they have some way of firmware shutting off cores. The right motherboard(s) will let you turn them back on. My Athlon II X3 is really an X4 with one core shut off. The mobo says I can turn it back on, but I've never been sure it was safe.

----------

## Ant P.

 *Tony0945 wrote:*   

> My Athlon II X3 is really an X4 with one core shut off. The mobo says I can turn it back on, but I've never been sure it was safe.

 

Mine is one of those, I've been using it as an X4 for years now. I had to update the BIOS to get that, but these posts make me wonder if I could've just tweaked the kernel...

----------

## donmartio

I now have kernel-4.9.5 up and running with all four cores.

I created the following patch:

```
diff -urN a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
--- a/arch/x86/kernel/apic/apic.c   2017-01-20 21:12:10.000000000 +0100
+++ b/arch/x86/kernel/apic/apic.c   2017-01-22 15:06:45.058115715 +0100
@@ -2150,6 +2150,10 @@
        /* Logical cpuid 0 is reserved for BSP. */
        cpuid_to_apicid[0] = apicid;
+   } else if(!boot_cpu_detected) {
+       pr_warning("APIC: physical cpu not yet deteced. apicid is %d, physical apicid is %d.\n", 
+           apicid, boot_cpu_physical_apicid);
+       return -EINVAL;
    } else {
        cpu = allocate_logical_cpuid(apicid);
        if (cpu < 0) {
```

and put that into /etc/portage/patches/sys-kernel/gentoo-sources-4.9.5/apic-cpu-detection.patch.

Then i remerged gentoo-sources-4.9.5, configured and compiled the kernel and now it works.

This is quite hacky, since i don't really know if this is the right way to solve this, but it works  :Smile:  .
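
To make the effect of the extra `else if` branch easier to see, here is a toy model of the cpuid assignment in Python. It is only a sketch that reproduces the numbering from the debug log earlier in the thread: the function name `assign_logical_cpuids` and the flag `reject_before_boot_cpu` are made up, and the real `__generic_processor_info` does a lot more than this.

```python
def assign_logical_cpuids(apicids, boot_apicid, reject_before_boot_cpu=False):
    """Toy model: map apicids to logical cpuids in enumeration order.

    Logical cpuid 0 is reserved for the boot CPU; every other apicid
    gets the next unused id. With reject_before_boot_cpu=True, entries
    that show up before the boot CPU has been seen are refused, which
    is the idea behind the patch above.
    """
    mapping = {}
    next_id = 1
    boot_cpu_detected = False
    for apicid in apicids:
        if apicid == boot_apicid:
            mapping[apicid] = 0
            boot_cpu_detected = True
        elif reject_before_boot_cpu and not boot_cpu_detected:
            continue               # the patch returns -EINVAL at this point
        else:
            mapping[apicid] = next_id
            next_id += 1
    return mapping


# Enumeration order and boot apicid taken from the debug log in this thread.
order = [5, 6, 7, 8, 16, 17, 18, 19]

unpatched = assign_logical_cpuids(order, boot_apicid=16)
patched = assign_logical_cpuids(order, boot_apicid=16, reject_before_boot_cpu=True)

print("unpatched:", unpatched)
print("patched:  ", patched)
```

In this model the four real cores (apicid 16 to 19) end up as cpu 0, 5, 6 and 7 without the guard, as in the log, and as cpu 0 to 3 with it.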

----------

## Tony0945

Absolutely! ANY look up in ANY program should deal with "not found".

Great detective work, donmartio.

----------

## wrc1944

WOW!  

"Quite hacky" or not, it's still quite ingenious, and proves once again a poster's so-called "Apprentice" status doesn't mean squat.

Congrats to donmartio for resolving the mystery!  And on a personal note, thanks very much. I learned a lot from following this thread.   :Very Happy: 

----------

## donmartio

No response so far.

Has anybody experience with filing bug reports to a kernel mailing list?

Maybe i did something wrong, or they are just too busy.

I think i could live a while with my patch but i'm also quite curious since it seems i'm the only one with this problem (which is kind of strange).

Could be a rare combination of mainboard, processor and bios.

I still wonder if i have a masked eight core under the hood.

----------

## Tony0945

 *Ant P. wrote:*   

> Mine is one of those, I've been using it as an X4 for years now. I had to update the BIOS to get that, but these posts make me wonder if I could've just tweaked the kernel...

  Rebooted, went into BIOS with DEL, clicked on "Test and unlock cores". Number of cores reported in BIOS went from 3 to 4. rebooted. lscpu shows four cores. Seems to be a bit quicker and no problems for several days now.

Seems a bit like overclocking but temps actually seem cooler.

----------

## donmartio

Hmmm, i have no 'unlock cores' or something else in my bios.

I have a UCC mode (ASRock Extreme3 970) which should unlock cores, according to the docs.

But there are no more cores if i switch this mode on.

So i suspect what the apic module detects are phantom cores.

----------

## Tony0945

donmartio, Ant P. and I have older Athlon II X3 CPUs. Mine is on an old AM2+ MSI mobo. I doubt if your FX- has any locked cores.

----------

## donmartio

Just for the record, i'm running the self-patched kernel 4.9.10 now. No response on my post.

I've sent the patch again. The list is pretty noisy so my post got lost, i think.

----------

## Tony0945

Donmartio,  try filing a bug with gentoo against gentoo-sources. Then it could at least become a standard gentoo-sources patch.  When/if I get time I'll prepare a user patch. Or do you already have one you can share? I have not been affected by this bug, but I'll bet if I buy an early Ryzen, I will.

EDIT: DUH! You already created the patch and posted it above.   It's only 7:00AM here, and I haven't had any coffee.

----------

## Tony0945

Patch failed for me against gentoo-sources-4.9.9. A slightly different version made with diff -Naur worked:

```
--- arch/x86/kernel/apic/apic.c.old     2017-02-17 07:29:50.207194382 -0600
+++ arch/x86/kernel/apic/apic.c 2017-02-17 07:41:04.817193931 -0600
@@ -2150,6 +2150,10 @@
                /* Logical cpuid 0 is reserved for BSP. */
                cpuid_to_apicid[0] = apicid;
+       } else if(!boot_cpu_detected) {
+                       pr_warning("APIC: physical cpu not yet detected. apicid is %d, physical apicid is %d.\n",
+               apicid, boot_cpu_physical_apicid);
+                       return -EINVAL;
        } else {
                cpu = allocate_logical_cpuid(apicid);
                if (cpu < 0) {
```

The logic is a bit difficult to follow with that stupid K&R layout.

----------

## donmartio

Thanks for the response Tony0945.

I got some kind of response today from one of the devs:

```
[+x86, Thomas and Gu Zheng]
```

Don't know what that means.

diff -urN works here but not for you, that's strange.

If i get that right the difference is the -a which means 'treat all files as text'.

Do you have an explanation for that ?
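
One visible difference between the two patches, apart from whitespace, is the file header: the first uses `a/arch/...` and `b/arch/...`, the second plain `arch/...`. Tools that apply patches strip a number of leading path components (the `-p` option of `patch`), so the two need different strip levels. Whether that is what made the first patch fail is only a guess. The Python sketch below just illustrates the stripping, and `strip_components` is a made-up helper, not part of any tool.

```python
def strip_components(path, count):
    """Drop the first `count` slash-separated components, like patch -pN."""
    parts = path.split("/")
    if count >= len(parts):
        raise ValueError(f"cannot strip {count} components from {path!r}")
    return "/".join(parts[count:])


# Header paths as they appear in the two patches in this thread.
first = "a/arch/x86/kernel/apic/apic.c"    # made with: diff -urN a/... b/...
second = "arch/x86/kernel/apic/apic.c"     # made with: diff -Naur file.old file

print(strip_components(first, 1))    # one component has to be stripped
print(strip_components(second, 0))   # already relative to the source tree
```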

I hoped that i get rid of that problem with an early Ryzen  :Smile:  .

I'm really curious what the apic devs find out .

----------

## Tony0945

 *donmartio wrote:*   

> diff -urN works here but not for you, that's strange.
> 
> If i get that right the difference is the -a which means 'treat all files as text'.
> 
> Do you have an explanation for that ?

 

Don't know. I got that command long ago by googling and never looked into what it meant.

Planning on an 8-core Ryzen myself and don't want the kernel to think it's a single core!

----------

## Chiitoo

 *donmartio wrote:*   

> I got some kind of response today from one of the devs:
> 
> ```
> [+x86, Thomas and Gu Zheng]
> ```
> ...

 

Almost certain that it means they added those mentioned to the thread via 'cc'.  :]

----------

## wrc1944

Tony0945 wrote:   *Quote:*   

> Planning on an 8-core Ryzen myself and don't want the kernel to think it's a single core!
> 
> 

 

Just noticed two new patches in the 4.9.10 patch list that seem related to zen cpus. 

https://lwn.net/Articles/714592

 *Quote:*   

> Borislav Petkov (1):
> 
>       x86/CPU/AMD: Bring back Compute Unit ID
> 
> Yazen Ghannam (1):
> ...

 

I hadn't realized this before, but I think this means we also need to now enable SMT in the kernel for the Zen 17h family.  

IIRC, I haven't been setting SMT for years as I believed amd cpus weren't using that option. AFAIK (or had thought), SMT was only relevant for Intel cpus.  Am I correct on this?

BTW- 4.9.11 was just released, and I think 4.10.0 should also be out this weekend.

----------

## Tony0945

```
CONFIG_SCHED_SMT:

SMT scheduler support improves the CPU scheduler's decision making
when dealing with Intel Pentium 4 chips with HyperThreading at a
cost of slightly increased overhead in some places. If unsure say
N here.
```

 ?????

I'm going to try turning it on.

Thanks for all of your Ryzen work!

EDIT:   From https://en.wikipedia.org/wiki/Simultaneous_multithreading

 *Quote:*   

> AMD Bulldozer microarchitecture FlexFPU and Shared L2 cache are multithreaded but integer cores in module are single threaded, so it is only a partial SMT implementation.[7]
> 
> AMD Zen microarchitecture has 2-way SMT.

 

----------

## wrc1944

I did a google search for "amd cmt processors" and got a lot of discussions about amd CMT (Cluster-based Multithreading), and relating to the new SMT in Zen. Got a slightly better grasp on what's going on.

One good one is:

https://forum.beyond3d.com/threads/amd-ryzen-cpu-architecture-for-2017.56187

and another one, seemingly offering a different viewpoint: http://wccftech.com/amds-high-performance-processor-cores-coming-2015-giving-modular-architecture

Tony, do you mean turning it on for non-Zen cpus, or only when you have a new Zen system? I was under the impression that SMT was "Intel hyperthreading" and not applicable to AMD CMT based cpus, but since you mention AMD has partial SMT implementation I'm still not precisely sure what all this really means.  However, I am definitely becoming convinced Zen will need SMT enabled in the kernel.  Guess i need to gain more knowledge on the subject.   :Rolling Eyes: 

----------

## donmartio

Tried that here with SMT on, but that has no effect here (as expected).

But for the ryzen, i think you are right. The commit of interest is this one:

c8cbc219d87cdbe33430b92350cb687b3f2201e6 x86/CPU/AMD: Fix Zen SMT topology

```
git show c8cbc219d87cdbe33430b92350cb687b3f2201e6
commit c8cbc219d87cdbe33430b92350cb687b3f2201e6
Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
Date:   Sun Feb 5 11:50:22 2017 +0100

    x86/CPU/AMD: Fix Zen SMT topology

    commit 08b259631b5a1d912af4832847b5642f377d9101 upstream.

    After:

      a33d331761bc ("x86/CPU/AMD: Fix Bulldozer topology")

    our  SMT scheduling topology for Fam17h systems is broken, because
    the ThreadId is included in the ApicId when SMT is enabled.

    So, without further decoding cpu_core_id is unique for each thread
    rather than the same for threads on the same core. This didn't affect
    systems with SMT disabled. Make cpu_core_id be what it is defined to be.

    Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/20170205105022.8705-2-bp@alien8.de
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 20dc44d1e6be..2b4cf04239b6 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -319,6 +319,13 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
                if (c->x86 == 0x15)
                        c->cu_id = ebx & 0xff;
 
+               if (c->x86 >= 0x17) {
+                       c->cpu_core_id = ebx & 0xff;
+
+                       if (smp_num_siblings > 1)
+                               c->x86_max_cores /= smp_num_siblings;
+               }
+
                /*
                 * We may have multiple LLCs if L3 caches exist, so check if we
                 * have an L3 cache by looking at the L3 cache CPUID leaf.
```

ThreadId is included in the ApicId when SMT is enabled.
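
The arithmetic in the patch can be mirrored in a few lines of Python. This is only a sketch of the two added statements: `fix_zen_topology` is a made-up name, and the example values for `ebx`, `x86_max_cores`, `smp_num_siblings` and the old `cpu_core_id` are assumptions for an 8-core, 16-thread part, not numbers read from real hardware.

```python
def fix_zen_topology(family, ebx, x86_max_cores, smp_num_siblings, cpu_core_id):
    """Mirror the Fam17h branch added by the patch above.

    Returns (cpu_core_id, x86_max_cores) after the fix-up; for older
    families both values are passed through unchanged.
    """
    if family >= 0x17:
        cpu_core_id = ebx & 0xFF                 # same statement as in the patch
        if smp_num_siblings > 1:
            x86_max_cores //= smp_num_siblings   # count cores, not threads
    return cpu_core_id, x86_max_cores


# Assumed example: two SMT threads of the same core. Before the fix each
# thread carries its own cpu_core_id (6 and 7 here); the low byte of ebx
# is 3 for both.
thread_a = fix_zen_topology(0x17, ebx=0x0103, x86_max_cores=16,
                            smp_num_siblings=2, cpu_core_id=6)
thread_b = fix_zen_topology(0x17, ebx=0x0103, x86_max_cores=16,
                            smp_num_siblings=2, cpu_core_id=7)
print(thread_a, thread_b)
```

After the fix-up both threads report the same core id, which is what the commit message means by making `cpu_core_id` "be what it is defined to be".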

EDIT: Huh, just saw your post wrc1944, thanks for the links.

----------

## donmartio

By the way, i hoped, when i read 8 cores and 16 threads, to see 16 little penguins at boot time  :Smile:  .

----------

## Tony0945

 *wrc1944 wrote:*   

> Tony, do you mean turning it on for non-Zen cpus, or only when you have a new Zen system? I was under the impression that SMT was "Intel hyperthreading" and not applicable to AMD CMT based cpus, but since you mention AMD has partial SMT implementation I'm still not precisely sure what all this really means.  However, I am definitely becoming convinced Zen will need SMT enabled in the kernel.  Guess i need to gain more knowledge on the subject.  

  I enabled the hyperthreading kernel CONFIG on the Athlon II and as expected, it did absolutely nothing. I just rebooted the Kaveri after doing the same.  The menuconfig help does indeed still say it's not applicable, but the wikipedia article indicates it might be applicable to bulldozer and later. The post just above says it might be screwed up for Fam17.

No errors or warnings in Kaveri log.

EDIT: I only enabled it on the Athlon II because this is the machine I'm going to convert to Zen. I figured it would be one less thing to remember.

----------

## donmartio

Tried kernel 4.10.0, the problem is still the same.

----------

## axl

I've experienced the same thing. With a HP Proliant DL380 G7.

https://bugzilla.kernel.org/show_bug.cgi?id=194501

This is my bug report. I suspect the problem is in the ACPI code. My dmesg shows:

```
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:2065 __generic_processor_info+0x297/0x360
[    0.000000] Only 7 processors supported.Processor 8/0x2 and the rest are ignored.
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.8-gentoo #2
[    0.000000] Hardware name: HP ProLiant DL380 G7, BIOS P67 07/02/2013
[    0.000000]  0000000000000000 ffffffff8120c34d ffffffff81803dd0 0000000000000000
[    0.000000]  ffffffff810415d4 0000000000000008 ffffffff81803e20 0000000000000000
[    0.000000]  0000000000000008 0000000000000015 0000000000000001 ffffffff8104163a
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff8120c34d>] ? dump_stack+0x46/0x59
[    0.000000]  [<ffffffff810415d4>] ? __warn+0xb4/0xd0
[    0.000000]  [<ffffffff8104163a>] ? warn_slowpath_fmt+0x4a/0x50
[    0.000000]  [<ffffffff8102bef7>] ? __generic_processor_info+0x297/0x360
[    0.000000]  [<ffffffff8103de81>] ? acpi_register_lapic+0x3d/0x6c
[    0.000000]  [<ffffffff818a44d1>] ? acpi_parse_lapic+0x3e/0x43
[    0.000000]  [<ffffffff818bbcb8>] ? acpi_parse_entries_array+0xf4/0x152
[    0.000000]  [<ffffffff818bbe43>] ? acpi_table_parse_entries_array+0xa8/0xc6
[    0.000000]  [<ffffffff818a4e0c>] ? acpi_boot_init+0xde/0x494
[    0.000000]  [<ffffffff818a4493>] ? acpi_parse_x2apic+0x6c/0x6c
[    0.000000]  [<ffffffff818a4427>] ? acpi_parse_ioapic+0x74/0x74
[    0.000000]  [<ffffffff8189e40c>] ? setup_arch+0x8b2/0x924
[    0.000000]  [<ffffffff81898aa0>] ? start_kernel+0x52/0x3af
[    0.000000] ---[ end trace 0000000000000000 ]---
```

I reverted to the 4.4 kernel series to fix the problem. I am glad I'm not the only one this happened to.

----------

## donmartio

Hey axl,

did you try the patch? The error you get seems slightly different and you have an intel cpu. So there may be another problem.

If the patch works for you too, i would post that to the kernel thread to get a little more weight on this matter.

kind regards

DonMartio

----------

## axl

 *donmartio wrote:*   

> Hey axl,
> 
> did you try the patch? The error you get seems slightly different and you have an intel cpu. So there may be another problem.
> 
> If the patch works for you too, i would post that to the kernel thread to get a little more weight on this matter.
> ...

 

Hey

No. Unfortunately I don't have remote access to that machine. That was what made the situation so difficult. I was asked to put it in a configuration that works and leave. AS SOON AS POSSIBLE.  :Smile: 

It's not a setting where you can just "try" patches. I mentioned that I solved it at the time with a kernel from the previous long term line, the 4.4 series.

I know I scheduled to see that machine again in april, I will try the patch then, if 4.10/4.4 fails. Or if i have extra time. 

I wasn't able to replicate the bug on any of my machines. The weird thing was (for me) that it was a proliant server. you would expect more bug reports.

When you say "may be other problem", did you check dmesg if you have that trace? after loapic? 

WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:2065 __generic_processor_info+0x297/0x360 

this. this is where it goes to 1 core. do you have the same error?

or this:

Only 7 processors supported.Processor 8/0x2 and the rest are ignored. 

This makes no sense: the machine has 8 processors, the message says only 7 are supported, but in fact it runs with 1 out of 8, which means 7 are disabled. So clearly an error right there.

In april, I'll be forced to interact again with that machine. I'll post updates then.

----------

## donmartio

If i get this right, your problem seems different but related.

I didn't get the warning you get. Just one core to work with.

My problem was that the detection of the cpus relies on the order in which the logical cpus are found.

As it seems, it finds the fpus first and marks them as disabled, so when it comes to the real cores it has already found 4 disabled cpus. This leads to one 'physical' cpu with 4 real cores and 4 disabled virtual cores, which leaves just one.

You get this warning when it tries to allocate a logical cpuid and has already found 8 logical cpuids.

My patch jumps in just before it gets to this allocation.

The whole process of allocating logical cpuids and mapping them to the real cores seems pretty error prone to me.

I don't know how those virtual processors are detected, but relying on the order is always risky.

What puzzles me is that this kind of error is apparently rare.
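
The warning axl quoted fits this picture if the allocator simply runs out of ids. Here is a small Python model of that, purely as an illustration: `LogicalCpuidAllocator` is a made-up class, the limit of 8 is inferred from the warning text, the apicids are arbitrary, and the message wording is copied from the dmesg output above rather than from the kernel source.

```python
class LogicalCpuidAllocator:
    """Toy model of handing out logical cpuids up to a fixed limit."""

    def __init__(self, nr_cpu_ids):
        self.nr_cpu_ids = nr_cpu_ids
        self.cpuid_to_apicid = {}          # logical cpuid -> apicid
        self.nr_logical_cpuids = 1         # id 0 is reserved for the boot CPU

    def allocate(self, apicid):
        # Reuse an existing mapping so that it stays persistent.
        for cpu, known in self.cpuid_to_apicid.items():
            if known == apicid:
                return cpu
        if self.nr_logical_cpuids >= self.nr_cpu_ids:
            print(f"Only {self.nr_cpu_ids - 1} processors supported."
                  f"Processor {self.nr_logical_cpuids}/{apicid:#x} "
                  "and the rest are ignored.")
            return -1
        cpu = self.nr_logical_cpuids
        self.cpuid_to_apicid[cpu] = apicid
        self.nr_logical_cpuids += 1
        return cpu


# Assumed scenario: eight non-boot entries are enumerated before the
# boot CPU, so the eighth one no longer fits under a limit of 8.
alloc = LogicalCpuidAllocator(nr_cpu_ids=8)
results = [alloc.allocate(apicid) for apicid in range(0x10, 0x18)]
print(results)
```

In this model the first seven entries get ids 1 to 7 and the eighth is refused, because id 0 stays reserved for a boot CPU that has not shown up yet.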

----------

