# [solved] hyperthreaded kernel not recognizing two "cores"

## pidsley

I have a Pentium 4 machine with hyperthreading. I have only been using Gentoo for about a month, and I successfully configured and built a 3.2.21 kernel using a config file I got from kernel-seeds.org and modified to fit my hardware. This kernel recognizes that the processor has two "cores."

I am now trying to use the same procedure to build a 3.3.8 kernel, and I have everything working except I cannot get the kernel to recognize both cores of the processor.

I've run vimdiff on the two config files, and I just can't see where the difference is (I'm still very new at kernel configs; I see many differences in the two files, but nothing that seems to be what I need to change for this). Eventually I want to tune the 3.3.8 config and get rid of some things that I don't need, but I'd like to understand and fix this problem first.

I'm hoping someone just knows what setting I'm missing, but in case you need to see the config files, here they are:

working 3.2.21 .config: http://bpaste.net/show/35953/

non-working 3.3.8 .config: http://bpaste.net/show/35956/

One more question: how often do people typically rebuild Gentoo kernels? My 3.2.21 kernel works fine, and I really only wanted to build 3.3.8 to learn more about the process. So far I'm learning a lot, just not exactly what I need to know...

*Last edited by pidsley on Fri Jul 20, 2012 4:10 pm; edited 1 time in total*

----------

## eccerr0r

You have ACPI disabled on your 3.3.8 kernel.  Many newer machines do not have MPS support and just ACPI, and ACPI is used to detect the other "cores".
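
For anyone comparing two configs the same way, here is a minimal Python sketch that reports only the options discussed in this thread instead of a full vimdiff. The function names and the sample fragments are made up for illustration; they are not taken from the pasted configs. It relies on the usual `.config` conventions (`CONFIG_FOO=y` for set options, `# CONFIG_FOO is not set` for disabled ones).

```python
import re

# Options discussed in this thread; adjust the tuple for your own checks.
OPTIONS = ("CONFIG_ACPI", "CONFIG_SMP", "CONFIG_SCHED_SMT", "CONFIG_SCHED_MC")


def parse_config(text):
    """Map option name -> value ('y', 'm', 'n', or a literal string/number)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set"
        m = re.match(r"# (CONFIG_\w+) is not set$", line)
        if m:
            values[m.group(1)] = "n"
            continue
        m = re.match(r"(CONFIG_\w+)=(.*)$", line)
        if m:
            values[m.group(1)] = m.group(2)
    return values


def diff_options(old_text, new_text, names=OPTIONS):
    """Return {name: (old, new)} for the named options whose values differ."""
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        n: (old.get(n, "absent"), new.get(n, "absent"))
        for n in names
        if old.get(n, "absent") != new.get(n, "absent")
    }


if __name__ == "__main__":
    # Hypothetical fragments for illustration, not the configs linked above.
    working = "CONFIG_SMP=y\nCONFIG_ACPI=y\nCONFIG_SCHED_SMT=y\n"
    broken = (
        "CONFIG_SMP=y\n"
        "# CONFIG_ACPI is not set\n"
        "# CONFIG_SCHED_SMT is not set\n"
    )
    for name, (a, b) in diff_options(working, broken).items():
        print(f"{name}: {a} -> {b}")
```

To use it on real files, read both `.config` files into strings and pass them to `diff_options`. An option missing from a file entirely is reported as "absent", which usually means a dependency higher up is switched off.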

----------

## pidsley

Thank you. I thought ACPI was just a laptop power thing.

(edit) -- I re-enabled ACPI, but just accepted the defaults under that. I'm rebuilding now.

----------

## Gusar

SCHED_MC is for true multi-core machines; with hyperthreading you should use SCHED_SMT instead. That's not your issue, but it's something you should do too.

The biggest thing I notice in those configs is that you don't have ACPI activated with 3.3.8. Activate that, and also p4_clockmod and cpu_idle.

Edit: Gah, beaten to it by eccerr0r :)

----------

## kimmie

SCHED_SMT isn't set in your 3.3 config, that's probably the problem.

You can use the / command in make menuconfig to find the option in the menus.

I tend to stick with a kernel as long as it's working, or until some new features come along that I'm interested in. Usually I read the changelogs on kernelnewbies.org for major kernel versions, and the posts on LWN.net for minor versions. If I see a security issue or a bug that affects my hardware I upgrade. Depends a lot on your setup... main drivers of kernel updates for me have been WiFi or XFS related. My server will stay on 3.0 series until it's no longer maintained as a long-term kernel. And then some, realistically.

----------

## pidsley

Thanks for the quick responses everyone. Enabling ACPI did the trick. Marking it [solved].

Now I can go break it by changing it again -- at least I have a working config to fall back to :)

----------

## eccerr0r

Pretty much everyone should be enabling ACPI nowadays, especially on machines with plenty of RAM. As long as the ACPI interpreter works (and your DSDT isn't broken), it enables many features, and for some of them it is the only way. It's more than just power.

BTW, theoretically your machine should work just fine without SMT support. As long as SMP is enabled, it's fine. Not having SMT support will do funny things on machines that have more than one multithreaded core. Imagine running two threads on one SMT core and leaving the other core idle...

ACPI = Advanced Configuration and Power Interface

----------

## pidsley

Thank you. I definitely learned some new things building this kernel, and that's exactly why I started using Gentoo. I have ACPI, SMP, and SMT enabled (and I disabled MC) and it seems to be working well. It boots quickly and uses significantly less memory than the distros I've used before, even running the same WM and apps.

If it's not a problem to have both SMP and SMT enabled on this machine I'm inclined to leave it that way for now. Next kernel build I'll just enable SMP (and ACPI :) ).

----------

## Gusar

You're mixing a few things up here. First, we have CONFIG_SMP. This simply says to the kernel "configure for multi-core". Which is what we want, even though with hyperthreading you have "fake" cores. These fake cores are called "thread siblings".

Then comes CONFIG_SCHED_MC vs CONFIG_SCHED_SMT, which, as the names imply, is about scheduling, a wholly different setting from the previous one. Here you choose MC if all your cores are real ones, or SMT if you have hyperthreading. Even if you have several real cores, if hyperthreading is involved, you choose SMT. For example, a Core i3 processor has two real cores with hyperthreading, so two cores each with its own thread sibling.
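
To see which case a machine falls into, one option is to count logical CPUs against distinct physical cores in /proc/cpuinfo. The sketch below is only an illustration: the function name and the sample text are hypothetical, and it assumes the x86 field names `processor`, `physical id` and `core id`, which other architectures may not report.

```python
def summarize_cpuinfo(text):
    """Count logical CPUs, packages and physical cores in /proc/cpuinfo text."""
    packages, cores, logical = set(), set(), 0
    # Each logical CPU is a blank-line-separated stanza of "key : value" lines.
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        pkg = fields.get("physical id", "0")
        packages.add(pkg)
        # Fall back to the processor number if no core id is reported.
        cores.add((pkg, fields.get("core id", fields["processor"])))
    return {
        "logical": logical,
        "packages": len(packages),
        "cores": len(cores),
        # More logical CPUs than physical cores means thread siblings exist.
        "smt": logical > len(cores),
    }


# Hypothetical sample: one package, one core, two thread siblings.
SAMPLE = """\
processor : 0
physical id : 0
siblings : 2
core id : 0
cpu cores : 1

processor : 1
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
"""

if __name__ == "__main__":
    print(summarize_cpuinfo(SAMPLE))
```

On a live system, pass `open("/proc/cpuinfo").read()` instead of `SAMPLE`. If `logical` comes back as 1 on a hyperthreaded CPU, the kernel is not seeing the sibling at all, which is a detection problem rather than a scheduler setting.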

----------

## pidsley

Thanks. I think I do understand the difference, but may not have expressed myself well. I assumed CONFIG_SMP was required no matter what if I wanted multiprocessing, so I have always had it enabled. I also had SCHED_SMT enabled and SCHED_MC disabled when I first built the kernel, because I thought that's what I wanted for a hyperthreaded processor with only one true core (CONFIG_SMP and SCHED_SMT, but not SCHED_MC). When /proc/cpuinfo reported only one "core" I tried disabling SMT and enabling SCHED_MC (more a move of trial and error than a well thought out plan) -- and when that didn't work I posted my configs and was told my problem was ACPI. So I enabled ACPI, disabled SCHED_MC and re-enabled SCHED_SMT.

Then eccerr0r said I did not need to have SCHED_SMT as long as I had CONFIG_SMP enabled, so I asked if it was OK to have both. That's what I have now, and it seems to be working well.

----------

## eccerr0r

The multicore scheduling is even weirder...

If you have a machine with two sockets, each holding a processor with two cores, and each core has two threads, you have 8 threads in total.

Configuring for SMP is sufficient for all 8 threads to be used. However the scheduler, not knowing or caring about the physical layout, could run two threads on the same core, leaving 3 idle cores.

Now if threading support were enabled, the kernel would no longer schedule two threads to the same core. But it may assign the two threads to the same processor... Well, what's wrong with this?

Nothing really, except two possible issues:

1- if each processor in a socket pools cache for both cores, then the two threads will share the cache.  If you push the threads to different sockets then each thread would have the whole cache to itself, improving performance of both threads

2- if both threads were dumped on one socket, that one processor would warm up a lot (since both cores on the socket are going full bore) and the other would stay idle.  Well this normally wouldn't matter too much but if they were Core i5/i7's it would not allow Turbo Boost to kick in.  If each thread were on its own processor/socket, then both CPUs wouldn't get as hot since only one core on each is active, thus they could Turbo Boost and both threads get sped up.
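
The three situations described above (SMP only, thread-aware, socket-aware) can be sketched as a toy placement model in Python. This is not how the Linux scheduler actually works; the policy names and the "take the first N CPUs in some order" rule are assumptions made purely to illustrate where two busy tasks could end up on the 2 socket x 2 core x 2 thread machine.

```python
from itertools import product


def topology(sockets=2, cores=2, threads=2):
    """All logical CPUs as (socket, core, thread) tuples."""
    return list(product(range(sockets), range(cores), range(threads)))


# Hypothetical policies: each is a sort key deciding which CPUs get used first.
ORDERINGS = {
    # No topology knowledge: CPUs taken in enumeration order (worst case).
    "smp_only": None,
    # Avoid sibling threads first, but happily fill one socket.
    "smt_aware": lambda c: (c[2], c[0], c[1]),
    # Avoid sibling threads and spread across sockets first.
    "socket_aware": lambda c: (c[2], c[1], c[0]),
}


def place(n_tasks, cpus, policy):
    """Return the CPUs the first n_tasks tasks would land on under a policy."""
    key = ORDERINGS[policy]
    order = cpus if key is None else sorted(cpus, key=key)
    return order[:n_tasks]


def busy_cores(picks):
    """Number of distinct physical cores used by a placement."""
    return len({(s, c) for s, c, _ in picks})


if __name__ == "__main__":
    cpus = topology()
    print("logical CPUs:", len(cpus))
    for policy in ORDERINGS:
        picks = place(2, cpus, policy)
        print(policy, picks, "cores in use:", busy_cores(picks))
```

With two tasks, the first policy can put both on one core, the second uses two cores on one socket, and the third uses one core on each socket, which is the case where each task gets a full cache and more thermal headroom.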

Now the question is for a single core, dual thread (Simultaneous Multithread)... Having a second thread running will improve performance overall (1+1=2.1 because of unused resources in the CPU) but there is a tradeoff: whether to use the other thread or use traditional time sharing scheduling.  A context swap needed during traditional time share scheduling versus a thread swap in the CPU... I'm not sure which is better, unless the CPU threadswitch (depending on CPU) does not flush all caches on a threadswitch...  However, clearly, if a second thread is running you should run it on the other thread, not run both on the same "virtual thread"....

Ok this is getting confusing...maybe...ugh... okay...nevermind...

----------

## Gusar

 *pidsley wrote:*   

> Then eccerr0r said I did not need to have SCHED_SMT as long as I had CONFIG_SMP enabled

 

He said "theoretically" you don't, which is true. But he also says funny things will happen if you don't, which is also true. So it's not only "ok" to have it, you *should* have it, so that proper scheduling decisions will be made.

----------

## eccerr0r

There's a lot of stuff I haven't considered here... instead of editing, more addendums...

Having both threads running on the same core can also have its benefits:  If both threads are software siblings of each other they may share the same memory... and if they do, having them on the same core would be beneficial as they'd be accessing the same cache area.
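
If you want to experiment with that case by hand, one possible approach is to look up the thread siblings of a CPU in sysfs and pin a process to them. The sketch below is an assumption-laden illustration: the helper names are made up, it is Linux-only, and it relies on the sysfs file `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` and on Python's `os.sched_setaffinity`.

```python
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,4" or "0-1,8-9" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, sep, hi = part.partition("-")
        # A single number is a one-element range.
        cpus.update(range(int(lo), int(hi if sep else lo) + 1))
    return cpus


def thread_siblings(cpu=0):
    """Logical CPUs sharing a physical core with `cpu` (Linux sysfs only)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as f:
        return parse_cpu_list(f.read())


def pin_to_siblings(cpu=0):
    """Restrict the calling process to the sibling threads of one core."""
    siblings = thread_siblings(cpu)
    os.sched_setaffinity(0, siblings)  # pid 0 means "the caller"
    return siblings


if __name__ == "__main__":
    # Parsing only; pinning is left to the caller because it changes
    # where this process is allowed to run.
    print(parse_cpu_list("0-1"))
```

Calling `pin_to_siblings(0)` before starting worker threads would keep them on one core's siblings, so you can time a cache-sharing workload there and compare it with an unpinned run.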

It's all very confusing and nobody knows what people are trying to run... and the OS has to guess...  so what's better?  I don't know...

----------

## pidsley

 *Gusar wrote:*   

>  *pidsley wrote:*   Then eccerr0r said I did not need to have SCHED_SMT as long as I had CONFIG_SMP enabled 
> 
> He said "theoretically" you don't, which is true. But he also says funny things will happen if you don't, which is also true. So it's not only "ok" to have it, you *should* have it, so that proper scheduling decisions will be made.

 

Thank you. I have it that way now and will keep it that way in the future.

----------

## Gusar

 *eccerr0r wrote:*   

> so what's better?  I don't know...

 

Well... if someone wasn't confused already... *now* they surely are! :)

But you're quite right, there are complex issues. And you didn't even take power saving into account :). So it's even more complicated than what you presented.

----------

## eccerr0r

Yep I was quite performance-focused here. Having all threads dumped onto the same socket has the option of completely shutting off the second socket indeed...

I don't know if this is really done in practice much. It for sure could save a few watts at a cost of performance.  Then again if running 2 threads on an 8-thread system, the machine is a bit more than one really needs to begin with...

----------

## Gusar

I was thinking more of the i3 - two cores, each with a thread sibling. Do you run a two-threaded task on a core and its sibling, allowing the second core to idle, but the task takes longer? Or do you run the task on two real cores, allowing the task to finish faster, and then both cores quickly go to idle?

----------

## eccerr0r

I think in this case, since I don't think the i3 can dynamically disable an idle core, just using both cores and letting both return to idle faster is the least power-consuming method.  This is because cache thrash is wasted power, as it's doing something that doesn't contribute to the application finishing faster...

This is, unless both threads are mostly idle... then a mere i3 is overkill...

----------

