# Kernel 2.6.38 really impresses!

## VinzC

Hi all.

I couldn't wait until 2.6.38 was available, so I compiled it yesterday afternoon and tested it right away. I had forgotten to recompile my ATI video driver at first, but in the end the kernel keeps its promises. I ran a -j16 make against glibc and gcc, played a video in VLC and started a huge transfer over USB to my external disk. The video ran flawlessly without a hiccup. I'm really pleased to see my desktop responsive again, almost like I've never seen it before (but I'd be exaggerating  :Wink:  ).

Honestly, taste it, try it, use it! It's worth the change. Be sure to check CONFIG_CGROUPS and CONFIG_CGROUP_SCHED in your kernel config, because they are not enabled by default. I guess Pappy Mc Fae will have a very good time preparing a new seed for this kernel  :Wink:  .

Cheers.

----------

## leifbk

 *VinzC wrote:*   

> Be sure to check CONFIG_CGROUPS and CONFIG_CGROUP_SCHED in your kernel config for it's not enabled by default.
> 
> Cheers.

 

I'm running this kernel too. I've had gentoo-sources ~amd64 in my /etc/portage/package.keywords for some time, and 2.6.38 arrived yesterday, less than two days after Linus announced it. That's bleeding edge    :Smile: 

I can't find any CONFIG_CGROUP_SCHED. Do you mean CONFIG_SCHED_AUTOGROUP ?

----------

## jms.gentoo

you only need to select 

-> General setup

-->Automatic process group scheduling [CONFIG_SCHED_AUTOGROUP]

it will automatically select the rest

Selects: EVENTFD [=y] && CGROUPS [=y] && CGROUP_SCHED [=y] && FAIR_GROUP_SCHED [=y]

if you deselect CONFIG_SCHED_AUTOGROUP and go back to the configuration, you will see these entries

-> General setup

-->Control Group support [CONFIG_CGROUPS] (CGROUPS)

--->Group CPU scheduler [CONFIG_CGROUP_SCHED] (CGROUP_SCHED)

---->Group scheduling for SCHED_OTHER [CONFIG_FAIR_GROUP_SCHED] (FAIR_GROUP_SCHED)

Conclusion: to get the famous "233-line patch", simply

select -> General setup

-->Automatic process group scheduling [CONFIG_SCHED_AUTOGROUP] (SCHED_AUTOGROUP)

and you're good to go
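
To double-check the result afterwards, something like the following could be run against the saved config. This is only a sketch: `check_autogroup` is a made-up helper name, and /usr/src/linux/.config is an assumed location for the configured tree. The symbol names are the ones from the Selects line above.

```shell
# Report whether SCHED_AUTOGROUP and each symbol it selects is built in (=y).
# check_autogroup is a hypothetical helper, not part of any existing tool.
check_autogroup() {
    config_file=$1
    for sym in SCHED_AUTOGROUP EVENTFD CGROUPS CGROUP_SCHED FAIR_GROUP_SCHED; do
        if grep -q "^CONFIG_${sym}=y\$" "$config_file"; then
            echo "CONFIG_${sym}: enabled"
        else
            echo "CONFIG_${sym}: missing"
        fi
    done
}

# Assumed location of the configured kernel tree; adjust to your setup.
if [ -r /usr/src/linux/.config ]; then
    check_autogroup /usr/src/linux/.config
fi
```

Every symbol should be reported as enabled once Automatic process group scheduling is selected.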

----------

## cach0rr0

I haven't read up on this a ton. Is this a CPU scheduling deal, IO scheduling, or what?

mainly wondering if, come time to test out .38, I should pluck out BFS and go with this, or if this isn't  a replacement, and is something else entirely

----------

## bobspencer123

Thanks for the update on this kernel. I will have to try it out. I wonder if this also fixes the fragmentation issue between nfs transfers and xfs?

----------

## VoidMage

See git log entry for details.

----------

## Jaglover

 *cach0rr0 wrote:*   

> i havent read up on this a ton - is this a CPU scheduling deal, IO scheduling, which? 
> 
> mainly wondering if, come time to test out .38, I should pluck out BFS and go with this, or if this isn't  a replacement, and is something else entirely

 

This is about TTY-based group scheduling which is not TTY-based [any more].   :Razz: 

https://lwn.net/Articles/418884/

----------

## Ant P.

Is this overhyped patch really so much better than the BFS I've been using for the past 18 months or so? I'm watching 720p video with a gcc -j32 compile on it right now.

----------

## cach0rr0

 *Jaglover wrote:*   

> This is about TTY-based group scheduling which is not TTY-based [any more].
> 
> https://lwn.net/Articles/418884/

 

cheers for the link, interesting discussion they have going on. 

having said that, I'm somewhat in Ant_P's same boat. 

Sounds like heaps of hype. While I *am* curious, things are working delightfully for me with BFS, and I would want to see a compelling reason to make the change. BFS is doing everything I need at the moment, and while I'm keen on squeezing every last bit out of this machine, if it's only a nominal gain, dunno, hard sell. I *am* glad mainline is starting to pay a bit more attention to interactivity requirements from desktop users, though, instead of just seeing if we can work efficiently with 4096 cores.

----------

## wswartzendruber

Wonder how long it'll take it to show up in hardened-sources.

----------

## printf

How long does it usually take for at least gentoo-sources to be unmasked with this kernel?

----------

## Anon-E-moose

Add me to the "BFS has been giving me good performance" camp. I may try 2.6.38 sometime, but I'll wait for further testing.

----------

## VinzC

 *Ant_P wrote:*   

> Is this overhyped patch really so much better than the BFS I've been using for the past 18 months or so? I'm watching 720p video with a gcc -j32 compile on it right now.

 

I wouldn't say overhyped  :Very Happy:  . It deserved all the... «noise» there has been around it. The tests I made allowed me to use my laptop as if it weren't stressed at all. Before, I would have had to wait until the compile was over (especially with -j16). That's also why I had decreased the -j value to 3: compiling certain packages made my computer (Core2 Duo) totally unresponsive for seconds.

Those times are over!

----------

## pilla

Yes, it impressed me too. Xorg can't stop crashing on it  :Sad: 

----------

## Gusar

Err, what's the point of using more than -j3 on a dual core machine?

----------

## PaulBredbury

Meh, BFS is tons better, with:

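```
# Best-effort I/O class (-c2) and a slightly raised CPU priority for the player
# (a negative nice value normally needs root).
alias mplayer="ionice -c2 nice -n -4 /usr/bin/mplayer"
# Idle I/O class (-c3) plus schedtool's -D (SCHED_IDLEPRIO) policy for builds.
alias make='ionice -c3 schedtool -D -e /usr/bin/make'
```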

----------

## Ant P.

 *Gusar wrote:*   

> Err, what's the point of using more than -j3 on a dual core machine?

 

There is none, bar showing off. In fact -j3 itself is only needed because the vanilla scheduler has flaws; BFS on a dual core reaches peak efficiency at -j2.
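
For the record, the two rules of thumb mentioned here (cores + 1 for the vanilla scheduler, one job per core for BFS) work out as below. A small sketch only: `suggest_jobs` and its rule names are invented for the example and are not part of any tool.

```shell
# Suggest a make -j value from a core count and a rule of thumb.
# suggest_jobs and the rule names "plus1"/"exact" are made up for this example.
suggest_jobs() {
    cores=$1
    rule=${2:-plus1}
    case $rule in
        plus1) echo $((cores + 1)) ;;   # cores + 1, the usual advice for the vanilla scheduler
        exact) echo "$cores" ;;         # one job per core, as suggested here for BFS
        *) echo "unknown rule: $rule" >&2; return 1 ;;
    esac
}

echo "MAKEOPTS=\"-j$(suggest_jobs 2 plus1)\""   # dual core, vanilla scheduler
echo "MAKEOPTS=\"-j$(suggest_jobs 2 exact)\""   # dual core, BFS
```

On a dual core this prints MAKEOPTS="-j3" and MAKEOPTS="-j2" respectively.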

----------

## NathanZachary

There seems to be a problem with 2.6.38 though: udev won't build with it.

The cause appears to be videodev.h being removed in 2.6.38:

http://comments.gmane.org/gmane.linux.hotplug.devel/16670

The error I get on my new installation is:

```

extras/v4l_id/v4l_id.c:31:28: error: linux/videodev.h: No such file or directory

```
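
A quick way to see whether the installed headers still ship the file that v4l_id.c includes is sketched below. `has_v4l1_header` is a made-up helper name, and /usr/include is assumed to be where the kernel headers are installed.

```shell
# Succeed if the given include root still contains the V4L1 header.
# has_v4l1_header is a hypothetical helper; /usr/include is an assumed default.
has_v4l1_header() {
    include_root=${1:-/usr/include}
    [ -f "$include_root/linux/videodev.h" ]
}

if has_v4l1_header; then
    echo "linux/videodev.h is present"
else
    echo "linux/videodev.h is missing, so the v4l_id.c error above is expected"
fi
```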

There are other errors too, and it fails to build with or without the 'extras' USE flag.  :Sad: 

Anyone have suggestions?

----------

## d2_racing

I use this configuration :

```
General setup
[*] Control Group support
     [ ]   Example debug cgroup subsystem
     [*]   Namespace cgroup subsystem
     [*]   Freezer cgroup subsystem
     [*]   Device controller for cgroups
     [*]   Cpuset support
     [*]     Include legacy /proc/<pid>/cpuset file
     [ ]   Simple CPU accounting cgroup subsystem
     [ ]   Resource counters
     -*-   Group CPU scheduler  --->
     <*>   Block IO controller
     [ ]     Enable Block IO controller debugging
[*] Automatic process group scheduling
```

----------

## graysky

 *PaulBredbury wrote:*   

> Meh, BFS is tons better, with:
> 
> ```
> alias mplayer="ionice -c2 nice -n -4 /usr/bin/mplayer"
> ...
> ```

 

+1 for this.  It'll be interesting to see how Con implements the 2.6.38-ready BFS with this code.

http://ck-hack.blogspot.com/2011/03/2638-and-bfs-ck-releases.html

----------

## VinzC

 *Gusar wrote:*   

> Err, what's the point of using more than -j3 on a dual core machine?

 

There are plenty of such questions with GNU/Linux! And the answer would always be: «because I can!»

 *NathanZachary wrote:*   

> It seems that there is a problem with 2.6.38 though.  udev won't build with it:
> 
> It seems that the problem is with videodev.h being removed from 2.6.38:[...]
> 
> Anyone have suggestions?

 

I have udev-162 and it compiles fine. What version do you have?

 *PaulBredbury wrote:*   

> Meh, BFS is tons better, with:
> 
> ```
> alias mplayer="ionice -c2 nice -n -4 /usr/bin/mplayer"
> ...
> ```

 

The difference is that 2.6.38 will automatically assign time slices without any overhead. I also used to renice (not the way you've shown, which I didn't know) but I found it boring in the end  :Very Happy:  . Let's say 2.6.38 and SCHED_AUTOGROUP are for the lazy then  :Wink:  .

----------

## Gusar

 *VinzC wrote:*   

> There are plenty of such questions with GNU/Linux! And the answer would always be: «because I can!»

 

Which is a valid answer as such. But still, -j16 on a dual-core has no real value, it's ridiculous. And if this "miracle" patch only does something at a ridiculous setting, it's equally valid to question the "miracleness" (yes, that's a word!  :Smile: ) of it.

----------

## graysky

 *Gusar wrote:*   

>  *VinzC wrote:*   
> 
> > There are plenty of such questions with GNU/Linux! And the answer would always be: «because I can!»
> 
> Which is a valid answer as such. But still, -j16 on a dual-core has no real value, it's ridiculous. And if this "miracle" patch only does something at a ridiculous setting, it's equally valid to question the "miracleness" (yes, that's a word! ) of it.

 

This is what Con argued: why optimize a kernel to perform under insane workloads when 99.99% of users don't use their machines that way?

----------

## jcTux

 *NathanZachary wrote:*   

> It seems that there is a problem with 2.6.38 though.  udev won't build with it:
> 
> It seems that the problem is with videodev.h being removed from 2.6.38:
> 
> http://comments.gmane.org/gmane.linux.hotplug.devel/16670
> ...

 

Udev compiles fine here 

```
USE="extras -devfs-compat -old-hd-rules (-selinux) -test"
```

```
uname -r
2.6.38-gentoo
```

----------

## VinzC

 *VinzC wrote:*   

> There are plenty of such questions with GNU/Linux! And the answer would always be: «because I can!»

 

 *Gusar wrote:*   

> Which is a valid answer as such. But still, -j16 on a dual-core has no real value, it's ridiculous. And if this "miracle" patch only does something at a ridiculous setting, it's equally valid to question the "miracleness" (yes, that's a word! ) of it.

 

 *graysky wrote:*   

> This is what Con argued.  Why optimize a kernel to perform at insane workloads when 99.99 % of the user don't use their machines as such.

 

Well, I really don't like percentages like this (i.e. what's this estimate based on?) but I agree with you: most GNU/Linux users probably won't load their system that much, hence the impact of the patch is limited. But the main advantage is there's no longer a need to manually tweak performance using a command. It's handled automatically, which clueless users will probably appreciate.

OTOH -j16 was indeed for the sake of testing. Of course it makes no sense to raise that number that much. [I've compiled hugin and then I saw the effect: computer not responsive for a while, high latency, high I/O rate and much longer waits.] But instead of restoring -j3, I'll keep -j5 for a while and see.

Finally, even if this patch might prove useless for a majority of users, it undoubtedly comes in handy with Gentoo just because compiling is a common task. No need to renice, just compile as much as you want and watch. That's the point.

----------

## NathanZachary

Strange.  I'm performing a new installation, and can't get past udev in my first system update.  However, it looks like the bug has been fixed with 164-r2:

https://bugs.gentoo.org/show_bug.cgi?id=359407

I'll try it now.  :Smile: 

----------

## NathanZachary

In case anyone is interested in the actual portion that solved the udev problem, I mention it in  the other thread about it.

----------

## VinzC

 *NathanZachary wrote:*   

> Strange.  I'm performing a new installation, and can't get past udev in my first system update.  However, it looks like the bug has been fixed with 164-r2:
> 
> https://bugs.gentoo.org/show_bug.cgi?id=359407
> 
> I'll try it now. 

 

Aaah, that explains why. Nothing on my system depends on V4L1, so I guess that's why udev compiles fine.

----------

## Aquous

Is anyone else with a Radeon GPU experiencing a strange flicker when switching from a TTY to X?

Apart from that, this kernel rocks. I've been running it for less than ten minutes and already my desktop feels smoother somehow.   :Very Happy: 

----------

## gorkypl

 *Aquous wrote:*   

> Is anyone else owning a Radeon GPU experiencing a strange flicker when switching from a TTY to X?

 

no - everything OK here with 6.14.1 open drivers and 2.6.38 kernel

----------

## Aquous

Weird. I see my TTY flash through my X screen for like a fifth of a second on my HD 5450 (mesa & libdrm from ~amd64, DDX from amd64).

Ah well. It's no problem at all, really, just a bit odd. This kernel (still) rocks.

----------

## Gusar

 *VinzC wrote:*   

> Finally, even if this patch might happen to prove useless for a majority of users, it doubtlessly comes handy with Gentoo just because compiling is a common task. No need to renice, just compile as much as you want and watch. That's the point.

 

But you yourself said this could already be done before, by just setting a sane -j. So where is the miracleness of this patch? Set a sane -j and possibly use BFS and there's no need for any "miracles".

----------

## VinzC

 *VinzC wrote:*   

> Finally, even if this patch might happen to prove useless for a majority of users, it doubtlessly comes handy with Gentoo just because compiling is a common task. No need to renice, just compile as much as you want and watch. That's the point.

 

 *Gusar wrote:*   

> But you yourself said this could already be done before, by just setting a sane -j. So where is the miracleness of this patch? Set a sane -j and possibly use BFS and there's no need for any "miracles".

 

One difference is at which value of -j my system stops being responsive. The purpose of the patch is to demonstrate how a system can still be responsive under an uncommonly heavy load and not only compiling. Before that my system would have been unresponsive with a low -j value (but still higher than 3), even without compiling hugin. Now with even -j16 it remains usable... except while compiling hugin. Another difference is now you don't have to *do* anything.

----------

## Gusar

 *Quote:*   

> The purpose of the patch is to demonstrate how a system can still be responsive under an uncommonly heavy load and not only compiling.

 

High -j is not "unusually high load", it's a ridiculous artificial scenario.

 *Quote:*   

> Another difference is now you don't have to *do* anything.

 

Before you followed documentation that told you how to set -j (number of cores plus one). I don't see how things are different now. In fact, you now had to specifically create an artificial scenario to show that the patch does something.

----------

## PaulBredbury

ulatencyd looks like it could be interesting, talking about "automatic" scheduling.

----------

## agent_jdh

Just to add my 2c here, on VinzC's side: I noticed a considerable responsiveness improvement with 2.6.38 over pf-sources-2.6.37 (using BFS etc.) while backing up files from my fileserver onto an external eSATA (NTFS) hard drive connected to my desktop. I typically have 3 Firefox windows open with, say, a dozen tabs in each, and with pf-sources, Firefox was _much_ less responsive while the copy was happening. Using kernel 2.6.38 with essentially the same config (no BFS, obviously) but with CONFIG_SCHED_AUTOGROUP enabled, Firefox was much more responsive during the same copy procedure; in fact it behaved pretty much as if Firefox was the only thing running.

This is on a Core i5 760, so it is not a slow machine. The culprit behind the high CPU usage during the actual copy would appear to be the ntfs-3g driver.

I'll be interested to see a 2.6.38-based pf-sources kernel.

----------

## PaulBredbury

 *agent_jdh wrote:*   

> Firefox was _much_ less responsive while the copy was happening

 

Use the ionice command, as in my examples above.

----------

## agent_jdh

 *PaulBredbury wrote:*   

>  *agent_jdh wrote:*   
> 
> > Firefox was _much_ less responsive while the copy was happening
> 
> Use the ionice command, as in my examples above.

 

Yeah, but now I don't have to.  That has to be progress, right?

----------

## Gusar

Just one thing I'm curious about... Have you tried a 2.6.38 kernel but without autogroup? To make sure it's really the autogroup thing that brings the improvement and not some other goodies the 38 kernel has.
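
As far as I know, that comparison doesn't need a rebuild: a kernel built with CONFIG_SCHED_AUTOGROUP exposes /proc/sys/kernel/sched_autogroup_enabled, and booting with `noautogroup` turns the feature off. A sketch for checking the current state follows; `autogroup_state` is a made-up helper name.

```shell
# Print the autogroup state read from a sysctl-style file containing 0 or 1.
# autogroup_state is a hypothetical helper; the default path is the sysctl
# file that autogroup-enabled kernels are expected to provide.
autogroup_state() {
    state_file=${1:-/proc/sys/kernel/sched_autogroup_enabled}
    if [ ! -r "$state_file" ]; then
        echo "unavailable"
        return 0
    fi
    case $(cat "$state_file") in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}

autogroup_state
# As root, switching it off for an A/B test would look like:
#   echo 0 > /proc/sys/kernel/sched_autogroup_enabled
```

Running the same workload once with the value at 1 and once at 0 on the same 2.6.38 kernel would separate the autogroup effect from the other changes in that release.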

----------

## Anon-E-moose

Use whatever works and makes you happy.   :Cool: 

----------

## VinzC

 *Gusar wrote:*   

> High -j is not "unusually high load", it's a ridiculous artificial scenario.

 

 Just remove «unusual» then  :Laughing: . Now you don't seem convinced (kay, I got it) but even Linus Torvalds saw a big improvement. I just confirmed his findings... if they still have to be confirmed!  :Very Happy: 

----------

## jormartr

 *jcTux wrote:*   

>  *NathanZachary wrote:*   
> 
> > It seems that there is a problem with 2.6.38 though.  udev won't build with it:
> > 
> > It seems that the problem is with videodev.h being removed from 2.6.38:
> > 
> > http://comments.gmane.org/gmane.linux.hotplug.devel/16670
> > ...

 

I have just had the same problem. Do you have sys-kernel/linux-headers unmasked to its latest version? What worked for me was removing the unmask and using the stable version (afterwards I also downgraded the kernel to 2.6.36 stable).

----------

## VoidMage

There's already a revision of udev with a fix in the tree.

----------

## PaulBredbury

Further evidence that 2.6.38 is fast - "transparent huge pages".

Of course, 2.6.38+BFS will be even better  :Razz: 

----------

## VinzC

 *PaulBredbury wrote:*   

> Further evidence that 2.6.38 is fast - "transparent huge pages".
> 
> Of course, 2.6.38+BFS will be even better 

 

I guess you mean Zen Sources? I can't wait to check it. I was using it until 2.6.35 or 2.6.36 after which I could no longer see it in portage  :Sad:  .

OTOH I've had to revert to 2.6.37 temporarily because I've had a couple of kernel Oopses while the screensaver was active. The issue always occurred when I cancelled the screensaver, either by moving my mouse or by pressing a key. I don't know if that's because of the video driver, the video BIOS, the GL screensavers I'm using, or something else. Maybe I should enable crashdumps...

----------

## cach0rr0

 *VinzC wrote:*   

>  *PaulBredbury wrote:*   
> 
> > Further evidence that 2.6.38 is fast - "transparent huge pages".
> > 
> > Of course, 2.6.38+BFS will be even better
> 
> I guess you mean Zen Sources? I can't wait to check it.

 

zen or ck-sources, either one once it hits .38

kernelOfTruth actually has a patchset he seems to be maintaining himself; possibly it includes BFS as well (I haven't looked)

----------

## Anon-E-moose

Con has stated that he hasn't started working on BFS for 2.6.38 yet, (it came out quicker than he expected) and he was busy on some other project, but it sounds like he will be looking at porting it soon.

----------

## PaulBredbury

I use pf kernel patchset, for its convenience.

----------

## VinzC

 *PaulBredbury wrote:*   

> I use pf kernel patchset, for its convenience.

 

Thanks. Will try it.

----------

## Yamakuzure

 *Gusar wrote:*   

>  *Quote:*   
> 
> > The purpose of the patch is to demonstrate how a system can still be responsive under an uncommonly heavy load and not only compiling.
> 
> High -j is not "unusually high load", it's a ridiculous artificial scenario.

 No and no. Sorry, but on a quad-core cpu the recommended value is -j9 with which my laptop became completely unusable before the "cgroups-console-hack" was invented whenever I emerged packages with --jobs in the EMERGE_DEFAULT_OPTS. With the console-hack (now superseded by 2.6.38 with auto scheduler) I can keep on working like there was no load. In the meantime portage did a world update, vmware is running with windows xp (needed for a cisco vpn tunnel to a customer), several openoffice documents are open, kmail, knode, amarok is playing music and typing this text goes without problem. Load of my system: 18.5 to 25.0 -- unthinkable before those cgroups methods came up.

And this is not a "ridiculous artificial scenario", it is something I have twice a week. You know, I _do_ want to get the updates finished as soon as possible without having to go for a walk for hours because I can't use my machine...

But I do have a problem with gentoo-sources-2.6.38 : Although the load distribution works, all programs I start need 5 to 10 times longer than before the kernel upgrade. How can this be?

----------

## Anon-E-moose

Woohoo... from Con's blog 

 *Quote:*   

> and BFS for 2.6.38 can be grabbed here:

 

----------

## bollucks

Jobs == number of CPUs is the fastest way to finish a build. The overloaded idea goes way back to when scheduling was bad. Increasing the number these days just overloads other resources in your machine like the I/O subsystem and the Virtual Memory subsystem.

See: http://ck.kolivas.org/patches/bfs/reverse-scalability.png which refers to a quad core machine.

So when using BFS you don't need to overload your system.

----------

## Gusar

 *Yamakuzure wrote:*   

>  *Gusar wrote:*   
> 
> > High -j is not "unusually high load", it's a ridiculous artificial scenario.
> 
> No and no. Sorry, but on a quad-core cpu the recommended value is -j9

 

Err, the recommendation was always cores+1. So -j9 would be the recommendation if you have a hyperthreaded quad-core. And with BFS, the recommendation is without the +1.

 *Yamakuzure wrote:*   

> with which my laptop became completely unusable before the "cgroups-console-hack" was invented whenever I emerged packages with --jobs in the EMERGE_DEFAULT_OPTS. With the console-hack (now superseded by 2.6.38 with auto scheduler) I can keep on working like there was no load. In the meantime portage did a world update, vmware is running with windows xp (needed for a cisco vpn tunnel to a customer), several openoffice documents are open, kmail, knode, amarok is playing music and typing this text goes without problem. Load of my system: 18.5 to 25.0 -- unthinkable before those cgroups methods came up.
> 
> And this is not a "ridiculous artificial scenario", it is something I have twice a week. You know, I _do_ want to get the updates finished as soon as possible without having to go for a walk for hours because I can't use my machine...

 

Hmm, I wonder... How fast would the process be with a reasonable -j setting? Just as fast, I'd say. And how usable would the machine be while doing it?

This all still looks to me like "if I overload my machine with a high -j, it's not responsive, and this cgroup thing helps." Well, don't overload your machine and you won't need miracle patches.

 *Anon-E-moose wrote:*   

> Woohoo... from Con's blog
> 
>  *Quote:*   
> 
> > and BFS for 2.6.38 can be grabbed here:

 

That was fast. His previous message read as if it'll take at least a week or two.

----------

## pilla

hyperthread != 2 cores. In some cases, performance can be even worse with hyperthreads enabled. Or at least that was the case when they started shipping pentium 4 HT chips.

----------

## wswartzendruber

This might be the answer to my issues of VirtualBox completely stalling whenever both cores are busy.

I'm still waiting for hardened-sources-2.6.38.

----------

## peter4

I don't get it at all. Can someone give me a real life example when this patch would make a positive impact on desktop responsiveness? Before you ask, make -j16 is not a real life example.  :Confused: 

----------

## Yamakuzure

 *Gusar wrote:*   

>  *Yamakuzure wrote:*   
> 
> >  *Gusar wrote:*   
> > 
> > > High -j is not "unusually high load", it's a ridiculous artificial scenario.
> > 
> > No and no. Sorry, but on a quad-core cpu the recommended value is -j9
> 
> Err, the recommendation was always cores+1. So -j9 would be the recommendation if you have a hyperthreaded quad-core. And with BFS, the recommendation is without the +1.

 Depends where you look.

x86-handbook: Nr of CPUs + 1

amd64-handbook: No recommendation

distcc-handbook: "A common strategy is setting N as twice the number of total CPUs + 1 available"

 *Gusar wrote:*   

> Hmm, I wonder... How fast would the process be with a reasonable -j setting? Just as fast, I'd say. And how usable the machine would be while doing it?
> 
> This all still looks to me like "if I overload my machine with a high -j, it's not responsive, and this cgroup thing helps." Well, don't overload your machine and you won't need miracle patches.

I do not know whether you followed the "Unresponsiveness" thread in the amd64 board, but even -j5 (your recommendation) and a load below 4.0 made the machine unusable prior to those patches. And as I want to be able to use parallel merges (yes, the speed gain is enormous!), it will go beyond 4.0 for sure.

 *Gusar wrote:*   

>  *Anon-E-moose wrote:*   
> 
> > Woohoo... from Con's blog
> > 
> >  *Quote:*   
> > 
> > > and BFS for 2.6.38 can be grabbed here:
> 
> That was fast. His previous message read as if it'll take at least a week or two.

*sigh* This "BFS-Advertising" is unnerving. Look, the problem with responsiveness under heavy load (not necessarily "overload") is a common one. Telling people over and over that they have to patch kernel sources by hand from an external source solves nothing. If BFS is this good, when will it make it into the main tree then? (*) (answer: never.)

 *pilla wrote:*   

> hyperthread != 2 cores. In some cases, performance can be even worse with hyperthreads enabled. Or at least that was the case when they started shipping pentium 4 HT chips.

 Hyperthreading on Pentium4/Xeon-netburst and Hyperthreading on Pentium Core-i are not really the same. I tried the kernel with nothing, SCHED_SMT, SCHED_MC and both. And the latter was performing best.

(*): don't get me wrong: I really like the idea of the BFS scheduler. The main reason it is not going into mainline is that it does not scale beyond 12-16 logical CPUs (a 16-CPU machine would suffer from BFS), which is against the mainline policy of having to scale up to 4096 CPUs. But if it did wonders, there would be a solution. Until then it is just an optional hack-by-hand for people who know what they are doing and want to handle the consequences.

 *peter4 wrote:*   

> I don't get it at all. Can someone give me a real life example when this patch would make a positive impact on desktop responsiveness? Before you ask, make -j16 is not a real life example. 

 I did already. But here it is again:

On my laptop a vmware workstation runs with Windows XP. At the same time I am doing a world update. I have several documents open with openoffice. Kontact sits on another screen for e-mail, calendar, news feeds and Usenet. Amarok is playing music in the background and I am typing this text.

More details on why this was an unthinkable scenario (even without the world update) prior to the first cgroup hack can be found here:

https://forums.gentoo.org/viewtopic-t-793263.html

----------

## PaulBredbury

 *Yamakuzure wrote:*   

> But if it did wonders, there would be a solution. Until then it is just an optional hack-by-hand for people who know what they are doing and want to handle the consequence.

 

You're talking nonsense. The solution, as in pf-sources, is to replace a scheduler designed for 4096 CPUs with a scheduler designed for 1-16-ish CPUs, which is what people actually have in desktops/laptops. Different scheduler designs.

BFS isn't in the mainline kernel, because of crappy politics  :Crying or Very sad: 

The "consequence", with a tiny bit of BASH aliasing as I showed above, is a smoother desktop experience than with the mainline kernel.

This is Gentoo, ya know. What's with the aversion to anything "by-hand"? We're supposed to be optimizing by hand  :Wink:   In comparison, the wonder patch is an Ubuntu-type scheduling hack.

----------

## Gusar

 *pilla wrote:*   

> hyperthread != 2 cores. In some cases, performance can be even worse with hyperthreads enabled. Or at least that was the case when they started shipping pentium 4 HT chips.

 

HT has evolved since then. And I know it's not the same as two real cores. But hey, my netbook can decode 720p thanks to it. And my Core i3 desktop benefits from it too. It's a very nice thing to have.

@Yamakuzure: BFS isn't in the kernel because of personality clashes. No other reason, just that.

----------

## VinzC

 *PaulBredbury wrote:*   

> What's with the aversion to anything "by-hand"? We're supposed to be optimizing by hand  

 

Entropy, my friend, entropy  :Wink:  .

 *peter4 wrote:*   

> I don't get it at all. Can someone give me a real life example when this patch would make a positive impact on desktop responsiveness? Before you ask, make -j16 is not a real life example. 

 

It's simple.

One side of the debate running here is about whether it makes sense to push one's machine load with a high -j value. That discussion has drifted somewhat from the initial topic and now has nothing to do with the patch itself.

The patch indeed allows a machine to still be responsive under a high load. The proof is that even with this (granted: stupid) example of -j16 or -j64, a machine still is responsive, which was clearly impossible before. It demonstrates the patch clearly brings a huge enhancement on high loads. The example being stupid doesn't demonstrate the patch doesn't work; the patch *does* work and reaches the goal it was targeting.

The (silly) example hence does demonstrate the patch works (I love to paraphrase myself). This means you can trust the patch to bring the announced enhancements.

My point was that since Gentoo involves compiling a lot, which usually implies a higher load for a certain amount of time, the patch comes in handy for the duration of the compile process: it allows your machine to stay responsive while compiling. It also comes in handy because you can slightly increase the -j value (above the general rule 2*CPU+1) to truly load your CPU and keep it at 100% most of the time *and* still use your machine without having to wait. That's the example I'd give in reply to your question. Now it's up to you to find a good use case for loading your machine appropriately; use whatever makes you happy  :Wink: .

All in all, my point is that the patch really does what had been announced.

Now whether it's useful or not, whether the same level of responsiveness can be achieved some other way, I don't really care. It'll be useful to me and I *will* use it. Period. All the noise being made on the side is none of my concern  :Very Happy:  .

----------

## Anon-E-moose

 *Yamakuzure wrote:*   

>  *Gusar wrote:*   
> 
> >  *Anon-E-moose wrote:*   
> > 
> > > Woohoo... from Con's blog
> > > 
> > >  *Quote:*   
> > > 
> > > > and BFS for 2.6.38 can be grabbed here:
> > 
> > That was fast. His previous message read as if it'll take at least a week or two.
> 
> *sigh* This "BFS-Advertising" is unnerving. Look, the problem with responsiveness on heavy load (not necessarily "overload") is a common problem. Keeping on telling people they had to patch kernel sources from an external source by hand solves nothing. If BFS was this good, when will it make into the main tree then? (*) (answer: never.)

 

Others have mentioned why it won't make it to the "main tree", but having used it for quite a while now, I've not seen problems.

Bottom line, don't like BFS, don't use it, simple as that.

----------

## peter4

 *VinzC wrote:*   

> My point was that since Gentoo involves compiling a lot, which usually implies a higher load for a certain amount of time, the patch comes handy for the duration of the compile process: it allows your machine to still be responsive while compiling.

 I just ran an experiment. I started a kernel build with nice make -j2 (which is more of a real-life example) and started up a 1080p movie in mplayer. The playback was perfectly smooth.

I can also say that I never had any problems even with 1080p fullscreen youtube videos during compiling (with -j2).

EDIT: all of it on 2.6.37 without the "wonder patch", forgot to mention that.

----------

## StalkerNOVA

Didn't get it. In pf-sources 2.6.38 r1 not all options can be found for cgroups...

----------

## cach0rr0

 *VinzC wrote:*   

> 
> 
> The patch indeed allows a machine to still be responsive under a high load. The proof is that even with this (granted: stupid) example of -j16 or -j64, a machine still is responsive, which was clearly impossible before. It demonstrates the patch clearly brings a huge enhancement on high loads. The example being stupid doesn't demonstrate the patch doesn't work; the patch *does* work and reaches the goal it was targeting.

 

I guess the point is, somewhat, that with BFS -j16 is unnecessary/superfluous. 

That's what I'm gathering from reading this. Not an expert myself, but I will say shit just plain builds faster with BFS. Some time when you're bored, boot a kernel with BFS, build a package with -j2. Then boot into a regular ol' kernel using what is it, CFS? Do the same build with -j2, and time both.

----------

## VinzC

 *cach0rr0 wrote:*   

> I guess the point is, somewhat, that with BFS -j16 is unnecessary/superfluous. 
> 
> That's what I'm gathering from reading this. Not an expert myself, but I will say shit just plain builds faster with BFS. Some time when you're bored, boot a kernel with BFS, build a package with -j2. Then boot into a regular ol' kernel using what is it, CFS? Do the same build with -j2, and time both.

 

Well, it's really not about how long it takes with or without the patch. Just watch /proc/loadavg with and without it and observe when your computer starts lagging. With the patch, your computer will start lagging later (that is, at a higher load average).

Now I'll probably need to compare this with BFS. I don't think I ever took the time (or knew how) to assess its benefits.
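
For anyone who wants to try that comparison, here is a minimal sketch of how the load average could be sampled while a build runs. The function names `load1` and `watch_load` are made up for illustration; only /proc/loadavg itself comes from the above.

```sh
#!/bin/sh
# Print the 1-minute load average from a loadavg-style file.
# The file defaults to /proc/loadavg; it is a parameter only so the
# function can also be tried on a saved sample.
load1() {
    # /proc/loadavg looks like: "5.73 4.35 3.05 2/345 6789"
    cut -d ' ' -f 1 "${1:-/proc/loadavg}"
}

# Print a timestamped sample every N seconds (default 5) until interrupted.
watch_load() {
    interval="${1:-5}"
    while :; do
        printf '%s %s\n' "$(date +%T)" "$(load1)"
        sleep "$interval"
    done
}
```

Run `watch_load 2` in a second terminal during the build and note the value at which the desktop starts to stutter, once with and once without the patch.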

----------

## krinn

Using a high -j16 or more is not stupid for a CPU, it's even a necessity if you use distcc and have other cores/CPUs that need to be fed, so stop assuming a high -jX is a stupid value for a CPU. Like all options, it could be stupid with a high value for someone, and stupid with a low value for someone else...

----------

## albright

For what it is worth, with 2.6.38 (with or without BFS), my hourly rsync backup (snapback2 based) quite often makes KDE apps unresponsive for several seconds.

My rsync is ioniced and niced but to no avail.

I *think* that 2.6.38 is somewhat better than earlier kernels (the app freeze is of shorter duration when it happens and I *think* it happens less frequently).

Just my 2 cents.

----------

## PaulBredbury

 *albright wrote:*   

> My rsync is ioniced and niced but to no avail.

 

There's more to it than that. Look at e.g. the commit= mount option.

Also, make sure that you are using CFQ (the I/O scheduler, Q for Queuing, not CFS, the CPU scheduler) rather than BFQ in the kernel, because IIRC, BFQ doesn't support ionice.

```
$ zgrep CFQ /proc/config.gz
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_CFQ=y
```
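
The config only tells you what is compiled in. To see which I/O scheduler a disk is actually using at runtime, the sysfs `scheduler` file lists the available ones with the active one in square brackets. A small sketch, where the helper name `active_iosched` and the device name `sda` are assumptions:

```sh
# Print the active I/O scheduler named in a sysfs "scheduler" file.
# The file looks like "noop deadline [cfq]"; the bracketed entry is active.
active_iosched() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Example (device name is an assumption, adjust to your disk):
#   active_iosched /sys/block/sda/queue/scheduler
# Running a command in the idle I/O class and at the lowest CPU priority:
#   ionice -c 3 nice -n 19 <command>
```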

----------

## Yamakuzure

@PaulBredbury :

@Gusar :

@Anon-E-moose :

Could you all please _read_ a post completely before answering like that? Thanks. Here is what I mean: *Anon-E-moose wrote:*   

> Bottom line, don't like BFS, don't use it, simple as that.

 While I wrote: *Yamakuzure wrote:*   

> Don't understand me wrong: I really like the idea of the BFS scheduler.

 And about politics: *BFS FAQs wrote:*   

> How scalable is it?
> 
> I don't own the sort of hardware that is likely to suffer from using it, so I can't find the upper limit. Based on first principles about the overhead of locking, and the way lookups occur, I'd guess that a machine with 16 CPUS or more would start to have exponentially less performance (thanks Ingo for confirming this). 
> 
> Are you looking at getting this into mainline?
> ...

 Finally:

Do you think that setting your -j value to the number of CPU(-core)s really makes full use of your CPU, given all the I/O on every single compiler operation, even with /var/tmp/portage in tmpfs? There is a difference between IOWait, User and System time that you cannot deny.

----------

## cach0rr0

 *Yamakuzure wrote:*   

> And about politics: *BFS FAQs wrote:*   How scalable is it?
> 
> I don't own the sort of hardware that is likely to suffer from using it, so I can't find the upper limit. Based on first principles about the overhead of locking, and the way lookups occur, I'd guess that a machine with 16 CPUS or more would start to have exponentially less performance (thanks Ingo for confirming this). 
> 
> Are you looking at getting this into mainline?
> ...

 

I do hope you are detecting the large, large amount of sarcasm coming from Con there. 

If you bold the right parts it becomes more obvious:

 *Quote:*   

> 
> 
> a machine with 16 CPUS or more would start to have exponentially less performance (thanks Ingo for confirming this)
> 
> 

 

(and it's not the first time very talented people with useful patches have been ignored by mainline)

 *Quote:*   

> 
> 
> since it won't scale to their 4096 cpu machines
> 
> 

 

(the joke of course being, "I'm so glad they used their 4096 core machines to test with and confirm the issue, or else all the other people out there with 4096 core machines might suffer!")

 *Quote:*   

> 
> 
> The only way is to rewrite it to work that way, or to have more than one scheduler in the kernel. I don't want to do the former, and mainline doesn't want to do the latter.
> 
> 

 

So, yes, politics. The FAQ is loaded with sarcasm, and subtle daggers at the mainline folks for refusing to have a second scheduler in the kernel. If you didn't see it above, they won't accept BFS unless he rewrites it to scale to 4096 CPUs, which he, for good reason, has no intention of doing.

----------

## BenderBendingRodriguez

I honestly don't understand where the problem lies: leave CFS as the default, but just introduce BFS and write in the documentation that it is for desktop PCs. People who have 4096 CPUs compile their own kernels anyway and have enough expertise to know BFS won't scale to their "mediocre" CPU count  :Wink: 

----------

## neocrust

I think that kernel 2.6.3* with BFS/BFQ is better than kernel 2.6.38 with "~200 lines" patch  :Rolling Eyes: 

At least for me  :Smile: 

----------

## Yamakuzure

Yes, I do believe as well that a kernel with BFS performs better on a desktop/notebook style machine. For such machines BFS is a marvelous scheduler. 

@cach0rr0 : Yes, I do understand the FAQ I quoted. I thought it was clear enough, so I did not explain it.   :Rolling Eyes:  ... But thanks for doing so.  :Wink: 

But why are you linking a discussion about PaX versus Exec-Shield that is 8 years old? PaX is great for a hardened system (like production servers) that values protection higher than user privileges. Something like PaX should go nowhere near the mainline kernel, as it is something no normal desktop user could live with easily. (Or at all...)

BFS is something completely different. We work on servers with 16+ CPUs, so BFS would be a big no-no on those. But for desktops and notebooks I really don't see why it is denied from mainline. I *do* understand the politics behind that squabbling, but that doesn't mean I agree with their attitude.

@Thread:

However, this thread is about the new cgroups auto scheduler, and I did some experiments with it.

The best "balance" of load I can get is when using:

```
MAKEOPTS="-j15 -l8"
EMERGE_DEFAULT_OPTS="${EMERGE_DEFAULT_OPTS} --jobs --load-average=8.0"
```

Although the load limit is double the number of logical CPUs (4), I can get htop to show a CPU usage of 90-95% while merging multiple packages. The process trees then show that most processes are in D state.

I guess these settings would be too high for a system with more than one hard drive, or even a RAID, as disk wait (or IOWait in general) might be a lot less significant there. For my single-hard-drive notebook it seems to work, and the machine stays responsive all the time.
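
As a rough sketch of how those values relate, the load limit can be derived from the CPU count. The "twice the logical CPUs" factor is just what worked on this notebook, and the helper name `makeopts_for` is made up:

```sh
# Build a MAKEOPTS string with a load limit of twice the logical CPU count.
# $1 = number of logical CPUs, $2 = maximum parallel jobs (-j).
makeopts_for() {
    printf '%s\n' "-j$2 -l$(( $1 * 2 ))"
}

# On Linux the logical CPU count can usually be read with:
#   getconf _NPROCESSORS_ONLN
makeopts_for 4 15    # the values used above, prints: -j15 -l8
```

The -j value is passed in rather than computed, since how far it can go above the CPU count depends on how much of the build is waiting on the disk.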

----------

## Gusar

 *krinn wrote:*   

> Using a high -j16 or more is not stupid for a CPU, it's even a necessity if you use distcc and have other cores/CPUs that need to be fed, so stop assuming a high -jX is a stupid value for a CPU. Like all options, it could be stupid with a high value for someone, and stupid with a low value for someone else...

 

Err, with distcc you're not using a CPU, you're using several CPUs. So of course a high -j makes sense there, to feed all of them. What's the connection between distcc and "it's not stupid for a single CPU", how do you go from one to the other?

 *Yamakuzure wrote:*   

> However, this thread is about the new cgroups auto scheduler, and I did some experiments with it.
> 
> The best "balance" of load I can get when using:
> 
> Code:
> ...

 

Did you time how long it takes to finish all tasks with that setup? And did you time how long the same tasks would take with BFS and a -j that's equal to the number of cores?

----------

## DestroyFX

Here are the results of the timed kernel compilation test I made, with my X6 CPU:

time make -j6

real	2m43.303s

user	13m2.339s

sys	2m1.649s

time make -j7

real	2m44.023s

user	13m6.413s

sys	2m1.957s

time make -j8

real	2m44.176s

user	13m10.244s

sys	2m2.877s

time make -j12

real	2m50.535s

user	13m42.303s

sys	2m7.399s

This was with 2.6.38-rcSomething, without BFS, with the wonder patch.

The optimal compilation speed was with -jCore. The performance goes down with -jCore+1 and more.

I remember that with 2.6.36 and earlier, -jCore+1 was better, except with BFS, which required -jCore (and was faster than non-BFS with -jCore+1).
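
A series like the one above can be scripted. This is only a sketch: the function name `time_builds` and the `MAKE` override are assumptions, it relies on `date +%s` (available on GNU and BSD userlands), and it measures whole seconds only, so use the `time` builtin as above for finer numbers.

```sh
# Time a clean build for each -j value passed as an argument.
# MAKE can be overridden (e.g. MAKE=true) to dry-run the loop itself.
time_builds() {
    for j in "$@"; do
        "${MAKE:-make}" clean >/dev/null 2>&1
        start=$(date +%s)
        "${MAKE:-make}" -j"$j" >/dev/null 2>&1
        end=$(date +%s)
        printf 'j=%s: %ss\n' "$j" "$((end - start))"
    done
}

# Run from the top of a kernel source tree, for example:
#   time_builds 6 7 8 12
```

Running each value more than once helps, because the first build also warms the page cache for the ones that follow.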

----------

## asturm

For me, 2.6.38 currently mostly impresses with silence. Even 2.6.38.2 didn't fix sound and there are no fixes currently in stable-queue for 2.6.38, so I'm trying to revert the sound tree back to 2.6.37...

----------

## VinzC

 *genstorm wrote:*   

> For me, 2.6.38 currently mostly impresses with silence. Even 2.6.38.2 didn't fix sound and there are no fixes currently in stable-queue for 2.6.38, so I'm trying to revert the sound tree back to 2.6.37...

 

 :Laughing: 

So what is your sound chipset?

----------

## dennisn

I just tried "make oldconfig"ing from 2.6.37, to 2.6.38.2, as I've done countless times before, except this time something is seriously wrong -- the second grub loads the new kernel, there is an endless stream of backtrace-type debug messages that flood my screen -- I have no way of reading what's going on. What the heck happened?!?

----------

## VinzC

 *dennisn wrote:*   

> I just tried "make oldconfig"ing from 2.6.37, to 2.6.38.2, as I've done countless times before, except this time something is seriously wrong -- the second grub loads the new kernel, there is an endless stream of backtrace-type debug messages that flood my screen -- I have no way of reading what's going on. What the heck happened?!?

 

Weird. That's how I usually configure/install new kernels too...

----------

## jbouzan

 *VinzC wrote:*   

>  *dennisn wrote:*   I just tried "make oldconfig"ing from 2.6.37, to 2.6.38.2, as I've done countless times before, except this time something is seriously wrong -- the second grub loads the new kernel, there is an endless stream of backtrace-type debug messages that flood my screen -- I have no way of reading what's going on. What the heck happened?!? 
> 
> Weird. That's how I usually configure/install new kernels too...

 

I did that and it works fine. Bad compile perhaps? I once had a kernel that didn't work, and with no changes to its .config it worked after a recompile. File corruption or something. Or perhaps support for some hardware was dropped, which your computer needed.

----------

## wuzzerd

In a new install using the exact same 2.6.38.2  there was no sound although it worked in the old borked install.  The sound card(s) , mixers etc. all showed up in /dev.  Yesterday it started working after an emerge -uDN world.   Go figure.  I did switch from the default profile to desktop.

----------

## VinzC

The only serious glitch I'm experiencing so far is a kernel Oops in TTM. I have noted down the message partly but it always occurs while the screensaver is active. It looks like a threading issue. Will post what I have noted down later in this thread. Since I've installed 2.6.38-gentoo-r1 I get this panic message when the screensaver is active. Until then the panic message only occurred when unlocking the screen (or cancelling the saver).

----------

## dennisn

2.6.38.2 also broke my resuming from hibernation, which was working just fine in 2.6.36. On my x86_64 Acer Ferrari laptop, I get a "general protection fault" and kernel panic (in swapper / swsusp_arch_resume) whenever I try resuming. No good.

----------

## PaulBredbury

 *dennisn wrote:*   

> broke my resuming from hibernation

 

Known  upstream apparently.

----------

## Yamakuzure

 *DestroyFX wrote:*   

> Here are the results of the timed kernel compilation test I made, with my X6 CPU:
> 
> (... snip ...)
> 
> The optimal compilation speed was with -jCore. The performance goes down with -jCore+1 and more.
> ...

 As this is something seldom used, I tried emerging geany, nano, mlview, kile and hexedit with different MAKEOPTS:

With MAKEOPTS="-j4" (I leave the list in once for reference)

```
 # time emerge --oneshot geany nano mlview kile hexedit
>>> Emerging (1 of 5) app-editors/hexedit-1.2.12
>>> Emerging (2 of 5) dev-util/geany-0.20
>>> Emerging (3 of 5) app-editors/nano-2.2.5
>>> Installing (1 of 5) app-editors/hexedit-1.2.12
>>> Emerging (4 of 5) app-editors/mlview-0.9.0
>>> Emerging (5 of 5) app-editors/kile-2.1_beta5
>>> Installing (3 of 5) app-editors/nano-2.2.5
>>> Installing (2 of 5) dev-util/geany-0.20
>>> Installing (5 of 5) app-editors/kile-2.1_beta5
>>> Installing (4 of 5) app-editors/mlview-0.9.0
>>> Jobs: 5 of 5 complete                           Load avg: 5.73, 4.35, 3.05

real    3m59.929s
user    8m1.822s
sys     1m5.482s
```

With MAKEOPTS="-j7"

```
real    2m10.122s
user    2m23.061s
sys     0m36.148s
```

With MAKEOPTS="-j11"

```
real    1m37.223s
user    2m25.297s
sys     0m34.484s
```

With MAKEOPTS="-j15"

```
real    1m36.465s
user    2m24.854s
sys     0m34.618s
```

So for emerging the gain in speed stalls, but doesn't get worse with more parallel jobs.

I am testing kernel compilation next, but I reckon the results will be the same as yours.

===  Edit  ===

And here are the tests with the kernel:

```
 # make clean && time make -j 4

real    4m18.876s
user    13m24.801s
sys     1m9.835s
```

```
 # make clean && time make -j 7

real    4m35.801s
user    14m7.941s
sys     1m11.109s
```

```
 # make clean && time make -j 11

real    4m16.216s
user    14m7.978s
sys     1m9.202s
```

```
 # make clean && time make -j 15

real    4m23.011s
user    14m11.338s
sys     1m9.929s
```

This certainly is weird. While using -j7 performs worse than -j4 (like suggested by your tests), -j15 performs better than -j7 and -j11 performs better than -j4. It's not much, but it is strange... (Or not so strange at all if I understood more of the internal mechanics  :Wink: )

----------

## VinzC

 *Yamakuzure wrote:*   

> This certainly is weird. While using -j7 performs worse than -j4 (like suggested by your tests), -j15 performs better than -j7 and -j11 performs better than -j4. It's not much, but it is strange... (Or not so strange at all if I understood more of the internal mechanics )

 

It isn't to me. "Performs" is not the appropriate term. The patch doesn't improve compile times, it just improves responsiveness, which is totally different. It means, for instance, that at equal load, groups of interactive processes in the same session have a better chance of catching keyboard/mouse events on time and actually *doing* the requested operation in a timely fashion. But globally I expect compile times to be just slightly longer than before.

That only changes when the amount of I/O grows so much that the excess of I/O operations introduces latency. So compile times aren't a good measure of the patch's efficiency. Don't expect compile times to vary significantly then.

----------

## Yamakuzure

Well, yes, during all these time tests my laptop was perfectly responsive all the time. I just did the tests because someone stated that it would be stupid to use a different value for the -j option than the number of logical CPUs. That has been proven wrong, now.  :Wink: 

----------

## cach0rr0

just to be a pest and BFS fanboy

test during which my desktop (KDE-4.6.2 with compositing enabled, bluray rip playing, and music playing just for grins) responds as though nothing else is going on

quad core phenom

```

#make clean && time make -j4

real    2m53.167s
user    8m58.095s
sys     0m43.128s

ricker linux # ls -alh arch/x86/boot/bzImage 
-rw-r--r-- 1 root root 5.9M Apr  7 03:17 arch/x86/boot/bzImage #roughly half of this is embedded initramfs

```

----------

## Anon-E-moose

 *Yamakuzure wrote:*   

> Well, yes, during all these time tests my laptop was perfectly responsive all the time. I just did the tests because someone stated that it would be stupid to use a different value for the -j option than the number of logical CPUs. That has been proven wrong, now. 

 

The only thing you've proven is that in running a few tests that things worked well on your system  :Rolling Eyes: 

----------

## Anon-E-moose

I've been running BFS for the last several kernel versions going back to 2.6.32 or thereabouts and I've had consistently good performance.

I'm glad that they are finally addressing the problems with performance for those not running BFS.

So once again, for those who don't want to run BFS then run whatever you want.

And for those who want to run BFS it's there.

----------

## VanFanel

I've been using BFS since 2.6.32: thanks for the ck-sources package, it's what makes my system so great!  :Smile: 

But.. would you recommend automatic task grouping over BFS for emulation?

I'm trying to build a near-zero latency system for both audio & input in emulators and I don't know if BFS is the best option for it.

regards

----------

## Anon-E-moose

 *VanFanel wrote:*   

> I've been using BFS since 2.6.32: thanks for the ck-sources package, it's what makes my system so great! 
> 
> But.. would you recommend automatic task grouping over BFS for emulation?
> 
> I'm trying to build a near-zero latency system for both audio & input in emulators and I don't know if BFS is the best option for it.
> ...

 

The only way to know for sure on your system is to switch to the task grouping and rebuild the kernel.

You can have both and switch back and forth to see which gives you better performance on your system.

----------

## Ant P.

Using schedtool will probably help a bit there too.

----------

## VanFanel

@Ant P. : I use schedtool like this:

```
schedtool -I -e ./mame tekken2
```

any other advice for the lowest possible input/output latency in emulators?

----------

## jbouzan

 *PaulBredbury wrote:*   

>  *dennisn wrote:*   broke my resuming from hibernation 
> 
> Known  upstream apparently.

 

Annoying, but since .38 patches a problem I was having with btrfs I can't really switch back. Any idea whether the patch will be reversed?

----------

## epsilon72

 *PaulBredbury wrote:*   

>  *dennisn wrote:*   broke my resuming from hibernation 
> 
> Known  upstream apparently.

 

This seems to happen with 2.6.37-r4 as well though

----------

## piedar

 *jbouzan wrote:*   

>  *PaulBredbury wrote:*    *dennisn wrote:*   broke my resuming from hibernation 
> 
> Known  upstream apparently. 
> 
> Annoying, but since .38 patches a problem I was having with btrfs I can't really switch back. Any idea whether the patch will be reversed?

 

Just apply the patch yourself in the meantime.

https://patchwork.kernel.org/patch/690691/

----------

## pilla

 *marshmallow1304 wrote:*   

> 
> 
> Just apply the patch yourself in the meantime.
> 
> https://patchwork.kernel.org/patch/690691/

 

404 Not found

2.6.38-gentoo-r2 is still segfaulting Xorg for me.

----------

## VinzC

 *pilla wrote:*   

> 2.6.38-gentoo-r2 is still segfaulting Xorg for me.

 

Does it crash right away or after a certain, random period of time?

----------

## asturm

Sound is ok with unpatched 2.6.38.3 now

----------

## Anon-E-moose

I was getting a few random seg-faults with 2.6.38.2 one in less (man) one in tcp (during system boot) 

I haven't noticed any since going to 2.6.38.3.

----------

## pilla

 *VinzC wrote:*   

>  *pilla wrote:*   2.6.38-gentoo-r2 is still segfaulting Xorg for me. 
> 
> Does it crash right away or after a certain, random period of time?

 

After some random time, usually not so long.

----------

## VinzC

 *pilla wrote:*   

> 2.6.38-gentoo-r2 is still segfaulting Xorg for me.

  *VinzC wrote:*   

> Does it crash right away or after a certain, random period of time?

 

 *pilla wrote:*   

> After some random time, usually not so long.

 

I think we might experience the same bug.

EDIT: Apparently there's a pending fix for radeon video cards. I'll try it and report if the bug is fixed. I think if there's no issue in a week or so then the patch may appear to actually fix the issue.

----------

## robinmarlow

I too am having problems with 2.6.38.  I compile it using my working config from 2.6.37 & answering the questions posed by make oldconfig.

The resulting kernel gives a blank screen with no hd activity or response to magic keys.  There are no logs & no clues as to what's gone wrong!

I have tried compiling vanilla as well as gentoo sources with the same result.  I'm now trying excluding sections of the config e.g. audio individually to see if they are causing the problem - but it's rather time consuming!

How should I troubleshoot it & does anyone know what the problem is?

Thanks,

Robin

----------

## jbouzan

Interesting. I'd say use make oldconfig and then manually check with make nconfig or make xconfig for your video, hard drive, and filesystem drivers to be compiled in.
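
One way to narrow that down without walking through every menu is to list the options that were enabled in the old config but are no longer set in the new one. A minimal sketch, where the function name `lost_options` and the example paths are assumptions:

```sh
# List options that were =y or =m in the old config but are no longer
# set to anything in the new one.
# $1 = old .config, $2 = new .config
lost_options() {
    grep -E '^CONFIG_[A-Za-z0-9_]+=[ym]$' "$1" | cut -d= -f1 |
    while read -r opt; do
        grep -q "^${opt}=" "$2" || printf '%s\n' "$opt"
    done
}

# Example (paths are assumptions):
#   lost_options /usr/src/linux-2.6.37/.config /usr/src/linux-2.6.38/.config
```

Anything it prints that relates to your video, disk controller or root filesystem is worth a second look. An option that was merely renamed between versions will show up here too.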

----------

## wswartzendruber

Phoronix is reporting massive power consumption issues with this kernel.  Anyone else notice this?

----------

## VinzC

 *robinmarlow wrote:*   

> I too am having problems with 2.6.38.  I compile it using my working config from 2.6.37 & answering the questions posed by make oldconfig.
> 
> The resulting kernel gives a blank screen with no hd activity or response to magic keys.  There are no logs & no clues as to what's gone wrong!
> 
> I have tried compiling vanilla as well as gentoo sources with the same result.  I'm now trying excluding sections of the config e.g. audio individually to see if they are causing the problem - but it's rather time consuming!
> ...

 

You might want to try Pappy's Kernel Seeds or this thread. You «just» need to add your hardware-specific options like drivers, firmware...

 *wswartzendruber wrote:*   

> Phoronix is reporting massive power consumption issues with this kernel.  Anyone else notice this?

 

I'm watching Phoronix too. It even looks like there is a «growing increase» in consumption since 2.6.25.

----------

## pilla

On my ThinkPad T60, 2.6.39-rc5 consumes 26W instead of the 20W of 2.6.35.

----------

## Jaglover

This automatic group scheduling is not for me. I turned it off with -r4 and got my responsive desktop back. I have a Core2 Duo and dual monitors, and most of the time MythTV is running on the second one. With group scheduling on, every time I started something on the first monitor the MythTV picture stopped for a fraction of a second, very annoying.

----------

## red-wolf76

I don't get it. Whatever happened to the choice factor? Even for mainline kernel development, we get to choose between CFQ and no-op, even if the latter is really not a viable alternative. There's tons of stuff in the kernels marked "Experimental" with rolls of red tape around it.

What's wrong with including the new scheduler and then slapping a big warning on top of it? "This scheduler is reported to have better performance on systems with fewer than 16 CPUs. If you know what you're doing and have such a system, you may want to select this. If you have no idea or don't care, choose CFS instead."

End of story. Make it available to people that are sensible enough to actually read the help for kernel settings - provided there is any...

Honestly... This politics crap sounds like even programmers aren't immune to start masturbating on kindergarten power games once they get bored... Why not let Joe User decide?

----------

## Joseph K.

VinzC, did you already post a link to this kernel bug in regard to the screen saver oops that you mentioned?  (I had a feeling that you did, but then I couldn't find it.)  I hit this screen saver crash only every couple of days, but it's enough to motivate me to want to debug it.  Do you want to open a Gentoo bug or a new forum thread for it if there isn't one already?

----------

## VinzC

 *Joseph K. wrote:*   

> VinzC, did you already post a link to this kernel bug in regard to the screen saver oops that you mentioned?  (I had a feeling that you did, but then I couldn't find it.)

 

Yes, I did in this post (a little above).

 *Joseph K. wrote:*   

> Do you want to open a Gentoo bug or a new forum thread for it if there isn't one already?

 

Well, as I saw there is already a bug open on kernel.org, I didn't feel like doing it on Gentoo's side. Fact is there doesn't seem to be much progress on that lately.

----------

## Bircoph

No problems with 2.6.38 on both laptop and server so far (I use vanilla-sources).

Transparent hugepages is a really nice feature in this kernel, it measurably speeds up world upgrade.

Also I can't understand why so many people are so interested in the cgroups scheduler. I tried this feature and it really sucks, because it hurts both throughput and peak multimedia performance (think of 1080p h264 on an Atom N270 without frame drops). All these tests with -jMANY and "my desktop is still responsive" look completely synthetic and impractical to me: just learn how to use nice and ionice, this will save both your system performance and user experience. Of course, anyone is free to configure their kernels as they like (that is the Gentoo way after all), but I personally see absolutely no gain from cgroups and refuse to use them for now.

----------

## Gusar

 *Bircoph wrote:*   

> think of 1080p h264 on Atom N270 without frame drops

 

Err, the thing can barely decode 720p, are you really playing 1080p on it? Without hardware help (Nvidia ION or Broadcom CrystalHD) it's not possible.

----------

## Bircoph

 *Gusar wrote:*   

> 
> 
> Err, the thing can barely decode 720p, are you really playing 1080p on it? Without hardware help (Nvidia ION or Broadcom CrystalHD) it's not possible.

 

You forgot to add "on unoptimized software", perhaps.

I use SHE + ffmpeg-mt + a heavily optimized general system + skiploopfilter for lavdopts (this doesn't hurt quality in general).

Of course, video is downscaled during playback to fit 1024x600 resolution.

----------

## Gusar

What is SHE? And if you're skipping loop filtering, that's something else, I don't know what happens then. I do know that it actually does degrade video quality, sometimes very visibly so. And sometimes it causes artifacts, because the video does motion compensation on filtered frames. Also, what's the bitrate of those 1080p videos?

----------

## Bircoph

 *Gusar wrote:*   

> What is SHE?

 

Super Hybrid Engine is a hardware clock frequency control for eeepc, including light overclocking and moderate underclocking to save battery power: http://event.asus.com/notebook/bamboo/external5.htm

 *Quote:*   

> I do know that it actually does degrade video quality, sometimes very visibly so. And sometimes it causes artifacts, because the video does motion compensation on filtered frames.

 

Read mplayer's manual then:

```

                 skiploopfilter=<skipvalue> (H.264 only)
                      Skips the loop filter (AKA deblocking) during H.264 decoding.  Since the
                      filtered frame is supposed to be used as reference for decoding dependent
                      frames this has a worse effect on quality than not doing deblocking on
                      e.g. MPEG-2 video.  But at least for high bitrate HDTV this provides a big
                      speedup with no visible quality loss.

                      <skipvalue> can be either one of the following:
                         none: Never skip.
                         default: Skip useless processing steps (e.g. 0 size packets in AVI).
                         nonref: Skip frames that are not referenced (i.e. not used for decoding
                         other frames, the error cannot "build up").
                         bidir: Skip B-Frames.
                         nonkey: Skip all frames except keyframes.
                         all: Skip all frames.

```

This is just a deblocking skip. You won't need this for 1080p in most cases anyway. And you can always safely skip deblocking for at least B-frames.

 *Quote:*   

> Also, what's the bitrate of those 1080p videos?

 

I tested on different samples, most of them are 25 fps digital video, some are 24 fps.

----------

## Joseph K.

 *VinzC wrote:*   

> Well, as I saw there is already a bug open on kernel.org, I didn't feel like doing it on Gentoo's side. Fact is there doesn't seem to be much progress on that lately.

 

Yeah, but I think it's a good way to draw the Gentoo kernel maintainers' attention to the bugs that particularly affect us, which can draw interest and spur progress on them.  Now that 2.6.39 is out, though, I may be inclined to try using that in the hope that this bug was fixed.  If I can get a screen shot of the oops, I'll open a Gentoo bug for it.   :Smile: 

----------

## Gusar

 *Bircoph wrote:*   

> Super Hybrid Engine

 

Ah, marketing buzzword for on-the-fly CPU governor switching.

 *Bircoph wrote:*   

> This is just a deblocking skip. You won't need this for 1080p in most cases anyway.

 

The in-loop deblocker is an integral part of h264 encoding/decoding. If it was done during encoding, then turning it off for decoding degrades the picture and can lead to artifacts. That you aren't bothered by the difference is something else. I can imagine it's probably not noticeable in the netbook display, but on anything bigger than 15'' ...

 *Bircoph wrote:*   

> I tested on different samples, most of them are 25 fps digital video, some are 24 fps.

 

That's fps, I asked for bitrate.

----------

## Bircoph

 *Gusar wrote:*   

>  *Bircoph wrote:*   Super Hybrid Engine 
> 
> Ah, marketing buzzword for on-the-fly CPU governor switching.
> 
> 

 

You failed to read documentation again. SHE is completely independent from CPU governors and affects not only CPU.

 *Quote:*   

> 
> 
> The in-loop deblocker is an integral part of h264 encoding/decoding. If it was done during encoding, then turning it off for decoding degrades the picture and can lead to artifacts. That you aren't bothered by the difference is something else. I can imagine it's probably not noticeable in the netbook display, but on anything bigger than 15''
> 
> 

 

I often use it on my 19" desktop, artifacts are rare and I have never seen them on high-res video. And deblocking is not always done during encoding. Anyway, the goal is to be able to watch a 1080p movie on a netbook with acceptable quality. It is not my goal to obtain a mathematically equivalent image after decoding.

 *Quote:*   

> 
> 
> That's fps, I asked for bitrate.

 

Bitrate is variable, obviously, and may vary greatly during playback. The average for the samples, estimated from duration and file size with the audio tracks taken into account, varies from sample to sample as well, but is 8 kbps.
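
For anyone who wants to repeat that kind of estimate, here is a minimal sketch of the arithmetic: total bitrate from file size and duration, minus the audio bitrate. The function name and the sample numbers are hypothetical, and container overhead is ignored, so treat the result as a rough figure only.

```python
def average_video_bitrate_kbps(file_size_bytes, duration_s, audio_kbps=0.0):
    """Estimate the average video bitrate in kbit/s (1 kbit = 1000 bits).

    Assumption: everything in the file that is not audio is video,
    i.e. container overhead and subtitles are ignored.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_kbps = file_size_bytes * 8 / 1000 / duration_s
    return total_kbps - audio_kbps


if __name__ == "__main__":
    # Hypothetical sample: 4.5 GB file, 2 hours long, one 448 kbit/s audio track.
    estimate = average_video_bitrate_kbps(4_500_000_000, 7200, audio_kbps=448)
    print(f"average video bitrate: {estimate:.0f} kbit/s")
```

Since this is only an average, peaks during playback can be well above it, which is what matters for whether a slow machine keeps up.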

----------

## Cyker

Has anyone had a problem with high iowait in 2.6.38 after several hours/days of use?

I ran into this with -r1. It seems that a kernel task called khugepaged goes into the D state for some reason, and then one of my cores gets hit with 100% iowait.
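
If you want to check for the same symptom, a rough sketch like the following lists tasks currently in the D (uninterruptible sleep) state by reading `/proc/<pid>/stat`. The helper names are made up for illustration; it only looks at one entry per process, which is enough to catch a kernel thread such as khugepaged.

```python
import os


def parse_stat(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx])
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def tasks_in_d_state(proc="/proc"):
    """Return [(pid, comm), ...] for every process whose state is 'D'."""
    found = []
    if not os.path.isdir(proc):
        return found  # not a Linux system, or /proc is not mounted
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited meanwhile, or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in tasks_in_d_state():
        print(pid, comm)
```

A task that shows up briefly in D state is normal; the problem case is one that stays there across repeated runs.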

It seems to be related to transparent hugepages. I have disabled them completely, and while there is a noticeable performance hit (not big, just noticeable), so far the problem hasn't recurred... but it might be too early to judge...!
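
As a quick way to see what mode transparent hugepages are in without rebuilding the kernel, here is a small sketch that reads the sysfs switch `/sys/kernel/mm/transparent_hugepage/enabled`. The file lists the available modes and marks the active one with square brackets, e.g. `[always] madvise never`. The function names are just for illustration.

```python
from pathlib import Path

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the active mode, i.e. the word wrapped in square brackets."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Return the active THP mode, or None if the sysfs file is unavailable."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None  # THP not built into this kernel, or not running Linux


if __name__ == "__main__":
    print("transparent hugepages:", current_thp_mode())
```

Writing `never` to that same file as root switches THP off at runtime, and booting with `transparent_hugepage=never` does it from the kernel command line, so it can be tested without disabling the option in the kernel config.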

----------

## Gusar

 *Bircoph wrote:*   

> You failed to read the documentation again. SHE is completely independent of CPU governors and affects more than just the CPU.

 

Ok, I've read a bit about it. It scales the FSB frequency instead of the CPU frequency. The highest modes also do a bit of overclocking, and then there are modes that do dynamic scaling. So while I oversimplified, it's still just a buzzword for frequency scaling and governor switching.

 *Bircoph wrote:*   

> And deblocking is not always done during encoding.

 

Deblocking is very, very rarely left out. I'll bet you won't find videos out there that don't have it, unless they're conformance test suites or the person encoding is doing some sort of test.

----------

## soya

Hi, to make use of the feature, do I just need to enable SCHED_AUTOGROUP, or is there something more I should do?

----------

## VinzC

 *soya wrote:*   

> Hi, to make use of the feature, do I just need to enable SCHED_AUTOGROUP, or is there something more I should do?

 

 *In his first post, VinzC wrote:*   

> Be sure to check CONFIG_CGROUPS and CONFIG_CGROUP_SCHED in your kernel config for it's not enabled by default.

   :Wink: 

----------

## depontius

I believe SCHED_AUTOGROUP either turns on the necessary features, or isn't available until they are on, or some mix of the two.  I didn't do a heck of a lot - I may have enabled CGROUPS, but I just turned on SCHED_AUTOGROUP and then several other options popped in, too.

----------

## tabanus

I'm really disappointed with this patch.

I have several Gentoo PCs (all dual core with 2GB RAM), some amd64, some x86, and still get lock-ups on some compiles (e.g. firefox-5, kdelibs) with MAKEOPTS="-j3" in /etc/make.conf.

Am I missing something, or expecting too much?

I have CONFIG_SCHED_AUTOGROUP, CONFIG_CGROUPS and CONFIG_CGROUP_SCHED all set in the kernel .config.
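
To double-check that, a sketch along these lines reports which of the three options are missing from a kernel config and whether autogroup is switched on at runtime. The `/usr/src/linux/.config` path is an assumption (the usual Gentoo symlink, which may not point at the running kernel's sources), and the helper names are made up for illustration.

```python
import re

WANTED = ("CONFIG_SCHED_AUTOGROUP", "CONFIG_CGROUPS", "CONFIG_CGROUP_SCHED")


def enabled_options(config_text):
    """Return the set of options set to y or m in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=([ym])$", line.strip())
        if match:
            enabled.add(match.group(1))
    return enabled


def missing_options(config_text, wanted=WANTED):
    """Return the wanted options that are not enabled, in the given order."""
    on = enabled_options(config_text)
    return [opt for opt in wanted if opt not in on]


if __name__ == "__main__":
    try:
        # Assumed location; adjust to wherever your .config lives.
        with open("/usr/src/linux/.config") as fh:
            print("missing:", missing_options(fh.read()) or "none")
    except OSError as exc:
        print("could not read kernel config:", exc)
    try:
        # Runtime switch for autogroup: 1 = on, 0 = off.
        with open("/proc/sys/kernel/sched_autogroup_enabled") as fh:
            print("autogroup enabled at runtime:", fh.read().strip())
    except OSError:
        print("no sched_autogroup_enabled file; autogroup not in this kernel")
```

If the options are set in .config but the `/proc` file is missing, the running kernel is probably not the one that was built from that config.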

----------

