# 25% performance gain in encoding with HZ=1000 vs HZ=100

## devsk

Contrary to the popular belief that HZ=100 gives better CPU performance for a task because of fewer interrupts and a larger time slice for the task to finish, I found that I got a whopping 25% increase in encoding speed by moving from HZ=100 to HZ=1000. The result is repeatable every time. I rebuilt the kernel and reran the encoding job 3 times to verify.

The only variable changed in the test is the kernel's HZ setting: one run uses a kernel compiled with HZ=100 and the other uses HZ=1000. All the hardware and software remain the same. This is on a Core i7 920 with HT on, overclocked to 4.2GHz.

```
2.6.30.5 HZ=100:

23:45:55 root@localhost /usr/src

# time ./makeKernel

real    1m13.855s (make -j19)

real    1m51.702s (overall, including nvidia,compcache, VB, VMplayer, lirc, genkernel initramfs)

216.17fps for JC projectA2 one pass x264 with 6 threads in mencoder.

2.6.30.5 HZ=1000:

23:45:55 root@localhost /usr/src

# time ./makeKernel

real    1m13.576s (make -j19)

real    1m52.485s (overall, including nvidia,compcache, VB, VMplayer, lirc, genkernel initramfs)

271.41fps for JC projectA2 one pass x264 with 6 threads in mencoder.
```

Can someone else verify this please? I can't explain this behavior. Can you?
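As a quick sanity check on the figures in this post, here is a minimal Python sketch that works out the relative changes. The constant and function names are made up for illustration; the numbers are copied from the output above.

```python
# Figures copied from the benchmark output in the first post.
FPS_HZ100 = 216.17       # x264 one-pass encode, HZ=100
FPS_HZ1000 = 271.41      # x264 one-pass encode, HZ=1000
BUILD_S_HZ100 = 73.855   # "make -j19" real time, 1m13.855s
BUILD_S_HZ1000 = 73.576  # "make -j19" real time, 1m13.576


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


encode_gain = percent_change(FPS_HZ100, FPS_HZ1000)
build_change = percent_change(BUILD_S_HZ100, BUILD_S_HZ1000)

print(f"x264 encode throughput: {encode_gain:+.1f}%")
print(f"make -j19 wall time:    {build_change:+.1f}%")
```

The encode throughput comes out at roughly +25.6%, while the kernel build time changes by well under 1%, so the two workloads react very differently to the HZ change.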

----------

## devsk

While trying to dig up reasons for this behavior, I hit this thread from back in 2001 and it has the explanation.

http://www.x86-64.org/pipermail/discuss/2001-March/001932.html

 *Quote:*   

> The advantage of having a higher-resolution clock are that multithreaded applications can have much better response times. A common complaint that we hear with GNAT in that respect is that a "delay 0.001" statement, will
> ...

It actually makes a lot of sense. The encoding benchmark is multi-threaded and a lot of syncing has to happen among the threads. If a syncing wait can only be as granular as 10ms (one tick at HZ=100), then a lot of time is wasted by the kernel in bringing the waiting task back to work. 10ms is VERY long for modern processors.
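To put numbers on that granularity argument, here is a rough Python sketch. It assumes a timed wait can only expire on a tick boundary, which is a simplification: kernels with high-resolution timers enabled may not behave this way for every call. The function names are hypothetical.

```python
import math


def tick_period_ms(hz):
    """Length of one timer tick in milliseconds for a given HZ value."""
    return 1000.0 / hz


def tick_rounded_wait_ms(requested_ms, hz):
    """Wait time if a timeout is rounded up to a whole number of ticks.

    Assumption (not verified against any particular kernel): timeouts
    expire only on tick boundaries and always last at least one tick.
    """
    period = tick_period_ms(hz)
    ticks = max(1, math.ceil(requested_ms / period))
    return ticks * period


# A 1 ms wait, like the "delay 0.001" in the quoted mail.
for hz in (100, 250, 300, 1000):
    print(f"HZ={hz:4d}: tick = {tick_period_ms(hz):5.2f} ms, "
          f"1 ms wait takes about {tick_rounded_wait_ms(1.0, hz):5.2f} ms")
```

Under this model a 1 ms wait costs about 10 ms at HZ=100 but only about 1 ms at HZ=1000, which is the kind of difference that could add up in a thread-synchronization-heavy encoder.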

No wonder Linus made HZ=1000 the default some time back.

----------

## kernelOfTruth

nice find devsk !

you by chance also have CONFIG_NO_HZ=y enabled in your kernel ?

if yes I have to try this too - I love performance  :Wink: 

it probably won't be 25% more performance but still some more (Core 2 Duo E6600)

thanks !

----------

## devsk

 *kernelOfTruth wrote:*   

> nice find devsk !
> 
> you by chance also have CONFIG_NO_HZ=y enabled in your kernel ?
> 
> if yes I have to try this too - I love performance 
> ...

Yes, I have a tickless kernel. I tried HZ=300 for fun: the FPS I got for one pass was 254. So, for multithreaded apps, HZ=1000 is the fastest of the three options I tried. The overhead of the extra timer interrupts is more than compensated by smaller scheduling delays and shorter synchronization waits.

----------

## devsk

 *kernelOfTruth wrote:*   

> nice find devsk !
> 
> you by chance also have CONFIG_NO_HZ=y enabled in your kernel ?
> 
> if yes I have to try this too - I love performance 
> ...

 What HZ do you use now? Can you please post back here with your results?

----------

## Mike Hunt

In fact the help in menuconfig recommends 1000 HZ *Quote:*   

> 1000 Hz is the preferred choice for desktop systems and other systems requiring fast interactive responses to events.

 So I always use that.  :Smile: 

----------

## kernelOfTruth

 *devsk wrote:*   

>  *kernelOfTruth wrote:*   nice find devsk !
> 
> you by chance also have CONFIG_NO_HZ=y enabled in your kernel ?
> 
> if yes I have to try this too - I love performance 
> ...

 

I'm currently using:

 *Quote:*   

> CONFIG_NO_HZ=y
> 
> # CONFIG_HZ_100 is not set
> 
> # CONFIG_HZ_108 is not set
> ...

 

I guess compiling a kernel with 

```
make -j 20
```

 should suffice to make it comparable ?

----------

## devsk

Oh, you had exactly my settings. You will be very pleased with your encoding (or any other multi-threaded app) improvements once you move to HZ=1000.

Wait! You don't seem to be using a vanilla kernel, though, because it has all those weird-looking HZ values. Con's kernel?

----------

## kernelOfTruth

 *devsk wrote:*   

> Oh, you had exactly my settings. You will be very pleased with your encoding (or any other multi-threaded app) improvements once you move to HZ=1000.
> 
> Wait! You don't seem to be using a vanilla kernel, though, because it has all those weird-looking HZ values. Con's kernel?

 

zen-sources: 2.6.30-zen5   :Wink: 

BFS (Con's new CPU scheduler) isn't usable on more than one processor yet (at least for me and others)

----------

## snIP3r

hi all!

I am wondering if this setting is also applicable to a (home) server? Currently I am using these settings:

```
CONFIG_NO_HZ=y
CONFIG_HZ_250=y
CONFIG_HZ=250
```

greets

snIP3r
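For anyone who wants to check which timer settings a kernel config carries, here is a small Python sketch that pulls the two relevant values out of `.config`-style text. The function name and the returned dictionary layout are made up for illustration; the sample text is the config fragment from the post above.

```python
import re


def parse_timer_config(config_text):
    """Extract the CONFIG_HZ value and CONFIG_NO_HZ flag from .config text."""
    settings = {"hz": None, "no_hz": False}
    for line in config_text.splitlines():
        line = line.strip()
        match = re.fullmatch(r"CONFIG_HZ=(\d+)", line)
        if match:
            settings["hz"] = int(match.group(1))
        elif line == "CONFIG_NO_HZ=y":
            settings["no_hz"] = True
    return settings


# The settings quoted in the post above.
sample = """
CONFIG_NO_HZ=y
CONFIG_HZ_250=y
CONFIG_HZ=250
"""

result = parse_timer_config(sample)
print(result)
```

On a real system the text could be read from the kernel source tree's `.config`, or from `/proc/config.gz` if the running kernel was built with that option; which of these is available depends on the setup.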

----------

## Naib

 *kernelOfTruth wrote:*   

>  *devsk wrote:*   oh, you had exact my settings. You will be very pleased with your encoding (or any other multi-threaded app) improvements once your move to HZ=1000.
> 
> Wait! You don't seem to be using vanilla kernel though because it has all those weird looking HZ values. Con's kernel? 
> 
> zen-sources: 2.6.30-zen5  
> ...

wait what? his new scheduler isn't that good for multicores?

----------

## ok

 *Naib wrote:*   

>  *kernelOfTruth wrote:*    *devsk wrote:*   Oh, you had exactly my settings. You will be very pleased with your encoding (or any other multi-threaded app) improvements once you move to HZ=1000.
> 
> Wait! You don't seem to be using a vanilla kernel, though, because it has all those weird-looking HZ values. Con's kernel? 
> 
> zen-sources: 2.6.30-zen5  :wink:
> ...

 

http://thread.gmane.org/gmane.linux.kernel/886319

----------

## kernelOfTruth

 *Naib wrote:*   

>  *kernelOfTruth wrote:*    *devsk wrote:*   Oh, you had exactly my settings. You will be very pleased with your encoding (or any other multi-threaded app) improvements once you move to HZ=1000.
> 
> Wait! You don't seem to be using a vanilla kernel, though, because it has all those weird-looking HZ values. Con's kernel? 
> 
> zen-sources: 2.6.30-zen5  
> ...

 

it is  :Wink: 

the problem right now seems to be that it triggers some kind of race condition; because of that, lots of unkillable tasks persist and new tasks can't be created

this makes daily usage or usage under heavy load currently impossible (perhaps x86_64 specific; I haven't read about multiprocessor and i686 usage yet)

----------

## devsk

Can someone else also confirm this result? It only takes two reboots...and a kernel compile... :Very Happy: 

----------

## Naib

 *ok wrote:*   

>  *Naib wrote:*    *kernelOfTruth wrote:*    *devsk wrote:*   Oh, you had exactly my settings. You will be very pleased with your encoding (or any other multi-threaded app) improvements once you move to HZ=1000.
> 
> Wait! You don't seem to be using a vanilla kernel, though, because it has all those weird-looking HZ values. Con's kernel? 
> 
> zen-sources: 2.6.30-zen5  
> ...

 

Well I am talking about a couple of cores NOT 4096 (since this sched is for desktops NOT render farms) and from this thread here it seems that BFS is rubbish for cores >1

----------

## kernelOfTruth

 *Naib wrote:*   

>  *ok wrote:*    *Naib wrote:*    *kernelOfTruth wrote:*    *devsk wrote:*   Oh, you had exactly my settings. You will be very pleased with your encoding (or any other multi-threaded app) improvements once you move to HZ=1000.
> 
> Wait! You don't seem to be using a vanilla kernel, though, because it has all those weird-looking HZ values. Con's kernel? 
> 
> zen-sources: 2.6.30-zen5  
> ...

 

it's only rubbish for cores > 1 on 64bit because it seems to trigger some race conditions and bugs, whereas on 32bit it literally seems to fly and is rockstable   :Very Happy: 

so give it a try if you're not running 64bit  :Wink: 

----------

## Naib

ahh, 64bit here...

oh well

----------

## hephooey

 *kernelOfTruth wrote:*   

> 
> 
> it's only rubbish for cores > 1 on 64bit because it seems to trigger some race conditions and bugs, whereas on 32bit it literally seems to fly and is rockstable  
> 
> so give it a try if you're not running 64bit 

 

I tried the BFS patch on a 32 bit system. I still have processes ending up in the 'D' (uninterruptible sleep) state. The kernel works fine in a text console, but after I start KDE the bug is eventually triggered.

BTW, I only applied the BFS patch, will other patches like "clone-fix-race-between-copy_process-and-de_thread.patch" and "autoiso-xorg.patch" make a difference?

----------

## devsk

 *hephooey wrote:*   

>  *kernelOfTruth wrote:*   
> 
> it's only rubbish for cores > 1 on 64bit because it seems to trigger some race conditions and bugs, whereas on 32bit it literally seems to fly and is rockstable  
> 
> so give it a try if you're not running 64bit  
> ...

 OT...where is your result from HZ=1000 vs. HZ=100 comparison? you better post it now.

----------

## kernelOfTruth

 *devsk wrote:*   

>  *hephooey wrote:*    *kernelOfTruth wrote:*   
> 
> it's only rubbish for cores > 1 on 64bit because it seems to trigger some race conditions and bugs, whereas on 32bit it literally seems to fly and is rockstable  
> 
> so give it a try if you're not running 64bit  
> ...

 

nothing significant - it got even worse   :Laughing: 

 *Quote:*   

> Hz = 250
> 
> real	13m52.757s
> 
> user	24m8.242s
> ...

 

the main difference I "felt" was that it's somewhat more responsive / it feels snappier 

so AMD and i7 users might be the ones really profiting from this change   :Sad: 

----------

## devsk

 *kernelOfTruth wrote:*   

>  *devsk wrote:*   OT...where is your result from HZ=1000 vs. HZ=100 comparison? you better post it now. 
> 
> nothing significant - it got even worse  
> 
>  *Quote:*   Hz = 250
> ...

Is that a multi-threaded encoding test or something else? How many threads? What did you run for 14 minutes? What CPU is it?

With the i7 920, I get the best results with x264 if I use 6 threads. With any fewer or more, the numbers get worse.

----------

## kernelOfTruth

 *devsk wrote:*   

>  *kernelOfTruth wrote:*    *devsk wrote:*   OT...where is your result from HZ=1000 vs. HZ=100 comparison? you better post it now. 
> 
> nothing significant - it got even worse  
> 
>  *Quote:*   Hz = 250
> ...

 

it was a plain kernel-compilation with -j 20   :Smile: 

----------

## devsk

OK. It won't help the kernel compile (I posted those numbers in post 1 as well). The scenario where HZ=1000 will help is when there are a lot of threads and a lot of synchronization involved among those threads.

It may help in firefox I think.
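One rough way to probe whether a machine is sensitive to this is to time very short sleeps from several threads at once. The Python sketch below does that; it is not a reproduction of the x264 benchmark, the function names are made up, and on kernels with high-resolution timers the overshoot may turn out small regardless of HZ.

```python
import threading
import time


def measure_sleep_overshoot(requested_s=0.001, samples=200):
    """Return the mean extra time (seconds) a short sleep takes beyond the request."""
    total_extra = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_s)
        total_extra += (time.perf_counter() - start) - requested_s
    return total_extra / samples


def measure_in_threads(thread_count=6, **kwargs):
    """Run the measurement concurrently in several threads, one result per thread."""
    results = [None] * thread_count

    def worker(index):
        results[index] = measure_sleep_overshoot(**kwargs)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for i, extra in enumerate(measure_in_threads()):
        print(f"thread {i}: mean overshoot {extra * 1000:.3f} ms")
```

Running it once under each kernel and comparing the per-thread overshoot would give a hint about how much timed waits are affected by the tick rate on that particular setup. The default of 6 threads simply mirrors the thread count used for the encode in this thread.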

----------

## kernelOfTruth

 *devsk wrote:*   

> OK. It won't help the kernel compile (I posted those numbers in post 1 as well). The scenario where HZ=1000 will help is when there are a lot of threads and a lot of synchronization involved among those threads.
> 
> It may help in firefox I think.

 

you're right !   :Very Happy: 

combined with bfq (<-- successor to cfq ?) it's much more responsive even during heavy i/o (rsyncing several GiBs of data),

bfq, vr and the fifo-scheduler can be found in zen-sources

----------

## Naib

works well here: 64bit, dual core.

Booting is silly fast and things do feel "responsive", though I don't know how much of that is placebo, mind

----------

## neuron

I know I'm bumping an ancient thread here, but has anyone done tests with 100Hz/1000Hz on a single thread with a Turbo Boost CPU? Would Turbo Boost kick in more aggressively with a higher HZ?

----------

## xibo

Hmmm, the last time I checked the performance of those was with my Pentium 3. I guess the time(ing)s have changed >_<

Is 1000Hz generally _not_slower_ than 100Hz, or is that just on Core i/Core 2 or similar "recent" systems?

----------

