# cpufreq for centrino: scaling_max_freq decreasing over time

## kristoffer

I've recently noticed a very disturbing behaviour produced by (I think) CPU frequency scaling on my Centrino laptop (Dell Latitude X1 @ 1100 MHz). Under heavy load, the maximum frequency available to the scaling governor (1100 MHz, as shown by for example /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq) steps down to the next lower frequency (800 MHz) after just a few minutes. Eventually it moves on from 800 MHz to 600 MHz, effectively disabling frequency scaling, since the minimum is also 600 MHz, and making the system horrible for compiling and other heavy work. Without load, i.e. while only downloading and browsing the web at a constant 600 MHz, I've never noticed this happening with the 'ondemand' governor.
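To watch for the drop described above, one option is to read the cpufreq files directly and compare the current ceiling with the hardware maximum. The following is only a minimal sketch: the helper names `read_cpufreq` and `is_capped` are made up for illustration, and it assumes the kernel exposes `scaling_min_freq`, `scaling_max_freq`, `scaling_cur_freq` and `cpuinfo_max_freq` (values in kHz) under each `cpufreq` directory.

```python
from pathlib import Path

# Files read from each cpuN/cpufreq directory; values are in kHz.
FIELDS = ("scaling_min_freq", "scaling_max_freq",
          "scaling_cur_freq", "cpuinfo_max_freq")


def read_cpufreq(cpu_dir):
    """Return a dict of cpufreq values (kHz) for one cpufreq directory.

    Missing or unreadable files are reported as None.
    """
    values = {}
    for name in FIELDS:
        try:
            values[name] = int((Path(cpu_dir) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def is_capped(values):
    """True if the governor's ceiling is below the hardware maximum."""
    ceiling = values.get("scaling_max_freq")
    hardware = values.get("cpuinfo_max_freq")
    if ceiling is None or hardware is None:
        return False
    return ceiling < hardware


if __name__ == "__main__":
    base = Path("/sys/devices/system/cpu")
    for cpu_dir in sorted(base.glob("cpu[0-9]*/cpufreq")):
        values = read_cpufreq(cpu_dir)
        print(cpu_dir.parent.name, values, "capped" if is_capped(values) else "")
```

Running this in a loop during a compile would show when the ceiling steps down, though it says nothing about what lowered it.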

I'm not completely sure yet whether this depends on which governor I have chosen (changing or reloading the governor doesn't help, however), or on the number of switches between frequencies due to changing CPU load (the fact that the problem occurs much more often under heavy load, when the frequency is changed more often, suggests this). I'm quite sure that none of this happened some months ago, but I don't know what kernels and configurations I used then. I have been troubleshooting this today and tested with a couple of different kernels: gentoo-sources: 2.6.15-r1, 2.6.16-r9 and suspend2-sources: 2.5.16-r8, 2.6.16-r7. All of them have this problem.

I'm going to try some other governors, specifically 'userspace' (fixing each available frequency for some time while compiling) and perhaps 'performance', and try to figure out from that whether it's the number of frequency changes that produces the problem. Disabling cpufreq altogether might be interesting too. I might also try a kernel from six months back or so, if nothing else bears fruit. I am, however, very interested in your reflections on this issue. Possible fixes, clues or further guidance on how to troubleshoot this -- anything is welcome!

----------

## Pallokala

Hi!

The first thing that came to my mind was that the temperature of your CPU is getting too high and the system/hardware is limiting the maximum frequency. You didn't say whether the maximum frequency rises again after the heavy CPU work is done and the machine is idle. If it does, I would guess temperature is the cause of the problem.

I understood that you haven't yet tried using only the performance governor. If this also happens with that governor and without speedstep support, then it is most certainly related to hardware.

I have noticed with one Asus laptop (Pentium 4M) that after about one year the CPU fan is much noisier, and when I have time I guess I'll have to open the laptop and see if there is dust inside. It might be limiting the airflow and raising the temperature.

So check that nothing is limiting the airflow inside or around the computer. A rise in room temperature (winter/summer) certainly makes a difference too.

----------

## beatryder

What sort of scripts/daemons are you using to control your cpufreq?

----------

## kristoffer

 *Pallokala wrote:*   

> Hi!
> 
> The first thing that came to my mind was that the temperature of your CPU is getting too high and the system/hardware is limiting the maximum frequency. You didn't say whether the maximum frequency rises again after the heavy CPU work is done and the machine is idle. If it does, I would guess temperature is the cause of the problem.
> 
> I understood that you haven't yet tried using only the performance governor. If this also happens with that governor and without speedstep support, then it is most certainly related to hardware.
> ...

 

Maybe you're right about it being a temperature problem. When I booted up today, with a cool processor, it was able to stay at 1.10 GHz much longer than yesterday. But as of now, after ~30 minutes of compiling, it has dropped down to 800 MHz (with a corresponding bogomips count). The CPU temperature peaked at about 69 degrees Celsius, which I know is high, but this computer (Dell Latitude X1) is supposed to run hot -- it has no fans at all. Also, I'm completely sure I didn't encounter this problem at all while running Ubuntu Linux 4-5 months ago, so I doubt that this is limited by the hardware.

I've been looking around in /proc and /sys for some kind of critical temperature level that causes this throttling, but I couldn't find anything interesting. 'cat /proc/acpi/thermal_zone/THM/trip_points' gave me 'critical (S5): 95 C', but I'm not sure what that means. Is there any good documentation for this kind of thing? Or just some general idea of what I can do to disable this temperature-induced throttling?
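For comparing trip points against the temperatures seen under load, the output of that file can be turned into numbers. This is a rough sketch only: `parse_trip_points` is a made-up helper, it assumes every line of interest has the same `name (optional note): NN C` shape as the single `critical (S5): 95 C` line quoted above, and the path is the one from this post, which may not exist on other machines or kernels.

```python
import re

# Matches lines shaped like "critical (S5):   95 C"; the parenthesised
# note is optional. The shape is assumed from the one line quoted above.
TRIP_LINE = re.compile(r"\s*([a-z]+)(?:\s*\([^)]*\))?\s*:\s*(\d+)\s*C")


def parse_trip_points(text):
    """Map each trip point name to its temperature in degrees Celsius."""
    points = {}
    for line in text.splitlines():
        match = TRIP_LINE.match(line)
        if match:
            points[match.group(1)] = int(match.group(2))
    return points


if __name__ == "__main__":
    # Path taken from the post; adjust the zone name (THM) as needed.
    path = "/proc/acpi/thermal_zone/THM/trip_points"
    try:
        with open(path) as handle:
            print(parse_trip_points(handle.read()))
    except OSError as error:
        print("could not read", path, "-", error)
```

If the file lists only a critical point at 95 C, as here, then the throttling seen around 70 C is presumably being decided somewhere else.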

While writing this it has dropped to 600 MHz at 59-60 degrees C... damn this!

 *beatryder wrote:*   

> What sort of scripts/daemons are you using to control your cpufreq?

 

None really. I just added a line in local.start where I specify which cpufreq governor to use. I am however looking at cpupw as it can undervolt the cpu, which means lower temperature and longer battery life.

----------

## kristoffer

I've been approximating pi (with mprime -t) at full CPU capacity (1100 MHz) for an hour with a lower CPU voltage (using cpupw). Naturally, the CPU temperature rises much more slowly with a lower voltage; in fact, it almost stabilized at 67-68 C. I didn't check the temperature for some minutes, but I can imagine that it had increased a couple of degrees, to ~70 C, and not surprisingly the scaling_max_freq and the currently used CPU frequency have decreased to 800 MHz, i.e. the problem is back.

My hypothesis is as follows: when the CPU reaches the "critical" threshold of ~70 C, some mechanism reduces the highest available CPU frequency one notch in order to reduce the temperature.

It remains unclear, however, why scaling_max_freq continues to decrease, from 800 to 600 MHz, since running at 800 MHz produces a temperature of ~62 C.

Does anyone know what's doing this? Is it possible and safe to disable it or at least increase the threshold to say 80 C?

Right now I'm going to compile a kernel without CPU frequency scaling to see if the problem still applies. Can I trust the values produced by /proc/cpuinfo? I mean, is the 'cpu MHz' value read from the hardware continuously even without CPU frequency scaling enabled?
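For logging the 'cpu MHz' value next to the temperature during such a test, a small parser is enough. This is a sketch under the assumption that /proc/cpuinfo has one `cpu MHz : <number>` line per processor, as on x86; `cpu_mhz` is a made-up name, and the code only records what the kernel reports, it cannot tell whether that value reflects hardware throttling.

```python
def cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' values found in /proc/cpuinfo text, one per CPU."""
    speeds = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            speeds.append(float(value))
    return speeds


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print(cpu_mhz(handle.read()))
    except OSError as error:
        print("could not read /proc/cpuinfo -", error)
```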

----------

## kristoffer

I'm currently running a kernel without CPU frequency scaling and I think I have run into the same problem. I started mprime -t and monitored the temperature. It quickly got to 66-67 C, and a minute later it had dropped to 63 C and stabilized there. This is exactly what happens when it drops from 1100 MHz to 800 MHz. /proc/cpuinfo still reports values for 1100 MHz, however, but the temperature is perhaps a more trustworthy source.

Is there some other, more reliable way I can read the current CPU frequency than /proc/cpuinfo? Or is /proc/cpuinfo reliable? If so, then what's going on with my system?

So, something starts to throttle my CPU at ~70 degrees C. Is it hardware or software? Can it be modified? I know that while running Ubuntu I sometimes saw temperatures as high as 74 degrees C, so it seems unlikely that it's hardware based AND unchangeable.

----------

## jorrit

I have exactly the same problem as you on my laptop. After a long emerge or other heavy-duty work (compiling) the fan starts to blow harder and the maximum CPU scaling frequency drops from 1800 to 800. This is a really annoying problem, as it occurs very quickly for me, and on Windows (it is a dual-boot laptop) it doesn't occur. CPU speed remains constant there during a long compile.

My system specs:

  Acer Aspire 5024WLMi

  64-bit AMD Turion processor

Greetings,

----------

## kristoffer

 *jorrit wrote:*   

> I have exactly the same problem as you on my laptop. After a long emerge or other heavy-duty work (compiling) the fan starts to blow harder and the maximum CPU scaling frequency drops from 1800 to 800.

 

I'm currently troubleshooting this more, and I'm kind of losing faith in my ability to resolve it. I just tried some really old kernel versions (vanilla 2.6.10!) and the problem was still there. The only shred of hope I have is that I think it worked when I ran Ubuntu (kernel 2.6.12) a few months ago. At least /proc/cpuinfo always reported the appropriate values then, and my fanless CPU managed to get reeeeaaally hot (over 70 degrees Celsius) and stay there without slowing down for as long as I wanted it to, healthy or not. Right now I'm going to try running a few LiveCDs, like the Ubuntu LiveCD, Gentoo 2006 LiveCD and Knoppix, and see if the problem is present on those systems too under heavy CPU load. It would be interesting if you could try at least one of the above, and if the problem is absent, perhaps we can pinpoint the cause.

Also, it's interesting that the problem apparently appears on different architectures. However, I find it strange that I can't find any other reports of it.

One more thing: does anyone know of a good method or program to check the raw computational power of one's CPU? Like a CPU benchmark program. I would like to use it to determine whether the CPU max frequency has dropped, instead of the normal methods (like /proc/cpuinfo and cpufreq), as they contradict each other and don't seem very reliable in this case.
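Short of a proper benchmark program, timing a fixed amount of CPU-bound work gives a rough indicator. The sketch below is only an illustration, with made-up function names and arbitrary loop sizes: it times the same integer loop several times and keeps the fastest run. If the loop is purely CPU-bound, a drop from 1100 MHz to 800 MHz should make it take roughly 1.4 times longer.

```python
import time


def spin(iterations):
    """Do a fixed amount of integer work and return the result."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def time_fixed_work(iterations=1_000_000, repeats=3):
    """Return the best wall-clock time in seconds over several runs.

    Taking the minimum reduces the influence of other processes.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        spin(iterations)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print("best time for fixed work: %.4f s" % time_fixed_work())
```

Comparing the number on a cool machine with the number after the suspected throttling kicks in would show whether the CPU really slowed down, independently of what /proc/cpuinfo or cpufreq report.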

----------

## jorrit

 *kristoffer wrote:*   

> Also, it's interesting that the problem apparently appears on different architectures. However, I find it strange that I can't find any other reports of it.

 

Actually, I did find another thread related to this problem. I haven't managed to try what they suggest, however:

https://forums.gentoo.org/viewtopic-t-423320-start-0.html

Greetings,

----------

## kristoffer

 *jorrit wrote:*   

> Actually, I did find another thread related to this problem. I haven't managed to try what they suggest, however:
> 
> https://forums.gentoo.org/viewtopic-t-423320-start-0.html

 

Interesting. The fix (disabling built-in tables for Banias CPUs in the kernel) didn't work for me, however. I didn't really expect it to work either, since the problem seems to still be there for me even if I disable cpufreq completely.

Also, this fix doesn't apply to your computer, since you're on an amd64 system, so don't try it yourself.

----------

## jorrit

I found one link that seems to offer a hint about the problem and a possible (but very clumsy) solution:

http://www.linuxquestions.org/questions/showthread.php?t=436577

Basically, it appears the cpufreq module is throttling CPU speed based on temperature too early. So compiling cpufreq support as a module and then unloading it when you need full speed seems to be one clumsy way to solve this problem (I haven't tried it yet). It would, however, be nicer if we could control this maximum allowed temperature ourselves somehow.

Greetings,

----------

## kristoffer

 *jorrit wrote:*   

> Basically, it appears the cpufreq module is throttling CPU speed based on temperature too early. So compiling cpufreq support as a module and then unloading it when you need full speed seems to be one clumsy way to solve this problem (I haven't tried it yet). It would, however, be nicer if we could control this maximum allowed temperature ourselves somehow.

 

This seems to be my problem, but compiling neither the ACPI thermal zone nor the cpufreq processor driver as a module seems to fix it. In fact, I've tried removing cpufreq from the kernel, and the problem still seems to be there.

----------

## jorrit

I eventually solved my problems by:

1. Undervolting the CPU so that it generates less heat at the same speed (i.e. NOT underclocking).

2. Opening the laptop and cleaning the fan.

3. Opening the laptop and applying new grease between CPU and fan.

Now the temperature peaks at around 70 (instead of 84) under heavy load and the CPU stays at max speed.

Greetings,

----------

