
Tripping over trip points

By Jonathan Corbet
August 7, 2007
Contemporary processors have an interesting problem: if they operate at their full rated capacity for extended periods of time, they run a real risk of heating to the point that they let the blue smoke out and never run again. To avoid this kind of problem, processors (and other components) are instrumented with temperature sensors. The BIOS programs the sensors with specific "trip points" - temperatures where things will happen to keep the system from overheating. At a given trip point, the system might turn on the fan, throttle the processor, or, if disaster is imminent, shut the system down hard.

The Linux ACPI subsystem provides the ability to query these trip points; the relevant virtual files can be found under /proc/acpi/thermal_zone. Your editor's laptop, for example, reveals that it is set to throttle the processor at 86°C and to pull the plug at 91°C. Traditionally, the ACPI code has also allowed a suitably privileged user to change those trip points by writing new values to the /proc files. That capability no longer exists, though; it was removed in the 2.6.22 kernel.
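
As a rough illustration of what querying those files involves, the following Python sketch parses trip-point text of the general sort found under /proc/acpi/thermal_zone. The sample text, the field names in it, and the parse_trip_points() helper are all assumptions made for the example; the actual layout varied between kernels and machines, so the pattern should be treated as a starting point rather than a specification.

```python
import re

# Illustrative sample only: the layout and the extra fields are assumed,
# not copied from any particular machine.
SAMPLE = """\
critical (S5):           91 C
passive:                 86 C: tc1=4 tc2=3 tsp=40 devices=CPU0
"""

# "<name>: <whole degrees> C", ignoring anything after the temperature.
_LINE = re.compile(r"^\s*([^:]+?):\s+(\d+)\s*C\b")


def parse_trip_points(text):
    """Return a dict mapping trip-point name to temperature in degrees C."""
    points = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            points[match.group(1).strip()] = int(match.group(2))
    return points


if __name__ == "__main__":
    for name, temp in parse_trip_points(SAMPLE).items():
        print(f"{name}: {temp} C")
```

On a real system the text would be read from a file in the zone's directory rather than from a string; lines that do not match the assumed pattern are simply skipped.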

Users are now starting to complain about that change. They feel that the BIOS-set trip points on some systems are positioned incorrectly, resulting in systems that run more slowly than they think they should, fans which come on at the wrong time, and so on. Naturally, they feel that the removal of the trip-point override feature has reduced the functionality of their systems.

ACPI maintainer Len Brown responds that the override feature is a bad idea for a number of reasons. At the top of the list is the fact that the system cannot actually change the hardware trip points. All it can do is disable them. Then the processor must take over by polling the temperature sensors itself and responding when its software trip points are reached. Should that polling and response fail to happen for any reason, there is a real possibility that the hardware could be damaged. Meltdowns could also easily occur if the trip points are set incorrectly, leading to "Linux destroyed my laptop" postings echoed across the net.
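
The software fallback described here amounts to a polling loop. A minimal sketch follows, assuming a hypothetical read_temp() callable standing in for a real sensor and reusing the 86°C and 91°C values from your editor's laptop. It is not the kernel's implementation, and the action names are invented for the example; it only shows that a response happens when, and only when, a poll actually runs.

```python
def action_for(temp_c, trip_points):
    """Return the action of the highest trip point reached, or None.

    trip_points is a list of (temperature_c, action) pairs.
    """
    reached = [point for point in trip_points if temp_c >= point[0]]
    if not reached:
        return None
    return max(reached, key=lambda point: point[0])[1]


def poll(read_temp, trip_points, samples):
    """Take `samples` readings and return the action chosen for each one.

    read_temp is a hypothetical callable returning degrees C. A real loop
    would sleep between readings and act on each result; if it stalls,
    no action is taken, however hot the hardware gets.
    """
    return [action_for(read_temp(), trip_points) for _ in range(samples)]


if __name__ == "__main__":
    # Software trip points using the values quoted in the article.
    trips = [(86, "throttle"), (91, "shutdown")]
    readings = iter([70, 87, 92])
    print(poll(lambda: next(readings), trips, 3))
```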

On top of that, the BIOS can change the trip points at any time for reasons of its own. Many of the use cases for trip-point overrides (controlling when fans go on and off, for example) are better done by having a user-space daemon control fan operation directly. And the truth of the matter is that overriding trip points is usually (Len would say always) an inappropriate response to problems which are better solved somewhere else. When the issue was discussed in May, he summarized it this way:

The fact that the trip-points are writable has obscured, rather than clarified, the actual causes of the failures. No less than 4 people in that bug report declared that cleaning the dust out of their fan fixed the root cause. A bunch more said that the issues went away when they stopped using ubuntu's user-space power save daemon.

There are a couple more with broken active fan control -- which also gets obscured rather than clarified by over-riding trip points.

The remaining problems, says Len, are most likely not present when Windows is running on the affected hardware. And, he says, Windows is highly unlikely to be overriding the trip points. The conclusion is that Linux is doing something wrong in its thermal management on those systems. He would much rather find and fix the real problem than hide it through use of trip-point overrides.

In the end, according to Len, there has never yet been a bug report which suggests that Linux should be messing with trip points in this way. This is a clear challenge for anybody who misses the trip-point override feature: send in a suitably documented report showing the problem that this feature solved. If the override feature truly turns out to be necessary, it may just come back - but it may just happen that a fix for the actual problem goes in instead.



Tripping over trip points

Posted Aug 14, 2007 8:26 UTC (Tue) by nim-nim (subscriber, #34454)

This is mostly a laptop problem.
On desktops, any modern processor fitted with a P4-era heatsink will be more than adequately cooled (GPUs are another problem).

Tripping over trip points

Posted Jan 9, 2008 0:37 UTC (Wed) by sbs (guest, #49854)

Just ran into this issue after a kernel upgrade.

The comment about "windows is highly unlikely to be overriding the trip points" is BS - there
are numerous utilities that allow this in Windows, and I use them on every computer, the same
as I have overridden the trip points in every Linux install up until now.

In 10 years of using Linux this is the first time I recall Windows giving me more control over
my computer while Linux instead tries to protect me from myself in the interest of avoiding
bad PR.

Good grief.

Tripping over trip points

Posted Jun 11, 2008 6:12 UTC (Wed) by Jim_99 (guest, #52487)

Here's my issue with the trip points being set to turn the fans on for the first time at 65°C.
The processor is rated for 69°C as its thermal threshold. Why would any programmer turn the
fan on for the first time at 65°C? I have a Dell C400 (P3 733 MHz), and it shuts itself down
in Ubuntu unless I override the trip points. The heat builds up and damages memory, battery,
video, motherboard and hdd, at the very least shortening each component's useful life. OS X had
this same overheating issue with Powerbooks, and these were brand new Powerbooks. No dust in
the fan or air passageways! I know, I had one, and Apple updated OS X to start running the
fan(s) at 47°C. I think Ubuntu and other Linux distros should follow suit. I'd rather hear my
fan kick in and run to keep my notebook cool than have it shut down for doing nothing more than
surfing the internet. I found my notebook ran best, for fan activity and battery life, with a
trip point around 50°C. I wouldn't run a desktop at 60+°C and I don't like the idea of running a
notebook at that. If ACPI or power throttling is not supported, can't we at least be allowed
some sort of relief?

Tripping over trip points

Posted Jun 12, 2008 5:38 UTC (Thu) by Jim_99 (guest, #52487)

Just as I expected: I took mine apart earlier this evening just on the off chance it was
filthy inside. I found no mythical dust, nor anything else out of the ordinary for a Dell
build. I'll clarify and restate that the Dell I have is the L400 and not the C400 that I
indicated in the prior reply. Regardless, when I was able to reset trip points, this notebook
operated like a Powerbook on OS X. I could keep the temperature between 47-52°C while web
surfing and playing back mp3's directly from the internal hdd, and the fan maintained that
range with a minimal hit on battery life. The best I can do right now, after recalibrating the
battery, is to watch it run for a little while around 47-52°C while just web surfing, with no
fan noise; after starting up the music through Rhythmbox, the temperature just keeps climbing
with no relief from a fan until the 65°C trip point is reached.

The notebook does operate much better after recalibrating the battery, but that takes several
hours and how long the effect lasts is anyone's guess. The calibration is a long and drawn-out
"manual, performed in the BIOS" process. It even appears to resolve the temperature issue
momentarily. I posted about this a while back on Ubuntu's forums, and even did a "how to", until
the kernel improvements (?) thwarted trip point resets. Since I started playing back the music
and typing the rest of this, the Dell has been running at 56-57°C. Still more acceptable than
the 60+°C it ran at just last night. This and light Open Office tasks/work are all this Dell
is going to do. By today's notebook standards it is an internet appliance anyway. But in a
pinch, I shouldn't have to be terrified of heat buildup and operating temperatures when doing
something more cpu and/or hdd intensive.

Another point of clarification: the thermal specification is given as 80°C for this P3
733. I had indicated 69°C as the max core temperature. That said, this P3 might have more
headroom on heat tolerances; other CPUs don't have that luxury. Then again, when do the hdd
or other components seize up, causing a lockup or protective shutdown? And how many of those
can the hardware handle before it dies a premature death?

http://www.intel.com/support/processors/pentiumiii/sb/cs-...

And now I'm hitting 64°C. Better post this before the impending crash or lockup. So this
entire post took about an hour to compose and relay. Sorry for taking my time, but I just wanted
to test the recalibration and relay the test at the same time. Hooray, the fan just came on,
and that means 65°C was reached. Temperature after a couple of minutes of fan activity is back
down to 61-63°C, nowhere near getting me back under 50°C, which by the way is a long-lost
memory of an operating temperature. The bottom of this notebook is just hot; glad it's on a
desktop and not my lap. Wow, a 142-149°F operating temperature for an internet appliance
playing back music. Can I cook food on that?


Tripping over trip points

Posted Jul 17, 2008 12:35 UTC (Thu) by abadona (guest, #52946)

I am constantly having problems on two AMDx2-based systems used mostly for number crunching
(one AMDx2 6000, another AMDx2 5000). The systems go into emergency shutdown any time two
CPU-bound processes run simultaneously. The critical temperature shown by acpi is 60°C, and
the only active trip point is at 50°C. If it were possible, I would rather have the fan start
at a lower temperature and throttle the CPU when it climbs close to critical.

Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds