FOSDEM09: "Aggressive" Linux power management
At FOSDEM 2009 (Free and Open Source Software Developers' European Meeting) in Brussels, there were a number of interesting talks about the state of power management in Linux. Matthew Garrett from Red Hat talked at length about aggressive power management for graphics hardware. People tend to forget that graphics hardware is more than a processor: it is not just the GPU that draws power, the graphics card's memory, outputs, and, of course, the displays themselves all draw power as well. Until now, most of the work on power management has focused on the GPU, but if you want really good power management, you have to attack the problem on all these fronts. And that's what Garrett is doing at Red Hat and shared in his FOSDEM presentation.
The power consumption of the GPU can be decreased by two techniques: clock gating and reclocking. Clock gating means that different bits of the chip are disconnected from the clock when not in use, and thus less power is drawn. However, this functionality has to be hardwired in the chip design and it must be supported in the graphics driver. And that's where Linux is still lagging behind, according to Garrett: "For a long time Linux graphics support has focused on getting a picture. We can go further now, but we just need the documentation to adapt the drivers." Clock gating has no negative effect whatsoever on the performance of the GPU.
Reclocking is another story: when the GPU is running at a frequency of 600 MHz and you reclock/underclock it to 100 MHz, this results in a massive reduction in power usage, but it also means that the performance is reduced accordingly. Garrett cited a difference of 5 W when clock gating and reclocking are combined on Radeon graphics hardware.
The second component that can be optimized is memory: each memory access draws power. So what can be done about the power consumption of memory? Read less often (which is essentially reclocking) or read less memory. Reducing the memory clock can again save around 5 W, but it introduces visual artifacts if reclocking happens while the screen is being scanned out. The other interesting route (reading less memory) comes down to compressing the framebuffer. Most recent Intel graphics chipsets implement this with a run-length encoding (RLE) of the screen contents on a line-by-line basis. Garrett notes that this means your desktop background can make a difference in battery life: vertical gradients compress very well under this scheme, but horizontal gradients do not.
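Garrett's point about gradients follows directly from how line-based run-length encoding behaves. The following is a minimal sketch of the idea, not the actual Intel hardware algorithm: a scanline of a vertical gradient is a single solid color, so it collapses to one run, while a horizontal gradient changes on every pixel and doesn't compress at all.

```python
def rle_compress_line(pixels):
    """Run-length encode one scanline as [value, count] pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([p, 1])  # start a new run
    return runs

# One scanline of a vertical gradient is a solid color: one run.
solid = [7] * 1024
# One scanline of a horizontal gradient changes every pixel: no compression.
gradient = list(range(1024))
```

Fewer runs per line means fewer memory reads per screen refresh, which is where the power saving comes from.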
Another interesting consequence of the memory component is that periodic screen updates are really bad for power consumption. According to Garrett, moving the mouse cursor around instantaneously increases power consumption by 15 W. A blinking cursor draws 2 W, and the display of seconds or a blinking colon in the system tray clock also draws unnecessary power. Garrett adds philosophically: "The whole point of a blinking cursor is attracting your attention. But when you're typing, your attention is already going to your text input, and when you're not typing, it doesn't need your attention. So is it really needed to blink the cursor?"
The third component where power management can make a difference is the outputs. Just powering off an unneeded output port saves around 0.5 W. If you know for sure that you don't need the external output on your laptop, you can safely turn it off and gain a bit of battery time. However, if you need to connect an external monitor or video projector afterward, you will first have to power the output port back on explicitly. It all comes down to a tradeoff between functionality and power consumption.
The last (but not least) component of graphics hardware is the displays. This is another place where reclocking can save some watts. For example, the LVDS (low-voltage differential signaling) link to a laptop's LCD screen uses power at each clock transition, so reducing the refresh rate reduces the power consumption. While CRT screens begin to flicker if the refresh rate is too low, TFTs don't have this problem. According to Garrett, most TFT screens can be driven at 30 Hz, but then they tend to display visual artifacts. Garrett therefore recommends LVDS reclocking only when the screen is idle, which saves around 0.5 W. When the screen becomes active again, the system should return to a normal refresh rate of 60 Hz. Another solution is DPMS (Display Power Management Signaling): just turn off the screen when it's idle. Even a screensaver drawing a black screen draws power, while DPMS really turns off the output.
So what's the current state of this "aggressive power management"? Dynamic clock gating is implemented in most recent graphics cards. Future developments will implement even more aggressive dynamic state management: graphics hardware will power on functionality when the system needs it and power it off when it's not used. Graphics drivers and the operating system should control this without irritating the user. Garrett stresses that power management has to be as invisible as possible, otherwise the user will not be happy and will stop caring about "green" computing. Garrett is now working on the Radeon code to get some prototype functionality. As it stands now, the combination of dynamic GPU and memory reclocking can save 10 to 15 W, and LVDS reclocking another 0.5 W. In absolute terms that doesn't sound like much, but for a laptop it is a significant increase in battery life.
Power management in Nokia's next Maemo device
In the embedded track of FOSDEM, Peter De Schrijver of Nokia gave an insightful but very technical talk about advanced power management for OMAP3. This integrated chip platform made by Texas Instruments is based on an ARM Cortex A8 processor and has a GPU, DSP (digital signal processor), and ISP (image signal processor). Because the chip is targeted at mobile devices, some advanced power management functionality is built in: the chip is divided into different voltage domains, and in each module the interface clock and functional clock can be turned off independently.
Nokia used an OMAP1 chip in the N770 internet tablet, and an OMAP2 chip in the N800 and N810 internet tablets. The devices use Nokia's Maemo platform, based on Debian GNU/Linux. Last year Nokia executive Ari Jaaksi revealed that their next Maemo device would use an OMAP3 chip. De Schrijver talked about the power management architecture of OMAP3, but also about the Linux support Nokia is developing for this functionality.
Power management on the OMAP3 can be subdivided into two types. On the one hand, there is active power management. It's essentially the same principle as reclocking in graphics hardware: with a lower clock frequency, the chip is running on a lower voltage, resulting in less power consumption. With dynamic voltage frequency scaling this can be handled automatically. In Linux, the frequency scaling of the CPU is implemented in the cpufreq driver, while for the core (the interconnects between different blocks of the chip and some peripherals) there is a new API call for drivers, named set_min_bus_tput(), which sets the minimum bus throughput needed by a device.
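The reason frequency scaling is coupled with voltage scaling is that dynamic CMOS power scales roughly as P ≈ C·V²·f: lowering the clock enough to permit a lower supply voltage saves far more than the frequency reduction alone. A small illustration of that relationship, with invented numbers (these are not OMAP3 figures):

```python
def dynamic_power(c_eff_farads, voltage, freq_hz):
    """Dynamic switching power of CMOS logic: P ~ C * V^2 * f."""
    return c_eff_farads * voltage**2 * freq_hz

# Illustrative values (assumed, not from the talk): halving the clock
# also allows dropping the core voltage from 1.35V to 1.0V, so power
# falls by about 3.6x, not just the 2x from the frequency change alone.
p_full = dynamic_power(1e-9, 1.35, 600e6)  # full speed at nominal voltage
p_half = dynamic_power(1e-9, 1.00, 300e6)  # half speed at reduced voltage
```

This quadratic dependence on voltage is why dynamic voltage frequency scaling pays off so much better than frequency scaling alone.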
On the other hand, when the chip is idle, there are solutions such as clock control, which can be implemented in software (by a driver) or hardware (an auto idle function). Moreover, clocks of different modules of the chip can be turned off selectively: if the interface clock is off, the core can sleep; if the functional clock is off, the module can sleep. The implementation of clock control in the OMAP3 chip is done in the clock framework of the linux-omap kernel, and Nokia is adding the patches to linux-arm now.
The OMAP3 chip knows four power states per domain: "on", "inactive", "retention" and "off". In the "inactive" state, the chip works at normal voltage but the clocks are stopped, while in the "retention" state the chip works at a lower voltage. This means that the "inactive" state uses more power than the "retention" state, but has a lower wakeup latency. The shared resource framework (SRF) that determines the power state for each domain of the chip is implemented by Texas Instruments and is hidden from the driver programmer by an API. This API has to be implemented by the power management framework and has to be used by the drivers. The API documentation is not yet released, but De Schrijver said this will be added into the kernel Documentation directory soon.
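The tradeoff between the four states can be thought of as a latency-budget decision: pick the lowest-power state that can still wake up in time. Here is a hedged sketch of that policy; the relative power and latency numbers are invented placeholders, not OMAP3 datasheet values:

```python
# Hypothetical per-domain states: (name, relative power, wakeup latency in us).
# The ordering mirrors the article: "inactive" uses more power than
# "retention" but wakes up faster.
STATES = [
    ("on", 1.0, 0),
    ("inactive", 0.5, 50),
    ("retention", 0.1, 500),
    ("off", 0.0, 5000),
]

def pick_state(latency_budget_us):
    """Choose the lowest-power state whose wakeup latency fits the budget."""
    best = STATES[0]
    for state in STATES:
        if state[2] <= latency_budget_us and state[1] < best[1]:
            best = state
    return best[0]
```

With a tight 100 microsecond budget the domain can only drop to "inactive", while a domain that may sleep for milliseconds can go all the way to "off".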
The "off" mode has some challenges: while the power management framework can handle saving and restoring the state of the CPU, memory controller, and other components, each driver has to handle its own module. This means reinitializing the module when the functional clock is enabled, and saving and restoring the context and shadow registers in memory.
In his talk, De Schrijver also gave a status update of the work. The "retention" state works. Basic "off" mode works on most development boards; drivers are being adapted for "off" mode now and will be ready at the end of February. All this code is being merged in the linux-arm kernel tree, but eventually it will be merged in the mainline kernel. According to De Schrijver, all these power management techniques will be used in the next Nokia Maemo device: the long-awaited successor of the N810.
Index entries for this article:
GuestArticles: Vervloesem, Koen
Conference: FOSDEM/2009
Posted Feb 12, 2009 12:20 UTC (Thu)
by nowster (subscriber, #67)
[Link] (10 responses)
On a laptop that normally consumes about 30W when idle, it's a large reduction in power usage.
Posted Feb 12, 2009 13:50 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link] (9 responses)
Posted Feb 12, 2009 14:07 UTC (Thu)
by dlang (guest, #313)
[Link] (8 responses)
if you are dealing with a 30W baseline, reducing it by 10-15W is extremely impressive and even 0.5W can be noticeable, but if you are talking about a 300W baseline 10-15W is barely noticeable and 0.5W probably doesn't matter to much of anyone.
Posted Feb 12, 2009 14:11 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link]
Posted Feb 13, 2009 23:02 UTC (Fri)
by giraffedata (guest, #1954)
[Link] (6 responses)
What kind of impression are you looking for? It seems to me that the baseline is useless. 10-15w saves the same amount of money and the same amount of CO2 regardless of the baseline.
Should we also ask how much the room lights draw? Because if there's 300 watts of lighting, a 10-15w savings is barely noticeable even if it's half of what the GPU formerly used.
Posted Feb 13, 2009 23:45 UTC (Fri)
by dlang (guest, #313)
[Link] (5 responses)
so I don't buy that saving the electricity is an end in itself.
for portable devices that reduction in power can mean that your battery lasts much longer, which is the thing that you really care about.
Posted Feb 14, 2009 3:57 UTC (Sat)
by giraffedata (guest, #1954)
[Link] (4 responses)
But what are you responding to? This thread doesn't say anything about saving electricity being an end in itself.
Your post does, however, reinforce the parent post, by showing reason to save electricity without any reference to the baseline usage.
Posted Feb 14, 2009 6:02 UTC (Sat)
by k8to (guest, #15413)
[Link] (3 responses)
You both understand each other.
Posted Feb 14, 2009 17:38 UTC (Sat)
by giraffedata (guest, #1954)
[Link] (2 responses)
Au contraire. I'm completely mystified by dlang's post, in its context, and I think that may be because he completely misread mine.
And you apparently read something into my most recent post that wasn't there as well.
Posted Feb 16, 2009 21:15 UTC (Mon)
by k8to (guest, #15413)
[Link] (1 responses)
if you'd step back and think for a moment you could either stop or have a pleasant conversation.
Posted Feb 17, 2009 21:13 UTC (Tue)
by giraffedata (guest, #1954)
[Link]
I wish I knew what about the exchange so far you find to be other than a pleasant conversation. It looks like I've been misunderstood more than once.
Posted Feb 12, 2009 15:26 UTC (Thu)
by johnkarp (guest, #39285)
[Link] (9 responses)
Why doesn't the 'race-to-idle' concept apply to GPUs?
Posted Feb 12, 2009 16:10 UTC (Thu)
by dlang (guest, #313)
[Link] (4 responses)
1. even when the GPU has finished doing its work you still want to see the screen (which requires that the GPU stay on)
2. changing the GPU mode can cause artifacts on the screen (mentioned in the article when talking about changing the clock speed)
3. switching modes takes time, you frequently don't want to drop too far as you then won't be able to respond to activity quickly. If you only switch modes when the entire system is idle, then you miss a lot of chances to save power.
Posted Feb 12, 2009 16:24 UTC (Thu)
by johnkarp (guest, #39285)
[Link] (3 responses)
Posted Feb 12, 2009 16:41 UTC (Thu)
by dlang (guest, #313)
[Link] (1 responses)
I don't know what's common on modern high-end cards
Posted Feb 13, 2009 8:30 UTC (Fri)
by Los__D (guest, #15263)
[Link]
Posted Feb 19, 2009 22:26 UTC (Thu)
by wmf (guest, #33791)
[Link]
Posted Feb 12, 2009 17:25 UTC (Thu)
by iabervon (subscriber, #722)
[Link]
Race-to-idle doesn't work at all if there is no idle state, and doesn't necessarily work the same way if being late means doing less work (possibly with a loss of output quality).
Furthermore, they sometimes run without a maximum amount of work; whenever they finish a frame, they start the next one.
Posted Feb 12, 2009 17:59 UTC (Thu)
by anton (subscriber, #25547)
[Link] (2 responses)
It does not apply in general to CPUs, either. E.g., my current desktop draws 100W idle, 111W at load 1 2GHz, and 123W at load 1 3GHz. If I have a job that consumes 6G cycles every 3s, then at race-to-idle 3GHz it will consume (123*2+100)/3=115W, whereas at 2GHz it will consume 111W.
Race-to-idle may work on systems that really power down at idle, but for most of the systems I have measured this does not happen to a significant extent, and the effects of voltage scaling dominate, making it beneficial to run at the lowest clock and voltage that is sufficient for the load.
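The arithmetic in this thread (a job of 6G cycles every 3s on a desktop drawing 100W idle, 111W under load at 2GHz, and 123W under load at 3GHz) is easy to check with a time-weighted average:

```python
def avg_power(busy_w, idle_w, busy_s, period_s):
    """Average power over a period split between a busy and an idle phase."""
    idle_s = period_s - busy_s
    return (busy_w * busy_s + idle_w * idle_s) / period_s

# 6G cycles every 3s: at 3GHz the CPU is busy 2s then idle 1s;
# at 2GHz it is busy the whole 3s and never idles.
race = avg_power(123, 100, 2, 3)  # race-to-idle at 3GHz, ~115.3W
dvfs = avg_power(111, 100, 3, 3)  # steady 2GHz, 111W
```

On this machine the slower steady clock wins, exactly because the "idle" state still draws 100W.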
Posted Feb 13, 2009 23:11 UTC (Fri)
by giraffedata (guest, #1954)
[Link] (1 responses)
In CPUs, I think race-to-idle is valid only for lower speeds, such as were common five years ago. Above 1.5 GHz or so, the curve starts bending down and it takes more power to squeeze out each additional MHz (which is why we finally hit a wall and can't make them go faster).
I've noticed the same thing with CPU price. That third GHz costs you more than the second.
Posted Feb 14, 2009 4:12 UTC (Sat)
by i3839 (guest, #31386)
[Link]
I think the whole race-to-idle thing is overrated. It might be true if you look at servers or something, or when you are waiting for something to finish before doing anything else, but in general it seems just running at lower cpu speed (and voltage) uses less power. Especially with laptops it's not just the cpu, the screen and other things take a lot of energy as well, and those are on no matter what.
For my Thinkpad X40 with 1.4GHz Pentium M the numbers are (CPU undervolted with the PHC patch, otherwise a plain kernel), more or less, network cable plugged in and LCD at usable brightness:
10W idle, 16W max cpu speed (20W without undervoltage), and 12W for fixed 600MHz 100% cpu usage.
4W more power draw for 2.3 "speedup" seems like a good deal, until you realize that what you're doing isn't cpu limited and that you use the thing for something all the time anyway (I call it "reading"/"staring").
So for one hour of 100% cpu usage at 600MHz the total power usage is:
1h * 12W = 12Wh.
For the same task done with race to idle it is, with t = 1/2.3h:
t * 16W + (1 - t) * 10W = 12.6Wh.
Which at best proves that it doesn't matter much. Without undervoltage the numbers aren't as good though:
1h * 13W = 13Wh
and
t * 20W + (1 - t) * 10W = 14.3Wh,
which results in 10% less battery life, or about half an hour.
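The X40 figures quoted in this comment (12W at a fixed 600MHz under load, 16W at full speed, 10W idle, a 2.3x speedup, and 13W/20W for the non-undervolted case) can be checked with a short energy calculation:

```python
def energy_wh(busy_w, idle_w, busy_h, total_h=1.0):
    """Energy in watt-hours: busy_h hours under load, idle the rest."""
    return busy_w * busy_h + idle_w * (total_h - busy_h)

# Fixed 600MHz: busy the whole hour at 12W.
fixed_600mhz = energy_wh(12, 10, 1.0)
# Race to idle at full speed: busy 1/2.3 of the hour, idle at 10W after.
race_undervolted = energy_wh(16, 10, 1 / 2.3)
race_stock = energy_wh(20, 10, 1 / 2.3)  # without undervolting
```

The undervolted case is nearly a wash (12Wh vs. about 12.6Wh), while without undervolting race-to-idle costs roughly 10% more energy, matching the comment's conclusion.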
Throw in the extra heat and fan noise, and I'm happy if it stays at 600MHz for cpu intensive tasks taking a while. That said, the default cpufreq does the right thing and puts the cpu at 600MHz speed almost all the time anyway.
Of course that it all doesn't matter all that much may prove that the race-to-idle exists, so you can interpret the results to your liking. But claiming that doing it one way is substantially better than doing it another way seems far fetched. Perhaps the truth is somewhere in the middle, with best results with cpufreq with a max speed of 1GHz for this setup.
FWIW, the minimum idle power usage is about 6W for this laptop, with everything more or less off or idle/unplugged. That seems way too much for doing nothing (suspend to ram is about 1W, in a perfect world total idle should be about the same, with suspend case using even less energy of course). Oh well.
Posted Feb 14, 2009 19:24 UTC (Sat)
by jwb (guest, #15467)
[Link]
Similarly, changing my laptop's internal display from 50 to 40Hz does not measurably save power.
My laptop uses about 11-12W with the radios off, and the battery lasts about six hours, so believe me that if I thought I could save half a watt with either of these things, I would do it.
One thing that's not mentioned in the article, but might have been mentioned in the presentation, is tickless graphics. If my machine is just sitting around with a static image on the display, there's really no reason for the graphics chip to be doing anything at all. Is this coming some day to X.org?
Posted Feb 15, 2009 14:46 UTC (Sun)
by magnus (subscriber, #34778)
[Link]
Maybe with some research effort it could be possible to develop a freeze-frame mode on TFT screens where the screen is only refreshed maybe a few Hz or so when there is no activity.