Matthew Garrett on the race to idle

Posted May 10, 2008 0:09 UTC (Sat) by dlang (subscriber, #313)
Parent article: Matthew Garrett on the race to idle

racing to idle works if you can predict the future workload well enough, and if the wake-up time
to ramp back up to full speed is fast enough to not annoy the user.

if either of these is not true it may very well be that you are better off running at a
lower clock rate, even if you are slightly less efficient in cycles/watt.

like all benchmarks, it depends on your actual load.

if the sleep/wake time can get to be fast enough (I think <100ms is frequently fast enough)
the user won't notice the difference, and then you can start to shift to the 'race to idle'
mode, but if it takes longer than that to respond to a keystroke the user will start noticing.


Matthew Garrett on the race to idle

Posted May 10, 2008 0:12 UTC (Sat) by mjg59 (subscriber, #23239) [Link]

Shifting from C4 to C0 takes around 17 microseconds on my hardware, which is pretty typical. I
think that's somewhat less than 100 milliseconds :)

Matthew Garrett on the race to idle

Posted May 10, 2008 0:17 UTC (Sat) by dlang (subscriber, #313) [Link]

try it again with a USB device plugged in and the numbers can change drastically.

some hardware combinations work well at shifting from one mode to another, others don't work
nearly as well.

David Lang

Matthew Garrett on the race to idle

Posted May 10, 2008 0:52 UTC (Sat) by mjg59 (subscriber, #23239) [Link]

USB's tendency to trigger DMA means that you're going to spend more time in C2 than would be
ideal, but on anything made in the past 5 years that's still going to result in you saving
more power than staying in C0 at a low voltage. Recent hardware will even automatically
promote itself from C3 to C2 without OS intervention.

Matthew Garrett on the race to idle

Posted May 10, 2008 5:06 UTC (Sat) by dlang (subscriber, #313) [Link]

in theory you are right, in practice it doesn't work as well.

take the OLPC laptop, designed for very good power management at the hardware level (with
serious talk of going to sleep between keystrokes). due in large part to the need to have long
enough delays to properly talk to things like external USB and SD devices this machine takes
>200ms to wake up.

yes there is definitely room for improvement in the software, but the reliable interfacing to
external (but effectively permanently attached) devices is bad enough that doing race-to-idle
ends up being a horrible thing in practice (they tried doing it in a few builds, it was bad
enough that people started disabling sleep entirely)

Matthew Garrett on the race to idle

Posted May 10, 2008 10:25 UTC (Sat) by mjg59 (subscriber, #23239) [Link]

That's not race to idle, it's race to suspend. Linux is currently deficient in its requirement
that the entire device tree be resumed before userspace can restart, which results in suckage
like you describe. But that's an orthogonal issue - if you're talking about the processor
rather than the platform, then the power saving states add only small levels of latency.

Matthew Garrett on the race to idle

Posted May 10, 2008 10:37 UTC (Sat) by dlang (subscriber, #313) [Link]

if you aren't switching to a suspend mode when you hit the idle state, what is the benefit of
becoming idle?

you can blame it on whatever layers you want, but for the user the result is the same: in
practice the race-to-idle approach does not currently give a reasonable user experience (in
many cases), and as a result, switching to a lower clock speed instead of race-to-idle is
actually better.

Matthew Garrett on the race to idle

Posted May 10, 2008 10:53 UTC (Sat) by mjg59 (subscriber, #23239) [Link]

The benefit is that your CPU power draw drops to approximately nothing. On modern CPUs,
halving the speed of your processor doesn't halve its power draw. Letting it idle takes it
down to 0-1 watts.
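
To make the arithmetic behind this claim concrete, here is a minimal sketch that compares the two strategies for a fixed amount of work inside a fixed time window. All wattage figures are made-up assumptions chosen for illustration, not measurements; the only idea taken from the comment above is that running at half speed draws more than half the power, while deep idle draws close to nothing.

```python
def energy_joules(work_seconds_at_full_speed, window_seconds, speed_fraction,
                  active_watts, idle_watts):
    """Energy used over a fixed window: run the work at the given speed, then idle."""
    busy_seconds = work_seconds_at_full_speed / speed_fraction
    if busy_seconds > window_seconds:
        raise ValueError("work does not fit in the window at this speed")
    idle_seconds = window_seconds - busy_seconds
    return busy_seconds * active_watts + idle_seconds * idle_watts


# Hypothetical figures, for illustration only.
FULL_SPEED_WATTS = 30.0
HALF_SPEED_WATTS = 20.0   # assumed: more than half of the full-speed draw
DEEP_IDLE_WATTS = 1.0     # assumed: within the "0-1 watts" range quoted above

# One second of full-speed work that must finish within a ten second window.
race_to_idle_j = energy_joules(1.0, 10.0, 1.0, FULL_SPEED_WATTS, DEEP_IDLE_WATTS)
half_speed_j = energy_joules(1.0, 10.0, 0.5, HALF_SPEED_WATTS, DEEP_IDLE_WATTS)

print("race to idle:", race_to_idle_j, "J")
print("half speed:  ", half_speed_j, "J")
```

With these assumed figures, racing to idle uses less energy. The result depends entirely on the inputs: if the half-speed draw were well below half of the full-speed draw (for example 12 W here), the slower clock would come out ahead, so real measurements are needed before drawing conclusions about a given machine.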

Matthew Garrett on the race to idle

Posted May 11, 2008 11:23 UTC (Sun) by IkeTo (subscriber, #2122) [Link]

> On modern CPUs, halving the speed of your processor doesn't halve its power draw.

Hm... I read otherwise somewhere else, if you count only the power of the CPU.  (Actually it
should save more than half of its power, otherwise why slow down?)  Would you mind sharing
with us where you got this idea?

Matthew Garrett on the race to idle

Posted May 10, 2008 18:31 UTC (Sat) by dilinger (subscriber, #2867) [Link]

Please do not use the OLPC laptop as an example.  The power management software is nowhere
near finished, and OLPC has suffered greatly from lack of manpower (I'll spare you
the details of _why_ we haven't had enough people).  In reality, there should be no reason why
we can't do <200ms resume; however, no one within the organization has even _started_
optimizing away the extra 800ms that we deal with.

The automatic suspend stuff that we had been toying with was merely a hack.

Matthew Garrett on the race to idle

Posted May 11, 2008 14:44 UTC (Sun) by daniels (subscriber, #16193) [Link]

I wouldn't call Geode/OLPC 'designed for good power management': it's lower power than most
x86, sure, but isn't even in the same league as consumer device hardware like ARM and MIPS.
If you can actually measure the wake-from-sleep time at all, then you've lost.

Consumer hardware like the Nokia 770/N800/N810, OpenMoko, and similar, all race to sleep, and
sleep/resume much, much more often than you think (probably by a factor of thousands).  It
really can be about as transparent as the x86 execution/idle switch if your hardware is
decent, and you do it right.  Current Linux on ARM definitely does it right.

USB does make this more difficult by its very nature, but this isn't any more a problem with
the concept of race-to-sleep than FireWire's remote DMA security fiasco is a problem with the
concept of externally pluggable mass storage.

Matthew Garrett on the race to idle

Posted May 10, 2008 8:58 UTC (Sat) by farnz (subscriber, #17727) [Link]

Unfortunately, this conflicts hard with real-time constraints; I work with a soft real-time system based on Linux, and I've had to disable dropping into deeper C-states. Our system includes a smooth text scroller, which works by updating the screen every frame (16 milliseconds); thanks to the high performance of X11 on Intel Q35, the update takes less than a millisecond, and then that thread goes to sleep until the next frame starts.

With the latest Intel processors, we found that the screen jerked, as there was nothing but this update on one core, and it was going into a low C-state as soon as the update completed, then not coming out of idle until too late.

I note that there's an in-kernel mechanism to let drivers tell the scheduler about their latency needs (allowing them to always be scheduled on the core that's not being put into a low C-state); it's a shame that there's no way for a process to do the same, yet.

Matthew Garrett on the race to idle

Posted May 10, 2008 10:27 UTC (Sat) by mjg59 (subscriber, #23239) [Link]

I agree with that. Applications need to be able to indicate that they're unable to tolerate
latency, and the scheduler and CPU governor need to cope with the extra restriction. I've been
talking to some of the embedded people about this, in terms of what sort of userspace-visible
knobs we need for effective power management without screwing up userspace.
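
As an illustration of what "coping with the extra restriction" could mean, here is a minimal sketch of choosing the deepest idle state whose wake-up latency still fits an application's stated limit. This is not the kernel's actual governor logic. The table of states is hypothetical: only the 17 microsecond figure for C4 comes from earlier in this thread, and the other exit latencies are invented for the example.

```python
from typing import List, NamedTuple, Optional


class IdleState(NamedTuple):
    name: str
    exit_latency_us: float  # time needed to get back to C0


# Hypothetical table, ordered from shallowest to deepest state.
# Only the C4 value (17 us) is quoted in the thread; the rest are assumed.
STATES = [
    IdleState("C1", 1.0),
    IdleState("C2", 5.0),
    IdleState("C3", 12.0),
    IdleState("C4", 17.0),
]


def deepest_allowed_state(states: List[IdleState],
                          max_latency_us: float) -> Optional[IdleState]:
    """Return the deepest state whose exit latency fits the limit, or None."""
    chosen = None
    for state in states:  # relies on shallowest-to-deepest ordering
        if state.exit_latency_us <= max_latency_us:
            chosen = state
    return chosen


# An application that tolerates 100 us of latency can use the deepest state;
# one that tolerates only 10 us is held to a shallower one.
print(deepest_allowed_state(STATES, 100.0))
print(deepest_allowed_state(STATES, 10.0))
```

A real governor would also weigh how long the CPU is expected to stay idle, since entering a deep state for a very short sleep can cost more than it saves.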

Matthew Garrett on the race to idle

Posted May 12, 2008 17:03 UTC (Mon) by mgross (subscriber, #38112) [Link]

kernel/pm_qos_params.c provides an interface for communicating acceptable latencies.  (new in
2.6.25)

--mgross
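
For readers who want to try this from userspace: as far as I understand the interface, the PM QoS code exposes a character device, /dev/cpu_dma_latency, that accepts a 32-bit latency limit in microseconds and honours it for as long as the file descriptor stays open. The sketch below assumes that device exists on your kernel and that you have permission to open it; the helper name and the 20 microsecond value in the usage note are illustrative, not taken from the thread.

```python
import os
import struct


def request_max_cpu_latency(max_latency_us, device="/dev/cpu_dma_latency"):
    """Write a latency limit (in microseconds) to the PM QoS device.

    Returns the open file descriptor. The request is assumed to stay in
    force only while the descriptor is open, so the caller must keep it
    and close it when the limit is no longer needed.
    """
    fd = os.open(device, os.O_WRONLY)
    try:
        # The device expects a native-endian signed 32-bit integer.
        os.write(fd, struct.pack("=i", max_latency_us))
    except OSError:
        os.close(fd)
        raise
    return fd
```

A latency-sensitive program would call `fd = request_max_cpu_latency(20)` before entering its time-critical loop and `os.close(fd)` afterwards, so that deeper C-states become available again once the work is done.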

Matthew Garrett on the race to idle

Posted May 13, 2008 0:49 UTC (Tue) by nix (subscriber, #2304) [Link]

... and it's built in unconditionally, even if you have no power 
management configured, which seems rather strange.

Matthew Garrett on the race to idle

Posted May 10, 2008 18:39 UTC (Sat) by renox (guest, #23785) [Link]

Weird, if your application needs to update the screen every 16ms, this means that you're using
some kind of timer to wake up every 16ms, so the kernel ought to be aware of this deadline
and change the C-state of the CPU accordingly.

Either I'm misunderstanding something or there is a bug somewhere; have you discussed this on
the LKML?

Matthew Garrett on the race to idle

Posted May 10, 2008 19:23 UTC (Sat) by farnz (subscriber, #17727) [Link]

Not yet discussed this on the LKML - disabling C states is a good enough workaround for now, and I'm currently knee-deep in VIA hardware issues to debug.

We're not using timers at all - we use the DRM to wait for VBlank. The kernel can't easily know (without kernel modesetting, which is a whole different can of worms to fix) that the frame rate of our screens is 60Hz. Once kernel modesetting lands, I'll certainly be looking into ensuring that the kernel is aware of the latency limits that waiting for VBlank implies.

Matthew Garrett on the race to idle

Posted May 13, 2008 13:41 UTC (Tue) by daenzer (subscriber, #7050) [Link]

> We're not using timers at all - we use the DRM to wait for VBlank.

But then use X11 for rendering? If so, part of the reason for the latency being too high could
be the several context switches between the interrupt and the rendering operation taking
place. By using OpenGL with current Mesa and a current drm Git snapshot, it would be possible
to have the DRM emit the buffer swap operation from a tasklet triggered by the interrupt, and
the application could generate frames ahead of time and transparently sleep when it's too far
ahead.

Matthew Garrett on the race to idle

Posted May 13, 2008 14:46 UTC (Tue) by farnz (subscriber, #17727) [Link]

We do use X11 for rendering, but it's definitely the move from a low C-state that kills us. We can easily measure the latency of the rendering operation; in the worst case we've seen thus far (not on an Intel platform), we see around 0.03 milliseconds between the wait for VBlank syscall returning, and us being about to enter the wait again. Note that we sync the X stream at this point, so we don't re-enter the wait until the server has finished drawing.

Note that we don't draw by blitting a new frame from scratch; we delta the existing on-screen frame instead, which minimises the work involved.

Matthew Garrett on the race to idle

Posted May 11, 2008 7:19 UTC (Sun) by jd (guest, #26381) [Link]

It also depends on the nature of the hardware. Anything that can be offloaded from the CPU
onto a lower-power device would allow you to do work without CPU intervention and therefore
not require the CPU to be active. We might see more offloading and more intelligent
peripherals as PCIe 2.x gains market share, but certainly more if power savings start proving
substantial.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.