Matthew Garrett on the race to idle Posted May 10, 2008 0:12 UTC (Sat) by mjg59 (subscriber, #23239) In reply to: Matthew Garrett on the race to idle by dlang Parent article: Matthew Garrett on the race to idle
Shifting from C4 to C0 takes around 17 microseconds on my hardware, which is pretty typical. I think that's somewhat less than 100 milliseconds :)
Matthew Garrett on the race to idle Posted May 10, 2008 0:17 UTC (Sat) by dlang (subscriber, #313) [Link] try it again with a USB device plugged in and the numbers can change drastically. some hardware combinations work well at shifting from one mode to another, others don't work nearly as well. David Lang
Matthew Garrett on the race to idle Posted May 10, 2008 0:52 UTC (Sat) by mjg59 (subscriber, #23239) [Link] USB's tendency to trigger DMA means that you're going to spend more time in C2 than would be ideal, but on anything made in the past 5 years that's still going to result in you saving more power than staying in C0 at a low voltage. Recent hardware will even automatically promote itself from C3 to C2 without OS intervention.
Matthew Garrett on the race to idle Posted May 10, 2008 5:06 UTC (Sat) by dlang (subscriber, #313) [Link] in theory you are right, in practice it doesn't work as well. take the OLPC laptop, designed for very good power management at the hardware level (with serious talk of going to sleep between keystrokes). due in large part to the need to have long enough delays to properly talk to things like external USB and SD devices, this machine takes >200ms to wake up. yes, there is definitely room for improvement in the software, but the reliable interfacing to external (but effectively permanently attached) devices is bad enough that doing race-to-idle ends up being a horrible thing in practice (they tried doing it in a few builds, it was bad enough that people started disabling sleep entirely)
Matthew Garrett on the race to idle Posted May 10, 2008 10:25 UTC (Sat) by mjg59 (subscriber, #23239) [Link] That's not race to idle, it's race to suspend. Linux is currently deficient in its requirement that the entire device tree be resumed before userspace can restart, which results in suckage like you describe. But that's an orthogonal issue - if you're talking about the processor rather than the platform, then the power saving states add only small levels of latency.
Matthew Garrett on the race to idle Posted May 10, 2008 10:37 UTC (Sat) by dlang (subscriber, #313) [Link] if you aren't switching to a suspend mode when you hit the idle state, what is the benefit of becoming idle? you can blame it on whatever layers you want, but for the user the result is the same: in practice the race-to-idle approach does not currently give a reasonable user experience (in many cases), and as a result, switching to a lower clock speed instead of race-to-idle is actually better.
Matthew Garrett on the race to idle Posted May 10, 2008 10:53 UTC (Sat) by mjg59 (subscriber, #23239) [Link] The benefit is that your CPU power draw drops to approximately nothing. On modern CPUs, halving the speed of your processor doesn't halve its power draw. Letting it idle takes it down to 0-1 watts.
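The trade-off described in this comment can be put into numbers. The following is a minimal sketch, not a measurement: the wattages (20 W at full speed, 14 W at half speed, 1 W idle) and the 2-second window are hypothetical figures chosen only to illustrate the case where halving the clock does not halve the power draw. The function name `energy_joules` is likewise made up for this example.

```python
def energy_joules(active_watts, busy_seconds, idle_watts, window_seconds):
    """Energy used over a fixed window: busy at active_watts, idle for the rest."""
    idle_seconds = window_seconds - busy_seconds
    if idle_seconds < 0:
        raise ValueError("the work does not fit in the window")
    return active_watts * busy_seconds + idle_watts * idle_seconds


# Hypothetical figures, for illustration only (not taken from any datasheet).
FULL_SPEED_WATTS = 20.0   # assumed draw at full clock
HALF_SPEED_WATTS = 14.0   # assumed draw at half clock (more than half of 20 W)
IDLE_WATTS = 1.0          # assumed draw in a deep idle state
WINDOW_SECONDS = 2.0      # period over which the two strategies are compared

# The same job takes 1 s at full speed and 2 s at half speed.
race_to_idle = energy_joules(FULL_SPEED_WATTS, 1.0, IDLE_WATTS, WINDOW_SECONDS)
half_speed = energy_joules(HALF_SPEED_WATTS, 2.0, IDLE_WATTS, WINDOW_SECONDS)

print(f"race to idle: {race_to_idle} J")  # 20*1 + 1*1 = 21.0 J
print(f"half speed:   {half_speed} J")    # 14*2 + 1*0 = 28.0 J
```

Under these assumed numbers, running slowly only wins if the half-speed draw falls below 10.5 W. Whether real hardware gets there depends on how far the voltage can be lowered along with the clock, which is the point questioned in the reply below.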
Matthew Garrett on the race to idle Posted May 11, 2008 11:23 UTC (Sun) by IkeTo (subscriber, #2122) [Link] > On modern CPUs, halving the speed of your processor doesn't halve its power draw. Hm... I read otherwise somewhere else, if you count only the power of the CPU. (Actually it should save more than half of its power, otherwise why slow down?) Would you mind sharing with us where you get this idea?
Matthew Garrett on the race to idle Posted May 10, 2008 18:31 UTC (Sat) by dilinger (subscriber, #2867) [Link] Please do not use the OLPC laptop as an example. The power management software is not nearly close to being finished, and OLPC has suffered greatly from lack of manpower (I'll spare you the details of _why_ we haven't had enough people). In reality, there should be no reason why we can't do <200ms resume; however, no one within the organization has even _started_ optimizing away the extra 800ms that we deal with. The automatic suspend stuff that we had been toying with was merely a hack.
Matthew Garrett on the race to idle Posted May 11, 2008 14:44 UTC (Sun) by daniels (subscriber, #16193) [Link] I wouldn't call Geode/OLPC 'designed for good power management': it's lower power than most x86, sure, but isn't even in the same league as consumer device hardware like ARM and MIPS. If you can actually measure the wake-from-sleep time at all, then you've lost. Consumer hardware like the Nokia 770/N800/N810, OpenMoko, and similar, all race to sleep, and sleep/resume much, much more often than you think (probably by a factor of thousands). It really can be about as transparent as the x86 execution/idle switch if your hardware is decent, and you do it right. Current Linux on ARM definitely does it right. USB does make this more difficult by its very nature, but this isn't any more a problem with the concept of race-to-sleep than FireWire's remote DMA security fiasco is a problem with the concept of externally pluggable mass storage.
Matthew Garrett on the race to idle Posted May 10, 2008 8:58 UTC (Sat) by farnz (subscriber, #17727) [Link] Unfortunately, this conflicts hard with real-time constraints; I work with a soft real-time system based on Linux, and I've had to disable dropping into deeper C-states. Our system includes a smooth text scroller, which works by updating the screen every frame (16 milliseconds); thanks to the high performance of X11 on Intel Q35, the update takes less than a millisecond, and then that thread goes to sleep until the next frame starts. With the latest Intel processors, we found that the screen jerked, as there was nothing but this update on one core, and it was going into a low C-state as soon as the update completed, then not coming out of idle until too late. I note that there's an in-kernel mechanism to let drivers tell the scheduler about their latency needs (allowing them to always be scheduled on the core that's not being put into a low C-state); it's a shame that there's no way for a process to do the same, yet.
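The late wake-ups described in this comment can be observed from userspace. The following is a rough sketch only: it uses a plain timer sleep against a 60 Hz deadline as a stand-in, whereas the system described above waits for VBlank through the DRM. The function name `measure_wakeup_lateness` and the sample count are made up for this example.

```python
import time

FRAME_PERIOD_S = 1.0 / 60.0  # about 16.7 ms per frame at 60 Hz


def measure_wakeup_lateness(period_s, frames):
    """Sleep until each periodic deadline; return how late each wake-up was (seconds)."""
    lateness = []
    deadline = time.monotonic() + period_s
    for _ in range(frames):
        remaining = deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
        # Time between the intended deadline and the moment we actually ran.
        lateness.append(time.monotonic() - deadline)
        deadline += period_s
    return lateness


if __name__ == "__main__":
    samples = measure_wakeup_lateness(FRAME_PERIOD_S, 120)
    print(f"worst wake-up lateness: {max(samples) * 1000:.3f} ms")
    print(f"mean wake-up lateness:  {sum(samples) / len(samples) * 1000:.3f} ms")
```

Comparing the worst-case figure with deep C-states enabled and disabled would give a first indication of how much of the frame budget the idle-state exit latency consumes. Scheduler and timer behaviour also contribute, so the number is only an upper bound on the C-state effect.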
Matthew Garrett on the race to idle Posted May 10, 2008 10:27 UTC (Sat) by mjg59 (subscriber, #23239) [Link] I agree with that. Applications need to be able to indicate that they're unable to tolerate latency, and the scheduler and CPU governor need to cope with the extra restriction. I've been talking to some of the embedded people about this, in terms of what sort of userspace-visible knobs we need for effective power management without screwing up userspace.
Matthew Garrett on the race to idle Posted May 12, 2008 17:03 UTC (Mon) by mgross (subscriber, #38112) [Link] kernel/pm_qos_params.c provides an interface for communicating acceptable latencies. (new in 2.6.25) --mgross
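From userspace, the interface mentioned here is reached through a character device. The following is a minimal sketch, assuming the node is `/dev/cpu_dma_latency` and that it accepts a native 32-bit integer in microseconds, as the kernel's PM QoS documentation describes; neither detail appears in the comment itself. The helper names `encode_latency` and `request_max_latency` are made up for this example, and opening the device normally requires root.

```python
import os
import struct


def encode_latency(usec):
    """Pack a latency bound in microseconds as a native-endian 32-bit integer."""
    return struct.pack("=i", usec)


def request_max_latency(usec, path="/dev/cpu_dma_latency"):
    """Write a latency bound to the device and return the open file descriptor.

    The request is assumed to stay in force only while the descriptor is open,
    so the caller must keep it and close it when the constraint is no longer needed.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, encode_latency(usec))
    except OSError:
        os.close(fd)
        raise
    return fd


if __name__ == "__main__":
    # Ask that wake-up latency stay under 20 microseconds (an arbitrary figure).
    fd = request_max_latency(20)
    try:
        input("Latency request active; press Enter to release it. ")
    finally:
        os.close(fd)
```

Tying the request to an open descriptor means that a crashed process cannot leave the system pinned in a shallow idle state, since the descriptor is closed on exit.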
Matthew Garrett on the race to idle Posted May 13, 2008 0:49 UTC (Tue) by nix (subscriber, #2304) [Link] ... and it's built in unconditionally, even if you have no power management configured, which seems rather strange.
Matthew Garrett on the race to idle Posted May 10, 2008 18:39 UTC (Sat) by renox (guest, #23785) [Link] Weird, if your application needs to update the screen every 16ms, this means that you're using some kind of timer to wake up every 16ms, so the kernel ought to be aware of this deadline and change the C-state of the CPU accordingly. Either I'm misunderstanding something or there is a bug somewhere; have you discussed this on the LKML?
Matthew Garrett on the race to idle Posted May 10, 2008 19:23 UTC (Sat) by farnz (subscriber, #17727) [Link] Not yet discussed this on the LKML - disabling C states is a good enough workaround for now, and I'm currently knee-deep in VIA hardware issues to debug. We're not using timers at all - we use the DRM to wait for VBlank. The kernel can't easily know (without kernel modesetting, which is a whole different can of worms to fix) that the frame rate of our screens is 60Hz. Once kernel modesetting lands, I'll certainly be looking into ensuring that the kernel is aware of the latency limits that wait for VBlank implies.
Matthew Garrett on the race to idle Posted May 13, 2008 13:41 UTC (Tue) by daenzer (subscriber, #7050) [Link] > We're not using timers at all - we use the DRM to wait for VBlank. But then use X11 for rendering? If so, part of the reason for the latency being too high could be the several context switches between the interrupt and the rendering operation taking place. By using OpenGL with current Mesa and a current drm Git snapshot, it would be possible to have the DRM emit the buffer swap operation from a tasklet triggered by the interrupt, and the application could generate frames ahead of time and transparently sleep when it's too far ahead.
Matthew Garrett on the race to idle Posted May 13, 2008 14:46 UTC (Tue) by farnz (subscriber, #17727) [Link] We do use X11 for rendering, but it's definitely the move from a low C-state that kills us. We can easily measure the latency of the rendering operation; in the worst case we've seen thus far (not on an Intel platform), we see around 0.03 milliseconds between the wait for VBlank syscall returning, and us being about to enter the wait again. Note that we sync the X stream at this point, so we don't re-enter the wait until the server has finished drawing. Note that we don't draw by blitting a new frame from scratch; we delta the existing on-screen frame instead, which minimises the work involved.
Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds