Frequency vs. power consumption

Posted Mar 27, 2012 12:51 UTC (Tue) by chrisr (guest, #83108)
In reply to: Frequency vs. power consumption by jzbiciak
Parent article: The Linaro Connect scheduler minisummit

I don't believe it would be reasonable for the kernel to expect applications to tell it how cache-missy or otherwise they are. I would expect the cache miss ratio to vary significantly across portions of an application, and in any case cores often come with different cache sizes. This makes it hard to predict how an application will perform on any particular platform.

For me, this means that the only reasonable way to know a thread's optimal runtime performance requirement is to measure it while it's running and age older measurements, as we do with load measurement.
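To make "measure and age" concrete, here is a minimal sketch of an exponentially aged estimate. It is not how the kernel's load tracking is implemented; the class name `AgedRatio` and the `decay` value are assumptions chosen purely for illustration.

```python
class AgedRatio:
    """Exponentially aged estimate of a per-thread ratio (hypothetical helper)."""

    def __init__(self, decay=0.5):
        # decay: weight kept by the old estimate at each sample period.
        # 0.5 is an assumed value, not something taken from the kernel.
        self.decay = decay
        self.value = None

    def update(self, sample):
        """Fold in the newest measurement; older ones fade geometrically."""
        if self.value is None:
            # First measurement: nothing to age yet.
            self.value = sample
        else:
            self.value = self.decay * self.value + (1.0 - self.decay) * sample
        return self.value


if __name__ == "__main__":
    miss_ratio = AgedRatio(decay=0.5)
    for sample in (1.0, 0.0, 0.0):
        print(miss_ratio.update(sample))
```

With `decay=0.5`, a single old sample loses half its weight every period, so a thread that changes behaviour is re-characterised after a handful of samples.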

Whether it is possible to do this in a generic enough manner that it could be accepted into the mainline Linux kernel is IMO a different matter from the technical problem of doing the measurement and using those indications for something useful, and will probably take longer too :)


Frequency vs. power consumption

Posted Mar 27, 2012 14:06 UTC (Tue) by jzbiciak (✭ supporter ✭, #5246)

Well, remember, madvise() is a hint. I think you and I likely agree that the system should run reasonably if nobody ever calls it in a typical application. It exists to help you take performance from "reasonable, if not quite optimal" to "stellar." So, if you're bzip2 you might consider throwing that flag. As you said, though, it's less clear if you're a web browser or office app. (Although, I strongly suspect both are more cache thrashy than they'd like to be, even when idle.)

That said, my madvise() proposal above was partly tongue in cheek. (It was also partly inspired by MADV_RANDOM.) It would be interesting, though, to try to characterize apps by their recent cache miss ratios and use that to make CPU affinity selections as well as operating frequency selections.
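For reference, here is roughly what an existing hint like MADV_RANDOM looks like from user space. This is only a sketch using Python's `mmap` wrapper (Python 3.8+); the function name `advise_random` is made up for illustration, and no cache-behaviour flag of the kind proposed above exists. Note that the program behaves the same whether or not the hint is issued.

```python
import mmap


def advise_random(size=mmap.PAGESIZE * 16):
    """Map anonymous memory and hint random access (hypothetical helper)."""
    buf = mmap.mmap(-1, size)  # anonymous mapping
    advised = False
    # madvise() and MADV_RANDOM are only present where the platform has them.
    if hasattr(buf, "madvise") and hasattr(mmap, "MADV_RANDOM"):
        buf.madvise(mmap.MADV_RANDOM)  # a hint only; the kernel may ignore it
        advised = True
    # The mapping works identically with or without the hint.
    buf[0:5] = b"hello"
    data = bytes(buf[0:5])
    buf.close()
    return advised, data


if __name__ == "__main__":
    print(advise_random())
```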

Actually, you need two ratios: miss/(hit+miss) and stall/(stall+non-stall) cycles. You could have a fairly high miss ratio with a low stall ratio, in which case a faster CPU still helps you: the high miss ratio suggests streaming or a large working set, but there's enough prefetching and/or inherent parallelism that the CPU can stay busy. A high miss ratio with a high stall ratio suggests a more serial program that's staying memory bound; there, a lower frequency or a slower CPU is unlikely to hurt the performance of the program.
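A rough sketch of how those two ratios might be combined follows. The function name, the labels, and both thresholds are assumptions picked for illustration, not measured values; the low-miss case isn't discussed above, so the sketch simply lumps it into one bucket.

```python
def classify(hits, misses, stall_cycles, busy_cycles,
             miss_threshold=0.2, stall_threshold=0.5):
    """Label a thread from its recent counters (hypothetical helper).

    miss_threshold and stall_threshold are assumed cut-offs.
    """
    accesses = hits + misses
    cycles = stall_cycles + busy_cycles
    # Guard against empty sample windows.
    miss_ratio = misses / accesses if accesses else 0.0
    stall_ratio = stall_cycles / cycles if cycles else 0.0

    if miss_ratio < miss_threshold:
        # Mostly hitting in cache; not covered by the two cases above.
        return "cache-friendly"
    if stall_ratio < stall_threshold:
        # High miss, low stall: prefetching/parallelism keeps the CPU busy,
        # so a faster CPU still helps.
        return "streaming"
    # High miss, high stall: memory bound, a slower CPU is unlikely to hurt.
    return "memory-bound"


if __name__ == "__main__":
    print(classify(hits=50, misses=50, stall_cycles=10, busy_cycles=90))
    print(classify(hits=50, misses=50, stall_cycles=80, busy_cycles=20))
```

In practice the inputs would come from aged per-thread counters rather than raw totals, so that the label tracks recent behaviour.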

Anyway... I've devolved into ramble-ville.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds