Frequency vs. power consumption
Posted Feb 29, 2012 18:05 UTC (Wed) by jzbiciak (✭ supporter ✭, #5246)
> Your point about a cache-missy workload being a good candidate for lower CPU frequency is a good one, depending of course on how much of the memory system is in the same clock domain.
I just amused myself wondering what a good API would be for hinting to the OS about this... Maybe not madvise(MADV_CACHE_THRASHY) (for the simple reason that it's an attribute of the task, not an attribute of a memory page), but then what?
Posted Feb 29, 2012 18:06 UTC (Wed) by jzbiciak (✭ supporter ✭, #5246)
On a more serious note, I wonder if the OS could use hardware performance counters to auto-detect this sort of thing. If the ratio of stall cycles to execute cycles is above a certain threshold, decrease the task's desired clock speed, and if it's below another threshold, increase it. Hmmm....
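That feedback loop could look something like the following toy sketch. To be clear, this is just an illustration of the idea, not a real governor: the threshold values, frequency steps, and function names are all invented here.

```python
# Toy sketch of a stall-ratio feedback loop for per-task frequency scaling.
# All names and thresholds are illustrative, not a real kernel interface.

LOWER_FREQ_ABOVE = 0.60   # stall ratio above this: task looks memory-bound
RAISE_FREQ_BELOW = 0.30   # stall ratio below this: task looks compute-bound

def next_freq_step(stall_cycles, total_cycles, current_step, max_step=4):
    """Return the new frequency step after one sampling interval."""
    if total_cycles == 0:
        return current_step            # no data; leave frequency alone
    stall_ratio = stall_cycles / total_cycles
    if stall_ratio > LOWER_FREQ_ABOVE and current_step > 0:
        return current_step - 1        # mostly waiting on memory: slow down
    if stall_ratio < RAISE_FREQ_BELOW and current_step < max_step:
        return current_step + 1        # mostly executing: speed up
    return current_step                # in the hysteresis band: hold steady
```

The gap between the two thresholds gives you hysteresis, so a task hovering near a single threshold doesn't make the frequency oscillate every sampling period.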
Posted Mar 27, 2012 12:51 UTC (Tue) by chrisr (guest, #83108)
For me, this means that the only reasonable way to know a thread's optimal runtime performance requirement is to measure it while it's running and to age older measurements, as we do with load measurement.
Whether it is possible to do this in a generic enough manner that it could be accepted into the mainline Linux kernel is, IMO, a different matter from the technical problem of doing the measurement and using the results for something useful, and it will probably take longer, too :)
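Aging older measurements the way the load average does can be sketched with an exponentially weighted moving average. The decay factor below is an arbitrary choice for illustration, and the kernel's actual load tracking uses fixed-point arithmetic rather than floats.

```python
# Sketch: age older stall-ratio samples with an exponential moving average,
# analogous in spirit to how the load average decays old contributions.

DECAY = 0.8  # fraction of the previous estimate kept each sampling period

def update_estimate(old_estimate, new_sample, decay=DECAY):
    """Blend a fresh sample into the running estimate, aging older data."""
    return decay * old_estimate + (1.0 - decay) * new_sample
```

After a burst of high-stall samples the estimate climbs toward 1.0, but a task that changes behaviour sees its old history fade within a handful of sampling periods.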
Posted Mar 27, 2012 14:06 UTC (Tue) by jzbiciak (✭ supporter ✭, #5246)
Well, remember, madvise() is a hint. I think you and I likely agree that the system should run reasonably if nobody ever calls it in a typical application. It exists to help you take performance from "reasonable, if not quite optimal" to "stellar." So, if you're bzip2 you might consider throwing that flag. As you said, though, it's less clear if you're a web browser or office app. (Although, I strongly suspect both are more cache thrashy than they'd like to be, even when idle.)
That said, my madvise() proposal above was partly tongue-in-cheek. (It was also partly inspired by MADV_RANDOM.) It would be interesting, though, to try to characterize apps by their recent cache miss ratios and use that to make CPU affinity selections as well as operating-frequency selections.
Actually, you need two ratios: miss/(hit+miss) and stall/(stall+non-stall) cycles. You could have a fairly high miss ratio with a low stall ratio; a faster CPU still helps you there. The high miss ratio suggests streaming or a large working set, but there's enough prefetching and/or inherent parallelism that the CPU can stay busy. A high miss ratio with a high stall ratio suggests a more serial program that's staying memory bound, and a lower frequency or a slower CPU is unlikely to hurt its performance.
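Put as code, the two-ratio test reduces to a single predicate. The 0.5 cutoff here is purely illustrative; only the high-miss cases are classified, since those are the ones the argument above covers.

```python
# Following the two-ratio argument above: lowering the frequency is only
# clearly safe when BOTH the miss ratio and the stall ratio are high.
# The 0.5 cutoff is an arbitrary illustration, not a measured threshold.

def lowering_frequency_safe(miss_ratio, stall_ratio, cutoff=0.5):
    """True when the task looks memory-bound (high miss AND high stall).

    High miss with low stall means prefetching or parallelism is keeping
    the CPU busy despite the misses, so a faster clock still helps.
    """
    return miss_ratio >= cutoff and stall_ratio >= cutoff
```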
Anyway... I've devolved into ramble-ville.
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds