
Memory power management, 2013 edition

By Jonathan Corbet
April 17, 2013
When developers talk about power management, they are almost always concerned with the behavior of the CPU, since that is where the largest savings tend to be found. Computers are made up of more than just CPUs, though, and the other components require power as well. Seemingly about once per year, attention turns to reducing the power demands of the RAM in the system; since RAM can account for up to one third of a system's total power budget, this focus makes sense. Accordingly, LWN looked at this issue in 2011 and again in 2012. Now there is a new memory power management patch set in circulation, so another look seems warranted.

The most recent patch set comes from Srivatsa S. Bhat; it differs from previous approaches in a number of ways. For example, it targets memory controllers that have automatic, content-preserving power management modes. Such controllers divide memory into a set of regions, each of which can be powered down independently when the controller detects that there have been no memory accesses to the region in the recent past. The strategy to use is fairly obvious: try to keep as many memory regions as possible empty so that they will stay powered down.

The first step is to keep track of those regions in the memory management subsystem. Previous patches have used the zone system (which divides memory with different characteristics — high and low memory on 32-bit systems, for example) to track regions. The problem with this approach is that it causes an explosion in the number of zones; that leads to more memory management overhead and challenges in keeping memory usage balanced across all those zones. Srivatsa's patch, instead, tracks regions as separate entities in parallel with zones, avoiding this problem.
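To give a rough idea of what tracking regions in parallel with zones might look like, here is a small sketch in C. The structure and helper names (mem_region, zone_mem_region, pfn_to_region()) are invented for illustration and are not taken from Srivatsa's patches; the point is simply that each zone carries a small, fixed table of the hardware regions overlapping it, rather than being split into more zones.

    /*
     * Illustrative sketch only; the structure and helper names here are
     * hypothetical, not the ones used in the actual patch set.
     */
    #define MAX_NR_REGIONS 16        /* assumption: a small, fixed number of regions */

    struct mem_region {
        unsigned long start_pfn;     /* first page frame in the region */
        unsigned long end_pfn;       /* one past the last page frame */
        unsigned long nr_free;       /* free pages currently in the region */
    };

    /*
     * Per-zone table of the regions that overlap it, kept alongside the
     * zone itself so that the number of zones does not grow.
     */
    struct zone_mem_region {
        struct mem_region regions[MAX_NR_REGIONS];
        int nr_regions;
    };

    /* Map a page frame number to the region containing it, or -1 if none. */
    static int pfn_to_region(const struct zone_mem_region *z, unsigned long pfn)
    {
        int i;

        for (i = 0; i < z->nr_regions; i++)
            if (pfn >= z->regions[i].start_pfn && pfn < z->regions[i].end_pfn)
                return i;
        return -1;
    }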

Once the kernel knows where the regions are, the trick is to concentrate memory allocations on a relatively small number of those regions whenever possible. To that end, the patch set causes the list of free pages to be sorted by region, so that allocations from the head of the list will come from the lowest-numbered region with available pages. Note that sorting within a region is not necessary; it is sufficient that all pages in a given region are grouped together. A set of pointers into the free list, one per region, helps newly-freed pages to be quickly added to the list in the correct location.
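A minimal sketch of that grouping idea follows; the structures and the free_list_add() helper are made up here for illustration and are far simpler than the kernel's real free-list code. The thing to notice is the per-region insertion pointer, which lets a freed page be linked into its region's group in constant time.

    /*
     * Illustrative sketch only, not the kernel's free-list code.  Pages on
     * the list are grouped by region, lowest-numbered region first, and a
     * per-region tail pointer lets a freed page be inserted in O(1).
     */
    #include <stddef.h>

    #define NR_REGIONS 16

    struct free_page {
        struct free_page *next;
        struct free_page *prev;
        int region;                         /* region this page belongs to */
    };

    struct region_free_list {
        struct free_page *head;             /* allocations come from here */
        struct free_page *tail[NR_REGIONS]; /* last listed page of each region */
    };

    static void free_list_add(struct region_free_list *fl, struct free_page *page)
    {
        struct free_page *after = fl->tail[page->region];
        int r;

        /* No page from this region on the list yet: insert after the last
         * page of the nearest lower-numbered region that has pages listed. */
        for (r = page->region - 1; r >= 0 && !after; r--)
            after = fl->tail[r];

        if (!after) {                       /* becomes the new head of the list */
            page->prev = NULL;
            page->next = fl->head;
            if (fl->head)
                fl->head->prev = page;
            fl->head = page;
        } else {                            /* link in right after 'after' */
            page->prev = after;
            page->next = after->next;
            if (after->next)
                after->next->prev = page;
            after->next = page;
        }
        fl->tail[page->region] = page;
    }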

Region-aware allocation can help to keep active pages grouped together, but, in the real world, allocated pages will still end up being spread across physical memory over time. Unless other measures are taken, most regions will end up with active pages even when the system is under relatively light memory load; that will make powering down those regions difficult or impossible. So, inevitably, Srivatsa's patch set includes a mechanism for migrating pages out of regions.

Vacating regions of memory is not a new problem; the contiguous memory allocator (CMA) mechanism must sometimes take active measures to create large contiguous blocks, for example. So this particular problem has already been solved in the kernel. Rather than add a new compaction scheme, Srivatsa's patch set modifies the CMA implementation to make it suitable for memory power management uses as well. The result is a simple compact_range() function that can be invoked by either subsystem to move pages and free a range of memory.
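The article only names the function, so the following is a guess at what its interface and a power-management caller might look like; the signature and the try_vacate_region() wrapper are hypothetical and may not match the actual patch.

    /*
     * Interface sketch only; the real signature in the patch set may differ.
     * Both CMA and the memory power management code want the same thing:
     * migrate every movable page out of [start_pfn, end_pfn).
     */
    int compact_range(unsigned long start_pfn, unsigned long end_pfn);

    /* Hypothetical caller trying to empty one memory region. */
    static int try_vacate_region(unsigned long start_pfn, unsigned long end_pfn)
    {
        int ret = compact_range(start_pfn, end_pfn);

        if (ret)
            return ret;   /* some pages could not be moved; give up for now */
        /*
         * The region now holds no active pages; the memory controller will
         * power it down on its own once it sees no accesses.
         */
        return 0;
    }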

There is still the question of when the kernel should try to vacate a memory region. If it does not happen often enough, power consumption will be higher than it needs to be. Excessive page migration will simply soak up CPU time, though, with no resulting power savings. Indeed, overly aggressive compaction could result in higher power usage than before. So some sort of control mechanism is required.

In this patch set, the page allocator has been enhanced to notice when it starts allocating pages from a new memory region. That new region, by virtue of having been protected from allocations until now, should not have many pages allocated; that makes it a natural target for compaction. But it makes no sense to attempt that compaction when the page is being allocated, since, clearly, no free pages exist in the lower-numbered regions. So the page allocator does not attempt compaction at that time; instead, it sets a flag indicating that compaction should be attempted in the near future.
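In code, the idea might look roughly like the fragment below. The flag array and helper are invented for illustration (the real patch hooks into the allocator's fast path rather than a helper like this), but the shape is as described: note the first allocation from a new region and defer the compaction decision.

    /*
     * Illustrative sketch; the flag array and helper are hypothetical.
     */
    #include <stdbool.h>

    #define NR_REGIONS 16

    bool region_needs_evacuation[NR_REGIONS];
    static int last_alloc_region = -1;

    /* Called whenever the allocator hands out a page from 'region'. */
    static void note_allocation(int region)
    {
        if (region == last_alloc_region)
            return;
        /*
         * We have spilled over into a new region.  Compacting it right now
         * would be futile (the lower-numbered regions have no free pages),
         * so just flag it as a candidate to be vacated later, once pages
         * start being freed again.
         */
        region_needs_evacuation[region] = true;
        last_alloc_region = region;
    }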

The "near future" means when pages are freed. When some pages are given back to the allocator, it might be possible to use those pages to free a lightly-used region of memory. So that is the time when compaction is attempted; a workqueue function will be kicked off to attempt to vacate any regions that had previously been marked by the allocator. That code will only make the attempt, though, if a relatively small number of pages (32 in the current patch) would need to be migrated. Otherwise the cost is deemed to be too high and the region is left alone.

The patch set is still young, so there is not a lot of performance data available. In the introduction, though, a 6% power savings is claimed when running on a 2GB Samsung Exynos board, with the potential for more held out if other parts of the memory management subsystem can be made more power aware. One question that is not answered in the patch set: on a typical Linux system, very few pages are truly "free"; most are occupied by the page cache. To be able to vacate regions, it seems that a more aggressive approach to reclaiming page-cache pages would be required. There are undoubtedly other concerns that would need to be addressed as well; perhaps they will be discussed in the 2014 update, if not before.



Memory power management, 2013 edition

Posted Apr 20, 2013 12:32 UTC (Sat) by Lennie (subscriber, #49641) [Link]

I once read that, on servers, memory (because of constant refreshes) uses on average as much power as the CPUs (that might be inaccurate now).

That is why I wondered at the time whether turning off parts of memory was possible, but I never took the time to look into it.

Thank you for this article, very interesting stuff.

Memory power management, 2013 edition

Posted Jun 26, 2013 9:18 UTC (Wed) by l3b2w1 (guest, #88009) [Link]

Well, I have read all three of the articles posted.
My question is whether this patch set has been merged into the kernel, or whether it is still only being tested.
Could someone give me a clear answer? Thanks.
