|
|
Subscribe / Log in / New account

Memory compaction issues

By Jonathan Corbet
March 26, 2014

2014 LSFMM Summit
Memory compaction is the process of relocating active pages in memory in order to create larger, physically contiguous regions — memory defragmentation, in other words. It is useful in a number of ways, not the least of which is making huge pages available. But compaction apparently has some problems of its own; Vlastimil Babka led a brief session in the 2014 Linux Storage, Filesystem, and Memory Management Summit to explore the issue.

After Vlastimil gave a quick overview of how compaction works (also described in this article) and described problems related to compaction overhead, Rik van Riel made the claim that there are two core issues to be looked at in this area: (1) can the compaction code be made to be faster, and (2) when compaction appears to be too expensive, should it just be skipped?

It seems that a number of compaction bugs have been fixed over the years, but some clearly remain. How, it was asked, can they be made easier to find? Writing test programs that reveal compaction problems tends to be hard; these problems arise out of specific workloads that exercise the system in certain ways. There does not appear to be any easy way to abstract the problematic access patterns out of the workloads into separate test programs.

What that means is that the memory management developers don't really have a good understanding of why compaction problems are happening. Some workloads obviously create situations where compaction gets expensive, but how that happens is obscure. So there is clearly a need to gain a better understanding of how the problems come about. One step in that direction might be to add a new counter that is incremented anytime the kernel detects that it has spent a significant amount of time in the compaction code. If that counter starts to increase, that will be a signal that bugs in the compaction code are being tickled. Then, perhaps, it will be possible to try to figure out where those bugs are.

[Your editor would like to thank the Linux Foundation for supporting his travel to the Summit.]

Index entries for this article
KernelMemory management/Large allocations
ConferenceStorage, Filesystem, and Memory-Management Summit/2014


to post comments

Memory compaction issues

Posted Mar 27, 2014 15:13 UTC (Thu) by sbohrer (guest, #61058) [Link]

When I've run into problems with compaction on our servers it was painfully obvious. Our workloads would become 100x slower and CPU time would be almost entirely 100% system time. A quick 'perf record -g -a' would point the finger at the compaction code and an "echo never > /sys/kernel/mm/transparent_hugepage/defrag" would solve the problem.

My question is not, how do I know it is happening? Rather once it is happening on my systems, how can I gather useful information that will allow the memory management developers to understand and fix the issue? These issues are admittedly hard because there is very little incentive for us to track down the problem when we can simply turn off defrag.


Copyright © 2014, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds