
KS2012: memcg/mm: Reclaiming mapped pages

By Michael Kerrisk
September 17, 2012


There is a problem with file pages accessed through mmap() areas: the kernel cannot see how frequently those pages are used. This was the starting point of a session at the 2012 Kernel Summit memcg/mm minisummit that considered some of the complexities of reclaiming mapped pages. The root of the problem is that each page has only a single "accessed" bit associated with it. During page reclaim, the kernel sees only whether a page has been accessed since it was faulted in, not how often. Historically, the assumption was that these pages were important, and the kernel preferred to reclaim other pages before them. This works most of the time, since most files accessed through mmap() are executables, which are indeed important and usually take up no more than a few megabytes.

However, sometimes this assumption is false. Some applications map lots of file pages but access them only once; one example is a BitTorrent client that validates multi-gigabyte files by mapping the data chunk by chunk in order to calculate a checksum. Faced with this kind of workload, the memory-management algorithm described above struggled: most of the mapped pages in memory were now these "used-once" pages. Consequently, page reclaim stalled the system, in extreme cases for several seconds at a time. Used-once detection was introduced so that page reclaim could get rid of those pages quickly.
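To make the "used-once" access pattern concrete, here is a minimal user-space sketch of a checksum computed through short-lived mappings. It is not taken from any real BitTorrent client; the function name `checksum_by_mapped_chunks`, the choice of SHA-1, and the default chunk size are all assumptions made for illustration.

```python
import hashlib
import mmap
import os


def checksum_by_mapped_chunks(path, chunk_size=mmap.ALLOCATIONGRANULARITY * 256):
    """Return the SHA-1 hex digest of a file, read via one mapping per chunk.

    Hypothetical helper: every page of the file is mapped, touched once
    while it is hashed, and then unmapped again.
    """
    # mmap offsets must be multiples of the allocation granularity.
    if chunk_size <= 0 or chunk_size % mmap.ALLOCATIONGRANULARITY != 0:
        raise ValueError("chunk_size must be a positive multiple of "
                         "mmap.ALLOCATIONGRANULARITY")

    digest = hashlib.sha1()
    file_size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < file_size:
            length = min(chunk_size, file_size - offset)
            # Map only the current chunk; the mapping is dropped on exit.
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as window:
                digest.update(window)
            offset += length
    return digest.hexdigest()
```

Calling `checksum_by_mapped_chunks("/path/to/large/file")` on a multi-gigabyte file would fault in every page of that file through a mapping exactly once, which is the kind of workload that used-once detection is meant to handle.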

Johannes Weiner described a database workload that is adversely affected by the used-once approach. The database repeatedly accesses a small number of mapped pages (the database index) in bursts; within each burst, all of the accesses are detected as just a single access. It then retrieves data from the database, accessing a large number of unmapped file pages; this pushes the mapped database index out of memory again, so that, on the next access, the index must once more be retrieved from disk.
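The way a burst of accesses collapses into a single observation can be shown with a toy model. This is only an illustrative simulation, not kernel code: the `Page` class, its `touch()` method, and the `scan()` function are hypothetical names, and the model assumes that a reclaim scan simply reads and clears one bit per page.

```python
class Page:
    """Toy page with a single accessed bit (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.accessed = False        # the only thing a scan can observe
        self.true_access_count = 0   # ground truth, invisible to the scan

    def touch(self):
        """Record an access: the bit is set, however often this is called."""
        self.true_access_count += 1
        self.accessed = True


def scan(pages):
    """Model one reclaim scan: report each page's bit, then clear it."""
    observed = {}
    for page in pages:
        observed[page.name] = page.accessed
        page.accessed = False
    return observed
```

In this model, a page touched a thousand times in one burst and a page touched once look identical to `scan()`: both simply report `True`, and both report `False` on the following scan if nothing touches them in between.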

The point that Johannes wanted to make is that an algorithm based only on the accessed bit cannot make the right reclaim decisions for mapped pages in all circumstances. He then proposed determining the aggressiveness of mapped-page reclaim based on the number of mapped pages compared to overall memory. There are two extremes of the spectrum to be considered. At one end, when there are very few mapped pages, they take up very little space. Not reclaiming them does not hurt the system, no matter how unused they may be. At the other end, when most of the pages in memory are mapped file pages, the system must reclaim them, no matter how heavily they are used, since the system is under pressure and has to free some memory. Either way, the approach would rely much less on the inaccurate guidance provided by the accessed bit. The problem remains unresolved, but Johannes is looking for solutions.
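The report does not say what form such a heuristic would take, so the following is purely a sketch of the two extremes, under the assumption that aggressiveness scales linearly with the fraction of memory occupied by mapped file pages. The function names `mapped_reclaim_pressure` and `split_scan_budget`, and the linear scaling itself, are hypothetical and do not describe any actual kernel code.

```python
def mapped_reclaim_pressure(mapped_pages, total_pages):
    """Return a value in [0.0, 1.0]: how hard to reclaim mapped pages.

    Assumption: pressure is simply the mapped fraction of memory.
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    if not 0 <= mapped_pages <= total_pages:
        raise ValueError("mapped_pages must be between 0 and total_pages")
    return mapped_pages / total_pages


def split_scan_budget(scan_budget, mapped_pages, total_pages):
    """Divide a scan budget between mapped and unmapped pages.

    Returns (pages_to_scan_mapped, pages_to_scan_unmapped).
    """
    pressure = mapped_reclaim_pressure(mapped_pages, total_pages)
    mapped_share = round(scan_budget * pressure)
    return mapped_share, scan_budget - mapped_share
```

With no mapped pages the whole budget goes to unmapped pages, and when all of memory is mapped file pages the whole budget goes to them; in neither case is the accessed bit consulted. A real policy would presumably need something less naive for the cases in between.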





Copyright © 2012, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds