Avoiding game-score loss with per-process reclaim

By Michael Kerrisk
April 3, 2013

Minchan Kim's recent patch series to provide user-space-triggered reclaim of a process's pages represents one more point in a spectrum that increasingly sees memory management on Linux as a task that is indirectly influenced or even directly controlled from user space.

Approaches such as Android's low-memory killer represent one end of the page-reclaim spectrum, where memory management is primarily under kernel control. When memory is low, the low-memory killer picks a victim process and kills it outright; applications that live in such an environment have to work with the possibility that they may disappear from one moment to the next. As Minchan pointed out via an amusing example, the effects of the low-memory killer's approach to page reclaim can be extreme:

[Having a process killed to free memory] was really terrible experience because I lost my best score of game I had ever after I switch the phone call while I enjoyed the game

John Stultz's volatile ranges work and Minchan's own work on a similar feature (both described in this article) represent a middle point in the spectrum. The volatile ranges approach, inspired by Android's ashmem, provides a process with a way to inform the kernel that a certain range of its own virtual address space can be preferentially reclaimed if memory pressure is high. Under this approach, the kernel takes no responsibility for the contents of the reclaimed pages: if the kernel needs the memory, the page contents are discarded, and it is assumed that the application has sufficient information available to re-create those pages with the right contents if they are needed. As with the low-memory killer, the decision about if and when to reclaim the pages remains with the kernel.

By contrast, Minchan's patch set places the decision about when to reclaim pages directly under the control of user space. The interface provided by these patches is simple. A /proc/PID/reclaim file is provided for each process. A process with suitable permissions—that is, a process owned by root or one with the same user ID as the process PID—can write one of the following values to the file, to cause some or all of the process's pages to be reclaimed:

  1. Reclaim all process pages in file-backed mappings.
  2. Reclaim all process pages in anonymous (MAP_ANONYMOUS) mappings.
  3. Reclaim all pages of the process (i.e., the combination of 1 and 2).
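As a rough illustration of how a user-space program might drive this interface, the following Python sketch writes one of the three values to a process's reclaim file. It assumes a kernel carrying the proposed patches. The constant names, the function name request_reclaim(), and its proc_root parameter are hypothetical and exist only for this example; proc_root lets the function be tried against a scratch directory.

```python
import os

# Values the article describes for the proposed /proc/PID/reclaim file.
# The constant names are made up for this sketch.
RECLAIM_FILE = 1   # pages in file-backed mappings
RECLAIM_ANON = 2   # pages in anonymous mappings
RECLAIM_ALL = 3    # both of the above


def request_reclaim(pid, mode, proc_root="/proc"):
    """Write a reclaim request for process `pid` and return the path used.

    `proc_root` is a hypothetical parameter so the function can be
    exercised against a scratch directory instead of the real /proc.
    """
    if mode not in (RECLAIM_FILE, RECLAIM_ANON, RECLAIM_ALL):
        raise ValueError("mode must be 1, 2, or 3")
    path = os.path.join(proc_root, str(pid), "reclaim")
    # open() raises OSError if the file is missing (for example, on a
    # kernel without the patches) or if the caller lacks permission.
    with open(path, "w") as f:
        f.write(str(mode))
    return path
```

A caller owned by root, or by the same user as the target, might then run request_reclaim(1234, RECLAIM_ANON) to ask for the anonymous pages of process 1234 to be reclaimed.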

As currently implemented, all of the process's pages that match the specified criterion are reclaimed. Your editor wondered whether there might be benefit in allowing some control over the range of pages that are reclaimed from the target process, by allowing an address range to be written to the /proc/PID/reclaim file.

By contrast with volatile ranges and the low-memory killer, modifications in pages reclaimed via /proc/PID/reclaim are not lost. Modified pages are written to the underlying file in the case of shared (MAP_SHARED) file mappings or to swap in other cases. Thus, if the process touches the reclaimed page later, it will be faulted into memory with the contents at the time it was reclaimed. The patches also include some logic to handle the case where multiple processes are sharing the same pages; in that case, the pages are reclaimed only after all of the processes have marked them for reclaim. Like the low-memory killer, /proc/PID/reclaim can be used to reclaim all of the pages in a process, but without needing to kill the process to do so.

The idea behind Minchan's proposal is that a user-space task manager could take over some part of the job of memory management. In some cases, this may be more effective than allowing the kernel to make memory-management decisions, since the user-space task manager can bring application-specific intelligence to decisions about whether to reclaim a process's pages. For example, some application processes may be in the foreground while others are in the background. It may be desirable to preferentially reclaim pages from one of the background processes, even if it has some frequently accessed pages. Of course, the task manager would somehow need to know when the system is under memory pressure. To that end, a mechanism like Anton Vorontsov's proposed vmpressure_fd() API might be useful.
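The kind of policy such a task manager might apply can be sketched in a few lines of Python. This is not taken from the patches. The Task record, the pick_reclaim_targets() function, and the choice of ranking background processes by resident set size are all assumptions made for illustration. How the manager learns about memory pressure is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical record a task manager might keep per process."""
    pid: int
    foreground: bool
    rss_kb: int  # resident set size; an assumed ranking metric


def pick_reclaim_targets(tasks, max_targets=1):
    """Return PIDs of background tasks, largest resident set first.

    Foreground tasks are never selected, reflecting the idea that the
    task manager knows which processes the user is interacting with.
    """
    background = [t for t in tasks if not t.foreground]
    background.sort(key=lambda t: t.rss_kb, reverse=True)
    return [t.pid for t in background[:max_targets]]
```

Each PID returned could then be the target of a write to the corresponding /proc/PID/reclaim file. The point of the sketch is only that the foreground/background distinction is information that user space has and the kernel does not.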

Minchan's patches apply against Michal Hocko's MMOTM (memory management of the moment) tree. The patches came out on March 25, but have so far seen little review. Nevertheless, they present an idea that will probably be of particular interest for the developers of mobile and embedded devices and thus it seems likely that they will get some attention at some point in the future.


Avoiding game-score loss with per-process reclaim

Posted Apr 4, 2013 19:36 UTC (Thu) by Baylink (subscriber, #755) [Link]

I have low-mem killer problems on my HTC Supersonic, and you know what they have taught me?

Lots of Android app designers *have not learned yet* that you *checkpoint work-in-progress every time you lose focus*.

You won't generally get killed if you have focus, but as soon as you don't, watch out. Tweetcaster loses in-progress postings, Facebook loses its reading position, even newer builds of CoolReader forget to checkpoint the cursor position on blur.

It's really annoying.

Opera Mobile has an even more annoying bug in its latest release: it checkpoints the tab list, and the current position in the tab, but has regressed to not keeping the *history list*.

<sigh>

Avoiding game-score loss with per-process reclaim

Posted Apr 5, 2013 16:45 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

This is half of the classic long term scheduling of batch systems - swapping out a process. But what about the other half: shouldn't reclaiming all of the process' pages be accompanied by making it wait a while before running again? Otherwise, this could lead to a great deal of page thrashing.

I assume we're talking about a process that runs and touches all its pages regularly, or its pages would have been reclaimed long ago by normal page replacement.

Avoiding game-score loss with per-process reclaim

Posted Apr 13, 2013 3:49 UTC (Sat) by wtanksleyjr (subscriber, #74601) [Link]

Not swapping out a process -- killing a process completely. Swapping happens only when you have available memory.

Avoiding game-score loss with per-process reclaim

Posted Apr 13, 2013 19:09 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

Not swapping out a process -- killing a process completely.

I'm not sure what you meant to contradict here, but neither traditional long-term scheduling nor the feature discussed in the article kill a process completely. They remove all of its data from real memory to make more real memory available for other processes.

Swapping happens only when you have available memory.

I don't know what you mean by this. "Swapping" means two things - in Linux, it is moving pages from real memory to e.g. disk and back again. In traditional batch systems, this is called "paging" and "swapping" means moving entire processes from real memory to e.g. disk and back again. In either case, it's done when, by some definition, there is not "enough" real memory.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds