LRU-list manipulation with DAMON
The kernel's memory-management developers would like nothing better than the ability to know which pages of memory will be needed in the near future; the kernel could then make sure that those pages were resident in RAM. Unfortunately, current hardware is unable to provide that information, so the memory-management code must make guesses instead. Usually, the best guess is that pages that have been used in the recent past are likely to be used again soon, while those that have gone untouched for some time are probably not needed.
LRU lists
This approach works well, but there is still a problem: there are limits to how closely the kernel can track the usage of each page. Making a note of every access would slow the system to a crawl, so something else must be done. The LRU ("least recently used") lists are one way in which the memory-management subsystem tries to track page usage efficiently. Pages that appear to be in use are kept on an "active" list, while those that appear to be idle are kept on an "inactive" list.
Occasionally, the kernel will pull a set of pages off the tail of the active list and place them, instead, at the head of the inactive list. When this happens, the pages are "deactivated", meaning that they are marked in the page tables as "not present". Should some process try to access such a page, a soft page fault will result; the kernel will then observe that the page is still in use and move it back to the active list. Pages that remain on the inactive list, instead, will find their way to the tail, where they will be reclaimed when the kernel needs memory for other uses.
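The movement of pages between the two lists can be illustrated with a toy model. This is only a sketch, not kernel code: the class and method names are invented for the example, pages are plain strings, and the way the kernel notices an access is reduced to a single touch() call.

```python
from collections import deque


class TwoListLRU:
    """Toy model of an active/inactive list pair (not kernel code)."""

    def __init__(self):
        # Index 0 is the head of each list; the last element is the tail.
        self.active = deque()
        self.inactive = deque()

    def add(self, page):
        """A newly used page starts at the head of the active list."""
        self.active.appendleft(page)

    def deactivate(self, count):
        """Move up to `count` pages from the active tail to the inactive head."""
        for _ in range(min(count, len(self.active))):
            self.inactive.appendleft(self.active.pop())

    def touch(self, page):
        """An access to an inactive page is noticed; the page is reactivated."""
        if page in self.inactive:
            self.inactive.remove(page)
            self.active.appendleft(page)

    def reclaim(self, count):
        """Reclaim up to `count` pages from the tail of the inactive list."""
        n = min(count, len(self.inactive))
        return [self.inactive.pop() for _ in range(n)]


lru = TwoListLRU()
for page in ["A", "B", "C", "D"]:
    lru.add(page)           # active list, head to tail: D C B A
lru.deactivate(2)           # A, then B, move over; inactive is now B A
lru.touch("A")              # A is still in use, so it returns to the active head
reclaimed = lru.reclaim(1)  # B is the only page left on the inactive list
```

In this model, a page survives only if an access to it is noticed between deactivation and reclaim, which is why the accuracy of that noticing matters so much.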
The LRU lists are, thus, a key part of the mechanism that decides which pages stay in RAM and which are reclaimed. Despite their name, though, these lists are at best a rough approximation of which pages have been least (or most) recently used. The description might be better given as "least recently noticed to be used" instead. If there were a better mechanism for understanding which pages are truly in heavy use, it should be possible to use that information to improve on the current LRU lists.
Reordering the lists
DAMON ("Data Access MONitor") is meant to be that mechanism. Through some clever algorithms (described in an earlier article), DAMON tries to create a clearer picture of actual memory usage while, at the same time, limiting its own CPU usage. DAMON is designed to be efficient enough to use on production systems while being accurate enough to improve memory-management decisions.
The 5.16 kernel saw the addition of DAMOS ("DAMON operation schemes"), which adds a rule-based mechanism that can cause actions to be taken whenever specific criteria are met. For example, DAMOS could be configured to pass a region that has not been accessed in the last N seconds to the equivalent of an madvise(MADV_COLD) call. Various other options are available; they are all described in detail in Documentation/admin-guide/mm/damon/usage.rst.
The work merged for 6.0 adds two new operations to DAMOS: lru_prio and lru_deprio. The first will cause the indicated pages to be moved to the head of the active list, making them the last pages that the kernel will try to deactivate or reclaim; the second, instead, will deactivate the given pages, causing them to be moved to the inactive lists. With this change, in other words, DAMOS is reaching deep into the memory-management subsystem, using its (hopefully) superior information to make the ordering of the LRU lists closer to actual usage. This sorting could be especially useful if the system comes under sudden memory pressure and has to start reclaiming memory quickly.
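A rule-based scheme of this kind can be sketched in a few lines. What follows is a simplified illustration rather than the DAMOS implementation: the Region and Scheme classes, their fields, and the thresholds are all hypothetical, regions stand in for groups of pages, and the position at which a deprioritized region lands on the inactive list is an assumption of the sketch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Region:
    """A monitored region, as a toy stand-in for what a monitor reports."""
    name: str
    nr_accesses: int  # accesses observed in the last monitoring interval
    age: int          # intervals for which that access pattern has held


@dataclass
class Scheme:
    """Hypothetical rule: apply an action to regions matching a pattern."""
    action: str       # "lru_prio" or "lru_deprio"
    min_accesses: int
    max_accesses: int
    min_age: int

    def matches(self, region):
        return (self.min_accesses <= region.nr_accesses <= self.max_accesses
                and region.age >= self.min_age)


def apply_schemes(schemes, regions, active, inactive):
    """Reorder the toy lists in place; index 0 is the head of each deque."""
    for region in regions:
        for scheme in schemes:
            if not scheme.matches(region):
                continue
            # Take the region off whichever list currently holds it.
            for lst in (active, inactive):
                if region.name in lst:
                    lst.remove(region.name)
            if scheme.action == "lru_prio":
                active.appendleft(region.name)    # last to be deactivated
            elif scheme.action == "lru_deprio":
                inactive.appendleft(region.name)  # head placement is assumed
            break  # at most one action per region in this sketch


active = deque(["r1", "r2", "r3"])
inactive = deque(["r4"])
regions = [Region("r1", 0, 30), Region("r2", 5, 2),
           Region("r3", 12, 8), Region("r4", 20, 10)]
schemes = [
    Scheme("lru_prio", min_accesses=10, max_accesses=1000, min_age=5),
    Scheme("lru_deprio", min_accesses=0, max_accesses=0, min_age=20),
]
apply_schemes(schemes, regions, active, inactive)
```

Here the long-idle r1 is deactivated, the busy r3 and r4 are moved to the head of the active list, and r2, which matches neither rule, stays where it was.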
Author SeongJae Park calls this mechanism "proactive LRU-list sorting" or PLRUS. When properly tuned, he claimed in the patch series cover letter, this mechanism can yield some nice results: "In short, PLRUS achieves 10% memory PSI (some) reduction, 14% major page faults reduction, and 3.74% speedup under memory pressure". The term "PSI (some)" here refers to the pressure-stall information produced by the kernel, which is a measure of how much processes are being delayed waiting for memory.
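On kernels built with PSI support, the memory figures can be read from /proc/pressure/memory; the "some" line covers time during which at least one task was stalled waiting for memory. The parser below is a minimal sketch: the function names are invented for the example, and the sample text uses made-up numbers purely to show the file's layout.

```python
def parse_psi(text):
    """Parse PSI text into {"some": {...}, "full": {...}} with float values."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        result[kind] = {key: float(value)
                        for key, value in (p.split("=", 1) for p in pairs)}
    return result


def read_memory_psi(path="/proc/pressure/memory"):
    """Read the current memory pressure; requires a kernel with PSI enabled."""
    with open(path) as f:
        return parse_psi(f.read())


# Illustrative values only, not measurements from any real system.
sample = ("some avg10=1.50 avg60=0.75 avg300=0.20 total=123456\n"
          "full avg10=0.50 avg60=0.25 avg300=0.10 total=65432\n")
psi = parse_psi(sample)
some_avg10 = psi["some"]["avg10"]
```

Comparing the "some" averages before and after a tuning change, under the same workload, is one way to check whether a claimed reduction shows up on a given system.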
The "when properly tuned" caveat is important, though; DAMOS has a complex set of parameters to describe action thresholds and to limit how much CPU time is used by DAMOS itself. Adjusting those parameters can result in significant changes to how the core memory-management subsystem goes about its work. DAMOS offers a lot of flexibility to a full-time administrator who understands how memory management works and who is able to accurately measure the effects of changes. It also makes it easy to completely wreck a system's performance.
To aid administrators who do not have the time or skills to come up with an optimal DAMOS tuning for their workload, Park also added a new kernel module called damon_lru_sort. It uses DAMOS to perform proactive LRU-list sorting under a set of "conservative" parameters that are meant to safely improve performance while minimizing overhead. This module will make using the LRU-list sorting feature easier, but it still has a significant set of tuning knobs; the documentation describes them all.
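Kernel module parameters are conventionally exposed as files under /sys/module/<module>/parameters, so the knobs that a given kernel offers can be listed without knowing their names in advance. The helper below is a sketch under that assumption: read_module_params() is a name invented for this example, and it returns an empty result if the module is not present.

```python
from pathlib import Path


def read_module_params(module="damon_lru_sort", sysfs_root="/sys/module"):
    """Return {parameter name: current value} for a kernel module."""
    param_dir = Path(sysfs_root) / module / "parameters"
    params = {}
    if not param_dir.is_dir():
        return params  # module absent, or it exposes no parameters
    for entry in sorted(param_dir.iterdir()):
        try:
            params[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters may not be readable by the current user.
            params[entry.name] = None
    return params


if __name__ == "__main__":
    for name, value in read_module_params().items():
        print(f"{name} = {value}")
```

Recording this output before changing anything makes it easier to return to the conservative defaults if an experiment hurts performance.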
This mechanism is aimed at a similar problem to that addressed by the multi-generational LRU work, which currently seems on track to be merged in 6.1. The multi-generational LRU, too, tries to create a more accurate picture of which pages are in active use so that better page-replacement decisions can be made. There are a number of open questions about how the movement of pages between the generations should be handled; there is talk of allowing the loading of BPF programs to control those decisions, but DAMOS might be able to help as well. The integration between the two mechanisms does not currently exist, but could be a good thing to add.
The advent of this type of ability to tweak memory management is,
obviously, a sign that better performance is always desirable. It is also,
perhaps, an indication that creating a memory-management subsystem that
performs optimally for all workloads is beyond our current capabilities.
Kernel developers tend to prefer not to add new configuration knobs on the
theory that the kernel should be able to configure itself. Here, though,
new knobs are being added in large numbers. Some problems are, it seems,
still too hard for the kernel to solve without help.
Posted Aug 23, 2022 13:15 UTC (Tue)
by linusw (subscriber, #40300)
https://lore.kernel.org/linux-arm-kernel/20220607120530.2...
Posted Aug 25, 2022 21:26 UTC (Thu)
by zev (subscriber, #88455)
Posted Aug 27, 2022 6:20 UTC (Sat)
by jezuch (subscriber, #52988)
*ducks*
Posted Jan 2, 2023 2:40 UTC (Mon)
by karim96 (subscriber, #153187)
This is wrong. Pages moved to inactive LRU lists are never marked "not present" in process page tables and page faults never happen on such pages.
Pages in the inactive lists get promoted back to the active lists if the page scan done by the reclaim subsystem finds out that the "accessed bit" for those pages was set since the last scan.
Author SeongJae Park calls this mechanism "proactive LRU-list sorting" or PLRUS.
_______________________________________
/ I, for one, wish it were "work-ahead" \
\ instead of "proactive". /
---------------------------------------
\
\
__ ___
.'. -- . '.
/U) __ (O|
/.' ()() '.\._
.',/;,_.--._.;;) . '--..__
/ ,///|.__.|.\\\ \ '. '.''---..___
/'._ '' || || '' _'\ : \ ' . '.
/ || || '., ) ) : \
:'-.__ _ || || _ __.' _\_ .' ' ' ,)
( ' |' ( __= ___..-._ ( (.\\
('\ .___ ___. /'.___= \.\.\
\\\-..____________..-''