Flushing the page cache from user space
/proc/sys/vm/toss_page_cache_nodes
If a suitably privileged process writes one or more NUMA node numbers to that file, all pages belonging to that node which are found in the page cache will be flushed out. Essentially, this operation causes a node to forget about all locally-cached pages from files in the filesystem.
Clearing the page cache in this way would normally be bad for performance. The page cache exists to allow the filesystem to satisfy common filesystem requests without going to the disk; clearing the cache defeats that functionality and would normally be undesirable. There are exceptions to everything, however. This patch is aimed at large-scale high-performance computing tasks running in a cluster environment. Such jobs typically do best if they can start with a clean system; they have no real use for whatever may have been cached for the previous user. More to the point, a full page cache can cause memory allocations to be satisfied with non-local (slower) memory, resulting in significantly worse performance. By clearing the cache before starting a new job, a system administrator can ensure that local memory is available for that job.
Not everybody likes the patch. Ingo Molnar thinks that this capability will create confusion and make the debugging of memory problems even harder.
Andrew Morton, instead, sees the value of the patch for some users, but doesn't like the implementation. He would like to see this capability made useful for other classes of users, such as kernel developers who want to put the system into a known state before running tests. He also doesn't like the /proc interface, and argues for a new system call instead. His suggestion was:
sys_free_node_memory(long node_id, long pages_to_make_free,
long what_to_free);
This form of the call would allow the clearing of something less than the entire page cache, making the tool a bit less crude. The what_to_free argument would be a bitmask specifying which types of memory to free; beyond the page cache, this call could cause the kernel to reclaim anonymous memory or slab caches.
The system call approach would seem to make sense; there is one remaining glitch, however: SUSE already shipped the /proc interface in SLES9. That revelation drew a complaint from Andrew:
An explicit purpose behind the 2.6 development model is to get patches into
the mainline quickly so that their form can be stabilized before
distributors ship them. As the developers become used to this mode of
operation, this sort of issue should become relatively rare.
| Index entries for this article | |
|---|---|
| Kernel | Development model |
| Kernel | Memory management |
