> Doing this all in memory seems too complex. Probably I'm missing
> something, but why not do it with files?
That's the thing - it is doing this with files. People are focussing on memory/page cache behaviour because the initial target was tmpfs files.
> The concept could even be extended to disks, with files that are
> automatically removed by the kernel if the free space on the file
> system goes below certain threshold.
It already does that, by using the hole punching interface.
Indeed, this is exactly the reason I originally suggested it needs to be fallocate() based. There are good reasons for allowing the filesystem to track volatile ranges to allow discards to be done when the filesystem reaches ENOSPC thresholds. Once again, the focus on tmpfs has tended to make people think purely of "this is only for page cache pages" and that ignores the wider usefulness it has for disk based caching applications.
Posted Nov 7, 2012 18:07 UTC (Wed) by dgm (subscriber, #49227)
> That's the thing - it is doing this with files. People are focussing on memory/page cache behaviour because the initial target was tmpfs files.
Well, maybe I didn't express myself right. What I meant was _whole_ files. As neilbrown pointed out, they are a more "natural" unit of caching at the application level.
Many more words on volatile ranges
Posted Nov 15, 2012 23:44 UTC (Thu) by mm7323 (guest, #87386)
I think there is much value here.
It seems to me that a forgetful file system is the perfect model for a volatile cache. You can use the last access time and file size as metrics for simple LRU removal, together with the classic open(), read(), write(), mmap(), munmap(), close(), unlink() interfaces for access.
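As a rough illustration of the LRU idea above, here is a user-space sketch of the eviction policy such a filesystem might apply internally. It is only an assumption about how the policy could look, not an existing kernel interface: the function name evict_lru and the max_bytes budget are hypothetical, and a real implementation would also skip files that are currently open.

```python
import os

def evict_lru(cache_dir, max_bytes):
    """Delete least-recently-accessed files until the total size fits max_bytes.

    Returns the names of the removed files, oldest access time first.
    (Hypothetical helper: a user-space stand-in for the kernel's reclaim.)
    """
    entries = []
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        if not os.path.isfile(path):
            continue
        st = os.stat(path)
        entries.append((st.st_atime, st.st_size, name, path))

    total = sum(size for _, size, _, _ in entries)
    removed = []
    # Sorting the tuples puts the oldest access time first.
    for _atime, size, name, path in sorted(entries):
        if total <= max_bytes:
            break
        os.unlink(path)
        total -= size
        removed.append(name)
    return removed
```

File size is only used here to track the running total; a policy that weights large files more heavily would be an equally plausible reading.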
The open() syscall can trivially increment a reference count on the file and prevent its contents from being reclaimed while it is open. Within the kernel, open() can decide whether to succeed and return a file descriptor, or to fail with a suitable error code such as ENOENT if the file has already been reclaimed (e.g. due to memory pressure).
Once a valid descriptor has been obtained, read() and write() can trivially access the file contents, or mmap() could be used to further increase the reference count and create a memory mapping. Once the reference count is again zero, such volatile files within the filesystem would again be eligible to be reclaimed and removed at any time.
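From the application's side, the open()/ENOENT protocol described above might look something like the following sketch. The helper name read_cached and the regenerate callback are hypothetical, and the code runs against an ordinary directory; the pinning behaviour mentioned in the comments is the proposal's assumption, not something current filesystems provide.

```python
import os

def read_cached(path, regenerate):
    """Return the cached bytes at path, rebuilding them if the file is gone.

    regenerate is a hypothetical callback that recomputes the data.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except FileNotFoundError:
        # ENOENT: treat the file as reclaimed, rebuild it and store it again.
        data = regenerate()
        with open(path, "wb") as f:
            f.write(data)
        return data
    try:
        # Assumption: while fd is open, the proposed filesystem would keep
        # the contents pinned, so these reads cannot lose data midway.
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

No signal handling or mmap() is involved: a reclaimed entry shows up as nothing more than a failed open().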
I think the major benefits of this would be that the user-space interface is traditional and easy to understand and there is no need to handle signals or actually use mmap() or munmap() to benefit from such a system. As you suggest, there is also the notion of a useful unit - a file. Discarding individual pages is a nonsense to userspace, whereas the idea that a file may at some point atomically be deleted is much easier to grasp and use without bugs. Files also allow applications to decide what is a useful unit to keep/lose, simply by deciding what to store within some file.
This obviously maps well to a browser's cache and similar applications, though it could be argued that open() and mmap() are too heavy to use when backing a malloc()-type allocator unless used at a very coarse level, in which case the benefits of reclaim could be reduced (it becomes all or nothing). That said, dropping the volatile files of a process may be a kind of first stage that keeps a system running before the oom_killer is invoked.