I can only think of real use-cases for volatile anonymous pages, and thus having an API that only works on files, instead, seems rather odd.
Many more words on volatile ranges
Posted Nov 6, 2012 20:42 UTC (Tue) by dgc (subscriber, #6611)
The typical use case is that cache expiry simply marks regions of the cache as volatile, then reclaim is controlled wholly by the kernel memory pressure. Reuse of an expired cache entry does verification and if it is intact it gets marked "unvolatile" again, and the cycle goes around.
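As a rough illustration of that cycle, here is a user-space model in Python. It is only a sketch: the `VolatileCache` class and its method names are invented for this example and do not correspond to any kernel interface, and `memory_pressure()` simply purges every volatile entry, whereas a real kernel would reclaim only as much as it needs.

```python
class VolatileCache:
    """User-space model of the volatile/unvolatile cycle; not a kernel API."""

    def __init__(self):
        self._data = {}          # key -> cached bytes
        self._volatile = set()   # keys the "kernel" is allowed to discard

    def put(self, key, value):
        """Store a fresh entry; fresh entries are never volatile."""
        self._data[key] = value
        self._volatile.discard(key)

    def expire(self, key):
        """Cache expiry: only mark the entry volatile, do not free it."""
        if key in self._data:
            self._volatile.add(key)

    def memory_pressure(self):
        """Stand-in for kernel reclaim (assumption: purge everything volatile)."""
        for key in self._volatile:
            del self._data[key]
        self._volatile.clear()

    def reuse(self, key):
        """Verify an entry; if intact, mark it non-volatile again and return it."""
        if key not in self._data:
            return None          # purged or never cached: caller must regenerate
        self._volatile.discard(key)
        return self._data[key]
```

An entry that is expired and then reused before any reclaim comes back intact; one that is expired and hit by `memory_pressure()` has to be regenerated by the caller.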
When you are doing file backed caching, then "reclaim" means freeing the backing store of the range in the file. i.e. punching a hole in the file.
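For readers who have not used hole punching, the sketch below shows what it looks like from user space today, calling the existing `fallocate()` system call through `ctypes`. It assumes Linux with a 64-bit `off_t`; the helper names `punch_hole` and `demo` are made up for the example, and the call can fail on filesystems that do not support `FALLOC_FL_PUNCH_HOLE`, so the helper reports success rather than assuming it. This is ordinary, explicit hole punching, not the volatile-range proposal itself.

```python
import ctypes
import os
import tempfile

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

def punch_hole(fd, offset, length):
    """Deallocate [offset, offset + length) in the file; True on success.

    Assumes Linux with a 64-bit off_t. Returns False if libc has no
    fallocate() symbol or the filesystem refuses the request.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    fallocate = getattr(libc, "fallocate", None)
    if fallocate is None:
        return False
    fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                          ctypes.c_int64, ctypes.c_int64]
    fallocate.restype = ctypes.c_int
    # PUNCH_HOLE must be combined with KEEP_SIZE: the file length stays put.
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    return fallocate(fd, mode, offset, length) == 0

def demo():
    """Write 64 KiB, punch out 8 KiB in the middle, and read it back."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * 65536)
        f.flush()
        punched = punch_hole(f.fileno(), 4096, 8192)
        size = os.fstat(f.fileno()).st_size
        f.seek(4096)
        middle = f.read(8192)
        return punched, size, middle
```

When the punch succeeds, the file keeps its length and the punched range reads back as zeros, which is what a cache would see after the backing store of a range had been freed.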
Posted Nov 6, 2012 22:28 UTC (Tue) by foom (subscriber, #14868)
Besides, filesystems would all need to be updated to support volatile files. And the existence of files whose data might disappear at any point without anyone touching them seems a radical new feature for a filesystem, which seems to me like it could cause all sorts of problems.
But volatile anonymous memory pages? Yes please! That's certainly of use...which is probably why everyone is confused by this!
Posted Nov 7, 2012 0:49 UTC (Wed) by dgc (subscriber, #6611)
That's no reason to say there aren't any. I gave a few when I first suggested that fallocate() be used instead. Marking parts of files as volatile can help optimise large scale cache management (e.g. for HSMs, SSD file caches, squid, etc) - it's not a phone/desktop web browser cache that I'm thinking of here.
> Besides, filesystems would all need to be updated to support volatile
Just like the page cache needs to be updated to support it, eh? :)
Besides, this is just confusing implementation with API. The API needs to support it from the start as we can't change that over time. Filesystem implementation can be done later, as well as change over time, so we don't need to have that up front....
> And the existence of files whose data which might disappear at any point
> without anyone touching them seems a radical new feature for a
> filesystem, which seems to me like it could cause all sorts of problems.
When you use your filesystem as an access cache for some other data, this is exactly the expected behaviour. Only right now, the cache application causes files to disappear at random points in time. Volatile ranges on files just moves a common cache management mechanism into the filesystem so it can be done when the filesystem needs it to be done....
Posted Nov 7, 2012 1:05 UTC (Wed) by neilbrown (subscriber, #359)
Having files disappear spontaneously makes sense to me. A 'file' is a natural unit of caching. There is a clear distinction between the 'file' and the 'name', so that you can unlink a cache file even while it is in use, and the process using it will not lose out.
Having arbitrary blocks in the middle of a file disappear spontaneously is not something that I am so comfortable with. There is no 'natural unit' (so John had to invent 'ranges' and worry about semantics for merging etc) and there is no 'object/name' distinction so you have to think carefully about races between access and discard.
I would really like it if the whole 'volatile data' thing could be done with files. Files get marked as 'volatile' and the filesystem can unlink them as desired. One problem is that open/mmap/close is a whole lot slower than any single system call, and definitely slower than a simple memory access that might (but usually doesn't) cause SIGBUS.
Maybe an madvise style interface that works for ranges in anonymous memory, and some sort of per-file interface for filesystems when a shared cache is required.
I'm not sure that one size can fit all.
Posted Nov 7, 2012 8:04 UTC (Wed) by dgc (subscriber, #6611)
Fundamentally, HSMs make blocks disappear from files spontaneously. And those blocks come back when you try to read them. IOWs, the filesystem is basically a namespace with a great big data cache in front of some kind of slower storage.
Volatile ranges turn HSM space management on its head - instead of moving data to tape when you run out of space, we can do it pre-emptively and mark the duplicated data ranges as volatile. When the filesystem runs out of space, it can just punch out the volatile ranges and everything continues quickly rather than blocking waiting for the HSM to move data out to tape.
Then when you add range based hot data tracking as the method of selecting what parts of the files are copied to tape and marked volatile, you've got quite a neat way of automatically managing the filesystem space that doesn't impact performance when space runs low or the HSM moves frequently accessed data to tape mistakenly...
Big picture - we've got lots of infrastructure on the way for doing interesting things with our storage stack - the only thing missing is the application that ties them all together....
> There is no 'natural unit' (so John had to invent 'ranges' and worry
> about semantics for merging etc)
There doesn't need to be a natural unit. In reality, it is a filesystem block, but having a tracking structure is necessary regardless of unit. Using the mapping tree proved impractical for various reasons, and the simplest solution was to use its own tree. Volatile ranges on files are not bad because we have no generic range tree library in the kernel that could be used for tracking them....
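To make the "tracking structure" point concrete, here is a minimal Python sketch of range bookkeeping with merging and splitting. It is an illustration only: the `RangeSet` name is invented, it uses a sorted list instead of a tree, and the policy of merging ranges that overlap or touch is an assumption, not necessarily the semantics of the actual patches.

```python
class RangeSet:
    """Half-open [start, end) ranges, kept sorted and non-overlapping."""

    def __init__(self):
        self.ranges = []

    def mark(self, start, end):
        """Add [start, end), merging with any range it overlaps or touches."""
        kept = []
        for s, e in self.ranges:
            if e < start or s > end:
                kept.append((s, e))          # disjoint: keep unchanged
            else:
                start, end = min(start, s), max(end, e)  # absorb into new range
        kept.append((start, end))
        kept.sort()
        self.ranges = kept

    def unmark(self, start, end):
        """Remove [start, end), splitting any range that straddles it."""
        kept = []
        for s, e in self.ranges:
            if e <= start or s >= end:
                kept.append((s, e))          # untouched by the removal
                continue
            if s < start:
                kept.append((s, start))      # piece left of the removed span
            if e > end:
                kept.append((end, e))        # piece right of the removed span
        self.ranges = kept
```

Marking (0, 4) and (8, 12) and then (4, 8) collapses everything into the single range (0, 12); unmarking (2, 6) afterwards leaves (0, 2) and (6, 12). Whatever the unit, something has to do this bookkeeping.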
> and there is no 'object/name'
> distinction so you have to think carefully about races between access
> and discard.
Same for any method of tracking volatile ranges... :)
> I'm not sure that one size can fit all.
Probably not - the anonymous memory usage is recent, but I think it's separate to filebacked volatile regions which is what John's original proposal was for. Lumping them together as equivalent functionality is not really correct....
Posted Nov 7, 2012 1:31 UTC (Wed) by foom (subscriber, #14868)
Yes, and having filesystems able to disappear (parts of) files all by themselves with no application involvement seems like a *major* change, and seems rather scary to me.
I mean, in the intended usage, the application itself expects its data to disappear, sure. But, I'm wondering about other knock-on effects of these sorts of files being able to exist. Will I, as admin, be able to easily tell that some files are "disappear-y"? New feature added to "ls"? How can I tell how much space is used by such data? New fields in "df"?
What sorts of controls over who can mark data like that will there be? Can it cause a security issue for data to disappear in the middle of a file unexpectedly? Maybe clearing volatile-ness on file ownership or permissions change fixes that?
I dunno...it just seems like so much complication versus in-memory volatility that it doesn't seem worth it. And, worse, pinning it to fallocate instead of something like madvise makes the API so much more fiddly to use for a simple in-memory case.
Posted Nov 7, 2012 6:29 UTC (Wed) by dlang (✭ supporter ✭, #313)
This leads to a couple obvious answers
> What sorts of controls over who can mark data like that will there be? Can it cause a security issue for data to disappear in the middle of a file unexpectedly? Maybe clearing volatile-ness on file ownership or permissions change fixes that?
Tie this to the ability to modify/truncate the file and you are not adding any new possibilities, just new ways to trigger the possibilities (someone who can modify the file can truncate it, write a new file missing some data, etc)
> How can I tell how much space is used by such data?
the same way you would find out how much space is used by other sparse files today.
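For reference, a small Python sketch of how that is done for sparse files today: compare the apparent size with the allocated size reported by `stat`. The helper names `sizes` and `make_sparse` are made up for the example, and the 512-byte unit for `st_blocks` is the Linux convention.

```python
import os
import tempfile

def sizes(path):
    """Return (apparent size, bytes allocated on disk) for a file."""
    st = os.stat(path)
    # On Linux st_blocks counts 512-byte units, whatever the
    # filesystem's own block size is.
    return st.st_size, st.st_blocks * 512

def make_sparse(path, size):
    """Create a file of the given apparent size with only the last byte written."""
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\0")
```

On a filesystem that supports sparse files, the allocated figure for such a file is typically far smaller than the apparent one. These are the same two numbers that `ls -l` (apparent) and `du` (allocated) report.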
Posted Nov 8, 2012 3:04 UTC (Thu) by foom (subscriber, #14868)
> Tie this to the ability to modify/truncate the file and you are not adding any new possibilities, just new ways to trigger the possibilities (someone who can modify the file can truncate it, write a new file missing some data, etc)
But without taking extra preventative measures, the ability to ever once *have* *had* permission to modify a file might then result in the ability to modify the file (by zeroing out some blocks) any arbitrary time in the future.
> the same way you would find out how much space is used by other sparse files today.
But these new files aren't sparse immediately; volatile data does use up actual space, until it gets dropped on the floor. That's a brand new type of thing.
Posted Nov 8, 2012 7:47 UTC (Thu) by dlang (✭ supporter ✭, #313)
remember that permission checks are only made when the file is opened, so even without this you could hold the filehandle open and erase blocks at any time.
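A short Python sketch of that point, using only standard calls: the file is opened while writable, write permission is then removed, and the already-open descriptor can still truncate and write. The function name `write_after_chmod` is made up for the example.

```python
import os
import stat
import tempfile

def write_after_chmod():
    """Open a file for writing, drop write permission, then modify it anyway."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original contents")
        os.chmod(path, stat.S_IRUSR)     # owner read-only from now on
        # The descriptor was opened while we still had write permission,
        # so both modifications below succeed.
        os.ftruncate(fd, 8)              # leaves b"original"
        os.lseek(fd, 0, os.SEEK_END)
        os.write(fd, b"!")
        mode = stat.S_IMODE(os.stat(path).st_mode)
        with open(path, "rb") as f:
            return mode, f.read()
    finally:
        os.close(fd)
        os.unlink(path)
```

The file ends up read-only on disk yet modified after the permission change, because the check happened once, at open time.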
Yes, this can now happen even after the program has exited (more below)
> But these new files aren't sparse immediately; volatile data does use up actual space, until it gets dropped on the floor. That's a brand new type of thing.
you can do one of two things by default (and either one is defensible)
1. show the size as it currently occupies disk space
2. show the size as if the holes had been reclaimed by the filesystem
I would probably do #1, because with sparse files, you already have a situation where the file size on disk can change significantly, so this is not that different.
to cover the rest of the first problem, and the corner cases of the second problem, there will need to be a utility to report on what's been tagged as being volatile, but that tool will probably just be needed for odd, corner cases. The existing options for dealing with sparse files should cover the 'normal' needs
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds