Interesting work - and some ideas for the future

Posted Jul 6, 2006 9:06 UTC (Thu) by ayeomans (subscriber, #1848)
Parent article: The 2006 Linux Filesystems Workshop (Part III)

Thanks for the great report on the workshop. It's great to see that innovation is still continuing.
Something struck me, though, in that the thrust is towards performance and reliability improvements at the data level. I'm also interested in work at higher levels, to add more features for the user.
Let me give some examples:

  • Versioning file system - ages ago, systems such as RSX-11M and VMS had versioning filesystems that kept incremental copies of data. Now that disk space is much cheaper, it makes even more sense to do this again. Even if systems were damaged by that mythical Linux virus which tried to overwrite files, a versioning filesystem would provide protection.
  • Note this can be done on a file block basis - no need to copy all the data in a file, just make a copy of the allocation map. This is pretty close to what some of the current journalled filesystems do, and it should be possible to combine the two by giving the user access to earlier file versions.
  • Synchronisation-friendly filesystem. Mobile devices increase the demand for synchronised portions of filesystems. By maintaining a "sync point map", i.e. a list of files/blocks modified after a nominated sync point, it becomes a fast process to identify the portions that need copying. Again, this is pretty close to current JFS facilities.
  • De-duplicating filesystem. More useful at enterprise level or for multi-computer backups. Identify duplicates of files - probably by a crypto checksum calculated during or just after writing. Then only store one logical copy of any file. Makes backups faster and reduces disk space requirements. (Think of the number of operating system files held in common across computer networks. Or the mass-mailed .ppt^h^h^h.odp presentation files.)
  • Enhanced metadata that gets preserved with files, including some kind of data origin (e.g. internet download from a URL) and a classification, so that backup and security decisions can be made automatically, including auto-encryption of confidential files when transferring them out of the system.
  • Auto-zip/unzip views - allowing a collection of files to appear like a single zip file, or vice-versa. This is currently done at presentation layers, but letting the filesystem handle it would make the facilities available to all apps. Even the enhanced metadata mentioned above could be handled as if it were a component of a zip file. Probably a cleaner way to present resource forks and alternate data streams.
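
As a rough illustration of the "sync point map" idea in the list above, here is a minimal in-memory sketch. The class name, method names, and the 4096-byte block size are assumptions made for the example; a real filesystem would keep this map on disk alongside its journal.

```python
class SyncPointMap:
    """Tracks which blocks of which files changed since the last sync point."""

    def __init__(self, block_size=4096):
        self.block_size = block_size  # assumed block size, for illustration only
        self.dirty = {}               # path -> set of modified block numbers

    def record_write(self, path, offset, length):
        """Note that `length` bytes starting at `offset` were written to `path`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        self.dirty.setdefault(path, set()).update(range(first, last + 1))

    def pending(self):
        """Return everything that must be copied to bring a replica up to date."""
        return {path: sorted(blocks) for path, blocks in self.dirty.items()}

    def mark_synced(self):
        """Nominate 'now' as the new sync point."""
        self.dirty.clear()
```

A sync tool would then only need to read `pending()` rather than scan the whole tree.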



Interesting work - and some ideas for the future

Posted Jul 6, 2006 9:55 UTC (Thu) by nix (subscriber, #2304)

I have a number of ideas on the versioning filesystem front which are congealing into a design. Eventually people will stop giving me new ideas faster than I can figure out how to use them :) The tree-of-blocks-CoW stuff is a critical part of it, of course.

The block allocation stuff isn't the hard part (we can start with something naive and get more complex later on): the optimizers are the hard part, because we *really* want to share as much data as possible, not least because my chosen semantics for versioning involve a new version on every close() of an open-for-writing file and on every update of a directory. This also means we need automatic expiry.
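
To make the "new version on every close()" semantics concrete, here is a toy sketch of copy-on-write versioning in which each version is just a list of block ids and unchanged blocks are shared between versions. It is not the design described above: the class and method names are invented for the example, the block map is a flat list rather than a tree, and there is no expiry.

```python
class VersionedFile:
    """Toy copy-on-write file: a version is a block map, and blocks are shared."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes (shared by all versions)
        self.versions = [[]]  # versions[n] is the block map of version n
        self._next_id = 0
        self._working = None  # block map being modified while open for writing

    def open_for_write(self):
        # Copy only the allocation map, never the data it points at.
        self._working = list(self.versions[-1])

    def write_block(self, index, data):
        # Never overwrite in place: allocate a fresh block and repoint the map.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        while len(self._working) <= index:
            self._working.append(None)  # hole; skipped on read in this sketch
        self._working[index] = block_id

    def close(self):
        # Closing an open-for-writing file commits a new version.
        self.versions.append(self._working)
        self._working = None
        return len(self.versions) - 1

    def read(self, version=-1):
        block_map = self.versions[version]
        return b"".join(self.blocks[b] for b in block_map if b is not None)
```

Rewriting one block of a two-block file stores three blocks in total, and both versions stay readable.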

(It's also versioning both by name and by inum --- essential if it's to be useful with most text editors --- which interacts interestingly with hard links and the permissions system.)

(Because it's very much a prototype and because the data structures are not simple I'm initially prototyping the thing inside PostgreSQL, with access via FUSE. Figuring out how to turn the whole thing into a more conventional filesystem can wait. What's that about typical impractical researcher's mentality? OK, OK, I admit it ;) )

Interesting work - and some ideas for the future

Posted Jul 13, 2006 19:36 UTC (Thu) by ringerc (subscriber, #3071)

Versioning FSs get problematic when you consider that they're most useful in areas like network file servers and home directories. Lots of programs like to drop large amounts of crap in these places, much of which should not be versioned. Consider program scratch files, thumbnail DBs, etc. Identifying these files and avoiding versioning them would be extremely useful.

Ageing out versions sounds like a good idea. You might also want to look into a winnowing process where fewer versions are kept further back in time, rather than just using a strict version count or time limit. I refer to something akin to the way round robin databases work - you keep one file from a month ago, one for each prior week, one every day for this week, and one every hour (assuming of course that the file was actually modified in each period). This helps reduce the "damn, I save every five minutes so my versions only go back an hour" problem.
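
A small sketch of that kind of winnowing, under assumed tier boundaries (hourly for a day, daily for a week, weekly for four weeks, nothing older). Timestamps are plain seconds; the function name and the tier table are made up for illustration.

```python
HOUR = 3600
DAY = 24 * HOUR

# (maximum age, bucket width): assumed values, tune to taste.
DEFAULT_TIERS = (
    (DAY, HOUR),          # up to a day old: keep one version per hour
    (7 * DAY, DAY),       # up to a week old: keep one version per day
    (28 * DAY, 7 * DAY),  # up to four weeks old: keep one version per week
)

def winnow(timestamps, now, tiers=DEFAULT_TIERS):
    """Return the version timestamps to keep; the rest can be expired."""
    keep = {}
    for ts in sorted(timestamps):
        age = now - ts
        for tier_index, (max_age, bucket) in enumerate(tiers):
            if age <= max_age:
                # Later (newer) versions overwrite earlier ones in a bucket.
                keep[(tier_index, ts // bucket)] = ts
                break
        # Versions older than the last tier fall through and are dropped.
    return sorted(keep.values())
```

Only periods in which the file was actually modified produce a kept version, since empty buckets are never created.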

Being able to mount a read-only view of the FS frozen at a point in time would be an incredibly nice interface to the versions that wouldn't require any special tools. User says "Yeah, I deleted it some time today" ... so I just:
mount -o version_date=`date -d 'yesterday' +%Y-%m-%d-%H-%M-%S` /path/to/device /path/to/samba/share/yesterdays_files

and the user can just access `yesterday's files' with samba. For that matter, imagine a rolling view:

mount -o version_age=24h ....

so the mount shows files as they were 24h ago, updating (roughly, presumably in chunks) with time. Stupid idea? Probably. Useless? Probably. Fixed point-in-time views that were mountable would be amazingly useful though ... like a coarse continual snapshot.
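
For what such a point-in-time view has to compute, here is a minimal sketch: each path has a time-ordered history, and the view at time T shows the last state committed at or before T. `History`, `frozen_view`, and the use of `None` to mean "deleted" are assumptions of the example, not anything an existing filesystem provides.

```python
import bisect

class History:
    """Time-ordered states of one path; a state of None means 'deleted'."""

    def __init__(self):
        self.times = []   # commit times, kept sorted
        self.states = []  # states[i] is the content committed at times[i]

    def commit(self, when, content):
        i = bisect.bisect_right(self.times, when)
        self.times.insert(i, when)
        self.states.insert(i, content)

    def as_of(self, when):
        """Content as it was at time `when`, or None if absent then."""
        i = bisect.bisect_right(self.times, when)
        return self.states[i - 1] if i else None

def frozen_view(tree, when):
    """Read-only snapshot {path: content} of `tree` as of time `when`."""
    view = {}
    for path, history in tree.items():
        content = history.as_of(when)
        if content is not None:
            view[path] = content
    return view
```

The rolling variant would simply call `frozen_view(tree, now - 24 * 3600)` each time the view is refreshed.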

Interesting work - and some ideas for the future

Posted Jul 15, 2006 11:23 UTC (Sat) by nix (subscriber, #2304)

An avoidance list will be mandatory: probably one global one and one per-directory (non-inherited). We don't want to version vim .swp files, for starters :)
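
A minimal sketch of how a global list plus non-inherited per-directory lists might be checked. The patterns, the dictionary layout, and the function name are illustrative assumptions; matching uses Python's standard `fnmatch` glob rules.

```python
import fnmatch
import posixpath

# Assumed global patterns, purely for illustration.
GLOBAL_AVOID = ["*.swp", "*~", "Thumbs.db"]

def should_version(path, per_dir_avoid, global_avoid=GLOBAL_AVOID):
    """Return False if `path` matches the global or its own directory's list.

    per_dir_avoid maps a directory to its patterns. Only the file's immediate
    directory is consulted, so the lists are not inherited by subdirectories.
    """
    directory, name = posixpath.split(path)
    patterns = list(global_avoid) + per_dir_avoid.get(directory, [])
    return not any(fnmatch.fnmatchcase(name, p) for p in patterns)
```

With `{"/home/u/cache": ["*"]}` as the per-directory table, everything directly in `/home/u/cache` is skipped while `/home/u/cache/sub/x.png` is still versioned.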

The winnowing idea sounds excellent: I'll incorporate it if I can figure out a user interface! (It sounds like the sort of thing which should be controlled by a minimal cutoff date and some sort of logarithmic or exponential parameter.)
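
One possible shape for that interface, sketched under my own assumptions: keep every version younger than a cutoff, then keep one version per age band, with each band `base` times wider than the last. The function and parameter names are invented for the example.

```python
def exponential_winnow(timestamps, now, keep_all_within, base=2):
    """Keep all recent versions, then one per exponentially growing age band.

    Band k covers ages in (keep_all_within * base**k,
    keep_all_within * base**(k + 1)].
    """
    if keep_all_within <= 0 or base <= 1:
        raise ValueError("keep_all_within must be > 0 and base must be > 1")
    recent = []
    bands = {}
    for ts in sorted(timestamps):
        age = now - ts
        if age <= keep_all_within:
            recent.append(ts)
            continue
        k, limit = 0, keep_all_within * base
        while age > limit:
            k += 1
            limit *= base
        bands[k] = ts  # ascending order means the newest version in a band wins
    return sorted(recent + list(bands.values()))
```

The user would then only have two knobs to turn: the cutoff and the growth factor.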

As for the read-only view, well, the current design already has the ability to roll individual files, directories, and trees backwards and forwards in time. It's not stupid at all. :)

Oh, and it's not read-only. If you write to a file (or modify a directory) which is rolled back into the past, you get a branch.

I think I've managed to conform to POSIX everywhere: I'm in the middle of verifying this at the moment. From the point of view of apps which have files open when someone rolls them back, it just looks like someone's done a truncate or a big write... the whole point of this is POSIX conformance with version control layered on top: I don't want to turn out yet *another* versioning system which doesn't support hard links, symlinks, or permissions!

Interesting work - and some ideas for the future

Posted Jul 7, 2006 14:16 UTC (Fri) by dion (subscriber, #2764)

The de-duplication feature sounds a lot like hardlinks and that can be done today.

The tricky part is the copy-on-write needed when you start modifying the file.

de-duplication in filesystems

Posted Jul 8, 2006 17:44 UTC (Sat) by giraffedata (subscriber, #1954)

The tricky part of de-duplication is identifying the duplicate files.

Users today create multiple copies of files because it's easier than sharing. The idea of de-duplication is that the users maintain that ease, but get the benefits of sharing because the system stores only one copy anyhow.

The copy on write technology is pretty much the same as is used today for snapshot copies. But the identification of duplicate files (or, in some proposals, blocks) is something I have yet to see done with demonstrable gain.
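
For reference, whole-file identification by cryptographic checksum can be sketched in a few lines. This is an in-memory toy with invented names, using SHA-256 and reference counts; it says nothing about whether the gain is worth the hashing cost on a real workload, which is the open question.

```python
import hashlib

class DedupStore:
    """Stores one physical copy per distinct file content."""

    def __init__(self):
        self.blobs = {}     # digest -> content
        self.refcount = {}  # digest -> number of names pointing at it
        self.names = {}     # path -> digest

    def write(self, path, data):
        # Replacing a file never touches other names sharing the old content.
        self.delete(path)
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        self.names[path] = digest

    def read(self, path):
        return self.blobs[self.names[path]]

    def delete(self, path):
        digest = self.names.pop(path, None)
        if digest is None:
            return
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.refcount[digest]
            del self.blobs[digest]
```

Two users saving the same presentation under different names end up sharing one blob until either of them changes it.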

de-duplication in filesystems

Posted Jul 10, 2006 18:58 UTC (Mon) by martinfick (subscriber, #4455)

Check out the vserver work on vhashify.

de-duplication in filesystems

Posted Jul 15, 2006 11:26 UTC (Sat) by nix (subscriber, #2304)

That's sort of similar, except I'm trying to work on the block level. The hardest part is arranging to detect cases where, say, someone has a big text file and inserts one byte at the front of it: the rest should still be detected as a duplicate, even if the original file and the new file are not version-related. (If they are version-related, detecting the duplicate is feasible.) Doing that for arbitrary unrelated files without storing ridiculously many hashes is tricky. (More generally, modifications that are not multiples of a block size should not cause unmodified portions of duplicated files to be un-duplicated.)
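
One known technique for the insert-one-byte case is content-defined chunking: instead of fixed-size blocks, cut wherever a rolling hash of the last few bytes hits a chosen pattern, so boundaries move with the data. The sketch below is an assumption-laden illustration, not the approach described above: it uses a simple "gear" style rolling hash with a table derived from SHA-256, averages roughly 64-byte chunks, and omits the minimum and maximum chunk sizes a real system would want.

```python
import hashlib

# One pseudo-random 32-bit value per byte value, derived deterministically.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:4], "big")
        for i in range(256)]

def chunks(data, bits=6):
    """Split data where the top `bits` bits of a rolling hash are all zero.

    The 32-bit hash only 'remembers' the last 32 bytes, so a cut point
    depends on nearby content, not on its offset within the file.
    """
    out, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if h >> (32 - bits) == 0:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])
    return out

def store(data, pool):
    """Add data's chunks to pool (digest -> bytes); return how many were new."""
    new = 0
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).digest()
        if digest not in pool:
            pool[digest] = chunk
            new += 1
    return new
```

Storing a file and then the same file with one byte prepended should add only the few chunks near the front on the second call. It still costs one hash per chunk, so it does not by itself answer the "ridiculously many hashes" concern.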

de-duplication in filesystems

Posted Jul 22, 2006 3:43 UTC (Sat) by JumpJoe (guest, #39288)

Not sure at what level the deduplication is being done, however:
www.datadomain.com

Other companies are doing deduplication above the filesystem layer (CAS):

http://searchstorage.techtarget.com/originalContent/0,289...

Yes, it would be great to have compression/deduplication built into a filesystem.
