The two sides of reflink()
Posted May 5, 2009 21:21 UTC (Tue) by martinfick (subscriber, #4455)
Posted May 5, 2009 21:29 UTC (Tue) by flewellyn (subscriber, #5047)
Posted May 6, 2009 14:25 UTC (Wed) by wilreichert (subscriber, #17680)
Posted May 6, 2009 15:09 UTC (Wed) by dlang (✭ supporter ✭, #313)
Posted May 6, 2009 17:03 UTC (Wed) by elanthis (guest, #6227)
If on the other hand cp says "this is a copy" to the kernel then the filesystem can just do the right thing. Of course, other applications will need to be modified to take advantage of the new feature, but such is the truth of most progress.
Posted May 7, 2009 21:03 UTC (Thu) by anton (guest, #25547)
Posted May 7, 2009 21:13 UTC (Thu) by martinfick (subscriber, #4455)
You don't. Usually the host system mounts a portion of the filesystem into a separate chroot for each guest server. The guests then typically have a limited root capability that does not include making device nodes, so they do not really have access to the device, only to the filesystem.
Posted May 10, 2009 18:42 UTC (Sun) by anton (guest, #25547)
The guests typically then have a limited root capability
that does not include making device nodes so they really do not have
access to the device, only the filesystem.
Posted May 10, 2009 19:09 UTC (Sun) by martinfick (subscriber, #4455)
Sure, but if you make the binaries read only you no longer have
independent guest systems that can be administered without knowledge of
the host or other guests. In other words, if I now want to upgrade the
apache server in one guest, I can't since the binary is read only to my
guest root user. With COW, no problem, as a guest admin I do not even
know that my apache binary is shared with others. It is only relevant to
the host (the host unifies the various guest binaries, not the guest).
Posted May 6, 2009 14:33 UTC (Wed) by MarkWilliamson (guest, #30166)
Those folks who are fortunate enough to have their home directory on a NetApp filer
have for years been able to "cd ~/.snapshot/" and find a special directory of
historical versions of their files. These are stored efficiently because of the nature of
the WAFL filesystem. With reflink, it would be possible to create a lightweight
version of historical snapshotting: you'd have a daemon run every night (for
instance) and recursively reflink the current state of all your files into a directory
tree at ~/.old-versions/<date>/ - then, if you ever needed to go back to an old
version of a file you could just look in there.
With reflinks this would be very fast and would not use up loads of disk space
(though there would still be quota concerns). It would make Time Machine or NetApp
.snapshot-like functionality easy to implement efficiently on single disk systems.
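The nightly pass described above might look roughly like the sketch below. It rests on some assumptions: the reflink() call under discussion is a proposal, so the sketch uses the Linux FICLONE ioctl (supported on filesystems such as Btrfs and XFS) as a stand-in, and it falls back to an ordinary copy where cloning is not supported. The names clone_file and snapshot_tree are made up for illustration.

```python
import fcntl
import os
import shutil

# Assumption: FICLONE is _IOW(0x94, 9, int), which is 0x40049409 on x86 and ARM.
# Other architectures may encode the request number differently.
FICLONE = 0x40049409


def clone_file(src, dst):
    """Make dst share src's data blocks if the filesystem allows it.

    Returns True when a reflink-style clone was made, False when the
    function had to fall back to a plain byte-for-byte copy.
    """
    with open(src, "rb") as s, open(dst, "wb") as d:
        try:
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
            return True
        except OSError:
            # Cloning unsupported here (or src and dst are on different
            # filesystems): copy the data instead.
            shutil.copyfileobj(s, d)
            return False


def snapshot_tree(root, snap_base, label):
    """Clone every regular file under root into snap_base/label/."""
    snap_abs = os.path.abspath(snap_base)
    dest_root = os.path.join(snap_base, label)
    for dirpath, dirnames, filenames in os.walk(root):
        # Never descend into the snapshot directory itself.
        dirnames[:] = [
            d for d in dirnames
            if os.path.abspath(os.path.join(dirpath, d)) != snap_abs
        ]
        rel = os.path.relpath(dirpath, root)
        target_dir = os.path.normpath(os.path.join(dest_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            # Symlinks, devices and sockets are skipped in this sketch.
            if os.path.isfile(src) and not os.path.islink(src):
                clone_file(src, os.path.join(target_dir, name))
    return dest_root
```

A cron job could call something like snapshot_tree(home, os.path.join(home, ".old-versions"), today) once a night. Permissions, timestamps and pruning of old snapshots are left out here; a real tool would need all three.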
Probably the most quoted reference for stuff like this is the Elephant research
filesystem, about which there are a number of decent research papers.
Another use that I've seen mentioned is the ability to make checkouts / clones in
modern version control systems go faster and be more lightweight in terms of disk
storage - for instance, cloning a git repository could transparently share all the
underlying data (including the working directory!) using reflinks. Similar tricks would
be possible for the other VCSes.
Finally, you could probably have a daemon that rummages around the system, finds
identical files and unifies them on disk using reflink in order to save space.
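A rough sketch of such a rummaging pass, under stated assumptions: files are matched by size and then by SHA-256 digest, and the Linux FICLONE ioctl stands in for the proposed reflink() call. The function names (file_digest, find_duplicates, unify) are invented for this example, and nothing here guards against a file changing between the scan and the unify step.

```python
import fcntl
import hashlib
import os
import shutil
from collections import defaultdict

# Assumption: FICLONE is _IOW(0x94, 9, int), which is 0x40049409 on x86 and ARM.
FICLONE = 0x40049409


def file_digest(path, chunk=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)
    groups = []
    for size, paths in by_size.items():
        # Only hash files that share a size with at least one other file.
        if size == 0 or len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def unify(keep, dup):
    """Replace dup with a clone of keep; leave dup untouched on failure."""
    tmp = dup + ".reflink-tmp"  # hypothetical temporary name
    try:
        with open(keep, "rb") as s, open(tmp, "wb") as d:
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
        shutil.copystat(dup, tmp)  # keep dup's mode and timestamps
        os.replace(tmp, dup)       # atomic swap on the same filesystem
        return True
    except OSError:
        # Cloning is not supported here, so there is no space to save.
        if os.path.exists(tmp):
            os.remove(tmp)
        return False
```

Unlike unifying with hard links, the two files stay independent afterwards: writing to one of them should only un-share the blocks that change. File ownership is not carried over by copystat, so a real daemon would have to handle that as well.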
Loads of cool stuff :-)
Posted May 6, 2009 14:34 UTC (Wed) by MarkWilliamson (guest, #30166)
Posted May 6, 2009 16:38 UTC (Wed) by cdarroch (guest, #26812)
Posted May 6, 2009 16:59 UTC (Wed) by MarkWilliamson (guest, #30166)
rdiff-backup (http://www.gnu.org/savannah-checkouts/non-gnu/rdiff-backup/) gives
somewhat similar snapshotting convenience but you have to interact with it through a
command line app. Also, it does use up extra space (although if you're backing up to
another machine / another drive for redundancy then that's just fine!).
archfs (http://code.google.com/p/archfs/) provides a Fuse interface to browse rdiff-backup
repositories. Last time I tried it, it wasn't really suitable for large repositories, but this may
have been fixed since then. rdiff-backup's page on related info has some other solutions:
.snapshot is a very nice user interface to old revisions.
Posted May 9, 2009 5:07 UTC (Sat) by TRS-80 (subscriber, #1804)
Line endings - make sure you select HTML not plain text, as the latter doesn't do wrapping for some reason.
Posted May 11, 2009 0:21 UTC (Mon) by vonbrand (subscriber, #4458)
I suffered through DOS's "you can undelete files whenever you fatfingered DEL". Most of the time it worked, but Murphy's Law ensured that when you really needed to get something back, it would usually be gone for good.
Unix's idea of "rm is final" is harsh, but you learn not to misplace stuff in the first place. It makes for a better experience in the long run.
Posted May 11, 2009 1:10 UTC (Mon) by MarkWilliamson (guest, #30166)
Although in practice it's going to get used to undo rm occasionally, it seems to me only sensible to have something like this available so I'm able to roll back important documents and settings to previous states if I make the wrong modification, or if some program barfs over everything and corrupts things.
Users will probably have to be repeatedly reminded that, yes, they do need an independent backup on another disk somewhere because reflinks won't save you if your computer explodes. But most folks don't do proper backups *anyhow*, so I doubt it'll make that aspect of user behaviour much worse!
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds