The two sides of reflink()
Posted May 6, 2009 8:52 UTC (Wed) by lmb (subscriber, #39048)
There is, of course, also a need for reflinkstat() or whatever it is going to be called - one needs to find out how many blocks are part of a specific COW, and also reflinkdiff(), which enumerates the (meta-)data blocks which differ from the link target (needed for efficient rsync/backup).
Then, it also becomes possible to use quota to account just the difference, i.e. the actual space used by the reflink.
A further complication arises when we look at using reflink() on directories, which would of course also be quite desirable (for snapshots, multi-use chroots etc). That will be an interesting direction to explore ;-)
Posted May 9, 2009 23:52 UTC (Sat) by butlerm (guest, #13312)
However both aforementioned filesystems store block checksums, so perhaps a
more practical means would be to add an interface that returns the block
checksums of a file range (if not the block offsets) to user space, and
generate a candidate duplication list from that.
Posted May 6, 2009 9:02 UTC (Wed) by epa (subscriber, #39769)
That said, it makes no sense to account disk quota conservatively while lying about the amount of free space really available. The two should be treated the same, so if reflinking a large file has no effect on the reported free space, it shouldn't cost quota either.
Posted May 6, 2009 9:22 UTC (Wed) by nix (subscriber, #2304)
Posted May 6, 2009 10:42 UTC (Wed) by epa (subscriber, #39769)
Posted May 6, 2009 12:33 UTC (Wed) by vonbrand (guest, #4458)
Even worse, if I reflink() a file of yours, and yout then change it (or delete it, whatever) suddenly my quota goes up without any action on my part.
quota behaviour of reflink()
Posted May 6, 2009 11:21 UTC (Wed) by pjm (subscriber, #2080)
The reasons for different quota behaviour would be if it results in different (and more desirable) user behaviour, or if it helps system administrators choose better quota limits, i.e. if it results in less frequent filesystem-full situations for a given amount of user productivity.
How much are quotas used these days, and for what uses ? Can people comment on the usefulness of different quota policies in the context of specific use cases?
As to whether different quota behaviour would result in different user behaviour (e.g. encouraging taking steps for files to be reflink'ed rather than copied), I wonder how many quota'd users would have the necessary knowledge for it to change their behaviour.
Posted May 6, 2009 12:47 UTC (Wed) by utoddl (subscriber, #1232)
Space-reporting tools can report variations of (1) actual space used by extant allocated blocks (modulo sparse files), (2) space free, (3) space that would be used in the case where a naive copy were made to another file system -- all of which are valid and different numbers. The "simple" question of how much storage is used is in fact complicated, our desire for simple answers notwithstanding.
Posted May 16, 2009 0:37 UTC (Sat) by efexis (guest, #26355)
One example - if you share somebody's file and it doesn't count as your own space, then the original owner deletes their copy, the space is now soley allocated to you, and so should count against your quota? This act could push you way over your quota; what if the file alone is bigger than your complete quota?
What if two people are sharing a file that you delete? Should what's on your quota be divided equally amongst the two of them? If you want to be fair, surely the thing to do when you share a file from someone else is add half of its size to your own quote, and remove half of it from theres. If you share the file, you should share the cost?
If it goes straight onto your quota when you share it, you simply don't have to worry about any of these - but you also do at the same time lose certain benefits of sharing the data. With VMs, over-committing can often be specified as a %, maybe a similar option for sharing files... sharing a file could save you a certain % of its size from your quota, which then means if you suddenly become the sole owner, you only have added the remaining % rather than the full 100%. The % chosen would be linked to the average reference count of all blocks that you own, as this shows the likelyhook of any block being or becoming soley yours.
Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds