The proposed reflink() system call creates an interesting cross between a hard link and a file copy. The end result of a successful reflink() call is a new, distinct file - with its own inode - which shares data blocks with the original file. A copy-on-write policy is used, so the two files remain distinct; if one is modified, the changes will not be visible in the other. This call has a number of uses, including fast snapshotting and as a sort of optimized copy operation. But, as was described in the previous article on reflink(), there is some disagreement over how file ownership and security-related metadata should be handled.
It comes down to the different use cases for this system call. In the "snapshot" case, security information must be preserved; that, in turn, means that reflink() can only be used by the owner of the file (or by a process with sufficient capabilities to get around ownership restrictions). On the other hand, those wanting to use reflink() as a fast file copy would rather see security information treated like it would be with a file copy; the user creating the reflink must have read access to the original file and ends up owning the new one.
For a while, it seemed like the reflink-as-copy use case was simply going to be left out in the cold. But then Joel Becker, the author of the reflink() patches, proposed a compromise. If the process calling reflink() had ownership or suitable privilege, the snapshot semantics would prevail. Otherwise, read access would be required and a new set of security attributes would be applied. The idea was to try to automatically do the right thing in all situations.
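As a rough illustration of that compromise, the decision could be modeled as below. This is only a sketch of the policy as described here, not code from the patches; the `struct caller` fields, the `reflink_outcome` enum, and `compromise_policy()` are all hypothetical names invented for the example.

```c
#include <stdbool.h>

/* Hypothetical model of the calling process; not taken from the patches. */
struct caller {
    bool owns_file;   /* caller owns the original file */
    bool privileged;  /* caller has capabilities that override ownership */
    bool can_read;    /* caller has read access to the original file */
};

/* Hypothetical labels for the two behaviors discussed in the article. */
enum reflink_outcome {
    REFLINK_SNAPSHOT, /* ownership and security attributes are preserved */
    REFLINK_COPY,     /* caller owns the new file; fresh attributes applied */
    REFLINK_DENIED    /* caller lacks read access, so the call fails */
};

/*
 * The proposed compromise: the semantics are chosen automatically from
 * the caller's privileges rather than requested explicitly.
 */
static enum reflink_outcome compromise_policy(const struct caller *c)
{
    if (c->owns_file || c->privileged)
        return REFLINK_SNAPSHOT;
    if (c->can_read)
        return REFLINK_COPY;
    return REFLINK_DENIED;
}
```

Note that the same call yields a different kind of file depending only on who makes it; the caller has no way to say which result was intended.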
In the end, though, this approach didn't fly either. From Andy Lutomirski's objection:
Please search the web and marvel at the disasters caused by setuid's magical caller-dependent behavior (the sendmail bug is probably the most famous). This proposal for reflink is just asking for bugs where an attacker gets some otherwise privileged program to call reflink but to somehow lack the privileges (CAP_CHOWN, selinux rights, or whatever) to copy security attributes, thus exposing a link with the wrong permissions.
Others agreed that automagically changing behavior depending on caller privilege was not the best way to go. So Joel went back to the drawing board yet another time. On May 15, he came back with a new proposal. The reflink() API would now look like:
int reflink(const char *oldpath, const char *newpath, int preserve);
The new preserve parameter would be a set of flags allowing the caller to specify which bits of security-oriented information are to be preserved. Anticipated values are:
The API would also provide REFLINK_ATTR_NONE and REFLINK_ATTR_ALL, with the obvious semantics. Importantly, if the caller lacks the requisite credentials to preserve the requested information, the call will simply fail. There will be no magically-changing semantics depending on the caller's capabilities.
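The fail-rather-than-fall-back rule can be sketched as a simple permission check. This is an illustrative model only: the numeric values given to REFLINK_ATTR_NONE and REFLINK_ATTR_ALL, the `struct sketch_caller` type, the `sketch_reflink_check()` helper, and the specific error codes are assumptions made for the example, and the sketch treats all preserved attributes as needing the same credentials, which the real patches may not do.

```c
#include <errno.h>
#include <stdbool.h>

/* Names from the proposal; the numeric values here are assumptions. */
#define REFLINK_ATTR_NONE 0
#define REFLINK_ATTR_ALL  (~0)

/* Hypothetical model of the calling process. */
struct sketch_caller {
    bool may_preserve; /* owner, or privileged enough to keep attributes */
    bool can_read;     /* has read access to the original file */
};

/*
 * Returns 0 if the call would be allowed, or a negative errno-style code.
 * The error codes are assumed; the article only says the call "will fail".
 * No files are touched; only the permission decision is modeled.
 */
static int sketch_reflink_check(const struct sketch_caller *c, int preserve)
{
    if (preserve != REFLINK_ATTR_NONE) {
        /* Preservation was requested explicitly: never downgrade silently. */
        if (!c->may_preserve)
            return -EPERM;
        return 0;
    }

    /* Copy-like semantics: read access to the original is enough. */
    if (!c->can_read)
        return -EACCES;
    return 0;
}
```

Under this model, an unprivileged reader asking for REFLINK_ATTR_ALL gets an error instead of a link with unexpected permissions, which is the situation the objection quoted above was concerned about.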
Joel also proposes some new flags to the ln command:
There was some question as to whether all the flags are necessary; perhaps all that is really needed is "preserve all" or "preserve none." But Joel feels that one might as well add the flexibility, given that the argument is being added to the API anyway, and there does not seem to be much strong sentiment to the contrary. All told, the reflink() API would appear to be stabilizing toward something that everybody can agree on. It's probably too late for 2.6.31, but this new system call could conceivably be ready for the 2.6.32 development cycle.
This week's reflink() API
Posted May 21, 2009 17:41 UTC (Thu) by spitzak (guest, #4593) [Link]
This week's reflink() API
Posted May 21, 2009 19:17 UTC (Thu) by butlerm (guest, #13312) [Link]
I don't know why one would extend the "ln" command however, when the
semantics of all modes of operation are essentially variations on "cp".
"reflink" really ought to be renamed accordingly for the same reason - from a
user perspective there is no link of any kind - it is just a space and time
efficient file copy.
This week's reflink() API
Posted May 21, 2009 20:30 UTC (Thu) by oak (guest, #2786) [Link]
"cowcopy"?
This week's reflink() API
Posted May 21, 2009 22:48 UTC (Thu) by butlerm (guest, #13312) [Link]
Copyright © 2009, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license