The ongoing fallocate() story
The proposed fallocate() system call, which exists to allow an
application to preallocate blocks for a file, was covered here back in March.
Since then there has been quite a bit of discussion, but there is still no
fallocate() system call in the mainline - and it's not clear that
there will be in 2.6.23 either. There is a new version of the
fallocate() patch in circulation, so it seems like a good time
to catch up with what is going on.
Back in March, the proposed interface was:
long fallocate(int fd, int mode, loff_t offset, loff_t len);
It turns out that this specific arrangement of parameters is hard to support on some architectures - the S/390 architecture in particular. Various alternatives were proposed, but getting something that everybody liked proved difficult. In the end, the above prototype is still being used. The S/390 architecture code will have to do some extra bit shuffling to be able to implement this call, but that apparently is the best way to go.
That does not mean that the interface discussions are done, though. The current version of the patch now has four possibilities for mode:
As an example of how the last two operations differ, consider what happens if an application uses fallocate() to remove the last block from a file. If that block was removed with FA_DEALLOCATE, a subsequent attempt to read that block will return no data - the offset where that block was is now past the end of the file. If, instead, the block is removed with FA_UNRESV_SPACE, an attempt to read it will return a block full of zeros.
It turns out that there are some differing opinions on how this interface should work. A trivial change which has been requested is that the FA_ prefix be changed to FALLOC_ - this change is likely to be made. But it seems there are a number of other flags that people would like to see:
All told, it's a significant number of new features - enough that some people are starting to wonder if fallocate() is the right approach after all. Christoph Hellwig, in particular, has started to complain; he suggests adding something small which would be able to implement posix_fallocate() and no more. Block deletion, he says, is a different function and should be done with a different system call, and the other features need more thought (and aggressive weeding). So it's unclear where this patch set will go and whether it will be considered ready for 2.6.23.
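For readers who want to see the semantics under discussion, here is a minimal sketch using only long-standing POSIX calls. It does not use the FA_ modes from the patch, which are not merged; instead, posix_fallocate() handles the preallocation, and ftruncate() stands in for the two removal behaviors described above (shrinking the file approximates the FA_DEALLOCATE case, while shrinking and then re-extending leaves a hole that approximates the FA_UNRESV_SPACE case). The helper names, the 4096-byte BLOCK size, and the /tmp path are all assumptions made for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK 4096 /* illustrative size; not queried from the filesystem */

/* Create an unlinked temporary file, preallocate two blocks, fill with 'x'.
 * Returns an open descriptor, or -1 on failure. */
static int make_two_block_file(void)
{
    char path[] = "/tmp/falloc-demo-XXXXXX"; /* hypothetical location */
    char data[BLOCK];
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path); /* the file lives on until fd is closed */

    /* posix_fallocate() returns an error number; it does not set errno. */
    if (posix_fallocate(fd, 0, 2 * BLOCK) != 0) {
        close(fd);
        return -1;
    }
    memset(data, 'x', sizeof data);
    if (pwrite(fd, data, BLOCK, 0) != BLOCK ||
        pwrite(fd, data, BLOCK, BLOCK) != BLOCK) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Stand-in for the "unreserve" case: the file keeps its two-block size,
 * but the second block becomes a hole. Returns the pread() result. */
static ssize_t read_after_hole(int fd, char *buf)
{
    if (ftruncate(fd, BLOCK) != 0 || ftruncate(fd, 2 * BLOCK) != 0)
        return -1;
    return pread(fd, buf, BLOCK, BLOCK);
}

/* Stand-in for the "deallocate" case: the file shrinks, so the old
 * second block's offset is now at end of file. Returns the pread() result. */
static ssize_t read_after_truncate(int fd, char *buf)
{
    if (ftruncate(fd, BLOCK) != 0)
        return -1;
    return pread(fd, buf, BLOCK, BLOCK);
}

/* Returns 1 if the first n bytes of buf are all zero, 0 otherwise. */
static int all_zero(const char *buf, size_t n)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}
```

Called on a descriptor from make_two_block_file(), read_after_hole() should hand back a full block of zeros, while read_after_truncate() should read nothing at all because the offset is no longer inside the file. Whether a real fallocate() would behave identically on any given filesystem is a separate question that this sketch does not answer.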
One word comes to mind here...
Posted Jul 9, 2007 13:32 UTC (Mon) by dion (subscriber, #2764)
Bikeshed.
This is clearly one of the simplest and least critical syscalls ever conceived so it only makes sense that it would take forever to settle the details.
One word comes to mind here...
Posted Jul 14, 2007 17:59 UTC (Sat) by jkm (subscriber, #14176)
All syscalls are important. They form an ABI which we must maintain forever, basically. Getting them right the first time is pretty damned important.
Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds