How disappointing

Posted Sep 9, 2011 16:26 UTC (Fri) by sionescu (subscriber, #59410)
Parent article: Ensuring data reaches disk

Articles like this make me think that the Unix-Haters Handbook is still as valid as ever; maybe it should be updated.

Better OSes allow one to write code like (the equivalent of) this:

fd = open(path, O_WRONLY | O_REPLACE);
close(fd, CLOSE_COMMIT);

which atomically replaces the file's data while keeping its metadata intact (perms, xattrs, SELinux context, etc.).

Again, how disappointing...

How disappointing

Posted Sep 9, 2011 21:15 UTC (Fri) by butlerm (guest, #13312) [Link]

Some of those better OSes have to be rebooted on a regular basis because an in-use binary file can't be replaced. So unless your filesystem implements multiversion read concurrency, locking up everything until the file is closed is more trouble than it is worth.

How disappointing

Posted Sep 9, 2011 22:45 UTC (Fri) by sionescu (subscriber, #59410) [Link]

Who said anything about locking?

How disappointing

Posted Sep 11, 2011 0:12 UTC (Sun) by butlerm (guest, #13312) [Link]

>Who said anything about locking?

I mention locking because it is the most common way to implement atomic commit semantics, from the perspective of all other processes. Your idea makes great sense as long as you have multiversion read concurrency, so that existing openers can see an old, read only version of the file indefinitely.

POSIX simply has a different solution for that, as I am sure you know: the name/inode distinction, which allows you to delete a file, or replace it with a new version via rename(), without locking other processes out, waiting, or disturbing existing openers.
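
For readers who have not written it out before, the rename-replace pattern looks roughly like the following. This is a minimal sketch, not a definitive implementation: replace_file() and its parameters are hypothetical names chosen for illustration, the caller is assumed to pass the directory that contains the target, and EINTR handling is left out. The calls themselves (mkstemp, fsync, rename) are standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with `len` bytes from `data`.
 * `dir` is assumed to be the directory containing `path`.
 * Returns 0 on success, -1 on error. */
int replace_file(const char *dir, const char *path,
                 const void *data, size_t len)
{
    char tmp[4096];
    const char *p = data;
    int fd, dfd, rc;

    /* Temporary name next to the target, so rename() stays on one filesystem. */
    rc = snprintf(tmp, sizeof tmp, "%s.tmp.XXXXXX", path);
    if (rc < 0 || rc >= (int)sizeof tmp)
        return -1;

    fd = mkstemp(tmp);              /* allocates a new inode, mode 0600 */
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0)
            goto fail;
        p += n;
        len -= (size_t)n;
    }

    if (fsync(fd) < 0)              /* data reaches disk before the name changes */
        goto fail;
    if (close(fd) < 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;

    if (rename(tmp, path) < 0)      /* atomic switch of the name to the new inode */
        goto fail;

    /* Flush the directory entry as well, so the rename itself is durable. */
    dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;
    rc = fsync(dfd);
    close(dfd);
    return rc;

fail:
    if (fd >= 0)
        close(fd);
    unlink(tmp);                    /* abort: the old file is left untouched */
    return -1;
}
```

Processes that already have the old file open keep reading the old inode; new openers get the new one. Note that the new file gets fresh permissions and no extended attributes, which is exactly the gap discussed in this thread.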

It is unfortunate, of course, that there is no standard call to clone an existing file's extended attributes and security context for use in a rename-replace transaction. Perhaps one should be added; it would be a worthwhile enhancement. Hating UNIX when it is vastly superior to the most widely distributed alternative in this respect seems a bit pointless to me.

How disappointing

Posted Sep 11, 2011 16:00 UTC (Sun) by sionescu (subscriber, #59410) [Link]

No, it's the common way of implementing atomic commit when *modifying* the data, but it's not what I have in mind, which is this:

open(path, O_REPLACE) only allocates a new inode

close(fd, CLOSE_COMMIT) atomically replaces the reference to the old inode with the new inode (just like rename), copying all metadata except for the (a|c|m)time, then calls fsync()

Easy, isn't it?

How disappointing

Posted Sep 11, 2011 20:45 UTC (Sun) by nix (subscriber, #2304) [Link]

Sure. Practicalities: you could do it to open() (though you'd have to get the change into POSIX before Ulrich would let it past), but you could never do that to close() without breaking every C program ever written. You could call it close_replace(), perhaps?

How disappointing

Posted Sep 11, 2011 21:10 UTC (Sun) by sionescu (subscriber, #59410) [Link]

Why POSIX? There are other Linux-specific open flags; did Ulrich object to every one of them?

The new syscall could be called close2, adding a "flags" parameter - in the spirit of accept4() et al.

How disappointing

Posted Sep 11, 2011 21:52 UTC (Sun) by nix (subscriber, #2304) [Link]

True, though close2() is a horrible name (as are wait$num() and accept$num()): give it a name that reflects its purpose.

How disappointing

Posted Sep 11, 2011 22:14 UTC (Sun) by sionescu (subscriber, #59410) [Link]

How about "close_with_flags"?

How disappointing

Posted Sep 11, 2011 23:51 UTC (Sun) by nix (subscriber, #2304) [Link]

Again, ugh ('with'?). I'd simply say close_replace(), no need for a flag or indeed any parameters at all. This means it has the same prototype as close(), so if anyone wants to choose between calling close() or close_replace() at runtime, they can just use a function pointer.
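
To make the function-pointer point concrete, here is a small sketch. close_replace() does not exist anywhere; the stub below merely stands in for it so that the example compiles, and pick_close() is likewise a hypothetical helper.

```c
#include <unistd.h>

/* Hypothetical stand-in for the proposed call; it just forwards to close(). */
int close_replace(int fd)
{
    return close(fd);
}

/* Both functions share the prototype int (*)(int). */
typedef int (*close_fn)(int);

/* Choose at runtime which close variant to use. */
close_fn pick_close(int want_atomic_replace)
{
    return want_atomic_replace ? close_replace : close;
}
```

A caller would then write something like pick_close(atomic)(fd), with no flag argument to thread through.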

How disappointing

Posted Sep 23, 2011 0:59 UTC (Fri) by spitzak (guest, #4593) [Link]

If it is opened with the atomic-replace semantic, I would just have plain close() do the replacement.

There may be a need to somehow "abort" the file so that it is as though you never started writing it. But it may be sufficient to do this if the process owning the fd exits without calling close().

I very much disagree with others who say POSIX should be followed. The suggested method of writing a file is what is wanted probably 95% of the time that files are written. It should be the basic operation, while "dynamic other processes can see the blocks change as I write them" is an extremely rare operation that should be the one requiring complex hacks.

Copyright © 2018, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds