Where the the correctness go?

Posted Mar 14, 2009 12:03 UTC (Sat) by alexl (subscriber, #19068)
In reply to: Where the the correctness go? by bojan
Parent article: Ts'o: Delayed allocation and the zero-length file problem

The only way to get this guarantee with the current POSIX APIs is to fsync() the fd before renaming. But this imposes unnecessary overhead on both the app (generally) and the whole system (with ext3 data=ordered).
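As a rough illustration of the fsync-before-rename pattern, here is a minimal Python sketch. The function name replace_file_durably and the ".tmp-" prefix are made up for this example, and the sketch does not preserve the original file's permissions or ownership.

```python
import os
import tempfile


def replace_file_durably(path, data):
    """Replace `path` with `data` (bytes), fsyncing the temp file before the rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # push Python's userspace buffer to the kernel
            os.fsync(tmp.fileno())   # ask the kernel to write the data to storage
        # os.replace() is rename(2) on POSIX: the name switches atomically.
        os.replace(tmp_path, path)
    except BaseException:
        # Best-effort cleanup of the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

The os.fsync() call is the step that carries the cost discussed here: it blocks until the file's data has been pushed out, which is more than the "data before metadata" ordering the rename actually needs.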

Now, what ext4 does is clearly correct according to what is "allowed" by POSIX (actually, this is kinda vague as POSIX allows fsync() to be empty, and doesn't actually specify anything about system crashes.)

However, even if it's "posixly correct", it is imho broken, in the sense that I wouldn't let any of my data near such a filesystem, and I would recommend that everyone who asks me not use it.

Take for example this command:
sed -i s/user1/user2/ foo.conf

This does an in-place update by writing to a temp file and renaming it over the original, without an fsync. The result is that if your machine locks up within up to a minute of running this command, you lose both versions of foo.conf.
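To make the sequence concrete, here is a small Python sketch of the write-to-temp-and-rename pattern described above. It is not sed's actual implementation: the name sed_like_replace is invented for this example, and it does a literal string replacement of the first match on each line rather than a regex substitution.

```python
import os
import tempfile


def sed_like_replace(path, old, new):
    """Rewrite `path`, replacing the first `old` on each line with `new`.

    Uses a temp file plus rename, and deliberately does NOT fsync.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives next to the original so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    with os.fdopen(fd, "w") as tmp, open(path) as src:
        for line in src:
            tmp.write(line.replace(old, new, 1))
    # No fsync: the new contents may still be only in the page cache when
    # the rename below is issued.
    os.replace(tmp_path, path)


# Usage (hypothetical file): sed_like_replace("foo.conf", "user1", "user2")
```

With delayed allocation, the rename can reach the disk before the new file's data does, which is the window in which a crash leaves foo.conf empty.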

Now, is foo.conf important? How the heck is sed to know? Is sed broken? Should it fsync? That's more or less arguing that every app should fsync on close, which on ext4 is the same as the filesystem doing it, but on ext3 is unnecessary and a massive system slowdown.

Or should we try to avoid the performance implications of fsync (its guarantees go far beyond what we actually need)? We could do this by punting the decision to the users of sed, by having a -important-data argument, and then pushing this further out to any script that uses sed, etc, etc.

Or we could just rely on filesystems to guarantee that this common behaviour works, even if it's not specified by POSIX. (And choose not to use filesystems that don't give us that guarantee, just as so many people switched away from XFS after data losses.)

Ideally of course there would be another syscall, flag or whatever that says "don't write metadata before data is written". That way we could get both efficient and correct apps, but that doesn't exist today.


Where the the correctness go?

Posted Mar 14, 2009 21:20 UTC (Sat) by bojan (subscriber, #14302)

> However, even if it's "posixly correct", it is imho broken.

Look, this may well be true, but the fact is that all of us who are creating applications have one thing to rely on: documentation. And the documentation says what it says.

Where the the correctness go?

Posted Mar 16, 2009 12:00 UTC (Mon) by nye (guest, #51576)

POSIX also allows a system crash to cause your computer to explode and hurl shrapnel into your face, because crash-behaviour is *undefined*. Are you seriously arguing that *any* POSIX-compliant behaviour is automatically the right thing? Clearly not, because you are arguing against one POSIX-compliant method in favour of another. There are an infinite number of ways to be POSIX-compliant, some of which are more useful than others.
