
Guarantees and the belt-and-braces of journaling


Posted Mar 18, 2009 0:16 UTC (Wed) by xoddam (subscriber, #2322)
In reply to: ordered(tm) brand by nybble41
Parent article: Garrett: ext4, application expectations and power management

The use-case of writing a new file and renaming it to replace an existing one is documented and recommended by expert practitioners[citation needed] as the best, nay the only, way to achieve atomic operations on a POSIX system.

Specifically, whenever a rename replaces an old file with a recently-written one, the application developer's intention is to achieve atomic replacement of the file's contents. Invariably. No exceptions. Even if you "kill -9 1", this will not cause corruption or truncation of the target file.
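For anyone who hasn't seen the pattern spelled out, here is a minimal sketch of write-then-rename in Python. The function name `replace_atomically` and the `.tmp-` prefix are made up for illustration; `os.replace` is used on the assumption of a POSIX system, where it maps to rename(2). Note that it deliberately does not call fsync, so it shows only the atomicity POSIX promises while the system stays up.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write data to a temporary file, then rename it over path.

    Readers of path see either the old contents or the new contents,
    never a half-written file (as long as the system does not crash).
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory as the target so that
    # both names are on the same filesystem; rename cannot cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # The atomic step: the name now refers to the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed before the rename; drop the temporary file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

One caveat of this sketch: `mkstemp` creates the file with restrictive permissions, so a real application would probably want to copy the old file's mode across before renaming.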

POSIX *does* guarantee this atomicity, with precisely *one* exception -- if the system crashes, behaviour is undefined.

A journaling filesystem exists for one reason only: to provide reasonable behaviour in the event of a system crash, i.e. to extend the guarantees POSIX provides and reduce the need to recover data.

In a couple of instances, users have observed that particular up-and-coming journaling filesystems make it more (not less!) likely for them to need to recover files than the status quo. It is only sane that they should report this as a bug. It has *nothing* to do with application developers, who are using the recommended pattern and generally don't have much influence over what happens when their users' computers crash.

It is wonderful news then, that the developers of both filesystems (to my knowledge) that did exhibit such behaviour have listened to the requests of their users and let the journal extend the POSIX guarantee of atomic replacement on rename across system failures.

Hurrah and thank you.

Discussion of fsync is a complete red herring. On older POSIX-conforming filesystems there is NO GUARANTEE AT ALL that the filesystem will be accessible after a system crash, fsync or no. On some implementations, fsync can indeed make this particular kind of data loss less likely (and application developers in-the-know have used it for this purpose). There is still no POSIXLY_CORRECT guarantee that data will not be lost, so for a filesystem developer to say that his users don't really deserve to benefit from the safety that journaling can afford until application developers have jumped through an extra latency-imposing hoop is a bit rude. Not to say, putting the cart before the horse.
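To make the "extra hoop" concrete, here is a rough sketch of the fsync-augmented variant of write-then-rename. The name `replace_with_fsync` is hypothetical, and syncing the containing directory is an assumption about what a cautious application might do on Linux; as said above, none of this amounts to a POSIX guarantee that the data survives a crash.

```python
import os
import tempfile


def replace_with_fsync(path, data):
    """Write-then-rename with explicit fsync calls before and after."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # move Python's buffer into the kernel
            os.fsync(tmp.fileno())  # ask the kernel to write the data out
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed before the rename; drop the temporary file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: also sync the directory so the rename itself is written
    # out. This works on Linux; other platforms may not allow it.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Each fsync blocks until the storage device reports completion, which is where the latency comes from.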

By the way, Ted Ts'o is 90% correct to say application developers shouldn't fear fsync, and 100% wrong to say that fsync is the correct way to achieve atomic replacement with rename. Rename alone is supposed to achieve this; if a filesystem is technically capable of preserving this guarantee even across system failures then it should do so.



Guarantees and the belt-and-braces of journaling

Posted Mar 20, 2009 13:28 UTC (Fri) by regala (guest, #15745)

Journaling is not here to preserve data, but to preserve integrity. Would you be pleased if the data were on disk but the filesystem were broken and unrepairable?
People need to know why journaling was introduced, and clearly it is not here to preserve your little settings you got smashed because you wanted to play World of Goo. Get serious.

Guarantees and the belt-and-braces of journaling

Posted Mar 20, 2009 15:41 UTC (Fri) by foom (subscriber, #14868)

> People need to know why journaling was introduced, and clearly it is not here to preserve
> your little settings you got smashed because you wanted to play World of Goo. Get serious.

If journaling is not for that, then I want something which is! And yes, I do want to play World of Goo!

Why should I give a damn about the filesystem structure except as a prerequisite to being able to
get to my files? I want my files, and that means file *content*. So I want a system which does a
reasonably reliable job of ensuring that content doesn't disappear. Ext3 is such a system. Maybe it
was unintentional at the time it was designed, but now that it's recognized as a good idea, let's
*keep* making systems that work as well as it!

