The fundamental problem is that there are two similar but different operations an application developer can request:
1. open(A)-write(A,data)-close(A)-rename(A,B): replace the contents of B with data, atomically. I don't care when or even if you make the change, but whenever you get around to it, make sure either the old or the new version is in place.
2. open(A)-write(A,data)-fsync(A)-close(A)-rename(A,B): replace the contents of B with data, and do it now.
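For concreteness, here is a rough sketch of the two sequences in Python on a POSIX system. The function name replace_file and the durable flag are made up for illustration; the only difference between the two operations is whether os.fsync() is called before the close. Whether the non-durable variant really leaves either the old or the new contents in place after a crash depends on the filesystem, which is exactly the point under dispute.

```python
import os

def replace_file(path_a, path_b, data, durable=False):
    """Write data to path_a, then rename path_a over path_b.

    durable=False is operation 1 (atomic replace, no fsync).
    durable=True is operation 2 (fsync before the rename).
    Hypothetical helper; assumes POSIX rename() semantics.
    """
    fd = os.open(path_a, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        if durable:
            # Operation 2 only: force the data to stable storage now.
            os.fsync(fd)
    finally:
        os.close(fd)
    # On POSIX, rename() replaces an existing path_b atomically with
    # respect to other processes looking up the name.
    os.rename(path_a, path_b)
```

A configuration writer would call replace_file("settings.tmp", "settings", payload), while a mail server would pass durable=True; the file names here are placeholders.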
In practice, operation 1 has worked as described on ext2, ext3, and UFS with soft-updates, but fails on XFS and unpatched ext4. Operation 1 is perfectly sane: it's asking for atomicity without durability. KDE's configuration is a perfect candidate. Browser history is another. For a mail server or an interactive editor, of course, you'd want operation 2.
Some people suggest simply replacing operation 1 with operation 2. That's stupid. While operation 2 satisfies all the constraints of operation 1, it incurs a drastic and unnecessary performance penalty. By claiming operation 1 is simply operation 2 spelled incorrectly, you remove an important word from an application programmer's vocabulary. How else is he supposed to request atomicity without durability?
(And using a "real database" isn't a good enough answer: then you've just punted the same problem to a far heavier system, and for no good reason.)
The last patch mentioned in the article seems to make operation 1 work correctly, and that's good enough for me. Still, people need to realize that the filesystem is a database, albeit not a relational one, and that we can use database terminology to describe it.
Copyright © 2017, Eklektix, Inc.