The problem has nothing to do with delayed allocation, nor with the commit interval. It has to do with the classic mistake of writing metadata to disk before the corresponding data. A file system can easily delay allocation of file data for a minute and still preserve the data during crashes: it just needs to write the metadata for the new file after the data; and of course the rename metadata and the corresponding deletion of the old file data should be written even later. Finally, the file system needs to ensure with barriers that all of this reaches the disk in the right order.
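Today an application that wants this ordering has to impose it itself with explicit fsync() calls. A minimal sketch of the usual sequence (the function name and paths are illustrative, not from any particular application):

```python
import os

def atomic_replace(path, data):
    """Replace `path` with `data` so that the old and new contents
    are never mixed, enforcing the write ordering manually."""
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)   # 1. write the new file data
        os.fsync(fd)         # 2. barrier: data reaches disk before the rename
    finally:
        os.close(fd)
    os.rename(tmp, path)     # 3. only then commit the rename
    # 4. fsync the directory so the rename itself is durable
    dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

A file system that ordered writes as described above would make steps 2 and 4 unnecessary for consistency; they would only buy durability.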
> The real solution to this problem is to fix the applications which are expecting the filesystem to provide more guarantees than it really is.

Why should it be "the real solution" to change thousands of applications to deal with crash-vulnerable file systems? Even if all the application authors agreed with this idea, how would they know that their applications are not expecting more than the file system guarantees?
IMO the real solution is to keep the applications the same, and fix the file system; we need to fix just one file system, and can relegate all the others that don't give the guarantees to special-purpose niches where data integrity is unimportant.
What guarantee should the file system give? A good one would be this: If the application leaves consistent data if it is terminated unexpectedly without a system crash (e.g. with SIGKILL), the data should also be consistent in case of a system crash (although possibly old without fsync()). One way to give this guarantee is to implement in-order semantics.
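To make the guarantee concrete, here is the kind of code many applications already write (names are illustrative). It is consistent if the process is killed at any point, because rename() is atomic within the running system; under in-order semantics it would also be consistent across a crash, with no fsync() needed:

```python
import os

def save_config(path, data):
    """Consistent under SIGKILL: `path` always names either the complete
    old file or the complete new file, never a partial one."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
    os.rename(tmp, path)  # atomic with respect to other processes
    # No fsync(): with in-order crash semantics, a crash would leave
    # either the old or the new contents (possibly old, but consistent).
    # Today's file systems do not promise that.
```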
> Bringing the applications back into line with what the system is really providing is a better solution than trying to fix things up at other levels.

That's just wrong. But more importantly, it won't happen. So better bring the system in line with what the applications are expecting; for now, ext3 looks like the good-enough solution (despite Linux doing the wrong thing (no barriers) by default), and hopefully we will have file systems that actually give data consistency guarantees in the future.
I would welcome an article about the consistency guarantees that Btrfs gives (maybe in a comparison with other file systems). Judging from the lack of documentation of the guarantees (at least in prominent places), there seems to be little interest from file system developers in this area yet, but an article focusing on that topic may improve that state of affairs.
Concerning the subject of my comment: whenever someone mentions XFS, someone else reports a story about data loss, and that's why they are no longer using XFS. It seems that ext4 aspires to the same ideals as XFS: high performance and large data handling capabilities, with little care for the user's data in the case of a crash. I guess ext4 will then come to play a similar role among Linux users as XFS has.
Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds