Linus thinks rename() without fsync() "is safe", in the sense that if he were writing an application that was intended to safely maintain important data (like, for example, the Linux source code), he would use rename() on files without using fsync() on them first. I'm not entirely sure that he reads any forum where this has been discussed so far.
I think fsync() makes sense to have. If the system stops running, some operations that were about to happen never do. Worse, once the whole system has stopped, it is hard to tell which operations completed and which did not, and serializing everything is too inefficient, particularly on a multi-process system. Falling back to the concurrency model, you can say that the filesystem after an emergency restart should be in some state that a process running before the restart could have seen. But that needs a further restriction, or the system could legitimately go back to the blank filesystem you had before installing anything. So fsync() makes sense as a requirement that the filesystem after a restart be in some state that could have been seen after the last fsync() that returned successfully.
(Of course, any time the system crashes, you might lose some arbitrary data, since the system has crashed; but a better system will lose less or be less likely to lose things. This is qualitatively different from the perfectly reasonable habit of ext4 of deciding that the post-restart state is the one right after every truncate.)
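For concreteness, here is a minimal sketch of the two patterns being argued about, assuming a POSIX system where Python's os.replace() ends up calling rename(). The function name replace_file and its durable flag are made up for illustration; durable=False is the rename()-without-fsync() habit attributed to Linus above, and durable=True adds the fsync() before the rename.

```python
import os
import tempfile


def replace_file(path, data, durable=True):
    """Replace `path` with `data` by writing a temp file and renaming it.

    Hypothetical helper for illustration only. With durable=False this is
    the plain rename() pattern; with durable=True the new contents are
    fsync()ed before the rename makes them visible under `path`.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem),
    # otherwise the rename cannot be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()  # push Python's userspace buffer to the kernel
            if durable:
                os.fsync(f.fileno())  # ask the kernel to push it to disk
        os.replace(tmp, path)  # rename() on POSIX: old or new, never half
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

A running process sees either the old or the new contents under path in both variants. The disagreement in this thread is only about what a crash shortly after the rename is allowed to leave behind when the fsync() is skipped.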
Ts'o: Delayed allocation and the zero-length file problem
Posted Mar 14, 2009 0:47 UTC (Sat) by njs (guest, #40338)
But now you've just redefined fsync(2) to mean sync(2), and that has unacceptable overhead for many real uses. (Durably spooling a 1k email message should not force that multi-gigabyte rsync to flush to disk!)
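To make the cost difference concrete, here is a rough sketch of what "durably spooling" one small message looks like when fsync() is scoped to that file alone. The name spool_message and the spool layout are invented for this example, and the directory fsync assumes Linux/POSIX behaviour; no claim is made that any particular mail server does exactly this.

```python
import os


def spool_message(spool_dir, name, payload):
    """Write one message and flush only that file (hypothetical helper).

    Nothing here asks the kernel to flush unrelated dirty data, which is
    what a system-wide sync() (os.sync() in Python) would do.
    """
    path = os.path.join(spool_dir, name)
    # "xb" fails if the file already exists, so a message is never clobbered.
    with open(path, "xb") as f:
        f.write(payload)
        f.flush()             # userspace buffer -> kernel
        os.fsync(f.fileno())  # this one file's data -> disk
    # Assumption: on Linux, fsync() on the directory makes the new
    # directory entry durable as well.
    dir_fd = os.open(spool_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return path
```

If fsync() had to guarantee a state that reflects everything any process could have seen up to that point, each call like this would have to wait for every other process's pending writes too, which is the overhead being objected to.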