But __NOTHING__ specifies what data you'll find left on the disk after a crash (and after a crash is the only time when the difference between "on disk" and "in memory buffers" makes any difference). fsync() does NOT guarantee durability - it can be a no-op.
So what this all boils down to is how close each filesystem implementation comes to "non-crash" behaviour after a crash, which is a quality-of-implementation choice for the filesystems.
As far as I can see, for portable code the best bet is to stick with the write-close-rename pattern. This is sufficient for atomic changes in the non-crash case. Adding fsync in there makes it safe in the crash case for some filesystems, but not all, and there are others where it was safe without it, and others where it has a performance penalty: it's far from a clear winner at the moment.
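For concreteness, here is a minimal sketch of the write-close-rename pattern in Python. The helper name `atomic_replace` and its `sync` flag are made up for illustration. The optional fsync is only as strong as the filesystem and drive underneath it, as discussed above.

```python
import os
import tempfile


def atomic_replace(path, data, sync=False):
    """Replace the file at `path` with `data` (bytes) via write-close-rename.

    Hypothetical helper for illustration only. With sync=True it also calls
    fsync() before the rename, which helps after a crash on some filesystems
    but not all.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file has to live on the same filesystem as the target,
    # otherwise rename() cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            if sync:
                tmp.flush()             # push Python's buffer to the kernel
                os.fsync(tmp.fileno())  # ask the kernel to push it to storage
        # File is closed here. os.replace() is rename() on POSIX systems.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Running processes see either the old contents or the new ones, never a partial file. What is on the disk after a crash still depends on the filesystem.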
Posted Mar 15, 2009 21:24 UTC (Sun) by bojan (subscriber, #14302)
> fsync() does NOT guarantee durability - it can be a no-op.
Hence, you need to have various #ifs and ifs() to figure out what works on your platform. See Mac OS X. fsync is just an example here. The point is that you must use _something_ to commit. Without that, POSIX does not guarantee anything beyond currently running processes seeing the same picture.
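A rough sketch of that kind of platform check, written in Python rather than with C preprocessor #ifs. The function name `commit_to_disk` is hypothetical. The only platform-specific knowledge assumed here is that Mac OS X offers the `F_FULLFSYNC` fcntl command, which Python's `fcntl` module exposes only on platforms that define it.

```python
import os

try:
    import fcntl  # not available on every platform
except ImportError:
    fcntl = None


def commit_to_disk(fd):
    """Try the strongest commit primitive we know about for this platform.

    Hypothetical helper for illustration. Returns the name of the mechanism
    that was used, so the caller can log it.
    """
    # Runtime equivalent of "#ifdef F_FULLFSYNC".
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return "F_FULLFSYNC"
        except OSError:
            # Defined, but not supported for this file; fall back to fsync().
            pass
    os.fsync(fd)
    return "fsync"
```

Even the stronger call says nothing about what a drive with a volatile write cache does afterwards; it only picks the best request the platform lets you make.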
Where did the correctness go?
Posted Mar 16, 2009 4:49 UTC (Mon) by dlang (✭ supporter ✭, #313)
Even doing an fsync doesn't mean that you won't have this corruption. The two writes could go to the disk drive's buffer, and the drive could write the metadata out before it writes the data blocks. If it loses power between these two steps, you have the same problem.
Posted Mar 16, 2009 13:28 UTC (Mon) by jamesh (guest, #1159)
Of course, if the drive supports barriers in its command queueing implementation it should be possible to prevent it reordering those writes.
That is likely to also restrict reorderings that wouldn't break any correctness guarantees, though.
Posted Mar 16, 2009 3:19 UTC (Mon) by k8to (subscriber, #15413)
A no-op fsync is not compliant. You've taken it quite a bit too far.
fsync explicitly says that when it returns success, the data has been handed to the storage system successfully.
It doesn't guarantee that that storage system has committed it in a durable way for all scenarios. That's another issue.
fsync does guarantee that the data has been handed to the storage medium, but makes no guarantees about the implementation of that storage medium.