> Which, to me, sounds very much the same as saying, like Ted Ts'o, that a correct user program has to fsync() its data and not rely on fclose() actually flushing anything to disk.
It's not Ted saying this. It is how it works. From man 2 close:
> A successful close does not guarantee that the data has been successfully saved to disk, as the kernel defers writes. It is not common for a file system to flush the buffers when the stream is closed. If you need to be sure that the data is physically stored use fsync(2). (It will depend on the disk hardware at this point.)
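For illustration, here is a minimal sketch of what that man page advice looks like in practice. It uses Python only because `os.fsync()` is a thin wrapper over fsync(2); the function name `write_durably` is made up for this example, and whether the data truly reaches the platter still depends on the disk hardware, as the man page says.

```python
import os

def write_durably(path, data):
    """Write bytes to path and ask the kernel to push them to stable storage.

    Hypothetical helper for illustration; not taken from any library.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain the userspace (stdio-like) buffer into the kernel
        os.fsync(f.fileno())  # fsync(2): ask the kernel to write the file data out
    # Leaving the with-block closes the file. The close alone would not
    # have guaranteed that anything reached the disk.
```

Without the `flush()` and `os.fsync()` lines the function still returns successfully, which is exactly the situation the man page warns about.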
Ts'o: Delayed allocation and the zero-length file problem
Posted Mar 16, 2009 6:13 UTC (Mon) by jamesh (guest, #1159)
People aren't asking for sync on close. Rather they'd like the rename() operation to only occur if the new file data has been written to disk.
Conversely, if the file data hasn't been written to disk then they expect that the rename over the old data won't occur.
There is no expectation that either of these operations will occur immediately, which is why they don't request that happen via fsync().
If the method applications currently use to get this behaviour is wrong, then it'd be nice to define an API that does provide the desired semantics. That said, I can't think of any cases where you wouldn't want the new data blocks written out before renaming over an existing file.
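To make the pattern concrete, here is a rough sketch of the write-then-rename sequence, with the fsync() that is currently the only portable way to guarantee the ordering. It is in Python, whose `os.replace()` maps to rename(2) on POSIX systems. The function name and the `.tmp` suffix are assumptions for the example, and the directory fsync at the end assumes a POSIX system where a directory can be opened read-only.

```python
import os

def replace_file_safely(path, data):
    """Replace path with new contents so a crash leaves the old or the new file.

    Hypothetical helper for illustration; the ".tmp" suffix is an arbitrary choice.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()             # userspace buffer -> kernel
        os.fsync(f.fileno())  # force the new data blocks out before the rename
    os.replace(tmp_path, path)  # rename(2): atomically swap in the new file

    # Optional extra step: fsync the containing directory so the rename
    # itself is on disk. Assumes POSIX; opening a directory fails on Windows.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The `os.fsync()` on the temporary file forces the write to happen right away, which is more than is being asked for here: the only requirement is that the data blocks go out before the rename does, whenever that turns out to be.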