> No, even at that point it might not be important to get the new data to disk right away.
> But it is still important to get either the old or the new data. However, ext4 leaves the disk
> with contents that never existed at any point during the changes the process made,
> because ext4 did the rename before writing the data.
> The problem is that there exists no API that guarantees exactly the level of integrity needed in
> many cases. You used to be able to create a file and then rename it on top of an existing file to
> get what you wanted. The change forced you to sync to get the guarantee that you needed, but
> it gave you more than you wanted and was slower because of that.
rename is a metadata operation, which is atomic. If you have not guaranteed the data is on disk
with fsync, then seeing the new file with no data after a crash is one obvious outcome.
And if you rely on btrfs to flush on rename, or ext3 semantics or whatever, then the app is still
broken.
"write, fsync, rename" is the sequence you need for correctness. If you don't need the new data
right away, then defer the fsync, rename part until the point at which you do need it. If you see or
perceive some performance problem with the sequence required for correctness, then the answer is
absolutely not to destroy correctness or hope to rely on some undocumented implementation
detail. Raise the issue on lkml, provide details, suggest additional APIs etc.
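
For reference, here is a minimal sketch of the "write, fsync, rename" sequence in C. The
helper name replace_file and the temp-file argument are hypothetical, made up for
illustration. It assumes the temp file lives on the same filesystem as the target, and it
only covers the three steps named above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with `len` bytes from `buf` so that
 * after a crash the file holds either the old or the new contents.
 * `tmp` is a scratch name on the same filesystem as `path`.
 * Returns 0 on success, -1 on error (errno may be clobbered by cleanup). */
int replace_file(const char *path, const char *tmp,
                 const void *buf, size_t len)
{
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* 1. write: loop, because write() may be short or interrupted */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        left -= (size_t)n;
    }

    /* 2. fsync: the new data must be on disk before the name changes */
    if (fsync(fd) < 0)
        goto fail;
    if (close(fd) < 0) {
        unlink(tmp);
        return -1;
    }

    /* 3. rename: atomically point `path` at the fully written file */
    if (rename(tmp, path) < 0) {
        unlink(tmp);
        return -1;
    }
    return 0;

fail:
    close(fd);
    unlink(tmp);
    return -1;
}
```

A caller would do something like replace_file("config", "config.tmp", data, size). If the
new data is not needed right away, the fsync and rename steps are the part to defer, not
to drop.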