Better guidance for database developers
Better guidance for database developers
Posted Sep 26, 2019 20:29 UTC (Thu) by nybble41 (subscriber, #55106)In reply to: Better guidance for database developers by rweikusat2
Parent article: Better guidance for database developers
Even if you mandated that fsync() == sync() so that *all* filesystem data was written to disk before fsync() returns it still wouldn't guarantee that there is actually a directory entry pointing to that file. For example, it could have been unlinked by another process, in which case the data on disk really would be nothing more than a "magnetic ornament".
Let's say process A creates a file with path "/a/file", writes some data to it, and calls fsync(). While this is going on, another process hard-links "/a/file" to "/b/file" and then unlinks "/a/file" prior to the fsync() call. Would you expect the fsync() call to synchronize both directories, or just the second directory?
Posted Sep 26, 2019 20:55 UTC (Thu)
by rweikusat2 (subscriber, #117920)
[Link]
The Linux ext2 file system introduced write-behind caching of directory operations in order to improve performance at the expense of reliablity in situations deemed to be rare. Because of this, depending on the filesystem being used, fsync on a file descriptor is not sufficient to make a file crash-proof on Linux: An application would need to determine the path to the root file system, walk that down while fsyncing every directory and then call fsync on the file descriptor. This is obviously not a requirement applications will realistically meet in practice.
Possibily 'hostile' activities of other processes (as in "Let's say ...") are of no concern here because that's not a situation fsync is supposed to handle.
Better guidance for database developers