Yes, glibc's rename() API guarantees atomic renames. Since normal applications do not make syscalls directly, but call the libc API to do it on their behalf, they are not to blame.
Posted Nov 11, 2010 8:46 UTC (Thu) by bojan (subscriber, #14302)
And even more "funnily", glibc doesn't deal with file system implementation (i.e. the persistence of the change) at all. In fact, that very page you pointed to states that strange things may indeed happen after a crash.
The atomicity of rename() refers to a view from the running system and not much else. But it has sure been misread a lot :-)
Glibc change exposing bugs
Posted Nov 11, 2010 9:06 UTC (Thu) by Mook (guest, #71173)
Hmm, odd; I parse "If there is a system crash during the operation, it is possible for both names to still exist; but newname will always be intact if it exists at all." as "the file named by the destination will either not exist, or have some sort of sensible value, but not be truncated at zero bytes unless that was one of the two inputs".
Glibc change exposing bugs
Posted Nov 11, 2010 9:52 UTC (Thu) by bojan (subscriber, #14302)
You are confusing file names (i.e. what is recorded in the directory) with contents of files.
Glibc change exposing bugs
Posted Nov 11, 2010 13:49 UTC (Thu) by pbonzini (subscriber, #60935)
"intact" seems to refer to the contents?
Glibc change exposing bugs
Posted Nov 11, 2010 23:05 UTC (Thu) by bojan (subscriber, #14302)
Suppose there are two entries in the directory, with oldname being renamed to newname, and each (obviously) pointing to an inode. If the system crashes during the rename, it is possible that both will survive (because the directory was not committed to disk yet).
What the glibc docs are talking about is that rename() is not implemented by copying the content of oldname to newname. So, if newname existed before the rename and the directory commit doesn't go through, the content of newname will not be changed. It is a pure directory operation. On the other hand, if the directory gets committed, there will be just newname there, pointing to whatever content oldname had. All of that is if your FS knows how to survive a crash - otherwise the situation is not interesting (well, unless you're the sysadmin recovering the mess :-).
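To make the "pure directory operation" point concrete, here is a minimal Python sketch. It is not taken from the glibc docs: the helper name rename_keeps_inode and the file contents are made up for illustration, and it assumes a POSIX-like file system where os.replace() ends up calling rename(). It only shows the view from the running system, not what survives a crash.

```python
import os
import tempfile


def rename_keeps_inode(directory):
    """Rename oldname over an existing newname and report what changed.

    Hypothetical helper for illustration only.
    """
    old = os.path.join(directory, "oldname")
    new = os.path.join(directory, "newname")

    with open(old, "w") as f:
        f.write("content of oldname")
    with open(new, "w") as f:
        f.write("previous content of newname")

    old_inode = os.stat(old).st_ino

    # No data is copied here: the directory entry "newname" is simply
    # repointed at the inode that "oldname" referred to.
    os.replace(old, new)

    with open(new) as f:
        content = f.read()
    return old_inode, os.stat(new).st_ino, os.path.exists(old), content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        old_inode, new_inode, old_exists, content = rename_keeps_inode(d)
        print("same inode:", old_inode == new_inode)
        print("oldname still present:", old_exists)
        print("newname content:", content)
```

The inode number that newname reports after the call is the one oldname had before it, which is the sense in which nothing about the file content is touched by the rename itself.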
Now note the situation from the ext4 "problem". The oldname content was not fsync()-ed to disk before the rename(). Ergo, when the directory got committed, oldname became newname on disk, pointing to zero bytes, due to delayed allocation. This has nothing to do with the fact that on unsuccessful (i.e. not committed before the crash) rename(), both oldname and newname would remain in the directory.
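A minimal sketch of the pattern that avoids that zero-length result: fsync() the new content before the rename(). This is an illustration under assumptions, not anyone's actual code from the thread: the helper name replace_file_durably and the ".tmp" suffix are hypothetical, and it relies on os.replace() calling rename() on POSIX systems.

```python
import os
import tempfile


def replace_file_durably(path, data):
    """Write data to a temporary file, fsync it, then rename it over path.

    Hypothetical helper for illustration; the ".tmp" suffix is an assumption.
    """
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()              # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to put the data on disk
    # Only now repoint the directory entry; the content it will point to
    # has already been flushed, so the name cannot end up at zero bytes
    # merely because the data was still waiting on delayed allocation.
    os.replace(tmp, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "newname")
        replace_file_durably(target, b"first version\n")
        replace_file_durably(target, b"second version\n")
        with open(target, "rb") as f:
            print(f.read())
```

Skipping the os.fsync() call leaves exactly the ordering described above: the directory update can reach the disk before the file data does.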
Glibc change exposing bugs
Posted Nov 12, 2010 7:12 UTC (Fri) by Mook (guest, #71173)
Thank you for the clear explanation! It does clearly say that I'm wrong :)