
ext4 and data loss


Posted Mar 13, 2009 2:14 UTC (Fri) by iabervon (subscriber, #722)
In reply to: ext4 and data loss by giraffedata
Parent article: ext4 and data loss

Beyond POSIX, I think that users of a modern enterprise-quality *nix OS writing to a good-reliability filesystem expect that operations which POSIX says are atomic with respect to other processes are usually also atomic with respect to processes after a crash (mostly of the unexpected-halt variety), and that fsync() forces the other processes to see the operation as having happened.

That is, you can think of "stable storage" as a process that reads the filesystem from time to time and, after a crash, repopulates it with whatever it read last; fsync() will only return after a read that happens after you call it. You don't know exactly what "stable storage" read, and it can have all of the same race conditions and time skew that any other concurrent process can. If the post-crash filesystem matches some such snapshot, anything lost is the user's or application's carelessness; if it matches no such snapshot, that's crash-related filesystem damage.
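This model is what the write-temp-then-rename idiom at the heart of the ext4 debate relies on: as long as every snapshot "stable storage" could have taken shows either the old file or the complete new one, no crash can expose a truncated mixture. A minimal C sketch of the idiom (file names are hypothetical, error handling abbreviated):

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Replace "path" with new contents via a temporary file. After a
 * crash, any snapshot stable storage took shows either the old file
 * or the complete new one -- provided the data was forced out with
 * fsync() before the rename made it visible under the real name. */
static int replace_file(const char *path, const char *tmp,
                        const char *data, size_t len)
{
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0 || rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

Without the fsync() before rename(), a delayed-allocation filesystem may commit the rename long before the data, which is exactly the zero-length-file window the thread is arguing about.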



ext4 and data loss

Posted Mar 13, 2009 2:51 UTC (Fri) by quotemstr (subscriber, #45331) [Link]

Beyond POSIX, I think that users of a modern enterprise-quality *nix OS writing to a good-reliability filesystem expect that operations which POSIX says are atomic with respect to other processes are usually also atomic with respect to processes after a crash (mostly of the unexpected-halt variety)
In an ideal world, that would be exactly what you'd see: after a cold restart, the system would come up in some state the system was in at a time close to the crash, not some made-up non-existent state the filesystem cobbles together from bits of wreckage. Most filesystems weaken this guarantee somewhat, but leaving NULL-filled and zero-length files that never actually existed on the running system is just unacceptable.

fsync() forces the other processes to see the operation having happened
Huh? fsync has nothing to do with what other processes see. fsync only forces a write to stable storage; it has no effect on the filesystem as seen from a running system. In your terminology, it just forces the conceptual "filesystem" process to take a snapshot at that instant.

ext4 and data loss

Posted Mar 13, 2009 15:26 UTC (Fri) by iabervon (subscriber, #722) [Link]

In an ideal world, that would be exactly what you'd see: after a cold restart, the system would come up in some state the system was in at a time close to the crash, not some made-up non-existent state the filesystem cobbles together from bits of wreckage.
The model works if you include the fact that, in a system crash, unintended things are, by definition, happening. Any failure of the filesystem to make up a possible state afterwards appears as fallout from the crash. Maybe some memory corruption changed your file descriptors, and your successful writes and successful close went to some other file (while the subsequent rename still found the original names). Maybe something managed to write zeros over your file lengths. How often undefined behavior leads to noticeable problems is not a matter of standards, but it is a matter of quality.
fsync has nothing to do with what other processes see. fsync only forces a write to stable storage; it has no effect on the filesystem as seen from a running system. In your terminology, it just forces the conceptual "filesystem" process to take a snapshot at that instant.
That's what I meant to say: it makes the "filesystem" process see everything that had already happened. (And, by extension, it makes the processes that run after the system restarts see it, since they look at the filesystem recovered from stable storage.)

ext4 and data loss

Posted Mar 13, 2009 16:04 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

In an ideal world, that would be exactly what you'd see: after a cold restart, the system would come up in some state the system was in at a time close to the crash, not some made-up non-existent state the filesystem cobbles together from bits of wreckage. Most filesystems weaken this guarantee somewhat, but leaving NULL-filled and zero-length files that never actually existed on the running system is just unacceptable.

You mean undesirable. It's obviously acceptable, because you and most of your peers accept it every day. Even ext3 comes back after a crash with the filesystem in a state it was not in at any instant before the crash. The article points out that it does so to a lesser degree than some other filesystem types because of its 5-second flush interval instead of the more usual 30 (I think), and because two particular kinds of updates are serialized with respect to each other.

And since you said "system" instead of "filesystem", you have to admit that gigabytes of state are different after every reboot. All the processes have lost their stack variables, for instance. Knowing this, applications write their main memory to files occasionally. Knowing that even that data isn't perfectly stable, some of them also fsync now and then. Knowing that even that isn't perfectly stable, some go further and take backups and such.
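The trade described here, checkpointing often but paying the fsync() cost only occasionally, might look like this hypothetical helper (SYNC_EVERY and the function name are invented for illustration):

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical checkpoint helper: save in-memory state on every call,
 * but pay the fsync() cost only once per SYNC_EVERY saves. Between
 * durability points the data sits in the page cache, so a crash can
 * lose up to SYNC_EVERY - 1 checkpoints -- and an unsynced
 * truncate-and-rewrite is exactly the kind of window in which a crash
 * can expose a zero-length file. */
#define SYNC_EVERY 10

static int checkpoint(const char *path, const char *state, size_t len)
{
    static unsigned saves;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    int ok = (write(fd, state, len) == (ssize_t)len);
    if (ok && ++saves % SYNC_EVERY == 0)
        ok = (fsync(fd) == 0);   /* periodic durability point */
    close(fd);
    return ok ? 0 : -1;
}
```

Where the line is drawn -- every save, every tenth save, only at backup time -- is the trade the comment describes.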

It's all a matter of where you draw the line -- what you're willing to trade.

