Not again

Posted Dec 31, 2010 16:21 UTC (Fri) by man_ls (subscriber, #15091)
In reply to: Not again by etienne
Parent article: Ext4 filesystem hits Android, no need to fear data loss (ars technica)

And, the next time? Would you keep all the old files? You can also keep two of them, but how do you know which one of them was the last one written? What if one of them is half-written? That way leads to madness. Just to avoid a feature which has worked on ext3 (and probably all other popular POSIX filesystems) for decades, and which other OSs are trying to copy.



Not again

Posted Dec 31, 2010 19:05 UTC (Fri) by etienne (subscriber, #25256) [Link]

> Just to avoid a feature which has worked on ext3 for decades

Well, for decades you did not have hard drives with internal command queueing.
To get better performance you need to keep the queue full.
Because you cannot tell the hard drive that updating this sector is more important than that one, that information is probably not managed at all in the driver.
Moreover, you asked for this behaviour by wanting only metadata journaling of the filesystem, explicitly wanting a coherent filesystem (i.e. no fsck after crash) even if it means data inside files may be corrupted.
You can run with data journaling, but people and distributions think it is not worth the performance hit.

Once again, reenacting history

Posted Dec 31, 2010 20:21 UTC (Fri) by man_ls (subscriber, #15091) [Link]

It didn't happen that way. I did explicitly use a journaling filesystem thinking that "journaling" meant that it did not lose data in the event of a crash. When I found out that only metadata was guaranteed to be consistent, I thought "What a sham". Then I (and millions of other people) immediately switched to another filesystem: ext3 with data=ordered, which did offer better guarantees (data journaling) at the cost of performance. Who wants performance when your files are being corrupted?

The funny thing is that XFS developers eventually realized their folly and solved the atomicity issues, but now people don't trust them with their data anymore.

Not again

Posted Dec 31, 2010 22:14 UTC (Fri) by neilbrown (subscriber, #359) [Link]

1/ ext3 is the only filesystem I know of that forces an fsync before committing a rename.
2/ ext3 is about 10 years old, so it hasn't been around for "decades" unless you mean "0.9 decades".
3/ When rename was first introduced into Unix in BSD, it was atomic in the sense that even in the event of a crash there would always be a file with the destination name, either the original or the new. This is in contrast to the previous behaviour, which required:
- create "file.tmp"
- unlink "file"
- link "file.tmp" to "file"
- unlink "file.tmp"

which can easily leave nothing called "file". This is all that "atomic rename" means, or at least all it meant before ext3 gave rename unfortunate (though useful) semantics.
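To make the difference concrete, here is a minimal Python sketch of the two sequences. It is an illustration only, not code from any of the posts: the function names and the `checkpoint` callback (called at the point where a crash would matter) are made up for this example, and `sync_contents` is an assumed knob showing the explicit fsync an application can do for itself.

```python
import os

def replace_old_style(path, data, checkpoint=None):
    """Pre-rename sequence: create tmp, unlink, link, unlink tmp."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:          # create "file.tmp"
        f.write(data)
    os.unlink(path)                     # unlink "file"
    if checkpoint:
        # A crash here leaves nothing called "file".
        checkpoint(os.path.exists(path))
    os.link(tmp, path)                  # link "file.tmp" to "file"
    os.unlink(tmp)                      # unlink "file.tmp"

def replace_with_rename(path, data, checkpoint=None, sync_contents=False):
    """Single rename step: the destination name always refers to some file."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        if sync_contents:
            # Optional: push the new contents to disk before the rename,
            # rather than relying on the filesystem to do it.
            f.flush()
            os.fsync(f.fileno())
    if checkpoint:
        # The old file is still reachable under the destination name.
        checkpoint(os.path.exists(path))
    os.replace(tmp, path)               # rename(2) on POSIX systems
```

Without `sync_contents=True` the second function only guarantees the atomicity of the name, which is the narrow meaning described above. Whether the new contents are on disk after a crash is a separate question.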

Though I cannot know the intention of the author of that post you linked to, there is no prima facie reason to believe they mean anything more than the atomicity of names (not of contents) that rename has always had in Unix.

(And half-written files are easy to detect by writing a checksum at the end. If you suffix each file with a timestamp, it is easy to know which is the most recent. Any file older than a few minutes will be safe on disk, so you are always free to clean up any file older than the youngest file that is older than a few minutes.)
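A rough Python sketch of that scheme, under stated assumptions: the checksum is a SHA-256 digest appended after the payload, the timestamp is an integer zero-padded into the file name so that sorting names sorts by time, and all function names are hypothetical. The cleanup policy is left out.

```python
import hashlib
import os

DIGEST_LEN = hashlib.sha256().digest_size  # length of the trailing checksum

def write_version(directory, name, payload, timestamp):
    """Write payload followed by its SHA-256 to 'name.<timestamp>'."""
    # Zero-padding makes lexicographic order match numeric order.
    path = os.path.join(directory, f"{name}.{timestamp:020d}")
    with open(path, "wb") as f:
        f.write(payload)
        f.write(hashlib.sha256(payload).digest())
    return path

def read_if_complete(path):
    """Return the payload, or None if the trailing checksum does not match."""
    with open(path, "rb") as f:
        blob = f.read()
    if len(blob) < DIGEST_LEN:
        return None
    payload, trailer = blob[:-DIGEST_LEN], blob[-DIGEST_LEN:]
    if hashlib.sha256(payload).digest() != trailer:
        return None
    return payload

def latest_complete(directory, name):
    """Return the payload of the newest version that passes its checksum."""
    prefix = name + "."
    versions = sorted(
        (f for f in os.listdir(directory) if f.startswith(prefix)),
        reverse=True,
    )
    for fname in versions:
        payload = read_if_complete(os.path.join(directory, fname))
        if payload is not None:
            return payload
    return None
```

A reader calls `latest_complete()` and silently skips a newest file that was cut short by a crash, falling back to the previous version.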

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds