
Nice summary

Posted Mar 18, 2009 3:53 UTC (Wed) by bojan (subscriber, #14302)
Parent article: Better than POSIX?

> It is probably a matter of building our filesystems to provide "good enough" robustness as a default, with much stronger guarantees available to developers who are willing to do the extra coding work.

Which is the euphemism for "we'll have workarounds for your bugs" and "people that know will fix their apps".



Nice summary

Posted Mar 18, 2009 5:36 UTC (Wed) by xoddam (subscriber, #2322) [Link]

Please stop insisting that applications are buggy or broken that haven't considered recovery from kernel or hardware failure. That didn't use to be possible at all and POSIX certainly never guaranteed it.

Application writers have long used rename precisely and only to achieve atomicity; it's the only atomic operation the API provides (simulating atomicity with locks and a series of synchronous flushes is a very different matter). As long as the kernel and fs are up, POSIX guarantees that data writes which precede the rename will be visible to all readers after the rename. Post-crash, nothing used to be guaranteed.

This isn't about application developers coding to an API, it's about users wanting reasonable behaviour from their computers even when they abuse them by kicking out power cords or installing proprietary device drivers.

Ext3 has given us a new, much-better-than-POSIX standard of data recoverability. It's a mere implementation detail that it does this in part by effectively preserving the order of operations that POSIX mandates down to the disk level.

Delayed allocation without a write barrier before renames of newly-written files practically guarantees data loss in this extremely common use-case, so it's a regression.

The regression now has been fixed (thanks Ted). No hacking of applications required.
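For readers unfamiliar with the write-then-rename idiom this comment refers to, here is a minimal sketch in Python, assuming a POSIX system. The names `replace_file` and `durable` are hypothetical and do not come from any application discussed in this thread. With `durable=False` it shows the pattern applications have long relied on for atomicity; with `durable=True` it adds the extra flushes that the "fix your apps" side of the argument asks for.

```python
import os
import tempfile


def replace_file(path, data, durable=False):
    """Replace the contents of `path` with `data` via temp file + rename.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory as the target so the
    # rename never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            if durable:
                # Ask the kernel to push the new data to stable storage
                # before the rename makes it visible under `path`.
                tmp.flush()
                os.fsync(tmp.fileno())
        # While the system is up, readers see either the old file or the
        # complete new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        # The rename did not happen; remove the leftover temporary file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    if durable:
        # Also flush the directory entry so the rename itself is on disk.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Calling `replace_file("settings.conf", b"key=value\n")` is the version whose post-crash behaviour is in dispute here; whether the `durable=True` path should be necessary is exactly what the thread is arguing about. Note that `tempfile.mkstemp` creates the file with restrictive permissions, which a real application would likely want to adjust.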

Nice summary

Posted Mar 18, 2009 6:05 UTC (Wed) by bojan (subscriber, #14302) [Link]

> Please stop insisting that applications are buggy or broken that haven't considered recovery from kernel or hardware failure.

Sorry, I didn't come up with that. I think that would be... Ted ;-)

Know your place

Posted Mar 18, 2009 8:21 UTC (Wed) by xoddam (subscriber, #2322) [Link]

Appeal to authority, eh?

Please, bojan and tytso alike, cease and desist from saying applications are broken, when users have given a clear requirement for a new filesystem: that it not lose data as a matter of course, when the status quo would preserve it.

Know your place

Posted Mar 18, 2009 8:55 UTC (Wed) by bojan (subscriber, #14302) [Link]

> Appeal to authority, eh?

I don't know. I think a person that created the file system may know a thing or two about POSIX. I did actually go and check and he did appear to be right. But, that's obviously not good enough for you (or you may know of some interpretation we cannot grasp - it is possible). I'm OK with that.

> Please, bojan and tytso alike, cease and desist from saying applications are broken, when users have given a clear requirement for a new filesystem: that it not lose data as a matter of course, when the status quo would preserve it.

I have no intention of doing that (unless LWN editors throw me out). Likewise, you can say what you please.

Ted, being a pragmatic person, already did put workarounds in place, so users will be happy.

Know your place

Posted Mar 30, 2009 12:37 UTC (Mon) by forthy (guest, #1525) [Link]

> I think a person that created the file system may know a thing or two about POSIX.

It's not, and I repeat in bold: NOT about POSIX. It is about reasonable behavior. Ordered data has been implemented in ReiserFS and XFS, which both had the reputation of being unstable and prone to eat files before. This is a quality of implementation issue, not a standard issue. Maybe we would need a better standard for file systems, so that quality of implementation is reasonable by default, but that's a different topic. If you insist that your way-below-average quality of implementation is "perfectly valid", you are anal-retentive.

I think Ted Ts'o should read the GNU Coding Standards. What is written there is mandatory for a core component of the GNU project (which the Linux kernel is, regardless of whether it's officially part of the GNU project). The point in question here is section 4.1:

The GNU Project regards standards published by other organizations as suggestions, not orders. We consider those standards, but we do not “obey” them. In developing a GNU program, you should implement an outside standard's specifications when that makes the GNU system better overall in an objective sense. When it doesn't, you shouldn't.

What Ted has implemented was a behavior which is standard, but makes his file system worse, because it has inconvenient side-effects on robustness in case of a crash. In shorter words: It sucks. And the GNU Coding Standards clearly say: If the standard sucks, don't follow it.

it's about the crashes!!!

Posted Mar 18, 2009 17:24 UTC (Wed) by pflugstad (subscriber, #224) [Link]

> that it not lose data as a matter of course

Okay, I had to post on this one. The thing that EVERYONE seems to be forgetting is that these problems only occur when you have crashes, i.e. bad hardware or buggy drivers. This is not a case of losing data as a matter of course; it's a case of the whole freakin system crashing badly. This is a situation which happens VERY rarely.

Honestly, has anyone here, NOT running binary closed source drivers, experienced a crash in a distro provided kernel in what, the last 12 months or longer? Heck, even a bleeding edge (but not -RC) kernel.

Didn't think so. Now, please refrain from hyperbolic statements like that.

I realize that Ted pointed this out in his initial emails, and while it's still not good for the system-level behavior to change like this, this is a case of an ultra bleeding edge kernel, an ALPHA distro release, etc. These are not common users in any sense of the word "common".

it's about the crashes!!!

Posted Mar 18, 2009 20:02 UTC (Wed) by zeekec (subscriber, #2414) [Link]

Just let me say that I agree with Ted's opinion that the applications are in error, not ext4.

> Honestly, has anyone here, NOT running binary closed source drivers, experienced a crash in a distro provided kernel in what, the last 12 months or longer? Heck, even a bleeding edge (but not -RC) kernel.

Actually, yes I have. I run Gentoo unstable at home, and I am currently having issues with the 2.6.28 kernel and Xorg's intel drivers. All open source. So it does happen. (But I'm running Gentoo unstable and expect it!)

it's about the crashes!!!

Posted Mar 18, 2009 22:45 UTC (Wed) by xoddam (subscriber, #2322) [Link]

Your point is entirely correct, as far as it goes, and it validates my position.

The purpose of a journaling filesystem is *only* to ease and speed the task of recovery after an unclean shutdown. I can't emphasise this point strongly enough.

it's about the crashes!!!

Posted Mar 19, 2009 0:35 UTC (Thu) by butlerm (subscriber, #13312) [Link]

> The purpose of a journaling filesystem is *only* to ease and speed the task of recovery after an unclean shutdown.

That is not quite correct. The primary purpose of journaling in typical
journaling filesystems is to preserve metadata integrity. Filesystem
repair tools cannot repair metadata that has never been written.

The secondary purpose of journaling is to loosen ordering restrictions on
meta data updates. Assuming you want your filesystem to be there after an
unclean shutdown, that is a major advantage.

Finally, journaling filesystems are not metaphysically prohibited from
using their journals to do other useful things, such as store meta-data
undo information, for example.

it's about the crashes!!!

Posted Mar 19, 2009 5:56 UTC (Thu) by xoddam (subscriber, #2322) [Link]

Metaphysics aside, surely these primary and secondary purposes you describe themselves have the ultimate goal of saving end users the trouble of cleaning up a mess after an unclean shutdown?

it's about the crashes!!!

Posted Mar 20, 2009 21:17 UTC (Fri) by butlerm (subscriber, #13312) [Link]

Yes. The primary goal of journaling is to make the filesystem more robust
so that manual intervention after a system crash is minimized.

it's about the crashes!!!

Posted Mar 19, 2009 23:25 UTC (Thu) by jschrod (subscriber, #1646) [Link]

If you take your own argument seriously, you don't need any journaled file system -- after all, the only reason to use journaling is to get better behaviour after a crash.

That said, yes, I had many kernel crashes at the start of this year, using SUSE and no proprietary modules. It took a long time to identify the piece of hardware that caused it. (It was the video card.) I have another system where using ionice reproducibly causes hard lockups of the whole system, e.g. when running updatedb with ionice. I never identified the culprit there and finally put it in the closet; my time was worth more than the price of a new system.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds