
Two more


Posted Mar 31, 2009 7:03 UTC (Tue) by man_ls (guest, #15091)
In reply to: From ext3 to ext4: An Interview with Theodore Ts'o (Linux Magazine) by bojan
Parent article: From ext3 to ext4: An Interview with Theodore Ts'o (Linux Magazine)

Thanks for an excellent summary. Let me explain two more possible consequences:

  • Mr Ts'o shows considerable arrogance saying that virtually every application on the planet is "badly written" (including GNU fileutils, meaning most frequently used OS tools such as mv). He also seems unaware of what we might call "Hot topics in filesystem design", such as: "POSIX is not the bible of reliability it was never supposed to be" or "Users dislike empty files".
  • This dangerous combination of arrogance and ignorance is leading Mr Ts'o to quickly damage ext4's reputation and place it next to XFS in users' minds, and we all know how hard it is to revert that kind of reputation. This may leave Linux users for many years to come between a rock and a hard place when it comes to filesystem performance: use the obsolete and slow ext3, or suffer the consequences of repeated slow fsync() calls in the much-needed ext4.
Linux has never been about correctness (however one might define it), but about quality and performance. I wonder if Linus, the benevolent dictator, should benevolently revoke Mr Ts'o's commit rights, or something.



Two more

Posted Mar 31, 2009 7:52 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link] (9 responses)

Linus doesn't give commit access to his tree to anybody. Everybody gets commit access to their own trees and nobody else can prevent that. I assume you mean refusing to pull the maintainer's code. I don't think that is the right strategy to encourage cooperation, and I don't remember it happening before.

We have many, many arrogant people in the Free software world, and key parts of any Linux system depend on their code. If you can find any technical incompetence that results in issues unfixed, it might be worth considering but I don't see you pointing out any such issues.

Two more

Posted Mar 31, 2009 19:38 UTC (Tue) by man_ls (guest, #15091) [Link] (8 responses)

You are right, "commit rights" was meant in a purely rhetorical sense. Saying "Linus should not pull nor even cherry-pick from Mr Ts'o any more" just doesn't carry the same strength.
> If you can find any technical incompetence that results in issues unfixed, it might be worth considering but I don't see you pointing out any such issues.
Sorry, I don't buy that. Technical competence to me is not just about fixing issues; it includes the ability to see the consequences of your actions. When a guy makes a change and suggests that thousands compensate for it for no good reason, that is a pretty good sign of incompetence. As sbergman27 pointed out below (and as he quoted a few jiffies before I did), Linus did choose the word "incompetent".

Two more

Posted Mar 31, 2009 21:02 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link] (7 responses)

There are multiple issues being mixed together in these discussions, I think. If you are talking about the zero-length file issues, Ted suggested that it was an application usage problem but added hacks to work around the issues anyway. It seems a lot of people just ignored that for whatever reason.

Workarounds

Posted Mar 31, 2009 21:40 UTC (Tue) by man_ls (guest, #15091) [Link] (5 responses)

Just for one reason: because Mr Ts'o never admitted to being wrong. In Catholic terms, what good is reparation without repentance? Or, how can you ever learn from your mistakes if you don't admit them in the first place?

Workarounds

Posted Mar 31, 2009 21:50 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link] (4 responses)

You seem to be making it a religious black and white issue while I think there is weight on both sides of the debate. One important consequence is that if applications rely on ext3-like behavior, then those applications will fail miserably when running on other filesystems that haven't adopted the same characteristics, and that was among the things Ted pointed out.

The very same blog post that describes the problems also mentions that fixes have already been queued. Technically, I don't know what more you could ask for. To be clear, there are other potential issues present but the ones you are talking about were fixed even before the blog post was written.
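
For readers who have not followed the earlier threads, the disagreement is about how an application should replace a file so that a crash leaves either the old contents or the new ones. What follows is only a rough Python sketch of the sequence that the "applications must call fsync()" side is asking for; it is not taken from any of the applications discussed, and `atomic_write` is a hypothetical helper name. The directory sync at the end assumes a POSIX system.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a crash leaves the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename
    # never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Force the new contents to disk *before* the rename
            # makes them visible under the final name.
            os.fsync(tmp.fileno())
        # Swap the new file into place (rename(2) semantics on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself is durable (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Applications that skip the `os.fsync()` call before the rename are the ones that depended on ext3's ordered-mode behavior, which is the case the queued ext4 workarounds are meant to cover.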

Workarounds

Posted Mar 31, 2009 22:47 UTC (Tue) by man_ls (guest, #15091) [Link] (3 responses)

What other filesystems are you talking about? On ext2 and other filesystems without a journal, sure, users know the risks and live with them. But applications seem to work fine on most other journaling filesystems: ext3, reiserfs, hfs+, zfs, even xfs was fixed years ago. Cygwin on ntfs works fine.

There are few black and white issues, but a filesystem developer saying that corrupting user data is fine would seem to qualify. Later committing a fix to "work around" the problem while a hundred thousand developers fix their code is hardly enough. Technically, I am not even sure a public flogging would be enough.

And now, ladies and gentlemen, with your kind permission I will just call Ts'o a nazi in a half-assed invocation of Godwin's law to jump out of this discussion and go to sleep.

Workarounds

Posted Mar 31, 2009 23:52 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link] (2 responses)

Maybe FAT? What about filesystems beyond those in Linux? That's the point of POSIX. I don't see anyone ever claiming that losing data is fine. It is a much more nuanced debate than that, and I am sure you are aware of the issues very well so I won't bother repeating them again, but I still don't know why you think hundreds of applications have to be fixed when the patch that retains the ext3-like behavior has already been merged.

Workarounds

Posted Apr 1, 2009 0:04 UTC (Wed) by bojan (subscriber, #14302) [Link] (1 responses)

> Maybe FAT?

Actually, you don't even have to look at other file systems. ext3 in writeback mode is sufficient, because metadata can go to disk before data. You may end up with garbage in your files after the crash.

Workarounds

Posted Apr 1, 2009 6:52 UTC (Wed) by man_ls (guest, #15091) [Link]

Writeback mode? FAT?!? Please leave your (metaphorical) commit rights in the reception on your way out. Both of you.

Two more

Posted Apr 3, 2009 14:03 UTC (Fri) by anton (subscriber, #25547) [Link]

> Ted suggested that it was an application usage problem but added hacks to work around the issues anyway.
It's a question of trust. Do I trust my data to a file system whose developer has the attitude that Ted Ts'o has? Not if I have an alternative.

Two more

Posted Mar 31, 2009 9:38 UTC (Tue) by regala (guest, #15745) [Link] (4 responses)

Who's the arrogant one?
Who's been contributing since September 1991?

Two more

Posted Mar 31, 2009 9:47 UTC (Tue) by regala (guest, #15745) [Link] (3 responses)

guess that's not well put :/
What I'd like you to do is to think about what you said. I don't think anyone can say Ted was ever arrogant in these dreadful flame threads around Launchpad, Ubuntu and here on LWN. He's been quite understanding, never calling anybody anything while being insulted by an angry mob.
Would you please stop? He's not arrogant, you are. Even considering that Linus might start to mistrust his judgement is ridiculous.

Two more

Posted Mar 31, 2009 18:21 UTC (Tue) by man_ls (guest, #15091) [Link] (2 responses)

Have you ever had anyone say that your code is "badly written" because he understood a spec in a rather peculiar manner? That amply qualifies as an insult to me. Given that most people in the world understand the spec differently, it's not bad for arrogance either.

That reminds me of the old joke. A reckless driver on the highway is listening to the radio: "Attention, attention, there is a crazy man driving against the traffic on the highway", and he says: "One? All of 'em!"

Two more

Posted Mar 31, 2009 19:32 UTC (Tue) by nix (subscriber, #2304) [Link] (1 responses)

I'd call Ted one of the most charming and thoughtful people working in
kernel development (in fact, in free software development, period). I may
sometimes disagree with what he says, but he's *always* worth listening
to, and always well reasoned.

I wonder if you've been using the same Internet I have, really.

Good people behaving badly

Posted Mar 31, 2009 21:37 UTC (Tue) by man_ls (guest, #15091) [Link]

nix, I highly value your opinion, and Mr Ts'o can be a patron saint of the arts, but he has behaved like a jerk in this issue. Just look at his own E pur si muove:
This will cause a significant performance hit, but apparently some Ubuntu users are happy using proprietary Nvidia drivers, even if it means that when they are done playing World of Goo, quitting the game causes the system to hang and they must hard-reset the system. For those users, it may be that nodelalloc is the right solution for now — personally, I would consider that kind of system instability to be completely unacceptable, but I guess gamers have very different priorities than I do.
I probably got too carried away with the discussion (and my own indignation). Probably he did not mean to insult anyone, and he did express himself with manners. But this tirade is not well reasoned; it has a lot of holes and is in general a lot of rubbish. More's the pity if he is such a worthy individual as you say.

Two more

Posted Mar 31, 2009 14:02 UTC (Tue) by clugstj (subscriber, #4020) [Link] (8 responses)

Whether he is or is not arrogant is irrelevant. It is his competence that matters. If his code sucks, everyone is free to not use it. This is the power of freedom.

Two more

Posted Mar 31, 2009 15:40 UTC (Tue) by sbergman27 (guest, #10767) [Link] (7 responses)

"""
It is his competence that matters.
"""

There is competence, and there is judgment, and the two are distinct. I think that it is his judgment on this matter that is in question. I've been waiting for Linus to speak, and would be very interested in his view. Of course, the distros have the final say as to what the effective defaults are, even down to the patches they choose to apply. And *savvy* users have the ultimate decision as to the configuration of their systems. Unsavvy users, of course, are stuck with what they get.

Two more

Posted Mar 31, 2009 19:18 UTC (Tue) by sbergman27 (guest, #10767) [Link]

Apparently, Linus has spoken, and I missed it. And he does choose the word "incompetent":

========================
On Tue, 24 Mar 2009, Theodore Tso wrote:
>
> Try ext4, I think you'll like it. :-)
>
> Failing that, data=writeback for single-user machines is probably your
> best bet.

Isn't that the same fix? ext4 just defaults to the crappy "writeback"
behavior, which is insane.
Sure, it makes things _much_ smoother, since now the actual data is no
longer in the critical path for any journal writes, but anybody who thinks
that's a solution is just incompetent.

We might as well go back to ext2 then. If your data gets written out long
after the metadata hit the disk, you are going to hit all kinds of bad
issues if the machine ever goes down.

Linus

=======================

Where competence meets judgment

Posted Mar 31, 2009 19:18 UTC (Tue) by man_ls (guest, #15091) [Link] (5 responses)

I had the impression that Linus had already spoken against data loss, and he has indeed:
Sure, it makes things _much_ smoother, since now the actual data is no longer in the critical path for any journal writes, but anybody who thinks that's a solution is just incompetent.

We might as well go back to ext2 then. If your data gets written out long after the metadata hit the disk, you are going to hit all kinds of bad issues if the machine ever goes down.

Gods how I enjoyed that quote. And:
But I also think that the "we write meta-data synchronously, but then the actual data shows up at some random later time" is just crazy talk. That's simply insane. It _guarantees_ that there will be huge windows of times where data simply will be lost if something bad happens.

And expecting every app to do fsync() is also crazy talk, especially with the major filesystems _sucking_ so bad at it (it's actually a lot more realistic with ext2 than it is with ext3).

So look for a middle ground. Not this crazy militant "user apps must do fsync()" crap. Because that is simply not a realistic scenario.

And:
Doesn't at least ext4 default to the _insane_ model of "data is less important than meta-data, and it doesn't get journalled"?

And ext3 with "data=writeback" does the same, no?

Both of which are - as far as I can tell - total braindamage. At least with ext3 it's not the _default_ mode.

Linus is tha man.

Speed doesn't matter if you cannot trust it

Posted Mar 31, 2009 19:39 UTC (Tue) by oak (guest, #2786) [Link]

I have an insanely fast file system with a really simple design:
cat >/dev/null

If /dev/null writes aren't zero-copy, it's journaled too!

The window for data retrieval is (infinitely) small though.

Where competence meets judgment

Posted Mar 31, 2009 22:37 UTC (Tue) by bojan (subscriber, #14302) [Link] (3 responses)

> And expecting every app to do fsync() is also crazy talk, especially with the major filesystems _sucking_ so bad at it (it's actually a lot more realistic with ext2 than it is with ext3).

Major filesystems being "ext3 in ordered mode only", of course. The rest could be just fine with fsync(), as we can see above from his ext2 comment. And as Ted pointed out, ext4 doesn't have a big penalty on fsync(), because it doesn't have to flush out MBs of stuff that are unrelated to this particular fsync(), every time this system call is used.

Just as Linus says that ext4 is brain damaged for doing delayed allocation by default, so can it be claimed that ext3 is brain damaged for locking up people's machines for a few seconds on a perfectly reasonable system call: fsync(). We have seen this from the FF fiasco. In fact, when Linux says that having an interactive application do fsync() is impossible, he must mean on ext3 in ordered mode, because that's what the FF complaints were about. As Alan Cox and Ted pointed out, one can already do fsync() in another thread and be fully interactive.
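
The "fsync() in another thread" idea can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not code from Firefox or any other application mentioned here: `fsync_in_background` is a hypothetical helper, and it assumes the caller has finished writing to the descriptor and hands ownership of it to the worker.

```python
import os
import threading


def fsync_in_background(fd: int) -> threading.Thread:
    """Flush `fd` to disk on a worker thread, then close it.

    The caller must not use `fd` again after calling this function.
    """
    def worker() -> None:
        try:
            # The potentially slow call happens off the main thread,
            # so the user interface stays responsive meanwhile.
            os.fsync(fd)
        finally:
            os.close(fd)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The caller can keep the returned thread and `join()` it at a point where waiting is acceptable, for example on shutdown. Until the thread finishes, the data is not guaranteed to be on disk.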

As for configuration files of KDE (which is where the problem started), the library can trivially do backup of these files on startup and _never_ use fsync() after that. Other problems should probably be solved by a proper system call that does guarantee ordering (I think Ted provisionally called it fbarrier() or something). Then we'd have a real guarantee of the behaviour, instead of relying on whims of implementations.
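
The backup-on-startup suggestion might look something like the following Python sketch. It is not how the KDE libraries actually work; `load_config_with_backup` and the `.bak` suffix are hypothetical, and treating an empty file as damaged is an assumption made for the illustration.

```python
import os
import shutil


def load_config_with_backup(path: str) -> bytes:
    """Return the config contents, falling back to a backup copy.

    A non-empty file is copied to `path + ".bak"` at startup; an empty
    or missing file is treated as damaged and the backup is used.
    """
    backup = path + ".bak"
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        data = b""
    if data:
        # The file looks intact: refresh the backup for next time.
        # No fsync() here; the copy has the whole session to reach disk.
        shutil.copyfile(path, backup)
        return data
    try:
        with open(backup, "rb") as f:
            return f.read()
    except FileNotFoundError:
        # Neither file is usable: start from an empty configuration.
        return b""
```

The trade-off is that a crash can cost the changes made during the last session, but not the whole configuration.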

Claiming that rename() always did "data before metadata" commits is ahistorical. So, the crazy talk ain't that crazy after all. We just got caught with our pants down.

Surely, Linus is "tha man" when it comes to Linux and what he says will eventually go. But, removing any criticism from what he says is just arse licking, IMNSHO.

Where competence meets judgment

Posted Mar 31, 2009 22:43 UTC (Tue) by bojan (subscriber, #14302) [Link] (1 responses)

> when Linux says

Gee, he should have called it something else. It is impossible to get the man's name right after having "Linux" :-)

Where competence meets judgment

Posted Apr 12, 2009 7:59 UTC (Sun) by Duncan (guest, #6647) [Link]

You likely know this as I'm sure most Linux veterans do by now, and the
above "he should have called it something else" was simply a figure of
speech, but maybe the below will be new to the newbies at least.

Actually, "he" (Linus) did call it something else, "Freax". It was
Linus' colleague, the one who put it up on the ftp-site, who placed it in
a directory he named "linux", and so history was made.

(Just google freax linux for more. "I'm feeling lucky" does it for me.)

Duncan

Judgments must take users into account

Posted Apr 2, 2009 12:01 UTC (Thu) by renox (guest, #23785) [Link]

Even the best fsync on earth can take a long time if there's a lot of data to be written to the disk, so fsync is always a 'potentially time-consuming' operation.
Which means that, whatever the FS, if you must use fsync to get correct behaviour, then to avoid showing a freeze to the user you must go to the dreaded multi-threaded world.
Sure, the FS can provide a (Linux-specific) write barrier, but it's very likely that nobody will use it.

OR the other possibility is to use a FS which does the operations in order, which simplifies application programming a lot.
There may be a small performance cost, but somehow I doubt that users will care.
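
Anyone who wants to see how "potentially time-consuming" fsync is on their own machine could try a rough measurement along these lines. This is only a sketch: `time_fsync` is a hypothetical helper, the numbers depend heavily on the filesystem, mount options and hardware, and if the temporary directory is on tmpfs the call returns almost immediately and tells you nothing about a real disk.

```python
import os
import tempfile
import time


def time_fsync(num_bytes: int) -> float:
    """Write `num_bytes` to a scratch file and return seconds spent in fsync()."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"\0" * num_bytes)
        # Push Python's own buffer to the kernel first, so the timing
        # covers only the kernel-to-disk flush.
        tmp.flush()
        start = time.perf_counter()
        os.fsync(tmp.fileno())
        return time.perf_counter() - start
```

Comparing, say, `time_fsync(4096)` with `time_fsync(64 * 1024 * 1024)` gives a feel for how the cost grows with the amount of dirty data.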

Two more

Posted Mar 31, 2009 19:26 UTC (Tue) by nix (subscriber, #2304) [Link]

One minor point: GNU fileutils hasn't existed since about 2001. It's GNU
coreutils now (merged with what used to be sh-utils and textutils).

Two more

Posted Mar 31, 2009 19:28 UTC (Tue) by nix (subscriber, #2304) [Link] (1 responses)

What? He's written defences against the common instances of this data
loss: the remaining instances don't seem major to me (extending existing
files, for instance, is much rarer than writing out new ones).

Two more

Posted Apr 3, 2009 6:52 UTC (Fri) by efexis (guest, #26355) [Link]

"extending existing files, for instance, is much rarer than writing out new ones"

My system, apache and database replay log directories would disagree on that one.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds