That massive filesystem thread
Posted Apr 1, 2009 3:39 UTC (Wed) by ajross (guest, #4563) In reply to: That massive filesystem thread by bojan
Parent article: That massive filesystem thread
And to be fair, there's a difference in designing around "the odd crash here and there" and a 30 Second Window of Doom for every file creation.
Posted Apr 1, 2009 4:19 UTC (Wed)
by bojan (subscriber, #14302)
[Link] (9 responses)
What is your point here exactly? That I should not post because you may not like reading it? If you are a moderator of the site, please feel free to remove my post.
I make no apologies for my snideness - I think it was well deserved. Essentially, just because one file system does something in an idiotic way, we should now drop a perfectly good system call. Shouldn't we instead FIX what's broken so that all system calls and all file systems can be used as designed?
Similarly, we have seen heaps of new system calls introduced into Linux in recent times (dup3 and friends + other, backup related stuff from Ulrich Drepper), which all have to do with files. Why? Because they were needed. No complaints there. I thought the deal was that they would never get used? (see, being snide again).
> And to be fair, there's a difference in designing around "the odd crash here and there" and a 30 Second Window of Doom for every file creation.
And to be fair, there is a difference in designing around complete system lockups for a number of seconds and committing data when required.
Posted Apr 1, 2009 8:34 UTC (Wed)
by nix (subscriber, #2304)
[Link] (8 responses)
They're not really intended for use by everyman, anyway.
The problem with what one might call the fsync() RANDOMLY_LOSE option is that it is something which must be used by everyman to avoid data loss, which if you get it wrong there is no sign unless you lose power at exactly the right time, and which nearly all programs you might clap eyes on other than Emacs have historically got wrong, and which many utility programs *cannot* get right no matter what, because there's no way they can tell if the data they are operating on is 'important', and thus should be fsync()ed, or not. (Sure, you could add a new command-line option to tell them, but that option is not in POSIX so portable applications can't rely on it for a long long time).
That's a big difference.
Posted Apr 1, 2009 10:24 UTC (Wed)
by bojan (subscriber, #14302)
[Link] (6 responses)
You are kidding, right? dup3() is not for general use?
> That's a big difference.
Look, I'm not really bent on a particular mechanism of actually making sure that programmers have a reliable interface for doing this. Using fsync() before close() is the only portable solution now, but it is far from optimal. I think there is very little doubt about that. And we all know it sucks to high heaven on ext3 in ordered mode.
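For illustration only, here is a minimal Python sketch of the write, fsync(), close, rename() sequence discussed here. The helper name `atomic_write` is hypothetical, and the sketch assumes a POSIX system where `os.replace()` maps to rename(). It does not fsync the containing directory, which some setups would also want.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) via a temporary file.

    Hypothetical helper: the data is fsync()ed before the rename, so the
    rename should never expose a file whose contents are not on disk yet.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # move Python's buffer into the kernel
            os.fsync(tmp.fileno())   # ask the kernel to commit the data
        os.replace(tmp_path, path)   # rename() on POSIX: atomic replacement
    except BaseException:
        # Something failed before the rename completed: remove the leftover.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

The fsync() call is exactly the step that is slow on ext3 in ordered mode; dropping it gives the rename-only pattern that many applications use today.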
I don't know what the best way is: new call, some kind of flag to open that says O_ALWAYSDATABEFOREMETADATA, rename2(), close_with_magic() or whatever. But, saying that application programmers cannot grok this kind of stuff is just not true. They can and they will, only if given the tools. Just like they did dup3() and friends (and as you point out, there is little danger of misuse - these are new calls).
As I said many times before, overloading the current pattern with non-portable behaviour is dangerous, because it provides a false sense of robustness and ties one to a particular FS and kernel. If we can get POSIX updated so that rename() actually means "always data before metadata, but don't put on disk now", then it may even fly. But, I don't know how that's going to make guarantees retroactively, when even Linux features file systems that don't do that (e.g. ext3 in writeback mode).
Also, having things like delayed allocation, where metadata can legitimately be committed before data, is really useful. Most short lived temporary files will never see disk platters, therefore making things faster and disks last longer. Meaning, keeping the old cruft around ain't that bad.
As for utility programs that are called from scripts, you can use dd with conv=fsync or conv=fdatasync in your pipe to commit files to disk today. On FreeBSD, they already have a standalone fsync program for that. Yeah, I know. It sucks. But, your usual tools don't have to make any decisions on fsync()-ing - you can.
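A standalone fsync tool of the kind mentioned above can be approximated in a few lines. This is a sketch under assumptions, not FreeBSD's implementation: `fsync_path` is a made-up name, and it assumes the platform accepts fsync() on a descriptor opened read-only, as Linux does.

```python
import os


def fsync_path(path):
    """Ask the kernel to flush an existing file's data to stable storage.

    Hypothetical helper: opens the file read-only, calls fsync() on the
    descriptor, and closes it again. The file contents are not modified.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
```

A script could call `fsync_path("out.txt")` after a tool that does not sync on its own has finished writing, which leaves the decision with the caller rather than with the tool.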
Posted Apr 1, 2009 18:09 UTC (Wed)
by quotemstr (subscriber, #45331)
[Link] (5 responses)
Posted Apr 1, 2009 20:55 UTC (Wed)
by bojan (subscriber, #14302)
[Link] (4 responses)
Quite the opposite. I'm all for fixing bugs and giving application programmers the _right_ tools for the job. If some Linux developers took a second to lift their noses out of the specifics of Linux and actually looked around, this could be fixed for _everyone_, not just for some Linux specific file systems. That is my point, in case you didn't get it by now.
Posted Apr 1, 2009 21:37 UTC (Wed)
by man_ls (guest, #15091)
[Link] (3 responses)
Reading that Linus is not pulling from Mr Ts'o's trees made me suspicious. Well, now that Ts'o's commit rights have been officially revoked I think that the whole discussion is moot. I wonder if the next ext4 head maintainer will learn from this painful experience and just do the right thing.
Posted Apr 1, 2009 21:46 UTC (Wed)
by corbet (editor, #1)
[Link] (1 responses)
Maybe it's an April 1 post that went over my head?
Posted Apr 2, 2009 6:21 UTC (Thu)
by man_ls (guest, #15091)
[Link]
Will try to do better next time :D)
Posted Apr 1, 2009 22:38 UTC (Wed)
by bojan (subscriber, #14302)
[Link]
> Why invent a new system call which cannot (by necessity) be honored by ext2, or ext4 without a journal?
Even if there was some kind of magical law that said that you could not order commits on a non-journaled file system this way, it can always be trivially implemented through - wait for it - fsync(), which has acceptable performance characteristics on such file systems.
> Everything is working now fine in ext3
Sure. Except fsync(), which locks the whole system for a few seconds. Hopefully, this will get fixed (or at least its effect reduced) as a result of the hoopla.
> Well, now that Ts'o's commit rights have been officially revoked I think that the whole discussion is moot.
Now you are really making a fool of yourself.
Posted Apr 2, 2009 23:16 UTC (Thu)
by anton (subscriber, #25547)
[Link]
Posted Apr 1, 2009 5:31 UTC (Wed)
by ncm (guest, #165)
[Link] (16 responses)
Posted Apr 1, 2009 6:07 UTC (Wed)
by bojan (subscriber, #14302)
[Link] (7 responses)
What exactly is not polite about that? Is sarcasm now verboten on LWN? I see plenty of it. Daily.
In a post not so long ago, someone accused me of hiding behind Ted's authority (although I actually used documentation to support my case - which many don't bother to read, of course). This time, I point out what to me is nonsense coming from an even bigger authority, but that's no good either. I'm not sure what position of mine would satisfy fragile sensibilities here. Only silence, I guess.
This time I was being accused of making snide remarks. So, I replied to ajross using his terminology, although I do not actually agree with that qualification (which you can see from my sarcastic: "see, being snide again" remark) and I should have used "so called snideness" in my reply instead. I am really just being sarcastic, because we are all supposed to rally behind the high priest or something.
Sure, Linus is a genius, but that doesn't mean that whatever he says is beyond criticism. And, I do not see how I am not being polite by exercising criticism with a hint of sarcasm.
What is it exactly that you have the issue with in my posts? What exactly is impolite?
Posted Apr 1, 2009 7:54 UTC (Wed)
by khim (subscriber, #9252)
[Link] (3 responses)
Nope. You are being 100% smart-ass. Linus's reality check is not inconsistent. It's a description of reality and reality is not consistent. When was it ever? You have different factors and in different but quite real situations different factors prevail. That's a different facet of reality. When you consider reality from the kernel developer's POV, what the applications are doing is your "unchangeable fact", your "speed of light"; when you consider reality from the application developer's POV, what the kernel does is the "unchangeable fact" and you should deal with it. This is true even if the kernel developer and the application developer are the same person. You can only think differently if your application is designed to only be used "in-house" and you can always guarantee control over both kernel and userspace - and git was not designed to only be used "in-house"...
You are exercising ignorance with a hint of sarcasm. That's different.
Posted Apr 1, 2009 8:29 UTC (Wed)
by bojan (subscriber, #14302)
[Link] (2 responses)
Let me review.
When another Unix kernel (or Linux) holds your data in buffers and commits metadata only (because it is allowed to), you, as an application developer, deal with it by ignoring that fact.
And, when your file system does crazy things with the perfectly good system call, you also ignore it as a kernel developer.
WOW, is that now the new "very special relativity"? We pick whichever behaviour is the most narrow to a specific file system and go with that?
Posted Apr 1, 2009 14:22 UTC (Wed)
by drag (guest, #31333)
[Link] (1 responses)
POSIX allows you never to write data to disk at all. That will make your file system very fast. After all you can have a POSIX-compliant file system that operates off of ramdisk quite easily.
POSIX file system access is designed to describe the interface layer between userland and the file system. It leaves the actual integration between the file system and the hardware, as well as the internals of the file system itself, up to the developer of the OS.
It is as if you suddenly discovered that a network service provided by an Apache-based web app uses SSL badly so that all usernames and passwords are transmitted over the Web in plain text... then you complain about it and the developer says back to you that his application's behavior is allowed by TCP/HTTP/SSL and that you should be changing your password with each usage, like people who use his app correctly do. Then he emails you some documentation from a security expert that says you should change your password frequently and that many other protocols like telnet or ftp send your username and password over the network in plain text.
Posted Apr 1, 2009 16:10 UTC (Wed)
by foom (subscriber, #14868)
[Link]
Posted Apr 2, 2009 23:17 UTC (Thu)
by xoddam (guest, #2322)
[Link] (1 responses)
I plead guilty and I apologise. That was immediately after replying to someone else's post the gist of which was "Ted wrote ext2 and ext3 in the first place, he is therefore above criticism." It concluded with the words "Know your place", which got me riled.
[proverb: in the midst of great anger, never answer anyone's letter]
Your words were not so condescending but they had much the same emphasis: all ur filesystems are belong to POSIX (not users) 'cos POSIX is the law, and by the way Ted's interpretation is the only correct one because he's the primary implementor.
I hope you understand where I was coming from. Forgive me.
Posted Apr 2, 2009 23:56 UTC (Thu)
by bojan (subscriber, #14302)
[Link]
Posted Apr 8, 2009 0:05 UTC (Wed)
by jschrod (subscriber, #1646)
[Link]
But your self-righteousness doesn't allow you to understand this, obviously. Luckily, there are still some discussion threads where you don't try to take over. I hope the likes of you will remain few on LWN in the future; this is not Slashdot, after all.
Posted Apr 1, 2009 15:46 UTC (Wed)
by GreyWizard (guest, #1026)
[Link] (7 responses)
People get nasty in the comments here all the time. If there's something beautiful and fragile here it's already in a thousand jagged pieces. But people hector one another about being polite all the time too. That also wrecks the signal-to-noise ratio and solves nothing.
Posted Apr 4, 2009 9:05 UTC (Sat)
by jospoortvliet (guest, #33164)
[Link] (6 responses)
Living in a country where that mode of thinking is the norm, I can tell
A little decency now and then doesn't hurt. I know people who, knowing how
Posted Apr 5, 2009 3:34 UTC (Sun)
by GreyWizard (guest, #1026)
[Link] (5 responses)
But saying "be polite you jerk" merely drags things even further down into the muck.
Posted Apr 5, 2009 12:43 UTC (Sun)
by jospoortvliet (guest, #33164)
[Link] (4 responses)
First of all, some people don't notice their behavior is unnecessarily impolite. Pointing it out can help them (if they are willing to be reasonable in the first place). Never pointing out somebody's failures will make them fail forever.
Second, it shows you care about being polite. If others show they care too, a culture of 'you should be polite' can be maintained. As you might have noticed from the differences between FOSS communities, culture is important and heavily influential. And it can be changed.
Some things to note:
Posted Apr 5, 2009 15:42 UTC (Sun)
by GreyWizard (guest, #1026)
[Link] (3 responses)
A truly polite request for more courtesy might help but it's difficult to be sure because such things are quite rare. Giving in to the temptation to scold even just a little makes the comment worse than useless. Unless you are absolutely certain you can do it right it's better to focus on substantive issues and avoid appointing yourself a courtesy cop.
Posted Apr 5, 2009 16:20 UTC (Sun)
by jospoortvliet (guest, #33164)
[Link] (2 responses)
Posted Apr 5, 2009 16:27 UTC (Sun)
by GreyWizard (guest, #1026)
[Link] (1 responses)
Posted Apr 5, 2009 17:11 UTC (Sun)
by jospoortvliet (guest, #33164)
[Link]
On re-reading the thread, I think you are right in that ajross was more impolite than bojan, which often leads to a downward spiral and isn't helpful... bojan's post wasn't that far off from the normal tone on this site.
Anyway. This went pretty far off-topic, and I think we mostly agree. As far as we don't, we at least agree on that ;-)
That massive filesystem thread
By your logic, we should never fix bugs. Remember the 25 year old readdir bug? Don't you agree it was good to fix that? What if a program, somewhere, depended on that behavior?
In reality, programs use rename for atomic replacement. POSIX doesn't say anything about guarantees after a hard system crash, and it's just disingenuous to think that by punishing application authors by giving them as little robustness as possible, you're doing them some kind of portability favor.
That massive filesystem thread
It is a worthless effort. Each filesystem must keep its house clean. Why invent a new system call which cannot (by necessity) be honored by ext2, or ext4 without a journal? Everything is working now fine in ext3, and if it doesn't work right in ext4 people will just look for a different filesystem.
That massive filesystem thread
I'm confused. The article said that Ted's trees had not been pulled yet. In fact, that happened today; a bunch of ext4 work went into the mainline, including a number of patches which increase robustness for applications which don't use fsync(). I dunno what you were trying to link to, but it didn't work. I've not seen anything about revocation of commit rights. (It's hard to "revoke commit rights" in a distributed system in any case; at worst you can refuse to pull from somebody else's repository.)
ext4 trees
Sorry, it was a stupid attempt from a foreigner at an April Fools' prank :D I was hoping that the recursive link would give it away, but maybe it was too plausible altogether.
Recursive linking
That massive filesystem thread
> The problem with what one might call the fsync() RANDOMLY_LOSE option is that it is something which must be used by everyman to avoid data loss, which if you get it wrong there is no sign unless you lose power at exactly the right time, and which nearly all programs you might clap eyes on other than Emacs have historically got wrong
s/other than/including/. However, I don't agree that this application behaviour is wrong; if the application wants to jump through hoops to get a little bit of extra safety on low-quality file systems, that's ok, but if it doesn't, that's also ok. It's up to the users to choose which applications they run and on which file system.
The end of LWN comment dialog?
Yup. It's the beginning of the end.
> If you read my original post in this thread, you will find that I am pointing at inconsistencies of what Linus describes as reality check.
> So, I ridicule (among other things) his conclusion that: ext3 sucks at doing fsync(), hence we should drop fsync().
> And, I do not see how I am not being polite by exercising criticism with a hint of sarcasm.
Yup. It's the beginning of the end.
of the other article's threads. I'd like to suggest that it might be in everyone's interest to move on to more useful pastimes than rehashing the same arguments over and over again every time there's an update on the subject.
sticks & stones
The end of LWN comment dialog?
with what comes out, it's on their own plate.
you it also has disadvantages... If only because the resulting hurt feelings can muddy the discussion more than you might think. Besides, it chases people away who would otherwise have contributed constructively - it's not acceptable behavior in all cultures. Ever wondered why the FOSS community is still predominantly western, despite many smart developers in countries like India?
blunt they can be, ask someone else to read certain emails before sending them. After all, reality is that people DO have feelings.
The end of LWN comment dialog?
- people DO care about what others think of them. No matter how much they scream 'no I don't', they do. It is our nature.
- people should know their arguments are not supported by being mean - it is the other way around.
- I agree that a 'be polite you jerk' might not always be the best way to correct someone. A personal mail can do more. However, it won't show up in public (unless an apology is made), thus it does not do much to influence others who might think it is acceptable behavior because the guy got away with it. Of course, giving a good example is better than anything else.
- Of course discussing without end whether somebody was polite enough or not muddies the discussion and lowers the SNR.
The end of LWN comment dialog?