
ordered(tm) brand

Posted Mar 16, 2009 0:09 UTC (Mon) by njs (subscriber, #40338)
In reply to: ordered(tm) brand by szh
Parent article: Garrett: ext4, application expectations and power management

ext3's "ordered" mode does *not* have the best semantics; it (accidentally) makes very strong ordering guarantees -- in fact, much stronger than are needed for atomic-rename -- and the consequence of those strong guarantees is that fsync() becomes unbearably slow.

One direct consequence is that on ext3, Firefox *cannot* guarantee the safety of, e.g., your browsing history -- there is no way to do it that is fast enough for users to put up with (they tried). ext4 plus Ted's flush-on-rename patch provides all the useful parts of ext3's semantics, while also making fsync() fast and thus putting *less* data at risk than ext3.



ordered(tm) brand

Posted Mar 16, 2009 2:38 UTC (Mon) by mjg59 (subscriber, #23239)

Well, no, if it flushes on rename then I disagree that it provides all the useful parts of ext3's semantics.

ordered(tm) brand

Posted Mar 16, 2009 4:45 UTC (Mon) by njs (subscriber, #40338)

"flush-on-rename" is probably a misleading description (I just don't know a better short phrase for it). From Ted's blog post, it seems pretty clear that ext4+patch gives the same behavior for write-then-rename as ext3, i.e., it ensures that the new file data lands on the disk at the same time as the new metadata (whenever that ends up being). So... flush-on-rename-taking-effect, or flush-from-virtual-pages-to-allocated-pages or something like that.

(I think that answers your objection; it's a bit terse.)
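
For readers unfamiliar with the write-then-rename pattern under discussion, here is a minimal Python sketch. The function name `replace_file` and the `durable` flag are made up for illustration; nothing here is taken from Firefox or from Ted's patches. With `durable=False` the code relies on the filesystem to order data before the rename, which is exactly the behavior being debated; with `durable=True` it pays for an fsync() instead.

```python
import os
import tempfile


def replace_file(path, data, durable=True):
    """Replace `path` with `data` using write-then-rename.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            if durable:
                # Force the new contents to disk before the rename.
                os.fsync(tmp.fileno())
        # Atomically point `path` at the new contents.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Calling `replace_file("history.db", b"...", durable=False)` is the cheap variant: a reader never sees a half-written file, but whether a crash can leave an empty one depends on the filesystem's ordering guarantees.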

ordered(tm) brand

Posted Mar 16, 2009 13:24 UTC (Mon) by nye (guest, #51576)

I didn't interpret Ted's post the way you did, though now I've read through it again I can see that you may be right. Specifically, Ted says:
>These three patches (with git id’s bf1b69c0, f32b730a, and 8411e347) will cause a file to have any delayed allocation blocks to be allocated immediately when a file is replaced.

I interpreted that to mean that those blocks would be written to disk as if fsync() had been used, but is that incorrect?

Am I correct in believing that your interpretation is as follows:
When a file is replaced, it is not marked for delayed allocation, so its data will be written immediately before its metadata, *whenever that happens to be*. In other words, the disk will be spun up only at the same time it would have been without those patches, but now the data will be written in addition to the metadata.

If that's correct, then it looks like the right resolution to me.

ordered(tm) brand

Posted Mar 16, 2009 21:29 UTC (Mon) by njs (subscriber, #40338)

Yeah, Ted's speaking filesystemdeveloperese there, but if you read other comments about the difference between ext3 and ext4, there's enough context to decipher it. What he's basically saying is that at rename time they will now force the kernel to *decide* which disk blocks the data will eventually end up on (that's what "allocation" means). It is then a pre-existing rule that before the metadata transaction commits (i.e., when the rename hits the disk), all "allocated" blocks have to be written out to disk.

So your interpretation is correct.
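
The rule described above can be sketched as a toy model. This is only an illustration of the interpretation given in this thread, not of ext4's actual code; the class `ToyFS`, its methods, and the `allocate_on_rename` switch are all invented for the example.

```python
class ToyFS:
    """Toy model only; real filesystems are far more involved."""

    def __init__(self, allocate_on_rename):
        self.allocate_on_rename = allocate_on_rename
        self.disk_names = {}    # name -> file id, as recorded on disk
        self.disk_data = {}     # file id -> bytes that reached the disk
        self.mem_names = {}     # in-memory view of the directory
        self.mem_data = {}      # file id -> bytes in the page cache
        self.allocated = set()  # file ids whose blocks have been chosen
        self.next_id = 0

    def write_new(self, name, data):
        fid = self.next_id
        self.next_id += 1
        self.mem_names[name] = fid
        self.mem_data[fid] = data  # delayed allocation: no blocks chosen yet

    def writeback(self):
        # The lazy background flush: allocate and write everything.
        for fid, data in self.mem_data.items():
            self.allocated.add(fid)
            self.disk_data[fid] = data

    def rename(self, src, dst):
        fid = self.mem_names.pop(src)
        replacing = dst in self.mem_names
        self.mem_names[dst] = fid
        if replacing and self.allocate_on_rename:
            self.allocated.add(fid)  # choose blocks now; write nothing yet

    def commit(self):
        # Assumed pre-existing rule: allocated data goes out before metadata.
        for fid in self.allocated:
            self.disk_data[fid] = self.mem_data[fid]
        self.disk_names = dict(self.mem_names)

    def after_crash(self, name):
        fid = self.disk_names.get(name)
        if fid is None:
            return None
        return self.disk_data.get(fid, b"")


def crash_after_replace(allocate_on_rename):
    fs = ToyFS(allocate_on_rename)
    fs.write_new("cfg", b"old")
    fs.writeback()
    fs.commit()                  # the old file is safely on disk
    fs.write_new("cfg.tmp", b"new")
    fs.rename("cfg.tmp", "cfg")
    fs.commit()                  # journal commits before any writeback
    return fs.after_crash("cfg")
```

In this model, `crash_after_replace(False)` returns `b""` (the rename reached the disk but the data did not), while `crash_after_replace(True)` returns `b"new"`. Note that neither case touches the disk any earlier than the commit would have anyway.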

