
Ts'o: Delayed allocation and the zero-length file problem

Posted Mar 16, 2009 16:50 UTC (Mon) by masoncl (subscriber, #47138)
In reply to: Ts'o: Delayed allocation and the zero-length file problem by njs
Parent article: Ts'o: Delayed allocation and the zero-length file problem

The btrfs data=ordered implementation is different from ext3/4 and reiserfs. It decouples data writes from the metadata transaction, and simply updates the metadata for file extents after the data blocks are on disk.

This means the transaction commit doesn't have to wait for the data blocks because the metadata for the file extents always reflects extents that are actually on disk.

When you rename one file over another, the destination file is atomically replaced with the new file. The new file is fully consistent with the data that has already been written, which in the worst case means it has a size of zero after a crash.

I hope that made some kind of sense. At any rate, 2.6.30 will have patches that make the rename case work similarly to the way ext3 does today. Files that have been through rename will get flushed before the commit is finalized (+/- some optimizations to skip it for destination files that were from the current transaction).
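For readers who want to avoid depending on filesystem-specific rename behavior, the usual application-side pattern is to flush the new file explicitly before renaming it over the old one. The following is a minimal sketch of that pattern, not anything taken from the btrfs patches; the function name `replace_file_durably` and the `.tmp-` prefix are made up for illustration, and the directory fsync assumes a POSIX system.

```python
import os
import tempfile


def replace_file_durably(path, data):
    """Replace `path` with `data` (bytes) via write, fsync, rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the kernel to put the data blocks on disk before the
            # rename makes the new name visible.
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Flush the directory entry so the rename itself survives a crash
    # (POSIX only; opening a directory like this fails on Windows).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With the explicit `fsync`, the application no longer relies on the filesystem flushing renamed files at commit time, at the cost of an extra synchronous write per update.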


Ts'o: Delayed allocation and the zero-length file problem

Posted Mar 16, 2009 21:23 UTC (Mon) by njs (subscriber, #40338)

...Is what you're saying that for btrfs, metadata about extents (like disk location and checksums, I guess) is handled separately from metadata about filenames, and traditionally only the former had data=ordered-style guarantees? (Just trying to see if I understand.)

Ts'o: Delayed allocation and the zero-length file problem

Posted Mar 16, 2009 22:51 UTC (Mon) by masoncl (subscriber, #47138)

That's correct. The main point behind data=ordered is to make sure that if you crash you don't have extent pointers in the file pointing to extents that haven't been written since they were allocated.

Without data=ordered, after a crash the file could have garbage in it, or bits of old files that had been deleted.
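The invariant described above can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions, not how btrfs is implemented: the class `ToyOrderedFS` and all of its methods are hypothetical, each write is a single block, and file creation is treated as committed immediately.

```python
class ToyOrderedFS:
    """Toy model: extent pointers are published only after data is on disk."""

    def __init__(self):
        self.disk = {}       # block id -> bytes that are durably on "disk"
        self.extents = {}    # file name -> block ids (committed metadata)
        self.in_flight = []  # (file name, block id, data) not yet on disk
        self.next_block = 0

    def create(self, name):
        # Assumption: creating the (empty) file is committed right away.
        self.extents[name] = []

    def write(self, name, data):
        # Allocate a block and queue the data; metadata is NOT updated yet.
        block = self.next_block
        self.next_block += 1
        self.in_flight.append((name, block, data))

    def complete_io(self):
        # Data reaches disk first, and only then is the extent pointer added.
        for name, block, data in self.in_flight:
            self.disk[block] = data
            self.extents[name].append(block)
        self.in_flight = []

    def crash(self):
        # Anything still in flight is lost; committed metadata survives.
        self.in_flight = []

    def read(self, name):
        # Every committed extent is guaranteed to exist in self.disk.
        return b"".join(self.disk[b] for b in self.extents[name])
```

In this model a crash before `complete_io()` leaves the file empty rather than pointing at unwritten blocks, which corresponds to the zero-size worst case discussed earlier in the thread.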

Ts'o: Delayed allocation and the zero-length file problem

Posted Mar 16, 2009 22:56 UTC (Mon) by njs (subscriber, #40338)

That makes sense. Thanks.

Ts'o: Delayed allocation and the zero-length file problem

Posted Apr 7, 2009 22:27 UTC (Tue) by pgoetz (guest, #4931)

"When you rename one file over another, the destination file is atomically replaced with the new file. The new file is fully consistent with the data that has already been written, which in the worst case means it has a size of zero after a crash."

Sorry, this doesn't make any sense. Atomicity in this context means that when executing a rename, you always get either the old data (exactly) or the new data. Your worst case scenario -- a size of zero after a crash -- precisely violates atomicity.

For the record, the first two paragraphs are equally mysterious: "This means the transaction commit doesn't have to wait for the data blocks...". Um, is the data ordered or not? If you commit the transaction -- i.e. update the metadata -- before the data blocks are committed, then the operations are occurring out of order and ext4 open-write-close-rename mayhem ensues.

