
ext4 and data loss

Posted Mar 13, 2009 0:58 UTC (Fri) by quotemstr (subscriber, #45331)
In reply to: ext4 and data loss by bojan
Parent article: ext4 and data loss

Data-before-rename isn't just an fsync when rename is called. That's one way of implementing a barrier, but far from the best. Far better would be to keep track of all outstanding rename requests, and flush the data blocks for the renamed file before the rename record is written out. The actual write can happen far in the future, and these writes can be coalesced.

Say you're updating a few hundred small files. (And before you tell me that's bad design: I disagree. A file system is meant to manage files.) If you were to fsync before renaming each one, the whole operation would proceed slowly: you'd need to wait for the disk to finish writing each file before moving on to the next, creating a very stop-and-go dynamic.
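For concreteness, the fsync-before-rename pattern being discussed looks roughly like the sketch below. The helper name replace_with_fsync and the ".tmp" suffix are made up for illustration and not taken from any particular application; os.replace is used because it maps to rename() on POSIX systems.

```python
import os

def replace_with_fsync(path, data):
    """Replace `path` with `data` (bytes), forcing the data to disk first."""
    tmp = path + ".tmp"  # hypothetical temp-file naming scheme
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer down to the kernel
        os.fsync(f.fileno())   # block until the kernel reports the data written
    os.replace(tmp, path)      # rename over the old file only after the fsync

def update_many(files):
    """Update each file in turn; every call waits on its own fsync."""
    for path, data in files.items():
        replace_with_fsync(path, data)
    return len(files)
```

Calling update_many on a few hundred entries issues one blocking fsync per file, which is the stop-and-go behaviour described above.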

On the other hand, if you write and rename all these files without an fsync, when the commit interval expires, the filesystem can pick up all these pending renames and flush all their data blocks at once. Then it can write all the rename records at once, greatly improving the overall running time of the operation.

The whole thing is still safe because if the system dies at any point, each of the 200 configuration files will either refer to the complete old file or the complete new file, never some NULL-filled or zero-length strangelet.
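The ordering I have in mind (all data blocks first, then all rename records) can be mimicked in userspace with a toy model like the one below. This is only an analogy under stated assumptions: the class PendingRenames and its methods are hypothetical, and a real filesystem would make the rename visible to applications immediately and merely order the on-disk writes, whereas this sketch delays the rename itself until commit.

```python
import os

class PendingRenames:
    """Toy model: queue renames, flush all data first, then apply the renames."""

    def __init__(self):
        self._pending = []  # list of (temp path, final path) pairs

    def write(self, path, data):
        """Write `data` (bytes) to a temp file and queue the rename; no fsync yet."""
        tmp = path + ".tmp"  # hypothetical temp-file naming scheme
        with open(tmp, "wb") as f:
            f.write(data)
        self._pending.append((tmp, path))

    def commit(self):
        """Flush every queued file's data, then perform every queued rename."""
        # Phase 1: data blocks. Nothing has been renamed yet, so the old
        # files are still intact if we die partway through.
        for tmp, _ in self._pending:
            fd = os.open(tmp, os.O_WRONLY)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
        # Phase 2: rename records. Each new file's data has already been flushed.
        for tmp, final in self._pending:
            os.replace(tmp, final)
        count = len(self._pending)
        self._pending.clear()
        return count
```

At every point in commit(), each final path names either the complete old file or a new file whose data was flushed in phase 1, which is the property claimed above. How much time this saves over per-file fsync depends on the filesystem and is not something the sketch demonstrates.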



ext4 and data loss

Posted Mar 13, 2009 1:16 UTC (Fri) by bojan (subscriber, #14302)

> And before you tell me that's bad design: I disagree. A file system is meant to manage files.

I don't think that's bad design either. It is very useful to build an XML tree from many small files (e.g. gconf), instead of putting everything into one big one, which, if corrupted, will bring everything down.

> The whole thing is still safe because if the system dies at any point, each of the 200 configuration files will either refer to the complete old file or the complete new file, never some NULL-filled or zero-length strangelet.

I think that's the bit Ted was complaining about. It is unusual that changes to hundreds of configuration files would have to be done all at once. Users usually change a few things at a time (which would then be OK with fsync), so this must be some kind of automated thing doing it.

But, yeah, I understand what you're getting at in terms of performance of many fsync calls in a row.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds