The 2.4.20 ext3 corruption bug

Shortly after the release of the 2.4.20 stable kernel, word got out that there was a bug which could lead to corruption on ext3 filesystems. This particular bug will not affect all that many users: to be bitten, one must (1) use the non-default data=journal option, and (2) unmount the filesystem after making changes, but before those changes are synced to disk. Nonetheless, filesystem corruption is not a good feature to include in a stable kernel release.

2.4.20 users who wish to be protected from this bug should apply this patch from Andrew Morton. Andrew also includes some information on how the bug came to be. The trouble, it seems, comes from a longstanding confusion between two operations:

  • Flushing data to a filesystem to get it out of main memory, and

  • Fully synchronizing a filesystem to get it into a consistent, current state on disk.

The write_super() filesystem operation once performed the second operation above. A full sync, however, requires waiting for all of the I/O operations to complete. Most of the time, that is not what the kernel wants to do; it simply wants to get dirty buffers headed toward the disk sometime soon. So the ext3 write_super() method was made asynchronous, as a way of increasing performance. After another tweak went in, however, the lack of synchronization allowed the filesystem to be unmounted before the data actually made it to disk. And that, of course, led to corruption.

The solution is to properly separate the two operations. So Andrew's patch adds a new sync_fs() operation, which writes all of the filesystem's dirty data to disk and does not return until the job is done. With this patch in place, write_super() can safely be an asynchronous flush operation; kernel code which needs to be sure that everything has been written out will use sync_fs() instead.
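The difference between the two operations can be illustrated with a toy model. This is only a sketch in Python, not kernel code: apart from the names write_super() and sync_fs(), everything here (the ToyFilesystem class, its dictionaries, complete_io(), demo()) is hypothetical and exists only for illustration. It also ignores the journal entirely, modeling nothing more than writes that have been queued but have not yet reached the disk.

```python
class ToyFilesystem:
    """Toy model of a filesystem with dirty in-memory blocks; not kernel code."""

    def __init__(self):
        self.dirty = {}      # blocks changed in memory, not yet queued for I/O
        self.in_flight = {}  # writes queued to the device but not yet complete
        self.on_disk = {}    # what the disk actually holds

    def write(self, block, data):
        """Modify a block in memory only."""
        self.dirty[block] = data

    def write_super(self):
        """Asynchronous flush: queue the dirty blocks and return immediately."""
        self.in_flight.update(self.dirty)
        self.dirty.clear()

    def complete_io(self):
        """Stand-in for the device finishing every queued write."""
        self.on_disk.update(self.in_flight)
        self.in_flight.clear()

    def sync_fs(self):
        """Full sync: start the writes, then wait until they are on disk."""
        self.write_super()
        self.complete_io()

    def unmount(self, use_sync_fs):
        """Return the disk contents at unmount; unfinished writes are lost."""
        if use_sync_fs:
            self.sync_fs()
        else:
            self.write_super()
        return dict(self.on_disk)


def demo(use_sync_fs):
    """Make one change, then unmount right away."""
    fs = ToyFilesystem()
    fs.write(1, "journal data")
    return fs.unmount(use_sync_fs)
```

In this model, demo(False) unmounts after only an asynchronous flush and returns an empty disk, while demo(True) waits for the I/O and returns the written block. The real bug is more subtle than lost in-flight writes, but the underlying distinction is the same: code that must know the data is on disk cannot rely on an operation that only starts the writes.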

Andrew has also posted a version of the patch for the 2.5 kernel. It is a more extensive change (though the patch is still small) in that it tries to improve performance by getting all sync operations going before waiting for any of them.
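The benefit of starting every sync before waiting for any of them can be seen with some simple arithmetic. The sketch below is a hypothetical illustration, not the 2.5 patch itself: the function names and the per-filesystem I/O times are invented, and it assumes the filesystems live on independent devices so that their writes overlap completely.

```python
def sync_one_at_a_time(io_times):
    """Start each filesystem's sync and wait for it before starting the next."""
    clock = 0.0
    for t in io_times:
        clock += t  # nothing else is in progress while we wait
    return clock


def sync_start_all_then_wait(io_times):
    """Start every sync first, then wait for each one in turn."""
    clock = 0.0
    # Pass 1: every sync is started at time zero (assumes independent devices).
    finish_times = [clock + t for t in io_times]
    # Pass 2: waiting on a sync that has already finished costs nothing.
    for finish in finish_times:
        clock = max(clock, finish)
    return clock
```

With assumed I/O times of 3, 1, and 2 seconds, the first approach takes the sum (6 seconds) while the second takes only as long as the slowest filesystem (3 seconds). On shared devices the writes would compete with each other, so the real gain would be smaller than this idealized model suggests.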



Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds