
A nasty file corruption bug - fixed


Posted Jan 2, 2007 4:51 UTC (Tue) by iabervon (subscriber, #722)
Parent article: A nasty file corruption bug - fixed

This, of course, leaves out three-quarters of the story, in which quite a few people, including Linus, found various things which were confusing or were actual bugs, but weren't the real issue. There was quite a bit of argument about whether dirty bits on pages or page tables were getting lost in complicated situations in the VM (including Linus finding something that probably was a bug, and probably would cause the right sort of corruption, but fixing it didn't solve the problem), but that turned out not to be the issue at all.

I'm not sure I completely follow what was going on, but I think it's a bit more subtle than the article concludes. If the PTE is already dirty, further writes don't lead to set_page_dirty() being called. But the filesystem may clean the buffer heads after the PTE is first marked dirty and before later writes land. Then, when the page is finally written out, the buffer heads are already marked clean, so they're skipped. Linus finally found that, when the bug triggered, the kernel was deciding to write out the page at a point where there was no activity, and then doing nothing because all of the buffer heads were clean.
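The sequence described above can be walked through with a toy model. This is only a sketch under stated assumptions, not kernel code: the `Page` class, its method names other than `set_page_dirty`, and the choice of four buffers per page are all made up for illustration, and the model assumes the dirtying hook marks every buffer on the page dirty.

```python
class Page:
    """Toy model of one mmap'ed page backed by several buffer heads."""

    def __init__(self, nbuffers=4):
        self.nbuffers = nbuffers
        self.pte_dirty = False
        self.memory = [0] * nbuffers        # what the application sees
        self.disk = [0] * nbuffers          # what has reached the disk
        self.bh_dirty = [False] * nbuffers  # per-buffer dirty flags

    def set_page_dirty(self):
        # Model assumption: dirtying the page marks every buffer dirty.
        self.bh_dirty = [True] * self.nbuffers

    def app_write(self, index, value):
        self.memory[index] = value
        if not self.pte_dirty:
            # Only a write through a clean PTE reaches the dirtying hook.
            self.pte_dirty = True
            self.set_page_dirty()

    def fs_write_buffer(self, index):
        # The filesystem writes a single buffer, clearing its flag first.
        if self.bh_dirty[index]:
            self.bh_dirty[index] = False
            self.disk[index] = self.memory[index]

    def vm_writeout(self):
        # The VM writes the page out: clean buffers are skipped.
        self.pte_dirty = False
        for i in range(self.nbuffers):
            self.fs_write_buffer(i)


def lost_write_scenario():
    page = Page()
    page.app_write(0, 1)              # first write: PTE and buffers go dirty
    for i in range(page.nbuffers):    # filesystem cleans the buffers itself
        page.fs_write_buffer(i)
    page.app_write(1, 2)              # PTE already dirty: hook is not called
    page.vm_writeout()                # every buffer looks clean: nothing written
    return page


if __name__ == "__main__":
    page = lost_write_scenario()
    print("memory:", page.memory)
    print("disk:  ", page.disk)
```

In this model the second write stays in memory but never reaches the disk, and after the final writeout nothing is left marked dirty, so no later pass would pick it up either.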

(Linus had previously thought the issue was that, somewhere, a dirty bit was getting cleared when I/O was completed rather than when I/O started. If you clear the dirty bit when I/O is completed, you'd lose any writes which happen during I/O. But he couldn't find anywhere this was happening, because the real issue was different.)



A nasty file corruption bug - fixed

Posted Jan 2, 2007 5:51 UTC (Tue) by rganesan (subscriber, #1182) [Link]

I agree with this comment that the article does not tell the full story. In particular, I don't think the statement "When the I/O is complete, the filesystem clears the dirty flag in the bh." is correct. I believe the filesystem clears the dirty flag in the bh when the I/O is started.

A nasty file corruption bug - fixed

Posted Jan 5, 2007 20:28 UTC (Fri) by riel (subscriber, #3142) [Link]

You are correct. Dirty bits are cleared when I/O is started, so the application can dirty the page again while the disk I/O happens, without the kernel forgetting that the page was dirtied again.
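A small sketch of why the ordering matters, assuming a single dirty flag and a single in-flight write. The function `run_io` and its variables are hypothetical names for illustration and do not correspond to anything in the kernel.

```python
def run_io(clear_at_start):
    """Simulate one write-out with an application write racing the I/O.

    Returns (memory_contents, disk_contents) after a follow-up
    writeback pass has had a chance to run.
    """
    memory = "v1"
    dirty = True
    disk = None

    # I/O starts: the data currently in memory is sent to the disk.
    if clear_at_start:
        dirty = False
    in_flight = memory

    # The application modifies the page while the I/O is in progress.
    memory = "v2"
    dirty = True

    # I/O completes: the disk now holds what was sent earlier.
    disk = in_flight
    if not clear_at_start:
        dirty = False  # this also wipes out the record of the "v2" write

    # A later writeback pass only writes data that is still marked dirty.
    if dirty:
        disk = memory
        dirty = False

    return memory, disk


if __name__ == "__main__":
    print("clear at start:     ", run_io(True))
    print("clear at completion:", run_io(False))
```

With the flag cleared at the start, the racing write leaves the flag set and the later pass writes the new data. With the flag cleared at completion, the new data is never written.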

A nasty file corruption bug - fixed

Posted Jan 2, 2007 9:37 UTC (Tue) by kay (subscriber, #1362) [Link]

The article may be a little confusing about this, but it states clearly:

"If the set_page_dirty() call comes while the I/O on the block is active, the filesystem will not notice the fact that the block's data may have changed after it was written."

Kay

A nasty file corruption bug - fixed

Posted Jan 2, 2007 18:34 UTC (Tue) by iabervon (subscriber, #722) [Link]

But I don't think that's actually true. If the I/O on the block is active, it has already cleared the bh's dirty bit (because the rule is that you clear dirty bits when you decide to write out data, not when you finish, to plug exactly the race you're talking about), and therefore set_page_dirty() will set it and things will be okay. I think this was Linus's second-to-last theory (something was cleaning a buffer after it sent the data to the disk), but it turned out not to be the problem.

The issue is if the page gets written out after set_page_dirty() but before the last write to the page, because the VM didn't redirty buffers in dirty pages when more writes came in. After getting the concurrent dirtying case correct, it essentially missed the case of writes to a clean part of a dirty page.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds