A gnarly 2.6.19 file corruption bug
While this kernel may have lived up to expectations in a number of ways, it would appear that somebody's evil ways have messed things up - and dachshunds would be well advised to keep a low profile. It seems that this kernel can corrupt ext3 filesystems - behavior which was not in the original set of design goals.
The good news (for users) is that the bug is hard to trigger, and that most access patterns work just fine. The bulk of the trouble seems to come with a certain BitTorrent client, which has an unusual access pattern at best. On occasion, part of a page will end up being written as zeroes, from some point within the page through to its end. Please do not expect your editor to explain why this is happening; it seems that nobody really understands that yet. The solution, however, may involve some relatively serious low-level memory management surgery.
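For readers wanting to check a suspect file for this signature, the following is a minimal sketch of a scanner that looks for pages ending in a run of zero bytes. The function names, the page size, and the `min_run` threshold are all assumptions made for illustration; they do not come from any of the bug reports. Plenty of legitimate files end a page in zeroes, so a hit is only a hint, and comparing against a known-good copy remains the real test.

```c
#include <stddef.h>

/* Count the zero bytes at the tail of one page-sized chunk. */
size_t trailing_zero_bytes(const unsigned char *page, size_t page_size)
{
    size_t n = 0;

    while (n < page_size && page[page_size - 1 - n] == 0)
        n++;
    return n;
}

/*
 * Return the index of the first full page in buf whose tail holds at
 * least min_run zero bytes, or -1 if no page matches. Both page_size
 * and min_run are caller-chosen, hypothetical parameters.
 */
long first_suspect_page(const unsigned char *buf, size_t len,
                        size_t page_size, size_t min_run)
{
    size_t off;

    if (page_size == 0)
        return -1;
    for (off = 0; off + page_size <= len; off += page_size)
        if (trailing_zero_bytes(buf + off, page_size) >= min_run)
            return (long)(off / page_size);
    return -1;
}
```

A caller would read the file into a buffer and pass the system page size (as reported by `sysconf(_SC_PAGESIZE)`) as `page_size`.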
The apparent origin of the problem is a change in how dirty pages are tracked in the kernel. Prior to 2.6.19, this information lived in the page tables; the 2.6.19 kernel, however, moves some of this information into the page structure. This change enables better tracking of dirty pages in the system, which is a good thing, but it could also be bringing some old bugs out to play.
Not all of those bugs are necessarily in the kernel; at one point, Linus went off and wrote a demonstration program showing how a buggy program would work with older kernels but get surprising results in 2.6.19. What it comes down to is that if a program maps a file into memory, it cannot put data into that memory beyond the current length of the file and expect that data to make it to disk. It was a nice demonstration, but this behavioral change does not appear to be behind the problem reports.
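The rule about mapped files can be illustrated with a short sketch. This is not Linus's demonstration program, which is not reproduced here; it is a hypothetical helper (`store_via_mmap()`, a name invented for this example) that creates a ten-byte file, maps its first page, and stores a string at a chosen offset. With `extend` set to zero, a store beyond the ten-byte mark lands in the mapping but not in the file; with `extend` nonzero, the file is first grown with `ftruncate()` so that the store falls within its length.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: store msg at byte offset off of a freshly
 * created ten-byte file, using a shared mapping of its first page.
 * If extend is nonzero, the file is grown first so that the store
 * lies within the file. Returns the final file size, or -1 on error.
 */
long store_via_mmap(const char *path, long off, const char *msg, int extend)
{
    long page = sysconf(_SC_PAGESIZE);
    long len = (long)strlen(msg);
    long size = -1;
    struct stat st;
    char *map;
    int fd;

    /* Keep the store inside the single page we map. */
    if (off < 0 || off + len > page)
        return -1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* The file starts out ten bytes long. */
    if (write(fd, "0123456789", 10) != 10)
        goto out;

    /* The correct approach: extend the file before storing into it. */
    if (extend && off + len > 10 && ftruncate(fd, off + len) != 0)
        goto out;

    map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    /* Beyond end-of-file this store only touches the in-memory page. */
    memcpy(map + off, msg, (size_t)len);
    msync(map, (size_t)page, MS_SYNC);
    munmap(map, (size_t)page);

    if (fstat(fd, &st) == 0)
        size = (long)st.st_size;
out:
    close(fd);
    return size;
}
```

Calling this helper with an offset of 100 and `extend` set to zero leaves the file at ten bytes, so the stored string never becomes part of the file; with `extend` nonzero the file grows to cover the store and the data is written out normally.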
Confusion surrounding the propagation and management of the page dirty bits is at the top of the suspect list, as of this writing. Nobody seems to be able to point at anything specific, however, beyond the fact that the code appears to be rather badly messed up. Says Linus:
So the approach being taken by Linus is to rework the dirty page accounting code into something a little more reasonable. To that end, test_clear_page_dirty() is no more, having been pronounced "insane" by Linus. Instead, the new code tries for a better-defined sense of when the dirty bit on a page can be cleared; it comes down to two cases: either (1) the page is being written to backing store, or (2) the page is no longer relevant (when a file is truncated, for example). In typical fashion, Linus fixed enough to make his own configuration work, leaving the rest as an exercise for the reader.
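The two-case rule can be modeled in a few lines of user-space C. What follows is a toy sketch only: the `toy_page` structure and the `toy_*` functions are invented for this illustration and are not the kernel's data structures or the functions in the actual patch. The point is simply that the dirty flag has exactly two ways of being cleared, each tied to a well-defined event.

```c
#include <stdbool.h>

/* Toy stand-in for a page; not the kernel's struct page. */
struct toy_page {
    bool dirty;            /* contents differ from backing store */
    bool under_writeback;  /* a write to backing store is in flight */
    bool truncated;        /* page no longer belongs to the file */
};

/* Any store into the page marks it dirty. */
void toy_mark_dirty(struct toy_page *p)
{
    p->dirty = true;
}

/*
 * Case (1): the page is about to be written to backing store.
 * Returns true if the page was dirty, in which case the caller is
 * now responsible for writing it out.
 */
bool toy_clear_dirty_for_io(struct toy_page *p)
{
    if (!p->dirty)
        return false;
    p->dirty = false;
    p->under_writeback = true;
    return true;
}

/*
 * Case (2): the page is no longer relevant (its part of the file
 * was truncated away), so its dirty state is simply discarded.
 */
void toy_cancel_dirty(struct toy_page *p)
{
    p->dirty = false;
    p->truncated = true;
}
```

There is deliberately no general-purpose "test and clear" operation in this model; a caller that clears the flag must either be committing to a write or declaring the page dead.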
He makes no claims that this rework will have solved the problem, only that it makes the code more sane than it was before. As of this writing, there have been no responses from the people who are able to reproduce this problem. If the problem goes away - and the developers can convince themselves that it has not just been papered over - then some version of this fix will likely need to be prepared for a 2.6.19 update. Then, maybe, the dachshunds can come out of hiding.
