Improving ext4: bigalloc, inline data, and metadata checksums
Posted Nov 30, 2011 2:13 UTC (Wed) by nix (subscriber, #2304)
In reply to: Improving ext4: bigalloc, inline data, and metadata checksums by yokem_55
Parent article: Improving ext4: bigalloc, inline data, and metadata checksums
I'm quite willing to believe that bad RAM and the like can cause data corruption, but even when I was running ext4 on a machine with RAM so bad that you couldn't md5sum a 10MB file three times and get the same answer thrice, I had no serious corruption (though it is true that I didn't engage in major file writing while the RAM was that bad, and I did get the occasional bitflip in the page cache, and oopses every day or so).
Posted Nov 30, 2011 12:49 UTC (Wed)
by tialaramex (subscriber, #21167)
To someone who isn't looking for RAM/cache issues as the root cause, those often look just like filesystem corruption of whatever kind. They try to open a file and get an error saying it's corrupted. Or they run a program and it mysteriously crashes.
If you _already know_ you have bad RAM, then you say "Ha, bitflip in page cache" and maybe you flush a cache and try again. But if you've already begun to harbour doubts about Seagate disks, or Dell RAID controllers, or XFS, then of course that's what you will tend to blame for the problem.
Posted Dec 1, 2011 19:23 UTC (Thu)
by nix (subscriber, #2304)
Rare bitflips are normally going to be harmless or fixed up by e2fsck, one would hope. There may be places where a single bitflip, written back, toasts the fs, but I'd hope not. (The various fs fuzzing tools would probably have helped comb those out.)
