It's true that CRCs let us figure out whether the data on the drives is correct. But if you crash while updating the parity and you lose one of the drives (not unusual in a power failure), you need to be able to rebuild the data from parity.
If the parity isn't consistent with the rest of the stripe, the rebuild isn't possible.
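To make the failure mode concrete, here is a minimal sketch of single-parity (RAID 5 style) XOR arithmetic. It is a toy model, not Btrfs or md code: the helper name xor_blocks and the four-byte blocks are assumptions chosen for illustration. It shows that a data block written without its matching parity update leaves a stripe from which a different, lost block can no longer be rebuilt correctly.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus their XOR parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# With consistent parity, a lost d1 is rebuilt from the survivors.
rebuilt_ok = xor_blocks(d0, d2, parity)

# Crash scenario: the new d0 reached the disk, the new parity did not.
d0_new = b"ZZZZ"

# The drive holding d1 is then lost; rebuilding from stale parity
# yields garbage for d1, a block that was never being written.
rebuilt_bad = xor_blocks(d0_new, d2, parity)
```

A CRC on the rebuilt block would detect that rebuilt_bad is wrong, but it cannot supply the correct contents.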
Posted Feb 6, 2013 15:27 UTC (Wed) by Jonno (subscriber, #49613)
> If the parity isn't consistent with the rest of the stripe, the rebuild isn't possible.
True, but a write-intent bitmap wouldn't help with that either, as all it does is tell you which drive(s), if any, are out of date and need to be rebuilt. That information won't help if you have lost a drive (or two for RAID 6) and can't rebuild anything.
RAID 5/6 code merged into Btrfs
Posted Feb 6, 2013 18:26 UTC (Wed) by butlerm (subscriber, #13312)
The purpose of a write-intent bitmap is not to recover a failed drive; it is to recover from a lost write. In the event of a power failure or system crash, one or more of the writes may be lost (or only partially completed), leaving the stripe parity in an inconsistent state.
Correct parity (sufficient to recover from a subsequent drive failure) can be trivially regenerated for the stripes that the write-intent bitmap marks as having writes in flight. The data in the blocks actually being written may still be incomplete, of course, but that doesn't matter for the purpose of protecting the data in the other blocks of the same stripe.
If a drive fails and the system crashes at the same time a stripe update is in progress, it is entirely possible, of course, that unrelated parts of the stripe being updated may become unrecoverable for lack of consistent parity information. You can see the attraction of the ZFS policy of making a full stripe the minimum block size.
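The recovery described above can be sketched roughly as follows. This is a toy in-memory model under stated assumptions, not how md or Btrfs implements it: the ToyArray class, its dirty set (standing in for an on-disk write-intent bitmap), and the crash_before_parity flag are all hypothetical names invented for the example. It assumes every drive survives the crash, so parity for the marked stripes can be recomputed from the data blocks before any later drive failure.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

class ToyArray:
    """Toy single-parity array with a write-intent 'bitmap' (a set of stripes)."""

    def __init__(self, stripes):
        self.data = [list(s) for s in stripes]
        self.parity = [xor_blocks(*s) for s in self.data]
        self.dirty = set()  # stripes with a write in flight

    def write(self, stripe, index, block, crash_before_parity=False):
        self.dirty.add(stripe)           # record intent before touching data
        self.data[stripe][index] = block
        if crash_before_parity:
            return                       # simulated crash: parity is now stale
        self.parity[stripe] = xor_blocks(*self.data[stripe])
        self.dirty.discard(stripe)       # write complete, clear the bit

    def resync(self):
        """After a crash, recompute parity only for stripes marked dirty."""
        for stripe in sorted(self.dirty):
            self.parity[stripe] = xor_blocks(*self.data[stripe])
        self.dirty.clear()

    def rebuild(self, stripe, lost):
        """Reconstruct the block at position 'lost' from survivors and parity."""
        survivors = [b for i, b in enumerate(self.data[stripe]) if i != lost]
        return xor_blocks(*survivors, self.parity[stripe])

arr = ToyArray([[b"AAAA", b"BBBB", b"CCCC"], [b"DDDD", b"EEEE", b"FFFF"]])
arr.write(0, 0, b"ZZZZ", crash_before_parity=True)
dirty_after_crash = set(arr.dirty)
before_resync = arr.rebuild(0, 1)   # stale parity: wrong contents for block 1
arr.resync()
after_resync = arr.rebuild(0, 1)    # parity regenerated: block 1 is recoverable
```

Only stripe 0 is touched by the resync; stripe 1 was never marked, so it is skipped. If a drive were lost before resync ran, the stale parity would be all that remained, which is the case the last paragraph above describes.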