Responding to ext4 journal corruption
Posted May 30, 2008 6:03 UTC (Fri) by jzbiciak
Parent article: Responding to ext4 journal corruption
1. A file is created, with its associated metadata.
2. That file is then deleted, and its metadata blocks are released.
3. Some other file is extended, with the newly-freed metadata blocks being reused as data blocks.
It seems that if you defer returning released metadata blocks to the in-memory notion of "available space" until the transaction releasing them is well and truly committed (rather than merely "sent to the journal"), you prevent step 3 from ever happening.
In fact, the general issue seems to be one of storage repurposing. For example, suppose blocks freed from file A are allocated to file B. If data for B gets written to those blocks, but the transactions reassigning those blocks are corrupted across a crash, then after recovery file A would still own those blocks and would hold contents intended for file B.
Thus, it seems prudent in data=ordered mode to prevent the allocator from reallocating recently freed blocks until the metadata marking those blocks as free has actually been committed. I have no idea how difficult that might be to implement, but it is something that only needs to be tracked in the in-memory notion of "available space."
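To make the idea concrete, here is a toy sketch in Python. Everything in it (the class name, `release`, `commit`, the transaction ids) is made up for illustration and says nothing about how ext4 or its journaling layer is actually structured; it only shows freed blocks being parked per transaction and rejoining the free pool once that transaction is reported committed.

```python
class DeferredFreeAllocator:
    """Toy in-memory block allocator (hypothetical, for illustration only).

    Freed blocks are held back per transaction and only become
    allocatable again after that transaction is known to be committed.
    """

    def __init__(self, num_blocks):
        self.free = set(range(num_blocks))  # blocks safe to hand out now
        self.pending = {}                   # transaction id -> blocks freed in it

    def allocate(self, count):
        # Only committed-free blocks are candidates for allocation.
        if count > len(self.free):
            raise RuntimeError("not enough committed-free blocks")
        blocks = sorted(self.free)[:count]
        self.free.difference_update(blocks)
        return blocks

    def release(self, tid, blocks):
        # Park the blocks instead of returning them to the free pool.
        self.pending.setdefault(tid, set()).update(blocks)

    def commit(self, tid):
        # Call this only once transaction `tid` is durably committed.
        self.free.update(self.pending.pop(tid, set()))


alloc = DeferredFreeAllocator(num_blocks=8)
a_blocks = alloc.allocate(3)            # file A gets three blocks
alloc.release(tid=1, blocks=a_blocks)   # file A is deleted in transaction 1
b_blocks = alloc.allocate(3)            # file B cannot receive A's old blocks yet
alloc.commit(tid=1)                     # transaction 1 has committed
c_blocks = alloc.allocate(3)            # A's old blocks are reusable from here on
```

Note that the second allocation can fail for lack of space even though blocks have been "freed", which is exactly the cost on a nearly full filesystem.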
Will this degrade the quality of allocations? It might for nearly full filesystems or filesystems with a lot of churn, but for filesystems that are far from full, I doubt it would have any measurable impact. The cost is that some pool of blocks from recently deleted or truncated files won't be available for reallocation until the transactions freeing them have committed.
Anyone see any holes in this?