Is self-healing always good?
Posted Apr 3, 2025 8:03 UTC (Thu) by DemiMarie (subscriber, #164188) In reply to: the self healing work continues in bcachefs by koverstreet
Parent article: The first part of the 6.15 merge window
Is self-healing always wanted? My concerns are:
- It could risk trashing good-but-unreachable data, preventing subsequent data recovery operations.
- It could hide errors from userspace, such as by reporting “file definitely does not exist” instead of “I/O error occurred and we don’t know if the file exists”.
- It could recover data that was never actually present, such as freed disk blocks, creating a security concern.
-ENOENT until and unless the administrator tells the filesystem to use its best guess of what the pre-corruption situation was, or X is overwritten by an operation that makes that state irrelevant. Silently returning wrong data is the worst possible outcome.
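To make the second concern concrete, here is a toy sketch (not bcachefs code) of a lookup that only answers "does not exist" when the directory metadata is fully readable. The names `DirState` and `lookup` are invented for illustration, and the in-memory dict stands in for a real directory index.

```python
import errno
from enum import Enum


class DirState(Enum):
    HEALTHY = "healthy"
    DAMAGED = "damaged"  # some of the directory's metadata could not be read


def lookup(entries, name, state):
    """Return the inode number for `name`, or raise OSError.

    ENOENT is raised only when the directory is fully readable, so that
    "not found" really means "does not exist". If metadata is damaged and
    the name is not among the surviving entries, the honest answer is EIO.
    """
    if name in entries:
        return entries[name]
    if state is DirState.DAMAGED:
        raise OSError(errno.EIO, "directory damaged; entry may or may not exist", name)
    raise FileNotFoundError(errno.ENOENT, "no such entry", name)
```

With this split, a caller that sees ENOENT can safely create the file, while a caller that sees EIO knows not to assume anything about it.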
Posted Apr 3, 2025 13:59 UTC (Thu) by koverstreet (✭ supporter ✭, #4296)
There are cases where fsck will delete things, but for the most part that's only if we have another piece of metadata that says "this shouldn't exist", e.g. an extent past the end of an inode, which means something went wrong with truncate.
If a reflink pointer points to a missing indirect extent, we just mark it as poisoned, so on future attempts to read from it we don't have to print out the same error, and we can un-poison it if the indirect extent comes back; this guards against a temporary lookup error in the reflink btree.
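As a rough illustration of the poison/un-poison behaviour described above, here is a toy model in Python. It is not the bcachefs implementation; `ReflinkPointer`, `read_through`, and the dict standing in for the reflink btree are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReflinkPointer:
    target: int             # key of the indirect extent in the toy reflink table
    poisoned: bool = False  # set once we have reported the target as missing


def read_through(ptr, indirect_extents, log):
    """Resolve `ptr` against `indirect_extents` (dict: key -> bytes).

    Returns the data, or None if the indirect extent is missing. The error
    is logged only on the first failure; later reads see the poison flag and
    stay quiet. If the extent reappears, the pointer is un-poisoned.
    """
    data = indirect_extents.get(ptr.target)
    if data is None:
        if not ptr.poisoned:
            log.append(f"reflink pointer to missing indirect extent {ptr.target}")
            ptr.poisoned = True
        return None
    ptr.poisoned = False  # the extent came back: the earlier failure was transient
    return data
```

The point of the flag is that nothing is deleted: the pointer stays in place, so a transient lookup failure costs one log line rather than data.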
For the snapshots btree, a key for a snapshot node that doesn't exist generally indicates a problem with snapshot deletion, and the key will be deleted. But we also track when a btree has lost data (topology error, IO error), and if the snapshots btree has lost data we'll instead try to reconstruct snapshot tree nodes (and also subvolume keys, etc.).
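The decision described here can be summarized as a small function. This is only a sketch of the policy as stated in the comment, with made-up names (`Action`, `repair_snapshot_key`); the real fsck logic has more cases.

```python
from enum import Enum


class Action(Enum):
    KEEP = "keep"
    DELETE = "delete"
    RECONSTRUCT = "reconstruct"


def repair_snapshot_key(node_exists, btree_lost_data):
    """Decide what to do with a key that refers to a snapshot node.

    node_exists:     whether the referenced snapshot node was found
    btree_lost_data: whether the snapshots btree is known to have lost data
                     (topology error, IO error)
    """
    if node_exists:
        return Action.KEEP
    if btree_lost_data:
        # The node may be missing because of damage, so rebuild it
        # instead of throwing away the key that points to it.
        return Action.RECONSTRUCT
    # No known damage: the dangling key most likely comes from an
    # interrupted or buggy snapshot deletion.
    return Action.DELETE
```

Tracking whether a btree has lost data is what lets the same symptom (a key pointing at a missing node) be handled destructively in one case and conservatively in the other.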
We can reconstruct inodes if the inodes btree has lost data. Permissions, ownership, timestamps, etc. will all be wrong, and i_size will be a bit off, but you'll still have the correct file contents.
This topic is an area of future research, but for all practical purposes we're in good shape.
