btrfs: Enhancement to tree block validation
From: | Qu Wenruo <wqu-AT-suse.com> | |
To: | linux-btrfs-AT-vger.kernel.org | |
Subject: | [PATCH v5 00/12] btrfs: Enhancement to tree block validation | |
Date: | Fri, 15 Feb 2019 18:50:32 +0800 | |
Message-ID: | <20190215105044.17619-1-wqu@suse.com> | |
Archive-link: | Article |
Patchset can be fetched from github: https://github.com/adam900710/linux/tree/write_time_tree_... Which is based on v5.0-rc1 tag. Also there is no conflict rebasing the patchset to misc-next. This patchset has the following 3 features: - Tree block validation output enhancement * Output validation failure timing (write time or read time) * Always output tree block level/key mismatch error message This part is already submitted and reviewed. - Write time tree block validation check To catch memory corruption either from hardware or kernel. Example output would be: BTRFS critical (device dm-3): corrupt leaf: root=2 block=1350630375424 slot=68, bad key order, prev (10510212874240 169 0) current (1714119868416 169 0) BTRFS error (device dm-3): write time tree block corruption detected BTRFS: error (device dm-3) in btrfs_commit_transaction:2220: errno=-5 IO failure (Error while writing out transaction) BTRFS info (device dm-3): forced readonly BTRFS warning (device dm-3): Skipping commit of aborted transaction. BTRFS: error (device dm-3) in cleanup_transaction:1839: errno=-5 IO failure BTRFS info (device dm-3): delayed_refs has NO entry - Better error handling before calling flush_write_bio() One hidden reason of calling flush_write_bio() under all cases is, flush_write_bio() will trigger endio function and endio function of epd->bio will free the bio under all cases. So we're in fact abusing flush_write_bio() as cleanup. Since now flush_write_bio() has its own return value, we shouldn't call flush_write_bio() no-brain, here we introduce proper cleanup helper, end_write_bio(). Now we call flush_write_bio() like: New | Old -------------------------------------------------------------- ret = do_some_evil(&epd); | ret = do_some_evil(&epd); if (ret < 0) { | flush_write_bio(&epd); end_write_bio(&epd, ret); | ^^^ submitting half-backed epd->bio? return ret; | return ret; } | ret = flush_write_bio(&epd); | return ret; | Above code should be more streamline for the error handling part. Changelog: v2: - Unlock locked pages in lock_extent_buffer_for_io() for error handling. - Added Reviewed-by tags. v3: - Remove duplicated error message. - Use IS_ENABLED() macro to replace #ifdef. - Added Reviewed-by tags. v4: - Re-organized patch split Now each BUG_ON() cleanup has its own patch - Dig much further into the call sites to eliminate unexpected >0 return May be a little paranoid and abuse some ASSERT(), but it should be much safer against further code change. - Fix the false alert caused by balance and memory pressure The fix it skip owner checker for non-essential tree at write time. Since owner root can't always be reliable, either due to commit root created in current transaction or balance + memory pressure. v5: - Do proper error-out handling other than relying on flush_write_bio() to clean up. This has a side effect that no Reviewed-by tags for modified patches. - New comment for why we don't need to do anything about ebp->bio when submit_one_bio() fails. - Add some Reviewed-by tag. Qu Wenruo (12): btrfs: Always output error message when key/level verification fails btrfs: extent_io: Kill the forward declaration of flush_write_bio() btrfs: disk-io: Show the timing of corrupted tree block explicitly btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up btrfs: extent_io: Handle error better in extent_write_full_page() btrfs: extent_io: Handle error better in btree_write_cache_pages() btrfs: extent_io: Kill the dead branch in extent_write_cache_pages() btrfs: extent_io: Handle error better in extent_write_locked_range() btrfs: extent_io: Kill the BUG_ON() in lock_extent_buffer_for_io() btrfs: extent_io: Kill the BUG_ON() in extent_write_cache_pages() btrfs: extent_io: Handle error better in extent_writepages() btrfs: Do mandatory tree block check before submitting bio fs/btrfs/disk-io.c | 21 +++-- fs/btrfs/extent_io.c | 168 ++++++++++++++++++++++++++++------------ fs/btrfs/tree-checker.c | 24 +++++- fs/btrfs/tree-checker.h | 8 ++ 4 files changed, 162 insertions(+), 59 deletions(-) -- 2.20.1