Pretty sure I've found & fixed the root cause of this now.
Posted Oct 28, 2012 17:58 UTC (Sun) by sandeen (guest, #42852)
Parent article: An update on the ext4 corruption issue
I decided to test recovery with journal_checksum enabled, and every journal replay I tried failed with a bad checksum in the log. (This used to work; long ago I fixed a journal_checksum error Linus ran into, and we turned it off by default after that.) I went searching for when this new regression happened, and landed on a commit present in kernel 3.4, 119c0d4460b001e44b41dcf73dc6ee794b98bd31 "ext4: fold ext4_claim_inode into ext4_new_inode." This change resulted in an un-journaled metadata update, which caused the bad journal checksum, which caused the "corruption" (really, just an unplayable / unplayed log) the reporter experienced.
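As a rough illustration of that chain of events, here is a toy model in Python. It is not the jbd2 code; `ToyJournal`, `get_write_access`, and `simulate` are hypothetical names, and the model assumes only that the checksum is computed over a buffer before that buffer is written to the log. A change made through the journal is snapshotted first, so the log still matches the checksum; a change made behind the journal's back is not.

```python
import zlib


class ToyJournal:
    """Toy stand-in for a journal; not the real jbd2 data structures."""

    def __init__(self):
        self.pending = []  # buffers belonging to the committing transaction
        self.frozen = {}   # id(buffer) -> snapshot taken before a journaled change

    def add(self, buf):
        self.pending.append(buf)

    def checksum(self):
        # Checksum is taken over the buffers as they look right now.
        return zlib.crc32(b"".join(bytes(b) for b in self.pending))

    def get_write_access(self, buf):
        # A journaled update announces itself, so the journal can keep the
        # old contents around for the log write that is still outstanding.
        if any(b is buf for b in self.pending):
            self.frozen.setdefault(id(buf), bytes(buf))

    def write_log(self):
        # The log write happens later and uses the snapshot if one exists.
        return [self.frozen.get(id(b), bytes(b)) for b in self.pending]


def simulate(journaled):
    """Return True if the logged data still matches the commit checksum."""
    journal = ToyJournal()
    inode_bitmap = bytearray(8)      # hypothetical metadata block
    journal.add(inode_bitmap)
    crc = journal.checksum()         # checksum computed at commit time

    if journaled:
        journal.get_write_access(inode_bitmap)
    inode_bitmap[0] |= 0x01          # claim an inode in the bitmap

    logged = journal.write_log()
    return zlib.crc32(b"".join(logged)) == crc


if __name__ == "__main__":
    print("journaled update verifies:  ", simulate(True))
    print("unjournaled update verifies:", simulate(False))
```

In this sketch the unjournaled case fails verification even though nothing on the "filesystem" is damaged, which mirrors the point that the problem was an unplayable log rather than corrupted data.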
Anyway, I really expect the patch I sent last night, "[PATCH] ext4: fix unjournaled inode bitmap modification" to fix it; the original reporter found that it fixed it for him.
It appears that the corruption problem everyone was worried about was confined to users who had the non-default journal_checksum option turned on, thus resulting in an unplayable log.
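For anyone wanting to check whether they are in that group, a minimal sketch follows. It assumes the option shows up literally as `journal_checksum` in the options field of `/proc/mounts`; the function name is made up for illustration.

```python
def ext4_mounts_with_journal_checksum(mounts_text):
    """Return mount points of ext4 filesystems mounted with journal_checksum.

    mounts_text is text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mountpoint, fstype, options = fields[:4]
        # Exact match on the option name, so journal_async_commit alone
        # is not counted.
        if fstype == "ext4" and "journal_checksum" in options.split(","):
            hits.append(mountpoint)
    return hits


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for mountpoint in ext4_mounts_with_journal_checksum(f.read()):
            print(mountpoint, "is mounted with journal_checksum")
```

An empty result means no mounted ext4 filesystem currently lists the option; it says nothing about options that were used on earlier mounts.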
There's a lot to be learned from this whole episode - about how to report bugs, how to triage bugs, and how to write news articles about bugs, I think. :) Anyway, in the end, fixed I believe, and not as scary or widespread as originally feared.
-Eric
Posted Oct 28, 2012 18:03 UTC (Sun)
by sandeen (guest, #42852)
[Link] (3 responses)
Posted Oct 28, 2012 19:14 UTC (Sun)
by nix (subscriber, #2304)
[Link] (2 responses)
Still, I guess it's a good thing I reported this rather than just saying 'oh, I'll turn it off since it seems to be broken now', even if the media splash was more than slightly disconcerting.
Posted Oct 31, 2012 22:33 UTC (Wed)
by rahvin (guest, #16953)
[Link] (1 responses)
In all honesty, it felt like some paid "sponsors" (or shills, as others call them) took this bug report and ran with it for political reasons. It was an obscure bug, yet it was presented as one everyone had experienced, with data (and babies) being eaten. The overblown presentation felt like some company's press office wrote the stories and had the shills running around submitting them to every news source available.
Posted Oct 31, 2012 23:15 UTC (Wed)
by nix (subscriber, #2304)
[Link]
[1] that feeling of 'oh god stop it tell me there is no more' that any true introvert gets when they create, accidentally or otherwise, a huge splash
Posted Oct 28, 2012 20:03 UTC (Sun)
by theophrastus (guest, #80847)
[Link]
VERSION = 3
PATCHLEVEL = 7
SUBLEVEL = 0
EXTRAVERSION = -rc3
NAME = Terrified Chipmunk
...yet nothing about ext4 in the logs [shrug].
In any case, I appreciate the apparent bug fix that's eventually on the way (patience is a virtue). Thankee!