BcacheFS for 6.17
Posted Aug 12, 2025 10:05 UTC (Tue)
by sheepdestroyer (guest, #54968)
In reply to: BcacheFS for 6.17 by pschneider1968
Parent article: Kernel prepatch 6.17-rc1
Kent's arguments on the ML (afaict) seem pretty solid: fully internal changes to his tree that fix real user bugs should get in ASAP, and not be arbitrarily postponed for 3 more months. Users depend on them.
So far the rationale for doing otherwise feels wrong and doesn't strike me as well argued yet.
On a separate note: there's also my personal preference for favoring technically correct people. An overfocus on (and weaponization of) *communication* to win against them rubs me the wrong way. The fact that paid people have been able to turn away skilled technical hobbyists just because they didn't use the proper corporate tone is extremely worrying to me. Linux will not get better in the long term by playing that game.
Posted Aug 12, 2025 10:18 UTC (Tue)
by Wol (subscriber, #4433)
[Link] (19 responses)
"If I make an exception for ONE person, then I've got to make an exception for EVERYONE, and the linux release process will become a farce".
Are you old enough to remember linux 0.99? Or (I think it was) Linux 2.5? The current system may not be the best, but I don't think people want to go back to those days. Deadlines are deadlines, that's all there is to it, and if you miss one, you have to wait until the next.
If Kent wants his users to get his fixes early, he has to do what everyone else does - rebase his public tree on the current Linus tree and let his users pull an update patch set and merge it.
The alternative is chaos. (And a burnt out Linus. We don't want to go there ...)
Cheers,
Posted Aug 12, 2025 16:42 UTC (Tue)
by koverstreet (✭ supporter ✭, #4296)
[Link] (18 responses)
Posted Aug 12, 2025 18:55 UTC (Tue)
by viro (subscriber, #7872)
[Link] (17 responses)
_That_ wouldn't scale. At all. And your reaction to many and varied attempts to explain that to you, by a lot of different people, has so far been a mix of "but I'm right!", quietly doing the same damn thing again, and an occasional "I'll just have to go for out-of-tree variant" thrown in. The last one got you "feel free to do so" from Linus (around the middle of last cycle), at which point you started to look for some other way to handle the situation.
Posted Aug 12, 2025 19:25 UTC (Tue)
by koverstreet (✭ supporter ✭, #4296)
[Link] (16 responses)
And that's been met, repeatedly, with escalating levels of "no, I decide!".
And at the risk of stating the obvious, yes, I am the expert on the bcachefs codebase, I am the one who answers bug reports, stares at the dashboards, so when the patch in question is purely internal to bcachefs, that gets a !?!?!?!?!.
I repeat, reasoning and talking things out hasn't worked. I've got plenty of hard data I can point to that on all the technical criteria bcachefs is in good shape, and you guys keep acting like I'm not to be trusted with my own code.
And when the patches in question are bugfixes and basic data integrity questions - I'm sorry, but this is just broken.
Posted Aug 12, 2025 20:57 UTC (Tue)
by sdalley (subscriber, #18550)
[Link] (15 responses)
The very best approach, when you're in a hole as deep as the one you are now in,
Posted Aug 12, 2025 21:53 UTC (Tue)
by koverstreet (✭ supporter ✭, #4296)
[Link] (11 responses)
Keep in mind: this all escalated from one simple patch to add recovery code for an issue that had caused data loss, which we were able to fully recover from with the new recovery code.
Posted Aug 13, 2025 5:38 UTC (Wed)
by mb (subscriber, #50428)
[Link] (10 responses)
The process says that during the freeze period no new features (a.k.a recovery functionality) are allowed.
Posted Aug 13, 2025 13:49 UTC (Wed)
by koverstreet (✭ supporter ✭, #4296)
[Link] (9 responses)
The underlying principle has always been: make sure things work for the end user. Don't break userspace, don't crash the kernel. For a filesystem, that means don't lose user data: so recovery code to avoid data loss is absolutely a bugfix.
But the technical discussions have gotten consistently derailed with rants about how Linus doesn't trust my code, and high school stuff about how everyone hates me, how I need therapy, etc.
And to counter "not ready for production use" - bcachefs has been in use and supported since before it went upstream, and I'm being much more conservative with the experimental label than btrfs or ext4 were - it has more usage now than those filesystems did when they were deemed stable, from what I can tell.
E.g. when we had issues with Debian shipping a broken package (the package maintainer broke the build and didn't report it upstream for months), I started getting a bunch of bug reports from Debian users who had a drive die and couldn't mount in degraded mode. A bunch of people were affected by the casefolding/overlayfs incompatibility in 6.15 as well, sigh.
That's been typical - it's out there and it's working for most people, so I only find out about usage when something breaks (and often that's process), because filesystems are supposed to be boring things that you don't have to think about.
Posted Aug 13, 2025 18:18 UTC (Wed)
by ikm (guest, #493)
[Link] (8 responses)
Hi Kent,
You're right, that's not a hard rule, and I bet you could've gotten journal_rewind merged without too much arguing. But could we please revisit that moment when you tried?
So, Linus gets a pull request during rc3 containing "New option: journal_rewind". Nowhere in the description of that option does it say why it should absolutely be included in rc3. So, from any random person's point of view (Linus included), new options don't belong in rc3, hence it shouldn't be merged. So Linus writes to you exactly that: "We don't start adding new features just because you found other bugs."
What you could've done: politely asked for an exception given that 1) it fixes an important existing problem for users, and you'd like to have it upstream ASAP to further accelerate the development due to better bug turnover, 2) it's self-contained, and the fs is still experimental, so it's likely to fix more things than it could break, 3) promise it won't happen again unless there's really a good reason for it, and you'd appreciate it being granted at least this one time.
What you did do: 1) attacked other filesystems, 2) wrote about how great yours is, and 3) ended by saying that Linus just "doesn't get it". What were you expecting when you were insulting other filesystems and Linus personally?
Here's an example pull request text which you could've authored instead:
That text shows that 1) you realize this is against the rules, that 2) you've made an effort to explain why it's still necessary, and that 3) you're willing to back off if that's not good enough for Linus. Thus, you show him nothing but respect. Could you consider, if only for a second, whether this approach would have gone better?
It sucks to see bcachefs leave the kernel; I was keeping my fingers crossed for it. Unless, you know, there's still one option left on the table... It's so easy even ChatGPT can do it for you (like that text above - yes, you've guessed that right).
Now, I know you won't do it, but one can keep dreaming, right? Wish you and your project well, and hope someday to start using bcachefs.
Posted Aug 13, 2025 18:56 UTC (Wed)
by koverstreet (✭ supporter ✭, #4296)
[Link] (7 responses)
There was a lot of drama over pull requests before, with repeated threats of removing bcachefs from the kernel; dealing with the fallout of that took up an enormous amount of time. I've gotten a lot of legitimate hate mail.
It really wasn't good. There was the three day shouting-match-over-email before Linus was finally able to communicate that the timeframe of when I was sending pull requests was the issue and what we wanted was for me to send them on Thursday. Calling it "whining" in response to trying to get out bugfixes; claims of bcachefs's journalling model being broken in response to needing to allocate more than 2GB of memory for journal replay (I rent servers with 256GB of ram), claims of "filesystems that don't need fsck" being the future. I'd have to go back through my inbox to find more, and I really don't want to; what I can tell you is that almost nothing resulted in actionable technical feedback.
So if I came across as on edge in that pull request thread, yes, I was. Yes, I could have handled it better, but it's been a lot to deal with and I'm honestly not sure the outcome would have been any different.
Meanwhile, I'm looking at all the technical metrics for how bcachefs is doing, and all that just leads me to ask "why?". I look at pull requests from other subsystems - have the type of patches I've been sending been out of line? No, it doesn't look like it; perhaps pull requests could be clearer about criticality of bugfixes and things of that nature, but I see new features going into drivers during RCs all the time, and refactorings going into XFS during rc6 or rc7. The only thing that's been unusual about bcachefs's pull requests is volume - but we're a massive modern filesystem that's stabilizing rapidly, closing out bug reports, and supporting those users.
So I keep coming back to one basic question - why was any of this necessary?
If the process isn't working, and it hasn't been (again: bcachefs is doing well on technical metrics, pull request drama has never called that into question), the way to solve it is with a conversation. I think the chances for that happening evaporated a long time ago.
Posted Aug 13, 2025 19:55 UTC (Wed)
by mjg59 (subscriber, #23239)
[Link] (5 responses)
What's your end goal here? If you're trying to change Linus's behaviour, you're going about it in a way that's not only ineffective, you're actually making the problem worse - people are actually viewing it as justification for him behaving this way in other situations. If you're trying to get your code merged in a timely manner then, well, you're obviously failing there too.
It seems like you're not in a place to apologise and change your behaviour, and, well, I do understand that and I've been there before and I understand why it may feel incredibly unfair to be asked to. I'd recommend finding a good therapist to work on that, but also consider whether you can achieve more of what you want by just disengaging from LKML and having someone else take responsibility for managing pull requests and engaging with Linus and other maintainers. This may not be fair, it may not be just, but you're a technical person and you should understand the benefits of pragmatic solutions.
Posted Aug 13, 2025 23:43 UTC (Wed)
by koverstreet (✭ supporter ✭, #4296)
[Link] (4 responses)
Why'd you put up with it?
> But I feel like the fundamental thing you keep missing here is that it doesn't matter if you're right. It doesn't matter if you're right on technical grounds. It doesn't matter if you're right about the abuse you've received. It doesn't matter whose fault the entire situation is. You are not going to achieve your desired goals by behaving the way you are. Nobody is willing to support you. It may be horrifically unjust, but bcachefs will die and be forgotten and you will have wasted your time.
Ok, you think I'm waging some kind of futile war. But - no.
I do have significant support, and a real community I've built up: that's not going to vanish if (when) we have to go the DKMS route. ZFS has shown that can work for a filesystem, they've made it all the way to distro installers, so I think that approach has some legs. People want this thing, it solves real problems and addresses real needs, and a lot of people believe in my code and my methods. So I've got options here.
Option a: go the DKMS route for awhile. In a couple more years I should have erasure coding, online fsck and a whole bunch of other stuff done that'll make us more than competitive with ZFS for most people. There'll be some big deployments that are already in the works, and we'll be in distro installers - we were well supported by NixOS even before going upstream, I've just been waiting until the experimental label is off to push for more distro support, and that's six months away. Then, in a few years someone else can see about getting it back in (probably won't be me, someone else in the project), and we'll be much more solidly positioned should fights and drama like this arise.
Option b: people have been getting interested in FUSE performance, and I see no reason why a userspace filesystem can't be as fast as in-kernel, to within the margin of error. Ringbuffers for kernel <-> userspace fs communication, threads pinned to each core, temporary per-cpu mappings for zero copy to/from the pagecache - I've been sketching stuff out and it looks plenty doable, and that would open up new long term possibilities, like talking directly to NVMe devices, which would cut out a lot of block layer overhead (the block layer really is fatter than it needs to be). A lot of interesting stuff is going on in userspace these days that we might be able to take advantage of.
Option c: I've in the past had real offers from people who wanted to use the bcachefs codebase for other products. Not my passion, but certainly would be lucrative.
Option d: Start playing music again, find a nice hippie village, start having a social life and just work on bcachefs as more of a hobby, supporting my friends and existing users who really like it. I'm sure I could find some way to pay the bills :)
IOW, I really have no need to put up with toxicity and drama if it gets to be too much. And it's been way over the top: I lost a very promising hire because of Linus's repeated public threats to remove bcachefs from the kernel (strangely enough, people don't want to leave a stable job for the new thing when the leader of the project is saying he's going to kill it, and all their coworkers are seeing it and telling them they'd be completely crazy to do it). And it's been an enormous time sink and source of stress: bcachefs isn't just me, it's a whole community, and the drama affects the entire community and I have to respond to it. On top of that, given how often the fights have been over bugfixes, and it doesn't seem to be getting better, I have to consider whether we're really going to have a stable reliable release process staying in the kernel.
So all of that means that being out of the kernel is probably going to be the more stable, secure option at this point.
I like the kernel _community_, I have a lot of friends in it that I genuinely enjoy working with and it's a project that I've genuinely enjoyed being a part of and want to support - but when I've gotten at least a dozen emails from Linus in just the past two months that don't fit on a screen and have all been about how he doesn't trust my judgement, thinks that everyone hates me, thinks I need therapy and have to give a public apology - yeesh. No thank you :)
> It seems like you're not in a place to apologise and change your behaviour, and, well, I do understand that and I've been there before and I understand why it may feel incredibly unfair to be asked to. I'd recommend finding a good therapist to work on that, but also consider whether you can achieve more of what you want by just disengaging from LKML and having someone else take responsibility for managing pull requests and engaging with Linus and other maintainers. This may not be fair, it may not be just, but you're a technical person and you should understand the benefits of pragmatic solutions.
Now hang on, which behavior are you referring to?
Technical criticism really has to be fair game if we're to take ourselves seriously as engineers. I don't mind in the slightest when people tear into my code, and in fact I rather enjoy it (I've said it elsewhere before, and I'll say it again - I'll bake and mail a plate of cookies to the first person that comes up with a good, intelligent insightful rant/flame about the bcachefs code. Stuff like that is pure gold; when people can talk in an articulate way about what frustrates them, that's some of the best and most useful critiquing you can get).
But from what I've seen of the people who've spoken up and said I've pissed them off - or even the stuff people have come up with digging through lore - I'm sorry, it's been just absurd. It's like people have collectively lost their minds and decided to go on a witchhunt (you do realize the kernel community is famously known for being toxic, right? It shouldn't surprise anyone).
The only person I'd remotely want to apologize to that I haven't already is Josef, because I was pretty savage about btrfs the other day on fsdevel. I think highlighting our track record and engineering standards on filesystems definitely needed to be done; we're a technical community, and we should be making decisions for technical reasons, so even if people within the community don't want to hear it that needs to be highlighted and discussed before git rm -rfing bcachefs.
But Josef is a good guy, and that was a bit over the top, so if Josef's reading this - please consider this an apology.
As for the rest... sorry, not apologizing to people who start swearing at me in technical discussions :)
Posted Aug 14, 2025 1:59 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link] (3 responses)
I didn't. I left and did something else.
> Now hang on, which behavior are you referring to?
I will not work with you. I will not work on any project that you're leading. This is *purely* as a result of reading your writings here and on LKML. I have no stake in the technical issues whatsoever. In reading your posts I see you reacting similarly to both cases where, yes, the other party is going beyond what I consider reasonable behaviour, but also where people have gone out of their way to provide you with support and advice. And maybe this wouldn't be the case if the level of tolerated toxicity in the kernel was more reasonable, and you'd have got along with everyone. But that's not where we are, that's not what happened, and now you just appear to be just as much of an asshole as Linus (if not more) except it's far easier to just ignore you because I don't need to deal with you if I want to get something upstream. And if you're unwilling to accept that there may be truth in that then you're *definitely* more of an asshole than Linus is, because at his worst he did step back and take some time to work on his interpersonal interactions and came back better (if still not great).
Seriously. Consider whether every single person who's complained about you is wrong, or whether there's any possibility that you could be a factor in this.
Posted Aug 14, 2025 4:14 UTC (Thu)
by koverstreet (✭ supporter ✭, #4296)
[Link] (2 responses)
But if I may unpack this a bit?
You're coming in with some really strong opinions about someone you've never worked with, taking the time to post on a public forum that I need therapy...
Look, the kernel is a high pressure, high stakes environment, it's a bit of a pressure cooker, and we're coders - a group not exactly known for high emotional intelligence.
Don't you think that this might produce a lot of overreactions, and some rushing to superficial judgements?
I try not to judge people based only on what I hear from others... I try to keep an open mind. And I try not to keep old rivalries and feuds alive; it helps to remember that people may have been having a bad day in that one interaction you saw, or they may legitimately be doing a lot (possibly a lot of good) and under some stress - reactions get short in situations like that.
It helps not to make blanket statements and judgements about people. Look, I don't even consider Linus an asshole... control freak with some toxic behavior sure, but for me to judge anyone too harshly for that would be the pot calling the kettle black. I've also had plenty of moments where I genuinely enjoyed working with him, he has a lot of qualities I respect, and I try to keep that stuff in mind even when tensions have gotten waaaay too high and I need a break. It helps.
We've all had our moments, we're all (hopefully) learning and getting better. I know I've had plenty of moments of realization along the lines of "oh, I took that way too far - I need to think about how to approach that better next time", and I suspect you have too.
Such is life. Sometimes we all just need to chill out.
Posted Aug 15, 2025 13:04 UTC (Fri)
by ntcarruth (guest, #178852)
[Link] (1 responses)
I’m very much an outsider here, but wanted to make the following observation:
I admire Kent for trying to not hold grudges; the world would be a better place if more people were like that. Paradoxically, though, from some comments I’ve read here and on LKML, I wonder whether an attitude like that in the quote above, sufficiently extrapolated, could result in interactions like some people complain (fairly or unfairly) about having had with Kent? More specifically, from my own personal experience, I am inclined to believe the following:
Assuming that someone who swears at me is simply having a bad day is almost always a good thing.
Assuming that someone who tells me I’m wrong, even if they can’t explain why other than by referencing nebulous “experience”, is simply having a bad day, is generally a bad thing.
At any rate, I have certainly made the latter mistake. Experience, in my experience (excuse the repetition), almost always trumps theory, at least when the context surrounding the experience is applicable.
At any rate, from Kent’s comments elsewhere, he has more than one option for keeping bcachefs alive in or out of the kernel. Similar to what another commentator said below, Kent, keep up the good work, we all need better technology!
Posted Aug 15, 2025 18:05 UTC (Fri)
by koverstreet (✭ supporter ✭, #4296)
[Link]
Posted Aug 13, 2025 22:47 UTC (Wed)
by ikm (guest, #493)
[Link]
I honestly don't think there's anything wrong with the code. I've been a long-time user of the original bcache and it's been rock solid, so I'm sure you can totally pull it off with bcachefs - I've seen you do it before. Furthermore, I'm sure one can easily find a lot of lower quality work in the kernel right now if one decides to dig for it for some reason. I don't think that's really the problem.
Thing is, writing the code is one thing, but selling it to the public is an entirely different one. When I want to send a patch to some upstream project, and the patch itself is done, I always have this thought crossing my mind: "ok, the easy predictable part is done, now for the hard, unpredictable one". I then have to explain what this patch is, why it's necessary, how I tested it, how I tried to adhere to the project's existing best practices, and also make sure it's super simple to apply. When I do so, I try to put myself in the shoes of the person I'm addressing and imagine what they might think and what their objections might be. Oftentimes this goes smoothly, but sometimes I have a really hard time and it simply goes nowhere. It's impossible to predict, really - you just have to keep a positive attitude and hope you can get your point across. When you see your pitch not working, you try to be flexible, find other angles and better reasoning, adapt your work to the received feedback, iterate, etc etc. And hope that eventually it works.
Yes, it sucks to have to prove to other people that your work is sound and your reasoning is right, especially when you *know* it is. Unfortunately, *they* often don't. They might also have their own limitations and flaws when it comes to communication. Ever been to a technical interview where you knew you could totally ace the job, but left the interviewer totally unconvinced? Unfortunately, winning over people is a completely different discipline, but a necessary one nonetheless.
> the way to solve it is with a conversation. I think the chances for that happening evaporated a long time ago.
I actually have a feeling that they didn't yet. Did you ask yourself why bcachefs wasn't removed in this -rc1? I keep thinking about this, and the only reason I see is because Linus doesn't really want to, so he waits for some reason not to. I think you can still give him that reason. Honestly, just writing the same thing you wrote here, about being tired and frustrated, being on edge, having to deal with non-technical issues, etc etc, and asking for a time-out for a release cycle to rethink the communication strategy, might just be enough. The only thing which would be absolutely necessary is to treat people with respect - we are all humans, we all have flaws, we're all struggling, and no one really knows how to solve this, so we have to try and be kind to one another in the process, even when nothing seems to work. And hope for the best.
Posted Aug 13, 2025 15:49 UTC (Wed)
by tuna (guest, #44480)
[Link] (2 responses)
The value of Bcachefs in Linux is 1 % of the total value of Linux. For comparison, the AMD drivers are worth 5 %.
However, the people developing these specific things value them much, much more. They can be a source of income, pride, community and A LOT of effort has been put into them. But from Linus T's perspective they are just a part of his project, and if he has to spend more time on those things than the value they provide something is wrong. So if Linus T has to spend 10% of his time on Bcachefs he might think something is off. But the developers of this subsystem might not understand that because that subsystem is the most important thing to them.
Posted Aug 13, 2025 16:17 UTC (Wed)
by koverstreet (✭ supporter ✭, #4296)
[Link] (1 responses)
Linux started out as a community effort and, in theory, still is. I think we should be true to our roots, and I think we should be thinking about our long term future, too. Big tech companies come and go, and they're always going to prioritize their short term needs; it's up to us, the engineers, to care about our long term priorities and engineering culture.
bcachefs is a real community effort. I've declined to take funding when that would have meant a shift away from the community focus; I've always been up front with funders that this is for everyone and that most of my development time goes to things the community works (not that it's been a problem actually, I've always been able to find funders that agree with me on that).
I don't do the "I'll fix a bug that affects your company as part of a quid pro quo" that some other maintainers talk about; I just fix bugs, as long as the person reporting them is reasonably professional and pleasant to work with or willing to learn; I try to work with everyone and I don't discriminate.
And in my opinion, that community focus has been one of the big ingredients in the success that we've found. It builds a lot of trust and loyalty in the userbase when they know that if they hit a serious bug I'll drop what I'm doing and fix it - I prioritize and triage my work based on user severity, nothing else. Being available to the community on IRC has built a very active and proficient community, which has been an incredible asset and allowed us to stabilize and close bugs fast; users consistently report real noticeable improvements in reliability and stability with every release; they see the bugs they report being fixed which makes them happy, and I get to stabilize quickly instead of dragging this out for 5 or 10 years.
The community is big and active, and it's been drawing in new people, and that's huge. They're mostly not old timers that post on lwn or mailing lists, but that's good, that means we're drawing in new people and teaching them new things.
So looking at all that - if upstream is trying to dictate excessively and it's coming into conflict with doing right by the bcachefs community and userbase, I'm going to pick doing right by the community every time. It's just a no brainer. I absolutely take feedback and make adjustments based on upstream feedback when I can and it's reasonable, but when it's unreasonable (i.e. zero effect on the rest of the kernel), and it's coming into conflict with doing right by users, and the only justification is "Linus gets to dictate" - I think it's better for everyone in the long run if I say "no, that's not the right thing to do; I need to do right by users and community".
Posted Aug 13, 2025 19:22 UTC (Wed)
by tuna (guest, #44480)
[Link]
"bcachefs is a real community effort. I've declined to take funding when that would have meant a shift away from the community focus; I've always been up front with funders that this is for everyone and that most of my development time goes to things the community works (not that it's been a problem actually, I've always been able to find funders that agree with me on that)."
In Linux there are something like 10 filesystems. I am sure that, when finished, Bcachefs will be the absolute best of them. However, there is one driver for AMD GPUs. If that is removed all people with AMD HW will be affected and will basically not be able to use Linux.
Now you and others are super invested in Bcachefs (which is great!), but that is not the case for the wider Linux community. And if you cause people like Linus T to spend a lot of their time "managing" (I couldn't find a better word) Bcachefs instead of doing things they would rather do they might decide that the maintainer effort is not worth it and skip Bcachefs in mainline Linux.
Even if Bcachefs is removed from Linux now, I hope it can come back when it is more mature both technically and in its development process. Then it will hopefully be less of a burden for the "top" Linux maintainers and be managed like all other file systems (even if it is the absolute best!).
Anyway, thanks for your work. We all want better technology!
Wol
That's right, Linus decides.
is to Stop Digging.
If your fs requires this feature to function properly, it simply means that the fs is not ready for production use yet. It means that your users can't use the fs from the mainline yet. They need to wait another cycle for all features to get in or use out of tree patches.
As simple as that.
Subject: [GIT PULL] bcachefs: fixes for -rcX and one targeted recovery feature (journal_rewind)
Hi Linus,
I realize this is after -rc1 and I respect the freeze on new features. I’m sending a
small fixes pull, and I’m also asking whether you’d consider an exception for a
single, tightly scoped feature in bcachefs: `journal_rewind`.
Why I’m asking
--------------
TL;DR: `journal_rewind` addresses a real, current pain point users are hitting in the
field during recovery scenarios. Enabling it upstream now meaningfully improves
reliability and shortens time-to-fix for follow-on issues. bcachefs is still marked
experimental, and this change is self-contained to the journal recovery path.
What it does (high level)
-------------------------
`journal_rewind` gives recovery a bounded, controlled way to step back to a known-good
point when replay lands on an inconsistent tail, instead of failing hard or requiring
heavy manual intervention. In practice, it turns “stuck until deep surgery” into
“mounts cleanly with prior committed state,” which is a clear net win for users.
Scope & risk
------------
- Touches only bcachefs’s journal/recovery code; no VFS or cross-subsystem changes.
- No on-disk format change (and no format migration). It relies on existing metadata.
- Can be disabled by a mount/config knob if it causes trouble, so we have an immediate
off-switch without reverting the whole series.
- Revert is straightforward: it’s a linear series with no collateral refactors.
Testing
-------
- Local power-failure and crash-injection loops with journal fuzzing.
- Stress under mixed rw/metadata workloads; no new lockdep splats observed.
- Boots and smoke on x86_64 and arm64 defconfigs.
- Reporter repros that previously led to painful recovery now pass with clean mounts.
Why not wait?
-------------
I’m trying to minimize churn, but holding this back prolongs an issue users are already
running into, and it slows our ability to iterate on the recovery corner cases that
surface only at scale. Getting this upstream now helps users today and accelerates
stabilization of the experimental filesystem overall.
What I’m *not* asking
---------------------
- No grab bag: this is not bundled with unrelated refactors or features.
- If an exception isn’t acceptable, I will drop `journal_rewind` and resend fixes only.
I won’t push further on it this cycle.
Git
---
The branch contains:
* fixes-only commits
* the `journal_rewind` series (clearly separated)
(Branch/tag available as usual; happy to resend with whatever naming you prefer.)
Thanks for considering the exception. If any part of this looks borderline for -rc,
tell me what to drop and I’ll do it immediately.
— Kent
Experience and theory