I think this was the right thing

Posted Oct 1, 2025 4:23 UTC (Wed) by rolexhamster (guest, #158445)
In reply to: I think this was the right thing by koverstreet
Parent article: Bcachefs removed from the mainline kernel

A good fraction of the bcachefs users are using it because they've been burned and lost entire filesystems on btrfs, and needed something more reliable - they've found bcachefs to be that.

It's easy to state the above, but the usual "citation needed" caveat applies.

Is there evidence to concretely state that, right now, btrfs (as of linux kernel 6.17) is less stable / more fragile than bcachefs (as of the DKMS version today) ?

This isn't about the historical (in)stability of btrfs at its development stage compared to (in)stability of bcachefs at the equivalent point in its development stage. This is also not about differences in feature sets.

This is about how things stand right now, as in is btrfs or bcachefs more reliable today, for plain non-fancy desktop usage.

(Disclaimer: I don't use btrfs or bcachefs. All my systems are on ext4, mainly due to inertia).

I think this was the right thing

Posted Oct 1, 2025 5:02 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

Apparently people still don't trust btrfs's own advanced RAID modes: https://btrfs.readthedocs.io/en/latest/Status.html I absolutely got burned by it when booting degraded arrays and trying to reassemble them.

Another thing, btrfs support for multi-device setups is still not great, while bcachefs shines there. It's possible to configure complex cache hierarchies, so that most data is stored on fast SSDs and only periodically migrated to slower HDDs: https://bcachefs.org/Caching/

I think this was the right thing

Posted Oct 1, 2025 10:12 UTC (Wed) by gdamjan (subscriber, #33634) [Link]

meanwhile, bcachefs didn't mount a single device filesystem when kernel version (and bcachefs version) changed. luckily it wasn't the root device but an auxiliary one.

I do like to see the bcachefs project succeed, but its glory seems to me to be greatly exaggerated.

I think this was the right thing

Posted Oct 1, 2025 10:21 UTC (Wed) by ferringb (subscriber, #20752) [Link]

> Apparently people still don't trust btrfs's own advanced RAID modes: https://btrfs.readthedocs.io/en/latest/Status.html I absolutely got burned by it when booting degraded arrays and trying to reassemble them.

I hope you're not referring to raid5/6... which should frankly be behind CONFIG_BTRFS_EXPERIMENTAL since it is *just* for experimental/development use. The design lacks protections against the write hole problem, something upstream has tried to inform people of but seems to frequently get ignored. Quoting:

---
The RAID56 feature provides striping and parity over several devices, same as the traditional RAID5/6. There are some implementation and design deficiencies that make it unreliable for some corner cases and the feature should not be used in production, only for evaluation or testing. The power failure safety for metadata with RAID56 is not 100%.
---

The duplication functionality (raid1/c2, c3, etc) works fine and is what people use.

I think this was the right thing

Posted Oct 2, 2025 6:34 UTC (Thu) by koverstreet (✭ supporter ✭, #4296) [Link] (14 responses)

> Is there evidence to concretely state that, right now, btrfs (as of linux kernel 6.17) is less stable / more fragile than bcachefs (as of the DKMS version today) ?

The first thing to be aware of is that we don't have the kind of hard numbers on filesystem reliability that we'd like, no one does. For determining if a filesystem is 99.9%, 99.99%, etc. reliable - you'd really need telemetry.

So we have to go off of anecdotal data.

All that does in fact point to that:

- yes, btrfs does indeed still have reliability problems, and no one seems to know how to debug them

- bcachefs has been shaping up _fast_, and I think the userbase is big enough that I can conclusively say it's less fragile than btrfs. We simply haven't been seeing unrecoverable issues at any stage of development - barring ~1 or 2 extreme outliers that were solved quickly - and we seem to be pretty well past stability issues in general.

I think this was the right thing

Posted Oct 2, 2025 15:07 UTC (Thu) by rolexhamster (guest, #158445) [Link] (13 responses)

So we have to go off of anecdotal data.

Okay, where? As in what specific bug reports on, say, kernel bugzilla or LMKL.

All that does in fact point to that: ...

Anecdotes are not reliable facts, though to give you the benefit of the doubt, the above may have been an unfortunate or unintended choice of words.

At the same time, one cannot rely on anecdotes to say that btrfs is in worse/better shape than bcachefs. Relying on anecdotes to make an argument sounds more like FUD.

... I can conclusively say [bcachefs is] less fragile than btrfs.

How, quantitatively? Going off vibes is not the same as actually measuring the number of corruptions, crashes, failed recoveries, etc. There needs to be a concerted side-by-side comparison (such as stress testing, deliberate random corruption, etc), similar to what Backblaze does with hard drive reliability measurements.

Back when development of bcachefs started (as in ~10 years ago), there were a lot of lingering questions on the reliability of btrfs. At that point in time, these questions were a fair motivation for aiming to create a better file system. However, since then btrfs has improved and now has the advantage of being stabilised (except RAID5/6) for a far longer time period than bcachefs.

I think this was the right thing

Posted Oct 2, 2025 15:16 UTC (Thu) by koverstreet (✭ supporter ✭, #4296) [Link] (12 responses)

Look, if you're trying to argue btrfs is fit for purpose, you're smoking something.

https://news.ycombinator.com/item?id=45432749
https://news.ycombinator.com/item?id=45428033
https://news.ycombinator.com/item?id=45429813

This is just from the most recent hacker news filesystem thread where btrfs came up, which took me about 5 minutes. If I spent an afternoon I could find you hundreds of user reports like this.

I think this was the right thing

Posted Oct 2, 2025 15:33 UTC (Thu) by paulj (subscriber, #341) [Link] (7 responses)

One of the first posts on that yc story, from someone who has long sponsored you apparently, has this:

> From a long time (and now former) sponsor: if these posts are actually from you, please stop.

If btrfs sucks and has many issues, users and others will notice. As a dev of a filesystem in a similar space, you should of course take note and learn what you can from those issues. But... is it really helping you to be out there posting in various forums about how bad btrfs is and how much better your fs is?

I think this was the right thing

Posted Oct 2, 2025 16:22 UTC (Thu) by koverstreet (✭ supporter ✭, #4296) [Link] (6 responses)

> If btrfs sucks and has many issues, users and others will notice. As a dev of a filesystem in a similar space, you should of course take note and learn what you can from those issues. But... is it really helping you to be out there posting in various forums about how bad btrfs is and how much better your fs is?

Excuse me? Follow the thread back to see where this started: bcachefs was called a "meme filesystem", and I was just explaining that there's a real demand for something better than btrfs.

_Every single time there's a bcachefs thread_ this happens. I start out explaining "yes, btrfs really does have issues, bcachefs really is delivering something better" and then people coming out swinging accusing me of attacking btrfs.

You're just trolling.

Let's stop here

Posted Oct 2, 2025 16:26 UTC (Thu) by jake (editor, #205) [Link] (1 responses)

[ Not directed at Kent only, by any means ]

This sub-thread does not seem to be going anywhere useful for anyone. Can we please stop it here?

thanks,

jake

Let's stop here

Posted Oct 2, 2025 17:06 UTC (Thu) by koverstreet (✭ supporter ✭, #4296) [Link]

Thanks Jake, I appreciate it.

I think this was the right thing

Posted Oct 3, 2025 2:47 UTC (Fri) by interalia (subscriber, #26615) [Link] (3 responses)

> bcachefs was called a "meme filesystem"

I just want to say that that wasn't how I read it in context and I think you may have overreacted. The subthread was started by DemiMarie:

> I don’t think that experimental filesystems belong upstream. [snip] The need to get fixes out super quickly to recover user’s data does not mix well with the upstream kernel’s release cycle.

In reply, roryi said:

> It was very clearly marked as being experimental, though. I find it hard to understand who could possibly be using an experimental filesystem in a bleeding-edge kernel for important data without backups. [snipped]

> Has someone been actively recommending use of bcachefs to naive end users? Is there such a thing as a meme filesystem - and if so, has bcachefs somehow become one? Is there a rogue forum poster / youtuber / tiktoker out there tricking people into such risky behaviour without realising the implications?

I understood this reply to DemiMarie to be saying: "bcachefs was okay being upstream, because it was clearly marked as experimental. I would normally expect anyone using a filesystem marked as such in the most recent kernels would only be using it with backups, or with expendable data."

I don't think roryi was asserting that bcachefs IS a meme filesystem. In response to DemiMarie talking about data recovery they were asking if numerous users had somehow used a newer experimental filesystem without precautions, and IF so asking how they came to do so, finally asking speculatively if use of a new experimental filesystem somehow became a social media thing and thus encourage inexperienced users to do so. I thought it was a slightly odd thing to ask but I didn't consider it to be denigrating bcachefs's actual technical quality, so the entire subthread after that became an overreaction.

I think this was the right thing

Posted Oct 3, 2025 3:29 UTC (Fri) by koverstreet (✭ supporter ✭, #4296) [Link] (2 responses)

Well, if you'd like my take on who was actually using it: serious users, who were doing their homework first, would often come by the IRC channel (or Reddit) and ask if it was ready for use.

Up until recently (~6 months ago?), it seemed I was perpetually telling the people who sounded less risk averse to "check back in six months"; this included people who had been bit by recent data loss on btrfs even when we'd hit the point where similar data loss was going to be unlikely on bcachefs (that's never been much of a concern on bcachefs, although we used to have a lot of "downtime event" bugs, until those were worked through).

There have been a few people who clearly ignored the experimental label and jumped in much too early and ended up unhappy and disgruntled, but - the people who ended up the most disgruntled are also the ones who like to tout their storage industry experience, so can't say I feel much for those guys :)

Generally speaking it's been a lot of younger hobbyists/tinkerers, or older people with spare machines who know exactly why we need this and have been investing significant amounts of time QAing and beating on it.

And for all the people I can talk because I've been interacting with, I should also mention there's a much larger userbase who hasn't been pushing all the various features quite as hard for whom it's been working just fine.

So there's a good mix. Some people are legitimately using bcachefs in production - less critical or less demanding scenarios - some are testing the limits, some people are putting it onto any old junk and running real workloads (!).

Especially in the last six months, it's gotten sufficiently solid that the "people running it on random junk with real production workloads crown" have been coming up with all kinds of stories of it quietly and happily humming along through dying hardware and god knows what else that people seem fairly confident would have rather upset btrfs or even ZFS.

So I don't think there's any reason for people to be taking special precautions at this point. The only special precaution you ever needed to take was "if it wedges itself, be patient and feed us logs/metadata dump so we can get you bugfixes", and even that's pretty much stopped and the bugs have become pretty mundane.

I think this was the right thing

Posted Oct 3, 2025 4:18 UTC (Fri) by interalia (subscriber, #26615) [Link] (1 responses)

That's fine and all Kent, but I didn't ask about bcache's current users and maturity.

Rather than diverting my post into discussion of that, I'd be more interested in an acknowledgement that you might have misinterpreted what roryi said, or reasons why you think your interpretation was indeed correct if there was further context or subtext I didn't see.

Please, stop

Posted Oct 3, 2025 4:25 UTC (Fri) by jake (editor, #205) [Link]

[ again, this is not only aimed at interalia ]

This sub-thread is really not going anywhere very useful for anyone.

Please end it here.

thanks,

jake

I think this was the right thing

Posted Oct 2, 2025 15:47 UTC (Thu) by rolexhamster (guest, #158445) [Link] (1 responses)

if you're trying to argue btrfs is fit for purpose, you're smoking something.

Really Ken? The above response is quite disrespectful and dismissive.

My aim here is to have an intelligent (and polite) conversation about various potential merits/demerits of btrfs and bcachefs. As I've stated before, I don't use either btrfs or bcachefs, so all the questions and observations I have made are coming from a 3rd party point of view.

I think this was the right thing

Posted Oct 2, 2025 16:17 UTC (Thu) by koverstreet (✭ supporter ✭, #4296) [Link]

Why not do me the courtesy of getting my name right?

I think this was the right thing

Posted Oct 7, 2025 0:03 UTC (Tue) by Kamilion (guest, #42576) [Link] (1 responses)

Uh.... why do I see zero references to btrfs-recovery in any of these?

They're great anecdotes, but my realworld experience has been that most other filesystems have no recovery tools at all.
And I don't mean, "fix the filesystem", I mean, "other catastrophic failure occurred."

I've had a *lot* of spindles die, and have always been able to ask btrfs to drop the disk out of the array *before* any further problems occurred.

What other filesystem can even *do* that safely?

As far as I know, the only recovery tools for ext2/3/4, xfs, and the microsoft filesystems, other than testdisk, are commercial software.
Those hacker news posts are full of doing very silly things which will also break most recovery tools. Tossing md in the mix makes it nearly unrecoverable. lvm metadata loss has eaten more arrays than I can count in the field, which is why we ditched it a decade ago. Tradeins and RMAs went down significantly afterwards.

So far, I have *never* encountered an operational spindle that btrfs-recover could not extract customer data from.

*everything fails*. Having a path for recovery after failure is priceless. btrfs gigablocks are such a simple concept writ large that contribute to that path's viability.

XFS's correctness tests are quite nice and all, but if the on-disk structures present a problem to recover from without mounting, it's a major issue for deployments hosting unique data.

For every list of people having problems you can search up in 5 minutes, there's another batch of us who have found it reliable, and we are largely silent, busy doing our workloads.

I think this was the right thing

Posted Oct 14, 2025 11:25 UTC (Tue) by taladar (subscriber, #68407) [Link]

Most filesystems "can't" drop a disk out of an array before a problem occurs because RAID is completely out of scope for them, that is what mdraid or hardware RAIDs or LVM are used to solve with those.