Btrfs stability and production-readiness

Posted Feb 22, 2016 22:04 UTC (Mon) by pr1268 (guest, #24648)
In reply to: I don't get it... by malor
Parent article: Kirkland: ZFS licensing and Linux

If they haven't managed to get it fully stable *yet*, there's probably something broken in the basic design.

Umm, the Wiki says it's "no longer unstable". It does mention that Btrfs is still "under heavy development"—mostly for feature enhancements and performance improvements, from what I gather.

I think it's probably a fairly safe bet that btrfs is never going to be production-ready.

Novell/SUSE would beg to differ with you on that one.

Of course, your definition of "stability" and "production-readiness" may differ. YMMV.



Btrfs stability and production-readiness

Posted Feb 22, 2016 22:27 UTC (Mon) by malor (guest, #2973) [Link] (21 responses)

>Umm, the Wiki says it's "no longer unstable".

It says the disk format is stable. There is a vast gulf between that and being functionally stable (i.e., you can actually use it and trust it).

>Novell/SUSE would beg to disagree with you on that one.

After seeing the sheer number of catastrophic, stupid failure stories about btrfs on LWN and other places... I suspect they're probably making a fairly severe mistake.

>Of course, your definition of "stability" and "production-readiness" may differ. YMMV.

"I can trust it not to crash and never to lose data without a hardware failure" is a pretty good definition for filesystems, and I strongly suspect either the btrfs codebase or dev team is not capable of getting it there.

It's been nine years now, and it's still not ready yet. To me, that speaks of a design that's either fundamentally wrong in some way, or which can't be reliably implemented by humans.

Btrfs stability and production-readiness

Posted Feb 23, 2016 0:04 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link] (10 responses)

>I suspect they're probably making a fairly severe mistake.

They have filesystem developers on board and they have disabled features they don't consider stable. Unless you can point to any real bug reports with their implementation, whatever feelings you may have aren't based on evidence.

Btrfs stability and production-readiness

Posted Feb 23, 2016 5:55 UTC (Tue) by malor (guest, #2973) [Link] (2 responses)

>whatever feelings you may have aren't based on evidence.

Which is why I said I suspect they're making a mistake, rather than asserting that they absolutely are.

And that feeling is based on evidence, just not evidence with their specific code base. Maybe they've whipped it into shape, but I certainly wouldn't bet my own data on it.

Btrfs stability and production-readiness

Posted Feb 23, 2016 18:17 UTC (Tue) by Wol (subscriber, #4433) [Link] (1 responses)

> And that feeling is based on evidence, just not evidence with their specific code base. Maybe they've whipped it into shape, but I certainly wouldn't bet my own data on it.

aiui, they have btrfs configured so your two choices are basic or mirrored. And that is working fine. There is a trickle, and I mean trickle, of reports where a combination of snapshots and disk-full causes a serious problem.

Beyond that, it seems pretty stable and solid.

(I follow the linux-raid list, and this sort of stuff gets touched on, so this appears to be the "current state of play".)

Cheers,
Wol

Btrfs stability and production-readiness

Posted Feb 24, 2016 10:32 UTC (Wed) by jezuch (subscriber, #52988) [Link]

> There is a trickle, and I mean trickle, of reports where a combination of snapshots and disk-full causes a serious problem.

Btrfs has an annoying habit of returning ENOSPC while df shows plenty of GBs left, yes. Since I maintain a history of snapshots for quick-and-dirty "oops, let's restore from an older snapshot" kind of "backups", I noticed that applications react *very* poorly to out-of-space conditions.
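
For illustration, here is a rough sketch (in Python, with hypothetical paths) of the kind of defensive handling an application needs if it wants to survive a surprise ENOSPC cleanly instead of half-writing its own state. Note that on btrfs the error can arrive even while statvfs()/df still report free data blocks, because it is the metadata or unallocated chunk space that has run out:

    import errno
    import os

    def write_atomically(path, data):
        """Replace 'path' with 'data' via a temp file, surviving ENOSPC.

        Sketch only: the .tmp naming and the paths are illustrative.
        """
        tmp = path + ".tmp"
        try:
            with open(tmp, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())   # ENOSPC can surface here, not just at write()
            os.rename(tmp, path)       # swap in the new copy only once it is safely on disk
        except OSError as e:
            if e.errno == errno.ENOSPC:
                st = os.statvfs(os.path.dirname(path) or ".")
                # df/statvfs may still show free space; on btrfs the failure is
                # often in the metadata or unallocated chunks instead
                print("ENOSPC despite %d bytes reported free"
                      % (st.f_bavail * st.f_frsize))
                try:
                    os.unlink(tmp)     # drop the partial temp file, keep the old copy intact
                except OSError:
                    pass
            raise

The old copy stays intact either way; that is roughly the step most applications skip, which is why they fall over so badly when btrfs says no.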

Btrfs stability and production-readiness

Posted Feb 23, 2016 9:18 UTC (Tue) by rleigh (guest, #14622) [Link] (6 responses)

They may well have disabled buggy features. But what about fundamental design flaws? Did they fix the repeated unbalancing problem? Having a filesystem that can go read-only at some indeterminate time in the future is simply unacceptable! A regular rebalance kills performance and still doesn't provide any guarantees, so it isn't a proper solution.
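
For reference, the "regular rebalance" people run is usually a filtered one from cron. A sketch (in Python; the mount point and threshold are just examples) of what that maintenance pass looks like, since the -dusage/-musage filters only repack mostly-empty chunks and hand their space back to the unallocated pool:

    # Sketch of a periodic filtered balance; mount point and threshold are
    # examples, not recommendations.
    import subprocess

    MOUNT = "/mnt/data"   # hypothetical btrfs mount point

    def filtered_balance(mount, usage_pct=50):
        # Repack only data/metadata chunks that are under usage_pct% full,
        # returning their space to the unallocated pool.  An unfiltered
        # "btrfs balance start <mount>" rewrites every chunk and is the
        # variant that really hurts while it runs.
        subprocess.run(
            ["btrfs", "balance", "start",
             "-dusage=%d" % usage_pct, "-musage=%d" % usage_pct, mount],
            check=True)

    if __name__ == "__main__":
        filtered_balance(MOUNT)

Even filtered, it still competes with real I/O while it runs, so it is mitigation rather than a fix.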

Btrfs stability and production-readiness

Posted Feb 23, 2016 14:18 UTC (Tue) by niner (subscriber, #26151) [Link] (5 responses)

The balancing issue has been fixed for quite a while now.

All in all, your arguments seem to be based on outdated information. May I suggest viewing your experience in that light when you enter the next discussion about this topic?

Btrfs stability and production-readiness

Posted Feb 23, 2016 17:22 UTC (Tue) by malor (guest, #2973) [Link] (4 responses)

I certainly would consider doing so, if anyone would actually give me anything substantive.

What I've seen in this thread has been weak agreement with my belief, specific complaints about specific bugs, and absolutely nothing concrete from the other side at all.

I'm absolutely willing to re-evaluate my position. The btrfs devs standing up and saying "it's done" would be a HUGE step toward doing so. The fact that they still haven't, nine years later, really makes me think that it's never really going to work properly.

Btrfs stability and production-readiness

Posted Feb 23, 2016 18:24 UTC (Tue) by Wol (subscriber, #4433) [Link] (3 responses)

> I'm absolutely willing to re-evaluate my position. The btrfs devs standing up and saying "it's done" would be a HUGE step toward doing so. The fact that they still haven't, nine years later, really makes me think that it's never really going to work properly.

RAID isn't done yet - AIUI that's very definitely experimental.

And as I said, there is a known corner-case with disk full and snapshots - which they can't debug because they can't reproduce it reliably :-(

But have the devs of other file systems - ext, for example - ever stood up and said that? Almost every file system has issues, and ext3 was a nightmare by all accounts ...

I'm interested in databases, and in integrity, and that is a HARD problem. Different filesystems take different approaches, different databases take different approaches, and the interactions are, shall we say, interesting ... don't blame a filesystem for behaving "as advertised" when a "buggy" app trashes your data because its devs didn't read the spec ... and I'm looking at ext3 here ... as I said, this problem is HARD.

Cheers,
Wol

Btrfs stability and production-readiness

Posted Feb 23, 2016 18:39 UTC (Tue) by pizza (subscriber, #46) [Link]

> shall we say, interesting ... don't blame a filesystem for behaving "as advertised" when a "buggy" app trashes your data because its devs didn't read the spec ... and I'm looking at ext3 here ...

There's a big difference between recently written data being lost due to an unclean shutdown (i.e. the ext3 situation) and the filesystem eating itself on its own across a clean unmount/mount cycle.

Both of my btrfs experiments ended with massive, filesystem-wide data loss.

Btrfs stability and production-readiness

Posted Feb 24, 2016 2:28 UTC (Wed) by pr1268 (guest, #24648) [Link] (1 responses)

But have the devs of other file systems - ext for example - ever stood up and said that?

Umm, yes, Ted Ts'o did just that for Ext4 in October 2008.

And Linux 2.6.28 had Ext4 marked as "stable" upon its release two months later.

Btrfs stability and production-readiness

Posted Feb 24, 2016 7:36 UTC (Wed) by niner (subscriber, #26151) [Link]

He said (full quote) "The ext4 filesystem is getting stable enough that it's time to drop the "dev" prefix. Also remove the requirement for the TEST_FILESYS flag."

That's quite different from the "it's done" that's requested from the btrfs developers. It's more along the lines of btrfs no longer being marked experimental.

Btrfs stability and production-readiness

Posted Feb 23, 2016 1:46 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (9 responses)

> After seeing the sheer number of catastrophic, stupid failure stories about btrfs on LWN and other places... I suspect they're probably making a fairly severe mistake.
You've probably never used ZFS on FreeBSD...

Btrfs stability and production-readiness

Posted Feb 23, 2016 5:56 UTC (Tue) by malor (guest, #2973) [Link] (8 responses)

>You've probably never used ZFS on FreeBSD...

I haven't. Is this relevant to btrfs somehow?

Btrfs stability and production-readiness

Posted Feb 23, 2016 6:10 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (7 responses)

It's about as crashy as btrfs. So it's not like ZFS is somehow fundamentally more reliable.

Btrfs stability and production-readiness

Posted Feb 23, 2016 6:16 UTC (Tue) by malor (guest, #2973) [Link] (5 responses)

Well, maybe it's not a good idea either, then.

But it sure seems to have a lot of very vocal supporters out there. I don't remember seeing anyone bitching about it.

Well, no, I remember seeing some problems with ZFS on Linux... the biggest complaint I saw was that it needs a lot of RAM to run well. (I think the minimum people were recommending was 8 gigs on a machine that wasn't doing anything but file serving, and the more you could throw on there, the better.)

But on BSD? I don't remember reading any complaints at all, and a hell of a lot of praise. I've been contemplating building a FreeBSD server because of it.

Btrfs stability and production-readiness

Posted Feb 23, 2016 8:07 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

ZFS is OK if you have backups and want to have a nice fileserver with cheap snapshots. Its RAID support is also still better than in btrfs.

But that's pretty much it. ZFS has its own pagecache so it needs a lot of RAM and it's not terribly fast for a lot of workloads.

On the other hand, btrfs is designed to be a good Linux citizen and plays nicely with its pagecache. So when we get hugepages for pagecache you can be sure that btrfs will support them, for example.

Btrfs stability and production-readiness

Posted Feb 23, 2016 8:16 UTC (Tue) by malor (guest, #2973) [Link]

>But that's pretty much it.

It has the reputation, at least on BSD, of being very reliable. Personally, I find that to be an extremely attractive feature in a filesystem... as in, the single most attractive one.

I'm willing to accept that it might not be as good as I've heard, but a vague assertion that it's no better than btrfs doesn't sound especially credible, on its own. The problems with btrfs are legion, and I haven't seen substantial ZFS complaints, except that it needs a lot of RAM to work well.

Oh, and that because of the RAM thing, it's really best on 64-bit systems. On FreeBSD, at least, UFS is supposed to be better for i386 kernels.

Btrfs stability and production-readiness

Posted Feb 23, 2016 9:56 UTC (Tue) by rleigh (guest, #14622) [Link] (2 responses)

The RAM usage is a non-issue. It's only excessive if you enable deduplication (so don't enable it). And you can constrain the ARC if you need to; it will still work just fine. It's a high-end filesystem, so if I'm running a fileserver, then who cares if it's using lots of RAM? It's not like it's going to be used for anything else. Likewise for my workstation with RAM to spare. It's using it to fulfil the system's primary purpose.
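
For what it's worth, constraining the ARC is a one-line tunable on either platform. Here is a sketch for ZFS on Linux (the 4 GiB cap is just an example; on FreeBSD the equivalent knob is the vfs.zfs.arc_max loader tunable):

    # Sketch: cap the ZFS ARC at 4 GiB on a ZFS-on-Linux box (run as root).
    ARC_MAX_BYTES = 4 * 1024 ** 3   # example limit, not a recommendation

    with open("/sys/module/zfs/parameters/zfs_arc_max", "w") as f:
        f.write(str(ARC_MAX_BYTES))

    # To make it persistent across reboots, set
    #   options zfs zfs_arc_max=4294967296
    # in /etc/modprobe.d/zfs.conf instead.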

Regarding performance, at the low end with a single disk or simple mirror ZFS will be slow in comparison with ext4 and other simpler filesystems. Those data integrity guarantees don't come for free. The performance is still better than Btrfs though, which is generally abysmal, but Btrfs has to pay the same price here, which is then compounded by design mistakes and implementation problems. The thing is though, that ZFS *scales* up where other filesystems can't. You can fill up a pool with multiple vdevs of mirrors or raid sets, and have it stripe the reads and writes over the array of arrays, and add SSD cache and log devices (which can also be mirrored). This can perform very well.
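
To make the "array of arrays" concrete, here is a sketch of such a pool (device names are placeholders): two mirror vdevs striped together, an SSD read cache, and a mirrored SSD log device to absorb synchronous writes:

    # Sketch only: device names (da0..da3, nvd0..nvd2) are placeholders.
    import subprocess

    subprocess.run(
        ["zpool", "create", "tank",
         # stripe across two mirror vdevs; reads and writes spread over both
         "mirror", "/dev/da0", "/dev/da1",
         "mirror", "/dev/da2", "/dev/da3",
         # L2ARC read cache on an SSD
         "cache", "/dev/nvd0",
         # mirrored SLOG for synchronous writes
         "log", "mirror", "/dev/nvd1", "/dev/nvd2"],
        check=True)

More mirror vdevs can be added to the pool later with "zpool add", which is where the scaling comes from.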

As for Btrfs being a "good citizen" (the unwritten implication being that ZFS is somehow "bad"), since it's a terrible filesystem it doesn't really matter either way. It's an irrelevant consideration.

Btrfs stability and production-readiness

Posted Feb 23, 2016 10:12 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

> The RAM usage is a non-issue. It's only excessive if you enable deduplication (so don't enable it).
Yet btrfs manages dedup without gobbling up all of the RAM.

> It's a high-end filesystem, so if I'm running a fileserver, then who cares if it's using lots of RAM?
I certainly do. I prefer my memory to be used for something that's relevant, rather than letting it sit as a deadweight. And I'm using btrfs on my private machine for easy snapshots (+send/receive for backups) and on our Docker farm (for obvious reasons).

Dedicated file servers? That's sooo last millennium.
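
For anyone curious, the snapshot-plus-send/receive arrangement mentioned above looks roughly like this. A sketch with hypothetical subvolume and backup paths, and error handling kept to a minimum:

    # Sketch of btrfs snapshot + (incremental) send/receive backups.
    import subprocess

    SUBVOL   = "/home"        # subvolume to back up (hypothetical)
    SNAP_DIR = "/snapshots"   # where read-only snapshots are kept
    BACKUP   = "/mnt/backup"  # btrfs filesystem on the backup disk

    def snapshot_and_send(name, parent=None):
        snap = "%s/%s" % (SNAP_DIR, name)
        # send requires a read-only snapshot, hence -r
        subprocess.run(["btrfs", "subvolume", "snapshot", "-r", SUBVOL, snap],
                       check=True)
        send_cmd = ["btrfs", "send", snap]
        if parent:
            # incremental: only the delta against the previous snapshot is sent
            send_cmd[2:2] = ["-p", "%s/%s" % (SNAP_DIR, parent)]
        send = subprocess.Popen(send_cmd, stdout=subprocess.PIPE)
        subprocess.run(["btrfs", "receive", BACKUP], stdin=send.stdout, check=True)
        send.stdout.close()
        if send.wait() != 0:
            raise RuntimeError("btrfs send failed")

    # one full backup, then incrementals against the previous snapshot:
    # snapshot_and_send("2016-02-23")
    # snapshot_and_send("2016-02-24", parent="2016-02-23")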

> The performance is still better than Btrfs though, which is generally abysmal, but Btrfs has to pay the same price here, which is then compounded by design mistakes and implementation problems.
Actually, btrfs has a much more thought-out design. ZFS is already hobbled by its reliance on the fixed block sizes that later had to be retrofitted for compression.

BTRFS suffers from the same problems as ZFS during its first 10 years or so.

> The thing is though, that ZFS *scales* up where other filesystems can't.
And this is just marketing bullshit. Linux LVM can use SSDs for metadata devices, and BTRFS supports that natively as well.

About the only missing piece is RAID-Z, and that's being worked on. Regular mirroring/striping already works perfectly fine.

Btrfs stability and production-readiness

Posted Feb 24, 2016 12:42 UTC (Wed) by nye (subscriber, #51576) [Link]

>Yet btrfs manages dedup without gobbling up all of the RAM.

Only in the sense that it doesn't do it, but provides an interface for some other utility to dedupe by creating block references, thus making it Somebody Else's Problem. Despite considerable opposition, there is some work on in-line dedupe, but it's experimental (not sure if it's in-tree yet) and ... requires tonnes of RAM.
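
For the record, the interface in question is the FIDEDUPERANGE ioctl (originally btrfs's BTRFS_IOC_FILE_EXTENT_SAME): the caller nominates byte ranges it believes are identical, and the kernel verifies them and, only if they really match, rewrites the destination to share the source's blocks. A rough Python sketch of one such call, roughly what duperemove issues per candidate pair, with the struct layout written out by hand and purely hypothetical paths:

    import fcntl
    import struct

    FIDEDUPERANGE = 0xC0189436   # _IOWR(0x94, 54, struct file_dedupe_range)

    def dedupe_range(src_path, dst_path, offset, length):
        with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
            # struct file_dedupe_range:      u64 src_offset, u64 src_length,
            #                                u16 dest_count, u16 pad, u32 pad
            # struct file_dedupe_range_info: s64 dest_fd, u64 dest_offset,
            #                                u64 bytes_deduped, s32 status, u32 pad
            req = bytearray(
                struct.pack("=QQHHI", offset, length, 1, 0, 0) +
                struct.pack("=qQQiI", dst.fileno(), offset, 0, 0, 0))
            fcntl.ioctl(src.fileno(), FIDEDUPERANGE, req)
            _, _, deduped, status, _ = struct.unpack("=qQQiI", req[24:])
            if status < 0:
                raise OSError(-status, "dedupe failed")
            # status 1 means the kernel compared the ranges, found them
            # different, and refused to share anything.
            return deduped, status

The heavy lifting - scanning, hashing and remembering which ranges might match - is exactly the part that stays in userspace, which is the Somebody Else's Problem bit.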

>I certainly do. I prefer my memory to be used for something that's relevant,

What's more relevant to a fileserver than fileserving?

>rather than letting it sit as a deadweight

You might find https://en.wikipedia.org/wiki/Cache_(computing) to be a useful educational resource.

>Actually, btrfs has a much more thought-out design. ZFS is already hobbled by its reliance on the fixed block sizes that later had to be retrofitted for compression.

One of ZFS's main headline features is (and has always been) variable block sizes.

>BTRFS suffers from the same problems as ZFS during its first 10 years or so.

This is so nonsensical that, as the saying goes, it isn't even wrong.

>And this is just marketing bullshit. Linux LVM can use SSDs for metadata devices, and BTRFS supports that natively as well

That is completely unrelated to the point you were replying to.

>About the only missing piece is RAID-Z, and that's being worked on

We've been hearing that it's technically possible for longer than it took ZFS to go from an idea to wide production use, and there is still no indication that it will ever actually be done.

Btrfs stability and production-readiness

Posted Feb 23, 2016 9:31 UTC (Tue) by rleigh (guest, #14622) [Link]

Do you have any evidence for that? I've seen some historical bugs from FreeBSD 8 and earlier, but even then they tended not to be severe data-loss bugs. For current releases, it seems pretty solid; I've not seen any awful bugs in recent years.

After suffering several incidents of non-recoverable data loss with Btrfs, as well as the unbalancing issues making things unusable and the woeful fsync behaviour killing performance, I have found ZFS absolutely solid over the last 30 months (on Linux and FreeBSD), and I have zero complaints about it. Something I can't say for Btrfs, which promises to be wonderful but has let me down every time.

While I can't claim any statistical significance to my observations, my 8 years of Btrfs use and 2.5 years of ZFS use have demonstrated to me that ZFS is indeed fundamentally more reliable and better designed than Btrfs.

