
SUSE reaffirms support for Btrfs

Posted Aug 24, 2017 18:51 UTC (Thu) by jhoblitt (subscriber, #77733)
Parent article: SUSE reaffirms support for Btrfs

Apparently, "Enterprise Readiness" means being slow and corrupting data...


SUSE reaffirms support for Btrfs

Posted Aug 25, 2017 2:17 UTC (Fri) by ttelford (guest, #44176)

I've not seen BTRFS corrupt data. Lose data, sure. Horked, unrecoverable file systems, yup. But I've not seen a case where the data stored in it was corrupt (at least not without the file system indicating an error).

SUSE reaffirms support for Btrfs

Posted Aug 25, 2017 8:18 UTC (Fri) by rodgerd (guest, #58896)

Well, the *developers* told you this year that btrfs can corrupt data.

So I think I'll take their word for it over yours.

Just like I'll take Chris Mason's word that even basic RAID-1 functionality is buggy, or that ENOSPC bugs are unresolved, over SuSE's PR department asserting it's "Enterprise Ready".

SUSE reaffirms support for Btrfs

Posted Aug 25, 2017 16:23 UTC (Fri) by zlynx (guest, #2285)

Big secret: MD and EXT3 can corrupt data!

Wow! I know, right? Terrible news!

Several years ago I had to deal with a four-disk RAID-5 using Linux MD. It had an unexpected power-down, and I suspect it hit the RAID write hole: a partially updated stripe. We had to do an hours-long fsck of the EXT3 file system and handle several files in lost+found.
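
For the curious, here's a toy model of that write hole (hypothetical Python, nothing to do with actual MD internals): parity is the XOR of the data chunks in a stripe, so a crash between rewriting a data chunk and rewriting the matching parity leaves the stripe silently inconsistent, and a later rebuild reconstructs garbage.

    from functools import reduce

    def parity(chunks):
        # RAID-5 parity is the byte-wise XOR of all data chunks in a stripe
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

    # A consistent stripe: three data chunks plus their parity chunk
    data = [b"AAAA", b"BBBB", b"CCCC"]
    p = parity(data)

    # Crash mid-update: disk 0 gets its new data, but the parity
    # chunk is never rewritten
    data[0] = b"ZZZZ"
    print(parity(data) == p)   # False: the stripe is now inconsistent

    # If disk 1 now fails, rebuilding it as data[0] XOR data[2] XOR p
    # reconstructs garbage instead of b"BBBB"; that's the write hole.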

As for BTRFS, I've been using it for three years on my 250 GB laptop where it hits ENOSPC all the time, but recovers, and on a file server with RAID 10 on six drives. It's been a lot easier to work with than LVM, MD and XFS, that's for sure. I think it helps that I run the newest Fedoras and not some distro trying to stay on kernel 3.10.

SUSE reaffirms support for Btrfs

Posted Aug 26, 2017 16:35 UTC (Sat) by Wol (subscriber, #4433)

Big secret: drives can corrupt data!

There's a recent thread on linux-raid where data loss seems to have been tracked down to the fact that the drive said "yes I've got the data", then lost it somewhere between the cache and rotating rust ...

(Oh, and fixing that - disabling write cache - really f***s up performance.)

Cheers,
Wol

SUSE reaffirms support for Btrfs

Posted Aug 28, 2017 10:28 UTC (Mon) by anton (subscriber, #25547)

In the good old times, server drives came with write caches disabled, and file systems used tagged command queueing (essentially an asynchronous interface) to get performance. PATA and SATA got several generations of tagged commands (from what I read, the first ones were pretty unusable; I don't know about the later ones), as well as write barriers (another way to support consistent file systems without killing performance).
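
The application-visible end of such barriers on Linux is fsync()/fdatasync(), which the kernel turns into cache-flush/FUA commands down the stack, if every layer passes them through. A minimal journal-style sketch (the path is just a placeholder):

    import os

    # The commit record must not reach the disk before the data it
    # commits; fsync() between the two writes is the ordering barrier.
    fd = os.open("/tmp/journal.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, b"begin: update record\n")
        os.fsync(fd)              # barrier: data durable before the commit
        os.write(fd, b"commit\n")
        os.fsync(fd)              # the commit record itself is durable
    finally:
        os.close(fd)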

It seems to me that Linux support for such features is lackluster. Block-device layers like LVM did not support them for a long time (do they now?), a Linux file system developer has apologized for not losing more data, and my general impression is that most people concerned with file systems and surrounding topics in Linux put performance first and treat data consistency on crash recovery as an unloved stepchild (e.g., the ext3 data=journal corruption bug existed for several years, and last I looked the only Linux file system providing a good consistency guarantee was nilfs2).

In this context an advantage of BTRFS is that it does not need layers like md or LVM, and can therefore use tagged commands or write barriers to provide consistency. Does it do that? I have no idea. Given that the BTRFS developers are in the filter-bubble of the performance-first Linux file system people, I am pessimistic.

SUSE reaffirms support for Btrfs

Posted Aug 28, 2017 19:49 UTC (Mon) by Wol (subscriber, #4433)

I got the impression these drives were enterprise RAID jobbies ...

That said, there does seem to be a bit of antipathy in certain quarters to "doing the job right". You know those dialog boxes that say "confirm yes/no" and ask "do you want to remember this answer?". WHY IS IT that they always seem to fill in only *three* boxes of the two-by-two grid, and never the fourth? They'll remember that you said "yes", but forget that you said "no".

I'm one of those people who find it frustrating that such "stupidities" exist, yet there are many people who don't even seem capable of seeing them, let alone considering them worth fixing - they'll try to obstruct any effort to clean up the logic.

Cheers,
Wol

SUSE reaffirms support for Btrfs

Posted Aug 30, 2017 12:43 UTC (Wed) by nix (subscriber, #2304)

That wasn't the problem being discussed on the list, as I understood it. The problem there was that a drive had been told "FUA"; there was a timeout, because the drive's power (but not the system's power) was unreliable, so the drive had forgotten the command was ever issued and had lost its transient state, including its RAM. The kernel retried the command, got told "OK! Flushed!"... and indeed the cache had been flushed, but because of the power interruption the cache that was flushed was *empty*.
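
In other words, something like this (a toy model only, not kernel or firmware code):

    class FlakyDrive:
        """Toy model of the failure: honest firmware, dishonest power."""
        def __init__(self):
            self.cache = []        # volatile write cache (RAM)
            self.media = []        # what actually reached the platters

        def write(self, block):
            self.cache.append(block)

        def brownout(self):
            self.cache = []        # transient state, including RAM, is lost

        def flush(self):           # the flush/FUA path
            self.media.extend(self.cache)
            self.cache = []
            return "OK! Flushed!"  # truthful: the cache *is* flushed

    drive = FlakyDrive()
    drive.write(b"critical data")
    drive.brownout()               # power glitch; the first flush times out
    print(drive.flush())           # the retried flush reports success...
    print(drive.media)             # ...but [] shows nothing ever hit the disk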

Perhaps any timeout at all should force all layers to assume that everything since the last *acknowledged* FUA is potentially lost? Regardless, turning off write caching isn't going to help there at all: any rotating-rust drive needs *somewhere* to hold data between its arrival at the drive and its being committed to the platters. Whether you call that a cache or not, its contents can be lost if power goes out at the wrong instant (indeed, if the write to the disk is underway, you can get a torn sector and ECC recovery on the spinning rust too). The real lesson here is that operating reliably atop hardware with faulty power rails is about as easy as operating atop any other faulty hardware...

