
SUSE reaffirms support for Btrfs

Posted Aug 25, 2017 2:14 UTC (Fri) by ttelford (guest, #44176)
In reply to: SUSE reaffirms support for Btrfs by drag
Parent article: SUSE reaffirms support for Btrfs

> I never understood why RAID5 was so valuable to so many people

If your array has 3 drives, you have a point. There's only one more drive needed to go RAID 10.

However, RAID10 gets prohibitively expensive as the number of disks grows.

And of course, there's the issue of finding an individual user's sweet spot between storage space, reliability, and the physical space available to mount disks in a chassis.



SUSE reaffirms support for Btrfs

Posted Aug 25, 2017 3:24 UTC (Fri) by drag (guest, #31333) [Link] (12 responses)

RAID 5 recovery time is what kills it. Lack of performance doesn't help things. It's obsolete because of this. 'Individual sweet spot' is fairly irrelevant when the 'individual sweet spot' in question is just setting yourself up for misery. It's not just an issue of the chances of a second drive failing before the recovery is complete, but also of having a fully usable and available system ASAP.

With 4TB drives being less than 100 dollars now, you get 8TB for less than $400 or 16TB for less than $800. The per-drive expense of storage like that is far less of a concern than it used to be, and other factors tend to matter more. If you are trying to get work done on a system rebuilding an array, then the time to recover can easily stretch out to multiple days.
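As a rough illustration of the rebuild-time point (a minimal sketch; the 100 MB/s effective rebuild rate is an assumption for a lightly loaded array, not a measured figure):

    # Rough rebuild-time estimate; 100 MB/s is an assumed effective rate for an
    # otherwise idle array. Competing production I/O usually cuts this sharply.
    drive_bytes = 4 * 10**12          # one 4 TB member
    rate_bps = 100 * 10**6            # assumed rebuild throughput, bytes/second

    hours = drive_bytes / rate_bps / 3600
    print(f"~{hours:.0f} hours at 100 MB/s")   # ~11 hours idle; throttled, it can run into days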

If somebody has their heart set on RAID5 then that's fine, but I still consider it ill-advised.

SUSE reaffirms support for Btrfs

Posted Aug 25, 2017 19:17 UTC (Fri) by nybble41 (subscriber, #55106) [Link] (2 responses)

While I agree with you regarding RAID5, with a 4-disk RAID6 parity scheme the loss of _any_ two drives is recoverable. With RAID10 on the same array you have a 1/3 chance of losing access to half your data when the second drive fails (i.e. when it happens to be the mirror of the first drive that failed). On the other hand, while RAID6 can perform two reads in parallel from any part of the array, RAID10 has the advantage in speed since it can perform at least two reads (one per mirror) and possibly up to four (one per disk) at one time, depending on which part of the array is being read. Similarly, a RAID6 write touches all four drives whereas RAID10 only needs to update two, potentially allowing up to double the throughput if writes are distributed across the mirrors.
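The 1/3 figure is easy to check by brute force; a minimal sketch, assuming a nested RAID1+0 layout with fixed mirror pairs {0,1} and {2,3}:

    # Enumerate every possible double drive failure on a 4-drive array and count
    # how many destroy a complete mirror pair (nested RAID1+0, pairs {0,1} and {2,3}).
    from itertools import combinations

    mirror_pairs = [{0, 1}, {2, 3}]
    failures = list(combinations(range(4), 2))              # 6 possible double failures
    fatal = [f for f in failures if set(f) in mirror_pairs]

    print(len(fatal), "of", len(failures), "double failures lose data")   # 2 of 6, i.e. 1/3
    # A 4-drive RAID6 survives all 6 cases, at the cost of touching all drives on writes.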

SUSE reaffirms support for Btrfs

Posted Aug 28, 2017 15:07 UTC (Mon) by Wol (subscriber, #4433) [Link] (1 responses)

Just watch out though - as far as Linux is concerned, raid-10 and raid-1+0 are two different beasts. For example, you only need three drives for raid-10, but four for raid-1+0.
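A minimal sketch of the idea (this mimics a "near"-style two-copy layout for illustration only, not md's actual implementation): with copies placed on consecutive drives, two copies of every chunk fit on three drives, whereas raid-1+0 needs whole mirror pairs and hence an even number of drives.

    # Illustrative only: place the two copies of chunk c on consecutive drives,
    # wrapping around, so a two-copy "raid-10" works on an odd drive count.
    def near_layout(chunk, copies=2, drives=3):
        return [(chunk * copies + k) % drives for k in range(copies)]

    for c in range(6):
        print(f"chunk {c}: drives {near_layout(c)}")
    # chunk 0: drives [0, 1]
    # chunk 1: drives [2, 0]
    # chunk 2: drives [1, 2]
    # ... and the pattern repeats.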

Cheers,
Wol

SUSE reaffirms support for Btrfs

Posted Aug 28, 2017 17:26 UTC (Mon) by nybble41 (subscriber, #55106) [Link]

> you only need three drives for raid-10

Right, I was writing from the perspective of 'nested' RAID10, not the 'complex' version. With 'complex' RAID10 you would have guaranteed partial data loss on failure of any two drives, since the data stripes are distributed across the drives and—for an array of four uniformly-sized disks, as we've been discussing, which doesn't really need to be 'complex'—the two failed drives would be the sole mirrors for approximately one-twelfth of the stripes.

Whether a guaranteed 8% data loss (given a second drive failure) is better or worse than a 1/3 chance of losing 50% of the data will, I suppose, depend on your use case. However, do consider that if the 8% includes critical metadata (and it probably will) then the remainder, while still technically "present", may still be unrecoverable in practice.
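Taking the figures above at face value (a guaranteed loss of roughly 1/12 of the stripes for the 'complex' layout, versus a 1/3 chance of losing half the data for the nested one), the expected loss can be compared directly, metadata caveat aside:

    # Back-of-the-envelope comparison using only the figures quoted in this thread.
    complex_expected_loss = 1 / 12          # ~8.3%, on every double failure
    nested_expected_loss = (1 / 3) * 0.5    # ~16.7% on average, but 0% two times out of three

    print(f"'complex' raid10: {complex_expected_loss:.1%} lost, guaranteed")
    print(f"nested raid1+0: {nested_expected_loss:.1%} expected loss (half the data, 1/3 of the time)")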

SUSE reaffirms support for Btrfs

Posted Aug 26, 2017 3:14 UTC (Sat) by ttelford (guest, #44176) [Link]

At the end of the day, at any given moment there's a "largest size available"; beyond that point, simply buying a bigger disk isn't possible, and RAID-10 uses space very inefficiently.

Not all applications are terribly sensitive to performance. A lot of classical file servers hold files that are effectively static, with occasional updates.

A home media server is another example: you can record multiple channels of 1080 video (using ATSC / MPEG-2) and play back multiple recordings simultaneously without skipping. Why target performance (which you don't need) and sacrifice storage space (which you do need)?

There is simply no one-size-fits-all solution; it's all a compromise. I personally use RAID6 when performance isn't a problem.

SUSE reaffirms support for Btrfs

Posted Aug 26, 2017 8:20 UTC (Sat) by Otus (subscriber, #67685) [Link] (7 responses)

> RAID 5 recovery time is what kills it.
> [...]
> If you are trying to get work done on a system rebuilding an array, then the time to recover can easily stretch out to multiple days.

Really? Recovery is something that needs to be done rarely. As long as it can be done in the background without interruption I cannot see the problem, even if it takes days.

I also don't see why RAID 1 recovery should be significantly faster. Whenever you lose one disk you need to rewrite one disk, in any RAID mode. With RAID 5 you need to read double the data, but does that really make that much of a difference in practice?

(Finally, with 4 disks instead of 3 in an array, you will get a drive failure about a third more often.)
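For what it's worth, the arithmetic behind both points, assuming 4 TB members and a rebuild that reads every surviving drive in full:

    # Rough numbers for the two questions above, assuming 4 TB drives and that a
    # rebuild must read each surviving member in full.
    drive_tb = 4

    raid1_read = 1 * drive_tb     # 2-disk mirror: read the single surviving copy
    raid5_read = 2 * drive_tb     # 3-disk RAID5: read both survivors to reconstruct the third
    print(f"rebuild reads: RAID1 {raid1_read} TB, 3-disk RAID5 {raid5_read} TB")

    # Failure frequency: with independent, identical drives, 4 of them fail
    # about 4/3 as often as 3 of them.
    print(f"relative failure rate, 4 vs 3 drives: {4/3:.2f}x")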

SUSE reaffirms support for Btrfs

Posted Aug 26, 2017 12:43 UTC (Sat) by SampsonF (guest, #118216) [Link] (6 responses)

In a multi-drive volume (RAID), when one drive fails due to "age", the remaining drives are very likely to start failing as well.

When one drive fails and the rebuild starts, it creates a high volume of disk read/write activity for a long period of time. That surge of disk activity in turn increases the likelihood of failure in the remaining disks.

That is why mirroring (RAID1) is sometimes the better fit for certain use cases.

SUSE reaffirms support for Btrfs

Posted Aug 26, 2017 16:30 UTC (Sat) by Wol (subscriber, #4433) [Link] (3 responses)

RAIDs usually suffer multiple drive failures because they haven't been looked after. A sysadmin will do scrubs, will monitor SMART etc, and won't be surprised by a disk failure...

So why do we regularly get stories about arrays falling over? Maybe because home users (and bean counters) don't think checking the health of the system is important?

Cheers,
Wol

SUSE reaffirms support for Btrfs

Posted Aug 30, 2017 12:28 UTC (Wed) by nix (subscriber, #2304) [Link]

> A sysadmin will do scrubs, will monitor SMART etc, and won't be surprised by a disk failure...

Your assumption that scrubbing and SMART will reliably detect disk failures is not very accurate. There are sudden failures, for starters, where the drive works until it suddenly doesn't: but also SMART is not terribly reliable at the best of times, and scrubbing is more to guard against slow magnetic degradation than to discern whether the drive is on its way out.

SUSE reaffirms support for Btrfs

Posted Aug 30, 2017 22:47 UTC (Wed) by jwarnica (subscriber, #27492) [Link] (1 responses)

Mortals, which is to say people with meetings and more than one system to nurse, have other demands on their time.

"Enterprise" hardware and software should not fail catastrophically without the warnings that come out of the box, and certainly should not fail completely before a replacement can ship.

SUSE reaffirms support for Btrfs

Posted Aug 31, 2017 13:33 UTC (Thu) by Wol (subscriber, #4433) [Link]

No names, no pack drill...

But a company I know of bought a commercial raid-6 array. Cue a massive panic a couple of years down the line when an operator suddenly noticed two red lights indicating a double drive failure. Nobody'd been checking the raid.

So they had people who were supposed to be looking after it. So it did try to tell them something was wrong. And still they just dodged a catastrophic failure by pure luck rather than judgement.

Cheers,
Wol

SUSE reaffirms support for Btrfs

Posted Aug 26, 2017 21:14 UTC (Sat) by zlynx (guest, #2285) [Link] (1 responses)

A proper weekly scrub will solve this. It does a full read and verify, and is just as heavy a load as a full rebuild. So if a drive would have failed during a rebuild, it would have failed during a scrub.
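For md arrays, a scrub can be kicked off through sysfs; a minimal sketch (assuming an array at /dev/md0, root privileges, and the md driver's standard sysfs files):

    # Start a scrub ("check" pass) of /dev/md0 via sysfs and report the mismatch
    # count once it finishes.
    import pathlib
    import time

    md = pathlib.Path("/sys/block/md0/md")
    (md / "sync_action").write_text("check\n")

    # Poll until the array goes idle again; a weekly cron job would normally just
    # write "check" and exit.
    while (md / "sync_action").read_text().strip() != "idle":
        time.sleep(60)

    print("mismatch_cnt:", (md / "mismatch_cnt").read_text().strip())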

SUSE reaffirms support for Btrfs

Posted Aug 28, 2017 15:10 UTC (Mon) by Wol (subscriber, #4433) [Link]

Also, a lot of RAID failures are down to "soft" problems. You need to read (and rewrite) your drive regularly. Just as dynamic RAM needs to be refreshed every few milliseconds, so does your drive need to be refreshed regularly, although on a MUCH longer timescale. Again, a scrub will pick up problems before they get serious.

Cheers,
Wol

