
Btrfs: multiple device support: remove it!

Posted Jan 1, 2014 16:54 UTC (Wed) by dougg (guest, #1894)
Parent article: Btrfs: Working with multiple devices

Btrfs should remain a file system and leave the storage side where it belongs. Its lower interface should be to a SCSI logical unit (LU) or, if you prefer, a VM drive (same thing). Then the (storage) target subsystem, md or block subsystems can take care of size, reliability, replication and backups.

I'm looking at SCSI xcopy version 2 lite, which Microsoft calls Offloaded Data Transfer (ODX). It facilitates "point in time" copies via ROD (representation of data) tokens. This is almost a perfect mechanism for backups on a live system (i.e. consistent, with almost no downtime). With the help of a file system log, it seems to solve the storage side of what VMware calls vMotion: the ability to move a VM from one server to another while it is running. Nicholas Bellinger's Linux target subsystem is part of the way toward implementing ODX; it already implements the building block behind VMware's VAAI in this area (i.e. SCSI xcopy version 1).

http://blogs.technet.com/b/matthewms/archive/2013/09/13/v...

The redundant array of independent disks (RAID) is a remnant of the age of rotating disks. File systems should forget about it when looking to the future.



Btrfs: multiple device support: remove it!

Posted Jan 1, 2014 21:07 UTC (Wed) by rvfh (subscriber, #31018) [Link]

> Then the (storage) target subsystem, md or block subsystems can take care of size, reliability, replication and backups.

But then you lose the ability to treat data and metadata differently...

Btrfs: multiple device support: remove it!

Posted Jan 2, 2014 1:19 UTC (Thu) by ncm (subscriber, #165) [Link]

Btrfs allowing metadata to be on a different device (LU) would suffice for that, I think. Does it?

There's no requirement for btrfs to take over all your RAIDity, is there? I.e. you could still run it on top of MD (or what-have-you), so long as btrfs sees it as a block device? So the one benefit of letting btrfs do the work is finer-grained control over what is stored how.

Btrfs: multiple device support: remove it!^w^w

Posted Jan 4, 2014 14:54 UTC (Sat) by kreijack (guest, #43513) [Link]

> So the one benefit of letting btrfs do the work is finer-grained control over what is stored how.

There are a lot of optimizations that become possible when the filesystem and the storage layer are integrated:
- the "RAID write hole" [1] disappears: the filesystem checksums make it possible to determine which data is correct
- moreover, you can also detect errors caused by random bit flips on the platter (disk sizes are becoming comparable with the reciprocal of the error rate [2]) and correct them
- you don't need to "prepare" the RAID before using it: with btrfs you need only a few seconds to create a filesystem with RAID
- you can have different RAID profiles in the same filesystem: today btrfs allows different RAID profiles only for data and metadata. Allowing different RAID profiles on a per-subvolume basis has been discussed. [Even though I am not so sure that it is a good idea, from an administrator's point of view.] For example:
  - you can put metadata on RAID1
  - "valuable data" on RAID 5/6
  - "cache data" (e.g. downloaded films) on RAID0

The only point I have to agree with is that integrating RAID into a filesystem requires a longer development time.
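
To illustrate the checksum point in the list above, here is a minimal Python sketch. It assumes a two-way mirror and uses zlib.crc32 as a stand-in checksum (btrfs itself defaults to CRC32C, and its on-disk layout is nothing like this toy dictionary). The names store and read_and_repair are made up for the illustration. It shows only the idea that a checksum kept by the filesystem lets a read pick the good copy.

```python
import zlib


def store(block: bytes) -> dict:
    """Write a block to two mirrors and record its checksum as metadata."""
    return {
        "mirrors": [bytearray(block), bytearray(block)],
        "checksum": zlib.crc32(block),
    }


def read_and_repair(entry: dict) -> bytes:
    """Return the copy whose checksum matches, repairing any mirror that differs."""
    for copy in entry["mirrors"]:
        if zlib.crc32(bytes(copy)) == entry["checksum"]:
            good = bytes(copy)
            # Overwrite every mirror that disagrees with the verified copy.
            for i, other in enumerate(entry["mirrors"]):
                if bytes(other) != good:
                    entry["mirrors"][i] = bytearray(good)
            return good
    raise IOError("no mirror matches the stored checksum")


entry = store(b"important data")
entry["mirrors"][0][0] ^= 0x01  # simulate a flipped bit on the first mirror
data = read_and_repair(entry)
```

A plain RAID1 layer sees two copies that differ and has no way to tell which one is right; the stored checksum is what breaks the tie.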

[1] http://www.raid-recovery-guide.com/raid5-write-hole.aspx
[2] WD reports in its data sheet an error rate of 1/10^14; a 1 TB disk holds about 10^13 bits... so if you read the whole disk ten times, WD says that it is possible that you get a bit wrong
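
The arithmetic in [2] can be checked with a few lines of Python. This is only a sketch: it assumes a decimal terabyte (10^12 bytes) and treats the data-sheet figure as an average of one bad bit per 10^14 bits read, which is an assumption about how to interpret the number rather than a statement from WD.

```python
from fractions import Fraction

# Assumption: 1 TB = 10**12 bytes, 8 bits per byte.
BITS_PER_TB = 8 * 10**12

# Assumption: the data-sheet rate read as one bad bit per 10**14 bits.
ERROR_RATE = Fraction(1, 10**14)

# Expected number of bad bits when the whole disk is read once.
expected_errors_per_full_read = BITS_PER_TB * ERROR_RATE

# Number of full-disk reads needed before one bad bit is expected.
full_reads_per_expected_error = 1 / expected_errors_per_full_read

print(float(expected_errors_per_full_read))
print(float(full_reads_per_expected_error))
```

With 8 x 10^12 bits per terabyte, the expectation works out to 0.08 bad bits per full read, i.e. one expected error every 12.5 full reads, which is the same order of magnitude as the "ten times" estimate.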

Btrfs: multiple device support: remove it!

Posted Jan 2, 2014 1:04 UTC (Thu) by smurf (subscriber, #17840) [Link]

I disagree.

For one, backups at the block level are a very stupid idea, especially if you have a log-structured file system, or one that can do snapshots.

Also, you will need to have the whole file system in a consistent state in order to make a block-level snapshot. Surprise: this is an expensive operation. Not all workloads tolerate that.

Also, speaking of taking care of size: Resizing a RAID (by adding a disk) is an expensive operation which touches the whole disk, even if most space is not allocated yet. On the other hand, as I understand it (correct me if I'm wrong) btrfs will only spread new files onto a new disk, which allows one to re-balance the file system at one's leisure. In my case, if I added a new disk, rebalancing the whole RAID would slow down my site for a week; btrfs would require around 80 hours, and I can limit that to dead time in the early morning.

Also, a link to something that reads like a Windows commercial is hardly an endorsement for THIS crowd. :-P

In any case, the text you linked to talks about treating a file like a file system, except that this file gets loopback-mounted on the SAN instead of on the host machine and you appear to be able to shrink it. The overlap between that blatant Windows commercial ^w^w^w blog post, and this topic, is essentially zero.

Btrfs: multiple device support: remove it!

Posted Jan 2, 2014 23:34 UTC (Thu) by dougg (guest, #1894) [Link]

"Very stupid idea"?? Not, it would seem, to VMware, NetApp, HP and Microsoft, with more to come.

Yes, "you will need to have the whole file system in a consistent state in order to make a block-level snapshot. Surprise: this is an expensive operation." But a point_in_time ROD-based read can be made a very quick operation: the system continues almost immediately, the backup is ready to be written whenever you like, you can take as many copies as you like, migrate that consistent FS state to another machine, etc.

How many months has btrfs been delayed while its authors tried to re-invent and re-implement RAID?

So you don't like marketing BS? Well, welcome to planet Earth. Just maybe there might be some useful information lurking in there. There doesn't seem to be much pedagogic material available. If you prefer something dry and obtuse, I can provide references to the SPC-4 and SBC-3 drafts found at www.t10.org [No marketing BS there.]

Btrfs: multiple device support: remove it!

Posted Jan 3, 2014 0:38 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

Btrfs is going to have stuff that is impossible to implement on classic RAIDs.

For example, it's already possible to use different RAID levels for metadata and data. It's also going to be possible to set individual RAID levels for individual files.

ZFS can do both, btw.
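
For reference, the data/metadata split can be requested when the filesystem is created, via the -d and -m options of mkfs.btrfs. The Python sketch below only builds the command line and does not run it; the helper name mkfs_btrfs_command and the device paths /dev/sdX and /dev/sdY are placeholders, not real devices.

```python
import shlex


def mkfs_btrfs_command(devices, data_profile, metadata_profile):
    """Build (but do not execute) a mkfs.btrfs command with separate profiles."""
    if len(devices) < 2:
        raise ValueError("this sketch assumes a multi-device filesystem")
    # -d selects the data profile, -m selects the metadata profile.
    return ["mkfs.btrfs", "-d", data_profile, "-m", metadata_profile, *devices]


# Placeholder device names; substitute real block devices before running anything.
cmd = mkfs_btrfs_command(["/dev/sdX", "/dev/sdY"], "raid0", "raid1")
print(shlex.join(cmd))
```

Here data is striped (raid0) while metadata is mirrored (raid1), which is the kind of split a separate block-level RAID layer cannot express.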

Btrfs: multiple device support: remove it!

Posted Jan 3, 2014 7:40 UTC (Fri) by smurf (subscriber, #17840) [Link]

Taking a consistent point-in-time snapshot is not a block-level operation, it's a file system problem, and whether it's a good idea or not for btrfs to have its own RAID implementation has nothing whatsoever to do with point-in-time snapshots.

You cannot do snapshots and all the other cool ideas marketing talks about without file system and OS support. Which block-level technology is below that (file system's RAID, kernel block-level RAID, hardware RAID, no RAID, block storage file of the VM's host; directly connected, NAS, iSCSI, Acronym-of-the-Day) is completely irrelevant.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds