Actually RAID/volume management is superlimited when not in filesystem...

Posted Jan 8, 2009 16:58 UTC (Thu) by roblucid (subscriber, #48964)
In reply to: Actually RAID/volume management is superlimited when not in filesystem... by khim
Parent article: Btrfs aims for the mainline

> 1. Keep most of the data on just one drive (for movies from my own DVDs).
> 2. Keep the rest in RAID-5 form (for movies in games and such: PITA to
> reinstall but can be done if needed).
> 3. Keep my own personal files (1% of total size or so) duplicated 4 times
> (on 4 HDDs).
> Pretty easy and simple requirements, right? Yet totally unachievable with
> usual LVM/filesystem separation. Currently btrfs can not support this mode
> of operation too, but potentially - it's doable...

What's wrong with using partitions and bind mounts?

3) A partition that's mirrored on all disks
2) A RAID 5 (yuck!) using partition on 3 disks
1) A partition for your data disk

You can bind mount stuff to be at convenient points in the filesystem hierarchy. That is "specifying different methods of replication for different parts of the namespace".
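As a rough sketch of that layout, assuming four disks named /dev/sda through /dev/sdd with matching partitions on each (the device names, md numbers and mount points here are all hypothetical), the following Python only builds the command lines and never runs them. Creating filesystems on the arrays and mounting them under /srv is left out.

```python
# Sketch only: builds command lines, never executes them.
# Disk names, partition numbers, md devices and mount points are assumptions.
DISKS = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]


def mdadm_create(md_dev, level, parts):
    """Return an mdadm command that assembles `parts` into one array."""
    return ["mdadm", "--create", md_dev, f"--level={level}",
            f"--raid-devices={len(parts)}", *parts]


def bind_mount(source, target):
    """Return a command that makes `source` also visible at `target`."""
    return ["mount", "--bind", source, target]


def layout_commands():
    cmds = []
    # 3) personal files: partition 1 of every disk, mirrored four ways
    cmds.append(mdadm_create("/dev/md0", 1, [d + "1" for d in DISKS]))
    # 2) reinstallable data: RAID 5 across partition 2 of three disks
    cmds.append(mdadm_create("/dev/md1", 5, [d + "2" for d in DISKS[:3]]))
    # 1) bulk data needs no array; it sits on a plain partition (say /dev/sdd2)
    # Bind mounts then place each store at a convenient point in the namespace.
    cmds.append(bind_mount("/srv/mirror/documents", "/home/user/Documents"))
    cmds.append(bind_mount("/srv/raid5/games", "/home/user/Games"))
    cmds.append(bind_mount("/srv/single/movies", "/home/user/Movies"))
    return cmds


if __name__ == "__main__":
    for cmd in layout_commands():
        print(" ".join(cmd))
```

The point of the sketch is that the replication method is chosen once per block device, and the bind mounts decide which part of the namespace lands on which device.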

(Frankly, if you have 4 disks, I'd rather stripe across lower-level mirrors with RAID 10 and accept the lower capacity for the performance and reliability benefits of avoiding RAID 5 in a 3-disk configuration.)

Using a RAID layer to do RAID,
an LVM layer to provide logical volumes (caveat on FS barriers),
and a filesystem layer to handle filesystem structure and journalling

would appear to be a logical structure, though it does require initial planning.

Perhaps you want to be able to expand the allotted space given over to RAID1, RAID5, RAID10, etc.?

Couldn't that be done by using LVM-type block devices, used by the RAID layer, which is then exposed to filesystems, so they can grow/shrink their capacity as chunks of disk are (de)allocated to partitions?

Call me a cynic if you like, but pushing every feature you could want into one layer, the filesystem, which should then become a generic dynamic disk management system, would appear to produce a very complicated monolithic block of code. If sane implementations would then use generic layers, then you're really back to where you started.


That's the setup I have - and HATE it

Posted Jan 8, 2009 18:03 UTC (Thu) by khim (subscriber, #9252)

> Perhaps you want to be able to expand the allotted space given over to
> RAID1, RAID5, RAID10, etc.?

I want to forget the words "RAID", "LVM" and everything related. Forever. I want sane options. Like:
A. Store data cheaply (say... $0.10/GB) but unreliably: a single disk failure loses it, so be ready to redownload/reinstall.
B. Store data reliably but expensively ($0.40/GB): up to three disks can fail without any problems.
C. Some intermediate versions: cheap and reliable ($0.12/GB to $0.15/GB), OK with a single disk failure (if that happens the OS will of course restore the status quo if possible), but sloooooow (still much faster than DVD).
Just as filesystems were invented to make manual management of data on a single disk unnecessary, I want something to hide all this RAID/LVM/etc. stuff from me. Whether it's btrfs or a stack of other technologies, I don't care, as long as I have a nice simple option list in the "Save As..." dialog.
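The prices in options A to C can be reproduced with simple arithmetic. This is only a sketch: it assumes raw disk space costs the $0.10/GB of option A, that option B means four full copies on four disks, and that option C means single parity (RAID 5 style) over some number of disks. The function names are made up for illustration.

```python
# Sketch: cost per usable GB for a few redundancy schemes.
# RAW_COST comes from option A; the disk counts are assumptions.
RAW_COST = 0.10  # dollars per raw GB


def mirror_cost(copies):
    """Every byte is stored `copies` times; survives copies - 1 failures."""
    return round(RAW_COST * copies, 3)


def parity_cost(disks):
    """Single parity: one disk's worth of space out of `disks` is overhead."""
    return round(RAW_COST * disks / (disks - 1), 3)


if __name__ == "__main__":
    print("A, single copy:", mirror_cost(1))
    print("B, four copies:", mirror_cost(4))
    for n in (3, 4, 6):
        print(f"C, parity over {n} disks:", parity_cost(n))
```

Under those assumptions four copies cost $0.40/GB and tolerate three failed disks, while single parity over three to six disks costs between $0.15 and $0.12/GB and tolerates one.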

> Couldn't that be done by using LVM-type block devices, used by the RAID
> layer, which is then exposed to filesystems, so they can grow/shrink their
> capacity as chunks of disk are (de)allocated to partitions?

Maybe. But then it'll need huge, very complex schemes to make it work well as a whole. This is the "microkernel vs monolithic kernel" discussion all over again.

> Call me a cynic if you like, but pushing every feature you could want into
> one layer, the filesystem, which should then become a generic dynamic disk
> management system, would appear to produce a very complicated monolithic
> block of code. If sane implementations would then use generic layers, then
> you're really back to where you started.

Huh? Why "monolithic block of code"??? I'm perfectly happy with separation of functions - different filesystems are free to use the same implementation of RAID, LVM, etc - if their authors decide it's the best way to do things. Just as long as it's not exposed to userspace (or at least to the user).

That's the setup I have - and HATE it

Posted Jan 8, 2009 18:56 UTC (Thu) by dlang (✭ supporter ✭, #313)

what you want isn't anything resembling a traditional filesystem. what you want is something like the 'object based storage' things that are being discussed (but you want something far more complex than what has been proposed, let alone implemented or accepted).

defining the redundancy for each file as it is saved will also require changes to every single program out there, which is very unlikely to happen.

if you are willing to deal with different directories having different redundancy options, then what you want is doable today, with no kernel changes. it just needs userspace tools written to make it easier to deal with.

That's the setup I have - and HATE it

Posted Jan 8, 2009 20:11 UTC (Thu) by roblucid (subscriber, #48964)

> defining the redundancy for each file as it is saved will also require
> changes to every single program out there, which is very unlikely to happen.

Probably a lot of people still expect filesystems to be fast, and having every node in the hierarchy, and every file, use a different form of backing storage depending on its redundancy requirements... *sucks through teeth* sounds expensive to me.

The "I want to know nothing, just have it managed by some kind of storage management system that takes care of the details" approach does sound like a better requirement to me.

Actually I don't think "every program" would need modifying: when files are created they can inherit the characteristics of the parent directory, as would new sub-directories.
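A minimal sketch of that inheritance, assuming a hypothetical in-memory table of tagged directories (a real system would have to store the tags somewhere persistent; the paths and policy names below are invented):

```python
from pathlib import PurePosixPath

# Hypothetical policy table: only a few directories are tagged explicitly.
POLICIES = {
    PurePosixPath("/"): "single",
    PurePosixPath("/home/user/Documents"): "mirror4",
    PurePosixPath("/home/user/Games"): "parity",
}


def effective_policy(path, policies=POLICIES):
    """Walk from `path` up to the root and return the nearest tag found."""
    p = PurePosixPath(path)
    for candidate in (p, *p.parents):
        if candidate in policies:
            return policies[candidate]
    raise LookupError(f"no policy covers {path}")


if __name__ == "__main__":
    for name in ("/home/user/Documents/tax/2008.ods",
                 "/home/user/Movies/film.avi"):
        print(name, "->", effective_policy(name))
```

An unmodified program that simply creates a file under /home/user/Documents would get the "mirror4" treatment without knowing anything about it.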

That's the setup I have - and HATE it

Posted Jan 8, 2009 20:26 UTC (Thu) by roblucid (subscriber, #48964)

> Huh? Why "monolithic block of code"??? I'm perfectly happy with
> separation of functions - different filesystems are free to use the same
> implementation of RAID, LVM, etc - if their authors decide it's the best
> way to do things. Just as long as it's not exposed to userspace (or at
> least to the user).

Well, you seemed to imply it by saying you couldn't imagine it being done outside of the filesystem.

Permitting layers, I could imagine some specialised "meta" filesystem being feasible, that would give you a name space, that lets you tag directories and files with storage characteristics using more traditional type file systems as backing stores for bulk data storage. The real files might end up in a file hierarchy, spread over a number of volumes a bit like http proxy caches, to provide manageable chunks of RAID1, RAID0, RAID5, RAID10 storage, which can be increased (and freed) on demand by the Disk Management System.
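To make the idea a little more concrete, here is a sketch of how such a "meta" layer might map a tagged name onto a backing store, with hashed subdirectories in the style of a proxy cache. The mount points, tag names and the choice of SHA-1 are assumptions for illustration only.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical backing stores: ordinary filesystems on different arrays.
BACKING = {
    "raid1": PurePosixPath("/backing/raid1"),
    "raid5": PurePosixPath("/backing/raid5"),
    "raid0": PurePosixPath("/backing/raid0"),
}


def backing_path(logical_path, tag):
    """Map a name in the meta namespace to a file on the tagged store."""
    digest = hashlib.sha1(logical_path.encode("utf-8")).hexdigest()
    # Two levels of hashed directories keep any one directory small,
    # much as proxy caches do.
    return BACKING[tag] / digest[:2] / digest[2:4] / digest


if __name__ == "__main__":
    print(backing_path("/home/user/notes.txt", "raid1"))
    print(backing_path("/home/user/notes.txt", "raid5"))
```

Changing a file's storage characteristics would then amount to copying it to the path computed for the new tag and dropping the old copy, while its name in the namespace stays the same.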

The only thing is, how many people would actually need that? And if it were provided, how many would ever use funky "Save As" options in applications, compared to the number who would whinge about excessive options being confusing and unclean in their precious GUI?

That's the setup I have - and HATE it

Posted Jan 9, 2009 18:43 UTC (Fri) by giraffedata (subscriber, #1954)

> Permitting layers, I could imagine some specialised "meta" filesystem being
> feasible, that would give you a name space, that lets you tag directories
> and files with storage characteristics using more traditional type file
> systems as backing stores for bulk data storage.

I believe you're describing object storage. The lower of those layers is the object layer. But it differs from a traditional filesystem in that the only names the objects (files) have are made up by the system (essentially, inode numbers).
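A toy model of that split, purely as an illustration (the class and method names are invented and do not correspond to any real object storage interface): the lower layer hands out numbers and stores data with a policy, and only the upper layer knows about human-readable names.

```python
import itertools


class ObjectStore:
    """Lower layer: objects are known only by system-assigned numbers."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._objects = {}

    def create(self, data, policy):
        oid = next(self._next_id)
        self._objects[oid] = (policy, bytes(data))
        return oid

    def read(self, oid):
        return self._objects[oid][1]

    def policy(self, oid):
        return self._objects[oid][0]


class Namespace:
    """Upper layer: maps human-readable paths to object numbers."""

    def __init__(self, store):
        self._store = store
        self._names = {}

    def write(self, path, data, policy="single"):
        self._names[path] = self._store.create(data, policy)

    def object_id(self, path):
        return self._names[path]

    def read(self, path):
        return self._store.read(self._names[path])


if __name__ == "__main__":
    ns = Namespace(ObjectStore())
    ns.write("/home/user/notes.txt", b"hello", policy="mirror4")
    print(ns.object_id("/home/user/notes.txt"), ns.read("/home/user/notes.txt"))
```

In this model the redundancy policy belongs to the object, so the namespace layer never needs to know how or where the bytes are replicated.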

In most of the work on object storage, that layer actually lives in a hardware unit separate from the one with the POSIX filesystem image, but it could be in the kernel between the filesystem drivers and the block device drivers as well (if it hasn't been done already).

But we do need to ask whether there is a need to have more than one future filesystem type with this kind of storage function before we put a lot of effort into making a reusable layer.
