MeeGo is arguably the dark horse in the mobile platform race: it is new, unfinished, and unavailable on any currently-shipping product, but it is going after the same market as a number of more established platforms. MeeGo is interesting: it is a combined effort by two strong industry players which are trying, in the usual slow manner, to build a truly community-oriented development process. For the time being, though, important development decisions are still being made centrally. Recently, a significant decision has come to light: MeeGo will be based on the Btrfs file system by default.
Btrfs is seen as the long-term future of Linux filesystems, representing a much-needed clean break from the legacy filesystem designs we have been using for all these years. With the demise of reiser4 and the unavailability of ZFS, Btrfs would seem to be the only contender for that title. But talk about Btrfs is always framed in "it's not stable yet" terms, with few people willing to commit themselves to an actual date when the filesystem might be ready for production use. It is generally assumed that most cautious users will spend some years running on ext4 before making the jump to Btrfs. The 2.6.34 kernel will be released with this text still guarding the Btrfs configuration entry:
The MeeGo 1.0 release could happen as early as this month; given that, the above words might just seem a bit scary. In fact, they are more scary than they need to be: further on-disk format changes are not expected. The warning, it seems, will be scaled down for 2.6.35.
So why pick Btrfs for MeeGo? Arjan van de Ven described the decision this way:
He went on to describe a number of reasons why Btrfs makes sense for the MeeGo platform, starting with its data integrity features. The copy-on-write design which is at the core of Btrfs has a number of nice attributes, one of which is that users should never, ever see garbage data in files, even in a "pulled out the battery at the worst moment" situation. Device manufacturers, understandably, like that idea.
The on-disk compression feature is interesting for the MeeGo environment as well. It makes the initial system load take less space, making more available for the users of the device. But, as Arjan points out, manufacturers like it too: a smaller system image takes less time to shovel onto the storage device.
It would appear that there are a number of plans for the use of the Btrfs snapshot feature, starting with reversible package updates. With snapshots, a device can support a multi-user mode where each user appears to have the entire system to him- or herself. And the "reset to factory defaults" operation becomes a simple operation which does not require a separate recovery partition on the disk. Snapshots are not just for enterprise users anymore.
There are a number of other advantages, including small-file performance, built-in defragmentation (which is most useful for keeping boot time short), the storage management features, and more. In short, there's no doubt that Btrfs offers a useful set of features for any distribution; it's not hard to see why MeeGo wanted to use it. But that does leave an interesting open question: is Btrfs ready for inclusion into MeeGo, where it will, presumably, be installed onto systems intended for users who aren't looking to become development-stage filesystem testers?
Btrfs was initially merged for the 2.6.29 kernel; since then, patch activity looks like this:
So there is a steady rate of change to the filesystem, significant but not overwhelming. There is a wide range of contributors to this code, though the bulk of the work (by far) has been done by developers from Oracle and Red Hat. There are certainly people using Btrfs in normal use, and Fedora offers it as an experimental option. The mailing list shows a number of oops reports still, and it would appear that the famous ENOSPC issue (where the filesystem reacts poorly when the storage device overflows) is still not entirely solved. Significant feature patches - direct I/O support and RAID 4/5 support, for example - remain unmerged. In summary: Btrfs does not quite have that "it's done" look to it yet.
That said, it may well be getting close to ready for the sort of restricted and well-tested environment likely to be found in MeeGo deployments. Btrfs will also have stabilized further by the time devices actually start shipping with MeeGo - helped, no doubt, by the work of the MeeGo developers themselves. So, while this decision may appear to be ambitious now, it is not necessarily unreasonable. A dark-horse platform can only be helped by taking advantage of the best technology available to it.
MeeGo and Btrfs
Posted May 11, 2010 19:51 UTC (Tue) by arjan (subscriber, #36785) [Link]
We've been using btrfs in our builds ever since, and the MeeGo code release a while ago also obviously was using btrfs. It's been very stable for us in all our testing since August/September of last year, and the feature set is very attractive obviously (as is the data integrity)
MeeGo and Btrfs
Posted May 11, 2010 21:42 UTC (Tue) by walters (subscriber, #7396) [Link]
MeeGo and Btrfs
Posted May 11, 2010 21:50 UTC (Tue) by xnox (subscriber, #63320) [Link]
MeeGo and Btrfs
Posted May 12, 2010 8:34 UTC (Wed) by Los__D (guest, #15263) [Link]
MeeGo and Btrfs
Posted May 16, 2010 0:23 UTC (Sun) by Baylink (guest, #755) [Link]
I've carried an n800 around for over a year now, and *precisely the fact that it has two 16GB capable SD slots* is the reason I wouldn't replace it with an n810, much less an n900.
The Cloud is *greatly* overrated; I give it about 5 years to implode under the accumulated weight of the failings inherent in it which the people flacking it never mention. :-)
MeeGo and Btrfs
Posted May 11, 2010 22:01 UTC (Tue) by arjan (subscriber, #36785) [Link]
MeeGo and Btrfs
Posted May 11, 2010 22:37 UTC (Tue) by walovaton (guest, #57287) [Link]
MeeGo and Btrfs
Posted May 12, 2010 13:41 UTC (Wed) by csamuel (✭ supporter ✭, #2624) [Link]
MeeGo and Btrfs
Posted May 12, 2010 19:28 UTC (Wed) by walovaton (guest, #57287) [Link]
MeeGo and Btrfs
Posted May 12, 2010 5:55 UTC (Wed) by corsac (subscriber, #49696) [Link]
MeeGo and Btrfs
Posted May 12, 2010 17:04 UTC (Wed) by arjan (subscriber, #36785) [Link]
MeeGo and Btrfs
Posted May 13, 2010 23:11 UTC (Thu) by njs (guest, #40338) [Link]
MeeGo and Btrfs
Posted May 14, 2010 10:41 UTC (Fri) by nix (subscriber, #2304) [Link]
I can't wait.
;)
(Oracle, at least pre-Sun-merger: pretty good backend software. Frontend software horrifically awful. Should they ever produce a mobile phone I suspect my silly litany above would be an *understatement*.)
MeeGo and Btrfs
Posted May 20, 2010 19:54 UTC (Thu) by oak (guest, #2786) [Link]
MeeGo and Btrfs
Posted May 11, 2010 22:37 UTC (Tue) by lbt (subscriber, #29672) [Link]
I was using 2.6.33
I got disk full messages and yet df showed:
/dev/root 920M 651M 269M 71% /
So I went and asked about it ... some chat in #btrfs got me:
"I was told my fs was full. I just got a chance to do a du on it...585M"
"thats probably right"
&
"269m in use for metadata sounds correct"
&
"you will want to update your kernel to something 2.6.34 ish"
"OK, so btrfs is really not a good choice for small disks/devices not at ~50% overhead"
"yeah basically"
At the time I estimated 50%, I wasn't clear on the metadata measurement bug in my kernel. I guess it's more like 25% used by metadata. I'd be interested to know if this is a percentage overhead or a lump-sum.
I'm really interested in many of the btrfs features - and on the desktop I may have volumes I'd make this tradeoff for in a heartbeat... not so sure about on my phone.
MeeGo and Btrfs
Posted May 11, 2010 23:07 UTC (Tue) by elanthis (guest, #6227) [Link]
MeeGo and Btrfs
Posted May 12, 2010 0:23 UTC (Wed) by Fowl (subscriber, #65667) [Link]
MeeGo and Btrfs
Posted May 12, 2010 7:13 UTC (Wed) by eru (subscriber, #2753) [Link]
Me too... One major use for mobile phones and similar is snapping photos and videos, and these take space, especially as the quality is going up these days (latest models can film at HD resolution). I could live with a 16G flash giving me 14G, but not if it really allows only 8G.
MeeGo and Btrfs
Posted May 12, 2010 13:29 UTC (Wed) by nix (subscriber, #2304) [Link]
MeeGo and Btrfs
Posted May 12, 2010 17:07 UTC (Wed) by dlang (subscriber, #313) [Link]
while there are a couple spots where capacity is significantly above usage (personal desktops using TB size drives, phones holding audio files), there are MANY other situations where people routinely run up against size limits
think of laptops running SSDs
think servers with multi-TB raid arrays.
in both cases using significant amounts of space for metadata will significantly hurt you.
I have quite a few systems, and many of them do routinely run up against drive size limits. These tend to be personal machines with <150G (SSDs, laptops, high-speed SCSI drives), or servers (where I have multi-TB raid arrays, but they are sized for what I am storing in them)
losing 30% of the space to metadata would be a problem.
MeeGo and Btrfs
Posted May 13, 2010 1:56 UTC (Thu) by Fowl (subscriber, #65667) [Link]
That would be fairly key to keep latency down I would think.
MeeGo and Btrfs
Posted May 20, 2010 8:23 UTC (Thu) by mfedyk (guest, #55303) [Link]
(Btw, #btrfs can be very quiet at times and quite active at other times. Many times people join the channel and ask a question but leave before someone who can answer has seen it. Stick around in-channel for at least 12 hours; there are people in many different time zones.)
Btrfs divides block storage (spinning disks, SSDs, etc.) into chunks. You can think of them like block groups in ffs and ext*, but with more capabilities. There are meta-data chunks and data chunks, among other types of chunks. Chunks are fixed in size once allocated, until a rebalance operation is invoked (it can be run while the volume is online, no need to unmount). I suspect your meta-data chunk was 256MB.
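On the earlier "percentage or lump-sum" question: if the meta-data chunk really is a fixed 256MB (my guess above, not something I have checked in the code), the overhead is a lump sum whose share shrinks as the device grows. A rough sketch of that arithmetic; the function name and the 16GB comparison size are made up for illustration:

```python
# Sketch: share of a device taken by one fixed-size meta-data chunk.
# Assumption: a single 256 MiB chunk, as guessed in the comment above.
# Device sizes are illustrative, not measurements.

MIB = 1024 * 1024


def chunk_share(device_bytes: int, chunk_bytes: int = 256 * MIB) -> float:
    """Return the fraction of the device occupied by one fixed-size chunk."""
    if device_bytes <= 0:
        raise ValueError("device size must be positive")
    return chunk_bytes / device_bytes


if __name__ == "__main__":
    devices = [("920 MiB root", 920 * MIB), ("16 GiB flash", 16 * 1024 * MIB)]
    for label, size in devices:
        print(f"{label}: {chunk_share(size):.1%} of the device")
```

Under that assumption the chunk is a bit over a quarter of a 920MB root filesystem but under 2% of a 16GB device, which is why scaling the chunk size to the device would matter mostly for small flash.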
The size of btrfs meta-data can be a bit misleading. Btrfs does tail-packing on the end of files, and if the file is small enough it will be stored entirely in the meta-data chunk.
The numbers reported to df have been a bit misleading until recently. It used to be that "used" counted only data blocks, while "free" counted all types of free space, including free space in the meta-data chunks, which couldn't be used for full-sized data blocks (only small files and tails). Now "used" is counted as <used data blocks> + <used meta-data blocks> and "free" is counted as <free data blocks> + <block space not allocated to any chunk>. This cleared up most of the ambiguity where df was reporting free space but apps would see disk-full errors. (I'm not sure if that made it into 2.6.33.) Even if df reports zero free, though, you can still write small files that fit into tails, because those go into the meta-data chunk.
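To make the difference between the two accounting schemes concrete, here is a small sketch as I understand them. It is not the kernel's statfs code; the SpaceInfo fields and the numbers (in MB) are invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class SpaceInfo:
    """Hypothetical breakdown of a btrfs volume, all values in MB."""
    data_used: int
    data_free: int    # free space inside allocated data chunks
    meta_used: int
    meta_free: int    # free space inside allocated meta-data chunks
    unallocated: int  # space not yet assigned to any chunk


def df_old(s: SpaceInfo) -> tuple[int, int]:
    """Older reporting: used = data only; free = every kind of free space."""
    used = s.data_used
    free = s.data_free + s.meta_free + s.unallocated
    return used, free


def df_new(s: SpaceInfo) -> tuple[int, int]:
    """Newer reporting: used = data + meta-data; free = data free + unallocated."""
    used = s.data_used + s.meta_used
    free = s.data_free + s.unallocated
    return used, free


if __name__ == "__main__":
    # Data chunks are full and nothing is unallocated, but the meta-data
    # chunk still has room: the "df shows space, apps see disk full" case.
    vol = SpaceInfo(data_used=600, data_free=0, meta_used=60,
                    meta_free=200, unallocated=0)
    print("old (used, free):", df_old(vol))
    print("new (used, free):", df_new(vol))
```

With these made-up numbers the older scheme reports 200MB free even though no full-sized data block can be written, while the newer scheme reports zero free.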
A patch to mkfs.btrfs and the kernel module that scales the size of meta-data chunks based on blockdev size would be a welcome contribution from the meego developers I suspect.
The Fedora 13 kernel has a more recent version of btrfs than stock 2.6.33 has, and the current advice is to run the latest btrfs from git so that duplicate bug reports aren't filed for issues already fixed.
You can find more details of your btrfs volume with these commands:
btrfs-show
(or "btrfs fi sh" in latest git btrfs-progs)
btrfs fi df <path to btrfs volume>
(in latest git btrfs-progs and recent btrfs kernel module)
"fi" is short for "filesystem" and can be shortened as long as it is unambiguous.
In short, it looks like meego will only be using the features that work well in btrfs right now and I suspect with their somewhat narrow use cases it has a good chance of working well for them right now. On that note the btrfs project can always use more patches, testing and documentation.
(I have been testing btrfs for the last few months and I run it as my root filesystem on my desktop and laptop. I do not represent the project and only explain what I think I understand of the project and filesystem. Please let me know if you have found any errors.)
MeeGo and Btrfs
Posted May 12, 2010 1:57 UTC (Wed) by mcgrof (subscriber, #25917) [Link]
MeeGo and Btrfs
Posted May 12, 2010 15:22 UTC (Wed) by wingo (guest, #26929) [Link]
Graphs
Posted May 12, 2010 15:29 UTC (Wed) by corbet (editor, #1) [Link]
I just fed the numbers into gnumeric and had it produce the graph. Not the world's prettiest result, but it's quick...
Graphs
Posted May 13, 2010 2:04 UTC (Thu) by ccurtis (guest, #49713) [Link]
Even better than that, though, would be comparisons with patches for other filesystems during the same time. And even better than that would be those same graphs over the lifetime of the filesystem so that we could see where in the development lifecycle btrfs is compared to, say, ext3. Is the LoC change of btrfs comparable to that of ext3 in 2005? 2002? 2009?
MeeGo and Btrfs
Posted May 12, 2010 10:23 UTC (Wed) by Funcan (subscriber, #44209) [Link]
MeeGo and Btrfs
Posted May 13, 2010 17:43 UTC (Thu) by i3839 (guest, #31386) [Link]
MeeGo and Btrfs
Posted May 26, 2010 3:05 UTC (Wed) by rodgerd (guest, #58896) [Link]
Before I let it near anything even remotely important I'd want the standard suite of tools to allow me to remove and rename subvolumes (available via a third-party btrfs tool), and apply block quotas to subvolumes (because if I can't stop a large download using up all my disk space and stopping me from logging in because /tmp isn't segregated from /var, I've got a system which is worthless for multiuser purposes).
MeeGo and Btrfs
Posted Jun 10, 2010 3:05 UTC (Thu) by mfedyk (guest, #55303) [Link]
This is a planned feature. I plan to use it for this as well as openvz/lxc type virtualization.
"rename subvolumes"
This should be a simple addition to the btrfs-progs tools. In the meantime, you can snapshot the subvolume with the new name and delete the old subvolume location with the current tools.
You should post this to the btrfs mailing list.
Copyright © 2010, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds