At the 2013 LSFMM Summit, Btrfs developer Josef Bacik gave an update on the status of the filesystem. It has seen a lot of changes over the last year, he said, with around 800 changesets being merged. He contrasted that with ext4, which has seen around 250. There are a number of features being worked on, including subvolume quota groups and a new restriper to use when adding more disks to a RAID filesystem.
The performance of Btrfs is now roughly the same as ext4 and twice that of XFS on spinning disks, Bacik said. On Fusion-io devices it is "abysmally slow", about the same as XFS. That is caused by the way that fsync() works ("write wait, write wait"), which he is hoping to get around by using atomic writes. The slowdown shows up in workloads using direct I/O, which is awful for Btrfs. In particular, writing 4K then doing an fsync() is something of a worst case for Btrfs.
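The "write wait, write wait" pattern can be sketched in a few lines. This is only an illustration, not anything shown in the talk: the function name and the comparison mode are hypothetical, and it uses ordinary buffered writes rather than direct I/O (which would need aligned buffers). Each 4K write is followed by an fsync() that blocks until the data is on stable storage.

```python
import os
import tempfile
import time

BLOCK_SIZE = 4096  # the 4K write size mentioned in the talk


def write_with_fsync(path, blocks, fsync_each=True):
    """Write `blocks` 4K blocks to `path`; return (bytes_written, seconds).

    Hypothetical helper for illustration only.
    """
    payload = b"\0" * BLOCK_SIZE
    written = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(blocks):
            written += os.write(fd, payload)
            if fsync_each:
                os.fsync(fd)  # "write, wait": block until this write is durable
        if not fsync_each:
            os.fsync(fd)  # a single flush at the end, for comparison
    finally:
        os.close(fd)
    return written, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "fsync-test.dat")
        for each in (True, False):
            nbytes, secs = write_with_fsync(target, 256, fsync_each=each)
            print(f"fsync_each={each}: {nbytes} bytes in {secs:.4f}s")
```

Running it on different filesystems and devices gives a rough feel for how expensive the per-write fsync() is; the numbers depend entirely on the hardware and are not a substitute for a real benchmark.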
The problems with handling full filesystems are "kind of an ongoing thing", Bacik said. There always seems to be something broken in that path. On the other hand, send/receive is working well, and defragmentation is working better. The extended inode references (IREF) problem has been fixed. That limited the number of hard links to a specific inode in a directory to two in the worst case, and only 40 in the best case. It is now only limited by disk space.
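A quick way to see a per-directory hard-link limit is to keep linking one file inside a single directory until the filesystem refuses. The sketch below is a hypothetical probe, not a Btrfs tool; the function and file names are made up for illustration, and it simply stops at the first error (typically EMLINK, "Too many links").

```python
import os
import tempfile


def count_links_created(directory, attempts):
    """Try to create `attempts` hard links to one file inside `directory`.

    Returns the number of links created before the first error.
    Hypothetical helper for illustration only.
    """
    target = os.path.join(directory, "target")
    with open(target, "w") as f:
        f.write("data\n")
    created = 0
    for i in range(attempts):
        try:
            os.link(target, os.path.join(directory, f"link-{i:04d}"))
        except OSError as err:
            # For example EMLINK when a filesystem limit is reached
            print(f"stopped after {created} links: {err.strerror}")
            break
        created += 1
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(count_links_created(tmp, 1000), "links created")
```

Longer link names use up the available space faster on a filesystem that stores the references together, which is why the old limit varied between a worst and a best case.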
RAID 5/6 has finally been merged, he said. It is not power-failure-safe yet, though that fix is coming soon. It requires a format change, which has delayed it somewhat. The code for replacing a broken drive is "much cleaner and faster". It is also a lot easier for administrators to use. There is now an fsck that works and fixes problems in the filesystem. It checks the extent trees and checksum trees along with the free space in the filesystem. There is also a btrfs-image tool for creating an image of a Btrfs partition.
A new release of Btrfs will be coming soon, Bacik said. There will hopefully be more steady releases in the future, not just of the mainline code, but also of the utilities in btrfs-progs.
Running out of space (i.e., ENOSPC) has been a big problem for Btrfs, though he thinks there is now a solution for it. Basically, the filesystem never knows how much space metadata is going to take, so it seriously overestimates how much to reserve. The fix will be a special "chunk" in the log where any overflow goes.
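From an application's point of view, ENOSPC arrives as an ordinary error from a system call. The following is a minimal sketch of catching it, assuming a hypothetical helper named try_write(); it says nothing about how Btrfs reserves metadata space internally. On Linux, /dev/full is a convenient way to provoke the error without filling a disk.

```python
import errno
import os


def try_write(path, data):
    """Write `data` to `path`; return True on success, False on ENOSPC.

    Hypothetical helper for illustration only; other errors are re-raised.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            os.write(fd, data)
            os.fsync(fd)  # space problems can also surface at flush time
        finally:
            os.close(fd)
    except OSError as err:
        if err.errno == errno.ENOSPC:
            return False  # "No space left on device"
        raise
    return True


if __name__ == "__main__":
    # /dev/full (Linux) rejects every write with ENOSPC
    if os.path.exists("/dev/full"):
        print("write to /dev/full succeeded:", try_write("/dev/full", b"x"))
```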
Online deduplication has gone through a couple of iterations. It will probably go into the 3.11 kernel, Bacik said. It will not be the default, and will require a format change before it can be turned on. Offline deduplication can be done in user space.
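As a rough idea of what a user-space pass might start with, the sketch below hashes fixed-size blocks of a set of files and reports blocks whose contents appear more than once. It is an assumption-laden illustration: the function name, the 4K block size, and the choice of SHA-256 are all made up here, and it stops at finding candidates. Actually sharing the duplicate data on disk would need filesystem support that this code does not touch.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def find_duplicate_blocks(paths, block_size=BLOCK_SIZE):
    """Map each block digest seen more than once to its (path, offset) list.

    Hypothetical helper; it only identifies candidates and changes nothing.
    """
    seen = defaultdict(list)
    for path in paths:
        with open(path, "rb") as f:
            offset = 0
            while True:
                block = f.read(block_size)
                if not block:
                    break
                digest = hashlib.sha256(block).hexdigest()
                seen[digest].append((path, offset))
                offset += len(block)
    # Keep only digests that occur at more than one location
    return {d: locs for d, locs in seen.items() if len(locs) > 1}
```

A real tool would also compare the candidate blocks byte for byte before treating them as identical, rather than trusting the hash alone.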
In answer to the "are we there yet?" question, Bacik said that we would be by the "end of the year". He has said that for the last three years now, but is getting more comfortable that it really is stabilizing. In the past, he has never had time to work on features because he has been fixing bugs, but there are fewer bugs to fix now. There are also a bunch of user-space tools to help "if things go horribly wrong". The ENOSPC problem should be handled in the next few months.
Toward the end of this year, or early next year, the project can start talking to distributions about becoming the default filesystem for new installations. Beyond that, performance is the next big focus for the team, he concluded.
LSFMM: Btrfs: "are we there yet?"
Posted May 10, 2013 17:25 UTC (Fri) by heijo (guest, #88363)
Seems to me this is the most important piece of information.
Copyright © 2013, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds