
Which filesystem for Samba4?

Andrew Tridgell has been hacking away on Samba 4 for a while now; that project has gotten to the point that he has started doing some performance testing. His first set of results looked like this (numbers in MB/sec):

Filesystem      No xattr    With xattr
ext2            68          64
ext3            67          58
xfs             62          40
xfs 2K inode    63          58
tmpfs           69          --
jfs             36          29
reiser3         58          44

These results show that every filesystem tested with extended attributes slows down when they are in use, XFS with default-sized inodes most of all. This matters for Samba 4 because Windows filesystems make heavy use of extended attributes. As Tridge put it:

The high cost of xattr support is a bit of a problem.... I hope we can reduce the cost of xattrs as otherwise Samba4 is going to be seriously disadvantaged when full windows compatibility is needed. I'm guessing that nearly all Samba installs will be using xattrs by this time next year, as we can't do basic security features like WinXP security zones without them, so making them perform well will be important.
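To put the xattr cost in relative terms, the drop for each filesystem can be worked out from the table above. The following is a minimal sketch that only does that arithmetic; the names `results` and `xattr_slowdown` are made up for illustration, and tmpfs is left out because no xattr figure was reported for it.

```python
# Throughput in MB/sec, copied from the table above: (no xattr, with xattr).
# tmpfs is omitted because no xattr figure was reported for it.
results = {
    "ext2": (68, 64),
    "ext3": (67, 58),
    "xfs": (62, 40),
    "xfs 2K inode": (63, 58),
    "jfs": (36, 29),
    "reiser3": (58, 44),
}


def xattr_slowdown(no_xattr, with_xattr):
    """Return the percentage of throughput lost when xattrs are enabled."""
    return 100.0 * (no_xattr - with_xattr) / no_xattr


slowdowns = {fs: xattr_slowdown(*pair) for fs, pair in results.items()}

# Print the filesystems from smallest to largest relative loss.
for fs, pct in sorted(slowdowns.items(), key=lambda item: item[1]):
    print(f"{fs:<14}{pct:5.1f}% slower with xattrs")
```

On these figures, XFS with default inodes gives up roughly a third of its throughput, while XFS with 2K inodes gives up less than a tenth.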

The cause of the performance problems is not particularly mysterious. Most filesystems store extended attributes in a special data block, away from the rest of the associated file's metadata. So working with a file's extended attributes forces the filesystem to go out and read another block from the drive. The extra transfers and seeks take their toll on performance, as can be seen in the numbers above.

A pointer to the solution can be seen there as well. The "xfs 2K inode" results were obtained by turning on the XFS large inode option. This option expands the size of the on-disk inode structure, making room for the extended attributes to be stored there. When the inode is read from the drive, the extended attributes come with it, and no separate I/O is required to work with them. When this option is enabled, the performance hit for using extended attributes with XFS is much reduced.
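The effect of the larger inode can be illustrated with a toy model that counts reads. This is only a sketch under stated assumptions: the 128-byte figure for the fixed inode fields, the function names, and the workload are all illustrative, and the model ignores caching and the fact that several inodes share a disk block. It is not a description of the real XFS or ext3 on-disk layout.

```python
# Toy model only: all sizes and names here are illustrative assumptions,
# not the on-disk layout of XFS or ext3.
BASE_INODE_BYTES = 128  # assumed space taken by the fixed inode fields


def reads_to_fetch_xattrs(inode_size, xattr_bytes):
    """Count the reads needed to get a file's inode plus its xattrs.

    The inode itself always costs one read. If the xattrs fit in the
    space left over inside the inode, they arrive with that read;
    otherwise a second read of a separate attribute block is needed.
    """
    spare = inode_size - BASE_INODE_BYTES
    if xattr_bytes <= spare:
        return 1
    return 2


def total_reads(inode_size, xattr_sizes):
    """Sum the reads over a collection of files with the given xattr sizes."""
    return sum(reads_to_fetch_xattrs(inode_size, n) for n in xattr_sizes)


# Hypothetical workload: 1000 files, each carrying 64 bytes of xattrs.
workload = [64] * 1000
small = total_reads(128, workload)  # no spare room: separate block per file
large = total_reads(256, workload)  # xattrs fit inside the inode
print(small, large)
```

In this model the larger inode halves the number of reads, which is the saving the paragraph above describes; real gains depend on how much of the metadata is already cached.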

It turns out that a large inode patch for ext3 has been in the works for a while; it has passed muster with the ext3 developers, but has not yet been pushed into the mainline. Tridge tried this patch and was pleased with the results:

Using a 256 byte inode on ext3 gained a factor of up to 7x in performance, and only lost a very small amount when xattrs were not used. It took ext3 from a very mediocre performance to being the clear winner among current Linux journaled filesystems for performance when xattrs are used. Eventually I think that larger inodes should become the default.

First, however, the patch must be merged. With testimonials like this, that merger is likely to happen in the relatively near future.

One interesting mystery remains, however: Tridge gets notably better results with 2.6.10-rc2-mm2 than with 2.6.10-rc2. As of this writing, nobody seems to have an explanation for why ext3 should perform that much better in the -mm kernel. Inquiring minds very much want to know, however, and Andrew Morton is working to find out which patch makes the difference.



Which filesystem for Samba4?

Posted Nov 24, 2004 14:25 UTC (Wed) by rl (subscriber, #2336)

The mystery has been solved.

From: tridge@samba.org
Date: Wed, 24 Nov 2004 18:53:47 +1100

You can call off your bsearch - I found the culprit.

For the 2.6.10-rc2 tests I was running with the patch from Andreas that added large ext3 inode support (in order to also test the ext3-256 case). For the -mm2 test I wasn't.

This patch was supposed to have no effect if large inodes were not set up at mkfs time. Unfortunately it does have an effect, as it also removes the in-place xattr modification logic from ext3_xattr_set_handle(), so every xattr set becomes the same as a delete+create pair. In plain -rc2 and in -mm2 an xattr set of the same size will be done in-place. As every xattr set is of the same size in dbench3, this made a huge difference.
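The difference between the two code paths can be sketched with a toy attribute store that counts operations. This is an illustration only: `XattrStore`, its counter, and the `run` helper are hypothetical and do not mirror the kernel's ext3_xattr_set_handle(); the sketch just assumes that an in-place overwrite costs one operation and a delete+create pair costs two.

```python
# Toy model only: XattrStore and its counter are hypothetical and do not
# mirror the kernel's ext3_xattr_set_handle() implementation.
class XattrStore:
    def __init__(self, in_place=True):
        self.in_place = in_place
        self.entries = {}    # attribute name -> value bytes
        self.operations = 0  # count of modelled entry operations

    def set(self, name, value):
        old = self.entries.get(name)
        if self.in_place and old is not None and len(old) == len(value):
            self.operations += 1  # overwrite the value where it sits
        else:
            if old is not None:
                self.operations += 1  # delete the existing entry
            self.operations += 1  # create the new entry
        self.entries[name] = value


def run(store, updates=1000):
    """Create one attribute, then rewrite it with same-sized values."""
    store.set("user.demo", b"\x00" * 32)
    for i in range(updates):
        store.set("user.demo", bytes([i % 256]) * 32)
    return store.operations


with_in_place = run(XattrStore(in_place=True))
without_in_place = run(XattrStore(in_place=False))
print(with_in_place, without_in_place)
```

With every update the same size, as in dbench3, the store without the in-place path does close to twice the work in this model.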

Which filesystem for Samba4?

Posted Nov 27, 2004 3:24 UTC (Sat) by stevef (subscriber, #7712)

The working set seems to have been too small to cause much disk activity, which may explain the counterintuitive result (ext3 being faster than jfs and xfs). Most of the data I have seen on larger server benchmarks (whose working set exceeds physical memory) showed ext3 somewhat worse. The updates to ext3 seem promising, though.

In any case, a good xattr performance test à la iozone or equivalent would be helpful as well.

Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds