LWN.net Logo

Improving ext4: bigalloc, inline data, and metadata checksums

Improving ext4: bigalloc, inline data, and metadata checksums

Posted Dec 3, 2011 0:12 UTC (Sat) by walex (subscriber, #69836)
In reply to: Improving ext4: bigalloc, inline data, and metadata checksums by nix
Parent article: Improving ext4: bigalloc, inline data, and metadata checksums

«tytso wasn't working for RH when ext4 started up, and still isn't working for them now. So their influence must be more subtle. »

Quite irrelevant: a lot of file system were somebody's hobby file systems, but they did not achieve prominence and instant integration into mainline even if rather alpha, and RedHat did not spend enormous amounts of resources quality assuring them to make them production ready either, and quality assurance is a pretty vital detail for file systems, as the Namesys people discovered.

Pointing to tytso is just misleading. Also because ext4 really was seeded by Lustre people before tytso became active on it in his role as ext3 curator (and in 2005, which is 5 years later than when JFS became available).

Similarly for BTRFS, it has been initiated by Oracle (who have an ext3 installed base), but its main appeal is still as the next inplace upgrade on the Red Hat installed base (thus the interest in trialing it in Fedora, where EL candidate stuff is mass tested), even if for once it is not just an extension of the ext line but has some interesting new angles.

But considering ext4 on its own is a partial view; one must consider the pre-existing JFS and XFS stability and robustness and performance, and from a technical point of view ext4 is not that interesting (euphemism) and its sole appeal is inplace upgrades, and the widest installed based for that is RedHat, and to a large extent that could have been said of ext3 too.


(Log in to post comments)

Improving ext4: bigalloc, inline data, and metadata checksums

Posted Dec 3, 2011 0:52 UTC (Sat) by nix (subscriber, #2304) [Link]

So you're blaming the Lustre people now? You do realise Lustre is not owned by Red Hat, and never was?

And if you're claiming that btrfs is effectively RH-controlled merely because RH customers will benefit, then *everything* that happens to Linux must by your bizarre definition be RH-controlled. That's a hell of a conspiracy: so vague that the coconspirators don't even realise they're conspiring!

Improving ext4: bigalloc, inline data, and metadata checksums

Posted Apr 13, 2012 19:34 UTC (Fri) by fragmede (subscriber, #50925) [Link]

I though *Oracle* was a/the big contributor to btrfs...

Improving ext4: bigalloc, inline data, and metadata checksums

Posted Dec 3, 2011 19:45 UTC (Sat) by tytso (subscriber, #9993) [Link]

Sure, and I've always been careful to give the Lustre folk credit for the work that they did between 2003 and 2006 extending ext3 to add support for delayed allocation (which JFS didn't have), multi-block allocation (which JFS didn't have) and extents (OK, JFS had extents).

But you can't have it both ways. If that code had been in use by paying Lustre companies, then it's hardly alpha code, wouldn't you agree?

And why did the Lustre developers at Clustrefs chose ext3? Because the engineers they hired knew ext3, since it was a community-supported distribution, whereas JFS was controlled by a core team that was all IBM'ers, and hardly anyone outside of IBM was available who knew JFS really well.

But as others have already pointed out, there was no grand conspiracy to pick ext2/3/4 over its competition. It won partially due to its installed base, and partially because of the availability of developers who understood it (and books written about it, etc., etc., etc.) The way you've been writing you seem to think there was some secret cabal (at Red Hat?) that made these decisions, and there was a "mistake" because they didn't chose your favorite file systems.

The reality is that file systems all have trade-offs, and what's good for some people are not so great for others. Take a look at some of the benchmarks at btrfs.boxacle.net; they're a bit old, but they are well done, and they show that across many different workloads at that time (2-3 years ago) there was no one single file system that was the best across all of the different workloads. So anyone who only uses a single workload, or a single hardware configuration, and tries to use that to prove that their favorite file system is the "best" is trying to sell you something, or who is a slashdot kiddie who has a fan-favorite file system. The reality is a lot more complicated than that, and it's not just about performance. (Truth be told, for many/most uses cases, the file system is not the bottleneck.) Issues like availability of engineers to support the file system in a commercial product, the maturity of the userspace support tools, ease of maintainability, etc. are at least as important if not more so.

Improving ext4: bigalloc, inline data, and metadata checksums

Posted Dec 3, 2011 20:43 UTC (Sat) by dlang (✭ supporter ✭, #313) [Link]

at the time ext3 became the standard, JFS and XFS had little support (single vendor) and were both 'glued on' to linux with heavy compatibility layers.

Add to this the fact that you did not need to reformat your system to use ext3 when upgrading, and the fact that ext3 became the standard (taking over from ext2, which was the prior standard) is a no-brainer, and no conspiracy.

In those days XFS would outperform ext3, but only in benchmarks on massive disk arrays (which were even more out of people's price ranges at that point then they are today)

XFS was scalable to high-end systems, but it's low-end performance was mediocre

looking at things nowdays, XFS has had a lot of continuous improvement and integration, both improving it's high-end performance and reliability, and improving it's low-end performance without loosing it's scalability. There are also more people, working for more companies supporting it, making it far less of a risk today, with far more in the way of upsides.

JFS has received very little attention after the initial code dump from IBM, and there is now nobody actively maintaining/improving it, so it really isn't a good choice going forward.

reiserfs had some interesting features and performance, but it suffered from some seriously questionably benchmarking (the one that turned me off to it entirely was a spectacular benchmarking test that reiserfs completed in 20 seconds that took several minutes on ext*, but then we discovered that reiserfs defaulted to a 30 second delay before writing everything to disk, so the entire benchmark was complete before any day started getting written to disk, after that I didn't trust anything that they claimed), and a few major problems (the fsck scrambling is a huge one). It was then abandoned by the developer in favor of the future reiserfs4, with improvements that were submitted being rejected as they were going to be part of the new, incompatible filesystem.

ext4 is in large part a new filesystem who's name just happens to be similar to what people are running, but it has now been out for several years, with developers who are responsive to issues, are a diverse set (no vendor lock-in or dependencies) and are willing to say where the filesystem is not the best choice.

btrfs is still under development (the fact that they don't yet have a fsck tool is telling), is making claims that seem too good to be true, and have already run into several cases where they have pathalogical behavior and have had to modify things significantly. I wouldn't trust it for anything other than non-critical personal use for another several years.

as a result, I am currently using XFS for the most part, but once I get a chance to do another round of testing, ext4 will probably join it. I have a number of systems that have significant numbers of disks, so XFS will probably remain in use.

Improving ext4: bigalloc, inline data, and metadata checksums

Posted Dec 4, 2011 1:12 UTC (Sun) by nix (subscriber, #2304) [Link]

ext4 is in large part a new filesystem who's name just happens to be similar to what people are running
ext4 is ext3 with a bunch of new extensions (some incompatible): indeed, initially the ext4 work was going to be done to ext3, until Linus asked for it to be done in a newly-named clone of the code instead. It says a lot for the ext2 code and disk formats that they've been evolvable to this degree.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds