Dmitry Monakhov prefaced his 2015 LSFMM Summit session on filesystem defragmentation with a statement that the "problem is almost already solved". His session turned into a largely informational description of the status of a defragmentation tool that he has been working on.
Over time, filesystems change and cannot avoid fragmentation issues, he said. For example, extracting a Linux source tree results in many small files that the filesystem tries to allocate close to each other. Building in the tree then creates lots of temporary files that get removed, so the filesystem becomes fragmented.
Beyond appearing in regular filesystems, these fragmentation problems show up in thin provisioning systems, as well as for shingled magnetic recording (SMR) devices, he said. In addition, to make boot times shorter, it would be best to lay out all the needed files sequentially on the disk, which may require defragmentation.
The fragmentation problem is already solved for large files. Btrfs, XFS, and ext4 all have tools for doing defragmentation on files. But there is no solution for directory fragmentation. The filesystems try to put files that are in the same directory close to each other on the disk, but as files get deleted or moved, fragmentation of the directory occurs.
To perform defragmentation, it is often necessary to copy file data from one place to another. Monakhov suggested that a checksum could be calculated on the data when doing that copy, which could then be stored in a "trusted" extended attribute (xattr). He noted that overlayfs uses the "trusted.overlay" xattr, which can only be modified by processes with CAP_SYS_ADMIN, so a "trusted.sha1" (or other hash) could be calculated and stored when copying data for defragmentation.
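The checksum-on-copy idea can be sketched as follows. This is only an illustration of the scheme, not Monakhov's implementation: the function names are invented, and since writing "trusted.*" xattrs requires CAP_SYS_ADMIN, the sketch defaults to the unprivileged "user." namespace so it can run as a normal user.

```python
import hashlib
import os

def copy_with_checksum(src, dst, xattr_name="user.sha1"):
    """Copy src to dst, hashing the data as it streams through, then
    record the hash in an extended attribute on dst.

    The tool described above would use "trusted.sha1" (writable only
    with CAP_SYS_ADMIN); "user.sha1" is used here so the sketch runs
    unprivileged.  All names are illustrative.
    """
    h = hashlib.sha1()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(1 << 16):
            h.update(chunk)
            fout.write(chunk)
    digest = h.hexdigest()
    try:
        os.setxattr(dst, xattr_name, digest.encode())
    except OSError:
        pass  # filesystem may not support user xattrs
    return digest

def verify(path, xattr_name="user.sha1"):
    """Recompute the hash and compare it to the stored xattr; returns
    None if no xattr is present (or xattrs are unsupported)."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        while chunk := f.read(1 << 16):
            h.update(chunk)
    try:
        stored = os.getxattr(path, xattr_name).decode()
    except OSError:
        return None
    return stored == h.hexdigest()
```

A verification pass like `verify()` is what would let executables be checked against the stored hash before being run, as described below.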
Executable files could then have their contents checked and compared to the hash value before being executed. He proposed adding that capability to his tool, but it seemed to be something of an aside. It is not clear how it relates to the integrity measurement architecture (IMA), for example.
He has been working on a tool called e4defrag2 (developed in a branch of e2fsprogs) that will perform defragmentation. It is mostly independent of the filesystem type. It uses the same block-scanning code to find fragmentation, but ext4 and XFS use different ioctl() names for their defragmentation operations.
The result is a "giant utility that works for everything", Monakhov said. The filesystem-dependent part is roughly 100 lines of code. This "universal defragmenter" will be released soon.
Ted Ts'o asked what would be needed to eliminate the 100 lines. He asked if wiring up the XFS ioctl() name into ext4 would help. Monakhov said that the tool needs to get the block bitmap from the filesystem, which is also different between the filesystems. Ts'o and Dave Chinner indicated that they would attempt to provide the same interfaces. Chinner did caution that XFS cannot defragment a range in a file, only the whole file. That is different from ext4, Monakhov said.
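One filesystem-independent piece of a fragmentation scan is counting how many extents back each file, which the kernel exposes through the FIEMAP ioctl(). The sketch below is not from e4defrag2; it is a minimal illustration of extent counting using the constants from &lt;linux/fiemap.h&gt;, and it returns None on filesystems that do not support FIEMAP.

```python
import fcntl
import struct

# From <linux/fiemap.h>: _IOWR('f', 11, struct fiemap) and the
# flag that forces dirty data to be synced before mapping.
FS_IOC_FIEMAP = 0xC020660B
FIEMAP_FLAG_SYNC = 0x0001

def extent_count(path):
    """Return the number of extents backing a file, or None if the
    filesystem does not support FIEMAP.

    With fm_extent_count set to 0 the kernel fills in only
    fm_mapped_extents, which is all a fragmentation scan needs.
    """
    # struct fiemap header: fm_start, fm_length (u64), then
    # fm_flags, fm_mapped_extents, fm_extent_count, fm_reserved (u32)
    req = struct.pack("=QQLLLL", 0, 0xFFFFFFFFFFFFFFFF,
                      FIEMAP_FLAG_SYNC, 0, 0, 0)
    try:
        with open(path, "rb") as f:
            res = fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, req)
    except OSError:
        return None
    return struct.unpack("=QQLLLL", res)[3]  # fm_mapped_extents
```

A heavily fragmented file reports many extents here; a freshly defragmented one ideally reports a single extent. This is the same mechanism the filefrag utility uses.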
[I would like to thank the Linux Foundation for travel support to Boston for the summit.]
Filesystem defragmentation
Posted Mar 26, 2015 16:41 UTC (Thu) by pr1268 (subscriber, #24648) [Link]
Forgive me for being a little unenlightened, but didn't I read somewhere a number of years ago that Linux filesystems (e.g. Ext2/3/4)1 are resistant to fragmentation (or at least the ill effects thereof) by design?
Not that I'm trying to dismiss Dmitry's work. I suppose SMR devices expose new engineering challenges with respect to filesystem layout.
Also, just an idea: If Dmitry's utility is as filesystem-agnostic as mentioned, why not drop the "e4" from the tool's name? Or, how about a more universal-sounding "fsdefrag2"? (Again, just a suggestion—I'm pleased that someone is working on keeping our filesystems neat-and-orderly.) :-)
1 Also ReiserFS, JFS, XFS, etc.
Filesystem defragmentation
Posted Mar 26, 2015 22:45 UTC (Thu) by flussence (subscriber, #85566) [Link]
There are still good reasons for defragmentation, the main one being that the above scheme tends to fragment free space over time, making it increasingly difficult to find large contiguous areas on a disk that is filling up. There's also the use case of boot and possibly individual applications, where you can measure an access pattern once and optimize heavily for it - `e4rat` does that, but the name implies it's ext4-only.
The official builds of Firefox have been doing a similar thing for a few years now, by packing all the data needed for startup into a large, carefully crafted .so file that causes it to be read more or less linearly. If some of that behaviour can be done in the filesystem, everyone wins.
Filesystem defragmentation
Posted Mar 30, 2015 4:58 UTC (Mon) by martinfick (subscriber, #4455) [Link]
Filesystem defragmentation
Posted Mar 30, 2015 13:23 UTC (Mon) by Jonno (subscriber, #49613) [Link]
Well, to be honest, there is *some* truth to this "myth", but fragmentation resistance is not a binary thing, rather a matter of degree, and while Linux is a whole lot better than Windows, it is not (and cannot possibly be) perfect.
It's a bit like the FAT vs. NTFS story back in the NT4 days. NTFS was so much better than FAT that MS thought a defragmentation tool in NT4 was unnecessary. However, after several months of typical use (or several days of a pathologically bad use pattern) the performance penalty still became significant, and MS had to back-pedal.
Now the Linux filesystems are even better than NTFS, but the story is essentially the same, except that knowledgeable Linux people have generally not claimed defragmentation tools to be completely unnecessary, only not a priority (until now, apparently).
Copyright © 2015, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds