Transparent decompression for ext4

By Jonathan Corbet
July 31, 2013

Transparent compression is often found on the desired feature list for new filesystems; compressing data on the fly allows the system to make better use of both storage space and I/O bandwidth, at the cost of some extra CPU time. The "transparent" in the name indicates that user space need not be aware that the data is compressed, making the feature easy to use. Thus, filesystems like btrfs support transparent compression, while Tux3 has a draft design toward that end. A recent proposal to add compression support to ext4, however, takes a bit of a different approach. The idea may run into trouble on its way into a mainline kernel, but it is indicative of how some developers are trying to get better performance out of the system.

Dhaval Giani's patch does not implement transparent compression; instead, the feature is transparent decompression. With this feature, the kernel will allow an application to read a file that has been compressed without needing to know about that compression; the kernel will handle the process of decompressing the data in the background. The creation of the compressed file is not transparent, though; that must be done in user space. Once the file has been created and marked as compressed (using chattr), it cannot be changed, only deleted and replaced. So this feature enables the transparent use of read-only compressed files, but only after somebody has taken the time to set those files up specially.

This feature is aimed at a rather narrow use case: enabling Firefox to launch more quickly. Desktop users will (as Taras Glek notes) benefit from this feature, but the target users are on Android. Such systems tend to have relatively slow storage devices — slow enough that compressing the various shared objects that make up the Firefox executable and taking the time to decompress them in the CPU is a net win. Decompression at startup time slows things down, but it is still faster than reading the uncompressed data from a slow drive. Firefox currently uses its own custom dynamic linker to load compressed libraries (such as libxul.so) during startup. Moving the decompression code into the filesystem would allow the Firefox developers to dispense with their custom linker.

Dhaval's implementation has a few little problems that could get in the way of merging. Decompression must happen in a single step into a single buffer, so the application must read the entire file in a single read() call; that makes the feature a bit less than fully transparent. Mapping compressed files into memory with mmap() is not supported. The "szip" compression format is hardwired into the implementation. A new member is added to the file_operations structure to read compressed files. And so on. These shortcomings are understood and acknowledged from the outset; Dhaval's main purpose in posting the code at this time was to get feedback on the general design. He plans to fix these issues in subsequent versions of the patch.

But fixing all of those problems will not help if the core filesystem maintainers (who have, thus far, remained silent) object to the intent of the patch. A normal expectation when dealing with filesystems is that data written with write() will look the same when retrieved by a subsequent read() call. The transparent decompression patch violates that assumption by having the kernel interpret and modify the data written to disk — something the kernel normally tries hard not to do.

Having the kernel interpret the data stream could perhaps be countenanced if there were a compelling reason to add this functionality to the kernel. But, if such a reason exists, it was not presented with the patch set. Firefox has already solved this problem with its own dynamic linker; that solution lives entirely in user space. A fundamental rule of kernel design is that work should not be done in the kernel if it can be done equally well in user space; that suggests that an in-kernel implementation of file decompression would have to be somehow better than what Firefox is using now. Perhaps an in-kernel implementation is better, but that case has not yet been made.

The end result is that Dhaval's patch is unlikely to receive serious consideration at this point. Before kernel developers look at the details of a patch, they usually want to know why the patch exists in the first place — how does that patch make the system better than before? That "why" is not yet clear, so the contents of the patch itself are not entirely relevant. That may be part of why this particular patch set has not received much in the way of feedback in the first week after it was posted. Transparent decompression is an interesting idea for speeding application startup with a relatively easy kernel hack; hopefully the next iteration will contain a stronger justification for why it has to be a kernel hack in the first place.

Index entries for this article
Kernel	Filesystems/ext4

Transparent decompression for ext4

Posted Aug 1, 2013 4:20 UTC (Thu) by dlang (guest, #313) [Link] (2 responses)

It seems like this would be good for shared library files as well.

I've done a lot of tests of dealing with compressed data over the years (mostly dealing with log files), and unless you have a _very_ good I/O system, or are already running enough other processes to make you CPU bound, you can almost always find a compression algorithm that's a win for spending CPU time to save I/O time

Transparent decompression for ext4

Posted Aug 1, 2013 21:59 UTC (Thu) by neilbrown (subscriber, #359) [Link] (1 responses)

> It seems like this would be good for shared library files as well.

Except that shared libraries are memory-mapped which isn't supported yet...
And it strikes me that there would be an awful lot of complexity to go from "uncompress whole file at once" to "uncompress pages individually to support demand paging for memory mapping"

Given that this seems to be aimed at specific use cases, I think I would lean towards letting those use-cases deal with uncompression themselves.

zcat/zless/zgrep demonstrate to me that it isn't really that hard to add decompression to specific use cases.

I'm sure you could write a dlopen which did transparent decompression of libraries - cache them in ramfs and map them from there...

Transparent decompression for ext4

Posted Aug 1, 2013 22:06 UTC (Thu) by dlang (guest, #313) [Link]

when an application executes and the system loads the shared library, it doesn't just mmap misc pages, it first read enough about the library to know what is where so that it would know what pages need to be mmapped.

It seems to me that this process could be tweaked to understand this sort of transparent compression, and uncompress the entire library (or, in most cases, find where the library has already been uncompressed to support other applications)

Even if this means that the entire library must sit in ram while it's used, this won't make it worthless (it will mean that a huge library that's not used by many apps, and the apps only use a small portion of it may not be good candidates for compression)

Pragmatic solution

Posted Aug 1, 2013 7:13 UTC (Thu) by iq-0 (subscriber, #36655) [Link]

I think the basic idea is a very elegant and pragmatic solution to the compressed files in filesystem abstraction. Getting transparent compression right and efficient is pretty tricky (especially when you have to consider random writes).

But 90%+ of most of our systems are static content (controlled by the package manager) which for all practical purposes are read-only (even if root is allowed to change them, he isn't really supposed to).

One of the biggest assets would probably be that one can use expensive compression techniques as long as decompression is a win over the uncompressed case ('xz -9e'?) and could probably be shipped pre-compressed thus avoiding the compression on the system itself.

All in all I think that correctly implemented decompression feature is a major improvement that could possibly be implemented for the most part in the VFS (only needs a filesystem bit to signify the file is compressed).

Transparent decompression for ext4

Posted Aug 1, 2013 15:03 UTC (Thu) by gnacux (guest, #91402) [Link]

It would have better change to be merged if author can provide both compress and decompress together.

Transparent decompression for ext4

Posted Aug 1, 2013 21:33 UTC (Thu) by jwarnica (subscriber, #27492) [Link] (3 responses)

Does Android even use ext3? Aren't there file-systems just better for flash storage one expects on mobile devices?

Transparent decompression for ext4

Posted Aug 2, 2013 12:29 UTC (Fri) by gidoca (subscriber, #62438) [Link] (2 responses)

At least my Galaxy S4 does. AFAIR they abandoned some flash-specific FS (forgot which) in favour of EXT4 because its performance on multi-core systems is better.

Transparent decompression for ext4

Posted Aug 2, 2013 13:02 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

Samsung Galaxy SI/II had a bastardized version of FAT32 - they added Unix permission support and a couple of other features.

Ugh.

Transparent decompression for ext4

Posted Aug 3, 2013 18:17 UTC (Sat) by GhePeU (subscriber, #56133) [Link]

My Galaxy S2 does too (at least with Cyanogenmod). My old Acer Liquid MT, initially shipped with Froyo and later upgraded to Gingerbread, used YAFFS.

What's the downside to doing it in the linker?

Posted Aug 2, 2013 13:17 UTC (Fri) by paulj (subscriber, #341) [Link] (1 responses)

If this functionality already works well in user-space, what's the advantage of moving it to the kernel? Why is it better to move this to the kernel, rather than the system linker?

What's the downside to doing it in the linker?

Posted Aug 2, 2013 18:10 UTC (Fri) by giraffedata (guest, #1954) [Link]

Why is it better to move this to the kernel, rather than the system linker?

I'd say it fits more cleanly in the kernel. That means fewer people would be surprised to find it there and when it needs to be altered or expanded in the future, it will be more feasible.

As a practical matter, I don't see anything about transparent decompression that is unique to program code; if this is good for files the linker accesses, it must be good for files other programs access too.

On the other hand, it also looks like the whole thing has very limited applicability because of its need to read and cache the entire file, as well as the special procedure for creating the files and the nature of the I/O time vs CPU time tradeoff, so even putting it in the system linker might be too general a solution to the Android Firefox problem.

Alternatives

Posted Aug 4, 2013 11:58 UTC (Sun) by pjm (guest, #2080) [Link]

[Here was I hoping that "transparent decompression for ext4" meant that e2compr was being revived, which allowed writes and mmap.]

Alternatives to consider for similar use cases (e.g. not requiring writing) are squashfs or cramfs. These are more transparent because they don't require applications to read the whole file from a single read(2) call, and they allow mmapping of individual pages.

Note that these are read-only filesystems. To mix with writability, the obvious suggestion would be a unionfs, but consider simpler alternatives: e.g. using essentially a writable filesystem but using an /opt scheme to allow individual pieces to be compressed. Mountpoints allow a per-directory choice between compression or writability. Symlinks can provide a simple form of compressing an individual file, perhaps comparable in convenience in practice to the scheme discussed in the parent article.

Purely read-only filesystems might offer performance advantages over filesystems that have to allow for writes. It isn't a straightforward "never needs more reads" (e.g. files are more likely to span hardware block boundaries in a filesystem that aims for compression), but it's different enough from a writable filesystem to be worth testing for a given use case.

Transparent decompression for ext4

Posted Aug 5, 2013 8:59 UTC (Mon) by ssam (guest, #46587) [Link]

If compression is performance win on the platform, then would it not be sensible to compress the whole filesystem?

If compression is too much overhead for files that are written often, then maybe a filesystem with per file compression rules would be best.