Adding encryption to Btrfs

By Jonathan Corbet
September 21, 2016

One of the promises of the Btrfs filesystem is that its new design would facilitate the addition of modern features like compression and encryption. Compression has been there for a while, but Btrfs has yet to gain support for encryption; indeed, the ext4 filesystem got this feature first over a year ago with an implementation that is also used by the f2fs filesystem. Work to fill this gap is underway, as can be seen in this recently posted patch set from Anand Jain, but it would appear that encryption in Btrfs remains a distant goal.

It remains distant because it has become clear that this code will not be merged in anything like its current form. With luck, though, it should be the source of a lot of lessons that can be applied to later, hopefully more successful attempts. Sometimes, one simply has to stumble a few times when attacking a difficult problem space.

Crypto troubles

There is an aspect to cryptographic code development that has been learned the hard way many times over: this code needs to be written with help from people who understand cryptography well and know where the pitfalls are. Developers who set out without that domain knowledge are certain to make serious mistakes. So this is not a good way to introduce an encryption-related patch set:

Also would like to mention that a review from the security experts is due, which is important and I believe those review comments can be accommodated without major changes from here.

As Dave Chinner (among others) pointed out, it is far too late for a security review, which should really happen during the design phase. The ext4 encryption feature, he noted, did go through a design review phase ahead of the posting of any code, and quite a bit of useful feedback was the result.

In this case, it would appear that this kind of review would have been helpful. Eric Biggers, who is working on the ext4 encryption feature, looked at the code and came back with a harsh judgment:

You will also not get a proper review without a proper design document which details things like the threat model and the security properties provided. But I did take a short look at the code anyway because I was interested. The results were not pretty. As far as I can see the current proposal is fatally flawed as it does not provide confidentiality of file contents against a basic attack.

Alex Elsayed also pointed out some of the cryptographic problems in the code. It comes down to a poor choice of encryption modes that leaves a filesystem open to well-understood known-plaintext attacks. The reviewers said that a mode like XTS, which lacks this particular vulnerability, should have been used instead. Or, even better, an authenticated encryption (AE) approach should be used; AE modes are believed to be far more resistant to most known attacks. AE brings its own challenges, though; the (mostly obsolete) ecryptfs filesystem uses it, but the current ext4/f2fs implementation does not. A related issue, as Ted Ts'o pointed out, is the increasing importance of taking advantage of hardware-based encryption for performance; that will tend to rule out "exotic encryption modes" in favor of something boring (but hardware-supported).
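The article does not name the mode the patch set used, so what follows is only a sketch of the general class of problem the reviewers describe: when a stream-style mode reuses the same key and nonce for two pieces of data, an attacker who knows one plaintext can recover the other. The keystream generator (`toy_keystream`, built from SHA-256) and every other name here are hypothetical stand-ins chosen so that the example needs only the Python standard library; this is not a real cipher and not code from the patch.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Stream-style encryption: plaintext XOR keystream(key, nonce)."""
    return xor_bytes(plaintext, toy_keystream(key, nonce, len(plaintext)))


key = b"k" * 32      # secret, never seen by the attacker
nonce = b"n" * 16    # reused for both messages: this is the mistake

known = b"#!/bin/sh\n# a well-known file header that anyone can guess"
secret = b"password=correct horse battery"

c_known = encrypt(key, nonce, known)
c_secret = encrypt(key, nonce, secret)

# The attacker holds both ciphertexts and the known plaintext, but not the key.
# XORing the two ciphertexts cancels the shared keystream.
recovered = xor_bytes(xor_bytes(c_known, c_secret), known)
print(recovered)

# With a fresh nonce for the second message, the same trick yields only noise.
c_fresh = encrypt(key, b"m" * 16, secret)
noise = xor_bytes(xor_bytes(c_known, c_fresh), known)
print(noise == secret)
```

The attack needs no knowledge of the key at all, which is why the choice of mode and the handling of nonces matter as much as the strength of the underlying cipher.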

Crypto at the wrong level?

Another criticism of the patch set is that it implements a Btrfs-specific encryption infrastructure, rather than using the generic infrastructure added at the virtual filesystem (VFS) layer and used by ext4 and f2fs. One motivation for that approach is that Btrfs encryption is managed at the subvolume level, meaning that a single master key is used for the entire subvolume. Ext4 and f2fs, instead, lack the subvolume concept; they provide file-level encryption that allows different users to have different keys within the same filesystem. Another result is that Btrfs does not benefit from the work that has been done on the VFS infrastructure; as Chinner put it:

The generic file encryption code is solid, reviewed, tested and already widely deployed via two separate filesystems. There is a much wider pool of developers who will maintain it, review changes and know all the traps that a new implementation might fall into. There's a much bigger safety net here, which significantly lowers the risk of zero-day fatal flaws in a new implementation and of flaws in future modifications and enhancements.

He compared Btrfs-specific encryption to the Btrfs RAID5/6 implementation, which has had known problems for years and appears to be essentially unmaintained. "Encryption simply cannot be treated like this - it has to be right, and it has to be well maintained." Some Btrfs developers bristled at the description of the filesystem's RAID implementation, but there was general agreement that the VFS code should be used to the greatest extent possible — and improved in places where it cannot yet be used.

Btrfs does provide some unique challenges that will stress the capabilities of the existing VFS code. That code, for example, manages encryption keys as an inode attribute; that is how file-level encryption is supported. Btrfs throws a spanner into the works in a couple of ways:

If Btrfs snapshots are present, an inode is likely to be present in more than one of them. Without a great deal of care, these snapshots could be used to force a reuse of the encryption keys and "nonce" values used with a specific file; many AE algorithms will fail catastrophically if that happens.
In general, Btrfs does a lot of sharing of file blocks at the extent level. That is how the copy-on-write mechanism works in general, and features like deduplication will cause even more sharing to happen. Once again, this sharing could be used to expose encrypted traffic, or to simply tell when one party has modified a file that shares extents with another.

A solution to some of these problems would be to simply copy extents and do without the sharing when encryption is involved. But another solution falls out of the requirements: encryption in Btrfs probably needs to be managed at the extent level, rather than at the file level. That would reduce the potential for nonce-reuse attacks and would eliminate problems that would otherwise result if one file sharing an extent is modified in a way that changes the extent's offset within the file.
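To make the file-level versus extent-level distinction concrete, here is a minimal sketch. It assumes, purely for illustration, that each encrypted block gets a per-block tweak derived with HMAC-SHA256; the function names, the derivation scheme, and the inode and extent numbers are all hypothetical and do not describe Btrfs or the VFS encryption code.

```python
import hashlib
import hmac

MASTER_KEY = b"\x01" * 32  # placeholder key for the illustration


def file_level_tweak(inode: int, file_offset: int) -> bytes:
    """Hypothetical tweak bound to (inode, logical offset within the file)."""
    msg = b"file" + inode.to_bytes(8, "big") + file_offset.to_bytes(8, "big")
    return hmac.new(MASTER_KEY, msg, hashlib.sha256).digest()[:16]


def extent_level_tweak(extent_id: int, offset_in_extent: int) -> bytes:
    """Hypothetical tweak bound to (extent, offset within the extent)."""
    msg = b"extent" + extent_id.to_bytes(8, "big") + offset_in_extent.to_bytes(8, "big")
    return hmac.new(MASTER_KEY, msg, hashlib.sha256).digest()[:16]


# One on-disk extent referenced by two files at different logical offsets,
# as could happen after a reflink copy or deduplication.
EXTENT_ID = 4096
references = [
    {"inode": 257, "file_offset": 0},
    {"inode": 300, "file_offset": 1 << 20},
]

file_tweaks = {file_level_tweak(r["inode"], r["file_offset"]) for r in references}
extent_tweaks = {extent_level_tweak(EXTENT_ID, 0) for _ in references}

print("distinct file-level tweaks:  ", len(file_tweaks))
print("distinct extent-level tweaks:", len(extent_tweaks))
```

Under these assumptions, file-level derivation gives each reference its own tweak, so a single stored ciphertext cannot decrypt correctly for both files and the extent would have to be copied. Extent-level derivation gives the same tweak no matter which file, or which offset in that file, refers to the extent.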

As Btrfs developer Zygo Blaxell put it, the Btrfs extent-use model already creates challenges for the VFS layer:

Currently any extent in the filesystem can be shared by any inode in the filesystem (assuming the two inodes have compatible attributes, which could include encryption policy), including multiple references from the same inode to the same extent at different logical offsets. This is the basis of the deduplication and copy_file_range features.

This confuses the VFS caching layer when dealing with deduped, reflinked, or snapshotted files. It's not surprising that VFS crypto has problems coping with it as well.

At the moment, encryption at the VFS level doesn't have any real concept of extents at all; extents are generally something that only specific filesystems know about. So the VFS file-encryption code is not suitable for solving the Btrfs encryption problem in its current form. As many have pointed out, though, the solution is not to start over, but to enhance the VFS code to get it to the point where it can do the job.

About the only definite conclusion that came from the discussion was that there is still a lot of work to do before the Btrfs encryption problem is even well understood, much less properly implemented. If nothing else, the patches posted so far have served as a focus point for a discussion that needs to happen and, hopefully, a starting point for the next try, sometime in the future. Once again, we see that cryptography is hard, and the intersection with a next-generation filesystem makes it even harder.

Posted Sep 22, 2016 5:18 UTC (Thu) by koverstreet (✭ supporter ✭, #4296)

No mention of bcachefs encryption? https://bcache.evilpiepirate.org/Encryption/

A COW filesystem is an opportunity to do encryption significantly better than existing disk level or filesystem level encryption - update in place is the main obstacle to things like randomized encryption and nonces. Once you're doing data checksumming by storing the checksums with the pointers, not the data, you've got most of what you need for AEAD style encryption - which really is the modern gold standard. That's what bcachefs is doing, and I don't see why btrfs couldn't do something similar.

Also, as I commented on the btrfs mailing list, encryption in hardware is not necessarily faster - ChaCha20 in software is generally faster than AES in software: http://www.spinics.net/lists/linux-btrfs/msg59034.html

Posted Sep 22, 2016 8:16 UTC (Thu) by micka (subscriber, #38720)

I suppose you meant "faster than AES in hardware". At least, from your link:

> on Haswell, ChaCha20 (in software) is over 2x as fast as AES (in hardware), at realistic (for a filesystem) block sizes

Posted Sep 22, 2016 14:56 UTC (Thu) by koverstreet (✭ supporter ✭, #4296)

Yeah, mistyped that.

Posted Sep 23, 2016 13:41 UTC (Fri) by epa (subscriber, #39769)

I thought that hardware AES was really for the benefit of weaker, embedded processors which can't do software encryption as fast.

Posted Sep 23, 2016 16:15 UTC (Fri) by magila (guest, #49627)

Even on larger CPUs hardware AES is more power efficient. The SIMD units are by far the most power hungry logic units in modern Intel CPUs.

Posted Sep 27, 2016 14:36 UTC (Tue) by jtaylor (subscriber, #91739)

I'm curious, do you have a source for that claim?

Posted Sep 29, 2016 4:31 UTC (Thu) by magila (guest, #49627)

I'm not aware of any published comparisons. I've done some informal testing on my own machine, a quad core Skylake running at 4.5GHz, with 8K blocks and found:

ChaCha20 achieves 16.4 GB/s while consuming 102W or 5.93 nanojoules/byte
AES-128-CTR achieves 22.4 GB/s while consuming 87W or 3.70 nanojoules/byte
AES-256-CTR achieves 16.6 GB/s while consuming 82W or 4.71 nanojoules/byte

You might argue comparing AES-128 to ChaCha20 is unfair, but the fact is those are by far the most widely used variants of each.

ChaCha20 was tested using the benchmark tool from https://github.com/floodyberry/chacha-opt modified to run ChaCha20-avx2 with 8K blocks in a loop
AES was tested using the example code from https://wiki.openssl.org/index.php/EVP_Symmetric_Encrypti... modified to encrypt 8K blocks in a loop.
All tests were done with 4 instances running in parallel.
Power consumption was measured using CPUID HWMonitor.
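For anyone who wants to redo the arithmetic, the following sketch derives energy per byte from the throughput and power figures quoted above. It assumes that "GB" means 10^9 bytes and that the whole measured power draw is attributed to the cipher; neither assumption is stated in the comment, and the function name is made up for this example. Watts divided by gigabytes per second comes out in nanojoules per byte.

```python
def nanojoules_per_byte(throughput_gb_per_s: float, power_watts: float) -> float:
    """Energy per byte in nJ, assuming 1 GB = 10**9 bytes."""
    bytes_per_second = throughput_gb_per_s * 1e9
    joules_per_byte = power_watts / bytes_per_second
    return joules_per_byte * 1e9


# (throughput in GB/s, power in W), copied from the figures in the comment
measurements = {
    "ChaCha20": (16.4, 102.0),
    "AES-128-CTR": (22.4, 87.0),
    "AES-256-CTR": (16.6, 82.0),
}

for name, (gb_per_s, watts) in measurements.items():
    print(f"{name}: {nanojoules_per_byte(gb_per_s, watts):.2f} nJ/byte")
```

Under the decimal-gigabyte assumption the results come out roughly 5% higher than the per-byte figures posted, which may simply reflect a different byte-unit convention in the original measurement; the ranking of the three ciphers is unchanged.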

Posted Sep 22, 2016 14:08 UTC (Thu) by ballombe (subscriber, #9523)

> that will tend to rule out "exotic encryption modes" in favor of something boring (but hardware-supported) like AES.

AES is not an "encryption mode"...

Posted Sep 22, 2016 15:10 UTC (Thu) by dkg (subscriber, #55359)

exactly. AES is a block cipher, and XTS is a cipher mode. XTS is one way to use AES, and doesn't rule out hardware-accelerated AES at all, afaik. Wikipedia has a good page describing cipher modes.

Authenticated encryption modes would also be great, as they're tamper-evident -- any modification or damage to the ciphertext results in unreadable data (aka "⊥"), rather than returning nonsense cleartext.
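As a rough illustration of that tamper-evident behavior, here is a minimal encrypt-then-MAC sketch using only the Python standard library. The keystream is a toy built from SHA-256 and stands in for a real cipher; `seal` and `open_sealed` are hypothetical names, and a real system would use a vetted AEAD construction instead. The point is only that a modified ciphertext is rejected (returned as `None`, playing the role of "⊥") rather than decrypted into nonsense.

```python
import hashlib
import hmac
import os
from typing import Optional

NONCE_LEN = 16
TAG_LEN = 32


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 (illustration only, not a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate nonce and ciphertext with HMAC-SHA256."""
    nonce = os.urandom(NONCE_LEN)
    ciphertext = _xor(plaintext, _keystream(enc_key, nonce, len(plaintext)))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(enc_key: bytes, mac_key: bytes, blob: bytes) -> Optional[bytes]:
    """Return the plaintext, or None if the blob fails authentication."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        return None
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # tampered or damaged: refuse to return any cleartext
    return _xor(ciphertext, _keystream(enc_key, nonce, len(ciphertext)))


enc_key = b"e" * 32
mac_key = b"m" * 32
blob = seal(enc_key, mac_key, b"file contents")

# Flip one bit in the ciphertext portion to simulate tampering or corruption.
damaged = bytearray(blob)
damaged[NONCE_LEN] ^= 0x01

print(open_sealed(enc_key, mac_key, blob))
print(open_sealed(enc_key, mac_key, bytes(damaged)))
```

The cost is the extra tag stored alongside every encrypted unit, which is one reason authenticated modes fit more naturally in a filesystem that already keeps per-block or per-extent checksums.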

Posted Sep 22, 2016 15:27 UTC (Thu) by koverstreet (✭ supporter ✭, #4296)

highly relevant (and excellent) article explaining XTS:

https://sockpuppet.org/blog/2014/04/30/you-dont-want-xts/

Posted Sep 22, 2016 15:38 UTC (Thu) by rahvin (guest, #16953)

Does anyone know why btrfs development essentially stagnated? Is it because Oracle as the primary developer early on redirected resources after buying Sun and gaining access to zfs? I ask this because for a few years it looked like btrfs was making fantastic progress, but I haven't seen major announcements or visible improvements for a while.

Posted Sep 22, 2016 17:01 UTC (Thu) by flussence (guest, #85566)

With Btrfs being funded by the likes of Facebook I imagine there's less of a pressing need to make RAID-5/6 work. They can afford to do RAID-over-HTTP...

Posted Sep 22, 2016 17:58 UTC (Thu) by masoncl (subscriber, #47138)

I think we're definitely not doing a great job of talking about our progress, but overall development of Btrfs hasn't slowed down at all. Stability is dramatically better and it's used in production here at FB.

Posted Sep 26, 2016 7:35 UTC (Mon) by cwillu (guest, #67268)

I'd hesitate to consider the lack of progress on the encryption front to be evidence of stagnation; I certainly spent some time harping at cmason and company that encryption was not something that should be attempted without encryption experts getting involved. My harps were mostly of the "obvious approach A will cause non-obvious failure modes 1, 2, 3", and I don't feel they needed much convincing at the time (or maybe cmason and josef will say anything to shut me up :p).