Adding encryption to Btrfs
It remains distant because it has become clear that this code will not be merged in anything like its current form. With luck, though, it should be the source of a lot of lessons that can be applied to later, hopefully more successful attempts. Sometimes, one simply has to stumble a few times when attacking a difficult problem space.
Crypto troubles
There is an aspect to cryptographic code development that has been learned the hard way many times over: this code needs to be written with help from people who understand cryptography well and know where the pitfalls are. Developers who set out without that domain knowledge are certain to make serious mistakes. So this is not a good way to introduce an encryption-related patch set:
As Dave Chinner (among others) pointed out, it is far too late for a security review, which should really happen during the design phase. The ext4 encryption feature, he noted, did go through a design review phase ahead of the posting of any code, and quite a bit of useful feedback was the result.
In this case, it would appear that this kind of review would have been helpful. Eric Biggers, who is working on the ext4 encryption feature, looked at the code and came back with a harsh judgment:
Alex Elsayed also pointed out some of the cryptographic problems in the code. It comes down to a poor choice of encryption modes that leaves a filesystem open to well-understood known-plaintext attacks. The reviewers said that a mode like XTS, which lacks this particular vulnerability, should have been used instead. Or, even better, an authenticated encryption (AE) approach should be used; AE modes are believed to be far more resistant to most known attacks. AE brings its own challenges, though; the (mostly obsolete) ecryptfs filesystem uses it, but the current ext4/f2fs implementation does not. A related issue, as Ted Ts'o pointed out, is the increasing importance of taking advantage of hardware-based encryption for performance; that will tend to rule out "exotic encryption modes" in favor of something boring (but hardware-supported) like AES.
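To make the known-plaintext concern concrete, here is a minimal sketch of what goes wrong when a counter-style stream mode ends up encrypting two different blocks with the same key and nonce. The article does not say exactly which mode the patch set used, so that scenario is an assumption; the keystream below is a toy built from SHA-256 purely for illustration (it is not AES-CTR and not the patch set's code), and all names are hypothetical.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream: SHA-256(key || nonce || counter).

    Illustration only; a stand-in for a real cipher in a counter-style mode.
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt (or decrypt) by XORing the data with the keystream."""
    return xor_bytes(plaintext, toy_keystream(key, nonce, len(plaintext)))


key = b"k" * 32
nonce = b"n" * 16  # the mistake: the same nonce is used for both blocks

known = b"#!/bin/sh\n# well-known file header"   # attacker knows this plaintext
secret = b"password=hunter2; do not disclose!"   # attacker wants this one

c_known = toy_encrypt(key, nonce, known)
c_secret = toy_encrypt(key, nonce, secret)

# The keystream cancels out: c_known XOR c_secret == known XOR secret,
# so knowing one plaintext reveals the other without ever touching the key.
n = min(len(known), len(secret))
recovered = xor_bytes(xor_bytes(c_known, c_secret), known)

# With a fresh nonce for the second block, the same trick yields garbage.
c_fresh = toy_encrypt(key, b"m" * 16, secret)
not_recovered = xor_bytes(xor_bytes(c_known, c_fresh), known)
```

Here `recovered` matches the first `n` bytes of `secret`, while `not_recovered` does not. The key is never touched in the attack; the only thing that went wrong was the reuse.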
Crypto at the wrong level?
Another criticism of the patch set is that it implements a Btrfs-specific encryption infrastructure, rather than using the generic infrastructure added at the virtual filesystem (VFS) layer and used by ext4 and f2fs. One result of that approach is that Btrfs encryption is managed at the subvolume level, meaning that a single master key is used for the entire subvolume. Ext4 and f2fs, instead, lack the subvolume concept; they provide file-level encryption that allows different users to have different keys within the same filesystem. Another result is that Btrfs does not benefit from the work that has been done on the VFS infrastructure; as Chinner put it:
He compared Btrfs-specific encryption to the Btrfs RAID5/6 implementation, which has had known problems for years and appears to be essentially unmaintained. "Encryption simply cannot be treated like this - it has to be right, and it has to be well maintained." Some Btrfs developers bristled at the description of the filesystem's RAID implementation, but there was general agreement that the VFS code should be used to the greatest extent possible, and improved in places where it cannot yet be used.
Btrfs does provide some unique challenges that will stress the capabilities of the existing VFS code. That code, for example, manages encryption keys as an inode attribute; that is how file-level encryption is supported. Btrfs throws a spanner into the works in a couple of ways:
- If Btrfs snapshots are present, an inode is likely to be present in more than one of them. Without a great deal of care, these snapshots could be used to force a reuse of the encryption keys and "nonce" values used with a specific file; many AE algorithms will fail catastrophically if that happens.
- In general, Btrfs does a lot of sharing of file blocks at the extent level. That is how the copy-on-write mechanism works in general, and features like deduplication will cause even more sharing to happen. Once again, this sharing could be used to expose encrypted traffic, or to simply tell when one party has modified a file that shares extents with another.
A solution to some of these problems would be to simply copy extents and do without the sharing when encryption is involved. But another solution falls out of the requirements: encryption in Btrfs probably needs to be managed at the extent level, rather than at the file level. That would reduce the potential for nonce-reuse attacks and would eliminate problems that would otherwise result if one file sharing an extent is modified in a way that changes the extent's offset within the file.
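The difference between file-level and extent-level management can be sketched in a few lines. This is an illustration under stated assumptions, not anything taken from the patch set or from the VFS code: the HMAC-based derivation, the labels, and the inode and extent numbers are all hypothetical. It shows only why a per-block key or tweak derived from (inode, file offset) cannot be shared between two files referencing the same extent, while one derived from (extent, offset within extent) can.

```python
import hashlib
import hmac


def derive(master_key: bytes, label: bytes, *fields: int) -> bytes:
    """Derive a per-block value from a master key and some integer fields.

    Hypothetical scheme for illustration: HMAC-SHA256 over a label and
    the fields encoded as 8-byte big-endian integers.
    """
    msg = label + b"".join(f.to_bytes(8, "big") for f in fields)
    return hmac.new(master_key, msg, hashlib.sha256).digest()


def file_level_tweak(master_key: bytes, inode: int, file_offset: int) -> bytes:
    """File-level model: the value depends on the inode and the file offset."""
    return derive(master_key, b"file", inode, file_offset)


def extent_level_tweak(master_key: bytes, extent_id: int, offset_in_extent: int) -> bytes:
    """Extent-level model: the value depends only on the extent itself."""
    return derive(master_key, b"extent", extent_id, offset_in_extent)


master = b"\x01" * 32

# One shared extent (hypothetical id 7) referenced by two files, for example
# after a reflink copy, and sitting at a different offset in each file.
ref_a = {"inode": 256, "file_offset": 0, "extent_id": 7}
ref_b = {"inode": 300, "file_offset": 8192, "extent_id": 7}

file_a = file_level_tweak(master, ref_a["inode"], ref_a["file_offset"])
file_b = file_level_tweak(master, ref_b["inode"], ref_b["file_offset"])

ext_a = extent_level_tweak(master, ref_a["extent_id"], 0)
ext_b = extent_level_tweak(master, ref_b["extent_id"], 0)

# File-level values differ, so a single on-disk copy of the block cannot be
# decrypted through both references; extent-level values are identical.
shared_under_file_model = file_a == file_b
shared_under_extent_model = ext_a == ext_b
```

Under the file-level model the two references disagree, so the block would have to be copied and re-encrypted; under the extent-level model both references see the same value, and moving the extent within a file changes nothing.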
As Btrfs developer Zygo Blaxell put it, the Btrfs extent-use model already creates challenges for the VFS layer:
This confuses the VFS caching layer when dealing with deduped, reflinked, or snapshotted files. It's not surprising that VFS crypto has problems coping with it as well.
At the moment, encryption at the VFS level doesn't have any real concept of extents at all; extents are generally something that only specific filesystems know about. So the VFS file-encryption code is not suitable for solving the Btrfs encryption problem in its current form. As many have pointed out, though, the solution is not to start over, but to enhance the VFS code to get it to the point where it can do the job.
About the only definite conclusion that came from the discussion was that there is still a lot of work to do before the Btrfs encryption problem is even well understood, much less properly implemented. If nothing else, the patches posted so far have served as a focal point for a discussion that needs to happen and, hopefully, a starting point for the next try, sometime in the future. Once again, we see that cryptography is hard, and the intersection with a next-generation filesystem makes it even harder.
| Index entries for this article | |
|---|---|
| Kernel | Btrfs | 
| Kernel | Filesystems/Btrfs | 
| Security | Encryption/Filesystems | 
Posted Sep 22, 2016 5:18 UTC (Thu) by koverstreet (✭ supporter ✭, #4296)
A COW filesystem is an opportunity to do encryption significantly better than existing disk level or filesystem level encryption - update in place is the main obstacle to things like randomized encryption and nonces. Once you're doing data checksumming by storing the checksums with the pointers, not the data, you've got most of what you need for AEAD style encryption - which really is the modern gold standard. That's what bcachefs is doing, and I don't see why btrfs couldn't do something similar. 
Also, as I commented on the btrfs mailing list, encryption in hardware is not necessarily faster - ChaCha20 in software is generally faster than AES in hardware: http://www.spinics.net/lists/linux-btrfs/msg59034.html
     
    
Posted Sep 22, 2016 8:16 UTC (Thu) by micka (subscriber, #38720)
> on Haswell, ChaCha20 (in software) is over 2x as fast as AES (in hardware), at realistic (for a filesystem) block sizes 
     
    
Posted Sep 29, 2016 4:31 UTC (Thu) by magila (guest, #49627)
ChaCha20 achieves 16.4GB/s while consuming 102W or 5.93 microjoules/byte 
You might argue comparing AES-128 to ChaCha20 is unfair, but the fact is those are by far the most widely used variants of each. 
ChaCha20 was tested using the benchmark tool from https://github.com/floodyberry/chacha-opt modified to run ChaCha20-avx2 with 8K blocks in a loop 
     
Posted Sep 22, 2016 14:08 UTC (Thu) by ballombe (subscriber, #9523)
AES is not an "encryption mode"... 
     
    
Posted Sep 22, 2016 15:10 UTC (Thu) by dkg (subscriber, #55359)
Authenticated encryption modes would also be great, as they're tamper-evident -- any modification or damage to the ciphertext results in unreadable data (aka "⊥"), rather than returning nonsense cleartext.
      
           
     
    
AES-128-CTR achieves 22.4 GB/s while consuming 87W or 3.70 microjoules/byte
AES-256-CTR achieves 16.6 GB/s while consuming 82W or 4.71 microjoules/byte
AES was tested using the example code from https://wiki.openssl.org/index.php/EVP_Symmetric_Encrypti... modified to encrypt 8K blocks in a loop.
All tests were done with 4 instances running in parallel.
Power consumption was measured using CPUID HWMonitor.
exactly. AES is a block cipher, and XTS is a cipher mode. XTS is one way to use AES, and doesn't rule out hardware-accelerated AES at all, afaik. Wikipedia has a good page of description about cipher modes.