Inline encryption support for block devices
In a combined storage and filesystem session at LSFMM 2017, Ted Ts'o led a discussion of support for the inline cryptographic engines (ICEs) that are being used in mobile phones. Android device makers have accumulated a number of hacks over the last few years to use these engines to encrypt filesystem data, but Ts'o would like to create something that can go into the mainline kernel. He was looking for thoughts on how to make that happen.
Doing AES encryption on the ARM cores used in mobile phones is fairly power-hungry, so manufacturers are increasingly turning to ICE devices to encrypt the data on the device. These ICE devices sit between the CPU and the flash storage; to use one, the CPU must provide a key ID, so there is a need to tell the engine which key to use for each I/O request. In the future, Ts'o said, the keys themselves may come from a secure element, such that the CPU and kernel will not have access to them at all.
Qualcomm has been trying to get support for ICE devices upstream for some time, but the code is "rather unspeakable". It blindly assumes an ext4 filesystem and roots through private pointers to access inode structures in order to associate key IDs with I/O requests. The Qualcomm code is not what is in the Pixel phones, he was quick to note; Qualcomm started with the Pixel code and "did horrible things to it".
His goal is to find upstream-acceptable changes to support ICE. A "nice to have" would be a way to remove the hacks in the ext4 and f2fs filesystems as well, and to add a filesystem and block-encryption mechanism that does not require a device-mapper layer. For the desktop and server case, having a device-mapper layer makes it easier for users, he said, but with hardware crypto there is no reason to have one.
Ts'o proposed adding a 32-bit key ID field to struct bio, which is what Universal Flash Storage (UFS) has. Key IDs are integer values that refer to keys that have been stored into "slots" in the ICE device. He believes that most ICE devices will have far fewer slots than 32 bits would allow, though.
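As a rough illustration, such a field might look like the following. This is a hypothetical sketch; the field name and helper function are illustrative, not taken from any posted patch:

```c
#include <linux/bio.h>

/* Hypothetical sketch: a key ID carried in struct bio so the driver
 * can tell the ICE hardware which key slot applies to this I/O.
 * bi_crypt_key_id is an illustrative name, not a real kernel field. */
struct bio {
	/* ... existing struct bio fields ... */
	u32	bi_crypt_key_id;	/* key slot in the ICE device */
	/* ... */
};

/* A filesystem would tag each bio with the file's key ID before
 * handing it to the block layer: */
static void submit_encrypted_bio(struct bio *bio, u32 key_id)
{
	bio->bi_crypt_key_id = key_id;
	submit_bio(bio);
}
```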
James Bottomley suggested using the Data Integrity Field and Data Integrity Extensions (DIF/DIX) support for the key IDs. Martin Petersen said there is a union that holds DIF/DIX or copy offload information; another field could be added for the key ID. Ts'o said he would look into that.
There will also be a need for a key slot manager of some kind. Since there will be a limited number of key slots for an ICE device, there can only be that many BIOs with different key IDs in flight at any given time. So the device will need to request a key slot, which might block if there are none available. The slots will need to be reference counted; they would be incremented when a BIO with an ID is submitted and decremented when it completes.
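A keyslot manager along those lines might look roughly like this. Everything here (names, structure layout, the slot count) is a hypothetical sketch of the scheme described above, not merged kernel code:

```c
#include <linux/spinlock.h>
#include <linux/wait.h>
#include <linux/types.h>

/* Hypothetical keyslot manager: a small fixed pool of hardware key
 * slots, each refcounted by the bios that reference it.  Callers
 * block when every slot is pinned by in-flight I/O. */
struct ice_keyslot {
	u8		key[32];
	unsigned int	refcount;	/* in-flight bios using this slot */
};

struct ice_keyslot_manager {
	struct ice_keyslot	slots[8];	/* assume 8 hardware slots */
	spinlock_t		lock;
	wait_queue_head_t	wait;
};

/* Assumed helpers: reuse a slot already programmed with @key, or load
 * it into an idle slot; return -1 if every slot is busy. */
static int ice_find_or_program_slot(struct ice_keyslot_manager *ksm,
				    const u8 *key);
static bool ice_slot_available(struct ice_keyslot_manager *ksm);

/* Return a slot holding @key, sleeping until one is free. */
static int ice_request_keyslot(struct ice_keyslot_manager *ksm,
			       const u8 *key)
{
	int slot;

	spin_lock(&ksm->lock);
	while ((slot = ice_find_or_program_slot(ksm, key)) < 0) {
		/* All slots pinned: wait for a completion to free one. */
		spin_unlock(&ksm->lock);
		wait_event(ksm->wait, ice_slot_available(ksm));
		spin_lock(&ksm->lock);
	}
	ksm->slots[slot].refcount++;	/* bio submitted with this slot */
	spin_unlock(&ksm->lock);
	return slot;
}

/* Called from the bio completion path. */
static void ice_release_keyslot(struct ice_keyslot_manager *ksm, int slot)
{
	spin_lock(&ksm->lock);
	if (--ksm->slots[slot].refcount == 0)
		wake_up(&ksm->wait);	/* slot can now be reprogrammed */
	spin_unlock(&ksm->lock);
}
```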
All of the key slot management would be hidden from the filesystem. The drivers will manage the slots, but the filesystem will need to identify the key that goes with a particular request. It is important that two BIOs with different keys do not get merged. David Howells asked about superblock encryption and whether mount() needs to know about keys, but Ts'o said that the metadata for the ext4 and f2fs filesystems is not encrypted on Android devices. There is some rough prototype code that Michael Halcrow has been working on that should come out soon, Ts'o said.
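The merge restriction mentioned above could be enforced with a simple check in the block layer's merge path; again, a hypothetical sketch assuming the bi_crypt_key_id field from earlier:

```c
/* Hypothetical: refuse to merge bios encrypted with different keys,
 * since the hardware applies exactly one key per request. */
static bool bio_crypt_mergeable(const struct bio *a, const struct bio *b)
{
	return a->bi_crypt_key_id == b->bi_crypt_key_id;
}
```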
In something of a side note, he also mentioned that, right now, filesystem encryption on desktops or servers uses a per-process or per-session key ring. Users can set and remove their own keys from those rings, but that doesn't work for hardware devices because there is no concept of a key owner. Once a key gets into an ICE device, there are no further checks and anyone can use the key; it is the host operating system that allows or prevents access to files using Unix permissions.
It would be useful to have a kind of global key ring for software crypto that could be used like an ICE device, he said. Keys would be added or removed only by root, but once they are added, those keys can be used by anyone on the system. Someone in the audience asked about containers, where there may be multiple "root" users due to user namespaces. Ts'o said he hadn't thought about it. Someone suggested tying the key ring to the user namespace where it was created, so that a global key ring created by root in a container would only be accessible to other users in that container/namespace.
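One way to picture that namespace-scoped variant: a global key ring hung off the user namespace, so a container's root sees only its own keys. This is a hypothetical sketch; struct user_namespace has no such hw_keyring field:

```c
#include <linux/user_namespace.h>
#include <linux/key.h>

/* Hypothetical: each user namespace owns one "hardware" key ring.
 * Keys added by a container's root land in that container's ring
 * and are invisible outside the namespace. */
static struct key *lookup_hw_keyring(void)
{
	struct user_namespace *ns = current_user_ns();

	return ns->hw_keyring;	/* assumed new field; does not exist today */
}
```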
| Index entries for this article | |
|---|---|
| Kernel | Block layer |
| Kernel | Security |
| Security | Encryption |
| Conference | Storage, Filesystem, and Memory-Management Summit/2017 |
Posted Mar 23, 2017 1:34 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
wait... how do other systems such as APFS and ZFS do acceleration?
Posted Mar 24, 2017 3:05 UTC (Fri)
by johnjones (guest, #5462)
[Link] (5 responses)
Also, don't OCF (the OpenBSD Crypto Framework) and other accelerators already do something like this?
thanks
John
Posted Mar 26, 2017 2:00 UTC (Sun)
by tytso (subscriber, #9993)
[Link] (4 responses)
Historically, most accelerators work by sitting on the host CPU's bus and talking to main memory. An example of this is an SHA256 accelerator that sits on the PCIe bus and checksums data in memory. See the slides from a presentation[1] at the 2014 OpenZFS summit.
[1] http://open-zfs.org/w/images/6/63/Lightning_Talk-Zacodi_L...
What some ARM SOC (system-on-chip) vendors have done is to put an encryption engine in between the host CPU and the storage device. This isn't actually new; IBM mainframes can do something similar. Interestingly, one thing that ARM CPUs on handsets and mainframe CPUs have in common is that they tend to be relatively underpowered compared to the rest of the system. So while having a storage-specific accelerator between the CPU and the storage device is less flexible, it reduces the overhead on the CPU and memory. (For example, an encryption engine that can also be used for IPsec would read from memory and then write the ciphertext back to memory --- but then the data would have to be sent from memory to the storage or networking device. Compare this architecture with one where you have a crypto engine just for storage, and a different crypto engine just for networking that lives on the NIC.)
I predict that in the future we'll see this architecture on server platforms. Since we can no longer double the CPU frequency every 18 months, it makes more sense to speed up the system by pushing more transistors away from the CPU into more specialized hardware accelerators. And if that means specialized crypto engines for storage and networking --- that's just fine.
Posted Mar 27, 2017 8:15 UTC (Mon)
by ortalo (guest, #4654)
[Link] (2 responses)
However, it seems to me we are seeing the problem of sharing these devices between multiple users appear now; especially in the use cases where different users expect to use different keys.
Admittedly, it is already very nice to have host encryption, if only to have host-level protection against "outsiders".
However, I wonder if simple additions to this hardware would not allow association of different sets of keys with different user contexts (similarly to the way page tables are associated with specific address spaces in the classical MMU case)?
Another question: is all this new hardware targeting encryption only? Nothing with respect to signing (or hashing)?
Posted Mar 28, 2017 21:04 UTC (Tue)
by tytso (subscriber, #9993)
[Link] (1 responses)
Right now, all of the inline crypto acceleration hardware that I am familiar with targets encryption only, unfortunately.
The challenge with doing authenticated encryption (e.g., AES-GCM) is that you need to store the per-block authentication tag somewhere. Doing that would mean we would need flash chips with page sizes of 4k plus 32 bytes for the IV and AES-GCM authentication tag, but there aren't any eMMC flash devices with 4128-byte pages. And a non-standard, custom eMMC storage device would be extremely pricey, so it would probably not be commercially viable. :-(
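For the record, the arithmetic behind that 4128-byte figure, assuming a 16-byte IV and a 16-byte GCM tag as one plausible split of the 32 bytes of per-block metadata:

```c
/* Per-block metadata for authenticated encryption, as described:
 * 4096-byte data block + 16-byte IV + 16-byte AES-GCM tag = 4128
 * bytes per flash page -- a size no standard eMMC device offers. */
#define DATA_BLOCK_BYTES	4096
#define GCM_IV_BYTES		16	/* assumed split of the 32 bytes */
#define GCM_TAG_BYTES		16	/* AES-GCM tags are 16 bytes */
#define PAGE_BYTES_NEEDED	(DATA_BLOCK_BYTES + GCM_IV_BYTES + GCM_TAG_BYTES)
```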
Posted Mar 28, 2017 22:56 UTC (Tue)
by tytso (subscriber, #9993)
[Link]
But a TPM is really not a crypto *accelerator*; in fact, pretty much all TPMs are incredibly S-L-O-W at doing crypto operations. So why does a TPM exist? Because it will only encrypt, decrypt, or sign messages on your behalf if you give it the correct password or PIN, and if you try too many bad passwords, the TPM will lock you out. So this is useful if you want protection against someone stealing your password, since the password by itself won't be enough; they will also need to get your laptop. It's also really good at making Jim Comey, head of the FBI, really angry. Oh, well. :-)
You can certainly use a TPM, or some other secure element, in conjunction with an inline crypto engine. The two technologies are very complementary. This is why I want to make sure that whatever inline crypto encryption support lands in the upstream kernel can be easily extended to support device designs where the host CPU does not have access to the encryption keys, and where the keys are provisioned via some kind of secure element directly to the inline crypto engine.
[1] https://blog.hansenpartnership.com/using-your-tpm-as-a-se...
Posted Mar 29, 2017 8:36 UTC (Wed)
by robbe (guest, #16131)
[Link]
Isn’t this the case for most HW features, that mainframes had it already in the 1970s? IBM is the Beatles of computer hardware.
Device mapper and hardware encryption
Posted Mar 30, 2017 8:14 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
The problem is basically that data gets passed to the encryption engine in ~4K chunks. Each chunk incurs a setup/teardown overhead. And while the hardware could quite happily handle much larger chunks, the device-mapper code needs major changes to be able to deal with such larger chunks. The impact is actually very noticeable once you look for it.
Cheers,
Wol
"Adding a 32-bit key ID field to struct bio" for Inline Encryption
Posted Jun 6, 2018 17:24 UTC (Wed)
by ladvine (guest, #124882)
[Link]
I am not sure whether providing a reference in the bio to the crypto context has security risks associated with it.