A netlink-based user-space crypto API
User-space access to the kernel cryptography subsystem has reared its head several times of late. We looked at one proposal back in August that had a /dev/crypto interface patterned after similar functionality in OpenBSD. There is another related effort, known as the NCR API, and crypto API maintainer Herbert Xu has recently posted an RFC for yet another. But giving user space the ability to request that the kernel perform its computation-intensive crypto operations is not uncontroversial.
As noted back in August, some kernel hackers are skeptical that there would be any performance gains by moving user-space crypto into the kernel. But there are a number of systems, especially embedded systems, with dedicated cryptographic hardware. Allowing user space to access that hardware will likely result in performance gains; in fact, 50-100x performance improvements have been reported.
Another problem with both the /dev/crypto and NCR APIs (collectively known as the cryptodev-linux modules) is the addition of an enormous amount of code to the kernel to support crypto algorithms beyond those that are already available. Those two modules have adapted user-space libraries for crypto and multi-precision integers and included them in the kernel. Those additions are necessary to support some government crypto standards and certifications that require a separation between user space and crypto processing. So, the cryptodev-linux modules are trying to solve two separate (or potentially separate) problems: user-space access to crypto hardware acceleration and security standards compliance.
When Xu first put out an RFC on his idea for the API (without any accompanying code) back in September, Christoph Hellwig had a rather strongly worded reaction:
Xu more or less agrees with Hellwig, but sees his API as a way to provide access to the hardware crypto devices. Because Xu's API is based on netlink sockets (as opposed to the ioctl()-based or brand-new interfaces that the cryptodev-linux modules introduce), he is clearly hoping that it will provide a way forward without requiring such large changes to the kernel:
The purpose of the user-space API is to export the hardware crypto devices to user-space. This means PCI devices mostly, as things like aesni-intel [Intel AES instructions] can already be used without kernel help.
Now as a side-effect if this means that we can shut the security people up about adding another interface then all the better. But I will certainly not go out of the way to add more crap to the kernel for that purpose.
The netlink-based interface uses a new AF_ALG address family that gets passed to the initial socket() call. There is also a new struct sockaddr_alg that contains information about what type of algorithm (e.g. "hash" or "skcipher") is to be used as well as the specific algorithm name (e.g. "sha1" or "cbc(aes)") that is being requested. That structure is then passed in the bind() call on the socket.
For things like hashing, where there is little or no additional information needed, an accept() is done on the socket, which yields an operation file descriptor. The data to be hashed is written to that descriptor and, when there is no more data to be hashed, the appropriate number of bytes (20 for sha1) is then read from the descriptor.
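The hashing sequence just described (socket(), bind(), accept(), write, read) can be sketched in C. This is only a sketch: it assumes the AF_ALG definitions found in the mainline kernel header <linux/if_alg.h>, which may differ in detail from the RFC posting. The SOCK_SEQPACKET socket type and the helper name af_alg_sha1() are assumptions made for illustration and are not taken from the article.

```c
#include <linux/if_alg.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef AF_ALG
#define AF_ALG 38               /* fallback for older libc headers */
#endif

/*
 * Hypothetical helper: hash len bytes of data with SHA-1 via AF_ALG.
 * Returns 0 on success, -1 if the interface is unavailable or a step fails.
 */
int af_alg_sha1(const void *data, size_t len, unsigned char digest[20])
{
    /* The algorithm type and name travel in the bind() address. */
    struct sockaddr_alg sa = {
        .salg_family = AF_ALG,
        .salg_type   = "hash",
        .salg_name   = "sha1",
    };
    int rc = -1;

    int tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
    if (tfm < 0)
        return -1;

    if (bind(tfm, (struct sockaddr *)&sa, sizeof(sa)) == 0) {
        /* accept() yields the operation descriptor. */
        int op = accept(tfm, NULL, NULL);
        if (op >= 0) {
            /* Write the data, then read back the 20-byte digest. */
            if (write(op, data, len) == (ssize_t)len &&
                read(op, digest, 20) == 20)
                rc = 0;
            close(op);
        }
    }
    close(tfm);
    return rc;
}
```

A caller would pass a buffer and a 20-byte output array, for example af_alg_sha1("abc", 3, digest). For large inputs a real program would need to loop on short writes, which this sketch does not do.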
It is a bit more complicated for ciphers. Before accepting the connection on the socket, a key needs to be established for a symmetric key cipher. That is done with a setsockopt() call using the new SOL_ALG level and ALG_SET_KEY option name and passing the key data and its length. But there are additional parameters that need to be set up for ciphers, and those are done using sendmsg().
A cipher will need to know which direction it is operating in (i.e. encrypting or decrypting) and may need an initialization vector. Those are specified with the ALG_SET_OP and ALG_SET_IV messages. Once the accept() has been done, those messages are sent to the operational descriptor and the cipher is ready for use. Data can be sent as messages or written to the operational descriptor, and the resulting data can then be read from that descriptor.
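The cipher setup can be sketched the same way. The sketch below assumes the mainline <linux/if_alg.h> definitions (struct af_alg_iv, ALG_OP_ENCRYPT), where ALG_SET_OP and ALG_SET_IV are carried as control messages on a sendmsg() to the operation descriptor; the RFC may have differed in detail. The helper name af_alg_aes128_cbc_encrypt(), the SOCK_SEQPACKET type, and the fixed 16-byte key and IV sizes are assumptions chosen for illustration.

```c
#include <linux/if_alg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef AF_ALG
#define AF_ALG 38               /* fallbacks for older libc headers */
#endif
#ifndef SOL_ALG
#define SOL_ALG 279
#endif

/*
 * Hypothetical helper: AES-128-CBC encrypt len bytes (a multiple of 16).
 * Returns 0 on success, -1 if the interface is unavailable or a step fails.
 */
int af_alg_aes128_cbc_encrypt(const unsigned char key[16],
                              const unsigned char iv[16],
                              const unsigned char *in,
                              unsigned char *out, size_t len)
{
    struct sockaddr_alg sa = {
        .salg_family = AF_ALG,
        .salg_type   = "skcipher",
        .salg_name   = "cbc(aes)",
    };
    /* Space for two control messages: the direction and the IV. */
    union {
        char buf[CMSG_SPACE(sizeof(uint32_t)) +
                 CMSG_SPACE(sizeof(struct af_alg_iv) + 16)];
        struct cmsghdr align;
    } ctrl;
    struct iovec iov = { .iov_base = (void *)in, .iov_len = len };
    struct msghdr msg;
    struct cmsghdr *cmsg;
    struct af_alg_iv *ivmsg;
    uint32_t dir = ALG_OP_ENCRYPT;
    int tfm, op, rc = -1;

    if (len == 0 || len % 16 != 0)
        return -1;
    tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
    if (tfm < 0)
        return -1;
    /* The key is set on the bound socket, before accept(). */
    if (bind(tfm, (struct sockaddr *)&sa, sizeof(sa)) != 0 ||
        setsockopt(tfm, SOL_ALG, ALG_SET_KEY, key, 16) != 0 ||
        (op = accept(tfm, NULL, NULL)) < 0) {
        close(tfm);
        return -1;
    }

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);             /* ALG_SET_OP: direction */
    cmsg->cmsg_level = SOL_ALG;
    cmsg->cmsg_type = ALG_SET_OP;
    cmsg->cmsg_len = CMSG_LEN(sizeof(dir));
    memcpy(CMSG_DATA(cmsg), &dir, sizeof(dir));

    cmsg = CMSG_NXTHDR(&msg, cmsg);         /* ALG_SET_IV: the IV */
    cmsg->cmsg_level = SOL_ALG;
    cmsg->cmsg_type = ALG_SET_IV;
    cmsg->cmsg_len = CMSG_LEN(sizeof(struct af_alg_iv) + 16);
    ivmsg = (void *)CMSG_DATA(cmsg);
    ivmsg->ivlen = 16;
    memcpy(ivmsg->iv, iv, 16);

    /* Send plaintext plus parameters, then read back the ciphertext. */
    if (sendmsg(op, &msg, 0) == (ssize_t)len &&
        read(op, out, len) == (ssize_t)len)
        rc = 0;

    close(op);
    close(tfm);
    return rc;
}
```

Decryption would follow the same path with ALG_OP_DECRYPT in place of ALG_OP_ENCRYPT. The sketch performs no padding, so the caller is responsible for supplying whole blocks.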
There is an additional wrinkle for the "authenticated encryption with associated data" (AEAD) block cipher mode, which can include authentication information (i.e. message authentication code or MAC) in the ciphertext stream. Because of that, AEAD requires two data streams, one containing the data itself and another with the associated authentication data (the MAC). This is handled in Xu's API by doing two accept() calls, the first for the operational descriptor, and the second for the associated data. If the cipher is operating in encryption mode, both descriptors will be written to, while the encrypted data is read from the operational descriptor. For decryption, the ciphertext is written to the operational descriptor, while the plaintext and authentication data are read from the two descriptors.
There hasn't been much discussion, yet, of the actual code posting, but Xu's September posting elicited a number of complaints about performance, most from proponents of the cryptodev-linux modules. But it would seem that there is some real resistance to adding completely new APIs (as NCR does) or to adding a complicated ioctl()-based API (as /dev/crypto does). Now there are three competing solutions available, but it isn't at all clear that any interface to the kernel crypto subsystem will be acceptable to the kernel community at large. We will have to wait to see how it all plays out.
Index entries for this article | |
---|---|
Kernel | Cryptography |
Posted Oct 21, 2010 6:16 UTC (Thu)
by neilbrown (subscriber, #359)
[Link] (10 responses)
I much prefer the filesystem model.
{ cat myfile >&0 ; read hash ; } <> /random-mountpoint/crypto/hash/sha1
So the name of the algorithm is passed as part of the file name, and the content is written to the file descriptor. The hash is read from that same file descriptor. The hash state is stored attached to the 'struct file'. See "Transaction based IO" in fs/libfs.c. It would need to be extended to work with writing a large file, but the concept is sound.
For encrypting, using the same 'fd' for both read and write is problematic in a way that it isn't (so much) for the above. The original (Unix 6) pipe syscall returned only one fd which you could read from and write to. One problem with that was that it can be awkward to detect when the 'write' end has been closed (so the read end should get EOF), as there is no distinction between the two. If you happen to have two processes with the 'read' end open you never see EOF.
If we can either ignore that or work around it, then
The need to multiplex cyphertext and MAC is certainly a complication. I suspect there was a reason Herbert suggested 2 sockets rather than a simple multiplexing scheme. Without knowing that reason it is pointless trying to refine the design.
If it was to be done with sockets, it would seem to make much more sense to use 'socketpair(AF_ALG, SOCK_STREAM, ....)' rather than the sockets + accept model. Then you have distinct 'read' and 'write' ends. I would also use MSG_OOB to send the MAC beside the cyphertext rather than having two separate streams (not that I am a big fan of MSG_OOB, but it does seem to be a shoe that fits).
Posted Oct 21, 2010 14:23 UTC (Thu)
by ken (subscriber, #625)
[Link] (1 responses)
To me it looks like you just transform magic ioctl number into magic socket options and magic sendmsg() commands.
where is the benefit over /dev/crypto ??
Posted Nov 1, 2010 4:49 UTC (Mon)
by kevinm (guest, #69913)
[Link]
The fundamental problem of the ioctl interface is that every implementer of the interface must re-implement that parameter marshalling - for every arch. There are plenty of ioctl()s that *still* don't work properly for IA32 callers on x86-64 arch.
Posted Oct 21, 2010 15:04 UTC (Thu)
by rvfh (guest, #31018)
[Link] (4 responses)
Posted Oct 21, 2010 15:12 UTC (Thu)
by jengelh (guest, #33263)
[Link] (3 responses)
Posted Oct 21, 2010 16:29 UTC (Thu)
by nye (subscriber, #51576)
[Link] (1 responses)
Posted Oct 28, 2010 19:11 UTC (Thu)
by oak (guest, #2786)
[Link]
Hmmm. On second thought, I would assume socat author to have tested all the documented options. Maybe I used it with the --simulate option after all.
(That could also explain the dancing ping elephants and Jumbo frame's enormous flapping ears...)
Posted Oct 22, 2010 1:42 UTC (Fri)
by neilbrown (subscriber, #359)
[Link]
With suitably chosen file names, no such extra coding for the shell is needed.
Posted Oct 21, 2010 15:13 UTC (Thu)
by jengelh (guest, #33263)
[Link] (2 responses)
Posted Oct 22, 2010 1:54 UTC (Fri)
by neilbrown (subscriber, #359)
[Link] (1 responses)
Posted Oct 22, 2010 9:09 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Posted Oct 21, 2010 18:28 UTC (Thu)
by daniel (guest, #3181)
[Link]
Posted Oct 21, 2010 20:54 UTC (Thu)
by alonz (subscriber, #815)
[Link] (6 responses)
I also dislike Xu's proposal. Sorry.
My issues with this API (unlike the previous commenters) relate to function, not form:
Posted Oct 22, 2010 0:59 UTC (Fri)
by SEJeff (guest, #51588)
[Link] (2 responses)
Posted Oct 24, 2010 21:50 UTC (Sun)
by alonz (subscriber, #815)
[Link] (1 responses)
Posted Oct 25, 2010 12:13 UTC (Mon)
by SEJeff (guest, #51588)
[Link]
Posted Oct 22, 2010 1:52 UTC (Fri)
by neilbrown (subscriber, #359)
[Link] (2 responses)
I'm imagining you do an aio_read() to identify the buffer for the transformed data to be written to, then an O_DIRECT write to identify the buffer containing data to be transformed. The underlying implementation would need to notice the presence of a pending aio_read and place the result directly there rather than in the page cache.
I guess the same thing could do an in-place transformation, but it could get messy.
Of course if O_DIRECT wasn't used it would fall back to the simple case of copying to the page cache, transforming, and copying back out.
Posted Oct 24, 2010 21:58 UTC (Sun)
by alonz (subscriber, #815)
[Link] (1 responses)
As for the specific API—all proposals I have seen so far look like hacks, and are rather brittle (e.g. the aio_read solution would require the driver to keep userspace pointers for longer than a single system call, which is generally considered bad taste AFAIK).
Posted Oct 24, 2010 23:31 UTC (Sun)
by neilbrown (subscriber, #359)
[Link]
aio, by its very nature, requires the kernel to hold on to user-space pointers for longer than a single system call. This is OK because 'aio_cancel' exists to reclaim the pointer if needed.
Posted Oct 29, 2010 15:49 UTC (Fri)
by deviantmaru (guest, #70901)
[Link] (1 responses)
Disclaimer: I am a kernel-driver who is currently hacking (learning) on an
Posted Nov 1, 2010 15:26 UTC (Mon)
by eparis (guest, #33060)
[Link]
The biggest problem with ioctl is by FAR that people get it wrong. ioctl is the equivalent of typing everything in C as void * and wondering why your program isn't behaving correctly. Look at ioctl vs getsockopt() and setsockopt():
int ioctl(int d, int request, ...);
int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);
They provide the same ability to be generic and to move data back and forth but the socket functions encode size and direction into the call. It means you can easily do sane checks in the kernel.
Linus has recently pushed a bit that syscalls are the right way to go (not in this discussion, just in general discussions about kernel/userspace ABI). A good syscall is going to provide size, direction, and strong typing of arguments.
The more information an interface encodes and enforces the more likely it is that the interface will be used correctly.
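The point about size and direction being encoded in the call can be illustrated with an ordinary socket option. This is a minimal sketch using only standard calls; the helper name query_socket_type() is made up for the example. The caller states how large its buffer is, and the kernel reports back how many bytes it actually wrote, which is information a bare ioctl() pointer argument does not carry.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the socket's type (e.g. SOCK_STREAM),
 * or -1 on error. *reported_len receives the size the kernel wrote.
 */
int query_socket_type(int fd, socklen_t *reported_len)
{
    int type = 0;
    /* In: capacity of the buffer. Out: bytes filled in by the kernel. */
    socklen_t len = sizeof(type);

    /* "get" fixes the direction; len bounds what the kernel may copy. */
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) != 0)
        return -1;

    *reported_len = len;
    return type;
}
```

With ioctl(), by contrast, the kernel sees only a request number and an untyped argument, so each driver has to know the size and direction of the data for every request on its own.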
/mountpoint/crypto/cypher/$direction/$cyphertype/$key/$iv
is a promising file name to write to/ read from, except that there is a risk that the key would get stuck in the dcache and appear in /proc/$N/fd/$FD. I'm sure that is solvable though. The key would be HEX or BASE64 encoded of course.
No way to access sockets from a shell script?
Don't use netlink
Well, speaking as the architect of a hardware cryptography device…
I don't really have a good alternative API; crypto just doesn't appear to map cleanly to the Unix abstractions. Maybe a specialized system call ("sendrecvmsg()"/"servercall()" or somesuch) could help with the second point.
I'm the lead architect for Discretix' CryptoCell embedded security platform (which is also the basis for the Intel Moorestown security subsystem).
I refer to "zero-copy" in rather loose terms—not copying more than is necessary. In particular, if the application chooses input/output buffers that are suitable for DMA, I would like to perform a single DMA translation (many cryptography engines have dual-channel DMA engines, so they can read the source buffer via DMA, transform it, and write the output to the target buffer in a single pass).
What about PKI? And other comments.
2) I would have to agree with ken and alonz in that the netlink-based system
seems more like a hack than a proper design.
3) I would also agree with alonz that crypto operations don't seem to fit
well into any of the current Unix abstractions.
4) I am new to ioctl-based programming, so can anyone please tell me what is
awful about it?
ioctl-based, /dev/blah driver for a hardware (PCI) crypto device.
int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);