Keeping secrets in memfd areas
Sharing of address spaces comes about in a number of ways. Linux has traditionally mapped the kernel's address space into every user-space process; doing so improves performance in a number of ways. This sharing was thought to be secure for years, since the mapping doesn't allow user space to actually access that memory. The Meltdown and Spectre hardware bugs, though, rendered this sharing insecure; thus kernel page-table isolation was merged to break that sharing.
Another form of sharing takes place in the processor's memory caches; once again, hardware vulnerabilities can expose data cached in this shared area. Then there is the matter of the kernel's direct map: a large mapping (in kernel space) that contains all of physical memory. This mapping makes life easy for the kernel, but it also means that all user-space memory is shared with the kernel. In other words, an attacker with even a limited ability to run code in the kernel context may have easy access to all memory in the system. Once again, in an era of speculative-execution bugs, that is not necessarily a good thing.
The memfd subsystem wasn't designed for address-space isolation; indeed, its initial purpose was as a sort of interprocess communication mechanism. It does, however, provide a way to create a memory region attached to a file descriptor with specific characteristics; a memfd can be "sealed", for example, so that a recipient knows that it will not be changed. Rapoport decided that it would be a good foundation on which to build a "secret memory" feature.
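For readers who have not used sealing, a minimal sketch of the existing (non-secret) memfd behavior may help. It uses only the current memfd_create() and fcntl() sealing interfaces; the helper name make_sealed_memfd() is invented for this illustration and is not part of any patch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Illustrative helper (name is hypothetical): create a memfd holding
 * `msg`, then seal it so that neither its size nor its contents can
 * change. Returns the file descriptor, or -1 with errno set.
 */
int make_sealed_memfd(const char *msg)
{
    size_t len = strlen(msg);
    int saved;
    int fd;

    /* Sealing must be requested when the memfd is created. */
    fd = memfd_create("sealed-demo", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;

    if (write(fd, msg, len) != (ssize_t)len)
        goto fail;

    /* Forbid resizing, writing, and any further changes to the seals. */
    if (fcntl(fd, F_ADD_SEALS,
              F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL) < 0)
        goto fail;

    return fd;

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

A recipient of such a descriptor can call fcntl() with F_GET_SEALS to confirm that the seals are in place before trusting the contents; later write() or ftruncate() calls on the sealed memfd fail with EPERM.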
Actually creating an isolated memory area requires passing a new flag, MFD_SECRET, to memfd_create(). That flag, however, doesn't describe how this secrecy should be implemented. There are a number of options that offer varying levels of security and performance degradation, so the user has to make a decision. The available options, as implemented in the patch, could easily have been specified directly to memfd_create() with their own flags, but Rapoport decided to require the use of a separate ioctl() call instead. Until the secrecy mode has been specified with this call, the user cannot map the memfd, and thus cannot actually make use of it.
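The sequence described here (create, choose a mode, then map) might look roughly like the sketch below. Much of it is assumption: the numeric values given to MFD_SECRET and the mode constants are placeholders invented for illustration, the exact form of the ioctl() call is a guess, and secret_area_create() is a hypothetical helper. On a kernel without the patch, memfd_create() should simply reject the unknown flag.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * PLACEHOLDER values: the names come from the patch set, but these
 * numbers are invented here. Real code would take the definitions
 * from the patched kernel's headers.
 */
#ifndef MFD_SECRET
#define MFD_SECRET            0x0100U
#endif
#ifndef MFD_SECRET_EXCLUSIVE
#define MFD_SECRET_EXCLUSIVE  _IO('-', 0x13)
#endif
#ifndef MFD_SECRET_UNCACHED
#define MFD_SECRET_UNCACHED   _IO('-', 0x14)
#endif

/*
 * Hypothetical helper: create a secret area of `len` bytes using the
 * given secrecy mode. Returns 0 and stores the mapping in *out, or a
 * negative errno value with *out set to NULL.
 */
int secret_area_create(size_t len, unsigned long mode, void **out)
{
    void *p;
    int err;
    int fd;

    *out = NULL;

    fd = memfd_create("secret", MFD_SECRET);
    if (fd < 0)
        return -errno;   /* EINVAL is expected without the patch */

    /* The mode has to be selected before the memfd can be mapped. */
    if (ioctl(fd, mode, 0) < 0)
        goto fail;
    if (ftruncate(fd, (off_t)len) < 0)
        goto fail;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto fail;

    /* The mapping keeps the memfd alive after the descriptor is closed. */
    close(fd);
    *out = p;
    return 0;

fail:
    err = errno;
    close(fd);
    return -err;
}
```

A caller would pass MFD_SECRET_EXCLUSIVE or MFD_SECRET_UNCACHED as the mode and release the area with munmap() when finished.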
There are two modes implemented so far; the first of them, MFD_SECRET_EXCLUSIVE, does a number of things to hide the memory attached to the memfd from prying eyes. That memory is marked as being unevictable, for example, so it will never be flushed out to swap. The effect is similar to calling mlock(), but with a couple of differences: pages are not actually allocated until they are faulted in, and the limit on the number of locked pages appears to be (perhaps by mistake) implemented separately from the limits imposed by mlock(). There is also no way to unlock pages except by destroying the memfd, which requires unmapping it and closing its file descriptor.
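For comparison, the conventional way to keep a small secret out of swap today is mlock() on an anonymous mapping. The sketch below shows that existing approach, not the proposed feature; with_locked_copy() is a hypothetical helper name. Unlike the behavior described for MFD_SECRET_EXCLUSIVE, mlock() faults the pages in immediately, and the lock can be dropped with munlock() without destroying the mapping.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy `len` bytes of `key` into locked anonymous
 * memory, then wipe and release it. Returns 0 on success or a negative
 * errno value (for example -ENOMEM if RLIMIT_MEMLOCK is exceeded).
 */
int with_locked_copy(const void *key, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t map_len = ((len + page - 1) / page) * page;
    volatile unsigned char *wipe;
    size_t i;
    int err = 0;
    void *buf;

    buf = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -errno;

    /* Locking populates the pages and makes them unevictable. */
    if (mlock(buf, map_len) < 0) {
        err = -errno;
        goto out;
    }

    memcpy(buf, key, len);
    /* ... the secret would be used here ... */

    /* Wipe through a volatile pointer so the stores are not elided. */
    wipe = buf;
    for (i = 0; i < map_len; i++)
        wipe[i] = 0;

    munlock(buf, map_len);
out:
    munmap(buf, map_len);
    return err;
}
```

Note that memory locked this way stays in the kernel's direct map, which is the gap that the exclusive mode is meant to close.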
The other thing done by MFD_SECRET_EXCLUSIVE is to remove the pages used by the memfd from the kernel's direct map, making them inaccessible from kernel space. The problem with this is that the direct map is normally set up using huge pages, which makes accessing it far more efficient. Removing individual (small) pages forces huge pages to be broken apart into lots of small pages, slowing the system for everybody. The current code (admittedly a proof of concept) allocates each page independently when it is faulted in, which seems likely to maximize the damage done to the direct mapping. That will need to change before this feature could be seriously considered for merging.
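A rough back-of-the-envelope calculation shows the scale of the problem. The sketch below assumes x86-64 page sizes (4KB base pages and 2MB huge pages in the direct map) and a worst case in which every secret page lands in a different huge page; the function names are invented for illustration.

```c
#include <stdint.h>

/* Assumed sizes: x86-64 base pages and 2MB huge pages. */
#define BASE_PAGE_SIZE  (4ULL * 1024)
#define HUGE_PAGE_SIZE  (2ULL * 1024 * 1024)

/* Small-page entries needed to cover one huge page after it is split. */
uint64_t entries_per_split(void)
{
    return HUGE_PAGE_SIZE / BASE_PAGE_SIZE;
}

/*
 * Worst case: each of the n secret pages falls within a different huge
 * page. Every affected huge mapping is replaced by small-page entries,
 * one of which is then removed from the direct map. Returns the number
 * of small-page entries left behind.
 */
uint64_t worst_case_small_entries(uint64_t n_secret_pages)
{
    return n_secret_pages * (entries_per_split() - 1);
}
```

Under those assumptions, a mere 16 scattered secret pages (64KB of data) would turn 16 huge mappings into 8176 small ones, which is why allocating the pages in contiguous groups would be less damaging.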
The other mode, MFD_SECRET_UNCACHED, does everything MFD_SECRET_EXCLUSIVE does, but also causes the memory to be mapped with caching disabled. That will prevent its contents from ever living in the processor's memory caches, rendering it inaccessible to exploits that use any of a number of hardware vulnerabilities. It also makes access to that memory far slower in general, to the point that it may seem inaccessible to the intended user as well. For small amounts of infrequently accessed data (cryptographic keys, for example) it may be a useful option, though.
In its current form, the feature only allows one mode to be selected. In
truth, though, MFD_SECRET_UNCACHED is a strict superset of
MFD_SECRET_EXCLUSIVE, so that is not currently a problem.
Rapoport suggests that this whole API could change in the future, with an
alternative being "something like 'secrecy level' from 'a bit more
secret than normally' to 'do your best even at the expense of
performance'".
Part of the purpose behind this posting was to get comments on the proposed
API, but those have not been forthcoming so far. This may be one of those
projects that has to advance further — and get closer to being merge-ready
— before developers will take notice. But at least the work itself is not
a secret anymore, so interested users can start to think about whether it
meets their needs or not.
Index entries for this article | |
---|---|
Kernel | Memfd |
Kernel | Memory management/Address-space isolation |
Kernel | System calls/memfd_secret() |
Posted Feb 15, 2020 12:01 UTC (Sat)
by edeloget (subscriber, #88392)
Anyway, I like the idea - although I'm wondering if hiding memory from the kernel would not allow some kind of abuse (like hiding malicious stuff).