Per-call-site slab caches for heap-spraying protection
A heap-spraying attack can be carried out by allocating as many objects as possible and filling each with data of the attacker's choosing. If the kernel can be convinced to use that data, perhaps as the address of a function to call, then the attacker can gain control. Heap spraying is not a vulnerability itself, but it can ease the exploitation of an actual vulnerability, such as a use-after-free bug or the ability to overwrite a pointer. The kernel's kmalloc() function (along with several variants) allocates memory from the heap. Since kmalloc() is used heavily throughout the kernel, any call site that can be used for heap spraying can potentially be used to exploit a vulnerability in a distant, unrelated part of the kernel. That makes the kmalloc() heap a tempting target for attackers.
kmalloc() makes its allocations from a set of "buckets" of fixed-size objects; most (but not all) of those sizes are powers of two. So, for example, a 48-byte allocation request will result in the allocation of a 64-byte object. The structure behind kmalloc() is, in a sense, an array of heaps, each of which is used for allocations of a given size range. This separation can make heap-spraying attacks easier, since it is not necessary to overwrite the entire heap to target an object of a given size.
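The rounding behavior described above can be sketched in user-space C. This is only an illustration: the table of sizes and the names bucket_size_for() and slack_for() are assumptions made for the example, and the real set of kmalloc() sizes depends on the kernel configuration and architecture.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative bucket sizes only: powers of two plus 96 and 192 bytes.
 * This table is an assumption for the example, not the kernel's own.
 */
static const size_t bucket_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

#define NR_BUCKETS (sizeof(bucket_sizes) / sizeof(bucket_sizes[0]))

/* Return the smallest bucket size that can hold `size` bytes, or 0 if none can. */
static size_t bucket_size_for(size_t size)
{
    for (size_t i = 0; i < NR_BUCKETS; i++) {
        if (size <= bucket_sizes[i])
            return bucket_sizes[i];
    }
    return 0; /* larger requests are not served from these buckets */
}

/* Bytes left unused when `size` is rounded up to its bucket. */
static size_t slack_for(size_t size)
{
    size_t bucket = bucket_size_for(size);

    assert(bucket != 0); /* caller must stay within the bucket range */
    return bucket - size;
}
```

With this table, a 48-byte request lands in the 64-byte bucket, leaving 16 bytes of slack; every other 33-to-64-byte request shares that same bucket, which is what lets an attacker target objects of a given size.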
The dedicated bucket allocator creates a separate set of buckets for allocation sites that are deemed to present an especially high heap-spraying risk. For example, any allocation that can be instigated from user space and filled with user-supplied data would be a candidate for a dedicated set of buckets. Then, even if the attacker manages to thoroughly spray that heap, it will not affect any other allocations; the attacker's carefully selected data cannot be used to attack any other part of the kernel.
The way to get the most complete protection from heap spraying would be to create a set of dedicated buckets for every kmalloc() call site. That would be expensive, though; each set of buckets occupies a fair amount of memory. Inefficiency at that level is the sort of tradeoff that kernel developers tend to view with extreme skepticism; creating a set of buckets for every call site simply is not going to happen.
This new patch series from Kees Cook is built around one of those observations that is obvious in retrospect: most kmalloc() call sites request objects of a fixed size that will never change. Often that size (the size of a specific structure, for example) is known at compile time. In such cases, providing the call site with a single dedicated slab for the size that is needed would give an equivalent level of protection against heap-spraying attacks. There is no need to provide buckets for all of the other sizes; they would never be used.
The only problem with that idea is that there are thousands of kmalloc() call sites in the kernel. Going through and examining each one would be a tedious and possibly error-prone task that would result in a lot of code churn. But the compiler knows whether the size parameter passed to any given kmalloc() call is a compile-time constant or not; all that is needed is a way to communicate that information to the call itself. If that information were accompanied by something that identified the call site, the slab allocator could set up dedicated slabs for the call sites where it makes sense.
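One way to ask the compiler that question, with GCC or Clang, is the __builtin_constant_p() builtin. The sketch below is an assumption about how such information could be captured, not a description of the series itself; struct size_hint, SIZE_HINT() and wants_dedicated_slab() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of what the compiler knows about a size argument. */
struct size_hint {
    bool is_const;  /* was the size a compile-time constant? */
    size_t size;    /* the constant size, or 0 if not constant */
};

/*
 * __builtin_constant_p() does not evaluate its argument; it only reports
 * whether the compiler can prove the expression is a constant.
 */
#define SIZE_HINT(sz) ((struct size_hint){                   \
    .is_const = __builtin_constant_p(sz),                    \
    .size = __builtin_constant_p(sz) ? (size_t)(sz) : 0,     \
})

/* A single dedicated slab only makes sense for a known, nonzero size. */
static bool wants_dedicated_slab(struct size_hint hint)
{
    return hint.is_const && hint.size > 0;
}
```

A call such as SIZE_HINT(sizeof(struct foo)) yields a constant hint, while a size read from a run-time variable does not. Note that, with optimization enabled, the compiler may also treat a variable whose value it can prove as constant.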
So the problem comes down to getting that information to kmalloc() in an efficient way. Cook's approach is an interesting adaptation of the code-tagging framework that was merged for the 6.10 release. Code tagging is part of the memory-allocation profiling subsystem, which is meant to help find allocation-related bugs; it ties allocations to the call site that requested them, so developers can find, for example, the source of a memory leak.
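The general idea of tying an allocation to its call site can be illustrated with a small user-space sketch. It is not the kernel's code-tagging implementation: struct call_site_tag, tagged_alloc(), last_tag and TAGGED_MALLOC() are all hypothetical, and the macro relies on GNU C statement expressions, so it needs GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-call-site tag; the kernel's real structures differ. */
struct call_site_tag {
    const char *file;
    int line;
    size_t calls;   /* number of allocations made from this site */
    size_t bytes;   /* total bytes requested from this site */
};

/* Most recently used tag, kept only so the example can be inspected. */
static struct call_site_tag *last_tag;

static void *tagged_alloc(struct call_site_tag *tag, size_t size)
{
    assert(tag != NULL);
    tag->calls++;
    tag->bytes += size;
    last_tag = tag;
    return malloc(size);
}

/*
 * Each expansion of the macro defines its own static tag, so every call
 * site gets a distinct record that persists across calls.
 */
#define TAGGED_MALLOC(size) ({                                      \
    static struct call_site_tag _tag = { __FILE__, __LINE__, 0, 0 }; \
    tagged_alloc(&_tag, (size));                                    \
})
```

Two textually separate TAGGED_MALLOC() calls get two different tags, while repeated calls from the same line (in a loop, say) accumulate in one tag. That persistent per-site record is the hook on which extra information can be hung.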
Code tagging was not really meant as a kernel-hardening technology, but it does provide the call-site information needed here. Cook's series starts by augmenting the tag information stored for each call site with an indicator of whether the allocation size is constant and, if so, what that size is. That information will be available to the slab allocator when the kmalloc() call is made.
If a given allocation request is at the GFP_ATOMIC level, it will be handled in the usual way to avoid adding any extra allocations to that path. Otherwise, though, the allocator will check whether that call site uses a constant size; if so, a dedicated slab will be created for that site and used to satisfy the allocation request (and all that follow). If the size is not constant, then a full set of buckets will be created instead. Either way, the decision will be stored in the code tag to speed future calls. It is worth noting that this setup is not done for any given call site until the first call is made, meaning that it is not performed for the many kmalloc() call sites that will never execute in any given kernel.
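The decision procedure described above can be modeled roughly as follows. This is a simplified sketch with hypothetical names (struct site_tag, choose_path(), caches_created); it follows the text literally by sending every atomic request down the usual path, which is an assumption about a detail the description leaves open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum site_cache_kind { SITE_UNSET, SITE_DEDICATED_SLAB, SITE_BUCKET_SET };

/* Hypothetical per-call-site state; field names are illustrative only. */
struct site_tag {
    bool size_is_const;
    size_t const_size;
    enum site_cache_kind kind;  /* decision cached after the first call */
};

enum alloc_path { PATH_SHARED_BUCKETS, PATH_DEDICATED_SLAB, PATH_SITE_BUCKETS };

/* Counts how many times per-site caches were set up, for illustration. */
static int caches_created;

static enum alloc_path choose_path(struct site_tag *tag, bool atomic)
{
    assert(tag != NULL);

    /* Atomic requests take the usual path, so no setup work happens here. */
    if (atomic)
        return PATH_SHARED_BUCKETS;

    /* Lazy setup: done once, on the first non-atomic call from this site. */
    if (tag->kind == SITE_UNSET) {
        tag->kind = tag->size_is_const ? SITE_DEDICATED_SLAB
                                       : SITE_BUCKET_SET;
        caches_created++;
    }

    return tag->kind == SITE_DEDICATED_SLAB ? PATH_DEDICATED_SLAB
                                            : PATH_SITE_BUCKETS;
}
```

Because the setup happens inside the first call, a site that never runs never costs any memory, and later calls only read the cached decision.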
If this series is merged, the kernel will have three levels of defense against heap-spraying attacks. The randomized slab option, merged for 6.6, creates 16 sets of slab buckets, then assigns each call site to one set randomly. Its memory overhead is relatively low, but the protection is probabilistic — it reduces the chance that an attacker can spray the target heap, but does not eliminate it. The dedicated-buckets option provides stronger protection, but is limited by the need to explicitly identify risky call sites and isolate them manually. This new option, instead, provides strong protection against heap spraying, but it will inevitably increase the memory overhead of the slab allocator.
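The probabilistic nature of the randomized option can be seen in a small sketch of the assignment step. The hash used here (a multiply by a 64-bit odd constant, keeping the top four bits) and the names bucket_set_for() and NR_BUCKET_SETS are assumptions chosen for illustration; the kernel's actual hashing is not described in this article.

```c
#include <assert.h>
#include <stdint.h>

#define NR_BUCKET_SETS 16u

/*
 * Map a call-site address to one of 16 bucket sets, mixed with a
 * per-boot random seed so the mapping is not predictable in advance.
 */
static unsigned int bucket_set_for(uint64_t call_site, uint64_t seed)
{
    uint64_t mixed = (call_site ^ seed) * 0x9E3779B97F4A7C15ull;
    unsigned int set = (unsigned int)(mixed >> 60); /* top 4 bits: 0..15 */

    assert(set < NR_BUCKET_SETS);
    return set;
}
```

With only 16 sets and thousands of call sites, many sites necessarily share a set; an attacker's spraying site lands in the same set as a given target with a probability of roughly one in sixteen, which is why the protection reduces the risk without eliminating it.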
The amount of that overhead will depend on the workload being run. For an
unspecified distribution kernel, Cook reported that the number of slabs
listed in /proc/slabinfo grew by a factor of five or so. Should
the series land in the mainline, it will be up to distributors to decide
whether to enable this option or not. When a kernel is going to run on a
system that is at high risk of heap-spraying attacks, though, that may
prove to be an easy decision to make.
Index entries for this article | |
---|---|
Kernel | Memory management/Slab allocators |
Kernel | Security/Kernel hardening |
Posted Aug 20, 2024 14:51 UTC (Tue)
by Lionel_Debroux (subscriber, #30014)
[Link] (4 responses)
Posted Aug 21, 2024 18:41 UTC (Wed)
by kees (subscriber, #27264)
[Link] (3 responses)
The idea of separating allocation by type is not new[2] (though doing it per call site is easier). Getting Linux to a safer position to defend against heap UAF is going to take a lot of steps, and this series is just one of many needed steps (see my other comment further down).
[1] https://perens.com/2017/06/28/warning-grsecurity-potentia...
Posted Aug 21, 2024 22:52 UTC (Wed)
by Lionel_Debroux (subscriber, #30014)
[Link] (2 responses)
This sentence you quoted from that questionable post by Bruce Perens, published several weeks after PaX+grsecurity went commercial-only, _might_ have been correct at the time _if_ nobody had redistributed the patches yet... however, I can only think of it as factually incorrect since, at the latest, December 2018, when one version was redistributed to the general public, showcasing the improved defenses and highlighting mainline's stable backporting process missing a sizable number of important fixes (FTR, I did help with the latter).
Posted Aug 22, 2024 23:49 UTC (Thu)
by kees (subscriber, #27264)
[Link] (1 response)
As far as inspiration, this series is not trying to implement what AUTOSLAB claims to do. The implementation goals come from all over the place, including MTE, kCTF patches, PartitionAlloc, the XNU kmalloc_type allocator, the GrapheneOS hardened_malloc, etc. Heap defense research is hardly unique to grsecurity. :)
As for the Perens article quote being "factually incorrect", is grsecurity no longer a commercial product? Regardless, random monolithic source leaks are hardly useful for making robust upstream improvements. Besides, Linux has moved away from compiler plugins -- we've been driving language extensions directly in Clang and GCC so the entire Open Source ecosystem can benefit, and then refactoring Linux itself to gain better language robustness and hardening coverage.
Posted Aug 23, 2024 8:11 UTC (Fri)
by Lionel_Debroux (subscriber, #30014)
[Link]
You're mentioning moving away from infrastructure (compiler plugins) which has made it possible to provide ongoing support for a wide range of compiler versions for a decade or so, in order to replace it by built-in implementations of a subset of the capabilities provided by compiler plugins into only the newest and future compiler versions, while producing crippled kernel builds on compiler versions which don't support these newfangled extensions - i.e. most of them.
Posted Aug 20, 2024 20:10 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (4 responses)
Posted Aug 21, 2024 0:56 UTC (Wed)
by willy (subscriber, #9762)
[Link] (3 responses)
Posted Aug 21, 2024 7:33 UTC (Wed)
by taladar (subscriber, #68407)
[Link] (2 responses)
Posted Aug 21, 2024 8:20 UTC (Wed)
by johill (subscriber, #25196)
[Link]
Posted Aug 21, 2024 18:24 UTC (Wed)
by kees (subscriber, #27264)
[Link]
https://lore.kernel.org/lkml/20240807235433.work.317-kees...
Posted Aug 22, 2024 1:09 UTC (Thu)
by comex (subscriber, #71521)
[Link] (1 response)
https://security.apple.com/blog/towards-the-next-generati...
Posted Aug 22, 2024 23:19 UTC (Thu)
by kees (subscriber, #27264)
[Link]
AUTOSLAB ?
"
Different from quarantining freed kernel heap objects, grsecurity developed an isolation-based approach where each generic allocation site (calling to k*alloc*) has its own dedicated memory caches. As such, two different object types will be isolated from each other since they are allocated from their own dedicated memory caches.
"
The article lists vulnerabilities and benchmarks which can be interesting for evaluating implementations.
AUTOSLAB ?
[2] https://chromium.googlesource.com/chromium/src/+/master/b...
AUTOSLAB ?
Further making that sentence factually incorrect in 2024 is the fact that some grsecurity versions more than two years newer than that one, and four years (!) newer than the latest non-commercial ones, have been publicly available for download and usage - under the terms of the GPLv2, obviously - for years. AUTOSLAB is there, as is e.g. RESPECTRE.
AUTOSLAB ?
It's an interesting approach, which certainly has upsides beyond Linux (just like the GCC Rust efforts, which Open Source Security Inc. has been one of the very few entities providing actual funding to), if people start to use these language extensions. Their approach is arguably more practical, though. And they can still pull it as a tiny company, which has nowhere remotely near the resources any of the large Linux companies has access to.
Optimization opportunity?
kalloc_type