A new LLVM CFI implementation
The kernel makes extensive use of indirect function calls; they are at the heart of its internal object model. Every one of those calls is a potential entry point for an attacker; if the target of the call can somehow be changed to an address of the attacker's choosing, the game is usually over. Forward-edge control-flow integrity (CFI) works to thwart such attacks by ensuring that every indirect function call sends control to a code location that was actually intended to be a target of that call. Specifically, an indirect function call should only go to a known function entry point, and the prototype of the function should match what is expected at the call site.
The CFI implementation merged for 5.13 works by creating "jump tables" containing all of the legitimate targets of indirect function calls in the kernel; there is one jump table for each observed function prototype. Actual indirect calls are replaced with a jump-table lookup to ensure that the intended target meets the criteria; the target should be found in the jump table corresponding to the intended function prototype. If that test fails, a kernel panic results. See this article for a more detailed description of how this mechanism works.
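As a rough illustration of the jump-table idea, here is a minimal user-space sketch in C. It is a model only: the names (unary_fn, unary_table, jt_checked_call) are invented for this example, a linear search stands in for whatever lookup the compiler actually generates, and an error return stands in for the kernel panic.

```c
#include <assert.h>
#include <stddef.h>

/* One prototype, and therefore one table, in this model: int (*)(int). */
typedef int (*unary_fn)(int);

static int twice(int x)          { return 2 * x; }
static int square(int x)         { return x * x; }
static int not_registered(int x) { return x + 1; }

/* Hypothetical table of every legitimate target with this prototype. */
static const unary_fn unary_table[] = { twice, square };

/* Linear search used purely for illustration. */
static int in_table(unary_fn target)
{
    size_t n = sizeof(unary_table) / sizeof(unary_table[0]);

    for (size_t i = 0; i < n; i++)
        if (unary_table[i] == target)
            return 1;
    return 0;
}

/* Checked indirect call: 0 on success, -1 where the kernel would panic. */
int jt_checked_call(unary_fn target, int arg, int *result)
{
    assert(result != NULL);
    if (!in_table(target))
        return -1;
    *result = target(arg);
    return 0;
}

/* Exercises one allowed and one rejected target; returns 0 if both behave. */
int jt_demo(void)
{
    int r = 0;

    if (jt_checked_call(square, 7, &r) != 0 || r != 49)
        return 1;
    if (jt_checked_call(not_registered, 7, &r) != -1)
        return 2;
    return 0;
}
```

In this model, not_registered() has the right prototype but is still rejected, because it never appears in the table for that prototype.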
That implementation of CFI does the job, but it has a few disadvantages as well. Creating the jump tables requires a view of the full kernel binary; in practice, it requires that link-time optimization be used to build the kernel, which is a slow and sometimes tricky process. The replacement of function-pointer variables with jump-table entries also means that those variables cannot be compared against the address of a specific function, which is something that kernel code needs to do on occasion. It would be nicer to have a CFI implementation that doesn't impose problems of this sort.
That implementation would appear to exist in this patch set from Sami Tolvanen. It depends on a new Clang compiler option (-fsanitize=kcfi), which has not yet landed in the LLVM mainline. This CFI mechanism, which is "intended to be used in low-level code, such as operating system kernels", avoids the above-mentioned problems at the cost of a couple of other tradeoffs, notably that it cannot work with execute-only memory (read access is always required).
When code is compiled with -fsanitize=kcfi, the entry point to each function is preceded by a 32-bit value representing the prototype of that function. This value is (part of) a hash calculated from the C++ mangled name for the function and its arguments. On x86 systems, this hash is placed into a simple MOV instruction and surrounded by INT3 instructions; this is meant to prevent the hash itself from becoming a useful gadget for attackers. For each indirect call, extra code is emitted to fetch and check this hash value before the call itself is made; if the hash does not match what was expected, a trap (which will be turned into a kernel oops) results. The checking of the hash is why execute-only memory cannot be supported: it must be possible to read the hash value from the executable code.
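The check described above can be modeled in ordinary C. The sketch below is not what the compiler emits: portable C cannot read the bytes in front of a function, so a hypothetical struct typed_fn carries the hash next to the function pointer, a 32-bit FNV-1a hash over a made-up signature string stands in for the real hash of the mangled name, and a -1 return stands in for the trap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in 32-bit FNV-1a hash over a signature string. This is NOT the
 * hash Clang computes; it only plays the same role in this model. */
static uint32_t type_hash(const char *sig)
{
    uint32_t h = 2166136261u;

    for (; *sig != '\0'; sig++) {
        h ^= (unsigned char)*sig;
        h *= 16777619u;
    }
    return h;
}

/* Model of a compiled function: a type hash kept alongside the entry point. */
struct typed_fn {
    uint32_t hash;
    void (*entry)(void);   /* generic function pointer, cast back before use */
};

static int twice(int x)         { return 2 * x; }
static long add(long a, long b) { return a + b; }

/* Call site expecting int (*)(int): check the hash, then make the call.
 * Returns 0 on success, -1 where the real check would trap. In real code
 * the expected hash would be a constant baked into the call site. */
int kcfi_call_int_int(const struct typed_fn *target, int arg, int *result)
{
    assert(target != NULL && result != NULL);
    if (target->hash != type_hash("int(int)"))
        return -1;
    *result = ((int (*)(int))target->entry)(arg);
    return 0;
}

/* One matching and one mismatched target; returns 0 if both behave. */
int kcfi_demo(void)
{
    struct typed_fn good = { type_hash("int(int)"), (void (*)(void))twice };
    struct typed_fn bad  = { type_hash("long(long,long)"), (void (*)(void))add };
    int r = 0;

    if (kcfi_call_int_int(&good, 21, &r) != 0 || r != 42)
        return 1;
    if (kcfi_call_int_int(&bad, 21, &r) != -1)
        return 2;
    return 0;
}
```

Unlike a table lookup, the check needs only the target itself and a constant known at the call site, which is why no whole-kernel view is required.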
For the most part, this mechanism just works without the need for much change in the kernel code itself — at least, not beyond the changes that were already required for the previous CFI implementation. There is, however, the problem of functions written in assembly, which will need to have the necessary preamble generated by some other means. Generating the requisite hash value for each indirectly called assembly function could be a tiresome task; fortunately, the compiler provides some help. Whenever it sees (in C code) the address of a function being taken (as in this example):
    static const struct v4l2_file_operations mcam_v4l_fops = {
        .open = mcam_v4l_open,
        /* ... */
    };
it will generate a corresponding symbol defined as the resulting hash value; in this case, the symbol would be __kcfi_typeid_mcam_v4l_open. The existence of these symbols means that the preambles for assembly functions can be generated automatically via some tweaks to the macros already used to define those functions.
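The naming scheme lends itself to macro generation. The following C sketch only builds text: KCFI_TYPEID_SYM follows the symbol naming given above, while KCFI_ASM_PREAMBLE, my_asm_func, and the ".4byte" layout are assumptions made up for illustration and are not taken from the patch set.

```c
#include <assert.h>
#include <string.h>

/* Name of the symbol the compiler defines for an address-taken function. */
#define KCFI_TYPEID_SYM(fn) "__kcfi_typeid_" #fn

/* Hypothetical preamble text: the hash value followed by the entry label.
 * The directive and layout here are assumptions, for illustration only. */
#define KCFI_ASM_PREAMBLE(fn) \
    "\t.4byte " KCFI_TYPEID_SYM(fn) "\n" #fn ":\n"

/* Returns the preamble text for a made-up assembly function. */
const char *kcfi_demo_preamble(void)
{
    static const char text[] = KCFI_ASM_PREAMBLE(my_asm_func);

    /* The generated text must reference the compiler-provided symbol. */
    assert(strstr(text, KCFI_TYPEID_SYM(my_asm_func)) != NULL);
    return text;
}
```

The point of the sketch is that nothing beyond the function's name is needed to emit the preamble; the hash value itself is resolved later through the symbol.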
This patch series is currently in its third version, and it would appear that all of the substantive concerns have been addressed. It is, in other words, looking ready to be merged into the mainline. There is only one remaining obstacle to overcome: kernel developers will be reluctant to merge this feature until it is actually supported in the LLVM Clang compiler. Assuming that happens in the near future, it should not be too long until the kernel acquires an upgraded CFI implementation for the arm64 and x86 architectures.
Index entries for this article | |
---|---|
Kernel | Releases/6.1 |
Kernel | Security/Control-flow integrity |
Posted Jun 18, 2022 17:52 UTC (Sat)
by jhoblitt (subscriber, #77733)
Posted Jun 21, 2022 8:41 UTC (Tue)
by LtWorf (subscriber, #124958)
Posted Jun 19, 2022 13:48 UTC (Sun)
by mss (subscriber, #138799)
The function return value type isn't a part of its C++ mangled name so I guess this isn't checked either.
Posted Jun 19, 2022 14:24 UTC (Sun)
by atnot (subscriber, #124910)
Posted Jun 19, 2022 14:29 UTC (Sun)
by willy (subscriber, #9762)
Posted Jun 19, 2022 14:29 UTC (Sun)
by mss (subscriber, #138799)
I assume having a hash table of allowed call targets could work here (it would be incompatible with out-of-tree kernel modules, however).
But this sounds a bit like the previous CFI design that this new implementation seeks to replace.
Posted Jun 19, 2022 20:58 UTC (Sun)
by willy (subscriber, #9762)
Posted Jun 19, 2022 21:32 UTC (Sun)
by mss (subscriber, #138799)
I don't quite understand your analogy here, the article says:
So the checking happens prior to the actual call: in the caller, not in the callee.
Posted Jun 19, 2022 21:41 UTC (Sun)
by willy (subscriber, #9762)
But I wasn't referring to the implementation; rather the concept is that the function declares who can call it. That's done by type here, but could also be done by saying "I am an implementation of get_block_t"
Posted Jun 19, 2022 18:49 UTC (Sun)
by iabervon (subscriber, #722)
Posted Jun 19, 2022 21:45 UTC (Sun)
by corbet (editor, #1)
Posted Jun 19, 2022 21:58 UTC (Sun)
by mss (subscriber, #138799)
I think that the set of functions implementing a particular callback in the kernel should be known at compile time, either via manual annotations (as willy has suggested above) or maybe even automatically by a sufficiently smart compiler.
This probably would be incompatible with out-of-tree kernel modules, however.
Posted Jun 19, 2022 22:40 UTC (Sun)
by NYKevin (subscriber, #129325)
Posted Jun 20, 2022 13:51 UTC (Mon)
by khim (subscriber, #9252)
Well… zero-sized arguments don't exist in standard C; they are a GNU extension, which means you can try to submit a patch to Clang and GCC that supports what you want. Once you have left the standard, it's kind of hard to expect such non-standard constructs to be supported in said standard, don't you think?
Posted Jun 20, 2022 17:39 UTC (Mon)
by nybble41 (subscriber, #55106)
A keyword for the void constructor might be nice, but "(void)0" would serve well enough. This could be made into a standard VOID macro, like NULL for "(void*)0".
Posted Jun 20, 2022 20:56 UTC (Mon)
by wahern (subscriber, #37304)
Also, at least based on a straightforward reading of the standard, sequentially declared 0-length bit fields should collapse (i.e. not unspecified or undefined behavior), so that they introduce only one word of padding at most, if any; and this is indeed the behavior I see from clang and GCC. And while maybe more susceptible to disagreement, the language of the standard does seem to specify that a 0-length bit field not succeeding another bit field should not introduce any padding. I see the same behaviors for 0-length arrays, but the GCC documentation seemed much more ambiguous on both points.[2]
I am curious why I haven't seen void (the true "nothing") type semantics extended elsewhere in the grammar. Maybe 0-length bit and array fields were sufficient, if not ideal, for the most pressing scenarios. But perhaps the language (inclusive of extensions) is finally moving in a direction where the old hacks are insufficient, and void might see some more attention.
[1] See § 3.5.2.1 at http://port70.net/~nsz/c/c89/c89-draft.txt
[2] Putting it all together after having double checked my assertions with the standard, it does seem that the only use for 0-length arrays has been almost entirely subsumed by flexible array members, except that the former make indexing notation easier (no offsetof verbiage). For anything else (mostly in relation to extensions, like 0-length structures), 0-length bit field notation seems sufficient. Maybe the situation is different with C++.
Posted Jun 19, 2022 23:29 UTC (Sun)
by corbet (editor, #1)
Posted Jun 20, 2022 2:38 UTC (Mon)
by comex (subscriber, #71521)
Of course, that doesn't apply to C. But it may be possible to get a similar effect with manual annotations of some sort...
Posted Jun 20, 2022 7:46 UTC (Mon)
by Villemoes (subscriber, #91911)
Indeed, and for that reason it's really beyond me how the current code could have been merged without an explicit "depends on BROKEN". The series monkey-patched out some sanity-checking WARN_ONs that used function-pointer comparison, but there are places in the kernel that rely on function-pointer comparison for correctness, and because this CFI code breaks the semantics of comparing function pointers, that leads to crashes down the line. Which, of course, prevents an attacker from gaining control; the owner's control and use of the machine is just collateral damage we have to accept in the name of s3kurity.
Oh well, at least it's getting replaced by something saner now.
> When an indirect call is made, extra code is emitted to fetch and check this hash value prior to emitting the call itself
I don't quite understand what you are asking for here. The name of the called function isn't known at compile time, that's why it's an indirect call. So what name would you check against?
Checking the name
> It's a shame the standard doesn't let you write something like "void" for a zero-size argument.
That is essentially what the existing implementation does - that's what the jump tables contain. It requires link-time optimization to work, though, and I'm not sure what it buys over verification of the prototype.