
A return-oriented programming defense from OpenBSD


Posted Aug 31, 2017 8:34 UTC (Thu) by epa (subscriber, #39769)
In reply to: A return-oriented programming defense from OpenBSD by droundy
Parent article: A return-oriented programming defense from OpenBSD

Yes, a lookup table might perform better for large numbers of callers, whereas if there are just two possible values a couple of inline test-and-jumps would likely be faster. (And if there are a hundred possible callers but one of them is far more frequent than the others, it might still be faster to test for that one first.) It comes down to benchmarks and would likely vary by platform, and even between different CPUs on the same platform.
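
As a rough sketch of the two strategies in C (the function names and the caller-ID scheme are my own invention, not something any compiler emits today):

    /* Hypothetical: the callee is handed a small caller ID instead of a raw
     * return address and maps it back to a known return point.  With two
     * callers a compare-and-branch suffices; with many, a bounds-limited
     * table lookup scales better. */
    typedef void (*ret_target)(void);

    extern void ret_site_foo(void);    /* placeholder return points */
    extern void ret_site_bar(void);

    /* Few callers: inline test-and-jump. */
    static ret_target resolve_few(unsigned caller_id)
    {
        return (caller_id == 0) ? ret_site_foo : ret_site_bar;
    }

    /* Many callers: a single table lookup, masked to stay in bounds. */
    static const ret_target ret_table[] = { ret_site_foo, ret_site_bar };

    static ret_target resolve_many(unsigned caller_id)
    {
        return ret_table[caller_id % (sizeof ret_table / sizeof ret_table[0])];
    }

Where exactly the crossover between the two sits is the benchmark-and-platform question.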

If the program is not multithreaded, and static analysis shows that a function is not re-entrant, then its return address (or a number to look up the return address) does not need to be on the stack at all. It can be at a fixed location, making it harder for an attacker to overwrite, and saving some stack space too.
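
For instance (my own illustration, with invented names), the return token of a provably non-re-entrant function could live in ordinary static storage:

    #include <stdio.h>

    /* Sketch: log_once() is assumed never to be re-entered, so its caller ID
     * sits in a fixed static slot; the epilogue would dispatch on it to pick
     * the return site.  Overflowing a buffer in the function's own stack
     * frame cannot reach this slot. */
    static unsigned log_once_return_id;       /* fixed location, off the stack */

    void log_once(unsigned caller_id, const char *msg)
    {
        log_once_return_id = caller_id;       /* would otherwise be a stack slot */
        printf("%s\n", msg);                  /* body as usual */
    }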

Indeed, there's a case for saying that all re-entrant functions should have to be explicitly tagged as such by the programmer, since they need more care to write. If a function isn't tagged as re-entrant, and the compiler cannot statically prove that it can never be reached by some call chain starting from itself, that becomes a compile-time warning or error. On the other hand, if a function is never re-entrant, all its local variables can be allocated statically, away from the stack. Who knows, perhaps some optimizing compilers do this already.
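
Done by hand, that last transformation might look like this (hypothetical example; I don't know whether any compiler actually performs it):

    #include <string.h>

    /* parse_header() is assumed non-re-entrant, so its locals can be hoisted
     * out of the stack frame into static storage. */
    int parse_header(const char *line)
    {
        /* before: char buf[64]; size_t len;  -- both on the stack */
        static char buf[64];                  /* after: static, away from the stack */
        static size_t len;

        strncpy(buf, line, sizeof buf - 1);
        buf[sizeof buf - 1] = '\0';
        len = strlen(buf);
        return (int)len;
    }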



A return-oriented programming defense from OpenBSD

Posted Aug 31, 2017 9:31 UTC (Thu) by karkhaz (subscriber, #99844)

This is a cool idea, but I'm confused about two points.

1. You say
> and the compiler cannot statically prove that it can never be called by some chain starting from itself

AFAIK there will be plenty of situations where a function cannot in fact be called through a chain starting from itself, yet a static analysis cannot prove this, so the compiler will emit false positives (i.e. tell you that you need to tag the function when you actually don't). This is due to function pointers: static analyses typically over-approximate the set of concrete values a function pointer might hold at runtime, and analyses that track function pointer targets more precisely are typically very slow.
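
A tiny illustration of the kind of false positive I mean (all names invented for the example):

    #include <stdio.h>

    static int add_one(int x) { return x + 1; }
    static int step(int x);

    /* At run time this pointer only ever holds add_one, but its type also
     * matches step(), and step's address is taken below, so a conservative
     * points-to analysis must assume the indirect call in step() could reach
     * step() again -- a false positive for re-entrancy. */
    static int (*callback)(int) = add_one;

    static int step(int x)
    {
        return callback(x);               /* indirect call: targets over-approximated */
    }

    /* step's address escaping here is what poisons the analysis. */
    static int (*const table[])(int) = { add_one, step };

    int main(void)
    {
        (void)table;
        printf("%d\n", step(41));         /* prints 42; step never re-enters itself */
        return 0;
    }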

Note also that the analysis would need the entire program at its disposal to determine the values of function pointers, while here we're talking about the compiler (which only has access to a single translation unit). There are tools like CBMC [1] that have little trouble analyzing entire programs and serve as a drop-in replacement for GCC, and there has been some work [2] on the Clang Static Analyzer to enable it to analyze the entire program at once (still in discussion), but both are well beyond the capabilities of a regular compiler anyway. Alternatively, it could be done as a link-time optimization (well, a link-time analysis, but the line between analysis and optimization is quite thin), since at link time you have the whole program; I don't know what the exact capabilities of LLVM's and GCC's LTO are and whether this would be possible.
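
A sketch of why one translation unit isn't enough (again, an invented example): the cycle below only becomes visible when both files are in view, e.g. under LTO or a whole-program analyzer.

    /* a.c -- all the compiler sees when building this unit is an indirect
     * call whose possible targets it cannot enumerate. */
    void run(void (*cb)(void))
    {
        cb();
    }

    /* b.c -- only with both units in view does the chain
     * worker -> run -> worker become visible. */
    extern void run(void (*)(void));

    static int done;

    void worker(void)
    {
        if (done)
            return;
        done = 1;
        run(worker);                      /* worker indirectly re-enters itself */
    }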

2. I'm not sure what reentrancy has to do with this; would you mind elaborating?

[1] http://www.cprover.org/cbmc/
[2] https://reviews.llvm.org/D30691

A return-oriented programming defense from OpenBSD

Posted Aug 31, 2017 13:41 UTC (Thu) by epa (subscriber, #39769)

That's right, static analysis won't be able to show all cases where a function isn't re-entrant. The programmer would have to annotate some functions with a tag to quieten the warning and would presumably add a comment with a human-readable explanation (or proof) of why it's OK. Yes, analysis of the whole program as a unit is necessary -- don't compilers have whole-program optimization modes nowadays?
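
The tag might look something like this -- purely hypothetical, no compiler understands such an annotation today:

    #include <stddef.h>

    /* REENTRANT is an imaginary marker recognized only by the hypothetical
     * checker; here it's an empty macro so the example compiles. */
    #define REENTRANT

    struct node { struct node *left, *right; int value; };

    /* Genuinely re-entrant (it calls itself), so the programmer tags it,
     * accepting that it needs real stack frames and extra care. */
    REENTRANT
    int tree_sum(const struct node *n)
    {
        if (n == NULL)
            return 0;
        return n->value + tree_sum(n->left) + tree_sum(n->right);
    }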

After my first paragraph I went off on a separate idea which was that the stack could be avoided altogether if a function can only be executing 'once' -- it cannot call itself so it cannot appear twice in a call stack. In that case you can set aside a static area of memory, which is read-write of course, but is physically separate from the stack and so perhaps less likely to be trampled in a typical stack smashing attack. This might not really be a win, if it just means that memory trampling attacks against this static area become easier.

A return-oriented programming defense from OpenBSD

Posted Aug 31, 2017 10:16 UTC (Thu) by ibukanov (subscriber, #3942)

This suggests going back to the early Fortran model: no stack, but static locations for return addresses, unless the function is recursive.

