LWN: Comments on "A return-oriented programming defense from OpenBSD" https://lwn.net/Articles/732201/ This is a special feed containing comments posted to the individual LWN article titled "A return-oriented programming defense from OpenBSD". Sat, 30 Aug 2025 09:35:06 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/733889/ https://lwn.net/Articles/733889/ areilly <div class="FormattedComment"> The dual-stack method is also the basis of the Web Assembly ABI, which (via emscripten) is probably why it's in clang already. Coming (or already in) a web browser near you.<br> </div> Sat, 16 Sep 2017 22:16:54 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/733477/ https://lwn.net/Articles/733477/ excors <div class="FormattedComment"> RISC-V sounds similar to ARMv8 AArch64, where (if I understand correctly) the "BLR" instruction branches and stores the return address in the X30 (LR) register, and the "RET Xn" instruction returns to the address stored in some register, and that's the only proper way to call a function. 
A non-leaf function can preserve X30 however it wants; usually it will push/pop X30 on the stack associated with the slightly magic SP register, but the push/pop instructions (store-and-decrement/load-and-increment) can use any register as the index, so I think you could create a separate control stack for approximately zero cost (just one more reserved register) to keep the frame pointers and return addresses away from all the "char surely_this_is_big_enough[256];" buffers and other local variables.<br> <p> (ARMv7/AArch32 is similar but more confusing, because there are lots of mostly-deprecated ways of returning by using PC as a destination register, and you usually push/pop LR/PC in the same instruction as all the other registers you want to preserve (whereas AArch64 can only push/pop a pair of registers at once), and the Thumb instruction encoding has lots of limitations, and there are only half as many registers as AArch64, so a separate control stack might be significantly more expensive there.)<br> </div> Tue, 12 Sep 2017 18:32:17 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/733459/ https://lwn.net/Articles/733459/ zlynx <div class="FormattedComment"> I believe that *every* recent CPU instruction set does a separate return stack. Well, as far as I can tell RISC-V puts it in a register, then it is up to a function caller to save the "ra" register wherever it wants before making the call. It can save it to the stack, a different stack or a linked list, the processor doesn't care. 
Itanium was similar, return addresses were saved in registers, and the registers would overflow into a separate stack.<br> <p> If only people would stop relying on the x86 / amd64 ISA.<br> </div> Tue, 12 Sep 2017 16:20:42 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/733401/ https://lwn.net/Articles/733401/ wahern <p> From Table 1 of the <a href="http://dslab.epfl.ch/pubs/cpi.pdf">Code-Pointer Integrity (2014) paper</a>: </p> <p><pre>
                    Safe Stack    CPS      CPI
-------------------------------------------
Average (C/C++)   |    0.0%      1.9%     8.4%
Median (C/C++)    |    0.0%      0.4%     0.4%
Maximum (C/C++)   |    4.1%     17.2%    44.2%
-------------------------------------------
Average (C only)  |   -0.4%      1.2%     2.9%
Median (C only)   |   -0.3%      0.5%     0.7%
Maximum (C only)  |    4.1%     13.3%    16.3%
-------------------------------------------
Table 1: Summary of SPEC CPU2006 performance overheads.
</pre></p> <p> Safe Stack is the dual-stack mechanism. CPS (weak) and CPI (strong) are for dealing with function pointers in heap data. </p> Mon, 11 Sep 2017 18:40:05 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/733344/ https://lwn.net/Articles/733344/ itvirta <div class="FormattedComment"> <font class="QuotedText">&gt; The dual-stack approach was merged into Clang.</font><br> <p> So, if I got that right, they make another software controlled stack for variables/arrays that get their pointers taken,<br> and keep the rest in the usual hardware stack?<br> <p> <p> I've sometimes wondered why CPUs don't implement a separate, protected, stack for the CALL/RETURN instructions.<br> That should deal with all sorts of overwriting and adding return pointers, if the stack was made read-only or completely<br> inaccessible to other instructions. 
But I don't know if there would be some prohibitive cost to that.<br> <p> <p> <p> </div> Mon, 11 Sep 2017 13:29:32 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732585/ https://lwn.net/Articles/732585/ shemminger <div class="FormattedComment"> Windows also has a similar (and somewhat richer) set of ROP defenses:<br> <a href="https://msdn.microsoft.com/en-us/library/windows/desktop/mt637065">https://msdn.microsoft.com/en-us/library/windows/desktop/...</a>(v=vs.85).aspx<br> Though there are IP issues in using some of this in Linux.<br> <p> Also, there is some evidence that these are reducing the use of ROP in exploits:<br> <a href="https://www.endgame.com/blog/technical-blog/rop-dying-and-your-exploit-mitigations-are-life-support">https://www.endgame.com/blog/technical-blog/rop-dying-and...</a><br> </div> Thu, 31 Aug 2017 23:04:30 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732583/ https://lwn.net/Articles/732583/ thestinger <div class="FormattedComment"> The Clang implementation works well when paired with proper integration in libc, but glibc doesn't have that.<br> </div> Thu, 31 Aug 2017 22:18:46 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732560/ https://lwn.net/Articles/732560/ nix <div class="FormattedComment"> Intel has a similar thing they're calling 'shadow stacks'.<br> </div> Thu, 31 Aug 2017 19:17:13 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732559/ https://lwn.net/Articles/732559/ nix <div class="FormattedComment"> It would only be viable for static functions whose addresses are not leaked (whether dlsym() counts as such a leak is questionable). 
Simply taking the function's address is probably enough to invalidate it, particularly given the existence of things like register_printf_function(), or, heck, atexit().<br> </div> Thu, 31 Aug 2017 19:15:48 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732557/ https://lwn.net/Articles/732557/ wahern <div class="FormattedComment"> The dual-stack approach was merged into Clang.<br> <p> <a href="http://dslab.epfl.ch/proj/cpi/">http://dslab.epfl.ch/proj/cpi/</a><br> <a href="http://clang.llvm.org/docs/SafeStack.html">http://clang.llvm.org/docs/SafeStack.html</a><br> <p> Though IIUC there are still some unresolved integration and interoperability issues involving threading, dynamic memory, etc, that make SafeStack undesirable to use long-term until those issues are resolved.<br> <p> </div> Thu, 31 Aug 2017 18:07:37 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732541/ https://lwn.net/Articles/732541/ mathstuf <div class="FormattedComment"> Wouldn't this break the ABI and `dlsym` looking up and using that function? Or would this approach only be viable for static functions or functions going into executables?<br> </div> Thu, 31 Aug 2017 15:26:11 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732537/ https://lwn.net/Articles/732537/ mathstuf <div class="FormattedComment"> There's also the way the Mill CPU is doing it where the CPU manages the call stack pointers for you and there's no access to them (except presumably through the debugger APIs, but I assume even that is read-only). Kind of like the split stack, but instead, it is just the way the CPU works.<br> </div> Thu, 31 Aug 2017 15:24:15 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732534/ https://lwn.net/Articles/732534/ epa <div class="FormattedComment"> That's right, static analysis won't be able to show all cases where a function isn't re-entrant. 
The programmer would have to annotate some functions with a tag to quieten the warning and would presumably add a comment with a human-readable explanation (or proof) of why it's OK. Yes, analysis of the whole program as a unit is necessary -- don't compilers have whole-program optimization modes nowadays?<br> <p> After my first paragraph I went off on a separate idea which was that the stack could be avoided altogether if a function can only be executing 'once' -- it cannot call itself so it cannot appear twice in a call stack. In that case you can set aside a static area of memory, which is read-write of course, but is physically separate from the stack and so perhaps less likely to be trampled in a typical stack smashing attack. This might not really be a win, if it just means that memory trampling attacks against this static area become easier.<br> </div> Thu, 31 Aug 2017 13:41:57 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732510/ https://lwn.net/Articles/732510/ mjthayer <div class="FormattedComment"> <font class="QuotedText">&gt; This still allows overwriting of the return address of the caller of the current function.</font><br> <p> Yes, I realise that. It is at least harder though as you have to avoid overwriting your own in the process.<br> </div> Thu, 31 Aug 2017 12:07:12 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732508/ https://lwn.net/Articles/732508/ sorokin <div class="FormattedComment"> Lots of different solutions were proposed in comments.<br> <p> 1. Xoring the return address stored in stack. (article)<br> 2. Allowing a function to return only to a finite number of places using an array of possible return addresses. (epa)<br> 3. Keeping return address in a global variable in the case the function is non-recursive. (epa)<br> 4. Duplicate a return address at bottom of stack frame. 
(michaeljt)<br> <p> Just for completeness I would like to say that another possible solution would be having two separate stacks: one for return addresses and for variables whose address is not taken, and another for variables whose address is taken.<br> <p> </div> Thu, 31 Aug 2017 12:04:22 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732509/ https://lwn.net/Articles/732509/ sorokin <div class="FormattedComment"> This still allows overwriting of the return address of the caller of the current function.<br> </div> Thu, 31 Aug 2017 12:00:03 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732501/ https://lwn.net/Articles/732501/ ibukanov <div class="FormattedComment"> This suggests going back to the early Fortran model, with no stack but static locations for return addresses unless the function is recursive.<br> </div> Thu, 31 Aug 2017 10:16:23 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732496/ https://lwn.net/Articles/732496/ karkhaz <div class="FormattedComment"> This is a cool idea, but I'm confused about two points.<br> <p> 1. You say<br> <font class="QuotedText">&gt; and the compiler cannot statically prove that it can never be called by some chain starting from itself</font><br> <p> AFAIK there will be plenty of situations where a function cannot be called through a chain starting from itself, but a static analysis cannot prove this, so the compiler will emit false positives (i.e. tell you that you need to tag the function when you should not). This is due to function pointers; static analyses typically over-approximate what concrete values function pointers might have at runtime. 
Analyses that have a more precise idea about function pointer addresses are typically very slow.<br> <p> Note also that the analysis would need to have the entire program at its disposal to determine the values of function pointers, while here we're talking about the compiler (which only has access to a single translation unit). There are tools like CBMC [1] that don't have much trouble with analyzing entire programs and serve as a drop-in replacement to GCC, and there has been some work [2] to the Clang Static Analyzer that would enable it to analyze the entire program at once (still in discussion), but these are both way beyond the capabilities of a regular compiler anyway. Alternatively, it could be done as a link-time optimization (well, it would be a link-time analysis, but the line between analysis and optimization is quite fine), as at link-time you have the whole program, though I don't know what the exact capabilities of LLVM and GCC's LTO are and whether this would be possible.<br> <p> 2. I'm not sure what reentrancy has to do with this, would you mind elaborating?<br> <p> [1] <a href="http://www.cprover.org/cbmc/">http://www.cprover.org/cbmc/</a><br> [2] <a href="https://reviews.llvm.org/D30691">https://reviews.llvm.org/D30691</a><br> </div> Thu, 31 Aug 2017 09:31:50 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732492/ https://lwn.net/Articles/732492/ epa <div class="FormattedComment"> Yes, a lookup table might perform better for large numbers of callers, while if there are just two possible values a couple of inline test-and-jumps would likely be faster. (And if there are a hundred possible callers but one of them is way more frequent than the others, it might still be faster to test for that one first.) 
It depends on benchmarks and would likely vary by platform and even by different CPUs in the same platform.<br> <p> If the program is not multithreaded, and static analysis shows that a function is not re-entrant, then its return address (or a number to look up the return address) does not need to be on the stack at all. It can be at a fixed location, making it harder for an attacker to overwrite, and saving some stack space too.<br> <p> Indeed, there's a case for saying that all re-entrant functions should need to be explicitly tagged as such by the programmer, since they need more care in writing. If the function isn't tagged as re-entrant, and the compiler cannot statically prove that it can never be called by some chain starting from itself, then that's a compile-time warning or error. On the other hand, if a function is never re-entrant, all its local variables can be allocated statically away from the stack. Who knows, perhaps some optimizing compilers do this already.<br> </div> Thu, 31 Aug 2017 08:34:37 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732472/ https://lwn.net/Articles/732472/ droundy <div class="FormattedComment"> Nice idea! 
Presumably you'd want this to use a simple lookup table (with a bounds check), in which case the performance wouldn't be particularly affected by a large number of callers.<br> </div> Thu, 31 Aug 2017 00:12:12 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732440/ https://lwn.net/Articles/732440/ mjthayer <div class="FormattedComment"> I wondered recently about how useful copying the return address to a local variable at the bottom (address-wise) of the frame and comparing them before returning would be as a defence technique.<br> </div> Wed, 30 Aug 2017 20:15:30 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732371/ https://lwn.net/Articles/732371/ epa <div class="FormattedComment"> In most cases a function is called from only a finite number of places. If the number of such places is less than 256, then instead of storing a return address on the stack you could store a byte. At the end of the function the compiler generates code to jump to one of several possible addresses depending on the value of that byte. Then no matter what the attacker overwrites he can't jump to a gadget of his choosing, only to a small number of points that call the function in normal use anyway. And if there are only 5 possible return points but the byte is overwritten with 99, that's a clean crash ('stack corruption detected') rather than a jump to some random address. The stack becomes smaller too. (Performance would suffer if there are loads of possible return points and the compiler generates an endless if-then-goto-else chain, but you could set an upper limit or disable the mechanism for performance-critical code, which can then be audited more carefully than usual.)<br> <p> Here I am thinking of the program as a single lump of object code which is compiled and linked in its entirety before running. 
Obviously if you have loadable modules or you are writing a dynamically linked library you have to allow more flexibility in return addresses.<br> </div> Wed, 30 Aug 2017 13:58:09 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732361/ https://lwn.net/Articles/732361/ roc <div class="FormattedComment"> I don't think this would interfere with the return stack buffer. Probably the RSB just predicts the return address using an internal stack and verifies that the return address, when popped, matches its prediction. It doesn't matter whether the memory containing the return address was mangled temporarily.<br> </div> Wed, 30 Aug 2017 11:21:55 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732357/ https://lwn.net/Articles/732357/ alonz Note that the scheme used in RAP has a different trade-off - RAP needs two extra registers (which probably hurts performance) but it also doesn't alter the return address itself. Modifying the return address at runtime, the way this technique does, will invalidate the processor's branch prediction and hurt performance. Wed, 30 Aug 2017 09:04:06 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732353/ https://lwn.net/Articles/732353/ PaXTeam <div class="FormattedComment"> it has nothing to do with RAP and it's not quite that novel (and secure) either: <a rel="nofollow" href="https://twitter.com/grsecurity/status/899294869105106944">https://twitter.com/grsecurity/status/899294869105106944</a> and <a rel="nofollow" href="https://pax.grsecurity.net/docs/pax-future.txt">https://pax.grsecurity.net/docs/pax-future.txt</a> . 
<br> </div> Wed, 30 Aug 2017 06:54:59 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732351/ https://lwn.net/Articles/732351/ josh <div class="FormattedComment"> Or if the stack address is reasonably predictable.<br> </div> Wed, 30 Aug 2017 05:06:00 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732348/ https://lwn.net/Articles/732348/ luto <div class="FormattedComment"> I wonder how easy this is to defeat if the stack address is already known.<br> </div> Wed, 30 Aug 2017 04:18:45 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732339/ https://lwn.net/Articles/732339/ smurf <div class="FormattedComment"> Probably not; RAP is considerably more involved (and causes a lot more slowdown).<br> Using the stack pointer itself to encode the return address is an interesting idea that should help thwart such attacks with significantly lower overhead.<br> </div> Wed, 30 Aug 2017 03:26:36 +0000 A return-oriented programming defense from OpenBSD https://lwn.net/Articles/732336/ https://lwn.net/Articles/732336/ pabs <div class="FormattedComment"> That sounds similar to but not the same as the equivalent protection in RAP from grsecurity, I wonder if the OpenBSD implementation was influenced by grsecurity's patent on RAP.<br> <p> <a href="https://www.grsecurity.net/rap_announce.php">https://www.grsecurity.net/rap_announce.php</a><br> <a href="https://www.grsecurity.net/rap_faq.php">https://www.grsecurity.net/rap_faq.php</a><br> </div> Wed, 30 Aug 2017 02:25:28 +0000
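The idea several comments in this thread refer to (XORing the saved return address with the stack pointer, item 1 in sorokin's summary) can be sketched as a small C simulation. This is only an illustration under stated assumptions, not OpenBSD's actual implementation: the real mechanism would be emitted by the compiler in function prologues and epilogues, and the names `rg_encode`, `rg_decode`, `rg_demo` and all addresses below are hypothetical. The last check illustrates the concern raised by luto and josh: if the attacker knows the stack address, a pre-encoded value decodes to the chosen gadget.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical prologue step: hide the return address by XORing it
 * with the stack pointer value at function entry. */
static uintptr_t rg_encode(uintptr_t ret_addr, uintptr_t sp)
{
    return ret_addr ^ sp;
}

/* Hypothetical epilogue step: XOR again with the same stack pointer
 * to recover the address the return instruction will jump to. */
static uintptr_t rg_decode(uintptr_t slot, uintptr_t sp)
{
    return slot ^ sp;
}

/* Simulates three scenarios with made-up addresses; returns 0 when
 * every scenario behaves as described in the comments. */
static int rg_demo(void)
{
    const uintptr_t sp     = (uintptr_t)0x7ffd1000u; /* simulated stack pointer */
    const uintptr_t ret    = (uintptr_t)0x00401234u; /* legitimate return address */
    const uintptr_t gadget = (uintptr_t)0x00405678u; /* attacker's target */

    /* Normal call and return: the round trip restores the address. */
    assert(rg_decode(rg_encode(ret, sp), sp) == ret);

    /* Slot overwritten with a raw gadget address: decoding scrambles
     * it, so control does not reach the gadget. */
    assert(rg_decode(gadget, sp) != gadget);

    /* Attacker who knows sp writes a pre-encoded value: the decoded
     * address is the gadget, so the scheme is bypassed. */
    assert(rg_decode(gadget ^ sp, sp) == gadget);

    return 0;
}
```

Calling `rg_demo()` from a `main` function runs all three checks. The XOR is its own inverse, which is why the same operation serves as both the encode and decode step in this sketch.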