Undefined Behaviour as usual
Posted Mar 1, 2024 1:37 UTC (Fri) by pizza (subscriber, #46)
In reply to: Undefined Behaviour as usual by farnz
Parent article: Stenberg: DISPUTED, not REJECTED
Yes, I'd agree with this, and I don't think that it's a terribly controversial opinion.
> In other words, while this would be a nasty surprise, you wouldn't be surprised if recompiling the same UB-containing source resulted in a different binary
If you're recompiling with the same toolchain+options I'd expect the resulting binaries to behave damn near identically from one build to the next [1]. (Indeed, as software supply-chain attacks become more widespread, 100% binary reproducibility is a goal many folks are working towards)
> but you would be surprised if using the same binary in the same environment with the same inputs had a different output, unless it was doing something that is clearly specified to have non-deterministic behaviour by the hardware (like a data race, or RDRAND).
Yep. After all, the script kiddies wouldn't be able to do their thing unless a given binary on a given platform demonstrated pretty consistent behavior.
[1] The main difference would be the linker possibly putting things in different places (especially if multiple build threads are involved), but that doesn't change the fundamental attack vector -- e.g. a buffer overflow that smashes your stack on one build (and/or platform) will still smash your stack on another, but since the binary layout is different, you'll likely need to adjust your attack payload to achieve the results you want. Similarly, data-leakage-type attacks (e.g. Heartbleed) usually rely on being able to repeat the attack with impunity until something "interesting" is eventually found.
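To make that concrete, here's a hypothetical fragment of the sort of overflow I mean (function and buffer names are made up):

    #include <string.h>

    /* Hypothetical sketch: copying attacker-controlled input into a
     * fixed-size stack buffer with no bounds check.  Overrunning buf
     * is UB in C; on typical implementations it overwrites adjacent
     * stack contents, including the return address -- the classic
     * stack smash.  Where exactly things land depends on the binary
     * layout, which is why the payload needs per-build adjustment. */
    void handle_packet(const char *attacker_controlled)
    {
        char buf[16];
        strcpy(buf, attacker_controlled);   /* no length check */
    }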
Posted Mar 1, 2024 10:04 UTC (Fri) by farnz (subscriber, #17727) [Link] (2 responses)
This is a potentially dangerous expectation in the presence of UB in the source code; there are optimizations that work by searching for a local maximum, and for fully-defined code (even unspecified behaviour, where there are multiple permissible outcomes) we know that there is only one maximum they can find. We use non-determinism in that search to speed it up, and with UB we run into the problem that there can be multiple maxima, all of which are locally the best option.
Because the search is non-deterministic, exactly which maximum we end up in for some UB cases is also non-deterministic. This does mean that aiming for 100% binary reproducibility has the nice side-effect of encouraging you to reduce UB - by removing UB, you make the search-type optimizations find the one and only optimal stable state every time, instead of choosing a different one each time.
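As a hypothetical illustration of "multiple maxima" (not from any real codebase):

    /* Reading an uninitialized local is UB, so "return 0" and
     * "return 1" are both permissible compilations of this function.
     * A non-deterministic optimization search can legitimately land
     * on either one from build to build of the same source. */
    int flaky(void)
    {
        int x;              /* never initialized */
        return (x == 0) ? 0 : 1;
    }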
And I'd agree that it's not terribly controversial to believe that a binary running in user mode has no UB - there's still non-deterministic behaviour (like the result of a data race between threads, or the output of RDRAND), and if your binary's behaviour is defined by something non-deterministic, it could end up in what the standards call unspecified behaviour. This is not universally true once you're in supervisor mode: on some platforms you can do things like drive power rails out-of-spec, at which point the CPU itself has UB since the logic no longer behaves digitally, and thus software UB can turn into well-defined binary behaviour that changes platform state such that the platform's behaviour is now unpredictable.
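For the data race case, a minimal sketch (pthreads; iteration counts are made up) of a binary whose output is defined by non-deterministic interleaving:

    #include <pthread.h>
    #include <stdio.h>

    /* Two threads increment a shared counter with no synchronization.
     * This is a data race (UB at the C source level); at the binary
     * level it's simply non-deterministic -- the same binary, same
     * inputs, can print a different total on every run. */
    static int counter;

    static void *bump(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++)
            counter++;          /* unsynchronized read-modify-write */
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, bump, NULL);
        pthread_create(&b, NULL, bump, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("%d\n", counter);    /* often less than 200000 */
        return 0;
    }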
Posted Mar 1, 2024 14:48 UTC (Fri) by pizza (subscriber, #46) [Link] (1 responses)
FWIW I've seen my fair share of shenanigans caused by nondeterministic compiler/linker behavior. To this day, there's one family of targets in the CI system that yields a final binary that varies by about 3KB from one build to the next, depending on which non-identical build host was used to cross-compile it. I suspect that is entirely due to the number of cores used in the highly parallel build; I've never seen any variance from back-to-back builds on the same machine (the binaries are identical except for the intentionally-embedded buildinfo).
But I do understand what you're saying, and even agree -- but IME compilers are already very capable of loudly warning about the UB scenarios that can trigger what you described. Of course, folks are free to ignore/disable warnings, but I have no professional sympathy for them, or the consequences.
I've spent most of my career working in bare-metal/supervisory land, and yeah, even an off-by-one *read* could have some nasty consequences depending on which bus address it happens to hit. OTOH, while the behavior of that off-by-one read is truly unknowable from C's perspective, if the developer "corrects" the bug by incrementing the array size by one (thereby making it too large for the hw resource), the program is now "correct" from C's perspective, but will trigger the same nasty consequences on the actual hardware.
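Something like this hypothetical sketch (register count invented; assume the linker script places hw_regs at the peripheral's base address):

    #include <stdint.h>

    /* The device really has 4 registers; the declaration says so. */
    extern volatile uint32_t hw_regs[4];

    uint32_t sum_regs(void)
    {
        uint32_t sum = 0;
        for (int i = 0; i <= 4; i++)    /* off-by-one read: UB in C */
            sum += hw_regs[i];
        return sum;
    }

    /* "Fixing" this by declaring hw_regs[5] makes the loop perfectly
     * well-defined C, but the fifth word still maps to whatever sits
     * after the device on the bus -- same nasty consequence. */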
Posted Mar 1, 2024 16:34 UTC (Fri) by farnz (subscriber, #17727) [Link]
I spend a lot of time in the land of qualified compilers, where the compiler promises that as long as you stick to the subset of the language that's qualified, you can look only at the behaviours evident in the source code to determine what the binary will do. You're expected, if you're working in this land, to have proper code review and separate code audit processes so that a fix of the type you describe never makes it to production, since it's obvious from the source code that the program, while "correct" from C's perspective, is incorrect from a higher level perspective.
And a lot of the problems I see with the way UB is handled feel like people expect all compilers to behave like qualified compilers, not just on a subset of the language, but on everything, including UB.