Infinite loop language lawyering
Posted Jul 13, 2024 6:33 UTC (Sat)
by cpitrat (subscriber, #116459)
In reply to: Infinite loop language lawyering by mb
Parent article: New features in C++26
You forgot the "never inadvertently write some code that is UB", which is about as helpful as "do not write bugs".
The big problem with UB is not people willingly using it but people inadvertently being bitten by it.
Posted Jul 13, 2024 9:03 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
Or worse, people who tried to get it right and failed!
Cheers,
Wol
Posted Jul 13, 2024 17:28 UTC (Sat)
by mgb (guest, #3226)
[Link] (26 responses)
How bad would it be if C/C++ compilers issued warnings for all UB and all optimizations relying on "UB can't happen", except where, for a performance-critical piece of code, the authors wrapped it in a "Do not flag this I know what I'm doing and I'll stake my career on this" macro or pragma or whatever?
Posted Jul 13, 2024 18:07 UTC (Sat)
by mpr22 (subscriber, #60784)
[Link] (19 responses)
By "all UB", do you mean "UB that a static verifier can prove will definitely be triggered by code within the translation unit" or "UB that a static verifier can prove could be triggered if code outside the translation unit passed (in)appropriate arguments to a function inside the translation unit, or returned (in)appropriate values when called from inside the translation unit"?
Posted Jul 13, 2024 18:54 UTC (Sat)
by mgb (guest, #3226)
[Link] (18 responses)
How bad would it be if, every time a compiler actually optimized some C/C++ code based on detected UB, it issued a warning, unless the warning for that code had been specifically suppressed?
Posted Jul 13, 2024 19:58 UTC (Sat)
by atnot (subscriber, #124910)
[Link]
Compilers just apply rules that make assumptions, and the assumptions they're allowed to make, as per the standard, are that signed integers won't overflow, dereferenced pointers are valid, empty loops don't count as a side effect, etc.
You *could* warn every time such an assumption is made. But this would mean a[i]++ emits at least five warnings (pointer validity (and its many subtypes), alignment validity (and its subtypes), index validity, overflow of each part in base+(index*stride), overflow of increment).
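As a rough sketch of what that tally means in practice (the function and names are made up for illustration):
/* Hypothetical example: each comment marks an assumption the optimizer is
 * entitled to make, because violating it would be UB. */
void bump(int *a, int i)
{
    a[i]++;  /* a points to a live, suitably aligned int object      */
             /* i indexes within the object that a points into       */
             /* the address computation a + i stays in bounds        */
             /* the increment a[i] + 1 does not overflow int         */
}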
For a practical example, consider alignment. Many architectures require special or multiple, slower instructions to load unaligned values. By being able to assume all dereferenced pointers are sufficiently aligned, compilers can unconditionally emit the faster instructions instead. Since it is almost impossible to guarantee what the alignment will be (at minimum once you start calling out to foreign code like libc), this would mean warning on every pointer dereference in the entire program.
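As a hedged illustration of the alignment point (helper names invented for the example): the cast-based read below is only well-defined if the pointer happens to be suitably aligned (it also runs into strict aliasing), whereas the memcpy form is defined for any alignment.
#include <stdint.h>
#include <string.h>

/* May be compiled to a plain aligned load, because the compiler is allowed
 * to assume the cast produced a valid, aligned uint32_t pointer. */
uint32_t read_u32_cast(const unsigned char *buf, size_t off)
{
    return *(const uint32_t *)(buf + off);
}

/* Well-defined for any alignment; compilers typically lower the memcpy to
 * an unaligned-capable load where the target supports one. */
uint32_t read_u32_memcpy(const unsigned char *buf, size_t off)
{
    uint32_t v;
    memcpy(&v, buf + off, sizeof v);
    return v;
}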
There is pretty much only one way to solve this, which is to make it impossible to get invalid pointers from valid ones and require some special operation or blessing to turn unknown pointers into valid ones. Which is why most languages seeking to avoid this situation have ended up with that.
Posted Jul 13, 2024 20:00 UTC (Sat)
by farnz (subscriber, #17727)
[Link] (15 responses)
The problem is that the compiler does not optimize based on detected UB; rather, you have transforms (like "a * 5" turning into "a + (a << 2)", only often more complicated) which are only legal in combination on the assumption that the codebase does not contain UB. Because the combination of transforms is a performance improvement, the optimizer makes them all, and the resulting code no longer does what you expect.
And that mental model of "the compiler optimizes based on detected UB" is part of the communication gap - it's not that the compiler optimizes based on detected UB, but that the compiler optimizes on the assumption that there is no UB. The only way for the compiler to avoid depending on there being no UB is for the compiler to not optimize at all.
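For concreteness, a sketch of that individually-innocent transform, written out at source level (a real compiler performs it on its IR, where shifting a negative value is well defined):
/* What the programmer wrote. */
int times5(int a)
{
    return a * 5;
}

/* What the optimizer may effectively emit. Note that spelling it this way
 * in C source would itself be dodgy (left-shifting a negative value is not
 * well defined), which is exactly why such rewrites live inside the
 * compiler rather than in the code you write. */
int times5_by_hand(int a)
{
    return a + (a << 2);
}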
Posted Jul 14, 2024 12:10 UTC (Sun)
by kleptog (subscriber, #1183)
[Link] (3 responses)
Within their field the switching of the integrals works and is a valid optimisation and saves a lot of time. But if you tried to apply the same optimisation in other places you would get Undefined Behaviour.
This is usually not a problem, but if you see the amount of work required for computer proofs to actually validate all the assumptions, you can understand why compilers don't go to that effort.
Posted Jul 14, 2024 14:15 UTC (Sun)
by Wol (subscriber, #4433)
[Link] (2 responses)
AOL!
I was helping analyse truck arrival times recently, and my colleague was deferring to my ?superior? statistical knowledge. I waffled on about standard deviation and how it would give us an answer that was "good enough", but she picked up on a couple of anomalies and said "how can I present this to senior management?"
My answer, which went down okay as far as I could tell, was "we're using bell-curve analysis on a skewed distribution. It's the wrong maths, but it's good enough for what we're doing".
If you understand what you're doing is dodgy, and evaluate the risks, then it's okay. If you blindly apply formulae without understanding what those formulae do, you're going to screw up.
Cheers,
Wol
Posted Jul 15, 2024 1:20 UTC (Mon)
by himi (subscriber, #340)
[Link] (1 responses)
Compilers are at least operating in a relatively constrained domain, with well defined rules . . . except that strolling through the copious arguments that always crop up in any discussion that touches on language specification and compiler behaviour makes it pretty clear that "well defined rules" are something you need to be focusing on right from the start of your language design, rather than retrofitting to an already established language.
Posted Jul 15, 2024 8:29 UTC (Mon)
by farnz (subscriber, #17727)
[Link]
Even if you do focus on well-defined rules from day 1, the whole field of programming language semantics is relatively young, and it's not completely clear which formal semantic model gets you the best outcomes; we know that axiomatic semantics are hard for non-specialists to interpret correctly, but that's about it for things we can use to choose between formalisms. At the moment, it looks like operational semantics are the front-runner, since they lend themselves nicely to optimizations on the basis of the "as-if" rule, but it's not clear whether operational or denotational semantics are easier to transform to a form that's clear to non-specialists. And there's no point having "correct" formal semantics if they're only referenced by specialists in formal semantics (since that's a subset of compiler developers); we need the formal semantics to at least be transformable into something that's clear to a skilled programmer.
Posted Jul 15, 2024 14:20 UTC (Mon)
by Wol (subscriber, #4433)
[Link] (5 responses)
To my mind, this instantly splits optimisations into two sorts (and I'm sure you'll tell me things don't work that way :-) but there is (1) your a*5 example, where the calculation still takes place, just that the result is not what you expect.
Then there's (2), where constant-folding changes the program flow - for example the test for undefined behaviour that folds to false because the compiler assumes undefined behaviour can't happen.
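The canonical example of that second sort (a sketch; whether it actually folds depends on the compiler and on flags such as -fwrapv):
/* Intended as an overflow guard. Because signed overflow is UB, the
 * compiler may treat "x + 1 < x" as impossible, fold the condition to
 * false, and drop the branch - silently changing the program flow. */
int will_overflow(int x)
{
    if (x + 1 < x)
        return 1;
    return 0;
}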
How easy it would be to do, I don't know (as in, does the compiler even have the context to do so), but if we could have "Warning: the expression inside a conditional has been folded to true/false", that would pick up a lot of the problematic optimisations.
So long as we can avoid warning on stuff that's actually been written as "while(true)" or "if(false)", (and I've no doubt programmers write that stuff - I do) then it shouldn't be a problem, because if that's what programmers meant, they should have written it.
Cheers,
Wol
Posted Jul 15, 2024 14:48 UTC (Mon)
by farnz (subscriber, #17727)
[Link] (4 responses)
That runs into the problem of macro expansion, constexpr functions, inlining and more. while(1) is clearly intended to be an infinite loop, but from a C99 or C11 compiler's point of view, while(true) is the same as while(HOW_BIG_IS_THAT_FISH), since true is not a value, but a macro that expands to the value 1.
And the problem with your description is that the program doesn't know that the "test folded to false", it knows that there's a conditional that's always false - but it can't tell whether it's intentionally always false, or whether that's an accident. Compilers can't read developer minds, yet.
Remember that, for a lot of the key optimizations, it's not working at the level of the source language - it's working with something like LLVM IR, and what it's seeing is a conditional jump where the condition is a constant, but it has no clue that the condition was "meant" to be variable, since the optimizations aren't working at that level. And it could be hundreds of optimization passes in before the condition becomes constant, at which point it's really hard to work out whether it's constant because the compiler has assumed no UB, or whether it's constant because the fully-defined behaviour of the language makes it constant.
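To see why the spelling is unrecoverable, consider what the C99/C11 preprocessor hands to the compiler (macro name borrowed from the comment above):
#include <stdbool.h>            /* in C99/C11, true is a macro for 1 */
#define HOW_BIG_IS_THAT_FISH 1

void spin_a(void) { while (true) { } }                 /* becomes while (1) */
void spin_b(void) { while (HOW_BIG_IS_THAT_FISH) { } } /* also while (1)    */
/* After preprocessing, both loops are identical; only C23 turns true into
 * a keyword the compiler could, in principle, distinguish. */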
Posted Jul 15, 2024 15:55 UTC (Mon)
by Wol (subscriber, #4433)
[Link] (1 responses)
And the compiler doesn't track whether a constant expression was constant when first met, or a previous optimisation pass has folded it. And as you say macros screw things up.
I wasn't asking it to read minds - it was as simple as "did the programmer write "while(true)", or "while(a==b)"", but I'm not surprised you're saying even something that simple is hard to track. Because if the latter evaluates to a constant there's a good chance the programmer didn't mean it (but if that code was in a macro, then it might not always evaluate to a constant for the programmer who wrote the macro, but does for the programmer using the macro :-(
Cheers,
Wol
Posted Jul 15, 2024 17:27 UTC (Mon)
by farnz (subscriber, #17727)
[Link]
Unfortunately, it becomes impossible for the compiler to even determine what the programmer wrote at the point where it's interesting to know if a condition is constant, especially since defensive programming means that a condition might be known to be constant, even though it's variable at source level. Consider the following C function:
static int fetch_update_package_epoch(epochs *epoch_area) {
    if (epoch_area == NULL) {
        return 0;
    }
    epoch_area->package += 1;
    return epoch_area->package;
}
At source level, this is clearly not attempting to invoke UB, and there's no constant comparisons involved. As a result, by the time the optimizer is relying on assuming no UB, what it will be looking at (after optimizing the function's IR over many passes) is something like:
define dso_local i32 @fetch_update_package_epoch(ptr noundef %epoch_area) local_unnamed_addr {
entry:
  %cmp = icmp eq ptr %epoch_area, null
  br i1 %cmp, label %return, label %if.end
if.end:
  %0 = load i32, ptr %epoch_area, align 4
  %add = add nsw i32 %0, 1
  store i32 %add, ptr %epoch_area, align 4
  br label %return
return:
  %retval.0 = phi i32 [ %add, %if.end ], [ 0, %entry ]
  ret i32 %retval.0
}
This is short enough to inline; when it's inlined, the optimizer will replace a %call = call noundef i32 @fetch_update_package_epoch(epochs*)(ptr noundef nonnull %epochs) with the function body, making the right substitution for epoch_area. Now, because the call tells us that epoch_area is nonnull, the comparison becomes a constant, the branch can be made unconditional, and then dead code elimination removes the NULL check.
But the problem is that what's triggered the comparison becoming a constant is the compiler identifying that epoch_area in the call at IR level is nonnull. The optimizer can't warn you that it's eliding the NULL check based on this, because it cannot distinguish "this is nonnull because otherwise there is UB in the program" from "this is nonnull because the source lines are epochs epoch_area = { .package = 1 }; pkg = fetch_update_package_epoch(&epoch_area);". Both of those are the same IR at this point - a call with a non-NULL annotation on the pointer - how should the optimizer know that one of those annotations is fine to ignore, but the other matters? And note that you also don't want to warn just because a pointer is marked as non-NULL because it'd be UB for it to be NULL, since that would cause you to warn on:
static int fetch_update_package_epoch(epochs *epoch_area) {
    epoch_area->package += 1;
    return epoch_area->package;
}
But it might be perfectly clear from the rest of the source code context that, in this case, epoch_area will never be NULL, because no function calls this with a possibly-NULL pointer.
Posted Jul 15, 2024 19:35 UTC (Mon)
by unilynx (guest, #114305)
[Link] (1 responses)
eslint has a nice fix for this exact situation in JavaScript - you can configure it to require for(;;) to build an endless loop. No constant folding happens, so no confusion.
Posted Jul 15, 2024 19:54 UTC (Mon)
by farnz (subscriber, #17727)
[Link]
That only works because the output language is still JavaScript, with JavaScript semantics; the issue in this case is that the compiler has read in C (or Rust, or whatever) source code, translated it to an IR like GIMPLE or LLVM IR (with different semantics to the source language) on the basis that the translation only has UB in the IR if the source code had UB, optimized the IR on the assumption of no UB in the IR output, and then ended up with IR that it translates to machine code that has different semantics to the one the programmer intended, because the programmer did not intend to write code with UB.
Note, for example, that C++'s "forward progress" assumption also allows for endless loops, as long as they're "trivial", change state visible to another thread, or they contain I/O. It's only pure computational loops that can be elided if they do not terminate, and only if they're non-trivial in the original source - so for(;;) {} is trivial, and therefore cannot be elided, while for(;;) do_the_thing() is non-trivial, and can be elided if do_the_thing() has no effects visible outside the current thread. This is useful for compilers, since once it's able to show that do_the_thing() is a no-op, it can elide the whole loop - even if do_the_thing is actually a macro containing a break; statement.
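Echoing that distinction in code (a sketch; do_the_thing() stands in for whatever the loop body inlines down to):
static void do_the_thing(void) { /* imagine this optimizes away to nothing */ }

/* Trivial infinite loop: per the rules described above, it cannot be
 * elided. */
void spin_trivial(void)
{
    for (;;) { }
}

/* Non-trivial loop: once the compiler proves do_the_thing() has no effects
 * visible outside this thread, the whole loop may be elided. */
void spin_nontrivial(void)
{
    for (;;) do_the_thing();
}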
There is no real equivalent of this stack of languages in the JavaScript ecosystem, because the "machine code" you execute is also JavaScript.
Posted Jul 19, 2024 10:25 UTC (Fri)
by anton (subscriber, #25547)
[Link] (4 responses)
> transforms (like "a * 5" turning into "a + (a << 2)", only often more complicated) which are only legal in combination on the assumption that the codebase does not contain UB.
This is certainly a valid optimization with -fwrapv semantics. With -ftrapv semantics it's also valid AFAICS. So it's certainly not a transformation that is "only legal in combination on the assumption that the codebase does not contain UB."
Pro-UB advocates like to give the impression that basically every optimization requires the assumption that the program does not exercise undefined behaviour (e.g., they love to recommend -O0 for those people who do not want optimizations that assume the absence of UB), but in reality most optimizations do not require that assumption. Here we have an example where a pro-UB advocate wanted to give an example of an optimization that requires this assumption but failed.
Posted Jul 19, 2024 11:16 UTC (Fri)
by farnz (subscriber, #17727)
[Link] (3 responses)
It was not intended to be a transformation that is "only legal on the assumption that the codebase does not contain UB"; it was an example of a transform that is hopefully uncontroversial, in order to indicate just how much you're leaving on the table if you say "the code must behave exactly as written, and no optimizations are permitted in case a combination of optimizations result in UB doing something unexpected". In other words, that particular transform is an example of something that's barred if you say "use -O0" and yet that everyone reasonable agrees is one that should be permitted.
You even quoted the parts where I said that this wasn't an optimization that, on its own, depends on UB existing - I said "like … only often more complicated" to indicate that this transform is not itself controversial. I also said "transforms which are only legal in combination", because the problem is rarely transformations that rely on UB in their own right, but rather combinations of transforms, where each transform is a valid optimization on its own but the combination of 2 or more transforms (bearing in mind that even bad compilers have hundreds of optimizations they know how to perform) is only OK in the absence of UB.
And the problem is that the resulting combinatorial explosion - possible combinations of transforms, each valid in its own right, where the combination is itself only valid in the absence of UB - is a big space to search through. By defining more behaviours, you limit the allowed transformations and the allowed combinations of transforms, so that these combinations become compiler bugs, instead of bugs in your code.
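A hedged sketch of how two individually reasonable transforms can combine (shown at source level for readability; real compilers do this on IR, and the exact behaviour depends on compiler and flags):
#include <limits.h>

/* Transform 1: value-range propagation concludes that, since signed
 * overflow is assumed not to happen, y > x is always true.
 * Transform 2: dead-code elimination then removes the "unreachable"
 * fallback. Each step is defensible on its own; together they may delete
 * the guard the author was relying on for x == INT_MAX. */
int guarded_increment(int x)
{
    int y = x + 1;
    if (y > x)
        return y;
    return INT_MAX;
}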
Finally, I consider your name-calling defamatory; please desist from making false statements about me with the intention of making other people think worse of me.
Posted Jul 19, 2024 14:34 UTC (Fri)
by Wol (subscriber, #4433)
[Link] (2 responses)
It's a bit like a sequence of integer multiplies and divides. *I* may know that if I do them in the correct order, I will get the correct result without either over- or under-flow, but if some clever idiot loses my parentheses in the bowels of the compiler, then the computer will get it wrong. Equally, if I'm used to the compiler doing it left-to-right, and then a slight change makes the compiler do it right-to-left. It's the law of unintended consequences, writ large, and the continual flow of new optimisers writs it ever larger :-)
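To make the ordering concern concrete (a made-up example; whether a compiler may actually reassociate such expressions is exactly what's disputed below):
/* Assume the caller guarantees bytes is a non-negative multiple of 1024,
 * so both forms are mathematically identical. Only the first is safe for
 * large inputs, though: in the second, bytes * 3 can overflow, which is UB. */
int three_per_kib(int bytes)
{
    return bytes / 1024 * 3;     /* divide first: never overflows     */
}

int three_per_kib_reordered(int bytes)
{
    return bytes * 3 / 1024;     /* multiply first: may overflow int  */
}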
Hence the stream of people moaning about it as they get bitten out of the blue ...
Cheers,
Wol
Posted Jul 19, 2024 15:02 UTC (Fri)
by pizza (subscriber, #46)
[Link] (1 responses)
... but do you, really? Have you proven that your specified ordering does the right thing on all possible inputs?
(On a near-daily basis I have to deal with the consequences of arithmetic overflow and issues relating to integer precision)
> will get the correct result without either over- or under-flow, but if some clever idiot loses my parentheses in the bowels of the compiler, then the computer will get it wrong. Equally, if I'm used to the compiler doing it left-to-right, and then a slight change makes the compiler do it right-to-left. It's the law of unintended consequences,
This is a poor example; operator evaluation order (including explicit evaluation ordering signified by use of parentheses) is part of the core language specification, and compilers can't (and don't) muck with that.
Posted Jul 19, 2024 15:25 UTC (Fri)
by adobriyan (subscriber, #30858)
[Link]
What are you talking about? The evaluation order of "x = a + (b + c);" is "anything".
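To illustrate the distinction being circled here (hypothetical functions): the parentheses fix the grouping of the additions, but not the order in which the operands are evaluated.
extern int f(void), g(void), h(void);

int sum(void)
{
    /* The grouping is fixed: h()'s result is added to g()'s result, and
     * that sum is added to f()'s result. But f, g and h themselves may be
     * called in any order - operand evaluation order is unspecified. */
    return f() + (g() + h());
}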
Posted Jun 20, 2025 23:27 UTC (Fri)
by vonbrand (subscriber, #4458)
[Link]
Posted Jul 13, 2024 18:46 UTC (Sat)
by mb (subscriber, #50428)
[Link]
Compilers don't try to and fundamentally can't find most actual UB in programs.
They *assume* that there is no UB and then apply optimizations with that assumption as a base.
Compilers do *not* find UB and then exploit it to optimize the program.
They assume that there is no UB.
> "Do not flag this I know what I'm doing and I'll stake my career on this"
Use volatile and/or -O0.
Posted Jul 19, 2024 9:52 UTC (Fri)
by anton (subscriber, #25547)
[Link] (1 responses)
You are getting all kinds of excuses about the supposed impossibility or impracticality of such warnings. But if the C++ compiler writers actually wanted to provide such warnings, there are ways to do so. E.g., for the classic example where the compiler assumes that overflow does not happen, generate code without that assumption and with that assumption; if the code is different, output a warning. With tracking of where in the source code the assumptions about, e.g., value ranges are coming from, the warnings can also pinpoint that code. But of course the culprits among the compiler maintainers do not want to do that, because it would distract them from producing better benchmark results.
One interesting case is the idiom "if (x+1 <= x)". Someone has built recognition of this idiom into gcc, because gcc warns about it (or at least something similar). However, gcc does not optimize it into the shortest code "inc x; jno ..." when -fwrapv is enabled, nor does gcc optimize (x==LONG_MAX) into this code (at least when I last looked). So these are idioms that some gcc maintainer working on user friendliness recognizes and warns about, but the gcc maintainers working on benchmark results do not optimize them.
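Side by side, the idiom and its well-defined counterpart (a sketch; the generated code and any warnings depend on compiler version and flags):
#include <limits.h>

/* Only meaningful under wrapping semantics (e.g. -fwrapv); without them,
 * the overflow is UB and the comparison may be folded away. This is the
 * form gcc can recognize and warn about (or something similar), as
 * described above. */
int at_max_wrapping(long x)
{
    return x + 1 <= x;
}

/* Well-defined everywhere; per the comment above, gcc does not currently
 * compile either form down to the shortest inc/jno sequence. */
int at_max_portable(long x)
{
    return x == LONG_MAX;
}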
Posted Jul 19, 2024 15:06 UTC (Fri)
by atnot (subscriber, #124910)
[Link]
I mean, they are. Look at GCC -fanalyzer, or even much of the existing warning infrastructure, all of which uses much of the same machinery as the compiler and in the case of -fanalyzer also extra stuff like symbolic evaluation that the compiler normally won't do.
However, as you'll also discover reading the fanalyzer blog posts, the hard part is not finding possible UB; that's easy, too easy. It's making a tool that has sufficiently low false positives that anyone is actually willing to use it. Something that can reliably tell the difference between code that humans would consider "obviously correct" and code that is subtly wrong, across various contexts and in every stage of compilation from the source down to the final assembly. That's what people who ask for this actually want, and that is the thing that is nigh impossible in a language where right and wrong code look almost identical, but which people demand must do everything exactly as they ask, make it fast, output exactly the assembly instructions they expect (see your comment), and never break or change anything.
Posted Jul 19, 2024 14:24 UTC (Fri)
by Wol (subscriber, #4433)
[Link] (2 responses)
VERY bad, I think. Look at my discussions with farnz and pizza.
The problem basically can be summed up in two words - "macros" and "templates". In other words, a massive chunk of the code that actually gets to the compiler, wasn't written by the programmer. And a lot of that code probably relies on "UB won't happen".
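A small sketch of the kind of thing meant here (macro invented for the example): the same macro is a genuine test at one call site and a constant at another, so a naive "condition folded to a constant" warning would fire on code the caller never wrote.
#include <limits.h>

#define FITS_IN_INT(x) ((x) <= INT_MAX)

/* v promotes to int, so the comparison is always true and folds away -
 * the "suspicious" constant condition comes from the macro, not from the
 * caller's code. */
int check_uchar(unsigned char v) { return FITS_IN_INT(v); }

/* Here the comparison is a real runtime test (on typical LP64 targets). */
int check_ulong(unsigned long v) { return FITS_IN_INT(v); }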
So, to put it simply, you'll drown in false positives, most of them down to code that's nothing whatsoever to do with what you actually wrote. I couldn't agree more with what you want/wrote, I just don't think with modern compilers it's remotely practical. Half the problem is they keep adding more and more "cleverer" optimisers that bite mere mortals in the butt. You can turn off all the stupid optimisers, but can you even find them all? Especially as your definition of "stupid" probably disagrees with mine.
Cheers,
Wol
Posted Jul 19, 2024 14:53 UTC (Fri)
by pizza (subscriber, #46)
[Link]
Uh, yeah.. "-O0" turns off _everything_. And everything rolled into -O[n>0] can be individually selected.
...O0 also disables a lot of warnings because those rely on the optimization passes to detect questionable stuff.
> Especially as your definition of "stupid" probably disagrees with mine.
The definition of "stupid" also varies depending on the hardware and libc/operating system.
Posted Jul 19, 2024 17:54 UTC (Fri)
by farnz (subscriber, #17727)
[Link]
There's a second layer of trouble, too. A single optimization looks like "if these obligations are fulfilled by the current IR, then this IR transform is applicable and will improve the output"; the compiler will have some form of sequencing and limits on which individual optimizations it's willing to run, but it will otherwise repeatedly loop running the optimizations whose obligations are fulfilled until it gets to a point where no optimization is available or the compiler has hit its limits.
And the thing that hurts is not normally a single optimization; rather, it's that a sequence of optimizations have their obligations met, and the resulting output is not what was intended, because the compiler front-end gave you IR that fulfilled the obligations on the basis that the source code contained no UB, but you wanted a different behaviour.
It's worth noting, too, that a lot of the "exploiting UB" that compilers do comes from a position of wanting code to have "portable performance". For example, there's a lot of C code out there that uses int as the index into an array (instead of size_t, which is technically what you're "meant" to use); on systems where int and size_t are the same size, this is a complete non-issue, since if signed overflow is defined as two's complement wraparound, you'd still generate the same code for arr[start + offset] as you would if it was undefined behaviour (assuming a non-adversarial compiler). On a two's complement 32-bit machine, there's no non-adversarial code generation for arr[start + offset] that's better than the obvious "add start to offset, use indexed addressing with a shift to get to the right bit of memory".
But on an LP64 machine, it gets more interesting. If start and offset are both ints, then the addition is done in int precision, then it's supposed to be sign-extended to 64-bit (to match size_t), then the shift-and-add happens in 64-bit. If you define signed overflow as "what x86-64 instructions do", then this requires a sign extension; if you don't define it, then the compiler is allowed to elide the sign extension, and generate the equivalent instructions to the 32-bit version.
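The pattern under discussion, for reference (a sketch; the exact code generation is target- and compiler-dependent):
/* On an LP64 target, start + offset is computed as a 32-bit int. If signed
 * overflow is UB, the compiler may fold the widening into the addressing
 * arithmetic; if overflow were defined to wrap, it would have to emit a
 * separate sign-extension of the 32-bit sum before scaling it to 64 bits. */
int load(const int *arr, int start, int offset)
{
    return arr[start + offset];
}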
Now, it's very reasonable to argue about whether or not the code should just be fixed to use size_t instead, or to say that the performance hit of sign extension is a price worth paying to remove UB on signed overflow. But the current situation is that compilers have to depend on no signed overflow in order to elide the sign extension, and they want to do that because it saves a few %age points on a benchmark that compilers care about.
And once you've written an optimization that fixes this in LLVM IR terms, it starts to fire in places where, instead of being an unmitigated benefit (as in the indexing case), it's a surprise.
For the curious, the sort of optimization that I'd be expecting to see fire in this particular case would be looking for LLVM IR like:
  %add = add nsw i32 %7, %8
  %idxprom = sext i32 %add to i64
with the prerequisite for applicability being that the use of the output of the sext instruction is undefined if the output is negative. If that's the case, you can remove the sext instruction and replace %idxprom with %add in the rest of the scope. For indexing, that'll be fine, since you'll use getelementptr inbounds to do the array[offset] operation, and inbounds guarantees that array + offset is within the bounds of the allocation tied to array, so you know that if %idxprom is used as the offset in a getelementptr inbounds, it'll be UB if %idxprom < 0.
With that in place as an optimization, clang just needs to know that addition of signed integers should be add nsw, rather than plain add, since add is defined even in the case of overflow, and nsw modifies arithmetic instructions to say that signed overflow is undefined. But now you can get surprise cases that aren't the indexing case we were trying to optimize, where the compiler happens to know that the output use case is undefined (possibly because another optimization pass was able to directly use the output with a getelementptr inbounds, or similar) if the output is negative, and therefore this optimization fires, resulting in you being surprised.
And that's the sort of thing that leads to "UB is evil" situations; the optimizations individually all make sense, are well-motivated, and make the output better; but they combine in unexpected ways, and it's the combinations that hurt, and hurt badly.