Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack)
Posted Nov 8, 2023 10:10 UTC (Wed)
by anton (subscriber, #25547)
In reply to: Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) by farnz
Parent article: Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack)
It's actually only one program in SPECint 2006 where -fwrapv costs 7.2% with gcc (as reported by Wang et al.); and the performance benefit of the default -fno-wrapv can alternatively be achieved by changing the type of one variable in that program.
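For concreteness, here is a minimal sketch (hypothetical code, not the actual SPEC source) of the kind of loop where -fwrapv hurts, and of the one-variable fix:

    /* With signed overflow undefined, gcc may assume i * stride + j
       never wraps, keep the index in a 64-bit register, and vectorize.
       Under -fwrapv it must preserve 32-bit wraparound semantics,
       which can block those transformations. */
    void scale(float *a, int stride, int n)
    {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < stride; j++)
                a[i * stride + j] *= 2.0f;
    }
    /* Widening the index computation, e.g. a[(long)i * stride + j],
       restores the optimization even under -fwrapv -- the kind of
       one-variable change mentioned above. */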
You write about people and culture; I write about compiler maintainers and attitude. My hypothesis is that this attitude comes from the compiler maintainers (in particular those working on optimization) evaluating their work by running their compilers on benchmarks and seeing how an optimization affects the performance. The fewer guarantees they give, the more they can optimize, so they embrace UB. And they have produced advocacy for their position.
Admittedly, there are also other motivations for some people to embrace UB:
- Language lawyering, because UB gives them something that distinguishes them from ordinary programmers.
- Elitism: The harder C programming is (and the more UB, the harder it is), the more they feel part of an elite. Everyone who does not want it that hard should switch to a nerfed language like Java or, better yet, get out of programming altogether.
- Compiler supremacy: The compiler always knows best and magically optimizes programs beyond what human programmers can do. So if leaving behaviour undefined reduces the ways in which a programmer can express something, that's only for the best: the compiler will more than make up for any resulting slowdown by being able to optimize better. After all (and this is where the compiler writers' advocacy kicks in), UB is the source of optimization, and without as much UB as C has, you might as well use -O0.
- It's free: There are those who have not experienced the compiler changing the behaviour of their program based on the assumption that the program does not exercise UB (or have not noticed it yet, in cases where the compiler optimizes away a bounds check or the code that erases a secret; see the sketch below). They can fall for the compiler writers' advocacy that UB gives them performance for free. The cost of having to "sanitize" their programs (hunting down and eliminating UB) is not yet obvious to them.
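To make the last point concrete, here is a hedged sketch of both well-documented patterns (hypothetical code):

    #include <string.h>

    /* Erasing a secret: the buffer's lifetime ends right after the
       memset, so the compiler may treat it as a dead store and remove
       it, leaving the secret in memory (the reason memset_s and
       explicit_bzero exist). */
    void handle_password(void)
    {
        char pw[64];
        /* ... obtain and use the password ... */
        memset(pw, 0, sizeof pw);   /* may be optimized away */
    }

    /* A bounds check expressed via UB: pointer arithmetic past the
       end of an object is undefined, so the compiler may assume
       p + len cannot wrap and delete the first test entirely. */
    int valid_range(const char *p, size_t len, const char *end)
    {
        if (p + len < p)            /* "cannot happen" -> removed */
            return 0;
        return p + len <= end;
    }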
Posted Nov 8, 2023 10:57 UTC (Wed)
by farnz (subscriber, #17727)
What I see, however, when I look at what the C standards committee is doing, is that the people who drive the C and C++ languages forward are dominated by those who embrace UB. This is largely because "make it UB, and you can define it if you really want to" is the committee's go-to answer for handling conflicts between compiler vendors: if there are at least two good ways to define a behaviour (e.g. arithmetic overflow, where the human-friendly definitions are "error", "saturate", and "wrap"), and two compiler vendors refuse to agree because either definition makes the output code less optimal on one of the two compilers, the committee punts on the decision.
And it's not just the compiler maintainers and the standards committees at fault here; both GCC and Clang provide flags that give human-friendly defined behaviour to things that the standard leaves undefined (-fwrapv, -ftrapv, -fno-delete-null-pointer-checks, -fno-lifetime-dse, -fno-strict-aliasing and more). Users of these compilers could insist on using these flags, and simply state that if you don't use the flags that define previously undefined behaviour, then you're Holding It Wrong - but they don't.
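As an illustration of what one of these flags defends against, a sketch of the well-known pattern behind -fno-delete-null-pointer-checks (hypothetical code; the pattern is the one made famous by a Linux kernel bug):

    struct dev { int flags; };

    int get_flags(const struct dev *d)
    {
        int f = d->flags;      /* dereference first ...                 */
        if (d == 0)            /* ... so the compiler infers d != NULL  */
            return -1;         /* and deletes this check, unless        */
        return f;              /* -fno-delete-null-pointer-checks is on */
    }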
Perhaps if you got (say) Debian and Fedora to change their default CFLAGS and CXXFLAGS to define behaviours that standard C and C++ leave undefined, I'd believe that yours was more than a minority view - but the dominant variants of both the C and C++ cultures don't do that.
Posted Nov 8, 2023 11:45 UTC (Wed)
by Wol (subscriber, #4433)
Arghhhh ....
The go-to answer SHOULD be "make it implementation-defined, and define a flag that is on by default". If other compilers don't choose to support that flag, that's down to them - it's an implementation-defined flag.
(And given that I get the impression this SPECint thingy uses UB, surely the compiler writers should simply optimise the program down to the null statement and say "here, we can run this benchmark in no time flat!" :-)
Cheers,
Wol
Posted Nov 8, 2023 12:38 UTC (Wed)
by farnz (subscriber, #17727)
The C standard (and the C++ standard) does not define flags that change the meaning of a program, because it is trying to define a single language; if something is implementation-defined, then the implementation gets to say exactly what that means for that implementation, including whether or not the implementation's definition can be changed by compiler flags.
There are four varieties of incompletely standardised behaviour for a standards-compliant program that do not require diagnostics in all cases (I'm ignoring problems that require a diagnostic, since if you ignore warnings from your compiler, that's on you):
- Implementation-defined behaviour: the implementation must choose a behaviour and document its choice (e.g. the sizes of the integer types).
- Locale-specific behaviour: behaviour that depends on local conventions, which the implementation documents (e.g. whether islower() returns true for characters other than the 26 lowercase Latin letters).
- Unspecified behaviour: the standard allows two or more possibilities and the implementation may use any of them, without documenting its choice (e.g. the order in which function arguments are evaluated).
- Undefined behaviour: the standard imposes no requirements whatsoever (e.g. signed integer overflow, out-of-bounds array access).
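A compact illustration (hypothetical snippet; the classifications are those given in the C standard):

    #include <ctype.h>

    static int g(void) { return 1; }
    static int h(void) { return 2; }
    static int f(int x, int y) { return x - y; }

    int examples(int a, int b)
    {
        int s = (int)sizeof(long);  /* implementation-defined: size of long */
        int l = islower(0xE9);      /* locale-specific: whether a non-ASCII
                                       character is lowercase depends on the
                                       current locale */
        int u = f(g(), h());        /* unspecified: g() and h() may be
                                       called in either order */
        return a + b + s + l + u;   /* undefined if the sum overflows */
    }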
And it has been the case in the past that programs have compiled down to the null statement because they always execute UB; the problem with the SPECint 2006 benchmark in question is that it is only conditionally UB under the C standard, and thus the compiler must produce the right result as long as UB does not happen, but can do anything if UB happens.
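The distinction, as a hypothetical sketch:

    /* Unconditional UB: every execution dereferences a null pointer,
       so a compiler may turn the whole function into anything at all
       (including nothing). */
    int always_ub(void)
    {
        int *p = 0;
        return *p;
    }

    /* Conditional UB: the compiler must return a[i] correctly for
       every i that is in bounds, but owes nothing when i is not. */
    int conditionally_ub(const int *a, int i)
    {
        return a[i];
    }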
Posted Nov 8, 2023 18:55 UTC (Wed)
by anton (subscriber, #25547)
What I see, however, when I look at what the C standards committee is doing, is that the people who drive the C and C++ languages forward are dominated by those who embrace UB. This is largely because "make it UB, and you can define it if you really want to" is the committee's go-to answer for handling conflicts between compiler vendors.

Yes, not defining something where existing practice diverges is the usual case in standardization. That's ok if you are aware that a standard is only a partial specification; programs written for compiler A would stop working (or require a flag to get the counter-standard behaviour) if the conflicting behaviour of compiler B were standardized. However, if that is the reason for non-standardization, a compiler vendor has a specific behaviour in mind that the programs of its customers actually rely on. For a case (e.g., signed overflow) where compiler vendors actually say that they consider programs that do this buggy and reject bug reports about such programs, compiler vendors do not have this excuse.

We certainly use all such flags that we can find. Not all of the following flags are for defining what the C standard leaves undefined, but for gcc-10 we use:

-fno-gcse -fcaller-saves -fno-defer-pop -fno-inline -fwrapv -fno-strict-aliasing -fno-cse-follow-jumps -fno-reorder-blocks -fno-reorder-blocks-and-partition -fno-toplevel-reorder -falign-labels=1 -falign-loops=1 -falign-jumps=1 -fno-delete-null-pointer-checks -fcf-protection=none -fno-tree-vectorize -pthread

I think both views are minority views, because most C programmers are relatively unaware of the issue. That's because the maintainers of gcc (and probably clang, too) preach one position but, from what I read, practice something much closer to my position: they check whether a release candidate actually builds all the Debian packages that use gcc, and whether these packages then pass some tests (probably their self-tests). I assume that they then fix the cases where a package does not work (otherwise, why do this checking? Also, Debian and other Linux distributions are unlikely to accept a gcc version that breaks many packages). This covers a lot of actual usage (including a lot of UB). However, it will probably not catch cases where a bounds check is optimized away, because the tests are unlikely to check for that.
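As a concrete instance of the signed-overflow case where bug reports get rejected, a hedged sketch (hypothetical code; the pattern comes from well-known gcc bug reports that were closed as invalid):

    #include <assert.h>

    void check(int a)
    {
        /* Under the default -fno-wrapv, gcc may fold this to
           assert(1): a + 100 cannot be <= a without signed overflow,
           which "cannot happen".  Under -fwrapv the test is compiled
           as written. */
        assert(a + 100 > a);
    }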
Posted Nov 9, 2023 11:23 UTC (Thu)
by farnz (subscriber, #17727)
Right, but you're a rarity. I see that big, visible users of C (Debian, Fedora, Ubuntu, SLES, RHEL, FreeRTOS, ThreadX and more) don't set default flags to define more behaviour in C, while at the committee level John Regehr (a researcher known for his work on undefined behaviour, who helped test the CompCert verified compiler) could not get agreement on the idea that the ~200 UBs in the C standard should each either be defined or require a diagnostic from the compiler. And compiler authors aren't pushing the "all UB should either be defined or diagnosed" idea, either.
So, for practical purposes, C users don't care enough to insist that their compilers define behaviours that the standard leaves undefined, nor do they care enough to insist that compilers provide diagnostics when their code could execute UB (and is thus, at best, only conditionally valid). The standards committee doesn't care either, and compiler authors aren't interested in providing diagnostics for all cases where a program contains UB.
From my point of view, this is a "C culture overall is fine with UB" situation - people have tried to get the C standard to define more behaviours, and the people in charge of the C standard said no. People have managed to get compilers to define a limited set of behaviours that the standard leaves undefined, and most C users simply ignore that option - heck, if C users cared, it would be possible to go through the Fedora Change Process or the Debian General Resolution process to have those flags set on by default for entire distributions, overruling the compiler maintainers. Given the complete lack of interest in either top-down (start at the standards body and work down) or bottom-up (get compiler writers to introduce flags, set them by default, and push for everyone to set them by default) fixes to the C language definition, what else should I conclude?
And note that in the comments to this article, we have someone who agrees that too much of C is UB, yet will not simply insist that people use the extant compiler flags and rely on the semantics they create - which is basically C culture's problem in a nutshell: we have a solution to part of the problem, but we complain instead of trying to push the world to a "better" place.