Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack)
Posted Nov 8, 2023 10:10 UTC (Wed)
by anton (subscriber, #25547)
In reply to: Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) by farnz
Parent article: Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack)
It's actually only one program in SPECint 2006 where -fwrapv costs 7.2% with gcc (as reported by Wang et al.); and the performance benefit of the default -fno-wrapv can alternatively be achieved by changing the type of one variable in that program.
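For concreteness, here is a minimal sketch (hypothetical code, not the actual SPEC source) of the kind of loop where -fwrapv hurts, and of the one-variable fix:

    /* With signed overflow undefined, gcc may assume i * stride + j
       never wraps, keep the index in a 64-bit register, and vectorize.
       Under -fwrapv it must preserve 32-bit wraparound semantics,
       which can block those transformations. */
    void scale(float *a, int stride, int n)
    {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < stride; j++)
                a[i * stride + j] *= 2.0f;
    }
    /* Widening the index computation, e.g. a[(long)i * stride + j],
       restores the optimization even under -fwrapv -- the kind of
       one-variable change mentioned above. */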
You write about people and culture; I write about compiler maintainers and attitude. My hypothesis is that this attitude comes from the compiler maintainers (in particular those working on optimization) evaluating their work by running their compilers on benchmarks and seeing how an optimization affects the performance. The fewer guarantees they give, the more they can optimize, so they embrace UB. And they have produced advocacy for their position.
Admittedly, there are also other motivations for some people to embrace UB:
- Language lawyering, because UB gives them something that distinguishes them from ordinary programmers.
- Elitism: The harder C programming is (and the more UB, the harder it is), the more they feel part of an elite. Everyone who does not want it that hard should switch to a nerfed language like Java or, better yet, get out of programming altogether.
- Compiler supremacy: The compiler always knows best and magically optimizes programs beyond what human programmers can do. So if leaving behaviour undefined reduces the ways in which a programmer can express something, that's only for the best: the compiler will more than make up for any resulting slowdown by being able to optimize better. After all (and this is where the compiler writers' advocacy kicks in), UB is the source of optimization, and without as much UB as C has, you might as well use -O0.
- It's free: There are those who have not experienced the compiler changing the behaviour of their program based on the assumption that the program does not exercise UB (or have not noticed it yet, in cases where the compiler optimizes away a bounds check or the code that erases a secret; see the sketch below). They can fall for the compiler writers' advocacy that UB gives them performance for free. The cost of having to "sanitize" their programs (hunting down and eliminating UB) is not yet obvious to them.
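To make the last point concrete, here is a hedged sketch of both well-documented patterns (hypothetical code):

    #include <string.h>

    /* Erasing a secret: the buffer's lifetime ends right after the
       memset, so the compiler may treat it as a dead store and remove
       it, leaving the secret in memory (the reason memset_s and
       explicit_bzero exist). */
    void handle_password(void)
    {
        char pw[64];
        /* ... obtain and use the password ... */
        memset(pw, 0, sizeof pw);   /* may be optimized away */
    }

    /* A bounds check expressed via UB: pointer arithmetic past the
       end of an object is undefined, so the compiler may assume
       p + len cannot wrap and delete the first test entirely. */
    int valid_range(const char *p, size_t len, const char *end)
    {
        if (p + len < p)            /* "cannot happen" -> removed */
            return 0;
        return p + len <= end;
    }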
Posted Nov 8, 2023 10:57 UTC (Wed)
by farnz (subscriber, #17727)
What I see, however, when I look at what the C standards committee is doing, is that the people who drive the C and C++ languages forward are dominated by those who embrace UB. This is largely because "make it UB, and you can define it if you really want to" is the committee's go-to answer for handling conflicts between compiler vendors: if there are at least two good ways to define a behaviour (e.g. arithmetic overflow, where the human-friendly definitions are "error", "saturate", and "wrap"), and two compiler vendors refuse to agree because either definition makes the output code less optimal on one of the two compilers, the committee punts on the decision.
And it's not just the compiler maintainers and the standards committees at fault here; both GCC and Clang provide flags that give human-friendly defined behaviour to things that the standard leaves undefined (-fwrapv, -ftrapv, -fno-delete-null-pointer-checks, -fno-lifetime-dse, -fno-strict-aliasing and more). Users of these compilers could insist on using these flags, and simply state that if you don't use the flags that define previously undefined behaviour, then you're Holding It Wrong - but they don't.
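As an illustration of what one of these flags defends against, a sketch of the well-known pattern behind -fno-delete-null-pointer-checks (hypothetical code; the pattern is the one made famous by a Linux kernel bug):

    struct dev { int flags; };

    int get_flags(const struct dev *d)
    {
        int f = d->flags;      /* dereference first ...                 */
        if (d == 0)            /* ... so the compiler infers d != NULL  */
            return -1;         /* and deletes this check, unless        */
        return f;              /* -fno-delete-null-pointer-checks is on */
    }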
Perhaps if you got (say) Debian and Fedora to change their default CFLAGS and CXXFLAGS to define behaviours that standard C and C++ leave undefined, I'd believe that yours was more than a minority view - but the dominant variants of both the C and C++ cultures don't do that.
Posted Nov 8, 2023 11:45 UTC (Wed)
by Wol (subscriber, #4433)
Arghhhh ....
The go-to answer SHOULD be "make it implementation-defined, and define a flag that is on by default". If other compilers don't choose to support that flag, that's down to them - it's an implementation-defined flag.
(And given that I get the impression this SPECint thingy uses UB, surely the compiler writers should simply optimise the program down to the null statement and say "here, we can run this benchmark in no time flat!" :-)
Cheers,
Wol
Posted Nov 8, 2023 12:38 UTC (Wed)
by farnz (subscriber, #17727)
The C standard (and the C++ standard) does not define flags that change the meaning of a program, because it is trying to define a single language; if something is implementation-defined, then the implementation gets to say exactly what that means for that implementation, including whether or not the implementation's definition can be changed by compiler flags.
There are four varieties of incompletely standardised behaviour for a standards-compliant program that do not require diagnostics in all cases (I'm ignoring problems that require a diagnostic, since if you ignore warnings from your compiler, that's on you):
- Implementation-defined behaviour: the implementation must choose a behaviour and document its choice (e.g. the sizes of the integer types).
- Locale-specific behaviour: behaviour that depends on local conventions, which the implementation documents (e.g. whether islower() returns true for characters other than the 26 lowercase Latin letters).
- Unspecified behaviour: the standard allows two or more possibilities and the implementation may use any of them, without documenting its choice (e.g. the order in which function arguments are evaluated).
- Undefined behaviour: the standard imposes no requirements whatsoever (e.g. signed integer overflow, out-of-bounds array access).
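A compact illustration (hypothetical snippet; the classifications are those given in the C standard):

    #include <ctype.h>

    static int g(void) { return 1; }
    static int h(void) { return 2; }
    static int f(int x, int y) { return x - y; }

    int examples(int a, int b)
    {
        int s = (int)sizeof(long);  /* implementation-defined: size of long */
        int l = islower(0xE9);      /* locale-specific: whether a non-ASCII
                                       character is lowercase depends on the
                                       current locale */
        int u = f(g(), h());        /* unspecified: g() and h() may be
                                       called in either order */
        return a + b + s + l + u;   /* undefined if the sum overflows */
    }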
And it has been the case in the past that programs have compiled down to the null statement because they always execute UB; the problem with the SPECint 2006 benchmark in question is that it is only conditionally UB under the C standard, and thus the compiler must produce the right result as long as UB does not happen, but can do anything if UB happens.
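The distinction, as a hypothetical sketch:

    /* Unconditional UB: every execution dereferences a null pointer,
       so a compiler may turn the whole function into anything at all
       (including nothing). */
    int always_ub(void)
    {
        int *p = 0;
        return *p;
    }

    /* Conditional UB: the compiler must return a[i] correctly for
       every i that is in bounds, but owes nothing when i is not. */
    int conditionally_ub(const int *a, int i)
    {
        return a[i];
    }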
Posted Nov 8, 2023 18:55 UTC (Wed)
by anton (subscriber, #25547)
What I see, however, when I look at what the C standards committee is doing, is that the people who drive the C and C++ languages forward are dominated by those who embrace UB. This is largely because "make it UB, and you can define it if you really want to" is the committee's go-to answer for handling conflicts between compiler vendors.

Yes, not defining something where existing practice diverges is the usual case in standardization. That's ok if you are aware that a standard is only a partial specification; programs written for compiler A would stop working (or require a flag to get the counter-standard behaviour) if the conflicting behaviour of compiler B were standardized. However, if that is the reason for non-standardization, a compiler vendor has a specific behaviour in mind that the programs of its customers actually rely on. For a case (e.g., signed overflow) where compiler vendors actually say that they consider programs that do this buggy and reject bug reports about such programs, compiler vendors do not have this excuse.

We certainly use all such flags that we can find. Not all of the following flags are for defining what the C standard leaves undefined, but for gcc-10 we use:

-fno-gcse -fcaller-saves -fno-defer-pop -fno-inline -fwrapv -fno-strict-aliasing -fno-cse-follow-jumps -fno-reorder-blocks -fno-reorder-blocks-and-partition -fno-toplevel-reorder -falign-labels=1 -falign-loops=1 -falign-jumps=1 -fno-delete-null-pointer-checks -fcf-protection=none -fno-tree-vectorize -pthread

I think both views are minority views, because most C programmers are relatively unaware of the issue. That's because the maintainers of gcc (and probably clang, too) preach one position but, from what I read, practice something much closer to my position: they check whether a release candidate actually builds all the Debian packages that use gcc, and whether these packages then pass some tests (probably their self-tests). I assume that they then fix the cases where a package does not work (otherwise, why do this checking? Also, Debian and other Linux distributions are unlikely to accept a gcc version that breaks many packages). This covers a lot of actual usage (including a lot of UB). However, it will probably not catch cases where a bounds check is optimized away, because the tests are unlikely to check for that.
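As a concrete instance of the signed-overflow case where bug reports get rejected, a hedged sketch (hypothetical code; the pattern comes from well-known gcc bug reports that were closed as invalid):

    #include <assert.h>

    void check(int a)
    {
        /* Under the default -fno-wrapv, gcc may fold this to
           assert(1): a + 100 cannot be <= a without signed overflow,
           which "cannot happen".  Under -fwrapv the test is compiled
           as written. */
        assert(a + 100 > a);
    }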
Posted Nov 9, 2023 11:23 UTC (Thu)
by farnz (subscriber, #17727)
Right, but you're a rarity. I see that big, visible users of C (Debian, Fedora, Ubuntu, SLES, RHEL, FreeRTOS, ThreadX and more) don't set default flags to define more behaviour in C, while at the committee level John Regehr (a researcher known for his work on undefined behaviour, who helped test the CompCert verified compiler) could not get agreement on the idea that the ~200 UBs in the C standard should each either be defined or require a diagnostic from the compiler. And compiler authors aren't pushing the "all UB should either be defined or diagnosed" idea, either.
So, for practical purposes, C users don't care enough to insist that their compilers define behaviours that the standard leaves undefined, nor do they care enough to insist that compilers provide diagnostics when their code could execute UB (and is thus, at best, only conditionally valid). The standards committee doesn't care either, and compiler authors aren't interested in providing diagnostics for all cases where a program contains UB.
From my point of view, this is a "C culture overall is fine with UB" situation - people have tried to get the C standard to define more behaviours, and the people in charge of the C standard said no. People have managed to get compilers to define a limited set of behaviours that the standard leaves undefined, and most C users simply ignore that option - heck, if C users cared, it would be possible to go through the Fedora Change Process or the Debian General Resolution process to have those flags set on by default for entire distributions, overruling the compiler maintainers. Given the complete lack of interest in either top-down (start at the standards body and work down) or bottom-up (get compiler writers to introduce flags, set them by default, and push for everyone to set them by default) fixes to the C language definition, what else should I conclude?
And note that in the comments to this article, we have someone who agrees that too much of C is UB, yet will not simply insist that people use the extant compiler flags and rely on the semantics they create - which is basically C culture's problem in a nutshell: we have a solution to part of the problem, but we complain instead of trying to push the world to a "better" place.