LWN: Comments on "Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack)" https://lwn.net/Articles/949269/ This is a special feed containing comments posted to the individual LWN article titled "Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack)". en-us Tue, 30 Sep 2025 09:53:32 +0000 Tue, 30 Sep 2025 09:53:32 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/951747/ https://lwn.net/Articles/951747/ mathstuf <div class="FormattedComment"> The plan (AFAICT) seems to be to define rules that code must obey in order to satisfy profiles. Once the rules are known, language and library constructs can be analyzed to see whether they adhere to those rules, with violations called out by the compiler. While it will take time for code to come under a set of profiles that gives Rust-like safety, I think it is probably the most reasonable plan given the various constraints involved. With this, one can start putting code behind "we checked for property X and want the compiler to enforce it from here on out" until it can be said project-wide, and then start flipping the switch to say "we know we cannot adhere to property X for this function", getting something like Rust's `unsafe` "callout" that something special is going on here.<br> </div> Thu, 16 Nov 2023 20:14:16 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/951330/ https://lwn.net/Articles/951330/ mathstuf <div class="FormattedComment"> For "basic" operations, probably very few. What about vectorized operations? Are they consistently twos-complement? 
How about GPU and other specialized hardware?<br> </div> Tue, 14 Nov 2023 14:30:02 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/951262/ https://lwn.net/Articles/951262/ Kluge <div class="FormattedComment"> Any idea why that NSA document singled out Ruby from among the "scripting" programming languages (Python, Perl, Lua et al.)? Is Ruby actually more memory-safe?<br> </div> Mon, 13 Nov 2023 18:42:25 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/951257/ https://lwn.net/Articles/951257/ kreijack <div class="FormattedComment"> <span class="QuotedText">&gt; Is there anything that's been produced in the last 10 years that is /not/ twos-complement? (If "nothing" - in the last 20?)</span><br> <p> My understanding is that the ISO C++20 standard already mandates two's complement:<br> <p> If you look at <a rel="nofollow" href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm">https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm</a>, you can find more information, including an analysis of which processors are/were not "two's complement".<br> <p> It seems that C23 is following the same path.<br> <p> Anyway, I think the right question is which architecture supported by GCC (or Clang...) 
is/isn't two's complement.<br> <p> <p> <a rel="nofollow" href="https://en.wikipedia.org/wiki/C23_">https://en.wikipedia.org/wiki/C23_</a>(C_standard_revision)#cite_note-N2412-62<br> <a rel="nofollow" href="https://en.wikipedia.org/wiki/C%2B%2B20#cite_note-32">https://en.wikipedia.org/wiki/C%2B%2B20#cite_note-32</a><br> <a rel="nofollow" href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1236r0.html">https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/...</a><br> <p> </div> Mon, 13 Nov 2023 17:14:32 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/951212/ https://lwn.net/Articles/951212/ paulj <div class="FormattedComment"> Is there anything that's been produced in the last 10 years that is /not/ twos-complement? (If "nothing" - in the last 20?)<br> </div> Mon, 13 Nov 2023 13:13:19 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/951134/ https://lwn.net/Articles/951134/ linuxrocks123 <div class="FormattedComment"> It's pretty extensive, man. Unless you want to move the goalposts to the point where Java isn't "safe", it's safe.<br> <p> <span class="QuotedText">&gt; In this mode, by design, it is not allowed to call native code and access native memory. All memory is managed by the garbage collector, and all code that should be run needs to be compiled to bitcode.</span><br> <p> <span class="QuotedText">&gt; Pointer arithmetic is only possible to the extent allowed by the C standard. In particular, overflows are prevented, and it is not possible to access different allocations via out-of-bounds access. All such invalid accesses result in runtime exceptions rather than in undefined behavior.</span><br> <p> <span class="QuotedText">&gt; In managed mode, GraalVM simulates a virtual Linux/AMD64 operating system, with musl libc and libc++ as the C/C++ standard libraries. 
All code needs to be compiled for that system, and can then be used to run on any architecture or operating system supported by GraalVM. Syscalls are virtualized and routed through appropriate GraalVM APIs.</span><br> <p> <a rel="nofollow" href="https://www.graalvm.org/latest/reference-manual/llvm/NativeExecution/">https://www.graalvm.org/latest/reference-manual/llvm/Nati...</a><br> <p> I think this is a very cool project and would love to play with it sometime. Doing something like hooking this into the Gentoo build system and making a whole Linux distro where everything is compiled to have these safety characteristics would be interesting. Of course, it would be as slow as Java, so I wouldn't actually want to use such a system normally. But maybe I'd be willing to pay the cost with a web browser, or for server software exposed to the Internet.<br> </div> Sat, 11 Nov 2023 08:13:54 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/951028/ https://lwn.net/Articles/951028/ farnz <p>Now go and convince the Clang, GCC, Fedora or Debian maintainers that this should be the default state. That's the hard part - getting anyone whose decisions will influence the C standards body to declare that they want less UB, even at the expense of a few % of speed on some benchmarks. Fri, 10 Nov 2023 14:00:27 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/951024/ https://lwn.net/Articles/951024/ Wol <div class="FormattedComment"> Which is why my take is that - ON A 2S COMPLEMENT PROCESSOR - -fwrapv should be defined as the default. So by default, you get the expected behaviour.<br> <p> Then they can compile SPECInt with a flag that switches off fwrapv to give the old behaviour and say "you want speed? Here you are! 
But it's safe by default".<br> <p> So UB has now become hardware- or implementation-defined behaviour but the old behaviour is still available if you want it.<br> <p> Cheers,<br> Wol<br> </div> Fri, 10 Nov 2023 13:55:52 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950965/ https://lwn.net/Articles/950965/ foom <div class="FormattedComment"> <span class="QuotedText">&gt; And note that, by the nature of -fwrapv, every case where it regresses performance is one where the source code is already buggy</span><br> <p> Nope.<br> <p> The flag only affects the _results_ if the program previously exhibited UB, but it removes flexibility from the optimizer by requiring the result be wrapped. This may require additional conditions or less efficient code. If the more-optimal version didn't produce the correct result when the value wrapped, it cannot be used any more.<br> </div> Thu, 09 Nov 2023 23:15:11 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950872/ https://lwn.net/Articles/950872/ pizza <div class="FormattedComment"> <span class="QuotedText">&gt; As long as people who think along these lines are in charge of the C/C++ standards,</span><br> <p> That's a little disingenuous; it's not that they're "in charge of C/C++"; it's that there's a large contingent of *users* of C/C++ that ARE VERY VERY VOCAL about performance regressions in _existing_ code.<br> <p> There's a *lot* of existing C/C++ code out in the wild, representing a *lot* of users. And many of those users are pulling the C/C++ standards in mutually-incompatible ways.<br> <p> <p> </div> Thu, 09 Nov 2023 14:43:52 +0000 Addressing UB in C https://lwn.net/Articles/950811/ https://lwn.net/Articles/950811/ Tobu <p><a href="https://ssw.jku.at/General/Staff/ManuelRigger/ASPLOS18.pdf">Safe Sulong</a> does/did address classes of UB much more accurately than tools that rely on heuristics (like valgrind and sanitizers). 
You would use it with clang -O0 (though that does still have optimisations that would silence UB) and a toolchain that preserves LLVM IR, then try to find bugs at runtime. Kind of like what Miri does for Rust, possibly faster thanks to JIT (the meta-interpreter approach is similar to RPython). Sadly the code was never released (the perils of being an Oracle partnership?). At the time of the paper there was a lot of work left to reimplement libc functions; porting a libc to emulate at a lower level (instead of reimplementing every libc function) was considered but didn't happen. It also didn't allow pointer-integer roundtrips (<a href="https://faultlore.com/blah/tower-of-weakenings/">exposing provenance</a>), since it could garbage collect objects that didn't have a live pointer.</p> <p>Native Sulong (maybe with some of the relaxed rules from Lenient C) seems to be published in the GraalVM repo, but it doesn't tackle most UB, it just makes it easier to mix C/C++/Fortran… with GraalVM languages.</p> <p>And <a href="https://ssw.jku.at/General/Staff/ManuelRigger/ManLang17.pdf">Lenient C</a> is definitely interesting as well, as is the idea of relaxing UB rules based on what programmers seem to believe should work. Though extending allocations to live past the point where free is called (using the GC graph instead) feels maybe too lenient.</p> Thu, 09 Nov 2023 12:12:57 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950809/ https://lwn.net/Articles/950809/ farnz <p>Right, but you're a rarity. I see that big visible users of C (Debian, Fedora, Ubuntu, SLES, RHEL, FreeRTOS, ThreadX and more) don't set default flags to define more behaviour in C, while at the committee level John Regehr (part of the CompCert verified compiler project) could not get agreement on the idea that the ~200 UBs in the C standard should all either be defined or require a diagnostic from the compiler. 
And compiler authors aren't pushing on the "all UB should either be defined or diagnosed" idea, either. <p>So, for practical purposes, C users don't care enough to insist that their compilers define behaviours that the standard leaves undefined, nor do they care enough to insist that compilers must provide diagnostics when their code could execute UB (and thus is, at best, only conditionally valid). The standard committee doesn't care, either; and compiler authors aren't interested in providing diagnostics for all cases where a program contains UB. <p>From my point of view, this is a "C culture overall is fine with UB" situation - people have <em>tried</em> to get the C standard to define more behaviours, and the in charge of the C standard said no. People have managed to get compilers to define a limited set of behaviours that the standard leaves undefined, and most C users simply ignore that option - heck, if C users cared, it would be possible to go through the Fedora Change Process, or the Debian General Resolution process to have those flags set on by default for entire distributions, overruling the compiler maintainers. Given the complete lack of interest in either top-down (start at the standards body and work down) or bottom-up (get compiler writers to introduce flags, set them by default and push for everyone to set them by default) fixes to the C language definition, what else should I conclude? <p>And note that in the comments to this article, we have <a href="https://lwn.net/Articles/950087/">someone who agrees that too much of C is UB</a> saying that they'll not simply insist that people use the extant compiler flag and rely on the semantics that are created by that - which is basically C culture's problem in a nutshell; we have a solution to part of the problem, but we're going to complain instead of trying to push the world to a "better" place. 
Thu, 09 Nov 2023 11:23:03 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950724/ https://lwn.net/Articles/950724/ anton <blockquote> What I see, however, when I look at what the C standards committee is doing, is that the people who drive C and C++ the languages forwards are dominated by those who are embracing UB; this is largely because "make it UB, and you can define it if you really want to" is the committee's go-to answer to handling conflicts between compiler vendors </blockquote> Yes, not defining something where existing practice diverges is the usual case in standardization. That's ok if you are aware that a standard is only a partial specification; programs written for compiler A would stop working (or require a flag to use counter-standard behaviour) if the conflicting behaviour of compiler B was standardized. However, if that is the reason for non-standardization, a compiler vendor has a specific behaviour in mind that the programs of its customers actually rely on. For a case (e.g., signed overflow) where compiler vendors actually say that they consider the programs that do this buggy and reject bug reports about such programs, compiler vendors do not have this excuse. <p>We certainly use all such flags that we can find. Not all of the following flags are for defining what the C standard leaves undefined, but for gcc-10 we use: <code>-fno-gcse -fcaller-saves -fno-defer-pop -fno-inline -fwrapv -fno-strict-aliasing -fno-cse-follow-jumps -fno-reorder-blocks -fno-reorder-blocks-and-partition -fno-toplevel-reorder -falign-labels=1 -falign-loops=1 -falign-jumps=1 -fno-delete-null-pointer-checks -fcf-protection=none -fno-tree-vectorize -pthread -fno-defer-pop -fcaller-saves</code> <p>I think both views are minority views, because most C programmers are relatively unaware of the issue. 
That's because the maintainers of gcc (and probably clang, too) preach one position, but, from what I read, practice something much closer to my position: What I read is that they check whether a new release actually builds all the Debian packages that use gcc with the release candidate and whether these packages then pass some tests (probably their self-tests). I assume that they then fix those cases where the package then does not work (otherwise, why should they do this checking? Also, Debian and other Linux distributions are unlikely to accept a gcc version that breaks many packages). This covers a lot of actual usage (including a lot of UB). However, it will probably not catch cases where a bounds check is optimized away, because the tests are not very likely to test for that. Wed, 08 Nov 2023 18:55:54 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950601/ https://lwn.net/Articles/950601/ vadim <div class="FormattedComment"> You can to a limited extent.<br> <p> Eg, under DOS you can write C code that sets interrupt handlers, or does low level control of the floppy drive. The ability to do such low level things is precisely why C gets used to write operating systems.<br> <p> Like I said elsewhere, "portable assembler" is in my view a very metaphorical description, because obviously there can't be such a thing in the absolute sense. Proper assembler reflects the CPU's architecture, and a single language can't accurately depict the wildly different designs that exist. 
However it can get there part of the way given some compromises.<br> <p> <p> </div> Wed, 08 Nov 2023 13:16:18 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950597/ https://lwn.net/Articles/950597/ farnz <p>The C standard (and the C++ standard) do not define flags that change the meaning of a program, because they are trying to define a single language; if something is implementation defined, then the implementation gets to say exactly what that means for this implementation, including whether or not the implementation's definition can be changed by compiler flags. <p>There's four varieties of incompletely standardised behaviour for a standards-compliant program that do not require diagnostics in all cases (I'm ignoring problems that require a diagnostic, since if you ignore warnings from your compiler, that's on you): <ol> <li>"ill-formed program". This is a program which simply has no meaning at all in the standard. For example, <tt>I am a fish. I am a fish. I am a fish</tt> is an ill-formed program (albeit one that requires a diagnostic). Ill-formed programs can always be optimized down to the null statement. <li>"undefined behaviour". This is a case where, due to input data or program constructs, the behaviour of the operation is undefined, and the consequence of that undefinedness is that the entire program has no meaning attributed to it by the standard. A compiler can do anything it likes once you've made use of UB, but in order to do this (e.g. compile down to the null statement), the compiler first has to show that you're using UB; e.g. you've executed a statement with undefined behaviour, such as arithmetic overflow in C99. If you do something that has UB in some cases but not others, the compiler can assume that the UB cases don't happen. <li>"unspecified behaviour". The behaviour of the program upon encountering unspecified behaviour is not set by the standard, but by the implementation. 
The implementation does not have to document what the behaviour will be, nor remain consistent between versions. Unspecified behaviour can be constrained by the standard to a set of possible options; for example, C++03 says that <tt>static</tt>s initialized via code are initialized in an unspecified order, but for each initializer, all <tt>static</tt>s must either be fully-initialized or zero-initialized at the point the initializer runs. This means that code like <tt>int a() { return 1; }; int b = a(); int c = b + a();</tt> can set <tt>c</tt> to either 1 (b was zero-initialized) or 2 (b was fully-initialized), but not to any other value. However, because this is unspecified, the behaviour can change every time you run the resulting executable. <li>"implementation-defined behaviour". The behaviour of the program upon encountering implementation-defined behaviour is not set by the standard, but by the implementation. The implementation must document what behaviour it chooses; like unspecified behaviour, the allowed options may be set by the standard. </ol> <p>And it's been the case in the past that programs have compiled down to the null statement because they always execute UB; the problem with the SPECint 2006 benchmark in question is that it's conditionally UB in the C standard language, and thus the compiler must produce the right result as long as UB does not happen, but can do anything if UB happens. Wed, 08 Nov 2023 12:38:25 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950595/ https://lwn.net/Articles/950595/ khim <p>The problem is not the people who do or don't specify these flags, but the people who don't even think about these flags and the UB related to them.</p> <p>They apply “common sense” and refuse to divide problems into parts. They mash everything (millions of lines of code) into one huge pile and then try to reason about that. 
Here's a <a href="https://lwn.net/Articles/950470/">perfect example</a>: <i>the gcc maintainers decided to recognize this idiom in order to pessimize it</i>.</p> <p>How was that crazy conclusion reached? Easy: ignore the CPU model that gcc uses, ignore the process that the GCC optimizer uses, imagine something entirely unrelated to what happens in reality, then complain that the object of your imagination doesn't work as you expect.</p> <p><b>It's not possible to achieve safety if you do that!</b> If you refuse to accept reality and complain about something that doesn't exist, then it's not possible to do anything to satisfy you. It's as simple as that.</p> <p>P.S. It's like a TV repairer who refuses to ever look at what's happening inside and just tries to “fix” things by punching the TV from different directions. Decades ago, when a TV contained a half-dozen <a href="https://en.wikipedia.org/wiki/Vacuum_tube">tubes</a>, this even worked, and some such repairers even knew how to “fix” things by punching them lightly or harshly. But as TVs became more complicated, they stopped being able to do their magic. And modern TVs don't react to punches at all. A similar thing happened to compilers. Same result: <a href="https://en.wikipedia.org/wiki/Planck%27s_principle">A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die and a new generation grows up that is familiar with it</a>. 
And the best way to achieve that is to use some other language: a <i>new generation</i> may learn Ada or Rust just as easily as they may learn C++ (easier, arguably), and there is no need for opponents to physically die off; if they just stop producing new code, the end result will be approximately the same.</p> Wed, 08 Nov 2023 11:49:43 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950596/ https://lwn.net/Articles/950596/ Wol <div class="FormattedComment"> <span class="QuotedText">&gt; this is largely because "make it UB, and you can define it if you really want to" is the committee's go-to answer to handling conflicts between compiler vendors; </span><br> <p> Arghhhh ....<br> <p> The go-to answer SHOULD be "make it implementation defined, and define a flag that is on by default". If other compilers don't choose to support that flag, that's down to them - it's an implementation-defined flag.<br> <p> (And given I get the impression this SPECInt thingy uses UB, surely the compiler writers should simply optimise the program to the null statement and say "here, we can run this benchmark in no time flat!" :-)<br> <p> Cheers,<br> Wol<br> </div> Wed, 08 Nov 2023 11:45:02 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950590/ https://lwn.net/Articles/950590/ farnz <p>What I see, however, when I look at what the C standards committee is doing, is that the people who drive C and C++ the languages forwards are dominated by those who are embracing UB; this is largely because "make it UB, and you can define it if you really want to" is the committee's go-to answer to handling conflicts between compiler vendors; if there are at least two good ways to define behaviour (e.g. 
arithmetic overflow, where the human-friendly definitions are "error", "saturate", and "wrap"), and two compiler vendors refuse to agree since the definition either way results in the output code being less optimal on one of the two compilers, the committee punts on the decision. <p>And it's not just the compiler maintainers and the standards committees at fault here; both GCC and Clang provide flags to provide human-friendly defined behaviours for things that in the standard are UB (<tt>-fwrapv</tt>, <tt>-ftrapv</tt>, <tt>-fno-delete-null-pointer-checks</tt>, <tt> -fno-lifetime-dse</tt>, <tt>-fno-strict-aliasing</tt> and more). Users of these compilers could insist on using these flags, and simply state that if you don't use the flags that define previously undefined behaviour, then you're Holding It Wrong, but they don't. <p>Perhaps if you got (say) Debian and Fedora to change their default CFLAGS and CXXFLAGS to define behaviours that in standard C and C++ are undefined, I'd believe that you were anything more than a minority view - but the dominant variants of both C and C++ cultures don't do that. Wed, 08 Nov 2023 10:57:27 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950592/ https://lwn.net/Articles/950592/ farnz <p>I think you're being a bit harsh on those flags - they define behaviours that are UB in the standard, and require the compiler to act as-if those behaviours are fully defined in the documented manner. <p>The problem with the flags is that people aren't willing to take any hit, no matter how minor, in order to have those flags on everywhere, preferring to stick to standard C, not that there exist flags that reduce the amount of UB you can run across. 
Wed, 08 Nov 2023 10:53:35 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950591/ https://lwn.net/Articles/950591/ farnz <blockquote> presumably in your example the processor has an op-code to carry out the operation </blockquote> <p>Not usually, no. The processor typically has between 0 and 3 opcodes that you can use to implement any low-level operation, with different behaviours; if it has zero opcodes, then there are multiple choices for the sequence of opcodes you use to implement the C abstract machine, each with different behaviours. <p>And inherently, if you're asking the compiler writers to pick an option and stick to it forever, you're also saying that you don't want the optimizer to ever do a better job than it does in the current version; the entire point of optimizing is to choose different opcodes for a given C program, such that the resulting machine code program is faster. <p>This differs to things like <tt>-fwrapv</tt>, and <tt>-funreachable-traps</tt>, since those options define the behaviour of the source code where the standard says it's UB, and promise you that whatever opcodes they end up picking, they'll still meet this definition of behaviour; but a negative consequence of that is that there are programs where the new definition costs you a register or more opcodes. Now, you can almost certainly fix those programs to not have the performance bug; but that's a tradeoff that people choose not to make. Wed, 08 Nov 2023 10:51:27 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950587/ https://lwn.net/Articles/950587/ anton It's actually only one program in SPECint2006 where it costs 7.2% with gcc (as reported by <a href="https://cs.stanford.edu/~zhihao/papers/apsys12.pdf">Wang et al.</a>); and the performance improvement of the default -fno-wrapv can alternatively be achieved by changing the type of <strong>one</strong> variable in that program. 
<p>You write about people and culture, I write about compiler maintainers and attitude. And my hypothesis is that this attitude is due to the compiler maintainers (in particular those working on optimization) evaluating their work by running their compilers on benchmarks, and then seeing how an optimization affects the performance. The fewer guarantees they give, the more they can optimize, so they embrace UB. And they have produced <a href="http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html">advocacy</a> for their position. <p>Admittedly, there are also other motivations for some people to embrace UB: <ul> <li>Language lawyering, because UB gives them something that distinguishes them from ordinary programmers. <li>Elitism: The harder C programming is (and the more UB, the harder it is), the more they feel part of an elite. Everyone who does not want it that hard should switch to a nerfed language like Java, or better yet get out of programming altogether. <li>Compiler supremacy: The compiler always knows best and magically optimizes programs beyond what human programmers can do. So if, by not defining behaviour, the compiler reduces the ways in which a programmer can express something, that's only for the best: The compiler will more than make up for any potential slowdown from that by being able to optimize better. After all (and this is where the compiler writers' advocacy kicks in), UB is the source of optimization, and without having as much UB in C as there is, you might as well use -O0. <li>It's free: There are those who have not experienced the compiler changing the behaviour of their program based on the assumption that the program does not exercise UB (or have not noticed it yet, in cases where the compiler optimizes away a bounds check or the code that erases a secret). They can fall for the compiler writers' advocacy that UB gives them performance for free. 
The cost of having to "sanitize" (hunting and eliminating UB in) their programs is not yet obvious to them. </ul> There are, however, also other positions, advocated by many (including me), so I think that the C compiler writer's position on UB is not "the C culture" (it may be "the C++ culture", I don't know about that). In particular, I think (and have <a href="https://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf">evidence</a> for it) that humans are superior at optimizing programs, and that, if the goal is performance, programmer's time is better spent at performing such optimizations (by, e.g., changing the type of one variable, but also more involved transformations) than at "sanitizing" the program. Wed, 08 Nov 2023 10:10:20 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950588/ https://lwn.net/Articles/950588/ smurf <div class="FormattedComment"> <span class="QuotedText">&gt; anyone working on an optimizer has to reason about how an optimization affects behaviour</span><br> <p> … unless that behaviour is related to UB, in which case most bets are off.<br> <p> Sure you can control some optimizer aspects with some flags, but (a) that's too coarse-grained (b) there's heaps of UB conditions that are not related to the optimizer and thus cannot be governed by any flags (c) replacing some random code (random in the sense of "if it's UB the compiler can do anything it damn well pleases") with some slightly less random code doesn't fix the underlying problems (d) there's plenty of UB that isn't related to the optimizer.<br> <p> Contrast all of this that with Rust's definition of soundness, which basically states that if you don't use the "unsafe" keyword you cannot cause any UB behavior, period end of discussion.<br> <p> C/C++ is lightyears away from that.<br> </div> Wed, 08 Nov 2023 09:34:24 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) 
https://lwn.net/Articles/950589/ https://lwn.net/Articles/950589/ Wol <div class="FormattedComment"> <span class="QuotedText">&gt; &gt; Five/seven/five syllables</span><br> <p> <span class="QuotedText">&gt; So…how do *you* get 5 here? ;)</span><br> <p> "You over did it"<br> <p> Though I wonder what he thinks he's replying to ???<br> <p> Cheers,<br> Wol<br> </div> Wed, 08 Nov 2023 09:32:01 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950585/ https://lwn.net/Articles/950585/ anton A correct optimization must not change the behaviour, not that of the transformed code, and not that of seemingly unrelated code. Even UB fans accept this, they only modify the definition of behaviour to exclude UB. So anyone working on an optimizer has to reason about how an optimization affects behaviour. <p>And the fact that C (and maybe also C++, but I don't follow that) compiler writers offer fine-grained control over the behaviour they exclude, with flags like -fwrapv, -fwrapv-pointer, -fno-delete-null-pointer-checks, -fno-strict-aliasing etc. shows that they are very confident that they can control which kind of behaviour is changed by which optimization. <blockquote> It's not reasonable to demand that the next version of a compiler should not come with any improvements to its optimizer. </blockquote> That's a straw man argument. Wed, 08 Nov 2023 09:03:39 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950575/ https://lwn.net/Articles/950575/ Wol <div class="FormattedComment"> <span class="QuotedText">&gt; "What the hardware does" is a idea, but indeed not a specification (I guess that's why UB advocates love to beat on strawmen that include this statement).</span><br> <p> <span class="QuotedText">&gt; However, you can define one translation from an operation-type combination in the C abstract machine to machine code. 
If the results are used in some way, conditioning the data for that use is part of that use.</span><br> <p> No, it's nice and simple ... presumably in your example the processor has an op-code to carry out the operation. So instead of UB, we now have Implementation Defined - the compiler chooses an op-code, and now we have Hardware Defined.<br> <p> If the compiler writers choose idiotic op-codes, more fool them. But the behaviour of your code is now predictable, given a fixed compiler and hardware. Of course, "same compiler and hardware" has to be defined to mean all versions of the compiler and all revisions of the architecture.<br> <p> "What the hardware does" means the compiler writers have to pick an implementation, AND STICK WITH IT. (Of course, a flag to force a different implementation is fine.)<br> <p> Cheers,<br> Wol<br> </div> Tue, 07 Nov 2023 22:27:01 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950561/ https://lwn.net/Articles/950561/ farnz <p>But therein lies the core of the problem: <tt>-fwrapv</tt> costs some subsets of SPECint 2006 around 5% to 10% in performance. Which means, in turn, that people have already refused to turn on <tt>-fwrapv</tt> by default, since they're depending on the performance boost they get from the compiler treating "int" as "unsigned", rather than promising sign extension and wraparound. 
Tue, 07 Nov 2023 19:17:55 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950562/ https://lwn.net/Articles/950562/ mb <div class="FormattedComment"> So you are saying that you can't "program to the hardware" in C at all?<br> I fully agree.<br> </div> Tue, 07 Nov 2023 19:17:53 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950560/ https://lwn.net/Articles/950560/ vadim <div class="FormattedComment"> No, I don't think even that will do if you're counting cycles, because even with -O0 the possibility exists that the compiler will make different decisions about what instructions to use depending on bug fixes/implementation. I don't think there's, for instance, any guarantee that GCC and Clang will both produce the same binary with -O0 in anything but the most trivial cases.<br> <p> So if you're counting cycles you should probably be coding it in assembler.<br> </div> Tue, 07 Nov 2023 19:12:50 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950556/ https://lwn.net/Articles/950556/ Tobu That's a very short read that left me unsatisfied (is the IR before or after optimisation?); but looking at the intro, <a href="https://manuelrigger.at/preprints/PhDThesis.pdf">the same author's PhD thesis</a> seems like it will at least address this. Tue, 07 Nov 2023 18:52:45 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950542/ https://lwn.net/Articles/950542/ anton "What the hardware does" is an idea, but indeed not a specification (I guess that's why UB advocates love to beat on strawmen that include this statement). <p>However, you can define one translation from an operation-type combination in the C abstract machine to machine code. If the results are used in some way, conditioning the data for that use is part of that use. 
One interesting case is p+i: Thanks to the I32LP64 mistake, on 64-bit platforms p is 64 bits wide, while i can be 32 bits wide, but most hardware does not have instructions that perform a scaled add of a sign-extended (or, if i is unsigned, zero-extended) 32-bit value and a 64-bit one. So one would translate this operation to a sign/zero-extend instruction, a multiply by the size of *p, and an add instruction. <p>And then you can optimize: If the instruction producing (signed) i already produced a sign-extended result, you can leave out the sign extension; or you may be able to combine the instruction that produces i with the sign/zero-extending instruction and replace it with some combined instruction. And for that you can see how the various features of the architectures mentioned above play out. <p>As for -fwrapv or, more generally, -fno-strict-overflow: all architectures in wide use have been able to support that for many decades, so yes, that is certainly something that can be done across the board, and making it the default and putting that behaviour into the C standard is certainly a good idea. C compiler writers worrying about performance can then warn about loop variable types that require a sign extension or zero extension on every loop iteration. <p>BTW, on machines with 32-bit ints, there is no need to sign-extend 8-bit or 16-bit additions, because the way that C is defined, all additions happen on ints (i.e., 32 bits) or wider. So you convert your 8-bit or 16-bit operands to ints first, and then add them. <p>Standardization on a fully-defined behaviour is unlikely for cases where architectural differences are more substantial, e.g., shifts. You can then define -fwrapv-like flags, and that might be a good porting help, but in the absence of that, having consistent behaviour on a specific platform would already be helpful. 
Tue, 07 Nov 2023 18:44:42 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950554/ https://lwn.net/Articles/950554/ smurf <div class="FormattedComment"> <span class="QuotedText">&gt; This is unacceptably large</span><br> <p> … if you value "no performance regression even on UB-buggy code" higher than "sane(r) semantics and less UB".<br> <p> As long as people who think along these lines are in charge of the C/C++ standards, Stroustrup’s plan (or indeed any plan to transform the language(s) into something safe(r)) has no chance whatsoever to get adopted.<br> </div> Tue, 07 Nov 2023 18:34:20 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950549/ https://lwn.net/Articles/950549/ farnz <p><tt>-fwrapv</tt> slows some sub-tests in SPECint 2006 by around 5% to 10% as compared to <tt>-fno-wrapv</tt>. This is unacceptably large, even though in the cases where it regresses, someone's already done the analysis to confirm that it regresses because it used <tt>int</tt> for array indexing instead of <tt>size_t</tt>. <p>And note that, by the nature of <tt>-fwrapv</tt>, every case where it regresses performance is one where the source code is already buggy, because it depends on UB being interpreted in a way that suits the programmer's intent, and not in a different (but still legal) way. It cannot change anything where the program's behaviour was fully defined without <tt>-fwrapv</tt>, since all <tt>-fwrapv</tt> actually does is say "these cases, which used to be Undefined Behaviour, now have the following defined semantics". But that was already a legal way to interpret the code before the flag changed the semantics of the language, since UB is defined as "if you execute something that contains UB, then the entire meaning of the program is undefined and the compiler can attribute any meaning it likes to the source code". 
Tue, 07 Nov 2023 17:43:05 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950544/ https://lwn.net/Articles/950544/ adobriyan <div class="FormattedComment"> <span class="QuotedText">&gt; we have noises about -fwrapv being too expensive</span><br> <p> -fwrapv gains/losses are trivial to measure in theory:<br> Gentoo allows a full distro recompile with seemingly arbitrary compiler flags.<br> I have been using "-march=native" more or less since the moment it was introduced.<br> <p> <span class="QuotedText">&gt; If you can't get agreement on something that trivial, where the performance cost (while real, as shown by SPECint 2006) can be dealt with by relatively simple refactoring to make things that should be unsigned into actual unsigned types instead of using int for everything, what hope is there for fixing other forms of UB?</span><br> <p> Making types unsigned is not simple; the opportunities for introducing new bugs are limitless.<br> </div> Tue, 07 Nov 2023 17:35:45 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950503/ https://lwn.net/Articles/950503/ farnz <p>Forget bounds checking being too expensive; we have noises about <tt>-fwrapv</tt> being too expensive, and "all" that does is say that signed integer overflow/underflow is defined in terms of wrapping a 2's complement representation. If your code is already safe against UB, then this flag is a no-op; it can only cause performance issues if you could have signed integer overflow causing issues in your code. <p>If you can't get agreement on something that trivial, where the performance cost (while real, as shown by SPECint 2006) can be dealt with by relatively simple refactoring to make things that should be unsigned into actual unsigned types instead of using <tt>int</tt> for everything, what hope is there for fixing other forms of UB? 
Tue, 07 Nov 2023 10:52:41 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950502/ https://lwn.net/Articles/950502/ mb <div class="FormattedComment"> <span class="QuotedText">&gt;Because by specifying the "optimize" flag</span><br> <p> So we're back to: you must specify -O0 if you "program to the hardware".<br> </div> Tue, 07 Nov 2023 10:24:15 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950496/ https://lwn.net/Articles/950496/ vadim <div class="FormattedComment"> Because by specifying the "optimize" flag you're explicitly asking the compiler to try to make it faster, and therefore change the timing.<br> <p> And even without optimization, I don't think timing can be guaranteed in C or C++. At any time a compiler could be updated to learn of a new instruction, or to fix a code generation bug.<br> </div> Tue, 07 Nov 2023 09:27:34 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950495/ https://lwn.net/Articles/950495/ anton <blockquote> C doesn't have a way to specify "fused multiply and add" at all. Should C offer a library intrinsic to access such instructions? </blockquote> Looking at the output of <code>man fma</code>, it reports that the functions fma(), fmaf(), and fmal() conform to C99. Tue, 07 Nov 2023 09:08:54 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950489/ https://lwn.net/Articles/950489/ mathstuf <div class="FormattedComment"> <span class="QuotedText">&gt; 2. Translate it as-is, and whatever happens on this CPU, happens</span><br> <p> If it is translated as-is, it means that an optimizer is not allowed to touch it. An "as-is" rule is far more restrictive on code transformations than an "as-if" rule. 
Of course, this also assumes that targets even have operations for the relevant abstract operation (e.g., consider hardware lacking floating point).<br> <p> Here's a question: C doesn't have a way to specify "fused multiply and add" at all. Should C offer a library intrinsic to access such instructions? Require inline assembly? If a processor supports `popcount`, what do you want me to do to my source to access it besides writing something like `x &amp;&amp; (x &amp; (x - 1)) == 0` and having it become `popcount(x) == 1`? After all, I wrote those bitwise operations; I'd expect to see them in the assembly in your world, no?<br> <p> <span class="QuotedText">&gt; In that case the language should declare a single correct interpretation and emulate it as necessary on every CPU.</span><br> <p> Sure, except that we have noises about *bounds* checking being too intrusive and expensive. What makes you think that guarding every `(var &gt;&gt; nbitshift)` expression with some check/sanitizing code would be acceptable?<br> </div> Tue, 07 Nov 2023 07:14:29 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950488/ https://lwn.net/Articles/950488/ mathstuf <div class="FormattedComment"> <span class="QuotedText">&gt; Personally, I think I'd expect option 3 - right-shifting 2 33 times looks like 0 to me. If I have a flag to say "that is what I expect", then the compiler can either do what I asked for, or tell me "you're on the wrong hardware". Or take ten times as long to do it. All those options comply with "least surprise". Maybe not quite the last one, but tough!</span><br> <p> Note that optimization passes tend not to be aware of the literal input source or, necessarily, the target. 
Without that knowledge, it would mean that any optimization around a shift with a variable on the right is impossible, because the code could be doing What Was Intended™, and assuming any given behavior may interfere with that.<br> </div> Tue, 07 Nov 2023 07:07:45 +0000 Bjarne Stroustrup’s Plan for Bringing Safety to C++ (The New Stack) https://lwn.net/Articles/950470/ https://lwn.net/Articles/950470/ anton And here's the data on real hardware. I also include sequence 4, which is a more literal translation of the C code (gcc-10 -O -fwrapv produces this sequence): <pre> leaq -1(%rdx), %rax cmpq %rdx, %rax jg ... </pre> I put 100 copies of these sequences in a row, but they check 4 different registers rather than just one (to avoid making the data dependence on the result of the dec instruction a bottleneck for this microbenchmark). The results are in cycles per sequence. <pre> 1 2 3 4 1.02 0.54 0.54 0.55 Rocket Lake 0.54 0.54 0.52 0.53 Zen3 1.19 1.03 1.03 1.03 Tremont 2.05 1.06 2.05 1.55 Silvermont 2.42 1.43 1.27 3.09 Bonnell </pre> So, concerning our original question, sequence 2 is at least as fast as sequence 1, and actually faster on 4 out of 5 microarchitectures, sometimes by a factor of almost 2. Even sequence 4, which is what gcc-10 produces, is faster on 4 out of 5 microarchitectures and is slower only on Bonnell, which was supplanted by Silvermont in 2013. Sequence 3 is also at least as fast as sequence 1, and faster on 4 out of 5 microarchitectures. So the gcc maintainers decided to recognize this idiom in order to pessimize it. Strange. <p>Code can be found <a href="http://www.complang.tuwien.ac.at/anton/checkmin/">here</a>. Mon, 06 Nov 2023 23:09:41 +0000