Development quote of the week

Posted Dec 4, 2022 0:30 UTC (Sun) by mathstuf (subscriber, #69389)
In reply to: Development quote of the week by anton
Parent article: Development quote of the week

> My paper does not tell anyone how to make a compiler-specific program portable between compilers, nor how to make an architecture-specific program portable between architectures. The mistake of efforts like Regehr's Friendly C is that they try to solve these problems, but that is not necessary for a friendly C compiler.

This basically leads to an "IBSan" tool that detects implementation-defined behavior and signals on used-to-be-UB-but-is-now-arch-dependent behavior. Portability is a benefit of code, and if I know that my x86-compiled code is UB-free, it'll have the same behavior (but certainly not performance profile) on ShinyNewArch that gets released a decade from now. I really don't want to have to go to every project and make sure that they CI test my pet arch to make sure I don't have live grenades being lobbed my way on every update. I expect that Debian and NetBSD porters to obscure architectures appreciate that breaking these rules is just as "bad" on the "native" development platform(s) as it is on their target(s).

Now, if there were an in-language way (no, the preprocessor doesn't count) to say "this is targeting x86 because we're talking to an IME, give me native behavior", *then* I could see there being some new "undefined-if-portable behavior" bucket for these kinds of things to go into.

Development quote of the week

Posted Dec 4, 2022 17:46 UTC (Sun) by anton (subscriber, #25547)

Portability is valuable in many settings, and I think I am quite experienced in the area, with Gforth (which "breaks the rules" (in your terminology) a lot) usually working out of the box on new architectures and operating systems.

But my point is that a friendly compiler must also work for non-portable code: If it works with one version of the compiler on a particular platform, it must also work with a later version of that compiler on that platform.
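To make the kind of code at issue concrete, here is a minimal sketch; it is an illustration chosen for this thread, not an example taken from the paper. A well-known case of code that "works" with one compiler version and not with a later one is a signed-overflow test that relies on wraparound. The comment shows that idiom; the functions show a form that stays within ISO C. The names `add_would_overflow` and `checked_add` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Idiom that relies on wraparound, which ISO C leaves undefined for int:
 *
 *     if (a + b < a) { handle overflow }     (for b > 0)
 *
 * A compiler is allowed to assume the overflow never happens and drop
 * the branch, so this can behave differently across compiler versions. */

/* Well-defined alternative: compare against the limits before adding. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return false;
}

/* Adds only after the precondition has been checked. */
int checked_add(int a, int b)
{
    assert(!add_would_overflow(a, b));
    return a + b;
}
```

The second form does not depend on what a particular compiler version does with overflow, which is the portability side of the argument; the first form is the kind of code the backwards-compatibility requirement is about.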

Portability is an orthogonal requirement. Your hypothetical "IBSan" tool may be helpful, although I have my doubts, see below. In practice I test for portability by making test runs on as many different platforms as I can get my hands on. That's not 100% reliable, but it tends to work quite well.

I have my doubts about "IBSan" because it assumes one binary that should cover all portability variants. Real-world portable C programs often have lots of conditional compilation and stuff coming from configure to help with portability. If you write "the preprocessor doesn't count", it's obvious that you are not interested in C as it is used in the real world.

Development quote of the week

Posted Dec 4, 2022 18:27 UTC (Sun) by khim (subscriber, #9252)

> But my point is that a friendly compiler must also work for non-portable code: If it works with one version of the compiler on a particular platform, it must also work with a later version of that compiler on that platform.

Well, none of C, C++, or Rust is even trying to be “friendly” by that definition (here's a recent example where Rust 1.65 doesn't accept source which Rust 1.64 accepted). That's fine with Rust users, yet apparently not fine with a small (but very vocal) group of C users.

That's basically why C and C++ are doomed: in their world compiler users and compiler developers each talk in ultimatums which the other side is not willing to accept, which means the conflict can never be resolved.

I have seen so much talk about “friendly C” (O_PONIES, really) from C users, but I don't know of a single optimizing-compiler developer who subscribes to that idea.

Development quote of the week

Posted Dec 4, 2022 20:45 UTC (Sun) by pizza (subscriber, #46)

> Well, none of C, C++, or Rust is even trying to be “friendly” by that definition (here's a recent example where Rust 1.65 doesn't accept source which Rust 1.64 accepted). That's fine with Rust users, yet apparently not fine with a small (but very vocal) group of C users.

That should read -- "that's apparently fine with the current Rust users".

C and C++ have several orders of magnitude more users than Rust. And those users (and compiler writers, and language stewards) are all trying to pull in their own, often incompatible, directions, collectively with literally billions of lines of code/baggage.

Rust, by virtue of being rather youthful, doesn't yet have a significant mass of users or use cases. There is only one implementation, produced by the same folks who define the language, and most of the users are still of the True Believer sort. All of this will inevitably change, and when it does, the needs of these various sub-groups will inevitably begin to diverge, and then the current "our way or the highway" language+implementation stewardship model will start failing.

If Rust does eventually succeed (ie ends up as a "legacy" language with many hundreds of millions of lines of code in wide deployment across tens of thousands (if not more) of organizations with divergent needs spanning a couple decades or so) then continuing to evolve it will run into many of the same sorts of problems that C and C++ face today -- ie problems of politics and governance.

I don't have any skin in this particular game, but I've been around long enough to see certain patterns, including the "we're smarter than those other guys so we'll be immune to their problems" hubris that *always* comes back to bite.

Development quote of the week

Posted Dec 4, 2022 21:53 UTC (Sun) by khim (subscriber, #9252)

> C and C++ have several orders of magnitude more users than Rust.

I don't think so. There is certainly a lot more existing code in C and C++, because they had a head start of several decades. As for the number of actual users, it's hard to say for sure, but recent counts put it at around half of Go or a third of Kotlin (and about ten times less popular than JavaScript, which, you must admit, is definitely more popular than C, C++, or Rust).

> Rust, by virtue of being rather youthful, doesn't yet have a significant mass of users or use cases.

No. The important thing is not the fact that Rust is youthful, but the fact that Rust users are youthful. The fiasco that happened with C and C++ is mostly caused by old people who still remember times when it was possible to pretend that C is a “portable assembler”, to “program to the hardware”, and to expect that the compiler wouldn't screw you.

I have dealt with quite a few new grads, and they accept the strange and bizarre rules of standard C/C++ without much complaint. For them it's just how this weird language works. Strange rules, but hey, rules are rules. And the same happens with Rust.

But in C, very often, they have to deal with these old “relax, I know what I'm doing, I'm older than C, I know how it works” guys. In Rust, as I have said, these guys are expelled from the community instead.

I don't think this will change. Even if the number of Rust developers is not ⅓ of the number of C/C++ developers but closer to ⅒, it's pretty obvious that a C/C++-style disaster won't happen to Rust.

Planck's principle in action: “An important scientific innovation rarely makes its way by gradually winning over and converting its opponents: it rarely happens that Saul becomes Paul. What does happen is that its opponents gradually die out, and that the growing generation is familiarized with the ideas from the beginning.”

I met some really old software guys when I was in college, and what I observe today reminds me of their tales about how structured programming arrived. The exact same refusal to accept a new idea, the insistence that “proper” design is done with flowcharts on A1 (or A0 for complex cases) paper, and that all these newfangled things like stacks or loops just make development difficult, and so on and so forth.

The only big question is whether this time Rust (and Rust-like languages) will actually win, or whether history will repeat itself and, after the initial success of properly structured languages like Algol or Pascal, some half-baked newcomer will come and take over (like C and C++ did).

Time will tell.

> I don't have any skin in this particular game, but I've been around long enough to see certain patterns, including the "we're smarter than those other guys so we'll be immune to their problems" hubris that *always* comes back to bite.

Nah. I don't think there's any chance of Rust making the same mistakes as C and C++ did, but it can certainly make entirely new ones.

E.g. its approach to async programming… I'm still not convinced it's the right one and won't lead to a dead end.

It's a tale as old as time.

Posted Dec 5, 2022 16:09 UTC (Mon) by smoogen (subscriber, #97)

Most of these threads seem to mirror conversations I remember from the late 1980s, when obfuscated C programs were big and many of the people who are now old argued the same things about how the compiler should or should not have allowed it. It also mirrors arguments between the 1978 and 1988 versions of K&R C. The fact that many compilers would allow some middle road between 78 and 88 until the early 00s just allowed the 'what does C mean?' arguments to go on even longer.

Development quote of the week

Posted Dec 5, 2022 12:02 UTC (Mon) by farnz (subscriber, #17727)

Given "If it works with one version of the compiler on a particular platform, it must also work with a later version of that compiler on that platform", what stops a compiler team deciding that supporting the behaviours of the previous version is too hard, and thus they're going to release a new compiler with the same CLI, that's not a "new version of the existing compiler"?

This is not a pure hypothetical - GCC 3 is not merely a "new version of GCC", it's a new compiler (the egcs GCC fork) that was adopted by GCC as the clearly better outcome. If you set a rule like your proposed rule, what stops GCC21 being a new compiler, not version 21 of GCC?

Development quote of the week

Posted Dec 6, 2022 22:46 UTC (Tue) by anton (subscriber, #25547)

> what stops a compiler team deciding that supporting the behaviours of the previous version is too hard, and thus they're going to release a new compiler with the same CLI, that's not a "new version of the existing compiler"?

Self-respect.

But actually that's somewhat the situation we have with gcc (and probably clang) now, only the maintainers don't say explicitly that their compilers are not backwards-compatible (they certainly have declared bug reports as invalid that clearly state that the code has worked with earlier gcc versions), so some people think of switching to a newer version of gcc as being an upgrade. It's not.

Even when starting with the same code base a compiler can be backwards-incompatible (as demonstrated by some gcc versions newer than 3), and with a different code base it can be compatible (but that's hard). And actually egcs was forked from the pre-gcc-2.8 code base.

Development quote of the week

Posted Dec 6, 2022 3:26 UTC (Tue) by mathstuf (subscriber, #69389)

> If you write "the preprocessor doesn't count", it's obvious that you are not interested in C as it is used in the real world.

I'm interested in *improving* things so that the compiler can *see* "this code is x86-bound, feel free to optimize appropriately" with proper attributes rather than code-masking performed by the preprocessor. Flowing "this code was selected based on a check of `defined(__x86_64__)`" is unlikely to be tenable with how complicated some preprocessor checks are (and *their abstractions* used in various libraries).

Development quote of the week

Posted Dec 6, 2022 22:55 UTC (Tue) by anton (subscriber, #25547)

There are some programs that don't need configure or the like to be portable, but many use a lot of stuff coming out of configure, and I am very pessimistic that we can get rid of conditional compilation.

When you write "this code is x86-bound, feel free to optimize appropriately", what optimization do you have in mind?

Development quote of the week

Posted Dec 6, 2022 23:03 UTC (Tue) by mathstuf (subscriber, #69389)

> When you write "this code is x86-bound, feel free to optimize appropriately", what optimization do you have in mind?

I'm thinking that the optimizers can assume specific behavior for things instead of considering it UB. For example, left shift by too much can keep the same value (IIRC, ARM makes it 0). The programmer *intent* that this is target-specific is what is important here. Bare C code doing such a shift is still in the "this doesn't mean what you think it means, so I will assume that such Bad Things™ don't happen" bucket.
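A rough sketch of the two views, assuming a 32-bit operand and the count masking that x86 applies from the 80186 on. Both function names are made up for illustration; neither is an existing attribute or compiler feature.

```c
#include <assert.h>
#include <stdint.h>

/* Bare-C view: a count >= the operand width is undefined, so the caller
 * must guarantee n < 32. The assert only documents that precondition. */
uint32_t shl32_strict(uint32_t x, unsigned n)
{
    assert(n < 32);
    return (uint32_t)(x << n);
}

/* "Give me x86 behavior" view, written out by hand: the count is masked
 * to its low 5 bits, so a shift by 32 leaves the value unchanged. */
uint32_t shl32_x86_style(uint32_t x, unsigned n)
{
    return (uint32_t)(x << (n & 31u));
}
```

Spelling out the mask is how the intent can be stated in today's C; the idea above is that an in-language marker could let the compiler assume this behavior without it being written by hand.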

> There are some programs that don't need configure or the like to be portable, but many use a lot of stuff coming out of configure, and I am very pessimistic that we can get rid of conditional compilation.

I also think that conditional compilation is here to stay. However, it doesn't have to be a code-blind copy/paste mechanism. With `constexpr` instead of preprocessor symbols, it is possible to have something like D's `static if` or Rust's `cfg!()` mechanisms to hide code during compilation. This allows it to still be syntax checked and formatted appropriately instead of being a wild west of sadness when some long-dormant branch with unbalanced curly braces finally gets activated.
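Something in this direction can be approximated in C today. The sketch below is only an approximation under stated assumptions: the preprocessor is used once to produce a constant (`TARGET_IS_X86_64`, a hypothetical name), and the selection itself is an ordinary `if`, so both branches are parsed and type-checked on every target. The behavior chosen for the non-x86 branch is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* One preprocessor check, turned into a compile-time constant. */
#if defined(__x86_64__) || defined(_M_X64)
enum { TARGET_IS_X86_64 = 1 };
#else
enum { TARGET_IS_X86_64 = 0 };
#endif

/* Both branches are seen by the compiler on every target; the untaken
 * one is dead code that the optimizer can remove. */
uint32_t target_shl32(uint32_t x, unsigned n)
{
    if (TARGET_IS_X86_64) {
        /* x86-bound branch: count masked to 5 bits, as the hardware does. */
        return (uint32_t)(x << (n & 31u));
    } else {
        /* Portable branch: oversized counts remain a caller error. */
        assert(n < 32);
        return (uint32_t)(x << n);
    }
}
```

Unlike an `#if` around the function body, a stray brace or a misspelled identifier in the inactive branch is a compile error everywhere. The limitation is that the inactive branch must still compile, so it cannot reference declarations that exist only on the other target.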

Development quote of the week

Posted Dec 7, 2022 0:29 UTC (Wed) by khim (subscriber, #9252)

> For example, left shift by too much can keep the same value (IIRC, ARM makes it 0).

It's a bit worse than that. ARM uses the low byte of the count register to do the shift, which means that a shift by 128 is, indeed, zero, but a shift by 256 doesn't change anything (and doesn't touch flags).
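As a small model of that rule in portable C (a sketch of the AArch32 register-controlled `LSL` result only, with flags not modelled; the function names are invented for this illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Models the result of a 32-bit left shift whose count comes from a
 * register: only the low byte of the register is used as the count. */
uint32_t lsl_aarch32_reg_model(uint32_t value, uint32_t rs)
{
    uint32_t count = rs & 0xFFu;

    if (count == 0)
        return value;          /* e.g. rs == 256: value unchanged */
    if (count >= 32)
        return 0;              /* e.g. rs == 128: every bit shifted out */
    return value << count;     /* 1..31: ordinary, well-defined shift */
}

/* Checks the two cases mentioned above. */
void lsl_model_selfcheck(void)
{
    assert(lsl_aarch32_reg_model(1u, 128u) == 0u);
    assert(lsl_aarch32_reg_model(7u, 256u) == 7u);
}
```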

Development quote of the week

Posted Dec 7, 2022 0:54 UTC (Wed) by khim (subscriber, #9252)

> For example, left shift by too much can keep the same value (IIRC, ARM makes it 0).

Note, BTW, that the very first x86 CPU, the 8086 (and 8088), behaves like ARM, not like all subsequent x86 CPUs.

Which means Intel took advantage of this UB back when it was developing the Intel 80186 forty years ago.

ARM also has a similar case, e.g., it has push and pop instructions which may push or pop from 1 to 16 registers as the result of one instruction. If you specify 0 registers then some manufacturers treat it as a NOP, some treat it as UD, but it's also permitted to load a random set of registers from the stack, including the PC!

So much for predictable hardware, huh? In fact, a document called ARMv8 AArch32 UNPREDICTABLE behaviours lists more than 50 of these.