
DeVault: Announcing the Hare programming language

Posted May 4, 2022 9:22 UTC (Wed) by wtarreau (subscriber, #51152)
In reply to: DeVault: Announcing the Hare programming language by felix.s
Parent article: DeVault: Announcing the Hare programming language

> There is a way: you avoid the cases that trigger UB, and rely only on what the abstract machine guarantees. I can agree this is not always easy, but there are tools to help you with that. As long as the abstract machine is implemented correctly and its invariants are upheld, the program will work on any target it is compiled for.

That doesn't work at all in practice, because of portability. Look at syscalls: some took an int a long time ago, and over time that was replaced with a socklen_t, a size_t or an ssize_t. Integer promotion in C is a disaster. You basically cannot use a single integer in a portable way, without fearing that it gets mapped incorrectly, unless you write one or two consecutive casts. And it gets worse when the input data you have was itself defined as one of these types.
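To make the cast problem concrete, here is a minimal sketch of two range-aware helpers. The names `size_to_int` and `int_less_than_uint` are hypothetical, chosen for illustration, and the sketch sticks to `int`, `unsigned int` and `size_t` rather than any particular syscall signature.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: narrow a size_t to int only when the value fits.
 * Assumes size_t is at least as wide as int, as on common platforms. */
bool size_to_int(size_t n, int *out)
{
    if (n > (size_t)INT_MAX)
        return false;           /* would not fit: refuse instead of truncating */
    *out = (int)n;
    return true;
}

/* Hypothetical helper: compare a signed and an unsigned value by their
 * mathematical values. A plain `a < b` first converts a to unsigned int,
 * so `-1 < 1u` evaluates to false. */
bool int_less_than_uint(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}
```

Centralizing the checks this way keeps the bare casts out of the calling code, which is where they tend to go wrong.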

Casts are a big cause of bugs, and they are becoming more and more common because of all the crappy abstraction types everywhere. Try to pass a time_t over the network. Hmmm, does it need to be signed or unsigned? 32 or 64 bits? When in doubt you might want to pass it as signed 64 bits. But then how do you reliably decode it on the other side? What if you picked the wrong type on the encoding side: don't you risk having it decoded wrong for special values like -1, which could mean "forever" or "event not happened" for some syscalls?
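One way to approach the time_t question is to fix the wire format at signed 64 bits, big-endian, and do the sign handling explicitly. The sketch below assumes that time_t is a signed integer type no wider than 64 bits (true on common platforms, but not guaranteed by ISO C). The function names are hypothetical.

```c
#include <stdint.h>
#include <time.h>

/* Encode a timestamp as a signed 64-bit big-endian integer.
 * Assumption: time_t is a signed integer type no wider than 64 bits.
 * With an unsigned 32-bit time_t, (time_t)-1 would be sent as
 * 4294967295 and the "special value" meaning would be lost. */
void encode_time_be64(time_t t, unsigned char out[8])
{
    int64_t v = (int64_t)t;
    uint64_t u = (uint64_t)v;   /* well-defined: reduced modulo 2^64 */
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)((u >> (56 - 8 * i)) & 0xFF);
}

/* Decode 8 big-endian bytes into an int64_t without relying on the
 * implementation-defined conversion of out-of-range values to signed. */
int64_t decode_be64(const unsigned char in[8])
{
    uint64_t u = 0;
    for (int i = 0; i < 8; i++)
        u = (u << 8) | in[i];
    if (u <= (uint64_t)INT64_MAX)
        return (int64_t)u;
    /* Negative value: ~u fits in int64_t, so the arithmetic cannot overflow. */
    return -(int64_t)(~u) - 1;
}
```

The receiver still has to decide whether the decoded int64_t fits its own time_t. That check is left out here because it depends on the local platform.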

> If you don’t, you forfeit any right to complain that compilers ‘abuse’ UB: if it’s undefined, it’s undefined, and it doesn’t even have to act deterministically

As someone said above, these behaviors used to be undefined only in the sense that they were hardware-dependent. Now it's a free pass for the compiler to say "awesome, this developer fell into my trap, so I can over-optimize that code and show my rivals how much faster my code is without all these useless checks". In addition, let me remind you that the C spec isn't open: you have to pay for it, and you discover the undefined behaviors very late in your life as a developer. Sure, some drafts are now accessible, and you may consider them almost identical to the official spec. But this alone is a big problem.


DeVault: Announcing the Hare programming language

Posted May 4, 2022 11:39 UTC (Wed) by farnz (subscriber, #17727)

C has, since at least ANSI C89, had both undefined behaviour and implementation defined behaviour. Implementation defined behaviour is the stuff that's hardware-defined - for implementation defined behaviour, the compiler must tell you how it implements it (but can say things like "arithmetic overflow for 32 bit integers is defined by the behaviour of ADD EAX, EBX on your CPU, while for doubles, it's defined by the behaviour of FADD ST0, STi with truncation to 64 bit only happening if the compiler chooses to store the value to memory"), while for undefined behaviour, the compiler can do anything it likes.
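As an illustration of the difference, ISO C leaves overflow in signed `int` addition undefined, so the portable approach is to test for it before adding. Right-shifting a negative value, by contrast, is implementation-defined, and the compiler's documentation has to say what it does. A minimal sketch, with the hypothetical name `checked_add_int`:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: add two ints only when the result is representable.
 * The test happens before the addition, so the undefined case is never
 * executed. */
bool checked_add_int(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;           /* a + b would overflow */
    *sum = a + b;
    return true;
}

/* For contrast: `-8 >> 1` is implementation-defined, not undefined.
 * The result may differ between compilers, but each one must document it. */
```

Checking after the fact, for example `if (a + b < a)`, does not work as a replacement, because the undefined addition has already happened by the time the test runs.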

The underlying "gotcha" with C for us older programmers is that optimizing compilers weren't very good until the late 1980s; before then, it was reasonable to model compilers as translating what I wrote 1:1 to a lower level language, then peephole optimizing that language, then repeating the process until the lower level language is machine code.

That's not how modern compilers work, however. They do much more sophisticated analyses to drive optimization, and can thus easily detect many more opportunities to optimize, but those analyses come up with results that are surprising if you're thinking in terms of peephole optimization.
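A commonly cited example of such a surprise is a null check placed after a dereference. The sketch below uses hypothetical function names. Whether a given compiler actually removes the check depends on the compiler and its options, but the language rules allow it.

```c
#include <stddef.h>

/* Fragile: the dereference comes first. Because dereferencing a null
 * pointer is undefined, the compiler may assume p is non-null from that
 * point on and is allowed to delete the check below. */
int read_value_fragile(const int *p)
{
    int v = *p;
    if (p == NULL)
        return -1;
    return v;
}

/* Robust: the check comes before any dereference, so it cannot be
 * reasoned away. */
int read_value(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}
```

Under a 1:1 translation model both functions look equivalent on hardware where reading address zero does not trap. Under the abstract machine, only the second one has defined behaviour for a null argument.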

