Undefined, implementation defined, and unspecified behaviour
Posted Jul 24, 2024 15:49 UTC (Wed) by khim (subscriber, #9252)
In reply to: Undefined, implementation defined, and unspecified behaviour by farnz
Parent article: GNU C Library 2.40 released
> Making more things UB is a hint that they're absolutely fine with the idea that some things are undefined in Standard C, but fully defined in POSIX C.
Have they actually talked to the POSIX guys about removing the “this volume of POSIX.1-2017 defers to the ISO C standard” words, then?
Because it's not enough to “give a hint”; someone would have to go and change things to stop “deferring to the ISO C standard”. Who would that someone be?
> If there's a huge row about what realloc(NULL, 0) or realloc(ptr, 0) is "supposed" to mean, the politics of the situation can make it harder to get consensus on acceptable behaviours than to simply declare it undefined.
And making existing programs invalid on the day they publish C23 is, somehow, better? How?
Posted Jul 24, 2024 16:23 UTC (Wed)
by farnz (subscriber, #17727)
Yes, they absolutely have talked to the POSIX guys about this - the issue is that there are three different conflicting POSIX-land interpretations of what realloc(ptr, 0) should do (AIX, BSD, glibc), and so we've ended up in this mess where POSIX doesn't want to pick sides, the C standard doesn't want to pick sides either (since as well as the three POSIX-land behaviours, it wants to support non-POSIX behaviours, too), and the result is a mess because no standards body is willing to stand up and say "this is the behaviour that you must have if compliant with this standard".
And no existing program is made invalid under C23; an implementation is allowed to define UB itself, and the older standards made it implementation defined with an absolute dogs' dinner of a set of permitted behaviours (notably, it was permissible to return a non-NULL pointer that you could not use). Because it was implementation defined, implementations have to document what they do with realloc(_, 0), and the C standard does not overrule your documented behaviour; rather, your documented behaviour overrules the C standard.
The dogs' dinner of choices for the behaviour of realloc(_, 0), in turn, was sufficiently bad that it was effectively UB under a different name, thanks to bad drafting in an effort to keep the conflicting behaviours of AIX, BSD and glibc all as permitted behaviours. As a result, if you're not going to choose a winner, it's more honest to define it as UB, and not as returning a pointer that you may or may not be allowed to dereference, with a requirement that you might be required to free the pointer that you passed to realloc if it returned NULL, but you also must not free the pointer that you passed to realloc if it returned NULL.
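One way for calling code to sidestep the ambiguity described above is never to pass a zero size to realloc at all. The following is a minimal sketch in C, not anything taken from the standards or from this thread: the helper name `resize_buffer` and the policy "size zero means free the buffer" are assumptions made purely for illustration.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (illustration only): resize *pp to new_size bytes
 * without ever calling realloc() with a size of zero.
 *
 * Returns 0 on success, -1 on allocation failure.
 * On failure, *pp is left untouched and is still owned by the caller.
 */
static int resize_buffer(void **pp, size_t new_size)
{
    if (new_size == 0) {
        /* Assumed policy: a zero size means "release the buffer". */
        free(*pp);              /* free(NULL) is a no-op */
        *pp = NULL;
        return 0;
    }

    /* new_size > 0 here, so the realloc(_, 0) question never arises. */
    void *tmp = realloc(*pp, new_size);
    if (tmp == NULL)
        return -1;              /* the original block is still valid */

    *pp = tmp;
    return 0;
}
```

With a wrapper along these lines, a NULL result from the underlying realloc always means "allocation failed, old pointer still valid", which is the one case on which the differing implementations agree.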
Underlying all of this is an unfortunate truth about the state of Standard C right now; ISO doesn't want to pick winners among existing implementations, unless it's clear that there is a "right" and a "wrong" answer. The Austin Group (POSIX) don't believe that they have sufficient weight with compiler authors to get compilers to comply with POSIX C in places where it defines things that ISO does not. And language users don't have enough weight with compiler authors to get an agreement together that can be taken to The Austin Group, or to ISO. Which leaves us in a situation where ISO is (correctly, per their charter) not defining parts of C that don't have a generally agreed upon meaning, but there is no way to get compiler authors to define things that ISO doesn't.
And that's bad for C users, since it leaves C users in a position where anything that's not defined the same way by every obscure C compiler out there cannot be relied upon between compiler version upgrades of a big-name compiler like Clang or GCC. Heck, from what I can gather, The Austin Group aren't even convinced that, if they were starting again now, they could insist that CHAR_BIT == 8 - the ISO standard says that CHAR_BIT >= 8.
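Code that silently relies on 8-bit bytes can at least make that assumption explicit, so that a platform where it does not hold fails at compile time instead of misbehaving. This is a small sketch assuming a C11 or later compiler; the helper name `bits_in` is hypothetical and exists only to show the assumption being used.

```c
#include <limits.h>

/* ISO C only guarantees CHAR_BIT >= 8; refuse to build anywhere else. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

/* Hypothetical helper: number of bits in an object of nbytes bytes. */
static unsigned long bits_in(unsigned long nbytes)
{
    return nbytes * CHAR_BIT;
}
```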
Posted Jul 24, 2024 21:11 UTC (Wed)
by khim (subscriber, #9252)
Dysfunctionality in standards land
> And no existing program is made invalid under C23; an implementation is allowed to define UB itself, and the older standards made it implementation defined with an absolute dogs' dinner of a set of permitted behaviours (notably, it was permissible to return a non-NULL pointer that you could not use).
Sure, but it wasn't possible to back-propagate the fact that someone tries to call realloc with zero size into functions that precede or follow that piece of code and remove security checks and other such things. The new wording makes such optimizations more or less a no-brainer for anyone who wants to earn some brownie points on their resume.
That's fine. The biggest issue with UB is not the cases where someone uses it on purpose, but the cases where someone triggers that condition by accident (an attempt to allocate a zero-sized buffer, e.g.) and then the compiler turns your program into a pile of goo (as the new wording allows). That is the biggest issue, and it was introduced in C23; it wasn't in any other standard AFAICS.
No, it wasn't. Any behavior, no matter how crazy it is, still keeps your program in the set of strictly conforming programs and thus guarantees somewhat sane behavior in the rest of the program. But if your program triggers UB then it's no longer a strictly conforming program and the compiler has the right to turn it into a pile of goo.
No, it's not just “bad”. It's a complete disaster. Currently Google is looking for a way to completely ditch C++ (and C, of course, too), long term. Sure, it says “we don't know if any of these options will be feasible or how much they will cost; unless some of them prove both feasible and cost effective, we will continue investing in C++ despite its problems”, but given the fact that they explicitly say that C++ is a long-term strategic risk for… Microsoft wanted to ditch C++ in favor of C# and that plan failed, but they may succeed with Rust.
But in the end… I guess the worst thing that may actually happen is C and C++ becoming the new COBOL and FORTRAN. COBOL 2023 and Fortran 2023 both exist, thus the C and C++ committees would be able to travel around the world to write and publish new papers even if no one sane would even bother to read them, thus for them such an outcome is acceptable, I guess.
