Null pointers and error paths
Posted Jun 7, 2024 18:01 UTC (Fri) by riking (subscriber, #95706)
In reply to: Null pointers and error paths by Wol
Parent article: Removing GFP_NOFS
Posted Jun 7, 2024 18:13 UTC (Fri) by Wol (subscriber, #4433)
Not "your previous implementation-defined code can now be optimised away as undefined".
Cheers,
Wol
Posted Jun 8, 2024 8:47 UTC (Sat) by farnz (subscriber, #17727)
That sounds like a perfect case for implementation defined behaviour, without a set of constrained behaviours; the implementation has to document how it chooses to behave to be compliant, but has to stick to whatever behaviour it documents. I can see why you wouldn't want it to be unspecified behaviour, since (while you can constrain that to a set of allowed options), you generally want unspecified behaviour to be cases where there's a consistent compatible use, and room to do better if the implementation chooses a specific option.
Is there any discussion you can point me to that explains why not implementation defined here?
Posted Jun 8, 2024 10:25 UTC (Sat) by excors (subscriber, #95769)
The key part is:
> Classifying a call to realloc with a size of 0 as undefined behavior would allow POSIX to define the otherwise undefined behavior however they please.
It won't be undefined behaviour to call realloc(ptr, 0) from C23 on a POSIX system, because POSIX already defines it. ("Undefined behaviour" is a recessive trait - if another document wants to define it, then that definition takes priority). Platform-independent code can't rely on the POSIX definition (since it may run on a non-POSIX platform), but it can't rely on any implementation-defined behaviour either, so that's no worse than how it's been since at least C99.
POSIX says realloc(ptr, 0) can either return NULL, or return a pointer to some allocated space of unknown size, with some extra rules about errno (which were never in the C standard). So you can't use it as an alternative to free() anyway - it may perform a new allocation, which you will leak.
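Code that wants one behaviour everywhere can avoid the question by never passing a size of 0 to realloc. The following is a minimal sketch of that idea; `resize_or_free` is a hypothetical helper name, not something defined by ISO C or POSIX.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: resize a buffer, treating a size of 0 as an
 * explicit request to free it.
 *
 * Returns the new pointer, or NULL if the buffer was freed or the
 * allocation failed. If the allocation fails with new_size > 0, the old
 * buffer is still valid, so the caller should keep its old pointer.
 */
void *resize_or_free(void *ptr, size_t new_size)
{
    if (new_size == 0) {
        free(ptr);      /* free(NULL) is a no-op, so this is always safe */
        return NULL;
    }
    return realloc(ptr, new_size);  /* realloc never sees a size of 0 */
}
```

With this wrapper, a zero-size request never allocates anything, so it cannot leak in the way described above.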
Posted Jun 8, 2024 13:38 UTC (Sat) by farnz (subscriber, #17727)
Ah, so the rationale is that they want POSIX and other downstream standards to add definitions for more things that are UB in Standard C; effectively reducing Standard C to "the things that are portable across all implementations of C", and expecting POSIX C to be an extension of Standard C to "the things that are portable across all reasonable UNIX-like implementations of C".
Posted Jun 9, 2024 9:03 UTC (Sun) by Wol (subscriber, #4433)
If the C standard says "the compiler can assume undefined behaviour cannot happen", then if it's defined as undefined surely the compiler can just delete the code as "can't happen"? And isn't that exactly the behaviour we've been moaning about for ages?
In which case any alternative definition never gets considered?
Cheers,
Wol
Posted Jun 9, 2024 10:07 UTC (Sun) by excors (subscriber, #95769)
It doesn't say that. It says:
> undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirements
A compiler that assumes realloc(ptr, 0) cannot happen will still be a conforming implementation of the C standard, but it won't be a conforming implementation of POSIX. A compiler that implements the POSIX behaviour will be a conforming implementation of both. Any C compiler that targets POSIX will aim to conform with both standards, so it doesn't really matter which one defines the behaviour.
Incidentally, on non-POSIX systems, Microsoft says "realloc hasn't been updated to implement C17 behavior because the new behavior isn't compatible with the Windows operating system" (https://learn.microsoft.com/en-us/cpp/c-runtime-library/r...). N2464 says C17 changed the definition of realloc "to allow for the existing range of implementations", but evidently they failed since Microsoft believes it doesn't allow their behaviour.
I'm guessing C23 made it undefined because they couldn't work out a useful, unambiguous definition that would allow both the POSIX and Windows behaviours, and neither of those is going to change, so it was a waste of time - better to simply defer to the platform documentation.
Posted Jun 9, 2024 16:19 UTC (Sun) by Wol (subscriber, #4433)
But that then leaves us wondering what on earth any cross-platform compiler such as gcc does, seeing as it can produce both POSIX and Windows binaries ... and given that Linux makes no claim whatsoever to support POSIX, that's giving the gcc guys carte blanche to just eliminate a realloc(ptr,0) as "does nothing". BAAADDDD ...
Cheers,
Wol
Posted Jun 10, 2024 8:18 UTC (Mon) by farnz (subscriber, #17727)
A compiler like GCC can know what platform it's compiling code for (it has to know which CPU, and which ABI, after all, so extending that to which platform standard applies and allowing an override is a non-issue). It can thus support POSIX C when compiling for POSIX platforms (like glibc-based Linux), Windows C when compiling for Windows, and merely ISO C when compiling a freestanding binary that doesn't depend on a platform.
Note that GCC already has the mechanism you'd need for this in place as part of its support for multiple C dialects - it could have a POSIX/Windows/neither switch in there, too.
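The target knowledge described above is also visible to user code through predefined macros. As a rough sketch: `_WIN32`, `__unix__`, `__APPLE__`, `_POSIX_VERSION` and `__STDC_HOSTED__` are existing macros, but the function name `target_c_dialect` and the three-way "dialect" labels are purely illustrative assumptions, not anything GCC itself reports.

```c
/* <unistd.h> defines _POSIX_VERSION on systems that claim POSIX support. */
#if defined(__unix__) || defined(__APPLE__)
#include <unistd.h>
#endif

/*
 * Hypothetical helper: name the platform standard this translation unit
 * is being compiled against, based on macros the toolchain predefines.
 */
const char *target_c_dialect(void)
{
#if defined(_WIN32)
    return "Windows C";
#elif defined(_POSIX_VERSION)
    return "POSIX C";
#elif defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 0
    return "freestanding ISO C";
#else
    return "hosted ISO C";
#endif
}
```

The choice is made entirely at compile time, which is the point: nothing about the decision needs to wait until the program runs.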
Posted Jun 9, 2024 16:25 UTC (Sun) by Wol (subscriber, #4433)
So doing a "ptr = realloc(ptr, 0)" instead of a free would prevent double frees or access-after-free, so long as (a) you're using a Windows-compliant compiler, and (b) you don't make copies of ptr. Surely that would make masses of sense as a simple way of defensive coding!
Cheers,
Wol
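The same defensive effect can be had without relying on any particular realloc(ptr, 0) behaviour. A minimal sketch, where `FREE_AND_NULL` is a hypothetical macro name chosen for illustration:

```c
#include <stdlib.h>

/*
 * Hypothetical macro: free the buffer and clear the pointer in one step.
 * A second FREE_AND_NULL on the same variable then calls free(NULL),
 * which is a no-op, and a later dereference hits a null pointer rather
 * than freed memory.
 */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

Caveat (b) still applies: any copies of the pointer made elsewhere are not cleared and remain dangling.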
Posted Jun 10, 2024 10:40 UTC (Mon) by farnz (subscriber, #17727)
So, reading the discussions a bit more, I get the sense that the ISO committee's goal is to move the meanings of undefined behaviour, unspecified behaviour and implementation-defined behaviour around a bit (in a way that's always been valid, but has been under-utilized by downstream standards like POSIX).
The ISO standard sets a common definition of C that all implementations of C must agree on, but allows a lot of latitude for downstream standards to tighten up the ISO definitions; for example, ISO says that "double" is a floating point number, so a downstream standard cannot repurpose "double" for integers represented using a pair of registers, but while ISO C says that the size of "char" in bits is implementation-defined and at least 8 bits, POSIX says that the size of "char" is always exactly 8 bits.
This is then an attempt to push downstream standards to tighten up ISO definitions when it comes to behaviours. It's already obvious to downstream standards that where something is implementation-defined, a downstream standard can say "the implementation must define it this way". It's not so obvious that the same applies to unspecified behaviour (one of a set of choices, with no need to be consistent or to document which one, as long as you choose from the set every time you encounter this construct) and undefined behaviour (the program can take any meaning if this construct is encountered): a downstream standard can make either of those defined, implementation-defined (choose and document a behaviour), or unspecified if it so desires, without conflicting with ISO C.
Taking an example that upsets a lot of people, ISO says that signed integer overflow is undefined behaviour; but they'd be very happy for POSIX to say that signed integer overflow is unspecified behaviour, and must be either saturating, two's complement wrap-around, wraps around to zero (skipping the negative numbers completely), or results in the program receiving a SIGFPE signal. It'd then be on implementations to choose the behaviour that results in the most optimal program from those choices, assuming they claimed POSIX support.
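Until some standard pins the behaviour down, code that needs one of those outcomes has to test for overflow before it happens. Here is a minimal sketch of the saturating option in portable ISO C; `checked_add` and `saturating_add` are hypothetical helper names, not part of any standard.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: store a + b in *out and return true if the sum
 * fits in an int; return false (leaving *out untouched) otherwise.
 * The range test happens before the addition, so the overflowing
 * operation is never executed.
 */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Hypothetical helper: clamp the sum to INT_MAX or INT_MIN on overflow. */
int saturating_add(int a, int b)
{
    int sum;
    if (checked_add(a, b, &sum))
        return sum;
    return b > 0 ? INT_MAX : INT_MIN;
}
```

A caller that wanted the SIGFPE-style outcome instead could treat a false return from `checked_add` as fatal; the check itself stays the same.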
>
> Note 1 to entry: Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).