realloc() and the oversize importance of zero-size objects

By Jonathan Corbet
October 24, 2024

Small objects can lead to large email threads. In this case, the GNU C Library (glibc) community has been having an extensive debate over the handling of zero-byte allocations. Specifically, what should happen when a program calls realloc() specifying a size of zero? This is, it seems, a topic about which some people, at least, have strong feelings.

The job of realloc() is to change the size of an existing allocation; its prototype is:

    void *realloc(void *ptr, size_t size);

This call will replace the memory object at ptr with a new object of the given size and return a pointer to its replacement; that new object will contain the same data, up to the minimum of the old and new sizes. The old memory will be freed. Calls to realloc() are useful when an allocated block of memory needs to be resized on the fly.
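As a minimal sketch of that contract (the helper names resize_buffer() and resize_preserves_data() are hypothetical, not taken from glibc), the following grows a small buffer and checks that its contents carried over. The size is kept nonzero on purpose, since zero is exactly the case under dispute.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Resize a heap buffer to new_size bytes (nonzero, to stay clear of the
 * contested zero-size case). On failure realloc() returns NULL and the
 * original buffer is left untouched, still owned by the caller. */
char *resize_buffer(char *buf, size_t new_size)
{
    assert(new_size > 0);
    return realloc(buf, new_size);
}

/* Returns 1 if the old contents survived growing the buffer, 0 otherwise. */
int resize_preserves_data(void)
{
    char *buf = malloc(6);
    if (buf == NULL)
        return 0;
    memcpy(buf, "hello", 6);            /* five characters plus the NUL */

    char *bigger = resize_buffer(buf, 64);
    if (bigger == NULL) {
        free(buf);                      /* failed resize: buf is still valid */
        return 0;
    }
    int ok = (strcmp(bigger, "hello") == 0);
    free(bigger);
    return ok;
}
```

Note that the result is stored in a second variable; after a successful call the old pointer value must not be used again.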

An interesting question arises, though, when size is zero. If a program calls malloc() requesting a zero-byte object, the result is (in glibc) well defined: "If size is 0, then malloc() returns a unique pointer value that can later be successfully passed to free()." Needless to say, any attempt to store data using the returned pointer is unlikely to lead to joy, but the caller will get a pointer value back. This appears to be common behavior across malloc() implementations.
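A small sketch of what a caller can and cannot assume here. The function name is hypothetical, and the only properties checked are ones that hold for either outcome the C standard permits (a NULL return, or a pointer that behaves like a nonzero-size allocation that must not be dereferenced).

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if malloc(0) handed back non-NULL pointers, 0 if it returned
 * NULL; both outcomes are permitted by the C standard. */
int malloc_zero_is_nonnull(void)
{
    void *a = malloc(0);
    void *b = malloc(0);
    int nonnull = (a != NULL && b != NULL);

    /* Two live zero-size objects must not share an address. Comparing
     * unrelated pointers for equality is well defined. */
    if (nonnull)
        assert(a != b);

    free(a);    /* free() accepts both NULL and zero-size objects */
    free(b);
    return nonnull;
}
```

On glibc one would expect the function to return 1, matching the documentation quoted above; portable code should be prepared for 0 as well.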

The situation is not quite as clear with realloc() in either the relevant standards or existing implementations; if it is called with a size of zero, it could do any of:

1. Behave like malloc() and return a pointer to a zero-byte object.
2. Behave like free(), release the object, and return NULL.
3. Behave like a spoiled brat, refuse to do anything, store an error number in errno, and return NULL.
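The three possibilities can be told apart at run time with a rough probe like the one below. This is only a sketch: the enum and function names are made up for illustration, the errno test is a heuristic, and calling realloc() with a size of zero is itself something a given C implementation may not define, so the result says something about one particular libc and nothing more.

```c
#include <errno.h>
#include <stdlib.h>

enum realloc_zero_behavior {
    RZ_PROBE_FAILED     = 0,  /* could not even allocate the test object */
    RZ_ZERO_SIZE_OBJECT = 1,  /* option 1: non-NULL pointer returned */
    RZ_FREE_AND_NULL    = 2,  /* option 2: NULL returned, errno untouched */
    RZ_ERROR_AND_NULL   = 3   /* option 3: NULL returned, errno set */
};

enum realloc_zero_behavior probe_realloc_zero(void)
{
    void *p = malloc(16);
    if (p == NULL)
        return RZ_PROBE_FAILED;

    errno = 0;
    void *q = realloc(p, 0);    /* implementation-specific territory */
    if (q != NULL) {
        free(q);                /* a zero-size object we now own */
        return RZ_ZERO_SIZE_OBJECT;
    }
    if (errno != 0) {
        free(p);                /* the call refused: p was not released */
        return RZ_ERROR_AND_NULL;
    }
    return RZ_FREE_AND_NULL;    /* assume p was released by the call */
}
```

Given the behavior described in the article, a glibc system would be expected to report option 2.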

Alejandro Colomar recently brought this question to the glibc development list. His opinion, expressed in unambiguous terms, was that realloc() should behave in the same way as malloc() when passed a size of zero; it should, in other words, free the passed-in object and return a pointer to a zero-byte object. But that is not what glibc's realloc() does; instead, it takes option 2 above: it frees the object and returns NULL. This behavior is not new; it dates back to 1999, when the behavior was changed to align with the C99 standard (or, at least, one reading thereof).

This change, Colomar asserted, should be reverted. That would make glibc conformant to his interpretation of C99 and in line with what the BSD systems do. He copied the discussion widely and rallied some support to this cause. None other than Douglas McIlroy put in an appearance to describe glibc's behavior as "a diabolical invention". Paul Eggert agreed, saying that "one must always be able to shrink an allocation, even if the shrinkage is [to] zero".

In truth, there were few defenders for glibc's behavior, but there is still resistance to changing it. As Joseph Myers pointed out, there are almost certainly programs that rely on the current behavior:

Given that any change in implementation behavior would introduce memory leaks or double frees depending on what an implementation currently does that applications rely on, it's not reasonable to try now to define it more precisely.

Colomar, though, disagreed strongly. Any memory leaks, he said, would be of zero-byte objects and could thus be ignored; Siddhesh Poyarekar pointed out, though, that a zero-byte object still requires the equivalent of two size_t values for tracking, so real memory would be leaked. Colomar also doubted the double-free case; Myers explained how that could come about. While there is disagreement over how many programs would be affected by this sort of subtle change, it is hard to argue that the number would be zero. That is going to make the glibc developers reluctant to change realloc() in this way.

Poyarekar, instead, attempted to find a compromise with a patch adding a new tunable that would allow the behavior of realloc() to be changed while leaving the current behavior as the default. Florian Weimer disliked the patch, saying that any applications that would benefit from the tunable are simply buggy in their current form; he later added that distributions would have to do a lot of work to support this tunable properly. Andreas Schwab argued that a tunable is not appropriate, since a given program's needs will not change at run time; the behavior, if it changes at all, needs to be set at compile time.

Hanging over this whole discussion is another important detail: the C23 standard explicitly defines calling either malloc() or realloc() with a size of zero as undefined behavior. That is a change from C17, which called it implementation defined. Colomar had mentioned that change in his initial message, but dismissed it, saying "let's ignore that mistake". The glibc project cannot entirely ignore the standards it implements, though. So it is not surprising that, among others, Poyarekar also suggested updating the GNU tools and sanitizers to enforce this standard in the long term.

The conversation wound down without any definitive conclusions. One might surmise, though, that a change to the well-documented behavior of realloc() for the last 25 years would be unlikely to happen, even without the C23 change. Instead, glibc will have a hard time resisting the push from the standards committee to eliminate the existence of zero-size objects entirely. At some point, attempting to allocate such an object may well become an error, though this behavior will surely be controlled by a feature-test macro, as usual, to avoid breaking existing programs. Meanwhile, though, chances are good that we will see further ado over the allocation of nothing.

Possible solution

Posted Oct 24, 2024 15:51 UTC (Thu) by beckmi (subscriber, #87001) [Link] (4 responses)

One solution might be to return (void *) -1 for allocations of size 0. free() and realloc() would recognize that value as marking a zero-size allocation rather than a real object and could act accordingly.

Possible solution

Posted Oct 24, 2024 17:28 UTC (Thu) by a3f (subscriber, #106306) [Link] (3 responses)

This would break the (questionable) use of malloc(0) as a way to allocate unique cookies.

Possible solution

Posted Oct 25, 2024 4:26 UTC (Fri) by Baughn (subscriber, #124425) [Link] (2 responses)

Wait, are you allowed to compare pointers from different allocations? I thought that was UB.

Possible solution

Posted Oct 25, 2024 6:01 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (1 responses)

You may compare arbitrary pointers *for equality.* You only invoke UB if you compare them for ordering.

Note however that this assumes the pointers were legitimately obtained in the first place. Pointer types may have trap representations, so you cannot just cast some arbitrary pile of bytes into a pointer and expect everything to work correctly (even if you never dereference it).

Possible solution

Posted Oct 28, 2024 2:45 UTC (Mon) by ianmcc (subscriber, #88379) [Link]

In C++, p < q gives an unspecified result for unrelated pointers p and q. But std::less<T*>()(p, q) is well-defined, and must give a strict total ordering. GCC had a bug on this, since fixed: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78420

Realloc freed the memory long before the C99 standard.

Posted Oct 24, 2024 16:05 UTC (Thu) by Wol (subscriber, #4433) [Link] (4 responses)

Option 2 (free the object and return NULL) predates the C99 standard by quite a lot. The glibc implementation might not ...

I remember reading my Microsoft C v5.1 (from 1991-ish?) where it says realloc with a size of 0 frees the memory and returns NULL.

Imho this is a very good way of assuring safety - if you do not use free, but instead always do "ptr = realloc( ptr, 0)", you will never (absent multiple copies of a pointer) have dangling pointers lying around.

Cheers,
Wol

Realloc freed the memory long before the C99 standard.

Posted Oct 24, 2024 18:05 UTC (Thu) by fman (subscriber, #121579) [Link] (3 responses)

>> 2. Behave like free(), release the object, and return NULL.

And further: How (in this context) is NULL *not* a pointer that can be passed to free().
So in this regard 2) is just a specialization of 1)

Presumably, both 1) and 2) would free() the incoming argument pointer.

Realloc freed the memory long before the C99 standard.

Posted Oct 24, 2024 18:11 UTC (Thu) by randomguy3 (subscriber, #71063) [Link] (2 responses)

i believe the key word is "unique" - NULL is definitely not a unique pointer value!

Realloc freed the memory long before the C99 standard.

Posted Oct 24, 2024 18:12 UTC (Thu) by randomguy3 (subscriber, #71063) [Link]

(looking back, "unique" is not mentioned in 1 in that list, but is mentioned earlier in a quote from glibc's docs)

Realloc freed the memory long before the C99 standard.

Posted Nov 1, 2024 0:50 UTC (Fri) by kelnos (subscriber, #174370) [Link]

A long long time ago, the first time I ever read that, I thought it meant that it would return the single same pointer value every time you called malloc() with a size of 0. But that value was also guaranteed to never be returned for a non zero sized allocation. To me the second bit was the "unique" part. And so returning NULL could be a conformant thing to do. As could (as someone else suggested) returning (void *)-1 every time.

I know no one actually interprets or implements it that way, but to me, that's still a valid reading of the spec.

Single call API for the heap

Posted Oct 24, 2024 17:33 UTC (Thu) by bushdave (guest, #58418) [Link] (6 responses)

I always considered behaviour 2 to be kind of beautiful, since that allows you to define a heap allocator using just a single call:
malloc(n) is just realloc(NULL,n) and free(p) is just realloc(p,0).
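A sketch of that single-call interface, written so that it does not depend on what the underlying realloc() does with a size of zero. The name heap() and the demo function are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single entry point with "behaviour 2" semantics. The zero
 * case is handled here, so realloc() itself never sees a size of zero. */
void *heap(void *p, size_t n)
{
    if (n == 0) {
        free(p);                /* free(NULL) is a harmless no-op */
        return NULL;
    }
    return realloc(p, n);       /* realloc(NULL, n) acts like malloc(n) */
}

/* Usage sketch: allocate, grow, and release through the one call.
 * assert() stands in for real error handling to keep the demo short. */
void heap_demo(void)
{
    char *s = heap(NULL, 4);            /* plays the role of malloc(4) */
    assert(s != NULL);
    memcpy(s, "abc", 4);

    char *t = heap(s, 32);              /* plays the role of realloc(s, 32) */
    assert(t != NULL);
    assert(strcmp(t, "abc") == 0);      /* contents carried over */

    t = heap(t, 0);                     /* plays the role of free(t) */
    assert(t == NULL);
}
```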

But let's say that you want behaviour 1, couldn't realloc(p, 0) just return a singleton pointer that points to at least one readable and writable word? free could check for this singleton and do nothing. There would be no memory leaks (but a tiny performance penalty).
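The singleton idea might look roughly like this; z_realloc(), z_free(), and ZERO_PTR are hypothetical names layered on top of the ordinary allocator, not a proposal for how glibc would do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One shared, statically allocated object stands in for every
 * zero-size allocation. */
static max_align_t zero_object;
#define ZERO_PTR ((void *)&zero_object)

void z_free(void *p)
{
    if (p != ZERO_PTR)          /* the singleton was never malloc()ed */
        free(p);
}

void *z_realloc(void *p, size_t n)
{
    if (n == 0) {
        z_free(p);
        return ZERO_PTR;
    }
    if (p == ZERO_PTR)          /* growing from "empty": nothing to copy */
        p = NULL;
    return realloc(p, n);
}

/* Usage sketch: shrink to zero and grow again without leaking. */
void z_demo(void)
{
    void *empty = z_realloc(NULL, 0);   /* "allocate" zero bytes */
    assert(empty == ZERO_PTR);

    void *grown = z_realloc(empty, 8);  /* grow from empty */
    assert(grown != NULL && grown != ZERO_PTR);

    void *again = z_realloc(grown, 0);  /* shrink back to empty */
    assert(again == ZERO_PTR);
    z_free(again);                      /* no-op for the singleton */
}
```

The price is that every zero-size request yields the same address, so callers lose any uniqueness guarantee for such pointers.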

If I remember correctly there are architectures where loading a NULL pointer into some register is illegal, even if the memory isn't accessed.

Single call API for the heap

Posted Oct 24, 2024 18:14 UTC (Thu) by randomguy3 (subscriber, #71063) [Link] (5 responses)

the glibc docs promise a unique pointer, which i bet someone relies on (eg: using it as a map key)

Single call API for the heap

Posted Oct 25, 2024 6:14 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (4 responses)

For the record, please do not do this. Unique numbers are not necessarily good hash inputs (see for example all prefixes of the Thue-Morse sequence). If you want to make a well-behaved hash table, either use a good implementation with an opaque interface (such as Rust's Hasher type), or actually learn the theory. You cannot just YOLO your way around by XOR'ing arbitrary things with other arbitrary things, and expect to have guaranteed O(1) worst-case performance.

Single call API for the heap

Posted Oct 25, 2024 7:42 UTC (Fri) by SLi (subscriber, #53131) [Link]

I guess that still fits in "use it as a map key". The hashing should be an implementation detail that is ideally not even exposed to the programmer. And there are also maps that are not hashmaps!

Single call API for the heap

Posted Oct 31, 2024 15:56 UTC (Thu) by anton (subscriber, #25547) [Link] (2 responses)

I have never read anything about "good hash inputs". Normally the idea is that a good hash function distributes the inputs into the buckets like a random choice of buckets or better (perfect hashing), for inputs of any characteristics (of course each hash function has input sets that produce worst-case behaviour, but for good hash functions trivial patterns do not form such sets).

I expect that you mean that "good hash inputs" do ok even for bad hash functions. The solution to this problem is to use good hash functions, not to produce "good hash inputs" from realloc(...,0). There is no guarantee of "good hash inputs" for any other stuff that you might throw at the hash function, including other uses of realloc().

Single call API for the heap

Posted Oct 31, 2024 17:56 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (1 responses)

I suppose my real concern is that some people are going to use the modulus operator as their "hashing algorithm," under the (false) assumption that malloc(0) is required to give uniformly-distributed pointers (and because C does not come with a hash table implementation in its stdlib, so whatever hash implementation you come up with is probably not going to be a best-in-class implementation).
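To make the concern concrete, here is a sketch contrasting the bare modulus with a version that mixes the bits first. The function names are hypothetical; the mixing step is the 64-bit finalizer from MurmurHash3, swapped in here purely as an example of a reasonable integer mixer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Naive "hash": the pointer value modulo the bucket count. Allocators
 * typically return pointers aligned to 8 or 16 bytes, so with a small
 * power-of-two bucket count every pointer lands in bucket 0. */
size_t bucket_naive(const void *p, size_t nbuckets)
{
    assert(nbuckets > 0);
    return (size_t)((uintptr_t)p % nbuckets);
}

/* Mix the bits first (MurmurHash3's 64-bit finalizer), then reduce. */
size_t bucket_mixed(const void *p, size_t nbuckets)
{
    assert(nbuckets > 0);
    uint64_t h = (uint64_t)(uintptr_t)p;
    h ^= h >> 33;
    h *= UINT64_C(0xff51afd7ed558ccd);
    h ^= h >> 33;
    h *= UINT64_C(0xc4ceb9fe1a85ec53);
    h ^= h >> 33;
    return (size_t)(h % nbuckets);
}
```

With eight buckets, bucket_naive() sends every 8-byte-aligned pointer to bucket 0, while bucket_mixed() lets the high bits influence the result.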

Single call API for the heap

Posted Nov 1, 2024 12:04 UTC (Fri) by paulj (subscriber, #341) [Link]

The Jenkins hash is easy to pull from the Linux source code, and more than good enough for non-security-sensitive, performance orientated, contexts.

malloc(0)

Posted Oct 24, 2024 17:43 UTC (Thu) by mjw (subscriber, #16740) [Link] (4 responses)

The article states: "the C23 standard explicitly defines calling either malloc() or realloc() with a size of zero as undefined behavior."

I thought it was just realloc with size zero that is now undefined behaviour.
At least the last public draft I can find only mentions something about size being zero for realloc, but not for malloc.
https://open-std.org/JTC1/SC22/WG14/www/docs/n3301.pdf
But it isn't very consistent.

7.24.4.1 talks generically about malloc, realloc, aligned_alloc and calloc saying "If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned to indicate an error, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object."

7.24.4.7 talks about malloc and doesn't mention size zero at all, just saying "The malloc function allocates space for an object whose size is specified by size and whose representation is indeterminate. The malloc function returns either a null pointer or a pointer to the allocated space".

7.24.4.8 talks about realloc and says "If ptr is a null pointer, the realloc function behaves like the malloc function for the specified size. Otherwise, if ptr does not match a pointer earlier returned by a memory management function, or if the space has been deallocated by a call to the free or realloc function, or if the size is zero, the behavior is undefined."

Which seems to me to imply only realloc (p, 0) is undefined behaviour unless p is NULL, then it behaves like malloc(0). But it isn't specified what malloc(0) means.

malloc(0)

Posted Oct 24, 2024 20:37 UTC (Thu) by Wol (subscriber, #4433) [Link] (2 responses)

> The article states: "the C23 standard explicitly defines calling either malloc() or realloc() with a size of zero as undefined behavior."

OMG. More UB.

Why can't they just state it's implementation defined, and should have flags to select the desired behaviour from previous standards/implementations if it's not the default.

(That leaves compiler writers free to say "well we are going to treat it as undefined unless you tell us otherwise", but the option of just saying "it's UB, we don't care if you rely on previous behavior" is NOT available.)

Cheers,
Wol

malloc(0)

Posted Oct 25, 2024 6:13 UTC (Fri) by fw (subscriber, #26023) [Link] (1 responses)

If it's implementation-defined, an implementation can still document it as undefined. If the standard says it's undefined, an implementation can document specific behavior for that implementation. The difference between the two approaches is very minor.

malloc(0)

Posted Oct 25, 2024 10:03 UTC (Fri) by khim (subscriber, #9252) [Link]

> If it's implementation-defined, an implementation can still document it as undefined.

No, that's not allowed. Only the standard can declare that something is undefined.

But the implementation is permitted to define a few different outcomes and declare that one of them is chosen depending on the phase of the price of papaya in Honolulu. The trick is that all possible outcomes have to be listed… and it appears that this is precisely what happens with realloc, right?

> The difference between the two approaches is very minor.

It's exactly why we have this mess on our hands: standard writers perceive that difference as something “minor” when in reality it's as big as day and night.

A program that triggers implementation-defined behavior is still a valid C program, and you can work out what it does by reading the documentation (although the documentation may say that a few different outcomes are possible). And after reading the documentation for a particular implementation you always know whether you need to fix that code or whether it's already working in a fashion that satisfies you.

A program that triggers undefined behavior is an invalid C program, and the only recourse is to rewrite it. Or beg the implementation to define it.

But leaving code unchanged when compiler developers are not cooperating is not an option: your program may misbehave even in parts that precede the point where undefined behavior is triggered!

malloc(0)

Posted Oct 25, 2024 5:48 UTC (Fri) by CChittleborough (subscriber, #60775) [Link]

My reading is that 7.24.4.1 'wins' over 7.24.4.7.

So the official standard says that malloc(0) has Implementation-Defined Behavior, and can return either NULL or a non-null pointer that should never be dereferenced.

Working out which of those is better is the hard part here.

Special allocator for zero-sized blocks

Posted Oct 24, 2024 18:14 UTC (Thu) by jepler (subscriber, #105975) [Link] (4 responses)

> Siddhesh Poyarekar pointed out, though, that a zero-byte object still requires the equivalent of two size_t values for tracking, so real memory would be leaked

A special allocator could be used for zero-sized blocks, especially on 64-bit systems where you should be able to make do with 1/128th as much storage as "two size_t values":

You'd set aside a big swath of addresses for all your zero-sized blocks (never mapped) and then track which ones are allocated with a single bit per allocation.
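A toy version of that scheme, with one bit of bookkeeping per zero-size block. Everything here is hypothetical: the names, the fixed capacity, and in particular the static array, which stands in for the never-mapped address range described above (a real allocator would reserve address space without backing memory).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define ZSLOTS 1024                     /* hypothetical capacity */

static char zregion[ZSLOTS];            /* only the addresses are used */
static unsigned char zbitmap[ZSLOTS / CHAR_BIT];   /* one bit per address */

/* Hand out the lowest unused address, or NULL if all are taken. */
void *zalloc(void)
{
    for (size_t i = 0; i < ZSLOTS; i++) {
        unsigned char mask = (unsigned char)(1u << (i % CHAR_BIT));
        if (!(zbitmap[i / CHAR_BIT] & mask)) {
            zbitmap[i / CHAR_BIT] |= mask;
            return &zregion[i];
        }
    }
    return NULL;
}

/* Returns 1 if p was a live zero-size block and has now been released. */
int zfree(void *p)
{
    /* Compare as integers: ordering unrelated pointers directly is
     * undefined behavior in C. */
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)zregion;
    if (addr < base || addr - base >= ZSLOTS)
        return 0;                       /* not one of ours */
    size_t i = (size_t)(addr - base);
    unsigned char mask = (unsigned char)(1u << (i % CHAR_BIT));
    if (!(zbitmap[i / CHAR_BIT] & mask))
        return 0;                       /* not currently allocated */
    zbitmap[i / CHAR_BIT] &= (unsigned char)~mask;
    return 1;
}

/* Usage sketch: unique addresses, detected double release, reuse. */
void zdemo(void)
{
    void *a = zalloc();
    void *b = zalloc();
    assert(a != NULL && b != NULL && a != b);
    assert(zfree(a) == 1);
    assert(zfree(a) == 0);              /* second release is rejected */
    void *c = zalloc();
    assert(c == a);                     /* lowest free slot is reused */
    assert(zfree(b) == 1);
    assert(zfree(c) == 1);
}
```

The linear scan is the simplest possible search; it is there to show the bookkeeping cost, not to suggest a performant design.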

Special allocator for zero-sized blocks

Posted Oct 25, 2024 6:23 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (2 responses)

The question is how much code is realistically going to rely on this behavior. Is it worth the complexity to support?

Special allocator for zero-sized blocks

Posted Oct 25, 2024 12:58 UTC (Fri) by Heretic_Blacksheep (guest, #169992) [Link]

The only way to know the answer to such a question is to make the change (either for real or in simulation) then find out over time just how many programs trigger the behavior via some kind of feedback mechanism.

As an aside, this is where FOSS often falls flat. There's very little intelligence into what programs use which function of a particular library stack until someone on a very big project weighs in that a functionY() change last month broke featureZ. Then you get arguments over how many other programs have problems ranging from "none" to "many" because no one really knows for sure, and not all users of a library are going to speak up nor do they all use the same subset of functions.

Special allocator for zero-sized blocks

Posted Oct 30, 2024 19:52 UTC (Wed) by dgm (subscriber, #49227) [Link]

I'm not sure I understand you correctly, but if you mean that the cost of updating 25+ years' worth of code is not worth what is, in essence, an aesthetic change, then I have to concur.

Special allocator for zero-sized blocks

Posted Oct 25, 2024 14:20 UTC (Fri) by adobriyan (subscriber, #30858) [Link]

On x86_64 you could approximate this by maintaining an internal "pointer" into the non-canonical part of the address space, so that most dereferences are trivially caught by the MMU. Of course, language layers will be upset.

Zero is a number just like any other number

Posted Oct 24, 2024 19:26 UTC (Thu) by jem (subscriber, #24231) [Link] (14 responses)

I don't understand why a size of zero should be a special case. Realloc can enlarge or shrink the allocated block, and you can call realloc with a pointer returned by realloc. The size of the (re)allocated block is in most cases the result of a computation. If a size of zero has a different meaning, then you have to add a check for it in the caller and handle it differently.

>His opinion, expressed in unambiguous terms, was that realloc() should behave in the same way as malloc() when passed a size of zero; it should, in other words, free the passed-in object and return a pointer to a zero-byte object.

In my opinion, realloc(ptr, 0) should logically behave in the same way as realloc(ptr, 1), except that in the second case there is a single byte left instead of zero bytes.
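One way to get close to those semantics today, without depending on what a particular libc does for size zero, is to never pass zero at all. The wrapper below is a hypothetical sketch that serves a zero-byte request as a one-byte request, so the caller always holds a live block that can be resized or freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: a request for zero bytes is served as a request
 * for one byte. Returns NULL only on allocation failure, in which case
 * the old block is left intact. */
void *realloc_nonzero(void *ptr, size_t size)
{
    return realloc(ptr, size != 0 ? size : 1);
}

/* Usage sketch: shrink to "nothing", then grow again. */
void realloc_nonzero_demo(void)
{
    char *p = realloc_nonzero(NULL, 8);
    assert(p != NULL);

    char *empty = realloc_nonzero(p, 0);        /* logically zero bytes */
    assert(empty != NULL);                      /* still a live block */

    char *grown = realloc_nonzero(empty, 16);   /* and it can grow back */
    assert(grown != NULL);
    free(grown);
}
```

The caller is still expected to treat the "empty" block as holding no usable bytes; the extra byte is an implementation detail of the sketch.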

If malloc can take a size argument of zero, returning an object with a zero sized block that can be realloc'd, why do we have the asymmetry that a realloc(ptr, 0) can't return a zero sized block (that could be realloc'd back to some larger size)?

Using realloc(ptr, 0) as a substitute for free is an ugly hack. We already have the free function.

Zero is a number just like any other number

Posted Oct 25, 2024 7:47 UTC (Fri) by taladar (subscriber, #68407) [Link] (11 responses)

Agreed. The existing behavior where realloc acts like free also means that the lifetime of the actual pointer ends prematurely which is a special case that I bet isn't checked, especially not in the cases where the size is the result of a calculation and realloc is called on 0 sized results anyway.

Zero is a number just like any other number

Posted Oct 25, 2024 9:53 UTC (Fri) by Wol (subscriber, #4433) [Link] (7 responses)

> The existing behavior where realloc acts like free also means that the lifetime of the actual pointer ends prematurely

But the behaviour of free is that the lifetime of the pointer outlives what it's pointing to ...

(And I don't understand your argument - realloc will destroy the contents then the assignment destroys the pointer - how is post-hoc destruction premature?)

"ptr = realloc(ptr, 0)" is the only syntax where the destruction both of the pointer, and its contents, APPEARS to be atomic from the PoV of the caller.

Otherwise it's multiple operations which can get forgotten, separated, {}-errored, etc etc.

Cheers,
Wol

Zero is a number just like any other number

Posted Oct 25, 2024 10:05 UTC (Fri) by mb (subscriber, #50428) [Link] (6 responses)

Just because you write multiple operations (realloc + pointer overwriting) on the same line doesn't make them any less divisible by accident.

And the pattern ptr = realloc(ptr, ...) is dangerous (memory leaks), if your size can be non-zero. This pattern should not be encouraged.

The real solution is to avoid using C altogether and switch to a sane language with a sane allocator.

Zero is a number just like any other number

Posted Oct 25, 2024 11:12 UTC (Fri) by Wol (subscriber, #4433) [Link] (4 responses)

> Only because you write multiple operations (realloc + pointer overwriting) in the same line doesn't make them less dividable by accident.

How do you chop an atomic statement in half?

> And the pattern ptr = realloc(ptr, ...) is dangerous (memory leaks), if your size can be non-zero. This pattern should not be encouraged.

You should not encourage the CORRECT use of realloc? You should know whether you want to re-use ptr or not, and you should know what sort of object you're going to put there - if the answer is "nothing" then allocate zero space. You're advocating careless programming ... you'll say I'm exaggerating, but taking your argument to its extreme it sounds like "don't bother freeing ptr, you don't know if you're going to re-use it", which is probably worse on the memory leak front ...

> The real solution is to avoid using C altogether and switch to a sane language with a sane allocator.

Agreed :-)

Zero is a number just like any other number

Posted Oct 25, 2024 12:00 UTC (Fri) by khim (subscriber, #9252) [Link]

> How do you chop an atomic statement in half?

Easy: p = realloc(q, 0);. Bam: your pointer is no longer clobbered and can be happily reused.

That approach was even tried in Go (where proper way to append something to the slice is s = append(s, t); and not just simply append(s, t); or s.append(t);) and causes nothing but grief.

> You should not encourage the CORRECT use of realloc?

It would have been “correct” if it was taught that way from the beginning (preferably in the first edition of K&R).

But, alas, in our world that hasn't happened, thus it falls fully into “very non-standard and unusual non-portable code that some weirdos try to promote for no good reason”.

A typical C programmer, in today's world, wouldn't even know that ptr = realloc(ptr, 0); is supposed to free memory and would look, in vain, for free, which would lead not to greater safety, but to greater confusion.

> You should know whether you want to re-use ptr or not

And you shouldn't make any other mistakes, either. Which means that such code is useless for real programmers who are not taught to use it, and it's also useless for imaginary “perfect” programmers who never make mistakes. Who could benefit from it, then?

> You're advocating careless programming

Nope. He (and I, too) advocate familiar programming that's based on the tools that we have today.

You are imagining some alternate world, where C is different, C programmers are different and realloc is different, too!

Sorry, but it's really too late to push for some new C and new set of C programmers and new C standard library that could, collectively, embrace that “brave new way” of dealing with memory allocations.

Zero is a number just like any other number

Posted Oct 25, 2024 12:40 UTC (Fri) by mb (subscriber, #50428) [Link] (2 responses)

> How do you chop an atomic statement in half?

It's not atomic.

> You should not encourage the CORRECT use of realloc?

No. It's usually incorrect to overwrite the pointer with the return value of realloc before checking for NULL return. If realloc fails to allocate, it does not free the pointer and returns NULL. If you overwrite your only pointer with the NULL return, you have leaked the original allocation.
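The usual way around that leak is to hold the result in a temporary and only overwrite the original pointer on success. A minimal sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdlib.h>

/* Resize *bufp to new_size bytes without losing the original on failure.
 * Returns 0 on success, -1 if realloc() failed (*bufp is then unchanged
 * and still owned by the caller). */
int resize_in_place(void **bufp, size_t new_size)
{
    assert(bufp != NULL && new_size > 0);
    void *tmp = realloc(*bufp, new_size);
    if (tmp == NULL)
        return -1;          /* old block still valid, nothing leaked */
    *bufp = tmp;
    return 0;
}
```

A caller would write something like `if (resize_in_place(&buf, n) != 0) { /* handle error; buf is intact */ }`.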

Zero is a number just like any other number

Posted Oct 25, 2024 18:27 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> If you overwrite your only pointer with the NULL return, you have leaked the original allocation.

Sure, but if you want not to allocate memory, but to free it and that fails… what are the mitigations? At this point I would assume that allocator would just stop the program because it's probably the best response that it could do.

And in some alternate reality where this behavior is mandated… and realloc was required to behave like that… and all C courses would have taught you to use that ability… yes, in such a world, use of realloc would have been justified.

I would argue that even in that world it would have been bad design, but familiarity of the pattern would have made it justifiable.

But inventing and pushing new convention like that in our world? That's just… I don't even know strong enough words to describe what I think about that idea.

Zero is a number just like any other number

Posted Oct 25, 2024 18:30 UTC (Fri) by mb (subscriber, #50428) [Link]

>but if you want not to allocate memory, but to free it and that fails

I commented on something else.

I was just saying that ptr = realloc(ptr, ...) was a bad pattern, because it's wrong for all cases *except* the free/zero case (maybe; implementation defined; if not UB).

Zero is a number just like any other number

Posted Oct 31, 2024 5:33 UTC (Thu) by milesrout (subscriber, #126894) [Link]

there is no rule that you have to use malloc when you write C. Plenty of people use "sane" allocators in C.

this has nothing to do with the fact p = realloc(p,...) is erroneous. Of course it is wrong! It is obviously nonsensical rubbish code with any allocator. If you cant get this basic stuff right then i dont think a "sane" allocator would save you from writing hundreds of other serious bugs in your program

Zero is a number just like any other number

Posted Oct 25, 2024 10:12 UTC (Fri) by khim (subscriber, #9252) [Link] (2 responses)

> The existing behavior where realloc acts like free also means that the lifetime of the actual pointer ends prematurely which is a special case that I bet isn't checked

After a call to realloc the pointer is no longer usable in any case; I don't see where you see the special case.

Zero is a number just like any other number

Posted Oct 25, 2024 11:04 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

The "special case" is called "use after free".

    #include <stdlib.h>

    int main(void)
    {
        double *ptr = malloc(sizeof *ptr);
        free(ptr);
        *ptr = 9.6;    /* use after free */
        return 0;
    }

My C is rusty, but that will access what ptr USED TO point at, and (possibly) do a load of damage?

Whereas
ptr = realloc(ptr, 0);
using the "returns null" definition will in most circumstances trap and cause a run-time error "dereferencing null pointer"?

The point is that, from the PoV of the calling function, realloc ATOMICally destroys BOTH what is pointed at, and what is doing the pointing. Absent multiple copies, you don't get dangling pointers lying around to cause "use after free" problems.

And it's easily enforced with a couple of rules, that I guess are easy to write in a linter - "don't use free; never ignore the return from realloc".

Cheers,
Wol

Zero is a number just like any other number

Posted Oct 25, 2024 11:43 UTC (Fri) by khim (subscriber, #9252) [Link]

> using the "returns null" definition will in most circumstances trap and cause a run-time error "dereferencing null pointer"?

It may or may not do that. In fact, in a world where *ptr = 9.6; means “I solemnly swear that ptr is not NULL” and ptr = realloc(ptr, 0); means “I solemnly swear that ptr would be NULL”, this could be “interpreted” by the compiler in a very radical manner with very non-obvious consequences.

> And it's easily enforced with a couple of rules, that I guess are easy to write in a linter - "don't use free; never ignore the return from realloc".

Which is still a much more convoluted and contrived way than just “use FREE macro” that is defined as ({typeof(&p) addr_of_p = &p; free(*addr_of_p); *addr_of_p = NULL;}).
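The macro quoted above relies on GNU C extensions (statement expressions and typeof) so that its argument is evaluated only once. A portable variant, offered here only as a sketch, gives up that property:

```c
#include <assert.h>
#include <stdlib.h>

/* Free the block and clear the pointer in one statement. Portable variant:
 * the argument must be a modifiable lvalue and is evaluated twice, so it
 * must not have side effects (no FREE(*cursor++)). */
#define FREE(p) do { free(p); (p) = NULL; } while (0)

/* Usage sketch. */
void free_demo(void)
{
    int *v = malloc(sizeof *v);
    FREE(v);
    assert(v == NULL);      /* no dangling pointer left behind */
    FREE(v);                /* harmless: free(NULL) does nothing */
    assert(v == NULL);
}
```

As with the realloc-based idiom, this only clears the one pointer named in the call; other copies of the same address are left dangling.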

I agree that in some alternate reality where C would have developed in some other direction than how it happened in our world realloc used in this fashion can aid safety, but in the mess that we have now, today? Nope, it would just make everything worse.

Note that free is already defined as a function that invalidates the pointer, and the compiler can be taught to detect these situations statically in simple cases, while in more complicated cases use-after-free comes not from the pointer directly passed to free but from an entirely different pointer stashed in some different struct.

Zero is a number just like any other number

Posted Oct 25, 2024 8:08 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

> Using realloc(ptr, 0) as a substitute for free is an ugly hack. We already have the free function.

But they're not identical. realloc is a function, free is a subroutine. From the point of view of the caller, wiping the pointer with realloc is an atomic operation, freeing then wiping is two actions which could get separated (or forgotten).

So what if realloc is a hack. It's also an aid to memory safety, and probably far easier to lint.

Cheers,
Wol

Zero is a number just like any other number

Posted Oct 25, 2024 10:31 UTC (Fri) by khim (subscriber, #9252) [Link]

> It's also an aid to memory safety

As it's currently defined it's far from an aid to anything. Notice how it invalidates some pointers that are not even arguments to realloc!

> and probably far easier to lint

The simples rule to enforce: flag any use of realloc and ask developer to stop doing it. Presto: problem solved.

It may not be the most satisfying solution and in some alternate world realloc would behave differently but after the mess that was already created around it any attempt to use it for safety is a fools errand.

You may never make different people agree on how these realloc-using programs have to behave.

From a safety POV, the simplest resolution that a hypothetical C50 could adopt is to replace all that mess with “a call to realloc is undefined behavior”. That would be shorter, simpler… and wouldn't affect all that many programs, in reality.

I can understand the different views.

Posted Oct 24, 2024 21:09 UTC (Thu) by jd (guest, #26381) [Link] (1 responses)

Option 1 makes sense, in that there needs to be something that defines not only the allocation but any other attributes that might be associated with it. There might be several such metadata structures in some operating systems.

If realloc can only change, but not create or destroy, the metadata, then you've a performance gain. You can have a pool of, say, buffers that get realloced to zero then realloced to the necessary size when needed. You'd expect this to be faster (less work), but also more predictable (since all memory that's free can be allocated, rather than all minus the space needed for housekeeping).

I can imagine that'd be good for HPC and embedded, but you'd also expect such software to use replacement mallocs more suited to such work.

Option 2 makes sense too, because a valid result from malloc and realloc should be usable. It's one thing if there's an error, but if it's successful, the pointer you have should work. You can't validate the case where an operation returns a pointer but the pointer is unusable and there's no exception handling in C.

However, that would break the semantics for malloc, which is a Bad Thing, but if you want to improve memory safety then you've got to have safe behaviour.

Which leads to Option 3, where you return an error saying the operation doesn't make sense.

There's an alternative approach: remove memory allocation from glibc and put it in a library glibc links to.

Don't bother with conventional tunables, since you don't want degraded performance and more opportunities for bugs, simply have a bank of malloc implementations, where glibc's malloc is the default.

People are going to replace the memory allocator if they want different semantics anyway. Moving malloc outside reduces overhead if they're going to do that and there's no new code so no room for new defects.

Precisely because the memory allocator can be replaced already, even dynamically rather than at compile time, there's only limited scope for arguing the behaviour should be decided at compile time. That isn't, after all, how it's done at present.

I can understand the different views.

Posted Oct 25, 2024 7:48 UTC (Fri) by taladar (subscriber, #68407) [Link]

The behavior needs to be known at compile time though since the error handling and further use of that pointer needs to be different for the different behaviors.

"unique" is the problem here

Posted Oct 25, 2024 0:31 UTC (Fri) by josh (subscriber, #17465) [Link]

If you drop the requirement that malloc(0) return a *unique* zero-sized pointer, then you no longer need to *track* the pointer you return. Return (void *)1 or (void *)-1 or a similar never-valid non-NULL pointer, and have free of that pointer successfully do nothing. realloc to size 0 can similarly return that pointer, so things expecting a non-NULL return are satisfied, and things expecting to free memory that way are *also* satisfied.
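A minimal sketch of that idea, written as a wrapper layer rather than a change to the allocator itself. The names `z_malloc`, `z_free` and `z_realloc` are hypothetical, and the sketch uses the address of a static object as the shared non-NULL marker instead of (void *)1, which is an assumption made to stay clear of implementation-defined integer-to-pointer casts.

```c
#include <stdlib.h>

/* One shared, never-dereferenced object whose address stands in for
 * every zero-size allocation (an assumption replacing (void *)1). */
static char zero_size_sentinel;
#define ZERO_PTR ((void *)&zero_size_sentinel)

void *z_malloc(size_t size)
{
    return size == 0 ? ZERO_PTR : malloc(size);
}

void z_free(void *ptr)
{
    if (ptr != ZERO_PTR)   /* freeing the sentinel is a successful no-op */
        free(ptr);         /* free(NULL) remains a no-op too */
}

void *z_realloc(void *ptr, size_t size)
{
    if (size == 0) {       /* shrink to nothing: release, hand out sentinel */
        z_free(ptr);
        return ZERO_PTR;
    }
    if (ptr == ZERO_PTR)   /* growing from the sentinel is a fresh allocation */
        ptr = NULL;
    return realloc(ptr, size);
}
```

With this layer a zero-size request can never fail and never needs bookkeeping, at the cost that two zero-size "objects" compare equal.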

We faced exactly this problem years ago

Posted Oct 25, 2024 3:33 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (1 responses)

Years ago, when implementing Lua support in haproxy, we found that their allocator matched exactly our understanding of realloc() based on the man page on Linux (i.e. option 2).

Then some time later, some users noticed memory leaks on Alpine (using musl, not glibc). That's when we
discovered that POSIX doesn't mandate completely freeing the area. So we had to put in an explicit check
and forcefully free when the size is zero, otherwise call realloc(), and put a big fat warning on top of it.
It suddenly became much less elegant but now it works.

The conclusion of this is that changing this behavior *WILL* break existing applications that were never
tested outside glibc. However, having a tunable to test for that for portability purposes would be great.
And maybe 20 years from now the default setting of the tunable could change.

Another feature which would be great would be to easily access a counter of zero-sized objects so that
applications can be instrumented to detect their use of this malloc() feature of unique return pointers,
because it might come with lib dependencies and is hard to detect (due to the slow leak). Even a tunable
causing an abort on zero-sized objects would help locate the rare call points.
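The explicit check described above might look roughly like the following. This is a sketch only: the comment does not show haproxy's actual code, so the function name `sketch_alloc` is hypothetical, and the four-argument shape is modeled on the lua_Alloc callback signature from the Lua reference manual.

```c
#include <stdlib.h>

/* WARNING: do not "simplify" this to a bare realloc(ptr, nsize).
 * realloc(ptr, 0) frees and returns NULL on glibc, but other C
 * libraries may return a live zero-size object instead, which then
 * leaks slowly if the caller treats NULL-or-not as "freed". */
void *sketch_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    (void)ud;      /* opaque user data, unused in this sketch */
    (void)osize;   /* old size, unused in this sketch */

    if (nsize == 0) {
        free(ptr); /* explicit release, independent of the libc in use */
        return NULL;
    }
    return realloc(ptr, nsize);
}
```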

We faced exactly this problem years ago

Posted Oct 25, 2024 12:15 UTC (Fri) by kleptog (subscriber, #1183) [Link]

> The conclusion of this is that changing this behavior *WILL* break existing applications that were never tested outside glibc.

ISTM that the benefit of changing this after 20 years is negative. All existing code has been written and debugged with the existing behaviour, changing it can only add new bugs.

It would be different if C had a glorious future and lots more was going to be coded in it, but that's not the case. The vast majority of code in the future is going to be written in languages that don't use malloc/realloc directly, so this discussion is as useful as rearranging deck-chairs.

Ignoring undefined behaviour

Posted Oct 25, 2024 13:23 UTC (Fri) by quietbritishjim (subscriber, #114117) [Link] (18 responses)

> Hanging over this whole discussion is another important detail: the C23 standard explicitly defines calling either malloc() or realloc() with a size of zero as undefined behavior. That is a change from C17, which called it implementation defined. Colomar had mentioned that change in his initial message, but dismissed it, saying "let's ignore that mistake". The glibc project cannot entirely ignore the standards it implements, though.

Perhaps I misunderstood, but to me "... cannot entirely ignore the standards ... " seems to imply that code with "undefined behaviour" must crash and burn in the most random and confusing way possible. That's not so. An implementation is free to define a meaning to some code that is undefined according to the standard – that is still standard compliant since, by definition, the standard does not define what that code should do.

Ignoring undefined behaviour

Posted Oct 25, 2024 13:41 UTC (Fri) by magfr (subscriber, #16052) [Link] (17 responses)

I suspect that is the real reason they did decide that it is undefined behaviour.

We have two worlds of similar size where one uses interpretation 1) and the other uses interpretation 2).

Neither is likely to change their interpretation as that would mess up for their users.

In this case the only thing a standard can do is declare it UB in portable programs as fifty years of IDB have failed to produce a winner.

I suppose one could argue that something like IDBsan is needed as a companion to UBsan.

Ignoring undefined behaviour

Posted Oct 25, 2024 13:57 UTC (Fri) by smcv (subscriber, #53363) [Link] (10 responses)

Making something UB is a very strong statement, though: it's saying that any existing code that does that thing is simply wrong (perhaps retroactively!), and whenever that existing code is recompiled, compilers are allowed (perhaps even encouraged) to emit binaries that only do what the developer presumably intended in situations where the UB cannot happen.

The most pathological case is that if the compiler can prove that a program will call realloc(p, 0) for some non-null pointer p, it is allowed to "optimize" the whole program into a binary containing 0 instructions, because according to the standard that's just as valid as any other result. The result is a very efficient program (the compiler has very successfully optimized for both code size and execution speed), but it's unlikely to be what anyone wanted or expected.

Compare with IDB: if the result of an operation is implementation-defined, then the compiler does need to implement some sort of result for it, and cannot just delete the rest of your program.

Ignoring undefined behaviour

Posted Oct 25, 2024 14:13 UTC (Fri) by farnz (subscriber, #17727) [Link] (7 responses)

From the standards committee's point of view, making it UB makes sense; there is no common behaviour that all implementations must provide (bear in mind that for both implementation-defined and unspecified behaviour, the standard is allowed to give you "guard rails" on what behaviour is allowed such that a strictly conforming program can depend on the guard rails, and a conforming program can depend on implementation specifics), and thus there's no significant difference between implementation defined (can't be used in strictly conforming programs, can be used in a conforming program) and undefined (can't be used in strictly conforming programs, can be used in a conforming program).

If there were downstream standards like POSIX defining behaviours that ISO deems out-of-scope, this would be less of a big deal. But because we're in a situation where the downstream standards don't define more than ISO, it's a problem.

Ignoring undefined behaviour

Posted Oct 27, 2024 18:10 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (6 responses)

I have to say that I have gotten very exhausted with the C standards committee's motte-and-bailey fallacy surrounding UB. They really need to pick one of these positions and hold it consistently:

* Programs that perform UB are wrong. Compilers can and should optimize under the assumption that UB never occurs.
* Programs that perform UB are not wrong, UB just means that the standards committee does not know or care exactly what will happen in those cases due to portability considerations. Non-normatively, an implementation which makes nasal-demon optimizations is a poor quality implementation despite being technically standards-conforming.

Right now, the committee wants to have its cake and eat it too, but the resulting constellation of UB does not make coherent sense as a whole.

Ignoring undefined behaviour

Posted Oct 28, 2024 8:43 UTC (Mon) by farnz (subscriber, #17727) [Link] (5 responses)

I've never seen the first position from the C standards committee, only from compiler maintainers. The committee has consistently held the second position, IME, with an extension of "and a downstream standard such as POSIX can specify things that we don't; after all, POSIX already specifies that CHAR_BIT must be 8".

Ignoring undefined behaviour

Posted Oct 28, 2024 19:27 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (4 responses)

Compiler maintainers are the main (if not only) group of people who tell the standards committee what they do and do not want to see in the next iteration of the standard. They may technically be two different groups of people, but they move and act in concert, and I do not think that distinguishing between them adds much expository value to our model of why the standards committee does things.

Ignoring undefined behaviour

Posted Oct 29, 2024 9:22 UTC (Tue) by farnz (subscriber, #17727) [Link] (3 responses)

I disagree deeply; the people I see arguing that compilers should only care about the ISO standard are a disjoint group from those who say that the standard should expect downstreams to define more behaviour than ISO does. It makes a huge difference, because it's a minority group, who just happen to have positions of power w.r.t. open source C compilers - note that many of the proprietary C compilers don't make the same arguments around "ISO says it's OK" - and your position is a lot like saying that because you hold an opinion, Google as your employer must agree, and that it's unhelpful to view your opinions as separate from Google's.

In practice, if enough of the people who care were to get involved with LLVM and GCC maintenance, and write and enforce documentation for what LLVM and GCC do when ISO says "UB", "IFNDR", "US", and similar terms for "ISO doesn't have a view here", it'd stop being an issue. But that would involve a bunch of people who aren't interested in compilers taking control of compiler projects, and this is an underlying weakness of open source - only people who are genuinely interested in something tend to take control of that thing.

Ignoring undefined behaviour

Posted Oct 31, 2024 18:00 UTC (Thu) by anton (subscriber, #25547) [Link] (1 responses)

Some of us actually work on and publish about compilers (albeit not C compilers), so your slander is just that.

Concerning your suggestion to get involved with LLVM and GCC maintenance, these are big projects with a lot of paid-for participants who have agreed on certain goals and evaluation methods, and these agreements have led to the current situation. It is unlikely that one or a few volunteers can change the course of these big projects in a way that conflicts with the value system of the paid participants. Even a contribution that did not conflict with that value system was ignored (and the authors of that contribution also presented at a GCC Developer's summit).

Ignoring undefined behaviour

Posted Nov 1, 2024 9:22 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> It is unlikely that one or a few volunteers can change the course of these big projects in a way that conflicts with the value system of the paid participants.

FWIW, LLVM has a (new) community code ownership policy[1] and is actively seeking[2] community members to participate. You're unlikely to change minds if you conflict about something from the paying entities, but it is possible to offer arguments[3] that end up nudging things in the right direction[4] (even if there's a lack of explicit acknowledgement).

[1] https://discourse.llvm.org/t/rfc-proposing-changes-to-the...
[2] https://discourse.llvm.org/t/calling-all-volunteers/82817
[3] https://github.com/bazelbuild/bazel/pull/19940#issuecomme...
[4] https://github.com/bazelbuild/bazel/pull/19940#issuecomme...

Ignoring undefined behaviour

Posted Oct 31, 2024 18:02 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

OK, fine, let's analyze them as two groups:

1. The GCC/Clang people have publicly stated, in many different fora, for many years, that they will interpret UB as license to do whatever they want. GCC and Clang are also the two most popular compilers in practical use.
2. The committee ignores (1) and continues to designate things as UB which probably should not be treated in this manner, and then insists that they are not at fault when the compiler writers do exactly what they publicly said they were going to do.

That's hardly better.

Ignoring undefined behaviour

Posted Oct 31, 2024 6:05 UTC (Thu) by milesrout (subscriber, #126894) [Link] (1 responses)

>Making something UB is a very strong statement, though: it's saying that any existing code that does that thing is simply wrong (perhaps retroactively!), and whenever that existing code is recompiled, compilers are allowed (perhaps even encouraged) to emit binaries that only do what the developer presumably intended in situations where the UB cannot happen.

This is NOT true. It is disinformation spread by malicious compiler developers.

Code with behaviour that is not explicitly defined by the C standard is not necessarily portable to arbitrary implementations of C. There is no reason at all to think that its behaviour is not defined by something other than the C standard, such as:

- the compiler documentation
- basic common sense
- obvious authorial intent
- platform documentation
- historical precedent
- behaviour implied by the standard
- behaviour so obvious to the standard's authors it didn't occur to them that it needed to be specified
- the fact that an implementation treating said code as erroneous would be useless and so no reasonable implementation will do so
- etc.

It is of course possible under one reading of the standard for a malicious implementation of C++ to treat an empty infinite loop (eg. "for(;;);") as illegal and to ignore the loop, replacing it with a no-op. Such an implementation is a thought experiment. Its possible existence does not justify the claim that the standard "says" that such a loop is "simply wrong" (it does not and it is not), nor does the standard "encourage" compilers to miscompile code like this. Of course Mr Compiler can do whatever he pleases, and emit whatever machine code he likes. He can call himself a C++ compiler. But he is a useless one and no programmer that cares about correctness will use him.

Similarly it is POSSIBLE to create a program that you claim is a C implementation that treats "realloc(p, 0)" as erroneous, without a diagnostic, and which miscompiles it. But such an implementation is just useless crap. That the GNU libc people are even considering this is very sad. GNU used to be about being better than the standard, where the standard was silent, GNU programs would be designed to do something sensible. GCC COULD miscompile code that used variable names more than 6 bytes long too (or whatever the stupid limit in the standard is). It doesn't, because what a stupid decision that would be. Half the point of GNU originally was to write decent implementations of standard utilities that used dynamic allocation to avoid those sorts of arbitrary limits, to go beyond the bare minimum "standard" and to build good useful software.

Nowhere else in the world do we accept this kind of malicious compliance with the (very stretched) black letter of the rules while totally ignoring its objects, its context, and its plain and natural meaning.

Nobody would create an intentionally useless and malicious alternative implementation of any programming language WITHOUT a standard (except as a joke). Why, as soon as SOME behaviour is written down, do people start to act like anything not written down is free to be implemented in as stupid a way as possible, *and should be*, and that any code that would run afoul of such a (mis)compiler is "simply wrong"?

Again, nobody would write a C compiler with these sorts of miscompilation bugs (falsely claimed to be "optimisations" - optimisations cannot turn correct code into incorrect code so they cannot be called this) if the standard didn't exist. Why oh why would the existence of the standard make the set of things that can be usefully described as C implementations LARGER than it was before?

Ignoring undefined behaviour

Posted Nov 7, 2024 23:14 UTC (Thu) by fest3er (guest, #60379) [Link]

«Why oh why would the existence of the standard make the set of things that can be usefully described as C implementations LARGER than it was before?»

Probably for the same reason drag racers, sled pullers and other motorsports movers and shakers work to ensure that there are some rules that are subtly ambiguous: that ambiguity can, and often does, give them a competitive advantage even though that advantage goes against the intent of the rules. In other words, some people like ambiguous rules because it lets them flex their creativity. Alas, they forget that unfettered creativity often results in broken software.

Ignoring undefined behaviour

Posted Oct 25, 2024 14:25 UTC (Fri) by excors (subscriber, #95769) [Link] (1 responses)

> I suspect that is the real reason they did decide that it is undefined behaviour.

I think that's more than just a suspicion: the C23 change proposal at https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf says:

> Classifying a call to realloc with a size of 0 as undefined behavior would allow POSIX to define the otherwise undefined behavior however they please.

so it was explicitly intended for POSIX to continue to require specific behaviour, and code that only needs portability across POSIX systems (not any arbitrary ISO C conforming implementation) could continue to rely on that. It wasn't intended to break compatibility with existing code, it was just an editorial change to clean up their previous attempt to specify it as implementation-defined behaviour which turned out to be ambiguous and internally inconsistent.

Ignoring undefined behaviour

Posted Oct 31, 2024 16:56 UTC (Thu) by anton (subscriber, #25547) [Link]

I have looked at three different POSIX versions (2004, 2013, 2024), and they all change the wording and sometimes the behaviour of realloc(): in the 2004 variant, realloc(ptr,0) free()s the object pointed to by a non-NULL ptr, while the others allow several implementations.

I don't see that a tighter definition (e.g., the 2004 one) in POSIX would be in conflict with a looser (e.g., implementation-defined, or one of several options, including the POSIX 2004 one) definition in a C standard.

Ignoring undefined behaviour

Posted Oct 25, 2024 18:45 UTC (Fri) by khim (subscriber, #9252) [Link] (3 responses)

> I suspect that is the real reason they did decide that it is undefined behaviour.

There's rationale and it quite explicitly says what they wanted to achieve: Classifying a call to realloc with a size of 0 as undefined behavior would allow POSIX to define the otherwise undefined behavior however they please.

Their intentions were admirable (at least we know for a fact that no one has done anything out of sheer malice), but what they have actually achieved is sheer lunacy. POSIX, for more than two decades, has declared realloc semantics like this:

blah blah blah This volume of IEEE Std 1003.1-2001 defers to the ISO C standard. blah blah blah (emphasis in the original)

And since actual version of ISO C standard is not mentioned anywhere (presumably to ensure that possible bugfixes are automatically picked)… that means that by ratifying that change they have, effectively, added undefined behavior to most existing versions of POSIX specification.

Ignoring undefined behaviour

Posted Oct 31, 2024 8:20 UTC (Thu) by milesrout (subscriber, #126894) [Link] (1 responses)

>And since actual version of ISO C standard is not mentioned anywhere (presumably to ensure that possible bugfixes are automatically picked)… that means that by ratifying that change they have, effectively, added undefined behavior to most existing versions of POSIX specification.

This is not true. POSIX.1-2024 says in Chapter 1 of Volume 2 (https://pubs.opengroup.org/onlinepubs/9699919799/function...):

>This volume of POSIX.1-2024 is aligned with the following standards, except where stated otherwise:
>ISO C (C17)
> ISO/IEC 9899:2018, Programming Languages — C.
>Parts of the ISO/IEC 9899:2018 standard (hereinafter referred to as the ISO C standard) are referenced to describe requirements also mandated by this volume of POSIX.1-2024.

And earlier versions have similar language:

>Great care has been taken to ensure that this volume of POSIX.1-2017 is fully aligned with the following standards:
>ISO C (1999)
> ISO/IEC 9899:1999, Programming Languages - C, including ISO/IEC 9899:1999/Cor.1:2001(E), ISO/IEC 9899:1999/Cor.2:2004(E), and ISO/IEC 9899:1999/Cor.3.
>Parts of the ISO/IEC 9899:1999 standard (hereinafter referred to as the ISO C standard) are referenced to describe requirements also mandated by this volume of POSIX.1-2017.

Note the 'hereinafter referred to as the ISO C standard' bit.

Ignoring undefined behaviour

Posted Oct 31, 2024 8:24 UTC (Thu) by milesrout (subscriber, #126894) [Link]

Note that this means that, despite being released in 2024, the 8th edition of POSIX still refers to "C17" (the 2018 edition of ISO C) and not to C23, despite 2024 being after 2023. That is, at least in part, because, as far as I know, the official standard for C23 still hasn't been released! It is expected to be finalised this year but who knows? It could well end up being ISO/IEC 9899:2025, which would be pretty funny.

Ignoring undefined behaviour

Posted Oct 31, 2024 16:45 UTC (Thu) by anton (subscriber, #25547) [Link]

It says that, in case of conflict (i.e., both standards define the behaviour in different ways), the C standard prevails. But if the C standard does not define a behaviour, and POSIX defines it, there is no conflict, and the POSIX definition is the one a POSIX-compliant system has to implement.

If you want to promote an interpretation where undefined behaviour in the C standard prevails over defined behaviour in POSIX, that's obviously absurd: for all the functions defined in the C standard, the POSIX definition would just be redundant; and for all the functions that are not defined in the C standard, but are defined in POSIX (e.g., read()), that interpretation would mean that these functions are undefined. I.e., with that interpretation all the definitions of C functions in POSIX would be superfluous.

Remember that realloc can fail

Posted Oct 25, 2024 14:06 UTC (Fri) by smcv (subscriber, #53363) [Link] (2 responses)

I think the real problem here is that "1. Behave like malloc() and return a pointer to a zero-byte object" has three possible results: it can succeed, returning the same non-null pointer it was originally given (let's call that result 1a); or it can succeed, returning a non-null pointer that is different from the one it was originally given (1b); or it can fail, returning NULL and setting errno to ENOMEM (1c). You might reasonably assume that allocating 0 bytes of memory can't fail, but if the result is required to be a unique pointer - as it is in glibc documentation - then allocating some space for bookkeeping overhead will still be required, and *that* might result in running out of memory.

If it succeeds, a caller can distinguish that from behaviours 2. and 3., because the result is non-null. It can compare the result with the previous pointer to decide whether it's in situation 1a or 1b.

But if it fails, a caller cannot distinguish between 1c, 2 and 3. This is a problem because they have differing side-effects! In situations 1c and 3, errno has been set, and the original pointer is untouched (the caller is still responsible for freeing it later). In situation 2, the value of errno is undefined (so errno cannot be used to distinguish between this and situations 1c and 3), and the original pointer has been freed (the caller must not free it again, that would be UB).
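The ambiguity can be made concrete with a small classifier. This is only a sketch with hypothetical names (`shrink_outcome`, `classify_shrink`); it deliberately does not call realloc() itself, since the point is what a caller can and cannot infer from the return value alone.

```c
#include <stddef.h>
#include <stdint.h>

enum shrink_outcome {
    SAME_OBJECT,      /* 1a: non-null, same address as before */
    MOVED_OBJECT,     /* 1b: non-null, different address */
    AMBIGUOUS_NULL    /* 1c, 2 or 3: the caller cannot tell which */
};

/* old_bits must be captured as (uintptr_t)ptr BEFORE the realloc() call,
 * because the old pointer value may not be used once it has been freed. */
enum shrink_outcome classify_shrink(uintptr_t old_bits, const void *result)
{
    if (result == NULL)
        return AMBIGUOUS_NULL;   /* ownership of the old block is unknown */
    return (uintptr_t)result == old_bits ? SAME_OBJECT : MOVED_OBJECT;
}
```

In the AMBIGUOUS_NULL case there is no portable next step: freeing the old pointer is a double free under behaviour 2, and not freeing it is a leak under 1c and 3.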

Remember that realloc can fail

Posted Oct 25, 2024 17:13 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (1 responses)

This is one of the reasons that realloc(..., 0) was specified as UB: It is not practical to write a portable program that correctly uses it, unless you're content to leak memory like a sieve on some platforms.

IMHO if you're going to insist on allowing realloc(..., 0) at all, it should be an infallible operation. That leaves us with two general categories of behavior that make sense:

1. Return a pointer that can be passed to free() or realloc(), and may or may not be unique as the implementation sees fit, but probably can't be unique in the general case.
2. Return NULL and free the pointer.

Note that (2) is simply a special case of (1) where the pointer happens to be NULL - free(NULL) is specified as a no-op, and realloc(NULL, ...) is specified as equivalent to malloc(), so NULL is entirely equivalent to a zero-sized allocation of arbitrary type. As such, I see no point in bothering with (1) in the first place.

If the developer wants a source of unique numbers, they can do that by atomically incrementing a global u64. There is no logical reason to use malloc() or realloc() for such a purpose, since those functions take actual locks and are much more expensive. If the developer wants a lot of numbers and is concerned that a u64 will wrap, then they can go to the bother of using UUIDs (seeing as 2^64 is unbelievably vast, I very much doubt the average program really needs to do this, but it's obviously the correct approach in such cases).
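The counter approach is short enough to sketch in full, assuming a C11 toolchain with <stdatomic.h>. The name `unique_id` is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Process-wide counter; static storage means it starts at zero. */
static _Atomic uint64_t next_id;

/* Return a value that no other call in this process has returned
 * (until the 64-bit counter wraps). Relaxed ordering is enough here
 * because only uniqueness matters, not ordering against other memory. */
uint64_t unique_id(void)
{
    return atomic_fetch_add_explicit(&next_id, 1, memory_order_relaxed);
}
```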

Remember that realloc can fail

Posted Oct 25, 2024 17:22 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

Of course, the counterpoint to my argument is that the developer can just do this:

#define realloc_s(ptr, size) ((size) == 0 ? (free(ptr), (void *)NULL) : realloc((ptr), (size)))

Now realloc_s is guaranteed to have the "correct" behavior for all pointers, regardless of what the implementation decides to do, and so one could argue that realloc() may as well do something else instead. But IMHO that way lies madness, because you could special-case any random stdlib function in this way, so where do we draw the line?

leave established API alone

Posted Oct 25, 2024 20:49 UTC (Fri) by RogerOdle (subscriber, #60791) [Link] (3 responses)

Please do not fix existing problems by changing long established API. This only breaks existing code. Treat it like the kernel does, do not break userspace. Instead, define a new sane API and provide it as an alternative.

Personally, I like two varieties:

1) never return a bad value, AKA is do it or die, AKA die early because I can't do anything anyway. OK to scream OUCH!!! if out of memory. Really, if you are out of memory then you are screwed anyway. If you thought you were allocating something other than 0 bytes, it is best to know as soon as possible.

2) return NULL and set errno=EINVAL on size==0. User's problem, not the system's. I can keep going or do a controlled shutdown as the case may be.
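Both varieties can be sketched as new entry points layered on top of the existing realloc(), leaving the established API untouched. The names `realloc_or_die` and `realloc_einval` are hypothetical, and treating a zero size as fatal in the first variety is one reading of the comment rather than something it spells out.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Variety 1: never return a bad value; complain and die instead. */
void *realloc_or_die(void *ptr, size_t size)
{
    if (size == 0) {
        fprintf(stderr, "realloc_or_die: zero-size request\n");
        abort();
    }
    void *p = realloc(ptr, size);
    if (p == NULL) {
        fprintf(stderr, "realloc_or_die: out of memory (%zu bytes)\n", size);
        abort();
    }
    return p;
}

/* Variety 2: reject size 0 with EINVAL and leave the old block untouched,
 * so the caller still owns ptr and decides what to do next. */
void *realloc_einval(void *ptr, size_t size)
{
    if (size == 0) {
        errno = EINVAL;
        return NULL;
    }
    return realloc(ptr, size);
}
```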

leave established API alone

Posted Oct 26, 2024 4:10 UTC (Sat) by abartlet (subscriber, #3928) [Link]

This is where I sit, to change such a long-defined behaviour is fraught with risk.

leave established API alone

Posted Oct 30, 2024 11:52 UTC (Wed) by vadim (subscriber, #35271) [Link]

There are edge cases where outright dying instantly in the allocator may not be desirable. E.g., take a program that allocates 1 GB for something big like a VM or a database, and then the error handler allocates memory to report a useful error.

You can expect that if you can't allocate 1GB, you probably still can get 1K for a string.

There also may be tasks that can back off on allocation, like if the 1GB is for a cache, perhaps 512MB is also fine.
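The back-off case could be sketched like this. The function name `alloc_with_backoff` and the halving policy are assumptions for illustration; note also that on systems with memory overcommit, malloc() may succeed even when the memory is not really available, so a failure here is not the only way to run out.

```c
#include <stdlib.h>

/* Try to allocate `want` bytes; on failure halve the request until it
 * succeeds or would drop below `min`. The size actually obtained is
 * stored in *got (0 if nothing could be allocated). */
void *alloc_with_backoff(size_t want, size_t min, size_t *got)
{
    for (size_t size = want; size >= min && size > 0; size /= 2) {
        void *p = malloc(size);
        if (p != NULL) {
            *got = size;
            return p;
        }
    }
    *got = 0;
    return NULL;
}
```

A cache would call this with, say, 1 GB as `want` and 512 MB as `min`, then size itself from `*got`; an allocator that dies on the first failure never gives the caller that chance.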

leave established API alone

Posted Nov 7, 2024 23:33 UTC (Thu) by fest3er (guest, #60379) [Link]

«This only breaks existing code.»

Correct. I've had to go through and fix a bunch of old C++ code because newer standards changed syntax. C—known to be fraught with pointer errors—should work to tighten behavior; that is, the standard should be changed to minimize the negative effects of undefined pointer actions/operations even if it means that existing code will have to be corrected. It would likely involve a syntax change for well-written code, but might just require significant reprogramming for programs that employ diabolical creativity.

Call abort()?

Posted Oct 26, 2024 5:53 UTC (Sat) by DemiMarie (subscriber, #164188) [Link]

Could realloc(p, 0) just call abort()? I guess if UBSAN is on, but otherwise it would be too backwards-incompatible…

glibc changes like this are fine

Posted Oct 27, 2024 2:12 UTC (Sun) by marcH (subscriber, #57642) [Link] (3 responses)

> This is, it seems, a topic about which some people, at least, have strong feelings.

50+ comments here already!

> there are almost certainly programs that rely on the current behavior

Apparently just non-portable programs that have a hard dependency on glibc, so such breakage would be fine because users:
1. either use a stable / LTS GNU/Linux distribution that is not going to perform a major glibc upgrade (this change should obviously not be part of a minor release, only in a major one)
2. or they use a rolling / fast-paced / development one where everything keeps breaking anyway for a gazillion of other reasons.

To be fairer: _how_ it would break in case 2. matters. Discussed above already.

glibc changes like this are fine

Posted Oct 27, 2024 7:08 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (2 responses)

It's not that easy.

1. Stable systems eventually do need to upgrade to the next version of everything. This is already uncomfortably close to being a major flag day for those distros. Changing the behavior of something as basic as realloc would push it further in that direction.
2. There are many intermediates between an unstable distro where everything is breaking all the time and (e.g.) RHEL. Debian Testing, for example, is well known to be a reasonably well-behaved release channel in practice, and even Sid is reportedly not that bad (personally, I would never use either of those for serious purposes, but different organizations and use cases have different needs).

glibc changes like this are fine

Posted Oct 27, 2024 22:05 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (1 responses)

Oh, and I forgot:

3. Hyrum's law. Most programmers do not read the C specification, they read (at most) the man page, which (on my WSL installation of Debian), says that realloc(ptr, 0) is equivalent to malloc(0) if ptr == NULL and free(ptr) otherwise (the return value section does mention that realloc(ptr, 0) may return a non-NULL pointer, but it does not distinguish between the ptr == NULL case and the ptr != NULL case, so the most straightforward interpretation is that it is talking about the former). There is no mention of portability anywhere in the document that I could find (the document claims that realloc is "conforming to" various versions of both the POSIX and C standards, without any note of this extension). Some programmers will not even go that far, they just write some code and test if it works as expected. Then the code silently picks up a glibc dependency that nobody explicitly knows about. This is not hypothetical - there is at least one comment in this thread from a developer who actually did take such a dependency and had to patch their code for musl compatibility.

In case anyone is wondering whether it is up-to-date: I honestly have no idea. It is probably outdated, because Debian, but I do not know off the top of my head how to confirm that. The colophon says it's from Linux man-pages version 5.10 and has a URL pointing to https://www.kernel.org/doc/man-pages/, which in turn tells me nothing at all about the current version of that project.

glibc changes like this are fine

Posted Oct 28, 2024 10:58 UTC (Mon) by guillemj (subscriber, #49706) [Link]

> In case anyone is wondering whether it is up-to-date: I honestly have no idea. It is probably outdated, because Debian, but I do not know off the top of my head how to confirm that. The colophon says it's from Linux man-pages version 5.10 and has a URL pointing to https://www.kernel.org/doc/man-pages/, which in turn tells me nothing at all about the current version of that project.

From the upstream link you provided you can either get to the latest version from git or from its online manuals, from the top links "bar":

https://man7.org/linux/man-pages/man3/malloc.3.html

Where there is a mention of the non-portable behavior, which I assumed would be there given that Alejandro used to maintain the manpages project until recently. The Debian version you mention is from _oldstable_, you can see where the various versions are provided for each release here for example:

https://tracker.debian.org/pkg/manpages

The version in _stable_ seems to already have the note:

https://manpages.debian.org/bookworm/manpages-dev/realloc...

Honestly kind of irrelevant

Posted Oct 29, 2024 8:38 UTC (Tue) by chris_se (subscriber, #99706) [Link] (2 responses)

To me this is kind of irrelevant. If you want to write portable code then you have to cope with the fact that malloc(0) and realloc(ptr, 0) don't react the same on all platforms - and just avoid them altogether. To me any C code relying on these kind of specific behaviors (be that of glibc or any other platform) has a strong smell.

Even if the code is used in non-portable programs that are designed for just a single operating system, I still don't like them. There are some implementation-defined behaviors (such as relying on two's complement or 8-bit byte sizes) where it can make a lot of sense to rely on those, because not assuming that would make the code a **lot** more complicated for no benefit at all in many cases. (And platforms that don't use e.g. two's complement are extremely rare nowadays.) But avoiding malloc(0) or realloc(ptr, 0) is just 1-2 more lines of code. And if it's handled explicitly it is 100% clear what the code intends to do. In the absence of explicit handling that's not the case, and then one doesn't immediately know whether the code actually relies on that behavior, or whether there's a bug in there and the author didn't even think about that corner case.
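
As a rough illustration of the "1-2 more lines" point, here is a minimal sketch of a resize helper that handles the zero-size case explicitly. The name resize_buffer and its return convention are made up for this example; they do not come from glibc or any other library.

```c
#include <stdlib.h>

/* Hypothetical helper: resize *buf to new_size bytes, treating zero explicitly.
 * Returns 0 on success and -1 on allocation failure (*buf is left untouched). */
static int resize_buffer(void **buf, size_t new_size)
{
    if (new_size == 0) {
        /* The intent is spelled out: a zero-size buffer means no buffer. */
        free(*buf);
        *buf = NULL;
        return 0;
    }

    void *tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;      /* the old block is still valid and still owned by the caller */
    *buf = tmp;
    return 0;
}
```

With something like this, realloc() is never called with a size of zero, so the code should behave the same way regardless of which choice the C library makes.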

I personally would leave the current behavior as-is, because I don't see any benefit of changing it, because in my opinion nobody should ever write new code that relies on these kinds of specifics. But I also don't really care if they do decide to change that behavior.

Honestly kind of irrelevant

Posted Oct 29, 2024 10:10 UTC (Tue) by Wol (subscriber, #4433) [Link] (1 responses)

> To me this is kind of irrelevant. If you want to write portable code then you have to cope with the fact that malloc(0) and realloc(ptr, 0) don't react the same on all platforms - and just avoid them altogether. To me any C code relying on these kind of specific behaviors (be that of glibc or any other platform) has a strong smell.

I know this sort-of goes against the grain of being a standard, but why not say "here are two competing versions, you have to provide both (with a compiler switch), but we take no position on the default".

Khim's point about ISO saying Posix can do what it likes, when Posix explicitly defers to ISO, does imho appear daft. And if you want ISO C to be portable, specifying the two dominant (Posix and MS) implementations and saying "support both" seems to be the right way to go ... whether others agree is another matter ...

Cheers,
Wol

Honestly kind of irrelevant

Posted Oct 31, 2024 18:10 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

Because you don't need to. The developer can just do this:

/* Note: both macros evaluate their arguments more than once. */
#define malloc_s(size) ((size) == 0 ? (void *)NULL : malloc(size))
#define realloc_s(ptr, size) ((size) == 0 ? (free(ptr), (void *)NULL) : realloc((ptr), (size)))

And then the compiler can optimize that back into a malloc/realloc call easily enough, if it knows that malloc/realloc have the appropriate semantics on a given platform.

Getting unique non-dereferencable pointers is harder, and IMHO should not be done, because (as I've explained elsewhere in the thread) it is cheaper to have a global u64 and atomically increment it (malloc takes a lock in almost any sensible implementation). If you somehow manage to allocate over 16 quintillion numbers in this fashion (so that it wraps), then you probably should be using UUIDs instead of 64-bit integers in the first place.
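
The counter approach could look roughly like the following sketch, which assumes a C11 compiler with <stdatomic.h>. The names next_id and unique_id are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical process-wide counter; it starts at 1 so that 0 can mean "no ID". */
static _Atomic uint64_t next_id = 1;

/* Return a value that no other caller in this process has received,
 * as long as the 64-bit counter has not wrapped around. */
static uint64_t unique_id(void)
{
    /* Relaxed ordering is enough here: only the uniqueness of the
     * returned value matters, not ordering against other memory. */
    return atomic_fetch_add_explicit(&next_id, 1, memory_order_relaxed);
}
```

Unlike a malloc(0) pointer, nothing needs to be freed afterward, and the value cannot be mistaken for something that can be dereferenced.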

Opportunity for GENSYM

Posted Oct 30, 2024 17:20 UTC (Wed) by jreiser (subscriber, #11027) [Link]

realloc(ptr, 0) is related to malloc(0). Experienced app implementors might realize that some uses of malloc(0) are equivalent to (GENSYM) in Lisp: create a new atom that is unique for the remaining duration of the process.

In the C language, a tentative definition such as one of:
void *GENSYM(void) { return malloc(0); }
void *GENSYM(void) { return malloc(1); }
void *GENSYM(void) { return malloc(sizeof(void *)); }
allows taking advantage of simpler management than general malloc(). Allocate sequentially from larger block(s) at known address(es) (such as a page on 16-bit, aligned megabyte on 32-bit, aligned 4GiB on 64-bit), use a bitmap to track, etc. Interposing such a definition, intercepting malloc(0), can be useful in an existing system. Being able to distinguish (by address) a GENSYM from a general malloc block can have advantages in debugging, user interface, or even logical flow.

In an environment that has interoperable GENSYM and malloc, realloc(ptr, 0) can be easy to understand and implement: { free(ptr); return GENSYM(); }
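
A very reduced sketch of the "allocate sequentially from a block at a known address" idea might look like this. Everything here is an assumption made for illustration: the lowercase names, the fixed capacity, and the choice of a static array as the reserved block. It is not thread-safe, and it never reclaims a symbol; the bitmap mentioned above would be needed for that.

```c
#include <stddef.h>
#include <stdint.h>

#define GENSYM_MAX 4096                 /* hypothetical capacity */

static char gensym_arena[GENSYM_MAX];   /* one byte of address space per symbol */
static size_t gensym_next;              /* bump index; not thread-safe */

/* Hand out the next unused address in the arena, or NULL when it is exhausted. */
static void *gensym(void)
{
    if (gensym_next >= GENSYM_MAX)
        return NULL;
    return &gensym_arena[gensym_next++];
}

/* Distinguish a gensym from a general malloc() block by its address alone. */
static int is_gensym(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)gensym_arena;
    return a >= lo && a - lo < GENSYM_MAX;
}
```

A free() replacement in such a system could call is_gensym() first and skip the general allocator entirely for these pointers.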

Why not define the behavior?

Posted Nov 3, 2024 20:09 UTC (Sun) by Bluehorn (subscriber, #17484) [Link]

I don't get it. If an API was misused before, I'd rather disallow what could be misinterpreted and ask to use another API.

IMHO the only sane way forward would be to narrow the API of realloc, rejecting zero-size objects.
So realloc(something, 0) should just set errno to EINVAL and return a NULL pointer. This may crash some applications, but that's easier to fix than invoking undefined behavior, making the code "run on my machine".
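
Until a C library does something like that, an application can get the same effect with a wrapper. A minimal sketch, with realloc_strict being a made-up name:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: refuse zero-size requests instead of guessing the intent.
 * When the request is rejected, the original block is left untouched. */
static void *realloc_strict(void *ptr, size_t size)
{
    if (size == 0) {
        errno = EINVAL;
        return NULL;
    }
    return realloc(ptr, size);
}
```

A caller that really wants to release the memory then has to say so with free(), which is the point.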

Full disclosure: I used realloc to resize string buffers, and was expecting realloc(something, 0) to be equivalent to free. Obviously I did not check for errors in that code, so a failed call to realloc to increase the size would lead to a memory leak (and probably a write to the zero address).