
DeVault: Announcing the Hare programming language


Posted May 4, 2022 14:02 UTC (Wed) by Vipketsh (guest, #134480)
In reply to: DeVault: Announcing the Hare programming language by atnot
Parent article: DeVault: Announcing the Hare programming language

> making it well-defined to turn arbitrary addresses into pointers and dereference them [...]. You can no longer rely on anything in memory still being the way it was across function calls

Compilers cannot rely on that today, at least not in general. Without the seldom-used 'pure' and 'const' attributes, the compiler has to assume that an (extern) function call has modified any and all memory accessible through some pointer. Furthermore, there are rules in the C standard for when the compiler has to assume things may have been indirectly modified through random pointers: the aliasing rules. These rules are generally so loose that many people make them much more strict with -fno-strict-aliasing, yet somehow we haven't seen a huge fallout from lack of optimisations as you would suggest. Being able to manufacture pointers out of random data does not have to have an effect on any of those rules!
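A minimal sketch of the two situations described above, assuming a typical optimising compiler such as GCC or Clang. The function names are hypothetical and chosen for illustration only; none of this is code from the thread.

```cpp
// Hypothetical names; a sketch under the assumptions stated above.

int shared_counter = 0;

// The callee is reached through a function pointer, so the compiler
// generally cannot see it and has to assume it may write to any memory
// reachable through some pointer.
int reload_after_call(int* p, void (*opaque)()) {
    int before = *p;
    opaque();              // may modify *p
    return before + *p;    // so *p has to be read again here
}

void bump_counter() { ++shared_counter; }

// Type-based aliasing: with -fstrict-aliasing the compiler may assume the
// float store cannot change *i and simply return 1. With
// -fno-strict-aliasing it has to reload *i after the store.
// Callers must pass pointers to distinct objects; passing the same
// storage through both parameters is undefined in ISO C++.
int store_both(int* i, float* f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}
```

Neither function depends on being able to turn arbitrary integers into pointers; the reload in the first case is forced by the opaque call alone.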

It's interesting that in exactly *no* discussion of undefined behaviour have I ever seen any sort of numbers passed around along the lines of "if we would define that thing this way, we would lose an estimated X% of performance on some code bases", instead it's all along the lines of your comment saying "Oh, the hysteria, quiver in fear because you could do exactly no optimisations". People arguing to remove some undefined behaviour tend to give examples of what that undefined behaviour makes a big pain or impossible, but there are few concrete arguments from the other side about what removing the undefined behaviour in question would lose. That makes discussions, awareness of the problem, and finding some sort of middle ground exceedingly difficult.

> you can't be sure what it is now, because there was a function call in between, and the implementation of free() might have held onto the address of that allocation and fiddled with it

Guess what? Every implementation of free() "holds onto the address" given to it (puts it on some free list) and "fiddles with it" (marks the area as unallocated).
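As a rough illustration of that point, here is a toy fixed-size pool allocator. It is only a sketch: the names `pool_alloc` and `pool_free` and the block sizes are invented for this example, and real malloc()/free() implementations are far more involved.

```cpp
#include <cstddef>

namespace sketch {

// Each block is either handed out to the user (payload) or sits on the
// free list (next). All names and sizes here are hypothetical.
union Block {
    Block* next;
    unsigned char payload[32];
};

Block pool[8];
Block* free_list = nullptr;
bool initialised = false;

void init() {
    for (std::size_t i = 0; i < 8; ++i) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
    initialised = true;
}

void* pool_alloc() {
    if (!initialised) init();
    if (free_list == nullptr) return nullptr;  // pool exhausted
    Block* block = free_list;
    free_list = block->next;
    return block;
}

void pool_free(void* p) {
    if (p == nullptr) return;
    Block* block = static_cast<Block*>(p);
    block->next = free_list;   // "fiddles with" the freed memory
    free_list = block;         // "holds onto the address"
}

}  // namespace sketch
```

Freeing a block and allocating again returns the same address, which is exactly the kind of reuse the quoted objection treats as alarming.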

Why is everything always painted in a way that if you can't fix any and all possible cases of a certain undefined behaviour without even a minimum of compromise we may as well throw the baby out with the bathwater? We don't have to make everything perfect and foolproof to make things better.



DeVault: Announcing the Hare programming language

Posted May 4, 2022 14:36 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (1 responses)

> These rules are generally so loose that many people make them much more strict with -fno-strict-aliasing, yet somehow we haven't seen a huge fallout from lack of optimisations as you would suggest.

Surely you mean `-fstrict-aliasing` here. Or are you saying that people *loosen* the rules with `-fno-strict-aliasing` and still see no fallout?

> Every implementation of free() "holds onto the address" given to it (puts it on some free list) and "fiddles with it" (marks the area as unallocated).

That's an argument that `free` cannot be implemented in (ISO) C and ends up doing more platform-specific things with the pointer than C would normally allow. Just like `std::memmove` isn't technically possible (AFAIK) in ISO C++ (because of the rules around comparing pointers from separate allocations). See also `std::bless` in C++ to have a way to inform the compiler "I did some memory shenanigans, the object there is now C++-okay". I suspect that compilers "know" when they're compiling these functions and act accordingly (probably through some compiler flag or pragma). Or the implementations rely on very careful coding around the rules that C has, to make sure the intent is preserved across the abstract machine.

DeVault: Announcing the Hare programming language

Posted May 4, 2022 14:57 UTC (Wed) by Vipketsh (guest, #134480) [Link]

> Surely you mean `-fstrict-aliasing` here.

Heh. I had a suspicion this would come up. The 'strict' in the compiler option refers to how strictly the compiler's alias analysis adheres to the standard. My use of 'strict' was referring to how many transformations are allowed by the standard. Wish I could have explained better.

> That's an argument that `free` cannot be implemented in (ISO) C

Indeed, but even so we don't have to disable all possible optimisations like the post I'm replying to is implying, while free() is routinely implemented in C.

DeVault: Announcing the Hare programming language

Posted May 4, 2022 15:21 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (4 responses)

> It's interesting that in exactly *no* discussion of undefined behaviour have I ever seen any sort of numbers passed around along the lines of "if we would define that thing this way, we would lose an estimated X% of performance on some code bases",

I totally agree. gcc 4.7 used to abuse UB way less than 6 and above, and I've yet to see a program run faster with gcc 11 than it used to with gcc 4; usually it's even the opposite!

I said a few times (probably in this thread I don't remember) that if I knew how to do it and had enough time I would be happy to create a new "standard" for gcc such as "safe11" or something like this that next to gnu99 and friends, would be C11 with most (ideally all) UB defined to the most commonly expected case (it wouldn't be that far from the "linux kernel C").

And I'm quite sure it would be quickly adopted by many of us suffering from such jokes. Plus it would remove a ton of nonsensical warnings such as the ones that force you to scratch your head for a moment when trying to implement a binary integer rotate operation without any warning (32-bit doesn't work, you need to use bit^31 in the opposite shift, and the compiler doesn't always recognize it to optimize it into a single rol/ror operation).
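For reference, one widely used way to write a rotate that never shifts by the full width of the type is to mask the count on both sides. This is a sketch of that idiom, not necessarily the exact formulation meant above; the function name is hypothetical.

```cpp
#include <cstdint>

// Rotate left by n bits. Both shift counts stay in the range 0..31, so
// there is no shift by 32 (which would be undefined for a 32-bit type).
std::uint32_t rotl32(std::uint32_t x, unsigned n) {
    n &= 31u;                                   // keep the count in 0..31
    return (x << n) | (x >> ((0u - n) & 31u));  // for n == 0 both shifts are by 0
}
```

Whether a given compiler turns this into a single rol/ror instruction still depends on the compiler and target. In C++20, `std::rotl` from `<bit>` provides the same operation directly.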

DeVault: Announcing the Hare programming language

Posted May 4, 2022 16:30 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (3 responses)

> I would be happy to create a new "standard" for gcc such as "safe11" or something like this

There have been attempts[1]. I've not heard news about meaningful progress (though I've also not sought it out). I'd expect any announcements of such a thing to show up on LWN in some manner :) .

[1] https://blog.regehr.org/archives/1287

DeVault: Announcing the Hare programming language

Posted May 6, 2022 2:59 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (2 responses)

Ah interesting, thanks for the link!

Probably the mistake this person made was to try to reach a consensus. If the proposal worked for some old code base, surely it wasn't that bad, and ought to have been proposed as-is as a patch to gcc.

DeVault: Announcing the Hare programming language

Posted May 6, 2022 13:05 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> If the proposal worked for some old code base, surely it wasn't that bad, and ought to have been proposed as-is as a patch to gcc.

And it would be promptly rejected. Because the question which would be asked would be simple: why do you think you are especially special and deserve a separate treatment?

As was noted in the blog post: there are many programs typically compiled for ARM that would fail if this produced something besides 0, and there are also many programs typically compiled for x86 that would fail when this evaluates to something other than the original value… and both types can be rewritten to work within limitations of standard C… so why should the compiler developers care?

More-or-less the only guy to whom they give special treatment is Linus: not only does he lead a huge and important project but, more importantly, it's obvious that said project sometimes needs to go beyond the boundaries of Standard C.

Even then the leeway is extremely limited; Linus has to argue about things a lot for them to be accepted as GCC C extensions.

DeVault: Announcing the Hare programming language

Posted May 6, 2022 18:40 UTC (Fri) by wtarreau (subscriber, #51152) [Link]

> As was noted in the blog post: there are many programs typically compiled for ARM that would fail if this produced something besides 0, and there are also many programs typically compiled for x86 that would fail when this evaluates to something other than the original value… and both types can be rewritten to work within limitations of standard C… so why should the compiler developers care?

I do have a response to this: just look at the code for each of them to adapt to the other one's behavior to figure out which choice has the least impact, and purposely break the other one, given that it currently is broken or about to break anyway during a future compiler upgrade. But at least this will be clearly documented. And when the cost is the same I'd choose x86 by default since 1) it's accumulated way more older code (arm code tends to be more modern and less arch-specific), and 2) it's where users go when they want the highest performance level nowadays.

> More-or-less the only guy to whom they give special treatment is Linus: not only does he lead a huge and important project but, more importantly, it's obvious that said project sometimes needs to go beyond the boundaries of Standard C. Even then the leeway is extremely limited; Linus has to argue about things a lot for them to be accepted as GCC C extensions.

Yes, I know, and that's sad.

DeVault: Announcing the Hare programming language

Posted May 4, 2022 15:32 UTC (Wed) by excors (subscriber, #95769) [Link] (2 responses)

> These rules are generally so loose that many people make them much more strict with -fno-strict-aliasing, yet somehow we haven't seen a huge fallout from lack of optimisations as you would suggest. [...] It's interesting that in exactly *no* discussion of undefined behaviour have I ever seen any sort of numbers passed around along the lines of "if we would define that thing this way, we would lose an estimated X% of performance on some code bases", instead it's all along the lines of your comment saying "Oh, the hysteria, quiver in fear because you could do exactly no optimisations".

It's trivial to construct plausible examples where aliasing has a huge effect on performance, especially in C++ where you don't want everything to alias with 'this'. E.g. https://godbolt.org/z/zddTYK378 executes over 4x faster with -fstrict-aliasing on my CPU (because the compiler can autovectorize the loop when it realises the input and output don't alias). You can probably do similar with most other undefined-behaviour optimisations, but I'm not sure that would really prove much.

I think one major problem with trying to translate that into "X% of performance on some code bases" is that there's a massive range of code bases, and no benchmark suite is representative of them all, so it's impossible to get representative numbers. But even if it was: If an optimisation has no effect on 99% of programs, but it makes 1% of programs 4x faster, is that worth it? It seems the most common positions are "it's always worth it, regardless of the exact numbers" (modern compiler developers), and "it's never worth it, regardless of the exact numbers" (people who want C to be nicer syntax for assembly code), and the exact numbers probably won't change anyone's mind.

And in any code base where performance is important, the developer should have already profiled and optimised it around their current compiler's capabilities - e.g. if they had code like my example with -fno-strict-aliasing then they'd probably extract 'sum' into a local variable to help the compiler. Then a benchmark would show no benefit from -fstrict-aliasing, because the programmer has already paid the cost of working around aliasing problems. Optimisation isn't just a compiler algorithm, it's a feedback loop between compiler and programmer, so you can't evaluate it properly by running compilers on a static set of benchmarks.
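The following is a sketch of the kind of code this describes. It is not the code behind the godbolt link above, and the type and member names are invented for illustration.

```cpp
#include <cstddef>
#include <cstdint>

struct Accumulator {
    const std::int32_t* data;
    std::size_t size;
    std::int64_t sum;

    // Stores to this->sum on every iteration. Without type-based alias
    // analysis (-fno-strict-aliasing) the compiler has to allow for
    // data[i] overlapping this->sum, which can block vectorisation.
    void sum_into_member() {
        sum = 0;
        for (std::size_t i = 0; i < size; ++i)
            sum += data[i];
    }

    // The manual workaround: accumulate in a local whose address is
    // never taken, then store once. No store in the loop can alias data.
    void sum_into_local() {
        std::int64_t local = 0;
        for (std::size_t i = 0; i < size; ++i)
            local += data[i];
        sum = local;
    }
};
```

Both functions compute the same result; only the amount of aliasing information the compiler needs differs. A benchmark run on the second version would be expected to show little or no difference between the two flags, which is the measurement problem described above.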

And it's a feedback loop that spans decades: e.g. compilers get really good at inlining and constant-folding and eliminating dead code, so people invent techniques like expression templates (where a C++ expression doesn't compute a value, it essentially computes a type that represents the AST of the expression, which can be manipulated at compile-time before eventually turning into hundreds of function calls that produce a single line of code), then they build a linear algebra library like Eigen using that technique, then applications start using the library, then compiler developers are motivated to improve autovectorization because there's all these applications doing linear algebra, etc.
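For readers unfamiliar with expression templates, here is a deliberately tiny sketch of the technique. All names are hypothetical, and libraries such as Eigen are far more elaborate than this.

```cpp
#include <cstddef>

// CRTP base: marks a type as an expression so operator+ only matches these.
template <typename Derived>
struct Expr {
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// A node representing "lhs + rhs". Constructing it computes nothing;
// the type Sum<L, R> itself encodes the shape of the expression.
template <typename L, typename R>
struct Sum : Expr<Sum<L, R>> {
    const L& lhs;
    const R& rhs;
    Sum(const L& l, const R& r) : lhs(l), rhs(r) {}
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& a, const Expr<R>& b) {
    return Sum<L, R>(a.self(), b.self());
}

struct Vec3 : Expr<Vec3> {
    double v[3];
    Vec3(double x, double y, double z) : v{x, y, z} {}

    // The whole expression tree is evaluated here, in a single loop,
    // without intermediate Vec3 temporaries.
    template <typename E>
    Vec3(const Expr<E>& e) {
        for (std::size_t i = 0; i < 3; ++i) v[i] = e.self()[i];
    }

    double operator[](std::size_t i) const { return v[i]; }
};
```

Writing `Vec3 r = a + b + c;` builds a `Sum<Sum<Vec3, Vec3>, Vec3>` and evaluates it once. The approach only pays off if the compiler inlines all the small `operator[]` calls, which is the dependency on optimisation described above. Note that the nodes hold references, so storing an expression in an `auto` variable beyond the full expression would leave dangling references.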

At the end of that process, you can't just turn off one of the old compiler optimisations and expect to get meaningful results; too much code implicitly depends on it. And at the start of the process, you couldn't have predicted exactly what that optimisation would lead to; all you could predict is that if you had waited for quantifiable evidence of a major benefit then you'd never have made any progress.

(This argument mostly applies to C++, not C, but I think nobody cares enough about C to develop a serious compiler for it - you'll just get a C++ compiler with a cut-down parser, so you'll get the costs of these fancy optimisations without much of the benefit. That's the downside of sticking with a niche language like C.)

DeVault: Announcing the Hare programming language

Posted May 4, 2022 17:44 UTC (Wed) by Vipketsh (guest, #134480) [Link] (1 responses)

> that there's a massive range of code bases, and no benchmark suite is representative of them all,

That's exactly the kind of argument I was talking about in my last sentence that does not help these discussions. Somewhere along the way, someone put in a ton of work to write an optimisation pass to, I hope, produce more optimal output. Since everything is about optimising output, again, I would hope that there were at least *some* benchmarks published along with the new optimisation to show that maintaining the optimisation pass for the future is a good idea. Therefore when these discussions come up it should be pretty simple: "look, when this new NULL check deleting pass was added it brought X% to the table on this benchmark". At that point we would have a basis for discussion: maybe the code base in question isn't so important any more, maybe some other newer passes make the gains less relevant, or maybe just decide that the gains are not a good trade-off. With random hand-waving and fear-mongering there is no way a meaningful discussion can be had.

> At the end of that process, you can't just turn off one of the old compiler optimisations and expect to get meaningful results;

On the flip side optimisation passes can turn out to be meaningless because some other new passes don't create the sequences any more for it to be meaningful. It's also not like performance regressions are unheard of in compiler land. If we could have a discussion with numbers we could very well come to some tentative conclusion and disable the pass by default to see what falls out (let your users do the testing on "massive range of code bases").

DeVault: Announcing the Hare programming language

Posted May 6, 2022 0:28 UTC (Fri) by khim (subscriber, #9252) [Link]

> Since everything is about optimising output, again, I would hope that there were at least *some* benchmarks published along with the new optimisation to show that maintaining the optimisation pass for the future is a good idea.

True and you can find such benchmarks in the bugzilla (or github for clang). But nobody bothers to measure impact of optimizations based on different UBs. Because the assumption is that code doesn't have any.

In the end you have hundreds of passes and absolutely zero knowledge about which of them are applicable in which cases (except for a few, niche, UBs which are simple enough to deserve a dedicated flag).

> On the flip side optimisation passes can turn out to be meaningless because some other new passes don't create the sequences any more for it to be meaningful.

Sure. Compiler writers keep track of these things. What they don't keep is a mapping between UBs and optimizations (again: with the exception of explicitly created flags like -fstrict-aliasing or -fwrapv).

You can measure effect of different optimization passes, but you have absolutely no idea which of them are safe or not safe to use when you want to turn some UB into defined behavior.

DeVault: Announcing the Hare programming language

Posted May 4, 2022 16:09 UTC (Wed) by atnot (guest, #124910) [Link] (12 responses)

> there are rules in the C standard for when the compiler has to assume things may have been indirectly modified through random pointers: the aliasing rules

Indeed. But those rely on the fact that just creating pointers to arbitrary memory is invalid. If de-referencing arbitrary addresses is valid, all bets are off.

> These rules are generally so loose that many people make them much more strict with -fno-strict-aliasing

It does the opposite: it makes them weaker, but only a bit. But that's kind of beside the point, which is that it is basically impossible to interface with memory at all without some kind of aliasing rules.

> People arguing to remove some undefined behavior tend to give examples of what that undefined behaviour makes a big pain or impossible, but there are few concrete arguments from the other side about what removing the undefined behaviour in question would lose.

Well, it's kind of impossible to know. There's not a single flag or pass you could turn off to e.g. reliably leave in null checks. The compiler might or might not have a specific code path for eliminating null pointers, but removing that doesn't mean those null dereferences won't be removed by other passes operating on similar assumptions. Or that something else critical won't be removed next time.

The thing is, even if it is phrased that way, the complaint is rarely actually "I would like this specific thing to be defined", it is "I would like the C abstract machine to behave exactly as simply as I think it does". But in a language as unconstrained as C, that's not really possible, nor would it really be a desirable slope to ride.

At the end of the day, that's what this is really about to me. I'm not personally elated when the compiler optimizes out my checks either. I'm not sitting here refreshing the gcc homepage, eagerly anticipating new optimization passes to break my code. But I recognize that these are the consequence of a language that desires to be both fast and accept programs that do arbitrary memory manipulations. And to me it's very clear that if we want to write programs that behave as we think they do, ones where our mental model and the compiler's model are one and the same, we have no choice but to give up one or the other. Just defining a few things won't be enough to make the problems go away.

DeVault: Announcing the Hare programming language

Posted May 4, 2022 16:40 UTC (Wed) by excors (subscriber, #95769) [Link] (11 responses)

> There's not a single flag or pass you could turn off to e.g. reliably leave in null checks. The compiler might or might not have a specific code path for eliminating null pointers, but removing that doesn't mean those null dereferences won't be removed by other passes operating on similar assumptions.

There is -fno-delete-null-pointer-checks, which may not be reliable enough for security purposes but can easily pessimize code: https://godbolt.org/z/PGje44zna is autovectorized unless you enable that flag or remove the "*sum = 0;" line (which tells the compiler it can ignore the subsequent NULL checks).

(And for completeness a similar example with -fwrapv: https://godbolt.org/z/exzs7ocaj is only autovectorized when it can assume the loop does not overflow to negative values.)
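Two small functions illustrate what -fwrapv changes. They are a sketch with hypothetical names, not the code behind the link above.

```cpp
// Without -fwrapv, signed overflow is undefined, so the compiler may fold
// this to "return true". With -fwrapv, x + 1 wraps around for
// x == INT_MAX and the comparison has to be evaluated.
bool next_is_larger(int x) {
    return x + 1 > x;
}

// Without -fwrapv the compiler may assume i never overflows, so the trip
// count is simply last - first + 1. With -fwrapv, last == INT_MAX makes
// the condition always true, and the compiler has to allow for that.
int count_steps(int first, int last) {
    int steps = 0;
    for (int i = first; i <= last; ++i)
        ++steps;
    return steps;
}
```

For ordinary inputs both functions behave identically under either setting; the flag only changes what the compiler is allowed to assume about the overflow case.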

DeVault: Announcing the Hare programming language

Posted May 4, 2022 17:45 UTC (Wed) by Vipketsh (guest, #134480) [Link]

> remove the "*sum = 0;" line (which tells the compiler it can ignore the subsequent NULL checks).

I don't think that example demonstrates a case for "dereferencing NULL is undefined behaviour". The compiled code has the "if (!sum)" hoisted out of the loop, and once you do that optimisation the loop is no different than if the check were completely removed. Seems to me like the reason for the failed vectorisation is more an internal compiler issue, possibly because of the ordering of passes and not because "dereferencing NULL being undefined behaviour" is vital to the vectorisation.

DeVault: Announcing the Hare programming language

Posted May 5, 2022 2:54 UTC (Thu) by foom (subscriber, #14868) [Link] (9 responses)

> There is -fno-delete-null-pointer-checks

Yes, this flag has a remarkably poor name. In fact, the flag doesn't "turn off deleting null pointer checks" (whatever that might mean). Rather, the underlying behavior (at least as implemented in Clang -- I believe the same is true for GCC) is entirely principled: it informs the compiler that the null pointer might actually refer to valid memory that a program can successfully (potentially even intentionally!) access as an object.

A _consequence_ is that "*foo = 0;" doesn't imply "foo != nullptr", as it otherwise does (so it does have the effect of "not deleting" THAT null pointer check).
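A short sketch of the pattern under discussion; the function names are hypothetical.

```cpp
// The dereference comes first. By default (-fdelete-null-pointer-checks)
// the compiler may conclude from "*p" that p is not null and drop the
// check below. With -fno-delete-null-pointer-checks the dereference no
// longer implies anything about p, so the check has to stay.
int read_then_check(int* p) {
    int value = *p;
    if (p == nullptr)
        return -1;
    return value;
}

// Checking before dereferencing is well defined for every input and is
// unaffected by the flag.
int check_then_read(int* p) {
    if (p == nullptr)
        return -1;
    return *p;
}
```

Only the second function may be called with a null pointer in ISO C++; calling the first one with null is undefined regardless of what the flag makes the compiler do.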

DeVault: Announcing the Hare programming language

Posted May 5, 2022 17:28 UTC (Thu) by nybble41 (subscriber, #55106) [Link] (8 responses)

> A _consequence_ is that "*foo = 0;" doesn't imply "foo != nullptr", as it otherwise does (so it does have the effect of "not deleting" THAT null pointer check).

A further consequence of enabling this flag is that you are no longer programming in ISO C:

> An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

… or C++:

> A null pointer constant is an integer literal (5.13.2) with value zero or a prvalue of type std::nullptr_t. A null pointer constant can be converted to a pointer type; the result is the null pointer value of that type (6.8.2) and is distinguishable from every other value of object pointer or function pointer type.

… since a null pointer can no longer be distinguished from a pointer to an object or function.

DeVault: Announcing the Hare programming language

Posted May 5, 2022 18:10 UTC (Thu) by farnz (subscriber, #17727) [Link] (5 responses)

I'm not sure I follow your reasoning, and I'd appreciate you expanding on it.

Take the following C++ code:


bool bad_code(bool deref_null) {
    int *foo;
    int real_val;
    int *bar = deref_null ? nullptr : ℜ_val;
    *bar = 0;
    return bar == nullptr;
}

I don't see how the snippets you've quoted make it impossible for this function's return value to differ from its deref_null parameter. The null pointer remains a unique value; the behaviour of *bar = 0 is undefined, but importantly, if I remove that line, the function behaves the same in both ISO C++ and C++ with -fno-delete-null-pointer-checks - the distinction is that in ISO C++, this function can be optimized to the equivalent of:


bool bad_code(bool) { return false; }

while with -fno-delete-null-pointer-checks, it can only be optimized to:

bool bad_code(bool deref_null) { return deref_null; }

Although, in both cases, it's perfectly reasonable to elide or not elide the write to pointer value 0, since that write is UB.

DeVault: Announcing the Hare programming language

Posted May 5, 2022 19:12 UTC (Thu) by nybble41 (subscriber, #55106) [Link] (4 responses)

> The null pointer remains a unique value; …

Unique, yes—there is only one null pointer value—but not distinct from any pointer to an object or function. With the -fno-delete-null-pointer-checks flag enabled you can have a pointer to a valid object which compares equal to a null pointer.

> I don't see how the snippets you've quoted make it impossible for this function's return value to differ from its deref_null parameter.

(I am assuming that "ℜ_val" in your example was supposed to be "&real_val". I'm not sure of the purpose of the unused pointer variable "foo".)

According to ISO C++, with the "*bar = 0" line deleted, the return value must be equal to "deref_null". The "bar" pointer can only be "nullptr" when deref_null is true or "&real_val" when deref_null is false, and "&real_val", as a pointer to an object, can never compare equal to "nullptr". With the "*bar = 0" line it's UB when deref_null is true and so could be optimized to just "return false", as you said.

However, with -fno-delete-null-pointer-checks enabled, we do not have the guarantees of ISO C++ and "nullptr" could in theory compare equal to a pointer to an object, e.g. if pointers are represented as byte addresses, "nullptr" is represented as byte address zero, and the object (in this case "real_val") happens to be placed at byte address zero. If this happened then "bar == nullptr" would be true even if deref_null is false, so the function cannot be optimized to just "return deref_null".

DeVault: Announcing the Hare programming language

Posted May 6, 2022 13:38 UTC (Fri) by farnz (subscriber, #17727) [Link] (3 responses)

Sorry about the bad code formatting - I have no idea how copying and pasting from Emacs did that.

I don't see how you get "not distinct from any pointer to an object or function" from the description of the -fno-delete-null-pointer-checks flag. As I read the documentation, -fno-delete-null-pointer-checks does not permit you to have a pointer to a valid object that compares equal to a null pointer; instead it says that the act of dereferencing a pointer implies nothing about its value. Without the flag, dereferencing a pointer implies the pointer value must not be a null pointer, since if it was a null pointer, the dereference would result in UB (since a null pointer cannot point to a valid object). With the flag, however, while the dereference itself is still UB (since a null pointer cannot point to a valid object), the compiler acts as-if each dereference of a nullptr was immediately followed by an assignment of an unknown value to the pointer.

Because the value is unknown, it could still be a null pointer, but it could also be a new pointer to a valid object - the compiler's analysis passes simply don't know at this point, and thus it cannot rely on the dereference to permit it to remove a nullptr check, since it does not know what the pointer's value is.

DeVault: Announcing the Hare programming language

Posted May 6, 2022 14:55 UTC (Fri) by nybble41 (subscriber, #55106) [Link] (1 responses)

> I don't see how you get "not distinct from any pointer to an object or function" from the description of the -fno-delete-null-pointer-checks flag.

The default is -fdelete-null-pointer-checks, which has the description: "Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero."[0] The -fno-delete-null-pointer-checks flag affects *both* of these assumptions, meaning that the compiler cannot assume "that no code or data element resides at address zero" (i.e. that no pointer to an object has the same representation as a null pointer).

As stated in the documentation the intended use of the -fno-delete-null-pointer-checks flag is platforms such as AVR where objects *can* be placed at address zero, which implies that &variable can be indistinguishable from a null pointer. Though this is more likely to be true for a global or static object than for a stack variable.

[0] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#...

DeVault: Announcing the Hare programming language

Posted May 6, 2022 16:52 UTC (Fri) by farnz (subscriber, #17727) [Link]

Thanks for clearing up my misunderstanding - for some reason, I was mentally skipping the second assumption (since on my platforms of choice, there cannot be a code or data element at address 0), and only focusing on the first assumption (that programs cannot safely dereference null pointers, which is the one that allows a compiler to deduce that if you dereference a pointer, it cannot be a null pointer).

DeVault: Announcing the Hare programming language

Posted May 6, 2022 17:26 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

> Sorry about the bad code formatting - I have no idea how copying and pasting from Emacs did that.

ampersand is the HTML/XML metacharacter for starting an entity, and although the standard says that entity references should include a final semicolon, HTML-handling software is more tolerant of the missing semicolon than is entirely ideal.

So it appears that the sequence &real gets overgenerously interpreted as an HTML entity equivalent to Unicode codepoint U+211C BLACK-LETTER CAPITAL R from the Letterlike Symbols block of the Mathematical Symbols, which has an alias name of "Real part".

DeVault: Announcing the Hare programming language

Posted May 7, 2022 6:03 UTC (Sat) by Vipketsh (guest, #134480) [Link] (1 responses)

I have to ask, what were you trying to add to the discussion?

Sure, you are absolutely correct, but what relevance does it have? If the *compiler* has to assume that NULL points to a valid object, what programs would break? What other fallout would there be? The only thing I can think of is that when lawyering about "if (my_pointer == NULL)" you would have to say "Does my_pointer point to the object at address NULL?" instead of "Is my_pointer pointing to an invalid object?".

I think most interpretations of the standard, in the context of undefined behaviour, are simply done in bad faith. My opinion is that the reason that language is in there, and has to be there, is so that malloc(), or anything else that works with a pointer, can return or check for an error. And the reason dereferencing a NULL pointer is undefined is because there is no telling how a platform behaves when you do so. See how none of this has anything to do with the compiler?

It would do so much good for these discussions if the standard and what it says were put aside. Talk about how one or another change would affect real existing programs and/or platforms. Talk about possible fallout. Talk about potential issues. Talk about benefits. Because "oh, how terrible, now some standard does not match up if I squint at it this way" is completely meaningless and adds nothing. Standards, in general, should be looked upon as nothing more than an aid to achieving interoperability (they can and do contain falsehoods). We all know that standards are violated all the time and to make things work one needs domain-specific experience. Lastly, if you are writing a standard your goal should be to document the status quo and most definitely not to attempt to change the world.

DeVault: Announcing the Hare programming language

Posted May 8, 2022 12:48 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

> My opinion is that the reason that language is in there, and has to be there, is so that malloc(), or anything else that works with a pointer, can return or check for an error. And the reason dereferencing a NULL pointer is undefined is because there is no telling how a platform behaves when you do so. See how none of this has anything to do with the compiler?

You have muddled the NULL pointer (an abstract idea) with the all-zeroes address on a typical CPU; these are intentionally not the same thing.

While it's obviously a bad idea, C has no trouble with using actual values from a type as sentinels, atoi("junk") and atoi("0") are both zero. So it wouldn't have been a problem to define that malloc() returning zero can be either an error or an actual zero address. And because C runs on the abstract machine, not some actual platform with whatever weird behaviour, the question of what happens if we try to do platform illegal operations never comes up.
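The atoi() behaviour mentioned above is easy to see next to an interface that reports failure out of band. The helper `parse_int` below is a hypothetical name used only for this sketch.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// std::atoi returns 0 both for the text "0" and for text that is not a
// number at all, so 0 doubles as an in-band error sentinel.
//
// parse_int reports success separately from the parsed value.
bool parse_int(const char* text, int& out) {
    char* end = nullptr;
    errno = 0;
    long value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return false;  // no digits at all, or trailing junk
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return false;  // does not fit in an int
    out = static_cast<int>(value);
    return true;
}
```

With this helper, "0" parses successfully to zero while "junk" is rejected, whereas `std::atoi` yields 0 for both.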

Most platforms are likely to either not be fazed at all by the all-zeroes address, or to be equally concerned with some other address values, including values beyond some logical "end of memory", ROMs, and memory-mapped peripherals. We can observe that the C language does not define special behaviour for any of these, only NULL, which means something in the abstract machine, and *that* is why it's used as a sentinel value.


Copyright © 2026, Eklektix, Inc.