|
|
Subscribe / Log in / New account

DeVault: Announcing the Hare programming language

DeVault: Announcing the Hare programming language

Posted May 5, 2022 17:28 UTC (Thu) by nybble41 (subscriber, #55106)
In reply to: DeVault: Announcing the Hare programming language by foom
Parent article: DeVault: Announcing the Hare programming language

> A _consequence_ is that "*foo = 0;" doesn't imply "foo != nullptr", as it otherwise does (so it does have the effect of "not deleting" THAT null pointer check).

A further consequence of enabling this flag is that you are no longer programming in ISO C:

> An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

… or C++:

> A null pointer constant is an integer literal (5.13.2) with value zero or a prvalue of type std::nullptr_t. A null pointer constant can be converted to a pointer type; the result is the null pointer value of that type (6.8.2) and is distinguishable from every other value of object pointer or function pointer type.

… since a null pointer can no longer be distinguished from a pointer to an object or function.


to post comments

DeVault: Announcing the Hare programming language

Posted May 5, 2022 18:10 UTC (Thu) by farnz (subscriber, #17727) [Link] (5 responses)

I'm not sure I follow your reasoning, and I'd appreciate you expanding on it.

Take the following C++ code:


bool bad_code(bool deref_null) {
    int *foo;
    int real_val;
    int *bar = deref_null ? nullptr : ℜ_val;
    *bar = 0;
    return bar == nullptr;
}

I don't see how the snippets you've quoted make it impossible for this function's return value to differ from its deref_null parameter. The null pointer remains a unique value; the behaviour of *bar = 0 is undefined, but importantly, if I remove that line, the function behaves the same in both ISO C++ and C++ with -fno-delete-null-pointer-checks - the distinction is that in ISO C++, this function can be optimized to the equivalent of:


bool bad_code(bool) { return false; }

while with -fno-delete-null-pointer-checks, it can only be optimized to:

bool bad_code(bool deref_null) { return deref_null; }

Although, in both cases, it's perfectly reasonable to elide or not elide the write to pointer value 0, since that write is UB.

DeVault: Announcing the Hare programming language

Posted May 5, 2022 19:12 UTC (Thu) by nybble41 (subscriber, #55106) [Link] (4 responses)

> The null pointer remains a unique value; …

Unique, yes—there is only one null pointer value—but not distinct from any pointer to an object or function. With the -fno-delete-null-pointer-checks flag enabled you can have a pointer to a valid object which compares equal to a null pointer.

> I don't see how the snippets you've quoted make it impossible for this function's return value to differ from its deref_null parameter.

(I am assuming that "ℜ_val" in your example was supposed to be "&real_val". I'm not sure of the purpose of the unused pointer variable "foo".)

According to ISO C++, with the "*bar = 0" line deleted, the return value must be equal to "deref_null". The "bar" pointer can only be "nullptr" when deref_null is true or "&real_val" when deref_null is false, and "&real_val", as a pointer to an object, can never compare equal to "nullptr". With the "*bar = 0" line it's UB when deref_null is true and so could be optimized to just "return false", as you said.

However, with the -fno-delete-null-pointer-checks is enabled, we do not have the guarantees of ISO C++ and "nullptr" could in theory compare equal to a pointer to an object, e.g. if pointers are represented as byte addresses, "nullptr" is represented as byte address zero, and the object (in this case "real_val") happens to be placed at byte address zero. If this happened then "bar == nullptr" would be true even if deref_null is false, so the function cannot be optimized to just "return deref_null".

DeVault: Announcing the Hare programming language

Posted May 6, 2022 13:38 UTC (Fri) by farnz (subscriber, #17727) [Link] (3 responses)

Sorry about the bad code formatting - I have no idea how copying and pasting from Emacs did that.

I don't see how you get "not distinct from any pointer to an object or function" from the description of the -fno-delete-null-pointer-checks flag. As I read the documentation, -fno-delete-null-pointer-checks does not permit you to have a pointer to a valid object that compares equal to a null pointer; instead it says that the act of dereferencing a pointer implies nothing about its value. Without the flag, dereferencing a pointer implies the pointer value must not be a null pointer, since if it was a null pointer, the dereference would result in UB (since a null pointer cannot point to a valid object). With the flag, however, while the dereference itself is still UB (since a null pointer cannot point to a valid object), the compiler acts as-if each dereference of a nullptr was immediately followed by an assignment of an unknown value to the pointer.

Because the value is unknown, it could still be a null pointer, but it could also be a new pointer to a valid object - the compiler's analysis passes simply don't know at this point, and thus it cannot rely on the dereference to permit it to remove a nullptr check, since it does not know what the pointer's value is.

DeVault: Announcing the Hare programming language

Posted May 6, 2022 14:55 UTC (Fri) by nybble41 (subscriber, #55106) [Link] (1 responses)

> I don't see how you get "not distinct from any pointer to an object or function" from the description of the -fno-delete-null-pointer-checks flag.

The default is -fdelete-null-pointer-checks, which has the description: "Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero."[0] The -fno-delete-null-pointer-checks flag affects *both* of these assumptions, meaning that the compiler cannot assume "that no code or data element resides at address zero" (i.e. that no pointer to an object has the same representation as a null pointer).

As stated in the documentation the intended use of the -fno-delete-null-pointer-checks flag is platforms such as AVR where objects *can* be placed at address zero, which implies that &variable can be indistinguishable from a null pointer. Though this is more likely to be true for a global or static object than for a stack variable.

[0] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#...

DeVault: Announcing the Hare programming language

Posted May 6, 2022 16:52 UTC (Fri) by farnz (subscriber, #17727) [Link]

Thanks for clearing up my misunderstanding - for some reason, I was mentally skipping the second assumption (since on my platforms of choice,there cannot be a code or data element at address 0, and only focusing on the first assumption (that programs cannot safely dereference null pointers, which is the one that allows a compiler to deduce that if you dereference a pointer, it cannot be a null pointer).

DeVault: Announcing the Hare programming language

Posted May 6, 2022 17:26 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

> Sorry about the bad code formatting - I have no idea how copying and pasting from Emacs did that.

ampersand is the HTML/XML metacharacter for starting an entity, and although the standard says that entity references should include a final semicolon, HTML-handling software is more tolerant of the missing semicolon than is entirely ideal.

So it appears that the sequence &real gets overgenerously interpreted as an HTML entity equivalent to Unicode codepoint U+211C BLACK-LETTER CAPITAL R from the Letterlike Symbols block of the Mathematical Symbols, which has an alias name of "Real part".

DeVault: Announcing the Hare programming language

Posted May 7, 2022 6:03 UTC (Sat) by Vipketsh (guest, #134480) [Link] (1 responses)

I have to ask, what were you trying to add to the discussion ?

Sure, you are absolutely correct but what relevance does it have ? If the *compiler* has to assume that NULL points to a valid object what programs would break ? What other fallout would there be ? The only thing I can think of is that when of lawyering about "if (my_pointer == NULL)" you would have to say "Does my_pointer point to the object at address NULL?" instead of "Is my_pointer pointing to an invalid object?".

I think most interpretations of the standard, in the context of undefined behaviour, are simply done in bad faith. My opinion is that the reason that language is in there, and has to be there, is so that malloc(), or anything else that works with a pointer, can return or check for an error. And the reason dereferencing a NULL pointer is undefined is because there is no telling how a platform behaves when you do so. See how none of this has anything to do with the compiler ?

It would do so much good for these discussions if the standard and what it says were put aside. Talk about how one or another change would affect real existing programs and/or platforms. Talk about possible fallout. Talk about potential issues. Talk about benefits. Because "oh, how terrible, now some standard does not match up if I squint at this way" is completely meaningless and adds nothing. Standards, in general, should be looked upon as a nothing more than a aid to achieving interoperability (they can and do contain falsehoods). We all know that standards are violated all the time and to make things work one needs domain specific experience. Lastly, if you are writing a standard your goal should be to document the status quo and most definitely not an attempt to change the world.

DeVault: Announcing the Hare programming language

Posted May 8, 2022 12:48 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

> My opinion is that the reason that language is in there, and has to be there, is so that malloc(), or anything else that works with a pointer, can return or check for an error. And the reason dereferencing a NULL pointer is undefined is because there is no telling how a platform behaves when you do so. See how none of this has anything to do with the compiler ?

You have muddled the NULL pointer (an abstract idea) with the all zeroes address on a typical CPU, these are intentionally not the same thing.

While it's obviously a bad idea, C has no trouble with using actual values from a type as sentinels, atoi("junk") and atoi("0") are both zero. So it wouldn't have been a problem to define that malloc() returning zero can be either an error or an actual zero address. And because C runs on the abstract machine, not some actual platform with whatever weird behaviour, the question of what happens if we try to do platform illegal operations never comes up.

Most platforms are likely to either not be phased at all by the all-zeroes address, or to be equally concerned with some other address values, including values beyond some logical "end of memory", ROMs, and memory mapped peripherals. We can observe that the C language does not define special behaviour for any of these, only NULL which means something in the abstract machine and *that* is why it's used as a sentinel value.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds