
realloc

Posted Jul 12, 2021 22:25 UTC (Mon) by khim (subscriber, #9252)
In reply to: realloc by tialaramex
Parent article: Rust for Linux redux

> The insistence that "Pointer provenance is not a thing" appears to be a false claim about reality.

What kind of reality?

> As you have noticed, programs written with your belief do not necessarily work.

Yes. Compilers are broken. And now we even know they're broken, and not by accident. So?

> Just as C programmers are often too clever for their own good, sometimes the standards committee is likewise and that's reflected in #260.

Indeed. Instead of throwing away premature and unfinished ideas about how the so-called “pointer provenance” has to work, they acquiesced to compiler writers' demands (or, more likely, an ultimatum, but I wasn't there and couldn't say for sure). Yet instead of saying that the standard is wrong they just “came to a number of conclusions as to what it would be desirable for the Standard to mean”. IOW: they proposed that compiler developers start developing some sane set of rules for C users to follow.

Instead of doing that, compiler developers went on to invent unproven, buggy and half-specified schemes which were, as they claim, permitted by DR260. Indeed, even the latest proposal from last year explicitly says “DR260 CR has never been incorporated in the standard text” and laments the sad fact that DR260 only explicitly permits implementations to “track the origins of a bit-pattern and [...] may also treat pointers based on different origins as distinct even though they are bitwise identical”.

Which basically means: the output from that realloc example is still buggy. Even if you incorporated the DR260 resolution into the standard, you still couldn't get the output 1 2 in question. The most you could get is elimination of the comparison and of all the code that runs after that branch.

Now, you may try to argue that compiler was clever enough to notice that the only situation where p is equal to q and both are valid… nope, no cigar
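For readers without the earlier messages at hand: the realloc example is not reproduced in this comment, so the comment block in the sketch below shows the form this kind of snippet usually takes (an assumption, not a quote from the thread). The function after it is a hypothetical helper, `store_after_realloc`, that asks the same "did the block move?" question without ever touching the old pointer: it snapshots the address as an integer before the call and afterwards reads and writes only through the pointer realloc returned. It assumes `uintptr_t` is available, which the C standard makes optional.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Pattern of the kind under dispute (kept in a comment on purpose,
 * because it reads and writes p after p was passed to realloc):
 *
 *     int *p = malloc(sizeof(int));
 *     int *q = realloc(p, sizeof(int));
 *     *p = 1;
 *     *q = 2;
 *     if (p == q)
 *         printf("%d %d\n", *p, *q);
 */

/*
 * Hypothetical helper: stores through the pointer returned by realloc
 * and returns the value read back, or -1 if an allocation fails.
 * *moved is set to 1 if the block changed address, 0 otherwise.
 */
int store_after_realloc(int *moved)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    *p = 1;

    /* Snapshot the address as an integer while p is still valid. */
    uintptr_t before = (uintptr_t)p;

    int *q = realloc(p, sizeof *q);
    if (q == NULL) {
        free(p); /* on failure the original block is left untouched */
        return -1;
    }

    /* Compare integers, not the stale pointer. */
    *moved = ((uintptr_t)q != before);

    *q = 2; /* only the live pointer is dereferenced */
    int result = *q;
    free(q);
    return result;
}
```

Whether the commented-out form should be required to behave the same way is exactly what this thread is arguing about; the helper only shows a formulation that does not depend on the answer.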

> This also happened with the Memory Model, C++ 17 "temporarily discouraged" use of consume ordering because eh, it seemed like a good idea when it was invented and now it does not.

Yes, but at least there some memory model was actually accepted and added to the text of the standard. Nothing like that happened with DR260.

> it seemed like a good idea when it was invented and now it does not.

And I would argue that “pointer provenance” was a very bad idea and shouldn't have been brought into the realm of C as an unconditional property. The previous proposal at least tried to acknowledge that and proposed two modes, test macros and so on.

But it was rejected. Most likely because it was “too much work” to implement.

So… it's too much work for the compiler writers to implement a -fno-provenance switch, yet reviewing and rewriting billions of lines of code to make them compliant with new “optimizations” (which break code that was acceptable for decades) is not “too much work”? WTH?


realloc

Posted Jul 13, 2021 13:13 UTC (Tue) by tialaramex (subscriber, #21167)

> What kind of reality?

This kind. Where your program doesn't do what you expected.

C and C++ are ISO standards. But, ISO standards are just written documents. One of the insights of the IETF is appropriate here. If there's a standards document which says X, but everybody does Y, then X is not in fact the standard. But worse than that, the standards document might (hopefully inadvertently) say that 2+2=5, and even if everybody tries very hard to obey that standard they can't.

For some of the ISO standards the difference between what the text says, what it is understood to mean, and what is practical, is negligible. ISO 216 A-series paper sizes are simple enough that this works.

But C++ is very far from that. So, the C++ standards document says words, but in some cases those words turn out to be incoherent nonsense (like 2+2=5) as happened for the Memory Model. In other cases, as here with pointers, the words imply that C++ is a language whose correct implementation has terrible performance. But nobody wants terrible performance (sometimes in this sort of forum they'll _say_ they want terrible performance, but then immediately they demand a way to "opt out" and they never switch it off) so what you'll actually get is not that language.

For years Java had to pretend there was a special "less strict" interpretation of how floating point numbers work distinct from how they're documented in the actual language standard, which was optional and might be (read: was) switched on for some (read: Intel architecture) platforms. Eventually the terrible Intel x87 FPU was obsolete and Java removed this "feature" entirely. It was a necessary evil, until it wasn't, and then it was gladly killed.

Anyway, not only do you have the problem that C++ as-it-is-compiled is not the written standards document, because rooms full of people translating your C++ into assembly language would be a horror show -- but worse, the optimizers, which you can't live without, do not actually optimise C++. They have an Intermediate Representation and optimize that.

One of the things that makes some people unhappy about Rust is that it doesn't have a written standard. If you've got an unjustifiable faith in the power and correctness of standards documents this can feel like a big obstacle. But in truth Rust's bigger obstacle isn't the lack of a standards document for Rust the programming language, but for the LLVM IR. Everybody is agreed that Rust has different semantics from C++ and so it needs to express those in the LLVM IR. Famously for example infinite loops aren't a thing in C++ (forward progress is mandatory, sometimes the compiler can't be sure if there is progress and so the running program is actually an infinite loop, but if it knew that it would elide the loop entirely) so Clang needn't express "this is an infinite loop" because there aren't any in C++ whereas in Rust they're a thing, so rustc needs to express that and have LLVM produce correct code for an infinite loop. But, since the IR is not formally standardised, there is no analysis which says the optimisations actually work on the IR, only that they seem to work for C++. Sometimes a C++ programmer will optimise the IR in a way that deletes Rust's infinite loop. Oops.
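C has a narrower version of the forward-progress rule described above for C++: under C11 6.8.5p6, a loop whose controlling expression is not a constant expression and whose body performs no I/O, volatile accesses, or atomic or synchronization operations may be assumed by the implementation to terminate. A minimal sketch of what that permits, using a hypothetical function `reaches_one` (the name and the Collatz iteration are illustrative choices, not from the thread):

```c
/*
 * Hypothetical example: walks the Collatz sequence starting at n and
 * returns 1 once it reaches 1.
 *
 * The loop has a non-constant controlling expression and no side
 * effects visible outside the function, so a C11 compiler is allowed
 * to assume it terminates and may compile the whole function down to
 * "return 1" without ever iterating. For n == 0 the abstract machine
 * would spin forever (0 / 2 == 0), yet such a build returns 1 anyway.
 */
int reaches_one(unsigned long n)
{
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    }
    return 1;
}
```

For inputs that do reach 1, such as 27, both readings agree. The disagreement shows up only for a loop that never exits, which is the case a language with real infinite loops needs the backend to preserve.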

realloc

Posted Jul 13, 2021 14:41 UTC (Tue) by khim (subscriber, #9252)

> If there's a standards document which says X, but everybody does Y, then X is not in fact the standard. But worse than that, the standards document might (hopefully inadvertently) say that 2+2=5, and even if everybody tries very hard to obey that standard they can't.

Except that's not what I hear when I bring up the question of overflowing shifts or nullptr arithmetic.

When I bring this up and ask for some sanity (like: I don't care what i << 32 would be as long as (i << 32) & 0 is still 0) the answer is always the same: the “holy standard” proclaims that it's UB, thus go away and don't bother us with trifles, fix your program instead (it's usually more polite, but the idea is always the same).
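For context, C11 6.5.7 makes a shift undefined when the count is greater than or equal to the width of the promoted left operand, so with a 32-bit operand even (i << 32) & 0 carries no guarantee. The "fix your program" answer usually amounts to something like the sketch below. The helper name `shl32` and the choice of 0 as the result for oversized counts are assumptions made for illustration, not anything the standard or a particular compiler provides.

```c
#include <stdint.h>

/*
 * Hypothetical helper: left shift of a 32-bit value that yields a
 * defined result for every count. A plain `value << count` is
 * undefined when count >= 32 for a 32-bit operand, so that case is
 * handled explicitly and mapped to 0 (an illustrative choice).
 */
uint32_t shl32(uint32_t value, unsigned count)
{
    return count < 32 ? value << count : 0u;
}
```

With this wrapper, (shl32(i, 32) & 0) is 0 for every i, which is the modest guarantee asked for above, at the cost of a comparison that the hardware shift instruction may or may not make redundant.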

I'm more than willing to discuss things like this with Rust developers because they:

  1. Explain their decisions in plain English.
  2. Offer me an alternative which I may use instead.

I would have been willing to discuss that with C/C++ developers if they hadn't brought sayings from the “holy standard” into discussions about how I should deal with mmap or shift-by-32, and had instead treated these questions the way the IETF (or Rust developers) treat them.

At times it felt like I was talking with devout Jews who believe that all the answers can be found in the Torah. It looked somewhat acceptable (if tiring after some time) as long as they were actually devout believers and actually tried to follow their “holy scripture”.

But when they started talking about pointer provenance and brought up DR260, which is a 2004 decision not incorporated into their “holy scripture” in the course of the last 15 years (and two published standards)… I realized the depth of the hypocrisy. They were knowingly breaking standard-compliant programs for more than a decade while simultaneously preaching the standard as the source of truth.

> But C++ is very far from that.

Maybe, but that's not what C/C++ compiler developers usually say. You can't have your cake and eat it too. Well… you can try, but when you are eventually caught, you lose what little trust you had.

> For years Java had to pretend there was a special "less strict" interpretation of how floating point numbers work distinct from how they're documented in the actual language standard, which was optional and might be (read: was) switched on for some (read: Intel architecture) platforms. Eventually the terrible Intel x87 FPU was obsolete and Java removed this "feature" entirely. It was a necessary evil, until it wasn't, and then it was gladly killed.

But note that both Strict and Nonstrict Floating-Point Arithmetic were described in books and present from the beginning. It was nothing like what C/C++ compiler developers are trying to do with their “we don't even know the rules ourselves in 2020, but let's pretend they were already there in C in 1989 and demand that C users obey them”.

> If you've got an unjustifiable faith in the power and correctness of standards documents this can feel like a big obstacle.

I don't. But C/C++ compiler developers do. Except when it doesn't suit them. That is the problem.

Basically: the stance “if the compiler user violated the standard and the program was miscompiled then it's the fault of said user and there will be no leniency” combined with “if the compiler violated the standard and the program was miscompiled then it's the fault of the standard, and the standard (not the compiler) will be fixed” is flat out unacceptable.

> But in truth Rust's bigger obstacle isn't the lack of a standards document for Rust the programming language, but for the LLVM IR.

Well… LLVM IR together with the need for optimizations. Rust without optimizations is not viable, which means that for the foreseeable future it will have to take the limitations of LLVM into account. That's a serious problem, but as long as Rust developers are trustworthy and don't adopt the C/C++ developers' “it's my way or the highway” stance it's manageable.

> Sometimes a C++ programmer will optimise the IR in a way that deletes Rust's infinite loop. Oops.

Oops indeed. But I think if Rust adopted a stance similar to what Java had and explicitly said: “here is what may happen because we rely on the untrustworthy LLVM” then people would accept that.

Maybe some time down the road Rust will be big enough to afford a completely separate compiler and these warts can be fixed for real, but no one expects them to do the impossible. Because they in turn don't demand the impossible from compiler users.

In fact Rust developers say that explicitly: we need better language specs. They admit that unsafe Rust is underspecified and that they can't actually explain how and what can be done in it safely.

But for them it's a problem. They don't pretend that everything's peachy and that Rust users will follow the rules even though the developers can't write those rules down themselves.

On the other hand C/C++ compiler developers made the standard text sacred by constantly insisting that we should ignore everything else when we talk about C/C++ programs (“POSIX, x86/ARM/RISC-V architecture specs and so on are all irrelevant and must be ignored; if you want to change even one jot you have to change the standard first” was their persistent public stance for years).

realloc

Posted Jul 13, 2021 23:12 UTC (Tue) by foom (subscriber, #14868)

FWIW, the handling of infinite loops in LLVM IR was codified late last year with the addition of the "mustprogress" function and loop attribute in LLVM IR, whereby languages which want to require forward progress must explicitly opt into the requirement. ("forward progress" being defined by the C and C++ standards, e.g. https://eel.is/c++draft/intro.progress for C++).


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds