
Footguns

Posted Jul 18, 2021 1:25 UTC (Sun) by HelloWorld (guest, #56129)
In reply to: Footguns by Vipketsh
Parent article: Rust for Linux redux

> However, I think that there are still reasonable expectations for things the compiler will not do (e.g. start operating on non-rings).
Let's say you do an out-of-bounds write somewhere in your program. There's a chance that it might overwrite the piece of machine code that was generated from the "return be || !be" statement, and it might replace it with "return 0" (I'm aware of W^X etc, but not all platforms support that). There's no way the compiler can be expected to guard against that. I think a reasonable debate can be had about what exactly should trigger undefined behaviour, and it seems to me that compiler authors agree, as evidenced by the fact that they offer flags like -fwrapv. But I don't think it's reasonable to expect compilers to make any guarantees about what happens after UB has been invoked, not even for the be || !be example.
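
To see what a flag like -fwrapv actually buys you, here is a minimal sketch of mine (not taken from any compiler's documentation): with default options an optimising compiler may assume signed overflow never happens and fold the comparison below to 0, while with -fwrapv the addition is defined to wrap in two's complement, so the function returns 1 when x == INT_MAX.

    #include <limits.h>
    #include <stdio.h>

    /* UB on signed overflow per the standard; well-defined wrapping with -fwrapv. */
    static int increment_overflows(int x)
    {
        return x + 1 < x;
    }

    int main(void)
    {
        printf("%d\n", increment_overflows(INT_MAX));
        return 0;
    }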

> Reality and de-facto standards are also important. For decades compilers compiled code with the semantics of wrapping 2's complement signed arithmetic thereby creating an implicit agreement that that's how signed integers work.
Again, SimCity was working fine for many years, despite having use-after-free bugs. Does that mean there's an "implicit agreement" that that's how free should “work”?
I think the criterion should be that a behaviour is preserved (at least optionally) if it has been around for a long time and a reasonable person might have specified it in some form of documentation. Reasonable people might have written a C standard where two's complement representation is specified for signed integer types. But no reasonable person would write a specification for free that says you can still use the data you just freed, because that just doesn't make sense. Now of course people have different ideas about what reasonable people would or wouldn't do, but my point is that “this has been working for years, so it must continue to work” just isn't sufficient by itself.



Footguns

Posted Jul 18, 2021 8:14 UTC (Sun) by Vipketsh (guest, #134480) (9 responses)

> There's a chance that it might overwrite ...

So, I think this is a key point to understand about C. In C you can, and want to, have pointers going to any arbitrary piece of memory (e.g. for I/O access), and as you say no standard can hope to define what happens when some data structure or, in more extreme cases, code is arbitrarily overwritten. The reasonable thing to do when analysing C code is to assume such things don't happen, and not to go out of your way to find such cases and generate garbage.

> There's no way the compiler can be expected to guard against that.

Sure, it's unreasonable to expect the compiler to guarantee that your pointers don't overwrite something you didn't expect. It's a completely different matter when the compiler goes out of its way to look for these things and then generates complete garbage, not because that's sane for some surprising reason, but because "it's allowed".

Let's also not forget that in the original example of (be || !be)=false there was no random overwriting of anything. It was reading an uninitialised variable, which is a little easier to reason about -- there is no specific value one can expect when reading an uninitialised variable, but at the same time it is reasonable to expect that the value one gets does not fall outside the set of allowed values for that type.

There is also another piece to this puzzle: only clang assigns values outside the set of allowed values for the type; gcc assigns 0. Yet gcc's performance on the all-important SPEC benchmarks is not lagging miles behind clang's, so this is clearly not an all-important optimisation. Which raises the yet-unanswered question: what advantage does clang's behaviour have? The disadvantages are clear.
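
For reference, a minimal sketch of the kind of code being discussed (the variable name is mine): reading be uninitialised is undefined behaviour, so an optimiser is permitted to give the expression any value at all, which is how the "false" result can arise.

    #include <stdbool.h>
    #include <stdio.h>

    static bool tautology(void)
    {
        bool be;            /* deliberately left uninitialised */
        return be || !be;   /* always true in ordinary logic; reading be is UB in C */
    }

    int main(void)
    {
        printf("%d\n", tautology());   /* observed result differs by compiler and flags */
        return 0;
    }

For what it's worth, both gcc and clang have a -Wuninitialized warning that can catch simple cases like this one.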

> there's an "implicit agreement" that that's how free should “work”?

In some ways, yes, I do think there is implicit agreement here. I also think that use-after-free is not a good example: just as it is impossible to define what overwriting arbitrary pieces of memory will result in, it is impossible to define the result of a use-after-free. Still, I think the compiler should not go out of its way to detect use-after-free type situations and then generate garbage in response.

A different but similar example would perhaps be the case of "The Sims", which was known to write past allocated memory areas. It happened to work on Windows 9x. In later versions of Windows the allocator changed and put some internal data structures in those locations, with the result that the game no longer worked. In the end Microsoft changed the allocator behaviour such that if you asked for X bytes you got X+Y bytes. Not logical behaviour when looking at an allocator, but well defined. For me, such changes are a reasonable price to pay for compatibility.

Footguns

Posted Jul 18, 2021 11:34 UTC (Sun) by jem (subscriber, #24231) (8 responses)

> The disadvantages are clear.

Out of curiosity, what are these clear disadvantages? If be is undefined, then (be || !be) is also undefined. This is "garbage in, garbage out" at work. You can't create information out of nothing. It's madness to start reasoning about this before fixing the underlying problem, i.e. making sure the variable has a value before using it.

On the contrary, I would say the compiler did you a favour by giving a hint, in the form of an unexpected result, that there is something wrong with the code. Of course it would have been more helpful if the compiler had printed the real error, namely that the variable is uninitialised -- which is what compilers usually do nowadays, when they are able to detect it.

Footguns

Posted Jul 18, 2021 12:43 UTC (Sun) by Vipketsh (guest, #134480) (5 responses)

> what are these clear disadvantages?

It breaks the foundations of the arithmetic all users expect.

My experience with Verilog, which has a number system that is not a ring, as opposed to (almost) all software, tells me that the number one thing people expect is ring-like behaviour, more so than operating on an infinite set. You can argue that people are mentally deficient when they expect that if they have INT_MAX apples, acquire an additional two and finally eat two of them, there are going to be INT_MAX left instead of a "poison" number of apples, but perhaps it would be easier to just apply the models that have worked in the real world for more than a century?
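
To make the apples example concrete, here is a small sketch (mine): unsigned arithmetic is used because C defines it to wrap, standing in for the ring model being described; doing the same with a plain int would be undefined behaviour at the overflow.

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int apples = (unsigned int)INT_MAX;   /* start with INT_MAX apples */
        apples += 2u;   /* acquire two more: wraps past INT_MAX */
        apples -= 2u;   /* eat two: wraps straight back */
        printf("%d\n", apples == (unsigned int)INT_MAX);   /* prints 1 */
        return 0;
    }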

> You can't create information out of nothing.

Pretty much the entirety of mathematics is about "creating information out of nothing" and that's the beauty of it. Guess what? In the given case, the maths says that (be || !be) is true. Not undefined or "poison", but true. In all cases. No exceptions.

You could argue that this isn't a normal system but "LLVM special". It's a valid argument, but you can't have it both ways and keep sane results, as LLVM tries in vain to do: it uses a non-ring but then applies transformations that are valid only on a ring.

> On the contrary, I would say the compiler did you a favour by giving a hint in the form of an unexpected result.

What an overly kind compiler that is. What's next? I should be donating them a kidney because they were oh-so-kind to insert an arbitrary write in my code, corrupting memory as a way of "warning me of an unexpected result"?

It's incredibly difficult to figure out what happened when the compiler is doing everything right, and your code is clearly at fault (think: memory corruption). Suggesting that the compiler introducing random behaviour "is helping the user" is preposterous.

Footguns

Posted Jul 18, 2021 13:24 UTC (Sun) by mathstuf (subscriber, #69389) (4 responses)

I'm OK with the ring argument to an extent, but I don't think one can actually apply it to the C Abstract Machine. My main question is: how far do we go down this road? I understand that `be || !be` is "trivial", but is `i < 2 || is_prime(i) || is_composite(i)` supposed to be something the compiler can reason about? If not, why not? After all, every value in the domain for integer types in C is fine here. To go even further, one could have `i < 2 || is_odd(i) || goldbach_conjecture_holds(i)`. While not known for *all* numbers, this is known for up to 2³²-1 at least (64-bit unsigned seems to be a *bit* past the current proof available; 4e18 versus 1.8e19). Where do we draw the line for such "if we treat the types as holding to the rules of a ring, the value is always N, so replace it with such"?

Of course, this is assuming that `be` and `i` do not change from access to access. However, this is exactly the kind of assumption that is allowed for uninitialized variables. Sure, on machines today the values in the normal range might be the only values supported, and they won't change willy-nilly, but the Mill has a NAR ("not a result") value; this could certainly be the representation for uninitialized variables, in which case C would either need to inject an initialization for you (what rules exist for this?) or do some complicated runtime NAR tracking to handle the cases given above (because the operations will generally result in NAR propagation).

I think people need to remember that "the hardware" is *not* what C targets. Hasn't been for a long time either. The rules of the C Abstract Machine are not beholden to "what currently prevailing hardware does", nor should they be, unless the language wants to do one of:

- make uninitialized variables a compile error (like Rust does) and require source code changes to continue using newer compilers;
- pessimize future potentials (such as having to inject initialization into every POD variable on the Mill); or
- declare that hardware today has all of the properties we can expect and enshrine them.

Of course, C (and C++) have painted themselves into a corner by trying to say:

- old code will continue to compile (though not necessarily with the same behaviors :/ ) (this has been broken somewhat by the removal of trigraphs and maybe removing some EBCDIC allowances);
- we don't want to enshrine existing architectures and require future hardware to emulate any decisions made here (e.g., two's complement (yes, C++ has done this, but I don't know the status for C), IEEE floating point formats, etc.);
- ABI compatibility; and
- as close to "zero cost abstractions" as possible.

Rust, on the other hand, says "we're willing to give up some future opportunities and support for older platforms in exchange for better reasoning about the code at compilation time on the hardware that is widely used today". Sure, if some new floating point format appears, Rust's IEEE expectations are going to have to be emulated where that gets used preferentially. But it means that Rust has chosen "working on extant hardware today as well as possible" over "works on ancient stuff, current stuff, and whatever may exist in the future". Given how much extant language design ends up influencing marketable hardware, I think the former is more useful. But who knows, maybe security will become a big enough problem that capability-based processors will be a thing, and Rust will then be in a corner while C is able to morph over into defining its behavior on such a machine. It's not a bet I would take, but it's also a possibility.

Footguns

Posted Jul 18, 2021 14:17 UTC (Sun) by khim (subscriber, #9252)

> Sure, machines today might have all values in the normal range be the only values supported

Period. Full stop. End of discussion. If certain behavior is dictated by hardware then the compiler shouldn't invent its own rules which permit it to "optimize" (read: destroy) programs.

Sure, it may invent some additional rules (e.g. Rust may use the fact that certain types don't use all bit patterns to save some memory) — but then it becomes the responsibility of the compiler to maintain these additional rules, and it must reject an attempt by the user to violate them.

It's idiotic to assume that the user is simultaneously super-advanced and knowledgeable, remembering all these hundreds of UBs defined in the standard (and not defined by the standard too, as we now know), yet at the same time dumb enough to write code which would unconditionally do something wrong.

> I think people need to remember that "the hardware" is *not* what C targets.

Why not? The only reason compilers exist and are used is to take programs and execute them on hardware. Why make that task unnecessarily complicated?

> But who knows, maybe security will become a big enough problem that capability-based processors will be a thing and Rust will then be in a corner and C able to morph over into defining its behavior on such a machine.

Dream on. Just don't forget to do a reality check when you wake up.

> It's not a bet I would take, but it's also a possibility.

No. There is no such possibility. People are not writing code for the “abstract C machine”. They are writing code for the existing hardware and then fighting the compiler till it works.

I have talked with a guy who participated in the development of the compiler for the E2k CPU (which is a capability-based processor… or, rather, it was in the initial design; I'm not sure if they kept it).

Approximately zero non-trivial C programs can be compiled in strict mode, because in any non-trivial program sooner or later you hit code which assumes that a pointer is just a number. Maybe a weird number (like a far pointer in 16-bit Windows), but still a number.

Similarly, you hit code which assumes that numbers are two's complement, and so on. Hyrum's Law ensures that you can't ever move a non-trivial C or C++ codebase to a radically new architecture (one of the reasons why all CPUs today are so strikingly similar, BTW).

Safe Rust rules, on the other hand, can happily coexist with a capability-driven CPU. And given that Rust developers try to minimize the use of unsafe Rust (and it's always clearly marked when used), a port of Rust code to a capability-driven CPU is quite feasible.

If we ever switch to capability-driven CPUs, then C and C++ won't survive the transition for sure, while Rust might.

> To go even further, one could have i < 2 || is_odd(i) || goldbach_conjecture_holds(i)

What's the issue with that code? Pick any single value and calculate the answer, if you can.

> C would either need to inject an initialization for you (what rules exist for this?) or do some complicated runtime NAR tracking to handle the cases given above (because the operations will generally result in NAR propagation).

As long as the mental model “an uninitialized variable contains whatever garbage was found there when the memory was allocated” holds… people will accept it. Sure, there are programs which would be broken by such optimizations, but it's very hard to find someone who thinks they should work.

Using the standard as a guide which captures the current state of affairs is fine, too. But when following the standard gives a surprising (to the user) result, then changes to the standard should be contemplated as well. I remind you, once more, what the C committee said explicitly:

Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior (emphasis mine).

WTH was this recommendation (which would have been widely welcomed by C and C++ developers) followed so rarely and contradicted so often?

Footguns

Posted Jul 18, 2021 16:49 UTC (Sun) by Vipketsh (guest, #134480) (2 responses)

Is your argument here that it is unreasonable to expect the compiler to be able to prove properties of expressions and perform optimisations based on those properties? If that is indeed your argument, I agree with you. It's also the reason why, in the "p = realloc(q, ..); if (q == p) ..." example, I think it is wrong to argue "but, I checked the pointers are equal" because there are a lot of expressions which can be true if and only if p and q are equal, and it is not reasonable to expect the compiler to see that in all cases. In that case I think the aliasing rules are just messed up.
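
For readers who haven't seen that example, here is a rough sketch (mine, with names matching the comment): after a successful realloc the old pointer value is indeterminate, so even when the bit patterns compare equal the compiler may assume p and q do not alias, and variants of this program have been observed to print the old value.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int *q = malloc(sizeof *q);
        if (q == NULL)
            return 1;
        *q = 1;

        int *p = realloc(q, sizeof *p);   /* on success, the old value of q is indeterminate */
        if (p == NULL) {
            free(q);
            return 1;
        }

        if (q == p) {             /* even this comparison uses the stale pointer value */
            *p = 2;
            printf("%d\n", *q);   /* an optimiser may keep using the earlier store through q */
        }
        free(p);
        return 0;
    }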

The reason I think the (be || !be)=false result is unreasonable is not that LLVM is unable to prove that it should evaluate to true. Instead, my problem is that LLVM goes out of its way to use a non-ring to evaluate it, and I have no idea what the advantage is in doing that. If it just didn't try to do anything special, perhaps even just emitted the expression as-is in machine instructions, it would arrive at the expected result.

Footguns

Posted Jul 18, 2021 17:14 UTC (Sun) by khim (subscriber, #9252)

> Instead, my problem is because LLVM goes out of its way to use a non-ring to evaluate it and I have no idea what the advantage is in doing that.

“Dead code elimination”. The idea is the same as with most “optimizations” which break formerly valid code: the programmer is assumed to be an utterly and inescapably schizophrenic entity which:

  1. Keeps in mind all the hundreds of UBs (even if compiler authors couldn't themselves compile an adequate list of them in the case of C++) and never, just NEVER, violates them.
  2. Produces complete and utter garbage in place of code (probably because his head is so full of the aforementioned rules that there is no space for anything else in it), including lots of pointless manipulations which process undefined values.

If we are dealing with such an entity then using #1 to combat #2 is entirely reasonable.

Unfortunately, in real life programmers are the exact opposite: they tend not to produce much garbage, yet sometimes create UB by mistake. Applying these principles to code written by real people turns compiler writers into their users' adversaries, but that's OK since these users are not the ones who pay compiler writers' salaries, isn't it?
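
As a concrete illustration of "using #1 to combat #2" (my sketch, not anything from a real code base): because dereferencing a null pointer is UB, the compiler may conclude that p cannot be null by the time the check runs, and delete the check and its branch as dead code.

    #include <stdio.h>

    int read_with_check(int *p)
    {
        int value = *p;      /* if p were NULL, UB has already happened here... */
        if (p == NULL) {     /* ...so an optimiser may treat this branch as dead */
            fprintf(stderr, "null pointer!\n");
            return -1;
        }
        return value;
    }

GCC exposes this class of transformation as -fdelete-null-pointer-checks, and the Linux kernel builds with -fno-delete-null-pointer-checks to opt out of exactly this.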

Footguns

Posted Jul 19, 2021 8:36 UTC (Mon) by smurf (subscriber, #17840)

> I think it is wrong to argue "but, I checked the pointers are equal" because there are a lot of expressions which can be true if and only if p and q are equal

That's not the problem here, and if it were, then the compiler should treat them as potentially aliased.

Instead, it treats p as both valid (the location it points to can be written to) and invalid (it cannot possibly point at the same location as another pointer) at the same time. That's the very antithesis of correctness.

Footguns

Posted Jul 18, 2021 13:08 UTC (Sun) by mpr22 (subscriber, #60784) (1 response)

> I would say the compiler did you a favour by giving a hint in the form of an unexpected result.

The compiler doing me a favour would be if it defaulted to throwing a fatal error (not a warning, a fatal error, your code doesn't compile, fix it please) when it is asked to compile code which invokes UB 100% of the time and can be cheaply proven to do so.

Footguns

Posted Jul 19, 2021 9:56 UTC (Mon) by farnz (subscriber, #17727)

This is something I hope future languages steal from Rust (along with Esteban Kuber's great work on high quality error messages): there is no UB in parts of the code not marked out as potentially containing it.

Now, there's a gotcha with this: while it's always an unsafe block in Rust that actually invokes UB, it's possible for the preconditions for that UB to be set up by Safe Rust (e.g. you set an offset into an array in safe code, but dereference it in unsafe code). Still, you always know where UB could possibly be found in a body of Rust code, and you can use module scoping (privacy) to ensure that UB can't leak - with that done, it's possible to check an entire module that contains an unsafe block and be confident that your human analysis shows no UB.

Footguns

Posted Jul 18, 2021 10:17 UTC (Sun) by Cyberax (✭ supporter ✭, #52523)

> Now of course people have different ideas about what reasonable people would or wouldn't do, but my point is that “this has been working for years, so it must continue to work” just isn't sufficient by itself.

Another story about compilers: it took Microsoft more than 5 years to move their C# and C++ compilers onto a new backend. Not because it was so complex, but because they wanted to maintain bug-for-bug compatibility with the old compiler.

They went to their biggest clients and made sure that huge codebases, like the ones Adobe has, worked perfectly.

This is another reason why the Windows platform is still the main desktop environment.

