Losing the magic
Posted Dec 6, 2022 18:47 UTC (Tue) by khim (subscriber, #9252)
In reply to: Losing the magic by Wol
Parent article: Losing the magic
> In other words, with C's assumption that UB is impossible, we now have a conundrum if we want to write Operating Systems in C!
Why would it be so? There is a lot of art built around how to avoid UBs in practice, starting from switches which turn certain UBs into IBs (and thus make them safe to use) to sanitizers which [try to] catch UBs like race conditions, double-frees, or out-of-bounds array accesses.
If you accept the goal (ensure that your OS doesn't ever trigger UB) there are plenty of ways to achieve it. Here is an interesting article on the subject.
I, personally, did something similar on a smaller scale (not an OS kernel, but another security-critical component of the system). We ended up with one bug in the ten years the system was in use (and that one was related to a problem with the hardware's specification).
But if you insist on your ability to predict what code with UBs would do… then you can't write an Operating System in C that way (or, rather, you can, it's just that there are no guarantees that it will work).
> Which has been my problem ALL ALONG. I want to be able to reason, SANELY, in the face of UB without the compiler screwing me over.
Not in the cards, sorry. If your code can trigger UB then the only guaranteed fix is to change the code and make it stop doing that.
> If that's an O_PONY then we really are fscked.
Why? Rust pushes UBs into a tiny corner of your code, and there is already enough research into how we can avoid UBs completely (by replacing them with markup which includes a proof that your code doesn't trigger any UBs). Here is a related (and very practical) project.
Of course even after all that we would have issue of bugs in hardware, but that's entirely different can of worms.
Posted Dec 6, 2022 20:19 UTC (Tue)
by Wol (subscriber, #4433)
[Link] (11 responses)
> Why would it be so? There is a lot of art built around how to avoid UBs in practice, starting from switches which turn certain UBs into IBs (and thus make them safe to use) to sanitizers which [try to] catch UBs like race conditions, double-frees, or out-of-bounds array accesses.
I notice you didn't bother to quote what I was replying to. IF THE HARDWARE HAS UBs (those were your own words!), and the compiler assumes that there is no UB, then we're screwed ...
Cheers,
Wol
Posted Dec 6, 2022 22:45 UTC (Tue)
by khim (subscriber, #9252)
[Link] (10 responses)
> IF THE HARDWARE HAS UBs (those were your own words!)

Not if. Hardware most definitely has UB. x86 has fewer UBs than most other architectures, but it, too, can produce mathematically impossible results! On the hardware level, without help from the compiler! If used incorrectly, of course. These UBs are the result of hardware optimizations. You can not turn these off! But you can find a series of articles explaining how one is supposed to work with all that right here, on LWN! You have probably seen them already, but probably haven't realized what they are actually covering. Why? How? What suddenly happened? The compiler deals with these UBs precisely and exactly like with any other UBs: it assumes they never happen. And then the programmer is supposed to deal with all that in the exact same fashion as with any other UBs: by ensuring that the compiler's assumption is correct.
Posted Dec 7, 2022 11:05 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (9 responses)
It's worth noting that you're making the situation sound a little worse than it actually is.
The compiler's job is to translate your program from one language (say C) to another language (say x86-64 machine code), with the constraint that the output program's behaviour must be the same as the input program's behaviour. Effectively, therefore, the compiler's job is to translate defined behaviour in the source program into identically defined behaviour in the output program.
For source languages without undefined behaviour, this means that the compiler must know about the destination language's undefined behaviour and ensure that it never outputs a construct with undefined behaviour - this can hurt performance, because the compiler may be forced to insert run-time checks (e.g. "is the shift value greater than the number of bits in the input type, if so jump to special case").
For source languages with undefined behaviour, the compiler gets a bit more freedom; it can translate a source construct with UB to any destination construct it likes, including one with UB. This is fine, because the compiler hasn't added new UB to the program - it's "merely" chosen a behaviour for something with UB.
Posted Dec 7, 2022 12:44 UTC (Wed)
by khim (subscriber, #9252)
[Link] (8 responses)
> It's worth noting that you're making the situation sound a little worse than it actually is.

You are mixing issues. Of course it's possible to make a language without UB! There are tons of such languages: C#, Java, Haskell, Python… But that's not what O_PONIES lovers want! They want the ability to "program to the hardware": lie to the compiler (because "they know better"), do certain manipulations to the hardware which the compiler has no idea about, and then expect that the code would still work. That is impossible (and I, probably, underestimate the complexity of the task).

It's as if a Java program opened /proc/self/mem, poked the runtime internals and then, when an upgrade broke it, its author demanded satisfaction and claimed that since his code worked in one version of the JRE it must work in all of them. That is what happens when you "use UB to combat UB". The onus is on you to support new versions of the compiler. Just like the onus is on you to support new versions of Windows if you use undocumented functions, and the onus is on you if you poke into Linux kernel internals via debugfs and so on. And Linux kernel developers are not shy when they say that when programs rely on such intricate internal details all bets are off. Even the O_PONIES term was coined by them, not by compiler developers!

Yes, but that's precisely what O_PONIES lovers object against. Just read the damn paper already. It doesn't for one minute even entertain the notion that programs can be written without use of UBs. They just assert that they would continue to write code with UBs ("write code for the hardware" since "C is a portable assembler") and compilers have to adapt, somehow. Then they discuss how the compiler would have to deal with the mess they are creating.

You may consider that a concession of sorts (no doubt caused by the fact that you can not avoid UBs in today's world because even bare hardware has UBs), but it's still not a discussable position, because instead of listing constructs which are allowed in the source program they want to only blacklist certain "bad things". Because it doesn't work! Ask any security guy what he thinks about blacklists and you would hear that they are always only papering over the problem and just lead to "whack-a-mole" busywork. To arrive at some promises you have to whitelist good programs, not blacklist the bad ones!
Posted Dec 7, 2022 14:18 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (7 responses)
You're arguing a different point, around people who demand a definition of UB in their language of choice L, by analogy to another language M.
I'm saying that the situation is not as awful as it might sound; if I write in language L, and compile it to language M, it's a compiler bug if the compiler converts defined behaviour in L into undefined behaviour in M. As a result, when working in language L (whether that's C, Haskell, Python, Rust, JavaScript, ALGOL, Lisp, PL/I, Prolog, BASIC, Idris, whatever), I do not need to worry about whether or not there's UB in language M - I only need care about language L, because it's a compiler bug if the compiler translates defined behaviour in language L into undefined behaviour in language M.
So, for example, if language L says that a left shift by more than the number of bits in my integer type always results in a zero value, it's up to the compiler to make that happen. If language M says that a left shift by more than the number of bits in my integer type results in UB, then the compiler has to handle putting in the checks (or proving that they're not needed) so that if I do have a left shift by more than the number of bits in my integer type, I get 0, not some undefined behaviour.
And this applies all the way up the stack if I have multiple languages involved; if machine code on my platform has UB (and it probably does in a high-performance CPU design), it makes no difference if I compile BASIC to Idris, Idris to Chicken Scheme, Chicken Scheme to C, C to LLVM IR and finally LLVM IR to machine code, or if I compile BASIC directly to machine code. Each compiler in the chain must ensure that all defined behaviour of the source language translates to identically defined behaviour in the destination language.
In other words, as you compile from language L to language M, the compiler can leave you with as much UB as you had before, or it can decrease the amount of UB present in language M, but it can never add UB. The only "problem" this leaves you with if you're the O_PONIES sort is that it means that defining what it actually means for UB to flow from language M to language L is tricky, because in the current world, UB doesn't flow that way, it only flows from language L to language M.
Posted Dec 7, 2022 15:09 UTC (Wed)
by khim (subscriber, #9252)
[Link] (6 responses)
> In other words, as you compile from language L to language M, the compiler can leave you with as much UB as you had before, or it can decrease the amount of UB present in language M, but it can never add UB.

Of course it can add UB! Every language with manual memory management, without GC, adds UBs related to it. On the hardware level there are no such UBs: memory is managed by the user when he adds new DIMMs or removes them; there can never be any confusion about whether memory is accessible or not. But Ada, C, Pascal and many other such languages add memory management functions and then say "hey, if you freed memory then the onus is on you to make sure you wouldn't try to use an object which no longer exists". The desire to do what you are talking about is what gave rise to the GC infestation and the abuse of managed code.

UBs can flow in any direction and don't, actually, cause any problems as long as you understand what UB is: something that you are not supposed to do. If you understand what UBs are and can list them — you can deal with them. If you don't understand what UB is (O_PONIES people) or don't understand where they are (the Clément Bœsch case or, if we are talking about hardware, the Meltdown and Spectre case) then there's trouble.

Ignorance can be fixed easily. But attitude adjustments are hard. If someone believes it's his right to ignore traffic lights because that's how he drove for the last half-century in his small village, then it becomes a huge problem when someone like that moves to a big city.
Posted Dec 7, 2022 16:21 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (5 responses)
You're misunderstanding me still. If there is no UB in my source code, then there is also no UB in the resulting binary, absent bugs/faults in the compiler, the OS or the hardware.
Your examples are cases where I have UB in language L, I translate to language M, and I still have UB - in other words, no new UB has been introduced, but the existing UB has resulted in the output program having UB, too. The only gotcha is that the UB in the output program may surprise the programmer, since UB in the source language simply leaves the target language behaviour completely unconstrained.
There is never a case where I write a program in language L that is free of UB, but a legitimate compilation of that program to language M results in the program having UB. If this does happen, it's a bug - the compiler has produced invalid output, just as it's a bug for a C compiler to turn int a = 1 + 2; into int a = 4;.
In turn, this means that UB in language M does not create new UB in language L - the flow of UB is entirely one-way in this respect (there was UB in language L, when I compiled it, I ended up with a program that has UB in language M).
The only thing that people find tricky here is that they have a mental model of what consequences of UB are "reasonable", and what consequences of UB are "unreasonable", and get upset when a result of compiling a program from L to M results in the compiler producing a program in language M with "unreasonable" UB, when as far as they were concerned, the program in language L only had "reasonable" UB. But this is not a defensible position - the point of UB is that the behaviour of a program that executes a construct that contains UB is undefined, while "reasonable" UB is a definition of what behaviour is acceptable.
And here we come to the underlying fun with O_PONIES: Coming up with definitions for existing UB and pushing that through the standards process is hard work, and involves thinking about a lot of use cases for the language, not just your own, and getting agreement either on a set of allowable behaviours for a construct that's currently UB, or getting the standards process to agree that something should be implementation-defined (i.e. documented set of allowable behaviours from the compiler implementation). This is a lot of work, and involves getting a full understanding of why people want certain behaviours to be UB, rather than defined in a non-deterministic fashion.
Posted Dec 7, 2022 17:28 UTC (Wed)
by Wol (subscriber, #4433)
[Link] (2 responses)
I don't know whether khim's English skills are letting him down, or whether he's trolling, but I think you've just encapsulated my view completely.
Multiplication exists in C. Multiplication exists in Machine Code. All I want is for the C spec to declare them equivalent. If the result is sane in C, then machine code has to return a sane result. If the result is insane in C, then machine code is going to return an insane result. Whatever, it's down to the PROGRAMMER to deal with.
khim is determined to drag in features that are on their face insane, like double frees and the like. I'm quite happy for the compiler to optimise on the basis of "this code is insane, I'm going to assume it can't happen (because it's a bug EVERYWHERE)". What I'm unhappy with is SchrodinUB, where the EXACT SAME CODE may, or may not, exhibit UB depending on situations outside the control of the programmer (and then the compiler deletes the programmer's checks!).
And it's all very well khim saying "the compiler writers have given you an opt-out". But SchrodinUB should always be opt IN. Principle of "least surprise" and all that. (And actually, I get the impression Rust is like that - bounds checks and all that sort of thing are suppressed in runtime code I think I heard some people say. That's fine - actively turn off checks in production in exchange for speed IF YOU WANT TO, but it's a conscious opt-in.)
Cheers,
Wol
Posted Dec 7, 2022 18:48 UTC (Wed)
by khim (subscriber, #9252)
[Link]
> All I want is for the C spec to declare them equivalent.

If that's really your desire then you sure found a funny way to achieve it. But I'm not putting you with the O_PONIES crowd. It seems you are acting out of ignorance, not malice. John Regehr tried to do what you are proposing to do — and failed spectacularly, of course. But look here:

> My paper does not propose a tightening of the C standard. Instead, it tells C compiler maintainers how they can change their compilers without breaking existing, working, tested programs. Such programs may be compiler-specific and architecture-specific (so beyond anything that a standard tries to address), but that's no reason to break them on the next version of the same compiler on the same architecture.

Basically the O_PONIES lovers' position is the following: if language M (machine code) has UB then it's OK for L to have UB in that place, but if M doesn't have UB then it should be permitted to violate the rules of L and still produce a working program.

But yeah, that's probably a problem with me understanding English or with you having trouble explaining things. How is that compatible with this? I don't see why you say that this feature is insane. Let's consider a concrete example:

> On beta versions of Windows 95, SimCity wasn't working in testing. Microsoft tracked down the bug and added specific code to Windows 95 that looks for SimCity. If it finds SimCity running, it runs the memory allocator in a special mode that doesn't free memory right away.

It looks as if your approach ("the EXACT SAME CODE may, or may not, exhibit UB depending on situations outside the control of the programmer") very much does cover double frees, dangling pointers and other such things. It's even possible to make it work if you have enough billions in the bank and an obsession with backward compatibility. The question: is that a well-spent billion? Should we have a dedicated team which cooks up such patches for clang and/or gcc? Who would pay for it? Without changing the spec (which people like Anton Ertl or Victor Yodaiken very explicitly say is not what they want) this would be the only alternative, I'm afraid.

> But SchrodinUB should always be opt IN. Principle of "least surprise" and all that.

Why? It's not part of the C standard, why should it affect good programs which are not abusing C?

> bounds checks and all that sort of thing are suppressed in runtime code I think I heard some people say

Only integer overflow checks are disabled. If you try to divide by zero you still get a check and a panic if the divisor is zero. But if you violate some other thing (e.g. if your program tries to access an undefined variable) all bets are still off. Let's consider the following example:

    bool to_be_or_not_to_be() {
        int be;
        return be == 0 || be != 0;
    }

With Rust you need to jump through hoops to use an uninitialized variable, but with unsafe it's possible:

    use std::mem::MaybeUninit;

    pub fn to_be_or_not_to_be() -> bool {
        let be: i32 = unsafe { MaybeUninit::uninit().assume_init() };
        return be == 0 || be != 0;
    }

You may argue that what Rust is doing (removing the code which follows the to_be_or_not_to_be call and replacing it with an unconditional crash) is, somehow, better than what C is doing (claiming that the value of be == 0 || be != 0 is false). But that would be a hard sell to an O_PONIES lover who was counting on getting true from it (like Rust did only a few weeks ago).

Yes, Rust is a better-defined language, no doubt about it. It has a smaller number of UBs and they are more sane. But C and Rust are cast in the same mold! You either avoid UBs and get a predictable result, or you don't avoid them and end up with something strange… and there is absolutely no guarantee that a program which works today will continue to work tomorrow… you have to ensure your program doesn't trigger UB to cash in on that promise.
Posted Dec 7, 2022 19:48 UTC (Wed)
by pizza (subscriber, #46)
[Link]
The thing is... they are! Run GCC without any arguments and you'll get -O0, ie "no optimization".
These UB-affected optimizations are only ever attempted if the compiler is explicitly told to try.
Now what I find hilarious are folks who complain about the pitfalls of modern optimization techniques failing on their code while simultaneously complaining "but without '-O5 -fmoar_ponies' my program is too big/slow/whatever". Those folks also tend to ignore or disable warnings, so.. yeah.
Posted Dec 7, 2022 19:06 UTC (Wed)
by khim (subscriber, #9252)
[Link] (1 responses)
> If there is no UB in my source code, then there is also no UB in the resulting binary, absent bugs/faults in the compiler, the OS or the hardware.

We don't disagree there, but that's not what O_PONIES lovers are ready to accept.

> There is never a case where I write a program in language L that is free of UB, but a legitimate compilation of that program to language M results in the program having UB.

Yes. Because that's what O_PONIES lovers demand to handle! They, basically, say that it doesn't matter whether L has UB or not. It only matters whether M has UB. If M doesn't have a "suitably similar" UB then a program in L must be handled correctly even if it violates the rules of language L. Unfortunately, in practice that works only in two cases.

> In turn, this means that UB in language M does not create new UB in language L - the flow of UB is entirely one-way in this respect (there was UB in language L, when I compiled it, I ended up with a program that has UB in language M).

Ah, got it. Yeah, in that sense it's a one-way street in the absence of bugs. Of course bugs may move things from M to L (see Meltdown and Spectre), but in the absence of bugs it's a one-way street, I agree. And it's also explicitly not what O_PONIES lovers want. They explicitly don't want all that hassle, they just want the ability to write code in L with UB and get a working program. That is really pure O_PONIES — exactly like in that story with the Linux kernel.

The list of UBs in C and C++ is still patently insane, but that's a different issue. It would have been possible to tackle that issue if O_PONIES lovers actually wanted to alter the spec. That's not what they want. They want ponies.
Posted Dec 8, 2022 11:11 UTC (Thu)
by farnz (subscriber, #17727)
[Link]
Yep - and the O_PONIES problem, when you reduce it to its core is simple. The standard permits non-deterministic behaviour (some behaviours are defined as "each execution of the program must exhibit one behaviour from the allowed list of behaviours", not as a single definite behaviour). The standard also permits implementation-defined behaviour - where the standard doesn't define how a construct behaves, but instead says "your implementation will document how it interprets this construct".
What the O_PONIES crowd want is to convert "annoying" UB in C and C++ to implementation-defined behaviour. There's a process for doing that - it involves going through the standards committees writing papers and convincing people that this is the right thing to do. The trouble is that this is hard work - as John Regehr has already demonstrated by making the attempt - since UB has been used by the standards committee as a way of avoiding difficult discussions about what is, and is not, acceptable in a standards-compliant compiler, and thus re-opening the discussion is going to force people to confront those arguments all over again.