
Thoughts and clarifications

Posted Sep 15, 2024 11:37 UTC (Sun) by pizza (subscriber, #46)
In reply to: Thoughts and clarifications by Wol
Parent article: Whither the Apple AGX graphics driver?

> Have you ever seen a moderately complex *RUST* codebase that is easy to refactor? I would have thought the phrase "moderately complex" was enough to make it clear *any* codebase would be hard to refactor.

Rust (and codebases using it) haven't really matured (ie been around long enough) to reach this critical point.

All of this emphasis on "specifying it right" is all well and good, but... the definition of "right" changes over time (sometimes drastically) along with the requirements.

Linux has historically placed great emphasis on (and heavily leaned into) internal interfaces and structures being freely malleable, but the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

"Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth



Thoughts and clarifications

Posted Sep 15, 2024 11:54 UTC (Sun) by Wol (subscriber, #4433) [Link]

> All of this emphasis on "specifying it right" is all well and good, but... the definition of "right" changes over time (sometimes drastically) along with the requirements.

Banging on again, but with a state table you can (and should) address all possible options. Some things are hard to specify that way, some things you personally don't need to address, but if you have three possible boolean state variables, then you have eight possible states. If you only recognise five, and your solution precludes solving one of the other three, then your code will need replacing. If you can't be bothered to address the other three, but your code is designed to make it easy for someone to come along later and add it, then that's good programming.
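The three-boolean state table described above can be sketched in Rust (flag names are hypothetical, purely for illustration): a match on the tuple must cover all eight states or the code does not compile, and the "not addressed yet" states can be handled explicitly rather than silently ignored.

```rust
// Hypothetical state flags; a real driver would name these differently.
fn describe(powered: bool, connected: bool, busy: bool) -> &'static str {
    // The compiler rejects this match unless all 2^3 = 8
    // combinations are covered by some arm.
    match (powered, connected, busy) {
        (false, _, _) => "off",
        (true, false, _) => "idle, no link",
        (true, true, false) => "ready",
        (true, true, true) => "working",
    }
}

fn main() {
    println!("{}", describe(true, true, false));
}
```

Deleting any arm turns the omission into a compile error listing the uncovered patterns, which is exactly the "point out all the possible paths" property being argued for.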

A good logical (mathematical?) spec/proof should point out all the possible "wrong" paths so if they become a right path they're easily fixed.

> "Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth

The real world is like that :-)

Cheers,
Wol

Thoughts and clarifications

Posted Sep 15, 2024 12:19 UTC (Sun) by asahilina (subscriber, #166071) [Link] (10 responses)

> the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

My experience having gone through several major refactorings of the drm/asahi driver is that it is correct. Some of the bigger ones are:

- Going from a "demo" fully blocking implementation (the first one we shipped) to implementing queues and scheduling and asynchronous work
- Dropping in GPUVM and fully implementing VM_BIND, which along the way ended up changing how non-GPUVM kernel objects are managed including making changes to the heap allocators [1]

It is hard to explain just how liberating it is to be able to refactor and rearrange non-trivial things in the code, fix all the compiler errors, and then end up with working code more often than not. Sure, if you're unlucky you might run into a logic error or a deadlock or something... but with C it's almost impossible to escape adding a new handful of memory safety errors and new flakiness and bugs, every time you make any non-trivial change to the structure of the code.

This is true even when you're interfacing with C code, as long as its API is documented or can be easily understood. The GPUVM change involved first writing abstractions for the underlying C code. That API is reasonably nice and well documented, so it wasn't hard to get the Rust abstraction right (with some care) [2], and then when it came to dropping it into the Rust driver, everything just worked.

Most people don't believe this until they actually start working with Rust on larger projects. All the Rust evangelism isn't just zealotry. There really is something magical about it, even if it might be overstated sometimes.

[1] https://github.com/AsahiLinux/linux/commit/93b390cce8a303...
[2] https://github.com/AsahiLinux/linux/commit/e3012f87bf98c0...

Thoughts and clarifications

Posted Sep 16, 2024 9:05 UTC (Mon) by Wol (subscriber, #4433) [Link] (4 responses)

> > the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

> My experience having gone through several major refactorings of the drm/asahi driver is that it is correct. Some of the bigger ones are:

Out of curiosity, would you describe *your* codebase as complex? Or would you say "my code is simple because Rust handles the complexity for me"?

Or even "the driver problem itself is fairly simple, and Rust just makes it easy to express it"? (Put another way, "C makes the problem a lot more complicated than it should be"!)

Cheers,
Wol

Thoughts and clarifications

Posted Sep 16, 2024 9:48 UTC (Mon) by asahilina (subscriber, #166071) [Link] (3 responses)

Hmm... there are a few dimensions here.

I would say the driver has medium complexity for a GPU driver (in terms of line count it's almost 4x drm/panfrost and around the same as the GPU part of drm/msm). Rust doesn't directly reduce complexity (the driver has to do what it has to do), but it does handle a lot of error-prone boilerplate for you (for example enforced RAII) and it strongly encourages design that makes it easier to reason about the complexity (separation of concerns/encapsulation). So Rust makes it easier to maintain the complexity, understand it, and avoid bugs caused by it. I'm a lot more comfortable dealing with complex code in Rust than in C.

Then, there are some aspects where Rust is specifically a very good fit for this particular GPU driver. One of them is using Rust proc macro magic to describe multiple firmware version and GPU generation interfaces (the firmware interface is not stable) in a single implementation, as cleanly as possible. To do the same thing in C you either end up duplicating all the code, or using ugly preprocessor or build system tricks (drm/apple in our tree is a C driver that has to do this, and it's not pretty. Rust would be a good fit for a rewrite of that driver too for multiple reasons, but we need DRM KMS bindings first). The other one is (ab)using Rust lifetimes to represent GPU firmware interface lifetimes, which makes handling the firmware interface much less error-prone (and this is critical, because an error crashes the GPU firmware irrecoverably). So Rust helps with those more specific kinds of complexity.

At the end of the day it all really boils down to Rust benefiting from decades of programming experience and history in its design. C was designed at a time when programs were a tiny fraction of the size they are today. The entire 1983 UNIX kernel had around the same line count in C as my drm/asahi driver does in Rust. Linux is more than a thousand times more code today, and it shouldn't be a surprise that a programming language designed for codebases 1000x smaller might not be the best option these days. We have learned a lot since then about how to manage complexity, and Rust takes a lot of that and applies it to the kind of systems language that is suitable for writing kernels.

Thoughts and clarifications

Posted Sep 16, 2024 11:16 UTC (Mon) by Wol (subscriber, #4433) [Link] (2 responses)

> Rust doesn't directly reduce complexity (the driver has to do what it has to do), but it does handle a lot of error-prone boilerplate for you (for example enforced RAII) and it strongly encourages design that makes it easier to reason about the complexity (separation of concerns/encapsulation). So Rust makes it easier to maintain the complexity, understand it, and avoid bugs caused by it. I'm a lot more comfortable dealing with complex code in Rust than in C.

So in other words "It's a complex problem, but Rust makes it simple to express that complexity"?

I'm just trying to get a handle on where Rust lies on the problem-complexity / language-complexity graph. I'll upset Jon with this, but I hate Relational/SQL because imnsho Relational lies too far on the simplicity side of the graph, so SQL has to lie way over on the complexity side. So in terms of Einstein's "make things as simple as possible, but no simpler", Relational/SQL lies way above the local minimum. Rustaceans probably feel the same is true of C and modern hardware.

Do you feel Rust lies close to the sweet spot of minimal possible complexity? It certainly comes across that you find it easy to express the complexity of the hardware.

Cheers,
Wol

Thoughts and clarifications

Posted Sep 16, 2024 11:57 UTC (Mon) by jake (editor, #205) [Link] (1 responses)

> I'll upset Jon with this

Wol, it's more than just Jon who is tired of you bringing up this stuff in every thread, often multiple times, when it is not particularly relevant. Your comments are voluminous, people have complained to you about that and the content of your posts in comments here, and you are one of the most filtered commenters we have. I think you should perhaps reduce your volume of comments and try to ensure that the drum you are banging does not come up everywhere.

just fyi,

jake

Thoughts and clarifications

Posted Sep 16, 2024 12:21 UTC (Mon) by paulj (subscriber, #341) [Link]

Perhaps showing each user some stats — a ranking by volume of comments, over the site in the last X period (for X in {a, b, c}) or on this story — would help softly nudge people, where needed, on a self-educating basis?

Thoughts and clarifications

Posted Sep 22, 2024 18:05 UTC (Sun) by Rudd-O (guest, #61155) [Link] (4 responses)

I have reason to believe that you are talking to people who have never seen a match statement in their lives. So they're used to making a change somewhere in the code and having an if or case/select statement somewhere else, very far away, no longer match the condition they just added — and therefore things break spectacularly at runtime.

That lack of experience is why they keep repeating the truism that refactoring is "very difficult" and that you don't really know, when you're changing code, whether something else is going to break. They haven't had the compiler yell at them "you're missing this case", because they have never experienced it. Refactoring is super easy when the compiler is doing the thinking about myriad otherwise-irrelevant trivialities for you!

There really is something magical about it. And trying to explain that "magic" to people who haven't seen it is almost impossible. It's like trying to explain electricity to someone from the 1600s. And it is equally frustrating. In fact, it is doubly frustrating because, unlike electricity in the 1600s, this is something that is very easy to witness: you just have to read a little code and push a button on a web page and you can see it. And they just refuse. It is so oddly disconcerting.

Thoughts and clarifications

Posted Sep 22, 2024 18:29 UTC (Sun) by pizza (subscriber, #46) [Link] (3 responses)

> you just have to read a little code and push a button in a webpage and you can see it. And they just refuse. It is so oddly disconcerting.

$ sloccount `find projdir -name '*.[ch]'`
[...]
Total Physical Source Lines of Code (SLOC) = 2,278,858

Call me naive, but "read a little code and push a button on a web page" isn't going to cut it.

Thoughts and clarifications

Posted Sep 25, 2024 9:44 UTC (Wed) by Rudd-O (guest, #61155) [Link] (2 responses)

That's a lot of code.

To get back to the topic (common pitfalls of refactoring and how Rust helps avoid errors):

Can you articulate what a match statement does, and how it behaves when you add a new case somewhere very far away from the match statement? How is it different from, say, a chain of if/else or a select/case? If your codebase were (hypothetically) Rust, what would the compiler say to such a change, versus what the C compiler says today?

My intention is to figure out if you have had a chance to compare both C and Rust in order to form an honest, informed opinion.

Thanks in advance.

Perhaps that is far enough

Posted Sep 25, 2024 9:48 UTC (Wed) by corbet (editor, #1) [Link] (1 responses)

It is my suspicion that this conversation will go nowhere useful after this point. Perhaps it's best to stop it here?

Perhaps that is far enough

Posted Sep 25, 2024 10:01 UTC (Wed) by Rudd-O (guest, #61155) [Link]

Sure. Have a nice day.

Thoughts and clarifications

Posted Sep 15, 2024 12:27 UTC (Sun) by sunshowers (guest, #170655) [Link]

> the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

It's almost completely true, though.

Think of the type system as a set of proofs you need to provide to the compiler for it to accept your code. To the extent that you can encode your program's properties into the type system, by the time your code compiles you have proven that those properties hold. Mathematically, rock-solid proven.

You can't encode 100% of your properties into the type system (for example you can't totally prove linearity, since Rust has an affine type system, i.e. you can drop values on the floor), but you can get very far.
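A minimal, hypothetical sketch of "encoding a property into the type system": a newtype whose invariant can only be established through a checking constructor, so any function accepting it gets the property proven for free.

```rust
// A newtype whose invariant (value > 0) can only be established via
// the checking constructor, so holding a Positive is itself the proof
// that the check already happened.
#[derive(Debug, Clone, Copy)]
struct Positive(u32);

impl Positive {
    fn new(n: u32) -> Option<Positive> {
        if n > 0 { Some(Positive(n)) } else { None }
    }
    fn get(self) -> u32 {
        self.0
    }
}

// Division by zero is ruled out by the argument type, not by a
// runtime check inside this function.
fn divide(a: u32, b: Positive) -> u32 {
    a / b.get()
}

fn main() {
    let d = Positive::new(4).expect("4 is positive");
    println!("{}", divide(12, d));
}
```

The same pattern scales up to locks, reference counts, and state machines, which is the point being made in this thread.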

Thoughts and clarifications

Posted Sep 15, 2024 13:17 UTC (Sun) by mb (subscriber, #50428) [Link] (10 responses)

>but the assertion of "just change the definition and your job is done when the compiler stops complaining"
>is laughably naive.

Yes, it's extremely hard to believe.
And yes, it is an oversimplification.

But it is in fact true, to some extent.

Think of it as being the opposite of what Python does.
In Python code it is extremely difficult and sometimes practically impossible to do large scale refactorings, because almost all things are only checked at run time and almost nothing is checked at "build" time.

Rust is the exact opposite of that. And it also adds many more static checks than a Python or C++ program could ever have. The language lets you express properties of the program in the type system. At the core of all this are the lifetime system, the borrow checker and move semantics.

If new Rust code is written, it sometimes has logic bugs or similar bugs in it.
But if Rust code is refactored and the algorithms are not changed, you're 99% done when the compiler stops complaining.

In Python it is extremely scary to pull out part of some code and put it somewhere else. It's extremely easy to forget something that you will only notice years after. It's no fun at all.
In Rust such things are easy, fun and the feeling of "did I forget something?" is not present, because the compiler guides the developer throughout the process.

Rust compiler messages are helpful. If you hear "Rust compiler complaining" translate that to "Rust compiler trying to help the developer".
"To fight the compiler" is not what actually happens. It's not a fight. It's a friendly helping hand.
And that really shines in refactorings.

Thoughts and clarifications

Posted Sep 15, 2024 13:52 UTC (Sun) by pizza (subscriber, #46) [Link] (9 responses)

> But if Rust code is refactored and the algorithms are not changed, you're 99% done when the compiler stops complaining.

That same argument applies to any statically-typed language, even (*gasp*) C.

Meanwhile, if you're not changing the algorithms/semantics/whatever in some way, why are you refactoring anything to begin with?

Thoughts and clarifications

Posted Sep 15, 2024 14:02 UTC (Sun) by mb (subscriber, #50428) [Link]

> That same argument applies to any statically-typed language, even (*gasp*) C.

No. That is not true.
There's a very big difference between what you can encode in the C type system and what is possible with the Rust type system.

>if you're not changing the algorithms/semantics/whatever

That is not what I said.

Thoughts and clarifications

Posted Sep 15, 2024 14:51 UTC (Sun) by asahilina (subscriber, #166071) [Link]

> That same argument applies to any statically-typed language, even (*gasp*) C.

Not at all, not to the extent it does with Rust.

In C, if you change a structure to be reference-counted, the compiler does nothing to ensure you manage the reference counts correctly. In Rust it does.

In C, if you add a mutex to protect some data, the compiler does nothing to ensure you actually hold the mutex before accessing the data. In Rust it does.

In C, if you change the width or signedness of an integer member or variable, the compiler does nothing to ensure you actually update the type of any variables it's copied to or passed into, and it will happily let you truncate or convert integers, even with -Wall. In Rust it won't compile until you change all the types or add explicit casts, and you probably won't even need to touch any code that just creates temporary bindings since Rust has type inference for that and C does not (without extensions like __auto_type or weird macros).
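The integer-width point can be sketched like this (function names are hypothetical): inferred bindings adapt to a width change automatically, while any place the types no longer line up stops compiling instead of silently truncating.

```rust
// A hypothetical device-register read; imagine a later refactor
// widening this return type from u32 to u64.
fn read_register() -> u32 {
    0x1234
}

fn report(value: u32) {
    println!("value = {value:#x}");
}

fn main() {
    // No type written here: an inferred binding survives the width
    // change untouched...
    let v = read_register();
    // ...while this call would stop compiling the moment the types
    // disagreed, rather than silently converting as C would.
    report(v);
}
```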

In C, if you need to add cleanup code to a structure that was previously just freed with free(), you need to find all the sites where it is freed and change them to call a helper that does the extra cleanup manually. In Rust none of this code exists to begin with since freeing is automatic, you just implement the `Drop` trait on the struct to add extra cleanup code and you're done, no need to refactor anything at all.
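The mutex and Drop points above can be illustrated with a minimal sketch (not the actual driver code): the data lives inside the `Mutex`, so unlocked access cannot compile, and cleanup added via `Drop` runs at every scope exit with no call sites to hunt down.

```rust
use std::sync::Mutex;

// The counter lives *inside* the Mutex, so the only way to reach it
// is through lock(); "access without holding the lock" is a type error.
struct Stats {
    hits: Mutex<u64>,
}

struct Buffer {
    name: String,
}

// Adding cleanup later means implementing Drop once; every place a
// Buffer goes out of scope picks it up automatically.
impl Drop for Buffer {
    fn drop(&mut self) {
        println!("releasing {}", self.name);
    }
}

fn main() {
    let s = Stats { hits: Mutex::new(0) };
    *s.hits.lock().unwrap() += 1;
    println!("hits = {}", *s.hits.lock().unwrap());

    let _b = Buffer { name: "scratch".into() };
} // _b is dropped here; the Drop code runs with no explicit call
```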

Thoughts and clarifications

Posted Sep 16, 2024 10:36 UTC (Mon) by farnz (subscriber, #17727) [Link] (5 responses)

No; the degree to which you're done when things start compiling depends critically on how much is checked at compile time versus at run time. My experience over 8 years of doing Rust, and approximately 20 years doing C, is that Rust programs tend to have much more in the way of compile time checking than C programs, which in turn means that "it compiles" is a much stronger statement than in C (although not as strong as it tends to be in Idris or Agda).

A more interesting question is whether this will continue to hold as more people write Rust code - is this current behaviour an artifact of early Rust programmers tending to write more compiler-checked guarantees, or is this something that will continue to hold when the Rust programmer pool expands?

Thoughts and clarifications

Posted Sep 16, 2024 11:41 UTC (Mon) by pizza (subscriber, #46) [Link] (4 responses)

> A more interesting question is whether this will continue to hold as more people write Rust code - is this current behaviour an artifact of early Rust programmers tending to write more compiler-checked guarantees, or is this something that will continue to hold when the Rust programmer pool expands?

Personally, I strongly suspect the latter.

Current Rust programmers are self-selecting, in the upper echelon of skill/talent, and largely using Rust for Rust's sake. That is very much non-representative of the software development world as a whole. [1]

Rust will have its Eternal September, when relatively average-to-mediocre corporate body farms start cranking it out. At that point, "Rust Culture" goes out the window as the only "culture" that matters is what falls out of corporate metrics<->reward mappings.

[1] So is C, for that matter. If I were to pull out a bad analogy, if effectively coding in C represents the top 10th percentile, Rust is currently the top 1%.

Thoughts and clarifications

Posted Sep 16, 2024 11:50 UTC (Mon) by pizza (subscriber, #46) [Link]

> Personally, I strongly suspect the latter.

Gaah, make that 'the former'. (As I hope was clear from the rest of the post)

Thoughts and clarifications

Posted Sep 16, 2024 13:21 UTC (Mon) by farnz (subscriber, #17727) [Link]

I disagree in part; I think it'll get worse than it is today, for the reasons you outline, but that it'll still remain a lot more true of Rust than it is of C.

I have access to a decent body of code written by a contract house (one of the big names for cheap outsourcing), and the comments make it clear that they used their cheap people, not their best people, to write the code. Of the four most common causes of issues refactoring that code, three are things that are compiler-checked in Rust:

  1. Assumptions about the sizes of arrays passed as arguments; where in C, I can pass a 2 element array to a function that expects a 4 element array, Rust either makes this a compile-time error (if the argument type is an array) or a runtime panic (if the argument type is reference to slice).
  2. Assumptions about wrapping of unsigned computations. C promotes unsigned bytes to signed int for calculations, and there are chunks of this code that assume every intermediate in a long sequence remains unsigned unless stored to a known type (otherwise there's UB lurking when an intermediate exceeds INT_MAX).
  3. Failure to check all possible values of an enum, in large part because it's clear that the value got added after a module was "code complete", and nobody thought to add a default: or a special handler for this value.

Those all become panics or compile failures in Rust, leaving the errors in business logic (of which there are remarkably few) to deal with during refactoring.

And more generally, the biggest issue with cheap contractor C and C++ is the amount of code they write that depends on UB and friends being interpreted a particular way by the compiler, even in cases where there's no way to check that interpretation from code; Rust does seem to reduce this, even in beginner code, since unsafe is easy to find and be scared by.
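Point 1 in the list above can be sketched as follows (a hedged illustration, not the contractor's code): a fixed-size array argument makes the length mismatch a type error, while a slice argument turns it into a defined runtime panic rather than an out-of-bounds read.

```rust
// Compile-time length check: the argument type carries the size,
// so passing a 2-element array here is a type error, not UB.
fn checksum4(data: &[u8; 4]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

// Slice version: a too-short input becomes a defined runtime panic
// at the indexing below, instead of a silent out-of-bounds read.
fn checksum_any(data: &[u8]) -> u32 {
    data[..4].iter().map(|&b| u32::from(b)).sum()
}

fn main() {
    let buf = [1u8, 2, 3, 4];
    println!("{}", checksum4(&buf));
    // checksum4(&[1u8, 2]); // would not compile: expected `&[u8; 4]`
    println!("{}", checksum_any(&buf));
}
```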

Thoughts and clarifications

Posted Sep 22, 2024 18:24 UTC (Sun) by Rudd-O (guest, #61155) [Link] (1 responses)

> Rust will have its Eternal September, when relatively average-to-mediocre corporate body farms start cranking it out. At that point, "Rust Culture" goes out the window as the only "culture" that matters is what falls out of coroporate metrics<->reward mappings.

In my decades working in the industry, I haven't yet seen an Eternal September arrive for Haskell.

And I don't think that's going to happen, at least in Haskell. Maybe Rust will have one. Maybe it won't.

There is a set of conceptual difficulties associated with learning any programming language, and it is not the same for every language. Learning Atari BASIC is one thing (by the way, that's the first language I learned). Learning Python is another. Learning assembly is yet another. Learning Haskell is another still.

To pull the conversation away from the realm of language and just talk about concepts: pretty much any programmer can program using a stringly typed interface (which we all know leads to shitty code). But not every programmer is capable of learning the Haskell type system (I know I can't, but I can understand how it leads to improved type safety and thus code quality).

All of this is to say that we're not all made equal. And because we're not all made equal, we are not all able to use the same tools. Just as we are not all able to wield a sledgehammer that weighs 30 pounds and break down a wall, so we are just as unequal to wield a specific programming language with skill and produce the results that one wants. Evolution does not stop at the neck.

But what do I know? Perhaps Haskell will get its Eternal September? All I know is I can't learn it. Or at least I'm humble enough to admit that.

Thoughts and clarifications

Posted Sep 22, 2024 18:30 UTC (Sun) by Rudd-O (guest, #61155) [Link]

Addendum:

In case it's interesting for the readers here, the current firm I'm working at started the system that we are developing with Haskell. We had a lot of researchers that were super talented and were able to crank out what I consider pretty high quality code at the very beginning using nothing but Haskell.

The problem is that once you need to grow past 10 engineers, or in this case computer scientists, you can't. Finding 10 Haskell programmers at a Haskell conference is fairly easy. Finding the 11th to join your team when there's no conference going on is almost impossible. Haskellers are wicked smart, and because they're wicked smart, they're wicked rare.

So what did we do after that? We switched our system to Rust. Of course, the system continues to have the same type structure, to the extent possible, that it had back when it started in Haskell. And all the Haskell programmers quickly adapted to using Rust, because the type system in Rust is less complex than the type system in Haskell, so for them it was a downgrade. But we were able to quickly triple the number of programmers developing the system.

And the system continues to grow, and it has pretty high code quality by the standards of my career — and I've been reading code for maybe 25 years. I routinely refactor the innards of the system without fearing that I'm going to break something somewhere else, somewhere deep in the system. I don't think I've ever felt so free to change the code without the dread that it's going to catastrophically explode in production. Two years in, and I have yet to put a bug into the system. That is almost completely magical.

Thoughts and clarifications

Posted Sep 22, 2024 18:19 UTC (Sun) by Rudd-O (guest, #61155) [Link]

> That same argument applies to any statically-typed language, even (*gasp*) C.

No. Not even close.

You can make a change to a C struct — say, removing a member — and maybe the compiler will complain wherever that member is still initialized. That is true.

But if you add to an enum somewhere, which represents a new state of the object that you are using the enum on, and that enum is used in an if, case or select statement somewhere else, C will happily ignore that you've done this, and compile just fine. Then your code doesn't work.

In Rust, when you do this, the equivalent of a select/case statement — called match — will generally not compile, because the compiler knows you've added a case that your code does not handle. Only after you have fixed every site where this enum is matched will it compile.

This simple thing prevents entire classes of logic errors in your code.
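The match behaviour described above, sketched minimally with a hypothetical enum: if a fourth variant were added, every non-exhaustive match on the enum would become a compile error until the new case is handled (as long as no `_` catch-all arm hides it).

```rust
// A hypothetical state enum; adding a variant (say, Failed) makes
// the match below fail to compile until it handles the new case.
enum State {
    Idle,
    Running,
    Done,
}

fn label(s: &State) -> &'static str {
    match s {
        State::Idle => "idle",
        State::Running => "running",
        State::Done => "done",
    }
}

fn main() {
    println!("{}", label(&State::Running));
}
```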

There are quite a few other ergonomic advantages that the language has over languages that existed before Rust, which work in a similar way. To give you just one other example:

You change a C struct to have a new member that is a pointer to another type. You painstakingly change every place where that struct is initialized so that your program will compile and run. This program is multi-threaded, like the kernel is. You run your program, and it crashes. In this particular case, the new member pointing to the other structure was used at the wrong time — behavior that wasn't there before, when the first structure did not have a pointer to the second one.

This is not possible in Rust. The compiler — in fact, the borrow checker in the compiler — keeps track of where every pointer goes and where every pointer is used, and will not let you compile the program if you use a pointer or a reference when you are not supposed to: when it's dead or not initialized, or when the lifetime of the object taking the reference is incompatible with the lifetime of the object pointed to. It even knows when you are using data that you're not supposed to, because you forgot to grab a lock. And it will tell you how this needs to change: try this, try this other thing, try yet another. It gives you options.

This is so far ahead of anything that the C language does that it could, in fact, be construed as magic by Ken, Dennis, and Donald. You need to see it with your own eyes to believe it, but it is amazing.

On a personal note, this conversation on this particular thread has exposed to me the wide difference of perspective between C developers and Rust developers. Having spent years of my life developing in both languages, I have the uncomfortable advantage of having perspective from both sides. But to me, it really does feel like we're arguing horses and carriages versus automobiles, or electric cars versus gas cars. I, too, thought Teslas were bullshit until I rode in one as a passenger and the driver punched the pedal. Oh my god. It's a similar experience going from Python, or Go, or C, to Rust.

And I think that explains why a lot of people see what Rust developers say about the language, and then conclude, this must be a religion, or worse, a cult.

Thoughts and clarifications

Posted Sep 15, 2024 20:24 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

> Linux has historically placed great emphasis on (and heavily leaned into) internal interfaces and structures being freely malleable, but the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

Others chimed in with examples to the contrary, I also had a similar experience. FWIW, for me, the best feature of Rust was not the lifetimes and borrows, but pattern matching and exhaustiveness checking. I always hated writing code that encodes state machines, but Rust makes that so much better.

To be clear, other languages with pattern matching have similar properties, and even C++ might get it soon.

Thoughts and clarifications

Posted Sep 16, 2024 8:51 UTC (Mon) by adobriyan (subscriber, #30858) [Link] (6 responses)

> and exhaustiveness checking

fn main() {
    let i: u32 = 41;
    match i % 2 {
        0 => println!("0"),
        1 => println!("1"),
    }
}

error[E0004]: non-exhaustive patterns: `2_u32..=u32::MAX` not covered

But rustc also exposed an embarrassing bug in a trivial C++ calculator program of mine, so I can't complain too much.

Thoughts and clarifications

Posted Sep 16, 2024 12:30 UTC (Mon) by excors (subscriber, #95769) [Link] (5 responses)

That's easily worked around by adding a catch-all "_ => unreachable!()", ideally with a comment explaining why you believe it's unreachable (assuming the real code isn't quite this trivial), and if you were mistaken then it'll become a runtime panic (unlike C where reaching __builtin_unreachable() is undefined behaviour).

After making that change, you do lose the benefits of compile-time exhaustiveness checking for that match statement; someone might change the condition to "i % 3" and you won't notice until runtime. But you'll still get the benefits for any code that matches integers you can't guarantee are in a particular sub-range (like the inputs to any API), and for any code that matches enums (presumably what Cyberax meant with state machines). I'd guess those situations are more common in most programs, so the exhaustiveness checking is still a valuable feature even if it's not perfect.
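The workaround described above, applied to the earlier `i % 2` example, compiles and runs as a complete sketch:

```rust
// The compiler can't see that i % 2 < 2, so we assert it ourselves.
// If the assumption is ever wrong, this is a defined panic with a
// message, not C-style undefined behaviour.
fn parity(i: u32) -> &'static str {
    match i % 2 {
        0 => "even",
        1 => "odd",
        _ => unreachable!("i % 2 is always 0 or 1"),
    }
}

fn main() {
    println!("{}", parity(41));
}
```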

If your code is doing lots of work on bounded integers then I guess you'd want something more like the Wuffs type system, but then you'll get the compromises that Wuffs makes to make that work, which seems to restrict it to a very small niche. And that would still be inadequate if you do "match x & 2" since Wuffs doesn't know the value can't be 1. (Though as far as I can tell, Wuffs doesn't actually support any kind of switch/match statement - you have to write an if-else chain instead.)

Thoughts and clarifications

Posted Sep 16, 2024 14:20 UTC (Mon) by andresfreund (subscriber, #69562) [Link] (4 responses)

> That's easily worked around by adding a catch-all "_ => unreachable!()", ideally with a comment explaining why you believe it's unreachable (assuming the real code isn't quite this trivial), and if you were mistaken then it'll become a runtime panic (unlike C where reaching __builtin_unreachable() is undefined behaviour).

Imo this comparison to __builtin_unreachable() is nonsensical. I dislike a lot of UB in C as well, but you'd IMO only ever use __builtin_unreachable() when you *want* the compiler to treat the case as actually unreachable, to generate better code.

Thoughts and clarifications

Posted Sep 16, 2024 14:52 UTC (Mon) by adobriyan (subscriber, #30858) [Link] (1 responses)

Rust does and doesn't do bounds checking at the same time:
without unreachable!() it is a compile error 100% of the time,
but at -O1 the code generator knows what remainder does to integers.

https://godbolt.org/z/jbefszqa8

Guaranteed behaviour versus permitted optimizations

Posted Sep 17, 2024 8:49 UTC (Tue) by farnz (subscriber, #17727) [Link]

This is normal for any compiled language; the compiler is allowed but not required to remove dead code, and thus when the optimizer is able to prove that a given piece of code cannot be called, it is allowed to remove it (similar applies to unused data). However, it's never required to remove dead code, and when you're not optimizing, it'll skip the passes that look for dead code in the name of speed.

There's a neat trick that you can use to exploit this; put a unique string in panic functions that doesn't appear anywhere else in the code, and then a simple search of the binary for that unique string tells you whether or not the optimizer was able to remove the unwanted panic. It's not hard to put greps in CI that look for your unique string, and thus get a CI-time check for code that could panic at runtime - if the string is present, the optimizer has failed to see that it can remove the panic, and you need to work out whether that's a missed optimization (and if so, what you're going to do about it - make the code simpler? Improve the optimizer?). If it's absent, then you know that the optimizer saw a route to remove the panic for you.
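A minimal sketch of that CI check (the marker string and file name here are invented; a real pipeline would grep the optimized build artifact, e.g. target/release/yourapp):

```shell
# The Rust source would embed the marker in its panic message, e.g.:
#   panic!("PANIC_MARKER_7f3a: impossible state");
MARKER="PANIC_MARKER_7f3a"
BINARY="demo_binary"

# Stand-in for the compiled output; here the optimizer "succeeded",
# so the marker string is absent from the binary.
printf 'bytes of machine code, no panic string here' > "$BINARY"

if grep -q "$MARKER" "$BINARY"; then
    RESULT="panic path present"   # a real CI job would fail here
else
    RESULT="panic path removed"
fi
echo "$RESULT"
rm -f "$BINARY"
```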

Thoughts and clarifications

Posted Sep 16, 2024 17:40 UTC (Mon) by excors (subscriber, #95769) [Link] (1 responses)

This is getting slightly tangential, but I don't think it's that far-fetched to compare them - they have basically the same name (especially in codebases like Linux that #define it to "unreachable"), and people do use it in C for non-performance reasons, e.g.:

https://github.com/torvalds/linux/blob/v6.11/arch/mips/la... (unreachable() when the hardware returns an unexpected chip ID; that doesn't sound safe)

https://github.com/torvalds/linux/blob/v6.11/fs/ntfs3/fre... (followed by error-handling code, suggesting the programmer thought maybe this could be reached)

https://github.com/torvalds/linux/blob/v6.11/arch/mips/kv... (genuinely unreachable switch default case, explicitly to stop compiler warnings)

https://github.com/torvalds/linux/blob/v6.11/arch/mips/la... (looks like they expected unreachable() to be an infinite loop, which I think it was when that code was written, but it will misbehave with __builtin_unreachable())

https://github.com/torvalds/linux/blob/v6.10/drivers/clk/... (probably to stop missing-return-value warnings; not clear if it's genuinely unreachable, since clk_hw looks non-trivial; sensibly replaced by BUG() later (https://lore.kernel.org/all/20240704073558.117894-1-liqia...))

__builtin_unreachable() seems like an attractive nuisance (especially when renamed to unreachable()) - evidently people use it for cases where they think it shouldn't be reached, but they haven't always proved it can't be reached, and if it is then they get UB instead of a debuggable error message. It seems they usually add it to stop compiler warnings, not for better code generation. Often they should have used BUG(), which is functionally equivalent to Rust's unreachable!() though slightly less descriptive.

If you really need the code-generation hint in Rust, when the optimiser (which is a bit smarter than the compiler frontend) still can't figure out that your unreachable!() is unreachable, there's "unsafe { std::hint::unreachable_unchecked() }" which is just as dangerous but much less attractive than Linux's unreachable().
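A sketch of that contrast (the function names are ours; the extra arm is genuinely unreachable in both, since x & 1 is always 0 or 1):

```rust
use std::hint::unreachable_unchecked;

fn low_bit_checked(x: u32) -> u32 {
    match x & 1 {
        0 => 0,
        1 => 1,
        // Defensive: if this is ever reached, we get a panic with a
        // message, not undefined behaviour.
        _ => unreachable!("x & 1 is always 0 or 1"),
    }
}

fn low_bit_hinted(x: u32) -> u32 {
    match x & 1 {
        0 => 0,
        1 => 1,
        // SAFETY: x & 1 can only be 0 or 1. Reaching this arm would be UB,
        // exactly like C's __builtin_unreachable() - hence the unsafe block.
        _ => unsafe { unreachable_unchecked() },
    }
}
```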

Anyway, I didn't originally mean to denigrate C, I was mainly trying to explain the Rust code to readers who might be less familiar with it. But it does also serve as an example of different attitudes to how easy it should be to invoke UB.

Thoughts and clarifications

Posted Sep 16, 2024 18:14 UTC (Mon) by mb (subscriber, #50428) [Link]

Yes, looks like you found a couple of actual soundness bugs in the C code.
I wonder if there are any uses of unreachable that actually make sense. As in: places where the performance gain actually matters.

Thoughts and clarifications

Posted Sep 22, 2024 18:32 UTC (Sun) by Rudd-O (guest, #61155) [Link]

> but pattern matching and exhaustiveness checking

This was magical to me too. At first it felt super awkward, like an inversion of the order in which things are supposed to read. But when it clicked... oh my god. Combining that with the question mark or a return inside the match really helped simplify the structure of the happy path so I could read it.
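A sketch of the pattern being described (the function, the names and the error strings are all illustrative):

```rust
// `?` propagates the error case immediately, and early `return`s inside
// the `match` arms keep the happy path flat instead of nested.
fn parse_port(s: &str) -> Result<u16, String> {
    let n: u64 = s.trim().parse().map_err(|e| format!("not a number: {e}"))?;
    let port = match u16::try_from(n) {
        Ok(0) => return Err("port 0 is reserved".into()),
        Ok(p) => p,
        Err(_) => return Err(format!("{n} is out of range for a port")),
    };
    Ok(port)
}
```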

I am so happy I learned Rust. And I'm even happier that I'm getting paid to do it.

Thoughts and clarifications

Posted Sep 22, 2024 17:58 UTC (Sun) by Rudd-O (guest, #61155) [Link] (10 responses)

> "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

In C. And assembly.

> Donald Knuth

— famous C and assembly developer

Thoughts and clarifications

Posted Sep 22, 2024 18:11 UTC (Sun) by pizza (subscriber, #46) [Link] (9 responses)

> "Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth

...Your blithe dismissal of Knuth as an "assembly and C programmer" doesn't invalidate his point. "Provably correct" doesn't mean that it actually *works*.

Thoughts and clarifications

Posted Sep 23, 2024 9:15 UTC (Mon) by farnz (subscriber, #17727) [Link] (7 responses)

Knuth's point was not that it wouldn't work, but rather that his proof would only cover the things he cared to prove correct, and not necessarily cover everything that you, the reader of his code, would expect. He wrote that jibe in an era when formal methods practitioners were merrily proving all sorts of properties about code that, to a large extent, were irrelevant to users of computation systems (including those implemented using humans as the compute element), and not considering important properties (like "is this proven to terminate in finite time") because they're not always provable.

Thoughts and clarifications

Posted Sep 23, 2024 9:26 UTC (Mon) by Wol (subscriber, #4433) [Link] (6 responses)

While it may not be what Knuth was thinking of, I also think of it as pointing out that a formal proof merely proves that the mathematical model is internally consistent.

It does not prove that what the model does is what reality does! In a properly specified system, maths and science (reality) usually agree, but there's no guarantee ...

Cheers,
Wol

Thoughts and clarifications

Posted Sep 23, 2024 11:56 UTC (Mon) by pizza (subscriber, #46) [Link] (5 responses)

> It does not prove that what the model does is what reality does! In a properly specified system, maths and science(reality) usually agree, but there's no guarantee ...

This has been my experience.

...The software running on the 737MAX's MCAS was "provably correct"... for its specifications.

(It turns out that the words "properly specified" are about as common as unicorn-riding sasquatches...)

Thoughts and clarifications

Posted Sep 23, 2024 12:01 UTC (Mon) by farnz (subscriber, #17727) [Link] (4 responses)

Reference for "the software running on the 737MAX's MCAS was "provably correct""? I can't find any evidence anywhere that the MCAS was formally verified at all - merely that it was tested correct, and Boeing asserted to the FAA that the testing covered all plausible scenarios.

Thoughts and clarifications

Posted Sep 23, 2024 14:06 UTC (Mon) by pizza (subscriber, #46) [Link] (3 responses)

> Reference for "the software running on the 737MAX's MCAS was "provably correct""?

I'm giving Boeing's software folks the benefit of the doubt, because the MCAS debacle was a failure of specification (on multiple levels), not one of implementation.

After all, one can't test/validate compliance with a requirement that doesn't exist.

> Boeing asserted to the FAA that the testing covered all plausible scenarios.

It did! Unfortunately, many of those "plausible scenarios" required pilots to be trained differently [1], but a different part of Boeing explicitly said that wasn't necessary [2].

[1] ie recognize what was going on, and flip a circuit breaker (!) to disable MCAS
[2] One of the main selling points of the MAX

737 MAX only tested, not proven correct

Posted Sep 23, 2024 14:40 UTC (Mon) by farnz (subscriber, #17727) [Link] (2 responses)

There is a requirement underpinning all avionics that the aircraft's behaviour is safe in the event of a data source failing, and that the avionics are able to detect that a data source has become unreliable and enter the failsafe behaviour mode. This is a specification item for MCAS, and Boeing asserted to the FAA that they had tested MCAS and confirmed that, in the event of an AoA sensor fault, MCAS would detect the fault and enter the failsafe behaviours.

Boeing's tests for this, however, were grossly inadequate, and at least 3 different failure conditions have been found which were not covered by the tests: first is that "AoA DISAGREE" was an optional indication, available during the tests, but not in production MCAS unless purchased (20% of the fleet). Second is that they did not test enough bit error cases, and later investigation found a 5 bit error case that was catastrophic. And third was that the procedure for handling MCAS issues assumed that the pilot would have time to follow the checklist; in practice, the training issues meant that pilots didn't even realise there was a checklist.

737 MAX only tested, not proven correct

Posted Sep 23, 2024 15:49 UTC (Mon) by pizza (subscriber, #46) [Link] (1 responses)

> first is that "AoA DISAGREE" was an optional indication,

It's worse than that -- _redundant sensors_ were optional.

..and they were optional because one set of folks had a different set of functional specifications than another, and management was disincentivized to notice.

737 MAX only tested, not proven correct

Posted Sep 23, 2024 16:23 UTC (Mon) by paulj (subscriber, #341) [Link]

Went and had a read, as it seems you and farnz don't quite agree and/or are talking about slightly different things. ICBW, but my read of the FAA summary report is:

FAA "Safety Item #1: USE OF SINGLE ANGLE OF ATTACK (AOA) SENSOR" - this refers to the use of /data/ from a single AoA sensor by MCAS.

FAA "Safety item #5: AOA DISAGREE:" - refers to the "AOA DISAGREE" warning in the cockpit FDU, which somehow was tied to the optional "AoA Indicator gauge" feature for the FDU, which airlines had to purchase.

AFAICT from the FAA summary. I.e., the change was entirely in the logic - because there was no action item to retro-fit another AoA vane to the 737 Max. Excluding training and maintenance requirements, the aircraft changes were all logic updates: better filtering of the 2 AoA signals, better differential error detection, and ceasing to plough ahead with MCAS commands based on measurements from just 1 vane - which could be faulty. Other aircraft safety items added limits, damping and margins, to prevent runaway MCAS crashing aircraft.

Stunning issues really.

Thoughts and clarifications

Posted Sep 25, 2024 9:52 UTC (Wed) by Rudd-O (guest, #61155) [Link]

Not sure what you're referring to with "provably correct", that's Knuth's claim, not mine.

Nevertheless, in my professional experience, the C compiler's silence generally tells you next to nothing about whether the program will run as intended (I concede things have improved over the last 30 years). Whereas the Rust compiler's silence generally does mean the program is either going to run as intended or will have substantially fewer problems than the C one (mostly logic errors introduced by the programmer).

You can choose to be dogmatic and insist that all compilers / languages give the same results (a belief I like to call "blank slatism"), or you can choose to test that theory for yourself. I know what I chose and I am quite happy. My sincere recommendation is that you test the theory for yourself.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds