
Thoughts and clarifications

Posted Sep 15, 2024 20:24 UTC (Sun) by Cyberax (✭ supporter ✭, #52523)
In reply to: Thoughts and clarifications by pizza
Parent article: Whither the Apple AGX graphics driver?

> Linux has historically placed great emphasis on (and heavily leaned into) internal interfaces and structures being freely malleable, but the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

Others chimed in with examples to the contrary, and I had a similar experience. FWIW, for me the best feature of Rust was not lifetimes and borrows, but pattern matching and exhaustiveness checking. I always hated writing code that encodes state machines, but Rust makes that so much better.

To be clear, other languages with pattern matching have similar properties, and even C++ might get it soon.
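
To make the state-machine point concrete, here is a minimal sketch. The State and Event enums and the step function are invented for illustration and are not from any real driver. The match has no catch-all arm, so adding a variant to either enum should make the function fail to compile until the new combinations are handled.

```rust
/// Hypothetical connection states, invented for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Idle,
    Connecting,
    Connected,
    Closed,
}

/// Hypothetical events that drive the state machine.
#[derive(Debug, Clone, Copy)]
enum Event {
    Dial,
    Established,
    Hangup,
}

/// One transition step. Every (state, event) pair is listed explicitly and
/// there is no `_` arm, so the compiler checks that nothing is missing.
fn step(state: State, event: Event) -> State {
    match (state, event) {
        (State::Idle, Event::Dial) => State::Connecting,
        (State::Idle, Event::Established) => State::Idle,
        (State::Connecting, Event::Established) => State::Connected,
        (State::Connected, Event::Established) => State::Connected,
        // Dialing again while connecting or connected changes nothing.
        (State::Connecting | State::Connected, Event::Dial) => state,
        // Hanging up from any live state closes the connection.
        (State::Idle | State::Connecting | State::Connected, Event::Hangup) => State::Closed,
        // Closed is terminal: every event leaves it closed.
        (State::Closed, Event::Dial | Event::Established | Event::Hangup) => State::Closed,
    }
}
```

Deleting any one arm, or adding a variant such as a hypothetical Event::Timeout, turns into an E0004 error at compile time rather than a missed case at runtime.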



Thoughts and clarifications

Posted Sep 16, 2024 8:51 UTC (Mon) by adobriyan (subscriber, #30858)

> and exhaustiveness checking

    fn main() {
        let i: u32 = 41;
        match i % 2 {
            0 => println!("0"),
            1 => println!("1"),
        }
    }

error[E0004]: non-exhaustive patterns: `2_u32..=u32::MAX` not covered

But rustc also exposed an embarrassing bug in a trivial C++ calculator program of mine, so I can't complain too much.

Thoughts and clarifications

Posted Sep 16, 2024 12:30 UTC (Mon) by excors (subscriber, #95769)

That's easily worked around by adding a catch-all "_ => unreachable!()", ideally with a comment explaining why you believe it's unreachable (assuming the real code isn't quite this trivial), and if you were mistaken then it'll become a runtime panic (unlike C where reaching __builtin_unreachable() is undefined behaviour).
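
As a rough sketch of that workaround, here is the same match wrapped in a hypothetical parity function (the name and the returned strings are mine, not from the original example):

```rust
/// Returns "even" or "odd" for an unsigned integer.
fn parity(i: u32) -> &'static str {
    match i % 2 {
        0 => "even",
        1 => "odd",
        // `i % 2` on a u32 can only be 0 or 1, but the exhaustiveness check
        // reasons about the type (u32), not about the range of this value,
        // so a catch-all arm is still required. If the reasoning here is
        // ever wrong, this is a panic with a message, not undefined behaviour.
        _ => unreachable!("u32 % 2 is always 0 or 1"),
    }
}
```

With the catch-all in place, parity(41) returns "odd" and the E0004 error goes away.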

After making that change, you do lose the benefits of compile-time exhaustiveness checking for that match statement; someone might change the condition to "i % 3" and you won't notice until runtime. But you'll still get the benefits for any code that matches integers you can't guarantee are in a particular sub-range (like the inputs to any API), and for any code that matches enums (presumably what Cyberax meant with state machines). I'd guess those situations are more common in most programs, so the exhaustiveness checking is still a valuable feature even if it's not perfect.
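
One possible way to keep the compile-time check, sketched here as an assumption rather than something anyone in this thread proposed, is to convert the integer into an enum in one small function and match on the enum everywhere else. The Parity type and both function names are hypothetical.

```rust
/// Hypothetical two-variant enum standing in for "the result of i % 2".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Parity {
    Even,
    Odd,
}

impl Parity {
    /// The only place that looks at the raw integer.
    fn of(i: u32) -> Parity {
        if i % 2 == 0 {
            Parity::Even
        } else {
            Parity::Odd
        }
    }
}

/// No catch-all arm: adding a third variant to `Parity` would make this
/// match fail to compile until the new case is handled.
fn describe(i: u32) -> &'static str {
    match Parity::of(i) {
        Parity::Even => "0",
        Parity::Odd => "1",
    }
}
```

This does not make the arithmetic itself checked. It only confines the unchecked reasoning to Parity::of, so every match on Parity stays exhaustive.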

If your code is doing lots of work on bounded integers then I guess you'd want something more like the Wuffs type system, but then you'll get the compromises that Wuffs makes to make that work, which seems to restrict it to a very small niche. And that would still be inadequate if you do "match x & 2" since Wuffs doesn't know the value can't be 1. (Though as far as I can tell, Wuffs doesn't actually support any kind of switch/match statement - you have to write an if-else chain instead.)

Thoughts and clarifications

Posted Sep 16, 2024 14:20 UTC (Mon) by andresfreund (subscriber, #69562)

> That's easily worked around by adding a catch-all "_ => unreachable!()", ideally with a comment explaining why you believe it's unreachable (assuming the real code isn't quite this trivial), and if you were mistaken then it'll become a runtime panic (unlike C where reaching __builtin_unreachable() is undefined behaviour).

IMO this comparison to __builtin_unreachable() is nonsensical. I dislike a lot of UB in C as well, but you'd only ever use __builtin_unreachable() when you *want* the compiler to treat the case as actually unreachable, in order to generate better code.

Thoughts and clarifications

Posted Sep 16, 2024 14:52 UTC (Mon) by adobriyan (subscriber, #30858)

Rust does and doesn't do value-range checking at the same time:
without unreachable!() it is a compile error 100% of the time,
but at -O1 the code generator knows what the remainder operation does to integers.

https://godbolt.org/z/jbefszqa8

Guaranteed behaviour versus permitted optimizations

Posted Sep 17, 2024 8:49 UTC (Tue) by farnz (subscriber, #17727)

This is normal for any compiled language; the compiler is allowed but not required to remove dead code, and thus when the optimizer is able to prove that a given piece of code cannot be called, it is allowed to remove it (the same applies to unused data). However, it's never required to remove dead code, and when you're not optimizing, it'll skip the passes that look for dead code in the name of speed.

There's a neat trick that you can use to exploit this; put a unique string in panic functions that doesn't appear anywhere else in the code, and then a simple search of the binary for that unique string tells you whether or not the optimizer was able to remove the unwanted panic. It's not hard to put greps in CI that look for your unique string, and thus get a CI-time check for code that could panic at runtime - if the string is present, the optimizer has failed to see that it can remove the panic, and you need to work out whether that's a missed optimization (and if so, what you're going to do about it - make the code simpler? Improve the optimizer?). If it's absent, then you know that the optimizer saw a route to remove the panic for you.
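
A minimal sketch of the unique-string trick, assuming a made-up marker PARITY_PANIC_7f3a9c and a hypothetical classify function. Neither name means anything beyond this example.

```rust
/// Classifies an integer as "even" or "odd".
fn classify(i: u32) -> &'static str {
    match i % 2 {
        0 => "even",
        1 => "odd",
        // The marker below is a hypothetical unique string. It must not
        // appear anywhere else in the program, so that finding it in the
        // binary means this panic path survived optimization.
        _ => panic!("PARITY_PANIC_7f3a9c"),
    }
}
```

A CI step could then search the optimized binary for the marker, for example with grep -c PARITY_PANIC_7f3a9c against whatever path your build produces, and fail the job when the count is non-zero. Whether the string is actually gone depends on your compiler version and optimization level, so treat a hit as a prompt to investigate rather than as proof of a bug.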

Thoughts and clarifications

Posted Sep 16, 2024 17:40 UTC (Mon) by excors (subscriber, #95769)

This is getting slightly tangential, but I don't think it's that far-fetched to compare them - they have basically the same name (especially in codebases like Linux that #define it to "unreachable"), and people do use it in C for non-performance reasons, e.g.:

https://github.com/torvalds/linux/blob/v6.11/arch/mips/la... (unreachable() when the hardware returns an unexpected chip ID; that doesn't sound safe)

https://github.com/torvalds/linux/blob/v6.11/fs/ntfs3/fre... (followed by error-handling code, suggesting the programmer thought maybe this could be reached)

https://github.com/torvalds/linux/blob/v6.11/arch/mips/kv... (genuinely unreachable switch default case, explicitly to stop compiler warnings)

https://github.com/torvalds/linux/blob/v6.11/arch/mips/la... (looks like they expected unreachable() to be an infinite loop, which I think it was when that code was written, but it will misbehave with __builtin_unreachable())

https://github.com/torvalds/linux/blob/v6.10/drivers/clk/... (probably to stop missing-return-value warnings; not clear if it's genuinely unreachable, since clk_hw looks non-trivial; sensibly replaced by BUG() later (https://lore.kernel.org/all/20240704073558.117894-1-liqia...))

__builtin_unreachable() seems like an attractive nuisance (especially when renamed to unreachable()) - evidently people use it for cases where they think it shouldn't be reached, but they haven't always proved it can't be reached, and if it is then they get UB instead of a debuggable error message. It seems they usually add it to stop compiler warnings, not for better code generation. Often they should have used BUG(), which is functionally equivalent to Rust's unreachable!() though slightly less descriptive.

If you really need the code-generation hint in Rust, when the optimiser (which is a bit smarter than the compiler frontend) still can't figure out that your unreachable!() is unreachable, there's "unsafe { std::hint::unreachable_unchecked() }" which is just as dangerous but much less attractive than Linux's unreachable().
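
For completeness, a small sketch of what that looks like. The function name parity_unchecked is hypothetical, while std::hint::unreachable_unchecked is the real standard library function.

```rust
use std::hint::unreachable_unchecked;

/// Same logic as a checked version, but the catch-all arm is a promise to
/// the compiler instead of a runtime check.
fn parity_unchecked(i: u32) -> &'static str {
    match i % 2 {
        0 => "even",
        1 => "odd",
        // SAFETY: for any u32, `i % 2` is either 0 or 1, so this arm can
        // never execute. If that reasoning were wrong, reaching this point
        // would be undefined behaviour rather than a panic.
        _ => unsafe { unreachable_unchecked() },
    }
}
```

The unsafe block and the SAFETY comment are the "much less attractive" part: the caller of the hint has to state, in the code, why the arm cannot be reached.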

Anyway, I didn't originally mean to denigrate C, I was mainly trying to explain the Rust code to readers who might be less familiar with it. But it does also serve as an example of different attitudes to how easy it should be to invoke UB.

Thoughts and clarifications

Posted Sep 16, 2024 18:14 UTC (Mon) by mb (subscriber, #50428)

Yes, looks like you found a couple of actual soundness bugs in the C code.
I wonder if there are any uses of unreachable that actually make sense. As in: Places where the performance gain actually matters.

Thoughts and clarifications

Posted Sep 22, 2024 18:32 UTC (Sun) by Rudd-O (guest, #61155)

> but pattern matching and exhaustiveness checking

This was magical to me too. At first it felt super awkward because it seemed like an inversion of the order in which things are supposed to read. But when it clicked... oh my god. Combined with the question mark operator or a return inside the match, it really helped simplify the structure of the happy path so that I could read it.

I am so happy I learned Rust. And I'm even happier that I'm getting paid to do it.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds