Thoughts and clarifications

Posted Sep 16, 2024 14:20 UTC (Mon) by andresfreund (subscriber, #69562)
In reply to: Thoughts and clarifications by excors
Parent article: Whither the Apple AGX graphics driver?

> That's easily worked around by adding a catch-all "_ => unreachable!()", ideally with a comment explaining why you believe it's unreachable (assuming the real code isn't quite this trivial), and if you were mistaken then it'll become a runtime panic (unlike C where reaching __builtin_unreachable() is undefined behaviour).

IMO this comparison to __builtin_unreachable() is nonsensical. I dislike a lot of the UB in C as well, but you'd only ever use __builtin_unreachable() when you *want* the compiler to treat the case as actually unreachable, in order to generate better code.

Thoughts and clarifications

Posted Sep 16, 2024 14:52 UTC (Mon) by adobriyan (subscriber, #30858)

Rust does and doesn't do bounds checking at the same time:
without unreachable!() it is a compile error 100% of the time,
but at -O1 the code generator knows what remainder does to integers.

https://godbolt.org/z/jbefszqa8
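The linked snippet is not reproduced here. As a minimal sketch of the kind of code being discussed (the function name and labels are hypothetical, not taken from the link), a match on a remainder needs a catch-all arm to compile, even though that arm can never run:

```rust
/// Maps a value to a label based on its remainder modulo 4.
/// (Hypothetical example; names are illustrative only.)
pub fn remainder_label(x: u32) -> &'static str {
    match x % 4 {
        0 => "zero",
        1 => "one",
        2 => "two",
        3 => "three",
        // The frontend's exhaustiveness check does not reason about `%`,
        // so without this arm the match is rejected as non-exhaustive
        // (error E0004). If the arm were ever reached, it would panic.
        _ => unreachable!("x % 4 is always in 0..=3"),
    }
}
```

With optimization enabled, the code generator can usually see that the last arm is dead and drop the panic path, which is the "does and doesn't" behaviour described above.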

Guaranteed behaviour versus permitted optimizations

Posted Sep 17, 2024 8:49 UTC (Tue) by farnz (subscriber, #17727)

This is normal for any compiled language: the compiler is allowed, but never required, to remove dead code. When the optimizer is able to prove that a given piece of code cannot be reached, it may remove it (the same applies to unused data). When you're not optimizing, it'll skip the passes that look for dead code in the name of compilation speed.

There's a neat trick that you can use to exploit this: put a unique string in panic functions that doesn't appear anywhere else in the code; a simple search of the binary for that string then tells you whether or not the optimizer was able to remove the unwanted panic. It's not hard to put greps in CI that look for your unique string, and thus get a CI-time check for code that could panic at runtime. If the string is present, the optimizer has failed to see that it can remove the panic, and you need to work out whether that's a missed optimization (and if so, what you're going to do about it: make the code simpler? improve the optimizer?). If it's absent, then you know that the optimizer saw a route to remove the panic for you.
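A rough sketch of the marker trick, under stated assumptions: the marker text "PANIC-MARKER-7f3a9c", the function names, and the byte-search helper are all hypothetical and only illustrate the idea. The helper stands in for the grep step and would live in a separate CI tool, since embedding the marker anywhere else in the checked binary would defeat the check.

```rust
/// Code under test. The marker string appears only in this panic message,
/// so it survives in the binary only if the panic path does.
pub fn remainder_label_marked(x: u32) -> &'static str {
    match x % 4 {
        0 => "zero",
        1 => "one",
        2 => "two",
        3 => "three",
        _ => panic!("PANIC-MARKER-7f3a9c: remainder out of range"),
    }
}

/// Stand-in for `grep`: reports whether `marker` occurs in the raw bytes
/// of a compiled artifact. Intended for a separate checking tool that
/// receives the marker as an argument.
pub fn contains_marker(binary: &[u8], marker: &str) -> bool {
    let needle = marker.as_bytes();
    // `windows(0)` would panic, so an empty marker never matches.
    !needle.is_empty() && binary.windows(needle.len()).any(|w| w == needle)
}
```

In CI, the same check could be something like `grep -c PANIC-MARKER-7f3a9c path/to/release-binary` (the path is a placeholder), failing the job when the count is non-zero.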

Thoughts and clarifications

Posted Sep 16, 2024 17:40 UTC (Mon) by excors (subscriber, #95769)

This is getting slightly tangential, but I don't think it's that far-fetched to compare them - they have basically the same name (especially in codebases like Linux that #define it to "unreachable"), and people do use it in C for non-performance reasons, e.g.:

https://github.com/torvalds/linux/blob/v6.11/arch/mips/la... (unreachable() when the hardware returns an unexpected chip ID; that doesn't sound safe)

https://github.com/torvalds/linux/blob/v6.11/fs/ntfs3/fre... (followed by error-handling code, suggesting the programmer thought maybe this could be reached)

https://github.com/torvalds/linux/blob/v6.11/arch/mips/kv... (genuinely unreachable switch default case, explicitly to stop compiler warnings)

https://github.com/torvalds/linux/blob/v6.11/arch/mips/la... (looks like they expected unreachable() to be an infinite loop, which I think it was when that code was written, but it will misbehave with __builtin_unreachable())

https://github.com/torvalds/linux/blob/v6.10/drivers/clk/... (probably to stop missing-return-value warnings; not clear if it's genuinely unreachable, since clk_hw looks non-trivial; sensibly replaced by BUG() later (https://lore.kernel.org/all/20240704073558.117894-1-liqia...))

__builtin_unreachable() seems like an attractive nuisance (especially when renamed to unreachable()) - evidently people use it for cases where they think it shouldn't be reached, but they haven't always proved it can't be reached, and if it is then they get UB instead of a debuggable error message. It seems they usually add it to stop compiler warnings, not for better code generation. Often they should have used BUG(), which is functionally equivalent to Rust's unreachable!() though slightly less descriptive.

If you really need the code-generation hint in Rust, when the optimiser (which is a bit smarter than the compiler frontend) still can't figure out that your unreachable!() is unreachable, there's "unsafe { std::hint::unreachable_unchecked() }" which is just as dangerous but much less attractive than Linux's unreachable().
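To make the contrast concrete, here is a small sketch with hypothetical function names. Both variants return the same results for every input; the difference is only what happens if the "impossible" arm is ever reached: a panic in the first, undefined behaviour in the second.

```rust
use std::hint::unreachable_unchecked;

/// Safe variant: a wrong assumption becomes a runtime panic.
pub fn label_checked(x: u32) -> &'static str {
    match x % 4 {
        0 => "zero",
        1 => "one",
        2 => "two",
        3 => "three",
        _ => unreachable!("x % 4 is always in 0..=3"),
    }
}

/// Unsafe variant: a wrong assumption becomes undefined behaviour,
/// comparable to reaching __builtin_unreachable() in C.
pub fn label_unchecked(x: u32) -> &'static str {
    match x % 4 {
        0 => "zero",
        1 => "one",
        2 => "two",
        3 => "three",
        // SAFETY: for any u32, `x % 4` is in 0..=3, and all four values
        // are handled by the arms above, so this arm cannot be reached.
        _ => unsafe { unreachable_unchecked() },
    }
}
```

The `unsafe` block and the conventional SAFETY comment are what make the second form "much less attractive": the proof obligation is visible at the call site rather than hidden behind a friendly name.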

Anyway, I didn't originally mean to denigrate C, I was mainly trying to explain the Rust code to readers who might be less familiar with it. But it does also serve as an example of different attitudes to how easy it should be to invoke UB.

Thoughts and clarifications

Posted Sep 16, 2024 18:14 UTC (Mon) by mb (subscriber, #50428)

Yes, looks like you found a couple of actual soundness bugs in the C code.
I wonder if there are any uses of unreachable that actually make sense. As in: Places where the performance gain actually matters.