
Unsafe Rust versus C and C++

Posted Feb 5, 2025 17:21 UTC (Wed) by Trainninny (guest, #175745)
In reply to: Unsafe Rust versus C and C++ by farnz
Parent article: Resistance to Rust abstractions for DMA mapping

I am very sorry, but your argument is still wrong and my example still counters your whole argument. You are wrong in multiple ways.

>I'm saying that the Rust compiler hews much closer to the (informal) Rust standard

Are there holes in the Rust type system, and do these give trouble in practice? Are the Rust language developers working on a new type system and solver for Rust, and trying to make it as backwards compatible as possible?

https://github.com/lcnr/solver-woes/issues/1

>Even worse, there may be changes to asymptotic complexity of some part of the trait system. This can cause crates which start to compile fine due to the stabilization of the new solver to hang after regressing the complexity again. This is already an issue of the current type system. For example rust-lang/rust#75443 caused hangs (rust-lang/rust#75992), was reverted in rust-lang/rust#78410, then landed again after fixing these regressions in rust-lang/rust#100980 which caused yet another hang (rust-lang/rust#103423), causing it to be reverted yet again in rust-lang/rust#103509.

How well defined is Rust, unsafe, the memory model of Rust? How easy is it to tell for a programmer whether a given piece of unsafe Rust code has undefined behavior or not? How much material, and what material, does a Rust programmer need to study, to be able to develop and maintain unsafe Rust with a high level of confidence?

chadaustin.me/2024/10/intrusive-linked-list-in-rust/

>Unsafe Rust Is Harder Than C
>
>-
>
>Self-referential data structures are a well-known challenge in Rust. They require unsafe code.
>
>-
>
>Note: This may have been a MIRI bug or the rules have since been relaxed, because I can no longer reproduce as of nightly-2024-06-12. Here’s where the memory model and aliasing rules not being defined caused some pain: when MIRI fails, it’s unclear whether it’s my fault or not.
>
>-
>
>Note: This may have also been a MIRI bug. It is no longer reproducible.
>
>-
>
>Until the Rust memory model stabilizes further and the aliasing rules are well-defined, your best option is to integrate ASAN, TSAN, and MIRI (both stacked borrows and tree borrows) into your continuous integration for any project that contains unsafe code.
>
>If your project is safe Rust but depends on a crate which makes heavy use of unsafe code, you should probably still enable sanitizers. I didn’t discover all UB in wakerset until it was integrated into batch-channel.
>
>-
>
>Without MIRI, it would be hard to trust unsafe Rust.
>
>-
>
>References, even if never used, are more dangerous than pointers in C.

https://lucumr.pocoo.org/2022/1/30/unsafe-rust/

>I made the case on Twitter a few days ago that writing unsafe Rust is harder than C or C++, so I figured it might be good to explain what I mean by that.
>
>-
>
>So first of all: does this [Rust unsafe] work now? The answer is yes. But is it correct? The answer is not.
>
>-
>
>It's 2022 and I will admit that I no longer feel confident writing unsafe Rust code. The rules were probably always complex but I know from reading a lot of unsafe Rust code over many years that most unsafe code just did not care about those rules and just disregarded them. There is a reason that addr_of_mut! did not get added to the language until 1.53. Even today the docs both say there are no guarantees on the alignment on native rust struct reprs.
>
>Over the last few years it seem to have happened that the Rust developers has made writing unsafe Rust harder in practice and the rules are so complex now that it's very hard to understand for a casual programmer and the documentation surrounding it can be easily misinterpreted. An earlier version of this article for instance assumed that some uses of addr_of_mut! were necessary that really were not. And that article got quite a few shares overlooking this before someone pointed that mistake out!
>
>These rules have made one of Rust's best features less and less approachable and also harder to understand. The requirement for the existence MaybeUninit instead of “just” having the old mem::uninitialized API is obvious but shows how complex the rules of the language are.
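For readers unfamiliar with the APIs named in the quote, here is a minimal sketch of the pattern it refers to: initializing a struct field by field through `MaybeUninit` and `addr_of_mut!`, so that no reference to uninitialized memory is ever created. The `Record` type and `build_record` function are hypothetical names used only for illustration, not code from the quoted article.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

// Hypothetical struct, used only for illustration.
struct Record {
    id: u32,
    name: String,
}

// Initialize a Record one field at a time, using only raw pointers
// until every field has been written.
fn build_record(id: u32, name: &str) -> Record {
    let mut slot = MaybeUninit::<Record>::uninit();
    let ptr = slot.as_mut_ptr();
    unsafe {
        // addr_of_mut! produces a raw pointer to the field without
        // creating an intermediate `&mut` to uninitialized data.
        addr_of_mut!((*ptr).id).write(id);
        // `write` does not drop the old (uninitialized) value,
        // unlike a plain assignment would.
        addr_of_mut!((*ptr).name).write(name.to_owned());
        // Sound only because every field was written above.
        slot.assume_init()
    }
}
```

Calling `build_record(7, "x")` yields a fully initialized `Record`; forgetting to write one of the fields before `assume_init()` would be undefined behavior that the compiler does not report.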

youtube.com/watch?v=DG-VLezRkYQ

>@oconnor663 11 months ago It could've been thirty seconds:
>
>Rust doesn't have the "strict aliasing" rules from C and C++.
>
>But all Rust references are effectively "restrict" pointers, so getting unsafe Rust right is harder in practice.
>
>It would be nice never to have to worry about any of this, but it turns out that a lot of optimizations don't work without aliasing information.
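A rough sketch of what that comparison to "restrict" means in practice, assuming current stable Rust. Both function names are made up for illustration. With references the two arguments can never overlap, while with raw pointers they can, and the result then depends on re-reading the value after each write.

```rust
// With references, `a` and `b` are guaranteed not to overlap, which is
// comparable to `restrict` in C. Safe code cannot pass the same variable
// for both arguments.
fn add_twice_refs(a: &mut i32, b: &i32) {
    *a += *b;
    *a += *b;
}

// With raw pointers no such guarantee exists: `a` and `b` may point to
// the same integer, so `*b` has to be re-read after the first write.
unsafe fn add_twice_raw(a: *mut i32, b: *const i32) {
    unsafe {
        *a += *b;
        *a += *b;
    }
}
```

With distinct variables both versions add `2 * b`. If the raw-pointer version is handed two pointers to the same integer holding 1, the result is 4 (1 + 1, then 2 + 2). Writing unsafe code that turns such overlapping pointers back into a `&mut` and a `&` at the same time is where the trouble starts.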


to post comments

Unsafe Rust versus C and C++

Posted Feb 5, 2025 18:47 UTC (Wed) by farnz (subscriber, #17727) [Link] (1 responses)

Your wall of text does not back your claim that my argument is wrong - indeed, it doesn't even address it at all.

My fundamental argument is that practically Rust is harder than C, while theoretically C is as hard as Rust, because Rust doesn't give you "dialects" - there is one and only one version of Rust, and while Rust may eventually weaken the constraints unsafe code has to follow, for now the only way to be guaranteed safe is to comply with all the constraints.

However, if you try to write ISO Standard C, it's as hard (if not harder) to comply with the rules - the only way to be guaranteed safe is to obey all the constraints ISO Standard C imposes (for which there are no tools to check that you've done this - it's a pure whiteboard exercise, unlike Rust, where there's things like Miri to help). Worse, there are constraints in the ISO spec where no extant compiler currently exploits the fact that ISO Standard C leaves this area underspecified, because doing so would break too much code and drive away the compiler's users.

With practical C, though, you don't write ISO Standard C; you write GCC C, or Microsoft Visual C, or Clang C, which have laxer restrictions than the ISO Standard, and have tooling like UBSAN to detect breaches of the compiler-dialect C rules. That's an easier task than writing ISO Standard C that's fully compliant with the rules, and also easier than writing Rust that's fully compliant with the Rust rules, because each compiler defines some behaviours that ISO leaves as undefined, unspecified, or implementation-defined (and in the last case, it's required to define them by ISO rules, and can't leave them alone).

And, on top of that, it's possible to use compiler switches to ask for a C dialect with certain things that are technically UB in ISO Standard C defined by the compiler in a way that's useful - -fno-strict-aliasing, -fwrapv for two examples. Writing safe code in those dialects is easier, because there's less room to accidentally write UB to begin with.
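For comparison with `-fwrapv`, here is a minimal sketch, assuming current stable Rust, of how wrapping or checked signed arithmetic can be requested per operation through standard-library methods rather than through a compiler switch. The two function names are invented for this illustration.

```rust
// Wrapping addition: overflow is defined to wrap around, whatever
// the build settings are.
fn wrapping_increment(x: i32) -> i32 {
    x.wrapping_add(1)
}

// Checked addition: overflow is reported as `None` instead.
fn checked_increment(x: i32) -> Option<i32> {
    x.checked_add(1)
}
```

Here `wrapping_increment(i32::MAX)` gives `i32::MIN` and `checked_increment(i32::MAX)` gives `None`, so the choice of overflow behaviour is visible at the call site.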

Unsafe Rust versus C and C++

Posted Feb 5, 2025 19:55 UTC (Wed) by Trainninny (guest, #175745) [Link]

Have some comments been deleted? There were some comments by some other people, but I cannot find them anymore.

>Your wall of text does not back your claim that my argument is wrong - indeed, it doesn't even address it at all.

I am very, very sorry, but you are completely wrong about this.

>[...] because Rust doesn't give you "dialects" - there is one and only one version of Rust, [...]

This is also completely wrong, though it is not core to the argument. Several counter-examples:

https://doc.rust-lang.org/cargo/reference/profiles.html#p...

panic="unwind"/"abort"

https://github.com/rust-lang/rust/issues/126683

-Zoom=panic/abort

https://doc.rust-lang.org/nightly/edition-guide/rust-2024...

Whether the Rust code in that page deadlocks or not depends on its edition. Rust does at least have automatic migration tools, but there are still drawbacks to this: You cannot in general tell without knowing the specific Rust edition what a sample of Rust code does; and documentation, guides, tutorials, etc. that do not explicitly mention the edition that they are valid for, will risk having ambiguous meaning and correctness.
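As a sketch of the first counter-example (the panic strategy), assuming a toolchain recent enough to support `cfg!(panic = "...")`: the same source behaves differently depending on whether the crate is built with `panic = "unwind"` or `panic = "abort"`. The function names are hypothetical.

```rust
use std::panic;

// Returns true if a panic raised inside the closure was caught.
// When built with panic = "abort", the process terminates at the
// panic and this function never returns.
fn panic_was_caught() -> bool {
    panic::catch_unwind(|| {
        panic!("boom");
    })
    .is_err()
}

// Reports which panic strategy this crate was compiled with.
fn strategy() -> &'static str {
    if cfg!(panic = "unwind") {
        "unwind"
    } else {
        "abort"
    }
}
```

Under the default unwinding strategy `panic_was_caught()` returns `true`. Under the abort strategy the call never returns, so code that relies on `catch_unwind` for recovery silently depends on the profile setting.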

You can then argue that the dialects of C or C++ in specific compilers are significantly worse, which I could imagine being true and is a drawback of those compilers and arguably a drawback of the C and C++ languages as well. But that is a different discussion, and Rust 100% does have dialects. Rust is also in a situation where it has 1 major compiler; would Rust end up with similar issues as C and C++ if it had multiple compilers? And having multiple major compilers is presumably a good thing overall for a language used for critical infrastructure.

>[...] the only way to be guaranteed safe is to obey all the constraints ISO Standard C imposes (for which there are no tools to check that you've done this - it's a pure whiteboard exercise, unlike Rust, where there's things like Miri to help). [...]

This is also completely wrong. MIRI can be seen as a runtime checker/sanitizer with many of the same advantages and drawbacks, and there are many different sanitizers and runtime checkers for C as well as C++. Some of them are ported between C++ and Rust. And MIRI, while greatly helpful, is not perfect. People have complained about bugs in MIRI and about false positives and false negatives, and as a runtime checker/sanitizer, MIRI takes a long time to run, as C++ sanitizers also do. And like sanitizers, if your test run with MIRI does not cover a specific combination of control flow and values, MIRI will not check that.
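To illustrate the coverage point with a deliberately flawed sketch (the function `pick` is hypothetical): the unsafe block below is only sound when the index is in range, and a dynamic checker can only flag the problem if some test actually passes an out-of-range index.

```rust
// Hypothetical helper: sound only when `i < data.len()`.
// Nothing in the signature enforces that precondition.
fn pick(data: &[u8], i: usize) -> u8 {
    // Deliberately unchecked: an out-of-range `i` is undefined behavior.
    unsafe { *data.get_unchecked(i) }
}
```

A test suite that only ever calls `pick(&[10, 20, 30], 1)` executes no undefined behavior, so a runtime checker has nothing to report for it, even though the function is unsound for other inputs.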

>With practical C, though, you don't write ISO Standard C; you write GCC C, or Microsoft Visual C, or Clang C, which have laxer restrictions than the ISO Standard, and have tooling like UBSAN to detect breaches of the compiler-dialect C rules. That's an easier task than writing ISO Standard C that's fully compliant with the rules, and also easier than writing Rust that's fully compliant with the Rust rules, because each compiler defines some behaviours that ISO leaves as undefined, unspecified, or implementation-defined (and in the last case, it's required to define them by ISO rules, and can't leave them alone).

This is again not relevant as far as I can tell. But the situation for Rust is worse, since Rust has neither a language specification nor a specification for its memory model, and has only one main compiler. A specification for Rust is currently a work in progress. C++ and C have several major compilers, despite flaws, and some codebases do try to work with any major compiler. I agree that this is a weakness of both C and C++, more so for C++ given increased complexity. But Rust is arguably and unfortunately worse here, which is highly regrettable, given that Rust is much younger. A major reason for Rust being worse here is, as far as I can tell, the type system of Rust being both complex and having holes. The Rust language developers are working on a new solver and type system for Rust, but it is an effort that is taking many specialized developers years.

One practical consequence of the type system holes of Rust, apart from the issues encountered for maintenance of the language and main compiler and the issues for users like exponential regressions in compile times, is that writing a new compiler may be difficult, unless you copy the solver of the main Rust compiler, despite that solver having issues.

There are also complaints that the rules of unsafe for Rust have changed and become more complex over time, which I hope as of 2025 are no longer true. It would be very good if unsafe Rust becomes easier, not harder, as the language is developed. But there have been complaints about the opposite happening in the past.

That Rust does not have a specification, and has only one major compiler, also makes it easier for implementation-defined behavior to accidentally become part of the language. https://github.com/rust-lang/rust/issues/97146 tells of some users apparently relying on a specific behavior related to double panics.

Once the new type system and solver for Rust are ready, they may end up not being 100% backwards compatible with the old type system.

Unsafe Rust versus C and C++

Posted Feb 5, 2025 19:28 UTC (Wed) by Wol (subscriber, #4433) [Link] (4 responses)

> Are there holes in the Rust type system, and do these give trouble in practice? Are the Rust language developers working on a new type system and solver for Rust, and trying to make it as backwards compatible as possible?

And now you are really coming over as lying by omission. Sorry.

Let's take an example of electric cars. Pretend I have an electric car with a range of 250 miles, and it takes a day to give it a full charge. How long will it take me to drive 400 miles from London to Edinburgh? A day and a half? No. At 50 mph it will take me about 9 hours.

Because charging an electric car follows the 80/20 rule. If I go half way (preferably a bit more) it will take about 4 hours. Stop for a coffee in a service station, and that is enough to charge the car to about 80%. And that's enough to cover the remaining 200 miles. I might have to stop a second time, but I might not.

The two crucial facts about Rust are that (a) all "safe" code has been proven correct by the compiler, and (b) the majority of Rust programmers should never have to touch unsafe code.

Maybe you're right banging on about all these exceptions, and the Rust guys are writing all this fancy stuff like MIRI, but the definition of "unsafe code" is "stuff the compiler can't prove is correct". So all you're doing is like climate deniers complaining electric cars are useless because it takes too long to get those last few miles into the battery.

The definition of "unsafe" code is "stuff the compiler can't prove correct". And you're moaning that the compiler writers aren't mathematical gods because they can't (yet) prove some very tricky problems. And other problems are just plain insolvable.

THAT is why unsafe code is hard. Because the maths behind it is hard. Knuth ranks his problems from 0 is "easy" to (iirc) 5 is "if you can solve it it's worth a PhD". As soon as you start programming "unsafe", you are dealing with code where the proofs are 4 or 5 on the scale - if it's even provable!

90% of Rust programmers are unlikely to step outside the safe zone in their entire career. All safe code MUST be fully defined, and MUST be provably correct (bugs, cockups, and acts of God excepted).

100% of C/C++ programmers are likely to step on an unsafe landmine several times a year.

That's a big difference!

Cheers,
Wol

Second request

Posted Feb 5, 2025 19:50 UTC (Wed) by corbet (editor, #1) [Link] (1 responses)

Do not feed the troll. Please.

Wol, why do we have to keep asking you this?

Second request

Posted Feb 5, 2025 20:03 UTC (Wed) by Trainninny (guest, #175745) [Link]

I am sorry, but I do not understand why you appear to describe me as a troll. Are there any flaws or issues in any of my comments?

Unsafe Rust versus C and C++

Posted Feb 5, 2025 20:33 UTC (Wed) by Trainninny (guest, #175745) [Link] (1 responses)

>[...] (b) the majority of Rust programmers should never have to touch unsafe code. [...]
>
>-
>
>[...]90% of Rust programmers are unlikely to step outside the safe zone in their entire career. [...]

Is that not in direct contrast to https://lwn.net/Articles/1007973/ ? At least in the context of the Linux kernel?

>[...]but the definition of "unsafe code" is "stuff the compiler can't prove is correct"[...]

A minor technicality: the compiler is not proving the correctness of the code, it is (meant to be) proving the memory safety/absence of undefined behavior. In the sense that memory safe code without undefined behavior can have logic bugs.
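A small sketch of that distinction, using a made-up function: the following is entirely safe Rust and compiles without complaint, yet it does not do what its name promises.

```rust
// Intended to return the largest element, but the comparison is
// inverted, so it returns the smallest one instead. The compiler
// checks memory safety here, not whether the logic matches the intent.
fn largest(values: &[i32]) -> Option<i32> {
    let mut best = *values.first()?;
    for &v in values {
        if v < best {
            best = v;
        }
    }
    Some(best)
}
```

`largest(&[3, 9, 1])` returns `Some(1)`. There is no undefined behavior and no memory unsafety, just a plain logic bug.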

>All safe code MUST be fully defined, and MUST be provably correct (bugs, cockups, and acts of God excepted).

Did you mean unsafe?

But even then, this does not always hold in practice.

github.com/rust-lang/rust/commit/71f5cfb21f3fd2f1740bced061c66ff112fec259

cve.org/CVERecord?id=CVE-2024-27308

>100% of C/C++ programmers are likely to step on an unsafe landmine several times a year.

I do not believe this is true, but I do believe that Rust makes some aspects significantly easier, and not only its borrow checking and solver and lifetimes handling, though in the specific case of Rust that also comes with penalties in the unsafe subset. One great advantage is Rust's pattern matching and disjoint unions, taken from functional programming. And one thing that makes C and C++ error prone for some cases is that C and C++ are ancient languages that have a lot of cruft and baggage. I do prefer functional programming, and hope that C++ will get a good and robust implementation of both pattern matching and disjoint unions (C arguably has a limited scope), but Rust has significant issues. To be perfectly frank, I wonder if a Rust killer in the future may greatly iterate on and improve and be closer to what many of us hoped that Rust would be.
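For readers who have not used them, a minimal sketch of what disjoint unions and pattern matching look like in Rust. The `Shape` type and `area` function are hypothetical examples.

```rust
// Hypothetical type for illustration: a disjoint union of two shapes.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

fn area(shape: &Shape) -> f64 {
    // The match must cover every variant; adding a variant to `Shape`
    // later makes this function fail to compile until it is handled.
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}
```

The exhaustiveness check is the part that removes a class of errors: there is no way to read the `radius` of a `Rect`, and no way to forget a case silently.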

>The definition of "unsafe" code is "stuff the compiler can't prove correct". And you're moaning that the compiler writers aren't mathematical gods because they can't (yet) prove some very tricky problems. And other problems are just plain insolvable.
>
>THAT is why unsafe code is hard. Because the maths behind it is hard. Knuth ranks his problems from 0 is "easy" to (iirc) 5 is "if you can solve it it's worth a PhD". As soon as you start programming "unsafe", you are dealing with code where the proofs are 4 or 5 on the scale - if it's even provable!

But these issues are not purely theoretical, and appear to cause trouble not only for users but also for language developers.

https://github.com/lcnr/solver-woes/issues/1

>Even worse, there may be changes to asymptotic complexity of some part of the trait system. This can cause crates which start to compile fine due to the stabilization of the new solver to hang after regressing the complexity again. This is already an issue of the current type system. For example rust-lang/rust#75443 caused hangs (rust-lang/rust#75992), was reverted in rust-lang/rust#78410, then landed again after fixing these regressions in rust-lang/rust#100980 which caused yet another hang (rust-lang/rust#103423), causing it to be reverted yet again in rust-lang/rust#103509.

And the Rust language developers made multiple blog posts discussing their work on the new solver, and on trying to make it backwards compatible.

There is a comment where I discuss the language design of Rust and related issues https://lwn.net/Articles/1008103/ . One could argue that requiring a complex solver, that in practice may end up with holes, has practical trade-offs in the language design. I recall that Bjarne Stroustrup was against any language feature that would require complex solvers. I wonder if part of the reasoning is that it would make it harder to implement correct compilers. Which may be consistent with some of the headaches that some apparently really skilled people among the Rust language developers appear to be dealing with. I do not envy their position, their challenge looks difficult.

>And now you are really coming over as lying by omission. Sorry.

I do not agree with this at all, and as far as I can tell, you are completely wrong about this.

Stop here.

Posted Feb 5, 2025 20:36 UTC (Wed) by corbet (editor, #1) [Link]

Your comments are trolls - lengthy pieces designed to prolong conversations and make people argue. They are off-topic for an article on kernel development. Whether or not they actually are, they certainly have the look of machine-generated text. I am done asking you to stop, you really need to put an end to this here.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds