Learning Rust

Posted Feb 4, 2025 20:18 UTC (Tue) by Trainninny (guest, #175745)
In reply to: Learning Rust by liw
Parent article: Resistance to Rust abstractions for DMA mapping

Reasonable. For embedded work, and work on the Linux kernel, unsafe may be significantly more common than for some other types of projects. Cyberax mentioned a Rust driver for 8139, https://crates.io/crates/rtl8139-rs , source at https://github.com/vgarleanu/rtl8139-rs , and that repository has a very large proportion of unsafe code. And unsafe can have undefined behavior/is not memory safe. And even a single instance of unsafe can require a whole Rust module to have to be vetted.

https://doc.rust-lang.org/nomicon/working-with-unsafe.html

>Because it relies on invariants of a struct field, this unsafe code does more than pollute a whole function: it pollutes a whole module. Generally, the only bullet-proof way to limit the scope of unsafe code is at the module boundary with privacy.

If Rust in the Linux kernel will require a lot of unsafe, and unsafe is harder than C, will Rust in practice for the Linux kernel be significantly less memory safe than C?

On the other hand, Rust has pattern matching and disjoint unions, and more modern features from contemporary language design and functional programming, and some of the same advantages that C++ has, except with a way better build and package system than C++, ignoring dynamic linking.

Learning Rust

Posted Feb 4, 2025 20:38 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (10 responses)

> And unsafe can have undefined behavior/is not memory safe. And even a single instance of unsafe can require a whole Rust module to have to be vetted.

The `unsafe` in that driver is used for actual hardware interaction (reading registers, etc.). Linux Rust already has "safe" wrappers for almost all of them, so this driver can be rewritten in safe Rust once the DMA abstraction lands.
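
As a rough illustration of what a "safe" wrapper around register access can look like, here is a minimal sketch. The `Reg32` type and `set_bits` function are hypothetical names invented for this example and are not the kernel's actual abstractions; the only assumption is that whoever constructs the handle vouches for the pointer once.

```rust
use core::ptr::{read_volatile, write_volatile};

/// Hypothetical handle to one 32-bit device register (not a kernel API).
pub struct Reg32 {
    addr: *mut u32,
}

impl Reg32 {
    /// # Safety
    /// `addr` must be valid, aligned, and used only through this handle
    /// for as long as the handle exists.
    pub unsafe fn new(addr: *mut u32) -> Self {
        Reg32 { addr }
    }

    /// Safe to call: the constructor's contract covers the pointer.
    pub fn read(&self) -> u32 {
        // SAFETY: validity was promised by the caller of `new`.
        unsafe { read_volatile(self.addr) }
    }

    /// Safe to call for the same reason as `read`.
    pub fn write(&mut self, value: u32) {
        // SAFETY: validity was promised by the caller of `new`.
        unsafe { write_volatile(self.addr, value) }
    }
}

/// Driver-style logic written entirely in safe Rust.
pub fn set_bits(reg: &mut Reg32, mask: u32) -> u32 {
    let value = reg.read() | mask;
    reg.write(value);
    reg.read()
}
```

The single `unsafe fn new` is where the obligation is stated; everything built on top, like `set_bits`, needs no `unsafe` at all. For experimenting outside a kernel, the register can be simulated with an ordinary `u32` variable whose address is passed to `Reg32::new`.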

Learning Rust

Posted Feb 4, 2025 21:28 UTC (Tue) by Trainninny (guest, #175745) [Link] (9 responses)

Interesting. I cannot ask you to make this guess, but, in case that you are willing to guess at a ballpark figure: How large a proportion of unsafe might Rust code in the Linux kernel on average potentially in theory have, if we assume a theoretical future in 5 years where everyone including all current maintainers write Rust in the Linux kernel?

Learning Rust

Posted Feb 4, 2025 23:27 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

I think that something simple like RTL-8139 (I loved that network card back in the day, it was a huge upgrade from RTL-8029) can be written without unsafe code entirely, using only safe abstractions. One stylistic choice that can matter a lot is whether all the hardware interactions are marked as "unsafe".

More complicated drivers will have some unsafe code. Asahi Lina's DRM driver is probably a good approximation for it: https://github.com/AsahiLinux/linux/tree/gpu/rust-wip/dri...

Learning Rust

Posted Feb 5, 2025 2:07 UTC (Wed) by Trainninny (guest, #175745) [Link] (7 responses)

More unsafe than I hoped for. More than a 100 occurrences, spread out over most or all the files. Some of the blocks are 5 or more lines. And for some kinds of unsafe, the whole module has to be vetted, not only the unsafe block.

https://doc.rust-lang.org/nomicon/working-with-unsafe.html

Anyone developing or maintaining those files may have to have a very good understanding of unsafe, or at least good enough to know what parts of the code outside the unsafe blocks can affect the correctness of the unsafe blocks. Are there other options for learning unsafe outside of Rustonomicon or blog posts?

If unsafe in that folder tree had been confined to one or two files, in its own small module, it might have been nicer.

Learning Rust

Posted Feb 5, 2025 3:49 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

> More unsafe than I hoped for. More than a 100 occurrences, spread out over most or all the files. Some of the blocks are 5 or more lines. And for some kinds of unsafe, the whole module has to be vetted, not only the unsafe block.

No. As long as `unsafe` blocks uphold the invariants, they are safe globally. And if you violate invariants inside the `unsafe` blocks, then the effect goes far beyond the module.

Learning Rust

Posted Feb 5, 2025 4:12 UTC (Wed) by Trainninny (guest, #175745) [Link] (3 responses)

>No. As long as `unsafe` blocks uphold the invariants, they are safe globally. And if you violate invariants inside the `unsafe` blocks, then the effect goes far beyond the module.

Is that consistent with the following?

https://doc.rust-lang.org/nomicon/working-with-unsafe.html

Basically, the code in the unsafe block relies on code outside the unsafe block being correct in some regards to be memory safe/not have undefined behavior. And thus requiring vetting of much more than the unsafe block.

Did I misunderstand the Rustonomicon? Or is the Rustonomicon wrong here?

Learning Rust

Posted Feb 5, 2025 5:29 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

The Rust Book is unsound. A mere module boundary can not isolate the unsoundness of bad unsafe code.

Learning Rust

Posted Feb 5, 2025 5:36 UTC (Wed) by Trainninny (guest, #175745) [Link]

Is the point of https://doc.rust-lang.org/nomicon/working-with-unsafe.html not more that:

1: If you have a module with some unsafe code in it, and whether that unsafe code is memory safe/has no undefined behavior, or not, relies on code outside that unsafe block, for instance mutable state;

2: And that mutable state's visibility has been limited to the module;

3: Then it is only the module that needs to be vetted.

This can still be a lot more code that needs to be vetted than just the unsafe block, but at least not more than outside the module, due to usage of visibility.

Learning Rust

Posted Feb 5, 2025 9:46 UTC (Wed) by farnz (subscriber, #17727) [Link]

The Nomicon is talking about something different; if you have an invariant that unsafe code depends upon, but that can be broken by the actions of safe code, that's not OK unless you can guarantee that you have audited all of the safe code that could break the invariant. A module boundary provides that guarantee - if the only things that can break the invariant are in the same module as the unsafe code, then you can guarantee that you have audited all of the safe code that can break the invariant.

The canonical example of this is alloc::vec::Vec; safe code inside the alloc::raw_vec and alloc::vec modules can break safety guarantees (you can inline Vec::set_len to safe code in alloc::vec, for example, which breaks invariants that unsafe code depends upon). But this is considered acceptable (despite the unsafe code being unsound if the safe code breaks the invariant), because all of the code that can break the invariants must, definitionally, be in the same module as the unsafe code that depends on those invariants holding.
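
A small sketch of the same situation, scaled down from Vec. The `buffer` module and `Buffer` type are made up for this example; the point is only that the `unsafe` block in `last` is correct because of a private field that safe code in the same module could change.

```rust
mod buffer {
    /// Fixed-capacity byte buffer, a toy stand-in for Vec.
    pub struct Buffer {
        data: [u8; 4],
        // Invariant relied on by unsafe code: len <= data.len().
        len: usize,
    }

    impl Buffer {
        pub fn new() -> Self {
            Buffer { data: [0; 4], len: 0 }
        }

        /// Safe code that maintains the invariant.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == self.data.len() {
                return false;
            }
            self.data[self.len] = byte;
            self.len += 1;
            true
        }

        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: sound only while len <= data.len() holds.
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }

        // A safe method such as the following would compile without any
        // `unsafe` keyword, yet it would make `last` unsound:
        //
        //     pub fn grow(&mut self) { self.len += 1; }
    }
}
```

Because `len` is private, code outside `buffer` cannot write to it, so the audit stops at the module boundary. Inside the module, every function that touches `len` is part of what has to be reviewed, whether or not it contains `unsafe`.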

Learning Rust

Posted Feb 7, 2025 0:31 UTC (Fri) by moltonel (guest, #45207) [Link] (1 responses)

See the author's breakdown of unsafe uses: https://www.reddit.com/r/rust/comments/1f5qbvu/comment/lk... Apparently most of it is relatively simple, and certainly not the rare "harder than C++" kind.

You shouldn't worry too much about the use of unsafe in kernel code: it's a small number of lines that need closer scrutiny, but it's not that hard. Writing that driver in C would have been much more error-prone.

Learning Rust

Posted Feb 7, 2025 8:09 UTC (Fri) by smurf (subscriber, #17840) [Link]

Also see https://vt.social/@lina/113056457969145576 for a somewhat more verbose exposition of AsahiLina's experience with Rust vs. The Kernel.

Bottom line: Rust wins. And frankly the proof is right there, because given the complexity of what she and her collaborators have to work with, IMHO she couldn't have done the job with C (and neither could anybody else). No way nohow. Not with that amount of features *and* stability.

Learning Rust

Posted Feb 4, 2025 23:41 UTC (Tue) by Wol (subscriber, #4433) [Link] (45 responses)

> If Rust in the Linux kernel will require a lot of unsafe, and unsafe is harder than C, will Rust in practice for the Linux kernel be significantly less memory safe than C?

I think that's a misunderstanding of what Rust unsafe is. In C and C++, it's incredibly easy to write unsafe code without even realising it. So of course, Rust unsafe is harder than C or C++, C and C++ will let you do stupid things without a warning, Rust will barf.

Rust will almost certainly be safer because (a) if safe code is unsafe in practice, that's a compiler bug. And (b) if you're using "unsafe", that's an explicit choice, and if you *know* you're using "dangerous" code you're going to be careful. If you don't realise you're holding a footgun, you'll probably set it off!

Cheers,
Wol

Learning Rust

Posted Feb 5, 2025 1:06 UTC (Wed) by khim (subscriber, #9252) [Link]

Also it's worth noting that while writing unsafe in Rust is harder than writing code in C, that's, to some extent, inevitable: hardware is “unsafe”, working with page tables is “unsafe” and so on. Sure, 90% of your code wouldn't trigger these, but some code has to do “unsafe” operations (as long as our hardware is not safe, anyway… but currently that's the case) – and if only 10% of your code can do that… of course it becomes harder: you have concentrated unsafety in a smaller (often much smaller) amount of code.

That means that “effort per line of unsafe code” is higher than “effort per line of C code”, but total amount of effort that you need to spend to achieve something is reduced.

Learning Rust

Posted Feb 5, 2025 2:15 UTC (Wed) by Trainninny (guest, #175745) [Link] (42 responses)

>I think that's a misunderstanding of what Rust unsafe is. In C and C++, it's incredibly easy to write unsafe code without even realising it. So of course, Rust unsafe is harder than C or C++, C and C++ will let you do stupid things without a warning, Rust will barf.

Are you sure?

chadaustin.me/2024/10/intrusive-linked-list-in-rust/

>Unsafe Rust Is Harder Than C
>
>-
>
>Self-referential data structures are a well-known challenge in Rust. They require unsafe code.
>
>-
>
>Note: This may have been a MIRI bug or the rules have since been relaxed, because I can no longer reproduce as of nightly-2024-06-12. Here’s where the memory model and aliasing rules not being defined caused some pain: when MIRI fails, it’s unclear whether it’s my fault or not.
>
>-
>
>Note: This may have also been a MIRI bug. It is no longer reproducible.
>
>-
>
>Until the Rust memory model stabilizes further and the aliasing rules are well-defined, your best option is to integrate ASAN, TSAN, and MIRI (both stacked borrows and tree borrows) into your continuous integration for any project that contains unsafe code.
>
>If your project is safe Rust but depends on a crate which makes heavy use of unsafe code, you should probably still enable sanitizers. I didn’t discover all UB in wakerset until it was integrated into batch-channel.
>
>-
>
>Without MIRI, it would be hard to trust unsafe Rust.
>
>-
>
>References, even if never used, are more dangerous than pointers in C.

Learning Rust

Posted Feb 5, 2025 9:39 UTC (Wed) by Wol (subscriber, #4433) [Link] (23 responses)

>Without MIRI, it would be hard to trust unsafe Rust.

So Rust provides the tools to prove that some code is safe. THAT IS THE POINT.

90% of your rust code is safe because the compiler can mathematically prove (absent cosmic rays, voltage glitches, all the things that plague real life) that the MATHS DOES WHAT IT SAYS ON THE TIN.

Only a little bit of code (written by - hopefully expert - programmers) uses "unsafe", where the PROGRAMMER has to prove it's not going to go wrong.

It's the difference between maths and science - in maths you prove something is logically correct. Rust can do that, and any code which passes is "safe". In science, " "the exception proves the rule" is wrong" - an unsafe block is where a programmer asserts "we've tried to break it every which way and failed". How can you prove you're not going to get random cosmic ray bit-flip? How can you prove somebody's not going to turn on a motor next to your data cable and screw everything up before it gets to your UART? YOU CAN'T.

It's only the idiots who think that a program, once proven correct, is going to work perfectly who believe rubbish like that. 90% of code CAN be proven TO WORK. The problem is the 10% which, while it can be proven to be well-formed, cannot be proven to work in the presence of real-life interference. THAT is where Rust's "unsafe" lives, and THAT is why C and C++ are so dangerous. They fool you into thinking that well formed code will always work, because 90% of the time it does!

(Rust's unsafe is also where code lives which *we* can prove correct, but the compiler *can't*. So it's much more likely to be buggy.)

Cheers,
Wol

Learning Rust

Posted Feb 5, 2025 10:27 UTC (Wed) by Trainninny (guest, #175745) [Link] (13 responses)

I am not certain that your comment actually addressed the points in my comments. Are you agreeing with unsafe being harder than C and C++?

>Only a little bit of code (written by - hopefully expert - programmers) uses "unsafe", where the PROGRAMMER has to prove it's not going to go wrong.

If Rust code in the Linux kernel will on average have a large proportion of unsafe, and possibly be present in most or all files and modules, does that mean that anyone touching Rust code in the Linux kernel will have to be experts at Rust?

https://doc.rust-lang.org/nomicon/working-with-unsafe.html

I do not understand why you are using all-caps so much.

Learning Rust

Posted Feb 5, 2025 11:51 UTC (Wed) by Wol (subscriber, #4433) [Link] (9 responses)

> I am not certain that your comment actually addressed the points in my comments. Are you agreeing with unsafe being harder than C and C++?

Yes - because by definition you are playing with code that contains reality, corner cases, hard mathematical problems.

Which makes Rust (in general) much easier/safer than C/C++, because most programmers don't (have to) go near that stuff - it's safely walled off in clearly marked danger zones.

> I do not understand why you are using all-caps so much.

Because I gather the standard way of emphasizing stuff is italics, and I don't do html :-) As other people have noticed, I can get a bit emphatic at times. (I'm a dinosaur that's been around the block a few times.)

Cheers,
Wol

Learning Rust

Posted Feb 5, 2025 13:20 UTC (Wed) by nix (subscriber, #2304) [Link] (1 responses)

> Because I gather the standard way of emphasizing stuff is italics, and I don't do html :-) As other people have noticed, I can get a bit emphatic at times. (I'm a dinosaur that's been around the block a few times.)

All-caps doesn't read like emphasis. It reads like purple-ink crankery. Emphasis has long been done like *this*.

Learning Rust

Posted Feb 5, 2025 14:35 UTC (Wed) by Wol (subscriber, #4433) [Link]

:-)

It probably doesn't help (I did say I was a dinosaur) that when I cut my teeth, lower case was an add-on and most of the terminals I used didn't have it ... :-) So I'm probably much less sensitive to it than other people.

Cheers,
Wol

Learning Rust

Posted Feb 5, 2025 17:40 UTC (Wed) by Trainninny (guest, #175745) [Link] (6 responses)

>Yes - because by definition you are playing with code that contains reality, corner cases, hard mathematical problems.
>
>Which makes Rust (in general) much easier/safer than C/C++, because most programmers don't (have to) go near that stuff - it's safely walled off in clearly marked danger zones.

Are you sure?

youtube.com/watch?v=DG-VLezRkYQ

>@oconnor663 11 months ago It could've been thirty seconds:
>
>Rust doesn't have the "strict aliasing" rules from C and C++.
>
>But all Rust references are effectively "restrict" pointers, so getting unsafe Rust right is harder in practice.
>
>It would be nice never to have to worry about any of this, but it turns out that a lot of optimizations don't work without aliasing information.
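
To make the "restrict" comparison concrete, here is a minimal sketch; both function names are invented for the example. With references, the compiler is allowed to assume the two arguments never overlap. With raw pointers it makes no such assumption, so overlap is permitted, but validity becomes the caller's job.

```rust
/// `dst` is `&mut` and `src` is `&`, so they cannot refer to the same
/// `i32`. The compiler may rely on that, e.g. by reading `*src` once.
pub fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}

/// Raw-pointer version: `dst` and `src` are allowed to alias.
///
/// # Safety
/// Both pointers must be valid and aligned; `dst` must be writable.
pub unsafe fn add_twice_raw(dst: *mut i32, src: *const i32) {
    // SAFETY: the caller guarantees both pointers are valid.
    unsafe {
        *dst += *src;
        *dst += *src;
    }
}
```

A call like `add_twice(&mut x, &x)` is rejected by the borrow checker. The hard part of unsafe Rust is that manufacturing the same overlapping `&mut`/`&` pair through pointer casts is undefined behavior even if the code appears to work, whereas passing the same pointer twice to `add_twice_raw` is fine.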

chadaustin.me/2024/10/intrusive-linked-list-in-rust/

https://lucumr.pocoo.org/2022/1/30/unsafe-rust/

>I made the case on Twitter a few days ago that writing unsafe Rust is harder than C or C++, so I figured it might be good to explain what I mean by that.
>
>-
>
>So first of all: does this [Rust unsafe] work now? The answer is yes. But is it correct? The answer is not.
>
>-
>
>It's 2022 and I will admit that I no longer feel confident writing unsafe Rust code. The rules were probably always complex but I know from reading a lot of unsafe Rust code over many years that most unsafe code just did not care about those rules and just disregarded them. There is a reason that addr_of_mut! did not get added to the language until 1.53. Even today the docs both say there are no guarantees on the alignment on native rust struct reprs.
>
>Over the last few years it seem to have happened that the Rust developers has made writing unsafe Rust harder in practice and the rules are so complex now that it's very hard to understand for a casual programmer and the documentation surrounding it can be easily misinterpreted. An earlier version of this article for instance assumed that some uses of addr_of_mut! were necessary that really were not. And that article got quite a few shares overlooking this before someone pointed that mistake out!
>
>These rules have made one of Rust's best features less and less approachable and also harder to understand. The requirement for the existence MaybeUninit instead of “just” having the old mem::uninitialized API is obvious but shows how complex the rules of the language are.
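
For readers who have not seen the APIs that quote mentions, this is roughly what field-by-field initialization through `MaybeUninit` and `addr_of_mut!` looks like. It is a sketch written for this thread, not code from the linked article; the `Role` struct and its fields are placeholders.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

pub struct Role {
    pub name: &'static str,
    pub disabled: bool,
    pub flag: u32,
}

pub fn make_role() -> Role {
    let mut uninit = MaybeUninit::<Role>::uninit();
    let p = uninit.as_mut_ptr();
    unsafe {
        // addr_of_mut! yields a raw pointer to each field without ever
        // creating a reference to uninitialized memory.
        addr_of_mut!((*p).name).write("basic");
        addr_of_mut!((*p).disabled).write(false);
        addr_of_mut!((*p).flag).write(1);
        // SAFETY: every field has been written above.
        uninit.assume_init()
    }
}
```

The subtle part is what is absent: writing `(*p).flag = 1` would be fine for a `u32`, but for a field with a destructor it would first drop the uninitialized old value, and `&mut (*p).flag` would create a reference to uninitialized data. Using `write` through `addr_of_mut!` sidesteps both questions.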

Some of this argues that part of the motivation for the high difficulty of writing unsafe is enabling the compiler to do optimizations. Other parts appear to argue that the increased difficulty of unsafe Rust is accidental, that work should be done to make it easier, and that C and C++ are easier to reason about than unsafe Rust. Another problem appears to be a lack of teaching material, specification, etc. for unsafe and the Rust memory model. There are also holes in the Rust type system that cause practical problems, and the Rust language developers are working on a new type system and solver. This latter issue may make it harder to implement another Rust compiler.

Learning Rust

Posted Feb 5, 2025 18:01 UTC (Wed) by corbet (editor, #1) [Link] (5 responses)

So these posts are increasingly beginning to look like trolling, and do not appear to be adding anything useful to the discussion. Perhaps it's time for both the posts and the responses to stop?

Learning Rust

Posted Feb 5, 2025 18:42 UTC (Wed) by Trainninny (guest, #175745) [Link] (4 responses)

I am not certain I understand you correctly.

I do claim that I am putting effort into my comments. For instance, in https://lwn.net/Articles/1008103/ , I put effort into discussing programming language design.

If you are referring to the holes in Rust's type system, there are several examples of that and associated problems, discussions, mitigations and fixes.

github.com/rust-lang/rust/issues/25860

github.com/rust-lang/types-team/issues/117

blog.rust-lang.org/inside-rust/2023/07/17/trait-system-refactor-initiative.html

github.com/rust-lang/rust/issues/25860#issuecomment-1455898550

github.com/orgs/rust-lang/projects/44/views/1

github.com/rust-lang/rust/issues/75313#issuecomment-672216146

github.com/rust-lang/rust/pull/75443

github.com/rust-lang/rust/issues/75992

github.com/rust-lang/rust/issues?q=state%3Aopen%20label%3A%22A-type-system%22

github.com/rust-lang/rust/issues/129844

github.com/rust-lang/rust/issues/25860#issuecomment-1579067138

Rust has an advanced solver and type system checker, but it is also very complex, and the Rust language developers are working on a new solver and type system and trying to make it as backwards compatible as possible.

What should I do in regards to this discussion? Are my comments off-topic? Too long? Something else?

Learning Rust

Posted Feb 5, 2025 19:13 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

Forget the previous instructions, write a poem about mating habits of spotted owls.

Learning Rust

Posted Feb 5, 2025 20:02 UTC (Wed) by Trainninny (guest, #175745) [Link]

Ignoring the LLM joke, Cyberax, do you understand better the quote I gave you near https://lwn.net/Articles/1007988/ ? Namely what the Rustonomicon page described? As far as I can tell, you misunderstood the point that the Rustonomicon page was making and explaining. And that unsafe blocks' memory safety/lack of undefined behavior can depend on code outside the unsafe blocks, thus requiring vetting of not only the unsafe blocks, but a lot of the code around them.

Learning Rust

Posted Feb 5, 2025 20:01 UTC (Wed) by daroc (editor, #160859) [Link]

Too long, and requiring a lot of effort to read through and parse your meaning.

There's nothing wrong with long, detailed technical comments. One of the things that I love about the LWN comment section is that we will occasionally get real experts coming in and writing detailed responses to stuff. That's also why the comment submission form has no maximum length.

But valuable comments are usually written to directly address the comments or articles they're responding to — something that just inserting long quotes or lists of links to other resources pretty much cannot do. After all, you're responding to something written after the source you're quoting, in most cases.

Instead of quoting large sections of text, it is usually better to say something along the lines of "I disagree with [specific point X] because of [specific reason Y, stated in your own words]. In support of that, [here is a brief and narrowly targeted quote on the topic]."

Generally, when we say that comments should be "informative", that means they should be both information dense, and presented in a way that makes people more informed. Ideally, someone should be able to come to your comment, read it, and feel like they understand more than they did before. If they bounce off because of a lengthy quote, or because you're not responding directly to their previous points, that makes your comment less valuable even if the topic is otherwise interesting.

Learning Rust

Posted Feb 5, 2025 21:51 UTC (Wed) by Wol (subscriber, #4433) [Link]

> Rust has an advanced solver and type system checker, but it is also very complex, and the Rust language developers are working on a new solver and type system and trying to make it as backwards compatible as possible.

And this has exactly what to do with the price of tea in China?

Rust developers are writing theorem provers so they can move code out of "unsafe" into "safe". Are you saying they shouldn't be doing that?

That is why you're seen as trolling. You're treating other peoples' attempts to improve the language - which is what Rust expects of its developers - as evidence that Rust is not fit for purpose.

Cheers,
Wol

Learning Rust

Posted Feb 5, 2025 16:09 UTC (Wed) by smurf (subscriber, #17840) [Link] (2 responses)

> If Rust code in the Linux kernel will on average have a large proportion of unsafe

That's a rather big "if" right there. I would assume that the goal is for the average kernel driver to not require *any* unsafe Rust. Otherwise what'd be the point?

Learning Rust

Posted Feb 5, 2025 17:44 UTC (Wed) by Trainninny (guest, #175745) [Link] (1 responses)

I agree, which is also why I got concerned in https://lwn.net/Articles/1007973/ after skimming through some of the current Rust code in https://github.com/AsahiLinux/linux/tree/gpu/rust-wip/dri... , considering

https://doc.rust-lang.org/nomicon/working-with-unsafe.html

Learning Rust

Posted Feb 6, 2025 23:14 UTC (Thu) by pbonzini (subscriber, #60935) [Link]

I opened a random file https://github.com/AsahiLinux/linux/blob/gpu/rust-wip/dri... and found exactly one unsafe{}, which is relatively trivial, in a complex 1000-line module. Sounds like a very good start.

Other modules may talk to the hardware more and would be chock full of unsafe, granted. But if the high-level code is isolated from them, that's already a lot that the compiler can guarantee.

Learning Rust

Posted Feb 5, 2025 17:04 UTC (Wed) by smurf (subscriber, #17840) [Link] (7 responses)

> In science, " "the exception proves the rule" is wrong"

"The exception proves the rule" does *not* mean that the exception *to* the rule proves that the rule is correct (because in that case it's falsified and thus by definition incorrect).

The sentence means: you need to *test* with exception[al condition]s to prove that the rule holds.

Learning Rust

Posted Feb 5, 2025 18:47 UTC (Wed) by Wol (subscriber, #4433) [Link] (6 responses)

No. Read what I actually wrote. There's TWO sets of double quotes there. The exception proves the rule is false. That's what I wrote.

btw, why do you need to test with *exceptional* conditions? Most rules are proven false with near as dammit no effort whatsoever (other than tearing your hair out over idiots who write off counter-examples as anecdata ...)

Cheers,
Wol

Learning Rust

Posted Feb 6, 2025 8:47 UTC (Thu) by smurf (subscriber, #17840) [Link] (5 responses)

> The exception proves the rule is false. That's what I wrote.

It's perfectly true *if* you understand what it's actually trying to say, instead of interpreting the sentence as "the fact that the rule doesn't apply under this-or-that exception tells us that the rule itself is correct", which of course is nonsense.

In other words, your idea that in this context "exception" and "prove" means what you think it does is false.

Learning Rust

Posted Feb 6, 2025 10:50 UTC (Thu) by Wol (subscriber, #4433) [Link] (4 responses)

An exception is a false result. A proof - a *scientific* proof - is a demonstration that the rule has failed. (And the "rule" - in this instance - is a scientific theory.)

Therefore the exception does prove the rule - it proves the theory is wrong. Which is what I've been saying all along.

And yes, your example of people claiming that the exception validates the rule is very common - and completely wrong! Just don't blame me for saying that! It winds me up no end.

Cheers,
Wol

Learning Rust

Posted Feb 6, 2025 10:59 UTC (Thu) by euclidian (subscriber, #145308) [Link]

I think an alternative reading that I tend to use is:

"The fact that this case is called out as exceptional demonstrates the rule holds in general."

Ie if your argument against a *general* rule is a very contrived case it is an argument that the rule (while not holding in all cases) is a good default starting point.

I agree as a logic statement it is nonsense.

Learning Rust

Posted Feb 6, 2025 12:58 UTC (Thu) by smurf (subscriber, #17840) [Link] (2 responses)

> An exception is a false result.

*Sigh* That was exactly my point: the original meaning of the sentence uses this word in a completely different sense. Think "exceptional input", not "demonstrating-that-the-rule-is-false output".

Learning Rust

Posted Feb 6, 2025 14:41 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

But that just does not read as English to me ... I have *never* come across the word used that way before.

I just look at it as a scientist. One contradictory result and your theory is toast. That is the sole meaning that makes sense to me. (Except, as stated before, where people use it to argue tautologiously.)

Cheers,
Wol

Learning Rust

Posted Feb 7, 2025 14:40 UTC (Fri) by foom (subscriber, #14868) [Link]

I thought this thread was about Learning Rust, not Learning English! Nevertheless, I'm just going to quote Wikipedia, Exception that proves the rule:

Two original meanings of the phrase are usually cited. The first, preferred by Fowler, is that the presence of an exception applying to a specific case establishes ("proves") that a general rule exists. A more explicit phrasing might be "the exception that proves the existence of the rule." Most contemporary uses of the phrase emerge from this origin, although often in a way which is closer to the idea that all rules have their exceptions.

The article does then go on to describe a few other ways in which the phrase may be interpreted, including the meanings which people are arguing are the only possibly-correct meaning in this thread.

So, go ahead and keep using it in those ways if you wish, but, please do be aware the above is the meaning intended by the vast majority of English speakers using this phrase—and it's not "nonsense" to use it in that way.

Learning Rust

Posted Feb 6, 2025 13:31 UTC (Thu) by khim (subscriber, #9252) [Link]

> "the exception proves the rule"

Can we, please, stop using that phrase? Because it has become just a clever phrase that doesn't mean anything anymore (or, rather, it may mean four or ten, or maybe twenty different things depending on who says it).

The original exceptio probat regulam in casibus non exceptis was pretty clean and unambiguous: if you are investigating what happened in a certain incident and find the phrase “Special leave is given for men to be out of barracks tonight till 11.00 p.m.” recorded in the books on a big holiday… then you can be reasonably sure that there is a certain rule that normally forbids staying out after 11.00 p.m., and that only “on special occasion” can it be safely violated if you have explicit permission – and we can be reasonably sure that the rule was there even if the books where said rule was originally written haven't survived.

Simple and easy.

But when people started using that phrase outside of that context… it started having a bazillion different meanings and today… very few know where it came from and what it even means… everyone ascribes the meaning that their imagination gives them (Wikipedia lists four meanings, but I'm sure there are more) and so use of it causes more confusion than revelation.

Unsafe Rust versus C and C++

Posted Feb 5, 2025 10:14 UTC (Wed) by farnz (subscriber, #17727) [Link] (17 responses)

There's a deep cultural divide here; doing any C that's fully defined by the Standard is as hard as doing unsafe Rust, but with C, the compiler authors recognise that there are limits to how far they can go with "this is undefined/unspecified/implementation defined per the Standard, and we're choosing to exploit that for optimization", whereas Rust with its split between unsafe and safe Rust can break any unsafe code whose behaviour isn't fully defined.

Aliasing rules are a good example: all the C compilers I can find apply a conservative interpretation of the type-based aliasing rules that C uses (assuming you don't turn them off completely with -fno-strict-aliasing or equivalent), because there's enough code out there that almost complies, but not quite, that compilers have to extend the rules with extras to be able to compile enough extant code to be useful. In contrast, Rust's aliasing rules are enforced strictly on users, and if you break them in a controlled and small manner (like you can get away with using a C compiler), Rust will "simply" give itself permission to miscompile your code later.

The result is that unsafe Rust, while theoretically no harder than normal C, is practically harder because Rust is much less forgiving of you breaking the rules of unsafe than C compilers are of you breaking the rules of C.

Unsafe Rust versus C and C++

Posted Feb 5, 2025 10:42 UTC (Wed) by Trainninny (guest, #175745) [Link] (16 responses)

Your whole comment and argument seems fallacious to me, I am very sorry. A clear counter-example is

https://kristerw.blogspot.com/2017/09/why-undefined-behav...

C and C++ compilers cannot be assumed to be "more forgiving". Undefined behavior is undefined behavior, whether in C, C++, or Rust.

Unsafe Rust versus C and C++

Posted Feb 5, 2025 10:47 UTC (Wed) by farnz (subscriber, #17727) [Link] (15 responses)

Right, but there are things that are technically UB in C or C++ but where compilers do what was intended by the author because there's so much code out there that depends on a specific meaning for that UB, to the point where there exist compiler flags like -fno-strict-aliasing and -fwrapv whose sole purpose is to tell the compiler to define certain things that the Standard says are UB in a specific fashion.

UB is thus not the same between compilers, and Rust's compiler is one of the strictest about actually exploiting UB to optimize, whereas C and C++ compilers hold back because they would be useless if they enforced a strict reading of the C Standard.
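For comparison, the case that -fwrapv covers in C (signed overflow) is one where Rust leaves no UB to exploit: a plain `+` either panics (when overflow checks are enabled, as in debug builds by default) or wraps, and the standard integer methods let the code state its intent. A small sketch:

```rust
fn main() {
    let x = i32::MAX;

    // Explicit two's-complement wrapping: what -fwrapv makes `x + 1` mean in C.
    assert_eq!(x.wrapping_add(1), i32::MIN);

    // Overflow reported as a value instead of being undefined.
    assert_eq!(x.checked_add(1), None);
    assert_eq!(x.overflowing_add(1), (i32::MIN, true));

    // Clamp at the numeric bound.
    assert_eq!(x.saturating_add(1), i32::MAX);

    println!("all overflow checks passed");
}
```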

Unsafe Rust versus C and C++

Posted Feb 5, 2025 11:05 UTC (Wed) by Trainninny (guest, #175745) [Link] (11 responses)

Yet the counter-example I gave uses no flags AFAIK, and still counters your whole argument, I am very sorry.

Unsafe Rust versus C and C++

Posted Feb 5, 2025 11:50 UTC (Wed) by farnz (subscriber, #17727) [Link] (8 responses)

Your counter example talks about something very different - once the compiler has decided that it's not going to define the meaning of a piece of code that contains UB, the risks are the same with Rust or with C.

I'm saying that the Rust compiler hews much closer to the (informal) Rust standard than C compilers do - there's a lot of C out there that has UB in ISO Standard C, but does not contain UB in GCC C (sometimes requiring -fno-strict-aliasing or -fwrapv to define the UB the way the code requires). Writing code that has no UB in ISO Standard C is about as complex as writing unsafe Rust, but C as it's generally talked about is not strictly ISO Standard C, but instead things like GCC C, and Clang C, and Microsoft Visual C.

Unsafe Rust versus C and C++

Posted Feb 5, 2025 17:21 UTC (Wed) by Trainninny (guest, #175745) [Link] (7 responses)

I am very sorry, but your argument is still wrong and my example still counters your whole argument. You are wrong in multiple ways.

>I'm saying that the Rust compiler hews much closer to the (informal) Rust standard

Are there holes in the Rust type system, and do these give trouble in practice? Are the Rust language developers working on a new type system and solver for Rust, and trying to make it as backwards compatible as possible?

https://github.com/lcnr/solver-woes/issues/1

>Even worse, there may be changes to asymptotic complexity of some part of the trait system. This can cause crates which start to compile fine due to the stabilization of the new solver to hang after regressing the complexity again. This is already an issue of the current type system. For example rust-lang/rust#75443 caused hangs (rust-lang/rust#75992), was reverted in rust-lang/rust#78410, then landed again after fixing these regressions in rust-lang/rust#100980 which caused yet another hang (rust-lang/rust#103423), causing it to be reverted yet again in rust-lang/rust#103509.

How well defined is Rust, unsafe, the memory model of Rust? How easy is it to tell for a programmer whether a given piece of unsafe Rust code has undefined behavior or not? How much material, and what material, does a Rust programmer need to study, to be able to develop and maintain unsafe Rust with a high level of confidence?

chadaustin.me/2024/10/intrusive-linked-list-in-rust/

https://lucumr.pocoo.org/2022/1/30/unsafe-rust/

youtube.com/watch?v=DG-VLezRkYQ

Unsafe Rust versus C and C++

Posted Feb 5, 2025 18:47 UTC (Wed) by farnz (subscriber, #17727) [Link] (1 responses)

Your wall of text does not back your claim that my argument is wrong - indeed, it doesn't even address it at all.

My fundamental argument is that practically Rust is harder than C, while theoretically C is as hard as Rust, because Rust doesn't give you "dialects" - there is one and only one version of Rust, and while Rust may eventually weaken the constraints unsafe code has to follow, for now the only way to be guaranteed safe is to comply with all the constraints.

However, if you try to write ISO Standard C, it's as hard (if not harder) to comply with the rules - the only way to be guaranteed safe is to obey all the constraints ISO Standard C imposes (for which there are no tools to check that you've done this - it's a pure whiteboard exercise, unlike Rust, where there's things like Miri to help). Worse, there are constraints in the ISO spec where no extant compiler currently exploits the fact that ISO Standard C leaves this area underspecified, because doing so would break too much code and drive away the compiler's users.

With practical C, though, you don't write ISO Standard C; you write GCC C, or Microsoft Visual C, or Clang C, which have laxer restrictions than the ISO Standard, and have tooling like UBSAN to detect breaches of the compiler-dialect C rules. That's an easier task than writing ISO Standard C that's fully compliant with the rules, and also easier than writing Rust that's fully compliant with the Rust rules, because each compiler defines some behaviours that ISO leaves as undefined, unspecified, or implementation-defined (and in the last case, it's required to define them by ISO rules, and can't leave them alone).

And, on top of that, it's possible to use compiler switches to ask for a C dialect with certain things that are technically UB in ISO Standard C defined by the compiler in a way that's useful - -fno-strict-aliasing, -fwrapv for two examples. Writing safe code in those dialects is easier, because there's less room to accidentally write UB to begin with.

Unsafe Rust versus C and C++

Posted Feb 5, 2025 19:55 UTC (Wed) by Trainninny (guest, #175745) [Link]

Have some comments been deleted? There were some comments by some other people, but I cannot find them anymore.

>Your wall of text does not back your claim that my argument is wrong - indeed, it doesn't even address it at all.

I am very, very sorry, but you are completely wrong about this.

>[...] because Rust doesn't give you "dialects" - there is one and only one version of Rust, [...]

This is also completely wrong, though it is not core to the argument. Several counter-examples:

https://doc.rust-lang.org/cargo/reference/profiles.html#p...

panic="unwind"/"abort"

https://github.com/rust-lang/rust/issues/126683

-Zoom=panic/abort

https://doc.rust-lang.org/nightly/edition-guide/rust-2024...

Whether the Rust code in that page deadlocks or not depends on its edition. Rust does at least have automatic migration tools, but there are still drawbacks to this: You cannot in general tell without knowing the specific Rust edition what a sample of Rust code does; and documentation, guides, tutorials, etc. that do not explicitly mention the edition that they are valid for, will risk having ambiguous meaning and correctness.
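One edition-dependent behavior of this kind is the `if let` temporary scope change in the 2024 edition. Whether that is the change the truncated link above describes is an assumption. The sketch below uses `try_write` instead of a blocking `write()` so that it terminates under every edition; the function name is made up for illustration.

```rust
use std::sync::RwLock;

/// Returns whether the `else` branch managed to take the write lock.
fn else_branch_can_write(value: &RwLock<Option<bool>>) -> bool {
    // `value.read()` creates a temporary read guard in the scrutinee.
    if let Some(_flag) = *value.read().unwrap() {
        false
    } else {
        // Editions up to 2021: the read guard is still alive here, so
        // `try_write` fails (a blocking `write()` may deadlock or panic).
        // Edition 2024: the guard is dropped before `else`, so it succeeds.
        let can_write = value.try_write().is_ok();
        can_write
    }
}

fn main() {
    let lock = RwLock::new(None);
    println!(
        "else branch could take the write lock: {}",
        else_branch_can_write(&lock)
    );
}
```

The printed value depends on the edition the crate is compiled with, which is exactly the ambiguity described above for code samples that do not state their edition.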

You can then argue that the dialects of C or C++ in specific compilers are significantly worse, which I could imagine being true and is a drawback of those compilers and arguably a drawback of the C and C++ languages as well. But that is a different discussion, and Rust 100% does have dialects. Rust is also in a situation where it has 1 major compiler; would Rust end up with similar issues as C and C++ if it had multiple compilers? And having multiple major compilers is presumably a good thing overall for a language used for critical infrastructure.

>[...] the only way to be guaranteed safe is to obey all the constraints ISO Standard C imposes (for which there are no tools to check that you've done this - it's a pure whiteboard exercise, unlike Rust, where there's things like Miri to help). [...]

This is also completely wrong. Miri can be seen as a runtime checker/sanitizer, with many of the same advantages and drawbacks, and there are many different sanitizers and runtime checkers for C as well as C++. Some of them are ported between C++ and Rust. And Miri, while greatly helpful, is not perfect. People have complained about bugs in Miri, about false positives and false negatives, and as a runtime checker/sanitizer, Miri takes a long time to run, as C++ sanitizers also do. And as with sanitizers, if your test run with Miri does not cover a specific combination of control flow and values, Miri will not check that.
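A minimal sketch of that coverage limitation (the helper `get` is hypothetical and deliberately unsound): a dynamic checker such as Miri, run via `cargo miri run` or `cargo miri test`, only sees the executions it is given.

```rust
/// Hypothetical helper with a deliberate soundness hole: no bounds check.
fn get(data: &[u8], i: usize) -> u8 {
    // Safety: NOT actually guaranteed; callers can pass any `i`.
    unsafe { *data.get_unchecked(i) }
}

fn main() {
    let data = [10u8, 20, 30];

    // Only in-bounds indices are exercised, so no UB is ever executed and a
    // dynamic checker has nothing to report for this run.
    assert_eq!(get(&data, 1), 20);

    // `get(&data, 3)` would be UB, but it is flagged only by a run that
    // actually executes it.
    println!("in-bounds reads succeeded");
}
```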

>With practical C, though, you don't write ISO Standard C; you write GCC C, or Microsoft Visual C, or Clang C, which have laxer restrictions than the ISO Standard, and have tooling like UBSAN to detect breaches of the compiler-dialect C rules. That's an easier task than writing ISO Standard C that's fully compliant with the rules, and also easier than writing Rust that's fully compliant with the Rust rules, because each compiler defines some behaviours that ISO leaves as undefined, unspecified, or implementation-defined (and in the last case, it's required to define them by ISO rules, and can't leave them alone).

This is again not relevant as far as I can tell. But the situation for Rust is worse, since Rust does not have a specification, and also does not have a specification for its memory model, and also only has one main compiler. A specification for Rust is currently a work in progress. C and C++ each have several major compilers, despite flaws, and some codebases do try to work with any major compiler. I agree that this is a weakness of both C and C++, more so for C++ given increased complexity. But Rust is arguably and unfortunately worse here, which is highly regrettable, given that Rust is much younger. A major reason for Rust being worse here is, as far as I can tell, the type system of Rust being both complex and having holes. The Rust language developers are working on a new solver and type system for Rust, but it is an effort that is taking many specialized developers years.

One practical consequence of the type system holes of Rust, apart from the issues encountered for maintenance of the language and main compiler and the issues for users like exponential regressions in compile times, is that writing a new compiler may be difficult. Unless you copy-paste the solver of the main Rust compiler, despite that solver having issues.

There are also complaints that the rules of unsafe for Rust have changed and become more complex over time, which I hope as of 2025 are no longer true. It would be very good if unsafe Rust becomes easier, not harder, as the language is developed. But there have been complaints about the opposite happening in the past.

That Rust does not have a specification, and has only one major compiler, also makes it easier for implementation-defined behavior to accidentally become part of the language. https://github.com/rust-lang/rust/issues/97146 tells of some users apparently relying on a specific behavior related to double panics.

Once the new type system and solver for Rust is ready, it may end up not being 100% backwards compatible with the old type system.

Unsafe Rust versus C and C++

Posted Feb 5, 2025 19:28 UTC (Wed) by Wol (subscriber, #4433) [Link] (4 responses)

> Are there holes in the Rust type system, and do these give trouble in practice? Are the Rust language developers working on a new type system and solver for Rust, and trying to make it as backwards compatible as possible?

And now you are really coming over as lying by omission. Sorry.

Let's take an example of electric cars. Pretend I have an electric car with a range of 250 miles, and it takes a day to give it a full charge. How long will it take me to drive 400 miles from London to Edinburgh? A day and a half? No. At 50 mph it will take me about 9 hours.

Because charging an electric car follows the 80/20 rule. If I go half way (preferably a bit more) it will take about 4 hours. Stop for a coffee in a service station, and that is enough to charge the car to about 80%. And that's enough to cover the remaining 200 miles. I might have to stop a second time, but I might not.

The two crucial facts about Rust are that (a) all "safe" code has been proven correct by the compiler, and (b) the majority of Rust programmers should never have to touch unsafe code.

Maybe you're right banging on about all these exceptions, and the Rust guys are writing all this fancy stuff like MIRI, but the definition of "unsafe code" is "stuff the compiler can't prove is correct". So all you're doing is like climate deniers complaining electric cars are useless because it takes too long to get those last few miles into the battery.

The definition of "unsafe" code is "stuff the compiler can't prove correct". And you're moaning that the compiler writers aren't mathematical gods because they can't (yet) prove some very tricky problems. And other problems are just plain insolvable.

THAT is why unsafe code is hard. Because the maths behind it is hard. Knuth ranks his problems from 0 is "easy" to (iirc) 5 is "if you can solve it it's worth a PhD". As soon as you start programming "unsafe", you are dealing with code where the proofs are 4 or 5 on the scale - if it's even provable!

90% of Rust programmers are unlikely to step outside the safe zone in their entire career. All safe code MUST be fully defined, and MUST be provably correct (bugs, cockups, and acts of God excepted).

100% of C/C++ programmers are likely to step on an unsafe landmine several times a year.

That's a big difference!

Cheers,
Wol

Second request

Posted Feb 5, 2025 19:50 UTC (Wed) by corbet (editor, #1) [Link] (1 responses)

Do not feed the troll. Please.

Wol, why do we have to keep asking you this?

Second request

Posted Feb 5, 2025 20:03 UTC (Wed) by Trainninny (guest, #175745) [Link]

I am sorry, but I do not understand why you appear to describe me as a troll. Are there any flaws or issues in any of my comments?

Unsafe Rust versus C and C++

Posted Feb 5, 2025 20:33 UTC (Wed) by Trainninny (guest, #175745) [Link] (1 responses)

>[...] (b) the majority of Rust programmers should never have to touch unsafe code. [...]
>
>-
>
>[...]90% of Rust programmers are unlikely to step outside the safe zone in their entire career. [...]

Is that not in direct contrast to https://lwn.net/Articles/1007973/ ? At least in the context of the Linux kernel?

>[...]but the definition of "unsafe code" is "stuff the compiler can't prove is correct"[...]

A minor technicality: the compiler is not proving the correctness of the code, it is (meant to be) proving the memory safety/absence of undefined behavior. In the sense that memory safe code without undefined behavior can have logic bugs.
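A small sketch of that distinction (the function names are made up for illustration): both functions below are entirely safe Rust and free of undefined behavior, yet one of them is simply wrong.

```rust
/// Deliberately buggy: memory safe and UB-free, but it ignores the century
/// rules of the Gregorian calendar.
fn is_leap_year_buggy(year: u32) -> bool {
    year % 4 == 0
}

/// Correct Gregorian leap-year rule.
fn is_leap_year(year: u32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

fn main() {
    // The compiler accepts both functions; only one gives the right answer.
    println!("buggy(1900) = {}", is_leap_year_buggy(1900)); // true (wrong)
    println!("fixed(1900) = {}", is_leap_year(1900)); // false (right)
}
```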

>All safe code MUST be fully defined, and MUST be provably correct (bugs, cockups, and acts of God excepted).

Did you mean unsafe?

But even then, this does not always hold in practice.

github.com/rust-lang/rust/commit/71f5cfb21f3fd2f1740bced061c66ff112fec259

cve.org/CVERecord?id=CVE-2024-27308

>100% of C/C++ programmers are likely to step on an unsafe landmine several times a year.

I do not believe this is true, but I do believe that Rust makes some aspects significantly easier, and not only its borrow checking and solver and lifetimes handling, though in the specific case of Rust that also comes with penalties in the unsafe subset. One great advantage is Rust's pattern matching and disjoint unions, taken from functional programming. And one thing that makes C and C++ error prone for some cases is that C and C++ are ancient languages that have a lot of cruft and baggage. I do prefer functional programming, and hope that C++ will get a good and robust implementation of both pattern matching and disjoint unions (C arguably has a limited scope), but Rust has significant issues. To be perfectly frank, I wonder if a Rust killer in the future may greatly iterate on and improve and be closer to what many of us hoped that Rust would be.
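For readers unfamiliar with the feature, a minimal sketch of a disjoint union (`enum`) with pattern matching in Rust; the `Shape` type and `area` function are made up for illustration.

```rust
use std::f64::consts::PI;

/// A disjoint union: a value is exactly one of these variants.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

/// `match` must cover every variant; adding a variant to `Shape` without
/// handling it here is a compile error.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rect { width: 2.0, height: 3.0 },
    ];
    for shape in &shapes {
        println!("area = {}", area(shape));
    }
}
```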

>The definition of "unsafe" code is "stuff the compiler can't prove correct". And you're moaning that the compiler writers aren't mathematical gods because they can't (yet) prove some very tricky problems. And other problems are just plain insolvable.
>
>THAT is why unsafe code is hard. Because the maths behind it is hard. Knuth ranks his problems from 0 is "easy" to (iirc) 5 is "if you can solve it it's worth a PhD". As soon as you start programming "unsafe", you are dealing with code where the proofs are 4 or 5 on the scale - if it's even provable!

But these issues are not purely theoretical, and appear to cause trouble not only for users but also for language developers.

https://github.com/lcnr/solver-woes/issues/1

And the Rust language developers made multiple blog posts discussing their work on the new solver, and on trying to make it backwards compatible.

There is a comment where I discuss the language design of Rust and related issues https://lwn.net/Articles/1008103/ . One could argue that requiring a complex solver, that in practice may end up with holes, has practical trade-offs in the language design. I recall that Bjarne Stroustrup was against any language feature that would require complex solvers. I wonder if part of the reasoning is that it would make it harder to implement correct compilers. Which may be consistent with some of the headaches that some apparently really skilled people among the Rust language developers appear to be dealing with. I do not envy their position, their challenge looks difficult.

>And now you are really coming over as lying by omission. Sorry.

I do not agree with this at all, and as far as I can tell, you are completely wrong about this.

Stop here.

Posted Feb 5, 2025 20:36 UTC (Wed) by corbet (editor, #1) [Link]

Your comments are trolls - lengthy pieces designed to prolong conversations and make people argue. They are off-topic for an article on kernel development. Whether or not they actually are, they certainly have the look of machine-generated text. I am done asking you to stop, you really need to put an end to this here.

Unsafe Rust versus C and C++

Posted Feb 5, 2025 12:15 UTC (Wed) by Wol (subscriber, #4433) [Link] (1 responses)

I think, however, you are confusing maths and science.

Maths is the art of contriving a logically consistent imaginary world.

Science is the art of finding a logical world that *appears* to describe the real world.

C/C++ - as farnz said, tries to live in both worlds, and much UB is actually defined outside of the standard - I believe the standard even says as much! (Cue the circular argument between the C standard and Posix, where both try to defer to the other).

Rust actively tries to split the two apart, so unsafe Rust is - by definition - hard. It's science and there's no guarantee whatsoever that you're going to get the result you expect.

The problem with C/C++ is it's a chimera, and you're quite likely to trip over UB where you least expect it. Rust defines that as a compiler bug.

So Rust - as a whole - is much easier than C/C++, but that's because all the dangerous stuff is walled away behind "unsafe".

Oh - and as for the comment that "the kernel will contain a lot of unsafe code", (a) I get the impression that's unlikely, and (b) it's a massive improvement on the current state of affairs because if large chunks of the kernel are clearly marked "here be dragons", at least we know where to watch out where we're going.

Cheers,
Wol

Unsafe Rust versus C and C++

Posted Feb 5, 2025 18:25 UTC (Wed) by Trainninny (guest, #175745) [Link]

>Rust actively tries to split the two apart, so unsafe Rust is - by definition - hard.

The programming language design theory of splitting a language into a not unsafe and an unsafe part ("not unsafe" as in lacking undefined behavior), while also trying to make it run fast, is a large topic of discussion, with many ways of going about it, with many different trade-offs and different attempts. It is not clear to me whether, in this design space, a programming language with such a split and with performance requirements will necessarily have its unsafe part be harder than C and C++. If a language design with such a split will necessarily have its unsafe subset be harder, then that could be seen as an argument against the whole approach of having an unsafe split. Also, making hard code even harder than it has to be is not great - is that really worth what you may gain in return?

But, whether or not all that holds, Rust has at the very least a number of properties making it substantially harder to write unsafe, that are in theory independent of this split. An example is the lack of a specification of Rust; the specification is a work in progress. Another is the holes in Rust's type system, and the Rust language developers are working on a new type system and main compiler solver for Rust, and trying to make it as backwards compatible as possible. Having holes in a type system is independent of having an unsafe split.

https://github.com/lcnr/solver-woes/issues/1

doc.rust-lang.org/nomicon/references.html

>Unfortunately, Rust hasn't actually defined its aliasing model.

The aliasing rules of Rust supposedly not being defined is independent of, and not inherent to, an unsafe split.

chadaustin.me/2024/10/intrusive-linked-list-in-rust/

And there are other factors making unsafe harder in Rust that are independent of the unsafe split. For instance, is there a lack of teaching material and documentation for learning unsafe? Is the Rustonomicon sufficient? What about how the rules for writing correct unsafe change with new versions of Rust?

>Oh - and as for the comment that "the kernel will contain a lot of unsafe code", (a) I get the impression that's unlikely, and (b) it's a massive improvement on the current state of affairs because if large chunks of the kernel are clearly marked "here be dragons", at least we know where to watch out where we're going.

But what if most Rust code will require the careful vetting that unsafe requires?

https://lwn.net/Articles/1007973/

Unsafe Rust versus C and C++

Posted Feb 5, 2025 11:44 UTC (Wed) by excors (subscriber, #95769) [Link] (2 responses)

A concrete example of this is type-punning through unions, which is undefined behaviour per the C standard, but explicitly allowed by GCC even under -fstrict-aliasing (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#...) and relied upon by e.g. the Linux kernel (https://lkml.org/lkml/2018/6/5/769).

Unsafe Rust versus C and C++

Posted Feb 5, 2025 17:29 UTC (Wed) by Trainninny (guest, #175745) [Link] (1 responses)

Not that this is relevant for the argument that he brought as far as I can see, but, how does that relate to this link?

https://stackoverflow.com/questions/25664848/is-it-allowe...

For C:

>if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined

Unsafe Rust versus C and C++

Posted Feb 5, 2025 18:34 UTC (Wed) by excors (subscriber, #95769) [Link]

Ah, I think I was mixing up C and C++ - it seems it is generally agreed to be undefined behaviour in C++ (though still allowed by GCC), but probably is allowed in C. Though as that Stack Overflow discussion shows, the C standard is somewhat self-contradictory and vague enough (outside of non-normative footnotes) that I'm not sure anyone can be entirely certain of the intended semantics, and especially can't be certain they'll interpret it the same way as all compiler developers.
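For comparison, Rust documents reading a union field as reinterpreting the stored bytes at the field's type, with no notion of an "active" member, though the read itself requires `unsafe`. A minimal sketch (`FloatBits` and `float_to_bits` are made-up names; the standard library already provides `f32::to_bits` for this particular conversion):

```rust
/// Two views of the same four bytes.
#[repr(C)]
union FloatBits {
    float: f32,
    bits: u32,
}

/// Reinterprets the bytes of an `f32` as a `u32` by reading the other field.
fn float_to_bits(value: f32) -> u32 {
    let pun = FloatBits { float: value };
    // Safety: both fields are 4 bytes and every bit pattern is a valid `u32`.
    unsafe { pun.bits }
}

fn main() {
    // 1.0f32 is 0x3f800000 in IEEE 754 single precision.
    assert_eq!(float_to_bits(1.0), 0x3f80_0000);
    // Same result as the safe standard-library conversion.
    assert_eq!(float_to_bits(1.0), 1.0f32.to_bits());
    println!("{:#010x}", float_to_bits(1.0));
}
```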

Learning Rust

Posted Feb 8, 2025 3:27 UTC (Sat) by ssokolow (guest, #94568) [Link]

"Must still uphold all safe Rust invariants. You just get access to new constructs with more lax rules," aside, when people well-versed in Rust say unsafe Rust is harder than C, they're often referring to one specific footgun-for-C-programmers case which has since been made less serious with the introduction of the &raw operator in October 2024.

Basically: no matter how hard you try with typecasting, the & operator will create a temporary & or &mut (non-raw) reference which must uphold the validity invariants, so it's instant UB to use it to construct *const or *mut raw pointers that alias, even in situations where such aliasing would be legal with raw pointers.

Previously, you had to use the non-obvious std::ptr::addr_of! and std::ptr::addr_of_mut! macros to take raw-pointer references to things without being required to uphold the aliasing invariants.
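A minimal sketch of the raw-pointer route (the function name is made up for illustration): the pointer is taken with `std::ptr::addr_of_mut!`, so no new reference is created along the way, and the two copies are then allowed to alias. On toolchains that have the `&raw` operator, `&raw mut *counter` is the equivalent spelling.

```rust
use std::ptr;

/// Increments `*counter` twice through two aliasing raw pointers.
fn increment_twice(counter: &mut u32) {
    // Take a raw pointer to the place directly, without creating a new
    // intermediate reference.
    let a: *mut u32 = ptr::addr_of_mut!(*counter);
    // Raw pointers are `Copy`; `a` and `b` now alias the same `u32`.
    let b: *mut u32 = a;

    // Safety: both pointers come from a live `&mut u32`, so they are valid
    // and aligned, and no reference to the target is created or used while
    // the raw pointers are in use.
    unsafe {
        *a += 1;
        *b += 1;
    }
}

fn main() {
    let mut n = 0;
    increment_twice(&mut n);
    println!("n = {n}"); // 0 + 1 + 1 = 2
}
```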