Rust heads into the kernel?

By Jake Edge
April 21, 2021

In a lengthy message to the linux-kernel mailing list, Miguel Ojeda "introduced" the Rust for Linux project. It was likely not the first time that most kernel developers had heard of the effort; there was an extensive discussion of the project at the 2020 Linux Plumbers Conference, for example. It has also been raised before on the list. Now, the project is looking for feedback from the kernel community about its plans, thus the RFC posting on April 14.

Adding Rust

Ojeda started by acknowledging that adding another implementation language to the kernel will be at least somewhat disruptive, so there need to be good reasons to do so. The kernel is already a highly complex body of code to understand and work with; adding a second language into the mix, with all of its complexities, only makes that worse. "Nevertheless, we believe that, even today, the advantages of using Rust outweighs the cost."

Those benefits mainly stem from the memory-safety features of the Rust language. The hope is that the number of bugs in the kernel can be reduced by eliminating these kinds of problems, at least for the pieces that get implemented in Rust. For now, those pieces are envisioned to be well away from the core kernel code:

Please note that the Rust support is intended to enable writing drivers and similar "leaf" modules in Rust, at least for the foreseeable future. In particular, we do not intend to rewrite the kernel core nor the major kernel subsystems (e.g. `kernel/`, `mm/`, `sched/`...). Instead, the Rust support is built on top of those.

There are lots of impedance mismatches between kernel C code and Rust that need to be handled—one way or another. In order to get the most benefit from Rust's safety features, the amount of kernel support code in unsafe blocks needs to be minimized—and carefully documented. One of the stated goals is that the documentation guidelines be automatically enforced:

By taking advantage of Rust tooling, we keep enforcing the documentation guidelines we have established so far in the project. For instance, we require having all public APIs, safety preconditions, `unsafe` blocks and type invariants documented.

The work so far has focused on making the building blocks and starting to implement wrappers for kernel APIs and abstractions, but there is lots more to do: "Covering the entire API surface of the kernel will take a long time to develop and mature." The RFC further acknowledges the task ahead, but notes that there is tooling available to help that process:

[...] modules written in Rust should never use the C kernel APIs directly. The whole point of using Rust in the kernel is that we develop safe abstractions so that modules are easier to reason about and, therefore, to review, refactor, etc.
Furthermore, the bindings to the C side of the kernel are generated on-the-fly via `bindgen` (an official Rust tool). Using it allows us to avoid the need to update the bindings on the Rust side.

In a Google security blog post, which led to a lengthy comment stream when posted here at LWN, one of the Rust for Linux maintainers, Wedson Almeida Filho, gave a detailed description of one of the example drivers that are part of Ojeda's RFC. It is a character device that implements a semaphore, mostly for demonstration purposes. The RFC also has a reimplementation of the Android Binder interprocess communication mechanism. While the latter is not yet complete, it gives a further look at what would be possible with Rust in the kernel:

At the moment we have nearly all generic kernel functionality needed by Binder neatly wrapped in safe Rust abstractions [...]
We also continue to make progress on our Binder prototype, implement additional abstractions, and smooth out some rough edges. This is an exciting time and a rare opportunity to potentially influence how the Linux kernel is developed, as well as inform the evolution of the Rust language.

The RFC notes that, currently, the Rust support adds a fair amount of code to the built kernel, but that there are plans to reduce that over time. The kernel size for "small x86_64 config we use in the CI" increased by around 4% with full Rust support. The Rust version of the semaphore driver is around 50% bigger than its C counterpart, while the Binder driver is roughly equivalent in size. However, "note that while the Rust version is not equivalent to the C original module yet, it is close enough to provide a rough estimation".

Reaction

Overall, the reception to the RFC was favorable, though there are some exceptions, and, of course, there are questions and concerns with the existing code. Linus Torvalds seemed to focus in on the BUG() calls in the support code and was predictably unhappy to see them. The kernel tries hard to continue even in the face of errors, but calling BUG() simply gives up and crashes the kernel with a backtrace. In one case, there are intrinsic operations ("panicking intrinsics") included in the Rust standard library that are not supported for the kernel; calling them effectively crashes the kernel by calling BUG(). Torvalds suggested making calls to those fail at build time; Ojeda agreed that would be a better approach, but also noted that more of the standard library will be removed over time, which may largely eliminate the problem.

Torvalds also pointed to the panic!() calls in the memory allocation code, which seemed "fundamentally wrong" to him:

If the Rust compiler ends up doing hidden allocations, and they then cause panics, then one of the main *points* of Rustification is entirely broken. That's 100% the opposite of being memory-safe at build time.
An allocation failure in some random driver must never ever be something that the compiler just turns into a panic. It must be something that is caught and handled synchronously and results in an ENOMEM error return.

Again, Ojeda agreed; he noted that there is work to do to adapt Rust's standard library for use by the kernel:

What happens here is that we use, for the moment, `alloc`, which is part of the Rust standard library. However, we will be customizing/rewriting `alloc` as needed to customize its types (things like `Box`, `Vec`, etc.) so that we can do things like pass allocation flags, ensure we always have fallible allocations, perhaps reuse some of the kernel data structures, etc.
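
As a rough illustration of the fallible-allocation pattern described here, the sketch below uses `Vec::try_reserve_exact` from today's standard library rather than the customized kernel `alloc`, which the RFC only describes as a plan. The `Errno` type, the `ENOMEM` constant, and the `zeroed_buffer` helper are hypothetical names chosen for the example.

```rust
/// Hypothetical stand-in for a kernel error code; not the project's real type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Errno(i32);

/// ENOMEM is 12 on Linux.
const ENOMEM: Errno = Errno(12);

/// Allocates a zero-filled buffer of `len` bytes, returning `ENOMEM`
/// to the caller instead of panicking when the allocation fails.
fn zeroed_buffer(len: usize) -> Result<Vec<u8>, Errno> {
    let mut buf: Vec<u8> = Vec::new();
    // `try_reserve_exact` reports failure as an `Err` rather than aborting.
    buf.try_reserve_exact(len).map_err(|_| ENOMEM)?;
    // The capacity is already reserved, so this does not allocate again.
    buf.resize(len, 0);
    Ok(buf)
}
```

A caller that asks for an impossible size, such as `zeroed_buffer(usize::MAX)`, gets `Err(ENOMEM)` back and can return it up the stack, which is the behavior Torvalds asked for.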

Using the Binder driver as an example was another area of concern for Torvalds; he would like to see an "example of a real piece of code that actually does something meaningful". Ojeda said there are plans to add a few drivers that talk to real hardware; Matthew Wilcox had an idea for where to start:

I'd suggest NVMe as a target. It's readily available, both as real hardware and in (eg) qemu. The spec is freely available, and most devices come pretty close to conforming to the spec until you start to push hard at the edges. Also then you can do performance tests and see where you might want to focus performance efforts.

Greg Kroah-Hartman agreed with Torvalds that Binder did not make a particularly good example driver, but was duly impressed with what the project had accomplished so far:

[...] this patchset is a great start that provides the core "here's how to build rust in the kernel build system", which was a non-trivial engineering effort. Hats off to them that "all" I had to do was successfully install the proper rust compiler on my system (not these developers fault), and then building the kernel code here did "just work". That's a major achievement.

He also thought that NVMe might make a good choice, but had other thoughts "for some of the basics that driver authors deal with on a daily basis (platform driver, gpio driver, pcspkr driver, /dev/zero replacement)". It would seem that the Rust for Linux project will be working on one or more of these kinds of "real" drivers before long.

While he reiterated the complaints he had for some of the individual patches, Torvalds said: "on the whole I don't hate it". On the other hand, though, Peter Zijlstra seemed to fundamentally object to the idea of adding a second implementation language to the kernel. The RFC noted that the kernel tooling has been focused on C, "including compiler plugins, sanitizers, Coccinelle, lockdep, sparse", but that tooling for Rust will "likely improve if Rust usage in the kernel grows over time". Zijlstra zeroed in on that and asked:

This; can we mercilessly break the .rs bits when refactoring? What happens the moment we cannot boot x86_64 without Rust crap on?
We can ignore this as a future problem, but I think it's only fair to discuss now. I really don't care for that future, and IMO adding this Rust or any other second language is a fail.

Perhaps unsurprisingly, he has strong opinions against the documentation format used by the project (Markdown in the code that gets converted to HTML). He was also unhappy with the code formatting used, which follows, at least for now, "Rust's idiomatic style", according to the RFC, but which he found "really *really* hard to read". Beyond those, he wondered about what memory model Rust follows and how it "aligns (or not)" with the Linux kernel memory model (LKMM).

Memory model

Rust currently uses the C11 memory model, Boqun Feng said, mostly because the LLVM compiler supports it by default, but there is interest in ensuring that its memory model works well with the kernel's. Right now, "there is no code requiring synchronization between C side and Rust side, so we are currently fine", but that will change eventually, so there are plans to put the right Rust and kernel people together to discuss the issue. Almeida noted that the plan is for most Rust code in the kernel to only need to be concerned with the Rust memory model:

We don't intend to directly expose C data structures to Rust code (outside the kernel crate). Instead, we intend to provide wrappers that expose safe interfaces even though the implementation may use unsafe blocks. So we expect the vast majority of Rust code to just care about the Rust memory model.
We admittedly don't have a huge number of wrappers yet, but we do have enough to implement most of Binder and so far it's been ok. We do intend to eventually cover other classes of drivers that may unveil unforeseen difficulties, we'll see.
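
The wrapper approach can be sketched in a few lines. This is only an illustration under stated assumptions: `raw_fill` is a hypothetical stand-in for a C-style function (a real binding would come from `bindgen`), and `fill` is a made-up safe wrapper, not part of the kernel crate. The comments follow the documentation conventions the RFC mentions, with a safety precondition on the unsafe function and a justification on each `unsafe` block.

```rust
/// Stand-in for a C-style function that takes a raw pointer and a length
/// (hypothetical; not a real kernel API).
///
/// # Safety
///
/// `buf` must be valid for writes of `len` bytes.
unsafe fn raw_fill(buf: *mut u8, len: usize, value: u8) {
    // SAFETY: the caller guarantees that `buf` is valid for `len` writes.
    unsafe { std::ptr::write_bytes(buf, value, len) };
}

/// Safe wrapper: the slice type carries the pointer and length together,
/// so callers cannot pass a mismatched pair.
fn fill(buf: &mut [u8], value: u8) {
    // SAFETY: a `&mut [u8]` is always valid for writes of `buf.len()` bytes.
    unsafe { raw_fill(buf.as_mut_ptr(), buf.len(), value) };
}
```

Code that only calls `fill` never writes `unsafe` itself; the one block that needs review lives inside the wrapper.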

Almeida disagreed with Zijlstra's characterization of HTML as being an invalid documentation format, writing that off as a personal preference. For the code formatting, he is not opposed to moving away from the Rust style if there are good reasons to do so, but found Zijlstra's criticism unconvincing: "'Not having parentheses around the if-clause expression is complete rubbish' doesn't sound like a good reason to me."

Al Viro tried to explain the aversion to HTML documentation, in characteristically blunt fashion, which sent things briefly off the rails. Zijlstra said that there is no real way to look at HTML documentation in ASCII; "Nothing beats a sane ASCII document with possibly, where really needed some ASCII art." He also explained why his seemingly arbitrary complaints about the formatting actually matter:

Of course it does; my internal lexer keeps screaming syntax error at me; how am I going to understand code when I can't sanely read it?
The more you make it look like (Kernel) C, the easier it is for us C people to actually read. My eyes have been reading C for almost 30 years by now, they have a lexer built in the optical nerve; reading something that looks vaguely like C but is definitely not C is an utterly painful experience.
You're asking to join us, not the other way around. I'm fine in a world without Rust.

Zijlstra also suggested that many of the Rust features being touted could be implemented in C. Almeida agreed that they could be, but said that Rust makes it impossible to mistakenly fail to use them, unlike C (at least without compiler changes):

In Rust, this isn't possible: the data protected by a lock is only accessible when the lock is locked. So developers cannot accidentally make mistakes of this kind. And since the enforcement happens at compile time, there is no runtime cost.
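
A minimal sketch of that property, using `std::sync::Mutex` from the standard library as a stand-in for whatever lock types the kernel crate provides (the `Counter` type is hypothetical):

```rust
use std::sync::Mutex;

/// A counter whose value can only be reached through the lock.
struct Counter {
    // The `u64` lives inside the `Mutex`; there is no field to touch
    // without first calling `lock()`.
    hits: Mutex<u64>,
}

impl Counter {
    fn new() -> Self {
        Counter { hits: Mutex::new(0) }
    }

    /// Increments the counter and returns the new value.
    fn record(&self) -> u64 {
        // `lock()` returns a guard; the data is accessed through the guard.
        let mut guard = self.hits.lock().unwrap();
        *guard += 1;
        *guard
        // The guard is dropped here, which releases the lock.
    }
}
```

There is no way to write `self.hits += 1` here; the type checker rejects any access that does not go through a guard.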

He also raised the problem of ownership in C: the language offers no enforced way to transfer an object's ownership, while it is straightforward to do in Rust:

In Rust, there is a clean idiomatic way of transferring ownership of a guard (or any other object) such that the previous owner cannot continue to use it after ownership is transferred. Again, this is enforced at compile time.
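
The following sketch shows one way a guard can be handed off, again with `std::sync::Mutex` standing in for a kernel lock; the function names are invented for the example.

```rust
use std::sync::{Mutex, MutexGuard};

/// Takes ownership of the guard; the lock is released when `guard` is
/// dropped at the end of this function.
fn push_and_unlock(mut guard: MutexGuard<'_, Vec<u32>>, value: u32) -> usize {
    guard.push(value);
    guard.len()
}

/// Locks `data`, hands the guard to `push_and_unlock`, and returns the
/// new length of the vector.
fn append(data: &Mutex<Vec<u32>>, value: u32) -> usize {
    let guard = data.lock().unwrap();
    let len = push_and_unlock(guard, value);
    // `guard` was moved above; uncommenting the next line is a compile error.
    // guard.push(0);
    len
}
```

Once `append` returns, the lock is free again, and no code path in `append` can touch the data after giving the guard away.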

But Zijlstra would rather see a C extension that supported ownership, instead of adding Rust to the kernel.

This would mean a far more aggressive push for newer C compilers than we've ever done before, but at least it would all still be a single language. Conversion to the new stuff can be done gradually and where it makes sense and new extensions can be evaluated on performance impact etc.

Almeida was not opposed to that idea, quite the reverse, in fact:

I encourage you to pursue this. We'd all benefit from better C. I'd be happy to review and provide feedback on proposed extensions that are deemed equivalent/better than what Rust offers.
My background is also in C. I'm no Rust fanboy, I'm just taking what I think is a pragmatic view of the available options.

It is a little hard to imagine the kernel switching to a C extension that does not yet exist in order to avoid further investigating adding Rust into the mix, however. But there is still plenty of work that needs to be done by Rust for Linux, some of which seems likely to be needed before Torvalds would be willing to merge the support. For example, more "real" driver examples and removing the paths that lead to BUG() calls seem needed. But Rust for Linux is clearly getting closer to being a reality.

Rust heads into the kernel?

Posted Apr 21, 2021 1:57 UTC (Wed) by rvolgers (guest, #63218) [Link]

There is already work on a JSON backend for RustDoc, which could be the basis for a non-html CLI viewer: https://github.com/rust-lang/rust/issues/76578

Since Zijlstra also described the kernel rst docs as "crap" and "unreadable garbage" I'm not sure anything except hand-written ASCII would satisfy him, but perhaps others would be happy with that.

Replacing Rust's alloc library with different code more suited to the Linux kernel seems like a good step, and it's something supported by the language just fine and is used in various situations already, it doesn't make the code any less Rust. I do hope they will not listen to the comments asking for Rust to be made to look exactly like C. Rust is a different language, and it's been designed by people well aware that a new language has a "novelty budget" and didn't go out of their way to do things differently for the fun of it.

Seeing if statements without parentheses around the condition was weird for me at first too, but it makes a lot more sense when you consider that an if statement is also an expression in Rust, which means it can be used the way the ternary operator is used in C. I can't quite put a finger on why, but adding the extra parentheses around the condition in such "expression if statements" makes it look a lot noisier and harder to parse for me, even back when I had very little experience with Rust.

Rust heads into the kernel?

Posted Apr 21, 2021 2:36 UTC (Wed) by atnot (subscriber, #124910) [Link] (55 responses)

I don't regularly read lkml but when I saw multiple people linking to especially Peter's messages with something to the effect of "why I'm not touching linux kernel dev" I was intrigued and decided to follow the thread.
I must say, I'm extremely disappointed that this is apparently the level of discourse you can have on the lkml without fearing, at minimum, a loss of reputation.

Rust heads into the kernel?

Posted Apr 21, 2021 3:08 UTC (Wed) by Paf (subscriber, #91811) [Link] (52 responses)

This is part of, I think, the ‘social debt’ built up over years of Linus acting like this periodically. It became accepted.

I do have a pretty basic fear that the maintenance required for this may be overwhelming and the rewards not great enough to maintain the effort over time... but gosh, as someone who has written kernel C for my whole career, it would probably be good if we could work out another option. Rust looks really compelling.

Rust heads into the kernel?

Posted Apr 21, 2021 4:58 UTC (Wed) by xinitrc (subscriber, #126452) [Link] (51 responses)

The problem is that there are lots of discussions about how rust is amazing and why it should be in the kernel, but not a single driver has been written yet.
Yet controlling the hardware is the whole point of an operating system.
And this is why people have chosen C previously.

Rust heads into the kernel?

Posted Apr 21, 2021 5:55 UTC (Wed) by kunitz (subscriber, #3965) [Link] (49 responses)

I agree with that statement. Write code that works, does something useful, and allows users to achieve a goal that is otherwise not possible. Instead of writing announcements, the Rust people should rewrite three drivers from different areas and prove that their statements about better safety are justified.

I'm skeptical because of statements like this:

Secondly, modules written in Rust should never use the C kernel APIs directly. The whole point of using Rust in the kernel is that we develop safe abstractions so that modules are easier to reason about and, therefore, to review, refactor, etc.

This sounds not very kernel-like. Creating abstractions in such a way that they bear no costs will be hard. Giving the tendency of internal kernel APIs to change over time, there must be automation to support them. All of that will become clear if one would try to write actual drivers in Rust.

Rust heads into the kernel?

Posted Apr 21, 2021 7:19 UTC (Wed) by matthias (subscriber, #94967) [Link] (2 responses)

> I agree with that statement. Write code that works, does something useful, and allows users to achieve a goal that is otherwise not possible.

Definitely.

> Instead of writing announcements, the Rust people should rewrite three drivers from different areas and prove that their statements about better safety are justified.

On the other hand, getting feedback early is not bad. If the only complaint is that there are no drivers yet, the Rust people can continue and write some drivers. If there had been big complaints about the interface and the way they try to integrate Rust into the kernel, then having already implemented some drivers would have been a waste of time.

> I'm skeptical because of statements like this:
> Secondly, modules written in Rust should never use the C kernel APIs directly. The whole point of using Rust in the kernel is that we develop safe abstractions so that modules are easier to reason about and, therefore, to review, refactor, etc.
> This sounds not very kernel-like. Creating abstractions in such a way that they bear no costs will be hard.

Nobody said that kernel development would be easy. On the other hand, getting all users of some kernel API to be correct in C is also not easy. In fact it is so hard, that people usually do not even try to prove things like memory-safety.

> Giving the tendency of internal kernel APIs to change over time, there must be automation to support them.

The Rust safe abstraction will be one user of an API like every other user. And it will have to be adapted when an API changes just like every other user of the API. In some cases, it might be enough to change the implementation of the safe abstraction. In other cases, also the Rust interface of the abstraction will have to be changed and therefore Rust code using the interface will also have to be refactored. But this would also be true for any C code using the interface.

> All of that will become clear if one would try to write actual drivers in Rust.

And this is just what should happen now, after there was no clear objection to the way Rust will be integrated into the kernel.

Rust heads into the kernel?

Posted Apr 21, 2021 8:34 UTC (Wed) by mkubecek (guest, #130791) [Link] (1 responses)

> The Rust safe abstraction will be one user of an API like every other user. And it will have to be adapted when an API changes just like every other user of the API. In some cases, it might be enough to change the implementation of the safe abstraction. In other cases, also the Rust interface of the abstraction will have to be changed and therefore Rust code using the interface will also have to be refactored. But this would also be true for any C code using the interface.

But "will have to be adapted" says nothing about who is going to be responsible for that. As of now, the practice is that if you want to change internal API, you are responsible for adapting all its in tree users. This is fine as long as all of kernel is written in C. But when another language is added, it is a natural question if you still expect anyone who would touch any internal API to also update your abstractions - which also means learning rust to a level sufficient for such work. Or if we can simply break the rust stuff at will and you will fix it later. Or if anyone who wants to touch the internal API has to find (in advance) someone who will take care of the rust dependants. Neither of the options looks very appealing to me.

It is very unfortunate that while rust people were very active in other parts of all the LKML and LWN discussions, nobody cared to reply to these concerns raised by Peter (and me). For some of us, this is really a fundamental question and we would like to have an idea where this path is heading before taking the first step. So I'm asking again: do you expect that anyone seriously involved in kernel development will have to get proficient (also) in rust? If not, how do you plan the coexistence to work?

Rust heads into the kernel?

Posted Apr 21, 2021 10:36 UTC (Wed) by tux3 (subscriber, #101245) [Link]

If the Rust side does go ahead with its plan of building wrappers around C APIs vs. allowing Rust drivers to directly interact with C, you at least avoid a scenario where a C refactor patch has to touch hundreds of Rust callers written in idiomatic style that can be very much unreadable to pure C people.

I imagine the wrappers will restrict themselves to very simple, easily understood code. Not only for the sake of people who should not have to read TRPL cover to cover, but also because unsafe wrappers ought to prioritize readability and 'obviously correct'ness.

Now would we still need dedicated people to take your C patches and write the corresponding Rust part?
Maybe. I'd like to hope not!

Rust heads into the kernel?

Posted Apr 21, 2021 7:28 UTC (Wed) by taladar (subscriber, #68407) [Link] (43 responses)

At the very least I would expect Rust wrappers around kernel APIs to get rid of that error-prone C inline error reporting with negative numbers that is so common in C.

Rust heads into the kernel?

Posted Apr 21, 2021 7:48 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (6 responses)

Yes, you can expect that at some point the Result<T, E> will be converted to a Result<(), Errno> or Result<&SomeStruct, Errno>, and then unsafe code will convert that to a negative errno.
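
A small sketch of the conversion described here, under the assumption of a hypothetical `Errno` newtype (the real kernel crate's error type may look different). In this simplified form the conversion itself needs no `unsafe`; it just happens at the boundary where a C-style `int` is expected.

```rust
/// Hypothetical error type wrapping a positive errno value.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Errno(i32);

/// EINVAL is 22 on Linux.
const EINVAL: Errno = Errno(22);

/// Example Rust-side check that reports failure through `Result`.
fn check_len(len: usize) -> Result<(), Errno> {
    if len > 4096 {
        Err(EINVAL)
    } else {
        Ok(())
    }
}

/// Boundary helper: turns a `Result` into the C convention of
/// zero on success and a negative errno on failure.
fn to_c_int(result: Result<(), Errno>) -> i32 {
    match result {
        Ok(()) => 0,
        Err(Errno(code)) => -code,
    }
}
```

Rust callers of `check_len` cannot ignore the distinction between success and failure, while C callers still see the familiar `0` or `-EINVAL`.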

Rust heads into the kernel?

Posted Apr 21, 2021 7:56 UTC (Wed) by josh (subscriber, #17465) [Link] (5 responses)

Long-term, we're hoping to teach Rust that inside the kernel a valid reference will never have a value less than 4096 (carving out a "niche" in Rust terms), so that a Result<&T, Errno> (or a Result with any success type that uses the space of valid pointers, such as Box) will be exactly one machine word in size and have the same layout as a C ERR_PTR.
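
The 0-4095 niche described here does not exist in today's Rust, but the general mechanism can be observed with the niches that do exist. The sketch below checks a few layouts with `std::mem::size_of`; the helper names are made up for the example.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Returns true when `A` and `B` occupy the same number of bytes.
fn same_size<A, B>() -> bool {
    size_of::<A>() == size_of::<B>()
}

/// Checks a few niche-related layouts.
fn niche_examples() -> [bool; 3] {
    [
        // A reference is never null, so `None` is stored as the null pointer.
        same_size::<Option<&u8>, &u8>(),
        // `NonZeroU32` excludes zero, so `None` is stored as zero.
        same_size::<Option<NonZeroU32>, u32>(),
        // A plain `u32` has no spare bit pattern, so a separate tag is needed.
        size_of::<Option<u32>>() > size_of::<u32>(),
    ]
}
```

All three checks hold, which is the same effect a "never below 4096" niche would give a `Result<&T, Errno>`.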

Rust heads into the kernel?

Posted Apr 21, 2021 12:50 UTC (Wed) by pbonzini (subscriber, #60935) [Link]

You still probably want to use other error enums than errno in Rust code, so that should be well into diminishing returns area though.

Rust heads into the kernel?

Posted Apr 21, 2021 22:59 UTC (Wed) by roc (subscriber, #30627) [Link] (3 responses)

Won't you also want to make Result<T, Errno> a single machine word where T is u32 or even u63?

Rust heads into the kernel?

Posted Apr 22, 2021 8:31 UTC (Thu) by pbonzini (subscriber, #60935) [Link]

It would be a bit more complicated by Rust being much more strict on integer casts. It means that you'd use usize in many places where C would use unsigned (or more likely, use an int that would always be positive).

Rust heads into the kernel?

Posted Apr 22, 2021 19:03 UTC (Thu) by josh (subscriber, #17465) [Link] (1 responses)

That's theoretically possible but would take either some more language work or some library work, beyond the language work we'd need to do to allow storing enums in the 0-4095 niche of pointers.

I'm assuming that you mean "a single machine word" on a 64-bit system, because on a 32-bit system you can't store a general u32 and something else in the same 32-bit word.

Rust, today, has the property that a type is always stored the same way no matter where you put it, even when taking advantage of niches and similar. That's important because if you have a Result<T, Errno>, if it's an Ok you can get a &T from it, and if it's an Err you can get a &Errno from it. So, that &Errno needs to actually reference a thing that looks like an Errno in memory, not something that's been tweaked to use different enum values.

That works fine for a reference or box or other type that cannot have a value in the range 0-4095 (once we have the ability to declare that "not in the first page of memory" niche). An Errno can use that niche.

However, if you have some type T that has a niche that isn't 0-4095 (random example: a "userspace pointer" that can't point into kernel memory but *can* potentially be 0-4095), such that T's valid values and Errno's valid values would overlap, you can't tweak the Errno's values to store them in T's niche. That would mean you couldn't get a valid reference to an Errno, because it needs translation before extracting it.

In the specific case of a u32, the easiest solution is just to let Rust store the u32 and the Errno alongside each other, and they should take up no more than a 64-bit word.

To solve the general case of niches that would require transforming the values of Errno when storing/extracting them, we could handle that in two ways, one at the library level or one at the language level:

At a library level, we could just have a dedicated type (not Result) that stores an Errno and another type, which *doesn't* let you get a &Errno from it, and instead only lets you get an Errno. (Errno itself is smaller than a machine word, so we might as well just use its value.) We could then teach that dedicated type to store the errno in the niche of the T, and re-extract it later. That would solve the specific problem, with some effort, but would require writing code specifically to handle that transformation.

Or, at a language level, we can add a concept of "types that you can't have a direct reference to", make a version of Errno with that attribute, and then allow such types to take advantage of arbitrary niches in other types. Because you can't have a reference to those types, they don't have to be stored in a way that looks exactly like they'd be stored if standalone. You'd typically use it for types that are so cheap to copy (because they're smaller than a machine word) that the inability to reference them isn't an issue.

That language-level approach is the kind of thing that you can teach the Rust language to do *once*, and then it can automatically do the kinds of optimizations that in C you'd have to manually manage in each data structure.

Rust heads into the kernel?

Posted Apr 23, 2021 0:43 UTC (Fri) by roc (subscriber, #30627) [Link]

I think there's actually a conundrum here. Someone is going to have to decide how much language and compiler work to do before you really push on Rust being used in the Linux kernel --- because before you do the latter, you will need to settle once and hopefully for all the idiomatic way to handle Errno results for various common T types.

Rust heads into the kernel?

Posted Apr 21, 2021 15:44 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (35 responses)

That's not necessarily more error-prone than having to check a composite result. It's trivial to return composite results in C, it's just rarely used because most of the time it brings little value and brings its own class of bugs as well.

Rust heads into the kernel?

Posted Apr 21, 2021 18:48 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (34 responses)

The difference is having language-level support for the goto-based error-checking idioms that Linux uses. For example something like

  r = func();
  if (r < 0)
    goto some_label;
  ...
some_label:
  // chain of kfree and unlocks here
  return r;

would be just

func()?;

Likewise, this

  r = -EINVAL;
  if (func() < 0)
    goto some_label;

would be

func().map_err(|_| errno::EINVAL)?;
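
For readers who want to try the two idioms above, here is a self-contained sketch. The `Errno` type, its constants, and the three functions are hypothetical; only `?` and `map_err` are the actual language and library features being shown.

```rust
/// Hypothetical error type wrapping a positive errno value.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Errno(i32);

const E2BIG: Errno = Errno(7);
const EINVAL: Errno = Errno(22);

/// Any parse failure is replaced with `EINVAL`, like the
/// `r = -EINVAL; if (func() < 0) goto ...` pattern in C.
fn parse_len(input: &str) -> Result<usize, Errno> {
    input.trim().parse::<usize>().map_err(|_| EINVAL)
}

/// Rejects lengths above an arbitrary limit chosen for the example.
fn check_limit(len: usize) -> Result<(), Errno> {
    if len > 4096 {
        Err(E2BIG)
    } else {
        Ok(())
    }
}

/// Each `?` returns early with the error if the call failed.
fn make_buffer(input: &str) -> Result<Vec<u8>, Errno> {
    let len = parse_len(input)?;
    check_limit(len)?;
    Ok(vec![0u8; len])
}
```

With these definitions, `make_buffer("abc")` returns `Err(EINVAL)` and `make_buffer("9999")` returns `Err(E2BIG)`, without any explicit branching in `make_buffer`.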

Rust heads into the kernel?

Posted Apr 22, 2021 20:18 UTC (Thu) by wtarreau (subscriber, #51152) [Link] (33 responses)

And there are really people who can parse such horrors ?

Also not being able to figure exactly what that function is doing in my back after the question mark bothers me. How do you decide to atomically increment an error counter on the return path with that method ? How do you increment different counters depending on the case you've met ? I.e. oversized_frame or undersized_frame ?

This looks like eye-candy for dummies to me, just to attract beginners by claiming they'll write easier code but it's hard to imagine it can ever be useful outside of school books.

Rust heads into the kernel?

Posted Apr 22, 2021 20:34 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

`?` is an early return in the error case. If you want to increase an error counter before returning, I would match and on the `Err` arm increment and return. Same thing with the different counters case.

You might call it eye candy, but my eyes can't follow goto spaghetti and match up allocation location/free call pairings on a whim. RAII makes early returns in-place doable. If you have other things to do, don't use `?`, but do your thing and return there.
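
The `match` approach suggested here might look like the sketch below. Everything in it is hypothetical (the `Stats` struct, the `FrameError` variants, and the 64 and 1518 byte limits are picked only to mirror the oversized/undersized example from the question).

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Per-case error counters, updated atomically.
#[derive(Default)]
struct Stats {
    oversized: AtomicU64,
    undersized: AtomicU64,
}

#[derive(Debug, PartialEq)]
enum FrameError {
    Oversized,
    Undersized,
}

/// Accepts frame lengths between 64 and 1518 bytes.
fn validate(len: usize) -> Result<usize, FrameError> {
    if len < 64 {
        Err(FrameError::Undersized)
    } else if len > 1518 {
        Err(FrameError::Oversized)
    } else {
        Ok(len)
    }
}

/// Instead of `validate(len)?`, match on the result so the error path
/// can bump the right counter before returning.
fn receive(stats: &Stats, len: usize) -> Result<usize, FrameError> {
    match validate(len) {
        Ok(len) => Ok(len),
        Err(e) => {
            let counter = match e {
                FrameError::Oversized => &stats.oversized,
                FrameError::Undersized => &stats.undersized,
            };
            counter.fetch_add(1, Ordering::Relaxed);
            Err(e)
        }
    }
}
```

The error path is spelled out in full, so nothing happens on return that is not visible in `receive`.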

Rust heads into the kernel?

Posted Apr 22, 2021 22:04 UTC (Thu) by pbonzini (subscriber, #60935) [Link] (23 responses)

Are there people who can parse C function pointer casts or ternary operator? Sure, they just have to study the language.

Why should it be a surprise that not all of Rust can be guessed without studying the language?

Honestly, the last time I heard this kind of complaint I was teaching first year university students.

Rust heads into the kernel?

Posted Apr 23, 2021 9:02 UTC (Fri) by Wol (subscriber, #4433) [Link] (18 responses)

Weren't C - and POSIX - originally designed as a glorified assembler for writing a games machine to run on a PDP-11?

So it wasn't even designed as a language to write an OS in! As those of us who can remember the - WELL DESIGNED - commercial systems that came before, to us C and *nix are basically crap. The problem is they were "good enough, and cheap enough" to sweep aside the opposition. And now all this extra stuff is being bolted on to get round design decisions that made sense for a single-user games machine, but are rubbish for actually writing real software, but can't actually be designed away because so much old software relies on them!

Whereas Rust has been designed as a language for writing low level systems. The first OS I used was originally written in FORTRAN. Then they wrote chunks of it in PL/1, both the official and their own unofficial versions. It worked pretty well. Then they re-wrote it in C and I think it was one of the major reasons the company tanked.

Rule 0 of health and safety - MAKE DOING THE RIGHT THING EASY. That's a design goal of Rust, which C appears never even to have heard of!

You know I'm on the raid mailing list, and the number of bugs and lockups and whatever that seem to be creeping out of the woodwork because memory-safety, spinlock, and similar assorted mistakes have been made seems awful. Okay, they only seem to hit when users are doing esoteric things - just running a raid array is very well tested - but it would be nice if the language made those sorts of bugs HARD, rather than trying to guarantee they will lurk in pretty much any poorly-tested error path.

Cheers,
Wol

Rust heads into the kernel?

Posted Apr 23, 2021 12:04 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (16 responses)

Sincerely I'm not seeing any difference between the examples above and my old horrible memories of Perl, where everything was written as awful regex matches, that was well-known for being a "write only language", and that resulted in massive vulnerabilities everywhere since it was so difficult to figure what was *really* going to happen behind the curtains.

Memory safety is the argument circling in loops all the time about this language. I'd like *CONTROL* safety. By obfuscating controls, I hardly see how someone may assess what is really going to be done. A number of issues come from compilers moving code around to optimize it while it reads fine on the screen, forcing us to add compiler barriers, READ_ONCE() and so on. You can add all the amount of "memory safety" you want, if that results in a totally mangled non-sequential syntax, you can be certain that such issues will be even harder to spot, especially with the arrogant attitude I've seen from some people around the language constantly claiming "bugs are C, we can't make bugs in rust". Bugs are human. Humans need to understand what is being done. C is far from being perfect, we all know it, but it translates to instructions and does not add magic everywhere to make the developer feel proud of writing code using smileys. This is essential to me and way more than "memory safety at compile time" and "panic in unexpected situations".

Rust heads into the kernel?

Posted Apr 23, 2021 12:21 UTC (Fri) by hummassa (subscriber, #307) [Link] (8 responses)

Sorry, but the whole "Perl is a write-only language" reads to me like "I can't read Perl, how can people be so mean writing in a language that uses symbols!!! Sigils!!! They are programming with smileys!!!"

Yes. I understand that some people have problems with a highly symbolic language. But the "I am in control if I can see every GOTO" is just an illusion. Many parts of the kernel code will be data-driven. And just like the regular expressions you seem to abhor, there are many, many ways to write correct programs that are perfectly readable... by those who can read the language. Even if those programs are as succinct as possible.

The fact is that whenever you force kernel devs to trace each and every exception that they will have to redirect flow (goto) to some place where the exact amount of resource acquisition you did up to now can be undone, you are just introducing opportunities for bugs to creep in.

Memory safety is ONE of the concerns -- resource freeing/ RAII is a bigger concern that is addressed by higher-level languages like Rust and C++.

Rust heads into the kernel?

Posted Apr 23, 2021 14:13 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (7 responses)

> And just like the regular expressions you seem to abhor, there are many, many ways to write correct programs that are perfectly readable... by those who can read the language.

Yes, they're all listed on cve.mitre.org

Rust heads into the kernel?

Posted Apr 23, 2021 18:15 UTC (Fri) by hummassa (subscriber, #307) [Link] (6 responses)

> Yes, they're all listed on cve.mitre.org

PLEASE PRETTY PLEASE show me ONE example of a CVE caused by a regular expression. Let me make some popcorn while I wait for you to try.

Rust heads into the kernel?

Posted Apr 23, 2021 19:02 UTC (Fri) by hummassa (subscriber, #307) [Link] (2 responses)

Now, even if I am charitable and see that what you meant was "oh ultra-terse, symbolic code causes CVEs", this is provably false, also.

Rust heads into the kernel?

Posted Apr 23, 2021 19:47 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

J (or APL), anyone :-)

Cheers,
Wol

Rust heads into the kernel?

Posted Apr 29, 2021 16:58 UTC (Thu) by ejr (subscriber, #51652) [Link]

I learned APL2 in high school. I learned Perl 4 from the man page. I find these discussions curious.

Rust heads into the kernel?

Posted Apr 23, 2021 19:02 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

Here you go!

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-23354

"The package printf before 0.6.1 are vulnerable to Regular Expression Denial of Service (ReDoS) via the regex string /\%(?:$([\w_.]+)$|([1-9]\d*)\$)?([0 +\-\]*)(\*|\d+)?(\.)?(\*|\d+)?[hlL]?([\%bscdeEfFgGioOuxX])/g in lib/printf.js. The vulnerable regular expression has cubic worst-case time complexity. "

Rust heads into the kernel?

Posted Apr 23, 2021 19:59 UTC (Fri) by hummassa (subscriber, #307) [Link]

Point conceded! Oh man, I've been proven wrong TWICE already on this thread! I must be turning into a Real Boy™!

Rust heads into the kernel?

Posted Apr 27, 2021 23:28 UTC (Tue) by ras (subscriber, #33059) [Link]

I realise this is just a bit of fun, but I'd say that is not the regex's fault. It's the fault of the underlying re library using an NFA to recognise it. I've been bitten by NFA's going rogue on some input so many times now, I'd say a regex library using a NFA is a bug that leads to CVE's like the one you found.

DFA's might occasionally take exponential space for their compiled form and you have to incur the expense of compiling the entire thing, but you get to find out about your bug the first time the regex is compiled, not at some random time later in production.

Rust heads into the kernel?

Posted Apr 23, 2021 12:23 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (6 responses)

> Sincerely I'm not seeing any difference between the examples above and my old horrible memories of Perl, where everything was written as awful regex matches, that was well-known for being a "write only language", and that resulted in massive vulnerabilities everywhere since it was so difficult to figure what was *really* going to happen behind the curtains.

The new syntax is the lambda parameter bit (`|_| expr` is a lambda that takes one argument, names it `_` to ignore it, then evaluates to `expr`) and the `?`, which is the "if an error occurred, return it from the current function, otherwise evaluate to its success value" operator.

The other bit you need to remember is:

> I'd like *CONTROL* safety

RAII is what you want then. You don't have to thread your own cleanup-on-error cases together based on where they can occur from within the function. If you allocate a lock in Rust, it has two destinies: passing it off to another function or unlocking when the function returns. There's no other choice[1].

> but it translates to instructions and does not add magic everywhere to make the developer feel proud of writing code using smileys.

Rust does the same thing. I feel like saying "?" hides control flow is like saying "?: hides lazy evaluation" or "how can I see short-circuiting boolean operator logic" in C. It's part of the language. If you don't want to learn it, fine. But why does your resistance to learning something new block others from using what they know?

[1] Well, I suppose you could manually drop it, but that is such an oddity that it'd be like seeing "// intentionally don't unlock here; it will be done in $foobar later in the logic that uses this function" except you *have* to spell it out instead of being lucky someone was kind enough to leave a comment for you.
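The lock-guard behaviour described above can be sketched in userspace Rust. This is a minimal illustration using `std::sync::Mutex` as a stand-in, not the kernel's locking API; the function names `bump_if_below` and `bump_then_release` are hypothetical.

```rust
use std::sync::Mutex;

// Hypothetical helper: bump a shared counter unless it has reached `limit`.
fn bump_if_below(counter: &Mutex<u32>, limit: u32) -> Result<u32, String> {
    // Taking the lock yields a guard; the lock is held while the guard lives.
    let mut guard = counter.lock().map_err(|_| String::from("poisoned"))?;
    if *guard >= limit {
        // Early return: `guard` goes out of scope here and the lock is released.
        return Err(format!("limit {} reached", limit));
    }
    *guard += 1;
    Ok(*guard)
}

// The "manual drop" oddity from the footnote, spelled out explicitly.
fn bump_then_release(counter: &Mutex<u32>) -> bool {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    drop(guard); // explicit early unlock
    // The lock is free again even though the function has not returned yet.
    let is_free = counter.try_lock().is_ok();
    is_free
}

fn main() {
    let counter = Mutex::new(0);
    println!("{:?}", bump_if_below(&counter, 1));
    println!("{:?}", bump_if_below(&counter, 1));
    println!("free after error: {}", counter.try_lock().is_ok());
    println!("free after drop: {}", bump_then_release(&counter));
}
```

Neither return path in `bump_if_below` mentions unlocking; the error path releases the lock the same way the success path does.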

Rust heads into the kernel?

Posted Apr 23, 2021 12:25 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> If you allocate a lock in Rust

Sorry, this should read "take a lock" (was thinking lock guard, but then edited around it).

Rust heads into the kernel?

Posted Apr 23, 2021 12:40 UTC (Fri) by hummassa (subscriber, #307) [Link] (2 responses)

> > I'd like *CONTROL* safety

This person does not want control safety, they want safety control. They want to SEE the "if condition bail out thru such and such path" because they are under the impression that they can do better than the compiler in determining which "such and such path" is appropriate.

As you said (and I also did, in another comment), it's ok to like "things written as words" in software. And it's ok if a person likes C, knows C, and has no interest in learning C++ or Rust. But it's just silly to block others from using what they already know.

Rust heads into the kernel?

Posted Apr 23, 2021 17:24 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

And I get the impression he is also assuming that the language ENFORCES "bail on error". I don't have any knowledge of Rust but I can't believe that the designers would be so stupid as to stop you identifying, handling, and cleaning up errors manually.

Cheers,
Wol

Rust heads into the kernel?

Posted Apr 24, 2021 20:29 UTC (Sat) by farnz (subscriber, #17727) [Link]

Rust indeed does not enforce bail on error. The core library uses a Rust enum called Result<T, E> (where T is a type parameter for the "happy" path, and E is a type parameter for the "failure" path) to handle errors. It has two variants: Ok(T) for the happy path, and Err(E) for the failure path; you can use Rust's match or if let primitives to dissect an error manually, if you wish.

To return a "happy" value, you return Ok(value), where value has to be of type T; similarly, to return a failure, you return Err(error) where error has to be of type E. There is a special operator, previously spelt try!(...) and now spelt ...? which for Result is equivalent to the following Rust:

match ... {
    Ok(v) => v,
    Err(e) => return Err(e.into()),
}

Of course, if you write it out in full, you don't have to return the error, or do the conversion (.into() uses the Into trait to convert from one type to another, allowing you to convert errors in this case if a suitable conversion is implemented. In the kernel, you might use this to convert from a driver-specific error to a KernelError type, for example, so that you can retain semantic details of your error while in Rust). You can implement a different pattern match that does whatever you want with the Result type - maybe not bailing at all, but doing something different.

That said, you do still end up with Rust doing some work when you exit a scope, beyond just deallocating the stack frame. If a type implements the Drop trait, the code in there is run when the object is deallocated.
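As a rough sketch of the equivalence described above, the two functions below behave identically: one uses `?`, the other writes the match out by hand. `DriverError`, `KernelError` and `check_len` are made-up names for illustration and are not taken from the Rust for Linux code.

```rust
// Hypothetical driver-specific error.
#[derive(Debug, PartialEq)]
enum DriverError {
    BadLength(usize),
}

// Hypothetical broader error type that driver errors convert into.
#[derive(Debug, PartialEq)]
enum KernelError {
    InvalidLength(usize),
}

// Implementing From also provides the `.into()` conversion that `?` relies on.
impl From<DriverError> for KernelError {
    fn from(e: DriverError) -> Self {
        match e {
            DriverError::BadLength(n) => KernelError::InvalidLength(n),
        }
    }
}

fn check_len(buf: &[u8]) -> Result<usize, DriverError> {
    if buf.len() <= 4 {
        Ok(buf.len())
    } else {
        Err(DriverError::BadLength(buf.len()))
    }
}

// Using the `?` operator: converts the error and returns early on failure.
fn with_operator(buf: &[u8]) -> Result<usize, KernelError> {
    let n = check_len(buf)?;
    Ok(n * 2)
}

// The same logic written out in full.
fn written_out(buf: &[u8]) -> Result<usize, KernelError> {
    let n = match check_len(buf) {
        Ok(v) => v,
        Err(e) => return Err(e.into()),
    };
    Ok(n * 2)
}

fn main() {
    println!("{:?}", with_operator(&[1, 2, 3]));
    println!("{:?}", written_out(&[0; 8]));
}
```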

Rust heads into the kernel?

Posted Apr 23, 2021 14:21 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (1 responses)

> The new syntax is the lambda parameter bit (`|_| expr` is a lambda that takes one argument, names it `_` to ignore it, then evaluates to `expr`) and the `?`, which is the "if an error occurred, return it from the current function, otherwise evaluate to its success value" operator.

Thanks for explaining. I find this overly complicated to add an argument that must be ignored.

> The other bit you need to remember is:
> > I'd like *CONTROL* safety
>
> RAII is what you want then. You don't have to thread your own cleanup-on-error cases together based on where they can occur from within the function.

No, please no, that's the worst horror of modern languages in my opinion. As I explained, it encourages *not* handling abnormal conditions and letting someone else deal with them. Abnormal conditions are best handled *where* they occur, not by lazily returning to the caller which will do the same.

> If you allocate a lock in Rust, it has two destinies: passing it off to another function or unlocking when the function returns. There's no other choice[1].

While most of the time I do prefer locks to be symmetric, I've already entered into situations where it was needed *not* to unlock, and yes, that requires clean code and a big fat comment above the function (and if possible a name that suggests it).

> but it translates to instructions and does not add magic everywhere to make the developer feel proud of writing code using smileys.

Fortunately another member here showed how to handle errors with if/else that allows to properly take care of them instead of ignoring them.

> Rust does the same thing. I feel like saying "?" hides control flow is like saying "?: hides lazy evaluation" or "how can I see short-circuiting boolean operator logic" in C. It's part of the language. If you don't want to learn it, fine. But why does your resistance to learning something new block others from using what they know?

It's not a matter of me refusing to learn, it's that the arguments that are presented to defend it are exactly those which I take as counter-arguments: let's encourage developers not to care about anything anymore and ignore who will process their errors since the compiler will surely know better.

> [1] Well, I suppose you could manually drop it, but that is such an oddity that it'd be like seeing "// intentionally don't unlock here; it will be done in $foobar later in the logic that uses this function" except you *have* to spell it out instead of being lucky someone was kind enough to leave a comment for you.

Comments are precisely made to document non-obvious stuff like this. Someone who writes a function that plays with non-obvious locks or locks in a non-obvious way and doesn't mention it is looking for trouble anyway.

Rust heads into the kernel?

Posted Apr 23, 2021 14:51 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> I find this overly complicated to add an argument that must be ignored.

It's not "add[ing] an argument that must be ignored". The .map_err passes the function the error. If you want to ignore it, you use `|_|`. If you want to handle it, give it a name. It's not like C where an N-ary function can be passed to a callback passing M arguments as long as M < N. The arity must match, so the passed closure must take an argument (here, ignored).
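To make the difference between ignoring and naming the closure argument concrete, here is a small sketch using only the standard library. `ConfigError` and both function names are hypothetical.

```rust
use std::num::ParseIntError;

// Hypothetical error type for this illustration.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Invalid,
    BadNumber(String),
}

// `|_|` accepts the original error and deliberately ignores it.
fn parse_ignoring(s: &str) -> Result<u32, ConfigError> {
    let n = s.parse::<u32>().map_err(|_| ConfigError::Invalid)?;
    Ok(n)
}

// Naming the argument lets the closure keep details of the original error.
fn parse_keeping(s: &str) -> Result<u32, ConfigError> {
    let n = s
        .parse::<u32>()
        .map_err(|err: ParseIntError| ConfigError::BadNumber(err.to_string()))?;
    Ok(n)
}

fn main() {
    println!("{:?}", parse_ignoring("42"));
    println!("{:?}", parse_ignoring("x"));
    println!("{:?}", parse_keeping("x"));
}
```

In both cases the closure takes exactly one argument, because `map_err` always passes the error along.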

> No, please no, that's the worst horror of modern languages in my opinion. As I explained, it encourages *not* handling abnormal conditions and letting someone else deal with them. Abnormal conditions are best handled *where* they occur, not by lazily returning to the caller which will do the same.

I think we agree, but we're talking on different levels of abstraction. Of course, where to *best* handle an error depends on the error itself. Bad flags? Caller's fault. Lock contention? Probably my issue. Can't open a file? Do I have a contract to retry? Maybe the kernel *is* different, but most of the time I can wrap up an error like this:

use std::process;

let output = process::Command::new("git") // the executable
  .arg("--version") // build up an argument list
  .output() // I want the output of the command
  .map_err(|err| MyErr::GitExec("--version", err))?; // `git` failed to launch
if !output.status.success() { // `git` ran, but exited with a failure status
  return Err(MyErr::Git("--version", output.status.code()));
}

where `MyErr::GitExec` indicates:
- `git` failed to *launch*
- the "--version" describes what was being done
- the err contains the error that occurred

`MyErr::Git` indicates:
- `git` executed, but returned failure
- "--version" acts largely the same
- it contains the exit code as well

My code doesn't particularly care *why* it failed, but it can annotate why it was trying to do what it was doing when the error happened.

> Fortunately another member here showed how to handle errors with if/else that allows to properly take care of them instead of ignoring them.

Sure. When it calls for it, expand it out. I think I still usually prefer a closure for Result<> handoffs like above, but it's a style difference. The code is basically the same (in fact, I *think* that you can use cargo-expand to see what `?` desugars to in your code as well, but maybe that was when it was spelled `try!()`).

Rust heads into the kernel?

Posted Apr 23, 2021 23:19 UTC (Fri) by anselm (subscriber, #2796) [Link]

> Weren't C - and POSIX - originally designed as a glorified assembler for writing a games machine to run on a PDP-11?

Nope. You may wish to read up on the early history of Unix.

In a nutshell, the game was written in assembly language for the PDP-7 (using a cross-assembler on the GE 635) and preceded Unix. Unix came along later as a rudimentary operating system for software development on the PDP-7. The early Unix hackers around Ken Thompson only got access to a PDP-11 in 1970 when they promised to produce a system for editing and typesetting text. This was eventually used by the Bell Labs patent department and gradually evolved into what we would think of as Unix. Various higher-level programming languages were tried and discarded, and eventually the Unix kernel was rewritten in C in 1973.

Unix was actually quite innovative in its time (especially the file system was miles ahead of the competition as far as versatility was concerned), and the fact that Unix was written in C did a lot to make it portable to many of the evolving computer architectures of the 1970s and 1980s. It is safe to say that without Unix and C, the computing landscape today would look a lot different and – in view of the various clunky manufacturer-specific systems that Unix eventually replaced – not necessarily better. Ken Thompson and Dennis Ritchie didn't receive their Turing award for their awesome beards.

Rust heads into the kernel?

Posted Apr 23, 2021 11:52 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (3 responses)

Sorry, but for me, even with a lot of good faith and mind stretching, I cannot figure how something spelled "func().map_err(|_| errno::EINVAL)?;" could translate to "if (func() < 0) goto leave;". Using smileys to write expressions will sooner or later either strike you in the back when you're tired late at night, or result in something different being done to "help" you.

Ultimately you *want* the processor to emit a test and a branch to a place where some things are undone. Why would you force yourself to express it in a totally different way ? I could also take a pencil, draw animals and expect the compiler to figure what I'm trying to do and emit code, and there will possibly be fans of this, but I'm not one of them, I'm sorry.

Rust heads into the kernel?

Posted Apr 23, 2021 12:10 UTC (Fri) by hummassa (subscriber, #307) [Link] (2 responses)

Seriously? You can't read "call func(), and if it returns any error (the less than zero C status code), map this error to errno::EINVAL and return that EINVAL error instead, unwinding any initializations that had been done up to now -- which includes potentially releasing locks, file handles, and other resources that if I forget to release manually will cause a leak?" I don't like Rust very much, but even I can read that

  func()?

is much less error-prone than

  r = func();
  if( r < 0 )
    goto UNWIND_EXACTLY_WHAT_HAS_BEEN_DONE_TILL_NOW;

where you have to manually sprinkle your function with a dozen UNWIND1, UNWIND2, etc...

Rust heads into the kernel?

Posted Apr 23, 2021 14:32 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (1 responses)

> Seriously? You can't read "call func(), and if it returns any error (the less than zero C status code),

No, sorry, I can't parse that this smiley "|_|" means "less than zero".

> I don't like Rust very much, but even I can read that
> func()?
> is much less error-prone than
> r = func();
> if( r < 0 )
> goto UNWIND_EXACTLY_WHAT_HAS_BEEN_DONE_TILL_NOW;
> where you have to manually sprinkle your function with a dozen UNWIND1, UNWIND2, etc...

And the simple fact that you write that is *exactly* what worries me about such practices. Just getting rid of abnormal failures seems to be the default way of developing. With the example involving the "goto" above it's simple to add an error counter for the specific type of condition that was met there, and possibly call other stuff to propagate that.

With the default "unwind" practice, well, it's just that. "OK got an error, let's report above that something happened, surely someone will figure what it was". When reading data from a NIC for example you can face a number of different "errors" (which are in fact abnormal but totally expected conditions) for which you have to write code to increase their respective counters and certainly not just "unwind exactly what has been done". In some circumstances you'll even need *not* to unwind everything, because for example you'd have pre-allocated and initialized a buffer and will prefer to keep it around for the next use instead of putting it back into the dirty pool so that it gets initialized again on next call. This is just an example of course, but you probably understand what I mean. And if not, anyway, I sense that we'll never agree because we don't have the same expectations.

Rust heads into the kernel?

Posted Apr 23, 2021 18:14 UTC (Fri) by hummassa (subscriber, #307) [Link]

> No, sorry, I can't parse that this smiley "|_|" means "less than zero".

It doesn't. The "less than zero" part is the C construct where you use positive return values to indicate that func did its job ok (and return some useful result, e.g., number of bytes written etc).

Just FYI: The "smiley" |_| is the equivalent of `lambda x:` in Python (where x will be ignored from now on), and map_err just substitutes the error part of a Result with the evaluation of said lambda.

As for the rest of your argument, you are still maintaining that you and all other devs are better at knowing what has to be unwound than the compiler is. So, I'll refer to my other answer at this other comment.

Rust heads into the kernel?

Posted Apr 23, 2021 1:03 UTC (Fri) by roc (subscriber, #30627) [Link] (7 responses)

> And there are really people who can parse such horrors ?

Yes. It's far less "horrifying" than, say, C type syntax.

> How do you decide to atomically increment an error counter on the return path with that method ?

You can always manually write out

  if func().is_err() {
    increment_error_count();
    return Err(errno::EINVAL);
  }

  if let Err(e) = func() {
    if ... something involving e ... {
      increment_oversized_frame_count();
    } else if ... something involving e ... {
      increment_undersized_frame_count();
    }
    return Err(errno::EINVAL);
  }
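A self-contained variant of the per-error counters idea might look like the sketch below. The `FrameError` enum, the counter names and the size thresholds are all illustrative assumptions and do not come from any real driver.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical error kinds a receive path might report.
#[derive(Debug, PartialEq)]
enum FrameError {
    Oversized,
    Undersized,
}

// One atomic counter per error kind (names are made up for this sketch).
static OVERSIZED: AtomicUsize = AtomicUsize::new(0);
static UNDERSIZED: AtomicUsize = AtomicUsize::new(0);

// Illustrative size check; the thresholds are placeholders.
fn check_frame(len: usize) -> Result<usize, FrameError> {
    match len {
        0..=63 => Err(FrameError::Undersized),
        64..=1518 => Ok(len),
        _ => Err(FrameError::Oversized),
    }
}

// Handle the error where it occurs: count it by kind, then return it.
fn receive(len: usize) -> Result<usize, FrameError> {
    match check_frame(len) {
        Ok(n) => Ok(n),
        Err(e) => {
            match e {
                FrameError::Oversized => OVERSIZED.fetch_add(1, Ordering::Relaxed),
                FrameError::Undersized => UNDERSIZED.fetch_add(1, Ordering::Relaxed),
            };
            Err(e)
        }
    }
}

fn main() {
    println!("{:?}", receive(100));
    println!("{:?}", receive(10));
    println!("{:?}", receive(9000));
    println!("oversized = {}", OVERSIZED.load(Ordering::Relaxed));
    println!("undersized = {}", UNDERSIZED.load(Ordering::Relaxed));
}
```

Because the inner `match` must cover every `FrameError` variant, adding a new error kind later forces a decision about which counter it belongs to.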

> This looks like eye-candy for dummies to me, just to attract beginners by claiming they'll write easier code but it's hard to imagine it can ever be useful outside of school books.

People write lots of real code using Rust, including Rust "Result" and "?". I have a product with 200K lines of Rust code written over the last 5 years and we use these features everywhere. I also have 30 years experience writing C and C++ code if that matters to you.

Rust heads into the kernel?

Posted Apr 23, 2021 12:07 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (6 responses)

Well at least your examples look more readable in terms of control.

Rust heads into the kernel?

Posted Apr 23, 2021 12:30 UTC (Fri) by hummassa (subscriber, #307) [Link] (5 responses)

The problem is exactly that. You are arguing that the kernel shouldn't have

  long x = input & SOME_FLAG ?
    (input & SOME_MASK) + SOME_INITIAL_VALUE :
   other_input & SOME_OTHER_FLAG || OTHER_INITIAL_VALUE;

because idiomatic C is "too terse" and "hides control flow". After all, how can you see the two `if`s that are lurking in this smiley-laden code?

Rust heads into the kernel?

Posted Apr 23, 2021 14:01 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (4 responses)

Not to mention with that particular code, I have to dredge up my C operator precedence to figure out if that `|| OTHER_INITIAL_VALUE` is a bug or not. Is it supposed to be `|`? Is it supposed to be with the false branch or the whole conditional expression?

Rust heads into the kernel?

Posted Apr 23, 2021 14:34 UTC (Fri) by wtarreau (subscriber, #51152) [Link]

I totally agree with you and am not writing C code like this either, just like I don't like seeing assignments inside conditions.

Rust heads into the kernel?

Posted Apr 23, 2021 18:09 UTC (Fri) by hummassa (subscriber, #307) [Link] (2 responses)

The short-circuiting logical-or operator || is defined as "evaluate LHS, if it's a truth value (none of false, 0, '\0', nullptr) return that, else return RHS". It binds more tightly than the ?: ternary conditional. The whole statement, explicitly written, would be:

  long x;
  if( input & SOME_FLAG ) {
    x = (input & SOME_MASK) + SOME_INITIAL_VALUE;
  } else {
    x = other_input & SOME_OTHER_FLAG;
    if( !x ) {
      x = OTHER_INITIAL_VALUE;
    }
  }

Rust heads into the kernel?

Posted Apr 23, 2021 18:30 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (1 responses)

This makes me think it's a typo because `expr || OTHER_INITIAL_VALUE` always returns 0 or 1, not `OTHER_INITIAL_VALUE`.

#include <stdio.h>

#define p(x) printf(#x " = %d\n", x)

int main() {
p(0 || 2);
return 0;
}

outputs:

0 || 2 = 1

Rust heads into the kernel?

Posted Apr 23, 2021 18:37 UTC (Fri) by hummassa (subscriber, #307) [Link]

That one was MY mistake. Wrong language. It's not a typo, it's an honest-to-god mistake and I would have introduced a kernel bug, if the code review didn't see it. In C, the || operator always yields an int that is either zero or one.

Rust heads into the kernel?

Posted Apr 21, 2021 7:38 UTC (Wed) by LtWorf (subscriber, #124958) [Link]

Well, this seems to be pushed by Google. They have every interest in the kernel not changing its API, so that Android vendors can easily port their drivers across versions.

Rust heads into the kernel?

Posted Apr 21, 2021 20:08 UTC (Wed) by tbelaire (subscriber, #141140) [Link]

I was just an undergrad student and I worked with a grad student to re-write (maybe translate would be a better word) a block device driver into Rust 4 years ago, and gave a talk at LinuxCon.

So we've been working on it for a while now, including writing drivers.

(Turns out Rust wasn't quite ready back then, but the hard part is definitely the safe abstraction layer.)

Rust heads into the kernel?

Posted Apr 21, 2021 14:30 UTC (Wed) by Paf (subscriber, #91811) [Link]

There’s a huge chicken and egg problem here, though. They’ve been asked to, so let’s give them a little time - I think the sincerity of the effort merits at least that.

Rust heads into the kernel?

Posted Apr 21, 2021 6:04 UTC (Wed) by motk (guest, #51120) [Link] (1 responses)

He's been like this for decades. Nothing ever happens.

Rust heads into the kernel?

Posted Apr 21, 2021 13:47 UTC (Wed) by ms-tg (subscriber, #89231) [Link]

I have a suspicion that this is the moment. Learn to communicate appropriately before the “next generation” of developers, who have concrete expectations in this area, and are likely to be the fastest uptake group of Rust, encounter this behavior.

If not, I can’t help but expect more defenestration drama.

Rust heads into the kernel?

Posted Apr 21, 2021 6:49 UTC (Wed) by rahix (guest, #136292) [Link] (24 responses)

I want to comment on this statement in particular:

> If the Rust compiler ends up doing hidden allocations, and they then cause panics [...]

In contrast to C++, Rust does not have such a thing as a core-language allocation primitive (like C++'s `new` operator). As long as you don't link in the `alloc` crate, there is zero possibility for any kind of heap allocation to happen, much less a hidden one. If I understood correctly, this is the route Rust-for-Linux eventually wants to take: The standard `alloc` crate is not used and instead similar APIs are built for the kernel, where all allocating functions are fallible (= they return a Result<Value, AllocationFailure>). This will then mean a caller must consider what their code should do in case of an allocation failure, and no panics are possible or needed. Even better: Due to Rust's Result type you cannot even forget to deal with this possibility, because you cannot access the returned value until you dealt with potential errors.
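What a fallible allocation looks like to the caller can be sketched with the userspace standard library. This uses `Vec::try_reserve_exact` and `TryReserveError` from std as stand-ins and is not the kernel API the project is building; `zeroed_buffer` is a hypothetical helper.

```rust
use std::collections::TryReserveError;

// Hypothetical helper: allocate a zero-filled buffer, reporting failure
// to the caller instead of panicking or aborting.
fn zeroed_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // The only step that can fail; the error is returned, not hidden.
    buf.try_reserve_exact(len)?;
    // Capacity is already reserved, so this does not need to allocate again.
    buf.resize(len, 0);
    Ok(buf)
}

fn main() {
    // The buffer is only reachable through the Ok arm, so the failure
    // case has to be considered before the value can be used.
    match zeroed_buffer(16) {
        Ok(buf) => println!("got {} bytes", buf.len()),
        Err(e) => println!("allocation failed: {}", e),
    }
    match zeroed_buffer(usize::MAX) {
        Ok(buf) => println!("got {} bytes", buf.len()),
        Err(e) => println!("allocation failed: {}", e),
    }
}
```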

Rust heads into the kernel?

Posted Apr 21, 2021 15:54 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (22 responses)

> As long as you don't link in the `alloc` crate, there is zero possibility for any kind of heap allocation to happen, much less a hidden one.
Then how do you explain that this simple test instantly runs out of memory ? And how should I test for the allocation since it seems to do it behind my back during the concatenation without me having any handle on it ?

   fn main() {
        let mut str1 = "A".to_string();
        let mut str2 = "B".to_string();

        loop {
                str1 = str1 + &str2 + &str2 + &str2 + &str2 + &str2 + &str2 + &str2 + &str2 + &str2 + &str2 + &str2 + &str2;
                str2 = str1.to_string();
        }
   }

Rust heads into the kernel?

Posted Apr 21, 2021 16:10 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

Well, you're linking to `alloc` there, so that'd be why. You'd need to remove `alloc` (which removes many of the stdlib containers/types, including probably String) which depend on it. That's the point of the kernel-specific stdlib (much as the kernel has its own libc).

Rust heads into the kernel?

Posted Apr 21, 2021 16:14 UTC (Wed) by rahix (guest, #136292) [Link] (20 responses)

The to_string() method is part of the `alloc` crate: https://doc.rust-lang.org/alloc/string/trait.ToString.html It won't be available when `alloc` is not linked.

I agree that _with_ liballoc it is sometimes hard to see where allocations happen, but that is no different from medium complex C libraries in userspace. Rust-for-Linux uses Rust in no_std mode which means libstd and liballoc are not linked in by default. Only by explicitly defining a global_allocator (https://github.com/Rust-for-Linux/linux/blob/9e6e67e06bdc...) are they making use of liballoc right now and AFAIK this is to be removed in favor of custom, fallible allocation APIs.

Rust heads into the kernel?

Posted Apr 21, 2021 16:23 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (19 responses)

OK so at the same time it will remove some of the eye-candy stuff that makes such modern languages appealing to newcomers, such as "look how strings are much easier to handle than in C". C has no problem with strings, they do not exist. Only the functions of the stdlib which use pointers to arrays of chars as strings are a total mess, but it's possible to write alternatives that provide checks and that will result in similar code as in other languages once controls are enabled :-/ It's just less convenient for the day-to-day work.

Rust heads into the kernel?

Posted Apr 21, 2021 18:03 UTC (Wed) by NYKevin (subscriber, #129325) [Link]

The purpose of Rust-in-the-kernel is not eye candy. It is improving memory and thread safety.

(Besides, concatenating strings in C requires sacrificing a goat to the kmalloc gods; I don't see how no-alloc Rust could possibly be any *worse* than that.)

Rust heads into the kernel?

Posted Apr 21, 2021 18:07 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

> OK so at the same time it will remove some of the eye-candy stuff that make such modern languages appealing to newcomers
I'm not sure about being able to easily concatenate strings with a "+" operator, but pretty much everything else is going to work fine. String algorithms either don't allocate or can be modified to accept an explicit allocator policy. This is simply a matter of writing the kernel stdlib.

You'll certainly be able to:
- Statically check the format string parameters.
- Pre-compile formatting into a fast bytecode.
- Compile regexes into Rust code (during compilation).
- Use the regular split/find/replace/prefix operations.

Rust heads into the kernel?

Posted Apr 23, 2021 1:47 UTC (Fri) by tialaramex (subscriber, #21167) [Link] (7 responses)

The overloadable operators in Rust are implemented as Traits.

So things you can Add have an implementation of add() such that the result of the function is the same type as the left hand side - and Rust will call that when you use the + operator. String implements add (using the allocator implicitly). Because the result must be the same type, you can't get a Result back and so we can't have allocation which might fail.

Presumably we should not like to have this implicit allocation in the Linux kernel, and so any hypothetical Rust KernelString won't implement Add and you won't be able to use the + operator on it. Whereas if some other KernelThing can reasonably have something added without implicit allocation then it can implement Add for whatever that is. The Traits used for overloading don't require that the left and right hand sides are the same type, so adding a_kernel_thing + 4 is allowed, if KernelThing feels it makes sense to implement Add<u8> or whatever type of integer the compiler can reasonably choose to believe 4 is.
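
A minimal sketch of that last point, assuming a made-up `KernelThing` type (the name comes from the comment above; the field and the helper function are hypothetical). It only illustrates that the right-hand side of `+` may differ from the left-hand side:

```rust
use std::ops::Add;

/// Hypothetical type standing in for the `KernelThing` mentioned above.
#[derive(Debug, Clone, Copy, PartialEq)]
struct KernelThing {
    value: u32,
}

// The right-hand side is `u8`, while the left-hand side and the output are
// `KernelThing`: the operands of `+` do not have to share a type.
impl Add<u8> for KernelThing {
    type Output = KernelThing;

    fn add(self, rhs: u8) -> KernelThing {
        // Nothing is allocated here, so there is no allocation failure to report.
        KernelThing {
            value: self.value.wrapping_add(u32::from(rhs)),
        }
    }
}

/// Adds the literal 4. `Add<u8>` is the only implementation, so the
/// compiler infers the literal to be a `u8`.
fn add_four(thing: KernelThing) -> KernelThing {
    thing + 4
}
```

Calling `add_four(KernelThing { value: 10 })` from a `main` function would produce a `KernelThing` holding 14.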

Rust heads into the kernel?

Posted Apr 26, 2021 12:47 UTC (Mon) by Jonno (subscriber, #49613) [Link] (6 responses)

> Because the result must be the same type, you can't get a Result back and so we can't have allocation which might fail.

It doesn't, so you can. The following code would implement String + &str -> Result<String, AllocError>, just replace ... with the gnarly details:

impl Add<&str> for String {
    type Output = Result<String, AllocError>;

    fn add(mut self, other: &str) -> Result<String, AllocError> { ... }
}

However, AddAssign does require that the output type is the same as the left hand side, so you can't implement += fallibly, but I would think it is obvious why.
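
For anyone who wants to try this outside the standard library: a crate cannot add its own `Add<&str>` implementation to `String`, because both the trait and the type are foreign to it. The sketch below therefore uses a hypothetical local wrapper, `KString`, and fills in the "gnarly details" with the userspace `String::try_reserve`. Treat it as an illustration of the `Output = Result<...>` idea, not as what a kernel string type would look like:

```rust
use std::collections::TryReserveError;
use std::ops::Add;

/// Hypothetical wrapper type; neither std nor the kernel crate defines it.
#[derive(Debug, PartialEq)]
struct KString(String);

impl Add<&str> for KString {
    // `Output` is chosen by the implementer, so it can be a `Result`.
    type Output = Result<KString, TryReserveError>;

    fn add(mut self, other: &str) -> Self::Output {
        // Reserve space first; this reports failure instead of aborting.
        self.0.try_reserve(other.len())?;
        // Enough capacity is now available, so this does not reallocate.
        self.0.push_str(other);
        Ok(self)
    }
}

/// Example use: the result of `+` has to be checked like any other `Result`.
fn greet() -> Result<KString, TryReserveError> {
    let hello = KString(String::from("Hello"));
    let greeting = (hello + ", ")?;
    greeting + "world"
}
```

One consequence of this design is that chained additions no longer read as `a + b + c`; each step needs a `?` or a `match`.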

Rust heads into the kernel?

Posted Apr 28, 2021 3:04 UTC (Wed) by tialaramex (subscriber, #21167) [Link] (5 responses)

Oops, I clearly don't understand Rust even as well as I thought I did, I had read Self::Output as implying you can't change this, when in fact as you say it specifically allows the implementer to pick any type for Output. Thank you for pointing that out.

So, at risk of being wrong again, surely with AddAssign the problem isn't so much that the output type is the same as the left hand side as that Rust itself enforces the assumption that AddAssign just directly changes the left hand side, rather than having any "output" at all ?

Rust heads into the kernel?

Posted Apr 28, 2021 8:00 UTC (Wed) by micka (subscriber, #38720) [Link] (4 responses)

If you can change the object operated on, AddAssign can do it.
Strings (for example) are immutable and `self` cannot be changed, so AddAssign will return a new object instead (common impl https://doc.rust-lang.org/src/core/ops/arith.rs.html#725 for many base types like i32 etc.)
See https://doc.rust-lang.org/src/alloc/string.rs.html#2029-2034 (and push_str uses https://doc.rust-lang.org/std/vec/struct.Vec.html#method.... which clones the String underlying Vec).

Rust heads into the kernel?

Posted Apr 28, 2021 8:04 UTC (Wed) by micka (subscriber, #38720) [Link]

Note that the rust library documentation https://doc.rust-lang.org/std/index.html is (in my opinion) very complete and the [src] links go to the implementation (same version as the doc you are reading).

Rust heads into the kernel?

Posted May 1, 2021 0:01 UTC (Sat) by tialaramex (subscriber, #21167) [Link] (2 responses)

I guess I still don't understand. How is the code you pointed to _returning_ a new object? The function is clearly defined without a return type, and indeed doesn't return anything [technically it returns the empty tuple], it just alters self to achieve its purpose.

In some languages the add-assign operator has a result type such that if you write a = (b+= 1) then both a, and b become b + 1. But in Rust that type is always the empty tuple, I just checked and as expected Rust allows me to do this, but not to any sort of integer variable a since (b+= 1) is an empty tuple and thus incompatible.
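
A small sketch of both observations, using a hypothetical `Counter` type: `add_assign` takes `&mut self` and declares no return type, and the `+=` expression itself has type `()`:

```rust
use std::ops::AddAssign;

/// Hypothetical type used only for this illustration.
#[derive(Debug, PartialEq)]
struct Counter(u32);

impl AddAssign<u32> for Counter {
    // No return type is declared: the method changes `self` in place.
    fn add_assign(&mut self, rhs: u32) {
        self.0 = self.0.wrapping_add(rhs);
    }
}

/// Increments the counter and hands it back.
fn bump(mut c: Counter) -> Counter {
    // The compound assignment is an expression of type `()`, so this
    // compiles, whereas `let n: u32 = c += 1;` would be rejected.
    let unit: () = c += 1;
    let _ = unit;
    c
}
```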

Rust heads into the kernel?

Posted May 1, 2021 11:14 UTC (Sat) by mathstuf (subscriber, #69389) [Link] (1 responses)

I think there was some confusion. I don't think `AddAssign` can be used where allocation is a failure on types which may need to allocate. I guess one could define it for `Result<String, AllocationError>` where if it fails, it turns into the Err variant, but that doesn't sound great. I think String just loses AddAssign in a fallible allocation world. Which I think is fine, but I am also fine with explicit error checking everywhere it is needed. I feel like the kernel would prefer calculating a single buffer size then allocating it all at once rather than building up a string piecemeal, but I don't know.
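
The "calculate a single buffer size, then allocate it all at once" approach could look roughly like the sketch below. It uses the userspace `String::try_reserve_exact` as a stand-in for whatever fallible allocation API the kernel ends up with, and `try_concat` is a made-up helper name:

```rust
use std::collections::TryReserveError;

/// Concatenates `parts` with a single, fallible allocation.
fn try_concat(parts: &[&str]) -> Result<String, TryReserveError> {
    // First pass: work out the total number of bytes needed.
    let total: usize = parts.iter().map(|p| p.len()).sum();

    let mut out = String::new();
    // The only allocation; failure is returned to the caller.
    out.try_reserve_exact(total)?;

    // Second pass: copy the pieces into the space reserved above.
    for part in parts {
        out.push_str(part);
    }
    Ok(out)
}
```

The error check happens exactly once, at the allocation, instead of after every append.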

And yes, it appears as though assignment does not return a reference to the object as it does in C and C++. One could write a small function which worked that way, but then you lose the `=` spelling of assignment. With lifetime tracking, getting it to work in practice is probably not that easy anyways outside of Copy types.

Rust heads into the kernel?

Posted May 2, 2021 5:47 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

OK great, I think we're both on the same page now, *phew*. I learned some things about Rust and how to interpret Trait definitions, which was valuable.

I'm excited to see kernel Rust once it does have its own alloc implementation.

Rust heads into the kernel?

Posted Apr 21, 2021 19:11 UTC (Wed) by excors (subscriber, #95769) [Link] (8 responses)

I think the significant difference is that Rust (like C++) lets you define a new string type with compile-time-constant capacity, which doesn't use the heap and can be allocated on the stack or as part of a larger object. Then you can define methods to make it almost as convenient as a dynamically-allocated string, with your preferred error-handling behaviour when the capacity is exceeded. And it's guaranteed to be correctly bounds-checked.

E.g. someone on crates.io wrote an "arraystring" crate which allows code like:

type MyString = ArrayString<U20>;
fn test() -> Result<MyString, arraystring::Error> {
    let mut s = MyString::try_from_str("Hello")?;
    s.try_push_str(" world!")?;
    let t = MyString::from_str_truncate("1234");
    s = s + &t + "56789";
    Ok(s)
}

The "try_" functions return a Result type that can indicate an overflow error, and the "?" will propagate the error (and the caller can handle it or propagate it again). The from_str_truncate and "+" can't fail, they are defined to truncate. All the storage is allocated on the stack (but you can't accidentally use a dangling pointer to an old stack frame, because the compiler will notice and complain; in this case it's returning by value so it's safe). Or if you want to save stack space you could let the caller pass in a mutable reference to a string object that was allocated elsewhere.

If I understand it correctly, you can also use s.as_str() to get a 'string slice' referring to the ArrayString's bytes, and most of Rust's string-using libraries will happily work with that string slice, exactly the same as if it came from a normal heap-allocated String. (In C terms a string slice is basically a char* plus a size, so it works anywhere that doesn't need to resize the string, which is most places.)

So there's some unavoidable inconvenience from having to think about how you're going to handle overflow, and you might have to implement the string library yourself if nobody else has made a suitable one yet, but it's still much simpler (and much safer) than string manipulation in C.
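
To make the idea concrete without depending on the arraystring crate, here is a minimal sketch of a fixed-capacity string built on const generics. `FixedString` and `CapacityError` are hypothetical names, and only the two operations discussed above are implemented:

```rust
/// Hypothetical fixed-capacity string stored inline, with no heap use.
struct FixedString<const N: usize> {
    buf: [u8; N],
    len: usize,
}

/// Returned when an append would exceed the capacity.
#[derive(Debug, PartialEq)]
struct CapacityError;

impl<const N: usize> FixedString<N> {
    fn new() -> Self {
        Self { buf: [0; N], len: 0 }
    }

    /// Appends `s`, or leaves the string unchanged and reports an error.
    fn try_push_str(&mut self, s: &str) -> Result<(), CapacityError> {
        let end = self.len.checked_add(s.len()).ok_or(CapacityError)?;
        if end > N {
            return Err(CapacityError);
        }
        self.buf[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }

    /// Borrows the contents as an ordinary string slice.
    fn as_str(&self) -> &str {
        // Only whole `&str` values are ever appended, so the used part of
        // the buffer is valid UTF-8.
        core::str::from_utf8(&self.buf[..self.len]).expect("buffer holds valid UTF-8")
    }
}

/// Builds a string on the stack and returns it by value.
fn greeting() -> Result<FixedString<20>, CapacityError> {
    let mut s = FixedString::<20>::new();
    s.try_push_str("Hello")?;
    s.try_push_str(" world!")?;
    Ok(s)
}
```

The `&str` returned by `as_str()` can be passed to any function that accepts a string slice, which is the interoperability point made above.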

Rust heads into the kernel?

Posted Apr 21, 2021 23:03 UTC (Wed) by roc (subscriber, #30627) [Link]

This is basically right, but one issue is that Rust strings are always valid UTF8, and it's not clear to me how often kernel strings are guaranteed to be valid UTF8. Maybe you'll end up using something like the `bstr` crate instead where strings don't have to be valid UTF8, in which case actual Rust `str` in the kernel might end up being rare.
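
A short sketch of the UTF-8 constraint: arbitrary bytes can always be held as a `&[u8]`, but turning them into a `&str` is a checked conversion that can fail. The `describe` helper is hypothetical:

```rust
/// Reports whether a byte string can be viewed as a Rust `str`.
fn describe(bytes: &[u8]) -> String {
    match std::str::from_utf8(bytes) {
        Ok(s) => format!("valid UTF-8: {}", s),
        Err(e) => format!("not UTF-8 after {} valid bytes", e.valid_up_to()),
    }
}
```

For example, the Latin-1 encoding of "café" (`b"caf\xe9"`) is rejected, while its UTF-8 encoding (`b"caf\xc3\xa9"`) is accepted.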

Rust heads into the kernel?

Posted Apr 23, 2021 12:24 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (6 responses)

> The "try_" functions return a Result type that can indicate an overflow error, and the "?" will propagate the error (and the caller can handle it or propagate it again).

I must say I really don't like that at all, it reminds me of the horribly broken concept of exceptions in certain languages that is responsible for the vast majority of backtraces produced all over the world every single second.

Errors don't exist, and as such they must not be propagated. What people often call errors are in fact the undesirable outcome of some operations, and very often there are valid alternatives to handle them. But some languages or frameworks like to make it easier to hire newbie developers by asking them to "only code what's on the diagram and not worry about errors as they will be caught" (heard many times in web development). In this case there's no attempt at falling back to a sane situation, you just pass the "error" to the caller so that your problem becomes someone else's.

And the farther you are from the failure the harder it is to try to gracefully address it. That's the segfault or the python calltrace basically.

One important principle in computer safety is to always reject the slightest piece of code which contains exception handling or too generic error handling, because it then becomes very hard to figure if some errors were not specifically handled. Each failure to do something needs to be addressed, if possible, gracefully.

I know it probably means nothing to a number of people who will find me obsessed with reliability and trustable control, but the day your father's pacemaker spends its time rebooting because its battery's voltage reads too low a value due to environmental conditions instead of just slowing down the valve to preserve resources, you may think again about this "propagate error" vs "deal with the error right here" discussion. And once code starts to be written with exceptions instead of error handling, it's impossible to fix it.

Rust heads into the kernel?

Posted Apr 23, 2021 13:20 UTC (Fri) by jpfrancois (subscriber, #65948) [Link] (3 responses)

But handling error in C code is not mandatory, so failing to handle error can happen also in C.
Ignoring error or Ignoring exception can be equally bad.

Error Handling code is easier to write and read if all you have to do is actually handling the error, instead of handling the error + adding code to handle the resource deallocation.

Error handling in the kernel happens because of code being reviewed or being written by skilled persons. I can't see why the same thing would not happen in rust.
I also remember reading an article here on LWN about how a lot of the error paths are broken. Maybe because having to write both an error handling code + a resource management code is more prone to failure...

Rust heads into the kernel?

Posted Apr 23, 2021 14:48 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (2 responses)

> But handling error in C code is not mandatory, so failing to handle error can happen also in C.
> Ignoring error or Ignoring excpetion can be equally bad.

I totally agree and these are indeed among the sources of bugs.

> Error Handling code is easier to write and read if all you have to do is actually handling the error, instead of handling the error + adding code to handle the resource deallocation.

Except when you do not want to just fall back to a default "unwind" and provide more info instead. You know, the typical problem that plagues most applications that cannot report clean logs and instead lazily dump kilometers of backtraces.

> Error handling in the kernel happens because of code being reviewed or being written by skilled persons. I can't see why the same think would not happen in rust.

I'm not saying it could not happen, I'm saying that most of the benefits that are expressed in favor of Rust regularly end up as "we could indeed avoid doing this in the kernel". In the end what's the real benefit if 50% of the code runs under "unsafe", unwinding is not used because we want clean error handling, automatic allocations are not possible, etc ? I'm *not* saying it's a bad language, I personally find it particularly complex to learn and do see the claimed benefits as inconveniences. It's just that I find that a number of the fantastic benefits to expect from such a migration will very likely not be met in the end and that the only result is that we'll have even more complexity and less maintainable code.

> I also remeber reading an article here on LWN about how a lot of the error path are broken. Maybe because having to write both an error handling code + a resource management code is more prone to failure...

But don't you find it shocking as a developer to decide to allocate resources if you're not certain you'll be able to release them ? I mean, I'm not saying it's trivial, I'm saying it has to be expected to deal with abnormal conditions. Writing code takes time. For sure if your boss asks to cut the development time in half it's possible by getting rid of error handling. But that's generally not the approach that goes into the kernel.

Last, error handling in low-level code may require very special conditions that a compiler will not magically guess. For example when failing to allocate some memory, you generally don't want to lazily fail. You want to look around, try to flush some buffers etc before retrying. You may even prefer to postpone processing and wait until the expected resources are available. It may also be possible that some of the half-initialized stuff cannot be restored by just undoing the init code. You may have to respect a certain sequence. For example the zlib always performs exactly 5 calls to its allocator, whether they're successful or not. This way you can implement your own using pools of fixed sizes. In case of allocation failure you must not at all change anything in this sequence and you want it to continue to allocate till the end of the sequence because otherwise you will desynchronize the allocator by freeing in the middle of a sequence.

Rust heads into the kernel?

Posted Apr 23, 2021 16:32 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> In the end what's the real benefit if 50% of the code runs under "unsafe"

While the amount is non-zero, I'd be surprised if it was that high. Actual experiments will be needed to not have us just be slinging gut-check numbers around.

> unwinding is not used because we want clean error handling

Unwinding is always used (because RAII). But since error handling is *explicit* in Rust, each call site can decide whether "this can go up -> use `?`" to "this is complicated -> match and handle". You'll never need to worry about an uncaught exception ripping through your stack frame in Rust (well, panics, but that is analogous to BUG_ON() and undefined behavior on the C side). Either you can *see* the handoff of the error or you can *see* the explicit error handling. Which is more useful where is subject to review, as is any other use of "fancy" language constructs.

> automatic allocations are not possible

Sure, the stdlib has bits that aren't useful in the kernel. I don't think anyone would be surprised about that. Same thing with C++: std::string is basically unusable in the kernel too due to its reallocation behaviors.

> error handling in low-level code may require very special conditions that a compiler will not magically guess. For example when failing to allocate some memory, you generally don't want to lazily fail.

In this code, yes, handle the error explicitly. If you're in a driver? it's not usually your problem at that layer and you can hand it off to the parent (with annotations if suitable). Maybe your driver *is* important to the memory allocation pipeline though. Then you handle it.
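
The two choices at a call site can be sketched as follows. The function names and the fallback value are made up for illustration; the point is only that both the hand-off (`?`) and the local handling (`match`) are visible in the source:

```rust
use std::num::ParseIntError;

/// "This can go up": the `?` makes the hand-off to the caller explicit.
fn parse_level(text: &str) -> Result<u8, ParseIntError> {
    let level = text.trim().parse::<u8>()?;
    Ok(level)
}

/// "This is complicated": the failure is handled right here instead.
fn level_or_default(text: &str) -> u8 {
    match parse_level(text) {
        Ok(level) => level,
        // Hypothetical policy: fall back to the lowest level on bad input.
        Err(_) => 0,
    }
}
```

Neither function can fail silently: `parse_level` advertises the error in its return type, and `level_or_default` shows what it does about it.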

Rust heads into the kernel?

Posted Apr 23, 2021 17:35 UTC (Fri) by Wol (subscriber, #4433) [Link]

> In the end what's the real benefit if 50% of the code runs under "unsafe",

Well, if you translate C to Rust, and have to put 50% of it inside "unsafe{}" blocks to get it to compile, it's a big improvement on the C where it's 100% unsafe.

And as others have pointed out, most of that unsafe code will consist of "unsafe{ call C function }". Which again is an improvement on the 100% unsafe C code. I think it's a pretty safe bet that 99% of the *Rust* code will not be unsafe{} blocks. So migrating to Rust will get rid of loads of bugs, as all the unsafe blocks are reviewed and made safe.
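
A sketch of the "unsafe{ call C function }" pattern. To keep it self-contained, `c_style_sum` is a hypothetical stand-in written in Rust for a C function that takes a pointer and a length; a real binding would be declared through FFI instead:

```rust
/// Stand-in for a C function: trusts the caller about `ptr` and `len`.
///
/// # Safety
/// `ptr` must point to `len` readable, initialized `u32` values.
unsafe fn c_style_sum(ptr: *const u32, len: usize) -> u32 {
    let mut total: u32 = 0;
    for i in 0..len {
        // SAFETY: the caller guarantees `ptr..ptr + len` is readable.
        total = total.wrapping_add(unsafe { *ptr.add(i) });
    }
    total
}

/// Safe wrapper: the only `unsafe` block that needs review lives here.
fn sum(values: &[u32]) -> u32 {
    // SAFETY: a slice always supplies a valid pointer and matching length.
    unsafe { c_style_sum(values.as_ptr(), values.len()) }
}
```

Callers of `sum` cannot pass a mismatched pointer and length, so the precondition is enforced by the type of the argument.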

Cheers,
Wol

Rust heads into the kernel?

Posted Apr 23, 2021 13:23 UTC (Fri) by hummassa (subscriber, #307) [Link]

> I know it probably means nothing to a number of people who will find me obsessed with reliability and trustable control, but the day your father's pacemaker spends its time rebooting because its battery's voltage reads too low a value due to environmental conditions instead of just slowing down the valve to preserve resources, you may think again about this "propagate error" vs "deal with the error right here" discussion. And once code starts to be written with exceptions instead of error handling, it's impossible to fix it.

There is absolutely NOTHING in "do it explicitly with ifs and gotos" that would prevent the pacemaker from rebooting over and over. I would even argue that if the dev put his "if less than zero, goto unwind" in blindly (as some of us are wont to do, just being human and all) your bug is even HARDER to find. Ah, and there is a "slippery slope" fallacy at the end of your argument.

Rust heads into the kernel?

Posted Apr 23, 2021 13:57 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

One idiom that is popular in Rust (and I use myself) is to have a specific error type for a given class or group of operations. This way, if an error comes through, it must be transformed into the set of errors that the operations declare. If there is an easy automatic transformation available, that can happen still and is wrapped up with evidence of the cause. This can include "something happened with a path" and the error representation requires that the path be injected into the error type as well. You can't forget to add it because it's a compile error to do so. So yes, the example given is not idiomatic Rust. I would instead have something like:

enum OpErr {
  InvalidFlag { mask: u32 },
  ConflictingFlags { /* something to tell about conflicting flags */ },
  RequiredDriverNotLoaded,
}

impl OpErr {
  fn as_errno(&self) -> Errno {
    match self {
      Self::InvalidFlag { .. } => Errno::EINVAL,
      Self::ConflictingFlags { .. } => Errno::EINVAL,
      Self::RequiredDriverNotLoaded => Errno::ENOTSUP,
    }
  }
}

This way the caller can have code that reads well ("oh, invalid flag, let's make a better error message because we got that from our caller") versus the oh-so-useful "invalid value" and you need to guess which one it means today.
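
The "easy automatic transformation" mentioned above is usually a `From` implementation, which `?` applies on its own. Here is a self-contained sketch with a hypothetical `ConfigError` type and a made-up limit; it is separate from the `OpErr` example, which depends on an `Errno` type not shown here:

```rust
use std::num::ParseIntError;

/// Hypothetical error type for one group of operations.
#[derive(Debug, PartialEq)]
enum ConfigError {
    /// Wraps the underlying cause as evidence.
    BadNumber(ParseIntError),
    /// Cannot be constructed without supplying the offending value.
    OutOfRange { value: u32, max: u32 },
}

// The automatic transformation: `?` uses this to convert the error.
impl From<ParseIntError> for ConfigError {
    fn from(cause: ParseIntError) -> Self {
        ConfigError::BadNumber(cause)
    }
}

fn parse_queue_depth(text: &str) -> Result<u32, ConfigError> {
    const MAX: u32 = 1024; // assumed limit, for illustration only
    // A `ParseIntError` becomes a `ConfigError` through `From`.
    let value: u32 = text.trim().parse()?;
    if value > MAX {
        // Leaving out `value` or `max` here would be a compile error.
        return Err(ConfigError::OutOfRange { value, max: MAX });
    }
    Ok(value)
}
```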

Rust heads into the kernel?

Posted Apr 22, 2021 1:01 UTC (Thu) by hummassa (subscriber, #307) [Link]

https://en.cppreference.com/w/cpp/freestanding

Rust heads into the kernel?

Posted Apr 21, 2021 9:30 UTC (Wed) by fenncruz (subscriber, #81417) [Link] (3 responses)

> /dev/zero replacement

I am not a kernel developer but is there a lot going on behind-the-scenes for something that only needs to return null and thus would benefit from being in rust? Or was this suggested as it is something very simple and could be used as a demonstration of rust's abilities?

Rust heads into the kernel?

Posted Apr 30, 2021 14:31 UTC (Fri) by alexander.batischev (guest, #122369) [Link] (2 responses)

I'm not a "real" kernel developer either, but from what I understand, /dev/zero was mentioned because it's simple yet touches a few subsystems. If you look at the function that serves a read(), you'll find that it uses:

- data structures associated with character devices (for which there are Rust wrappers already, since the RFC contains an example of a chardev implementing a semaphore);
- a min_t macro that'll have to be re-written in Rust;
- a clear_user call, which is implemented differently on each architecture, so bindings are needed. This is also an opportunity to introduce some safety, e.g. checking that the pointer is a userspace pointer (the C code seems to have this already, since it annotates the pointer with __user);
- calls to the signal_pending and cond_resched functions, which are part of the scheduler, again requiring more bindings.

I wrote a couple simple kernel drivers (proprietary, so any real kernel developer will rightfully call me a leech), and I can attest that chardev stuff and some access to the scheduler are the pretty common things to use; they come in handy in many simpler drivers. Adding Rust wrappers for those facilities will simultaneously demonstrate how wrappers look, and pave a path for many, many simple-yet-useful drivers to be written in Rust.
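
As a purely userspace analogue of the read path described above (this is not the kernel API, and it leaves out clear_user, signal_pending and cond_resched entirely), a zero-producing reader can be written against the standard `Read` trait. `Zero` and `read_zeros` are hypothetical names:

```rust
use std::io::{self, Read};

/// Userspace stand-in for a device that yields an endless stream of zeros.
struct Zero;

impl Read for Zero {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Fill whatever buffer the caller supplied and report its length.
        buf.fill(0);
        Ok(buf.len())
    }
}

/// Reads exactly `n` bytes from the stand-in device.
fn read_zeros(n: usize) -> io::Result<Vec<u8>> {
    // Start with non-zero bytes so the effect of the read is visible.
    let mut out = vec![0xff; n];
    let mut dev = Zero;
    dev.read_exact(&mut out)?;
    Ok(out)
}
```

The kernel version differs mainly in where the buffer lives: it is a userspace pointer that has to go through the bindings listed above rather than a plain slice.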

Rust heads into the kernel?

Posted Apr 30, 2021 15:14 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (1 responses)

> a min_t macro that'll have to be re-written in Rust;

Is there a reason that Rust's built-in `min` function wouldn't be sufficient?

Rust heads into the kernel?

Posted Apr 30, 2021 15:35 UTC (Fri) by alexander.batischev (guest, #122369) [Link]

min in Rust requires arguments of the same type, so it can only replace the min macro. The min_t macro casts its arguments to the given type, so Rust will need a similar macro as well.
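
A rough sketch of such a macro, assuming it simply casts both arguments with `as` before comparing. The macro, the `PAGE_SIZE` constant and `chunk_len` are hypothetical and not taken from the Rust-for-Linux tree:

```rust
/// Casts both arguments to the given type, then takes the minimum.
/// An `as` cast can truncate, so the caller has to pick a wide enough type.
macro_rules! min_t {
    ($t:ty, $a:expr, $b:expr) => {
        core::cmp::min($a as $t, $b as $t)
    };
}

/// Assumed page size, for illustration only.
const PAGE_SIZE: u32 = 4096;

/// Limits a requested byte count to one page, mixing `u64` and `u32` inputs.
fn chunk_len(count: u64) -> usize {
    min_t!(usize, count, PAGE_SIZE)
}
```

Calling `core::cmp::min(count, PAGE_SIZE)` directly would not compile here, because the two arguments have different integer types.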

Rust heads into the kernel?

Posted Apr 21, 2021 12:54 UTC (Wed) by flussence (guest, #85566) [Link] (26 responses)

It would make a lot of sense to do this the way ALSA replaced OSS - port a few drivers at a time, but leave the originals there for the 15 years or so grace period that got...

Rust heads into the kernel?

Posted Apr 21, 2021 16:17 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (25 responses)

> It would make a lot of sense to do this the way ALSA replaced OSS - port a few drivers at a time, but leave the originals there for the 15 years or so grace period that got...

Indeed, and this also leaves the benefit that in 15 years, after the language has been completely abandoned due to becoming too tricky to use with all the necessary tricks added to make it suitable for the task, the original code will still be there to serve as a reference to be reimplemented in the new $language_of_the_year.

All languages are good... until you try to expand them to areas they were not initially designed for. Design a language to write a browser, it will probably shine there. If you want to write operating systems, use a language designed for this, not the one designed to write browsers. It just happens C was designed for this purpose and it should not be a surprise that it still rules in this area 50 years later despite its numerous shortcomings.

Rust heads into the kernel?

Posted Apr 21, 2021 17:57 UTC (Wed) by josh (subscriber, #17465) [Link] (18 responses)

Rust isn't "designed to write browsers". Rust is designed to be a general-purpose systems language for everything from kernels/firmware/embedded to browsers to WebAssembly. There is, by design, no "gap" between assembly and Rust.

In any case, one of the critical values of the Linux kernel is that it evolves over time. If Rust continues to fit well as a powerful safe language for systems programming, as I and many others hope it will, it'll make sense to have increasing parts of the kernel written in it. If some other language replaced Rust and became seen as a preferable language for writing systems code like the Linux kernel in, it would be because that language came up with a way to solve kernel problems substantially better than Rust.

Only time will tell what actually happens. But I don't think "another language might hypothetically come along in the future that's even better" is a good argument against adopting a language that's a substantial improvement over the current state, assuming people agree that it's a substantial improvement.

I also don't think there's value in maintaining parallel versions of the same code.

Rust heads into the kernel?

Posted Apr 21, 2021 19:07 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (17 responses)

> Only time will tell what actually happens.

I agree on this.

> But I don't think "another language might hypothetically come along in the future that's even better" is a good argument against adopting a language that's a substantial improvement over the current state, assuming people agree that it's a substantial improvement.

That's not my point. My point is I suspect that a few years from now it will have changed so much to adapt to the real needs that it will be abandoned (except by a few fanboys) and will end up like plenty of previous languages that lasted only a decade.

> I also don't think there's value in maintaining parallel versions of the same code.

If that's the only way to keep people who understand some code parts that could end up being abandoned, there definitely is some value. If it takes a decade to reach the same level of stability or performance as the original ones, it makes sense. If the required toolchains end up being a total mess to deploy in field, it makes sense as well. The new IDE driver probably took a decade to replace the old one and there's nothing wrong with this. As you said, time will tell.

Rust heads into the kernel?

Posted Apr 21, 2021 22:07 UTC (Wed) by josh (subscriber, #17465) [Link]

> My point is I suspect that a few years from now it will have changed so much to adapt to the real needs that it will be abandonned (except by a few fanboys)

That's certainly a viewpoint one could hold. I can understand where that assumption would come from, given that there have been a large number of languages that thought they could replace C, many of which don't exist anymore. I think Rust specifically fits in a way all of those languages didn't, but rather than try to argue the case for that here, I'll stick with "time will tell", and continue working to make sure the language adapts to the needs of various real users, including kernel developers.

Rust heads into the kernel?

Posted Apr 21, 2021 23:18 UTC (Wed) by roc (subscriber, #30627) [Link] (15 responses)

Rust 1.0 came out nearly six years ago so we're only four years away from a decade. It seems implausible to me that Rust adoption is going to turn from "rapidly increasing" to "abandoned" in the next four years.

> becoming too tricky to use with all the necessary tricks added to make it suitable for the task

I wonder what you have in mind here. I can't think of any examples of this over the last several years. In fact Rust has become less tricky to use, for example supporting non-lexical lifetimes. Rust community commitment to backwards compatibility is high so even if some new features (const generics, generic associated types?) make the language more complex, you don't have to use them.

Rust heads into the kernel?

Posted Apr 22, 2021 2:54 UTC (Thu) by wtarreau (subscriber, #51152) [Link] (14 responses)

> > becoming too tricky to use with all the necessary tricks added to make it suitable for the task

> I wonder what you have in mind here. I can't think of any examples of this over the last several years.

I mean that gcc and even C evolved when facing the reality of the linux kernel. When 1 out of 10 lines was in fact an inline function calling an asm statement it shows there were some limitations. I don't see how Rust will avoid this. There will be tons of "unsafe" blocks everywhere making the code very hard to read and preventing the compiler from holding its promises, and I suspect some new constructs will have to be invented to improve this situation, and that the language will be less appealing to newcomers.

Quite frankly, look at Linux's C code. There are tons of READ_ONCE(), likely(), atomic_inc(), smp_rmb(), readb(), iowrite{8,16,32}(), div64_* etc everywhere that rely on asm and need to be applied not directly because of the language but because of the underlying hardware constraints, that make C code less easy to write. I don't see why it wouldn't be the case in another language since it's not a matter of expressing code, but a matter of semantics. What Rust proponents complain about in C will quickly arrive in their language.

Rust heads into the kernel?

Posted Apr 22, 2021 3:55 UTC (Thu) by Wol (subscriber, #4433) [Link]

> Quite frankly, look at Linux's C code. There area tons of READ_ONCE(), likely(), atomic_inc(), smp_rmb(), readb(), iowrite{8,16,32}(), div64_* etc everywhere that rely on asm and need to be applied not directly because of the language but because of the underlying hardware constraints, that make C code less easy to write.

And if Rust forces programmers to step back, and think about their code, and NOT FORGET all of this stuff, then that makes it the better language.

And how much of that stuff, also, is not there because C can't do it natively, but because given the chance the C optimiser will screw things up? How much of that stuff is there to get round undefined or implementation-defined behaviour?

Cheers,
Wol

Rust heads into the kernel?

Posted Apr 22, 2021 11:53 UTC (Thu) by moltonel (guest, #45207) [Link] (5 responses)

The fact that kernel Rust will probably also need to handle similar READ_ONCE complications as kernel C doesn't make Rust less appealing. Rust will retain its borrow checker, sum types, safe/unsafe split, and other features that make it desirable. It already handles some very low-level hardware weirdness, and it's well equipped to accommodate whatever weirdness the kernel needs without turning the language into an unrecognizable mess.

It seems likely that initial kernel Rust code will have more 'unsafe' and less convenience than we can hope for 10 years down the line. If 10% of kernel "C functions" were initially asm, we can expect similar growing pains with Rust. So what ? Nobody is claiming that enabling Rust in the kernel will enable all Rust developers to become kernel developers. The main goal of Rust in the kernel is to make the kernel code better, not to attract Rust fans and newcomers.
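
For a feel of what the nearest standard-library building blocks look like, here is a sketch with helpers named after two of the C macros discussed above. These are userspace analogues only: how kernel Rust will actually express READ_ONCE() or atomic_inc(), and how that maps onto the kernel memory model, is not settled by anything in this thread.

```rust
use std::ptr;
use std::sync::atomic::{AtomicU32, Ordering};

/// Rough analogue of atomic_inc(): an atomic read-modify-write.
/// Returns the value after the increment.
fn atomic_inc(counter: &AtomicU32) -> u32 {
    counter.fetch_add(1, Ordering::Relaxed).wrapping_add(1)
}

/// Rough analogue of READ_ONCE(): a volatile load that the compiler
/// may not elide or duplicate.
fn read_once(value: &u32) -> u32 {
    // SAFETY: a reference is always valid, aligned and initialized.
    unsafe { ptr::read_volatile(value) }
}

/// Increments a fresh counter `n` times and returns the final value.
fn count_to(n: u32) -> u32 {
    let counter = AtomicU32::new(0);
    let mut last = 0;
    for _ in 0..n {
        last = atomic_inc(&counter);
    }
    last
}
```

The `unsafe` is confined to one line inside `read_once`; its callers stay in safe code.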

Rust heads into the kernel?

Posted Apr 22, 2021 20:11 UTC (Thu) by wtarreau (subscriber, #51152) [Link] (4 responses)

> Nobody is claiming that enabling Rust in the kernel will enable all Rust developers to become kernel developers.

Sorry but this is exactly how I understood it: make it easier to write drivers (implicit "for rust developers" as not everyone is fluent in it and even those used to it say the learning curve is pretty steep).

> The main goal of Rust in the kernel is to make the kernel code better, not to attract Rust fans and newcomers.

Better in terms of what ? Maintainability, with only 1% of kernel developers being able to review and fix that code instead of the previous 100% ? Ease of use, with everyone having to install two separate toolchains, some combinations of which will possibly cause trouble that will have to be detected and rejected ? Portability, with a number of existing architectures not even being implemented by the language ? Fame and ego for rust fanboys for having won a victory over Linux, maybe.

Rust heads into the kernel?

Posted Apr 22, 2021 20:36 UTC (Thu) by josh (subscriber, #17465) [Link]

You seem to be going out of your way to find the *least charitable interpretation possible* for everything. "fame and ego", really?

We wouldn't be working to enable Rust in the kernel if we didn't think it was an improvement *for the kernel*. The goal is to provide safety and productivity improvements. If you believe that Rust isn't actually achieving those goals, or that it does but that there are other factors outweighing that, by all means you're welcome to make the case for that.

Half your arguments would apply to *any* non-C language, no matter how much of an improvement it would otherwise be. Those arguments would effectively say that the kernel can never consider anything but C, and that "it isn't C" is a factor outweighing other considerations in any language evaluation.

Yes, a new toolchain and a new language have a baseline cost; the only reason Rust is being seriously considered is because *despite* that cost many people still think it'd be an improvement. I have confidence that kernel developers are quite *capable* of using something new, if it's deemed worthwhile. That evaluation, of whether it's actually worthwhile, will be an ongoing process. I'm certain that one of many factors in that evaluation will be "is this not just better, but substantially better, to the point that it's worth the expectation that kernel developers will have to deal with it in the kernel tree". That's a very reasonable question, and it's not one with a pre-determined answer.

Rust heads into the kernel?

Posted Apr 22, 2021 20:38 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (1 responses)

> Maintainability, with only 1% of kernel developers being able to review and fix that code instead of the previous 100% ?

Please don't just make up numbers. I wouldn't trust 100% of kernel developers reviewing sched, mm, rcu, or arch code because those domains are way more specific than just "it's C code, I'm a C programmer, I'll be fine". If maintainers of a domain don't want to learn Rust, then don't let it into your corner. Right now, it appears as though it will just be for device drivers. Over time, sure, maybe it'll do more. I don't foresee the wisdom of subsystem maintainers all of a sudden becoming untrustworthy when it comes to judging whether introducing Rust into a corner is actually ready (no different than any other patch that shows up).

> Portability, with a number of existing architectures not even being implemented by the language ?

The language supports it. The prevailing toolchain doesn't support every arch. There are those working on getting GCC to compile Rust which should resolve this issue.

Rust heads into the kernel?

Posted Apr 22, 2021 22:23 UTC (Thu) by josh (subscriber, #17465) [Link]

> There are those working on getting GCC to compile Rust which should resolve this issue.

There's also a project to use GCC as the code-generation backend for the existing frontend (which will avoid code duplication and divergence), as well as efforts to add LLVM backends for more architectures that people care about (there's now an m68k backend, so props to the m68k people for putting forth that effort).

Rust heads into the kernel?

Posted Apr 23, 2021 1:18 UTC (Fri) by roc (subscriber, #30627) [Link]

> Maintainability, with only 1% of kernel developers being able to review and fix that code instead of the previous 100% ?

If you think 99% of kernel developers will be unable or uninterested in learning Rust then you have a dimmer view of kernel devs than I do.

Rust heads into the kernel?

Posted Apr 22, 2021 13:55 UTC (Thu) by mss (subscriber, #138799) [Link] (4 responses)

> What Rust proponents complain about in C will quickly arrive in their language.

Full agreement on that.
It looks like even basic Rust standard library can't be reasonably implemented without a lot of "unsafe" blocks.

Which resulted in a hilarious number of recent memory-corruption CVEs for a language that has memory safety as its biggest selling point:
> CVE-2020-36317:
> In the standard library in Rust before 1.49.0, String::retain() function has a panic safety problem. It allows creation of a non-UTF-8 Rust string when the provided closure panics. This bug could result in a memory safety violation when other string APIs assume that UTF-8 encoding is used on the same string.
>
> CVE-2020-36318:
> In the standard library in Rust before 1.49.0, VecDeque::make_contiguous has a bug that pops the same element more than once under certain condition. This bug could result in a use-after-free or double free.
>
> CVE-2021-28875:
> In the standard library in Rust before 1.50.0, read_to_end() does not validate the return value from Read in an unsafe context. This bug could lead to a buffer overflow.
>
> CVE-2021-28877:
> In the standard library in Rust before 1.51.0, the Zip implementation calls __iterator_get_unchecked() for the same index more than once when nested. This bug can lead to a memory safety violation due to an unmet safety requirement for the TrustedRandomAccess trait.
>
> CVE-2021-28876:
> In the standard library in Rust before 1.52.0, the Zip implementation has a panic safety issue. It calls __iterator_get_unchecked() more than once for the same index when the underlying iterator panics (in certain conditions). This bug could lead to a memory safety violation due to an unmet safety requirement for the TrustedRandomAccess trait.
>
> CVE-2021-28878:
> In the standard library in Rust before 1.52.0, the Zip implementation calls __iterator_get_unchecked() more than once for the same index (under certain conditions) when next_back() and next() are used together. This bug could lead to a memory safety violation due to an unmet safety requirement for the TrustedRandomAccess trait.
>
> CVE-2021-28879:
> In the standard library in Rust before 1.52.0, the Zip implementation can report an incorrect size due to an integer overflow. This bug can lead to a buffer overflow when a consumed Zip iterator is used again.
>
> CVE-2021-31162:
> In the standard library in Rust before 1.52.0, a double free can occur in the Vec::from_iter function if freeing the element panics.

For extra fun try to find mentions of them in the Rust release notes page:
https://github.com/rust-lang/rust/blob/master/RELEASES.md

Rust heads into the kernel?

Posted Apr 22, 2021 16:22 UTC (Thu) by kleptog (subscriber, #1183) [Link] (2 responses)

> It looks like even basic Rust standard library can't be reasonably implemented without a lot of "unsafe" blocks.

That's to be expected though. The standard library provides the abstractions so that the programs using them don't need any unsafe blocks. This works quite well in practice.

I would expect the kernel implementation to be the same. There would be a core that provides the abstractions (containing many unsafe blocks) and the drivers using these abstractions would be without any unsafe blocks. That would be ideal, the question is how close they can get to this goal.
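A minimal sketch of that layering, assuming a made-up `Buffer` type and `fill` function (neither comes from the Rust for Linux patches): the core type confines its `unsafe` to two commented lines behind a checked API, and the "driver" side calls it without writing `unsafe` at all.

```rust
/// Hypothetical fixed-size buffer; not part of any real kernel API.
pub struct Buffer {
    data: [u8; 8],
    len: usize, // invariant: len <= data.len()
}

impl Buffer {
    pub fn new() -> Self {
        Buffer { data: [0; 8], len: 0 }
    }

    /// Safe: refuses to push past capacity rather than writing out of bounds.
    pub fn push(&mut self, byte: u8) -> Result<(), u8> {
        if self.len == self.data.len() {
            return Err(byte);
        }
        // SAFETY: len < data.len() was checked just above.
        unsafe { *self.data.get_unchecked_mut(self.len) = byte };
        self.len += 1;
        Ok(())
    }

    /// Safe: only the initialized prefix is ever exposed.
    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: the invariant len <= data.len() holds for every Buffer.
        unsafe { self.data.get_unchecked(..self.len) }
    }
}

/// "Driver" code: uses the abstraction without writing `unsafe` itself.
/// Returns how many bytes were accepted before the buffer filled up.
pub fn fill(buf: &mut Buffer, bytes: &[u8]) -> usize {
    bytes.iter().take_while(|&&b| buf.push(b).is_ok()).count()
}
```

If the two `unsafe` lines are wrong, every caller is affected, but they are also the only lines that need auditing for memory safety.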

Rust heads into the kernel?

Posted Apr 22, 2021 20:06 UTC (Thu) by wtarreau (subscriber, #51152) [Link] (1 responses)

So that's great, instead of having 5% of the drivers which are vulnerable to certain bugs, we'd have 100% of the rust ones because the libs themselves will not be safer than the example above and will not be audited by those who are used to dealing with operating system issues.

It could very well be that Linux is the last project adopting rust in the end... That's a dangerous move for it.

Rust heads into the kernel?

Posted May 17, 2021 9:06 UTC (Mon) by tao (subscriber, #17563) [Link]

So in essence you're arguing that it'd be better to have every C driver implement their own versions of kmalloc, printk, etc. because that'd only make 5% of the drivers vulnerable to certain bugs instead of 100% of them if those functions were to have issues?

To answer your question, yes, it's a lot better to have 100% of the Rust drivers vulnerable because they all use the same function if that means that 100% of the drivers are simultaneously fixed when an issue is remedied in the common library, rather than having each driver have its own implementation.

I know there are people who think static linking and bundling is a good idea, but I certainly don't belong to that camp.

Rust heads into the kernel?

Posted Apr 23, 2021 1:25 UTC (Fri) by roc (subscriber, #30627) [Link]

The Rust community issues a CVE whenever they find that an API not marked "unsafe" can be (ab)used to do something unsafe. In practice, though, you usually have to write very specific code to trigger that unsafe behavior. It hardly ever affects the behavior of real programs. In 5 years of full-time work on Rust code I can think of only one or two cases where such a CVE actually affected our program.

This is a much higher bar than C even attempts to reach. The answer to "which C library APIs can be used to do something unsafe if I write malicious code that uses them?" is "pretty much all of them". It's hardly even a well-formed question.

Rust heads into the kernel?

Posted Apr 23, 2021 1:16 UTC (Fri) by roc (subscriber, #30627) [Link] (1 responses)

> There are tons of READ_ONCE(), likely(), atomic_inc(), smp_rmb(), readb(), iowrite{8,16,32}(), div64_* etc everywhere that rely on asm and need to be applied not directly because of the language but because of the underlying hardware constraints,

These are either already available in Rust or can mostly be added as library APIs. No real increase in language complexity or difficulty of writing Rust code.

> What Rust proponents complain about in C will quickly arrive in their language.

The existence of these features is not what "Rust proponents complain about in C".

But atomics and memory-mapped I/O do provide good examples of the deficiencies of C. Rust defines specific types for atomic variables (e.g. AtomicU32) and you *cannot* read or write them with non-atomic operations (without explicitly writing "unsafe"). That eliminates the class of "forgot to use an atomic operation" bugs. Something very similar can be done for memory-mapped I/O registers (and I assume has been for embedded Rust already).

Unfortunately C doesn't provide these guarantees.
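As a rough illustration of the atomics point, here is a small sketch using `AtomicU32` from the standard library (kernel code would use `core::sync::atomic` and its own threading primitives; the `count_events` function is made up for this example). Through a shared reference, the counter can only be touched with atomic operations.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Bumps a shared counter from several threads and returns the final value.
pub fn count_events(threads: u32, per_thread: u32) -> u32 {
    let counter = Arc::new(AtomicU32::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The only way to modify the value through a shared
                    // reference is an atomic operation such as fetch_add.
                    counter.fetch_add(1, Ordering::Relaxed);
                    // `*counter += 1;` would not compile: AtomicU32 has no
                    // `+=` and cannot be written through `&AtomicU32`.
                }
            })
        })
        .collect();
    for h in handles {
        h.join().expect("worker thread panicked");
    }
    // All workers have been joined, so every increment is visible here.
    counter.load(Ordering::Relaxed)
}
```

The equivalent C mistake, a plain `counter++` on a shared variable, compiles without complaint.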

Rust heads into the kernel?

Posted Apr 23, 2021 8:31 UTC (Fri) by farnz (subscriber, #17727) [Link]

Suitable tricks for MMIO have indeed been implemented by the Rust Embedded people - these tricks exploit the borrow checker to confirm invariants are met. For peripherals with access requirements (e.g. GPIOs, where output drive level is nonsense in input mode), there's also a common state machine pattern in Rust. There's a collection of interesting crates to look at if you want to see what embedded Rust people have done to maximise the benefit from using Rust.

And note that with unsafe you can still write code that has as few guarantees as C - it's just that Rust style encourages you to use a compile-time abstraction (no runtime cost) that turns access mistakes into compile failures.
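A minimal sketch of that state machine pattern, assuming a made-up `GpioPin` type whose `reg` field merely stands in for a memory-mapped register (real embedded crates use volatile accesses to device addresses and differ in detail). The pin's mode is a type parameter, so setting a drive level on an input pin is a compile error and costs nothing at run time.

```rust
use std::marker::PhantomData;

/// Marker types for the pin's mode; they exist only at compile time.
pub struct Input;
pub struct Output;

/// Hypothetical GPIO pin; `reg` is a stand-in for a hardware register.
pub struct GpioPin<Mode> {
    reg: u32,
    _mode: PhantomData<Mode>,
}

impl GpioPin<Input> {
    pub fn new() -> Self {
        GpioPin { reg: 0, _mode: PhantomData }
    }

    pub fn is_high(&self) -> bool {
        self.reg & 1 != 0
    }

    /// Consumes the input pin and hands back an output pin, so the old
    /// handle can no longer be used.
    pub fn into_output(self) -> GpioPin<Output> {
        GpioPin { reg: self.reg, _mode: PhantomData }
    }
}

impl GpioPin<Output> {
    /// Drive-level methods exist only on `GpioPin<Output>`.
    pub fn set_high(&mut self) {
        self.reg |= 1;
    }

    pub fn set_low(&mut self) {
        self.reg &= !1;
    }

    pub fn level(&self) -> bool {
        self.reg & 1 != 0
    }

    pub fn into_input(self) -> GpioPin<Input> {
        GpioPin { reg: self.reg, _mode: PhantomData }
    }
}
```

Calling `set_high()` on a `GpioPin<Input>` is rejected by the compiler because no such method exists for that type, and using a pin after `into_output()` is rejected by the borrow checker because the value has been moved.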

Rust heads into the kernel?

Posted Apr 21, 2021 22:44 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (5 responses)

> All languages are good... until you try to expand them to areas they were not initially designed for

C is no exception here. It was designed to write operating systems and it is still usable to write operating systems.

But even though C has evolved in these 40 years, so have operating systems, and I have serious doubts that C is still the best tool for an entire OS.

Rust heads into the kernel?

Posted Apr 22, 2021 6:30 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (4 responses)

> C is no exception here. It was designed to write operating systems and it is still usable to write operating systems.

C was designed to write operating systems, but on architectures where pointers are weird (e.g. segmented), you don't know how negative numbers are represented (e.g. one's complement), and you can't even assume that bytes have 8 bits.

For some godforsaken reason, the C standards committee slapped "it's not a bug, it's a feature!" labels on all of those things, and now we can't take them away or else the optimizer will get 2% worse.

Rust heads into the kernel?

Posted Apr 22, 2021 8:28 UTC (Thu) by pbonzini (subscriber, #60935) [Link] (1 responses)

It's much more than 2%. Sure you can write your code in such a way that the optimizer doesn't need it, but there are very low-level optimizations such as addressing mode selection that *rely* on some kinds of undefined behavior.

Others are completely asinine, such as undefined behavior on left-shift overflow (which includes left-shift of negative numbers, of course).

Rust heads into the kernel?

Posted Apr 25, 2021 16:57 UTC (Sun) by anton (subscriber, #25547) [Link]

> It's much more than 2%.

As usual, without any empirical data to back it up.

Wang et al (2012) actually have measured the "benefit" of assuming that undefined behaviour does not happen, on SPECint, resulting in SPECint score improvements of 1.1% for GCC and 1.7% for Clang. The same benefit could be gotten by changing the source code in two places in the whole benchmark suite.

[Wang et al 2012] Xi Wang, Haogang Chen, Alvin Cheung, Zhihao Jia, Nickolai Zeldovich, and M. Frans Kaashoek. Undefined behavior: What happened to my code? In Asia-Pacific Workshop on Systems (APSYS'12), 2012.

Rust heads into the kernel?

Posted Apr 22, 2021 10:33 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

Technically pre-89 C can support even ternary arithmetic.

Rust heads into the kernel?

Posted Apr 22, 2021 23:37 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

To the best of my understanding, pre-89 C is a euphemism for "We read the K&R book and hacked a compiler together without following a formal standard of any kind." I'm not aware of anybody using it other than the NetHack dev team,[1] although I would not be surprised if it was also in use by various financial institutions and/or IBM, because the use of software created after 1985 or thereabouts would appear to violate their religious beliefs.

[1]: https://nethack.org/devel/deprecation.html#361

Rust heads into the kernel?

Posted Apr 21, 2021 18:25 UTC (Wed) by ibukanov (subscriber, #3942) [Link] (6 responses)

Wuffs [1] is more appropriate for the kernel. It is much safer than Rust. For example, the compiler ensures that there is no integer overflow or out-of-bounds array access. This is in addition to ensuring that pointer manipulations are safe. Plus it compiles to plain C so there is no issue with unsupported architectures.

[1] https://github.com/google/wuffs

Rust heads into the kernel?

Posted Apr 21, 2021 18:59 UTC (Wed) by mpr22 (subscriber, #60784) [Link]

I dare say it would be, if it was production-ready.

When the latest version tag in the git tree is v0.3.0-beta.1, I feel it's safe to presume that the people responsible for developing it don't think it's production-ready.

Rust heads into the kernel?

Posted Apr 21, 2021 23:07 UTC (Wed) by roc (subscriber, #30627) [Link] (3 responses)

I don't think so.

> Wuffs programs take longer for a programmer to write, as they have to explicitly annotate their programs with proofs of safety.

> The idea isn't to write your whole program in Wuffs, only the parts that are both performance-conscious and security-conscious.

A lot of driver code is boring. That code needs to be easy to write. In Rust it can be easy, in Wuffs not so easy.

Rust heads into the kernel?

Posted Apr 21, 2021 23:09 UTC (Wed) by roc (subscriber, #30627) [Link]

Also

> No way to dynamically allocate or free memory.

Rust heads into the kernel?

Posted Apr 22, 2021 11:00 UTC (Thu) by ibukanov (subscriber, #3942) [Link] (1 responses)

Still, with Rust one gets panics on out-of-bounds access, plus there is no check for overflow in release builds. That suggests that maybe for drivers the whole code should be sandboxed using Wasm or similar. Then one can continue to write boring code in C or Rust with all bugs contained.

Rust heads into the kernel?

Posted Apr 22, 2021 16:31 UTC (Thu) by zlynx (guest, #2285) [Link]

The overflow and other debug checks can be enabled in Rust release builds if you want them.

As I understand it the Rust panic will be wired into a kernel BUG call. Which is what it ought to be. If the code somehow avoids the explicit range checks and still executes an array[out_of_bounds] operation then that really is a BUG, unlike a memory allocation failure.
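For reference, the build-wide switch is `overflow-checks = true` in a Cargo profile (or `-C overflow-checks=on` passed to rustc). Independently of build settings, code can state what it wants on overflow or on a bad index and avoid the panic path entirely. A small sketch, with function names invented for illustration:

```rust
/// Adds two lengths, reporting overflow instead of wrapping or panicking.
pub fn total_len(a: u32, b: u32) -> Option<u32> {
    a.checked_add(b)
}

/// Explicit wrap-around, identical in debug and release builds.
pub fn next_seq(seq: u8) -> u8 {
    seq.wrapping_add(1)
}

/// Bounds-checked lookup that returns None instead of panicking.
pub fn byte_at(buf: &[u8], idx: usize) -> Option<u8> {
    buf.get(idx).copied()
}
```

With these, an out-of-range value becomes an ordinary `None` that the caller has to handle, and a panic is left for cases that really are bugs.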

Wuffs

Posted Apr 23, 2021 3:01 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

Wuffs will cheerfully provide you with C, and presumably in the future if you wanted it could be altered to provide Rust although in most cases I can't see why you'd care since if you're just compiling it then C is easier, and if you _alter_ the code then the guarantees from Wuffs expire.

But Wuffs is a special purpose language whereas Rust isn't. For example when you ask Cargo for a new Rust program, the one you get says "Hello, World!". But Wuffs can't do that. Because that would involve I/O and Wuffs deliberately doesn't have I/O at all, it considers that to be orthogonal to its concerns entirely.

It is good for these special purpose languages to exist, particularly when they address some difficult and interesting problem such as "Wrangling Untrusted File Formats Safely". I should like to do this sometimes, and apparently Wuffs would help. But if my current problem is that my USB Foozle doesn't work, an "untrusted file format" is only at best a tiny fraction of my problem and Wuffs isn't interested in helping me with the rest of it. Whereas perhaps I can write a driver for the USB Foozle in Rust.

Today I can write a _userspace_ USB driver in Rust. Maybe my Foozle can be driven that way. If Linux Rust becomes a thing then that opens up the possibility of writing a kernel USB driver in Rust which is viable even for higher performance gizmos and is also desirable if a Foozle is important/ low-level enough that people don't really want to wait until the userspace spins up to have it working.

Rust heads into the kernel?

Posted May 3, 2021 9:32 UTC (Mon) by ksandstr (guest, #60862) [Link]

>The more you make it look like (Kernel) C, the easier it is for us C people to actually read. My eyes have been reading C for almost 30 years by now, they have a lexer built in the optical nerve; reading something that looks vaguely like C but is definitely not C is an utterly painful experience.

One point thus far unmade is that the Linux Kernel Style is outdated whether compared to Rust or not. This is to be expected of a thirty year old code base started when ANSI C was still a tad moist behind the ears. Since then there have been two significant standard revisions, both introducing various syntax and library updates of which the kernel style has been slow to adopt the former and generally ignores the latter[0].

It's not reasonable to expect a hojillion-SLOC program to undergo style churn to any degree, and matters like whether switch-cases should be indented or not shall remain a matter of taste forevermore. However, the kernel style (henceforth LKS) has several issues that, mostly consequent to influence of other languages, appear silly today: for example, LKS bars oneliners of the form "if((asdf, sdfg < 666) || (qwer && hjkl)) return PTR_ERR(EWOULDHORK);", instead preferring not even the Perl convention of extra curlies, but placing the return after the if and indenting it one level. Whether this is related to early-onset ergonomy trouble of the right thumb is anyone's guess.

Another example is LKS' propensity for spending vertical space willy-nilly instead of as a means to visually separate subprograms into a tripartite acquire-operate-release[1] form (or for telling apart subsections thereof), perhaps dating back from when text consoles had one or two fonts and no options for vertical spacing so programmers would compensate by adding spaces and linebreaks to improve superficial comprehensibility[2].

Moreover some syntax updates introduced in C99 have gone by the wayside; even today there's code being submitted that introduces variables at start-of-block when inline has been common preference for some 2/3rds of Linux being around, and (to my knowledge anyway) GCC 2.95 hasn't been a supported compiler for a dog's age. You'd also be hard pressed to find any use of struct literals or ephemeral dummies even as they improve both security and legibility by leaving no fields uninitialized as struct/union definitions grow and disallowing typo-based coding mistakes (i.e. the ones that bleary-eyed review misses) when the same pointer is dereferenced a bunch of times in sequence.

So whether Rust[3] is adopted or not, I for one hope the experiment at least leads to some revision in the Linux Kernel Style.

As for language extensions, that way lies the day when someone suggests adding an introspectable type system and deprecating implicit downcasts from void-pointers (we've all been down there, I hope) so that "C<>" can have Safe Iterators; and at that point you might as well follow the Ada rabbithole and start writing formal design proposals and other top-down necktie junk. That one proposal about scoped mutex unlocks is already incredibly horrible simply because tying mutex unlocks to scope exit is the smell of a function with another, smaller function in it, waiting to burst forth all over your breakfast spaghetti.

(Also, the LKS block comment style sucks basketballs wrapped in barbed wire through a garden hose, then pleads for seconds in the ECHR. You'd have to be coming off Borland's Turbo Space Crack to think it's a good idea even in 1992.)

[0] i.e. <stdint.h> types are ignored in favour of u8, u16, etc.; <threads.h> and <stdatomic.h> are right out; and who's even heard about <stdnoreturn.h>?
[1] read-compute-write, plan-setup-test-cleanup, disable-access-enable, etc.
[2] which is quite important for recognizing patterns at a glance, and telling whether something unusual is going on that ought to have a closer look right away
[3] or a future son-of-Rust, or Ada, or ATS, or whatever floats your boaty mcboatface