A pair of Rust kernel modules

Posted Sep 14, 2022 9:45 UTC (Wed) by wtarreau (subscriber, #51152)
In reply to: A pair of Rust kernel modules by lambda
Parent article: A pair of Rust kernel modules

> Effectively, what is forbidden in safe Rust is anything which could cause objects to be interpreted as the wrong type, accessed when they are not valid, or accessed in overlapping ways in space or time. This means no out of bounds access, use after free, data races (two threads accessing the same memory in ways that no linear interleaving could produce), iterator invalidation, etc. This is a category of bugs which are quite common in C and C++ programs, which are difficult to reason about because they effectively break the model of the programming language, and are quite commonly prone to exploitation by attackers.

But it could also be said that several other categories of bugs are avoided in C thanks to the language being quite primitive, reading fairly well, and being suitable for peer review. Do you have an estimate of the increase in logic or algorithmic bugs that could be caused by the language being significantly more difficult to use when it resists your demands? For example, I've been caught many times adding bugs when trying to simply shut up an inappropriate gcc warning. When a compiler tries to force you to do something one way that doesn't match your need, the friction introduces new risks of bugs.

> > This was despite months and years of telling people how wonderful Rust was because it prevented memory leaks, lol.
> No one has ever claimed that Rust prevented memory leaks.

Note, the two of you said at least once "nobody shows" or "nobody claimed", etc. It's pointless to use such rhetoric. It doesn't add any value and needlessly increases tensions, because anyone can have a personal counterexample. I've personally heard someone tell me the point above, for example, and that irritated me because I knew it was an absurd claim. Actually saying "no authoritative developer said/demonstrated/claimed", or even better "I've never heard any ...", would be easier to deal with for both parties in the discussion.

> > It's specified as 'whatever rustc does'. It has one implementation, with no other implementations even remotely close to being ready
> You realize that the same was true of C in the kernel until relatively recently, right? The kernel is not written in standard C; it's written in GCC C. The kernel has a memory model that is different than the standard C memory model. The kernel is now mostly able to be built with clang as well, but only by years of effort of adding GCC features to clang and modifying the kernel to not rely on them in quite as many places.

I can understand this concern and I do share it as well. Not directly for the kernel in fact, rather for the language's life expectancy. 15 years ago I was told that Ruby was *the* language of the future, that prevented bugs etc... (hint: it just made them slower to appear). Now in 2022 can anyone cite any developer not working for Gitlab still using this language? I do have the same concern about Rust: as long as it remains the self-defined input of rustc, it's not exactly a language and it can seriously fail over time. Serious implementations are absolutely required for it to survive. For sure Linux uses GCC C. But C is used everywhere and runs the whole internet, some of it built with gcc, some with other compilers. That keeps an ecosystem afloat and forces implementations from various origins and use cases to exchange and evolve the standard. Rust does need to adopt a similar approach, where there is no longer *the* leading implementation and a few others trying to catch up like clang does with gcc or gccgo does with Go, but a set of slightly different implementations all following one standard to reach a 100% compatible code base. From there it's fine if some projects decide to only use one flavor for various reasons.



A pair of Rust kernel modules

Posted Sep 14, 2022 10:53 UTC (Wed) by Wol (subscriber, #4433) [Link] (6 responses)

> Rust does need to adopt a similar approach, where there is no longer *the* leading implementation and a few others trying to catch up like clang does with gcc or gccgo does with Go, but a set of slightly different implementations all following one standard to reach a 100% compatible code base. From there it's fine if some projects decide to only use one flavor for various reasons.

That is incredibly difficult to achieve. For two perfect examples from the database arena, look at SQL and DataBASIC. There's a whole bunch of subtle differences between SQL dialects, as many people here will be able to attest. Likewise, although far fewer people here are familiar with it, DataBASIC. Both have multiple competing implementations, and there are many corner cases where early design decisions collide badly with compatibility. Take my favourite DataBASIC statement:

REM: REM = REM(6,3); REM this takes the remainder of 6 / 3

Every single usage of REM makes sense, and is legal in at least one DataBASIC compiler, but trying to support all four uses in this one statement is, well, tricky ... (I believe at least one compiler does, probably OpenQM/ScarletDME.)

We're likely to end up with just the one implementation of Rust, simply to get round the dialect problem, just as most C code is written to the "it compiles with gcc" standard for exactly the same reason.

Probably one of the big drivers pushing the kernel towards llvm/clang is that too many developers are getting fed up with the breakages caused by the gcc developers' attitude towards "undefined behaviour". And if we do get the kernel compiling successfully with llvm/clang, we could rapidly hit a tipping point where new code is written to the "well, it compiles with llvm/clang" standard.

Cheers,
Wol

A pair of Rust kernel modules

Posted Sep 14, 2022 15:42 UTC (Wed) by khim (subscriber, #9252) [Link] (5 responses)

> Probably one of the big drivers pushing the kernel towards llvm/clang is that too many developers are getting fed up with the breakages caused by the gcc developers' attitude towards "undefined behaviour".

I really like how Rust solved that crazy provenance business.

Instead of trying to invent rules which would work for everyone (that's what C/C++ attempted but failed to do, and thus, after 20 years, still doesn't have such rules, remember!), Rust just gives you rules which you can use! And then its developers go back to their blackboard to try to invent something better.

What is surprising is that this is what was supposed to happen in the C land, too: "Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior."

Only in Rust's case they actually do that, instead of trying to find an excuse to justify yet another way in which the compiler is allowed to break your program.

C++ once, long ago, was like that, too: it split C-style cast into const_cast, dynamic_cast, reinterpret_cast and static_cast for similar reasons.

But somehow, in the XXI century, all that went out of the window. We can only hope Rust won't repeat the same mistake.

A pair of Rust kernel modules

Posted Sep 15, 2022 13:03 UTC (Thu) by farnz (subscriber, #17727) [Link]

The really nice thing about the Tower of Weakenings approach is that Rust is now able to have several layers of rules for provenance. Strict provenance is guaranteed to be correct for all implementations of Rust on all hardware that can support Rust; but because you have this portable set of rules, it's now possible to define rules like "for Rust on AArch64" or "for single-threaded Rust programs" that only apply if you're a special case.

In C and C++ standard terms, this has "strict provenance" as the rules that must apply, while permitting implementations to define relaxations of strict provenance that they will also accept as valid.
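
As a rough, minimal sketch of what that looks like in code (assuming the strict-provenance pointer methods addr()/map_addr(), which were nightly-only when this comment was written and have since been stabilised):

fn tag_ptr(p: *mut u32, tag: usize) -> *mut u32 {
    // u32 is 4-byte aligned, so the low two address bits are free for a tag.
    // map_addr() changes only the address; the compiler keeps tracking which
    // allocation the pointer is derived from (its provenance).
    p.map_addr(|a| a | (tag & 0b11))
}

fn untag_ptr(p: *mut u32) -> *mut u32 {
    // Clear the tag bits; the provenance is carried along unchanged.
    p.map_addr(|a| a & !0b11)
}

fn main() {
    let mut x = 0u32;
    let tagged = tag_ptr(&mut x as *mut u32, 0b01);
    println!("tag bits: {:#b}", tagged.addr() & 0b11);
    let untagged = untag_ptr(tagged);
    // untagged still carries x's provenance, so this write is sound.
    unsafe { *untagged = 42 };
    assert_eq!(x, 42);
}

The address manipulation never round-trips through a bare usize, so there is no "guess which allocation this integer belongs to" step for the optimiser to get wrong.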

A pair of Rust kernel modules

Posted Sep 15, 2022 19:34 UTC (Thu) by Wol (subscriber, #4433) [Link] (3 responses)

> But somehow, in the XXI century, all that went out of the window. We can only hope Rust won't repeat the same mistake.

As I've said before, the C/C++ standards committee should be removing undefined behaviour: placing the onus on the compiler writers to provide implementation-defined behaviour, saying it's "whatever the hardware does", whatever it takes, but getting rid of all that crap.

And then there are things you can't define for whatever reason, where you admit that a definition is impossible.

The thing is, Rust has all three of those, and it clearly pigeonholes them. Safe Rust is supposedly *only* *defined* behaviour. And if undefined behaviour creeps into safe code it is defined as a BUG, a MUST-FIX.

I guess all that "hardware defined" stuff probably belongs in the "unsafe Rust" category, where the language can't reason because it doesn't have any idea what's going to happen behind its back.

And then there's the stuff you can't define, which is unsound, because there's some fault in the logic somewhere.

The important thing is, the programmer can REASON about all this lot, unlike C, where hardware behaviour triggers "undefined behaviour" and the C compiler makes a whole bunch of false assumptions and screws up your code (like deleting necessary safety checks, etc.).
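
A minimal sketch of that pigeonholing, using nothing but the standard library: in safe Rust an out-of-bounds access has *defined* behaviour, and the undefined version is only reachable by explicitly opting in.

fn main() {
    let v = vec![1, 2, 3];

    // Safe Rust: an out-of-bounds index either panics (v[10]) or, as here,
    // returns None; both outcomes are fully defined behaviour.
    println!("{:?}", v.get(10));

    // Undefined behaviour is only reachable through an explicit opt-in:
    // get_unchecked() skips the bounds check, and the unsafe block marks
    // exactly where the programmer takes on the obligation to reason about it.
    let first = unsafe { *v.get_unchecked(0) };
    println!("{first}");
}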

Cheers,
Wol

A pair of Rust kernel modules

Posted Sep 15, 2022 20:19 UTC (Thu) by khim (subscriber, #9252) [Link] (2 responses)

But the thing is: this is what was supposed to happen with C and C++, too! Except for the safe subset, of course, but otherwise it was planned like that.

I mean… the Rationale for International Standard—Programming Languages—C says: "Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior."

This is your Tower of Weakenings right there!

Note that before C89 was published it actually worked that way: there was no standard, but different implementations permitted different things, and even features that were not so easy to implement (e.g. one-element-past-the-end-of-array pointers, which make it impossible to have simple 64KiB arrays on MS-DOS) were added to the standard where it made sense.

I wonder how that stance turned into “if the standard says something is undefined behavior then we have carte blanche to destroy the program” and then “if the standard doesn't say something is undefined behavior yet then we have permission to destroy your program anyway”.

I don't think there was some evil mastermind behind all these developments, but the end result sure is a complete lack of trust.

Periodic Linus outbursts and public complaints are not how you plan the development of a language which is used by millions!

A pair of Rust kernel modules

Posted Sep 15, 2022 21:35 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (1 responses)

> I wonder how that stance turned into “if the standard says something is undefined behavior then we have carte blanche to destroy the program” and then “if the standard doesn't say something is undefined behavior yet then we have permission to destroy your program anyway”.

Portability, I would guess. Once more than one compiler could target a given platform (or one compiler could target more than one platform), "my compiler/platform is better than yours" creeps in, and questions like "what kinds of optimizations can we squeeze out here?" come up.

Today? Code is written to work on multiple platforms from the same source. Here we have the compiler saying "well, if it were $obscure_arch, this would be different behavior, so we'll show it to you on your machine via UB-based optimizations (but not make any noise about it either)".

On one hand, a UB-less C would be "safer", but its portability would tank because "it worked on my x86_64" means diddly squat when you compile it for aarch64.

A pair of Rust kernel modules

Posted Sep 15, 2022 22:33 UTC (Thu) by Wol (subscriber, #4433) [Link]

> On one hand, a UB-less C would be "safer", but its portability would tank because "it worked on my x86_64" means diddly squat when you compile it for aarch64.

You've missed "implementation defined" and "hardware defined".

If something is "hardware defined" then yes, just because it works on x86_64, you can't expect the SAME code to work on aarch64, but firstly the programmer will KNOW that they need to check behaviour, and secondly they can put the ifdefs and whatever in there, and know that that IS DEFINED behaviour.

The *only* grounds for UB should be because "we can't define it because we can't get the logic to add up". There's no need for the C/C++ standard to define everything itself - it can defer the definition to something else - but all behaviour should be defined *somewhere*, if a definition is possible.

Take for example the size of a byte. In *PRACTICE* it's always 8 bits nowadays. I wouldn't be surprised if it's actually already implementation or hardware defined, but it's a perfect example of something that makes perfect sense as hardware-defined. On some platforms bytes are 6 bits, and if the programmer doesn't account for it, it will cause a major problem when targeting such an old platform. But the standard CAN, and SHOULD, address it.

Cheers,
Wol

A pair of Rust kernel modules

Posted Sep 14, 2022 13:56 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

> For example, I've been caught many times adding bugs when trying to simply shut up an inappropriate gcc warning.

If the warning is inappropriate in Rust, simply explain why in the source code:

// We genuinely need Drop here, see https://some.example/url
#[allow(drop_bounds)]

And we can promote a warning in the opposite way:

// We tried asking people nicely, it didn't work. If you write an overlapping range this won't compile. Learn to count.
#[forbid(overlapping_range_endpoints)]
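
For context, a minimal sketch of where such attributes typically sit; they can be scoped to a single item rather than the whole crate (the lints below are stock rustc lints, picked purely for illustration):

// Silence one lint for one item only, keeping the justification next to it:
#[allow(dead_code)] // kept around for ad-hoc debugging
fn debugging_helper() {}

// Or promote a lint to a hard error for one function:
#[deny(unused_must_use)]
fn careful() {
    // Without the `let _ =`, discarding this io::Result would now be a
    // compile error here instead of merely a warning.
    let _ = std::fs::remove_file("does-not-exist");
}

fn main() {
    careful();
}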

A pair of Rust kernel modules

Posted Sep 14, 2022 15:35 UTC (Wed) by lambda (subscriber, #40735) [Link]

> But it could also be said that several other categories of bugs are avoided in C thanks to the language being quite primitive, reading fairly well, and being suitable for peer review. Do you have an estimate of the increase in logic or algorithmic bugs that could be caused by the language being significantly more difficult to use when it resists your demands? For example, I've been caught many times adding bugs when trying to simply shut up an inappropriate gcc warning. When a compiler tries to force you to do something one way that doesn't match your need, the friction introduces new risks of bugs.

I have not noticed any such tendency in Rust.

One of the advantages of Rust is that the more powerful type system, and a number of language design features, make a lot of things more explicit and possible for the compiler to reason about precisely. For instance, reference types and nullability are orthogonal, so you don't have to constantly add checks for null; the type tells you if a reference could possibly be absent, and so there can be fewer spurious compiler warnings, thanks to the higher precision of the type system.
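
As a small sketch of what "orthogonal" buys you in practice (the struct and function names are invented for illustration): a plain &str can never be null, so it needs no defensive check, while "might be absent" is spelled Option and the compiler refuses to let you forget the None case.

struct Config {
    proxy: Option<String>, // absence is part of the type, not a null pointer
}

// A &str parameter can never be null, so no null check is needed here.
fn greeting(name: &str) -> String {
    format!("connecting as {name}")
}

fn proxy_host(cfg: &Config) -> &str {
    // The compiler will not let us treat cfg.proxy as a &str directly;
    // we have to say what happens when it is None.
    match &cfg.proxy {
        Some(host) => host.as_str(),
        None => "direct",
    }
}

fn main() {
    let cfg = Config { proxy: None };
    println!("{} ({})", greeting("guest"), proxy_host(&cfg));
}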

Another example would be warnings about use of uninitialized values, like the famous Debian SSH key bug that was introduced by trying to silence a warning about use of uninitialized values. Because that warning was found later by someone who wasn't the original author, they weren't as familiar with the code when trying to fix it, and they made a mistake and removed the actual source of entropy that was being used, as well as the uninitialized value. In Rust, this is not a separate warning but part of the language, so it's something that has to be dealt with by the original author, rather than by someone else later on trying to silence warnings and not paying enough attention.
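
A tiny sketch of the difference (hypothetical code, not the actual change from that incident): in Rust, reading a possibly-uninitialized local is a hard compile error (E0381), so it cannot be "fixed" later by someone who merely wants a warning to go away.

fn seed_from(sources: &[u64]) -> u64 {
    let seed: u64; // declared, but not yet initialized
    // Uncommenting the next line is a compile error (E0381), not a warning:
    // let early = seed ^ 1;
    if let Some(first) = sources.first() {
        seed = *first;
    } else {
        seed = 0; // every path must assign before `seed` is read below
    }
    seed.rotate_left(13)
}

fn main() {
    println!("{}", seed_from(&[0xdead_beef]));
}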

That's one of the major design goals of Rust: rather than having to rely on imprecise lints that can frequently lead to spurious warnings and dubious fixes, to have greater expressiveness in the language itself so that these checks can be precise and enforced consistently, which leads to less confusion.

Usually, the kinds of workarounds that you need to do in cases where the compiler gets it wrong are to just be a little bit more explicit, possibly at the cost of being more verbose. I don't know of many cases where this has caused the introduction of bugs; I'm sure it could happen, but in my experience it seems like the additional expressiveness and precision of the type system far outweighs that, leading to many fewer of these kinds of bugs than you find in C.

> Note, the two of you said at least once "nobody shows" or "nobody claimed" etc. It's pointless to use such rhetoric. It doesn't add any value and needlessly increases tensions because anyone can have one personal counter example.

Sorry, fair point! This is a somewhat common misconception, so you're right, my rhetoric was probably too strong here.

> 15 years ago I was told that Ruby was *the* language of the future, that prevented bugs etc... (hint: it just made them slower to appear). Now in 2022 can anyone cite any developer not working for Gitlab still using this language?

Ruby seems like a fairly different case from Rust, but off the top of my head, Homebrew, Vagrant, and Discourse are all fairly widely used tools written in Ruby; and of course, Ruby on Rails is still a quite popular framework for writing web apps, though many of those are non-free, simply SaaS applications.

> I do have the same concern about Rust: as long as it remains the self-defined input of rustc, it's not exactly a language and it can seriously fail over time.

There are plenty of other successful, long-lived languages. Python has been around for as long as the Linux kernel, and it is defined by a single primary implementation, while also having alternate compatible implementations that are useful like PyPy, and Python is widely used for a large variety of software.

Rust is younger, and thus its alternate implementations are younger and not yet as complete, but it has one independent implementation, mrustc, which can be used for bootstrapping the compiler; it has another completely independent implementation in the gcc-rs project; and it has a GCC-based backend being added to rustc to supplement the LLVM-based one. The progress on the latter two has been discussed on LWN recently: https://lwn.net/Articles/907405/

I've also heard rumors that there are other implementation projects that haven't yet been made public; of course those might never see the light of day, but there is a lot of active work in this field right now.

There is also the Rust reference, there's an extensive test suite, there's the entirety of crates.io which is used as an additional test suite, and there's a draft Ferrocene Language Specification https://spec.ferrocene.dev/ which is intended to provide a set of requirements that can be verified against for safety-critical applications.

> Rust does need to adopt a similar approach, where there is no longer *the* leading implementation and a few others trying to catch up like clang does with gcc or gccgo does with Go

I'm not sure I follow; as you're saying here, the situation for Rust is no different from the situation with C in the Linux kernel, where GCC is the leading implementation and clang is catching up. Are you saying that Rust needs to be held to a higher standard, where there are two independent implementations with feature parity before you can use it? I don't think that this is a reasonable requirement.

Yes, there is value in having multiple independent implementations, but there's also substantial cost in writing the new compiler and the standardization process itself. As mentioned, there is work in progress on all of these fronts (alternative implementations, and more detailed specifications/standards), but I don't think there's any reason to avoid using Rust before those are complete.

A pair of Rust kernel modules

Posted Sep 15, 2022 19:02 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

> Not directly for the kernel in fact, rather for the language's life expectancy. 15 years ago I was told that Ruby was *the* language of the future, that prevented bugs etc...

Ruby has been overshadowed by Go and JS now that most complex webapps have frontends in JavaScript and the backend just provides a REST API. But back in the day, Ruby allowed tons of small companies to quickly build decent applications and get to market with them.

This list includes GitHub, AirBnB, Groupon, Zillow and many others. The company where I work is built on top of a Ruby app as well.

So it's fair to say that Ruby absolutely fulfilled its promise in the area of web apps. And it has never been really intended as a systems language or a language for desktop applications.

A pair of Rust kernel modules

Posted Sep 15, 2022 19:39 UTC (Thu) by rahulsundaram (subscriber, #21946) [Link]

> And it has never been really intended as a systems language

Yep, it has never been advertised as one. Aside from the web arena, tools like Puppet and Chef use it, but that's quite different from being a systems language.

