Rust in the Linux kernel (Google security blog)

Posted Apr 15, 2021 23:00 UTC (Thu) by matthias (subscriber, #94967)
In reply to: Rust in the Linux kernel (Google security blog) by mss
Parent article: Rust in the Linux kernel (Google security blog)

> > How can a function that accepts a raw pointer be sure that the pointer points to valid data? This is inherently unsafe design.
> It's the caller's responsibility not to provide any invalid pointers to the function.
Of course it is. And it should be the compiler's responsibility to check this. However, this is impossible with raw pointers. It is not even clear from the function signature whether a null pointer is correct or not. In some functions this is fine and has defined semantics, in others it is not. This is what I mean by inherently unsafe: the compiler cannot check correctness. Every such function call would need documentation of why it is safe.
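
As a minimal sketch of the difference (the function names are made up for illustration): in Rust the signature itself can state whether "no value" is allowed, while a raw pointer leaves that to documentation and pushes the dereference into unsafe code.

```rust
/// Raw pointer: the signature says nothing about null or validity,
/// so the contract has to be written down and trusted.
///
/// # Safety
/// `p` must be null or point to a valid, initialized `u32`.
unsafe fn read_raw(p: *const u32) -> Option<u32> {
    if p.is_null() {
        None
    } else {
        // SAFETY: non-null was checked above; validity is the caller's promise.
        Some(unsafe { *p })
    }
}

/// Reference: never null and always valid, checked by the compiler.
fn read_ref(p: &u32) -> u32 {
    *p
}

/// Optional reference: "no value" is part of the type.
fn read_opt(p: Option<&u32>) -> u32 {
    p.copied().unwrap_or(0)
}
```

Only the call to `read_raw` needs an `unsafe` block at the call site; the other two can be called from safe code.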

> > No, I am talking about a big object managed by one pointer, but now I want to pass a pointer to a subobject to some function.
> > In rust, I can just take a pointer to the subobject, as long as the lifetime of the pointer to the subobject is contained in the lifetime of the whole object.
> It depends on what specifically you mean by a subobject.
> If it's like a class field, then normally one does not provide a direct pointer to the field to unrelated code but implements a specific interface and provides that instead.

So, one is not supposed to use getters for fields of an object?
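
For reference, a sketch of what I mean by a pointer to a subobject in Rust. The `Packet` and `Header` types are hypothetical. Both the direct field borrow and the getter hand out a reference whose lifetime is tied to the whole object.

```rust
struct Header {
    len: usize,
}

struct Packet {
    header: Header,
    payload: Vec<u8>,
}

impl Packet {
    /// Getter: the returned reference borrows from `self`.
    fn header(&self) -> &Header {
        &self.header
    }
}

/// Unrelated code that only needs the subobject.
fn declared_len(h: &Header) -> usize {
    h.len
}

/// Borrows the field directly; the borrow ends when `declared_len` returns.
fn is_consistent(p: &Packet) -> bool {
    declared_len(&p.header) == p.payload.len()
}
```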

> > How complex should code be such that you think that one should actually verify correctness?
> > People will disagree about the complexity that can be checked by humans.
> There is no "scientifically-correct" answer to that question, since it's mostly an individual opinion.
> One can simply identify obviously-correct code (like in my example) and treat any disagreement over this as evidence that the code is not obviously-correct.

And Rust has a pretty clear opinion on this: a dereference of a raw pointer is never obviously correct. This rather harsh judgement can of course not work in C++. By Rust's standards, almost every line of C++ is not obviously correct.

> > Then you will only have a few lines that have to be verified by humans.
> "A few lines" of unsafe code in an OS kernel?
> Even the current C code in the Linux kernel is not standard-compliant and that's already a rather low bar...
> And this is done for a reason (performance).

Of course, in the kernel it will be a few lines more. Still, it would be far less than with C or C++ code. Most of the kernel can be written in safe Rust. It is mostly the low-level primitives that would need unsafe code. Calling these primitives can be safe code. Today, all of the kernel code is unsafe code.
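
A minimal sketch of that split, using a hypothetical `Ring` buffer rather than any real kernel API: the unsafe pointer arithmetic lives in two small methods, each with a comment stating why it is sound, and callers only ever see safe functions.

```rust
/// Hypothetical low-level primitive: a fixed-size byte buffer.
/// Invariant: `len <= buf.len()`.
pub struct Ring {
    buf: [u8; 8],
    len: usize,
}

impl Ring {
    pub fn new() -> Self {
        Ring { buf: [0; 8], len: 0 }
    }

    /// Safe to call: returns `false` instead of writing past the end.
    pub fn push(&mut self, b: u8) -> bool {
        if self.len == self.buf.len() {
            return false;
        }
        // SAFETY: `self.len < self.buf.len()` was checked above.
        unsafe { *self.buf.as_mut_ptr().add(self.len) = b };
        self.len += 1;
        true
    }

    /// Safe to call: returns `None` when the buffer is empty.
    pub fn last(&self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: `0 < len <= buf.len()`, so `len - 1` is in bounds.
        Some(unsafe { *self.buf.as_ptr().add(self.len - 1) })
    }
}
```

Only the two lines marked `SAFETY` have to be verified by a human; everything that calls `push` and `last` is checked by the compiler.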

Posted Apr 15, 2021 23:19 UTC (Thu) by mss (subscriber, #138799)

> Of course it is. And it should be the compiler's responsibility to check this.
> However, this is impossible with raw pointers. It is not even clear from the function signature whether a null pointer is correct or not.
> In some functions this is fine and has defined semantics, in others it is not.
> This is what I mean by inherently unsafe: the compiler cannot check correctness.

GCC has a "nonnull" attribute that does exactly that - warns if you passed a NULL pointer as that function argument.
Microsoft SAL annotations allow even more.

In practice, this is not much of a problem, since one assumes that a function does not allow NULL pointers as parameters unless specifically described as such.
And in order to use a new function one has to read its docs anyway to learn what exactly it does.

> So, one is not supposed to use getters for fields of an object?

An interface will provide getters.

> Most of the kernel can be written in safe Rust. It is mostly the low-level primitives that would need unsafe code. Calling these primitives can be safe code.

That's mostly a matter of opinion. Other commenters have already stated that Rust code has a runtime cost,
so where performance is the most important consideration it likely isn't the best choice.

Once again, I would like to say that I am not against having Rust in the kernel per se.

Posted Apr 16, 2021 10:56 UTC (Fri) by mathstuf (subscriber, #69389)

> That's mostly an opinion-type statement, other commenters have already stated there is a runtime cost of Rust code, so anywhere performance is the most important consideration it isn't likely the best choice.

Versus C++? Unlikely. C? More plausible, but it really depends on the context. It's definitely not something one can make a blanket statement about. Rust compiles down to machine code just like them. If you're referring to bounds checks, that is done in debug builds by default only. Do you have links to these claims or can you be more specific?

Posted Apr 16, 2021 12:05 UTC (Fri) by matthias (subscriber, #94967)

Bounds checks are enabled in release builds. They are crucial for memory safety. What is usually only done in debug builds is integer overflow checking.
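
To illustrate the distinction, a small sketch (the function names are illustrative). Slice indexing such as `v[i]` is checked in every build profile, and `get` exposes the same check as an `Option`. Integer overflow behaviour can be chosen explicitly instead of depending on the build profile.

```rust
/// Bounds-checked access: returns `None` instead of reading out of bounds.
fn third(v: &[u32]) -> Option<u32> {
    v.get(2).copied()
}

/// Overflow is reported as `None` in debug and release builds alike.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Overflow wraps around in debug and release builds alike.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}
```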

In many cases bounds checks are necessary (and are added explicitly in C code). And if you do manual bounds checking (e.g. to handle errors), the Rust compiler will usually see this and optimize away the automatic bounds checks.

If it is really crucial for performance and one is sure that the index cannot be out of bounds, then bounds checks can of course be omitted for that critical code path.
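
A sketch of what omitting the check looks like, assuming a made-up `sum_prefix` helper. One manual check up front establishes the bound, and `get_unchecked` then skips the per-element check inside the loop. In practice the optimizer often removes the checks from the safe version as well, so this is only worth it after measuring.

```rust
/// Sums the first `n` elements, or returns `None` if `n` is too large.
fn sum_prefix(v: &[u32], n: usize) -> Option<u32> {
    // Manual bounds check, done once (and used for error handling).
    if n > v.len() {
        return None;
    }
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: `i < n <= v.len()` by the check above.
        total = total.wrapping_add(unsafe { *v.get_unchecked(i) });
    }
    Some(total)
}
```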

Posted Apr 17, 2021 12:41 UTC (Sat) by jezuch (subscriber, #52988)

Interestingly, what people find out with Rust is that safe code is easier for the compiler to optimize, because the compiler can prove more things about it. I've seen stories where people thought that they could make the code faster by reimplementing a small section using unsafe code (because it allows more low-level bit-twiddling etc.) and found out that it was actually slower.

One thing in particular is that with the ownership model you can always say what is aliased to what. As a result, the compiler wants to tell LLVM, very often, that a particular thing is not aliased. It wants to, but it can't, because doing so exposed so many bugs in LLVM, which are never hit with C and C++, where you have to assume that anything can alias anything.
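
A small sketch of the aliasing guarantee in question (the function name is made up). Two `&mut` parameters can never refer to the same object, which is the kind of fact the compiler would like to hand to LLVM as a no-alias hint.

```rust
/// `a` and `b` are both `&mut`, so they cannot point at the same `i32`.
fn bump_both(a: &mut i32, b: &mut i32) -> i32 {
    *a += 1;
    *b += 1;
    // The write through `b` cannot have changed `*a`, so an optimizer
    // that knows this may reuse the value of `*a` computed above.
    *a
}

// Rejected at compile time (second mutable borrow of `x`):
//
//     let mut x = 0;
//     bump_both(&mut x, &mut x);
```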

As I understand it, Rust is not the fastest game in town mostly because the bitcode it generates is so lousy.

Posted Apr 17, 2021 12:58 UTC (Sat) by mathstuf (subscriber, #69389)

> One thing in particular is that with the ownership model you can always say what is aliased to what. As a result, the compiler wants to tell LLVM a lot that this particular thing is not aliased. It wants, but it can't, because it exposed so many bugs in LLVM, which are never hit with C and C++, where you have to assume that everything can be aliased always.

Indeed. That issue[1] got closed recently though, so there's hope.

> As I understand it, Rust is not the fastest game in town mostly because the bitcode it generates is so lousy.

I thought it was that the bitcode was noisy. As more optimizations move to MIR, LLVM gets less bitcode to compile and can do other optimizations. Or maybe that's related to compiler performance more than runtime performance.

[1] https://github.com/rust-lang/rust/issues/54878

Posted Apr 19, 2021 12:45 UTC (Mon) by jezuch (subscriber, #52988)

> Indeed. That issue[1] got closed recently though, so there's hope.

Oh cool! After only 2.5 years! :D

Posted Apr 16, 2021 20:10 UTC (Fri) by dezgeg (subscriber, #92243)

> GCC has a "nonnull" attribute that does exactly that - warns if you passed a NULL pointer as that function argument.
> Microsoft SAL annotations allow even more.
> In practice, this is not much of a problem, since one assumes that a function does not allow NULL pointers as parameters unless specifically described as such.
> And in order to use a new function one has to read its docs anyway to learn what exactly it does.

A much bigger problem than pointer parameters is pointer return values - will the return value be a) always valid b) NULL on error c) ERR_PTR on error? Result/Option would help greatly.
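
A rough sketch of how the three cases could look as Rust return types. `Device`, `OpenError` and the functions are hypothetical and not taken from any kernel API.

```rust
#[derive(Debug, PartialEq)]
enum OpenError {
    NotFound,
    Busy,
}

#[derive(Debug, PartialEq)]
struct Device {
    id: u32,
    busy: bool,
}

/// Case (a), always valid: a plain reference cannot be null.
fn first(devs: &[Device; 2]) -> &Device {
    &devs[0]
}

/// Case (b), NULL on error: absence is part of the return type.
fn find(devs: &[Device], id: u32) -> Option<&Device> {
    devs.iter().find(|d| d.id == id)
}

/// Case (c), ERR_PTR on error: the error code travels in the `Err` variant.
fn open(devs: &[Device], id: u32) -> Result<&Device, OpenError> {
    let dev = find(devs, id).ok_or(OpenError::NotFound)?;
    if dev.busy {
        Err(OpenError::Busy)
    } else {
        Ok(dev)
    }
}
```

The caller cannot use the `Device` from `find` or `open` without first handling the `None` or `Err` case.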

Posted Apr 18, 2021 13:15 UTC (Sun) by matthias (subscriber, #94967)

> A much bigger problem than pointer parameters is pointer return values - will the return value be a) always valid b) NULL on error c) ERR_PTR on error? Result/Option would help greatly.

You missed the question: How long will the pointer be valid? We need a lifetime analysis to answer this question and to ensure that the returned pointer is not used after the object has vanished.
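
A minimal sketch of how Rust answers it, with a hypothetical `Registry` type: the returned reference carries the lifetime of the object it borrows from, and using it after that object is gone is a compile error.

```rust
struct Registry {
    names: Vec<String>,
}

impl Registry {
    /// The returned `&str` borrows from `self`. Spelled out, the signature is
    /// `fn first<'a>(&'a self) -> Option<&'a str>`.
    fn first(&self) -> Option<&str> {
        self.names.first().map(String::as_str)
    }
}

/// Fine: the borrow ends before `r` can go away.
fn first_len(r: &Registry) -> usize {
    r.first().map_or(0, str::len)
}

// Rejected at compile time, because `r` is dropped while `s` still borrows from it:
//
//     let s;
//     {
//         let r = Registry { names: vec!["eth0".to_string()] };
//         s = r.first();
//     }
//     println!("{:?}", s);
```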