The safety of unsafe code can depend on the correctness of safe code
Posted Feb 13, 2025 2:10 UTC (Thu) by milesrout (subscriber, #126894)
Parent article: Maintainer opinions on Rust-for-Linux
>Ojeda's answer is: confidence. As a maintainer, when somebody sends in a patch, he may or may not spot that it's broken. In an obvious case like this, hopefully he spots it. But more subtle logic bugs can and do slip by. Logic bugs are bad, but crashes (or compromises) of the kernel are worse. With the C version, some other code could be relying on this function to behave as documented in order to avoid overwriting some part of memory. That's equally true of the Rust version. The difference for a reviewer is that Rust splits things into "safe" and "unsafe" functions — and the reviewer can concentrate on the unsafe parts, focusing their limited time and attention on the part that could potentially have wide-ranging consequences. If a safe function uses the incorrect version of f, it can still be wrong, but it's not going to crash. This lets the reviewer be more confident in their review.
The safety of a function marked "unsafe" depends on its correctness, which in turn depends on the correctness of any other code it relies on. That means it is not true that you can focus your review of a patch purely on the functions marked "unsafe" and be assured that, if the code was safe before and no "unsafe" code is touched, it will still be safe afterwards. The safety of an "unsafe" function typically depends on the maintenance of invariants by functions marked "safe".
The usual approach taken is apparently to restrict this to the "crate" scope by convention. That is, a crate's (package's) safety (if it contains "unsafe" code, which obviously any kernel Rust code must, right?) might depend on the correctness of other code within the crate, but should not depend on the correctness of code outside the crate to remain safe. For example, entrypoints to a crate should enforce invariants but internal functions may assume them.
When reviewing a crate as a whole, it is true that if all the "unsafe"-marked code is safe, then the crate will be safe. But that review may require inspecting "safe" code that operates on, for example, data structures whose invariants are relied on by the "unsafe"-marked code.
But when reviewing a patch to a crate, it cannot be assumed that you only need to check the correctness of the "unsafe" bits to know it is safe. You need to review the patch, even if it only affects "safe" functions, in the context of all the ways that code might be relied upon by unsafe code. So the review burden is not really any less in practice.
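To make that concrete, here is a minimal made-up sketch of what I mean (the names are invented, nothing here is real kernel code): the safe method maintains an invariant that the unsafe block relies on, so a patch that touches only "safe" code can still break memory safety.

```rust
struct Buf {
    data: Vec<u8>,
    // Invariant: len <= data.len() at all times.
    len: usize,
}

impl Buf {
    fn new(data: Vec<u8>) -> Self {
        let len = data.len();
        Buf { data, len }
    }

    // A "safe" function: a patch that drops this assert contains no
    // `unsafe` keyword at all, yet it makes last() below unsound.
    fn set_len(&mut self, len: usize) {
        assert!(len <= self.data.len());
        self.len = len;
    }

    fn last(&self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: relies on the invariant len <= data.len(), which is
        // upheld only by "safe" code such as set_len() above.
        Some(unsafe { *self.data.get_unchecked(self.len - 1) })
    }
}
```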
Posted Feb 13, 2025 2:28 UTC (Thu) by intelfx (subscriber, #130118)
That’s called “unsoundness” and it’s very much avoided.
Posted Feb 13, 2025 6:20 UTC (Thu) by mb (subscriber, #50428)
Unsafe code can depend on the behavior of safe code, but there are conventional boundaries to that. For example, that trust should not cross crates.
But there are exceptions to that, too. Let me give you an example: it may be reasonable for some unsafe code in a crate to assume that the std implementation of Vec doesn't randomize the indexes internally. If the unsafe code assumes that [5] will give access to element number 5 of the Vec, then it is depending on the safe API of Vec to do the right thing. It would be possible to implement a memory-safe Vec that randomizes index accesses and never goes out of bounds. But it is reasonable to assume that the std library does the right thing instead.
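A tiny made-up illustration of that kind of reliance on std behaving as documented:

```rust
// The length check is ordinary safe code; the unchecked access trusts
// std's Vec to store element number 5 at index 5 and to report its
// length honestly.
fn sixth(v: &Vec<u8>) -> Option<u8> {
    if v.len() <= 5 {
        return None;
    }
    // SAFETY: v.len() > 5 was checked above, assuming Vec indexing and
    // len() behave as documented by std.
    Some(unsafe { *v.get_unchecked(5) })
}
```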
What would not be OK, for example: assuming that random joe's crate implements PartialOrd correctly for some type. It's commonly understood as being unsound if unsafe code depends on PartialOrd being implemented correctly.
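Here is a made-up sketch of such an unsound dependency (using Ord for brevity; the same idea applies to PartialOrd):

```rust
// Unsound: memory safety hinges on a trait impl that any other crate
// can provide in purely safe code.
fn gather<K: Ord + Into<usize> + Copy>(keys: &[K], table: &[u8]) -> Vec<u8> {
    let Some(max) = keys.iter().copied().max() else {
        return Vec::new();
    };
    // One safe bounds check against what Ord claims is the largest key...
    assert!(max.into() < table.len());
    keys.iter()
        .map(|&k| {
            // SAFETY (wrong): assumes every key is <= max, i.e. that the
            // foreign Ord impl is correct. A buggy but perfectly safe
            // impl can break this and send the access out of bounds.
            unsafe { *table.get_unchecked(k.into()) }
        })
        .collect()
}
```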
Another, more common example is that your unsafe code may depend on other crate-internal safe code. For example, some safe initialization code often has to run before unsafe code may access the memory as initialized.
However, the starting point for the safety analysis is *always* the unsafe block.
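A small invented example of that: the analysis starts at the unsafe block in header(), but it immediately leads back into the safe, crate-internal init_header().

```rust
use std::mem::MaybeUninit;

struct Frame {
    header: MaybeUninit<[u8; 4]>,
    header_init: bool,
}

impl Frame {
    fn new() -> Self {
        Frame { header: MaybeUninit::uninit(), header_init: false }
    }

    // Safe, crate-internal initialization code.
    fn init_header(&mut self, magic: [u8; 4]) {
        self.header.write(magic);
        self.header_init = true;
    }

    fn header(&self) -> Option<&[u8; 4]> {
        if !self.header_init {
            return None;
        }
        // SAFETY: header_init is only ever set by the safe init_header(),
        // which writes the bytes first, so the header is initialized here.
        Some(unsafe { self.header.assume_init_ref() })
    }
}
```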
Posted Feb 13, 2025 10:48 UTC (Thu) by excors (subscriber, #95769)
In this case I think you could choose to put the bounds check inside the `unsafe` block so the `unsafe` block has no preconditions, in which case a safety analysis would only have to consider the block (and anything called from within the block) and not the whole function. But in general you probably want to minimise the scope of `unsafe` blocks, so the compiler won't let you accidentally call other `unsafe` functions where you didn't intend to - the same motivation as https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-... - which will result in `unsafe` blocks with preconditions that non-`unsafe` code must uphold.
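To illustrate the two placements with a made-up getter (not real kernel code):

```rust
// Check inside the unsafe block: the block has no preconditions that
// the surrounding safe code has to uphold.
fn get_wide(data: &[u8], i: usize) -> Option<u8> {
    unsafe {
        if i < data.len() {
            Some(*data.get_unchecked(i))
        } else {
            None
        }
    }
}

// Minimal unsafe block: easier to spot, but now it has a precondition
// (i < data.len()) that a later patch to the safe check could break.
fn get_narrow(data: &[u8], i: usize) -> Option<u8> {
    if i < data.len() {
        // SAFETY: i was bounds-checked just above.
        Some(unsafe { *data.get_unchecked(i) })
    } else {
        None
    }
}
```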
That Rustonomicon page also gives an example where an `unsafe` block depends on the entire module upholding its preconditions. If you can't avoid module-level preconditions, I think it's important to minimise the size of your module to keep the safety analysis manageable. Reviewers will need to be careful about any patches to non-`unsafe` code in that module, and in any modules that it calls, but at least you can still be confident that patches to any other modules can't break it.