Uncheckable by the compiler does not imply `unsafe`

Posted Nov 6, 2024 6:35 UTC (Wed) by jbills (subscriber, #161176)
Parent article: Safety in an unsafe world

> Send is unsafe not because it is inherently dangerous, but just because it represents a property that the compiler cannot check.

This is not entirely true. `Send` and `Sync` are unsafe because incorrectly implementing them can allow you to violate a safety invariant, namely the mutable xor shared property. This cannot be checked by the compiler, but there are other uncheckable properties whose violation is not a safety issue. Probably the most common example in Rust would be the `Ord` trait, where the properties that make it a total order rather than merely a partial order cannot be guaranteed by the compiler. This in turn means that unsafe code cannot rely on those properties to ensure its own safety. In general, traits in Rust can be used to mark unprovable properties, and unsafe traits can be used to mark unprovable safety properties.
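To make the distinction concrete, here is a minimal sketch. `Liar` and `TrustedOrd` are hypothetical names invented for this illustration; neither exists in the standard library. `Liar` implements `Ord` entirely in safe code while breaking the total-order contract, and the only consequence is wrong answers, never UB. A property that unsafe code needs to rely on would instead have to be promised through an unsafe trait.

```rust
use std::cmp::Ordering;

/// Hypothetical type whose `Ord` impl breaks the total-order contract:
/// it claims every value is less than every other value, itself included.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Liar(pub u32);

impl PartialOrd for Liar {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Entirely safe code: the compiler cannot check the contract, and breaking
// it is a logic bug, not a memory-safety violation.
impl Ord for Liar {
    fn cmp(&self, _other: &Self) -> Ordering {
        Ordering::Less
    }
}

/// Hypothetical marker trait for orderings that unsafe code may trust.
///
/// # Safety
/// Implementors promise that `cmp` is a total order consistent with `Eq`.
pub unsafe trait TrustedOrd: Ord {}

// SAFETY: the built-in ordering on `u32` is a total order.
unsafe impl TrustedOrd for u32 {}

/// Code behind this bound may lean on the promise above; `Liar` cannot get
/// in unless someone writes `unsafe impl TrustedOrd for Liar {}`.
pub fn trusted_max<T: TrustedOrd + Copy>(a: T, b: T) -> T {
    if a.cmp(&b) == Ordering::Less { b } else { a }
}
```

Writing `unsafe impl` is where the author takes responsibility for the property. With plain `Ord`, nobody does, so unsafe code has to stay sound no matter what `cmp` returns.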

This is something that takes newcomers to Rust a little while to digest, as there is often a temptation to mark safe functions unsafe in order to attach a warning label to them rather than to flag a genuine safety issue.

Uncheckable by the compiler does not imply `unsafe`

Posted Nov 6, 2024 7:31 UTC (Wed) by emk (subscriber, #1128)

There are interesting philosophical questions about "safe" in kernel space. For example, assume an API allows you to safely talk to a hardware subsystem that can be used to write to arbitrary memory.

A userspace example of this is file I/O to /proc, which potentially allows you to write to the memory of a running process. The file I/O itself is safe, but it can come around through the back door and do unsafe things.

But it's silly to mark file I/O as unsafe, even if it can theoretically be used to trigger undefined memory behavior.

So how do you mark functions that communicate with a programmable MMU?

Uncheckable by the compiler does not imply `unsafe`

Posted Nov 6, 2024 8:08 UTC (Wed) by NYKevin (subscriber, #129325)

In userspace, the process is the default or presumed safety boundary. You can cause all kinds of bizarre misbehavior by talking to the OS, or by having other processes ptrace you, or whatnot, but that is not Rust's problem because those things are beyond the scope of your process, and Rust cannot be reasonably expected to protect you from the entire system.

In kernelspace, it's significantly harder. I would suggest something along these lines:

* If the system is secure boot enabled, there is no boundary. Everything on the system is within the scope of Rust's safety guarantee. This is because, in such a setup, you really are expected to prevent all (kernelspace) UB under all circumstances, no matter what userspace tries to do, so there is no natural point where we can draw a line and say "it is unreasonable to expect Rust to protect against this."
* If the system is not secure boot enabled, then the boundary is whatever subset of userspace has the necessary privileges to subvert the kernel (write to /proc/kcore, load kernel modules, etc.). You don't have to prevent that from happening, so Rust's safety guarantees should not extend to interfaces that are exposed to such privileged userspace processes. But even then, those guarantees do still extend to all other userspace processes, suggesting that there would need to be some kind of type-state API for distinguishing between privileged and unprivileged userspace processes (in the hypothetical where the whole kernel is Rust).
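A rough sketch of what such a type-state API might look like, purely as an illustration. `Caller`, `Privileged`, `Unprivileged`, `elevate`, `read_uptime`, and `load_module` are all hypothetical names, and the boolean passed to `elevate` stands in for whatever capability check a real kernel would perform.

```rust
use std::marker::PhantomData;

/// Hypothetical marker types recording a caller's privilege level.
pub struct Unprivileged;
pub struct Privileged;

/// Hypothetical handle to a userspace process, tagged with its privilege.
pub struct Caller<P> {
    pid: u32,
    _privilege: PhantomData<P>,
}

impl Caller<Unprivileged> {
    /// Every process starts out unprivileged.
    pub fn new(pid: u32) -> Self {
        Caller { pid, _privilege: PhantomData }
    }

    /// Consumes the handle; `has_capability` is a stand-in for a real check.
    /// On failure the unprivileged handle is handed back unchanged.
    pub fn elevate(self, has_capability: bool) -> Result<Caller<Privileged>, Self> {
        if has_capability {
            Ok(Caller { pid: self.pid, _privilege: PhantomData })
        } else {
            Err(self)
        }
    }
}

impl<P> Caller<P> {
    /// Available to every caller, so it must uphold the safety guarantee.
    pub fn read_uptime(&self) -> u64 {
        0 // placeholder value for the sketch
    }
}

impl Caller<Privileged> {
    /// Only reachable through a privileged handle: an interface that sits
    /// outside the safety boundary described above.
    pub fn load_module(&self, name: &str) -> String {
        format!("pid {} loaded {}", self.pid, name)
    }
}
```

With this shape, calling `load_module` on a `Caller<Unprivileged>` is a compile error rather than a runtime check, which is the kind of distinction the second bullet asks for.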

In either case, "safety" specifically means that the kernel must not perform UB. It does not impose any particular requirements on things that happen in userspace. If some userspace process corrupts its heap and crashes, and then the kernel properly cleans up the dead process, then the kernel has not performed UB, so Rust's safety rules have not been violated with respect to the kernel.

As for your MMU, note that Rust already considers all volatile reads and writes to be unsafe, because you can only do them through raw pointers, and raw pointer reads and writes are unsafe. I must admit that I have never written code that talks to an MMU before, but I find it hard to believe you can do it without some degree of raw pointer manipulation (regardless of whether it needs to be volatile or not), which will be unsafe if you want to dereference anything.
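For illustration, a minimal sketch of a volatile write and read-back. The function name `poke` is made up, and an ordinary `u32` stands in for a hardware register; real MMIO code would start from a fixed address rather than a reference.

```rust
use std::ptr;

/// Writes `value` to a register-like location, then reads it back.
/// `reg` is an ordinary variable standing in for a hardware register.
pub fn poke(reg: &mut u32, value: u32) -> u32 {
    // Creating the raw pointer is safe; using it is not.
    let p: *mut u32 = reg;
    // SAFETY: `p` comes from a live `&mut u32`, so it is valid, aligned,
    // and not aliased for the duration of this call.
    unsafe {
        ptr::write_volatile(p, value);
        ptr::read_volatile(p)
    }
}
```

Both `ptr::write_volatile` and `ptr::read_volatile` are `unsafe fn`s, so there is no way to reach them without an `unsafe` block somewhere in the call chain.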

Uncheckable by the compiler does not imply `unsafe`

Posted Nov 6, 2024 8:57 UTC (Wed) by znix (subscriber, #159961)

What makes an MMU unique though is that you can't easily wrap it with a memory-safe API like you can with most other peripherals.

If you're setting up a network adapter for example, you can do what you want to any register (save for those relating to DMA) and it won't break memory safety. But for an MMU, if you map the wrong physical page to the wrong process it can clobber the kernel's memory. I can't really see how you can make any interface smaller than more-or-less the entire virtual memory manager safe.

So you have a function to map an arbitrary page into a process's address space. Calling the function wrong will never cause UB, nor will it cause the kernel itself to corrupt memory - so it's safe? But it could allow the process you're mapping it into to clobber the kernel's memory if you asked it to map the wrong page, so it's unsafe? I think this is the philosophical question emk was asking about.

My suggestion is that this is one of the rare cases where safe vs unsafe isn't a particularly useful distinction, since the Rust code itself being memory safe isn't any more important than the overall correctness of the code.

Uncheckable by the compiler does not imply `unsafe`

Posted Nov 6, 2024 12:00 UTC (Wed) by matthias (subscriber, #94967)

> I can't really see how you can make any interface smaller than more-or-less the entire virtual memory manager safe.

This should be the same reason why it is unsafe to implement an allocator in userspace. If the allocator returns an incorrect address, then all bets are off. And of course, the kernel memory allocator has to be correct. But this is just one component of the virtual memory manager.
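In userspace Rust this shows up as `GlobalAlloc` being an unsafe trait. A minimal sketch, where the `Counting` wrapper is a made-up example that forwards every request to the standard `System` allocator and only counts live allocations:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator: forwards to `System` and counts live allocations.
pub struct Counting {
    live: AtomicUsize,
}

impl Counting {
    pub const fn new() -> Self {
        Counting { live: AtomicUsize::new(0) }
    }

    /// Number of allocations handed out and not yet freed.
    pub fn live(&self) -> usize {
        self.live.load(Ordering::Relaxed)
    }
}

// SAFETY: every request is forwarded unchanged to `System`, which upholds
// the `GlobalAlloc` contract. An impl that returned a bogus or overlapping
// address here would make otherwise safe code elsewhere exhibit UB.
unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // SAFETY: the caller guarantees `layout` has a non-zero size.
        let p = unsafe { System.alloc(layout) };
        if !p.is_null() {
            self.live.fetch_add(1, Ordering::Relaxed);
        }
        p
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        self.live.fetch_sub(1, Ordering::Relaxed);
        // SAFETY: the caller guarantees `ptr` came from `alloc` with `layout`.
        unsafe { System.dealloc(ptr, layout) }
    }
}
```

The `unsafe impl` is where the allocator's author vouches for returning correct addresses. Everything built on top of it, such as `Box` and `Vec`, trusts that promise.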

> So you have a function to map an arbitrary page into a process's address space.

Why should this function take arbitrary pages? You can have different types for pages allocated for userspace and kernelspace, and even for different purposes in kernelspace. And then it boils down to the "X-safe" feature: do you encode the difference between user pages and kernel pages in the type system or not?
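As a toy sketch of that idea (all names here, `UserPage`, `KernelPage`, `FrameAllocator`, and `AddressSpace`, are hypothetical and not taken from any real kernel), the mapping function can simply refuse to accept anything but a user page:

```rust
/// Hypothetical page types. The fields are private, so in a real crate
/// layout only the frame allocator could create them.
pub struct UserPage {
    frame: u64,
}

pub struct KernelPage {
    frame: u64,
}

impl KernelPage {
    pub fn frame(&self) -> u64 {
        self.frame
    }
}

/// Toy frame allocator handing out consecutive frame numbers.
pub struct FrameAllocator {
    next: u64,
}

impl FrameAllocator {
    pub fn new() -> Self {
        FrameAllocator { next: 0 }
    }

    fn take(&mut self) -> u64 {
        let frame = self.next;
        self.next += 1;
        frame
    }

    pub fn alloc_user(&mut self) -> UserPage {
        UserPage { frame: self.take() }
    }

    pub fn alloc_kernel(&mut self) -> KernelPage {
        KernelPage { frame: self.take() }
    }
}

/// Toy process address space: a list of (virtual page, user page) pairs.
#[derive(Default)]
pub struct AddressSpace {
    mappings: Vec<(u64, UserPage)>,
}

impl AddressSpace {
    /// Accepts only `UserPage`; passing a `KernelPage` is a type error.
    /// Taking the page by value also prevents mapping the same page twice.
    pub fn map(&mut self, virt: u64, page: UserPage) {
        self.mappings.push((virt, page));
    }

    /// Returns the frame number backing `virt`, if it is mapped.
    pub fn lookup(&self, virt: u64) -> Option<u64> {
        self.mappings
            .iter()
            .find(|(v, _)| *v == virt)
            .map(|(_, page)| page.frame)
    }
}
```

A call like `space.map(0x2000, kernel_page)` is rejected at compile time. The remaining trust is concentrated in the frame allocator, which has to tag each frame correctly.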

And of course there is no single correct answer to the question of which (safety) properties should be encoded in the type system. There has to be a balance between complexity (of the encoding in the type system) and safety. And I think that this balance is much harder to find in kernel space. But in userspace, too, this is nowhere near trivial. There is the agreed border of unsound (undefined) behavior. But there are definitely people who want to avoid deadlocks or memory leaks in userspace as well.