A discussion of Rust safety documentation

By Daroc Alden
September 17, 2024

Kangrejos 2024

Kangrejos 2024 started off with a talk from Benno Lossin about his recent work to establish a standard for safety documentation in Rust kernel code. Lossin began his talk by giving a brief review of what safety documentation is, and why it's needed, before moving on to the current status of his work. Safety documentation is easier to read and write when there's a shared vocabulary for discussing common requirements; Lossin wants to establish that shared vocabulary for Rust code in the Linux kernel.

Safety documentation has two parts, Lossin explained: requirements and justifications. Requirements are comments attached to functions that have been marked unsafe, and explain what must be true for the function to be used. He gave the example of the Arc::into_raw() and Arc::from_raw() functions that convert between a reference-counted smart pointer (Arc) and a plain pointer. For example, from_raw() must be called once for each call to into_raw() on a given allocation, otherwise the reference count will be incorrect. Also, from_raw() must be given a pointer that really did come from into_raw(), or it will do bad things to whatever object is being pointed to when the Arc is dropped and the reference count is decremented.

Those requirements should be spelled out in a safety comment attached to the function, Lossin said, so that users of the function know how to use it correctly. Furthermore, uses of that function should also have the second kind of safety documentation: justifications. Having a comment explaining why the functions requirements hold at that particular call site makes it easier for reviewers to check whether the author of a patch really got their logic right, which prevents mistakes.

Lossin briefly addressed the objection that these kinds of comments are not traditional in kernel C code. He said that Rust has "higher stakes" — since Rust's safety is its primary reason to exist, the Rust-for-Linux folks had better make it easy to write correct Rust code. Also, Rust has some more complex language features, such as smart pointers and compile-time lifetime tracking, that can be harder to work with when writing low-level unsafe code. Finally, actually writing out one's assumptions can sometimes catch errors that would otherwise have gone unnoticed.

Once everyone had been brought up to date on what Lossin meant by safety documentation, he talked about the current status of his effort to document standards for safety documentation. He pointed out that safety documentation has actually been required by reviewers since the beginning; what he's asking for is to standardize the wording and formatting to make those comments easier to understand.

Almost all existing unsafe blocks in the kernel have associated documentation. Soon, the project will enable a lint to warn about undocumented unsafe blocks. This is good, Lossin said, but there's a catch: the comments do not always use the same terminology, which can make it difficult to know whether they're correct.

For example, some comments say that their function needs a "valid pointer" and some say that it needs a "valid, non-null pointer". But all valid pointers in Rust are non-null. So is this just two ways of saying the same thing? Or is one of the comments using "valid" in a way that's not consistent with Rust's language documentation? It's impossible to tell what the actual requirements are without reading the code, which rather defeats the point of having concise safety documentation to read, Lossin said.

Ideally, all of the comments would be correct, complete, and easy to understand. That's easier to accomplish if there's a shared vocabulary for common conventions — an author shouldn't need to write "valid, non-null" when just "valid" will do. Lossin suggested that they might want to standardize a dictionary of common terms, so that authors can write as little as possible, but readers will still be able to understand. Plus, having an explicit resource saying how to read safety documentation will make it easier for learners to come up to speed, and reduce the chances of misunderstandings between maintainers.

Lossin ended the introductory portion of his talk by calling for people to read his RFC, and get on the Rust-for-Linux project's Zulip chat server to talk about it. Then, he opened things up for a discussion.

Richard Weinberger started by asking whether the safety documentation was meant to be human-readable or machine-readable. Lossin thought that was a good question for discussion, and tossed it back to the attendees. Andreas Hindborg, one of the organizers of Kangrejos, said "I like it when the documentation is human-readable".

Daniel Almeida agreed, asking why the project would want the documentation to be machine-readable. Lossin suggested that it might be possible to have tooling process it, for formal verification. Almeida objected, saying that the existing code linter can ensure the comments are present, and what more is needed?

Lossin suggested that perhaps they could write tooling to check the comments for correctness. Paul McKenney pointed out that if it were both human- and machine-readable, there could be tooling that would expand terse comments into a more detailed form using the dictionary Lossin had composed.

Other attendees remained skeptical. One person pointed out that unsafe code is used exactly when Rust cannot statically guarantee something is safe — so trying to run static analysis on those parts sounds like a fool's errand. Lossin objected that they could simplify common checks, such as checking that a pointer is valid, or check kernel-specific invariants that the wider Rust world doesn't care about. He also pointed out that the Rust compiler has macros that can be used for annotations — those could be used to attach checks to different functions. The downside is that it would raise the difficulty of writing kernel Rust code, something that he was worried about.

McKenney thought that even "stupid" formal verification can be helpful for catching mistakes. Even something as simple as checking that the requirement and justification comments match, somehow, would be valuable.

Weinberger pointed out that the kernel already has sparse, which works on C code, and that could potentially be expanded to Rust. Hindborg noted that there are people using formal verification on some drivers, such as checking that there are no writes outside of a DMA buffer. Formally verifying Rust code has proved to be easier than formally verifying C code in the past.

Despite the possibility of improved tooling, Hindborg was against using machine-readable comments, saying that the Rust-for-Linux project is already bringing in a new language that's hard to accept. Adding some kind of formal notation on top of that will not go well.

Lossin noted that you could make English machine-readable by restricting the grammar and vocabulary. He also pointed out that there are existing tools for formal verification. Despite that, he thought that it made more sense to "surgically insert" formal verification only where it would matter the most, making it opt-in.

Almeida thought that even Lossin's idea of having a standardized dictionary of terms might be a step too far, saying that it could make it harder for people to contribute. Lossin suggested letting contributors know that they can ask for help when writing the comments during the review process if they have trouble with the dictionary.

Miguel Ojeda raised the idea of having different standards for core and leaf functions — the Rust-for-Linux developers could set an example with the core APIs, and let that propagate. Lossin said that the normal documentation already works like that; the kernel crate is entirely documented (as is the Rust standard library), but it's up to individual subsystem maintainers to pick a standard that works for them. Since, ideally, most drivers will not need to use unsafe Rust, maybe the safety documentation is only needed for the kernel crate.

José Marchesi asked how much of the information Lossin wanted comments to capture could instead be expressed in Rust itself. Lossin asked whether he meant in the type system or at run time, and Marchesi clarified that he meant at run time. In Ada, he said, there can be preconditions to a function, and he wondered if Rust had anything similar.

Lossin agreed that it could be a good option to add Rust macros for that, if it could be made to work. Ojeda noted that some formal-verification people are already working on this, and may eventually get pre- and post-conditions into the language itself. But he didn't think that obviated the need for comments.

Gary Guo raised the topic of foreign functions — sometimes, a function is unsafe to call only because it is a foreign function, and there aren't really any requirements to satisfy before using it. Lossin thought that the real problem there was the lack of documentation of the kernel's C functions. In order to know whether the FFI function is really safe, you need to look at the C code. He suggested that perhaps it could make sense to document foreign functions in the C code, and have bindgen (the C-to-Rust bindings generator) transfer the documentation. Guo said that there were a lot of functions that do not really require documentation, and that documenting things in multiple places would certainly not be helpful.

Maciej Falkowski asked why it was necessary to have obvious code accompanied by a safety comment. Lossin said that this gets into why safety documentation is needed in the first place. In C, the programmer needs to argue that the whole program is correct. Having unsafe blocks reduces that global problem to a local property, Lossin said. So if the unsafe blocks are small, only a small amount of context is needed to make sure it's right — which makes the whole system more correct. From that point of view, safety documentation is actually better if it's obvious, because that lets maintainers and people working on the code later have more confidence that the code is correct.

Overall, the attendees were mostly sympathetic to the need for standardized safety documentation, although there was clearly still some disagreement around exactly what form that should take, and whether it should be used as the basis for new tooling. Eventually, the session ran out of time.

[ Thanks to the Linux Foundation, LWN's travel sponsor, for supporting our coverage of Kangrejos. ]

Index entries for this article
Kernel	Development tools/Rust
Conference	Kangrejos/2024

Unsafety can be subtle

Posted Sep 17, 2024 21:39 UTC (Tue) by apoelstra (subscriber, #75205) [Link] (47 responses)

I think the notion that "in C you need to reason about your entire program as though it were all marked unsafe" is a bit misleading and unfair toward C. C has only one kind of pointer type with semantics that are much easier to understand and reason about than those for Rust references.

When I first started working in Rust, because of this sort of messaging, I had the notion that "unsafe Rust is basically just C". In fact, unsafe Rust is much more subtle and dangerous than C. The reason is that Rust references have much stronger requirements on them than C pointers, and Rust references are what you get when you use the & or &mut operators.

Unlike with C pointers, it is illegal to start with a Rust reference and then offset it outside of the bounds of the original object. (In C you can do this as long as the object came from within an array.) Rust references may never be null or dangling; they may not point one-past the end of an array; they may not point to uninitialized memory (except through the special `MaybeUninit` type). If you create such a reference this is immediate UB.

Mutable Rust references may not alias any other reference. If you violate this rule by constructing a bad reference this is immediate UB.

As an example of the sort of problem this can cause, I recently learned of a bug in some unsafe Rust code of mine. It looked like this:

fn bad_load_int_le(buf: &[u8; 32], i: usize) -> u32 {
     debug_assert!(i + mem::size_of::<u32>() <= buf.len());
     let mut data = 0u32;
     ptr::copy_nonoverlapping(
         buf.get_unchecked(i),
         &mut data as *mut _ as *mut u8,
         mem::size_of::<u32>(),
     );
    data.to_le()
}

The C equivalent of this code is pretty straightforward: you take a buffer, assert that it's big enough, memcpy some data into an uint32_t, and then swap the bytes.

But in Rust, the call to buf.get_unchecked(i) returns a &u8 pointing to a single byte of the buffer, and then ptr::copy_nonoverlapping reads past this single byte, and that's UB right there. All the stuff that a reviewer might worry about -- the buffer being too small, pointers being insufficiently aligned, casts between different pointer types, etc -- are all fine. The badness is that the `get_unchecked` method returns a reference instead of a pointer, something that isn't even visible in the code as written.

(This was found by MIRI, an interpreter for Rust's "MIR" intermediate language which is designed to detect violations of the pointer rules, so I am fairly confident that this is actually undefined behavior, and not just extra paranoia.)

Unsafety can be subtle

Posted Sep 18, 2024 2:26 UTC (Wed) by cplaplante (subscriber, #107196) [Link] (15 responses)

I think the mistake here is reaching for unsafe code immediately, rather than spending a bit of time looking at the standard library and finding the (safe) function that does exactly this: https://doc.rust-lang.org/std/primitive.u32.html#method.f...

Unsafety can be subtle

Posted Sep 18, 2024 2:56 UTC (Wed) by intelfx (subscriber, #130118) [Link] (4 responses)

> I think the mistake here is reaching for unsafe code immediately, rather than spending a bit of time looking at the standard library and finding the (safe) function that does exactly this

You can't have a library function for every thing you would ever need. If the language doesn't lend itself to writing unsafe/unrestricted/low-level code without excessive accidental complexity, it might indeed be a problem.

Unsafety can be subtle

Posted Sep 18, 2024 10:31 UTC (Wed) by excors (subscriber, #95769) [Link] (3 responses)

Even without a built-in function, you can implement it easily without `unsafe`:

fn load_int_le(buf: &[u8; 32], i: usize) -> u32 {
    buf[i .. i+4].iter().rfold(0, |n, &b| (n << 8) + (b as u32))
}

On x86, when not inlined, that compiles into a bounds check and a single `mov` instruction.

As a bonus, the bounds check will apply in release mode too (unlike the original example's `debug_assert!()`), and avoids the danger of integer overflow (unlike the original `i + mem::sizeof::<u32>()`), so it's no longer possible for a caller to trigger UB and the code is actually safe.

Or if you don't like doing the arithmetic yourself, you can implement it with `unsafe` but without any tricky references or pointers:

fn load_int_le(buf: &[u8; 32], i: usize) -> u32 {
    let b: [u8; 4] = buf[i .. i+4].try_into().unwrap(); // convert to a fixed-size array type
    u32::from_le(unsafe { mem::transmute(b) })
}

(which I think is what `from_le_bytes` does internally), where `transmute` is basically a byte-wise move between two arbitrary types of the same size. The documentation says `transmute` is "incredibly unsafe", but I think none of its dangers apply when simply converting integers and arrays of integers.

Correctly handling references and pointers in `unsafe` does seem pretty difficult, but you can do a surprising amount in safe or unsafe code without that. I think it involves a significant shift from the C mindset where you naturally think in terms of pointers and bytes - in Rust that should be seen as a last resort when you've exhausted all other possibilities, it shouldn't be the default way of approaching a problem.

Unsafety can be subtle

Posted Sep 19, 2024 12:55 UTC (Thu) by apoelstra (subscriber, #75205) [Link] (2 responses)

>As a bonus, the bounds check will apply in release mode too

This isn't a bonus. The point of the original code was to avoid the bounds check and associated panic path.

>you can implement it with `unsafe` but without any tricky references or pointers:

Aside from your new code now having `try_into` and an explicit panic path, it also illustrates my comment perfectly :). You have replaced code which "narrowly" uses unsafe to do an unchecked array index with code that uses `transmute`, which has pretty-much unlimited chaos potential! In this case your code appears to be correct, but I don't think "avoid pointers, always reach for transmute first" is a reasonable policy for authors of unsafe code.

Unsafety can be subtle

Posted Sep 20, 2024 0:38 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> In this case your code appears to be correct, but I don't think "avoid pointers, always reach for transmute first" is a reasonable policy for authors of unsafe code.

Why? Generally the advice is to use the safest approach if possible and while, technically, transmute have pretty-much unlimited chaos potential, but it have fewer failure modes than transmute_copy (compiler can at least verify that sizes of objects match) and, of course, transmute_copy have fewer failure modes then copy_nonoverlapping (in addition to all failure modes that transmute_copy you now have to deal with possibilities of having two objects overlapping and pointer being invalid, etc).

> You have replaced code which "narrowly" uses unsafe to do an unchecked array index with code that uses `transmute`, which has pretty-much unlimited chaos potential!

That's still a good change. Heck, C++20 even added their own version of transmute citing precisely pifalls of correct use of memcpy (C/C++ version of copy_nonoverlapping).

P.S. I wanted to avoid discussion about why your approach to the whole thing is more-or-less ridiculous by assuming that you just wanted to show how it's not easy to use pointers in Rust, but if your goal is to just read integer then Rust documentation, of course, includes pretty good example which explains how to do what you want to do properly, without any unsafe code here. And if you want to avoid panic there, then the simplest way is to just literally tell the compiler that you know what you are doing:

pub fn read_le_u32(input: &[u8]) -> u32 {
    if input.len() < 4 {
        unsafe { std::hint::unreachable_unchecked() }
    }
    let (int_bytes, _) = input.split_at(std::mem::size_of::<u32>());
    u32::from_le_bytes(int_bytes.try_into().unwrap())
}

That's it. It would be compiled into one instruction and there are really no need to invoke horrors of transmute, transmute_copy or copy_nonoverlapping if you really just want to eliminate the bounds check.

Unsafety can be subtle

Posted Sep 20, 2024 23:02 UTC (Fri) by apoelstra (subscriber, #75205) [Link]

This idea of using `unreachable_unchecked` to just tell the compiler what I know to be true, then using safe unwrapping methods from there on out (since the optimizer will now definitely be able to make the error paths go away), is really cool, and very general!

But I will add my voice to several others asking that you take a kinder tone and offer others the benefit of the doubt (or even, just stop assuming the worst). It makes these technical insights much harder to read and less likely to be noticed.

Unsafety can be subtle

Posted Sep 19, 2024 12:50 UTC (Thu) by apoelstra (subscriber, #75205) [Link] (9 responses)

The method you are quoting takes a [u8; 4], not a slice or a larger array.

Slice to `[u8;4]`

Posted Sep 19, 2024 13:46 UTC (Thu) by farnz (subscriber, #17727) [Link] (8 responses)

Note that you can use try_from to convert a slice of the correct size into an array. Assuming that you've verified that the slice is big enough, this can't actually fail - and you'd expect the compiler to optimize out the failure path as a result.

And, indeed, when I go that route (and yes, this is slightly obscure), I get the optimal form that cannot panic at runtime:


use std::convert::TryInto as _;

pub fn load_int_le(buf: &[u8; 32], i: usize) -> u32 {
    let slice = &buf[0..4];
    u32::from_le_bytes(slice.try_into().expect("load_int_le slicing issue Q1234Qload_int_leQ32Z32Z"))
}

The string "Q1234Qload_int_leQ32Z32Z" is in there as something that's unlikely to appear in the compiled binary by mistake; you can thus have release-mode CI grep for that string and fail if it appears, because you want the failure path to be resolved at compile time; if it does appear, you know where to check.

Slice to `[u8;4]`

Posted Sep 19, 2024 17:02 UTC (Thu) by atnot (subscriber, #124910) [Link]

Personally I've never needed this because usually if you're reading bytes from a file, you'll be dealing with the `Read` trait, which means you can just `reader.read_exact(&mut array[..])` which will let you handle it as an EOF.

I do sometimes wish there was a function to return an array directly in the standard library and save that line of code, but honestly there's enough utility crates like bytemuck, binrw etc. that you'll probably want to be using anyway if you're doing a lot of reading bytes into filetypes.

Slice to `[u8;4]`

Posted Oct 4, 2024 8:19 UTC (Fri) by geert (subscriber, #98403) [Link] (6 responses)

Seems you have dropped using the "i" input parameter value?

Slice to `[u8;4]`

Posted Oct 4, 2024 10:42 UTC (Fri) by farnz (subscriber, #17727) [Link] (5 responses)

You're right - that's the other fun of using "clever" code, in that you can confuse someone trying to refactor it, and the compiler won't complain.

A better translation would be this Godbolt link, which makes the use of unsafe very clearly about not panicking on error, and uses safe functionality for all of the data transformation. It also makes it clear that this function should be marked as unsafe, since as a precondition it requires buf to be long enough.

Slice to `[u8;4]`

Posted Oct 4, 2024 11:41 UTC (Fri) by geert (subscriber, #98403) [Link] (1 responses)

About your original: Ah, adding -Wunused helps.
About the new version: Wow, writing that much code to generate a function with a single asm instruction...

Slice to `[u8;4]`

Posted Oct 4, 2024 11:55 UTC (Fri) by farnz (subscriber, #17727) [Link]

The reason there's so much code for one asm instruction is that most of the code is concerned with things that asm doesn't care about - asm doesn't care if you're reading out of bounds (that's a CPU fault at worst, and a program bug at best), or if you're asking it to read from the "wrong" location, or a whole host of other things that programming languages care about.

And it's similar in size to the dance you need to do with standard C to get the same instruction.

Slice to `[u8;4]`

Posted Oct 4, 2024 12:08 UTC (Fri) by pizza (subscriber, #46) [Link] (2 responses)

> You're right - that's the other fun of using "clever" code, in that you can confuse someone trying to refactor it, and the compiler won't complain.

So... you're saying that the job _isn't_ done when the Rust compiler stops complaining, and that the resultant code might not actually _work_ even though the compiler "proved" it to be "correct"?

(Yeah, yeah, I know you're not one of the obnoxious evangelists making that idiotic claim, but I've been piled upon here multiple times here for making this point..)

Slice to `[u8;4]`

Posted Oct 4, 2024 12:21 UTC (Fri) by farnz (subscriber, #17727) [Link]

As with anything, it depends what you require the compiler to prove. My experience is that Rust code is more likely to use enums, structs and other such type system to make it impossible to represent impossible states, rather than clever code that allows for this sort of bug to slip past.

In other words, the problem here is clever code where you're trying to beat the optimizer, rather than trusting that the optimizer will get it right. And my experience of cheap coders (offshore outsourcing body shops) is that they trust the optimizer instead of trying to be clever, so they'll do the right thing in Rust; I have, however, had a pile-on for expressing that in the past.

Slice to `[u8;4]`

Posted Oct 4, 2024 13:12 UTC (Fri) by Wol (subscriber, #4433) [Link]

> So... you're saying that the job _isn't_ done when the Rust compiler stops complaining, and that the resultant code might not actually _work_ even though the compiler "proved" it to be "correct"?

But isn't that *always* true? Just because the *maths* is correct, doesn't mean the *logic* is sound.

I have a production system that uses bell-curve statistics on a skewed distribution. It mostly works because the data is USUALLY a strong peak with a weak tail on one side. But feed it a double-peak, or a strong tail, and the results will be rubbish. That's a risk we've chosen to accept. The mathematical model is sound. It's just the fit to reality isn't quite right ...

Cheers,
Wol

Unsafety can be subtle

Posted Sep 18, 2024 8:29 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (19 responses)

> But in Rust, the call to buf.get_unchecked(i) returns a &u8 pointing to a single byte of the buffer, and then ptr::copy_nonoverlapping reads past this single byte, and that's UB right there.

Perhaps I'm missing something, but that does not make sense to me. ptr::copy_nonoverlapping takes its first argument as a raw pointer, not a reference. The &u8 is legal when it is constructed (it points to the ith element of buf, which must have been initialized before the function was called or else it was already UB to begin with), and it is then implicitly converted to a raw pointer to match the type expected by ptr::copy_nonoverlapping, before any dereference occurs. At that point, the rules for a raw pointer are more or less the same as C, minus the strict aliasing rule (the assumption is that, if you're using raw pointers, you're already off the beaten path and so you may also be doing some type punning).

Under the strict provenance rules documented in std::ptr, this is technically not allowed, because the conversion to &u8 shrinks the provenance of the pointer, and you're not allowed to get it back just by converting to a raw pointer. But literally nobody follows the strict provenance rules anyway, since they're explicitly marked as experimental and non-normative.

Strict Provenance experiments

Posted Sep 18, 2024 10:29 UTC (Wed) by farnz (subscriber, #17727) [Link]

Note, too, that the point of the strict provenance rules is that they're sufficient but not necessary; if you comply with strict provenance, you're going to be compliant with whatever the real rules are, but you can comply with the real rules without meeting strict provenance rules. The long-term goal is to give you simple rules (strict provenance) for simple cases, and a tower of steadily more complex (and powerful) rules for more complicated cases, where you start at the top with strict provenance, and pick up the specific rules from lower tiers of the tower as needed to make your code safe.

And if the tower is well-constructed, you'll be able to pick and choose which level of the tower you're working at; you can say "all of this is fine under strict provenance" and deal with a big chunk of unsafe code using simple rules, and only have to use the rules of lower in the tower for the stuff that isn't OK at higher levels of the tower, making your correctness proof easier to write.

Unsafety can be subtle

Posted Sep 18, 2024 10:42 UTC (Wed) by khim (subscriber, #9252) [Link] (17 responses)

> But literally nobody follows the strict provenance rules anyway, since they're explicitly marked as experimental and non-normative.

That's quite an assertion. I know a lot of people who follow strict provenance rules, even if they don't know them.

They just try to make Miri, with default settings, happy and this automatically makes you follow the strict provenance rules.

But I think that sentence very nice highlights the difference between C and Rust. And no, difference in not in the fact that Rust's unsafe have too many rules. Nope. In reality it has much smaller amount of rules than C. And that makes it harder to write unsafe code, ironically enough.

> the assumption is that, if you're using raw pointers, you're already off the beaten path and so you may also be doing some type punning

Nope. Rust doesn't do TBAA because it doesn't need to! Even if you are using pointers you still have to ensure that rules for references that exist somewhere outside of your code are not violated.

C and C++ use TBAA as a poor man's substitute for Rust's reference access rules. This works extremely poorly (as in: we know that rules for pointers are complicated that existing standards describe them – but we still have no idea what are the exact rules for pointers in C. There are dozens of proposals, none of them implemented: one, two, three…

And that's the code difference: rules that C places on your are insanely complicated and unknowable (when rules are not actually written and the only way to know for sure is to show your code to dozen of compiler developers from different companies… you can be sure that is much closer to “nobody does that, anyway”), while rules that Rust places on your are knowable and understandable, if not finalized (two main of these are stacked borrows and tree borrows), but, more importantly, they are actually used and followed by real developers (even if in the form of “making Miri happy”)

And that is core difference: complicated rules of C are, mostly, just ignored by developers (and kernel developers even have an excuse: Linus is famous enough and kernel is important enough so that approach of “we violated all your rules, but don't you dare to miscompile our code” approach mostly works), while Rust rules are followed!

Of course if you ignore rules in one case and [try to] follow them in the other case then it doesn't actually matter how complicated rules are in first case and how simple they are in the second case: following the rules would always be harder then ignoring them!

Unsafety can be subtle

Posted Sep 18, 2024 18:37 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (16 responses)

OK, "nobody" was maybe an exaggeration. But if something is marked as "non-normative and experimental" in the documentation, I personally am going to ignore it until such time as that label goes away.

The other problem is that you have to read these rules extremely carefully to even notice that this is illegal in the first place, because they're almost exclusively focused on the usize -> ptr conversion, and barely acknowledge that "shrinking provenance" is a thing that can happen through other operations. I still cannot find a single, authoritative source which explicitly documents all of the operations that can shrink provenance, and I'm not 100% convinced that it has even been formalized in the first place.

Unsafety can be subtle

Posted Sep 18, 2024 21:37 UTC (Wed) by khim (subscriber, #9252) [Link] (14 responses)

> I still cannot find a single, authoritative source which explicitly documents all of the operations that can shrink provenance, and I'm not 100% convinced that it has even been formalized in the first place.

How is this even relevant? There are no “shrinkage of provenance” in that example. You have reference that's pointing to one, single byte. That means that you get to access just one, single, byte and nothing more.

That's direct consequence of refusal to adopt insanity of C/C++ TBAA: if we no longer consider memory as typed where different pieces of memory turn into differently typed objects magically, then boundaries of objects are now parts of pointers and references, for how else can we reason about them?

This automatically and immediately makes this example illegal. Which is good for everyone concerned: given the fact that out-of-object accesses are core issue with 90% of all exploits we don't want to declare such accesses “normal”.

> But if something is marked as "non-normative and experimental" in the documentation, I personally am going to ignore it until such time as that label goes away.

Then you couldn't do many things then these “non-normative and experimental” models give you. You are left, essentially, with this: The result of casting a reference to a pointer is valid for as long as the underlying object is live and no reference (just raw pointers) is used to access the same memory.

All other usages (in particular the potential ability to go from a single byte to something larger) need these “non-normative and experimental” models because without them they all these operations fall under the the precise rules for validity are not determined yet clause.

This reading, most definitely, doesn't give you the ability to access anything outside of that one, single, byte that you received from get_unchecked

Unsafety can be subtle

Posted Sep 18, 2024 22:39 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (13 responses)

> How is this even relevant? There are no “shrinkage of provenance” in that example. You have reference that's pointing to one, single byte. That means that you get to access just one, single, byte and nothing more.

Because the reference was returned by slice::get_unchecked, and slice::get_unchecked is documented to have the same semantic meaning as C pointer arithmetic (but with the added constraint that it is illegal to construct a "one past the end" pointer, unlike in C where that is legal). C pointer arithmetic is generally understood to return a pointer with provenance over the whole array, so if you're going to change that rule, you really ought to do so explicitly.

Note also that a completely literal interpretation of your argument would equally well apply to slice::as_ptr(), because the return type of that function is const *T, not some nonexistent unsafe slice type (and so a pedantically literal interpretation is that you may only access the first element of the slice through said pointer, because it is a pointer to T, not a pointer to unsafe-slice-ish of T).

In fact, I cannot find a solid reason to allow one and not the other. Nothing in either function's documentation clarifies that there is a meaningful distinction. At the same time, it is obvious (by the existence of e.g. as_ptr_range()) that this is at least meant to be allowed in the case of as_ptr(). So what's so special about get_unchecked() that makes it different, exactly?

> That's direct consequence of refusal to adopt insanity of C/C++ TBAA: if we no longer consider memory as typed where different pieces of memory turn into differently typed objects magically, then boundaries of objects are now parts of pointers and references, for how else can we reason about them?

It sounds as if you're saying that the rules for pointers are the same as the rules for references, which I agree makes logical sense. Unfortunately, in this case, we have the option of saying that it is indeed legal to do pointer arithmetic on pointers derived from get_unchecked(), we have the option of saying that references have stricter provenance rules than raw pointers, or we have the option of saying that it is not possible to use as_ptr_range() for anything other than accessing the first element. The latter is obviously ridiculous and untenable, so which of the first two is your interpretation of this case? If it is the second one (stricter provenance for references than for raw pointers), what exactly are those rules, and where can I read about them?

Unsafety can be subtle

Posted Sep 18, 2024 23:26 UTC (Wed) by khim (subscriber, #9252) [Link] (8 responses)

> Because the reference was returned by slice::get_unchecked, and slice::get_unchecked is documented to have the same semantic meaning as C pointer arithmetic (but with the added constraint that it is illegal to construct a "one past the end" pointer, unlike in C where that is legal).

Where is it documented that way? Official documentation is pretty clear: returns a reference to an element or subslice, without doing bounds checking. And subslice there means not any random subslice but specifically subslice specified by something like Range. That's not how it was used in the discussed example, thus we can ignore that case.

Nowhere does it tell is that you get permission to access anything outside of that single element (or slice).

> C pointer arithmetic is generally understood to return a pointer with provenance over the whole array, so if you're going to change that rule, you really ought to do so explicitly.

If Rust would have been a new version of C then sure. But Rust is not a new version of C and, in general, in Rust, you don't get the permission to access something that lies outside of that object that function gives you and, most importantly, nowhere near that point raw pointers are even discussed.

Why function from not C, that returns something that C doesn't support and in a fashion that is typical for that language should include warning in the form of “hey, if you know about C then remember that this function works like a Rust function and not like a C function”. That would be really very strange. Obnoxious and repetitive (remember that most Rust developers don't, actually, know C, they come from Java or Python background).

> Note also that a completely literal interpretation of your argument would equally well apply to slice::as_ptr()

Sure, but result would be different. Because that one is different: it returns a raw pointer to the slice’s buffer.

There are big difference betwee a reference to an element or subslice and pointer to the slice’s buffer.

Element is, well, element. Singular. One byte. Buffer is not one element, it's content of the entire slice.

> Nothing in either function's documentation clarifies that there is a meaningful distinction.

You couldn't tell the difference between “one” and “many”? That's explained pretty early in most schools. Maybe you forgot?

> So what's so special about get_unchecked() that makes it different, exactly?

Well… the fact that it does different thing?

> Unfortunately, in this case, we have the option of saying that it is indeed legal to do pointer arithmetic on pointers derived from get_unchecked(),

Whoa, whoa, whoa. Of course you can do pointer aritmetic – as long as you don't go beyond boundaries of that one, single, byte that you have got access to!

What else do you expect if you have got reference to one byte and not reference to the whole buffer?

> or we have the option of saying that it is not possible to use as_ptr_range() for anything other than accessing the first element.

Whoa, whoa, whoa. Why would that be the case? One function returns reference to one, single, element (or subslice), another function returns pointer to buffer and third one returns two raw pointers spanning the slice.

Why would they behave identically if they return references and/or pointers that deal with different entities. How is that ever logical?

> If it is the second one (stricter provenance for references than for raw pointers), what exactly are those rules, and where can I read about them?

Why would you need any such rules? If your pointer or reference are pointing to one element, then you can touch that element and nothing else. If your pointer or reference are pointing to the entire buffer then the whole buffer is fair game.

It's really as simple as that: there are no radical difference between pointer and reference in the [naïve] approach to their provenance (experimental models relax the restrictions, don't add a new ones), but these functions return references (or pointers) to a different objects, why should they behave identically?

Unsafety can be subtle

Posted Sep 19, 2024 0:03 UTC (Thu) by atnot (subscriber, #124910) [Link]

Not to call the pot black as a kettle, but I would greatly appreciate if you could stop having the same near-identical heated argument under every C or Rust related post, and presumably so would others that haven't muted you yet.

Unsafety can be subtle

Posted Sep 19, 2024 3:34 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (4 responses)

> If Rust would have been a new version of C then sure. But Rust is not a new version of C and, in general, in Rust, you don't get the permission to access something that lies outside of that object that function gives you and, most importantly, nowhere near that point raw pointers are even discussed.

Rust cannot escape the shadow of C, as you can see by the the functions as_ptr() and friends, since they exist primarily for interoperation with C (as_ptr_range() says this explicitly). It is quite unreasonable to assume that people writing unsafe Rust have no preconceptions from C whatsoever.

> There are big difference betwee a reference to an element or subslice and pointer to the slice’s buffer.

The bytes that make up the element are also the bytes that make up the buffer, so it is perfectly reasonable to interpret the two as synonymous when reading documentation casually. If this is really an important distinction, then it needs an entire section of the 'nomicon, not just a couple of stray words in std::ptr and then a minor difference in phrasing here or there in other parts of std. This is not a game of Clue - nobody is going to carefully scrutinize every word of docs.rust.org trying to figure out whether Professor Plum did it with the misaligned pointer in the .text section.

Either you tell everyone what the rules are in a highly explicit and unambiguous way, or some programmers will misunderstand them, and then compiler writers will once again be stuck having to stick flags all over their shiny new optimizer because it breaks some legacy code that was wrong when it was written.

Unsafety can be subtle

Posted Sep 19, 2024 13:18 UTC (Thu) by khim (subscriber, #9252) [Link] (3 responses)

> It is quite unreasonable to assume that people writing unsafe Rust have no preconceptions from C whatsoever.

Yet it would be even more unreasonable, some would even say preposterious, to assume that everyone who uses Rust and even unsafe Rust is a C programmer.

People who are using Rust and who never used C do exists and there would be more of them over the time.

I wouldn't be surprised to find out that there are more of them already than Rust users that know C.

The desire to bring Rust in kernel is driven, in large part, by the desire to bring precisely these people into the mix.

> The bytes that make up the element are also the bytes that make up the buffer, so it is perfectly reasonable to interpret the two as synonymous when reading documentation casually.

That's not true for the majority of the languages. If we only include popular languages then it's, essentially, true only for C and C++. C#, Go, Java, JavaScript, Python… nowhere would you find such equation. Most of these languages don't give you access to the part of the object, but if such ability is there then you get to touch that part and nothing else.

> Rust cannot escape the shadow of C, as you can see by the the functions as_ptr() and friends, since they exist primarily for interoperation with C (as_ptr_range() says this explicitly).

That's true, for some degree, but you can write lots of apps for years in Rust and never hit the need to know that dark corner of the language. Pushing its existence in the description of various perfectly normal, “safe” function would definitely be strange.

Equally as strange as saying that you have to go and learn C before you would attempt Rust.

Remember that Rust wasn't even imagined as a replacement for C and C++, it was originally not developed for that audience and such work is still not it's primary focus.

Rust developers are ready to accommodate certain requests to help these people, but “fuck everyone without C experience and make their life miserable for the sake of 10% of Rust users who are using it as C replacement” doesn't sound like a reasonable request.

> Either you tell everyone what the rules are in a highly explicit and unambiguous way, or some programmers will misunderstand them,

Nope. This doesn't work like this. “I will ignore any and all rules that you may place on me” is an attitude problem, not documentation problem. And the only solution is social. Rust community does well there thus there's a chance.

Unsafe Rust is hard. You just have to accept it. Yes, in an ideal world it would be easy. Yes, people know it's a mess and try to fix that. But right now the story is: you have to be careful.

Sticking with strict provenance is your best bet right now (after you have looked on safe sound wrapper around unsafety and couldn't find a suitable one for your usecase, of course). If you can do that. These are very strict, yet sensible, rules and they allow 99% of code that needs unsafe to be written. And the remaining 1%… they are still working on it.

Complaining about the fact that something that is known to be underdocumented and hard and just saying that it's underdocumented and hard wouldn't help you if you have no constructive ideas about how to change that situation.

> and then compiler writers will once again be stuck having to stick flags all over their shiny new optimizer because it breaks some legacy code that was wrong when it was written.

Not if they would proactively kick out people who are writing code that ignores the rules.

It's one thing to say “hey, I don't know how to implement this kind of design while staying in the boundaries outlined by the strict provenance approach”. Such stories (if justified and without any simple solution that would make them strict-provenance compliant) are always welcome on the URLO and IRLO or even just the bugtracker.

But the first line of defense should always be an attempt to not use the unsafe code in the first place! And the second line is to use sound API. And third line is strict provenance. And only after all these possibilities are exhausted you go to these forums in a search of a solution.

You tell us: nobody is going to carefully scrutinize every word of docs.rust.org trying to figure out whether Professor Plum did it with the misaligned pointer in the .text section – but that's exactly how fully conforming C code is supposed to be written! You already have to look on subtle nuiances of wording of the C standard and defect reports and other such things… Rust just makes it easier because there are various forums where you can ask Rust language developers for clarification.

Unfortunately “we code for the hardware” people don't even stop to think about whether it's possible to do something while staying within boundaries – but that's social problem and can not be solved by technical means.

Technology may just make conformance easier. And Rust language developers are trying to help with documentation and tooling (Miri is a must-have for someone who develops unsafe code in Rust), but, ultimately, only developer can be responsible for the conformance with the rules. Part where compiler does it's magic and where you don't have to carefully scrutinize every word exists on the other side of unsafe!

Unsafety can be subtle

Posted Sep 19, 2024 23:03 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (2 responses)

> Rust developers are ready to accommodate certain requests to help these people, but “fuck everyone without C experience and make their life miserable for the sake of 10% of Rust users who are using it as C replacement” doesn't sound like a reasonable request.

Please stop making quotes up and attributing them to other people. It is extremely rude.

> Nope. This doesn't work like this. “I will ignore any and all rules that you may place on me” is an attitude problem, not documentation problem.

Ibid.

> It's one thing to say “hey, I don't know how to implement this kind of design while staying in the boundaries outlined by the strict provenance approach”. Such stories (if justified and without any simple solution that would make them strict-provenance compliant) are always welcome on the URLO and IRLO or even just the bugtracker.

I literally just told you a story like that, and you responded by making up a bunch of things I didn't say, and then ridiculing those things.

Since this conversation is clearly going nowhere, I'm going to bow out.

Unsafety can be subtle

Posted Sep 20, 2024 0:51 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> I literally just told you a story like that, and you responded by making up a bunch of things I didn't say,

How is it different from how you make up piece of documentation that never existed and write “slice::get_unchecked is documented to have the same semantic meaning as C pointer arithmetic” then use that “a bunch of things [others] didn't say” as justification for your assertions?

I have even gave you benefit of doubt and asked just where it was documented that way.

And in the end, instead of showing us something like that you decided to blame the documentation for your inability to read: “it is perfectly reasonable to interpret the two as synonymous when reading documentation casually”.

And when I ridiculed it editor only had to say this: “This is an on-topic discussion, but please remember to keep things polite”.

And yes, I agree, I wasn't polite, but I, at least, stick to the facts.

> and then ridiculing those things.

You want to say that only you can lie (that one even you admitted as a lie: “but literally nobody follows the strict provenance rules anyway, since they're explicitly marked as experimental and non-normative“) and then use these lies to misrepresent things? Others couldn't do that?

Let me quite myself, for a change (you haven't misquoted me, just ignored what I say): Rust developer would like to hear “hey, I don't know how to implement this kind of design while staying in the boundaries outlined by the strict provenance approach” stories. Not stories related to someone's inability to read or understand the documentation (documentation writers do accept patches for such cases, but, as noted, it's not clear how to change the documentation and force someone to actually try to read and understand it… I'm not even sure that's possible at all)!

And you tell me that we are discussing exactly such story here. But that's big fat lie, author of the original example even admitted that original code that was discussed was brought to compliance with stacked borrows (and the whole example that we are discussing here really-really doesn't need any pointers and thus stacked borrows or other such horrors, if you want to eliminate bounds checking then call to one unsafe function with literally zero arguments is absolutely enough).

Thus it doesn't see an example of design that is limited by strict provenance. Not even remotely close to it. Sorry.

Unsafety can be subtle

Posted Sep 20, 2024 11:20 UTC (Fri) by daroc (editor, #160859) [Link]

Let's stop discussing this here. It doesn't look like you're likely to change NYKevin's mind, and it seems like this comment doesn't really contribute any new technical information. Sometimes the best thing you can say is just "I still don't agree with you, but thanks for having this discussion with me."

Unsafety can be subtle

Posted Sep 19, 2024 11:53 UTC (Thu) by daroc (editor, #160859) [Link]

This is an on-topic discussion, but please remember to keep things polite. Specifically:

> You couldn't tell the difference between “one” and “many”? That's explained pretty early in most schools. Maybe you forgot?

This is not polite or respectful, and doesn't add anything interesting to your technical point. Please avoid insulting other commenters.

Unsafety can be subtle

Posted Sep 21, 2024 6:45 UTC (Sat) by ralfj (subscriber, #172874) [Link]

khim, I understand you are passionate about Rust-for-Linux and want to see that project succeed, but I think unnecessarily aggressive comments like this are hurting the cause. It is not necessary to repeat the point ("one element" vs "entire buffer") so many times in your answer, nor is it appropriate to suggest people forgot basic things taught early in school. Such a comment will not manage to convince anyone of your position, it will just make people uninterested in engaging with you. Please re-think your communication strategy and try to be a little more empathetic towards the position of others. :)

Kind regards,
Ralf

Unsafety can be subtle

Posted Sep 19, 2024 13:42 UTC (Thu) by apoelstra (subscriber, #75205) [Link] (3 responses)

>So what's so special about get_unchecked() that makes it different, exactly?

To restate khim's post much more briefly: the difference is that as_ptr returns a raw pointer while get_unchecked returns a reference.

I agree that it'd be helpful to say this in the docs (as well as to mention that you can call get_unchecked with a range if you want access to more than one element, which is how I ended up fixing my original code). I don't even think you need to mention C.

Though having said that, `get_unchecked` is a very old method and one of the classic "here's a case where you'd need to use unsafe" methods, so I'm skeptical that many users will carefully (or even casually) read the docs.

Unsafety can be subtle

Posted Sep 19, 2024 14:20 UTC (Thu) by khim (subscriber, #9252) [Link] (2 responses)

> Though having said that, `get_unchecked` is a very old method and one of the classic "here's a case where you'd need to use unsafe" methods, so I'm skeptical that many users will carefully (or even casually) read the docs.

Note that get_unchecked is, actually, range-check-skipping version of get and it even explicitly tells you: For a safe alternative see get (with clickable link!), which actually then tells you that you may either access one element or range of elements with this method.

Your situation is a bit unusual because you have immediately decided to skip all the range-checking without testing and without verifying that these are the bottlenecks and are worth eliminating (note that if they are easy to eliminate for human then often compiler can easily eliminate them, too, after inlining, which means these checks are not, necessarily, a performance bottlenecks and often where they are performance bottlenecks they are, often, desirable because if compiler couldn't prove that they are not needed then chances are high that it's not actually guaranteed by the structure of your code, either).

That's why you have missed all these explanations (that are there for get, but not for it's range-check-skipping sibling).

That's not how slice is supposed to be used, normally! I guess documentation was just not written with people which would try to remove as many guardrails as possible as early as possible.

Are you even sure in your case use of get_unchecked actually brings measurable performance wins over the use of get?

Unsafety can be subtle

Posted Sep 19, 2024 19:09 UTC (Thu) by apoelstra (subscriber, #75205) [Link] (1 responses)

>Your situation is a bit unusual because you have immediately decided to skip all the range-checking without testing and without verifying that these are the bottlenecks and are worth eliminating

This is an incredible inference from example code I posted to illustrate pointer provenance rules and which has never actually been deployed in the form I described.

Unsafety can be subtle

Posted Sep 19, 2024 19:35 UTC (Thu) by khim (subscriber, #9252) [Link]

> This is an incredible inference from example code I posted to illustrate pointer provenance rules and which has never actually been deployed in the form I described.

That's not not inference from the example code, but more of inference from your comments. You are using unsafe function to do some manipulations that usually are done with normal, safe, code in an unsafe way by using function that's sole purpose is to save a tiny amount of execution time by bypassing the usual safety checks. And then you add debug_assert that, essentially, restores error-checking that is normally used by Rust programs unconditionally in debug build. And then complain that function that you used to bypass normal safety checks have poor documentation – while asking for clarifications that are already present in the sibling version of that same function, that's already referred as “safe alternative” there and which is one click away. Which means that, you probably, have never read the documentation for the safe variant: how else would you not notice that this variant already addresses all you concerns?

Forgive me, but I fail to see how one would end up in that situation except if one have apriori decided for oneself that one is not interested in writing normal, safe code and would stick to the removal of Rust-provided guardrails as much as possible.

Note: I'm not even saying that's it's wrong to write code that way. Maybe, for the kernel needs, it's even the best way and you haven't forgotten to erect manual check after you removed the normal Rust-provided provided check there.

But that definitely puts you into a different usage mode: normally it's expected that Rust developer would stick to normal, “safe” interface as much as possible and would only remove guardrails where that's truly justified.

Going down that “normal” road it's almost impossible to miss detailed specification that explains what get function works, how is it supposed to be used, there are examples with a single argument and a range, etc.

Provenance rules not yet formalised

Posted Sep 19, 2024 9:11 UTC (Thu) by farnz (subscriber, #17727) [Link]

You're right to say that the rules for provenance haven't yet been fully formalised, and I suspect that the operation you're talking about is in that set.

Right now, the "real" rules for pointer provenance are "whatever LLVM happens to do on this machine at this version, given that LLVM doesn't want to do anything that will shock C or C++ programmers". Strict provenance is a set of rules that are guaranteed to be stricter than whatever the final rules will be; the intention is that the final formalization of provenance rules will end up being equivalent to "strict provenance, except that these things banned by strict provenance are actually allowed in the final rules".

It'll probably not be written that way, though; there will be strict provenance rules (as today), and then a set of rules that are verified to be a superset of strict provenance rules. There may even be multiple tiers of this, where each tier is both less strict and harder to verify your code against than the tier above.

The idea remains, though, that if your code is compliant with strict provenance rules, it will definitely be compliant with the final rules for pointer provenance; whether those be PVNI-ae-udi from draft N2676 for the C language, or something with more subtleties in it. And there's thus a promise from the Rust compiler developers that they'll avoid breaking unsafe code that complies with strict provenance, even though the final provenance rules haven't yet been determined.

Unsafety can be subtle

Posted Sep 18, 2024 9:35 UTC (Wed) by donald.buczek (subscriber, #112892) [Link] (6 responses)

Hmm, interesting. Are you sure, this is UB and not just a case which is not yet handled by miri's stacked borrow checks? I think these are still a bit experimental.

Can you identify a concrete rule which is violated?

- https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping....
- https://doc.rust-lang.org/std/ptr/index.html#safety

I can't.

Unsafety can be subtle

Posted Sep 18, 2024 11:09 UTC (Wed) by khim (subscriber, #9252) [Link] (5 responses)

> Are you sure, this is UB and not just a case which is not yet handled by miri's stacked borrow checks?

The problem is not with “miri checks”, the problem is with stacked borrows model, itself.

> Can you identify a concrete rule which is violated?

Can we, please, stop the language lawyering. That's how C turned into the disaster, there are no need to repeat that story.

The idea here is to use conservative rules as long as the work and talk with language developers about complicated cases when these conservative rules couldn't be satisfied.

Currently there are two sets of rules: stacked borrows and tree borrows. They are a bit different and both support some code that the other rejects. Miri supports both.

Try to run your code with them and see if it would accept it. It would be nice to write somewhere which model are you using, but please don't dismiss them.

There are known issues, sure (e.g. Rust doesn't provide any way to attach pointer provenance to function pointers and yet Miri would complain about these issues). Discussion is still ongoing but since that's an important case where even documentation recommends things that Miri currently rejects you may assume it's Ok to use it, for now.

The main problem with unsafe Rust is not that it's rules are too complicated but that these rules are not finalized!

Of course in that same sense the whole C/C++ are not finalized WRT what can be done with pointers and what couldn't be done with pointers, but in Rust people try to create sane rules… that's harder than ignoring problems and doing nothing, of course, but still… if you have legitimate design patterns that couldn't be reconciled with Miri in both modes – talk to Ralf, file bugs, etc. Don't dismiss these issues by saying “oh. well, these are just some experiements, it's Ok to ignore them”.

P.S. Frankly, I suspect it could be easier to use tree borrows with Rust-in-Kernel because it allows more tricks that expand pointers validity while cases which it rejects are mostly irrelevant for kernel, but to know for sure we need to run code with both and compare.

Unsafety can be subtle

Posted Sep 18, 2024 11:51 UTC (Wed) by pizza (subscriber, #46) [Link] (4 responses)

> Can we, please, stop the language lawyering. That's how C turned into the disaster, there are no need to repeat that story.

Language lawyering is how "problems" with your language are identified [1]. How those "problems" are handled/resolved is another matter entirely.

[1] Strongly enforcing a still-growing set of highly opinionated rules is the *entire point* of Rust.

Unsafety can be subtle

Posted Sep 18, 2024 12:19 UTC (Wed) by khim (subscriber, #9252) [Link] (3 responses)

> Language lawyering is how "problems" with your language are identified

Sure. That's how you cause dissent and doubt. Why is that needed here?

> Strongly enforcing a still-growing set of highly opinionated rules is the *entire point* of Rust.

No. The “entire point” of Rust is to make program safer. Highly-opinionated rules (as well as any other rules) help as long as they are followed.

And language lawyering (attempt to use the documentation as holy gospel while forcibly ignoring things that exist outside of it) is not how rules are followed, but the exact opposite: it's the way to dig extra rules from the exact same text.

It's useful activity only as long as you are planning to improve the documentation itself. To show that documentation contradicts the implementation or is deficient for some other reason (e.g. if it have self-contradictions or something that couldn't be supported), etc.

When you are doing that to find an excuse to do or not do something… this just splinters the community, because different people may “discover” different things in the exact same “holy” text.

Unsafety can be subtle

Posted Sep 18, 2024 12:35 UTC (Wed) by pizza (subscriber, #46) [Link] (2 responses)

> No. The “entire point” of Rust is to make program safer. Highly-opinionated rules (as well as any other rules) help as long as they are followed.

...And what do you do if the rules are ambiguous or do not match reality?

while (<undesireable behavior occurs>) {
if (<behavior allowed by spec>) {
if (!<we care>) {
break; // The typical answer of C language stewards
} else {
// bug in specification
fix_spec(); // The typical answer of Rust language stewards
}
} else {
// bug in compiler/tooling
fix_implementation();
}
}

Unsafety can be subtle

Posted Sep 18, 2024 13:03 UTC (Wed) by khim (subscriber, #9252) [Link] (1 responses)

> ...And what do you do if the rules are ambiguous or do not match reality?

The important thing is what you don't do. You don't say “can you identify a concrete rule which is violated?”

This asserts that only documentation and rules included in it matters and everything outside of it just simply doesn't exist or could be ignored.

Unsafety can be subtle

Posted Sep 19, 2024 9:01 UTC (Thu) by donald.buczek (subscriber, #112892) [Link]

> Can we, please, stop the language lawyering.

I don't understand why you have to devalue your otherwise interesting answers by using unnecessary fighting words.

> You don't say “can you identify a concrete rule which is violated?”
> This asserts that only documentation and rules included in it matters and everything outside of it just simply doesn't exist or could be ignored.

This interpretation of my question is wrong.

Unsafety can be subtle

Posted Sep 20, 2024 2:48 UTC (Fri) by comex (subscriber, #71521) [Link] (1 responses)

FYI, the main pattern you cited - where you take a reference to part of an allocation, then convert it to a pointer and index out of bounds of the referenced type (but still in bounds of the allocation) - may end up becoming defined behavior.

It is undefined behavior according to Stacked Borrows. But it's not undefined behavior [1] according to Tree Borrows, which is an alternative model that Rust may end up officially adopting instead of Stacked Borrows. Ralf Jung has also explored the idea of modifying Stacked Borrows to make it legal [2]. See also the canonical issue report about this pattern [3].

Broadly speaking, this pattern is something that would be nice to make legal at least for immutable references / reads, because there aren't any current optimizations, or highly-desired future optimizations, that would break it.

The other pattern you cited - aliasing with mutable references - is different. It needs to be UB in _some_ form, because it's used to justify an important optimization: marking function parameters of reference type with LLVM's `noalias` attribute, equivalent to C `restrict`. Since most C code does not use `restrict`, this is and will remain an area where Rust unsafe is harder to reason about than C.

That said, Tree Borrows affects this pattern as well, and at least reduces the amount of UB compared to Stacked Borrows. Specifically, if you create an aliasing mutable reference but don't actually write through it (at least not to the same bytes that you later read from), then you're probably okay under Tree Borrows. [5]

(For pedantry's sake, I should mention that even Stacked Borrows' rule is less strict than how you phrased it, but only slightly.)

[1] https://github.com/BoxyUwU/rust-quiz/issues/11
[2] https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Making.20Stacked.20Borrows.20better.20with.20ugly.20hacks
[3] https://github.com/rust-lang/unsafe-code-guidelines/issue...
[4] https://github.com/rust-lang/unsafe-code-guidelines/issue...
[5] https://perso.crans.org/vanille/treebor/protectors.html#n...

Unsafety can be subtle

Posted Sep 21, 2024 6:22 UTC (Sat) by ralfj (subscriber, #172874) [Link]

> Ralf Jung has also explored the idea of modifying Stacked Borrows to make it legal [2].

Unfortunately those modifications wouldn't make this particular code legal. I don't know any way to achieve that without going to something like Tree Borrows.

Unsafety can be subtle

Posted Sep 21, 2024 6:18 UTC (Sat) by ralfj (subscriber, #172874) [Link]

Note that the error Miri shows for this code has the following:

= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/... for further information

It is hard to find a set of rules for these aliasing requirements that is precise, can be automatically checked, and allows the desired optimizations. We don't have the final answer yet. That's why there are no strong warnings against this kind of code in the docs: It is deep inside the "gray area", and it is not something we actually already use for optimizations. Restricting references to only access the field/element they point to is something I strongly think we should relax (and that's what the discussion at https://github.com/rust-lang/unsafe-code-guidelines/issue... is about). This is especially true for your case, where the reference points into an array, for the reasons you give -- pointer arithmetic inside arrays is legal even in C, so we should avoid such pitfalls in that space as much as we can. (Unlike C, Rust also permits pointer arithmetic in structs if you use raw pointers instead of references.) But it's hard to make Stacked Borrows accept this code without making Stacked Borrows accept *too much* code.

Nobody forces you to make your code compatible with Stacked Borrows, it is an *experimental* model after all. But if you want to be as sure as you can currently be that your code is fine with whatever the future aliasing model ends up being, then it is a good idea to do the extra work and fix Stacked Borrows errors. Think of it like adding a safety margin -- we don't know where the edge between "UB" and "okay" will be, so it is better to err on the side of staying a bit conservative and avoiding the gray area. If you are okay with living a bit more "on the edge" and an increased risk of having to adjust your code in the future as the aliasing model solidifies, you can use "-Zmiri-tree-borrows" to switch to a different, even more experimental aliasing model that *will* accept your code. This model should still be able to catch all UB that is exploited by today's optimizer (but the optimizer can change in the future, and we're not ready yet to commit to a hard set of rules here).

> so I am fairly confident that this is actually undefined behavior, and not just extra paranoia

It is Undefined Behavior under *experimental* rules, as the error clearly indicates. The rules are deliberately conservative, so there is some extra paranoia involved here. We don't currently instruct the optimizer to actually make us of this particular UB, but there is other Stacked Borrows UB that we *do* use in the optimizer, and a bunch of Stacked Borrows UB that we *may want to* use in the optimizer. It's unclear how to tell these apart, hence this error.

Why this code possibly has undefined behavior using strict providence

Posted Oct 6, 2024 18:13 UTC (Sun) by hkBst (guest, #173872) [Link]

For others wondering how this code can be UB: this code triggers a UB error from Miri due to provenance issues, the exact rules of which are still being worked out, but which do have a simple superset, called strict provenance. The exact rule which is relevant for this example is that a pointer derived from a reference/slice only has valid provenance if it points to where that reference/slice can point to. Given this analysis it is simple to satisfy Miri with a small change to the code[1]:

```rust
use std::{mem,ptr};

fn bad_load_int_le(buf: &[u8; 32], i: usize) -> u32 {
assert!(i + mem::size_of::<u32>() <= buf.len());
let mut data = 0u32;
unsafe {
ptr::copy_nonoverlapping(
buf.get_unchecked(i), // provenance problem because only references the first byte
//buf.get_unchecked(i..i + mem::size_of::<u32>()).as_ptr(), // reference all four bytes
//buf.as_ptr().add(i), // reference entire buf
&mut data as *mut _ as *mut u8,
mem::size_of::<u32>(),
);
}
data.to_le()
}

// this let's you use Miri
fn main() {
let data = [0u8; 32];
bad_load_int_le(&data, 0);
}
```

[1]:https://play.rust-lang.org/?version=stable&mode=debug...

comments can end up being very good at _hiding_ bugs

Posted Sep 17, 2024 23:00 UTC (Tue) by viro (subscriber, #7872) [Link] (1 responses)

... from the author, in the first place. The main reason why reviewing each other's code is more efficient than reviewing your own is that when you _know_ what that chunk is supposed to be doing, you are prone to looking through it hell knows how many times without actually reading it - you skip details and interpolate what obviously should be there. Gets only nastier as debugging session goes on; taking a break or asking somebody else to look it through helps exactly by breaking out of that mode.

IME the combination of comment saying the right thing with opaque chunk of code being almost, but not quite matching it is one hell of a way to prime that kind of trap. And what I've seen of safety comments in Rust patches raises all kinds of red flags of that sort...

Basically, a reaction along the lines of "that looks tricky; is it even correct?" is inhibited by assertion "this is correct for <incomprehensible prose, presumably containing reasons>". And then there's https://www.tuhs.org/PUPS/Usenet/notunderstand.txt ...

comments can end up being very good at _hiding_ bugs

Posted Sep 21, 2024 6:21 UTC (Sat) by ralfj (subscriber, #172874) [Link]

Personally, I find such comments to be invaluable when reviewing tricky unsafe code -- I can compare the comments (expressing the authors intent in a fine-grained way) against the actual code in a fairly local way, which is a lot easier than trying to figure out how everything in the current function works together to make this line correct. It's the difference between things being checkable in a local way vs having to globally keep everything in my head at once.