The perils of pinning

Posted Sep 16, 2022 7:52 UTC (Fri) by Karellen (subscriber, #67644)
In reply to: The perils of pinning by gray_-_wolf
Parent article: The perils of pinning

I'm at a similar level of knowledge with rust, and I'm getting stuck on:

the Rust compiler will happily move objects in memory whenever that seems like the thing to do. Since the compiler knows where the references to an object are, it can move that object safely — most of the time.

[...]

initializing a list_head structure to indicate an empty list is done by setting both the next and prev fields to point to the structure itself. If, after that happens, the compiler decides to move the structure, those pointers will now point to the wrong place;

If the compiler knows where the references to an object are, and can presumably update them, why doesn't it update prev and next when it does the move?

The perils of pinning

Posted Sep 16, 2022 8:27 UTC (Fri) by rsidd (subscriber, #2582)

I am guessing it knows where the references to an object are other than self-references.

That is, if A refers to B, and you move B, you can update A since you know A refers to it. But if B refers to B and you move B, it doesn't update.

Why not (i.e., why doesn't it recognize that B refers to B and update it) is an interesting question. Maybe it's somewhere deep in the language design: reference-counting doesn't include self-references? And maybe it's hard to fix?

Actually I have a more basic confusion: how does the compiler move objects in memory? Isn't that a run-time thing?

The perils of pinning

Posted Sep 16, 2022 10:33 UTC (Fri) by Bigos (subscriber, #96807)

The compiler is able to generate code that moves an object from one place to another, but allows that only when the object is owned*.

When the object is owned, the compiler knows that no references to it exist. It can then just update its internal record of where the moved object lives, without having to update anything else. This is basically what happens when Foo::new() returns a Foo object and you then pass it to Box::new() to put it on the heap.

Self-referent types are foreign to the Rust borrow rules. A value cannot hold a reference to itself (the lifetime cannot be defined), so these are usually raw pointers in disguise. And Rust will happily memcpy raw pointers without any consideration, as it is the user's responsibility to maintain raw pointers correctly (which is why they are unsafe to dereference).
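To make the memcpy point concrete, here is a minimal sketch. The `ListHead` type is a hypothetical stand-in for the kernel's list_head, not the actual kernel binding, and the function names are made up for illustration. It sets up the self-links on a stack value, then moves that value into a Box and checks whether the links still point at the node.

```rust
// Hypothetical stand-in for list_head: the links are raw pointers,
// so the borrow checker does not track them.
struct ListHead {
    next: *const ListHead,
    prev: *const ListHead,
}

impl ListHead {
    fn new() -> ListHead {
        ListHead {
            next: std::ptr::null(),
            prev: std::ptr::null(),
        }
    }

    // Mark the list as empty by pointing both links at the node itself.
    fn init(&mut self) {
        let me: *const ListHead = self;
        self.next = me;
        self.prev = me;
    }

    // True only if both links hold this node's current address.
    // The pointers are compared, never dereferenced.
    fn is_self_linked(&self) -> bool {
        std::ptr::eq(self.next, self) && std::ptr::eq(self.prev, self)
    }
}

// Returns (links valid before the move, links valid after the move).
fn move_breaks_self_links() -> (bool, bool) {
    let mut head = ListHead::new();
    head.init();
    let before = head.is_self_linked();

    // Moving the owned value into a Box copies its bytes to the heap.
    // The raw pointers are copied verbatim and still hold the old
    // stack address.
    let boxed = Box::new(head);
    let after = boxed.is_self_linked();

    (before, after)
}
```

Calling `move_breaks_self_links()` should report that the links were valid before the move and stale afterwards. Nothing here needed `unsafe`, because the stale pointers are only compared; dereferencing them is where the trouble would start.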

So it is not about "A refers to B vs B refers to B" but "A has a reference to B vs B has a raw pointer to B".

And you cannot even move B if it is referred to by A. If a value is referenced, it is borrowed, and it cannot be moved for as long as the borrow lasts**.
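A small sketch of that rule, with made-up types `A` and `B`. The commented-out line is the move the compiler would reject (I believe the error is E0505, "cannot move out of `b` because it is borrowed"); once the last use of the borrow has passed, the move is accepted.

```rust
struct B {
    value: u32,
}

// A holds a real reference to B, which the borrow checker tracks.
struct A<'a> {
    b: &'a B,
}

// Takes ownership of its argument, i.e. the caller's value is moved in.
fn consume(b: B) -> u32 {
    b.value
}

fn borrow_then_move() -> u32 {
    let b = B { value: 7 };
    let a = A { b: &b };

    // let moved = b; // rejected if uncommented: `b` is still borrowed by `a`

    let seen = a.b.value; // last use of the borrow held by `a`

    // The borrow has ended, so `b` may be moved again.
    seen + consume(b)
}
```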

Unlike C++, Rust has been designed so that you cannot alter how an object move is implemented; a move is always a plain memcpy. This is why Pin<> is important: it effectively disallows object moves, which is necessary for self-referent types.
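As a rough illustration of how Pin<> helps, here is a sketch using only the standard library (`Box::pin`, `PhantomPinned`, `Pin::get_unchecked_mut`). The `ListHead` type and the function names are hypothetical, and this is not how the kernel's Rust bindings do it. The idea is to put the value at its final heap address first and only then set up the self-links.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct ListHead {
    next: *const ListHead,
    prev: *const ListHead,
    // Opts the type out of Unpin, so safe code cannot move it out of a Pin.
    _pin: PhantomPinned,
}

impl ListHead {
    // Allocate and pin first, then set up the self-links at the final address.
    fn new_pinned() -> Pin<Box<ListHead>> {
        let mut boxed = Box::pin(ListHead {
            next: std::ptr::null(),
            prev: std::ptr::null(),
            _pin: PhantomPinned,
        });
        let me: *const ListHead = &*boxed;
        // SAFETY: only fields are written; the value is never moved
        // out of its pinned heap location.
        unsafe {
            let inner = Pin::get_unchecked_mut(boxed.as_mut());
            inner.next = me;
            inner.prev = me;
        }
        boxed
    }

    // Compares the link addresses with the node's current address.
    fn is_self_linked(&self) -> bool {
        std::ptr::eq(self.next, self) && std::ptr::eq(self.prev, self)
    }
}

fn pinned_stays_valid() -> bool {
    let head = ListHead::new_pinned();
    // Moving the Pin<Box<_>> handle moves only the pointer to the heap
    // allocation; the ListHead itself stays where it is.
    let moved_handle = head;
    moved_handle.is_self_linked()
}
```

Because `ListHead` is not `Unpin`, safe code cannot get a `&mut ListHead` out of the pin, so things like std::mem::swap() cannot be applied to it without `unsafe`.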

The thing about "compiler moving objects" is obviously a simplification. The object must be moved, i.e. placed in a specific memory location, and the compiler can use various strategies to make that happen. It can perform memcpy at runtime. Or it can make it so that the original object is created in the move destination already (often called "move elision").

* There is std::mem::swap() (and its friends) that can move things that are not owned but just referenced by an exclusive reference &mut T. However, we cannot really use these with self-referent types anyway.
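A sketch of that footnote, with a made-up `Node` type: two values that each store their own address are swapped through exclusive references. Neither is moved in the ownership sense, yet both end up holding a pointer to the other one's location.

```rust
struct Node {
    id: u32,
    me: *const Node, // raw self-pointer, not tracked by the borrow checker
}

// Returns (payloads were swapped, self-pointers now point at the other node).
fn swap_via_exclusive_refs() -> (bool, bool) {
    let mut a = Node { id: 1, me: std::ptr::null() };
    let mut b = Node { id: 2, me: std::ptr::null() };

    // Record each node's own address in its `me` field.
    let pa: *const Node = &a;
    let pb: *const Node = &b;
    a.me = pa;
    b.me = pb;

    // swap() exchanges the bytes of the two values through &mut references.
    std::mem::swap(&mut a, &mut b);

    let ids_swapped = a.id == 2 && b.id == 1;
    // `a` now carries the address that was recorded for `b`, and vice versa.
    let pointers_crossed = std::ptr::eq(a.me, &b) && std::ptr::eq(b.me, &a);

    (ids_swapped, pointers_crossed)
}
```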

** Again, exclusive references could make that happen. But if B has an exclusive reference to A, it is the only object that knows A's address, so only B has to be updated when A is moved. Exclusive references are not how you usually model a linked list as then you cannot even iterate over a list without consuming it.

The perils of pinning

Posted Sep 16, 2022 10:55 UTC (Fri) by rsidd (subscriber, #2582)

Thanks for the detailed explanation.

The perils of pinning

Posted Sep 16, 2022 14:34 UTC (Fri) by kkdwivedi (subscriber, #130744)

So if I'm understanding this correctly, I think what it really needs is something similar to C++'s placement new? So that it can begin the lifetime of a pinned self referent object by reusing another object's storage (or MaybeUninit storage), without having to move objects around?
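For what it's worth, something in that spirit can already be written by hand with MaybeUninit and unsafe code. The following is only a sketch of the idea, with a hypothetical `ListHead` type and `new_in_place` function, not a general placement-new mechanism: the storage is reserved first, and the fields are then written directly at their final address.

```rust
use std::marker::PhantomPinned;
use std::mem::MaybeUninit;
use std::pin::Pin;
use std::ptr::addr_of_mut;

struct ListHead {
    next: *const ListHead,
    prev: *const ListHead,
    _pin: PhantomPinned, // zero-sized marker; opts out of Unpin
}

fn new_in_place() -> Pin<Box<ListHead>> {
    // Reserve heap storage; no ListHead has been constructed yet.
    let slot: Box<MaybeUninit<ListHead>> = Box::new(MaybeUninit::uninit());
    let raw: *mut ListHead = Box::into_raw(slot).cast();
    let me = raw as *const ListHead;

    // SAFETY: `raw` points to a live allocation with the size and alignment
    // of ListHead. Both pointer fields are written before the storage is
    // treated as initialized (the marker field is zero-sized and needs no
    // write), and the value is pinned without ever being moved.
    unsafe {
        addr_of_mut!((*raw).next).write(me);
        addr_of_mut!((*raw).prev).write(me);
        Pin::new_unchecked(Box::from_raw(raw))
    }
}
```

The obvious downside is that every such type needs its own hand-written unsafe initializer, which is the part a language-level feature would be expected to take care of.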

The perils of pinning

Posted Sep 16, 2022 15:03 UTC (Fri) by mathstuf (subscriber, #69389)

I think that would work except that there's no "constructor" mechanism in Rust. You can have methods that act like constructors by not taking `self` in any way and returning a `Self` (it might be `Option<>` or `Result<>` wrapped), but there's no trait tracking such APIs to be able to write a placement new.
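A quick sketch of what such constructor-like methods look like, using a made-up `Port` type. All three are ordinary associated functions; the names `new`, `non_zero` and `parse` are conventions or inventions, and nothing in the language marks them as constructors.

```rust
use std::num::ParseIntError;

struct Port(u16);

impl Port {
    // Infallible "constructor": no `self` parameter, returns `Self`.
    fn new(n: u16) -> Self {
        Port(n)
    }

    // Constructor-like function that may decline, wrapped in `Option<>`.
    fn non_zero(n: u16) -> Option<Self> {
        if n == 0 {
            None
        } else {
            Some(Port(n))
        }
    }

    // Constructor-like function that reports an error, wrapped in `Result<>`.
    fn parse(s: &str) -> Result<Self, ParseIntError> {
        s.parse::<u16>().map(Port)
    }
}
```

Since these share no trait and no fixed signature, generic code has nothing to hook into in order to say "construct a T in this storage".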

The perils of pinning

Posted Sep 16, 2022 22:34 UTC (Fri) by ssokolow (guest, #94568)

I'm not sure about the details in C++, so I don't know how they differ, but there's been interest and discussion around designing something placement new-like for Rust for a long time.

(The earliest RFC I could find in a quick search was #1228 from two months after Rust 1.0 in 2015, and it's clearly a continuation of plans for placement new already in progress.)

My impression is that it just keeps getting bumped down the priority queue in favour of higher-demand/ROI things like support for C-style untagged unions in the FFI side of things (yes, they weren't present in stable-channel Rust until version 1.19), async/await, GATs, and so on.

(caniuse.rs is a good way to quickly get an overview of what got stabilized when, and The Unstable Book is a good overview of the compiler features that are currently implemented, for some definition of "implemented", but haven't been committed to the stable channel and the language stability promise.)