The perils of pinning
Posted Sep 18, 2022 18:37 UTC (Sun) by Wol (subscriber, #4433)
In reply to: The perils of pinning by foom
Parent article: The perils of pinning
The point is NO formal language will let you do most of the stuff Linux does. (Of course assembler will let you, because assembler has no formal model.)
Linux is an operating system. It is MEANT to talk to other devices, with other controllers, that do their own thing. NO formal language can cope with other systems doing things behind its back.
So basically, the less "unsafe" code there is the better, because safe code means the compiler has proven that the code will work as intended (barring programmer "sillies"). But at the end of the day, there will still need to be a lot of "unsafe" code, because ...
If your code is reading from the receive buffer of a network card, you really do not want a language that assumes memory only changes when you write to it! What was that about GCC assuming reading from uninitialised memory is UB and can be optimised away? Bang goes your networking!
Cheers,
Wol
Posted Sep 19, 2022 9:02 UTC (Mon)
by tialaramex (subscriber, #21167)
Rust actually reflects what your machine code can do for that network card receive buffer. You can say look, just perform actual fetches for all the "memory".
let recvd = unsafe { std::ptr::read_volatile::<NetworkBuffer>(recv_buffer) }; /* Rust will bit-wise copy the values in the buffer. */
This function is unsafe because it fetches some arbitrary memory, so clearly you can blow up the world (e.g. an unaligned read on an architecture where those are forbidden, or a pointer that is simply out of bounds). But it's very well defined: if recv_buffer actually points at a NetworkBuffer-sized blob of suitably aligned and addressable "memory" we can load values from, it will emit actual loads for those values and not try to cache them or assume it knows their value or whatever.
That's a contrast from C which has us actually pretend recv_buffer is just pointing at an actual NetworkBuffer with a "volatile" qualifier and so then we can go around doing operations to it, even though in practice that's a bad idea and the only thing we ought to do is copy it somewhere. On a good day that's all the C (or worse C++) does with a volatile, on a bad day you need to guess whether the programmer knew what's really going on or whether the code you're reading is full of unintentional races.
There was an effort to get C++ to move towards intrinsics for fetch/store like Rust rather than C's volatile qualifier hack. But this got some very angry pushback, and I anticipate there will not be any further attempts.
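To make the read_volatile call above a little more concrete, here is a minimal Rust sketch. The NetworkBuffer layout and the snapshot helper are hypothetical, invented for this illustration, and in a test an ordinary local variable would have to stand in for the device's receive buffer. The helper checks the two preconditions that are cheap to check (non-null, aligned) and leaves the rest to the caller, which is why it remains unsafe.

```rust
use std::mem;
use std::ptr;

/// Hypothetical receive-buffer layout, invented for this example.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct NetworkBuffer {
    pub len: u16,
    pub flags: u16,
    pub payload: [u8; 8],
}

/// Copy the whole buffer out of shared memory with a volatile read.
///
/// Returns `None` if the pointer is null or misaligned.
///
/// # Safety
/// `recv_buffer` must point at readable memory that is at least
/// `size_of::<NetworkBuffer>()` bytes long.
pub unsafe fn snapshot(recv_buffer: *const NetworkBuffer) -> Option<NetworkBuffer> {
    let misaligned = (recv_buffer as usize) % mem::align_of::<NetworkBuffer>() != 0;
    if recv_buffer.is_null() || misaligned {
        return None;
    }
    // The compiler must emit real loads here; it may not reuse an
    // earlier value or assume it knows what the memory holds.
    Some(unsafe { ptr::read_volatile(recv_buffer) })
}
```

After the call, the returned value is a private copy, so ordinary field accesses on it are no longer volatile. Note that for a struct this size the read may be split into several loads, so a sketch like this says nothing about atomicity.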
Posted Sep 19, 2022 18:16 UTC (Mon)
by kreijack (guest, #43513)
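> That's a contrast from C which has us actually pretend recv_buffer is just pointing
> at an actual NetworkBuffer with a "volatile" qualifier and so then we can go around
> doing operations to it, even though in practice that's a bad idea and the only thing
> we ought to do is copy it somewhere. On a good day that's all the C (or worse C++)
> does with a volatile, on a bad day you need to guess whether the programmer knew
> what's really going on or whether the code you're reading is full of unintentional races.
I'm not sure I understood your sentence, but my understanding is that you can write a
read_volatile(recv_buffer)
function in C that copies the data to a "volatile" buffer. IIRC, when a pointer is passed to a function, the compiler stops making any assumptions about the pointed-to data.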
Posted Sep 20, 2022 8:40 UTC (Tue)
by farnz (subscriber, #17727)
I'm going to stick to C syntax throughout, since I think that's easier for kernel programmers to follow than Rust.
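In C, you might have code like:
    struct NetworkBuffer {
        struct IpHeader ip_hdr;
        union {
            struct TcpSegment tcp;
            struct UdpSegment udp;
        };
    };
    volatile struct NetworkBuffer *buf;
It's then tempting (but often wrong) to write code that does things like buf->ip_hdr.src_addr, which isn't actually a good idea because of the volatile reads that will be done.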
Rust doesn't have a volatile qualifier on storage that changes all codegen accessing that storage. Instead, it has the equivalent of void * memcpy_volatile(void * dest, volatile void * src, size_t count); (but using generics from Rust's type system to replace void * and size_t count). Because your only dependable operation on the buffer is to copy out of the shared space that can change underneath you without notice (imagine that, for example, NetworkBuffer actually lives in memory on the far side of the PCIe bus, on the NIC itself), that's what you'll do.
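A rough Rust sketch of that copy-out pattern might look like the following. Both copy_out_volatile and src_addr are hypothetical helpers written for this illustration, not standard library functions; the only real API used is std::ptr::read_volatile. The generic parameter T plays the role that void * and the byte count play in the C-style signature above, and the header offset assumes a plain IPv4 header.

```rust
use std::ptr;

/// Copy `count` elements out of memory that may change underneath us.
/// Each element is fetched with its own volatile load, roughly what the
/// hypothetical `memcpy_volatile` described above would do.
///
/// # Safety
/// `src` must be non-null, aligned for `T`, and valid for reads of
/// `count` consecutive elements.
pub unsafe fn copy_out_volatile<T: Copy>(src: *const T, count: usize) -> Vec<T> {
    let mut private = Vec::with_capacity(count);
    for i in 0..count {
        // SAFETY: the caller guarantees `src.add(i)` is in bounds.
        private.push(unsafe { ptr::read_volatile(src.add(i)) });
    }
    private
}

/// Read the IPv4 source address from a private copy of a header.
/// Bytes 12..16 are where an IPv4 header keeps the source address.
pub fn src_addr(header: &[u8]) -> Option<[u8; 4]> {
    let s = header.get(12..16)?;
    Some([s[0], s[1], s[2], s[3]])
}
```

The point of the split is that all parsing happens on the private copy: src_addr takes an ordinary slice, so nothing it does can race with the device rewriting the shared buffer.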
Posted Sep 19, 2022 18:11 UTC (Mon)
by kreijack (guest, #43513)
> (Of course assembler will let you, because assembler has no formal model.)
I think that C++ does...
Posted Sep 21, 2022 11:24 UTC (Wed)
by tialaramex (subscriber, #21167)
Sometimes the abstract machine's differences from a real machine just have performance consequences. For example, most doubly linked list operations look really clever in the abstract machine and have reasonable performance there, but they perform poorly on an actual computer you can buy today because of caches.
But often there are simply practical differences: the abstract model entirely lacks something the real machine has. A higher level C++ application needn't care, but the Linux kernel does. For some things Linux relies on inline assembler; the same thing works in Rust, and it is the same trick kernels written in C++ use. In other places Linux is relying on semantics which are not offered by the formal language and which likewise are not offered in ISO C++ but do happen to work in the chosen compiler.
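As a toy illustration of the inline assembler point, the sketch below uses Rust's asm! macro to issue one explicit x86-64 instruction. The function name add_with_asm is made up for this example, and a kernel would use the mechanism for things the language cannot express (barriers, special registers) rather than for arithmetic; the fallback exists only so the sketch builds on other architectures.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Add two integers using an explicit machine instruction on x86-64.
#[cfg(target_arch = "x86_64")]
pub fn add_with_asm(a: u64, b: u64) -> u64 {
    let mut sum = a;
    // SAFETY: the instruction only touches the two named registers
    // (and the flags, which are treated as clobbered by default).
    unsafe {
        asm!("add {0}, {1}", inout(reg) sum, in(reg) b, options(nomem, nostack));
    }
    sum
}

/// Portable fallback so the sketch still builds elsewhere.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_with_asm(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

The operand annotations (inout, in) are how the programmer tells the compiler what the assembler touches, since the compiler does not inspect the instruction text itself.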