read/write volatile

Posted Jan 9, 2026 17:19 UTC (Fri) by mb (subscriber, #50428)
Parent article: READ_ONCE(), WRITE_ONCE(), but not for Rust

> in end they come down to calls to the Rust read_volatile() and write_volatile() functions.

I experimented with read/write volatile on AVR, where it would be greatly beneficial to use them for certain inter-thread (interrupt) communication.
This is the same way that every AVR C program uses volatile "atomic" variables.
https://github.com/mbuesch/avr-atomic

However, I came to the conclusion that using read/write volatile for inter-thread (interrupt) communication is unsound in Rust.

AVR hardware is not capable of doing a non-atomic read/write for byte-sized objects. So the hardware is fine for all of the relevant cases.
But the read/write volatile documentation was pretty clear to me that the Rust abstract machine treats concurrent volatile accesses as unsound, so the compiler is free to optimize them to bits.

https://doc.rust-lang.org/std/ptr/fn.read_volatile.html

> an operation is volatile has no bearing whatsoever on questions involving concurrent accesses from multiple threads
> Volatile accesses behave exactly like non-atomic accesses in that regard.

> This access is still not considered atomic, and as such it cannot be used for inter-thread synchronization.

My implementation worked fine and the generated assembly code was perfect.
However, I changed it to a less optimal inline asm implementation just because I think the Rust documentation considers concurrent read/write volatile without additional synchronization to be unsound.
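To make the distinction concrete, here is a minimal sketch (not the avr-atomic implementation) that puts the two options side by side. The names `EVENT_COUNT`, `isr_record_event`, `main_read_events` and `volatile_roundtrip` are hypothetical, and the code uses `std` so it runs on a host machine; on AVR it would be `core`. The relaxed atomic is what the documentation treats as sound for concurrent access, efficiency aside. The volatile pair is only sound here because nothing else touches the location.

```rust
use std::ptr;
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical byte shared between main code and an interrupt handler.
static EVENT_COUNT: AtomicU8 = AtomicU8::new(0);

/// What a hypothetical interrupt handler would call.
pub fn isr_record_event() {
    // Relaxed load and store: no ordering with other memory locations is
    // requested, only that each access to this byte is atomic.
    let n = EVENT_COUNT.load(Ordering::Relaxed);
    EVENT_COUNT.store(n.wrapping_add(1), Ordering::Relaxed);
}

/// What the main loop would call to observe the handler's progress.
pub fn main_read_events() -> u8 {
    EVENT_COUNT.load(Ordering::Relaxed)
}

/// The volatile variant, for comparison only.
pub fn volatile_roundtrip(value: u8) -> u8 {
    let mut slot = 0u8;
    let p: *mut u8 = &mut slot;
    // SAFETY: `p` points to a live, aligned local that no other thread or
    // handler can reach. With a concurrent writer this would be a data race
    // according to the read_volatile documentation quoted above.
    unsafe {
        ptr::write_volatile(p, value);
        ptr::read_volatile(p)
    }
}
```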

(Yes, I know there is AtomicXX and no it's not efficient for reasons... And yes, I should fix the compiler's atomic intrinsics instead of working around the problem... I know :)



read/write volatile

Posted Jan 9, 2026 19:20 UTC (Fri) by josh (subscriber, #17465) [Link] (8 responses)

> Yes, I know there is AtomicXX and no it's not efficient for reasons

Relaxed atomics are effectively compiler-barrier atomics, and *shouldn't* have any runtime overhead. Are you encountering cases where there's more inefficiency than that?

read/write volatile

Posted Jan 9, 2026 19:54 UTC (Fri) by mb (subscriber, #50428) [Link] (1 responses)

Yes. Should not.
As I said, it's an LLVM problem on AVR. Which is not a stable tier.
Atomics always use heavy syncing on AVR in the current compiler.

But that was not my point.

My point was that I think volatile accesses are not sound in Rust for inter-thread communication.

read/write volatile

Posted Jan 9, 2026 22:10 UTC (Fri) by josh (subscriber, #17465) [Link]

> As I said, it's an LLVM problem on AVR. Which is not a stable tier.
> Atomics always use heavy syncing on AVR in the current compiler.

Ah, got it, thank you. Hopefully that can be fixed.

> My point was that I think volatile accesses are not sound in Rust for inter-thread communication.

Right, I believe that's correct.

read/write volatile

Posted Jan 10, 2026 9:33 UTC (Sat) by plugwash (subscriber, #29694) [Link] (5 responses)

> Relaxed atomics are effectively compiler-barrier atomics, and *shouldn't* have any runtime overhead.

The situation is a little more subtle than that.

Relaxed atomics on a given memory location must behave as if they had a well-defined order (though this order may differ from operations on other memory locations, unless fences are used), and this must apply to the whole set of atomic operations. You may only be using load and store on a particular location, but the compiler doesn't know that. Other code might be performing other atomic operations on that location.

My understanding is that this effectively means that if you implement read-modify-write operations by using a global lock, you must also implement plain write operations using that same global lock.
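A minimal sketch of that argument, assuming a hypothetical lock-based emulation: `GLOBAL_LOCK` and `EmulatedAtomicU32` are invented for illustration and are not how any real toolchain implements atomics. The point is in `store`: if it skipped the lock, a store could land between the read and the write inside `fetch_add` and be silently overwritten.

```rust
use std::cell::UnsafeCell;
use std::sync::Mutex;

// Hypothetical single lock shared by every emulated atomic.
static GLOBAL_LOCK: Mutex<()> = Mutex::new(());

pub struct EmulatedAtomicU32 {
    value: UnsafeCell<u32>,
}

// SAFETY: every access to `value` happens while GLOBAL_LOCK is held.
unsafe impl Sync for EmulatedAtomicU32 {}

impl EmulatedAtomicU32 {
    pub const fn new(v: u32) -> Self {
        Self { value: UnsafeCell::new(v) }
    }

    /// Read-modify-write, made indivisible by the global lock.
    pub fn fetch_add(&self, n: u32) -> u32 {
        let _guard = GLOBAL_LOCK.lock().unwrap();
        // SAFETY: the lock is held for the whole read-modify-write.
        unsafe {
            let old = *self.value.get();
            *self.value.get() = old.wrapping_add(n);
            old
        }
    }

    /// A plain store must take the same lock, or it could slip in between
    /// the read and the write of a concurrent `fetch_add`.
    pub fn store(&self, v: u32) {
        let _guard = GLOBAL_LOCK.lock().unwrap();
        // SAFETY: the lock is held.
        unsafe { *self.value.get() = v }
    }

    /// In this Rust-level sketch the load takes the lock as well, so that it
    /// never races with a write to the `UnsafeCell`.
    pub fn load(&self) -> u32 {
        let _guard = GLOBAL_LOCK.lock().unwrap();
        // SAFETY: the lock is held.
        unsafe { *self.value.get() }
    }
}
```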

read/write volatile

Posted Jan 10, 2026 20:13 UTC (Sat) by comex (subscriber, #71521) [Link] (1 responses)

That’s correct, but not actually applicable to Rust, because Rust doesn’t allow atomics to be implemented with a global lock, unlike C++ and C. Instead, on targets that don’t natively support atomics, Rust just makes the atomic APIs unavailable. It’s one of those tradeoffs where Rust is willing to accept slightly less portability in exchange for a nicer programming model.
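As a rough illustration of what that looks like from the programmer's side, code that wants a 64-bit atomic can gate on the `target_has_atomic` cfg and choose its own fallback. The `next_id` function and its mutex-based fallback are hypothetical. The fallback is an explicit choice in user code, not something the atomic API does behind the scenes.

```rust
// Used only where the target natively supports 64-bit atomics.
#[cfg(target_has_atomic = "64")]
pub fn next_id() -> u64 {
    use std::sync::atomic::{AtomicU64, Ordering};
    static NEXT: AtomicU64 = AtomicU64::new(0);
    NEXT.fetch_add(1, Ordering::Relaxed)
}

// On other targets AtomicU64 does not exist, so the code has to pick a
// different strategy explicitly. A mutex is one hypothetical choice.
#[cfg(not(target_has_atomic = "64"))]
pub fn next_id() -> u64 {
    use std::sync::Mutex;
    static NEXT: Mutex<u64> = Mutex::new(0);
    let mut guard = NEXT.lock().unwrap();
    let id = *guard;
    *guard += 1;
    id
}
```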

read/write volatile

Posted Jan 11, 2026 17:49 UTC (Sun) by garyguo (subscriber, #173367) [Link]

Note that the kernel unconditionally provides 64-bit atomic operations to all archs, and on archs that don't support native atomics on 64-bit integers, they are implemented using locks. This means that using `READ_ONCE()` on u64 for atomic ops is incorrect (it needs to be an `atomic64_read()`).

`READ_ONCE()` on u64 might still be useful if you just want to read the value in a data-race-free way and you don't care about atomicity (i.e. you allow the read to tear). However, this is yet another reason I don't want people to just use `READ_ONCE()` for atomic ops on the Rust side: it's just not intuitive which semantics are desired.
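To show what a torn read looks like, here is a small model in Rust. It is not kernel code: `SplitU64` and `torn_read_demo` are hypothetical, and the model stands in for a machine that can only load and store 32 bits at a time. Each half is accessed atomically, so there is no data race, but the 64-bit value as a whole is not read atomically.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical model of a u64 on a machine with only 32-bit accesses.
pub struct SplitU64 {
    lo: AtomicU32,
    hi: AtomicU32,
}

impl SplitU64 {
    pub const fn new(v: u64) -> Self {
        Self {
            lo: AtomicU32::new(v as u32),
            hi: AtomicU32::new((v >> 32) as u32),
        }
    }

    /// First half of a 64-bit write: stores the low 32 bits of `v`.
    pub fn write_lo(&self, v: u64) {
        self.lo.store(v as u32, Ordering::Relaxed);
    }

    /// Second half of a 64-bit write: stores the high 32 bits of `v`.
    pub fn write_hi(&self, v: u64) {
        self.hi.store((v >> 32) as u32, Ordering::Relaxed);
    }

    /// Race-free but tearable read: two independent 32-bit loads.
    pub fn read(&self) -> u64 {
        let lo = self.lo.load(Ordering::Relaxed) as u64;
        let hi = self.hi.load(Ordering::Relaxed) as u64;
        (hi << 32) | lo
    }
}

/// Replays one interleaving by hand: the reader runs between the two
/// halves of an update from 0x0000_0000_FFFF_FFFF to 0x0000_0001_0000_0000.
pub fn torn_read_demo() -> u64 {
    let old = 0x0000_0000_FFFF_FFFFu64;
    let new = 0x0000_0001_0000_0000u64;
    let cell = SplitU64::new(old);
    cell.write_lo(new); // writer stores the new low half
    let seen = cell.read(); // reader sees new low half with old high half
    cell.write_hi(new); // writer finishes the update
    seen
}
```

In this interleaving the reader gets 0, which is neither the old value nor the new one.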

read/write volatile

Posted Jan 10, 2026 20:44 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (2 responses)

> Relaxed atomics on a given memory location must behave as if they had a well-defined order (though this order may differ from operations on other memory locations, unless fences are used), and this must apply to the whole set of atomic operations. You may only be using load and store on a particular location, but the compiler doesn't know that. Other code might be performing other atomic operations on that location.

That is true but misleading. Your parenthetical negates all of the guarantees that are actually expensive to implement, at least on x86.

Noting for the record: A fence must be on the same thread as the relaxed atomic in order to restrict it, and there are several other requirements as well. I refer the curious reader to https://en.cppreference.com/w/cpp/atomic/atomic_thread_fe... and related documentation for more information.
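For illustration, here is a sketch of that pattern using Rust's `std::sync::atomic::fence`, which follows the same model. The function name `publish_and_consume` and the variables are hypothetical. Each fence sits on the same thread as the relaxed access it is meant to order: the release fence before the relaxed store of the flag, the acquire fence after the relaxed load that observes it.

```rust
use std::sync::atomic::{fence, AtomicBool, AtomicU32, Ordering};
use std::thread;

/// Passes a value from a producer thread to a consumer thread using only
/// relaxed accesses plus one fence on each side.
pub fn publish_and_consume() -> u32 {
    let data = AtomicU32::new(0);
    let ready = AtomicBool::new(false);

    thread::scope(|s| {
        // Producer: payload, then release fence, then flag.
        s.spawn(|| {
            data.store(42, Ordering::Relaxed);
            fence(Ordering::Release);
            ready.store(true, Ordering::Relaxed);
        });

        // Consumer: flag, then acquire fence, then payload.
        let consumer = s.spawn(|| {
            while !ready.load(Ordering::Relaxed) {
                std::hint::spin_loop();
            }
            fence(Ordering::Acquire);
            data.load(Ordering::Relaxed)
        });

        consumer.join().unwrap()
    })
}
```

Without the two fences, the relaxed accesses to `data` and `ready` would carry no ordering relative to each other, and the consumer could legally observe the flag without the payload.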

> My understanding is that this effectively means that if you implement read-modify-write operations by using a global lock, you must also implement plain write operations using that same global lock.

If you take a global lock, then there are two different memory locations in play (lock and payload), so your parenthetical above already tells us that the lock is ineffective (at protecting against un-fenced relaxed atomics on either the lock or the payload).

Or perhaps I have misunderstood what you mean by "implement read-modify-write operations by using a global lock." I would normally understand a "read-modify-write operation" to be a hardware instruction (or sequence of instructions), which is not our problem to "implement" in the first place. If you mean "emulate," then the problem we run into is that emulators do not emulate the C abstract machine. They emulate some real hardware like x86, or virtual hardware like the JVM. Those platforms have their own, more specific memory models than the C abstract machine, and the compiler backend must necessarily take advantage of those memory models to emit correct assembly/machine code. So our emulator is not permitted to stop at just taking locks for relaxed atomics - it doesn't necessarily know which stores or loads originated as relaxed atomics in the first place, and therefore may have to take locks for all loads and stores whatsoever. Of course, it would be preferable to implement these operations lock-free if it is possible to do so.

read/write volatile

Posted Jan 10, 2026 21:12 UTC (Sat) by willmo (subscriber, #82093) [Link] (1 responses)

I think it’s a third meaning of “implement”, as implied by comex’s adjacent comment: when the target hardware doesn’t natively support the desired C/C++ atomic operation (e.g. it lacks even an appropriate CAS to implement a RMW), the compiler must compile it to use a global lock. This is certainly not applicable to common data types on modern x86, ARM, etc.

> If you take a global lock, then there are two different memory locations in play (lock and payload), so your parenthetical above already tells us that the lock is ineffective (at protecting against un-fenced relaxed atomics on either the lock or the payload).

In this case, the compiler would need to compile un-fenced relaxed atomics so that they take the global lock. That’s what plugwash meant.

read/write volatile

Posted Jan 12, 2026 16:36 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

Well, sure, if the platform leaves you up the river with no atomics, then there's only so much you can do. You're effectively in the business of emulating a (virtual) platform with atomics (the C abstract machine) on a (physical) platform that doesn't provide them. And as I said, emulation is slow.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds