
overly strict semantics


Posted Jan 9, 2026 17:42 UTC (Fri) by bertschingert (subscriber, #160729)
Parent article: READ_ONCE(), WRITE_ONCE(), but not for Rust

Does anyone have a sense of how frequently READ/WRITE_ONCE() are used as a needlessly strict substitute for relaxed atomic reads/writes, versus situations where the additional strictness is actually required?

I wasn't around when they were implemented, so I'm speculating here, but I get the sense that READ/WRITE_ONCE() were implemented as a volatile cast not because volatility gives the optimal or desired semantics in most situations, but because that was the best tool available prior to the C11 atomics model.
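For readers who have not seen the idiom, here is a minimal sketch of what "implemented as a volatile cast" means. The `MY_READ_ONCE()` and `MY_WRITE_ONCE()` names are hypothetical stand-ins, not the kernel's definitions; the real macros in rwonce.h do more than this, and `__typeof__` is a GCC/Clang extension.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the kernel macros. The access is
 * made through a volatile-qualified pointer, so the compiler must perform
 * it as written instead of caching, merging, or dropping it. */
#define MY_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define MY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static int shared_flag; /* a plain int; only the accesses are volatile */

/* Store through the volatile cast. */
static void set_flag(int v)
{
    MY_WRITE_ONCE(shared_flag, v);
}

/* Load through the volatile cast. */
static int get_flag(void)
{
    return MY_READ_ONCE(shared_flag);
}
```

Note that the variable itself stays an ordinary `int`; the volatility is a property of each access, which is the design point later comments in this thread come back to.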

While it may be prohibitively difficult, it does seem like changing the C side to use relaxed atomics (when correct) would be the right thing to do. But I don't really know how many uses actually require the additional "volatility" guarantee provided by READ/WRITE_ONCE().



overly strict semantics

Posted Jan 10, 2026 1:39 UTC (Sat) by wahern (subscriber, #37304) [Link] (8 responses)

IIUC the issue is that at least on some architectures atomic interfaces provided by both the compiler and ISA are unnecessarily costly in some situations, at least in the estimation of those who wrote and use rwonce.h. The comments in rwonce.h suggest that they're not guaranteeing cross-CPU atomicity, and relying on real-world behavior wrt asynchronous operations (e.g. across interrupts) on the same CPU that no compiler or memory model provides explicit support for today.

Also, C11 atomics is not the origin point for atomic intrinsics[1] or a meaningful memory model in either GCC or Linux. It's not the final or even 100% comprehensive model, either. I think the push for a more formal memory model in C, C++, and the compilers gives a false impression such a thing was completely non-existent beforehand and that things are satisfactory today.

[1] GCC had at least two sets of intrinsics before supporting C11 atomics, and of course projects like the kernel had their own set that work just as well today as they did before the latest set of builtins.

overly strict semantics

Posted Jan 10, 2026 8:54 UTC (Sat) by koverstreet (✭ supporter ✭, #4296) [Link] (2 responses)

Hang on, at the ISA level there is no notion of an "atomic" load or store, there's just loads and stores. Atomic - like the lock prefix on x86 - only makes sense for operations that are doing multiple things within the same instruction: load, add, store - an atomic increment.

The "atomicity" guarantees that READ_ONCE() and WRITE_ONCE() provide only come in at the compiler level. Without some notion of atomicity at the language level, the compiler is free to coalesce loads and stores, or to emit multiple loads as a substitute for spilling registers.
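A small sketch of the compiler-level problem described above, assuming a flag that something else (an interrupt handler or another thread) sets. `MY_READ_ONCE()`, `done`, and `wait_done()` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simplified stand-in for READ_ONCE(), GCC/Clang only. */
#define MY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

static int done; /* imagined to be set asynchronously by someone else */

/* With a plain `if (done)` in the loop, the compiler is allowed to load
 * `done` once, keep it in a register, and test the stale copy on every
 * iteration. The volatile access requires a fresh load each time round. */
static bool wait_done(unsigned long max_spins)
{
    while (max_spins--) {
        if (MY_READ_ONCE(done))
            return true;
    }
    return false;
}
```

The spin bound exists only so the sketch terminates when nothing ever sets the flag.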

The "unnecessarily costly" part of READ_ONCE() and WRITE_ONCE() is that they don't distinguish between atomicity and ordering - they also specify strict ordering, but only to the compiler, not the hardware (they don't emit memory barriers).

Rust's atomic load/store really are just better, because they separate out ordering from atomicity and make ordering explicit. And instead of sprinkling around separate memory barrier calls, which may or may not be commented, they're attached to the operation that needs them - which is good for readability.
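C11 atomics follow the same pattern of naming the ordering on the operation itself, so the point can be sketched in C as well. This is an illustration under assumed names (`payload`, `ready`, `publish()`, `try_consume()`), not code from the kernel or from the article.

```c
#include <assert.h>
#include <stdatomic.h>

static int payload;      /* ordinary data protected by the flag */
static atomic_int ready; /* set to 1 once payload is valid */

/* Writer: the release ordering is attached to the store that needs it,
 * rather than to a separate barrier call placed somewhere before it. */
static void publish(int v)
{
    payload = v;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above, so a
 * reader that sees ready == 1 also sees the payload written before it. */
static int try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return 0;
    *out = payload;
    return 1;
}
```

A flag that needs atomicity but no ordering would use `memory_order_relaxed` in the same two calls, which is the separation the comment is praising.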

overly strict semantics

Posted Jan 10, 2026 17:45 UTC (Sat) by excors (subscriber, #95769) [Link]

> at the ISA level there is no notion of an "atomic" load or store, there's just loads and stores.

I'm not certain what you mean by that. E.g. the ARM ARM defines "single-copy atomicity" which is important even in single-processor code: if an interrupt occurs during a STP (Store Pair) instruction, whose operation is defined as a single assignment to memory, the interrupt handler may observe the first half of memory was updated and the second half wasn't, because STP is treated as two separate atomic writes. (The STP instruction will be restarted after the interrupt returns, so it'll complete eventually). So I think the ISA does define the notion of atomic loads and stores, even before getting to the more complex operations.

(GCC will happily use STP for an int64_t assignment, making it non-atomic, unless you add 'volatile' and then it'll use a single 64-bit STR (which is single-copy atomic).)

overly strict semantics

Posted Jan 10, 2026 20:18 UTC (Sat) by comex (subscriber, #71521) [Link]

Strictly speaking, misaligned loads are not atomic on x86, and SIMD loads may or may not be.

overly strict semantics

Posted Jan 10, 2026 16:55 UTC (Sat) by joib (subscriber, #8541) [Link] (4 responses)

> Also, C11 atomics is not the origin point for atomic intrinsics[1] or a meaningful memory model in either GCC or Linux. It's not the final or even 100% comprehensive model, either. I think the push for a more formal memory model in C, C++, and the compilers gives a false impression such a thing was completely non-existent beforehand and that things are satisfactory today.

I wonder, if the C++/C11 memory models and atomics were to be developed today, how different would they look, considering the amount of knowledge the world has gained between then and now?

Certainly there were parts of the C/C++11 models that were, ahem, less than successful, like the consume memory ordering, but otherwise, would there be a place for doing it substantially better and different in general?

overly strict semantics

Posted Jan 11, 2026 20:33 UTC (Sun) by pbonzini (subscriber, #60935) [Link] (1 responses)

One thing I would remove is seq_cst stores and memory operations; in their place I would rather have operations that behave as if they were enclosed by seq_cst fences on both sides, like for example Linux's atomic_add_return. The difference is that a RMW seq_cst operation can be reordered after a subsequent relaxed load, but that's not the case for LKMM and atomic_add_return. So this would actually make semantics *stricter*, not looser.

I don't have high hopes that this would be accepted now, but maybe it would be since "almost nobody will need anything but sequentially consistent variables" has been shown to be wrong.
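As a rough C11 approximation of "enclosed by seq_cst fences on both sides", one could write something like the following. `add_return_fenced()` is a hypothetical helper; it is only a sketch of the shape being proposed and is not claimed to match the LKMM semantics of atomic_add_return exactly.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int refs;

/* Hypothetical helper: a relaxed read-modify-write bracketed by full
 * fences, instead of a single memory_order_seq_cst operation. */
static int add_return_fenced(atomic_int *p, int delta)
{
    atomic_thread_fence(memory_order_seq_cst);
    int old = atomic_fetch_add_explicit(p, delta, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return old + delta; /* the new value, as atomic_add_return would give */
}
```

The trailing fence is what keeps a later relaxed load from being reordered before the update, which is the case the comment singles out.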

The other thing that still hasn't been fully formalized is out-of-thin-air values. Everybody agrees that they won't happen but strictly speaking they aren't prohibited, or weren't last time I checked.

overly strict semantics

Posted Jan 12, 2026 6:06 UTC (Mon) by riking (subscriber, #95706) [Link]

The special snowflake global ordering of only seqcst operations (but not seqcst fences) has got to go for sure, and I agree that "operations fused with seqcst fences" would be better.

overly strict semantics

Posted Jan 15, 2026 0:18 UTC (Thu) by milesrout (subscriber, #126894) [Link] (1 responses)

The worst part of the design is _Atomic/std::atomic. Atomic operations are atomic *operations*: the operations are atomic. There is nothing inherently atomic or non-atomic about the operands themselves. The operator overloading is also a plain bad idea.

overly strict semantics

Posted Jan 15, 2026 15:40 UTC (Thu) by bertschingert (subscriber, #160729) [Link]

The GCC atomic intrinsics seem to get this right. I'm not sure if there's a portable way to do atomic operations on regular int types, though.
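For reference, a minimal sketch of the GCC `__atomic` builtins (also supported by Clang) applied to a plain `int`. The variable and function names are made up for the example, and this is compiler-specific rather than portable C.

```c
#include <assert.h>

static int hits; /* a regular int, not _Atomic */

/* Atomically increment and return the new value. The atomicity and the
 * ordering belong to this operation, not to the type of `hits`. */
static int bump(void)
{
    return __atomic_add_fetch(&hits, 1, __ATOMIC_RELAXED);
}

/* Atomic relaxed load of the same plain variable. */
static int peek(void)
{
    return __atomic_load_n(&hits, __ATOMIC_RELAXED);
}
```

Nothing stops other code from also writing `hits++` directly, which is the flip side of this approach.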

OTOH, what I like about the Rust (and C/C++11?) atomics is that the type system prevents accidentally introducing data races because you can't do a non-atomic load/store to an atomic type -- at least without unsafe code. Given that the article mentions there are cases in C where READ_ONCE() and WRITE_ONCE() should have been used, but weren't, this seems to be a real risk.

overly strict semantics

Posted Jan 10, 2026 15:42 UTC (Sat) by bjackman (subscriber, #109548) [Link] (1 responses)

I think one of the most common usecases for {READ,WRITE}_ONCE is where you have concurrency without parallelism. E.g. when sharing CPU-local data between a task and an IRQ.

IIUC C11's relaxed ordering is too weak for that, but any of the other C11 orderings are likely to be unnecessarily strict, i.e. they might force the use of special (costly) CPU instructions where normal reads and writes are already safe enough.

overly strict semantics

Posted Jan 11, 2026 17:57 UTC (Sun) by garyguo (subscriber, #173367) [Link]

I think in C11 ordering, a relaxed op is too weak, but a relaxed op + an atomic signal fence (which is usually just a compiler barrier) is sufficient. Alternatively, a volatile relaxed op should also be sufficient.
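A sketch of the "relaxed op + atomic signal fence" combination for data shared between a task and a handler running on the same CPU. All names are hypothetical, and the example assumes the handler interrupts the same thread that posted the data, which is the situation `atomic_signal_fence()` is specified for.

```c
#include <assert.h>
#include <stdatomic.h>

static int irq_data;           /* ordinary data handed to the handler */
static atomic_int irq_pending; /* flag telling the handler data is ready */

/* Task side: the signal fence keeps the compiler from moving the data
 * write below the flag store; it is typically a compiler barrier only and
 * emits no hardware barrier instruction. */
static void task_post(int v)
{
    irq_data = v;
    atomic_signal_fence(memory_order_release);
    atomic_store_explicit(&irq_pending, 1, memory_order_relaxed);
}

/* Handler side: relaxed load of the flag, then a signal fence so the
 * data read is not hoisted above the flag check by the compiler. */
static int handler_take(int *out)
{
    if (!atomic_load_explicit(&irq_pending, memory_order_relaxed))
        return 0;
    atomic_signal_fence(memory_order_acquire);
    *out = irq_data;
    return 1;
}
```

This is not sufficient across CPUs; for that case the fences would need to be `atomic_thread_fence()` or the orderings on the operations themselves would need to be stronger.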


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds