Masters: ARM atomic operations
Posted Nov 23, 2012 1:19 UTC (Fri) by jzbiciak
In reply to: Masters: ARM atomic operations
Parent article: Masters: ARM atomic operations
Right, but an outside master that has to snoop into the ARM's memory hierarchy could see writes commit in a different order than the CPU sent them, depending on whether each snoop lands in L2, L1D or the write-merge buffer. A DMB effectively draws a line for snoops, too.
FWIW, another source of fun, at least on processors like A15, is the fact that snoops have to deal with the run-ahead OoO pipeline. There may be loads and stores in flight that are on a mispredicted path, and need to be unwound. That can be yet another source of memory reordering wackiness in the memory system.
What I've seen is that a processor like A15 will respond faster to a snoop that hits L2 and misses L1D than to a snoop that also hits L1D, because it doesn't need to sync with the OoO pipeline. Depending on your access pattern, you could have later writes that got flushed out to L2 ahead of earlier writes. For example, A15 will stop write-allocating in L1D if you stream too many write misses. So, you could easily have some older writes in L1D, some middle-aged writes write-allocating in L2 and the youngest writes in the L1D write-merge buffer. (More info here. Look at bits 25 through 28, which control the L1 and L2 write streaming no-allocate threshold.)
A DMB after the write stream should ensure that any snoop able to see a write that followed the DMB also sees all of the writes in the stream, regardless of which of these three places each of them landed.
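For what it's worth, here is a rough sketch of that pattern in portable C11, not anything taken from real driver code. The names (stream_buf, stream_ready, publish_stream, consume_stream) are made up for illustration. I'm assuming the observer is another CPU thread; on ARMv7, compilers typically lower the release fence to a "dmb ish". For a genuinely outside master such as a DMA engine, the C11 fence makes no formal promise, and you would likely need a platform-specific barrier with a wider shareability domain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define STREAM_WORDS 1024

/* Hypothetical shared region that some other observer reads. */
static uint32_t stream_buf[STREAM_WORDS];
static atomic_uint stream_ready;

/* Writer: stream the data, draw the line, then publish a flag. */
void publish_stream(uint32_t seed)
{
    /* These stores may end up in L1D, L2 or the write-merge buffer. */
    for (size_t i = 0; i < STREAM_WORDS; i++)
        stream_buf[i] = seed + (uint32_t)i;

    /* Assumption: this becomes a DMB on ARM. Everything written above
     * must be observable before the flag store below is. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&stream_ready, 1u, memory_order_relaxed);
}

/* Observer: returns 0 if the flag is not yet visible; otherwise sums
 * the buffer into *sum_out and returns 1. */
int consume_stream(uint32_t *sum_out)
{
    assert(sum_out != NULL);

    if (atomic_load_explicit(&stream_ready, memory_order_relaxed) == 0u)
        return 0;

    /* Pairs with the writer's fence: having seen the flag (the write
     * after the barrier), we must also see the whole stream. */
    atomic_thread_fence(memory_order_acquire);

    uint32_t sum = 0;
    for (size_t i = 0; i < STREAM_WORDS; i++)
        sum += stream_buf[i];
    *sum_out = sum;
    return 1;
}
```

The flag store plays the role of the "write that followed the DMB": an observer that sees stream_ready set should never find a stale word in stream_buf, wherever the individual stores were sitting at the time.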