I guess it would depend on the cache protocol too, wouldn't it?
In a MESI system, you would end up bouncing lines (an S => M transition on the first writer and S => I on the others, followed by M => S and I => S once the others read the line again). In an ESI system (write-through caches with no notion of "modified"), you'd get something similar.
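To make the bouncing concrete, here is a rough toy model of one cache line under MESI. It is only a sketch under simplifying assumptions: one line, invalidate-on-write, no timing or bus model. The names `State`, `write`, and `read` are made up for illustration and do not correspond to any real simulator.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def write(states, writer):
    """Model a store: the writer ends in M, every other valid copy goes to I.

    Returns the number of remote copies that had to be invalidated.
    """
    invalidated = 0
    for cpu, st in enumerate(states):
        if cpu != writer and st is not State.I:
            states[cpu] = State.I
            invalidated += 1
    states[writer] = State.M
    return invalidated


def read(states, reader):
    """Model a load: a reader in I must refetch the line.

    Any other valid holder (M, E, or S) ends in S. Returns True on a miss.
    """
    if states[reader] is not State.I:
        return False  # hit, no coherence traffic in this toy model
    others_hold = False
    for cpu, st in enumerate(states):
        if cpu != reader and st is not State.I:
            states[cpu] = State.S  # M => S or E => S downgrade
            others_hold = True
    states[reader] = State.S if others_hold else State.E
    return True


if __name__ == "__main__":
    # Three CPUs all start with the line in S (assumed starting point).
    line = [State.S, State.S, State.S]
    print("invalidations on write:", write(line, 0), [s.name for s in line])
    for cpu in (1, 2):
        print("miss on read:", read(line, cpu), [s.name for s in line])
```

Each write by one CPU costs an invalidation per sharer plus a later miss on every other CPU that touches the line again, which is the bounce. It says nothing about how expensive those steps are on real hardware.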
In a MOESI system such as the Athlon's, I believe you minimize the cost. The first writer does an S => O transition and broadcasts the write to everyone else that's in S.
From that, I'd say it's rather important to measure on multiple architectures, since the tradeoffs will vary.
Posted Aug 26, 2010 16:25 UTC (Thu) by PaulMcKenney (subscriber, #9624)
I agree with your MESI assessment, but would want to see test results for the MOESI case -- there would be added traffic from the broadcasts. Which is all to say that I very much agree with you about the importance of testing on multiple architectures!