There has been more discussion along the lines of per-NUMA-node and
per-process tables to reduce false sharing on the futex hash table, but the
effect is similar. One thing I feel is lacking is a use-case that exemplifies
the contention on the hash-table and any cache-ping-pong it may cause on
multi-socket and/or multi-node systems. I'm working on a futex test suite
now and I hope some of the perf and stress tests will help here.
Posted Nov 16, 2009 0:48 UTC (Mon) by dlang (✭ supporter ✭, #313)
one real-world example of lock ping-pong with futexes that I ran into recently was rsyslog (should be reduced in recent versions)
it has threads that receive messages and add them to a (lock protected) queue, while other threads retrieve messages from the queue to output them.
with a simple UDP input and file output, a high enough input rate could push it into lock contention, at which point throughput plummets. I can't say for sure that this is SMP cache line bouncing, but there's a good chance of this being the case.
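for illustration, here is a minimal sketch of the pattern described above: producer threads append to a mutex-protected queue while consumer threads drain it. This is not rsyslog's code. Every name in it (msg_queue, run_queue_demo, QUEUE_CAP, and so on) is hypothetical, and the bounded ring buffer and condition variables are assumptions made to keep the example small. On Linux with glibc, a contended pthread mutex sleeps and wakes through futex(), so every thread here funnels through the same lock word.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical sketch of a lock-protected message queue; not rsyslog's code. */
#define QUEUE_CAP 256

struct msg_queue {
    pthread_mutex_t lock;       /* the single lock every thread contends on */
    pthread_cond_t not_empty;   /* signalled after an enqueue */
    pthread_cond_t not_full;    /* signalled after a dequeue */
    int buf[QUEUE_CAP];         /* ring buffer standing in for messages */
    size_t head, count;
    int producers_left;         /* producers that have not finished yet */
    long consumed;              /* total messages dequeued */
};

struct worker_arg { struct msg_queue *q; int n; };

/* "Input" thread: enqueue n messages, blocking while the queue is full. */
static void *producer(void *p) {
    struct worker_arg *a = p;
    struct msg_queue *q = a->q;
    for (int i = 0; i < a->n; i++) {
        pthread_mutex_lock(&q->lock);
        while (q->count == QUEUE_CAP)
            pthread_cond_wait(&q->not_full, &q->lock);
        q->buf[(q->head + q->count) % QUEUE_CAP] = i;
        q->count++;
        pthread_cond_signal(&q->not_empty);
        pthread_mutex_unlock(&q->lock);
    }
    pthread_mutex_lock(&q->lock);
    q->producers_left--;
    pthread_cond_broadcast(&q->not_empty);  /* let idle consumers re-check */
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* "Output" thread: dequeue until the queue is empty and producers are done. */
static void *consumer(void *p) {
    struct msg_queue *q = p;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0 && q->producers_left > 0)
            pthread_cond_wait(&q->not_empty, &q->lock);
        if (q->count == 0) {                /* drained, no producers left */
            pthread_mutex_unlock(&q->lock);
            return NULL;
        }
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        q->consumed++;
        pthread_cond_signal(&q->not_full);
        pthread_mutex_unlock(&q->lock);
    }
}

/* Run nprod producers and ncons consumers; return the number of messages consumed. */
long run_queue_demo(int nprod, int ncons, int per_producer) {
    assert(nprod >= 0 && ncons > 0);
    struct msg_queue q = { .producers_left = nprod };
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    pthread_cond_init(&q.not_full, NULL);

    struct worker_arg arg = { &q, per_producer };
    pthread_t *t = malloc((size_t)(nprod + ncons) * sizeof *t);
    assert(t != NULL);
    for (int i = 0; i < nprod; i++) {
        int rc = pthread_create(&t[i], NULL, producer, &arg);
        assert(rc == 0); (void)rc;
    }
    for (int i = 0; i < ncons; i++) {
        int rc = pthread_create(&t[nprod + i], NULL, consumer, &q);
        assert(rc == 0); (void)rc;
    }
    for (int i = 0; i < nprod + ncons; i++)
        pthread_join(t[i], NULL);
    free(t);

    pthread_cond_destroy(&q.not_full);
    pthread_cond_destroy(&q.not_empty);
    pthread_mutex_destroy(&q.lock);
    return q.consumed;
}
```

calling run_queue_demo(4, 4, 1000000) from a small main() and timing it while varying the thread counts is one rough way to look for the throughput drop. The sketch only reproduces the locking pattern; it cannot by itself tell cache line bouncing apart from plain lock hold time.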