What about separating locks and data?
Posted Jan 4, 2013 18:58 UTC (Fri)
by daney (guest, #24551)
In reply to: What about separating locks and data? by renox
Parent article: Improving ticket spinlocks
1) Each load from the Now-Serving counter memory location creates traffic on the internal CPU buses. This traffic decreases the amount of useful work that can be done. Since the threads blocked waiting for their Now-Serving number to come up are not doing anything useful, decreasing the amount of bus traffic they are generating makes everything else go faster.
2) Cache line contention/bouncing caused by modifications to the Ticket-Counter and to the data protected by the lock.
The first issue is (mostly) the one addressed by the patch in question. Increasing the size/alignment of arch_spinlock_t to occupy an entire cache line might be beneficial for some use cases, but it would increase the size of many structures, thus causing increased cache pressure.
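To make the two issues concrete, here is a minimal sketch of a ticket lock in portable C11 atomics. It is not the kernel's arch_spinlock_t; the names ticket_lock, padded_ticket_lock and CACHE_LINE_SIZE are made up for illustration, and the 64-byte line size is an assumption. The spin loop is where issue 1 comes from, the fetch-and-add is where part of issue 2 comes from, and the padded variant shows the size cost of giving the lock a whole cache line.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64 /* assumption: 64-byte cache lines */

/* Hypothetical ticket lock, for illustration only. */
struct ticket_lock {
    atomic_uint next_ticket; /* the "Ticket-Counter": bumped by each arrival */
    atomic_uint now_serving; /* the "Now-Serving" counter: polled by waiters */
};

/* The same lock padded and aligned to occupy a full cache line. */
struct padded_ticket_lock {
    alignas(CACHE_LINE_SIZE) struct ticket_lock lock;
};

/* The padded form is as large as a cache line no matter how small the lock is. */
static_assert(sizeof(struct padded_ticket_lock) == CACHE_LINE_SIZE,
              "padded lock should fill exactly one cache line");

static void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next_ticket, 0);
    atomic_init(&l->now_serving, 0);
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Taking a ticket is a write, so the line bounces between CPUs (issue 2). */
    unsigned int me = atomic_fetch_add_explicit(&l->next_ticket, 1,
                                                memory_order_relaxed);

    /* Every pass through this loop is a load of now_serving (issue 1). */
    while (atomic_load_explicit(&l->now_serving, memory_order_acquire) != me)
        ; /* a real implementation would pause or back off here */
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Only the lock holder writes now_serving, so a plain increment is enough. */
    unsigned int next =
        atomic_load_explicit(&l->now_serving, memory_order_relaxed) + 1;
    atomic_store_explicit(&l->now_serving, next, memory_order_release);
}
```

With 4-byte counters the unpadded lock is 8 bytes, so padding it to a 64-byte line makes every structure that embeds it grow by up to 56 bytes, which is the cache-pressure cost mentioned above.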
Posted Jan 5, 2013 10:38 UTC (Sat)
by renox (guest, #23785)
For the second issue, what you're describing (having the spinlock occupy an entire cache line) isn't always necessary: in some cases you could put 'cold' data in the same cache line as the lock to get the best performance without using too much memory.
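A rough sketch of that layout in C11, assuming 64-byte cache lines. The structure and all of its field names (small_lock, counted_object, object_id and so on) are hypothetical and only stand in for "a lock, some rarely touched fields, and some hot fields":

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumption: 64-byte cache lines */

/* Stand-in for a small spinlock type; hypothetical. */
struct small_lock {
    uint32_t word;
};

/* Hypothetical object: the lock shares its line with cold fields only. */
struct counted_object {
    /* Line 0: the lock plus data that is rarely touched after setup. */
    alignas(CACHE_LINE_SIZE) struct small_lock lock;
    uint32_t object_id;  /* cold: written once at creation */
    const char *name;    /* cold */
    uint64_t created_at; /* cold */

    /* Line 1: the hot data that the lock protects. */
    alignas(CACHE_LINE_SIZE) uint64_t hits;
    uint64_t bytes;
};

/* Compile-time checks that the layout is what the comments claim. */
static_assert(offsetof(struct counted_object, created_at) + sizeof(uint64_t)
                  <= CACHE_LINE_SIZE,
              "cold fields should fit in the lock's cache line");
static_assert(offsetof(struct counted_object, hits) == CACHE_LINE_SIZE,
              "hot data should start on the next cache line");
static_assert(sizeof(struct counted_object) == 2 * CACHE_LINE_SIZE,
              "object should span exactly two cache lines");
```

The padding that a fully line-sized lock would waste is reused for the cold fields here. The trade-off is that reading those fields while the lock is contended will be slow, so this presumably only pays off when they really are cold.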
Posted Jan 10, 2013 18:30 UTC (Thu)
by ksandstr (guest, #60862)
That'd protect the significant cachelines not only from the write-bouncing caused by ticket acquisition, but also from any spinlock-related "oops, had to flush this exclusive cache line to RAM in the meantime" cases caused by read traffic. I'm guessing that an operation to acquire locks on N objects without ordering foulups could fit on top of that as well.
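On the N-object idea: one common way to avoid lock-ordering problems is to always take the locks in a fixed global order, for example by address. The following is only a sketch of that general technique, not of any particular kernel facility; simple_lock, lock_many and unlock_many are invented names, and a C11 atomic_flag test-and-set lock stands in for a real spinlock.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical test-and-set lock standing in for a real spinlock. */
struct simple_lock {
    atomic_flag held;
};

#define SIMPLE_LOCK_INIT { ATOMIC_FLAG_INIT }

static void simple_lock_acquire(struct simple_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void simple_lock_release(struct simple_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* qsort comparator: orders lock pointers by their numeric address. */
static int cmp_lock_addr(const void *a, const void *b)
{
    uintptr_t x = (uintptr_t)*(struct simple_lock *const *)a;
    uintptr_t y = (uintptr_t)*(struct simple_lock *const *)b;
    return (x > y) - (x < y);
}

/*
 * Sort the array in place by address, then acquire in that order.
 * Assumption: the caller passes distinct locks; a duplicate would
 * make this spin forever.
 */
static void lock_many(struct simple_lock **locks, size_t n)
{
    qsort(locks, n, sizeof locks[0], cmp_lock_addr);
    for (size_t i = 0; i < n; i++)
        simple_lock_acquire(locks[i]);
}

/* Release in reverse order of acquisition. */
static void unlock_many(struct simple_lock **locks, size_t n)
{
    while (n-- > 0)
        simple_lock_release(locks[n]);
}
```

Because every caller ends up acquiring in the same address order, two threads that want overlapping sets of locks cannot each hold one lock while waiting for the other's.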
