
Preventing overly-optimistic spinning

By Jonathan Corbet
August 25, 2010
A kernel mutex is a sleeping lock; a thread which loses in contention for a specific mutex will be blocked until that mutex becomes available. At least, that's what the documentation says; the reality is a bit more complicated. Experience has shown that throughput can sometimes be improved if processes waiting for a lock do not go to sleep immediately. In particular, if (1) a thread finds a mutex to be unavailable, and (2) the holder of the mutex is currently running, that thread will spin until the mutex becomes available or the holder blocks. That "optimistic" spinning allows the transfer of the mutex without going through a sleep/wakeup cycle, and, importantly, it gives the mutex to a running (and, thus, cache-hot) thread. The result is an unfair, but better-performing mutex implementation.
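
The core of the spin loop can be sketched in a few lines. This is a simplified illustration rather than the kernel's actual code; the type and helper names (struct my_mutex, owner_is_running(), try_acquire()) are invented for the example:

    /*
     * Simplified sketch of optimistic spinning -- not the kernel's real
     * implementation.  struct my_mutex, owner_is_running() and
     * try_acquire() are hypothetical stand-ins.
     */
    static bool optimistic_spin(struct my_mutex *lock)
    {
        for (;;) {
            struct task_struct *owner = lock->owner;

            /* The lock looks free: try to take it atomically. */
            if (!owner && try_acquire(lock))
                return true;        /* acquired without sleeping */

            /*
             * Somebody holds the lock.  Keep spinning only while the
             * holder is running on another CPU; a running holder is
             * likely to drop the lock soon, and handing the lock to a
             * running (cache-hot) thread avoids the sleep/wakeup cycle.
             */
            if (owner && !owner_is_running(owner))
                break;

            if (need_resched())
                break;              /* this CPU has other work to do */

            cpu_relax();            /* busy-wait politely */
        }
        return false;               /* caller falls back to sleeping */
    }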

Except that, as it turns out, it doesn't always perform better. While doing some testing on a 64-core system, Tim Chen noticed a problem: multiple threads can be waiting for the same mutex at any given time, and all of them may be spinning optimistically. Once the mutex becomes available, only one of these spinning threads will obtain it; the others will continue to spin, contending for the lock. In general, optimism can be good, but excessive optimism can be harmful if it leads to continued behavior which does not yield useful results. That would appear to be the case here.

Tim's response was a patch changing the optimistic spinning implementation slightly. There is now an additional check in the loop to see if the owner of the mutex has changed. If the ownership of a mutex changes while a thread is spinning, waiting for it, that means that it was released and somebody else grabbed it first. In other words, there is heavy contention and multiple CPUs are spinning in a race that only one of them can win. In such cases, it makes sense to just go to sleep and wait until things calm down a bit.
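
In terms of the sketch above, the new check amounts to something like the following (again with invented names, not the code from the actual patch):

    /*
     * Hypothetical sketch of the added check.  Remember which task owned
     * the lock when spinning began; if lock->owner changes to some other
     * task, the lock was released and immediately taken by another
     * spinner, so contention is heavy and sleeping is the better option.
     */
    static bool worth_spinning_on(struct my_mutex *lock,
                                  struct task_struct *owner)
    {
        while (owner_is_running(owner)) {
            if (lock->owner != owner)
                return false;       /* owner changed: stop spinning, sleep */
            cpu_relax();
        }
        return true;                /* owner stopped running; re-check the lock */
    }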

Various benchmark results showed significant performance improvements in heavily-contended situations. That was enough to get the patch merged for 2.6.36-rc2.



Preventing overly-optimistic spinning

Posted Aug 26, 2010 8:37 UTC (Thu) by liljencrantz (guest, #28458)

Wouldn't an alternative strategy, one where at most one thread will ever spin waiting for a given mutex, have fewer edge cases?
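
For illustration, one hypothetical way to express that policy (not existing kernel code) would be a separate per-lock flag, claimed with an atomic operation, so that only the first waiter spins and later waiters go to sleep immediately:

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Hypothetical "at most one spinner" sketch; has_spinner must be
     * initialized with ATOMIC_FLAG_INIT. */
    struct one_spinner_lock {
        atomic_int  held;           /* 0 = free, 1 = held */
        atomic_flag has_spinner;    /* set while one waiter is spinning */
    };

    /* Returns true if the caller won the right to spin; false means it
     * should go to sleep immediately. */
    static bool claim_spinner_slot(struct one_spinner_lock *l)
    {
        return !atomic_flag_test_and_set(&l->has_spinner);
    }

    static void release_spinner_slot(struct one_spinner_lock *l)
    {
        atomic_flag_clear(&l->has_spinner);
    }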

Preventing overly-optimistic spinning

Posted Aug 26, 2010 15:12 UTC (Thu) by PaulMcKenney (subscriber, #9624)

Hard to say. The usual way for threads to see that other threads are already spinning is for the spinners to write (using atomic instructions) to the lock word, which would result in more cache thrashing and would delay the lock release. This usually slows things down rather than speeding them up.

But you can always give it a try and see what happens.

Preventing overly-optimistic spinning

Posted Aug 26, 2010 16:10 UTC (Thu) by jzbiciak (✭ supporter ✭, #5246)

I guess it would depend on the cache protocol too, wouldn't it?

In a MESI protocol, you would end up bouncing lines (an S => M transition on the first writer and S => I on the others, followed by M => S and I => S when they read the line back). In an ESI system (write-through caches with no notion of "modified"), you'd get something similar.

In a MOESI protocol such as the Athlon's, I believe you minimize the cost. The first writer does an S => O transition and broadcasts the write to everyone else that's in S.

From that, I'd say it's rather important to measure on multiple architectures, since the tradeoffs will vary.

Preventing overly-optimistic spinning

Posted Aug 26, 2010 16:25 UTC (Thu) by PaulMcKenney (subscriber, #9624)

I agree with your MESI assessment, but would want to see test results for the MOESI case -- there would be added traffic from the broadcasts. Which is all to say that I very much agree with you about the importance of testing on multiple architectures!

Preventing overly-optimistic spinning

Posted Aug 26, 2010 14:43 UTC (Thu) by juanjux (guest, #11652)

"Spinning threads".

This is why I read this section.

Preventing overly-optimistic spinning

Posted Aug 30, 2010 17:25 UTC (Mon) by dvhart (guest, #19636)

As I'm working on a similar mechanism for user-space mutexes, this caught my eye. One concern I have with the "if owner-changed then sleep" approach is the possibility of effectively disabling spinning for all but the first spinner. Perhaps this is unavoidable. I have considered an "only one spinner at a time" approach, and while it avoids a bunch of spinners stacking up, it has the same end result. I suppose workloads for which this is a benefit are not so concerned about a single block of threads trying to acquire a lock once, but rather several threads repeatedly trying to acquire the lock for the duration of the run. In this latter case, a heavily contended lock will frequently have a spinner. I wonder if the single spinner limitation would allow the system itself to perform better (on other things).
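
For illustration, the general shape of adaptive spinning as seen from user space might look like the following generic pthread sketch; this is not the futex-based mechanism under development, and the spin bound is arbitrary:

    #include <pthread.h>

    #define SPIN_TRIES 100          /* arbitrary bound for this sketch */

    /* Generic adaptive acquire: spin briefly on trylock, then block. */
    static void adaptive_lock(pthread_mutex_t *m)
    {
        for (int i = 0; i < SPIN_TRIES; i++) {
            if (pthread_mutex_trylock(m) == 0)
                return;             /* acquired while "spinning" */
        }
        pthread_mutex_lock(m);      /* give up and sleep */
    }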

Preventing overly-optimistic spinning

Posted Sep 3, 2010 8:20 UTC (Fri) by efexis (guest, #26355)

Do locks get served on a first-come-first-served basis? If so, a single-spinner solution would be an advantage, but if not, what you'd be doing is pre-empting a scheduler decision: a lock might not be passed to the first spinner, but to the one with the highest priority. You wouldn't want the high-priority task to sleep because a low-priority task started spinning first, and then have to wake the high-priority task when the lock becomes available, only to have the low-priority task notice an ownership change and then sleep.

Are there other useful things that can be done while spinning on the processor that don't add latency to waking up when the lock is gained? Like saving power? Or does switching power states tend to be too hungry an operation in itself? Am just curious as a non-kernel-hacker :-)
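
For reference: inside a spin loop the usual low-cost measure is a pause hint rather than a real power-state change. The kernel's cpu_relax() -- the PAUSE instruction on x86 -- reduces power draw and frees pipeline resources for an SMT sibling while keeping wakeup latency near zero, whereas entering a deeper sleep state would add exactly the latency the spinning is trying to avoid. A user-space sketch of the same hint:

    /* Minimal busy-wait hint, the user-space analogue of the kernel's
     * cpu_relax().  On x86, PAUSE marks a spin-wait loop, saving some
     * power and pipeline bandwidth without the latency of a sleep state. */
    static inline void spin_wait_hint(void)
    {
    #if defined(__x86_64__) || defined(__i386__)
        __asm__ __volatile__("pause" ::: "memory");
    #endif
    }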

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds