LWN: Comments on "A surprise with mutexes and reference counts"

A surprise with mutexes and reference counts

tvld — Thu, 27 Nov 2014 10:42:25 +0000

No, the code snippet in the article would be correct under pthreads. The Austin Group has clarified that they want this to work: http://austingroupbugs.net/view.php?id=811
C++11 makes a similar requirement. IIRC, C11 isn't specific, but they usually default to semantic equality with C++11 and/or POSIX in the synchronization constructs.

The underlying issue is that there is both a fast path and a slow path for wake-up, and other threads can wake up in both ways because mutex release first wakes up through the fast path and then through the slow path. It can then happen that the slow-path wake-up runs concurrently with another thread that has already acquired the lock.

This affects glibc's mutex implementation too. There's no segfault or such, but there can be pending FUTEX_WAKE calls to memory locations that might have been unmapped or reused for a different futex. That can introduce spurious wake-ups for FUTEX_WAIT calls on other futexes, which violates the promise FUTEX_WAIT makes for a return value of 0.
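To make the fast-path/slow-path window concrete, here is a minimal sketch of a futex-based mutex in the style of the well-known three-state design. It is not glibc's implementation; `toy_mutex`, `toy_lock`, `toy_unlock`, and the `futex` wrapper are hypothetical names chosen for illustration, and the code assumes Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical toy mutex. State: 0 = unlocked, 1 = locked,
 * 2 = locked and there may be sleeping waiters. */
struct toy_mutex {
    atomic_int state;
};

/* Thin wrapper around the raw futex system call (illustrative helper). */
static long futex(atomic_int *uaddr, int op, int val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void toy_lock(struct toy_mutex *m)
{
    int c = 0;

    /* Fast path: uncontended 0 -> 1 transition, no system call. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Slow path: mark the lock contended and sleep until it is free. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* May return without a matching wake; the loop re-checks. */
        futex(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void toy_unlock(struct toy_mutex *m)
{
    /* Fast-path release: once this store is visible, any other thread
     * can acquire the lock without ever entering the kernel. */
    int prev = atomic_exchange(&m->state, 0);

    assert(prev != 0); /* unlocking an unlocked mutex is a caller bug */

    if (prev == 2) {
        /* WINDOW: another thread may already have locked, unlocked,
         * and freed or reused the memory holding m. The wake below
         * then targets a stale address. */
        futex(&m->state, FUTEX_WAKE, 1);
    }
}
```

The wait loop in `toy_lock` re-reads the state after every return from `FUTEX_WAIT`, so a stray wake-up costs only an extra iteration. That is the kind of robustness to spurious wake-ups that typical futex users already need.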

I see two meaningful solutions for this:

(1) Clarify that spurious wake-ups (i.e., wake-ups that happen but are not due to FUTEX_WAKE calls by the code that created this futex) are allowed when FUTEX_WAIT returns 0. This changes the futex contract, but it's incompletely documented anyway. This doesn't break any code in glibc or other futex-using code that I'm aware of. Generally, unless you have a one-shot synchronization mechanism, typical futex uses will have to tolerate past slow-path wake-ups anyway, so they need to be robust to spurious wake-ups. Also, all programs using glibc have been and are affected by such spurious wake-ups anyway.

(2) Introduce a new futex type or variant of FUTEX_WAIT whose contract explicitly allows spurious wake-ups; combine with a FUTEX_WAKE that wakes only these calls or futexes. This has the benefit that nothing in the contract of the original futex changes, but it requires a kernel implementation change, and many futex users will need to change (e.g., glibc).

Solving this in userspace entirely is possible, but would degrade performance. If you just allow slow-path wake-up, lock release latency increases and thus scalability decreases. Trying to implement some of the deferred memory reclamation schemes that Paul mentioned is hard for process-shared mutexes.

For more background, see https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html

A surprise with mutexes and reference counts

dpotapov — Sun, 22 Dec 2013 23:37:35 +0000

The above code is definitely incorrect. Even if a pthread mutex was used instead, the code would be still incorrect.

The correct use of a pthread mutex requires invoking mutex_destroy() before freeing the memory, and any sane implementation of pthreads takes an internal spinlock (or does something similar) there, thus fixing the above problem.

So the mutex_lock and mutex_unlock functions are fine as they are. There is no need for additional requirements. It is just that the Linux kernel does not provide mutex_destroy, because there are no resources to free. However, it means that you cannot use such a mutex safely in patterns like the code above.
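A minimal userspace sketch of the destroy-before-free ordering described in this comment, assuming POSIX threads. The names `struct obj`, `obj_new`, `obj_get`, and `obj_put` are hypothetical. Whether `pthread_mutex_destroy()` itself waits for an unlock still in progress is the commenter's claim about implementations, not something this sketch demonstrates.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical reference-counted object with an embedded lock. */
struct obj {
    pthread_mutex_t lock;
    int refcount;
};

/* Allocate an object holding one reference; returns NULL on failure. */
static struct obj *obj_new(void)
{
    struct obj *s = malloc(sizeof(*s));

    if (s == NULL)
        return NULL;
    if (pthread_mutex_init(&s->lock, NULL) != 0) {
        free(s);
        return NULL;
    }
    s->refcount = 1;
    return s;
}

/* Take an extra reference; the caller must already hold one. */
static void obj_get(struct obj *s)
{
    pthread_mutex_lock(&s->lock);
    s->refcount++;
    pthread_mutex_unlock(&s->lock);
}

/* Drop a reference; returns 1 if this call freed the object. */
static int obj_put(struct obj *s)
{
    int last;

    pthread_mutex_lock(&s->lock);
    assert(s->refcount > 0);
    last = (--s->refcount == 0);
    pthread_mutex_unlock(&s->lock);

    if (last) {
        /* Destroy the mutex before releasing its memory. */
        pthread_mutex_destroy(&s->lock);
        free(s);
    }
    return last;
}
```

A caller that creates an object, takes a second reference, and then drops both would see `obj_put()` return 0 and then 1.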

A surprise with mutexes and reference counts

chrisV — Tue, 17 Dec 2013 14:06:46 +0000

> ... is unsafe even barring any unexpected implementation details of the
> lock. There is nothing preventing new users of the data structure
> incrementing refcount between the mutex_unlock and the kfree (unless
> that is prevented by some other mechanism, of course)

That isn't an issue. Reference counted structures like this have to be regarded as defunct once the last thread with access to it has acquired the mutex in order to decrement the count to 0. The deferring of the freeing of memory until after the internal mutex has been unlocked is an implementation detail. You could not access the structure again once the count has reached zero even if the freeing of memory were to take place within the lock (which of course it can't).

You have to marshal access to ensure that once the count has reached 0 the structure is not accessed again, irrespective of the deferred freeing, such as by having all threads acquire their references before any one of them releases one. That gives rise to other issues concerning the locking strategy for the structure, which are not related to the bug with which this article is concerned.

A surprise with mutexes and reference counts

sorokin — Tue, 10 Dec 2013 08:20:02 +0000

Another solution could be a function that must be called before the mutex memory is freed, something like a C++ destructor for the mutex. This function must ensure that all other threads have finished unlocking the mutex.

A surprise with mutexes and reference counts

sorokin — Tue, 10 Dec 2013 08:13:16 +0000

> Whether it's a mutex bug depends upon what you expect
> from a mutex, in particular when you consider to be the
> instant the lock is released.

Completely agree. Whether the bug is in the mutex or in the reference counting code depends on the contract of the mutex. Still, I believe that a mutex that can access its own memory for an indefinite amount of time after unlocking can be used safely only in very simple programs.

E.g., Linus said that you can keep the mutex outside of the object you are protecting. But this does not fix anything. How can we guarantee that CPU2, after removing the reference-counted object, will not also remove the object where the mutex resides? After removing the reference-counted object, CPU2 can do anything.

As we decided that a mutex that can access its memory for an indefinite amount of time is not safe, the only valid kind of mutex is one that has some requirement on when it stops accessing its memory after unlock. So I tried to devise those requirements.

> the lock gets released at some instant after mutex_unlock's last reference to the lock

This is a very strong requirement. If it is feasible to implement efficiently, good; if not, we can relax the requirement as I did.

A surprise with mutexes and reference counts

giraffedata — Sat, 07 Dec 2013 23:34:26 +0000

Whether it's a mutex bug depends upon what you expect from a mutex, in particular when you consider to be the instant the lock is released.

As we usually define timing in library calls, we would say the lock is released sometime during the execution of mutex_unlock(), with no more precision than that. In the same way, we would say that mutex_unlock() can access its arguments at any time during its execution. With that definition, you obviously cannot put the lock inside the object being protected by the lock, because that means the mutex_unlock() could conceivably access the protected object after it has released the lock.

But since that prevents a very useful locking paradigm -- one that allows an object to be somewhat autonomous and just disappear on its own when its last reference goes away -- most of us are used to a more precise guarantee: the lock gets released at some instant after mutex_unlock's last reference to the lock. I presume that's actually written into official specifications for things like POSIX mutexes.

That stronger guarantee is what the mutexes in question fail to provide; I don't know if the designer intended that or not. If he did, then the reference counting code that assumed the stronger guarantee is where the bug is.

A surprise with mutexes and reference counts

nybble41 — Fri, 06 Dec 2013 18:41:58 +0000

If you define "unlock is complete" as the point where other threads are allowed to run in the critical section, then you're right, that can't happen. If I understand the issue correctly, the problem is that this happens before we return from mutex_unlock(), and mutex_unlock() continues to access the memory for the mutex after it's been unlocked and other threads have been allowed to run. These other threads could free the memory which holds the mutex, leading to an invalid memory access in mutex_unlock().

A surprise with mutexes and reference counts

Karellen — Fri, 06 Dec 2013 17:50:11 +0000

How can "another thread jump in and decrement to 0" before our unlock is complete? Until our unlock is complete, we still hold the lock. And until we've completed the unlock, no other thread should be able to grab the lock in order to drop the refcount to 0.

A surprise with mutexes and reference counts

sorokin — Fri, 06 Dec 2013 12:19:59 +0000

Looks like this is a bug in the mutex implementation, not a bug in the reference counting code.

CPU2 from Linus's example was able to

1. lock mutex
2. see other thread side effects
3. make its own side effects
4. unlock mutex
5. free memory of mutex

while CPU1 was still unlocking mutex.

The requirement for a mutex must be: once I have managed to lock the mutex, other threads must not access the mutex memory.

The relaxed requirement could be: once I have managed to lock the mutex, make my side effects, and unlock it, other threads must not access the mutex memory.

A surprise with mutexes and reference counts

bfields — Thu, 05 Dec 2013 22:00:53 +0000

(But yes, agreed on the "unsafe in general" comment, I would've thought the same. While also sympathizing with anyone caught by surprise by the fact that another user could acquire and drop the lock while we still haven't returned from the unlock!)

A surprise with mutexes and reference counts

bfields — Thu, 05 Dec 2013 21:45:47 +0000

> There is nothing preventing new users of the data structure incrementing refcount between the mutex_unlock and the kfree

Assume the object is created with a positive refcount and thereafter only ever decremented, never incremented.

Then once the refcount goes to zero it's never going positive again. So the problem you're concerned about doesn't happen. But there's still a problem:

The problem is we assume that it's still safe to call mutex_unlock(&s->lock) after dropping our own reference.

But in fact it's possible that we decrement to 1, then another thread jumps in and decrements to 0 while our mutex_unlock(&s->lock) is still in progress and frees s out from under us.
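The interleaving described above can be replayed deterministically in a single thread by splitting unlock into two phases. This is only a sketch: `toy_mutex`, `unlock_release`, `unlock_finish`, and the `freed` flag are hypothetical stand-ins, not the kernel's mutex. The flag replaces a real kfree() so the replay can inspect the object safely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy lock: just a flag, no real blocking. */
struct toy_mutex {
    bool locked;
};

struct obj {
    struct toy_mutex lock;
    int refcount;
    bool freed; /* stands in for kfree(); the memory is kept for inspection */
};

static void toy_lock(struct toy_mutex *m)
{
    assert(!m->locked);
    m->locked = true;
}

/* Phase 1 of unlock: the lock becomes available to other CPUs. */
static void unlock_release(struct toy_mutex *m)
{
    m->locked = false;
}

/* Phase 2 of unlock: later bookkeeping that still touches the mutex.
 * Returns true if the object holding the mutex was already freed. */
static bool unlock_finish(struct obj *owner)
{
    return owner->freed;
}

/* Replays the two-CPU interleaving; returns true if CPU1's unlock
 * touched the object after CPU2 had freed it. */
static bool replay(void)
{
    struct obj s = { .lock = { false }, .refcount = 2, .freed = false };

    /* CPU1: drop one reference (2 -> 1), then begin unlocking. */
    toy_lock(&s.lock);
    bool cpu1_last = (--s.refcount == 0);
    unlock_release(&s.lock); /* phase 1 done, phase 2 still pending */

    /* CPU2: runs a complete put while CPU1 is still inside unlock. */
    toy_lock(&s.lock);
    bool cpu2_last = (--s.refcount == 0); /* 1 -> 0 */
    unlock_release(&s.lock);
    unlock_finish(&s);
    if (cpu2_last)
        s.freed = true; /* the kfree(s) in the original pattern */

    /* CPU1: resumes phase 2 of its unlock on freed memory. */
    bool touched_freed = unlock_finish(&s);

    assert(!cpu1_last); /* CPU1 never saw the count reach zero */
    return touched_freed;
}
```

In this replay the refcount is never incremented, yet `replay()` still reports an access to the freed object. The hazard comes from the unlock itself rather than from new users appearing.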

A surprise with mutexes and reference counts

PaulMcKenney — Thu, 05 Dec 2013 16:44:06 +0000

Even Paul McKenney is going to have to provide some way of preventing the memory protected by the lock from being freed while acquiring the lock.

One approach is to use a deferred-reclamation scheme, such as RCU (my personal favorite), SLAB_DESTROY_BY_RCU, hazard pointers (Maged Michael's personal favorite), a general-purpose garbage collector, or even the rough-and-ready approach of simply never freeing any memory. Another approach is to make use of a lock or reference counter that lives outside of the memory being protected, the so-called "hand-over-hand locking" being one (slow!) example, and hashed arrays of locks being another (also slow!) example. Yet another approach is to use a scheme that allows the reference-count increment to proceed only if the pointer to the memory remains unchanged, for example, using MC68020's dcas instruction or one of the transactional-memory approaches. Of course, in the case of transactional memory, make sure to take into account the forward-progress properties!
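As a rough illustration of the "hashed arrays of locks" idea, the sketch below keeps the mutexes in a static table indexed by a hash of the object's address, so a late access by unlock lands in memory that is never freed. It assumes POSIX threads in userspace. `NLOCKS`, `lock_table_init`, `lock_for`, and `obj_put` are hypothetical names, and the hash is deliberately simplistic.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NLOCKS 64 /* illustrative table size */

/* Locks live in static storage, outside any object they protect. */
static pthread_mutex_t lock_table[NLOCKS];

/* Must be called once, before any other thread uses the table. */
static void lock_table_init(void)
{
    for (size_t i = 0; i < NLOCKS; i++)
        pthread_mutex_init(&lock_table[i], NULL);
}

/* Map an object address to one of the shared locks. */
static pthread_mutex_t *lock_for(const void *p)
{
    return &lock_table[((uintptr_t)p >> 4) % NLOCKS];
}

struct obj {
    int refcount; /* protected by lock_for(this object) */
};

/* Drop a reference; returns 1 if this call freed the object. */
static int obj_put(struct obj *s)
{
    pthread_mutex_t *l = lock_for(s);
    int last;

    pthread_mutex_lock(l);
    assert(s->refcount > 0);
    last = (--s->refcount == 0);
    pthread_mutex_unlock(l); /* touches only the static table */

    if (last)
        free(s);
    return last;
}
```

Unrelated objects that hash to the same slot contend for one mutex, which is one plausible reason for the "(also slow!)" remark.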

If the only protection for a piece of memory is a lock inside that piece of memory, life is hard. As far as I know, Gamsa et al. were the first to point this out in their 1999 OSDI paper: http://www.usenix.org/events/osdi99/full_papers/gamsa/gam....

A surprise with mutexes and reference counts

PaXTeam — Thu, 05 Dec 2013 16:02:46 +0000

> There's another thread on a different CPU doing the same thing.

then you have a refcount underflow bug to begin with, nothing to do with the problem discussed here.

A surprise with mutexes and reference counts

zmower — Thu, 05 Dec 2013 13:52:31 +0000

There's another thread on a different CPU doing the same thing.

Runekock's right. But for some reason Linus thinks the second CPU does the free. I'll take his word for it. ;)

A surprise with mutexes and reference counts

Karellen — Thu, 05 Dec 2013 09:36:46 +0000

Sorry, I don't see the problem at all.

Where do these "new users of the data structure" come from? In the case we're talking about, at the point the mutex lock is acquired, the refcount is 1. The current thread is the only thread with a reference to the protected data structure. At that point, there should be no other users at all, running or not, who *could* be able to alter the refcount or any other part of the data structure.

A surprise with mutexes and reference counts

runekock — Thu, 05 Dec 2013 09:24:49 +0000

The example in the article:

    int free = 0;

    mutex_lock(&s->lock);
    if (--s->refcount == 0)
        free = 1;
    mutex_unlock(&s->lock);
    if (free)
        kfree(s);

is unsafe even barring any unexpected implementation details of the lock. There is nothing preventing new users of the data structure incrementing refcount between the mutex_unlock and the kfree (unless that is prevented by some other mechanism, of course).

It seems to me that protecting the allocation of any kind of lock with itself is unsafe in general. Though I bet Paul McKenney could get away with it, it's probably a thing to avoid in code touched by the average hacker.