|
|
Log in / Subscribe / Register

Rethinking the futex API

Rethinking the futex API

Posted Jun 19, 2020 19:35 UTC (Fri) by itsmycpu (guest, #139639)
In reply to: Rethinking the futex API by krisman
Parent article: Rethinking the futex API

> How do you wake up multiple threads from your producer
> (1) without issuing multiple syscalls to wake each futex,
> (2) avoiding spurious wakeups of unrelated threads, or relying on
> (3) poll (which we've shown is not fast enough); while
> (4) not relying in your waiters spinning (which throws cpu utilization through the roof)?
> One of the main motivations of the patchset, which might have been lost in the article, is to reduce cpu utilization.

I agree with the posters who say wait-multiple can be implemented efficiently in user space using the existing Futex API.

Re (1): It might makes sense to add a wake-multiple call (as opposed to a wait-multiple) since that doesn't require copying dynamic lists into the kernel.
This would be much, much simpler.
Re (2): You can protect against spurious wakeups by re-checking atomic variables after wake up. However what are "unrelated threads"?
Re (3): A user space implementation doesn't require polling (except once which is good for peformance).
Re (4): You have the same general options implementing this as you do when implementing locks.

> Can you better describe what your model would look like in memory?
> One of my use cases is a producer-consumer scenario where each consumers wait on a subset of events triggered by different producers.

When necessary, you associate a list of consumers with each event producer, each list entry containing a pointer to the waiters futex and/or atomic guard variable.
Depending on the type of producer, you mark and/or wake one or all consumers. Why is that a question, did you try and have problems with this approach?

The kernel would have to do the same or similar processing otherwise, however in my understanding it is to be avoided for the kernel, for example, to copy and maintain dynamic memory lists. It is better to do in user space, also for performance (and to avoid impacting the performance of other applications). At least I know this works well in my own use cases.

Additionally, in use cases where the same "subset" is used repeatedly, these associations don't have to be re-established for each call.


to post comments

Rethinking the futex API

Posted Jun 20, 2020 2:54 UTC (Sat) by krisman (subscriber, #102057) [Link] (3 responses)

> When necessary, you associate a list of consumers with each event producer, each list entry containing
> a pointer to the waiters futex and/or atomic guard variable.

So, your design has one futex per consumer. This means that when a producer signals, in order to wake all of the entries in its waitqueue, it walks the list and does a FUTEX_WAKE syscall per entry. That's much worse than FWM.

Rethinking the futex API

Posted Jun 20, 2020 3:38 UTC (Sat) by ras (subscriber, #33059) [Link]

> So, your design has one futex per consumer.

At this point I'm lost. You obviously have some specific use case in mind none of this fits. Could you post a link / explain what it is?

Rethinking the futex API

Posted Jun 20, 2020 3:58 UTC (Sat) by itsmycpu (guest, #139639) [Link] (1 responses)

> So, your design has one futex per consumer. This means that when a producer signals,
> in order to wake all of the entries in its waitqueue, it walks the list and does a FUTEX_WAKE syscall per entry. That's much worse than FWM.

Almost, but not with that disadvantage. As I indicated, "in my design" there is (optionally) a FUTEX_WAKE_MULTIPLE (not wait_multiple) kernel API which is much simpler to implement, also because it doesn't need to copy and maintain the list internally in the kernel. It does multiple FUTEX_WAKE operations in a single syscall. And this can be added later on, as an optimization, if and when such a call becomes available in the kernel, while the rest can already be implemented, tested and tuned with the kernel as is.

Since a FUTEX_WAKE is only needed for futexes that actually have thread(s) waiting, and the waiting itself is already a time consuming kernel call, one non-blocking WAKE call per waiting futex shouldn't be that bad, although it will be better to have a wake_multiple call, which solves that problem (if it is one) completely.

Rethinking the futex API

Posted Jun 20, 2020 4:05 UTC (Sat) by itsmycpu (guest, #139639) [Link]

(Also, for example, the FUTEX_WAKE calls can offloaded to a worker thread, if the callers are high priority.)


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds