
Some approaches to parallelism avoidance

By Jonathan Corbet
November 17, 2009
What do you do if you have a group of processes, but only want one of them to run at any given time? This kind of workload is not that uncommon; it appears in user-space threading applications, asynchronous I/O applications, and in applications which have background processing tasks. Stijn Devriendt has such a problem; he recently proposed a solution in the form of a new system call:

    int sched_wait_block(pid_t pid, struct timespec *uts);

This call would put the process to sleep until the process indicated by pid blocked, at which point the calling process would go back onto the run queue. It would thus allow a sort of "only run me when process pid is sleeping" semantic.

Ingo Molnar responded with a suggestion for a very different approach; to him, this problem is another nail for the "perf events" hammer. An interested process could sign up for "parallelism" events, then receive notifications when specific processes sleep or become runnable. He sees some real benefits from such a capability:

This would make a very powerful task queueing framework. It basically allows a 'lazy' user-space scheduler, which only activates if the kernel scheduler has run out of work.
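
To give a rough idea of what that could look like, here is a sketch using the existing perf_event_open() interface, with the real context-switch software event standing in for the proposed "parallelism" events (which do not exist); the open_sched_event() helper is invented for this example:

    /*
     * Sketch only: the "parallelism events" described above were never
     * implemented.  This opens an existing perf event - the context-switch
     * software event - on a target task as a stand-in.
     */
    #define _GNU_SOURCE
    #include <linux/perf_event.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <string.h>

    static int open_sched_event(pid_t pid)
    {
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_SOFTWARE;
        attr.config = PERF_COUNT_SW_CONTEXT_SWITCHES;
        attr.sample_period = 1;     /* one sample per context switch */
        attr.wakeup_events = 1;     /* wake the poller on every sample */

        /* Monitor 'pid' on any CPU; there is no glibc wrapper for this
         * system call, so invoke it directly. */
        return syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0);
    }

The returned descriptor can then be mmap()ed and poll()ed; each wakeup tells the monitoring task that the target has just been context-switched, which is the sort of sleep/wakeup notification a lazy user-space scheduler could build on.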

Linus, though, had a very different suggestion: rather than create this whole framework, just add a relatively stupid "only run one of this group of threads at a time" mode to the scheduler. This mode, which could be specified with a new clone() flag, seems like it could solve most of the problems in this area without adding a new set of complicated interfaces.
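
To get a feel for that idea, here is a purely hypothetical sketch; CLONE_SINGLE_RUNNER is an invented name with a placeholder value of zero (so this code really just creates an ordinary thread), standing in for whatever flag such a mode would actually use:

    /*
     * Hypothetical: no "run only one of this group at a time" clone()
     * flag was ever merged.  CLONE_SINGLE_RUNNER is an invented, no-op
     * placeholder.
     */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdlib.h>

    #define CLONE_SINGLE_RUNNER 0   /* placeholder; a real flag would claim an unused bit */
    #define STACK_SIZE (64 * 1024)

    static int worker(void *arg)
    {
        /* Do some work; with the hypothetical flag, the kernel would
         * keep at most one thread of this group runnable at any time. */
        return 0;
    }

    static int spawn_serialized_worker(void)
    {
        char *stack = malloc(STACK_SIZE);

        if (!stack)
            return -1;

        /* Threads created this way would share the "one runnable at a
         * time" constraint with their parent. */
        return clone(worker, stack + STACK_SIZE,
                     CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND |
                     CLONE_THREAD | CLONE_SINGLE_RUNNER, NULL);
    }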

As of this writing, only sched_wait_block() has an actual patch associated with it, and nobody has committed to writing any others. So the eventual outcome - if any - from this conversation is unclear at best, but it's an interesting exploration of approaches in any case.



Some approaches to parallelism avoidance

Posted Nov 19, 2009 17:23 UTC (Thu) by martinfick (subscriber, #4455) [Link]

I would think that a slight tweak to this idea might also be desired: a "run only one of these processes per CPU at a time" flag. This way, I could start my make with a -j <infinity> and avoid useless cache flushing behavior. New processes would only run when another process is actually waiting for IO.

Some approaches to parallelism avoidance

Posted Nov 23, 2009 10:42 UTC (Mon) by nye (guest, #51576) [Link]

I think the idea (or one of the variants proposed) was to have a 'maximum parallelism' knob that you could set to the number of cores/CPUs in order to get this behaviour.

This sounds very much like how goroutines were described, which is interesting.

Some approaches to parallelism avoidance

Posted Dec 2, 2009 18:58 UTC (Wed) by HIGHGuY (guest, #62277) [Link]

The original idea is geared towards threadpools. If a user enqueues work which blocks or might block (e.g. locking a mutex, sending data on a socket, ...), then the threadpool's efficiency is temporarily reduced.

One way to solve this would be to overallocate threads, but that only adds unnecessary context switching. Another would be to require the user to use async I/O, but that tends to be rather complex and affects the threadpool user's code instead of the library itself.

The prototype implementation, as I saw later on, actually resembles what Tejun Heo is doing with his workqueue implementation: when one thread of the workqueue blocks, another one is woken to resume work. The big difference, of course, is that workqueues are kernel-only, while user space could definitely benefit from a similar approach.
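
(For reference, a minimal kernel-module sketch of that workqueue interface; the work function and module boilerplate here are invented for illustration:)

    /* Sketch of the workqueue API referred to above: the work item
     * sleeps, and the concurrency-managed workqueue code can wake
     * another worker thread so other queued work keeps making progress. */
    #include <linux/module.h>
    #include <linux/workqueue.h>
    #include <linux/delay.h>

    static void demo_work_fn(struct work_struct *work)
    {
        msleep(10);     /* blocking here does not stall the whole queue */
    }

    static DECLARE_WORK(demo_work, demo_work_fn);
    static struct workqueue_struct *demo_wq;

    static int __init demo_init(void)
    {
        demo_wq = alloc_workqueue("demo", 0, 0);
        if (!demo_wq)
            return -ENOMEM;
        queue_work(demo_wq, &demo_work);
        return 0;
    }

    static void __exit demo_exit(void)
    {
        destroy_workqueue(demo_wq);
    }

    module_init(demo_init);
    module_exit(demo_exit);
    MODULE_LICENSE("GPL");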

At this moment I'm leaning towards implementing Ingo's solution, as that would not only benefit my use case but could also be used as a true perf counter to measure how well a given workload is using the available CPU power. It's also quite flexible.
Linus's solution could still be added later on, as the two have no impact on each other as far as I can tell.

An updated overview of perf events might be nice?

Posted Nov 19, 2009 17:41 UTC (Thu) by MarkWilliamson (guest, #30166) [Link]

I know perf events / performance counters (as they previously were called)
have come up before but perhaps it would be useful to have a "state of perf
events" article at some point?

When they appeared to be a counter mechanism I could understand what they
did; I know they've been renamed to perf events because they've become more
general. But now it seems they're not even performance-related - they're
starting to sound more like a "generic callback framework" that happens to
get used in some perf monitoring code.

It'd be really nice to have a from-the-top explanation of what they have
actually become in recent kernels, since I'm having some trouble keeping up!

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds