One of the biggest internal changes in 2.6.36 will be the adoption of concurrency-managed workqueues. The short-term goal of this work is to reduce the number of kernel threads running on the system while simultaneously increasing the concurrency of tasks submitted to workqueues. To that end, the per-workqueue kernel threads are gone, replaced by a central set of threads with names like [kworker/0:0]; workqueue tasks are then dispatched to the threads via an algorithm which tries to keep exactly one task running on each CPU at all times. The result should be better use of the CPU for workqueue tasks and less memory tied up by the workqueue machinery.
That is a worthwhile result in its own right, but it's really only a beginning. The 2.6.36 workqueue patches were deliberately designed to minimize the impact on the rest of the kernel, so they preserved the existing workqueue API. But the new code is intended to do more than replace workqueues with a cleverer implementation; it is really meant to be a general-purpose task management system for the kernel. Making full use of that capability will require changes in the calling code - and in code which does not yet use workqueues at all.
In kernels prior to 2.6.36, workqueues are created with create_workqueue() and a couple of variants. That function will, among other things, start up one or more kernel threads to handle tasks submitted to that workqueue. In 2.6.36, that interface has been preserved, but the workqueue it creates is a different beast: it has no dedicated threads and really just serves as a context for the submission of tasks. The API is considered deprecated; the proper way to create a workqueue now is with:
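As a rough illustration of the conversion, here is a hedged sketch of the old and new creation calls side by side; the queue name "mydrv" is made up, and the flags shown are illustrative rather than an exact drop-in equivalent of what create_workqueue() sets up internally:

```c
#include <linux/workqueue.h>

/* Deprecated style: historically started dedicated kernel threads;
 * in 2.6.36 it is a compatibility wrapper. */
struct workqueue_struct *old_wq = create_workqueue("mydrv");

/* New style: no dedicated threads, just a submission context.
 * Zero flags and a zero (default) max_active are used here for
 * illustration; real callers should pick flags deliberately. */
struct workqueue_struct *new_wq = alloc_workqueue("mydrv", 0, 0);
```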
struct workqueue_struct *alloc_workqueue(const char *name, unsigned int flags, int max_active);
The name parameter names the queue but, unlike in the older implementation, it does not create threads using that name. The flags parameter selects among a number of relatively complex options on how work submitted to the queue will be executed; in 2.6.36 its value can include:

- WQ_NON_REENTRANT: a given work item is guaranteed not to run on more than one CPU at the same time.
- WQ_UNBOUND: work items are not bound to any specific CPU; they are handled by a special set of unbound worker threads.
- WQ_FREEZEABLE: the queue participates in system suspend; its work items are flushed and no new ones run until the system is thawed.
- WQ_RESCUER: the queue is given a "rescuer" thread which can keep work flowing when memory is tight and new worker threads cannot be created.
- WQ_HIGHPRI: work items are placed at the head of the queue and do not wait for the normal concurrency-management machinery.
- WQ_CPU_INTENSIVE: work items do not count toward the concurrency-management accounting, so they do not block the execution of other work items.
The combination of the WQ_HIGHPRI and WQ_CPU_INTENSIVE flags takes this workqueue out of the concurrency management regime entirely. Any tasks submitted to such a workqueue will simply run as soon as the CPU is available.
The final argument to alloc_workqueue() is max_active. This parameter limits the number of tasks which can be executing simultaneously from this workqueue on any given CPU. The default value (used if max_active is passed as zero) is 256, but the actual level of concurrency is likely to be far lower, given that the workqueue code really only wants one task using the CPU at any given time. Code which requires that workqueue tasks be executed in the order in which they are submitted can use a WQ_UNBOUND workqueue with max_active set to one.
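Putting the pieces together, a minimal module-style sketch of an ordered queue might look like the following; the names my_wq, my_work_fn, and "my_wq" are hypothetical, and error handling is kept to the bare minimum:

```c
#include <linux/module.h>
#include <linux/workqueue.h>

static void my_work_fn(struct work_struct *work)
{
	/* ... do the deferred work here ... */
}

static DECLARE_WORK(my_work, my_work_fn);

static struct workqueue_struct *my_wq;

static int __init my_init(void)
{
	/* Unbound, with at most one work item active at a time, so
	 * items execute strictly in submission order. */
	my_wq = alloc_workqueue("my_wq", WQ_UNBOUND, 1);
	if (!my_wq)
		return -ENOMEM;

	queue_work(my_wq, &my_work);
	return 0;
}

static void __exit my_exit(void)
{
	flush_workqueue(my_wq);
	destroy_workqueue(my_wq);
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");
```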
(Incidentally, much of the above was cribbed from Tejun Heo's in-progress document on workqueue usage).
The long-term plan, it seems, is to convert all create_workqueue() users over to an appropriate alloc_workqueue() call; eventually create_workqueue() will be removed. That task may take a little while, though; a quick grep turns up nearly 300 call sites.
An even longer-term plan is to merge a number of other kernel threads into the new workqueue mechanism. For example, the block layer maintains a set of threads with names like flush-8:0 and bdi-default; they are charged with getting data written out to block devices. Tejun recently posted a patch to replace those threads with workqueues. This patch has made some developers a little nervous - problems with writeback could create no end of trouble when the system is under memory pressure. So it may be slow to get into the mainline, but it will probably get there eventually unless regressions turn up.
After that, there is no end of special-purpose kernel threads elsewhere in the system. Not all of them will be amenable to conversion to workqueues, but quite a few of them should be. Over time, that should translate to less system resource use, cleaner "ps" output, and a better-running system.
Working on workqueues
Posted Sep 9, 2010 16:27 UTC (Thu) by marcH (subscriber, #57642) [Link]
Naive question: isn't this change going to make it more difficult to see what kernel threads are busy doing?
Working on workqueues
Posted Sep 11, 2010 1:31 UTC (Sat) by dmag (guest, #17775) [Link]
Yes, but you only got to see those threads that were spun off as dedicated threads. It's a case of "just because it's easy to measure doesn't mean it's the whole picture."
Copyright © 2010, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds