CPUS*PIDS = mess

By Jonathan Corbet
April 27, 2010
Mike Travis recently ran into a problem: if you have a system with a mere 2048 processors, there's only room for 16 processes on each CPU before the default 32K limit on process IDs is reached. Systems with lots of processors tend not to run large numbers of processes on each CPU, but 16 is still a bit tight - especially when one considers how many kernel threads run on each CPU. With 2K processors, the kernel threads alone may run the system out of process IDs; with 4K processors, the system will not even succeed in booting.
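The arithmetic behind those figures can be sketched as follows. This is only an illustration: the 32K limit is taken to be 32768, and the per-CPU kernel thread count passed to the second function is a hypothetical value, not a measured one.

```python
DEFAULT_PID_LIMIT = 32 * 1024  # the default 32K limit on process IDs


def pids_per_cpu(num_cpus, pid_limit=DEFAULT_PID_LIMIT):
    """Average number of process IDs available to each CPU."""
    return pid_limit // num_cpus


def ids_left_after_kthreads(num_cpus, kthreads_per_cpu,
                            pid_limit=DEFAULT_PID_LIMIT):
    """Process IDs remaining once every CPU has started its kernel threads.

    A negative result means the ID space ran out before the per-CPU
    threads were all created. kthreads_per_cpu is an assumed input.
    """
    return pid_limit - num_cpus * kthreads_per_cpu


if __name__ == "__main__":
    for cpus in (2048, 4096):
        print(cpus, "CPUs:", pids_per_cpu(cpus), "process IDs per CPU")
```

With 4096 processors the budget drops to eight IDs per CPU, so any configuration that starts more than eight kernel threads per CPU cannot fit within the default limit.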

The proposed solution was a new boot-time parameter allowing the specification of a larger maximum number of process IDs. That idea did not get very far, though; there is not much interest in adding more options just to enable the system to boot. The fact that concurrency-managed workqueues should eventually solve this problem (by getting rid of large numbers of workqueue threads) hasn't helped either; that makes the kernel option look like a temporary stopgap. But the workqueue changes are of limited help to people who are having this problem now; some form of the workqueue work will probably go in eventually, but it does not appear to be a fast process.

So there will most likely be a shorter-term fix merged. Instead of a kernel parameter, though, it will probably be some sort of heuristic which looks at the number of processors and ensures that a sufficient number of process IDs is available. If the default limit is too low, it will be raised automatically.
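One possible shape for such a heuristic is sketched below. The details are assumptions made for illustration: the per-CPU allowance of 1024 IDs, the hard ceiling of 4M IDs, and the function name are not taken from any posted patch.

```python
DEFAULT_PID_LIMIT = 32 * 1024      # the default 32K limit
ASSUMED_PIDS_PER_CPU = 1024        # hypothetical per-CPU allowance
ASSUMED_HARD_MAX = 4 * 1024 * 1024 # hypothetical ceiling on the limit


def scaled_pid_limit(num_cpus,
                     pids_per_cpu=ASSUMED_PIDS_PER_CPU,
                     default_limit=DEFAULT_PID_LIMIT,
                     hard_max=ASSUMED_HARD_MAX):
    """Raise the process ID limit when the CPU count calls for it.

    The default is never lowered; it is only raised, and never
    beyond hard_max.
    """
    wanted = num_cpus * pids_per_cpu
    return min(hard_max, max(default_limit, wanted))


if __name__ == "__main__":
    for cpus in (4, 2048, 4096):
        print(cpus, "CPUs ->", scaled_pid_limit(cpus), "process IDs")
```

Small systems keep the familiar 32K limit under this scheme, so only very large machines would ever see process IDs above 32767.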

There is one remaining concern: what about ancient applications which store process IDs in signed, 16-bit integers? Apparently such applications exist. It is less clear, though, that such applications exist on 4096-processor systems. So this fear is unlikely to hold up this change. By the time the rest of us get those shiny, new, 4096-core desktop systems, hopefully, any remaining broken applications will have long since been fixed.
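To see why such applications would break, here is a rough model of what happens when a process ID is stored in a signed, 16-bit integer. It assumes the two's-complement wraparound typical of common platforms; the function name is made up for this example.

```python
def as_signed_16bit(pid):
    """Model storing pid in a signed 16-bit integer (two's complement)."""
    return ((pid + 0x8000) & 0xFFFF) - 0x8000


if __name__ == "__main__":
    for pid in (1, 32767, 32768, 40000):
        print(pid, "is stored as", as_signed_16bit(pid))
```

Any ID up to 32767 survives intact, which is why the 32K default is safe for these programs; anything larger comes back as a negative number.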



CPUS*PIDS = mess

Posted Apr 29, 2010 13:39 UTC (Thu) by nix (subscriber, #2304) [Link]

I certainly hope this goes away eventually. In 2.6.32 ext4 gained an extra set of direct-IO threads; one per CPU per filesystem! I've got direct-IO turned on for the sake of one single-threaded program which touches one FS... but I'm paying for it in a couple of hundred kernel threads. *sigh*

CPUS*PIDS = mess

Posted Apr 29, 2010 13:49 UTC (Thu) by kov (subscriber, #7423) [Link]

Hopefully all of us C programmers will be making fat money fixing broken Enterprise systems when the 2048-core servers hit the server rooms, as with COBOL and y2k!

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds