A set of patches has been making the rounds for the last month or so which
implements a concept known as a "cpuset." A cpuset is simply an arbitrary
collection of processors in an SMP system; cpusets can be used to partition
a large system into smaller virtual machines in a flexible sort of way.
This patch was originally posted by Simon
Derr; more recent versions (found in the "patches" section, below) have
been sent out by Stephen Hemminger at OSDL.
Internally, the patch creates a hierarchy of cpusets. At boot time, the
root set is created containing all of the system's processors. System
calls can then be used to create child sets. The creation of a cpuset is not a
privileged task, but no process can expand beyond the set of processors
initially assigned to it. Thus, for example, the system administrator can
create a cpuset for a particular group of processes which will be confined
to the designated processors. Those processes can, however, further
partition the set for their own purposes.
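
The hierarchy rule described above can be sketched as a small user-space model. This is only an illustration and not code from the patch: the `Cpuset` class and its `create_child()` and `set_cpus()` methods are hypothetical names, and the model ignores what should happen to children when a parent shrinks.

```python
class Cpuset:
    """Toy model of one node in the cpuset hierarchy (hypothetical, not kernel code)."""

    def __init__(self, cpus, parent=None):
        self.cpus = frozenset(cpus)
        self.parent = parent
        self.children = []

    def create_child(self):
        # A new child starts out with the same processors as its parent.
        child = Cpuset(self.cpus, parent=self)
        self.children.append(child)
        return child

    def set_cpus(self, cpus):
        cpus = frozenset(cpus)
        # No cpuset may expand beyond the processors held by its parent.
        if self.parent is not None and not cpus <= self.parent.cpus:
            raise PermissionError("cpuset may not grow beyond its parent")
        self.cpus = cpus


root = Cpuset(range(8))        # root set: every processor (8 in this example)
admin = root.create_child()
admin.set_cpus({0, 1, 2, 3})   # the administrator confines a group of processes
job = admin.create_child()
job.set_cpus({2, 3})           # those processes partition their own set further
```

In this model, an attempt such as `job.set_cpus({2, 7})` raises `PermissionError`, since processor 7 is outside the set the administrator assigned.
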
In normal use, one would expect cpusets to correspond to the underlying
hardware; all processors in a set would normally be part of the same NUMA
node, for example. There is nothing in the patch that requires users to do
things that way, however; cpusets can be any arbitrary subset of the
available processors. Processors can also belong to multiple cpusets, so
cpusets can overlap each other in arbitrary ways. There is, however, a
"strict" flag which can be set to disallow this sharing of processors.
There are a few new system calls created by this patch:
- Creates a new cpuset as a child of the process's current cpuset,
containing the same processors as the parent.
- Destroys the given cpuset.
- Attaches a process to a particular cpuset.
- Changes the set of processors belonging to a cpuset. The name of this
call is a little misleading, since it can release processors from a
cpuset. In fact, removing CPUs will be the normal usage, since a
cpuset cannot contain processors which are not also contained in its parent.
- Returns a list of processors which are not part of the current cpuset,
but which could be added.
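
To see why removing CPUs is the normal case, here is a minimal sketch of the last two operations using plain Python sets. The function names `resize()` and `addable()` are invented for this example and are not the names of the patch's system calls; the sketch also assumes that the processors which "could be added" are simply those held by the parent but not by the current set.

```python
def resize(parent_cpus, wanted):
    """Return the new processor set, refusing anything the parent does not hold."""
    wanted = set(wanted)
    extra = wanted - set(parent_cpus)
    if extra:
        raise ValueError(f"not in parent cpuset: {sorted(extra)}")
    return wanted


def addable(parent_cpus, current_cpus):
    """Processors outside the current set that could still be added (assumed rule)."""
    return sorted(set(parent_cpus) - set(current_cpus))


parent = set(range(8))
child = set(parent)              # a new child starts out identical to its parent
child = resize(parent, {0, 1})   # so the usual first change is to give CPUs back
print(addable(parent, child))    # [2, 3, 4, 5, 6, 7]
```
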
Processes running within a cpuset have no view of the processors which are
not contained within that set. Processors in a cpuset are renumbered to
appear to be the only processors on the system; thus, for example, system
calls like sched_setaffinity() can only bind processes to processors within
their particular cpuset.
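
The renumbering can be pictured as a translation from the processor numbers a process sees to the physical processors behind them. The sketch below is a guess at the idea rather than the patch's implementation: `to_physical()` is a hypothetical helper, and it assumes that the visible numbers are assigned in ascending order of physical processor number.

```python
def to_physical(cpuset_cpus, virtual_cpus):
    """Translate CPU numbers as seen inside a cpuset into physical CPU numbers."""
    ordered = sorted(cpuset_cpus)  # assumption: virtual CPU i is the i-th physical CPU
    physical = set()
    for v in virtual_cpus:
        if not 0 <= v < len(ordered):
            # Processors outside the cpuset are simply not visible.
            raise ValueError(f"CPU {v} does not exist in this cpuset")
        physical.add(ordered[v])
    return physical


cpuset = {4, 5, 6, 7}                       # a cpuset covering one four-CPU node
print(sorted(to_physical(cpuset, {0, 1})))  # [4, 5]
```

A process in this cpuset that asks to be bound to "CPUs 0 and 1" would thus end up on physical processors 4 and 5, and a request for "CPU 4" fails because the cpuset appears to have only four processors.
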
This patch has generated a certain amount of interest in the large-systems
community. It clearly does not fall within the 2.6.0-test "stability
patches only" mandate, but there may be pressure to get it into the kernel
not long after 2.6.0 is released.