The scheduler and hyperthreading
[Posted August 28, 2002 by corbet]
Hyperthreading is a hardware technique where a single CPU behaves as if it
were multiple (usually two) virtual processors. When one virtual processor
stalls (on a cache miss, for example), the other runs. Hyperthreading can
yield significant performance improvements (numbers of around 30% have been
floated) for a very small silicon investment. And the software side is
free: a hyperthreaded processor is almost indistinguishable from a pair of
real, physical processors, and the current Linux (or whatever) SMP code
works.
However, a scheduler which handles SMP, but which is unaware of
hyperthreading, will not obtain optimal performance. If you have two
processes running on two virtual processors on the same physical CPU, they
will be contending with each other in a way that processes on separate CPUs
will not. A naive scheduler, such as the one currently found in the Linux
kernel, does not understand the difference between the two situations, and
will thus make wrong decisions.
Ingo Molnar has posted some scenarios where
the current scheduler gets things wrong, along with, of course, a patch
that makes everything right. Consider a system with two physical CPUs,
each of which provides two virtual processors. If there are two running
tasks, the current scheduler would happily let them both run on a single
physical processor, even though far better performance would result from
migrating one process to the other physical CPU. The scheduler also
doesn't understand that migrating a process from one virtual processor to
its sibling is cheaper (due to cache loading) than migrating it across
physical processors.
The solution is to change the way the run queues work. The 2.5 scheduler
maintains one run queue per processor, and attempts to avoid moving tasks
between queues. The change is to have one run queue per physical
processor which is able to feed tasks into all of the virtual processors.
Throw in a smarter sense of what makes an idle CPU (all virtual processors
must be idle), and the resulting code "magically fulfills" the needs of
scheduling on a hyperthreading system. The actual patch involves a bunch
of tricky details, of course, but the end result is that a relatively
simple idea yields a 10% or greater performance improvement.
(
Log in to post comments)