Kernel latency hits a new low
[Posted January 8, 2003 by corbet]
Tucked away in Andrew Morton's 2.5.54-mm3 patchset is a new bit of work
aimed at reducing the latency of the Linux
kernel. Latency, from the point of view of this work, is the time lag
between when a high-priority process becomes runnable and when it actually
gets the processor. Scheduling latency is important in a number of
contexts, and it can be especially important for desktop users. When you
move your mouse, it is nice not to have to wait for the pointer on the
screen to catch up. Low latency is crucial for certain
applications, including streaming media recording and playback, CD
recording, data acquisition, and so on. If the system is not sufficiently
responsive, these applications just do not work at all.
The last source of long delays in the kernel, says Andrew, is in the page
table teardown code. This delay is easily seen - simply shut down a large
application (Mozilla or OpenOffice will do nicely) and try to get anything
else done while the cleanup is happening. This delay happens because the
teardown code holds the process's page_table_lock for the entire
cleanup task. If the process is large, the cleanup can take a long time.
Since the kernel is holding a spinlock, it cannot be dislodged from the
processor even if the kernel is compiled for preemption. So anything else
that wants to run has to wait until the whole cleanup job is done.
The solution is to create a new "uber-zapper" function
(unmap_vmas()) which handles the page table cleanup task. The
page range to be torn down is split into smaller chunks (between 256 and
2048 pages, depending on the architecture and kernel config options);
between chunks, the lock is dropped and the processor rescheduled if
necessary. When the high-priority task has finished doing its thing, the
lock is reacquired and the next block of pages is freed. Along with
reducing latency, the patch has the additional advantage of cleaning up the
separate unmapping code which was duplicated in three different places.
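The chunking scheme described above can be illustrated with a small user-space sketch. This is not the kernel's unmap_vmas() code: the struct fake_mm type, its fields, and the chunked_teardown() function are hypothetical names invented for this example, a pthread mutex stands in for page_table_lock, and sched_yield() stands in for the kernel's "reschedule if necessary" step (here the yield is unconditional). The only point is the shape of the loop: free a bounded number of pages, drop the lock, let something else run, and retake the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for a process address space (not a kernel type). */
struct fake_mm {
    pthread_mutex_t page_table_lock; /* plays the role of page_table_lock */
    size_t pages_mapped;             /* pages still waiting to be torn down */
    size_t lock_drops;               /* times the lock was released mid-teardown */
    size_t longest_hold;             /* most pages freed under a single lock hold */
};

static void fake_mm_init(struct fake_mm *mm, size_t pages)
{
    pthread_mutex_init(&mm->page_table_lock, NULL);
    mm->pages_mapped = pages;
    mm->lock_drops = 0;
    mm->longest_hold = 0;
}

/*
 * Tear down every page, but never "free" more than `chunk` pages
 * while holding the lock. Returns the total number of pages freed.
 */
static size_t chunked_teardown(struct fake_mm *mm, size_t chunk)
{
    size_t freed = 0;

    assert(chunk > 0);

    pthread_mutex_lock(&mm->page_table_lock);
    while (mm->pages_mapped > 0) {
        /* Take one chunk, or whatever is left if that is smaller. */
        size_t n = mm->pages_mapped < chunk ? mm->pages_mapped : chunk;

        mm->pages_mapped -= n;      /* simulated page freeing, done under the lock */
        freed += n;
        if (n > mm->longest_hold)
            mm->longest_hold = n;

        if (mm->pages_mapped > 0) {
            /* Between chunks: drop the lock so other work can get in. */
            mm->lock_drops++;
            pthread_mutex_unlock(&mm->page_table_lock);
            sched_yield();          /* stand-in for "reschedule if necessary" */
            pthread_mutex_lock(&mm->page_table_lock);
        }
    }
    pthread_mutex_unlock(&mm->page_table_lock);

    return freed;
}
```

Calling fake_mm_init() with 10000 pages and then chunked_teardown() with a chunk size of 256 frees the pages in 40 passes and releases the lock 39 times along the way. The chunk size is the knob that bounds how long any one lock hold can last: a smaller chunk means shorter holds but more lock traffic, which is presumably why the real patch picks different sizes for different configurations.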
The result, it is claimed, is a worst-case scheduling latency of 500
microseconds on a 500 MHz Pentium processor, at least if you are
using the ext2 filesystem and are not mounting or unmounting
filesystems. That should be fast enough for most users.