Ingo Molnar's work to improve the kernel's support of threads was covered
here last week. This week, Ingo has moved on
to the final part of a thread's life cycle: the exit() call. It
turns out that the Linux exit() implementation has some real
scalability problems, which are described and fixed in this patch.
The cost of killing a process is proportional to the total number of
processes in the system. In situations where thousands of tasks are
running (and, remember, some threaded applications run thousands of
threads) the exit() call can become truly expensive.
Why is this happening? When a process exits, the kernel must "reparent"
all of its children to keep the process hierarchy consistent. This should
be a straightforward job, since each process keeps a list of its children
in the task_struct structure. Unfortunately, due to some
weirdness in how the ptrace() system call is handled, that list is
not sufficient. ptrace(), it seems, rearranges the process tree
so that the process being traced becomes a child of the process doing the
tracing. To find processes which have been temporarily relocated to a
"foster parent," the exit() system call must iterate over all
processes in the system. And that, of course, is where the scalability
problems come in.
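
The cost of that full scan can be illustrated with a toy model. This is a minimal sketch, not kernel code: the flat_task structure, its parent and real_parent fields, and the flat_exit_scan_all() function are all hypothetical names invented here, and the "reaper" simply stands in for whichever process adopts the orphans.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a task structure. */
struct flat_task {
    int pid;
    struct flat_task *parent;       /* current parent (the tracer while traced) */
    struct flat_task *real_parent;  /* original parent, unchanged by tracing */
};

/*
 * Toy model of an exit path that must scan every task in the system.
 * Any task whose original or current parent is `dying` is fixed up.
 * Returns the number of tasks examined, which is always `ntasks`.
 */
size_t flat_exit_scan_all(struct flat_task *tasks, size_t ntasks,
                          struct flat_task *dying, struct flat_task *reaper)
{
    size_t visited = 0;

    assert(dying != reaper);

    for (size_t i = 0; i < ntasks; i++) {
        struct flat_task *t = &tasks[i];

        visited++;
        if (t == dying)
            continue;
        /* A child of the dying task, traced or not, gets a new original parent. */
        if (t->real_parent == dying)
            t->real_parent = reaper;
        /* Anything hanging directly off the dying task goes to its original parent. */
        if (t->parent == dying)
            t->parent = t->real_parent;
    }
    return visited;
}
```

In this model the work done for one exit grows with the size of the whole task array, no matter how few children the exiting task actually has.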
Ingo's solution is simply to maintain a separate list of all processes
which are being debugged with ptrace() at any given time. That
list will generally be quite short. When a process exits, it is now
necessary to look at its list of children and the ptrace list, but
at no other processes. No more scalability problems.
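
The idea of a separate ptrace list can be sketched in the same toy style. Again, this is an illustration under stated assumptions rather than the code from the patch: toy_task, toy_ptrace_list, toy_spawn(), toy_ptrace_attach() and toy_exit() are hypothetical names, the lists are simple singly linked ones, and unlinking the exiting task from its own parent's list is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a task structure. */
struct toy_task {
    int pid;
    struct toy_task *parent;        /* current parent (the tracer while traced) */
    struct toy_task *real_parent;   /* original parent, unchanged by tracing */
    struct toy_task *children;      /* head of the list of tasks parented here */
    struct toy_task *next_sibling;  /* link within the parent's children list */
    struct toy_task *next_traced;   /* link within the global ptrace list */
};

/* Global list of all tasks currently being traced (assumed to stay short). */
static struct toy_task *toy_ptrace_list;

static void toy_add_child(struct toy_task *parent, struct toy_task *child)
{
    child->parent = parent;
    child->next_sibling = parent->children;
    parent->children = child;
}

static void toy_remove_child(struct toy_task *parent, struct toy_task *child)
{
    struct toy_task **link = &parent->children;

    while (*link && *link != child)
        link = &(*link)->next_sibling;
    if (*link)
        *link = child->next_sibling;
    child->next_sibling = NULL;
}

/* Initialize `child` and hang it off `parent` (NULL for the first task). */
void toy_spawn(struct toy_task *parent, struct toy_task *child, int pid)
{
    *child = (struct toy_task){ .pid = pid, .real_parent = parent };
    if (parent)
        toy_add_child(parent, child);
}

/* Move the tracee under the tracer and remember it on the ptrace list. */
void toy_ptrace_attach(struct toy_task *tracer, struct toy_task *tracee)
{
    assert(tracer != tracee);
    toy_remove_child(tracee->parent, tracee);
    toy_add_child(tracer, tracee);
    tracee->next_traced = toy_ptrace_list;
    toy_ptrace_list = tracee;
}

/* Returns the number of tasks examined: traced tasks plus direct children. */
size_t toy_exit(struct toy_task *dying, struct toy_task *reaper)
{
    struct toy_task **link = &toy_ptrace_list;
    size_t visited = 0;

    assert(dying != reaper);

    /* Step 1: traced tasks may be "fostered" children of the dying task. */
    while (*link) {
        struct toy_task *t = *link;

        visited++;
        if (t->real_parent == dying)
            t->real_parent = reaper;
        if (t == dying || t->parent == dying) {
            *link = t->next_traced;   /* tracing ends for this task */
            t->next_traced = NULL;
        } else {
            link = &t->next_traced;
        }
    }

    /* Step 2: hand each direct child to its (possibly new) original parent. */
    while (dying->children) {
        struct toy_task *c = dying->children;

        dying->children = c->next_sibling;
        visited++;
        if (c->real_parent == dying)
            c->real_parent = reaper;
        toy_add_child(c->real_parent, c);
    }
    return visited;
}
```

In this sketch the number of tasks examined depends only on the length of the ptrace list and the exiting task's own children list, so adding unrelated tasks to the system does not change the cost of toy_exit().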