Making threads die quickly
The cost of killing a process, it turns out, is proportional to the total number of processes running. In situations where thousands of tasks are running (and, remember, some threaded applications run thousands of threads) the exit() call can become truly expensive.
Why is this happening? When a process exits, the kernel must "reparent" all of its children to keep the process hierarchy consistent. This should be a straightforward job, since each process keeps a list of its children in the task_struct structure. Unfortunately, due to some weirdness in how the ptrace() system call is handled, that list is not sufficient. ptrace(), it seems, rearranges the process tree so that the process being traced becomes a child of the process doing the tracing. To find processes which have been temporarily relocated to a "foster parent," the exit() system call must iterate over all processes in the system. And that, of course, is where the scalability problems come in.
Ingo's solution is simply to maintain a separate list of all processes
which are being debugged with ptrace() at any given time. That
list will generally be quite short. When a process exits, it is now
necessary to look at its list of children and the ptrace list, but
at no other processes. No more scalability problems.
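The idea can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch itself: the struct task layout, the traced_list variable and the toy_fork(), toy_ptrace_attach() and toy_exit() helpers are all hypothetical names invented for this example. It also assumes the exiting task is not itself tracing anyone, and it leaves out unlinking the exiting task from its own parent.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's task_struct; every field name is hypothetical. */
struct task {
    int pid;
    struct task *parent;       /* current parent (the tracer while traced) */
    struct task *real_parent;  /* original parent, unchanged by tracing */
    struct task *children;     /* head of this task's child list */
    struct task *sibling;      /* next entry in the parent's child list */
    struct task *next_traced;  /* next entry in the global ptrace list */
};

/* Every task currently being traced; expected to stay short. */
static struct task *traced_list;

static void add_child(struct task *parent, struct task *child)
{
    child->parent = parent;
    child->sibling = parent->children;
    parent->children = child;
}

static void remove_child(struct task *child)
{
    struct task **link = &child->parent->children;

    while (*link != child)
        link = &(*link)->sibling;
    *link = child->sibling;
    child->sibling = NULL;
}

/* Record the parent/child relationship, as fork() would. */
void toy_fork(struct task *parent, struct task *child)
{
    child->real_parent = parent;
    add_child(parent, child);
}

/* Attach: move the tracee under the tracer and put it on the ptrace list. */
void toy_ptrace_attach(struct task *tracer, struct task *tracee)
{
    remove_child(tracee);
    add_child(tracer, tracee);
    tracee->next_traced = traced_list;
    traced_list = tracee;
}

/*
 * Exit: look only at the dying task's own child list and at the ptrace
 * list, never at every task in the system. Returns the number of tasks
 * examined.
 */
int toy_exit(struct task *dying, struct task *reaper)
{
    int visited = 0;
    struct task *t;

    /* Children still on our own list are handed to the reaper. */
    while ((t = dying->children) != NULL) {
        visited++;
        assert(t->real_parent == dying);  /* sketch assumes dying is no tracer */
        remove_child(t);
        t->real_parent = reaper;
        add_child(reaper, t);
    }

    /* Children relocated to a "foster parent" get a new real parent. */
    for (t = traced_list; t != NULL; t = t->next_traced) {
        visited++;
        if (t->real_parent == dying)
            t->real_parent = reaper;
    }
    return visited;
}
```

In this model the work done by toy_exit() depends on the number of children plus the number of traced tasks, however many unrelated tasks exist.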
Posted Aug 22, 2002 9:35 UTC (Thu)
by jcownie (guest, #3374)
[Link] (1 responses)
Of course the real problem here is in the ptrace implementation forcing the re-parenting of processes. That is just plain wrong, and prevents some codes from working as expected when being debugged.
For instance consider a code which forks and whose child is debugged. When the child process exits the debugger receives the SIGCHLD, not the parent, and the parent therefore behaves differently than it would if the child process were not being debugged.
Reparenting debugged processes is a "neat hack" to make the ptrace implementation easier, but it's not really the right solution. (And AFAIK no other unix systems behave this way).
As someone who works on debuggers, I can tell you that we've had complaints that our product doesn't work right which are a direct result of this ptrace hack (and which we therefore can't fix in our debugger :-( ).
Posted Aug 22, 2002 14:42 UTC (Thu)
by pflugstad (subscriber, #224)
[Link]
The Linux developers know it's a problem. If you read about this discussion here:
http://kerneltrap.com/node.php?id=384
Near the end Linus says exactly the same thing you do - the ptrace mechanism sucks.
However, at the end, Linus says:
> Ok, you've convinced me. The reparenting is fairly ugly, but it sounds
> like other implementations would be fairly equivalent and it would be
> mainly an issue of just which list we'd work on.
>
> Linus
You can read the whole thread by following the link on the above page.
Pete