Fighting fork bombs
The problem with fork bombs is that they are moving targets; by the time a system administrator notices a rapidly-forking process, it may have created vast numbers of children and exited. Killing processes individually in a fork bomb situation is not really an option; even a program written especially for this task can be hard put to keep up with the stream of new processes. There is just no way to get a handle on the entire tree of offending processes from user space. So it is not surprising that the best response in this situation can be to hit the Big Red Button and start over. Even if, as in Kamezawa-san's case, hitting the button involves walking to another building where the afflicted system is housed.
Indeed, it can be hard to get a handle on this tree from kernel space as well. The process tree only exists, as such, as long as the parent processes remain alive; once a process exits, all of its children are reparented to the init process. That causes a flattening of the tree structure and makes it hard to identify all of the processes involved in the attack. So Kamezawa-san's patch starts with the addition of a new process tracking structure. It is organized as a simple tree reflecting the actual family structure of the processes on the system. It differs from existing data structures, though, in that this "history tree" persists even when some processes exit. That allows the kernel to view the entire tree of processes involved in a fork bomb even if those which launched the attack have long since gone away.
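The idea of a tree that survives process exit can be illustrated with a small user-space model. This is only a sketch under assumptions: the names `HistoryNode`, `HistoryTree`, `record_fork`, `record_exit`, and `live_descendants` are hypothetical and do not come from the patch, which implements its tracking inside the kernel.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class HistoryNode:
    """One tracked process; hypothetical structure, not the patch's own."""
    pid: int
    parent: Optional["HistoryNode"] = field(default=None, repr=False)
    children: List["HistoryNode"] = field(default_factory=list, repr=False)
    alive: bool = True


class HistoryTree:
    """Family tree that keeps a node in place after its process exits."""

    def __init__(self, init_pid: int = 1) -> None:
        self.root = HistoryNode(pid=init_pid)
        self.nodes: Dict[int, HistoryNode] = {init_pid: self.root}

    def record_fork(self, parent_pid: int, child_pid: int) -> HistoryNode:
        parent = self.nodes[parent_pid]
        child = HistoryNode(pid=child_pid, parent=parent)
        parent.children.append(child)
        self.nodes[child_pid] = child
        return child

    def record_exit(self, pid: int) -> None:
        # Unlike the real process tree, nothing is reparented to init:
        # the dead node stays where it was so its subtree remains visible.
        self.nodes[pid].alive = False

    def live_descendants(self, pid: int) -> List[int]:
        """Return the PIDs of all live processes below pid, sorted."""
        live = []
        stack = list(self.nodes[pid].children)
        while stack:
            node = stack.pop()
            if node.alive:
                live.append(node.pid)
            stack.extend(node.children)
        return sorted(live)
```

In this model, if process 100 forks 101 and 102, process 101 forks 103, and then both 100 and 101 exit, `live_descendants(100)` still reports 102 and 103, even though the real process tree would show both as children of init.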
Keeping the entire history of all processes created over the lifetime of a Linux system would be a costly endeavor. Clearly, there comes a point where history needs to be discarded. Every so often (30 seconds by default), the kernel will try to determine whether there might possibly be a fork bomb attack in progress; if no signs of an attack are detected, any tracking history which has existed for more than 30 seconds will be deleted.
How does the kernel decide whether it might be under attack? The way fork bombs incapacitate a system is usually through memory exhaustion, so the code looks for signs of memory stress: in particular, it looks to see if there have been any memory allocation stalls or kswapd runs since the last check. It also looks at whether the total number of processes on the system has increased. If none of those checks shows any reason for concern, the older history data will be removed from the system. If, instead, memory allocations are getting harder to come by or the number of processes is growing, the tracking structure will be kept around.
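The check described above can be sketched as a comparison between two snapshots of system counters. This is a minimal illustration, not the patch's code: the `Snapshot` fields, the function names, and the representation of history as a dictionary mapping PID to record creation time are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Snapshot:
    """Counters sampled at one periodic check (hypothetical field names)."""
    alloc_stalls: int   # memory allocation stalls seen so far
    kswapd_runs: int    # number of times kswapd has run so far
    nr_processes: int   # total processes on the system


def may_be_under_attack(prev: Snapshot, cur: Snapshot) -> bool:
    """True if any of the three signs of trouble appeared since prev."""
    return (
        cur.alloc_stalls > prev.alloc_stalls
        or cur.kswapd_runs > prev.kswapd_runs
        or cur.nr_processes > prev.nr_processes
    )


def prune_history(
    history: Dict[int, float],
    now: float,
    prev: Snapshot,
    cur: Snapshot,
    max_age: float = 30.0,
) -> Dict[int, float]:
    """Drop records older than max_age seconds, unless trouble is suspected.

    history maps a PID to the time its tracking record was created.
    """
    if may_be_under_attack(prev, cur):
        return dict(history)  # keep everything while the system looks stressed
    return {pid: born for pid, born in history.items() if now - born <= max_age}
```

Note that a growing process count alone is enough to keep the history, so an ordinary burst of process creation would also delay cleanup in this model.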
If a fork bomb runs the system out of memory, the kernel's first response will be to fire up the out-of-memory (OOM) killer. Given time, the OOM killer might manage to clean up the mess, but the fact of the matter is that the OOM killer is designed around finding the one process which is creating the problem and killing it. The OOM killer cannot identify a whole tree of rapidly-forking processes and do away with all of them.
Enter the fork bomb killer, which is invoked by the OOM killer. The fork bomb killer will perform a depth-first traversal of the process history tree, filling in each node with information on the total number of processes below that node and the total memory used by those processes. At the end, the process with the highest score is examined; if there are at least ten processes in the history below the high scorer, it is deemed to be a fork bomb; that process and all of its descendants will be killed. Problem solved - hopefully.
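The traversal and the ten-process threshold can be sketched in a few lines. Several details here are assumptions rather than facts about the patch: the score is taken to be the memory used by a process plus everything below it, the tree root is excluded from the candidates because its subtree is the whole system, and the names `Node`, `fill_in`, and `find_fork_bomb` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

MIN_DESCENDANTS = 10  # the hardcoded threshold described in the article


@dataclass(eq=False)
class Node:
    pid: int
    mem: int = 0  # memory used by this process alone (arbitrary units)
    children: List["Node"] = field(default_factory=list, repr=False)
    nr_below: int = 0   # filled in by fill_in(): processes below this node
    mem_below: int = 0  # filled in by fill_in(): memory used below this node


def fill_in(node: Node) -> None:
    """Depth-first, post-order pass that totals each node's subtree."""
    node.nr_below = 0
    node.mem_below = 0
    for child in node.children:
        fill_in(child)
        node.nr_below += 1 + child.nr_below
        node.mem_below += child.mem + child.mem_below


def walk(node: Node) -> Iterator[Node]:
    """Yield node and all of its descendants, parents before children."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_fork_bomb(root: Node) -> List[int]:
    """Return the PIDs to kill, or an empty list if nothing qualifies."""
    fill_in(root)
    # Assumption: the root itself is never a candidate.
    candidates = [n for child in root.children for n in walk(child)]
    if not candidates:
        return []
    # Assumption: score = own memory + memory of all descendants.
    worst = max(candidates, key=lambda n: n.mem + n.mem_below)
    if worst.nr_below < MIN_DESCENDANTS:
        return []
    return [n.pid for n in walk(worst)]
```

With this scoring, a process that has twelve memory-hungry children is selected along with all of them, while the same process with only three children is left alone because it falls under the threshold.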
There are a couple of control knobs which have been placed under /sys/kernel/mm/oom. History tracking will only be performed if mm_tracking_enabled is set to "enabled" (which is the default setting). The value in mm_tracking_reset_interval_msecs controls how often the process tracking tree is cleaned up; the default value is 30,000 milliseconds. A possibly surprising omission is the lack of a knob controlling how many descendants a process must have before it is declared to be a fork bomb; the hardcoded value of ten seems low.
The reception for this patch has not been entirely favorable; commenters worry about the runtime cost of maintaining the tracking structure and suggest that user-space solutions may be better. Kamezawa-san seems resigned that the patch may not go in, saying "To go to other buildings to press reset-button is good for my health." Other administrators, who may not be within easy walking distance of their systems, may feel their health is better served by some extra fork bomb protection, though.
Index entries for this article | |
---|---|
Kernel | Fork bombs |
Kernel | OOM killer |
Kernel | Security/Security technologies |
Posted Mar 31, 2011 2:44 UTC (Thu)
by jengelh (guest, #33263)
[Link]
Eh.. this sounds very much like a case that cgroups can handle. systemd is said to use them already to kill all processes spawned from a master even if the children have detached and reparented (think sshd).
Given that, the oom-killer may be tuned to group killable targets by cgroup rather than just tgid/tid.
Posted Mar 31, 2011 7:43 UTC (Thu)
by alonz (subscriber, #815)
[Link] (1 responses)
For example: it is easy to develop a “creeping” fork-bomb that will just wait 30s (or even 1m, or 5m) between spawning successive generations of children. When this bomb begins to make its impact, it will already have tens (or hundreds, or thousands) of children, and the history will be long gone.
Posted Mar 31, 2011 11:10 UTC (Thu)
by dholland (subscriber, #14680)
[Link]
(and something about not letting "perfect" be the enemy of "good"?)
Posted Mar 31, 2011 14:21 UTC (Thu)
by cesarb (subscriber, #6266)
[Link] (9 responses)
I am failing to see why. You only need to keep the family tree of live processes (thus, branches with only dead leaves can be pruned). You do not need to keep all the inner nodes too; if you have a dead inner node with a single dead child, you can collapse both into a single dead inner node (how many intermediate dead nodes you had does not matter, and even if it did they could be replaced by a counter in the collapsed node). Unless I am visualizing it incorrectly, the worst case then is a binary tree with all the live nodes being the leaves, and so it has a bounded size (which is not that large).
Posted Mar 31, 2011 14:54 UTC (Thu)
by Seegras (guest, #20463)
[Link] (8 responses)
You will still know which process spawned what "inetd", even if the parent is long gone from memory or even disk.
Definitely worth some consideration.
Posted Mar 31, 2011 21:04 UTC (Thu)
by dafid_b (guest, #67424)
[Link] (7 responses)
Background
There are a couple of use-cases I think the above tool could help with
2)
Posted Apr 3, 2011 1:58 UTC (Sun)
by giraffedata (guest, #1954)
[Link] (6 responses)
I have long been frustrated by the Unix concept of orphan processes, for all the reasons mentioned here.
If I were redesigning Unix, I would just say that a process cannot exit as long as it has children, and there would be two forms of exit(): kill all my children and exit, and exit as soon as my children are all gone. And when a signal kills a process, it kills all its children as well.
Furthermore, rlimits would be extended to cover all of a process' descendants as well, and be refreshable over time. Goodbye, fork bomb.
There are probably applications somewhere that create a neverending chain of forks, but I don't know how important that is.
Posted Apr 3, 2011 2:52 UTC (Sun)
by vonbrand (subscriber, #4458)
[Link] (5 responses)
Keeping processes around just because some descendent is still running is a waste of resources.
Posted Apr 3, 2011 19:06 UTC (Sun)
by giraffedata (guest, #1954)
[Link] (2 responses)
Seems like a pretty good return on investment for me. Maybe 50 cents worth of memory (system-wide) to be able to avoid system failures due to runaway resource usage and always be able to know where processes came from. It's about the same tradeoff as keeping a process around just because its parent hasn't yet looked at its termination status, which Unix has always done.
A process that no longer has to execute shouldn't use an appreciable amount of resource.
Posted Apr 7, 2011 9:24 UTC (Thu)
by renox (guest, #23785)
[Link] (1 responses)
Posted Apr 7, 2011 15:16 UTC (Thu)
by giraffedata (guest, #1954)
[Link]
I don't think "whole process" implies the program memory and I agree - if I were implementing this, I would have exit() free all the resources the process holds that aren't needed after the program is done running, as Linux does for zombie processes today. But like existing zombies, I would probably keep the whole task control block for simplicity.
Posted Apr 4, 2011 16:51 UTC (Mon)
by sorpigal (guest, #36106)
[Link] (1 responses)
Posted Apr 5, 2011 6:29 UTC (Tue)
by giraffedata (guest, #1954)
[Link]
This appears to be a rhetorical question, but I can't tell what the point is.
Posted Mar 31, 2011 23:31 UTC (Thu)
by mrons (subscriber, #1751)
[Link] (3 responses)
Sending a signal to the process group kills all fork bombs in my experience.
A signal to the process group also kills what we call "comets", a process that forks then exits. You can never catch a PID to kill the comet directly. They can even be hard to detect on a busy system. lastcomm process logs are often the only way to see one.
The other requirement is process limits on users. Fork bombs will make a system unusable if there are no limits.
I don't really see the need for this patch in the kernel. The current facilities of process groups and user process limits solve all the problems that I've seen.
Posted Apr 1, 2011 0:29 UTC (Fri)
by dtlin (subscriber, #36537)
[Link] (2 responses)
Posted Apr 1, 2011 0:57 UTC (Fri)
by mrons (subscriber, #1751)
[Link]
To kill a fork bomb that you can't send a kill(-pgid), you need to send
Many years ago we had a lot of fun here in a fork bomb arms race. That's where several forms of "comets" mentioned above were invented in an effort to find something that the sys admin (me) could not kill.
One neat way to kill a comet, is to create a fork bomb as the user of the comet! That will slow down the comet enough so you can STOP it. Then you kill the fork bomb in the usual way.
Posted Apr 4, 2011 15:35 UTC (Mon)
by jeremiah (subscriber, #1221)
[Link]
Posted May 31, 2011 13:15 UTC (Tue)
by mehuljv (guest, #52868)
[Link]
How this patch handles below scenario,
Process A starts and forks 9 children. Lets refer all these new children as GROUP-B. Now, Process A exits so that init becomes parent of all GROUP-B processes.
Now, consider if all GROUP-B processes wait for 1 minute so that history of their original parent - PROCESS-A gets cleared. After 1 minute each GROUP-B process does fork of 9 children. So in total GROUP-B will spawn 81 processes. Lets refer these 81 processes as GROUP-C.
Now if all processes in GROUP-B exits, init will become parent of all processes in GROUP-C. Again all GROUP-C processes will wait for 1 minute so that history of all GROUP-B processes gets cleared.. and fork again...
If above iterations continue then after a while there will be many processes waiting/forking/exiting to avoid oom and system is still under fork attack.
Can any one explain me what happens in above scenario ?
Mehul.
This mechanism appears to be very naive, and is easily bypassed.
Hold in this tree the reason the process was created...
eg
"login-shell" (init hard code)
"Firefox Web Browser" (menu entry text)
"print-spooler"
"Chrome - BBC News Home" (Window title)
I find myself uneasy when evaluating the safety of my system - the process list of 140 odd processes with perhaps 10 recognised, leaves me no wiser..
1)
Should I use the browser to transfer cash between bank accounts?
Or should I reboot first?
How can I become more confident of code running on my system?
Was that web-site really benign?
I allowed the site to run scripts in order to see content more clearly...
Has it created a process to execute in the background after I closed the frame?
Keeping processes around just because some descendent is still running is a waste of resources.
you're [suggesting] keeping the whole process until its children exits which can be expensive, maybe a middleground could be more useful ie keep only the 'identity' of the parent process and free the rest.
Isn't "disk/ram/cpu is cheap" typically the argument used to dismiss Unix design decisions based on efficiency?
A process can easily setsid() and make itself a session and process group leader, escaping kill(-pgid) in the same way that fork() escapes kill(pid). RLIMIT_NPROC/ulimit -u is good, though.
a STOP signal to each of the processes. The fork bomb won't grow past the users process limits and a STOPped process can't fork. So once all the processes are stopped you can KILL them.
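The stop-then-kill sequence can be sketched as below. This is an illustration under assumptions, not a tested administration tool: the function name `stop_then_kill` and its `send` parameter are hypothetical, and gathering the list of the user's PIDs is left to the caller. Only `os.kill`, `signal.SIGSTOP`, and `signal.SIGKILL` are real standard-library names.

```python
import os
import signal
from typing import Callable, Iterable, List


def stop_then_kill(
    pids: Iterable[int],
    send: Callable[[int, int], None] = os.kill,
) -> List[int]:
    """Freeze every process first, then kill the ones that were frozen.

    send defaults to os.kill; it is a parameter only so the logic can be
    exercised without signalling real processes.
    """
    stopped = []
    # Pass 1: a stopped process cannot fork, so freeze everything first.
    for pid in pids:
        try:
            send(pid, signal.SIGSTOP)
            stopped.append(pid)
        except ProcessLookupError:
            pass  # the process exited before it could be stopped
    # Pass 2: nothing in the list is running any more, so killing is safe.
    for pid in stopped:
        try:
            send(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass
    return stopped
```

In practice the first pass would probably need to be repeated over a fresh process listing until no running processes remain for that user, since new children can appear between the listing and the STOP.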
Consider history cleanup time is 30 seconds.