Killing processes that don't want to die

May 28, 2018

This article was contributed by George Dunlap

Suppose you have a program running on your system that you don't quite trust. Maybe it's a program submitted by a student to an automated grading system. Or maybe it's a QEMU device model running in a Xen control domain ("domain 0" or "dom0"), and you want to make sure that even if an attacker from a rogue virtual machine manages to take over the QEMU process, they can't do any further harm. There are many things you might do to restrict its ability to cause mischief, but one thing in particular you probably want is the ability to reliably kill the process once you think it should be done. This turns out to be quite a bit trickier than you'd think.

Avoiding kill with fork

So here's our puzzle. Suppose we have a process that we've run with its own individual user ID (UID), which we want to kill. But the code in the process is currently controlled by an attacker who doesn't want it to be killed.

We obviously know the process ID (PID) of the initial process we forked, so we could just use the kill() system call:

    kill(pid, 9);

So how can an attacker avoid this? It turns out to be pretty simple:

    while(1) {
        if (fork())
            _exit(0);
    }

This simple snippet of code will repeatedly call fork(). As you probably know, fork() returns twice: once in the existing parent process (returning the PID of the newly-created child), and once in a newly-created child process (returning 0). In the loop above, the parent will always call _exit(), and the child will call fork() again. The result is that the program races through the process ID space as fast as the kernel will let it. ~~These types of programs are often called "fork bombs".~~ [The author disagrees with this characterization, which was added by an editor late in the publication process.]

I encourage you to run the above code snippet (preferably in a virtual machine), and see what it looks like. It's not even very noticeable. Running top shows a system load of about 50% (in my virtual machine anyway), but there's not obviously any particular process contributing to that load; everything is still responsive and functional. If you didn't know about it, you might never notice it was there.

Now try killing it. You can run killall to try to kill the process by name, but it will frequently fail with "no process killed"; even when it succeeds, it often turns out that you've killed the parent process after the fork() but before the _exit(), so the rogue forking process is still going strong. Even determining whether you've managed to kill the process or not is a challenge.

The basic problem here is a race condition. What killall does is:

1. Read the list of processes, looking for one with the specified name
2. Call kill(pid, sig) on each one found

In between 1 and each instance of 2, the kernel tasklist lock is released (since it has to return from the system call), giving the rogue process a chance to fork. Indeed, it has many chances; since the first step takes a non-negligible amount of time, by the time you manage to find the rogue process, it's likely already forked, and perhaps even exited.
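The two steps can be sketched roughly as follows. This is only an illustration, assuming a Linux-style /proc with a comm file per process; kill_by_name() is a hypothetical helper and not the actual killall source.

```c
#include <ctype.h>
#include <dirent.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: signal every process whose /proc/<pid>/comm equals
 * name. Returns the number of successful kill() calls, or -1 if /proc
 * cannot be opened. */
int kill_by_name(const char *name, int sig)
{
    DIR *proc = opendir("/proc");
    if (!proc)
        return -1;

    int killed = 0;
    struct dirent *de;
    while ((de = readdir(proc)) != NULL) {
        /* Step 1: read the process list, looking for a matching name. */
        if (!isdigit((unsigned char)de->d_name[0]))
            continue;

        char path[280], comm[64];
        snprintf(path, sizeof(path), "/proc/%s/comm", de->d_name);
        FILE *f = fopen(path, "r");
        if (!f)
            continue;                   /* the process is already gone */
        if (!fgets(comm, sizeof(comm), f))
            comm[0] = '\0';
        fclose(f);
        comm[strcspn(comm, "\n")] = '\0';
        if (strcmp(comm, name) != 0)
            continue;

        /* Race window: nothing stops the target from forking and exiting
         * between the read above and the kill() below. */

        /* Step 2: signal the PID that was found. */
        if (kill((pid_t)atoi(de->d_name), sig) == 0)
            killed++;
    }
    closedir(proc);
    return killed;
}
```

Every system call in the loop is a separate trip into the kernel, so a forking target gets a fresh chance to move to a new PID on each iteration.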

It's true, if we ran killall 1000 times, the rogue process would very likely end up dead; and if we ran ps 1000 times, and found no trace of the process, we might be pretty sure that it was gone. On the other hand, that assumes that the "race" is fair, and that the attacker hasn't discovered some way of making sure that the race ends up going their way. It would be best if we didn't rely on these sorts of probabilistic calculations to clean things up.

Better mousetraps?

One thing to do, of course, would be to try to prevent the process from executing fork() in the first place. This could be done on Linux using the seccomp() call; but it's Linux-specific. (Xen, for example, wants to be able to support NetBSD and FreeBSD control domains, so it can't rely on this for correctness.) Another would be to use the setrlimit() system call to set RLIMIT_NPROC to 0. This should, in theory, prevent the process from calling fork() (since by definition there would already be one process with its user ID running).
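A minimal sketch of the RLIMIT_NPROC approach, assuming it is called in the untrusted process after its UID has been set and before it runs any untrusted code. The name forbid_new_processes() is made up for illustration.

```c
#include <sys/resource.h>

/* Hypothetical helper: set both the soft and hard RLIMIT_NPROC to 0 so the
 * calling process (and anything it executes) cannot raise it again.
 * Returns 0 on success, -1 on failure, like setrlimit(). */
int forbid_new_processes(void)
{
    struct rlimit lim = { .rlim_cur = 0, .rlim_max = 0 };
    return setrlimit(RLIMIT_NPROC, &lim);
}
```

On Linux, getrlimit(2) documents that this limit is not enforced for processes holding CAP_SYS_ADMIN or CAP_SYS_RESOURCE, so the sketch is only meaningful once the process is running as the unprivileged target UID.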

But RLIMIT_NPROC has had its own set of issues in the past. Setting it to 0 would also break a lot of perfectly legitimate code. Surely there must be a way to kill a process in a way that it can't evade, without relying on being able to take away fork(). Looking more closely at the kill() man page, it turns out that the pid argument can be interpreted in four possible ways:

pid > 0: PID of a single process to kill
pid < -1: the negative of the ID of a process group (pgid) to kill
pid == 0: Kill every process in my current process group
pid == -1: Kill every process that I'm allowed to kill
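The four cases above can be written down as a small classifier. The enum and function names here are hypothetical, chosen only to mirror the list.

```c
#include <sys/types.h>

/* Hypothetical names for the four interpretations of kill()'s pid argument. */
enum kill_target {
    KILL_ONE_PROCESS,        /* pid > 0   */
    KILL_PROCESS_GROUP,      /* pid < -1  */
    KILL_OWN_GROUP,          /* pid == 0  */
    KILL_EVERYTHING_ALLOWED  /* pid == -1 */
};

/* Map a pid argument to the interpretation kill() will give it. */
enum kill_target classify_kill_pid(pid_t pid)
{
    if (pid > 0)
        return KILL_ONE_PROCESS;
    if (pid == 0)
        return KILL_OWN_GROUP;
    if (pid == -1)
        return KILL_EVERYTHING_ALLOWED;
    return KILL_PROCESS_GROUP;   /* pid < -1: the group ID is -pid */
}
```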

At first glance it seems like killing by pgid might do what we want. To run our untrusted process, set the pgid and the user ID; to kill it, we call kill(-pgid, 9).

Unfortunately, unlike the user ID, the pgid can be changed by unprivileged processes. So our attacker could simply run something like the following to avoid being killed in the same way:

    while(1) {
        if (fork())
            _exit(0);
        setpgid(0, 0);
    }

In this case, the child process changes its pgid to match its PID as soon as it forks, making kill(-pgid) as racy as kill(pid).

A better mousetrap: kill -1

What about the last one, "kill every process I'm allowed to kill"? Well, we obviously don't want to run that as root unless we want to nuke the entire system; we want to limit "all processes I'm allowed to kill" to the particular user ID we've given to the rogue process.

In general, processes are allowed to kill other processes with their own UID; so what about something like the following?

    setuid(uid);
    kill(-1, 9);

(Note that for simplicity error handling is omitted in these examples; but when playing with kill() you should certainly make sure that you did switch your UID.)
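One way to do that check is sketched below. switch_uid_checked() is a hypothetical helper; the point is only that kill(-1, 9) should never be reached unless both the real and effective UIDs are verifiably the intended one.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch to uid and confirm that the switch really
 * happened. Returns 0 only if both real and effective UID now equal uid. */
int switch_uid_checked(uid_t uid)
{
    if (setuid(uid) != 0)
        return -1;
    if (getuid() != uid || geteuid() != uid)
        return -1;
    return 0;
}
```

A caller would then only issue kill(-1, 9) when switch_uid_checked(uid) returns 0, and bail out otherwise.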

The kill() system call, when called with -1, will loop over the entire task list, attempting to send the signal to each process except the one making the system call. The tasklist lock is held for the entire loop, so the rogue process cannot complete a fork(); since the UIDs match, it will be killed.

Done, right? Not quite. If we simply call setuid(), then not only can we kill the rogue process, but the rogue process can also kill us:

    while(1) {
        if (fork())
            _exit(0);
        kill(-1, 9);
        setpgid(0, 0);
    }

If the rogue process manages to get its own kill(-1) in after we've called setuid() but before we've called kill() ourselves, we will be the ones to disappear. So to successfully kill the rogue process, we still need to win a race — something we'd rather not rely on.

A better mousetrap: exploiting asymmetry

If we want to reliably kill the other process without putting ourselves at risk of being killed, we must find an asymmetry that allows the "reaper" process to do so. If we look carefully at the kill() man page, we find:

For a process to have permission to send a signal, it must either be privileged (under Linux: have the CAP_KILL capability in the user namespace of the target process), or the real or effective user ID of the sending process must equal the real or saved set-user-ID of the target process.

So there is an asymmetry. Each process has an effective UID (euid), real UID (ruid), and saved UID (suid). For process A to kill process B, A's ruid or euid must match one of B's ruid or suid.

When we started our target process, we set all of its UIDs to a specific value (target_uid). Can we construct a <euid, ruid, suid> tuple for our "reaper" process to use that will allow it to kill the rogue process, and no other processes, but not be able to be killed by the rogue process?

It turns out that we can. If we create a new reaper_uid, and set its <euid, ruid, suid> to <target_uid, reaper_uid, X> (where X can be anything as long as it's not target_uid), then:

The reaper process can kill the target process, since its effective UID is equal to the target process's real UID
But the target process can't kill the reaper, since its real and effective UIDs are different from the real and saved UIDs of the reaper process.
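The asymmetry can be checked with a small model of the unprivileged rule quoted from the man page. The struct and function names are hypothetical, the UID values are arbitrary, and the model ignores privileged senders entirely.

```c
#include <stdbool.h>
#include <sys/types.h>

/* The three user IDs carried by a process. */
struct uids {
    uid_t ruid;   /* real */
    uid_t euid;   /* effective */
    uid_t suid;   /* saved */
};

/* Model of the unprivileged permission rule: the sender's real or
 * effective UID must equal the target's real or saved UID. */
bool may_signal(struct uids sender, struct uids target)
{
    return sender.ruid == target.ruid || sender.ruid == target.suid ||
           sender.euid == target.ruid || sender.euid == target.suid;
}
```

With a target whose three UIDs are all target_uid and a reaper set up as described (effective UID target_uid, real and saved UIDs reaper_uid), may_signal(reaper, target) is true while may_signal(target, reaper) is false.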

So the following code will safely kill all processes of target_uid in a race-free way:

    setresuid(reaper_uid, target_uid, reaper_uid);
    kill(-1, 9);

Note that this reaper_uid must have no other running processes when we call kill(), or they will be killed as well. In practice this means either setting aside a single reaper_uid (and using a lock to make sure only one reaper process runs at a time) or having a separate reaper_uid per target_uid.
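With error handling added, a reaper might look roughly like this. It is a sketch under several assumptions: it targets glibc on Linux (hence _GNU_SOURCE), the caller is privileged enough to change UIDs, and the helper names has_uids() and reap_uid() are invented here. Running the reaper in a forked child is a design choice of this sketch, made so that the privileged caller keeps its own identity.

```c
#define _GNU_SOURCE              /* setresuid()/getresuid() on glibc */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: nonzero if the caller has exactly these UIDs. */
int has_uids(uid_t ruid, uid_t euid, uid_t suid)
{
    uid_t r, e, s;
    if (getresuid(&r, &e, &s) != 0)
        return 0;
    return r == ruid && e == euid && s == suid;
}

/* Hypothetical helper: kill every process owned by target_uid from a
 * short-lived child. Returns 0 on success, -1 on any failure. */
int reap_uid(uid_t reaper_uid, uid_t target_uid)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Argument order is real, effective, saved. */
        if (setresuid(reaper_uid, target_uid, reaper_uid) != 0)
            _exit(1);
        /* Never reach kill(-1) unless the switch verifiably happened. */
        if (!has_uids(reaper_uid, target_uid, reaper_uid))
            _exit(1);
        /* ESRCH is treated as "nothing was left to signal". */
        if (kill(-1, SIGKILL) != 0 && errno != ESRCH)
            _exit(1);
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```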

Proof-of-concept code for both the rogue process and the reaper process can be found in this GitHub repository.

No POSIX-compliant mousetraps?

The setresuid() system call is implemented by both Linux and FreeBSD. It is not currently implemented by NetBSD, but implementing it seems like a pretty straightforward exercise (and certainly a lot simpler than implementing seccomp()). NetBSD does implement RLIMIT_NPROC, which should also be helpful at preventing our process from executing fork().

On the other hand, neither setresuid() nor RLIMIT_NPROC is in the current POSIX specification. It seems impossible to get a process to have the required tuple using only the current POSIX interfaces (namely setuid() and setreuid(), without recourse to setresuid() or Linux's CAP_SETUID); the assumption seems to be that euid must always be set to either ruid or suid. So there would seem to be no way within that specification to safely prevent a potentially rogue process from using fork() to evade kill().
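To illustrate one such attempt (not a proof of impossibility), the sketch below models what a privileged setreuid(reaper_uid, target_uid) would produce. It assumes the saved-ID rule documented in the Linux setreuid(2) man page: if the real UID is set, or the effective UID is set to a value different from the previous real UID, the saved UID becomes the new effective UID. All names are hypothetical and privilege checks are not modeled.

```c
#include <stdbool.h>
#include <sys/types.h>

struct uids {
    uid_t ruid;   /* real */
    uid_t euid;   /* effective */
    uid_t suid;   /* saved */
};

/* Model of setreuid() for a privileged caller; (uid_t)-1 means
 * "leave this ID unchanged". */
struct uids model_setreuid(struct uids old, uid_t ruid, uid_t euid)
{
    const uid_t keep = (uid_t)-1;
    struct uids next = old;

    if (ruid != keep)
        next.ruid = ruid;
    if (euid != keep)
        next.euid = euid;
    /* Assumed rule: the saved UID follows the new effective UID. */
    if (ruid != keep || (euid != keep && euid != old.ruid))
        next.suid = next.euid;
    return next;
}

/* Unprivileged signal-permission rule from kill(2). */
bool may_signal(struct uids sender, struct uids target)
{
    return sender.ruid == target.ruid || sender.ruid == target.suid ||
           sender.euid == target.ruid || sender.euid == target.suid;
}
```

Starting from root, this model yields a real UID of reaper_uid but effective and saved UIDs of target_uid, so the target process could signal the would-be reaper through its saved UID.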

Acknowledgments

Thanks to Ian Jackson for doing the analysis to discover the appropriate <euid, ruid, suid> tuple, as well as confirming my assessment that there is no way to set that tuple using current POSIX interfaces.


Killing processes that don't want to die

Posted May 28, 2018 15:34 UTC (Mon) by juliank (guest, #45896) [Link] (9 responses)

Linux: Run it in a systemd unit - problem solved. Or if you want to go barebones, put it in a cgroup yourself.

Killing processes that don't want to die

Posted May 28, 2018 17:03 UTC (Mon) by zblaxell (subscriber, #26385) [Link] (1 responses)

I was about to say...if you're hosting rogue fork-bomb processes, and you're not using at least cgroups (or some equivalent feature in non-Linux environments), then you're just doing it wrong.

Also you probably want to freeze the cgroup before killing processes within it; otherwise, you just get to play the same games of race-condition whack-a-mole. Last time I checked, systemd just looped forever trying to win the race.

Killing processes that don't want to die

Posted May 28, 2018 18:04 UTC (Mon) by kentonv (subscriber, #92073) [Link]

Use PID namespaces. If you kill the root process of a PID namespace, all other procs in the namespace are killed too.

Killing processes that don't want to die

Posted May 29, 2018 19:45 UTC (Tue) by wahern (subscriber, #37304) [Link] (6 responses)

systemd has the same race condition as killall. cgroups doesn't solve that.

Killing processes that don't want to die

Posted May 29, 2018 21:24 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

In theory.

In practice the kill loop in systemd works much faster than forking. And crucially processes can't escape their cgroup (unless they have sufficient privileges).

Killing processes that don't want to die

Posted May 29, 2018 23:46 UTC (Tue) by wahern (subscriber, #37304) [Link] (2 responses)

Reading a cgroups list from /proc is not intrinsically different than reading any other process list. Yes, it's probably faster.[citation needed] But you could always have 2 or 3 or 20 processes in a fork loop. As a practical matter it's a distinction without a difference. We shouldn't address TOCTTOU races by making loops faster, and it would be foolhardy to think doing so presents any significant barrier. It would be nice to have a way to reliably, consistently, and *provably* do process management without having to roll dice. Preferably using a mechanism that isn't easily broken with the next absent-minded patch set. The nice thing about relying on UID semantics is that it's not an area where people tend to be oblivious to the ramifications of their changes because it's always been possible to atomically kill, e.g., process groups. Though, by all means lobby for making it possible to atomically kill a cgroup. Not my choice, but defensible.

Killing processes that don't want to die

Posted May 29, 2018 23:46 UTC (Tue) by wahern (subscriber, #37304) [Link]

s#/proc#/sys#

Killing processes that don't want to die

Posted May 30, 2018 0:14 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

Try it. cgroup-based killing is way faster than the classic /proc-based tools and you can't realistically outrace it in regular conditions.

I used to worry about it, but in practice it's not a problem. It'd be interesting to add support for atomic signalling to cgroups, though.

If your cgroups also have attached controllers, you can start by decreasing the cgroups CPU and memory priority.

Killing processes that don't want to die

Posted Sep 25, 2018 2:06 UTC (Tue) by lukeshu (guest, #105612) [Link]

Looping over /sys/fs/cgroup/…/cgroup.procs until it's empty *does* resolve the race-condition that it's difficult to determine whether you've successfully killed the process. You are correct in that it doesn't resolve the race-condition that it's difficult to get the active PID to kill. However, changing the question from "if" to "when" is *significant*.

Additionally, in the case of systemd: since systemd is the process that will be collecting the dead parent PIDs, this removes the safety concern that another process re-uses the PID between the time the target process abandons the PID and the reaper calling kill(PID). If not using systemd, the same thing can be accomplished by having a trusted parent process mark itself as a subreaper before invoking the untrusted executable.

Killing processes that don't want to die

Posted May 31, 2018 11:17 UTC (Thu) by grawity (subscriber, #80596) [Link]

They could now that the pids controller exists. As systemd already supports it for limiting the number of processes per-cgroup, it could drop the limit to 1 and prevent fork() from being used.

kill a kernel thread

Posted May 28, 2018 15:52 UTC (Mon) by sytoka (guest, #38525) [Link] (12 responses)

Sometimes a process is blocked inside the kernel, for example in mount, umount, or NFS.

If your NFS goes crazy, it is impossible to kill the process. You have to reboot.

I would love a kill command for some kernel jobs!

kill a kernel thread

Posted May 28, 2018 16:22 UTC (Mon) by willy (subscriber, #9762) [Link]

I did some work to address this about ten years ago. We now have TASK_KILLABLE and related infrastructure such as mutex_lock_killable and wait_on_page_killable.

It's now a SMOP to use this everywhere. NFS reads were my initial target, and I believe they still work. NFS writes are harder; last time I checked, they were killed correctly, then the task hung trying to fsync the file on close.

Maybe you would have some time to work on this?

kill a kernel thread

Posted May 28, 2018 17:06 UTC (Mon) by Sesse (subscriber, #53779) [Link] (3 responses)

A classic trick for NFS; remount the filesystem with intr, then kill the process.

kill a kernel thread

Posted May 28, 2018 18:26 UTC (Mon) by willy (subscriber, #9762) [Link] (2 responses)

Classic, and obsolete on Linux for ten years. When I put in the TASK_KILLABLE changes, "intr" became a no-op.

kill a kernel thread

Posted May 28, 2018 18:28 UTC (Mon) by Sesse (subscriber, #53779) [Link] (1 responses)

Does this also mean that read() on a file on NFS can give EINTR now?

kill a kernel thread

Posted May 29, 2018 3:23 UTC (Tue) by willy (subscriber, #9762) [Link]

No. It means if you ^C a read() and the task hasn't set a handler for SIGINT, the task will die without waiting for the read to complete.

If it did set a handler, the read() will block indefinitely as before.

kill a kernel thread

Posted May 28, 2018 19:27 UTC (Mon) by flussence (guest, #85566) [Link] (3 responses)

Filesystems have gotten much better about this sort of thing over the years, but graphics drivers seem to be regressing. Not only can a buggy GL/Vulkan user become unkillable, it sometimes messes the system up so bad adjacent X clients get wedged in D-state and their windows can't be closed.

kill a kernel thread

Posted May 28, 2018 21:02 UTC (Mon) by blackwood (guest, #44174) [Link] (2 responses)

We're extensively testing all the error paths and making sure (or trying to at least) that all lock acquisition paths and anything blocking is interruptible. At least on the drm/i915 driver. Not being able to kill stuff after a gpu hang (especially when the driver didn't manage to recover the hw) is a bug. Reports would be appreciated.

kill a kernel thread

Posted May 29, 2018 2:14 UTC (Tue) by zlynx (guest, #2285) [Link]

Well, when trying to use Nouveau drivers in a i915 Wayland session in Gnome and running anything more complex than glxgears with DRI_PRIME=1, the DRM subsystem locks up almost every time. It's not subtle or hard to reproduce.

kill a kernel thread

Posted May 30, 2018 2:41 UTC (Wed) by flussence (guest, #85566) [Link]

I'll admit i915 *used to* be pretty bad about locking up (around 2010-2012?), but it's improved a lot since then. Chromium still manages to provoke long pauses in it somehow, but it looks like there's plenty of bugs open about that already, and my problems mysteriously vanished when I switched browsers.

Anyway those symptoms above were actually things I'm getting in amdgpu. There's corresponding bugs for them too (and a bunch of other irritants I didn't mention), so I can't really do anything but wait, and scowl at the company… their management's been overpromising and underdelivering since they bought out ATi.

kill a kernel thread

Posted May 28, 2018 23:33 UTC (Mon) by liam (guest, #84133) [Link] (2 responses)

I believe that's one of the reasons why there have been numerous attempts to implement a revoke() syscall.

kill a kernel thread

Posted May 28, 2018 23:51 UTC (Mon) by smurf (subscriber, #17840) [Link] (1 responses)

Umm, no. revoke() is wanted for taking *devices* away from strange processes which might still be using them (like your tty or your microphone). Never intended for file systems.

kill a kernel thread

Posted Jun 6, 2018 6:45 UTC (Wed) by liam (guest, #84133) [Link]

I had this in mind:

Other potential uses exist as well; consider, for example, disconnecting a process from a file which is preventing the unmounting of a filesystem.

Killing processes that don't want to die

Posted May 28, 2018 16:13 UTC (Mon) by smurf (subscriber, #17840) [Link] (11 responses)

Who cares about POSIX? that thing is becoming more and more irrelevant.

Killing processes that don't want to die

Posted May 29, 2018 7:44 UTC (Tue) by k8to (guest, #15413) [Link] (10 responses)

The main upside to the ancient posix behavior is often someone wrote down a clear explanation of the behavior (typically R Stevens). The new interfaces coming along tend to have an out of date text file somewhere or a half-maintained website that eventually goes offline.

Otherwise, sure, strict compliance with a spec that isn't really living anymore doesn't seem very valuable.

Killing processes that don't want to die

Posted May 29, 2018 14:37 UTC (Tue) by NightMonkey (subscriber, #23051) [Link] (9 responses)

This is just a question: as a sysadmin, I've been spoiled by POSIX and Linux's attempts to keep within its specifications. As you say, I have nice documentation on expected behaviors, when I need it for that wonderful 3:37 AM troubleshooting call. :)

I see at https://en.wikipedia.org/wiki/POSIX that indeed, there is some modern work (2017?), but is that work just whistling past the graveyard? Is there something waiting in the wings to replace POSIX? Or is that what the JVM is now for (as a "standard system interface"). (I'm half-joking with the last one, but I'm in a world of web and mobile commercial apps where few with the money or authority apparently care what is running between the metal and the application... and the JVM under Linux seems like the worst of both worlds, at least from a troubleshooting perspective.)

Killing processes that don't want to die

Posted May 29, 2018 15:13 UTC (Tue) by k8to (guest, #15413) [Link] (1 responses)

In industry, I find most people want to pretend the operating system isn't a matter of much concern anymore. Various abstractions should make it go away, for example no one wants to see what's going wrong with software, just reboot the VM/container programmatically with more software.

In the opengroup docs, scanning for "what changed in here for POSIX.1-2017" which seems to be forming into SUS2018, I find items like:

"The UUCP utilities option is added."

It seems like mostly "a clarifying type was added to these two arguments to one function" is a pretty big change for this update. Mostly it seems like it's a matter of officially dropping already deprecated things.

The most significant set of changes appear to be here:

http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_x...

They seem to have sort of unbroken locale a bit, by letting the program ask for an answer in a specific locale.
the *at set of functions are moved from some kind of API annex to the main spec.

It's difficult to figure out what's truly new.

Killing processes that don't want to die

Posted Sep 25, 2018 2:29 UTC (Tue) by lukeshu (guest, #105612) [Link]

That page is a bit confusing in what it is and what it's describing. It's describing the major changes between POSIX Issue 6 (AKA POSIX-2001) and POSIX Issue 7 (AKA POSIX-2008).

So where do 2017 & 2018 come in on that page? POSIX Issue 7 has had several "bugfix" releases since it was released in 2008. The most recent of which was "1003.1-2017", which didn't actually become official until January 2018.

There are real changes and additions being worked on by the POSIX committee, but they won't show up in a "bugfix" release to Issue 7, they're being held until Issue 8. I'm unsure what the release timeline looks like for Issue 8.

Killing processes that don't want to die

Posted May 29, 2018 15:45 UTC (Tue) by k8to (guest, #15413) [Link]

Or comparably, do you find the opengroup docs very readable? I don't. They're not bad for what they are, but I find them laborious to follow and missing information about purpose and intent.

Basically the value to me as a higher level user of computer systems is that someone has created more digestible information that contains what is in them and more. Is anyone doing that anymore with libc & system calls?

For example, I can work out for myself that call_l(..., locale_choice) allows me to write code that doesn't break when someone creates a goofy set of env vars, but can most modern developers work that out on their own with the information provided in POSIX? I'd expect not.

Killing processes that don't want to die

Posted May 29, 2018 19:15 UTC (Tue) by xtifr (guest, #143) [Link] (5 responses)

The replacement is basically SUS (the Single Unix Specification).

The big things that changed are that 1. VMS died, and 2. the Open Group took over the Unix trademark. Which means, modern OSes can basically be divided into two families: Unix-like, which includes Linux and MacOS, and Microsoft. So we no longer need a watered-down "in-between" standard like Posix. Microsoft (unlike DEC) just doesn't care, and everyone else just went ahead and became a more-or-less "real" Unix.

Single Unix

Posted May 30, 2018 16:58 UTC (Wed) by marcH (subscriber, #57642) [Link] (4 responses)

> So we no longer need a watered-down "in-between" standard like Posix.

What is the difference today between "POSIX" and "Single Unix"? Only the former is available on line? :-)

Single Unix

Posted Jun 1, 2018 13:30 UTC (Fri) by jschrod (subscriber, #1646) [Link] (2 responses)

Download the latter at https://publications.opengroup.org/t101

Single Unix

Posted Jun 1, 2018 18:32 UTC (Fri) by marcH (subscriber, #57642) [Link] (1 responses)

Sorry I meant: only the latter is behind a registration/paywall? (which?)

This was just an aside and half-joke actually, I don't really care that much. My more important question is: what are in a nutshell the *technical* differences between today's POSIX and today's Single Unix? Assuming of course these can fit in a nutshell. For instance: is Single Unix just a fancy new name for what could have been just called POSIX 2018? Or is POSIX an outdated and significantly smaller subset of Single Unix? Are the exact same players shooting again? Etc.

xtifr seemed to know much more than he shared.

Single Unix

Posted Jun 1, 2018 20:00 UTC (Fri) by jschrod (subscriber, #1646) [Link]

> Sorry I meant: only the latter is behind a registration/paywall? (which?)

registration

> what are in a nutshell the *technical* differences between today's POSIX and today's Single Unix?

POSIX is a part of SUS.

In fact, current POSIX publication is also done by OpenGroup; e.g. http://pubs.opengroup.org/onlinepubs/9699919799/nframe.html is POSIX.1-2017, which is the most important part of SUS Version 4.

Single Unix

Posted Sep 25, 2018 2:41 UTC (Tue) by lukeshu (guest, #105612) [Link]

Back in the day, there were many differences between SUS and POSIX, but today SUS is just POSIX+Curses. SUSv4 is literally a document set (Open Group T101) of two separate documents; Open Group C165 (POSIX-2008, 2016 edition), and Open Group C094 (X/Open Curses, Issue 7).

Killing processes that don't want to die

Posted May 28, 2018 21:03 UTC (Mon) by hawski (guest, #121884) [Link] (6 responses)

I once wondered about this. I had an idea of a new syscall and I described it here: https://hadrianw.github.io/condemned-rfc/killbelow.2.html

Excerpt from it:

> int killbelow(int signal, int timeout);
>
> killbelow() sends the signal signal to all descendant processes. The timeout argument specifies the maximal interval in milliseconds to wait until there are no descendant processes.
>
> killbelow() is most useful if a process calling it is a reaper for its descendant processes. Reaper status can be acquired by using procctl(2) with PROC_REAP_ACQUIRE or prctl(2) with PR_SET_CHILD_SUBREAPER.
>
> Signal will be delivered to every descendant process even if its user ID is different from the process calling killbelow().

Killing processes that don't want to die

Posted May 28, 2018 23:49 UTC (Mon) by smurf (subscriber, #17840) [Link] (1 responses)

What's a "descendant process"? when its parent dies the child is "descended from" init, or the pid namespace's master. You can kill that instead, today. Problem solved. In Linux anyway.

Killing processes that don't want to die

Posted May 29, 2018 19:35 UTC (Tue) by hawski (guest, #121884) [Link]

You are right. I think that namespaces nowadays are a correct answer. That was just my exploration of ideas. Getting subreaper status is pretty much close to using namespaces. I don't know how well namespaces are supported on different systems, so killbelow and subreaper could probably be a simple solution for other systems.

Killing processes that don't want to die

Posted May 29, 2018 19:27 UTC (Tue) by ebiederm (subscriber, #35028) [Link] (2 responses)

The practical problem with killbelow is that a child can daemonize itself and then not be your child. So even if you could kill every child, escaping from being killed is still straightforward.

Killing processes that don't want to die

Posted May 29, 2018 19:31 UTC (Tue) by hawski (guest, #121884) [Link]

It's covered by this part:

> killbelow() is most useful if a process calling it is a reaper for its descendant processes. Reaper status can be acquired by using procctl(2) with PROC_REAP_ACQUIRE or prctl(2) with PR_SET_CHILD_SUBREAPER.

But yes, with this it's quite close to just using namespaces.

Killing processes that don't want to die

Posted May 29, 2018 20:00 UTC (Tue) by ay (guest, #79347) [Link]

With sufficient capability you can PTRACE_SEIZE a process and become its parent, but that doesn't do much good for children of children and so on.

Killing processes that don't want to die

Posted May 31, 2018 14:53 UTC (Thu) by sbaugh (guest, #103291) [Link]

You can do killbelow in userspace, as long as you're a subreaper. I implemented it here: https://github.com/catern/supervise/blob/master/src/subre...

Killing processes that don't want to die

Posted May 29, 2018 4:35 UTC (Tue) by mrons (subscriber, #1751) [Link] (1 responses)

These days I would use systemd cgroups to kill these rogue fork bombs.

Back in the day, on a computer science teaching system, it would be sport for the students to try to make fork() bombs that were hard to kill (for the sys admin (me)).

The example used in this article, where the PID is rapidly changing, was one such technique used by students. We used to call such fork bombs "comets".

One amusing way I found to kill a comet was to use the limit of max user processes (ulimit -u). I would create a standard fork bomb, run as the rogue user, and exhaust the max number of processes the user could run. The comet would then no longer be able to fork(). Then I could killall the user processes to recover.

So using a fork bomb to kill a fork bomb.

Killing processes that don't want to die

Posted May 31, 2018 13:26 UTC (Thu) by fanf (guest, #124752) [Link]

I call these kinds of processes "fork rabbits". I got into a sticky situation with one of them once...

I was hacking on a production server (I didn't have an adequate test environment). I had a daemon that was supposed to re-open its log files etc. when it got a signal. In order to cope with slow cleanup of the old file descriptors, it would fork and reopen the new file descriptors in the child, allowing the parent to clean up at leisure.

I refactored the signal handling code, and screwed it up.

When the daemon received a signal, it became a rabbit.

It was running as root on a production server.

I couldn't use `kill -KILL -1` and I couldn't reboot the machine. (I might have been able to kill by pgid, but I didn't think of that at the time.)

Fortunately this machine did not have randomized pids, so I could anticipate the future pid of the rabbit a few seconds in advance and run a `while :; do kill $pid; done` loop. Of course the rabbit raced right through the trap.

I rewrote the killer in C, and tried again, but the rabbit kept winning the race. So I tried running multiple concurrent killers targeting several adjacent pids. Eventually this worked!

(The side effect would have been a number of failed FTP connections...)

Killing processes that don't want to die

Posted May 29, 2018 21:14 UTC (Tue) by csigler (subscriber, #1224) [Link]

I've got to ask:

Who else remembers RWAST...?

Killing processes that don't want to die

Posted May 31, 2018 15:09 UTC (Thu) by sbaugh (guest, #103291) [Link]

The primary issue with a UID-based approach to killing processes is that it requires privileges to set up. Also, this isn't robust in the presence of setuid binaries such as sudo. You can get around this by setting PR_SET_NO_NEW_PRIVS. But then of course your subprocesses can't use setuid binaries.

I hacked together https://github.com/catern/supervise which uses PR_SET_CHILD_SUBREAPER to solve this issue without requiring privileges. Even in the presence of fork-bombs, supervise will still kill all its children in finite time, without privileges or any special setup of the system.

Unfortunately, supervise is also not robust to arbitrary setuid binaries by default, but you can again set PR_SET_NO_NEW_PRIVS to make that issue go away.

Killing processes that don't want to die

Posted Jun 1, 2018 16:01 UTC (Fri) by dps (guest, #5725) [Link] (3 responses)

Nobody seems to have mentioned this, so maybe I should.

One of the standard ways of dealing with fork bombs is to send the processes you want to kill SIGSTOP (19), to prevent new processes appearing, before killing them with SIGKILL. All you need is the kill(1) command and the privileges required to use SIGKILL. The only major exception is init(1) and maybe some kernel threads.

Also note that kill(-1, ...) is liable to hit more than you probably want to hit. It is probably better to target a process group instead. Most fork bombs only have one process group id and therefore the signal will be delivered to all of their components.

Killing processes that don't want to die

Posted Jun 1, 2018 19:07 UTC (Fri) by smurf (subscriber, #17840) [Link] (1 responses)

Please explain how SIGSTOP+SIGKILL could possibly be more effective than SIGKILL.

Killing processes that don't want to die

Posted Jun 3, 2018 16:30 UTC (Sun) by anselm (subscriber, #2796) [Link]

If you're dealing with a fork bomb that fills up the process table, SIGKILLing a process will free a slot in the process table that a new instance of the fork bomb can immediately fill. If you SIGSTOP them first, that can't happen because none of the still-existing-in-the-process-table-but-stopped fork bomb processes will be able to spawn new children.

Killing processes that don't want to die

Posted Jun 3, 2018 18:19 UTC (Sun) by sbaugh (guest, #103291) [Link]

As the article says, process groups can be changed by unprivileged processes, so a fork bomb can easily switch process group and avoid your attempt at killing.

Your SIGSTOP suggestion is likewise flawed. Nothing prevents a normal unprivileged process from simply sending SIGCONT to its parents.