LWN: Comments on "Killing processes that don't want to die"

Single Unix

lukeshu — Tue, 25 Sep 2018 02:41:05 +0000

Back in the day, there were many differences between SUS and POSIX, but today SUS is just POSIX+Curses. SUSv4 is literally a document set (Open Group T101) of two separate documents: Open Group C165 (POSIX-2008, 2016 edition) and Open Group C094 (X/Open Curses, Issue 7).

Killing processes that don't want to die

lukeshu — Tue, 25 Sep 2018 02:29:07 +0000

That page is a bit confusing in what it is and what it's describing. It's describing the major changes between POSIX Issue 6 (AKA POSIX-2001) and POSIX Issue 7 (AKA POSIX-2008).

So where do 2017 & 2018 come in on that page? POSIX Issue 7 has had several "bugfix" releases since it was released in 2008. The most recent of these was "1003.1-2017", which didn't actually become official until January 2018.

There are real changes and additions being worked on by the POSIX committee, but they won't show up in a "bugfix" release to Issue 7; they're being held until Issue 8. I'm unsure what the release timeline looks like for Issue 8.

Killing processes that don't want to die

lukeshu — Tue, 25 Sep 2018 02:06:14 +0000

Looping over /sys/fs/cgroup/…/cgroup.procs until it's empty *does* resolve the race-condition that it's difficult to determine whether you've successfully killed the process. You are correct in that it doesn't resolve the race-condition that it's difficult to get the active PID to kill. However, changing the question from "if" to "when" is *significant*.

Additionally, in the case of systemd: since systemd is the process that will be collecting the dead parent PIDs, this removes the safety concern that another process re-uses the PID between the time the target process abandons the PID and the reaper calling kill(PID). If not using systemd, the same thing can be accomplished by having a trusted parent process mark itself as a subreaper before invoking the untrusted executable.
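
For illustration, the "loop until cgroup.procs is empty" idea could be sketched roughly as below. This is not systemd's code; the function names, the injectable `kill` parameter, and the round limit are assumptions made for this sketch, and the cgroup directory is passed in by the caller because the path above is elided.

```python
import os
import signal
import time


def read_pids(cgroup_dir):
    """Return the PIDs currently listed in <cgroup_dir>/cgroup.procs."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return [int(line) for line in f if line.strip()]


def kill_until_empty(cgroup_dir, kill=os.kill, max_rounds=1000, pause=0.01):
    """SIGKILL every listed PID, re-reading the list until it is empty.

    Returns the number of kill rounds that were needed. `kill` is
    injectable (an assumption of this sketch) so the loop can be
    exercised without a real cgroup.
    """
    for round_no in range(1, max_rounds + 1):
        pids = read_pids(cgroup_dir)
        if not pids:
            return round_no - 1  # an empty list answers the "if" question
        for pid in pids:
            try:
                kill(pid, signal.SIGKILL)
            except ProcessLookupError:
                pass  # already gone; the next read will confirm
        time.sleep(pause)
    raise TimeoutError("cgroup still has members")
```

Each individual kill() is still racy, but the empty list at the end is the confirmation that everything in the cgroup is gone.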

kill a kernel thread

liam — Wed, 06 Jun 2018 06:45:08 +0000

I had this in mind:

Other potential uses exist as well; consider, for example, disconnecting a process from a file which is preventing the unmounting of a filesystem.

Killing processes that don't want to die

sbaugh — Sun, 03 Jun 2018 18:19:45 +0000

As the article says, process groups can be changed by unprivileged processes, so a fork bomb can easily switch process group and avoid your attempt at killing.

Your SIGSTOP suggestion is likewise flawed. Nothing prevents a normal unprivileged process from simply sending SIGCONT to its parents.

Killing processes that don't want to die

anselm — Sun, 03 Jun 2018 16:30:41 +0000

If you're dealing with a fork bomb that fills up the process table, SIGKILLing a process will free a slot in the process table that a new instance of the fork bomb can immediately fill. If you SIGSTOP them first, that can't happen because none of the still-existing-in-the-process-table-but-stopped fork bomb processes will be able to spawn new children.
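
A minimal sketch of that two-pass approach, assuming you already have a list of target PIDs (collecting that list is the hard part and is not shown); the function name is made up for this example:

```python
import os
import signal


def stop_then_kill(pids):
    """Freeze every target with SIGSTOP first, then SIGKILL them all."""
    stopped = []
    for pid in pids:
        try:
            os.kill(pid, signal.SIGSTOP)  # a stopped process cannot fork
            stopped.append(pid)
        except ProcessLookupError:
            pass  # exited before we got to it
    for pid in stopped:
        try:
            os.kill(pid, signal.SIGKILL)  # SIGKILL still works on stopped processes
        except ProcessLookupError:
            pass
    return stopped
```

The stopped processes keep their process-table slots until the second pass, which is the point of doing it in two passes.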

Single Unix

jschrod — Fri, 01 Jun 2018 20:00:45 +0000

> Sorry I meant: only the latter is behind a registration/paywall? (which?)

registration

> what are in a nutshell the *technical* differences between today's POSIX and today's Single Unix?

POSIX is a part of SUS.

In fact, current POSIX publication is also done by OpenGroup; e.g. http://pubs.opengroup.org/onlinepubs/9699919799/nframe.html is POSIX.1-2017, which is the most important part of SUS Version 4.

Killing processes that don't want to die

smurf — Fri, 01 Jun 2018 19:07:44 +0000

Please explain how SIGSTOP+SIGKILL could possibly be more effective than SIGKILL.

Single Unix

marcH — Fri, 01 Jun 2018 18:32:09 +0000

Sorry I meant: only the latter is behind a registration/paywall? (which?)

This was just a side and half-joke actually, I don't really care that much. My more important question is: what are in a nutshell the *technical* differences between today's POSIX and today's Single Unix? Assuming of course these can fit in a nutshell. For instance: is Single Unix just a fancy new name for what could have been just called POSIX 2018? Or is POSIX an outdated and significantly smaller subset of Single Unix? Are the exact same players shooting again? Etc.

xtifr seemed to know much more than he shared.

Killing processes that don't want to die

dps — Fri, 01 Jun 2018 16:01:20 +0000

Nobody seems to have mentioned this, so maybe I should.

One of the standard ways of dealing with fork bombs is to send the processes you want to kill SIGSTOP (19), to prevent new processes appearing, before killing them with SIGKILL. All you need is the kill(1) command and the privileges required to use SIGKILL. The only major exception is init(1) and maybe some kernel threads.

Also note that kill(-1, ...) is liable to hit more than you probably want to hit. It is probably better to target a process group instead. Most fork bombs only have one process group id and therefore the signal will be delivered to all of their components.
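
As a rough sketch of targeting a process group rather than everything, here the group is created by us so that its ID is known; the helper names are assumptions for this example. It only helps as long as nothing in the group calls setpgid() or setsid() to leave it.

```python
import os
import signal
import subprocess
import sys


def spawn_in_new_group(argv):
    """Start argv as the leader of a fresh session and process group."""
    return subprocess.Popen(argv, start_new_session=True)


def kill_group(pgid, sig=signal.SIGKILL):
    """Send `sig` to every process currently in process group `pgid`."""
    os.killpg(pgid, sig)


if __name__ == "__main__":
    proc = spawn_in_new_group([sys.executable, "-c", "import time; time.sleep(60)"])
    kill_group(proc.pid)  # the leader's PID is also the group ID
    proc.wait()
```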

Single Unix

jschrod — Fri, 01 Jun 2018 13:30:52 +0000

Download the latter at https://publications.opengroup.org/t101

Killing processes that don't want to die

sbaugh — Thu, 31 May 2018 15:09:53 +0000

The primary issue with a UID-based approach to killing processes is that it requires privileges to set up. Also, this isn't robust in the presence of setuid binaries such as sudo. You can get around this by setting PR_SET_NO_NEW_PRIVS. But then of course your subprocesses can't use setuid binaries.

I hacked together https://github.com/catern/supervise which uses PR_SET_CHILD_SUBREAPER to solve this issue without requiring privileges. Even in the presence of fork-bombs, supervise will still kill all its children in finite time, without privileges or any special setup of the system.

Unfortunately, supervise is also not robust to arbitrary setuid binaries by default, but you can again set PR_SET_NO_NEW_PRIVS to make that issue go away.
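
To show what subreaper status buys you, here is a small Linux-only sketch. It is not taken from supervise; the function names are invented for this example, and the PR_SET_CHILD_SUBREAPER value is copied from <linux/prctl.h>. A grandchild whose parent has exited is reparented to us, so we can kill and reap it by PID without the PID being recycled underneath us.

```python
import ctypes
import os
import signal

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def become_subreaper():
    """Mark the calling process as a child subreaper (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def spawn_orphan():
    """Fork a child that forks a grandchild and exits; return the grandchild's PID."""
    r, w = os.pipe()
    child = os.fork()
    if child == 0:  # intermediate child
        try:
            os.close(r)
            grandchild = os.fork()
            if grandchild == 0:  # grandchild: idle until signalled
                os.close(w)
                while True:
                    signal.pause()
            os.write(w, str(grandchild).encode())
        finally:
            os._exit(0)
    os.close(w)
    data = os.read(r, 64)
    os.close(r)
    os.waitpid(child, 0)  # the grandchild is now an orphan
    return int(data)


def kill_and_reap(pid):
    """SIGKILL `pid` and collect its exit status; fails if it is not our child."""
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return status
```

Calling become_subreaper(), then spawn_orphan(), then kill_and_reap() on the returned PID works only because the orphan was reparented to the caller; without the prctl() call, waitpid() would raise ChildProcessError.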

Killing processes that don't want to die

sbaugh — Thu, 31 May 2018 14:53:44 +0000

You can do killbelow in userspace, as long as you're a subreaper. I implemented it here: https://github.com/catern/supervise/blob/master/src/subre...
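
One way a userspace killbelow might find its targets is to walk /proc and follow parent PIDs, as sketched below. This is an assumption about the general approach, not a description of the linked code, and the function names are made up. The snapshot is racy on its own; it is subreaper status that keeps orphans inside the tree so that repeating the scan eventually finds them.

```python
import os


def read_ppid(pid):
    """Return the parent PID recorded in /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is parenthesised and may itself contain spaces or
    # ')', so split after the last ')': the fields that follow are
    # state, ppid, ...
    return int(data[data.rindex(")") + 2:].split()[1])


def descendants(root):
    """Return the set of PIDs currently descended from `root` (a racy snapshot)."""
    parent_of = {}
    for name in os.listdir("/proc"):
        if name.isdigit():
            try:
                parent_of[int(name)] = read_ppid(int(name))
            except OSError:
                pass  # the process exited while we were scanning
    found, frontier = set(), {root}
    while frontier:
        # Next generation: everything whose parent is in the current one.
        frontier = {p for p, pp in parent_of.items() if pp in frontier} - found
        found |= frontier
    return found
```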

Killing processes that don't want to die

fanf — Thu, 31 May 2018 13:26:18 +0000

I call these kinds of processes "fork rabbits". I got into a sticky situation with one of them once...

I was hacking on a production server (I didn't have an adequate test environment). I had a daemon that was supposed to re-open its log files etc. when it got a signal. In order to cope with slow cleanup of the old file descriptors, it would fork and reopen the new file descriptors in the child, allowing the parent to clean up at leisure.

I refactored the signal handling code, and screwed it up.

When the daemon received a signal, it became a rabbit.

It was running as root on a production server.

I couldn't use `kill -KILL -1` and I couldn't reboot the machine. (I might have been able to kill by pgid, but I didn't think of that at the time.)

Fortunately this machine did not have randomized pids, so I could anticipate the future pid of the rabbit a few seconds in advance and run a `while :; do kill $pid; done` loop. Of course the rabbit raced right through the trap.

I rewrote the killer in C, and tried again, but the rabbit kept winning the race. So I tried running multiple concurrent killers targeting several adjacent pids. Eventually this worked!

(The side effect would have been a number of failed FTP connections...)

Killing processes that don't want to die

grawity — Thu, 31 May 2018 11:17:21 +0000

They could now that the pids controller exists. As systemd already supports it for limiting the number of processes per-cgroup, it could drop the limit to 1 and prevent fork() from being used.
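
As a sketch of what "dropping the limit" amounts to at the file level: the pids controller exposes a pids.max file in each cgroup directory, and writing a number to it caps the number of tasks. The helper name and the cgroup directory argument are assumptions for this example; systemd itself exposes this through its own settings rather than through a helper like this.

```python
import os


def set_pids_max(cgroup_dir, limit):
    """Write a task limit to <cgroup_dir>/pids.max ("max" lifts the limit)."""
    if limit != "max" and int(limit) < 0:
        raise ValueError("limit must be a non-negative integer or 'max'")
    with open(os.path.join(cgroup_dir, "pids.max"), "w") as f:
        f.write(f"{limit}\n")
```

Lowering the limit does not kill anything already running; it only makes further fork() calls in that cgroup fail, which is what stops the population from growing while you kill it.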

Single Unix

marcH — Wed, 30 May 2018 16:58:24 +0000

> So we no longer need a watered-down "in-between" standard like Posix.

What are the differences today between "POSIX" and "Single Unix"? Only the former is available online? :-)

kill a kernel thread

flussence — Wed, 30 May 2018 02:41:22 +0000

I'll admit i915 *used to* be pretty bad about locking up (around 2010-2012?), but it's improved a lot since then. Chromium still manages to provoke long pauses in it somehow, but it looks like there's plenty of bugs open about that already, and my problems mysteriously vanished when I switched browsers.

Anyway those symptoms above were actually things I'm getting in amdgpu. There's corresponding bugs for them too (and a bunch of other irritants I didn't mention), so I can't really do anything but wait, and scowl at the company… their management's been overpromising and underdelivering since they bought out ATi.

Killing processes that don't want to die

Cyberax — Wed, 30 May 2018 00:14:18 +0000

Try it. cgroup-based killing is way faster than the classic /proc-based tools and you can't realistically outrace it in regular conditions.

I used to worry about it, but in practice it's not a problem. It'd be interesting to add support for atomic signalling to cgroups, though.

If your cgroups also have attached controllers, you can start by decreasing the cgroup's CPU and memory priority.

Killing processes that don't want to die

wahern — Tue, 29 May 2018 23:46:55 +0000

s#/proc#/sys#

Killing processes that don't want to die

wahern — Tue, 29 May 2018 23:46:23 +0000

Reading a cgroups list from /proc is not intrinsically different than reading any other process list. Yes, it's probably faster.[citation needed] But you could always have 2 or 3 or 20 processes in a fork loop. As a practical matter it's a distinction without a difference. We shouldn't address TOCTTOU races by making loops faster, and it would be foolhardy to think doing so presents any significant barrier. It would be nice to have a way to reliably, consistently, and *provably* do process management without having to roll dice. Preferably using a mechanism that isn't easily broken with the next absent-minded patch set. The nice thing about relying on UID semantics is that it's not an area where people tend to be oblivious to the ramifications of their changes because it's always been possible to atomically kill, e.g., process groups. Though, by all means lobby for making it possible to atomically kill a cgroup. Not my choice, but defensible.

Killing processes that don't want to die

Cyberax — Tue, 29 May 2018 21:24:29 +0000

In theory.

In practice the kill loop in systemd works much faster than forking. And crucially processes can't escape their cgroup (unless they have sufficient privileges).

Killing processes that don't want to die

csigler — Tue, 29 May 2018 21:14:49 +0000

I've got to ask:

Who else remembers RWAST...?

Killing processes that don't want to die

ay — Tue, 29 May 2018 20:00:53 +0000

With sufficient capability you can PTRACE_SEIZE a process and become its parent, but that doesn't do much good for children of children and so on.

Killing processes that don't want to die

wahern — Tue, 29 May 2018 19:45:56 +0000

systemd has the same race condition as killall. cgroups doesn't solve that.

Killing processes that don't want to die

hawski — Tue, 29 May 2018 19:35:03 +0000

You are right. I think that namespaces nowadays are a correct answer. That was just my exploration of ideas. Getting subreaper status is pretty much close to using namespaces. I don't know how well namespaces are supported on different systems, so killbelow and subreaper could probably be a simple solution for other systems.

Killing processes that don't want to die

hawski — Tue, 29 May 2018 19:31:52 +0000

It's covered by this part:

> killbelow() is most useful if a process calling it is a reaper for its descendant processes. Reaper status can be acquired by using procctl(2) with PROC_REAP_ACQUIRE or prctl(2) with PR_SET_CHILD_SUBREAPER.

But yes, with this it's quite close to just using namespaces.

Killing processes that don't want to die

ebiederm — Tue, 29 May 2018 19:27:31 +0000

The practical problem with killbelow is that a child can daemonize itself and then not be your child. So even if you could kill every child, escaping from being killed is still straightforward.

Killing processes that don't want to die

xtifr — Tue, 29 May 2018 19:15:46 +0000

The replacement is basically SUS (the Single Unix Specification).

The big things that changed are that 1. VMS died, and 2. the Open Group took over the Unix trademark. Which means, modern OSes can basically be divided into two families: Unix-like, which includes Linux and MacOS, and Microsoft. So we no longer need a watered-down "in-between" standard like Posix. Microsoft (unlike DEC) just doesn't care, and everyone else just went ahead and became a more-or-less "real" Unix.

Killing processes that don't want to die

k8to — Tue, 29 May 2018 15:45:48 +0000

Or comparably, do you find the opengroup docs very readable? I don't. They're not bad for what they are, but I find them laborious to follow and missing information about purpose and intent.

Basically the value to me as a higher level user of computer systems is that someone has created more digestible information that contains what is in them and more. Is anyone doing that anymore with libc & system calls?

For example, I can work out for myself that call_l(..., locale_choice) allows me to write code that doesn't break when someone creates a goofy set of env vars, but can most modern developers work that out on their own with the information provided in POSIX? I'd expect not.

Killing processes that don't want to die

k8to — Tue, 29 May 2018 15:13:53 +0000

In industry, I find most people want to pretend the operating system isn't a matter of much concern anymore. Various abstractions should make it go away, for example no one wants to see what's going wrong with software, just reboot the VM/container programmatically with more software.

In the opengroup docs, scanning for "what changed in here for POSIX.1-2017" which seems to be forming into SUS2018, I find items like:

"The UUCP utilities option is added."

It seems like mostly "a clarifying type was added to these two arguments to one function" is a pretty big change for this update. Mostly it seems like it's a matter of officially dropping already deprecated things.

The most significant set of changes appear to be here:

http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_x...

They seem to have sort of unbroken locale a bit, by letting the program ask for an answer in a specific locale.
The *at set of functions are moved from some kind of API annex to the main spec.

It's difficult to figure out what's truly new.

Killing processes that don't want to die

NightMonkey — Tue, 29 May 2018 14:37:13 +0000

This is just a question: as a sysadmin, I've been spoiled by POSIX and Linux's attempts to keep within its specifications. As you say, I have nice documentation on expected behaviors, when I need it for that wonderful 3:37 AM troubleshooting call. :)

I see at https://en.wikipedia.org/wiki/POSIX that indeed, there is some modern work (2017?), but is that work just whistling past the graveyard? Is there something waiting in the wings to replace POSIX? Or is that what the JVM is now for (as a "standard system interface"). (I'm half-joking with the last one, but I'm in a world of web and mobile commercial apps where few with the money or authority apparently care what is running between the metal and the application... and the JVM under Linux seems like the worst of both worlds, at least from a troubleshooting perspective.)

Killing processes that don't want to die

k8to — Tue, 29 May 2018 07:44:11 +0000

The main upside to the ancient posix behavior is often someone wrote down a clear explanation of the behavior (typically R Stevens). The new interfaces coming along tend to have an out of date text file somewhere or a half-maintained website that eventually goes offline.

Otherwise, sure, strict compliance with a spec that isn't really living anymore doesn't seem very valuable.

Killing processes that don't want to die

mrons — Tue, 29 May 2018 04:35:04 +0000

These days I would use systemd cgroups to kill these rogue fork bombs.

Back in the day, on a computer science teaching system, it would be sport for the students to try to make fork() bombs that were hard to kill (for the sys admin (me)).

The example used in this article, where the PID is rapidly changing, was one such technique used by students. We used to call such fork bombs "comets".

One amusing way I found to kill a comet was to use the limit on max user processes (ulimit -u). I would create a standard fork bomb, run as the rogue user, and exhaust the max number of processes the user could run. The comet would then no longer be able to fork(). Then I could killall the user processes to recover.

So using a fork bomb to kill a fork bomb.

kill a kernel thread

willy — Tue, 29 May 2018 03:23:23 +0000

No. It means if you ^C a read() and the task hasn't set a handler for SIGINT, the task will die without waiting for the read to complete.

If it did set a handler, the read() will block indefinitely as before.

kill a kernel thread

zlynx — Tue, 29 May 2018 02:14:30 +0000

Well, when trying to use Nouveau drivers in an i915 Wayland session in Gnome and running anything more complex than glxgears with DRI_PRIME=1, the DRM subsystem locks up almost every time. It's not subtle or hard to reproduce.

kill a kernel thread

smurf — Mon, 28 May 2018 23:51:04 +0000

Umm, no. revoke() is wanted for taking *devices* away from strange processes which might still be using them (like your tty or your microphone). Never intended for file systems.

Killing processes that don't want to die

smurf — Mon, 28 May 2018 23:49:23 +0000

What's a "descendant process"? When its parent dies the child is "descended from" init, or the pid namespace's master. You can kill that instead, today. Problem solved. In Linux anyway.

kill a kernel thread

liam — Mon, 28 May 2018 23:33:32 +0000

I believe that's one of the reasons why there have been numerous attempts to implement a revoke() syscall.

Killing processes that don't want to die

hawski — Mon, 28 May 2018 21:03:34 +0000

I once wondered about this. I had an idea of a new syscall and I described it here: https://hadrianw.github.io/condemned-rfc/killbelow.2.html

Excerpt from it:

> int killbelow(int signal, int timeout);
>
> killbelow() sends the signal signal to all descendant processes. The timeout argument specifies the maximal interval in milliseconds to wait until there are no descendant processes.
>
> killbelow() is most useful if a process calling it is a reaper for its descendant processes. Reaper status can be acquired by using procctl(2) with PROC_REAP_ACQUIRE or prctl(2) with PR_SET_CHILD_SUBREAPER.
>
> Signal will be delivered to every descendant process even if its user ID is different from the process calling killbelow().

kill a kernel thread

blackwood — Mon, 28 May 2018 21:02:49 +0000

We're extensively testing all the error paths and making sure (or trying to at least) that all lock acquisition paths and anything blocking is interruptible. At least on the drm/i915 driver. Not being able to kill stuff after a gpu hang (especially when the driver didn't manage to recover the hw) is a bug. Reports would be appreciated.