A "kill" button for control groups
Posted May 3, 2021 23:19 UTC (Mon) by zblaxell (subscriber, #26385)
In reply to: A "kill" button for control groups by Cyberax
Parent article: A "kill" button for control groups
It looks like there are some ways to escape from the pids controller which the kill button closes off: a process that is running fork() can evade some of the limits that are imposed after fork() starts and before it finishes, or escape by migrating to another cgroup. The kill-button patch leaves a note to smack that process with a SIGKILL just before fork() returns.
Posted May 3, 2021 23:52 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
Personally, I would prefer a reliable handle-based API for processes instead of trying to plug leaks in a dam with fingers.
Posted May 4, 2021 22:07 UTC (Tue)
by zblaxell (subscriber, #26385)
Rights can be delegated. That's one of the central features of cgroups: you don't need to be root to use it.
A process can move around within its delegation hierarchy and evade a (naive, non-looping) userspace terminator--that was part of what made looping (and possibly also recursive search) in userspace necessary. Processes can hold the controller FDs open so they can give themselves their rights back even if the control files are chmod-ed. There are probably a hundred other holes I haven't bothered to think about and, with this patch set, no longer have to.
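For illustration, here is a minimal sketch of the kind of looping userspace terminator described above. It assumes a cgroup v2 directory containing a cgroup.procs file; the names read_pids and kill_until_empty, and the send and max_rounds parameters, are hypothetical and exist only for this sketch. It handles a single cgroup (no recursive search) and is still racy, which is the point being made.

```python
import os
import signal
from pathlib import Path


def read_pids(cgroup_dir):
    """Return the PIDs currently listed in cgroup.procs of one cgroup."""
    text = (Path(cgroup_dir) / "cgroup.procs").read_text()
    return [int(tok) for tok in text.split()]


def kill_until_empty(cgroup_dir, send=os.kill, max_rounds=100):
    """Repeatedly SIGKILL everything in cgroup.procs until it reads empty.

    `send` and `max_rounds` are hypothetical knobs for this sketch.
    Returns the number of rounds in which at least one signal was sent.
    """
    for rounds in range(max_rounds):
        pids = read_pids(cgroup_dir)
        if not pids:
            return rounds
        for pid in pids:
            try:
                send(pid, signal.SIGKILL)
            except ProcessLookupError:
                # Already gone. Note that a PID could also have been
                # recycled between the read and the kill: one of the holes.
                pass
    raise TimeoutError("cgroup still populated after max_rounds")
```

A single pass is not enough because a process that forks between the read and the signal leaves a child behind; the loop only narrows that window, and a real implementation would likely also pause between rounds.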
Posted May 4, 2021 22:46 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
Posted May 4, 2021 16:50 UTC (Tue)
by mezcalero (subscriber, #45103)
The cgroupsv2 freezer makes a ton more sense, and hence we expose it with high-level operations (systemctl freeze + systemctl thaw), but we don't use it to make killing race-free. We could do that, but it doesn't feel ideal to me, since freezing is slow: we need to initiate the freeze, then wait until the kernel tells us it is done (poll()), then enqueue the signal, and then unfreeze and wait again. And blocking syscalls can delay the freeze for a long time. Thus killing would become a "slow" operation in the worst case (at least that's my understanding), and that kinda sucks. After all, we want this as a clean-up operation that gets rid of broken stuff: SIGKILL is the unfriendly way to abort stuff, and if killing is no longer abrupt because we go through the freezer, that defeats half the point.
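To make the cost of that sequence concrete, here is a rough sketch of the freeze, wait, signal, thaw steps just listed. It assumes the cgroup v2 files cgroup.freeze, cgroup.events (with its "frozen" key) and cgroup.procs; the function names and the send, timeout and interval parameters are hypothetical. It waits with a simple sleep loop rather than poll(), and it is not how systemd does anything.

```python
import os
import signal
import time
from pathlib import Path


def is_frozen(cgroup_dir):
    """Return True if cgroup.events reports 'frozen 1'."""
    events = (Path(cgroup_dir) / "cgroup.events").read_text()
    for line in events.splitlines():
        key, _, value = line.partition(" ")
        if key == "frozen":
            return value.strip() == "1"
    return False


def freeze_kill_thaw(cgroup_dir, send=os.kill, timeout=5.0, interval=0.01):
    """Freeze a cgroup, SIGKILL its members, then thaw it.

    `send`, `timeout` and `interval` are hypothetical knobs for this sketch.
    Returns the list of PIDs that were signalled.
    """
    cg = Path(cgroup_dir)
    (cg / "cgroup.freeze").write_text("1")  # step 1: initiate the freeze

    # Step 2: wait until the kernel reports the freeze as complete.
    # This is the potentially slow part the comment complains about.
    deadline = time.monotonic() + timeout
    while not is_frozen(cg):
        if time.monotonic() > deadline:
            (cg / "cgroup.freeze").write_text("0")
            raise TimeoutError("cgroup did not freeze in time")
        time.sleep(interval)

    # Step 3: membership is stable now, so enqueue the signals.
    pids = [int(tok) for tok in (cg / "cgroup.procs").read_text().split()]
    for pid in pids:
        try:
            send(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # exited on its own before we got to it

    (cg / "cgroup.freeze").write_text("0")  # step 4: thaw
    return pids
```

The waiting loop in step 2 has no useful upper bound, which is why the timeout is there at all; a kill operation that needs a timeout is exactly the "slow" behavior being objected to.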
I love Christian's work on this, since it fixes the race for us *and* is always a quick operation. We don't have to wait for anything "slow". (I mean, it internally also iterates through all processes, so it's not O(1), but that's not what I mean by "slow"...) It just enqueues the SIGKILL for each process in a race-free fashion, and that's all we need.
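By contrast, using the kill button from userspace should amount to a single write. This sketch assumes the interface is a write-only file named cgroup.kill in the cgroup directory that accepts "1", which is my understanding of the patch set; the helper name kill_cgroup is hypothetical.

```python
from pathlib import Path


def kill_cgroup(cgroup_dir):
    """Ask the kernel to SIGKILL every process in the cgroup.

    Assumes a `cgroup.kill` control file that accepts "1"; nothing is
    waited for here, the kernel enqueues the signals itself.
    """
    (Path(cgroup_dir) / "cgroup.kill").write_text("1")
```

There is no loop, no freeze and no PID list in userspace, so there is nothing for a forking or migrating process to race against.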
So, yeah, I am looking forward to Christian's work landing, and we'll happily make use of it in systemd once it does. It fixes a real problem for us.
Lennart