LWN: Comments on "Teaching the OOM killer about control groups"
https://lwn.net/Articles/761118/
This is a special feed containing comments posted to the individual LWN article titled "Teaching the OOM killer about control groups".

Teaching the OOM killer about control groups
by mstsxfx, Wed, 01 Aug 2018 13:21:24 +0000
https://lwn.net/Articles/761464/

> This patch set is not yet in -mm, but there does not appear to be any real opposition to it at this point.

Well, this is not entirely true. There were quite serious concerns about the proposed API allowing for weird corner cases, e.g. http://lkml.kernel.org/r/20180130085013.GP21609@dhcp22.suse.cz. The discussion circled around some repetitive arguments, so it is not easy to follow, unfortunately. The last version hasn't been reviewed AFAIK, but the basics are quite similar, so I am skeptical that this is mergeable anytime soon.

The biggest quarrel about this whole thing is, IMHO, a differing view of feature completeness.

Roman's original proposal targeted a very specific class of use cases (containers, as you mentioned) and was opt-in, so those uninterested could live with the original policy/heuristic, with potential extensions [1] to be done on top.

David was pushing hard for a more generic solution that would give userspace more power over the OOM selection policy. While this is a good thing in general, the primary problem is that it is extremely hard to get right. We have been discussing this for years without moving forward much, because opinions about what is really important vary a lot.

[1] group_oom makes a lot of sense regardless of the OOM victim selection policy, because some workloads are inherently indivisible.

Michal Hocko
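[As a rough illustration of how small the opt-in surface Michal describes is, here is a minimal editor's sketch of turning on whole-group OOM kills for a single container cgroup. The knob name (memory.oom_group), the cgroup-v2 path, and the default directory are assumptions taken from the patch discussion; the merged interface may end up looking different.]

    /* Editor's sketch, not part of the patch set: opt one cgroup into
     * whole-group OOM kills while the rest of the system keeps the
     * existing per-process heuristic.  The knob name (memory.oom_group)
     * and the cgroup path are assumptions from the discussion; error
     * handling is kept minimal. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const char *cg = argc > 1 ? argv[1] : "/sys/fs/cgroup/container1";
        char path[4096];
        FILE *f;

        snprintf(path, sizeof(path), "%s/memory.oom_group", cg);
        f = fopen(path, "w");
        if (!f) {
            perror(path);
            return EXIT_FAILURE;
        }
        /* "1": if the OOM killer picks a victim inside this group, kill
         * every task in the group rather than a single process. */
        fputs("1\n", f);
        return fclose(f) ? EXIT_FAILURE : EXIT_SUCCESS;
    }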
OOM killer and cgroups. Better do it in user-space?
by nilsmeyer, Tue, 31 Jul 2018 21:04:15 +0000
https://lwn.net/Articles/761405/

It depends on the service, really. What I often see is a main process that fork()s into multiple children, and some of the child processes grow out of control. Although perhaps this is handled better within the cgroup?

Teaching the OOM killer about control groups
by vbabka, Mon, 30 Jul 2018 07:42:43 +0000
https://lwn.net/Articles/761250/

AFAIU this is not about a cgroup's memory limit, but about a system-wide OOM in a situation where there are cgroups. There may be per-cgroup limits, but it's not the case that the sum of those limits equals the system memory, because that would lead to an underutilized system. So instead there is overcommit, and the workloads can tolerate being killed at the level of a whole group.

Teaching the OOM killer about control groups
by dbe, Mon, 30 Jul 2018 05:50:35 +0000
https://lwn.net/Articles/761242/

Would that mean that a cgroup could ignore an OOM notification?

Teaching the OOM killer about control groups
by meyert, Sun, 29 Jul 2018 21:27:40 +0000
https://lwn.net/Articles/761223/

How does this relate to cgroups with the memory controller? Isn't the OOM killer also invoked when the cgroup's memory limit is hit? Couldn't "kill the whole cgroup" then also be implemented in the cgroup itself, by installing a cgroup OOM notification listener?
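[For reference, the cgroup-v1 memory controller does offer the kind of listener meyert asks about: userspace registers an eventfd against memory.oom_control through cgroup.event_control and is woken when the group hits its limit; with oom_kill_disable also set, userspace can handle the OOM itself, which touches dbe's question as well. Below is a minimal sketch of such a listener, assuming the v1 memory controller is mounted at /sys/fs/cgroup/memory and a group named "mygroup"; both paths are illustrative.]

    /* Minimal cgroup-v1 OOM notification listener (editor's sketch).
     * Assumes the v1 memory controller at /sys/fs/cgroup/memory and an
     * existing group "mygroup"; error handling trimmed for brevity. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    int main(void)
    {
        const char *cg = "/sys/fs/cgroup/memory/mygroup"; /* illustrative */
        char buf[256];
        int efd, ofd, cfd;
        uint64_t events;

        efd = eventfd(0, 0);                       /* notification fd */

        snprintf(buf, sizeof(buf), "%s/memory.oom_control", cg);
        ofd = open(buf, O_RDONLY);

        snprintf(buf, sizeof(buf), "%s/cgroup.event_control", cg);
        cfd = open(buf, O_WRONLY);
        if (efd < 0 || ofd < 0 || cfd < 0) {
            perror("setup");
            return EXIT_FAILURE;
        }

        /* Register "<eventfd> <memory.oom_control fd>"; the kernel then
         * signals efd each time this cgroup runs into an OOM. */
        snprintf(buf, sizeof(buf), "%d %d", efd, ofd);
        if (write(cfd, buf, strlen(buf)) < 0) {
            perror("cgroup.event_control");
            return EXIT_FAILURE;
        }

        /* Block until an OOM event fires; at this point a userspace
         * policy could, for example, kill every pid in cgroup.procs. */
        if (read(efd, &events, sizeof(events)) == sizeof(events))
            printf("OOM events in %s: %llu\n", cg,
                   (unsigned long long)events);

        return EXIT_SUCCESS;
    }

[Whether doing the whole-group kill from such a listener is as robust as doing it in the kernel is, of course, exactly what the following comments debate.]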
Killing containers.
by rweikusat2, Sun, 29 Jul 2018 16:09:27 +0000
https://lwn.net/Articles/761213/

Obviously.

But if a control group happens to represent a container, the processes belonging to it will usually be cooperating more closely than unrelated processes also running on the same system. Hence, killing one process running inside a container ends up as a more or less drastically malfunctioning virtual server, while killing all of them would be more like a clean shutdown. I certainly prefer the latter to the former.

There's also another issue: assuming the OOM situation was caused by some software bug or malfunction, chances are that it will happen again fairly quickly, as the "monitoring infrastructure" will seek to reestablish the problematic state. Killing everything in the cgroup should prevent that from happening.

OOM killer and cgroups. Better do it in user-space?
by foom, Sun, 29 Jul 2018 15:15:59 +0000
https://lwn.net/Articles/761211/

I don't think killing all the processes in the cgroup makes anything appreciably easier. Since any of the processes could crash in some other way, you still need whatever monitoring mechanisms you had before.

OOM killer and cgroups. Better do it in user-space?
by darwish, Sat, 28 Jul 2018 23:40:17 +0000
https://lwn.net/Articles/761188/

Yeah, I really like the "kill the whole cgroup" option too (memory.oom_group).

It makes sense especially since systemd puts each "service" in its own cgroup, and killing the whole service sounds much more graceful than picking a single victim process from within it.

OOM killer and cgroups. Better do it in user-space?
by simcop2387, Sat, 28 Jul 2018 21:16:12 +0000
https://lwn.net/Articles/761186/

I think making it aware of the groups for this kind of scheduling isn't too much policy. Since cgroups can affect everything else the kernel does with them (scheduling, I/O, etc.), it makes sense to extend them to this. And adding the "kill the whole group" bit makes administration easier: have your container all in one cgroup and just let the OOM killer take the whole thing out, then your infrastructure restarts it. No need for a complicated heartbeat or some other, more failure-prone signal saying the container isn't functioning properly anymore. That said, anything beyond making the kernel aware of these structures does, I think, make sense to put into something in userspace. I'm not sure how Facebook's setup works, but I could see something that dynamically adjusts the OOM scores to try to keep certain things from being killed while encouraging others; that is going to be a much more complicated policy, since it will be system/service/etc. specific (see the sketch at the end of this thread).

OOM killer and cgroups. Better do it in user-space?
by darwish, Sat, 28 Jul 2018 19:09:32 +0000
https://lwn.net/Articles/761173/

Thanks a lot, Jon, for the great article :-)

Part of me feels that this is _too much policy_ leaking into the kernel code. I guess the Facebook approach of relegating this to user space makes more sense?

https://code.fb.com/production-engineering/open-sourcing-oomd-a-new-approach-to-handling-ooms/

Maybe something similar can be done in systemd (systemd-oomd?), with its standardized user-space configuration options, instead of all of these policy decisions and tunables.

Teaching the OOM killer about control groups
by abk77, Sat, 28 Jul 2018 13:51:04 +0000
https://lwn.net/Articles/761167/

Very nice explanation, thank you! The diagrams/pictures are very useful.
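[On the userspace score-steering idea raised by simcop2387: the existing mechanism such a daemon has to work with is /proc/<pid>/oom_score_adj, which biases the kernel's victim choice in the range -1000 to +1000. A minimal sketch follows; the PIDs and adjustment values are invented for illustration and are not necessarily how oomd itself behaves, since oomd makes its own kill decisions in userspace from pressure and memory statistics.]

    /* Editor's sketch of "dynamically adjust the OOM scores": bias the
     * kernel's victim selection by writing to /proc/<pid>/oom_score_adj.
     * The PIDs and values below are hypothetical; a real policy daemon
     * would derive them from its configuration and runtime statistics. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>

    /* Valid range is -1000 (exempt from OOM kills) to +1000 (preferred
     * victim). */
    static int set_oom_score_adj(pid_t pid, int adj)
    {
        char path[64];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/%d/oom_score_adj", (int)pid);
        f = fopen(path, "w");
        if (!f)
            return -1;
        fprintf(f, "%d\n", adj);
        return fclose(f);
    }

    int main(void)
    {
        /* Hypothetical policy: shield the database, prefer the batch job. */
        if (set_oom_score_adj(1234, -900) || set_oom_score_adj(5678, 500))
            perror("oom_score_adj");
        return EXIT_SUCCESS;
    }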