
Solving out-of-memory situations the Linux way

The out-of-memory (OOM) killer is a longstanding source of controversy in Linux development circles. The killer comes into play if the kernel encounters a memory shortage so severe that the ongoing functioning of the system is endangered. Rather than panic or lock up, the kernel brings in the OOM killer, which goes looking for processes to kill. The killer has a complicated set of heuristics built into it in an attempt to have it target the processes that are least likely to be missed. Anybody who has seen the OOM killer in action, however, knows that it can still make unfortunate choices. Choosing a process that is both (1) among the least valuable on the system and (2) a significant part of the memory problem is a difficult task.

As a result of discomfort with this grim reaper lurking within the kernel, and of recently merged VM improvements, the OOM killer has been removed from the 2.4.23 prepatch series.

For 2.6, Rusty Lynch has just posted a different answer that should, perhaps, have been obvious from the beginning. Rather than trying to come up with a set of OOM killer heuristics that works for everybody, Rusty's patch sets up a notifier-based mechanism that allows for pluggable OOM killer modules. With this patch, anybody who wants to set up a different response to memory shortages need only write a module implementing that technique.
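To make the idea of pluggable policies concrete, here is a minimal userspace sketch in Python. It is not kernel code and not the interface from the patch: the names `OOMNotifierChain`, `Process`, `largest_rss`, and `panic`, and the rule that the first policy to name a victim wins, are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Process:
    pid: int
    name: str
    rss_kb: int  # resident memory in kilobytes


# A policy inspects the process list and returns a victim, or None to decline.
Policy = Callable[[List[Process]], Optional[Process]]


class OOMNotifierChain:
    """Hypothetical stand-in for a notifier chain; not the kernel API."""

    def __init__(self) -> None:
        self._policies: List[Policy] = []

    def register(self, policy: Policy) -> None:
        self._policies.append(policy)

    def unregister(self, policy: Policy) -> None:
        self._policies.remove(policy)

    def out_of_memory(self, procs: List[Process]) -> Optional[Process]:
        # Ask each registered policy in turn; the first one to name a victim wins.
        for policy in self._policies:
            victim = policy(procs)
            if victim is not None:
                return victim
        return None


def largest_rss(procs: List[Process]) -> Optional[Process]:
    """Toy policy: pick whichever process uses the most memory."""
    return max(procs, key=lambda p: p.rss_kb, default=None)


def panic(procs: List[Process]) -> Optional[Process]:
    """Toy policy: refuse to choose and bring the (simulated) system down."""
    raise SystemError("out of memory: panicking instead of killing")
```

Swapping policies is then a matter of calling `register()` and `unregister()`, which is the property the pluggable approach is after.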

The patch includes the standard OOM killer, along with an example alternative which simply panics the system. But there is already talk of creating OOM killer modules implementing different policies. One, which has been posted already, targets processes if they are seen to be forking children which fall victim to the OOM killer; it works on the assumption that the parent is the real source of the problem. A "blame Mozilla" module has been suggested. And Alan Cox has suggested involving the security module code so that a site's security policies can be part of the OOM reaction process.
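The "blame the parent" policy could be sketched roughly as follows. This is a Python simulation under stated assumptions, not the posted module: the class name `BlameParentPolicy`, the `threshold` parameter, and the fallback behavior are hypothetical.

```python
from collections import Counter


class BlameParentPolicy:
    """Track which parents keep losing children to the OOM killer."""

    def __init__(self, threshold: int = 3) -> None:
        # threshold is an assumed tunable: how many killed children it takes
        # before the parent itself becomes the target.
        self.threshold = threshold
        self.kills_by_parent = Counter()

    def record_kill(self, victim_ppid: int) -> None:
        """Note that a child of victim_ppid was just killed."""
        self.kills_by_parent[victim_ppid] += 1

    def choose(self, fallback_pid: int) -> int:
        """Return the worst-offending parent, or fallback_pid if none qualifies."""
        for ppid, kills in self.kills_by_parent.most_common(1):
            if kills >= self.threshold:
                return ppid
        return fallback_pid
```

Until some parent crosses the threshold, the policy defers to whatever victim another heuristic would have picked.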

It's unclear how far this process will go. But pluggable OOM killers are a clear way of ending the long discussion over what the right policy should be. Linux is, after all, about choice, even when the choices are unpleasant.



Solving out-of-memory situations the Linux way

Posted Sep 18, 2003 3:30 UTC (Thu) by metacircles (guest, #8895)

It would be nice if there were some (sane - no, parsing /proc doesn't count) way for userland to get information about memory pressure. This would make it possible for applications to do things like free cached data or tables, run garbage collectors, discard undo information, or whatever. There are probably a lot of apps that could potentially benefit from this - assuming that notification was early enough that there was still enough RAM left to take whatever action they'd need to take (which might itself involve allocating).

A lot of kernel hackers like to talk about the difficulty of predicting what userspace is going to want to do, whether with VM, address space, timeslices, devices, or whatever else. Wouldn't it be easier to extend the channels by which userspace can /tell/ the kernel what it wants to do? madvise is a start, but I'm sure there's more.

Programs voluntarily giving up memory

Posted Sep 19, 2003 19:31 UTC (Fri) by giraffedata (subscriber, #1954)

That's really addressing a rather different problem than the one at which the OOM killer is aimed. The OOM killer is not about optimally allocating a scarce memory resource among competing uses. It's about repairing a broken system.

The system administrator is still supposed to arrange things so that the system never comes close to running out of physical memory. The OOM killer gets involved when a program runs out of control and tries to use an unlimited amount of memory, or when some unplanned-for workload shows up that causes an unplanned memory demand.

So the main goal of an OOM killer is to find the broken program and stop it.

I believe every time you see the OOM killer do its thing, you should treat it as a system failure and put measures in place to avoid the situation in the future (more swap space, lower ulimits, workload throttling, etc.).
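As one illustration of the "lower ulimits" measure, the sketch below caps a child's address space with Python's standard `resource` module, which wraps setrlimit(2). It assumes a Unix system; the helper name `run_with_memory_cap` is made up for this example, and real deployments would more likely set the limit in the shell or a login configuration.

```python
import resource
import subprocess


def run_with_memory_cap(argv, max_bytes):
    """Run argv with its address space soft limit set to max_bytes (Unix only)."""

    def apply_limit():
        # Runs in the child between fork() and exec(), so the limit applies
        # to argv and its descendants but not to the calling process.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))

    return subprocess.run(argv, preexec_fn=apply_limit,
                          capture_output=True, text=True)
```

A runaway program started this way sees its allocations fail once it hits the cap, rather than dragging the whole machine into an OOM situation.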


The other problem, that of relieving the system administrator of responsibility for manually managing memory, is also important. But OOM killers as we know them aren't the solution to that problem.

Solving out-of-memory situations the Linux way

Posted Sep 18, 2003 6:01 UTC (Thu) by lacostej (guest, #2760)

I like that! It's probably not an alternative, but a new signal could be nice, such as kill -SIGMEM or something similar.

Solving out-of-memory situations the Linux way

Posted Sep 18, 2003 7:42 UTC (Thu) by spitzak (guest, #4593)

One thing I think would help is to have processes themselves say "I can be killed if OOM". I would think a lot of processes (e.g. games and anything that can restore its state) could and would do this.

Solving out-of-memory situations the Linux way

Posted Sep 18, 2003 8:08 UTC (Thu) by yodermk (subscriber, #3803)

Good idea.

I'm not a kernel guru, but how about just an environment variable? Something like OOM_KILLME=x, where x is a number from -10 to 10. The kernel can see the env of any process, so in an OOM condition it would first go through all the processes that have OOM_KILLME=10. If nothing has been freed (or it still needs more), it would repeat with OOM_KILLME=9, etc. Processes with no variable set would be like OOM_KILLME=0. Then it would get into negative numbers.

So if you were running a process that you know should be nice, and isn't critical, run it with a positive value.

Most user processes are probably OK with no variable set. Important services like your PostgreSQL server should probably have OOM_KILLME=-5 or less.

Probably, only processes owned by root would allow a negative value to take effect, just like nice().
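The ordering proposed above can be sketched in a few lines of Python. This only simulates the selection logic: `Proc`, `killme_level`, and `kill_order` are hypothetical names, and treating unparsable values as 0 and clamping to the -10..10 range are assumptions that go beyond what the proposal states.

```python
from typing import Dict, List, NamedTuple


class Proc(NamedTuple):
    pid: int
    uid: int
    env: Dict[str, str]


def killme_level(proc: Proc) -> int:
    """Effective OOM_KILLME value for a process."""
    try:
        level = int(proc.env.get("OOM_KILLME", "0"))
    except ValueError:
        return 0  # assumption: garbage values count as unset
    level = max(-10, min(10, level))  # assumption: clamp out-of-range values
    if level < 0 and proc.uid != 0:
        return 0  # negative values only take effect for root, as with nice()
    return level


def kill_order(procs: List[Proc]) -> List[Proc]:
    """Processes in the order they would be considered: 10 first, -10 last."""
    return sorted(procs, key=killme_level, reverse=True)
```

Because the sort is stable, processes at the same level keep their original relative order; a real implementation would need some tie-breaker of its own, such as memory use.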


Pluggable choice is good, as long as the default OOM killer kills jobs

Posted Sep 18, 2003 13:28 UTC (Thu) by cfischer (subscriber, #3983)

A pluggable module is good, and choice is good, but it is absolutely
necessary IMHO that the default kernel applies a default policy and
kills well-picked jobs in an OOM situation.

Rationale: There are multiple ways in which to use a Linux box.
One is particularly affected by the OOM situation: the case
of remotely administered boxes (and I mean really remote) and
multiple users running unpredictable processes.

This could be a machine in a large batch queue machine pool, running
memory-intensive numerical simulations. Or it could be your remote web
server on the Internet where a bunch of your friends can login, and
could potentially run memory-intensive applications, or you run a
beta version of some new server application that suddenly goes wild.

In those situations, if the kernel makes a reasonable choice and
kills one of the most memory-intensive applications, the OS will
survive, and you can login remotely and fix the problem. But if
the kernel panics, you can't do anything remotely.

If you want something different, fine, go ahead and implement a module
or user-space-daemon. But taking the default OOM killer out of the
kernel will make a huge difference for remote admins who encounter
OOM situations, and a difference for the worse.

Pluggable choice is good, as long as the default OOM killer kills jobs

Posted Sep 18, 2003 18:16 UTC (Thu) by NAR (subscriber, #1313)

A pluggable module is good, and choice is good, but it is absolutely necessary IMHO that the default kernel applies a default policy and kills well-picked jobs in an OOM situation.

If the default policy manages to kill your sshd, you're out of the box as well. I think, if you have the choice to set up OOM policy, you set it up when you install the machine along with the netfilter rules, etc., so it doesn't really make your job harder.

Bye, NAR

Pluggable choice is good, as long as the default OOM killer kills jobs

Posted Sep 19, 2003 1:46 UTC (Fri) by dlang (✭ supporter ✭, #313)

Anything this critical needs a watchdog to restart it if it gets killed. If you can't do anything else, make your server not detach from the controlling terminal (probably with a debug option) and start it via inittab so that init will respawn it if it dies.

You don't want to do this with everything, but something like sshd for a remotely administered box is a perfect use for this.
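The respawn behavior described here amounts to running the server in the foreground and starting it again whenever it exits. A minimal Python sketch of that loop follows; the function name `respawn` and the `max_restarts` bound are assumptions added so the example terminates, whereas init itself keeps respawning.

```python
import subprocess
import time


def respawn(argv, max_restarts, delay=0.0):
    """Run argv in the foreground, restarting it each time it exits.

    Returns the list of exit codes, one per run. The child must not
    daemonize, or the wait would return immediately.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        # subprocess.call blocks until the child exits, whatever the cause.
        exit_codes.append(subprocess.call(argv))
        time.sleep(delay)  # brief pause so a crash loop does not spin the CPU
    return exit_codes
```

With sysvinit, the same effect comes from an inittab entry whose action field is `respawn`, pointing at the server invoked with its stay-in-foreground option.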

Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds