
Avoiding the OOM killer with mem_notify


Posted Jan 31, 2008 9:09 UTC (Thu) by njs (guest, #40338)
In reply to: Avoiding the OOM killer with mem_notify by salimma
Parent article: Avoiding the OOM killer with mem_notify

It's possible to take this metaphor of processes fighting over memory too far.  I don't think
in practice app writers consider themselves to have "won" if they've managed to partially
crash the user's system but keep their own process running while they did it.  The goal is to
not invoke the capricious god OOM at all.

Interesting possibility enabled by this patch: userspace OOM killer.  You don't *have* to
reduce memory by freeing caches -- killing other processes is quite effective too :-).  And if
you have a relatively integrated environment like a phone UI, you may know perfectly well from
userspace that killing that java game is better than killing the windowing system, which is
better than killing the gsm daemon.  (Even on desktops, one knows that killing X will also
automatically kill all its clients -- so one should always start by killing those clients
first, because if that works, then you've managed to escape the OOM situation with strictly
less damage.)



Requesting 'real' memory

Posted Jan 31, 2008 11:08 UTC (Thu) by epa (subscriber, #39769) [Link]

The OOM killer is needed because the kernel has overallocated memory.  Surely for critical
processes there is a way to request 'hard' memory, where you can be sure that it really exists
either as RAM or swap space, and you can be certain you're not going to be arbitrarily killed
later for using the memory you requested.  The tradeoff is that a memory allocation request
can fail - but better to have malloc() return 0 where the app can handle it sensibly than to
have it pretend to work and then randomly kill your process later.

Can you turn off overallocation (and OOM killing) on a per-process basis?

Requesting 'real' memory

Posted Jan 31, 2008 11:43 UTC (Thu) by njs (guest, #40338) [Link]

>and you can be certain you're not going to be arbitrarily killed later for using the memory
you requested

Well, here's what makes designing the OOM-killer hard -- attempting to use memory that the
system doesn't have actually *doesn't* kill you, it just wakes up the OOM-killer and then it's
perfectly possible that the OOM-killer will go after someone else.  Imagine a scenario where
one app allocates 99% of the system's memory (and not with some virtually allocated
overallocation bs, like they actually touch the pages or whatever), and then stops.  Then you
try to run "ls", and it overflows that last 1% of memory and wakes up the OOM-killer.  The
OOM-killer tries to identify and then attack the runaway giant process, not ls, even though ls
was the one who tried to get more memory.

So it doesn't actually help much for you, personally, to make sure your memory is not
overallocated -- if anything it will hurt, since it increases memory pressure overall and also
makes your process a bigger target.  You can turn off overallocation globally, but that
doesn't necessarily help either, since in the scenario above it just means that ls (and every
other program you try to run) fails, while the runaway monster just sits there.

Google say you can disable the OOM killer on a per-process basis, though, which I hadn't
known: http://linux-mm.org/OOM_Killer

Requesting 'real' memory

Posted Feb 1, 2008 20:21 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

So it doesn't actually help much for you, personally, to make sure your memory is not overallocated

Right, that's as dodgy as expecting to avoid the OOM Killer pseudo-crash by having the kernel notify users that memory is tight.

What you want is a combination of the two: a process turns off overallocation for itself, and in exchange, is made immune to the OOM Killer.

That way, processes that need determinism can have it while processes that don't want to waste swap space can have that.

Linux doesn't have this. AIX does.

Requesting 'real' memory

Posted Feb 1, 2008 23:55 UTC (Fri) by njs (guest, #40338) [Link]

> What you want is a combination of the two: a process turns off overallocation for itself,
and in exchange, is made immune to the OOM Killer.

I don't see the connection between these.  Turning off overallocation just means that you get
a different error handling API.  It certainly doesn't stop you from running the system out of
memory.

Making a process immune from the OOM killer is clearly a root-level operation; all you have to
do to force allocation is to touch pages after you allocate them, obviously not a root level
sort of ability.

Requesting 'real' memory

Posted Feb 2, 2008 2:00 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

Turning off overallocation just means that you get a different error handling API

How is getting killed by the OOM Killer an error handling API? Turning off overallocation means you get an error handling API where you had none before.

It certainly doesn't stop you from running the system out of memory.

Turning off overallocation for one process doesn't stop you from running the system out of memory; that's why the OOM Killer is still there. But he only kills other processes. The connection is that if Process X is not overallocating memory (swap space), then the OOM Killer is guaranteed to be able to relieve memory pressure without having to kill Process X. You can't say that about an overallocating process.

Think of it as two separate pools of swap space; one managed by simple allocation; the other with optimistic overallocation and an OOM Killer. A process decides which one works best for it.

all you have to do to force allocation is to touch pages after you allocate them,

No, that's not enough. In overallocating mode, the swap space does not get allocated until the kernel decides to steal a page frame. By then, the process is in no position to be able to deal with the fact that there's not enough swap space for him.

Requesting 'real' memory

Posted Feb 3, 2008 1:39 UTC (Sun) by njs (guest, #40338) [Link]

Ah, I see, that's not what overallocation means.  It has nothing to do with swap.  As far as
the memory manager is concerned, the total amount of memory available in the system is RAM +
swap -- if you have 1G ram and 2G swap, then you have 3G total memory.

This 3G total is distributed among processes.  If overallocation is turned off, then each time
a process calls malloc() (well, really mmap()/sbrk()/fork()/etc., but never mind), either some
pages from that 3G are shaved off and reserved for that process's use, or if there are not
enough pages remaining then the syscall fails.

If overallocation is turned on, then malloc() never actually allocates memory.  What it does
instead is set up some virtual pages in the process's address space, and then the first time
the process tries to write anywhere on each of those not-really-there pages, the process takes
a fault, the memory manager allocates one of those 3G of pages, sticks it in place of the fake
page, and then finally allows the write to continue.  The upside of this is that if a process
malloc()'s a big chunk of memory and then only uses part of it, it's as if they only
malloc()'ed exactly what they ended up needing.  The downside is that since the actual
allocation is now happening somewhere in the middle of a single user-space cpu instruction,
there's no way to signal an error back to the process if the allocation fails, and so in that
case the only thing you can do is wake up an OOM-killer.

> The connection is that if Process X is not overallocating memory (swap space), then the OOM
Killer is guaranteed to be able to relieve memory pressure without having to kill Process X.

Which means that this just isn't true.  Memory pressure is caused by actually-allocated pages,
and the only difference between a process using overallocation and one that isn't is that the
overallocating process may have some virtual pages set up to trigger allocation sometime in
the future.  Whether such pages exist has no bearing whatsoever on memory pressure *now*.  The
only way the OOM-killer can relieve memory pressure is to kill off processes that are using
memory, and Process X qualifies.

Requesting 'real' memory

Posted Feb 3, 2008 2:29 UTC (Sun) by giraffedata (subscriber, #1954) [Link]

Though you start out saying overallocation has nothing to do with swap, your second sentence shows that it is strongly related to swap, saying that swap space is half the equation in determining how much memory is available to allocate.

But what overallocation are you talking about? You describe it like some well-defined Linux function. Are you talking about something that Linux implements? AFAIK, Linux does not implement a per-process overallocation mode, and we were talking about what should be.

It's clear how the mode should work: The same way it does in AIX, which is what I described. "Turning off" overallocation has to mean that once you've allocated memory, you can use it and can't be killed for lack of memory. Otherwise, why bother having the mode?

And the way to do that is to allocate swap space to the process at the moment you allocate the virtual addresses to it. You could alternatively permanently allocate some real memory to the process, but that would be really wasteful (and we already have a means to do that: mlockall()).

BTW, it's not helpful to talk about memory not being actually allocated by malloc (brk/mmap/etc). malloc() does actually allocate virtual memory. Allocating swap space and allocating real memory are separate things that exist in support of using virtual memory which has been allocated. I also don't think "virtual" amounts to "fake." It's a different form of memory.

Memory pressure is caused by actually-allocated pages

I assume you mean pages of real memory. Filling up of real memory is the primary cause of memory pressure, but it's easy to relieve that pressure: just push data you aren't using out to swap space and free up the real memory. When swap space is full, that's when the pressure backs up into the real memory and the OOM Killer is needed. That's why I say swap space is the key to giving guaranteed-usable virtual memory to a process.

Requesting 'real' memory

Posted Feb 3, 2008 4:06 UTC (Sun) by njs (guest, #40338) [Link]

Obviously we're totally talking past each other, but I'll try one more time...

>Though you start out saying overallocation has nothing to do with swap, your second sentence
shows that it is strongly related to swap, saying that swap space is half the equation in
determining how much memory is available to allocate.

Well, sure, swap is, by any measure, an important part of a VM system, but that doesn't mean
it's "strongly related" to any other particular part of a VM system.  My point is that from
the point of view of overallocation, the difference between swap and physical RAM is just
irrelevant.

> But what overallocation are you talking about? You describe it like some well-defined Linux
function. Are you talking about something that Linux implements?

Yes.  I'm talking about "overallocation" or "overcommit", which in this context is a technical
term with a precise meaning.  When people are talking about it in this thread, they are
referring to a particular policy implemented by the Linux kernel and enabled by default.
Evidently you haven't encountered this particular design before, which is why I described it
in my previous comment...

>AFAIK, Linux does not implement a per-process overallocation mode, and we were talking about
what should be.

No, AFAIK it doesn't, but it does support a global overallocation/no-overallocation switch,
and it's obvious what it would mean to take that switch and make it process-granular.  Maybe
there's yet another policy that Linux should implement, but if you want to talk about that
then trying to redefine an existing term to do so will just confuse people.

>And the way to do that is to allocate swap space to the process at the moment you allocate
the virtual addresses to it.

Huh, so is this how traditional Unix works?  Is this tight coupling between memory allocation
policy and swap management policy the original source of that old advice to make your swap
space = 2xRAM?  I've long wondered where that "rule" came from.

I guess I've heard before that in traditional Unix all RAM pages are backed on disk somewhere,
either via the filesystem or via swap, but I hadn't thought through the consequences before.

I'm guessing this is the original source of confusion.  Linux doesn't work like that at all;
I'm not sure there's any modern OS that does.  In a system where RAM is always swap-backed,
having 1G RAM and 2G of swap means that all processes together can use 2G total; in Linux,
they can use 3G total, because if something is in RAM it doesn't need to be in swap, and
vice-versa.  (What happens in your scheme if someone is running without swap?  I bet there are
people in this thread who both disable overallocation and run without swap (hi Zooko!).)

"Allocated memory" in my comment really just means "anonymous transient data that the kernel
has committed to storing on the behalf of processes".  It can arrange for it to be stored in
RAM, or in swap, or whatever.  There is no such thing as "allocated swap" or "allocated RAM"
in Linux.  (Except via mlockall(), I guess, if you want to call it that, but I don't think
calling it that is conceptually useful -- it's more of a way to pin some "allocated memory" into RAM.)

Does that make my previous comment make more sense?

Requesting 'real' memory

Posted Feb 4, 2008 9:43 UTC (Mon) by giraffedata (subscriber, #1954) [Link]

but it does support a global overallocation/no-overallocation switch, and it's obvious what it would mean to take that switch and make it process-granular.

That would be only slightly different from the scheme I described. The only difference is that the global switch lets you add a specified amount of real memory to the size of swap space in calculating the quota. If you could stop the kernel from locking up that amount of real memory for other things, you could have the OOM-proof process we're talking about, with less swap space.

I think the only reason I haven't seen it done that way is that swap space is too cheap to make it worthwhile to bring in the complexity of allocating the real memory. If I were to use the Linux global switch, I would just tell it to consider 0% of the real memory and throw some extra disk space at it, for that reason.

What happens in your scheme if someone is running without swap? I bet there are people in this thread who both disable overallocation and run without swap

They don't have that option. The price they pay to have zero swap space is that nothing is ever guaranteed to be free from being OOM-killed. Which is also the case for the Linux users today who disable overallocation and run without swap.

Requesting 'real' memory

Posted Feb 5, 2008 13:44 UTC (Tue) by filipjoelsson (guest, #2622) [Link]

> But what overallocation are you talking about? You describe it like some
> well-defined Linux function. Are you talking about something that Linux
> implements? AFAIK, Linux does not implement a per-process overallocation
> mode, and we were talking about what should be.

I think the overallocation he talks about is on the userspace level. When I'm programming an
application to read, store and analyze data on a laptop (a field application - the user wants
to have preliminary analysis right away), I can make a big fat allocation to use for cache. I
store the data in a database, which also uses a big fat cache. Until now, I have had to
make both caches no larger than roughly a quarter of the size of the RAM, since swapping
out really defeats the point of the cache.

If I had bigger caches, I could just hope that the sum of the actually used memory would not
be larger than RAM (or else it'd start swapping out, or if I run without swap - trigger the
OOM). With this patch, I can safely overcommit - and when I get the notification, I can cut
down on the caches and survive. The analysis step of the app will be slower, but my user will
not have been near a crash - and the analysis would have been slower anyway because of
swapping. One difference is that he can run without swap in a much safer manner. Anyway,
this is not at all a case of a partial crash. You could argue that it is a case of sloppy
programming, but why should I reinvent memory managing? I'd much rather let the kernel do that
for me.

Oh, and lastly - on embedded platforms, you often have to do without swap. Swapping to flash
does not strike me as a good idea.

Requesting 'real' memory

Posted Feb 8, 2008 5:00 UTC (Fri) by goaty (guest, #17783) [Link]

Don't forget about stack! For heavily multi-threaded processes on Linux, stack is usually the
biggest user of virtual memory.

As an aside, I know some operating systems use a "guard page" stack implementation, where
writes into the first page off the bottom (top?) of the stack trigger a page fault, which
allocates another page worth of real memory, and also moves the "guard page". The benefit
being that virtual memory use is much closer to actual memory use, and turning off overcommit
is much more viable. The downside is that the ABI requires the compiler to generate code to
probe every page when it needs to allocate a stack frame larger than the page size. Which
ironically can end up using more real memory than Linux's virtual-memory-hungry approach.

Requesting 'real' memory

Posted Feb 8, 2008 21:23 UTC (Fri) by nix (subscriber, #2304) [Link]

Linux has used a guard-page stack implementation since forever. It's 
transparent to userspace: the compiler doesn't need to do a thing.

(Obviously when threading things get more complex.)

Requesting 'real' memory

Posted Feb 1, 2008 0:01 UTC (Fri) by zooko (guest, #2589) [Link]

You can set vm.overcommit_memory to policy #2.  Unfortunately, it isn't entirely clear
that this will banish the OOM killer entirely, or if it will just make it very rare.

http://www.linuxinsight.com/proc_sys_vm_overcommit_memory...

Requesting 'real' memory

Posted Feb 1, 2008 1:21 UTC (Fri) by zlynx (subscriber, #2285) [Link]

I ran my Linux laptop with strict overcommit enabled for a while.  Unfortunately, it does not
help.  Almost all desktop applications expect memory allocation to succeed.  From some of the
application errors I saw, developers seem to have become very lax about checking for NULL from
malloc.

C++ and Python applications did better, because they get an exception, and they have to do
*something* with it.

Requesting 'real' memory

Posted Feb 1, 2008 3:28 UTC (Fri) by zooko (guest, #2589) [Link]

Even if what you say is true, I would think that this would make the effects of memory
exhaustion more deterministic/reproducible/predictable.

C++ and Python apps, and also C apps that use malloc sparingly, would be less likely to crash
than others, I guess.

Perhaps this degree of predictability isn't enough to be useful.

Requesting 'real' memory

Posted Feb 1, 2008 17:56 UTC (Fri) by zlynx (subscriber, #2285) [Link]

I did not notice any extra predictability.  The effect was that the desktop programs crash
apparently randomly.  It was much like the OOM killer.  And just like the OOM killer, it was
generally the big stuff that blew up, like Evolution and Open Office.  I lost gnome-terminal a
few times.

The C++ and Python apps still crashed, they were simply more polite about it.

By the way, I don't read it that way, but your phrasing "Even if what you say is true" *could*
be offensive.  It seems to be saying that I wrote untruthfully.

Even if you don't see the same effect on your system, I did see it just the way I described it
on mine.

Requesting 'real' memory

Posted Feb 1, 2008 19:34 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

Desktop applications aren't where I would expect to see deterministic memory allocation exploited. Allocation failures and crashes aren't such a big deal with these applications because if things fall apart, there's a user there to pick up the pieces. Overallocation and OOM Killer may well be the optimum memory management scheme for desktop systems.

Where it matters is business-critical automated servers. For those, application writers do spend time considering running out of memory -- at least they do in cases where an OOM killer doesn't make it all pointless anyway. They check the success of getting memory and do it at a time when there is some reasonable way to respond to not getting it.

And they shouldn't spend time worrying about freeing up swap space for other processes (i.e. mem_notify is no good). That resource management task belongs to the kernel and system administrator.

Requesting 'real' memory

Posted Feb 1, 2008 20:14 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

You can set vm.overcommit_memory to policy #2. Unfortunately, it isn't entirely clear that this will banish the OOM killer entirely, or if it will just make it very rare.

It's entirely clear to me that it banishes the OOM killer entirely. The only reason the OOM killer exists is that sometimes the processes use more virtual memory than there is swap space to put its contents. With Policy 2, virtual memory isn't created in the first place unless there is a place to put the contents.

Requesting 'real' memory

Posted Feb 1, 2008 20:47 UTC (Fri) by zooko (guest, #2589) [Link]

But doesn't the kernel itself dynamically allocate memory?  And when it does so, can't it
thereby use up memory so that some user process will be unable to use memory that it has
already malloc()'ed?  Or do I misunderstand?

Requesting 'real' memory

Posted Feb 1, 2008 21:26 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

The kernel reserves at least one page frame for anonymous virtual memory (actually, it's a whole lot more than that, but in theory one frame is enough for all the processes to access all their virtual memory as long as there is adequate swap space).

So any kernel real memory allocation can fail, and the code is painstakingly written to allow it to handle that failure gracefully (more gracefully than killing an arbitrary process). It allocates memory ahead of time so as to avoid deadlocks and failures at a time that there is no graceful way to handle it.

Requesting 'real' memory

Posted Feb 1, 2008 21:50 UTC (Fri) by zooko (guest, #2589) [Link]

Right, but I wasn't asking about the kernel's memory allocation failing -- I was asking about
the kernel's virtual memory allocation succeeding by using memory that had already been
offered to a process as the result of malloc().

Oh -- perhaps I misunderstood and you were answering my question.  Are you saying that the
kernel will fail to dynamically allocate memory rather than allocate memory which has already
been promised to a process (when overcommit_memory == 2)?

Thanks,

Zooko

Requesting 'real' memory

Posted Feb 1, 2008 23:05 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

The kernel doesn't use virtual memory at all (well, to be precise let's just say it doesn't use paged memory at all). The kernel's memory is resident from the moment it is allocated, it can't ever be swapped out, and the kernel uses no swap space.

Requesting 'real' memory

Posted Feb 5, 2008 23:05 UTC (Tue) by dlang (subscriber, #313) [Link]

the problem that you will have when you disable overallocating memory is that when your 200M
firefox process tries to spawn a 2k program (to handle some mime type) it first forks, and
will need 400M of ram, even though it will immediately exec the 2k program and never touch the
other 399.99M of ram.

with overallocation enabled this will work. with it disabled you have a strong probability of
running out of memory instead.

yes, it's more reproducible, but it's also a completely avoidable failure.

Requesting 'real' memory

Posted Feb 6, 2008 3:25 UTC (Wed) by giraffedata (subscriber, #1954) [Link]

I wonder why we still have fork. As innovative as it was, fork was immediately recognized, 30 years ago, as impractical. vfork took most of the pain away, but there is still this memory resource allocation problem, and some others, and fork gives us hardly any value. A fork-and-exec system call would fix all that.

Meanwhile, if you have the kind of system that can't tolerate even an improbable crash, and it has processes with 200M of anonymous virtual memory, putting up an extra 200M of swap space which will probably never be used is a pretty low price for the reliability of guaranteed allocation.

Requesting 'real' memory

Posted Feb 6, 2008 5:26 UTC (Wed) by dlang (subscriber, #313) [Link]

many people would disagree with your position that vfork is better than fork. (the issue came
up on the lkml within the last week and was dismissed with something along the lines of 'vfork
would avoid this, but the last thing we want to do is to push more people to use vfork')

I agree that a fexec (fork-exec) or similar call would be nice to have, but it wouldn't do
much good for many years (until a significant amount of software actually used it)

as for your comment of just add swap space to avoid problems with strict memory allocation.

overcommit will work in every case where strict allocation works without giving
out-of-memory errors, and it will also work in many (but not all) cases where strict
allocation would result in out-of-memory errors.

if it's trivial to add swap space to avoid the OOM errors in strict allocation, that same swap
space can be added along with overcommit and the system will continue to work in even more
cases.

the only time strict allocation will result in a more stable system is when your resources are
fixed and your applications are fairly well behaved (and properly handle OOM conditions). even
then, the scenario of one app allocating 99% of your ram, preventing you from running other
apps, is still a very possible situation. the only difference is that the timing of the OOM
error is more predictable (assuming that you can predict what software will be run when in the
first place)

Requesting 'real' memory

Posted Feb 7, 2008 0:35 UTC (Thu) by giraffedata (subscriber, #1954) [Link]

Many people would disagree with your position that vfork is better than fork

No, they wouldn't, because I was talking about the early history of fork and comparing the original fork with the original vfork. The original fork physically copied memory. The original vfork didn't, making it an unquestionable improvement for most forks. A third kind of fork, with copy-on-write, came later and obsoleted both. I didn't know until I looked it up just now that a distinct vfork still exists on modern systems.

the only time strict allocation will result in a more stable system is when your resources are fixed and your applications are fairly well behaved (and properly handle OOM conditions)

The most important characteristic of a system that benefits from strict allocation is that there be some meaningful distinction between a small failure and a catastrophic one. If all your memory allocations must succeed for your system to meet requirements, then it's not better to have a fork fail than to have some process randomly killed, and overallocation is better because it reduces the probability of failure.

But there are plenty of applications that do make that distinction. When a fork fails, such an application can reject one piece of work with a "try again later" and a hundred of those is more acceptable than one SIGKILL.

Avoiding the OOM killer with mem_notify

Posted Jan 31, 2008 15:12 UTC (Thu) by salimma (subscriber, #34460) [Link]

Nokia has something similar on their Linux-based Maemo platform -- run it without swap, start
a bunch of applications, and a lot of the built-in applications would enter a
reduced-memory-usage mode -- noticeable because it takes much longer to switch to them than it
normally would.

I wonder whether the apps currently just poll the system to find out how much memory is left,
or they have their own mechanism, though.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds