Overcommit and OOM killer are totally orthogonal issues. You can exhaust memory on a system without overcommit, too.
Toward reliable user-space OOM handling
Posted Jun 8, 2013 3:32 UTC (Sat) by tstover (guest, #56283)
- Without overcommitted memory, userspace programs know when they are out of memory: malloc() returns NULL, or say mmap() fails.
- With overcommitted memory, userspace does not learn of the condition until memory is already exhausted, hence the need for a userspace OOM-killer scheme.
My point is that either way is still user-space memory management, so why not use the less complex model? My guess is that higher-level userspace can't (i.e., Java).
Posted Jun 8, 2013 4:04 UTC (Sat) by mjg59 (subscriber, #23239)
Posted Jun 8, 2013 10:13 UTC (Sat) by smcv (subscriber, #53363)
I'm sure libdbus still has bugs in OOM situations (that are, in practice, never reached), despite our best efforts. I certainly fixed several, years after Havoc's blog post.
Posted Jun 8, 2013 20:34 UTC (Sat) by giraffedata (subscriber, #1954)
Also, the OOM killer does a better job at picking a process to kill. malloc()==NULL just hits whoever is the first to ask for memory after it has been exhausted.
Another approach is rlimits. It is traditional in Linux to give every process unlimited virtual address space, but you can limit it per-process. I massively overcommit, but have never seen OOM killer because I use rlimits, mostly set at half of the system's virtual memory. I normally have just one process at a time running amok, and that process gets malloc()==NULL and dies.
It's worth noting that at least half the time, the death is not graceful but a segfault: half of those cases because no effort was put into handling the out-of-memory situation, and the other half because the programmer thought malloc() cannot fail on Linux.
Seems like there ought to be an option to have rlimit violation just generate a signal.
Posted Jun 9, 2013 0:17 UTC (Sun) by nix (subscriber, #2304)
(I use rlimits, too, and have also never seen OOM, despite all that 'git gc' could do.)
OOM and rlimits
Posted Jun 9, 2013 1:57 UTC (Sun) by giraffedata (subscriber, #1954)
IIRC, if you exceed rlimits, you don't get malloc()==NULL
I can assure you that at least sometimes, you do. A few years back, I was having inexplicable SIGSEGVs when trying to play an audio CD with Mplayer. I finally traced through the code of Cdparanoia (used by Mplayer) and found 1) it was using far more memory than it ought to have used; and 2) it was ignoring malloc()==NULL and dereferencing the null pointer.
I fixed it, to get on with my project, and then sent the patch to the maintainer of Cdparanoia, who rejected it saying that malloc()==NULL is a myth because of infinite overcommit. (I told him I was actually experiencing the problem, but I don't know if he believed me).
It's always struck me as a crazy thing to do for e.g. cpu-time rlimits,
You mean SIGKILL is a crazy thing to do for cpu-time rlimits? If so, what would you do instead?
A lot of times the problem isn't just that the error paths weren't tested, but they were never written because it's too tedious.
For the rlimits other than CPU time, I guess the ideal design would be that the program can declare its ability to deal with an "out of resource" system call failure, and if it doesn't so declare, SIGKILL.
Posted Jun 9, 2013 3:31 UTC (Sun) by dlang (subscriber, #313)
Posted Jun 9, 2013 21:32 UTC (Sun) by giraffedata (subscriber, #1954)
if you exceed CPU limits, I would expect that the result would be that you get scheduled less frequently so that you drop back below the limits, not that you get killed.
CPU time rlimits as we know them have always been total CPU time, as opposed to CPU utilization, so getting killed is pretty much the only option. But I've often thought that a more useful resource limitation for some processes would be a rate limitation. And I usually dream about having rate limits for other things too (e.g. you get 5 gigabyte-minutes per minute of real memory).
One way to implement CPU rate limit would be to keep the existing CPU time rlimit but just have the kernel stop scheduling the process indefinitely when the process goes over. Then some resource allocator process could raise the process' CPU time limit on a regular basis.
Posted Jun 9, 2013 22:48 UTC (Sun) by anselm (subscriber, #2796)
It seems to me that control groups (cgroups) rather than resource limits may be what you are looking for.
Posted Jun 10, 2013 15:52 UTC (Mon) by alankila (guest, #47141)
The reason this wouldn't be completely horrible from a user-experience point of view is that you would generally put a decent amount of swap on your system, say 100% of the size of your physical memory. If these supposed per-process limits are even reasonably accurate, the system will be severely paging before it hits any issue. The upshot of this exercise would be the elimination of the OOM killer, which I really don't like. It is a dynamic answer to a problem that I think only has "static" real answers: ahead-of-time contemplation of the maximum usage possible, and limiting of the resources for individual programs according to their expected maximum needs.
I don't expect this to happen, though. I get the sense that this sort of thing was what people did before, and it was horrible, so now we just throw enough hardware at the task, and the most advanced of us set some ulimits, and that's how the problem is solved.
Posted Jun 10, 2013 15:57 UTC (Mon) by mjg59 (subscriber, #23239)
Posted Jun 10, 2013 20:14 UTC (Mon) by alankila (guest, #47141)
Posted Jun 10, 2013 16:57 UTC (Mon) by giraffedata (subscriber, #1954)
Yes, if people aren't even willing to earmark the virtual memory at malloc time, they certainly wouldn't be willing to earmark it at program startup time.
Of course, some (rare) people are willing to earmark at malloc time (they turn off overcommit), and some of them would probably appreciate being able to earmark it at startup time. I have seen that done with user space memory allocators: the program at startup time (or other convenient decision point) gets from the kernel a large enough allocation of virtual memory for its future needs, then suballocates from that.
This is orthogonal to eliminating the OOM killer. You do that by turning off overcommit. What this adds is that when your program fails randomly because of other processes' memory usage, it fails at startup in code designed to handle that eventuality instead of at some less convenient point where you're trying to allocate 80 bytes to build a string.
Posted Jun 10, 2013 20:17 UTC (Mon) by alankila (guest, #47141)
Posted Jun 10, 2013 21:43 UTC (Mon) by giraffedata (subscriber, #1954)
Touching pages of the preallocated heap ahead of time to prevent OOM later is probably a poor way to go about it, because you're just trading a later necessary OOM kill for a sooner possibly unnecessary OOM kill, and that doesn't seem better. Also, unless every process in the system does this preallocation, even the ones doing it could get killed later by the OOM killer. No one is safe from the OOM killer.
Better just to disable overcommit (/proc/sys/vm/...). Then there's no OOM kill ever, there are no unnecessary page faults, and when the swap space runs out, the programs that preallocate the heap just get an allocation failure as they start up and can terminate gracefully.
Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds