When you get an out-of-memory error in an overcommitted state, there is no way to free up resources or to ask the user how to proceed (and that would generally require allocations anyway). What happens instead is that the OOM killer wakes up and kills a semi-random process, and you have no say in it at all.
For the case where you have resources that can be safely freed in an out-of-memory situation, the right thing to do is not to report OOM from the allocation at all, but rather to have some kind of signal for memory pressure when memory is tight (but not full). Apps could then handle this by cleaning up caches and other resources. That way you will not run into the OOM killer problem.
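To make the idea concrete, here is a minimal sketch of the application side, assuming some mechanism exists that tells the process memory is getting tight. The names cache_add, cache_size and on_memory_pressure are made up for illustration, and how the notification is actually delivered is deliberately left out. The point is only that the handler frees memory and never allocates.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical in-process cache: a singly linked list of droppable buffers. */
struct cache_entry {
    struct cache_entry *next;
    void *data;
    size_t size;
};

static struct cache_entry *cache_head;
static size_t cache_bytes;

/* Add a buffer of `size` bytes to the cache. Returns 0 on success, -1 on failure. */
int cache_add(size_t size)
{
    struct cache_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->data = malloc(size);
    if (e->data == NULL) {
        free(e);
        return -1;
    }
    e->size = size;
    e->next = cache_head;
    cache_head = e;
    cache_bytes += size;
    return 0;
}

/* Number of payload bytes currently held by the cache. */
size_t cache_size(void)
{
    return cache_bytes;
}

/*
 * Hypothetical callback: whatever delivers the "memory is tight" signal
 * would call this. It only frees and never allocates.
 * Returns the number of payload bytes released.
 */
size_t on_memory_pressure(void)
{
    size_t released = 0;

    while (cache_head != NULL) {
        struct cache_entry *e = cache_head;
        cache_head = e->next;
        released += e->size;
        free(e->data);
        free(e);
    }
    cache_bytes = 0;
    return released;
}
```

A real cache would probably drop entries selectively (oldest first, say) rather than everything at once, but the shape is the same.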
There is one kind of allocation failure that is not OOM-killer related though, and that's where a single allocation is larger than the physical memory or the mappable region. This can happen, for instance, if you're reading in some random user file (say an image) and it happens to decode to an 8-gigabyte array (maybe because it's an exploit, or just large). In these kinds of situations I think it makes sense to check for allocation failures, and glib does in fact have a call for that (g_try_malloc).
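A rough sketch of what checking looks like for the image case, using plain malloc so it stands alone; with glib you would call g_try_malloc in the same spot. The helper name try_alloc_pixels and its parameters are invented for this example. Note that the size computation needs its own overflow check, since a wrapped multiplication would make the allocation "succeed" with a buffer that is far too small.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate width * height * bytes_per_pixel bytes
 * for untrusted image dimensions. Returns NULL if any dimension is zero,
 * if the size computation would overflow size_t, or if malloc fails.
 */
void *try_alloc_pixels(size_t width, size_t height, size_t bytes_per_pixel)
{
    size_t pixels;

    if (width == 0 || height == 0 || bytes_per_pixel == 0)
        return NULL;
    if (width > SIZE_MAX / height)
        return NULL;                 /* width * height would overflow */
    pixels = width * height;
    if (pixels > SIZE_MAX / bytes_per_pixel)
        return NULL;                 /* total byte count would overflow */

    /* The caller must check for NULL and reject the file on failure. */
    return malloc(pixels * bytes_per_pixel);
}
```

On an overcommitting system a large but still mappable request may well succeed here and only cause trouble when the pages are touched, so this catches the "absurdly large" case, not every case.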
However, in most cases (like allocating internal, known-sized objects) I'm purely in the abort-on-OOM school, since adding all the complexity (both to your code and to users of your library) means more bugs, and doesn't help anyway (since OOM doesn't get reported; the kernel just kills some process instead). As David said in the article, there are of course exceptional situations, like core system software (init, dbus, etc.), where we can't just have it die and where the complexity is worth it.
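For reference, the abort-on-OOM approach usually boils down to a wrapper along these lines. This is the conventional xmalloc idiom written out as a sketch, not glib's actual implementation:

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate `size` bytes or abort the process. Callers never see NULL,
 * so they need no error path. A zero-byte request is rounded up to one
 * byte so that a NULL from malloc always means failure.
 */
void *xmalloc(size_t size)
{
    void *p = malloc(size ? size : 1);

    if (p == NULL) {
        fputs("out of memory\n", stderr);
        abort();
    }
    return p;
}
```

Every caller then loses its error-handling branch, which is where the reduction in complexity comes from.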
Posted Jun 28, 2011 17:48 UTC (Tue) by xtifr (subscriber, #143)
> the oom killer wakes up
Assuming A) you have an OOM killer, and B) it hasn't been thoroughly disabled. If you're writing a _general purpose_ library, neither is really a valid assumption, though both are possibilities you should stay aware of. Aside from that quibble, I basically agree with you, but I'll note that writing libraries for embedded systems comes with a whole additional set of complications of its own. (Basically, my advice would be to not try unless you or someone on your team has some expertise with embedded systems.)
Zeuthen: Writing a C library, part 1
Posted Jun 30, 2011 15:23 UTC (Thu) by nix (subscriber, #2304)
I'm not sure you *can* completely disable overcommit. Robust Unix programs theoretically have to assume that they might get killed at any instant, either due to OOM in something like the stack (which obviously cannot be trapped), or due to a user sending them a kill signal.
Alas the latter is rare (and misbehaviour might be expected if you kill something maintaining persistent state while it is updating that state), and the former is so rare and so hard to cater to that simply nobody ever bothers. Sufficiently Paranoid Programs could avoid the stack-OOM by doing a massive deep recursion early on, to balloon their stack out to the maximum they might need. A few programs do this. You can avoid being user-killed by being installed setuid or setgid, but this has other disadvantages and is basically never done (at least not solely for this reason).
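The stack-ballooning trick might look roughly like this. It is a sketch only: prefault_stack and FRAME_BYTES are names invented here, the byte count is approximate because each frame carries some overhead beyond the buffer, and the requested depth has to stay under the process's stack limit or the recursion itself will crash.

```c
#include <stddef.h>

#define FRAME_BYTES 4096

/*
 * Recurse until roughly `bytes` of stack have been written to, so the
 * pages are already mapped before the program starts real work.
 * Returns the number of frames used.
 */
size_t prefault_stack(size_t bytes)
{
    volatile char frame[FRAME_BYTES];
    size_t deeper = 0;

    /* Write to both ends of the buffer so the writes cannot be elided. */
    frame[0] = 1;
    frame[FRAME_BYTES - 1] = 1;

    if (bytes > FRAME_BYTES)
        deeper = prefault_stack(bytes - FRAME_BYTES);

    /* Reading frame[0] after the call keeps this from becoming a tail call. */
    return deeper + (size_t)frame[0];
}
```

Called once early in main with the maximum depth the program expects to need, e.g. prefault_stack(64 * 1024), which uses 16 frames with the constants above.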
This is probably a fault of some kind in POSIX, but I have not even the faintest glimmerings of a clue as to how to fix it.
Zeuthen: Writing a C library, part 1
Posted Jul 1, 2011 9:46 UTC (Fri) by dgm (subscriber, #49227)
I believe that what really paranoid programs have to do is keep critical state in non-volatile memory (a disk, a remote machine, etc), and do everything possible to ensure that it's always consistent. That way it doesn't matter if the program goes away because of a programming error, a kill signal or the power going down in the middle of a system call.
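One common way to keep on-disk state consistent is the write-to-temporary, fsync, rename pattern, sketched below for POSIX systems. The function name save_state and the ".tmp" suffix are choices made for this example. It leaves out things a production version would want, such as retrying on EINTR and calling fsync on the containing directory.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Replace the file at `path` with `len` bytes from `buf`. The data is
 * written to "<path>.tmp", flushed with fsync, and then renamed over
 * the target, so a reader sees either the old contents or the new
 * contents, never a partial write. Returns 0 on success, -1 on failure.
 */
int save_state(const char *path, const void *buf, size_t len)
{
    char tmp[4096];
    const char *p = buf;
    int fd;
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);

    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;                       /* path too long */

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t w = write(fd, p, len);   /* may write fewer bytes than asked */
        if (w < 0) {
            close(fd);
            unlink(tmp);
            return -1;
        }
        p += w;
        len -= (size_t)w;
    }

    if (fsync(fd) != 0) {                /* push the data to stable storage */
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }
    if (rename(tmp, path) != 0) {        /* atomically replace the old file */
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

If the process dies at any point before the rename, the old file is untouched and at worst a stale .tmp file is left behind.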
Zeuthen: Writing a C library, part 1
Posted Jul 1, 2011 13:40 UTC (Fri) by nix (subscriber, #2304)
Yes, probably. And then we can get into fsync() flamewars instead! Isn't POSIX fun?
Zeuthen: Writing a C library, part 1
Posted Jul 3, 2011 23:09 UTC (Sun) by dgm (subscriber, #49227)
We can probably short-circuit the flamewar by using a relational database. Oh, noes! PostgreSQL vs. MySQL anyone? ;-)
Zeuthen: Writing a C library, part 1
Posted Jul 3, 2011 23:40 UTC (Sun) by nix (subscriber, #2304)
And then we can write a nice high-performance FUSE filesystem on top of the relational database! And then we can run MySQL on top of that! (And then we can run the FUSE filesystem atop that, and run a virtual machine inside that filesystem. And then we have a nice room heater.)