
Respite from the OOM killer

Posted Sep 30, 2004 11:18 UTC (Thu) by copsewood (subscriber, #199)
Parent article: Respite from the OOM killer

I think in most situations it would make sense for the kernel to extend swap space by creating swap files on any filesystem with free space to which it has write access. OK, there will be marginal cases where there is overcommitted memory and insufficient free disk space to prevent serious thrashing, due to the shrinking free disk space this approach causes. However, system admins who want a more reliably-performing system have long known that they need to provide adequate memory and disk resources in any case. Why not have swap space dynamically extensible in this manner?
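For illustration, this is roughly what such extension amounts to in user space today. It is only a sketch of the manual steps (the path and size are arbitrary, mkswap is run as an external command, and root privileges are required), not the in-kernel mechanism being proposed:

    /* Rough sketch of adding a swap file at run time (needs root).
     * mkswap(8) is run as an external command; the swap signature
     * format is its business. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/swap.h>

    static int add_swap_file(const char *path, int megabytes)
    {
        const ssize_t chunk = 1 << 20;          /* write 1 MiB at a time */
        char *buf = calloc(1, chunk);
        if (!buf)
            return -1;

        int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (fd < 0) {
            free(buf);
            return -1;
        }

        /* Write real zeroed blocks so the file has no holes. */
        for (int i = 0; i < megabytes; i++) {
            if (write(fd, buf, chunk) != chunk) {   /* disk full, most likely */
                close(fd);
                unlink(path);
                free(buf);
                return -1;
            }
        }
        close(fd);
        free(buf);

        char cmd[512];
        snprintf(cmd, sizeof cmd, "mkswap %s >/dev/null", path);
        if (system(cmd) != 0) {
            unlink(path);
            return -1;
        }

        if (swapon(path, 0) < 0) {              /* hand it to the kernel */
            perror("swapon");
            unlink(path);
            return -1;
        }
        return 0;
    }

    int main(void)
    {
        return add_swap_file("/var/tmp/extra-swap.0", 256) ? 1 : 0;
    }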



Respite from the OOM killer

Posted Sep 30, 2004 12:54 UTC (Thu) by nix (subscriber, #2304) [Link]

This means that any malicious user without sufficiently vicious ulimits can exhaust not just memory but disk space as well, even on disks he can't write to.

Is this entirely wise?

Respite from the OOM killer

Posted Sep 30, 2004 13:39 UTC (Thu) by fergal (guest, #602) [Link]

Right now, such a user would cause random processes to be killed, which is arguably worse. Well-written programs can gracefully handle a lack of disk space; they cannot gracefully handle being killed by the OOM killer.
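The asymmetry is concrete: a full disk shows up as an error return (ENOSPC) that the program can check and act on, while the OOM killer delivers SIGKILL, which cannot be caught. A minimal sketch (the file path and messages are only illustrative):

    /* A full disk is reported as an error the program can act on; the
     * OOM killer's SIGKILL arrives with no chance to react at all. */
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    static int save_document(const char *path, const char *data, size_t len)
    {
        FILE *f = fopen(path, "w");
        if (!f)
            return -1;

        if (fwrite(data, 1, len, f) != len || fflush(f) == EOF) {
            if (errno == ENOSPC)
                fprintf(stderr, "disk full: keeping the data in memory "
                                "and asking the user to free space\n");
            fclose(f);
            return -1;
        }
        return fclose(f);            /* close can report ENOSPC too */
    }

    int main(void)
    {
        const char text[] = "unsaved work\n";
        return save_document("/tmp/example-save.txt", text, sizeof text - 1)
               ? 1 : 0;
    }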

Respite from the OOM killer

Posted Sep 30, 2004 15:32 UTC (Thu) by hppnq (guest, #14462) [Link]

> However, system admins who want a more reliably-performing system have long known that they need to provide adequate memory and disk resources in any case.

And that they should forbid overcommitting memory. ;-)

Respite from the OOM killer

Posted Oct 2, 2004 20:35 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

> However, system admins who want a more reliably-performing system have long known that they need to provide adequate memory and disk resources in any case.
> And that they should forbid overcommitting memory. ;-)

Does forbidding overcommitting memory make a more reliably performing system? When you forbid overcommitting memory, all you do is make a different process fail at a different time. A process that's minding its own business, using a small amount of memory and doing something very important, fails when its fork() gets "out of memory." And this happens even though there's only a 1% chance that letting the fork() go through would lead to trouble. And it happens to dozens of applications while one broken application sucks up all the virtual memory resources.

But in the overcommitting case, the program would work fine, and 1% of the time some other process, one which is likely to be doing something unimportant and/or to be the cause of the memory shortage, dies.

I think you could say the overcommitting system is performing more reliably.
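Concretely, under strict accounting the failure lands on whichever process next asks for address space: a fork() in a well-behaved program comes back with -1 and errno set to ENOMEM (or EAGAIN), and the caller has to cope, as in this sketch:

    /* Under strict (no-overcommit) accounting, fork() of a large process
     * can fail even though the child would have touched almost nothing. */
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid < 0) {
            /* ENOMEM: the kernel refused to commit another copy of our
             * address space, however unlikely we were to use it. */
            fprintf(stderr, "fork: %s\n", strerror(errno));
            return 1;
        }
        if (pid == 0) {
            execlp("true", "true", (char *)NULL);
            _exit(127);
        }
        waitpid(pid, NULL, 0);
        return 0;
    }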

Respite from the OOM killer

Posted Oct 4, 2004 17:22 UTC (Mon) by jzbiciak (subscriber, #5246) [Link]

...and both are broken.

But, just like my car, which currently idles rough, has an exhaust leak, and the "service engine" light's on, it still gets me to and from work.

The difference is in the failure mode. Do you degrade gracefully, or do you start to blow up at the first sign of error? If you're a user, you probably want graceful degradation--you can tolerate some excessive swapping to a point, and if it gets too bad, you reboot. At least OOo didn't implode taking your document with it. If you're a developer, you probably want to know ASAP something's wrong so you can fix it.

Thankfully, my car still runs (albeit not entirely happily), rather than flashing "service engine" and shutting down.

Respite from the OOM killer

Posted Oct 5, 2004 3:50 UTC (Tue) by mbp (guest, #2737) [Link]

OK, graceful degradation is nice. But it's hard to tell whether overcommit helps or hurts.

Even car designers have this problem: some modern cars will refuse to start if the engine is getting into a state where there is a chance of permanent damage. If it's approaching zero oil pressure, I think I would rather have an electronic cutout than an engine seizure.

Respite from the OOM killer

Posted Oct 6, 2004 15:51 UTC (Wed) by giraffedata (subscriber, #1954) [Link]

> If you're a user, you probably want graceful degradation--you can tolerate some excessive swapping to a point, and if it gets too bad, you reboot. At least OOo didn't implode taking your document with it.

True, but that's not an option with either of the cases being discussed -- overcommit or no overcommit. This choice comes into play only when there's no place left to swap to.

The no-overcommit case can cause OOo to fail more gracefully. If OOo is written to tolerate a failed fork (et al) and give you the chance to kill some other process and then save your document, then no-overcommit could be a good thing for OOo.

On the other hand, if you don't have the technical skills to find the right process to kill, you're going to have to reboot anyway and lose your document. By contrast, with overcommit, you probably wouldn't have lost the document. Either there never would have been a crisis in the first place, or the OOM killer would have killed some other process and let OOo continue normally.
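A sketch of what "tolerate a failed fork (et al)" could look like at the application level; the allocation wrapper and emergency_save() are hypothetical names for illustration, not anything OOo actually does:

    /* Hypothetical sketch of "tolerate a failed allocation": on failure,
     * save the user's work before anything else. */
    #include <stdio.h>
    #include <stdlib.h>

    static void emergency_save(void)
    {
        /* ... serialize the open document to a recovery file ... */
        fputs("out of memory: wrote a recovery copy of your document\n",
              stderr);
    }

    static void *xmalloc(size_t n)
    {
        void *p = malloc(n);
        if (!p) {
            /* Only possible because the failure was reported as NULL,
             * not delivered as an uncatchable SIGKILL. */
            emergency_save();
            exit(1);
        }
        return p;
    }

    int main(void)
    {
        char *p = xmalloc(1 << 20);
        p[0] = 0;
        free(p);
        return 0;
    }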

Respite from the OOM killer

Posted Oct 8, 2004 20:05 UTC (Fri) by tmcguire (guest, #25295) [Link]

You know, back in the old days when I was working with AIX (3.2.5, if that means anything to you), it had the policy of overcommitting memory and then randomly killing processes when it discovered that it was out.

Of course, the process that it seemed to kill first was always inetd, which made the system completely useless and didn't take up many resources anyway. So AIX had to go on and kill other stuff, too.

And naturally system calls like sbrk would never fail, so no application had any opportunity to gracefully handle any problems. But then, no developer ever had the interest or the incentive to actually handle errors, so the situation was nicely symmetrical.

One of the programs best known for allocating memory and then not using it (in large amounts) was the X server, so turning off overallocation wasn't really an option. Fixing *that* bug probably wasn't an option either.

It's always nice to see modern systems learning from their elders, so to speak. But if Linux is going to repeat previous mistakes, it really should go all the way. It's much more fun. Or has someone introduced SIGDANGER*, and I've just missed the memo?

* The SIGDANGER signal was sent to all processes just before the OOM killer started to work. Theoretically, a process handling SIGDANGER could reduce its memory allocation. If it had time. And if the programmer wanted to. The inetd maintainers apparently didn't. Or, a process could have a handler for SIGDANGER and then just ignore it; the OOM killer would skip any process that handled SIGDANGER.
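For the curious, installing such a handler looks roughly like this. SIGDANGER exists on AIX but not on Linux, so this sketch only does anything where the platform defines it:

    /* AIX-flavoured sketch: SIGDANGER is not defined on Linux, so the
     * handler is only installed where the platform provides it. */
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #ifdef SIGDANGER
    static void on_danger(int sig)
    {
        (void)sig;
        /* Paging space is nearly gone: free caches, drop anything that
         * can be recomputed, or merely exist, since AIX's killer skipped
         * any process with a SIGDANGER handler. */
    }
    #endif

    int main(void)
    {
    #ifdef SIGDANGER
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_danger;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGDANGER, &sa, NULL);
        pause();                     /* wait around to be warned */
    #else
        fputs("no SIGDANGER on this platform\n", stderr);
    #endif
        return 0;
    }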

Respite from the OOM killer

Posted Sep 30, 2004 18:17 UTC (Thu) by mongre26 (guest, #4224) [Link]

Extending swap dynamically has performance issues I would rather avoid.

First you would have to move from a disk partition swap to a disk file swap. That hurts you because you are reading and writing not from a custom swap file system but from a file simulating a swap file system placed on a regular file system. That means your file system's ability to handle swap-like files will impact swap performance.

Also, when a swap file changes size it has to reorganize itself somewhat. This can hurt performance while the swap is resizing. If the swap resizes too much, you waste a lot of disk activity unnecessarily.

Microsoft does dynamic swap allocation, and I think it is one of the weaknesses of the OS. For performance reasons I lock my swap file to a specific min and max so it does not do this on my gaming system.

In all cases I simply allocate sufficient swap to handle all my RAM and then some, often 4-8GB of swap. Since disks are massive these days committing 2-5% of the main disk for swap is not a problem.

Just allocate enough swap at install time to accommodate your expectations. If you really find you need more swap, add another disk and make another swap partition. Linux can use multiple swap partitions. You may even get a performance boost by load balancing swap across spindles.
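The load balancing mentioned here is something the kernel does on its own when swap areas share the same priority. A sketch using the swapon(2) priority flags (the device names are examples only; the same effect comes from swapon -p or pri= entries in /etc/fstab):

    /* Two swap areas at the same priority are used round-robin, spreading
     * swap I/O across both disks.  The device names are examples only. */
    #include <stdio.h>
    #include <sys/swap.h>

    int main(void)
    {
        int prio  = 5;
        int flags = SWAP_FLAG_PREFER |
                    ((prio << SWAP_FLAG_PRIO_SHIFT) & SWAP_FLAG_PRIO_MASK);

        if (swapon("/dev/sda2", flags) < 0)
            perror("swapon /dev/sda2");
        if (swapon("/dev/sdb2", flags) < 0)
            perror("swapon /dev/sdb2");
        return 0;
    }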

Respite from the OOM killer

Posted Oct 1, 2004 15:48 UTC (Fri) by Luyseyal (guest, #15693) [Link]

> First you would have to move from a disk partition swap to a disk file swap. That hurts you because you are reading and writing not from a custom swap file system but from a file simulating a swap file system placed on a regular file system. That means your file system's ability to handle swap-like files will impact swap performance.

Actually, this is not true anymore. On recent kernels (2.4+), swap file performance is exactly the same as swap partition performance, because the kernel bypasses the filesystem layer (I believe it relies on the magic of 'dd' creating files without holes). The only annoying thing with swap files is that if you have a large one, it takes a while to initialize on boot.

> Also, when a swap file changes size it has to reorganize itself somewhat. This can hurt performance while the swap is resizing. If the swap resizes too much, you waste a lot of disk activity unnecessarily.

This is true. If you're going to allow overcommit, disallow OOM, and allow the kernel to create new swap files on the fly, it would probably be best done in its own subdirectory, using a number of smaller swap files, as adding a new swap file is probably cheaper than resizing an existing one.
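As a rough sketch of that design done in user space (the directory, file size, threshold, and polling interval are all arbitrary; /var/swap is assumed to exist, and the external commands are run for brevity):

    /* Sketch of a tiny "add swap when it runs low" loop, much like what
     * a user-space daemon could do. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static long swap_free_kb(void)               /* parse /proc/meminfo */
    {
        FILE *f = fopen("/proc/meminfo", "r");
        char line[128];
        long kb = -1;

        if (!f)
            return -1;
        while (fgets(line, sizeof line, f))
            if (sscanf(line, "SwapFree: %ld kB", &kb) == 1)
                break;
        fclose(f);
        return kb;
    }

    int main(void)
    {
        int n = 0;
        char cmd[512];

        for (;;) {
            long free_kb = swap_free_kb();
            if (free_kb >= 0 && free_kb < 64 * 1024) {   /* under 64 MB left */
                /* Add another fixed-size swap file instead of resizing one. */
                snprintf(cmd, sizeof cmd,
                         "dd if=/dev/zero of=/var/swap/swap.%d bs=1M count=128"
                         " 2>/dev/null"
                         " && mkswap /var/swap/swap.%d >/dev/null"
                         " && swapon /var/swap/swap.%d", n, n, n);
                if (system(cmd) == 0)
                    n++;
            }
            sleep(10);
        }
    }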

As an aside, I recently converted a swap partition to a swap file on a dedicated ext2 partition so I could use e2label to make swap still work in case of SCSI name reordering. Since performance is identical -- except the longer boot time -- it was worth it.

Cheers!
-l

Respite from the OOM killer

Posted Oct 7, 2004 23:29 UTC (Thu) by nobrowser (guest, #21196) [Link]

> it would probably be best done in its own subdirectory using
> a number of smaller swap files as adding a new swap is probably
> cheaper than resizing an existing one.

It already exists. Look for swapd in the ibiblio archive.
The response was completely underwhelming when it was young (long ago),
and I don't see why that should have changed.

Respite from the OOM killer

Posted Oct 2, 2004 20:21 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

> In all cases I simply allocate sufficient swap to handle all my RAM and then some,

If by RAM you mean physical memory, as most people do, swapping doesn't handle RAM -- it handles virtual memory. So the correct calculation isn't amount of RAM plus something; it's (roughly) amount of virtual memory minus amount of RAM. For example, workloads that peak at 3GB of virtual memory on a machine with 1GB of RAM call for roughly 2GB of swap. And there's no practical way to put a limit on the amount of virtual memory. The best you can do normally is watch your swap space and, when you see it approaching full, add more.

> Just allocate enough swap at install time to accommodate your expectations.

I agree that is the policy for which Linux is designed. The OOM killer is specifically to handle the pathological case that your expectations are exceeded. That's always a possibility. Consider program bugs.

The question of what to do when, despite the sysadmin's best intentions, the system runs out of virtual memory resources, is a difficult one; the various approaches (kill the system, kill some memory-using process, fail any attempt to get more virtual memory, make new swap space, etc.) all have their advantages and disadvantages.

