> If you get to the point that you're stealing a page from a process simply because that process is over its quota of real memory, you should steal ALL that process' pages. It can't fit its working set into memory, so it isn't going to make decent progress, so the memory you do give it is wasted.
I suppose I see three cases here. One is that the page was part of the process' working set at an earlier point in time, but no longer is. In that case swapping it out is the right thing to do. The second is that the process is in control, but its working set is bigger than the available memory. Then I agree that there is a good case for putting it on hold until enough memory is available, although that is a non-trivial problem which is somewhat outside of the scope of what I am trying to do. And the third case is the one that I am interested in - a runaway process which will eventually be OOMed. In this case, the quota will stop it from trampling on the working set of every other process in memory in the meantime.
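To illustrate the third case, here is a toy simulation, not kernel code. It assumes a single pool of page frames with plain LRU replacement, and a hypothetical per-process `quota` on resident pages; the function name and all parameters are made up for the example. With a quota, a process that is over its limit steals only from itself.

```python
from collections import OrderedDict

def simulate(accesses, total_frames, quota=None):
    """Replay (pid, page) accesses and return the set of resident pages.

    quota is a hypothetical cap on resident pages per pid; None means
    plain global LRU with no per-process limit.
    """
    resident = OrderedDict()  # (pid, page) keys, least recently used first
    per_pid = {}              # pid -> number of resident pages
    for pid, page in accesses:
        key = (pid, page)
        if key in resident:
            resident.move_to_end(key)  # page hit: refresh its LRU position
            continue
        if quota is not None and per_pid.get(pid, 0) >= quota:
            # Over quota: evict this process' own oldest page.
            victim = next(k for k in resident if k[0] == pid)
        elif len(resident) >= total_frames:
            # Memory full: evict the globally oldest page.
            victim = next(iter(resident))
        else:
            victim = None
        if victim is not None:
            del resident[victim]
            per_pid[victim[0]] -= 1
        resident[key] = None
        per_pid[pid] = per_pid.get(pid, 0) + 1
    return set(resident)

# A well-behaved process "a" touches 4 pages, then runaway "b" touches 100.
accesses = [("a", p) for p in range(4)] + [("b", p) for p in range(100)]
no_quota = simulate(accesses, total_frames=8)
with_quota = simulate(accesses, total_frames=8, quota=4)
```

Without the quota, every page of "a" ends up evicted; with it, "a" keeps its four pages and "b" only churns through its own.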
While we are on the subject, does anyone reading this know where RSS quotas are handled in the current kernel code? I was able to find the original patches enabling them, but the code seems to have changed out of recognition since then.
Posted Nov 9, 2009 12:34 UTC (Mon) by hppnq (subscriber, #14462)
You may want to look at Documentation/cgroups/memory.txt. Otherwise, it seems there is no way to enforce RSS limits. Rik van Riel wrote a patch a few years ago but it seems to have been dropped.
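For what it's worth, a minimal sketch of the memory cgroup approach, assuming the cgroup v1 layout where each group directory exposes `memory.limit_in_bytes` and `tasks` files. The function name, the `/cgroups/runaway` path in the usage note and the limit value are assumptions for illustration. Note that this limit covers page cache as well as RSS, so it is not a pure RSS quota.

```python
import os

def confine(pid, limit_bytes, group_dir):
    """Cap a memory cgroup at limit_bytes and move pid into it.

    group_dir is assumed to be a directory inside a mounted cgroup v1
    memory hierarchy; on a real system this needs root.
    """
    os.makedirs(group_dir, exist_ok=True)
    # Set the group's memory limit first, so it applies before pid joins.
    with open(os.path.join(group_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))
    # Writing a pid to "tasks" moves that task into the group.
    with open(os.path.join(group_dir, "tasks"), "w") as f:
        f.write(str(pid))
```

Usage would be something like `confine(1234, 64 * 1024 * 1024, "/cgroups/runaway")`, assuming the memory controller is mounted at `/cgroups`.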
Personally, I would hate to think that my system spends valuable resources managing runaway processes. ;-)
** Encouragement encouragement encouragement **
Posted Nov 13, 2009 22:32 UTC (Fri) by efexis (guest, #26355)
I (for one) would be most interested in your work. The systems I manage are very binary in whether they behave or not, because I have configured them to behave (i.e., I work out how much memory is available, decide how much to give to database query caching and so on, so everything just works). I try to keep the swap file at around 384 MB whether the system has 1 GB or 8 GB of RAM, because that's a nice size for swapping out stuff that you don't need to keep in memory. Using disk as virtual RAM is just way too slow, though; I'd prefer processes be denied memory requests than have them granted at the cost of slowing the whole system down. All in all, because everything is set up for the amount of memory available, the only time I get into OOM situations is when there is a runaway process. (I manage systems for hosting small numbers of database-driven websites. Some of them may be developed on Windows systems and then moved to the Linux system, and most are written in PHP, which has a very low bar of entry, so developers often do not have a clue when it comes to writing scalable code.)
So, what I would want is something that assumes that most of the system is being well behaved, but will quickly chop off anything that is not, and will stop the badly behaved stuff from dragging the well behaved stuff down with it. The well behaved stuff quite simply doesn't need managing; that's my job. The badly behaved stuff needs taking care of quickly, by something that your idea seems to reflect *perfectly* (it's not often you read someone's ideas and your brain flips "that's -exactly- what I need").
How would I find out if you do get a chance to hammer out the code that achieves this? Is there a non-LKML route to watch this? (Please don't say Twitter :-p )
** Encouragement encouragement encouragement **
Posted Nov 16, 2009 13:45 UTC (Mon) by michaeljt (subscriber, #39183)
Ahem, I haven't thought that far ahead yet :) So far it is one of a few small projects I have lined up for whenever I have time, but I was posting here in order to get some feedback from wiser minds than my own before I made a start.
death by swap
Posted Nov 16, 2009 13:50 UTC (Mon) by michaeljt (subscriber, #39183)
>> If you get to the point that you're stealing a page from a process simply because that process is over its quota of real memory, you should steal ALL that process' pages. It can't fit its working set into memory, so it isn't going to make decent progress, so the memory you do give it is wasted.
>I suppose I see three cases here. One is that the page was part of the process' working set at an earlier point in time, but no longer is. In that case swapping it out is the right thing to do. The second is that the process is in control, but its working set is bigger than the available memory. Then I agree that there is a good case for putting it on hold until enough memory is available, although that is a non-trivial problem which is somewhat outside of the scope of what I am trying to do. And the third case is the one that I am interested in - a runaway process which will eventually be OOMed. In this case, the quota will stop it from trampling on the working set of every other process in memory in the meantime.
Actually, case 2 could be handled to some extent by lowering the priority of a process that keeps on swapping for too long.
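A rough userspace sketch of that idea, under loud assumptions: it uses the major fault count from `/proc/<pid>/stat` (field 12, `majflt`) as a stand-in for "swapping", although major faults also include file-backed page-ins. The threshold, the number of sampling periods and the nice step are made-up parameters, and a real implementation would presumably live in the kernel rather than poll `/proc`.

```python
import os

def major_faults(stat_line):
    """Extract majflt (field 12) from the contents of /proc/<pid>/stat."""
    # The command name (field 2) may contain spaces and parentheses,
    # so split on the last ")" and count fields from the state onwards.
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[9])  # rest[0] is field 3 (state), so field 12 is rest[9]

def swapping_too_long(fault_counts, threshold, periods):
    """True if each of the last `periods` intervals saw more than
    `threshold` new major faults."""
    deltas = [b - a for a, b in zip(fault_counts, fault_counts[1:])]
    recent = deltas[-periods:]
    return len(recent) == periods and all(d > threshold for d in recent)

def demote(pid, step=5):
    """Raise the nice value of pid by `step`, capped at 19 (lowest priority)."""
    current = os.getpriority(os.PRIO_PROCESS, pid)
    os.setpriority(os.PRIO_PROCESS, pid, min(current + step, 19))
```

The caller would sample `major_faults()` at a fixed interval, keep the counts in a list, and call `demote()` once `swapping_too_long()` returns true.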