
Class-based Kernel Resource Management

The Class-based Kernel Resource Management (CKRM) project is an effort at IBM to provide the hooks for better control over resource consumption by processes. The CKRM project sees the existing resource management tools (nice, ulimit) as not being up to the task. So the CKRM hackers have set out to provide a whole new infrastructure for process control. The ideas were presented at the Ottawa Linux Symposium last July; now, the first set of patches has been posted. The overview posting describes the other patches in the set and gives some pointers to further information.

The core concept behind CKRM is the division of processes into distinct classes, each of which has a separate set of policies applied to it. A kernel API allows classifier modules to be loaded, so different sites can have entirely different ways of classifying processes. Most would likely stick with the rule-based classifier, which is provided with the CKRM patch set; it allows classification based on various task structure fields. So, for example, processes can be classified based on their UID, which program they are running, etc.
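
As a rough illustration of what rule-based classification on task fields could look like, here is a minimal user-space sketch in Python. It is not the CKRM classifier: the Task and Rule types, the field names, the "first match wins" ordering, and the class names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """Hypothetical stand-in for the task structure fields a rule may inspect."""
    pid: int
    uid: int
    gid: int
    command: str  # the program the task is running


@dataclass
class Rule:
    """One classification rule; a field left as None matches anything."""
    target_class: str
    uid: Optional[int] = None
    gid: Optional[int] = None
    command: Optional[str] = None

    def matches(self, task: Task) -> bool:
        return ((self.uid is None or self.uid == task.uid)
                and (self.gid is None or self.gid == task.gid)
                and (self.command is None or self.command == task.command))


def classify(task, rules, default="default"):
    """Return the class named by the first matching rule (assumed ordering)."""
    for rule in rules:
        if rule.matches(task):
            return rule.target_class
    return default


# Example rule set: anything running httpd is "web", user 1001 is "batch".
RULES = [
    Rule("web", command="httpd"),
    Rule("batch", uid=1001),
]
```

Under these rules, a task running httpd lands in "web" regardless of its owner, and a task that matches no rule falls back to the default class.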

Tasks can be reclassified any number of times over their lifetime. The CKRM core patch places hooks in the logical spots where a process could change classification: when a user or group ID is changed, when a program calls exec(), when a new process is forked, etc. There is also a plan for a system call allowing a process to request reclassification at any time, but that call does not appear to be present in the current patches.
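
The hook arrangement can be pictured with a small Python model: a core object that calls a pluggable classifier whenever one of the listed events fires. Everything here (the ClassifierCore name, the event strings, the dictionary used as a task, the by_uid classifier) is hypothetical and only mirrors the description above, not the patch itself.

```python
class ClassifierCore:
    """Toy model of a core that reclassifies tasks from event hooks."""

    # Events at which a task's classification might change (from the text).
    EVENTS = ("fork", "exec", "setuid", "setgid")

    def __init__(self, classifier):
        self.classifier = classifier  # callable: task dict -> class name
        self.task_class = {}          # pid -> current class name

    def hook(self, event, task):
        """Re-run the classifier; return True if the task changed class."""
        if event not in self.EVENTS:
            raise ValueError(f"no hook for event {event!r}")
        new_class = self.classifier(task)
        old_class = self.task_class.get(task["pid"])
        self.task_class[task["pid"]] = new_class
        return old_class != new_class


def by_uid(task):
    """Assumed example classifier: root tasks versus everybody else."""
    return "system" if task["uid"] == 0 else "users"


core = ClassifierCore(by_uid)
task = {"pid": 42, "uid": 0, "command": "login"}
core.hook("fork", task)            # initial classification at fork time
task["uid"] = 1000                 # the task drops privileges...
moved = core.hook("setuid", task)  # ...and the setuid hook reclassifies it
```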

Once a task is classified, the system can apply policies to that task. So, for example, the CPU control patch enforces CPU usage policies on processes. Essentially, each class (as a whole) can be restricted to, and guaranteed, an administrator-specified percentage of the available processor time. To implement this policy, the patch modifies the scheduler by creating a new run queue for each class. Before the scheduler picks a new process to run, it first decides which class has the highest-priority claim on the CPU. The process to run can then be chosen from that class's queue in the usual way.
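
The two-level decision (class first, then task) can be sketched in a few lines of Python. The article does not say how the patch ranks the classes, so this sketch assumes that the class which has used the least CPU relative to its share has the strongest claim; plain round-robin stands in for "the usual way" of picking a task within a class. All names are hypothetical, and the sketch only models the guarantee side: a class with nothing to run is skipped rather than having its share held idle.

```python
from collections import deque


class CpuClass:
    def __init__(self, name, share, tasks):
        self.name = name
        self.share = share             # percentage of CPU time for the class
        self.used = 0                  # ticks consumed so far
        self.runqueue = deque(tasks)   # per-class run queue


def pick_next(classes):
    """Two-level pick: choose a class, then a task from its run queue."""
    runnable = [c for c in classes if c.runqueue]
    if not runnable:
        return None
    # Assumed claim metric: least CPU consumed relative to the class share.
    cls = min(runnable, key=lambda c: c.used / c.share)
    task = cls.runqueue.popleft()
    cls.runqueue.append(task)          # round-robin within the class
    cls.used += 1
    return task


def simulate(classes, ticks):
    """Count how many ticks each class receives."""
    counts = {c.name: 0 for c in classes}
    owner = {t: c.name for c in classes for t in c.runqueue}
    for _ in range(ticks):
        task = pick_next(classes)
        if task is not None:
            counts[owner[task]] += 1
    return counts


web = CpuClass("web", 75, ["httpd-1", "httpd-2"])
batch = CpuClass("batch", 25, ["backup"])
counts = simulate([web, batch], 100)
```

With both classes always runnable, the ticks in counts divide close to the configured 75/25 split, no matter how many tasks each class contains.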

The memory control patch, instead, implements policies stating how much physical memory each class can use. The patch hooks into the page reclamation code, making that code rather more selective in how it chooses pages to kick out of main memory. Whenever possible, the page reclaimer only chooses pages from classes which are going over their maximum allowed share of physical memory. As memory gets tighter, each class will be trimmed down to its minimum share, as set up by the administrator. If there is no real pressure on memory, however, processes are allowed to grow beyond the bounds set for their class.
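
The reclaim policy described above amounts to a two-pass selection, which the following Python sketch models with page counts only. The MemClass type, the reclaim() function, and the order in which classes are visited are assumptions; the real patch works on individual pages inside the kernel's reclamation code.

```python
from dataclasses import dataclass


@dataclass
class MemClass:
    name: str
    min_pages: int   # share the class is trimmed down to under heavy pressure
    max_pages: int   # share the class may exceed only while memory is plentiful
    resident: int    # pages currently in physical memory


def reclaim(classes, needed):
    """Free up to `needed` pages; return {class name: pages taken}."""
    taken = {c.name: 0 for c in classes}

    def take(c, floor):
        nonlocal needed
        n = min(needed, max(0, c.resident - floor))
        c.resident -= n
        taken[c.name] += n
        needed -= n

    # Pass 1: only touch classes that are over their maximum share.
    for c in classes:
        take(c, c.max_pages)
    # Pass 2: if memory is still tight, trim classes toward their minimum.
    for c in classes:
        take(c, c.min_pages)
    return taken
```

Since reclaim() only runs when pages are needed, a class can sit above its maximum indefinitely on an unloaded system, which matches the behavior described in the text.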

The memory control problem is complicated by shared pages: what happens when pages are shared between processes in different classes? The documentation on the CKRM web site describes an elaborate mechanism where classes are set up in a hierarchy and shared pages are divided across the appropriate parts of that hierarchy. What the current code appears to do, however, is to simply assign shared pages to the class with the largest share of physical memory.
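
The simpler behavior attributed to the current code reduces to a single comparison, shown here in Python. The function name and the share table are hypothetical, and the sketch assumes that "largest share" is judged among the classes actually sharing the page.

```python
def charge_shared_page(sharing_classes, memory_share):
    """Pick the class that a shared page is accounted to."""
    # Assumption: only the classes sharing the page are candidates.
    return max(sharing_classes, key=lambda name: memory_share[name])


# Hypothetical per-class shares of physical memory, in percent.
memory_share = {"web": 50, "mail": 30, "batch": 20}
owner = charge_shared_page(["mail", "batch"], memory_share)
```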

The CKRM team also describes mechanisms which allow control over the disk I/O bandwidth used by each class and the number of incoming network connections each class can be handling at a given time. The I/O limitations are implemented by adding per-class queues to the disk I/O scheduler and merging requests into a single dispatch queue with the bandwidth policies taken into account. The networking policies involve the creation of yet another set of class-specific queues; in this case, incoming connections are divided into classes through the use of iptables rules. Patches for I/O bandwidth and incoming network connection control have not been released at this time, however.
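
Since the I/O patches have not been posted, the following Python sketch is purely a guess at how per-class queues might be merged into one dispatch queue. It uses weighted round-robin as the bandwidth policy, which is an assumption rather than anything the CKRM team has described; the function name, queue names, and weights are likewise invented for the example.

```python
from collections import deque


def merge_to_dispatch(class_queues, weights, budget):
    """Move up to `budget` requests from per-class queues into one list.

    Each round, a class may contribute up to weights[name] requests
    (weights must be positive integers), so classes with larger weights
    get a larger slice of the dispatch queue.
    """
    dispatch = []
    while len(dispatch) < budget and any(class_queues.values()):
        for name, queue in class_queues.items():
            for _ in range(weights[name]):
                if not queue or len(dispatch) == budget:
                    break
                dispatch.append(queue.popleft())
    return dispatch


queues = {
    "db": deque(["d1", "d2", "d3", "d4"]),
    "backup": deque(["b1", "b2", "b3", "b4"]),
}
weights = {"db": 3, "backup": 1}
first_batch = merge_to_dispatch(queues, weights, 4)
```

A class whose queue runs dry simply gives up its slots for that round, so the remaining classes can use the leftover bandwidth.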

CKRM is clearly a work in progress; much of the structure is in place, but not everything has been implemented and the code is full of "this needs to be cleaned up" comments. The CKRM hackers hope to get their work into 2.7, however, so they have some time yet to work things into shape.



Class-based Kernel Resource Management

Posted Sep 5, 2003 3:32 UTC (Fri) by jwdoughty (guest, #2373)

Gee; I'm not a kernel hacker, and I see that these are targeted at 2.7, but wouldn't these CKRM process classes be a much simpler approach to the interactive scheduling effort that was reported in the last two LWN issues?

The scheduling priority boost scheme based on various "tweaking [of] the interactivity estimation code" seems prone to problems where the tweaking heuristics mis-identify as non-interactive what some users will inevitably want to be interactive processes.

This scheme would seem much simpler to manage if I interpret the messages I've seen correctly. Want a process to be more responsive: simply put it in a different class.

Class-based Kernel Resource Management vs interactive bonus

Posted Sep 5, 2003 18:10 UTC (Fri) by giraffedata (subscriber, #1954)

You mean a system administrator would put the process in the high-priority class for interactive processes? I think the point is for the system to figure out on its own what processes need higher priority. A manual system wouldn't solve the problem.

Of course, you could keep the interactivity estimation and use it to automatically assign a class.

Unfortunately, what isn't being addressed in any of the work is the fact that we don't really want to give more CPU time to interactive processes. What we want is to reduce latency when the user interacts -- i.e. a process coming off of a long wait for user input should run soon after that. It probably will use only a few cycles before going back into wait for user input. But if that process decides to go and do some intensive computing, it should do it as slowly as a non-interactive process.

Class-based Kernel Resource Management vs interactive bonus

Posted Sep 11, 2003 12:34 UTC (Thu) by lars_stefan_axelsson (guest, #10660)

>> Unfortunately, what isn't being addressed in any of the work is the fact that we don't really want to give more CPU time to interactive processes.

Well, I don't think these patches are meant to address that in the first place. I see this as a server feature. This enables the administrator to better handle the situation when the machine is used (perhaps virtually) for many different tasks of differing importance. Not letting your webserver squash your mailserver, or vice versa. (Or rather, not letting one department's internal webserver clog the customer handling system.)

In the application within Ericsson I have in mind, a feature such as this would make our job a lot easier, without having to deal with all the problems associated with 'proper' real-time OSes.

Class-based Kernel Resource Management vs interactive bonus

Posted Sep 13, 2003 22:43 UTC (Sat) by nagar (subscriber, #4734)

>> Well, I don't think these patches are meant to address that in the
>> first place.

That's true. CKRM is only concerned with enforcing shares of a resource, not with determining what those shares should be. While it does have to be concerned with the scheduler tweaks being done to give interactive tasks a boost (just so it can replicate them as best as it can), it is not trying to come up with such heuristics.

Lars, your feedback on how CKRM fares with your app would be much appreciated (do consider joining/posting on ckrm-tech@lists.sourceforge.net)

Also, while CKRM is very useful for servers, we hope regular users will also like the ability to restrict some of their resource-hungry apps!

- Shailabh

Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds