
RSDL hits a snag

In last week's episode, the Rotating Staircase Deadline Scheduler (RSDL) had appeared out of the blue and was busily impressing testers left and right. One person even called for it to go straight into 2.6.21. In reality, the replacement of something as fundamental as the CPU scheduler was never going to be an entirely smooth operation. So it's not all that surprising that the RSDL has run into an obstacle or two.

The biggest snag would appear to be this workload reported by Mike Galbraith. Mike is trying to run some CPU hogs (MP3 encoding, in particular) in the background while watching some interactive eye candy. It's a load that works with the current scheduler, but it becomes sluggish when running under RSDL. There have been a couple of other reports of a visible interactive slowdown when serious computation is going on - though others have reported better results.

There is little surprise in the appearance of behavioral regressions for certain workloads. Few people would have expected RSDL to be perfect within a week of its first posting. The real difficulty, instead, is that RSDL creator Con Kolivas has reacted in a somewhat defensive manner, refusing to see the behavior as a regression:

Your expectations of what you should be able to do are simply skewed. Find what cpu balance you loved in the old one (and I believe it wasn't that much more cpu in favour of X if I recall correctly) and simply change the nice setting on your lame encoder - since you're already setting one anyway.

We simply cannot continue arguing that we should dish out unfairness in any manner any more. It will always come back and bite us where we don't want it. We are getting good interactive response with a fair scheduler yet you seem intent on overloading it to find fault with it.

Con's position is that the scheduler should strive to provide fairness and low latency; any further expectations about interactive response should then be addressed by playing with nice levels. The interactivity estimator built into the current scheduler is just too difficult to work with; the kernel should not be in that particular business. The problem is that this approach conflicts with how Linux users have come to expect things to work.

As soon as one looks at improving RSDL for these situations, one gets into the same old discussions on improving interactive response in general. Linus pointed out that RSDL's way of scheduling is not quite as fair as it could be, since it does not always account for work in the right place:

And the problem is that a lot of clients actually end up doing *more* in the X server than they do themselves directly. Doing things like showing a line of text on the screen is a lot more expensive than just keeping track of that line of text, so you end up with the X server easily being marked as getting "too much" CPU time, and the clients as being starved for CPU time. And then you get bad interactive behaviour.

There are a couple of ways of handling problems like this. One is to just favor the X server, either by somehow marking it as the core of interactive behavior or by simply raising its priority. Con has been in favor of the latter approach; to that end, he has posted a separate patch which is aimed at improving latencies for all processes, even when they are not all running at the same priority levels. There have not been any follow-up results reported as of this writing.

This difficulty may well not keep RSDL out of the mainline kernel. The advantages inherent in dumping the interactivity heuristics are large, and RSDL does seem to improve life for a number of users. Noticeable performance regressions for some workloads are a problem, though; nobody wants to field a bunch of "2.6.x turned my response to crap" messages from unhappy users. So expect some iterations on this project yet - and, perhaps, an additional kernel cycle or two before it can be merged.



Xorg

Posted Mar 15, 2007 10:48 UTC (Thu) by xav (guest, #18536) [Link]

The problem seems to lie in great part with Xorg itself: it has a
notoriously bad behavior wrt interactivity, which is shared between the
core and the drivers. Moreover, the interface with clients, Xlib, uses
round trips for everything where simple messages could suffice.
But that is known and should be fixed in the future (Xlib's hopeless, but
it's being replaced by XCB). So maybe the fault isn't entirely inside RSDL
after all.

Xorg

Posted Mar 15, 2007 11:58 UTC (Thu) by jospoortvliet (subscriber, #33164) [Link]

well, 'fault' might not be the right word, but RSDL WILL be inherently
less interactive, something you'll notice on heavy loads. Running, as I do
now, 2 make -j4 processes at the same time on my dualcore is definitely
less fun on RSDL compared to staircase and, to a lesser extent, mainline.

But the point is, I should nice them. The interactive schedulers,
staircase and the one from mainline, automatically renice processes - but
this leads to problems. RSDL doesn't do that, simple as that. So if a
process needs more CPU power than its fair share (or less) you should use
nice.

Automatic nicing would be a good thing (eg gcc or make should nice
themselves), same with DPKG (yes, IO bound, but there are IO priorities as
well, and as far as they aren't there now, there might be in the future).

Xorg

Posted Mar 15, 2007 12:10 UTC (Thu) by k8to (subscriber, #15413) [Link]

I wonder if there is some clever way I can express niceness. Like "I would always like autofoo and libtool to run with nice at least 5" or some such. It's not too uncommon that a task I would think to nice manually when run directly is not always run directly, leaving me to renice moderately laboriously.

I guess I'm saying if we're going to push priority setting onto users for them to achieve pleasant interactivity then maybe there could be better tools for the priorities to be set?
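One small tool along these lines can be sketched in shell. The helper below is hypothetical (the name nice_at_least is not from any existing package); it assumes a nice that prints the current niceness when run with no arguments, as the GNU coreutils version does. It raises the niceness of a command to a given floor and leaves it alone if the shell is already nicer than that.

```shell
# nice_at_least FLOOR COMMAND [ARGS...]
# Hypothetical helper: run COMMAND with a niceness of at least FLOOR.
# Assumes `nice` with no arguments prints the current niceness
# (GNU coreutils behaviour).
nice_at_least() {
    floor=$1
    shift
    current=$(nice)
    if [ "$current" -lt "$floor" ]; then
        # nice -n takes an increment, so add only the missing difference
        nice -n "$((floor - current))" "$@"
    else
        # already nice enough; run the command unchanged
        "$@"
    fi
}
```

For example, nice_at_least 5 make -j4 would run the build at niceness 5 from an ordinary shell, and would not nice it any further from a shell that is already at 5 or above.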

I've personally always been extremely murky on what nice is supposed to do on Unix. Sometimes it doesn't seem to do much at all. My Amiga featured simplistic preemption, which was easy to grasp. The highest priority task would run, priorities were fixed, and you could set them arbitrarily. The only sort of unexpected behavior you would get is if you set your cpu bound program to a higher priority than the (task-implemented) filesystem. Unix is certainly safer for the multi-user case, but I often find myself infuriated that I can't prevent the "low priority" task from slowing down my "high priority" task.

VeryNice

Posted Mar 15, 2007 12:27 UTC (Thu) by brugolsky (subscriber, #28) [Link]

See the VeryNice Dynamic Process Re-nicer.

From the homepage:

VeryNice is a tool for dynamically adjusting the nice-level of processes under UNIX-like operating systems. It can also be used to kill off runaway processes and increase the priority of multimedia applications, while properly handling both batch computation jobs and interactive applications with long periods of high CPU usage.

Xorg

Posted Mar 15, 2007 16:55 UTC (Thu) by vmole (guest, #111) [Link]

Like "I would always like autofoo and libtool to run with nice at least 5" or some such.

Create file nice_5 in /usr/local/bin (or ~/bin):

#!/bin/bash
exec nice -5 "$@"

Then create links in /usr/local/bin (or ~/bin):

ln -s nice_5 gcc
ln -s nice_5 libtool
ln -s nice_5 autobarf

Put /usr/local/bin (or, need I say, ~/bin) early in your path.

autobarf??

Posted Mar 15, 2007 21:58 UTC (Thu) by pr1268 (subscriber, #24648) [Link]

Is autobarf a standard Unix/Linux shell program? I can't seem to find it on my Slackware system.

;-)

Xorg

Posted Mar 16, 2007 2:46 UTC (Fri) by njs (guest, #40338) [Link]

> exec nice -5 "$@"

I believe you mean:

exec nice -5 "$0" "$@"

Xorg

Posted Mar 16, 2007 17:25 UTC (Fri) by vmole (guest, #111) [Link]

Oops. You're absolutely correct. Sorry about that, to anyone trying this at home.

Xorg

Posted Mar 27, 2007 23:29 UTC (Tue) by efexis (guest, #26355) [Link]

Only if you want a script which perpetually exec's itself... otherwise, you'll need to put the absolute path in to the original binary it's meant to be calling, or drop the location of the script from the PATH env.

eg, for background running tasks, such as make:
exec nice -n 5 "/usr/bin/${0##*/}" "$@"

Xorg

Posted Mar 16, 2007 15:17 UTC (Fri) by k8to (subscriber, #15413) [Link]

Yeah, I'm currently using such a trick to run i386 binaries, and another similar trick for a special file database maintenance task. I'm kind of uneasy with them in terms of unexpected complexity springing out at the user in the troubleshooting case.

Xorg

Posted Mar 23, 2007 15:57 UTC (Fri) by slamb (guest, #1070) [Link]

I think it'd need to be a bit more clever than that - the shell doesn't replace $PATH, so this will just exec itself over and over. You'll need to either specify a more limited path or manually walk $PATH, excluding symlinks to itself.
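A minimal sketch of the "manually walk $PATH" variant follows. The function names find_real and run_niced are made up for illustration, and the niceness of 5 is simply the value used earlier in the thread. The sketch relies on the shell's -ef test, which is true when two paths resolve to the same file, so symlinks pointing back at the wrapper are skipped.

```shell
# find_real NAME SELF
# Print the first executable called NAME on $PATH that is not the same
# file as SELF (the wrapper, or a symlink to it). The body runs in a
# subshell so the IFS and globbing changes do not leak to the caller.
find_real() (
    name=$1
    self=$2
    IFS=:
    set -f                      # no globbing while splitting $PATH
    for dir in $PATH; do
        candidate="${dir:-.}/$name"
        if [ -f "$candidate" ] && [ -x "$candidate" ] &&
           ! [ "$candidate" -ef "$self" ]; then
            printf '%s\n' "$candidate"
            exit 0
        fi
    done
    exit 1
)

# run_niced SELF [ARGS...]
# Replace the current process with the real binary, niced by 5.
run_niced() {
    self=$1
    shift
    real=$(find_real "${self##*/}" "$self") || {
        echo "${self##*/}: no real binary found on PATH" >&2
        return 127
    }
    exec nice -n 5 "$real" "$@"
}
```

A wrapper script built this way would contain the two functions and end with the line run_niced "$0" "$@". Symlinked as gcc or libtool early in the path, it would then run the next gcc or libtool found on $PATH instead of itself.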

what nice means

Posted Mar 17, 2007 2:59 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

I've personally always been extremely murky on what nice is supposed to do on Unix. Sometimes it doesn't seem to do much at all.

The last time I investigated that was years ago, so who knows what nice is now, but assuming it's roughly the same: Nice is just a limit on the priority a process is allowed to have, as the scheduler adjusts it up and down according to its own policies. If the scheduler naturally comes to the same conclusion as you as to the needs of a process, your nice won't have an effect. You can watch priorities in 'top' (better: 'htop') and get an idea. For me, the big compile job I force to be nice is often pretty nice anyway just because the scheduler figures out it's a long running job.

And the priority value itself is no great thing: it's just how long the process is allowed to keep the CPU when it gets it. Even the highest priority process can wait a long time to get it.

My Amiga featured simplistic preemption, which was easy to grasp. The highest priority task would run, priorities were fixed, and you could set them arbitrarily.

This is all just the dynamic priority scheme, i.e. scheduling among processes with absolute priority 0 (which is usually all of them). If you give a process absolute priority 1, it will always run before any of the processes with absolute priority 0; it can even preempt a process that already has the CPU. (Absolute priorities are what people usually call realtime priorities.)

Accounting

Posted Mar 16, 2007 1:13 UTC (Fri) by gdt (guest, #6284) [Link]

I'm not sure that blaming X11 is appropriate. Linux doesn't allow X11 (or any other 'system' process) to account its use of resources to the process that caused the resources to be used, which is the way this sort of issue is handled in most non-UNIX operating systems.

Accounting

Posted Mar 17, 2007 2:39 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

I don't know if these are the systems you're thinking of, but I know much more elaborate accounting is done by older operating systems that were designed to be used on equipment so expensive that it was necessary to break the cost down by application.

A nice byproduct of that is that you can do fairer scheduling -- mainly because you can use a useful definition of fair. The Unix concept that fair means each process gets an equal share is a very, very rough idea of fair allocation of resources.

In addition to something where a server process takes its CPU time from its clients, I'd like to see something where a child process takes its CPU time from its parent.

Why not a straightforward approach?

Posted Mar 15, 2007 19:02 UTC (Thu) by aashenfe (guest, #12212) [Link]

Wouldn't it be better to determine interactivity of a process based on its interaction?

So priority would be based on the hardware driver a process interacts with.

For instance, every time a process sends data to (or receives data from) the sound system, it would be given a short-term boost. It would continue to have the boost as long as it continues to send or receive data. Also, sound is the subsystem where you really notice problems, so processes doing sound would get the biggest boost.

Next would be keyboard and mouse events. If I type in OpenOffice, for instance, it should get a moment of increased priority so that it can draw the new character I type to the screen faster. Plus, if I'm trying to get control back on a runaway server, it would be nice if my processes had priority.

Interacting with the screen might provide a small boost as well, but might be too easy to abuse. Movies and games usually come with sound and thus would get a boost anyway.

Most other hardware subsystems would probably not (and shouldn't) have an effect on the priority of a process.

It sounds like a nice idea, and maybe it is already done this way, or maybe there is a lot more to it that I haven't considered (Like isn't X always the one getting keyboard and mouse events?).

Why not a straightforward approach?

Posted Mar 16, 2007 17:31 UTC (Fri) by vmole (guest, #111) [Link]

Because such an approach is not, in fact, straight-forward. The current scheduler does attempt to identify "interactiveness" and boost such processes. The problem is that the heuristics involved are complicated and not at all obvious, leading to a lot of dissatisfaction among the kernel developers; one gets the impression that Linus tolerated the current scheduler only because nothing better was available. There's a *lot* of enthusiasm on the lkml for RSDL. I doubt a few corner-case regressions will keep it out, because *all* of the schedulers have bad corner cases, and making them more complicated doesn't seem to prevent that; it just moves them around.

Why not a straightforward approach?

Posted Mar 18, 2007 18:59 UTC (Sun) by nlucas (subscriber, #33793) [Link]

(Like isn't X always the one getting keyboard and mouse events?).

Yes, and the same goes for sound daemons, which are separate processes from their users.

On Windows it's easier for the scheduler because the graphics sub-system (or part of it) is part of the kernel, which means an application with a foreground window does have a priority boost.

Why not a straightforward approach?

Posted Mar 18, 2007 19:41 UTC (Sun) by aashenfe (guest, #12212) [Link]

So then what would really need to happen is that X or a sound daemon would somehow need to give some of its priority to the processes it interacts with. These programs would either have to be consciously written to do this, or some kind of heuristic would be used.

It is a little easier for Windows, but I'm not sure if giving a priority boost to the foreground window application is always the best.

Why not a straightforward approach?

Posted Mar 18, 2007 20:41 UTC (Sun) by nlucas (subscriber, #33793) [Link]

It is a little easier for Windows, but I'm not sure if giving a priority boost to the foreground window application is always the best.

I was over-simplifying, of course.
On Windows each process has a single base priority, but each thread has a priority (based on the process priority) that can be boosted for short periods (and within a limited range).
Also, there is a distinction between system and local processes. The system ones have a base priority slightly higher than local ones, and if a wait is not satisfied for a thread, its quantum is reduced (they call it quantum decay).
This foreground window boost for all threads that own that window is made for interactivity's sake, but can be disabled when you configure Windows to optimize performance for background services (which is the default on the server versions).

I'm no scheduling master, I just happen to have read the "Windows Internals" book by Mark Russinovich and David Solomon. There are differences between Windows versions, so if you really want to learn more (and not only about this) I would advise you to get that book.

Why not a straightforward approach?

Posted Mar 24, 2007 20:19 UTC (Sat) by pimlott (guest, #1535) [Link]

There is a classic story of a clever user who figured out that his compile would run faster if he hit the space bar every once in a while. :-)

Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds