LWN: Comments on "BFS vs. mainline scheduler benchmarks and measurements" https://lwn.net/Articles/351058/ This is a special feed containing comments posted to the individual LWN article titled "BFS vs. mainline scheduler benchmarks and measurements". en-us Sun, 31 Aug 2025 23:09:19 +0000 Sun, 31 Aug 2025 23:09:19 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/391283/ https://lwn.net/Articles/391283/ vonbrand <p> Way back when I confiscated a dual Pentium Pro (200MHz) to use as a desktop machine for a class I was teaching... the machine was <em>old</em> already (I actually cannibalized two of them to get a working one). Tue, 08 Jun 2010 13:22:51 +0000 worse than it should be https://lwn.net/Articles/353091/ https://lwn.net/Articles/353091/ oak <div class="FormattedComment"> Put Firefox into a container group of its own and set a limit on the active <br> pages it can have?<br> <p> </div> Wed, 16 Sep 2009 20:06:48 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/352357/ https://lwn.net/Articles/352357/ paragw <div class="FormattedComment"> Surely one single tunable (I want the desktop scheduler, for example, in the case of PlugSched) is better (i.e. less complex) from a user standpoint rather than having to figure out say 5 complex numerical things such as granularity and what not? <br> <p> Or do we have one single tunable for CFS that makes it desktop friendly? If it does have such a knob then the next and most important question is how well it works for desktops. From the reports I think we are still some way from claiming excellent "automatic" interactivity for desktops. Note that I am excluding the nicing games and making the user do a complex dance of figuring out how to make his/her desktop interactive. 
I am sure you agree that does not work well.<br> <p> To your point, if we have to have one tunable for the CFS scheduler to make it desktop friendly - essentially a single knob (like sched=desktop in the PlugSched case) - it is easy to see how that would fail to work satisfactorily for all desktop workloads. For one thing, unless the user messes with the nice levels of each process that he/she opens, minimizes, closes or brings to the foreground (which is out of the question from a usability standpoint) the scheduler has no way to distinguish the foreground process from a background one; it has no way of distinguishing mplayer from the desktop window manager from some system daemon going bad and eating CPU.<br> <p> For another, the scheduler seems to have no reliable way to know what processes it needs to favor. The window manager and the process of the foreground window need to be registered with the scheduler as foreground processes, and each minimized window needs to be registered with the scheduler as background. Then, as long as the window manager and the process owning the foreground window are not runnable, everyone else gets CPU. Multimedia applications need to be registered with the scheduler as such - automatically, so that mplayer always gets CPU when it needs it, even favoring it over the window manager and the process of another foreground window if there is only one available CPU. Until this co-ordination happens I think we will be far from achieving great desktop interactivity which works for most desktop workloads.<br> <p> Then the question would be whether we want to put all this "only needed on desktop" complexity into the completely fair scheduler or keep both separate. 
That is sort of a secondary question - the first question is how do we get the desktop to hint to the scheduler which processes the user is actively interacting with and which ones he/she is likely to interact with (minimized windows), with the scheduler then favoring those accordingly - that ought to solve the interactivity problems in an automatic fashion.<br> <p> [ Windows has this notion of distinguishing between "Programs" (which are running desktop applications) and background services (things without desktop interaction); in its default configuration on the desktop it favors "Programs" and on servers it favors "Background services" (the Web Server service, for example). And it certainly helps interactivity. It can do this because it can distinguish between what is a desktop application, which one is foreground or background, and what is a non-desktop, background application.]<br> </div> Sat, 12 Sep 2009 18:48:16 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/352364/ https://lwn.net/Articles/352364/ khc <div class="FormattedComment"> I already have a compile-time way to select a scheduler:<br> <p> patch -p1 &lt; 2.6.31-sched-bfs-211.patch<br> <p> </div> Sat, 12 Sep 2009 18:31:58 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/352350/ https://lwn.net/Articles/352350/ mingo <p> What i believe you are missing relates to the very first question i asked: wouldn't it be better if a scheduler had nice runtime tunables that achieved the same? <p> Your original answer was (in part and way up in the discussion): <p> <i> If we had a nice modular scheduler interface that allows us to load a scheduler at runtime or choose which scheduler to use at boot time or runtime that would solve the complexity problem and it will work well for the workloads it was designed for. 
As a bonus I will not have to make decisions on values of tunables - we can make the particular scheduler implementation make reasonable assumptions for the workload it was servicing. </i> <p> What you are missing is that 'boot time' or 'build time' schedulers (i.e. what PlugSched did in essence) are build time / boot time tunables. Complex ones, but still knobs as far as the user is concerned. <p> Furthermore they are <i>worse</i> tunables than nice runtime tunables. They inconvenience the user and they inconvenience the distro. Flipping to another scheduler would force a reboot. Why do that? <p> For example, it does not allow the example i suggested: to run Firefox under BFS while Thunderbird runs under another scheduler. <p> So build-time/boot-time pluggable schedulers have various clear usage disadvantages, and there are also various things they cannot do. <p> So <i>if</i> you want tunability then i cannot understand why you are arguing for the technically worse solution - for a build time or boot time solution - versus a nice runtime solution. <p> Sat, 12 Sep 2009 15:28:25 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/352341/ https://lwn.net/Articles/352341/ paragw <div class="FormattedComment"> I don't really understand it when you say "think through the technical issues involved [ in designing pluggable schedulers ] not being trivial" since you already mentioned PlugSched did just that prior to CFS.<br> <p> It might be a terminology difference that is getting in the way - when I say "pluggable" I imply choice more than anything else. 
In other words it would be perfectly OK for the scheduler to be selectable only at compile and boot time and not at runtime, just like PlugSched was.<br> <p> We are advertising a completely fair scheduler that will do all things (ponies included ;) for everybody, but no one has so far explained HOW, fundamentally, on the conceptual level, on the design level, we are going to ensure that when resources get scarce (2 CPU cores, 130 runnable processes - mostly CPU-heavy jobs, one mplayer playing video and another process doing audio encoding) we give enough, continuous CPU share to mplayer and the audio encoder and the whole desktop as such, so it feels fluid to the user without the user having to play the nice games.<br> <p> Making it even simpler, asking the same question differently - what logic in the current scheduler will hand out the most resources to mplayer, the audio encoding process and the desktop window manager (switching between windows needs to be fluid as well) when the user is interacting with them? You can say the scheduler will be completely fair and give an equal chunk to every process, but desktop users get pissed if that means mplayer is going to skip - not enough CPUs and a lot of processes to run.<br> <p> In other words - if I hand out $100 to a charity and ask them to be completely fair while distributing the amount to everyone equally, and 200 people turn up for help - the charity did the fair thing and gave out 50c to everyone, without considering the fact that 3 people out of the 200 badly needed at least $2 so they could not only eat but also buy their pills and stay alive. That would be an unfair result in the end. 
So the charity has to have some notion of bias to the most needy and for that it needs to figure who are the most needy.<br> <p> The point I am trying to make is we need to have a scheduler that is both completely fair (server workloads) and desktop friendly and these conflicting objectives can only be met by having 2 different user selectable schedulers. The desktop scheduler can get into the details of foreground and background Xorg and non-Xorg, multimedia vs. non-multimedia processes and fight hard to keep the desktop fluid without bothering about the background jobs taking longer or bothering about scaling to 1024 CPUs. The CFS scheduler can stay fair and moderately interactive and scalable as it is and server people can select it.<br> <p> So again why do we not want to bring PlugSched back and have user select BFS or CFS or DS (Desktop Scheduler) (at compile or boot time)? If we do want CFS to do everything while being fair - I don't think we have explained on paper how it would ensure desktop interactivity without having a notion of what constitutes the desktop. We have to question the CFS goals/design/implementation if we are to go by the reports that after substantial development interactivity issues with CFS still remain. (Please don't say the nice word - I have explained already that it doesn't work well practically.) If it turns out that it is hard to meet conflicting goals well or if it turns out we need to add more complexity to CFS to meet those conflicting goals even in "most" workloads - it is still prudent to ask why not just have 2 different schedulers each with one, non-conflicting goal?<br> </div> Sat, 12 Sep 2009 14:44:03 +0000 pluggable schedulers vs. 
tunable schedulers https://lwn.net/Articles/352322/ https://lwn.net/Articles/352322/ nix <div class="FormattedComment"> But you still have to figure out which processes get their priorities <br> decided by which 'schedulers' (it is not very useful to jump into a Linux <br> discussion assuming that the terminology used is that of some other <br> kernel's development community, btw).<br> <p> </div> Sat, 12 Sep 2009 12:24:20 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/352308/ https://lwn.net/Articles/352308/ mingo <p> <i> Let me repeat - in Solaris, schedulers are the parts of code that calculate priorities. They don't do other things - specifically, they don't switch threads. You don't have to schedule them in any way - just switch threads conforming to the priorities calculated by the schedulers. </i> <p> That's not pluggable schedulers. It's <i>one</i> scheduler with some flexibility in calculating priorities. The mainline Linux scheduler has something like that too btw: we have 'scheduling classes' attached to each process. See include/linux/sched.h::struct sched_class. <p> <i> And if you don't like this approach, you could still do what FreeBSD has been doing for several years now - implement schedulers changeable at compile time. </i> <p> It's not about me 'liking' anything. My point is that i've yet to see a workable model for pluggable schedulers. (I doubt that one can exist - but i have an open mind about it and i'm willing to be surprised.) <p> Compile-time is not a real pluggable scheduler concept: which would be multiple schedulers acting _at once_. See the example i cited: that you can set Firefox to BFS one and Thunderbird to CFS. <p> Compile-time (plus boot time) schedulers is what the PlugSched patches did for years. <p> Sat, 12 Sep 2009 09:00:41 +0000 pluggable schedulers vs. 
tunable schedulers https://lwn.net/Articles/352306/ https://lwn.net/Articles/352306/ trasz <div class="FormattedComment"> Let me repeat - in Solaris, schedulers are the parts of code that calculate priorities. They don't do other things - specifically, they don't switch threads. You don't have to schedule them in any way - just switch threads conforming to the priorities calculated by the schedulers.<br> <p> And if you don't like this approach, you could still do what FreeBSD has been doing for several years now - implement schedulers changeable at compile time.<br> <p> </div> Sat, 12 Sep 2009 08:46:18 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/352305/ https://lwn.net/Articles/352305/ mingo <p> That does not answer the fundamental questions though. <p> Who schedules the schedulers? What happens if multiple tasks are on the same CPU with different 'schedulers' attached to them? For example a Firefox process scheduled by BFS and Thunderbird scheduled by CFS. How would it behave on the same CPU for it to make sense? <p> Really, i wish people who are suggesting 'pluggable schedulers!!!' spent five minutes thinking through the technical issues involved. They are not trivial. <p> Programming the kernel isnt like LEGO where you can combine bricks physically and have a nice fire station in addition to your police car ;-) <p> Sat, 12 Sep 2009 08:37:59 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/352300/ https://lwn.net/Articles/352300/ trasz <div class="FormattedComment"> Just do what Solaris does - schedulers are pieces of code that calculate thread priorities. 
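For what it's worth, mainline Linux already exposes its per-task scheduling classes to userspace as selectable policies (SCHED_OTHER, SCHED_BATCH, SCHED_IDLE, plus the realtime FIFO/RR ones), which is roughly the Solaris-style "pick a priority calculator per process" model in miniature. A minimal sketch, assuming a Linux host (Python's os module wraps the sched_setscheduler() syscall):

```python
import os

# Linux's per-task scheduling classes are selected via a policy, which
# is the closest mainline analogue to "a scheduler per process":
# SCHED_OTHER is the default CFS policy, SCHED_BATCH hints at a
# throughput-oriented batch task, SCHED_IDLE at a lowest-priority one.
policy_names = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE: "SCHED_IDLE",
}

pid = 0  # 0 means "the calling process"
print("current policy:", policy_names[os.sched_getscheduler(pid)])

# Demoting a process from SCHED_OTHER to SCHED_BATCH needs no
# privileges; the realtime policies (FIFO/RR) generally do.
try:
    os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    print("new policy:", policy_names[os.sched_getscheduler(pid)])
except PermissionError:
    print("policy change not permitted here")
```

The same can be done for an already-running process with `chrt -b -p 0 <pid>` from util-linux.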
This way you can assign different schedulers to different processes.<br> <p> </div> Sat, 12 Sep 2009 07:50:09 +0000 Morton's Fork https://lwn.net/Articles/352096/ https://lwn.net/Articles/352096/ Spudd86 <div class="FormattedComment"> Ingo mentioned that he does test exactly this on low-end machines further up<br> </div> Fri, 11 Sep 2009 01:51:21 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/352069/ https://lwn.net/Articles/352069/ efexis <div class="FormattedComment"> This, I believe, was more of an issue than it is now, as CPUs can ramp up their speed much quicker than they could before. One problem was, for example, that higher CPU speeds require higher voltages, which can cause delays, with the CPU stalling while the voltage steps up. Now the voltage will instead be pushed up a split moment before the frequency is ramped up, so there's no stall. Otherwise it's all down to the CPU: with different models taking different amounts of time to change frequency, it can make sense to jump to the highest frequency when usage goes up and then slow down if needed (as the ondemand governor does), or to scale up step by step. You want to set a lower watermark where responsiveness is important, so the CPU is always running at, say, twice the speed you need; that way you always have room to move into while you wait for the CPU to speed up (e.g. when load goes from 50% to 80%, the CPU speeds up to bring the load back down to 50%; only if load reaches 100% have you not sped up quickly enough). Of course, if you wish to conserve more power, you run the CPU at speeds closer to the load. In Linux, there are many tunables for you to play with to get the responses you wish (/sys/devices/system/cpu/cpu?/cpufreq/&lt;governor&gt;). To see what's available on the Windows platform, there's a free download you can find by googling rmclock that properly spoils you for configuration options. 
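The sysfs tree mentioned just above can also be inspected programmatically. A small sketch that collects each CPU's governor and frequency limits; the attribute names are the standard cpufreq ones, while the helper function itself is only illustrative:

```python
from pathlib import Path

def cpufreq_summary(root="/sys/devices/system/cpu"):
    """Collect each CPU's governor and frequency limits from the
    cpufreq sysfs tree; returns {} when the kernel exposes no
    cpufreq directories (e.g. in some virtual machines)."""
    summary = {}
    for cpufreq in sorted(Path(root).glob("cpu[0-9]*/cpufreq")):
        entry = {}
        for name in ("scaling_governor", "scaling_min_freq", "scaling_max_freq"):
            attr = cpufreq / name
            if attr.exists():
                entry[name] = attr.read_text().strip()
        summary[cpufreq.parent.name] = entry
    return summary

if __name__ == "__main__":
    for cpu, attrs in sorted(cpufreq_summary().items()):
        print(cpu, attrs)
```

Writing to `scaling_governor` (as root) switches governors on the fly, which is the runtime-tunable model being argued for elsewhere in this thread.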
There's no one rule that has to fit all, during boot up the kernel will test transition speeds and set defaults accordingly.<br> </div> Thu, 10 Sep 2009 22:23:14 +0000 Does the kernel scheduler even matter??? https://lwn.net/Articles/352041/ https://lwn.net/Articles/352041/ ajb <div class="FormattedComment"> The scheduler could still help more under conditions of VM stress. For example, on my netbook, which thrashes when you run firefox + anything, I literally run killall -STOP firefox-bin; killall -CONT other-app when I want to switch between them. This is a lot more convenient than quitting and restarting each app, which otherwise I would have to do. I imagine there might be some less manual way to achieve the same thing by building some more smarts into the scheduler/VM to achieve the same effect. Possibly with help from userspace.<br> </div> Thu, 10 Sep 2009 20:54:40 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/352032/ https://lwn.net/Articles/352032/ i3839 <div class="FormattedComment"> I'll try to send a patch against tip later this week, not feeling too well at the moment.<br> <p> </div> Thu, 10 Sep 2009 19:35:12 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/351890/ https://lwn.net/Articles/351890/ paragw <div class="FormattedComment"> [ Warning - long winded thoughtlets follow ]<br> <p> About the plugsched - since it was a boot time selectable it could do what I was proposing just not at runtime (which is no big deal really). And I wasn't suggesting mixing schedulers per CPU. My thought was to have one CPU scheduler exactly as we have it today - either selectable at boot time or based on how much complex it would be to implement, at runtime.<br> <p> If we talk about CFS as it is in mainline - I think its objective of being completely fair is a noble one on paper but does not work well on desktops with workloads that demand interactivity bias in favor of only a certain set of apps. 
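The killall -STOP / -CONT juggling ajb describes a few comments up is easy to script. A hedged sketch that freezes and thaws a stand-in process ("sleep" here, in place of a real memory-hungry application like firefox-bin) and checks its scheduler state via /proc:

```python
import signal
import subprocess
import time

# Scripted version of the manual killall -STOP / killall -CONT trick:
# "sleep" stands in for the real culprit process.
proc = subprocess.Popen(["sleep", "60"])

def task_state(pid):
    # The scheduler state is the field right after the "(comm)" in
    # /proc/<pid>/stat: R running, S sleeping, T stopped, ...
    with open(f"/proc/{pid}/stat") as f:
        return f.read().rsplit(")", 1)[1].split()[0]

proc.send_signal(signal.SIGSTOP)   # park it: gets no CPU at all
time.sleep(0.2)
print("after SIGSTOP:", task_state(proc.pid))   # typically 'T'

proc.send_signal(signal.SIGCONT)   # thaw it again
time.sleep(0.2)
print("after SIGCONT:", task_state(proc.pid))   # back to 'S' or 'R'

proc.kill()
proc.wait()
```

Note that stopping a process only removes it from the run queue; it does not, by itself, push its pages out of RAM, so this helps thrashing mainly by letting the remaining app's working set win.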
Like many people have reported CFS causes movie skips and does worse than BFS for interactivity. I am not saying the problems with CFS are 100% due to it being completely fair by design but it is not hard to imagine it will try to be fair to all tasks and that in itself will not be enough for mplayer to keep running the movie without skips if there are enough processes and not enough CPUs. If it favored running mplayer it would not be completely fair unless we also started renicing the processes - which if you think of it, is fundamentally broken from usability standpoint unless it was made fully automatic which in turn is impossible without user involvement. (Desktop user is simply not going to renice every desktop process he works on and then one has to select what gets more interactivity bonus apart from Xorg - now the browser, later the mail client, etc. you get the idea. I explain more problems with nice a little further down.)<br> <p> Now if we think about the CPU(s) as a finite resource - if people start running more tasks than there are CPUs it becomes clear that a bunch of tasks have to be scheduled less frequently and given less time slice than a bunch of other tasks if we are to maintain interactivity. (In Windows for example - one can set a scheduler switch that either favors foreground tasks (desktop workload) or background (server) tasks.)<br> <p> So if we were to do something like build a scheduler with only goal of latency for interactive processes - we then would not have to worry about throughput in that scheduler. I.e. no conflicting goals, so less complexity and better results. Then one can think of a per process flag which Xorg and its clients can set that tells the desktop scheduler when the process window is foreground and interactive (when it is the topmost window or when a window needs user input) and the scheduler will ensure that it meets its goal of giving that process enough CPU resources to keep it running smoothly. 
This would solve the ugly problem of making the scheduler guess which process is interactive/needs user input or needs to be given interactivity boost so that the desktop feels responsive for the user. In my opinion making a scheduler with conflicting goals also making it guess processes to give interactivity boost simply does not work as the scheduler doesn't have enough data to know for sure what process needs the most interactivity at any given point of time - at least it is not straight forward to make that guess reliably every time, without any hint from the applications themselves.<br> <p> Similarly for servers we could simplify CFS to make sure it remains completely fair and goes after throughput and latency comes second.<br> <p> The benefit of having two schedulers is that of course users can choose one that does what they need - interactivity or fairness. So if someone complains my desktop is jerky when I run make -j128 kernel build, we can tell them to use the desktop scheduler and stop worrying about kernel build times if they are also going to play a movie at the same time. And for people needing fairness they can go with CFS and we can tell them to stop complaining about desktop jerkiness when running kernel builds as long as it is not anomalously jerky -i.e. not completely fair per goal. <br> <p> We then also keep complexity in each scheduler to minimum without penalizing server workloads with interactivity logic and desktop workloads with fairness logic. <br> <p> In short the point I am trying to make is that doing all things in one scheduler as we do it today, without any notion of what process needs user interaction or what process needs to be boosted in order to make the user feel the desktop is more interactive - it is never going to be a 100% success for all parties. 
(Correct me if I am wrong but I don't think we have any separate treatment for multimedia applications - they are just another process from the scheduler's PoV and it fails when there are also other 128 runnable processes that need to run on vastly less than 128 CPUs). Which means that the scheduler needs to be biased to the apps user cares most about - and nice does not work as long as it is a static, one time, user controlled thing. I don't want my browser to be nice -10 all the times - if it is minimized and not being used I want it to be nice +5 and instead have mplayer in the foreground nice'd to -5. Who decides what amount of nice in relation to other nice'd processes is sufficient so mplayer plays without skipping? We need something absolute there unlike nice - if a multimedia application is playing in the foreground - it gets all resources that it needs no matter what - that IMHO is the key to making the desktop users happy.<br> <p> <p> <p> </div> Thu, 10 Sep 2009 11:57:05 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351887/ https://lwn.net/Articles/351887/ mingo <p> It shouldnt have too big cost unless you are really RAM constrained. (read running: a 32 MB system or so) So it's a nice tool if you want to see a general categorization of latency sources in your system. <p> latencytop is certainly useful enough so that several distributions enable it by default. It has size impact on task struct but otherwise the runtime cost should be near zero. <p> Thu, 10 Sep 2009 09:56:18 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351885/ https://lwn.net/Articles/351885/ mingo Thanks for testing it. It would be helpful (to keep reply latency low ;-) to move this to email and Cc: lkml. 
<p> You can test the latest upstream scheduler development tree via: <p> <a href=http://people.redhat.com/mingo/tip.git/README>http://people.redhat.com/mingo/tip.git/README </a> <p> Thu, 10 Sep 2009 09:53:40 +0000 Interactive benchmarks https://lwn.net/Articles/351884/ https://lwn.net/Articles/351884/ man_ls More to the point: even when one side proposed invalid benchmarks, the other side was not able to come up with anything better. (And no, "beat them at their own benchmarks" is not a valid excuse; we are talking about engineering, not about marketing.) Thu, 10 Sep 2009 09:52:38 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/351882/ https://lwn.net/Articles/351882/ mingo <p> Note that what you propose is not what has been proposed on lkml under 'pluggable schedulers' before - that effort (PlugSched) was a build time / boot time scheduler selection approach. <p> Your model raises a whole category of new problems. For example under what model would you mix these pluggable schedulers on the same CPU? Add a scheduler of schedulers? Or can a CPU have only one pluggable scheduler defined at a time? <p> Also, how is this different from having per workload parameters in a single scheduler? (other than being inherently more complex to implement) <p> Thu, 10 Sep 2009 09:50:24 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351819/ https://lwn.net/Articles/351819/ jamesh <div class="FormattedComment"> In the pipe test, neither process is going to be able to fill the pipe buffer. 
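The pipe test under discussion is a simple ping-pong. A minimal Python sketch of the same access pattern (the real pipe-test is a C program, so details differ): two processes bounce a single byte back and forth, so the pipes never fill and every iteration costs two wakeups.

```python
import os
import time

# Ping-pong over two pipes: parent writes a byte, child echoes it
# back, parent reads it. Each round trip forces two scheduler wakeups
# with essentially no work in between.
ITERATIONS = 10_000

parent_to_child = os.pipe()
child_to_parent = os.pipe()

if os.fork() == 0:
    # Child: echo each byte straight back.
    for _ in range(ITERATIONS):
        os.write(child_to_parent[1], os.read(parent_to_child[0], 1))
    os._exit(0)

start = time.perf_counter()
for _ in range(ITERATIONS):
    os.write(parent_to_child[1], b"x")
    os.read(child_to_parent[0], 1)
elapsed = time.perf_counter() - start
os.wait()

print(f"{ITERATIONS / elapsed:.0f} round trips/sec")
```

The resulting score is dominated by how quickly the scheduler runs a just-woken task, not by pipe throughput, which is the point being made in this subthread.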
Each process blocks on the other doing alternating reads and writes on the pipes with pretty much no work in between.<br> <p> I guess it is possible that a scheduler could preempt the task between when the read returns and before it performs the write, but that seems unlikely.<br> <p> My intuition is that performance would primarily depend on how quickly the scheduler gets round to run a process when it becomes unblocked, which is essentially a measure of average scheduling latency (and as I said before, this doesn't tell you much about the variance in that latency).<br> </div> Thu, 10 Sep 2009 05:18:52 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351813/ https://lwn.net/Articles/351813/ russell <div class="FormattedComment"> The pipe test would work best if the scheduler gave each task sufficient time to fill or empty the pipe, depending on it's role. It would suffer badly if it kept preempting those tasks to give some other task a go when it became runnable.<br> <p> The pipe test is more about ordering producers and consumers. Not latency.<br> </div> Thu, 10 Sep 2009 04:01:54 +0000 Interactive benchmarks https://lwn.net/Articles/351778/ https://lwn.net/Articles/351778/ njs <div class="FormattedComment"> <font class="QuotedText">&gt; As to the benchmarks, the first test was how fast can he build the kernel using n processes.</font><br> <p> To be fair, that benchmark is originally Con's, not Ingo's (Con's original announcement claims that "make -j4 on a quad core machine with BFS is faster than *any* choice of job numbers on CFS").<br> </div> Wed, 09 Sep 2009 23:40:32 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/351771/ https://lwn.net/Articles/351771/ paragw <i>How does moving your tunable to boot time make it less of a tunable? </i> <br><br> Where did I say move the tunable to boot time? 
I said the particular modular scheduler can make reasonable assumptions that are best for the objective it is trying to meet - low latency for Xorg and its clients, for example, at the expense of something else (throughput) on desktop systems. Wed, 09 Sep 2009 23:10:30 +0000 Interactive benchmarks https://lwn.net/Articles/351731/ https://lwn.net/Articles/351731/ man_ls You are right, there are no benchmarks that show that BFS is good at interactivity. However I contend that such "hand-waving" is to be expected from an anaesthetist and a crowd of enthusiasts (and is not a bad thing at all). The real pity is that on lkml, a list full of high-flying engineers, nobody has been able to construct those benchmarks or do those measurements either. The best we have is a scheduler hacker posting odd benchmarks on esoteric hardware. No offense to Ingo, he was very respectful and had interesting data, but it was all biased: <blockquote type="cite"> we tune the Linux scheduler for desktop and small-server workloads mostly [...] what i consider a sane range of systems to tune for - and should still fit into BFS's design bracket as well according to your description: it's a dual quad core system with hyperthreading </blockquote> And then repeating the measures on a quad-core machine, the best he has offered so far. It seems that, despite having an expressed focus on the desktop, a netbook and a few days for testing on it are out of reach. <p> As to the benchmarks, the first test was how fast he can build the kernel using n processes. Well, this is only measuring throughput; if each process is supposed to be interactive, it is not unreasonable to expect that they will be more easily interrupted and thus the build will last longer. Then a very artificial pipe-messaging test, followed by similarly contrived benchmarks -- which CFS has already been tuned to. 
So the "other side" (lkml) has not been able to produce anything better either to show that CFS is good at interactivity, measuring skips and jitter, and I find this to be even more pitiful. Wed, 09 Sep 2009 20:14:16 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/351670/ https://lwn.net/Articles/351670/ martinfick <p> <i>If we had a nice modular scheduler interface that allows us to load a scheduler at runtime or choose which scheduler to use at boot time or runtime that would solve the complexity problem and it will work well for the workloads it was designed for. As a bonus I will not have to make decisions on values of tunables - we can make the particular scheduler implementation make reasonable assumptions for the workload it was servicing. </i> </p><p> How does moving your tunable to boot time make it less of a tunable? </p> Wed, 09 Sep 2009 16:31:46 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351624/ https://lwn.net/Articles/351624/ k8to <div class="FormattedComment"> I think the idea of 'normal users' going to LKML with their problems is unworkable. However, I am willing to give it a try with my next interactivity stall. I expect to give up rapidly if faced with derision or brush-off.<br> </div> Wed, 09 Sep 2009 14:33:39 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351597/ https://lwn.net/Articles/351597/ nix <div class="FormattedComment"> I thought CONFIG_LATENCYTOP had horrible effects on the task_struct size and people were being encouraged to *disable* it as a result?<br> <p> </div> Wed, 09 Sep 2009 11:50:52 +0000 BFS vs. 
mainline scheduler benchmarks and measurements https://lwn.net/Articles/351596/ https://lwn.net/Articles/351596/ nix <div class="FormattedComment"> Well, I'm a counterexample: I upgrade my hardware every decade, if that, but the kernels are normally as new as possible, because I'd like newish software, thanks, and that often likes new kernels. Further, everyone I know who isn't made of money and runs Linux does the same thing: they tend to run Fedora, recentish Ubuntu, or Debian testing, because non-enterprise users generally do not want to run enterprise distros because all the software on them is ancient, and non-enterprise distro kernels *do* get upgraded.<br> <p> I suspect your argument is pretty much only true for corporate uses of Linux (i.e. 'just work with *this* set of software', as opposed to other uses which often involve installation of new stuff). But perhaps those are the only uses that matter to you...<br> <p> </div> Wed, 09 Sep 2009 11:41:57 +0000 Well, this settles it for me https://lwn.net/Articles/351594/ https://lwn.net/Articles/351594/ liljencrantz <div class="FormattedComment"> Ingo has said that the graph size was a user error, apologized, and replaced the graphs. Calling him «extraordinary clueless» without knowing the facts is hostile and unmotivated. Mistakes happen.<br> <p> I agree that Ingo's choice of test machine and benchmarks is telling when it comes to what his priorities are - he gets paid to create software that runs well on big systems, 8 CPUs probably looks small to him. No malice or stupidity involved, just a different perspective. <br> <p> I think the ball is firmly in the BFS camp's court. Con won't and shouldn't deal with this, but any random BFS user with a bit of time could sit down and redo a set of benchmarks that _he_ feels are more relevant and use them as a counterpoint. Maybe compiling vim on an Atom CPU as well as some measurements of dropped frames in mplayer while compiling? 
Latencies and stuttering may be hard to measure, but it is far from impossible. Something better than «it feels better when i shake my mouse» is needed.<br> </div> Wed, 09 Sep 2009 11:27:53 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351591/ https://lwn.net/Articles/351591/ etienne_lorrain@yahoo.fr <div class="FormattedComment"> I also have some strange behaviour on a no-name dual core all-intel portable PC: stretches of 2-4 seconds where the mouse is not even moving, without any load whatsoever, no log in /var/log/messages, completely random.<br> This portable PC is a cheap, "designed for the other OS" system even if it was sold without anything installed: the DMI information is blank, and the ACPI information does not seem to be any better.<br> I tend to think that it is an SMM problem instead of a scheduler problem: the crappy BIOS (which cannot be updated because there is no DMI name) does not like Linux, or was explicitly designed to give a bad experience. I would really like to be wrong here.<br> There was a time when Linux did not rely on any BIOS, but that is no longer true (SMM cannot be disabled, even under Linux - it is what handles the forced power-off when the On/Off button is pressed for more than 3 seconds).<br> </div> Wed, 09 Sep 2009 10:08:56 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351589/ https://lwn.net/Articles/351589/ realnc <div class="FormattedComment"> I've tried those tweaks. They don't really help much.<br> </div> Wed, 09 Sep 2009 08:42:38 +0000 Because I can https://lwn.net/Articles/351581/ https://lwn.net/Articles/351581/ gmaxwell <div class="FormattedComment"> Jitter, frame drops, and audio skips are all *easily measurable*. Yet *none* of the advocacy of BFS that I've seen includes any measure of these things. Only vague hand-waving about smoothness. Perhaps these people should color the edges of their disks with green markers...
I hear it reduces jitter.<br> <p> Meanwhile I do audio processing with a ~2ms processing interval using the mainline scheduler, thrashing the system, high loads... and underruns are basically unheard of, at least after tossing the drivers and hardware that I determined were misbehaving (with measurements, ... imagine that!) <br> <p> I don't doubt that there are genuine areas for improvement, even in the scheduler, but it isn't going to get better without real measurements and some social skills superior to those of Hans Reiser. <br> <p> </div> Wed, 09 Sep 2009 07:57:26 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351545/ https://lwn.net/Articles/351545/ alankila <div class="FormattedComment"> Well, what I get is that the full-screen button flashes a full-screen video for a frame or two, and invariably falls back to windowed mode. I'm not sure if that is the symptom for him, though.<br> <p> It could also be another problem: right now mouse button clicks within the flash applets don't seem to register -- I have to start video playback by pressing space because clicking with the mouse somehow doesn't seem to go through. Especially pressing the full-screen button does absolutely nothing right now. *sigh*<br> </div> Tue, 08 Sep 2009 23:59:36 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351525/ https://lwn.net/Articles/351525/ ikm <div class="FormattedComment"> I think you can just try asking Con nicely.<br> </div> Tue, 08 Sep 2009 21:04:34 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351508/ https://lwn.net/Articles/351508/ maxbg <div class="FormattedComment"> Hello, first post here :)<br> <p> I would really like to find out the actual algorithm BFS uses. Reading the patches gives me little information as I <br> am not a kernel hacker (yet :).<br> I know it uses a global runqueue for all CPUs ... and it does not measure sleep time.
What are the deadlines it uses?<br> How does it differ from SD and RSDL?<br> </div> Tue, 08 Sep 2009 19:37:43 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351497/ https://lwn.net/Articles/351497/ realnc <i>Does it behave in an anomalous way for you? What would you expect it to do and what does it do for you currently?</i> <p>It does behave "anomalously." A simple example would be mplayer (or any other video player) or an OpenGL app "hanging" for a bit while I leave my mouse over the clock in the systray. This brings up details about the current time (what day it is, month, etc) in a "bells and whistles" pop-up that doesn't just appear out of the blue but slowly fades in using transparency. It is for the duration of this compositing effect (which actually doesn't even need that much CPU power) that mplayer stalls, barks and drops frames.</p> <p>Now imagine how bad things can seem when virtually every action (opening menus, switching desktops, moving windows, etc.) results in frame skipping, sound stuttering, mouse pointer freezes, etc. The applications perform well, that's not the problem. The problem is that due to the skips and lag, they *seem* to be sluggish. Not in a dramatic way, but still annoying. I was actually quite used to Linux behaving like that. But after applying the BFS patch, Linux joined the list of "smooth GUI" OSes (alongside OS X and MS Vista/7). That's how a desktop should feel. Frankly, I never quite suspected the kernel to be at fault here, but rather the applications themselves. But after seeing BFS solve all those problems, it seems the kernel can be at fault for such things.</p> <p>The Android folks also confirmed that their devices ran much more fluidly and responsively after they loaded a custom firmware on them with a BFS-patched kernel. Folding users report increased folding performance that no longer interferes with their GUI.
This can't be coincidence.</p> Tue, 08 Sep 2009 19:12:18 +0000 BFS vs. mainline scheduler benchmarks and measurements https://lwn.net/Articles/351489/ https://lwn.net/Articles/351489/ jzbiciak <div class="FormattedComment"> I wonder if it might be a different effect. My dual dual-core Opteron box (4 CPUs across 2 chips) dynamically scales the frequency of the CPUs based on load. <br> <p> What I don't know is the cost of doing so. That is, when it switches from 1GHz to 2.4GHz, yes, it got faster, but was there, say, a 1ms hitch between the two? Did that hitch affect both cores on that die or just one? If there was a cache-to-cache coherence transfer at the time, did it also experience that hitch?<br> <p> These details could vary by processor platform, vendor and maybe even chipset and BIOS if the switch is effected via SMM or the like. A sloppier CPU scheduler that kept all the CPUs in the high-frequency state (or low-frequency state) would eliminate these sorts of hitches, whereas one that kept the load more concentrated might experience more such hitches when the occasional background load spills onto the CPU that was left sleeping.<br> </div> Tue, 08 Sep 2009 18:13:29 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/351418/ https://lwn.net/Articles/351418/ paragw [ Gaah - Here is a better-looking copy of the above comment ] <br><br> <i>No reboots needed, only a single scheduler needs to be maintained, only a single scheduler needs bugfixes - and improvements to both workloads will flow into the same scheduler codebase so server improvements will indirectly improve the desktop scheduler and vice versa. Sounds like a nice idea, doesn't it?</i> <br><br> Well no, I don't think so. My line of thinking was that making one scheduler balance the arbitrary needs of multiple workloads leads to complexity and suboptimal behavior.
<br><br> If we had a nice modular scheduler interface that allowed us to load a scheduler at runtime, or to choose which scheduler to use at boot time or runtime, that would solve the complexity problem, and each scheduler would work well for the workloads it was designed for. As a bonus I would not have to make decisions about tunable values - each scheduler implementation could make reasonable assumptions for the workload it is servicing. <br><br> And if you ask me, I will take 5 different code modules that each do one simple thing over 1 code module that tries to achieve 5 different things at once. <br><br> After all, if we can have multiple IO schedulers, why can't we have multiple selectable CPU schedulers? Are there technical limitations or complexity issues that make us not want to go with pluggable schedulers? Tue, 08 Sep 2009 14:01:31 +0000 pluggable schedulers vs. tunable schedulers https://lwn.net/Articles/351417/ https://lwn.net/Articles/351417/ paragw <i>No reboots needed, only a single scheduler needs to be maintained, only a single scheduler needs bugfixes - and improvements to both workloads will flow into the same scheduler codebase so server improvements will indirectly improve the desktop scheduler and vice versa. Sounds like a nice idea, doesn't it? </i> Well no, I don't think so. My line of thinking was that making one scheduler balance the arbitrary needs of multiple workloads leads to complexity and suboptimal behavior. If we had a nice modular scheduler interface that allowed us to load a scheduler at runtime, or to choose which scheduler to use at boot time or runtime, that would solve the complexity problem, and each scheduler would work well for the workloads it was designed for. As a bonus I would not have to make decisions about tunable values - each scheduler implementation could make reasonable assumptions for the workload it is servicing.
And if you ask me, I will take 5 different code modules that each do one simple thing over 1 code module that tries to achieve 5 different things at once. After all, if we can have multiple IO schedulers, why can't we have multiple selectable CPU schedulers? Are there technical limitations or complexity issues that make us not want to go with pluggable schedulers? Tue, 08 Sep 2009 13:59:04 +0000
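[Editor's note] Several commenters in this thread argue that jitter and stalls are easily measurable rather than a matter of feel, but none of the comments shows a concrete measurement. As a minimal, hypothetical sketch (not taken from any of the posters, and far cruder than a dedicated tool such as cyclictest from the rt-tests suite), the wakeup overshoot of a periodic userspace task, one simple proxy for scheduler latency, could be measured like this:

```python
import time

def wakeup_jitter(interval=0.002, iterations=200):
    """Sleep on a fixed 'interval' period and record how far past each
    intended deadline the process actually woke up.  The overshoot
    includes timer granularity plus scheduler wakeup latency, so the
    maximum is a crude worst-case latency figure for this workload."""
    overshoots = []
    deadline = time.monotonic() + interval   # time.monotonic() needs Python 3.3+
    for _ in range(iterations):
        remaining = deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)            # sleeps at least 'remaining'
        overshoots.append(time.monotonic() - deadline)
        deadline += interval                 # keep an absolute schedule
    return max(overshoots), sum(overshoots) / len(overshoots)

if __name__ == "__main__":
    worst, average = wakeup_jitter()
    print("worst wakeup overshoot:   %.3f ms" % (worst * 1e3))
    print("average wakeup overshoot: %.3f ms" % (average * 1e3))
```

Running this once on an idle box and once while a kernel compile is thrashing the machine, under both schedulers, would produce the kind of directly comparable numbers the thread keeps asking for, instead of "it feels smoother."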