
BFS vs. mainline scheduler benchmarks and measurements

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 5:00 UTC (Mon) by MattPerry (guest, #46341)
In reply to: BFS vs. mainline scheduler benchmarks and measurements by nash
Parent article: BFS vs. mainline scheduler benchmarks and measurements

I thought that Con's response was completely appropriate. The test machine that Ingo used, and the tests he performed, were not what BFS was designed for. Ingo is either being disingenuous or just didn't bother to read the FAQ. If Ingo wants to conduct a useful test, he should try the scheduler on a single-processor, dual-core machine performing tasks that normal, non-programmer computer users would perform (music listening, web browsing, file copies, word processing, and so forth).



I understand Con's response

Posted Sep 7, 2009 5:46 UTC (Mon) by aorth (subscriber, #55260) [Link] (5 responses)

It's important to read the Gmane thread linked a few comments up. While I'm excited about Linux in the server/high-performance space, my Thinkpad only has one core, and I've seen a marked increase in responsiveness with Con's BFS. These are things that can't be benchmarked but make all the difference when using Linux on a desktop (like the time it takes my gmrun box to pop up when I hit Alt-F2 in Fluxbox).

"Can't be benchmarked" – No.

Posted Sep 7, 2009 6:14 UTC (Mon) by quotemstr (subscriber, #45331) [Link] (4 responses)

The explanation is less technical and more psychological. What you're seeing is observer bias. See that poster on LKML who claimed that he was seeing improvements in sub-second lag in a 3D FPS (which is probably spinning at 100% CPU anyway)? That's precisely the kind of environment most susceptible to observer bias: a supposedly small effect in a noisy signal like game latency.

I'll believe there's something to this "can't be benchmarked" nonsense when I see a double-blind experiment run that shows a statistically significant effect. As the old saying goes, "data is not the plural of anecdote".

"Can't be benchmarked" – No.

Posted Sep 7, 2009 6:46 UTC (Mon) by flewellyn (subscriber, #5047) [Link]

Perhaps there is a way to benchmark such things: make a test 3D program which plays multimedia onto a rotating 3D cube, and outputs the frame rate and other latency data on the screen. Run this, then start up some other things that contend for the scheduler, like some I/O (copying a large file?), some network traffic (pinging a host in the LAN?), and such. See how the 3D app holds up under such strain, by watching the numbers.

I don't know how well this would work, but it'd be a test of some kind.

"Can't be benchmarked" – No.

Posted Sep 7, 2009 13:09 UTC (Mon) by cesarb (subscriber, #6266) [Link] (1 response)

The same poster mentioned frame drops in mplayer. That would be fairly easy to convert into a benchmark: if mplayer does not print the number of dropped frames to the console, edit its source code to make it do so; then write an app that moves the mplayer window around the screen pseudo-randomly and drops it back to the desktop, and see how many frames you can make it drop.

All the other examples mentioned by that poster sound like they could be benchmarked with some coding effort. For instance, in the Doom 3 example, you would measure not the frame rate but the frame jitter (record the time at the end of the "flush" call which actually pushes the image to the screen for each frame, subtract the previous frame's time, and see what the highest difference is and how uniform the differences are). Even if for some reason you cannot change the source code of your game, you can change the source code of the libraries it calls to do the "flush", or even interpose with LD_PRELOAD or something like it.

You could even measure the "input lag" in his sound example by building a hardware contraption which "presses a key" (by pretending to be a keyboard), listens to the analog audio output, and logs the time difference between the input and the output.

This all seems benchmarkable without the need for a double-blind test.

"Can't be benchmarked" – No.

Posted Sep 7, 2009 17:12 UTC (Mon) by cesarb (subscriber, #6266) [Link]

This is what I meant by interposing with LD_PRELOAD:

http://github.com/cesarb/glxswapbuffersmeasure/tree/master

This is a small quick-and-dirty library I just wrote which hooks into glXSwapBuffers via LD_PRELOAD and prints some statistics to stderr on exit.

An example of its output with everyone's favorite "benchmark" tool, glxgears, on an outdated distribution (thus an older kernel):

LD_PRELOAD=./glxswapbuffersmeasure.so glxgears
1142 frames in 5.0 seconds = 228.375 FPS
1035 frames in 5.0 seconds = 206.474 FPS
934 frames in 5.0 seconds = 186.540 FPS
glXSwapBuffers count: 3947, avg: 0.004757, variance: 0.000045, std dev: 0.006699, max: 0.204504

I did some moving of windows around to make it stutter a bit more, and the output from my test library shows it (200 ms max latency, which corresponds to around 5 FPS). Note that the average time between glXSwapBuffers calls approximately matches glxgears's FPS printout.

It should be quite simple for someone who sees latency problems that seem to be cured by BFS to run the same 3D game with something like this library under both the mainline scheduler and BFS and see whether it shows any differences in the output. Of course, the code I posted can be enhanced to gather better statistics (like a histogram of the latencies); I put the code under Creative Commons CC0 (roughly similar to "public domain").

"Can't be benchmarked" – No.

Posted Sep 7, 2009 13:49 UTC (Mon) by job (guest, #670) [Link]

I disagree. You could, for example, easily count the number of buffer underruns with pulseaudio when playing an mp3 at the same time as you compile the kernel. Then you could do the same thing with a movie.

These are the kinds of things normal users do when latency really counts. The problem is not measuring it, the problem is that nobody is really interested.

Morton's Fork

Posted Sep 7, 2009 6:35 UTC (Mon) by quotemstr (subscriber, #45331) [Link] (2 responses)

First of all, the burden of proof is on BFS advocates to provide a better test. Ingo's test was well described and performed under reasonable conditions; Kolivas provided no comparably rigorous numbers. Your suggestion, to test what users actually use, puts kernel developers in an unreasonable dilemma. On the one hand, kernel developers can test the tasks that "users would perform", but because the results of such tests are hard to quantify, they are meaningless without an expensive, inconvenient double-blind satisfaction study. (And really, the onus is on BFS advocates to provide one if that's what it takes.)

On the other hand, kernel developers can use contrived tests like the pipe example that are easily quantified, but that only approximate user workloads. These tests can be improved, but one will always be able to claim that they don't measure what users "really" do. Either way, the claim that BFS is superior will have been made unfalsifiable and unscientific.

Morton's Fork

Posted Sep 7, 2009 12:57 UTC (Mon) by Lennie (guest, #49641) [Link] (1 responses)

Let's start with 'frames skipped' in mplayer or vlc or something.

Morton's Fork

Posted Sep 11, 2009 1:51 UTC (Fri) by Spudd86 (guest, #51683) [Link]

Ingo mentions further up that he does test exactly this on low-end machines.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 6:53 UTC (Mon) by bvdm (guest, #42755) [Link]

Did Ingo claim that he was testing BFS against the mainline scheduler for BFS's intended use cases? No.

"I'd guess that a machine with more than 16 CPUS would start to have less performance."

Even if you consider hyper-threading to result in 16 cores (which is not the case at all), in his FAQ Con claims that BFS should perform well up to 16 cores.

I realize that people can be very passionate about Linux, and that the scheduler, as something that critically affects the user experience, becomes important, but is all this emotion really necessary?


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds