BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 4:52 UTC (Mon) by flewellyn (subscriber, #5047)
In reply to: BFS vs. mainline scheduler benchmarks and measurements by nash
Parent article: BFS vs. mainline scheduler benchmarks and measurements

I begin to understand why Con's prior work was not included in mainline...



BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 5:08 UTC (Mon) by tajyrink (subscriber, #2750) [Link] (27 responses)

I think Con's response was precisely reiterating what he was already accusing kernel devs of: everything has to scale. And, additionally, using things like compiling the kernel and piping messages as "benchmarks".

How about perceived user experience blind tests when using Firefox on a netbook? (because it's hard to benchmark responsiveness to user interaction vs. completion time)

I think there would be demand for a new tool that tracks any user interaction with the time it takes to have a proper response. Probably not really generally doable, but a set of benchmarks that can be used to test this would be beneficial.
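As a very rough starting point for such a tool, here is a minimal sketch of one proxy for responsiveness: how late a task wakes up after asking to sleep for a short interval. This is not the same thing as click-to-response time, and the function names and defaults (`measure_wakeup_overshoot`, `summarize`, 5 ms, 200 samples) are purely illustrative assumptions, not an existing benchmark.

```python
import statistics
import time


def measure_wakeup_overshoot(interval_s=0.005, samples=200):
    """Sleep repeatedly and record how late each wakeup was, in milliseconds."""
    overshoots_ms = []
    for _ in range(samples):
        start = time.monotonic()
        time.sleep(interval_s)
        elapsed = time.monotonic() - start
        # Anything beyond the requested interval is delay added by the system.
        overshoots_ms.append(max(0.0, elapsed - interval_s) * 1000.0)
    return overshoots_ms


def summarize(overshoots_ms):
    """Return the median, 99th-percentile and worst-case overshoot."""
    ordered = sorted(overshoots_ms)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "median_ms": statistics.median(ordered),
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }


if __name__ == "__main__":
    print(summarize(measure_wakeup_overshoot()))
```

Running it once on an idle machine and once under a background load (a kernel build, say) would give two sets of numbers to compare; the tail values are probably more telling than the median, since a stall is an outlier by nature.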

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 5:32 UTC (Mon) by flewellyn (subscriber, #5047) [Link] (26 responses)

All good points, but Con's attitude was terrible. Ingo's really not, I don't think, trying to drive him away, or invade his space, or anything. Just trying to work with him.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 6:31 UTC (Mon) by nash (guest, #50334) [Link] (24 responses)

To be fair on Con, you could argue there was a bit of trolling going on Ingo's side, with the choice of hardware and the like.

However it was a troll with real numbers associated.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 6:38 UTC (Mon) by flewellyn (subscriber, #5047) [Link] (23 responses)

A dual quad-core system with hyperthreading? That's hardly a bad choice of hardware for the "desktop system" scale. That's the standard CPU setup for a Mac Pro, for instance.

Granted, it's the upper end of the "desktop system" scale, but he did say as much.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 7:51 UTC (Mon) by drag (guest, #31333) [Link] (20 responses)

There is not one single person I know that owns a dual-socket desktop.

It's just not a desktop machine. There is a very good reason for this... it dramatically increases the cost of the board and doubles the cost of the CPU for pretty much no good reason. It's not going to benefit you in any way for surfing the internet or playing games or even processing media.

The only people who would benefit from a system like that are those compiling software, running long render batch jobs, and the like. That is just not a typical desktop workload.

The mainstream desktop system is very obvious to me.

Core2Duo Intel laptop, dual-core AMD desktop, single-core Atom processor. Those are the CPUs that you're going to see on a typical Linux system.

I know lots of people that have P4 machines, a few people still using P3 laptops, a bunch of Core2Duo laptops, and a bunch of people owning netbooks for various reasons (high mobility, secondary computer, regular laptops are too expensive, etc).

Dual-socket quad-core systems? That's just not the target audience for the most part.

----

That being said, I don't think it would make a big difference. Ingo's testing is probably going to reflect accurate performance for machines less powerful than that one. But I can't be sure about that. It would have really had more impact if the tests were carried out on a dual-core machine.

That, and the point of BFS is to make things more friendly and more interactive. That is hard to benchmark, and having something that is very responsive to user input would probably be slightly less efficient overall, even though users would actually prefer it.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 7:58 UTC (Mon) by dlang (guest, #313) [Link] (15 responses)

today Intel is selling single socket systems with 6 real cores + hyperthreading (simulating 12 cores)

they have in their roadmap to be selling single socket systems with 8 real cores in less than a year.

so what today is a two socket 'business only' system is next summer's (or next christmas') power user system

just like about a year ago the only people with 8 cores were the high-end 4 socket systems, and the only people with 4 cores were dual socket server systems.

nowadays it's common for single socket systems to have 4 cores (+ hyperthreading)

Yes he did pick a system at the high end of what BFS claims to support (and I would like to see how it fares with 4 or so cores in use), but at the same time, the benchmark numbers weren't a matter of a couple percentage points of difference, on one benchmark the time went from < 4 seconds to > 40 seconds, 10x worse.

that doesn't mean that BFS is junk, just that it's not finished, but utterly dismissing (and ignoring) the results is not a good start for discussions.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 8:15 UTC (Mon) by kragil (guest, #34373) [Link] (10 responses)

Most of the machines in the real world are single cores.

Most of the machines sold today are dual cores ( real or only with HT like Atom ).

Most people still don't buy big desktops with quad-cores, they buy cheap laptops/netbooks.

It will take a long time before most computers sold will have more than 16 cores as the computers that were sold the last 4 years are perfectly capable of doing everything a non-gamer/non-kernel dev needs.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 8:44 UTC (Mon) by dlang (guest, #313) [Link] (1 responses)

actually, if you want to start playing the 'most of the computers are.. ' games

most of the computers in use in the world are 8 bit cpu's

most of the new computers sold each year are _still_ 8 bit cpu's (by a smaller margin than in prior years, true, but still the winner)

so by that argument, both linux and windows are completely irrelevant since neither of them will run on the majority of computers around or being sold.

what Con should have done was to respond that 16 (simulated) cores is too many for the current stage of BFS code, and told Ingo that with X cores it is still solidly in its sweet spot. Ingo could then go back and run the tests again to see what results he gets.

if with 4 cores his benchmarks still show the machine completely locking up, Con would then need to look at BFS to see why it's so bad for some workloads (which is exactly what he lambastes the kernel scheduler for)

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 9:43 UTC (Mon) by stijn (subscriber, #570) [Link]

Clearly the game is "most of the desktop computers are …". This makes the first half of your response rather moot (and detracts from the rest). Admittedly my own (this) response has little to offer except nitpicking, but I care about the particular nit where no effort is made to understand someone's position. It accounts for about 99.9% of flame wars.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 10:02 UTC (Mon) by xav (guest, #18536) [Link] (7 responses)

Most computers will shortly be smartphones running some kind of linux kernel.
And they'll be very picky about reactivity.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 10:24 UTC (Mon) by kragil (guest, #34373) [Link] (6 responses)

That is totally OK _with me_.

I think Linux has a lot of cruft that is only useful on supercomputers/monster X-cores and in the future some kernel devs want to see.

Optimising for smartphones/smartbooks/MIDs/netbooks is really needed and the benchmarks should be very very different like response time to clicks under load or frame drops while playing video etc... at the moment the Linux desktop freezes and skips way too often.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 12:53 UTC (Mon) by mingo (subscriber, #31122) [Link] (5 responses)

> at the moment the Linux desktop freezes and skips way too often.

We take such problems seriously - please post to lkml about this, with the scheduler maintainers (Peter Zijlstra and me) Cc:-ed.

We have many good tools that can get to the bottom of such skipping, if there are people willing to report problems and willing to trace latencies and test patches.

Both Peter Zijlstra and I have, and test on, low-spec systems as well. I've got an 833 MHz Pentium-3 laptop that i (auto-)reboot new kernels into about 10 times every day with new -tip kernels. Peter has a 1.2 GHz Pentium-mobile laptop for interactivity testing. My daily desktop is a dual-core box - not some big honking server machine.

But ... we can only fix the scheduler if you help out too and report your interactivity problems on lkml.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 13:20 UTC (Mon) by k3ninho (subscriber, #50375) [Link] (3 responses)

Are you in the process of testing BFS on your 'low end' PIII laptop?

How many people report bugs of stuttering, lockups and hangs anyway? I'd forgive you for thinking it's not a problem because the CFS and Deadline schedulers have been good for me and my home-use workload.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 14:36 UTC (Mon) by mingo (subscriber, #31122) [Link] (2 responses)

> Are you in the process of testing BFS on your 'low end' PIII laptop?

Not likely - it took 8+ hours to do the quad core tests and a single kernel build iteration takes 1-2 hours on this box.

But that box is perfect for audio skipping problems. Right now it can play an mp3 stutter-free while a make -j3 job is running on it. That's roughly in line with what i'd expect from that box.

> How many people report bugs of stuttering, lockups and hangs anyway? I'd forgive you for thinking it's not a problem because the CFS and Deadline schedulers have been good for me and my home-use workload.

We have on the order of one such bugreport per kernel cycle (3 months). They generally get fixed if they are reported and if the reporter reacts to feedback and further testing requests.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 18:57 UTC (Mon) by drag (guest, #31333) [Link]

Be sure to throw PulseAudio in there. :)

It'll end up being a required desktop component since it's the only system developed so far that can handle hotplugging audio devices _and_ network audio in an effective manner. This means on-the-fly audio configuration changes, which means that USB headsets for VoIP and gaming, bluetooth audio devices, and USB docking stations (etc etc), which are now increasingly common, cannot be handled in a sane manner without PA's ability to do on-the-fly reconfiguration.

Then you'll need to do some graphical benchmarks. Maybe some of those things from Mesa or whatever. They're little things. Just stuff that runs for a few seconds at a time. Those phoronix folks have their benchmark suite and maybe that would be useful for you guys.

The point for interactivity, as I see it, is adapting to changing workloads. Playing a mp3 + doing a kernel compile is fairly static and the system has time to adapt to it, and whatnot. The system should have a "peaky" workload with occasional high loads and whatnot.

Not that I experienced many problems with the modern kernel compiled with preemption enabled. At least nothing that stands out in my mind right now.

Measuring on down-to-earth hardware

Posted Sep 7, 2009 22:30 UTC (Mon) by man_ls (guest, #15091) [Link]

> Not likely - it took 8+ hours to do the quad core tests and a single kernel build iteration takes 1-2 hours on this box.

Pity. It would be interesting to run your benchmarks on a PIII, even if it takes 5 days; or tune them to last less. Just about any current netbook would do too. Any takers?

As a socratic exercise: just what would it prove if BFS performed better than CFS? And then, what would we learn if the reverse happened and CFS bested BFS?

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 9, 2009 14:33 UTC (Wed) by k8to (guest, #15413) [Link]

I think the idea of 'normal users' going to LKML with their problems is unworkable. However, I am willing to give it a try with my next interactivity stall. I expect to give up rapidly if faced with derision or brush-off.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 8:43 UTC (Mon) by iive (guest, #59638) [Link] (3 responses)

I'm not CPU expert or kernel expert, so feel free to correct me.

However I do have the feeling that hyperthreading is the reason for these suboptimal benchmarks. The BFS scheduler could have been made with the assumption that each core runs at the same speed, so it would finish X work in Y time on any core. With hyperthreading this is not true, as both threads share the same core. In general, CPUs have more computational units than can be used at any given moment. So the second h-thread is "lurking" behind and reusing the free units when the first h-thread cannot utilize them. This is why HT on P4 gave only a 30% boost in the best case.
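For anyone who wants to check which logical CPUs on their box actually share a core, here is a small sketch that reads the `thread_siblings_list` files Linux exposes in sysfs. The helper names (`parse_cpu_list`, `group_siblings`, `read_sibling_lists`) are made up for illustration, and the sketch assumes the usual "0,8" or "0-3" list format; it says nothing about how either scheduler uses this information.

```python
import glob


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,8' or '0-3,8-11' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_siblings(sibling_lists):
    """Collapse per-CPU sibling lists into the distinct groups sharing a core."""
    return sorted({tuple(parse_cpu_list(text)) for text in sibling_lists})


def read_sibling_lists():
    """Read every CPU's sibling list from sysfs (Linux only; empty elsewhere)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as handle:
            lists.append(handle.read())
    return lists


if __name__ == "__main__":
    for core in group_siblings(read_sibling_lists()):
        print("hardware threads sharing one core:", core)
```

A group with more than one entry means those logical CPUs are hyperthreads of the same core, so work scheduled on both at once will not run at full single-thread speed.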

This could also explain why only some people with Intel CPU notice issues, while others don't.

I also wonder how many of the stock CFS heuristics are tuned for HT scheduling and how many special cases are there.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 8, 2009 18:13 UTC (Tue) by jzbiciak (guest, #5246) [Link] (2 responses)

I wonder if it might be a different effect. My dual dual-core Opteron box (4 CPUs across 2 chips) dynamically scales the frequency of the CPUs based on load.

What I don't know is the cost of doing so. That is, when it switches from 1GHz to 2.4GHz, yes, it got faster, but was there, say, a 1ms hitch between the two? Did that hitch affect both cores on that die or just one? If there was a cache-to-cache coherence transfer at the time, did it also experience that hitch?

These details could vary by processor platform, vendor and maybe even chipset and BIOS if the switch is effected via SMM or the like. A sloppier CPU scheduler that kept all the CPUs in the high-frequency state (or low frequency state) would eliminate these sorts of hitches, whereas one that kept the load more concentrated might experience more such hitches when the occasional background load spills onto the CPU that was left sleeping.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 9, 2009 10:08 UTC (Wed) by etienne_lorrain@yahoo.fr (guest, #38022) [Link]

I also have some strange behaviour on a no-name dual core all-Intel portable PC, kind of 2-4 seconds where the mouse is not even moving, without any load whatsoever, no log in /var/log/messages, completely random.
This portable PC is a cheap "designed for the other OS" system, even if it was sold without anything installed: the DMI information is blank, and the ACPI information does not seem to be any better.
I tend to think that it is an SMM problem rather than a scheduler problem: the crappy BIOS (cannot update because no DMI name) does not like Linux, or was explicitly designed to give a bad experience. I would really like to be wrong here.
There was a time when Linux did not rely on any BIOS, but that is no longer the case (SMM cannot be disabled, even under Linux - it is what handles the forced power off when pressing the On/Off button for more than 3 seconds).

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 10, 2009 22:23 UTC (Thu) by efexis (guest, #26355) [Link]

This, I believe, was more of an issue in the past than it is now; CPUs can ramp up their speed much more quickly than they used to. One problem, for example, was that higher CPU speeds require higher voltage, which could cause delays with the CPU stalling while the voltage stepped up. Now the voltage is pushed up a split moment before the frequency is ramped up, so there's no stall.

Otherwise, it's all down to the CPU, with different models taking different amounts of time to change frequency. It can make sense to jump to the highest frequency when the usage goes up and then slow it down if needed (as the ondemand governor does), or to scale it up step by step. Where responsiveness is important, you want to set a lower watermark, so the CPU is always running at, say, twice the speed you need, and you always have room to move into while you wait for the CPU to speed up (e.g., when load goes from 50% to 80%, the CPU speeds up to bring the load back down to 50%; only if load reaches 100% have you not sped up quickly enough). Of course, if you wish to conserve more power, you run the CPU at speeds closer to load.

In Linux, there are many tunables for you to play with to get the responses you wish (/sys/devices/system/cpu/cpu?/cpufreq/<governor>). To see what's available on the Windows platform, there's a free download you can find by googling rmclock that properly spoils you for configuration options. There's no one rule that has to fit all; during boot-up the kernel will test transition speeds and set defaults accordingly.
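To put numbers on the watermark example, here is a back-of-the-envelope sketch. It assumes the work done scales linearly with clock frequency, which real workloads only approximate, and it is not how any actual governor is implemented; `target_frequency_khz` and `read_governor` are names invented for this illustration. The `scaling_governor` file is the standard cpufreq sysfs entry, read here only if it exists.

```python
def target_frequency_khz(current_khz, load, target_load=0.5, max_khz=None):
    """Frequency needed to bring `load` (0.0-1.0) back down to `target_load`.

    Assumes throughput is proportional to frequency (a simplification).
    """
    if not 0.0 < target_load <= 1.0:
        raise ValueError("target_load must be in (0, 1]")
    wanted = current_khz * load / target_load
    if max_khz is not None:
        # The hardware cannot go faster than its top frequency.
        wanted = min(wanted, max_khz)
    return int(round(wanted))


def read_governor(cpu=0):
    """Return the active cpufreq governor for one CPU, or None if unavailable."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_governor"
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    print("governor:", read_governor())
    # 1 GHz at 80% load, aiming for 50% load.
    print(target_frequency_khz(1_000_000, 0.8))
```

With a 50% watermark, a CPU at 1 GHz that hits 80% load would need roughly 1.6 GHz to get back to 50%; if the part tops out below that, the headroom is simply gone and the load stays above the watermark.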

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 8:14 UTC (Mon) by ketilmalde (guest, #18719) [Link] (1 responses)

> There is not one single person I know that owns a dual-socket desktop.

I have an old dual Pentium-II 450MHz. It's not actually in use anymore, though, so it probably doesn't count.

> That being said I don't think it would make a big deal.

I think it might - things like processor affinity are likely to matter a great deal more on multiple-socket systems than on just multicore systems. Multicore chips typically come with a large, shared cache, so moving threads across cores isn't as costly as moving them across sockets.
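If you want to see the affinity effect for yourself, one hedged way is to pin a test process to particular CPUs and compare timings. The sketch below uses Python's `os.sched_setaffinity` and `os.sched_getaffinity`, which are available on Linux only; `pin_to_cpus` is a hypothetical helper name, and which CPU numbers sit on which socket is something you would have to look up for your own machine.

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict a process (0 = the caller) to the given CPUs; return the new mask."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity calls are not available here")
    os.sched_setaffinity(pid, set(cpus))
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))
    # Pin to the first CPU we are currently allowed to run on.
    print(pin_to_cpus(allowed[:1]))
```

Running the same benchmark pinned within one socket and then spread across two would show how much of the difference comes from cross-socket migration rather than from the scheduler's other decisions.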

From what I read, BFS doesn't even try to be NUMA-aware, so it doesn't seem unreasonable that it would perform quite differently on single and multi-socket systems.

-k

BFS vs. mainline scheduler benchmarks and measurements

Posted Jun 8, 2010 13:22 UTC (Tue) by vonbrand (subscriber, #4458) [Link]

Way back when, I confiscated a dual Pentium Pro (200MHz) to use as a desktop machine for a class I was teaching... the machine was old already (I actually cannibalized two of them to get a working one).

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 13:34 UTC (Mon) by da4089 (subscriber, #1195) [Link] (1 responses)

At my office, everyone has an 8-core desktop.
At home, people tend to have single-socket, quad-core desktops.
Laptops are mostly dual-core, although the last two guys who bought one got quad-core, 17" monsters.

So, I think Ingo was reasonable in his choice of platform.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 8, 2009 0:18 UTC (Tue) by awalton (guest, #57713) [Link]

> At home, people tend to have single-socket, quad-core desktops.

I want to live at your home. We've bought 4 new home PCs in the past two years, including one just a month ago. They're all dual cores. Even my brand new laptop is dual core.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 10:30 UTC (Mon) by nye (subscriber, #51576) [Link] (1 responses)

>A dual quad-core system with hyperthreading? That's hardly a bad choice of hardware for the "desktop system" scale. That's the standard CPU setup for a Mac Pro, for instance.

I hope this doesn't sound trollish, but if you think that's even remotely realistic for even one percent of the PC user base, then you are living in a fantasy realm - or perhaps five years in the future.

I've never even *seen* a computer that powerful. A machine like that would cost *thousands* - nobody spends more than £500 on a computer unless they are a serious enthusiast who happens to be rolling in money - the average user, if they think about it at all, is around now thinking that it might be time to get one of those newfangled dual-core machines.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 10:34 UTC (Mon) by nye (subscriber, #51576) [Link]

Okay I realise that that probably did sound trollish, and it's been better covered upthread. My apologies.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 12:16 UTC (Mon) by fb (guest, #53265) [Link]

Yes, Con's response was outright rude. OTOH he had made clear that he wasn't interested in that kind of discussion.

However Ingo was obviously down the route of "lies & benchmarks". The point of the scheduler is low-end machines, and responsiveness. Ingo posts a benchmark with a ridiculously high-end machine, and measuring performance.

I just wish that Con had had the cool head to politely point out that if you ask the wrong question, you get the wrong answer.

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 8:09 UTC (Mon) by sitaram (guest, #5959) [Link]

Maybe, but a couple of para's into Ingo's email I did wonder how relevant the testbed/tests were to my normal workload.

Con's email merely confirmed my suspicions.

Too bad... I might now have to take off my "user" hat (distro supplied stuff only, nothing gets compiled locally, etc) and actually try BFS to see for myself.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds