BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 7, 2009 8:43 UTC (Mon) by iive (guest, #59638)
In reply to: BFS vs. mainline scheduler benchmarks and measurements by dlang
Parent article: BFS vs. mainline scheduler benchmarks and measurements

I'm not a CPU expert or a kernel expert, so feel free to correct me.

However, I do have the feeling that hyperthreading is the reason for these suboptimal benchmark results. The BFS scheduler could have been built on the assumption that each core runs at the same speed, so that it would finish X work in Y time on any core. With hyperthreading this is not true, as both threads share the same core. In general, CPUs have more computational units than can be used at any given moment, so the second h-thread "lurks" behind and reuses the free units when the first h-thread cannot utilize them. This is why HT on the P4 gave only a 30% boost in the best case.
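As an aside (my illustration, not something from the benchmarks): on Linux you can see exactly this sharing in sysfs. Here is a minimal C sketch, assuming the standard topology files, that prints which logical CPUs are hyperthread siblings on the same physical core:

    #include <stdio.h>

    int main(void)
    {
        char path[128], buf[64];
        int cpu;

        /* Walk cpu0, cpu1, ... until a topology file is missing. */
        for (cpu = 0; ; cpu++) {
            FILE *f;

            snprintf(path, sizeof(path),
                     "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
                     cpu);
            f = fopen(path, "r");
            if (!f)
                break;  /* no more CPUs */
            if (fgets(buf, sizeof(buf), f))
                /* Two CPUs listing each other here share one core's units. */
                printf("cpu%d siblings: %s", cpu, buf);
            fclose(f);
        }
        return 0;
    }

Any pair of logical CPUs that list each other here are the "lurking" h-threads described above, not two independent full-speed cores.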

This could also explain why only some people with Intel CPUs notice issues, while others don't.

I also wonder how many of the stock CFS heuristics are tuned for HT scheduling, and how many special cases there are.



BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 8, 2009 18:13 UTC (Tue) by jzbiciak (guest, #5246)

I wonder if it might be a different effect. My dual dual-core Opteron box (4 CPUs across 2 chips) dynamically scales the frequency of the CPUs based on load.

What I don't know is the cost of doing so. That is, when it switches from 1GHz to 2.4GHz, yes, it got faster, but was there, say, a 1ms hitch between the two? Did that hitch affect both cores on that die or just one? If there was a cache-to-cache coherence transfer going on at the time, did it also experience that hitch?

These details could vary by processor platform, vendor, and maybe even chipset and BIOS if the switch is effected via SMM or the like. A sloppier CPU scheduler that kept all the CPUs in the high-frequency state (or the low-frequency state) would eliminate these sorts of hitches, whereas one that kept the load more concentrated might experience more of them when the occasional background load spills onto a CPU that was left sleeping.
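One way to at least bound the hitch (a sketch of mine, not from the comment): the cpufreq driver exports its own worst-case estimate of the frequency-transition latency, in nanoseconds. Whether the real stall matches it, and whether it hits both cores on a die, is exactly the platform-specific question above.

    #include <stdio.h>

    int main(void)
    {
        char path[160];
        unsigned long ns;
        int cpu;

        for (cpu = 0; ; cpu++) {
            FILE *f;

            snprintf(path, sizeof(path),
                     "/sys/devices/system/cpu/cpu%d/cpufreq/cpuinfo_transition_latency",
                     cpu);
            f = fopen(path, "r");
            if (!f)
                break;  /* no more CPUs (or no cpufreq support) */
            if (fscanf(f, "%lu", &ns) == 1)
                /* Driver-reported worst-case switch time, in nanoseconds. */
                printf("cpu%d: transition latency %lu ns\n", cpu, ns);
            fclose(f);
        }
        return 0;
    }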

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 9, 2009 10:08 UTC (Wed) by etienne_lorrain@yahoo.fr (guest, #38022)

I also see some strange behaviour on a no-name, dual-core, all-Intel portable PC: random freezes of 2-4 seconds where the mouse does not even move, without any load whatsoever and with nothing logged in /var/log/messages.
This portable PC is a cheap, "designed for the other OS" system, even though it was sold with nothing installed: the DMI information is blank, and the ACPI information does not seem to be any better.
I tend to think it is an SMM problem rather than a scheduler problem: the crappy BIOS (which cannot be updated, because there is no DMI name) does not like Linux, or was explicitly designed to give a bad experience. I would really like to be wrong here.
There was a time when Linux did not rely on the BIOS at all, but that is no longer the case (SMM cannot be disabled, even under Linux; it is what handles the forced power-off when the On/Off button is held for more than 3 seconds).

BFS vs. mainline scheduler benchmarks and measurements

Posted Sep 10, 2009 22:23 UTC (Thu) by efexis (guest, #26355)

This, I believe, was more of an issue than it is now: CPUs can ramp up their speed much more quickly than they used to. One problem, for example, was that higher CPU speeds require higher voltages, which could cause delays, with the CPU stalling while the voltage stepped up. Now the voltage is pushed up a split moment before the frequency is ramped, so there is no stall. Otherwise it is all down to the CPU; with different models taking different amounts of time to change frequency, it can make sense to jump straight to the highest frequency when usage goes up and then slow down if needed (as the ondemand governor does), or to scale up step by step.

Where responsiveness is important, you want to set a low watermark, so the CPU is always running at, say, twice the speed you actually need; that way you always have headroom while you wait for the CPU to speed up (e.g. when load goes from 50% to 80%, the CPU speeds up to bring the load back down to 50%; only if load reaches 100% have you not sped up quickly enough). If you would rather conserve power, you run the CPU at speeds closer to the actual load.

In Linux there are many tunables you can play with to get the behaviour you want (/sys/devices/system/cpu/cpu?/cpufreq/<governor>). To see what's available on the Windows platform, there's a free download you can find by googling rmclock that properly spoils you for configuration options. There's no one rule that fits all: during boot the kernel will test transition speeds and set defaults accordingly.
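To make the watermark idea concrete, here is a small C sketch (mine, not efexis's setup) that lowers the ondemand governor's up_threshold, so the frequency ramps once load passes 50% instead of waiting until the CPU is nearly saturated. It assumes the per-CPU tunables layout named above (/sys/devices/system/cpu/cpu?/cpufreq/ondemand/); on other kernel versions the tunables may sit under /sys/devices/system/cpu/cpufreq/ondemand/ instead, and writing requires root.

    #include <stdio.h>

    int main(void)
    {
        const char *path =
            "/sys/devices/system/cpu/cpu0/cpufreq/ondemand/up_threshold";
        unsigned int cur;
        FILE *f;

        /* Read the current threshold (percent busy before ramping up). */
        f = fopen(path, "r");
        if (!f) {
            perror(path);
            return 1;
        }
        if (fscanf(f, "%u", &cur) == 1)
            printf("up_threshold was %u%%\n", cur);
        fclose(f);

        /* Lower it: ramp the frequency once load passes 50%. Needs root. */
        f = fopen(path, "w");
        if (!f) {
            perror(path);
            return 1;
        }
        fprintf(f, "50\n");
        fclose(f);
        return 0;
    }

The trade-off is exactly the one described above: a lower threshold buys responsiveness headroom, a higher one keeps the frequency (and power draw) closer to the actual load.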

