
Con Kolivas returns with a new scheduler


Posted Sep 2, 2009 1:52 UTC (Wed) by lethal (guest, #36121)
In reply to: Con Kolivas returns with a new scheduler by intgr
Parent article: Con Kolivas returns with a new scheduler

Not quite so simple. There are plenty of single-CPU systems with NUMA characteristics. Memory-only NUMA nodes are becoming fairly commonplace, both in commodity and especially in embedded platforms. Indeed, many Linux-based cellphones are shipping with NUMA enabled by default today -- and more recently also on the microcontroller side, albeit without page migration.

Any scheduler that fails to take issues like NUMA, SMP, dynamic ticks, etc. into account while claiming to be "looking forward" will remain nothing but a toy scheduler for an insular workload. All of these have effectively become commonplace, to the extent that simply discounting them out of hand reads a lot more like looking back than forward, especially given that many low-memory systems depend on all of these capabilities.

In any event, there is nothing wrong with out-of-tree pet projects, especially for trying out new things. If this new scheduler improves things for certain workloads, then hopefully someone will step up to work with upstream and improve things there incrementally.



Con Kolivas returns with a new scheduler

Posted Sep 2, 2009 15:01 UTC (Wed) by jzbiciak (guest, #5246) [Link] (9 responses)

Any scheduler that fails to take issues like NUMA, SMP, dynamic ticks, etc. into account while claiming to be "looking forward" will remain nothing but a toy scheduler for an insular workload

I believe what Con meant by "forward looking" when he described his scheduler is that it doesn't adjust time slices based on the recent history (i.e. "backward looking") of a task's run/sleep behavior. From Con's post:

I feel the scheduler should being forward looking only (not calculating sleep)

His explicit rejection of NUMA and high-CPU-count machines makes it clear he's only really interested in what constitutes a "typical desktop" today. From Con's post:

Machines with NUMA will probably suck a lot with this because it pays no deference to locality of the NUMA nodes when deciding what cpu to use.

I imagine the loss of raw MIPS due to lack of NUMA awareness on even a high-end personal desktop isn't too great either. Modern desktop machines are NUMA, but they're fairly mild NUMA from what I gather. (Why else would my Opteron's BIOS offer me the option to interleave my memory across all nodes? That'd be a disaster on a more extreme NUMA architecture, but it supposedly provides a performance boost on older non-NUMA-aware OSes.)

In fact, I'd further imagine the average user would trade actual increased responsiveness for a few % loss on peak benchmark performance numbers. At the very least, I imagine Con might. :-)

All that said, the place where I experience the greatest loss in responsiveness is Firefox, not the rest of my desktop, and Firefox is just a single thread at the moment. Con, can you come fix Firefox? ;-)

Con Kolivas returns with a new scheduler

Posted Sep 3, 2009 2:37 UTC (Thu) by charris (guest, #13263) [Link] (8 responses)

I find that gnome terminals will ignore my bluetooth keyboard for up to several seconds. I don't know if this is because of the bluetooth driver, X, or the scheduler, but it sure is annoying. And this on a 3GHz quad core system.

Con Kolivas returns with a new scheduler

Posted Sep 3, 2009 13:15 UTC (Thu) by nix (subscriber, #2304) [Link] (4 responses)

It's certainly not the bluetooth driver or X, because I see this on the console of a one-die four-core Nehalem (Core i7) with no X running at all and a PS/2 keyboard. It's lightly loaded by these standards (load average 8.1, which as it's hyperthreaded is pretty much equivalent to 'everything loaded but not much'), and surely isn't swapping (12GB RAM, 6GB *free*, not even used for cache). Yet I see three-second pauses in keyboard activity.

I'll turn on latencytop and see what it says, but the pauses are fairly rare so it might be interesting to interpret.

Con Kolivas returns with a new scheduler

Posted Sep 3, 2009 13:39 UTC (Thu) by jzbiciak (guest, #5246) [Link]

I've only gotten multi-second pauses like that when I had a hard drive going bad, or I was swapping rather furiously.

With that much RAM, I wonder if VM housekeeping itself could cause such lags. 12GB would have 3 million 4K pages. If you are running at 3GHz and did something that averaged 3000 cycles/page across all 3 million pages, that's 3 seconds.
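
For what it's worth, the arithmetic can be checked with a few lines of Python. This is only a back-of-envelope sketch: the 3000 cycles/page cost is the guess from the estimate above, not a measurement, and the function name and the assumption that a single core walks every page exactly once are mine.

```python
# Back-of-envelope check of the estimate above. All inputs are the
# figures quoted in the comment; none of them is a measurement.
RAM_BYTES = 12 * 2**30      # 12 gigabytes of RAM (binary units assumed)
PAGE_BYTES = 4 * 2**10      # 4K pages
CYCLES_PER_PAGE = 3000      # hypothetical average cost per page
CLOCK_HZ = 3e9              # 3GHz core


def housekeeping_seconds(ram_bytes, page_bytes, cycles_per_page, clock_hz):
    """Return (page count, seconds) for one core visiting every page once."""
    pages = ram_bytes // page_bytes
    return pages, pages * cycles_per_page / clock_hz


pages, seconds = housekeeping_seconds(
    RAM_BYTES, PAGE_BYTES, CYCLES_PER_PAGE, CLOCK_HZ
)
print(f"{pages:,} pages, about {seconds:.2f} s")
```

The exact page count is a little over 3 million, so the result comes out slightly above 3 seconds. Spreading the work over several cores or lowering the per-page cost would shrink it proportionally.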

Con Kolivas returns with a new scheduler

Posted Sep 3, 2009 17:31 UTC (Thu) by charris (guest, #13263) [Link] (2 responses)

I also see "stuck" keys where a letter will repeat until the buffer is full. I see this on various hardware with various keyboards, so it isn't an actual stuck key. Do you see that also?

Con Kolivas returns with a new scheduler

Posted Sep 3, 2009 18:14 UTC (Thu) by jzbiciak (guest, #5246) [Link]

That sounds more like a dropped event somewhere, such as a dropped interrupt or the like. I haven't experienced that on any Linux box so far.

Stuck keys.

Posted Sep 4, 2009 2:47 UTC (Fri) by ncm (guest, #165) [Link]

I've seen it on 2.6.29. Unplugging and re-plugging the (USB) keyboard didn't fix it. Only switching to console and back did it. I was inclined to blame the new X input system. It hasn't happened since I upgraded X and also went to 2.6.30, so who knows?

Input pauses

Posted Sep 4, 2009 4:09 UTC (Fri) by ncm (guest, #165) [Link] (2 responses)

I see input-event stalls for up to six seconds in Firefox (Debian Iceweasel, actually), routinely. To get it I just have to leave it running for a week or so, with a few dozen pages, some of which self-update.

I doubt the scheduler would help; during those six seconds, one CPU is pegged at 100%. This is with nothing swapped. I suspect it's doing a garbage collection scan during each pause. I'd welcome suggestions for how to discover what's going on.

Google Chrome is coming along too slowly.

Input pauses

Posted Sep 4, 2009 4:32 UTC (Fri) by jzbiciak (guest, #5246) [Link]

I know Firefox's stalls are Firefox's fault, not the Linux scheduler's. I was begging Con to do interactivity work on Firefox. ;-)

(As in, contribute to the Mozilla team.)

Input pauses

Posted Sep 4, 2009 8:17 UTC (Fri) by Cato (guest, #7643) [Link]

I've had these pauses for several seconds sometimes in Firefox 3.5.2, but they seem to be associated with large writes, e.g. un-tarring a huge file at the same time, so I suspect it's the well-known "Firefox 3.0 + ext3 + fsync" issue, reported in http://lwn.net/Articles/284126/ and elsewhere.

