
AutoNUMA: the other approach to NUMA scheduling


Posted Mar 28, 2012 11:36 UTC (Wed) by Ben_P (guest, #74247)
Parent article: AutoNUMA: the other approach to NUMA scheduling

Has anyone posted benchmarks of either proposed solution? Having read a small number of academic papers on NUMA scheduling, it looks like for every seemingly good solution there exist fairly common workloads that decimate its performance.

Is it even possible for NUMA scheduling to do the right thing without introducing new system calls? As the programmer, I will forever know more about the locality requirements of my code than the scheduler does.



AutoNUMA: the other approach to NUMA scheduling

Posted Mar 28, 2012 12:56 UTC (Wed) by slashdot (guest, #22014) [Link] (1 responses)

If the workload is "static", then the kernel can in principle learn which threads read and write which pages, with what throughput, along with their CPU behavior, and simply optimize accordingly.

Whether this learning and optimization can be done cheaply is an open question though.

If the workload is not static, the kernel cannot predict the future, so it can't optimize things automatically.

Thus, it will probably be necessary to both have syscalls (esp. to express thread memory affinity) and an automatic system.

AutoNUMA: the other approach to NUMA scheduling

Posted Mar 28, 2012 13:26 UTC (Wed) by Ben_P (guest, #74247) [Link]

From my limited understanding, even for "static" workloads most NUMA schedulers do better on some and significantly worse on others. Thus the naive default behavior tends to win overall.

Benchmarks

Posted Mar 28, 2012 13:52 UTC (Wed) by corbet (editor, #1) [Link]

Andrea and Peter have both posted some benchmark results, but the testing so far is recognized by everybody involved as being insufficient.

AutoNUMA: the other approach to NUMA scheduling

Posted Mar 29, 2012 21:03 UTC (Thu) by riel (subscriber, #3142) [Link] (1 responses)

It depends entirely on what you are programming. If you are building an HPC application, you know what you will be accessing.

However, if you write a JVM, you have no idea what the application running inside the JVM will be accessing. It is entirely possible that the application will generate data (once) with one thread, and then access it hundreds of times with another thread.

For one situation, it looks obvious that Peter's solution has less overhead. For the other situation, it is not clear at all what the way to go would be. Maybe Andrea's code will automatically figure it out...

NUMA scheduling is a hard problem. Not because the solutions are difficult, but because nobody even knows exactly what all the problems look like.

approachES to NUMA scheduling?

Posted Apr 11, 2012 19:50 UTC (Wed) by gvy (guest, #11981) [Link]

> It depends entirely on what you are programming.
Exactly, and thus there might be no real point in trying to get *the* approach implemented when there are at least two of them feasible and readily available, either as separate kernels or (preferably, but probably less realistically) as a runtime knob.

As you wrote, those who are after performance would rather invest some more time in libraries and apps, which tend to be custom or customizable; and those who won't or can't could at least pay with their RAM and cycles for some generic service.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds