It depends entirely on what you are programming. If you are building an HPC application, you know what you will be accessing.
However, if you write a JVM, you have no idea what the application running inside the JVM will be accessing. It is entirely possible that the application will generate data (once) with one thread, and then access it hundreds of times with another thread.
In the first case, it looks obvious that Peter's solution has less overhead. In the second, it is not at all clear what the right approach would be. Maybe Andrea's code will automatically figure it out...
NUMA scheduling is a hard problem. Not because the solutions are difficult, but because nobody even knows exactly what all the problems look like.
Posted Apr 11, 2012 19:50 UTC (Wed) by gvy (guest, #11981)
> It depends entirely on what you are programming.
Exactly, and thus there might be no real point in trying to get *the* approach implemented when at least two feasible ones are readily available, either as separate kernels or (preferably, but probably less realistically) as a runtime knob.
As you wrote, those who are there for performance would rather invest some more time in libraries and apps, which tend to be custom or customizable; and those who won't or can't could at least pay with their RAM and cycles for some generic service.