The solutions we tried in the past seem to have been "big hammer" style solutions that are fairly rigid about what the kernel is allowed to do.
I want to see how little change we can get away with while still getting a decent performance improvement. A home node would only be the node that a process's memory allocations start on and that the process preferentially runs on. The CPU scheduler still needs to be able to run processes elsewhere temporarily.
Only when a node is permanently overloaded is it time to move some tasks elsewhere and eventually migrate some of their memory over (maybe with Lee's patches, or something based on them).
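
To make the idea concrete, here is a rough userspace C sketch of the policy described above. It is not kernel code and not taken from any existing patch: every name in it (node_state, pick_alloc_node, pick_run_node, node_tick, should_rehome, NR_NODES, OVERLOAD_TICKS) is made up for illustration, the fallback order is a naive round-robin rather than anything distance-based, and the overload threshold is an arbitrary number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_NODES       4    /* hypothetical node count for this sketch */
#define OVERLOAD_TICKS 100  /* hypothetical: ticks before overload counts as permanent */

struct node_state {
    size_t free_pages;           /* pages still available on this node */
    unsigned int nr_running;     /* runnable tasks on this node */
    unsigned int nr_cpus;        /* CPUs on this node */
    unsigned int overload_ticks; /* consecutive ticks with nr_running > nr_cpus */
};

/* Number of CPUs on the node that have nothing to run right now. */
static unsigned int idle_cpus(const struct node_state *n)
{
    return n->nr_running < n->nr_cpus ? n->nr_cpus - n->nr_running : 0;
}

/*
 * Allocations start on the home node and only fall back to other nodes
 * when the home node has no free pages. Fallback order here is a plain
 * round-robin starting at home (an assumption, not a real zonelist).
 */
static int pick_alloc_node(const struct node_state nodes[NR_NODES], int home)
{
    assert(home >= 0 && home < NR_NODES);
    for (int i = 0; i < NR_NODES; i++) {
        int nid = (home + i) % NR_NODES;
        if (nodes[nid].free_pages > 0)
            return nid;
    }
    return -1; /* no free memory on any node */
}

/*
 * Prefer the home node. If it has no idle CPU, run temporarily on the
 * node with the most idle CPUs; if nothing is idle anywhere, stay home.
 */
static int pick_run_node(const struct node_state nodes[NR_NODES], int home)
{
    assert(home >= 0 && home < NR_NODES);
    if (idle_cpus(&nodes[home]) > 0)
        return home;

    int best = home;
    unsigned int best_idle = 0;
    for (int nid = 0; nid < NR_NODES; nid++) {
        unsigned int idle = idle_cpus(&nodes[nid]);
        if (idle > best_idle) {
            best = nid;
            best_idle = idle;
        }
    }
    return best;
}

/* Called once per tick: track how long the node has been overloaded. */
static void node_tick(struct node_state *n)
{
    if (n->nr_running > n->nr_cpus)
        n->overload_ticks++;
    else
        n->overload_ticks = 0;
}

/* Only a long-lasting overload justifies giving tasks a new home node. */
static bool should_rehome(const struct node_state *home_node)
{
    return home_node->overload_ticks >= OVERLOAD_TICKS;
}
```

The point of the sketch is that the home node is only a starting preference in both paths. Running elsewhere is decided per wakeup and costs nothing permanent, while changing the home node (and later migrating memory) is gated on the overload lasting a long time.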
My plan is to start small and only add things as needed, trying to stay away from a large, complete & heavy plan.