> my understanding of NX isn't that it does any local rendering, but that it instead shortcuts the roundtrips to the server by intercepting calls and returning the values directly.
I think that's true for older apps that use the core X protocol: xman, xterm, xeyes, and xclock are accelerated this way. For newer apps built on toolkits like Gtk2/3 or Qt, I think there is a lot more bitmap shuffling, because the client-side toolkit depends on being able to do compositing, blending, antialiasing, etc. itself and then ship the rendered result to the display.
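
To put rough numbers on the difference, here is a back-of-the-envelope sketch comparing one core-protocol drawing request with pushing the same area as a client-rendered bitmap. The request sizes follow the core X11 wire encoding as I remember it (PolyFillRectangle: 12-byte header plus 8 bytes per rectangle; PutImage: 24-byte header plus pixel data). The function names, the 32-bit scanline padding, and the 200x100 example area are my own assumptions, and it ignores NX's compression and caching entirely, so treat it as an illustration rather than a measurement.

```python
# Rough wire sizes based on the core X11 request encoding.
# Assumptions: 4 bytes per pixel, scanlines padded to 32 bits,
# no NX compression or caching taken into account.

def poly_fill_rectangle_bytes(n_rects: int) -> int:
    """Size of a PolyFillRectangle request: 12-byte header + 8 bytes per rectangle."""
    return 12 + 8 * n_rects


def put_image_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of a PutImage request: 24-byte header + padded raw pixel data."""
    stride = (width * bytes_per_pixel + 3) // 4 * 4  # pad each scanline to 4 bytes
    return 24 + stride * height


if __name__ == "__main__":
    w, h = 200, 100  # hypothetical widget area
    core = poly_fill_rectangle_bytes(1)
    bitmap = put_image_bytes(w, h)
    print(f"core fill request: {core} bytes")
    print(f"client-rendered bitmap: {bitmap} bytes")
    print(f"ratio: about {bitmap // core}x")
```

The point is only that a toolkit that renders client-side ends up sending data proportional to the pixel area, while a core drawing request stays a few dozen bytes regardless of how big the rectangle is.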