Posted Jul 4, 2011 11:40 UTC (Mon) by csamuel (✭ supporter ✭, #2624)
Or MPI, which lets you span nodes and is frequently used for highly parallel High Performance Computing (HPC) codes.
It also works well within a single system; I've seen a particular HPC crash simulation code that came in both SMP and MPI variants, and even within a single multi-core system the MPI version scaled better than the SMP version.