Ok, I'm guilty of not supplying all the information. I run Debian testing + bits of unstable + tiny bits of experimental personally, with rare issues. And for cluster installations, you only *need* kernel + driver support for whatever hardware is there, and you can pick whichever kernel + OFED level that is. Forklift upgrades are still the norm in the HPC world. Otherwise, I've been recommending a Debian testing snapshot every N months (often 12), tested beforehand in a slow VM emulating 4+ nodes. That seems quite stable for the *core* pieces in my little slice of the world. I imagine that other fixed-hardware installations are similar.
And in response to another comment, for *me* the VM subsystem performs much better with respect to my multithreaded jobs than it does on older kernels. But I also turn off swapping for HPC installations; if you swap, you lose the high-performance part. That does cost a little memory for monitors that *could* otherwise be swapped out, but it reduces the perturbation when those monitors happen to trigger during a compute job. Turning off swap doesn't turn off mmap-ing of large data sets, so it's not a huge deal for common apps, and it helps stop student mistakes from crippling the nodes. Setting the ulimit at a tested maximum keeps the OOM killer at bay (or seems to do so).
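For anyone wanting to try that setup, a minimal sketch of the swap + ulimit piece (the commands are standard, but the limit value is just an illustrative placeholder; tune it to your nodes' RAM and test it):

```shell
# Run as root on each compute node.

# Disable swap immediately:
swapoff -a

# Make it persistent by commenting out swap entries in /etc/fstab
# (keeps a .bak copy of the original):
sed -i.bak '/\sswap\s/s/^/#/' /etc/fstab

# Cap per-process virtual memory so runaway jobs fail with ENOMEM
# instead of waking the OOM killer. Units are KB; 48 GB here is
# only an example figure, not a recommendation:
ulimit -v $((48 * 1024 * 1024))
```

The ulimit line only affects the current shell; in practice you'd set it per-user in limits.conf or in the batch system's prolog so it applies to every job.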
But I'm also stuck with RHEL given my current employer's restrictions. It's not terribly OpenMP-friendly, in my experience, which leads to many people griping about how "Linux" sucks.