I'm imagining a much simpler scenario (for two CPUs):
1. High-priority task A takes a mutex and runs for some time without going to sleep.
2. Low-priority task B tries to take the mutex and spins, since A is running. This is where we lose latency, because...
3. ...while B spins, it can't be preempted, so a medium-priority task C can't run on either CPU, even if there is some work for it.
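
To put rough numbers on the three steps above, here is a toy tick-based model. It is only a sketch of the scenario as I describe it, not of the real scheduler or mutex code: the function name, the tick granularity, and the `spinner_preemptible` switch are all made up for illustration, and it assumes A stops occupying its CPU once it releases the mutex.

```python
def c_wakeup_latency(hold_ticks, c_arrival, spinner_preemptible):
    """Ticks that medium-priority task C waits before it gets a CPU.

    Hypothetical model: CPU 0 runs A (high priority), which holds the mutex
    for ticks [0, hold_ticks). CPU 1 runs B (low priority), which spins on
    the mutex for as long as A holds it. C becomes runnable at c_arrival.
    """
    tick = c_arrival
    while True:
        a_holds_mutex = tick < hold_ticks
        b_spinning = a_holds_mutex

        # CPU 0: A outranks C, so C cannot displace A while A is running.
        # Assumption: A gives up the CPU when it releases the mutex.
        cpu0_available = not a_holds_mutex

        # CPU 1: C outranks B, but a spinning B only yields to C if
        # spinners are preemptible in this model.
        cpu1_available = (not b_spinning) or spinner_preemptible

        if cpu0_available or cpu1_available:
            return tick - c_arrival
        tick += 1


# A holds the mutex for 10 ticks; C wakes up at tick 3.
print(c_wakeup_latency(10, 3, spinner_preemptible=False))  # 7
print(c_wakeup_latency(10, 3, spinner_preemptible=True))   # 0
```

With a non-preemptible spinner, C's latency in this model is whatever remains of A's critical section when C wakes up; with a preemptible spinner it is zero.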
I presume I'm missing something obvious and would be glad to know what it is.