On a multicore box, wouldn't it be easier just to dedicate a whole CPU core, 100% of the time to a particular process? In many of the RT use-cases I can think of, there's one task (often a single-threaded C-program) that needs to be able to run at top priority without interruption, while the entire rest of the OS, userspace and GUI could happily fit into the remaining cores.
Maybe I'm oversimplifying this, and it's certainly a bit wasteful (and won't work well for embedded), but for many common cases, such as low-latency audio processing, or avoiding dropouts, or data-acquistion, it would work just fine!
RT is hard when you have a mostly busy CPU (especially single-core), and multiple tasks, which might be relatively lightweight, require their small slice of CPU with hard-constraints on timing. But often, this isn't the case: we have just one critical task, and the system is mostly idle.