Determinism is what really matters for real-time.
It's often confused with low latency, but the two are separate criteria and often conflicting goals that require a trade-off, complicated by the fact that most applications want BOTH - determinism AND low latency.
Determinism is most easily understood as the ability to say "this task will take AT MOST n ms". That is, bounded maximum latency.
In the strictest case, this means the following: if your
application requires a maximum latency of 50us, it is
preferable for all 5000 iterations of a task to take 49us
(less than 50us) each than for 4950 iterations to take 35us
and the remaining 50 to take 69us.
For most enterprise applications, the max latency is not a MUST_FINISH_BY with severe consequences for failure, but a REALLY_GOOD_TO_FINISH_WITHIN. Such applications can tolerate occasional outliers (the maximum latency bound being exceeded), because what they usually need as well is low average latency.
Most OSs are optimized for throughput-driven applications (where average latency is minimized).
Real-time Linux is optimized to offer greater determinism than the stock kernel. Hence the need for greater preemption, including the ability to preempt critical kernel tasks should a higher priority application become runnable.
And remember, you can only guarantee/meet real-time requirements for as many threads as you can run concurrently on your system. On an N-core system, you can at most guarantee that N SCHED_FIFO tasks at the same highest priority P will meet their real-time guarantees (depending on a lot of things, handwave, handwave, but you get the general idea). So a lot depends on what the system is running, the overall application solution, and the top-down configuration of the entire system.
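As a concrete sketch of requesting a SCHED_FIFO policy (Linux-only; Python's os module wraps sched_setscheduler(2), and the priority value 10 here is an arbitrary illustration): setting a real-time policy requires root or CAP_SYS_NICE, so this example falls back gracefully when unprivileged.

```python
import os

# Priority 10 is an arbitrary choice for illustration;
# valid SCHED_FIFO priorities on Linux run from 1 to 99.
param = os.sched_param(10)
try:
    # pid 0 means "the calling process"; needs CAP_SYS_NICE or root.
    os.sched_setscheduler(0, os.SCHED_FIFO, param)
    print("now running under SCHED_FIFO, priority 10")
except PermissionError:
    print("not privileged: still under the default SCHED_OTHER policy")

# Either way, inspect the policy actually in effect:
policy = os.sched_getscheduler(0)
print("current policy constant:", policy)
```

Pinning each such task to its own core with os.sched_setaffinity (or taskset/chrt from the shell) is how the "N SCHED_FIFO tasks on N cores" configuration is typically assembled in practice.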