A realtime preemption overview
Posted Aug 16, 2005 4:29 UTC (Tue) by mingo
In reply to: A realtime preemption overview
Parent article: A realtime preemption overview
Re #1, isn't SPL() disabling process/process preemption? That makes the concept unsuitable for the purposes of PREEMPT_RT. SPL() is also a pretty 'opaque' serialization method, only a little better than the 'Big Kernel Lock' that Linux has now finally gotten rid of. Thirdly, it only has a limited number of (32? 64?) "priority levels" available, while Linux has thousands of separate types of critical sections. Fourthly, isn't SPL() nested? E.g. blocking up to level 5 means all execution covered by levels 0,1,2,3,4 is blocked - while with a spinlock you will only block access to the data structure affected. Such artificial nesting is pretty bad if you want to avoid deadlocks and want to have good SMP scalability. The natural expression of locking hierarchies is not a flat "priority space" like the one SPL() provides, but more like a forest of trees of independent entities, where we want to maintain as much independence as possible.
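To make the nesting point concrete, here is a toy model in Python (not kernel code). It only mirrors the description above: raising the level to 5 shuts out levels 0 to 4, while holding one lock shuts out only the structure that lock guards. The function names, structure names and lock names are all hypothetical, and real SPL implementations differ in detail.

```python
# Toy model only: it mirrors the description in the comment, not any real
# SPL implementation or the Linux spinlock API.

def blocked_by_spl(raised_level, num_levels):
    """Return the levels shut out when the SPL is raised to `raised_level`.

    Follows the comment's description: blocking up to level 5 blocks all
    execution covered by levels 0..4.
    """
    return [lvl for lvl in range(num_levels) if lvl < raised_level]


def blocked_by_lock(held_lock, lock_for_structure):
    """Return the data structures made inaccessible by holding one lock."""
    return [name for name, lock in lock_for_structure.items()
            if lock == held_lock]


# Hypothetical data structures, each guarded by its own lock.
locks = {
    "inode_cache": "inode_lock",
    "run_queue": "rq_lock",
    "tx_ring": "tx_lock",
}

spl_blocked = blocked_by_spl(5, num_levels=8)    # five unrelated levels
lock_blocked = blocked_by_lock("rq_lock", locks)  # one structure only
```

In this model the SPL case blocks five levels that may have nothing to do with each other, while the lock case leaves `inode_cache` and `tx_ring` untouched.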
Re #2, there are over 5000 uses of the spin-lock APIs in the Linux kernel; renaming them just to show that they might not spin anymore is not really worth the trouble (and the huge intrusion!) at this point.
Re #3, yes, priority inheritance is pretty important when an RT task wants to make use of kernel services.
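For readers unfamiliar with priority inheritance, the idea can be sketched with a small Python model. This is an illustration under simplifying assumptions, not the RT-mutex code: the class names are hypothetical, a higher number means a more important task, and the model handles a single lock only (it resets the owner to its base priority on unlock, which would be wrong for a task holding several contended locks).

```python
# Minimal priority-inheritance sketch. Assumptions: one lock, higher number
# means higher priority, no chained boosting across multiple locks.

class Task:
    def __init__(self, name, prio):
        self.name = name
        self.base_prio = prio   # priority the task was created with
        self.prio = prio        # effective priority, may be boosted


class PIMutex:
    def __init__(self):
        self.owner = None
        self.waiters = []

    def lock(self, task):
        """Try to take the lock; return True on success, False if blocked."""
        if self.owner is None:
            self.owner = task
            return True
        self.waiters.append(task)
        # The owner inherits the priority of a more important waiter so that
        # medium-priority tasks cannot delay the waiter indefinitely.
        if task.prio > self.owner.prio:
            self.owner.prio = task.prio
        return False

    def unlock(self):
        """Drop the boost and hand the lock to the most important waiter."""
        self.owner.prio = self.owner.base_prio
        self.waiters.sort(key=lambda t: t.prio, reverse=True)
        self.owner = self.waiters.pop(0) if self.waiters else None
        return self.owner
```

If a low-priority task holds the mutex when an RT task blocks on it, the holder runs at the RT task's priority until it unlocks, at which point the RT task becomes the owner.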
Re #4, what precisely do you mean by "interrupt context" and "process context" in this particular case? The current situation is the following:
In the stock kernel there are 3 basic types of contexts: there is "interrupt context" (non-preemptible), "soft interrupt context" (non-preemptible) and "process context" (preemptible, unless executing in one of the many types of critical sections such as spinlocks).
In the PREEMPT_RT kernel there are 4 essential types of contexts: "hard interrupt context", "interrupt context", "soft interrupt context" and "process context". The hard interrupt context is an extremely small shim in essence - a few tens of lines total, per arch - it just deals with the interrupt controller, masks the IRQ line, acks the controller and returns. The "interrupt context" is a separate per-IRQ interrupt thread, which behaves like a process and is fully preemptible. "Soft interrupt context" is a separate per-softirq system-thread too, fully preemptible. "Process context" is what it used to be, and fully preemptible too. ['fully preemptible' means it's preemptible for in essence everything but the scheduler code and the basic RT-mutex/PI code]
Considering the above description, your comment about "the lesser we run in interrupt context, the better" is indeed correct: in PREEMPT_RT the hardirq context execution time and complexity have been reduced to an absolute physical minimum. That is a fundamentally good and important thing for achieving determinism. Everything else is a "thread", as far as the scheduler is concerned, and is as preemptible as possible. You can then use individual thread priorities to make some interrupts more important than others.
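The split between the tiny hard interrupt shim and the per-IRQ threads can be pictured with a toy Python model. It is a sketch, not how the kernel implements it: the class and function names are hypothetical, "higher number runs first" is an assumption, and waking the thread from the shim and unmasking the line after the handler are modelling assumptions rather than details stated above.

```python
# Toy model of the PREEMPT_RT interrupt flow described in the comment.
# The shim does the bare minimum; the real work happens later in a thread.

class IrqThread:
    def __init__(self, irq, prio, handler):
        self.irq = irq
        self.prio = prio          # assumption: higher number = more important
        self.handler = handler    # the driver's handler, run in thread context
        self.masked = False
        self.runnable = False


def hard_irq_shim(thread):
    """The small non-preemptible part: mask the line, ack, return."""
    thread.masked = True      # mask the IRQ line
    # (acking the interrupt controller would happen here)
    thread.runnable = True    # assumption: mark the per-IRQ thread runnable


def schedule_irq_threads(threads):
    """Run runnable IRQ threads in priority order; return the IRQ numbers."""
    order = []
    runnable = [t for t in threads if t.runnable]
    # sorted() is stable, so equal priorities keep their original order.
    for t in sorted(runnable, key=lambda t: -t.prio):
        t.handler()
        t.runnable = False
        t.masked = False      # assumption: unmask once the handler is done
        order.append(t.irq)
    return order
```

With a disk thread at priority 50 and a network thread at priority 80, both pending, the model runs the network handler first even if the disk interrupt arrived earlier, which is the "make some interrupts more important than others" effect.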
There is (inevitably) some scheduling overhead due to having more contexts. I've measured it to be 3-5%, worst-case [80 thousand irqs/sec], and near zero for the common case [couple of thousand irqs/sec], which is pretty good.
Note that Linux has specific scheduler optimizations that make the introduction and use of system threads cheaper: e.g. the 'lazy-TLB' optimization will skip TLB flushes when switching between system threads, by letting system threads 'inherit' the TLB context of the previous user-process. Thus we might not need to do any TLB flushing if we switch back to the same user-process - and we don't have to do any TLB flushing if we switch between system threads. So in the TLB flushing sense, system threads are completely transparent and do not increase the number of TLB flushes.
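The lazy-TLB argument can be checked with a small counting model in Python. This is a deliberately simplified sketch: the function name and the schedule are hypothetical, each user process is assumed to have its own address space, and the non-lazy variant (where a system thread would force a switch to a separate kernel-only context) is an imagined baseline for comparison, not a description of any real kernel. Hardware features that avoid flushes on their own are ignored.

```python
# Toy model: count address-space switches that would require a TLB flush.

def count_tlb_flushes(schedule, lazy_tlb=True):
    """schedule is a list of (task_name, is_system_thread) tuples."""
    loaded = None   # address space currently loaded on the CPU
    flushes = 0
    for name, is_system_thread in schedule:
        if is_system_thread:
            if lazy_tlb:
                # The system thread borrows whatever context is loaded.
                continue
            wanted = "kernel-only"   # hypothetical non-lazy baseline
        else:
            wanted = name            # assumption: one address space per process
        if loaded is not None and wanted != loaded:
            flushes += 1
        loaded = wanted
    return flushes


# Hypothetical schedule: two system threads run between user processes.
with_threads = [
    ("editor", False),
    ("irq-net", True),
    ("softirq-timer", True),
    ("editor", False),
    ("shell", False),
]
# The same schedule with the system threads removed.
without_threads = [entry for entry in with_threads if not entry[1]]

lazy = count_tlb_flushes(with_threads, lazy_tlb=True)
eager = count_tlb_flushes(with_threads, lazy_tlb=False)
baseline = count_tlb_flushes(without_threads)
```

In this model the lazy count equals the count for the schedule with no system threads at all, which is the sense in which system threads are "transparent" for TLB flushing.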