Well, as one of those who actually pushed for and implemented a little bit of the priority inheritance in the beginning, I must say that he is just making excuses for not implementing it in RTLinux, because getting it right _is_ indeed very hard. From experience I know it is done wrong even in VxWorks!
But it can be done, and it is done in the current rtmutex in Linux.
The cases he is talking about show me that he has not understood how to use the system at all. He makes the usual mistake of not distinguishing between a mutex and a semaphore used as a condition (i.e. waiting for some external event to happen).
Yes, making an RT application work with priority inheritance mutexes requires some programming rules: You can't block on a semaphore, socket etc. while holding a lock. But heck, you should always try to avoid that in any multi-threaded program, so that the whole program doesn't eventually lock up because some message didn't arrive over TCP as expected.
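To make the distinction concrete, here is a rough sketch in which a mutex only protects state and a semaphore only signals an event, and the thread never sleeps on the semaphore while holding the mutex. The `mailbox` type and its functions are invented for this example, and a plain default mutex is used to keep it short:

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical example type: a tiny mailbox. */
struct mailbox {
    pthread_mutex_t lock;    /* protects `pending`; has an owner, so PI can apply */
    sem_t           event;   /* "something was posted"; has no owner to boost */
    int             pending; /* sum of values posted and not yet collected */
};

int mailbox_init(struct mailbox *mb)
{
    mb->pending = 0;
    if (pthread_mutex_init(&mb->lock, NULL) != 0)
        return -1;
    return sem_init(&mb->event, 0, 0);
}

void mailbox_post(struct mailbox *mb, int value)
{
    pthread_mutex_lock(&mb->lock);
    mb->pending += value;            /* short, bounded critical section */
    pthread_mutex_unlock(&mb->lock);
    sem_post(&mb->event);            /* signal after the lock is dropped */
}

/* Waits for a post and returns the collected value, or -1 on error. */
int mailbox_wait(struct mailbox *mb)
{
    int value;

    /* Block on the external event with NO lock held. */
    while (sem_wait(&mb->event) != 0) {
        if (errno != EINTR)
            return -1;
    }

    pthread_mutex_lock(&mb->lock);
    value = mb->pending;
    mb->pending = 0;
    pthread_mutex_unlock(&mb->lock);
    return value;
}
```

The time the mutex is held is bounded by a few instructions, which is what priority inheritance needs; the unbounded wait happens only on the semaphore.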
In general locks should only be used "inside a module" to protect its state. The user of the module should not be aware of them. The modules should be ordered such that a low-level module never calls a high-level module with an internal lock taken - or you can create a deadlock. Or an even simpler rule: Never make a call to another module with a lock held. In an RT environment with priority inheritance the module can use this to ensure the timing of all the calls to it, because all the modules "lower" in the chain have a known timing and you therefore know the maximum time all the internal locks can be held by any thread.
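As an illustration of the "never call another module with a lock held" rule, here is a small sketch of a module that keeps its lock private and only invokes a callback after unlocking. All names (`counter`, `counter_add`, `record_last`) are hypothetical, and it assumes the callback is set once at init time:

```c
#include <pthread.h>

typedef void (*counter_cb)(int new_value, void *arg);

/* Hypothetical module state; users should treat it as opaque. */
struct counter {
    pthread_mutex_t lock;      /* internal, never exposed to callers */
    int             value;
    counter_cb      on_change; /* may belong to a higher-level module */
    void           *cb_arg;
};

int counter_init(struct counter *c, counter_cb cb, void *arg)
{
    c->value = 0;
    c->on_change = cb;
    c->cb_arg = arg;
    return pthread_mutex_init(&c->lock, NULL);
}

void counter_add(struct counter *c, int delta)
{
    int v;

    pthread_mutex_lock(&c->lock);
    c->value += delta;
    v = c->value;              /* snapshot what the callback needs */
    pthread_mutex_unlock(&c->lock);

    /* Call out only after the internal lock has been released. */
    if (c->on_change)
        c->on_change(v, c->cb_arg);
}

int counter_get(struct counter *c)
{
    int v;

    pthread_mutex_lock(&c->lock);
    v = c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}

/* Example callback: stores the latest value in the int that arg points to. */
void record_last(int new_value, void *arg)
{
    *(int *)arg = new_value;
}
```

Because the critical sections contain no calls out of the module, the longest time the lock can be held depends only on the module's own code.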
And yes, priority inheritance costs a lot of performance. But in general you should try to avoid congestion and design your program such that the locks are not contended. The locks should only be considered as "safeguards" against contention, which should not happen very often.
If you know how to use locks, and can avoid the pitfalls, priority inheritance will work for you - provided it is properly implemented by the OS, as is done in Linux.
Wrt. rwlocks: If a high priority, realtime writer wants the lock, it doesn't make sense to boost the readers as you don't know how many there are. What you could do is limit the number of readers to a specific number. Or you could say that writers don't boost the readers but readers can boost the single writer. That way you can't use rwlocks in real time tasks and that would not be a problem in most cases. But the kernel would need a lot of review to be sure, and therefore I fully understand the current solution in the preempt RT patch.
Posted Sep 30, 2009 8:05 UTC (Wed) by michaeljt (subscriber, #39183)
> Or you could say that writers don't boost the readers but readers can boost the single writer. That way you can't use rwlocks in real time tasks and that would not be a problem in most cases.
So to return to my previous question, this would simply mean not trying to get it "right" for this API and clearly writing that on the box.
> But the kernel would need a lot of review to be sure and therefore I fully understand the current solution in the preempt RT patch.
Of course I was naively thinking that the API user would be aware of what locking they are using, but that won't hold if they are doing the locking implicitly through other APIs.
The realtime preemption mini-summit
Posted Oct 4, 2009 13:56 UTC (Sun) by dvhart (guest, #19636)
WRT rwlocks. We actually cap reader count to 1 in PREEMPT_RT for that very reason. This is unfortunate, and one of the causes for performance degradation on -rt for certain workloads. There was some discussion during the rt-summit in Dresden about making kernel rwlocks non-pi-aware for this reason. Some more investigation is needed before we make a decision there.
The realtime preemption mini-summit
Posted Oct 6, 2009 15:37 UTC (Tue) by simlo (subscriber, #10866)
As I said: You could leave it half-PI-aware: Let readers boost the writer, but not the other way around. This will most likely work in many cases. It means an RT task can't write-lock a rwlock and must defer the operation to another task. Config options are needed....