1. The offline scheduler is about treating a processor as a device; this is why I am offloading it. In my essay I compared several partitioning systems: CPU sets, INtime and IBM partitions. I did not compare it to dynticks because dynticks is simply a different matter.
2. The offline scheduler has other features that monitor (RTOP) and protect the kernel (offline firewall) when it is not possible.
Posted Sep 4, 2009 15:30 UTC (Fri) by mingo (subscriber, #31122)
> Hello Mingo
> 1. The offline scheduler is about treating a processor as a device; this is why I am offloading it. In my essay I compared several partitioning systems: CPU sets, INtime and IBM partitions. I did not compare it to dynticks because dynticks is simply a different matter.
The "offline scheduler" is, as you say, a CPU partitioning scheme.
Our (oft repeated) point is that Linux already has a CPU partitioning scheme: cpusets. It can be configured dynamically and will isolate one (or more) CPUs just fine.
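For readers unfamiliar with the interface, here is a minimal sketch of the kind of dynamic configuration meant here. It assumes the legacy cpuset filesystem is mounted at /dev/cpuset (where the control files are named cpus, mems and tasks). The cpuset names "system" and "rt", the CPU ranges, the single memory node and the helper functions are all made up for illustration. Applying the plan needs root, and on its own it does not keep interrupts or per-CPU kernel threads off the reserved CPU.

```python
import os

def cpuset_plan(root="/dev/cpuset"):
    """Return (path, value) writes for two sibling cpusets.

    "system" keeps CPUs 0-2 for general work; "rt" reserves CPU 3.
    Names and CPU numbers are hypothetical.
    """
    plan = []
    for name, cpus in (("system", "0-2"), ("rt", "3")):
        plan.append((f"{root}/{name}/cpus", cpus))
        plan.append((f"{root}/{name}/mems", "0"))  # assumes a single memory node
    return plan

def apply_plan(plan):
    """Create the cpuset directories and write the settings (needs root)."""
    for path, value in plan:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(value)

def move_task(pid, name, root="/dev/cpuset"):
    """Attach one task to a cpuset by writing its PID to the tasks file."""
    with open(f"{root}/{name}/tasks", "w") as f:
        f.write(str(pid))

if __name__ == "__main__":
    # Dry run: show the equivalent shell writes without touching the system.
    for path, value in cpuset_plan():
        print(f"echo {value} > {path}")
```

Keeping the plan separate from the code that applies it lets the layout be inspected without root; every other task would still have to be moved into "system" before CPU 3 is left to the "rt" tasks.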
This cpusets scheduler feature was added to the Linux kernel 4.5 years ago, in 2005, and was released as part of the v2.6.12 Linux kernel. It has been part of Linux ever since, continuously fixed/updated/enhanced.
If cpusets as implemented today does not fit your needs then the (upstream acceptable) solution is not to add a completely different facility with its extra layering, but to fix the currently existing one.
That will benefit all current cpusets users as well, beyond enabling the usecases you are interested in.
A new facility is only added if the old one is unfixable. That has not been outlined here - it has not even been argued to be unfixable. [If that is proven then the new facility will simply replace the old (broken) one.]
This is really how the Linux kernel is developed - and always was. We try to avoid reinventing the wheel and we try to avoid duplicate functionality in the core kernel as much as possible. This is what is happening here too.
It sure does mean extra work and requires willingness to work with existing upstream facilities.
Duplicate/overlapping functionality quickly becomes a mess to users and is unmaintainable as well in the long run due to the increased complexity. We try to avoid such overlap and duplication as much as possible.
The lkml discussions with you stalled because you basically only repeated your arguments why you'd want to have the offline scheduler (which in itself is fine) - without showing much interest in improving existing kernel facilities or showing that they are unfixable (which is not fine if you want to enhance the upstream kernel).
Anyway, there's lots of possibilities how to continue this on the technical level. Everyone agrees that undisturbed CPU cores are desirable, so if you (or someone else) implements it correctly it will be accepted upstream - and gladly so. The job of a maintainer (like me) is to say 'no' to patches that are (not yet) good enough technically.
Thanks,
Ingo
Dynamic scheduler tick
Posted Sep 4, 2009 20:46 UTC (Fri) by razb (guest, #43424)
Hello again Ingo
Well, I understand your arguments and agree with the "upstream" consideration. The offline scheduler approach is aggressive. When I offlined NAPI, I had to do some rewriting in dev.c.
> The lkml discussions with you stalled because you basically only repeated your arguments why you'd want to have the offline scheduler (which in itself is fine) - without showing much interest in improving existing kernel facilities or showing that they are unfixable (which is not fine if you want to enhance the upstream kernel).
In the case of cpu sets, I argue that cpu sets do not provide complete partitioning. Meaning, I cannot ask for one packet from a 10 Gbps interface to be moved to processor X and another packet from the same 10 Gbps interface to be moved to processor Y. Why should a Flash video packet be moved to processor 7 if processor 7 is heavily busy with incoming FTP traffic?
To the best of my knowledge, a NAPI context is triggered by the first packet, which can be on any processor "in the affinity".
But this is possible by offlining NAPI: simply route packets by their service type, not by IRQ masking. And who cares about cache misses if I have an entire processor to do that work?
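As a toy illustration of the routing idea in this comment (not the actual offline scheduler or NAPI code), packets can be assigned to a processor by looking up their service type instead of following the IRQ affinity mask. The port-to-CPU table, the packet layout and the function name are all hypothetical.

```python
from collections import defaultdict

# Hypothetical service table: destination port -> dedicated CPU.
SERVICE_CPU = {
    21: 7,    # FTP control traffic
    1935: 5,  # RTMP, commonly used for Flash video
}

def steer(packets, service_cpu=SERVICE_CPU, default_cpu=0):
    """Group packets into per-CPU queues by service type (destination port)."""
    queues = defaultdict(list)
    for pkt in packets:
        # Unknown services fall back to the default CPU.
        cpu = service_cpu.get(pkt["dst_port"], default_cpu)
        queues[cpu].append(pkt)
    return dict(queues)

if __name__ == "__main__":
    pkts = [
        {"dst_port": 21, "len": 60},
        {"dst_port": 1935, "len": 1400},
        {"dst_port": 80, "len": 512},
    ]
    for cpu, queue in sorted(steer(pkts).items()):
        print(cpu, [p["dst_port"] for p in queue])
```

With this table, video packets never land on the CPU that is busy with FTP, whichever processor saw the first packet.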
But you are correct that I haven't replied with technical details; I just posted the link to the essay.
What is the correct way to isolate a processor? What are the restrictions? What are the requirements?
Raz
Dynamic scheduler tick
Posted Sep 4, 2009 21:07 UTC (Fri) by mingo (subscriber, #31122)
[...] In the case of cpu sets, i argue that cpu sets do not provide complete partitioning. [...]
Obviously they do not, as otherwise you would not have implemented your patch.
My point, which i outlined in more detail in my reply above, is that there are two approaches possible that are acceptable for upstreaming:
- either extend and fix cpusets with the features you desire
- or prove/show that that's impossible or undesirable. (in which case your solution will have to replace cpusets, cover all its usecases, migrate all its APIs and users smoothly, etc., etc.)
You took a third approach: "I added it as a new, separate, special-purpose feature, not integrated with existing cpusets facilities because it was the easiest for me that way".
That is the ... short-term easy but long-term expensive answer which people on lkml objected to for good reasons. We've been there, we've done that, we are still suffering the consequences ;-)
Linux is an 18+ year old kernel, there's not that many easy projects left in it anymore :-/ Core kernel features that look basic and which are not in Linux yet often turn out to be not that simple.
I hope this explains our point of view. We can continue this discussion on lkml - i'm very interested in extensions to cpusets, and Peter Zijlstra outlined models for integrating IRQ space partitioning into the cpusets model (he called them system-sets). He sent a few prototype patches to lkml as well - early 2008 IIRC. Those could be picked up and finished, if you are interested.