Extensible scheduler class rejected
We are comfortable with the current API. Everything we tried fit pretty well. It will continue to evolve but sched_ext now seems mature enough for initial inclusion. I suppose lack of response doesn't indicate tacit agreement from everyone, so what are you guys all thinking?
Scheduler maintainer Peter Zijlstra gave him his answer: "I'm still hating the whole thing with a passion". He went on to make it clear that this work will not be merged into the mainline. So, it seems, developers wanting to try their hand at BPF scheduler development will need to apply an out-of-tree patch series, for now at least.
Posted Jul 26, 2023 19:23 UTC (Wed)
by flussence (guest, #85566)
[Link] (26 responses)
Posted Jul 27, 2023 5:27 UTC (Thu)
by TheGopher (subscriber, #59256)
[Link] (23 responses)
Posted Jul 27, 2023 7:38 UTC (Thu)
by zorro (subscriber, #45643)
[Link] (22 responses)
Posted Jul 27, 2023 9:38 UTC (Thu)
by mb (subscriber, #50428)
[Link] (21 responses)
Posted Jul 27, 2023 11:51 UTC (Thu)
by zorro (subscriber, #45643)
[Link]
Posted Jul 27, 2023 14:23 UTC (Thu)
by quotemstr (subscriber, #45331)
[Link] (17 responses)
Posted Jul 27, 2023 15:42 UTC (Thu)
by mb (subscriber, #50428)
[Link] (16 responses)
Posted Jul 27, 2023 19:53 UTC (Thu)
by flussence (guest, #85566)
[Link] (12 responses)
These things are fixing a 25%+ performance hit in CPU-bound multithreaded applications. It sounds insane, and it is insane that the kernel has fought to maintain the status quo for this long. DOI:10.1145/2901318.2901326
Posted Jul 27, 2023 20:18 UTC (Thu)
by mb (subscriber, #50428)
[Link] (3 responses)
Fix CFS.
>and until it happens people are just going to keep applying experimental out of tree patches
That is a good thing.
Posted Jul 27, 2023 20:20 UTC (Thu)
by mb (subscriber, #50428)
[Link]
Posted Jul 27, 2023 20:44 UTC (Thu)
by djk121 (subscriber, #152710)
[Link] (1 responses)
They can, and already are. The SHARED_RUNQ patches for CFS were born out of sched_ext experiments: https://lore.kernel.org/all/20230710200342.358255-1-void@...
Posted Jul 27, 2023 20:49 UTC (Thu)
by mb (subscriber, #50428)
[Link]
Posted Jul 27, 2023 21:43 UTC (Thu)
by pizza (subscriber, #46)
[Link] (7 responses)
Perhaps the reason those patches remain out of tree is that they cause serious regressions on other workloads?
Posted Jul 28, 2023 4:19 UTC (Fri)
by quotemstr (subscriber, #45331)
[Link] (6 responses)
Posted Jul 28, 2023 6:28 UTC (Fri)
by mb (subscriber, #50428)
[Link] (5 responses)
The problem with a user programmable scheduler is, that this will hurt the overall ecosystem in the long term by creating thousands of "special" flowers.
Posted Jul 28, 2023 9:51 UTC (Fri)
by kleptog (subscriber, #1183)
[Link] (4 responses)
This is a terrible argument.
The problem with a user-programmable computer is, that this will hurt the overall ecosystem in the long term by creating thousands of "special" flowers.
The problem with a user-modifiable kernel is, that this will hurt the overall ecosystem in the long term by creating thousands of "special" flowers.
The problem with a user-configurable desktop is, that this will hurt the overall ecosystem in the long term by creating thousands of "special" flowers.
Unless there is a really good reason, the trend is always toward *more* configurability, not less. The problem now is that the people with the broken workloads can't fix the scheduler, and the people who can fix the scheduler don't know the workloads. At least with a generally accepted user-modifiable scheduler, people could start communicating and collaborating about what works for them and we'd get progress, because the people with the workloads could actually try things out easily. As long as it's out of tree that will never happen, because you have no way of testing whether a change will affect normal users negatively.
Also, the idea that it's possible to build a single scheduler that works well for everyone is (from what I can tell) an assumption without basis.
Posted Jul 28, 2023 15:54 UTC (Fri)
by mb (subscriber, #50428)
[Link] (3 responses)
You just identified the real problem.
Posted Jul 28, 2023 21:56 UTC (Fri)
by kleptog (subscriber, #1183)
[Link] (2 responses)
I said 'fix', not 'patch'. Anyone can patch the scheduler, the question is if it still works afterwards.
Since the goal of the scheduler is to be the 'one scheduler to rule them all', the term 'fix' means to make it better for all workloads, even those you've never seen. Most people with unusual workloads aren't qualified for that, and probably don't care either. They just want it to work for their workload. If a BPF controlled scheduler allows them to solve their problem without having to learn any kernel coding, that's a win.
Of course, my thought is: is anyone qualified to 'fix' the scheduler on these terms? Possibly not. It would have to improve by slow evolution, and for that you need data, which is best gotten by running experiments. You can do that best if you have a mechanism that allows people to quickly and easily try out different things. Like, say, by being able to load little programs into the scheduler to change the behaviour on the fly.
Posted Jul 28, 2023 22:39 UTC (Fri)
by mb (subscriber, #50428)
[Link] (1 responses)
So developing a BPF scheduler is automatically better for all workloads?
> If a BPF controlled scheduler allows them to solve their problem without having
No. It's a hack.
> Like, say, by being able to load little programs into the scheduler to change the behaviour on the fly.
or by patching the freely available C source.
Anybody who can't code C, shall not attempt to develop a new core component of the kernel.
Posted Jul 29, 2023 14:10 UTC (Sat)
by kleptog (subscriber, #1183)
[Link]
> So developing a BPF scheduler is automatically better for all workloads?
Why would it be? It doesn't need to be, because it's not claiming to be the one scheduler to rule them all. I see it as a tool that allows people to solve their problems and experiment with alternatives without having to rebuild their kernel each time. It's not trying to replace the existing scheduler. Rather, it makes the existing scheduler's job simpler, because then the main scheduler only has to be best for *almost* all workloads, rather than all.
Peter obviously disagrees with this assessment, that's his prerogative. If enterprise software was going to depend on certain scheduler configurations they would have done that already. "Thou shalt only run with the kernel we give you" is not exactly unheard of. Instead we now have the kernel developers requiring people to use out of tree patches, which hardly seems better.
Posted Jul 28, 2023 4:20 UTC (Fri)
by quotemstr (subscriber, #45331)
[Link] (2 responses)
Posted Jul 28, 2023 6:23 UTC (Fri)
by mb (subscriber, #50428)
[Link] (1 responses)
Posted Jul 28, 2023 13:35 UTC (Fri)
by Manifault (guest, #155796)
[Link]
Additionally:
1. BPF allows user space to communicate state to the kernel via maps. Look at [0] -- there's even a hybrid scheduler that does load balancing in Rust in user space.
[0]: https://lore.kernel.org/all/20230711011412.100319-35-tj@k...
Posted Jul 31, 2023 13:26 UTC (Mon)
by nye (subscriber, #51576)
[Link] (1 responses)
Phew! Thanks for this.
Now that that's clear, we can get rid of selectable IO schedulers because there should be only one that's right for everyone. Thank god we no longer need non-root users since every process should be equal; that'll save a whole bunch of code. And namespaces - plus control groups of course! It's wonderful that, thanks to your brilliant insight, we now know that we can remove these maintenance burdens. Not to mention priority levels and scheduling classes - they seemed so complicated, so it's great that we can cut the Gordian knot and just get rid of them.
Posted Jul 31, 2023 14:47 UTC (Mon)
by mb (subscriber, #50428)
[Link]
Posted Jul 28, 2023 1:43 UTC (Fri)
by DemiMarie (subscriber, #164188)
[Link]
Posted Jul 30, 2023 19:25 UTC (Sun)
by DemiMarie (subscriber, #164188)
[Link]
Posted Jul 27, 2023 10:05 UTC (Thu)
by eduperez (guest, #11232)
[Link]
So, I guess that "to pull a RedHat" is now a phrase...
Posted Jul 28, 2023 7:50 UTC (Fri)
by andrewsh (subscriber, #71043)
[Link] (4 responses)
Posted Jul 28, 2023 8:29 UTC (Fri)
by spacefrogg (subscriber, #119608)
[Link] (2 responses)
Posted Jul 28, 2023 8:35 UTC (Fri)
by andrewsh (subscriber, #71043)
[Link] (1 responses)
Posted Jul 28, 2023 13:52 UTC (Fri)
by Wol (subscriber, #4433)
[Link]
That said, a simple "call BPF to ask user space what it thinks" shouldn't be ringing any alarm bells. Those people who don't want BPF scheduling don't install any BPF schedulers, those who do take responsibility for the resulting cock-ups ...
(And a general purpose distro should have no need for any BPF schedulers.)
Cheers,
Posted Jul 31, 2023 13:20 UTC (Mon)
by nye (subscriber, #51576)
[Link]
Or keep your patches.
Where's the problem, really?
These out of tree patches can evolve into actual CFS improvements.
The custom BFS hacks cannot.
Typo: BPF
It works as-is as a hacking playground.
Why can't they patch the scheduler?
> Why can't they patch the scheduler?
> to learn any kernel coding, that's a win.
2. BPF can do a _lot_ now. You can create rbtrees natively in BPF, for example. See the flat cpu controller scheduler in [1].
[1]: https://lore.kernel.org/all/20230711011412.100319-27-tj@k...
Sending Project-C patches upstream?
Wol