
Extensible scheduler class rejected

The extensible scheduler class enables the creation of CPU schedulers in BPF. After the fourth version of this series was greeted with relative silence, Tejun Heo asked about the status of this work:

We are comfortable with the current API. Everything we tried fit pretty well. It will continue to evolve but sched_ext now seems mature enough for initial inclusion. I suppose lack of response doesn't indicate tacit agreement from everyone, so what are you guys all thinking?

Scheduler maintainer Peter Zijlstra gave him his answer: "I'm still hating the whole thing with a passion". He went on to make it clear that this work will not be merged into the mainline. So, it seems, developers wanting to try their hand at BPF scheduler development will need to apply an out-of-tree patch series, for now at least.



Extensible scheduler class rejected

Posted Jul 26, 2023 19:23 UTC (Wed) by flussence (guest, #85566) [Link] (26 responses)

And so we continue manually patching Project-C in our kernels to get usable framerates…

Extensible scheduler class rejected

Posted Jul 27, 2023 5:27 UTC (Thu) by TheGopher (subscriber, #59256) [Link] (23 responses)

Or figure out why this is and submit a patch?

Extensible scheduler class rejected

Posted Jul 27, 2023 7:38 UTC (Thu) by zorro (subscriber, #45643) [Link] (22 responses)

What if "what works" is application dependent?

Extensible scheduler class rejected

Posted Jul 27, 2023 9:38 UTC (Thu) by mb (subscriber, #50428) [Link] (21 responses)

If you think your application is special, then it's probably not.

Extensible scheduler class rejected

Posted Jul 27, 2023 11:51 UTC (Thu) by zorro (subscriber, #45643) [Link]

Reading the original motivation for this patch set, this is not what Google and Meta seem to think. Granted, they are big enough to run custom kernels with custom schedulers anyway.

Extensible scheduler class rejected

Posted Jul 27, 2023 14:23 UTC (Thu) by quotemstr (subscriber, #45331) [Link] (17 responses)

You didn't answer his question. Some workloads really are hard to predict. What if the kernel really doesn't in some situations have enough information to do the right thing?

Extensible scheduler class rejected

Posted Jul 27, 2023 15:42 UTC (Thu) by mb (subscriber, #50428) [Link] (16 responses)

Then BPF could not solve the situation either.

Extensible scheduler class rejected

Posted Jul 27, 2023 19:53 UTC (Thu) by flussence (guest, #85566) [Link] (12 responses)

BFS/MuQSS solved the situation a decade and a half ago, as does PDS/BMQ today. Asking for a BPF plugin system is a *concession*, and until it happens people are just going to keep applying experimental out of tree patches, because there continues to be a very real dollar cost to using CFS.

These things are fixing a 25%+ performance hit in CPU-bound multithreaded applications. It sounds insane, and it is insane that the kernel has fought to maintain the status quo for this long. DOI:10.1145/2901318.2901326

Extensible scheduler class rejected

Posted Jul 27, 2023 20:18 UTC (Thu) by mb (subscriber, #50428) [Link] (3 responses)

>there continues to be a very real dollar cost to using CFS.

Fix CFS.
Or keep your patches.
Where's the problem, really?

>and until it happens people are just going to keep applying experimental out of tree patches

That is a good thing.
These out of tree patches can evolve into actual CFS improvements.
The custom BFS hacks cannot.

Extensible scheduler class rejected

Posted Jul 27, 2023 20:20 UTC (Thu) by mb (subscriber, #50428) [Link]

>BFS
Typo: BPF

Extensible scheduler class rejected

Posted Jul 27, 2023 20:44 UTC (Thu) by djk121 (subscriber, #152710) [Link] (1 responses)

> The custom BPF hacks cannot.

They can, and already are. The SHARED_RUNQ patches for CFS were born out of sched_ext experiments: https://lore.kernel.org/all/20230710200342.358255-1-void@...

Extensible scheduler class rejected

Posted Jul 27, 2023 20:49 UTC (Thu) by mb (subscriber, #50428) [Link]

So? If anything, then this shows that this doesn't have to be merged mainline.
It works as-is as a hacking playground.

Extensible scheduler class rejected

Posted Jul 27, 2023 21:43 UTC (Thu) by pizza (subscriber, #46) [Link] (7 responses)

> These things are fixing a 25%+ performance hit in CPU-bound multithreaded applications. It sounds insane, and it is insane that the kernel has fought to maintain the status quo for this long.

Perhaps the reason those patches remain out of tree is that they cause serious regressions on other workloads?

Extensible scheduler class rejected

Posted Jul 28, 2023 4:19 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (6 responses)

Thus the idea that a single scheduler might not be appropriate for all workloads.

Extensible scheduler class rejected

Posted Jul 28, 2023 6:28 UTC (Fri) by mb (subscriber, #50428) [Link] (5 responses)

Or the idea that these schedulers are just bad hacks that paper over a problem elsewhere.

The problem with a user programmable scheduler is, that this will hurt the overall ecosystem in the long term by creating thousands of "special" flowers.

Extensible scheduler class rejected

Posted Jul 28, 2023 9:51 UTC (Fri) by kleptog (subscriber, #1183) [Link] (4 responses)

> The problem with a user programmable scheduler is, that this will hurt the overall ecosystem in the long term by creating thousands of "special" flowers.

This is a terrible argument.

The problem with a user-programmable computer is, that this will hurt the overall ecosystem in the long term by creating thousands of "special" flowers.

The problem with a user-modifiable kernel is, that this will hurt the overall ecosystem in the long term by creating thousands of "special" flowers.

The problem with a user-configurable desktop is, that this will hurt the overall ecosystem in the long term by creating thousands of "special" flowers.

Unless there is a really good reason, the trend is always to *more* configurability, not less. The problem is now that the people with the broken workloads can't fix the scheduler, and the people who can fix the scheduler don't know the workloads. At least with a generally accepted user-modifiable scheduler people could start communicating and collaborating about what works for them and we'd get progress, because the people with the workloads can actually try things out easily. As long as it's out of tree that will never happen, because you have no way of testing whether a change will affect normal users negatively.

Also, the idea that it's possible to build a single scheduler that works well for everyone is (from what I can tell) an assumption without basis.

Extensible scheduler class rejected

Posted Jul 28, 2023 15:54 UTC (Fri) by mb (subscriber, #50428) [Link] (3 responses)

>The problem is now that the people with the broken workloads can't fix the scheduler

You just identified the real problem.
Why can't they patch the scheduler?

Extensible scheduler class rejected

Posted Jul 28, 2023 21:56 UTC (Fri) by kleptog (subscriber, #1183) [Link] (2 responses)

> You just identified the real problem.
> Why can't they patch the scheduler?

I said 'fix', not 'patch'. Anyone can patch the scheduler, the question is if it still works afterwards.

Since the goal of the scheduler is to be the 'one scheduler to rule them all', the term 'fix' means to make it better for all workloads, even those you've never seen. Most people with unusual workloads aren't qualified for that, and probably don't care either. They just want it to work for their workload. If a BPF controlled scheduler allows them to solve their problem without having to learn any kernel coding, that's a win.

Of course, my thought is: is anyone qualified to 'fix' the scheduler on these terms? Possibly not. It would have to evolve by slow evolution, and for that you need data. Which is best gotten by running experiments. You can do that best if you have a mechanism that allows people to quickly and easily try out different things. Like, say, by being able to load little programs into the scheduler to change the behaviour on the fly.
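
As a rough illustration of the "load little programs to try different things" idea, here is a toy Python simulation in which the scheduling policy is a swappable function. It is only a sketch under stated assumptions: it does not use BPF or sched_ext, and every name in it (`fifo`, `shortest_first`, `mean_wait`, `WORKLOAD`) is hypothetical.

```python
from typing import Callable, List, Tuple

# A task is (name, run_time). A policy returns the index of the task to run next.
Task = Tuple[str, int]
Policy = Callable[[List[Task]], int]


def fifo(queue: List[Task]) -> int:
    """Run tasks in arrival order."""
    return 0


def shortest_first(queue: List[Task]) -> int:
    """Run the task with the smallest run time first."""
    return min(range(len(queue)), key=lambda i: queue[i][1])


def mean_wait(tasks: List[Task], policy: Policy) -> float:
    """Run every task to completion on one CPU and return the mean waiting time."""
    queue = list(tasks)
    clock = 0
    total_wait = 0
    while queue:
        _name, run_time = queue.pop(policy(queue))
        total_wait += clock   # this task waited until 'clock' before starting
        clock += run_time
    return total_wait / len(tasks)


# Hypothetical workload: one long batch job queued ahead of two short tasks.
WORKLOAD: List[Task] = [("batch", 8), ("ui", 1), ("io", 2)]

if __name__ == "__main__":
    for policy in (fifo, shortest_first):
        print(policy.__name__, mean_wait(WORKLOAD, policy))
```

Swapping the policy changes the measured mean wait for the same workload, which is the kind of quick experiment the comment describes. A different workload could just as easily favour a different policy.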

Extensible scheduler class rejected

Posted Jul 28, 2023 22:39 UTC (Fri) by mb (subscriber, #50428) [Link] (1 responses)

>the term 'fix' means to make it better for all workloads

So developing a BPF scheduler is automatically better for all workloads?

> If a BPF controlled scheduler allows them to solve their problem without having
> to learn any kernel coding, that's a win.

No. It's a hack.

> Like, say, by being able to load little programs into the scheduler to change the behaviour on the fly.

or by patching the freely available C source.

Anybody who can't code C, shall not attempt to develop a new core component of the kernel.

Extensible scheduler class rejected

Posted Jul 29, 2023 14:10 UTC (Sat) by kleptog (subscriber, #1183) [Link]

> >the term 'fix' means to make it better for all workloads

> So developing a BPF scheduler is automatically better for all workloads?

Why would it be? It doesn't need to be, because it's not claiming to be the one scheduler to rule them all. I see it as a tool that allows people to solve their problems and experiment with alternatives without having to rebuild their kernel each time. It's not trying to replace the existing scheduler. It rather makes it simpler, because then the main scheduler only has to be best for *almost* all workloads, rather than all.

Peter obviously disagrees with this assessment, that's his prerogative. If enterprise software was going to depend on certain scheduler configurations they would have done that already. "Thou shalt only run with the kernel we give you" is not exactly unheard of. Instead we now have the kernel developers requiring people to use out of tree patches, which hardly seems better.

Extensible scheduler class rejected

Posted Jul 28, 2023 4:20 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (2 responses)

Except eBPF can teach the kernel to use information it couldn't before.

Extensible scheduler class rejected

Posted Jul 28, 2023 6:23 UTC (Fri) by mb (subscriber, #50428) [Link] (1 responses)

No? BPF can do less than a kernel patch. That should be obvious.

Extensible scheduler class rejected

Posted Jul 28, 2023 13:35 UTC (Fri) by Manifault (guest, #155796) [Link]

That's a significant oversimplification. Yes, programming in BPF is more constraining than just doing basic kernel programming. There's a reason for that though -- it's because the BPF schedulers _can't crash_ thanks to the verifier (_and_ they can't hang the kernel because of the watchdog).

Additionally:

1. BPF allows user space to communicate state to the kernel via maps. Look at [0] -- there's even a hybrid scheduler that does load balancing in rust in user space.
2. BPF can do a _lot_ now. You can create rbtrees natively in BPF, for example. See the flat cpu controller scheduler in [1].

[0]: https://lore.kernel.org/all/20230711011412.100319-35-tj@k...
[1]: https://lore.kernel.org/all/20230711011412.100319-27-tj@k...
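
The first point, user space passing state to the scheduler through a map, can be modelled loosely in plain Python. This is a toy sketch, not the sched_ext or BPF map API: the dictionary merely stands in for a map, and `hint_map`, `set_hint`, and `pick_next` are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional

# Stand-in for a BPF map: written by a user-space agent, read by the scheduler.
hint_map: Dict[int, int] = {}


def set_hint(pid: int, priority: int) -> None:
    """User-space side: publish a priority hint for a task."""
    hint_map[pid] = priority


def pick_next(runnable: List[int]) -> Optional[int]:
    """Scheduler side: choose the runnable pid with the highest hint.

    Pids without a hint default to 0. On ties the earliest queued pid wins,
    because max() returns the first maximal element.
    """
    if not runnable:
        return None
    return max(runnable, key=lambda pid: hint_map.get(pid, 0))


if __name__ == "__main__":
    queue = [101, 102, 103]
    print(pick_next(queue))   # no hints yet, so queue order decides
    set_hint(103, 5)          # the agent knows something the scheduler does not
    print(pick_next(queue))
```

The point of the sketch is the split of responsibilities: the policy decision stays on the scheduler side, while knowledge about the workload arrives from outside through shared state.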

Extensible scheduler class rejected

Posted Jul 31, 2023 13:26 UTC (Mon) by nye (subscriber, #51576) [Link] (1 responses)

> If you think your application is special, then it's probably not

Phew! Thanks for this.

Now that that's clear, we can get rid of selectable IO schedulers because there should be only one that's right for everyone. Thank god we no longer need non-root users since every process should be equal; that'll save a whole bunch of code. And namespaces - plus control groups of course! It's wonderful that, thanks to your brilliant insight, we now know that we can remove these maintenance burdens. Not to mention priority levels and scheduling classes - they seemed so complicated so it's great that we can cut the Gordian knot and just get rid of them.

Extensible scheduler class rejected

Posted Jul 31, 2023 14:47 UTC (Mon) by mb (subscriber, #50428) [Link]

Did you forget to take your pills?

Extensible scheduler class rejected

Posted Jul 28, 2023 1:43 UTC (Fri) by DemiMarie (subscriber, #164188) [Link]

Which patch do you apply?

Sending Project-C patches upstream?

Posted Jul 30, 2023 19:25 UTC (Sun) by DemiMarie (subscriber, #164188) [Link]

Has anyone tried sending the Project-C patches upstream? If not, why?

Extensible scheduler class rejected

Posted Jul 27, 2023 10:05 UTC (Thu) by eduperez (guest, #11232) [Link]

> They will not care, they will not contribute, they might even pull a RedHat and only share the code to customers.

So, I guess that "to pull a RedHat" is now a phrase...

Extensible scheduler class rejected

Posted Jul 28, 2023 7:50 UTC (Fri) by andrewsh (subscriber, #71043) [Link] (4 responses)

I’m happy I don’t need to communicate with Peter Zijlstra in any way in my hobby or professional work.

Extensible scheduler class rejected

Posted Jul 28, 2023 8:29 UTC (Fri) by spacefrogg (subscriber, #119608) [Link] (2 responses)

Is this a general remark on his communication style or does it refer to his specific answer? I have no personal interest in application-specific schedulers but found his argument very compelling. Today more than ever must the kernel force applications to not unduly mandate customer compliance. Just an example: BigCorp A, which has the majority on <customers> computer, enforces using the BadSched scheduler for the mere purpose of hurting FreeAlternative B's application performance on the same system to promote their own solution. Insert Chrome and Firefox or any of the other duals, here, to get the picture...

Extensible scheduler class rejected

Posted Jul 28, 2023 8:35 UTC (Fri) by andrewsh (subscriber, #71043) [Link] (1 responses)

It’s a general remark on his communication style, but it’s also quite visible in this particular response. Somehow, most of the emails from him I’ve seen were either passive aggressive, or outright very rude. He’s definitely not a person I would enjoy working with.

Extensible scheduler class rejected

Posted Jul 28, 2023 13:52 UTC (Fri) by Wol (subscriber, #4433) [Link]

Is he the sort of person who is (usually) correct in his gut feel? Sometimes you can present something to me and I'll just hate it because it "feels wrong". That can be a real sixth sense.

That said, a simple "call BPF to ask user space what it thinks" shouldn't be ringing any alarm bells. Those people who don't want BPF scheduling don't install any BPF schedulers, those who do take responsibility for the resulting cock-ups ...

(And a general purpose distro should have no need for any BPF schedulers.)

Cheers,
Wol

Extensible scheduler class rejected

Posted Jul 31, 2023 13:20 UTC (Mon) by nye (subscriber, #51576) [Link]

I fortunately don't ever have a need to interact on LKML but I do end up reading a fair few threads for various issues, and even nowadays (it used to be a lot worse) there are a lot of "core" developers who consistently act in ways that I would expect to result in at least a written warning from HR in most places of work. This particular example is on the friendlier side for LKML - at least it's actually fairly technical and isn't just a childish personal attack, an ego-led rant belittling the patch submitter, or a one-word "NAK" with an outright refusal to interact at all.


Copyright © 2023, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds