Adding periods to SCHED_DEADLINE
Deadline scheduling does away with priorities, replacing them with a three-parameter tuple: a worst-case execution time (or budget), a deadline, and a period. In essence, a process tells the scheduler that it will require up to a certain amount of CPU time (the budget) by the given deadline, and that the deadline optionally repeats with the given period. So, for example, a video-processing application might request 1ms of CPU time to process the next incoming frame, expected in 10ms, with a 33ms period thereafter for subsequent frames. Deadline scheduling is appealing because it allows the specification of a process's requirements in a natural way which is not affected by any other processes running in the system. There is also great value, though, in using the deadline parameters to guarantee that a process will be able to meet its deadline, and to reject any process which might cause a failure to keep those guarantees.
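In code form, the request from that hypothetical video application might be represented something like this; the structure and field names here are purely illustrative (they are not the patch's actual user-space interface), with times expressed in microseconds:

    struct dl_params {
        unsigned long budget;    /* worst-case execution time needed, in us */
        unsigned long deadline;  /* relative deadline, in us */
        unsigned long period;    /* minimum time between activations, in us */
    };

    static const struct dl_params video_task = {
        .budget   =  1000,   /* 1ms of CPU time per frame */
        .deadline = 10000,   /* each frame must be finished within 10ms */
        .period   = 33000,   /* subsequent frames arrive every 33ms */
    };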
The SCHED_DEADLINE scheduler being developed by Dario Faggioli appears to be on track for an eventual mainline merger, though nobody, yet, has been so foolish as to set a deadline for that particular task. This scheduler works, but, thus far, it takes a bit of a shortcut: in SCHED_DEADLINE, the deadline and the period are assumed to be the same. This simplification makes the "admission test" - the decision as to whether to accept a new SCHED_DEADLINE task - relatively easy. Each process gets a "bandwidth" parameter: the ratio of the CPU budget to the deadline/period value. As long as the sum of the bandwidth values for all processes on a given CPU does not exceed 1.0, the scheduler can guarantee that the deadlines will be met.
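That test is simple enough to sketch in a few lines of ordinary user-space C; the names below are made up for illustration, and the scheduler itself would use integer arithmetic rather than floating point:

    /* Total bandwidth already committed to deadline tasks on this CPU. */
    static double total_bw;

    /*
     * Admission test under the deadline == period simplification: accept a
     * new task only if the sum of budget/period ratios stays at or below 1.0.
     * Returns 0 if the task is admitted with its deadlines guaranteed.
     */
    static int dl_admit(unsigned long budget_us, unsigned long period_us)
    {
        double bw = (double)budget_us / (double)period_us;

        if (total_bw + bw > 1.0)
            return -1;      /* would oversubscribe the CPU: reject */
        total_bw += bw;
        return 0;
    }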
As Dario recently brought up on linux-kernel, there are users who would like to be able to specify separate deadline and period values. Adjusting the scheduler to implement those semantics is not particularly hard. Coming up with an admission test which ensures that deadlines can still be met is rather harder, though. Once the period and the deadline are separated from each other, it becomes possible for processes to miss their deadlines even if the total bandwidth of the CPU has not been oversubscribed.
To see how this might come about, consider an example posted by Dario. Imagine a process which needs 4ms of CPU time with a period of 10ms and a deadline of 5ms. A timeline of how that process might be scheduled could look like this:
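In rough ASCII form, one possible schedule (each column is one millisecond; "#" means the process is running, "." means the CPU is idle):

    t (ms):  0 1 2 3 4 5 6 7 8 9
    P1:      # # # # . . . . . .
                       ^ deadline at t=5: met, 1ms to spare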
Here, the scheduler is able to run the process within its deadline; indeed, there is 1ms of time to spare. Now, though, if a second process comes in with the same set of requirements, the results will not be as good:
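Again in rough ASCII form; P1 happens to be picked first, but with equal deadlines the choice is arbitrary, and either way one of the two loses:

    t (ms):  0 1 2 3 4 5 6 7 8 9
    P1:      # # # # . . . . . .
    P2:      . . . . # # # # . .
                       ^ deadlines at t=5: P1 meets it; P2 finishes at t=8, missing by 3ms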
Despite the fact that 20% of the available CPU time remains unused, process P2 will miss its deadline by 3ms. In a hard realtime situation, that tardiness could prove fatal. Clearly, the scheduler should reject P2 in this situation. The problem is that detecting this kind of conflict is not easy, especially if the scheduler is (as seems reasonable) expected to leave some CPU time for applications rather than use it all performing complex admission calculations. For this reason, admission decision algorithms are currently an area of considerable research interest in the academic realtime community. See this paper by Alejandro Masrur et al. [PDF] or this one by Marko Bertogna [PDF] for examples of how involved it can get.
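For a taste of what a more thorough test involves, here is a minimal user-space sketch (invented names, nothing like what would actually run in the scheduler) of the "processor demand" check that much of that literature builds on: for every interval ending at some task's deadline, the total computation that must complete within that interval may not exceed its length. Fed the two processes above, it reports the overload at the 5ms mark; the difficulty lies in performing this kind of analysis, over all the intervals that matter, cheaply enough to do at admission time.

    #include <stdio.h>

    struct task {
        long budget;    /* worst-case execution time C, in ms */
        long deadline;  /* relative deadline D, in ms */
        long period;    /* period / minimum inter-arrival time T, in ms */
    };

    /* CPU time that must complete within any window of length t, assuming
     * every task releases a job at the start of the window. */
    static long demand(const struct task *ts, int n, long t)
    {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            if (t < ts[i].deadline)
                continue;       /* no whole job is due inside the window */
            sum += ((t - ts[i].deadline) / ts[i].period + 1) * ts[i].budget;
        }
        return sum;
    }

    static long gcd(long a, long b) { return b ? gcd(b, a % b) : a; }

    int main(void)
    {
        struct task ts[] = {
            { 4, 5, 10 },       /* P1 from the example above */
            { 4, 5, 10 },       /* P2: same parameters */
        };
        int n = sizeof(ts) / sizeof(ts[0]);

        /* With total utilization at or below 1 and deadlines no longer than
         * periods, checking every absolute deadline up to the hyperperiod
         * (the lcm of the periods) is sufficient. */
        long hyper = 1;
        for (int i = 0; i < n; i++)
            hyper = hyper / gcd(hyper, ts[i].period) * ts[i].period;

        for (int i = 0; i < n; i++)
            for (long d = ts[i].deadline; d <= hyper; d += ts[i].period)
                if (demand(ts, n, d) > d) {
                    printf("reject: %ld ms of work due within %ld ms\n",
                           demand(ts, n, d), d);
                    return 1;
                }
        printf("admit: demand never exceeds the available time\n");
        return 0;
    }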
There are a couple of ways to simplify the problem. One of those would be to change the bandwidth calculation to be the ratio of the CPU budget to the relative deadline time (rather than to the period). For the example processes shown above, each has a bandwidth of 0.8 using this formula; the scheduler, on seeing that the second process would bump the total to 1.6, could then decide to reject it. Using this calculation, the scheduler can, once again, guarantee that deadlines will be met, but at a cost: this formula will cause the rejection of processes that, in reality, could be scheduled without trouble. It is an overly pessimistic heuristic which will prevent full utilization of the available resources.
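In terms of the earlier admission sketch, the change is just to the divisor; the helper below is hypothetical, but the example numbers show why the approach is safe yet conservative:

    /*
     * Deadline-based variant: charge each task budget/deadline instead of
     * budget/period.  For the example above, each task weighs in at
     * 4/5 = 0.8, so the second one is refused (0.8 + 0.8 > 1.0): safe, but
     * pessimistic, since a pair of such tasks can be scheduled successfully
     * if their activations happen to be staggered by half a period.
     */
    static int dl_admit_by_deadline(unsigned long budget_us,
                                    unsigned long deadline_us)
    {
        double bw = (double)budget_us / (double)deadline_us;

        if (total_bw + bw > 1.0)
            return -1;
        total_bw += bw;
        return 0;
    }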
An alternative, proposed by Dario, would be to stick with the period-based bandwidth values for admission decisions and to take the risk that some deadlines might not be met. In this case, user-space developers would be responsible for ensuring that the full set of tasks on the system can be scheduled. They might take some comfort in the fact that, since the overall bandwidth of the CPU would still not be oversubscribed, the amount by which a deadline could be missed would be deterministically bounded.
That idea did not survive its encounter with Peter Zijlstra, who thinks it ruins everything that a deadline scheduler is supposed to provide.
Peter's suggestion, instead, was to split deadline scheduling logically into two different schedulers. The hard realtime scheduler would retain the highest priority, and would require that the deadline and period values be the same. If, at some future time, a suitable admission controller is developed then that requirement could be relaxed as long as deadlines could still be guaranteed.
Below the hard realtime scheduler would be a soft realtime scheduler which would have access to (most of) the CPU bandwidth left unused by the hard scheduler. That scheduler could accept processes using period-based bandwidth values with the explicit understanding that deadlines might be missed by bounded amounts. Soft realtime is entirely good enough for a great many applications, so there is no real reason not to provide it as long as hard realtime is not adversely affected.
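In admission-test terms, the split might look something like the sketch below. This is only one reading of the proposal, with invented names; exactly how much of the leftover bandwidth the soft class may claim, and what happens when hard tasks arrive after soft ones, are policy details left out here:

    static double hard_bw;  /* bandwidth admitted with guaranteed deadlines */
    static double soft_bw;  /* bandwidth admitted with only bounded tardiness */

    /* Returns 0 if admitted as hard realtime, 1 if soft, -1 if rejected. */
    static int dl_admit_two_level(unsigned long budget, unsigned long deadline,
                                  unsigned long period)
    {
        double bw = (double)budget / (double)period;

        if (deadline == period) {
            /* Hard class: here the simple utilization test really does
             * guarantee that deadlines will be met. */
            if (hard_bw + bw > 1.0)
                return -1;
            hard_bw += bw;
            return 0;
        }

        /* Soft class: runs in (most of) whatever the hard class leaves
         * unused; deadlines may be missed, but only by a bounded amount. */
        if (hard_bw + soft_bw + bw > 1.0)
            return -1;
        soft_bw += bw;
        return 1;
    }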
So that is probably how things will go, though the real shape of the solution will not be seen until Dario posts the next version of the SCHED_DEADLINE patch. Even after this problem is solved, though, deadline scheduling has a number of other challenges to overcome, with good multi-core performance being near the top of the list. So, while Linux will almost certainly have a deadline scheduler at some point, it's still hard to say just when that might be.
(Readers who are interested in the intersection of academic and practical realtime work might want to peruse the recently-released proceedings [PDF] from the OSPERT 2010 conference, held in Brussels in early July.)
Index entries for this article:
Kernel: Realtime/Deadline scheduling
Kernel: Scheduler/Deadline scheduling
Posted Jul 20, 2010 22:04 UTC (Tue) by xxiao (guest, #9631)
Posted Jul 21, 2010 7:46 UTC (Wed) by abacus (guest, #49001)
Posted Jul 21, 2010 16:18 UTC (Wed) by cesarb (subscriber, #6266)
I do not know if this already exists; I have not looked.
Posted Jul 21, 2010 19:31 UTC (Wed) by sbohrer (guest, #61058)
Posted Jul 21, 2010 8:01 UTC (Wed) by viiru (subscriber, #53129)
> bad performance on my multithread embedded apps(>10% cpu usage
> increase,etc), the same app ran perfectly well with the O1 scheduler in
> the past for years.
Google is having the same problem with some of their apps, which is why they have ported the old scheduler over to the 2.6 series. I'm not sure if they have a kernel tree with that publicly available, though.
Posted Jul 21, 2010 15:09 UTC (Wed) by ms (subscriber, #41272)
Is this initial scheduling problem the only issue with P2 here or am I missing something else?
Posted Jul 21, 2010 15:10 UTC (Wed) by ms (subscriber, #41272)
Posted Jul 21, 2010 16:03 UTC (Wed) by farnz (subscriber, #17727)
That's the entire problem - kernel-side, spotting that you have to tell P2 that its request for CPU time cannot be met as-is, but could be met if P2 could delay its initial deadline is the hard part. It's easy to tell that we have enough CPU time in total in the 10ms time period, but not (in general) easy to tell if we can't have P2 and P1 both meet their first deadline. If P2 could cope with having its deadline delayed by 2ms, it would say that - IOW, it would ask for first deadline 7ms from now, not 5ms from now.
Note that while this example is rather simple (we have two 4ms processes to fit into 5ms, which isn't going to work), you've also got to cope with more complex examples - e.g. P1 has a deadline of 5ms from now, 4ms of work to do, and a period of 10ms (so deadlines at 5ms from now, 15ms from now, 25ms from now etc), while P2 has a deadline of 3ms from now, 1ms of work to do, and a period of 16ms (so deadlines at 3ms, 19ms, 35ms etc). Can P1 and P2 co-exist on the same CPU, or does P2 have to be told that its demands can't be met?
Posted Jul 21, 2010 17:02 UTC (Wed) by iabervon (subscriber, #722)
Say you're doing video processing on a stream at 60 Hz and one at 50 Hz. If you've got tight windows between when the input is available and when the output is required (say 2ms of work out of 3ms of real time), you may be fine for the first 100ms and then have to do both frames at the same time. Or the streams might be staggered such that they never conflict.
Posted Jul 21, 2010 20:34 UTC (Wed) by blitzkrieg3 (guest, #57873)
The example you provided was exactly what I thought of when I started reading, sort of like how blinkers will phase in and out of sync while sitting in the turning lane.
Posted Jul 21, 2010 20:43 UTC (Wed) by corbet (editor, #1)
I'll confess to being totally confused by this comment. The example was (deliberately) simple, but I think it's clear and valid: the scheduler simply cannot meet the deadlines in that case. I don't see how that can be fixed by "staggering the CPU time." The scheduler can't change the deadline... What am I missing?
Posted Jul 21, 2010 21:58 UTC (Wed) by Shewmaker (guest, #1126)
I think their confusion centers around the implications of allowing separate deadline and period values. If the commenters read Dario's email, then they would have seen his reference to the "sporadic task model" which may have helped.
Perhaps another way to describe your example is that it is specifically constructed to show that when a scheduler tries to allow admission of a task at any time, and not just on period boundaries, then it gets harder. Yes, the scheduler could delay the admission of the new task until the deadline of the existing task, but then you have a periodic scheduler and not one that handles sporadic job arrivals.
I hope that helps.
Posted Jul 21, 2010 22:37 UTC (Wed) by blitzkrieg3 (guest, #57873)
Yeah exactly. I went back and read the email after seeing the response, and the key point is:
> one of the tasks can miss its deadline by 3 time units in the worst case (if, and if yes which one, depends on the order with which they arrive).
Obviously if we have the good fortune of having P2 arrive after 5 time units, we're golden. Before reading the email, I would have assumed that if you're just starting an executable, rather than acting on an event such as a udp packet or something, then you can sacrifice 5 time units.
For the purposes of explaining how this works and what these values mean, it might make more sense to use a simple example that isn't schedule-able, no matter what order it comes in, like:
    T_1 = (2, 4, 3)
    T_2 = (2, 5, 3)
Even if T_2 can be scheduled after 2 time units, using the naive algorithm that I illustrated in the picture, it's clear that they're both going to wake up at the same time after 10 units, at which time you have 3 units to schedule 4 slots of cpu time.
Posted Jul 23, 2010 22:33 UTC (Fri) by giraffedata (guest, #1954)
I think where people really get lost in these explanations is that they appear to describe a system where the scheduler gets to decide exactly who to give CPU time and when, or where a process requests a precise CPU schedule for itself for the rest of time.
But in reality, the scheduler must wait for a process to request each chunk of time, and even the process can't know exactly when it will do that.
I presume the parameters actually say, "I am going to be requesting CPU time in the future. There will be at least 10 ms between requests. Each one will be for less than 4 ms, but whatever it is, I'll need it all within 5 ms of when I ask for it."
Based on that, the scheduler is supposed to say, "As long as you stick to those constraints, I guarantee I will fill all your requests" or "I can't guarantee that, so don't bother trying to run."
I really don't get the one statement about it being OK to miss the deadline by a bounded amount. If the process is OK with getting its 4ms within 10 ms, why did it state a deadline of 5 ms?
Posted Jul 22, 2010 19:57 UTC (Thu) by droundy (subscriber, #4559)
Posted Jul 22, 2010 23:27 UTC (Thu) by ejr (subscriber, #51652)
Posted Jul 23, 2010 0:36 UTC (Fri) by gdt (subscriber, #6284)
From an application programming point of view I don't want to have to profile my application for each target system. I'd very much like to run a few AV frames, BGP Hellos or whatever and then say "deadline schedule me that". It would be nice if the APIs were better thought out so that this could be simple, but they aren't. The current APIs really encourage reserving the worst-case behaviour of the algorithm on the worst-case target (not really appropriate for soft realtime tasks such as video decode or protocol handling, where we can let the occasional extraordinary event pass through to the keeper), so we can be asking for two or three orders of magnitude too many resources.
Posted Jul 23, 2010 10:23 UTC (Fri) by mjthayer (guest, #39183)
Posted Jul 24, 2010 1:42 UTC (Sat) by mcmechanjw (subscriber, #38173)
For example a current SCHED_DEADLINE process has claimed 100% until its deadline in 10ms but won't reschedule for 1000ms afterwards.
I can also see having 100% free for a while and 100% used later. E.g. the current process has completed and won't reschedule for 1000ms but the total reserved is 100% of the CPU then...
I am fairly sure most heuristics will have bad to very bad corner cases either not offering cpu time when available or oversubscribing.
Looking at it, it seems it does not matter if the period and deadline are separate or not, the current schedule must be tested against the first deadline regardless of when that will be and the periodic bandwidth must be tested separately against future bandwidth reservations.
The simplest method that comes to mind for the current schedule is to just walk the schedule forward and see when enough free ms have accumulated before the proposed deadline is hit...
Posted Jul 29, 2010 6:04 UTC (Thu) by eternaleye (guest, #67051)
Link: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1....
Posted Aug 2, 2010 14:22 UTC (Mon) by Arakageeta (guest, #60609)
BTW, the research group that published the paper you linked also maintains a Linux testbed (Litmus^RT) to test real-time scheduling (pfair, various deadline schedulers) and synchronization algorithms: http://www.cs.unc.edu/~anderson/litmus-rt/