|
|
Subscribe / Log in / New account

The return of the BFQ I/O scheduler

By Jonathan Corbet
February 3, 2016
Once upon a time, block I/O schedulers, which are charged with optimizing the stream of I/O requests to storage devices, were an area of intensive development. The characteristics of rotating drives meant that I/O performance would vary greatly depending on the order in which requests were serviced. Hardware changes (and the move to solid-state storage devices in particular) have taken away much of the motivation for work in this area; on modern drives, the best thing the I/O scheduler can do is often to just stay out of the way. Still, some interest in improving I/O schedulers remains, as can be seen in the return of the budget fair queuing (BFQ) scheduler.

BFQ, developed by Paolo Valente and Arianna Avanzini, was last covered here in June 2014. It is derived from the CFQ I/O scheduler found in current kernels; CFQ is the default for most systems, though some users replace it with one of the simpler schedulers when they have fast drives. BFQ has diverged a long way from its CFQ roots, though.

See the 2014 article for details on how BFQ works; what follows is a much-abbreviated summary. For any given block device, BFQ creates a request queue for each process in the system. Each queue is assigned a budget, being the amount of I/O traffic it is allowed to dispatch to the drive in any given scheduling cycle; the budget is determined by the scheduler itself based on I/O priority and observations of the process's I/O behavior. The dispatching code, given the mellifluous name "B-WF2Q+", services each queue in turn; an amount of I/O up to the associated budget will be dispatched during that turn. By setting the budgets properly, BFQ tries to ensure that all processes get a fair amount of the available I/O bandwidth within fairly tight latency limits.

The core idea is simple, but the real world is rather more complex; as a result, BFQ, like CFQ before it, has accrued a number of heuristics aimed at improving performance in specific areas. Queues for cooperating processes working in the same area of the disk are merged to enable more request consolidation. If a process doing read requests empties its queue before exhausting its budget, the device will be idled briefly to give that process a chance to create another request. Processes identified as "soft realtime" (those with moderate bandwidth requirements and an observed regular on/off request cycle) will get higher priority, as will processes that are just starting up. And so on.

By all appearances, BFQ has not changed a great deal since the June 2014 posting. The described heuristics are mostly the same. Undoubtedly bugs have been fixed and performance improved based on experience using the scheduler in the real world. Such experience does exist; the scheduler has been shipped with a few second-tier distributions and, evidently, has been used in some handsets. Paolo has previously posted from the University of Modena and Reggio Emilia, but the current patches come instead from a Linaro address, suggesting that there is some commercial interest in this work.

The BFQ scheduler was well received in 2014, but the proposed method of getting it into the kernel was not. At that time, BFQ was added as a new I/O scheduler alongside the others in the kernel, but the kernel community had little appetite for yet another scheduler, much less one that resembles the in-kernel CFQ scheduler so closely. Since BFQ is, by most appearances, an improvement on CFQ, Paolo was advised to generate a series of patches evolving CFQ in the desired direction. That was easier said than done, even given that BFQ derives from CFQ; the two schedulers had diverged considerably while BFQ was being developed. As a result of this requirement, BFQ essentially disappeared from the kernel mailing lists for more than a year.

Its return shows how the BFQ developers intend to try to satisfy the request that was made of them. The new patch set consists of 22 changesets, divided into three main phases:

  1. The CFQ scheduler is regressed back to the state it was in when BFQ originally split off from it. The first eight patches simply strip features (mostly heuristics) out of CFQ, simplifying it considerably. At the end of this phase, CFQ remains and (presumably) still works, but lacks most of the features it has acquired in the last few years.

  2. The core CFQ engine is replaced wholesale with the core BFQ engine; that patch removes 1,700 lines of code and adds nearly 4,300 lines. The scheduler still calls itself CFQ (a name that may never change for compatibility reasons); at this point it represents BFQ in a relatively early stage of development. The next patch adds back full hierarchical scheduling and control-group support.

  3. The remaining 11 patches add new heuristics to BFQ, addressing various performance and fairness issues that have come up over time.

As of this writing, the new patch set has not yet received any comments, so it remains to be seen whether the development community will accept this approach or not. As a general rule, kernel developers like to see code evolved in relatively small steps; a massive replacement of code in a single patch is hard to review and has a relatively high chance of introducing regressions and performance problems. The CFQ scheduler is heavily used in production; destabilizing it for a couple of releases is not really a viable option. So it's entirely possible that this submission will be no more successful than the previous ones.

On the other hand, there may be no better way to get BFQ into the kernel at this point. If enthusiasm for BFQ were low, that might simply doom it to an out-of-tree existence. But BFQ seems demonstrably better than the CFQ scheduler, and its heuristics are better understood, so there is a real motivation to get it into the kernel. One can imagine that it might take some time to build a sufficiently high level of confidence in the quality of the code, so BFQ should not be expected in, say, the 4.6 development cycle. But, given that time, the development community might yet see a way to get this code merged and made available to the user community as a whole.

Index entries for this article
KernelBlock layer/I/O scheduling


to post comments

The return of the BFQ I/O scheduler

Posted Feb 4, 2016 4:36 UTC (Thu) by samroberts (subscriber, #46749) [Link]

Don't want a second scheduler, and don't want to evolve CFQ too fast... those are conflicting goals.

Sounds like BFQ should be added, so people can switch it in and see if it is better, once enough experience proves it, it can be swapped for CFQ, or made the default.

The first approach sounded better!

The return of the BFQ I/O scheduler

Posted Feb 4, 2016 8:33 UTC (Thu) by Abrahams (guest, #103692) [Link] (2 responses)

> The dispatching code, given the mellifluous name "B-WF2Q+",

Fabulous, hahaha!

The return of the BFQ I/O scheduler

Posted Feb 5, 2016 3:16 UTC (Fri) by jschrod (subscriber, #1646) [Link] (1 responses)

Your comment made me look up the unknown-to-me word »mellifluous«. Thanks for that; I learned something.

Now I wonder if our esteemed editor meant the association with »mellow«, »smooth«, »sweet«, »lovely«, »candied«, or »over-sweet«. (These are the associations dict.leo.org provides.)

I wonder if it's the latter term. ;-)

mellifluous

Posted Feb 5, 2016 16:42 UTC (Fri) by Jandar (subscriber, #85683) [Link]

Wiktionary explains mellifluous:

From Latin mellifluus ‎(“flowing like honey”), from mel ‎(“honey”) + fluō ‎(“flow”).

1) Flowing like honey.
2) Sweet, smooth and musical; pleasant to hear (generally used of a person's voice, tone or writing style).

The return of the BFQ I/O scheduler

Posted Feb 4, 2016 15:37 UTC (Thu) by flussence (guest, #85566) [Link] (5 responses)

That patchset's methodology really demonstrates how absurd these demands are.

To use a bad car analogy, it's like driving cross-country for a friend to deliver a few clothes in a box, and then on arrival being told you folded them wrong and stacked the colours in the wrong order, should have known that in the first place, and ordered to go all the way back home, do that there and try again.

(P.S. I'm thankful the kernel devs didn't manage to burn Paolo out with this circus like they did to Con Kolivas. In spite of that, I've been a very happy user of both sets of scheduling code for many years.)

The return of the BFQ I/O scheduler

Posted Feb 4, 2016 19:40 UTC (Thu) by martinfick (subscriber, #4455) [Link] (4 responses)

Another way to see it, perhaps from the other side, is that you have been asked to wear the clothes your friend designed to a party. When the clothes arrive, they don't fit, and you are being asked to also store them in your closet, which is already too full, along with your other clothes. They are nice clothes though, you just aren't sure you want to get rid of your uniform you need for work to store them though!

The return of the BFQ I/O scheduler

Posted Feb 5, 2016 4:31 UTC (Fri) by liam (guest, #84133) [Link] (3 responses)

Rather than stretching metaphor I'll just state the obvious: 1. kernel developers seem to like the scheduler, 2. seemingly arbitrarily, they don't want to add another scheduler (it's not as though io schedulers are being developed all the time and their developers asking for them to be included in the tree), 3. will accept the code if bfq, through some unknown, assuming it's possible, manipulations can be made to replace cfq and provide zero regressions (assumed by context).
Looks doable! And a great way to bring in new developers, to boot!

The return of the BFQ I/O scheduler

Posted Feb 5, 2016 11:36 UTC (Fri) by dgm (subscriber, #49227) [Link] (2 responses)

> Looks doable! And a great way to bring in new developers, to boot!

Given enough trust pigs fly just nice, and given enough time and resources any software project is doable. But doable is not the same as practical or desirable. And to me it doesn't sound as a great "way to bring new developers", but rather a way to burn them.

If regressions are feared (and they should) do not remove cfq, just add bfq. Let users transition at their own peace and, after a curing period, set a timeline for cfq deprecation and removal. If bfq is shown to be sensibly inferior for some workload, just keep them both.

The return of the BFQ I/O scheduler

Posted Feb 5, 2016 20:10 UTC (Fri) by eternaleye (guest, #67051) [Link]

I believe that was rather liam's point, but Poe's Law seems to have sapped the sarcasm of their post.

The return of the BFQ I/O scheduler

Posted Feb 5, 2016 22:26 UTC (Fri) by viro (subscriber, #7872) [Link]

... and in the meanwhile, let the bloody peas^W^Wexisting developers deal with both - surely that (or anything they do, really) can't be hard, right?

Frankly, you remind me of USAnian pro-lifers - every sperm^Wlife is sacred and must be defended no matter the cost to anybody[*], until the moment they are born. At which point they cease to be innocent, presumably, and stop being worthy of any consideration...

How long is the, er, grace period in your case? IOW, how long does it take for a new developer to become a not so new one and lose your oh-so-valuable moral support?

[*] except for those who inspire the holy warriors, of course.

The above is not a comment about its quality of patchset or about its authors. I've no quarrel with either. The hypocritical advocates, OTOH...


Copyright © 2016, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds