Gettys: Traditional AQM is not enough
Many have understood bufferbloat to be a problem that primarily occurs when a saturating 'elephant flow' is present on a link; it is easiest to test for bufferbloat this way, but this is not the only problem we face. The dominant application, the World Wide Web, is anti-social to any other application on the Internet, and its collateral damage is severe. Solving the latency problem, therefore, requires a two-pronged attack.
Posted Jul 11, 2013 16:34 UTC (Thu)
by mtaht (subscriber, #11087)
[Link] (22 responses)
I would like to know what barriers remain, at this point, to getting this teeny patch, turning on fq_codel by default, into the Linux mainline.
http://snapon.lab.bufferbloat.net/~cero2/deb/patches/0003...
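For anyone who wants to try the effect of that patch without rebuilding a kernel, here is a minimal sketch (assuming a Linux box with iproute2, root privileges, and a kernel built with CONFIG_NET_SCH_FQ_CODEL; the interface name is a placeholder) that swaps the root qdisc at runtime instead of changing the compiled-in default:

```python
import subprocess

def set_fq_codel(dev):
    """Replace the root qdisc on `dev` with fq_codel.

    `tc qdisc replace` works whether or not a qdisc is already
    installed, so this is safe to run repeatedly. Requires root.
    """
    subprocess.run(
        ["tc", "qdisc", "replace", "dev", dev, "root", "fq_codel"],
        check=True,
    )

if __name__ == "__main__":
    set_fq_codel("eth0")  # placeholder interface name
```

The patch linked above goes further: it makes fq_codel the built-in default, so every interface comes up that way with no userspace configuration at all.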
Posted Jul 11, 2013 18:03 UTC (Thu)
by darthscsi (guest, #8111)
[Link] (16 responses)
Posted Jul 11, 2013 18:25 UTC (Thu)
by roskegg (subscriber, #105)
[Link] (13 responses)
Posted Jul 11, 2013 19:08 UTC (Thu)
by jg (guest, #17537)
[Link] (9 responses)
Posted Jul 11, 2013 19:14 UTC (Thu)
by roskegg (subscriber, #105)
[Link] (8 responses)
But even sustained disk I/O of any kind makes the system... jittery. The nightly locatedb run, for instance.
It is the same problem: bulk transfer versus a small throughput hit for small latency. As a desktop user I want no latency for keyboard and mouse, and the throughput is "good enough".
Could there be a sysctl to enable and disable this type of queueing for everything -- page swapping, disk I/O, etc. -- to make it smoother?
Even something as simple as bittorrent finishing a download and "verifying" the gigs of downloaded material makes everything shudder for a few seconds.
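There is no single sysctl that covers all of this today, but the disk side of the complaint can at least be softened per process: Linux I/O schedulers such as CFQ honor per-process I/O priorities, and util-linux ships ionice to set them. A hedged sketch (the commands are real; running updatedb this way is just an example):

```python
import subprocess

# Run the bulk I/O job (updatedb, as in the locatedb example above) in
# the "idle" I/O scheduling class (-c 3): it gets disk time only when
# nothing else wants it, which keeps interactive use smooth.
subprocess.run(["ionice", "-c", "3", "updatedb"], check=True)
```

The same trick applies to a bittorrent client's verify pass, if you can arrange for the client to be started under ionice.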
Posted Jul 11, 2013 19:38 UTC (Thu)
by jg (guest, #17537)
[Link] (5 responses)
1) There are HOL (Head of Line) blocking latency problems in display systems too, and some heritage left over from XFree86 brain damage that has not, to my knowledge, been completely stamped out in X.org and that causes bad CPU scheduling behavior.
In short, operating systems like to give the CPU back to processes that are not "greedy": ones that use small amounts of compute and go back to sleep (the OS can easily identify this behavior as "interactive").
So if you aren't very careful about busy-waiting in user space on graphics device registers, and about yielding to the OS in a timely fashion, display servers (whether X.org, Wayland or Mir) get identified as pigs, don't get preferentially scheduled, and therefore tend to suffer at the hands of other processes.
The original X implementations (before XFree86) usually got this right. X.org, the last I looked, not so much....
2) It turns out that there is another similarity to network devices (particularly those with highly variable throughput, such as wireless): the command ring buffers display hardware uses are very easy to load up with a long string of commands so that the hardware can operate autonomously. It's really tempting, for benchmarking's sake, to keep the number of commands outstanding to the display high. But that can kill latency and cause lots of HOL waiting.
In networking, at least if you know the current throughput (which Minstrel can give us for 802.11 someday) you can begin to choose how much buffering is appropriate to get the most bang for the buck without destroying latency.
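The arithmetic behind "how much buffering is appropriate" is just a rate-delay product: buffer bytes ≈ throughput × the queuing latency you are willing to tolerate. A toy illustration (the 5 ms target is an arbitrary assumption, not something Minstrel reports):

```python
def buffer_bytes(throughput_bps, target_latency_s=0.005):
    """Bytes of backlog that drain within the latency target."""
    return int(throughput_bps / 8 * target_latency_s)

# 802.11 throughput swings over two orders of magnitude,
# so the right amount of buffering swings with it.
for mbps in (1, 10, 100):
    print(f"{mbps:3} Mbit/s -> {buffer_bytes(mbps * 1_000_000):6} bytes")
# 1 Mbit/s   ->   625 bytes: less than one full-size packet
# 100 Mbit/s -> 62500 bytes: roughly forty 1500-byte packets
```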
But graphics has a yet harder problem: how long any given command will take is simply not knowable. Even simple commands may take radically different amounts of execution time, and with today's 3D engines there are Turing-complete processors out there that may never even terminate. So from that perspective, controlling display latency is harder than the problems we have in networking, at least until GPUs grow into full processors running multiple threads themselves (which seems to be the long-term trend). Again, it's really easy to get this wrong if you only do throughput benchmarking; you gotta measure latency.
I think careful attention to 1) above, and care in the number of commands allowed to be outstanding in 2) above, could go a long way to making the Linux display environment (whether X11 or otherwise) work much better.
But first, measure; ***then*** go fix....
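As a concrete (and entirely hypothetical: none of these names correspond to any real driver) sketch of the fix for 2), here is the pattern of capping the number of commands in flight with a semaphore while measuring the queuing delay each command actually saw:

```python
import queue
import random
import threading
import time

MAX_OUTSTANDING = 4            # illustrative cap on in-flight commands
inflight = threading.Semaphore(MAX_OUTSTANDING)
ring = queue.Queue()           # stands in for the hardware command ring

def gpu(stop):
    """Pretend hardware: drains the ring one command at a time."""
    while not stop.is_set() or not ring.empty():
        try:
            submitted, cost = ring.get(timeout=0.01)
        except queue.Empty:
            continue
        # First, measure: how long did this command sit in the queue?
        print(f"queued for {time.monotonic() - submitted:.4f}s")
        time.sleep(cost)       # execution time is "not knowable"
        inflight.release()

def submit(cost):
    inflight.acquire()         # blocks once MAX_OUTSTANDING are queued,
    ring.put((time.monotonic(), cost))  # bounding the HOL wait

stop = threading.Event()
hw = threading.Thread(target=gpu, args=(stop,))
hw.start()
for _ in range(20):
    submit(random.uniform(0.001, 0.02))
stop.set()
hw.join()
```

With the cap at 4, the worst-case queuing delay is bounded by roughly four command times; raise MAX_OUTSTANDING to 100 and total throughput barely changes while the measured queuing delay explodes, which is exactly the benchmarking trap described above.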
Posted Jul 12, 2013 13:10 UTC (Fri)
by jg (guest, #17537)
[Link] (4 responses)
The HOL blocking problems are, as I discussed, non-trivial. People interested in tackling those problems, to the extent they can be attacked, would I'm sure be greatly appreciated. Given the dynamic range of graphics hardware performance, it's an "interesting" optimization problem.
Developers of all sorts need to understand that latency is at least as important as throughput, and should be routinely testing for it. I greatly applaud all the work so many have done toward getting Linux and the desktop to boot quickly; over the last 5 years, boot times seem to have dropped from on the order of a minute to well under 10 seconds, only some of which is due to SSDs.
Posted Jul 12, 2013 16:39 UTC (Fri)
by roskegg (subscriber, #105)
[Link] (3 responses)
Your article reminds me of a classic: William Beaty's writing on traffic waves. He found that he could clear up traffic jams fairly easily. What I took away from his research is that if you back off from shoving yourself forward as much as possible (bulk transfer) and let others pop in front of you, most of the time they are transients who just wanted to change lanes to reach their exit, or who needed to get in from the on-ramp and transfer to another lane.
So, just like traffic in real life, I think that if you look at the overall picture, overall throughput will be optimal if individual processes give up the "greedy" approach and let the transients pop in, do their thing, and vanish.
When there are no transients, the big apps can still do mega block transfers and the kernel can still aggregate them.
I hope that by combining the insights of Bill Beaty and the CoDel method, disk access can be fixed so that bittorrent and updatedb no longer destroy latency and freeze my desktop.
Posted Jul 13, 2013 10:47 UTC (Sat)
by kleptog (subscriber, #1183)
[Link] (1 responses)
I figure there's some monitoring system in the back nipping all these traffic waves in the bud. I suspected it was something like that, but now I understand why it works.
Posted Jul 20, 2013 5:04 UTC (Sat)
by giraffedata (guest, #1954)
[Link]
I hadn't heard of variable speed limit signs, but I can see that that would work. The best device I've seen for eliminating stop-and-go traffic jams is the "Don't change lanes" sign that lights up when there is congestion.
I don't think any of this automobile traffic theory helps us at all with network traffic, though, because the automobile traffic unpleasantness is caused by two things that have no analog on data networks: 1) multiple lanes that cars can switch between continuously; and 2) following distance psychology. If all freeways had solid lines between the lanes except for occasional openings and cars had computers to ensure constant distance from the car ahead, there wouldn't be any stop-and-go traffic. You especially wouldn't see a car come to a stop because another one two miles down the road slowed down 3 mph to look at an accident.
Posted Jul 23, 2013 0:31 UTC (Tue)
by Baylink (guest, #755)
[Link]
I do it whenever possible, and I've found that, on urban interstates, I'm not the only one. Didn't realize it had a website, though.
Posted Jul 11, 2013 21:40 UTC (Thu)
by cesarb (subscriber, #6266)
[Link] (1 responses)
Posted Jul 12, 2013 4:42 UTC (Fri)
by roskegg (subscriber, #105)
[Link]
Posted Jul 11, 2013 19:26 UTC (Thu)
by darthscsi (guest, #8111)
[Link] (2 responses)
Posted Jul 11, 2013 19:50 UTC (Thu)
by dlang (guest, #313)
[Link] (1 responses)
Posted Jul 11, 2013 20:31 UTC (Thu)
by darthscsi (guest, #8111)
[Link]
Posted Jul 11, 2013 19:02 UTC (Thu)
by jg (guest, #17537)
[Link] (1 responses)
Posted Jul 11, 2013 20:42 UTC (Thu)
by mtaht (subscriber, #11087)
[Link]
Otherwise it is very hard to configure tc correctly for a wide range of devices, notably multiqueued ones, and in combination with htb. This has been the primary barrier to adoption by bleeding-edge distros. (OpenWrt went the kernel-patch route, which appears to be working out well.)
I will propose a patch in a week or so that does that, rather than just arbitrarily switching the default.
Posted Jul 11, 2013 19:00 UTC (Thu)
by jg (guest, #17537)
[Link] (2 responses)
Note that the fq_codel qdisc is not a panacea: don't believe that all your bufferbloat problems will go away just because it becomes the default.
There is lots of driver work left to do, particularly for 802.11, DSL, and cellular. Some of these drivers are closed source, and getting fixes deployed everywhere is going to take a long time even after the drivers have been fixed.
Posted Jul 11, 2013 19:52 UTC (Thu)
by dlang (guest, #313)
[Link] (1 responses)
Exactly. Changing the default is a big step; fq_codel hasn't even been out long enough to make it into any of the enterprise distros as an option, let alone having any of them turn it on by default.
As such, it's probably premature for the upstream kernel to turn it on by default.
Posted Jul 11, 2013 20:02 UTC (Thu)
by jg (guest, #17537)
[Link]
The big point in favor of changing the default in kernel.org (after more than a year upstream, I'll note) is just how terrible PFIFO_FAST turns out to be, as shown by Toke's CDF plot in that blog entry and all the other measurements we've made. We're orders of magnitude from where we know we can be. The improvement is *not* subtle.
But changing a default *is* a serious step, and until we get everyone in a room to talk it through, expressing all the pros and cons and concerns, it's not a decision to be taken lightly.
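In the spirit of "first, measure": a minimal sketch of the kind of data behind such a CDF plot. It collects ping RTTs while you separately load the link (say, with a large upload), then prints percentiles. The ping flags are standard iputils; the target host is a placeholder:

```python
import re
import subprocess

HOST = "example.net"  # placeholder: ping something past your bottleneck
out = subprocess.run(
    ["ping", "-c", "100", "-i", "0.2", HOST],
    capture_output=True, text=True, check=True,
).stdout

# iputils prints one "time=12.3 ms" per reply; collect and sort them.
rtts = sorted(float(m) for m in re.findall(r"time=([\d.]+) ms", out))
for pct in (50, 90, 99):
    print(f"p{pct}: {rtts[len(rtts) * pct // 100 - 1]:.1f} ms")
```

Run it once on an idle link and once while saturating it; the gap between the two p99 values is your bufferbloat.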
Posted Jul 11, 2013 22:22 UTC (Thu)
by gdt (subscriber, #6284)
[Link] (1 responses)
My criticism of the proposed default is the lack of explicit provision for control-plane traffic. The very HTTP behaviour fq_codel is designed to punish (ten-odd packets back-to-back) is exactly the pattern of most network control planes. If you starve the control plane then you blackhole: you lose connectivity despite the link being available. Note that the largest fq_codel deployment had to add a queue for control-plane traffic, so that deployment can't be used as an argument for unmodified fq_codel as the default.
Posted Jul 11, 2013 22:38 UTC (Thu)
by mtaht (subscriber, #11087)
[Link]
https://lists.bufferbloat.net/pipermail/codel/2013-July/0...
I'm trying to make it possible to have "a default other than pfifo_fast" at the moment.
In most of the rest of the universe (and I'm talking android, desktops, home routers, most servers), the "control plane" consists of one or two packets at a time: basic routing/ARP/ND sorts of packets, which work just fine (and in the general case, better than with pfifo_fast).
I do agree that retaining a three-tier model has some uses, but very few; most of my concern is in coming up with a structure for implementing it efficiently, generally, and deployably. I don't quite buy the control plane argument if it is only 10 packets...
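For what it's worth, a three-tier structure along those lines can already be expressed with stock iproute2. Here is a hedged sketch (the interface name and the classifier are illustrative) that puts an fq_codel instance on each band of a prio qdisc and steers IP-precedence-6 packets, the marking routing daemons typically use, into the top band:

```python
import subprocess

DEV = "eth0"  # placeholder interface

def tc(*args):
    subprocess.run(["tc", *args], check=True)

# A three-band prio root, with its own fq_codel on every band.
tc("qdisc", "replace", "dev", DEV, "root", "handle", "1:", "prio")
for band in (1, 2, 3):
    tc("qdisc", "add", "dev", DEV, "parent", f"1:{band}", "fq_codel")

# Steer control-plane traffic (IP precedence 6, e.g. routing
# protocols) into band 1 so it can never be starved by bulk flows.
tc("filter", "add", "dev", DEV, "parent", "1:", "protocol", "ip",
   "prio", "1", "u32", "match", "ip", "tos", "0xc0", "0xe0",
   "flowid", "1:1")
```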
Posted Jul 11, 2013 19:28 UTC (Thu)
by kleptog (subscriber, #1183)
[Link] (1 responses)
Posted Jul 11, 2013 20:30 UTC (Thu)
by jg (guest, #17537)
[Link]
Fq_codel is not the end, nor all of the solution; it is a (very good) step in dealing with bufferbloat.
But tons of work remains in drivers and elsewhere.... The perfect is the enemy of the good; draining a swamp this size will take quite a few years yet.
Posted Jul 12, 2013 8:16 UTC (Fri)
by mchazaux (guest, #64024)
[Link] (1 responses)
Posted Jul 12, 2013 13:14 UTC (Fri)
by jg (guest, #17537)
[Link]
And UDP apps can fill buffers just as easily as TCP can (often faster).
It turns out that isolating flows lets the mark/drop policy of CoDel function much better in the face of uncontrolled UDP applications that are unresponsive to signalling (by mark or drop), so the two sides of fq_codel are synergistic.
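A toy simulation of that synergy (pure illustration; per-flow drop-tail stands in for CoDel's per-flow marking, and none of this is the kernel's actual algorithm): one unresponsive flow blasts ten packets per tick, one light flow sends one, and the link drains five. A shared FIFO buries the light flow; round-robin between per-flow queues keeps its delay at zero:

```python
from collections import deque

TICKS, DRAIN, CAP = 100, 5, 30

def arrivals(t):
    # "udp" is unresponsive (10 pkts/tick); "light" sends 1 pkt/tick.
    return [("udp", t)] * 10 + [("light", t)]

def fifo():
    q, delays = deque(), []
    for t in range(TICKS):
        for pkt in arrivals(t):
            if len(q) < CAP:                 # shared drop-tail buffer
                q.append(pkt)
        # Once full, the blaster's packets claim every free slot,
        # so the light flow is both delayed and starved by drops.
        for _ in range(min(DRAIN, len(q))):
            flow, born = q.popleft()
            if flow == "light":
                delays.append(t - born)
    return delays

def fq():
    qs = {"udp": deque(), "light": deque()}
    delays = []
    for t in range(TICKS):
        for flow, born in arrivals(t):
            if len(qs[flow]) < CAP // 2:     # per-flow backlog limit
                qs[flow].append(born)
        sent = 0
        while sent < DRAIN and any(qs.values()):
            for flow in ("light", "udp"):    # round-robin the flows
                if qs[flow] and sent < DRAIN:
                    born = qs[flow].popleft()
                    if flow == "light":
                        delays.append(t - born)
                    sent += 1
    return delays

for name, d in (("FIFO", fifo()), ("FQ", fq())):
    print(f"{name}: light flow got {len(d)} of {TICKS} packets through, "
          f"mean delay {sum(d) / len(d):.1f} ticks")
```

The unresponsive flow still overflows its own queue under FQ, but now it is the only casualty of its own behavior; that is the isolation half, and it is what gives CoDel's drops a chance to act on the flows that actually respond to them.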
Posted Jul 23, 2013 0:22 UTC (Tue)
by Baylink (guest, #755)
[Link]
(Is that what it was even called? Back in '96 or so I was Chief at a teeny ISP -- we fit 60 dialups into a 256kb/s frame relay link to Texas, which then uplinked through a T-1.
We were actually pretty decent, at that point in the net's evolution... except when netPhone came out. It was UDP, you see, and disrespected TCP flow control, and 4 people melted my entire link.
I didn't like to do it, but I had to ban it.)
solving the second problem in the general case is just a PITA.
- Jim