
Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Jim Gettys and Kathleen Nichols have published a comprehensive, updated description of the bufferbloat problem in ACM Queue. "For TCP congestion avoidance to be useful to people using that link, a TCP connection causing congestion must react quickly to changes in demand at the bottleneck link, but TCP's reaction time is quadratic to the amount of overbuffering. A link that is 10 times overbuffered not only imposes 10 times the latency, but also takes 100 times as long to react to the congestion. Your short, interactive TCP connection loses completely to any long-lived flow that has saturated your link."

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 5, 2011 18:32 UTC (Mon) by jg (subscriber, #17537) [Link]

Actually, the article is not "comprehensive"; it's what we could put into it without going too grossly over the CACM word count limit for major articles. The full paper we're still working on, and major topics had to be cut out, unfortunately, along with lots of detail.

The ACM Queue posting is that of the CACM article that will appear in the "dead tree" magazine next month, along with an interview of Van Jacobson, myself, and Nick Weaver done by Vint Cerf. I hope the full paper will be available by then. The cutting room floor is very littered. We have to pick up those pieces and pull it back together again.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 12:40 UTC (Tue) by sorpigal (subscriber, #36106) [Link]

Is a work in progress/draft version of the complete paper available anywhere?

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 5, 2011 18:55 UTC (Mon) by joib (guest, #8541) [Link]

So for those of us who have followed JG's articles about the buffer bloat issue but haven't followed the in-depth technical discussions, is some kind of technical progress overview wrt the Linux kernel available?

E.g. Dave Täht mentions in a comment on the ACM Queue article that byte queue limits (BQL) are now in the net-next tree. But what about e.g. AQM algorithms? At some point there was some patch for stochastic fair blue (SFB) for Linux, is there progress on this front? What about the updated RED paper (nRED?), or if the paper is done, an implementation? What else is going on?

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 5, 2011 21:30 UTC (Mon) by Unladen (guest, #72953) [Link]

Boring.

I respect the author's experience.

But a year in, isn't it time to move on from "This is a bad problem!!!!" and anecdotes about the author's home network and kids, to big-picture insight and solutions?

First, it would be great to have some experimental data showing improvement of the problem. i.e, show, on a home network or cell network, how forcing some packets to be dropped would change latencies/RTT.

Second, some insight as to why AQM is (1) hard and (2) not deployed would be useful. Is it really just that it's a burden on ISP's? You'd think if it helped their customers they'd put engineering resources on it.
I've heard that it's partially a prisoner's dilemma - if you turn AQM/RED off and others on the same pipe use it, you get better performance. And packet drops can't be manipulated like this, so TCP uses the measure that's harder to fake. True?

Third, a working implementation of useful AQM would be nice. It's a year or so from initial report, and I get that it's hard, but all we've gotten in this article is "hold onto your pants, Van Jacobson is working on it".

I think the main problem is the disconnect - either bufferbloat is a terrible disaster, in which case every available TCP architect should care about it and be working on it (and they're not), or it's a less general problem with workarounds, and engineers and academics can afford to spend a year or more tinkering with solutions (where we're at). In my reading all these articles by jg have this cognitive dissonance.
To improve: stop talking about history, and talk about how to solve this.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 5, 2011 22:17 UTC (Mon) by rilder (subscriber, #59804) [Link]

Solutions ?

Have you checked the patches submitted and/or merged into linux mainline tree ? They also have a few pending patches I believe.

Check projects (like CeroWRT) on bufferbloat.net and also their mailing list. You will realize what I am talking about.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 5, 2011 22:47 UTC (Mon) by jg (subscriber, #17537) [Link]

> First, it would be great to have some experimental data showing
> improvement of the problem. i.e, show, on a home network or cell
> network, how forcing some packets to be dropped would change latencies/RTT.

The data is well known for core Internet routers: the problem is that classic AQM algorithms don't function at all properly in the face of variable bandwidth. While there are problems in the core of the network when people have not enabled and configured RED, the big problems I know we have at the moment are at the edge, where we have variable bandwidth, both in broadband (e.g. PowerBoost) and in wireless, which is by its nature variable.

So your sane request is actually harder than it seems.

Having said that, I am working on a demo (and some data) I'll put up sometime soon in video form; we can use bandwidth shaping to get the buffers out of the broadband connection (sacrificing bandwidth and PowerBoost), and the difference is actually quite dramatic.
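For readers unfamiliar with shaping, here is a minimal token-bucket sketch in Python. It is not the setup used for the demo mentioned above; the class name, the rate and the burst size are all hypothetical. The idea it illustrates is that if you limit your own sending rate to a bit less than what the broadband link really carries, the queue forms in a device you control instead of in the modem's oversized buffer, at the cost of some bandwidth.

```python
class TokenBucket:
    """Hypothetical token-bucket shaper; the caller passes in the current time."""

    def __init__(self, rate_bps: float, burst_bytes: int, now: float = 0.0):
        self.rate_bytes_per_s = rate_bps / 8.0   # refill rate in bytes per second
        self.burst_bytes = burst_bytes           # bucket depth (largest burst allowed)
        self.tokens = float(burst_bytes)         # start with a full bucket
        self.last = now                          # time of the last refill

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if a packet of this size may be sent at time `now` (seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never beyond the bucket depth.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # caller should hold or drop the packet


# Assumed example: shape to 800 kbit/s on a link that can nominally do 1 Mbit/s.
shaper = TokenBucket(rate_bps=800_000, burst_bytes=3000)
print(shaper.allow(1500, now=0.0))  # first full-size packet fits in the initial burst
```

A packet that is refused would be queued briefly or dropped by the caller; which of the two is right is exactly the queue-management question this thread is about.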

> Second, some insight as to why AQM is (1) hard and (2) not deployed would
> be useful.

Some of this is the nature of the beast; we (now) have to adapt over a huge dynamic range, of variable bandwidth and number of flows.

I have more space in the long paper on this topic.

AQM is *not* easy.

As to why it isn't universally enabled where it is available, this goes back to the fact that classic RED requires tuning, and if you get the tuning wrong, it can hurt you. So some ISP's run with RED in their core networks, and some do not.

But AQM typically isn't available *at all* in your broadband connection, your home router, and in your host, even if we had an algorithm that we knew worked.

> Is it really just that it's a burden on ISP's?

It is a burden to ISP's. They get the service calls when you suffer.

So much so that by last spring, the cable industry added a change to the DOCSIS spec to allow them to control buffering, so that sometime next calendar year, clueful operators can reduce the overbuffering to something semi-sane. It won't be what AQM would give them, but should reduce the problem by an order of magnitude.

> You'd think if
> it helped their customers they'd put engineering resources on it.
> I've heard that it's partially a prisoner's dilemma - if you turn
> AQM/RED off and others on the same pipe use it, you get better
> performance. And packet drops can't be manipulated like this,
> so TCP uses the measure that's harder to fake. True?

No, I don't believe so.

> Third, a working implementation of useful AQM would be nice. It's a year
> or so from initial report, and I get that it's hard, but all we've
> gotten in this article is "hold onto your pants, Van Jacobson is
> working on it".

And Kathie Nichols and some others. But it's hard and knowing your AQM really works is even harder once you think you have an algorithm. We (internet folks in general) already got the solution wrong once.

> I think the main problem is the disconnect - either bufferbloat is a
> terrible disaster, in which every available TCP architect should care
> about it and be working on it (and they're not), or it's a less general
> problem with workarounds and engineers and academics can afford to spend
> a year or more tinkering with solutions (where we're at). In my reading
> all these articles by jg have this cognitive dissonance.
> To improve: stop talking about history, and talk about how to solve this.

And that's what I've been mostly doing recently.

Remember, the enemy of the good is the perfect: many mitigations can help greatly without requiring us to solve the whole problem.

Things like the byte queue stuff and the DOCSIS change are helpful and steps along the way. And lots of random bugs in ECN have been found and fixed.
- Jim

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 0:18 UTC (Tue) by Lennie (subscriber, #49641) [Link]

I have a feeling more widespread use of 10 Gbps hardware is gonna make things worse, are there any indications of that ?

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 1:42 UTC (Tue) by dlang (✭ supporter ✭, #313) [Link]

any time you have chokepoints where the bandwidth changes drastically, you are going to have problems

right now, you can have 1Gb wired network going to a <1Mb DSL upload rate. The machine connected to the 1G wired network has no way of knowing that the system that it's talking to is on the other side of such a slow connection, and so it needs to have large enough buffers to talk to another system connected to a 1Gb network. As the data trickles out over the 1Mb network you have horrible performance.

1Gb is only now taking over in home networks from 100Mb, I don't expect to see 10Gb on very slow networks like this for quite a while yet.

10Gb in the datacenter is a good thing for communications within the datacenter. having 1Gb of connectivity from the datacenter to the Internet is far from being unheard of, and 100Mb is touching the range of the high-end home user, even the 100Mb connection is less of a speed ratio than the home user 1Gb to 1Mb transition.

but a fixed transition like this is not _that_ hard for the routers to deal with. what absolutely kills things is where a wireless network may be anywhere from 300Mb/sec to under 1Mb/sec, and may vary within this range within the length of a single session, adapting to that is extremely hard.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 15:17 UTC (Tue) by marcH (subscriber, #57642) [Link]

> right now, you can have 1Gb wired network going to a <1Mb DSL upload rate. The machine connected to the 1G wired network has no way of knowing that the system that it's talking to is on the other side of such a slow connection, and so it needs to have large enough buffers to talk to another system connected to a 1Gb network. As the data trickles out over the 1Mb network you have horrible performance.

Not a problem as long as the 1Mb link does not let the queue build up and does drop packets.

Throughput is not the problem, latency is.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 12, 2011 13:55 UTC (Mon) by ekj (guest, #1524) [Link]

Yes, but the existence of *wildly* varying bandwidths forces queue-management.

A 10Gbit capable device can have 10MB worth of buffer, and still transmit the entire buffer in 10ms, which is low enough that it could probably behave well with no queue-management.

If such a 10MB buffer ends up holding data that trickles out over a 1Mbps link though, it'd take a *minute* for the buffer to empty, i.e. completely unusable.

At the same time, the 5kB buffer that may be reasonable for a 1Mbps link, is clearly much too small for a 10Gbit link.

In short, when link-speed varies a *lot* there is no correct buffer-size, instead you MUST actively manage your buffers, using some sort of AQM, which today most routers and devices do not, in fact, do.
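The arithmetic behind those figures is easy to check. A small Python sketch follows, assuming decimal units (1 MB = 1,000,000 bytes) and ignoring framing overhead; the function name is just illustrative.

```python
def drain_time_s(buffer_bytes: float, link_bps: float) -> float:
    """Seconds needed to transmit a full buffer over a link of the given bit rate."""
    return buffer_bytes * 8 / link_bps


MB = 1_000_000  # assumption: decimal megabytes

cases = [
    ("10 MB buffer on a 10 Gbit/s link", 10 * MB, 10e9),
    ("10 MB buffer on a 1 Mbit/s link", 10 * MB, 1e6),
    ("5 kB buffer on a 1 Mbit/s link", 5_000, 1e6),
]
for label, size, rate in cases:
    print(f"{label}: {drain_time_s(size, rate):.3f} s")
```

With these assumptions the three cases come out at roughly 8 milliseconds, 80 seconds and 40 milliseconds, which matches the rounded numbers above: the same buffer is harmless on the fast link and unusable on the slow one.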

"does not let the queue build up" is the key phrase here. More specifically, does not let the queue for any one outgoing link grow beyond what can (probably) be transmitted over the next few milliseconds on that specific link, without dropping a packet or two as a signal that congestion is occurring.
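As a rough illustration of that rule, here is a Python sketch of a queue whose byte limit is derived from the link rate and a target delay. Note the assumptions: this is plain tail drop with a fixed limit, not a real AQM such as RED, it assumes the link rate is known and constant, and the class name and the 40 ms target are made up for the example.

```python
from collections import deque


class DelayBoundedQueue:
    """Tail-drop queue limited to what the link can send within max_delay_ms."""

    def __init__(self, link_bps: int, max_delay_ms: int):
        # bytes = bits per second * seconds / 8, kept in integer arithmetic
        self.limit_bytes = link_bps * max_delay_ms // 8000
        self.backlog_bytes = 0
        self.packets = deque()

    def enqueue(self, packet_bytes: int) -> bool:
        """Queue the packet, or return False to signal that it was dropped."""
        if self.backlog_bytes + packet_bytes > self.limit_bytes:
            return False  # the drop is the congestion signal to the sender
        self.packets.append(packet_bytes)
        self.backlog_bytes += packet_bytes
        return True

    def dequeue(self):
        """Remove and return the size of the oldest packet, or None if empty."""
        if not self.packets:
            return None
        size = self.packets.popleft()
        self.backlog_bytes -= size
        return size


# Assumed example: a 1 Mbit/s link with at most 40 ms of queued data.
q = DelayBoundedQueue(link_bps=1_000_000, max_delay_ms=40)
print(q.limit_bytes)
```

The point of deriving the limit from the rate is that the same 40 ms target yields a few kilobytes on a slow link and tens of megabytes on a 10Gbit one.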

wireless is a special challenge: just because that link is 5Mbps this moment, doesn't mean it won't be a lot less 20ms from now, and if a packet is lost, you don't know if it was congestion or noise - and the appropriate response to high-noise-low-traffic is exactly the opposite response to low-noise-high-congestion.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 12, 2011 15:34 UTC (Mon) by marcH (subscriber, #57642) [Link]

> If such a 10MB buffer ends up holding data that trickles out over a 1Mbps link though, it'd take a *minute* for the buffer to empty, i.e. completely unusable.

This keeps coming... why would the 1Gb/s link hold the data on behalf of the 1Mb/s link?

Plug the modem - get the problems...

Posted Dec 12, 2011 15:46 UTC (Mon) by khim (subscriber, #9252) [Link]

> This keeps coming... why would the 1Gb/s link hold the data on behalf of the 1Mb/s link?

Buy a modem, attach it to a computer, get the problem. The computer only knows about the 1Gbit link to the modem and so allocates a 1MB buffer; the router only gets 1Mbit because your line is not ideal... instant 1000x impedance mismatch.

Plug the modem - get the problems...

Posted Dec 12, 2011 15:55 UTC (Mon) by marcH (subscriber, #57642) [Link]

> instant 1000x impedance mismatch.

Not the problem.

What's happening here is bufferbloat inside *the modem*; NOT inside the computer. Make the modem adjust its queue size depending on the speed of each outgoing link and your problem is solved.

The problem is the modem having the same buffer size on every link (the buffer is probably even shared across the links). Simple laziness from the designers.

Plug the modem - get the problems...

Posted Dec 12, 2011 15:57 UTC (Mon) by marcH (subscriber, #57642) [Link]

> What's happening here is bufferbloat inside *the modem*; NOT inside the computer.

... unless you have Ethernet flow control enabled, in which case you might have bufferbloat in BOTH places because of backpressure! Disable flow control right now since it's not compatible with Van Jacobson congestion control.

Plug the modem - get the problems...

Posted Dec 16, 2011 2:47 UTC (Fri) by quanstro (guest, #77996) [Link]

ethernet flow control, at least through switches, makes
tcp-style flow control work better! at least that's been
my experience.

ethernet flow control

Posted Dec 16, 2011 16:57 UTC (Fri) by marcH (subscriber, #57642) [Link]

Your mileage may vary. The effect of Ethernet flow control depends on a wide range of parameters.

Ethernet flow control is effectively chaining queues across devices. Since the aggregated queue is bigger I can see how it *may in some cases* enhance TCP throughput. But it will obviously make any existing bufferbloat even worse.

Most importantly, Ethernet flow control will create HOL blocking.

Your mileage may vary.

Some old musings with Ethernet flow control: http://marc.herbert.free.fr/noq/

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 12, 2011 15:47 UTC (Mon) by marcH (subscriber, #57642) [Link]

> wireless is a special challenge: just because that link is 5Mbps this moment, doesn't mean it won't be a lot less 20ms from now,

Is wireless link speed varying so fast that you cannot adjust your queue size accordingly, with (again) results not good enough for scientists but decent enough for engineers and end users? This is a genuine question.

Surely when sitting at your desk your link does not keep jumping from 100Mb/s to just 1Mb/s several times per second, does it?

> and if a packet is lost, you don't know if it was congestion or noise - and the appropriate response to high-noise-low-traffic is exactly the opposite response to low-noise-high-congestion.

Yes, dropping packets is a very poor congestion signal. A LOT has been said about this already. Is it really related to bufferbloat? I do not think so. It was a concern a long time before anyone noticed bufferbloat, and for sure it will still be a concern a long time after bufferbloat is fixed (if ever...). I can imagine that the two can interact badly with each other; however, this does not prevent working on and fixing the two problems independently of each other.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 1:52 UTC (Tue) by dlang (✭ supporter ✭, #313) [Link]

>> You'd think if
>> it helped their customers they'd put engineering resources on it.
>> I've heard that it's partially a prisoner's dilemma - if you turn
>> AQM/RED off and others on the same pipe use it, you get better
>> performance. And packet drops can't be manipulated like this,
>> so TCP uses the measure that's harder to fake. True?

> No, I don't believe so.

I don't think that was the reason for TCP using packet dropping as the measure, remember that TCP congestion fallback predates RED and most other AQM proposals.

but that being said, I have heard that some of the AQM proposals do end up being disadvantaged if used over a congested link with traffic that isn't well behaved by that algorithm's definition.

the issue that the AQM protocols require tuning and attention to get the best performance, and incorrect tuning can cripple you is far more of an issue. If 'out of the box' with no AQM is 'good enough' anyone lacking manpower will be reluctant to add AQM that requires manpower to get right.

the issue is now we are getting to the point where no AQM is no longer being considered 'good enough' for many areas of the Internet, but the existing AQM protocols still have the same problems they always have had, so there is still research to try and find better protocols.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 19:32 UTC (Tue) by Unladen (guest, #72953) [Link]

>the issue that the AQM protocols require tuning and attention to get the
>best performance, and incorrect tuning can cripple you is far more of an
>issue. If 'out of the box' with no AQM is 'good enough' anyone lacking
>manpower will be reluctant to add AQM that requires manpower to get right.

Thanks, that's useful - worst-case AQM is much worse than without it, but best-case is better, does explain a lot.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 17:19 UTC (Tue) by nye (guest, #51576) [Link]

>Having said that, I am working on demo (and some data) I'll put up sometime soon in video form; we can use bandwidth shaping to get the buffers out of the broadband connection (sacrificing bandwidth and powerboost), and the difference is actually quite dramatic.

Is this materially different in some way to things like wondershaper and its ilk that have been around for years?

The whole bufferbloat issue confuses me since AFAIU it looks like a restatement of what every P2P user has known for a decade or more, and that's really not news. Maybe I'm missing something.

(And there's a corollary to that, which is that carriers really have no incentive to fix the problem since it mostly affects the users that they hate.)

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 19:28 UTC (Tue) by Unladen (guest, #72953) [Link]

Fair enough. Sounds like you're saying that automatic buffer management algorithms don't work well because modern links often have variable bandwidth. Thus better algorithms need to be devised.

It is striking that many end users are told what bandwidth they achieve (numbers in browser download, file copy windows), but not link latency. Perhaps that has given ISPs an incentive to optimize for bandwidth (i.e. FastStart, or jack up buffer sizes) and not care about latency because it's hard to measure.

And if AQM/ECN requires all users of a link to turn it on, then adoption will be difficult. Tasks that want maximal bandwidth and don't care about latency (like browser downloads) have no incentive to use it.
Someone needs to write a mobile app that forces packet drops after it detects congestion and speeds up the browser experience. You do that and users will be clamoring for AQM.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 19:51 UTC (Tue) by dlang (✭ supporter ✭, #313) [Link]

AQM isn't an application thing, it's a system network stack thing.

in the past, entities have tested their systems for latency (usually with tiny packets using a small amount of bandwidth) and separately for bandwidth (usually with huge packets)

if your bandwidth isn't saturated, the small packets never queue up and so the device shows good latency on the latency test. buffer size doesn't matter.

in the bandwidth test, larger buffers prevent packets from being dropped (and therefore avoid retransmit delays), so larger buffers help bandwidth measurements (even if only marginally).

so this has led to thinking that larger buffers are always better. Add in the fact that memory is getting cheaper (OLPCv1.5 went from 256M of ram to 1G of ram because it was cheaper to buy the 1G memory modules). If you are building a router and have more memory, what are you going to use it for besides larger buffers?

the bufferbloat problem is that when the bandwidth is saturated, latency becomes horrible. This is not a combination that vendors have been testing.
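A toy model in Python shows why the two separate tests miss the problem. It assumes a single bottleneck queue, a fixed base round-trip time and no other delays; the 256 kB buffer and 20 ms base RTT are invented numbers, not measurements.

```python
def ping_latency_ms(base_rtt_ms: float, backlog_bytes: int, link_bps: int) -> float:
    """Base RTT plus the time a probe waits behind the bytes already queued."""
    queue_wait_ms = backlog_bytes * 8 * 1000 / link_bps
    return base_rtt_ms + queue_wait_ms


LINK_BPS = 1_000_000      # assumed 1 Mbit/s bottleneck
BUFFER_BYTES = 256_000    # assumed device buffer size

# Latency test on an idle link: nothing is queued, so buffer size is invisible.
print(ping_latency_ms(20, 0, LINK_BPS))

# Same probe while a bulk transfer keeps the buffer full.
print(ping_latency_ms(20, BUFFER_BYTES, LINK_BPS))
```

In this model the idle probe sees only the 20 ms base RTT, while the probe sent during a saturating transfer waits about two seconds behind the full buffer. Neither the standalone latency test nor the standalone bandwidth test would ever show that second number.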

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 1:33 UTC (Tue) by dlang (✭ supporter ✭, #313) [Link]

what you are seeing is the disconnect between the opensource community and the academic community.

for those of us in opensource, this is old news (almost the same post that he made a year ago)

but for the academic world, this is going to be their first introduction to the problem. At that, this is just a summary with the formal paper detailing all of this still under development (although JG indicates that he is hoping to get the formal paper done about the time the print copy of this gets into people's hands)

there has been some progress in getting things into linux, but much of it is still in the matter of producing new options for people to test and see how they work in the real world.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 21:38 UTC (Tue) by daglwn (subscriber, #65432) [Link]

> for those of us in opensource, this is old news (almost the same post that
> he made a year ago)

> but for the academic world, this is going to be their first introduction
> to the problem.

I don't think so. Back when I was in "the academic community," I paid very close attention to the free software community. Most researchers do, as this is where they get their software.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 6:31 UTC (Tue) by bersl2 (subscriber, #34928) [Link]

Remember, we're talking about people here. Even intelligent people sometimes need to be repeatedly clobbered over the head with the truth before it sinks in.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 8:33 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

I used to work as a network engineer in a small startup WiMax ISP. We DEFINITELY had this problem and it was very severe.

There are simply no good solutions for it. In our case we couldn't even use most of the intelligent AQMs on the provider side because lots of our customers used VPNs, which made it impossible to distinguish between flows. In the end, we had to resort to shaping traffic on CPEs (customer premises equipment).

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 15:27 UTC (Tue) by marcH (subscriber, #57642) [Link]

I am tired of hearing about Active Queue Management research when I can easily create 400ms latency here and there using only a couple of regular TCP connections.

A 400ms queue does not need to be "actively managed", it just needs to be made smaller. No queue should be longer than 100ms, end of. I am really not interested in having super throughput from outer space while Akamai and Google and whoever else are installing server farms on every single continent to give super low latency... instantly destroyed as soon as I download some software update.

I am sure there is fascinating research and fine-tuning to be done in the 10-100 milliseconds range but please, put some effort on fixing the most blatant problems first?

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 16:24 UTC (Tue) by jmm82 (guest, #59425) [Link]

So what about the person who runs batch file transfers and only cares about throughput and cares less about latency? Maybe we should just optimize the whole internet to your workload and at least one person will be happy.

ISPs optimize the network for benchmarks and common internet users do not even know "latency" exists as a concept. All the average consumer bases their purchase on is the average throughput. Only later do they find out about latency, when their system becomes intermittently laggy.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 16:39 UTC (Tue) by marcH (subscriber, #57642) [Link]

> So what about the person who runs batch file transfers and only cares about throughput and cares less about latency?

She'll get 100% throughput when downloading from the same continent and 95% when from a different one. She will not even notice.

PS: please tell her I feel sorry for her POTS bill now that everyone else switched to VoIP in one form or the other.

> Maybe we should just optimize the whole internet to your workload and at least one person will be happy.

Me *and Jim*.

Jim went all the way to the ACM to tell about the problems with his kids and his home connection and yet you forget about him: not nice!

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 17:11 UTC (Tue) by nye (guest, #51576) [Link]

>PS: please tell her I feel sorry for her POTS bill now that everyone else switched to VoIP in one form or the other.

s/everyone else/a handful of people/

And don't forget that in most of the world an internet connection requires paying for a POTS line anyway, so there isn't even a benefit to VoIP unless you're calling another country, but the downsides are still enormous.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 9, 2011 16:16 UTC (Fri) by jmm82 (guest, #59425) [Link]

I work with cellular internet and 1 to 3 second ping times are far too common. My point was that 100ms or 400ms are arbitrary numbers and may fit all of your needs, but the whole internet does not have 10ms ping times, hence the reason a magic number for the whole internet will not work.

Physics

Posted Dec 9, 2011 22:10 UTC (Fri) by marcH (subscriber, #57642) [Link]

> hence the reason a magic number for the whole internet will not work.

The 10-100ms range is anything but magic numbers. These two numbers are fundamental requirements coming straight from physics and biology (and a tiny bit of maths).

100ms (give or take) is the amount of buffering required at every potential bottleneck to maximize the throughput of Van Jacobson's congestion control algorithm across a continent. This number does not come from some hairy research but straight from the speed of light and the average size of a continent; not exactly an arbitrary number. Buffer more than 100ms on any link and you will harm latency even more for NO throughput benefit.
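To make the order of magnitude concrete, here is a back-of-the-envelope Python sketch. The assumptions are mine: signals in fiber travel at roughly two thirds of the speed of light (about 2e8 m/s), the 5000 km one-way path is just an example, and real paths add routing detours and equipment delay on top of pure propagation.

```python
FIBER_SPEED_M_PER_S = 2.0e8  # assumption: roughly 2/3 of c in optical fiber


def propagation_rtt_s(one_way_km: float) -> float:
    """Round-trip propagation delay over a path of the given one-way length."""
    return 2 * one_way_km * 1000 / FIBER_SPEED_M_PER_S


def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to keep the link full."""
    return link_bps * rtt_s / 8


rtt = propagation_rtt_s(5000)  # assumed continent-scale path
print(f"propagation RTT: {rtt * 1000:.0f} ms")
print(f"one RTT of buffering at 1 Mbit/s: {bdp_bytes(1e6, 0.1):.0f} bytes")
print(f"one RTT of buffering at 1 Gbit/s: {bdp_bytes(1e9, 0.1):.0f} bytes")
```

Pure propagation over 5000 km comes to about 50 ms round trip under these assumptions, so "100ms, give or take" is the right ballpark once real-world path length is included. The byte count that corresponds to 100 ms scales linearly with link speed, which is why the limit is better expressed in time than in bytes.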

10ms is a threshold in human perception - think VoIP and gaming. Again, no magic here: just biology. Less than 10ms buffering harms your throughput (even more) for no perceptible benefit.

It is a funny cosmic coincidence that playing Counter-Strike across an ocean sucks while it's OK on the same continent (well, maybe not between Alaska and Chile but you get the point).

Now these two numbers are orders of magnitude rounded for convenience. If you think the ideal range is rather 15ms-150ms I have absolutely no problem with that. What I have a problem with is:

> I work with cellular internet and 1 to 3 second ping times are far too common.

Researchers always focus on the complicated stuff (here: optimizing between 10 and 100ms). Simply because trivial requirements do not get papers published. Do NOT let researchers distract you from simple facts like: 1 second ping time is just a plain bug/a joke. Reduce buffering to 100ms (or 150ms if you prefer) on every link and you will make most of your customers happier and upset practically NONE.

mtr and job done

Posted Dec 10, 2011 14:23 UTC (Sat) by marcH (subscriber, #57642) [Link]

In a similar fashion, do not let yourself be distracted by any impressive monitoring frameworks or charts researchers may be using. Matt's traceroute ("mtr") is almost always good enough to very accurately pinpoint any bufferbloat currently destroying your latency. Sometimes iperf/netperf is not even required; downloading some DVD image is enough. The main problem is not technical but going through first level and second level support. Support stories seldom make it to the ACM/IEEE though.

Science is useful and makes great reads for rainy weekends but, when push comes to shove, use simpler engineering to do the job.

Physics

Posted Dec 12, 2011 11:49 UTC (Mon) by jlokier (guest, #52227) [Link]

> 1 second ping time is just a plain bug/a joke.

I often use cellular internet with a marginal signal that is oversubscribed, and 5 second ping times are quite common. Even 20 seconds at some times.

As these are ping times to the cellular network access point, from an otherwise idle handset, it's quite possible this time cannot be improved by simply dropping packets early at any stage.

In other words, it may not be a bufferbloat problem - and it may not be a bug either, if the RF link is simply too marginal and oversubscribed. As far as I can tell, these timings depend greatly on the strength of RF signal, and on the time of day.

In this case the way forward looks like newer cellular technology. We all look hopefully at 4G/LTE, and (would be nice) better cross-carrier RF diversity.

Even so, it's not clear that "speed of light" is an achievable latency on fully-subscribed large area wireless networks with large numbers of moving devices.

Physics

Posted Dec 12, 2011 12:19 UTC (Mon) by marcH (subscriber, #57642) [Link]

> I often use cellular internet with a marginal signal that is oversubscribed, and 5 second ping times are quite common. Even 20 seconds at some times.
> [...]
> In other words, it may not be a bufferbloat problem

Whenever you experience ping times over 1 second, something somewhere is buffering your ping (or pong) packet for more than 1 second. Even if this buffer is not "bloated" strictly speaking, holding on to any packet for that long is WRONG and is definitely a BUG.

As an example, any link retransmission technique with timeouts over 1 second is simply not compatible with TCP/IP. Making that link technology compatible with TCP/IP is as simple as making it timeout (and drop packets) much, much sooner. It is really that simple.

Now again, finding the *optimal* timeout value is a very difficult problem. However, reducing a multi-second timeout to a reasonable 100-150 millisecond value is NOT difficult at all and will make every user happier, while upsetting none.

Here is a supermarket analogy (for a change). You know that queues always become too long at peak time. Customers complain about it. You have money for two extra tills. But you do not proceed because it is oooh so hard to find the optimal number of tills.

> In this case the way forward looks like newer cellular technology.

This is throwing out the baby with the bath water. And the newer technology might make the same mistake again. And in any case it will not displace 2.5G/3G everywhere overnight.

> Even so, it's not clear that "speed of light" is an achievable latency

Of course it's not; you need some reasonable amount of buffering for a number of reasons.

Physics

Posted Dec 12, 2011 12:47 UTC (Mon) by ekj (guest, #1524) [Link]

That's a bug. A packet is always either in-transit, being processed by a device, or being stored in a buffer for later processing and/or later sending.

The only way you can get 1+ seconds on local short-distance links, is by having the packet spend the huge majority of that time stored in some buffer. Which is a bug.

You want a sufficient buffer that short-term spikiness of packet arrival does not needlessly cause lost packets when transmission a few milliseconds later would be preferable.

But 5 seconds, or even 1 second, worth of buffering is *way* too much. Sure, we can debate whether you want 25ms or 250ms worth of buffering, and the answer is surely "it depends", but there's just no way 5 *seconds* worth of buffering can avoid causing an order of magnitude more problems than it solves.
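
To put numbers on "seconds worth of buffering": the delay a full buffer adds is just its size divided by the link rate. A minimal sketch, where the 256 KiB buffer and the 1 Mbit/s link are made-up illustrative figures and not measurements of any real device:

```python
def queue_delay_seconds(buffer_bytes: int, link_bits_per_sec: float) -> float:
    """Time needed to drain a completely full buffer at the given link rate."""
    return buffer_bytes * 8 / link_bits_per_sec


# Hypothetical figures: 256 KiB of buffering in front of a 1 Mbit/s uplink.
delay = queue_delay_seconds(256 * 1024, 1_000_000)
print(f"{delay:.2f} s of queueing delay when the buffer is full")
```

The same buffer in front of a 100 Mbit/s link would drain 100 times faster, which is why a size that is harmless on one link is a bug on another.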

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 16:42 UTC (Tue) by dlang (✭ supporter ✭, #313) [Link]

the problem is that figuring out the right size queue to use is really hard to do

If you have a 1Gb/sec connection across the country, you need to have a large buffer on your machine.

the default buffer sizes in the kernel are chosen for this sort of thing.

however, if your 1G network is connected to a 1M network, then your server buffer size should be 1/1000 the size to maintain the correct latency, but your desktop has no way of knowing that there is a 1m link somewhere in the middle.

in this situation (1G - 1M - 1G links) you need the routers connecting to the 1M link to have small buffers and drop packets

in practice, things are actually worse than this

you have a laptop (1G) connecting to your firewall/access point (1G) connecting to your DSL modem (10M or 100M) connecting to the ISP (1M) and similar setup on the other end.

you don't have any control over the buffers on the DSL modem, and the buffers there are much larger than they should be, and generally not configurable, so you can easily fill them up and generate the high latency.

to work around this, you need to shrink the buffers on the firewall to have it drop packets sooner, or on your laptop to have it not generate the packets.

AQM is needed to have these devices detect that there is a problem and shrink the buffers in response.

everybody knows that the current buffer sizes are far too large, but there isn't a clear answer to the question of what the buffer size should be.

variable link speeds (wireless or cable 'turbo mode') greatly complicate this issue.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 8:55 UTC (Wed) by marcH (subscriber, #57642) [Link]

> however, if your 1G network is connected to a 1M network, then your server buffer size should be 1/1000 the size to maintain the correct latency, but your desktop has no way of knowing that there is a 1m link somewhere in the middle.

No: in this case the queue(s) in your server do not matter because they will be empty most of the time. Packets will only stack up at the bottleneck (as the name implies).

> in this situation (1G - 1M - 1G links) you need the routers connecting to the 1M link to have small buffers and drop packets

Yes.

If every link makes sure not to buffer more than 100ms or so, then bufferbloat goes away in 90% of the traffic cases (10% are left for the researchers to have fun with). Let's fix the most obvious problems first.
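
As a rough illustration of what "no more than 100ms or so" means in bytes, here is a small sketch. The link rates are just examples, and the function name is made up for this post:

```python
def max_buffer_bytes(link_bits_per_sec: int, max_delay_ms: int = 100) -> int:
    """Largest buffer that can still be drained within max_delay_ms."""
    # bits/s * ms / 1000 gives bits; divide by 8 for bytes.
    return link_bits_per_sec * max_delay_ms // 8000


for name, rate in [("1 Mbit/s", 1_000_000),
                   ("10 Mbit/s", 10_000_000),
                   ("1 Gbit/s", 1_000_000_000)]:
    print(f"{name}: {max_buffer_bytes(rate)} bytes")
```

The cap scales linearly with the link rate, so a fixed-size buffer that is reasonable at 1 Gbit/s holds 1000 times too much at 1 Mbit/s.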

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 10:45 UTC (Wed) by dlang (✭ supporter ✭, #313) [Link]

you are right that the buffers on the 1G machine will always be empty, you need to (eventually) detect packet loss and then throttle the sending speed

the reality is that there is a lot of equipment out there that you are not going to be able to get replaced for several years, and part of that reason is that the vendors are still building equipment with buffers that are way too large because they still aren't doing testing that shows them the problem. this sort of publication and formal paper is what's required to get them to notice the problem and start the process of fixing it.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 12:05 UTC (Wed) by marcH (subscriber, #57642) [Link]

> you are right that the buffers on the 1G machine will always be empty, you need to (eventually) detect packet loss and then throttle the sending speed

TCP does that at the source. TCP is ACK-clocked on whatever is the current bottleneck thanks to the congestion or receiver window (whichever is smaller). No need to throttle TCP anywhere else.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 12:17 UTC (Wed) by dlang (✭ supporter ✭, #313) [Link]

as I understand the effects of bufferbloat, the fact that these over-large buffers are queuing the packets instead of dropping them is breaking the TCP ack clocking mechanism

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 13:26 UTC (Wed) by marcH (subscriber, #57642) [Link]

> ... is breaking the TCP ack clocking mechanism

... which does not mean TCP will go mad and send even more packets and clog whatever queues even more. I would bet it is actually the opposite.

Anyway, I only had the "reasonable queue size @ bottleneck" case in mind in my previous post. In this "normal", non-bufferbloated case ACK-clocking works fine and there is no need for externally throttling TCP anywhere else than at the bottleneck's queue when it fills up.

*In theory* not just TCP but every other protocol should be a good citizen and follow TCP's lead in respect to congestion/throttling: http://en.wikipedia.org/wiki/Datagram_Congestion_Control_...
In practice no one is using DCCP but it's not too bad either: Skype for instance actively throttles itself down in case of congestion/high latencies. Unsurprisingly, no application tries to damage the network it is using itself.

I would not be surprised if TCP is actually the worst guy for falling in and filling bufferbloat traps. There is some irony in that considering it is usually considered the best congestion citizen *in the lack of bufferbloat*.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 16:39 UTC (Wed) by martinfick (subscriber, #4455) [Link]

> I would not be surprised if TCP is actually the worst guy for falling in and filling bufferbloat traps. There is some irony in that considering it is usually considered the best congestion citizen *in the lack of bufferbloat*

I had the same thought myself and was wondering if TCP needs to be fixed to take latency into account? From my lame understanding it seems to only care about throughput, isn't that why the bittorrent folks came up with their own UDP based solution? So, while I am all for fixing buffers wherever possible, shouldn't there be more discussion about fixing TCP?

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 22:58 UTC (Wed) by dlang (✭ supporter ✭, #313) [Link]

actually, what is happening is that TCP (eventually) gets the acks for the packets that it sends, so it speeds up to try and send more traffic.

eventually the delay gets so large that it times out before the acks arrive and the speed collapses.

If you look at the graphs in Gettys's paper, you will see exactly this sort of picket-fence for what he is calling 'goodput', which is the amount of traffic that is actually getting to the destination (with the rest of the available bandwidth being taken up by 'badput', which is packets that are going to be dropped by the time they get to the destination because they are either too old, or they are retransmissions of packets that are still in flight, and so will be duplicates by the time they get there)

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 11:40 UTC (Wed) by mtaht (✭ supporter ✭, #11087) [Link]

Dlang:

You are doing a great job here of explaining things, but I have to correct you on one MAJOR point.

"however, if your 1G network is connected to a 1M network, then your server buffer size should be 1/1000 the size to maintain the correct latency, but your desktop has no way of knowing that there is a 1m link somewhere in the middle."

Um, no. The closest we know to a correct figure for buffering is the sqrt(flows) * bandwidth * delay product of the next hop.

Bandwidth as humans measure it is X Mbit/sec, and as computers do, it is bits/nanosec, and this distinction trips us up. Also the 1G network generally has very low delay, and the 1Mbit network very high.

I recently shot myself in the foot here myself, I was doing some shell scripting that assumed a linear relationship of buffers to speed for tc, and those estimates got very wrong, quickly. It rather bugs me that there is no sqrt() call in the shell, you have to simulate one using echo "sqrt(the bdp)" | bc -i or something like that.

Assuming delay is a constant (and delay is not!), working through tons of square roots of common figures for practice, and tossing in nearly random numbers for the above values, straightened out my assumptions and thinking and code considerably.

http://www.bufferbloat.net/projects/bloat/wiki/Equations

http://en.wikipedia.org/wiki/Bandwidth-delay_product

http://www.cc.gatech.edu/~dovrolis/Papers/buffers-ton.pd

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 11:52 UTC (Wed) by mtaht (✭ supporter ✭, #11087) [Link]

And I should probably mention that whilst I'm fiddling with sqrts at this point, incorporating the next hop delay term into your thinking about this stuff is far more important mathematically.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 12:17 UTC (Wed) by mtaht (✭ supporter ✭, #11087) [Link]

Meh. I can't believe how much I trip myself up on this. Who knows, maybe I've been getting it wrong all this time, too....

Let me try again.

The total amount of buffering in the txqueue + tx ring buffer portion of the stack needs to be not much longer than the BDP to the *next hop*.

BQL appears to solve the tx ring portion of the problem thoroughly, at least on ethernet.

Figuring out how many streams can co-exist in the txqueuelen set of buffers above the tx ring, and when to start dropping packets there, is an AQM problem, about which much debate exists. The next-hop BDP*sqrt(flows) thing is, well, debatable, but getting the effective txqueue's length down to where that portion of the AQM debate can take place again, seems doable with the time in queue idea floating about.

The total amount of buffering in tcp's algorithms, which do their own buffering internally, that is required for the end-to-end queue to be handled, is dependent on the BDP, and I'm going to flat out wave hands and say that AQM can help there a lot, and typically has very 'interesting' problems with streams of different RTTs.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 12:36 UTC (Wed) by dlang (✭ supporter ✭, #313) [Link]

If the buffer size is root(flows) * bandwidth * delay and the bandwidth is 1/1000 of what it was before (with all else being equal) doesn't that make the required buffer size 1/1000 as well?

I'm missing something here, but I don't see how the sqrt piece matters when we are talking about the bandwidth changing

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 13:04 UTC (Wed) by mtaht (✭ supporter ✭, #11087) [Link]

In your example, you missed the delay component.

'All else' is not equal in your example.

And I was mentally going from the 1Mbit link UP to the 1000Mbit link, where delay factors in a lot. Your typical 1Mbit internet link can have an inherent next-hop delay of 1-60ms on wired technologies which is a significant component of that portion of the BDP. Wireless is far worse, of course.
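
A quick sketch of why the delay term matters. It computes only the plain bandwidth-delay product (no sqrt(flows) factor), and both delay figures are assumptions picked for illustration: 0.1ms for a 1Gbit LAN hop and 30ms for a 1Mbit link, the latter from within the 1-60ms range mentioned above.

```python
def bdp_bytes(bandwidth_bits_per_sec: int, delay_us: int) -> int:
    """Bandwidth-delay product of one hop, in bytes (delay in microseconds)."""
    return bandwidth_bits_per_sec * delay_us // 8_000_000


fast = bdp_bytes(1_000_000_000, 100)   # 1 Gbit/s, assumed 0.1 ms next-hop delay
slow = bdp_bytes(1_000_000, 30_000)    # 1 Mbit/s, assumed 30 ms next-hop delay

print(f"1 Gbit/s hop: {fast} bytes")
print(f"1 Mbit/s hop: {slow} bytes")
print(f"ratio: {fast / slow:.1f}x, not 1000x")
```

With equal delays the ratio would indeed be 1000; it is the difference in delay that closes most of the gap.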

And I was still kicking myself about the sqrt part from my mis-spent weekend. And I conflated the three together in trying to explain myself.

I really shouldn't post stuff before my third cup of coffee. I may just delete what I tried to post and start over. If there is a way to explain it better (if you can explain it back to me!) I'm either going to make another pot of coffee or go to bed and pull a pillow over my head. Or both.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 16:51 UTC (Tue) by intgr (subscriber, #39733) [Link]

> A 400ms queue does not need to be "actively managed", it just needs to be
> made smaller. No queue should be longer than 100ms, end of.

That's a very interesting point. Much of their work has gone into making queue sizes tunable in bytes, but they're really worried about *latency*, not queue size.

Seems like it could be as simple as tagging each packet with receive timestamps and dropping packets on the draining end if they've been in the queue for too long.

Now I'm sure smarter people than you and I have thought of this and rejected this approach for some reason. Does anyone know why this isn't a good idea?

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 17:00 UTC (Tue) by dlang (✭ supporter ✭, #313) [Link]

the problem is that getting the current time is a rather expensive operation, doing this for each packet (and worse, doing it multiple times per packet at different points in the queue), can hurt your network throughput.

in the past, queues have been managed in terms of how many packets are in the queue, not caring if the packet is a 64 byte minimum size packet or a 9000 byte jumbo packet.

one of the results of the bufferbloat effort is the new AQM queue type (I'm blanking on the name of it, but it was merged upstream a release or two ago) that manages the queue size in terms of the size of the queue in bytes. when you have a consistent link speed (i.e. most wired networks), the time needed to transmit each byte is very close to a constant, and so managing the queue in terms of bytes is almost exactly the same as managing it in terms of time.
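
To illustrate the difference between counting packets and counting bytes, here is a small sketch. The 100 Mbit/s link rate and the 1000-packet queue limit are assumed figures, not anything taken from the kernel:

```python
def drain_time_ms(packets: int, packet_bytes: int, link_bits_per_sec: int) -> float:
    """Milliseconds needed to transmit a queue of equally sized packets."""
    return packets * packet_bytes * 8 * 1000 / link_bits_per_sec


LINK = 100_000_000  # assumed 100 Mbit/s wired link

small = drain_time_ms(1000, 64, LINK)    # 1000 minimum-size packets
jumbo = drain_time_ms(1000, 9000, LINK)  # 1000 jumbo packets

print(f"1000 x 64 bytes:   {small:.2f} ms")
print(f"1000 x 9000 bytes: {jumbo:.2f} ms")
```

The same packet-count limit covers two orders of magnitude of latency depending on packet size, while a byte limit gives the same drain time either way on a fixed-rate link.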

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 6, 2011 18:40 UTC (Tue) by nix (subscriber, #2304) [Link]

You're thinking of BQL.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 7, 2011 11:44 UTC (Wed) by mtaht (✭ supporter ✭, #11087) [Link]

dlang: actually, all the work that has gone into making timestamping fast in the last decade seems to have paid off. Eric Dumazet proved to me that it is now incredibly cheap, and I think pursuing "time in queue" has great potential to get us into the sub 30ms range for inherent latencies across a wide range of gear.

I knew in my gut, too, that timestamping was expensive. It *was* - in the early 00s. My gut was wrong.

Milk algorithm?

Posted Dec 7, 2011 18:20 UTC (Wed) by dmarti (subscriber, #11625) [Link]

Isn't everyone independently coming up with the "milk algorithm?" Here's a carton of milk (packet) with an expiration date. Put it in your fridge (buffer). When you're ready to take it out, compare the expiration date to your current date. If it's not expired, drink it (send it). If it's expired, pour it out.
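
The steps above can be sketched in a few lines. This is only a toy model of the idea, not how any real queueing discipline is implemented; the class name, the `max_age_sec` parameter and the injectable `clock` are all made up for illustration.

```python
import time
from collections import deque


class ExpiringQueue:
    """FIFO that stamps each packet with an expiry time and discards
    expired packets when they reach the front of the queue."""

    def __init__(self, max_age_sec, clock=time.monotonic):
        self.max_age_sec = max_age_sec
        self.clock = clock          # injectable so the queue can be tested
        self._items = deque()
        self.dropped = 0

    def enqueue(self, packet):
        # Put the carton in the fridge with its expiration date.
        self._items.append((self.clock() + self.max_age_sec, packet))

    def dequeue(self):
        """Return the oldest unexpired packet, or None if nothing is left."""
        now = self.clock()
        while self._items:
            expires_at, packet = self._items.popleft()
            if now <= expires_at:
                return packet       # still fresh: send it
            self.dropped += 1       # expired: pour it out
        return None
```

With a fake clock, a packet queued at t=0 with a 100ms limit is dropped if it is only reached at t=0.12, while one queued at t=0.05 is still sent.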

Milk algorithm?

Posted Dec 8, 2011 9:49 UTC (Thu) by mtaht (✭ supporter ✭, #11087) [Link]

Heh. I'm glad there's prior art.

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Jan 2, 2012 10:23 UTC (Mon) by Randakar (guest, #27808) [Link]

Even if putting a timestamp on packets is too expensive for each individual packet, there are ways to mitigate that.

For example, a queue manager could put a magic timestamp packet in its queue at periodic intervals. Say, 0.1 ms. (This number may need tuning..)

Every time a timestamp packet hits the front of the queue all packets behind it until the next timestamp marker will be at least (timestamp + something less than 1 interval) old. If that timestamp is too old you just drop the packets in the interval and move on.

Of course this solution isn't as good as individual timestamps - you can still get into situations where you're sending very old packets because there are too many packets in the intervals that you ARE processing - but I can imagine cases where this type of tradeoff may be worthwhile.
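
A rough sketch of the marker idea, under the assumption that some periodic timer calls `tick()` once per interval. All names here are hypothetical, and the whole interval behind a stale marker is dropped even though its newest packets may be slightly younger than the limit:

```python
import time
from collections import deque


class Marker:
    """Hypothetical timestamp marker inserted into the queue by a timer."""

    def __init__(self, stamped_at):
        self.stamped_at = stamped_at


class MarkerQueue:
    def __init__(self, max_age_sec, clock=time.monotonic):
        self.max_age_sec = max_age_sec
        self.clock = clock
        self._items = deque()
        self._dropping = False  # True while discarding a stale interval
        self.dropped = 0

    def tick(self):
        """Called once per interval; the only place arrivals get a timestamp."""
        self._items.append(Marker(self.clock()))

    def enqueue(self, packet):
        self._items.append(packet)  # no per-packet timestamp

    def dequeue(self):
        """Return the next packet from a fresh interval, or None if empty."""
        while self._items:
            item = self._items.popleft()
            if isinstance(item, Marker):
                # One clock read decides the fate of the interval behind it.
                age = self.clock() - item.stamped_at
                self._dropping = age > self.max_age_sec
            elif self._dropping:
                self.dropped += 1
            else:
                return item
        return None
```

The clock is read once per marker instead of once per packet, at the price of dropping (or keeping) packets in interval-sized batches.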

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Jan 3, 2012 13:01 UTC (Tue) by etienne (subscriber, #25256) [Link]

Or use 16 queues organised by received time to store packets. At some point queue (head) No 6 contains the new packets currently arriving; if you are still sending packets from queue (tail) No 7, then throw away the whole queue No 7.
Next timeslot, store new packets in queue No 7 and throw away queue No 8.
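
One way this could look, as a sketch only: the class and method names are invented, and something external is assumed to call `advance()` once per timeslot.

```python
from collections import deque


class TimeSlotRing:
    """Ring of per-timeslot queues; the slot just ahead of the head is
    always emptied, so no packet survives a full trip around the ring."""

    def __init__(self, slots=16):
        self._slots = [deque() for _ in range(slots)]
        self._head = 0          # slot receiving new arrivals
        self.dropped = 0

    def enqueue(self, packet):
        self._slots[self._head].append(packet)

    def advance(self):
        """Start a new timeslot and throw away the oldest remaining slot."""
        n = len(self._slots)
        self._head = (self._head + 1) % n
        stale = self._slots[(self._head + 1) % n]
        self.dropped += len(stale)
        stale.clear()

    def dequeue(self):
        """Send from the oldest non-empty slot, or return None if all are empty."""
        n = len(self._slots)
        for offset in range(1, n + 1):
            slot = self._slots[(self._head + offset) % n]
            if slot:
                return slot.popleft()
        return None
```

No timestamps are stored at all; a packet's age is implied by which slot it sits in, and the maximum age is set by the number of slots times the slot length.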

Bufferbloat: Dark Buffers in the Internet (ACM Queue)

Posted Dec 15, 2011 5:13 UTC (Thu) by vMeson (subscriber, #45212) [Link]

What ever happened to PCP: Efficient Endpoint Congestion Control?
http://www.usenix.org/event/nsdi06/tech/anderson.html

<quote>
Our initial experiments show that PCP, unlike TCP, achieves rapid startup, small queues, and low loss rates, and that the efficiency of our approach does not compromise eventual fairness and stability. Further, PCP is compatible with sharing links with legacy TCP hosts, making it feasible to deploy
</quote>

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds