Gettys: Bufferbloat demonstration videos

Jim Gettys says: "If people have heard of bufferbloat at all, it is usually just an abstraction despite having personal experience with it. Bufferbloat can occur in your operating system, your home router, your broadband gear, wireless, and almost anywhere in the Internet. They still think that if they experience poor Internet speed, it means they must need more bandwidth, and take vast speed variation for granted. Sometimes, adding bandwidth can actually hurt rather than help. Most people have no idea what they can do about bufferbloat. So I’ve been working to put together several demos to help make bufferbloat concrete, and demonstrate at least partial mitigation." Definitely useful viewing for anybody who is concerned with the problem and how to begin addressing it.

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 18:31 UTC (Thu) by hitmark (guest, #34609) [Link]

I fear this is a no-win situation, as reducing buffers may well cause the routers and such to drop old packets and cause retransmissions. Not a major issue for file transfers, they will just slow down. But voip or similar will start to stutter once the packet drop rate becomes high enough.

The only real way to avoid trouble is "wasteful", and this is to provide upstream bandwidth that matches or exceeds the potential total downstream. But as traffic either streams (voip, video/audio, games, p2p) or bursts (browsing, email, p2p depending on the swarm), this may see fat pipes lie mostly unused for a lot of the time.

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 18:57 UTC (Thu) by kleptog (subscriber, #1183) [Link]

That sounds better than the current situation where we have retransmissions and we're *not* dropping the packets. That's waste.

I'm not understanding your problem though. If packets are being dropped fairly, then everything is fine. Suppose your bulk transfer transfers 100 times as much data as your voip connection; then there would be 100 times as many packets dropped in your bulk transfer as in your voip connection, which seems like it shouldn't be a problem, because the transfer will slow down till there are no drops, at which point your voip connection has no drops either.
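A rough sketch of the arithmetic behind that claim, assuming every packet on the link faces the same drop probability regardless of which flow it belongs to. The function name, the packet counts, and the drop probability are all illustrative and not taken from any real tool or trace:

```python
def expected_drops(packets_sent: int, drop_probability: float) -> float:
    """Expected number of drops when every packet faces the same drop probability."""
    return packets_sent * drop_probability


# Hypothetical numbers: the bulk transfer sends 100 times as many packets as the voip call.
voip_packets = 1_000
bulk_packets = 100 * voip_packets
p = 0.01  # assumed uniform drop probability, chosen only for illustration

ratio = expected_drops(bulk_packets, p) / expected_drops(voip_packets, p)
print(ratio)  # the bulk flow absorbs drops in proportion to its share of the packets
```

Under this assumption the drops land on the bulk flow in the same 100:1 proportion as the traffic, which is the "fair" case the comment describes.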

But you're right, if you want to guarantee 0.00000% loss, then the only solution is to not saturate the pipes.

Remember, the more streams going through a pipe, the smaller the buffer can be, because any loss will only affect each stream slightly.

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 21:40 UTC (Thu) by jg (subscriber, #17537) [Link]

And not saturating a pipe isn't an option: modern TCPs will do so routinely at the edge.

What is more, the buffers are inducing *excess* packet loss under load; it may be counterintuitive, but all the excess buffering does is potentially add delay.
- Jim

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 21:59 UTC (Thu) by jg (subscriber, #17537) [Link]

I realized I should explain how the buffers are inducing excess packet loss: drop tail is not good at the best of times; you are running the packet buffers nearly full, and if you don't have space at the right instant, you have to drop. If you'd managed the buffers properly in the first place, the packet burst would have someplace to go. So you really want to be trying to keep the buffers (nearly) empty, most of the time, minimizing delay while keeping the link busy. That allows them to absorb bursts, which is their intent in the first place.

Doing that requires signalling the end points in a *timely* fashion, via packet drops or ECN (not typically turned on). These buffers are now huge, to the point that TCP has had lots of time to ramp its rate to an excess point, and in fact, can be so huge that TCP thinks the path has changed and starts to go hunt for a new operating point more aggressively. These combine to induce higher than normal packet loss. The buffers have defeated the *timeliness* presumptions of TCP's design. The bigger they are, the worse it is.

Much of the bandwidth performance degradation has been hidden (in terms of bandwidth tests) by the deployment of SACK, so other than the wasted packets (which in my traces were a few percent on the broadband link), you don't see much of the other performance loss you might see from TCP having to restart. So I ended up with just a few percent packet loss in my wired traces (but normal would be well under 1 percent).

One item I didn't understand until a long email exchange with Van Jacobson a few months ago was that the other problem is that TCP's reaction to competing traffic is related to the RTT in a quadratic fashion. So having 15 times the buffering in place in my broadband connection means it takes 15^2 times longer for the "elephant flow" filling the link to get out of the way. Not good... Your transient traffic is really losing relative to the big flows for a long time.
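The 15^2 figure can be made concrete with a tiny sketch. It takes the comment's two premises at face value: the loaded RTT grows linearly with the amount of buffering, and the time for a big flow to yield grows with the square of the RTT. The function name is hypothetical and this is not a TCP model:

```python
def recovery_time_factor(buffering_factor: float) -> float:
    """Relative time for a large flow to get out of the way.

    Assumes (per the comment above) that loaded RTT scales linearly with the
    amount of buffering and that reaction time scales with RTT squared.
    """
    return buffering_factor ** 2


print(recovery_time_factor(15))  # 225: 15 times the buffering, 225 times the wait
```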

In short, the excess buffering just hurts and does not help.
- Jim

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 19:17 UTC (Thu) by hechacker1 (subscriber, #82466) [Link]

Well, that's why we need proper AQM at the home router, and at ISP's routers.

It's fairly trivial to prioritize VOIP, DNS, and gaming traffic so they get low latency queues and rarely dropped packets. After all, these low latency sources of traffic actually consume very little in bandwidth (though they do consume packet space!).

I do this already with my home router running a custom version of openwrt. I can prioritize certain classes of traffic to effectively never drop, while punishing long lived, bursty traffic by causing it to drop, or causing it to get held in the queue.

On my home network, bufferbloat isn't really an issue, since the queues are small enough that they never incur a large latency.

But, my ISP's provided modem has its own buffering that I can't do anything about, except to reduce my own upload and download below that of my provisioned speed.
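As a minimal sketch of what "reducing below the provisioned speed" means mechanically, here is a token-bucket shaper in Python. The class, the 1 Mbit/s link, the 90% factor, and the burst size are all assumptions made for illustration; a real router does this in its own traffic shaper, not in application code. Timestamps are passed in explicitly so the behaviour is deterministic:

```python
class TokenBucket:
    """Minimal token-bucket shaper sketch (illustrative, not production code)."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float) -> None:
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # time of the previous call, in seconds

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if a packet of this size may be sent at time `now`."""
        # Refill for the elapsed time, never exceeding the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


# Hypothetical link: 1 Mbit/s provisioned upload, shaped to 90% of that.
provisioned_bits_per_s = 1_000_000
shaper = TokenBucket(rate_bytes_per_s=0.9 * provisioned_bits_per_s / 8, burst_bytes=3000)
```

Because the shaper releases slightly less than the modem can carry, the queue builds in the home router, where it can be managed, rather than in the modem's buffer.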

After all my tweaking, I managed to get my buffering down to about 150ms in the worst loaded case, whereas before it could get into the seconds.

I'm excited to see development of home router firmware that should "just fix" most of the offending problems at home. Also those changes are going into the Linux kernel, so perhaps servers in time will adopt those techniques.

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 21:42 UTC (Thu) by jg (subscriber, #17537) [Link]

One of the headaches right now is that the broadband gear is typically giving us exactly one queue: so if those buffers fill (from say, bulk TCP traffic), it can damage everything else (e.g. VOIP).

So right now, the best you can do is throttle your traffic enough that you might be able to prioritize traffic in your home router as best you can, and try to keep the broadband gear's buffers as empty as possible.

Sigh....
- Jim

Gettys: Bufferbloat demonstration videos

Posted Feb 3, 2012 7:57 UTC (Fri) by smurf (subscriber, #17840) [Link]

That's why I am using a hosted system on the other end of the link and an IPIP tunnel for *all* my traffic. Traffic shaping the way I need it on both ends of the link, at the cost of 4ms extra delay. On the plus side, no more buffers. Worth it.

Gettys: Bufferbloat demonstration videos

Posted Feb 3, 2012 14:16 UTC (Fri) by mtaht (✭ supporter ✭, #11087) [Link]

If I have any one goal for the AQM work, it's to get buffering and jitter for 'sparse streams' (voip, dns, gaming) below 4ms in the general case in a home/small business routing scenario on a 4Mbit uplink. I think we're accomplishing that now in the lab, but serious testing is indicated.

Not doing anywhere near as well on the elephants as I'd like, at the moment.

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 22:23 UTC (Thu) by dlang (✭ supporter ✭, #313) [Link]

actually, VoIP and many other multimedia protocols tend to handle gaps in the data better than they handle significant delays.

This is why the ATM protocol was designed with no retransmission capability and buffering was strongly discouraged for example.

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 22:25 UTC (Thu) by farnz (guest, #17727) [Link]

There's a key bit of the puzzle beyond reducing the buffering, and that's active queue management. Obviously, even with no packet loss, your VoIP is going to be unusable if I have 5 to 10 seconds of buffering in the path; there are very few applications that are both delay-insensitive, and unable to cope with packet loss, and those applications can be trivially fixed by adding a retransmission mechanism.

Assuming you cannot overprovision, you escape this by doing some form of queue management; the classic example (although known buggy) is Random Early Drop, where packets are dropped in proportion to how full the queue is when the packet arrives. AQM is still an area of active research, but more recent mechanisms don't suffer from the same flaws as RED. The hard part is getting them deployed everywhere - FIFO queueing is still the default, even when it's inappropriate for the likely link conditions.
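For readers who have not seen it, the core of the classic RED drop decision can be sketched as follows. This is a simplified illustration: the threshold values are hypothetical, and the real algorithm also smooths the queue length with a moving average and spaces drops out using a packet counter, both of which are omitted here:

```python
import random


def red_drop_probability(avg_queue: float, min_th: float, max_th: float, max_p: float) -> float:
    """Drop probability ramps linearly from 0 at min_th to max_p at max_th, then becomes 1."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def should_drop(avg_queue: float, min_th: float = 5, max_th: float = 15,
                max_p: float = 0.1, rng=random.random) -> bool:
    """Randomly decide whether to drop an arriving packet (thresholds are illustrative)."""
    return rng() < red_drop_probability(avg_queue, min_th, max_th, max_p)
```

The point of the ramp is that senders start receiving drop signals while the queue is still short, instead of only when it overflows.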

Remember also that traffic splits into two rough classes by behaviour:

  1. Rate-limited flows, like VoIP, where as soon as the flow is being given enough capacity, it stops increasing in rate.
  2. Unrestricted flows, like file uploads, where adding capacity results in the flow speeding up.

It is entirely possible for AQM schemes to be designed to take this into account - indeed, on links where the link speed is much higher than the speed of the rate-limited flows, most random chance AQM schemes will do so simply because there are many more packets from unrestricted flows. And, of course, it is possible to do some sort of flow aware treatment - either genuine QoS rules, which will give VoIP special treatment, or statistically fair methods such as Stochastic Fair Blue, which is designed to give each flow on a link an equal share of link capacity.
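The "equal share" goal, combined with the two traffic classes above, corresponds to what is usually called max-min fairness. The sketch below computes that ideal allocation; it is not the Stochastic Fair Blue algorithm itself, only the target such schemes approximate, and the link and flow numbers are made up:

```python
def max_min_fair_share(capacity: float, demands: list[float]) -> list[float]:
    """Max-min fair allocation: no flow gets more than it asks for,
    and capacity left over by small flows is split equally among the rest."""
    allocation = [0.0] * len(demands)
    remaining = capacity
    # Visit flows from smallest to largest demand.
    unsatisfied = sorted(range(len(demands)), key=lambda i: demands[i])
    while unsatisfied:
        share = remaining / len(unsatisfied)
        i = unsatisfied[0]
        if demands[i] <= share:
            # Rate-limited flow: it gets all it wants and frees the rest.
            allocation[i] = demands[i]
            remaining -= demands[i]
            unsatisfied.pop(0)
        else:
            # Every remaining flow wants more than an equal share.
            for j in unsatisfied:
                allocation[j] = share
            break
    return allocation


# Hypothetical 1000 kbit/s link: one 64 kbit/s voip call and two unrestricted uploads.
unlimited = float("inf")
print(max_min_fair_share(1000, [64, unlimited, unlimited]))
```

The rate-limited flow is fully served, and the unrestricted flows split what is left.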

Gettys: Bufferbloat demonstration videos

Posted Feb 3, 2012 0:01 UTC (Fri) by jd (guest, #26381) [Link]

There are many variations of RED (Weighted RED, for example). In addition to RED and BLUE, there are other - less common - packet-dropping schemes such as GREEN, BLACK, PURPLE and WHITE. These are not, however, available for Linux as far as I know. The reason they are of interest, though, is that they're specifically designed with multimedia in mind.

I would also recommend to people that they look up Class-Based Queueing and Hierarchical Fair Service Curve, since these allow different traffic types to be handled differently. There are (or were, not sure if they're up-to-date) patches for other service schemes, if these aren't quite what is wanted.

Finally, getting back to buffering, I would advise people to take a serious look at projects like Web10G, which give you network profiling information. You can't select an algorithm or tune the buffer size if you don't have the information to work with. You'd be whistling in the dark, with no idea if an improvement is real or just for that moment.

Web10G

Posted Feb 3, 2012 14:12 UTC (Fri) by mtaht (✭ supporter ✭, #11087) [Link]

I took a serious look at web10G and had it in cerowrt for a while.

The stats I got out of it were generally not relevant to bufferbloat.

As it tinkles all over the hot path for tcp, to generate stats that weren't relevant to my problem, I ended up ripping it out.

I felt that perhaps periodically or temporarily inserting a watchpoint on a critical function might be better. I'm still open to finding a magic variable of some sort, believe me...

And if anyone wants to look at web10g harder, there's a port of the estats utility to openwrt in the cerowrt repo... and there are 3.1 patches floating about too.

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 22:42 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link]

Of course having too big a buffer can be a disaster for a latency sensitive application like VOIP. If you fill your buffer, you're automatically introducing a delay of buffer size/transmission speed, which can be several seconds in some real world conditions. I'd rather have some stutter from packets being dropped than a multi-second lag in the conversation.
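The "buffer size/transmission speed" relation is easy to evaluate. The sketch below uses a hypothetical 256 KiB buffer in front of a 1 Mbit/s uplink; both figures are chosen only to illustrate the scale, not measured from any device:

```python
def queue_delay_seconds(buffer_bytes: int, link_bits_per_s: float) -> float:
    """Time needed to drain a full buffer at the given link rate."""
    return buffer_bytes * 8 / link_bits_per_s


# Hypothetical example: 256 KiB of buffering, 1 Mbit/s uplink.
delay = queue_delay_seconds(256 * 1024, 1_000_000)
print(f"{delay:.2f} s")  # about two seconds of added latency when the buffer is full
```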

And think about what happens when the bandwidth hog that's saturating your link finishes. The VOIP packets that have been backed up in the buffer will be delivered as fast as the link can handle, which will likely be much faster than their natural delivery rate. What are you going to do with the great rush of packets you get? If you drop them, you'll wind up causing exactly the kind of stuttering you were trying to prevent by including a buffer. If you try to buffer them within the application you're left with the extra delay, which could potentially grow the next time a bandwidth hog saturates the connection and refills up the buffer.

Gettys: Bufferbloat demonstration videos

Posted Feb 2, 2012 22:57 UTC (Thu) by AndreE (subscriber, #60148) [Link]

I think the idea (please someone correct me if I'm wrong) is that reasonable packet loss/retransmission actually functions as a very crude form of queue management. Even time sensitive applications like VOIP can tolerate some packet loss, but they cannot tolerate huge packet delay being induced by buffer bloat.

Obviously, what is "reasonable" packet loss depends on a lot of things, including the application, its required maximum latency, and how much delay the buffer bloating actually introduces.

Gettys: Bufferbloat demonstration videos

Posted Feb 3, 2012 9:38 UTC (Fri) by jezuch (subscriber, #52988) [Link]

> I fear this is a no win situation, as reducing buffers may well cause the routers and such to drop old packets and cause retransmissions. Not a major issue for file transfers, it will just slow down. But voip or similar will start to stutter once the packet drops becomes high enough.

As I understand it, it is exactly this idea that packet drops are Evil (capital E Evil) that led to bufferbloat. Packet drops are like pain, a warning signal that you may be doing something that may harm you. In case of TCP it means you're going too fast. So it may sound undesirable, and apparently many people thought so, but you don't want to stop feeling pain altogether.

Gettys: Bufferbloat demonstration videos

Posted Feb 4, 2012 6:55 UTC (Sat) by foom (subscriber, #14868) [Link]

Before I can stop thinking that packet drops are evil, I'd like a fix for the 200msec RTO_MIN issue...

BTW, I've learned since last time I mentioned this that apparently Windows delayed-ack timeout defaults to 200msec (TcpDelAckTicks=2), which is the same value as linux's RTO_MIN. (Contrast that with linux's TCP_DELACK_MIN of 40msec). That means that on networks with < 200msec round-trip-time (a lot of them), a linux server talking to a windows client will *always* retransmit the last packet in a sequence, if the windows client decides to delay the ack (which will be the case if there were an odd number of packets in that sequence).
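The timing argument can be sketched as a toy model. It assumes the sender's retransmission timer is sitting exactly at the value passed in (real stacks compute the RTO from smoothed RTT and its variance, so the effective timeout may differ), and the function name is hypothetical:

```python
def last_segment_retransmitted(rtt_ms: float, delayed_ack_ms: float, rto_ms: float) -> bool:
    """True if the retransmission timer fires before the delayed ACK can arrive.

    Toy model: the ACK for the final segment arrives one round trip plus the
    receiver's delayed-ack timeout after the segment was sent.
    """
    ack_arrival_ms = rtt_ms + delayed_ack_ms
    return ack_arrival_ms > rto_ms


# 200 ms delayed ack against a 200 ms timeout: any nonzero RTT loses the race.
print(last_segment_retransmitted(rtt_ms=50, delayed_ack_ms=200, rto_ms=200))
# 40 ms delayed ack against the same timeout: the ACK arrives in time.
print(last_segment_retransmitted(rtt_ms=50, delayed_ack_ms=40, rto_ms=200))
```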

So I guess that's pretty good empirical evidence that setting rto_min < 40msec (let's say, for example, zero) can't be so bad, even in the presence of the remote host attempting to use delayed acks. :)

Fixing the wrong problem

Posted Feb 2, 2012 21:36 UTC (Thu) by ncm (subscriber, #165) [Link]

This strikes me as attacking the wrong problem. Buffering capability is not the real problem, it just makes the effects of the real problem more visible. The problem is fundamental to TCP: it has a pathetically naive flow control protocol. The real solution is not to strip buffering capability from routers, it's to replace TCP with one of the (phenomenally successful) alternatives that doesn't depend on packet loss, or proxies for it, as an indicator of congestion.

One such phenomenally successful approach is to take changes in average flight time of packets to indicate corresponding growth or shrinkage of queue lengths in the path. The receiver knows the flight times. The sender needs to know how that average is changing, and adjust its sending rate accordingly. That way it never provokes packet loss itself, and isn't fooled by (numerous) non-congestion-related causes of packet loss.

Fixing the wrong problem

Posted Feb 2, 2012 21:47 UTC (Thu) by jg (subscriber, #17537) [Link]

While it might have been nice, 25 years ago, to foresee this and make the congestion avoidance algorithm delay sensitive, I fear we can't get there from here (for game theory reasons). I'd love to be wrong, but I suspect any delay sensitive transport is going to lose relative to a non-delay sensitive transport. So I don't see how to get there from here.

The fundamental issue here is allowing the buffers (of whatever size) to get filled in the first place. Then it doesn't matter what size the buffers are. AQM can prevent that; but RED won't work in many places we have to deal with today, and we don't have a replacement for it yet. There is hope here, but not quite ready to try to do code...

In the meanwhile, understand that the default buffer sizes are often hugely too big (say, by an order of magnitude), and tuned for the fastest the hardware can ever go (when you are actually using it at a fraction of its possible bandwidth). Even simple mitigation can help a lot.

Fixing the wrong problem

Posted Feb 3, 2012 2:25 UTC (Fri) by ras (subscriber, #33059) [Link]

> it's to replace TCP with one of the (phenomenally successful) alternatives that doesn't depend on packet loss, or proxies for it, as an indicator of congestion.

Is there such a protocol, ie one that gives us something beyond what ECN + TCP already provides, and preserves the basic IP tenet of putting the intelligence at the edges? What is it?

It is true that ECN hasn't been deployed widely enough to be reliable. But as jg says here, the reason isn't technological, it's to do with incentives of deploying it versus not deploying it - which can be analysed using game theory. But we haven't lost all hope of implementing it. We have an inflection point coming up, IPv6. Hopefully all IPv6 capable routers will implement ECN reliably.

Fixing the wrong problem

Posted Feb 3, 2012 5:03 UTC (Fri) by imgx64 (guest, #78590) [Link]

> Is there such an protocol, ie one that gives us something beyond what ECN + TCP already provides, and preserves the basic IP tenet of putting the intelligence at the edges? What is it?

Well, there is always the forgotten StepChild Transmission Protocol (SCTP).

While I think that SCTP would solve many problems, the sad truth is that it's simply unusable over the internet. Many nodes on the internet (and about 100% of home routers) drop everything they don't know about (anything but TCP, UDP, and ICMP).

Judging from the IPv6 transition process, the only way a widely-used protocol could be replaced is when it becomes simply *impossible* to use further (i.e. IPv4 address exhaustion), and even then, the transition could take decades to happen.

So yeah, I'm not holding my breath over SCTP. Fixing TCP is more reasonable.

Fixing the wrong problem

Posted Feb 3, 2012 6:00 UTC (Fri) by ras (subscriber, #33059) [Link]

I thought SCTP illustrated my point. It does add several features to TCP, but congestion control isn't one of them - it copies TCP's congestion control implementation. As does DCCP for that matter.

The reason it just copies TCP is that, as far as I know, there isn't a solution beyond ECN. Another way of saying that is ECN is about as far as you can go without breaking the "intelligence at the edges" condition. Setting that one bit when your queues are long doesn't require too much overhead, but doing something more may well do. Examples of solutions that do break that condition are bandwidth reservation and QoS.

I say "as far as I know", because what I know is pretty dated. If someone has come up with a better solution I would like to hear about it - just out of intellectual curiosity.
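To show how little a router has to do, here is a sketch of the marking decision. It relies on the ECN field being the low two bits of the IPv4 TOS / IPv6 Traffic Class byte with the codepoints defined in RFC 3168; the function name and the "drop"/"mark" return convention are made up for this example:

```python
# ECN codepoints (low two bits of the TOS / Traffic Class byte), per RFC 3168.
NOT_ECT = 0b00  # sender does not support ECN
ECT_1 = 0b01    # ECN-capable transport
ECT_0 = 0b10    # ECN-capable transport
CE = 0b11       # congestion experienced


def on_congestion(tos: int) -> tuple[str, int]:
    """What a congested queue does with a packet, given its TOS byte.

    Returns the action taken and the (possibly rewritten) TOS byte.
    """
    ecn = tos & 0b11
    if ecn == NOT_ECT:
        # The endpoints cannot understand a mark, so loss is the only signal.
        return ("drop", tos)
    # Set both ECN bits, leaving the upper six (DSCP) bits untouched.
    return ("mark", tos | CE)
```

The router keeps no per-flow state here; all the reaction to the mark happens at the endpoints.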

Fixing the wrong problem

Posted Feb 3, 2012 10:41 UTC (Fri) by epa (subscriber, #39769) [Link]

SCTP over IPSec?

Fixing the wrong problem

Posted Feb 3, 2012 23:42 UTC (Fri) by wmf (guest, #33791) [Link]

You often have to tunnel IPSec over UDP due to borkenness. SCTP over IPSec over UDP? Why not...

Fixing the wrong problem

Posted Feb 6, 2012 11:39 UTC (Mon) by epa (subscriber, #39769) [Link]

If crappy home routers, etc, can be trusted to pass UDP packets through unmolested, then SCTP-over-UDP is a possibility. Adding IPSec is nice because it defeats packet inspection, making it less likely that things will break in the future.

Fixing the wrong problem

Posted Feb 3, 2012 5:20 UTC (Fri) by raven667 (subscriber, #5198) [Link]

I would argue that ECN was deployed widely and it broke so much stuff for really lame reasons that it just can't be used on the open Internet safely. Maybe the situation has changed but AFAIK ECN is dead, even though it's widely implemented.

Fixing the wrong problem

Posted Feb 3, 2012 9:12 UTC (Fri) by farnz (guest, #17727) [Link]

ECN may be widely deployed in endpoints (Wikipedia's article tells me that all major OSes have it available), but it's normally disabled there, and isn't necessarily well-deployed in midpoints.

The problem is twofold:

  1. Misconfigured firewalls drop ECN traffic in a variety of interesting ways, making ECN-enabled systems appear "broken".
  2. Routers often don't do ECN marking of packets, they go straight to traffic drop - no point unless a majority of endpoints are ECN enabled.

It's the interaction of point 2 and point 1 that makes ECN a non-starter - there's no point making my routers do ECN marking if the endpoints won't use it. There's no point making endpoints use ECN marking if the result is traffic lost due to misconfigured firewalls. There's no point fixing the firewalls if routers don't do ECN...

A shame, as ECN is a technically better solution, but that's life for you.

Fixing the wrong problem

Posted Feb 3, 2012 2:59 UTC (Fri) by kevinm (guest, #69913) [Link]

Using packet loss to tell a sender to slow down is the only viable option, because packet loss is the only thing that imposes pain on the sender. Anything that does not impose pain on the sender can and will be ignored by a sender that wants to consume an unfair proportion of the link.

Fixing the wrong problem

Posted Feb 3, 2012 10:32 UTC (Fri) by khim (subscriber, #9252) [Link]

Yup. And even if the sender is "nice" it can be coerced to become "evil". Remember download accelerators? They are still around...

Fixing the wrong problem

Posted Feb 3, 2012 17:49 UTC (Fri) by shemminger (subscriber, #5739) [Link]

The alternative is delay based TCP algorithms (Vegas and its descendants). These have never been widely deployed for two reasons: first, any delay based algorithm loses in a fight for bandwidth against a loss based algorithm. The good guy (Vegas) ends up losing to the bad guy (Reno) in any contended queue in the core routers. Secondly, delay based algorithms are too sensitive to other traffic on the reverse path or normal delays (like IRQ coalescing). The Round Trip Time measurement has a lot of variation, and filtering that variation out is hard except in the case of long lived flows (ie benchmarks).
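For context, one congestion-avoidance step of a Vegas-style controller might look like the sketch below. It follows the published Vegas idea of comparing expected and actual throughput; the function name and the alpha/beta thresholds are illustrative rather than a copy of any kernel implementation:

```python
def vegas_adjust(cwnd: float, base_rtt: float, rtt: float,
                 alpha: float = 2.0, beta: float = 4.0) -> float:
    """One Vegas-style window update; cwnd in packets, both RTTs in the same unit.

    base_rtt is the smallest RTT seen so far (taken as the unloaded path delay),
    rtt is the current measurement.
    """
    expected = cwnd / base_rtt               # throughput if nothing were queued
    actual = cwnd / rtt                      # throughput actually achieved
    queued = (expected - actual) * base_rtt  # estimated packets sitting in queues
    if queued < alpha:
        return cwnd + 1  # path looks underused: grow
    if queued > beta:
        return cwnd - 1  # queues are building: back off before any loss
    return cwnd


print(vegas_adjust(cwnd=20, base_rtt=100, rtt=200))  # RTT doubled, so the window shrinks
```

The sketch also makes both objections visible: the window shrinks as soon as delay rises, while a loss based flow keeps growing, and the decision hinges entirely on a noisy RTT sample.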

Fixing the wrong problem

Posted Feb 3, 2012 19:06 UTC (Fri) by mtaht (✭ supporter ✭, #11087) [Link]

I have to note that 'westwood' is a replacement for vegas and needs to be looked at harder in a contested scenario with cubic.

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds