
Gettys: Diagnosing Bufferbloat

Jim Gettys looks into how to figure out which hop is the current culprit for bufferbloat. "In this particular case, with only a bit more investigation, we can guess most of the problems are in the train<->ISP hop, because my machine reports high bandwidth on its WiFi interface (130Mbps 802.11n), with the uplink speeds a small fraction of that, so the bottleneck to the public internet is usually in that link, rather than the WiFi hop (remember, it’s just *before* the lowest bandwidth hop that the buffers fill in either direction). In your home (or elsewhere on this train), you’d have to worry about the WiFi hop as well unless you are plugged directly into the router. But further investigation shows additional problems."

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 4:58 UTC (Tue) by jackb (guest, #41909) [Link]

I've been reading these articles about bufferbloat for a while now but at this point I'm not sure if I should be doing anything or not.

I have my cable modem attached to a Linux machine that serves as a router/firewall. I'm comfortable writing iptables rules but know little to nothing about queuing disciplines. My connection seems to be fine but that might just be because I am accustomed to bad performance and don't realize that it's a problem.

So at this stage, is there any software I can install or kernel option I can enable on my router/firewall system that would improve my cable modem's performance without requiring me to be a networking expert to configure, or should I keep waiting until the solution becomes easier to deploy?

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 6:42 UTC (Tue) by fest3er (guest, #60379) [Link]

I can honestly say I've never been affected by bufferbloat since I started using a traffic control config generated by my config generator. It's also built in to my update of Smoothwall.

It's a mid-beta that walks the user through setting up Linux Traffic Control. The JavaScript program allows only valid selections to be made. Knowledge of traffic control is assumed (you need to know what you are trying to accomplish), but the user does not have to be an expert in the arcane syntax of tc, htb, iptables, or the other bits and pieces used; that syntax is completely hidden (until you peruse the shell script it generates).

Even if you only set the ISP's uplink speed and use a Stochastic Fairness Queueing (SFQ) qdisc on each NIC, you will probably experience much less buffering and delay. The trick is to set the outbound link speed of each NIC to the observed speed of the link. For example, my ISP limits me to 2.8Mb/s uplink, my 100Mb/s LAN really runs at a max of 92-95Mb/s, and GigE runs at 250-350Mb/s on PCI and 850-950Mb/s on PCIe. This is on a quad-core Phenom-II 965; your mileage may vary. With the link limits set, packets are dropped in the perimeter firewall instead of being buffered there and beyond.
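A minimal sketch of that suggestion, assuming the WAN NIC is eth0, the LAN NIC is eth1, and using the 2.8Mb/s uplink figure from above (the HTB handles and rates are illustrative, not from the generator):

```shell
# Cap egress on the WAN NIC at the observed ISP uplink speed, so the
# queue builds (and drops) here rather than in the modem's buffer.
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 2800kbit ceil 2800kbit

# SFQ as the leaf qdisc gives each flow a fair share of the capped rate.
tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10

# On the LAN NIC, which runs near wire speed, plain SFQ is often enough.
tc qdisc add dev eth1 root sfq perturb 10
```

These commands need root and replace any existing root qdisc on the interface; `tc qdisc show` confirms what took effect.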

The main trick is to limit the outbound bit rate to the slowest rate in the path, as early in the path as possible. Local retransmission across a 100Mb/s or GigE network is 'free'; re-sending across that slow link is costly (time-wise).

The trouble with Linux is there is no easy way to share a slow link's bandwidth among one or more high-speed links. (It can be done using netfilter. But IMQ was dropped a while back, and the alternative isn't necessarily easy to use, or well documented.) The other problem is that traffic control on Linux (part of iproute2) doesn't work too well with a NATted NIC. And I haven't had the opportunity to get back to the program to make it work more correctly.

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 10:45 UTC (Tue) by imitev (guest, #60045) [Link]

"there is no easy way to share a slow link's bandwidth among one or more high-speed links."

You mean for ingress? For two uplinks on one NIC, or for aggregating incoming traffic from two NICs at different speeds? I had a setup with both cases - multiple ISPs on one NIC, plus aggregating "download" traffic to multiple NICs (DMZ and private interfaces). It was working really well in a company with 500+ employees and many different kinds of traffic (thin clients, VoIP, CVS commits, HTTP browsing, ...).

"But IMQ was dropped a while back, and the alternative isn't necessarily easy to use, or well documented"

If you have IFB in mind, I found it quite straightforward to use and less hackish than IMQ. NAT is also "easily" handled with connmarks.
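A rough sketch of the IFB approach mentioned here, assuming eth0 is the WAN interface and an 8Mb/s downlink (interface names, handles, and rates are illustrative):

```shell
# Create and bring up an IFB device.
modprobe ifb numifbs=1
ip link set dev ifb0 up

# Attach an ingress qdisc to the WAN NIC and redirect all incoming
# IP traffic to ifb0.
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 \
    action mirred egress redirect dev ifb0

# Shape on ifb0 as if it were an ordinary egress interface: cap at a
# bit under the real downlink speed, with SFQ for per-flow fairness.
tc qdisc add dev ifb0 root handle 1: htb default 10
tc class add dev ifb0 parent 1: classid 1:10 htb rate 7500kbit ceil 7500kbit
tc qdisc add dev ifb0 parent 1:10 handle 10: sfq perturb 10
```

Note that ingress on eth0 runs before netfilter, so iptables marks are not yet set at this point; the connmark trick mentioned above is what makes mark-based classification on the redirected traffic possible.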

But altogether, I agree it's a hassle to get everything configured and working properly; testing/debugging is also really painful.

But to come back to the OP's point: there are a few resources available - generators like yours, scripts (e.g. wondershaper), TC recipes, and so on. As a side note, I was surprised at how easy it was to set up basic QoS with OpenWrt's qos-scripts (even using HFSC).

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 12:33 UTC (Tue) by farnz (subscriber, #17727) [Link]

The thing I struggled to find was good documentation for doing DiffServ-style classification with IFB. Specifically, I could not find any documentation for tc filter that made sense, and iptables marking does not work with incoming traffic on IFB. I made it work by educated guesswork, but I have no idea why the commands I am running work, while other commands fail.

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 13:42 UTC (Tue) by imitev (guest, #60045) [Link]

I don't remember having such a problem. Maybe because I didn't use "true" ingress (that is, doing QoS on the incoming interface), but rather used IFB on the outgoing interfaces - so packets going through an IFB device had previously been marked in FORWARD. But that's only feasible if the firewall only forwards packets and doesn't run local services.
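A sketch of that arrangement, assuming eth1 is the LAN-facing interface (names, marks, and rates are illustrative). Marks set in FORWARD survive into the egress path, and mirred redirection preserves the skb mark, so flows can be classified by firewall mark on the IFB device:

```shell
# Mark "bulk download" flows in FORWARD, where conntrack/NAT state
# is visible (here: traffic from remote port 80, hypothetically).
iptables -t mangle -A FORWARD -o eth1 -p tcp --sport 80 -j MARK --set-mark 1

# Redirect egress of the LAN interface to ifb0; the mark goes along.
modprobe ifb numifbs=1
ip link set dev ifb0 up
tc qdisc add dev eth1 root handle 1: prio
tc filter add dev eth1 parent 1: protocol ip u32 match u32 0 0 \
    action mirred egress redirect dev ifb0

# Shape on ifb0, classifying by firewall mark.
tc qdisc add dev ifb0 root handle 2: htb default 20
tc class add dev ifb0 parent 2: classid 2:10 htb rate 6mbit ceil 8mbit
tc class add dev ifb0 parent 2: classid 2:20 htb rate 2mbit ceil 8mbit
tc filter add dev ifb0 parent 2: protocol ip handle 1 fw flowid 2:10
```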

Newer builds of OpenWrt use IFB instead of IMQ, and I think qos-scripts use "true" ingress, so the problem should be fixed.

Gettys: Diagnosing Bufferbloat

Posted Feb 22, 2012 10:43 UTC (Wed) by farnz (subscriber, #17727) [Link]

The problem I found isn't that the mechanisms for ingress QoS don't work - it's that they're under-documented. I've looked at OpenWRT's qos-scripts, and they don't help - they're just another way to generate mystic tc incantations.

What I'd really like, if someone who understands the combined tc/netfilter flow has the time spare, is a document that explains how tc's filter syntax works on incoming packets, and what all the actions actually do. With that in hand, I have a chance of working out why my mystic incantations work.
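For what it's worth, one way to do DiffServ-style classification on an IFB device is with u32 matches on the IP TOS byte. This assumes an HTB tree already exists on ifb0 with leaf classes 1:10 and 1:20 (all handles here are illustrative):

```shell
# DSCP lives in the upper six bits of the TOS byte, so EF (DSCP 46)
# appears as TOS 0xb8; mask 0xfc ignores the two ECN bits.
tc filter add dev ifb0 parent 1: protocol ip prio 1 \
    u32 match ip tos 0xb8 0xfc flowid 1:10

# A catch-all match sends everything else to the default class.
tc filter add dev ifb0 parent 1: protocol ip prio 2 \
    u32 match u32 0 0 flowid 1:20
```

This sidesteps iptables marking entirely, which matters on ingress, where netfilter has not yet seen the packet.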

Gettys: Diagnosing Bufferbloat

Posted Feb 27, 2012 22:03 UTC (Mon) by kleptog (subscriber, #1183) [Link]

I find the LARTC chapter on this topic to be reasonable, though since I studied it quite a bit at one point it's hard to know whether it's readable for people who know nothing at all.

The document could do with some cleaning up, I'll admit.

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 13:03 UTC (Tue) by sorpigal (subscriber, #36106) [Link]

> Knowledge of traffic control is assumed (you need to know what you are trying to accomplish)

This has been a stumbling block for me. I thought I knew the basics of networking, at least enough to muddle through, but whenever I hit your tool the choices I must make up front assume knowledge I don't have and don't know how to obtain.

Where are the tools for people who *don't* have knowledge of traffic control, who know what they are trying to accomplish but don't know any of the terminology and don't know how to mentally break the desired results down into a hierarchy of queues?

I don't mean to just complain; the tool is fantastic, and it's about 500% better than futzing with tc by hand, but no more likely to help an average sysadmin arrive at an appropriate configuration.

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 13:58 UTC (Tue) by imitev (guest, #60045) [Link]

As mentioned in another post, OpenWrt's qos-scripts seem to do a good job without any user input except, of course, setting the real up/down bandwidth.
It should be easy to "port" them to other distros.

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 20:05 UTC (Tue) by hechacker1 (subscriber, #82466) [Link]

OpenWrt's qos-scripts are pretty good with the default settings, but they do stumble when it comes to classifying BitTorrent uTP traffic (which looks like any other small-packet UDP traffic).

So just remember to disable uTP on your clients, or disable the UDP classification in the scripts.

Another option is to use the SFB (Stochastic Fair Blue) scheduler.

I was emailing the author, and apparently it only requires very basic parameters to get it working. There's no need to carefully prioritize traffic with it, since it just tries to give each flow an equal share of bandwidth while limiting buffering on each flow.
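Consistent with the "very basic parameters" point, installing SFB can be as simple as this (assuming a kernel of 2.6.39 or later, where SFB was merged, and a tc built with SFB support; eth0 is illustrative):

```shell
# Replace the root qdisc with SFB using its built-in defaults;
# per-flow accounting and marking/dropping are then automatic.
tc qdisc replace dev eth0 root sfb

# Verify what is installed.
tc -s qdisc show dev eth0
```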

I plan to experiment with SFB and see how it works out. I may create my own qos-scripts for SFB if it's useful.

Gettys: Diagnosing Bufferbloat

Posted Feb 22, 2012 0:13 UTC (Wed) by fest3er (guest, #60379) [Link]

You have a point there. It is almost impossible to find docs that introduce traffic control or lucidly describe what TC does beyond a few examples. You might dig up a PDF of Lucian Gheorghe's "Linux Firewall and QoS" (ISBN 1-904811-65-5; Packt Pub.). That is probably the most lucid description of networking and related controls that I've ever seen. A little old, but still mostly relevant today.

I do need to write a 'theory of operation' for my UI. There are five main points to bear in mind:

  1. You need to tell HTB the maximum speed of your interface.
  2. HTB's classes are used solely to determine how tokens are distributed among leaf classes. Every byte of network traffic queued for transmission requires a token before it can be released.
  3. All classification, whether done with 'tc filter' or iptables' '-j CLASSIFY', is directed to leaf classes (those HTB classes at the end of the HTB class tree that have a qdisc attached).
  4. Keep burst buffers small-ish (keep buffered packets to a reasonable minimum).
  5. Use the SFQ qdisc everywhere, setting the size to 6kB or 12kB and the time to 2-5 seconds; it should keep any one data flow from hogging all the bandwidth.
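The five points above might look roughly like this in tc terms - a sketch only, assuming eth0 is the WAN NIC with the 2.8Mb/s uplink mentioned earlier; the handles, rates, and the SSH rule are illustrative, not taken from the generator's output:

```shell
# (1) Root HTB capped at the interface's real outbound speed.
tc qdisc add dev eth0 root handle 1: htb default 30
tc class add dev eth0 parent 1: classid 1:1 htb rate 2800kbit ceil 2800kbit

# (2) Leaf classes borrow tokens from the parent up to the ceiling.
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 800kbit  ceil 2800kbit prio 0
tc class add dev eth0 parent 1:1 classid 1:30 htb rate 2000kbit ceil 2800kbit prio 1

# (5) SFQ at every leaf so no single flow hogs its class.
tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev eth0 parent 1:30 handle 30: sfq perturb 10

# (3) Classification is directed only at leaves, here via iptables:
# interactive SSH goes to the high-priority leaf, the rest defaults to 1:30.
iptables -t mangle -A POSTROUTING -o eth0 -p tcp --dport 22 \
    -j CLASSIFY --set-class 1:10
```

Point (4), keeping bursts small, corresponds to HTB's `burst`/`cburst` parameters, left at their defaults here.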

I have had long (200MB) FTP, SSH and HTTP uploads and downloads going across my wheezing cable link (2.8Mb/s up, maybe 8Mb/s down) involving my wired systems and my 'server' on wireless. The bandwidth was fully and constantly used, but I never noticed any delays or hiccups in interactive data (email, web browsing, etc.). It *can* be done. Bufferbloat *can* be controlled. Gettys is researching bufferbloat to determine what it is and how it manifests, and is describing it to the community. I simply made assumptions about where stuff was piling up and why, and took steps to alleviate the congestion using the tools I had to hand.

As to P2P carp, it has gotten to where it cannot readily be identified (without employing lots of CPU cycles), so determine what it is *not*. I usually identify normal data flows (FTP, SSH, HTTP, email, VoIP, et al.) and ensure good sharing and flow for them. Then I restrict everything else to very low speeds (like 56kb/s :)); I know/hear *very* quickly if I missed a normal or important data flow.

If you are suffering from a clogged internet connection and have a Linux FW/router, you might first try changing from the default pfifo_fast qdiscs to SFQ (as mentioned above). You might find that part of your problem is one or two data flows hogging all the bandwidth. SFQ should ease that particular bother.
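That first experiment is a one-liner per interface (eth0 here stands in for the WAN-facing NIC):

```shell
# Swap the default pfifo_fast root qdisc for SFQ; 'replace' works
# whether or not a root qdisc was explicitly configured before.
tc qdisc replace dev eth0 root sfq perturb 10

# Confirm the change and watch per-qdisc statistics.
tc -s qdisc show dev eth0
```

To back the change out, `tc qdisc del dev eth0 root` restores the kernel's default qdisc.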

Gettys: Diagnosing Bufferbloat

Posted Feb 23, 2012 0:34 UTC (Thu) by fest3er (guest, #60379) [Link]

This is the first version. My thoughts were to create individual schemes for each interface (IF). It works well enough, but it's far from perfect. The iptables hack isn't complete (though it does handle NAT decently). There's too much detail (but I intended it to give 'complete' control to the expert rather than a 'simplified' UI for non-experts). It doesn't address bandwidth sharing among IFs.

LTCGUI was mostly a 6-month learning process; compressed to 'full time', it took me at least 6 months to figure out TC and how to make it work well.

But I now have a generic scheme that provides decent sharing. I should be able to use that knowledge to provide a simplified UI that lets the user/admin set

  • the bandwidth-sharing ratios of various data flows relative to each other,
  • which types of flows are active (get specific controls) and which are inactive (get dumped into the 'default'),
  • the bandwidth-sharing ratios of the IFs relative to each other--addressing both inbound and outbound flow
  • the observed inbound and outbound bandwidths

With that info, it should be fairly easy to generate HTB class schemes to guarantee fair use while allowing any IF or dataflow to use up to 100% of available B/W. If I can do it right, the vast majority of the details of TC and netfilter will remain hidden.

But there are still details that must be admin-configurable. There are 5-10 ports often used for web browsing. There are at least six ports used for email. And so on. The admin must have a way to modify these port lists. VPNs (PPTP, IPsec and OpenVPN) are fairly stable, but they, too, can change. But this isn't going to happen any time soon.

Now that I think on it, there are more details I could hide from the user. The burst sizes can probably remain fairly constant (and small-ish). And I did the filters all wrong: most filters will have a fairly fixed set of clauses, and they should all appear in one pane instead of being added separately. Priorities should be very limited. Isochronous-like data, limited in its bandwidth needs, should have higher priority; 'undesired' data should have lower priority; and the rest should have normal priority. Anything else and the balance is easily upset.

Best I can do right now is provide an extensive example (the one I use on my perimeter firewall). Save it, edit it, copy it all, then go to the worksheet page, click the Load/Save button, paste it into the window, and click Import. Then wander around the worksheet UI. If it doesn't generate a usable shell script, try the other UI; that one is tailored toward Smoothwall and Roadster and might handle this config better. Alas, I may not have kept these two UIs as up to date as the one in Roadster - too many irons in the fire - and there are probably a few odd bugs. At the least, this config might yield a few more clues to how TC works.

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 19:56 UTC (Tue) by raven667 (subscriber, #5198) [Link]

Just a side point: are you sure your numbers are right for PCI vs. PCIe? Regular 33MHz, 32-bit PCI has over 1Gb/s of bandwidth (133MB/s), so your numbers seem awfully low. There may be other bottlenecks, but the bandwidth of the slot shouldn't be one of them.

Gettys: Diagnosing Bufferbloat

Posted Feb 21, 2012 20:34 UTC (Tue) by khim (subscriber, #9252) [Link]

Regular PCI shares its bandwidth with all the devices in the system, and it's half-duplex. If you use old high-end hardware (from pre-PCIe times) you can get better results, but today all these optimizations are considered pointless (if you want something fast, you'll use PCIe, right?), so 250-350Mb/s is pretty typical for PCI...

The situation is the same as with PIO modes: with DMA disabled, today's chipsets achieve about 15-20% of what an old 486 or Pentium achieved, because they're just not optimized for this mode - it's supported for compatibility reasons only.

Gettys: Diagnosing Bufferbloat

Posted Feb 22, 2012 10:45 UTC (Wed) by paulj (subscriber, #341) [Link]

Plus, some of the cycles of parallel PCI have to be used for bus control (I don't remember the exact ratio anymore; somewhere around 1 in 4 or 1 in 5?), so you could never get a sustained 133MB/s data transfer from it.

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds