
Gettys: Bufferbloat in 802.11 and 3G Networks

Jim Gettys has another post on bufferbloat, this time looking at "big fat networks". "A simple concrete optimal example of such a busy network might be 25 802.11 nodes, each with a single packet buffer; no transmit ring, no device hardware buffer, trying to transmit to an access point. Some nodes are far away, and the AP adapts down to, say 2Mbps. This is common. You therefore have 25 * 1500 bytes of buffering; this is > .15 seconds excluding any overhead, if everything goes well; the buffers on the different machines have "aggregated" behavior. This is the optimal case for such a busy network. Even a 802.11g network with everyone running full speed will only be about 10 times better than this."
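For readers who want to check the quoted figure, the arithmetic works out directly (a quick sketch in Python; the node count, packet size, and 2 Mbps rate are taken from the example above):

    # Back-of-the-envelope check of the buffering delay quoted above.
    nodes = 25             # stations, one 1500-byte packet buffered each
    packet_bytes = 1500
    rate_bps = 2e6         # the AP has adapted down to 2 Mbps

    buffered_bits = nodes * packet_bytes * 8
    delay_s = buffered_bits / rate_bps
    print(f"{delay_s:.3f} s")   # 0.150 s, matching the "> .15 seconds" figure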


Gettys: Bufferbloat in 802.11 and 3G Networks

Posted Jan 4, 2011 5:11 UTC (Tue) by smoogen (subscriber, #97) [Link]

Most interesting. I spent the last week in a hotel where most of the issues he mentions were plain to see. I didn't have rrdtool installed and couldn't get enough bandwidth to download the tools, but just the output of pinging the default router showed large variance: a 35% drop rate and round-trip times ranging from 10 ms to 4000 ms. Quite pretty in some ways.

Better texts

Posted Jan 4, 2011 9:38 UTC (Tue) by job (guest, #670) [Link] (13 responses)

It's a good basic overview of latency in networks, except that the author makes up his own terms along the way and skips the basic math that would give you a quick idea of what kind of latency to expect. I don't know why it's Linux-related news. If you're interested in this stuff, read about latency in a networking textbook instead.

Better texts

Posted Jan 4, 2011 10:35 UTC (Tue) by paulj (subscriber, #341) [Link] (12 responses)

It's not news, but having someone of jg's stature discuss the issues helps bring fresh and wider attention to the problem.

Better texts

Posted Jan 4, 2011 13:00 UTC (Tue) by erwbgy (subscriber, #4104) [Link] (11 responses)

Agreed. I certainly wasn't aware of the problem until I started reading these blog entries.

Better texts

Posted Jan 4, 2011 15:51 UTC (Tue) by jg (guest, #17537) [Link] (10 responses)

Nor was I, until last spring, when I started running into the problems in a way I could understand.

I don't doubt there are better texts/discussions: but they generally lack grounding to the reality of the problems we all experience right now.

Pointers to such texts and papers would be very welcome: this isn't really my area; rather, it is an area I blundered into by accident and by necessity of my job. I have to write this up properly into something more coherent over the next few months.
- Jim

Better texts

Posted Jan 5, 2011 0:09 UTC (Wed) by calhariz (guest, #5003) [Link] (9 responses)

In the past I have read articles about networks, QoS, and queuing. Your blog entries are the first that try to connect the theory with the day-to-day experience of using the Internet.

I think this is your best text. It's clear, well written, and easy to understand by people who don't have a background in networks.

Better texts

Posted Jan 5, 2011 19:55 UTC (Wed) by jg (guest, #17537) [Link] (8 responses)

Thanks. That's the intent, though I do ramble on too long at times. My challenge will be trying to be Thomas Jefferson when I do a more formal publication.

I see bufferbloat as an education problem: we (the entire industry) have all been making the same mistake of excessive/static buffering for over a decade, and its consequences are not obvious.

And better references really would be gratefully received: for that formal publication, I do not want to include (nor will I have space for) the kind of exposition I've been trying to do in the blog.

Thanks

Posted Jan 5, 2011 22:31 UTC (Wed) by kleptog (subscriber, #1183) [Link] (7 responses)

This stuff has helped me understand latency issues better. I'm working on an app that has to process lots of data in real time, and buffers caused problems there too: the OS has buffers, TCP has buffers, the app had buffers. Not good for a realtime app. It took some tuning, but it worked (essentially, by sizing buffers based on time rather than bytes).
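(As a rough illustration of what time-based sizing means; this is a made-up sketch, not kleptog's actual code, and the function name and numbers are invented for the example:)

    # Sizing a buffer by how long it takes to drain, not by a fixed byte count.
    def buffer_size_bytes(link_rate_bps, target_latency_s):
        """Largest buffer that drains within target_latency_s at the given rate."""
        return int(link_rate_bps / 8 * target_latency_s)

    # At 10 Mbit/s, a 50 ms latency budget allows about 62 KB of buffering:
    print(buffer_size_bytes(10e6, 0.050))   # 62500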

This also reminds me of playing with wireless more than ten years ago, when the adapters were full-length ISA cards. We had serious problems with packet loss, and TCP couldn't distinguish loss from corruption from loss due to congestion. TCP would keep backing off when sending more would have been more beneficial: kind of the opposite of the problem here. I was actually thinking of building a retransmission protocol underneath TCP to compensate! Fortunately, the technology improved before I needed to do that.

What we need is a way of hiding the unreliability of the network from TCP (so it doesn't back off) while at the same time keeping latency to a minimum. Forward Error Correction should do this, and recent protocols have stacks of it, but it's obviously not working. Or perhaps an Explicit Crappy Network Notification bit, to tell TCP that the packet got lost but it *wasn't* congestion.
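(To make the FEC idea concrete, here is a minimal, idealized sketch: one XOR parity packet per group of data packets lets the receiver rebuild any single lost packet without a TCP retransmit. Real FEC schemes are far more elaborate; this assumes at most one loss per group and equal-size packets:)

    def xor_packets(packets):
        # XOR equal-length packets together byte by byte.
        out = bytearray(len(packets[0]))
        for p in packets:
            for i, b in enumerate(p):
                out[i] ^= b
        return bytes(out)

    group = [b"pkt1....", b"pkt2....", b"pkt3...."]
    parity = xor_packets(group)           # sent alongside the data packets

    # Suppose pkt2 is lost in transit; XORing the survivors recovers it.
    recovered = xor_packets([group[0], group[2], parity])
    assert recovered == group[1]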

Thanks

Posted Jan 8, 2011 21:17 UTC (Sat) by kleptog (subscriber, #1183) [Link] (6 responses)

Actually, rereading what I wrote, I think my point is that the reason all the low-level protocols go to such effort to avoid packet loss is that TCP can't handle non-congestion packet loss at the link layer at all.

When I was playing with that lossy wifi, at 1% packet loss it was sort of OK, but at 5% it became totally unusable. You can experiment with this easily; there's an iptables module to simulate packet loss.
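(One way to see why 5% is so much worse than 1%: the well-known Mathis et al. bound models steady-state TCP throughput as MSS/RTT * C/sqrt(p). It is only a rough model, and it ignores the timeout behavior discussed in the following comments, which makes things even worse, but it shows the trend. The MSS and RTT values here are arbitrary examples:)

    from math import sqrt

    def tcp_bw_bps(mss_bytes, rtt_s, loss_rate, c=1.22):
        # Mathis et al. steady-state bound: BW <= (MSS/RTT) * (C / sqrt(p)).
        return (mss_bytes * 8 / rtt_s) * (c / sqrt(loss_rate))

    for p in (0.01, 0.05):
        print(f"loss {p:.0%}: ~{tcp_bw_bps(1460, 0.1, p) / 1e6:.2f} Mbit/s")
    # loss 1%: ~1.42 Mbit/s; loss 5%: ~0.64 Mbit/s (MSS 1460, RTT 100 ms)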

Thanks

Posted Jan 8, 2011 22:49 UTC (Sat) by dlang (guest, #313) [Link] (5 responses)

One of the reasons the tendency has been to try to paper over problems with large buffers is that the TCP retransmit time is so long (in human terms).

If you think one second of latency is bad, remember that dropping a packet introduces a 30-second delay until that packet is retransmitted.

For a data transfer that will take several minutes this is not a big deal, especially if it makes the back-off work properly, but for an interactive session, dropping a packet can be disastrous.

The 'right' answer is to use QoS and priorities to slow down and drop packets on the long-term data connections while keeping the interactive connections as close to loss-free as possible.

It may be that the right answer for this is to reduce the retransmit time for dropped packets.
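(A toy sketch of the QoS idea, assuming just two traffic classes; real implementations live in the kernel's queueing disciplines, and everything here, names included, is invented for illustration:)

    from collections import deque

    class TwoClassQueue:
        """Strict priority: interactive packets go first, drops fall on bulk."""
        def __init__(self, bulk_limit=100):
            self.interactive = deque()
            self.bulk = deque()
            self.bulk_limit = bulk_limit

        def enqueue(self, pkt, interactive=False):
            if interactive:
                self.interactive.append(pkt)
            elif len(self.bulk) < self.bulk_limit:
                self.bulk.append(pkt)
            # else: drop the bulk packet; the loss tells its sender to back off

        def dequeue(self):
            if self.interactive:
                return self.interactive.popleft()
            return self.bulk.popleft() if self.bulk else None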

Thanks

Posted Jan 8, 2011 23:24 UTC (Sat) by foom (subscriber, #14868) [Link] (4 responses)

It's 200 ms for one dropped packet, not 30 s (still way too long, and a problem). But yes, after a *few* dropped packets (e.g. your wireless went out of range for a few seconds), it takes disastrously long for the sessions to start working again. So long that a human can sit there waiting for a little while, give up, and type ssh hostname<ret>screen -r<ret> well before the previous session starts working again, which is quite ridiculous.
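(The reason a few consecutive losses hurt so much: the retransmission timeout doubles on each failed attempt, per the standard exponential backoff. A sketch of the cumulative wait, starting from a 200 ms RTO:)

    rto, total = 0.2, 0.0
    for attempt in range(1, 7):
        total += rto
        print(f"after {attempt} consecutive timeouts: ~{total:.1f} s waited")
        rto *= 2      # exponential backoff doubles the timeout each time
    # Six timeouts in a row already adds up to ~12.6 s of waiting.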

Thanks

Posted Jan 13, 2011 20:01 UTC (Thu) by dlang (guest, #313) [Link] (3 responses)

I don't think it can be 200 ms for a dropped packet.

The time to retransmit a dropped packet needs to be longer than the round-trip time, or you will be sending out replacements for packets that are still on their way to the destination.

With satellite single-hop ping times in the 1000 ms range and dialup lines in the 250-300 ms range, I don't see how it can possibly be less than a few seconds.

If it were 200 ms, then a connection with a total ping time of 1000 ms would send six or more copies of every packet.

Thanks

Posted Jan 13, 2011 22:26 UTC (Thu) by kleptog (subscriber, #1183) [Link] (2 responses)

The retransmit time is of course not constant; it is calculated relative to the round-trip time, so links with low RTTs get quicker resends than long links. Also, SACKs may trigger early retransmits. Large documents have been written about this, and the algorithms are quite smart (except when it comes to link-layer packet loss).
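(For the curious, a sketch of the classic Jacobson/Karels estimator from RFC 6298, which derives the retransmission timeout from smoothed RTT samples. This is a simplified model, not the kernel code, which, as noted below, lives in tcp_rtt_estimator() in net/ipv4/tcp_input.c:)

    def update_rto(srtt, rttvar, rtt_sample, alpha=1/8, beta=1/4):
        # Smooth the variance first, then the mean, per RFC 6298.
        rttvar = (1 - beta) * rttvar + beta * abs(srtt - rtt_sample)
        srtt = (1 - alpha) * srtt + alpha * rtt_sample
        rto = srtt + 4 * rttvar    # RTO = SRTT + 4 * RTTVAR
        return srtt, rttvar, rto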

Thanks

Posted Jan 13, 2011 23:39 UTC (Thu) by dlang (guest, #313) [Link] (1 responses)

I don't see any place in the Linux kernel code or stats that stores the round-trip time. Where does the system track this?

Thanks

Posted Jan 14, 2011 3:44 UTC (Fri) by foom (subscriber, #14868) [Link]

TCP_RTO_MIN is 200ms (which is about 100x longer than you need for many networks...). But it scales up depending on the measured RTT, up to TCP_RTO_MAX (120 seconds!).

Look at tcp_rtt_estimator in net/ipv4/tcp_input.c

Thanks

Posted Jan 5, 2011 22:49 UTC (Wed) by jg (guest, #17537) [Link]

Glad the essays helped with your problems. I've wrestled with similar problems myself, but without the general framework for thinking about them that I have now.

Note that in reality, you can never fully distinguish congestion loss from random error, and you can easily buy yourself other pain by trying to hide everything from TCP, which also defeats its ability to use SACK and fast retransmit.

So trying to paper over problems can easily be self-defeating (as the 802.11 and 3G people have succeeded in demonstrating). I have a bit more sympathy for the 3G folks, who were essentially doing a retrofit over existing technology, than for the 802.11 people, who simply should have studied more closely how packet networks really work...

But getting this all cleaned up (as best we can) is an almost Sisyphean task. Help gratefully accepted...
- Jim


Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds