Bufferbloat: the summary
Posted Feb 26, 2011 0:30 UTC (Sat) by zlynx (guest, #2285)
In reply to: Bufferbloat: the summary by jg
Parent article: The debloat-testing kernel tree
Posted Feb 26, 2011 2:05 UTC (Sat)
by jg (guest, #17537)
[Link]
Memory has gotten so big and so cheap that people often use buffer sizes much, much larger than makes sense under any circumstances.
For example, I've heard of DSL hardware with > 6 seconds of buffering.
Take a look at the netalyzr plots in: http://gettys.wordpress.com/2010/12/06/whose-house-is-of-...
The diagonal lines are latency in *seconds*.
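For a sense of scale, here is a back-of-envelope sketch in Python (the 1 Mbit/s uplink is an assumed, illustrative rate, not a number taken from the Netalyzr data):

```python
# All numbers here are illustrative assumptions, not measurements.
uplink_bps = 1_000_000            # assume a 1 Mbit/s DSL uplink
buffered_seconds = 6.0            # the figure quoted above

buffer_bytes = uplink_bps / 8 * buffered_seconds
print(f"{buffer_bytes / 1024:.0f} KiB of buffer == {buffered_seconds:.0f} s of queue delay")
# ~732 KiB: a trivial amount of RAM, but a catastrophic amount of latency.
```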
Posted Feb 26, 2011 2:15 UTC (Sat)
by jg (guest, #17537)
[Link] (5 responses)
Example 1: gigabit Ethernet.
Say you have supposedly sized your buffers "correctly", presuming a global-length path, at your maximum speed, presuming some number of flows, by the usual rule of thumb: bandwidth x delay / sqrt(#flows).
Now you plug this gigabit NIC into your 100Mbps switch. Right off the bat, your system should be using 1/10th the number of buffers.
And you don't know how many flows.
So even if you did it "right", you have the *wrong answer*, and do so most of the time.
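A minimal sketch of that mismatch, with an assumed RTT and flow count (both are illustrative; in practice the flow count is exactly what you don't know):

```python
import math

def rule_of_thumb_bytes(bandwidth_bps, rtt_s, flows):
    # bandwidth x delay / sqrt(#flows), per the usual buffer-sizing result
    return bandwidth_bps / 8 * rtt_s / math.sqrt(flows)

rtt = 0.1      # assume ~100 ms for a continental path
flows = 100    # assumed flow count

sized_for = rule_of_thumb_bytes(1e9, rtt, flows)      # NIC sized for gigabit
actual_need = rule_of_thumb_bytes(100e6, rtt, flows)  # link negotiated at 100 Mbit/s
print(f"sized for {sized_for / 1e6:.2f} MB; the link needs {actual_need / 1e6:.3f} MB")
# The "correct" static answer is 10x too big the moment the link runs slower.
```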
Example 2: 802.11n
Size your buffers for, say, 100Mbps over continental US delays, presuming some number of flows.
Now, go to a conference with 802.11g, and sit in a quiet corner. Your wireless might be running at a few megabits/second; but you are sharing the channel with 50 other people.
Your "right answer" for buffering can easily be off by 2-3 *orders magnitude*. At that low speed, it can take a *very long time* for your packets to finally get transmitted.
***There is no single right answer for the amount of buffering in most network environments.***
Right now, our systems' buffers are typically sized for the maximum amount of buffering they might ever need, even though we seldom operate them in that regime (that is, if the engineers involved thought about the buffer sizes at all).
So the buffers aren't just oversized, they are downright bloated.
Posted Feb 26, 2011 12:20 UTC (Sat)
by hmh (subscriber, #3838)
[Link] (2 responses)
The queue should be able to grow large, but only for flows where the bandwidth-delay product requires it. And it should early-drop.
And the driver DMA ring-buffer size really should be considered part of the queue for any calculations, although you probably have to consider that part of the queue a "done deal" and not drop/reorder anything there. Otherwise, you can get even fast-ethernet to feel like a very badly behaved LFN (long fat network). However, reducing DMA ring-buffer size can have several drawbacks on high-throughput hosts.
Using latency-aware, priority-aware AQM (even if it is not flow-aware) should fix the worst issues, without downgrading throughput on bursty links or long fat networks. Teaching it about the hardware buffers would let it autotune better.
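To see why the hardware buffers matter to the calculation, a rough sketch with an assumed (but plausible-looking) ring size:

```python
ring_descriptors = 256    # assumed TX ring size; actual defaults vary by driver
frame_bytes = 1500        # MTU-sized frames
link_bps = 100e6          # fast ethernet

ring_delay_ms = ring_descriptors * frame_bytes * 8 / link_bps * 1000
print(f"{ring_delay_ms:.0f} ms of queue below the qdisc layer")
# ~31 ms that any AQM sitting above the driver never sees or manages.
```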
Posted Feb 26, 2011 13:37 UTC (Sat)
by jg (guest, #17537)
[Link] (1 responses)
Which AQM algorithm to use is a different question.
Van Jacobson says that RED is fundamentally broken, and has no hope of working in the environments we have to operate in. And Van was one of the inventors of RED...
SFB may or may not hack it. Van has an algorithm he is finishing up the write-up of that he thinks may work; hopefully it will be available soon. We have a fundamentally interesting problem here. And testing this is going to be much more work than implementing it, by orders of magnitude.
It isn't clear the AQM needs to be priority aware; wherever the queues are building, you are more likely to choose a packet to drop (literally drop, or ECN mark) from them just by running an algorithm across all the queues. I haven't seen arguments that make me believe the AQM must be per queue (that doesn't mean there aren't any! just that I haven't seen them).
And there are good reasons why the choice of packet to drop should have randomness in it; time-based congestion can occur if you don't. Different packet types also have different leverage (ACKs vs. data vs. SYNs, etc.).
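As a sketch of that idea only (this is not Van's unpublished algorithm, just an illustration of randomized, delay-driven dropping run across all queues at once; the target delay is an assumed parameter):

```python
import random
from collections import deque

TARGET_DELAY_S = 0.005   # assumed target standing-queue delay: 5 ms

def maybe_drop(queues, link_Bps):
    """queues: list of deques of packet sizes in bytes."""
    backlog = sum(sum(q) for q in queues)
    delay = backlog / link_Bps            # time needed to drain everything queued
    if delay <= TARGET_DELAY_S:
        return None
    # Drop probability grows with excess delay; the randomness keeps flows
    # from synchronizing (the time-based congestion mentioned above).
    if random.random() < (delay - TARGET_DELAY_S) / delay:
        victim = random.choice([q for q in queues if q])
        return victim.popleft()           # drop -- or ECN-mark -- this packet
    return None
```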
Posted Feb 26, 2011 16:14 UTC (Sat)
by hmh (subscriber, #3838)
[Link]
The Diffserv model got it right, in the sense that even on a simple host there are flows for which you do NOT want to drop packets (DNS, NTP) if you can help it, and there is a natural hierarchy of priorities: some services should suffer more packet drops than others during congestion.
I've also found that "socializing" the available bandwidth among flows of the same class is a damn convenient thing (SFQ). SFB does this well, AFAIK.
So, I'd say that what we should aim for on hosts is an auto-tuned, flow-aware AQM that at least pays attention to the bare minimum of priority ordering (802.1p/DSCP class selectors) and does a good job of keeping latency under control without killing throughput on high bandwidth-delay-product flows. Such a beast could be enabled by default on a distro [for desktops] with little fear.
This doesn't mean you need multiple queues. However, you will want multiple queues in many cases because that's how hardware-assisted QoS works, such as what you find on any 802.11n device or non-el-cheap-o gigabit ethernet NIC.
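A toy sketch of that combination, strict-priority bands over SFQ-style flow hashing. The band and flow_id parameters stand in for the DSCP/802.1p bits and the flow 5-tuple; real implementations live in qdiscs and hardware, not application code:

```python
from collections import deque

class PrioSfq:
    """Toy model: strict-priority bands with SFQ-style fairness inside each."""

    def __init__(self, n_bands=3):
        self.bands = [{} for _ in range(n_bands)]   # flow bucket -> deque

    def enqueue(self, band, flow_id, pkt):
        bucket = hash(flow_id) % 1024               # SFQ hashes the flow tuple
        self.bands[band].setdefault(bucket, deque()).append(pkt)

    def dequeue(self):
        for band in self.bands:                     # band 0 (highest) first
            for bucket in list(band):
                pkt = band[bucket].popleft()
                if band[bucket]:
                    band[bucket] = band.pop(bucket) # rotate flow to the back
                else:
                    del band[bucket]
                return pkt
        return None
```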
Routers are a different deal altogether.
Posted Feb 26, 2011 17:58 UTC (Sat)
by kleptog (subscriber, #1183)
[Link] (1 responses)
Any delay in the routers adds to the overall delay and thus adds to the bandwidth-delay product. In essence your endpoint's memory usage is the sum of the memory used by all the routers in between. Packets spend more time in memory than they do in-flight.
There was a paper somewhere about buffers and streams: the more streams you have, the fewer buffers you need. So your endpoints need big buffers, your modem smaller buffers, and internet routers practically none.
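The result being alluded to is the sqrt(n) buffer-sizing rule; rough numbers, with an assumed link speed and RTT:

```python
import math

bdp_bytes = 10e9 / 8 * 0.25   # assume a 10 Gbit/s link and 250 ms worst-case RTT
for flows in (1, 100, 10_000):
    need = bdp_bytes / math.sqrt(flows)
    print(f"{flows:>6} flows: {need / 1e6:8.2f} MB of buffer")
# One flow needs the full ~312 MB BDP; ten thousand flows need ~3 MB --
# which is why heavily multiplexed core routers can run with tiny buffers.
```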
Posted Feb 27, 2011 16:33 UTC (Sun)
by mtaht (subscriber, #11087)
[Link]
http://www.cs.clemson.edu/~jmarty/papers/PID1154937.pdf
Wireless is not part of this study and has special problems (retries)...