The linked paper is the best explanation I've seen for why largish buffers are needed at all.
One thing I wonder, though: would it make sense for TCP implementations to space out multiple-segment sends if those sends are the result of a congestion window increase or a received ack that does not increase the advertised receive window size?
The idea is that, if the receiver stopped reading for a while (due to load, for example), then it would ack a segment and decrease its window. When it catches up, the window will increase and the sender should send as much data as will fit.
On the other hand, if the connection is starting up or if the receiver only acks full windows, there's no real benefit to sending the full cwnd all at once, since it's unlikely to reach the receiver any faster than spacing the data out. The latter would improve latency of competing flows.
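To make the proposed rule concrete, here is a minimal sketch of the decision only, not of any real TCP stack. The `SendOpportunity` type, its field names, and `should_pace` are all hypothetical names invented for illustration. It reads the proposal literally: pace a multi-segment send when it was triggered by cwnd growth or by an ack that did not raise the advertised receive window, and burst otherwise.

```python
from dataclasses import dataclass


@dataclass
class SendOpportunity:
    """Hypothetical snapshot of sender state when new data may be sent."""
    cwnd_grew: bool       # the congestion window just increased
    rwnd_before: int      # advertised receive window before this ack (bytes)
    rwnd_after: int       # advertised receive window after this ack (bytes)
    segments_ready: int   # segments that now fit in the usable window


def should_pace(op: SendOpportunity) -> bool:
    """Return True if the segments should be spaced out rather than burst."""
    if op.segments_ready <= 1:
        return False  # a single segment cannot be spaced out
    rwnd_opened = op.rwnd_after > op.rwnd_before
    # Pace on cwnd growth, or on an ack that left the receive window unchanged
    # or smaller; burst only when the receiver has just opened its window.
    return op.cwnd_grew or not rwnd_opened


if __name__ == "__main__":
    slow_start = SendOpportunity(True, 65535, 65535, 4)
    receiver_caught_up = SendOpportunity(False, 1460, 65535, 10)
    print(should_pace(slow_start))          # pace
    print(should_pace(receiver_caught_up))  # burst
```

The case where cwnd grows and the receive window opens on the same ack is ambiguous in the proposal; the sketch arbitrarily chooses to pace there.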
Posted May 20, 2012 22:31 UTC (Sun) by farnz (guest, #17727)
One of the challenges is determining the "correct" spacing; for Ethernet, it's trivial, as to send at a given speed, you just maintain a steady inter-packet spacing. For other link layers, however, it gets more complex.
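For the Ethernet case, the steady spacing is just packet size divided by target rate. A rough sketch, assuming the rate is counted over the packet payload only and ignoring preamble, inter-frame gap and header overhead; the function name is made up for illustration:

```python
def inter_packet_gap_seconds(packet_bytes: int, target_bits_per_second: float) -> float:
    """Time between packet starts needed to average the target rate.

    Assumption: only the packet's own bytes count toward the rate;
    link-layer framing overhead is ignored.
    """
    return packet_bytes * 8 / target_bits_per_second


if __name__ == "__main__":
    # 1,500-byte packets paced to 80 Mbit/s
    gap = inter_packet_gap_seconds(1500, 80_000_000)
    print(f"{gap * 1e6:.0f} microseconds between packet starts")
```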
Assume that your sender is on 10G Ethernet (a nice fast server for a web company, for example). Put the bottleneck link on VDSL2 (often used as the copper end of FTTC connectivity). VDSL2 has a fixed 4 kilobaud signalling rate, so 80MBit/s down (the current fastest in the UK) is provided as 20,000 bits per symbol. A standard ISP MTU in the UK is 1,500 bytes, or 12,000 bits. You therefore need to send a burst of 2 packets (24,000 bits) to guarantee enough bits buffered to fill a symbol and sustain the full 80MBit/s down, then spacing, then another burst.
And note that you can't rely on link speeds to identify what the connection technology in the bottleneck is - if I run a VDSL2 link at 25kbit/symbol, I get 100MBit/s, same as 100BASE-TX; but 100BASE-TX sends at under 1 bit/symbol, so can saturate with a single packet buffered, while VDSL2 at 25kbit/symbol and an Ethernet-compatible 1,500 byte MTU needs 3 packets buffered to saturate.
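The arithmetic behind that comparison can be sketched as follows. This assumes the simple model used above, where the link needs at least one full symbol's worth of bits buffered to stay saturated; the function name is hypothetical, and the 125 Mbaud figure for 100BASE-TX is its line symbol rate (4B5B coding), which is where "under 1 bit/symbol" comes from.

```python
import math


def packets_to_fill_symbol(link_bits_per_second: int,
                           symbols_per_second: int,
                           mtu_bytes: int = 1500) -> int:
    """Whole MTU-sized packets needed to cover one symbol's worth of bits.

    Assumed model: the bottleneck stays saturated only if at least one
    full symbol of data is buffered when each symbol is sent.
    """
    bits_per_symbol = link_bits_per_second / symbols_per_second
    return max(1, math.ceil(bits_per_symbol / (mtu_bytes * 8)))


if __name__ == "__main__":
    # VDSL2 at 4 kilobaud carrying 100 Mbit/s: 25,000 bits per symbol
    print(packets_to_fill_symbol(100_000_000, 4_000))
    # 100BASE-TX at 125 Mbaud carrying 100 Mbit/s: 0.8 bits per symbol
    print(packets_to_fill_symbol(100_000_000, 125_000_000))
```

Same link speed, different burst size: the first call needs three 1,500-byte packets, the second only one.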