I am tired of hearing about Active Queue Management research when I can easily create 400ms latency here and there using only a couple of regular TCP connections.
A 400ms queue does not need to be "actively managed", it just needs to be made smaller. No queue should be longer than 100ms, end of. I am really not interested in having super throughput from outer space while Akamai and Google and whoever else are installing server farms on every single continent to give super low latency... instantly destroyed as soon as I download some software update.
I am sure there is fascinating research and fine-tuning to be done in the 10-100 millisecond range but please, put some effort into fixing the most blatant problems first?
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 6, 2011 16:24 UTC (Tue) by jmm82 (guest, #59425)
So what about the person who runs batch file transfers and only cares about throughput and cares less about latency? Maybe we should just optimize the whole internet to your workload and at least one person will be happy.
ISPs optimize the network for benchmarks and common internet users do not even know "latency" exists as a concept. All the average consumer bases their purchase on is the average throughput. Only later do they find out about latency, when their system becomes intermittently laggy.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 6, 2011 16:39 UTC (Tue) by marcH (subscriber, #57642)
> So what about the person who runs batch file transfers and only cares about throughput and cares less about latency?
She'll get 100% throughput when downloading from the same continent and 95% when from a different one. She will not even notice.
PS: please tell her I feel sorry for her POTS bill now that everyone else switched to VoIP in one form or the other.
> Maybe we should just optimize the whole internet to your workload and at least one person will be happy.
Me *and Jim*.
Jim went all the way to the ACM to tell about the problems with his kids and his home connection and yet you forget about him: not nice!
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 6, 2011 17:11 UTC (Tue) by nye (guest, #51576)
>PS: please tell her I feel sorry for her POTS bill now that everyone else switched to VoIP in one form or the other.
s/everyone else/a handful of people/
And don't forget that in most of the world an internet connection requires paying for a POTS line anyway, so there isn't even a benefit to VoIP unless you're calling another country, but the downsides are still enormous.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 9, 2011 16:16 UTC (Fri) by jmm82 (guest, #59425)
I work with cellular internet and 1 to 3 second ping times are far too common. My point was that 100ms or 400ms are arbitrary numbers and may fit all of your needs, but the whole internet does not have 10ms ping times, hence the reason a magic number for the whole internet will not work.
Physics
Posted Dec 9, 2011 22:10 UTC (Fri) by marcH (subscriber, #57642)
> hence the reason a magic number for the whole internet will not work.
The 10-100ms range is anything but magic numbers. These two numbers are fundamental requirements coming straight from physics and biology (and a tiny bit of maths).
100ms (give or take) is the amount of buffering required at every potential bottleneck to maximize the throughput of Van Jacobson's congestion control algorithm across a continent. This number does not come from some hairy research but straight from the speed of light and the average size of a continent; not exactly an arbitrary number. Buffer more than 100ms on any link and you will harm latency even more for NO throughput benefit.
10ms is a threshold in human perception - think VoIP and gaming. Again, no magic here: just biology. Less than 10ms buffering harms your throughput (even more) for no perceptible benefit.
It is a funny cosmic coincidence that playing Counter-Strike across an ocean sucks while it's OK on the same continent (well, maybe not between Alaska and Chile but you get the point).
Now these two numbers are orders of magnitude rounded for convenience. If you think the ideal range is rather 15ms-150ms I have absolutely no problem with that. What I have a problem with is:
> I work with cellular internet and 1 to 3 second ping times are far too common.
Researchers always focus on the complicated stuff (here: optimizing between 10 and 100ms). Simply because trivial requirements do not get papers published. Do NOT let researchers distract you from simple facts like: 1 second ping time is just a plain bug/a joke. Reduce buffering to 100ms (or 150ms if you prefer) on every link and you will make most of your customers happier and upset practically NONE.
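To make the arithmetic behind the 100ms figure concrete, here is a rough sketch. The 5000 km continental span and the 2/3 propagation factor for light in fiber are assumptions picked for illustration, not figures from this thread. Real paths add routing detours and equipment delay on top of the raw propagation time, which is how one ends up in the order of 100ms.

```python
# Rough propagation-delay arithmetic. All constants are illustrative assumptions.
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
FIBER_FACTOR = 2 / 3                # assumed: light in fiber travels at about 2/3 c


def round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip time over a fiber path of the given one-way length."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


def buffer_bytes(link_bps: float, delay_ms: float) -> float:
    """Bytes a link transmits in delay_ms, i.e. a buffer worth that much time."""
    return link_bps * delay_ms / 1000 / 8


# Assumed 5000 km continental span: the physical floor for the round trip.
print(f"RTT floor over 5000 km: {round_trip_ms(5000):.1f} ms")

# What "100ms of buffering" means in bytes at two link speeds.
for bps in (1_000_000, 1_000_000_000):
    print(f"{bps} bit/s -> {buffer_bytes(bps, 100):.0f} bytes for 100 ms")
```

The point of the second function is that a limit expressed in time translates to very different byte counts depending on the link, so a fixed byte-sized buffer cannot be right everywhere.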
mtr and job done
Posted Dec 10, 2011 14:23 UTC (Sat) by marcH (subscriber, #57642)
In a similar fashion, do not let yourself be distracted by any impressive monitoring frameworks or charts researchers may be using. Matt's traceroute ("mtr") is almost always good enough to very accurately pinpoint any bufferbloat currently destroying your latency. Sometimes iperf/netperf is not even required; downloading some DVD image is enough. The main problem is not technical but going through first level and second level support. Support stories seldom make it to the ACM/IEEE though.
Science is useful and makes great reads for rainy weekends but, when push comes to shove, use simpler engineering to do the job.
Physics
Posted Dec 12, 2011 11:49 UTC (Mon) by jlokier (guest, #52227)
> 1 second ping time is just a plain bug/a joke.
I often use cellular internet with a marginal signal that is oversubscribed, and 5 second ping times are quite common. Even 20 seconds at some times.
As these are ping times to the cellular network access point, from an otherwise idle handset, it's quite possible this time cannot be improved by simply dropping packets early at any stage.
In other words, it may not be a bufferbloat problem - and it may not be a bug either, if the RF link is simply too marginal and oversubscribed. As far as I can tell, these timings depend greatly on the strength of RF signal, and on the time of day.
In this case the way forward looks like newer cellular technology. We all look hopefully at 4G/LTE, and (would be nice) better cross-carrier RF diversity.
Even so, it's not clear that "speed of light" is an achievable latency on fully-subscribed large area wireless networks with large numbers of moving devices.
Physics
Posted Dec 12, 2011 12:19 UTC (Mon) by marcH (subscriber, #57642)
> I often use cellular internet with a marginal signal that is oversubscribed, and 5 second ping times are quite common. Even 20 seconds at some times.
> [...]
> In other words, it may not be a bufferbloat problem
Whenever you experience ping times over 1 second, something somewhere is buffering your ping (or pong) packet for more than 1 second. Even if this buffer is not "bloated" strictly speaking, holding on any packet for that long is WRONG and is definitely a BUG.
As an example, any link retransmission technique with timeouts over 1 second is simply not compatible with TCP/IP. Making that link technology compatible with TCP/IP is as simple as making it time out (and drop packets) much, much sooner. It is really that simple.
Now again, finding the *optimal* timeout value is a very difficult problem. However, reducing a multi-second timeout to a reasonable 100-150 millisecond value is NOT difficult at all and will make every user happier, while upsetting none.
Here is a supermarket analogy (for a change). You know that queues always become too long at peak time. Customers complain about it. You have money for two extra tills. But you do not proceed because it is oooh so hard to find the optimal number of tills.
> In this case the way forward looks like newer cellular technology.
This is throwing out the baby with the bath water. And the newer technology might make the same mistake again. And in any case it will not displace 2.5G/3G everywhere overnight.
> Even so, it's not clear that "speed of light" is an achievable latency
Of course it's not; you need some reasonable amount of buffering for a number of reasons.
Physics
Posted Dec 12, 2011 12:47 UTC (Mon) by ekj (guest, #1524)
That's a bug. A packet is always either in-transit, being processed by a device, or being stored in a buffer for later processing and/or later sending.
The only way you can get 1+ seconds on local short-distance links, is by having the packet spend the huge majority of that time stored in some buffer. Which is a bug.
You want a sufficient buffer that short-term spikiness of packet arrival does not needlessly cause lost packets when transmission a few milliseconds later would be preferable.
But 5 seconds, or even 1 second, worth of buffering is *way* too much. Sure, we can debate whether you want 25ms or 250ms worth of buffering, and the answer is surely "it depends", but there's just no way 5 *seconds* worth of buffering can avoid causing an order of magnitude more problems than it solves.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 6, 2011 16:42 UTC (Tue) by dlang (✭ supporter ✭, #313)
the problem is that figuring out the right size queue to use is really hard to do
If you have a 1Gb/sec connection across the country, you need to have a large buffer on your machine.
the default buffer sizes in the kernel are sized for this sort of thing.
however, if your 1G network is connected to a 1M network, then your server buffer size should be 1/1000 the size to maintain the correct latency, but your desktop has no way of knowing that there is a 1m link somewhere in the middle.
in this situation (1G - 1M - 1G links) you need the routers connecting to the 1M link to have small buffers and drop packets
in practice, things are actually worse than this
you have a laptop (1G) connecting to your firewall/access point (1G) connecting to your DSL modem (10M or 100M) connecting to the ISP (1M) and similar setup on the other end.
you don't have any control over the buffers on the DSL modem, and the buffers there are much larger than they should be, and generally not configurable, so you can easily fill them up and generate the high latency.
to work around this, you need to shrink the buffers on the firewall to have it drop packets sooner, or on your laptop to have it not generate the packets.
AQM is needed to have these devices detect that there is a problem and shrink the buffers in response.
everybody knows that the current buffer sizes are far too large, but there isn't a clear answer to the question of what the buffer size should be.
variable link speeds (wireless or cable 'turbo mode') greatly complicate this issue.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 8:55 UTC (Wed) by marcH (subscriber, #57642)
> however, if your 1G network is connected to a 1M network, then your server buffer size should be 1/1000 the size to maintain the correct latency, but your desktop has no way of knowing that there is a 1m link somewhere in the middle.
No: in this case the queue(s) in your server do not matter because they will be empty most of the time. Packets will only stack up at the bottleneck (as the name implies).
> in this situation (1G - 1M - 1G links) you need the routers connecting to the 1M link to have small buffers and drop packets
Yes.
If every link makes sure not to buffer more than 100ms or so, then bufferbloat goes away in 90% of the traffic cases (10% are left for the researchers to have fun with). Let's fix the most obvious problems first.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 10:45 UTC (Wed) by dlang (✭ supporter ✭, #313)
you are right that the buffers on the 1G machine will always be empty, you need to (eventually) detect packet loss and then throttle the sending speed
the reality is that there is a lot of equipment out there that you are not going to be able to get replaced for several years, and part of that reason is that the vendors are still building equipment with buffers that are way too large because they still aren't doing testing that shows them the problem. this sort of publication and formal paper is what's required to get them to notice the problem and start the process of fixing it.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 12:05 UTC (Wed) by marcH (subscriber, #57642)
> you are right that the buffers on the 1G machine will always be empty, you need to (eventually) detect packet loss and then throttle the sending speed
TCP does that at the source. TCP is ACK-clocked on whatever is the current bottleneck thanks to the congestion or receiver window (whichever is smaller). No need to throttle TCP anywhere else.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 12:17 UTC (Wed) by dlang (✭ supporter ✭, #313)
as I understand the effects of bufferbloat, the fact that these over-large buffers are queuing the packets instead of dropping them is breaking the TCP ack clocking mechanism
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 13:26 UTC (Wed) by marcH (subscriber, #57642)
> ... is breaking the TCP ack clocking mechanism
... which does not mean TCP will go mad and send even more packets and clog whatever queues even more. I would bet it is actually the opposite.
Anyway, I only had the "reasonable queue size @ bottleneck" case in mind in my previous post. In this "normal", non-bufferbloated case ACK-clocking works fine and there is no need for externally throttling TCP anywhere else than at the bottleneck's queue when it fills up.
*In theory* not just TCP but every other protocol should be a good citizen and follow TCP's lead in respect to congestion/throttling: http://en.wikipedia.org/wiki/Datagram_Congestion_Control_...
In practice no one is using DCCP but it's not too bad either: Skype for instance actively throttles itself down in case of congestion/high latencies. Unsurprisingly, no application tries to damage the network it is using itself.
I would not be surprised if TCP is actually the worst guy for falling in and filling bufferbloat traps. There is some irony in that considering it is usually considered the best congestion citizen *in the lack of bufferbloat*.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 16:39 UTC (Wed) by martinfick (subscriber, #4455)
> I would not be surprised if TCP is actually the worst guy for falling in and filling bufferbloat traps. There is some irony in that considering it is usually considered the best congestion citizen *in the lack of bufferbloat*
I had the same thought myself and was wondering if TCP needs to be fixed to take latency into account? From my lame understanding it seems to only care about throughput, isn't that why the bittorrent folks came up with their own UDP based solution? So, while I am all for fixing buffers wherever possible, shouldn't there be more discussion about fixing TCP?
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 22:58 UTC (Wed) by dlang (✭ supporter ✭, #313)
actually, what is happening is that TCP (eventually) gets the acks for the packets that it sends, so it speeds up to try and send more traffic.
eventually the delay gets so large that it times out before the acks arrive and the speed collapses.
If you look at the graphs in Gettys's paper, you will see exactly this sort of picket-fence for what he is calling 'goodput', which is the amount of traffic that is actually getting to the destination (with the rest of the available bandwidth being taken up by 'badput', which is packets that are going to be dropped by the time they get to the destination because they are either too old, or they are retransmissions of packets that are still in flight, and so will be duplicates by the time they get there)
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 11:40 UTC (Wed) by mtaht (✭ supporter ✭, #11087)
Dlang:
You are doing a great job here of explaining things, but I have to correct you on one MAJOR point.
"however, if your 1G network is connected to a 1M network, then your server buffer size should be 1/1000 the size to maintain the correct latency, but your desktop has no way of knowing that there is a 1m link somewhere in the middle."
Um, no. The closest we know to a correct figure for buffering is the square root(flows) * bandwidth* delay product of the next hop.
Bandwidth as humans measure it is X Mbit/sec, and as computers do, it is bits/nanosec, and this distinction trips us up. Also the 1G network generally has very low delay, and the 1Mbit network very high.
I recently shot myself in the foot here: I was doing some shell scripting that assumed a linear relationship of buffers to speed for tc, and those estimates went very wrong, quickly. It rather bugs me that there is no sqrt() call in the shell; you have to simulate one using echo "sqrt(the bdp)" | bc -i or something like that.
Assuming delay is a constant (and delay is not!), doing tons of square roots for practice on common figures, tossing in nearly random numbers for the above values, straightened out my assumptions and thinking and code considerably.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 11:52 UTC (Wed) by mtaht (✭ supporter ✭, #11087)
And I should probably mention that whilst I'm fiddling with sqrts at this point, incorporating the next hop delay term into your thinking about this stuff is far more important mathematically.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 12:17 UTC (Wed) by mtaht (✭ supporter ✭, #11087)
Meh. I can't believe how much I trip myself up on this. Who knows, maybe I've been getting it wrong all this time, too....
Let me try again.
the total amount of buffering in the txqueue + tx ring buffer portion of the stack needs to be not much longer than the BDP to the *next hop*.
BQL appears to solve the tx ring portion of the problem thoroughly, at least on ethernet.
Figuring out how many streams can co-exist in the txqueuelen set of buffers above the tx ring, and when to start dropping packets there, is an AQM problem, about which much debate exists. The next-hop BDP*sqrt(flows) thing is, well, debatable, but getting the effective txqueue's length down to where that portion of the AQM debate can take place again, seems doable with the time in queue idea floating about.
The total amount of buffering in tcp's algorithms, which do their own buffering internally, that is required for the end-to-end queue to be handled, is dependent on the BDP, and I'm going to flat out wave hands and say that AQM can help there a lot, and typically has very 'interesting' problems with streams of different RTTs.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 12:36 UTC (Wed) by dlang (✭ supporter ✭, #313)
If the buffer size is root(flows) * bandwidth * delay and the bandwidth is 1/1000 of what it was before (with all else being equal) doesn't that make the required buffer size 1/1000 as well?
I'm missing something here, but I don't see how the sqrt piece matters when we are talking about the bandwidth changing
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 13:04 UTC (Wed) by mtaht (✭ supporter ✭, #11087)
In your example, you missed the delay component.
'All else' is not equal in your example.
And I was mentally going from the 1Mbit link UP to the 1000Mbit link, where delay factors in a lot. Your typical 1Mbit internet link can have an inherent next-hop delay of 1-60ms on wired technologies, which is a significant component of that portion of the BDP. Wireless is far worse, of course.
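A small sketch of why "all else" is not equal here: the bandwidth-delay product scales with delay as well as with bandwidth. The 100 microsecond next-hop delay for the 1Gbit link is an assumed figure for illustration only; the 30ms figure for the 1Mbit link is simply a value from inside the 1-60ms range mentioned above.

```python
def bdp_bytes(link_bps: int, delay_us: int) -> float:
    """Bandwidth-delay product in bytes, with the delay given in microseconds."""
    return link_bps * delay_us / 8_000_000


# Assumed next-hop delays: 100 us on the 1 Gbit link, 30 ms on the 1 Mbit link.
fast = bdp_bytes(1_000_000_000, 100)
slow = bdp_bytes(1_000_000, 30_000)

print(f"1 Gbit/s, 100 us delay: {fast:.0f} bytes")
print(f"1 Mbit/s, 30 ms delay:  {slow:.0f} bytes")
print(f"ratio: {fast / slow:.2f} (not 1000)")
```

With these assumed delays the two products differ by a factor of a few, not by the factor of 1000 that the bandwidths alone would suggest.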
And I was still kicking myself about the sqrt part from my mis-spent weekend. And I conflated the three together in trying to explain myself.
I really shouldn't post stuff before my third cup of coffee. I may just delete what I tried to post and start over. If there is a way to explain it better (if you can explain it back to me!) I'm either going to make another pot of coffee or go to bed and pull a pillow over my head. Or both.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 6, 2011 16:51 UTC (Tue) by intgr (subscriber, #39733)
> A 400ms queue does not need to be "actively managed", it just needs to be
> made smaller. No queue should be longer than 100ms, end of.
That's a very interesting point. Much of their work has gone into making queue sizes tunable in bytes, but they're really worried about *latency*, not queue size.
Seems like it could be as simple as tagging each packet with receive timestamps and dropping packets on the draining end if they've been in the queue for too long.
Now I'm sure smarter people than you and I have thought of this and rejected this approach for some reason. Does anyone know why this isn't a good idea?
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 6, 2011 17:00 UTC (Tue) by dlang (✭ supporter ✭, #313)
the problem is that getting the current time is a rather expensive operation, doing this for each packet (and worse, doing it multiple times per packet at different points in the queue), can hurt your network throughput.
in the past, queues have been managed in terms of how many packets are in the queue, not caring if the packet is a 64 byte minimum size packet or a 9000 byte jumbo packet.
one of the results of the bufferbloat effort is the new AQM queue type (I'm blanking on the name of it, but it was merged upstream a release or two ago) that manages the queue size in terms of the size of the queue in bytes. when you have a consistent link speed (i.e. most wired networks), the time needed to transmit each byte is very close to a constant, and so managing the queue in terms of bytes is almost exactly the same as managing it in terms of time.
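As a rough illustration of the bytes-versus-time equivalence, the sketch below converts a byte-sized queue into the time it takes to drain at a given link speed. The 64 KiB queue size and the link rates are assumptions chosen only to show the effect.

```python
def drain_time_ms(queue_bytes: int, link_bps: int) -> float:
    """Time to transmit a full queue of queue_bytes at link_bps, in milliseconds."""
    return queue_bytes * 8 * 1000 / link_bps


QUEUE_BYTES = 64 * 1024  # assumed queue limit, for illustration

# On a constant-rate link, a byte limit is effectively a time limit.
print(f"100 Mbit/s: {drain_time_ms(QUEUE_BYTES, 100_000_000):.2f} ms")

# If the link rate drops (e.g. a wireless link renegotiating its speed),
# the same byte limit suddenly represents far more queueing delay.
print(f"  1 Mbit/s: {drain_time_ms(QUEUE_BYTES, 1_000_000):.2f} ms")
```

This is why a byte-based limit works well on most wired links but stops tracking latency once the link speed varies.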
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 6, 2011 18:40 UTC (Tue) by nix (subscriber, #2304)
You're thinking of BQL.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Dec 7, 2011 11:44 UTC (Wed) by mtaht (✭ supporter ✭, #11087)
dlang: actually, all the work that has gone into making timestamping fast in the last decade seems to have paid off. Eric Dumazet proved to me that it is now incredibly cheap, and I think pursuing "time in queue" has great potential to get us into the sub 30ms range for inherent latencies across a wide range of gear.
I knew in my gut, too, that timestamping was expensive. It *was* - in the early 00s. My gut was wrong.
Milk algorithm?
Posted Dec 7, 2011 18:20 UTC (Wed) by dmarti (subscriber, #11625)
Isn't everyone independently coming up with the "milk algorithm?" Here's a carton of milk (packet) with an expiration date. Put it in your fridge (buffer). When you're ready to take it out, compare the expiration date to your current date. If it's not expired, drink it (send it). If it's expired, pour it out.
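The milk analogy can be sketched in a few lines. This is only a toy model, not how any real queueing discipline is implemented; the class name `ExpiringQueue`, the `max_age_s` parameter and the injectable `clock` are all hypothetical names made up for this example.

```python
import time
from collections import deque


class ExpiringQueue:
    """Toy FIFO that stamps packets on arrival and discards stale ones on dequeue."""

    def __init__(self, max_age_s: float, clock=time.monotonic):
        self.max_age_s = max_age_s   # how long a packet may sit in the queue
        self.clock = clock           # injectable so the behaviour can be tested
        self._items = deque()        # (arrival_time, packet) pairs
        self.dropped = 0

    def enqueue(self, packet) -> None:
        # Put the carton in the fridge, labelled with its arrival time.
        self._items.append((self.clock(), packet))

    def dequeue(self):
        # Take cartons out in order; pour out any that have expired.
        now = self.clock()
        while self._items:
            arrived, packet = self._items.popleft()
            if now - arrived <= self.max_age_s:
                return packet
            self.dropped += 1
        return None
```

A caller would create something like `ExpiringQueue(max_age_s=0.1)` for a 100ms limit; the limit is expressed directly in time, so it does not need to know the link speed.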
Milk algorithm?
Posted Dec 8, 2011 9:49 UTC (Thu) by mtaht (✭ supporter ✭, #11087)
Heh. I'm glad there's prior art.
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Jan 2, 2012 10:23 UTC (Mon) by Randakar (guest, #27808)
Even if putting a timestamp on packets is too expensive for each individual packet, there are ways to mitigate that.
For example, a queue manager could put a magic timestamp packet in its queue at periodic intervals. Say, 0.1 ms. (This number may need tuning..)
Every time a timestamp packet hits the front of the queue all packets behind it until the next timestamp marker will be at least (timestamp + something less than 1 interval) old. If that timestamp is too old you just drop the packets in the interval and move on.
Of course this solution isn't as good as individual timestamps - you can still get into situations where you're sending very old packets because there are too many packets in the intervals that you ARE processing - but I can imagine cases where this type of tradeoff may be worthwhile.
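One possible reading of the marker idea, as a toy sketch: a periodic timer calls `tick()` to insert a timestamp marker, ordinary packets are enqueued without touching the clock, and the clock is read again only when a marker reaches the front of the queue. `MarkerQueue`, `tick` and the other names are hypothetical, and the timer that would drive `tick()` is left out.

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class _Marker:
    stamp: float  # time at which the marker was inserted


class MarkerQueue:
    """Toy FIFO that ages packets via periodic markers instead of per-packet stamps."""

    def __init__(self, max_age_s: float, clock=time.monotonic):
        self.max_age_s = max_age_s
        self.clock = clock
        self._items = deque()
        self._dropping = False  # True while draining an interval that is too old
        self.dropped = 0

    def tick(self) -> None:
        # Called by a periodic timer: the only place arrivals cost a clock read.
        self._items.append(_Marker(self.clock()))

    def enqueue(self, packet) -> None:
        self._items.append(packet)  # no timestamp taken here

    def dequeue(self):
        while self._items:
            item = self._items.popleft()
            if isinstance(item, _Marker):
                # Packets behind this marker arrived after item.stamp, so the
                # marker's age is an upper bound on theirs.
                self._dropping = (self.clock() - item.stamp) > self.max_age_s
            elif self._dropping:
                self.dropped += 1
            else:
                return item
        return None
```

Because the marker's age overestimates the age of the packets behind it by up to one interval, this version errs on the side of dropping slightly too eagerly, which is one way of making the tradeoff described above.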
Bufferbloat: Dark Buffers in the Internet (ACM Queue)
Posted Jan 3, 2012 13:01 UTC (Tue) by etienne (subscriber, #25256)
Or use 16 queues organised by received time to store packets. At some point queue (head) No 6 contains new packets currently arriving; if you are still sending packets from queue (tail) No 7 then throw away the whole queue No 7.
Next timeslot store new packets in queue No 7 and throw away queue No 8.
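A toy sketch of how such a ring of time-slot queues might look. It is one interpretation of the description, not a tested design; `SlotRing`, `advance` and the other names are hypothetical, and `advance()` is assumed to be called once per timeslot by some external timer.

```python
from collections import deque


class SlotRing:
    """Toy ring of per-timeslot queues; the oldest slot is discarded on wrap-around."""

    def __init__(self, slots: int = 16):
        if slots < 2:
            raise ValueError("need at least two slots")
        self._slots = [deque() for _ in range(slots)]
        self._head = 0   # slot receiving newly arrived packets
        self._tail = 0   # slot currently being transmitted from
        self.dropped = 0

    def enqueue(self, packet) -> None:
        self._slots[self._head].append(packet)

    def advance(self) -> None:
        # Next timeslot: move the head on. If it would land on the slot still
        # being sent from, that slot is too old, so throw all of it away.
        nxt = (self._head + 1) % len(self._slots)
        if nxt == self._tail:
            self.dropped += len(self._slots[nxt])
            self._slots[nxt].clear()
            self._tail = (self._tail + 1) % len(self._slots)
        self._head = nxt

    def dequeue(self):
        # Send from the oldest non-empty slot, walking the tail toward the head.
        while True:
            if self._slots[self._tail]:
                return self._slots[self._tail].popleft()
            if self._tail == self._head:
                return None
            self._tail = (self._tail + 1) % len(self._slots)
```

No per-packet timestamps are needed: a packet's age is implied by which slot it sits in, and the maximum queueing delay is bounded by roughly the number of slots times the timeslot length.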