But slide 27 in the slide deck is incredibly troubling: it shows BBR completely destroying the throughput of a neighboring host using Cubic (Cubic being the default on modern Linux).
Even if everyone upgrades to BBR, slide 30 shows more trouble in paradise: BBR flows don't settle into a stable bandwidth allocation among themselves. I have a hard time reconciling this with the ACM Queue article, since supposedly Google saw great results rolling it out at YouTube.
ACM Queue article: https://queue.acm.org/detail.cfm?id=3022184
IIRC (I read the paper too), the YouTube numbers assumed that the bandwidth available and the bandwidth needed are both largely stable, and that their success came from using that stable capacity better.
I.e. they assumed that if 4 Mbps appeared available most of the time, then it really was available all of the time. If the capacity of the narrowest point is 10 Mbps and a 4 Mbps YouTube stream briefly competes against a bulk download from a Cubic sender, then BBR would hold the Cubic-using download to 60% of the link, which YouTube might count as success and you might or might not.
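To spell out the arithmetic in that example (nothing here beyond the numbers already given above):

    # The arithmetic behind "held to 60%": a 10 Mbps bottleneck shared by a
    # 4 Mbps BBR video stream and a Cubic bulk download.
    bottleneck_mbps = 10.0
    bbr_stream_mbps = 4.0

    cubic_mbps = bottleneck_mbps - bbr_stream_mbps
    print(f"Cubic download held to {cubic_mbps:.0f} Mbps "
          f"({cubic_mbps / bottleneck_mbps:.0%} of the link)")
    # -> Cubic download held to 6 Mbps (60% of the link)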
What is different about BBR that makes it dominate while LEDBAT subsides?
"It over-corrects on loss and leaves available path capacity idle
– 10Gbps rates over 100ms RTT demands a packet loss rate of less than 0.000003%
– An average 1% loss rate over a 100ms RTT can’t operate faster than 3Mbps"
Can someone explain the math there to me? I looked at the slides and I'm not seeing it. Am I missing something really obvious?
Assuming a packet loss halves the congestion window while each packet delivery increases the congestion window by 1/cwnd, the break-even point for 1% packet loss is at a congestion window of about 14 (a 1% chance of reducing the congestion window by 7 balances a 99% chance of increasing it by about 0.07).
At 100 ms RTT the average delivery rate will then be about 14 packets/RTT * 1500 bytes/packet * 10 RTTs/second ≈ 200 kB/s.
What if we cut the packet loss rate by a factor of 10, to 0.1%? It won't give a 10x speed increase; instead the break-even congestion window will be about 45, only about a 3x increase in speed. If RTT is constant, increasing speed by 10x would require reducing packet loss by 100x.
(The math here is simplified, but should suffice for illustrating this).
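If it helps anyone, here is the same simplified model as a few lines of Python (same assumptions as above: halve cwnd on a loss, grow by 1/cwnd per delivered packet, 1500-byte packets, 100 ms RTT). It reproduces the break-even windows of roughly 14 and 45 and the roughly 3x gap:

    import math

    MSS_BYTES = 1500
    RTT_S = 0.1  # 100 ms

    def break_even_cwnd(loss_rate: float) -> float:
        """cwnd where the expected change per packet is zero:
        loss_rate * (cwnd / 2) == (1 - loss_rate) * (1 / cwnd)."""
        return math.sqrt(2 * (1 - loss_rate) / loss_rate)

    def avg_rate_bytes_per_s(loss_rate: float) -> float:
        # cwnd packets per RTT, converted to bytes per second
        return break_even_cwnd(loss_rate) * MSS_BYTES / RTT_S

    for p in (0.01, 0.001):
        cwnd = break_even_cwnd(p)
        rate = avg_rate_bytes_per_s(p)
        print(f"loss {p:.1%}: cwnd ≈ {cwnd:.0f} packets, "
              f"≈ {rate / 1000:.0f} kB/s (≈ {rate * 8 / 1e6:.1f} Mbps)")

    # loss 1.0%: cwnd ≈ 14 packets, ≈ 211 kB/s (≈ 1.7 Mbps)
    # loss 0.1%: cwnd ≈ 45 packets, ≈ 670 kB/s (≈ 5.4 Mbps)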
In short: the higher the loss, the slower it goes, and the longer the RTT, the more loss-sensitive it becomes. BBR avoids both of these flaws.
Congestion control (the purpose of BBR) deals with the bottleneck problem. Delayed ACKs are an optimization to reduce protocol overhead. Indeed there are problems with delayed ACKs when combined with Nagle's algorithm and an application that doesn't send a steady stream of data, but that has nothing to do with the topic of this thread.
TCP_QUICKACK seems to be a boolean (basically 1 or 2 packets per ACK).
DelayedAckFrequency goes from 10 to 600, but it's in milliseconds!?
You don't think ACKs have an impact on congestion?
Right now it's hard-coded to 2, which is just weird.
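For what it's worth, on Linux you can poke at this from userspace; a minimal sketch (the example.com connection is just for illustration, and the flag is not permanent, so it usually has to be re-set after each read):

    import socket

    # Minimal Linux-only sketch: ask for immediate ACKs on a TCP connection.
    # TCP_QUICKACK is 12 in linux/tcp.h; fall back to that value if the
    # Python constant isn't exposed on this build.
    TCP_QUICKACK = getattr(socket, "TCP_QUICKACK", 12)

    sock = socket.create_connection(("example.com", 80))          # example host
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)    # disable Nagle
    sock.setsockopt(socket.IPPROTO_TCP, TCP_QUICKACK, 1)          # ACK right away

    sock.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
    data = sock.recv(4096)
    # The flag is not sticky: the kernel may re-enable delayed ACKs,
    # so latency-sensitive code usually sets it again after each recv().
    sock.setsockopt(socket.IPPROTO_TCP, TCP_QUICKACK, 1)
    sock.close()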
4 -> 5 -> X -> 7 -> 8 -> 9
Until the client gets 6, packets 7, 8, and 9 will be held (and getting stale) until the retransmission of 6 makes the round trip. At that point 9 may even be useless, since it would be too stale, and you've just added extra latency that you don't want or need.
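A toy illustration of that head-of-line blocking, using the sequence numbers from the example above (purely illustrative, not real networking code):

    # Toy model of the example above: updates 4,5,7,8,9 arrive, 6 was dropped.
    arrivals = [4, 5, 7, 8, 9]

    # TCP-style in-order delivery: everything after the hole waits for 6.
    next_expected, buffered, delivered = 4, {}, []
    for seq in arrivals:
        buffered[seq] = f"update {seq}"
        while next_expected in buffered:
            delivered.append(buffered.pop(next_expected))
            next_expected += 1
    print("in-order receiver processed:", delivered)  # only updates 4 and 5
    print("still waiting for seq:", next_expected)     # 6 -> 7,8,9 sit buffered

    # UDP-style "latest wins" delivery: a stale update is simply superseded.
    latest = None
    for seq in arrivals:
        latest = f"update {seq}"
    print("udp-style receiver is acting on:", latest)  # update 9, no waiting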
Look up how dead reckoning is done in real-time games; this is one of many cases where having timely data works better over UDP than TCP.
For context, I built this multiplayer platform; not that that proves anything, but I have 20 years of working with multiplayer: http://fuse.rupy.se/about.html
Really, there's no upside to doing it that way when UDP already exists, and tons of downsides to changing an existing protocol in such a fundamental way.