TCP and BBR [pdf] (ripe.net)
76 points by pjf 8 months ago | 19 comments



I've been running BBR on both my web server and my own laptop for quite some time now with good results. It's included in the mainline Linux kernel - if you have kernel 4.9 or newer you can probably trial BBR yourself with these sysctls:

  net.ipv4.tcp_congestion_control=bbr
  net.core.default_qdisc=fq
Most vaguely recent distributions (Debian stable, Ubuntu, etc) now support it, and a few (like Solus) even default to using BBR.
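
If you want to sanity-check what the kernel is actually using before and after flipping those sysctls, the same values are readable under /proc; a quick sketch (the paths are Linux-specific, and this is just a convenience, the sysctl command shows the same thing):

  # Print the congestion control algorithms the kernel currently offers
  # and the one in use (the same values the sysctls above write to).
  for path in ("/proc/sys/net/ipv4/tcp_available_congestion_control",
               "/proc/sys/net/ipv4/tcp_congestion_control"):
      with open(path) as f:
          print(path, "->", f.read().strip())

Note that bbr only appears in the available list once the tcp_bbr module is built in or loaded.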


I've been interested in BBR since the ACM Queue article and finally turned it on yesterday on my home workstation.

But slide 27 in the slide deck is incredibly troubling, where it shows BBR completely destroying the throughput of a neighboring host using Cubic. (Cubic being the default on modern Linux).

Even if everyone upgrades to BBR, slide 30 shows more trouble in paradise with BBR not getting to a stable bandwidth allocation. I have a hard time reconciling this with the ACM Queue article, since supposedly Google saw great results rolling it out at Youtube.

ACM Queue article: https://queue.acm.org/detail.cfm?id=3022184


Youtube doesn't have neighbouring senders using Cubic, and doesn't try to fill the path's narrowest hop. It tries to send the video at exact playback speed and if possible at high quality, and its notions of success will reflect that.

IIRC (I read the paper too) the Youtube numbers assumed that the bandwidth available and needed are both largely stable, and that their success was in using that better.

I.e. they assumed that if 4 Mbps appeared available most of the time, then it really was available all of the time. If the capacity of the narrowest point is 10 Mbps and 4 Mbps of Youtube streaming briefly competes against a bulk download from a Cubic sender, then BBR would hold the Cubic-using download to 60% of the link, which Youtube might count as success and you might or might not.


Delay based congestion control is the basis of LEDBAT (RFC6817) as used by BitTorrent over UDP. The claimed advantage of LEDBAT is that it is submissive to TCP, instead of dominating like BBR. This allows BitTorrent to run in the background without hurting your videocalls.

What is different about BBR that makes it dominate, while LEDBAT yields?


It's interesting to compare the work being done on BBR in (relatively) recent times with the kinds of results coming out of various implementations of SCPS[1]. In general, the biggest pain with conventional algorithms is their boom/bust cycle and (re)ramp-up speeds. Satellite characteristics provide some fascinating playgrounds.

[1] https://en.wikipedia.org/wiki/Space_Communications_Protocol_...


The article states the following about TCP Reno:

"It over-corrects on loss and leaves available path capacity idle

– 10Gbps rates over 100ms RTT demands a packet loss rate of less than 0.000003%

– An average 1% loss rate over a 100ms RTT can’t operate faster than 3Mbps"

Can someone explain the math there to me? I looked at the slides and I'm not seeing it. Am I missing something really obvious?


An additive increase/multiplicative decrease congestion control algorithm has the property that the effect of a packet loss gets larger as the congestion window increases. There will be a point where the upward and downward forces on the congestion window balance out. The actual congestion window will oscillate around that.

Assuming a packet loss halves the congestion window while a packet delivery increases it by 1/cwnd, the break-even point for 1% packet loss is at a congestion window of about 14 (a 1% chance of reducing the congestion window by 7 versus a 99% chance of increasing it by about 0.07).

At 100ms RTT the average delivery rate will then be about 14 packets/RTT * 1500 bytes/packet * 10 RTTs/second ≈ 210 kB/s.

What if we cut down the packet loss rate by a factor of 10, to 0.1%? It won't give a 10x speed increase; instead the break-even point for the congestion window will be about 45; only a 3x increase in speed. If RTT is constant, increasing speed by 10x would require reducing packet loss by 100x.

(The math here is simplified, but should suffice for illustrating this).
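
For anyone who wants to play with it, here's a quick sketch of that same simplified model (loss halves cwnd, each delivery adds 1/cwnd, 1500-byte packets, 100 ms RTT; the function name break_even is just mine) that reproduces the ~14 / ~45 / ~3x numbers above:

  # Break-even congestion window for AIMD under a random loss rate p:
  # p * (cwnd / 2) == (1 - p) * (1 / cwnd)  =>  cwnd = sqrt(2 * (1 - p) / p)
  from math import sqrt

  def break_even(loss_rate, rtt=0.1, mss=1500):
      cwnd = sqrt(2 * (1 - loss_rate) / loss_rate)  # packets
      rate = cwnd * mss / rtt                       # bytes/second
      return cwnd, rate

  for p in (0.01, 0.001):
      cwnd, rate = break_even(p)
      print(f"loss {p:.1%}: cwnd ~{cwnd:.0f} packets, ~{rate / 1000:.0f} kB/s")

This prints roughly 14 packets / 211 kB/s at 1% loss and 45 packets / 670 kB/s at 0.1% loss, i.e. about a 3.2x speedup for a 10x lower loss rate.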


Thank you, yes, this makes perfect sense; it's indeed the sawtooth pattern you see with AIMD. Cheers.


Here's a tool showing basically all the factors affecting maximum speed with traditional TCP: https://wand.net.nz/~perry/max_download.php

In short, the higher the loss, the slower it goes. The longer the rtt, the more loss sensitive it becomes. BBR avoids both of these flaws.


This is a fantastic resource! Thanks.


While other commenters have given excellent answers, I just want to point out that these numbers probably fall out of the Mathis model of TCP throughput. Here's an article on the topic: https://blog.thousandeyes.com/a-very-simple-model-for-tcp-th...
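
For reference, the usual back-of-the-envelope form of that model is rate ≈ (MSS / RTT) * (C / sqrt(p)) with C around sqrt(3/2). Here's a rough sketch (function names are mine, 1500-byte packets assumed; the constant and assumptions differ between formulations, so don't expect it to reproduce the slides' exact figures, the 1/sqrt(p) scaling is the point):

  # Mathis et al. approximation: throughput ~ (MSS / RTT) * C / sqrt(p)
  from math import sqrt

  C = sqrt(3 / 2)  # constant from the classic derivation

  def mathis_bps(loss_rate, rtt=0.1, mss_bytes=1500):
      return (mss_bytes * 8 / rtt) * C / sqrt(loss_rate)

  def loss_needed(target_bps, rtt=0.1, mss_bytes=1500):
      return (mss_bytes * 8 * C / (rtt * target_bps)) ** 2

  print(mathis_bps(0.01) / 1e6, "Mbps at 1% loss, 100 ms RTT")
  print(loss_needed(10e9), "loss rate needed for 10 Gbps at 100 ms RTT")

Either way the shape of the curve is what matters: throughput scales with 1/sqrt(p), so cutting loss by 10x only buys about 3.2x more throughput, and multi-gigabit rates over a 100 ms path need vanishingly small loss rates.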


ACK frequencies should be configurable, solving both bottlenecks and making UDP less relevant at the same time:

https://www.ietf.org/id/draft-add-ackfreq-to-tcp-00.txt


That configuration is already possible in Windows with the “DelayedAckFrequency” parameter and in Linux with the “quickACK” setting, but it’s unrelated to congestion control, and it will neither “solve” bottlenecks nor “turn TCP into UDP.”

Congestion control (the purpose of BBR) deals with the bottleneck problem. Delayed ACKs are an optimization to reduce protocol overhead. Indeed there are problems with delayed ACKs when combined with Nagle’s algorithm and an application that doesn’t send a steady stream of data, but it has nothing to do with the topic of this thread.
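
For what it's worth, on Linux TCP_QUICKACK is just a per-socket, non-sticky toggle rather than a tunable frequency. A minimal sketch (the fallback value 12 is the constant from linux/tcp.h, and example.com is only a placeholder):

  import socket

  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  s.connect(("example.com", 80))

  # TCP_QUICKACK is Linux-specific; fall back to the numeric value from
  # linux/tcp.h if this Python build doesn't expose the constant.
  TCP_QUICKACK = getattr(socket, "TCP_QUICKACK", 12)
  s.setsockopt(socket.IPPROTO_TCP, TCP_QUICKACK, 1)  # ACK immediately

  # The flag is not permanent: the kernel can drop back into delayed-ACK
  # mode on its own, so apps re-set it around latency-sensitive reads.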


I need to set the ACK frequency to never, or effectively infinite (say 1000+ packets per ACK), so these are not usable for me.

TCP_QUICKACK seems to be a boolean (basically 1 or 2 packets per ack)

DelayedAckFrequency goes from 10 to 600 but it's millisecs!?

You don't think ACKs have an impact on congestion?


UDP will always be relevant as long as TCP gives you head-of-line blocking and you have temporal data that expires quickly.


Well, if you set the ACK frequency to 0, then TCP becomes UDP.

Right now it's hard-coded to 2, which is just weird.


This has nothing to do with sending. On the receiving side, you can't deliver newer packets to the application while a packet with an earlier sequence number is missing.

Ex:

4 -> 5 -> X -> 7 -> 8 -> 9

Until the client gets 6, packets 7, 8 and 9 will be held (and go stale) while 6 is retransmitted, which costs at least a round trip. At that point 9 may even be useless, since it would be too stale, and you've just added extra latency that you don't want or need.

Look up how dead reckoning is done in realtime games; this is one of many cases where having timely data works better over UDP than over TCP.
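
To make the head-of-line blocking concrete, here's a toy sketch of the in-order reassembly a TCP receiver is obliged to do (not real TCP, just the buffering behaviour; the class name is mine):

  # Toy in-order reassembly, like the buffer a TCP receiver keeps:
  # packets after a gap are buffered, not delivered, until the gap fills.
  class InOrderReceiver:
      def __init__(self, next_seq):
          self.next_seq = next_seq
          self.buffer = {}

      def receive(self, seq, data):
          self.buffer[seq] = data
          delivered = []
          while self.next_seq in self.buffer:
              delivered.append(self.buffer.pop(self.next_seq))
              self.next_seq += 1
          return delivered  # what the application actually gets to see

  rx = InOrderReceiver(next_seq=4)
  for seq in (4, 5, 7, 8, 9):            # 6 was lost on the wire
      print(seq, "->", rx.receive(seq, f"pkt{seq}"))
  print(6, "->", rx.receive(6, "pkt6"))  # retransmit finally releases 7-9

The prints show 7, 8 and 9 coming back empty (buffered) and only being released, all at once and late, when 6 finally arrives.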


If you could set the ACK frequency to 0 (meaning no ACKs needed), you would not want ordering either, so it would behave just like UDP but over TCP. Why would you want that, you might ask? Because then you could choose whether you want ordering, over the same protocol, which is precisely what games need.

For context, I built this multiplayer platform. Not that that proves anything, but I have 20 years of experience working with multiplayer: http://fuse.rupy.se/about.html


The reason you can't do that is that you would no longer have a stream-based protocol, which is one of the core requirements of TCP.

Really, there's no upside to doing it that way when UDP already exists, and there are tons of downsides to changing an existing protocol in such a fundamental way.



