
BBR, the new kid on the TCP block - pjf
https://blog.apnic.net/2017/05/09/bbr-new-kid-tcp-block/
======
notacoward
It's almost irresponsible to write an article on this topic in 2017 without
explicitly mentioning bufferbloat or network-scheduling algorithms like CoDel
designed to address it. If you really want to understand this article, read up
on those first.

[https://en.wikipedia.org/wiki/CoDel](https://en.wikipedia.org/wiki/CoDel)

~~~
throwasehasdwi
CoDel is different from a TCP congestion control algorithm, even though both
fight bufferbloat in different ways. CoDel is a queue management (AQM)
algorithm that controls what happens when outgoing buffers start to overflow.
It operates at a lower level than TCP and applies to any type of packet. A
congestion control algorithm, like Vegas or BBR, controls the transmit rate of
TCP traffic only.

When packets are being sent over the wire, the TCP congestion control
algorithm (usually CUBIC, Vegas, Reno, or now BBR) will send out packets until
the parameters it monitors indicate that a downstream device is about to
overload, then back off slightly to prevent packets from being lost. These
transmit strategies tend to monitor packet loss rate, round-trip time, or
both. What they do with those two signals accounts for the biggest differences
between the algorithms.

CoDel comes into effect when the congestion control algorithm decides it can't
send out packets quickly enough without losing them, and they build up in
local buffers. This can happen with TCP but also with other internet
protocols.

Something most people don't know is that without a congestion control
algorithm like BBR, Vegas, or Reno, you would send out packets at interface
speed. In simpler protocols like UDP you need to do your own pacing; otherwise
your machine will send packets at interface speed until most of them are
dropped by the first slower link. This is why TCP has congestion control
algorithms: they are all attempts to discover the best end-to-end rate from A
to B that can be achieved without losing data.

Edit: BBR is a new TCP congestion control algorithm that fights bufferbloat at
the TCP level. Since the majority of internet traffic is TCP, wide adoption
would be a big improvement. Congestion control only affects outgoing packets,
so it's important to get this into both Windows and Linux so we get the full
benefit of bufferbloat reduction on both ends. I'm looking at MS here because
they're the last major OS shipping an aggressive, bufferbloat-causing TCP
algorithm.

~~~
Klathmon
Do you have any information on Windows' TCP algorithms and why they are so
bad, or what kinds of problems they cause?

~~~
throwasehasdwi
By default, Windows machines use the NewReno algorithm. Older algorithms like
NewReno are known for causing horrible bufferbloat compared to delay-sensitive
algorithms like Vegas, which is available in Linux. The reason is pretty
simple: the two main signals used to decide when to send a packet in TCP are
round-trip delay and packet loss.

Reno/NewReno slow their send rate mainly when they detect lost packets,
whereas Vegas and similar algorithms mainly react to round-trip delay but also
respond to loss. The problem with primarily loss-based strategies is that most
routers won't drop packets until their buffers are full, and in modern
hardware these buffers can be very large (many seconds' worth of data). TCP
Reno will keep sending packets until downstream routers have full buffers and
start dropping, which may not happen until they're already holding seconds of
data. This is bufferbloat: any packet that makes it to those routers will
spend seconds waiting to be put on the wire.
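The "seconds of delay" claim is easy to check with back-of-envelope
arithmetic. The buffer size and link speed below are hypothetical but typical
numbers, not figures from the thread:

```python
# Queueing delay added by a full router buffer:
# a 256 KiB buffer draining onto a 1 Mbps uplink.
buffer_bytes = 256 * 1024
link_bps = 1_000_000

# Time to drain the whole buffer = delay seen by the last packet in line.
drain_time_s = buffer_bytes * 8 / link_bps
print(f"{drain_time_s:.2f} s of queueing delay when the buffer is full")
```

256 KiB is about 2.1 s of data at 1 Mbps, so every packet arriving behind a
full buffer waits roughly that long before reaching the wire.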

VEGAS on the other hand will try to maintain a constant round trip delay. It
uses TCP's ACK packets to gauge how long it takes packets to go back and forth
and reduces send rate when this starts to rise. This keeps router buffers
empty, delay low, and packet loss near zero. BBR is a further enhancement of
the VEGAS delay sensing strategy.
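A single Vegas-style window update can be sketched as follows. This is a toy
model of the published Vegas rule, not the kernel implementation; the alpha
and beta thresholds and the example numbers are illustrative assumptions:

```python
def vegas_adjust(cwnd, base_rtt, rtt, alpha=2, beta=4):
    """One Vegas-style congestion window update (sketch).

    diff estimates how many of this flow's own packets are sitting
    in network buffers, inferred purely from RTT inflation.
    """
    expected = cwnd / base_rtt            # throughput with empty queues
    actual = cwnd / rtt                   # throughput actually observed
    diff = (expected - actual) * base_rtt # ~packets queued in the network
    if diff < alpha:
        return cwnd + 1   # little queueing: probe for more bandwidth
    if diff > beta:
        return cwnd - 1   # queues building: back off before any loss
    return cwnd           # in the sweet spot: hold steady

# Base RTT 100 ms. A barely-inflated RTT grows the window;
# a heavily inflated RTT (queues filling) shrinks it.
print(vegas_adjust(20, 0.100, 0.105))  # -> 21
print(vegas_adjust(20, 0.100, 0.150))  # -> 19
```

Because the back-off triggers on rising delay rather than on loss, the sender
reduces its rate while router buffers are still nearly empty.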

As mentioned in this chain, CoDel is one strategy for "fixing" aggressive
loss-based protocols like TCP Reno. It detects "flows" (IP/port pair
combinations) that are causing the local outbound buffers to overflow and
starts selectively dropping packets on them. If all internet protocols were
delay-based like TCP Vegas, CoDel would not be needed to keep traffic flows
"fair". Without something like CoDel on your routers to punish aggressive
strategies, loss-based algorithms like NewReno will keep your routers'
outgoing buffers permanently full and out-compete friendlier traffic like
Vegas that cuts itself back when buffers fill.
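CoDel's core control law can be sketched in a few lines. This is a toy
simplification: real CoDel (RFC 8289) has more states, and the per-flow
dropping described above is the fq_codel variant. The 5 ms target and 100 ms
interval are the standard defaults:

```python
import math

TARGET = 0.005     # 5 ms: acceptable standing queue delay
INTERVAL = 0.100   # 100 ms: window for judging persistent delay

class CoDelSketch:
    """Toy CoDel dropper keyed on packet sojourn time through the queue."""
    def __init__(self):
        self.dropping = False
        self.count = 0        # drops in the current dropping episode
        self.next_drop = 0.0

    def should_drop(self, sojourn_time, now):
        # Delay below target: the queue is healthy, leave drop state.
        if sojourn_time < TARGET:
            self.dropping = False
            self.count = 0
            return False
        if not self.dropping:
            # Delay exceeded target: schedule the first drop one interval out.
            self.dropping = True
            self.count = 1
            self.next_drop = now + INTERVAL
            return False
        if now >= self.next_drop:
            # Each successive drop comes sooner: interval / sqrt(count),
            # ramping up the pressure until senders back off.
            self.count += 1
            self.next_drop = now + INTERVAL / math.sqrt(self.count)
            return True
        return False
```

The square-root schedule is what lets CoDel push loss-based senders like Reno
back gently at first, then harder the longer the standing queue persists.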

~~~
metafnord
little side note here: the Windows 10 Creators Update introduced experimental
support for CUBIC [https://www.ietf.org/proceedings/98/slides/slides-98-tcpm-
tc...](https://www.ietf.org/proceedings/98/slides/slides-98-tcpm-tcp-
improvements-in-windows-01.pdf), which is currently the default CC algorithm
used in Linux. However, the slides state that in the absence of AQM, queueing
delay gets worse when CUBIC is used.

------
brutuscat
First saw it at the morning paper: [https://blog.acolyer.org/2017/03/31/bbr-
congestion-based-con...](https://blog.acolyer.org/2017/03/31/bbr-congestion-
based-congestion-control/)

 _This is the story of how members of Google’s make-tcp-fast project developed
and deployed a new congestion control algorithm for TCP called BBR (for
Bottleneck Bandwidth and Round-trip propagation time), leading to 2-25x
throughput improvement over the previous loss-based congestion control CUBIC
algorithm._

------
netheril96
Network performance across national borders within China has been abysmal
since the censorship got much more serious. BBR seems promising, so more and
more people who bypass the GFW with their own VPS (including me) have been
deploying BBR, and seeing marvelous results.

------
huhtenberg
Any data on BBR vs Reno and Vegas sharing?

Link capacity estimation is easy. It's the co-existing gracefully with all
other flow control options that's tricky.

~~~
baq
There's a 'Sharing' section near the end where two scenarios are compared.
Doesn't look like an exhaustive test, rather the opposite.

~~~
huhtenberg
Sure, I read the article. That section covers just CUBIC.

~~~
dsr_
It also doesn't discuss competition between BBR users. Playing nicely with
your neighbors is important for the health of the Internet.

~~~
metafnord
Several BBR flows do actually converge quite nicely to a fair share of
bandwidth. Take a look at the presentation the guys from google gave at the
IETF 97 in Seoul:
[https://www.ietf.org/proceedings/97/slides/slides-97-iccrg-b...](https://www.ietf.org/proceedings/97/slides/slides-97-iccrg-
bbr-congestion-control-02.pdf)

I think there is also a recording of that session somewhere on youtube.

Also, the complete paper can be downloaded for free at
[http://queue.acm.org/detail.cfm?id=3022184](http://queue.acm.org/detail.cfm?id=3022184)

------
abainbridge
Not be confused with BBR enhancing the Mazda MX-5:
[https://www.pistonheads.com/news/ph-japanesecars/mazda-
mx-5-...](https://www.pistonheads.com/news/ph-japanesecars/mazda-mx-5-bbr-
stage-1-turbo-review/36187)

Also significantly reduces latency and increases throughput :-)

~~~
TheSwordsman
It even has a flow-control mechanism (wastegate).

------
emmelaich
This article is not only a great intro to BBR, but an excellent introduction
to the history of flow control.

Congrats to Geoff and his team.

------
skyde
How can we use it today? Is it in the Linux kernel already and easy to enable?

~~~
Scaevolus
It's available in Ubuntu 17.04. Add these lines to /etc/sysctl.conf:

    net.core.default_qdisc=fq
    net.ipv4.tcp_congestion_control=bbr

I got 400Mbps (on a 1Gbps link) from Seattle to New York with a single TCP-BBR
stream, vs ~50Mbps before :-).

~~~
assafmo
Thank you! This is also helpful regarding fq vs fq_codel -
[https://groups.google.com/forum/m/#!topic/bbr-
dev/4jL4ropdOV...](https://groups.google.com/forum/m/#!topic/bbr-
dev/4jL4ropdOV8)

~~~
assafmo
[https://groups.google.com/d/topic/bbr-
dev/4jL4ropdOV8](https://groups.google.com/d/topic/bbr-dev/4jL4ropdOV8)

------
emmelaich
> ... the startup procedure must rapidly converge to the available bandwidth
> irrespective of its capacity

It seems to me that you could make a rough guesstimate by noting the IP
address: whether it's on the same LAN, or on the same continent/AS.

It wouldn't matter if you got it very wrong, as long as you converged quickly
to a better estimate (as you have to do anyway).

------
kstenerud
It seems like the best way to handle this situation is to assume that all
other algorithms are hostile, and to seize as much bandwidth as you can
without causing queue delay. That would reduce the problem set to a basic
resource competition problem, which could then be solved with genetic
algorithms.

------
raldi
For those dying to know what it stands for: Bottleneck Bandwidth and Round-
trip time

------
skyde
Would adding this only to the HTTP reverse proxy machines provide most of the
benefit without having to patch all servers?

This seems to have the greatest effect over WAN links.

------
gritzko
Sounds like they adapted LEDBAT delay measuring tricks.

