
Evaluating TCP BBRv2 on the Dropbox edge network - fanf2
https://arxiv.org/abs/2008.07699
======
throwaway189262
BBR is great. It should be on by default in the Linux kernel.

Unfortunately it's only for TCP, so its usefulness is somewhat limited outside
the datacentre.
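
As a rough illustration of opting in per connection rather than waiting for a kernel default: on Linux, a program can request a congestion control algorithm for a single socket through the `TCP_CONGESTION` socket option. The helper name `try_set_congestion_control` is made up for this sketch. It assumes the requested algorithm (e.g. `bbr`) is loaded and permitted on the host, and simply returns `None` when it is not.

```python
import socket

def try_set_congestion_control(sock, name):
    """Ask the kernel to use `name` (e.g. "bbr") for this one TCP socket.

    Returns the algorithm name the kernel reports afterwards, or None if the
    platform lacks TCP_CONGESTION or the kernel refuses the request.
    """
    opt = getattr(socket, "TCP_CONGESTION", None)  # constant only exists on Linux
    if opt is None:
        return None
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, name.encode("ascii"))
        raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        # Algorithm not loaded, not allowed for this user, or not supported.
        return None
    # The kernel returns a NUL-padded name; keep only the part before the NUL.
    return raw.split(b"\x00", 1)[0].decode("ascii")

if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(try_set_congestion_control(s, "bbr"))
```

The system-wide default is a separate knob (the `net.ipv4.tcp_congestion_control` sysctl); the per-socket option only affects the connection it is set on.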

For us devs, it's sad how little software uses the LEDBAT protocol for bulk
downloads. All of you should be aware of the advantages and use it wherever
possible. It allows bulk data transfer without slowing down priority data
streams. And it works without any OS or router support.
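
To make the "backs off before it hurts anyone" idea concrete, here is a minimal sketch of the per-ACK window update described in RFC 6817. It is heavily simplified: it leaves out the base-delay history, delay filtering, loss handling, and the cap tied to bytes in flight. The function name and the `MSS` value are illustrative assumptions, not taken from any particular implementation.

```python
TARGET = 0.100  # seconds of queuing delay to aim for (RFC 6817 caps this at 100 ms)
GAIN = 1.0      # how strongly to react; RFC 6817 requires GAIN <= 1
MSS = 1452      # segment size in bytes; illustrative value
MIN_CWND = 2    # never shrink below this many segments

def ledbat_on_ack(cwnd, base_delay, current_delay, bytes_acked):
    """Return the new congestion window (bytes) after one ACK.

    base_delay is the lowest one-way delay seen (an estimate of the empty
    path), current_delay is the latest one-way delay sample.
    """
    queuing_delay = current_delay - base_delay
    # Positive while the queue is below TARGET, negative once it is above.
    off_target = (TARGET - queuing_delay) / TARGET
    cwnd += GAIN * off_target * bytes_acked * MSS / cwnd
    return max(cwnd, MIN_CWND * MSS)

if __name__ == "__main__":
    cwnd = 10 * MSS
    # Empty queue: the window grows.
    print(ledbat_on_ack(cwnd, 0.020, 0.020, MSS))
    # Someone else has filled the queue well past TARGET: the window shrinks.
    print(ledbat_on_ack(cwnd, 0.020, 0.320, MSS))
```

Because the window starts shrinking as soon as queuing delay passes the target, a LEDBAT flow yields to loss-based flows that only react once the buffer overflows.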

For those with slow home connections, OpenWrt with SQM CAKE enabled is
incredible. If you turn on the options for per-IP flows, it almost perfectly
eliminates bufferbloat. I run OpenWrt x64 edition on one of my old desktops.
We have abysmally slow DSL at home and at the office, and it's a night-and-day
difference. Multiple people can be torrenting, uploading, and watching video,
and you can still browse the web.

~~~
luizfelberti
Is there any study comparing the impact of BBR vs CUBIC on a LEDBAT stream?

I assume it to be equivalent because of LEDBAT's eager backoff, but it'd still
be a cool thing to read

~~~
throwaway189262
I haven't seen anything. They're both based primarily on RTT (round-trip time)
instead of packet loss, so they should share roughly equally.

The big innovation around these new schedulers is focusing on fairness instead
of maximum throughput. For the longest time TCP flow control was judged only
by how close it got to the theoretical maximum throughput.

So we ended up with extremely aggressive schedulers. The challenge now is to
make a scheduler that limits bufferbloat while not surrendering all its
throughput to old-school bully schedulers. This is where BBR really innovates.

------
danudey
For anyone else who didn't know, BBR stands for "Bottleneck Bandwidth and
RTT", and is a congestion control algorithm.

More (but not much) information is available on Google's BBR repository:
[https://github.com/google/bbr](https://github.com/google/bbr)

------
the8472
The improved ss output mentioned in the paper has helped me immensely in
tracking down what was limiting the upload speed from a corporate network to
AWS. It turned out no middleboxes were to blame, just a misconfiguration in
nginx (receiving side) limiting the HTTP/2 send window (sender side). Yes,
HTTP/2 has a separate send window on top of TCP. Yet another source of errors.

~~~
Matthias247
Yes, it has. And it's one of the prime reasons HTTP/2 is slower for lots of
applications than pure HTTP/1.1.

The tricky part of HTTP/2 is that you now have two flow control algorithms
running on top of each other. And they are rather good at harming each other
instead of cooperating. E.g. an HTTP window that is too small can lead to poor
performance, as you observed. However, an HTTP window that is too big can lead
to excessive buffering in the application for HTTP/2 streams.
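
A back-of-the-envelope sketch of why a small window hurts: a window-limited sender can have at most one window of data in flight per round trip, so throughput is bounded by window / RTT no matter what TCP underneath could do. The 65,535-byte figure is the HTTP/2 default initial window from RFC 7540; the 100 ms RTT and the 1 Gbit/s target are made-up numbers for illustration, and the function names are hypothetical.

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput (bits/s) when limited by a flow-control window."""
    return window_bytes * 8 / rtt_seconds

def window_for_bandwidth(bandwidth_bps, rtt_seconds):
    """Window (bytes) needed to keep a path of the given bandwidth and RTT full."""
    return bandwidth_bps * rtt_seconds / 8

if __name__ == "__main__":
    default_window = 65_535  # HTTP/2 initial flow-control window (RFC 7540)
    rtt = 0.100              # assumed round-trip time in seconds

    limit = max_throughput_bps(default_window, rtt)
    print(f"default window caps a stream at about {limit / 1e6:.2f} Mbit/s")

    needed = window_for_bandwidth(1e9, rtt)  # assumed 1 Gbit/s path
    print(f"filling 1 Gbit/s needs about {needed / 1e6:.1f} MB of window")
```

The second number is the flip side of the trade-off: a window large enough to fill a fast, long path is also the amount the receiver may have to buffer per stream.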

It's a rather hard challenge to get to a good compromise solution here (and
not a lot of people have arrived there yet).

When in doubt, the rule of thumb is to use HTTP/2 for small requests, where the
benefit of multiplexing is biggest, and to stick to HTTP/1.1 (a non-multiplexed
TCP connection) for high throughput.

~~~
mobilio
This is fixed in HTTP/3.

~~~
Matthias247
How do you define "fixed"? QUIC still defines per-stream plus per-connection
flow control windows, on top of the lower-level congestion control algorithm.

~~~
xxpor
Does it not use UDP?

~~~
mobilio
It uses UDP simply because you can't change all the TCP infrastructure
worldwide.

Example: in some places I'm still using a Linksys WRT54GL router with a 2.4
kernel that is almost 15 years old, because the device still works and the
network requirements there are low.

~~~
xxpor
Sure, I totally get that. But my point is UDP doesn't have any underlying
congestion control you have to fight.

------
luizfelberti
Here are a few questions I have for any lurking network engineers "in the know",
since there doesn't seem to be an official paper for BBRv2 yet (or at least I
couldn't find it on
[https://research.google/pubs/](https://research.google/pubs/)):

- Is BBRv2 still modeled close to the PID equation, in a way that expanding
the equation as shown by BBRv1x [0] is still possible, so that we can tune for
different target parameters?

- How tunable is BBRv2? For example, say I'm doing some voice chat app, and
in my use case an increased sensitivity to ECN is presumably better (I read
somewhere that Facetime does this, but regardless of that, let's just assume
it's true for a moment). Can I tune that? Or disable ECN-awareness entirely?
Is there any good reference material on the parameters BBRv2 exposes?

- I've never seen any version of BBR compared to TIMELY [1], which is also from
Google. Even though I assume TIMELY would be worse given that it only takes
RTT into consideration, it'd still be cool to see.

- BBR is great for clients, but it ultimately depends on trusting other
clients in the network to be well behaved, else the game-theoretical fairness
doesn't work out. What is the SOTA algorithm for AQM on middle-boxes nowadays,
and for preventing bad clients from hogging link capacity?

Questions aside though, it's great to see more algorithms leveraging ECN as a
signal. Good stuff, kudos to the researchers!

[0]
[https://www.youtube.com/watch?v=PeYPqnLhUuc](https://www.youtube.com/watch?v=PeYPqnLhUuc)

[1]
[https://research.google/pubs/pub43840/](https://research.google/pubs/pub43840/)

~~~
the8472
> What is the SOTA algorithm for AQM on middle-boxes nowadays, and preventing
> bad clients from hogging link capacity?

fq_pie for constrained systems. fq_codel or cake if you have enough CPU
cycles. They can be operated in ECN-aware mode.

------
Diederich
[https://lwn.net/Articles/701165/](https://lwn.net/Articles/701165/) is a nice
writeup.

