
Fixing bufferbloat on your home network with OpenBSD 6.2 or newer - paulsmith
https://pauladamsmith.com/blog/2018/07/fixing-bufferbloat-on-your-home-network-with-openbsd-6.2-or-newer.html
======
Animats
It's discouraging to me that, 33 years after I first identified this problem
[1], there are no good solutions in wide use.

Informally, the basic problem is that pure datagram networks with no
backpressure, which describes the Internet, handle congestion badly. A big
first-in, first-out queue at a choke point works especially badly. That's
"bufferbloat".

Back in 1985, I proposed "fair queuing" (a term I coined) as a step to a
solution. Fair queuing is simply identifying "flows" (packets with the same
endpoints, which may be IP addresses or TCP/UDP ports), giving them individual
queues, and servicing the queues fairly. I also proposed making TCP
congestion-aware, a new idea at the time.[2] That was enough to deal with the
problems of the 1980s and 1990s. I did not foresee a future where people would
be trying to run Netflix, Fortnite, and VoIP on the same cable modem
connection at the same time.
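In sketch form, fair queuing is just that: one queue per flow, served round-robin. The following is a toy model for illustration only (the dict-based packet format is invented, and real classifiers hash the 5-tuple into a fixed number of buckets):

```python
from collections import OrderedDict, deque

class FairQueue:
    """Toy fair queuing: one FIFO per flow, serviced round-robin."""

    def __init__(self):
        self.flows = OrderedDict()  # flow id -> deque of waiting packets

    def enqueue(self, packet):
        # A "flow" is packets sharing endpoints (addresses and ports).
        flow_id = (packet["src"], packet["dst"], packet["sport"], packet["dport"])
        self.flows.setdefault(flow_id, deque()).append(packet)

    def dequeue(self):
        # Serve the least-recently-served non-empty flow queue.
        if not self.flows:
            return None
        flow_id, q = self.flows.popitem(last=False)  # head of the round robin
        packet = q.popleft()
        if q:  # move this flow to the back so the others get a turn first
            self.flows[flow_id] = q
        return packet
```

With this, a flow that enqueues five packets and a flow that enqueues one get served A, B, A, A, A, A: the small flow's packet goes out second instead of sixth, which is the whole point.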

The Internet works only because most of the congestion is at the edges, where
the user's LAN feeds an ISP connection with less bandwidth than the LAN. Most
packets are lost there. If they're lost further upstream (say at the cable
headend), the problems are much worse. We still can't deal with congestion in
the middle of the network. Fortunately, fiber optic bulk bandwidth is cheap
enough to prevent that from being the big bottleneck.

Most of the "bufferbloat" aftermarket fixes work by assuming the ISP
connection has a fixed data capacity. So the user-side gear does rate-
limiting, reordering, and dropping packets to handle congestion locally, to
prevent the dumb FIFO queue in the ISP's edge router from building up. This
can work if the ISP connection has constant outgoing bandwidth. If that
varies, as on an overloaded cable segment, there's going to be trouble. And,
of course, the ISP connection doesn't tell the user side nodes it's congested.
So there's a lot of guessing and tweaking involved, which is why none of these
fixes Just Work.
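The rate-limiting half of those fixes is essentially a token bucket set just below the guessed uplink capacity, so the ISP's FIFO never builds up. A minimal sketch with made-up numbers (not any particular router's implementation):

```python
class TokenBucketShaper:
    """Toy traffic shaper: release at most `rate` bytes/sec, smoothing
    bursts so the ISP's dumb FIFO never fills. The rate is a guess at
    the uplink's real capacity -- the guessing complained about above."""

    def __init__(self, rate_bytes_per_sec, burst_bytes):
        self.rate = rate_bytes_per_sec
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full burst of credit
        self.last = 0.0

    def try_send(self, size, now):
        # Accrue credit since the last check, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True   # packet may go out now
        return False      # hold (or drop) until credit accrues
```

If the true uplink rate sags below the configured `rate` (the overloaded-cable-segment case), the bucket keeps releasing packets too fast and the upstream FIFO fills anyway, which is exactly the failure mode described above.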

There are two levels of troublesome FIFO queue - within each host on the LAN,
and at the router that connects to the ISP link. Each host has to decide what
to send first. The default is FIFO, which, as noted, sucks. Then the router
has to decide which packets from which local nodes to send up the ISP link
first. Most ISP-provided routers are still FIFO, although some are more
intelligent. A basic property of FIFO queues is that the one who sends the
most wins. The nice guys who aren't blasting stuff up the pipe get squeezed
out. This is why your VoIP stutters.

So there's a trend towards front-ending the ISP's router with another box to
do traffic-shaping, which means reordering and dropping packets. That's what
this article is about. There are commercial "gamer routers" which do this, and
firmware for various routers.

Now, with the ability to shape the traffic, the question is what to do. Basic
fair queuing prevents a big stream from squeezing out a small stream. That's
step one, and that's what the parent article is talking about. He's stopped
Speed Test from squeezing out his pings.

But that may not be enough. If one node is frantically making large numbers of
short HTTP connections, the usual case for an ad-heavy and tracker-heavy web
page, those may all look like separate flows to the router and get a big
fraction of the bandwidth. That's no good. Now you have to start defining
policy rules, which is a huge pain.

Some of the "gamer routers" come with policy rules that know too much about
specific games: move-and-shoot packets get priority over texture updates. It's
often enough to prioritize UDP packets over TCP until a UDP flow hits some
relatively low bandwidth limit. If you can give a game low latency for the
most important 5% of its traffic, it will often play well.

Arguably, each host on the local network should prioritize its own outgoing
traffic, leaving the next router upstream to deal with prioritization between
nodes. But that requires each node to know something about what the next
router upstream is doing. There's no mechanism for this. All players are
guessing what the other players are doing by observing round trip time and
latency. They don't talk to each other about this.

And that is why this area is still a mess.

John Nagle

[1] [https://tools.ietf.org/html/rfc970](https://tools.ietf.org/html/rfc970)
[2] [https://tools.ietf.org/html/rfc896](https://tools.ietf.org/html/rfc896)

~~~
mhneu
Would you opine on algorithms to control queues (e.g. RED, CoDel), and to
respond to congestion (e.g. CUBIC, BBR- at TCP level, correct?)? Will they get
anywhere? Or is a totally new approach needed?

[https://cloudplatform.googleblog.com/2017/07/TCP-BBR-congestion-control-comes-to-GCP-your-Internet-just-got-faster.html](https://cloudplatform.googleblog.com/2017/07/TCP-BBR-congestion-control-comes-to-GCP-your-Internet-just-got-faster.html)

~~~
wtallis
I don't think a totally new approach is needed. We already have what we need
to completely solve bufferbloat. What we're missing is universal deployment.

If we could get fq_codel on every router, modem and switch, then the problem
would be gone and we wouldn't need any of the fancier TCP congestion control
algorithms (but ECN-capable TCPs would still be nice to have).

If we could get BBR on every server and client device, then that would mostly
solve bufferbloat (for TCP traffic), but probably wouldn't be as good as
having fq_codel throughout the network.

Having partial deployments of both is sub-optimal, but even the potential
negative interactions between delay-based TCP congestion control and delay-
eliminating AQM should still be better than not having either and suffering
the full effects of bufferbloat.

~~~
AstralStorm
How do you handle CoDel at multigigabit transfer rates on limited hardware?

~~~
wtallis
The ideal solution is to put CoDel in hardware where it can be cheap. FQ-CoDel
would be better, but would require many times more silicon. It's probably
worth it, but it's a harder proposition to sell to the ASIC designers.

Even on current hardware, anything that's currently handling packets with a
CPU instead of fixed-function hardware should be able to add CoDel without
sacrificing too much throughput. Home routers running SQM-style QoS run into
performance problems not because of CoDel or fq_codel but because of the
traffic shaping. When everything has AQM, you no longer need a traffic shaper
on your home gateway router, just AQM, and the CPU requirements for that are
vastly lower.

------
y0ssar1an
Fix it on Linux by enabling BBR TCP congestion control.

    
    
      sudo tee -a /etc/sysctl.conf << EOF
      net.core.default_qdisc=fq
      net.ipv4.tcp_congestion_control=bbr
      EOF
      sudo sysctl -p
    

1. You will need a recent kernel (BBR landed in Linux 4.9).

2. net.ipv4.tcp_congestion_control=bbr applies to IPv6 too.

3. You must set the queuing discipline to fq or it won't work, since BBR relies on fq for packet pacing.

~~~
jdc
Running 30mbps fibre and getting marginally better results (upload especially)
with these settings:

    
    
        net.core.default_qdisc = fq_model
        net.ipv4.tcp_congestion_control = cubic

~~~
aorth
Guessing you meant fq_codel instead of fq_model?

~~~
jdc
Yeah, good catch.

------
bscphil
I'm quite aware that I have bad bufferbloat problems. Unfortunately
suggestions like these are kind of useless to me, and I'm surprised that
they're not useless for more people. My (cable) residential internet averages
about 110mbps. However, sometimes during the day it will go down to 85mbps or
so, and later at night it's usually about 120mbps, and I've seen up to 135.

The way these congestion algorithms work, you've got to let the algorithm
limit your bandwidth to ~95% of what's actually available. If I always got
_exactly_ 110mbps, I'd be okay with that. The problem is that I've got to
limit it to 80 mbps or so for it to be useful during the day. And there's no
way I'm going to do that and lose 40-50 mbps of my bandwidth every night.

Widely varying connection speeds are common enough that I'm surprised many
people have found these algorithms useful. I wish there was one intelligent
enough to respond dynamically to however much bandwidth is available.

~~~
PhantomGremlin
My cable is very asymmetric: 150/5 nominal, 170/6 best case.

I just now experimented with creating _only_ the outgoing queue on my external
interface. That's the slow direction. I went from a D in bufferbloat to an A,
with just that one line addition.
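For reference, the kind of one-line addition meant here, in the OpenBSD 6.2+ queueing syntax (interface name and rates are hypothetical; check pf.conf(5) on your release for the exact grammar):

```
# /etc/pf.conf -- shape outbound to just under a nominal 5M uplink so the
# modem's own FIFO never fills. "flows" enables per-flow fair queuing
# (FQ-CoDel) within the queue.
queue outq on em0 flows 1024 bandwidth 4500K max 4500K qlimit 1024 default
```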

Can you live with reducing just your upload speed? I sure can, I rarely upload
much. But even if I were doing large amounts of cloud backup, it might not
matter. If I go from 5000K to 4500K, is that really such a loss? Am I going to
cry over 10%?

And if there is congestion such that my actual upload speed falls below 4500K,
it's possible, maybe even likely, that I'm no worse off than before I created
my upload queue.

Unfortunately the DSL Reports speedtest only tests one direction at a time. So
maybe my fix doesn't work well if there is significant bidirectional
simultaneous traffic?

Fortunately I'm running OpenBSD on my firewall, so it was very easy to
experiment with this.

~~~
bscphil
I'm running OPNsense here, so it's also fairly easy to manage. If I
interpreted the DSL Reports test correctly, it looks like I see bufferbloat
(median ~1.5 seconds) when downloading, but not a significant amount when
uploading. So I don't think shaping only the outgoing traffic would help me.
But maybe I'm misunderstanding.

------
w8rbt
Great article. OpenBSD makes it so simple to do this.

Anyone using Ubiquiti Edge X routers can apply this too (Linux under the hood)
in almost the exact same fashion. Works great even on slow DSL links.

~~~
yumraj
What exactly would you do on the EdgeRouter X? Do you mind sharing the commands
or pointing to some link?

~~~
Arie
Just enable smart queue QoS in the web interface. The ER-X should be able to
handle links up to about 150-180Mbit with Smart Queue QoS enabled.

------
dbolgheroni
Didn't know about this issue until I saw this article.

With 2 lines added to conf, I managed to go from C, D grades to A almost
instantly.

~~~
tbrock
Same, why haven’t the defaults changed after all these years?

~~~
gpvos
Because you need to specify the bandwidth, which the router maker cannot know?

~~~
AstralStorm
Not really, fq_codel + bbr can work based on RTT changes alone.

------
minton
I have 1000/1000 fiber and I can’t make a video call over Slack. People claim
my video is freezing and they’re only getting every other word. I wonder if
bufferbloat could be an explanation.

~~~
deagle50
Unlikely, unless another host on your LAN is saturating all 1000Mbit. My first
guess would be poor wifi, then funky shaping on the ISP side. Try it wired, and
if that doesn't help, try it over a VPN.

------
Lunatic666
The first question that came to my mind was whether OpenBSD now supports fast
enough WiFi devices to work as a proper access point – it looks like the last
time I tried was quite a while ago:
[https://undeadly.org/cgi?action=article&sid=20101216231634](https://undeadly.org/cgi?action=article&sid=20101216231634)

It supports Realtek USB adapters, and I remember how easy it was to enable
traffic shaping with pf and always have lag-free ssh consoles!

------
ksec
More and more ISPs are providing the router / modem / ONT by default; it's one
of the reasons Apple exited the router market. The problem is their WiFi sucks,
and they have no incentive to provide you better WiFi or any sort of QoS.

So for many consumers, it doesn't seem like bufferbloat will be fixed anytime
soon.

One thing we could do is educate customers and hope that Speedtest and
Netflix's Fast.com include a bufferbloat grade like the one in the DSLReports
test.

------
tyfon
I got a B in bufferbloat and 512 mbit down and 517 mbit up on my 500/500 fiber
link. It seems I could do better when downloading, though; the bufferbloat on
that graph is 50 ms.

I'm going to experiment a bit with my openbsd firewall.

