
Bufferbloat - rcfox
http://apenwarr.ca/log/?m=201101#10
======
gjm11
I don't understand his description of why SFQ + TBF is disastrous. "Well,
given that SFQ has inserted a whole bunch of fairness into the queue and the
packets are no longer bursty... your interactive session has about a 50%
chance of having its packet dropped by TBF." But, er, the situation he's
talking about is one where your interactive session contributes only 1% of the
packets, and no matter how much fairness SFQ inserts it can't make the next
packet as likely to be from your interactive session as not.

The reduced burstiness seems more relevant: suppose your bandwidth hog session
is only using 10x more packets than your interactive one overall, but it's
using them in bursts; then "when there's been too much data lately" that's
probably because you're in a burst of bandwidth-hoggery, so there's much less
than a 1/10 chance that the next packet is from your interactive session.

Only, that doesn't quite make sense, because for the bandwidth hog to be
hurting your interactive session the latter needs to be trying to send/receive
packets while the bandwidth hog is being a bandwidth hog, or at least soon
enough after that the buffers are still full of hogswill. In which case, that
next packet is surely quite likely to be from your interactive session after
all.

Network experts: What am I missing here?

[Edited to add: My best guess is that what I'm missing is that "soon enough
after that" may be quite a long time after. In any case, I don't see any way
for what _apenwarr_ wrote to be right.]

~~~
akira2501
The reason he had those problems, I'm guessing, is that he never set up
traffic-classing filters.

Generally, I've never run _just_ TBF/HTB + SFQ. I've always run them
hierarchically, HTB on the top interface, then several HTBs feeding into that
one. Each of the lower HTB queues has a different amount of "flow" allowed to
it.

Once you approach it this way, it all works rather nicely: the multiple queues
let you divide bandwidth, and also assign priority, within the higher-level
queue. I can run telnet/ssh sessions during a torrent swarm with no issues.
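A minimal sketch of the kind of hierarchy described above (the interface name, rates, and class layout are all assumptions, not the commenter's actual setup): HTB at the root capped slightly below the link rate, child HTB classes with different rates and priorities, and SFQ on each leaf.

```shell
# Root HTB; unclassified traffic falls into class 1:20 (bulk).
tc qdisc add dev eth0 root handle 1: htb default 20

# Parent class capped just below the physical link rate so our
# queue, not the modem's, is the bottleneck.
tc class add dev eth0 parent 1: classid 1:1 htb rate 9500kbit

# Interactive class: small guaranteed rate, may borrow up to the cap, prio 0.
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 1000kbit ceil 9500kbit prio 0
# Bulk class: gets the remainder at lower priority.
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 8500kbit ceil 9500kbit prio 1

# SFQ on each leaf for per-flow fairness within a class.
tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev eth0 parent 1:20 handle 20: sfq perturb 10

# Classify ssh traffic into the interactive class.
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip dport 22 0xffff flowid 1:10
```

Requires root; replace `eth0` and the rates with your own interface and measured uplink speed.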

------
xtacy
Instead of Stochastic Fairness Queuing, a better way to approach the
bufferbloat problem from the end host is to enable RED/ECN on all client
hosts. However, as the author mentions, this is going to massively hurt the
non-ECN capable hosts in the network.
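On a Linux host, enabling ECN is a one-line sysctl (a sketch; the interface and RED parameters in the second part are illustrative assumptions):

```shell
# Ask for ECN on outgoing connections and accept it on incoming ones
# (0 = off, 1 = always request, 2 = only when the peer requests it).
sysctl -w net.ipv4.tcp_ecn=1

# Persist across reboots:
echo "net.ipv4.tcp_ecn = 1" >> /etc/sysctl.conf

# For the router side, a RED qdisc that marks (ecn) instead of dropping;
# the numbers below are example values for a ~10Mbit link, not tuned advice.
tc qdisc add dev eth0 root red limit 400000 min 30000 max 90000 \
    avpkt 1000 burst 55 ecn bandwidth 10mbit probability 0.02
```

ECN only helps end-to-end if the bottleneck queue actually marks packets, which is why the router-side RED/ECN step matters as much as the host sysctl.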

This approach is "better" because it is analytically proven that for a number
of fair TCP stacks competing on an ECN enabled link, the average buffer
occupancy can be controlled "around" a certain operating value, which puts a
bound on the latency contributed by a particular link.

Also, I wonder if the bufferbloat problem becomes significant only in really
small-RTT networks, i.e., where the queuing delay is at least the same order
of magnitude as the RTT. On long-distance links, my guess is that propagation
delay dominates latency.

I disagree with the point that a solution to bufferbloat is purely end-to-end,
if that's what the author implied (apologies if the author didn't mean this).
Even if you have perfectly shaped end hosts, if they share a bottleneck link
that's deeply buffered, both will suffer latency problems.

Queuing/buffering in networks is a very nice research area. "How much
buffering do we need in a network?" "What if I run a network with as small a
buffer as possible?" Some people believe there should be only O(1) buffering
in the network. It has been shown, theoretically and experimentally, that TCP
loses only about 30% of its throughput if the bottleneck link in the network
has a buffer of just ONE PACKET! So the question is: is a 30% throughput loss
acceptable? If latency matters, then perhaps yes.
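On a Linux box you can crudely experiment with the small-buffer end of this trade-off by shrinking the local transmit queue (a sketch; the interface name is an assumption, and this only affects the host's own queue, not driver rings or modem buffers):

```shell
# Shrink the interface transmit queue from the usual default (often 1000
# packets) down to a handful, then watch throughput vs. latency change.
ip link set dev eth0 txqueuelen 4

# Inspect the result:
ip link show eth0
```

This is a blunt instrument rather than a real answer to the buffer-sizing question, but it makes the latency/throughput trade-off visible on your own link.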

~~~
ajb
"this is going to massively hurt the non-ECN capable hosts in the network"

Where does the author say that? And why would it? ECN wasn't designed to screw
non-ECN hosts; it's supposed to be incrementally deployable. AFAIK, the only
reason it hasn't been deployed is because a small percentage of home gateways
are buggy WRT ECN.

~~~
grogers
Actually he says the opposite:

(This isn't cheating; you aren't hurting anybody else's performance by turning
on ECN for only one machine. Like magic, ECN is just always an improvement,
and always fair. Nowadays, chances are that the reasons ECN is off by default
don't apply to you anymore; try turning it on and feel the love.)

------
m_eiman
A simpler way than the _"Super complex UI, lots of fiddly magic numbers, and
no documentation. You're sort of doomed."_ method in the article:
wondershaper. It's in Ubuntu and as easy as wondershaper eth1 9500 9500 (for a
symmetrical 10Mbps connection). You'll probably be able to fine-tune better
settings manually, but this is a good start.
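For completeness, the full workflow on Debian/Ubuntu looks roughly like this (the classic wondershaper takes downlink and uplink in kbit/s; 9500 leaves headroom below a 10Mbps link so your own queue stays the bottleneck):

```shell
# Install the classic wondershaper script.
sudo apt-get install wondershaper

# Shape eth1: ~9.5Mbit down, ~9.5Mbit up.
sudo wondershaper eth1 9500 9500

# Undo the shaping when you're done experimenting.
sudo wondershaper clear eth1
```

Setting the rates a few percent below what a speed test reports is the usual starting point; too high and the modem's buffer fills anyway, too low and you waste bandwidth.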

~~~
viraptor
I think you're missing the same thing the author missed in Jim Gettys'
article. Contrary to what the post says, you cannot easily fix every
situation if the ISP is not behaving correctly. He was able to fix this
specific scenario because he was the bottleneck (he was generating more
traffic than the link could deliver). The original series of articles (I
really recommend reading them - they contain a great deal of technical
detail) dealt with a somewhat different situation: the person was not even
saturating the link. Problems started appearing because of shaping on a link
which should actually have supported the traffic. He could have introduced a
fake bottleneck, but that wasn't really the point, because he would be
wasting some available bandwidth. The problem there was mostly out of his
control, although it could be mitigated to some extent by local tuning
([http://gettys.wordpress.com/2010/12/13/mitigations-and-
solut...](http://gettys.wordpress.com/2010/12/13/mitigations-and-solutions-of-
bufferbloat-in-home-routers-and-operating-systems/)).

I didn't mean to say your solution won't work. But it's not going to solve
all problems - it works when you / your direct link is the bottleneck.

~~~
m_eiman
Sure, but at least then I've done what I can on my end and can blame any
remaining problems on my ISP with a bit more confidence :)

------
davidu
As Jim Gettys explains, this is actually a complicated problem, and while
traffic-shaping on one end of the pipe can help, there is nothing you can do
about your head-end's buffers or your stupid middlebox buffers.

This is a genuinely frustrating problem that won't easily be solved. I
suggest you all read Jim's original posts on the matter, as they are
enlightening and explain how TCP windows work and how your ISP's
configuration can mess it all up.

------
CrLf
I remember fiddling around with this a few years back; I even made my own
shaping script somewhat based on wondershaper (<http://www.carlos-
rodrigues.com/files/unmaintained/ctshaper/> \- which probably still works).

At the time I got a huge improvement in interactive response and concurrency
on my 512Kbps/128Kbps DSL connection. Without it, it was impossible for me to
play an online game while another computer in the house was downloading
something.

The benefits diminish as bandwidth increases, since concurrency gets much
better without any special measures. So the gains from the black art of
traffic shaping become much harder to come by.

------
pragmatic
Does anyone have an idea whether this holds true for cable connections? Also,
do most high-end wireless routers handle this well?

~~~
nas
Most cable modems and wireless routers have similar problems. You can easily
test your cable connection. Upload a large file to a fast server while
watching ping times.
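One way to run the test described above (host names and file name are placeholders, not real endpoints):

```shell
# Baseline: ping with the link idle, note the typical round-trip time.
ping -c 20 8.8.8.8

# Now saturate the uplink while pinging in the background.
ping -c 60 8.8.8.8 > ping_during_upload.txt &
scp bigfile.bin user@fast-server.example.com:/tmp/
wait
```

If the round-trip times in `ping_during_upload.txt` jump from tens of milliseconds to hundreds or thousands while the upload runs, your modem or router is buffering far more than it should.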

