
Google fixes nearly decade-old Linux kernel TCP bug - tim_sw
http://bitsup.blogspot.com/2015/09/thanks-google-tcp-team-for-open-source.html
======
nathanb
I would love to see a characterization of this bug's impact. Reading the
description, it doesn't seem to be nearly as serious as the author of this
post is making it out to be. But there's no information about how fast the
client will step up the transmit rate after a period of quiescence and no data
from traffic analysis to determine how frequently real-world servers encounter
this condition and the overall effect in terms of dropped packets, increased
queue depths, and latency spikes.

(Google are actually in a good position to provide these data; a number of
Google's own services will doubtless benefit. I suspect the scale of Google's
operations means that even a barely-measurable decrease in latency or queue
depth compounds to a big gain pretty quickly.)

~~~
cornholio
Strictly speaking, it can't be too bad, since it would immediately stand out
if Linux boxes had decreased network performance in a mixed shop. Maybe a few
percent in the average case, with some specialized applications more
affected.

------
eridius
I'm confused by the last sentence:

> _The whole web, including Firefox users, will benefit._

Why is Firefox thrown in here? And the post is even tagged with "firefox". But
the contents of the post seem to have absolutely nothing to do with Firefox,
except inasmuch as Firefox is an application that interacts with the web.

~~~
TimothyFitz
The author works for Mozilla, and Firefox is a common theme on his blog. I
think it's for regular readers of his blog to understand the practical impact
(readers he assumes care more than average about Firefox).

------
zurn
The commit also says "This particularly shows up when slow_start_after_idle is
disabled as a dangerous cwnd inflation (1.5 x RTT) after few seconds of idle
time." And the slow_start_after_idle sysctl is enabled by default.

The bug fix is in the CUBIC congestion control algorithm, which has been the
default in Linux TCP since kernel 2.6.19 (released in November 2006). So
impact in distros has been ~2008-2015.
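
To make the cwnd inflation concrete, here is a rough sketch of the CUBIC
window curve as given in RFC 8312: W(t) = C(t - K)^3 + W_max, with
K = cbrt(W_max(1 - beta) / C). The constants C = 0.4 and beta = 0.7 come from
that RFC, not from the thread, and the names `cube_root`, `cubic_window`, and
`epoch_elapsed` are hypothetical. It assumes the fix amounts to moving the
epoch start forward by the idle time, so that idle seconds do not count as
time spent growing the window. The kernel's fixed-point code differs in
detail.

```c
#include <assert.h>

/* Constants from RFC 8312 (assumed here; the thread does not state them). */
#define CUBIC_C    0.4  /* scaling constant */
#define CUBIC_BETA 0.7  /* multiplicative decrease factor */

/* Cube root of a non-negative value by Newton iteration (avoids libm). */
double cube_root(double v)
{
    assert(v >= 0.0);
    if (v == 0.0)
        return 0.0;
    double x = v > 1.0 ? v : 1.0;
    for (int i = 0; i < 100; i++)
        x = (2.0 * x + v / (x * x)) / 3.0;
    return x;
}

/*
 * CUBIC window in segments, t seconds after the current epoch started:
 * W(t) = C * (t - K)^3 + W_max, where K = cbrt(W_max * (1 - beta) / C).
 */
double cubic_window(double w_max, double t)
{
    double k = cube_root(w_max * (1.0 - CUBIC_BETA) / CUBIC_C);
    double d = t - k;
    return CUBIC_C * d * d * d + w_max;
}

/*
 * Time fed into the curve. With shift_epoch set, the epoch start is moved
 * forward by the idle duration (the assumed shape of the fix); without it,
 * the idle period is counted as if the sender had been probing all along.
 */
double epoch_elapsed(double now, double epoch_start, double idle,
                     int shift_epoch)
{
    if (shift_epoch)
        epoch_start += idle;
    return now - epoch_start;
}
```

With W_max = 100 segments, two seconds of sending followed by ten idle
seconds gives t = 12 without the shift and t = 2 with it, so the unshifted
curve hands back a much larger window than the connection ever probed for.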

------
userbinator
_Remember that TCP is robust enough that it seems to work anyhow - even at the
cost of reduced network throughput in this case._

That to me sounds like this is not really a bug, but a performance
enhancement; a bug would be something like corrupted/missing/extra data at the
application level. Tuning the TCP algorithms beyond the basics is really more
of an art.

~~~
smegel
I thought that too, but then

> that an endpoint that is not moving any traffic cannot use the lack of
> errors as information in its feedback loop.

I think the bug is that it was counting something when it should not have
been, which would have violated both the design and intent even at the time it
was written.

But I agree that code is only a bug if it does not do what was originally
intended; if the original design was bad, and the code correctly implemented
that bad design, there is no bug.

~~~
sigmaml
> But I agree that code is only a bug if it does not do what was originally
> intended; if the original design was bad, and the code correctly implemented
> that bad design, there is no bug.

The frame of reference for determining if something is a bug or not is the
requirement specification, not design. If the design incorrectly or
inadequately addresses the requirement, that is a bug as well. There isn't
much point in rejoicing over the code that correctly implements an incorrect
design!

~~~
smegel
I think the frame of reference is the programmer intent. If he/she intended to
check for equality but wrote:

    if (x = 42) {

Then that is most certainly a bug.

If a bit of text was supposed to be blue but came out green, maybe because the
person writing the requirements doc got it wrong, it is hardly a bug in the
code. The code is doing what the programmer intended, even if that is not what
the user wanted. That's why the world has moved on from "bug trackers" to
"issue trackers".

> There isn't much point in rejoicing over the code that correctly implements
> an incorrect design!

And there isn't much point blaming a coder for correctly implementing an
incorrect design, especially if the design document is all they have to go on.

~~~
sigmaml
> I think the frame of reference is the programmer intent.

Sure. However, that is secondary. The sort of mistakes that you mention can be
caught during code review or unit testing, completely independently of what
the actual user requirement is. This post, evidently, does not involve such a
case.

> The code is doing what the programmer intended, even if that is not what the
> user wanted.

Your perspective appears to be inside-outward. It may be helpful during a
performance appraisal, but does the business no good!

Edit: inside-out --> inside-outward.

~~~
drdeca
Could it be that "bug" is defined from an "inside->outward" perspective
regardless of whether that is the most useful perspective for the situation?

------
thecaviardoer
Honest question: How is it that Google is the company that's always doing all
of this involved technical work? Or is it just that their work gets publicity?

~~~
Tobu
Éric Dumazet (author of the patch) is this highly productive guy working on
network latency issues. He helped get CoDel into the kernel, and worked on TCP
small queues
([https://lwn.net/Articles/507065/](https://lwn.net/Articles/507065/)) and the
fair queue scheduler
([https://lwn.net/Articles/564978/](https://lwn.net/Articles/564978/)).
Looking at the email addresses he used, I think he started working on the
network stack in his spare time, while employed at SFR, a French telco. He
implemented an in-kernel JIT for BPF
([https://lwn.net/Articles/437981/](https://lwn.net/Articles/437981/)),
speeding up packet filtering; now that BPF is fast it is also used for
performance profiling (filtering and aggregating in kernel) and syscall
filtering.

There are of course others working in these areas, but I think Google has been
going after them: Van Jacobson is a high-profile example. Dave Taht is still
subsisting on ramen though
([https://www.patreon.com/dtaht](https://www.patreon.com/dtaht)).

~~~
sandGorgon
_For the last five years I've worked myself to the bone, mostly unpaid, to
solve the "bufferbloat" problem - first by organizing and leading the team
that solved it, and giving away, for free, the ideas and code for anyone to
use! And I've spent tons of time later convincing standards organizations like
the IETF to make the ideas standard in new equipment - and also open source
makers like openwrt to retrofit the fq_codel and cake fixes into millions of
older yet upgradable home routers, and helping ISPs, vendors and chipmakers
understand the both the need for the fq_codel fixes, and how to implement
them._

Wow, thanks for posting about Dave Taht. Someone like Netflix or Google
should get him on their payroll.

------
toolslive
There is a talk
([https://www.youtube.com/watch?v=gQsOo_skjzk](https://www.youtube.com/watch?v=gQsOo_skjzk))
where they show that plenty of similar bugs are left in TCP implementations.
It's a mess.

~~~
mtkd
But less of a mess today than it was 17 days ago.

------
zaroth
Speaking of Linux kernel network stack enhancements, there's one I'm
interested in but not sure where to start.

I was trying to set up ECMP on Ubuntu 14 and found I was constantly getting TCP
RSTs. After digging into the code, I found that the path selection algorithm is
not a consistent hash of a packet header tuple but is instead chosen
pseudo-randomly.

Older kernels had a route cache, so a path would be chosen and cached, giving
a stable ECMP route per source. The route cache was removed, so now the random
selection runs on every packet, leading to unstable routing.

It seems a simple fix: choose the right hash function and maybe add some
configuration flags to determine which fields are hashed on (typically the
5-tuple: SrcIP, DstIP, IP Protocol, TCP Src Port, TCP Dst Port).
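
A minimal sketch of that idea, assuming a flow is identified by the 5-tuple
listed above and the next hop is picked as the hash modulo the number of
paths. FNV-1a is used only because it is short; it is an illustrative
stand-in, not the hash the kernel uses, and `struct flow_key`, `fnv1a_mix`,
`flow_hash`, and `select_path` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 5-tuple that identifies a flow (illustrative layout). */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  protocol;
};

/* Fold the low nbytes of value into a running 32-bit FNV-1a hash. */
uint32_t fnv1a_mix(uint32_t hash, uint32_t value, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        hash ^= (value >> (8 * i)) & 0xffu;
        hash *= 16777619u;              /* FNV prime */
    }
    return hash;
}

/* Hash every field of the 5-tuple, starting from the FNV offset basis. */
uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;
    h = fnv1a_mix(h, k->src_ip, 4);
    h = fnv1a_mix(h, k->dst_ip, 4);
    h = fnv1a_mix(h, k->protocol, 1);
    h = fnv1a_mix(h, k->src_port, 2);
    h = fnv1a_mix(h, k->dst_port, 2);
    return h;
}

/* Map a flow to one of num_paths equal-cost next hops. */
unsigned int select_path(const struct flow_key *k, unsigned int num_paths)
{
    assert(num_paths > 0);
    return flow_hash(k) % num_paths;
}
```

Because the result depends only on the header fields, every packet of a given
TCP connection lands on the same next hop for as long as the number of paths
stays the same, which is what avoids the stray RSTs.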

But what would be incredibly cool is if my patch could one day even be merged
into mainline. Is it reasonable to just blast a 'here's what I'm trying' out to
the netdev mailing list to get some feedback? I must admit I'm a bit
intimidated to post...

~~~
jfoks
Absolutely go for it! Post it even if you feel uncertain about it, just be
clear about that, and about what your patch is trying to achieve, when the
unexpected behaviour happens, and try to include a way for others to
reproduce/investigate (as simple as possible, perhaps a program that
demonstrates the bug). Also be clear about how you feel about whether or not
the patch is the right approach and about what you would like list members to
do with it (are you looking to confirm that what you're seeing is a real
kernel bug, or a misunderstanding, or an application bug, and/or are you
looking for help solving the bug/issue you're seeing, and/or are you looking
to get it merged, etc.). If you can, also demonstrate or quantify how your patch
improves or fixes things.

------
Supersaiyan_IV
This is good because older Wi-Fi cards supported by iwlwifi get hardware
buffer underruns and hang when said bug occurs together with WPA2 enabled.

~~~
tjoff
I'd say that is a reason why this is bad. If I've got bad hardware I want
to know about it, and bad hardware should be flushed from the market. It should
not function "okayish" enough to avoid annoying users to the point of boycott.

~~~
Supersaiyan_IV
Allow me to clarify. "This" means: "The fact that Google fixed a decade old
bug", meaning the topic itself. The "this" you use in your comment is entirely
different. The above misuse of "this" causes me to be confused, because it
allows your comment to be interpreted as if you're promoting the inference of
bugs, where older hardware is affected, such that people lose faith in that
hardware faster due to frustration, consequently affecting the market and
causing it to relieve itself of older hardware faster. That is why I'm asking
you to clarify your argument.

~~~
tjoff
I only commented on the aspect of this bug fix that benefits crappy hardware.
_That_ is bad, because your comment seemed to imply that the benefit for
crappy hardware is _the only_ reason this is good.

Which, depending on your viewpoint, can be seen as: the only reason this bug
fix is good is because some company can still sell crappy hardware and get
away with it.

Then we have the user perspective of course, which is "hey, my device doesn't
crash as often", which is awesome. But given the context, the hardware is
still buggy and it will still crash, albeit not as often, which really isn't
much comfort in the long run. Rare hiccups are worse than daily hiccups:
you learn how to handle daily hiccups, but a rare hiccup can really bite
you. So you really haven't gained that much anyway, and in the long run the
rare hiccup is rare enough that you don't know what to attribute it to, so you
might buy the exact same thing next time, which of course is very, very
unfortunate.

