
Remy – Computer-generated TCP congestion algorithm - JoachimS
http://web.mit.edu/remy/
======
donavanm
A key point that's glossed over is computational complexity on the sender's
side. The existing congestion & flow control mechanisms are quite simple. In
contrast these "algorithms have more than 150 rules." Additional state and
computation time on the sender side is non-trivial when managing tens or
hundreds of thousands of concurrent flows.

These algorithms vary based on link/endpoint characteristics. This would
require a priori knowledge of the path & quality to each TCP receiver in order
to select an appropriate congestion control algorithm. This problem is
probably tractable for large-scale implementations. The number of unique
networks an endpoint is exposed to is in the few-million range, and roughly
stable over time. Collapsing adjacent and similar networks would get that down
to the tens of thousands of variant prefixes.

The characteristics of each prefix are roughly stable over time; a subnet of
consumer cable endpoints does not flip to become a CDMA mobile subnet. At a
guess the rate of change is 1-5% per day. If you can track the performance of
millions of subnets, a 1% delta per day is certainly feasible. In practice any
single sender would need dozens, or hundreds, of different congestion control
variants.

And lastly they appear to simulate receivers with similar network
characteristics on a single contended link. In practice receivers will have
wildly different characteristics. Traffic to mobile, consumer fixed-line ISP,
and datacenter networks will all travel on common transit carriers.
Additionally even a single IP endpoint can have variable hidden receivers;
think 802.11, gbe, embedded clients, and a desktop OS behind a consumer NAT
device. Now the sender must track, and adapt, congestion control _per flow_.

In summary, synthetic testing of a simplistic use case has outperformed
generalized solutions. This should surprise no one. The details of practical
implementation are ignored, and significant.

~~~
keithwinstein
Thanks for these comments!

Humans have been designing congestion-control schemes for 30 years; what's
interesting about our work is that we are starting to learn how to teach a
computer to do the same thing from first principles. (And then learning from
what the computer comes up with to inform human designs...)

I can't tell you yet exactly how computationally taxing the computer-generated
algorithms will turn out to be, compared with TCP CUBIC or similar. So far
we've found that because the RemyCC's rules are just dumb lookups into a
precalculated lookup table, the CPU requirements are pretty mild. But we need
to play around with RemyCCs a lot more before I can speak more confidently.
We're close but not there yet.
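To make the "dumb lookups" point concrete, here is a minimal sketch of what a rule-table congestion controller can look like. The state variables follow the paper's description (EWMAs of ack and send interarrivals, plus a ratio of recent RTT to minimum RTT), but the specific ranges, actions, and helper names below are illustrative assumptions, not values from any actual Remy-generated table:

```python
from dataclasses import dataclass

@dataclass
class Action:
    window_multiple: float   # multiply the current congestion window by this
    window_increment: int    # then add this many packets
    intersend_ms: float      # minimum spacing between outgoing packets

# Hypothetical rule table: each entry maps a box in the sender's observed-state
# space (ack-interarrival EWMA, send-interarrival EWMA, rtt/min-rtt ratio) to
# an action. Real RemyCC tables are machine-generated with ~150+ such rules.
RULES = [
    # ((ack_ewma range), (send_ewma range), (rtt_ratio range)) -> Action
    (((0.0, 5.0), (0.0, 5.0), (1.0, 1.5)), Action(2.0, 1, 0.5)),
    (((0.0, 5.0), (0.0, 5.0), (1.5, 4.0)), Action(0.5, 0, 2.0)),
    (((5.0, 100.0), (0.0, 100.0), (1.0, 4.0)), Action(1.0, 1, 1.0)),
]

FALLBACK = Action(1.0, 0, 1.0)  # used when no rule's box contains the state

def lookup(ack_ewma: float, send_ewma: float, rtt_ratio: float) -> Action:
    """Dumb table lookup: return the first rule whose box contains the state."""
    for (a_rng, s_rng, r_rng), action in RULES:
        if (a_rng[0] <= ack_ewma < a_rng[1]
                and s_rng[0] <= send_ewma < s_rng[1]
                and r_rng[0] <= rtt_ratio < r_rng[1]):
            return action
    return FALLBACK

def on_ack(cwnd: float, ack_ewma: float, send_ewma: float, rtt_ratio: float) -> float:
    """On each ACK, apply the matched action to the congestion window."""
    act = lookup(ack_ewma, send_ewma, rtt_ratio)
    return cwnd * act.window_multiple + act.window_increment
```

The per-ACK work is just a table match plus two arithmetic operations, which is why the CPU cost stays mild even with many rules (and a production implementation would use a tree or sorted structure rather than a linear scan).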

To your other point about whether the algorithms "vary based on link/endpoint
characteristics" -- I don't think they will have to. Quantitatively, a single
RemyCC has been able to outperform TCP CUBIC over a thousand-fold range of
link rates, and with similar results for ranges of latencies, etc.

Just as servers currently use a single TCP congestion-control scheme (often
CUBIC) to talk to many different clients over diverse network paths, there's
no particular reason you wouldn't use a single computer-generated algorithm to
do the same. Trying to "learn" different parameters for different ranges of
remote IP addresses is an interesting idea, but not one we have explored or
are proposing.

~~~
baruch
In my work on improving the performance of the Linux TCP stack (part of my
involvement with HTCP) I never saw the congestion-control part show up in
performance monitoring. There are such long linked lists that one goes over
that any small computation is lost in the noise; most of the time was spent
reaching uncached memory lines.

------
keithwinstein
Grateful to see everybody's interest in our 2013 paper! You may be interested
in our follow-on work, led by my colleague Anirudh Sivaraman. This will be
presented at the ACM SIGCOMM 2014 conference in a few weeks:
[http://web.mit.edu/keithw/www/Learnability-SIGCOMM2014.pdf](http://web.mit.edu/keithw/www/Learnability-SIGCOMM2014.pdf)

------
contingencies
Wow, this looks pretty impressive.

My general understanding is that congestion control algorithms seek to provide
performance and efficiency in end to end TCP (transport-layer) connections
that are well-matched to the underlying network path characteristics (latency,
throughput, lossiness). Unfortunately, obtaining reliable information about
these characteristics can be difficult, and this is particularly the case with
dynamic paths (IP mobility, mobile while moving between congested/non-
congested cells, shared near-capacity networks, etc.).

Previously, the areas in which customizing these sorts of algorithms has
yielded particularly high returns have been satellite communications and other
extremely long distance/high latency/known characteristic deployments with
outlying or extreme properties.

My main two questions would then be: (1) To what range of network layer path
or individual link-layer characteristics do the claimed benefits of this
algorithm apply? (2) How much difference will this make to mobile access or
IPv6 IP mobility under a range of different realistic network link issue
scenarios?

If the claims in this paper are true (broadly applicable increase in
throughput and fairness) then I suppose we'll see a switch to this algorithm
en-masse, thus providing another handy covert operating system detection
mechanism to delimit the new generation of kernels.

~~~
suprjami
> (1) To what range of network layer path or individual link-layer
> characteristics do the claimed benefits of this algorithm apply?

The paper states their RemyCC outperformed currently-available CCs even when
link variables varied by an order of magnitude.

To put this into simple terms, you could write a RemyCC for a 1Gbps 10ms link,
and that RemyCC would still perform better (than other CC algos) through a
range of 100Mbps to 10Gbps, and 1ms to 100ms.

> (2) How much difference will this make to mobile access or IPv6 IP mobility
> under a range of different realistic network link issue scenarios?

The whole premise of the paper is that a well-written RemyCC would probably
handle such situations better than a traditional congestion control algorithm.

> then I suppose we'll see a switch to this algorithm en-masse

You are more optimistic than I. Several people have already solved bufferbloat
in theory, but it still exists in real life. Updating legacy devices, even
with something as "simple" as a TCP congestion control module, may not be
possible.

That being said, implementing a few pre-baked RemyCCs in major operating
systems surely would help. Say you had a "mobile" (5Mbps 500ms) and "home
broadband" (50Mbps 50ms) and "LAN" (1Gbps 5ms) RemyCC available in
Lin/Mac/Win/And/IOS, TCP performance would be several times better than what
we have today.

~~~
donavanm
> To put this into simple terms, you could write a RemyCC for a 1Gbps 10ms
> link, and that RemyCC would still perform better (than other CC algos)
> through a range of 100Mbps to 10Gbps, and 1ms to 100ms.

No, pg 133 of the paper. They accept design-time input variables as a range,
differing by an order of magnitude. If the "assumptions aren’t met,
performance deteriorates." Throughput and delay look horrendous when the
actual characteristics don't match the model.

> That being said, implementing a few pre-baked RemyCCs in major operating
> systems surely would help. Say you had a "mobile" (5Mbps 500ms) and "home
> broadband" (50Mbps 50ms) and "LAN" (1Gbps 5ms) RemyCC available in
> Lin/Mac/Win/And/IOS, TCP performance would be several times better than what
> we have today.

I'm also dubious of any actual adoption. However, I believe you're
underestimating the number of variant algorithms required. The consumer
endpoints receive the vast bulk of the traffic, and are rarely contended on
transmit. It's the large-scale deployments of senders (CDNs, streaming
services, etc.) that would make a significant impact on global throughput.

~~~
keithwinstein
You're totally correct that the performance is terrible when the actual
network characteristics are something that the algorithm's "prior assumptions"
thought was impossible.

We think the same is true of traditional TCP. (For example, traditional TCP
assumes that in-network buffers will be small, and therefore that it's
acceptable to keep filling them until they start dropping packets, without
harming delay-sensitive cross traffic too much. Today's Internet no longer
matches TCP's implicit model, and the consequences are that you can't upload
to YouTube and use Skype at the same time on a consumer Internet connection.
We think it's better to at least make the assumptions explicit!)
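The scale of that mismatch is easy to see with a back-of-the-envelope calculation. This toy illustration (not from the paper, and with assumed buffer sizes) shows why a sender that fills a drop-tail buffer until packets drop creates a standing queue whose delay scales with the buffer size:

```python
# Toy illustration: a loss-based sender keeps growing its window until the
# bottleneck buffer overflows, so the standing queue -- and hence the delay
# seen by delay-sensitive cross traffic -- scales with the buffer size.

def standing_delay_ms(buffer_pkts: int, link_rate_pps: float) -> float:
    """Worst-case queueing delay once a fill-the-buffer sender saturates a
    drop-tail bottleneck: a full buffer drains at the link rate."""
    return buffer_pkts / link_rate_pps * 1000.0

# A 10 Mbps link moves ~833 full-size (1500-byte) packets per second.
pps = 10e6 / (1500 * 8)
small = standing_delay_ms(64, pps)      # a modest buffer: tens of ms
bloated = standing_delay_ms(2048, pps)  # a "bufferbloat"-sized buffer: seconds
```

With the 2048-packet buffer the standing delay comes out at well over two seconds, which is the regime where an interactive call becomes unusable next to a bulk upload.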

In the second paper (now posted), we have improved the initial results in that
we can now train a RemyCC that outperforms TCP CUBIC-over-sfqCoDel over a
1000-fold range of link rates (instead of 10x in the first paper). But the
same basic point still holds.

~~~
dtaht
My problem with the new paper is that it computes for a baseline RTT of 100 or
150ms. Real-world average RTTs are in the range of 4 ms for Google Fiber, 18
ms for FiOS, and 38 ms for cable, with the Ethernet RTT in a datacenter far
lower than that. I would be very happy to see Remy produce a CC for these RTTs
one day also.

~~~
keithwinstein
Stay tuned... or bug Anirudh for a sneak preview when you see him at SIGCOMM.
(Hi Dave!)

~~~
dtaht
I put up all I have to say on the subject at:
[https://lists.bufferbloat.net/pipermail/bloat/2014-August/00...](https://lists.bufferbloat.net/pipermail/bloat/2014-August/002045.html)

------
GhotiFish
Oh man, this is a heady subject, and I feel I might not be getting the full
effect, given how much of it is going over my head.

As I'm coming to terms with this topic now, what kinds of applications could
this get applied to? A program to produce routing rules automatically based on
assumed and then derived network infrastructure, would that make more organic
networks more feasible in organizations? Or is this more a tool for telecoms?

Is there much history of machine learning when it comes to packet routing
technologies? I would have thought yes.

Required reading, I think:

[https://www.youtube.com/watch?v=nLrBisNqEwQ](https://www.youtube.com/watch?v=nLrBisNqEwQ)

[https://en.wikipedia.org/wiki/Network_congestion_avoidance](https://en.wikipedia.org/wiki/Network_congestion_avoidance)

[https://en.wikipedia.org/wiki/CUBIC_TCP](https://en.wikipedia.org/wiki/CUBIC_TCP)

[http://www.isi.edu/nsnam/ns/doc/node239.html](http://www.isi.edu/nsnam/ns/doc/node239.html)

------
riobard
It's from the same author who made Mosh, a mobile shell/SSH replacement that
will keep you sane with high latency/lossy connections to your servers.

------
dtaht
I do look forward to the day where such complex congestion control protocols
can be implemented in hardware...

Until then... fq codel is the best thing going.

------
redxblood
Awesome, just awesome. Is it possible to use this algorithm in my machine and
replace the existing one? Would that be too hard?

~~~
keithwinstein
We're trying to get there! Plan is to make a user-space process that uses
Linux network namespaces to intercept the outgoing TCP connections of a
subsidiary process (using DNAT) and then send datagrams according to whatever
congestion-control mechanism you'd like.

But not there yet -- stay tuned.

~~~
ultramancool
Any reason you're trying to do this in usermode instead of using the existing
framework for pluggable congestion control already present in the kernel?

Changing your algorithm is already as simple as echoing into a /proc file.
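For context, Linux exposes this framework both system-wide (via `/proc/sys/net/ipv4/tcp_congestion_control`) and per socket through the `TCP_CONGESTION` socket option. A minimal sketch of the per-socket path, Linux-only and assuming the built-in "reno" and default "cubic" modules are available:

```python
import socket

# Per-socket selection of the kernel's pluggable congestion control
# (Linux-only; uses the TCP_CONGESTION socket option).
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Ask the kernel which algorithm this socket is using right now; the option
# value is a NUL-padded byte string.
raw = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
current = raw.split(b"\0", 1)[0].decode()  # typically "cubic" by default

# Switch just this one socket to another loaded algorithm ("reno" is built in;
# non-default modules may require loading or root).
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"reno")
after = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16).split(b"\0", 1)[0].decode()
s.close()
```

This is what makes shipping a handful of pre-baked algorithms in an OS plausible: applications (or the system default) can pick among whatever modules are loaded without any protocol change on the wire.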

~~~
signa11
> Any reason you're trying to do this in usermode instead of using the
> existing framework

my _guess_ would be that doing this in userland is way more forgiving than the
kernel. once the basics are in place, moving it to kernel using the
aforementioned pluggable congestion control f/w would be more or less
'mechanical'

------
luu
I love that this has a big fat "reproduce the results" button with detailed
build instructions.

Why isn't reproducibility required to publish in CS? Unlike in fields like
psychology or chemistry, reproducing results should be trivial if the authors
provide instructions on how to do it.

~~~
_delirium
In the natural sciences, reproducibility means that a separate team runs an
experiment on independent apparatus, configured from written information
(e.g., the paper). Not that you re-run the experiment on the original
apparatus, set up and made available by the original authors. Clicking a
button is not "reproducibility" in that sense, though it is better than not
being able to do even that. Shipping code/VMs/etc. and having a 2nd team just
re-run it has too much of the original lab in it to be real replication; it's
more like just inspection of the 1st setup. Which is better than no
inspection, but worse than independent reproduction of the results.

Applied to CS, there's really no way around it: reproduction requires that
researchers claiming to replicate a result implement it independently. If
there are not two independently produced implementations that both confirm the
result, the research hasn't been reproduced. The process of doing so helps to
discover cases where the original results were due to idiosyncrasies of the
original implementation, test setup, etc. It'd even be ideal if replication
were done in as dissimilar a setup as possible, to find cases where the results
unexpectedly depend on details of the original setup not thought to be
important.

If anything, I think there is a real danger that reproduction will be
_decreased_ by the current trend towards what's questionably called
"reproducible research" as a euphemism for "code reuse". If people reuse code
rather than doing their own independent reimplementation of methods as stated
in papers, erroneous results can lurk for years and infect other research as
well. (I think code reuse _is_ good for engineering practicality, and making
permissively licensed code available is also a good way of getting researchy
methods out of academia into the real world. But I think it is quite wrong to
call reusing the original researcher's setup, rather than independently
producing your own, "reproduction" in the scientific sense.)

~~~
sjwright
The necessity of truly independent reproduction diminishes when the
researchers/engineers can supply an implementation that does real work in real
situations. A very high bar of self-evidence, if you will.

True reproduction remains necessary to confirm that the stated underlying
science is the cause, and not some other variable that might be legitimate and
remarkable but undocumented.

~~~
drewcrawford
Providing an implementation of a network protocol is hardly doing "real work
in real situations". I mean, it ships the packets from here to there but so
does every other implementation of every other protocol.

The thing that is interesting or novel about _this_ protocol as opposed to the
thousands of other proposals is that it's supposed to be faster than the other
proposals for a broad range of applications with a broad range of link
topologies and across a broad range of software/hardware platforms.

Of course, to establish _that_ claim replication is completely critical.

~~~
sjwright
> Providing an implementation of a network protocol is hardly doing "real work
> in real situations".

In what sense is a functional implementation not able to do real work?

