

Is your satellite link oscillating? Improving goodput using network coding - gunkaaa
https://blog.apnic.net/2015/03/13/is-your-satellite-link-oscillating/

======
kevin_nisbet
One thing I wish they had talked about is applying Random Early
Detection (RED) to this problem. It seems to me that RED, while an older
technique, was created for exactly this purpose, and it is supported in every
IP router I've ever worked with. Perhaps there are challenges in applying RED
here, but I suspect that introducing it would also considerably improve
Internet performance for those islands.

Now, the solution presented could be superior to RED, but based on what
they've shown, I'm not entirely convinced. It is an intriguing solution,
though.

Also, an important takeaway they presented, one that translates to other
software tasks, is that increasing the buffer often makes the situation worse.
It is worth looking into the research currently underway into bufferbloat and
the impact it is believed to be having on consumer Internet service. My
understanding is that it appears to be caused by this exact phenomenon:
engineers at equipment manufacturers and telecom operators reacted to the
problem by drastically increasing buffer sizes.

*Please be aware, I work for a large telecom operator in Canada, but the views are my own and do not reflect any position of my employer.

~~~
wtallis
There are challenges to deploying RED _anywhere_. That's why we now have a new
generation of queue management strategies stemming from CoDel. Still, even
with the best queue management there's obviously a benefit to using extra
forward error correction like this to further reduce packet drops if the
latency is high enough, but I'm not sure it is in this case.

------
Nimi
A few random thoughts/questions:

\- This sounds similar to the incast problem which occurs in datacenters, but
this happens on consumer Internet - cool.

\- After reading both this post and the TCP/NC paper, it seems to me like
TCP/NC is unfair to vanilla TCP. If all TCP/NC does is send the same number of
packets, but the packets are "more sophisticated" encodings of the original
data, link utilization would be the same. So apparently, TCP/NC is more
aggressive than vanilla TCP, and that's fine, but I think it should be
acknowledged (haha). When they say stuff like "TCP doesn’t see the packet
loss, and as a result there’s no need for the TCP senders to reduce their
sending rates", it's a bit unclear what they mean - you can just as well
modify the TCP stack to ignore the packet loss and not reduce the sending
rate, without network coding.

\- Why not use TCP termination? You could install a performance-enhancing
proxy at the Sat gate, and make sure the link is always 100% utilized.

\- "Let’s increase the queue memory" \- I thought this should theoretically
work. See for example
[http://yuba.stanford.edu/~nickm/papers/sigcomm2004.pdf](http://yuba.stanford.edu/~nickm/papers/sigcomm2004.pdf).
If folks familiar with the apnic effort are reading, I would love to know if
they tried such measures and what happened.

\- Could CoDel improve the situation here?

------
lkarsten
Congratulations, you have rediscovered a problem described in a 14-year-old
RFC: [http://tools.ietf.org/html/rfc3135](http://tools.ietf.org/html/rfc3135)

------
nowarninglabel
This is great, spent a lot of time reading about this kind of stuff (from
books like this: [http://www.amazon.com/Satellite-Technology-Principles-
Applic...](http://www.amazon.com/Satellite-Technology-Principles-Applications-
Maini/dp/1118636473) ) before I went to work on a ship with the task of
maintaining the satellite connection. Turned out I didn't have much control
over our traffic on the satellite side, but it was interesting nonetheless to
think of what was happening to our traffic once it reached there.

------
walrus
Clever solution to a unique problem!

> _So how does this help with queue oscillation? Simple: We generate a few
> extra "spare" combination packets, so that we now have more equations than
> variables. This means we can afford to lose a few combination packets in
> overflowing queues or elsewhere – and still get all of our original packets
> back._

If I understand correctly, this would also work with a more traditional coding
scheme (block coding or convolutional coding). I'm curious if there are plans
to take advantage of the properties of network codes in the future.
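
To make the quoted "more equations than variables" idea concrete, here is a minimal sketch. It assumes the simplest possible coefficients (0 or 1, so combining packets is just XOR, i.e. arithmetic over GF(2)) and represents each packet as a Python integer. The names `encode` and `decode` are made up for illustration; this is not the article's or the vendor's implementation, and real RLNC systems often use a larger field.

```python
import random


def encode(packets, n_coded, rng):
    """Return n_coded (coeffs, payload) pairs.

    coeffs is a bitmask: bit i set means original packet i is XORed
    into the payload. Each pair is one "equation" in the unknown packets.
    """
    k = len(packets)
    coded = []
    for _ in range(n_coded):
        coeffs = 0
        while coeffs == 0:  # an all-zero combination carries no information
            coeffs = rng.getrandbits(k)
        payload = 0
        for i in range(k):
            if coeffs >> i & 1:
                payload ^= packets[i]
        coded.append((coeffs, payload))
    return coded


def decode(coded, k):
    """Gauss-Jordan elimination over GF(2).

    Returns the k original packets, or None if the received
    combinations do not contain k independent equations.
    """
    rows = [[c, p] for c, p in coded]
    for col in range(k):
        # Find a row (not yet used as a pivot) that involves packet `col`.
        pivot = next(
            (r for r in range(col, len(rows)) if rows[r][0] >> col & 1), None
        )
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Remove packet `col` from every other equation.
        for r in range(len(rows)):
            if r != col and rows[r][0] >> col & 1:
                rows[r][0] ^= rows[col][0]
                rows[r][1] ^= rows[col][1]
    return [rows[i][1] for i in range(k)]


if __name__ == "__main__":
    rng = random.Random(1)
    originals = [0x11, 0x22, 0x33, 0x44]
    coded = encode(originals, 6, rng)  # 4 unknowns, 6 equations
    survivors = coded[:2] + coded[3:5]  # pretend two were dropped in a queue
    print(decode(survivors, len(originals)))
```

With only 0/1 coefficients the surviving combinations are not always independent, so `decode` can return `None` even when enough packets arrive; sending more spares makes that less likely.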

~~~
darkmighty
"Network code" as described here isn't the academic definition; I think he's
just describing, colloquially, "I'm using a code in my network". Technically,
the link is being treated as a Packet Erasure Channel [1], which any erasure
code [2] can deal with. Network coding is a less-explored technique where
routers in the network combine packets in various ways that can do better than
regular routing: see the example at [3].

[1]
[http://en.wikipedia.org/wiki/Packet_erasure_channel](http://en.wikipedia.org/wiki/Packet_erasure_channel)

[2]
[http://en.wikipedia.org/wiki/Erasure_code](http://en.wikipedia.org/wiki/Erasure_code)

[3]
[http://en.wikipedia.org/wiki/Linear_network_coding#The_butte...](http://en.wikipedia.org/wiki/Linear_network_coding#The_butterfly_network_example)

~~~
walrus
The vendor mentioned in the article, Steinwurf ApS, says on their website that
"Our products are based on Random Linear Network Coding[...]" [1]. In this
case, it looks like they're _using_ it as an erasure code, though.

[1] [http://steinwurf.com/technology/](http://steinwurf.com/technology/)

------
rasz_pl
From my experience (~10 years ago) sat links suffer high latency and VERY low
packet throughput. The one I played with in Europe at the time was delivered
with special sauce Windows proxy software that had one purpose - cut packets/s
and keep retransmissions local.

I get the feeling that installing traffic-shaping routers with an explicit
packets/second limit would also do wonders for the links mentioned in the
article.

------
mschwarz
Reminds me of Forward Error Correction [1], a technique used by Satellite
providers and even WAN optimization vendors like Silver Peak to "erase" packet
loss events by injecting parity packets into the flow, which can be used to
rebuild lost packets at the receiving end if needed. This prevents TCP
synchronization, aka the throughput see-saw described in the article. This
problem isn't limited to high latency / satellite links but exists on any path
with packet loss, like your internet connection, or even MPLS.

[1]
[http://en.m.wikipedia.org/wiki/Forward_error_correction](http://en.m.wikipedia.org/wiki/Forward_error_correction)
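
The parity idea can be sketched in a few lines. This is an illustrative assumption, not how any particular vendor or satellite provider implements it: one XOR parity packet is sent per group of equal-length packets, which lets the receiver rebuild at most one lost packet per group. The names `make_parity` and `recover` are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Parity packet for a group: the XOR of all packets in the group.

    Assumes every packet in the group has the same length.
    """
    return reduce(xor_bytes, packets)


def recover(received, parity):
    """Rebuild a group in which a lost packet is marked as None.

    XORing the parity with every packet that did arrive leaves exactly
    the missing packet. More than one loss cannot be repaired.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if not missing:
        return list(received)
    if len(missing) > 1:
        raise ValueError("a single parity packet can repair only one loss")
    present = [p for p in received if p is not None]
    rebuilt = reduce(xor_bytes, present, parity)
    repaired = list(received)
    repaired[missing[0]] = rebuilt
    return repaired


if __name__ == "__main__":
    group = [b"abcd", b"efgh", b"ijkl"]
    parity = make_parity(group)
    print(recover([b"abcd", None, b"ijkl"], parity))
```

Because the lost packet is rebuilt at the receiver, the TCP sender never sees the loss, at the cost of one extra packet per group on the link.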

------
lobsterloga
This technique is known as RLNC. It works quite well, but it's not the only
optimal erasure code; there are others.

Unfortunately, RLNC implementations are patent-encumbered in the US, so good
luck using this "simple" linear algebra.

------
pyvpx
yes, it's really neat. then the more you dig into it the more you realize
literally everything and the kitchen sink is patented to all hell and back.

best of luck _using_ any of it without a large team of expensive lawyers.

------
petya2164
Thank you for the interesting questions and thoughts on this topic! We tried
to answer some of these in a longer comment on the APNIC blog:
[https://blog.apnic.net/2015/03/13/is-your-satellite-link-
osc...](https://blog.apnic.net/2015/03/13/is-your-satellite-link-
oscillating/#comment-34710)

Disclaimer: I am one of the developers of the RLNC kernel module at Steinwurf ApS.

