
Network Coding in P2P Networks - adlrocha
https://adlrocha.substack.com/p/adlrocha-network-coding-in-p2p-networks
======
ivan_ah
I did some research on network coding a few years ago (see [1] if interested),
and I was very excited by the potential until I read the paper by Li, Li, and
Lau that proves the maximum gain in throughput we can expect in undirected
networks is 2x. [https://iqua.ece.toronto.edu/papers/zp-transit.pdf](https://iqua.ece.toronto.edu/papers/zp-transit.pdf)
This makes me think there is no point in introducing all this complexity... just
stick to packet networks and routing, and you're at most 2x less efficient than
the optimal network coding scheme...
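
To make the gain concrete, here is a minimal sketch of the textbook butterfly
example (a directed toy topology, so it only illustrates the coding mechanism,
not the undirected bound itself; packet names and links below are made up):

```python
def xor(p: bytes, q: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(p, q))

pkt_a = b"packet-A"        # also reaches receiver 1 directly on a side link
pkt_b = b"packet-B"        # also reaches receiver 2 directly on a side link
coded = xor(pkt_a, pkt_b)  # the single transmission over the shared bottleneck

assert xor(pkt_a, coded) == pkt_b  # receiver 1 combines what it has to get B
assert xor(pkt_b, coded) == pkt_a  # receiver 2 combines what it has to get A
# Plain routing would have to use the bottleneck twice (once for A, once for B)
# to serve both receivers; the coded packet serves both in one transmission,
# which is the flavor of constant-factor gain being discussed here.
```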

[1] A short literature review paper on network coding dating back to 2009:
[https://minireference.com/static/tmp/network_coding_review.p...](https://minireference.com/static/tmp/network_coding_review.pdf)

~~~
nullc
When you've made all other optimizations and your performance is bottlenecked
by that last bit of efficiency, then perhaps it is interesting! 2x is nothing
to sneeze at.

If your application is costing users two million dollars a year in bandwidth
... then halving it would potentially save a million dollars a year-- and
could well be worth an impressive engineering effort.

~~~
stelfer
Real-world incremental complexity is very high. To say nothing of the lack of
an ecosystem.

I fondly recall interviewing at one of the faang companies years ago, when
they ran out of things to ask me, and the guy turned to me and said: "Let's
talk about network coding". My response, "Why would anyone want to do that?",
probably didn't help land me the job.

------
nullc
Network coding is very interesting but the maximum gains from the network
coding itself are fairly limited... a small constant factor. In attempting to
apply network coding I've found that vulnerability to (maliciously) corrupt
data significantly limits the utility.

If this is an area that interests you, you might also be interested in
efficient set reconciliation-- where a party can communicate a set to another
party using only as much data as the size of the set-difference between the two
sets, without having any idea of what the other party's set contains.

This can result in fairly large bandwidth savings in P2P gossip applications
without reducing robustness-- and could be combined with network coding in
places where network coding is applicable.

I worked on an efficient implementation of setrecon
[https://github.com/sipa/minisketch/](https://github.com/sipa/minisketch/) and
an application of it to Bitcoin (
[https://arxiv.org/abs/1905.10518](https://arxiv.org/abs/1905.10518) ).
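
As a rough illustration of the idea (not of minisketch itself, which uses a
BCH-syndrome construction and hits the information-theoretic minimum exactly),
here is a toy reconciliation sketch built on an invertible Bloom lookup table;
all sizes, names, and parameters below are illustrative:

```python
import hashlib
import random

NUM_HASHES = 3

def h64(data: bytes, salt: int) -> int:
    return int.from_bytes(hashlib.sha256(bytes([salt]) + data).digest()[:8], "big")

def cells_for(key: int, size: int):
    # one cell in each of NUM_HASHES equal sub-tables, so a key never hits
    # the same cell twice
    sub = size // NUM_HASHES
    return [i * sub + h64(key.to_bytes(8, "big"), i) % sub for i in range(NUM_HASHES)]

def checksum(key: int) -> int:
    return h64(key.to_bytes(8, "big"), 255)

class IBLT:
    def __init__(self, size: int):
        self.count = [0] * size
        self.key_xor = [0] * size
        self.chk_xor = [0] * size

    def insert(self, key: int, sign: int = 1):
        for i in cells_for(key, len(self.count)):
            self.count[i] += sign
            self.key_xor[i] ^= key
            self.chk_xor[i] ^= checksum(key)

    def subtract(self, other: "IBLT") -> "IBLT":
        diff = IBLT(len(self.count))
        for i in range(len(self.count)):
            diff.count[i] = self.count[i] - other.count[i]
            diff.key_xor[i] = self.key_xor[i] ^ other.key_xor[i]
            diff.chk_xor[i] = self.chk_xor[i] ^ other.chk_xor[i]
        return diff

    def peel(self):
        # repeatedly pop "pure" cells (exactly one key left in them)
        plus, minus = set(), set()
        progress = True
        while progress:
            progress = False
            for i in range(len(self.count)):
                if self.count[i] in (1, -1) and checksum(self.key_xor[i]) == self.chk_xor[i]:
                    key, sign = self.key_xor[i], self.count[i]
                    (plus if sign == 1 else minus).add(key)
                    self.insert(key, -sign)  # remove it from all of its cells
                    progress = True
        return plus, minus

# Two peers with ~1000 ids each that differ in only 10 entries: the table
# (and hence the bytes on the wire) is sized by the difference, not the sets.
ids = random.sample(range(1, 2**32), 1005)
alice = set(ids[:1000])
bob = set(ids[5:1005])                  # 5 only-Alice, 5 only-Bob

table_a, table_b = IBLT(30), IBLT(30)   # a few cells per expected difference
for x in alice: table_a.insert(x)
for x in bob:   table_b.insert(x)

# Alice sends table_a; Bob subtracts his own table and peels out the difference.
only_bob, only_alice = table_b.subtract(table_a).peel()
print(only_alice == alice - bob, only_bob == bob - alice)
# Usually prints "True True"; peeling can fail with small probability, in which
# case a real protocol simply retries with a larger table.
```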

------
sliken
Seems pretty weird to waste substantial (1000x) CPU and disk I/O to figure out
what you are missing, and whether you have enough pieces.

Why not just use a Reed-Solomon based approach: turn 1000 pieces into 1100, and
then you know you can recover the file from any 1000 of those 1100 pieces? That
enables you to actively search for the rare pieces to improve the health of the
network, and as a side benefit you use 1000x less CPU and I/O.
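
A toy sketch of that "any k of n" property, using Lagrange interpolation over
the prime field GF(257) with tiny piece counts standing in for the 1000/1100
above (real erasure coders such as zfec or par2 work over binary fields and on
whole blocks rather than single bytes, so this is illustrative only):

```python
P = 257  # prime, so every nonzero element has an inverse

def modinv(a: int) -> int:
    return pow(a, P - 2, P)            # Fermat's little theorem

def lagrange_eval(points, x):
    """Value at x of the unique polynomial of degree < len(points) through them."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * modinv(den)) % P
    return total

k, n = 10, 11                          # stand-ins for 1000 data / 1100 total pieces
data = list(b"some data!")             # exactly k = 10 byte-sized symbols

# Systematic encoding: pieces 1..k are the data itself, pieces k+1..n are extra
# evaluations of the same degree-(k-1) polynomial.
pieces = {x: data[x - 1] for x in range(1, k + 1)}
data_points = list(pieces.items())
for x in range(k + 1, n + 1):
    pieces[x] = lagrange_eval(data_points, x)

# Lose any n - k pieces (here just one) and rebuild from any k survivors.
del pieces[3]
survivors = list(pieces.items())[:k]
recovered = bytes(lagrange_eval(survivors, x) for x in range(1, k + 1))
assert recovered == b"some data!"
```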

Adding significant complexity so that sometimes you get a missing piece one
time step sooner seems like a huge waste.

~~~
6510
Out of curiosity I once compressed a 3 hour lecture down to 70 MB and was
totally surprised by how watchable it still was. It got me wondering: out of
that 10 GB ultra uber HD XXL video there is about 50 MB that you must have,
100-200 MB of data that would be really useful, but the rest quickly drops to
"nice to have but not at all necessary". (My numbers are mostly imaginary.)

I strongly think that packaging and delivery methods should be specifically
designed for the goods being shipped. (Sand is not glass)

Poor TV and radio signals used to gradually degrade the picture and audio. Even
with partial reception you could still figure out useful things, like whether
your city was going to get bombed.

Today on p2p you get a multi GB blob of data with which you can't do anything
until you've got all of it. Streaming will only give you the beginning, which
could be worse than not watching it at all.

But if we take, say, a 100-fold overhead and do 70 GB of reads for a 700 MB
video... (I'm guessing a few more numbers here and below.)

From the '80s to the '90s memory went up from 16 KB to 1 MB. From the '90s to
2000, from 1 MB to 512 MB. From 2000 to 2010: 512 MB to 48 GB. 2020, 128 GB?
(Rather disappointing compared to the previous jump.) Is 512 GB a reasonable
guess for 2030? I don't think we will see the 32,000-fold increase of 1980-2000
again, but 2x or 4x seems reasonable? 70 GB of reads wouldn't be such a big
deal.

Monitors will grow of course, but houses are getting smaller. We can do all 4
walls, then the ceiling and even the floor, increase frame rate and resolution,
but it has to stop eventually. (haha)

~~~
the_pwner224
Progressive images - see demo images a bit down the page here:
[https://blog.codinghorror.com/progressive-image-rendering/](https://blog.codinghorror.com/progressive-image-rendering/)

It's pretty cool on the rare occasion when you stumble across a _huge_ image
on the web (think astronomy); first a low res version pops up and then you can
see the top-to-bottom loading as it gets gradually replaced by higher
resolution versions.

------
6510
I never knew! What fascinating stuff. The idea that anyone who has the entire
file could generate a block that fits as the final piece seems totally magical.
You could make a final block out of the even blocks or the odd ones, out of
every 3rd, every 4th, the first half, the second half... and then the "last"
block just magically fits... for everyone... mind blown!
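
The simplest instance of that trick is a single XOR parity block: anyone
holding the whole file derives the same extra block, and a peer missing any one
piece can XOR it back into existence; Reed-Solomon style codes (as in the
comment above) generalize this to many missing pieces. Piece contents below are
just placeholders:

```python
def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

pieces = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(pieces)                 # any full-file holder computes this

missing = 2                                 # a downloader lacking piece 2...
have = [p for i, p in enumerate(pieces) if i != missing]
recovered = xor_blocks(have + [parity])     # ...rebuilds it from the rest + parity
assert recovered == pieces[missing]
```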

