

Show HN: Framework for faster distributed data transfer - mgronhol
https://github.com/mgronhol/pouring-rain

======
kenko
Change test-client.py to end:

    
    
      print client.fetch(1000)
      print client.fetch(1)
    

And when you run it, you get the same thing twice (whatever file was specified
as the first argument to test-server.py on the commandline). On the other
hand, change it to end

    
    
      print client.fetch(1)
      print client.fetch(1000)
    

and you get "Hello World!" twice.

At one point the server process also gave me this:

    
    
      Server(s) started. Press enter to stop.
      Exception in thread Thread-1:
      Traceback (most recent call last):
        File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 552, in __bootstrap_inner
          self.run()
        File "/Users/wolfson/src/others/pouring-rain/LT/Server.py", line 84, in run
          self.streams[ entry[1] ].discard( entry[0] )
      NameError: global name 'entry' is not defined

~~~
mgronhol
Thanks, just noticed a small initialization issue there, fixing that soon :)

------
wmf
You said Luby, so prepare to be vaporized.

On the technical side, theoretically this should only be a few percent faster
than BitTorrent but requires significant complexity.

~~~
andrewcooke
why vaporized?

also, why faster? does bittorrent duplicate some traffic?

~~~
wmf
BitTorrent has overhead from control traffic (have messages etc.) that can
theoretically be eliminated using rateless coding. The extra decoding makes it
a wash IMO.

~~~
evgen
FWIW, when building the second version of the core tech from which BT was
forked we did license the tornado codecs from digital fountain to try to get a
better reliability rate and it ended up being worse than a wash; setting aside
the additional complexity it adds to the system it completely screws with
internal block verification based on hashes. The blocks are also not
structured compared to other FEC codecs (e.g. Rizzo) so you lose any hope of
streaming. Long story short, unless you are trying to get around nasty packet
loss there are better encoding schemes for this sort of data distribution
task.

------
andrewcooke
this is all new to me and i'm struggling to find a friendly, simple
introduction to the field. does anyone have a good link to, say, an
introductory article on fountain codes? i'm starting to get a feel for the
idea from reading various wikipedia pages, paper abstracts, powerpoints, etc,
but a good "layman's" article would be great...

[update] huh. just after posting i started searching for "fountain codes" and
found this -
[http://blog.notdot.net/2012/01/Damn-Cool-Algorithms-Fountain...](http://blog.notdot.net/2012/01/Damn-Cool-Algorithms-Fountain-Codes)
(previously had been searching for luby, raptor, etc).
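
For a concrete feel of the idea, here is a minimal toy sketch of an LT-style
fountain code in Python 3. It is not pouring-rain's code: every function name
is made up for illustration, the degree distribution is the plain ideal
soliton (real systems typically use a robust variant), and each symbol carries
its block indices explicitly instead of a seed the receiver would expand.

```python
import random


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_blocks(data, block_size):
    """Zero-pad data to a multiple of block_size and cut it into blocks."""
    padded = data + b"\0" * (-len(data) % block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


def soliton_degree(rng, k):
    """Sample a degree from the ideal soliton distribution over 1..k."""
    u = rng.random()
    cumulative = 1.0 / k              # P(1) = 1/k
    if u < cumulative:
        return 1
    for d in range(2, k + 1):
        cumulative += 1.0 / (d * (d - 1))   # P(d) = 1/(d(d-1))
        if u < cumulative:
            return d
    return k


def encode_symbol(blocks, seed):
    """Make one encoded symbol: the XOR of a random subset of blocks."""
    rng = random.Random(seed)
    degree = soliton_degree(rng, len(blocks))
    indices = rng.sample(range(len(blocks)), degree)
    payload = bytes(len(blocks[0]))
    for i in indices:
        payload = xor_bytes(payload, blocks[i])
    return set(indices), payload


def decode(symbols, k):
    """Peeling decoder: returns the k blocks, or None if it gets stuck."""
    pending = [(set(idx), payload) for idx, payload in symbols]
    recovered = {}
    progress = True
    while progress and len(recovered) < k:
        progress = False
        still_pending = []
        for idx, payload in pending:
            # Remove every already-recovered block from this symbol.
            for i in list(idx):
                if i in recovered:
                    payload = xor_bytes(payload, recovered[i])
                    idx.discard(i)
            if len(idx) == 1:
                (i,) = idx
                recovered[i] = payload      # degree 1: this is a source block
                progress = True
            elif len(idx) > 1:
                still_pending.append((idx, payload))
        pending = still_pending
    if len(recovered) < k:
        return None
    return [recovered[i] for i in range(k)]


def demo(message, block_size=16, loss=0.3):
    """Stream symbols over a lossy channel until the message decodes."""
    blocks = split_blocks(message, block_size)
    k = len(blocks)
    channel = random.Random(1)
    received, seed, result = [], 0, None
    while result is None:
        symbol = encode_symbol(blocks, seed)
        seed += 1
        if channel.random() < loss:
            continue                        # symbol lost in transit
        received.append(symbol)
        if len(received) >= k:
            result = decode(received, k)
    return b"".join(result)[:len(message)], len(received), k


if __name__ == "__main__":
    decoded, used, k = demo(b"Hello World! " * 20)
    print(decoded == b"Hello World! " * 20, used, k)
```

The point to notice is that the receiver never asks for a specific block: any
sufficiently large set of symbols will do, which is why the sender needs no
per-block control traffic.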

------
kilburn
I fail to see how this is _exponentially_ faster. "Transfer time T(K) = 1.25/K
where K = number of servers" seems pretty linear to me...

~~~
mgronhol
Linearity: function's derivative is constant w.r.t. the parameter discussed.
dT(K)/dK = -1.25/K^2 and this most certainly is not constant.

~~~
kilburn
To clarify, the original title said that this was a method for _exponentially
faster_ data transfer. That said, you got me in that it is not linear. As you
have shown, that function is quadratically decreasing (which is _worse_ than
linear). The title has been corrected since then, so I kind of got my point
through.
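
A quick way to see what the quoted model implies, as a minimal Python 3 sketch.
It assumes only the formula from the thread, T(K) = 1.25/K, with unspecified
time units; the function names are made up for illustration. Transfer time is
inversely proportional to K, so the speedup over a single server grows in
proportion to K rather than exponentially.

```python
def transfer_time(k, c=1.25):
    """Transfer time under the quoted model T(K) = c / K."""
    return c / k


def speedup(k, c=1.25):
    """How many times faster K servers are than one server."""
    return transfer_time(1, c) / transfer_time(k, c)


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        # Doubling K halves the time, so the speedup equals K.
        print(k, transfer_time(k), speedup(k))
```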

