Hacker News

Normally one would use some distribution over the number of source blocks encoded into each block. So one would send 30% single blocks, 30% double blocks, 15% triple, and so on, down to some relatively small number of high-order encoded blocks. This means that while the situation you posted can occur, it is a rare case to have no single block available for decoding. And since in any case you need to receive a bit over 100% of the blocks before you can decode anything, there is no real reason to go to exquisite lengths to find the perfect solution when the trivial case works 100% of the time (well, 99.9999% if you care to be accurate :-)
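
In LT codes this kind of degree distribution is formalized as the soliton distribution. A minimal sketch of sampling a block's degree from the ideal soliton distribution (function names are mine, for illustration only):

```python
import random

def ideal_soliton(k):
    """Ideal soliton distribution over degrees 1..k:
    P(1) = 1/k, P(d) = 1/(d*(d-1)) for d >= 2. Sums to exactly 1."""
    return [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]

def sample_degree(k, rng=random):
    """Sample how many source blocks to XOR into one encoded block."""
    r = rng.random()
    cumulative = 0.0
    for d, p in enumerate(ideal_soliton(k), start=1):
        cumulative += p
        if r < cumulative:
            return d
    return k
```

In practice the robust soliton distribution (which adds a spike at a moderate degree and extra mass at degree 1) is preferred, since the ideal version is fragile when a decoding chain runs out of degree-1 blocks.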

The algorithm is also not so much about computational efficiency; checking at any point whether or not you can decode requires quite a bit of work anyhow. The main advantage of the algorithm is its relatively low complexity and high efficiency on the transmission line.

My point isn't about computational efficiency, but bandwidth efficiency (or alternatively, decoding the full message as early as possible). As I understand it, the Moore's law doubling time for bandwidth is longer than for CPU speed, suggesting that processing is getting cheaper relative to bandwidth, so using more CPU in order to have to send fewer blocks is probably a long-term win.

The decoding amounts to solving a system of linear equations where each equation corresponds to one received packet. Assuming the file to be sent was broken into n parts, a minimum of n independent equations is required for reconstruction. To do that optimally, the receiver needs to keep trying to solve the linear system once it has received at least n packets. This means O(n^3) in the general case for each received packet. If the equations are generated randomly, the expected overhead seems to be a mere 1.61 extra packets! http://virtual.vtt.fi/virtual/abi/abi-seminar07/abiseminar07...
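
Checking decodability amounts to computing the rank of the received coefficient vectors over GF(2): the system is solvable once the rank reaches n. A sketch, assuming each equation is represented as an integer bitmask of which source parts it XORs together (the representation is my choice, not from the linked slides):

```python
def gf2_rank(rows):
    """Rank of a binary matrix over GF(2).

    Each row is an int bitmask: bit i set means source part i
    participates in that packet's equation. Decoding is possible
    once the rank equals the number of source parts n."""
    pivots = {}  # highest set bit -> reduced row with that pivot
    rank = 0
    for row in rows:
        cur = row
        while cur:
            hb = cur.bit_length() - 1
            if hb in pivots:
                cur ^= pivots[hb]  # eliminate the leading coefficient
            else:
                pivots[hb] = cur   # new independent equation
                rank += 1
                break
    return rank
```

Re-running full elimination from scratch on every packet is what gives the O(n^3) per-packet cost; an incremental variant that keeps the `pivots` table between packets only pays for reducing the newest row.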

In practice, however, depending on the receiver, you may not want it to solve a general system of linear equations for every packet it receives when n is big. It might be better to stick to a simple algorithm (which has better time complexity if certain conditions are satisfied by the linear system) and just wait to accumulate enough equations for it to become applicable.
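
The simple algorithm alluded to here is typically the peeling (belief-propagation) decoder: repeatedly find a degree-1 equation, recover that source block, and substitute it into the remaining equations. A rough sketch, with each packet represented as a (set of source indices, XOR of those blocks) pair; the representation is illustrative:

```python
def peel_decode(packets):
    """Peeling decoder for a fountain code.

    packets: list of (indices, value) pairs, where indices is an
    iterable of source-block indices XORed together and value is the
    XOR of those blocks (modeled as ints here).
    Returns {index: block} for the blocks it managed to recover;
    the result may be partial if peeling stalls."""
    packets = [(set(idx), val) for idx, val in packets]
    recovered = {}
    progress = True
    while progress:
        progress = False
        for i, (idx, val) in enumerate(packets):
            # Substitute any already-recovered blocks into this equation.
            for j in list(idx):
                if j in recovered:
                    idx.discard(j)
                    val ^= recovered[j]
            packets[i] = (idx, val)
            if len(idx) == 1:  # degree-1 equation: block recovered
                j = idx.pop()
                recovered[j] = val
                progress = True
    return recovered
```

This runs in roughly linear time in the total number of equation terms, but it stalls whenever no degree-1 equation remains, which is exactly why it needs a few extra packets (and a well-chosen degree distribution) compared to full Gaussian elimination.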

Raptor codes require O(1) time to generate an encoding symbol. Decoding a message of length k with a belief-propagation decoding algorithm requires O(k) time for an appropriate choice of inner/outer codes. http://en.wikipedia.org/wiki/Raptor_code#Computational_compl... See also: http://www.inference.phy.cam.ac.uk/mackay/itprnn/ps/555.573....

If this were a transmission to a single client you would be right, but when this is used for broadcast or multicast the interest is less in minimizing the transmission for each client than in optimizing it for the entire population at once. Since feedback is scarce in these cases in the first place, there is little reason to check at every message; it is sufficient to test once in a while.

I can also see use cases for this in unicast transmissions, but even there the idea would be about simplifying the communications layer, for example fixed-rate transmission with complete disregard for the loss rate. I'm not sure what I'd think about bandwidth usage and the stopping criterion in that case.

With large enough blocks and efficient coding, the noisy-channel coding theorem guarantees nearly error-free transmission at rates up to the channel capacity. Modern LDPC, Raptor, and Turbo codes can get very close (to within 0.0045 dB) to this Shannon limit, meaning that far from being less efficient, FEC allows for maximal bandwidth efficiency.
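
For a concrete sense of the Shannon limit: the capacity of a binary symmetric channel with crossover probability p is C = 1 - H(p), where H is the binary entropy function. This is the standard textbook formula, not specific to any code mentioned above:

```python
import math

def bsc_capacity(p):
    """Shannon capacity (bits per channel use) of a binary symmetric
    channel with crossover probability p: C = 1 - H(p), where
    H(p) = -p*log2(p) - (1-p)*log2(1-p)."""
    if p in (0.0, 1.0):
        return 1.0  # a deterministic channel carries a full bit
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return 1.0 - h
```

For example, at an 11% bit-flip rate the capacity is about 0.5, so no code can reliably deliver more than about half a bit of information per transmitted bit; codes approaching the Shannon limit get arbitrarily close to that ceiling.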

RaptorQ codes, an efficient fountain code implementation, can recover the source block from K+2 received symbols with 99.9999% probability. See: http://tools.ietf.org/html/draft-ietf-rmt-bb-fec-raptorq-06
