
Online Codes: unencumbered, locally-encodable Fountain Codes alternative - pointfree
http://catid.mechafetus.com/news/commentsedit.php?sqc=3833b184cbf68b83&new=281
======
pointfree
> The only other work that deals with rateless codes that we are aware of is a
> forthcoming paper [6] by Luby. Since the author declined to provide us with
> a copy of his paper, we are unable to compare our results until his paper
> becomes public at FOCS’02. The final version of our paper will include a
> comparison with Luby’s work.

Online Codes:
[https://www.cs.nyu.edu/media/publications/TR2002-833.pdf](https://www.cs.nyu.edu/media/publications/TR2002-833.pdf)

Rateless Codes and Big Downloads:
[https://pdos.csail.mit.edu/~petar/papers/maymounkov-bigdown-lncs.ps](https://pdos.csail.mit.edu/~petar/papers/maymounkov-bigdown-lncs.ps) (Start here)

------
grandalf
Could someone explain and contextualize this please?

------
pointfree
Woops, that was a bad link.

[http://catid.mechafetus.com/news/news.php?view=281](http://catid.mechafetus.com/news/news.php?view=281)

Ideas for Online Codes [Comment] by catid, posted 9:35pm Thu. Jan 26th 2012 PST

Online Codes are an unpatented approach to rateless forward error correction
(FEC) codes from a 2002 paper by Petar Maymounkov. I first learned about
Online Codes from a Wikipedia article, and since then I have been reading as
much as I can about how to implement them with good performance.

Since that paper was published, Luby et al. have made fantastic advancements
in rateless error-correction codes, culminating in RaptorQ last year.
Unfortunately, they decided to patent their algorithms, which makes their
wonderful algorithms worthless for almost everyone on the Internet. So, roll
back the clock by a decade and start over. _sigh_

Online Codes are a great place to start. I like the layered approach where
there is an inner code and an outer code. After reading about a lot of modern
implementations I have some ideas to try and see what will help Online Codes
reach good performance:

(1) Peeling decoder. This is an unpatented approach to decoding sparse low-
density parity check (LDPC) codes. It's O(N) and fast, and is pretty easy to
get running.
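As a rough illustration of the peeling idea (my sketch, not the post's implementation), here is a minimal Python model where symbols are plain ints and each check equation is the XOR of a subset of source symbols:

```python
# Minimal peeling-decoder sketch (illustrative only). A degree-1 check
# directly reveals a symbol; substituting it into the other equations can
# create new degree-1 checks, and decoding "ripples" through the system.

def peel_decode(checks, n):
    """checks: iterable of (source_index_set, xor_value) pairs.
    Returns the n source symbols, with None where unrecovered."""
    recovered = [None] * n
    eqs = [(set(idx), val) for idx, val in checks]
    progress = True
    while progress:
        progress = False
        for idx, val in eqs:
            if len(idx) != 1:
                continue
            j = next(iter(idx))  # degree-1 check reveals symbol j
            if recovered[j] is None:
                recovered[j] = val
                progress = True
                # Substitute the known symbol into every equation using j.
                for k, (idx2, val2) in enumerate(eqs):
                    if j in idx2:
                        idx2.discard(j)
                        eqs[k] = (idx2, val2 ^ recovered[j])
    return recovered
```

Decoding fails (leaves `None` entries) if the ripple dies out before all symbols are resolved, which is exactly the case where a dense fallback decoder like (2) helps.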

(2) Gaussian elimination decoder. This is also an unpatented approach used to
solve linear systems of equations when the equations are not sparse.
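A toy sketch of the dense GF(2) case (again illustrative, with int symbols and each row's coefficients packed into a Python int bitmask):

```python
# Minimal Gaussian elimination over GF(2) (illustrative only). Row addition
# is XOR of both the coefficient bitmask and the symbol value.

def ge_decode(rows, n):
    """rows: list of (coefficient_bitmask, xor_value) pairs.
    Returns the n source symbols, or None if the system is rank-deficient."""
    rows = list(rows)
    pivots = {}  # column -> index of its pivot row
    for col in range(n):
        pivot = next((r for r in range(len(rows))
                      if r not in pivots.values()
                      and rows[r][0] >> col & 1), None)
        if pivot is None:
            return None  # rank deficient: need more check symbols
        pivots[col] = pivot
        mask, val = rows[pivot]
        # Clear this column in every other row (forward + back substitution).
        for r in range(len(rows)):
            if r != pivot and rows[r][0] >> col & 1:
                rows[r] = (rows[r][0] ^ mask, rows[r][1] ^ val)
    return [rows[pivots[c]][1] for c in range(n)]
```

Unlike peeling, this always succeeds when the received rows have full rank, at the cost of cubic work in the dense part.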

(3) Combine (1) and (2). This is an unpatented approach that provides
maximum-likelihood decoding in roughly O(N) with good performance. I noticed
that the "Raptor Codes" monograph includes a description of this algorithm, so
it must be excellent in practice. I've implemented it myself and achieved
significant performance (300 MB/s) before optimization. I have a few ideas for
how to optimize this:

\+ Reduce the number of variables needed by analyzing how data is passed
between parts of the algorithm and eliminating unneeded data.

\+ Profile the application to find hot spots, and move pre-processing into
parts of the code that take up less time.

\+ When back-substituting the symbols decoded by Gaussian elimination (2) back
into the symbols decoded by peeling (1), the check matrix is pretty dense.
Borrowing a trick from windowed modular exponentiation, XOR combinations of
symbols can be precomputed and stored in a table, indexed by bit patterns, to
greatly reduce the number of XORs required. I think this can be implemented
without any additional memory, since at that point in the algorithm many of
the received rows have already been copied over to the output file.

\+ Reduce the number of memory copies by deferring memcpy() until
back-substitution and using memxor() where possible to avoid the copy.
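The windowed-XOR idea in the list above can be modeled like this (a toy Python sketch with int symbols standing in for packet buffers; the window width `w` is an assumed tuning parameter): precompute all 2^w XOR combinations of each window of w recovered symbols, so substituting a dense row costs one table lookup per window instead of one XOR per set coefficient.

```python
# Toy model of the windowed-XOR table trick (illustrative only).

def build_window_tables(symbols, w=4):
    """For each window of w symbols, tabulate all 2^w XOR combinations."""
    tables = []
    for start in range(0, len(symbols), w):
        window = symbols[start:start + w]
        table = [0] * (1 << len(window))
        for bits in range(1, 1 << len(window)):
            low = bits & -bits  # lowest set bit of the combination index
            table[bits] = table[bits ^ low] ^ window[low.bit_length() - 1]
        tables.append(table)
    return tables

def dense_row_xor(mask, tables, w=4):
    """XOR of symbols[i] over the set bits of mask, one lookup per window."""
    acc = 0
    for t, table in enumerate(tables):
        acc ^= table[(mask >> (t * w)) & ((1 << w) - 1)]
    return acc
```

Each table is itself built in 2^w XORs by extending smaller combinations, so the precomputation amortizes well when many dense rows share the same windows.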

(4) Use two inner codes instead of one. The second inner code will be based on
GF(256) octets instead of GF(2) like the rest of the check matrix. Since the
normal overhead is low, adding just a few more check matrix rows in a higher
field should help out the recovery properties a lot. This seems like a logical
extension to the layered approach of Online Codes.
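For the GF(256) rows, the arithmetic might look like the following sketch (assuming the AES reduction polynomial x^8 + x^4 + x^3 + x + 1, i.e. 0x11B; the actual field representation would be an implementation choice):

```python
# Sketch of GF(256) arithmetic for the denser check rows (illustrative only).

def gf256_mul(a, b):
    """Russian-peasant multiplication in GF(2^8) modulo 0x11B."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B  # reduce by the field polynomial
    return p

def gf256_row(coeffs, symbols):
    """One GF(256) check row: XOR-accumulate coefficient * symbol products."""
    acc = 0
    for c, s in zip(coeffs, symbols):
        acc ^= gf256_mul(c, s)
    return acc
```

The payoff is in rank: a random GF(2) row fails to increase rank with probability up to 1/2, while a random GF(256) row fails with probability at most 1/255, so a handful of octet rows tightens the overhead considerably.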

(5) Decode all codes at once. The outer code might be combined with the inner
codes into one large matrix to solve. This would improve decoding performance
to that of the maximum likelihood decoder, and should work well with the
decoder I'm writing. We'll see...

(6) Since A = Decode(Encode(A)) and A = Encode(Decode(A)), the code can be
made systematic by precomputing Decode(A) on the transmitter, and then running
Encode() forward so that the first symbols sent are equal to the input data
file. This can reduce processing time but still allows check symbols to be
generated afterwards with good recovery properties. There should be a lot of
room for optimization on the transmitter side too since it already knows what
the output should be.
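A toy illustration of this systematic construction (my sketch, not the post's code; int symbols, bitmask check rows, and the assumption that the first n rows form an invertible GF(2) matrix): solve the first n check equations for intermediate symbols so that encoding forward reproduces the input, then keep generating further rows as repair symbols.

```python
# Toy systematic construction via "decode on the transmitter" (illustrative).

def xor_encode(masks, inter):
    """Each output symbol XORs the intermediate symbols its bitmask selects."""
    out = []
    for m in masks:
        acc, i = 0, 0
        while m:
            if m & 1:
                acc ^= inter[i]
            m >>= 1
            i += 1
        out.append(acc)
    return out

def solve_gf2(masks, values, n):
    """Find inter with xor_encode(masks, inter) == values.
    Assumes the n x n GF(2) system is invertible."""
    rows = list(zip(masks, values))
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][0] >> col & 1)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and rows[r][0] >> col & 1:
                rows[r] = (rows[r][0] ^ rows[col][0],
                           rows[r][1] ^ rows[col][1])
    return [rows[c][1] for c in range(n)]

# Transmitter: "decode" the source data A into intermediate symbols, then
# encode forward -- the first n symbols on the wire equal A itself, and any
# further mask yields a repair symbol with the usual recovery properties.
```

For example, with `A = [10, 20, 30]` and first rows `[0b001, 0b011, 0b111]`, `xor_encode` of `solve_gf2(masks, A, 3)` reproduces `A` exactly, and additional masks produce check symbols on demand.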

(7) Raptor codes use some kind of "permanent inactivation" technique that
makes the outer code more complicated but apparently helps the recovery
properties when the number of symbols is low. It might be patented and
unusable; we'll see...

