
All About Erasure Codes (2004) [pdf] - Tomte
http://web.eecs.utk.edu/~plank/plank/classes/cs560/560/notes/Erasure/2004-ICL.pdf
======
omazurov
Using a general matrix inversion algorithm to decode Reed-Solomon erasures is
an atrocity. The only difference between encoding and decoding erasures is
that we restore _m=n-k_ code word values at static locations, _k+1..n_, when
encoding, and _m_ code word values at dynamic locations when decoding. In both
cases, computing the matrix is _O(mk)_ - much faster than matrix inversion.

~~~
akalin
Can you elaborate more on the computation of the decoding matrix? Is it
basically a special-purpose row reduction, taking advantage of the fact that
the matrix to be inverted has a bunch of rows from the identity matrix? (I
think it is, but would be curious if there's another algorithm out there!)

~~~
omazurov
If the _RS(n,k)_ code is defined as _Sum[i=0..n-1] Xi Zi^j = 0, j=0..m-1, m=n-k_,
where _{Xi}_ are code words and _{Zi}_ are locators, then for any polynomial
_P(Z)_ of degree less than _m_ we have _Sum Xi P(Zi) = 0_.

Now, let's erase _m_ values at locations _E={Ej}, j=0..m-1_. For each _Ej_,
we can construct a polynomial _Pj(Z)_ of degree _m-1_ that evaluates to 1 at
_Ej_ and to 0 at all other locations in _E_. From _Sum Xi Pj(Zi) = 0_ we get
_Xj=Sum[i!=j] Xi Pj(Zi)_ (note that the sum is actually over all the known _Xi_).
_{Pj(Zi)}_ is in fact the decoding matrix. Computing it is _O(nm^2)=O(km^2)_
(not _O(km)_ as I put above).

I have that algorithm implemented in my _ErrorZ_ project on GitHub:
[https://github.com/OlegMazurov/ErrorZ/blob/master/src/main/j...](https://github.com/OlegMazurov/ErrorZ/blob/master/src/main/java/org/mazurov/errorz/ReedSolomon.java)
More advanced stuff, like decoding up to _m-1_ errors, can also be found
there.
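
For concreteness, the locator-polynomial construction above can be sketched in a
few lines of Python. This is an illustrative sketch, not the ErrorZ code: it works
over the prime field GF(101) for readability, whereas practical RS implementations
use GF(2^w); in GF(101) the recovered value needs an explicit minus sign that
vanishes in characteristic 2, which is why the parent's formula has no sign. Note
that, per the comment above, encoding is just recovery at the static parity
locations:

```python
P = 101  # illustrative prime field; real implementations use GF(2^w)

def finv(a):
    return pow(a, P - 2, P)  # multiplicative inverse via Fermat's little theorem

def recover(X, Z, erased):
    """Fill the erased positions of X for the code defined by
    Sum[i] X[i] * Z[i]**j == 0 (mod P), j = 0..m-1, with m = len(erased)."""
    X = list(X)
    for j in erased:
        # Pj(Z): degree m-1, equals 1 at Z[j] and 0 at the other erased locators
        def Pj(z):
            num, den = 1, 1
            for l in erased:
                if l != j:
                    num = num * (z - Z[l]) % P
                    den = den * (Z[j] - Z[l]) % P
            return num * finv(den) % P
        # Sum[i] X[i] * Pj(Z[i]) == 0, and Pj kills the other erasures,
        # so X[j] is minus the sum over the known positions.
        s = sum(X[i] * Pj(Z[i]) for i in range(len(Z)) if i not in erased)
        X[j] = -s % P
    return X

# Encoding is recovery at the static parity locations:
Z = [1, 2, 3, 4, 5, 6]                                    # n = 6, k = 3, m = 3
X = recover([10, 20, 30, 0, 0, 0], Z, erased=[3, 4, 5])   # fill the parities
# Decoding any m erasures is the exact same call:
Y = recover([X[0], 0, X[2], 0, X[4], 0], Z, erased=[1, 3, 5])
assert Y == X
```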

~~~
akalin
Ah, okay, thanks. My coding theory is a bit weak, but it looks like this
algorithm depends on the specific form of RS codes that you mentioned. I think
the presentation outlines a form of RS erasure codes that is more general
(e.g., it allows Cauchy parity matrices instead of Vandermonde-derived ones).

Using a general matrix inversion is indeed inefficient, and becomes
impractical for large k. However, the special-purpose row reduction algorithm
I mentioned gets to O(n*m^2), which is much better, although worse than the
O(n*m) algorithm you mentioned; that makes sense given that that one handles
more specific RS codes.
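
If it helps, here's a minimal sketch of that special-purpose reduction in a
systematic view: surviving data symbols correspond to identity rows and are known
outright, so only the erased data positions (e <= m of them) need a dense e x e
solve against e surviving parity equations. GF(101) and the Vandermonde parity
rows in the usage example are illustrative assumptions, not anything from the
slides (and such Vandermonde-derived submatrices aren't always invertible, which
is part of why Cauchy matrices come up):

```python
P = 101  # illustrative prime field; practical codes use GF(2^w)

def finv(a):
    return pow(a, P - 2, P)  # inverse via Fermat's little theorem

def solve(M, rhs):
    """Gauss-Jordan over GF(P) for the small dense e x e core.
    Assumes the system is non-singular (true for e.g. Cauchy submatrices)."""
    n = len(M)
    A = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c])   # find a pivot row
        A[c], A[p] = A[p], A[c]
        t = finv(A[c][c])
        A[c] = [x * t % P for x in A[c]]              # normalize pivot row
        for r in range(n):
            if r != c and A[r][c]:
                f = A[r][c]
                A[r] = [(x - f * y) % P for x, y in zip(A[r], A[c])]
    return [row[-1] for row in A]

def recover_data(data, parity_rows, parity_vals):
    """data: length-k list with None at erased positions.
    parity_rows/parity_vals: exactly e surviving parity equations, each
    asserting sum(row[i] * data[i]) == val (mod P)."""
    k = len(data)
    erased = [i for i in range(k) if data[i] is None]
    M, rhs = [], []
    for row, v in zip(parity_rows, parity_vals):
        # Identity rows need no elimination: known data moves to the RHS,
        # leaving only the erased columns in the dense system.
        M.append([row[i] % P for i in erased])
        known = sum(row[i] * data[i] for i in range(k) if data[i] is not None)
        rhs.append((v - known) % P)
    for i, x in zip(erased, solve(M, rhs)):
        data[i] = x
    return data
```

For example, with parity rows (1, Z, Z^2) for Z = 2 and Z = 3 and data
(7, 5, 11), the parities are 61 and 20, and `recover_data([None, 5, None],
[[1, 2, 4], [1, 3, 9]], [61, 20])` solves only a 2x2 system to restore
(7, 5, 11).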

Thanks for elaborating! It's reminding me that I really should brush up on
coding theory, though -- do you have any resources/textbooks you'd recommend
for someone familiar with the mathematical background? (Finite fields, etc.)

~~~
omazurov
I'm a bit rusty myself (all those small mistakes I made). I'm now more
concerned with the engineering side of error correction (elegant
implementations and more importantly a totally new class of dynamic error
correction I have in another GitHub project). I can't recommend a good
up-to-date textbook - I haven't read any in a long time.

~~~
akalin
No problem. Do you happen to have a source for the algorithm you explained,
though? I recognize the parity-check matrix, but the rest I can't seem to find
in any of the usual coding theory texts; usually, they treat erasures as part
of the full RS decoding process.

~~~
omazurov
The source of the algorithm I described is my mind. It's so elementary that I
believe it is omnipresent but disguised by more complicated math, usually due
to other ways to define RS codes.

~~~
akalin
I agree re: being disguised by complicated math. Even if you think it's
elementary, though, I encourage you to write it up somewhere, like a blog post.
I'm sure there are people out there like me who would find it novel. :)

If I ever do use it or write about it, I'll make sure to cite you. Thanks
again!

------
londons_explore
I'd like to see FEC used on a TCP/IP replacement to handle lost packets
without an extra roundtrip.

When you're downloading a 1 MB web page split into ~800 1 KB Ethernet packets,
with typical internet packet loss rates of ~0.5%, the chance that not a single
packet gets lost is slim.
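
That "slim" chance checks out on the back of an envelope, assuming independent
per-packet losses (a simplification; real loss tends to be bursty):

```python
# Chance that every packet of a ~1 MB page arrives on the first try,
# with the 0.5% per-packet loss rate from the comment above.
loss, packets = 0.005, 800
p_clean = (1 - loss) ** packets
print(f"{p_clean:.1%}")  # about 1.8% of pages would see no loss at all
```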

~~~
loeg
I think typical packet loss rates are lower than 0.5% in some parts of the
internet, and also not necessarily normally distributed. That said, FEC is
already used for very high-latency networking. At some (low-latency) point,
retransmitting is faster than having the hardware compute more FEC.

------
zokier
Unless I misread, the slides do not touch fountain codes (RaptorQ etc.) at all,
which seems like a significant omission.

~~~
eternalban
"were first published in 2004":
[https://en.wikipedia.org/wiki/Raptor_code](https://en.wikipedia.org/wiki/Raptor_code)

(OP slidedeck is also 2004)

[ps:]

presentation on Raptor codes 2005:
[http://algo.epfl.ch/_media/en/output/presentations/raptor-ba...](http://algo.epfl.ch/_media/en/output/presentations/raptor-ba...)

paper 2006:
[https://gnunet.org/sites/default/files/raptor.pdf](https://gnunet.org/sites/default/files/raptor.pdf)

~~~
phonon
LT codes were published in 2002...

[https://en.wikipedia.org/wiki/Luby_transform_code](https://en.wikipedia.org/wiki/Luby_transform_code)

------
userbinator
Those who have used Usenet to get binaries will be familiar with these, which
work on the same principle:
[https://en.wikipedia.org/wiki/Parchive](https://en.wikipedia.org/wiki/Parchive)

~~~
loeg
Indeed. I use par2 on my local tar backups, to protect against corruption of a
few blocks.
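
For anyone wanting to try the same setup, a par2cmdline session looks roughly
like this (a sketch; exact flags and defaults vary a bit between versions):

```shell
# Create recovery files with ~10% redundancy alongside the archive
par2 create -r10 backup.tar

# Later: check the archive against the recovery files
par2 verify backup.tar.par2

# If some blocks are corrupt, rebuild them from the recovery data
par2 repair backup.tar.par2
```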

People may also be familiar with RAID5/RAID6/RAID-Z, which use erasure codes
to protect data from drive loss.

------
DonbunEf7
How do the schemes laid out here compare with zfec
([https://github.com/tahoe-lafs/zfec](https://github.com/tahoe-lafs/zfec))?

~~~
walrus
zfec is an implementation of Reed-Solomon coding, the class of codes discussed
in the first part of the slides.

~~~
lrizzo
To be precise, zfec uses Vandermonde codes and is derived from my old fec
library from 1996-97. See
[http://info.iet.unipi.it/~luigi/research.html#fec](http://info.iet.unipi.it/~luigi/research.html#fec)
(in 90's web style).

