

A mathematical trick allows people to scatter their computer files - eru
http://www.economist.com/science/tm/displayStory.cfm?source=hptextfeature&story_id=12081445

======
jlouis
Reed-Solomon coding is an old fella. Whenever you have a link which is
unreliable and you can't afford to retransmit packets on the link when errors
are introduced, RS is your friend. Mobile phones are among the prime users of
this.
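As a toy illustration of the erasure-coding idea (much simpler than real RS, which uses finite-field arithmetic): a single XOR parity block lets you rebuild any one lost data block, and Reed-Solomon generalizes this to k parity blocks recovering any k losses. The helper names here are made up for the sketch:

```python
def make_parity(blocks):
    """XOR all equally-sized data blocks together into one parity block."""
    parity = bytes(len(blocks[0]))
    for b in blocks:
        parity = bytes(x ^ y for x, y in zip(parity, b))
    return parity

def recover(blocks_with_gap, parity):
    """blocks_with_gap: the block list with exactly one entry None (lost).
    XORing the survivors with the parity block yields the missing block."""
    missing = bytes(parity)
    for b in blocks_with_gap:
        if b is not None:
            missing = bytes(x ^ y for x, y in zip(missing, b))
    return missing

data = [b"hell", b"o wo", b"rld!"]
p = make_parity(data)
# Lose the middle block, then rebuild it from the others plus parity.
assert recover([data[0], None, data[2]], p) == b"o wo"
```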

If I remember correctly, the PAR/PAR2 formats used on Usenet use RS encoding
as well.

An alternative would be to plot the file in N-dimensional space and define a
set of vectors to pinpoint it. When you have enough vectors you have the
precise pinpoint, and additional vectors give you error-correction capability.
Some Microsoft guys played with this idea for BitTorrent-like networks a while
back. But there is a disadvantage in the time it takes to decode the data, and
it probably doesn't help the swarm that much :/
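The "enough vectors to pinpoint it" idea can be sketched as random linear coding: each piece is a random linear combination of the file's blocks, and any N independent pieces let you solve for the original N blocks. A real system would work over a finite field like GF(256); this sketch (with hypothetical names) uses exact rational arithmetic instead to keep it short:

```python
from fractions import Fraction
import random

def encode(blocks, n_pieces, rng):
    """Each piece = (random coefficient vector, its dot product with the data)."""
    pieces = []
    for _ in range(n_pieces):
        coeffs = [rng.randrange(1, 100) for _ in blocks]
        val = sum(c * b for c, b in zip(coeffs, blocks))
        pieces.append((coeffs, val))
    return pieces

def decode(pieces, n):
    """Recover the n data blocks from any n (linearly independent) pieces
    by Gauss-Jordan elimination on the augmented matrix."""
    m = [[Fraction(c) for c in coeffs] + [Fraction(v)]
         for coeffs, v in pieces[:n]]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [int(row[-1]) for row in m]

rng = random.Random(42)
data = [104, 105, 33]            # three "blocks" of a file
pieces = encode(data, 5, rng)    # five pieces; any three independent ones suffice
assert decode(pieces, 3) == data
```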

Another interesting viewpoint: we might need RS encoding on the _local_ hard
disks soon (implemented in hardware or software), as it would circumvent the
bit-error-rate problem with those disks.

~~~
newt0311
What I am wondering about is how much faster TCP could be with RS for recovery
instead of the current resend-packet technique.

~~~
jws
I would guess it would be slower, and you would break the internet. You would
have to introduce enough redundancy to cope with the worst tolerable loss
rate, which would increase the number of bits to transmit. Worse, it is the
noticing of dropped packets that tells TCP to slow down and decongest a link.
If enough senders fail to decongest, then packet loss on the congested links
skyrockets, wasting bandwidth elsewhere and doing silly things like favoring
the sender with the biggest pipe.
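The "worst tolerable loss rate" tradeoff can be made concrete with a back-of-the-envelope calculation. Assuming independent packet losses and a code where n data plus k parity packets survive any k losses, the transfer fails whenever more than k packets are lost (the function name is made up for this sketch):

```python
from math import comb

def p_unrecoverable(n_data, n_parity, loss_rate):
    """P(more than n_parity of the n_data + n_parity packets are lost),
    assuming independent losses -- the case FEC cannot recover from."""
    total = n_data + n_parity
    return sum(comb(total, lost)
               * loss_rate ** lost
               * (1 - loss_rate) ** (total - lost)
               for lost in range(n_parity + 1, total + 1))

# 20% overhead (2 parity per 10 data) is plenty at light loss but
# collapses once a congested link starts dropping heavily:
print(p_unrecoverable(10, 2, 0.01))  # rare failure at 1% loss
print(p_unrecoverable(10, 2, 0.20))  # frequent failure at 20% loss
```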

~~~
wmf
Obviously you can't just eliminate congestion control, and the coding rate
should be adaptive to reduce overhead.

At least one startup has gone broke on this idea already, but maybe it's
possible to do it right.

------
Herring
The Economist doing error-correction codes?? Are you guys _sure_ the LHC
didn't do anything to the universe?

~~~
fgimenez
I'm not sure whether I'm excited that this was in The Economist, or pissed off
that they reduced error-correction codes to "a mathematical trick".

------
secorp
We have an open source project <http://allmydata.org> that has been doing this
for quite a while. I'm also involved in the commercial side, which does online
storage; we've been running a business on a P2P backend (nice low costs) with
non-peer clients. We tried a business model with a full peer grid, and users
were extremely uncomfortable storing "data" from other people on their
computers. Possibly the market is better educated now and/or more used to this
idea, but it may be a hard sell.

------
zandorg
We learned about Hamming distance at university, but I could never figure out
when, what, or why it should be used. It was either predicting the future, or
just sending more bits to compensate for errors.

But what if you get errors in the new bits? It's daft.

~~~
pmjordan
Beyond a certain error rate, you will definitely end up with bad data. The
point is, with error detecting or correcting codes, you're introducing
redundancy by encoding the information into more bits than minimally required
to represent that information.

The simplest form is adding a parity bit, which allows you to detect (but not
correct) up to one bad bit (so an overhead of one bit in eight, or 12.5%, if
you store a byte of information in 9 bits).
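For instance, a minimal even-parity sketch (helper names are made up): the extra bit makes the total count of 1s even, so any single flipped bit is noticed, while two flips cancel out and slip through:

```python
def parity_bit(byte):
    """Even parity: the 9th bit makes the total number of 1 bits even."""
    return bin(byte).count("1") % 2

def check(byte, stored_parity):
    """True if no single-bit error is detected."""
    return parity_bit(byte) == stored_parity

b = 0b10110010                         # four 1 bits -> parity 0
p = parity_bit(b)
assert check(b, p)                     # intact byte passes
assert not check(b ^ 0b00000100, p)    # one flipped bit is caught
assert check(b ^ 0b00000110, p)        # two flips go undetected
```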

Using RS codes you can crank up the number of bits used for encoding, which
also drives up your error tolerance. Plus, in addition to detecting errors,
you can even correct them. So it doesn't matter if some bits come up bad (or
go missing) - the redundancy is spread equally across _all_ of the
transmitted/stored bits, so it's irrelevant which bits suffer the failure.
There aren't any "old" or "new" bits.
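The "errors in the new bits" worry is exactly what Hamming codes answer: a single flipped bit is corrected no matter whether it lands on a data bit or a parity bit. A minimal Hamming(7,4) sketch (function names made up; parity bits sit at positions 1, 2, and 4, and the failed parity checks spell out the error position in binary):

```python
def hamming74_encode(d):
    """d: four data bits [d1, d2, d3, d4] -> 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4    # covers positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4    # covers positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4    # covers positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(code):
    """Return the four data bits, correcting one flipped bit anywhere --
    data bit or parity bit alike."""
    c = list(code)
    syndrome = 0
    for mask in (1, 2, 4):
        s = 0
        for pos in range(1, 8):
            if pos & mask:          # check covers positions with this bit set
                s ^= c[pos - 1]
        if s:
            syndrome |= mask
    if syndrome:                    # syndrome == 1-based error position
        c[syndrome - 1] ^= 1        # flip the bad bit back
    return [c[2], c[4], c[5], c[6]]

data = [1, 0, 1, 1]
code = hamming74_encode(data)
for pos in range(7):                # flip each of the 7 bits in turn
    corrupted = list(code)
    corrupted[pos] ^= 1
    assert hamming74_decode(corrupted) == data
```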

------
louislouis
Erm... I see tons of comments about the 'maths trick' behind the tech, but
have any of you tried out the app? Because it's really amazing! A great idea,
great execution. If this gets the news coverage it deserves, then this could
be huge, I think.

------
PStamatiou
Even if this all works out to be amazingly effective, how are you going to
convince regular users to put their data on other people's computers?

Yes, I realize that it's all split into chunks so people won't be able to
snoop on them, but just try getting that concept past my mom.

It's neat, but I'd rather have my data in my encrypted and fast S3 account.

~~~
orib
Why is S3 different? "My data is on other people's machines" is still the case
there. Encrypt it before you send out the blocks, and you're exactly where S3
is.

------
eru
The anonymous P2P project Freenet does similar forward error correction - and
has done for ages.

