

Forever Storage - pquerna
http://journal.paul.querna.org/articles/2010/06/12/forever-storage/

======
chokma
The novel about this is Neal Stephenson's Anathem... where people strive to
keep just the necessary information alive (like, "Where did we store the
atomic waste a couple of thousand years ago?").

No matter how your data is going to be stored, it will eventually have to be
copied to another medium. And if you want to store it "forever", you _will_
get data corruption, either because of transmission errors or because the
storage medium fails. Keeping the data redundant is not enough for guaranteed
data safety, as no matter how many copies you create, one day all of them can
fail at the same time.

But you can take steps to ensure that your data is realistically safe for, say,
1000 years (which is not by any means "forever", but then who but an
archaeology AI will read your data at that time). The Long Now Foundation is
thinking about stuff like this (<http://www.longnow.org/>).

An interesting question is, how do you ensure that your data will not be
intentionally wiped out, because it is deemed (religiously, politically,
legally) offensive? A metal band's CD cover is now considered dangerously
close to child porn hereabouts (Scorpions, Virgin Killer). That may be enough
to make some people erase your data for decency's sake. A dead man cannot
fight a cease and desist order.

A different view on the matter would be: you are going to die, and your data
is going to die eventually. And even if it is around, it's likely that no one
will want to read it anyway. But your existence will have consequences for the
rest of (human) history, as just by living you provide an input to the human
race. This input may be in the form of biological data (children) or pure
information (a blog article about "forever storage" which inspires the next
great startup founder to do great things). Just go on, do something positive
besides leaving a huge carbon footprint and atomic waste that lasts for a
million years.

~~~
gwern
Long Now's Rosetta Stone is the perfect project. Monel, their chosen alloy,
doesn't seem to have any issue being stable for 1000 years. Their human-visible
text is obviously too bulky to store a reasonable database of 500GB or
1TB, but there's no reason it couldn't support a more compact encoding.

(Alternately, you could use optical storage and hope redundancy and ECC will
be enough. For example, if you want to store 500GB, you could take 20 50GB
Blu-ray discs, put a 25GB chunk of the data on each, and fill the other 25GB of
each disc with ECC data, such as <http://en.wikipedia.org/wiki/Parchive> files.
Do this every year, and you'll be able to correct files back and forth between
archive-sets as well as within individual discs. Even if the Blu-ray discs degrade, there
ought to be enough of a trace left for things like electron microscopes to
pick up.)
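
The arithmetic behind that layout can be sketched in a few lines of Python. This is only a back-of-the-envelope helper: the function name `disc_plan` and the even data/recovery split per disc are assumptions made for illustration, and nothing here calls a real Parchive tool.

```python
import math

def disc_plan(payload_gb, disc_gb=50, data_per_disc_gb=25):
    """Return (discs needed, recovery GB per disc, redundancy ratio).

    Hypothetical helper: assumes every disc carries `data_per_disc_gb`
    of payload and uses the rest of its capacity for recovery data.
    """
    if not 0 < data_per_disc_gb <= disc_gb:
        raise ValueError("data_per_disc_gb must be in (0, disc_gb]")
    discs = math.ceil(payload_gb / data_per_disc_gb)
    recovery_per_disc_gb = disc_gb - data_per_disc_gb
    redundancy = recovery_per_disc_gb / data_per_disc_gb
    return discs, recovery_per_disc_gb, redundancy

discs, recovery_gb, redundancy = disc_plan(500)
print(f"{discs} discs, {recovery_gb}GB recovery data each, "
      f"{redundancy:.0%} redundancy")
```

With the defaults, 500GB lands on 20 discs with as much recovery data as payload. Changing `data_per_disc_gb` shows how quickly the disc count grows if you want more recovery data per disc.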

------
phreeza
I wonder if there could be a distributed/open solution to this... not so easy
probably. Some BitTorrent-style distributed storage comes to mind, where you
have to provide x times the amount of data you would like to store to others,
with y% availability.

Like a data-storage ponzi scheme, but it might be sustainable because storage
and bandwidth keep getting cheaper.
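
A rough way to reason about the x and y trade-off, as a Python sketch. It assumes each of your copies sits on a different peer, that peers go offline independently, and that one reachable copy is enough. Those are simplifying assumptions, and the function names are made up for illustration.

```python
def availability(copies, peer_uptime):
    """Probability that at least one of `copies` independent peers is online."""
    if copies < 0 or not 0.0 <= peer_uptime <= 1.0:
        raise ValueError("copies must be >= 0 and peer_uptime in [0, 1]")
    return 1.0 - (1.0 - peer_uptime) ** copies

def copies_needed(target, peer_uptime):
    """Smallest number of copies whose combined availability reaches `target`."""
    if not 0.0 < peer_uptime <= 1.0 or not 0.0 <= target < 1.0:
        raise ValueError("need 0 < peer_uptime <= 1 and 0 <= target < 1")
    copies = 1
    while availability(copies, peer_uptime) < target:
        copies += 1
    return copies

# Peers that are online half the time, aiming for 99.9% availability.
print(copies_needed(0.999, 0.5))
```

Under these assumptions, flaky peers push x up quickly, which is one reason real systems tend to prefer erasure coding over plain replication.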

~~~
MichaelSalib
Tahoe-LAFS seems to fit the bill... erasure coding for the win!

<http://tahoe-lafs.org/trac/tahoe-lafs>
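
To give a feel for what erasure coding buys you, here is a toy Python sketch of the simplest possible scheme: two data shares plus one XOR parity share, where any two of the three are enough to rebuild the data. This is not Tahoe-LAFS's API or its actual code, and the function names are invented for illustration.

```python
def encode(data):
    """Split `data` into two halves and add one XOR parity share (2-of-3)."""
    if len(data) % 2:
        data += b"\x00"  # pad to even length; caller keeps the real length
    half = len(data) // 2
    a, b = data[:half], data[half:]
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity]

def decode(shares, length):
    """Rebuild the data from [a, b, parity] with at most one share set to None."""
    if sum(s is None for s in shares) > 1:
        raise ValueError("need at least two of the three shares")
    a, b, parity = shares
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, parity))
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, parity))
    return (a + b)[:length]

message = b"forever storage"
shares = encode(message)
shares[0] = None  # simulate losing one storage node
print(decode(shares, len(message)))
```

The storage overhead here is 1.5x for surviving one lost share, whereas keeping two full copies costs 2x for the same guarantee.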

------
phreeza
Technically I agree this is completely possible, the challenges are more
likely to be economic. How do you make sure the company exists 1000 years in
the future? That question poses itself especially for a potential startup
providing this service, but would even be relevant if an established company
were to do it.

If I were to trust anyone with doing this, it would probably be the Catholic
Church. Can't think of anyone else with a sufficient track record.

~~~
pquerna
Yes, surviving a societal collapse is pretty hard.

I think a societal collapse, in which you assume the basic structure of
capitalistic motivations has failed, might only have 2 real outcomes:

* Star Trek: Money becomes mostly pointless, and you hope with the altruism of other humans, the data is restored.

* Mad Max: Welp, sucks for your data, it'll be lost in the sands of time.

I think the more important perspective is, if you can make the first 200
years, the chances of someone else picking up the data increase massively --
right now most data is created and destroyed in years, never mind decades.

------
doriangray
If what you have is truly worth preserving for that long, transcode it into
<http://en.wikipedia.org/wiki/Junk_DNA> (several times over to allow for
mutations) and splice it into sperm/egg/embryo cells at fertility clinics. Let
your "progeny" multiply across the planet with Nature taking care of
replicating your data through space and time. For retrieval: I suppose that in
the distant future, DNA databases will probably be preserved for easy lookup
by sequence and thus so will your data.

The human genome is ~6B base pairs = ~12B bits (2 bits per base pair) = ~1.5GB.
Only 1.5% constitute protein-coding genes. Let's say some of that "junk DNA" is
actually useful, we could still get away with ~200MB of payload per human cell.
Further scaling
can be obtained by targeting other organisms with large populations (bees,
rabbits, cockroaches, mosquitoes, bacteria) or even by synthesizing new viruses or
parasites that hitch along to humans for an evolutionary ride while preserving
your data at the same time ...
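
A quick way to rerun this kind of capacity estimate under different assumptions, as a Python sketch. It assumes each base pair carries 2 bits (four possible bases) and uses decimal megabytes. The function name and the `usable_fraction` knob are illustrative assumptions, not figures from any reference.

```python
def dna_capacity_mb(base_pairs, bits_per_base_pair=2, usable_fraction=1.0):
    """Raw capacity in MB (10**6 bytes) of a genome used as storage."""
    if not 0.0 <= usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be in [0, 1]")
    bits = base_pairs * bits_per_base_pair * usable_fraction
    return bits / 8 / 1e6

# ~6 billion base pairs, everything counted as usable.
print(dna_capacity_mb(6e9))
# Same genome, pessimistically assuming only 15% can be overwritten.
print(dna_capacity_mb(6e9, usable_fraction=0.15))
```

At 2 bits per base pair, 6B base pairs come to roughly 1.5GB before discounting for regions that cannot be touched, so the usable fraction dominates the final figure.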

(Disclaimer: I'm in the middle of an X-files marathon)

------
derefr
It's very easy to keep the _data_ intact, if you know when, exactly, you want
access to it again: as in the book referenced in the article, The Forever War,
just put the _data_ on an object in space, relativistically accelerate it, and
set it for a return course in X years. From the object's frame of reference,
much less time (and degradation) will pass.
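
How much less time depends on the Lorentz factor. A minimal Python sketch, assuming the object cruises at a constant speed for the whole trip and ignoring the acceleration and turnaround phases (the function name is made up for illustration):

```python
import math

def ship_years(earth_years, speed_fraction_of_c):
    """Time elapsed on the object for a given Earth-frame duration."""
    if not 0.0 <= speed_fraction_of_c < 1.0:
        raise ValueError("speed must be in [0, 1) as a fraction of c")
    gamma = 1.0 / math.sqrt(1.0 - speed_fraction_of_c ** 2)
    return earth_years / gamma

for v in (0.6, 0.9, 0.995):
    print(f"{v}c: {ship_years(1000, v):.0f} years on board per 1000 on Earth")
```

At 0.6c the object still ages 800 years per 1000 Earth years, and it takes about 0.995c to cut the aging by a factor of ten.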

Or, even simpler: find a reflective surface X/2 lightyears away, and shoot
coherent light at it with sufficient power.

The real problem is preserving the _information_, which degrades even faster
than the data. Even communicating a very simple message[1] across a gap of more
than 1000 years turns out to be nearly impossible, as it will be intentionally
misinterpreted in the direction of whatever will make the finders happy.

[1] <http://www.damninteresting.com/this-place-is-not-a-place-of-honor>

~~~
stcredzero
_It's very easy to keep the data intact, if you know when, exactly, you want
access to it again: as in the book referenced in the article, The Forever War,
just put the data on an object in space, relativistically accelerate it..._

It's "easy" to relativistically accelerate something!? (Macroscopic, not
fundamental particles or ions.) This is the old saw about how, in Google
engineer speak, "trivial" means the theory is known and all that lies between
now and implementation is 300 million dollars of planning and engineering.

------
skybrian
Preserving data for future historians is a good thing, but why is it so
important to preserve _your own_ data? Do you really think that's what will be
the most valuable to them? It seems better to contribute to projects like the
Internet Archive.

Also, the value of data depends partially on its scarcity, and I expect data
about us won't be particularly scarce. More data is better, but when doing a
bulk analysis, 10 terabytes of family pictures will not be all that different
from 10.1 terabytes of family pictures.

------
LaPingvino
I think the best way to guarantee that your data lives on is to get it into a
form you can save for the future as easily as a photo album.

Some time ago I saw (maybe even here on Hacker News) something about "setting
your data in stone"...

<http://primera.eu/millenniata/millenniata-en.html> is what I found quickly
just now; there is/was a more detailed page explaining it, with a heat and ice
test of this medium.

------
pierrefar
A few years ago I won a (very small) business plan competition for a company
describing a consumer backup service like this one.

The kicker is figuring out what would happen if the company stops trading for
whatever reason (e.g. bankruptcy). Although I didn't follow through with it, I
think you can build a reasonably safe winding-down process that at the very
least returns the data to its owners.

~~~
zandorg
I was thinking about this. You don't charge people to add data - just to
remove it.

