
When Data Disappears - nsfmc
http://www.nytimes.com/2011/08/07/opinion/sunday/when-data-disappears.html?_r=1&ref=opinion
======
srl
Yuck. She claims that what we're doing to preserve old data isn't working[1],
so the solution is to emulate what vintage video game fans do - keep
circulating the same old games and pouring a great deal of time into
preserving them and (rarely) making marginal improvements. Not a very radical
idea, given that that's already what people do, well outside video game
groups. (The torrent/"piracy" society is the most obvious example.)

The other idea she has is that experts in the field should make decisions
about what to keep and what to ditch. Aside from her example (curators to
decide what data is good and what data is bad? Have you _never_ taken a
science class?), this is a good idea - and an obvious one.

[1] A dubious claim that she doesn't bother to back up with any specific and
correct examples.

(<flame necessary="no">This is what happens when English professors write
about technology.</flame> )

------
zdw
Emulation should be the last resort for things that are so dependent on their
environment that they need an exact replica going forward. Video games are
great examples of this.

For everything else, such as the collected notes of a writer or musical
notation, the goal should be to transform it into a well understood
format (for example, images of handwritten pages in PNG format, rather than
the physical pages themselves) and keep that replica as an archive.

Also, good use of integrity preservation tech is key: keeping copies of the
digital files with hashes of their contents so you know they're not corrupted,
moving to new physical media periodically, and distributing the content so
that no single failure loses every copy.
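
A minimal sketch of the hash idea in Python, using only the standard library. The function names (`sha256_of`, `build_manifest`, `verify`) and the choice of SHA-256 are assumptions made for illustration, not something named in the thread; any strong hash would serve the same purpose.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large archive files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return sorted(problems)
```

The manifest would be stored alongside (and separately from) each copy; after moving the archive to new media, running `verify` against the saved manifest shows which files need to be restored from another copy.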

------
cek
"The Cloud" has radically changed this situation. Sure, it will always be
possible to accidentally delete or corrupt something, and, yes, there are
formats that will be unreadable in the future, but the fact that

- ingress to the cloud AND
- storage in the cloud

have costs rapidly approaching $zero means that storing duplicate copies is
already almost automatic.

I have a stack of 5.25" and 3.5" floppies with all my high-school & college
papers and code on them. They may or may not be readable today. But EVERYTHING
I have done in the last 15 or so years is on multiple hard disks both in my
home and in the cloud. I did not have to work too hard to make that happen.

Today it's even more automatic. Dropbox, Amazon S3, Amazon Cloud Drive,
Flickr, cloud based email, all mean just about everything I do/have digitally
is preserved in a far more accessible and stable way than ever was previously
possible.

~~~
watmough
Sure, whilst you're alive, your data is being preserved and migrated to new
storage media.

But what about when it's your children and grandchildren? My father has most
of my grandmother's personal effects and pictures.

Can you bet that your grandchildren will be able to access and recognize your
data, pictures and music?

As an example, we've already seen one DRM-protected storage system fail in the
market, the ironically named 'PlaysForSure'. Is there any guarantee that others
will survive? What about archaic image formats? What about 40 years from now?

~~~
cek
I don't understand your first point. My "upon my death" file provides
credentials.

Your second point is interesting but I think this discussion is about personal
information and, at least as far as I can see, people don't DRM that.

------
sehugg
_But over time, emulation becomes unwieldy: because the host systems for which
emulators are designed will themselves become obsolete, emulators must
eventually be moved to new computer platforms — emulators to run emulators, ad
infinitum._

Has there _ever_ been a shortage of developers willing to write new emulators?
A better argument would be that the usability degrades since the emulated
environment is from a different era and might not map to current tech (e.g.
iPad version of a keyboard-driven app).

~~~
gwern
> Has there ever been a shortage of developers willing to write new emulators?

A good usable emulator, with all the bugs and corner-cases ironed out? I've
read the occasional MAME blog post or documentation about the gory details,
and all I remember is a mounting sense of horror and gratitude that the MAME
devs were doing the work rather than me.

There's basically only one MAME project, BTW.

------
nknight
_Sigh_ This again? Seems like the press does this story once a month.

" _If you don’t have a copy of WordPerfect 2 around, you’re out of luck._ "

Dunno about WP2, but my copy of LibreOffice purports to open WordPerfect files
of indeterminate vintage, and there are various other formats in there that I
know go back 20+ years, and I'll be very surprised if ODF 1.0 readers
disappear between now and eternity.

And of course, for us English-only types, plain ASCII files work as well now
as they did in the 60s. I expect UTF-8 will likewise work at least as well in
2050 as it does now.

Physical media degradation/obsolescence is the only thing I would worry about,
and then not that much for anything that's already been brought into the
Internet age.

Storing lossless encodings of every film ever made might be a challenge, but
text? No. A little care is all it takes, and I do mean a _little_.

~~~
jcn
While the media may do stories like this often, I feel like the question is
often more along the lines of "how do we save all of the bits!?" and less
about other, broader ways we can think about saving the information, as
opposed to just saving the data.

For me, the more relevant part of this story was:

 _"By some estimates, that’s nearly 30 million times the amount of information
contained in all the books ever published. Even if we had perfectly stable
storage, could we ever have enough to preserve everything? The short answer is
no — but only because we’re trying to replicate the practices used for decades
to maintain paper archives."_

