
Google's Vint Cerf Warns of 'digital Dark Age' - bauc
http://www.bbc.co.uk/news/science-environment-31450389
======
karmacondon
Several people have mentioned the Internet Archive. They're doing god's work
and you should give them money [0].

But I think that dapper Vint "One is glad to be of service" Cerf is referring
to how difficult it will be for far future historians to piece together
records of daily life in the 21st century, especially from the view of
individuals. Think of Da Vinci's notebooks, the personal journals of Arctic
explorers, correspondence letters from great artists and statesmen. Now
imagine that those things are stored on floppy disks in Office95 .doc format.
We'd have a hard time viewing that media today. In 1,000 years it might be
impossible. A lot more will be lost to history than what the Internet Archive
is currently storing.

I can't think of a great solution. When I watch some historical documentaries,
it seems like every other scene is based on a quotation from a letter or an
old photograph. We don't print those things out any more. They'll be lost,
completely lost, when email services and social networks finally shut down.
And even if people download personal backups, most file and physical storage
formats have a shelf life measured in decades at the most. Letters can sit
around in attics for lifetimes, undisturbed. A file formatted for a Commodore
64's word processor program might as well be written in a lost language and
then one way encrypted. I'm not sure what to do about it, but it seems like a
damn shame.

[0]
[https://archive.org/donate/index.php](https://archive.org/donate/index.php)

~~~
xamuel
>Now imagine that those things are stored on floppy disks in Office95 .doc
format

I _really_ don't think this is a big problem, in the historical context. Sure,
the average person won't be able to casually open those files, but neither can
the average native English speaker casually read the original Beowulf!

If a catastrophe wipes out an ancient civilization and all that survives is
one laptop-sized cuneiform tablet, historians don't get much from that. If a
catastrophe wipes out modern civilization and all that survives is one laptop,
historians get gigabytes, maybe terabytes, from that.

~~~
cafebeen
I think there's a middle ground though. For example, Egyptian hieroglyphics
were around but not understood until the Rosetta Stone was found. Without a
"digital Rosetta Stone" those terabytes would be similarly useless.

There's also some historical value in journals of everyday folks, e.g.
soldiers' notes from major battles or those present at historical events.

~~~
WalterBright
Given how good people are at cracking things, I seriously doubt it would be
that hard to crack a file format.

~~~
cafebeen
Well, cracking 100 years from now will probably be much harder. For example,
could CSS be as easily cracked if you didn't know it was for DVD video _and_
didn't have access to a working reference implementation?

Either way, we're surely capable of leaving a better legacy than what we
currently have.

~~~
xamuel
Even if DVD codecs are 100% opaque to future historians, recall that no video
of _any_ format existed until the 20th century, and widespread ubiquitous
video recording didn't exist until the 21st. If we're entering a dark age now,
what does that make former ages?

~~~
cafebeen
I agree that the "dark age" designation is not the best (and perhaps
sensational), but the basic issue of how to maintain digital data isn't worth
writing off...

------
Maakuth
I feel like [http://archive.org/](http://archive.org/) deserves a mention here
for their efforts of preserving digital history. Personal photos or other data
are not in their scope, but they do wonderful things with web pages and other
stuff that's publicly available. I just read Brewster Kahle's interview from
Founders at Work this week and it blew my mind how determined to build the
Internet Archive he's been for practically his whole career.

------
mtrn
Last summer I discovered, by accident, a little computer museum[1] in the
Kvarner Gulf. It's run by a single guy, who collected an incredible amount of
old hardware over the years and exhibited it in a couple of rooms.

It rained the day I visited the place, and since it was an old building, it
actually rained a bit inside. The guy took it with humour, but it was actually
a pity. It is a great place and on some days the visitors can even run old
programs on those ancient machines or play games.

[1] [http://www.peekpoke.hr](http://www.peekpoke.hr)

From their site: _Opened on 22nd of September 2007, Club PEEK &POKE is one of
the few permanent displays of vintage computing technology in Europe. Located
in the centre of the city of Rijeka and spread across 300m2 of space, it
contains more than 1000 exhibits of the world and local computer history,
ranging from very early calculators and game consoles to rare and obsolete
computers from the nineties._

~~~
ht_th
I've been to a couple of these museums as well. What makes them fun to go to
are:

\- you may touch everything, often including things inside the computers
themselves

\- the people working there (or, as is often the case, the one guy working
there) are knowledgeable, willing to talk about it in depth, care, and you can
have a real conversation with them.

\- there's a lot of surprising hardware around I had no idea existed (from
failed computer companies, to specialized I/O of decades past, to special-
purpose hardware)

You cannot get the same full experience with most regular museums, as then you
have to follow their narrative instead of following your curiosity.

On the other hand, these museums are often not much more than an erudite
collection of historical artifacts they got their hands on. To preserve is to
select with a plan, not just collect everything.

~~~
vidarh
Technical/science museums in general often seem to be closer to this ideal
than other museums, because while some of the artefacts may be valuable, more
often they are trying to showcase how things worked rather than specific
objects.

~~~
mtrn
As a child I could spend days in _Deutsches Museum_ for that reason.

[http://en.wikipedia.org/wiki/Deutsches_Museum](http://en.wikipedia.org/wiki/Deutsches_Museum)

~~~
frik
The Deutsches Museum in Munich is the world's largest and IMHO best museum of
science and technology. Especially the mechanical engineering, cars and
airplane departments are one of a kind.

Though, the computer history department is a bit too small and similar in size
to the equivalent in the Science Museum in London. In both you find a Cray super
computer, first Zuse computers and many older mainframes and terminals, etc.
But everything is dead, every historical computer just sits there. There is
no interactivity, what a shame. They should at least re-work a Cray 1 or 2
super computer and let visitors play around on its terminal - that would be
awesome.

------
Steuard
"A company would have to provide the service, and I suggested to Mr Cerf that
few companies have lasted for hundreds of years."

Maybe it's just that I'm an academic, but this sounds less like a job for a
company and more like a job for a library. (Or more robustly, a job for the
world's interconnected network of libraries.) The whole point that I take from
this article is "we must preserve our heritage for the common good", and that
sounds awfully close to a library's core mission.

~~~
TazeTSchnitzel
It's the kind of thing that the Internet Archive (which is a library) would
do.

~~~
TeMPOraL
Unfortunately, libraries (and museums and other cultural institutions) nowadays
are forced to compete on the market - which erodes their ability to preserve
and give access to our cultural heritage. If this trend continues, soon there
won't be any functional difference between a library and a company.

------
whyleyc
We observed this problem first hand when the inventor of Powerpoint emailed us
to ask for help in opening old presentations which he could no longer view!

When the guy who wrote the software which produced the file can't view it 20
years later you know you've got a problem.

You can read more about that here:

[http://blog.zamzar.com/2012/04/17/open-old-powerpoint-
presen...](http://blog.zamzar.com/2012/04/17/open-old-powerpoint-
presentations-in-office-2007-and-office2010/)

~~~
kaoD
This is why I'm a firm believer in open, free standards. Of course they don't
necessarily guarantee you'll be able to open those files, but it's a great
step in the right direction.

We shouldn't have to rely on reverse engineering to preserve what's ours.

Unfortunately public administrations don't realize (or don't want to realize)
this, at least where I live. I even discussed this with information
professionals and they wouldn't understand! We're right in the middle of some
big change and it will be too late when they finally realize.

------
cmiller1
It feels weird to me to see Vint Cerf propped up in this title by being
associated with Google. It seems the guy's name holds enough merit on its
own; I didn't even realize he was working with Google these days!

~~~
k-mcgrady
This is the first I've heard of him. After reading about him on Wikipedia I'm
very surprised I've never seen him mentioned before.

~~~
bsdpython
We've mostly forgotten the people that created modern computing unless they've
made a billion dollars

~~~
tormeh
It is Hacker News, not Computer Engineer/Scientist news. It was always
supposed to tilt towards Silicon Valley. I don't think that's a good thing,
but it's a free service from Y Combinator, so it's very understandable.

~~~
wayfarer2s
Well, since it was pg's idea, I'd guess "hacker" would slant more towards MIT-
style hacks than anything to do with the valley.

I think the lack of Vint Cerf stories has more to do with the temporal nature
of news than any biases that people might have. I'd chalk it up to him just
not making news lately. Any text on the history of the Internet, TCP/IP, and
networks in general liberally mentions him.

------
EdwardCoffin
I see a lot of comments amounting to saying that contemporary digital content
isn't worth much, and its loss wouldn't be a big hardship. This isn't the only
kind of digital content though. What I am worried about is older, pre-digital
matter which is thrown out because digital copies now exist (often crappy
scans of marked-up, faded pages, but that's another rant.) When the digital
copies disappear, we will no longer have the durable pre-digital copies to
revert to.

~~~
TeMPOraL
Indeed. Every time we scan a document and throw the original away, we're
betting on the continued existence of current technological civilization. If
it collapses, the thing we digitized and threw away is the thing future
generations will have lost from the cultural heritage. The Dark Ages may end
up extending far beyond the beginning of the information age.

------
mark_l_watson
While I think that Vint Cerf is correct in talking about the dangers of losing
the ability to read files in various formats, my hope for long term access
lies in open and standardized file formats and services like the fantastic
archive.org
([http://web.archive.org/web/*/markwatson.com](http://web.archive.org/web/*/markwatson.com)
is their history of my little web site, starting in 1996).

I think that HTML files in a standard character set like UTF-8 could be
readable a thousand years from now if human civilization has not destroyed
itself.

I hold out less hope for formats like various ogg formats, TIFF, JPEG, MPEG,
etc. Software like computer games is even more problematic.

I am hopeful that the technology will improve for archiving digital assets.
New storage technologies will become more reliable, much more information
dense, and less expensive both to build and provide power for.

~~~
TazeTSchnitzel
MPEG and TIFF I hold little hope for, but JPEG? It's an industry standard and
still a massive force on the web. Not going to die any time soon.

~~~
officialjunk
Anything with lossy compression, like JPEG, is probably going to die.

~~~
T-hawk
I don't think so. There will always be usage cases involving a couple orders
of magnitude less capacity or bandwidth than the leading edge. Lossy
compression will always be appropriate for these.

Anything involving wireless is a good start. There's a hard physical limit to
the amount of data you can cram over 4G or 802.11 spectrum. It's physically
impossible to losslessly stream video at 4k 60fps over these, so that's why we
use lossy compression (currently MPEG) and always will.
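As a rough back-of-the-envelope check of the bandwidth argument above, the sketch below computes the raw bit rate of uncompressed 4K video at 60 fps. The 24 bits per pixel and the 100 Mbit/s link rate are illustrative assumptions, not figures from this thread, and real lossless compression would bring the raw number down somewhat, though not by two orders of magnitude.

```python
# Raw (uncompressed) bit rate of a video stream, in bits per second.
def raw_bitrate_bps(width, height, bits_per_pixel, fps):
    return width * height * bits_per_pixel * fps

# 4K UHD at 60 fps, assuming 24 bits per pixel (8 bits per RGB channel).
uhd_bps = raw_bitrate_bps(3840, 2160, 24, 60)

# Hypothetical wireless link rate, chosen only for illustration.
ASSUMED_LINK_BPS = 100_000_000

# Compression ratio needed to fit the raw stream into the assumed link.
required_ratio = uhd_bps / ASSUMED_LINK_BPS

print(f"{uhd_bps / 1e9:.2f} Gbit/s raw, needs about {required_ratio:.0f}:1 compression")
```

Under these assumptions the raw stream is roughly 12 Gbit/s, so the link would need a ratio on the order of 100:1, which is the territory of lossy codecs.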

------
crazydoggers
No one ever considers the volume of photos and documents we are now creating.
My parents probably took dozens of photographs when I was younger, and kept
maybe one or two albums. Nowadays, most people have hundreds if not thousands
of photographs and videos. So it seems to me that the converse will actually
be true. If you've got a 100 fold increase in the number of documents and
photos being created, and only 1% of those make it, you're still preserving
the same number of documents. And frankly, I think that's a conservative
number.
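The arithmetic behind that claim can be written out in a few lines. This is only a sketch: the baseline of 200 printed photos and the assumption that all of them survive are made-up numbers for illustration, not data from the comment.

```python
# Number of items that survive, given how many were created and what
# percentage of them is preserved. Integer math avoids rounding noise.
def survivors(created, survival_percent):
    return created * survival_percent // 100

# Hypothetical film-era household: 200 photos, all kept in albums.
film_era = survivors(200, 100)

# Digital era: 100 times as many photos, but only 1% make it.
digital_era = survivors(200 * 100, 1)

print(film_era, digital_era)
```

Both calls return the same count, which is the point being made: a 100-fold increase in volume cancels out a 1% survival rate.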

As some of the other posters here mentioned, with the advent of cloud based
services and easy backup systems, retrieval is getting easier. So if storage
and retrieval is solved, that leaves format evolution problems. But somehow I
doubt in 20 years or even more that JPEG is somehow going to be harder to read
than it is now.

~~~
vidarh
A couple of years ago, Kodak's online photo storage service (in the UK at
least) shut down. We just barely managed to get copies of the hundreds of
pictures we had stored there in time.

Other recent (non-photo specific) examples of vast amount of data
disappearing:

Geocities. Yes, large chunks of it were archived at the last minute. MegaUpload.

And we have Rapidshare on its way to disappearing.

Cloud services can and will shut down, and it is not at all a given that we
manage to preserve the data.

This creates a relentless churn where some proportion of older data disappears
every day. All we really can do is to fight to keep the churn rate low enough,
because we have no realistic prospect of saving everything all the time.

~~~
svachalek
I don't think this invalidates the point though. In those alone we may have
already lost more pictures than were ever taken before 1990 but at the same
time the amount we have left is staggering. If you go back 100 years, there
are famous people that we have maybe 1 picture of. No matter how sloppy we are
with the majority of internet content today, I can't imagine that 100 years
from now, they'll only be able to find a couple pictures of Obama and Putin.

------
jgrahamc
Relevant:
[http://en.wikipedia.org/wiki/BBC_Domesday_Project](http://en.wikipedia.org/wiki/BBC_Domesday_Project)

"The project was stored on adapted laserdiscs in the LaserVision Read Only
Memory (LV-ROM) format, which contained not only analogue video and still
pictures, but also digital data, with 300 MB of storage space on each side of
the disc. Data and images were selected and collated by the BBC Domesday
project based in Bilton House in West Ealing. Pre-mastering of data was
carried out on a VAX-11/750 mini-computer, assisted by a network of BBC
micros. The discs were mastered, produced, and tested by the Philips
Laservision factory in Blackburn, England. Viewing the discs required an Acorn
BBC Master expanded with a SCSI controller and an additional coprocessor
controlled a Philips VP415 "Domesday Player", a specially produced laserdisc
player. The user interface consisted of the BBC Master's keyboard and a
trackball (known at the time as a trackerball). The software for the project
was written in BCPL (a precursor to C), to make cross platform porting easier,
although BCPL never attained the popularity that its early promise suggested
it might."

~~~
new299
The Domesday project has fared rather better than many other projects; its
content is accessible online:

[http://www.bbc.co.uk/history/domesday](http://www.bbc.co.uk/history/domesday)

And I believe the content on the original Laserdiscs has been reverse
engineered more than once. Hopefully once content is on the web (unless it's
behind robots.txt) it gets sucked up by the Internet Archive.

That of course doesn't take care of any rendering issues.

------
gambiting
I feel like we are already experiencing that. I have a drawer full of pictures
given to me by my parents, some of them 100 years old, but I can't open
pictures I took with my digital camera 10 years ago, because the CDs I burnt
them to are unreadable. Obviously the answer to that is that I should have
printed at least some of them, but I don't know anyone who prints pictures
nowadays. Nowadays I back up all of them to Picasa (Google+ albums), but I
have no idea what happens if I can't pay for monthly storage anymore. Or if I
die and no one knows my Google password? All of that data will be gone
permanently.

~~~
wil421
>but I can't open pictures I took with my digital camera 10 years ago, because
the CDs I burnt them to are unreadable.

We still had jpegs 10 years ago. All of my digital pics from 2005/6/7 till now
are still on multiple hard drives and multiple systems. Each time I get a new
computer I transfer them over.

Why are they unreadable? It sounds like you put them in a proprietary format
or used some obsolete picture software to burn them.

I think it's absurd that you are paying for picture storage and think it's your
only option. Amazon Prime customers can upload pics for free now [1]. Why not
use one of the free ones like Dropbox, Google Drive, SkyDrive, etc.?

[1][https://www.amazon.com/clouddrive/primephotos](https://www.amazon.com/clouddrive/primephotos)

~~~
chowells
They're unreadable because CDRs aren't an archive mechanism. They break down
rather rapidly. Their shelf life is similar to a floppy disk. After 10 years,
the odds of being able to read a CDR that wasn't stored in perfect conditions
are pretty low.

~~~
wil421
Are you sure it would be as quick as 10 years? I have music CDs and mixes
easily from the early 2000s that still work. A few years ago I found a
Jurassic Park soundtrack that still worked.

My parents have CDs from the 90s that still work.

Would there be a difference between a music CD you bought from a store and one
that you burned yourself?

~~~
tlrobinson
Yes there's a huge difference between professionally "pressed" CDs and burnt
CDRs.

There are also special archival disks and burners, like M-Disc, which
literally "engraves in stone": [http://www.mdisc.com/](http://www.mdisc.com/)

~~~
wil421
But a 10-year life span for CDRs? My personal experience is much different with
both CDs and floppy disks (I still have some of those that work).

------
whoisthemachine
DRM also plays a role in this. Obsolete DRM techniques whose algorithms have
long been forgotten will make it difficult to archive data protected by DRM
onto new storage mediums.

~~~
kbart
I somehow doubt it's a big issue. Do you know many DRM schemes that haven't been
cracked in a matter of days/weeks after the release?

~~~
wlesieutre
Yeah, especially with computer games. Whatever old game is having DRM trouble,
you can usually find a crack for it on Rapidshare.

~~~
vidarh
I can't tell if you made that reference on purpose, or if you're unaware that
Rapidshare is shutting down.

~~~
wlesieutre
Yep, tongue firmly in cheek on that one.

------
herf
You can virtualize a PC (or replicate the filesystem like Dropbox), but it is
harder to virtualize a distributed system like Facebook's or Google's. And
most mobile platforms are DoA without their vendor signing and cloud services.

It's always been hard to migrate out of social systems--convincing a well-
connected user of a photo service to move away from the place where they've
accumulated comments and tags is really hard. That metadata is not portable
because identity is not yet portable, and it's what we're spending time on
(lots more time than we spend making spreadsheets).

I think we might manage to keep "JPG as file" alive for 30-50 years, but there
is lots more to manage.

~~~
vidarh
This is the crux of the problem.

We are getting better at archiving files. And thanks to a combination of
people dedicated to emulation and the rise of virtual machines, there are few
popular pieces of hardware we can't emulate in excruciating detail.

But many of the services I used a decade ago are already gone. And pretty much
_all_ the services I used two decades ago have completely disappeared, with
a very few notable exceptions. And with them, vast amounts of data.

Some of it the Internet Archive has at least captured static snapshots of
(and they really should have orders of magnitude more funding), but ten times
that - or more - was data in walled gardens, behind logins or otherwise
restricted in ways that mean it is lost forever unless we're lucky and it turns
out some admin held onto backup tapes they weren't really meant to keep.

And the problem there is not creating a snapshot of a single server; as you
say, distributed systems are far harder. Even recent ones. Twice I've
been contracted to help companies take over infrastructure that involved
systems I'd worked on, and try to "package it up", and it was incredibly hard,
because no matter how much you try to tear down and bring up individual
servers or groups of servers and automate deployment, very few places running
complex services ever try - or could afford to try - to tear down and bring up
a full copy of their entire infrastructure.

Suddenly all kinds of nasty interdependencies and bootstrap problems nobody
had needed to think about show up.

------
Animats
If there's trouble with image, video or audio files, it's probably going to be
because of DRM. The number of formats for those is small relative to the
number of items stored.

Text documents in obscure formats can be more troublesome. There were many
early word processors, and many file format versions. Those can be hard to
convert, and there will be obscure text documents some historian will want to
see a century from now. Converting stored text documents into some self-
explanatory form like XML for archiving purposes is helpful. Even if the
software doesn't survive, the text will still be there and someone can
probably figure out the encoding.
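A minimal sketch of what "converting into some self-explanatory form like XML" could look like, using Python's standard `xml.etree.ElementTree`. The element names (`document`, `title`, `source-format`, `body`, `paragraph`) and the sample text are hypothetical choices made for this illustration, not an established archival schema.

```python
import xml.etree.ElementTree as ET

def to_archival_xml(title, source_format, paragraphs):
    """Wrap plain text in a small, self-describing XML document.

    The tag names are illustrative only; a real archive would more
    likely follow an agreed schema.
    """
    root = ET.Element("document")
    ET.SubElement(root, "title").text = title
    # Record where the text came from, so a future reader has context.
    ET.SubElement(root, "source-format").text = source_format
    body = ET.SubElement(root, "body")
    for paragraph in paragraphs:
        ET.SubElement(body, "paragraph").text = paragraph
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")

xml_text = to_archival_xml(
    "Letter to a friend",
    "hypothetical 1980s word processor file",
    ["Dear Ann,", "The weather & the work are fine."],
)
print(xml_text)
```

The output is plain, human-readable text: even without any XML software, someone can see the words and guess what the tags mean, which is the property the comment is after.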

Structured graphics files from old programs are a real problem. This is a big
problem in the CAD world. CAD files aren't just pictures any more. They're
detailed descriptions of physical objects and how they're made. Moving them
from one present-day CAD program to another is tough. Going back 20 or 50
years will be tougher. People will have a real need to do that; buildings,
aircraft, and industrial machinery last that long. The present compromise is
to export such things in well-known formats that are viewable, but not
necessarily editable.

Cerf is talking about preserving execution environments, so you can run old
software years later. With so much "cloud based" stuff, and network oriented
DRM, that's not going to work once the servers have gone away.

------
bkeroack
I've been thinking about this too, but I believe that Vint Cerf's solution is
not feasible. Or, rather, it's a typical technologist "solution" that only
prolongs the problem.

Imagine you're an archaeologist in the year 4500 or so, by our calendar. What
we know of as modern Western civilization collapsed thousands of years prior,
all you have are physical artifacts dug out of the ground in your attempt to
reconstruct the history of this lost civilization. What would you see?

Circa the late 20th century you'll notice a precipitous decline in the volume
of any surviving cultural material--meaning printed books, magazines, business
papers of various sorts. You'll find fewer sound recordings, ticket stubs,
even purchase receipts. The various detritus of daily life will appear to
rapidly dry up and virtually disappear, around the world more or less
simultaneously.

What conclusion would you make? The population at the time seemed to be stable
or growing as measured by ruins of settlements, yet they seemed to be doing
less or producing less? Perhaps there was a crisis in education and illiteracy
became rampant? Maybe there was an ecological disaster and paper itself became
scarce?

You see that a huge proportion of our culture that we take for granted--not
just pop culture, cat gifs, etc--but substantial business and scientific
research information as well would be completely lost to these future
historians. On a more trivial level, when was the last time you saw a
comprehensive, _printed_ guide to iOS 8 (for example)? You could consider that
a significant cultural/artistic artifact from our time, yet how will it be
preserved in any meaningful way for future scholars?

Anybody interested in these questions might peruse
[http://www.longnow.org](http://www.longnow.org)

~~~
pixl97
I'm pretty sure any future archaeologist would quickly figure out the
overwhelming number of rounded rectangles with glass screens had something to
do with the decline of other media.

------
danans
There seem to be at least two future scenarios people are discussing here:

1) A future where society loses the ability to build and maintain tools to
process digital data at scale, and as such, is reliant only on analog tools
to reconstruct the past.

2) A future where due to the extreme _advancement_ of technology, old file
formats are "forgotten", and therefore they are difficult to decode.

I think Vint's statements are to be taken in the context of scenario 2, although
I kind of think that even if we forgot the format, we'd be able to rediscover
it with enough analysis and computing power, which would be a non-issue in
future scenario 2.

For scenario 1, I don't think we have answers short of burying long lasting
dormant computing devices in bunkers all over the world, along with maps of
how to find them cast in a very stable medium.

Or see
[http://rosettaproject.org/disk/concept](http://rosettaproject.org/disk/concept)
as a proposed analog preservation solution for a large(ish) data set.

------
Udik
Strange, I would call these preoccupations BS if not for the source. This was
a common concern some 20 to 30 years ago, when the internet was practically
non existent and people still used old floppy disks and cds. Today, a good
part of our personal files are stored in the cloud and storage devices have
become more standard through the usage of universal interfaces. Some of our
personal data will be lost (whatever is left in the hard drives of old
computers and not backed up) but the trail of information the world is leaving
behind is so huge that what worries me is rather that time seems to have
frozen. Everything we leave behind us, documents, pictures, looks as fresh
today as it was the day it was produced. And as easily retrievable.

~~~
agumonkey
On another thread people were saddened by Rapidshare-like websites being shut
down because a lot of content was hosted there. Let's be sure the cloud will
stay long. So far most 80s data I see on the web comes from .. magazines. Ha,
paper.

~~~
evgen
The trick with digital data (including that stored in cloud services) is to
keep it in motion. Movement from one service to another, leaving behind a copy
that may eventually get deleted but just might stick around a while, is the
best way to keep data available. Whatever disappeared with Rapidshare would
have been just fine if people had moved it to the various "latest and
greatest" systems when they popped up; the only way it gets lost is if it
stops moving.
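One way to picture "keeping data in motion" is a copy step that verifies the new copy before anyone trusts it. The sketch below is an assumption about how that might be done locally with the Python standard library; the function names `sha256_of` and `migrate` are invented here, and a real move between cloud services would go through each provider's own tools.

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def migrate(src, dst_dir):
    """Copy src into dst_dir and confirm the copy is bit-identical.

    The original is left in place, so an extra copy "just might stick
    around a while".
    """
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / Path(src).name
    shutil.copy2(src, dst)  # copy2 also preserves timestamps where it can
    if sha256_of(src) != sha256_of(dst):
        raise IOError(f"checksum mismatch after copying {src}")
    return dst
```

Calling `migrate("photo.jpg", "new_service/")` (hypothetical paths) would leave two verified copies, one in each location.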

~~~
pmontra
What happens when one dies, who keeps moving that data? Companies eventually
close. Millennia-old documents made their way to us without anybody caring
about them. Digital data seem to be intrinsically different from analog ones
in that continuous care is needed. What you propose is a solution but I can't
see who's going to do it for free as walls and caves did for paintings and
inscriptions.

~~~
agumonkey
Not necessarily. Analog data decays too. The issue is being blind to the fact
that neither `technology` is timeless and perfect. People are buying into
digital data too easily and too deeply nowadays.

~~~
jarek
> Analog data decays too.

Sure, but at what rate? Most 20 year old CDs are still fine, to say nothing of
paper stored with a sliver of care. What percentage of web companies from 1995
are still around?

~~~
evgen
Consider this fact: for perhaps a hundred years prior to digital technology
consuming just about everything, we collectively produced gigatons of paper
output. How much of that is available today? A thousandth of a percent
perhaps? Almost all of it has been lost and most of what does remain is
available because it was related to someone famous or some famous event, or
because it was widely replicated. Digital data is easy to replicate and spread
widely. Given how cheap it is to store and how it gets cheaper every day it is
not inconceivable that every bit that is generated and "donated to the public"
will survive forever.

------
butterfi
I work for a science museum that is struggling to keep its twenty-year digital
collection (web, media, art, etc) archived. Resources seem to be the common
factor. It takes time and energy to keep old projects from falling into
disrepair or losing them altogether, and many institutions don't have the
bandwidth or resources to dedicate to the long term. It's a problem, and I'm
grateful to Vint Cerf for putting a spotlight on the issue.

------
fidotron
The only long term answer is to detach the data stores from the applications
which process them. It is the migration between apps that leads to this data
orphaning.

This also leads to the conclusion that a standard mechanism for interfacing to
the data from multiple apps on multiple data backends is needed. The Android
storage framework is probably the best effort at this so far, but it's far
from clear how used it is.

~~~
kawera
remoteStorage - An open protocol for per-user storage

[https://remotestorage.io](https://remotestorage.io)

------
krick
I was expecting more obvious suggestions to move all our stuff to google cloud
"for the greater good". Apparently it was just a preparation…

But seriously, these complaints are dubious at best. Changing hardware? Well,
maybe. Still, we won't migrate to newer hardware unless we can bring our stuff
with us. Maybe only if 2D pictures will eventually be considered obsolete and
never used since, but I doubt it as well. Changing software? I'm struggling to
imagine how text documents could become unreadable. Even on completely new
architecture it won't be hard to write a translator. In the same way, I cannot
imagine bitmap images becoming obsolete, and every single format we use is
just a moderately complicated compression algorithm wrapped around a bitmap, so
any curious historian will be able to recreate it by himself. The same holds
true for wav, flac, ogg, mp3, etc.

I can imagine how Adobe swf will become obsolete and it might be hard to find
software to open it in 100 years. Or Microsoft Office slideshows. But it feels
almost right.

------
dbpatterson
Camlistore seems relevant - its whole intention is to last (at least) 100
years, primarily by making the data format simple and making data migration
(ie, moving between providers, between hard disks, etc) a common thing.
[https://camlistore.org/](https://camlistore.org/)

------
JoiDegn
That reminds me of the Encyclopedia Galactica
[http://en.wikipedia.org/wiki/Encyclopedia_Galactica](http://en.wikipedia.org/wiki/Encyclopedia_Galactica).
I wonder if this is a scheme to get comp scis to a remote location and
eventually start a new galactic empire.

------
maerF0x0
When I think of future civilizations poring over our archives I imagine
they'll have invented new technologies that give them capabilities to discover
facts in ways we can't imagine. So long as the bits are still correct I could
envision an AI that can basically "crack" the codes and bring about a human
readable version. It's sort of like cryptanalysis in that you have a bunch of
"cipher" texts (e.g., .DOC files) and maybe you even have some known plain texts
(.txt files with similar file names). Yes, I realize .txt is ASCII and that is
a cipher in itself, but I am presuming that a 1-1 keyless mapping will be easy
for them to figure out.

Encrypted stuff may be totally unreachable though, besides brute force.
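
A rough sketch of the known-plaintext idea, assuming the obsolete format really is nothing more than a fixed 1-1 byte substitution (real .DOC files are far more structured than that). The function names and the toy `legacy_encode` scheme are made up for illustration:

```python
def infer_byte_mapping(encoded: bytes, plain: bytes) -> dict[int, int]:
    """Learn an encoded-byte -> plain-byte table from one aligned sample pair."""
    if len(encoded) != len(plain):
        raise ValueError("samples must be aligned byte for byte")
    mapping: dict[int, int] = {}
    for enc_byte, plain_byte in zip(encoded, plain):
        # setdefault stores the first pairing and returns the stored one afterwards.
        if mapping.setdefault(enc_byte, plain_byte) != plain_byte:
            raise ValueError("inconsistent pairing: not a 1-1 keyless mapping")
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("two encoded bytes share a plain byte: not 1-1")
    return mapping


def decode(encoded: bytes, mapping: dict[int, int], unknown: int = ord("?")) -> bytes:
    """Translate bytes through the learned table; unseen bytes become '?'."""
    return bytes(mapping.get(b, unknown) for b in encoded)


def legacy_encode(plain: bytes) -> bytes:
    """Hypothetical stand-in for a forgotten format: shift every byte by 1."""
    return bytes((b + 1) % 256 for b in plain)


# One surviving pair (old file plus matching .txt) is the "known plaintext".
known_plain = b"the quick brown fox jumps over the lazy dog"
table = infer_byte_mapping(legacy_encode(known_plain), known_plain)
recovered = decode(legacy_encode(b"hello world"), table)
```

The table only covers bytes that showed up in the known pairs, so anything else comes back as `?`. More surviving pairs mean fewer gaps, which is about all this toy version can say on the subject.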

------
fit2rule
I think this is one reason to keep the 8-bit computing world alive. The act in
and of itself is worthy from the perspective that a lot of the older software
is still perfectly useful, fun, and applicable to the modern world, and it
will also _inspire_ the perpetuation of platforms as we fall ever further over
the abyss of hardware relevance. On one end is 'how relevant is the new
hardware' and the other is 'how relevant is the old hardware' .. as soon as
these become equivalent, we have a stable progression of human digital
culture. But, we don't have that: everyone upgrades their new iDevices as soon
as they can, and even in just the last few short years we start to see apps
that just don't run any more. This is a given.

Which is why I think that the resuscitation of 8-bit Computing, specifically,
is such a valuable thing to do: it provides context. When you've spent the
evening actually having fun with 30-year old software, the urge to splurge on
soon-to-be-redundant newgear is de-composed. Eventually, a person can
understand that all computing architectures over the Age, So Far, are of use.
That's how they got to be a working program in the first place: someone found
it useful.

I recently downloaded a PDF of 80 or so BASIC programs, written to be as
compatible as possible with the plethora of machines that were available in
the 80's. What a joy it was to see linked lists, self-modifying code, and
competent optimization of program space while also using simplified
interfaces, to be as cross-platform as possible. A modern comp-sci student
can, even today, still learn a _lot_ of very important lessons about computers
by reading such archives and going through 30 or so years of history. It
factually is not a long amount of time. All those lost floppy disk
collections, out there in the dumping grounds, or even the ones still working,
hidden in the closet, have the potential to be just as relevant in 100 years
as they were on the very first day of publication.

I urge anyone with an 8-bit stash to dig it out, soon enough, and find your
active community. There are few 8-bit machines out there which don't have a
thriving scene.

------
jpswade
This is a very clever way to pitch cloud technology to the older generation
that simply don't understand the benefits.

It essentially says that if you don't use cloud technology, your precious
photos and documents might be inaccessible in years to come.

~~~
wlesieutre
Cloud services require their own sort of maintenance though.

[http://www.everpix.com](http://www.everpix.com) and the like. Hopefully less
often than hard drive failures, but more frequently than a format like JPEG
becoming unreadable.

I really hope someone comes up with a good long-term data archive/retrieval
system soon, but I'm not holding my breath.

------
pjc50
The irony is that Google have also participated in acquishutdowns and data
deletion. They still have the Deja archive online, but I think it's reasonable
to worry whether they might just bin it at short notice like Google Reader or
Wave.

"Never trust a corporation to do a library's job"
[https://medium.com/message/never-trust-a-corporation-to-do-a...](https://medium.com/message/never-trust-a-corporation-to-do-a-librarys-job-f58db4673351)

An example of acquishutdown from HN:
[https://news.ycombinator.com/item?id=8472047](https://news.ycombinator.com/item?id=8472047)

------
bsbechtel
How is this different than any other point in history? Throughout time, the
vast majority of records and artifacts were lost, and only some given
percentage survive. Let's say it's ~2% for argument's sake...actually, the
amount of material that survives would decay over time - 40% after 10 years,
20% after 50 years, etc.

I have no idea what the rate of decay for digital records, information, and
archives would be, but I would think it would be higher than information
stored as hard copies of paper, books, etc. Of course, we are also producing
orders of magnitude more information than we were in the past.
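
For what it's worth, here is what those for-argument's-sake figures (40% left after 10 years, 20% after 50) would imply as a yearly survival rate, assuming steady exponential decay within each stretch. The numbers are the made-up ones from above, not measurements:

```python
def annual_survival_rate(fraction_start: float, fraction_end: float, years: float) -> float:
    """Yearly survival rate r such that fraction_start * r**years == fraction_end."""
    return (fraction_end / fraction_start) ** (1.0 / years)


# Years 0-10: everything (1.0) shrinks to 40%.
first_decade = annual_survival_rate(1.0, 0.40, 10)

# Years 10-50: the surviving 40% shrinks to 20% over the next 40 years.
next_forty = annual_survival_rate(0.40, 0.20, 40)

print(f"first decade: {first_decade:.4f} of records survive each year")
print(f"next 40 years: {next_forty:.4f} of records survive each year")
```

Under those assumptions the yearly loss is much steeper early on than later, which fits the intuition that whatever makes it through the first decade tends to be the stuff someone bothered to keep.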

~~~
benihana
>How is this different than any other point in history?

I would argue that for the first time in history, we're not limited by storage
space.

------
drivingmenuts
Who pays the cost of storage? It's not a simple question at all, especially
when you consider that most of the information being stored is only marginally
useful now and the future value is a complete unknown.

------
pmontra
I quote a page of Stross' Glasshouse which I'm reading right now.

"We know why the dark age happened [...] Our ancestors allowed their storage
and processing architectures to proliferate uncontrollably, and they tended to
throw away old technologies instead of virtualizing them. For reasons of
commercial advantage, some of their largest entities deliberately created
incompatible information formats and locked up huge quantities of useful
materials in them, so that when new architectures replaced old, the data
became inaccessible."

------
dwarman
The Long Now folks have been warning of this for at least twenty years. Their
definition of a "Dark Age" is one for which we have no extant original
records. Which describes the state of digital data precisely. Their concern
with this is how to encode the data alongside the clock for distant
descendants to read without access to the originating technology. From their
exposition I learned to back up complete computers, not just work files, not
even just the hard drives. Still not enough, even for decades let alone
millennia.

------
romaniv
_" A company would have to provide the service, and I suggested to Mr Cerf
that few companies have lasted for hundreds of years. So how could we
guarantee that both our personal memories and all human history would be
safeguarded in the long run?"_

This puts another perspective on the story about Japan's oldest companies,
does it not?
([https://news.ycombinator.com/item?id=9041040](https://news.ycombinator.com/item?id=9041040))

------
DanielBMarkham
Meta: I know this is the Internet, and we're all supposed to complain all of
the time. And I am aware of HN's submission guidelines. But do we really need
to refer to him as Google's Vint Cerf? The guy is famous in our community.
It's not like he needs another adjective, and it's not like Google owns him.
How about just his name? I find this phraseology a bit disconcerting.

~~~
sswaner
Contrast with the comments on the Paul Carr post - several comments asking
"Who is this guy?".

------
greatabel
I think some organizations should do things like a "Time Machine of the
Internet", like OS X's Time Machine, just at a larger scope.

~~~
jarman
[https://archive.org/web/](https://archive.org/web/)

------
WalterBright
I find this strange. I know a lot about my parents, a few bits and pieces
about my grandparents, and essentially nothing about my great-grandparents.
Very, very little of those days was kept, so how can we be entering a "dark
age"? Heck, archaeologists are digging through trash heaps in Jamestown trying
to figure out how they lived.

------
wglb
Reminds me of the extremes the hackers had to go to to recover images from the
NASA tapes
[http://www.extremetech.com/extreme/181241-lunar-recovery-pro...](http://www.extremetech.com/extreme/181241-lunar-recovery-project-restores-stunning-moon-footage-from-inside-an-a-mcdonalds)

------
pervycreeper
This is valuable (as I understand it, they are using a metaphor for virtual
machines), however the need for this in most cases could be obviated by using
open data formats. The problems of physically degrading storage media, "bit
rot", and the ability to interface with obsolete technology are also a big
deal.

------
jakosz
This is already an issue in many areas, perhaps most troubling with regards to
scientific data (e.g.
[http://www.sciencemag.org/content/331/6018/694.short](http://www.sciencemag.org/content/331/6018/694.short))

------
JustSomeNobody
This is why proprietary binary formats are evil. We hear so much about IoT now,
but what we're really going to get is MS's IoT, Samsung's IoT, Apple's IoT. We
are not ever really going to have a true IoT (assuming we even need an IoT).

------
thisjepisje
" _Vint Cerf is promoting an idea to preserve every piece of software and
hardware so that it never becomes obsolete - just like what happens in a
museum - but in digital form, in servers in the cloud._ "

Why not regular servers?

~~~
vidarh
The point is that the description/emulation would need to be a virtual
machine, so it can be transported and duplicated, as long as you have the
basic system to run it on. Otherwise the problem is just as bad when those
"regular servers" die.

------
PaulHoule
Librarians used to worry about this 20 years ago, but then SNES9X came along.

------
mcantelon
The effort to combat this is known as digital preservation:
[http://en.wikipedia.org/wiki/Digital_preservation](http://en.wikipedia.org/wiki/Digital_preservation)

------
known
[http://archive.today/](http://archive.today/) is nifty

------
newscracker
There are at least a few different things to be considered here, and I'm
instantly reminded of a big stinker of a post by Mark Pilgrim about
proprietary apps and data formats as well as another old article about how
stuff that was archived by some government agency was not even readable or
accessible later on due to technology changes (it's not one of those
referenced in the Wikipedia article for "digital dark age").

Firstly, the medium of storage, the encoding and the interface form one angle
to look at. Just like how floppies and CDs are now (almost) obsolete, there
would be a future when there won't be any machines that recognize a USB
storage device. The same would hold good for other technologies that we have
quickly run through, to mention a few - PATA, SATA, PCI, PCI-X, PCIe and so
on. Can you read an MFM hard drive today with any computer that you have? With
adequate care, data can be moved from one medium to another, like how people
learned to move music from tapes to CDs and then to flash drives and hard
drives, then to the truly nebulous thing called "cloud", etc.

Next, consider the data formats themselves and the applications that support
them. This is where proprietary formats, especially those that are not widely
popular, would hurt the users. So if you're using, for example, Apple's
document formats on Pages/Numbers/Keynote, it's likely that those files will
soon become obsolete (as they already have, where Apple does not support older
formats in the newer iWork). Commonly used and supported formats that don't
change rapidly, like JPG and PDF, for example, are safe for a much longer time
because there are many applications to process these documents with, both
proprietary as well as FOSS. Even the web pages stored by archive.org or any
doc or xls files lying around from about 20 years ago - how many more years do
you think browsers, word processors and spreadsheet programs will keep
bloating up (like they have been so far) just to support older versions of
doc, xls, ppt, html, etc.? At some point, the bloat will have to be cut and a
decision made that older formats will not be rendered like they used to be.
That would leave some clobbered text or gibberish or both showing up for any
interested future humans to figure out whether it's worth preserving or not
and to convert it to a newer format if they care.

Now, assume the data format is a long surviving one, like say, mp3 or jpg. How
do you protect it from bit rot wherever it's stored? Just because you put it
on Amazon or iCloud or Dropbox does not mean you can't lose part of the data
to corruption of different kinds or lose all of your data due to system
failures. Among the people who do regularly back up, perhaps only a fraction of
a percent actually verify their backups (if at all). With consumer level cloud
options, there's not a lot of hope for data longevity for non-tech savvy
people.
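
As a minimal sketch of what "verifying a backup" could look like: record a SHA-256 digest for every file once, then recompute and compare later. The function names and the manifest layout here are my own assumptions, not any particular backup tool's format:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large photos or videos need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the files that are missing or whose contents no longer match."""
    damaged = []
    for relative_name, expected in manifest.items():
        target = root / relative_name
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(relative_name)
    return damaged
```

This only detects rot, it doesn't repair it. You still need a second good copy to restore from, and the manifest itself has to be stored somewhere that isn't the same disk.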

If you're dreaming up a beautiful future in the cloud for all data storage,
how can you be sure that your data just doesn't vanish or that it doesn't
diminish in quality? People have had that happen to their precious photos by
sites that had sneaky terms and conditions about deleting old photos (or if
photo prints are not ordered regularly), didn't allow full downloads at any
point in time, and sites that went out of business. We can see people treating
social networks as reliable cloud storage for their photos as well,
disregarding the risks of making such an assumption.

Ignore data integrity and data format issues for a moment, and imagine a
distant future where 64K displays are common. All your current and older
photos and videos would either look terrible on those or may even be
completely indistinguishable. Of what use would these artifacts be then for
anyone?

Considering that a lot of data on the cloud is actually insignificant at the
level of the human species (like rants and comments on social networks,
LOLcats, etc.) and also the huge amount of data being created every second,
how would someone in the future even sift through all this? It's somewhat
similar to the NSA/GCHQ looking for a needle in a haystack the size of a huge
mountain, except that this haystack is to preserve history, culture, etc. The
current haystack being built would need a lot of archivists from different
backgrounds working continuously to separate the wheat from the chaff and to
also look at consolidating (or "packing") the information concisely (like
gathering summaries, sentiments, trends, statistics) if we're ever to have an
archive that future generations would even want to look at (say centuries
ahead in the future). Leaving it to governments alone or corporations alone is
not the solution since each would shape the archive in its own image.

This is a very complex topic for most individuals to deal with, and the above
points didn't even touch upon cultural and linguistic shifts that happen over
time for any data to be usable. I'm sure I've missed many other aspects about
prolonging the life of (usable and useful) data.

P.S.: The best everlasting format, as many tech savvy people know, is plain,
unencrypted text for textual content (this still assumes that the media can be
read, because encoding may play spoilsport).
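
To illustrate the encoding caveat, here is a small sketch that tries a short list of candidate encodings on bytes of unknown origin. The candidate list and the function name are arbitrary choices of mine for the example:

```python
def decode_with_fallback(raw: bytes, candidates=("utf-8", "latin-1")):
    """Return (text, encoding_name) for the first candidate that decodes cleanly."""
    for name in candidates:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit")


# The same word saved by two different programs ends up as different bytes.
modern_bytes = "café".encode("utf-8")
older_bytes = "café".encode("latin-1")

text_a, guess_a = decode_with_fallback(modern_bytes)
text_b, guess_b = decode_with_fallback(older_bytes)
```

Note that latin-1 accepts every possible byte, so it works as a last resort but a successful decode doesn't prove the guess was right. Plain ASCII text sidesteps the issue because the common encodings agree on it.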

P.P.S.: All LOLcats may actually be cultural items to preserve for eternity!
:P

------
digerata
I, for one, feel that humanity would benefit from this generation's digital
presence being wiped from existence. Everyone on this planet is now dumber for
having experienced it. I fully support and will tell my congressman to vote
for a digital dark age.

Now get off my lawn!

