
Let's Say You Wanted to Back Up the Internet Archive - pmiller2
https://www.reddit.com/r/DataHoarder/comments/h02jl4/lets_say_you_wanted_to_back_up_the_internet/
======
peter_d_sherman
This is related to a philosophical question:

Suppose (for the sake of a philosophical discussion) that the Internet must
shrink to 10% of its storage capacity for whatever reason (Zombie Apocalypse,
War, Societal Instability, Natural Disaster, etc.).

If that's the case, then which websites and content do you keep, and which do
you throw away?

Now, if you can answer that -- then re-ask the same question, but with 10% of
that 10%... 1%... what do you keep and what do you throw away?

Now, follow this line of thought to its natural limit, that is, _given a
computer which can store only 64K on a floppy disk, and given that all other
content is going to be deleted, what content do you choose to preserve on that
64K floppy disk?_

?

One answer might be _the knowledge of how to build that computer, and that
floppy storage device_... (We'll assume that if it doesn't fit exactly, then
we could have a little bit more space, like 170K or what have you...)

This line of philosophical questioning could even be taken a step further and
that is, ignoring technology altogether, we apply this line of inquiry to
books...

So our question then looks like:

 _"What is the most important information that one, single, solitary, finite
book should contain -- given the near-infinite amount of information out
there?"_

My answer to this is that the book should contain _the knowledge of how to
produce paper, how to make books, maybe even how to create a printing press
and inks_ (if space permits).

That is, a book -- _which contains the knowledge of how to make other
books_... that is, _how to replicate itself_ (well, with human
intervention!)...

That would be my answer -- as an amateur philosopher...

Other answers might be, and probably will be, different...

But, I think it's a very interesting philosophical question...

------
pmiller2
This got posted a couple days ago but didn't get any traction (3 points, 0
comments). As well as being topical due to the Internet Archive's recent legal
troubles, there's some interesting discussion there.

~~~
badRNG
Thanks for posting this. Given the recent legal trouble, I've been wondering
what kind of history would be lost forever if the Internet Archive were
completely shut down.

------
Tagbert
Would it be possible to set up a distributed mirror of IA using something like
BitTorrent?

~~~
pmiller2
The linked Reddit thread estimates the amount of data at ~50PB. Good luck
seeding that! ;) Even if you broke it up into more manageable pieces, 50PB is
still a lot of data, and I'm sure pieces would end up getting lost.
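
For a rough sense of scale, here is a back-of-the-envelope sketch in Python.
The ~50PB figure is the thread's estimate; the 1 TB donated per volunteer and
the 3 copies of every piece are assumptions picked purely for illustration,
and the function name is hypothetical.

```python
import math


def volunteers_needed(total_pb: float, per_volunteer_tb: float, replicas: int) -> int:
    """Estimate how many volunteers are needed to hold `replicas` copies of the data.

    Uses decimal units (1 PB = 1000 TB) and ignores protocol overhead and churn.
    """
    total_tb = total_pb * 1000
    return math.ceil(total_tb * replicas / per_volunteer_tb)


# Assumed inputs: 50 PB total, 1 TB per volunteer, 3 copies of each piece.
print(volunteers_needed(50, 1, 3))
```

Under those assumptions it works out to 150,000 volunteers, which is why
losing pieces over time seems likely.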

~~~
6510
Actually, if someone hacked a client so that everyone goes after the "rare"
pieces, you could sort of back up a lot of it without anyone holding a full
copy.

Then we simply wait for 100 PB drives?
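
A minimal sketch of that idea in Python, loosely modeled on rarest-first piece
selection. This is a toy simulation, not the code of any real BitTorrent
client: the function names, the fixed per-peer capacity, and the tie-breaking
by piece index are all assumptions made for illustration.

```python
def assign_rarest(replica_counts: list[int], capacity: int) -> list[int]:
    """Let one new peer store the `capacity` pieces that currently have the
    fewest copies. Ties are broken by piece index. Updates counts in place."""
    by_rarity = sorted(range(len(replica_counts)),
                       key=lambda piece: (replica_counts[piece], piece))
    chosen = by_rarity[:capacity]
    for piece in chosen:
        replica_counts[piece] += 1
    return chosen


def simulate(num_pieces: int, num_peers: int, capacity: int) -> list[int]:
    """Return the number of copies of each piece after every peer has joined."""
    replica_counts = [0] * num_pieces
    for _ in range(num_peers):
        assign_rarest(replica_counts, capacity)
    return replica_counts


# 10 pieces, 5 peers, each holding only 2 pieces: no peer has a full copy,
# yet every piece ends up stored somewhere.
counts = simulate(num_pieces=10, num_peers=5, capacity=2)
print(counts, "all pieces covered:", min(counts) >= 1)
```

Each peer here holds a fifth of the data, but the swarm as a whole covers all
of it; adding more peers raises the copy count of the rarest pieces first.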

