For more on increasing storage costs and the under-funded state of web archiving in general, I recommend David Rosenthal's blog, e.g.:
Far more effective and robust than hoping the archive will "suck it up for us" is to upload snapshots/dumps/exports yourself! Anybody can create an archive.org account and upload content, within reasonable limits (I recommend https://github.com/jjjake/internetarchive over the HTML form). Obviously, care needs to be taken to remove sensitive (and personal) information first.
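For anyone who wants to script this, here is a rough sketch using the `upload` function from the `internetarchive` Python package linked above. The identifier naming scheme, the helper names, and the choice of `mediatype` "data" are my own assumptions for illustration, not anything archive.org prescribes. It also assumes you have already stored credentials with `ia configure`.

```python
import re
from datetime import date


def make_identifier(site, snapshot_date):
    """Build an item identifier such as 'example-org-dump-2020-01-31'.

    The '<site>-dump-<date>' scheme is a hypothetical convention.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", site.lower()).strip("-")
    return f"{slug}-dump-{snapshot_date.isoformat()}"


def build_metadata(site, snapshot_date, description):
    """Assemble basic item metadata; the field values are illustrative."""
    return {
        "title": f"{site} export, {snapshot_date.isoformat()}",
        "mediatype": "data",  # assumption: a generic data dump
        "date": snapshot_date.isoformat(),
        "description": description,
    }


def upload_dump(path, site, snapshot_date, description):
    """Upload one already-scrubbed dump file as a new item."""
    # Imported lazily so the helpers above work without the package installed.
    from internetarchive import upload

    return upload(
        make_identifier(site, snapshot_date),
        files=[path],
        metadata=build_metadata(site, snapshot_date, description),
    )
```

Calling something like `upload_dump("export.tar.gz", "Example.org", date(2020, 1, 31), "Full site export")` would then push the file. Scrub sensitive data before you get to that step, since the helper uploads whatever you hand it.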
How feasible would a reliable, distributed archive be, given the massive amount of data Archive.org holds? After all, it was created precisely because the already-decentralized web was too ephemeral and unstable. I don't think decentralization is a panacea in this case.
Some problems are best solved by institutions.
The way to solve this is to provide the Internet Archive with enough resources to build out a globally distributed storage system. Could you hack something together using their torrent tracker for every item served? Yes. But you don't hack together something made to preserve digital human culture in perpetuity.
> The way to solve this is to provide the Internet Archive with enough resources to build out a globally distributed storage system.
Yeah, I agree. I do see a space for other appropriate institutions (such as the Library of Congress, British Library, etc.) to pool resources and facilities with the Internet Archive to achieve that goal.
Ultimately, it'd be awesome to see each national library run a semi-autonomous IA copy that synchronizes with all the others, but can continue to operate independently (scrapers and all), if need be.
The University of California is a part of the government of the State of California established in the State Constitution, whose governing body is comprised of 18 members appointed by the Governor and confirmed by the Senate, plus seven ex-officio members, three of whom are State elected Constitutional officers (Governor, Lt. Governor, and Superintendent of Public Instruction) and one of whom is the Speaker of the Assembly.
(That said, it is unusual and potentially misleading to refer to UC as “California”, but not because UC is actually separate from the government of the State.)
Though I've heard credible complaints from "copyright" holders vs archive.org.
But both data privacy and copyright try to create ownership of information, and they must do so through intrusive legal measures because the physical nature of information works against it.
The project aims to demonstrate how members of a cooperative, decentralized network can leverage shared services to ensure data preservation while reducing storage costs and increasing replication counts.
That is not true.
That is the latest Taylor Swift full album, streamable from archive.org.
Items don't lose their copyright status by being uploaded or stored in the Internet Archive.
Further, that's the opposite of the entire purpose of Archive.org.
Where are you getting this information?
That's 100% not true. Please stop spreading misinformation because you have no idea what you are talking about.
"According to a landmark 1999 federal district court ruling, The Bridgeman Art Library, Ltd., Plaintiff v. Corel Corporation, 'exact reproductions of public domain artworks are not protected by copyright.'"
>"The internet wayback machine is one of the biggest copyright thiefs in my opinion."
Anyways, copyright is weird and complicated and there are many people who don't really understand it. But your theory that archive.org is going to get a copyright on everything they scan and store is not rooted in any sort of legal reality.
If the photo meets the requirement for independent copyrightability, which requires being a distinct creative work, that's true, and for a photo that would often but not always be the case.
For a simple non-transformative digital copy, there's no copyrightable new work and no copyright in such a work.
> What if, in time, the only copy you could get hold of was my copy. My copy would be copyright me and you’d need to wait for it to become public domain before you could use it under your terms
If someone were copying elements that were original to your photograph, true, but if they were merely copying the public domain text of which you had taken a photograph, no, not true at all. Copyright protects your original work.
> For example, if a book published in 1995 is a reprint of a book published in 1900, then it is eligible. However, the onus is on us to prove that it is a reprint, and if it doesn't say on the TP&V that it is a reprint, confirming its eligibility may be impractical.
There's a growing push to legitimize copyright claims for "instances" of a work, even after the base work has entered the public domain.
If this "sounds ridiculous" to you, just recall how most of us are increasingly worried that, in the U.S., Congress and the Executive are going to... "re-Mickey" the copyright term. As in, they already pushed it to life plus 70 years when certain Disney copyrights were about to expire. (And keep using "trade agreements" as one mechanism to try to "back door" increases to "plus 70" into other countries' IP terms.)
A separate concern I have is that archive.org currently continues to "retroactively respect" robots.txt changes.
404 your once public content, and archive.org "disappears" it from their corresponding records/copies.
As long as that's true, you can't really view it as a permanent, unbiased archive.
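To make the robots.txt point concrete, here is a small sketch using Python's standard `urllib.robotparser`. It only shows how the same URL can go from allowed to disallowed when a site's robots.txt changes. The two robots.txt bodies, the URL, and the `ia_archiver` user-agent string are assumptions for illustration, and it says nothing about how archive.org actually applies the rules on its side.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, agent="ia_archiver"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt as it looked when the page was first crawled.
ROBOTS_THEN = "User-agent: *\nDisallow:\n"

# Hypothetical robots.txt after a new owner takes over the domain.
ROBOTS_NOW = "User-agent: ia_archiver\nDisallow: /\n"

URL = "https://example.org/essay.html"

if __name__ == "__main__":
    print("then:", allowed(ROBOTS_THEN, URL))
    print("now: ", allowed(ROBOTS_NOW, URL))
```

If the current answer is applied to old captures as well, the page that was fetched legitimately under the earlier rules drops out of view. That is the behaviour I'm objecting to.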
As politicians, commercial interests, and their lawyers continue to have a field day constraining "online rights" (and IP rights, etc.), currently the only "guarantee" the public has of continued access and a more complete historical record is, ironically, local copies.
They say, "History is written by the winners."
Well, unless they can't find the copy you have squirreled away.