
The Internet Archive Telethon - empressplay
http://telethon.archive.org/?ref=hn
======
bane
Please consider donating. IMHO, the Internet Archive is one of the crown
jewels of the Internet. It's absolutely incredible what they've achieved, how
much information and knowledge they've made accessible and how important the
task of what they're doing is...and it's all free for anybody to enjoy,
millions upon millions of hours of content from books to magazines to movies
to music, classic radio, video games, and more than I could even guess at.
They're just asking for a little help and give so much back.

~~~
awqrre
I agree with you but I wonder why I never get archive.org results when I
search Google...

~~~
deadalus
1) Because they block it in the robots.txt file:
[https://web.archive.org/robots.txt](https://web.archive.org/robots.txt)

2) They are creating a search engine for this very purpose.

------
laarc
It should be socially acceptable for Internet Archive to ignore robots.txt.

They have to respect it because we, collectively, say so. Obeying robots.txt
is the minimum acceptable behavior for any robot, short of Asimov's laws.

But archiving is different. I've been running into "Site was not archived due
to robots.txt" more and more frequently. Often these are articles from ~2011
and earlier which the author no doubt would have wanted to be archived.

Trouble is, robots.txt is also the only thing that people really bother to set
up. Maybe there's a way right now to indicate "Sure, archive my site please,
and ignore my robots.txt." But if there is, it's not really common knowledge,
and it's kind of unreasonable to expect every single website on the internet
to opt in to that.

On the flip side, it seems entirely reasonable that someone who really wants to
opt _out_ of archiving should explicitly go and tell Internet Archive.
Circa 2016, Internet Archive is the only archive site that seems likely to
persist to 2116. It's a shared time capsule, a ship that we all get a free
ticket to board. If someone wants off, they can say so.

But right now, large swaths of the internet simply aren't being archived due
to rules that don't entirely seem to make sense. There are excellent reasons
for robots.txt, but opting out of "Make this content available to my children's
children's children's children" seems perhaps beyond the scope of the original
spec.

Would you feel ok with the Archive ignoring your robots.txt, or would you feel
annoyed? If annoyed, then this is a bad idea and should be rejected.

But if nobody really cares, then here's a proposal: Internet Archive stops
checking /robots.txt, and checks for /archive.txt instead. If archive.txt
exists, then it's parsed and obeyed as if it were a robots.txt file.

That way, every site can easily opt out. But everyone opts in by default.
Sites can also exercise control over which portions they want archived, and
how often.
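
The archive.txt rule above can be sketched in a few lines. This is only an illustration of the proposal, not anything the Internet Archive has implemented: the function name `may_archive` and the user agent `archive-bot` are made up for the example, and the caller is assumed to have already requested /archive.txt and to pass `None` when the file does not exist. Parsing reuses Python's standard `urllib.robotparser`, since the proposal says the file is read as if it were a robots.txt.

```python
from urllib.robotparser import RobotFileParser


def may_archive(archive_txt, url, agent="archive-bot"):
    """Decide whether `url` may be archived under the proposed scheme.

    archive_txt: body of the site's /archive.txt, or None if the site
                 has no such file (hypothetical convention for this sketch).
    agent:       hypothetical crawler name, not a real user agent.
    """
    if archive_txt is None:
        # No archive.txt: the site is opted in by default.
        return True
    # Otherwise parse the file with ordinary robots.txt semantics.
    parser = RobotFileParser()
    parser.parse(archive_txt.splitlines())
    return parser.can_fetch(agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
print(may_archive(None, "http://example.com/blog/post"))    # no file at all
print(may_archive(rules, "http://example.com/blog/post"))   # path not listed
print(may_archive(rules, "http://example.com/private/x"))   # path excluded
```

Note that /robots.txt is never consulted here, which is the whole point of the proposal: a site's search-engine rules and its archiving rules become independent.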

~~~
frik
If example.com allowed indexing in 1999, a new owner of example.com can
hide/delete the 1999-2015 content by changing the robots.txt in 2015.

It would be better if archive.org would adhere to the robots.txt of the
requested date/year (show content of example.com from 1999-2014).
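
A rough sketch of that idea, assuming the archive keeps a dated history of each site's robots.txt (the `history` list, the function names, and the `archive-bot` agent are all hypothetical): a capture is judged against the rules that were in force on the capture date, not against today's file.

```python
from bisect import bisect_right
from datetime import date
from urllib.robotparser import RobotFileParser


def rules_in_effect(history, when):
    """Return the robots.txt body in force on `when`, or None.

    history: list of (date, robots_txt_body) pairs sorted by date;
             a hypothetical record of every robots.txt version seen.
    """
    dates = [d for d, _ in history]
    i = bisect_right(dates, when)  # number of versions dated on or before `when`
    return history[i - 1][1] if i else None


def capture_visible(history, capture_date, url, agent="archive-bot"):
    """Show a capture only if the robots.txt of its own date allowed it."""
    body = rules_in_effect(history, capture_date)
    if body is None:
        return True  # no robots.txt existed yet at that date
    parser = RobotFileParser()
    parser.parse(body.splitlines())
    return parser.can_fetch(agent, url)


history = [
    (date(1999, 1, 1), "User-agent: *\nDisallow:\n"),    # original owner: allow all
    (date(2015, 1, 1), "User-agent: *\nDisallow: /\n"),  # new owner: block all
]
print(capture_visible(history, date(2005, 6, 1), "http://example.com/page"))
print(capture_visible(history, date(2015, 6, 1), "http://example.com/page"))
```

With this rule, the 2015 change hides only captures made from 2015 onward; the 1999-2014 captures stay visible.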

~~~
mapt
The fact that all popular URLs which fall out of registration are now picked
up by squatter-spambots is also troubling. An Archive.org entry should not
cease to exist when the registration lapses if the squatter-spambots decide to
robots.txt everything. That would defeat its purpose completely.

------
Roger_Archive
I'm honored to work at the Internet Archive and hope you will consider
chipping in to help us continue to champion the import of, and make progress
towards, our mission: universal access to all knowledge. If you can't part
with a little $$ now, how about uploading some digital media you think are
important? The Archive will endeavor to make them available to everyone,
forever, for free.

~~~
niutech
Thank you for your precious work! I will donate to support the Internet
Archive, but when will it have a proper Web Archive search engine? It will be
a huge thing!

------
clamprecht
They accept bitcoin, without any hassles or having to provide any information
about yourself. Just a simple QR code for the bitcoin address:

[http://archive.org/donate/bitcoin.php](http://archive.org/donate/bitcoin.php)

If you already have a wallet, it may be the easiest way to donate.

------
zubspace
My Firefox bookmark manager contains about 1300 links. For quite a while I've
been really unhappy with this solution. Link rot is really bad and it's sad
when even the Internet Archive does not contain the lost site.

The only solution to this problem I found is storing the pages locally. I'm
now in the process of importing everything into OneNote (the OneNote clipper is
a huge help). A big plus is that the content is indexed and fully searchable.

I probably would not do this if the Internet Archive were more reliable. I'm ok
with this solution, but it's a bit strange that Firefox/Chrome/IE haven't made
this process of storing sites locally easier.

~~~
DanBC
Google don't want you to use bookmarks, which is why Google Chrome bookmarking
sucks.

There's definitely a niche in the market for a bookmarking site that does some
form of bulk importing of your bookmarks; has really easy organisation (drag
and drop, and an API for power users); has some kind of thumbnail; has some
kind of link to Internet Archive (to read old sites); and maybe a link to IA
to store sites. The internet archive stuff could be a paid option with some
money going to IA to help fund them.

~~~
syaz1
> Google don't want you to use bookmarks, which is why Google Chrome
> bookmarking sucks.

Why?

~~~
DanBC
Google wants you to search, so they can show you ads in the search results.

~~~
cooper12
Same thing for the history. Chrome's history forces you to view it page by
page just to find something from a while back while Firefox will gladly let
you view the history per month. It also makes removing items a huge pain
because you have to individually check each item. (I'm aware of the "clear
recent history" feature, but it's not what I'm looking for.)

------
aunty_helen
I donated due to the number of times I've seen Wayback Machine links come
up in discussions on HN.

A great resource that we can't afford to be without.

------
syaz1
I am interested in learning more about their system architecture. Anyone know
if any such writeup is available? How do they scale and what amount of data
are they dealing with daily? And what about disaster recovery, considering they
archive history?

------
shmerl
Supported. They are an invaluable resource.

------
dieg0
It has been awesome, thank you so much Internet Archive for your hard work and
dedication to openness and knowledge.

This is a must-watch, incredibly entertaining =)

------
ctstover
best telethon ever

------
exo762
IPFS is an attempt to fix link rot by addressing content by its contents, not
its location. Very cool project.
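
To make "addressing content by its contents" concrete, here is a toy in-memory store where the key of a blob is simply the SHA-256 hex digest of its bytes. This is a simplification for illustration only, not IPFS's actual identifier format or API, and the `ContentStore` class is hypothetical.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the address is derived from the bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address depends only on the content, never on where it is hosted.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # Anyone holding the address can verify what they were given.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
address = store.put(b"an old blog post")
print(address == store.put(b"an old blog post"))  # same bytes, same address
print(store.get(address))
```

Because the address names the bytes, any host that still has a copy can serve it, so a link does not break just because the original server went away.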

------
zorpner
Any other HNs here?

