

Archive.org: Only 12 days to reach the goal of $150,000 - jpswade
https://archive.org/donate/?n=hn

======
gokfar
The Wayback Machine has served me well over the years. I sent half a bitcoin.
The title should include something about the 3:1 matching; that usually makes
me much more likely to donate.

I hope they reach their goal and show pictures of what 4PB of storage looks
like.

~~~
asmosoinio
Agree on the title change. "3:1" pushed me over the edge to seek out my PayPal
password and make a donation.

------
carleverett
The donation options include the 3:1 effect of the donation towards the goal,
but I think it is misleading. Correct me if I'm wrong, but a $50 donation will
bring them $200 closer to the $600,000 goal, while only bringing them $50
closer to the $150,000 goal that they are pushing for. That might cause some
confusion.

~~~
Wingman4l7
Tomato, tomahto. They're really pushing for the $600k goal because that's how
much the 4 PB of storage that they want to buy costs. It doesn't really matter
because in the end, the percentage is the same.

------
nnnnnn
Someone went wild with that bar graph bevel.

~~~
benesch
Looks like a default Excel 2007 style to me :/

------
atesti
I'm from Germany. Can I deduct this from my taxes? How would I do it? Just
show my bank statement to the German IRS?

------
cstrat
Done - donated $50 to the cause !! =)

------
nextstep
Only 12 more days until the arbitrary date by which Archive.org wanted to
raise $150,000.

~~~
Wingman4l7
Not an arbitrary _date_ -- the supporter will stop matching the donations
after December 31st. The _goal_ could be argued to be arbitrary -- $600k for 4
more petabytes of storage.

------
bananashake
Every person that has ever been harassed by or lost a job over a web page can
thank the archive for making that permanent.

It's irresponsible and unlawful to make unauthorized archives of web pages.

~~~
cowsaysoink
It is ridiculous to me that people view public web pages as something that
shouldn't be archived; if anything, archiving provides illuminating snapshots
of the state of the web at certain dates.

The archive.org team does follow robots.txt, and I believe they remove content
retroactively, meaning that if you add a robots.txt to your site it will
delete the old content (which I think sucks).
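
As I understand it, even a blanket disallow added to a domain years after it
was crawled is enough to trigger that. Something along these lines (the file
contents here are just an illustration):

    # Hypothetical robots.txt added long after the site was archived;
    # a blanket disallow like this also hides the previously archived snapshots.
    User-agent: *
    Disallow: /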

~~~
JoshTriplett
> The archive.org team does follow robots.txt, and I believe they remove
> content retroactively, meaning that if you add a robots.txt to your site it
> will delete the old content (which I think sucks).

Indeed, especially since most domain parking garbage sites seem to have
robots.txt files for some crazy reason.

~~~
alexkus
> Indeed, especially since most domain parking garbage sites seem to have
> robots.txt files for some crazy reason.

Presumably to avoid being plagued (in terms of load and bandwidth costs) by
the numerous crawling bots looking to update their caches of pages that no
longer exist on those domains.

~~~
aw3c2
Serving 404s is super cheap, actually.

~~~
alexkus
It depends on the setup.

I've seen a CMS brought almost to its knees because the previous owner of that
IP had a site with lots of distinct pages on it. Since every page in the CMS
was stored in a DB, it took a DB lookup to find out whether the incoming URL
existed or not. Caching/Varnish wouldn't help, as there were hundreds of
thousands of different incoming URLs and none of them would ever be in the
cache because they don't exist.

About 20% of the hits to one site I look after are 404s because they're from
the previous site hosted on that IP address. Luckily the vast majority of
those URLs have a specific prefix, so it's a simple rule in the Apache config
to 404 them without having to go to disk to check for the existence of any
files. It still counts against my bandwidth utilisation too (both the incoming
request and the outgoing 404).
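
For illustration, something along these lines in the vhost config does it (the
/oldsite/ prefix is made up; the real one is specific to that previous site):

    # Made-up prefix standing in for the previous site's URL space;
    # mod_alias answers with a 404 directly, no filesystem or CMS lookup.
    RedirectMatch 404 ^/oldsite/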

