
I worked at the Archive for a few years remotely. It permanently altered my view of the tech world. Here are the notable differences. I think these would apply to several non-profits, but this is my experience.

1. There was no rush to pick the latest technologies. Tried and tested was much better than new and shiny. Archive.org was mostly old PHP and shell scripts (at least the parts I worked on).

2. The software was just a necessity. The data was what was valuable. Archive.org itself had tons of kluges and several crude bits of code to keep it going, but the aim was to keep the data secure and it did that. Someone (maybe Brewster himself) likened it to a ship traveling through time. Several repairs with limited resources have permanently scarred the ship but the cargo is safe and pristine. When it finally arrives, the ship itself will be dismantled or might just crumble but the cargo will be there for the future.

3. Everything was super simple. Some of the techniques to run things etc. were absurdly simple and purposely so to help keep the thing manageable. Storage formats were straightforward so that even if a hard disk from the archive were found in a landfill a century from now, the contents would be usable (unlike if it were some kind of complex filesystem across multiple disks).

4. Brewster, and consequently the crew, were all dedicated to protecting the user. e.g. https://blog.archive.org/2011/01/04/brewster-kahle-receives-.... There was code and stuff in place to not even accidentally collect data so that even if everything was confiscated, the user identities would be safe.

5. There was a mission. A serious social mission. Not just "make money" or "build cool stuff" or anything. There was a buzz that made you feel like you were playing your role in mankind's intellectual history. That's an amazing feeling that I've never been able to replicate.

Archive.org is truly one of the most underappreciated corners of the world wide web. Gives me faith in the positive potential of the internet.




> There was a mission. A serious social mission. Not just "make money" or "build cool stuff" or anything. There was a buzz that made you feel like you were playing your role in mankind's intellectual history. That's an amazing feeling that I've never been able to replicate.

This resonates with me. Sometimes we developers need to get off the "move fast and break stuff" bandwagon (which has been ongoing for over a decade now), and consider that we're the ones responsible for preserving almost all human digital heritage of our epoch. There's a simple and obvious method to implement preservation-friendly content implicit in the web architecture: emit/materialize everything as plain HTML, even dynamic content. This is of course antithetical to most of this decade's SPA web development trends, but I think it's worth drawing a line between web content (worth preserving in the first place) and web apps (which have highly volatile content not worth preserving). I feel like this distinction isn't considered sufficiently in our staged web app architecture discussions, which are all about your latest JS MVw framework, to the degree that newbie web devs really don't learn the fundamentals of HTML etc. anymore, and are led to use React, Vue, etc. for content-based web sites.
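
To make "materialize everything as plain HTML" concrete, here is a rough sketch, not any particular site's setup. It assumes the dynamic content is already available as simple records (a title plus paragraphs); the names `render_page` and `materialize` are made up for illustration. The point is only that the output is a self-contained, script-free HTML file that a crawler can archive as-is.

```python
import html
from pathlib import Path


def render_page(title, paragraphs):
    """Return a complete, script-free HTML document as a string."""
    # Escape all text so the content cannot break the markup.
    body = "\n".join(f"    <p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "  <head>\n"
        '    <meta charset="utf-8">\n'
        f"    <title>{html.escape(title)}</title>\n"
        "  </head>\n"
        "  <body>\n"
        f"    <h1>{html.escape(title)}</h1>\n"
        f"{body}\n"
        "  </body>\n"
        "</html>\n"
    )


def materialize(records, out_dir):
    """Write one static .html file per record and return the paths.

    `records` maps a URL slug to a (title, paragraphs) tuple; this shape
    is an assumption made for the example.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, (title, paragraphs) in records.items():
        path = out / f"{slug}.html"
        path.write_text(render_page(title, paragraphs), encoding="utf-8")
        written.append(path)
    return written
```

Calling `materialize({"hello": ("Hello", ["First paragraph."])}, "public")` would write `public/hello.html`, which any plain web server can serve and which needs no JavaScript to read.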


Thank you for sharing this. I’ve donated (small amounts of) money to Mozilla and Wikipedia in the past. Your post makes me consider donating to archive.org this year.

Edit: typo


How does archive.org make money? I imagine their storage costs must be quite high.


They don’t. They are a non-profit 501(c)(3) charity that relies on donations.


I think what the parent meant is "how does archive.org pay their bills?".


I thought I was answering that. Where did I go wrong?


Are you saying they don't pay their bills?


I'm saying they pay their bills (utilities, hardware costs, salaries) with donations, US dollars obtained from those donating.

> How does archive.org make money?

Donations (https://projects.propublica.org/nonprofits/organizations/943...)

> I imagine their storage costs must be quite high.

No, they aren't. Building and hosting your own storage is cheap. Same reason Backblaze and Dropbox built their own storage systems.


> Building and hosting your own storage is cheap

Archive.org uses S3 extensively. Not exactly cheap.


Can you provide a citation? To my knowledge, the Archive does not use Amazon's S3 storage system. They run only on their own internal storage system [2], which they refer to in places as "S3" [1].

[1] https://archive.org/help/abouts3.txt

[2] https://archive.org/web/petabox.php


To the best of my knowledge, the Archive has its own machines to store data. It is an archive, and one of its principles was to have the know-how to preserve data even if the cloud providers disappear.


