

Ask HN: Are there any alternatives to the Wayback Machine? - davidxc

I've found the Wayback Machine to be very useful. However, I'm looking for a past version of a website and the Wayback Machine doesn't seem to have it.

Are there any good alternatives to the Wayback Machine that also archive a large part of the web?
======
burovector
Very good question! Web archiving as a whole is still not adequately
addressed (imho), given how much it matters for fighting link rot and
preserving online sources. I find it amusing that hardly anybody seems to
think or care about "who conserves the web for future generations".

Coming from academia, I myself primarily use The Wayback Machine. In research
literature I also encounter some use of WebCite
([http://www.webcitation.org](http://www.webcitation.org)) in references.

Anyone interested in web archiving might have a look at the introduction to
web archiving on Wikipedia, which includes an overview of web archiving
services:
[http://en.wikipedia.org/wiki/Web_archiving](http://en.wikipedia.org/wiki/Web_archiving)

If the past version of the page you are looking for is less than 3 months
old, you could also try cache:URL in Google search (or the equivalent function
in another search engine).
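
Before trying other archives, it can help to check programmatically whether
the Wayback Machine holds any snapshot near the date you want. The following
is a minimal Python sketch, assuming the Internet Archive's public
availability endpoint at `https://archive.org/wayback/available` and its JSON
response layout (`archived_snapshots` / `closest`). The function names
`build_query`, `closest_snapshot` and `lookup` are my own, chosen for
illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive's public availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(page_url, timestamp=None):
    """Return the availability-API URL for page_url.

    timestamp is an optional YYYYMMDD[hhmmss] string; when given, the API
    reports the snapshot closest to that moment.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Extract (snapshot_url, timestamp) from a JSON response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def lookup(page_url, timestamp=None):
    """Query the API over the network and return the closest snapshot."""
    with urlopen(build_query(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling `lookup("example.com", "20060101")` needs network access and returns
`None` when no snapshot is available, which is a cheap way to confirm that a
page really is missing before looking elsewhere.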

~~~
davidxc
Thanks for the response! WebCite looks pretty good. I'll try using that.

------
minopret
There's nothing quite like the Internet Archive Wayback Machine.

For sufficiently recent dates, web caches can serve a similar purpose. Some of
those can be accessed conveniently using the Firefox plugin "Resurrect Pages"
([https://addons.mozilla.org/en-us/firefox/addon/resurrect-pag...](https://addons.mozilla.org/en-us/firefox/addon/resurrect-pages/)).

You can preserve the present version of a site for yourself. No tool will work
for all sites, but some tools will work well for many sites. WebHTTrack is
good. It will record the address and date of each capture.
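
To illustrate the idea of keeping the address and date with each capture,
here is a minimal Python sketch for a single page. It is not how WebHTTrack
works internally and it does not mirror a whole site; the function names, the
file naming scheme and the JSON sidecar layout are all assumptions made for
this example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from urllib.request import urlopen


def save_capture(address, content, out_dir, captured_at=None):
    """Write the page bytes plus a JSON sidecar holding address and date.

    captured_at should be a timezone-aware datetime; it defaults to now (UTC).
    Returns the path of the saved page.
    """
    captured_at = (captured_at or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = captured_at.strftime("%Y%m%dT%H%M%SZ")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: one HTML file and one metadata file per capture.
    page_path = out / f"capture-{stamp}.html"
    page_path.write_bytes(content)

    meta = {"address": address, "captured_at": captured_at.isoformat()}
    (out / f"capture-{stamp}.json").write_text(json.dumps(meta, indent=2))
    return page_path


def capture(address, out_dir):
    """Fetch address over the network and store it with its metadata."""
    with urlopen(address, timeout=30) as resp:
        return save_capture(address, resp.read(), out_dir)
```

Running `capture("http://example.com/", "captures")` on a schedule would give
you a dated series of snapshots for that one page. Images, stylesheets and
linked pages are not fetched, which is where a dedicated mirroring tool does
much more.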

------
ig1
You could try Common Crawl?

~~~
davidxc
I looked at Common Crawl, but it doesn't seem to archive the content of most
pages. Am I misunderstanding how it works?

