Quite coincidentally, I was just reading an interview with Brewster Kahle in New Scientist (23 November 2002), from back when the Wayback Machine had only 100 terabytes archived.
He said: "I guarantee that in the future researchers will curse us for having missed something absolutely critical. But only people using the archive can tell us about mistakes in what we collect. There is a cheaper alternative concept, called 'dark archiving', which means that we should not give people access to them. But preservation without access is dangerous - there's no way of reviewing what's in there."
Later on, he added: "AltaVista was the first Internet search engine that tried to be a complete index of all the pages. But what really got me was that they threw away the original pages. That grated, no end."
Aside: Kahle was one of the founders, with Danny Hillis, of Thinking Machines - the company that created the fabulous 'Connection Machine'.