
Why there’s so little left of the early internet - seagullz
http://www.bbc.com/future/story/20190401-why-theres-so-little-left-of-the-early-internet
======
Waterluvian
I'm not disagreeing with any of this. But I think there's additional
perspective to be had.

I see a lot of conversation about early Internet and the health of the
Internet. I think some of this is just because those of us who remember it,
are starting to ache over it a bit for all kinds of reasons, one of which is
because that's what old people do.

Today I got lunch with my 17 year old nephew. I asked him about what he does
on the Internet and if he misses the "old Internet". First off, he has no
context of what the old Internet is. He doesn't really care nor seem to feel
like he's missing out. Second, he showed me what he mainly does. Basically it
was a handful of group chats via Signal and WhatsApp, a whole bunch of web
comic websites, and some YouTube series.

I felt like a dad of a teenager for a fleeting moment and my generalized
conclusion was that he's consuming a ton of independently developed creative
content. I was actually pretty proud of what the Internet looks like for him.

I asked him about curation and the ability to hold on to his favourites. How
does he ensure that something he loves isn't gone in 10 years? This concept
really didn't seem to matter to him. His perspective seemed to be that there's
so many webcomics and the "memes" and conversations he has with his friends
are ever evolving, there's not really a lot of time to care about the old
stuff. Made me wonder if attachment to nostalgia may be fleeting as well.

Is this the full story? Of course not. But it made me feel a bit better about
things and a bit more skeptical of the doomsaying I often hear and sometimes
take part in.

~~~
the_af
Not caring about the future is something a 17 year old would do. Not worrying
about whether the stuff he loves _now_ will endure _tomorrow_ is typical of
that age. He simply hasn't lived long enough and matured enough to feel
nostalgic about stuff. I promise you he will miss things when he's 30 or 40.

I think this reflects more on teenagers and grownups than on the internet :)

> _His perspective seemed to be that there's so many webcomics and the
> "memes" and conversations he has with his friends are ever evolving, there's
> not really a lot of time to care about the old stuff._

He's missing out on old stuff like Calvin & Hobbes, to name just one example.

~~~
CalRobert
I cared about the future a lot when I was a teenager/early 20's, and was a bit
of a digital hoarder. When I was a freshman in college a million years ago
(2001) it was common to meet people and trade binders of burned Divx's. I had
hundreds, because I might want to watch them someday. My Mp3 collection, tiny
by some standards, was still tens or hundreds of gb - and this in an age when
a 20GB hard drive was a decent size. I saved every photo I took and curated
the folders of them.

Then three things happened 1) Almost everything became easily available (or
reasonably easily) 2) Of the things that aren't easily available (personal
photos), I realized I never look at it, with very little exception. 3) I
realized I am going to die someday, and at the rate I consume media I will
_never_ get to even a sizeable fraction of it.

So I try to print the pictures I care about, I have a folder of just a few
photos and videos I really care about (mostly my kid), and that's about it
really.

Media used to be somewhat scarce, but now basically everything is available
everywhere all the time (rights issues notwithstanding). We take a thousand
photos on a vacation, not a few rolls of film (72-108 images). It's
overwhelming.

~~~
steelframe
Older dude checking in. I threw out my Divx collection about 10 years ago. I
have precious little time to spend watching 2-hour films, and I recognize that
the number of films I will watch before I die is on the order of maybe 300.
When I do get a chance to watch a film, I want the experience to be as great
as it possibly can be. I want as high a resolution as I can get with the best
color. Compared to 4k HDR stuff available today, the Divx stuff looks like
total shit. I have grown to despise my fellow human beings in groups larger
than about 7, so I've built a home theater where I can watch a movie without
having to deal with the bullshit that is the general public. I'll invite 4 or
5 of my friends over on occasion. Specifically, the ones who respect my rules
of watching a movie. No talking, no phones, that sort of thing. I'll buy a
Blu-ray disc brand new so I don't have to worry about random scratches or
whatever from previous renters. I don't stream because I no longer tolerate
any of the bullshit that happens half the time, like random degradation from
the service resulting in pixelation or pauses.

For music, I just subscribe to a streaming music service where I can download
("pin") stuff to my phone. Mainly because it's ridiculously cheap and
convenient. I don't care about "owning" music, like it could somehow forever
"go away" and I'd be sitting in the corner of my basement crying about how I
can't listen to such-and-such a song any more. But honestly I'm more likely to
listen to audiobooks these days, which I do immediately strip the DRM from and
store separately on my phone.

When I go on vacation, I take a pathologically small number of photos, and I
only take them on my phone. I can't be bothered with lugging around a huge
lens and camera body everywhere I go when I'm trying to experience some place
new. Whenever I show the photos to people, they often say, "Wow! What camera
did you take these with??" And I say, "A used Pixel 2."

~~~
JohnBooty

        and I recognize that the number of films I will watch before I die is on the order of maybe 300
    

This kind of thing seems morbid, but as I tiptoe into my 40s I find it
hilarious. Got my home's roof redone lately. Allegedly has a 50 year warranty.
Mentally added 50 to my current age and was like... well, yep, that oughta do
it. hahaha.

~~~
CalRobert
A bit morbid but also incredibly important.

When my colleagues say "I worked on this over the weekend" I get pretty upset
because

1) Dude you're in your 40's. If you're lucky you've got ~2100 lazy Sunday
afternoons left. Probably a third to half of those in decent health. That's
not very many. How many of those do you want to spend doing stupid shit (aka
most jobs) for somebody else for free?

2) Don't normalize that behaviour so I have to do it too.

------
blfr
_if I ever needed to show proof of my time there it would only be a Google
search away_

On that topic, has anyone discovered why Google deep-six'd Usenet archives it
acquired with Deja News?

You used to be able to find specific posts from specific posters with by: and
other operators. Sometime in the aughts it degraded quickly to the point where
I can't find threads from which I have explicit excerpts and full author
names.

Does someone high up in Google have an embarrassing Usenet history? Did it
just fall into disrepair?

~~~
sverige
> Does someone high up in Google have an embarrassing usenet history?

I don't know if that's true (it probably is), but personally I am glad that
almost all of my old Usenet posts have vanished. I was horrified when Deja
News started up. That was the moment I realized the internet is forever and
decided to never use my real name or to upload any pictures of myself to
anything connected to the internet.

Of course, I'm still screwed because I use a smart phone and probably several
entities have that data and could connect the dots, but the average person I
encounter can find out very little about me with just my name.

~~~
bcaa7f3a8bbc
This is _the_ problem.

I've mentioned on multiple occasions that the current post-Snowden security
and privacy movement is creating a serious threat to the preservation of
Internet history and, to some extent, to our understanding of human
civilization in the digital age.

My personal interest is Internet culture and communities. And I'm not amused
by this comment, let me talk about the problem briefly.

Online communication from 1970 to 1995 was almost completely public, archived
indefinitely. You can still read every single comment by every hacker from the
late 80s in Usenet archives, sometimes even back to the ARPAnet era. There are
a million posts to read and no spam or low-effort posting at all (by modern
standards, even many flame wars seem to be high-quality). You can easily lose
days, months or even years in the Usenet archive.

Records like these are often the only remaining records of the online
communities, a snapshot of great historical and cultural value. _To me, even
the controversial political flamewars are interesting as they reveal parts of
the history I would not know otherwise (I guess if someone in 2055 rereads
today's Reddit threads about Donald Trump, he/she may have a similar
feeling)_.

On the other hand, you also have names, addresses, and even phone numbers of
almost everyone who posted on Usenet. It was not a big problem when access to
Usenet/Internet was exclusive to members of academia, and at a time when
there was almost no systematic, organized abuse of personal information.
But today's different, we have big and little brothers who have "Collect Them
All" as their slogan, and they are actively trying to exploit the information
available to the maximum extents.

What is the response then? People (at least many in the hacking community)
start to prefer private, semi-private, or in-group communities over public
communication, often protected by cryptography. Some people also actively
erase/purge their footprints: for example, some delete every single
post when they leave a community, no matter how insightful the posts are; others
may even deliberately insert misleading or false information. And we have
something roughly similar to Vernor Vinge's True Names (which describes an
underground hacking community in cyberspace). Good, now personal privacy
and security are more or less protected by a cryptographic barrier.

But what is it doing then? We are now creating an unprecedented, HUGE GAP of
information in history. Within our lifetime, we are entering a new
digital Dark Age that no one has seen before.

Centralized and/or proprietary services often delete information when they go
out-of-service, too, so we need to archive them, desperately. You can't
imagine how many resources/memories that are extremely valuable to members of
some communities exist solely on a single web server/service provider. I
remember reading a post on Schneier's blog that said a website containing
numerous posts on wine culture was gone forever when the hard drive failed,
and one commenter said that he uses w3m/lynx CLI browsers and records
everything he reads to his hard drive so he will never lose a single piece of
information he has seen.

Is it an act of little brother surveillance? Arguably, it can be seen as one.
But is it justified? I would say yes, and even say we need more people doing
this, systematically. Naturally, archive.org was born in this way.

But then it faces the same issue. On one hand, much archived information can
be abused; on the other hand, the more archive-refusing people we have, the
more damage is done to historical records.

I don't know how to solve this problem.

The only way I can think of is: (1) Cypherpunks were correct. Anonymity is
crucial in the information age, and we should have more of it: never use your
True Name or reveal personal information unless absolutely necessary, use an
anonymous network (e.g. Tor) if possible, discard identities periodically (but
if you delete posts, it still reverts to the original problem...), (2)
Encourage further development and application of anonymity, and (3)
Train people to assume every piece of information published publicly cannot
be removed and may be abused, and that they should be able to withstand all
possible consequences of it. But now, "life" and "information" simply become
inseparable; if you are active in a community you have to post something...

I don't know the solution.

\---

Appendix 1: what you can see in a Usenet archive.

You can see people's reactions to the rumors that Apple would release new
68000-based machines, how Larry Wall was releasing patch-2.0, the debates
about the audio fidelity of vacuum tube/transistor amplifier, how /bin/sh on
System V was having a problem with "CFLAGS=-g make", the first hand
perspective on the impact of Great Renaming, Richard Stallman announcing the
GNU project and Linus's flamewar on microkernel with Tanenbaum, early Sci-Fi
fandom culture posted from a 4.3-BSD machine (beta version!) and how it
influenced the hacker culture, how the anti-spam movement gained momentum due
to its intellectual challenges, raw discussions of fringe political movements
(some interesting ones related to tech include the Cypherpunk movement, and
the Extropians, an early sect of transhumanism), FAQs on almost every subject
written by the active participants of the community who have probably spent
hundreds of hours, and some weird "emergent" memes/phenomena created out of
nothing by the collective community. You can also have a laugh at thousands of
forgotten Internet memes, like alt.religion.kibology and the Usenet Oracle.

The only downside is: no external resources are accessible, and nobody is going
to reply to you. It feels like Otomo Katsuhiro's animation _Memories_, where the
protagonist is trapped inside a 3D hologram simulation of the past, created
by the supercomputer in the abandoned space station.

~~~
Espressosaurus
Most communication throughout history has been ephemeral, and lost as soon as
the people relevant died without relaying it to someone else.

Consider the 1800s, where much of our understanding of the attitudes of the
day comes from newspapers and archived letters. Then consider how many more
were discarded once they had served their purpose.

Today it's possible to archive all of that ephemeral information, but it has
never been necessary.

Usenet, for example, was thought to be ephemeral because at best you had a few
months worth of posts archived on your server and maybe your local machine, so
if you said something boneheaded, it was going to naturally fall off the
internet sooner rather than later. As it turns out, that was an incorrect view
of the world.

Most forums are treated by their users the same way; a place for people to
meet and talk about things in quasi-realtime, but not to archive those
discussions for all time. Of course, as it turns out, those discussions _are_
archived for all time, or until the forum closes or has a catastrophic data
loss (e.g. this one we're on now).

~~~
JohnBooty

        Consider the 1800s, where much of our understanding of the attitudes of the day comes from newspapers and archived letters. Then consider how many more were discarded once they had served their purpose.
    

Don't you think the world would be a somewhat better place if we had more
records, and therefore more understanding, of what people thought in these
times?

Also consider the class implications here. Letter-writing (and archiving) was
something more often practiced by folks who were "elite" in terms of wealth
and/or education. The thoughts and opinions of these classes have
disproportionate representation in our understanding, and those of people in
lower classes are underrepresented or erased entirely.

------
gambler
The web has nothing built-in for archiving and versioning. It's a gaping hole
in this technological platform, one that has been noted and criticized for a
very long time. The reality of this problem, however, is vehemently denied by
the current generation of "technologists". Of course, those are the same
people who get six-digit salaries for managing complexity they themselves
create - partly through hyper-centralization. Good and resilient archival, on
the other hand, necessarily implies some level of decentralization.

I singlehandedly maintain a 14-year-old website that used to be a modestly
popular web magazine. It's not very expensive, but it's a pain. The DNS system
is horrible and it's easy to lose domain names to some nonsense. (I lost one
that used to be a free 2nd level domain when it was converted to a paid-for
zone. Not a matter of money, just paperwork.) Server management is a time
drain. Stuff like adding an SSL certificate to a legacy VPS can lead to a
cascade of updates and config changes that can take days to make and test.

BTW, everyone sings praises to archive.org (and it's well-deserved), but most
people here do not seem to realize that they are also a centralized platform
that can collapse and take everything down with them. Who archives the
archives, etc. Fortunately, they are not the only one of the sort.
Unfortunately, it's all very ad-hoc.

If W3C weren't a bunch of corporate shills, there would be a standard for
creating versioned web archives, like, 10 years ago. It's obvious that we need
one.

~~~
bo1024
Interesting post. So are you saying that if there was a good standard for
versioned web archives, then you could stop maintaining your website and just
point people to the archives?

~~~
nikisweeting
Yup, that's the idea behind projects like
[https://github.com/HelloZeroNet/ZeroNet](https://github.com/HelloZeroNet/ZeroNet)
and [https://github.com/oduwsdl/ipwb](https://github.com/oduwsdl/ipwb).

------
ChrisSD
I looked into this a while ago and came to a similar conclusion about the web.
Here are the examples of pre-1996 websites I know of (some are recreations):

* The famous first web page: [http://www.w3.org/History/19921103-hypertext/hypertext/WWW/T...](http://www.w3.org/History/19921103-hypertext/hypertext/WWW/TheProject.html)

* The Global Network Navigator: [http://oreilly.com/gnn/gnnhome.html](http://oreilly.com/gnn/gnnhome.html)

* Trincoll Journal: [https://web.archive.org/web/20010413164311/http://www.trinco...](https://web.archive.org/web/20010413164311/http://www.trincoll.edu/zines/tj/tj12.02.93/tjcontents.html)

* BBC Networking Club: [https://archive.org/details/bbcnc.org.uk-19950301](https://archive.org/details/bbcnc.org.uk-19950301)

Anything pre 1994 is very hard to find.

~~~
randobrando1285
Can't forget spacejam:
[https://spacejam.com/archive/spacejam/movie/jam.htm](https://spacejam.com/archive/spacejam/movie/jam.htm)

~~~
ChrisSD
Ha, that's great! Although it is 1996, which is much better archived than
earlier years.

------
schwartzworld
Websites got a lot less fun at some point too. A lot of the designs seem dated
now, but I really miss the days when websites had brightly colored text over
loud backgrounds. Obviously modern sites are more usable, but I want <blink>
tags back.

~~~
fma
I think back in the day people created websites really for fun and could do
whatever they wanted. Now you are judged against the best websites.

Probably the only remnants of the old web are professors who publish academic
websites and are based on content rather than design. I do remember when I was
about to graduate there was talk of standardizing all the professor websites.
I would agree all websites would need accessibility, mobility etc but like you
I miss the old web where you could just explore.

Having said that, the new Captain Marvel website is awesome. I don't think the
page counter is real though :(
[https://www.marvel.com/captainmarvel/](https://www.marvel.com/captainmarvel/)

~~~
EamonnMR
Nothing makes me trust a site more than a ~ in the url somewhere.

------
EamonnMR
Whenever we get into web nostalgia I've got to post
[http://wiby.me](http://wiby.me) which is a search engine for live retro
websites, a mix of modern sites with very little except text and immaculate
sites from the early web that time hasn't changed.

------
iamgopal
Am I the only one who is seeing this sudden surge of bbc.com linked articles ?
Not that I'm complaining.

When people started making pages for search engines instead of human
consumption, the Internet died. "Light on information and heavy on keywords and
linking" sites and pages flooded the internet, and finding the actual content
among them needs another level of skill.

I think the solution to this is an AI-based open source browser extension which
finds relevant pages in search engine results depending upon a usefulness
index (or just shows the usefulness index alongside each result, in short
sabotaging the Google search results ;) ). Is there anything similar available?

~~~
blfr
It's an Eternal September problem, not a search engine problem. Whenever I use
search like a proper normie to buy something, check song lyrics, or weather,
the results are excellent.

~~~
Sharlin
On that note, lately it seems Google results are more and more about what’s
”popular” in some sense instead of what’s relevant, especially when your
search term has multiple meanings. Two recent examples of things I’ve run into
are ”cloud” (pages and pages of cloud computing results, no mention about the
meteorological phenomenon) and ”thrush” (nothing about the songbird family,
just dozens of results about the yeast infection). Whether I’m logged in or
use private browsing doesn’t make much difference. Not sure what has changed,
but I’m used to Google giving a more representative sample of different
meanings of a search term.

~~~
hedora
Duck Duck Go has a similar problem for cloud and thrush, but if you click the
“meanings” tab, it lets you switch results over to what you meant, and then
gives decent results.

More broadly, DDG implements UI features that allow users to clarify their
intent, while Google focuses on using machine learning to infer intent.

When things go sideways, this means DDG empowers users, and google overrides
them.

Disclaimer: I switched away from google search years ago, and now I find its
user interface completely baffling.

Maybe I just don’t know how to use it any more, and there is some override
I’ve overlooked.

Also, I’m kind of shocked by how many corporate logos and spam links fly by
when I scroll down the Google results for “cloud”. It feels like the 90’s
internet before animated gifs and the blink tag were invented.

In fairness to google, ddg has a similar number of logos, but they’re each
approximately the size of one normal sized character, not one third the width
of the screen like in google.

~~~
lol768
> Duck Duck Go has a similar problem for cloud and thrush, but if you click
> the “meanings” tab, it lets you switch results over to what you meant, and
> then gives decent results.

This is a good idea but not well implemented. For example, searching Python
displays a relevant carousel of "meanings" \- you have the snake represented,
and the programming language. But choosing an item from the list simply
changes the search term.

This is particularly bad with "Ruby". Perform a search for Ruby and then click
the 'a pink to blood-red colored gemstone' meaning. Observe that the top
result now that you have clarified is "Ruby Programming Language".

------
jccalhoun
I've been doing some research on videogaming in the 90s and I run across so
many dead links to things that even archive.org doesn't have. The thing that
is really frustrating is that there were a handful of pseudo-podcasts with
interviews from game developers that are all gone because they were posted in
RealAudio.

~~~
bshipp
if you're finding stuff that's not on archive.org just append
web.atchive.org/save/ in front of the url to get the crawler to capture it.

I used to have a chrome and Firefox JavaScript bookmarklet to do it in one
click but that doesn't work anymore.

~~~
bshipp
oh, for the love of...two hours into work and I read this thread and there's a
fat-fingered spelling error that I can't edit anymore.

The url is obviously
[http://web.archive.org/save/https://<url-to-be-saved>](http://web.archive.org/save/https://<url-to-be-saved>)
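
For anyone who wants to script this, here is a minimal Python sketch that only builds the save URL following the pattern in the comment above. The function name `wayback_save_url` is made up for illustration, the sketch makes no network request, and it does not check whether the service will accept a given page (or rate-limit or require a login for it).

```python
from urllib.parse import urlsplit

# Prefix taken from the comment above; everything else here is illustrative.
SAVE_PREFIX = "http://web.archive.org/save/"


def wayback_save_url(page_url: str) -> str:
    """Return the 'save this page' URL for page_url (hypothetical helper).

    Only string handling happens here; opening the returned URL in a
    browser (or fetching it) is what would actually ask for a capture.
    """
    page_url = page_url.strip()
    parts = urlsplit(page_url)
    # Require an absolute http(s) URL so the result matches the
    # http://web.archive.org/save/https://<url-to-be-saved> pattern.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {page_url!r}")
    return SAVE_PREFIX + page_url


if __name__ == "__main__":
    print(wayback_save_url("https://example.com/some/old/page.html"))
```

Pasting the printed URL into a browser is the same one-click step the old bookmarklet used to do.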

------
Tepix
The answer is clear: It takes an effort to publish something on the internet.
Companies and other organisations disappear and their web presence disappears
with them. It takes an extra effort to create a copy. The dark web - the parts
of the web that aren't easily accessible because they consist of databases,
paid access areas or perhaps simply facebook posts not published as "public"
\- remains hidden and inaccessible to preservation efforts.

There's also the problem of consuming old internet content. Do you still have
a gopher client? Will old web pages look the same in your browser as they did
in 1995? Flash will no longer work (hooray?).

On the other hand books, photographs and magazines stick around until you
throw them away and we have public institutions that are dedicated to their
preservation.

~~~
badsectoracula
FWIW old web pages do look the same as they did in 1995, with the exception of
the default background color changing from silver to white at some point for
some reason. But, at least under Windows, the default fonts, colors, spacing,
etc are 99% the same.

Flash also works if you are using Chrome. Sadly, Firefox does not support
Flash plugin anymore (and i say sadly because just yesterday i was trying to
find some files from an old site that was made using Flash and i had to use
Chrome just for that).

~~~
icebraining
Firefox still supports Flash, it just doesn't bundle it, unlike Chrome - you
have to install the plugin manually. They're only slowly deprecating it
starting on FF 69 (the "always activate" option will disappear, so users will
have to click to activate it every time:
[https://bugzilla.mozilla.org/show_bug.cgi?id=1519434](https://bugzilla.mozilla.org/show_bug.cgi?id=1519434))

------
denysonique
With the new Terrorist Content Regulation by the EU there will be even less
left of the internet we currently have today. Here is a recent open letter from
Vint Cerf, Jimmy Wales, Bruce Schneier, and the EFF regarding the matter:
[http://www.politico.eu/wp-content/uploads/2019/04/TCO-letter...](http://www.politico.eu/wp-content/uploads/2019/04/TCO-letter-to-rapporteurs.pdf)

------
jedberg
"While digital storage has fallen drastically in price, archiving all this
material still costs money. “Who’s going to pay for it?”"

This is really interesting. If you think about it, for all of our print
archives, the creator bore some of the cost of archiving it. When you print a
book you have to send a copy to the library of congress, so the "storage" is
paid for by the publisher. Same with newspaper archives -- usually the
newspaper is sent by the publisher, who pays for the "storage" (the paper and
ink). I think the UK and France and many other countries have similar systems.

But in the digital age, the entire cost is carried by the archiver.

~~~
ohithereyou
>When you print a book you have to send a copy to the library of congress

As best I can tell, you only need to send two copies of a book to the Library
of Congress if you want to get an LCCN. Otherwise, there's no requirement to
do so, certainly not to receive copyright protection.

~~~
jedberg
That's true, but basically every publisher does it because they want an LCCN.
I suppose if you self publish you don't have to.

But the bigger point is that to be in the archive you have to bear some of the
cost.

------
djhworld
In some ways the 'old' web died around 2006-2007 when Facebook started to get
its hooks in.

MySpace was a big thing before that, where people curated their own profiles
with custom HTML, then Geocities before that.

When Facebook came along it felt like a breath of fresh air at the time as the
interface was clean and simple, and all your pals were there.

I guess it was one of those "you don't know what you've got til it's gone"
things, because the sterile Facebook UI and overly corporate Instagram
fiefdoms make me really miss the old internet.

The thing is though, does anyone outside of tech or media really give that
much of a crap? All their pals are on Facebook/Instagram, the barrier to entry
is low (no need to learn HTML/CSS/JS)

~~~
ungzd
Now Instagram users have to learn how to optimize posting time and how to
cheat algofeed by adding unrelated hashtags and even tuning color scheme in
photos. I regularly hear about such things from casual users with 50
subscribers, not SMM workers. It's much harder than knowing <b> and <img>
tags.

With heavy videoization of social media, they also have to learn video montage
and lighting.

------
edoo
The best page in the universe is still up!
[http://maddox.xmission.com/](http://maddox.xmission.com/) Like it or not it
is a historical monument.

------
Kiro
I used to be involved in multiple popular things on the internet and became
quite famous, even having my own "fan club". All of that is gone. Googling my
old nick name yields 0 results.

~~~
sneakernets
Same here. In the late 90's I was involved with some other kids from college
creating and maintaining a site called "The Crazy Zone", where I designed some
of the first JavaScript games ever made (as far as I know), including a crude
breakout clone and a magic 8-Ball. We even had a chatroom applet made in Java!

But it's just gone now. Can't find a trace of it.

------
roland35
My favorite memory of the earlier internet was hijacking Yahoo chat rooms
(guns and ammo was a favorite) and singing 80's karaoke while the normal
denizens became increasingly irate. Because once you took control of the
microphone you got to keep it as long as you held the button down ;)

Now the internet does not seem as fun and wholesome as it did back in the
90's; I feel bad that my kids don't get to enjoy that. On the plus side, maybe
they can make some side income as influencers?

------
xorand
This lack of care is amazing. On one side there is a majority of people who
react with oh, there's so much worthless stuff shared and consumed every day.
On the other side there are people who realize that there is so much valuable
information, which takes so little memory by today's standards.

Take for example the Sci-Hub library of articles. It's only 70TB. Most of it is
not stuff people share and like and consume like they do with Youtube or
Instagram. Most of it is even old, because many recent scientific articles are
available without passing through Sci-Hub.

Only 70TB of valuable data, in a sea of crap. We might lose this any day.
Which corporation cares about a tiny amount of data?

Or suppose you're a historian 100 years from now. What do you have as sources?
Nothing? In the age where everything is posted on social media? Too bad,
Google or Facebook or others like that decided to trash a huge quantity of
data, most of it crap, to be sure, and a tiny quantity of unique, or hard to
produce, or scientific, artistic data.

Books which you can't read because MS drops a service? How many of the share-
and-like people passed by the experience of writing a book?

------
Bud
I find it sad that a post like this about the "early Internet" is really about
websites. The web is, of course, not actually the "early Internet".

------
nikisweeting
I arrived a little late here, but for anyone just getting into web archiving,
it's a fascinating space!

We maintain a master community index with a list of all the major archiving
organizations and open source tools here:

[https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Comm...](https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community)

Some of the cooler projects apart from ArchiveBox.io:

\- [https://github.com/webrecorder/pywb](https://github.com/webrecorder/pywb)
/ webrecorder.io

\-
[https://github.com/HelloZeroNet/ZeroNet](https://github.com/HelloZeroNet/ZeroNet)

\- [https://github.com/oduwsdl/ipwb](https://github.com/oduwsdl/ipwb)

\- [https://getpolarized.io](https://getpolarized.io)

\-
[https://github.com/internetarchive/brozzler](https://github.com/internetarchive/brozzler)

And many more blog posts, articles, projects, and organizations on the wiki!

------
leoc
Somewhat cheeky from the BBC, which has a record of permanently deleting large
sections of its website much more recently than c. 1999.

------
daggasoft
Someone at the Internet Archive brought this thread to my attention. You all
may be interested in checking out my project.
[http://theoldnet.com](http://theoldnet.com)

The original purpose of this project was to allow vintage computers (old
browsers) to surf the old net using the Internet Archive's Wayback API. But
most people use it on modern computers. We just passed 100,000 pages served up
yesterday.
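
For readers wondering what a Wayback lookup involves, here is a minimal sketch. It is not the project's actual code. It assumes the Internet Archive's public availability endpoint (`archive.org/wayback/available`) and the usual `web.archive.org/web/<timestamp>/<url>` snapshot layout. The function names are made up for illustration, and the sketch only builds URLs, so it makes no network requests.

```python
from urllib.parse import urlencode

# Assumed public endpoints; theoldnet.com may well use something else.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
SNAPSHOT_PREFIX = "https://web.archive.org/web"


def availability_query(url: str, timestamp: str) -> str:
    """Build a query asking for the capture closest to `timestamp`.

    `timestamp` is a digit string such as YYYYMMDD or YYYYMMDDhhmmss.
    """
    params = urlencode({"url": url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


def snapshot_url(url: str, timestamp: str) -> str:
    """Build the address of an archived copy of `url` near `timestamp`."""
    return f"{SNAPSHOT_PREFIX}/{timestamp}/{url}"


if __name__ == "__main__":
    print(availability_query("stuff.simplenet.com", "19971222"))
    print(snapshot_url("http://stuff.simplenet.com/", "19971222141333"))
```

A proxy for vintage browsers would presumably fetch the snapshot address on the browser's behalf and hand back the archived HTML. That part is left out here.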

I've added a number of features to the site including a working guestbook and
automatic bubbling of GeoCities pages to the homepage upon discovery.

Two days ago I released the Old Net Navigator web browser. It's a little buggy
but mostly feature complete. You can check that out here
[http://theoldnet.com/browser](http://theoldnet.com/browser)

------
pcr910303
Isn’t the distributed nature of the web what makes this super hard? Anyone can
buy a new domain, anyone can structure their website differently, and anyone
can stop publishing content at any time. Innovative ideas about distribution
and structure pop up frequently and change how data is served. That’s different
from traditional publishing, where publishers are needed to publish content,
the structure has not changed since the B.C. era when books were invented, and
the data is physical. We started archiving publications long after they were
invented; the web was made only 20 years ago. It’ll take a long time to find a
way to reliably save data that is served in a distributed sense, and people
will have to study the web through the parts that survived, just like how we
study papyrus or the Rosetta Stone.

~~~
0815test
> It’ll take a long time to find a way to reliably save data that is served in
> a distributed sense

That's actually easy - it's literally what blockchains are for. The whole
issue is how to manage the tradeoff between reliably archiving everything (and
expecting participants in the network to do the same, so that the system
doesn't end up relying on a single point of failure), and delivering/serving
the stuff efficiently.

~~~
opportune
I don't think a long term, major _content_ archival project is a good idea for
a blockchain. Since the internet is so large, you would have to have major,
major sharding (as opposed to the naive approach in bitcoin where all nodes
contain the entire history). And additionally if you archive everything in an
immutable datastore, you are going to end up archiving and then hosting
illegal content, and afaik it is still not settled whether that constitutes a
crime (in my opinion, it should).
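
To make the sharding point concrete, here is a rough sketch of hash-based placement with a fixed replication factor. It is not how any particular blockchain works. The 70TB figure is the Sci-Hub number mentioned elsewhere in this thread; the node count, replica count, and function names are assumptions for illustration.

```python
import hashlib


def holders(content: bytes, nodes: int, replicas: int) -> list[int]:
    """Pick `replicas` distinct node indices for a blob from its SHA-256 hash."""
    if not 0 < replicas <= nodes:
        raise ValueError("replicas must be between 1 and the number of nodes")
    # Use the first 8 bytes of the digest as a deterministic starting index.
    start = int.from_bytes(hashlib.sha256(content).digest()[:8], "big") % nodes
    # Take consecutive nodes (wrapping around) so the indices never repeat.
    return [(start + i) % nodes for i in range(replicas)]


def per_node_storage_tb(total_tb: float, nodes: int, replicas: int) -> float:
    """Average storage per node if every blob is kept on `replicas` nodes."""
    return total_tb * replicas / nodes


if __name__ == "__main__":
    # Naive approach: every one of 1000 nodes keeps the whole 70TB archive.
    print(per_node_storage_tb(70, nodes=1000, replicas=1000))
    # Sharded approach: each blob lives on only 3 of the 1000 nodes.
    print(per_node_storage_tb(70, nodes=1000, replicas=3))
    print(holders(b"some archived page", nodes=1000, replicas=3))
```

The tradeoff is that with few replicas per blob, losing a handful of nodes can lose content for good, which is why the replication factor matters as much as the shard count.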

------
Merrill
One of the brain's most important abilities to maintain sanity is the ability
to forget the unimportant and remember the important.

The internet will need that ability as it becomes more intelligent. Having AI
reasoning based on vast pools of fake news and other disinformation would not
be a good thing.

------
eruci
I got sued over copyright infringement a few years back, and most of the
evidence against my company consisted of screenshots of my sites/blogs from
the internet archive. It is a good research tool for "who came first".

------
Keyframe
Time really did move fast, but it seems to have slowed down. "The early
Internet" - by the end of the '90s it seemed like the Internet was all over
the place and had been with us forever, with large sites having established
their presence, and we were on the verge of dotcom. All in all, it was only
what, five or six good years (until 1999). All of THAT happened in that time
span.

Now, look at the last six years.

------
driverdan
I'm still amazed that Internet Archive maintains copies of my first site from
1997:
[https://web.archive.org/web/19971222141333/http://stuff.simp...](https://web.archive.org/web/19971222141333/http://stuff.simplenet.com/)

Many of the external links are archived too.

Anyone else have links to share of their old, pre-2000 sites?

------
newaccoutnas
For some nostalgia, poke around
[http://www.webring.org/](http://www.webring.org/)

------
lawrencegs
And this is just the web. Don’t forget that we also have apps. Some of the
newer startups / media companies don’t have, or choose not to have, a web
interface. We have archives for the web, but not for apps. I looked at
screenshots of the first Uber & Amazon apps, and it was amazing how different
they look compared to today (2019).

------
armortech
Spent the last ~10 years actively searching, every other week, for a small
game I liked as a kid. It was on one of those CDs you got with PC magazines.
Even got in touch with the original author who, of course, had lost it too.
Found it a month ago. Can you imagine the happiness?

------
drmsucks
It's going to get even worse now that entertainment media is being "streamed"
and DRM is preventing users from making copies. Today's game servers and loot
boxes are very short lived, and they aren't even selling physical disks for
people to preserve and enjoy.

~~~
linuxftw
The solution to this problem is a simple one: Don't buy DRM content. It's just
entertainment, you can live without it.

------
return0
Often it's the domain renewal issue. People won't keep paying for the domain
of an irrelevant website, even if the hosting itself is free. OTOH, why would
we want to keep everything that was ever written on the internet? Forgetting
is a natural process in culture.

~~~
gugagore
I think there's a lot of sadness around forgetting. I'd like to preserve
almost all of it, really.

That's why we have stories, and care about legacy. All to avoid forgetting and
being forgotten.

~~~
jopsen
Put static HTML on GitHub pages and it'll probably stay up forever..

Well, at the time scale of forever, it's always sketchy, but the odds are
decent.

------
dwd
The Internet Archive is the closest I have to a resume of past projects.

So many things I've worked on no longer exist (at least not in the same form)
but can still be found there.

------
BorisMelnik
haven't logged in in a while, but more people should donate to archive.org -
and I'm not speaking AT HN at all because statistically I'm sure HN
contributes more than anyone but archive.org are truly the historians of the
web and we need to support them.

------
pard68
I know of Eternal September, but is there much life left in Usenet other than
for binary distribution?

~~~
icedchai
Some obscure groups, like comp.os.vms, are actually quite active.

------
stesch
Article is about the web. Clicked it because I thought it's about the early
Internet.

------
nemoniac
The title is misleading. The article is about the early Web, not the early
Internet.

------
revskill
Is it about the database, or about where the content has gone?

It's not about the web.

------
Lord_Zero
Simply put: It is not free to keep stuff up on the internet forever.

------
purplezooey
That site is heavy on generic clip art photos.

------
agumonkey
the process of wild growth fueling the need for organization, which in turn
causes sclerosis, is .. as mundane as it is strange.

