
Building a legacy search engine for a legacy protocol - benjojo12
https://blog.benjojo.co.uk/post/building-a-search-engine-for-gopher
======
MrRadar
Do you still have the raw crawl data? If so, have you considered uploading it
to the Internet Archive to help preserve it for the future?

~~~
benjojo12
Nice idea, I've rolled up all of my data and crawl database and put it on the
internet archive:

[https://archive.org/details/gopher-may-2017.tar](https://archive.org/details/gopher-may-2017.tar)

~~~
snakeanus
Could someone please help seed the torrent? It has been stuck at 99.9% for the
past day.

~~~
MrRadar
The Internet Archive should be seeding all the torrents they host. Have you
tried restarting your torrent client, forcing it to recheck your local data,
or re-downloading and re-adding the torrent file?

------
mkup
Windows 98 is a flaky OS, IIRC it used to crash with BSoDs multiple times a
day even on real hardware. That AltaVista standalone crawler software probably
could run much better under NT 4.0 or Windows 2000. And NTFS allows for much
larger datasets than FAT32. NTFS hasn't changed much since that time, by the
way.

~~~
derefr
Old Windows versions crashed so much because hardware manufacturers at the
time slapped together poor drivers, and Microsoft couldn't do much about that
if they wanted people's computers to work at all. The #1 thing that has
improved since then is simply that more of the drivers are now written by
Microsoft themselves. The #2 thing is Microsoft getting enough power over
hardware makers to force driver quality-assurance and signing on them.

Which is all to say: if you run Windows 98 on hardware that has good drivers
written for it—or especially on _virtual_ hardware whose "drivers" are just
paravirtualized calls into a modern OS kernel—your copy of Windows 98 won't be
BSoDing any time soon.

~~~
TazeTSchnitzel
There's also the thing where, uh, Windows NT has full memory protection.

On Windows 9x it was trivial for any app to hose the OS, because critical
memory regions were unprotected.

------
Keyframe
Anyone remember back in 93/94 hearing about HTTP and saying 'gopher, but with
images? Neat! Oh, but I need a graphics terminal... damn'?

------
edward_rolf
Very cool and a nice write-up, thank you.

I know absolutely nothing about the AltaVista software so this was
interesting. I'm wondering, were there limitations in scale? Did the search
engine have other types of technical limitations? If you had to choose from
modern software, what would have been your choice?

------
anigbrowl
Gopher was awesome back in its day, and I'm sad that the semantic web efforts
(its nearest modern aspirant) have done so poorly.

------
spc476
Wow ... weird to think I'm running one of the 370 gopher servers left in the
world: gopher://gopher.conman.org/

------
mysterydip
I actually thought about doing this a couple months ago but never got past the
"cool idea in my head" phase. Great to see this!

------
DamonHD
AltaVista brings back memories!

And in the archive I was able to find some of our old text from '92! It's
alive I tell you.

I didn't use gopher much if at all when I had the chance back in mumble mumble
'80s/'90s so I'm pleased to finally use it. Surely it'll be back in fashion
soon?

------
rurban
But still, I would rather use xapian over AltaVista. Same technology (inverted
index), but stable and much better. Win98 as a service is just asking for too
much trouble. And no, not ElasticSearch. No Java in the house, C++ is enough.
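
For anyone unfamiliar with the term: an inverted index maps each term to the set of documents that contain it, so a query becomes a set intersection instead of a scan over every document. A minimal sketch in Python follows. It is not how AltaVista or xapian actually implement it, and the function names (`tokenize`, `build_index`, `search`) and the example.org URLs are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Start from the first term's postings, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents, keyed by made-up Gopher URLs.
docs = {
    "gopher://example.org/1/about": "About this gopher server",
    "gopher://example.org/0/news.txt": "Server news and gopher updates",
    "gopher://example.org/0/recipes.txt": "Soup recipes",
}

index = build_index(docs)
print(sorted(search(index, "gopher server")))
```

Real engines add ranking, stemming, and on-disk posting lists on top of this, but the core lookup structure is the same idea.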

------
e12e
Fantastic article. Wonder if the AltaVista software would run under ReactOS?

------
digi_owl
Gopher strikes me as something of a file explorer for the net.
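
The comparison holds up at the protocol level: per RFC 1436, a Gopher menu is a list of lines, each starting with a one-character item type (`0` for a text file, `1` for a directory) followed by tab-separated display string, selector, host, and port, with a lone `.` ending the listing. A rough sketch of parsing one, assuming the menu has already been fetched as text; `MenuItem`, `parse_menu`, and the sample hostnames are hypothetical names for illustration:

```python
from typing import NamedTuple


class MenuItem(NamedTuple):
    item_type: str  # "0" = text file, "1" = directory (menu)
    display: str
    selector: str
    host: str
    port: int


def parse_menu(raw):
    """Parse the text of a Gopher menu into a list of MenuItem tuples."""
    items = []
    for line in raw.split("\r\n"):
        if line == ".":
            break  # a lone period terminates the listing
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:
            continue  # skip malformed lines
        # Keep only the first four fields; some servers append extras.
        display, selector, host, port = fields[:4]
        try:
            port_number = int(port)
        except ValueError:
            continue  # skip lines with a non-numeric port
        items.append(MenuItem(line[0], display, selector, host, port_number))
    return items


# Made-up sample menu: one directory and one text file.
SAMPLE = (
    "1Phlog\t/phlog\tgopher.example.org\t70\r\n"
    "0About\t/about.txt\tgopher.example.org\t70\r\n"
    ".\r\n"
)

for item in parse_menu(SAMPLE):
    kind = "dir " if item.item_type == "1" else "file"
    print(kind, item.display, item.selector)
```

Directories and files with a display name and a path are more or less what a file manager shows, which is why the analogy feels natural.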

------
mempko
Wow, this really brings me back. Well done.

