This was particularly bad because one of our earlier strong points was fresh indexes. Our ability to refresh the supplementary index on the fly was awesome. When you lose one of your primary strengths, it's noticeable.
I don't mean to minimize the downside of losing focus. That's one of the biggest lessons I learned while working there. I'd say that our failure to maintain a high quality index was directly caused by our loss of focus, in fact. But it's important to remember that both UI and the underlying index quality matter.
"What about PageRank?"
Eh, not actually unique to Google. Remember that Jon Kleinberg was developing HITS in parallel -- AV was well aware of the concept of measuring page importance using incoming links, and we had our own implementation. It may not have been as good. It's hard to tell when your underlying data source is stale.
Also, any AV article which doesn't note that we bought Elon Musk's first company is inherently flawed. ;)
My recollection is that Alta Vista supported boolean operators, but defaulted to OR while Google defaulted to AND. So searching Alta Vista for something like "$CommonWord $UncommonWord" would return results with high-ranking pages for $CommonWord that drown out all the low-scoring pages for $UncommonWord, whereas Google would return results that match the intersection (which would actually be relevant to the user's query). I'm convinced this default might have made a bigger impact on Google's success than any PageRank magic.
My theory on why this might matter even for people who knew how to use the operators is this: With OR as default, you would first try your query without operators, get page upon page of irrelevant results, and then start to narrow your query down. With AND as the default, you would type in the query, and if you only got a few irrelevant results, or often no results at all, you would try alternative terms instead.
It seems that progressing from no result to desired results by choosing alternative terms just makes more sense than having to wade through irrelevant stuff, and the default encourages one methodology over the other.
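To make the OR-versus-AND default discussed above concrete, here is a minimal sketch over a toy inverted index. The corpus, the `build_index` and `search` names, and the `default_op` parameter are all made up for illustration; this is not how AltaVista or Google actually implemented query parsing.

```python
from collections import defaultdict

# Hypothetical toy corpus; ids and texts are invented for this example.
DOCS = {
    1: "the history of the teapot",
    2: "the best marmalade recipes",
    3: "the marmalade teapot collection",
    4: "the weather today",
}


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query, default_op="AND"):
    """Combine the posting sets of all query terms with the default operator."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    if not postings:
        return set()
    if default_op == "AND":
        # Intersection: a document must contain every term.
        return set.intersection(*postings)
    if default_op == "OR":
        # Union: a document needs to contain only one of the terms.
        return set.union(*postings)
    raise ValueError(f"unknown operator: {default_op}")


INDEX = build_index(DOCS)

# A common word plus an uncommon one: OR returns every document that
# contains "the", AND returns only the documents that contain both.
or_hits = search(INDEX, "the teapot", default_op="OR")
and_hits = search(INDEX, "the teapot", default_op="AND")
```

With OR the common word pulls in the whole corpus, and ranking has to do all the work of surfacing the two teapot pages. With AND the candidate set is already narrow, and a query with a term that appears nowhere (say "invisible teapot") comes back empty, which nudges you toward trying alternative terms.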
Today, it's very hard to make Google return no results at all. Not just because the amount of content grew to an unimaginable scale in the meantime, but also because Google has become way, way fuzzier in the way it interprets search terms, likely to better suit a larger and different audience. A lot of times today, I have to switch to "verbatim" mode first, at least for technical stuff.
1. entering 'invisible marmalade teapot' and getting no results
2. changing to 'invisible marmalade teapot with tartan cosy', again nothing
3. so 'invisible marmalade teapot with tartan cosy in outer space', ditto
You: do you have that new crime novel in stock?
Bookseller: er, i don't know which one you mean?
Y: the new crime novel by john grisham, pelican something or other?
B: oh, right, yes, here it is!
default OR in a search engine would mean my first example eventually starting to return results about transparent space coffee pots with tartan cosies, ignoring the first few terms but the rest match, which is often helpful, particularly if you're doing an exploratory search for something where you aren't sure of the exact details.
I don't find your examples compelling.
If you want to do an additional search that does not depend on your first terms, you simply bring up a new search window.
In your 'real world' example, the equivalent search queries would go something like
search: new crime novel
result: way way too much stuff
search: new crime novel john grisham pelican
result: exactly the right book because every one of those terms applies
9130 webpages agree with you.
To switch on Verbatim, I click "Verbatim" on the left side of the Google page; it appears just under the alternative, "All results". I think on other systems it can be well hidden in menus. I use it almost every time I google anything. Otherwise you get a load of irrelevant crap.
The reason being you can just search twice if you want either word, but in the vast majority of cases you want both words when you enter them in the searchbox. Most companies kind of split the difference and put the AND results first then fill in with OR results, but that mostly just leads to "if your answer isn't on the first page, it's not going to be on the second or the hundredth".
IMO this was one of the biggest contributors to the perceived quality advantage of Google vs AV.
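The "AND results first, then fill in with OR results" ordering mentioned above could look roughly like this. It is only a sketch: the `and_then_or` function and the sample documents are hypothetical, and a real engine would fold many more ranking signals into each tier.

```python
def and_then_or(docs, query):
    """Return ids of matching documents: full (AND) matches first,
    then partial (OR) matches ordered by how many terms they contain."""
    terms = query.lower().split()
    scored = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        matched = sum(1 for term in terms if term in words)
        if matched:
            is_full_match = matched == len(terms)
            scored.append((is_full_match, matched, doc_id))
    # Full matches sort first, then more matched terms, then lower id.
    scored.sort(key=lambda item: (not item[0], -item[1], item[2]))
    return [doc_id for _, _, doc_id in scored]


# Hypothetical documents, invented for this example.
BOOKS = {
    1: "new crime novel",
    2: "john grisham crime novel",
    3: "new cookbook",
    4: "gardening tips",
}

ranking = and_then_or(BOOKS, "new crime novel")
```

Here document 1 matches all three terms and leads the list, documents 2 and 3 fill in behind it as partial matches, and document 4 is dropped entirely. The long tail of partial matches is exactly the part that rarely helps if the answer was not in the first tier.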
Relevant xkcd about the “second page of Google results” effect:
It always took 2 to 5 searches using Altavista to get the results you were looking for. This was a huge improvement on Excite and Lycos which might produce infinite results with absolutely nothing relevant. There was a lot of noise like source code archives. With Google search the first page usually had a useful result.
Probably the biggest thing that destroyed Alta Vista was the horrible flashing banner ads at the top of the screen.
From the article:
> This move away from AltaVista’s streamlined search experience made AltaVista more similar to its competitors. Users gradually began to switch to a newcomer, Google, for the simple search they missed.
I don't think this was the case at all. People switched because they got better results.
And the guy who started Baidu (the Google of China), Robin Li, had created an algorithm similar to PageRank, and even patented it in the US, before Google (he filed for the patent in 1997; Google was founded in 1998).
However, Kaiser Kuo points out that Robin Li, the co-founder of Baidu, obtained a patent for hypertext link analysis before Larry Page obtained his “Page Rank” version.
I feel like that tells me where your percentage would lie...
EDIT: I apologize, it was intended as a joke to lighten the mood. It's never a good thing to lose one's job or have a company fail, even that long ago, so my response is to attempt humor.
> I feel like that tells me where your percentage would lie...
Are you responding to BryantD's admission of (widespread) bias by accusing BryantD of bias? This seems to add nothing to the conversation.
I see; I incorrectly read it as accusatory. Thanks for the very civil reply!
That's okay, I have a very dry sense of humor and straight-faced delivery in person as well. Which is unfortunate because it means I can't blame the lack of tone online when jokes don't land, they often don't in person either.
Honestly, I doubt that was even Google's plan for the first years of its existence. It was more "Hey, we made this neat search thing, let's see if we can figure out a way to make money from it".
I don't know if the information is true, but I know I've heard it more than once.
In a way, it ranks a link by its own incoming links (which are ranked by their incoming links etc). It was possible to game AV by setting up 100 sites that point to your own.
In pagerank, you essentially had to convince already popular pages to link to yours.
For a while, google effectively had no spam, and AV had lots. Eventually, spammers learned to game pagerank; the arms race is still on.
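The idea described above, that a page's score depends on the scores of the pages linking to it, is usually written as a power iteration. The sketch below is the textbook form with a damping factor of 0.85, not Google's production system and not whatever AV ran; the `pagerank` function, the graph and all page names are invented for illustration.

```python
from collections import Counter


def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Pages without outgoing links spread their rank evenly over all pages.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: distribute its rank uniformly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: "popular" is linked from three pages and links to
# "endorsed"; "spam_target" is linked from two throwaway pages.
LINKS = {
    "a": ["popular"],
    "b": ["popular"],
    "c": ["popular"],
    "popular": ["endorsed"],
    "farm1": ["spam_target"],
    "farm2": ["spam_target"],
}

inlinks = Counter(t for targets in LINKS.values() for t in targets)
ranks = pagerank(LINKS)
```

Counting raw incoming links puts "spam_target" (two links) ahead of "endorsed" (one link), while the iteration puts "endorsed" ahead because its single link comes from a page that is itself well linked. A large enough link farm can still win in this model, which fits the arms race mentioned above.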
It had been applied decades before to scientific paper references, as an improvement on the "number of references" metric, which is more easily gamed. Citations are more rarely circular (only papers in preparation at the same time can cite each other and form cycles, unlike web pages). I was sitting in a class about stationary processes in 1996 when the lecturer mentioned this (already old and well known at the time) use case as motivation.
Whatever AV implemented at the time, it was not on par.
Pagerank is essentially a “universal authority score” (in the HITS terminology), and it worked well because at the time you didn’t have pages that were an authority for one subject and spam for another. You do now - which is why pagerank is now one signal out of 200, even though it was sufficient on its own 20 years ago.
This of course was also at a time when "whitehouse.com" was a porn site.
The lack of fresh index surely was a factor but not sure whether it was primary.
Obvious question - why did you not update the index? Was it that it was obvious Google was going to win and made people give up? (edit - never mind - you address this in other comments)
> The lack of fresh index surely was a factor but not sure whether it was primary.
I wasn't aware of index freshness back then, but I saw broken links. At the time I thought that was normal, that it takes time to scan the entire Internet. With Google I generally got working websites.
I did like the boolean search in AV; it helped with obscure searches, especially when a name was similar to a typo of a popular word.
Do you believe Google's strategic decision to use commodity computers and hard drives gave them any competitive advantage (cheaper cost, scaling, etc) compared to DEC Alpha servers?
As an outsider, it seems like Google could iterate its data centers faster and cheaper and therefore, their web crawlers were cheaper to run (also run more frequently), also cheaper to store terabytes of data, and also cheaper to service search queries.
* I was also not exactly sober at the time, so these numbers may be a bit off. The number of wafers per chip being greater than 1, though, I am absolutely certain about.
With cheaper techniques, the idea is that the "more capital efficient" way of indexing the ever-expanding web would in turn provide better results for an improved consumer experience. It's the old adage of "do more with less".
For example, see the old Danny Sullivan graphs showing how Google's index was growing faster than AltaVista's. Having a bigger index lets one return more relevant search hits.
AltaVista wasn't just falling behind in "staleness" of old indexes; the aggregate size of its index was smaller than Google's as well.
It did apply, to a point. Before Google, I had switched to AllTheWeb as my search engine of choice since a lot of sites just wouldn't show up in AltaVista no matter what you searched for, and ATW had a bigger index (I guess staleness could have had the same result).
But of course eventually I switched to Google for the better search results.
I didn't know anything about Google until mid 2000, and when I used it, I just thought it was an AltaVista clone.
Fast forward to fall 2000, and once after getting bad results on AltaVista, I tried Google again. At the time, I never bookmarked either of the sites or set them as my homepage. I remember, when I used Google, I thought to myself, "I'm going to switch to whoever comes out with a browser toolbar. Search should just be part of the browser."
About a week later, I had some bad results on AltaVista, and typed in google.com. Immediately I saw the "try our toolbar" banner.
At that point, I switched to Google, and I NEVER went back to AltaVista. (I think sometime in 2001 I tried AltaVista again, out of loyalty, to see if they finally had a toolbar, but the results were so bad I was in shock.)
Surprised that it still exists! https://www.google.com/intl/nl/toolbar/ie/index.html - and the screenshot even seems to show a "share to google+" :)
>In the mid 1990s, . . . from a place that would come to be known as Silicon Valley
In the mid 1990s the name Silicon Valley had been used in mainstream media for at least a dozen years.
Was the aspect how they engaged with government bureaucracy and managed to in effect define a work culture that is so far removed from governmental bureaucracy, that you wonder what the early days was like.
I'm glad this has been mentioned. I noticed this at the time but never really knew if it was actually the case! Altavista had become somewhat annoying to use at the time, and Google's relatively clean front page (it was more than just a search box in the earliest days, but still cleaner than AV's) meant it loaded quicker, and that sold me on both the speed and the fact that the results actually loaded.
Agreed. My previous manager had worked with Elon Musk at Zip2 (maybe even as the CTO). He used to talk about how peculiar Elon is. Does it ring a bell who I might be talking about? :)
I have always wondered what Zip2 actually was. Can you shed some light?
Was it like a primitive version of Yelp?
Did the map work like Google Maps?
giving away a free demo of <company> prowess is cool, but it's a big reason for its death I guess
Ha, never heard about that - sounds contorted (nowadays) but somehow funny - so somebody in some company was sitting next to a fax waiting for something to come out of it, then when that happened the employee wrote the reply (scribbled on the same or another piece of paper) and faxed back the reply?
I wonder if I would like or hate doing something like that today - waiting for & finally seeing a piece of paper containing an unknown message coming out of a device sounds somehow fascinating... :)
Just get a job at any restaurant in Japan. Fax is the way nearly all to-go orders are placed. Fax machines are still massively popular there.
Sadly, now google is a terrible mess of moderated and curated nonsense.
Then their gmail (especially the initial invite and storage) and chrome made google the "cool" tech company and pretty much cemented their place in the tech world. Sadly, they've turned out to be monsters rather than saints and we are all the worse off for it.