

Google 2000 vs. Google 2011 - sjs
http://www.mattcutts.com/blog/google-2000-vs-google-2011

======
vaporstun
While I also agree that Google's search results have gotten better, I just
feel (this is very subjective of course) that they should be far better today
than they currently are. I don't think they're leaps and bounds better and I
think they should be.

I'm not one to constantly point to yesteryear and claim they were better but I
do look at where they are today and think they should be further along than
they are.

I guess when I look at certain segments of technology in 2000, some other
companies have stayed more focused on their primary goal and have made
phenomenal achievements since then. For example, OS X wasn't released to
consumers until 2001 and the iPod didn't exist. Yet look where we are today. I
have an iPhone in my pocket with incredible computing power, a very good
refined interface, and a fantastic OS. Apple has stayed very focused on OS and
consumer device design and pushed the field forward very far, with everyone
else scrambling to catch up. This is their main business, and while they have
other side projects (like the AppleTV) they stay focused on their main thing
which is making great devices for consumers.

Google, meanwhile, whose primary business was search+, now operates in many
different business segments. Their focus has widened greatly and now they have
email, payment processing, online document editing, 2 operating systems,
automated cars, and the list goes on and on. Now, I have great love for many
of these products and services so I do not suggest that it was a bad idea for
them to do them, but I think it is apparent that search has suffered as a
consequence.

Had Google stayed more focused on search instead of all these side businesses,
search could be far more developed than it currently is today. I think they
got complacent while being on top and tried to find other places to allocate
their resources. Which, from a business perspective, I understand, but from a
search perspective it seems tragic.

Therein lies my disappointment with Google: not that they haven't been
innovating or improving search (they have), but that they have spent so much
time and resources on other things that search is not as far along as it
should be.

\+ Certainly it could be argued that their primary business is and always has
been advertising and search is just one of the many vehicles for pushing said
advertising, but the word Google is now synonymous with search and, marketing
speak aside, I consider it their primary focus.

~~~
barista
Well for 10 years I think the search results have only marginally gotten
better. Besides, 10 years ago, what Google was doing was revolutionary. No
other search engine had such vast coverage or better intent determination. Now
the competition is intense, Bing is almost as good as Google if not better in
some scenarios, and other guys like blekko are coming up with a different
approach (and have a better way to handle spam by avoiding it completely).

I think Google is spreading itself too thin. Search is what they do the best
and it's their bread and butter. They should stay focused on it.

Google is too engrossed in thinking about generating more ad dollars rather
than making their core technology better.

~~~
bad_user

          Bing is almost as good as Google
    

It's easier to be a follower (or ahem, copy search results). Also if I'm going
to switch away from Google it must be something truly better (like the
difference I experienced when trying out Google after using Altavista).

    
    
          and have a better way to handle spam by avoiding it completely
    

I like Blekko, but avoiding spam is an impossible dream. If there are search
engines (or any filtering method, automatic or based on peer reviews) out
there with less spam, that's because right now only Google matters. The only
way to fight it is to be a moving target, which is what Google is doing.

Unfortunately spammers are also very creative, so it's a tough battle.

~~~
barista
Re: bing copying Google, it has already been profusely discussed all over the
internet and has been shown as nothing more than a PR stunt by Google folks to
avoid discussing spam

Re: fighting spam, it's a tough problem indeed but so was indexing the ever
expanding web. Both Bing and Google seem to have solved the indexing problem
fairly well by now.

The issue with the spam problem is not so much how difficult it is to
solve but the motivation. Clearly the spam generates a lot of cash for Google
(and some for bing too) so they are less likely to be motivated to solve it.

The smartest minds are at work at Google so I find it hard to believe that
they cannot solve it (or even make a very good attempt at it). It's their
intent to not solve it and that's the reason I think they are trying to avoid
discussing it.

~~~
WalterGR
_Re: bing copying Google, it has already been profusely discussed all over the
internet and has been shown as nothing more than a PR stunt by Google folks to
avoid discussing spam_

I'm assuming the downvotes are because of that part.

I keep seeing the argument that "Microsoft didn't answer all of Google's
claims," but no supporting evidence. If anyone has some, I'd be curious to see
it. (This is a legitimate request: I haven't been keeping score.)

~~~
Matt_Cutts
At the risk of beating a dead horse, I'll try to summarize. Google believed
that queries and clicks on Google search results were being used in Bing's
ranking, so we ran an experiment. The experiment confirmed that clicks on
Google search results are used by Microsoft in Bing's search rankings.

Things we don't know include:

- the degree to which those clicks are used in Bing's rankings.

- whether MSFT does Google-specific processing of the clickstream data they
get, or whether clicks on Google are treated the same as
some-random-website.com.

- how long clicks on Google search results have been used in Bing's ranking
(months, years, etc.).

Those are open questions that Microsoft could best answer.

~~~
WalterGR
Thanks for the reply, Matt.

 _Google believed that queries and clicks on Google search results were being
used in Bing's ranking_

I thought it was the search _results_ that Google accused Microsoft of
stealing. I guess I need to go refresh my memory.

 _Things we don't know include..._

Good summary, thanks. I don't guess anyone is very surprised that Microsoft
isn't quick to offer up how it treats a single data stream.

~~~
Matt_Cutts
"I thought it was the search results that Google accused Microsoft of
stealing."

I'm trying to use very precise/neutral language to avoid the
"copying/cheating" brouhaha. The crux of the issue is that clicks on Google
search results are used in Bing's search rankings.

------
patio11
Google 2k11 is a _radically_ better product than Google 2k:

1) Crawling is now continuous, and _fast_. You no longer have to wait months
for an index update to see a new site on the Googles. Even pages well down the
head of the importance distribution get crawled, indexed, and searchable in
minutes.

2) Search in Japanese is ridiculously better. Just trust me on this. It has
gone from being Hotbot2001 to (English) Google 2005 quality, which is
impressive given the natural language problems Japanese throws at you, and the
underdeveloped state of the link graph.

3) The Google symbiosis with particular sites on the Internet - most
relevantly to my interests, Wiki and StackOverflow - has successfully
incentivized the creation of sufficient basic and in-depth content that it is
virtually inconceivable that you can come up with a generic English term and
have a total failure on it. (e.g. the top ten for [manatees] will now exceed
an eighth grade science report in informational content, which was a very hit
or miss proposition in 2000.)

4) Google seems to have gotten over themselves in one critical respect: they
used to vocally glorify non-intervention in the SERPs, as if G were the blind
oracle of the net. This led to truly stupid issues like a particular neoNazi
group ranking at #1 for "Jew" and Google running an AdWords ad against "Jew"
to explain that Google was content neutral between Jews and Nazis and in this
case the Nazis just happened to be more loved by The Algorithm so there.

I don't know about internally, but on the basis of external evidence, Google
is now willing to intervene in search quality on behalf of users. This is a
good thing.
(Now if they were humble and cautious about wielding immense power, that would
be a good thing, too. I get the persistent impression that they still think of
themselves as the wee little underdogs, cuddly academics in the dorm room,
rather than being Ma Bell.)

------
nhebb
I can't speak for others, but my disappointment is with the margin of
improvement. I know that results often sucked back in 2001, but with 10 years
of progress I would hope to get better results. In the case of "buy domain
name" (<http://www.google.com/search?q=buy+domain+name>), 4 of the top 6
results I get are How To articles, not sources to buy domains.

Google is trying to solve a hard problem, and I appreciate that. I just don't
have the patience I did 10 years ago to click through all the results to find
a legitimate site. If I get a set of results that look like spam, I try to
refine the terms, but often just give up.

I know that Google tries to solve this problem algorithmically, but I can't
help but think that mixing in a human review would immensely improve results.
And by human, I mean a Google employee, not a sure-to-be-gamed community
review.

~~~
tomwalsham
I think the issue here is that for you the correct results set would be
reputable domain name providers, whereas for many other people, the howto
articles are in fact what they're looking for.

Without having personal search switched on, they need to hedge their bets.
(note: not having tried it I don't know how effective the personalized search
is) In general these populist algorithms may tend to skew the results against
techies, or those who 'know what they want', in favour of overall increase in
perceived quality.

As far as I know there is a human component, and there is definitely
significant weight these days towards CTR as determining quality. These would
likely just reinforce the populist ranking factors.

One of the underlooked aspects of Google is the sidebar - using the 'fewer
shopping sites' and time-based queries can be a huge help to relevance. I've
for years used search modifiers (+, .., - etc.) in my initial queries with an
expectation of what Google might expect. A search like "Buy Domain Names"
still retains a lot of semantic ambiguity that their algorithm has to wrestle
with. You have the tools to fix that yourself.

As such I tend to agree with Matt on this. We've just forgotten how bad it
was.

~~~
nhebb
_for many other people, the howto articles are in fact what they're looking
for._

That's a fair point, but one of the top 6 when I searched was from
fourhourworkweek.com, so a whole other level of bias set in on my part. :)

 _We've just forgotten how bad it was_

I still remember wanting to go to hotbot.com and accidentally typing
hotbox.com. In a room full of people. So, some of my memories of early engine
use are actually quite vivid.

~~~
rapind
Try misspelling a few other well known domains. That tactic is still in full
force. I.e. <http://goggle.com/> (hint: this is not google, so fill it in at
your own risk).

------
jmilloy
I was skeptical when Google Instant was going to "change the way people
search" -- but in early retrospect, I think it really has for me. I now click
on one result maximum (and often zero), relying instead on rapid search-term
tweaking.

Even just a few years ago I'd expect to click through several results, or
heaven forbid to the next page, for what I felt was a difficult query.

Between google instant and other improvements, I am increasingly trained to
look ONLY at the top several results. When Google doesn't deliver, the results
seem worse than they used to, when they aren't. I think we're forgetting how we
used to have to search.

------
redstripe
I think Groucho Marx could sum up this article as "Who are you going to
believe, me or your own eyes?"

I didn't keep a record of my searches from 2000, but I do remember that I was
extremely impressed and satisfied with google back then. That is no longer the
case. I am frustrated on a daily basis by the results of many searches.

It's not really your fault though. I blame it on monoculture. Google has such
a huge hold of the search market that it's probably not even worth SEO people's
time to bother gaming other search engines. There is a whole industry feeding
off your success and we all suffer for it.

After the China attacks earlier this/last year Google put out a decree that
they would phase out Windows machines among their employees. I thought it made
sense. Windows is too easy a target because of its success. The same is true
in nature and the same is true in search engines.

The best we can hope for is a new search engine that will be insignificant
enough in terms of market share to avoid the scammers. As long as google is
the dominant search engine it will never get better.

~~~
cryptoz
> I am frustrated on a daily basis by the results of many searches.

I have some speculation on this. Are you sure that your frustration comes from
unusually poor results? Could it be that we're so used to things being perfect
now, that a page of mediocre results looks like the end of the world?

Look at 2000. No Shazam. If you heard a song on the radio, that was the end of
the story. There were some services you could call for $5 and a human would
help you identify it. Now in 2011 you press a button and wait 5 seconds.

The web had a _lot_ less information on it. A few hundred million people were
connected. Now, in 2011, a few billion people can connect. Huge shift in
quantity. Very tough to keep the S/N going strong.

2000: No Wikipedia. Now in 2011 if Wikipedia isn't in your top results you
might be upset, but in 2000 you were happy even though Wikipedia didn't yet
exist!

So I pose the question to you: Are you absolutely, positively sure that
Google's quality has declined and what you are seeing isn't just a side-effect
of everything else being so awesome?

~~~
listic
> Very tough to keep the S/N going strong.

I think there's also a problem that the quality of content that Google has to
index is steadily declining. In the 90's, many people were keeping lists of
links to good stuff they found on the internet. Google could crawl the links
and make conclusions. Now, people hardly do that anymore because they can just
google it.

So Google has to employ ever more powerful algorithms to maintain even the
same level of search quality. I think here lies an existential threat to the
search engines.

------
rbarooah
I'm not sure that a single hand picked result shows much. If Matt has 40,000
results, how about opening them so we can get an idea of how things have
changed for queries we care about, and letting people do an independent
review?

------
bhavin
I guess the whole backlash against search results was Content Farms
outsmarting Google (at least temporarily) and distorting results.

Google's search results in 2000 were better than any competitor or Google
itself 2 years before. But the same thing can't be said with certainty when
comparing 2010's search results to 2008's.

------
msg
Google's spam filter is training the internet to produce better and better
content. Over time a rising tide lifts all boats.

In 2000 they were working on not returning results from various subdomains off
the same base domain.

This year they're working on content farms.

Next year we won't be talking about plagiarized or duplicated content. We'll
be talking about Google giving us content that is superficially different but
not unique enough.

Five years from now we'll be talking about how Google is giving us unique,
relevant content in the first ten results, but it's not personalized enough.
Et cetera.

Just like in the AI world, every time a computer does something we thought
only a human could ever do, instead of admitting defeat, you say, "This just
proves that chess|poetry|music composition|playing Jeopardy doesn't take
_real_ intelligence." The truth is that you're not defining intelligence,
you're moving goal posts. And if there's one thing Matt's post shows, it's
that we moved the goal posts.

I think Matt likes the criticism. When Google's critics have to move to more
and more sophisticated attacks, it shows that Google's facility is growing.

Google is not just raising the bar on the whole internet. You are raising the
bar on Google.

------
alexbosworth
I think it's about content: Google has transformed the web into spam gardens
by promoting crap sites.

Because Google pushed low-substance sites high in their search results, this
created a vicious cycle where sites were rewarded for creating mass volumes of
inane 'content', which has now become a huge percentage of the web.

Google isn't just an observer of the system; search results impact the
ecosystem as a whole. A complicated environment is something that is
challenging to test and to predict, although that is what Google internally is
geared for.

Personally I'm also annoyed by Google's AdSense policies that emphasize inane
content. (Without warning, they blocked me for a post of a painting that
included a nude figure.)

------
maxklein
For me personally, google is much, much worse than before. I don't find what I
need.

What I think is sad is that alternative search engines still work almost the
same as google.

Of course he's going to find spam if he searches for the most spamable
content. But that's not what I usually search for - mostly I search for
programming related topics or troubleshooting. The old google would give me
the right result straight away, the new google gives me some page on the main
vendor's site that is unrelated.

There are many ways niche search engines could come up to solve this problem,
but none of them seem really interested.

~~~
ElbertF
Google isn't strict enough for programming queries anymore, for example this
query yields no results:

    
    
        "hello-world" -"hello world"
    

It ignores the dash even though I'm using quotes.
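
One plausible explanation for the empty result, offered as a guess and not as a description of Google's actual query parser: if punctuation is normalized to whitespace before matching, the required phrase and the excluded phrase become the same token sequence, so every document that matches the first is thrown out by the second. A minimal sketch of that effect (the `normalize`, `contains_phrase`, and `matches` helpers and the sample documents are all hypothetical):

```python
import re

def normalize(text):
    """Lowercase and split on anything that is not a letter or digit.

    Assumption for illustration: punctuation such as '-' is treated as a
    word separator, so 'hello-world' and 'hello world' tokenize identically.
    """
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]

def contains_phrase(doc_tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run in doc_tokens."""
    n = len(phrase_tokens)
    return any(doc_tokens[i:i + n] == phrase_tokens
               for i in range(len(doc_tokens) - n + 1))

def matches(doc, required, excluded, strict=False):
    """Check a document against a required phrase and an excluded phrase.

    strict=False compares normalized tokens (punctuation ignored);
    strict=True compares raw lowercase text, so '-' is significant.
    """
    if strict:
        low = doc.lower()
        return required.lower() in low and excluded.lower() not in low
    doc_tokens = normalize(doc)
    return (contains_phrase(doc_tokens, normalize(required))
            and not contains_phrase(doc_tokens, normalize(excluded)))

docs = ["print the hello-world banner", "a classic hello world program"]

# Punctuation-insensitive matching: both phrases collapse to the same tokens.
loose = [d for d in docs if matches(d, "hello-world", "hello world")]
# Literal matching: the dash distinguishes the two phrases.
exact = [d for d in docs if matches(d, "hello-world", "hello world", strict=True)]
```

Under the loose matcher nothing survives, which is the behavior described above; only the literal matcher returns the hyphenated document.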

------
tibbon
I don't get people who say Google is getting worse myself. If any of you know
search engine building so well and aren't working for another search engine
(like DuckDuckGo) why not take your expertise to Google and help fix the
problem?

~~~
citricsquid
FYI DDG just uses the bing API and then does some fancy stuff _on top_ of it,
so _technically_ they're not a "better search engine" they're just a prettier
bing.

~~~
epi0Bauqu
FYI: no, actually we are a hybrid search engine. We have our own crawling and
indexing that is merged with about 15 external APIs.

~~~
citricsquid
"hybrid search engine" is just worthless words, what does it even mean? If you
want people to stop saying you're just syndicating others' data, publish the
amount of others' data you do use, but when I've compared bing.com searches
with yours they're _very_ similar whereas google is different, which would
imply you're heavily taking from bing, whether or not you crawl yourself...

~~~
epi0Bauqu
So I'm required to divulge information to stop you from spreading false
claims?

You said "_they're just a prettier bing._" I have ~100K lines of code in
front of me right now that has nothing to do with UI, although I think UI is
tremendously important and has been a primary focus as well.

Nevertheless, I've talked about all of this at length numerous times -- in
videos, on HN, my blog, etc. Bottom line though is you can compare for
yourself. We do certainly draw on Bing, but its importance will heavily vary
per query and for many queries we'll look completely different. It really
depends.

~~~
citricsquid
False claims? It's what you've said[1][2], albeit a long time ago, but because
I can't find any up to date information that you've published (care to link to
it?) I can't suddenly know something else.

You're not required to do anything, you can tell me to piss off if you like,
but if you want people to claim things and for them to be factually accurate
you should publish this information, and if you have then please link to it
because I'm struggling to find it.

If you want people to stop spreading misinformation (intentional or mistaken)
then spread the correct information, otherwise learn to accept that when you
say you use the bing api but refuse to explain to what extent, people will
assume your site is entirely powered by it.

[1] [http://www.gabrielweinberg.com/blog/2009/02/thoughts-on-yaho...](http://www.gabrielweinberg.com/blog/2009/02/thoughts-on-yahoo-boss-monetization-announcement.html)

[2] [http://www.gabrielweinberg.com/blog/2009/03/duck-duck-go-arc...](http://www.gabrielweinberg.com/blog/2009/03/duck-duck-go-architecture.html#IDComment79453829)

~~~
epi0Bauqu
Yes, false claims. You claimed that "_DDG just uses the bing API_" to get
results. That is false, and I don't remember ever saying that anywhere because
it is false.

Those two posts do not say otherwise. They mention that we use BOSS, which is
of course true. The distinction is that we do not _just_ use BOSS by any
means.

I actually started out crawling the Web before BOSS even existed, and still
crawl quite a lot. As for recent info, look no farther than the DDG FAQ, or my
HN comments. This is from just 3 days ago:
<http://news.ycombinator.com/item?id=2184873>

Note: no one was requiring you to make any claims, false or otherwise. So I
still don't get why I need to divulge information to stop you from making
false claims.

------
russorat
Could it be that people who are having a hard time finding things on Google
now are searching the same way they did in 2000? Google might (and arguably
should) be attempting to cater to the way people are searching now, not the
way they searched 10 years ago. Don’t forget that people using a search engine
need to evolve as well.

------
oyving
It would be interesting if they set up a Google Retro and let people try and
compare the old search and see if they still think it is an improvement over
the newer systems.

~~~
sahaj
it would be mostly useless. the content today on the web is very different
from the content even a couple of years ago.

------
ldng
Here is what I commented on Matt's blog:

Well, you're right, Google is better, but expectations are higher, as is the
number of Google employees. So when we see the same content farm again and
again for weeks without any action being taken, it is annoying. So while blaming
"Google when the information just doesn’t appear to be on the web at all" is
clearly too high an expectation, fixing spam issues, be it case by case, in
a matter of days instead of weeks seems reasonable to me given the resources
you're _supposed_ to have.

So, HNers, my question is: is the search spam team understaffed? Or is the
problem elsewhere?

------
ilamont
There weren't any content farms in 2000, and SEO wasn't as sophisticated.

Now both are quite pervasive, and are having a marked impact on the qualitative
assessments of Google search results.

On the quantitative side, Cutts has recently claimed:

 _... according to the evaluation metrics that we’ve refined over more than a
decade, Google’s search quality is better than it has ever been in terms of
relevance, freshness and comprehensiveness._

How does the increase in SEO, content farms, and other questionable results
(scraped content, etc.) figure into Google's evaluation metrics?

~~~
Matt_Cutts
Good question. We've talked about evaluation a little bit before, e.g.
[http://googleblog.blogspot.com/2008/09/search-evaluation-at-...](http://googleblog.blogspot.com/2008/09/search-evaluation-at-google.html)

When we evaluate our quality or a new algorithmic change, URLs can be rated as
useful, navigational, etc. They can also be rated as spam. Useful/navigational
URLs have higher scores, while a spam rating subtracts from the score.
If an algorithm change tends to rank higher-rated URLs higher, that's good. If
spam tends to rise in the rankings, that's bad.

What our metrics tell us is that Google has gotten better in overall search
quality in the last few months, despite also seeing an increase in spam. It's
safe to say that now we're putting a lot of effort into the spam side of
things to get that back down to the levels we want.
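
The rating scheme described in this comment can be sketched roughly as follows. This is only an illustration of the idea, not Google's actual metric: the rating labels come from the comment, but the numeric weights, the 1/position discount, and the `ranking_score` function are assumptions made up for the example.

```python
# Hypothetical per-rating weights: only the signs follow the comment
# (useful/navigational add to the score, spam subtracts from it).
RATING_WEIGHTS = {"useful": 1.0, "navigational": 1.0, "unrated": 0.0, "spam": -1.0}

def ranking_score(ranked_urls, ratings):
    """Score a ranking: each URL's weight is discounted by its position.

    The 1/position discount is an assumption; any scheme that rewards
    putting higher-rated URLs nearer the top would illustrate the point.
    """
    score = 0.0
    for position, url in enumerate(ranked_urls, start=1):
        weight = RATING_WEIGHTS[ratings.get(url, "unrated")]
        score += weight / position
    return score

# Made-up human ratings for three placeholder URLs.
ratings = {"a.example": "useful", "b.example": "spam", "c.example": "navigational"}

old_ranking = ["b.example", "a.example", "c.example"]  # spam ranked first
new_ranking = ["a.example", "c.example", "b.example"]  # spam pushed down

old_score = ranking_score(old_ranking, ratings)
new_score = ranking_score(new_ranking, ratings)
improved = new_score > old_score  # higher-rated URLs now rank higher
```

With these toy numbers the new ranking scores higher, matching the stated rule that a change is good when higher-rated URLs move up and bad when spam rises.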

------
tristanperry
I definitely agree that search results in '11 are better than in '00, although
it would be cool if more search results from '00 were given since 1/40000
isn't the most representative of examples.

------
galactus
Is the post suggesting that the perception of a drop in quality is only because
google was much much better than other search engines back in 2000?

