

What "viable search engine competition" really looks like - antics
http://blog.nullspace.io/building-search-engines.html

======
nostromo
I enjoy hating on Microsoft as much as the next guy, but as far as I'm
concerned Bing is basically as good as Google.

The problem for Microsoft is the same problem that faced Apple. Being "as good
as" isn't enough to get people to switch. You have to be 10 times better.

Back when Google launched, it was 10x better than Yahoo. Bing seems to have
quickly gotten to 1x, but hasn't moved much further.

~~~
FreeFull
I still regularly make searches in duckduckgo (which gets results from Bing)
and then find that I have to switch to Google to actually find what I'm looking
for. The worst thing about this is that Google's results don't seem to be as
good as they used to be, and there isn't really any other search engine that
matches it yet that I've seen.

~~~
s3r3nity
Hmmm but DDG will never be quite as good as Bing or Google simply because DDG
doesn't keep any histories or track any other activity. Really good search
engines get quality and relevance from all those things that DDG purports NOT
to track. It's partly a machine learning problem that I just am very very
skeptical DDG can get around.

~~~
rsl7
I don't think this is true unless you are truly locked into your own tiny
little world. All your search history really does is help Google serve up ads
for you. If that's an improvement, so be it.

~~~
jkrems
Many programming language related terms, movie titles, cities/postal codes,
... All those things can be judged better when the search engine has prior
knowledge about you. You don't have to be "truly locked into your own tiny
little world" to have previous search be a valuable predictor of what kind of
results you expect.

------
lazyjones
Short and very informative, though I wondered why patents are no concern for a
competitor of Google (they have various broad and trivial patents on using
historical data, user feedback, moderation for better search results).

As the author wrote, aiming at feature parity with Google is probably a bad
idea. DDG seems to do this right: they have some better features and one
distinct advantage over Google with their stance on privacy. Perhaps Google
will reach a certain creepiness factor with all the collected data they're
factoring into search results (e.g. I have this suspicion that content from
gmail is used too) and at some point it might work against them even for the
average user (or am I just biased?).

One "feature" that Google has failed to do properly is custom "site search",
i.e. providing a search interface for arbitrary websites. Perhaps this is
another opportunity for a new competitor or even Bing (though their "site
search" was shut down in 2011 apparently), there is plenty of demand (and lots
of crappy custom solutions).

~~~
salient
Because Google usually doesn't use its patents aggressively. They will use
them defensively, though, or even as a counter-attack. That's why I think it's
_incredibly stupid_ of Microsoft to be attacking Google with search patents
through Rockstar. If they piss off Google enough to start using their search
engine patent trove against Bing, it may be game over for Bing. Microsoft
should already consider itself very lucky that Google isn't as spiteful and
vengeful as Apple, or Google would've used them since day one of them going
after Android OEMs.

That being said, I think the only solution to beating Google is developing
something disruptive, and I mean that in a very classical and traditional
sense (not the way disruptive is used to mean anything these days). It means
you have to build a type of a search engine (or search solution) that Google
not only has no interest in pursuing (due to conflicts of interest/opposite
business incentives), but probably _can't_ pursue, or not without employing
dramatic shifts in how they do search (which again, makes it even less likely
for Google to pursue and try to beat this new "disruptor").

And no, don't look at Bing to provide that. They're a "direct competitor" to
Google, and they will always be, because they want to be "in the exact same
business" Google is, and therefore, I don't think they'll ever radically
change the way they build a search engine compared to Google.

So what can such a disruptive solution be? I can't say for sure, because it
probably needs more factors/benefits on its side than just one, against
Google. Social "may" be a way to disrupt Google, although I think Google is
already rapidly adapting to this attack vector.

I think another could be "extreme-privacy", which Google wouldn't pursue
anytime soon without _really_ rethinking how its search business works.

Disruption also means attacking incumbent companies with much fewer resources
through innovative business models and cost structures. Do something that
would make Google's billions of investment in infrastructure _irrelevant_. A
P2P search engine like YaCy or Faroo seems to be part of that solution.
Perhaps what can/will kill Google is a crowdsourced super-search engine, that
ends up surpassing even Google in most capabilities, because Google would be
just one company, fighting against the whole resources of the world.

As in classic disruption, you also have to consider this new type of search
engine will be much poorer than Google in some competition factors (at least
initially). I think that is the "relevance" factor. As the op mentions, it's
going to be very hard for anyone or anything to beat Google in relevance. They
have too much of a headstart. But perhaps there will be such a thing as "good
enough" in search, too. Maybe we're not there yet, but perhaps we will be in 5
years, or 10 years. Until a few years ago we kept wanting faster and faster
Intel processors. Now we think mobile ARM processors are more than good enough
for most of our daily activities.

A crowdsourced P2P search engine powered by _billions_ of people could perhaps
reach that point, too, while offering some of that "extreme privacy" and other
benefits that Google can't offer.

Still, how do you get that first "10 percent" of users, that end up changing
the paradigm, to start using a service that's much less relevant than Google
_today_? You do it in the same way other disruptive technologies win out
against incumbents. You promote the _other_ benefits, that Google can't and
won't match - benefits that will be important to those "first 10 percent",
until you reach critical mass, and then you should be fast approaching that
"good enough" relevance, too, and get even more users after that.

~~~
tomrod
Voice search with punctuation. I would use Bing if it provided this. As for
now I use Bing simply to make sure I'm not loading cached pages on bad
connections.

------
jordan_litko
If Bing were to overtake Google and become the preferred search engine, would
anything really be much different?

I'm no search engine expert but I'm pretty comfortable assuming that Bing
would have had (and does have) the same issues with Rap Genius as Google did.
The only real difference is that as the market leader Google had to publicly
react to Rap Genius's 'gaming' of their system.

I think what this shows is that yes, this is precisely a product problem
because if Bing were at the top we would be having this exact same discussion
only with reversed roles.

Correct me if I'm wrong but Google vs Bing is not a lot different than Coke vs
Pepsi.

Until we have someone come along with a search engine which is actually a
better product, market share will only really be a function of effective
marketing.

That said, I think your article does a good job outlining some of the barriers
that any potential competitor in this arena should be aware of.

~~~
smoyer
"Coke vs Pepsi"

I want to be "A Pepper" but DDG doesn't always give me results ... Any ideas?

~~~
JetSpiegel
DDG is mineral water. Better for your health.

------
thoughtloom
I find this title to be misleading. It implies the post will demonstrate what
viable search engine competition would look like, but just explains the
problems that Bing is having with Google. Are we to assume that Bing v. Google
is viable search engine competition?

While Microsoft's problems are real, this offers little in the way of
solutions. That said; interesting read.

~~~
antics
Hmm, I didn't really say that I had the solution. If I did I'd start my own
company. :) I said that I would outline the state of making search engines
competitive. Here I basically focus the post around areas that are pain points
for competition, which I think is pretty relevant to the writeup I set out to
make.

Do you feel something in particular sucks about the article? I'm happy to
change it!

That said, I use Bing as a comparison point because it's really the only other
point that's comparable in terms of, like, infrastructure and stuff. It's
contextualizing more than anything. I might have removed all the references to
"Bing", but I don't think it would add clarity.

~~~
thoughtloom
The article is good as is, but I'd have titled it ~'What the barriers to
"viable search engine competition" really look like.'

------
Kerrick
> IE can track, and has tracked users behavior even when they’re not on Bing.
> (We got in trouble for this once, hehe.)

Am I the only one concerned with both the _fact_ that IE tracks its users, and
the _attitude_ of this Microsoft employee about it?

~~~
jamesaguilar
Personally, I'm amazed that they let him blog stuff in this detail about MS's
search business practices. Not speaking for the org or about the official
stance, but I would not consider doing so as a member of SI at Google. And I
always had the intuition that MS was even more restrictive about such things.
Maybe not?

~~~
antics
I didn't say anything that isn't completely public knowledge. And let's not
forget that MS once employed Robert Scoble. So we can't be all bad.

EDIT: and let's be realistic. The papers on search that come out of MSR expose
way more about our infrastructure than the paltry sum presented in this post.

~~~
jamesaguilar
Insiders can often link bits of public knowledge in a way that effectively
discloses confidential information. Every company has a different threshold
for where this is, so maybe you're on the safe side of that for your org. BTW,
I'm not trying to accuse you of malfeasance, I'm just saying I wouldn't do
what you're doing.

~~~
antics
Fun update, a Google executive just emailed me to say how much he liked this
post.

So there's that.

~~~
jamesaguilar
Your competitor enjoys hearing about the dysfunction in your org and some back
slapping about how great a job Google is doing? Surprise surprise. :P

~~~
antics
Er, what? I didn't say anything about dysfunction and I didn't pat anyone on
the back. I did say things about the inherent difficulties of competing in the
space though.

------
apunic
I don't get the point of this post and I think that almost all the lessons the
OP wrote down are wrong. Is this post an excuse for why MS is performing so badly
with Bing? Why is there no one from MS stepping forward and shouting, yes Bing
is still behind but we will fight tooth and nail to crush Google -- no,
instead some MS employee is writing a non-approved post where he gives 6
lessons why MS sucks and why life is so hard, wtf? Maybe this is the reason
why MS falls behind year by year because people behind MS are too laid back
and lack fighting spirit.

Nobody knows what MS is today, especially nowadays because MS doesn't have any
CEO. We know that MS was the company which controlled the personal computer
experience in large parts and we know that this experience was ruled by the
web and by search for one decade and there is no excuse that MS totally forgot
that search must be an integral part.

------
patio11
To say nothing of the engineering challenges with crawling the entire
Internet. Your first cut will likely refresh every several months or so.
Remember when Google did that, back in like 2005? It was very effing hard
then. Google now crawls pages newly linked from authority sites _essentially
instantaneously_.

------
adam419
Bing's lack of market share compared to Google is not because of any of the
reasons listed.

It has to do with the perception people have, and the fact that they more
regularly associate searching the web with Google and "googling it" vs. Bing
and "binging it".

At least this is the case for average, non-hacker news people.

Why does Google have this perception? I don't know, but I will guess it likely
has to do with it being the one search engine that really caught on and was
able to sustain its momentum.

Competing against Google will not come down to any of these incremental, less
important technical metrics. It has more to do with changing the notion of
searching the web and how it is done, likely in a very niche way at first.

~~~
thrownaway2424
Public perception is also the reason that Microsoft can't hire machine
learning and search quality engineers, because the question "Do you want to
work at Microsoft on search quality?" starts with "Do you want to work at
Microsoft?" and for many people the answer to that question is "no" regardless
of the remainder of the sentence.

------
bra-ket
points 1, 4, 6 from the article could have been written in 1998 as well.

how can 2 guys in a garage wipe out AltaVista, Lycos, and whoever else was
doing search back then (with lots of smart people, data, resources, state-of-
the-art software and user adoption to compete against)?

What is missing is a serious discussion on search quality (which made Google
Google).

If Larry and Sergey started today, would they be able to compete against
Google by providing better search results, or has search quality reached a
"global maximum" of sorts?

Does search still suck?

~~~
BryantD
AV made a substantial unforced error at exactly the wrong time which resulted
in a huge drop in search quality. It wasn't as much a matter of better
algorithms as one might think.

There's also a pretty big difference in scale. The AV team working on search
in 2000 wasn't all that big.

------
amaks
"Bing exists than it is that it is equal to or better than Google in every
way"

How is Bing better than Google (Search) in every way? With worse relevance and
smaller index? Although, IIRC, Bing used to train their neural net based on
Google search results to match the Google search quality (remember the
'torsoraphy' incident)?

~~~
robrenaud
You misread that sentence.

> it is more important that Bing exists than OTHER CONDITION

Where OTHER CONDITION is what you've quoted. He isn't claiming that OTHER
CONDITION is true, merely that it isn't the most important thing.

------
username223
FTA: "social may pose an existential threat to Google’s style of search"

No. Sometimes I want general information: "what is the airspeed of a laden
swallow?" Sometimes I want information dependent upon the interests of the
hundred or so people I know: "what do you all do in Toledo?" Sometimes I want
information dependent upon my interests: "if I enjoyed 'Requiem for a Dream',
what other movies should I watch?"

These things conflict: I could not possibly care less what my friends think is
the airspeed of a laden swallow.

I use Google for #1, Facebook for #2, and several other sites for #3. Google
has (or had) a perfectly viable business providing the best answers to general
questions, and it's a shame that they're screwing it up.

~~~
antics
Hi, I'm the author.

Sorry, but you're wrong. Understanding a good amount of detail about your
users, and in particular, about their social context, is a massive win for
search relevance. Google invested quite a bit in acquiring Aardvark, for
example.

So, even if you don't ask your friends about something, the information is
still really important.

~~~
s3r3nity
This. I think people get caught up too much in the small fraction of queries
that are facts of the universe ("What's the universal gravitational constant
to 3 decimal places?") versus the much much larger fraction that depends upon
much larger contexts ("Movies playing in theaters nearby," or "Rock music,"
etc.)

------
Houshalter
What if you made a search engine that only works on specific sites? For the
most part users _don't_ want to search the entire web for every small blog in
existence, they want to go to a relatively small number of specific sites.

For example, reddit has a terrible search engine that users often complain
about and comments aren't indexed at all. But searching with google often
gives subpar results and they don't take into account things like upvotes at
all. (I also think sites like reddit are punished by Google's algorithm pretty
harshly.)

This might make the problem smaller and more tractable and a place where you
could get an advantage over Google.
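
To make the upvote idea concrete, here is a minimal sketch of how a site-specific engine might blend a text-match score with votes. Everything in it is an assumption for illustration: the `score` and `rank` names, the `weight` value, and the made-up results. It is not how reddit or Google actually rank anything.

```python
import math

def score(text_match: float, upvotes: int, weight: float = 0.3) -> float:
    """Scale a text-match score by a damped popularity signal (hypothetical formula)."""
    # log1p damps large vote counts; multiplying (rather than adding) means
    # a post that barely matches the query cannot win on votes alone.
    return text_match * (1 + weight * math.log1p(max(upvotes, 0)))

def rank(results):
    """results: list of (title, text_match, upvotes) tuples, best first."""
    return sorted(results, key=lambda r: score(r[1], r[2]), reverse=True)

# Made-up sample data: (title, text-match score in [0, 1], upvotes).
sample = [
    ("Thread A", 0.9, 2),
    ("Thread B", 0.8, 500),
    ("Thread C", 0.2, 10000),
]

for title, text_match, upvotes in rank(sample):
    print(title, round(score(text_match, upvotes), 3))
```

With this sample, the well-matched and well-voted Thread B comes first, and the hugely upvoted but mostly off-topic Thread C comes last.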

~~~
PaulHoule
Yep, an effective general purpose search engine is really a collection of
effective vertical (subject-intelligent) and horizontal (task-specialized)
search engines.

The truth behind IBM Watson is that it's based on a methodology that makes it
possible to put huge teams working on different parts of the problem without
tripping on each other's feet.

------
apunic
I just checked out Bing with some well known search queries and the results
are very close to Google's.

What makes me slightly uncomfortable are

- the name and the branding. It sounds irrational but the name annoys me and
I just do not want to type it. Bing doesn't sound sophisticated, smart or
anything and the big picture on the landing page is unwanted and makes the
product feel unfocussed. Maybe the reason is just that Bing never did
something way better than Google and thus, there were no opportunities to load
this name with a good reputation with some achievements.

- the speed: Google is (or feels) still way faster here and there (when
entering something into the search it instantly switches to the SERP view)

~~~
jkrems
This! The branding of Bing is terrible. When I see the big picture I always
think: this has nothing to do with what I'm looking for - and "what I'm
looking for" should be the single most important thing for a search engine.

------
taxonomyman
Web search with up to the top million most popular web sites removed:
[https://millionshort.com/](https://millionshort.com/)
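
The basic idea can be sketched as a post-filter over result URLs. This is only a guess at the concept, not how Million Short is actually built; `drop_popular` and the sample domains are hypothetical.

```python
from urllib.parse import urlparse

def drop_popular(results, popular_domains):
    """Keep only result URLs whose host is not in the popular set."""
    kept = []
    for url in results:
        host = urlparse(url).hostname or ""
        # Treat "www.example.com" and "example.com" as the same site.
        if host.startswith("www."):
            host = host[4:]
        if host not in popular_domains:
            kept.append(url)
    return kept

# Hypothetical inputs: a tiny stand-in for a "top N sites" list and two results.
popular = {"wikipedia.org"}
hits = [
    "https://www.wikipedia.org/wiki/Search",
    "http://tiny-blog.example/post",
]
print(drop_popular(hits, popular))
```

A real version would need a ranked list of popular sites and smarter handling of subdomains, which this sketch ignores.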

------
asmman1
Imagine if Microsoft begins saving users' information using its operating
system. As a result they would have even more information on you than Google
or anyone else.

------
mmmmax
Can someone who has the page loaded add a mirror or link to a cached version?

~~~
antics
Is it down for you? It's up for me. It's static HTML hosted on GitHub Pages,
so it should remain unbroken...

~~~
tdumitrescu
down for me too, can't connect

~~~
antics
Huh, weird. It would be here, if Google had indexed it yet:
[http://webcache.googleusercontent.com/search?q=cache:http://...](http://webcache.googleusercontent.com/search?q=cache:http://blog.nullspace.io/building-search-engines.html/)

------
notastartup
can you elaborate on search relevance?

is it mostly natural language processing? matching keyword queries to document
corpus

~~~
300bps
Tonight I typed Eagles into Google and it showed me all the information I
needed about the game that starts at 8:10 pm. That is search relevance. I've
noticed Bing doing a lot of this type of thing for the past year or two but
Google is definitely the master of it.

~~~
nknighthb
This is a great example of search result subjectivity. I don't care about
football. I'm annoyed when it rudely intrudes on my consciousness. If I type
"eagles" into a search engine, I'm looking for the band. The football team has
_no relevance_ to me, even as millions of other people tonight are probably
looking for exactly what Google showed you.

~~~
ak217
Yep, and that's why it's important for the search engine to be able to collect
information about you! It's not just for commercial ends like ad targeting.
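
The "eagles" case can be sketched as a toy re-ranker that boosts results matching topics from a user's past activity. The `rerank` function, the topic labels, and the scores are all assumptions made up for illustration; real engines use far richer signals than this.

```python
from collections import Counter

def rerank(results, history):
    """results: list of (title, topic, base_score); history: list of past topics."""
    counts = Counter(history)
    total = sum(counts.values()) or 1  # avoid dividing by zero with no history

    def personalized(result):
        _title, topic, base = result
        # Boost by the share of past activity that was on the same topic.
        return base * (1 + counts[topic] / total)

    return sorted(results, key=personalized, reverse=True)

# Made-up candidates for the query "eagles".
candidates = [
    ("Eagles game tonight", "sports", 1.0),
    ("Eagles (band)", "music", 0.8),
]

print(rerank(candidates, ["music", "music", "music", "news"])[0][0])
print(rerank(candidates, ["sports", "sports"])[0][0])
```

With a mostly-music history the band moves to the top; with a sports history, or no history at all, the game stays first. That is the trade-off in this subthread: the engine can only make that call if it knows something about you.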

