
What If IBM’s Watson Dethroned Google as the King of Search? - mgulaid
http://www.wired.com/opinion/2013/10/google-in-jeopardy-what-if-watson-beat-the-search-giant/
======
nl
_Oh Please.._

This sounds like it was written in 2007 or something.

Google has been working on question answering for years now, but only exposes
it when it is sure of the answers.

[https://www.google.com.au/search?q=when+did+jfk+die](https://www.google.com.au/search?q=when+did+jfk+die)

(I get a "one box" answer saying "November 22, 1963 John F. Kennedy, Date of
death"). You'll note that Google had to derive that I meant "John F Kennedy"
by JFK, that I wanted a date, and then retrieve the answer.

[https://www.google.com.au/search?q=how+old+was+jfk+when+he+d...](https://www.google.com.au/search?q=how+old+was+jfk+when+he+died)

"46 (1917–1963) John F. Kennedy, Age at death"

[https://www.google.com.au/search?q=how+did+jfk+die](https://www.google.com.au/search?q=how+did+jfk+die)

"Assassination John F. Kennedy, Cause of death"

It's worth clicking the "More info" button under one of these "One boxes".
Google will tell you what it regards as "facts" and where it is deriving them
from.

(Also,
[http://www.wolframalpha.com/input/?i=how+old+was+jfk+when+he...](http://www.wolframalpha.com/input/?i=how+old+was+jfk+when+he+died)
is just as impressive)

~~~
kalleboo
I'm also continually surprised at how good google is at queries of the type
"what was that that movie where the guy did that thing" and "what was that
music group that has that gimmick".

~~~
yk
Do you have an example? (The two I tried did not work.)

~~~
jorgecastillo
I have another example. I live in Mexico and I watch movies in Spanish/English
depending on the channel. The titles are always in Spanish and a lot of times
they don't translate clearly to English or they don't make any reference to
the title in English. If I am watching a movie I usually try a few Google
queries to find out about the movie I am watching, to see if I like it or just to
know the end. Even without knowing the title I am usually able to quickly
find out about the movie even if it is not a blockbuster.

horror thriller where woman goes to russia meets brother

[http://www.imdb.com/title/tt0475937/](http://www.imdb.com/title/tt0475937/)
\- The Abandoned

^The result I was looking for appears first.^

I think it is ludicrous to think that Google can be displaced from its position
as the global leader of web search in the short or mid term. To displace
Google a new search engine wouldn't just have to be better, it would have to
make Google appear like last century tech. With this said I have to make clear
that I am no Google fan or Google hater but I am glad there are other search
engines that can at least compete with Google on their national markets
(Yandex, Baidu, Naver, Seznam & Yahoo Japan).

------
chrislomax
This is one of the most ridiculous stories I have ever read. How do they so
quickly dismiss Google's search algorithm as something to fall back to?
Microsoft have been chasing their tails for years trying to be as relevant as
Google and never achieving it.

It sounds like the author is suggesting dynamic web pages built by Watson that
would answer questions (summarise court cases etc). The underlying problem is,
where does all this data come from? It sounds like Watson would interpret
multiple data points from around the web to compile the information. How can
they say this information is correct? Google at least points you in the
direction of the information then you make an informed decision yourself if it
is correct.

This sounds like something we already have, we have Google and Wolfram Alpha.
Problem sorted.

~~~
bnegreve
This is certainly not ridiculous. It is very legitimate to wonder whether
systems involving more AI such as Watson are going to take over standard
search techniques or not. Regarding Bing, it is as you said: Microsoft has
been _chasing_ Google but Watson is a totally different approach so the
question is still open.

~~~
chrislomax
The concept of the AI is not what I am disputing, it's this particular story.
There is no expansion on how it will replace it, and it equally dismisses Google's
algorithm as simple to replicate.

I don't see it as replacing search either, people search for websites. What is
the end result, that Watson replaces all sites on the internet with its own
dynamic page filled with its own information?

How would Watson choose how to display the information it returns to me? Am I
getting the full story?

This article breezes over so many specifics I cannot take it seriously.

------
tytso
Without diminishing the props that need to be given to the team at IBM
Research which developed Watson, the author of this article has significantly
overestimated the general AI capabilities of Watson, while downplaying what is
involved with Google Search.

On the Google Search side, its algorithms are a lot more than just "the
PageRank algorithm". The knowledge graph and its ability to remember and take
advantage of context from previous searches are examples of this.

On the Watson side, a huge amount of what it was doing involved identifying
keywords and finding relevant information from the clue words. It did not
really reason about questions or have a deep understanding of the semantic
content of the query. It was hand optimized for the sort of questions that
tend to be asked on Jeopardy, which was an impressive feat, but in terms of
being able to create new knowledge, as the author suggests? The state of the
art in AI is a long way away from that.

------
diziet
The reason Google Search works well is because through many of its iterations
it actually uses crowdsourced data from real humans to answer questions.
Whether it is the simple initial vote of confidence by links, to google's ever
changing internal data about clickthrough rates, personalized results, search
volumes and general user behavior, or simply from the many hardcoded questions
in certain format ("What is the capital of Spain?" or "distance from the sun
to jupiter"), Google is continuously getting better at judging intent and
answering questions. Similarly for a lot of the other technologies, such as
translations, the real work is being done by consumers and users who generate
the gigantic corpus of data.

The question is -- does IBM have access to the same amount of data?

------
DigitalSea
If Watson were to ever dethrone Google, I am willing to almost bet it would be
because it was programmed to consume data via Google. It all comes down to the
data in the end. Unless IBM have managed to gain access to a trove of data as
big as or bigger than Google's (which is a lot), there is
no way anyone can beat Google. And let's face it, people don't just use Google
because they have the biggest database of knowledge and data, people use it
because they trust it, they know it works, and Google as a company have
established a rapport with people that has taken years. IBM to me will always
be a company focused on corporate and enterprise offerings, not offerings for
the general consumer.

------
devx
I've always thought Google would want to buy some Watsons to help their search
engine. I'm surprised they haven't yet.

That being said, I wonder if they're more interested in how their D-wave
quantum computer will turn out.

------
cynwoody
What is the diameter of a Krugerrand?†

Actually, I needed the answer, since I was reaching into my desk drawer for
some wedding gifts and needed to buy some gift boxes (Google helped with that,
too).

Watson may have a head start on the AI stuff, but Google is a quick study.

Also, Google is really good at server farms.

†[https://www.google.com/search?q=What+is+the+diameter+of+a+kr...](https://www.google.com/search?q=What+is+the+diameter+of+a+krugerrand%3F&oq=What+is+the+diameter+of+a+krugerrand%3F#es_sm=119&q=what+is+the+diameter+of+a+krugerrand&safe=off)

~~~
DanBC
[What years have Pixies toured England?] is a nice example of a search that
doesn't work. You don't get a historical list, you get very many sites selling
you tickets or sites telling you about a tour this year or an upcoming tour.

------
Xorlev
Given past experience with other MPI-based software, I'm not sure that Watson
would scale without extensive retooling. MPI tends to be extremely chatty (MPI
- Message Passing Interface), we ran ours off a hypercube (network topology)
switched Infiniband setup. MPI depends on broadcast/scatter/gather semantics.
This squarely lands in the 'what if' category for now.

~~~
frozenport
Infiniband sucks. IBM's or Cray's proprietary interconnect will scale your
code because it removes the chatter. On a Cray our MPI latency is 20x better
and has almost no discrepancy between PEs.

~~~
sgt101
Interesting, I went to a super computer conference in Germany last week and
was curious to see many papers using 10GbE and vendors talking about
infiniband. Can you point me to any resources? Also have you any insight as to
what the current thinking is about Hadoop clusters - are people making the
move to 40GbE to try and get good throughput or is it pointless. Our tiny 8
node cluster has recently got in a fluster due to having a 1GbE switch, an
obvious fix is to get a 10GbE one - but will this help?

~~~
frozenport
MPI programs and Hadoop will scale until the MPI latency becomes too high to
effectively perform a data swap. When you look at the profiler on a good
machine you see that 25% of the time is spent in MPI_receive. When the time
spent in MPI_receive goes up you are done - the task simply can't scale.
Vendors like Cray and IBM sell machines that have low MPI latency. This
happens in both hardware and software. The interconnect is fast (I think Intel
bought it) and the MPI layer performs optimization for all the traffic.
OpenMPI doesn't even come close to optimizing the traffic. Ethernet doesn't do
direct DMA like the proprietary interconnects - this adds latency and jitter. I
don't believe the Ethernet strategy can scale for such applications, primarily
because the time to complete the step is the highest latency on the network.
If one node jitters and takes 100micros, it doesn't matter much that the rest
of the guys took 20micros. Anyways, benchmarks would help.
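
The straggler point above can be illustrated with a minimal sketch, assuming a
synchronized step in which every node must finish before the next step begins.
The `step_time` helper and the 8-node count are hypothetical; the 20 and 100
microsecond figures are the ones from the comment.

```python
def step_time(latencies_us):
    """A synchronized step ends only when the slowest node has finished."""
    return max(latencies_us)

# Assumption: 8 nodes, seven at 20 microseconds and one jittery node at 100.
latencies = [20] * 7 + [100]

mean_latency = sum(latencies) / len(latencies)

print(step_time(latencies))  # 100: the step is as slow as the worst node
print(mean_latency)          # 30.0: the average hides the straggler
```

The average latency looks fine, but the step still takes as long as the one
slow node.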

The smallest machines Cray sells cost about $500,000. If you want scalability
you gotta pay. 8 nodes isn't a real machine.

------
BjoernKW
Every few years the idea of a natural language / semantic / question answering
search engine crops up again.

Natural language understanding is quite relevant for the crawling and indexing
part of information retrieval systems and Google is very good at that. Just
look at their quite formidable automatic translation software, which is a by-
product of their ability to correctly map natural language concepts to
strings.

The thing is: People just don't want to converse with a search engine as if it
was a human being. Some library / scientific information retrieval systems
tried to go in that direction, which resulted in retrieval systems that were
just cumbersome to use.

Google nailed the search engine user interface quite some time ago and Peter
Norvig is absolutely right when he says that users simply don't want to ask
questions when searching. They're much faster at entering keywords relevant to
their search intent because they've learned how to efficiently converse with
search engines in their 'native' keyword language.

Hence, passing the Turing test is completely irrelevant for search and
information retrieval in general. Even in mobile environments where due to the
device's constraints a natural language user interface makes a lot more sense
than on the desktop, software like Siri more or less is just some gimmick that
in most cases is easily outperformed by more traditional input methods. Sure,
asking Siri to 'Show me the way to the next whisky bar' might be fun at first
but simply entering 'pub' and the name of the town you're in right now is
still a lot more efficient. Again, I think Google nailed the user interface
part with Google Now for mobile information retrieval as well. I don't want
intelligent machines to pretend they're human. I want them to take a back seat
and present me with the right information once it becomes relevant.

~~~
vannevar
_Google nailed the search engine user interface quite some time ago..._

If there is one area of search technology where Google has contributed almost
nothing, it is the search interface. The subjective user experience is
virtually unchanged since the days of Alta Vista circa 1996.

~~~
hexley
Actually they fucked it up when they started messing with the perfectly
working +,- operands.

~~~
DanBC
\+ might have worked perfectly, but most people didn't use it, and of those
who did use it most used it incorrectly.

([http://insidesearch.blogspot.co.uk/2011/11/search-using-your...](http://insidesearch.blogspot.co.uk/2011/11/search-using-your-terms-verbatim.html))

> we found that users typed the “+” operator in less than half a percent of
> all searches, and two thirds of the time, it was used incorrectly.

Check my arithmetic but that means only 1 in 600 searches used + correctly; 2
in 600 used it wrong, and the other 597 searches didn't use it at all.
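
The arithmetic can be checked with a short sketch. It assumes the quoted "less
than half a percent" is taken as exactly 0.5%, which is an upper bound, so the
real counts would be somewhat lower. The variable names are illustrative only.

```python
from fractions import Fraction

# Assumption: treat "less than half a percent" as exactly 0.5% (upper bound).
plus_share = Fraction(1, 200)
# "two thirds of the time, it was used incorrectly"
misused_share = Fraction(2, 3)

searches = 600
used_plus = plus_share * searches        # searches containing "+"
used_wrong = used_plus * misused_share   # "+" used incorrectly
used_right = used_plus - used_wrong      # "+" used correctly
no_plus = searches - used_plus           # searches without "+"

print(used_right, used_wrong, no_plus)  # 1 2 597
```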

~~~
hexley
That's what they say, but its strange disappearance directly around the time
of Google+ says otherwise.

------
vannevar
The biggest obstacle to Google's progress in search is Google. They make
virtually all of their money from advertising, including the same spam that
pollutes their search results. Targeted advertising can only help so much,
because in the end, advertisers don't just want to reach people already
interested in their product, they want to attract people who _aren't_.
Google's dominant position in the online advertising market is an
irreconcilable conflict of interest with their search users. They continue to
dominate search not because they are technologically unbeatable, but because
no one has figured out an alternative business model that makes search pay
without corrupting it with advertising.

------
lifeisstillgood
We should stop thinking in monolithic terms - as if one company will be able
to do it all on a global scale. Google seems to have mastered data centres
and spiders, and will probably form a base layer for part of the next step
towards humanity-AI. Maybe Watson will consume Google data, but maybe a
thousand individually tuned AI-like services will arrive - for booking flights
and for negotiating mining contracts and ...

I think we shall soon see the end of the idea that one organisation (based in SV)
will be able to service all the world's needs on a march towards AI - I suspect
Google as an organisation is already straining. Why not let markets flourish?

~~~
7952
I agree. Certain search terms relate to a single concept with a single
authoritative source of information. A fireman searching for plans for a
building on fire only really wants a single answer (contrived example). Google
is bad at domain-specific searches where you are looking for one answer. It
is so easy to get trapped in obscure business listings, local news and press
releases.

Google works well when there is no clear source of authority and the algorithm
can make a judgement. Some searches have a definitive thing that you are
looking for and the algorithm would need to know that source to give the right
answers. We need more human input which can make search engines that give
results that are shamelessly partial to particular sources of information
(like patent searches).

------
tn13
Quality of search results has very little to do with market share. Had it been
the case, Bing should have had near equal share with Google and Ask should have
had zero.

Google owns Chrome and Android which is basically 50% of web traffic.

~~~
jamesaguilar
How does this explanation change in light of the fact Google was way ahead of
Bing when IE had majority market share?

------
slacka
When I first heard about Watson, I expected IBM to put a version of it online for
people to experiment with. If it's as good as they claim, it could be a useful
tool and create good PR for them. I know the searches are much more resource
intensive than Google's, so I wouldn't have minded if they needed to limit it
to something like 1 question/IP/hr.

------
frozenport
What if Watson just googles for the answers?

~~~
linker3000
We could ask.

------
primelens
My knowledge of the actual workings of Watson is sketchy, but doesn't it
devote a massive amount of resources to understanding natural language
queries? Can it scale its ability to handle one question at a time to handling
a billion queries?

And Google is getting much better at understanding syntax and will continue to
improve.

------
jezclaremurugan
Google search itself is becoming Watson-like, with all the additions in
the knowledge graph, and I think with Google Now Google search will also have
a lot more context. Watson can of course be a classical disrupter in search
but Google isn't exactly inactive in this arena.

------
sarreph
...yet another Wired article where the author can't be bothered to use
synonymous nouns for the sake of intelligible readability:

"they could still resort to a simple PageRank-like algorithm as a last resort"

------
fear91
The problem with search is that nowadays, people judge results quality by
their similarity to Google results. So anything that doesn't look like Google
output looks "off".

~~~
protomyth
At this point, I judge search quality by the words I'm searching for appearing
on the page the search link sends me to. I am starting to pine for the days of
Alta Vista so at least I could include and exclude words with some thought it
might work.

------
adamnemecek
Is there something like Betteridge's law of headlines for headlines starting
with "what if"?

~~~
pestaa
Fortunately we rarely see headlines starting with "what if", though I'm sure
such law would define the answer to be "nope".

~~~
isxek
Or probably even "meh."

