

Garry Kasparov on IBM's Watson - paulreiners
http://www.theatlantic.com/technology/archive/2011/02/exclusive-garry-kasparov-on-ibms-watson/71584/

======
kenjackson
_Worse, by definition they do not understand what they do not understand and
so cannot avoid them_

Kasparov didn't seem to see what I did. Watson seemed very consistent in
knowing what it did not know. There were maybe two questions I recall where it
actually got the question wrong with 50%+ certainty. I believe it answered
"leg" when it should have been "missing a leg". On the other it answered the
20s when the answer was the 10s. And I think for neither of those was the
percentage much beyond 50%.

Also Kasparov seems to miss that Watson in medicine would be used with humans.
I doubt a doctor will say, "Watson says to cut off his left leg -- I would
have just given him aspirin for the headache, oh well. Hopefully cutting off
this leg makes his head feel better."

What Watson hopefully will do is help with diagnosis. Especially tricky ones.

There's a great story in a book I read, I wish I could recall the name, but it
begins with a lady who has some stomach issue that she has for like 20 years.
Everyone thinks it's in her head. She finally happens upon a doctor who happens
to have seen something like this before, she gets diagnosed and healed. But
she had to live with it for like 20 years after seeing doctor after doctor.
Watson would be able to greatly help situations like this, I hope.

UPDATE: The book is "How Doctors Think". Here's an excerpt that talks about
this case,
[http://harvardmedicine.hms.harvard.edu/bulletin/winter2007/7...](http://harvardmedicine.hms.harvard.edu/bulletin/winter2007/7.php)
-- just in case anyone cares. :-)

~~~
zwieback
Totally agree. It's shocking how many doctors seem to be lacking basic
knowledge about things like drug side effects or lesser-known symptoms of
common diseases. You obviously still need both specialists and general
practitioners but I think it's time to revisit the idea of a medical expert
system.

------
davidtgoldblatt
Here are the reasons I was disappointed in Watson's showing (despite handily
beating the human competitors). The most obvious was that Watson's auto-clicker
was a big advantage over human thumbs, so that Watson got 100% of the points
for clues to which all competitors knew the answer (if you asked Watson and
the two humans "what's five plus five", Watson would win, but that's not
necessarily proof of any sort of computer superiority).

The second reason is that IBM was representing Watson as something of a big
push in knowledge representation (I just watched a video where they talk about
Watson's "informed judgments" about complicated questions for instance). It
looks instead like Watson just has an improved ability to disambiguate words
relative to previous systems and to do quick lookups that match those words
with nearby key terms.

For example, on the clue "Rembrandt's biblical scene 'Storm on the Sea of'
this was stolen from a Boston museum in 1990", Watson correctly answered
"Galilee". But its next two answers were "Gardner Museum" and "Art theft"; no
one who "understood" the question in any conventional sense would even
consider these as answers because they don't make any sense. Clearly, Watson
looked for instances of "Rembrandt", "Storm on the sea of", "stolen", or other
phrases from the clue in its text corpus, and found that "Galilee", "Gardner
Museum", and "art theft" all frequently occurred together (because the
painting was stolen from the Gardner Museum in an instance of art theft), and
relatively rarely apart. "Galilee" probably won out of these three
because Watson is tuned to Jeopardy clue styles (whenever there is a quoted
phrase in a clue followed by the word 'this', it's always asking for the
answer that completes the phrase).
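
To make that guess concrete, here is a toy sketch of co-occurrence scoring plus a "completes the quoted phrase" boost. It is only an illustration of the speculation above, not Watson's actual DeepQA pipeline: the corpus, the function name `rank_candidates`, and the `bonus` weight are all made up for the example.

```python
# Hypothetical mini-corpus; these snippets are invented for illustration.
CORPUS = [
    "rembrandt storm on the sea of galilee stolen gardner museum art theft",
    "gardner museum boston art theft 1990 rembrandt stolen",
    "sea of galilee biblical scene rembrandt storm",
    "art theft stolen paintings museum",
    "galilee biblical region",
]

CLUE_TERMS = ["rembrandt", "storm", "stolen", "biblical", "boston", "1990"]
QUOTED_FRAGMENT = "storm on the sea of"
CANDIDATES = ["galilee", "gardner museum", "art theft"]


def rank_candidates(clue_terms, quoted_fragment, candidates, corpus, bonus=2.0):
    """Rank candidates by clue-term co-occurrence, boosted by phrase completion."""
    raw = {}
    for cand in candidates:
        score = 0.0
        for doc in corpus:
            if cand in doc:
                # Count how many clue terms share a document with the candidate.
                score += sum(1 for term in clue_terms if term in doc)
        # Assumed clue-style heuristic: boost a candidate that directly
        # follows the quoted fragment somewhere in the corpus.
        if any(f"{quoted_fragment} {cand}" in doc for doc in corpus):
            score *= bonus
        raw[cand] = score
    total = sum(raw.values()) or 1.0
    # Normalize so the scores read like rough "confidences" summing to 1.
    return sorted(
        ((cand, score / total) for cand, score in raw.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


if __name__ == "__main__":
    for cand, conf in rank_candidates(CLUE_TERMS, QUOTED_FRAGMENT, CANDIDATES, CORPUS):
        print(f"{cand}: {conf:.0%}")
```

In this toy setup "gardner museum" and "art theft" still get nonzero scores even though neither makes sense as an answer, and "galilee" only comes out on top because of the phrase-completion boost.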

Similarly, Watson was far less confident on the clue "You just need a nap!
You don't have this sleep disorder that can make sufferers nod off while
standing up." It still got the right answer of "Narcolepsy", but with a
relatively low confidence of 64%. "Insomnia" had a confidence of 32% despite
clearly being the opposite sort of sleep disorder, and "deprivation" appeared
at 13%, despite not being a sleep disorder. Here Watson gets confused because
the only term of the clue that appears more frequently with "narcolepsy" than
"insomnia" is "standing up"; my guess is that if "standing up" had been
replaced by some oddly phrased, uncommonly occurring synonym, Watson wouldn't
have been able to come up with an answer, despite the clue conveying exactly
the same information.

This kind of cleverness is certainly impressive, but it seems like it's an
advance in tuning existing techniques to the format of Jeopardy, not an
advance that will spark other successful projects down the line. IBM's goal of
giving us "the computer from Star Trek" doesn't seem any closer; I don't see
any evidence that Watson could have answered a question that required more
thought or understanding than a simple text search. If there was the question
"how many kings ruled England in between Henry the Fourth and Henry the Eighth"
(8), then Ken and Brad would have been able to answer relatively easily, while
my guess is that Watson would be stumped.

~~~
kenjackson
_For example, on the clue "Rembrandt's biblical scene 'Storm on the Sea of'
this was stolen from a Boston museum in 1990", Watson correctly answered
"Galilee". But its next two answers were "Gardner Museum" and "Art theft"; no
one who "understood" the question in any conventional sense would even
consider these as answers because they don't make any sense._

But I think your peek into Watson's inner mind may give you more insight
than you have into the human mind.

I'm reminded of a story about how a girl told me she was good at froggy when
it came to basketball. I was like, "What's froggy?" and she said "when you get
the ball after someone shoots it". I said, "I think it's called a rebound". And
she said, "that's the word, rebound... but froggy and rebound, they remind me
of each other"

And your narcolepsy v insomnia example is a mistake I think a lot of humans
make. Like if you ask me which way to turn a lightbulb to remove it, my brain
will have both clockwise and counter-clockwise as responses. And clockwise is
probably 80%, but counter clockwise is probably at 20% -- I have been known to
accidentally tighten a bolt, rather than loosen it.

~~~
scott_s
I can't count the number of times my father, when using my name, starts out
with my brother's name and corrects himself midway through. I've done similar things
with pairs of friends I met at the same time and know in only one domain.

------
jhamburger
Not quite clear on why people keep pointing to the 'Toronto' question as proof
that Watson is fundamentally flawed in some irreconcilable way.

~~~
Jach
That's not at all what Kasparov said:

    
    
        My concern about its utility, and I read they would like it to answer medical questions, is that
        Watson's performance reminded me of chess computers. They play fantastically well in maybe 90% of
        positions, but there is a selection of positions they do not understand at all. Worse, by definition
        they do not understand what they do not understand and so cannot avoid them. A strong human Jeopardy! player,
        or a human doctor, may get the answer wrong, but he is unlikely to make a huge blunder or category error--
        at least not without being aware of his own doubts. We are also good at judging our own level of certainty.
        A computer can simulate this by an artificial confidence measurement, but I would not like to be
        the patient who discovers the medical equivalent of answering "Toronto" in the "US Cities" category,
        as Watson did.
        
        I would not like to downplay the Watson team's achievement, because clearly they did something most
        did not yet believe possible. And IBM can be lauded for these experiments. I would only like to wait
        and see if there is anything for Watson beyond Jeopardy!.
    

If IBM wants to fix the "Toronto" problem, have at it. But those sorts of
"embarrassing" errors could be quite costly in medical situations. During the
show they showed Watson's progression from giving really stupid answers very
frequently to giving them less frequently, which makes me personally believe
their fundamental process is flawed (not necessarily irreconcilable) and their
current algorithms are just a bunch of hacks thrown together on top of Google
rather than something more sophisticated like Wolfram Alpha.

~~~
michaelcampbell
> but I would not like to be the patient who discovers the medical equivalent
> of answering "Toronto" in the "US Cities" category, as Watson did.

Surprise, that kind of mistake happens far too frequently in the medical field
_now_.

Why is Kasparov commenting on something so far out of his recognized area of
expertise relevant anyway? I don't go to Knuth for advice on chess, nor
Hawking for snarky banter on economics, etc. (Although if I had access to
either of those 2, I might try it.)

~~~
Jach
Would Watson make that mistake happen more or less often? (Bringing in Watson
can lead to blindly trusting or blindly ignoring the "stupid computer"
depending on the doctor; seems like a problem with doctors rather than a lack
of tools?)

> Why is Kasparov commenting on something so far out of his recognized area of
> expertise relevant anyway?

Isn't the reason for asking obvious? (I won't comment on the relevance; people do and
read many irrelevant things every day.) People asked for his thoughts 'cause
he got beat by IBM's Deep Blue and he's had a lot of experience with computers
in their relationship with chess (specifically combining humans and computers
to make really strong opponents). People also asked for Ken Jennings' thoughts
and AI isn't his expertise. And people recently asked Hawking for his thoughts
on aliens...

------
woan
Kasparov is spot on in that Watson's DeepQA has yet to prove itself in a
meaningful way. If it proves itself as an effective medical advisor, that will
be far more impressive than the Jeopardy win (as impressive as that was in
itself).

I think everyone was disappointed in the applicability of the Deep Blue
accomplishment in other fields. Were any of the special purpose ASICs used to
defeat Kasparov used in any other application? As far as I know a significant
part of the Deep Blue development team left IBM relatively soon after the
accomplishment.

------
logjam
As noted in Paul Hoffman's recent book "King's Gambit", after his matches with
both "Deep Blue" and "Deep Junior" Kasparov was exhausted:

"As with Deep Blue, he had once again let an encounter with a machine play
games with his head. He had been obsessed with the idea that Deep Junior would
never tire. 'The machine is never distracted by an argument with its mother,'
he told me, 'or a lack of sleep.'"

And in the linked piece Kasparov alludes to the reported next approach IBM
wants to take with Watson - support in medicine.

Kasparov's human reaction to his encounters with Watson's distant cousins
brings up one obvious benefit in the use of technology like Watson for
supporting medical decision-making - simply that such software will be less
likely to miss something. Software is less likely to miss considering a
diagnosis, ordering a crucial test, or following up on a finding - unlike the
fallible 'I' who may have skipped a class in med school, or been up all night
on call and unable to think straight, or who is just occasionally more stupid
than usual.

Diagnosis is the first thing people think of with technology like this, but in
my opinion that's not the big problem Watson should tackle. Medical diagnosis
in and of itself (dramatizations like the TV show 'House' notwithstanding), is
not really that difficult 99% of the time. When you hear hoofbeats, you're
very likely going to find horses and not zebras. A future Dr. Watson might
occasionally be very helpful in pointing out very obscure (but uncommon)
diagnoses. However, in my opinion the most helpful thing a Dr. Watson could
provide is collecting, evaluating, and comparing evidence and outcomes as they
are developed globally and locally (i.e. across broad swaths of medicine, but
also within a single physician's own patient population), continuously
educating the physician, and monitoring cases.

There is plenty of untapped medical data/evidence out there, but it's almost
all hidden away in plain sight...text/natural language. I have to agree with
Kasparov here, in that the primary advancement Watson represents was in moving
farther down the path from syntax to semantics.

