
Watson crushes the competition in second round of 'Jeopardy' - mjfern
http://news.yahoo.com/s/ap/us_tv_man_vs_machine
======
ssclafani
David Ferrucci, the manager of the Watson project at IBM, on why he thinks
Watson got the Final Jeopardy question wrong:

"First, the category names on Jeopardy! are tricky. The answers often do not
exactly fit the category. Watson, in his training phase, learned that
categories only weakly suggest the kind of answer that is expected, and,
therefore, the machine downgrades their significance. The way the language was
parsed provided an advantage for the humans and a disadvantage for Watson, as
well. “What US city” wasn’t in the question. If it had been, Watson would have
given US cities much more weight as it searched for the answer. Adding to the
confusion for Watson, there are cities named Toronto in the United States and
the Toronto in Canada has an American League baseball team. It probably picked
up those facts from the written material it has digested. Also, the machine
didn’t find much evidence to connect either city’s airport to World War II.
(Chicago was a very close second on Watson’s list of possible answers.) So
this is just one of those situations that’s a snap for a reasonably
knowledgeable human but a true brain teaser for the machine."

[http://asmarterplanet.com/blog/2011/02/watson-on-jeopardy-da...](http://asmarterplanet.com/blog/2011/02/watson-on-jeopardy-day-two-the-confusion-over-an-airport-clue.html)

~~~
gojomo
Lame excuses! Watson is impressive, but I'm disappointed by the lack of any
hint of comprehension behind its answers. When it's wrong, it's often
nonsensically out-to-lunch, and its 2nd/3rd best answers are also often batty.

If it's just trained-up on statistical correlations between trigger phrases
and likely answers in the constrained Jeopardy domain, then 90 32-core/512GB
RAM servers seem like overkill.

~~~
InclinedPlane
Watson also generates confidence estimates and minimum confidence bars for
questions. It may sometimes have batty "answers" but usually it knows they are
batty. The rate of incorrect answers that Watson has a high confidence in is
fairly low.
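
A minimal sketch of that confidence bar, to make the idea concrete. The threshold value, the function name `choose_response`, and all confidence numbers are invented for illustration; nothing here is taken from IBM's implementation.

```python
from typing import Optional

# Hypothetical threshold; the thread does not give Watson's real value.
BUZZ_THRESHOLD = 0.50


def choose_response(candidates: dict[str, float],
                    threshold: float = BUZZ_THRESHOLD) -> Optional[str]:
    """Return the top candidate if its confidence clears the bar, else None."""
    if not candidates:
        return None
    best, confidence = max(candidates.items(), key=lambda kv: kv[1])
    return best if confidence >= threshold else None


# Made-up confidences for illustration only.
confident = {"Chicago": 0.82, "Toronto": 0.10, "Omaha": 0.08}
unsure = {"Toronto": 0.14, "Chicago": 0.11, "Omaha": 0.10}

print(choose_response(confident))  # Chicago
print(choose_response(unsure))     # None
```

The second case is the "knows it is batty" situation: a top answer exists, but the system stays silent because nothing clears the bar.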

What's remarkable and important to not take lightly is the result that it's
possible to generate answers to often vague and indirect clues without
understanding. That likely means that it will be possible to build useful
systems for automating research and the synthesis of large amounts of data
without needing to build artificial human-level intelligence.

~~~
extension
...presuming that human-level intelligence entails any sort of understanding
that is fundamentally deeper than what Watson is doing.

~~~
tel
I think that's a fair bet. The new wave of "probabilistic everywhere" NLP
models, though even the very simplest strictly dominate older grammatical
methods, are not often capable of exploiting as much of the structure of
language and topic as humans are wont to do. It's a cutting-edge
accomplishment when NLP algorithms learn prediction of long-range word pairs
such as how you almost certainly will see "law" or "marriage" somewhere in a
sentence containing the word "annulled" even if that local area of the
sentence doesn't seem to call for it. Humans on the other hand are more likely
to forget that it's possible to annul pretty much anything else.

I don't own a TV and plan on watching the Jeopardy match later online, so I'm
just going to guess about Watson's performance. I think that humans abuse
discovered patterns and structure in language and meaning to search through
possible interpretations very quickly. Watson on the other hand uses far less
structure and a room full of 200 cores to search through everything it knows
much less efficiently. I feel like Watson's "strange" answers probably aren't
nearly so strange when you realize it's simply being more fair to _any_
possible answer than a human would.

What's scary is this sort of thing---a willingness to consider out of context
answers---sounds pretty similar to the kind of behaviors we humans praise as
creative!

~~~
extension
_I think that humans abuse discovered patterns and structure in language and
meaning to search through possible interpretations very quickly._

Right, but does that structure really represent a "deeper" understanding or
just vast and meticulous optimizations of statistical algorithms similar to
Watson's? Or is there a difference?

We feel like we know how we think, but we can't actually explain it in enough
detail to reproduce. Humans have a bad history of rationalization and tunnel
vision. And now we discover that all the "wrong" ways to think deeply are
actually the right ways to make a working AI.

If the AI can fool us into believing that it "understands" then maybe we can
fool ourselves in the same way.

~~~
tel
I don't honestly feel like we know how we think at all. I do think that
statistics is a pretty good bet for the "math of learning" in that it's a
sensible way to track how information flows through a model. Furthermore, the
combinatorial problems involved need to be tackled just the same by humans so
we can maybe try to say that we're studying similar phenomena as the workings
of the brain.

Of course, the implementations we build will always be vastly different from
their appearance in the brain since the architectures are so extraordinarily
different!

------
emelski
For all the talk of the difficulties of playing Jeopardy! due to the "nuances
of natural language" and "puns and double meanings in the clues", that did not
really seem to be a factor in the second round -- most of the questions were
quite plainly worded with answers easily discoverable just by searching.
Accordingly, Watson performed dramatically better today than yesterday, when a
larger portion of the questions did have nuance and plays-on-words in the
phrasing. Note too how spectacularly badly Watson performed on the Final
Jeopardy! question, where nuance _did_ play a much bigger role.

So today, we learned that machines can push buttons faster than people, and
search is a great way to find answers for trivia questions. I doubt the former
is a surprise to anybody alive in the past 50 years; the latter shouldn't
surprise anybody who's ever used Google.

~~~
gthank
This. A thousand times this. Watson absolutely CRUSHED the human players on
pretty much every question that was basic facts. I know Watson can probably
generate answers faster than humans on simple search stuff, but it seemed so
bad at some points that I wondered: is Watson not wired in with some sort of
delay that mimics the delay that humans have between deciding to buzz in and
actually buzzing in? A lack of such a system would seem to skew the results
somewhat.

~~~
effigies
I was at a watching party with a couple of IBMers who worked on Watson, and
one thing they said is that it's not a question of speed, but of timing.
Players time their pressing to an estimate of when Alex will finish the
question, and Brad Rutter in particular has been clocked at under 2ms with
shocking regularity. The advantages Watson has are consistency and the
emotional perturbations in its opponents. You could see them getting
frustrated, and that likely only served to harm their ability to hit that
window between the end of the question and Watson's button press.

~~~
gthank
You're right: consistently being 6x faster on the buzzer than the common case
for your opponent is going to let you destroy them. Their only hope is that
you can't come up with a response before Alex finishes reading the question.

I arrived at the 6x approximation by googling around for avg. ethernet
latencies. I'm consistently seeing numbers of .3 - .35 ms for an ethernet
ping/pong. I think it's fair to assume that with the money IBM has invested in
this, Watson is on at least ethernet quality connections.
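
The arithmetic behind that estimate, as a rough sketch. It assumes the 2 ms figure quoted upthread for a fast human and treats an Ethernet round trip as a stand-in for Watson's buzzer latency, which is a guess and not a measured value.

```python
# Figures quoted in the thread; neither is a measurement made here.
HUMAN_BUZZ_MS = 2.0              # "under 2ms", so this is an upper bound
ETHERNET_RTT_MS = (0.30, 0.35)   # ping/pong range found by searching

# Ratio of human buzz time to the assumed machine latency.
ratios = [HUMAN_BUZZ_MS / rtt for rtt in ETHERNET_RTT_MS]
low, high = min(ratios), max(ratios)

print(f"speed ratio: {low:.1f}x to {high:.1f}x")  # speed ratio: 5.7x to 6.7x
```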

------
spitfire
So is there any information on how they actually implemented Watson? My
understanding is it's a Bayesian machine learning system, but I still don't
know how it parses answers, or really does its magic.

Also, if there is anyone who thinks silicon valley has the smartest people
around, this type of stuff should change your mind. Facebook is short trousers
compared to this, and it's just a tech demo.

~~~
Mahh
Some cool stuff here: <http://www-943.ibm.com/innovation/us/watson/>

The real challenge behind Watson is the natural language parsing. Instead of
abstracting information away from its sources (like a graph), sources seem to
have been left intact as sentences in Watson's memory. Watson would read
through this information in much the same way it interprets a question, and
it would try to create links and possible answers based on connections between
sentences from many sources (which suggests why pun questions are
difficult for Watson). I can't speak to the mathematical
implementation of the answer choices, but this is the high-level way that
Watson finds answers. Those videos talk about the cool stuff behind the
algorithmic challenges of Watson.

~~~
spitfire
You aren't joking. I took a few minutes and wrote up a Bayesian engine in
Mathematica. I've got a pretty good start on that already, and as the IBM
stuff notes it's embarrassingly parallel. It seems to me the entire problem is
parsing. If you can parse well and feed a well formed input to your data layer
(and you've fed it enough data) you're golden.
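
To show what "Bayesian" and "embarrassingly parallel" could mean here, a toy sketch in Python (not the Mathematica code, and not IBM's method). It scores each candidate answer with a naive Bayes update, and because no candidate's score depends on another's, the scoring maps cleanly onto a worker pool. The candidates, likelihoods, and the shared prior are all invented.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pairs of P(evidence | candidate right) and
# P(evidence | candidate wrong); every number is invented.
EVIDENCE = {
    "Chicago": [(0.9, 0.3), (0.7, 0.4)],
    "Toronto": [(0.6, 0.5), (0.2, 0.6)],
}
PRIOR = 0.5  # same prior for every candidate, an assumption


def posterior(likelihoods, prior=PRIOR):
    """Naive Bayes update: add log-likelihood ratios to the prior log-odds."""
    log_odds = math.log(prior / (1 - prior))
    for p_if_right, p_if_wrong in likelihoods:
        log_odds += math.log(p_if_right / p_if_wrong)
    return 1 / (1 + math.exp(-log_odds))


# Each candidate is scored independently, so the map parallelizes trivially.
with ThreadPoolExecutor() as pool:
    scores = dict(zip(EVIDENCE, pool.map(posterior, EVIDENCE.values())))

best = max(scores, key=scores.get)
print(best, round(scores[best], 2))  # Chicago 0.84
```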

So who wants to build a real Q/A site based on this? Call it hal-18000.

~~~
paganel
> So who wants to build a real Q/A site based on this? Call it hal-18000.

You'd have to teach it to deal with thick accents like this one:
<http://www.youtube.com/watch?v=5FFRoYhTJQQ> . Honestly, I don't know if
that's possible, no matter how much training you'd put into the machine.

~~~
khafra
Natural Language Processing != Voice Recognition.

~~~
paganel
> Natural Language Processing != Voice Recognition

This is what I don't get: why should "language processing" be tied to written
text? Part of the answer I know, because it's easier for computers to parse,
but other than that it doesn't make sense.

~~~
gloob
Speech recognition is speech recognition. Different problem entirely.

------
kirpekar
Interesting match.

Seems like Watson was able to ring in (clicker) much quicker than Ken or Brad.
Any unfair advantage?

~~~
zach
Of course. This isn't a match of equals, this is one of man versus machine. As
fans know, most responses in a game of Jeopardy! are known by multiple
players. That's especially so in a game of this caliber, so the knowledge
aspect is really quite minimal compared to the ring-in factor. Both these guys
slaughtered their opponents by being quick on the buzzer.

Watson is being granted first crack at the questions 90% of the time because
of its electromechanical advantage. IBM may not have the mean brainpower that
Google has, but they can clearly build a computer that can press a button
quicker than Ken Jennings.

Knowing that a computer can consistently beat even the best to ever play the
game to the buzzer, the IBM team could be pretty well assured of success once
they got Watson performing well enough.

~~~
sigstoat
the comparison to google seems, um, awkward?

"IBM holds more patents than any other U.S.-based technology company and has
nine research laboratories worldwide. Its employees have garnered five Nobel
Prizes, four Turing Awards, nine National Medals of Technology, and five
National Medals of Science."

admittedly, they've been around longer, but they're not exactly playing with
crayons over there.

~~~
Padavies
Come on, outside of these academic exercises, IBM is now just a giant
consulting firm.

~~~
kevin_morrill
umm yeah giant as in $100 billion in revenue and about $15 billion in net
income. Not a bad business at all. I should hope to be so boring.

~~~
arethuza
With over $10 billion a year in hardware sales.

------
philsalesses
Today, I learned that there are 7 cities in America named Toronto.

~~~
theycallmemorty
I wonder how many of them partially fit the mold of having an airport named
after WWII battles/heroes?

~~~
philsalesses
I have 31% confidence there is something there...

------
baddox
And, as I predicted, it only came down to buzzer reflex, which computers
unsurprisingly excel at. On day 2 (today), Watson was only beaten to the
buzzer three times when it had the correct response above its confidence
threshold.

------
Curly
To make it a true test of brains, and remove the mechanics of button-pressing
speed from the question...

Place all three contestants in isolation from each other.

All three hear the question read, and buzz-in just as they do now.

Allow ALL contestants who buzz in to answer the question, but do not allow
them to know about their opponents' performances.

Record all contestants' buzz-in reaction times.

At the end of the game, compare only the accuracy of answers to determine the
winner.

At the end of the game, compare buzz-in reaction times to see how thumbs fare
against relays.
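
The scoring side of this proposal can be sketched as follows. The `Attempt` record, the sample data, and the choice to read "accuracy" as the fraction of attempted clues answered correctly are all assumptions made for illustration; counting total correct answers would be another reasonable reading.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Attempt:
    contestant: str
    correct: bool
    reaction_ms: float  # time from end of clue to buzz


def group(attempts, field):
    """Collect one field of every attempt into a list per contestant."""
    by_player = {}
    for a in attempts:
        by_player.setdefault(a.contestant, []).append(getattr(a, field))
    return by_player


def accuracy(attempts):
    """Fraction of attempted clues answered correctly, per contestant."""
    return {name: sum(r) / len(r) for name, r in group(attempts, "correct").items()}


def mean_reaction(attempts):
    """Average buzz-in time in milliseconds, per contestant."""
    return {name: mean(t) for name, t in group(attempts, "reaction_ms").items()}


# Invented data: every contestant who buzzes gets to answer.
attempts = [
    Attempt("Watson", True, 8.0), Attempt("Ken", True, 95.0),
    Attempt("Brad", False, 60.0), Attempt("Watson", False, 8.0),
    Attempt("Ken", True, 110.0), Attempt("Watson", True, 8.0),
    Attempt("Brad", True, 70.0),
]

acc = accuracy(attempts)
times = mean_reaction(attempts)
winner = max(acc, key=acc.get)         # decided on accuracy alone
fastest = min(times, key=times.get)    # reported separately

print("winner:", winner)    # winner: Ken
print("fastest:", fastest)  # fastest: Watson
```

Keeping the two rankings separate is the point of the design: buzzer speed is still measured, but it no longer decides who gets to answer.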

------
ckwalsh
Just put up the latest results

[https://spreadsheets1.google.com/ccc?key=tth_jhM8vyBAuogqHll...](https://spreadsheets1.google.com/ccc?key=tth_jhM8vyBAuogqHllHmHQ#gid=2)

------
nyellin
I received a few complaints when I posted the results of round #1 and it hit
the homepage. You might want to change the title to something ambiguous about
who won.

------
jamesjyu
Part 1 of the second round on YouTube here:
<http://www.youtube.com/watch?v=PHhDLUVAtqU>

~~~
usaar333
Thanks. Part 2 available anywhere?

~~~
thamer
Part 2: <http://www.youtube.com/watch?v=HR2_M8kL_3o>

------
metaprinter
I had to leave the article to search where the actual building was located
because all they gave was, "suburban New York". I'm still not sure where it
is.

------
rockstar9
is there a replay of jeopardy?

~~~
jackowayed
IBM will have it up in a couple days:
<http://twitter.com/#!/IBMWatson/status/37223337453158400>

------
drstrangevibes
The fact that Watson has good NLP isn't nearly as impressive as the fact that
it has a huge knowledge base. How the hell did it get all that knowledge? If
it is just from browsing the internet by itself, that makes me
afraid.......very afraid

------
e40
Why the spoiler? Do you think everyone watches it live? We are in the age of
the DVR.

~~~
bpodgursky
<SPOILER ALERT>

Mubarak's not president of Egypt anymore!

