
Day 1: IBM Watson Ties for first on Jeopardy - nyellin
http://venturebeat.com/2011/02/14/in-man-vs-machine-challenge-ibms-watson-ties-for-first-on-jeopardy/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed:+Venturebeat+(VentureBeat)
======
chaosmachine
After not being able to find a suitable online stream of this event (I don't
have cable), I gave up and built a rudimentary antenna for my TV. Basically, I
just wound a bunch of copper wire around a piece of cardboard and stuck it in
the window. The other end of the wire went into the cable port on the back of
my TV. Surprisingly, I was able to pick up the game in crystal clear, digital
1080p!

~~~
Swizec
That's pretty impressive, mind sharing a photo with us? I just want to
understand what this level of awesome looks like.

~~~
chaosmachine
Thanks, I just wrote it up and took some pictures:

<http://news.ycombinator.com/item?id=2222827>

------
edw519
Watching Watson's avatar gave a little insight about what the software may
have been doing. Here are the results, based upon that avatar. The % is its
confidence in its #1 response.

    
    
      Category    Clue         Watson #1 resp  % Result Outcome
      ----------- ------------ -------------- -- ------ -------
      Alt Meaning belief       view           70 right  Brad
      Alt Meaning horse foot   shoe           68 right  won
      Lit Chars   split person Hyde           71 right  won
      Beatles     guy pain     Jude           98 right  won
      Olympic Odd 2008 perfect Michael Phelps 93 right  won
      Name Decade Disneyland   1950s          87 right  Ken
      Final Front black hole   Event horizon  97 right  won
      Lit Chars   Beowulf      Grendel        97 right  won
      Final Front Michelangelo Last Judgment  97 right  won
      Beatles     title gal    Lady Madonna   90 right  won
      Olympic Odd 1908 city    London         69 right  won
      Name Decade Emp St Bldg  1930s          50 right  Brad
      Beatles     Silver Hammr Maxwell        98 right  won
      Lit Chars   his victims  Harry Potter   37 wrong  Brad
      Alt Meaning piece wood   stick          96 right  won
      Final Front Latin ending finis          97 wrong  lost
      Beatles     John mother  Julia          97 right  Ken
      Lit Chars   Les Miz      Jean Valjean   76 right  won
      Olympic Odd 1976 epee    Pentathlon     85 right  won
      Name Decade Klaus Barbie 2002           11 wrong  Ken
      Olympic Odd George Eyser leg            61 wrong  lost
      Alt Meaning bent arm     Knee           40 wrong  Ken
      Name Decade Oreos        1920s          57 wrong  lost
      Final Front paper limit  Envelope       61 right  Brad
      Alt Meaning students     chic           82 wrong  lost
      Final Front summit       peak           65 wrong  Brad
      Name Decade Kitty Hawk   1900s          17 right  Brad
      Olympic Odd sole member  Olympic Games  20 wrong  Ken
      Beatles     died church  Eleanor Rigby  98 right  won
      Lit Chars   evilness     Sauron         74 right  won
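
To see how well the confidence number tracked correctness, here is a rough sketch that buckets the table's rows by the % column. The (confidence, right/wrong) pairs are transcribed from the table above; the function name and the bucket edges (50 and 90) are arbitrary choices for illustration, not anything IBM has described.

```python
# (confidence %, was Watson's #1 response right?) transcribed from the table above
RESULTS = [
    (70, True), (68, True), (71, True), (98, True), (93, True),
    (87, True), (97, True), (97, True), (97, True), (90, True),
    (69, True), (50, True), (98, True), (37, False), (96, True),
    (97, False), (97, True), (76, True), (85, True), (11, False),
    (61, False), (40, False), (57, False), (61, True), (82, False),
    (65, False), (17, True), (20, False), (98, True), (74, True),
]


def accuracy_by_bucket(results, low=50, high=90):
    """Group responses by confidence and return {label: (right, total)}.

    The bucket edges are illustrative assumptions, not Watson's real thresholds.
    """
    buckets = {}
    for confidence, right in results:
        if confidence < low:
            label = f"<{low}"
        elif confidence < high:
            label = f"{low}-{high - 1}"
        else:
            label = f">={high}"
        n_right, n_total = buckets.get(label, (0, 0))
        buckets[label] = (n_right + int(right), n_total + 1)
    return buckets


if __name__ == "__main__":
    for label, (n_right, n_total) in sorted(accuracy_by_bucket(RESULTS).items()):
        print(f"{label:>6}: {n_right}/{n_total} right")
```

On these 30 rows that works out to 10 of 11 right at 90% or above, 10 of 14 right between 50% and 89%, and 1 of 5 right below 50%.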

~~~
ckwalsh
I also put together the full game stats:
<http://news.ycombinator.com/item?id=2220667>

------
dimatura
I attended a talk by one of the designers. Some of Watson's 'wrong' answers he
described were pretty funny. There was one where the clue (paraphrasing) was
'Punch below the beltline that rhymes' and the right answer was 'Low blow',
but Watson came up with 'Wang bang'. In another one the clue was something
like 'The end of this had an exclamation mark in headlines in 1919', with the
right answer being 'World War I', whereas Watson answered 'Sentence'.

------
Dboy
Video:

Part 1: <http://www.youtube.com/watch?v=4PSPvHcLnN0>

Part 2: <http://www.youtube.com/watch?v=CtHlxzOXgYs>

~~~
Splines
Thanks for the links. I love the bits when Trebek visits Watson in the lab -
it's the sort of thing where kids in 20 years will giggle at the fact that
their pocket computer is more powerful.

------
cryptoz
I couldn't have wished for a more exciting opening to this game. Watson just
shot right off into the lead like there was no tomorrow, jetting ahead by
thousands! I was blown away. Then he slowly started to crumble a bit and his
answers were faulty more and more. But his strategy is amazing! He hops from
category to category, while most humans pick a category and stick with it. I
wonder if IBM has him do that purely because it might throw the humans off a
bit. Totally awesome. I can't wait for tonight's show!

~~~
AgentConundrum
_He hops from category to category, while most humans pick a category and
stick with it._

I noticed the same thing, but I really hoped it was unintentional. It doesn't
feel like it's a fair fight otherwise, since it means he's exploiting a
weakness intrinsic to his competitors. The whole point of this competition is
to show that a computer can compete and win at a human level. If it weren't,
then why force him to physically push a button to ring in? If Watson is
intentionally jumping categories to confuse humans, then it sort of feels like
he's cheating a bit.

As I write this, I realize a lot of this comment is motivated by my bruised
human ego, but it just doesn't feel quite right to me.

~~~
hallman76
_It doesn't feel like it's a fair fight otherwise, since it means he's
exploiting a weakness intrinsic to his competitors_

Late in the first round you can see Jennings change his strategy a bit -- he
starts ringing in before he knows the answer. Jennings is exploiting the fact
that Watson won't ring in until it has confidence in an answer.

~~~
bitwize
That's awesome. He can make that strategy pay off because he's Ken freaking
Jennings. Usually the money penalty for buzzing in but answering incorrectly
(or not at all) would dissuade a contestant from buzzing in before actually
knowing the answer.

------
gojomo
That Trebek gave Watson credit for "Maxwell's silver hammer" when the clue was
looking for the person – Maxwell – gave me a moment's pause.

Jeopardy does tend to be a little more forgiving of contestants during the
first round – such as giving them a reminder to phrase-as-a-question, and
letting them correct an initially wrong or incomplete utterance, if done
almost instantly. Still, I'm not sure if such an interpretive error by a person
would have been overlooked.

~~~
jeffcoat
I noticed that, too; but a little later in the game, Trebek declared "leg"
(instead of "he was missing a leg") wrong without hesitation.

So it's definitely being held to some standard.

~~~
Benjo
Actually, according to Ars Technica, that lack of hesitation was before the
segment was reshot.

A human had answered "missing a hand," which created context that means "What
is a leg" could be interpreted as "he was missing a leg". Initially Alex did
give him credit, before realizing that Watson couldn't have been using that
context.

<http://arstechnica.com/media/news/2011/02/ibms-watson-tied-for-1st-in-jeopardy-almost-sneaks-wrong-answer-by-trebek.ars>

~~~
shalmanese
OK, so I have a question about this: how was it legitimate for Brad to answer
at this point? Brad would have seen Alex initially credit Watson with the
right answer and then reverse the judgement, rendering the answer incorrect.
Brad would have been able to infer the correct answer at this point.

~~~
alanfalcon
In fact, Brad didn't answer the question... I was wondering about that since
the answer was obvious just from Ken's guess and Watson's response.

------
markszcz
Going the Sci-Fi route, I would love this technology to expand and become an
interactive Wikipedia. How cool would it be to go to your local library (you
remember those?), ask it a series of questions, and get answers?

But thinking about how quickly the library is becoming archaic these days,
I see it going this route: You call this service from your computer or phone
and ask it anything. But by then Google will already have that feature. Google
AI (Beta)

~~~
flatline
I too am inspired by all the possibilities this implies for human-computer
interaction. I picture a portable device that records events from your
surroundings and provides personalized reasoning similar to this. It could
remind you of people's names, it would know all the details of your personal
finances and could provide budget and stock tips, remind you of events if you
have a pattern of forgetting to plan ahead, etc. Aside from the fact that
Watson fills up a whole room and its knowledge is tailored to Jeopardy, it
really doesn't seem that far off combined with a few other technologies like
machine vision and translation that are already out there.

------
baffledshrimp
I could beat Watson. "I'll take quote semi-colon drop table categories for
200"

------
spencerfry
You can watch it on YouTube:

Part 1: <http://www.youtube.com/watch?v=4PSPvHcLnN0>

Part 2: <http://www.youtube.com/watch?v=CtHlxzOXgYs>

------
ugh
If you want to watch it but have no way of doing so, the game is (still?)
available on YouTube and easily findable. It will also be available on the IBM
Watson website (<http://ibmwatson.com/>) the day after tomorrow.

------
shkb
<http://news.ycombinator.com/item?id=2220667>

For those that missed the game, or can't remember every question, a nice
spreadsheet/empty discussion made by ckwalsh.

------
umjames
If Watson wins, it should have to keep coming back as defending champion until
someone else beats it. It should also qualify to return in the Tournament of
Champions if it gets that far.

Now that would be a true test.

~~~
alanfalcon
It would be quite entertaining, but that would mean either more permanently
moving Jeopardy! filming to IBM's campus (cost prohibitive) or rebuilding
Watson on Jeopardy!'s set (also cost prohibitive), so you can see why this is
just a three-day exhibition match.

~~~
umjames
Understood.

Does anyone know what these IBM supercomputers ultimately are used for after
they're done winning Jeopardy or beating Garry Kasparov? I'd imagine something
top-secret for the US government, but it would be nice to know exactly what
Deep Blue is doing now.

------
nyellin
Does anyone want to create a community site for the contest? I am looking for
another Django developer who can participate in an all-night code sprint. The
site would:

1. Show an online score board.

2. Track Watson's performance (like shkb is doing on Google Docs)

3. Aggregate pictures, videos, and written analysis of the event

~~~
trafficlight
I think you should've been doing this on Sunday...

~~~
nyellin
I didn't think of it until now. I assumed that the official IBM site would
have a way of tracking Watson's progress.

------
e40
Spoiler not appreciated by many, I'm guessing. Next time, please don't do
that.

~~~
kgermino
I'm confused. Where's the spoiler? I don't remember seeing anything in the
article (and I'm glad I didn't; I wouldn't have appreciated the spoiler).

~~~
bokchoi
The title of the submission is the spoiler.

------
T-hawk
=====MILD SPOILERS FOR DAY 2 BELOW=====

(I posted this in another thread, but this one seems to have most of the
action.)

I think it's become apparent that most of Watson's advantage is in signaling
speed. Figure that a top human player like Jennings knows about 80% of the
answers and Watson about 90%. Watson should be winning, but not by nearly that
much.

Jeopardy signaling 101 for those that don't know: An offscreen operator
presses a button to enable the buzzers when Alex is done speaking. If you buzz
too early, there's a 300 ms lockout until you can buzz again. A light near the
board lights up when the buzzers are open. Watson monitors that light (whether
by electrical connection or optical sensor, they haven't said) and physically
presses the signaling buzzer. Its reaction speed must be faster than the
humans' and it will never miss and buzz too early.
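
A minimal sketch of the lockout rule described above, not the show's actual signaling logic. Times are in milliseconds relative to the moment the buzzers open (negative means too early). The function name is hypothetical, and the choice to simply ignore a press made during a lockout (rather than extend the lockout) is an assumption.

```python
LOCKOUT_MS = 300  # lockout length quoted above


def first_valid_buzz(press_times_ms):
    """Return the time of the first press that registers, or None.

    press_times_ms: button presses in ms relative to the buzzers opening.
    Assumption: a press made while locked out is ignored and does not
    restart the lockout.
    """
    locked_until = float("-inf")
    for t in sorted(press_times_ms):
        if t < locked_until:
            continue  # still locked out from an earlier early press
        if t < 0:
            locked_until = t + LOCKOUT_MS  # too early: start a lockout
            continue
        return t  # buzzers are open and the player is not locked out
    return None


if __name__ == "__main__":
    # Jumping the gun by 50 ms costs this player until 250 ms after opening.
    print(first_valid_buzz([-50, 100, 260]))
```

Under these assumptions, a player who presses 50 ms early cannot register before 250 ms, which is plenty of time for a machine that waits for the light.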

Also, Watson's clue selection pattern shows that it definitely starts by
searching for Daily Doubles. It picks the bottom 3 clues of each category
before anything in the top 2 rows, where the DDs are statistically
concentrated.

------
toddh
The longer the question the longer Watson has to generate an answer. Isn't
that an important advantage?

~~~
lylejohnson
I'm not sure I understand your point. All of the contestants "see" the answer
at the same time, and none of them are allowed to buzz in until the magic
light goes on.

~~~
derwiki
But if the question takes 20 seconds to read, Watson has 20 of his seconds to
narrow down his confidence to a particular answer. If he only had 10 seconds,
his calculations might not have gone far enough to bubble the correct answer
on top. It would be interesting to see a chart of confidence of answers vs
time for Watson..

~~~
dfan
But the same is true for the human contestants; at this level especially, I'm
sure that they scan the whole question in a second and then have the rest of
the spoken time (which, even for long questions, is nowhere near 20 seconds -
probably more like 4) to think about it.

------
ubercore
I'd like to see the next version take the avatar concept further. They went
half way by making it physically buzz in. Go the rest of the way, and make it
only receive input from the avatar. Add a computer vision and speech
recognition component, and make it read the board and listen to Trebek like
the human players. Then, make it mobile, and put the avatar in the actual
Jeopardy! studio.

EDIT: Clarified "human" players. Welcome to the future!

~~~
dimatura
According to the designers, they considered it but finally decided to concentrate
on the question answering. OCR for this situation could probably be done in a
fast and reliable way, so it wouldn't make much of a difference. Speech
recognition would be harder and less reliable, but redundant most of the time
(though it presumably could have helped for the 1920s question).

~~~
ubercore
Understandable, and I agree that they definitely tackled the meat of the
problem. I think there could be some interesting problems in the computer
vision aspect of it, though. Seeing and interpreting the board is subtly
different than a straight OCR problem.

Maybe I'm overstating the difference, though.

------
jluxenberg
Does anyone know why Watson was unable to buzz in immediately in some cases
when he had 97% confidence in his answer? Seems odd that a computer would not
be able to beat the human opponents at hitting the buzzer.

(It's possible that he didn't have the answer until after the human opponents,
but that seems unlikely.)

~~~
ugh
Watson needs a few seconds to calculate the answer and will only hit the
buzzer when it’s finished calculating. Its opponents can buzz in even if they
are not completely sure that they have an answer (or they simply might be
faster than Watson). It’s important to note that it is only possible to buzz
in after the question has been read out. (Someone behind the stage flips a
switch or something like that. Lights indicate to the humans that the buzzers
are open; they are locked out for a few hundred milliseconds when they press
the buzzer too early. Watson never buzzes in prematurely.)

I would like to know whether it is possible to beat Watson to the buzzer even
if you and Watson both already know the answer. Watson has probably better
reaction times but humans can anticipate when the host is finished reading.
That probably depends highly on the consistency of the person behind the stage
flipping the switch and opening the buzzers.

Here are some stats from the first round: Watson was the first to buzz in 16
times (with two wrong answers). It was above its buzzing threshold and didn’t
manage to buzz in seven times. (It would have been wrong three times if it had
made it to the buzzer first.) It was below its buzzing threshold six times.
(The remaining one is the daily double which Watson got correct.)

This needs to be in a table. Numbers in brackets indicate wrong or potentially
wrong answers:

    
    
      Confident & first:  14+(2)
      Confident & beaten:  4+(3)
      Not confident:       6
      Daily Double:        1
    

Edit: Some think that Watson’s (probably) superior reaction time gives it an
unfair edge. I don’t really agree because fast reactions are simply a part of
the game but I can sort of see the point that it’s not really about reaction
times. We already know that computers can be better than humans when it comes
to those.

I propose the following modification: One of the contestants who manage to
buzz in within the first 100ms (the human reaction time) is randomly selected;
buzzing in after those 100ms works as usual. Contestants are also no longer
locked out for buzzing in too early.
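
A minimal sketch of that proposal, just to make the rule concrete. The function name and the input shape are made up for illustration. It also assumes that a press before the buzzers open is ignored without penalty, so each contestant's time is their first press at or after the buzzers open.

```python
import random

REACTION_WINDOW_MS = 100  # the window named in the proposal


def pick_buzz_winner(buzz_times_ms, rng=random):
    """Pick who gets to answer under the proposed rule.

    buzz_times_ms: dict mapping contestant name to the time (ms after the
    buzzers open) of their first press. Contestants who did not buzz are
    simply left out of the dict.
    """
    if not buzz_times_ms:
        return None
    in_window = [name for name, t in buzz_times_ms.items()
                 if t <= REACTION_WINDOW_MS]
    if in_window:
        # Everyone inside the window is treated as a tie: draw at random.
        return rng.choice(in_window)
    # Nobody made the window: fastest press wins, as usual.
    return min(buzz_times_ms, key=buzz_times_ms.get)


if __name__ == "__main__":
    print(pick_buzz_winner({"Watson": 8, "Ken": 40, "Brad": 250}))
```

With these numbers Watson and Ken are both inside the window, so each gets the clue about half the time, and Brad never does.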

~~~
KVFinn
I'm a bit too lazy to look up the source, but one article mentioned that the
engineers were surprised that humans could get down to the 10 - 15 ms range by
anticipating the light rather than reacting to it. They said this beats the
mechanical buzzer, and it does happen, though not terribly often.

So yes, it is possible to beat Watson to the buzzer even if you both know the
answer.

I suspect they could have improved the buzzer if they really wanted to, but it
might have made things too lopsided.

------
tocomment
Has anyone considered building an open source Watson?

~~~
spicyj
Not very many people have a state-of-the-art supercomputer in their basement,
so it wouldn't be that useful.

~~~
ebiester
IBM's using two racks of computers and a huge SAN. Maybe it couldn't compete
with Watson, but a decent setup could be cobbled together for 5,000. It
wouldn't be as _fast_, or as comprehensive, but it could be done.

The harder part is the tens of thousands of man hours with very smart people.

~~~
tocomment
I think a little project called Linux got tens of thousands of man hours with
very smart people ... Just saying.

------
powrtoch
"Watson can’t adjust its answers to what the other players say and so it
simply answers with whatever comes up as its top answer."

This is false; in fact, they specifically mentioned Watson learning from the
answers of the other players.

In the game, Watson answers "The 1920s" after Jennings answers incorrectly
with "the 20s". Basically, it didn't write off its own answer because it
thought Jennings' was different enough that it might have been incorrect for
the way it was phrased. The same way you might correctly answer "inner ear"
after someone else incorrectly answers "ear".

~~~
asuth
I was at an event at MIT with one of Watson's designers as I watched the show.
I asked him about the 20s thing, and he said they talked about it when they
were designing Watson but they figured the scenario wouldn't happen where
another contestant gave the wrong answer and Watson gave the
same-but-different wrong answer. Edge case!

~~~
sharmajai
Capt. Ed Murphy was a smart man.

~~~
solarmist
He was Major Ed Murphy.

------
JanezStupar
It's a baby god.

------
TimothyBurgess
Take note fellow hackers:

We're witnessing a truly monumental event in human history and technological
development.

Computers competing with humans on gameshows as if they are intellectually
equivalent...

We live in the freaking future now, man.

