
IBM Watson's team Q&A on reddit - hootx
http://blog.reddit.com/2011/02/ibm-watson-research-team-answers-your.html
======
gojomo
Alas, my somewhat-skeptical question came in late and got little support:

[http://www.reddit.com/r/IAmA/comments/fnfg3/by_request_we_ar...](http://www.reddit.com/r/IAmA/comments/fnfg3/by_request_we_are_the_ibm_research_team_that/c1h90u4)

 _What determined the use of exactly 10 racks of 9 maxed-out (32-core, 512GB
RAM) 4U Power750 servers? For example, would Watson have done better with more
hardware? Or could it have made do with far less, after all the bulk
pre-processing of, and training on, source material was finished?

(My intuitions about the necessary amount of reference data and topical
associations – written up at <http://redd.it/fnixm> – made me think way less
hardware should have been required, at least at the very end during the
match.)_

~~~
blhack
I wonder if this is marketing.

IBM makes BIG HUGE MASSIVE (tm) server clusters that have lots of
blinkenlights and require lots of power and are so crazy and huge and awesome
that the people working at IBM must be hyper-geniuses!

vs

Watson runs on a laptop.

~~~
brisance
I can't see how that would be marketing. Most people would be more impressed
if Watson could run on a laptop, today. I know I would.

~~~
gojomo
More impressed? Yes. More willing to spend millions on hardware and software?
No.

~~~
brisance
And how would they look if, as the original poster surmised, it were possible
for a competitor to produce something similar to Watson on today's laptop? IBM
would've spent millions of dollars making themselves look very silly.

It's a similar situation to moon-landing skepticism. If it didn't happen, it
would be trivial for the Soviet Union to disgrace and discredit the
achievement by faking their own.

~~~
gojomo
No other team was given a crack at the Jeopardy spotlight. IBM didn't win
against a bunch of other teams for the slot on the human/machine faceoff; it
was all orchestrated for them. So give others some time to emerge.

I suspect in a year or two – perhaps sooner, if an organized open competition
with real cash prizes is launched – we'll have a better idea of what a lean
team could do on the trivia domain. It may not be reduced to a single
2011-equivalent laptop... but it might be a single 2011-equivalent maxed-out-
server (rather than a Watson-like server room).

------
trickjarrett
I also feel that they sort of danced around the buzzing-in question. Obviously
Watson has to calculate and decide on his answer, but there is no denying that
he was very fast on the buzzer in the game.

~~~
judofyr
From the Q&A with Ken Jennings: [http://live.washingtonpost.com/jeopardy-ken-
jennings.html?hp...](http://live.washingtonpost.com/jeopardy-ken-
jennings.html?hpid=talkbox1)

    
    
        Q: Seemed to me, for many of the questions, that the computer was just
        better at buzzing in. Does Watson have an unfair advantage for timing the
        buzz-in?
    
        A: As Jeopardy devotees know, if you're trying to win on the show, the buzzer is
        all. On any given night, nearly all the contestants know nearly all the
        answers, so it's just a matter of who masters buzzer rhythm the best.
    
        Watson does have a big advantage in this regard, since it can knock out a
        microsecond-precise buzz every single time with little or no variation. Human
        reflexes can't compete with computer circuits in this regard. But I wouldn't
        call this unfair...precise timing just happens to be one thing computers are
        better at than we humans. It's not like I think Watson should try buzzing in
        more erratically just to give homo sapiens a chance.

~~~
ars
Seems to me they should have chosen harder questions.

They should try to pick questions such that the contestants only know about
1/3 of them.

Then let's see how the computer does.

This is (should be) a contest of knowledge, not buzzing.

~~~
kenjackson
But that's true for normal Jeopardy too. The best buzzer person (Ken Jennings
or Brad Rutter) wins because of their reflexes and timing. Watson just took
that edge off the table and flipped it back at them.

~~~
ars
I meant change it for regular Jeopardy as well.

~~~
trickjarrett
That's good for an intellectual challenge, but for game shows to work, the
questions need to be ones that viewers at home (the people who earn them their
advertising dollars) know. They want us sitting on the couch shouting out the
answers. If we're sitting there clueless, it's a much shorter distance to
changing the channel or taking it off the DVR.

------
Jd
Question 3 was the most interesting, but the data on parsing is remarkably
incomplete. As far as I can tell, we get only lists of possible ways to break
down the data, without any explanation of how or why one way is preferred over
another.

Case in point (1): how it decides to treat "Treasure Island" as a proper noun.
We see only "modifies(Treasure, Island)" -- indicating that it treats
"Treasure" as an adjective modifying "Island" -- then suddenly in the
semantic-assumption phase the two are treated as a compound.

Case in point (2): we are given:

    
    
            island(Treasure Island)
    
            location(Treasure Island)
    
            resort(Treasure Island)
    
            book(Treasure Island)
    
            movie(Treasure Island)
    

I assume these are method names written in Java, each taking "Treasure
Island" as its single argument and returning a value indicating the
likelihood that "Treasure Island" is what the method name refers to. This is
extraordinarily interesting. However, it is not at all clear which methods
are chosen and why, or whether they are run in some sort of sequence or
simultaneously, etc.

Case in point (3): "Builds different semantic queries based on phrases,
keywords and semantic assumptions." This is very vague, but it indicates that
Watson generates a set of queries which it runs against its own internal
search engine, ranking answers presumably based on the quality of the initial
search and the confidence of the answer. It would be very, very cool to have
an example.

All in all, it whets the appetite but leaves one wishing for heartier fare
(or a job at IBM!).

~~~
ppod
A while ago I submitted a link to a blog post I wrote ( <http://bit.ly/igJeRB> )
that goes into the system in a bit more depth, based on a paper IBM published
- there's a link to the paper (open access) there too.

For your cases, from what I got from reading papers:

(1) You could spend two weeks only reading papers on noun-compound semantics.
Try a Google Scholar search just to get an idea of the volume of research. A
simple technique to test how idiomatic a phrase is would be a Bayesian-type
test: how many times "Treasure Island" occurs in a corpus, divided by how many
times "Treasure X" and "X Island" occur. In this case the capitalization
probably cues it to look up Treasure Island in Freebase. Interesting thought,
actually -- do the contestants also get the clue as text? I think they do, so
they get capitalization.
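A minimal sketch of that ratio test (the toy corpus, the `compound_score`
name, and the exact normalization are my own illustrative assumptions, not
anything from the IBM paper):

```python
from collections import Counter

def compound_score(tokens, w1, w2):
    """Rough idiomaticity score in the spirit of the comment above:
    how often the exact bigram (w1, w2) appears, divided by how often
    w1 appears before *any* word plus how often *any* word appears
    after w2's left neighbor slot. Fixed compounds score high because
    the two words rarely occur in that position apart from each other."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    as_first = sum(c for (a, _), c in bigrams.items() if a == w1)   # "Treasure X"
    as_second = sum(c for (_, b), c in bigrams.items() if b == w2)  # "X Island"
    if as_first + as_second == 0:
        return 0.0
    return bigrams[(w1, w2)] / (as_first + as_second)

corpus = ("the Treasure Island book ; Treasure Island resort ; "
          "a big island ; buried treasure ; big treasure ; a big house").split()
print(compound_score(corpus, "Treasure", "Island"))  # 0.5  -- always together
print(compound_score(corpus, "big", "island"))       # 0.25 -- "big" roams freely
```

On a real corpus you'd smooth the counts and compare against a threshold, but
even this toy version separates the fixed compound from the free combination.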

(2) I would be pretty sure these are not Java methods; I'd say they are
logical predicates representing the fact that 'Treasure Island' is a member
of the set of things the predicate names, as returned either by syntactic
processing (island) or by the knowledge bases (WordNet, YAGO, Freebase,
DBpedia).
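To make that reading concrete, here's a hypothetical sketch of predicates as
set membership (the `facts` data and `holds` helper are mine, purely for
illustration -- Watson's actual representation is not public at this level):

```python
# Each predicate names a set of entities; a line like island(Treasure Island)
# asserts membership, not a method call.
facts = {
    "island":   {"Treasure Island", "Oahu"},
    "location": {"Treasure Island", "Oahu", "Paris"},
    "resort":   {"Treasure Island"},
    "book":     {"Treasure Island", "Moby-Dick"},
    "movie":    {"Treasure Island"},
}

def holds(predicate, entity):
    """True if the fact predicate(entity) is asserted."""
    return entity in facts.get(predicate, set())

# Every predicate that holds for "Treasure Island":
print([p for p in facts if holds(p, "Treasure Island")])
# ['island', 'location', 'resort', 'book', 'movie']
```

The ambiguity the original poster noticed is then just the observation that
several predicates hold for the same entity, and the system has to rank them.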

(3) There isn't a worked example in the paper, but my reading is that this is
basically Watson's way of figuring out what queries to run against its
unstructured text corpus (they have a corpus of web snippets indexed with
Lucene).
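A hypothetical sketch of what that query generation might look like (the
`build_queries` function and its parameters are my own guesswork at the idea,
not the paper's actual method; a real system would hand each string to a
Lucene index):

```python
def build_queries(clue_phrases, keywords, assumptions):
    """Expand one clue into several query variants: exact phrases,
    a plain bag of keywords, and keyword sets augmented with each
    semantic assumption (e.g. the answer is a book, or a movie)."""
    queries = ['"%s"' % p for p in clue_phrases]        # exact-phrase queries
    queries.append(" ".join(keywords))                  # bag-of-keywords query
    for assumed_type in assumptions:                    # one variant per assumption
        queries.append(" ".join(keywords + [assumed_type]))
    return queries

qs = build_queries(
    clue_phrases=["Treasure Island"],
    keywords=["Treasure", "Island", "novel"],
    assumptions=["book", "movie"],
)
print(qs)
# ['"Treasure Island"', 'Treasure Island novel',
#  'Treasure Island novel book', 'Treasure Island novel movie']
```

Running all variants and scoring the merged hits would match the description
of ranking answers by search quality plus answer confidence.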

~~~
natrius
For more on (2), see <http://en.wikipedia.org/wiki/First-order_logic>

------
trickjarrett
Some interesting nuggets in here, I had watched the Nova specials on Watson
etc. I would have liked to have had a question about the team, their work
stress etc., but otherwise a fun read. I especially enjoyed the step by step
parsing and examination of a question in the process of how Watson would work
through it.

------
baddox
One could make the argument that since Watson is trained on English
information and English Jeopardy! clues, English _is_ Watson's native
language. Sure, there's Java down to assembly beneath Watson's understanding
of English, but the same goes for native English-speaking humans. English
speakers aren't biologically any different from, say, French speakers.

