Hacker News
IBM's "Watson" finally ready for prime-time Jeopardy (hpcwire.com)
156 points by shawnee_ on Feb 10, 2011 | 142 comments

This is going to be, for lack of a better word, epic. On a personal level, however, it's a reminder of how truly wondrous the future is.

When Deep Blue beat Kasparov I was a kid in an Italian high school, trying to explain to uninterested fellow students why it was a big deal.

Not even in my wildest dreams would I have imagined that I'd end up working for the company that made both that and this event possible, in a country other than my own.

Sorry for the slightly off topic comment, but I just wanted to remind everyone that the future can be a beautiful and surprising time if you stick around long enough to witness it.

I love seeing this type of response. It's similar to the race to put a man on the moon. Some technology should be there to inspire.

Not just inspire, but push the envelope. The fallout of traveling to space not only unified the country but provided new technologies which simply had no need to be created until something as epic as landing on an orbiting satellite was attempted. These actions pay dividends unrelated to the mission at hand.

By 'orbiting satellite' I'm guessing you meant the Moon, but I've personally always been most amazed not by the moon landing per se, but the subsequent return from the surface to dock with the waiting CSM.

The Lunar Module is an amazing piece of engineering! It can land, serve as the base for excursions lasting more than three days, become an impromptu rocket launch pad, and it is eminently dockable. All that capability fits in a cube about five meters on edge and weighs less than 15 tons; it was developed and built in well under ten years, and it worked every single time.

Some like the raw power of the Saturn V, I think the multi-talented Lunar Module is much more amazing.

Reminds me of this Brad Paisley video: http://www.youtube.com/watch?v=Y0Yg9wjctRw&feature=relat...

Watson's defeat of top Jeopardy! players, while an impressive feat of natural language processing, is not as impressive as a defeat of a grand master, for one reason.

The reason is that when multiple contestants know the correct response in Jeopardy!, it all comes down to reflexive timing. It's no surprise that a computer could buzz in faster than a human being, and there's no evidence that Watson knows more correct responses than the human competitors.

I don't think you fully understand what is going on here. Watson will process the slang in the sentence, figure out how the words are being used, understand the context of the question, and then look through its internal databases to see if it can piece together the different bits of knowledge it has to come up with an answer.


What led you to believe that I don't understand what's going on? I definitely understand and appreciate IBM's accomplishment. I'm not minimizing the coolness of Watson being able to come up with so many correct responses, I'm minimizing the importance of its actual Jeopardy! gameplay. I'm not sure you understand the real game mechanics of Jeopardy! With highly proficient contestants, it comes down to being able to buzz in as soon as the light turns on indicating that the clue has been delivered by the host. All three contestants know the correct responses to most clues, and it really does come down to the buzzer reflexes, something Watson clearly has a huge advantage in.

Again, I fully understand and appreciate what an advance in natural language processing and sheer computation Watson represents. Even being able to match a below-average human Jeopardy! contestant would be impressive. However, the true game-winning technique in Jeopardy! (when all three contestants are highly skilled) is buzzer reflexes.

I think you are exactly right. Other commenters seem to be misunderstanding you.

It's clear that this is a huge achievement. However, it is different from a computer beating the human chess champion. All this will prove is that a computer is about on par with the best humans, not strictly better than them.

(And, that it is strictly better at timing its buzzer response, which is completely not impressive for a machine to excel at.)

I'd be curious to know Watson's mean response time. That would shed some light on the topic.

But you still give Watson too little credit here. You have typically 3s from when the "answer" is shown to when you can click the buzzer. So in 3s Watson has to go through all of its databases and semantic links and formulate the question, from an answer that often doesn't even make sense to probably half of the US population. That's pretty impressive.

I understand and appreciate what Watson is doing. It's awesome and ground-breaking. However, with contestants of this skill level, most of the responses will be known by all contestants, and the real competition is how fast you can buzz in after the light turns on indicating the clue has been completed.

Think of it this way: if you had a human standing in Watson's spot, with a monitor displaying Watson's suggested response, I don't think that human would perform as well against the other human contestants. Sure, Watson would often come up with the correct response while the clue was being spoken by the host, but the human would still have to buzz in when the light turns on. If you buzz in too early, you get locked out for something like a quarter of a second. Watson would still be extremely awesome technology, but I don't think a human with access to Watson's output would compete at anywhere near the level that Watson did in the video.

with contestants of this skill level

Do you see what you're saying? "Contestants of this skill level"? It's Ken Jennings, Brad Rutter, and WATSON. Discussion over. The fact that there is a computer that can answer questions in 3s at the level of the two greatest trivia players in history is THE story.

Some jeopardy answers:

Anagrammed Animals -- A furry little pet: the rams

Some say the Bush administration's domestic spying conflicts with FISA, the Foreign Intelligence this Act

"R"2"D"2 -- More rubicund

Half the people in the US probably couldn't answer any of these at all, much less necessarily even know what the answer was referring to, period.

The fact that the computer may have an advantage at pressing the buzzer against the two best buzzer pushers of all time is a pretty small deal. IMHO. :-)

I am expressing marvel that a computer can respond to many Jeopardy! clues in 3 seconds. That is marvelous, amazing, and awesome. Don't infer that I am undervaluing that achievement.

The additional fact that this computer can beat human players is not impressive to me, because I know how Jeopardy! works. To sum everything up: it's a great feat of artificial intelligence and computer performance for Watson to generate correct responses so quickly; it's not a great feat for Watson to be able to buzz in faster than human competitors. I think IBM could have chosen a better sort of competition to truly show off Watson's abilities than a competition with reflexes as the final layer of competition.

I just submitted this story (http://ibmresearchnews.blogspot.com/2010/12/how-watson-sees-...), but here is a quote from it that was interesting:

"The best human contestants don’t wait for, but instead anticipate when Trebek will finish reading a clue. They time their “buzz” for the instant when the last word leaves Trebek’s mouth and the “Buzzer Enable” light turns on. Watson cannot anticipate. He can only react to the enable signal. While Watson reacts at an impressive speed, humans can and do buzz in faster than his best possible reaction time."

I suspect if anyone is good at anticipating the buzzer it is Jennings and Rutter -- the two best Jeopardy players in history. At least until next week...

I'm skeptical of that. Is Watson's computer-to-solenoid path really slower than a human's brain-to-muscle path? Sure, an anticipatory human could buzz in after the light is electronically triggered but before the luminescent fixture (LED, or whatever) fired up, but humans will also undoubtedly jump the gun and buzz in too early, causing them to be locked out. To say that a human, even a seasoned Jeopardy! player, could beat a computer and solenoid any significant portion of the time is something I have a lot of trouble accepting.

I don't think it's all that hard to believe. Remember that Watson also has to depress a physical button (the same buzzer everyone else uses).

The eye to finger path for humans is about 200ms. It probably takes about 100ms for Watson to physically press the button. So Watson is about 100ms faster. But that also gives humans about a 100ms window in which to beat Watson. This means that you need to start your press 100-200ms before Trebek finishes his last word.

That's a pretty good-sized window for most people, given that you are reading the question along with Trebek. If the person who turns the light on is very consistent, I think a human who is good at this could consistently beat Watson.
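The 100-200ms arithmetic above can be sanity-checked with a toy Monte Carlo. Every figure here (Watson's 100ms press, the human's anticipation accuracy, the quarter-second lockout) is an assumption for illustration, not a measured value:

```python
import random

random.seed(1)

# Assumed latencies in seconds -- illustrative figures from the thread,
# not measurements.
WATSON_REACTION = 0.100  # enable signal -> solenoid press
LOCKOUT = 0.250          # penalty for buzzing before the enable light

def human_buzz(aim_mean=0.05, aim_sd=0.06):
    """An anticipating human plans a press near t=0 (the enable light)."""
    planned = random.gauss(aim_mean, aim_sd)
    if planned < 0:
        # Jumped the gun: locked out, effectively buzzing much later.
        return planned + LOCKOUT
    return planned

trials = 100_000
wins = sum(human_buzz() < WATSON_REACTION for _ in range(trials)) / trials
print(f"human beats Watson on roughly {wins:.0%} of buzzes")
```

With these invented numbers the anticipating human wins a bit over half the buzzes; shrinking `aim_sd` makes the human win more often, which matches the point about a very consistent light operator.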

How many people in the US could answer these if given access to wikipedia, the OED, etc? Nearly anyone.

So you're saying that a human aided by a computer, internet connection, and access to additional data can answer intentionally formed trivia questions as well as an unmanned computer (although I suspect a fair bit slower)?

This is the biggest innovation in computers in my lifetime and kids are like, "but it's no Twitter".

Watson doesn't have internet access. It has just learned from thousands of sources and thousands of Jeopardy! questions. Just like human Jeopardy! contestants.

I'm saying that Watson's advantage in trivia-answering comes mainly from aptitudes which are already well-known to favor the computer.

Watson's human-level performance at Jeopardy comes from the combination of highly superhuman data retrieval with highly subhuman language processing.

Watson is more a spectacle than an innovation.

And here we see the AI paradox..

As soon as AI succeeds at something, it is simple.

Are you an AI? Your response is completely generic, taking into account none of the specifics of my argument.

"Reverse primary thrust, Marvin." That's what they say to me. "Open airlock number 3, Marvin." "Marvin, can you pick up that piece of paper?" Here I am, brain the size of a planet, and they ask me to pick up a piece of paper.

Marvin the Paranoid Android, in HHGG

The main difference between chess and Jeopardy is that Watson has to interpret ambiguous (and awkwardly presented) "answers" and work out what a suitable "question" would be. Both of these are really difficult. Watson needs a lot of natural language understanding across all linguistic domains just to be able to search its databases and rule engines and what have you.

Chess, in contrast, is a totally deterministic game and the game state can easily be fed into a computer. That computers are able to beat humans at that comes as no surprise.

If Deep Blue had tied Kasparov, it would still have been a tremendous achievement. I'd say the victory was only minimally more impressive.

Human reflexes, when primed for a signal, can easily respond within 0.2 seconds, and often much faster; this is the sum total of the advantage that Watson has. It is irrelevant: what matters is that Watson has probably reached the stage where it can compete with the best humans in a game involving abstract reasoning and natural language processing, and do it fast enough that people can start arguing about fractions of a second!

I've been working on the periphery of this project for about a year now (my company created the on screen "avatar" that's shown at Watson's podium on stage) and it's been amazing to watch as Watson has progressed.

Some of the early matches I got data from were downright funny. Lots of nonsense answers and weird correlations that kinda made sense but made it obvious Watson didn't really understand the problem space.

In case you're interested, the avatar gets realtime data from Watson and visualizes both Watson's internal state and the game state. The core bit is a collection of "threads" that swarm around the surface of a sphere. The speed, variability, color, and length of the threads are all tied to the data we get from Watson. The colors roughly correspond to confidence, and when the threads bunch up it has to do with what Watson is "doing" (i.e. if he gets an answer wrong the threads slow down and gather at the bottom of the sphere; if one of the other contestants is answering, the threads gather on that side of the sphere; etc.).

The designer and my team agreed early on that there would be exactly 42 threads. ;-)

Oh, and when Watson speaks, the threads push off the surface of the sphere in proportion to the intensity of the audio. It also makes a subtle glow in the center of the sphere brighten, in an homage to HAL.

Humanity is closing in on building machines to pass the Turing Test. Watson beat both champs in the demo round a few weeks ago and I fully expect him - yes, him - to beat them in the match coming up. I can't wait to watch this live. What an epic moment in the process of moving slowly from weak AI -> strong AI.

Watson can do amazing NLP (and presumably Machine Learning), which is something that the general public perceives as straight up "AI". NLP has been lagging far behind expectations for decades, but with Google's new Translate apps and Watson competing on Jeopardy, it seems like NLP is pretty close to being fully solved.

Very exciting. Very, very exciting.


> Eventually the machine will prevail.

This sends chills down my spine.

Watson is amazing, but I've known for years that it was coming. You just have to look at Google and Wikipedia to know that it was coming. Because answering trivia questions is just not the equivalent of passing the Turing test. Trivia questions are quite guessable.

I played a lot of College Bowl (and its local high-school equivalent) back in my younger days. And there were many running gags about the strategy. For example, if a question contained the words "Name this artist..." there was a strictly limited set of possible answers, defined roughly as "artists whose names appear in a typical high school art appreciation class". Similarly, "name this composer" is more likely to be cluing Beethoven or Bach or Mozart than Grieg or Saint-Saens, and almost certainly not anybody more obscure than that. And Grieg is not that obscure.

There were a lot of common themes: It was a good idea to memorize the names of all the Greek, Roman, and Norse gods; the names of all the types of clouds; the capitals of countries and the names of political leaders. That sort of thing.

In College Bowl (unlike Jeopardy) it is legal to press the buzzer and interrupt the questioner at any point. The questioner just stops reading until you've taken your guess. And so every College Bowl geek knows a story like this: Someone accidentally presses the buzzer before the person reading the questions has said more than one or two words. So, with nothing else to do but guess, the player shouts something out. And that guess is correct.

I've never seen a zero-word guess work in person, but I've seen a few two and three-word guesses succeed. Given just a hint of the topic ("this scientist...") the odds are surprisingly good. The search space is just not that large.
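A toy version of that small search space: given nothing but the category cue, guess the answer with the highest prior. The priors below are invented for illustration, not real clue statistics:

```python
# Invented priors: rough frequency with which each name answers a
# "name this <category>..." clue in a quiz setting.
PRIORS = {
    "artist": {"Picasso": 9, "Michelangelo": 7, "Monet": 5, "Dali": 3},
    "composer": {"Beethoven": 10, "Mozart": 9, "Bach": 8, "Grieg": 2},
    "scientist": {"Einstein": 10, "Newton": 8, "Curie": 6, "Darwin": 5},
}

def two_word_guess(category):
    """A 'buzz on two words' guess: all you heard was 'this <category>'."""
    candidates = PRIORS[category]
    return max(candidates, key=candidates.get)

print(two_word_guess("composer"))
print(two_word_guess("scientist"))
```

The odds of a guess like this landing are surprisingly good precisely because the realistic answer set per category is so small.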

One reason why trivia is easy is that questions that require even the smallest amount of actual thought tend to be too ambiguous for good trivia. Trivia questions are generally read, answered, and judged by people who have no actual expertise in their subject matter, so it's important that the answers be very clear and unambiguous. I used to dread the physics questions, because I was a student physicist, and I understood the questions too well. So it was sometimes difficult to avoid giving answers that were technically correct, but that did not match the one on the card.

My college didn't have a College Bowl team, but having competed on high school teams and been responsible for more than a few correct two- or three-word answers, I can vouch for this.

It's all about having enough breadth that the other bits of information presented can help whittle down answers by way of association and relationships. Good trivia players rely quite a bit on intuition.

Ken Jennings mentioned (in his book) how Jeopardy! often has questions about rivers, so it was a fairly simple matter to just memorize river names and some basic associations with which to answer questions.

It seems like Watson is mostly a really sophisticated algorithm on top of lots of brute force (2880 processors?!). Though I guess it could also be said that humans are just ugly bags of mostly water.

That said: in something like medicine (differential diagnoses), such a system could also be extremely useful, given the frequent occurrence of a fairly limited number of "answers".

It could make a decent living in tech support too.

I had the same thought. I think Watson would make a fabulous resource for medical diagnoses. It shouldn't just give the "answer", though, but the top 10 or so matches along with their probabilities.
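A minimal sketch of that top-10-with-probabilities idea, with entirely invented candidate names and evidence scores (the softmax normalization is my choice here, not anything Watson is documented to use):

```python
import math

def ranked_candidates(scores, top_k=10):
    """Turn raw candidate scores into (answer, probability) pairs via softmax."""
    exps = {name: math.exp(s) for name, s in scores.items()}
    total = sum(exps.values())
    ranked = sorted(((name, e / total) for name, e in exps.items()),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

# Invented evidence scores for a hypothetical differential diagnosis.
scores = {"influenza": 2.1, "strep throat": 1.4,
          "mononucleosis": 0.3, "common cold": 1.9}
for name, p in ranked_candidates(scores, top_k=3):
    print(f"{name}: {p:.2f}")
```

Presenting the whole ranked list rather than a single winner is exactly what a physician would want from such a tool.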

> It seems like Watson is mostly a really sophisticated algorithm on top of lots of brute force (2880 processors?!). Though I guess it could also be said that humans are just ugly bags of mostly water.

Arguably Ken Jennings is mostly a really sophisticated algorithm on top of lots of brute force (billions of neurons?). And computers are just pretty blocks of mostly silicon. :)

Trivia questions are quite guessable.

I think you overestimate human interaction. :-)

Look at how many people will have full on conversations with bots. And these aren't even good bots.

An IRC-based Turing test will be passed in the next 15 years. Take the corpus that we have with Google/Bing/Wikipedia, and add to it emails from GMail/Yahoo/Hotmail, IMs and text messages, Facebook messages and statuses.

The computer will not only be able to hold a conversation, but it will be able to go from talking to you in the Queen's English to ebonics. And it will be the most knowledgeable person you've ever met.

Watson is a simulation, a clever puppet designed to mimic human behavior. We have no more to fear from it than from a Punch and Judy show.

Cleverer and cleverer puppets will not become intelligent machines. A dedicated effort to build a massive neural cluster simulation might become intelligent, but it would think such incomprehensible thoughts as to be fairly useless to us (think of an intelligent rhododendron; what would you have to talk about?).

We will know when Watson is dangerous when it feels fear, angst, want. Not just an algorithm to sort facts and simulate speech.

Watson is not a simulation any more than your brain is a simulation. Your brain is (likely) mostly just algorithms and data structures, albeit on an enormous scale. This could be wrong, but I'm betting not. What else could the brain possibly be? It's not magic.

> We will know when Watson is dangerous when it feels fear, angst, want. Not just an algorithm to sort facts and simulate speech.

Nah. Look at good science fiction to see why this is false. The most dangerous AI ever conceived might be something like Skynet or HAL 9000: machines that are cold, calculating, and have no emotion whatsoever. But they do hold weapons, they do speak English, and they are intelligent. I'm not suggesting we'll ever build anything like what we see in science fiction, but good writers are often able to predict, or even influence (see Clarke, Asimov), the future.

Algorithms to sort facts are indeed the beginnings of machine intelligence.

I deny that the human brain IS algorithms. It is neurons connected by dendrons/synapses. Right?

Small bits of it can be algorithmically simulated. Large processes can be algorithmically simulated. But to call the algorithm "intelligence" is sympathetic magic.

Algorithms work in a different way; they break in a different way; they are hard-coded so don't change. They are a simulation.

No disrespect intended, but I think you are lacking a fundamental understanding of machine learning, genetic algorithms, neural networks, and AI in general.

I was introduced to Watson when I took on a sub-project from IBM related to it; I had no idea what it was until taking the contract. It is an impressive feat, without a doubt the state of the art in NLP AI. Suffice it to say, AI algorithms differ from the static and rigid structures that most developers write for business, web, or mobile apps (even the good developers who decouple things); the scale of the difference is orders of magnitude.

In many cases machine learning has the ability to generate new code based on learning, and to build new connections and new algorithms to deal with learned problems. Applications literally generate new and novel applications and then connect themselves to these new nodes.

This is not the stuff usually broached when you just need to credit a line in accounts payable or get the sales volume for last month. There is a huge gap between the code used to write business applications and the structures used in AI.

This is not meant as a critique of you or anyone in particular, and it is not intended to belittle business, mobile, or web developers, but rather to inform people who have not worked in the field of AI that their understanding of software development has little relevance when applied to it. It is literally a different world, where the concepts are totally foreign to a non-AI developer.

I've only spent a relatively short amount of time studying the subjects you mentioned, so correct me if I'm wrong, but the meta-programming aspects you're mentioning here are either highly exaggerated or significantly less impressive than you make them sound.

Machine learning is really computational statistics: it applies fairly standard and well-understood techniques to fit a function to a noisy data set. Genetic algorithms and neural networks are really fancy words for optimization algorithms; they're merely a set of tools (not unlike hill-climbing) for searching a large space. The de-facto books on AI are PAIP and PPAI. I've read both, and the example programs there, while very interesting, are not much different from a combination of reasonably clever techniques.
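The hill-climbing comparison can be made concrete with the textbook version: random-restart hill climbing maximizing a bumpy one-dimensional function. This is a generic illustration of the optimization framing, not a description of Watson's internals; the objective function and all parameters are invented:

```python
import math
import random

random.seed(1)

def hill_climb(f, start, step=0.1, iters=1000):
    """Greedy local search: accept a random neighbor only if it improves."""
    x, best = start, f(start)
    for _ in range(iters):
        candidate = x + random.uniform(-step, step)
        score = f(candidate)
        if score > best:
            x, best = candidate, score
    return x, best

def with_restarts(f, n=50):
    """Random restarts let the search escape local optima."""
    return max((hill_climb(f, random.uniform(-3, 3)) for _ in range(n)),
               key=lambda pair: pair[1])

# Bumpy objective: many local maxima, global maximum near x = 0.27.
f = lambda x: math.sin(5 * x) - abs(x)

x, best = with_restarts(f)
print(f"best x = {x:.3f}, f(x) = {best:.3f}")
```

A single climb usually strands on whichever local hump it starts in; the restarts are what reliably find the global one, which is the whole "searching a large space" point.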

"Generating new code" is the same thing as generating a data structure and running a predefined interpreter over it. These systems do that, but in a much more restricted way than you imply. They certainly don't design new algorithms in an intelligent fashion, merely use a set of predefined inference rules, not unlike any other rewriting system.
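That "data structure plus a predefined interpreter" view can be shown in a few lines. The rule format, features, and labels here are all invented; the point is only that the "generated program" is inert data until a fixed interpreter runs it:

```python
# A "learned program" represented purely as data:
# ordered (feature, threshold, label) rules.
rules = [
    ("temp", 39.0, "high fever"),
    ("temp", 37.5, "mild fever"),
]

def interpret(rules, record, default="normal"):
    """The fixed interpreter: first rule whose threshold is met wins."""
    for feature, threshold, label in rules:
        if record.get(feature, 0) >= threshold:
            return label
    return default

# "Generating new code" is just appending data -- the interpreter never changes.
rules.append(("cough_days", 14, "chronic cough"))

print(interpret(rules, {"temp": 38.0}))
print(interpret(rules, {"cough_days": 21}))
```

The system's behavior changed without any new algorithm being designed, which is exactly the restricted sense in which such systems "write code".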

I don't know anything about Watson, but it is a well understood fact that every AI system to date is nothing more than a clever marionette (and it's very unlikely that this will change for a very long time). You can't just throw terms around - show an example. In every case so far a result that initially appears impressive, when understood, is immediately disappointing. They're all clever, but they're a far cry from "self-learning systems" for any reasonable definition of the word "learning".

but the meta-programming aspects you're mentioning here are either highly exaggerated, or significantly less impressive than you make them sound.

I don't think I have exaggerated. The fact that current AI systems are built by developers with a predetermined set of rules is not in dispute; those rules constrain what an AI system will generate. This is analogous to the function of serotonin in the brain, whose level directly affects aspects of our state (happiness, empathy). Machine learning employs the same pattern: here is the serotonin (data) and here is the serotonin regulation mechanism (algorithm), but the machine learns (like a drug abuser) to defeat the regulation mechanism. Nonetheless it has to work within the constraints of the system, just as our minds have to work within the constraints of their biological functions; it works within those constraints to find ways to adapt the system. Whether this is impressive or not is subjective to the observer (I personally think it is). Contrasted with biology it is pretty primitive; reward logic is pretty low-level when it comes to biology. Nonetheless I think it is impressive.

I think that when talking about AI a lot of people confuse consciousness with intelligence. While evidence suggests that intelligence is a prerequisite for consciousness, the converse cannot be said. I think we have made great strides simulating the constructs of intelligence on a mechanical level. As for consciousness, we have to master the former before we will know how to tackle the latter. And when most people think about AI they think about the latter, which sets a pretty high bar when measuring the state of the art.

I agree. Consciousness is not required for intelligence. Before I read your post, I was going to mention as well that people's definition of intelligence is not only extremely wide-ranging, running the gamut from memorizing digits of pi or spitting out trivia to self-awareness and introspection, but also inconsistent, changing scope and form based on whether the entity in question is autistic or a machine.

I muse that there could be a race of AI-type intelligences for whom genius is measured only by the ability to create moving works of art, while any fool can perform advanced mathematics and exceedingly complex computations involving a vast number of variables.

You make a large mistake here. I feel that the only way your argument holds is by assuming its conclusion, which would be: that what is going on in [some/most] of our brain is something more than, or very different from, 'just' statistical inference and optimization. Right now we don't know whether this is true or not, but more and more people are beginning to think not.

For example, consider that based on the time of day and the knock, you can guess who is most likely at your door. How, and why, can you know this? Have you learned or figured out anything there?

Or say, I tell you that a person is of gender X, race Y and lives in city Z. You will automatically generate an idea of what this person is like. And it will be different from what I would generate and these data points would likely mean nothing of significance to a 3 year old. Why? Because we have learned a model from our past experience/data. Machine Learning also uses generative models to infer situations. And in fact, we humans perform a very weak form of machine learning. It goes by the name of stereotyping or profiling.

When you are trying to figure something out, the process is not some clean, logical, step-by-step deduction. It is more like a search with dead ends (local optima), backtracking, and restarts: trying and throwing away different ideas. Or take learning a new sport, dance, or flip: you do not work through the physics of the situation to figure out the correct amount of impulse to apply. You try again and again, statistically converging on a satisfactory approximate local optimum of the correct physics model for the situation at hand.

As for systems which generate code: we can look at this most literally in terms of those which evolve rules in some way, or more loosely by considering that all machine learning does is use lots of data to save the programmer from hand-coding a giant restricted system. Regardless of your stance, these systems differ from mere rewriting in that they are not deterministic; they interact and respond to different situations in varying ways. The more sophisticated methods can develop new algorithms, sets of rules that were not programmed and make no sense to the developer, to cope with their situations. It is true that we provide a base, but that does not mean some limited form of learning is not occurring. What machine learning cannot do that we can is introspect, abstract, and generalize across domains.

I am the reverse of you. Before I picked up machine learning I thought the brain was something special. But now I can't help feeling that we are just clever marionettes, and that what's going on is simply mundane mathematics, a clever co-opting of physics by nature. I find this amazingly beautiful.

I think anyone who has worked in the field of AI would admit there are a lot of philosophical questions around the definition of intelligence and its relationship to algorithmic problem solving.

One of my favourite thought experiments in the area is Searle's Chinese Room:


Sure, intelligence is much like religion or faith: what it is is a very personal matter, as well as a philosophical one, for each observer. To me, if a machine can make a logical decision that it arrives at via its own conclusions, independent of human involvement (after the initial construction), it can be said to be intelligent. I personally think we have already achieved intelligent machines; granted, not intelligent in the traditional biological sense, but intelligent in their own right. I also believe that we are closer to hyper-intelligent machines than many think. I personally think it will be the next big step, and that it will fundamentally and irrevocably change humanity. It is a Pandora's box whose ramifications we will not know until we pass that event horizon; much like Schrödinger's cat, it is full of possibilities, but until we experience the event we will not know how it affects humanity.

If you read further on Searle and the Chinese Room, you'll find that most of his argument has been debunked and what remains is an Appeal to Common Sense. "The man in the room can't really understand Chinese--right guys?"

How is this anything but pure semantics? I still don't know the conceptual difference between Searle's strong AI and weak AI, other than a single-word change in the definition.

It's all about semantics with philosophers ;)

Searle claims through this experiment that strong AI does not exist. The robot in the room doesn't know how to speak Chinese; it just matches up symbols using an elaborate dictionary.

Turing defined intelligence by the appearance of intelligence. If the Chinese room can make you think it houses someone who speaks Chinese, then that means the person in the room is intelligent.

To Searle everything is weak AI: just calculation, without knowing what is really being calculated. Intelligence is more than appearance, just as a hologram of someone is not that person themselves.

Pragmatists and functionalists play around Searle's conclusion. For them it is about the behavior, not the system itself. In Searle's vision something like the China Brain (every person in China uses a radio to act as a neuron) is ridiculous. In the functionalist view the China Brain is intelligent and self-aware.

Arguing that consciousness and the brain are different from deterministic algorithms or neural networks is perfectly valid. It doesn't show a lack of understanding of AI, just that the debater is not an adherent of strong-AI functionalism.

Using Gödel's incompleteness theorems one can argue that no set of algorithms is capable of perfectly modeling human consciousness. A logically correct algorithm cannot give faulty output yet internally conclude that output to be correct. We do not make the same mental steps as a set of algorithms: I can't say this post is correct with 97.77% accuracy. In fact I wonder whether you will respect me more or less after this reply, whether you'll believe I lack understanding of machine learning... not whether I passed the Turing test for intelligence. Calculation != Intelligence.

Like the robot in the Chinese room is still a puppet, that doesn't really understand Chinese. Attaching a radar to a flying drone doesn't make it feel or act like a bat.

Much of AI is still Advanced Informatics.

How are Gödel's incompleteness theorems an issue here? AI does not have to model the human brain by simulating it. It need not be rigorously/axiomatically defined as a decidable formal system, nor does it need the ability to prove its own consistency. Nor does it even need to be consistent, making the GITs inapplicable. Heck, the AI could use paraconsistent logical reasoning or couple Bayesian inference with a suitable multi-valued logic as its base.

I was directly responding to this portion of the parent's post:

Small bits of it can be algorithmically simulated. Large processes can be algorithmically simulated. But to call the algorithm "intelligence" is sympathetic magic.

Algorithms work in a different way; they break in a different way; they are hard-coded so don't change. They are a simulation.

By the structuring of his description, it is apparent that he is reasoning from an application (or "computer", if you will) developer's perspective. My point was that it is flawed to look at AI software as rigid structures; applying traditional development patterns does not reflect the realities of AI development. Put simply, the description of AI as hard-coded paths and (implied) developer-generated algorithms is in no way factual.

I agree with you on that. AI has advanced beyond rigid structures and scripts. Using a system that operates with fuzzy logic, or building a neural network and teaching it till you, the programmer, can't make heads or tails of its computations and derivations, is a wonderful thing indeed.

As an aside: just as Penrose chose quantum physics (a mysterious thing) to explain consciousness (another mysterious thing), and therefore didn't succeed in convincing others, so we should guard against using a fuzzy, complex, black-box, dynamic system (a mysterious thing) to explain (or fully model) consciousness and human intelligence.

We just replaced the wonder with another wonder :)

"I deny that the human brain IS algorithms. It is neurons connected by dendrons/synapses. Right?"

"I deny that Watson IS algorithms. It is transistors assembled into logic gates. Right?"

My statement is obviously a silly thing to say. Watson is both algorithms and transistors, depending on how you care to think about it. If you believe that your statement is not equally silly, please explain.

Well HAL 9000, while cold and calculating seems to be scared when he's being turned off. So I'm not sure I agree there are no emotions. Who knows, maybe things like emotion just appear as side effects of the simulation once it becomes complex enough.

I'm not sure using a fictional computer is helping your argument.

My only argument is that I disagree that the fictional computers named showed no emotion.

A simulation of what?

It can actually answer Jeopardy questions, as well as or better than the best Jeopardy player on the planet. (We will see.) Another task, previously only conquered by evolution, now conquered by algorithms written by humans. Watson doesn’t pretend to be good at Jeopardy, it just is.

> It can actually answer Jeopardy questions.

I think you mean, it can actually question Jeopardy answers.

Sigh... I always hated that about the show.

That sounds like a fun diversion!

No, I most definitely mean that it can actually answer Jeopardy questions. Formulating sentences oddly doesn't turn questions into answers and answers into questions. You could define "question" and "answer" as sentences with a certain syntax but that would be an extremely stupid definition.

> Another task, previously only conquered by evolution

I'm pretty sure that Jeopardy performance as such has not been strongly selected for among the ancestors of present day H. sapiens.

(And the related things that may have been -- effective language use, good memory, etc. -- are not so obviously "conquered by algorithms written by humans" however well Watson does.)

That sentence was written for its poetic impact, not its accuracy, I think it’s nevertheless somewhat defensible. Humans were obviously never selected for playing Jeopardy and evolution didn’t in that sense “conquer” Jeopardy (like dinosaurs “conquering” flying) but evolution nevertheless produced humans which are very good at playing Jeopardy. (We know of nothing else in this universe that can even only understand the rules of Jeopardy. Except for human-built machines.) That we can build a machine which is just as good as we are at playing the game obviously doesn’t mean that this machine is as intelligent as humans, it’s nevertheless quite impressive.

It can simulate an intelligence answering Jeopardy questions.

It actually answers questions, no trickery involved. It might not use the same algorithms humans use to answer those questions but that isn’t really that important, nobody ever claimed it does.

I don’t see the simulation. I see a machine that is genuinely good at playing Jeopardy.

And how will we know that Watson actually feels fear, or will we just get someone else saying "Watson is a simulation, a clever puppet designed to mimic human behavior" response?

Anyway, the answer to this type of conundrum is simple and well-known: http://en.wikipedia.org/wiki/Turing_test

Of course it should be noted that we aren't there yet with Watson. Even in the restricted domain of a game show, when Watson fails, it fails in a different mode than humans do, so we can identify that it isn't actually a human.

Personally I don't think the answer is the turing test, it is a limited test of intelligence.

There's also no guarantee we wouldn't be able to brute-force it in time, just as Deep Blue was mere brute-forcing. Impressive technologically, disappointing intellectually.

When I think of intelligence I think of intuition, creativity, flexibility and the ability to learn. The turing test is a simple test along a long path.

I don't think the answer of your conundrum is simple and well known.

You don't understand what the Turing Test is all about. It isn't testing intelligence, it is an attempt to answer the question "Can machines think?". It's all there in that Wikipedia link that I posted.

Also, what makes you think that what goes on in your own brain isn't just another form of "brute-forcing"? What makes one implementation of thinking "thinking" and another form just "brute-force"? And then, brute-forcing is the technique of trying every possible solution - a Turing Machine must be able to move in a near infinite solution space, so brute-forcing is not actually possible.

It's really more like "Can we fool ourselves into believing that machines think?" which isn't actually any different from the question "Can we fool ourselves into believing other people think?"

He does understand. The Turing test is to answer if machines can feign human intelligence. To pass a Turing test, feigning (not necessarily possessing) intelligence is enough.

The answer to insight into a system's qualia is not a Turing test:

Watson, tell us, do you feel fear? *Crunching through some look-up tables* "Yes!"

What did that prove about Watson feeling fear or not? We could have Alice talking with Watson for hours on end about fear and daddy issues, without either of them even knowing the taste of fruit.

That is why the Turing test is an incomplete test for human-like machine intelligence. You seem to equate feigning intelligence, with possessing intelligence. Feigning fear with possessing fear.

Ugh. No, I repeat, the Turing Test is not about intelligence - it's about whether or not the entity in question can think. Please, read the link I posted, it's all in there.

Also, where on Earth do I equate feigning intelligence with possessing intelligence? You're just making that up.

I had to read the original paper in the first week of college, so my knowledge may be a little rusty, but wasn't the paper titled: Computing Machinery and Intelligence?

I'm sure the question was asked: Can machines think? But passing the Turing Test doesn't answer this question. It answers a different question: What will happen when a machine takes the part of A in this game? Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman?

It's up to you whether you take an affirmative answer to those questions to also answer the question it replaces: "Can machines think?" The imitation game does not aim to prove anything on this point. It aims to support its replacement questions, which we have to take at face value: that imitation is just as good (for all intents and purposes) as the real thing.

Also, you write: >And how will we know that Watson actually feels fear? The answer to this type of conundrum is simple and well-known: a Turing test.

If I make a chatscript that tricks you into thinking it was a person feeling fear, then by your own logic you would deem that chatscript to be thinking and actually capable of intelligent conversation. Even if you were told afterward that it was just some lines of code fooling you with an (if input, output: yes), it still passed the test, so your answer is: those binary bits must feel fear (or you agree that you apply the Turing Test where it makes no sense).

What entities are really feeling, and how to communicate this, has no simple known answer. There have been centuries-old debates about qualia. The answer certainly isn't an imitation game; if you insist it is, you are equating feigning with truly possessing. (Or you are suggesting I am a P-zombie, and I take offense to that!)

Passing the Turing Test is a bit (a lot!) harder than fooling someone that isn't paying attention. The computer has to respond reasonably to any conversational gambit offered by someone actively seeking to probe the computer. Can the computer correctly identify a pun? Does it have enough "common sense" to identify nonsense phrases such as "the waterfall ran up the side of the mountain"? Does it understand the difference between a running stream, a running man and a running engine?

If your chatscript can negotiate this type of test, then it would satisfy me that it was thinking.

What is intuition? What is flexibility? How do you test these things in a lab? Turing's test is based on the most straightforward way of measuring intelligence (loaded word, but common; here I'm specifically talking about "intuition, creativity, flexibility and the ability to learn").

If you can have an interview with a computer, and an interview with a human, and be unable to determine which is which (accurately, over several trials), how can you argue that the computer isn't human-like? Most versions of the test have ways of eliminating everything but the straight communication of information (because whether you can build a convincing robot or amazing TTS software is a different problem).

How do you know that I'm not a computer?

> Anyway, the answer to this type of conundrum is simple and well-known: http://en.wikipedia.org/wiki/Turing_test

I prefer the Voight-Kampff test:

1. It’s your birthday. Someone gives you a calfskin wallet. How do you react?

2. You’ve got a little boy. He shows you his butterfly collection plus the killing jar. What do you do?

3. You’re watching television. Suddenly you realize there’s a wasp crawling on your arm.


"Deckard, will I die?"

The biggest problem I have with the Turing test is that many humans can fail it.

The best bots can already carry on conversations better than a low-intelligence human who isn't concentrating on the task.

It's generally possible to distinguish bots by asking more probing questions. The problem with that is as the Watson project shows it is quite possible to build a computer system that can do well on hard questions too.

That leaves emotional-response type questions, but many of those are culturally specific, and so can really only identify something as being either a bot OR a person from a different culture.

My view is that the Turing test is much closer to being passed than people think, if you specify a Turing test that all humans can pass.

> The biggest problem I have with the Turing test is that many humans can fail it.

This is precisely the reason why I think AI has so much potential.

It doesn't have to be smarter than the smartest human to be of any use. If it's at least as smart and knowledgeable as a fairly dumb or ignorant person, then it could be incredibly useful. What's the use of building a fleet of robot servants if we can't load them with an AI which is at least smart enough to carry out chores, follow orders and communicate about everyday objects. We don't have to load them each with Einstein AI, just Bubba AI would be a big win.

Yes, I agree.

Too often researchers think that AI must be perfect under all conditions. The truth is that we adapt our usage of tools to their limitations all the time, which makes me think that AI-powered vision systems that work well in some limited circumstances would still be very useful.

I agree about the simulation part. Does it have a sense of self? Can it pose questions to itself such as "I think therefore I am"?

Watson is completely amazing. The differences between Jeopardy and chess are staggering. I'd even say that in the chess challenge they could kind of cheat: Deep Blue had every 5-piece board configuration in a book, so if it could play down to 5 pieces left, it knew how to play from there; humans have their lookup optimized differently. If this technology can be generally applied, it just seems like a radical tool, radical for medicine, law, and probably revolutionary for other professions where access to information hasn't been as important.

That being said, a really really clever Jeopardy-playing machine just doesn't seem "intelligent" to me. Huge bounds forward, we're making progress, I'm not denying that, but Watson isn't going to slurp in the works of Shakespeare and write down some original thoughts on them, comparing them to current events. Or contrast Wordsworth and Keats. Or suggest a new experiment to further identify envelope proteins on a virus. Or invent a new way to etch semiconductors even smaller than we currently can. Will we get there? Maybe, hopefully, maybe even in our lifetimes, but this is comprehension and search; it's not inventing or creating yet. It seems like half the problem, maybe the easier half.

Still awesome, maybe in a few years we'll have Watsons we can access from our phones or something.

To seem to me like a person, Watson would need more than intelligence, it would need, well... personality.

Watson will answer questions put to it for as long as the power is on. But it will never do anything else. It has no other desires, no motives, no interests. Its planning skills are limited to game strategy. Its conversational skills are limited to preprogrammed social niceties which probably don't extend beyond the context of the game.

You can make a decent case that it's intelligent --- but it's an intelligent machine. You'd have to add all of the above, and probably more, in order to get something that you could interact with even on-line as a person.

[EDIT: I don't mean to deny the possibility of a machine that I would feel comfortable calling a person --- but I saw the Watson demo run, and I haven't seen one yet.]

The thing is, Jeopardy! at a high level of competition boils down to reflexes. Obviously it's very impressive that Watson would even be able to generate correct responses a significant portion of the time, but you must realize that the two human contestants probably knew more correct responses than Watson did. In the practice video, you saw Ken take off on the book titles category, probably because Watson couldn't handle the language of those clues.

This sends chills down my spine.

Allow me to rephrase that: Eventually the machine will better diagnose your symptoms. More accurately than a human doctor could.

Let's see... Start with UMV technology. Add human image recognition systems, natural language processing and speech synthesis, expert system-based medical diagnosis, and, say, a built-in weapons system.

I'll be in my bunker.

One interesting observation (which I don't intend in any way to diminish the significance of this achievement, which I think will be seen as one of the most important milestones in the development of computer science) is that today there is so much information on the web that even Google, with its non-NLP algorithms, can "almost" answer Jeopardy questions.

There is an archive of past Jeopardy questions here:


Try choosing a question and typing the category + the clue verbatim into Google. I've tried this a few times and in most cases the correct answer was in the top couple of sites (usually in the summary text on the Google search page).

Of course there's still the problem of actually extracting the answer from the page and presenting it in the proper form.
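That extraction step can be crudely sketched in a few lines. This is a toy heuristic (count capitalized phrases in the result snippets and pick the most frequent one that isn't already in the clue), nothing like Watson's actual candidate generation; the `guess_answer` function and the sample snippets are purely illustrative.

```python
from collections import Counter
import re

def guess_answer(clue, snippets):
    """Toy candidate extraction: the most frequent capitalized
    phrase in the result snippets that isn't a word of the clue."""
    clue_words = set(clue.lower().split())
    candidates = Counter()
    for text in snippets:
        # Runs of 1-3 capitalized words are crude entity candidates.
        for match in re.findall(r"(?:[A-Z][a-z]+ ?){1,3}", text):
            phrase = match.strip()
            if phrase.lower() not in clue_words:
                candidates[phrase] += 1
    return candidates.most_common(1)[0][0] if candidates else None

snippets = [
    "Herman Melville wrote Moby-Dick in 1851.",
    "Moby-Dick, by Herman Melville, opens with 'Call me Ishmael'.",
    "Herman Melville's novel about the white whale.",
]
print(guess_answer("This author's white whale novel opens 'Call me Ishmael'",
                   snippets))  # Herman Melville
```

Real clues would of course defeat anything this naive, which is rather the point of DeepQA.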

A lot of geeks' girlfriends are going to be disappointed on the 14th (aka Valentine's Day, in case you're wondering ;) ), as they huddle around a TV like schoolgirls watching a Bieber concert.

The crazy thing is that Watson is < 5 years in the making.

On PBS last night - http://video.pbs.org/video/1786674622

The "Making Stuff" series on Nova that's been airing the last week or so is very interesting: Making Stuff Smarter / Stronger / Cleaner / Smaller. Watson was first mentioned during the episode "Making Stuff Smarter" but I guess it merits its own segment.

> Watson is comprised of 90 Power 750 servers, 16 TB of memory and 4 TB of disk storage

4TB disk? something is wrong here.


"Several readers of my blog have asked for details on the storage aspects of Watson. Basically, it is a modified version of IBM Scale-Out NAS [SONAS] that IBM offers commercially, but running Linux on POWER instead of Linux-x86. System p expansion drawers of SAS 15K RPM 450GB drives, 12 drives each, are dual-connected to two storage nodes, for a total of 21.6TB of raw disk capacity. The storage nodes use IBM's General Parallel File System (GPFS) to provide clustered NFS access to the rest of the system. Each Power 750 has minimal internal storage mostly to hold the Linux operating system and programs.

When Watson is booted up, the 15TB of total RAM are loaded up, and thereafter the DeepQA processing is all done from memory. According to IBM Research, "The actual size of the data (analyzed and indexed text, knowledge bases, etc.) used for candidate answer generation and evidence evaluation is under 1TB." For performance reasons, various subsets of the data are replicated in RAM on different functional groups of cluster nodes. The entire system is self-contained, Watson is NOT going to the internet searching for answers. "

Perhaps they meant 360TB, 4TB per machine?

It's possible that it's only 4 TB. If it's only processing text and it stores a character in two bytes, that means 2 trillion characters. At a generous 8 characters per word, that's 250 billion words. There are only 560,000 words in War and Peace, which is a very long book, and 44,000,000 in the Encyclopedia Britannica, meaning you could stuff in ~500,000 of the former or ~5000 of the latter. That seems sufficient for Jeopardy.
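The arithmetic above can be checked in a few lines (decimal terabytes assumed, matching the comment's round numbers):

```python
# Back-of-envelope check of the 4 TB estimate above.
disk_bytes = 4 * 10**12          # 4 TB, decimal
bytes_per_char = 2               # two bytes per character
chars_per_word = 8               # generous average, including spaces

chars = disk_bytes // bytes_per_char           # 2 trillion characters
words = chars // chars_per_word                # 250 billion words
print(f"{words // 560_000:,} copies of War and Peace")        # ~446,000
print(f"{words // 44_000_000:,} Encyclopaedia Britannicas")   # ~5,700
```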

Good spotting.

Eventually the machine will prevail

What a great ending line. The NOVA special last night was great!


Affordable tablets, intelligent computers, commercial space travel... it's so great to live in the future.

Now, won't somebody please design a viable successor to the Concorde?

First we need a viable airline industry anywhere in the world.

Engadget reported that Watson "destroyed" its human competition while others reported that humans were "taken down."

In reality, Ken Jennings was beaten on the final answer.

I'm interested in seeing whether Watson has an "aggressiveness" algorithm that allows it to respond before the answer is fully spoken. Humans have an advantage in this regard because it goes right to the heart of intelligence. If the game boils down to reaction time, Watson will probably win.

"I'm interested in seeing whether Watson has an 'aggressiveness' algorithm that allows it to respond before the answer is fully spoken."

You cannot buzz in until the question is completely read out.


True. But along the same lines...

I wonder if it's able to say to itself "I think I'm in the right ballpark, so I'll buzz now" and then take the 3-4 seconds it takes the host to recognize the buzz and ask for an answer, to finish its processing to get to the answer it's most confident of.

In other words, it doesn't need to actually have the answer to buzz. It just needs some confidence that the answer is close at hand.

Given the number of servers (and cores) plus 16TB of memory, 1ms is an enormous amount of time. 500ms is 1/2 a second.

I read an in depth article on Watson a few months back and I remember that it said it does in fact do that sort of confidence-based buzzing, but I'm not 100% sure.

I'd wager that humans do the same thing (e.g., when they have an answer and it's on the tip of their tongue)

It wouldn't surprise me, when watching (and playing along) I often start saying the "What is" or "Who was" before I have the answer.

I remember reading somewhere earlier that it did have an "aggressiveness" algorithm in the sense that it would take more risks if it was behind, i.e. answer questions with a lower threshold of "sureness".
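The strategy described in the comments above (buzz once confidence clears a threshold, and lower that threshold when trailing) might look something like this. The `should_buzz` helper, the thresholds, and the "trailing badly" rule are all hypothetical illustrations, not Watson's actual tuning:

```python
def should_buzz(confidence, my_score, leader_score,
                base_threshold=0.5, aggressive_threshold=0.3):
    """Buzz when the top candidate's estimated confidence clears a
    threshold; take more risk (lower threshold) when trailing badly.
    All numbers here are illustrative, not Watson's real parameters."""
    trailing_badly = my_score < leader_score / 2
    threshold = aggressive_threshold if trailing_badly else base_threshold
    return confidence >= threshold

# Comfortably ahead: only buzz on confident answers.
print(should_buzz(0.4, my_score=12000, leader_score=10000))  # False
# Far behind: gamble on shakier answers.
print(should_buzz(0.4, my_score=2000, leader_score=10000))   # True
```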

NYTimes had a 'Play against IBM Watson' interactive feature a few months back.

Here's the link: http://www.nytimes.com/interactive/2010/06/16/magazine/watso...

IIRC that does not actually let you play against Watson.

I've looked around a bit but couldn't find any information on how the game state and clues are input to Watson. Are they typed in on the fly? Pre-loaded and just revealed as clues are selected? Is it parsing speech and reading the board visually?

I've heard (from Engadget) that it is given text on the fly. It does not do speech or visual recognition. But it gets the questions at the same time the contestants see them.

Watson's voice isn't menacing enough.

I'd settle for a decent Sean Connery impression.

I'm sad they didn't use HAL's voice for it.

I am intrigued by Watson. NLP is definitely something that has failed to live up to its billing so far. I will be very interested to see what kinds of questions Watson is good at and which he tends to miss, to see if there are any patterns there.

Going forward, the other real questions will be: is Watson overfitted to the problem of solving Jeopardy questions, and how practical is the technology? The former is a real risk to the general applicability of Watson's technologies; the latter is a question of who can afford it. The article mentioned that on commodity hardware, Watson takes about 2 hours per clue. They only achieve reasonable response times by using about 3,000 cores. That limits the potential audience.

Either way, I'm very interested to see what happens next week. I watched the demo videos on YouTube and it was quite cool.

Maybe they should give it the answer "The specifications for a machine to beat you at Jeopardy."

Won't Watson have a significant advantage on the final Jeopardy question where time is not nearly as much of a factor? If it can just keep pace for most of the game and then bet it all on that last question, it should be no contest.

I wouldn't bet on it. In the "preview" round, Watson seemed to come up with its answers more or less instantaneously. (Or at least, before the host finished reading the question.) My guess is the limiting factor is the quality of the underlying dataset and cleverness of the algorithms, not CPU time.

Human contestants are also able to see the question typed out (much like it's displayed on the television), so people are able to do this as well.

I'm usually able to read any given question in about a second. Granted, I used to practice speed-reading trivia questions for an hour or so per day (one of the leagues we competed in projected questions via powerpoint), but there's at least one reason most Jeopardy! contestants are able to buzz in almost immediately on most questions.

Seems unfair for the humans.

Wouldn't a more fair match be a series of individual 1-on-1 matches with Watson and Jennings / Rutter?

The current configuration means the two humans will both share the questions that are naturally difficult for computers, but Watson will dominate all the questions naturally hard for humans.

Alternatively, to make it fair, we would need a 2nd copy of Watson competing, and if the two Watsons buzz at the same time, randomly pick one to answer.

How much of the Watson code is Java?

I don't know the percentages, but Watson is built on UIMA http://uima.apache.org/ which has bits in C++ and bits in Java. From there I am sure different teams used language bindings to use the language that best suited their particular needs to accomplish their tasks. Not that I know it to be true, but I would not be at all surprised to find R, Haskell, and even some Lisp in there somewhere.

Thanks for the link. That looks really impressive. For IBM this is presumably PR for their products/services. I do hope they disclose more of the architecture as I'd love to build out a system like this, even on a smaller scale (takes 6 hours to come back with an answer rather than three seconds).

My understanding is that this is the beginning of their DeepQA product line; my assumption is that the research and portions of the technology will be used for business insight and analytics, to answer what-if type questions.

Just to be clear, I don't work for IBM and I do not know their intentions for the project. I do take projects from IBM and have taken projects related to Watson, but I do not know their plans to monetize Watson; the above is just pure speculation on my part.

I think it would be cool to see IBM's creation versus a creation from Google versus (a person or another machine), although I doubt another company would want to make such a risky move. If Watson wins, huge PR win; if Watson loses, still a pretty big PR win.

> When the software was run on a lone 2.6 GHz CPU, it took around 2 hours to process a typical Jeopardy clue -- not a very practical implementation. But when they parallelized the algorithms across the 2,880-core Watson, they were able to cut the processing time from a couple of hours to between 2 and 6 seconds.

That is an impressive amount of parallelism! This is very back of the napkin (and I realise I'm comparing apples and oranges), but a rough estimate for the time taken if the problem was parallelised with 100% efficiency would be:

(2 hours) / ((2880 * 3.55) / 2.6) = 1.83098592 seconds

What's 3.55?

Oops! The cores in Watson are 3.55 GHz, sorry. I realise that GHz is not necessarily a measure of speed, so this estimate could easily be out by a factor of 2 or 3.
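For what it's worth, the perfect-scaling estimate from the quoted figures works out like this, and the reported 2-6 second real times imply a parallel efficiency somewhere between roughly 30% and 90%:

```python
# Ideal-scaling estimate from the figures quoted above:
# one 2.6 GHz core takes ~2 hours; Watson has 2,880 cores at 3.55 GHz.
serial_seconds = 2 * 60 * 60
speedup = (2880 * 3.55) / 2.6        # assumes perfect parallel efficiency
ideal = serial_seconds / speedup
print(f"{ideal:.2f} s")              # ~1.83 s

# Implied efficiency if the real system answers in 2-6 seconds:
for actual in (2, 6):
    print(f"{actual} s -> {ideal / actual:.0%} efficiency")
```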

A classic use of Watson that IBM is touting is in the field of medical science. Imagine if we could feed the entire set of medical books into Watson (somebody estimated it would take just about a week for Watson to process them and make the connections); then you'd have the Watson physician's assistant, which could listen in on the symptoms and spit out the five most probable causes. That would be so damn amazing!

There are already experimental clinical decision support applications which, given a list of symptoms, can produce the differential diagnosis. But they generally aren't very useful since in most cases the physician can figure out the same diagnosis just as quickly. Real medicine isn't like House, MD.

IBM is more likely to apply Watson technology toward analytics and data mining. There are huge amounts of clinical data locked up in unstructured text reports. If they can analyze that data in a useful way to draw correlations between symptoms, patient demographics, medications, treatments, and outcomes then that could add a lot of value for medical researchers.

They can win at Jeopardy but they can't build a URL structure without annoying machine/network identifiers. :)

That is actually my bad; this URL works too http://www.ibm.com/innovation/us/watson/ - basically this sends a request to the main front-end proxy, and because IBM hosts like a billion URLs, the front-end proxy redirects to the appropriate cluster

Here's a fascinating overview of how Watson (DeepQA) works:


The video goes into some detail, and looks at how Watson analyzes particular questions.

It feels like AI is starting to become what people once thought it could be.

There's a problem with statistical significance given that the match consists only of two games:


I highly doubt that IBM will dismantle the machine when they're done (yes, I know that's what they did with Deep Blue, but this is different). I'd bet that far more than two games will be played with Watson, over the next few years.

Also, as pointed out in that thread, this is no different than a regular Jeopardy game. One game consists of dozens and dozens of questions. If Watson answers them all correctly and the humans get zero points, that is indeed statistically significant.

If the score is close, then repetition in games will be necessary. The problem with the assumption in that thread is that one Jeopardy game is vastly different than the next. In reality, you could string together a bunch of games and call it one game. Or take one game and split it up into 50 "games". The point is, inside one game, there are enough different questions to definitely count as statistically significant.

That depends on what "better" means. If better means actually winning in Jeopardy, then two games is not enough. Final Jeopardy plays a significant role in who wins a game, so it's not just like combining multiple games into one.

Even on the question level, a disclaimer should be given if Watson doesn't answer enough questions for statistical significance.

Finally, the issue here is with contributing to the statistical ignorance of people watching the two game match. Sure, Watson has played many games already before the match but that's not what's being shown on TV.
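On the question-level point, a standard two-proportion z-test is one way to check whether an accuracy gap is significant over roughly the number of clues in a two-game match. The accuracy counts below are made up purely for illustration (assuming ~120 attempted clues per player):

```python
from math import erf, sqrt

def two_proportion_p(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test p-value: is accuracy A really different from B?"""
    pa, pb = correct_a / n_a, correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (pa - pb) / se
    # Normal CDF via erf; two-sided tail probability.
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# Hypothetical: 120 attempted clues each over two games.
print(two_proportion_p(100, 120, 85, 120) < 0.05)  # True: wide gap is significant
print(two_proportion_p(92, 120, 85, 120) < 0.05)   # False: narrow gap is not
```

So both sides of the argument above have a point: a blowout over one or two games is already statistically meaningful at the question level, while a close match would need many more games.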

I wonder whether Watson 'reads' the question or if the questions are fed to it directly as parameters, or if it has to 'listen' to the spoken words and start thinking about its response after that.

We don't need to know math anymore because of calculators; now we don't need to know facts anymore because of Watson.

I guess from their shift to broadcasting opinions that the "news" outlets saw this coming...

When Deep Blue beat Kasparov, IBM quickly moved to shut down the program. Let's hope that doesn't happen in this case, if Watson wins.

That article is sort of bullshit. They already taped the Watson Jeopardy episodes last month, they just haven't aired them.

Watson won.

Thanks for the spoiler. I didn't want to watch it anyway.

When the humans playing against Watson get slapped around it'll be a weird moment for humanity.

Did they give him HAL 9000's voice or am I crazy?

This is the right way to test AI.

"Alan Turing, meet Alex Trebek."
