I'm mostly interested in how much it differs from what IBM Watson does. Does...

nl · on Dec 12, 2014

It's (very) roughly comparable to parts of it.

Firstly: IBM is increasingly using the Watson brand for things that don't appear directly related to the Jeopardy-winning system (e.g., Watson Analytics). When I talk about Watson here I mean the Question Answering (QA) system.

At a very high level, DeepDive consists of a Knowledge Graph (KG) construction tool and a probabilistic querying tool. Compared to Watson, it is missing a natural language question parsing tool and any way of dealing with questions whose answers aren't in the KG.

Watson has (very strong) natural language understanding for multi-clause questions, and the Jeopardy version can do things like understand puns. DeepDive doesn't have anything comparable. In the open source space, the closest thing I'm aware of is SEMPRE[1][2].

Watson also has an evidence scoring module, and my understanding is that this can work against unstructured data. DeepDive doesn't have this, and instead relies on probabilistic inference. This is an excellent approach, but it relies on doing content extraction first (i.e., extracting entities and relationships from text and/or other sources). The Microsoft Probase[3] group has published a lot in this area.
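
To make the "extract first, then query probabilistically" idea concrete, here is a toy sketch in Python. To be clear, this is not DeepDive's API or its actual inference method: the patterns, confidence values and function names (`extract`, `build_kg`, `query`) are all made up for illustration, and I'm using a simple noisy-OR to combine evidence as a stand-in for real probabilistic inference.

```python
import re
from collections import defaultdict

# Hypothetical extraction patterns: (regex, relation name, assumed confidence).
# A real system would learn or calibrate these rather than hard-code them.
PATTERNS = [
    (re.compile(r"(\w+) is married to (\w+)"), "spouse", 0.9),
    (re.compile(r"(\w+) and his wife (\w+)"), "spouse", 0.7),
]


def extract(sentences):
    """Yield (subject, relation, object, confidence) candidates from raw text."""
    for sentence in sentences:
        for pattern, relation, conf in PATTERNS:
            for subj, obj in pattern.findall(sentence):
                yield subj, relation, obj, conf


def build_kg(candidates):
    """Merge repeated evidence for a fact with noisy-OR: p = 1 - prod(1 - p_i)."""
    miss = defaultdict(lambda: 1.0)  # probability that every mention is wrong
    for subj, rel, obj, conf in candidates:
        miss[(subj, rel, obj)] *= 1.0 - conf
    return {fact: 1.0 - m for fact, m in miss.items()}


def query(kg, subj, rel):
    """Return (object, probability) pairs for a subject/relation, best first."""
    hits = [(o, p) for (s, r, o), p in kg.items() if s == subj and r == rel]
    return sorted(hits, key=lambda pair: -pair[1])


sentences = [
    "Barack is married to Michelle",
    "Barack and his wife Michelle attended the dinner",
]
kg = build_kg(extract(sentences))
print(query(kg, "Barack", "spouse"))   # one answer, supported by two mentions
print(query(kg, "Angela", "spouse"))   # nothing extracted, so nothing to return
```

The second query shows the limitation I mentioned: if the fact never made it into the KG during extraction, there is nothing to fall back on, whereas an evidence scoring approach could still go looking in the unstructured text.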

[1] http://www-nlp.stanford.edu/joberant/homepage_files/publicat...

[2] https://github.com/percyliang/sempre

[3] http://research.microsoft.com/en-us/projects/probase/