

Startup Unleashes Its Clone of Google's Knowledge Graph - miket
http://www.wired.com/2015/06/startup-shares-google-knowledge-graph-clone-everyone/

======
cpeterso
The comparison to Google's Knowledge Graph was misleading. The article frames
Diffbot as _a_ database until the last section, where it introduces the
capability for customers to build _their own_ proprietary knowledge graphs.
That's much more interesting.

~~~
miket
Founder here. I wouldn't describe what we are doing as cloning Google's
Knowledge Graph. Rather, we're trying to build a machine that autonomously
learns from reading the web, and makes that structured data accessible and
queryable to any app, smart device, or business that uses data. You can also
tell Diffbot what part of the web to crawl.

Our APIs power everything from consumer services like Instapaper and
DuckDuckGo to enterprise applications like content management and market
intelligence.

~~~
nl
The article mentions that you power the Bing infoboxes (or whatever the
equivalent Bing term is).

How does that relate to Bing's use of Probase (their concept-graph)?

~~~
miket
Unfortunately, I can't speak about this customer integration without
permission. You're welcome to ask questions about Diffbot, though!

~~~
emerongi
You said that Diffbot reads the web autonomously. Is it able to learn a
language on its own? I'd be very interested in this (business-wise) if it
could provide results in my mother tongue.

~~~
miket
Working on it. Because we rely on visual features, it already works decently
well at extracting from international pages. Try running foreign-language
product or article pages through our homepage test drive, or with a developer
token, to see an example.

------
rbkillea
Could someone explain what differentiates the Google Knowledge Graph from a
Markov Logic Network implementation? Is that what it is?

For reference:
[https://sites.google.com/site/slgworkshop2013/accepted_paper...](https://sites.google.com/site/slgworkshop2013/accepted_papers#abstract-paper14)

~~~
nl
MLNs are one possible way to implement the inference component that any
knowledge graph needs.

Google's Knowledge Vault uses a fusion of a number of different extraction
methods. Their exact methodology is laid out in their "Knowledge Vault"
paper[1].

If you want to go deeper (ha!), then DeepDive[2] is open source and pretty
much state-of-the-art. It does inference using Gibbs sampling on a factor
graph (Markov models/MLNs can be represented as factor graphs).

[1] [http://www.cs.ubc.ca/~murphyk/Papers/kv-kdd14.pdf](http://www.cs.ubc.ca/~murphyk/Papers/kv-kdd14.pdf)

[2]
[http://deepdive.stanford.edu/doc/general/kbc.html](http://deepdive.stanford.edu/doc/general/kbc.html)
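Gibbs sampling over a factor graph is simple enough to sketch. Here's a toy
two-variable model in the spirit of DeepDive-style KBC inference; the
variables, weights, and "agreement" factor are made-up illustrations, not
DeepDive's actual model or API:

```python
import math
import random

# Two binary "candidate fact" variables in a log-linear factor graph:
# a unary factor per variable (extractor evidence) and one pairwise
# factor rewarding agreement. All weights are invented for illustration.
UNARY = {"x1": 1.2, "x2": -0.3}  # per-fact evidence weights
AGREE_W = 0.8                    # reward when the two facts agree

def p_one(var, state):
    """P(var = 1 | the other variable) under the log-linear model."""
    other = "x2" if var == "x1" else "x1"
    def score(v):
        s = UNARY[var] * v
        if v == state[other]:
            s += AGREE_W
        return s
    e1, e0 = math.exp(score(1)), math.exp(score(0))
    return e1 / (e0 + e1)

def gibbs_marginals(n_samples=20000, burn_in=2000, seed=0):
    """Estimate P(x = 1) for each variable by Gibbs sampling."""
    rng = random.Random(seed)
    state = {"x1": 0, "x2": 0}
    counts = {"x1": 0, "x2": 0}
    for i in range(burn_in + n_samples):
        for var in ("x1", "x2"):  # resample each variable given the rest
            state[var] = 1 if rng.random() < p_one(var, state) else 0
        if i >= burn_in:
            for var in state:
                counts[var] += state[var]
    return {v: counts[v] / n_samples for v in counts}

marginals = gibbs_marginals()
# x1 has strong positive evidence; x2's weak negative evidence is
# partly offset by the agreement factor pulling it toward x1.
```

The exact marginals here are about 0.75 for x1 and 0.53 for x2, which the
sampler recovers; real KBC systems run the same loop over millions of
variables and learned weights.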

------
millstone
Is Knowledge Graph that thing that tries to directly answer questions you type
into Google Search? I haven't found it to be very useful; what is the thinking
behind cloning it? Where does it add value?

~~~
cromwellian
Knowledge Graph is used for lots of stuff, not just answering facts. It helps
to know what an entity is for many reasons.

Let's say you're searching Google Photos for "Mammals". Photos might have
neural-network tags for dogs, for dog breeds, and for very specific animals,
but they might not be tagged with higher-level concepts, like the fact that a
dog is a mammal.
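The generalization step being described is essentially a transitive is-a
lookup. A minimal sketch, using an invented toy taxonomy rather than any real
KG data:

```python
# Toy is-a taxonomy: each tag points at its parent concept.
# These edges are illustrative, not Google's actual KG.
IS_A = {
    "beagle": "dog",
    "dog": "mammal",
    "tabby": "cat",
    "cat": "mammal",
    "mammal": "animal",
}

def matches(tag, concept):
    """True if `tag` is (transitively) a kind of `concept`."""
    while tag is not None:
        if tag == concept:
            return True
        tag = IS_A.get(tag)  # walk up the taxonomy; None if unknown
    return False

photo_tags = ["beagle", "tabby", "sunset"]
mammal_photos = [t for t in photo_tags if matches(t, "mammal")]
# → ["beagle", "tabby"]
```

With a real knowledge graph the same walk runs over millions of entities, but
the query-time logic is this simple chain-following.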

The KG tells you whether a given term is a point of interest, a location, a
landmark, a person, an actor or character, a film, a food, and so on. This
lets features like Now on Tap surface contextually relevant apps, like IMDb
for films and actors or OpenTable for restaurants.
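That routing boils down to an entity-type lookup followed by a type-to-app
mapping. A minimal sketch, where both tables are invented stand-ins for the
real KG and app registry:

```python
# Pretend KG lookup results and an app registry — illustrative only.
ENTITY_TYPE = {
    "The Matrix": "film",
    "Keanu Reeves": "actor",
    "Chez Panisse": "restaurant",
}
SUGGESTED_APP = {
    "film": "IMDb",
    "actor": "IMDb",
    "restaurant": "OpenTable",
}

def suggest_app(term):
    """Route a recognized entity to a relevant app, else fall back to search."""
    etype = ENTITY_TYPE.get(term)
    return SUGGESTED_APP.get(etype, "Search")
```

So `suggest_app("The Matrix")` routes to IMDb, while an unrecognized term
falls back to plain search.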

I've found KG to be very useful for some queries, especially queries that
produce carousels, like "top ten movies in 1996".

