
Things Not Strings -- Google Launches Knowledge Graph - rryan
http://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html
======
mikek
As someone who has done research in the field, I can tell you - this is the
future of search, and the key to true AI. Once we have a knowledge graph of
the world, you will be able to ask questions and get answers, not documents.
This will allow _so_ much more automation - personal assistants like Siri will
be able to "book me a room in San Francisco with a view of the ocean" or "give
me a list of the schools all the children of American presidents went to" and
so on.

There is an arms race here between Bing and Google - we can thank Microsoft for
pressuring Google into making this happen sooner than they otherwise would
have.

~~~
david927
The Semantic Web is truly all of those things -- but this is not that. My
guess is that this will just end up as a sharpened version of the data section
of Wikipedia results that we have now.

Why? Because they are selling it without mentioning why they won't fail where
every other attempt has. There are huge difficulties in this. Have they turned
a corner on the research that changes something? I don't see it in what
they've hinted at so far.

~~~
JPKab
I think that there are two things in their favor here:

First, they are Google, and therefore possess huge quantities of data and the
ability, courtesy of their uber map reduce prowess and ultra-fast custom
hardware, to make sense of it.

Second, they bought Metaweb (makers of Freebase) and with it some of the best
semantic expertise out there. Toby Segaran is a brilliant dude. His O'Reilly
book "Programming the Semantic Web" explains in 20 pages what most books take
150 pages to do: the concept of a URI based graph database and how it enables
data to be merged from multiple sources and reasoned over with applications.
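A minimal sketch of that idea, not taken from the book: if every entity and property is named by a URI, two independently produced sets of (subject, predicate, object) triples can be merged with a plain set union and then queried together. The `example.org` URIs and the helper names `merge`, `objects`, and `spouse_fields` are invented for illustration.

```python
from typing import Set, Tuple

# A triple is (subject URI, predicate URI, object). All URIs here are made up.
Triple = Tuple[str, str, str]

MARIE = "http://example.org/id/marie_curie"
PIERRE = "http://example.org/id/pierre_curie"
SPOUSE = "http://example.org/prop/spouse"
FIELD = "http://example.org/prop/field"

# Two hypothetical sources that happen to use the same URIs for the same things.
source_a: Set[Triple] = {
    (MARIE, SPOUSE, PIERRE),
    (MARIE, FIELD, "physics"),
}
source_b: Set[Triple] = {
    (MARIE, FIELD, "chemistry"),
    (PIERRE, FIELD, "physics"),
}


def merge(*sources: Set[Triple]) -> Set[Triple]:
    """Merging is just set union, because shared URIs identify shared nodes."""
    merged: Set[Triple] = set()
    for source in sources:
        merged |= source
    return merged


def objects(graph: Set[Triple], subject: str, predicate: str) -> list:
    """Return every object attached to (subject, predicate), sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def spouse_fields(graph: Set[Triple], person: str) -> list:
    """A two-hop query: the fields of the person's spouse(s)."""
    fields = set()
    for spouse in objects(graph, person, SPOUSE):
        fields.update(objects(graph, spouse, FIELD))
    return sorted(fields)


graph = merge(source_a, source_b)
print(objects(graph, MARIE, FIELD))
print(spouse_fields(graph, MARIE))
```

The two-hop query only works on the merged graph: the spouse link comes from one source and the spouse's field from the other.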

I only hope Google open-sources some of their research here for the rest of
us.

~~~
toby
Thanks so much! Glad you enjoyed the book. I wanted to point out that Colin
Evans and Jamie Taylor were also authors (and still work on the Knowledge Team
at Google) and should get some credit.

~~~
JPKab
Haha! Thanks for pointing that out. I didn't mean to leave them off. I met you
guys in 2009 at a semantic tech conference, back before Metaweb was purchased.
So glad to see your work being pushed to the most popular website in the
world.

------
kjhughes
We keep hearing rumblings of the coming semantic web:

    
    
      2001: Tim Berners-Lee publishes "The Semantic Web" in Scientific American Magazine
      2008: Microsoft acquires Powerset
      2009: Microsoft touts Bing as a "decision engine"
      2010: Google acquires Metaweb
      2012: Google introduces "Knowledge Graph"
    

Is this the big one?

~~~
joshu
you realize that the semantic web is not about the semantics of the data, but
instead about the semantics of the schema, right?

~~~
mindcrime
_you realize that the semantic web is not about the semantics of the data, but
instead about the semantics of the schema, right?_

I'm having trouble getting your point here. A schema can be thought of as
metadata about the data to which it is applied; so "semantics of the schema"
are also implicitly "semantics of the data" in a sense. I guess I'm missing
some nuance about the point you're trying to make... would you care to
elaborate?

------
anon1385
I note that none of those screenshots contain ads. Perhaps a Googler can
enlighten us: do Google employees not get served ads? Or is the author of this
blog just using adblock? Or are these images just fake mockups where they left
out the ads to make them more aesthetically pleasing and clean looking for PR
purposes?

~~~
Kylekramer
I just did every search (Taj Mahal, Marie Curie, Matt Groening) they did on my
unaltered Firefox/Windows 7 install while logged in, am certainly not a Google
employee, and got no ads. Also didn't get the Knowledge Base yet, if that is a
factor. Famous locations/people searches don't seem to be stuffed with ads
typically. I'd imagine more research style searches having ads would be a bad
experience for both the user and advertiser compared to stuff like "cheap
flights" which does have an amazing clutter of ads. And of course, picking
research style searches presents the idea better. Lack of ads is a nice bonus.

------
stevejohnson
This reminds me of zero-click info from Duck Duck Go, but deeper. They're
showing a summary-like view of information related to your search, but instead
of being generated by a bunch of hand-curated rules, they come from more
generalized machine learning algorithms. This is real competition for one of
my favorite DDG features.

I see this as a positive example of search innovation and competition. More
globally relevant information, not just a tighter filter bubble.

------
bdr
"Marie Curie... had two children, one of whom also won a Nobel Prize, as well
as a husband, Pierre Curie, who claimed a third Nobel Prize for the family."

Marie got two, one in physics and one in chemistry, the former jointly with
Pierre. This is notable because it makes her one of the very few people to
receive Nobels in multiple categories.

------
vibrunazo
FYI, they're using an open API you can use at:

<http://wiki.freebase.com/wiki/Freebase_API>

Now let's hope they'll also have an API for all the extra semantic info they
collect with search.

------
wololo
"I believe AI has an opportunity to achieve a true breakthrough over the
coming decade by at last solving the problem of reading natural language text
to extract its factual content. In fact, I hereby offer to bet anyone a
lobster dinner that by 2015 we will have a computer program capable of
automatically reading at least 80% of the factual content across the entire
English-speaking web, and placing those facts in a structured knowledge base."
-- <http://www.ai.rutgers.edu/aaai25/mitchell.htm>

~~~
boxein
I saw an on-going project recently that was attempting just that. They were
using texts from the web to infer relationships from recognizable patterns.
Like (the head of X, Y) implies X is an organization, Y is a person, and Y is
the leader of X.

It would also try to learn new patterns from facts it already knew, like if it
notices Jobs and Apple in a sentence it can hypothesize that sentence pattern
is about a leader of a company.
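A toy sketch of the loop described above, under heavy simplifying assumptions: names are single capitalized words, a "pattern" is a literal string with `ORG` and `PERSON` placeholders, and the corpus is three invented sentences. It is not how the project in question works internally; the function names are hypothetical.

```python
import re

# Assumption: a name is one capitalized word. Real systems need far more than this.
NAME = r"[A-Z][a-z]+"

CORPUS = [
    "We met the head of Acme, Alice yesterday.",
    "Jobs runs Apple from Cupertino.",
    "Bob runs Initech from Austin.",
]


def compile_template(template):
    """Turn a template such as 'the head of ORG, PERSON' into a regex."""
    escaped = re.escape(template)
    escaped = escaped.replace("ORG", "(?P<org>" + NAME + ")")
    escaped = escaped.replace("PERSON", "(?P<person>" + NAME + ")")
    return re.compile(escaped)


def extract(template, sentences):
    """Apply one template and return the (org, person) pairs it matches."""
    pattern = compile_template(template)
    facts = set()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            facts.add((match.group("org"), match.group("person")))
    return facts


def learn_templates(known_facts, sentences):
    """Hypothesize new templates from sentences that mention a known pair."""
    templates = set()
    for org, person in known_facts:
        for sentence in sentences:
            # Skip sentences where either name is missing or ambiguous.
            if sentence.count(org) != 1 or sentence.count(person) != 1:
                continue
            org_at, person_at = sentence.index(org), sentence.index(person)
            start = min(org_at, person_at)
            end = max(org_at + len(org), person_at + len(person))
            # Keep only the text spanning the two mentions.
            span = sentence[start:end]
            templates.add(span.replace(org, "ORG").replace(person, "PERSON"))
    return templates


# Step 1: a hand-written seed pattern yields facts.
print(extract("the head of ORG, PERSON", CORPUS))
# Step 2: an already-known fact yields a new candidate pattern.
learned = learn_templates({("Apple", "Jobs")}, CORPUS)
print(learned)
# Step 3: the learned pattern yields facts the seed pattern could not find.
for template in learned:
    print(extract(template, CORPUS))
```

The obvious weakness is drift: one bad learned pattern produces bad facts, which produce more bad patterns, so a real system would need some way to score and prune candidates.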

For the life of me I cannot remember the name of the project or even the
university, hopefully someone else has heard of this.

~~~
wololo
Tom Mitchell was talking about his own research. NELL
<http://rtw.ml.cmu.edu/rtw/>

------
pbw
How does this impact site traffic? They are taking popular queries and
providing the answer to them on a google.com hosted page. Does that mean less
click-throughs to actual result pages? Does that bother anyone?

~~~
jonknee
Since Google makes money on the ad clicks both on SERPs (AdWords) and content
(AdSense) it's safe to assume they won't be answering everyone's questions.

~~~
loverobots
Google gets 100% of Google.com ad money vs 30% or so on AdSense. A major
difference I'd say. Plus, some sectors are worth a lot more and you can bet
Google is aware of that.

------
lunchbox
"Things, not strings" does not seem like an effective tagline for marketing
this to the public. Are non-programmers used to the term "string" in the sense
of "text"? And even for those who are, I still think the tagline is too
abstract. I would prefer something like "Understanding the concepts behind the
words you search" (or "Understanding the things").

IMO the linked video does a better job at introducing the feature than the
article.

~~~
kristianc
From a marketing perspective, it's probably a case of tailoring messages to
audiences.

Non-programmers will likely not be poking around on Google's engineering
blogs, and will pick up the Press Release that will have been pushed to the
likes of CNN and the BBC.

~~~
lunchbox
> Non-programmers will likely not be poking around on Google's engineering
> blogs

This is the central "Official Google Blog", not an engineering blog. From the
other articles on the blog (most of which are written by marketing and other
non-engineering disciplines -- see the last lines), it's pretty clear that the
intended audience is anyone who's interested in Google, not just programmers.
I know lots of non-programmers who like to read these, especially since a lot
of news sites link to and quote these articles.

------
akg
Interestingly there is no mention of G+ or Circles or personalized queries in
the post.

------
rabidsnail
Quick, everybody download a copy of the freebase dataset before they kill it!

On the other hand, I'm glad something is coming out of the metaweb
acquisition.

~~~
hollerith
Since the dataset has an open-source (CC-BY) license, I am at a loss to
imagine how someone could kill it.

~~~
rabidsnail
The data that has already been released will be legal to redistribute forever
(if there are still copies floating around). But google could decide at any
point to stop making new dumps, and to stop hosting the existing ones. That's
what I meant by "kill it".

------
nonameisfinetoo
I know you guys usually get all hot and bothered over anything labelled
_innovative_, but wasn't search good enough by 2002? Seriously, when is the
last time you struggled to find something using google search?

~~~
ojbyrne
Have you used google lately? Thanks to SEO, the answer to your second question
is "about 5 minutes ago."

~~~
AznHisoka
Please don't associate SEO with spam.

~~~
jrockway
It's too late for that.

Fortunately, white-hat SEO is very easy to describe without mentioning search
engines at all: copy editing, fact checking, designing for accessibility, and
so on are valuable skills regardless of what algorithms search engines happen
to use for ranking today. Write content for your users and you don't have to
worry about optimizing for search engines.

------
diminish
Apparently Google is chasing after a knowledge based web now; a semantic web
of things. Google may end up absorbing Wikipedia, albeit in categorized form,
and also Wolfram Alpha.

~~~
bergie
Wikipedia is already available in a categorized and semantic format:
<http://dbpedia.org/About>

------
barbazfoo12
All due respect to Google, but I give primary credit for this to Danny Hillis.
He was gathering and processing the data for this project years before going
public with it and before merging with Google. It's yet another Google
acquisition that is probably going to be viewed by many Google users as
another amazing Google innovation.

"Standing on the shoulders of giants"

Do they still use that slogan?

~~~
Drbble
Google is more about "burying the giants in piles of money, and standing on
that."

------
soupboy
I wonder if
<http://googledocs.blogspot.com/2012/05/find-facts-and-do-research-inside.html>
is related. It looks similar, but I can't verify as docs is
blocked at my workplace.

~~~
riffraff
I can't try the search, but having tried the one in gdocs: yes it appears to
be the same, to the point of showing search results when no factual data is
available.

(I am baffled that someone would block google docs, you have my sympathy)

------
swah
Sidenote: interesting how the guys in the video don't look directly at you,
but do that lookaway thing, which I associate with Apple videos ("we're
launching this incredibly awesome piece of technology, now available for all of
mankind to enjoy, you're welcome").

Google, I expected them to look right at the camera and talk to me (for
example, like Matt Cutts here: <http://www.youtube.com/watch?v=ofhwPC-5Ub4>).

------
mkramlich
As someone with a brain, I was wondering probably at least as far back as a
decade ago why Google wasn't already doing this. The twenty questions
approach, in general, is a powerful and simple way to refine a human query in
an interactive way.

------
jmount
A piece of the embarrassment of a number of "Google killers" has been the
claim to supply something better than search. It will be interesting to see
how Google itself looks under the kind of scrutiny that sort of claim
inevitably brings.

------
JabavuAdams
How does this relate to OpenCyc?

------
toksaitov
Such a wonderful iPad…
<http://www.google.com/insidesearch/features/search/knowledge.html>

------
dgudkov
Google has power to encourage site owners to make semantic markup for their
pages which could lead to higher positions in search results for them. But
Google doesn't do it. Why?

~~~
huggah
Google does encourage marking up pages with Microdata, Microformats and RDFa.
<http://support.google.com/webmasters/bin/answer.py?hl=en&answer=99170>
Google publishes tools to help with these. Additionally, better informing
Google what your page contains leads to searches for your content being more
likely to get to you.
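For readers who have not seen Microdata: it is a handful of HTML attributes (`itemscope`, `itemtype`, `itemprop`) that label what a piece of text means. The sketch below reads those labels back out with Python's standard `html.parser`. The sample page and the `ItempropCollector` class are invented for illustration, and the parsing is deliberately naive (flat items only, text values only); it says nothing about how Google actually processes markup.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with Microdata attributes.
SAMPLE_PAGE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Marie Curie</span>
  <span itemprop="jobTitle">Physicist</span>
</div>
"""


class ItempropCollector(HTMLParser):
    """Collect itemprop -> text pairs from a flat Microdata item."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop name awaiting its text content

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "itemprop" in attributes:
            self._current = attributes["itemprop"]

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            self.properties[self._current] = text
            self._current = None


def extract_properties(html):
    """Return the itemprop values found in an HTML string."""
    collector = ItempropCollector()
    collector.feed(html)
    return collector.properties


print(extract_properties(SAMPLE_PAGE))
```

Without the attributes, a crawler sees only two strings; with them, it sees a person with a name and a job title.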

~~~
dgudkov
Theoretically, yes. However, the vast majority of site owners don't do it, even if
they are interested in higher positions in SERP. There must be a reason for
this.

------
geoffhill
More clutter.

~~~
victork2
I have to agree and even add it's more dangerous than that.

Never mind the fact that what attracted me to Google was its sober interface,
its minimalist approach of results on the web (that's also why I like Hacker
News). Never mind that Google tries to fit more information per square inch
for no good reason and sacrifices readability of the "normal result", after all
they are still a _search_ engine.

But more worrying is that they are going to make assumptions without putting
sources[1]. It's a very Orwellian approach to answers and that's something we
should grow out of. There's a reason why we need sources: because what's
written is just one of the ways to see an event or somebody.

And Google, really your new Blogspot is awful, useless eye-candy.

[1] At least from what's shown on the blog.

~~~
hahainternet
None of this seems to be accurate. Their 'good reason' is that you searched
for this information. That's why they're displaying it, in the most integrated
way they can.

Your use of 'Orwellian' indicates you are just using buzzwords.

~~~
hackinthebochs
>Your use of 'Orwellian' indicates you are just using buzzwords.

It's legitimate. The point is that showing answers without sources makes
Google the de facto "arbiter of truth". When Google's database updates its
version of truth, that now becomes what is (or what has always been). There is
no indication of different perspectives, or any analysis for how its version
of "truth" was derived. That is very Orwellian.

~~~
hahainternet
I don't believe it is. Google is not the arbiter of truth, they are not
dictatorially selecting the truth for the public. They are /searching/ for
information and displaying the results of the search. They remain a neutral
party in the middle.

It's hard to use 'Orwellian' when the entity you're accusing is _entirely_
dependent upon other sources and exercises no editorial control.

~~~
floomp
> It's hard to use 'Orwellian' when the entity you're accusing is entirely
> dependent upon other sources and exercises no editorial control.

There's a lot of trust in Google with that statement. If this changes, how
would you know? That's what's Orwellian about it.

It's not hard to imagine the results being silently tweaked by Google - not to
say that they will do this, but it's a real danger, because it'd be very bad
and hard to detect if they _did_ do this at some point in the future, after
we'd all gotten complacent and learned to implicitly trust the results.

~~~
hahainternet
That isn't Orwellian. If it were, it would describe any resource which isn't
instantaneously transparent. Let's say your clocks retrieve their time via
radio signal broadcast. By your logic this is Orwellian because without
checking external sources you wouldn't know if they changed the time!

Of course Google _could_ use this for political gain or some other nefarious
purpose, but they rely absolutely on user trust and so it would be an
incredibly risky move.

Not to mention that looking at your watch or using bing or ddg or similar
tools would show you the deception. It's just silly invoking Orwell over this
I think.

------
SagelyGuru
Relationships between facts are also facts. Having more facts enables you to
do a slightly better job but I suspect that is all there is to it.

------
DrCatbox
Where is this knowledge graph, how do I try it out?

~~~
tcwc
I can't see this yet, but it looks like Google pull most of the structured
info from Freebase (which they acquired in 2010). You can play with that at
freebase.com.

------
andrejewski
Google's own DuckDuckHack.

------
ilaksh
Cool. Maybe Google could acquire Novamente or Numenta next.

------
ladino
come on.. the magic is, they simply scraped Wikipedia!!! :(

------
ljlolel
This is just Wikipedia in-line in results...

------
geluso
Abandon social. Pursue this.

------
agumonkey
I wish people would integrate the punchline into programming languages.

------
gcb
didn't they try to launch knowledge graph some ten times after buying metaweb
or something already?

------
bashzor
And that's where Wolfram Alpha became useless. It was just a matter of time
after all.

