
Freebase is closing down; data going to WikiData - chatman
https://groups.google.com/d/msg/freebase-discuss/s_BPoL92edc/Y585r7_2E1YJ
======
fidotron
This is a big shame. Freebase was by far the most consistent of the open data
graphs.

With my cynic hat on, I think they're being forced by larger strategy. The
existing Freebase dumps are far too useful for a would-be Google competitor,
and I suspect the Knowledge Graph API will be somewhat more restrictive in
what you're allowed to do with it.

The thing is, the more you get into this stuff, the more you envy Facebook's
position, where people give them structured data on a plate.

~~~
magicalist
> _The existing Freebase dumps are far too useful for a would-be Google
> competitor_

from the article:

> _The last Freebase data dump will remain available_

and even if it wasn't, I'm sure archive.org will grab a copy (if they haven't
already).

~~~
sled
Part of the value was that the dumps were updated weekly.

Say you're using it to get a list of named entities (proper nouns), with the
purpose of clustering news stories about a given entity. (If you look at
Facebook's Trending News, each headline begins with a proper noun followed by
a blurb. I'm not sure if they use Freebase, but it could be a useful input.)

The value of Freebase will decline over time as the content becomes out of
date.

~~~
emw
The Wikidata dumps are also updated weekly; see [1].

Wikidata RDF exports are made every two months or so from those dumps and are
available at [2]. I imagine that frequency will pick up. You can generate your
own RDF exports using the Wikidata Toolkit [3, 4].

[1]
[http://dumps.wikimedia.org/other/wikidata/](http://dumps.wikimedia.org/other/wikidata/)

[2] [http://tools.wmflabs.org/wikidata-
exports/rdf/](http://tools.wmflabs.org/wikidata-exports/rdf/)

[3]
[https://www.mediawiki.org/wiki/Wikidata_Toolkit](https://www.mediawiki.org/wiki/Wikidata_Toolkit)

[4] [https://github.com/Wikidata/Wikidata-
Toolkit](https://github.com/Wikidata/Wikidata-Toolkit)
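
For anyone planning to consume those dumps: the weekly Wikidata JSON dump is
one big array with one entity per line, so it can be streamed without loading
the whole file. A minimal sketch of pulling out an item's English label and
its "instance of" (P31) claims, following the Wikibase JSON serialization;
the sample record below is hand-written for illustration, not taken from a
real dump:

```python
import json

def parse_entity(line: str):
    """Parse one line of a Wikidata JSON dump into (id, English label, P31 targets).

    Dump lines are JSON objects separated by ",\n" inside one big array,
    so strip the trailing comma before decoding.
    """
    entity = json.loads(line.rstrip().rstrip(","))
    label = entity.get("labels", {}).get("en", {}).get("value")
    # P31 = "instance of"; each claim's mainsnak holds the target item id.
    instance_of = [
        claim["mainsnak"]["datavalue"]["value"]["id"]
        for claim in entity.get("claims", {}).get("P31", [])
        if claim["mainsnak"].get("snaktype") == "value"
    ]
    return entity["id"], label, instance_of

# Hand-written sample mimicking the dump format (Q62 = San Francisco, Q515 = city).
sample = json.dumps({
    "type": "item",
    "id": "Q62",
    "labels": {"en": {"language": "en", "value": "San Francisco"}},
    "claims": {"P31": [{"mainsnak": {
        "snaktype": "value",
        "property": "P31",
        "datavalue": {"type": "wikibase-entityid",
                      "value": {"entity-type": "item", "id": "Q515"}},
    }}]},
}) + ","

print(parse_entity(sample))  # ('Q62', 'San Francisco', ['Q515'])
```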

------
markbao
I'm sorry if I missed this, but what will happen to the Freebase software? I
always found that to be a strong asset, but Wikidata's is less than stellar.
They only mention the data and APIs here.

~~~
barakm
@markbao -- I did get Cayley out the door
([http://github.com/google/cayley](http://github.com/google/cayley)), which
has many parallels to Freebase's graph database and can load and store
Freebase data (as well as other graph data). I've been a little busy with
other bits of life at the moment, but it's totally open for contributions!

------
nickporter
Hah, I read this as Firebase... I was bracing myself for the Google hate.

~~~
kaishiro
Did the exact same thing. I have a bunch of personal projects on FB, and my
heart skipped a beat.

------
nicklaf
Has anybody here used DBpedia? If so, how do you see it in relation to
Wikidata? Do the projects overlap? Or might they serve complementary
purposes, with, say, DBpedia extracting data from Wikidata (rather than from
Wikipedia directly)?

~~~
tommorris
DBpedia is data scraped out of Wikipedia and made available as RDF, SPARQL
(and JSON). The scraping process is... okay. Sometimes it is great, sometimes
it is really crappy.
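
To make the "available as SPARQL (and JSON)" part concrete, here's a minimal
sketch against DBpedia's well-known public endpoint (`dbpedia.org/sparql`).
The query uses the `dbo:` prefix the endpoint predefines; treat the exact
result shape as an assumption, and note the actual network call is kept under
`__main__` so the helper can be inspected without hitting the endpoint:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public DBpedia SPARQL endpoint (Virtuoso-backed).
ENDPOINT = "http://dbpedia.org/sparql"

# Ask for the English abstract of the San Francisco resource.
QUERY = """
SELECT ?abstract WHERE {
  <http://dbpedia.org/resource/San_Francisco> dbo:abstract ?abstract .
  FILTER (lang(?abstract) = "en")
}
"""

def build_request_url(query: str) -> str:
    """Encode a SPARQL query into a GET URL requesting JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return ENDPOINT + "?" + urlencode(params)

if __name__ == "__main__":
    # Network call; run manually.
    with urlopen(build_request_url(QUERY)) as resp:
        bindings = json.load(resp)["results"]["bindings"]
        print(bindings[0]["abstract"]["value"][:80])
```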

Wikidata is intended to be the database behind Wikipedia. Ideally, the
current infoboxes that show you things like the population of cities will at
some point be driven directly from Wikidata. Wikidata can then be a place
for disparate data to be collected, often directly from government and other
official data sources. US government census data, for example, would just be
routinely imported into Wikidata, and the Wikipedia infoboxes would be driven
from that.

It may also at some point lead to the creation of an alternative to the
current category system on Wikipedia. Wikipedia currently has policies around
categorisation of people. For instance, we might have a category called
"British physicists", and a category called "Jewish physicists" and a category
called "LGBT physicists" but because it would be too difficult to maintain,
Wikipedia doesn't have a "British Jewish LGBT physicists" category. See
[http://enwp.org/WP:OVERCAT](http://enwp.org/WP:OVERCAT)

What Wikidata means is we might be able to get rid of the category system, or
rather have a category system where the categories are based on Wikidata
properties. So instead of having all those categories, you could have a
faceted navigation system where you say "Show me all the scientists, now show
me all the British scientists, now show me all the physicists, now show me all
the women physicists" etc. etc. And you could pick any property you choose,
not just the ones that Wikipedia category editors think are important.
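
The faceted idea above can be sketched over toy data (hypothetical records,
not real Wikidata items): each entry carries property values, and any
"category" is just an intersection of filters, so nobody has to maintain a
combinatorial explosion of hand-made categories:

```python
# Toy records standing in for Wikidata items (hypothetical data).
scientists = [
    {"name": "A", "country": "UK", "field": "physics",   "gender": "female"},
    {"name": "B", "country": "UK", "field": "physics",   "gender": "male"},
    {"name": "C", "country": "US", "field": "chemistry", "gender": "female"},
]

def facet(items, **filters):
    """Return items matching every property filter. Any combination of
    properties works, so no pre-built 'British women physicists' category
    needs to exist."""
    return [item for item in items
            if all(item.get(k) == v for k, v in filters.items())]

# "Show me all the British physicists, now just the women" etc.
print([p["name"] for p in facet(scientists, country="UK",
                                field="physics", gender="female")])
# ['A']
```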

This also gets rid of a whole load of politics around categories: there was a
big storm a while back when someone decided to split up the "American
novelists" category into "American men novelists" and "American women
novelists", and some of the latter decided that this was rather a demotion.
Eventually, Wikidata may end up powering the replacement for that, enabling
readers to find what they want without editors having to make contentious
judgment calls like that.

Where Wikidata becomes quite interesting is that because it ends up being used
by Wikipedia, there's some kind of motivation to get it right and keep it up-
to-date. It's all too easy for projects like Freebase to import-once-and-
forget. But if it is used as the basis for a public-facing project like
Wikipedia, there's hopefully some more pressure to get it right. (Obviously,
how much you trust that is rather dependent on how much you trust Wikipedia to
get things right.)

It also may end up being the centre node for pointers between databases:
because Wikipedia is a reasonably good collection of everything (or at least
everything that a few people at some point decided to write about in a book,
which is a reasonably low barrier), it becomes a fairly good central index
for pointers to other data sources. Bibliographic/authority control
databases like GND, BNB, VIAF and others are already being merged into
Wikidata, as have pointers to the identifiers used in some specialist
scientific databases. There'll be plenty more where that came from.

~~~
TuringTest
Wikidata has a number of structural problems of its own, though, at least when
it comes to interacting with the Wikipedia projects it aims to serve.

The model for interlinks connecting different languages assumes a 1:1
correspondence between articles and concepts, although the Wikipedias for each
language have different structures, and a given article can document several
concepts.

Also, I have the impression that the community of Wikidatans is averse to
getting their precious data pool muddied with inconsistent, ambiguous and
untidy content. That's understandable, but it means there will be friction
whenever the larger community tries to capture knowledge in a distributed
way, without following a single well-defined standard. Things can get messy
fast, and the conversations I've followed at Wikidata suggest that the
project maintainers are likely to mount significant opposition to the "quick
& dirty" way of doing things that collaborative editing requires.

I see on your Wikidata user page that you get all the needed principles
right (pragmatism over theoretical purity, usage over hypothetical cases,
design for humans first), but the history of the project doesn't seem to
follow them well in the areas where it has gone live and been used by
outsiders.

~~~
tommorris
Yeah, generally the Wikidata folk are taking things quite slowly. Start slow
and get simple things right. The software is evolving slowly too.

I'm quietly confident that it might all work out, but it is a bit too early to
tell.

------
emw
Wikidatan here. Here's a quick comparison of Freebase and Wikidata:

Topics / items:

\- Freebase: 46,476,860 [1]

\- Wikidata: 12,921,731 [2]

Facts / claims:

\- Freebase: 2,696,141,481 [1]

\- Wikidata: 50,457,200 as of 2014-11-10 [3]

Instances of person / human:

\- Freebase: 3,391,533 [4]

\- Wikidata: 2,638,614 [5]

License for data

\- Freebase: CC-BY [6]

\- Wikidata: CC0 [7]

Data on Paul Graham:

\- Freebase:
[http://www.freebase.com/m/017cm9](http://www.freebase.com/m/017cm9)

\- Wikidata:
[https://www.wikidata.org/wiki/Q92650](https://www.wikidata.org/wiki/Q92650)

Data on San Francisco:

\- Freebase:
[http://www.freebase.com/m/0d6lp](http://www.freebase.com/m/0d6lp)

\- Wikidata:
[https://www.wikidata.org/wiki/Q62](https://www.wikidata.org/wiki/Q62)

Data on Python:

\- Freebase:
[http://www.freebase.com/m/05z1_](http://www.freebase.com/m/05z1_)

\- Wikidata:
[https://www.wikidata.org/wiki/Q28865](https://www.wikidata.org/wiki/Q28865)

Data on APOE / Apolipoprotein E:

\- Freebase:
[http://www.freebase.com/m/0byv2v](http://www.freebase.com/m/0byv2v)

\- Wikidata:
[https://www.wikidata.org/wiki/Q14890468](https://www.wikidata.org/wiki/Q14890468)
(APOE),
[https://www.wikidata.org/wiki/Q424728](https://www.wikidata.org/wiki/Q424728)
(Apolipoprotein E)

See [8] and [9] for an introduction to Wikidata. I have no notable experience
with Freebase, but I've been contributing to Wikidata for about 2 years and
would be happy to answer any questions I can.

[1] [http://www.freebase.com/](http://www.freebase.com/)

[2] [https://www.wikidata.org](https://www.wikidata.org)

[3] [http://tools.wmflabs.org/wikidata-
todo/stats.php](http://tools.wmflabs.org/wikidata-todo/stats.php)

[4]
[http://www.freebase.com/people/person?instances](http://www.freebase.com/people/person?instances)

[5] [http://tools.wmflabs.org/autolist/autolist1.html?q=claim[31:5]](http://tools.wmflabs.org/autolist/autolist1.html?q=claim[31:5])

[6]
[http://www.freebase.com/policies/tos](http://www.freebase.com/policies/tos)

[7] See bottom of [2]

[8] Up and running with Wikidata: [http://www.slideshare.net/_emw/up-and-
running-with-wikidata](http://www.slideshare.net/_emw/up-and-running-with-
wikidata)

[9] Introducing Wikidata to the Linked Data Web:
[http://korrekt.org/papers/Wikidata-RDF-
export-2014.pdf](http://korrekt.org/papers/Wikidata-RDF-export-2014.pdf)

~~~
waldir
Reasonator is a better way to visualize info from Wikidata:

Data on Paul Graham:

\- Freebase:
[http://www.freebase.com/m/017cm9](http://www.freebase.com/m/017cm9)

\- Wikidata:
[https://tools.wmflabs.org/reasonator/?&q=92650](https://tools.wmflabs.org/reasonator/?&q=92650)

Data on San Francisco:

\- Freebase:
[http://www.freebase.com/m/0d6lp](http://www.freebase.com/m/0d6lp)

\- Wikidata:
[https://tools.wmflabs.org/reasonator/?&q=62](https://tools.wmflabs.org/reasonator/?&q=62)

Data on Python:

\- Freebase:
[http://www.freebase.com/m/05z1_](http://www.freebase.com/m/05z1_)

\- Wikidata:
[https://tools.wmflabs.org/reasonator/?&q=28865](https://tools.wmflabs.org/reasonator/?&q=28865)

Data on APOE / Apolipoprotein E:

\- Freebase:
[http://www.freebase.com/m/0byv2v](http://www.freebase.com/m/0byv2v)

\- Wikidata:
[https://tools.wmflabs.org/reasonator/?&q=14890468](https://tools.wmflabs.org/reasonator/?&q=14890468)
(APOE),
[https://tools.wmflabs.org/reasonator/?&q=424728](https://tools.wmflabs.org/reasonator/?&q=424728)
(Apolipoprotein E)

~~~
frik
So sad. Freebase is way ahead and more polished.

Wikidata originates from the German Wikipedia. The idea is good, but the
implementation pales in comparison to Freebase (at the moment).

This is the real San Francisco Wikidata page (slow and ugly):
[https://www.wikidata.org/wiki/Q62](https://www.wikidata.org/wiki/Q62)

Reasonator takes ages to load and render the content.

~~~
emw
There's a Wikidata UI Redesign in development [1] which should improve the
default site's visual appeal.

That said, while the San Francisco Wikidata page may currently be uglier than
its Freebase counterpart, it is not slower. webpagetest.org has the Wikidata
page fully loaded at 8.8 s and the Freebase page at 11.2 s [2, 3]. And while
Reasonator is certainly dog slow (21.2 s to fully load! [4]), its San
Francisco page is much more polished than Freebase's.

[1]
[http://www.wikidata.org/wiki/Wikidata:UI_redesign_input](http://www.wikidata.org/wiki/Wikidata:UI_redesign_input)

[2]
[http://www.webpagetest.org/result/141218_DR_9W4/](http://www.webpagetest.org/result/141218_DR_9W4/)

[3]
[http://www.webpagetest.org/result/141218_ZA_9WF/](http://www.webpagetest.org/result/141218_ZA_9WF/)

[4]
[http://www.webpagetest.org/result/141218_6N_9WK/](http://www.webpagetest.org/result/141218_6N_9WK/)

------
pella
_" The move to Wikidata is a bit ironic, given that some of the data sitting
inside of Freebase — including musician genres, album names, and record
labels, for instance — originated from pages on Wikipedia, which the nonprofit
Wikimedia Foundation hosts. And Googlers understand that."_

[http://venturebeat.com/2014/12/16/google-plans-to-
integrate-...](http://venturebeat.com/2014/12/16/google-plans-to-integrate-
its-fact-database-freebase-into-wikimedias-wikidata/)

------
jscheel
Gee, didn't see that coming :/ Hopefully Wikidata will be able to serve the
community well.

~~~
thomasfoster96
I've been having a look at Wikidata and it looks pretty good.

------
Shank
Doesn't Bing use Freebase for some search result panels? I suppose they'll
just transfer over to Wikidata, but it seems funny that Google might have
just added a lot of development work for some Bing developers who now have to
migrate APIs.

~~~
frik
Microsoft bought Powerset (company) on July 1, 2008:
[http://en.wikipedia.org/wiki/Powerset_(company)](http://en.wikipedia.org/wiki/Powerset_\(company\))

 _On May 11, 2008, the company unveiled a tool for searching a fixed subset of
Wikipedia using conversational phrases rather than keywords._

The natural language processing part of Bing (Powerset) is based on Wikipedia
data (scraping the content), but they had a prototype based on Freebase too:

On April 16, 2008: " _Powerset demonstrated our integration to Freebase. At
one point, a group stood in front of the projected computer and threw out
queries to see all of the different Freebase types that Powerset could
handle._ " (source:
[https://web.archive.org/web/20080430113649/http://blog.power...](https://web.archive.org/web/20080430113649/http://blog.powerset.com/)
)

------
gbersac
I think this decision is proof of the maturity of the Freebase community.

I've often heard of open source communities splitting, fragmenting the forces
working on each project and lowering the quality of the products. Doing the
opposite here will lead to one great product rather than two "not bad"
products competing against each other.

------
PaulHoule
You'll always be able to query this data in RDF with

[http://basekb.com/](http://basekb.com/)

------
BitMastro
A year ago I developed an Android client for Freebase, but unfortunately I
never had the chance to finish or publish it.

I guess now it's too late :D

If anybody is interested the apk is here

[http://goo.gl/6BxAZJ](http://goo.gl/6BxAZJ)

------
dredmorbius
What is/was Freebase?

------
curiously
Isn't Freebase essentially what Import.io and Kimono are doing?

