
Ask HN: why isn't RDF more popular on HN? - sktrdie
I've learned about the idea of storing data in the form of triples not too long ago. This is essentially what RDF allows and since I've started using this to model my data, I've never really looked back to less interoperable data models. I know RDF has a long history of misconceptions, but so does JSON or HTTP or many other standards out there. I'm wondering why isn't RDF more popular within the startup scene, because it's definitely popular and has shown its power in academia and life sciences; just look at the Linked Data cloud.
======
acdha
I personally stopped waiting for RDF to become relevant after years of
fighting with impenetrable, mutually incompatible specs (page A refers to
draft B which has subsequently mutated or 404ed, etc. and although it's years
later nobody has updated anything and all of the examples predate either), and
horrible tools — nothing release quality, examples don't validate, bugs
unfixed for years in the issue trackers, etc. Vendors promise that everything
will be sunshine and rainbows if you buy their commercial offerings but the
costs are high enough to make any conceivable benefit dubious, etc.

It seems like the single most useful thing which could happen for adoption
would be cleaning that up: a clear use case which isn't already well solved
and a good tutorial outlining the benefits with non-trivial examples which
follow the current standards, validate and introduce production quality tools.
If any of that exists, it's managed to stay completely off the mainstream
computing radar, which is a pity given the number of smart people who've
poured considerable amounts of time into it.

As an example: a lot of the RDF community hated schema.org when it was
announced because it used HTML5 microdata instead. If you were a web
developer, the case for microdata was easy: add a couple of simple HTML
attributes, use one of multiple high-quality validators to test your markup,
and Google/Bing/etc. would return better search results for your data. At the
same time, it was daunting to write an RDFa equivalent because there were no
complete, current, non-contradictory docs and examples, the W3C validator had
been broken for over a year (it was at least up, unlike all of the other
online validators) and nothing actually supported it so you'd be investing
hours or days instead of minutes in the hope that at some point in the future
it would become useful. Most people with jobs to do are just going to wait
until the demand side of that equation is more favorable.

The best example of linked data in widespread usage is Facebook's Open Graph
markup but it appears that almost everyone simply stops as soon as they get
the desired results in Facebook and, predictably, almost none of the examples
actually validate because of tooling and cultural issues.

~~~
sktrdie
But how do you explain the large amount of research going into these topics?
It must be linked to industry, and therefore used, otherwise there wouldn't be
so many publications on things like RDF. Perhaps HN and the startup culture
like to concentrate on less academic things, but you can't just ignore an
entire field of research simply because the tools weren't mature enough or
minimal enough when you first used them.

Perhaps the Semantic web community is doing a bad job at making these topics
exciting, but the concepts still stand or else there wouldn't be an entire
branch of research dedicated to them.

So in my opinion we should be more positive about the very valuable knowledge
being generated by these communities, and perhaps transform it or create tools
around them to make them more user friendly and more exciting.

I guess overall what I'm saying is that the concepts of the Semantic web are
valid no matter the cumbersome culture around them. We should judge the ideas
not the tools.

~~~
Someone
There's a lot of literature on Turing machines, too, but nobody builds them in
silicon, either.

RDF is for databases what Turing machines are for computation: the minimal
model that abstracts away as much as possible without sacrificing
capabilities, while ignoring anything performance-related (memory usage,
execution time).

------
mindcrime
_I'm wondering why isn't RDF more popular within the startup scene, because
it's definitely popular and has shown its power in academia and life sciences;
just look at the Linked Data cloud._

It may be more popular than you realize. Perhaps people are using it but just
don't make a big deal of it?

It could also be that RDF and the associated tech stack (OWL, SPARQL, reasoners,
etc.) are a bit specialized and niche in terms of application, and have a
learning curve that's just steep enough to put people off until they _really_
understand why they need them.

Anyway, I can't speak for anybody else, but the semantic web stack, including
RDF, OWL, SPARQL, FOAF, SIOC, etc., are a big part of what we're doing. In
fact, you could say that our whole initiative is largely rooted in bringing
Semantic Web tech into the enterprise.

~~~
sktrdie
Indeed. In fact I'm surprised by all the negative comments regarding RDF and
the rest of the Semantic Web here on HN, not to mention the huge amount of
research that academia is pouring into these topics. I guess you're right,
they may be adopted but just not talked about? If you search HN for anything on
the Semantic Web there's still very little content. It could be that academia
is not very good at making these topics exciting enough. Nonetheless there
must be a reason why so much research goes into things like RDF, otherwise they
wouldn't be taught so extensively in universities.

------
a1024
In simple terms:

1) Tools suck. Even today. Unmaintained tools from 2003 are still used as
examples, which is a shame.

2) Having multiple vocabularies is a good thing in principle, but it clashes
with the "simple integration" sales pitch. Have you tried to integrate data
from sources that use different vocabs? (And that assumes they use the
vocabularies correctly, which is far from realistic in many cases.)

3) One important selling point is the reasoning and inference part. Sadly it is
not available by default in most tools I've used. In fact, there are very few
tools that deal with it, and those are usually a pain in the ass.

4) Despite all the academic hype regarding RDF-backed CMSs, there are very few
frameworks that actually work. Drupal 6 and 7 claim to use RDF but honestly,
RDF doesn't add value at all.
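
To make the "reasoning and inference" point concrete, here is a minimal sketch of a single RDFS rule (if x has type C and C is a subclass of D, then x has type D) applied until nothing new appears. The `rdf:type` and `rdfs:subClassOf` URIs are the standard ones; the `example.org` resources and the function name `infer_types` are made up for illustration, and this is not how any particular reasoner is implemented.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace, for illustration only

graph = {
    (EX + "rex", RDF_TYPE, EX + "Dog"),
    (EX + "Dog", SUBCLASS_OF, EX + "Mammal"),
    (EX + "Mammal", SUBCLASS_OF, EX + "Animal"),
}


def infer_types(graph):
    """Repeat one rule until a fixed point: x type C, C subClassOf D => x type D."""
    inferred = set(graph)
    while True:
        new = {
            (x, RDF_TYPE, d)
            for x, p1, c in inferred if p1 == RDF_TYPE
            for c2, p2, d in inferred if p2 == SUBCLASS_OF and c2 == c
        } - inferred
        if not new:
            return inferred
        inferred |= new


closure = infer_types(graph)
# The type "Animal" for rex was never stated, only derived.
rex_is_animal = (EX + "rex", RDF_TYPE, EX + "Animal") in closure
```

A store without inference would answer "is rex an Animal?" from `graph` alone and say no; that gap is what the complaint above is about.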

Just to make things clear, I've developed tools and been an advocate of the
Semantic Web for years, but I'm also a critical person. We have to acknowledge
that in academia, tools reach a buggy, prototype level that makes them unusable
in real world environments. I know there are a few companies that do great
work building semantic web tools, but I'm afraid they are the exception to the
rule.

~~~
srblanch
This is exactly it. In my experience, using RDF has been underwhelming and a
time sink. Reality doesn't come close to the lofty vision.

------
bjourne
What problems does RDF solve that:

1\. Exist in the real world and are not just data-theoretical research
problems?

2\. Aren't already solved by more mature and widely accepted tools?

~~~
sktrdie
It solves the really interesting issue of data integration. When you need to
combine and analyze data from various sources having things expressed as
triples really makes a difference. You can merge two datasets that never knew
about each other's model with ease.

How would you solve data integration issues at a large scale (the web)? The
web is full of triples: in webpages (RDFa), in APIs (json-ld), and as large
data dumps (freebase n-triples dump).

So it does solve a quite realistic issue that is currently driving thousands
of different people around the world insane. That is people trying to combine
and analyze large quantities of data.

If data providers exposed triples instead of their own data model, it would
make life easier for everyone.

PS: remember that RDF is not a format but a data model. You can express RDF
using JSON, XML and even CSV.
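
A minimal sketch of the merge claim, assuming triples are modelled as plain `(subject, predicate, object)` tuples. Every URI under `example.org` and the helper names `merge` and `describe` are invented for illustration; no RDF library is involved.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Two datasets that were modelled independently of each other.
cars = {
    (EX + "alice", EX + "owns", EX + "car42"),
    (EX + "car42", EX + "make", "Volvo"),
}
houses = {
    (EX + "alice", EX + "livesIn", EX + "house7"),
    (EX + "house7", EX + "city", "Oslo"),
}


def merge(*graphs):
    """In the triples model, merging graphs is set union."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged


def describe(graph, subject):
    """Return every (predicate, object) pair known about a subject, sorted."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge(cars, houses)
alice_facts = describe(merged, EX + "alice")
```

After the union, `alice_facts` holds statements from both sources, but only because both happened to use the same URI for Alice. The sketch says nothing about what happens when they do not.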

~~~
acdha
> You can merge two datasets that never knew about each other's model with
> ease.

Using triples helps you toss all of that data together but that's like saying
CSV helps you combine data from different sources because you can keep
appending columns. Once you actually want to reason about or link that data
you have to have compatible models or invest the time into translating between
them — sometimes that's easy but in many cases it requires intellectual
reasoning to map between the way different groups view the world. That's the
challenge, not the wrapper.
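
A small sketch of that objection, again with triples as plain tuples. The two vocabularies (`a.example`, `b.example`), the mapping table, and the helper names are all hypothetical; the point is only that the union is trivial while the mapping has to be written by someone who understands both models.

```python
A_NAME = "http://a.example/fullName"  # hypothetical predicate used by source A
B_NAME = "http://b.example/label"     # hypothetical predicate used by source B

source_a = {("http://a.example/p1", A_NAME, "Ada Lovelace")}
source_b = {("http://b.example/u9", B_NAME, "Ada Lovelace")}

# Tossing the data together is trivial...
merged = source_a | source_b


def subjects_with(graph, predicate):
    """Subjects that have at least one statement using the given predicate."""
    return {s for s, p, _ in graph if p == predicate}


# ...but a query written against A's vocabulary still only sees A's data.
before = subjects_with(merged, A_NAME)

# The real work: a human-authored mapping between the two vocabularies.
PREDICATE_MAP = {B_NAME: A_NAME}


def translate(graph, mapping):
    """Rewrite predicates according to the mapping; leave others untouched."""
    return {(s, mapping.get(p, p), o) for s, p, o in graph}


after = subjects_with(translate(merged, PREDICATE_MAP), A_NAME)
```

Here the mapping is one line because the two predicates mean the same thing. When the sources disagree about what a "name" is, no wrapper format writes `PREDICATE_MAP` for you.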

~~~
sktrdie
The difference is in the details. You could combine several CSVs and you could
probably come up with a bunch of rules that would make data integration
easier. But then you'd be reinventing exactly the triples model.

And it's different than combining different CSVs because you can't
specifically combine a CSV that talks about cars and another that talks about
houses. You could achieve this if you decide on specific column rules allowing
you to probably express various types of data as CSV, but again you'd be
reinventing the triples model and you already can express triples within CSV
[http://jenit.github.io/linked-csv/](http://jenit.github.io/linked-csv/).

It's true that RDF doesn't magically solve data interoperability, and it
obviously requires you to reason about the data you want to integrate before
you can make sense of it.

But the important part is that RDF does actually make the process easier,
probably easier than any other model out there.

~~~
acdha
I'm not saying that namespaces aren't useful but simply that the value of
having a standard namespace mechanism in a triple is severely undercut by the
complexity of RDF and the poor quality of the tools.

Consider: I want to use data from two sources. Do I spend months not working
on the hard part of the problem — reconciling the different data models —
because I'm learning how to use RDF, configuring, fixing and tuning a bunch of
niche tools or do I just pick one of many database options which have much
higher performance, are well tested and highly durable, have great
documentation and language support, and simply JOIN two tables (classic SQL)
or add a namespace in a document (NoSQL or hybrids like Postgres)? Unless
you're in one of the few semweb shops, you need to have a _HUGE_ amount of
disparate data for that not to be a grossly uneven trade-off, which should not
be the case.

~~~
sktrdie
The RDF scene is much more mature nowadays. You could simply get an RDF
dataset and start analyzing using a large variety of tools. No need to
configure/fix/tune anything. It would be as simple as doing the tables JOIN of
your example.

I guess if you're more comfortable with that, then perhaps yes, a CSV dataset
would be better for you and it would make no difference for your use case. But
for me, I could say the same: that with CSV I'd have to learn how to import it
into an SQL database and figure out exactly what to JOIN because it's not a
graph but a relational db made of tables.

So it really depends on what you're used to. But RDF brings more to the table
really. Things like URIs for identifying things, which can be directly
dereferenceable (if they are HTTP URIs). So you know what your columns actually
mean and they're not simply a string of letters. And you'd know what's in your
cells and exactly what type of data is in them.
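
As a rough illustration of the "type of data in the cells" point: in N-Triples syntax a typed literal carries its datatype URI alongside the value, so a consumer does not have to guess. The sketch below is a deliberately simplified parser (it ignores escapes, language tags and most datatypes); the XSD datatype URIs are the standard ones, while `parse_typed_literal` and the `CASTS` table are made-up names.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Only two datatypes are handled here; anything else is returned as a string.
CASTS = {
    XSD + "integer": int,
    XSD + "boolean": lambda v: v in ("true", "1"),
}

# Matches the form "value"^^<datatype-uri>
LITERAL = re.compile(r'^"(?P<value>.*)"\^\^<(?P<datatype>[^>]+)>$')


def parse_typed_literal(term):
    """Split a typed literal into (python_value, datatype_uri)."""
    m = LITERAL.match(term)
    if m is None:
        raise ValueError(f"not a typed literal: {term!r}")
    cast = CASTS.get(m["datatype"], str)
    return cast(m["value"]), m["datatype"]


value, datatype = parse_typed_literal(
    '"1995"^^<http://www.w3.org/2001/XMLSchema#integer>'
)
```

A bare CSV cell containing `1995` could be a year, a price or an ID; here the datatype at least settles that it is an integer, and the predicate URI (not shown) is what would say which integer.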

These are important features that make data integration a lot simpler imho.
But if you're used to CSVs and don't care to make your workflow more efficient
with the features RDF offers, then I guess you're better off?

Also please check out this interesting answer by Jarven:
[http://answers.semanticweb.com/questions/19183/advantages-
of...](http://answers.semanticweb.com/questions/19183/advantages-of-rdf-over-
relational-databases)

------
mcot2
I've been involved with RDF in a major way before, around 2008. My experiences
back then were pretty horrible, but I would be willing to give it another
look.

Primarily I despise the Java and XML ecosystems and am way more interested in
document, graph and columnar datastores these days.

~~~
sktrdie
That's a _huge_ misconception. XML has nothing to do with RDF. It's merely a
serialization. Please take a look at JSON-LD, it's another serialization of
RDF. Or perhaps Turtle which is more human-readable. Or HDT which is a binary
format of RDF making it extremely compact.

I've met lots of people who thought that RDF was something you do with XML,
and along with that came all the contempt they already had for XML. But it's
not; RDF is essentially all about expressing and modelling data as statements
of triples.

And this allows for all kinds of interesting interoperability scenarios such
as data integration.
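
To illustrate "it's merely a serialization", here is a small sketch that writes the same set of triples in two syntaxes and reads each back to an identical set. The N-Triples handling covers only URI-only triples, the three-column CSV layout is an ad hoc choice for this example (not the Linked CSV proposal), and the `example.org` resources are invented; `foaf:knows` is the real FOAF property.

```python
import csv
import io

KNOWS = "http://xmlns.com/foaf/0.1/knows"

triples = {
    ("http://example.org/alice", KNOWS, "http://example.org/bob"),
    ("http://example.org/bob", KNOWS, "http://example.org/carol"),
}


def to_ntriples(graph):
    """One '<s> <p> <o> .' line per triple (URI-only subset of N-Triples)."""
    return "".join(f"<{s}> <{p}> <{o}> .\n" for s, p, o in sorted(graph))


def from_ntriples(text):
    graph = set()
    for line in text.splitlines():
        s, p, o = line.rstrip(" .").split(" ")
        graph.add((s.strip("<>"), p.strip("<>"), o.strip("<>")))
    return graph


def to_csv(graph):
    """One row per triple: subject, predicate, object."""
    buf = io.StringIO()
    csv.writer(buf).writerows(sorted(graph))
    return buf.getvalue()


def from_csv(text):
    return {tuple(row) for row in csv.reader(io.StringIO(text))}


same_model = from_ntriples(to_ntriples(triples)) == from_csv(to_csv(triples))
```

The two strings look nothing alike, yet both round-trip to the same set of triples, which is the sense in which the syntax is incidental to the model.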

~~~
mcot2
I know all of those things. But it is still the case that the initial
serialization format was XML and most of the initial tools were Java.

~~~
sktrdie
Indeed, the initial tools were a mess, it's true. But they've grown a lot.
There's a lot to choose from now, from interactive tools to performance
improvements with really interesting things like HDT. The whole situation is
much better.

I understand your grief with RDF's past, but please consider judging the
actual things RDF _currently_ brings to the table.

Otherwise it's like saying that HTML was initially very cumbersome and, simply
because of that, you're willing to dismiss all the cool things happening with
HTML5 nowadays.

~~~
vuknje
> Otherwise it's like saying that HTML was initially very cumbersome and,
> simply because of that, you're willing to dismiss all the cool things
> happening with HTML5 nowadays.

But HTML wasn't cumbersome. It wasn't (isn't) the most elegant format either,
but it was simple and fun to use, people were excited and enthusiastic about
it and published their pages. IMO, these early adopters were critical for
making the critical mass that eventually led to the WWW explosion.

If you make the parallel -- that critical mass has never happened with the
Semantic Web / Linked Data. How many Linked Data people are really eating
their own dog food and using RDF today? Way too few, and that's what matters.

------
KingsleyIdehen
RDF is unpopular because it is generally misunderstood. This problem arises
(primarily) from how RDF has been presented to the market in general.

To understand RDF you first have to understand what Data actually is. Once you
cross that hurdle two things will become obvious:

1\. RDF is extremely useful in regards to all issues relating to Data

2\. RDF has been poorly promoted.

Links:

[1] [http://slidesha.re/1epEyZ1](http://slidesha.re/1epEyZ1) \-- Understanding
Data

[2] [http://bit.ly/1fluti1](http://bit.ly/1fluti1) \-- What is RDF, Really?

[3] [http://bit.ly/1cqm7Hs](http://bit.ly/1cqm7Hs) \-- RDF Relation (RDF
should really stand for: Relations Description Framework).

------
csense
I haven't heard of RDF. From other replies in this thread, I'm guessing it's
some kind of database schema thing. Or maybe a replacement for JSON / XML. I'm
not really sure.

Can you point to a good, comprehensive tutorial that explains the problems RDF
is meant to solve, and gives source code of a simple application using best
practices?

If the answer is "no," that's the problem right there. If you love RDF, and
that tutorial doesn't exist, why don't you write it?

