
An Intro To The Semantic Web: Why You Need To Know About It Sooner Than Later - gaiusparx
http://www.webcentralstation.ca/2011/02/08/an-intro-to-the-semantic-web-why-you-need-to-know-about-it-sooner-than-later/
======
henrikschroder
> Very broadly, things on the Internet will be described with descriptor
> languages

No. This will not happen.

Look, before Google, internet search relied on people adding meta tags to
their pages so that the search engines could know what a page was about. You
had two languages, one for humans, and one for computers.

But people don't really care about having correct meta tags or headers or
other invisible markers containing instructions for search engines, which is
why Google steamrolled the market when it appeared. Google ignored all the
instructions and instead analyzed the human language on web pages, and
inferred relevance based on that.

If semantic search engines or analyzers or other technology wants to become
widespread, it cannot rely on extra markup, it cannot rely on people adding
descriptors of meaning to the data it indexes, it has to determine that
meaning from the data directly.

~~~
mindcrime
_Google ignored all the instructions and instead analyzed the human language
on web pages, and inferred relevance based on that._

Google are now at least partially on the Semantic Web bandwagon. They, as well
as Yahoo, have been detecting, parsing, and using RDFa and microformats data -
when present - for a while now, and using it to provide enhanced search
results. See:

<http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=99170>

~~~
henrikschroder
Yes, Google would be crazy to ignore good information and meta-data that is
already there, but that doesn't change the fundamental problem:

People don't give a shit about metadata.

~~~
mindcrime
In terms of people writing metadata into pages by hand, then yes, I agree. But
in terms of programmatically generating this stuff from data that's already in
a database or what-have-you, then I think it will see growing adoption.
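
A minimal sketch of what "programmatically generating" could look like, assuming a hypothetical database row and the Dublin Core vocabulary. The field names and the `render_rdfa` helper are made up for illustration; only the RDFa attributes (`about`, `property`) and the namespace URI are real:

```python
from html import escape

# Hypothetical database row; the field names are assumptions for this sketch.
row = {"url": "/posts/42", "title": "Intro to RDFa", "author": "Jane Doe"}

# Dublin Core element namespace, an existing and widely used vocabulary.
DC = "http://purl.org/dc/elements/1.1/"


def render_rdfa(record):
    """Render one record as an HTML fragment annotated with RDFa attributes."""
    return (
        f'<div xmlns:dc="{DC}" about="{escape(record["url"], quote=True)}">'
        f'<h2 property="dc:title">{escape(record["title"])}</h2>'
        f'<span property="dc:creator">{escape(record["author"])}</span>'
        "</div>"
    )


print(render_rdfa(row))
```

The point is that nobody types the metadata by hand: the same template that renders the page for humans emits the machine-readable attributes as a side effect.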

The important thing here is to not commit the fallacy of the excluded
middle... There's a place for the Semantic Web between "it's a total pipe
dream and no part of it will ever come to fruition" and "it will be fully
realized in exactly every detail as originally conceived by TBL."

People are already using Semantic Web technologies to some extent, the open
question now is "just how prevalent will this stuff eventually become?"

------
Maro
In 2003 I wrote my M.Sc. thesis about semantic web technologies such as RDF,
RDFS and querying such semantic graphs.

It's seven years later and nothing has happened in terms of real-world
adoption, so it's safe to declare it DOA. In fact, it's been safe for several
years.

~~~
mindcrime
Why after seven years? From the time TBL invented the original web to the time
it saw massive adoption was a decent amount of time, depending on exactly how
you want to define "massive adoption." And the problems that the SemWeb is
trying to solve are arguably harder, so it makes sense that the ramp up would
be slower.

And never mind the fact that there _is_ real world adoption. Google and Yahoo
both embraced RDFa a couple of years ago, and have you checked LinkedData.org
lately? There's a constantly growing body of data out there in semantically
interoperable formats: <http://linkeddata.org/>

~~~
Maro
Everything that matters and is going somewhere is using JSON.

~~~
mindcrime
That doesn't contradict the notion that the Semantic Web is of growing
importance in any way. RDF triples can be encoded in JSON just as well as they
can be encoded in RDF/XML or Turtle or N3 or what-have-you[1][2]. And work
continues on a spec to formalize the relationship between RDFa and HTML5[3],
so even though XHTML2 got shit-canned, it won't hinder the ability to embed
RDFa.

[1]: <http://webbackplane.com/mark-birbeck/blog/2009/04/20/rdfj-semantic-objects-in-json>

[2]: <http://n2.talis.com/wiki/RDF_JSON_Specification>

[3]: <http://dev.w3.org/html5/rdfa/rdfa-module.html>
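
To make the JSON point concrete, here is a rough sketch that groups triples into a nested subject -> predicate -> list-of-objects structure, which is roughly the layout described in [2] as I understand it. The `triples_to_rdf_json` helper and the example.org URIs are hypothetical:

```python
import json


def triples_to_rdf_json(triples):
    """Group (subject, predicate, (kind, value)) triples into a nested dict:
    subject -> predicate -> list of {"type": ..., "value": ...} objects."""
    doc = {}
    for subject, predicate, (kind, value) in triples:
        doc.setdefault(subject, {}).setdefault(predicate, []).append(
            {"type": kind, "value": value}
        )
    return doc


# Example data; the example.org URIs are placeholders.
triples = [
    ("http://example.org/about",
     "http://purl.org/dc/elements/1.1/title",
     ("literal", "Anna's Homepage")),
    ("http://example.org/about",
     "http://purl.org/dc/elements/1.1/creator",
     ("uri", "http://example.org/anna")),
]

print(json.dumps(triples_to_rdf_json(triples), indent=2))
```

The data model stays the same (triples); JSON is just another serialization of it.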

------
nobody_nowhere
I've been hearing this for five years. Any real examples yet?

~~~
keefe
I worked in a semweb shop for ages. It's all over government, where machine
interoperability is key, and also in complex data areas like pharma.

~~~
nobody_nowhere
Cool! Makes sense. What are some good examples on the govt side?

~~~
p_alexander
<http://www.data.gov/communities/node/116/apps> - the data integration being
done here is all driven by semantic web, linked data and RDF.

<http://data.gov.uk/apps> - same thing here, driven by the semantic web and
linked data.

------
tristanperry
It'll be interesting to see whether this is widely adopted. As others have
said, I've been reading this sort of article for at least 4 years now and
there doesn't seem to be much progress.

Perhaps the metacrap rant (<http://www.well.com/~doctorow/metacrap.htm>) was
right?

Seems that way at the moment.

------
Homunculiheaded
I've tried several times to get really excited about the semantic web, but in
the end I tend to arrive at the same conclusion: so far it still feels like
over-engineering a solution to a problem I'm not sure we have. A perfect
example of this is OWL, which in its pure form is so expressive that it is
completely useless for automated reasoning (due to computational
intractability). If you were to build an automated reasoning system to solve a
real problem you had, you would never arrive at OWL.

Additionally there seems to be a whole lot of reinventing the wheel. The best
semweb people are aware of all the past research in logic programming and
automated reasoning, but most semweb enthusiasts seem to be hardly aware of
Prolog, let alone that RDF triples are just another way to express what clauses
do in Prolog.
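
As a rough illustration of that correspondence, a triple like (alice, knows, bob) plays the role of the Prolog fact `knows(alice, bob).`, and a query with a variable is just pattern matching over the facts. A minimal sketch in Python, where the facts and the `match` helper are made up for illustration and no rules or inference are included:

```python
# Facts as (subject, predicate, object) tuples; in Prolog these would be
# clauses such as knows(alice, bob).
FACTS = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksFor", "acme"),
}


def match(pattern, facts=FACTS):
    """Yield variable bindings for every fact that matches the pattern.

    Terms starting with '?' are variables; a variable that appears twice
    must bind to the same value both times.
    """
    for fact in facts:
        bindings = {}
        for term, value in zip(pattern, fact):
            if term.startswith("?"):
                # setdefault returns the earlier binding if one exists.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings


# Roughly the Prolog query: ?- knows(alice, Who).
print(sorted(b["?who"] for b in match(("alice", "knows", "?who"))))
```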

If we were really trying to solve the 'problem' that semweb addresses, we'd be
seeing more articles titled "intro to logic programming, knowledge
representation and automated reasoning".

------
matei
I think this doesn't take economics into consideration. Most companies aren't
that interested in sharing their data in an anonymous way.

~~~
sabalaba
I think you might have heard a similar statement in 1991 to the effect of:

"It doesn't take economics into consideration. Most people/companies aren't
that interested in sharing their documents in an anonymous way."

WWW is a web of documents. SemWeb is a graph of data.

More reading: <http://www.w3.org/People/Berners-Lee/>
<http://www.w3.org/DesignIssues/>

~~~
matei
You have a good point, but we're not using the web today in the
document-oriented way of the early 90's. Sure, if someone invents a way to use
the semantic data that generates income or market value for the data
owner/publisher, then the SemWeb will take off. However, in its currently
presented form, I don't see a strong incentive for any profit-oriented agent
to just put out their data in this way.

------
keefe
<http://www.rdfabout.com/>

I think this is a better intro for technical folk.

------
joshsegall
I'm with henrikschroder and call bullshit on semantic web being a big thing,
let alone the _next_ big thing. It's an attractive idea, but flawed. There are
some niches where structured semantic data will flourish (see examples in
other comments), but it will be a small subset of the web.

A better idea is extracting latent semantic information from the existing
messy web. However, "meaning" is extremely difficult to characterize, and
attempting to encode it in an interoperable manner inevitably leads to a
lowest common denominator approach. That will probably still provide tons of
value and be much more ubiquitous than structured data, but will ultimately be
shallow and fall far short of the vision most semantic web proponents
evangelize.

------
bhatau
In a world where Facebook's oddly named "Graph" API does not allow an
authorized user to see friends of his friends, the Semantic Web is just a
daydream.

