
RDF Semantic web research isn't working - bkudria
http://www.zacker.org/semantic-web-research-isnt-working
======
bravura
The problem is not, as the blog post on zacker.org argues, poor adoption.
Rather, there are more fundamental problems with formal representations of
semantics.

Once you sit down and try to encode all knowledge in a formal manner, you run
into Gödelian incompleteness. Tugging on the string
and trying to resolve inconsistencies leads to more special cases and more
descriptions, which in turn lead to more special cases and more descriptions,
and so on. In "Ambient Findability", a book on the future of information
finding, author Peter Morville calls Knowledge Representation the _"quagmire"_
of good old-fashioned AI.

There is a certain benefit to the formal representations used by the semantic
web: communication between different systems is enabled by exchanging
discrete tokens. The discreteness of these tokens makes the choice of token
unambiguous, but does not mean it is a perfect map of meaning. This is
similar to language, which is an incomplete and imperfect representation of
what is nebulous and informal in the brain. For example, if I use the word
"healthy", you can be pretty confident that I used this particular word
(assuming you heard me okay), but you are not necessarily confident that you
know precisely what it actually means. We communicate with distinct tokens
not because they are a perfect map of meaning, but because it's better than
using continuous linguistic representations. Imagine if you weren't even sure
what word I used at any given point in time, because words fluidly flowed
into each other. So even though knowledge is better modeled probabilistically
and continuously, it might be difficult to communicate with a language whose
representations were explicitly probabilistic and continuous.
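To make the "discrete tokens" point concrete, here's a minimal sketch in plain Python (the URIs and helper functions are illustrative, not a real vocabulary): two systems exchange triples whose predicate is an agreed-upon token, so matching the token is unambiguous even though the concept behind it stays fuzzy.

```python
# Two systems exchange knowledge as discrete triples (subject, predicate, object).
# The token (a URI) is unambiguous even though the concept it names is not.

HEALTHY = "http://example.org/vocab#healthy"  # hypothetical vocabulary term

def publish():
    # System A asserts a fact using the agreed-upon token.
    return [("http://example.org/people#alice", HEALTHY, "true")]

def consume(triples):
    # System B can match the token exactly: there is no ambiguity about
    # *which* term was used, only about what "healthy" ultimately means.
    return [s for (s, p, o) in triples if p == HEALTHY]

subjects = consume(publish())
```

The point being: the lossiness is in the token-to-meaning mapping, not in the token exchange itself.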

The semantic web shouldn't be viewed as a be-all and end-all perfect encoding
of knowledge. Rather, it should be viewed as a language for machines to
communicate: imperfect, but better than having no language at all.

~~~
joshu
One important thing to note is that semweb is about the semantics of the
SCHEMA, not the semantics of the DATA. Everyone appears to have forgotten
this.

~~~
bravura
What does that mean, precisely?

~~~
benklaasen
@bravura - it's about the distinction between how we represent our description
of the data and how we represent the data itself.
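One way to see the schema/data split is with a toy sketch (the vocabulary names here are illustrative, mimicking RDFS/OWL conventions, not real code against an RDF library): schema triples describe the vocabulary, data triples describe individuals, and a reasoner can use the former to infer facts about the latter.

```python
# Schema triples describe the *vocabulary* (what a Person is, what "knows"
# relates); data triples describe *individuals*.

schema = [
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:knows", "rdf:type", "owl:ObjectProperty"),
    ("ex:knows", "rdfs:domain", "ex:Person"),
]

data = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
]

def infer_domain_types(schema, data):
    # A toy use of an rdfs:domain axiom: any subject of a property is
    # inferred to be an instance of that property's declared domain.
    domains = {s: o for (s, p, o) in schema if p == "rdfs:domain"}
    return [(s, "rdf:type", domains[p]) for (s, p, o) in data if p in domains]
```

That distinction is what joshu is pointing at: the formal semantics live in the schema layer, not in the instance data.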

------
mark_l_watson
Well, I disagree with most of this 3+ year old article. Sure, the SW has got
off to a slow start - partially I think because people would look at an XML
serialization of RDF data and say WTF :-)

There are good open source tools (Sesame, Jena, SWI-Prolog semweb libraries,
etc.) and commercial support (AllegroGraph, etc.). If the motivation is there,
the tools are available.

Also, Reuters' Open Calais (free) web services do a very good job at
generating semantic data automatically.

The next thing we need is simply more people writing applications that use
linked data sources.

A tired cliché, but this is like fax machines: eventually there will be more
linked data available, so the rate of app development will increase, and vice
versa.

------
hvs
Some people believe -- myself included -- that the idea of "Semantic Web" is
basically an impossible idea to implement on the scale of the Internet:

<http://www.well.com/~doctorow/metacrap.htm>

------
ilaksh
RDF doesn't do much of anything. It's about OWL. I wish some people would
learn about OWL.
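For what it's worth, the kind of thing OWL adds on top of bare RDF is inference. A toy sketch of one construct, owl:TransitiveProperty (this is a hand-rolled illustration, not a real OWL reasoner): declaring a property transitive lets a reasoner derive triples that were never asserted.

```python
# Toy forward-chaining over a property declared transitive (think
# owl:TransitiveProperty on a "locatedIn" relation).

def transitive_closure(pairs):
    """Chain (a, b) and (b, c) into (a, c) until no new facts follow."""
    facts = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(facts):
            for (b2, c) in list(facts):
                if b == b2 and (a, c) not in facts:
                    facts.add((a, c))
                    changed = True
    return facts

asserted = {("Paris", "France"), ("France", "Europe")}
derived = transitive_closure(asserted)
# derived also contains ("Paris", "Europe"), which was never asserted
```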

------
elblanco
The Semantic Web is simply a modern synonym for "AI". It promises essentially
the same things, functions in essentially the same ways and uses most of the
same principles, theories and technologies. Except now it's relying on
super-large, internet-scale versions of all this. But really it's nothing
different from predecessors like Cyc. It'll be a failure for exactly the same
reasons.

What semantic research regularly ignores is that describing a system with a
semantic representation is roughly equivalent in effort and scale to making
the original. Until a semantic graph describes the universe it is expected to
reason about (which is akin to _being_ the universe), its utility is extremely
limited. Even domain specific applications might contain billions of entities
and relationships. Who builds this giant hairball? The argument is that the
crowd should, but I'm too busy building my application to build what
essentially amounts to a bunch of virtual synapses in a virtual brain that
I'll derive no benefit from in the lifetime of my web app.

On top of that, we know almost nothing about building the reasoning systems on
top of this kind of enormous graph. We have little toy models of a few billion
relationships, but what we _want_ from the semantic web is a virtual agent
with super-human intelligence capable of reasoning about the world like we do,
just using a brain with the capacity of the internet. Basically, we want
Kurzweil's singularity. But we have no real idea, beyond very simple cases,
whether the reasoning systems on these toy models do anything generally
useful.
The reason our brains work is that we are designed to work on limited sets of
data, and infer information that we don't know about. We popularly call this
imagination. For an idea like the semantic web to be useful before the graph
is complete, the reasoning systems have to essentially be able to imagine.
Data completeness and logical inferencing on that graph are simply not enough.

Even experiments like Cyc, where many man-decades have been spent building
essentially a similar thing, have yielded no real applications...even on very
constrained and extremely limited domains like the Terror Knowledge Base. You
know why? Because reasoning about the motivations of a terror group quickly
starts to involve secondary and tertiary considerations (and higher) to be
useful. Two bad guys working in coordination might not make any sense unless I
know that they went to middle school together. Now I have to teach the system
what a school is, what classes are, why that makes a difference, how that
might impact future predictions....and now I'm quickly down in the weeds. And
the system is not likely to be able to imagine different kinds of schools and
then different kinds of social gatherings etc. that might predict similar
kinds of coordination in the future.

The problem is not exactly a chicken-and-egg problem (nobody will participate
in constructing the semantic web until the semantic web is up and running).
The problem is that even if all of us started building the semantic web
tomorrow, it would be generations before we had a real graph to work off of.
Who knows how long before we had enough reasoning systems on the graph to do
anything really useful. And for all that effort? Perhaps a few percentage
points' difference over the capabilities we already have with far less effort.
Until we can take something like the Cyc project and turn it into a useful
system _right now_, larger-scale systems like the Semantic Web are essentially
useless.

The real problem, though, is that semantic web researchers simply aren't
learning from these prior efforts. Their idea is to just "do it bigger", which
is not helpful. The semantic web is supposed to be a model of the world, but
our small models of this proposed model don't actually do anything. Even a toy
car can still roll. The real answer is: let's figure out how those efforts
failed, beyond just not being "big enough". Basically, we failed in the past,
and the idea is to just try harder -- this is essentially the definition of
insanity.

------
mhausenblas
Hacker NEWS (?) re an article from 2006? Come on, you can do better ...

~~~
urlwolf
I work on semantic web search and the issues he mentions are still mostly
relevant.

------
jokull
Has anyone used 4store? They claim amazing query speeds.

