
A Survey of the First 20 Years of Research on Semantic Web and Linked Data [pdf] - kkdw
https://hal.inria.fr/hal-01935898/document
======
gambler
RSS/Atom is being slowly killed. XHTML was replaced by a pile of junk that can
only be parsed by a single parser. HTML in general became a rendering layer
for executable JavaScript. After years of doing integrations, I've seen one(!)
HATEOAS web service, and it's being replaced by GraphQL right now.

I like the idea of having semantic content, but it is suffocating under the
weight of all the overcomplicated designed-by-committee abstractions and
formats. In fact, the real web that we're using is becoming less and less
semantic every year.

I wish someone re-standardized XML without all the complicated edge-case
garbage and with better handling of namespaces, so it would actually get used
for documents again. JSON is not a good document format, and it seems everyone
is busy re-inventing XML in JSON right now.

Let's say I want to define my own concept and put it on a web page. Is there
even an "official" way to do it right now, without namespaces in HTML5?
Theoretical example:

    
    
        <html>
        <head>
        <concept tag="computer-game" guid="c1a3e0cb-5872-4e5a-8dce-b8afc00772db" />
        <concept attribute="year-published" guid="ba073188-8bd6-47a7-9065-eb55c4a8b908" />
        </head>
    
        <body>
            <computer-game year-published="1993">Doom</computer-game>
        </body>
        </html>
    

Simple, isn't it? I'm not aware of anything of this sort.

~~~
zozbot123
> Let's say I want to define my own concept and put it on a web page. Is there
> even an "official" way to do it right now

What's wrong with Microdata? As in
[https://en.wikipedia.org/wiki/Microdata_(HTML)](https://en.wikipedia.org/wiki/Microdata_\(HTML\)).
It can interoperate with RDF and JSON-LD.
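For the "Doom" example upthread, Microdata markup might look roughly like this (a sketch; VideoGame, name, and datePublished are real schema.org terms, but the snippet hasn't been run through a validator):

```html
<div itemscope itemtype="https://schema.org/VideoGame">
  <span itemprop="name">Doom</span>
  (<span itemprop="datePublished">1993</span>)
</div>
```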

~~~
gambler
First time I've heard of it. Let's see.

It seems to aim at roughly the same problem, but god, they've managed to
design something even uglier and more verbose than XML namespaces. It still
requires you to use valid URLs instead of GUIDs. It still forces you to use
external names in your markup. Sigh.

One hard limitation I see is that, unlike namespaces, it doesn't deal with
HTML attributes, so you can't just annotate the same tag with unrelated
semantics.

It also makes assumptions about document structure (item properties are nested
inside an item). Semantics and structure should not be complected. This will
create innumerable issues when doing logic in code or writing CSS queries.

Why does this have to be so ugly and complicated? All we really need is a way
to associate data with globally unique identifiers. This gives us semantics. A
single layer of indirection (mapping GUIDs to local names of your choice)
would allow us to be concise and descriptive in our markup, while avoiding
naming conflicts.

No matter how much W3C hates it, people _are_ extending HTML with custom tags
and attributes. (ng-whatever, for example.) Having no support for providing
semantics for such things is pretty ridiculous.

~~~
zozbot123
> Still requires you to use valid URLs instead of GUIDs.

Not sure why you would _want_ to do this, but AIUI you can use any URI for a
namespace (not just a URL), including a UUID via the
`urn:uuid:c1a3e0cb-5872-4e5a-8dce-b8afc00772db` syntax.

~~~
titanix2
You can, but most of the time, especially with Linked Data, a URL is required,
and sometimes even a dereferenceable one. So you have this weird situation
where you never really know if a URL is "semantic" (just a string, actually)
or can actually be accessed. Also, having a dependency on the DNS system is a
bad idea in my opinion. It's really obvious when you read a two-year-old
semantic web paper and every link is broken.

I agree with gambler that GUID + local names is a better solution, and I use
that in my research.

~~~
CuriousSkeptic
I’m thinking IPLD looks like a great fix for that particular issue.

[https://ipld.io](https://ipld.io)

~~~
zozbot123
It seems rather overengineered to me. IETF and W3C standards already support
the general content-addressing, "naming things with hashes" use case via the
existing ni:// (Named Information) URI scheme (RFC 6920).
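The scheme is simple enough to sketch in a few lines: an ni URI names the hash algorithm and carries the base64url-encoded digest with the padding stripped. A minimal Python sketch:

```python
import base64
import hashlib

def ni_uri(data: bytes, authority: str = "") -> str:
    """Build an RFC 6920 'ni' URI: sha-256 digest, base64url, padding stripped."""
    digest = hashlib.sha256(data).digest()
    b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni://{authority}/sha-256;{b64}"

# RFC 6920's own worked example hashes the content "Hello World!":
print(ni_uri(b"Hello World!"))
# ni:///sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk
```

The expected output above is the worked example from RFC 6920 itself.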

------
PaulHoule
From the viewpoint of an application creator, what I find missing here is
that very similar goals have been pursued by the OMG and other organizations.
Similarly, RDF data works with F-Logic, XSB Prolog and a wide range of non-
SPARQL tools. There has also been a proliferation of document and graph
databases that compete with SPARQL databases. (Many semweb refugees use
Couchbase or ArangoDB.)

I'd like to see that side-by-side with RDF because then you'd see we've made
more progress than most people think.

~~~
larsga
I don't really see much connection with what OMG have done. Yes, they did do a
little work on an Ontology Definition Metamodel, but I wouldn't take that too
seriously. I don't think they understood what they were doing.

Similarly, I don't think graph databases have that much in common with RDF.
Yes, both are graphs, but they're quite different, really.

~~~
PaulHoule
What I've seen with OMG-promulgated standards is that the people involved know
what they are doing, but (1) they do a bad job of communicating it, (2) people
do a bad job of understanding it, (3) as with the W3C, compromise in the
standards process leads the specification to miss the last 20% that you need
to make something that really works, and (4) some adopters of the standard see
filling in that last 20% as what differentiates them from competitors, so the
standard is not so standard.

There are many homologies between W3C and OMG standards. One of them is that
there is a mapping between the semantics of documents and the semantics of API
calls, object definitions, etc., linking all the way back to the CORBA
standards. Another is between the XSLT/XPath functions and the "Object
Constraint Language". The OMG and W3C maintain a largely overlapping list of
primitive data types, for instance.

Neo4j and similar products tend to support the "property graph" model, which
can be modeled with RDF/SPARQL.
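One common way to do that mapping (shown here as a toy sketch, with plain Python tuples standing in for triples and made-up ex:/rdf: names) is to reify each property-graph edge as a node of its own, so the edge's attributes become ordinary triples:

```python
# A property-graph edge with attributes:
#   (alice) -[:KNOWS {since: 1999}]-> (bob)
# can be reified in RDF as a node of its own:
triples = {
    ("ex:e1", "rdf:type",  "ex:Knows"),
    ("ex:e1", "ex:source", "ex:alice"),
    ("ex:e1", "ex:target", "ex:bob"),
    ("ex:e1", "ex:since",  "1999"),
}

# The original edge and its attribute fall out of ordinary triple lookups:
src   = next(o for s, p, o in triples if s == "ex:e1" and p == "ex:source")
since = next(o for s, p, o in triples if s == "ex:e1" and p == "ex:since")
print(src, since)  # ex:alice 1999
```

This is the extra information larsga mentions below: the edge identity and the source/target roles have to be made explicit, because plain triples have nowhere to hang edge attributes.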

As for document databases, that gets at the most obscure piece of RDF magic: a
document full of facts is an RDF "graph". You can take the union of all of the
facts in two documents and that is also a graph. You can take the union of all
the facts in two million graphs and run SPARQL queries on it _without_ doing
any data transformation or import!
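That closure-under-union property can be illustrated without any RDF tooling at all; here plain Python sets of (subject, predicate, object) tuples stand in for RDF graphs, and the ex:/rdf: names are made up:

```python
# Each document is just a set of (subject, predicate, object) facts.
doc_a = {("ex:doom", "rdf:type", "ex:ComputerGame"),
         ("ex:doom", "ex:title", "Doom")}
doc_b = {("ex:doom", "ex:yearPublished", "1993"),
         ("ex:quake", "rdf:type", "ex:ComputerGame")}

# The union of two graphs is itself a graph -- no schema migration,
# no import step, no transformation.
merged = doc_a | doc_b

# A query over the merged graph sees facts from both sources at once:
games = sorted(s for s, p, o in merged
               if p == "rdf:type" and o == "ex:ComputerGame")
print(games)  # ['ex:doom', 'ex:quake']
```

A real triple store does the same thing at scale, with URIs as identifiers and SPARQL instead of a comprehension.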

This is a necessary condition for a "universal solvent" for combining data
from multiple sources but RDF standards haven't been sufficient. Serious
semwebbers know about techniques like "smushing" that go a long way towards
finishing the job, but oddly these are not incorporated into standards or
widely known among beginners.

~~~
larsga
These mappings are exactly what I'm talking about. I advised the OMG on one of
the mappings in the ODM, and I don't think they understood what they were
mapping. Modeling OWL with UML gives you very little. Then mapping Topic Maps
onto the same level as OWL is ... just misguided.

Semantics means something very different to the OMG from what it does to the
RDF/OWL community. That's the root of the problem. To the RDF/OWL folks it
means "mathematically based logical inference of new statements", whereas to
the OMG it seems to mean "human-readable text".

Yes, you can map a property graph into RDF, but to make it work well you
usually have to add a lot of information that's not in the original data.

Thanks, I know how RDF and SPARQL work.

~~~
PaulHoule
UML is not the only standard pushed by the OMG.

CORBA and related standards have well-defined semantics. So does BPML.

Human-readable definitions are important. One thing I see missing in both the
OMG and W3C worlds is a realistic approach to model visualization. For
instance if you try to draw a large OWL ontology or a large UML diagram you
might need to blow it up to a full wall just to see everything, never mind
understand it.

Really, you need to be able to paint graphical elements onto a graph to show
which nodes and relationships are relevant to a particular situation or use
case.

Many people don't understand OWL because it doesn't actually "make sense".
That is, without mechanisms for data validation, you don't know that inference
is going to proceed in a correct way; rather, you get a "garbage in, garbage
out" situation that produces new bad facts. Given that the official
explanation doesn't make sense, it is natural that people fall back on
something they understand.

------
chasingthewind
I worked at a company that got sold on the Semantic Web and made an initial
investment in tools and training. While I found the concepts intriguing, the
teams tasked with making it work gave up after about six months of trying to
apply it to the use case they had. I understand that it can be a powerful set
of concepts for certain kinds of use cases, but it feels like the level of
dedication and care needed to make it work is probably beyond many
organizations' ability to execute.

The impression I got was that it was like deciding to use a Kibble Balance [0]
to weigh yourself in the morning. You have to match the use case to the tool
and for many organizations this simply will not be the right tool.

[0]
[https://en.wikipedia.org/wiki/Kibble_balance](https://en.wikipedia.org/wiki/Kibble_balance)

~~~
specialist
I worked for a semantic web startup, back when it was the next big thing.

We had a tool (UI & backend) purportedly for managing ontologies, taxonomies,
vocabularies.

Our customers were begging for real world solutions. Would have paid any
price.

Our leadership (CTO) was mesmerized by metametametadata. Not kidding. And had
zero interest in customers' real needs.

Such a missed opportunity.

The two lessons I took away...

1/ Most real world modeling problems are some narrowed use case of knowledge
representation. Our customers didn't want a general purpose tool. They wanted
something tailored (customizable) for their immediate use cases. As a UI
designer, I guess I should have realized this quicker. My only defense is
initial lack of customer interaction.

2/ At the time, for general-purpose graph spelunking, there was no UI solution
for the "focus+context" problem: human-sized ways to query, represent, and
navigate large graphs, all in one.

I did come up with a novel UI/UX that I felt would solve "focus+context", but
we ran out of runway before I could get past the lo-fi prototypes.

On my to-do list is to take another run at the problem, leveraging Neo4j's
(awesome) Cypher query language. I may discover that Neo4j's UI has already
solved the "focus+context" problem.

~~~
CuriousSkeptic
Would love to see examples of useful UX. Do you have anything to share?

I have this hobby project where I’m thinking of using some kind of knowledge
graph to represent beliefs about scientific facts and urban myths. An attempt
to crowd source peer review of the pop-sci cited in online fora of various
kinds.

My prototype app is a nutrition planner based on dietary recommendations
sourced from the graph in question.

Not being a UX designer, I’m a bit stuck on how to approach it.

~~~
specialist
Very belated reply, but I feel I owe you a response...

Were I to implement my UI today, it'd mostly look like a query builder for
Cypher.

I mocked up my UI in the early 2000s. There was nothing comparable to Neo4j's
Cypher query language back then. Sadly, I wasn't clever enough to invent it
myself. It's so obvious once you see it. To the best of my knowledge, it's the
only graph query language that explicitly models both the nodes and the edges.

In sum, my UI would be more useful for developers and enthusiasts, and less
useful for any specific use case.

Happy hunting!

~~~
CuriousSkeptic
Much appreciated, thanks!

------
planck01
The Semantic Web is dead. It never really amounted to anything. Even the
regular decentralized Web has mostly died; it was replaced by mega-
corporations' walled gardens. 95% of content is in there: in Facebook, Google,
Twitter, Instagram, YouTube, Netflix, Reddit, Twitch, Medium and a few small
others. The rest is a skeleton that's barely touched. The Semantic Web isn't
even necessary or useful anymore, even if its technology were good.

~~~
jjtheblunt
except for wikipedia, no?

------
sgt101
The irony of this paper kicking off with three word clouds reminded me of
everything that happened with the semantic web community and movement.

------
nickbauman
The semantic web is only useful if you believe syllogisms are the way to
understand the world.

[http://www.shirky.com/writings/herecomeseverybody/semantic_s...](http://www.shirky.com/writings/herecomeseverybody/semantic_syllogism.html)

~~~
larsga
Clay Shirky unfortunately has no idea what he's talking about. You can
interpret a CSV file as a set of syllogisms, too, if you want. RDF is data,
just like CSV, except the shape is different. CSV is good for some things. RDF
is good for other things.

RDF is a fantastic way to aggregate data from many different sources, for
example. CSV sucks at that.

The problem is that RDF has been marketed so poorly, which has totally
confused people like Shirky.

~~~
nickbauman
Can you point to an example or two where RDF is a "fantastic way to aggregate
data..."? My sense is that things like microformats, which are more flexible
and narrowly defined than RDF, are useful for this, too, but it hardly
constitutes a "semantic web". It's just obvious connections between things.
Something like a custom type of hyperlink is all. Not that this "all" is bad,
it's just not going to revolutionize the world or anything more than the
existing web already has.

------
simplecomplex
The semantic web is alive in a different form: the microblogging community
using microformats.

See indieweb.org and microformats.org, plus Micropub, Microsub and WebSub.

------
z3t4
I think the main reasons why the semantic web has not become popular are: a)
the chicken-and-egg problem. There are no search engines or aggregator sites
that make use of it, so we don't bother marking up our data. b) XML is very
alien to most people; we need to bring the semantic web into graphical user
interfaces! c) It's hard to sell the concept because it's hard to imagine the
use cases. We somehow need to bootstrap by creating data and example services;
people will want it once they see it.

------
gibsonf1
It seems the Semantic Web is truly taking off with MIT/Inrupt.com's Solid
effort, where all communication between decentralized machines is via RDF
triples. We are about to launch this commercially.

~~~
watersb
It may alarm you, but I cannot tell if you are joking or not.

Cisco was running tuple-space data stores in 2001. The queries/unifications
were pretty simple; I can't recall their use case. But they were using a
system internally and were very happy with its performance.

I never heard about it again, after a couple of conference papers.

I could imagine that the project died because it was impossible to sell.
Semantic Web researchers find nothing odd in the discovery that a networking
company has some understanding of applications of graph theory. But convincing
upper management, or customers, that they can compete with Google and Oracle
at the same time... no.

I hope your project breaks through.

~~~
nl
Tuple space projects were pretty popular commercially.

There was a pretty decent implementation in Java[1], as well as very scalable
distributed implementations.

The problem is that they are stuck in the space between the flexibility and
developer-friendliness of databases and the KISS approach of a simple cache.

[1]
[https://en.wikipedia.org/wiki/Tuple_space#JavaSpaces](https://en.wikipedia.org/wiki/Tuple_space#JavaSpaces)

