
I don’t use Semantic Web technologies anymore, though they still influence me - mdlincoln
http://www.lespetitescases.net/why-I-dont-use-semantic-web-technologies-anymore-even-if-they-still-influence-me
======
zcw100
I could write a book on what's wrong with the semantic web. One of the worst
isn't even technical, it's the community. There are some great people in the
community but there are also a large number of extremely toxic people that
drive people away. If the technology ever takes off it's going to be because
some outside community cherry-picks the good parts and tells those people to
f-off. That's already starting to happen and you'll hear no end of bitching
from people in the semantic web community about how they're reinventing what
they've already done years ago. Guess what? You're right. You're so toxic that
it's worth redoing everything if it means they don't have to deal with the
toxic attitudes.

~~~
markhollis
I'm curious what those toxic attitudes are. Surely the "we already invented it
and you're reinventing it" can't be the only case. I'm also curious whether
it's in academia or in industry.

~~~
wrnr
My take on the attitude in academia: Here we describe a set of algorithms that
can solve a class of problems that previous algorithms can't. In the '60s
someone published a solution to a problem, which we have improved upon with
the novel innovation called "hyperlinks". The technical, social, and economic
shortcomings of our solution are invalid because it is decentralised and
therefore morally superior to the current offerings, used the world over, of
industry practitioners who are only doing it for the money. More funding is
needed for further research.

~~~
nl
In general, decentralisation fetishism isn't something that is big in academia
(as in the academia that publishes papers). There are lots of issues in
academia and even more with the semantic web, but fetishism of
decentralisation isn't one of them.

------
JimmyRuska
Semantic web tech solves a common problem. You have a database where you want
to have some shared schema among many groups, and you want a way to infer
facts based on first-order logic. You want to be able to query multiple
sources and reason about facts while taking multiple sources into account.

Whether you use semantic web tech or not, that's still a common problem that
doesn't always have a good plug-and-play solution. There are still a lot of
places using the JSON-LD format for metadata and cataloging information. You
can google cooking recipes and get ratings and cook times; search for movies
and see how highly rated a movie is and who made it, with a synopsis of the
plot. All of these are product metadata powered by RDFa or JSON-LD, a relic of
the semantic web. It would be incorrect to say the semantic web is dead. Any
AI that could effectively use Wikidata as a fact table would be
Jeopardy-grade. There are still new tools coming out, like RDFox, that apply
first-order logic at multicore speed across huge datasets for reasoning, and
there is work being done to make it horizontally scalable. I think people will
just go in an endless loop of hitting the same pain points and creating new
tools using the trending tech of the day, but even in this day and age,
sometimes something like Prolog or Picat is what you need.
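
The "infer facts based on first-order logic" idea can be sketched as a tiny
forward-chaining loop over subject/predicate/object triples. This is only an
illustration: the data and the `worksFor`/`orgPartOf`/`affiliatedWith`
predicates are made up, and real reasoners (e.g. over RDFS/OWL) use
standardized entailment rules and far smarter algorithms.

```python
# Minimal forward-chaining inference over (subject, predicate, object) triples.
# All predicates and data here are hypothetical, for illustration only.

def infer(triples, rules):
    """Apply rules repeatedly until no new facts appear (a fixpoint)."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for fact in list(facts):
                new = rule(fact, facts) - facts
                if new:
                    facts |= new
                    changed = True
    return facts

# Rule: if S worksFor O and O orgPartOf P, then S affiliatedWith P.
def org_rule(fact, facts):
    s, p, o = fact
    if p == "worksFor":
        return {(s, "affiliatedWith", parent)
                for (x, q, parent) in facts
                if q == "orgPartOf" and x == o}
    return set()

data = {
    ("alice", "worksFor", "labX"),
    ("labX", "orgPartOf", "uniY"),
}

result = infer(data, [org_rule])
# result now also contains ("alice", "affiliatedWith", "uniY")
```

The point of the sketch is the shape of the problem: facts from multiple
sources go into one set, and rules derive new facts until nothing changes.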

~~~
zozbot234
> you want a way to infer facts based on first order logic

Isn't that computationally infeasible? Semantic web standards are based on
description logics, i.e. multi-modal logics chosen specifically for
computational expediency.

Also, I wouldn't describe JSON-LD as a "relic" of anything. It's a fairly
recent standard in the grand scheme of things, and many interesting projects
these days implicitly rely on it.

~~~
aggerdom
I'm not an expert in this area, so I'd love it if someone corrects me. My
understanding is that inference in full FOL is generally infeasible
(undecidable, in fact), and even propositional logic can be computationally
difficult [1]. My understanding is that most semantic web stuff is done using
a description logic of some flavor. These are named based on the properties of
the logic. The important thing is that they are generally decidable, and you
can use something like MALET or some other solver to infer things from your
database or ontology. (You give up some expressivity for decidability.) Not
sure how much is going on with that these days. I played with a petrology
ontology in Protégé some back in college, but haven't followed the space. I
remember OWL being important, but can't remember why at the moment.

[1] For example, if you try to figure out whether a formula is satisfiable.
You can for sure do this using truth tables; the catch is that you're looking
at 2^n complexity, where n is the number of propositions in your formula.
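
The 2^n blow-up in the footnote is easy to see in code: a brute-force
satisfiability check simply enumerates every truth assignment. (A toy sketch;
real SAT solvers use far smarter search than this.)

```python
from itertools import product

def satisfiable(formula, variables):
    """Brute-force SAT: try all 2^n assignments, n = len(variables)."""
    for values in product([False, True], repeat=len(variables)):
        if formula(dict(zip(variables, values))):
            return True
    return False

# (p or q) and (not p or q) is satisfiable, e.g. with q = True.
f = lambda a: (a["p"] or a["q"]) and ((not a["p"]) or a["q"])
print(satisfiable(f, ["p", "q"]))  # True

# p and not p is a contradiction, so no assignment works.
print(satisfiable(lambda a: a["p"] and not a["p"], ["p"]))  # False
```

Ten propositions means 1,024 rows; forty means over a trillion, which is why
decidable-but-restricted description logics are attractive.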

------
hos234
I am still a fan of Google's OpenRefine tool. Its reconciliation feature,
which helps disambiguate named entities etc. based on Wikidata, is really
powerful -
[https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation](https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation)

You can hook in your own reconciliation end point which we do at work to
expand internal knowledge graphs.

~~~
nl
Note that OpenRefine isn't really kept up to date.

The basic capabilities work ok, but lots of the additional capabilities have
atrophied away.

------
ansible
I had a lot of interest in the semantic web when I first started learning
about it.

However, the efforts I've seen seem to be missing some critical factors for
longer-term success. I think we've got a lot of work to do with regards to
knowledge representation in general.

One of the big things for me is that the context for any fact is critical for
it to be true or not.

You can have a fact like "Tim Cook is the CEO of Apple", represented in a
graph like you would expect. However, that is only true _today_. Ten years ago
it was Steve Jobs. Without explicit context encoded in the information graph,
this web of data isn't as useful as it could be.

Context is important for reasoning in all kinds of situations. "What if Steve
Ballmer was CEO of Apple?" is a hypothetical context that it may be useful to
reason about. The context of "Who is the most distinguished captain of the
Enterprise?" could be about the real-world US Navy, or a fictional Star Trek
universe (of which there are multiple).
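
One lightweight way to encode the temporal context described above is to
attach a validity interval to each fact, a poor man's version of RDF
reification or named graphs. A sketch (the dates, predicate names, and query
helper are illustrative, not any standard's vocabulary):

```python
# Facts annotated with a validity interval; None means "still true".
facts = [
    # (subject, predicate, object, valid_from, valid_to)
    ("Steve Jobs", "ceoOf", "Apple", 1997, 2011),
    ("Tim Cook",   "ceoOf", "Apple", 2011, None),
]

def who_was(predicate, obj, year):
    """Return subjects for which the fact held in the given year."""
    return [s for (s, p, o, start, end) in facts
            if p == predicate and o == obj
            and start <= year and (end is None or year < end)]

print(who_was("ceoOf", "Apple", 2005))  # ['Steve Jobs']
print(who_was("ceoOf", "Apple", 2020))  # ['Tim Cook']
```

Without some slot like this for context, a plain triple store can only assert
a timeless "X is CEO of Y", which is exactly the problem the parent raises.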

~~~
mdlincoln
Context-dependent, or "reified", assertions are a pain point for sure. I come
from the perspective of cultural heritage data, where context is king. Which
expert made this attribution for this painting? Who owned it _when_? According
to which archival document? Etc.

Almost all the engineering problems cited in the original post are still
basically there, but graphical models are still the least painful way of doing
this, particularly when trying to share data between institutions. Example:
[https://linked.art/model/assertion/](https://linked.art/model/assertion/)

~~~
zozbot234
The OP mentions property graphs as a way around this problem. They can be seen
as natural extensions of "RDF quads", which in turn are based on common RDF
triples (Subject / Property / Object).
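
That progression can be made concrete: a triple, a quad with one extra
context/graph slot, and a property-graph edge whose relationship carries
arbitrary key/value properties. (Field names here are illustrative, not any
particular database's schema.)

```python
# A plain RDF triple: subject, predicate, object.
triple = ("alice", "knows", "bob")

# An RDF quad adds a single context slot (e.g. named graph / provenance).
quad = ("alice", "knows", "bob", "importedFromLDAP")

# A property-graph edge makes the relationship a first-class record that
# can carry any number of properties of its own.
edge = {
    "from": "alice",
    "to": "bob",
    "label": "knows",
    "properties": {"since": 2015, "source": "importedFromLDAP"},
}
```

The quad buys you exactly one extra dimension of context; the property-graph
edge buys you as many as you like, which is why it handles reified assertions
with less pain.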

------
sawaruna
Shoutouts to the 11 other people on HN still working with rdf and similar in
2020.

------
at_a_remove
At an old job, I knew some very idealistic folks who kept pushing semantic web
business. "Let's do that everywhere!" As an exercise, I would have them open a
browser, visit various sites, and then look at the source. "Go on, check to
see if it validates," I would say with an anticipatory grin. Whether
hand-crafted HTML or generated by any number of frameworks, many sites can
barely manage to close their tags; asking for semantic annotations is a "just
won't happen in practice" thing.

I have also seen a great deal of consultant money, programmer time, sys-admin
sweat, and the like focused on these toweringly-designed, completely-unused
triple stores, layer upon layer of hot technologies (ever-moving, construction
on the tower never ceased) fused together to create a resource-intense
monstrosity that, at the end of the day, barely got used. But hey, let's look
at that jazz semantic web example one more time.

The most painful part is that I understand the urge to build a gleaming
repository for information, where the cool URIs never change; SPARQLing
pinnacles, ready to broadcast the Library of Alexandria, glimmer; and the
serene manifold of abstract information lies RESTful ... but I have come to
understand that the web of today is an endlessly bulldozed mudscape where
Someone Very Important has to have _that_ URL top-level _yesterday_ (never
mind that they will forget about it tomorrow), of shoddy materials and wildly
varying workmanship, and where nobody is listening to your eager endpoints
because the commercials are just too loud. I too once labored for information
architecture, to have the correct thing in the obvious place, with accurate
links and current knowledge, to provide visitors with the knowledge they
desired ... but PR preempted all of it to push yet more nice photographs in
yet another place: the Web as a technology for distributing images that would
once live on glossy pamphlets.

The vision is lovely, but we who have always lived in the castle have walked
alone.

~~~
riffraff
I would argue the problem is not the broken tags, but the business
disadvantage to exposing semantic data.

Remember when microformats were all the rage, and you could get hReview or
hRecipe or XFN data everywhere?

Then every host in turn realized that actually, it's _better_ if people can't
scrape your site, and it's even better if they can't even see it and it's
behind a login wall.

~~~
acdha
“better” is too strong: in many cases, structured data is not a problem (and
if it is, people will scrape it anyway), but there's simply no business case
for spending time on it. Most of the semweb stack had a horrible developer
experience — bad documentation, tools, validators, etc. — and rarely delivered
tangible benefit for the time spent slogging through it.

The semantic data which has actually been implemented on a wide scale happened
because someone could go to their boss and say “Spending time on x will mean
better Google ranking” or “Facebook will use their new sharing display for our
pages”, and it was orders of magnitude simpler to implement so the time and
risk were far more palatable.

------
mark_l_watson
These are fair criticisms of the semantic web. One thing the author misses
(does not touch on at all) is domain-specific RDF resources for biology,
medicine, etc.

schema.org and Wikidata are great resources, and for large companies, using
these as a foundation for their own internal knowledge graphs can make sense.
This expense is (maybe?) too large for small and medium-sized companies; they
would not get enough benefit for the cost.

I worked with Google’s Knowledge Graph as a contractor, and I am still a
believer in the technology, but I also respect other people’s well-founded
scepticism.

------
AndrewStephens
I have some low-level hate for the Semantic Web. I run a small personal blog
that I maintain with a relatively simple static site generator I created,
which turns Markdown files into clean(ish) HTML.

A couple of months ago I got interested in adding semantic information to my
posts so I modified the generator to add some of the common semantic tags. It
was an annoying job, since the semantic information pollutes the structure of
the html.

Can anyone tell me what the semantic web does for me as a small-time
publisher? Is it for search engines? Does it really matter that a book review
(for instance, I have a few) is tagged properly?

~~~
lazyjones
> _Can anyone tell me what the semantic web does for me as a small-time
> publisher? Is it for search engines?_

Yes, in practice it is mostly for bigger fish in the pond to easily identify
and steal your content as needed.

For example, Google was using reviews from small competitors' sites in Google
Shopping.

~~~
abathur
I think this is one of the big issues. The semantic information does make it
easier for end users to find what they're looking for, but it also made it
possible for aggregators to deny sites their traffic.

In a lot of cases, the information was there to get eyeballs, so this is
undesirable.

I guess if you don't really care about the eyeballs it can be "useful" for the
big fish to pay most of the cost of serving the fraction of your server
response that the end user was looking for...

~~~
TeMPOraL
So the root problem is actually that people care about the eyeballs. Nothing
good comes from such incentive.

~~~
abathur
Maybe. Not sure what I think about that framing.

FWIW, I was picking "eyeballs" as something wider than just ad revenue. I
think ads are the big share, but I'm sure there are people/orgs who want
eyeballs for other reasons: ego/status, promoting their
company/brand/service/products, etc.

In some sense I think your framing is accurate, but I don't know about whether
we'd be better off (have an informational ecosystem that is more net-
positive?) without status chasers. Some share of them are inevitably gaming
the system and diluting the ecosystem; others probably add net value in
pursuit of eyeballs?

~~~
TeMPOraL
In the context of the semantic web, pursuit of eyeballs is a problem because
it makes the people owning/creating the data also want to deliver that data
directly to the users, and to be the only ones allowed to do so. The semantic
web works toward the opposite goal: to allow the data to be automatically
transmitted, processed, and understood by software, and only perhaps
eventually delivered in some form to the end user.

As for building more net-positive information ecosystems, going for the
eyeballs instead of actually caring to deliver good information isn't
necessarily bad _per se_, just suboptimal. It's better for an eyeball-chasing
site to publish some information, if otherwise that information wouldn't be
published at all. But it's the eyeballs being your primary revenue source that
will make you work hard to make the data as useless as possible outside your
own publication - which leads to a very unhealthy information ecosystem.

------
contravariant
With the recent widespread interest in Category theory I still think it's a
damn shame that RDF wasn't designed to treat relationships as stand-alone
entities. Perhaps property graphs work better in that regard, although it's a
bit weird how properties aren't themselves relationships, but perhaps that's a
necessary concession to keep things efficient.

------
tannhaeuser
I wouldn't call the semweb dead; it has just found its niche(s) and is even
stabilizing and gaining in those areas. I actually landed a gig for graph DBs,
SPARQL, etc., in lab informatics for bio/chem. Earlier this year I attended a
keynote given by Wikimedia Deutschland's Franziska Heine pushing for large
publicly available RDF datasets, etc.

------
abathur
I'm really interested in semantic authoring (not really structuring data with
semantics--but marking semantics within running text), though I guess I'm
disinterested in the semantic web.

I agree with a lot of the problems noted in other posts, and would add two
other problems from the authoring side:

1. Identifying and employing sound semantics requires a level of thought and
clarity that I don't think most people are habituated to working at. It raises
the bar somewhat on who can contribute (either they have to understand and
take care with the semantics, or you need a separate person to handle them?)

2. I may be missing some good tools, but I haven't been able to find a good
low-friction semantic authoring experience. Even if you are mentally prepared
to write with explicit semantics, it still adds a lot of friction to the
writing process (or requires subsequent semantic-edit passes).

------
buboard
Modern NLP makes the semantic web completely obsolete. If anything, you need
less markup because it's confusing and, more often than not, just wrong.

~~~
drongoking
This is too extreme. If, like Google, you have a flock of Ph.D.s you can put
onto an NLP problem to extract semantics from text, then semantic markup
becomes less valuable. Not all of us are in that situation. And I don't think
parsing text is the only application of the semantic web. Having huge
databases full of knowledge is interesting in itself.

As for semantic markup being confusing and usually wrong, I don't know where
you get that.

~~~
buboard
Yeah, but I think there is a difference between standardized markup data
formats describing e.g. proteins, and generic text with annotations. The
latter is redundant.

------
liminal
I really want to like semantic web technologies, but every time I try to get
into them I'm stymied:

* A zillion standards that all reference each other
* Two zillion incomplete and incompatible implementations of those specifications
* No sense of direction within it all (what's the easy path?)
* Multiple rebrandings of the same ideas (Semantic Web, Linked Data, Solid...)

~~~
zozbot234
"Solid" is just SOcial LInked Data anyway. I like LD as it seems clearer in
intent than the "semantic web" label.

~~~
Vinnl
Yeah, in that sense Solid is a subset of Linked Data: linking personal data.

------
austincheney
When writing data structures that are not for describing or defining services,
I still can't help but think in triples. I also can't help but think of each
data facet as something that, if described with metadata, would provide
sufficient context to make sense if it were read out loud to a stranger.

------
tylerjwilk00
Or responsive web design techniques apparently.

~~~
StuffedParrot
It reads fine for me on mobile.

