
SWI-Prolog for the semantic web - amkk
http://www.swi-prolog.org/web/index.html
======
samuell
For anyone interested, I did my Master's thesis on SWI-Prolog as a semantic
querying tool, integrating it with the Eclipse RCP platform (Bioclipse.net in
particular), and compared its querying performance with the Java-based Jena
SPARQL parser for some typical tasks in cheminformatics.

The title was "SWI-Prolog as a Semantic Web Tool for semantic querying in
Bioclipse: Integration and performance benchmarking", and it is available for
download here:
[https://www.researchgate.net/publication/50313589_SWI-Prolog...](https://www.researchgate.net/publication/50313589_SWI-Prolog_as_a_Semantic_Web_Tool_for_semantic_querying_in_Bioclipse_Integration_and_performance_benchmarking)

In this particular task, SWI-Prolog totally knocked out Jena, and it was also
more amenable to some heuristic optimizations, where the running time
really became infinitesimal in comparison to other tools.

~~~
craigching
Read through the paper quickly, seems like a nice representation and solution
of the problem. I have a couple of questions if you wouldn't mind.

You say this in the paper:

> It is an interesting observation that writing the Prolog query on the
> simpler form (Figure 18) made it amenable to heuristic optimization by
> sorting the values searched for, while this was not possible in the longer
> Prolog program

I'm afraid I didn't read carefully and will go back, but could you clarify
this a bit? I didn't understand about the longer vs. the shorter prolog code.
Would this optimization always be required to get better performance than Jena
or Pellet?

And then this:

> Additionally, a drawback of SWI-Prolog specifically, against Jena and Pellet,
> is that since it is not written in Java, it is not as portable (i.e. the
> same code can not easily be executed) to different platforms such as Mac,
> Windows, Linux etc. Instead the source code has to be compiled separately
> for each platform. This also has the result that the SWI-Prolog Bioclipse
> integration plugin will not be as portable as Bioclipse itself.

Really, though, Bioclipse is dependent on the portability of Eclipse which
probably doesn't support any more platforms than SWI Prolog (and most probably
fewer than SWI Prolog), so I wouldn't really see that as a limitation. I would
think you could provide the binaries in the distribution itself.

~~~
samuell
I might not have used the optimal wording there. The important difference
between the two Prolog implementations is that the "longer" version is
implemented by (recursive) iteration over a list of [peak] values (and
comparing them to a reference list), whereas in the shorter one, the reference
values are specified as separate conditions in the Prolog rule, combined with
logical AND.

The separate conditions enable "short-circuiting" of the match-testing, in a
way that the recursive list-parsing did not: As soon as one of the conditions
is not met, the current spectrum will be rejected and the backtracking will go
on with the next item, while with the recursive list-parsing version each item
of the list of values will be compared to the reference [value-]list regardless
of whether the current spectrum has already been rejected or not.

The "shorter" Prolog version, with separate conditions, thus makes it possible
to order the conditions so that statistically rare peak values come first
(thus, "heuristically"), so that a spectrum can be rejected as soon as
possible.

Being a heuristic solution, the performance would of course depend on the
prior knowledge of the values in the data.
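
The difference can be sketched outside Prolog. The Python below is an illustration only, not the thesis code: the spectra, the peak values, and all function names are made up, and `match_full_scan` is just a rough analogue of the list-walking version (it tests every reference value even after a mismatch).

```python
from collections import Counter

# Hypothetical toy data: spectrum id -> list of peak values.
SPECTRA = {
    "s1": [10, 20, 30],
    "s2": [10, 20, 40],
    "s3": [10, 30, 50],
    "s4": [10, 20, 30, 99],
}
REFERENCE = [10, 20, 99]


def rarity_order(reference, spectra):
    """Sort reference peaks so the value seen in the fewest spectra comes first."""
    counts = Counter(p for peaks in spectra.values() for p in set(peaks))
    return sorted(reference, key=lambda p: counts[p])


def match_short_circuit(peaks, ordered_reference):
    """Separate conditions joined by AND: stop at the first one that fails."""
    peak_set = set(peaks)
    tests = 0
    for ref in ordered_reference:
        tests += 1
        if ref not in peak_set:
            return False, tests
    return True, tests


def match_full_scan(peaks, reference):
    """Rough analogue of the list-walking version: test every value regardless."""
    peak_set = set(peaks)
    results = [ref in peak_set for ref in reference]
    return all(results), len(results)


ORDERED = rarity_order(REFERENCE, SPECTRA)
TOTAL_SHORT = sum(match_short_circuit(p, ORDERED)[1] for p in SPECTRA.values())
TOTAL_FULL = sum(match_full_scan(p, REFERENCE)[1] for p in SPECTRA.values())
```

With this toy data the rare value 99 is tested first, so three of the four spectra are rejected after a single test: 6 membership tests in total versus 12 for the full scan. How much is gained in practice depends on how well the ordering reflects the real data.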

But as can be seen from figure 15, both prolog versions beat Jena and Pellet
by a large margin, though the "shorter" version did so well that it is even
hard to notice an increase of the running time linear to the number of triples
in the triplestore, in the diagram.

Hope that made it a tad clearer!

~~~
craigching
Yes, that helps a lot, thanks!

------
laurencerowe
How does the closed world assumption of Prolog (if it's not explicitly
specified it is assumed false) mesh with the open world assumption (if it is
not explicitly specified it is unknown) of RDF?
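
To make the two assumptions concrete, here is a minimal Python sketch. The fact set and function names are invented for illustration, and a real OWL reasoner does far more than this (it can, for example, derive that something is definitely false from other axioms).

```python
# Hypothetical toy knowledge base of (subject, predicate, object) triples.
FACTS = {("aspirin", "type", "Drug")}


def ask_closed_world(triple):
    """Closed world: anything not stated is taken to be false."""
    return triple in FACTS


def ask_open_world(triple):
    """Open world: anything not stated is unknown (None), not false."""
    return True if triple in FACTS else None


missing = ("aspirin", "type", "Analgesic")
closed_answer = ask_closed_world(missing)  # False
open_answer = ask_open_world(missing)      # None, i.e. unknown
```

The same missing triple yields a definite "no" under the closed world reading and "unknown" under the open world reading.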

~~~
jrbn
OWA exists only on paper. Practically all implementations assume CWA.

~~~
kendallgclark
That's false. Several practical implementations of OWL and RDF are based on
OWA. At least one is based on CWA and OWA.

------
mark_l_watson
I used to use the swi-prolog semantic web tools heavily. Very good stuff.

The ClioPatria semantic web server is interesting, but I have just played with
it. The core RDF storage and inference libraries were solid and nice to
use.

------
xamuel
The opening paragraph is structurally identical to "buy our nuclear power
plant, it'll generate a billion terawatts, oh and it also comes with this bike
shed". Prolog (the submission says) handles the gruesome problem of dealing
with RDF, oh and it also generates HTML pages and JSON!

That pattern's a red flag whenever I see it. Like an ostensible proof of P!=NP
that begins with a 30 page history written for laymen. Who is that paragraph
aimed at? Is there really a sizeable population casually using RDF and Prolog
but losing sleep over HTML and JSON?

~~~
brudgers
There are a lot of non-developer types who need to deal with metadata and for
whom the mechanics of websites are irrelevant, e.g. scientific researchers or
art historians or statisticians.

Librarians invented metadata not computer scientists.

~~~
orbifold
I'm pretty sure philosophers invented metadata, even though they did not give
it that name. The question of how to distinguish between the properties an
object has and the properties we attach to it is sort of central in epistemology.
Higher category theorists sometimes distinguish between stuff, structure and
property
[http://nlab.mathforge.org/nlab/show/stuff,+structure,+proper...](http://nlab.mathforge.org/nlab/show/stuff,+structure,+property),
which makes the definition of forgetful functors more precise. Conversely,
attaching structure or properties are (not necessarily unique) adjoint
functors to those forgetful ones. In mathematics this comes up for example if
you consider the category of abelian groups from which there is both a
forgetful functor to the category of groups and to the category of sets, but
the general idea should be applicable to metadata attached to text as well.

~~~
kendallgclark
The Aristotelian distinction between essential and accidental properties is
probably what you're thinking about here, but that's not really the same
distinction as the one between data and metadata.

------
samuell
There is also a really interesting biomedical toolkit for SWI-Prolog, which,
IIRC, uses or integrates with the semantic capabilities (for ontologies etc),
although it was a while since I looked at it, so I might recall wrong:

* BlipKit - Biomedical Logic Programming : [http://www.blipkit.org](http://www.blipkit.org)

------
possibilistic
Is semantic web tech being reliably employed to solve any big problems? (RDF,
RDFa, OWL, SPARQL, triple stores, graph dbs...?) Is it fast?

~~~
rspeer
It is not at all fast.

I don't even think it's computationally possible for SPARQL to be fast.

~~~
kendallgclark
SPARQL is PSPACE-complete. Worst case complexity and "fast in practice" aren't
really the same thing at all. I suspect average case complexity for SPARQL is
much better, which is backed up by several reasonably performant
implementations.

~~~
rspeer
What are these reasonably performant implementations? Have you tried them with
a billion edges?

Also, if your data store is "fast in practice" but has worst cases that are
PSPACE-complete, how do you prevent worst-case queries from DOSing it?

~~~
kendallgclark
Well, I'm a vendor in the space so I like my implementation:
[http://stardog.com/](http://stardog.com/) -- and yes we've "tried it" with
10s of billions of edges.

Worst cases are prevented from DOSing by having query management features like
auto-killing queries that run too long, etc.
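
As a rough illustration of the idea of killing queries that run too long, here is a minimal Python sketch. It is not how Stardog or any particular store implements it; `run_with_budget`, `QueryTimeout`, and `runaway_query` are hypothetical names, and the check is cooperative (it only runs between result rows).

```python
import itertools
import time


class QueryTimeout(Exception):
    """Raised when a query exceeds its time budget."""


def run_with_budget(solutions, budget_seconds, clock=time.monotonic):
    """Collect rows from an iterator of solutions, aborting once the budget is spent.

    The check only happens between rows, so a single row that takes very
    long to produce is not interrupted by this sketch.
    """
    deadline = clock() + budget_seconds
    rows = []
    for row in solutions:
        if clock() > deadline:
            raise QueryTimeout(f"query killed after {len(rows)} rows")
        rows.append(row)
    return rows


def runaway_query():
    """Hypothetical stand-in for a worst-case query that never finishes."""
    return itertools.count()
```

Calling `run_with_budget(runaway_query(), 0.1)` raises `QueryTimeout` instead of looping forever, while a small finite result set is returned unchanged. A real server would more likely interrupt the evaluation itself rather than poll between rows.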

------
pjmlp
It is always nice to see Prolog in the news. SWI was the chosen
implementation at my university back in the mid-90s.

------
moubarak
SPARQL, which is the standard query language for RDF, is based on Prolog. So
why go back to Prolog?

~~~
kendallgclark
SPARQL is similar to non-recursive Datalog^not, but that's a subset of full
Prolog. SPARQL is a query language, not a full-on programming language as
Prolog is.

~~~
samuell
True. Most notably I badly miss the ability to encapsulate queries in named
"functions". That is one of the things I really like in Prolog, since it
lets you quickly raise your level of abstraction, by building up a
"language" of facts and rules.
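
A loose Python analogy of that layering, assuming a made-up in-memory triple list and invented names (`match`, `spectra_of`, `peaks_of`, `molecules_with_peak`). Prolog rules are more general than this, since they can be run with any argument left unbound.

```python
# Hypothetical toy triples; predicates and identifiers are made up.
TRIPLES = [
    ("mol1", "hasSpectrum", "spec1"),
    ("mol2", "hasSpectrum", "spec2"),
    ("spec1", "hasPeak", 17),
    ("spec1", "hasPeak", 42),
    ("spec2", "hasPeak", 42),
]


def match(s=None, p=None, o=None):
    """Yield triples matching the pattern; None acts as a wildcard."""
    for ts, tp, to in TRIPLES:
        if (s is None or ts == s) and (p is None or tp == p) and (o is None or to == o):
            yield ts, tp, to


def spectra_of(molecule):
    """Named low-level query: spectra attached to a molecule."""
    return [o for _, _, o in match(s=molecule, p="hasSpectrum")]


def peaks_of(spectrum):
    """Named low-level query: peak values of a spectrum."""
    return [o for _, _, o in match(s=spectrum, p="hasPeak")]


def molecules_with_peak(value):
    """Higher-level query built from the named ones above."""
    return sorted(
        s for s, _, _ in match(p="hasSpectrum")
        if any(value in peaks_of(spec) for spec in spectra_of(s))
    )
```

Here `molecules_with_peak(42)` is expressed entirely in terms of the smaller named queries, which is the kind of vocabulary-building that is awkward to do in plain SPARQL.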

If anyone knows a way to do something similar in SPARQL, I'm highly interested
to know.

~~~
kendallgclark
You can do this with user-defined rules in some RDF databases. Stardog(.com)
supports it nicely.

~~~
samuell
Interesting, thanks for the hint!

------
fithisux
Is it possible to port it to Mercury?

