
The Essence of Datalog (2018) - bibyte
https://dodisturb.me/posts/2018-12-25-The-Essence-of-Datalog.html
======
jgrodziski
There is a very nice Datalog tutorial at
[http://www.learndatalogtoday.org/](http://www.learndatalogtoday.org/). It
leans a little toward Datomic, but the examples are very nice nevertheless.

~~~
tosh
Loved going through this a few years ago. Eye-opening.

------
muydeemer
As I see people mentioning Datalog-related software, I'm gonna throw a pebble
into the yard as well - Graql
([https://github.com/graknlabs/grakn](https://github.com/graknlabs/grakn)) is
a declarative graph-based query language with rule-based reasoning heavily
inspired by Datalog and Logic Programming.

Disclaimer: I am one of the maintainers.

~~~
muydeemer
Just to add and clarify: it supports recursion (since a couple releases back)
and pattern negation based on set difference (on current master branch, soon
to be released).
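
To illustrate what negation as set difference can look like, here is a minimal
Python sketch. It is not Graql syntax and not how Grakn implements it; the
`edges` data and all names are made up for the example. The idea is just:
answers to the positive pattern, minus the answers that also match the negated
pattern.

```python
# Hypothetical toy data: a tiny directed graph.
edges = {("a", "b"), ("b", "c")}
nodes = {n for edge in edges for n in edge}

# Positive pattern: every pair (X, Y) of nodes.
candidates = {(x, y) for x in nodes for y in nodes}

# Negated pattern: edge(X, Y). Negation is the set difference.
not_linked = candidates - edges
```

Here `("a", "c")` ends up in `not_linked`, while `("a", "b")` does not.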

------
yazr
> recently become popular among the scalable program analysis crowd.

Can someone give some examples? I have read papers from the University of
Athens that use Datalog for pointer analysis. Has this become a major direction?

~~~
rntz
Yes, the DOOP folks under Iannis Smaragdakis have been doing this for a while.
They use souffle, I believe: [https://souffle-
lang.github.io/](https://souffle-lang.github.io/).

This is Semmle's business model ([https://semmle.com/](https://semmle.com/)).
They have a custom Datalog with OO features they call QL.

I've heard LogicBlox used to do program analysis as well, but AIUI they turned
more towards business analytics and then got bought out, and the Datalog
engine team all jumped ship.

The Rust compiler team has been exploring implementing part of the type system
(the borrow checker, I think?) in a Datalog.

On the academic front, Tamás Szabó and Sebastian Erdweg have been working on
an incremental Datalog engine for static analysis;
[https://github.com/szabta89/IncA](https://github.com/szabta89/IncA). I know
Tamás has worked for Itemis, but I don't know whether they're using this tech.

I'm working on a functional language, Datafun, inspired by Datalog, but it's
too much of a toy to do any real analysis with at the moment. All in all,
Datalog is still a niche language and static analysis a niche application of
it, but it seems to be on the rise.

~~~
frankmcsherry
Moar links!

(caveat: I am involved in many things)

1\. You can implement some minor flavor of Doop in differential dataflow
([https://github.com/TimelyDataflow/differential-
dataflow/tree...](https://github.com/TimelyDataflow/differential-
dataflow/tree/master/doop)). The fragment came from Yannis which he described
as a minimal non-trivial analysis (vs other, even smaller fragments), but it
still uses some 900 dataflow operators (updates in 10s of ms when inputs
change, though). The DD impl isn't meant to be understood, I'm afraid.

2\. Many ex-LB folks are now at relational.ai doing .. advanced weird tech
like they did with LB. I'd watch them (I do).

3\. Polonius ([https://github.com/rust-lang/polonius](https://github.com/rust-
lang/polonius)) defines the Rust borrow checker as Datalog rules, and went
from differential dataflow to datafrog ([https://github.com/rust-
lang/datafrog](https://github.com/rust-lang/datafrog)).

4\. If you want to check out other incremental Datalog environments, in
addition to IncA, there are

[https://github.com/comnik/declarative-
dataflow](https://github.com/comnik/declarative-dataflow)

[https://github.com/ryzhyk/differential-
datalog](https://github.com/ryzhyk/differential-datalog)

[https://github.com/TimelyDataflow/differential-
dataflow](https://github.com/TimelyDataflow/differential-dataflow)

Good times for Datalog, imo.

------
kendallgclark
Datalog is awesome as is Logic Programming generally. It's not only useful in
static program analysis but also in areas as diverse as synthetic biology and
data integration.

There's lots of datalog and similar technology in most Knowledge Graph
platforms, too. For example, [http://stardog.com/](http://stardog.com/).

~~~
trurl
I don't see Datalog mentioned anywhere in relation to Stardog. I see OWL2 and
SPARQL, which as far as I can tell do not support recursion other than in
experimental prototypes. If there is no recursion, it is just conjunctive
queries and not Datalog.

~~~
mmarx
> If there is no recursion, it is just conjunctive queries and not datalog.

SPARQL has negation (FILTER NOT EXISTS), though, which makes query answering
PSpace-hard, whereas conjunctive query answering is NP-complete. So the two
are very much not the same, although SPARQL is still less expressive than
Datalog (which is ExpTime-complete). CQ answering for the Horn fragment of
OWL2 is already 2ExpTime-complete, however, making it vastly more expressive
than pure Datalog.

------
tosh
Datalog is also used as the query language for Datomic

[https://www.datomic.com/](https://www.datomic.com/)

[https://youtube.com/watch?v=Cym4TZwTCNU](https://youtube.com/watch?v=Cym4TZwTCNU)

~~~
tannhaeuser
No, it's not. The query language of Datomic, and the one taught on
learndatalogtoday.org, is a Datomic-proprietary language similar to
Prolog-in-LISP implementations or frame-based expert systems of old.

~~~
souenzzo
There is DataScript, which is a FOSS implementation of Datomic-style Datalog
(with nice performance):

[https://github.com/tonsky/datascript](https://github.com/tonsky/datascript)

And this (deprecated) project, which translates Datalog to SQL(ite):
[https://github.com/mozilla/mentat](https://github.com/mozilla/mentat)

------
eralps
There is also the Flix language, inspired by Datalog, which adds monotone
functions and lattice models. This post is quite a coincidence for me,
since I have just started to work on Flix and Soufflé for my thesis.

------
BoiledCabbage
So for someone who is familiar with it, why hasn't Datalog supplanted SQL for
relational querying? What's its limitation?

~~~
bowyakka
On the continuum from relational algebra to logic programming, Datalog is
somewhere in the middle.

It is better than SQL at handling recursive queries and logical inference. On
the flip side, SQL can be better at more set-style operations (for instance,
window functions are painful in Datalog).

It is different tooling. Essentially, it comes down to whether you want to
extract answers from your data by varying the relational algebra while holding
the logic the same (SQL), or by keeping the relational algebra the same while
varying the logic (Datalog).
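
As a rough illustration of the recursive-query point, here is a small Python
sketch of naive bottom-up evaluation for the classic ancestor rules. The
`parent` facts and the function name are invented for the example, and real
engines use smarter strategies (semi-naive evaluation, indexing), so treat it
as a sketch of the idea only.

```python
# Rules being evaluated (Datalog syntax, shown for reference):
#   ancestor(X, Y) :- parent(X, Y).
#   ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).

def ancestors(parent):
    """Naive bottom-up evaluation: apply the rules until nothing new appears."""
    ancestor = set(parent)          # first rule: every parent fact is an ancestor fact
    while True:
        derived = {
            (x, z)
            for (x, y) in parent
            for (y2, z) in ancestor
            if y == y2              # join on the shared variable Y
        }
        if derived <= ancestor:     # fixpoint: the second rule adds nothing new
            return ancestor
        ancestor |= derived

# Hypothetical facts.
parent = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}
result = ancestors(parent)
```

The two rules are the whole "query"; the loop is what the engine does for you.
In SQL the same thing needs a recursive common table expression.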

~~~
aargh_aargh
Could you please explain what you mean by "varying relational algebra/logic"?
Didn't understand that at all. I only know SQL.

------
zengid
How does Datalog differ from something like MiniKanren?

~~~
tobmlt
My take on miniKanren is that its implementation is designed to be small and
hackable, especially so with microKanren. If you don't know Scheme/Lisp,
but want to roll your own for educational purposes, then check this out:
[https://codon.com/hello-declarative-world](https://codon.com/hello-
declarative-world)

Versus Prolog, the emphasis in miniKanren is on constraint programming,
especially writing new constraints to extend it to more problems, whereas (the
chief variants of) Prolog have been optimized in various ways for certain types
of problems.

See Will Byrd's answer here:
[https://stackoverflow.com/questions/28467011/what-are-the-
ma...](https://stackoverflow.com/questions/28467011/what-are-the-main-
technical-differences-between-prolog-and-minikanren-with-resp)

What I read about Datalog is that it is basically a recursive relational
database. It's set up for "deductive queries". This wording reminds me of
miniKanren --- but the Kanrens are Turing complete.

Ah, Wikipedia has this to say:

In contrast to Prolog, Datalog,

1.) disallows complex terms as arguments of predicates, e.g., p(1, 2) is
admissible but not p(f(1), 2),

2.) imposes certain stratification restrictions on the use of negation and
recursion,

3.) requires that every variable that appears in the head of a clause also
appears in a nonarithmetic positive (i.e. not negated) literal in the body of
the clause,

4.) requires that every variable appearing in a negative literal in the body
of a clause also appears in some positive literal in the body of the clause.
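
Restrictions 3 and 4 are easy to check mechanically. Below is a minimal Python
sketch of such a check; the `Literal` type, the convention that variables are
capitalised strings, and the example rules are all my own assumptions, and it
ignores arithmetic literals entirely.

```python
from typing import NamedTuple

class Literal(NamedTuple):
    predicate: str
    args: tuple            # capitalised strings are variables, anything else a constant
    negated: bool = False

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def variables(literal):
    return {t for t in literal.args if is_var(t)}

def is_safe(head, body):
    """Check restrictions 3 and 4 for one rule `head :- body`."""
    positive = set()
    for lit in body:
        if not lit.negated:
            positive |= variables(lit)
    # 3.) every head variable appears in a positive body literal
    head_ok = variables(head) <= positive
    # 4.) every variable of a negated literal appears in a positive body literal
    negated_ok = all(variables(lit) <= positive for lit in body if lit.negated)
    return head_ok and negated_ok

# bachelor(X) :- male(X), not married(X).
safe_rule = is_safe(
    Literal("bachelor", ("X",)),
    [Literal("male", ("X",)), Literal("married", ("X",), negated=True)],
)

# p(X, Y) :- q(X).   (Y is not bound by any positive body literal)
unsafe_head = is_safe(Literal("p", ("X", "Y")), [Literal("q", ("X",))])

# p(X) :- q(X), not r(X, Y).   (Y appears only under negation)
unsafe_negation = is_safe(
    Literal("p", ("X",)),
    [Literal("q", ("X",)), Literal("r", ("X", "Y"), negated=True)],
)
```

Only the first rule passes; the other two each break one of the restrictions.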

I'd have to put some thought into checking this list against the capabilities
of miniKanren. Generally it doesn't ring any bells.

------
mathetic
OP here. Happy to answer any questions.

