
Mentat: A persistent, relational store inspired by Datomic and DataScript - ivank
https://github.com/mozilla/mentat
======
pbowyer
Key quotes:

"DataScript asks the question: "What if creating a database would be as cheap
as creating a Hashmap?"

Mentat is not interested in that. Instead, it's strongly interested in
persistence and performance, with very little interest in immutable
databases/databases as values or throwaway use."

and:

"Datomic has a beautiful conceptual model. [...] Many of these design
decisions are inapplicable to deployed desktop software; indeed, the use of
multiple JVM processes makes Datomic's use in a small desktop app, or a mobile
device, prohibitive.

Mentat is designed for embedding, initially in an Electron app (Tofino). It is
less concerned with exposing consistent database states outside transaction
boundaries, because that's less important here, and dropping some of these
requirements allows us to leverage SQLite itself."

~~~
ah-
Can you elaborate a bit on what this actually means?

How would I use it? What does a query look like?

~~~
pbowyer
> Can you elaborate a bit on what this actually means?

Sure - I'm highlighting the differences from DataScript/Datomic. My take is
the 'inspiration' from each (especially DataScript) is quite loose.

~~~
holygoat
Yes. Project Mentat fits into a conceptual lineage that includes Freebase's
graphd and 2005-onward Semantic Web stores. We have aimed for compatibility
with Datomic and DataScript for least surprise, but if you squint there's a
little AllegroGraph in the direction.

Datomic's model (both architectural and conceptual) draws from Clojure's
concepts of persistence. That model isn't free, so Mentat deviates from it
where it makes sense to do so: at present we don't implement querying of
history or past states, for example, and when we do it won't be free.

We'll get closer to full Datomic-style datom store capabilities over time, but
we'll make different performance tradeoffs.

~~~
pbowyer
Thanks for the explanation. For me the key takeaway is:

> at present we don't implement querying of history or past states, for
> example, and when we do it won't be free.

That's the bit of Datomic that intrigues me and gives the best use-case (I
don't have to add data versioning and history in my app-layer) and what I'm
looking for in other systems.

~~~
holygoat
Note that we do store the full transaction log, just like Datomic, and Mentat
will allow querying of it (and replication, and replay, and…). We haven't
implemented history querying yet because we haven't needed it for application
code.

The trick with Datomic is that _every time_ you grab a `db` instance, it's a
snapshot, and the system's index chunking and storage replication are
necessarily built around the ability to continue using those older index
chunks, potentially for a very long time.

Most consumers, most of the time, just want to query the store as it stands at
that moment, but Datomic peers pay the space and time penalty of keeping and
retrieving historical index chunks in order to answer those historical
queries.

My current thoughts are:

1. To allow for short-term snapshot querying through something like
`db.keep()`, implemented via a SQLite read transaction. That's not free: the
database WAL will continue to grow until the read transaction is ended, so it
isn't ideal for all workloads, but it'll do.

For some queries it's enough to simply track a last-seen tx value and filter
everywhere, but that becomes difficult when cardinality-one and unique-
identity properties are considered.

2. The obvious equivalent to Datomic's 'with' is an uncommitted write
transaction. Naturally this blocks other writers while it exists, and so
alternative implementations (e.g., writing to a complete disk copy of the
database, or writing a 'delta' table) might make sense.

At some very hazy point in the future we might try to get SQLite support for
this: after all, if we can guarantee that a write transaction won't be
committed, we could use a separate WAL file for the `with` and avoid blocking
other writers.

3. A longer-term approach to snapshots/DB-as-value is to materialize the
datoms at the specified instant in time, either in a temporary table or in a
real persisted table. That is: `db.keep_forever()` will give you a new
structure to query, and calling code will be responsible for cleaning up that
space.
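
A minimal sketch of the third option, assuming a hypothetical `tx_log` table with one row per assertion or retraction. The table layout, the attribute names, and the `materialize_as_of` helper are illustrative assumptions, not Mentat's schema or API. It rebuilds the datoms that were live at a given transaction into a temporary table that the caller is responsible for dropping:

```python
import sqlite3


def build_log(conn):
    # Hypothetical log layout: added=1 is an assertion, added=0 a retraction.
    conn.execute(
        "CREATE TABLE tx_log (e INTEGER, a TEXT, v TEXT, tx INTEGER, added INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tx_log VALUES (?, ?, ?, ?, ?)",
        [
            (1, "person/name", "Alice", 1, 1),
            (1, "person/email", "a@old.example", 1, 1),
            # Transaction 2 replaces the cardinality-one email value.
            (1, "person/email", "a@old.example", 2, 0),
            (1, "person/email", "a@new.example", 2, 1),
            (2, "person/name", "Bob", 3, 1),
        ],
    )


def materialize_as_of(conn, tx):
    """Rebuild the datoms that were live just after transaction `tx`."""
    conn.execute("DROP TABLE IF EXISTS temp.datoms_as_of")
    conn.execute("CREATE TEMP TABLE datoms_as_of (e INTEGER, a TEXT, v TEXT)")
    conn.execute(
        """
        INSERT INTO datoms_as_of
        SELECT e, a, v
        FROM tx_log
        WHERE tx <= ?
        GROUP BY e, a, v
        -- live means asserted more often than retracted up to `tx`
        HAVING SUM(CASE WHEN added THEN 1 ELSE -1 END) > 0
        """,
        (tx,),
    )


conn = sqlite3.connect(":memory:")
build_log(conn)
materialize_as_of(conn, 2)
print(conn.execute("SELECT e, a, v FROM datoms_as_of ORDER BY e, a").fetchall())
```

Filtering on `tx <= ?` alone would return both email values for entity 1 as of transaction 2. Netting assertions against retractions is what keeps the cardinality-one attribute down to a single value, and that reconstruction work is only paid when a snapshot is requested.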

The reason I say "won't be free" is that each of these operations imposes a
cost _when the feature is used_ : either SQLite or Mentat will have to do some
work to allow an extended period of isolation, to reconstruct some state, or
to persist some state.
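
Options 1 and 2 can be tried against plain SQLite. This sketch uses Python's standard `sqlite3` module with a WAL-mode database file. The `datoms` table and the helper names are assumptions for illustration, and `db.keep()` itself is only a proposal in this comment, not an existing Mentat call.

```python
import os
import sqlite3
import tempfile


def connect(path):
    # isolation_level=None means autocommit, so the explicit BEGIN / COMMIT /
    # ROLLBACK statements below are the only transaction control.
    # timeout=0 makes a blocked writer fail immediately instead of waiting.
    conn = sqlite3.connect(path, isolation_level=None, timeout=0)
    conn.execute("PRAGMA journal_mode=WAL").fetchall()
    return conn


def count(conn):
    return conn.execute("SELECT COUNT(*) FROM datoms").fetchall()[0][0]


path = os.path.join(tempfile.mkdtemp(), "store.db")
writer, reader = connect(path), connect(path)
writer.execute("CREATE TABLE datoms (e INTEGER, a TEXT, v TEXT)")
writer.execute("INSERT INTO datoms VALUES (1, 'person/name', 'Alice')")

# Option 1: a read transaction pins a snapshot at its first read.
reader.execute("BEGIN")
before = count(reader)
writer.execute("INSERT INTO datoms VALUES (2, 'person/name', 'Bob')")
during = count(reader)    # still the pinned snapshot
reader.execute("COMMIT")  # ending the read lets the WAL be checkpointed again
after = count(reader)

# Option 2: an uncommitted write transaction as a stand-in for `with`.
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO datoms VALUES (3, 'person/name', 'Carol')")
speculative = count(writer)  # the speculative row is visible to this connection
try:
    reader.execute("BEGIN IMMEDIATE")  # a second writer cannot start
    reader.execute("ROLLBACK")
    blocked = False
except sqlite3.OperationalError:
    blocked = True
writer.execute("ROLLBACK")  # discard the speculative write
final = count(writer)

print(before, during, after, speculative, blocked, final)
writer.close()
reader.close()
```

Both costs described above show up here: the WAL cannot be fully checkpointed while the pinned read is open, and the second connection cannot begin a write until the speculative transaction is rolled back.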

That's in contrast to Datomic, which imposes some overhead every time index
chunks are built or retrieved. It's also an interesting parallel to Clojure vs
Rust: Clojure's data structures are persistent by default, giving you
snapshots and safety at a cost everyone pays; Rust believes that you shouldn't
pay for abstractions you don't use.

------
rads
When I read the title I was wondering if this was a new project. It's actually
a continuation of the Datomish project. I'm glad they changed the name,
though.

------
rektide
Can't find any information whatsoever on usage. Not even sure if this is meant
to be externally accessed or whether it is purely for embedding (in other Rust
code?).

~~~
steveklabnik
> To start the server use:
>
> cargo run serve

So, not embedding.

[https://github.com/mozilla/mentat/blob/rust/tests/external_t...](https://github.com/mozilla/mentat/blob/rust/tests/external_test.rs)
looks like some sample usage.

~~~
SomeCallMeTim
"Mentat is designed for embedding, initially in an Electron app (Tofino)."

It looks like it works both ways.

~~~
steveklabnik
Neat!

------
DanCarvajal
Hehe, Dune reference. I approve.

------
lvh
This looked really similar to another Mozilla project called Datomish. The
project was recently renamed to avoid confusion with Datomic; but apparently
that _also_ involved a rewrite into Rust. Details here:
[https://github.com/mozilla/mentat/issues/133](https://github.com/mozilla/mentat/issues/133)

Unfortunately, this means that the path for using this in non-Node JavaScript
environments (i.e. browsers) is unknown.

~~~
holygoat
This is the renamed version of Datomish.

The original implementation was in ClojureScript. This reimplementation is in
Rust, and is intended to work anywhere you can run Rust code: inside Node,
inside Firefox, and in standalone applications.

We expect a WebExtensions API to wrap this inside Firefoxes at some point, but
right now we're focused on the core (re)implementation.

------
samuell
Awesome. Been looking hard for open source datalog-supporting data stores and
data processing systems for a couple of years now. This is at least something
in this direction, although I might ultimately wish for a full-fledged
database or system that could run in a distributed fashion if needed.

~~~
espeed
See also...

RDFox: "A highly scalable in-memory RDF triple store that supports shared
memory parallel datalog reasoning. It is a cross-platform software written in
C++ that comes with a Java wrapper allowing for an easy integration with any
Java-based solution."

[https://www.cs.ox.ac.uk/isg/tools/RDFox/](https://www.cs.ox.ac.uk/isg/tools/RDFox/)

Dedalus: Datalog in Time and Space, by Peter Alvaro out of UC Berkeley (note
the Strange Loop talk):

[https://disorderlylabs.github.io/](https://disorderlylabs.github.io/)

Datalog -> Gremlin: It shouldn't be too hard to implement Datalog on top of
the Gremlin Graph Virtual Machine so that Datalog compiles down to Gremlin
bytecode -- SPARQL and SQL implementations already exist -- and running
Datalog on the GVM would allow you to run Datalog on any datastore Apache
Tinkerpop supports (all the graph DBs, HBase, Cassandra...):

Graph Computing with Apache TinkerPop, by Marko Rodriguez (the creator of
Gremlin)
[https://www.youtube.com/watch?v=tLR-I53Gl9g](https://www.youtube.com/watch?v=tLR-I53Gl9g)

A Gremlin Implementation of the Gremlin Traversal Machine
[http://www.datastax.com/dev/blog/a-gremlin-implementation-of...](http://www.datastax.com/dev/blog/a-gremlin-implementation-of-the-gremlin-traversal-machine)

~~~
samuell
That Peter Alvaro talk is one of my top three favourite talks, if not my number one :)

------
general_ai
> designed for embedding

Yet implemented in Rust. Why? If you want adoption, the best way to design
something "for embedding" is to write it in C.

~~~
ianlevesque
Rust embeds almost as well as C.

~~~
general_ai
Be that as it may, it's a relatively obscure and quickly changing language,
that _ends up calling into C_ anyway.

~~~
steveklabnik
Rust doesn't change in a backwards incompatible way.

> that _ends up calling into C_ anyway.

What specifically do you mean here?

~~~
general_ai
Says there it uses sqlite.

~~~
jdub
Datapoint: GNOME's Federico Mena Quintero is working on a file-by-file port of
librsvg to Rust. Compiled Rust objects are linked with compiled C objects.
Both Rust and C call functions in librsvg's C dependencies, such as Cairo.
None of this is weird; it's _precisely_ what Rust was designed for.

