
Crux as General-Purpose Database - tosh
https://jorin.me/crux-as-general-purpose-database/
======
jandrewrogers
Bitemporality is an underrated and rarely discussed database concept. It is
similar to the idea of "reproducible builds" but in a data model context. It
can be indispensable for some applications.

Efficient implementations of bitemporality require first-class design support
in the underlying storage engine. You really don't want to bolt it on top of
an existing storage engine that was not purpose-built for bitemporality if
performance matters. Write throughput in particular tends to be terrible
without a fair bit of clever engineering.

A number of database engines implement limited support for bitemporality
internally and expose it only through narrow features that take advantage of
it. They don't offer it as a general-purpose facility because of the
engineering implications of supporting it in the general case.
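
To make the "reproducible builds" analogy concrete, here is a minimal in-memory sketch of a bitemporal lookup. It is an illustration only and not how any particular engine stores data: the `BitemporalStore` class, its `put` and `as_of` methods, and the use of plain integers for both time axes are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class Version:
    entity_id: str
    valid_time: int  # when the fact became true in the modelled world
    tx_time: int     # when the store recorded the fact
    value: Any


class BitemporalStore:
    """Hypothetical append-only store; nothing is ever overwritten."""

    def __init__(self) -> None:
        self._versions: List[Version] = []
        self._clock = 0  # stand-in for a monotonic transaction clock

    def put(self, entity_id: str, value: Any, valid_time: int) -> int:
        """Record a version and return the transaction time assigned to it."""
        self._clock += 1
        self._versions.append(Version(entity_id, valid_time, self._clock, value))
        return self._clock

    def as_of(self, entity_id: str, valid_time: int,
              tx_time: Optional[int] = None) -> Any:
        """Value for valid_time, as the store knew it at tx_time (default: now)."""
        if tx_time is None:
            tx_time = self._clock
        candidates = [
            v for v in self._versions
            if v.entity_id == entity_id
            and v.valid_time <= valid_time
            and v.tx_time <= tx_time
        ]
        if not candidates:
            return None
        # Latest valid time wins; among equals, the latest recorded correction wins.
        return max(candidates, key=lambda v: (v.valid_time, v.tx_time)).value
```

With this sketch, a late-arriving correction changes what the store answers today, while a query pinned to an earlier transaction time keeps returning what was known back then. The linear scan in `as_of` is also a hint at why a naive layer on top of an ordinary storage engine performs poorly.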

~~~
refset
This is a good summary!

I think the biggest reason why bitemporality hasn't seen widespread adoption
is that it's a hard problem, both for DBMS implementers to solve and for users
to adopt, when constrained to a world of SQL and tables.

Crux not only has first-class support for bitemporality in the core engine but
saves the user from having to worry about how and when bitemporality impacts
the schema. This is because everything in the database gets a bitemporal
history by default, without the user needing to make upfront design
decisions.

Point-in-time Datalog queries traverse the entity graph using a very simple
schema-on-read behaviour and this can serve as a foundation for more complex
relational modelling and constraint enforcement.

------
refset
Hello! I am the product manager for Crux. I will try to answer questions when
I get back online in a few hours.

Something not mentioned in the post is that there is a Java API in addition to
HTTP and Clojure.

The Beta programme is commencing very early in 2020 - please contact us in the
meantime if you are interested to hear more: crux@juxt.pro

~~~
fulafel
How does the document-db nature come up in practice? For example, are
transactions single-document in Crux? I'm trying to understand the difference
in semantics and data model vs Datomic.

~~~
refset
Documents are best thought of as the unit of ingestion and history. They do
not have strong implications for queries aside from having a 1:1 mapping with
entities.

Each transaction may contain multiple operations, and each operation
typically relates to only one document. A document represents a single
version of an entity and is decomposed during ingestion into an arbitrary set
of attribute-value datoms that get updated atomically and fully replace all
the previous attribute-value associations of the given entity.
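
As a rough illustration of that replace-on-ingest behaviour, here is a small sketch. It is not Crux's actual code or API: the `document_to_datoms` and `apply_document` helpers, the `"id"` key, and the dict used as an index are all hypothetical names chosen for the example.

```python
from typing import Any, Dict, List, Tuple

Datom = Tuple[str, str, Any]  # (entity, attribute, value)


def document_to_datoms(doc: Dict[str, Any], id_key: str = "id") -> List[Datom]:
    """Decompose one document into a datom per non-id attribute."""
    entity = doc[id_key]
    return [(entity, attr, value) for attr, value in doc.items() if attr != id_key]


def apply_document(index: Dict[str, List[Datom]], doc: Dict[str, Any],
                   id_key: str = "id") -> None:
    """Swap in the new version's datoms for the entity in a single assignment.

    Attributes missing from the new document disappear from the current
    view; they are not merged with the previous version.
    """
    index[doc[id_key]] = document_to_datoms(doc, id_key)
```

Submitting `{"id": "p1", "name": "Ada", "city": "London"}` and then `{"id": "p1", "name": "Ada L."}` leaves only the `name` datom for `p1` in the current view, because the second document fully replaces the first rather than patching it.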

For more on the semantic differences with datoms see the FAQ:
[https://www.opencrux.com/docs#faq-comparisons](https://www.opencrux.com/docs#faq-comparisons)

------
sundbry
What's the indexing/performance like? When I select an ID from a collection is
it going to reprocess the entire history to filter on a matching ID?

~~~
refset
> What's the indexing/performance like?

A key goal has been to avoid ingestion bottlenecks that are otherwise typical
when indexing late-arriving temporal data. This is partly achieved by having a
very simple EAV index structure over fast local KV stores like RocksDB and
LMDB (as opposed to a complex distributed storage architecture with vastly
more moving pieces).

As for query performance, the indexes have been designed so that graph
traversals are efficient regardless of how much transaction time or valid time
history is stored for the entire database of entities. One trick to this is
the use of "Morton space filling curves":
[https://github.com/juxt/crux/blob/master/crux-core/src/crux/...](https://github.com/juxt/crux/blob/master/crux-core/src/crux/morton.clj)
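
For readers unfamiliar with the technique, here is a generic sketch of Morton (Z-order) encoding. It does not mirror the linked `morton.clj`. It assumes the two dimensions are valid time and transaction time represented as non-negative integers, and the function names and the 32-bit default width are made up for the example.

```python
from typing import Tuple


def interleave_bits(x: int, y: int, bits: int = 32) -> int:
    """Build a Morton code: bits of x go to even positions, bits of y to odd ones."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def deinterleave_bits(z: int, bits: int = 32) -> Tuple[int, int]:
    """Recover the two coordinates from a Morton code."""
    x = y = 0
    for i in range(bits):
        x |= ((z >> (2 * i)) & 1) << i
        y |= ((z >> (2 * i + 1)) & 1) << i
    return x, y
```

The appeal for a sorted KV store is that a single integer key keeps points that are close in both dimensions reasonably close in key order, so a lookup bounded on two time axes can be served by scanning a limited set of key ranges.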

This is a great talk about the design and internals from ClojuTRE 2019 if
you're curious:
[https://www.youtube.com/watch?v=YjAVsvYGbuU](https://www.youtube.com/watch?v=YjAVsvYGbuU)

