
Thinking in Datomic - pelle
http://pelle.github.com/Datomic/2012/07/08/thinking-in-datomic/
======
mtrn
The temporal aspect seems to be somewhat ignored by mainstream DB development
- in some fields, e.g. BI, you'll have slowly changing dimensions; and the
event sourcing pattern promises a time-machine view of your data.

Since I'm writing a history-aware application at the moment, I recently looked
into different patterns for this and am trying a mixed strategy right now
(a SQL DB used as a document store plus a single event log that accumulates
changes - a lean approach, though: a few hundred lines of Python for the data
access layer; what always gets ugly is the validation, which your application
must take care of).
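A minimal sketch of that mixed strategy, assuming SQLite and hypothetical table and column names (none of these come from the actual project): one table holds the current documents, and a second, append-only table accumulates every change as the event log.

```python
import json
import sqlite3
import time

# Hypothetical schema: a documents table for the latest state, plus an
# append-only event log that accumulates every change.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT);
    CREATE TABLE events (
        seq    INTEGER PRIMARY KEY AUTOINCREMENT,
        doc_id TEXT NOT NULL,
        ts     REAL NOT NULL,
        change TEXT NOT NULL   -- JSON snapshot of the change
    );
""")

def save(doc_id, body):
    """Upsert the current document and append the change to the log."""
    conn.execute("INSERT OR REPLACE INTO documents VALUES (?, ?)",
                 (doc_id, json.dumps(body)))
    conn.execute("INSERT INTO events (doc_id, ts, change) VALUES (?, ?, ?)",
                 (doc_id, time.time(), json.dumps(body)))
    conn.commit()

def history(doc_id):
    """All recorded versions of a document, oldest first."""
    rows = conn.execute(
        "SELECT ts, change FROM events WHERE doc_id = ? ORDER BY seq",
        (doc_id,))
    return [(ts, json.loads(change)) for ts, change in rows]

save("person/1", {"name": "Ada", "city": "London"})
save("person/1", {"name": "Ada", "city": "Paris"})
print(len(history("person/1")))  # 2 versions in the log
```

As noted above, validation is the part this design does not solve: nothing in the schema prevents an invalid document from being saved, so the application layer has to enforce it.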

I wish there were more hands-on material on the subject (some resources dive
deep into bi-temporal modeling, but I feel your schemas can get complex (=
expensive) very fast).

~~~
mtrimpe
Big Data by Nathan Marz is a Manning Early Access book (i.e. not fully written
yet) that explains how to hand-roll your own custom Datomic-style DB in
Hadoop/HBase.

It might be of interest to you given your current project.

~~~
mtrn
Thanks for the tip, I will take a look at it - probably after this project,
though, since the current app _must_ be lean (no big dependencies: just
install a database schema, pip install, and go).

------
jerf
There have been numerous triple stores. Mozilla/Firefox even used one as its
core backend for a while before ripping it out and replacing it with SQLite.
Why do you think those failed, and how is Datomic going to avoid the same
failures?

That is, I've seen this pitched as the solution to all our data woes numerous
times now, why is this time different?

~~~
mtrimpe
It's not a triple store. It's closer to an OODBMS focused on persisting large
object graphs with an immutable, append-only data structure, thus maintaining
full history and staying quite scalable in a single-writer, many-reader
configuration.

~~~
pelle
It is actually implemented more like a triple store than an OODBMS, just with
time added as a fourth component.

Their Entity class makes it feel a bit like an OODBMS, though, in that it kind
of acts like a hash map.
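A toy sketch of that idea, with illustrative attribute names and transaction ids (real Datomic datoms also carry an added/retracted flag, which is omitted here): each fact is an (entity, attribute, value, transaction) tuple, and an "as of" view of an entity is just the latest value per attribute up to a given transaction - which is what makes the Entity class feel like a hash map.

```python
# A datom is a fact: (entity, attribute, value, transaction).
datoms = [
    (1, "person/name", "Ada",    100),
    (1, "person/city", "London", 100),
    (1, "person/city", "Paris",  200),  # later fact supersedes, old one is kept
]

def entity_as_of(db, eid, tx):
    """Hash-map-like view of an entity as of transaction `tx`."""
    view = {}
    for e, a, v, t in sorted(db, key=lambda d: d[3]):
        if e == eid and t <= tx:
            view[a] = v   # later transactions overwrite earlier values
    return view

print(entity_as_of(datoms, 1, 100))  # {'person/name': 'Ada', 'person/city': 'London'}
print(entity_as_of(datoms, 1, 200))  # {'person/name': 'Ada', 'person/city': 'Paris'}
```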

~~~
mtrimpe
Could you explain how it's more like a triple store with a time parameter?

My previous explanation already papered over a ton of fundamental differences
in order to make a somewhat understandable generalization, but I can't make
the jump to triple-store from my understanding of Datomic.

I basically see it as "just a single-writer many-reader distributed fully
persistent data-structure."
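To illustrate what "fully persistent" means here, a toy copy-on-write map in Python (real implementations use trees with structural sharing rather than full copies, so this is an assumption-laden sketch, not how Datomic is built): every write returns a new version while old versions stay readable, which is the property that gives you history for free.

```python
class PersistentMap:
    """Toy fully persistent map: writes return a new version,
    old versions remain valid and readable forever."""

    def __init__(self, entries=None):
        self._entries = entries or {}

    def assoc(self, key, value):
        # Copy-on-write: the previous version is untouched.
        new = dict(self._entries)
        new[key] = value
        return PersistentMap(new)

    def get(self, key):
        return self._entries.get(key)

v1 = PersistentMap().assoc("city", "London")
v2 = v1.assoc("city", "Paris")
print(v1.get("city"), v2.get("city"))  # London Paris
```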

~~~
jerf
I'm sort of sorry to turn this around on you, because it's generally poor
form, but given that the linked article describes Datomic exclusively in terms
of timestamped triples, can you explain how it's _not_ a triple store?

If it's because you store objects as many triples with the same subject and
different predicates... err, well, yes, that's how you store objects in a
triple store. Not remotely new, and still subject to my initial question.

~~~
mtrimpe
The article leads people to confuse the chosen view (triples) with the model
(a versioned object graph).

The main difference is that in Datomic the timestamps don't identify a single
fact but the entire object graph at that point in time.

To make an analogy: saying that Datomic is just a triple store is about the
same as saying that git is just a store for lists of diffs.

------
mbreese
Couldn't you just model your data in a traditional RDBMS with a timestamp as
well? In fact, for mutable data that you'd like to keep old versions of, this
is pretty standard. A simple design would be a separate person_locations table
that maps a person to a location.

With everything else, the standard RDBMS table could be considered as having a
'snapshot' of the Datomic values.
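A quick sketch of that person_locations design, using SQLite and illustrative column names (not from any real schema): old rows are kept instead of overwritten, and a "snapshot" is just the latest row on or before a given timestamp.

```python
import sqlite3

# Timestamped rows: updates append instead of overwrite.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person_locations (
        person_id  INTEGER NOT NULL,
        location   TEXT NOT NULL,
        valid_from TEXT NOT NULL   -- ISO-8601 date, compares lexicographically
    )
""")
conn.executemany("INSERT INTO person_locations VALUES (?, ?, ?)", [
    (1, "London", "2012-01-01"),
    (1, "Paris",  "2012-06-01"),
])

def location_as_of(person_id, ts):
    """Latest location recorded on or before `ts` -- a 'snapshot' query."""
    row = conn.execute("""
        SELECT location FROM person_locations
        WHERE person_id = ? AND valid_from <= ?
        ORDER BY valid_from DESC LIMIT 1
    """, (person_id, ts)).fetchone()
    return row[0] if row else None

print(location_as_of(1, "2012-03-01"))  # London
print(location_as_of(1, "2012-12-01"))  # Paris
```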

I'm still not sure what benefit this has over a traditional DB. Perhaps I'll
just have to wait for the next post.

~~~
pelle
You can do it all yourself if you want, but Datomic does it for you without
you having to do anything.

Also, since it does this at the datom level, it is a lot more efficient than
versioned rows.

Interestingly, one of the early selling points of PostgreSQL was this
time-travel functionality, but it was yanked out in 6.2:

<http://www.postgresql.org/docs/6.3/interactive/c0503.htm>

~~~
mbreese
Yes, but you also lose the flexibility of _not_ doing it. I'd rather work in
an established RDBMS and configure in the design I'd like to use than use a
database that requires a specific configuration.

I'm sure that Datomic has its place, but the examples you gave in this post
aren't that convincing.

~~~
pelle
I agree with you that this, on its own, is not necessarily a good reason to
switch to Datomic.

I will be getting into more detailed examples later. My real point with this
post was more to talk about how to model the data using datoms and not
specifically the temporal aspects of it.

I made many mistakes in my original data models in Datomic, based on many
years of RDBMS thinking.

------
damncabbage
"Datomic is so different than regular databases that your average developer
will probably chose to ignore it."

Playing to the _Well I'm obviously above average_ gut feeling. Cute.

