
Model facts, not your problem domain - ruuda
https://ruudvanasseldonk.com/2020/06/07/model-facts-not-your-problem-domain
======
memexy
> When requirements change, an append-only data model of immutable facts is
> more useful than a mutable data model that models the problem domain.

Interesting viewpoint; it sounds like a Prolog/Datalog database.

Similarly, many document databases like CouchDB keep a revision number per
document, so the database itself tracks the last N revisions and
garbage-collects stale data.

~~~
ruuda
The Prolog/Datalog comparison is fair: Datomic (the subject of the
“Deconstructing the Database” talk linked at the end) is built around this
idea of accumulating facts, and it uses a Datalog-like query language.

But you don’t need any special tools. A good old relational database will do;
just don’t do updates or deletes. In my case I was using a simple SQLite
database.
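To make the "no updates or deletes" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and column names are my own illustration, not from the article: every change is appended as a new fact, and the current value is just the latest row.

```python
import sqlite3

# Illustrative append-only schema: facts are inserted, never mutated.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE facts (
        id          INTEGER PRIMARY KEY,
        entity      TEXT NOT NULL,
        attribute   TEXT NOT NULL,
        value       TEXT NOT NULL,
        asserted_at TEXT NOT NULL DEFAULT (datetime('now'))
    )
""")

# Never UPDATE or DELETE: to "change" an email, append a new fact.
conn.execute("INSERT INTO facts (entity, attribute, value) VALUES (?, ?, ?)",
             ("user:1", "email", "old@example.com"))
conn.execute("INSERT INTO facts (entity, attribute, value) VALUES (?, ?, ?)",
             ("user:1", "email", "new@example.com"))

# The current value is simply the most recently asserted fact...
current = conn.execute("""
    SELECT value FROM facts
    WHERE entity = ? AND attribute = ?
    ORDER BY id DESC LIMIT 1
""", ("user:1", "email")).fetchone()[0]
print(current)  # new@example.com

# ...and the full history is still there for the asking.
history = conn.execute(
    "SELECT value FROM facts WHERE entity = 'user:1'").fetchall()
```

When requirements change, a new query over the same facts often suffices where a mutable schema would have needed a migration.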

Event sourcing is also kind of the same idea.

~~~
refset
It's worth mentioning that there is a next level of challenge in backfilling
and correcting historical data; that's when a bitemporal data model is a good
idea. In other words, keep track of the "valid time" (or "application
time") at which a fact became true separately from the "transaction time" (or
"system time") at which the fact was ingested into the database.

> you don’t need any special tools

Absolutely agreed. I happen to work on
[https://opencrux.com](https://opencrux.com) which is designed from the ground
up for handling bitemporal data, but I was chatting to someone a couple of
days ago who built a really neat approximation of Crux purely on Postgres,
using JSON columns and 3 simple tables: a Transaction_Log table, an
All_Documents table, and a Current_Documents table. A small handful of
triggers automated the timestamping and the population of the
Current_Documents table, but it was super simple stuff.
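A toy version of that three-table layout, as I understand the description. The table names come from the comment above; everything else is my guess at one plausible shape, using sqlite3 and application code where the original used Postgres JSON columns and triggers:

```python
import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transaction_log (
        tx_id   INTEGER PRIMARY KEY,
        tx_time TEXT NOT NULL
    );
    CREATE TABLE all_documents (      -- full history, append-only
        doc_id TEXT NOT NULL,
        tx_id  INTEGER NOT NULL REFERENCES transaction_log(tx_id),
        body   TEXT NOT NULL          -- JSON
    );
    CREATE TABLE current_documents (  -- latest version per document
        doc_id TEXT PRIMARY KEY,
        tx_id  INTEGER NOT NULL,
        body   TEXT NOT NULL
    );
""")

def put(doc_id, doc):
    """Append to the log and history; upsert the current view.
    (The original design did the timestamping and upsert in triggers.)"""
    now = datetime.now(timezone.utc).isoformat()
    tx_id = conn.execute(
        "INSERT INTO transaction_log (tx_time) VALUES (?)", (now,)).lastrowid
    body = json.dumps(doc)
    conn.execute("INSERT INTO all_documents VALUES (?, ?, ?)",
                 (doc_id, tx_id, body))
    conn.execute(
        "INSERT INTO current_documents VALUES (?, ?, ?) "
        "ON CONFLICT(doc_id) DO UPDATE SET "
        "tx_id = excluded.tx_id, body = excluded.body",
        (doc_id, tx_id, body))

put("user-1", {"name": "Ada"})
put("user-1", {"name": "Ada", "email": "ada@example.com"})

# The current view has one row; the full history keeps both versions.
current = json.loads(conn.execute(
    "SELECT body FROM current_documents WHERE doc_id = 'user-1'"
).fetchone()[0])
history = conn.execute(
    "SELECT body FROM all_documents WHERE doc_id = 'user-1'").fetchall()
```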

Naturally there is a big difference in scale and performance compared to
something like Crux (where you can join within historical timeslices
_efficiently_), but I was impressed by how Postgres's flexible JSON indexing
made the approach viable with so little upfront effort.

It's probably fair to say that the majority of queries in typical applications
won't ever care about prior history, but I think always having history
available for native querying without having to import anything from archives
or logs is fundamentally liberating.

