I was a fly on the wall as this work was being done and it was super interesting to see the discussions. I was also surprised that Jepsen didn’t find critical bugs. Clarifying the docs and unusual (intentional) behaviors was a very useful outcome. It was a very worthwhile confidence building exercise given that we’re running a bank on Datomic…
Given that Rich Hickey designed this database, the outcome is perhaps unsurprising. What a fabulous read - anytime I feel like I'm reasonably smart, it's always good to be humbled by a Jepsen analysis.
A good design does not guarantee the absence of implementation bugs. But a good design can make introducing bugs harder and less probable. That appears to be the case here, which makes it a design worth studying and maybe emulating.
In practical terms, if you are a database and Jepsen doesn't find any bugs, that's as much assurance as you are going to get in 2024 short of formal verification.
Formal verification is very powerful but still not full assurance. Fun fact: testing and monitoring of Datomic has sometimes uncovered design flaws in the underlying storage services that formal verification missed.
To start with, you usually perform verification on a behavioral model and not on the code itself. This opens up the possibility that there are behavioral differences between the code itself and the model which wouldn't be caught.
Thank you. I've updated my initial guess of p(no critical bugs | Jepsen did not find critical bugs) from 0.5 to 0.82, given my estimate of the likelihood and base rates.
Evidence of absence ("we searched really carefully and nothing came up") does update the Bayesian priors significantly, so the probability of absence of bugs can now be estimated as much higher.
I doubt any organization that isn't directly putting lives on the line is testing database technology as thoroughly and competently as Jepsen. A bank's job is to be a bank, not to be Jepsen.
I would have thought they would be more rigorous, since mistakes for them could threaten the very viability of the business? Which is why I assume most are still on mainframes. (Never worked at a bank)
Banks have existed since long before computers, and thus have ways to detect and correct errors that are not purely technological (such as double-entry bookkeeping, backups, supporting documentation, and different processes). So a bank can survive a db doing nasty things at a frequency low enough that it isn't detected beforehand; they don't need to "prove in Coq" that everything is correct.
Mistakes don't threaten them that much. When Equifax (admittedly not a bank) can make massive negligent fuckups and still be a going concern, there isn't much heat there. Most fuckups a bank makes can be unwound.
Mainframe systems aren't tested to the Jepsen level of standard just because they were built on mainframes in the 70s/80s. In fact, quite the opposite.
Banks are not usually run by people who chase the first fad.js they see; they can usually also think further ahead than five minutes.
Also, I'm sure they engineer their systems so that every operation and action is logged multiple times, with multiple layers of redundancy.
A main transaction DB will not be a "single source of truth" for any event. It will be the main source of truth, but the ledger you see in your online bank is only a simplified view into it.
This is the first time I try reading a Jepsen report in-depth, but I really like the clear description of Datomic's intra-transaction behavior. I didn't realize how little I understood the difference between Datomic's transactions and those of SQL databases.
One thing that stands out to me is this paragraph
Datomic used to refer to the data structure passed to d/transact as a “transaction”, and to its elements as “statements” or “operations”. Going forward, Datomic intends to refer to this structure as a “transaction request”, and to its elements as “data”.
What does this mean for d/transact-async and related functionality from the datomic.api namespace? I haven't used Datomic in nearly a year. A lot seems to have changed.
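(Not an answer to the async question, but to make the new terminology concrete, here's a minimal sketch of submitting a "transaction request" through the peer API. The connection URI and the :user/* attributes are made up, and I'm assuming they're already installed in the schema.)

  (require '[datomic.api :as d])

  ;; Hypothetical dev-storage URI.
  (def conn (d/connect "datomic:dev://localhost:4334/example"))

  ;; A "transaction request": a collection of data to incorporate,
  ;; not a list of operations to perform.
  (def tx-request
    [{:db/id "stu" :user/name "Stu" :user/favorite-number 42}])

  ;; Peer API: d/transact blocks until a result is available;
  ;; d/transact-async returns immediately. Both return a future.
  @(d/transact conn tx-request)
  @(d/transact-async conn tx-request)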
This is a fantastic detailed report about a really good database. I'm also really happy to see the documentation being clarified and updated.
As a side note: I so wish Apple would pay for a Jepsen analysis of FoundationDB. I know Aphyr said that "their tests are likely better", but if indeed Jepsen caught no problems in FoundationDB, it would be a strong data point for another really good database.
I would never, ever want to take food out of aphyr's mouth, but is there something specific that makes creating the Jepsen tests out of reach of a sufficiently motivated contributor, or so prohibitively expensive that a "gofundme-ish" setup wouldn't get it done?
I (perhaps obviously?) am not well-versed enough in that space to know, but when I see "wish $foo would pay for" my ears perk up, because there is so much available capital sloshing around, and waiting on Apple to do something is (in my experience) a long wait.
I have heard from people who paid for a Jepsen test that he is eye-wateringly expensive (and absolutely, rightfully should be; there are very few people in the world who can conduct analyses on this level), but maybe achievable with a gofundme.
I am not sure, for the same reason, that designing a DIY Jepsen suite correctly is really achievable for the vast majority of people. Distributed systems are very hard to get right, which means that testing them is very hard to get right as well.
He provides a good and unique service. He's worth every penny. Note that for some companies, the real "expense" is dedicating engineering hours to fix the shit he lit on fire in your code.
> is there something specific that makes either just creating the Jepsen tests somehow out of reach of a sufficiently motivated contributor
Skill and capabilities, I believe. Very, very few people have an understanding of distributed databases on his level. And most people don't even realize how much they don't know.
Really nice work as always. I love reading these to learn more about these systems, for little tidbits of writing Clojure programs, and for the writing style. Thanks for what you do!
It struck me that Jepsen has identified clear situations leading to invariant violations, but Datomic's approach seems to have been purely to clarify their documentation. Does this essentially mean the Datomic team accepts that the violations will happen, but doesn't care?
From the article:
> From Datomic’s point of view, the grant workload’s invariant violation is a matter of user error. Transaction functions do not execute atomically in sequence. Checking that a precondition holds in a transaction function is unsafe when some other operation in the transaction could invalidate that precondition!
As Jepsen confirmed, Datomic’s mechanisms for enforcing invariants work as designed. What does this mean practically for users? Consider the following transactional pseudo-data:
[
[Stu favorite-number 41]
;; maybe more stuff
[Stu favorite-number 42]
]
An operational reading of this data would be that early in the transaction I liked 41, and that later in the transaction I liked 42. Observers after the end of the transaction would hopefully see only that I liked 42, and we would have to worry about the conditions under which observers might ever see the 41.
This operational reading of intra-transaction semantics is typical of many databases, but it presumes the existence of multiple time points inside a transaction, which Datomic neither has nor wants — we quite like not worrying about what happened “in the middle of” a transaction. All facts in a transaction take place at the same point in time, so in Datomic this transaction states that I started liking both numbers simultaneously.
If you incorrectly read Datomic transactions as composed of multiple operations, you can of course find all kinds of “invariant anomalies”. Conversely, you can find “invariant anomalies” in SQL by incorrectly imposing Datomic’s model on SQL transactions. Such potential misreadings emphasize the need for good documentation. To that end, we have worked with Jepsen to enhance our documentation [1], tightening up casual language in the hopes of preventing misconceptions. We also added a tech note [2] addressing this particular misconception directly.
To build on this, Datomic includes a pre-commit conflict check that would prevent this particular example from committing at all: it detects that there are two incompatible assertions for the same entity/attribute pair, and rejects the transaction. We think this conflict check likely prevents many users from actually hitting this issue in production.
The issue we discuss in the report only occurs when the transaction expands to non-conflicting datoms--for instance:
[Stu favorite-number 41]
[Stu hates-all-numbers-and-has-no-favorite true]
These entity/attribute pairs are disjoint, so the conflict checker allows the transaction to commit, producing a record which is in a logically inconsistent state!
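To make that concrete in code, a rough sketch (conn and stu-id are assumed bound, the attributes are hypothetical and cardinality-one):

  (require '[datomic.api :as d])

  ;; Two different values for the same entity/attribute pair in one
  ;; transaction request fail the pre-commit conflict check; the whole
  ;; transaction is rejected (an error along the lines of :db.error/datoms-conflict).
  @(d/transact conn [[:db/add stu-id :user/favorite-number 41]
                     [:db/add stu-id :user/favorite-number 42]])

  ;; Disjoint entity/attribute pairs pass the check and commit together,
  ;; which is the case discussed in the report:
  @(d/transact conn [[:db/add stu-id :user/favorite-number 41]
                     [:db/add stu-id :user/hates-all-numbers true]])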
On the documentation front--Datomic users could be forgiven for thinking of the elements of transactions as "operations", since Datomic's docs called them both "operations" and "statements". ;-)
In order for user code to impose invariants over the entire transaction, it must have access to the entire transaction. Entity predicates have such access (they are passed the after db, which includes the pending transaction and all other transactions to boot). Transaction functions are unsuitable, as they have access only to the before db. [2]
Use entity predicates for arbitrary functional validations of the entire transaction.
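For anyone who hasn't used entity predicates, a minimal sketch of the shape (all names are hypothetical; see the entity-spec docs for the real details):

  (require '[datomic.api :as d])

  ;; Assume this fn lives in the my.ns namespace on the transactor's classpath.
  ;; It receives the *after* db and the entity id, and returns true when the
  ;; whole post-transaction entity satisfies the invariant.
  (defn scores-ordered?
    [db eid]
    (let [{:keys [score/low score/high]} (d/pull db [:score/low :score/high] eid)]
      (<= low high)))

  ;; An entity spec names the predicate(s) to run:
  @(d/transact conn [{:db/ident        :score/guard
                      :db.entity/preds 'my.ns/scores-ordered?}])

  ;; Asserting :db/ensure in a transaction request asks Datomic to run the
  ;; spec's predicates against the after db before committing:
  @(d/transact conn [{:db/id "s" :score/low 2 :score/high 10 :db/ensure :score/guard}])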
Datomic transactions are not “operations to perform”, they are a set of novel facts to incorporate at a point in time.
Just like a git commit, a Datomic transaction describes a set of modifications. Do you, or should you, want to care about the order in which the adds, updates, and deletes occur within a single git commit? OMG no, that sounds awful.
The really unusual thing is that developers expect intra-transaction ordering, because it's what they accept from every other database. OMG, that sounds awful, how do you live like that.
Yeah, this basically boils down to "a potential pitfall, but consistent with documentation, and working as designed". Whether this actually matters depends on whether users are writing transaction functions which are intended to preserve some invariant, but would only do so if executed sequentially, rather than concurrently.
Datomic's position (and Datomic, please chime in here!) is that users simply do not write transaction functions like this very often. This is defensible: the docs did explicitly state that transaction functions observe the start-of-transaction state, not one another! On the other hand, there was also language in the docs that suggested transaction functions could be used to preserve invariants: "[txn fns] can atomically analyze and transform database values. You can use them to ensure atomic read-modify-update processing, and integrity constraints...". That language, combined with the fact that basically every other Serializable DB uses sequential intra-transaction semantics, is why I devoted so much attention to this issue in the report.
It's a complex question and I don't have a clear-cut answer! I'd love to hear what the general DB community and Datomic users in particular make of these semantics.
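For concreteness, here is the shape of the pattern in question; this is a sketch with made-up names, not code from the report's workload. A transaction function receives the before db and returns transaction data:

  (require '[datomic.api :as d])

  (defn withdraw
    "Intended invariant: balance never goes negative. Sketch only; would need
     to be installed or classpath-resolvable as a transaction function."
    [db account amount]
    (let [balance (:account/balance (d/pull db [:account/balance] account))]
      (when (< (- balance amount) 0)
        (throw (ex-info "would overdraw" {:account account})))
      [[:db/add account :account/balance (- balance amount)]]))

  ;; With a starting balance of 100, each call below checks its precondition
  ;; against the same before-db and each emits [:db/add account :account/balance 40].
  ;; The transaction commits with balance 40: no overdraw, but one withdrawal is
  ;; silently lost, because neither call saw the other's effect.
  ;;   [[my.ns/withdraw account-id 60]
  ;;    [my.ns/withdraw account-id 60]]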
As a proponent of just such tools I would say also that "enough rope to shoot(?) yourself" is inherent in tools powerful enough to get anything done, and is not a tradeoff encountered only when reaching for high power or low ceremony.
It is worth noting here that Datomic's intra-transaction semantics are not a decision made in isolation, they emerge naturally from the information model.
Everything in a Datomic transaction happens atomically at a single point in time. Datomic transactions are totally ordered, and this ordering is visible via the time t shared by every datom in the transaction. These properties vastly simplify reasoning about time.
With this information model, intermediate database states are inexpressible. Intermediate states cannot all have the same t, because they did not happen at the same time. And they cannot have different ts, because they are part of the same transaction.
When we designed Datomic (circa 2010), we were concerned that many languages had better support for lists than for sets, in particular languages with list literals but no set literals.
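A quick sketch of what that looks like from the peer API (conn is assumed bound, attributes are hypothetical): every datom produced by a single d/transact shares one tx entity, and therefore one t.

  (require '[datomic.api :as d])

  (def result
    @(d/transact conn [[:db/add "stu" :user/favorite-number 41]
                       [:db/add "stu" :user/hates-all-numbers true]]))

  ;; All datoms in the result carry the same :tx, so there is exactly one
  ;; point in time at which the whole transaction happened.
  (distinct (map :tx (:tx-data result)))
  ;; => (13194139534320)   ; a single tx id (value illustrative)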
Clojure of course had set literals from the beginning...
An advantage of using lists is that tx data tends to be built up serially in code. Having to look at your tx data in a different (set) order would make proofreading alongside the code more difficult.
Yes. Perhaps this is a performance choice for DataScript since DataScript does not keep a complete transaction history the way Datomic does? I would guess this helps DataScript process transactions faster. There is a github issue about it here: https://github.com/tonsky/datascript/issues/366
I think the article answers your question at the end of section 3.1:
> "This behavior may be surprising, but it is generally consistent with Datomic’s documentation. Nubank does not intend to alter this behavior, and we do not consider it a bug."
When you say, "situations leading to invariant violations" -- that sounds like some kind of bug in Datomic, which this is not. One just has to understand how Datomic processes transactions, and code accordingly.
I am unaffiliated with Nubank, but in my experience using Datomic as a general-purpose database, I have not encountered a situation where this was a problem.
This is good to hear! Nubank has also argued that in their extensive use of Datomic, this kind of issue doesn't really show up. They suggest custom transaction functions are infrequently written, not often composed, and don't usually perform the kind of precondition validation that would lead to this sort of mistake.
Yeah, I've used transaction functions a few times but never had a case where two transaction functions within the same d/transact call ever interacted with each other. If I did encounter that case, I would probably just write one new transaction function to handle it.
Sounds similar to the need to know that in some relational databases, you need to SELECT ... FOR UPDATE if you intend to perform an update that depends on the values you just selected.
For those who aren't aware, the name Jepsen is a play on Carly Rae Jepsen, singer behind "call me maybe". In my opinion a perfect name for a distributed systems research effort.
I’ve not really spent much time with Datomic in anger because it’s super weird, but is any of this surprising? Datomic transactions are basically just batches and I always thought it was single threaded so obviously it doesn’t have a lot of race conditions. It’s slow and safe by design.
In Clojure the answer is [2 2]. A beginner might guess [2 2] or [2 3]. Both are reasonable guesses, so a beginner needs to be quite careful!
But that isn't particularly interesting, because beginners always have to be quite careful. When you are learning any technology, you are a beginner once and experienced ever after. Tool design should optimize for the experienced practitioner. Immutability removes an enormous source of complexity from programs, so where it is feasible it is often desirable.
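For anyone reading without the parent's snippet, a hypothetical illustration of the point (not the exact code under discussion): in Clojure the "update" returns a new value and the original is untouched, so the answer stays [2 2].

  (def v [2 2])
  (assoc v 1 3) ;; => [2 3], a brand-new vector
  v             ;; => [2 2], unchanged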
This is a lot less convincing an argument when you consider that we're talking about a database interaction. If we want to do the FP weenie thing, I would assume a transaction context is monadic!
Another ding on your argument is that, in the metaphor, you _would_ get [2 3] if you aren't in a transaction! Of course there are valid reasons for this, but I think "the end result of this list of actions is different if you are in a transaction for part of it, even if you are the only client operating" is generally a bit surprising compared to default transaction semantics in other systems.
Of course, people should learn about their tools and not assume everything works exactly the same. I think it’s incorrect to assume that the choice made here was the only one that would make sense for Datomic.
Thanks Kyle! It is evident that our docs were insufficient. Rich and I have written what we hope is clearer, more comprehensive documentation about Datomic’s transaction model. We hope that this can preempt common misconceptions and we welcome all feedback!
The data model in Datomic is pretty intuitive if you're familiar with triple stores / RDF. But these similarities aren't very often referenced by the docs or in online discussions. Is it because people are rarely familiar with those concepts, or is the association with semantic web things considered potentially distracting (or am I missing something and there are major fundamental differences)?
I wonder if Datomic’s model has room for something like an “extra-strict” transaction. Such a transaction would operate exactly like an ordinary transaction except that it would also check that no transaction element reads a value or predicate that is modified by a different element. This would be a bit like saying that each element would work like an independent transaction, submitted concurrently, in a more conventional serializable database (with predicate locking!), except that the transaction only commits if all the elements would commit successfully.
This would have some runtime cost and would limit the set of things one could accomplish in a transaction. But it would remove a footgun, and maybe this would be a good tradeoff for some users, especially if it could be disabled on a per-transaction basis.
I wouldn't use it. The footgun is imaginary. I've used Datomic for ten years and I can assure you that I have never stepped on it. As a Datomic user you see transactions as clean small diffs, not as complicated multi-step processes. This is actually much more pleasant to work with.
This is also good to hear! I'm not sure whether I'd call it a "footgun" per se--that's really an empirical question about how Datomic's users understand its model. I can say that as someone with some database experience and a few weeks of reading the Datomic docs, this issue actually "broke" several of the tests I wrote for Datomic. It was especially tricky because the transactions mostly worked as expected, but would occasionally "lose updates" or cause updates intended for one entity to wind up assigned to another.
Things looked fine in my manual testing, but when I ran the full test suite Elle kept catching what looked like serious Serializability violations. Took me quite a while to figure out I was holding the database wrong!
In traditional databases, only the database engine has a scalable view of the data - that’s why you send SQL to it and stream back the response data set. With Datomic, the peer has the same level of read access as the transactor; it’s like the database comes to you.
In this read-and-update scenario, the peer will, at its leisure, read existing data and put together update data; some careful use of compare-and-set, or a custom transaction function, can ensure that the database has not changed between the read and the write in a way that makes the update improper, when that is even a possibility - a rarity.
At scale, you want to minimize the amount of work the transactor must perform, since it is so aggressively single-threaded. Offloading work to the peer is amazingly effective.
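A minimal sketch of the read-at-the-peer, guard-at-the-transactor pattern described above (conn and account-id are assumed bound, the attribute is hypothetical); the built-in compare-and-swap transaction function is :db/cas in recent versions (:db.fn/cas in older on-prem docs):

  (require '[datomic.api :as d])

  (let [db      (d/db conn)  ;; the peer reads from its own immutable snapshot
        current (:account/balance (d/pull db [:account/balance] account-id))
        updated (+ current 100)]
    ;; The transactor rejects the transaction if the balance changed since
    ;; the peer read it, so the update cannot clobber a concurrent write.
    @(d/transact conn [[:db/cas account-id :account/balance current updated]]))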
You could include two transaction functions that each constrain the transaction based on different properties of the same fact, and then alter that fact. I don't know of a practical use case, nor have I ever encountered that; it would be extremely rare IME.
Seems like the design decision to only allow a single thread to handle writes paid off.
Datomic is a marvel of good design, I wish I could use it again....