
Verifying Transactional Consistency with Jepsen and FaunaDB - evanweaver
https://blog.fauna.com/verifying-transactional-consistency-with-jepsen-and-faunadb
======
sargun
Is this OSS?

> As part of our engineering process at Fauna, we have built a comprehensive
> suite of unit and property-based tests, as well as a sophisticated
> distributed testing framework that is capable of checking the behavior of
> FaunaDB in the presence of a wide combination of fault injections and
> operational changes, similar to Netflix’s Chaos Monkey.

What did you write your property testing framework in?

~~~
freels
We run FaunaDB as a service, or you can license it for on-prem use. Drivers
and tooling are all open however, and we plan to release the source for
correctness testing as well.

Our property tests are written in Scala as a Monad for controlling execution
and random sample generation, and our internal distributed test harness uses
Akka. This worked out nicely because we were able to make the relevant
integration and fault tests scale from local runs via SBT to full geo-
distributed clusters and OS-level fault injection.
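
A minimal sketch of the "monad for random sample generation" idea, in Python rather than Scala for brevity. Every name here (`Gen`, `int_between`, `list_of`, `for_all`) is hypothetical and illustrative only; this is not Fauna's framework, just the general shape such a generator usually takes.

```python
import random


class Gen:
    """A generator wraps a function from a random source to a sample."""

    def __init__(self, run):
        self.run = run

    @staticmethod
    def pure(value):
        # Lift a constant into the generator context.
        return Gen(lambda rng: value)

    def map(self, f):
        # Transform each sample without touching the random source.
        return Gen(lambda rng: f(self.run(rng)))

    def flat_map(self, f):
        # Monadic bind: the next generator may depend on the previous sample.
        return Gen(lambda rng: f(self.run(rng)).run(rng))


def int_between(lo, hi):
    # Uniform integer in the inclusive range [lo, hi].
    return Gen(lambda rng: rng.randint(lo, hi))


def list_of(gen, max_len):
    # Pick a length first, then draw that many elements.
    return int_between(0, max_len).flat_map(
        lambda n: Gen(lambda rng: [gen.run(rng) for _ in range(n)])
    )


def for_all(gen, prop, trials=200, seed=0):
    """Return the first counterexample found, or None if every trial passes."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        sample = gen.run(rng)
        if not prop(sample):
            return sample
    return None
```

For example, `for_all(list_of(int_between(-100, 100), 20), lambda xs: sorted(sorted(xs)) == sorted(xs))` returns `None`, since sorting is idempotent. Threading a seeded random source through the generator is what makes a failing run replayable.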

~~~
winetraveler
Would be good to see -- problems/issues that needed code fixes? maybe caused
you to rethink (or regret) implementation decisions? I'm sure Kyle's report
will have something good. I always root for Jepsen...and I haven't been
disappointed yet.

------
sandstrom
Interesting to see new DBs cropping up! I think there is much to improve over
the choices popular today.

However, it's difficult to quickly grasp what makes Fauna unique. A comparison
with e.g. CockroachDB would be nice!

A good example of doing this well is RethinkDB, which provided two comparisons
with MongoDB (one biased, one unbiased).

1\. Technical comparison (unbiased):

[https://rethinkdb.com/docs/comparison-tables/](https://rethinkdb.com/docs/comparison-tables/)

2\. Personal comparison (biased):

[https://rethinkdb.com/docs/rethinkdb-vs-mongodb/](https://rethinkdb.com/docs/rethinkdb-vs-mongodb/)

Consul (not a DB though) also has these comparisons:

[https://www.consul.io/intro/vs/index.html](https://www.consul.io/intro/vs/index.html)

~~~
dhruvgupta
Some conversation on CockroachDB vs FaunaDB here -
[https://news.ycombinator.com/item?id=16878102](https://news.ycombinator.com/item?id=16878102)

The primary difference is in the interface (SQL vs NoSQL), the transactional
protocol, and capabilities such as built-in multi-tenancy and temporality.

Might be useful to note that GV invested in both (Fauna was after Cockroach).

~~~
sandstrom
Thanks!

------
winetraveler
Jepsen is an important sanity check on marketing. Easy to say in collateral
you are "strict" ACID -- Jepsen proves (or denies) those claims. Some fun
reports over there [http://jepsen.io/analyses](http://jepsen.io/analyses)

~~~
retroryan
"strict" ACID can mean a lot of different things. There are many levels of
consistency, and they prevent different types of issues with data correctness
and isolation. I think database vendors should clarify what consistency model
they mean as defined by
[https://jepsen.io/consistency](https://jepsen.io/consistency)

~~~
freels
That's a great resource. We've tried to be consistent in using these terms. I
think a lot of the confusion and lack of standardization over terminology came
from the fact that database and distributed systems research had less overlap
in the past...

Consistency in the ACID sense meaning something different from CAP Consistency
is another good example.

~~~
freels
Another point that often gets lost when talking about ACID guarantees is the
maximum scope of a transaction that a system supports. There are many systems
which support ACID guarantees for a single record or shard, but fewer that can
support fully distributed transactions across a sharded dataset in a scalable
manner.

------
qaq
This whole space got really tough after Apple open-sourced FoundationDB. It
will take a few years for higher-level solutions that build on top of
FoundationDB to surface, but after they do it will be really tough for the
competitors.

~~~
evanweaver
Not sure why building on FoundationDB is better than building on the similar
internal storage foundations current vendors have already implemented?

~~~
qaq
[https://www.youtube.com/watch?v=4fFDFbi3toc](https://www.youtube.com/watch?v=4fFDFbi3toc)

------
retroryan
How do Fauna's consistency guarantees compare to CockroachDB, Spanner and
YugaByte? It is difficult to compare because different vendors use different
terminology. For example, Spanner supports 'external consistency', and
CockroachDB says it supports serializable as defined by ANSI. Is this the same
as what Fauna calls strict serializable?

~~~
freels
I believe what Spanner calls 'external consistency' we call strict
serializability. Read-write transactions in FaunaDB are strictly serializable,
whereas reads default to serializable. This allows us to serve reads
independently from the closest region to the client, cutting down on latency
in geo-distributed clusters.

Serializable reads are a bit weaker than external consistency or strict serializability;
Peter Bailis has a good writeup on the meaning of serializability vs strict
serializability here:
[http://www.bailis.org/blog/linearizability-versus-serializab...](http://www.bailis.org/blog/linearizability-versus-serializability/)
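
A toy illustration of that gap, assuming each replica applies a prefix of one totally ordered write log. The `Replica` class and its methods are hypothetical, and this is not a description of FaunaDB's actual protocol; it only shows why a locally served read can be serializable without being strictly serializable.

```python
class Replica:
    """Toy replica that applies a prefix of a totally ordered write log."""

    def __init__(self):
        self.applied = 0   # number of log entries applied so far
        self.state = {}

    def catch_up(self, log, upto=None):
        # Apply log entries in order, up to index `upto` (default: all).
        end = len(log) if upto is None else min(upto, len(log))
        for key, value in log[self.applied:end]:
            self.state[key] = value
        self.applied = max(self.applied, end)

    def read(self, key):
        # Served locally, with no cross-region coordination.
        return self.state.get(key)


# Two committed writes, in commit order.
log = [("x", 1), ("x", 2)]

far = Replica()
far.catch_up(log)            # fully caught up

near = Replica()
near.catch_up(log, upto=1)   # lagging: has only seen the first write

stale = near.read("x")       # value from a consistent prefix of the log
fresh = far.read("x")        # value after both writes

near.catch_up(log)
after = near.read("x")       # the lagging replica converges once it catches up
```

The read against `near` fits the serial order "write 1, read, write 2", so it is serializable. If that read began after the second write was acknowledged, though, it breaks the real-time ordering that strict serializability adds.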

------
evanweaver
Here is a direct link to the analysis:
[https://storage.pardot.com/517431/114459/FaunaDB_Correctness...](https://storage.pardot.com/517431/114459/FaunaDB_Correctness_Report_042618.pdf)

------
AtlasBarfed
What is FaunaDB's CAP focus? Kills me that this isn't front and center in
nosql home pages.

~~~
evanweaver
It is a CP system, although reads scale out more like those of an AP system
because coordination is not required on consistent reads.

------
devopsguyinnyc
What level of availability guarantees does Fauna provide?

~~~
jbieber
This is a really important question.

------
jbieber
What are typical use cases?

