Verifying Transactional Consistency with Jepsen and FaunaDB (fauna.com)
51 points by evanweaver on July 13, 2018 | 23 comments



Is this OSS?

> As part of our engineering process at Fauna, we have built a comprehensive suite of unit and property-based tests and as well as a sophisticated distributed testing framework that is capable of checking the behavior of FaunaDB in the presence of a wide combination of fault injections and operational changes, similar to Netflix’s Chaos Monkey.

What did you write your property testing framework in?


We run FaunaDB as a service, or you can license it for on-prem use. Drivers and tooling are all open however, and we plan to release the source for correctness testing as well.

Our property tests are written in Scala as a Monad for controlling execution and random sample generation, and our internal distributed test harness uses Akka. This worked out nicely because we were able to make the relevant integration and fault tests scale from local runs via SBT to full geo-distributed clusters and OS-level fault injection.
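For readers unfamiliar with the idea, here is a minimal sketch of a generator monad for property tests. It is in Python rather than Scala and is not Fauna's code; `Gen`, `int_between`, `list_of` and `for_all` are hypothetical names chosen for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")


@dataclass(frozen=True)
class Gen(Generic[A]):
    """A generator is just a function from a random source to a value."""
    run: Callable[[random.Random], A]

    @staticmethod
    def pure(value):
        # Always produce the same value, consuming no randomness.
        return Gen(lambda rng: value)

    def map(self, f):
        # Transform each sample with a plain function.
        return Gen(lambda rng: f(self.run(rng)))

    def flat_map(self, f):
        # Monadic bind: the next generator may depend on the earlier sample.
        return Gen(lambda rng: f(self.run(rng)).run(rng))


def int_between(lo, hi):
    return Gen(lambda rng: rng.randint(lo, hi))


def list_of(gen, max_len):
    # First draw a length, then draw that many elements.
    return int_between(0, max_len).flat_map(
        lambda n: Gen(lambda rng: [gen.run(rng) for _ in range(n)])
    )


def for_all(gen, prop, runs=200, seed=0):
    """Return the first sample that falsifies prop, or None if none was found."""
    rng = random.Random(seed)
    for _ in range(runs):
        sample = gen.run(rng)
        if not prop(sample):
            return sample
    return None
```

A call such as `for_all(list_of(int_between(0, 9), 10), lambda xs: sorted(sorted(xs)) == sorted(xs))` returns `None` when no counterexample turns up. Fixing the seed makes a failing run reproducible.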


It would be good to see which problems needed code fixes, or caused you to rethink (or regret) implementation decisions. I'm sure Kyle's report will have something good. I always root for Jepsen, and I haven't been disappointed yet.


Interesting to see new DBs cropping up! I think there is much to improve over the choices popular today.

However, it's difficult to quickly grasp what makes Fauna unique. A comparison with e.g. CockroachDB would be nice!

A good example of doing this well is RethinkDB, which provided two comparisons with MongoDB (one biased, one unbiased).

1. Technical comparison (unbiased):

https://rethinkdb.com/docs/comparison-tables/

2. Personal comparison (biased):

https://rethinkdb.com/docs/rethinkdb-vs-mongodb/

Consul (not a DB though) also has these comparisons:

https://www.consul.io/intro/vs/index.html


Some conversation on CockroachDB vs FaunaDB here - https://news.ycombinator.com/item?id=16878102

The primary differences are in the interface (SQL vs NoSQL), the transactional protocol, and capabilities such as built-in multi-tenancy and temporality.

Might be useful to note that GV invested in both (Fauna was after Cockroach).


Thanks!


Jepsen is an important sanity check on marketing. Easy to say in collateral you are "strict" ACID -- Jepsen proves (or denies) those claims. Some fun reports over there http://jepsen.io/analyses


A bit more precisely: Jepsen refutes, or fails to refute, a small subset of those claims.

It can't prove strict ACID because it only sees the behaviour it happens to exercise. There are code paths it will never hit in practice, so the best it can do is refute.

It also can't refute ACID as a whole. If you look at the reports, they usually check very basic behaviour: simple sets and gets. While that can tell you a lot about the system, you can't say it will generalise to complicated joins, multi-row operations, backend plugins, etc.
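To make the "refute only" point concrete, here is a toy checker in Python. It is not how Jepsen works internally; it assumes a single register and a history of non-overlapping operations, and `Op` and `first_stale_read` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Op(NamedTuple):
    kind: str   # "write" or "read"
    value: int  # value written, or value the read returned


def first_stale_read(history: List[Op], initial: int = 0) -> Optional[int]:
    """Return the index of the first read that disagrees with the latest
    preceding write, or None if this particular history shows no anomaly."""
    current = initial
    for i, op in enumerate(history):
        if op.kind == "write":
            current = op.value
        elif op.value != current:
            return i
    return None
```

A non-`None` result is a concrete counterexample. `None` only says that this one history was clean, nothing about the histories that were never generated.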


I would also say "strict" ACID has become just another marketing term. I think what is more important is evaluating a database's support for different consistency and isolation levels. As well, what level of availability does it support with these transactional guarantees? Does it support failover to an active replica on the fly with no downtime? Are minority replicas available during a network partition?


"strict" ACID can mean a lot of different things. There are many levels of consistency, and they prevent different types of issues with data correctness and isolation. I think database vendors should clarify which consistency model they mean, as defined by https://jepsen.io/consistency


That's a great resource. We've tried to be consistent in using these terms. I think a lot of the confusion and lack of standardization over terminology came from the fact that database and distributed systems research had less overlap in the past...

Consistency in the ACID sense meaning something different from CAP Consistency is another good example.


Another point that often gets lost when talking about ACID guarantees is the maximum scope of a transaction that a system supports. There are many systems which support ACID guarantees for a single record or shard, but fewer that can support fully distributed transactions across a sharded dataset in a scalable manner.


This whole space got really tough after Apple open-sourced FoundationDB. It will take a few years for higher-level solutions built on top of FoundationDB to surface, but once they do it will be really tough for competitors.


Not sure why building on FoundationDB is better than building on the similar internal storage foundations current vendors have already implemented?



How do Fauna's consistency guarantees compare to CockroachDB, Spanner and YugaByte? It is difficult to compare because different vendors use different terminology. For example, Spanner supports 'external consistency', and CockroachDB says it supports serializable as defined by ANSI. Is this the same as what Fauna calls strict serializable?


I believe what Spanner calls 'external consistency' we call strict serializability. Read-write transactions in FaunaDB are strictly serializable, whereas reads default to serializable. This allows us to serve reads independently from the closest region to the client, cutting down on latency in geo-distributed clusters.

Serializable reads are a bit weaker than external consistency or strict serializability; Peter Bailis has a good writeup on the meaning of serializability vs strict serializability here: http://www.bailis.org/blog/linearizability-versus-serializab...
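A small brute-force sketch of the distinction, in Python. It is an illustration only, not FaunaDB's or Jepsen's checker: it tries every serial order (factorial cost), and the `Txn` shape with wall-clock `start`/`end` times is an assumption made for the example.

```python
from itertools import permutations
from typing import Dict, List, NamedTuple


class Txn(NamedTuple):
    name: str
    start: int              # time the client issued the transaction
    end: int                # time the client saw it complete
    reads: Dict[str, int]   # key -> value the transaction observed
    writes: Dict[str, int]  # key -> value the transaction wrote


def explains(order, initial):
    """True if running the txns one at a time in this order yields every read."""
    state = dict(initial)
    for t in order:
        if any(state.get(k) != v for k, v in t.reads.items()):
            return False
        state.update(t.writes)
    return True


def respects_real_time(order):
    """True if no txn is ordered after one that only started once it had finished."""
    return all(
        not (later.end < earlier.start)
        for i, earlier in enumerate(order)
        for later in order[i + 1:]
    )


def is_serializable(txns: List[Txn], initial: Dict[str, int], strict: bool = False) -> bool:
    # Serializable: some serial order explains the reads.
    # Strict: that order must also agree with real-time completion order.
    return any(
        explains(order, initial) and (not strict or respects_real_time(order))
        for order in permutations(txns)
    )


# T1 finishes writing x=1 before T2 starts, yet T2 still reads the old x=0.
history = [
    Txn("T1", start=0, end=1, reads={}, writes={"x": 1}),
    Txn("T2", start=2, end=3, reads={"x": 0}, writes={}),
]
```

Here `is_serializable(history, {"x": 0})` is `True`, because the order T2, T1 explains the read, while `is_serializable(history, {"x": 0}, strict=True)` is `False`, because that order contradicts real time. This is the kind of stale read a serializable (but not strictly serializable) read is allowed to return.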



What is FaunaDB's CAP focus? Kills me that this isn't front and center on NoSQL home pages.


It is a CP system, although reads scale out more like those of an AP system because consistent reads do not require coordination.


What level of availability guarantees does Fauna provide?


This is a really important question.


What are typical use cases?



