(working for cockroach labs) As explained in the blog (as linked by another resp...

eternalban · on Oct 4, 2017

From the first link:

> Employs Raft, a popular successor to Paxos.

I don't believe that is correct, and it is not a good meme to promote. "Successor" implies the deprecation of Paxos, which is certainly not the case. For the architectural sweet spot where both protocols can be used, it is fair to say Raft provides the more accessible approach, and even that applies mainly to those who insist on rolling their own consensus mechanism.

Paxos and Raft serve distinct architectural concerns in the shared context of providing distributed state with formally defined consistency guarantees. And to the point, it is entirely reasonable to see a distributed system that uses both to address component-specific concerns.

CockroachDB, ironically (in context of your OP), in fact could benefit from a vendor lock-in were the candidate cloud provider to provide atomic clocks and GPS cards in their offerings.

ddorian43 · on Oct 4, 2017

What are the GPS cards for?

infogulch · on Oct 4, 2017

A lot of synchronization problems are solved with access to a very accurate synchronized clock. E.g. with it you can determine a reliable "happens-before" relationship between all in-flight transactions, among other useful things.

Literally all GPS does is provide a way to determine the time very accurately in relation to a bunch of synchronized atomic clocks in orbit. (As a side benefit, knowing the time like this will also give you your position.)

phamilton · on Oct 4, 2017

To expand:

Windows for concurrent operations are bounded by clock uncertainty. If you say you did something at 2:00 and I say I did something at 3:00, you probably did your thing before me. But if it's 1:57 vs 1:59, there's a strong possibility my clock is fast or yours is slow. If I know both our clocks are within 5 seconds of the correct time, then I can say your action happened before mine. So accurate clocks allow you to provide ordering to more events.
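To make the 1:57 vs 1:59 example concrete, here is a minimal sketch of that comparison. The names `Event` and `order` are made up for illustration, and it assumes every clock is within `max_error` seconds of true time; it is not how any particular database implements this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    timestamp: float  # seconds, as read from the node's local clock


def order(a: Event, b: Event, max_error: float) -> str:
    """Classify a relative to b, assuming each clock is within
    max_error seconds of the correct time (an assumption of this sketch)."""
    # The true time of an event lies in [timestamp - max_error, timestamp + max_error].
    # We can only claim an ordering when the two intervals do not overlap.
    if a.timestamp + max_error < b.timestamp - max_error:
        return "before"
    if b.timestamp + max_error < a.timestamp - max_error:
        return "after"
    return "concurrent"


yours = Event("yours", 1 * 3600 + 57 * 60)  # 1:57
mine = Event("mine", 1 * 3600 + 59 * 60)    # 1:59

print(order(yours, mine, max_error=5))   # clocks good to 5 s
print(order(yours, mine, max_error=90))  # clocks only good to 90 s
```

With a 5 second bound the two intervals are far apart, so the events can be ordered; with a 90 second bound the intervals overlap and the events have to be treated as concurrent.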

Additionally, if you do identify operations as concurrent, you can explicitly deal with them. Dealing with them is doable, but expensive. If your clocks are in sync with a small margin of error, the rate at which you will have to do expensive things to resolve concurrent operations is low.

Google's Spanner database uses atomic clocks to have a max clock skew of 7ms. That eliminates most concurrent conflicts, and dealing with conflicts is fast because we can recognize a conflict quickly. Basically, it waits 7ms on all writes to see if any other nodes have a conflicting write.

CockroachDB can operate in "spanner mode", but without atomic clocks it uses a max clock skew of 250ms. This is a much bigger window and identifying conflicts on write takes a lot longer.
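A toy model of that "wait out the skew window on every write" idea, just to show why the size of the window matters. `commit_with_wait` and its parameters are hypothetical names for this sketch; it is not Spanner's or CockroachDB's actual code, and the clock and sleep functions are injectable only so the behavior can be checked without really sleeping.

```python
import time


def commit_with_wait(apply_write, max_skew_s, now=time.monotonic, sleep=time.sleep):
    """Apply a write, then block until max_skew_s seconds have passed
    since the commit timestamp was taken (toy model, assumed semantics)."""
    commit_ts = now()
    apply_write(commit_ts)
    # After this wait, no correctly synchronized node can still read a
    # clock value earlier than commit_ts.
    remaining = commit_ts + max_skew_s - now()
    if remaining > 0:
        sleep(remaining)
    return commit_ts


# Same write path, two different skew bounds.
log = []
commit_with_wait(log.append, max_skew_s=0.007)  # about 7 ms added per write
commit_with_wait(log.append, max_skew_s=0.250)  # about 250 ms added per write
```

The write itself is the same in both calls; the only difference is how long each one has to sit before it can be acknowledged.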

CockroachDB recommends that you don't run in "spanner mode" because of the performance hit. Instead, the default mode does lazy conflict detection, where instead of waiting after all writes, it will sometimes wait to read data until the clock skew window has passed.
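Roughly, the lazy approach moves the check to read time. The sketch below is a simplified illustration of that idea, with hypothetical names (`classify_read`, `max_offset`); the real logic is described in the linked post and is more involved than this.

```python
def classify_read(read_ts: float, value_ts: float, max_offset: float) -> str:
    """Decide what a reader at read_ts can conclude about a value written
    at value_ts, assuming clocks differ by at most max_offset seconds."""
    if value_ts <= read_ts:
        # Written at or before the read timestamp: safe to return.
        return "visible"
    if value_ts <= read_ts + max_offset:
        # The timestamp is "in the future", but only by less than the clock
        # skew bound, so the write may really have happened before the read.
        # The reader has to wait or retry at a later timestamp.
        return "uncertain"
    # Far enough in the future that it cannot have preceded the read.
    return "invisible"


print(classify_read(read_ts=10.0, value_ts=9.9, max_offset=0.25))
print(classify_read(read_ts=10.0, value_ts=10.1, max_offset=0.25))
print(classify_read(read_ts=10.0, value_ts=11.0, max_offset=0.25))
```

Only reads that land in the uncertain band pay a cost, which is why this is cheaper on average than waiting after every write.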

Here's a great read on the topic:

https://www.cockroachlabs.com/blog/living-without-atomic-clo...

qaq · on Oct 4, 2017

Time sync

daxorid · on Oct 4, 2017

Thanks!