
Living Without Atomic Clocks - tschottdorf
http://www.cockroachlabs.com/blog/living-without-atomic-clocks/
======
jasonwatkinspdx
First, CockroachDB's time API is based on the Hybrid Logical Clocks paper:
[http://www.cse.buffalo.edu/tech-reports/2014-04.pdf](http://www.cse.buffalo.edu/tech-reports/2014-04.pdf)

This paper is one of the most interesting published in the last couple of
years IMO. I remain surprised at how many people have overlooked it. I'm not
sure why this blog post makes no mention of the work; in the past the
Cockroach folks have been quite explicit about crediting the research
documents they're drawing from.

Second, Cloudera has a patent on this work:
[http://www.freepatentsonline.com/y2015/0156262.html](http://www.freepatentsonline.com/y2015/0156262.html)

~~~
hnkimb3558
CockroachDB's time API is based on the Hybrid Logical Clocks paper. We credit
it on our design doc
([https://github.com/cockroachdb/cockroach/blob/master/docs/de...](https://github.com/cockroachdb/cockroach/blob/master/docs/design.md))
and in the source code.

This blog post doesn't mention HLC because the explanation didn't require it.
HLC boils down to a mechanism for taking the maximum physical wall time across
>= 2 nodes, while still being able to provide monotonically increasing time
via incrementing the logical component of the hybrid logical timestamp. In the
blog post, this is referred to as "taking the maximum timestamp across
requests". A segue into HLC would have served only to further complicate the
explanation.
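
For readers who want the mechanism spelled out, here is a minimal sketch of a hybrid logical clock, following the update rules in the HLC paper. It is not CockroachDB's implementation; the names `Timestamp`, `HybridLogicalClock`, `now`, and `update` are made up for illustration, and the physical clock is passed in as a function so the behavior is easy to see.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True, order=True)
class Timestamp:
    wall: int     # physical component, e.g. nanoseconds since the epoch
    logical: int  # counter that orders events sharing the same wall value


class HybridLogicalClock:
    """Illustrative HLC; names are hypothetical, not CockroachDB's API."""

    def __init__(self, physical_now: Callable[[], int]) -> None:
        self._physical_now = physical_now
        self._last = Timestamp(0, 0)

    def now(self) -> Timestamp:
        """Timestamp for a local or send event."""
        pt = self._physical_now()
        if pt > self._last.wall:
            # Physical time moved forward: adopt it and reset the counter.
            self._last = Timestamp(pt, 0)
        else:
            # Physical time is behind or equal: stay monotonic by ticking
            # the logical component.
            self._last = Timestamp(self._last.wall, self._last.logical + 1)
        return self._last

    def update(self, remote: Timestamp) -> Timestamp:
        """Timestamp for receiving a message stamped with `remote`."""
        pt = self._physical_now()
        wall = max(pt, self._last.wall, remote.wall)
        if wall == self._last.wall and wall == remote.wall:
            logical = max(self._last.logical, remote.logical) + 1
        elif wall == self._last.wall:
            logical = self._last.logical + 1
        elif wall == remote.wall:
            logical = remote.logical + 1
        else:
            # Local physical time is strictly the largest of the three.
            logical = 0
        self._last = Timestamp(wall, logical)
        return self._last
```

A node whose physical clock reads 100 and which receives a message stamped `Timestamp(250, 3)` jumps to wall time 250 and keeps ticking the logical component until its own physical clock passes 250.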

The work done by Sandeep Kulkarni, Murat Demirbas, David Alves, Todd Lipcon,
Vijay Garg, and others was instrumental to our early design efforts. I am
dismayed to see there's a patent pending, though both our design and open
source implementation predate the patent filing.

~~~
tlipcon
FYI on the date issue, lest anyone think we filed the patent trying to steal
the work done by others, the patent application says:

"This application claims to the benefit of U.S. Provisional Patent Application
No. 61/911,720, entitled “HYBRIDTIME and HYBRIDCLOCKS FOR CLOCK UNCERTAINTY
REDUCTION IN A DISTRIBUTED COMPUTING ENVIRONMENT”, which was filed on Dec. 4,
2013, which is incorporated by reference herein in its entirety."

(which predates the creation of the cockroachdb repo and the hybrid logical
clock paper).

------
tlarkworthy
I don't understand why 7ms is considered a good bound for atomic clocks?

Hafele and Keating Experiment: "During October, 1971, four cesium atomic beam
clocks were flown on regularly scheduled commercial jet flights around the
world twice, once eastward and once westward, to test Einstein's theory of
relativity with macroscopic clocks. From the actual flight paths of each trip,
the theory predicted that the flying clocks, compared with reference clocks at
the U.S. Naval Observatory, should have lost 40+/-23 nanoseconds during the
eastward trip and should have gained 275+/-21 nanoseconds during the westward
trip ... Relative to the atomic time scale of the U.S. Naval Observatory, the
flying clocks lost 59+/-10 nanoseconds during the eastward trip and gained
273+/-7 nanosecond during the westward trip, where the errors are the
corresponding standard deviations. These results provide an unambiguous
empirical resolution of the famous clock "paradox" with macroscopic clocks."

So if we are losing nanoseconds per day, couldn't we fly clocks around the
datacenters and resync every month? 7ms seems beatable, and that isn't a
terrible operational overhead for a fast globally consistent database.

~~~
bigdubs
Something I never understood. If velocity (?) affects time, wouldn't time
run at a different speed in different parts of the universe / solar system?

~~~
andrepd
_Absolute_ velocity does not affect time, simply because _there is no such
thing_ as absolute velocity. Relative velocity, though, does. For instance, if
you look at a clock on a GPS satellite, moving fast relative to you, you can
see it run slower than the one on your wrist. Similarly, someone stationary on
the surface of the sun would see us here on earth moving in slow motion.

However, someone moving _on_ the satellite would see the clock on the
satellite move normally, and us on earth moving in slow motion, because they
are in the same frame of reference as the satellite.

So no, time does not operate differently on different parts of the universe
_per se_. It all depends on how it's moving relative to the observer.

~~~
monochromatic
> For instance, if you look at a clock on a GPS satellite, moving fast
> relative to you, you can see it run slower than the one on your wrist.

Actually, the general relativity effects of weaker gravitational field
dominate the special relativity effects of velocity[1]. So the GPS satellite
clock actually runs faster, not slower.

[1] [http://www.astronomy.ohio-state.edu/~pogge/Ast162/Unit5/gps....](http://www.astronomy.ohio-state.edu/~pogge/Ast162/Unit5/gps.html)
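
A rough back-of-the-envelope check of the two effects, as a sketch under simplifying assumptions: a circular orbit, a non-rotating observer on the surface, and first-order (weak-field, low-velocity) formulas. The orbital radius `R_GPS` is an approximate figure chosen for illustration, and the function names are made up.

```python
import math

GM_EARTH = 3.986004418e14  # m^3/s^2, Earth's gravitational parameter
C = 299_792_458.0          # m/s, speed of light
R_EARTH = 6.371e6          # m, mean Earth radius
R_GPS = 2.656e7            # m, approximate GPS orbital radius (assumption)
SECONDS_PER_DAY = 86_400


def sr_loss_per_day(r: float) -> float:
    """Seconds per day the orbiting clock loses due to its speed (SR)."""
    v_squared = GM_EARTH / r  # circular-orbit speed squared
    return v_squared / (2 * C * C) * SECONDS_PER_DAY


def gr_gain_per_day(r: float) -> float:
    """Seconds per day the orbiting clock gains from weaker gravity (GR)."""
    return GM_EARTH / (C * C) * (1 / R_EARTH - 1 / r) * SECONDS_PER_DAY


def net_gain_per_day(r: float) -> float:
    """Positive means the orbiting clock runs fast relative to the ground."""
    return gr_gain_per_day(r) - sr_loss_per_day(r)


if __name__ == "__main__":
    print(f"SR loss:  {sr_loss_per_day(R_GPS) * 1e6:.1f} us/day")
    print(f"GR gain:  {gr_gain_per_day(R_GPS) * 1e6:.1f} us/day")
    print(f"Net gain: {net_gain_per_day(R_GPS) * 1e6:.1f} us/day")
```

With these inputs the velocity term comes out to roughly 7 microseconds per day and the gravitational term to roughly 45, so the net effect is the satellite clock running fast by tens of microseconds per day.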

~~~
techdragon
There's a really neat thing about this: the relationship balances out at a
certain orbital height before it flips over, so there's an orbit where you're
chronologically in sync with the ground.
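
A first-order sketch of where that balance point sits, assuming a circular orbit of radius $r$ and a non-rotating observer at the surface radius $R$ (both simplifications; $M$ is Earth's mass):

```latex
% Fractional rate gain from weaker gravity (GR) vs. loss from orbital speed (SR)
\frac{GM}{c^2}\left(\frac{1}{R} - \frac{1}{r}\right) = \frac{v^2}{2c^2},
\qquad v^2 = \frac{GM}{r} \quad \text{(circular orbit)}

% Substitute and cancel GM/c^2:
\frac{1}{R} - \frac{1}{r} = \frac{1}{2r}
\;\Longrightarrow\;
\frac{1}{R} = \frac{3}{2r}
\;\Longrightarrow\;
r = \tfrac{3}{2} R
```

Under those assumptions the crossover is at one and a half Earth radii from the center, i.e. an altitude of about half an Earth radius. Below that the velocity effect wins and the orbiting clock runs slow; above it the gravitational effect wins and it runs fast.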

~~~
jhayward
That's only true with respect to a given locus of points on the ground, not
the full surface; i.e., the relative velocity of the SV isn't the same for all
points that may be measuring.

In practice the GR effect is compensated by the satellite at manufacturing,
the SR effect is treated in the receiver - for just that reason.

------
nickpsecurity
Nice writeup. I disagree that we need chip-scale, atomic clocks. My idea was a
dedicated, battery-backed piece of hardware that reliably stored time plus
could sync other machines. Plugs into an interconnect with ultra-low latency.
One for each datacenter.

You can plug them into each machine in the cluster periodically to sync them.
Or you can plug one into a master node that connects over a low-latency
management interface separate from the main data line. Occasionally, the time
server gets exclusive access to that line, assesses latency, and then syncs
its time. The time server might be custom built to avoid its own skew, or keep
one of the timekeeping devices attached. Those are periodically shipped to a
central location to resync themselves against an atomic clock or each other.

What do y'all think?

~~~
NeutronBoy
> My idea was a dedicated, battery-backed piece of hardware that reliably
> stored time plus could sync other machines. Plugs into an interconnect with
> ultra-low latency. One for each datacenter.

Google have a variant on this, where they use a GPS receiver in each data
centre to provide an accurate time source for local machines.

~~~
dingo_bat
I've seen this being used extensively in sensor control centers (power
grid/power plants), so it's definitely not limited to Google.

Basically by accurate and precise timing, they are able to reconstruct and
pinpoint the origin of a failure.

------
Animats
Chip-scale atomic clocks are now about $1500.[1] This just gets you a
10.0000000000 MHz oscillator; something else has to count time and provide
output.

Passing around the max sequence number as a monotonic sequence indicator has a
risk. A bogus sequence number near the max value can cause serious problems.

[1]
[http://www.microsemi.com/salescontacts?ctype=3](http://www.microsemi.com/salescontacts?ctype=3)
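
To make the risk concrete: if nodes blindly adopt the maximum timestamp they see, one corrupted value near the top of the range drags every clock forward and leaves no headroom. One possible mitigation, sketched here purely as an illustration and not as a description of what any particular system does, is to refuse remote values that are implausibly far ahead of the local physical clock. The names `checked_max`, `ClockOffsetError`, and `max_offset` are hypothetical.

```python
class ClockOffsetError(Exception):
    """Raised when a remote timestamp is implausibly far in the future."""


def checked_max(local_wall: int, remote_wall: int,
                physical_now: int, max_offset: int) -> int:
    """Return max(local_wall, remote_wall), rejecting suspicious remotes.

    All values share one unit (e.g. nanoseconds). `max_offset` is an
    operator-chosen bound on how far ahead of the local physical clock a
    remote timestamp may be; it is an assumption of this sketch.
    """
    if remote_wall - physical_now > max_offset:
        raise ClockOffsetError(
            f"remote timestamp {remote_wall} is more than {max_offset} "
            f"ahead of local physical time {physical_now}"
        )
    return max(local_wall, remote_wall)
```

A remote value slightly ahead of local time is accepted as usual, while something like `2**63 - 1` is rejected instead of being propagated to every other node.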

------
grogers
When executing a transaction with a given timestamp on some node, doesn't the
node have to guarantee that it will no longer accept commits with a smaller
timestamp? Without that commitment, you could read a piece of data that is
later updated by a transaction that occurred logically before you, breaking
serializability and/or snapshot isolation.

The Spanner paper is unclear how they deal with this, but my guess is that
since they have accurately synchronized clocks, you'll never have to block
long for that commitment to hold. Spanner also uses pessimistic locking, so
for a R/W transaction, you can rely on locking reads to prevent the anomaly.

With CockroachDB, wouldn't this commitment imply that poorly synchronized
clocks would lead to poor performance?

~~~
hnkimb3558
Cockroach enforces this guarantee on a per-key basis as opposed to for the
entire node. If a key has been read at time t, it may only be subsequently
written at time > t. CockroachDB accomplishes this using a timestamp cache at
the leader node for a range. But this doesn't cause writes to block. Instead,
the write timestamp is pushed to the most recent read + 1 logical tick.
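
A toy sketch of that rule, not the actual CockroachDB data structure: timestamps are modeled as `(wall, logical)` tuples, the cache is a plain dictionary with no eviction or key ranges, and the class and method names are invented for illustration.

```python
from typing import Dict, Tuple

# (wall, logical); Python compares tuples lexicographically, which matches
# the ordering of hybrid timestamps.
Timestamp = Tuple[int, int]


class TimestampCache:
    """Tracks the latest read timestamp per key (illustrative only)."""

    def __init__(self) -> None:
        self._latest_read: Dict[str, Timestamp] = {}

    def record_read(self, key: str, ts: Timestamp) -> None:
        """Remember the highest timestamp at which `key` has been read."""
        prev = self._latest_read.get(key)
        if prev is None or ts > prev:
            self._latest_read[key] = ts

    def write_timestamp(self, key: str, proposed: Timestamp) -> Timestamp:
        """Return the timestamp a write to `key` may actually use.

        A write proposed at or below the latest read is not blocked; it is
        pushed to one logical tick past that read.
        """
        last_read = self._latest_read.get(key)
        if last_read is None or proposed > last_read:
            return proposed
        wall, logical = last_read
        return (wall, logical + 1)
```

If key `"a"` was read at `(100, 0)`, a write proposed at `(90, 0)` comes back as `(100, 1)`, while a write to a key that was never read keeps its proposed timestamp.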

~~~
grogers
Interesting, thanks for the info.

Do reads execute through Raft? If not, how do you guarantee that behavior -
relying on leader leases? If so, does the lease period guarantee disjointness
in the timestamps a leader can process, or do you rely on the leader
abdicating control via timeout?

Since it can push the write timestamp, does that mean the write transaction
aborts if it was in serializable mode? Does that cause problems updating a
value that is heavily read?

