
There Is No Now – Problems with Simultaneity in Distributed Systems - seancribbs
http://queue.acm.org/detail.cfm?id=2745385
======
marknadal
Hallelujah, we need more articles like this. I work in distributed systems,
and people hate hate hate hearing the truth. Let's summarize the article:

1. You cannot beat the speed of light.

2. Machines break. Even the most reliable ones.

3. Networks are unreliable. Even local area networks.

4. It is an exciting time for distributed systems: CRDTs, Hybrid Logical
Clocks, Zookeeper, etc.

These are things I feel like I've been preaching a lot, but I get upset
responses like "Well, certainly Globally Consistent systems work, most
databases are!" Lies, as @rdtsc mentions:

"Keeping an always consistent state in a large distributed [system] you are
fighting against the laws of physics."

Next up: master-master replication is important because at some point your
primary will go down (or at least the network to it will). I started an
interesting thought experiment that turned into a full-on open source project:
what if we were to build a database in the worst possible environment (the
browser, aka JavaScript and unreliability)? What algorithms would we need to
use to make such a system survive and work?

This is why I chose and worked on solutions involving Hybrid Logical Clocks
and CRDTs, which are at the core of my
[http://github.com/amark/gun](http://github.com/amark/gun) database. An AP
system with eventual consistency and no notion of "now", as every replica runs
in its own special-relativity "state machine" view of the world.

These are all interesting concepts, and the article was a good one. I
recommend it.
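To make the Hybrid Logical Clock idea concrete, here is a minimal Python
sketch of the standard HLC update rules (illustrative only, not gun's actual
implementation; the names are mine):

```python
import time

class HLC:
    """Hybrid Logical Clock: pairs physical time with a logical counter,
    so events stay causally ordered even when wall clocks drift or stall."""

    def __init__(self, now=time.time):
        self.now = now   # injectable physical clock, handy for testing
        self.l = 0       # highest physical time observed so far
        self.c = 0       # logical counter that breaks ties

    def send(self):
        """Timestamp a local or send event."""
        pt = self.now()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1
        return (self.l, self.c)

    def recv(self, l_m, c_m):
        """Merge a remote timestamp (l_m, c_m) on message receipt."""
        pt = self.now()
        l_new = max(self.l, l_m, pt)
        if l_new == self.l == l_m:
            self.c = max(self.c, c_m) + 1
        elif l_new == self.l:
            self.c += 1
        elif l_new == l_m:
            self.c = c_m + 1
        else:
            self.c = 0
        self.l = l_new
        return (self.l, self.c)
```

Timestamps compare as plain tuples, so causally later events always compare
greater, even when the physical clock is frozen or behind.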

~~~
vdm
Nice to see another example of applying CRDTs in the browser, I only knew
about swarm.js. [http://swarmjs.github.io/](http://swarmjs.github.io/) They
have a few good posts on why you would want to do this.

------
rdtsc
Keeping an always-consistent state in a large distributed system, you are
fighting against the laws of physics.

As mentioned, Google did it with their F1/Spanner SQL database. But that also
means GPS receivers with antennas on the roofs of data centers.

That is yet another thing that can fail, and when it does, either by itself or
in a cascade of other failures, it will lead to unspecified and possibly
undesirable behavior.

Recently I see a lot of advocates of dropping NoSQL databases and moving back
to Postgres or other SQL databases.

The problem is that SQL and schemas are not the only reason NoSQL databases
became popular; they also became popular because they started to have a
default, better-defined behavior with respect to replication and distribution.

Most solutions don't need that and sticking with a solid single database works
very well. But those that need distributed operation have a pretty hard task
ahead of them.

One heuristic you can look at is whether and how distribution is implemented.
If it is something bolted on top, like a proxy or add-on you download, or
added as an afterthought, be very careful. Those things should be baked into
the core. For example, does it support CRDTs? Does it have master-to-master
replication, and so on? If it claims consistency instead of availability,
figure out how that is implemented: Paxos, Raft, or something else.

So far I think Riak is probably the database that thought the hardest about
this and did the most reasonable job. The other, simpler database is CouchDB:
it has well-specified conflict resolution behavior and master-to-master
replication, but you usually have to build your own cluster topology. There
are probably others, but those are the two I know of first hand.
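To make "does it support CRDTs?" concrete, the classic toy example is a
grow-only counter: each replica increments only its own slot, and merging
takes per-replica maxima, so replicas converge without any coordination. A
minimal, illustrative Python sketch:

```python
class GCounter:
    """Grow-only counter CRDT: each replica increments only its own slot,
    and merge takes the per-replica maximum, so counts never get lost and
    all replicas converge to the same total."""

    def __init__(self, replica_id):
        self.id = replica_id
        self.counts = {replica_id: 0}

    def increment(self, n=1):
        self.counts[self.id] = self.counts.get(self.id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Commutative, associative, idempotent merge."""
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

# Two replicas count independently, then sync whenever they happen to meet.
nz, uk = GCounter("auckland"), GCounter("london")
nz.increment(3)   # 3 hits counted in New Zealand
uk.increment(5)   # 5 hits counted in the UK
nz.merge(uk)      # anti-entropy exchange, in either direction
uk.merge(nz)
# both replicas now agree: nz.value() == uk.value() == 8
```

Because merge is idempotent and order-independent, it doesn't matter how
often or in what order replicas exchange state.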

~~~
logicallee
I'm not an expert, and it sounds like you are, so I appreciate your feedback
here:

what do you even mean by a consistent state? Even in theory, a person
initiating a new record in Auckland, New Zealand at the same time somebody
initiates a change in Gibraltar or London (which are antipodal to the
former[1]), 66 milliseconds away, cannot have a confirmation in less than
about 133 milliseconds, right? So do you just wait for that before declaring
'consistency'? Do you literally add 133 milliseconds to each and every
request? (And this is assuming you have a damned good solution to the two
generals problem.) I mean, suppose the database tracks something as simple as
the number of web page hits. It's a counter. You now distribute it, and have a
stochastic process of counter hits between 0 and 5 per second in your largest
cities, distributed throughout the world. How can that database ever be
consistent?

If there are ten new records per second in New Zealand and ten records per
second in the UK, and they potentially depend on each other in some way, are
you going to just make everyone wait until everything has been committed and
confirmed to be consistent? Or is "a foolish consistency the hobgoblin of
little minds", and you really can accept out-of-date data and deal with
merging conflicts later?

I just don't understand why we would expect consistency to rank up there when
we deal with a worldwide real-time system, where the difference between
getting served by a local database in 40 milliseconds and one far away in 250
milliseconds is both staggering and incredibly noticeable. Why be consistent?
What is consistency?

[1] [http://www.findlatitudeandlongitude.com/antipode-map/](http://www.findlatitudeandlongitude.com/antipode-map/)

~~~
bdamm
There are many systems which can work just fine in an eventually-consistent
manner. A database of people (customers, users, etc) is a classic example of
such a thing.

In general I think consistency is overvalued. There are plenty of cases where
it is important, but lots of people are brainwashed in college to think that
all data must be consistent all the time, and that's just not necessary.

~~~
logicallee
>Lots of people are brainwashed in college to think that all data must be
consistent all the time, and that's just not necessary.

I knew it! So a foolish consistency _is_ the hobgoblin of little minds.

~~~
olau
While I agree strict consistency is probably overkill in many if not most
situations, the problem with not having consistency is that it potentially
makes the application logic much more complicated.

Take the database of customers: if you don't have consistency, what happens
when someone changes the company address and another person simultaneously
requests a delivery of something? Do you risk ending up with half of the old
address and half of the new one on the parcel?

Note you can certainly have this problem in a consistent system too, e.g. if
you make a UI without a save button where the address is changed one field at
a time.

Concurrency is just intrinsically hard.
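The half-and-half parcel is exactly what a naive per-field last-write-wins
merge can produce when clocks are skewed. A toy Python illustration (the
fields and timestamps are hypothetical):

```python
def merge_lww_fields(a, b):
    """Per-field last-write-wins; each field is a (value, timestamp) pair.
    The field with the higher timestamp wins, independently per field."""
    return {k: max(a[k], b[k], key=lambda vt: vt[1]) for k in a}

# Two concurrent whole-address writes, stamped by slightly skewed clocks.
write_a = {"street": ("10 High St", 105), "city": ("Aberdeen", 105)}
write_b = {"street": ("99 Low Rd", 104), "city": ("Zanzibar", 106)}

merged = merge_lww_fields(write_a, write_b)
# merged == {"street": ("10 High St", 105), "city": ("Zanzibar", 106)}:
# an address that nobody ever wrote. Treating the whole address as a
# single register (one timestamp covering all fields) avoids the mix,
# at the cost of losing one of the two writes entirely.
```

Either way something is lost; the application has to decide which failure
mode it can live with.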

~~~
nostrademons
Note that the real world operates like this too: before computers, if someone
changes their address and simultaneously sends a package, the package probably
will end up at the wrong address. We have a number of mechanisms in place to
mitigate this when it occurs (address forwarding, return-to-sender, customer
support, credit card chargebacks), but they _still_ don't always work, and
sometimes packages just get lost.

The real world solution to this is the acceptance that yes, sometimes bad
things happen for no reason at all. I suspect that the computer world will
eventually move to this as well, with consumers becoming more tolerant of
machines that simply give the wrong answer some of the time, as long as they
give the wrong answer less frequently than a human would.

------
jgrahamc
_Rear Admiral Grace Hopper (one of the most important pioneers in our field,
whose achievements include creating the first compiler) used to illustrate
this point by giving each of her students a piece of wire 11.8 inches long,
the maximum distance that electricity can travel in one nanosecond._

I made download and print versions of this:
[http://blog.jgc.org/2012/10/a-downloadable-nanosecond.html](http://blog.jgc.org/2012/10/a-downloadable-nanosecond.html)
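The wire length falls straight out of the arithmetic, taking the vacuum speed
of light as the upper bound:

```python
C = 299_792_458            # speed of light in vacuum, m/s
NANOSECOND = 1e-9          # seconds

wire_m = C * NANOSECOND    # metres light travels in one nanosecond
wire_inches = wire_m / 0.0254
# wire_inches ≈ 11.8, matching Hopper's piece of wire
```

(Signals in real copper or fiber travel at roughly two-thirds of this, so
11.8 inches really is the hard upper bound, not the typical distance.)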

------
jhayward
Thanks, this is a helpful introduction to the history and literature of
concepts related to time in distributed systems. Most people's concept of time
is quite simple, and they need to be shaken loose of some intuitively held but
unhelpful beliefs before they can really do engineering with respect to time.

Is there a paper somewhere that new folk should read first? One that includes:

- A tutorial describing all of the things we think of as 'time', e.g. order
of sequence, etc., and their dependence on each other.

- The idea that time as it occurs in the physical world is probabilistic,
requiring such descriptors as precision (what is the smallest difference we
can discern) and error bounds or probability distributions (how accurately
can we describe it).

- And for the concrete thinkers who 'get' that true simultaneity is
impossible, an easy-to-understand example of how we succeed in observing
logical coherence, from the scale of a single CPU chip (internally
non-coherent, externally consistent) to cross-continent compute clusters.

~~~
nuxi
There was a great presentation given by Tom Van Baak on measuring time,
precision, accuracy, etc. at this year's FOSDEM. Abstract and video are
available at
[https://fosdem.org/2015/schedule/event/precise_time/](https://fosdem.org/2015/schedule/event/precise_time/),
and he has lots of additional time-related information on his website,
[http://leapsecond.com/](http://leapsecond.com/).

------
nickbauman
This article underscores a point I always end up explaining to people who
look at Cloud Computing (implemented as "the lambda architecture" in
best-of-breed scenarios) as a golden hammer: it's a good technology, but only
for a certain class of problems. You can do things like monitor trends over
time, and even act on them with soft deadlines, using the Cloud. But you will
never have a cloud technology control the anti-lock braking system on your
car. It's basic understanding of the CAP theorem, really.

~~~
dragonwriter
Well, you could have locally accessed, dynamically provisioned computing
capacity (cloud technology) in your car doing that. Remote/shared hosting is a
different and older thing than cloud technology (though it's a popular use of
cloud technology).

~~~
pjc50
_locally accessed dynamically provisioned_

This is a mission critical safety system. There are some standards which are
unhappy with dynamically allocating _memory_ in that situation, let alone
dynamically allocating the entire compute resource.

~~~
dragonwriter
That's a good point, though it is a different consideration from the response
time issue imposed by the physical limits of communications round-trip time.

------
stcredzero
With the way the world looks now, and the way it's shaping up to be in the
future, we should be thinking about designing systems where there is no "now"
but rather a notion of "everything syncs soon." There can be a robust
consensus reality in such a system, but it has to exist a short interval of
time in the past.

I'm currently working on a multiplayer game design on these principles.

~~~
nosuchthing
Having played a multiplayer mobile game that uses an "everything syncs soon"
protocol, where the main mechanic is timing the moves of your characters
against your opponents', I can say it often makes the game highly
unpredictable and jittery, with "alternate timeline warps" when it resyncs
that nullify important moves.

~~~
stcredzero
_Having played a multiplayer mobile game that uses an "everything syncs soon"
protocol, where the main mechanic is timing the moves of your characters
against your opponents', I can say it often makes the game highly
unpredictable and jittery, with "alternate timeline warps" when it resyncs
that nullify important moves._

Nope. Most designers pick a particular mechanic and try to make that exact
mechanic work over the network. If you abandon the particular mechanic and aim
instead at meta-goals, one of which might be no jitter, no resyncs, and no
(visible) alternate timelines, then you can eliminate _everything_ you just
mentioned above.

~~~
nosuchthing
If you'd like to share I'd love to hear how you approach solving sync issues
with multiplayer time sensitive games.

Starcraft/Warcraft RTS games seem to force every client in the same "room" to
slow down if other clients report back bad/slow sync clocks; FPS games on the
CS/HL engine will penalize single clients if they lag; and the worst I've seen
was a mobile game by SNK which seems to rely on the client and give it too
much information, which has resulted in many users abusing the protocol.

~~~
stcredzero
I may do that in coming weeks if my current experiments work out.

------
yodsanklai
> One of the most important results in the theory of distributed systems is an
> impossibility result, showing one of the limits of the ability to build
> systems that work in a world where things can fail.

Layman question: I wonder how important this result really is. It is an
impossibility result in a certain model, where processes are deterministic.
It's certainly a nice theoretical result, but in practice there are
probabilistic algorithms that solve this problem.

I don't know the probabilistic bounds of probabilistic consensus algorithms,
but if the failure probability is arbitrarily low, the impossibility result
for deterministic processes is irrelevant, isn't it?

After all, if we can live with a super low probability of a meteorite
destroying the planet, we can live with a good probabilistic consensus
algorithm.

------
jamespitts
This is an excellent overview.

Simultaneity isn't just for machines -- it is necessary for people being
connected together online as well. Time sync is a huge part of creating a
shared experience, and this will become more widely appreciated as virtual
reality develops socially.

We solved this in a very limited way at rapt.fm for our timed rap battles. We
maintained a shared clock tick (with adjustment), allowing UI and in-game
events -- e.g. the beat kicking in -- to happen somewhat simultaneously across
browsers. This helped make up for the latency of video, and created a feeling
that people were together at the same "place".
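An adjustment like that can be as simple as a Cristian-style offset estimate,
assuming roughly symmetric one-way delays (an illustrative sketch, not
rapt.fm's actual code):

```python
def estimate_offset(t_send, t_server, t_recv):
    """Cristian-style clock sync: assume the server stamped its clock
    halfway through the round trip, and return the estimated offset
    (server clock minus local clock). All times in the same units."""
    rtt = t_recv - t_send
    return t_server + rtt / 2 - t_recv

# Simulated exchange: the server's clock runs 50 units ahead of ours,
# with a one-way delay of 10 units in each direction. We send at local
# time 0, the server stamps 60 (local 10 + offset 50), we receive at 20.
offset = estimate_offset(t_send=0, t_server=60, t_recv=20)
# local_time + offset now approximates server_time; here offset == 50.0
```

In practice you repeat the exchange, keep the samples with the smallest
round-trip time, and smooth the offset rather than jumping the clock.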

~~~
brianzelip
:) for the dude eating from a bowl of cereal at the end of the rapt.fm landing
page background vid

~~~
jamespitts
Yes, that is me! We had a lot of good times in downtown Det.

------
hyperion2010
I really enjoy the perspective that "now" is often a useful abstraction for
certain types of processes; it just turns out that "now" is one hellishly
leaky abstraction. My perspective, coming from biology, is that for many
systems the only meaningful type of "clock" is a logical clock. The important
thing is not "when": when "when" is used as a proxy for an assumed state of a
remote part of the system (even if remote is only 10cm away), logical clocks
are the only source that can guarantee that the state of the system is what
you expect it to be, so that it will perform as expected. Thanks to the many
hardware guys who have spent years working out the underlying logic for this,
we can mostly ignore it for things like processors. Now we just need to solve
it for arbitrarily large finite delays! This also reminds me of a very funny
(or depressing) read on systems engineering by James Mickens [1].

1. [http://research.microsoft.com/en-us/people/mickens/thenightwatch.pdf](http://research.microsoft.com/en-us/people/mickens/thenightwatch.pdf)

------
Terr_
Another fun exercise is dealing with players who say "OMG teh netcode sux" for
online first-person shooters, especially when they have patently unrealistic
expectations for how well the software should break the speed of light.

Sometimes the hardest part is getting them to understand exactly how much of
what they take for granted is an illusion... Often even before any packets
leave their machine.

~~~
Dylan16807
It's less the speed of light at fault than other factors, though. Bufferbloat
is terrible, and it's hard to make netcode that reacts well when a few percent
of packets are lost or slow.

~~~
Terr_
Sure, other factors predominate, but even in a significantly-more-ideal world,
a signal from LA to NY is still going to be a 40ms round-trip. (3940km one-way
along the surface of the earth, 200,000km/s signal speed in the glass.)

That's still more than enough time to require algorithmic trickery from games
in order to provide the illusion of "real-time" gaming over the internet.

~~~
Dylan16807
Think about how many console games run at 30fps. Then consider that a fully
framebuffered game running at 30fps, even under ideal conditions, has 70ms of
latency by the time a frame finishes displaying. 40ms isn't a big deal.

~~~
Terr_
Even assuming a perfect computer with zero input latency and zero display
latency the network part still matters when it comes to knitting an area like
"North America" into a game region. It feeds into longer causal chains.

For example, consider two players in LA both using an NY server over the
dedicated zero-overhead fiber-optic network mentioned above, with a relatively
modern game-networking stack.

(1) Player A raises an energy-shield t=0 ms. (Client-prediction.)

(2) The server agrees as of t=20.

(3) Player B shoots an instant-travel hitscan weapon at t=39, while the
victim is still exposed on his screen.

(4) The server gets the shot-message at t=59, and honors it (Latency
Compensation), sending a damage/death message out to Player A.

(5) Player A receives the news of his damage/death at t=79.

Even in that world of unattainably-good equipment, that's 80ms of "wait, that
doesn't look right".
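The causal chain above, as arithmetic:

```python
ONE_WAY_MS = 20  # idealized LA <-> NY one-way latency from above

shield_up_on_a = 0                                   # (1) client prediction
shield_up_on_server = shield_up_on_a + ONE_WAY_MS    # (2) server agrees, t=20
shot_fired_on_b = 39                                 # (3) victim still exposed on B's screen
shot_at_server = shot_fired_on_b + ONE_WAY_MS        # (4) t=59, honored via lag compensation
death_seen_by_a = shot_at_server + ONE_WAY_MS        # (5) t=79

# ~80 ms between raising the shield and learning it "didn't count",
# even on an unattainably good network.
```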

~~~
Dylan16807
Lag compensation has flaws, but you don't have to lag compensate. You can
delay player input until the server has processed it and responded. Even
better, if you're clever you can delay player input by _half_ the ping time.

Edit: Also don't double your ping by putting relay servers only on one edge of
the country.

------
jbergens
One piece of warning from the article regarding "last write wins":

>..., this is really a "many writes, chosen unpredictably, will be lost"
policy—but that wouldn't sell as many databases, would it?

------
tlarkworthy
Spanner is externally serializable, so you do get a 'now'. You just don't
know what the agreed now was until after the write.

The idea that there is no 'now' is, of course, preposterous: we have very
strict laws of physics supporting the concept of now (sans relativity), and
eventually our engineering will be able to track it very accurately. Spanner
is a step on that journey.

These kinds of 'impossible' articles will appear very dated in 10 years'
time, as they really exaggerate the rules of thumb of the previous 10 years.

~~~
dragonwriter
> we have very strict laws of physics supporting the concept of now (sans
> relativity)

Laws of physics "sans relativity" aren't the actual laws of physics in our
universe. There very much is no _now_ except the now that is also _here_. It's
quite accurate to say that simultaneity does not exist in distributed systems,
and simultaneity becomes less valid even as an approximation the more widely
distributed a system is.

~~~
nostrademons
To add some context & numbers to dragonwriter's point, the speed-of-light
delay from New York to San Francisco is about 21ms [1]. This is about 5 disk
seeks, 1200 random SSD reads, 200K main memory reads (without caching), or 10M
CPU cycles [2]. Speed of light delays absolutely matter in a distributed
system.

[1]
[http://chimera.labs.oreilly.com/books/1230000000545/ch01.html#PROPAGATION_LATENCY](http://chimera.labs.oreilly.com/books/1230000000545/ch01.html#PROPAGATION_LATENCY)

[2]
[http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html](http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html)

~~~
tlarkworthy
Those are transmission delays, not relativistic effects. The idea of Spanner
is that you have a timestamp of when it was decided to commit. The quorum
members know they can't contact each other quickly, but they trust each
other's timestamps and resolve conflicts based on the trusted commit times
(which are accurate thanks to hardware). Transmission delays don't undermine
the fact that there is a very real concept of ordered time in the physical
world, which is exploitable[1]. Spanner exploits it faster than transmission
delays would allow, but with a clock error of 10 ms or something. We have
better clocks than that, so it's probably going to improve...

[1] sans relativistic effects, which are TINY and not the limiting factor at
the moment.

------
keppy
> Another such area of work is logical time, manifest as vector clocks,
> version vectors, and other ways of abstracting over the ordering of events.
> This idea generally acknowledges the inability to assume synchronized clocks
> and builds notions of ordering for a world in which clocks are entirely
> unreliable.

Hardware is unreliable. Software is possibly less reliable. We have known that
for a long time. The author talks on a conceptual level about logical time,
but this concept isn't enough to understand the real challenges & possible
solutions of keeping interactions in your system logically ordered in the
dimension of time[0].

> You can think of coordination as providing a logical surrogate for "now."
> When used in that way, however, these protocols have a cost, resulting from
> something they all fundamentally have in common: constant communication. For
> example, if you coordinate an ordering for all of the things that happen in
> your distributed system, then at best you are able to provide a response
> latency no less than the round-trip time (two sequential message deliveries)
> inside that system.

Consensus protocols don't provide a logical surrogate for 'now'; a log does
that. The silver bullet for ensuring that your transactions are ordered
correctly is immutability[1]: "If two identical, deterministic processes begin
in the same state and get the same inputs in the same order, they will produce
the same output and end in the same state."[0] It's important, from the
perspective of the implementor, to understand that there are multiple pieces
to this puzzle, and that each protocol has very specific details that can make
or break the reliability and performance of a distributed system. This is
similar to how a small bug in your cryptography code can expose the entire
system to threats. Paxos itself can be implemented in a myriad of ways, and
each decision the implementor makes must be well researched.

[0]
[http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying](http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying)

[1]
[http://basho.com/clocks-are-bad-or-welcome-to-distributed-systems/](http://basho.com/clocks-are-bad-or-welcome-to-distributed-systems/)
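The quoted determinism property is easy to see in miniature: replaying the
same immutable log deterministically yields the same state on every replica.
A toy Python sketch (the ledger operations are hypothetical):

```python
def replay(log, apply, state):
    """Fold an ordered, immutable log of inputs into a state. Any replica
    that replays the same log with the same apply function reaches the
    same final state, with no coordination beyond agreeing on the log."""
    for entry in log:
        state = apply(state, entry)
    return state

def apply_op(balance, op):
    """Hypothetical ledger operation: ('credit' | 'debit', amount)."""
    kind, amount = op
    return balance + amount if kind == "credit" else balance - amount

log = [("credit", 100), ("debit", 30), ("credit", 5)]
replica1 = replay(log, apply_op, 0)
replica2 = replay(log, apply_op, 0)
# replica1 == replica2 == 75: same inputs, same order, same state
```

This is why consensus protocols are usually framed as agreeing on a log:
once the order of inputs is settled, deterministic replay does the rest.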

------
keppy
I suggest reading this article for some insight if you haven't built out a
distributed system. Good hands-on, practical information:
[http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying](http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying)

------
glittershark
Reminds me of this (absolutely spectacular) talk from 2009 by Rich Hickey:
[http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey](http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey)

------
aristus
Related:
[http://carlos.bueno.org/2010/04/dismal-guide-to-concurrency.html](http://carlos.bueno.org/2010/04/dismal-guide-to-concurrency.html)

------
ajarmst
Doesn't appear to mention Lamport timestamps
([https://en.wikipedia.org/wiki/Lamport_timestamps](https://en.wikipedia.org/wiki/Lamport_timestamps))
(yes, THAT Lamport), which are one of the most elegant mechanisms for dealing
with some of the problems discussed.

~~~
macintux
> Another such area of work is logical time, manifest as vector clocks,
> version vectors, and other ways of abstracting over the ordering of events.

Vector clocks and version vectors are variations on the Lamport clock concept
(and that quote is from a paragraph that mentions Paxos, another Lamport
invention, cited in the bibliography).
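For reference, the underlying Lamport clock fits in a few lines; vector
clocks generalize it by keeping one counter per process. An illustrative
Python sketch:

```python
class LamportClock:
    """Lamport logical clock: increment on every local event; on receive,
    jump past the sender's timestamp. This guarantees t(a) < t(b) whenever
    event a causally precedes event b (though not the converse)."""

    def __init__(self):
        self.t = 0

    def tick(self):
        """Local event, including sends; attach the returned t to messages."""
        self.t += 1
        return self.t

    def recv(self, t_msg):
        """Message receipt carrying the sender's timestamp t_msg."""
        self.t = max(self.t, t_msg) + 1
        return self.t

# a sends to b: b's receive is timestamped after a's send, even though
# b's own clock had never ticked before the message arrived.
a, b = LamportClock(), LamportClock()
t_send = a.tick()        # 1
t_recv = b.recv(t_send)  # 2
```

The converse doesn't hold (t(a) < t(b) doesn't imply causality), which is
exactly the gap vector clocks and version vectors close.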

