
Time is an illusion – challenges in distributed computing synchronization - alanfranzoni
http://queue.acm.org/detail.cfm?ref=rss&id=2878574
======
rogeryu
_Few programmers have read the most important paper in this area, Leslie
Lamport 's "Time, Clocks, and the Ordering of Events in a Distributed System"
(1978)_

If you want to be one of the few:

[http://research.microsoft.com/en-us/um/people/lamport/pubs/time-clocks.pdf](http://research.microsoft.com/en-us/um/people/lamport/pubs/time-clocks.pdf)
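
The core mechanism fits in a few lines. A toy sketch in C (the names are mine, not the paper's pseudocode): each process ticks a counter on every local event, stamps outgoing messages, and fast-forwards past any larger timestamp it receives.

    #include <stdint.h>

    /* One Lamport logical clock per process. */
    typedef struct { uint64_t counter; } lamport_clock;

    /* Tick before every local event or message send; the result
       is the event's timestamp. */
    static uint64_t lamport_tick(lamport_clock *c) {
        return ++c->counter;
    }

    /* On receipt, jump past the sender's timestamp, then tick.
       This guarantees send(m) is ordered before receive(m). */
    static uint64_t lamport_receive(lamport_clock *c, uint64_t msg_ts) {
        if (msg_ts > c->counter)
            c->counter = msg_ts;
        return ++c->counter;
    }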

~~~
jacquesm
Don't stop there:

[http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html](http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html)

~~~
Swizec
And if you want an easier job of it, here's my summary:
[http://swizec.com/blog/week-7-time-clocks-and-ordering-of-events-in-a-distributed-system/swizec/6444](http://swizec.com/blog/week-7-time-clocks-and-ordering-of-events-in-a-distributed-system/swizec/6444)

~~~
jacquesm
Oh cool, that's worth posting in its own right.

------
Inthenameofmine
You don't actually need absolute time for distributed systems to work, only
independently verifiable order. Independent verifiability can be achieved
through Merkle structures and Merkle proofs. In a way, through the proofs you
can communicate to anybody your "perspective" on the order of events. If you
get several "perspectives" you can therefore independently infer the absolute
order of events.

You would still be left with race conditions between the communicating nodes,
but that is something you can't get around anyway.
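
For illustration, the simplest such structure is a hash chain where each event commits to its predecessor. A toy sketch in C (FNV-1a standing in for a real cryptographic hash like SHA-256; a real system would use proper Merkle trees):

    #include <stdint.h>
    #include <string.h>

    /* Toy stand-in for a cryptographic hash (FNV-1a). A real
       system would use SHA-256 or similar. */
    static uint64_t toy_hash(const void *data, size_t len) {
        const uint8_t *p = data;
        uint64_t h = 14695981039346656037ULL;
        for (size_t i = 0; i < len; i++) {
            h ^= p[i];
            h *= 1099511628211ULL;
        }
        return h;
    }

    /* Each event commits to the hash of its predecessor, so the
       chain head is a compact, verifiable claim about the whole
       order of events. */
    typedef struct {
        uint64_t prev_hash;   /* hash of the previous event record */
        char     payload[32]; /* the event itself */
    } event_record;

    static uint64_t append_event(uint64_t head, event_record *e,
                                 const char *data) {
        e->prev_hash = head;
        strncpy(e->payload, data, sizeof e->payload - 1);
        e->payload[sizeof e->payload - 1] = '\0';
        return toy_hash(e, sizeof *e);  /* new chain head */
    }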

~~~
eternalban
The deductive approach to system ordering (e.g. via computing hashes) sets a
fairly high floor on latency. If you have deep pockets like Google, you will
opt for the expensive GPS/Atomic Clock and work within the epsilon bounds.

~~~
justicezyx
The atomic clock is intended to bound timing skew, not latency per se.

As long as clocks at different places are synced with high precision, you
could still sync operations within the epsilon bounds, despite a huge delay.
Of course, that requires the system to be designed to accommodate the huge
latency.

~~~
eternalban
Late reply but noted this now.

We disagree. A perfect clock obviates the need for a _protocol_ -based
consensus mechanism. Paxos & friends have substantial latency costs. Secondary
effects of the protocol-based approach include NAK storms and reduction of
available bandwidth (consumed by the chatty consensus-protocol packets).
Tertiary effects include triggering of congestion-control mechanisms.

There is zero question that a distributed system built on high fidelity clocks
will mop the proverbial floor in terms of performance.

~~~
jasonwatkinspdx
A couple responses:

Even with the hardware, Google can't shrink the uncertainty window past 7 ms
or so, based on published reports.

To preserve consistency there are situations where you need to wait out the
clock uncertainty.

Spanner still uses Paxos for replication, because ordering is only part of the
problem consensus solves.
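
Roughly the Spanner paper's commit-wait, as a sketch (tt_now() here is a hypothetical stand-in for TrueTime's interval API, with a fixed epsilon for illustration):

    #include <time.h>

    /* Hypothetical clock-with-error-bound interface, after Spanner's
       TrueTime: true time is guaranteed to lie in [earliest, latest]. */
    typedef struct { double earliest, latest; } tt_interval;

    #define TT_EPSILON 0.007  /* ~7 ms uncertainty, per published reports */

    static double wall_seconds(void) {
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    static tt_interval tt_now(void) {
        double t = wall_seconds();
        return (tt_interval){ t - TT_EPSILON, t + TT_EPSILON };
    }

    /* Commit-wait: pick a timestamp at the top of the uncertainty
       window, then wait until the window has moved past it, so no
       node can later be assigned an earlier timestamp. */
    static double commit_wait(void) {
        double s = tt_now().latest;
        while (tt_now().earliest <= s) {
            struct timespec pause = { 0, 1000000 };  /* 1 ms */
            nanosleep(&pause, NULL);
        }
        return s;
    }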

------
kcorbitt
"A typical laptop or server, left without any type of external time
conditioning, will drift out of sync within minutes and after a few hours may
already be several minutes away from good synchronization with other systems."

This doesn't match my experience at all. I've had smartphones disconnected
from the network for weeks at a time without drifting "several minutes away"
from the consensus time. Drift is a thing, but it seems like that estimate is
several orders of magnitude larger than anything I've seen in practice.

~~~
lostcolony
Not to mention what it'd mean about watches (the dumb kind). Digital watches
as a whole, even the cheap ones I'd own as a kid, never drifted that far that
fast. I can't imagine that the cheap ones had better crystals in them than a
laptop.

I think it may just be badly phrased. "Will drift out of sync within minutes"
was probably meant as: within minutes it will almost assuredly have
gained/lost a few microseconds from what it should be, given the progression
of 'real' time in its frame of reference, and it -may- be as bad as the
article says. Which, yeah, sure, it -may-. Not bloody likely, but maybe.

~~~
YZF
I'm not quite sure what's in my laptop right now, but it used to be that there
was a separate crystal for timekeeping, typically 32.768 kHz. I believe
digital watches used similar crystals. Those drift a lot less than the
high-frequency crystals used to generate the CPU clock; their accuracy is on
the order of 10 ppm. The other thing that matters is temperature variation,
since the frequency changes with temperature. A wrist watch probably sees more
uniform temperatures (especially if it's worn all the time and e.g. has a
large metallic backing). Laptops can get quite hot and quite cold.

You can get temperature-compensated oscillators down to about 1 ppm accuracy
over a larger temperature range. 1 ppm is about 0.6 seconds/week.

Back in the day I wrote a little utility where, every time you adjusted your
computer clock, it took note of the drift and fed that back into the time
calculation. I.e., it continuously adjusted the clock based on the corrections
given to it.
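
A reconstruction of the idea (not the original code): estimate a drift rate from each manual correction and apply it continuously afterwards.

    /* Learn the clock's drift rate from manual corrections and
       apply it continuously between adjustments. */
    typedef struct {
        double rate;      /* estimated drift, seconds per second */
        double last_set;  /* raw clock reading at the last correction */
    } drift_model;

    /* The user corrected the clock by `correction` seconds after
       `elapsed` raw seconds: fold that into the rate estimate. */
    static void note_correction(drift_model *m, double elapsed,
                                double correction) {
        if (elapsed > 0)
            m->rate = correction / elapsed;  /* could smooth over history */
        m->last_set += elapsed;
    }

    /* Corrected time = raw reading plus drift accrued since the
       last correction. */
    static double corrected_time(const drift_model *m, double raw_now) {
        return raw_now + m->rate * (raw_now - m->last_set);
    }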

------
peter303
Kind of what got Einstein started on special relativity: is it possible to
exactly synchronize railroad station clocks? And the answer is no. There will
always be a causality uncertainty due to the finite speed of signals.
Sequencing could depend on the location and velocity of the observer. On Earth
the planet-wide uncertainty would be a fraction of a second, but across
planets you'd have minutes, and so on.

~~~
bcook
Why is this downvoted? To a layman (me), this seems like an interesting post.

~~~
danbruc
I guess because it is more or less wrong. Relativity establishes that time is
observer dependent, i.e. observers moving relative to each other will in
general not agree on which events occurred at the same time. But there is no
uncertainty because of the finite speed of light and nothing that prevents
synchronizing clocks although it is admittedly not easy because relative
motions and gravitational fields influence the speed at which clocks tick.
There is also still a causal structure in spacetime defined by the light cones
of events so that different observers agree on the ordering of events that may
have a causal relationship. It's all pretty complicated and way more nuanced
than popular science usually presents it.

------
ChuckMcM
I was at one of the open tech talks at Xerox PARC that Leslie gave where the
discussion of time synchronization came up. Xerox had a naming system called
Grapevine[1] and it used timestamps in a number of places. I was working at
Sun and dealing with time issues in RPC and came away from the tech talk
understanding that "perfect time keeping" was like "perfect security", if you
could assume you had it a whole host of problems became much easier to solve.

The point the author makes about needing higher and higher precision got me
thinking about ways one might achieve that. I'm wondering if you could
actually provide a master clock, a 1 GHz carrier, over network cables
originating from the master clock. If the master clock is synchronized with
the bit stream, and you're seeing the bit stream locally, you first calibrate
your clock against the master and then drive it from the bit stream, and you
should be in sync up to cable and time-of-flight delays.

[1]
[http://web.cs.wpi.edu/~cs4513/d07/Papers/Birrell,%20Levin,%2...](http://web.cs.wpi.edu/~cs4513/d07/Papers/Birrell,%20Levin,%20et.%20al.,%20Grapevine.pdf)

~~~
bitL
As a former sunnie that worked on distributed systems, how would you cope with
packet loss/out of order packets/weird socket states especially on Solaris?
Would you do clock-per-socket and then merge them via internal buffers? How
about when one/many of your nodes go down or you end up with a split brain? Or
you'd just use 0MQ or similar super fast protocol, do some simple checks and
if they fail, you'd just resend everything?

~~~
ChuckMcM
NIS+ used "relative" timestamps, which were basically the delta between the
server and the originator. So the originator would say "here is a packet and I
think it is time 1415151515.5" and the server would get that packet and say
Ok, since I think it is 1415151520 your delta is -5 and I'll subtract 5 from
all of your times to put them into my time context. Then when you compared
time stamps you did so in the servers frame of reference. But as the article
points out it still suffered from jitter given the various subsystems between
the packet and the service endpoint.
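
In code the idea is just an additive offset per client (a sketch, not NIS+ source):

    /* Server-side view of one client's clock, NIS+-style. */
    typedef struct {
        double delta;  /* server_time - client_time, from first contact */
    } clock_map;

    /* On the first packet: client claims client_ts; server's own
       clock reads server_ts. */
    static void calibrate(clock_map *m, double client_ts, double server_ts) {
        m->delta = server_ts - client_ts;
    }

    /* Every later client timestamp is shifted into the server's
       frame, so comparisons between clients all happen in one
       frame of reference. */
    static double to_server_frame(const clock_map *m, double client_ts) {
        return client_ts + m->delta;
    }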

As for out-of-order and lost packets: the TCP layer prevents out-of-order
delivery, but retransmissions of lost packets resulted in big jitter spikes.
Those were rare enough to pull out as a special case. And layered on top there
was an optimistic transaction protocol where you could ask for the current
transaction id, increment it by one, and send your transaction with the
assumption that if someone landed before you it would fail and you would have
to restart. That worked well for read-mostly applications (like a name
service).
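
The optimistic part is essentially compare-and-swap on the transaction id. A sketch (shown here as an in-process atomic; in the real protocol the check happened at the service):

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Optimistic transactions: read the current id, submit at id+1,
       and retry if someone else committed first. Works well for
       read-mostly workloads. */
    static _Atomic unsigned long current_txn_id;

    static bool try_commit(unsigned long observed_id /*, payload... */) {
        unsigned long expected = observed_id;
        /* Succeeds only if nobody committed since we read observed_id. */
        return atomic_compare_exchange_strong(&current_txn_id,
                                              &expected, observed_id + 1);
    }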

The NoSQL database that Blekko designed uses a more complex promise system to
preserve transaction ordering, and it uses idempotent combinators, which help
manage time-synchronization issues. But again, if we could wave a magic wand
and get perfect synchronization, it would be pretty interesting.

------
aristus
Most of what we do in distributed systems is provide the _illusion_ of a
consistent, shared memory space. We literally pretend that we can violate the
laws of physics, to make the programming for everyone else simpler.

~~~
kasey_junk
This is true all the way down to the chip level as well.

------
zimbatm
> If an interval time needs to be measured, then rdtsc, or a library wrapped
> around it, is the best solution, whereas getting the system time for use in
> log files probably ought to be carried out using clock_gettime() with a FAST
> option; or, if clock_gettime() is not available, then gettimeofday().

If I remember correctly, RDTSC suffers from other issues, like being affected
by CPU throttling; its value might also differ if your process is rescheduled
onto another core.
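
Which is one reason CLOCK_MONOTONIC is the safer default for intervals. A sketch:

    #include <time.h>

    /* Measure an interval with CLOCK_MONOTONIC: unaffected by
       wall-clock adjustments, and (unlike raw rdtsc on older CPUs)
       consistent across cores and frequency scaling. */
    static double elapsed_seconds(void) {
        struct timespec start, end;
        clock_gettime(CLOCK_MONOTONIC, &start);
        /* ... work being timed ... */
        clock_gettime(CLOCK_MONOTONIC, &end);
        return (end.tv_sec - start.tv_sec)
             + (end.tv_nsec - start.tv_nsec) / 1e9;
    }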

------
explorer666
Time is an emergent property. I.e., fundamental particles don't experience
time, but things made of them do.

~~~
imaginenore
Everything that has mass must experience time. That includes electrons, which
are fundamental, and protons and neutrons, which are composite.

~~~
riskable
To put it another way: Time cannot exist without mass. Time only exists
because everything is always in motion (at least a little bit) and--in our
perception--we have a memory/frame of reference.

From the perspective of a photon it lives and dies in an instant. Even if it
crosses the entire universe!

------
amelius
Related:
[https://www.youtube.com/watch?v=wteiuxyqtoM](https://www.youtube.com/watch?v=wteiuxyqtoM)

(Showing clearly why simultaneity does not exist in an absolute sense.)

------
toolslive
It's even worse. Not only is time an illusion, but your observation is being
sabotaged by sysadmins who make one of the nodes of your distributed system
jump back in time several hours (by changing the time zone info), among other
naughty deeds.

------
pcmaffey
It'll be fascinating to watch how time unfolds on the blockchain.

Some will strive for absolute standards, while others maximize the net
benefits of relativity.

------
SFjulie1
I got fired from my last job for saying this. Such a shame I don't have a PhD
like their CSS coder.

My take on the root cause of the time problem: poor education.

A basic definition of time: time is the accident of the accident; the same
causes give the same effects, and since some of those effects are
irreversible, they define an ordered direction for events. Time is like
temperature: it is measured relative to the oscillation of a harmonic
oscillator. An absolute, closed-system time does not exist. Since Einstein we
also have to decouple physical speed from apparent speed due to geometry (the
Cherenkov effect: yes, you can go faster than light in a medium by playing on
this). Since quantum mechanics we know time is quantized and its uncertainty
is bounded below: hbar/2 <= dE * dt.

Hence a lot of problems: with poor rigor and understanding (which amplify the
effects above), time becomes a nightmarish physical beast. And you are stuck
with idiots who even think the colour of your skin influences your quality as
a coder.

So here is my understanding of coders' problem with time. The mindset of the
coders and bosses I have met is stuck in the 1800s, where statistical physics,
the dual nature of light, quantum mechanics, and Einstein's relativity are
known as boring Trivial Pursuit questions, but no one cares about the
implications.

They are bad at separating geometry from physics, but most of all they are
stuck in the wrong physical world.

They live in a world of determinism where they would rather compute the
position and speed of every molecule in a gas than use the "impure"
statistical law.

For time they are puzzled:

\- time is the length of a vector (how much time since X);

\- time is a point, deduced from an implicit zero origin when taking a length;

\- time is a 1D vector, so it behaves like a scalar, so it must be a scalar
(computing results by adding/subtracting as with lengths/vectors);

\- there is a lot of politics involved in time measurement (GMT, time zones,
calendars, leap seconds), and politics is buggy, so it produces bugs;

\- Heisenberg DOES exist; they never care to measure the error and think doing
so is wasted time;

\- my time as a coder is always free;

\- time cannot be uncertain, since we have these high-resolution clocks (the
exactness of time is such that we never encounter uncertainty, and our code
executes in zero seconds);

\- GPS is a measuring instrument that magically corrects all this, because it
is perfect and has no errors, being godly, Star Trek-y US space technology in
the sky;

\- acausality cannot happen locally, despite asymmetries in topology (slow vs.
fast routers, short vs. long paths).

In short, most coders are insanely crippled by their own culture of ignorance
and their self-importance.

Common scientific knowledge better understood by McDonald's employees has
still not reached the brains of our elite architects/coders. And time
(frequency) is one of the most important dimensions of any application.

The question I wonder about is "how?". How is it even possible for the mass
recruitment of coders to be so biased that it selects overconfident thinkers
so lacking in curiosity that they can comfortably blindfold themselves?

If the gap in the scientific domain is that great, and reflects equally
arrogant gaps in other domains ... then I think of a creeping lack of culture
in "business", "ethics", "legal", "cryptography", "probability", "algebra" ...

I have provoked enough computer pros, and kept enough stats, to know that
their level of confidence is dangerously inversely correlated with their level
of actual knowledge.

I am very confident that IT has a corporate-culture bias of valuing arrogant
ignoramuses who "can do it" over careful thinkers who may say "it will never
be doable".*

* Yes, the Cretan paradox revisited.

~~~
Bjartr
I tried, I really did, to understand what you are saying here and to give you
the benefit of the doubt.

Unfortunately, if your intent was to communicate some idea to people who read
your comment, then it didn't work because I honestly can't say what that idea
might be.

I could make some witty remarks on certain things you've said or how you've
said them, but that wouldn't be useful for either of us. So I won't.

~~~
rayuela
I agree, that made absolutely no fucking sense. That was some
YouTube-comment-level stuff right there.

~~~
pinkrooftop
From the comment history it looks like the OP is an extraordinarily bitter,
non-native-English-speaking physics major who had a falling out with a
coworker during a programming job.

