
Time and time correction in Erlang - motiejus
http://www.erlang.org/documentation/doc-7.0-rc1/erts-7.0/doc/html/time_correction.html
======
jlouis
The ultra-short variant of what is going on:

Time in Erlang was historically used for many things: What time is it? How
much time elapsed between points A and B? Give me the time so I can use it as
a unique timestamp. And so on.

The old way Erlang handled time (with time correction enabled) was to speed up
or slow down the internal clock by about 1% whenever the system time changed.
This is fine for smaller fluctuations like leap seconds (and corresponds nicely
to what Google calls leap-second smearing). But for a large sudden "time
warp", the system will be wrong forever because it will never catch up. The 1%
frequency difference also means that, while a correction is in progress, a
nominal 1000ms is really closer to 990ms or 1010ms.

The new API maintains a precise and accurate monotonic clock which is used to
wait on timers, do latency calculations and such. And then a "system time" is
kept by noting how the system time is "offset" from the internal monotonic
clock. You can monitor time change events of this "system" time, which allows
your program to explicitly handle leap seconds, or the case where the system
suddenly gets full NTP connectivity and learns that 14 days have passed since
the last reboot.
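
A minimal sketch of that split, assuming OTP 20 or later (for the `millisecond` unit name). The module name `time_demo` and its function names are made up for illustration; the `erlang:` calls are the ones the new time API provides.

```erlang
-module(time_demo).
-export([elapsed_ms/1, system_time_from_offset/0]).

%% Measure how long Fun takes using the monotonic clock, which does not
%% jump when the OS wall clock is adjusted.
elapsed_ms(Fun) ->
    T0 = erlang:monotonic_time(),
    _ = Fun(),
    T1 = erlang:monotonic_time(),
    erlang:convert_time_unit(T1 - T0, native, millisecond).

%% Erlang system time is the monotonic time plus the current time offset
%% (both in native time units).
system_time_from_offset() ->
    erlang:monotonic_time() + erlang:time_offset().
```

Something like `time_demo:elapsed_ms(fun() -> timer:sleep(50) end)` then gives a latency figure that is unaffected by wall-clock changes, while `system_time_from_offset/0` should track `erlang:system_time/0` closely.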

Finally, a separate API can give unique integers. Earlier, erlang:now() was
guaranteed unique (even if called from several different processes at the same
time; it would simply "speed up" for a while if the calls were too close to
each other). The downside of this guarantee was a big fat lock around
erlang:now(), which has been dreadful to anyone who wanted good scalability in
their programs. Lock instrumentation could often show serious contention
around this lock.
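
As a rough illustration of the replacement, here is a sketch using erlang:unique_integer/1. The module name `unique_demo` and its functions are hypothetical; only the `erlang:unique_integer/1` call and its `positive` and `monotonic` modifiers come from the runtime.

```erlang
-module(unique_demo).
-export([ids/1, ordered_ids/1]).

%% N integers that are unique on this runtime instance.
%% No ordering is implied between them.
ids(N) ->
    [erlang:unique_integer([positive]) || _ <- lists:seq(1, N)].

%% With the monotonic modifier, each call returns a strictly larger
%% value than any earlier call on the same runtime instance.
ordered_ids(N) ->
    [erlang:unique_integer([monotonic]) || _ <- lists:seq(1, N)].
```

Code that only needs uniqueness can leave out `monotonic`, since the strict ordering is the part that requires coordination between schedulers.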

By querying the underlying OS for the best clock source and separating the
different time concepts, timer wheels can now be per-core, which results in a
massive scalability boost of the system as a whole.

All in all, it is a necessary and very cool change to Erlang systems. It
allows one to handle time with the attention to detail which is needed. While
the original leap-smearing tactic is fine (and worked well for a long time),
this finally allows people to program real solutions to leap seconds and
sudden NTP time warps.

(Edit: added a "never" which was needed for a sentence to parse correctly).

~~~
angersock
I'm curious if Erlang system here means just the nodes running on a single
machine, or the Erlang system meaning the system of nodes clustered across
multiple hosts.

Is time as transparent as the RPC model, is I guess what I'm asking.

~~~
jlouis
Time here is on a per-node basis. No effort is made to synchronize time over
multiple nodes. Partially because the problem is better solved with NTP,
partially because time synchronization is so application-specific you don't
want to solve it generically.

Some problems can get away with pretty lax sync, which in turn yields faster
execution due to more asynchronicity. Other problems require global time
keeping, or timestamps of the form {erlang:monotonic_time(), node()} where the
node-name is part of the timestamp (tuple ordering is lexicographic).
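
A small sketch of such a timestamp, with a hypothetical module name `stamp_demo`. It only shows the tuple ordering; whether the time components from different nodes are comparable is left to the application, as noted above.

```erlang
-module(stamp_demo).
-export([stamp/0, order/1]).

%% A timestamp that is tagged with the node that produced it.
stamp() ->
    {erlang:monotonic_time(), node()}.

%% Tuples of equal size compare element by element, so stamps sort by
%% time first and the node name only breaks ties.
order(Stamps) ->
    lists:sort(Stamps).
```

For example, `stamp_demo:order([{2, a}, {1, b}, {1, a}])` sorts the two stamps with time 1 by node name before the stamp with time 2.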

There are already several solutions to the global time problem. Twitter does
something in Finagle, and earlier it was Snowflake.

------
yellowapple
_The system clock on such a system will typically be way off when the system
boots. If the no time warp mode is used, and the Erlang runtime system is
started before the OS system time has been corrected, the Erlang system time
may be wrong for a very long time, even centuries or more._

There's something incredibly badass about Erlang/OTP's documentation
addressing the idea of an OTP app running for "centuries or more".

------
rdtsc
That looks like a well thought out and engineered approach to handling time.
Often it is left unspecified what happens when time warps. And there are
certainly a lot of interesting corner cases.

I've seen systems misbehave very badly during time warps. Even if NTP is
configured, sometimes it will snap the time instead of slewing it.

It is always fun when there is a time delta measurement on absolute time and
it moves backwards. Now dt is negative, with all kinds of wonderful and
exciting ramifications: "Oh so we are sleeping for 4294967294 seconds, great!"
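
For what it's worth, 4294967294 is 2^32 - 2, which is what a delta of -2 looks like when it is reinterpreted as an unsigned 32-bit value. A tiny sketch of that arithmetic, with hypothetical names (`wrap_demo`, `as_uint32/1`, `safe_delta/2`):

```erlang
-module(wrap_demo).
-export([as_uint32/1, safe_delta/2]).

%% Reinterpret an integer as an unsigned 32-bit value, the way a C
%% program storing dt in a uint32_t would.
as_uint32(N) ->
    N band 16#FFFFFFFF.

%% One defensive option: clamp a backwards-moving clock to a zero delta.
%% Measuring with a monotonic clock avoids the problem in the first place.
safe_delta(Start, End) ->
    max(0, End - Start).
```

Here `wrap_demo:as_uint32(-2)` gives the sleep duration quoted above, while `wrap_demo:safe_delta(10, 8)` gives 0.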

~~~
TheLoneWolfling
This is why most things along those lines (timers, etc) should be based on
number of ticks since a reference point. Or, alternatively, a signed value
(where there is a buffer zone near wrap that errors.)

Not that that doesn't have problems of its own.

~~~
jlouis
This is essentially what will be possible now, with a 'CHANGE' event if the
underlying clock suddenly warps/jumps to a new point in time. Your code has to
be "time warp safe" in that it needs to handle this.

Interestingly, most Erlang code will be, because most systems will just call
erlang:monotonic_time(), leaving the other calls for time-affected
applications, which are rarely that many.
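
A sketch of listening for that 'CHANGE' event. The module name `offset_watch` is hypothetical; the monitor call and message shape are the ones documented for erlang:monitor(time_offset, clock_service). Whether any message ever arrives depends on the time warp mode: in the default no time warp mode the offset is fixed at startup, so this is mostly relevant when the node runs with something like `+C multi_time_warp`.

```erlang
-module(offset_watch).
-export([start/0]).

%% Spawn a process that reports every change of the time offset.
start() ->
    spawn(fun() ->
        Ref = erlang:monitor(time_offset, clock_service),
        loop(Ref)
    end).

loop(Ref) ->
    receive
        %% Sent by the runtime when the offset between Erlang monotonic
        %% time and Erlang system time changes.
        {'CHANGE', Ref, time_offset, clock_service, NewOffset} ->
            io:format("time offset is now ~p (native units)~n", [NewOffset]),
            loop(Ref);
        stop ->
            erlang:demonitor(Ref, [flush]),
            ok
    end.
```

A time-affected application would replace the io:format call with whatever it needs to recompute, for example rescheduling wall-clock deadlines.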

------
skrebbel
Cool:

    erlang:unique_integer([monotonic])

I wish this was a more common API function. Being able to do ordering without
having to think about time and concurrency is _really_ nice (I know x86 has an
interlocked increment that you could just use on an arbitrary global variable,
but that means that you're not safe across crashes).

~~~
MCRed
This is Elixir, but it can be called from Erlang:
[https://github.com/nirvana/flaky/](https://github.com/nirvana/flaky/)

It gives a unique value from any node in a cluster of nodes running this,
without any coordination, such that you can guarantee they are sortable in
order of creation (to within a small margin: two flakes created on different
nodes in the same millisecond might be out of order, but they will be unique
and in order with any flakes created in other milliseconds).

