
Look Before You Leap – The Coming Leap Second and AWS - jeffbarr
https://aws.amazon.com/blogs/aws/look-before-you-leap-the-coming-leap-second-and-aws/
======
stygiansonic
The last time a leap second was added, it caused issues with a number of
servers running Linux due to a livelock. [0] This had an effect on EC2 [1] and
apparently other pieces of software [2] as well.

Nice to see Amazon being proactive about the forthcoming change. Time is just
a prime example of one of those things that seems superficially simple, but
turns out to be deviously complicated. (If you ever want to bore non-technical
types to death at a party, just start talking about the history of time. Bonus
points for mentioning the proleptic Gregorian calendar in context.)

0\. [http://serverfault.com/questions/403732/anyone-else-
experien...](http://serverfault.com/questions/403732/anyone-else-experiencing-
high-rates-of-linux-server-crashes-during-a-leap-second)

1\. [http://www.techspot.com/news/49229-leap-second-bug-amazon-
ec...](http://www.techspot.com/news/49229-leap-second-bug-amazon-ec2-outage-
brought-down-major-websites.html)

2\. [http://www.wired.com/2012/07/leap-second-bug-wreaks-havoc-
wi...](http://www.wired.com/2012/07/leap-second-bug-wreaks-havoc-with-java-
linux/)

~~~
cperciva
_This had an effect on EC2_

I don't think the leap second bug had any effect on EC2 itself; the only
reason EC2 was ever blamed is that many of the sites which had issues happened
to be hosted on EC2.

~~~
stygiansonic
Correct, my bad. Thanks for catching the mistake.

------
ggreer
I've always wondered why system clocks are set to UTC instead of TAI[1]. To
me, it makes more sense for OSes to ship UTC as a time zone. Leap seconds
would then be tzinfo updates, just like when countries change their daylight
saving time. System clocks still wouldn't be guaranteed to be monotonically
increasing, but at least there wouldn't be minutes with 61 seconds.

1\.
[http://en.wikipedia.org/wiki/International_Atomic_Time](http://en.wikipedia.org/wiki/International_Atomic_Time)

~~~
deathanatos
This is certainly an interesting idea, but isn't TAI/UTC somewhat orthogonal
to TZs? I'm in America/Los_Angeles, for example, and if I want my time in TAI,
my local time would be ~35 seconds different from my local time in UTC, would
it not?

~~~
meatmanek
Time zones are offsets from UTC, which change over time (DST, political
changes, etc).

UTC is an offset from TAI, which changes over time (leap seconds).

The time zone files already keep track of historical changes[1].

Conceptually, they're pretty similar; the only difference is that leap seconds
have a special clock value (23:59:60 instead of showing you 23:59:59 twice).

1\. https://en.wikipedia.org/wiki/Tz_database#Example_zone_and_rule_lines
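
To make the layering concrete, here is a minimal Python sketch that treats UTC as a table-driven offset from TAI, much like a tz rule. The table holds only the three most recent TAI-UTC steps, and `LEAP_TABLE`, `tai_minus_utc`, and `utc_to_tai` are names made up for this illustration, not part of any tz library.

```python
from datetime import datetime, timedelta

# (UTC instant the offset takes effect, TAI-UTC in seconds).
# Only the most recent steps are listed; a real table goes back to 1972.
LEAP_TABLE = [
    (datetime(2009, 1, 1), 34),
    (datetime(2012, 7, 1), 35),
    (datetime(2015, 7, 1), 36),
]

def tai_minus_utc(utc):
    """Look up TAI-UTC for a naive UTC datetime, like a tz rule lookup."""
    offset = None
    for effective, seconds in LEAP_TABLE:
        if utc >= effective:
            offset = seconds
    if offset is None:
        raise ValueError("instant is earlier than this sketch's table")
    return offset

def utc_to_tai(utc):
    """Shift a UTC datetime onto the TAI scale."""
    return utc + timedelta(seconds=tai_minus_utc(utc))

# Python's datetime cannot represent 23:59:60 itself, so sample either side.
print(utc_to_tai(datetime(2015, 6, 30, 23, 59, 59)))
print(utc_to_tai(datetime(2015, 7, 1, 0, 0, 0)))
```

The two sampled UTC instants are one labelled second apart, but their TAI equivalents are two seconds apart, which is the leap second showing up as an offset change.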

~~~
oakwhiz
I would imagine that some hardware RTCs would not support the :60 leap second
value. In fact I cannot recall a single RTC that I have dealt with, that
understands leap seconds.

------
jpatokal
Mildly disappointed that AWS isn't giving Google credit for coming up with the
idea of "time smearing", which they're using here:

[http://googleblog.blogspot.com.au/2011/09/time-technology-
an...](http://googleblog.blogspot.com.au/2011/09/time-technology-and-leaping-
seconds.html)

~~~
hillsarealiv3
The idea of spreading a leap second over a longer period is not unique to
Google; see e.g. www.quadibloc.com/science/cal06.htm. The idea itself is
actually rather obvious.

~~~
jpatokal
Most good ideas are obvious in hindsight. ;) But AFAIK, Google was the first
to do this live and in production by hacking NTP, which is (apparently?) the
same as what Amazon is doing.

~~~
chronial
SAP Systems have been doing this for many, many years live and in production
for DST adjustments. Not sure about leap seconds.

~~~
RedNifre
What? How does it make sense to spread DST adjustments over a long time?

~~~
Namrog84
Imagine a scheduled job that says "Pay namrog84 $100 at 1:30am" running across
a DST jump. Then 2am rolls around, it's 1am again, and now I get paid a second
time. Win! But if DST were smeared over a day, this would be easily prevented.

------
graiz
Interesting. I remember reading how Google was doing something similar in
their implementation of the leap second back in 2011:
[http://googleblog.blogspot.com/2011/09/time-technology-
and-l...](http://googleblog.blogspot.com/2011/09/time-technology-and-leaping-
seconds.html)

------
natch
I'm really doubting myself over this question, because who am I to think I
might have found a mistake in Amazon's chart, but aren't these two comments in
the table reversed?

1\. "Each second is 1/86400 longer and AWS clocks fall behind UTC. The gap
gradually increases to up to 1/2 second."

If the seconds are longer, wouldn't the AWS clocks be creeping ahead, not
falling behind? Bear in mind the leap second hasn't been added to UTC yet at
this point in the table.

2\. "AWS clocks gain 1/2 second ahead of UTC."

They do? But an entire leap second was just added to UTC. Aren't the AWS
clocks 1/2 second BEHIND at this point?

Note the time of the AWS smearing of this 1 second straddles the UTC injection
with 12 hours on each side.

And then:

"AWS clocks keep falling behind and the gap with UTC shrinks gradually."

Shouldn't it be "AWS clocks keep catching up and the gap with UTC shrinks
gradually"?

Not sure how I can be reading this so backwards, but if it's me who is wrong
here, I'd love to hear why. In any case it shows how time is easy to get wrong
(whether it's me who is wrong, or hah, doubtful, AWS).

What am I missing here?

Edit: duh yeah I get it now. Amazon is right, of course. Obviously I'm a n00b
when it comes to dealing with leap seconds. I was thinking the leap second
sets time ahead, but it doesn't; it effectively does the opposite. Leaving
this message in place as an example of how thinking about time and calendar-
related programming is easy to mess up.
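
For anyone else who trips over the sign, here is a rough Python model of the table as quoted above. It assumes a smear window with 12 hours on each side of the leap second and smeared seconds that are each 1/86400 longer than a real second. The function names and sample points are made up for illustration; this is not AWS's implementation.

```python
SMEARED_SECONDS = 86400               # seconds the smeared clock shows in the window
REAL_SECONDS = SMEARED_SECONDS + 1    # real seconds in the window (one leap second extra)
LEAP_AT = 43200                       # real seconds from window start to 23:59:60 UTC

def smeared_clock(t):
    """Reading of the smeared clock after t real seconds (each tick is longer)."""
    return t * SMEARED_SECONDS / REAL_SECONDS

def utc_clock(t):
    """Reading of a UTC clock: it holds for one second at the leap, then carries on."""
    return t if t < LEAP_AT else max(LEAP_AT, t - 1)

def gap(t):
    """Smeared clock minus UTC; negative means the smeared clock is behind."""
    return smeared_clock(t) - utc_clock(t)

for label, t in [("window start", 0),
                 ("just before leap", LEAP_AT),
                 ("just after leap", LEAP_AT + 1),
                 ("window end", REAL_SECONDS)]:
    print(f"{label:18s} gap = {gap(t):+.6f} s")
```

In this model the smeared clock drifts to about half a second behind, the leap second puts UTC back by one second so the smeared clock is suddenly about half a second ahead, and it then keeps running slow until the gap closes.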

~~~
lambda

_Leaving this message in place as an example of how thinking about time and
calendar-related programming is easy to mess up._

Indeed, time- and calendar-related programming is quite tricky; even
specifying what the behavior should be can be tricky and very non-intuitive.

~~~
drumdance
Dealing with time zone issues in code has driven me batty more than a few
times.

------
acqq
If I understood the text correctly, AWS adjustments are "linear" and not
curved like in the Google adjustment:

lie(t) = (1.0 - cos(pi * t / w)) / 2.0

described here:

[http://googleblog.blogspot.com.au/2011/09/time-technology-
an...](http://googleblog.blogspot.com.au/2011/09/time-technology-and-leaping-
seconds.html)

I think I'd prefer the curved one. I don't see the advantages of the linear
one, other than that it takes less effort to implement and test.
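
For comparison, a small Python sketch of the two shapes. `lie_cosine` is the formula quoted above; `lie_linear` and the 24-hour value for `w` are assumptions made for this example, not figures taken from either company's write-up.

```python
import math

def lie_cosine(t, w):
    """Quoted curve: fraction of the leap second absorbed t seconds into a w-second window."""
    return (1.0 - math.cos(math.pi * t / w)) / 2.0

def lie_linear(t, w):
    """Straight-line alternative: the same total, absorbed at a constant rate."""
    return t / w

w = 86400.0  # assumed window length in seconds
for frac in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    t = frac * w
    print(f"{frac:4.2f} of window  linear={lie_linear(t, w):.4f}  cosine={lie_cosine(t, w):.4f}")
```

Both start at 0 and end at 1; the cosine version eases in and out, so it lags the straight line in the first half of the window and leads it in the second.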

~~~
bjackman
What are the advantages of the curved one? I've never worked on an application
that would care about this kind of thing, so I'm struggling to imagine the
implications.

~~~
TheLoneWolfling
Suppose you have something with a watchdog that times something every time it
runs to check that nothing suddenly gets slower / faster. A backup or
something.

Let's suppose that the amount of time it takes to run is relatively small. Now
suppose between two runs the linear decrease starts. All of a sudden the
second run appears to be slower than it is by a constant factor.

With a curved change, this effect still happens, but it's less for smaller
time intervals.
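
A back-of-the-envelope version of that watchdog in Python, assuming a 24-hour smear window and a task that really takes 10 seconds. Both numbers, and all the names, are picked just for this sketch.

```python
import math

W = 86400.0   # assumed smear window, in seconds
TASK = 10.0   # assumed real duration of the timed task, in seconds

def linear(t):
    """Fraction of the leap second applied so far with a linear smear."""
    return min(max(t / W, 0.0), 1.0)

def cosine(t):
    """Fraction of the leap second applied so far with the cosine smear."""
    t = min(max(t, 0.0), W)
    return (1.0 - math.cos(math.pi * t / W)) / 2.0

def measurement_error(smear, start):
    """Seconds by which a smeared clock mis-measures a task that begins at `start`."""
    return smear(start + TASK) - smear(start)

for label, start in [("run before the window", -TASK),
                     ("first run in the window", 0.0),
                     ("run mid-window", W / 2)]:
    print(f"{label:24s} linear={measurement_error(linear, start):.2e}  "
          f"cosine={measurement_error(cosine, start):.2e}")
```

With the linear smear the error jumps from zero to its full value between two consecutive runs. With the cosine it grows gradually, at the cost of a somewhat larger error in the middle of the window.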

~~~
acqq
The shortest explanation would be that the curve can be introduced to lower
the average difference between the official time and the adjusted time.

bjackman "never worked on an application that would care about this kind of
thing," fine, but for those that do care, the curve is obviously better.

~~~
TheLoneWolfling
Another way to look at it: the curve keeps the second derivative of time
bounded. Which in turn keeps anything analogous to acceleration bounded.

~~~
acqq
Good point, thanks! Here's the example picture of the curve:

[http://i.imgur.com/oYboqSL.png](http://i.imgur.com/oYboqSL.png)

------
ComputerGuru
Wow, they've certainly put a lot of thought into that!

I wonder if they're actually "slowing down time" or just not implementing the
leap second and just injecting the 1/..... second into all user-visible or
exported fields?

~~~
javert
I haven't seen a patent application for a time machine, so we can rule out the
possibility that they're actually slowing down time.

However, I do wonder how they implement this. Do you change timekeeping in the
kernel so that a fraction of extra time must pass for each second to be
"counted"?

~~~
kccqzy
It's probably just a patched NTP server that lies about current time.

~~~
archimedespi
You are exactly correct. The aforementioned thread[0] on Serverfault mentioned
timesmearing with `ntpd -x`, as per [1]. It's a fairly well-known technique
for avoiding some weird kernel-related bugs on leap seconds.

[0]: [http://serverfault.com/questions/403732/anyone-else-
experien...](http://serverfault.com/questions/403732/anyone-else-experiencing-
high-rates-of-linux-server-crashes-during-a-leap-second)

[1]:
[https://web.archive.org/web/20140221084401/http://my.opera.c...](https://web.archive.org/web/20140221084401/http://my.opera.com/marcomarongiu/blog/2012/06/01/an-
humble-attempt-to-work-around-the-leap-second)

------
ColinDabritz
It's impressive the chaos that one second can cause. I think in terms of hard
problems, to naming and cache invalidation, I would add time or human notions
thereof. Or perhaps time is a case of the first two intersecting...

~~~
erichurkman
Dates, times, and time zones, +04:00 tough problems at the intersection of
computer science and humans.

~~~
WalterGR
Postal addresses, naming humans, encoding written language... Most problems at
the intersection of technology and humans are tough. :)

------
javert
What an elegant solution. This is really cool.

------
wslh
What a coincidence, today I was talking with the people at Vornex who are in
the niche area of simulated time for testing. It seems like a good opportunity
to test this "leap second".

[1] [http://www.vornexinc.com/](http://www.vornexinc.com/)

------
foton1981
What if I disable ntpd a minute before leap second is injected and restart it
a minute later?

~~~
cnvogel
My guess is, the leap second will be inserted, just as scheduled. And you'll
get the famous message in your kernel log (dmesg):

[http://lxr.free-
electrons.com/source/kernel/time/ntp.c?v=2.6...](http://lxr.free-
electrons.com/source/kernel/time/ntp.c?v=2.6.34#L199)

    
    
        Clock: inserting leap second 23:59:60 UTC
    

If ntpd gets notified of an impending leap second via its peers (or a
connected radio clock, GPS receiver, ...), it will set (struct timex*)->status
|= STA_INS via the adjtimex syscall (which it also uses to steer the clock,
speeding it up or slowing it down).

[http://man7.org/linux/man-
pages/man2/adjtimex.2.html](http://man7.org/linux/man-
pages/man2/adjtimex.2.html)

The stock Linux kernel works such that if ntpd has set the STA_INS flag via
adjtimex some time before, the kernel will do the leap-second insertion at
the end of the UTC day.

If you disable ntpd and it doesn't reset this flag (which I doubt it does, but
you'd have to check), the kernel will insert the leap second on its own, even
if ntpd is not running.

If you disable ntpd, and either ntpd on termination (which I doubt), or you
via the adjtimex syscall, clear the STA_INS flag, then the kernel will not
insert the leap second. After UTC midnight, the clock will then be one second
off, and a restart of ntpd will slowly steer the clock back to correct time.

For playing with all of this, there's an adjtimex tool which can display and
even change the timex values:

    
    
        ➜  sbin  ./adjtimex -pV | sed 's/^/    /'
                 mode: 0
               offset: 0
            frequency: -344608
             maxerror: 16000000
             esterror: 16000000
               status: 64
        time_constant: 2
            precision: 1
            tolerance: 32768000
                 tick: 10000
             raw time:  1432018768s 308111us = 1432018768.308111
         return value = 5
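
To read the `status` and `return value` fields in output like this, a small Python decoder can help. The bit values and state names below are the ones defined in Linux's `<sys/timex.h>`; the helper names `decode_status` and `decode_state` are made up for this sketch.

```python
# Status bits from <sys/timex.h>
STATUS_FLAGS = {
    0x0001: "STA_PLL",
    0x0002: "STA_PPSFREQ",
    0x0004: "STA_PPSTIME",
    0x0008: "STA_FLL",
    0x0010: "STA_INS",       # insert a leap second at the end of the UTC day
    0x0020: "STA_DEL",       # delete a leap second at the end of the UTC day
    0x0040: "STA_UNSYNC",    # clock is not synchronized
    0x0080: "STA_FREQHOLD",
    0x0100: "STA_PPSSIGNAL",
    0x0200: "STA_PPSJITTER",
    0x0400: "STA_PPSWANDER",
    0x0800: "STA_PPSERROR",
    0x1000: "STA_CLOCKERR",
    0x2000: "STA_NANO",
    0x4000: "STA_MODE",
    0x8000: "STA_CLK",
}

# Clock states returned by adjtimex(2)
CLOCK_STATES = {0: "TIME_OK", 1: "TIME_INS", 2: "TIME_DEL",
                3: "TIME_OOP", 4: "TIME_WAIT", 5: "TIME_ERROR"}

def decode_status(status):
    """List the names of the status bits that are set."""
    return [name for bit, name in sorted(STATUS_FLAGS.items()) if status & bit]

def decode_state(value):
    """Name the clock state returned by adjtimex."""
    return CLOCK_STATES.get(value, "unknown")

print(decode_status(64), decode_state(5))
```

For the values above, `status: 64` is just STA_UNSYNC and `return value = 5` is TIME_ERROR, so that machine's clock was not synchronized and no leap second was armed. An armed insertion would show the STA_INS bit (16) in `status`.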

~~~
foton1981
Thanks! That's good info. It helped me find a good article from someone who
devoted a couple of days to testing this method: [http://syslog.me/2012/06/01/an-
humble-attempt-to-work-around...](http://syslog.me/2012/06/01/an-humble-
attempt-to-work-around-the-leap-second/comment-page-1/)

------
pavel_lishin
I'm not sure if the addition of a leap second is what prompted qntm to write
about this, but his twitter feed is worth checking out:
[https://twitter.com/qntm](https://twitter.com/qntm)

~~~
wging
See also his website/blog, where he often writes about time-travel-related
ideas: [http://qntm.org/](http://qntm.org/)

------
acqq
Also interesting:

[http://www.datastax.com/dev/blog/preparing-for-the-leap-
seco...](http://www.datastax.com/dev/blog/preparing-for-the-leap-second)

------
maerF0x0
I'd like to hear from other HN readers whether their applications would have
major issues if they did not deal with leap seconds. I'm curious which domains
this matters most in.

~~~
onion2k
[https://github.com/search?q=86400&type=Code&utf8=✓](https://github.com/search?q=86400&type=Code&utf8=✓)
A quick search on GitHub finds more than two million examples of people
assuming 1 day is 86400 seconds. Every single one of those is a bug. Not a
critical one (in _most_ cases), but still an example of code that could fail
unexpectedly.

In the most basic case, code to fetch a date n days ahead that naively assumes
fixed-length days will return a date n-1 days ahead instead if it runs at
exactly midnight, e.g. `date("Y-m-d", mktime()+($days*86400))` will return a
date $days-1 in the future if it runs on leap second day. An edge case
certainly, but something you ought to consider if you're adding millions of
records a day.
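
One way to see the off-by-one is a toy Python model. It assumes a clock that counts every elapsed second, including the inserted one, and it is not how PHP or POSIX timestamps actually behave (POSIX time does not count leap seconds). The `LEAP_SECOND_DAYS` set and the helper names are invented for this sketch.

```python
from datetime import date, timedelta

SECONDS_PER_DAY = 86400
# Assumption for this sketch: the only leap second in range ends 2015-06-30.
LEAP_SECOND_DAYS = {date(2015, 6, 30)}

def day_length(day):
    """Elapsed seconds in a UTC day, counting an inserted leap second."""
    return SECONDS_PER_DAY + (1 if day in LEAP_SECOND_DAYS else 0)

def date_after(start, elapsed):
    """Date reached `elapsed` seconds after midnight at the start of `start`."""
    day = start
    while elapsed >= day_length(day):
        elapsed -= day_length(day)
        day += timedelta(days=1)
    return day

start = date(2015, 6, 30)
days = 3
naive = date_after(start, days * SECONDS_PER_DAY)   # assumes 86400-second days
intended = start + timedelta(days=days)             # calendar arithmetic
print("naive:", naive, "intended:", intended)
```

Starting at midnight on the leap second day, `days * 86400` elapsed seconds falls one second short of the intended midnight, so the naive result lands on the previous date.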

~~~
dllthomas
_" Every single one of those is a bug."_

The one place I know I've assumed a day is 86400 seconds I also assumed a
month was 30 days and a year 365, because it only needed to be an approximate
gauge of how much time had passed, didn't need any relation to actual calendar
time, and would be dropping far more precision elsewhere. I think it's
incorrect to call that a bug, even a non-critical one.

~~~
Twirrim
On a related note, you'll find lots of cases where seconds are assumed to be
00-59 instead of 00-60. An understandable, but flawed, assumption, and one the
leap second really messes up.
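
A quick Python illustration of how even one standard library disagrees with itself here: `time.strptime` accepts a seconds value of 60, while `datetime` rejects it. The timestamp is the 2015-06-30 leap second; the variable names are just for this example.

```python
import calendar
import time
from datetime import datetime

# time.strptime allows a seconds field of 60, so the leap second parses.
leap = time.strptime("2015-06-30 23:59:60", "%Y-%m-%d %H:%M:%S")
print("parsed seconds field:", leap.tm_sec)

# datetime only allows seconds in 0..59, so the same instant is rejected.
try:
    datetime(2015, 6, 30, 23, 59, 60)
    datetime_accepts_60 = True
except ValueError:
    datetime_accepts_60 = False
print("datetime accepts :60?", datetime_accepts_60)

# Converted to a Unix timestamp, the leap second lands on the same value
# as the following midnight, because Unix time has no slot for it.
leap_epoch = calendar.timegm(leap)
next_midnight = calendar.timegm((2015, 7, 1, 0, 0, 0))
print(leap_epoch == next_midnight)
```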

