Look Before You Leap – The Coming Leap Second and AWS (amazon.com)
170 points by jeffbarr on May 19, 2015 | 65 comments



The last time a leap second was added, it caused issues with a number of servers running Linux due to a livelock. [0] This had an effect on EC2 [1] and apparently other pieces of software [2] as well.

Nice to see Amazon being proactive about the forthcoming change. Time is just a prime example of one of those things that seems superficially simple, but turns out to be deviously complicated. (If you ever want to bore non-technical types to death at a party, just start talking about the history of time. Bonus points for mentioning the proleptic Gregorian calendar in context.)

0. http://serverfault.com/questions/403732/anyone-else-experien...

1. http://www.techspot.com/news/49229-leap-second-bug-amazon-ec...

2. http://www.wired.com/2012/07/leap-second-bug-wreaks-havoc-wi...


> This had an effect on EC2

I don't think the leap second bug had any effect on EC2 itself; the only reason EC2 was ever blamed is that many of the sites which had issues happened to be hosted on EC2.


Correct, my bad. Thanks for catching the mistake.


0. That was an exciting day! I still remember rebooting servers on my phone while my kids tried on new bathers.

http://blog.fastmail.com/2012/07/03/a-story-of-leaping-secon...


> just start talking about the history of time

And let's not forget:

3. http://www.quirksmode.org/blog/archives/2009/04/making_time_...


I've always wondered why system clocks are set to UTC instead of TAI[1]. To me, it makes more sense for OSes to ship UTC as a time zone. Leap seconds would then be tzinfo updates, just like when countries change their daylight saving time. System clocks still wouldn't be guaranteed to be monotonically increasing, but at least there wouldn't be minutes with 61 seconds.

1. http://en.wikipedia.org/wiki/International_Atomic_Time


The system clocks at the hardware level are extremely inaccurate, so it genuinely doesn't matter. All the problems that happen around the leap second are due to poor "management" of it, not because it's actually a big absolute difference from the reference atomic clocks.

Case in point: the Linux leap second kernel bug that caused problems in 2012. Read the commit of the fix:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux....

"This patch tries to avoid the problem by reverting back to not using an hrtimer to inject leapseconds, and instead we handle the leapsecond processing in the second_overflow() function. The downside to this change is that on systems that support highres timers, the leap second processing will occur on a HZ tick boundary, (ie: ~1-10ms, depending on HZ) after the leap second instead of possibly sooner (~34us in my tests w/ x86_64 lapic)."

So the bug was the result of the programmers worrying more about making the leap second adjustment happen as fast as possible, within 34 microseconds, and overlooking that the call from that particular point in the code caused the livelock.

And instead of pushing the change "as fast as possible", we see that both Google and AWS solve the problem by spreading the change over a long period of time. That is generally the right approach for all automatic adjustments to the system clock: avoid discontinuities.


I agree. TAI is simpler and seems better suited to be the fundamental time counting mechanism at the OS level.

There was an interesting HN thread along these lines back when the current impending leap second was announced: https://news.ycombinator.com/item?id=8840440


DJB to the rescue, once again. http://cr.yp.to/libtai.html


This is certainly an interesting idea, but isn't TAI/UTC somewhat orthogonal to TZs? I'm in America/Los_Angeles, for example, and if I want my time in TAI, my local time would be ~35 seconds different from my local time in UTC, would it not?


Time zones are offsets from UTC, which change over time (DST, political changes, etc).

UTC is an offset from TAI, which changes over time (leap seconds).

The time zone files already keep track of historical changes[1].

Conceptually, they're pretty similar; the only difference is that leap seconds have a special clock value (23:59:60 instead of showing you 23:59:59 twice).

  1. https://en.wikipedia.org/wiki/Tz_database#Example_zone_and_rule_lines
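
A minimal Python sketch of that idea, treating UTC as a table-driven offset from TAI in the same way a zone file drives local time. The table is a hand-copied excerpt of announced TAI-UTC offsets and is deliberately incomplete; `tai_minus_utc` and `utc_to_tai` are hypothetical helper names, not an existing API.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Excerpt of announced TAI-UTC offsets: (UTC instant the offset takes effect, seconds).
# Illustrative only; earlier entries are omitted and a real system would load
# the full list from a maintained source.
LEAP_TABLE = [
    (datetime(2006, 1, 1), 33),
    (datetime(2009, 1, 1), 34),
    (datetime(2012, 7, 1), 35),
    (datetime(2015, 7, 1), 36),
]

def tai_minus_utc(utc: datetime) -> int:
    """Return TAI-UTC in seconds for a naive UTC datetime covered by the table."""
    starts = [start for start, _ in LEAP_TABLE]
    idx = bisect_right(starts, utc) - 1
    if idx < 0:
        raise ValueError("instant predates the excerpted table")
    return LEAP_TABLE[idx][1]

def utc_to_tai(utc: datetime) -> datetime:
    """Shift a UTC wall-clock reading onto the TAI scale."""
    return utc + timedelta(seconds=tai_minus_utc(utc))

print(tai_minus_utc(datetime(2015, 5, 19)))  # 35
```

Like a DST rule change, a newly announced leap second would only mean appending a row to the table.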


I would imagine that some hardware RTCs would not support the :60 leap second value. In fact I cannot recall a single RTC that I have dealt with, that understands leap seconds.


Local time zones are offset from UTC, not TAI. If it's 19:45:00 in LA, UTC is 02:45:00 and TAI is 02:45:35. There's no such thing as a "TAI version" of your local time.
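
A quick check of that arithmetic in Python, as a sketch. It assumes LA is on daylight time (UTC-7) and hard-codes the 35-second TAI-UTC offset quoted in this thread; neither value is looked up from a database.

```python
from datetime import datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7))     # assumed: Los Angeles on daylight time
TAI_MINUS_UTC = timedelta(seconds=35)   # offset in effect before the June 2015 leap second

local = datetime(2015, 5, 19, 19, 45, 0, tzinfo=PDT)
utc = local.astimezone(timezone.utc)
# Python has no TAI tzinfo, so the TAI reading is kept as a naive datetime.
tai_reading = utc.replace(tzinfo=None) + TAI_MINUS_UTC

print(utc.strftime("%H:%M:%S"))          # 02:45:00
print(tai_reading.strftime("%H:%M:%S"))  # 02:45:35
```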


The solution is simple math. For computers to base their time on TAI, the time zone conversion changes from "UTC + local timezone offset" to "TAI + UTC offset + local timezone offset", and we reap the rewards of drastically simpler software at the core of our systems.

TAI is defined like UNIX time, as a notation of the progression of proper time. It is the primary reference by which we build all other times, UTC is a humanist overlay on TAI to maintain norms, since we need an approximate terrestrial solar time for sanity purposes.

If the math changes to TAI as the "base storage representation" for time stamps and reference time internally, then the math becomes immediately sane, since TAI can be relied on as a direct sequence of mathematically related linear time without lookup tables or other crap. Move the crap "up the stack" to where it doesn't cause issues like these we see every time things need a leap second.


The problem is that "the system clock" in the sense we have it now is actually "overloaded" with different expectations. From the hardware point of view, we have hugely inaccurate timers on the motherboards possibly drifting a lot all the time.

Then we have the signal from GPS, but typically only on the mobile phones, and some other signals on some other distribution mechanisms:

"GPS time was zero at 0h 6-Jan-1980 and since it is not perturbed by leap seconds GPS is now ahead of UTC by 16 seconds.

Loran-C, Long Range Navigation time. (..) zero at 0h 1-Jan-1958 and since it is not perturbed by leap seconds it is now ahead of UTC by 25 seconds.

TAI, Temps Atomique International (...) is currently ahead of UTC by 35 seconds. TAI is always ahead of GPS by 19 seconds. "

And we have NTP servers, which differ from one another all the time, and to which our computers connect and try to adjust what they report.

So the bugs are already just in how the adjustments are handled, not that the world can be made simpler.


Q: What value of TAI will be noon of July 4th, 2030 in New York?

A: Honestly, nobody knows.

You can estimate the number of leap seconds, but you can't know it (much) in advance. Having a (future) date's representation change occasionally does not lead to sanity either.


Mildly disappointed that AWS isn't giving Google credit for coming up with the idea of "time smearing", which they're using here:

http://googleblog.blogspot.com.au/2011/09/time-technology-an...


The idea of spreading a leap second over a longer period is not unique to Google e.g. www.quadibloc.com/science/cal06.htm The idea itself is actually rather obvious.


See also Markus Kuhn's description of smoothed leap seconds from 2005: https://www.cl.cam.ac.uk/~mgk25/time/utc-sls/


The time-smearing technique reminds me of how the Erlang platform (ERTS) adjusts its internal timekeeping to changes in the system's clock. If the system's clock jumps, then Erlang makes its internal clock tick faster or slower than the system's clock by at most 1% in order to resynchronize.

http://www.erlang.org/doc/apps/erts/time_correction.html
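
For a sense of scale, here is a back-of-the-envelope sketch in Python. It only uses the "at most 1%" figure quoted above; it is not how ERTS is implemented, and `seconds_to_absorb` is a hypothetical name.

```python
MAX_RATE_ADJUST = 0.01  # the "at most 1%" figure quoted above

def seconds_to_absorb(jump_seconds: float, rate: float = MAX_RATE_ADJUST) -> float:
    """Seconds needed to absorb a clock jump by ticking `rate` faster or slower."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return abs(jump_seconds) / rate

# A one-second jump takes on the order of 100 seconds to absorb at a 1% rate change.
print(round(seconds_to_absorb(1.0)))
```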


Most good ideas are obvious in hindsight. ;) But AFAIK, Google was the first to do this live and in production by hacking NTP, which is (apparently?) the same as what Amazon is doing.


SAP Systems have been doing this for many, many years live and in production for DST adjustments. Not sure about leap seconds.


What? How does it make sense to spread DST adjustments over a long time?


What if, around a DST jump, the system had a job like "pay namrog84 $100 at 1:30am"? Then 2am rolls around, it's 1am again, and now I get paid a second time. Win! But if DST were smeared over a day, this would be easily prevented.


Google didn't come up with that first. I'm aware of a number of companies with multiple data centers (my last company had 14 around the world) and coordinated logs or time-sensitive services that used this technique.

I personally implemented it 12 years ago, to support (a) a kind of geo-balancing based on time coordinated spoofed DNS responses, and (b) time stamped URL token expiration validation, sensitive to seconds, across distributed datacenters. I got the idea from Microsoft Windows' ability to drift the clock back into sync.

Google has adopted or reinvented a lot, and gets credit because they have extra time and resources to publish.

// TBC, they also invented a lot. But not this one. Sibling comments point out other implementations predating Google as well.


It looks to me like AWS is using a different algorithm. See my other comment here.


Agreed that they definitely should have given some credit. I was unfamiliar with the concept (and didn't even know the term). The AWS post raised more questions than it answered, but that Google one is excellent and coins a great term.


Interesting. I remember reading how Google was doing something similar in their implementation of the leap second back in 2011: http://googleblog.blogspot.com/2011/09/time-technology-and-l...


I'm really doubting myself over this question, because who am I to think I might have found a mistake in Amazon's chart, but aren't these two comments in the table reversed?

1. "Each second is 1/86400 longer and AWS clocks fall behind UTC. The gap gradually increases to up to 1/2 second."

If the seconds are longer, wouldn't the AWS clocks be creeping ahead, not falling behind? Bear in mind the leap second hasn't been added to UTC yet at this point in the table.

2. "AWS clocks gain 1/2 second ahead of UTC."

They do? But an entire leap second was just added to UTC. Aren't the AWS clocks 1/2 second BEHIND at this point?

Note the time of the AWS smearing of this 1 second straddles the UTC injection with 12 hours on each side.

And then:

"AWS clocks keep falling behind and the gap with UTC shrinks gradually."

Shouldn't it be "AWS clocks keep catching up and the gap with UTC shrinks gradually"?

Not sure how I can be reading this so backwards, but if it's me who is wrong here, I'd love to hear why. In any case it shows how time is easy to get wrong (whether it's me who is wrong, or hah, doubtful, AWS).

What am I missing here?

Edit: duh yeah I get it now. Amazon is right, of course. Obviously I'm a n00b when it comes to dealing with leap seconds. I was thinking the leap second sets time ahead, but it doesn't; it effectively does the opposite. Leaving this message in place as an example of how thinking about time and calendar-related programming is easy to mess up.


I really do find thinking about time to be fiendish as well.

Here's some examples I thought of that may help clarify:

1. "Each second is 1/86400 longer and AWS clocks fall behind UTC. The gap gradually increases to up to 1/2 second."

- AWS clocks do indeed fall behind when their "seconds" tick longer. Because their seconds tick longer, over a fixed period of time AWS will count fewer ticks than UTC. (Think of it this way: A mile is longer than a kilometer. After you have traveled 800 kilometers, you've only traveled ~500 miles.)

2. "AWS clocks gain 1/2 second ahead of UTC." - Before the addition of the leap second, AWS clocks are behind by 1/2 second as per (1). The addition of a leap second to UTC is another second that the UTC clock must tick - the AWS clocks don't have to tick this amount - so the AWS clocks are now ahead by 1/2 second.

Basically, this is what AWS is doing, using our distance analogy again:

1. Normally, we have to cover 1000 km every day. This includes a Civil group of travelers and an AWS group of travelers.

2. Today, we decide we're going to cover 1001 km. Everyone in Civil decides this is OK to count to 1001 instead of 1000, just for today. But the AWS group only wants to count to 1000 because counting to 1001 is a Very Bad Thing. However, the AWS group still has to cover 1001 km.

3. The AWS group comes up with the ingenious idea of just making each "kilometer" 1.001 real-world kilometers, just for today. Thus, they will only count to 1000, but each time travel 1.001 km. The end result is the same - they will have covered 1001 km.
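
The analogy above can be turned into numbers. This is a Python sketch of the table as this thread reads it, not Amazon's implementation: it assumes a 24-hour window straddling the leap second with 12 hours on each side, and treats UTC's reading as frozen for the duration of 23:59:60. All names are hypothetical.

```python
WINDOW_REAL_SECONDS = 86_401     # real seconds in the window, including the leap second
WINDOW_SMEARED_SECONDS = 86_400  # seconds the smeared clock counts over the same span
LEAP_STARTS_AT = 43_200          # real seconds from window start (noon) to 23:59:60

def smeared_minus_utc(elapsed_real: float) -> float:
    """Smeared clock reading minus UTC reading, in seconds, within the window."""
    if not 0 <= elapsed_real <= WINDOW_REAL_SECONDS:
        raise ValueError("outside the smear window")
    # The smeared clock runs slightly slow for the whole window.
    smeared = elapsed_real * WINDOW_SMEARED_SECONDS / WINDOW_REAL_SECONDS
    # UTC counts every real second until the leap second, holds during it,
    # and afterwards reads one second less than the real elapsed time.
    if elapsed_real <= LEAP_STARTS_AT:
        utc = elapsed_real
    else:
        utc = max(LEAP_STARTS_AT, elapsed_real - 1)
    return smeared - utc

for label, e in [("start", 0), ("just before leap", 43_200),
                 ("just after leap", 43_201), ("end", 86_401)]:
    print(f"{label:>16}: {smeared_minus_utc(e):+.4f} s")
```

The sign flips from about -0.5 s to about +0.5 s across the leap second, which is the "falls behind, then is ahead, then the gap shrinks" sequence that caused the confusion above.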


Think of it this way. Over the next time interval, the true UTC clock ticks 10 times. Then the time becomes time+10. If the AWS clock has longer seconds and only ticks 9 times, then the time is only time+9 on their clocks. Therefore, the time at AWS is falling behind UTC, since UTC is incrementing faster.


  Leaving this message in place as an example of how  
  thinking about time and calendar-related programming is 
  easy to mess up.
		
Indeed, time and calendar related programming is quite tricky, even specifying what the behavior should be can be tricky and very non-intuitive.


Dealing with time zone issues in code has driven me batty more than a few times.


If I understood the text correctly, AWS adjustments are "linear" and not curved like in the Google adjustment:

lie(t) = (1.0 - cos(pi * t / w)) / 2.0

described here:

http://googleblog.blogspot.com.au/2011/09/time-technology-an...

I think I'd prefer the curved one. I don't see any advantage to the linear one, except that it takes less effort to implement and test.
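
For comparison, here is the quoted formula next to a straight-line version, as a Python sketch. `lie_cosine` is a direct transcription of the formula above; `lie_linear` is my assumed reading of the linear approach, and the window length `w` is an arbitrary choice for illustration.

```python
import math

def lie_cosine(t: float, w: float) -> float:
    """Fraction of the leap second applied at time t into a window of length w."""
    return (1.0 - math.cos(math.pi * t / w)) / 2.0

def lie_linear(t: float, w: float) -> float:
    """Straight-line alternative with the same endpoints and a constant slope."""
    return t / w

w = 86_400.0  # assumed window length in seconds
for frac in (0.0, 0.1, 0.5, 0.9, 1.0):
    t = frac * w
    print(f"t = {frac:.1f}w  cosine = {lie_cosine(t, w):.4f}  linear = {lie_linear(t, w):.4f}")
```

Both reach the same total; the cosine version just starts and ends more gently and moves faster in the middle.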


What are the advantages of the curved one? I've never worked on an application that would care about this kind of thing, so I'm struggling to imagine the implications.


Suppose you have something with a watchdog that times something every time it runs to check that nothing suddenly gets slower / faster. A backup or something.

Let's suppose that the amount of time it takes to run is relatively small. Now suppose between two runs the linear decrease starts. All of a sudden the second run appears to be slower than it is by a constant factor.

With a curved change, this effect still happens, but it's less for smaller time intervals.


The shortest explanation would be that the curve can be introduced to lower the average difference between the official time and the adjusted time.

bjackman "never worked on an application that would care about this kind of thing," fine, but for those that do care, the curve is obviously better.


Another way to look at it: the curve keeps the second derivative of time bounded. Which in turn keeps anything analogous to acceleration bounded.
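
To put numbers on the derivative argument, here is a small Python sketch that differentiates the two curves by hand. The function names and the one-day window are assumptions for illustration only.

```python
import math

def cosine_rate(t: float, w: float) -> float:
    """d/dt of (1 - cos(pi*t/w)) / 2: extra clock seconds per real second at time t."""
    return math.pi / (2.0 * w) * math.sin(math.pi * t / w)

def linear_rate(t: float, w: float) -> float:
    """d/dt of t/w: constant, so the rate steps from 0 to 1/w when the window opens."""
    return 1.0 / w

w = 86_400.0  # assumed window length in seconds
for t in (0.0, 60.0, w / 2):
    print(f"t = {t:>8.0f}s  cosine = {cosine_rate(t, w):.3e}  linear = {linear_rate(t, w):.3e}")
```

The trade-off is visible here: the cosine curve has no step in rate at the window edges, but its peak rate in the middle is pi/2 times the linear one.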


Good point, thanks! Here's the example picture of the curve:

http://i.imgur.com/oYboqSL.png


I'm guessing some applications may be sensitive to a sudden change in second duration, and they probably argued that a gradual introduction of the change would trip up the fewest programs.


Wow, they've certainly put a lot of thought into that!

I wonder if they're actually "slowing down time" or just not implementing the leap second and just injecting the 1/..... second into all user-visible or exported fields?


I haven't seen a patent application for a time machine, so we can rule out the possibility that they're actually slowing down time.

However, I do wonder how they implement this. Do you change timekeeping in the kernel so that a fraction of extra time must pass for each second to be "counted"?


It's probably just a patched NTP server that lies about current time.


You are exactly correct. The aforementioned thread[0] on Serverfault mentioned timesmearing with `ntpd -x`, as per [1]. It's a fairly well-known technique for avoiding some weird kernel-related bugs on leap seconds.

[0]: http://serverfault.com/questions/403732/anyone-else-experien...

[1]: https://web.archive.org/web/20140221084401/http://my.opera.c...


It's impressive the chaos that one second can cause. I think in terms of hard problems, to naming and cache invalidation, I would add time or human notions thereof. Or perhaps time is a case of the first two intersecting...


Dates, times, and time zones, +04:00 tough problems at the intersection of computer science and humans.


Postal addresses, naming humans, encoding written language... Most problems at the intersection of technology and humans are tough. :)


What an elegant solution. This is really cool.


What a coincidence, today I was talking with the people at Vornex who are in the niche area of simulated time for testing. It seems like a good opportunity to test this "leap second".

[1] http://www.vornexinc.com/


What if I disable ntpd a minute before leap second is injected and restart it a minute later?


My guess is, the leap second will be inserted, just as scheduled. And you'll get the famous message in your kernel log (dmesg):

http://lxr.free-electrons.com/source/kernel/time/ntp.c?v=2.6...

    Clock: inserting leap second 23:59:60 UTC

If ntpd gets notified of an impending leap second via its peers (or the connected radio clock, GPS receiver, ...) it will set (struct timex*)->status |= STA_INS via the timex syscall (which it also uses to steer/speed-up/slow-down the clock).

http://man7.org/linux/man-pages/man2/adjtimex.2.html

The stock Linux kernel works such that if ntpd has set the STA_INS flag via adjtimex some time before, the kernel will do the leap-second insertion at the end of the UTC day.

If you disable ntpd and it doesn't reset this flag (which I doubt it does, but you'd have to check), the kernel will insert the leap second on its own, even if ntpd is not running.

If you disable ntpd, and either ntpd on termination (which I doubt), or you via the adjtimex syscall, clear the STA_INS flag, then the kernel will not insert the leap second. After UTC midnight, the clock will then be one second off, and a restart of ntpd will slowly steer the clock back to correct time.

For playing with all of this, there's an adjtimex tool which can display and even change the timex values:

    ➜  sbin  ./adjtimex -pV | sed 's/^/    /'
             mode: 0
           offset: 0
        frequency: -344608
         maxerror: 16000000
         esterror: 16000000
           status: 64
    time_constant: 2
        precision: 1
        tolerance: 32768000
             tick: 10000
         raw time:  1432018768s 308111us = 1432018768.308111
     return value = 5
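
To read the `status` field in that output, here is a small Python sketch that decodes the low status bits. The bit values are the ones I know from the Linux `<linux/timex.h>` header; `decode_status` is a hypothetical helper and does not call adjtimex itself.

```python
# Low status bits as defined in the Linux <linux/timex.h> header.
STA_FLAGS = {
    0x0001: "STA_PLL",
    0x0002: "STA_PPSFREQ",
    0x0004: "STA_PPSTIME",
    0x0008: "STA_FLL",
    0x0010: "STA_INS",      # insert a leap second at the end of the UTC day
    0x0020: "STA_DEL",      # delete a leap second at the end of the UTC day
    0x0040: "STA_UNSYNC",   # clock is not synchronized
    0x0080: "STA_FREQHOLD",
}

def decode_status(status: int) -> list:
    """Return the names of the known flags set in a timex status word."""
    return [name for bit, name in STA_FLAGS.items() if status & bit]

print(decode_status(64))         # ['STA_UNSYNC']
print(decode_status(64 | 0x10))  # ['STA_INS', 'STA_UNSYNC']
```

So the `status: 64` shown above means the clock is unsynchronized with no leap second armed; a pending insertion would show up as the 0x10 bit.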


Thanks! That's good info. It helped me find a good article from someone who devoted a couple of days to testing this method: http://syslog.me/2012/06/01/an-humble-attempt-to-work-around...


I'm not sure if the addition of a leap second is what prompted qntm to write about this, but his twitter feed is worth checking out: https://twitter.com/qntm


See also his website/blog, where he often writes about time-travel-related ideas: http://qntm.org/



I'd like to hear from other HN readers whether their applications would have major issues if they did not deal with leap seconds. I'm curious about the domains in which this is very important.


https://github.com/search?q=86400&type=Code&utf8=✓ A quick search on GitHub finds more than two million examples of people assuming 1 day is 86400 seconds. Every single one of those is a bug. Not a critical one (in most cases), but still an example of code that could fail unexpectedly.

In the most basic case, code to fetch a date n days ahead that naively assumes fixed length days will return a date n-1 days ahead instead if it runs at exactly midnight eg date("Y-m-d", mktime()+($days*86400)) will return a date $days-1 in the future if it runs on leap second day. An edge case certainly, but if you're adding millions of records a day something you ought to consider.


No, not every single one of those is a bug. "This cookie lasts a day", "This DNS entry lasts a day", and so forth. These aren't issues that require down-to-the-second accuracy. DNS expires a second early? Oh well, go poll upstream. It would be unusual to see an application that relies on something like a DNS entry being exactly 1 day's worth of seconds. The first page of results in that link are mostly cache timeouts rather than functional definitions.


> eg date("Y-m-d", mktime()+($days * 86400)) will return a date $days-1 in the future if it runs on leap second day.

No it won't. Those functions deal in UNIX-epoch times, which ignore leap seconds.
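
The same point can be checked in Python rather than PHP, as a sketch. `calendar.timegm` does the same leap-second-free epoch arithmetic, so the day containing the June 30, 2015 leap second still spans exactly 86400 epoch seconds.

```python
import calendar
import time

# Epoch seconds for midnight UTC on the leap-second day and on the day after.
june_30 = calendar.timegm((2015, 6, 30, 0, 0, 0))
july_01 = calendar.timegm((2015, 7, 1, 0, 0, 0))

print(july_01 - june_30)                 # 86400, although 86401 SI seconds elapse that day
print(time.gmtime(june_30 + 86400)[:3])  # (2015, 7, 1)
```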


"Every single one of those is a bug."

The one place I know I've assumed a day is 86400 seconds I also assumed a month was 30 days and a year 365, because it only needed to be an approximate gauge of how much time had passed, didn't need any relation to actual calendar time, and would be dropping far more precision elsewhere. I think it's incorrect to call that a bug, even a non-critical one.


On a related note, you'll find lots of cases where seconds are assumed to be 00-59 instead of 00-60. An understandable, but flawed, assumption, and one the leap second really messes up.
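
As one illustration in Python: the standard `datetime` type itself only accepts seconds 0..59, so a literal 23:59:60 timestamp has to be handled before it gets there. `valid_utc_second_field` is a hypothetical validator, shown only to contrast the two ranges.

```python
from datetime import datetime

def valid_utc_second_field(sec: int) -> bool:
    """Hypothetical validator: accept 60 so that 23:59:60 is not rejected."""
    return 0 <= sec <= 60

try:
    datetime(2015, 6, 30, 23, 59, 60)
    accepted = True
except ValueError:
    accepted = False  # datetime only allows seconds in 0..59

print(valid_utc_second_field(60), accepted)  # True False
```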


That's fair. "Bug" is maybe a bit strong. I'd consider them potential bugs, although in light of cperciva's comment even that's probably too strong.


Well, there was this: https://lkml.org/lkml/2012/7/1/203

It affected pretty much everything running (a recent version of) Linux.


That's a bug in the code that handles leap seconds. It's not a bug triggered by losing accuracy through ignoring leap seconds.

I'm also very curious about what kind of applications will fail if the kernel simply ignored leap seconds, and let the clock get a second out of sync.

A computer does not last long enough for leap seconds to add up to anything. I'm unable to come up with any hypothetical use case where that small difference is relevant but specialized hardware and software to deal with the issue are not necessary anyway.


You could probably find a number of post mortems by googling around for the last leap second, which was June 30 2012 IIRC.



