Leap second causing Linux server crashes? (serverfault.com)
253 points by sathyabhat 1486 days ago | hide | past | web | 114 comments | favorite

It appears to be fixed in Linux 3.4 [1]. According to the original commit [2] it's been broken since 7dffa3c673fbcf835cd7be80bb4aec8ad3f51168 [3], which appeared in 2.6.26.

So, kernels between 2.6.26 and 3.3 (inclusive) are vulnerable.

[1] https://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2....

[2] https://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2....

[3] https://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2....

Which, in summary, is pretty much every production kernel out there.
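For anyone triaging a fleet, that range check can be scripted. A sketch (the 2.6.26/3.4 bounds come from the commits above; the `is_vulnerable` helper name is made up, and GNU `sort -V` is assumed for version comparison):

```shell
# Rough check: is a kernel version in the affected range [2.6.26, 3.4)?
is_vulnerable() {
  v=$1; lo=2.6.26; hi=3.4
  [ "$(printf '%s\n' "$lo" "$v" | sort -V | head -n1)" = "$lo" ] &&  # lo <= v
  [ "$(printf '%s\n' "$v" "$hi" | sort -V | head -n1)" = "$v" ] &&   # v <= hi
  [ "$v" != "$hi" ]                                                  # v != hi
}
is_vulnerable 2.6.32 && echo "2.6.32 is in the affected range"
# In practice you would test "$(uname -r | cut -d- -f1)" instead.
```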

Spent the last two hours recovering servers, tomorrow will be another interesting day.

Whoever figured it'd be a good idea to INSERT[1] the leap-second instead of just slowing/accelerating time... <censored>

[1] Clock: inserting leap second 23:59:60 UTC

I'm still trying to understand why all my servers seem to be OK even though they have kernels that should be affected, and some of them are running MySQL... For example, one of them is Debian with kernel 2.6.32 running mysqld and ntpd, and I see "Clock: inserting leap second 23:59:60 UTC" in dmesg, but the CPU load is fine...

Well, it was a known bug and you had six months to prepare (i.e. update your kernel).

Where was it published?

Almost all of my machines run the Debian stable kernel and were still affected.

The leap second was scheduled in January. An event that unusual should make you worried. A simple Google search turns up a critical bug[1] in the Linux kernel from the last time a leap second was inserted. People got worried, rightfully[2][3]. I don't know about Debian, whether it was known beforehand, or whether it's the same bug as before... But I don't run Debian, you do.

1. https://bugzilla.redhat.com/show_bug.cgi?id=479765

2. http://it.slashdot.org/story/12/06/30/2123248/the-leap-secon...

3. http://serverfault.com/questions/402087/does-centos-5-4-prop...

No need to be a smart-ass about it.

Even if I had googled (which I didn't) then I'd probably have assumed the fixes for bugs from 2009 to have long made it into the current distro kernels.

I just didn't expect something so basic to be still (or again) broken.

Don't get me wrong, I wouldn't either, by default. But do you remember Azure crashing on February 29th? Checking for that date is a matter of three conditions; a leap second is much more complex. I'm not trying to be a smart-ass... I'm just saying it's something I would worry about and would try to find out more about. And perhaps it wouldn't lead anywhere with Debian.

And still, something in your app stack could crash on this as well, leaving the kernel patching pointless.

Pity there was a missing bnx2 firmware issue in all the stable kernels since then, which makes most of the boxes I'm unfortunate enough to run even less useful.

> Whoever figured it'd be a good idea to INSERT[1] the leap-second instead of just slowing/accelerating time... <censored>

That would be the IERS. There's going to be a vote in 2015 on abolishing leap seconds entirely.

Well, except for RHEL 5. That runs 2.6.18.

I have 2.6.27 kernels here (SuSE 11.1) which seem unaffected, so the breakage might have started a little later.

http://bit.ly/N1kZvS https://twitter.com/redditstatus/status/219244389044731904

Google uses a "leap smear" and slowly accounts for the leap second before it happens.[1] As long as you are not doing any astronomical calculations or constrained by regulatory requirements I think google has the right idea.

[1] http://googleblog.blogspot.com/2011/09/time-technology-and-l...
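As a toy illustration of the idea (a simple linear ramp; Google's post describes modulating their internal NTP servers over a window before midnight, and the exact curve is theirs), here is what smearing one second over a hypothetical 20-hour window looks like:

```shell
# Linear leap-smear sketch: over a window of w seconds before midnight,
# the served clock lags real time by t/w seconds at time t into the
# window, absorbing the full leap second by midnight. Window is made up.
awk 'BEGIN {
  w = 20 * 3600
  for (t = 0; t <= w; t += w / 4)
    printf "t=%6d  smear=%.2f s\n", t, t / w
}'
```

No process ever sees 23:59:60; the clock is simply slightly wrong (by less than a second) for the duration of the window.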

As part of Google Compute Engine we provide an NTP server to the guest which is based on Google Production time. As such our VMs get to take advantage of this leap second smearing implementation. I was going to mention this at my talk at IO but forgot.

So a VM on G's Compute Engine could in turn run an NTP server that exported G's Production Time? Do I also see GPT on App Engine?

Any chance Google could just make a GPT NTP server available as a public service anyway, just as their public ping responder is? ;-)

8.8.8.8 and 8.8.4.4 are Google Public DNS, not a "ping responder." https://developers.google.com/speed/public-dns/

Google does provide time servers, although I'm not sure whether they are officially supported. The addresses are:

    time1.google.com
    time2.google.com
    time3.google.com
    time4.google.com
Yeah, sorry, I know they're the public DNS, but they're also jolly handy as unforgettable IP addresses you expect to be able to ping, hence the smiley.

Good news about the time{1..4} NTP servers, I'll give them a try, thanks.

I don't know about GAE specifically but I'd be very surprised if it didn't see the same time as the rest of Google. They would have had to do work to make that happen.

Marco's blog post (linked) had a similar idea - running ntp with -x for a day so it smears time.

In case anyone is looking for the actual link to Marco's posts on ntp:


Not surprising. In spite of all the press claiming Y2K was just a silly waste of money, it's events like these that make me suspect it would have been a much bigger deal if everyone had ignored it and fixed things only after they were shown to break.

A lot of engineers[1] spent a lot of time successfully fixing Y2K bugs.

Because nothing well known blew up, many people wrongly assumed that Y2K was never a real problem to begin with.

[1] I moved a Fortune 100 manufacturing company's database off an ancient mainframe that would've been disastrous come Y2K. It went smoothly and was thus a thankless job. They paid well though (mid six figures - those were the days).

When people say mid-six figures, do they mean 500k? Or 150k?

I think it means 300k -- 3 being a one-digit approximation to sqrt(10). I.e. the geometric mean of 100k and 1M.

Not sure if snark, or if really that financially sophisticated...

I'd usually say just Google it, but coincidentally enough one of the front page results is someone asking the same question on this very site:


It means 500k, or thereabouts; 400-600k would probably qualify, with anything higher or lower being mid-to-high or low-to-mid six figures, respectively.

$100/h approaches $200k, and consultants can definitely make $250/h+.

Yeah, but that rate usually assumes they aren't billing 40hr/weeks for 50 weeks a year.

Does it? I've had 50 week years as a consultant where it was 40-60hrs a week. Rough years. Profitable though. What would be nice is 25wks at the higher rates.

Why does everyone always say Y2K wasn't an issue? I'm sure there were a lot of consultants making too much money for little work; however, a lot of bug fixes were done that would otherwise have caused problems. Because it was taken seriously, stuff got fixed, and the issues didn't happen.

Personally, I fixed 3 Y2K bugs back then; 2 of them would have caused a rather critical business support system to crash every time new data arrived.

"Why does everyone always say Y2K wasn't an issue?"

From the outsider's perspective, it is indistinguishable from any number of other putative disasters that required lots of money to fix, yet didn't come to pass... in some cases including putative disasters in which the money wasn't spent and the disaster didn't happen anyway.

I have the insider's perspective and I agree that it is the more accurate, that Y2K was, if not necessarily going to end the world, certainly a bad thing and was largely averted through effective engineering. But I can still see how from the outside it sure doesn't look that way.


Many, if not most, of my non-technical friends and acquaintances have at one time or another referred to the "Y2K disaster" and rolled their eyes to suggest it was somehow not an issue. I was the 'Y2K compliance officer' at my startup at the time (we even got certified, though the certifying part may actually have been a scam), but we identified and fixed a number of issues our box would have suffered had we not done the work.

"In spite of all the press claiming Y2K was just a silly waste of money, it's events like these that make me suspect it would have been a much bigger deal if everyone had ignored it and fixed things only after they were shown to break."

Did you read the parent's post?

2012, and we still have problems keeping track of time. This is both fascinating and scary.

P.S. for people wanting to know more this video is simple to understand but really amazing http://www.youtube.com/watch?v=xX96xng7sAE

Maybe we always will have problems?

It seems to be the unique class of bug that not only is it easy to forget to test, and won't ever show up until a particular date... but then affects everyone!

I can't think of any other kind of bug that never shows up ever, but then affects everyone. Rare bugs tend to stay rare, common bugs tend to get caught before they affect everyone... this is the exception.

From discussion of this same issue in prior threads, my takeaway was

(a) it's really not at all difficult to handle leap seconds, but

(b) the POSIX standard specifically disallows them, by specifying that a day must contain exactly 86400 seconds. (Analogously, imagine if leap days occurred as normal, but a "year" by definition contained exactly 365 days.)

The existence of leap seconds means that it's not possible to simultaneously have (1) system time representing the number of seconds since the epoch, and (2) system time equal to (86400 * number_of_days_since_epoch) + seconds_elapsed_today, and all the proposed methods of dealing with the problem involve preserving (2), which seems worthless to me, and throwing away (1), which I would have thought was a better model.

edit: actual system times may be in units other than seconds, but the point remains
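Property (2) is easy to see with GNU date: since POSIX pretends every day has exactly 86400 seconds, any UTC midnight is an exact multiple of 86400, and the leap seconds that have physically elapsed since the epoch appear nowhere in the count. A sketch (`date -d` is GNU-specific):

```shell
# Midnight UTC on the leap-second day, as a POSIX timestamp.
t=$(date -u -d '2012-06-30 00:00:00' +%s)
echo "$t"                 # 1341014400
echo $(( t % 86400 ))     # 0: exactly 15521 "86400-second days" since epoch
```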

It's harder than leap days, because leap seconds aren't inserted on a regular schedule. Leap days follow a predictable pattern of insertion. Leap seconds are inserted whenever the IERS decides to insert them.

The problem of leap seconds is therefore closer to that of time zone definitions -- which are a total mess, because they depend on keeping rapidly changing system tables up to date. I can see why people don't relish the idea of requiring similar tables just to keep system time accurate.

How are systems being notified of the leap seconds now, that wouldn't immediately enable them to update their hypothetical leap second table?

It seems like we already have a much bigger lead time for notification than we could possibly need.

> I can see why people don't relish the idea of requiring similar tables just to keep system time accurate.

But the 'solution' we're using now is to make system time less accurate, not more accurate. Accurate would be if leap seconds incremented the system clock like normal seconds do. If the accuracy you're worried about is displaying a clock time rather than time since the epoch, you already need a time zone to do that.

"How are systems being notified of the leap seconds now, that wouldn't immediately enable them to update their hypothetical leap second table?"

I am not an expert, but as far as I know the most automated solutions are doing it via NTP, which just resets the second, then relies on clock drift to bring everything back into synch. Otherwise, I think your only option is to keep the timezone packages up-to-date (which is a non-trivial task for large deployments). A quick search found this:


"But the 'solution' we're using now is to make system time less accurate, not more accurate."

Yeah, I'm not disputing this. I'm just saying that preserving the assumption that "day == 86400 seconds" probably breaks less code than the alternative. NTP messes with the notion of seconds-since-epoch anyway, so we know that single-second variations in unix time aren't automatically deadly to most unix software.

NTP sends a special message to the kernel (using adjtimex), that boils down to: today you will insert a leap second. This isn't the same as clock drift, which gets smoothed out, it means a minute with a 60th second (in UTC) or with the 59th second happening twice (in POSIX). NTP servers need a leap second table ( http://support.ntp.org/bin/view/Support/ConfiguringNTP#Secti.... ), but most other systems only need to know the current delta between POSIX and TAI, and manage without a leap table.

A clever solution would be keeping an internal time and generating POSIX times (and human friendly times) as needed. This way, leap seconds will never touch true system time.

Interesting analysis. Preserving (2) means that systems that don't handle leap seconds (or don't have a leap seconds table updated in the last six months) have a POSIX time at most 1 second away from systems that do. Dropping (2) would mean a more complicated formula to convert system times to times of day, but it would also make the failures and missing leap tables immediately noticeable, rather than a one-in-a-few-years event. Which would have made it the better choice; bugs encountered during development are easily handled, bugs that happen years later on production systems are inconvenient.

It's... worse. We can track time so easily and so well that we decided to screw it up.

2012 and we still don't really know what time is. The fact that we can keep track of it, even with all the problems, is quite amazing.

Physics perfectly well knows what time is. This is an issue with keeping track of time in terms of seconds, minutes, hours and days, and has nothing to do with not knowing what time is.

What is time, other than a state in which net entropy within a closed system increases (which is already a definition abstract enough that it almost misses the point)?

That's not really a definition of time, just a way to postulate its direction. Time as a physical unit has been decreed to be the rate of decay of a certain Cesium atom; i.e., a fixed regular interval, whereas the increase of entropy varies from moment to moment. E.g. right now I'm typing and before I wasn't, that doesn't mean time is going ever so slightly faster because of that ...

Decay of nuclei is a completely random process, and you really don't want to use it to define a time-scale.

What you had in mind is the SI-definition of the second, which reads (http://www.bipm.org/en/si/si_brochure/chapter2/2-1/second.ht...):

     The second is the duration of 9 192 631 770 periods
     of the radiation corresponding to the transition
     between the two hyperfine levels of the ground state
     of the caesium 133 atom.
The details of this effect are a little more complicated, but it boils down to the fact that you can measure the "angular momentum" of your nuclei when you have them pass through a non-uniform magnetic field. Particles in one state are deflected differently than particles in the other state. And when you irradiate a particle beam in the F=0 state with the right frequency (9.2 GHz) you can very efficiently swap many particles over to the F=1 state. By adjusting your frequency to find the maximum rate of flips, you can tune to the exact 9'192'631'770 Hz.

The cesium particles have not decayed; in principle you could run forever on a certain supply of atoms... even though in practice a Cs-beam is produced on one side, at a hot filament, and dumped on the other side of the clock after passing through the apparatus that performs the steps described above. They are disposed of when the lifetime of the beam-tube is reached (typically 10 years or so, with a few grams of cesium inside).

>rate [sic] of decay of a certain Cesium atom //

That is time in the same way that leaves moving on a tree is wind.

The instance of the radioactive decay indicates the defined period of time has passed within the local frame of reference. That tells us very little useful about time itself.

I wasn't making a philosophical statement, I specifically qualified it by saying "as a physical unit." That's still nowhere near precise enough, but I only meant to point out that we typically don't conceive of time as an emergent property of physical processes (i.e., entropy change), but as a local property of a very specific atom, such that time passes at a constant rate, if only because we choose to define it that way. "Telling something about time itself" sounds like a philosophical question which I fear has no definite answer.

In this case, I believe we created a problem we did not have. Leap seconds are a dubious construct from the start, problematic for computers and space travel. We have added only 25 since 1972. Their unpredictability means they will forever be a problem for computing. We should either quit the whole idea or, at worst, allow them only every 25 years or so.

Edit: In fact there is strong indication that they may be abolished: http://en.wikipedia.org/wiki/Leap_second#Proposal_to_abolish...

Nice video, but what if we used sidereal time? (i.e., star time, which measures the Earth's rotation against the fixed stars rather than the Sun).

Fear the Unix 32-bit time-becomes-negative bugs, in 2037.

We have 25 years to get ready. I still think we'll be patching at the last minute.

(Yeah, lots of systems will be 64-bit by then, but there will still be a lot of embedded crackerbox systems running 32-bit timestamps. It's all the embedded stuff I'm worried about).

It's 2038, not 2037.[1] (Specifically, January 19th, 2038 at 3:14:08am.) And while lots of systems will be 64-bit, many programs still won't be -- and it seems highly likely that this will be a significantly more serious and widespread problem than, say, Y2K or DST. (And certainly more serious than leap seconds, which happen relatively frequently.) Then again, I might be biased: perhaps I'm secretly hoping to spend the years leading up to 2038 paying for my retirement with high-priced consulting gigs to fix it...

[1] http://en.wikipedia.org/wiki/Year_2038_problem
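The exact moment is easy to reproduce with GNU date, since it is just the largest signed 32-bit value interpreted as a timestamp:

```shell
# Last instant representable in signed 32-bit time_t: 2^31 - 1 seconds
# after the epoch. One second later the counter wraps to -2^31.
date -u -d @2147483647 +'%Y-%m-%d %H:%M:%S'    # 2038-01-19 03:14:07
```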


Less than a year ago there were already people thinking about your job security. (It's a better explanation than "the glibc maintainers are insane".)

But MUCH less than a year ago, many more people were still writing 32-bit-dirty time_t based code.

It's gonna be a fun one.

If you think that being 64-bit protects you, then you do not understand the problem.

The problem is that 32-bit time is embedded in filesystem representations and related protocols (e.g. the POSIX specification for file times in tar). Therefore, even if your machine is 64-bit, it still needs to use 32-bit time for many purposes.

To name a random example, the POSIX specification for times in the tar format is 32-bit. GNU tar has a non-standard extension that already takes care of it. But will everything else that expects to read and write tar files interoperably with GNU tar implement the same non-standard extension in the same non-standard way? Almost certainly not. And there will be no sign of disaster until the second that we need to start relying on that wider representation.

Facing this issue now... You can clear the problematic INS bit with the 'adjtimex' command.

First, confirm the status flag like this.

    $ ./adjtimex --print | grep status
    status: 8209
8209's binary representation clearly shows the INS bit set: "100000000[1]0001" (the 5th LSB).

    $ ruby -e 'p 8209.to_s(2)'
    "10000000010001"
8193 is the value after clearing the INS bit.

    $ ruby -e 'p 8193.to_s(2)'
    "10000000000001"
Then, let's set it as a current value. Please ensure your ntpd is not running.

    $ adjtimex --status 8193
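The 8209 → 8193 step is just clearing bit 4 of the status word (STA_INS, 0x0010); shell arithmetic does it without reaching for ruby:

```shell
STA_INS=$(( 1 << 4 ))             # 0x0010, the "insert leap second" flag
status=8209                       # as reported by `adjtimex --print`
cleared=$(( status & ~STA_INS ))
echo "$cleared"                   # 8193, the value to pass to --status
```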

Novell kb: http://www.novell.com/support/kb/doc.php?id=7001865

  SLE9 (kernel 2.6.5-7.325): NOT AFFECTED
  SLE10-SP1 (kernel NOT AFFECTED
  SLE10-SP2 (kernel NOT AFFECTED
  SLE10-SP3 (kernel NOT AFFECTED
  SLE10-SP4 (kernel NOT AFFECTED
  SLE11-SP2 (kernel 3.0.31-0.9.1): VERY UNLIKELY

  Update (06/26/2012): after thorough code review -> SLE9 and SLE10 not affected at all.

FYI: I've updated the post with details of the workaround as implemented on our servers.

Pardon the ignorance if this is a stupid question. I've been looking at some of my hosts and have noticed the message "Clock: inserting leap second 23:59:60 UTC" in dmesg output, but each of the hosts is in the EDT timezone, so I was under the impression that the leap second hadn't been applied yet. So what does that mean? That the systems have applied the leap second successfully, or only that they have received it from their NTP servers?

The leap second is applied at midnight UTC time, regardless of what timezone the server is in.

Okay, so does that mean that the various bugs that have been circulating can still hit as it hasn't hit midnight in EDT yet or can I exhale?

We just had 100s of EC2 instances generate high (alleged) load. Instances had load averages of 90+ but were responsive.

Running on a 3.2 kernel

Rebooted them all and they're fine.

What he said.

FYI: Our Debian servers did not kernel panic, but system CPU load went through the roof; a quick restart brought levels back to normal.

My Ubuntu 10.04 desktop went to 100% proc and load avg of 20, none of my 10.04 servers or Debian stable servers were affected.

This fixed it:

  date; sudo date `date +"%m%d%H%M%C%y.%S"`; date;
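What that one-liner does: it reads the current time in the MMDDhhmmCCYY.ss form that `date` accepts as a set-time argument, then immediately sets the clock to it; the clock-set appears to be what knocks the stuck leap-second state loose. The format itself, shown read-only on a fixed timestamp (GNU date):

```shell
# Build the MMDDhhmmCCYY.ss stamp for a known instant without touching
# the clock (the real workaround passes it straight back to `sudo date`).
stamp=$(date -u -d '2012-06-30 23:59:59' +"%m%d%H%M%C%y.%S")
echo "$stamp"    # 063023592012.59 (month day hour minute century year . second)
```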

You are a lifesaver. All morning my desktop's load has been pegged at 20. I upgraded FF, Chrome, etc. and no impact. I was dreading a full re-start, as I have lots of windows, tabs, etc. open. The above command knocked the load down to almost nothing in seconds.

After reading these tales of woe, all I can say is that I hope the criminal element doesn't start assaulting NTP servers.

I was logged on to a couple of CentOS 6 servers when I saw this happen, and on each one the Java processes went absolutely haywire. Everything else seemed to work fine.

I attempted to fix with adjtimex and the script in the linked question, but to no avail, in the end having to restart them all instead. After that, all was good again.

I just had the exact same experience.

Had the same issue across all our VMs running Java/Tomcat applications.

POSTMORTEM fix for CPU eating softirqd threads without rebooting:

stop ntpd, run ntpdate or sntp, start ntpd

/etc/init.d/ntp stop; sntp -s <ntpserver>; /etc/init.d/ntp start

Unfortunately the sntp / ntpdate wrapper is not shipped with squeeze, for example. I've used the binary from SuSE 11.4 on squeeze just fine.

OK, this is how it works on squeeze etc.:

apt-get install ntpdate; /etc/init.d/ntp stop; ntpdate pool.ntp.org; /etc/init.d/ntp start

or easier still, just date -s "`date`"

without ntpd restart


My Debian GNU/Linux 6.0 is still standing

Oh well, reading the issue, the machine date is Sat Jun 30 16:11:31 EDT 2012

Stopped ntpd just in case

Same here. Set ntp to restart in 12 hours.

With ntp stopped, no problem whatsoever

Two days ago while booting, the BIOS time on my eeepc was suddenly reset, with an error message on boot telling me to adjust the time manually. I was just thinking it might be related?

Our Linux instances running on Amazon EC2 had no issues since we are not running ntpd on these servers and adjtimex returns status as 64 (clock unsynchronized).

I think the Xen host takes care of the synchronization and we need not do it in the guest OS. (see http://serverfault.com/questions/100978/do-i-need-to-run-ntp...).

Is this fine or should we run ntpd for better accuracy?

Yes. This issue notwithstanding, you should be running ntpd.

Stupid question: Why was this not caught? Seems pretty easy to test. Just set the clock to today (or any day with a leap second), and watch what happens.

> Just set the clock to today (or any day with a leap second), and watch what happens.

That won't work. The bug is only triggered when an upstream NTP server reports that a leap second was scheduled. Since leap seconds aren't predictable (and aren't even scheduled very far in advance), just setting the time back to the date of a previous leap second won't do anything.

True, but the question still stands, since you can still test it by just telling the kernel to insert a (fake) leap second.

It also should not be that hard to provide your own upstream ntp server, and have that generate leap seconds at will. Both machines could be VMs, too.
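For a disposable-VM test you don't even need a rigged NTP server: the adjtimex(8) userland tool (assuming it is installed) can set the flag directly, and the status word to write is just the current one with bit 4 ORed in. A sketch; the arithmetic runs anywhere, but the final (commented-out) step needs root and a machine you don't care about:

```shell
# Compute the status word that asks the kernel to insert a leap second
# at the next UTC midnight: current status with STA_INS (bit 4) set.
status=8193                       # e.g. from `adjtimex --print`
armed=$(( status | (1 << 4) ))
echo "$armed"                     # 8209
# adjtimex --status "$armed"      # the actual step: root only, throwaway VM!
```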

On debian, I was able to fix the issue (fix the load issue specifically) with this command

/etc/init.d/ntp stop; date; date `date +"%m%d%H%M%C%y.%S"`; date;

If all Linux systems really were affected, more than half of the Internet would still be down by now. It could be a specific combination of kernel/userspace bugs that only exists on some systems.

What sucks a bit is that my VPN (openvpn) was affected too, causing my computer to power off. I replaced the poweroff with

ip route add to dev lo

hope that saves me when the next leap second occurs.

Pirate Bay has also been crashed by this: "TPB crashed just after midnight June 30th GMT (5.5 hrs ago) The crash appears to have been caused by the leap second that was issued at midnight."


No burps from my BSD boxes either, although they're all in UTC so the leap second hasn't happened for them yet.

The leap second is added at the same point in time regardless of the timezone your server is configured to use. So if you're GMT+3, the leap second will be inserted at 03:00 local time.

From the answer: "The reason this is occurring before the leap second is actually scheduled to occur is that ntpd lets the kernel handle the leap second at midnight, but needs to alert the kernel to insert the leap second before midnight. ntpd therefore calls adjtimex sometime during the day of the leap second, at which point this bug is triggered."

Is this implementation-specific, or could the Windows equivalent to ntp cause the same problem?

Implementation specific. It looks like it is a bug in the Linux kernel with how it adjusts the time. It is possible that Windows, OS X, and other BSDs will be affected by a similar bug, but that would be coincidental as the bug is not due to ntpd but rather how the kernel handles a request that ntpd generates.

More specifically, there is a condition in which the kernel tries to insert a leap second and, in doing so, attempts to acquire the same lock twice causing the spinlock lockup and (effectively) halting the kernel.

My AWS EC2 instances got spun up to 100% cpu and have been like that for a day. Basically saw a step function from 0 to 100 in the CPU graph. Just had to reboot them.

Hey, I'm running Ubuntu 12.04 . Could someone guide me through what I can do to detect/prevent this from crippling my server? Thanks.

Read the linked article.

> The work-around is to just turn off ntpd. If ntpd already issued the adjtimex(2) call, you may need to disable ntpd and reboot to be 100% safe.

Oddly, Netflix went down for me at 12:01 last night...

I assumed some cronjob or something similar was to blame.

Netflix had outages due to a huge storm on the east coast of the US. That was probably the cause.


My Ubuntu servers seem unaffected thus far.

I found mysqld (5.5.24) running at 159% CPU on a Ubuntu 11.04 (64-bit) box this morning. ntpd had drifted in the leap second between 1am and 2am (GMT) this morning (NTP drift info is one thing I graph with MRTG).

[EDIT] Ah, covered elsewhere. Fixed by manually setting the date on the box; stopping/restarting mysqld or ntpd doesn't make any difference.

Unfortunately I can confirm that Ubuntu 10.04 is vulnerable. We're proceeding with the fixtime.pl workaround.

I can confirm it too, but didn't catch it in time. A reboot however and everything is back to normal.

We didn't catch it in time either. It was oh so much fun to wake up to our service not working at all, all java and mysqld processes spinning like crazy, and having to reboot all servers. :-/

It seems to depend on high load, so you could be lucky!

That wouldn't happen if servers were Macs

That's relevant how?

I watched my Mac OS X 10.7.4 box count backwards this leap second. The time went to 00:00:00, then back to 23:59:59. At least Linux makes the effort to have the seconds never go backwards.

Mac OS X is a BSD variant, so there's every chance.

How does that follow? OS X runs an odd hybrid kernel (XNU) which is Mach and parts of BSD, but... this is a Linux kernel bug. There's an effectively zero chance of this impacting anything but Linux.

The kernel is not the only OS component relying on time that may have not considered this.

This is evidently a kernel bug. The fact that both operating systems rely on time isn't particularly relevant. Could there be time bugs in OS X? Certainly. But it wouldn't be this one. Windows relies on time too, so I don't see why you bring up the fact that OS X is a BSD variant.

BSD doesn't have an adjtimex syscall, so it's very unlikely for there to be a spinlock bug in the adjtimex syscall that doesn't exist.

Read that as "high rates of cash."
