
How setting the TZ environment variable avoids thousands of system calls - jcapote
https://blog.packagecloud.io/eng/2017/02/21/set-environment-variable-save-thousands-of-system-calls/
======
AceJohnny2
> _In other words: your system supports calling the time system call via the
> Linux kernel’s vDSO to avoid the cost of switching to the kernel. But, as
> soon as your program calls time, it calls localtime immediately after, which
> invokes a system call anyway._

This reminds me of an article by Ted Unangst[1], in which he flattens the
various libraries and abstractions to show how xterm (to cite one of many
culprits) in one place is effectively doing:

    
    
            if (poll() || poll())
            while (poll()) {
                 /* ... */
            }
    

In other words, if you don't know what your library/abstraction is doing, you
can end up accidentally duplicating its work.

Reminds me of some aphorism, "Those who do not learn from history..." ;)

[1] [http://www.tedunangst.com/flak/post/accidentally-
nonblocking](http://www.tedunangst.com/flak/post/accidentally-nonblocking)

discussed
[https://news.ycombinator.com/item?id=11847529](https://news.ycombinator.com/item?id=11847529)

~~~
segmondy
With the layering of clusters, containers, and microservices, I bet you
probably have it 10x worse than that. There is always a cost to abstraction. On
the surface it might make things simpler but if you were to peel it apart, you
would reveal a hidden layer of complexity. Hopefully, it's done well enough
that there will never be a need to peel it apart.

~~~
nomel
> On the surface it might make things simpler but if you were to peel it
> apart, you would reveal a hidden layer of complexity.

Well yes, this is the very definition and goal of abstraction.

------
tytso
System calls in Linux are really fast. So saving "thousands" of system calls
when /etc/localtime is in cache doesn't actually save that much actual CPU
time.

I ran an experiment where I timed the runtime of the sample program provided
in the OP, except I changed the number of calls to localtime() from ten times
to a million. I then timed the difference with and without export
TZ=:/etc/localtime. The net savings was 0.6 seconds. So for a single call to
localtime(3), the net savings is 0.6 microseconds.

That's non-zero, but it's likely in the noise compared to everything else that
your program might be doing.
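
For anyone who wants to repeat this kind of measurement, here is a minimal sketch in C. It is not the test program from the article: the helper names `time_localtime_calls` and `per_call_microseconds` are made up for illustration, and the loop count is whatever you pass in.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: call localtime() n times and return the elapsed
   wall-clock time in seconds. */
double time_localtime_calls(long n)
{
    struct timespec start, end;
    time_t now = time(NULL);
    volatile int sink = 0;  /* keeps the compiler from dropping the loop */

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < n; i++) {
        struct tm *tm = localtime(&now);
        if (tm != NULL)
            sink += tm->tm_sec;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void)sink;

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Average cost of one call, in microseconds. */
double per_call_microseconds(double total_seconds, long n)
{
    return total_seconds * 1e6 / (double)n;
}
```

Run it once with TZ unset and once with TZ=:/etc/localtime, then subtract the two totals. With the figures quoted above, 0.6 seconds over a million calls works out to 0.6 microseconds per call.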

~~~
rpcope1
This might be true for your system and libc, where calls like gettimeofday go
through the vDSO and are fast, but in general this isn't guaranteed at all.
Even on x64, for certain libc implementations, like musl, if I recall
correctly, syscalls are made the old-fashioned way by trapping 0x80, which
would mean you would see a much bigger effect by reducing the number of
syscalls.

~~~
tytso
There is no vDSO for calls to stat(2). The claim in the article was that by
setting the TZ environment variable to ":/etc/localtime", one could save
"thousands" of stat system calls. Even for old-fashioned system calls where
you use trap 0x80, Linux is still amazingly fast.

This can actually be a problem, since there are applications like git which
assume stat is fast, and so it aggressively stat's all of the working files in
the repository to check the mod times to see if anything has changed. That's
fine on Linux, but it's a disaster on Windows, where the stat system call is
dog-slow. Still, I'd call that a Windows bug, not a git bug.

~~~
codedokode
Does Windows have a stat() call? It is probably a function from some POSIX
emulation layer, and maybe that is why it is not fast.

------
pquerna
Good blog post explaining the behavior of glibc. I also saw this first-hand
when profiling Apache a while back:

[http://mail-archives.apache.org/mod_mbox/httpd-dev/201111.mb...](http://mail-
archives.apache.org/mod_mbox/httpd-
dev/201111.mbox/%3CCAMDeyhzRAZ4eyz%3D%2BstA%3DwoTibM-W6QL8TqT%2BaPio07UddCz7Tg%40mail.gmail.com%3E)

[https://github.com/apache/httpd/blob/trunk/server/util_time....](https://github.com/apache/httpd/blob/trunk/server/util_time.c#L310-L327)

The internals of glibc can often be pretty surprising. I'd really
encourage people to go spelunking into the glibc source when they are
profiling applications.

------
brendangregg
Please quantify the speedup (I've found this before, but it's never been a
significant issue). Eliminating unnecessary work is great, but what are we
really talking about here? Use a CPU flamegraph, Ctrl-F and search for stat
functions. It'll quantify the total on the bottom right.

~~~
brendangregg
Oh, and another page that recommends strace without warning about overheads.
Dangerous.

------
Daviey
Honestly, the primary reason I support this is to get developers out of the
habit of demanding a localized server timezone. As an infra person, I want
system time in UTC. If developers get in the habit of setting TZ, then I can
have this!

~~~
int_19h
It feels like any code that needs to know the timezone of the _server_ is
inherently wrong. If timezone ever comes up in any context, it's either the
timezone of the client from whom the request originates - in which case it
should come as part of the request - or else the timezone somehow associated
with the business process (e.g. "warehouse open 8-5 Eastern time"), in which
case it should be part of the configuration for that one service.

------
jdamato
Author of the post here: greetings.

If you enjoyed this post, you may also enjoy our deep dive explaining exactly
how system calls work on Linux[1].

[1]: [https://blog.packagecloud.io/eng/2016/04/05/the-
definitive-g...](https://blog.packagecloud.io/eng/2016/04/05/the-definitive-
guide-to-linux-system-calls/)

------
rootbear
Is there a reason why the path to the timezone file is prefixed with a colon?

TZ=:/etc/localtime

I've set TZ sometimes without the colon and it seems to work. I did a quick
online search and didn't find anything relevant.

~~~
avar
:<whatever> means "read it from the <whatever>" file. See the last part of the
relevant glibc documentation: [https://www.gnu.org/savannah-
checkouts/gnu/libc/manual/html_...](https://www.gnu.org/savannah-
checkouts/gnu/libc/manual/html_node/TZ-Variable.html)

However the reason it works without : is that the implementation is being lazy
and just ignores the : delimiter and falls back to parsing out a filename
either way:

[https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzset....](https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzset.c;hb=refs/heads/master#l418)

~~~
rootbear
You beat me to it. I was answering my own question when one of my users came
in with a problem. Stupid users...

------
snowcrshd
Brendan Gregg wrote about this a few years ago [1].

My favorite part:

> WTF?? Why is ls(1) running stat() on /etc/localtime for every line of
> output?

[1] [http://www.brendangregg.com/blog/2014-05-11/strace-wow-
much-...](http://www.brendangregg.com/blog/2014-05-11/strace-wow-much-
syscall.html)

------
glandium
What is missing in this post is:

- Why does glibc check /etc/localtime every time localtime is called? Wild
guess: so that new values of /etc/localtime are picked up at runtime without
restarting programs.

- Corollary: why does glibc _not_ check /etc/localtime every time localtime
is called, when TZ is set to :/etc/localtime? Arguably the reason above should
still apply when TZ is set to a file name, shouldn't it?

~~~
jdamato
Hi, both are answered in the article:

First:

> What’s going on here is that the first call to localtime in glibc opens and
> reads the contents of /etc/localtime. All subsequent calls to localtime
> internally call stat, but they do this to ensure that the timezone file has
> not changed.

and second: read the section titled "Preventing extraneous system calls" for
the answer to your second question.
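
To make the "stat on every call, reload only on change" pattern concrete, here is a simplified model in C. This is not glibc's code: the `tz_cache` struct, the `tz_cache_refresh` name, and the choice to compare mtime, inode, and size are all assumptions made for illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical cache record for a file such as /etc/localtime. */
struct tz_cache {
    const char *path;
    int loaded;     /* 0 until the first successful refresh */
    time_t mtime;
    ino_t ino;
    off_t size;
};

/* Returns 1 if the file had to be (re)loaded, 0 if the cached copy is
   still current, and -1 if stat() failed. Note that the cheap path still
   costs one stat() system call per invocation. */
int tz_cache_refresh(struct tz_cache *c)
{
    struct stat st;

    if (stat(c->path, &st) != 0)
        return -1;

    if (c->loaded && st.st_mtime == c->mtime &&
        st.st_ino == c->ino && st.st_size == c->size)
        return 0;

    /* A real implementation would open, read, and parse the file here. */
    c->mtime = st.st_mtime;
    c->ino = st.st_ino;
    c->size = st.st_size;
    c->loaded = 1;
    return 1;
}
```

The first call does the expensive load and every later call pays only for the stat(), which matches the trace shown in the article.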

~~~
glandium
Those "answers" are more about how than why.

~~~
jdamato
Thanks for reading and I'm glad to hear you loved my post!

------
jonathonf
If this has a real-world/measurable/etc. impact why isn't this set by default?
Are there potential side-effects? Is it set in some distros but not others?

~~~
mixologic
Probably because while there may be tens of thousands of additional syscalls,
the total amount of added latency and resources consumed are more likely to be
on a scale of micro/nano/milli seconds.

~~~
shawnz
In the trace you can see that the syscall takes less than a tenth of a
millisecond. I don't think this is a big penalty to check if I have changed my
timezone or not, as unlikely as that is during normal operation.

------
leovonl
This seems to be a simple RTFM issue to me: POSIX specifies that gmtime() uses
UTC and localtime() uses current timezone. Using gmtime() would implement the
desired behaviour without any need to hardcode environment variables.
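
A minimal sketch of that approach, assuming the program only needs UTC timestamps. The function name `format_utc` and the ISO 8601 style format string are illustrative choices, not something from the article.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format t as a UTC timestamp. gmtime_r() converts to
   UTC, so it needs no local time zone rules. Returns the number of
   characters written, or 0 on failure. */
size_t format_utc(time_t t, char *buf, size_t len)
{
    struct tm tm;

    if (gmtime_r(&t, &tm) == NULL)
        return 0;
    return strftime(buf, len, "%Y-%m-%dT%H:%M:%SZ", &tm);
}
```

Calling `format_utc(0, buf, sizeof buf)` fills `buf` with the Unix epoch, `1970-01-01T00:00:00Z`.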

~~~
mwexler
...which fixes all the code you wrote, but of course, you may have legacy
binaries that you don't have access to source to change... hence a simple
setting of an environment variable, hardcoded though it may be, fixes the
situation for all.

Though PeterWillis makes a good point akin to yours, and your (plural) point
does make sense.

(Edit: added mention of comment with additional background on why to avoid
hardcoding the variable)

------
rdtsc
Great post. I remember when vDSOs were added we noticed a nice speedup in our
code. We tuned for realtime and a few microseconds here and there add up. Most
importantly, fewer system calls means more predictability.

------
blunte
This reminds me of a very similar behavior in Solaris over 20 years ago. Our C
application was having odd performance problems on some client systems, and
eventually we saw via truss that there were hundreds of fopen() calls every
second to get the timezone. Setting the right environment variable solved the
problem.

------
kelnos
I really enjoy when people dig into things like this and report their
findings. Having said that, I question the wisdom of "bothering" with this
sort of thing. Everything you do that's non-standard or works against a
system's default behavior incurs a cost. It's yet another thing you have to
replicate when you migrate to a new version, change provisioning systems, etc.

And for what benefit? A few hundred syscalls per second? Linux syscalls are
fast enough that something of that magnitude shouldn't matter much. Given that
/etc/localtime will certainly be in cache with that frequency of access, a
stat() should do little work in the kernel to return, so that won't be slow
either.

It's good that they did some benchmarking to look at the differences, but this
feels like a premature optimization to me. I can't imagine that this did
anything but make their application a tiny fraction of a percent faster. Was
it worth the time to dig into that for this increase? Was it worth the
maintenance cost I mention in my first paragraph? I wouldn't think so.

I'm really trying not to take a crap on what they did; as I said, it's really
cool to dig into these sorts of abstractions and find out where they're
inefficient or leak (or just great as a learning exercise; we all depend on a
mountain of code that most people don't understand at all). But, when looked
at from a holistic systems approach, a grab bag of little "tweaks" like this
can become harmful in the long run.

~~~
lotyrin
You should be able to get maintenance for something like this pretty low. Add
a line with a nice comment to the config template for running your apps that
adds an extra env var. Any machine that runs any app now has the line...

I would definitely like to see a before/after real-world metric on impact here
though.

------
scottlamb
There's another easy way to avoid this: use localtime_r instead of localtime.
From the glibc source:

    
    
        /* Update internal database according to current TZ setting.
           POSIX.1 8.3.7.2 says that localtime_r is not required to set tzname.
           This is a good idea since this allows at least a bit more parallelism.  */
        tzset_internal (tp == &_tmbuf && use_localtime, 1);
    

mktime also does the tzset call every time, though:

    
    
        time_t
        mktime (struct tm *tp)
        {
        #ifdef _LIBC
          /* POSIX.1 8.1.1 requires that whenever mktime() is called, the
             time zone names contained in the external variable 'tzname' shall
             be set as if the tzset() function had been called.  */
          __tzset ();
        #endif
    

and I don't see any way around that other than setting TZ=: or some such.
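
A sketch of the localtime_r route, under the assumption that the time zone only needs to be read once at startup. Since localtime_r is not required to behave as if tzset() had been called, this version calls tzset() explicitly. The names `init_timezone` and `format_local` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Call once at startup. If tz is non-NULL it overrides the TZ environment
   variable; either way tzset() loads the zone a single time. */
void init_timezone(const char *tz)
{
    if (tz != NULL)
        setenv("TZ", tz, 1);
    tzset();
}

/* Format t in local time using the reentrant localtime_r().
   Returns the number of characters written, or 0 on failure. */
size_t format_local(time_t t, char *buf, size_t len)
{
    struct tm tm;

    if (localtime_r(&t, &tm) == NULL)
        return 0;
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S", &tm);
}
```

Whether this really avoids the repeated stat depends on the libc and its version, so it is worth confirming with strace on your own system.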

------
rocky1138
It is perhaps out of scope of the article, but it sure would have been helpful
to show how to set the TZ environment variable and what to set it to.

~~~
mmozeiko
It's right there in the middle of the article (under the "Preventing
extraneous system calls" section):

    
    
        $ TZ=:/etc/localtime strace -ttT ./test
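
If you control the source and would rather not rely on the environment being set up for you, the same value can be set from inside the program before the first time conversion. This is a sketch under that assumption, and `pin_timezone` is a made-up name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Set TZ to the value recommended in the article unless the operator has
   already exported one (overwrite = 0), then load it once.
   Returns 0 on success, -1 if setenv() failed. */
int pin_timezone(void)
{
    if (setenv("TZ", ":/etc/localtime", 0) != 0)
        return -1;
    tzset();
    return 0;
}
```

Call it at the top of main(), before any thread is started, since setenv() is not safe to use concurrently with other environment access.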

~~~
rocky1138
I can't believe I missed that! Thanks.

------
falsedan
Why did this post start with a tl;dr, then a Summary, and _still_ buried the
important takeaway in the ultimate paragraph?

------
rtsisyk
BTW, packagecloud.io is a great hosting service for RPM/DEB packages. We've
been using it for the last couple of years. The GitHub + Travis CI +
PackageCloud combination allows us to build and publish packages for EVERY git
commit in 30+ repositories targeting 15 different Linux distributions [1].
There is no more need to hire a special devops guy for that.

[1]:
[https://github.com/packpack/packpack#packpack](https://github.com/packpack/packpack#packpack)

------
hodgesrm
Interesting article but I can't reproduce the behavior on Ubuntu 16.04 LTS. I
don't have TZ set (or anything locale-related for that matter). Here are the
library dependencies:

    
    
      $ ldd test
      linux-vdso.so.1 =>  (0x00007ffd80baf000)
      libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8844bf7000)
      /lib64/ld-linux-x86-64.so.2 (0x00007f8844fbc000)
    

Any thoughts why the behavior would be different?

~~~
cnvogel
There's also a surprising difference in behavior between tm = localtime() and
localtime_r(..., &tm).

The former is the "traditional" function which returns a pointer to a
statically allocated, global "struct tm". The latter is the thread-safe
version receiving a pointer to a user-supplied "struct tm" as its second
argument.

    
    
        :   do {
        :           t = time(NULL);
        :           localtime_r(&t, &tm);
        :           printf("The time is now %02d:%02d:%02d.\n",
        :                  tm.tm_hour, tm.tm_min, tm.tm_sec);
        :           sleep(1);
        :   } while(--N);
    

with TZ set to Europe/Berlin, set to :/etc/localtime, or unset I never get a
stat on anything.

    
    
        write(1, "The time is now 07:23:33.\n", 26The time is now 07:23:33. ) = 26
        nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffd9e798470) = 0
        write(1, "The time is now 07:23:34.\n", 26The time is now 07:23:34. ) = 26
        nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffd9e798470) = 0
        write(1, "The time is now 07:23:35.\n", 26The time is now 07:23:35. ) = 26
        nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffd9e798470) = 0
    
    

If I change it to tm = localtime()...

    
    
        stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
        write(1, "The time is now 07:30:56.\n", 26The time is now 07:30:56.) = 26
        nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc868c3010) = 0
    
    

One more reason to switch to the reentrant/thread-safe versions of those ugly
library functions :-).

Note, this is using glibc 2.24 under Arch.

    
    
        $ /lib/libc.so.6
        GNU C Library (GNU libc) stable release version 2.24, by Roland McGrath et al.
        (...)
        Compiled by GNU CC version 6.1.1 20160802.

------
acscott
This reminds me of setting noatime for disk mounts
([http://askubuntu.com/questions/2099/is-it-worth-to-tune-
ext4...](http://askubuntu.com/questions/2099/is-it-worth-to-tune-ext4-with-
noatime))

Now I want to know how many other configs there are that reduce the number of
system calls. This all adds up, and it becomes more significant the greater
the number of hosts in your environment.

------
actuator
While trying to find the cause of slowness in Rails requests, I was running
strace on a unicorn process when I encountered the same thing mentioned in
the article.

Rails instrumentation code calls current time before and after any
instrumentation block. So, when I looked at the trace there were a lot of
`stat` calls coming for `/etc/localtime`, and as stat is an IO operation, I
thought I had discovered the cause of the slowness (which I attributed to the
high number of IO ops). Surprisingly, when I looked at the strace summary, the
call count was high but the time taken by the calls in total was not
significant (<1% if I remember correctly). So I decided to set TZ with the next
AMI update 15 months back but forgot about it totally. I guess I should add it
to my Trello list this time.

Also, I think he should have printed the aggregate summary of just CPU clock
time (`-c`) as well, as that is usually very low.

~~~
caf
_...while the call count was high, the time taken by the calls in total was
not significant( <1% if I remember correctly)._

Yes, on ordinary filesystems if you run stat() over and over again on the same
file then it's just copying from the in-memory inode into your struct stat,
there's no IO.

------
astrostl
I do this for "not get annoyed while stracing" reasons, not perf!

------
rargulati
Really interesting - thanks for sharing the findings. I haven't seen it
mentioned here, but those of us using `timedatectl` via systemd, with the
default setting of `UTC`, are already taking advantage[1] of the
recommendation in the article.

[1]
[https://github.com/systemd/systemd/blob/master/src/timedate/...](https://github.com/systemd/systemd/blob/master/src/timedate/timedatectl.c#L85)

------
cbsmith
This reads to me like a glibc bug. Glibc should just be watching
"/etc/localtime" for changes, rather than calling stat() on it hundreds of
times a second.

~~~
gumby
And how does it watch it? With the stat() syscall.

~~~
Dylan16807
Polling is not watching.
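
For what "watching" could look like on Linux, here is a rough inotify sketch. It is an assumption about how one might do it in an application, not something glibc does, and the names `tz_watch_open` and `tz_watch_changed` are hypothetical.

```c
#include <errno.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Start watching path for changes. Returns a non-blocking inotify file
   descriptor, or -1 on failure. */
int tz_watch_open(const char *path)
{
    int fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);

    if (fd < 0)
        return -1;
    if (inotify_add_watch(fd, path,
                          IN_MODIFY | IN_ATTRIB | IN_CLOSE_WRITE |
                          IN_MOVE_SELF | IN_DELETE_SELF) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Drain pending events. Returns 1 if anything was reported since the last
   call, 0 if nothing was, and -1 on an unexpected read error. */
int tz_watch_changed(int fd)
{
    char buf[4096];
    int changed = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            changed = 1;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno == EAGAIN)
            return changed;
        return n == 0 ? changed : -1;
    }
}
```

Two caveats. Checking the descriptor with read() is still one system call per check, so this only pays off when the descriptor sits in an event loop (poll or epoll) the program already runs, which a C library does not have. Also, /etc/localtime is often a symlink that gets replaced rather than modified, so a robust version would probably need to watch the containing directory as well.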

------
drudru11
Side note - why do some sites completely hide information about who is behind
them? I couldn't find a single thing about that on their blog or main site.

~~~
jasonmp85
Maybe it's just that I use packagecloud and follow people in their circle on
Twitter, but Joe Damato is the CEO and founder:
[https://twitter.com/joedamato](https://twitter.com/joedamato)

I'm under the impression that he may also write a lot of these himself? Not
entirely certain, though.

~~~
jcdavis
Pretty sure he writes all of the linux internals posts. He also has a bunch of
great ones on his blog [http://timetobleed.com/](http://timetobleed.com/)

------
kseistrup
It would be highly inconvenient to have to set this variable if you live in a
country where you change the timezone twice a year due to summertime.

~~~
alexfoo
If TZ is set properly then you don't need to change it twice a year.

(man tzset for more info and examples of different values TZ can be set to.
Mine is set to "Europe/London" which handles the DST switches automatically.)

~~~
kseistrup
Hm…, now that I read TZSET(3): Wouldn't that be

    
    
        TZ=:Europe/London
    

It doesn't seem tzset() will accept the format

    
    
        TZ=Europe/London
    
    ?

~~~
noselasd
Yes, but " If the colon is omitted each of the above TZ formats will be
tried." , so it'll be figured out even if the colon is missing.

------
barrystaes
I'm not an expert, but the first things that come to mind are: 1) TFA does
not quantify the performance gain in time; 2) I wonder if environment variables
like TZ are a security risk/vector, in that they might let attackers
stealthily skew/screw time within the current user's process... no root required.

------
jakeogh
OpenRC users can:

    
    
      echo 'TZ=:/etc/localtime' > /etc/env.d/00localtime

~~~
JensRex
For Debian and derivatives:

    
    
        echo 'TZ=:/etc/localtime' >> /etc/environment

------
rtsisyk
The overhead of localtime() is well-known, just RTFM. Anyway, this article
provides a very good explanation.

------
creeble
Does anyone have any evidence of this actually having a resource usage impact
on any common programs?

I see one reference to Apache below, but not whether it actually made a
measurable difference.

------
vesinisa
This was a thoroughly fascinating read. Highly recommend reading the previous
part in the series as well.

------
mozumder
Does this affect FreeBSD as well?

~~~
loeg
Experimentally, no. The example program calls localtime(3) 10 times but only
accesses the file once, per truss:

    
    
        write(1,"Greetings!\n",11)			 = 11 (0xb)
        access("/etc/localtime",R_OK)			 = 0 (0x0)
        open("/etc/localtime",O_RDONLY,037777777600)	 = 3 (0x3)
        fstat(3,{ mode=-r--r--r-- ,inode=11316113,size=2819,blksize=32768 }) = 0 (0x0)
        read(3,"TZif20000000000000"...,41448) = 2819 (0xb03)
        close(3)					 = 0 (0x0)
        issetugid()					 = 0 (0x0)
        open("/usr/share/zoneinfo/posixrules",O_RDONLY,00) = 3 (0x3)
        fstat(3,{ mode=-r--r--r-- ,inode=327579,size=3519,blksize=32768 }) = 0 (0x0)
        read(3,"TZif20000000000000"...,41448) = 3519 (0xdbf)
        close(3)					 = 0 (0x0)
        write(1,"Godspeed, dear friend!\n",23)		 = 23 (0x17)
    

(FreeBSD caches the database on the first call:
[https://svnweb.freebsd.org/base/head/contrib/tzcode/stdtime/...](https://svnweb.freebsd.org/base/head/contrib/tzcode/stdtime/localtime.c?view=markup#l1437)
)

------
dpatru
It seems to me that this is something the Linux distributions should already
be doing.

------
sandGorgon
Does anyone know if this impacts Docker images as well?

~~~
pquerna
Yes, at least the Docker images that use glibc as their libc. (eg, most
Debian/Ubuntu images)

It looks like musl, which is used on Alpine Linux images for example, will
only read it once, and then cache it:

[https://github.com/esmil/musl/blob/master/src/time/__tz.c#L1...](https://github.com/esmil/musl/blob/master/src/time/__tz.c#L121)

It has a mutex/lock around the use of the TZ info, but avoids re-stat'ing the
localtime file.

~~~
nathancahill
This is the best part of HN. Not only did you answer GP's question in 6
minutes, but you link to the exact line of the source code.

