
What the Royal Astronomical Society in 1884 Tells Us About Python Today - skilpat
https://typesandtimes.net/2019/05/royal-astronomical-society-python
======
deepsun
I believe this was all sorted out long ago in Java by the Joda-Time library,
which was later reimplemented in the core JDK as java.time.

~~~
skilpat
Indeed! That library, as incorporated into the jdk, seems to me like the gold
standard of datetime programming in standard libraries. Good type-level
distinctions, good defaults, good extensibility... _chef kissing fingers_

------
bryanrasmussen
OK, but I don't understand why the bug hasn't been fixed. And is there any
other widely used time localization library that makes the same mistake, not
just in Python but in other languages?

~~~
sametmax
Just like people don't use the http module but requests, it's been years since
the community moved away from manual manipulation of datetime/pytz for time
zones.

Nowadays people use higher level libs such as pendulum:

    
    
        >>> print(pendulum.datetime(2019, 5, 21, 12, 30, tz='America/New_York'))
        2019-05-21T12:30:00-04:00
    

It avoids many gotchas, gives you more features and has a nicer API.

Like skilpat said, dateutil is a better fit than pytz, and hence pendulum uses
it, as well as pytzdata, to stay up to date.

~~~
falcor84
{{citation needed}}

Your assertion reminded me of this funny dialog about JS:
[https://hackernoon.com/how-it-feels-to-learn-javascript-in-2016-d3a717dd577f](https://hackernoon.com/how-it-feels-to-learn-javascript-in-2016-d3a717dd577f)

~~~
sametmax
"pip install pendulum" is not complex.

Using pendulum is not complicated.

You can skim the doc in 5 minutes, your intern can do it too.

This is one of those tools that removes complexity when you use them.

Also, pendulum and the stdlib datetime module are compatible, making migration
painless:

    
    
        >>> pendulum.now() - datetime.now(pendulum.now().tz)
        <Period [2019-05-26T16:57:35.872732+02:00 -> 2019-05-26T16:57:35.872141+02:00]>
    

In the end, pendulum doesn't require you to install a transpiler, 100 plugins
and create a configuration file like the post you link to. But it does save
you from bugs, and you don't need to be an expert in time to use it.

I see only wins.

~~~
pmahoney
> "pip install pendulum" is not complex.

For much of the work I do, there's a _big_ jump in complexity from using
python (2 or 3) and its stdlib vs. requiring a library. A script using only
the stdlib is easy to distribute and get working on developer, ci, and
production machines. Once an external library is required, I need machinery or
scripts to manage or ensure presence of those dependencies.

~~~
Filligree
It's probably no good to you, but one of the reasons I like NixOS is the
simplicity of creating single file scripts including dependencies. For
example, something using Pendulum and ffmpeg together would start like this:

    
    
        #!/usr/bin/env nix-shell
        #!nix-shell -i python -p python pythonPackages.pendulum ffmpeg
    

Then you just put the code after that.

~~~
pmahoney
I love Nix and NixOS personally... but it's been a tough sell at work,
unfortunately.

Even with simple shell scripts... it's so easy to invoke programs with GNU
extensions and later find they fail on a co-worker's macOS machine. And I often
have a shell.nix sitting right there that defines the complete dependency
closure; very frustrating to not be able to use it.

------
theoh
This article seems a bit misguided in its dismissal of the significance of
historical fact. The last paragraph seems to me to get unnecessarily snotty
about somebody making a very precise best-guess reconstruction of a historical
value/location. "Out-of-thin-air" values aren't something I had heard of but
apparently that category has to do with circular reasoning—so it doesn't apply
literally in this case. It's just a self-satisfied way of smearing someone
else's good faith work.

~~~
skilpat
That's a pretty wild interpretation! I have nothing but respect for the tz db
and its contributors; I indicated as much quite explicitly by stating my
appreciation for the commentary. Personally I will soon be sending historical
corrections to the precise days and times of DST changes in the 1940s for a
few places in US and Canada, from my own research.

But yes, it's absolutely a tongue-in-cheek reference to the unrelated notion
of "out-of-thin-air" values and reads in memory model semantics. (I've just
added a link to an explanation by some PL researchers.)

~~~
theoh
Nothing but respect? The word "hobbyist" suggests otherwise.

------
phonethrowaway
They should have used UTC for the transitions. This is an entire article built
around a faulty presupposition and naive objects.

I wrote about this here a few years ago:
[https://gordol.github.io/date_time_manipulation.html](https://gordol.github.io/date_time_manipulation.html)

Everyone here saying to use pendulum... you should definitely read this,
because it's about a very similar bug in pendulum with datetime transitions
across time change thresholds.

------
svat
It took me some effort to understand the issue here, so an alternative
explanation in case it helps someone.

First, the part that's independent of programming language. You may want to
read about absolute time and civil time (e.g. from
[https://abseil.io/docs/cpp/guides/time](https://abseil.io/docs/cpp/guides/time))
but if you don't, in short: “civil time” refers to something like “2019 May 26
at 2:45 pm in New York City” (or “in the America/New_York time zone”), which
means (roughly) whatever time the locals in New York City (or a larger shared
geo-political zone) would agree is 2:45 pm on that date. To convert this to an
absolute time, or in other words to make sense of “2019-05-26 14:45 in
America/New_York”, we need data about the real world _as of that date_ : most
obviously we need to know whether Daylight-Saving Time was in effect on that
date, but also what conventions were in use at the time. (This also means it's
hard to know for certain what such a notation in the future means in terms of
absolute time, as possibly DST could be abolished or the dates when it comes
into effect could change.)

It so happens that in 1884 the conventions of New York City were such that it
was about 4 minutes ahead of the then-recently standardized Eastern Time, so
about 4 hours and 56 minutes behind GMT.

So, in any “correct” library, we should see the following respected:

• “2019 May 26 at 2:45 pm in New York” should mean “2019 May 26 at 18:45 UTC”
(timezone is EDT i.e. UTC minus 4 hours).

• “2019 Jan 26 at 2:45 pm in New York” should mean “2019 Jan 26 at 19:45 UTC”
(timezone is EST, i.e. UTC minus 5 hours).

• “1884 Jan 26 at 2:45 pm in New York” should mean “1884 Jan 26 at 19:41 UTC”
(timezone is... GMT minus 4 hours and 56 minutes).
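
As a rough sanity check of the three cases above, here is a minimal sketch
using only the standard library. The offsets are hard-coded by hand from the
numbers quoted in this comment (an assumption for illustration; a real program
would look them up in the tz database), and `to_utc` is a hypothetical helper
name:

```python
from datetime import datetime, timedelta, timezone

# Offsets hard-coded from the bullets above; illustration only.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")
LMT = timezone(-timedelta(hours=4, minutes=56), "LMT")

def to_utc(civil, offset):
    """Attach a known fixed offset to a naive civil time, then convert to UTC."""
    return civil.replace(tzinfo=offset).astimezone(timezone.utc)

may_2019 = to_utc(datetime(2019, 5, 26, 14, 45), EDT)
jan_2019 = to_utc(datetime(2019, 1, 26, 14, 45), EST)
jan_1884 = to_utc(datetime(1884, 1, 26, 14, 45), LMT)

for result in (may_2019, jan_2019, jan_1884):
    print(result.isoformat())
```

The same wall-clock reading maps to three different UTC times because the
offset that applies depends on the date, which is exactly the lookup a
timezone library is supposed to do for you.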

\----

Now the part that's Python-specific: the pytz library in Python provides two
ways of constructing such a well-formed civil time. One is to call `.localize`
on a timezone, and the other is to call `.astimezone` to convert from one
civil time to its equivalent (the same absolute time) in another timezone,
thus obtaining a new civil time. Both are illustrated below, showing it
working properly:

    
    
        >>> pytz.timezone('America/New_York').localize(datetime.datetime(2019, 5, 26, 14, 45, 0)).astimezone(pytz.utc)
        datetime.datetime(2019, 5, 26, 18, 45, tzinfo=<UTC>)
        
        >>> pytz.timezone('America/New_York').localize(datetime.datetime(2019, 1, 26, 14, 45, 0)).astimezone(pytz.utc)
        datetime.datetime(2019, 1, 26, 19, 45, tzinfo=<UTC>)
        
        >>> pytz.timezone('America/New_York').localize(datetime.datetime(1884, 1, 26, 14, 45, 0)).astimezone(pytz.utc)
        datetime.datetime(1884, 1, 26, 19, 41, tzinfo=<UTC>)
    

Unfortunately, there's a third thing a programmer can do, which the
documentation warns against
([http://pytz.sourceforge.net/#localized-times-and-date-arithmetic](http://pytz.sourceforge.net/#localized-times-and-date-arithmetic)),
and that is to pass one of pytz's timezone objects as the “tzinfo” parameter
to the standard library `datetime.datetime` constructor:

    
    
        >>> datetime.datetime(2019, 5, 26, 14, 45, 0, tzinfo=pytz.timezone('America/New_York')).astimezone(pytz.utc) # Don't do this!
        datetime.datetime(2019, 5, 26, 19, 41, tzinfo=<UTC>)
        >>> datetime.datetime(2019, 1, 26, 14, 45, 0, tzinfo=pytz.timezone('America/New_York')).astimezone(pytz.utc) # Don't do this!
        datetime.datetime(2019, 1, 26, 19, 41, tzinfo=<UTC>)
        >>> datetime.datetime(1884, 1, 26, 14, 45, 0, tzinfo=pytz.timezone('America/New_York')).astimezone(pytz.utc) # Don't do this!
        datetime.datetime(1884, 1, 26, 19, 41, tzinfo=<UTC>)
    

which is certainly consistent in its own way, but only the last one is
correct. Oops.

The issue here is in the interaction between the “tzinfo” model of the
standard-library `datetime` and pytz's timezone objects: the result is that
when the two are used together in the above incorrect way, one ends up with a
timezone that is a fixed offset from UTC, which is silly. A timezone like
`America/New_York` is _not_ a fixed offset from UTC: not only does it change
twice a year, it also has changed in arbitrary ways in the past, and may
change in arbitrary ways in the future.
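
To illustrate the difference, here is a toy sketch (not how pytz or dateutil
are actually implemented). The standard library asks a `tzinfo` object for its
offset by calling `utcoffset(dt)`; an object that ignores `dt` can only ever
be a fixed offset. `FixedOffset` and `ToyEastern` are hypothetical classes,
and `ToyEastern` assumes a simplified version of the current US rule (second
Sunday of March to first Sunday of November, at whole-day granularity, with no
history and no handling of the 2 a.m. switch):

```python
from datetime import datetime, timedelta, timezone, tzinfo

class FixedOffset(tzinfo):
    """Ignores the datetime it is asked about: one offset forever."""
    def __init__(self, offset, name):
        self._offset = offset
        self._name = name
    def utcoffset(self, dt):
        return self._offset
    def dst(self, dt):
        return timedelta(0)
    def tzname(self, dt):
        return self._name

def nth_sunday(year, month, n):
    """Midnight on the n-th Sunday of the given month."""
    first = datetime(year, month, 1)
    days_to_sunday = (6 - first.weekday()) % 7  # Monday is 0, Sunday is 6
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))

class ToyEastern(tzinfo):
    """Looks at the datetime before answering (simplified, assumed rule)."""
    def _is_dst(self, dt):
        naive = dt.replace(tzinfo=None)
        return nth_sunday(dt.year, 3, 2) <= naive < nth_sunday(dt.year, 11, 1)
    def utcoffset(self, dt):
        return timedelta(hours=-4) if self._is_dst(dt) else timedelta(hours=-5)
    def dst(self, dt):
        return timedelta(hours=1) if self._is_dst(dt) else timedelta(0)
    def tzname(self, dt):
        return "EDT" if self._is_dst(dt) else "EST"

lmt = FixedOffset(-timedelta(hours=4, minutes=56), "LMT")
toy = ToyEastern()

fixed_may = datetime(2019, 5, 26, 14, 45, tzinfo=lmt).astimezone(timezone.utc)
fixed_jan = datetime(2019, 1, 26, 14, 45, tzinfo=lmt).astimezone(timezone.utc)
toy_may = datetime(2019, 5, 26, 14, 45, tzinfo=toy).astimezone(timezone.utc)
toy_jan = datetime(2019, 1, 26, 14, 45, tzinfo=toy).astimezone(timezone.utc)
```

With the fixed offset both dates land on 19:41 UTC, mirroring the wrong output
above; with the date-aware object they land on 18:45 and 19:45.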

(Note that “fixing” the offset of 4 hour 56 minutes to 5 hours would not solve
any problems as it would still be wrong many months of each year — arguably,
having an obviously incorrect result may even be better than a sometimes-
correct one.)

The linked blog post by Paul Ganssle
([https://blog.ganssle.io/articles/2018/03/pytz-fastest-footgun.html](https://blog.ganssle.io/articles/2018/03/pytz-fastest-footgun.html)),
the maintainer of the `dateutil` library (not to be confused with the
standard-library `datetime`), is also informative.

------
mixmastamyk
Anyone know how/if this issue affects Django, which uses pytz?

------
solveit
Irresistible title.

------
deepsun
> pytz.timezone('America/New_York').localize( datetime(2019, 5, 21, 12, 30))

But wait, the datetime argument doesn't specify an exact instant in time,
because it's "zone-less" itself!

So the above code can also be buggy/unclear, if instead of 2019 we'd use a
datetime close to the switch time.

We should provide a time zone to the datetime param as well. Better GMT,
otherwise we would need to localize that one as well, falling into an infinite
loop.

~~~
tinix
The `localize()` method on a pytz timezone takes a naive datetime as input.

The pytz docs are pretty much on-point, too:

> "The preferred way of dealing with times is to always work in UTC,
> converting to localtime only when generating output to be read by humans."

This, also, is the problem with this article, and it is a really common pain
point for programmers, both new and seasoned.
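
A minimal sketch of that workflow with only the standard library: keep and
compare instants in UTC, and convert only when formatting for a human. The
fixed `EDT` offset and the `format_for_display` helper are assumptions for
illustration; a real program would pick the display zone from the tz database:

```python
from datetime import datetime, timedelta, timezone

# Assumed display offset, hard-coded for illustration only.
EDT = timezone(timedelta(hours=-4), "EDT")

def format_for_display(instant_utc, display_tz):
    """Convert a UTC instant to local time only at the output boundary."""
    return instant_utc.astimezone(display_tz).strftime("%Y-%m-%d %H:%M:%S %Z%z")

# All storage and arithmetic happen in UTC.
start_utc = datetime(2019, 5, 26, 18, 45, tzinfo=timezone.utc)
end_utc = start_utc + timedelta(hours=36)
elapsed = end_utc - start_utc

shown = format_for_display(end_utc, EDT)
print(shown)
```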

~~~
deepsun
Yes, but my point still holds, the code above is buggy/unclear. And
documentation supports this:

        >>> loc_dt = eastern.localize(datetime(2002, 10, 27, 1, 30, 00))
        >>> loc_dt.strftime(fmt)
        '2002-10-27 01:30:00 EST-0500'

> As you can see, the system has chosen one for you and there is a 50% chance
> of it being out by one hour.

And the solution, you're right, is to not use the code like above. But the
article doesn't mention that at all.
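
To make the ambiguity concrete, here is a small sketch with hand-picked fixed
offsets (an assumption for illustration; real code would get them from the tz
database). Two different UTC instants, one hour apart, both read 01:30 on the
wall clock in US Eastern time on 2002-10-27:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets chosen by hand for illustration only.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Clocks went back from 02:00 EDT to 01:00 EST at 06:00 UTC that day.
before_switch = datetime(2002, 10, 27, 5, 30, tzinfo=timezone.utc).astimezone(EDT)
after_switch = datetime(2002, 10, 27, 6, 30, tzinfo=timezone.utc).astimezone(EST)

# Strip the offset to see what a wall clock would have shown.
wall_before = before_switch.replace(tzinfo=None)
wall_after = after_switch.replace(tzinfo=None)

print(wall_before, wall_after)       # same wall-clock reading
print(after_switch - before_switch)  # yet the instants differ
```

So a naive 01:30 on that date cannot identify an instant by itself. pytz lets
the caller disambiguate with the `is_dst` argument to `localize()`, and the
standard library has the `fold` attribute for the same purpose.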

