Python datetime objects, by design, can be naive or timezone-aware. Timezone-aware datetime objects are OK; they identify a certain instant in time.
Naive datetime objects are Python-only abstractions (AFAIK) that don't identify anything in the real world; they're highly error prone, because there's no "right" way to use them.
They sort of work properly only if used in a very limited scope - e.g. your own code only, for small sections - but they're risky because they're not a different type (with respect to tz-aware), and it's hard to tell what any code accepting a datetime does if passed a naive object. Some libraries like java.time DO have a similar concept (e.g. LocalTime, LocalDate) but they keep it well separate from the "real" concept (e.g. Instant or Date in Java) so you can't use them accidentally.
Example: you pass a naive datetime object to any library which must translate it to an instant, like an ISO string with a well defined timezone. What does the library do? Throw an exception? Associate an arbitrary timezone (e.g. UTC)? Associate the local, current timezone? There's no "correct" behaviour.
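To make the ambiguity concrete, here is a minimal sketch using only the standard library. The date is arbitrary and the variable names are just for illustration.

```python
from datetime import datetime, timezone

naive = datetime(2020, 3, 23, 9, 50, 1)
aware = datetime(2020, 3, 23, 9, 50, 1, tzinfo=timezone.utc)

# The naive value serializes without any offset, so the reader has to guess.
print(naive.isoformat())  # 2020-03-23T09:50:01
print(aware.isoformat())  # 2020-03-23T09:50:01+00:00

# Both are plain `datetime` instances, yet ordering them is refused at runtime.
try:
    naive < aware
    mixed_comparison_allowed = True
except TypeError:
    mixed_comparison_allowed = False

print(mixed_comparison_allowed)  # False
```

The type checker sees no difference between the two values; the mismatch only shows up when the comparison actually runs.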
I agree that python datetime objects are problematic, but for the opposite reason. It is tzinfo that is the sneaky disaster, the plain datetimes are fine.
Transparent timezone awareness always fails, unless you are 100% certain that a tz-aware datetime object will remain unconverted from the very top to the very bottom of the stack and all the way up again, no matter who is reading and what they are doing.
For long-term minimization of pain, bugs and effort, you convert datetimes to UTC as early as possible and take them back to some localized version as late as possible (in the frontend, for a webapp), so that the backend never needs to know there is such a thing as timezones. The exception is separate validation and correction routines, since timezone definitions always end up being incorrect to some degree when you use them at scale.
If the localization of the datetime is an essential aspect (such as the departure time of a ship leaving port), you store a UTC value together with a record of the location. Only at the latest possible moment of processing should you do a lookup on the location data to make a local time.
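A rough sketch of that "UTC in, local out" flow, assuming Python 3.9+ with `zoneinfo` and available tzdata. The helper names `to_utc` and `to_local`, the record layout and the departure itself are made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(wall_time: datetime, zone_name: str) -> datetime:
    """At the edge: interpret a naive wall time in the named zone, convert to UTC."""
    return wall_time.replace(tzinfo=ZoneInfo(zone_name)).astimezone(timezone.utc)

def to_local(instant: datetime, zone_name: str) -> datetime:
    """At the last moment: look up the location's zone and render a local time."""
    return instant.astimezone(ZoneInfo(zone_name))

# Hypothetical stored record: a UTC instant plus a record of the location.
record = {
    "departure_utc": to_utc(datetime(2021, 8, 1, 18, 0), "Europe/Rome"),
    "port_zone": "Europe/Rome",
}

print(record["departure_utc"].isoformat())
print(to_local(record["departure_utc"], record["port_zone"]).isoformat())
```

Everything between the two helper calls only ever sees the UTC value.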
Obviously, there will be exceptions to this rule. If you batch process billions of timestamps under a tight deadline and must do calculations in local time, it might make sense to have the values persisted localized.
Actually UTC + timezone is exactly the wrong thing for "wall clock times" (things like meetings or departures where the time at the location is relevant).
The conversion to UTC will lose the original local time so you cannot retrieve it once time zone data changes, unless you perform reconversions every time you detect such a change in tzdata. And countries changing time zones happens more often than we think (and also on short notice).
Thus it is important to distinguish between instants (e.g. for recording when exactly something happened after the fact) and wall clock time (e.g. for coordinating people and goods at a certain place, like meetings, concerts, departure times). For the former use UTC, for the latter use a localised time zone (e.g. Europe/Rome), not an offset time zone (e.g. not +0200).
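The difference between a localised zone and an offset zone can be sketched like this, assuming Python 3.9+ with `zoneinfo` and tzdata installed. The dates are arbitrary.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

rome = ZoneInfo("Europe/Rome")            # localised zone: carries DST rules
plus_two = timezone(timedelta(hours=2))   # offset zone: always +02:00

summer = datetime(2021, 7, 1, 9, 0)
winter = datetime(2021, 12, 1, 9, 0)

rome_summer_offset = summer.replace(tzinfo=rome).utcoffset()
rome_winter_offset = winter.replace(tzinfo=rome).utcoffset()
fixed_winter_offset = winter.replace(tzinfo=plus_two).utcoffset()

print(rome_summer_offset)   # 2:00:00
print(rome_winter_offset)   # 1:00:00
print(fixed_winter_offset)  # 2:00:00
```

A 9am winter meeting stored with +0200 lands an hour away from 9am Rome time.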
For more information, Jon Skeet has written about this multiple times.
I believe this "wall clock time" approach is broken by design as it pushes the burden of figuring out timezone details to those who are not located in that particular timezone.
A fair and therefore safer approach is to decide that by protocol the legally binding time is defined in UTC.
Your system will translate UTC times to and from any given local time using the IANA time zone database, which is regularly updated. End users must be aware of the UTC time, that it is legally binding, and that the local time conversions are provided as-is.
This way the time of a meeting or deadline is protected from local governments messing around with timezone changes.
Additionally, dates are rendered in ISO8601 standard format with a proper footnote to help users learn about international standards.
I think whether UTC or wall clock time is binding is a problem in the legal and planning (so the business) domain and has to be treated as an external input to the software engineering problem.
> The conversion to UTC will lose the original local time so you cannot retrieve it once time zone data changes, unless you perform reconversions every time you detect such a change in tzdata
I don't agree/disagree with your point, nor do I agree/disagree with GP on the topic, but why couldn't I retrieve the original time? If, for an event, I save UTC + event TimeZone, I can always get back to the original time (actually, it doesn't even matter whether the timestamp is UTC; it's enough for it to have an explicit offset, i.e. to be the representation for an instant). Why should I change the timezone on a saved record? What usually changes is the user's timezone, not the records'.
> Thus it is important to distinguish between instants and wall clock time
Yes.
> For more information Jon Skeet has written about this multiple times.
I have read many things on datetime; would you care to share a couple of relevant links?
Offset timezones (e.g. UTC+2) don't change, what changes are local timezones (e.g. Europe/Rome).
For example here Turkey decided to change daylight savings time: https://github.com/JodaOrg/joda-time/issues/403 (if you have a look at the tzdata database you will find more, this one I remember because Turkey went back and forth about this).
If your timezone database changes you cannot retrieve the original wall clock time, unless you have a temporal timezone database and remember the date of conversion to UTC.
And if you used offsets instead of local timezones to begin with, you cannot even infer which offsets to change unless you have location data saved as well.
> For me, the key difference between the options is that in option 3, we store and never change what the conference organizer entered. The organizer told us that the event would start at the given address in Amsterdam, at 9am on July 10th 2022. That’s what we stored, and that information never needs to change (unless the organizer wants to change it, of course). The UTC value is derived from that “golden” information, but can be re-derived if the context changes – such as when time zone rules change
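A small simulation of that rule change. The two fixed offsets are hypothetical stand-ins for the same zone before and after a tzdata update (loosely modelled on a zone dropping its winter switch); a real system would use the IANA zone ID rather than fixed offsets.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-ins for one zone's rules in two tzdata versions.
rules_when_saved = timezone(timedelta(hours=2))
rules_after_update = timezone(timedelta(hours=3))

entered_local = datetime(2016, 12, 1, 9, 0)  # the "golden" value the organizer typed

# Approach A: only the derived UTC instant was stored.
stored_utc = entered_local.replace(tzinfo=rules_when_saved).astimezone(timezone.utc)
shown_after_update = stored_utc.astimezone(rules_after_update)
print(shown_after_update.time())  # 10:00:00, no longer the 09:00 that was entered

# Approach B: the local time was stored, so UTC is simply re-derived.
rederived_utc = entered_local.replace(tzinfo=rules_after_update).astimezone(timezone.utc)
print(rederived_utc.time())  # 06:00:00
```

In approach A nothing in the stored value records that it was derived under the old rules, which is why the original 09:00 cannot be recovered without extra information.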
I'm sorry for the late response; yes, you are right that for future events the "right way to do it" is saving the place + local time. I think we were speaking of slightly different things.
> I agree that python datetime objects are problematic, but for the opposite reason. It is tzinfo that is the sneaky disaster, the plain datetimes are fine.
Why should naive datetimes be fine? How are they fine? What do they represent?
> For longterm minimization of pain, bugs and effort, you convert datetimes to UTC
This COULD work if naive objects had an IMPLIED UTC in their contract - e.g. naive objects are declared as ALWAYS UTC. Your argument fails as soon as you pass a naive datetime object somewhere in a library/framework and it gets accepted, and/or you try serializing it without augmenting it with a TZ. As I said, naive datetimes only work if you control 100% of their usage. No libraries, no external points of contact. And the fact that tz-aware objects sometimes fail for the opposite reason (e.g. libraries assume naive objects) is a fault of the API design (they're not distinct types); the problem lies in the existence of the naive version, not vice versa.
For the record: in the backend I always use tz-aware datetime objects with a fixed UTC timezone. That's the best way IMHO not to go crazy with time problems in Python. So, your points about datetime handling are all valid and correct (timezone is mostly a "UI problem" and should not leak into the backend) but they don't prove your "naive works better" argument.
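One way to enforce that at a backend boundary is a small guard. This is only a sketch, and `require_utc` is a hypothetical helper name, not something the standard library provides.

```python
from datetime import datetime, timedelta, timezone

def require_utc(value: datetime) -> datetime:
    """Reject naive datetimes; normalize aware ones to UTC."""
    if value.tzinfo is None or value.utcoffset() is None:
        raise ValueError("naive datetime rejected: attach a timezone first")
    return value.astimezone(timezone.utc)

# An aware value in +02:00 is normalized to the same instant in UTC.
from_client = datetime(2020, 3, 23, 11, 50, 1, tzinfo=timezone(timedelta(hours=2)))
normalized = require_utc(from_client)
print(normalized.isoformat())  # 2020-03-23T09:50:01+00:00

# A naive value is refused instead of being silently interpreted.
try:
    require_utc(datetime(2020, 3, 23, 9, 50, 1))
    naive_accepted = True
except ValueError:
    naive_accepted = False
```

Since naive and aware values share one type, a runtime check like this stands in for the separation the type system does not provide.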
> Why naive datetimes should be fine? How are they fine? What do they represent?
They represent the date/time wherever the user is (location-independent). If I want to take a pill every Monday and Thursday at 10am, I don't want to get a notification at 5am just because I moved from the UK to NY.
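A sketch of the pill case, assuming Python 3.9+ with `zoneinfo` and tzdata. `PILL_TIME` and `next_reminder` are made-up names; the date is an arbitrary Monday.

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo

PILL_TIME = time(10, 0)  # naive on purpose: "10am wherever I am"

def next_reminder(day: date, user_zone: str) -> datetime:
    """Resolve the location-independent time against the user's current zone."""
    return datetime.combine(day, PILL_TIME, tzinfo=ZoneInfo(user_zone))

monday = date(2021, 6, 7)
in_london = next_reminder(monday, "Europe/London")
in_new_york = next_reminder(monday, "America/New_York")

# Same wall-clock reading, two different instants.
gap = in_new_york - in_london
print(in_london.hour, in_new_york.hour)  # 10 10
print(gap)  # 5:00:00
```

The naive value only becomes an instant once it is paired with a zone at notification time.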
This is LocalDate/LocalTime in java.time/joda.time parlance. But it's a different beast; in fact, you're talking about a repeated action. But if I tell you that on "March 23rd, 2020, 9:50:01am" I did something, what does that mean to you? When did it happen? That's a naive datetime.
It's got its place: but the idea that the API and the usage should be similar to a precise representation of time, as if the two were interchangeable is... dangerous, and it's the source of a lot of problems with dt in the Python world.
Or maybe you do want to take the pill at 5am, since you are only there for a few days and it is critical that you maintain an exact 24 hour interval between doses.
As an assembly worker in the timestamp-wrapper-class factory, I am not in a position to try being clever about it. :-)
There is indeed no contract that datetimes without tzinfo are UTC. But there is a contract that they don't have a builtin timezone or DST concept and that you must handle that separately.
> What does the library do? Throw an exception? Associate an arbitrary timezone (e.g. UTC)? Associate the local, current timezone?
Naive datetime is what datetime.utcnow() returns. UTC is essentially a "default" timezone. I've always thought it made most sense in a library to assume it's UTC.
Your assumption is just a guess. Take a look at the python docs I linked just above here: "A naive object does not contain enough information to unambiguously locate itself relative to other date/time objects. Whether a naive object represents Coordinated Universal Time (UTC), local time, or time in some other timezone is purely up to the program"
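For reference, a quick check of what the two calls return. Note that `datetime.utcnow()` is deprecated since Python 3.12, so the sketch silences that warning.

```python
import warnings
from datetime import datetime, timezone

with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    legacy_now = datetime.utcnow()   # UTC wall-clock fields, but no tzinfo

aware_now = datetime.now(timezone.utc)  # same instant, explicitly labelled UTC

print(legacy_now.tzinfo)  # None
print(aware_now.tzinfo)   # UTC
```

Nothing on `legacy_now` records that it came from `utcnow()`, so any code receiving it is back to guessing.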
I'd consider any datetimes that have an implicit timezone that isn't or might not be UTC a bug in any system.
This isn't restricted to python. Servers that spit out logs with timestamps, for instance, should be spitting out UTC.
It makes sense to build systems that deal with timezones at the very edges (and sometimes not even then) and use UTC for everything else. It's simpler that way.
Unfortunately, in Python, whenever you try to make a naive datetime "aware" by using the provided conversion method (`astimezone()`), it will assume the naive datetime is in the local time zone, not UTC.
The datetime module provides `timezone.utc` to be used whenever you want to have datetimes _in utc_, but it needs to be explicitly used by the programmer.
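A short sketch of the two behaviours side by side. The naive value is an arbitrary example that the programmer intends as UTC.

```python
from datetime import datetime, timezone

naive = datetime(2020, 3, 23, 9, 50, 1)  # intended as UTC, but nothing says so

# astimezone() with no argument treats the naive value as system local time.
assumed_local = naive.astimezone()

# replace() labels the same wall-clock fields as UTC without shifting them.
labelled_utc = naive.replace(tzinfo=timezone.utc)

print(labelled_utc.isoformat())   # 2020-03-23T09:50:01+00:00
print(assumed_local.isoformat())  # offset depends on the machine's local zone
```

On a machine whose local zone is not UTC, the two results are different instants even though they came from the same naive value.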