I've noted before and will note again that the amount of hate Python got for the 2-to-3 deprecations has led to a much worse problem: piecemeal low-visibility updates which are much harder to plan for.
As a contractor, I singlehandedly migrated 4 different codebases from 2 to 3; a total of just under a million lines of code, and none of these codebases took more than 40 hours. This was not some horrific burden for most companies using Python. The people who flamed Python for these breaking changes were either extreme outliers or the sort of people who would be happy with the language never making breaking changes, ever, no matter how bad of mistakes in design were discovered.
I would strongly prefer that datetime.utcnow were not deprecated in a 3.x release, but instead deprecated along with the GIL and any other breaking changes, in Python 4.0. Yes, it would get them flamed. And the flamers are wrong. At least that way people can plan for the transition at their leisure instead of constantly worrying that a 3.x release will unexpectedly break their codebase if they don't read the release notes before every update.
> The people who flamed Python for these breaking changes were either extreme outliers or the sort of people who would be happy with the language never making breaking changes, ever, no matter how bad of mistakes in design were discovered.
Count me in the "don't make breaking changes" camp. You can spit out warnings to prevent developers from using certain language features in new code, but changes which outright break old code should be extremely rare.
This is how Javascript works (the language, not the ecosystem around it), and it means I can update my web browser and continue visiting old websites.
I don't think most languages should go as far as Javascript. However, the case for a breaking change should be a heck of a lot stronger than "we think all timestamps should have timezones now." That's a fine opinion to have, but it doesn't mean you should go break stuff.
> However, the case for a breaking change should be a heck of a lot stronger than "we think all timestamps should have timezones".
How about if you rephrase the case as "we've sampled codebases that use naive timestamps, and at least 50% of them had critical bugs related to mistaken assumptions made about timezones by functions used to produce vs consume those timestamps. Your code is already broken. This update just makes the compiler force you to rewrite your code in a way that makes the errors obvious and/or into type errors — and, therefore, which forces you to fix those bugs."
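Python's runtime already does a version of this the moment naive and aware values meet; a minimal sketch (variable names are made up):

    from datetime import datetime, timezone

    deadline = datetime(2024, 1, 1, 0, 0)     # naive: no timezone attached
    now = datetime.now(timezone.utc)          # aware: explicitly UTC

    try:
        remaining = deadline - now
    except TypeError as exc:
        # Mixing the two turns the silent timezone assumption into a loud error:
        # "can't subtract offset-naive and offset-aware datetimes"
        print(exc)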
Presumably this code works for me if I'm using it. In most cases, an imperfect program I can run is better than one I can't run. Yes there are exceptions.
In the ideal case, the author updates their code so it keeps working. But what if I'm not the author and know nothing about the codebase? Is forcing me to make a naïve change to get things working again more likely to remove bugs or add them?
The compiler can warn me that I probably have a bug and should change my code. But it shouldn't hold the program hostage until I comply.
This isn't a blanket statement in favor of permissive compilers. It's fine for the compiler to force good practices the first time. I just have a problem with the compiler going back and changing its mind later.
> Presumably this code works for me if I'm using it.
Sure, it all seems to work for now... and then a customer signs up from Tuvalu, and performs a billable action at 11:59PM on the last day of the month, and your whole billing system falls over because they now have a split invoice generated for next month while it's still this month.
> But what if I'm not the author, or even a Python programmer? Is attempting to make a naïve change, in a code base I don't understand, likely to add or remove bugs?
Anyone other than the developer of the code shouldn't be attempting to re-pin the runtime the code depends on to one with a different major ABI version.
The problem, of course, is that most programming languages don't have a concept of "pinning the allowed runtime version like any other dependency"; or of the runtime exposing an ABI version that increases for every breaking change. (This second could and should just be the SemVer major-version of the runtime; but for some reason projects aren't fans of bumping their major versions for every little breaking change. Personally I think we should "let marketing win", and just allow projects to expose an ABI version independently.)
> Sure, it all seems to work for now... and then a customer signs up from Tuvalu, and performs a billable action at 11:59PM on the last day of the month, and your whole billing system falls over because they now have a split invoice generated for next month while it's still this month.
You think there's much of an overlap between systems that do time zone arithmetic with naive timestamps and ones that bill in the customer's time zone?
> Sure, it all seems to work for now... and then a customer signs up from Tuvalu, and performs a billable action at 11:59PM on the last day of the month, and your whole billing system falls over because they now have a split invoice generated for next month while it's still this month.
Yeah, so the compiler should issue a warning and the user can decide how to proceed. If the user ignores the warning for code that is part of their company's billing system, that's on them. Computers should provide information, but ultimately leave the humans in charge.
> The problem, of course, is that most programming languages don't have a concept of "pinning the allowed runtime version like any other dependency"; or of the runtime exposing an ABI version that increases for every breaking change.
That's an equally fine way to handle things IMO, but the language would need to support each runtime version more-or-less indefinitely. It also still doesn't solve your problem with the customer in Tuvalu (which I think is fine, just pointing it out).
Fwiw, that's what's currently happening with this function: it is only deprecated, and I don't think there's a clear deadline yet for when it will officially be removed. So no code will stop working in the near future, and any code that does use this function will simply get a very clear warning pointing them in the correct direction.
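For reference, here's roughly what that looks like today (behavior as of 3.12, as I understand it): the old call keeps working and just warns, and the warning itself points at the aware replacement.

    from datetime import datetime, timezone

    # Still works on 3.12, but emits a DeprecationWarning along the lines of
    # "utcnow() is deprecated ... use timezone-aware objects to represent
    # datetimes in UTC".
    legacy = datetime.utcnow()             # naive: no tzinfo attached

    # The suggested replacement returns an aware value instead:
    current = datetime.now(timezone.utc)   # tzinfo is UTC, explicitly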
> Sure, it all seems to work for now... and then a customer signs up from Tuvalu, and performs a billable action at 11:59PM on the last day of the month, and your whole billing system falls over because they now have a split invoice generated for next month while it's still this month.
A one-off is not "whole billing system falls over". These things are inevitable because bug-free software never happens, and so humans exist to deal with these situations. Unlike what an update that changes the API would do: now you can't update unless you rewrite everything, and you will certainly have regressions just because of this rewrite.
> Sure, it all seems to work for now... and then a customer signs up from Tuvalu, and performs a billable action at 11:59PM on the last day of the month, and your whole billing system falls over because they now have a split invoice generated for next month while it's still this month.
Then you issue a refund and fix the problem, not lock everyone out of the system.
Timezone bugs are evil and crop up at the worst times, though. The last one I (knowingly) wrote wasn't spotted until the CEO tried to use our software on holiday on the day of a DST change.
> Presumably this code works for me if I'm using it. In most cases, an imperfect program I can run is better than one I can't run. Yes there are exceptions.
If using buggy code works for you, then not upgrading will work for you too.
I don't know if I agree or disagree with your general thesis, but it has to be pointed out that the JavaScript ecosystem is even more complex than Python -- despite the fact that it doesn't have to deal with a C ABI -- and I think this is in fact largely because the language has resisted change so much that everything has just moved beyond the language itself to some form of transpilation.
The world changes. Software development changes. The best way to guarantee that your entire codebase will become an obsolete monolith that eventually needs to get replaced is to make sure that you choose an ecosystem that isn't growing and changing too.
Yes, there are COBOL and Fortran and whatever else programs still running out there. But software is not mostly a "one and done" thing, and unless you're confident you're writing something that's "one and done", then these comparisons are truly irrelevant.
> I think this is in fact largely because the language has resisted change so much that everything has just moved beyond the language itself to some form of transpilation
Resisted change? JavaScript has changed massively in the last 12 years I've been using it. Promises, generator functions, ES modules, WebAssembly, const/let, arrow functions, destructuring, classes, iterators, you name it. The fact that JavaScript is also a compilation target isn't because JavaScript doesn't change much. It's because the web is so important, and people wanted to use new JavaScript features before they became widely available in browsers.
They resist expansion pretty strongly even if they love change.
Like, remember when Google wanted to put Dart in the browser natively? Nobody liked it, and now we're stuck with build steps for quite a while until JS finally adds typing, which might never happen, especially because insane people seem to think canvas+WASM is a good idea for anything but the most specialized uses...
The choice not to add Dart into Firefox and safari was made by the browser vendors. It had nothing to do with the JavaScript standards committee. But to your point I do think it would be great if typescript got first class support by V8 & friends. Type information could dramatically improve JavaScript execution speed. Meticulously deleting all the types before passing code to V8 seems like a very silly choice.
I’ve been working in rust the last few years. I wish rust were as open to change as JavaScript is! Generators (aka coroutines) have been sitting just out of reach in nightly the entire time I’ve used the language.
> Count me in the "don't make breaking changes" camp. You can spit out warnings to prevent developers from using certain language features in new code, but changes which outright break old code should be extremely rare.
"Extremely rare" != "never".
> This is how Javascript works (the language, not the ecosystem around it), and it means I can update my web browser and continue visiting old websites.
...and there are reasons for this which don't apply to Python. A typical JavaScript deployment needs to run on millions of machines which the developer doesn't control, and coordinating an update of a JavaScript version to all the machines it needs to run on isn't feasible. Python is usually not run under these constraints: in the typical case, Python runs on servers which the person running it on controls.
JavaScript pays an extremely high cost for this, continuing to support decades-old horrible misfeatures such as nonsensical typecasts, returning undefined from nonexistent members, and a variety of incompatible ways to make asynchronous calls. You shouldn't do this without a very strong reason, and these misfeatures are among the main reasons not to use JavaScript unless you have a strong reason to do so.
Did they break stuff? One of the differences is that Python is versioned, so you should be capable of running your current version without issues and then update when you’re ready.
While JavaScript itself is backwards compatible, in a few cases beyond good reason, it’s not really backwards compatible for most of its modern use cases. Node isn’t backwards compatible and neither is TypeScript, and neither are many frameworks as you point out.
I do agree that making breaking changes shouldn't become a thing, but at some point you also have to move things forward. That being said, I do think this particular case is a shift in philosophy more so than a step forward, since this is a far more opinionated change than what may have been the case previously in Python.
You may not always be in control of the Python version, for example if you depend on the Python provided by your OS. You may also want to use a newer Python version for its features, support (bugfixes) and libraries, so you'll need to rewrite things.
Java also has versions, but they generally don't remove things. You can keep your ugly code and use the latest version of the language.
> Did they break stuff? One of the differences is that Python is versioned, so you should be capable of running your current version without issues and then update when you’re ready.
The problem is that the amount of asinine changes and unnecessary shuffling of standard libraries is so consistent that a good number of libraries have a grid of compatibility with different Python versions.
So version 1.1.0 is compatible with Python 3.8, 3.9 and 3.10 and version 1.2.0 is compatible with Python >= 3.11 (until the next breaking change is introduced). You have to pick and choose depending on which Python version you are using, and sometimes you have impossible combinations and pip enters an almost never-ending loop trying to find combinations of libraries that work with each other.
I've been developing, packaging and creating our own deployment tooling since python 2.2 and I cannot believe how Python devs are still shuffling stuff around instead of working on fixing package management.
Deprecation shouldn't break people's code all of a sudden. Code should still work, and just get more annoying over time. Initially, spit out warnings that deprecation is going to happen. Then, start changing the user's desktop background. Then, start playing nice cat meowing sounds out of the speaker. Then, start playing ghastly wailing sounds out of the speaker. Then, start sending e-mails. Then, scan for IoT light bulbs and mess with the colors. Then, scan for bluetooth headsets within a 50 meter radius and play wailing noises out of those. Then, impose a 0.001 second delay. Then a 0.01 second delay. Then a 0.1 second delay. Then a 1.0 second delay. Then start scraping LinkedIn for engineer resumes and e-mail them to any sysadmin e-mails you can find in the various config files in /etc/ and /home/.
Don't suddenly deprecate stuff and break people's code on a moment's notice. Most likely the person who wrote the code using the deprecated function was laid off 2 years ago and the company still depends on it.
It should not be "removed" as a step function. It should get more annoying over time but still function for the next 30 years, at which point it's incredibly, incredibly annoying to call the function (but still works).
This helps products continue running but simultaneously incentivizes the people on top to hire engineers to make the code less annoying without completely wrecking the business or downstream software that depends on it.
Basically, the servers will still run and the customers will be happy but if the VP sitting 50m from the servers doesn't want cat noises coming out of their headphones they need to hire an engineer ASAP or it will turn into wailing noises pretty soon.
> you will start seeing deprecation messages on your console and your logs, and these can get annoying
Not annoying enough. This only sends messages to someone who probably got laid off during the recession and isn't around anymore, or some open source repo creator who is now being forced to work 18 hours a day by their new job and doesn't have the energy to update any code. It needs to slowly creep out and annoy other people nearby before someone will actually care. When the neighbors constantly have wailing noises coming out of their living room TV, they will be incentivized to contribute to the repo.
> It should not be "removed" as a step function. It should get more annoying over time but still function for the next 30 years, at which point it's incredibly, incredibly annoying to call the function (but still works).
Keeping stuff around costs maintenance effort. Python is mostly developed by volunteers (either directly by occasional volunteers like me, or by professionals who are paid by organisations like Microsoft, who volunteer the money.)
A typical pattern is to deprecate in minor versions and remove in the next major version, encouraging gentle readiness for that major update. I don't know enough about Python to say this is what they're doing, but it fits in this case.
That's not what they do, in semver terms you get breaking changes in minor version bumps.
To be fair, as a result there's maybe more resistance to making any such change than there otherwise would be, but in nicher standard lib modules there are API/semantic changes from one version to another.
I'd prefer semver, like it sounds GP would, but failing that I'd prefer totally owning that the version is fairly meaningless, and doing something like 2023.x as pip does for example. At least then everyone knows they should check for anything like that. (Not to mention it carries the additional age/potentially out of date information.)
23 comes from 2023. Pip uses calendar-based versioning: 23.3.1 means 2023, release number 3 of that year's quarterly cycle, and the last number is for bugfix releases. The first two numbers are purely date-based and do not follow semver.
The only reason this is a potential source of bugs is because of the poorly thought out changes in Python 3. It is not a problem with Python 2 code, and it wouldn't have been a problem should there have been backwards compatible types in Python 3.
What's done is done, and the best way to avoid this is to add support in linters to warn for this, or a runtime warning when that can be done safely. Deprecating or removing a method is madness for any language with any real world use.
The only thing that leads to is a lot of unnecessary work for a lot of people, and there will be a best practice to save old versions of the interpreter to run old software.
Deprecating is not removing. Deprecating a method breaks nothing. It just shows a deprecation warning when you use it and have deprecation warnings turned on.
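And if you want those warnings to be impossible to ignore, the standard warnings machinery already covers that; a small sketch for a CI run or test suite:

    import warnings
    from datetime import datetime

    # Escalate deprecation warnings into hard errors, equivalent to running
    # the interpreter with: python -W error::DeprecationWarning script.py
    warnings.simplefilter("error", DeprecationWarning)

    try:
        datetime.utcnow()        # deprecated on 3.12+, so this now raises
    except DeprecationWarning as exc:
        print("caught:", exc)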
And that's a good thing! Do not hold on to broken cruft, but give some time (more than just a few months (Ansible and other culprits) but definitely less than 10 years (the Python 3 transition)) for code maintainers to handle it.
Removing “broken cruft” from a programming language’s standard library has a very high cost associated with it. If you look at Java, the list of things removed in the past decade is quite short [0], and the removed things probably weren’t super important or popular. Python’s deletionists go too far IMO.
For Python that is properly source controlled, audited and understood, I do agree it isn't that hard to port. However, unlike with most compiled languages, there's probably a LOT of Python code out there that isn't, which makes changes to Python and its APIs more risky. Don't get me wrong - this is one of the reasons why the language is popular - it is seen as "easy and approachable"; you just write a script and it works for the most part. It also was often the only language built into many distros other than bash, making it ultra convenient.
Which means a lot of what gets written isn't catalogued or in source control, but is still doing stuff people need and is often forgotten about. This makes changes to Python, IMO, more risky. I've seen Python scripts on old Linux images lurking in the wild, working for decades without people realizing they are there, set up by non-devs back in the day. Another example would be some financial analyst coding a script to produce reports that are used to make decisions and need the same behavior each time - one day the datetime function discussed in this article will just stop working, often with the original person, who wrote it just to get stuff done quickly, long gone.
IMO the migration from 2 to 3 was particularly painful for Python because of the way it is/was used, who its main users historically were, its popularity (wide blast radius), and how Python is often updated by blanket system upgrades given it is standard on many machines. Being aware of what needs to be upgraded for a given large-scale environment is most of the work.
The problem is open source. I generally agree with your analysis of the migration path, but who’s going to go through every open source library and do this grunt work? It comes at a huge cost however you swing it, and it’s not theoretical: I recall trying to switch to Python 3 over and over during those dark years only to find an essential library couldn’t support it.
Can you share some nuggets about how you migrated a bigger codebase from 2 to 3 in less than 40 hours?
I tried with a Django codebase and didn't know where to go; the 2to3 tooling they provided didn't help much. Can you give the steps involved? I'll dig deeper on each step.
Thanks
In general the approach is pretty heavily dependent on a trustworthy test suite. If you don't have that, you'll have to write a lot of tests which will take more than 40 hours, but... Python is pretty heavily dependent on having a trustworthy test suite.
All of the codebases I migrated were Django codebases.
Basically, I would:
1. Upgrade to the latest 2.x, run the test suite and a bunch of manual tests, and fix any errors/warnings.
2. Upgrade all the dependencies one by one to versions that support version 3, run the test suite and the manual tests, and fix any errors/warnings. This was generally the hardest part, because some projects had imported a lot of bleeding edge nonsense which was abandoned before Python 3 support was implemented, so I had to a) encapsulate the dependency, and then b) rewrite the encapsulated functionality either with my own code, or with a more up-to-date library. This is a predictable result of importing nonsense, and I blame the developers who did that, not the Python team.
3. Upgrade to 3.x, run the test suite and the manual tests, and fix any errors/warnings.
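A typical trick during steps 1 and 3 (just a generic sketch, nothing specific to those codebases) was leaning on __future__ imports so the same file behaves like 3.x while still running on 2.7:

    # Generic 2-to-3 bridging idioms: these imports make Python 2.7 behave
    # like 3.x for the most error-prone differences (division, print,
    # string literals), so a module can run unchanged on both interpreters.
    from __future__ import absolute_import, division, print_function, unicode_literals

    def average(values):
        # True division even on 2.7, thanks to the import above.
        return sum(values) / len(values)

    print(average([1, 2]))   # 1.5 on both 2.7 and 3.x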
>As a contractor, I singlehandedly migrated 4 different codebases from 2 to 3; a total of just under a million lines of code, and none of these codebases took more than 40 hours. This was not some horrific burden for most companies using Python. The people who flamed Python for these breaking changes were either extreme outliers or the sort of people who would be happy with the language never making breaking changes, ever, no matter how bad of mistakes in design were discovered.
Imagine I just randomly came up to you and demanded a few thousand dollars; you probably wouldn't be too happy about it. That's essentially what the Python maintainers did: collectively demand tens of thousands of hours worth of labor from developers, labor that could have been spent doing something enjoyable or profitable, but instead was wasted on menial work because the Python developers thought some minor changes were worth breaking things for everyone. I.e. they put a much higher value on their own subjective ideas of cleanliness than on the time of all Python users.
I think it'd be more like if you gave me a car, and then, years later, you said you needed the car to build a plane for me. It would screw up my plans, but I wouldn't assign moral blame to you. I'd say "thanks for the car while I had it, I'll find another car vendor"
What most people don’t seem to realize is that there aren’t only language changes to contend with, but framework changes as well. This line is further blurred in language-framework pairs like C# and .NET, where (as a web dev) you’re pretty much always going to be using .NET if you’re using C#, whether you realize it or not. In these cases, framework revisions are what getcha, not language changes. Microsoft does a good job of maintaining backwards compatibility as C# the language changes, but framework upgrades (eg .NET Core 3.1 to .NET 5) will absolutely cause headache via runtime exceptions, etc.
So yeah Python ain’t that bad. I can’t speak to frameworks like Django, though.
Django was actually fine. They had the same codebase running on Python 2 and 3, so you could update whenever. Since Django includes so many batteries, I didn't have any trouble porting that over.
However, early versions of Python 3 were slower than Python 2, and also some breaking changes were getting rolled back (e.g. PEP-414, which was targeting Python 3.3), which contributed to a lot of library authors dragging their feet in upgrading their support.
So, yes, it was libraries causing the most headaches, but there was a sense at the time of wondering when the upgrade would become “real”. Deprecating py2 took 11 years after the release of 3.0.
I am not a fan of all these breaking changes. I still love Python, but it's just too much with these tiny tweaks.
Developers really seem to like endless polishing, even when more important stuff still isn't done.
On the other hand... Maintenance is a thing. Less code to maintain means less work.
I can see why they want to get rid of stuff, it shouldn't be too hard to update, it's just annoying. Can't complain too much, at least it's not like C, where the big dependencies seem to like making such big changes it takes years to port.
I don't manage anything too crucial, but I like to keep every project I work on updated to the latest Python. I've noticed I'm often quite early with this, and help open issues with any libraries that are lagging. For example, for Py3.12 I'd say about half of what I work on is on Python 3.12 now. Slower projects are on 3.11, and those I'm just a user of (like stable diffusion) are still on 3.10. I don't know of any projects I use that are on anything older than 3.10.
This also is in line with how I am around most tech. I like to keep all databases and libraries up to date for security reasons.
Java gets this right. In general, breaking existing working software is a very bad idea and should only be done in extreme cases. One thing a lot of people overlook is that breaking changes make life easier for the few people who write the language, library, OS, database, etc., but make life much harder for the many people who maintain or use applications.
> Every single breakage is a massive massive waste of people's time.
I think this applies to a ton of things. For instance, in 2015, my electricity provider switched from NSTAR to EverSource. For whatever reason they couldn't auto-migrate everyone's accounts. I ended up having to call them on the phone. Took maybe 20-30 minutes, I think.
Societally, there should be a way to deal with this. Obviously not possible for real, but maybe it would be appropriate to lock the execs and board of directors in a room that just continuously plays hold music for a week so their time is wasted as well.
This. This is why Windows has kept backwards compatibility by heroic efforts on the part of its devs: because they do actually have clients (US-DOD is the obvious one) who can refuse breaking updates entirely and have enough leverage to make that stick. They also know that users who do somehow leave the Windows desktop ecosystem aren't coming back.
C# managed a semi-breaking migration by having a library format that was compatible with old and new: "netstandard".
It's not a waste, you just aren't acknowledging the purpose of using people's time.
The alternative is the situation we have with C: no breaking changes, and decades later people are still dealing with the bugs this causes--a true waste of time.
I suspect a lot of financial data code is going to break. I've worked on a number of codebases where the explicit instructions were "everything is UTC, all the time", and enforced at the entry edges of the code. Bringing in timezones was a definite NO for reasons of the unholy mess this would inevitably create:
* India has half-hour timezones
* Countries change their DST rules all the time.
* Conversion between timezones is an utter mess.
As Wes McKinney once said about python datetime: "Welcome to Hell"[0]
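To make the half-hour point concrete with the stdlib (zoneinfo is 3.9+; purely a demo):

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo   # stdlib since 3.9

    utc_event = datetime(2023, 11, 19, 12, 0, tzinfo=timezone.utc)
    local = utc_event.astimezone(ZoneInfo("Asia/Kolkata"))

    print(local)              # 2023-11-19 17:30:00+05:30  (the half-hour offset)
    print(local.utcoffset())  # 5:30:00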
The whole of India -- even though it is pretty vast in area -- has a single time-zone country-wide. I am very thankful for this. We already have a lot of other challenges -- including dozens of different languages -- that a single time zone is a relief.
But yes -- the +5:30 offset usually makes for a less-than-ideal factor to plan meetings or follow non local event schedules.
Did I mention we also hate the Summertime / Daylight Saving time adjustments we have to accommodate twice a year when working with western teams -- more adjustment even after most people have already sacrificed on a sane 9-5 schedule to ensure they overlap with EU/US teams for daily work meetings. We cant imagine why US that has multiple time-zones already in a single country, and EU whose countries are often smaller than states in India ... need so much adjustment. Why can't they, for instance, simply declare that school/offices open at 10am instead of 9am for six designated months every year instead of imposing this adjustment on the entire world.
> We cant imagine why US that has multiple time-zones already in a single country
Are you aware the contiguous United States is three times larger than India? Alaska alone is half the size of India and not at the same longitude as the lower 48. Hawaii is even farther west than most of Alaska. The easternmost point in the US is 66 degrees west. The westernmost is 179 degrees west. That’s nearly a third of the globe and I'm not even including territories.
> Why can't they, for instance, simply declare that school/offices open at 10am instead of 9am for six designated months every year instead of imposing this adjustment on the entire world.
Is this a serious comment? Europe doesn’t have a central government. The most recent attempt to establish one was extremely unpopular and famously stopped in 1945.
Why should foreign schools and businesses change their hours to accommodate you? A more obvious solution would be to just stop observing DST but that’s not something that will happen unilaterally and will be an especially hard sell at extreme latitudes.
If you want to do business internationally you have to deal with timezones. India isn’t unique in that inconvenience.
People at extreme latitudes should not care much, since for them DST does close to nothing.
The parent's proposal is actually quite sound. It would be a lot more useful to have institutions switch their starting times for part of the year when it makes sense for them, rather than forcing a one-size-fits-all solution on everyone. Countries at different latitudes would be free to switch e.g. school start times at different dates depending on when sunlight shifts too far in one direction, while keeping their clocks synchronised with astronomical time. Heck, Spain basically does that already, since they're on the wrong timezone (geography dictates they should be on British time, but because of events in the '40s, they're on Central European time) - so they do everything one hour "later".
> [...] EU whose countries are often smaller than states in India ... need so much adjustment. Why can't they, for instance, simply declare that school/offices open at 10am instead of 9am for six designated months every year instead of imposing this adjustment on the entire world.
This proposal is literally one-size-fits-all.
> It would be a lot more useful to have institutions switch their starting times for part of the year when it makes sense for them, rather than forcing a one-size-fits-all solution on everyone.
This is exactly how it works now. But there are a lot of institutions and they aren't coordinated.
I was not arguing for a single timezone, nor surprised that the US has multiple timezones. The surprise was that even after multiple timezones, they need so much adjustment twice a year -- which is what I was arguing against. (To be more specific, the argument was against how they go about that adjustment -- not that an adjustment is needed, which is understandably due to geography and seasons.)
you quoted half my sentence which creates an easy strawman.
> A more obvious solution would be to just stop observing DST
yes that is essentially my argument too ...with an extra accommodation for folks who will argue about seasons causing wild changes in sunrise / sunset / daylight.
Because of the comma and ellipsis "need so much adjustment" very much looks like it only applies to "...and EU whose countries are often smaller than states in India."
> A more obvious solution would be to just stop observing DST but that’s not something that will happen unilaterally and will be an especially hard sell at extreme latitudes.
Huh, why would it not happen unilaterally? Different countries already fiddle with their daylight savings times (or absence thereof) unilaterally, and the start of DST is not synchronised across the globe anyway.
So if you have two countries that start DST on different weeks that's already a hassle to deal with, and one of them going off DST altogether doesn't really add any extra hassle in their bilateral dealings.
"We cant imagine why US that has multiple time-zones already in a single country"
Because "The USA" spans from GMT+10 to GMT-10. Your Solar Noon only varies by an hour between Kolkata and Mumbai. Without time zones, the continental U.S. would see a solar noon difference of three and a half hours. And that's not including Alaska, Hawaii, and Guam, which gives us a 20 hour span. You want all those places to be in the same time zone?
Greenland has a single timezone for a 4-hour span. China has a single timezone for a 5-hour span. Having a single timezone for the contiguous US would definitely not be unheard of.
Guam is a bit of an odd duck because it is on the other side of the date line. The US currently spans from UTC-4 in Puerto Rico to UTC+10 in Guam - which is essentially the same as UTC-14. That's a 10-hour span, not a 20-hour one. Still, excluding the minor overseas colonial possessions would probably make the most sense.
Have you been to China? Its time zone makes sense for Beijing and the coast but forces the "Chinese" of Xinjiang to do things at unnatural hours for the convenience of the ruling elite.
It's bizarre to suggest that someone in western Alaska should wake up at the same time as someone in Florida because that's what India does.
> Have you been to China? It's time zone make sense for Beijing and the coast but forces the "Chinese" of Xinjiang to do things at unnatural hours for the convenience of the ruling elite.
I've been there and it was absolutely fine - a lot better than working in the US with its mess of timezones. Sure, the times in Xinjiang are "wrong" but who cares? I'll take getting up at "11AM" or whatever over having people in different offices think the meeting is an hour earlier or later and not finding out everyone was mixed up until it happens.
If they do things at different times, then they effectively live in a different timezone, but with the downside that’s much harder to know when it is that they are going to do things. With time zones you know that businesses will work roughly from 9-10am to 5-6pm that people will normally go to bed around 10pm and will wake up around 7-8am. You can just apply the conversion and know all of that. If there are no time zones, how do you convert from one time to the other?
> If they do things at different times, then they effectively live in a different timezone, but with the downside that’s much harder to know when it is that they are going to do things.
It's not hard to know when you do things where you live. Some places people drive on the left, some places they drive on the right, people in different places people speak different languages, people in different places get up and go to bed at different times. When you go to Xinjiang as an outsider sure the customary times are a bit unusual but they're far from the only unusual thing about going there.
> With time zones you know that businesses will work roughly from 9-10am to 5-6pm that people will normally go to bed around 10pm and will wake up around 7-8am.
Not necessarily - plenty of ways to be caught out if you make that kind of assumption about a place you're not familiar with (e.g. turning up at a business when it's siesta). When going somewhere unfamiliar you're already best off looking up when your hotel/restaurant/etc. opens, and it's easy to do these days.
Timezones maybe made sense when physically going to a different place was more common than having a phone/video meeting with someone in a different place. But nowadays being able to agree on the same instant in time when you're in two different places is more important and we should standardise.
> It's not hard to know when you do things where you live. Some places people drive on the left, some places they drive on the right, people in different places people speak different languages, people in different places get up and go to bed at different times. When you go to Xinjiang as an outsider sure the customary times are a bit unusual but they're far from the only unusual thing about going there.
Sure, it can work. It's just worse. It'll be harder to adapt to local time if you move there or go there to visit than just changing your clock to match local time. You'll need to convert constantly. What's the upside, though?
> Not necessarily - plenty of ways to be caught out if you make that kind of assumption about a place you're not familiar with (e.g. turning up at a business when it's siesta). When going somewhere unfamiliar you're already best off looking up when your hotel/restaurant/etc. opens, and it's easy to do these days.
Sure, you could still get things wrong, but you're suggesting going from a situation where it mostly works, to a situation where this never works.
> Timezones maybe made sense when physically going to a different place was more common than having a phone/video meeting with someone in a different place. But nowadays being able to agree on the same instant in time when you're in two different places is more important and we should standardise.
Timezones are the way we found to standardise once global commerce became a thing. Timezones give you essentially time and location, so it's easier to figure things out when multiple parties are involved.
> Sure, it can work. It's just worse. It'll be harder to adapt to local time if you move there or go there to visit than just changing your clock to match local time. You'll need to convert constantly.
No it isn't? You don't convert anything, you just do things at the times the locals do them. It's really not hard.
> What's the upside, though?
No changing clocks, no scheduling a meeting at the wrong time because you mixed up the timezones, no calling your parents and accidentally waking them up because it's the middle of the night for them.
> Timezones are the way we found to standardise once global commerce became a thing.
In the distant past each village had its own time; once the railways emerged and it was practical to go from place to place in the same day, we standardised time across decent-sized regions. Now that we can talk to people instantly around the world, it's time to continue that process and standardise time everywhere.
> No it isn't? You don't convert anything, you just do things at the times the locals do them. It's really not hard.
Before you get used to it, when you see a time you'll have no idea what time of the day it is. Is it early? Is it late? Is it during lunch time? You need to convert in your head "11am here means midnight where I come from, so that's actually really late". Very easy to forget and make a mistake.
> No changing clocks, no scheduling a meeting at the wrong time because you mixed up the timezones, no calling your parents and accidentally waking them up because it's the middle of the night for them.
Huh? How is using a single time going to help with waking someone up because it's the middle of the night for them? I'd say it's more likely. If someone is 6 hours behind you, you'll need to keep in mind that their 10am means what would be your 2am. Even though you both call it 10am, you would definitely not want to call them at 10am. If anything, that's more error prone and confusing.
> Before you get used to it, when you see a time you'll have no idea what time of the day it is. Is it early? Is it late? Is it during lunch time?
That's not a real problem, IME. If someone invites you for lunch, it's going to be at lunch time. If someone wants to schedule a meeting, you have to check your calendar anyway.
> You need to convert in your head "11am here means midnight where I come from, so that's actually really late".
No, you don't. Converting in your head is the wrong approach just as it is for languages. Just get used to when you're going to bed and getting up. (And don't use AM/PM - why would you ever do that? Even within a single timezone it only causes confusion)
> How is using a single time going to help with waking something up because it's the middle of the night for them?
Because you never have to add or subtract a time, which is where most mistakes happen.
> If someone is 6 hours behind you, you'll need to keep in mind that their 10am means what would be your 2am. Even though you both call it 10am, you would definitely not want to call them at 10am.
Right, so you need to know when they go to sleep and when they get up, and not call them when they're asleep. But there's no arithmetic to get wrong, there's no risk of adding six hours instead of subtracting six hours or vice versa.
You know what? Good on them! We should all just use UTC everywhere and adjust business hours accordingly. Get rid of timezones altogether.
If you want to know when the sun rises, sets, or is at its apex just look it up in a table. The former two vary unless you are very close to the equator anyway, and the latter is off by an hour for every country using "daylight saving time" for half the year. Never mind countries that span more than 1/24th of the globe but use a single timezone, and that isn't just China.
Also I don't really see how this is "for the convenience of the ruling elite". I'd be willing to bet money most people in Xinjiang wouldn't have this "problem" in their top ten. Probably not even top hundred. This seems like something you get used to once and then never think about again unless you travel or have a remote meeting.
> If you want to know when the sun rises, sets, or is at its apex just look it up in a table.
Yes, a table.
A table with time.
A table that divides the world into zones with regard to time.
That definitely abolishes time zones.
> I'd be willing to bet money most people in Xinjiang wouldn't have this "problem" in their top ten.
Yeah, it probably does rank quite a bit below the genocide.
Anyway, time zones solve an important problem: People coordinate with other people close to them, but occasionally need to coordinate with other people far away. How do those far-away people know when the good times to call are? Clocks only work if you have some idea of what times mean in practice to distant people, which is greatly helped along by people setting their clocks to a local time that's known globally.
> A table that divides the world into zones with regard to time.
Think about it more. How sunrise and sunset change by location and by date.
A chart that covers both sunrise and sunset does not naturally have "zones", and any "zones" you try to infer would not resemble time zones at all. You're either looking at big sweeping ellipses, or you're dividing the world into hundreds of small tiles. It's not time zones.
Perhaps you haven’t heard of the Uyghur genocide by the CCP in Xinjiang. The local population very much does care about their local time zone (and language and culture, other things the Chinese government has stripped away from them). It may be one of the few places where you’ll get a different answer to the question “what time is it” depending on the ethnicity of the person you ask, and, if asked, you might get arrested for answering with the “wrong” time.
The goal is that 12:00 is pretty close to noon, which is reasonably possible.
(And even though lining up a time with morning would be significantly more annoying and need to exclude the polar circles, I still wouldn't call it "impossible".)
> Greenland has a single timezone for a 4-hour span
95+% of Greenland's population lives on its west coast. The most common map projection one is likely to come across makes it look like that spans a pretty large longitude range but it actually only covers about 15°, a 1-hour span.
Take a look at a projection that does a better job of representing latitude, such as this [1] azimuthal equidistant projection and you can see that west coast Greenland is a lot more north/south than you might have expected based on the more usual projections.
Here's a map of the towns in Greenland with >300 population [2] showing how much more populated the west coast is than the rest of the country. If you add their populations and the populations of the towns listed in the table but too small to make the map it comes to about 3100 people.
> The US currently spans from UTC-4 in Puerto Rico to UTC+10 in Guam
No, it spans from UTC-10 in Hawaii to UTC+10 in Guam. And if we include other US territories in the Pacific, like American Samoa, we have to include UTC-11. That's 21 hours.
> Why can't they, for instance, simply declare that school/offices open at 10am instead of 9am for six designated months every year instead of imposing this adjustment on the entire world.
If they did that, you'd still be stuck shifting your schedule when they change their working hours.
In the simplest scenario -- once a recurring meeting is fixed at, say, 10:30 AM eastern STANDARD time -- no schedule change is needed all year long if we don't tamper with the clock.
It is just that the US worker -- for their own convenience -- decides to show up at their office at 9AM for six months and 10AM for the other six months of the year. The recurring meeting does not need to change at all.
The adjustment, if any, is all on the US side, and for local season / climate reasons which have nothing to do with the remote team members, so the remote team members elsewhere in the world don't need to change anything at all.
Just don't tamper with the clock and things will be much simpler.
>> If they did that, you'd still be stuck shifting your schedule when they change their working hours.
No -- we do not want to link our meetings to the changing office hours. We just want a fixed time on a fixed clock. Pick a time that works throughout the year for both teams, like 10:30 AM EST in my example above.
Currently the meetings are at a "fixed time" but on a moving clock -- in reality the meetings shift around, but US folks move their clocks to "simulate" for themselves that they are always having their daily meeting "at 10AM" -- they achieve this illusion by inflicting pain on themselves as well as the rest of the world. We would love it if they could leave us out of this forced dance :)
Most of my life does not involve coordinating across timezones.
You're suggesting that twice every year, we change the times that retailers, offices, government agencies, broadway theaters, restaurants, roads, schools, and so on open and close. Schools alter class schedules and trains modify departure times. Also, we all reset our alarms, because we'll need to wake up at different times anyway.
It makes so much more sense to just change the clocks!
And yes, all of these things would need to change, because they all revolve around the typical work schedule! Parents bring children to school before heading to work. Public transit runs more frequently during rush hour.
There is a discussion to be had about whether we should be shifting our schedules in the first place. But as long as we're shifting schedules, it makes so much more sense to just change the damn clocks!
P.S. Most people don't have white collar jobs! Construction workers, janitors, mechanics and so on don't ever work across timezones.
P.P.S. I'm okay with nixing the time change as long as we don't end up on permanent daylight savings! However, permanent daylight savings is my nightmare.
Or only the people who it actually affects make any such adjustment?
Currently the shop closes at 8pm and it's dark by 4pm; in Summer it closes at 8pm still (on +1 time) and it's dark at what 11pm or later? Explain your point again?
If school starts at 9AM EST for half the year and at 10AM EST for the other half, then people who drop their children before work will be available at 10:30AM for half the year, and will not for the other half.
If there is a certain time that really does work for everyone, then DST doesn't really affect it. At worst, it requires a bit of rescheduling in the calendar if the meeting was set by someone in a DST country. If the meeting was set by someone in India, it will even get automatically adjusted in everyone else's calendar. So DST itself is not actually making things difficult for you; it's only people's habits changing with winter/summer that makes things difficult.
And note, I have direct experience with all of this, working in Romania with colleagues from California and Kolkata with at least three recurring syncups with all three of us per week.
Yes if the meeting host is in a country that does not observe DST/summertime ... then this problem kind of solves itself. Then only those whose local clock dances around the year need to dance around to match :)
Outlook and calendar apps should probably make this a feature: "do not move this meeting for DST/summertime" (even if the host is in US/EU). Make it opt-in initially and a couple of years later make it opt-out (default) choice.
On a lighter note ... recurring meetings tagged to "dynamic time zones" should be allowed only after hosts "agree" to a disclaimer that they have consulted other participants and have their consent :D (a little like recording with two-party consent)
You seem to put very heavy emphasis on the precise recorded time of a recurring meeting in everyone's calendar. In my experience, this is mostly irrelevant: the time of a recurring meeting is often renegotiated when circumstances change for a big enough swathe of participants, such as schedule changes in the more northern/southern latitudes as winter and summer approach.
That's a good thing, and way better than the current state of things. Let the federal government pick summer/winter hours, and everyone can adjust or not as they see fit.
If you live closer to the equator it may not make much sense. For the longest time the population of the US was heavily biased toward its more northern regions rather than its more southern ones. In relation to someone who lives in Delhi, someone in Maine will see a 4 hour larger swing in day length. If you're talking about farther south in India, like Madurai, the swing will be even larger.
> Why can't they, for instance, simply declare that school/offices open at 10am instead of 9am for six designated months every year instead of imposing this adjustment on the entire world.
Or, if you extrapolate this argument further, you could even wonder why the entire world can't run on a single timezone.
But here's the problem: for example, imagine it's UTC everywhere on Earth. No local timezones.
As per your solution, people in India can go to sleep at 4:30 pm (UTC), wake up at 12:30 am, have lunch at 7:30 am etc.
Likewise, people in San Francisco can go to bed at 6 am, wake up at 2 pm, have lunch at 9 pm etc.
Now, the problem is, when someone wants to set up an international meeting or whatever, nobody would have a clear idea of whether it would be "sleeping time" for any of the participants involved, without having to look it up.
Like, with the current setup, it's kind of universally understood that 3 am local time is sleeping time, hence people would avoid setting up meetings at that time.
But if everywhere on Earth it's the same time such as UTC, I would have no idea if 3 am UTC is your sleeping time or lunch time or whatever. I would have to maintain lookup tables of when the sleep time, lunch time etc would be for every country as per UTC.
Similar problems exist (to a lesser degree) when large countries move to a single timezone.
Yes there are flaws in the existing setup, but moving to a single timezone for large countries would only introduce a new set of challenges.
My comment is being misread as an argument for a single timezone for all of USA.
Kindly re-read.
I was arguing against daylight savings adjustments twice a year, not against having multiple timezones per country / region. My comment was simply that timezones should be sufficient without the need for DST.
What I said ...
>> We cant imagine why US that has multiple time-zones already in a single country, and EU whose countries are often smaller than states in India ... need so much adjustment.
Simplified ...
>> cant imagine why US, and EU need so much adjustment.
Having lived in both India and far North in the USA, I'd take half-hour time zones over daylight savings (switches) every time.
Shifting time zones increases car accidents by about 6-16% for the entire week (studies vary). Just that is a massive enough impact on safety and productivity to discourage the switch. At a personal level, leaving work at 5.30-ish with it being pitch black outside is among the most depressing things in the world. Even more wasteful is that the average American wakes up at 7.20am. So, they don't gain any daylight but lose an entire hour.
I'm not usually the type to make such accusations, so I apologize in advance for sounding culture-warry. But switching time zones is a "white person thing". A large percent of the Chinese, Indian and Middle Eastern populations live above the 25th parallel (Florida), and large populations of Korea and Japan live north of the Bay Area, without daylight switching.
Permanent daylight time with later sunrise would be a huge win.
> Why can't they, for instance, simply declare that school/offices open at 10am instead of 9am for six designated months every year instead of imposing this adjustment on the entire world.
Or, if you extrapolate this argument further, you could even wonder why the entire world can't run on a single timezone. Like, it can be UTC everywhere.
Brazil's stock market does/used to do something along these lines when the US and Brazil changed their clocks in opposite directions, to keep the NY-time hours of the Brazilian exchange from drifting as far. Given that we all also observed DST at the time, this made things even more complicated for systems.
> Why can't they, for instance, simply declare that school/offices open at 10am instead of 9am for six designated months every year instead of imposing this adjustment on the entire world.
For what it's worth, I've (British) never understood that either. Need more daylight hours for farmers to work or whatever, sure, no problem. But.. that's easily resolved as you described? And before widespread centralised time and access to it, surely they just started at sunrise or soon after, whenever the sun happened to rise. I don't get it.
For sharing between systems and representing dates to humans debugging a system, the following has worked well …
Planned local events: future date and time string including local time zone
Planned global events: future date and time string in UTC, with UTC timezone
Past events: date and time string in UTC with UTC time zone, unless there is a strong case to be made that these should be localized (see “planned local events” above)
Strings in the format specified by ISO-8601.
Of course, internal representation can be whatever it needs to be for efficiency. For example, maybe your database stores as integer microseconds since an epoch. Fine, but as soon as that leaves the database, ISO-8601 that date.
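A minimal sketch of that boundary conversion, assuming the internal form really is integer microseconds since the Unix epoch:

    from datetime import datetime, timezone

    def epoch_us_to_iso8601(epoch_us: int) -> str:
        """Turn internal integer microseconds-since-epoch into an ISO-8601 UTC string."""
        seconds, micros = divmod(epoch_us, 1_000_000)
        dt = datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=micros)
        return dt.isoformat()

    print(epoch_us_to_iso8601(1_700_414_554_123_456))
    # 2023-11-19T17:22:34.123456+00:00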
I like your premise but disagree with your guidance.
You should not convert the integer early, you should actually wait until the very last moment.
Nowadays a timestamp will be marshalled and demarshalled by (at the very least) two runtimes until it reaches human eyeballs, opening up a wide spectrum of bug potential, difficult-to-understand code, and development slowdowns. Keeping it simple and unambiguous is very important. Time libraries have a habit of coming and going; ints are always there, and 99% of the date arithmetic is trivial anyway.
Don't use a lib, just pretty print numbers and you'll be fine.
An int is not unambiguous, though. "2023-11-19T14:22:34-03:00" is clearly a timestamp and represents an unambiguous point in time, while a number by itself doesn't mean anything. You have to know that it's a timestamp (probably from context) and you have to know what timezone it's in.
But 1700414554 is not unambiguously a date. Could be a count. Or an identifier. It needs metadata to tell you how to interpret it. For this to represent a datetime, the human needs to know that it’s a datetime, and what epoch and timezone to use. The ISO date string removes the ambiguity because the context for interpreting it is within the string.
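For what it's worth, the two values quoted in this sub-thread do describe the same instant; here's a quick check with just the standard library:

    from datetime import datetime, timezone

    # 1700414554 seconds since the Unix epoch...
    datetime.fromtimestamp(1700414554, tz=timezone.utc).isoformat()
    # -> '2023-11-19T17:22:34+00:00', i.e. 14:22:34 at offset -03:00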
I never understood this argument. The ISO8601 date string is as much a convention as a unix timestamp is - without context neither are decipherable. If anything, unix timestamp is easier to explain to an alien than a date string. It has a starting point (1970/01/01 00:00 UTC) and it counts seconds from then. Care to explain how an ISO8601 date string is constructed?
Also, calculating the amount of time between two ISO 8601 strings without libraries is not trivial, nor is any other operation, really. When dealing with dates, there is only one simple way to work with non-localized times, and it's not ISO-something.
EDIT:
> ...and what epoch and timezone to use.
This is also not correct. Unix timestamp has a well defined epoch and it doesn't deal with timezones (though the epoch itself is, of course, usually defined as 1970/01/01 00:00:00 UTC). You are free to define other timestamps, but unix timestamp is well defined.
This is the biggest general problem: you can't accurately compare the duration between two date times without having proper context of how the datetimes were stored or generated, especially for future dates. For example, you can't say today what the Unix timestamp will be for 1 January 2030, 10:00:00 AM in Bucharest, Romania. Romania may very possibly change timezone by then. So, if a user sets a meeting for that time, converting to the Unix timestamp according to the current timezone info is wrong. You'll end up alerting them on 1 January 2030 at 09:00:00 AM, or perhaps 08:59:58 AM if some leap seconds get added. Or it may even be entirely different if there is some calendar change for whatever reason by then.
Note that UTC has a similar problem - which is why you actually need a local time string for this type of events.
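To make the pitfall concrete, a small sketch (the meeting and the storage tuple are made up for illustration):

    from datetime import datetime
    from zoneinfo import ZoneInfo

    # Converting the future meeting to an epoch value *now* bakes in today's tz rules:
    meeting = datetime(2030, 1, 1, 10, 0, tzinfo=ZoneInfo("Europe/Bucharest"))
    frozen = meeting.timestamp()  # wrong if Romania changes its rules before 2030

    # Safer: keep the user's intent (wall time + zone name) and resolve to an instant
    # as late as possible, against an up-to-date tz database.
    stored = ("2030-01-01T10:00:00", "Europe/Bucharest")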
Correct, neither ISO strings nor numbers can get you out of this dilemma.
While ISO uses UTC offset and time zone interchangeably, it is actually only defined over offsets, not locations.
I've had this problem discussed before. Typically the situation is firing events based on times at specific locations. It is neither straightforward with date libraries nor easy with numeric computation. Sometimes it makes sense to render it down to a specific point in time at storage and not expect the country to change timezone overnight. Sometimes the location itself may be changing randomly (moving objects), but in that case it typically just meant recalculating offsets on movement events.
Seconds since epoch requires careful and precise handling of datatypes and assumptions. This works great inside a system where that can be managed but for the purpose of exchange between systems the ISO formats contain all relevant information. Either side can use a library to parse the string into their local representation.
Comprehension by extraterrestrial life is a localization problem.
I'm not sure I understand. How is this different from any other quantity with an associated unit? What's the possible failure mode here, interpreting the "created_time_us" field as a distance in km instead of a timedelta in microseconds?
Is the unit seconds or microseconds? How are leap seconds handled? What base type is used? Integer, number, or float? How many bits? What is the precision? What is the CPU architecture? Is it signed? How should overflows be handled?
You can store weight as ohms of resistance on a load cell. But if you want to share that data you need to either normalize the number to a mutually understood unit or provide detailed information about your scale.
Aside from the obvious issue with engineering, physics, etc.
A unix second is allowed to be different from a "real" elapsed second.
The unix day has 86400 "unix seconds" which are actually 86401 real seconds (for days with added leap seconds).
Any application logging real world events against unix time will screw up velocities, force, energy, etc when computing "instrument reading" per "unix second".
It might not matter to most people but it's an issue for surveyors, geophysicists, astrophysicists, engineers, etc.
> Any application logging real world events against unix time will screw up velocities, force, energy, etc when computing "instrument reading" per "unix second".
That is a fair point for situations that depend on short-term observations. But UTC has the same issues there as the Unix epoch. I think it is a valid edge case, but if you are doing, say, GPS-based speed calculation, I would wager you are already pulling your time reference from a low-level source that you can depend on to run steadily for the X minutes/hours you need it.
Sure, this is what responsible STEM people do for any time-lapsed measurements, be those position, nine-axis magnetic field recordings, gravity, radiometrics, microwave return, etc - they use an independent epoch-based clock source that counts true elapsed time rather than conventional human time.
Point being, it's an area of UnixTime that many overlook - since the 1980s I've had long-term multichannel data acquisition running throughout every leap second adjustment, which would have had data glitches had the time channel been UnixTime, UTC, etc.
There's a lot more context baked into an ISO string than a unix timestamp though. Write a regex to find all the strict ISO8601-RFC3339 strings in a directory. Now write one for unix timestamps.
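For the sake of concreteness, roughly what those two searches look like (a sketch; the RFC 3339 pattern is simplified):

    import re

    # Date, 'T', time, optional fraction, then 'Z' or a numeric offset.
    RFC3339 = re.compile(
        r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})"
    )

    # A Unix timestamp is just digits; the best you can do is guess by length.
    MAYBE_UNIX = re.compile(r"\b\d{10}\b")  # only catches seconds-precision values from 2001-2286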
> Also calculating amount of time between two ISO8601 strings without libraries is nor trivial, or any other operation actually.
That's true of any datetime that actual humans use (as opposed to computers, financial systems, sysadmins and programmers).
> Write a regex to find all the strict ISO8601-RFC3339 strings in a directory
That is a weird edge case. I've had this before and the person proposing it hinged on some forensic practice which was based on analyzing human readable data. As I said in my original post, pretty printing ints to ISO timestamps is actually my original opinion.
The good thing about ints is that typically the stdlibs themselves deliver quite good tooling to work with Unix timestamps so in most cases you don't really use a library for the timestamps themselves.
Not in the future it’s not. There’s no way to know the exact epoch time until it has happened. Timestamps “float” and can accommodate local fluctuations or even changes.
Haha. I'm sorry if the terminology is catching you out. It's not epoch time that floats. Epoch time is an inalienable absolute (at least in theory). However, actual cosmological time, and time as perceived by humans, does drift with respect to epoch time. Which of these is right is a philosophical discussion, but at the present instant they do indeed converge, and so epoch time between now and some time in the early 1960s is "the correctest" from a systems perspective.
So what? That's true of all formats. How is it relevant to the choice of which format to use at which stage of the process?
> and you have to know what timezone it's in.
You don't.
> clearly a timestamp
Not very helpful. It's easy to set up types in advance. Don't figure out if things look like timestamps at parse time, or you'll be causing the same problems that turn gene names into dates and choke on people named Null.
> Past events: date and time string in UTC with UTC time zone, unless there is a strong case to be made that these should be localized (see “planned local events” above)
I don't think there's any case for storing local time with past events. They happened at a specific date and time, store them in UTC and if needed convert to local on display. Plus, though it's a rare circumstance, for one hour each year DST switches cause a duplicate hour - so local time is ambiguous during that hour.
There is: when you aggregate your data to create graphs such as "My most active hours", you will have a problem if your user moves between timezones and you did not store the actual local wall time. [0] For those cases, you can store the offset or the denormalized local time along with the UTC version.
I'd say enforcement would be rather simple going forwards, since you could just enforce that dt.tzinfo == datetime.timezone.utc, but as far as I know, timezone information checking in Python can be a bit of a pain.
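A sketch of what that enforcement could look like (the helper name is made up); one of the pain points is that `dt.tzinfo == timezone.utc` is False for a datetime carrying `ZoneInfo("UTC")`, so comparing the offset is more robust:

    from datetime import datetime, timedelta, timezone

    def ensure_utc(dt: datetime) -> datetime:
        """Reject naive datetimes and aware ones whose offset isn't zero."""
        if dt.tzinfo is None or dt.utcoffset() != timedelta(0):
            raise ValueError(f"expected an aware UTC datetime, got {dt!r}")
        return dt

    ensure_utc(datetime.now(tz=timezone.utc))  # fine
    # ensure_utc(datetime.utcnow())            # raises: naive, even though it "is" UTC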
Removing utcnow() doesn't force you to use tz-aware dates though. That's just a leap that the author of this article has made.
If you're working in a system where all dates are utc, then datetime.now() gets you a date in utc and you don't need datetime.utcnow(). The only reason to call utcnow is if you're working on a system where the local timezone isn't utc, so you inherently have to deal with timezones at some level. Which means that using naive timestamps is a bad idea.
datetime.now(tz=timezone.utc) is what you probably want most of the time anyway. You can always manually strip off the time zone component if you really want to use datetimes without associated time zones.
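Concretely, a minimal sketch:

    from datetime import datetime, timezone

    aware = datetime.now(tz=timezone.utc)   # tz-aware, in UTC
    naive = aware.replace(tzinfo=None)      # same wall-clock values, zone stripped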
Last startup I worked for hired a Nepal-based QA person. There was a bunch of calendaring and daily/weekly charts in the apps, and she found bugs in _everything_.
I make sure to test with Nepal time whenever I'm testing date/time stuff now.
And, of course, there's the (hopefully apocryphal) story of the French initially referring to GMT as 'Paris time minus nine minutes and twenty-one seconds'.
As it happens, this story isn't apocryphal at all! One can readily find the original law, which was enacted on March 9, 1911, and published in the Journal officiel of March 10, 1911 [0]:
> Single article. — The legal time in France and Algeria is Paris mean time, set back by nine minutes and twenty-one seconds.
The decree which finally replaced it was made on August 9, 1978, and published on August 19, 1978 [1]:
> Art. 2. — Throughout the territory of the French Republic, legal time is defined from Coordinated Universal Time (UTC) as established by the Bureau international de l'heure.
When I built a live clock for some new CasparCG-based graphics for a major TV program out of Singapore some years back, a colleague reviewing it in London tried to trick it with a Nepal offset — apparently they'd run into an issue with the Viz system they used in 2015, when there were a lot of live broadcasts from Nepal.
RealLifeLore just had a video about time zones in that area of the world; there's an area between New Zealand and Hawaii where you can go north/south and jump an entire day.
There isn't a perfect geographical width for time zones. So humans pick something to define the boundaries between time zones. And making boundaries on a map is political.
To some extent a compromise between solar time and political and economic realities. For example, the Eastern Time Zone in the USA stretches almost to 90 degrees West, reflecting the East Coast's powerful pull on that part of the country.
A lot of it was GE's fault at the time, specifically. GE wanted facilities in New York, Cincinnati (OH), and Louisville (KY), among others, to all be in the same time zone and had the economic power at the time to lobby the railroads and the cities to make that happen.
A friend told me that Indiana not adopting Daylight Time until quite recently was due to a struggle between broadcasters, who of course wanted to match their networks, and drive-in operators, who wanted it to get dark earlier so they could start the shows earlier.
Yeah, Indiana also had an interesting three way battle between Chicago, Louisville, and Indianapolis. A lot of population near Chicago getting Chicago broadcasts (Central), a lot of population near Louisville getting Louisville broadcasts (Eastern), with the state capital Indianapolis interestingly caught up in the middle (physically closer to Chicago, but maybe emotionally more connected to Louisville) which itself as a city eventually after a lot of back and forth settled on Eastern time following Louisville's lead as one of the westernmost cities in the timezone.
Indiana's Daylight Time mistakes were fascinating. It wasn't that the state didn't adopt it, it was that originally the state allowed it to be a per-county decision as timezones have always been in Indiana. At one point in time if you were traveling I-65 which is nearly due north/south between Louisville and Indianapolis you could experience four different timezones (CST, EST, CDT, and EDT) and which ones agreed with each other obviously depended on which month you were traveling. Since Indiana went state-wide Daylight Time and Indianapolis decided on EDT once and for all, all of I-65 today is EDT I believe, but it is still strange to remember the years where that wasn't the case.
(ETA: one of the underappreciated homogenizing factors here has been the modern cellphone. People would get really confused if their cellphones hopped an hour back/forth every so many miles as you passed county lines. In the eras of paper maps and hand-set clock radios in cars that would have mattered a lot less.)
In Europe, a huge driver for standardization of timezones into "reasonable" slots was railway traffic, including cross-border traffic.
Railways are extremely sensitive to exact time and, indeed, the very concept of unified time across the entire region or country only started developing when railways expanded across Europe. Prior to that, individual towns were happy with their own local solar time, but once railway connections were introduced, time irregularities would cause chaos at best and carnage at worst. That led to introduction of unified railway time which developed to timezones as we know them.
Railways aren't as prominent nowadays as they were 100-150 years ago, and countries like Nepal and India don't have extensive, frequently used cross-border railways anyway; any cross-border traffic is sporadic and mostly freight. So there is one fewer reason to cooperate with your neighbors when it comes to time-related issues. Trucks can take weird timezone changes just fine.
That far south it only varies by a couple of degrees. It does wreak astronomical havoc in Mumbai, though, where the Sun will sometimes be way in the northeast at noon.
Because humanity never understood time properly, we have all these facades making it look "simpler" while actually being a lot more complicated.
A long time ago there was a special `$ man date`-like page in Linux which went on at length about many "amazing" calendar facts, like some feudal ruler in 1553 deciding that a certain week was bad and striking it out of his country's calendar, or another who liked a certain month and decided to repeat it...
Then accept that UTC is a timezone and just use that everywhere. There are also a lot of things that can’t be easily stored in UTC, like opening hours.
Something like “9AM-5PM EST” can’t be stored in a datetime object even with a timezone slot! Datetimes with a timezone represent specific points in time, not vague “5PM” like concepts.
I feel like every time a discussion on timezones comes up, people compete to come up with different situations where "see, your perfect system can't work, we should give up on timezones entirely".
Here, I'm not even sure what your point is. A datetime object cannot capture a range, no, just like a number can't capture a range of values. But through the magic of having two of them, we can easily create a data structure containing two datetimes to represent a range. 5pm is not a vague concept, it is the time 5pm, the hour that typically comes after 4pm and before 6pm. If your point is that you can't store only a time in a datetime object, that's true; that's why the standard library also provides the `time` object, which represents a time.
Where things might get a bit more complex is if you want to store the time of a recurring event that should occur at the same time on multiple days for a given timezone. In this situation, you can typically use a naive time object to represent a wall clock time, plus the timezone that the user has requested. This way you still have all the raw information to decide when the event will happen in the future. (Note however, that the same time can occur multiple times on the same day, for example during daylight savings changes.)
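A small sketch of that approach with the standard library (the "9 AM New York" event is made up):

    from datetime import date, datetime, time
    from zoneinfo import ZoneInfo

    # Store the user's intent: a naive wall-clock time plus the zone they picked.
    wall = time(9, 0)
    zone = ZoneInfo("America/New_York")

    # Resolve to a concrete instant only when a specific date is in play.
    occurrence = datetime.combine(date(2024, 7, 4), wall, tzinfo=zone)
    print(occurrence.isoformat())  # 2024-07-04T09:00:00-04:00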
"every day, from 9AM to 5PM EST" is a thing that can be represented through calendar RRULEs. Usually libraries exist to help with this (Python has one, conveniently called rrule!). This is a thing to describe recurring events, has a spec and everything. So there is a thing.
And you're totally right about `time` existing, it slipped my mind that time has a tzinfo slot. I do think you should still take care and I would avoid having any naive time objects just floating around beyond the serialization layer though.
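A short sketch with the third-party python-dateutil package (the rule is just "every day at 9 AM", as naive wall-clock values):

    from datetime import datetime
    from dateutil.rrule import rrule, DAILY  # pip install python-dateutil

    daily_nine = rrule(DAILY, dtstart=datetime(2024, 1, 1, 9, 0), count=3)
    for occurrence in daily_nine:
        print(occurrence)  # 2024-01-01 09:00:00, 2024-01-02 09:00:00, 2024-01-03 09:00:00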
Yeah, the stock market opens at what time EST? Including daylight saving time, etc. I agree that any time time goes to a serialized format over a socket or to a database or file it should always be UTC.
UTC (or Unix timestamp, or any other non-localized date-time format) doesn't work for future localized events. It's fundamentally impossible to say what will be the UTC timestamp which corresponds to 9AM in some particular location in the future: the timezone of that location could arbitrarily change between now and then. Countries and smaller areas often experiment with such changes, wars happen and conquerors/liberators change the time etc.
So, when possible, it's best to store (future) human events in a localized datetime string. For things like physical events, the opposite is true of course - you can't say what will be the local date and time when the Sun will hit high noon over Athens 10 years from now, but you can certainly calculate the UTC timestamp for that.
Ignoring holidays/weekends: always 9:30AM or (0930) EST. Same with EDT: 9:30AM.
The key here is that the timezone info includes the daylight/standard designation. Or, in the ISO format, -04:00 or -05:00 from UTC.
I think you meant to put ET (Eastern Time), which is still 9:30AM, but without a date associated with that time, there's no way to convert that to UTC or other timezones that don't have the same daylight saving time schedule.
A store’s opening hours will probably remain 9am-5pm regardless of any time zone changes around it. If you run a bunch of stores across time zones you need to know where each store is and whether it’s summer or winter to work backwards to find out 9-5 in UTC.
The coffee shop in my neighborhood is open from 6am to 2pm Pacific. We observe daylight saving time. The shop hours do not change. So, what UTC value do you store?
This is only sometimes true. Cases like opening hours and daily batch processing rely on local time. Persisting the values in UTC means best case you are wrong half of the year and worst case all of it.
6am in my time zone is two different times in UTC. Right now it is 14:00 UTC. But when DST starts it will be 13:00. So what UTC value do you insert into your database for the opening time of the coffee shop? And how do you convert?
Yes, you can store additional information to decipher the meaning of a UTC time but why? Storing locale is good because sometimes you need to know where something is, such as when converting the local time to another zone. But you should avoid creating dependencies between these values because it makes using them individually harder.
The coffee shop opens at 6am. That is an unambiguous fact. The coffee shop is in Seattle, WA. That is another unambiguous fact.
If the time is in UTC then you need to know the date the time was set, or store a DST flag and then build some logic to convert. You can just feed local time + locale into any datetime library and get whatever you need.
UTC is great for a lot of things. Such as recording when something happened. But local time has a place as well.
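A sketch of the "feed local time + locale into the library" approach for the coffee shop upthread; the tz database supplies the right offset for whichever date you ask about:

    from datetime import date, datetime, time, timezone
    from zoneinfo import ZoneInfo

    SHOP_TZ = ZoneInfo("America/Los_Angeles")
    OPENS = time(6, 0)  # the only facts worth storing: "6 AM", plus the shop's zone

    def opening_in_utc(day: date) -> datetime:
        return datetime.combine(day, OPENS, tzinfo=SHOP_TZ).astimezone(timezone.utc)

    print(opening_in_utc(date(2023, 12, 1)))  # 2023-12-01 14:00:00+00:00 (PST)
    print(opening_in_utc(date(2023, 7, 1)))   # 2023-07-01 13:00:00+00:00 (PDT)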
You still keep everything in UTC in this new world, it's just the object you use internally now knows that it's UTC and makes it so you can't accidentally do calculations on two naive datetimes which happen to be in different timezones.
You get to have the typing help you enforce that everything is in UTC.
That code runs in the context of humans. Therefore it needs to take humans into account when running.
This is a bit like asking "how do you represent a string length in utf-8?" The answer is, you don’t. It’s almost never a problem in practice.
Just like there is almost never a reason to care about the number of characters in a string, it’s very difficult to imagine that this code comes up in practice. But if it did, the code is simple: for a given timezone, is the current time in utc past 9:30am? If so, the store is open.
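That check is only a few lines with the standard library (a sketch; the store's zone and the default opening time are placeholders):

    from datetime import datetime, time, timezone
    from zoneinfo import ZoneInfo

    def store_is_open(store_zone: str, opens: time = time(9, 30)) -> bool:
        """Is it past opening time right now, at the store's location?"""
        now_local = datetime.now(tz=timezone.utc).astimezone(ZoneInfo(store_zone))
        return now_local.time() >= opens

    store_is_open("America/New_York")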
Let's say you host a store front for many businesses and it's decided that your customers, who are businesses, some with many stores, want to store a default opening time. Something along the lines of "All Home Depot's open a 6 AM, all Lowes open at 7 AM."
The store property would be "6 AM". Anywhere it’s used, it’s running in the context of humans. When displaying on a website, you don’t need to do any conversions: the human viewing the store knows where it’s located, and they know the current time, so computers don’t need to answer the question of whether it’s open.
If computers do need to know, then they’ll need to know a specific store with a physical location. That gives you the time zone.
Indeed. It's not as bad as mapping address => sales tax (no ZIP Code isn't good enough), but only because it doesn't change as often and tends to follow county boundaries when it doesn't follow state boundaries.
Another good example. Arizona doesn't follow Daylight Savings, but the Navajo Nation, which includes the Four Corners area and is in three states, does observe the change.
However the Hopi reservation, completely surrounded by Navajo territory, is entirely in Arizona, and does not observe DST. So it can get quite complicated and it's not enough to say "Arizona doesn't have Daylight Savings Time".
Indeed. And anyone near a time zone boundary looking at an online map of store locations doesn’t want to see “opens at 6AM” or whatever — they want to know whether the store is open.
There's no time zones involved in that at all. Nobody is going to be checking the opening hours of a Home Depot in another timezone and expect to get the date in their local timezone.
I've never visited a brick and mortar store's website to check their opening hours and assumed it would be anything other than in their local timezone.
However, if I look up the opening time of a Home Depot in another time zone, I expect the result to be in that store's local time! I'll do the conversion myself. If I get my local time instead, I'll still perform the conversion, and get the wrong result.
You could just decide that any time listed anywhere on the internet should include a timezone... but that would be nuts.
So maybe individual stores can customize their opening time, but corporate still wishes to store a default. How do you represent the default? How do you handle the DST time changes for individual stores?
I'm trying to poke holes in the "always use UTC and you're good" advice. I have actually encountered this scenario, and I didn't know how to apply that advice.
> I'm trying to poke holes in the "always use UTC and you're good" advice.
There's a much simpler one - recurring meetings/events (in-person or local-ish like within one country) that cross a DST boundary. You always want these to be attached to a local time zone, so it doesn't suddenly happen an hour earlier/later than you intended.
utc is for a specific moment in time (deterministic with great precision for everyday use).
"9:30am local time" may be ambiguous/unknown (the rules may not exist yet)
Using the same programming type for both may cause confusion.
How do you express geographic coordinates? Noon in London and in NY are different moments in time (it may be important if you want to have a remote meeting).
No need for something as brittle as geographic coordinates or addresses. Time zone names are the right abstraction. A given location very rarely adopts a different time zone (except in case of annexations, handovers of territory, etc.), but time zone rules (zone offset from UTC, DST, etc.) change all the time. A consequence of this is that the time zone database of applications should be updated every few months!
"Complicated" then. One would have to consult a map to find out the time zone from geographical coordinates. The time zone is really everything needed to properly perform timezone-aware arithmetic.
Postgres stores this as TIMESTAMP WITHOUT TIME ZONE, sometimes called wall-clock time. IMO this is a good convention -- treating "ideal times" and specific epoch moments as two different types -- and more languages should support this (ideally with a less confusing label).
Reference opening time to local time on each day of the year, and then refer that back to UTC.
Like California is -8 in Standard, -7 in Daylight, so a 9:30 AM opening in the Winter is at 17:30 UTC, in the Summer, 16:30 UTC.
It's a hassle but programmers are supposed to model reality.
As an example of the problems that can arise, I remember when the new Japanese Emperor was to take office upon his father's abdication, there was a big fuss because the Era Name was customarily only revealed after the new Emperor's reign started but all the computer people wanted to be ready in advance. Not sure what finally happened, if it was revealed in advance or not.
What if the date for DST changes tomorrow, but only in some locations? How will you know which stores times need to be adjusted and which don't?
UTC to local time is not a function; it is not something that can be computed once and stored for the future. The two are fundamentally different concepts, and can only be related to one another in certain contexts, and only reliably for the past.
> How will you know which stores times need to be adjusted and which don't?
There are a number of libraries for this, usually updated by the unpaid volunteer in Nebraska. Or else your company has staff whose job it is to follow this for any place you operate.
If you record opening times as UTC, well, then you have to change the stored UTC value when the store switches standard <=> daylight.
This is what Fred Brooks called an essential complexity. You can poke the pain from one part of your system to another part, but it's never going away.
The function is complicated. There is no avoiding that.
One Pacific island nation skipped a calendar day when it switched International Dateline sides to more closely align with Australia instead of Hawaii, reflecting a shift in its economic ties. No idea how they handled it exactly, but it's pretty hairy to model.
> I suspect a lot of financial data code is going to break.
Reminds me of when Numpy removed a lot of types like np.bool a few releases ago, because they were aliases of the standard library types.
Which broke a ton of libraries having Numpy versions pinned to ">=1.0" or even worse, "~=1.0" expecting them to follow the standard practice of incrementing the major version when they break compatibility. It's not terrible because you just pin your project to 1.23 or whatever, but it was very annoying when everything I was using just started breaking.
I wonder if Python plans to do the same with this - the 2 to 3 transition was painful and dragged on for a decade, nobody wants to do that again for 3 to 4. When they remove this function and break back compatibility with old code, is that going to happen in a future 3.x release?
It's worth noting that numpy and scipy have a consistent policy now, although it's not semver. Deprecations will always happen at least 2 minor versions before removal:
So, when using numpy in a package, and you've tested e.g. versions 1.12 to 1.22, you should require 'numpy>=1.12,<1.25', because something deprecated in version 1.23 could be removed in 1.25.
Yes, they're almost certainly going to do it a 3.x release.
I think in some senses these backward compatibility breaks have already happened. I remember that Google got bit by async becoming a keyword in I think Tensorflow.
Generally a lot of old Python 3 code still works today if it doesn't do anything really weird like that, but I do think the Python 2 to 3 transition was painful enough that they don't really want to repeat it.
I think it is probably prudent to treat point releases as major releases at least with regards to Python.
Financial data deals with cutoff times, and those happen in a timezone, so timezones matter particularly in finance. You are right, though, that internally everything should be UTC and only converted to a timezone when required. However, many legacy financial systems assume a local time zone rather than UTC, and they do not expose that information in their data. Happy reconciliation :-).
If the market opens at 9am New York time then that's not the same as storing 1400 UTC. A law could be passed to change the offset on a given day.
You have to store time in the relevant geographic time zone. Something occurring at 1200UTC on dec 1st is not the same as something occurring at 1200Europe/London on dec 1st. The two times may coincide today, but if the U.K. government decides to implement daylight saving next week as a fuel rationing measure, then your 1200London event will have to be displayed at 1300UTC, but your 1200UTC event would be at 1200UTC.
1200 London time, so 1100UTC with daylight saving yes.
Point remains, many events are linked to local civil time, which can change - and not just to daylight saving. The Line Islands in Kiribati shifted a whole day in 1994, same with Samoa in 2011. Have an event set for 0800 local on the first Monday of the year in Samoa which you stored as UTC, and your events would have broken. Unless you store the relevant data (it occurs on first monday according to the timezone that Samoa is in), then you will be wrong.
Now of course you can still get problems if a new timezone is created, as time is stored as Europe/London, if Scotland creates its own timezone in the future a meeting in Aberdeen would no longer be in Europe/London but in a new area Europe/Edinburgh, with different behaviour, but that’s far less common than DST changes or other date shifts.
All in all you should store the relevant datetime and zone, not just “normalise” it to UTC.
One edge case to this: I've worked on financial systems where we kept the Asian market systems in JST, while EU/MENA/Americas ran on UTC. As long as you aren't dealing with FX/futures, markets are generally open no earlier than 7am and no later than 6pm. This worked nicely since it meant that the DATE in each region's system and local clocks agreed.
If “everything is UTC, all the time” is enforced at the entry edges of the code, couldn’t the entry edges be refactored to always specify the UTC timezone?
It definitely will break financial firms. I just did a code search in our repos for utcnow and other relevant functions, and the indexer was definitely wheezing pulling in the hits.
Deprecation should not break anything. At some point the code will be removed and the org will have to pin the version until the deprecations are resolved.
Using UTC as a universal time standard is stupid because it contains leap-seconds while TAI does not. TAI64* should be the preferred computational monotonic time format.
Stupid decisions don't always get put into practice after they're announced.
I seem to remember reading that remove-if-not is deprecated in Common Lisp, and that this is almost universally viewed as a mistake, but it remains deprecated (and the deprecation is ignored by everyone) just because the standard doesn't change anymore.
They aren't removed now; they "will be" in a future version. So the can will be kicked down the road for as many generations of devs as needed, until no one cares any longer.
So, they remove it in 12 months and your code wasn't ready for the change, and people mock you for having not been prepared by updating your code. The Python people have told you to remove it, so nobody will have any sympathy for you if you don't do it. As you demonstrated, people are sticklers for quoting exact wording when it comes to criticizing others.
Has anything in Python been removed in 12 months? TestCase.assertEquals was deprecated in Python 3.2 (February 2011) and was finally removed in Python 3.12 (October 2023). And that was 12 years for just a simple rename from "assertEquals" to "assertEqual".
The warning is also opt-in; you'll have to run python with PYTHONWARNINGS=always or similar to see it.
Which is scary, because you will be taking the performance hit from generating the warning regardless, which can easily amount to 30% of your total CPU time if you use utcnow/utcfromtimestamp a lot.
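If you'd rather find the call sites now than at removal time, one option is to promote the warning to an error in a test or CI entry point (a sketch):

    import warnings
    from datetime import datetime

    # Or run: python -W error::DeprecationWarning your_app.py
    warnings.simplefilter("error", DeprecationWarning)

    datetime.utcnow()  # on 3.12+, this now raises instead of silently warning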
I don't understand why the functions weren't just made to return non-naive results. You've literally asked for UTC. There's no excuse for allowing the result of that to be interpreted as something that isn't UTC.
Set a separate flag that governs conversion to string to avoid different ISO output for shitty string parsers if you want, but there's no great reason to not just fix the functions so that they continue to work better than before and do the thing that people actually expect them to do, which is produce UTC.
Intentionally naive storage or transfer is irrelevant. That's just serialization. And when you explicitly tell the computer to "deserialize this as UTC" it should fucking do what you told it to and give a zone aware result.
This would be a backwards incompatible change that would cause a lot of issues. For instance, you are not allowed to compare naive and non-naive datetimes, so for instance, `utcnow() > datetime(2023, 11, 19)` would work before, but break following your change.
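For the record, this is the failure mode in current Python:

    from datetime import datetime, timezone

    aware = datetime.now(tz=timezone.utc)
    naive = datetime(2023, 11, 19)

    aware > naive
    # TypeError: can't compare offset-naive and offset-aware datetimes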
It is one that can be found via static analysis. That may not be true if behavior is changed. In some cases, this may be "ok" (see: Go's recent loop changes), but in others, it may not.
Yes, but "Unknown function datetime.utcnow" is much easier to diagnose than somewhere in your application, possibly inside a third-party library or maybe even in a different service altogether (because the change traversed serialization and deserilialization) throwing an exception because you can't compare two datetimes.
After the change that you propose, the timestamps written by this code will jump by an amount equal to the offset of the local timezone this machine runs on (and also lose any DST adjustment they might have had).
Now, let's imagine another independent process is consuming events from the database and compares an event written before this change to one written after.
Python is a popular language, used in production by scheduling applications around the world. The potential downside of this kind of change is measured in billions of dollars. The safe solution is of course to not silently change the API in a breaking way, and instead introduce a new API and get someone to look at why the code stopped working.
On my machine, a naive timestamp from utcnow() and a UTC timezone-aware datetime object from datetime.now(tz=ZoneInfo('utc')) produce the same timestamp.
datetime.now() does not return utc time (but does return a naive datetime object)
> I don't understand why the functions weren't just made to return non-naive results. You've literally asked for UTC.
I totally agree with you that it makes no sense this `utcnow()` does not return tz-aware object.
But I think their fix is great in this regard: now you just use `datetime.now(timezone.utc)`. It's more explicit, it returns aware DT object now, and more importantly, it gets rid of a function that I always feel too specific to have anyway (like how we don't have `plus(a, 2)` and `plustwo(a)`).
Except isn't datetime.now() still going to return a tz-naive datetime? I would much rather have a method that requires an explicit tz, or explicitly uses UTC (like utcnow, but with the right return type).
Looking for this mistake in code reviews gets really old.
I got bit by this once. Timezone-aware date calculations can be an order of magnitude slower than non-tz-aware ones. Consider: every time a legislative body changes DST, it creates an entry in the timezone database that needs to be checked when adding or subtracting time in that timezone.
This is one of the reasons why Unix epoch time soldiers on, even though it is totally indecipherable to humans. It can be easily mapped to a timezone-aware type, and performing arithmetic on it is trivial.
If this was something used only by your company or a specific project, then I’m absolutely on board. However, this is in a library used in thousands (millions?) of projects. Since it’s potentially a breaking change you can’t just change it and expect everyone to know.
Many people won’t know it’s deprecated until the code fails to run, and having new functions rather than changing the functionality of existing functions makes it much easier to identify what is actually happening.
One advantage of getting older is that you see a lot of change. In the long run, simple ideas tend to win. And there have been lots of cases where the programming world collectively agreed that some more complex idea was the right way, then a decade later everyone has reverted to the simple idea.
UTC is simple and unambiguous. Time is a number representing a fixed point in time. When displaying it for humans, convert it into human form. When this principle is applied everywhere, it’s hard to make mistakes.
My gut says that deprecating utc as the native time representation will be one of those ideas fervently embraced by a few, ignored by most, and then quietly forgotten about in a decade.
It doesn't seem like they're deprecating "utc as the native time representation." They're just making it harder to get a datetime object that "has no time zone." The message with this change is: Go ahead and keep using UTC, but explicitly specify it, and the object itself will know its own UTC-ness.
This misses what the change actually is. Python has two datetimes, one which includes the timezone info and one which doesn't (naive).
There is no assumed timezone for naive datetimes and different software treats it differently. What's being deprecated is methods to create naive datetimes that are implicitly UTC without just attaching that information to the object itself. In the new world everything is still in UTC and you convert to local time at the last minute but now you can be sure that the datetime object you got handed is actually UTC instead of it being a convention that isn't always honored.
They're not deprecating UTC. Did you read the link? They're deprecating functions that read as authoritative but are actually naive about timezones, which can lead to non-UTC results.
I think the recommendation is to use a datetime with the timezone explicitly set to utc? I think the numbers vs. objects thing seems like a distraction...
You're free to use numbers instead of the datetime module. In fact you don't have to use any part of the standard library if you don't want to. I don't understand your remarks at all.
If you think UTC means that "time is a number" you are very confused though. UTC has no relation with the UNIX timestamps (and the epoch), though they are both ways to represent a specific point in time.
You're focusing on representation where you should be thinking about semantics.
A "naive" datetime is the one for which it is not known what the timezone is. The opposite is the one for which it is known. You can absolutely encode UTC timezones as seconds since epoch without any additional information (except that one bit that is needed to distinguish them from "naive" ones, which doesn't have to be a literal bit - it can be just a different data type, for example).
And, yes, it would sure be nice if we didn't have "naive" datetimes at all, and everything was always UTC. But there are too many datetimes recorded in the real world without enough information to recover the timezone and thus the precise UTC time, and they still need to be processed. The mistake was to make this the default behavior - but that is a very old one, dating back to earliest days of computing.
>> A "naive" datetime is the one for which it is not known what the timezone is.
> Then it is UTC
No, a naive datetime is "now", i.e. a time description that lacks enough information to fix that moment relative to other datetimes. It lacks TZ information.
I happen to physically live at roughly 2.6° west of the Greenwich meridian and at 50.9° north of the equator. It is now 23:08 (to one minute precision). I have given you enough information to derive my active time zone, if you consult the TZDB, and hence to convert my given naive datetime to one that will work for you. If I hadn't included both a northing and an easting (I actually gave you a "westing", but that is simply a negative easting!), you could go very wrong. For example, France, Spain etc. are directly south of here and they are an hour "ahead" of the UK right now.
Right now it is: "23:15 UTC+0" I could use GMT+0 instead and that would work too but let's keep it French! That's fair - the Greenwich meridian was picked over the Parisian one (and several others) for 0° east/west for time zone calcs but let's call it UTC instead of CUT which is what the english translation would end up being. I actually prefer UTC - CUT is also a word.
Now, I have not included the date part of datetime in my examples above. There are multiple naive interpretations for a datetime. Calling them all "UTC" is simply not going to cut it 8)
Says you. The python documentation has never said that naive datetimes are UTC. If the language suddenly makes this assumption, then you break existing code. If you don't want to break existing code, your only option is to 1.) deprecate the existing function and 2.) provide an alternative (which is provide an explicit datetime).
No, it is not. It is a datetime for which it is not known whether it is UTC or not. Assuming that it is is only marginally less bad than assuming local time.
Time is two different things. Just making everything UTC until display time is incompatible with that. Sometimes you need to operate on calendar units or clock units as part of business logic.
Scenario 1: You have an event to schedule at 9am local at some future date. You want to convert to UTC. But you can only do that if you know the offset, and the offset for future dates can change or be unknown. How do you resolve this?
Scenario 2: Now the event is "9am every day, indefinitely". How do you represent that as UTC? Do you generate tens of thousands of separate timestamps?
All of this is fine if you are talking about timestamps. But Python's datetime library can do more than that. Basically once you get into future dates that it all starts to fall apart.
I'm the one arguing against this move; I've already made my thoughts clear that it's a lot of churn (e.g. the PR to remove it from Pip literally caused Pip to have to do a bug release), and that the old behaviour has valid use cases, e.g. modelling datetime data types from spreadsheets and databases that do not have timezones.
But I also accept I'm not the one maintaining Python, so I've updated my codebase appropriately (`datetime.datetime.now(datetime.UTC).replace(tzinfo=None)`). Python is not a backwards-compatibility language, and it's clearly not a strong motivation for most users, as I don't see anyone attempting to write a backwards-compatibility shim over the standard library, so there's no point complaining too much.
Time handling seems like one of those cases where a type system could really come to the rescue - times with and without time zones in them shouldn't be the same type, and using one in a place where the other is expected should ideally just give you a compile error.
Reminds me of trying Scala and a library where not only were paths a type rather than primitive strings (good, like pathlib in Python), but absolute and relative paths were actually different types.
Here are two counter examples that sprang to mind:
1. I've set my alarm for 8am local time every day. I don't want my alarm clock to go off at 3am just because I'm in a different country
2. If I have an appointment at 10am next November, that's actually a datetime with a locale (given that my appointment is at 10am local time, no matter whether the DST rules change between now and then).
Well, you'll need another type meaning "local time" for #1. That's not a time without time zone, it's an entirely different thing.
Most environments simply can't represent that thing. They never could. With those changes this becomes more obvious, and with some luck people will finally fix this.
(The most absurd case is the "timestamp with local time zone" from SQL, that reads exactly like what you want, but represents a timestamp in a specific time zone, that is discoverable but mostly unknown.)
Well, you seem to have some weird definition for the word "exactly".
"8AM" is exactly time without a timezone. "8AM at local time (local defined by this procedure)" isn't exactly that.
Perhaps you mean something like homomorphic. It is almost perfectly homomorphic. But if you define your types that way, you'll push yourself into the most useless ones you can get quite quickly.
At my last job we had types for a `Time.t` (time + date + timezone), a `Date.t` (just a day of the year), and a `Time.Ofday.t` (a time of the day with no timezone or date attached - like 8:00 AM).
This worked really well! You represent (1) with a `Time.Ofday.t`. I suppose if you wanted you could represent (2) with a `Time.Ofday.t` + a `Date.t`, although it kinda seems to me like you want to keep timezone information around if you're dealing with DST changes.
I think ISO 8601 made a big mistake by allowing timezone-less representation[1]. They should have used L for local time so it's explicit, and not allowed any timezone-less representation.
Sure it's technically well-defined as it is, but I think it's too important not to be written out.
In that case you can represent your alarm either as 08:00L if you want to wake up in the morning regardless of where you travel, or 08:00-02 for those cases where you need to stay in sync with home say.
I completely disagree, the local time is the only useful part of ISO 8601.
The big problem here is that it does not use timezones, but UTC offsets. Timezones can and do change their UTC offsets, completely breaking all future timestamps saved in ISO 8601-with-"timezones". "08:00 UTC-02" is ambiguous, "08:00 America/Nuuk" is not.
You don't need both. Named zones can do everything that offsets can, and it's still useful to know the named zone for historical timestamps in many situations.
> After all, if a zone changes offset today it shouldn't change the timestamp I recorded yesterday.
That's not how named timezones work. That behavior would be bad and it would also qualify as ambiguous.
> Half the world experiences a scheduled offset change every time summer time begins or ends.
But they do so because they record the actual offset when the time is recorded. If I send you 2023-10-29T02:15NO, how do you know if it's +01:00 or +02:00? Similarly, if Norway suddenly decided to switch their offset to be 15 minutes back, how will you handle that when reading previously recorded timestamps?
> If I send you 2023-10-29T02:15NO, how do you know if it's +01:00 or +02:00?
I think instead you'd send a timestamp (unix or NTFS or TAI or other) annotated with "Norway", and you would not send the offset.
> Similarly, if Norway suddenly decided to switch their offset to be 15 minutes back, how will you handle that when reading previously recorded timestamps?
Am I that bad at explaining this?
If Norway changes their offset, they will change it as of a specific moment in time. Times before that moment use the old offset, times after that moment use the new offset. The timezone database doesn't just store one or two offsets for a location, it stores every offset that has ever been used.
Checking whether you're before or after 5am on Feb 29th 2024 for a 15 minute change uses the same code as checking whether you're inside any particular year's version of summer time.
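This is also how the standard library models it: a named zone plus a one-bit disambiguator (`fold`) is enough to recover the offset, even for the repeated hour in the example upthread (a sketch):

    from datetime import datetime
    from zoneinfo import ZoneInfo

    oslo = ZoneInfo("Europe/Oslo")
    first  = datetime(2023, 10, 29, 2, 15, tzinfo=oslo)          # fold=0: the first 02:15 (CEST)
    second = datetime(2023, 10, 29, 2, 15, fold=1, tzinfo=oslo)  # fold=1: the repeated 02:15 (CET)

    print(first.utcoffset(), second.utcoffset())  # 2:00:00 1:00:00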
> I think instead you'd send a timestamp (unix or NTFS or TAI or other) annotated with "Norway"
This is what I attempted to write with "2023-10-29T02:15NO" (NO being Norway's country code).
> Am I that bad at explaining this?
Could be I'm dense, and it's not my field of expertise.
Let's say Norway has decided to change its offset from +01:00 to +01:15 at 2024-01-01T01:00:00Z. I have a process that records timestamps, and let's imagine it records them with the suggested named offsets instead of numerical offsets, in the format shown above.
Later next year, you look in the logs and see 2024-01-01T02:07:00NO. How do you know which time this is in UTC?
With numerical offsets this is simple, because it would either be recorded as 2024-01-01T02:07:00+01:00 or 2024-01-01T02:07:00+01:15. But when you just have NO, how do you convert the timestamp to UTC?
A Unix/NTFS/TAI timestamp would not be affected by the offset. Using Unix as the example:
2024-01-01T02:07:00+01:00_NO -> 1704071220_NO
2024-01-01T02:07:00+01:15_NO-> 1704070320_NO
The timestamp doesn't have to be UTC, but it should specifically not be local time. You should be able to convert it to UTC without knowing the offset or time zone.
So, done this way, storing time zone and no offset is unambiguous. And it gives you more information than storing an offset and no time zone.
Thanks, yes that makes sense. I was working under the assumption that ISO 8601 was meant to be human-readable. If you can forego that requirement, then the solution you present would work well indeed.
1. That isn’t date data. That’s just storing a local time.
2. That’s why you WANT time zones in this example. It will automatically stay at 10am instead of bouncing around when DST rules change.
I’ve never ever needed timezone-less dates. Even using UTC or timestamps, it’s still UTC.
A date without a time zone is like “10 inches” but without the inches so it’s just “10”. Absolutely meaningless. You start moving just “10” around without the units in your code and then your Mars Orbiter explodes.
>2. That’s why you WANT time zones in this example. It will automatically stay at 10am instead of bouncing around when DST rules change.
Respectfully, no - in my scenario, if I used a timezone then I would show up at the wrong time for my appointment: each locale that observes DST has two timezones: one for DST, one for non-DST (e.g. BST and GMT for the UK). If I record a 10am appointment as "10am BST" and my country changes the rules on daylight savings transition date between now and the appointment, then I will show up at the wrong time for my appointment. Whereas if my appointment is recorded as "10am Europe/London" then it's encoding the human reality of my appointment, and can be correctly converted to an Instant in time if the rules change.
It sounds to me from your reply that you are mentally modelling all datetimes as Instants. These are one key aspect of dates+times (and often all a lot of software needs - since it's usually referring to some point in time that has already passed, or a future point in time that is x seconds from some other Instant), but Instants don't cover everything required when working with dates and times.
>I’ve never ever needed timezone-less dates. Even using UTC or timestamps, it’s still UTC.
Try sketching out the design for a calendar application and I bet you'll very quickly run into a whole raft of datetime related logic you might never otherwise encounter. Think particularly about future events: some are fixed instants in time (e.g. a meeting with people across timezones), others aren't (e.g. a dentist appointment in your city - my second example above is an instance where the local time of the appointment is relevant, not the Instant in time that local time referred to when the appointment was made). How do you represent recurring events?
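To illustrate the BST/GMT point above: a zone name resolves to the right offset for whatever date you apply it to, which a fixed offset can't do (a sketch):

    from datetime import datetime
    from zoneinfo import ZoneInfo

    london = ZoneInfo("Europe/London")
    print(datetime(2024, 7, 10, 10, 0, tzinfo=london).utcoffset())   # 1:00:00 (BST)
    print(datetime(2024, 11, 10, 10, 0, tzinfo=london).utcoffset())  # 0:00:00 (GMT)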
For the classes of problems that you mention, I consider them as separate data types entirely and they shouldn’t ever touch your timezone-aware date-times objects. They should be stored as a composite types of your date, maybe time, and target timezone.
If you can mix timezone-aware objects with non-timezone-aware objects like you can in Python, it gets pretty dangerous, because people will just mix them willy-nilly, add and subtract the timezone-aware parts, and then you get some really subtle date/time bugs.
That’s why I believe either a date-time MUST have a timezone and for other cases, you need a specialty data type for your domain, possibly provided by your date/time library.
That’s not a date-time though. You would not store that using the same type-compatible objects as your regular date-times or that’s how you end up one day blowing up a Mars Orbiter.
Yes, you need both zoned and zoneless versions of dates and times. This is why e.g. java.time has both (and it has LocalDateTime for when you need to say Santa delivers presents at 00:00 on December 25).
Zoneless time is not a mistake. Using it when you meant to use a zoned time is.
Making it easier to use zoneless over zoned is a mistake, though. An even more grave one is making zoneless the default. Unfortunately, that's a mistake that we've made decades ago, and now we have to fix it in a way that deals with all the accumulated mess.
Really, though, this is a recurring lesson in software engineering: don't be in a rush to make something a default just because it's "obvious"; explicit is better than implicit. Strings are another example; we've treated them as arrays of characters for so long that it's still a common default semantics for indexing a string, except now what you often get instead is UTF-8 or UTF-16 code units rather than actual code points. All of which is likely not what you actually need - but what you do need depends so much on the task that it should always be explicit.
Which days are recognized by an administrative region as holidays is an orthogonal concern from timezones. e.g. the US recognizes the 25th but spans multiple timezones.
> 1. I've set my alarm for 8am local time every day. I don't want my alarm clock to go off at 3am just because I'm in a different country
Without a timezone, your clock would actually go off in the middle of the night when you are in a different country. Your clock does not know you changed location (no timezone, no location-dependent time), so it will go off exactly 24 hours after the last alarm, which, in another country, can be midnight.
> 2. If I have an appointment at 10am next November, that's actually a datetime with a locale (given that my appointment is at 10am local time, no matter whether the DST rules change between now and then).
If everyone is staying in your city, then that works great (sans catastrophic continental rift or meteorite strike). The moment you involve more than one geographic location, the appointment is meaningless without timezones.
Now, maybe it is actually the case that you and your software only care about one geographic location in perpetuity, but just like assuming 32-bit IP addresses, 8-bit ASCII, and length-oblivious arrays are all you need, the teeth marks from its consequential bite tend to be quite deep and sanguine.
> Without timezone, your clock would actually go off in the middle of the night when you are in a different country. Your clock does not know you change location
This does not describe any modern person's experience. Your clock is connected to the internet; it always knows the local time.
You're missing the point. Evaluating the rule requires a timezone, but the rule itself is not defined with reference to a specific timezone (well, the second one is assuming you can use something like PT that changes between PST and PDT throughout the year). To define the first rule requires a zoneless time.
1. could be better served by an integer representing the minute it should go off anyway, since there is no specific date attached
2. time zones account for DST so that point is moot, but in the context of traveling and still wanting to maintain the same times, this is actually something calendar apps often get wrong
No, I can't think of any case where that would be useful (although I do take umbrage with a local datetime being referred to as a 'naive' type - it serves a very specific purpose as a primitive... if it's naive then so is an int, since it does not specify a unit).
I don't know anything about Python's datetime handling (other than to be alarmed by it given what I've seen in this thread). I live in the Java world, where the java.time library is one of the best around (after a decade or more of Java having one of the worst date/time libraries).
datetime with locale is how timezones are represented in python. What you're asking for there is a datetime with a timezone, not a datetime without a timezone.
"8am" is a time that inherently has no timezone, and yet we need it. It's not that we don't need timezoneless time - we just need to stop using timezoneless time to represent times with a time zone and vice versa. Most time APIs were simply designed wrong on purpose for "simplicity".
All timestamps even have a position in space that experiences discrepancies relative to all other timestamps, even if they are UTC; we just usually don't care about that very much.
Since our time is at least all Earth-based, we could do even better than timezones and make timestamps have a lat/long attached for the place in which they were produced. Maybe a linked list for any timestamps that are calculated from that base type.
The social solution is to abolish timezones (mine is of course objectively the right one, other people must deal with breakfast at 9pm wherever they are).
I need a null value for when the timezone is unknown. If a user enters 9:30, they do not want to be bothered with specifying a timezone if they keep everything local to their life.
Right now, I am dealing with exactly this problem: I have historical data recorded in both Eastern and Mountain Time. Everything is local, and nary a timezone in sight. Data files are mixed, with no obvious way to determine provenance.
I cannot just guess what timezone these data points were collected in, so I have to make do with local time. Which is 100% fine for the users, because that's exactly how they think about the problem. There are probably no global events against which they need to sync; they're just interested in the relative differences between events.
Without UTC and offsets, though, you can't get relative differences between all events accurately, because of DST. But that may not matter for your use case.
Well then maybe they should store the epoch alongside the duration value too! Wouldn't want to get confused thinking it was duration since some other epoch!
It's not as helpful as you'd think IME - timezoneless datetimes are surprisingly niche (at least as a useful tool to intentionally choose, rather than an accidental side effect of API choice.)
Reading this article prompted me to future-proof a program I maintain for fun that deals with time; it had one use of utcnow, which I fixed.
And then I tripped over a runtime type problem in an unrelated area of the code, despite the code being green under "mypy --strict". (And "100% coverage" from tests, except this particular exception only occurred in a "# pragma: no-cover" codepath, so it wasn't actually covered.)
It turns out that because of some core decisions about how datetime objects work, `datetime.date.today() < datetime.datetime.now()` type-checks but gives a TypeError at runtime. Oops. (cause discussed at length in https://github.com/python/mypy/issues/9015 but without action for 3 years)
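A minimal reproduction of that mismatch (the error text is roughly what CPython prints):

```python
import datetime

d = datetime.date.today()
dt = datetime.datetime.now()

# mypy --strict accepts this comparison, because datetime is a subclass of
# date, but CPython refuses to order a date against a datetime at runtime.
try:
    print(d < dt)
except TypeError as e:
    print(e)   # can't compare datetime.datetime to datetime.date
```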
One solution is apparently to use `datetype` for type annotations (while continuing to use `datetime` objects at runtime): https://github.com/glyph/DateType
The way you put it makes it appear very surprising that time is not the extended hello world of type systems. Maybe it's because no environment has ever arrived at an implementation it was actually happy with before fizzling out at "good enough"? It's clearly a topic where we tend to prefer something rough but shared over something better but custom, so the "good enough" is powerful.
This would be fantastic. As someone who just occasionally touches Python, I can’t count the number of times I’ve been bitten by bugs caused by unwittingly mixing tz-naive and aware datetimes.
I look at the arguments in this post and see echoes of Y2K:
- It's a clever hack based on an assumption that dismisses the fact that this assumption can change (that the time zone will never be changed, or that everyone will follow the same rules once it does).
- The argument for doing it is: It "uses less space" and "takes less time"
These are similar to the reasons why we used 2-digit years for so long.
Timestamps without time zone data are BUGS (and I mean REAL time zones, not time offsets like CET and PST). They are landmines just waiting to go off once the assumptions they're based upon change. They require your host system to be configured just-so, making your software brittle. They push the ugly details of handling many time issues that could be handled centrally into the user-space (and even worse, they make it OPTIONAL). The Python maintainers are right to remove this, and really they should have removed naive datetime when they released Python 3.
Time is one of the most notoriously difficult things to get right. I've yet to see software using time calculations (or anything more complex than recording timestamps of past events - the only thing for which UTC is useful) that is bug-free.
Time data ALWAYS REQUIRES time zone data. The only "exception" is timestamps of past events, but only because we always assume them to be UTC (and so there's an implied time zone so long as nobody bucks the trend).
Local time is very important. "Tomorrow at 9:00" is a meaningful local time for everyone, but doesn't refer to the same global time. It would be wrong to attach a time zone to such data.
Past events, on the other hand, do require timezone data if they are recording human activity, which is basically everything. "Just use UTC" is data loss. You now don't know whether an event took place at midday or midnight. Two very different results for many activities.
Imagine you have users all around the world and you've got logs of their activities. Someone asks "do users spend more time on X in the morning or afternoon"? If you only recorded absolute times and threw away their local time then you can't answer that question.
It doesn't matter whether you record UTC or local time as long as you record their time zone, you can always derive one from the other. This is an implementation detail that should be abstracted by any decent time library.
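A sketch of that derivation, assuming Python 3.9+ (zoneinfo) and that the zone name is stored alongside the value:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

zone = ZoneInfo("Asia/Tokyo")
local = datetime(2023, 7, 1, 9, 30, tzinfo=zone)   # the user's wall-clock time
utc = local.astimezone(timezone.utc)               # 2023-07-01 00:30 UTC

# As long as "Asia/Tokyo" is recorded, either form can be recovered from the other:
assert utc.astimezone(zone) == local
```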
This seems like the blatantly obvious path to me. Perhaps it needs to be dealt with in a major version release, though. TZ-naive and TZ-aware datetimes are both needed, and it's important to choose the correct one.
I'm not a Python guy so am not familiar with the culture, but I'd assume many, many programs use the deprecated function, given that it made "the news." It doesn't seem right, from a backward-compatibility standpoint, to deprecate something many developers are using simply because the old one wasn't thought out very well and "developers should switch" to the explicit functions. Especially since, as the author demonstrates, the deprecated one can be easily re-implemented. It seems like a developer inconvenience was added without changing the state of the world. Things like this--I don't know--make me take a development environment less seriously.
In C, we all know we shouldn't use things like strcpy() and gets(), and you'd be a total fool to add them to new code. But for the sake of backward compatibility with programs that aren't going to change, they should be kept in and documentation should nudge people to do it the right way. Maybe that's the Python way of nudging: "This function is deprecated!"
Fwiw, gets() was actually removed in C11 because of its inherent security problems. :)
But aside from that, I think it's just a different style. Waiting 10 years from deprecation to removal might be required with C just because of how prevalent C is and how much its developers have to think about basically every platform, but removing something in a major release isn't inherently bad; it's just a different tradeoff (being able to move a bit faster and fix some design issues vs. being very, very good at backwards compat).
The deprecated function can obviously be reimplemented pretty easily, but so can gets(). With utcnow(), it had some unfortunate effects in that it didn't really do what people thought it did, and because of that, most of the uses were basically bugs to begin with.
(Having experience with .NET for example, you'd think that DateTimeOffset.UtcNow and datetime.utcnow() are semantically similar, but the problem is that utcnow() doesn't actually have a timezone attached to it. Adding one would also be a breaking change, and one that is more subtle.)
> Fwiw, gets() was actually removed in C11 because of it's inherit security problems. :)
It always exists; it's just that the declaration gets controlled by a macro. Which hints at a way Python could have solved this same problem without requiring auditing and rewriting code.
Different language ecosystems have different standards for backwards compatibility, with C (and e.g. Java) being incredibly conservative by modern standards. This is not a good or a bad thing per se, it's just something to consider when picking a tool for the task at hand.
Am I correct that the datetime.datetime.now(datetime.timezone.utc) syntax is backwards-compatible all the way to Python 3.2?[1] If so, this seems like a good move. The edge cases like compatibility with data sources that don't indicate the time zone seem like less of an issue than data corruption due to developers not realizing that they were working with "UTC" timestamps that weren't actually UTC.
I'd be more concerned if this was one of those changes where there isn't a good way to write reasonably straightforward code that will run in relatively old versions of Python as well as newer ones.
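For what it's worth, the replacement idiom looks like this; datetime.timezone was added in Python 3.2, and the shorter datetime.UTC alias only in 3.11:

```python
import datetime

# Runs on Python 3.2 and later:
now = datetime.datetime.now(datetime.timezone.utc)
print(now.tzinfo)   # UTC -- an aware datetime, unlike utcnow()

# Shorter spelling, but 3.11+ only:
# now = datetime.datetime.now(datetime.UTC)
```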
Because you can now fully simulate arbitrary times during testing by providing your own implementation of the time provider. If you just take in a timezone as an argument, how will you simulate it being 2041-08-23T21:17:05.023743Z?
True. When you can just make any function be whatever you want, whenever you want, that sort of abstraction does seem a bit much. I was responding to the specific question in reference to the time provider in .NET 8.
You know what would help people adopt timezone aware timestamps? Just make now and utcnow return timezone aware timestamps! What's the point creating a naive timestamp when you know the time?
> An application may be designed in such a way that all dates and times are in a single timezone that is known in advance. In this case there is no need for individual datetime instances to carry their own timezones, since this uses more memory and processing power for no benefit, since all these timezones would be the same and it would never be necessary to perform timezone math or conversions.
The memory and processing differences between the two seem marginal, at best. Especially when it's at UTC ("plus zero" doesn't take too much processing).
So I'm gonna have to ask for benchmarks on this one.
And even if it was slower one must ask if having a confusing concept like a naïve datetime is worth it. Obviously Python is all about the "programmer productivity at the expense of some performance" trade-off.
Why not just use .now()? Sometimes your code gets run on machines that have varying local times, but you still want to work with implicit UTC times. For example, when testing server code, or when writing a client application.
Because the existing codebase already uses naive datetimes. For example, perhaps there is a feature flag component that includes checks like:
input_dt > datetime.datetime(2023, 1, 2, 3, 4, 5)
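And switching that call site to an aware datetime isn't free, because mixing the two doesn't merely give a wrong answer, it throws. A hypothetical sketch:

```python
from datetime import datetime, timezone

flag_start = datetime(2023, 1, 2, 3, 4, 5)    # naive literal baked into old code

input_dt = datetime.now(timezone.utc)         # the newly "fixed" aware value
try:
    input_dt > flag_start
except TypeError as e:
    print(e)   # can't compare offset-naive and offset-aware datetimes
```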
As mentioned elsewhere, it's common to serialize datetimes without a timezone because it's a huge waste of space and speed in databases and elsewhere to constantly include the timezone info. Of course you could handle this at the edge like many other kinds of serialization, but many apps exist which instead choose to keep all datetimes naive and handle TZ conversions at the edge instead.
The naive time thing in Python really is (was) a nightmare. I once noticed that a db library we use would, when storing a timestamp in the db, calculate that timestamp from 1970-01-01 00:00:00 in _local time_ instead of UTC, and I had to patch it to make sure we never store a wrong timestamp in our db (luckily all our production servers run with UTC as the local timezone).
Glad that it's finally being addressed at the language level.
Every time I see news like this, the voice in me that is saying "timestamps should be represented as epoch-time integers" grows louder. The amount of pain I've seen "better" solutions bring feels like the benefits aren't worth it unless you need some very specific feature (e.g. you're not trying to express an absolute time but repeating calendar events).
Timestamps yes, but you can't represent meaningful future times that way.
You should also record the timezone with the timestamp, though. An epoch-time integer will unambiguously tell you what time something was recorded at, but it can't tell you what the time on the clock was at that time. If you're doing some kind of data analysis on user habits, for example, you almost certainly want to work with user local times, not UTC.
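A sketch of why the zone has to travel with the epoch integer (the record layout here is made up):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

event = {"ts": 1692800000, "tz": "America/New_York"}   # hypothetical log row

instant = datetime.fromtimestamp(event["ts"], tz=timezone.utc)
wall_clock = instant.astimezone(ZoneInfo(event["tz"]))

# "Morning or afternoon for this user?" is only answerable via the stored zone.
print(wall_clock.hour < 12)
```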
I was recently working with stock data from a Django shop, half of their Unix timestamps were correct timestamps, the other half had a local timezone offset baked right into the integer.
At least this time finding problematic code will be a grep away, so I'm ok with this change, even though I've used it a lot.
Edit: Wow. So Paul Ganssle writes in April: "Previously, we have documented that utcnow and utcfromtimestamp should not be used, but we didn’t go so far as to actually deprecate them [...]"
The linked issue where this was decided is from Jul-Sep 2019, and I've never heard about this. Obviously this is my fault, but: where do I need to inform myself to know about these decisions? I've been "happily" using them all the time (quoted because we're all aware of the challenges these conversions pose each and every time).
I’ve been using Python for going on 14 years now, and I had no idea utcnow() returned a naive datetime. I happen to live and deploy to a (mostly) UTC time zone, so I expect a bunch of my code is subtly wrong.
One of my projects added all of the code on PyPI to GitHub; it would be interesting to see how prevalent the use of “utcnow()” is.
I think naive datetime would be fine, even if it resembles UTC - as long as you don't mix in a serial representation. The module documentation says, "Whether a naive object represents Coordinated Universal Time (UTC), local time, or time in some other timezone is purely up to the program, [...]". That definition goes down the drain imho once you call `.timestamp()`, which gives you Unix time as a serial representation (epoch time), which must be referenced against aware datetime, namely 1970-01-01 00:00 UTC. Same trouble vice versa (`.fromtimestamp()`).
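Concretely, the ambiguity the documentation permits collapses as soon as a serial representation is involved:

```python
from datetime import datetime, timezone

naive = datetime(2023, 1, 1, 12, 0)
aware = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)

# .timestamp() on a naive datetime assumes *local* time, so the result
# depends on the machine it runs on; the aware value is unambiguous.
print(naive.timestamp())   # varies with the host's timezone setting
print(aware.timestamp())   # always 1672574400.0
```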
It’s pretty disappointing to see this article make claims like “web developers just use naive datetimes”.
I think anyone who does anything nontrivial with datetimes should always use non-naive datetimes (unless you have a really really really good reason to)
Even when I started doing Python web dev 10 years ago this knowledge was bestowed upon me and has saved me many headaches (and helped me find bugs in third party code). The fact this article claims otherwise makes me doubtful of other advice given.
Unfortunately, Python's date/datetime APIs make it way too easy to do the wrong thing on this topic.
The timezone thing in Python that bit me was when I needed to parse a time string. Googling told me Python had strptime and it would do what I wanted. So I found it in the Python docs and used it.
What I failed to notice was that Python has two strptime functions, one in time and one in datetime, and they are not the same. I was losing timezone information that was in my time strings, and when I wrote a test program to just test timezone parsing I happened to pick the other strptime and everything worked fine, which left me quite confused.
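If I understand the trap correctly, it looks something like this (the format string and offset are invented for illustration):

```python
import time
from datetime import datetime

s = "2023-08-23 10:00:00 +0530"
fmt = "%Y-%m-%d %H:%M:%S %z"

dt = datetime.strptime(s, fmt)
print(dt.utcoffset())         # 5:30:00 -- the offset stays on the object

st = time.strptime(s, fmt)    # returns a struct_time, not a datetime
rebuilt = datetime(*st[:6])   # a common next step -- and the offset is gone
print(rebuilt.utcoffset())    # None
```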
Python could be better but really, does any language handle date and time types well? After 30+ years of this I mostly just use seconds since epoch everywhere like some sort of caveman banging rocks together. But at least it works clearly.
Last I looked Pendulum was the best choice for a fancy but humane Python library for dates and times. Install size is over 4MB :-( https://pendulum.eustace.io/
UTC does not have a timezone, so it is not a 'naive' timestamp. It just coincidentally has the same format as naive timestamps. Deprecating utcnow() because some folks don't think enough, instead of teaching them the correct way, is plain stupid.
Timezones are jurisdictional. Their exact definition may change (in the future). So store datetimes as UTC, depending on your requirements either as Unix epoch or ISO8601 string.
It's not a timezone but the timezone mechanism in python is used for time standards (which it is) as well as zones so the distinction is irrelevant. You certainly can have a python datetime with its timezone set to UTC and the naming of utcnow() strongly suggests that's what it returns.
Besides, this isn't a question about what is right in some abstract sense. It's a usability issue, and this accidental misuse of utcnow() is common. Python tries to avoid containing APIs that frequently trip people up, even if there's a "well akshewally" argument that they are in some sense correct. Not that there is, in this case.
Writing it in caps doesn't make it true. In real life UTC is a time standard, not a zone, but in python and in most other date/time mechanisms it is treated the same as one. It exists in the IANA tz database.
A python "aware" datetime carries a "tzinfo" object. This can be a timezone object which just uses a fixed offset from UTC or it can be a more sophisticated subclass of tzinfo such as the one provided in the zoneinfo module (since Python 3.9), which uses the IANA timezone database and so can express things like "America/Los_Angeles" (and, indeed UTC).
The UTC timezone object provided in the datetime module (datetime.timezone.utc) is the former kind, an instance of timezone, which is sufficient because of course UTC can be expressed as an offset (of +00:00) from UTC. However it isn't just the offset. Try creating a datetime using tzinfo=timezone.utc and call tzname() on it. It returns "UTC". It knows that it is called UTC, not just that it is an offset of +00.00.
Now try calling datetime.utcnow().tzname(). It will return None. Also try calling datetime.utcnow().utcoffset(). It will also return None (not 0). This is because the value returned by utcnow() is a "naive" datetime: it doesn't have a tzinfo at all. That doesn't mean it represents UTC (or an offset of +00:00 from it) as you appear to think. It means it has no timezone or offset specified. It is just a nominal time.
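Concretely, the calls just described:

```python
from datetime import datetime, timezone

aware = datetime.now(timezone.utc)
print(aware.tzname())       # 'UTC'
print(aware.utcoffset())    # 0:00:00

naive = datetime.utcnow()   # DeprecationWarning on 3.12+
print(naive.tzname())       # None -- no tzinfo attached at all
print(naive.utcoffset())    # None (not 0)
```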
Plain nominal times with no timezone information are useful in some circumstances, but they do not represent an instant in time. A "naive" datetime of 12pm 1st jan 2023 means just that. It's a valid concept in its own right but it doesn't denote a single point in linear physical time. For that you have to add more information. The information that, in python, goes in the tzinfo.
You say yourself that "utcnow() definitely suggests that you get an UTC" but you then seem to be under the impression that it does so. It does not, because it returns a naive datetime, and a naive datetime is not UTC or offset zero. A naive datetime has unspecified timezone/offset. It could represent UTC or CST or BST or all three at the same time or none of them. It is unspecified. This is very misleading behaviour from a function that has "utc" in its name.
All of the above is easily verified by consulting the docs. I don't know why I took the time out of a busy day to write it, given the laws of ego on the internet mean you're unlikely to even read it all let alone thank me for writing it. Nonetheless, I hope it might help you avoid at some point making the exact mistake this deprecation is trying to prevent.
UTC has no timezone and does not need a timezone. What utcnow() returns is absolutely correct.
That IANA unfortunately named the timezone indicator for UTC+00 'UTC', instead of 'UTCZ' or 'UTC+00', means nothing.
That in Python UTC has the same structure as a naive datetime means nothing.
Deprecating an absolutely correct function to coddle some ignorants is plain stupid.
Edit: What is not correct are e.g. the astimezone() and ctime() functions of a naive datetime, because they assume it is a local time. Those should be corrected to handle UTC!
One of the unusual things Swift did during its transition to a real programming language was to promise that there would be breaking changes.
This way, everyone knew what to expect. Customers, co-workers, contractors, and management knew where things stood.
If every programming language and API/ABI did this from the get-go, there would be so much less angst and hair-pulling each time some old, bad design was deprecated.
The problem with powerful languages is you can misuse them. The maintainers are deprecating a perfectly-good API, which is likely used by many people, simply because someone, somewhere, might not have read the (very clear) documentation and misused it. Maybe they should deprecate strftime() and strptime() because someone might serialize a datetime and drop the timezone.
> To me it is clear that the Python maintainers behind this deprecation have a problem with naive datetimes and are using this supposed problem as an excuse to cripple them.
I am not a python maintainer, and I also use naive datetimes a lot, and I can absolutely understand the maintainers rationale behind this.
Let me explain.
What's a "naive datetime" supposed to be conceptually? There is no such thing. Datetimes only make sense in the context of a timezone. UTC is just another timezone, with a UTC offset of +0.
So why do we have this strange thing that may be a UTC timestamp, or a local timestamp, or god knows what in the first place? It makes sense as an implementation detail of the datetime library to have timezones as separate objects, but for consumers of that library? It can cause confusion, and hard to track bugs, when suddenly one of the functions that accept aware or naive is used and spits out something completely unexpected.
I have spent many an hour tracking bugs where systems spat out complete nonsense timestamps, only to discover that the reason they do that is mixed usage of aware and naive datetime objects.
In my opinion, making explicit use of aware datetime the norm as much as possible, is a good thing. And removing prominently named functions that spit out naive ones, is a good first step.
Let's look at the listed usecases:
> An application may be designed in such a way that all dates and times are in a single timezone that is known in advance.
Okay...so put that TZ info somewhere it's easy to find, like a configuration Singleton, and use it.
> It is also a best practice to store naive datetimes representing UTC in databases.
Okay... read them out of the database into datetime objects, using UTC as their timezone information.
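A tiny sketch of handling that at the boundary (the value is made up; the point is attaching tzinfo the moment data enters the program):

```python
from datetime import datetime, timezone

raw = datetime(2023, 8, 23, 14, 30, 0)           # naive value as a driver returns it

stored_utc = raw.replace(tzinfo=timezone.utc)    # we *know* the column holds UTC

print(stored_utc.isoformat())   # 2023-08-23T14:30:00+00:00
```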
The bottom line is: There shouldn't be "naive" datetime objects, or at least they shouldn't be a common thing. Conceptually, datetimes always have a timezone attached, they are worthless without one.
I certainly didn't miss that, as I pointed out myself: I use naive datetimes in my codebases. A lot of them. And this change generates quite a bit of work for me.
But I'm okay with that.
The problem with de facto standards is: The fact that something became de facto standard, doesn't make it a better idea. The only thing it means is that it is now even harder to get rid of it.
> It's basically very late to be doing this.
Yes, it is. But not yet impossible. Languages and libraries that wait forever instead of ripping off the bandaid at some point, eventually become so crusted with such early problems, they become problems themselves.
Numpy and pandas are their own little island. It's not just dates and time, it's everything.
If you use numpy and pandas, you should also not use Python datetime, generators, most stdlib mathematical functions, the itertools module, random, etc.
It's the first thing you learn if you read any good pandas book, and the first thing I teach in my numpy/pandas trainings.
It has pretty much nothing to do with pendulum.
Basically, half the Python ecosystem is "well, except with numpy/pandas of course".
Does Go also have a distinction between naive and timezone-aware times? If not, I don't see why you would expect anything else. This is an issue because the distinction exists. Note it isn't bad that it does; there are actually arguments in favor of it. What's bad is that you can freely mix them.
It does, which is great. Note however that Go times can be either wall-time or (wall-time, monotonic time), which have different semantics and are both represented by the same type.
A timestamp is always UTC, unless something is messed up.
On their own, timestamps are already timezoned, to UTC+0.
Not the most difficult thing to master if you keep that in mind.
Honestly I don't see the benefit in the change, just confusion.
I’ve definitely been burned by assuming that “no time zone” means UTC when my python environment assumed I wanted it to mean Pacific time - it’s like the APIs all prefer naive datetimes, and I had to learn a new trick for each one to convince it to do the right thing
Hmmm. Can’t help but wonder if it would have been better to have the function utcnow simply return a time zone aware object instead. Would any usages of the function break?
Isn't Python adding gradual typing? This is the sort of confusion that becomes apparent when we know and write the types (TZ-aware or not) going into and coming out of functions.
`datetime.timezone.utc` is the original way of doing it and should work just fine on a stock Python 3.11 install. `datetime.UTC` is a relatively recent addition, and if you look at the actual type of the object it returns, it's still `datetime.timezone.utc`
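Easy enough to check:

```python
import datetime

# On 3.11+, datetime.UTC is literally the same object that 3.2 introduced:
print(datetime.UTC is datetime.timezone.utc)   # True
```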
This is verbose because it is dissuaded. The whole point of this change is to not have users get naive datetime objects, of course it gets more verbose.
This was an aside and not the point of the article but I can't let it go:
> It is also a best practice to store naive datetimes representing UTC in databases.
It is most definitely NOT a best practice and you SHOULD NOT DO THIS.
If your database supports TIMESTAMP WITH TIME ZONE -- which is the "aware" type per the article's terminology -- you should be using it. If your database doesn't, you should get a better one.
Would you care to elaborate? I'm not an expert in this area so I'm probably missing something, but my thought has always been that the 'portability' of storing everything in the backend as UTC is more flexible/less complex in the long run, dealing with conversion at the 'last mile' so to speak.
Happy to be corrected here but it's not intuitive to me that dealing with timezones as the 'base unit' makes anything easier.
Converting from a local date/time with an associated timezone to a UTC time is neither a lossless nor an unchanging operation. By converting everything to UTC, you're throwing away information that you might need to do the conversion at the 'last mile' later on.
For example, imagine I schedule a dentist appointment for 9AM in my local timezone sometime next summer, and put it in my calendar. Now, if the government decides that we won't do daylight saving times next year, my dentist and I still expect that appointment to happen at 9AM local time. However, if my calendar application had converted it to UTC using the timezone definitions at the moment I created the appointment, it would show in my calendar at 8AM, and I'd be an hour early!
This is not as hypothetical a situation as it seems. To give just two recent examples: the Chilean government decided to postpone the start of DST with less than a month's notice last year, and Morocco shifts the clock an hour during Ramadan, the date of which only becomes certain on the day itself.
It's the usual divide between people who actually know how databases work, and hence want to do more stuff there (because it will likely be faster and more robust); and people who don't know (and don't want to know) anything about databases, and hence would rather use them as dumb stores of basic data.
I've been in each camp at one time or another, since being db-agnostic makes sense in some cases (libraries, products) and not in others (websites, services).
It's not an either-or. You can store everything in UTC while also using TIMESTAMP WITH TIME ZONE. The timezone will always be UTC, but that way at least anyone who will be dealing with this data later knows that to be the case, instead of assuming.
> If your database supports TIMESTAMP WITH TIME ZONE -- which is the "aware" type per the article's terminology -- you should be using it.
NO!!! IT IS NOT THE SAME AT ALL!
I've been burned by that exact misconception. TIMESTAMP WITH TIME ZONE does NOT store a timezone in the database!
It implicitly converts a timestamp you supply into UTC and DISCARDS the associated timezone information you provided. When you read it back, it converts it to LOCAL time unless you explicitly specify a different conversion. This is a recurring source of confusion and bugs. The timezone a timestamp was actually written in is sometimes VERY important -- and I've personally had to deal with data loss on a project caused by this misconception.
Let me repeat: There is NO WAY to read back what timezone a "TIMESTAMP WITH TIME ZONE" was written with later -- unlike an "aware" datetime in Python -- BECAUSE IT DOES NOT ACTUALLY STORE THE TIME ZONE YOU SUPPLIED! This is a data type that DESTROYS INFORMATION YOU PROVIDED TO THE DATABASE assuming it can be clever later.
I fucking hate that data type SO much.
An "aware" datetime in Python actually tracks a timestamp with a specific timezone explicitly attached to it. It is VERY different!
Documentation for Postgres:
> For timestamp with time zone, the internally stored value is always in UTC (Universal Coordinated Time, traditionally known as Greenwich Mean Time, GMT). An input value that has an explicit time zone specified is converted to UTC using the appropriate offset for that time zone. If no time zone is stated in the input string, then it is assumed to be in the time zone indicated by the system's TimeZone parameter, and is converted to UTC using the offset for the timezone zone.
> When a timestamp with time zone value is output, it is always converted from UTC to the current timezone zone, and displayed as local time in that zone. To see the time in another time zone, either change timezone or use the AT TIME ZONE construct (see Section 9.9.4).
> Date and time objects may be categorized as “aware” or “naive” depending on whether or not they include timezone information.
> With sufficient knowledge of applicable algorithmic and political time adjustments, such as time zone and daylight saving time information, an aware object can locate itself relative to other aware objects. An aware object represents a specific moment in time that is not open to interpretation.
[...]
> For applications requiring aware objects, datetime and time objects have an optional time zone information attribute, tzinfo, that can be set to an instance of a subclass of the abstract tzinfo class. These tzinfo objects capture information about the offset from UTC time, the time zone name, and whether daylight saving time is in effect.
I agree (though less indignantly). TIMESTAMP WITH TIME ZONE basically makes me confused, as it is trying to run around guessing how to actively solve a problem I don't actually have, which is, "I don't know how to correctly represent datetimes in my application".
what people usually want, when they are confused about that datatype, is a datatype that stores the timestamp *and* a specific timezone to go along with it, so you get back a timestamp with that same timezone if you did not ask to convert it to another. that would be a bit less efficient from a storage perspective but would likely be more intuitive.
Can you explain why? I write web apps, and I typically want to know an instant in time when something occurred and then localize the time for display. Why would I need to store a time zone for that?
Storing UTC for past events is totally fine because it is unambiguous.
However, if you are writing something like a calendar app and storing timestamps for future events, you run into the issue that time zones have the annoying habit of changing.
Let's say you have a user in Berlin, and they want to schedule an event for 2025-06-01 08:00 local time. According to the current timezone definition Berlin will be on CEST at that moment, which is UTC+2, so you store it as 06:00 UTC. Then Europe abandons summer time, and on 2025-06-01 Berlin will now be on CET, or UTC+1. Your app server obviously gets this new definition via its regular OS upgrades.
Your user opens the app, 06:00 UTC gets converted back to local time, and is displayed as 07:00 local time. This is obviously wrong, as they entered it as 08:00 local time!
Storing it as TIMESTAMP WITH TIME ZONE miiiight save you, but if and only if the database stores the timezone as "Europe/Berlin". In practice there's a decent chance it is going to save it as a UTC offset anyways...
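The usual mitigation, as I understand it, is to store the wall-clock time plus the IANA zone name and only resolve to an instant when one is actually needed, with whatever rules are current at that point. A sketch:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Persist these two fields for the future event, not a precomputed UTC instant:
wall_time = datetime(2025, 6, 1, 8, 0)   # what the user actually asked for
zone_name = "Europe/Berlin"

# Resolve lazily, using the timezone rules in effect when this code runs:
instant = wall_time.replace(tzinfo=ZoneInfo(zone_name))
print(instant.astimezone(timezone.utc))
```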
If you’re storing time zone info for a transaction in the database instead of storing it as epoch time, you’re probably doing it wrong. Time zone is a user-level presentation layer issue. Keep that junk out of the backend. Imagine writing code to reconcile financial transactions that also had to deal with time zone offsets to figure out if one transaction happened before another. So much extra complexity for no reason. I don’t know where you’ve been doing this or where you got this idea, but it’s most definitely not common or advisable to store TZ info on transactions anywhere I’ve worked in my couple of decades at some very large software companies. TZ info is usually stored with a user account as a preference that they can update, or it’s detected from their browser during presentation time, because that’s when it matters — when you’re showing it to a human. Having machines calculate TZ offsets between database records is asinine, inefficient, and a great way to introduce an entire domain of bugs for absolutely no reason.
TIMESTAMP WITH TIME ZONE stores the timestamp naively. The type defines a convention for how data is transmitted to and from the datatype.
which is basically exactly how one is supposed to use timezone-naive datetimes. the datetimes are individually "naive" however are meant to always be used within a context that is implicitly non-naive across the full set of datetime objects.
For me, it's UTC everywhere by default; let clients convert UTC to local time. DBs/upstream systems already do that (these days). Computing is predominantly distributed now.
After Java’s JSR-310 almost any other API in almost any other language looks broken or insufficient. Threeten’s modeling of different time types seems so powerful and natural - specifically, it has Instant (which is literally a nanosecond-precision UNIX epoch timestamp), then it has ZonedDateTime/OffsetDateTime, which are full timestamps with time zone or offset information, and LocalDateTime/LocalDate/LocalTime, which are basically these “naive” timestamps from Python. These are all different types, with well-defined conversion methods between them, so you can’t easily use an inappropriate type.
The piece of date-related API I personally hate most is Go's formatting API, with its asinine number-based patterns instead of something reasonable and conventional like strftime-like or Java-like symbols. What's especially maddening is that it is so US-centric, with the month coming before the day in these number patterns, which follows neither an increasing nor a decreasing sequence of unit sizes.
> What’s especially maddening in it is that it is so US-centric, with month coming before day in these number patterns,
Whether this particular bit of design is good or not, you'll certainly be hard pressed to find someone who has a reasonable argument in favor of the absurdity that is the Go date-time format string.
> 2006-01-02 (aka 'yyyy-MM-dd')
> January 02, 2006 (aka 'MMMM dd, yyyy')
> Jan-06 (aka 'MMM-yy')
> Monday (aka 'EEEE')
It's insanity start to finish and it boggles the mind how it was ever accepted in the language spec.
> I would have made it more clear that these functions work with naive time, maybe by calling them naive_utcnow() and naive_utcfromtimestamp().
> But their names are not the problem here. The specific issue is that some Python date and time functions accept naive timestamps and assume that they represent local time, according to the timezone that is configured on the computer running the code.
Uhhh, no, I would not expect local time, because as the name implies we're dealing in UTC. Sigh.....
You might not, but the standard library sure does. And that's exactly the problem! How can functions in the standard library know if the time is local or UTC without any indication on the value itself? (Spoiler: they can't!)
As the article author correctly points out, Python should have just thrown when a method is called that requires a TZ with a datetime that doesn’t include one.
Between this, the Unicode circus, and the rejection of PEP582, Python seems more and more like the HOA that isn’t earning their monthly fees IMO.