Python 2 removed from Debian (debian.org)
471 points by zdw 36 days ago | 490 comments

As someone who does not use Python, the end result of this is that I now look into whether random utilities I use happen to be written in Python, and if so try to find an alternative. This is because I felt the pain of this transition -- again, as someone who does not program in Python. It has been miserable every time some random utility starts complaining that Python 3 is missing and then somehow when you install it something else that wants Python 2 starts complaining. You have containers that have worked for years that all of a sudden error out saying "Python 2 is no longer supported," and you have to dig up which library or utility, one that would otherwise work fine, has some sort of date-based check or something. I don't know, all I know is a compiled C++ thing works the same today as it did yesterday, and I can upgrade on my timeline, not have a random Docker layer cache invalidation result in a week's worth of Python 3 migration in the least important part of my build. And it keeps happening, again, because this is not in stuff I directly edit, it's all in random dependencies and sub-utilities of utilities I install. So it is non-trivial to even go in with the mindset of "I will update everything on our systems that uses Python". Although I guess if it's being pulled completely now that job will once again be done for me. Sigh.

I have this opinion about no other language. I am generally neutral about languages I don't use. But literally my only experience with Python is porting to Python 3. It's like a sitcom scenario of how to force the worst possible first impression for a language ever. The amount of slowdown and pain that Python 3 has generated on our non-Python project is really just out of this world ridiculous.

This rant feels like it would have been relevant in 2016-2018 or so, but in 2022-2023 I write a lot of Python code and don’t really run into Python 2 vs 3 issues anymore. Once Python 2 officially EOL’ed the ecosystem basically managed to port whatever was important enough to keep using over to Python 3. I’m sure some legacy megacorps are still doing stuff in Python 2 but that’s no different from these corps running some ancient version of Java which is also somewhat common.

> I write a lot of Python code and don’t really run into Python 2 vs 3 issues anymore.

I think you are missing the key point here: I don't write a ton of Python. I don't write any Python. Of course you think this is old news. You write a lot of Python. You're totally plugged in. You've known this is coming, this inside baseball is "the thing" for you. My point is that, believe it or not, the rest of the world does not keep up to date with whatever the Python community is doing. This is not unique to Python. Most people in dev community X are not aware of what is happening in dev community Y. What is unique to Python is that all of a sudden we were forced to become aware as the downstream apps began to break. By creating a backwards compatibility issue in an interpreted language, you necessarily "backload" the end-user pain to when the apps, which by definition are updated later than the code within the apps, begin to break as packages are removed from package managers, or things are EOL'ed, etc. This is a situation that I can honestly not remember happening with any other language. I can't remember any other circumstance, and certainly not one where it proceeded to happen repeatedly, where I've been abruptly made aware that utilityX was written in a no-longer-supported version of C/perl/whatever (when I previously didn't even know what language it was), after I had been running it for ages.

The point is just you don’t understand how your tools work.

It seems the actual problem is a disagreement on how tools SHOULD work.

A saw still works after a decade on a shelf, programs should still work too because they are tools.

What shelf? A garage? Detached? A shed? How humid? You may want to resurface and oil that saw after a decade. The wood handle may have rotted, or the plastic one may be gummy. Taking care of tools is how they last.

The 100 year old saw I inherited and used to cut down a broken tree last year.

That's cool, but if the 100 year old saw had been used regularly it would have required maintenance and sharpening to continue working well. Software also requires maintenance.

I can't think of any 100-year old software that is being used today (ha-ha).

I'm sure some airline reservation systems and banking systems written in COBOL in the 70s are still around, but they all have ongoing maintenance to keep them working correctly.

The commenter's complaint seems particularly worse with dynamic languages like JavaScript and Python than compiled things. Old projects written in node are often just as hard to get running or installed as something written in Python 2. The dependencies are out of date, you have to install an older version of Node.js, or the dependencies just don't exist anymore. And JS stuff usually generates a huge tree of dependencies even if they only directly rely on a handful of packages.

End of story, I think if these Python-2 tools/scripts were such a huge part of his project that they keep breaking builds, the company should invest time in either rewriting them in their language of choice or maintaining those projects or a fork of those projects.

One of my old bosses was strict about this - no COMPANY projects should be relying on external package repositories, unapproved third-party dependencies, definitely not some github project maintained by only a few people. Yeah, I think if your company is relying on "requests" or some other massively popular project with tons of community maintenance, you don't want to fork that and it's important to use the standard versions. But if you're bringing in something that is used and maintained by only a few people - you either need to become part of the community that maintains it publicly or fork the project and maintain your own version.

This isn't NIH syndrome, it's basic protocol for keeping your builds reliable and dependency management simple.

>This isn't NIH syndrome, it's basic protocol for keeping your builds reliable and dependency management simple.

Why should I, an end user, ever need to build WikidPad? I don't need to do so on Windows, why doesn't Linux support the same thing, shipping an executable?

When you go to the store, you're not presented with a set of tools for making a saw, you're given a complete, ready to go tool. Perhaps the very idea of users having to build things in Linux is the problem?

It is very similar to the problems with Windows, always pushing updates/upgrades that aren't necessary, and often break things. Windows machines are far less reliable on Wednesday than they are on Monday, because of Microsoft's idea of pushing patches on Tuesdays.

First comment is a complaint about python and the first thread is about saws and woodworking tools.

Truly a hacker news experience.

This is such an odd and entitled take for a person who is really trying hard not to care about python.

It's as if JavaScript had shipped a non-compatible update, browsers had on a magic day stopped supporting the old JavaScript, and all of a sudden her crossword website stopped working because that site's developer didn't update the JavaScript. This grandma sure complains a lot about broken websites for someone who supposedly doesn't know what JavaScript is. What an entitled grandma. Why doesn't she just build Linux from source and check out a previous release of the browser from git.

Backward compatibility is not free; it eventually becomes bloat that is too big to fix. Look at MS Windows and Office, for example, and Adobe Flash was killed off for good. I’m sure you can find a lot of examples.

Welcome to the magic world of CSS.

How is it odd and entitled for an end user to not like it when things suddenly break?

That is, I believe, the entire point of the parent, is it not? That this whole migration was not quarantined to the Python developer community. That it spilled over to users in totally unrelated spaces.

I prefer Hg over Git but I use Git. Why? One time I tried to use hg command and it didn't work. Why? I was on a system that had incompatible Python interpreter to the Hg scripts.

That was it. Since then I use Git everywhere even though I prefer Hg.

And one time I tried to install MS Office on my OpenBSD server. Can you guess what happened?

Since then I only use Ed even though I prefer MS Office. What was your point again?

The point is that even though I like Python and Hg, a single snag like that is enough to tip the cost/benefit ratio in favor of C (egad!) and Git.

(With a side dish of "I'm entitled to working tools.")

It’s like my grandma complaining about the internet being slow... because her CPAP machine is tethered to the clinic with an IP connection and is shutting down every night at 2 AM because a Python script on the server at the other end is no longer running.

I edited my comment before you hit post, but I’ll still reply to LOL at that. Think about what you just said for a sec. Then blame that on a python2.

Laughing at your users is a sure path to unlimited success and good fortune as a developer, for sure.

"rant"? Is that what a criticism of a technology that you like is called?

I feel your response is best summarized as: "I like Python. I've never had this problem and don't think it's a legitimate problem. Only $megacorp users have this issue, and their concerns aren't important."

If you dismiss the problem (as you've done) then you don't have to intellectually address it. The Python 2 -> 3 transition was hard for those that had huge production workloads in Python 2, and this pain will have a long tail.

Dude, Python 2 has been officially dead for 3 years, and was already buried almost a decade ago.

Whoever missed that is responsible for all the self-harm done by ignoring it.

The python2 executable still functions correctly as does all other code you have that relies on python2. Make sure the code is in the right place and run the python executable just as you have done before.

However that's not what you're complaining about. What you are complaining about is that you want others to support python2 forever, just as you may want browsers/sites to support `alert()`, the `<blink>` tag and IE 6.0 forever. However that's not how the world works.

If you wish to use something nobody else wants to support, then you need to support it yourself.

Otherwise, take a moment to understand the thing you are using and follow any instructions to upgrade. Don't put ignorance on a pedestal and then make the argument that the fruits of your ignorance are only the fault of others and not yourself.

alert and blink don't make browsers stop working. They degrade gracefully, just like Python doesn't.

> However that's not what you're complaining about. What you are complaining about is that you want others to support python2 forever, just as you may want browsers/sites to support `alert()`, the `<blink>` tag and IE 6.0 forever. However that's not how the world works.

Incorrect. You have to understand that I am not a Python user, and don't a priori care what they do with the language. I only become annoyed when something breaks on its own, or if changing one unrelated thing breaks other unrelated things. This happens for a variety of reasons:

1. Either date-based breaking, or patch releases designed just to throw up deprecations, and breaking the tools you actually use.

2. Packages and/or binaries disappearing from package managers, such that even if I have a pinned version and I make no changes, an accidental cache invalidation can now break the entire image.

3. The fact that updating one utility may break others (either because you need to figure out this python symlinking issue or whatever).

4. Python causing discontinuities that don't exist in other bundled build tools. Going from buildpacks:x to buildpacks:y, to get a new feature in A, usually doesn't affect B, but in the specific case of Python, means the installed version might change and break a bunch of stuff.

5. Python 3 changes often being bundled with breaking API changes in app updates, making "upgrading to Python3" a uniquely involved challenge for each one.

If you deprioritize "fault" for a second, such that we're not deciding who's to blame here (whether it's the transition plan by the language devs, the library or app authors, the distribution handlers, or even end users like me), you have to reconcile the fact that this is a unique situation to python. I am not describing the woes of development in general. I am describing a very python-specific set of experiences for non-python users. I did not out of nowhere decide to get angry at a language I don't use. This is the result of a repeated set of circumstances that just aren't happening elsewhere.

As has been mentioned throughout this thread, it is further upsetting that a lot of this could have been avoided. The fact that there are things that could have been done differently to make the long tail effect less painful is fairly disappointing. At the end of the day, I don't need a list of solutions. I've implemented them all, don't worry. I've gone through and read the 500-thumbs-upped threads of people repeatedly running into this exact same problem, either found an easy workaround or taken the time to update the config files for a tool that rudely put itself at the front of my priorities list in order to fix the build, or found a way to remove the utility in question completely. This is not me coming on here trying to figure out how to move forward. This is me identifying a clear pattern, and sharing what I believe to be a common conclusion: putting "python" in the "cons" column when considering a tool.

I come from the node world. There's plenty of people that do the same for node. Some of it I consider fair, others not. But I make no illusions about one fact: if we want to change that perception, it's up to us to earn that by making the experience better going forward, not by explaining to people why their experience actually isn't real. For example, if there is a simple solution, if you just were familiar with the ecosystem, then LOG IT WHEN THE ERROR HAPPENS. If the person has to go to stack overflow and see a thousand other people running into the same thing, then you've failed.

> However that's not what you're complaining about. What you are complaining about is that you want others to support python2 forever, just as you may want browsers/sites to support `alert()`, the `<blink>` tag and IE 6.0 forever. However that's not how the world works.

This is also hilarious since you had to scrape the bottom of the barrel to find the only instances of "non backwards compatible" features in JS/HTML, an environment notorious for backwards compatibility. Also, `alert` may still stay in.

If Python changes affect you then by definition you are a user.

Python is basically in the same situation as Perl was a decade ago, with the added benefit of mainstream, living-breathing "success stories" like all the data science stuff (numpy, scikit-learn, etc), django, etc. (And Python seems to have a sufficiently large batch of pragmatic & progressive experiments that address various challenges facing users. Decades of effort went into scraping off the warts in areas like packaging, performance, stdlib ergonomics, etc. Python also successfully reformed its leadership.)

Ok cool but all this is just complaining that your build system isn't hermetic and that's someone else's fault? Like, what? You want to "not care about any of this" but then get annoyed when that bites you in the ass?

Avoid writing huge, long rambling comments and try to simplify your arguments to less than 100 words. You may find the fault, and the fix, becomes obvious.

Start with trying to refute "The python2 executable still functions correctly as does all other code you have that relies on python2" without referencing anything specific to your build practices.

> This is also hilarious since you had to scrape the bottom of the barrel to find the only instances of "non backwards compatible" features in JS/HTML, an environment notorious for backwards compatibility. Also, `alert` may still stay in.

Those were the first that came to mind. There are many others, and many more if you widen it to server-side JS.

> Avoid writing huge, long rambling comments and try to simplify your arguments to less than 100 words. You may find the fault, and the fix, becomes obvious.

If you don't want to read the comment, then also don't bother responding. The comment is long because I am explaining a death by a thousand paper cuts situation. Don't worry, Python isn't going to die just because I don't like it. You don't have to defend Python even if you don't feel like reading my comment. You can also just ignore it or come back to it when you have time to engage with it thoughtfully, instead of just asserting you are correct and having your only argument be that I can't fit my argument to the arbitrary word length you've come up with.

> Those were the first that came to mind. There are many others, and many more if you widen it to server-side JS.

You should stick to talking about Python. This is just not the case and not the culture of browser JS. It's honestly fairly disrespectful of the monumental work browser engineers have done to make that a reality. If you want to switch your argument to server-side JS (where blink and alert don't exist and you are thus leaving it completely as an exercise to the reader to come up with the similar situation), fine, but node 0.10 runs exactly the same today as it did 10 years ago. Again, the bar I am setting is low: have the unchanged thing still gettable and still runnable. I get it. When you work in a certain environment, you imagine all the problems it has must be common and all the benefits are unique. But that isn't the case with this particular problem.

It’s possible to explain a death by 1000 cuts situation in less than ~630 words. Despite that in the now nearly 1,000 words you’ve written in this sub thread alone you haven’t really asserted anything concrete.

Start with trying to refute "The python2 executable still functions correctly as does all other code you have that relies on python2" without referencing anything specific to your build practices.

Sure, can't get python2 in Alpine 3.17. Few enough words for you?

Yep! That’s perfect, thank you. If that’s your condensed point I’ll repeat my first comment: you’re looking for other people (alpine) to maintain python 2 for you forever.

Presumably you’re too “not caring about any of this” to just pin your image to “alpine:3.15” and have it just work exactly as before? And presumably you’re too not bothered with any of this to understand that you had to do this because you didn’t pin the image to begin with, so your builds are not reproducible?

And this is anyone else’s problem why?

You are certainly making progress on the argument you believe you are having.

My point has always been simple: Python causes more pain than other languages when you're not directly using it. I can get node 0.10 on alpine 3.17 just fine. This demonstrates that your comparisons to server JS are completely wrong. But I know, you're going to tell me this is my problem and I expect the world to do work for me or something. Nope. As I've repeatedly stated, I've taken the time to get it to work already. I know you want to believe I'm here waiting for you to fix it, but I'm not. All I've said is: "Huh, I don't have this problem with anything else. I'll keep that in mind before I decide to use stuff from this ecosystem in the future".

You can’t get node 0.10 on alpine Linux 3.17 from the official main repository. That has 18.x/19.x. So you’re using a community one. And so, you could use a community python2 APK/install process. Except that doesn’t exist. Because nobody wants it. And you’re unwilling to make it yourself, thus you’re complaining that nobody else is maintaining this port for you.

The argument I’m making is none of your issues stem from Python specifically. As others have said, you are conflating your own confusion and unwillingness to understand with systemic toolchain issues that do exist. Except, you are not being bitten by those - you’re still at the “shoot myself in the foot and blame the gun” stage.

> You can’t get node 0.10 on alpine Linux 3.17 from the official main repository. That has 18.x/19.x. So you’re using a community one. And so, you could use a community python2 APK/install process. Except that doesn’t exist.

What a sleight of hand to say "You can get it the same way you can with node, except you can't". You should go into politics. Anyways, you asked me for a simple example, I gave it, and now you want to say what I want doesn't count because "no one wants it". Just choose a different challenge next time.

> thus you’re complaining that nobody else is maintaining this port for you.

I'm just pointing out a discrepancy with other environments. One you initially denied existed, and once I trivially pointed out did exist, you switched to attacking the complaint itself. Look, if I'm looking at two toasters and one has less features, sure, you can bark at me and tell me to make my own toaster if I want that so bad. OK, sure. Or I'll just get the one that has the features I want. You are still allowed to believe that those features are useless, but you can't deny they exist. It is OK for you to disagree with the properties I value in an ecosystem, you don't have to resort to telling me I don't know what I'm doing just because I value a different set of properties in the ecosystem.

You’re looking at two identical toasters from two different manufacturers.

Someone has glued on a big red alarm clock onto the side of one of the toasters.

Your conclusion after seeing this is to write to the manufacturer of the toaster without the alarm clock glued onto it and complain that it’s lacking a big red alarm clock and say they suck because they are not copying the amazing alarm clock features that the other manufacturer has.


This is a fiction in your head. I’m not writing any manufacturer. I’m casually commenting on a forum that I’m pretty sure is not Python’s official complaint box. This is as ridiculous as yelling at someone at the bar who says “yeah, I don’t like to use Python that much”. The entire “thrust” of my “threat” to Python is that I’ll try to avoid it in the future. You’re blowing this way out of proportion. If I reacted this way whenever someone said something bad about JavaScript or Objective-C (which I promise you get way more “drive by” complaints), I don’t know what I’d do!

Replace “write to the manufacturer” with “leave a review on Amazon”. The point doesn’t change.

Nobody is threatened by anything, and my point has nothing to do with Python.

If someone posts a comment saying “curl sucks! Every time I run it it just prints ‘command not found’” I’d of course try to tell them that’s not a fault with curl.

If they replied with “yes it is because I can’t figure out how to install it from this 3rd party site” then the discussion may become protracted.

But ok, maybe we will agree to disagree.

No, you’d refuse to read the curl complaint because it’s too long, and instead arbitrarily decide that the entire discussion should hinge on issuing a challenge for them to find you one place where curl doesn’t work. And then when he does, you’d say “So what! No one wants curl there anyways!”

Afterwards you’d get involved in a long weird analogy discussion about two people arguing about git. But that fault would be shared by both of them.

I agree completely, plus there are all the conflicting ways of installing and managing Python programs, dependencies, and environments, and the poor quality of so much Python code, some of it widely used. I find myself having to switch the `python` symlink in my path between python2 and python3 regularly. I look forward to the day when I can purge my systems of all traces of Python.

The number of times I find a script that blindly assumes 'python' points to its required version is too damn high. It makes me question the general competency of the people shipping Python code.
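A rough way to audit for this is to look at what each script's shebang actually asks for. The sketch below is only an illustration: `shebang_command` and `is_ambiguous` are hypothetical helper names, and the parsing is deliberately simplified (it skips `env` options such as `-S` but does not handle `VAR=value` assignments).

```python
import posixpath

def shebang_command(first_line):
    """Return the interpreter name a shebang line asks for, or None."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    if posixpath.basename(parts[0]) == "env":
        # With "#!/usr/bin/env python3" the interpreter is an argument to env.
        # Skip env options such as "-S" (simplified: anything starting with "-").
        args = [p for p in parts[1:] if not p.startswith("-")]
        return args[0] if args else None
    return posixpath.basename(parts[0])

def is_ambiguous(first_line):
    """True when the shebang asks for plain `python` with no major version."""
    return shebang_command(first_line) == "python"

for line in ["#!/usr/bin/env python", "#!/usr/bin/python3", "#!/bin/sh"]:
    print(line, "->", shebang_command(line), "ambiguous:", is_ambiguous(line))
```

Running the first line of every script in a tree through `is_ambiguous` gives a list of candidates that depend on whatever `python` happens to resolve to on the host.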

It’s inexcusable in recent scripts, but some of these date from the time when there was only python2. No, the root of the problem, as pointed out by many others here, is the policy decision to create two incompatible interpreters both called “python”.

I think it's actually more excusable in recent scripts as Python 2.7 has been deprecated and unsupported for ages now.

You won't know what version of Python 3 you're getting, but at this point it's safe to assume you're getting a version of Python 3 when you call python.

Only old scripts written when Python 2 and 3 were maintained separately should've ever had this problem, but we're past that point now.

Yes, but *which* Python3? Are you getting one with f-strings support? Because if you aren't, now you've got to install that.

How about one with dataclasses support? Again, you get to install a new interpreter everywhere.

Python packaging continues to be a sore spot for the language.

You're getting the Python version you've specified as a requirement in your requirements.txt, unless your users ignore or bypass that. What language features you support depends on the minimum version you pick, and if you choose to go bleeding edge/not pick a version/pick a version and use newer features anyway, you're responsible for getting support to work.

requirements.txt doesn't support putting your python version in it

You're right, I misunderstood environment markers.

However, you can ensure compatibility around this by specifying the required version in the top of your script:

    import sys

    # Oldest interpreter this script supports (major, minor, micro).
    minimum_version = (3, 6, 1)
    if sys.version_info < minimum_version:
        sys.exit("You need Python 3.6.1 or newer")

This will work because of how Python's tuple comparison works.

As long as you put this above any code requiring any specific version (this will work with Python 2.7, I don't have anything older to test) this will halt execution on dependency failures.
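To illustrate the tuple comparison the check relies on, here is a small sketch. `meets_minimum` is a hypothetical helper, not part of the standard library, and the version tuples are made-up examples.

```python
import sys

def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum` (both tuples of ints)."""
    # Tuples compare element by element, left to right.
    return tuple(version) >= tuple(minimum)

# Comparing tuples of ints gives the intended ordering.
print(meets_minimum((3, 10, 0), (3, 6, 1)))   # True: 10 > 6 decides it
print(meets_minimum((3, 6, 0), (3, 6, 1)))    # False: micro version too old
print(meets_minimum((2, 7, 18), (3, 6, 1)))   # False: major version too old

# Comparing version strings does not: the character "1" sorts before "6".
print("3.10.0" >= "3.6.1")                    # False

# sys.version_info behaves like a tuple, so a slice of it can be compared directly.
print(meets_minimum(sys.version_info[:3], (3, 6, 1)))
```

This is why the check compares `sys.version_info` against a tuple rather than comparing version strings.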

You can include these checks in your setup.py if you want to rely on Pip so users are warned on install time. You could probably also add this to some wheel trigger if you add a local/public package as a dependency in your requirements.txt.

But what if you just write a Python3 script and want to run it? Not a full blown application, but just a script.

I mean the Python ecosystem is such that now I have to "package" up this script to potentially tell the user with Python 3.6 that they need to download a new interpreter version (eg 3.9) to run my application (and any dependencies). This puts all the pain of running a script on every user that wants to run it.

I've said it before, but Python requires users to replicate the developer's environment...and somehow this is fine?!? Statically compiled languages (eg Go, Nim, Rust, C/C++, Obj-Pascal, D, etc) ship binaries...that just run.

Obligatory xkcd: https://xkcd.com/1987/

If you just want to write a script, you likely know the version available on your system.

When in doubt, add the #!/usr/bin/env python3.10 shebang to the top of your script to ensure the version it's running on is the same as the version you're writing the script for.

If I build an application on Linux I need to specify a glibc version or a PHP version or a Java version in some way as well. I don't see what this has to do with Python. Good luck running a jar built with JDK19 on JRE8, or an executable linked against glibc2.36 on Ubuntu 18, or a Python file written for Python 3.10 on a system that comes with Python 3.6.

Statically compiled languages come with their own problems (note that C++, Go and Rust also have system requirements, I've had to deal with a Go executable that would not work in a musl environment and a Rust executable that I needed to build on Ubuntu because the compiled version from my Manjaro laptop linked against a libc version that was too recent).

If you need dependencies, specify and install them. If your only dependency is the interpreter, document it, specify it in the shebang, or deal with it. Python 3 is completely backwards compatible in my experience, it's only the programs that stuck with Python 2 for a decade that have trouble running on modern machines in my experience.

> Good luck running a jar built with JDK19 on JRE8, or an executable linked against glibc2.36 on Ubuntu 18, or a Python file written for Python 3.10 on a system that comes with Python 3.6.

Yes, you are starting to see the problem. This is why I recommend statically compiled binaries as a solution to this scenario, and do not recommend any of the above.

Here's a real scenario I've had in my career, when deploying a script to 350k systems. I know I have Python 3.x, but I cannot count on the fact that every (or most) clients will have even a functioning Python interpreter. So what's the solution? Ship a docker container to everyone? There are multiple ways to solve it, but I compiled a statically linked binary instead of writing a Python script...and had no issues.

> If you need dependencies, specify and install them.

Here be dragons, and this is the entire reason I avoid Python when I need my code to 100% work on another machine I can't fully control. I choose statically compiled languages in these instances. In a server deployment...sure it's fine because I can control that.

> Python 3 is completely backwards compatible in my experience

I do not agree. If you write a script that uses any of the following features and try to run on $version -1 (or more), your script will fail.

  * 3.6, f-strings were introduced
  * 3.7, dataclasses were introduced
  * 3.8, walrus operator (:=)
  * 3.9, merge (|) and update (|=) operators added to the built-in dict class
  * 3.10, structural pattern matching
  * 3.11, exception groups and except*

All use of these will fail to function in a previous release.
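As a rough sketch, the list above can be written as a lookup table so a script's author can see what a given interpreter is too old for. The table only encodes the versions named in the list, and `unavailable_features` is a hypothetical helper name.

```python
# Minimum Python version for each feature named in the list above.
FEATURE_MIN_VERSION = {
    "f-strings": (3, 6),
    "dataclasses": (3, 7),
    "walrus operator": (3, 8),
    "dict merge operators": (3, 9),
    "structural pattern matching": (3, 10),
    "exception groups": (3, 11),
}

def unavailable_features(interpreter_version):
    """List the features that `interpreter_version` (a tuple) is too old for."""
    current = tuple(interpreter_version[:2])
    return [name for name, minimum in FEATURE_MIN_VERSION.items() if current < minimum]

print(unavailable_features((3, 6)))
print(unavailable_features((3, 9)))
```

For a 3.6 interpreter this reports everything except f-strings; for 3.9 it reports only structural pattern matching and exception groups.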

Maybe you'll say to just install a new version of Python. All that is overhead and requires more management. Ok, that's fair.

The problem is when you want to run that script on a bastion host, or a prod server, 350k machines, or a server where you don't have the ability to download and install external binaries (all of these are real scenarios I've experienced).

My experience has taught me that if I can't control the machine the script runs on, I save headaches by using statically compiled languages. As always, YMMV.

The only useful feature from that list is f-strings, which goes to show what a joke of a language Python has become.

>You have containers that have worked for years that all of a sudden error out saying "Python 2 is no longer supported,"

Isn't this precisely the problem containers are trying to solve? You can update your system, get rid of unsupported software, and still keep the older version of Debian within your container that depends on an unsupported version of Python.

That's the dream of containers. Arguably this is the "reality" of Nix. But the reality of containers is that if a layer gets invalidated, you need to refetch packages that are lower down. If the Python 2 versions (or Python 2 itself) is removed, then you can't "just" surgically make the change you want in the container without affecting the rest of the system (this is a major failing of Docker, but the world we live in). Now, you can start digging around in the container image and trying to do "layer surgery" yourself by editing the tars, but I would also consider that a less than ideal outcome of these changes.

Nix is also now intentionally breaking any packages that use Python 2, in my experience, so unless you're using a specific, older Nix channel, you're unlikely to have much success with anything that needs Python 2.

How would Python 2 end up removed from a container image?

If someone builds a container that is designed like this:

  FROM debian

  RUN do-x
  RUN install python2
and then someone changes do-x, if I understand, the layers below it get invalidated and all of a sudden install python2 fails. This is very bad design but very easy to replicate.
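
A toy model of why that happens, as I understand it: each layer's cache key depends on its own instruction plus the key of the layer before it, so editing one step changes every key after it. This is a simplified sketch, not Docker's actual implementation, and the function names are made up for the example.

```python
import hashlib


def layer_keys(instructions):
    """Chain a hash through the instructions, one cache key per layer."""
    keys = []
    parent = ""
    for instruction in instructions:
        combined = (parent + "\n" + instruction).encode("utf-8")
        parent = hashlib.sha256(combined).hexdigest()
        keys.append(parent)
    return keys


def layers_to_rebuild(old, new):
    """Return the instructions in `new` whose cache key is not in the old build."""
    old_keys = layer_keys(old)
    new_keys = layer_keys(new)
    return [
        instruction
        for index, (instruction, key) in enumerate(zip(new, new_keys))
        if index >= len(old_keys) or old_keys[index] != key
    ]


old_build = ["FROM debian", "RUN do-x", "RUN install python2"]
new_build = ["FROM debian", "RUN do-x --changed", "RUN install python2"]
rebuilt = layers_to_rebuild(old_build, new_build)
```

Here `rebuilt` includes the untouched `RUN install python2` step, which is the one that now fails if the package is gone.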

Okay, but surely my poorly written dockerfile that grabs the wrong images and runs broken code is not Debian's or python's problem?

Older versions of debian that still support python2 will theoretically be around forever, and any codebase that absolutely needs them should always work.

(though I contend that, given that we've had 15 YEARS of warning that this was coming, such instances should be vanishingly rare and not under active development)

> … "given that we've had 15 YEARS of warning that this was coming" …

This right here is the part that I'm still having troubles wrapping my brain around. People still stressing about Python 2 "going away" (it's gone folks; accept it) despite the fact that there's been well beyond a decade of advance warning, and Python 2 having been officially EOL ages ago now.

People won't move until heavens fall down on them.

And then they will start complaining that it's other people's fault that they didn't move.

I have noticed pip warnings in the container build logs about python2 being unsupported- perhaps there is some "treat warnings as errors" flag set in some builds. Perhaps some builds use untagged base OS containers- building those after python2 is removed will fail.

Also, at some point old OS package repositories may be deleted (or bitrot out of neglect)- at that point building containers that depend on python2 will fail. You'll be able to use images built previously, but not everyone is backing up custom images in some remote docker registry.

I’m curious why installing Python 3 causes Python 2 programs to break.

The only reason I can come up with is that the installation process is wrongly changing the `Python` symlink. Anyone have other known causes? Does it screw with dependencies?
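
If anyone wants to check the symlink theory on their own box, something like this shows what each command name actually resolves to. It is only a diagnostic sketch; `resolve_command` is a name I made up, and it uses nothing beyond `shutil.which` and `os.path.realpath` from the standard library.

```python
import os
import shutil


def resolve_command(name):
    """Return the real file behind a command on PATH, or None if it is missing."""
    path = shutil.which(name)
    if path is None:
        return None
    # Follow symlinks such as /usr/bin/python -> python3 -> python3.11
    return os.path.realpath(path)


if __name__ == "__main__":
    for name in ("python", "python2", "python3"):
        print(name, "->", resolve_command(name))
```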

It is hard to keep track of the (no joke) dozens of different weird failures each unique instance of this caused. But I do vaguely recall that sometimes, yes, you now have to go and point each individual utility at the right Python, or set an environment variable, or something. Other times you are relying on some combined build package, which if you want the one that has Python 3, no longer includes Python 2. So then you have to go manually install it. But it's no longer provided as an independent install by that package manager anymore, it was just a coincidence that it was still being included in buildpacks-whatever, so now you have to decide whether you want to figure out how to find and install Python 2 yourself or maybe just go and update everything else to Python 3 too or... UGH. Then you remember "Wait a minute, why am I doing this, I'm not even a Python programmer!"

The worst part is a lot of times, since Python 3 is a breaking change, people decide that it's a great time to finally do all their breaking API changes too, so all of a sudden updating something to the Python 3 version means that all of its associated config files have changed format, and now you're updating config files that have worked great for ages and running into bugs there.

> The worst part is a lot of times, since Python 3 is a breaking change, people decide that it's a great time to finally do all their breaking API changes too

Your issue with Python is that some people have bad development practices?

That was one issue I mentioned. You know that. You read my comments.

However, I will take your snarky reply and turn it into lemonade by using it as an opportunity to talk about how often development teams fail to consider the social implications of their changes. As is mentioned in another comment on this post, the Python team seems to readily admit in retrospect that they didn't predict the pain of having no clear transition path, etc. I think another lesson to take from this experience is the realization that if you create some big "line in the sand" update, you may absolutely inadvertently encourage bad development practices. If the Python 2-to-3 transition had held "ease of transition" as a core value, then it is more likely that that value would have been modeled downstream. By instead initially declaring that the 2-to-3 transition was going to be some watershed moment and people were going to have to update everything, it is not surprising that it could encourage people to just pull the trigger on a bunch of breaking stuff they wanted to do, and come to the (incorrect, but understandable) conclusion that they didn't themselves need backwards compatibility since the entire ecosystem was going through a "unique" non-backwards-compatible change anyways.

I think part of the issue is that your comments are very ambiguous as to what issues you actually hit.

Most of the major Python packages did maintain 2 and 3 compatibility at the same time. Maybe a few packages had bad coding practices but you’re going to hit that in any language.

I think you’re extrapolating unfairly

> I think you’re extrapolating unfairly

But that's exactly the nature of doing this sort of change. Again, I am not a Python person. This transition created an unintentional filter where the utilities with poor practices surfaced to end users, such that their only direct interactions with Python are always negative. So sure, the extrapolation may very well be unfair, but that is the completely avoidable situation they set themselves up for. No other language expects end-users of the programs written in that language to be sympathetic to some internal engineering transition. All this stuff may have great reasons, but you've already failed if you have to explain those reasons to non-Python users, or tell them this is confirmation bias or whatever. It's just not happening with other languages. And this is for big stuff too. Ansible, which uses 2 pythons, one locally and one remotely, was a pain. Add on to this that this was a fairly public boondoggle, that the makers agree wasn't handled well, so whether "fair" or not, this is a common take away I think.

Tons of other languages have version requirements as part of their end user experience.

Whether that’s libc requirements for compiled languages, or Perl language versions being breaking, or even as subtle as differences in behaviour between different compilers such that your system may give different results than another user.

So I disagree with your assertion here.

Yes it’s unfortunate that this is lifted up to the user, especially because of the fact it’s interpreted, but it’s not anywhere near unique to Python.

Therefore I think it’s not a logical outcome to outright avoid Python.

> Tons of other languages have version requirements as part of their end user experience.

That does not dispute the original claim though. It's still a problem for Python even when other languages have the same problem.

> or Perl language versions being breaking

Perl 5.0 code written in 1999 will still happily run in 2023. In part due to the fact that breaking changes are *opt-in* within a Perl script.

Reference: https://www.perl.com/article/announcing-perl-7/

"""Perl 7.0 is going to be v5.32 but with different, saner, more modern defaults. [..] Perl 5 still has Perl 5’s extreme backward compatibility behavior, but Perl 7 gets modern practice with minimal historical baggage. """

Honestly, Perl is an example of extreme backwards compatibility that has the user experience at the forefront of their design decisions.

I’m not trying to dispute whether it’s a problem. I agree it’s a problem.

I’m saying that avoiding Python because of it is illogical (to me) because the problems they’re describing aren’t unique to Python, and I don’t think there’s much that doesn’t suffer from it.

Perl is an odd one because, as your comment says, Perl is going 5->7 because 6 was such a disastrous break.

> I’m not trying to dispute whether it’s a problem. I agree it’s a problem.


> I’m saying that avoiding Python because of it is illogical (to me) because the problems they’re describing aren’t unique to Python, and I don’t think there’s much that doesn’t suffer from it.

I don't completely agree, but I see your point. Any interpreted language (or non-statically linked binaries) could (and does) suffer from these problems. I'd like to see Python have a better packaging story...but for now we live with shipping the developers environment (via docker or virtualenv) for running complex Python applications and services.

I will say that *statically linked* compiled languages are able to avoid many of these problems because running these binaries typically only requires the OS (eg, Go, Nim, Rust, Zig, C/C++ when statically linked, Object-Pascal, etc)

> Perl is going 5->7 because 6 was such a disastrous break.

Agree. And they did the right thing by actually renaming it Raku and skipping the version release.

slight correction >Agree. And they did the right thing by actually renaming it Raku and skipping the version release.

should be >Agree. And they did the right thing by actually renaming it Raku and functionally killing the language.

I mean, maybe it wasn't quite as clear in 2008, but the Perl 6 fiasco was an obvious disaster that's widely perceived to have killed Perl. It's not a precedent you should want to follow just because it's there.

I pointed to Perl specifically because it’s another contemporary to Python in terms of OS scripting while having had a much harder break.

Seconded. Noted comment is only mildly snarky at worst. Thanks for your insight on the python3 migrations though. I’ll be wary if I ever see something in that context on the horizon.

Maybe they should deprecate `python` -> /usr/bin/python3, assuming it's not way too late.

The official recommendation from the python developers has always been to keep the unversioned python as a symlink to python2

It was to precisely avoid this scenario where you get it switched out from under you.

This doesn’t mean every distro followed suit (though most popular ones were good about it), but the biggest issue is users. Tons of users create the symlink or alias for convenience and then forget about it till stuff breaks.

Not anymore: https://peps.python.org/pep-0394/

> Distributors may choose to set the behavior of the python

> Avoiding breakage of such third party scripts was the key reason this PEP *used to* recommend that python continue to refer to python2.

Emphasis mine.

This is not just a Python problem. I remember when PHP shifted from 5.6 to 7 - I used many projects and software which needed to be updated because e.g. all the mysql_ functions and some crypto functions were removed.

Same Issue with e.g. converting Vue 2 to Vue 3 or Angular.js (Version 1) and Angular (Version 2).

I really hate when some programming languages or frameworks completely remove important functions which were used by nearly every project.

I tried to learn Python around 2008-2009 and it was literally the worst time to try to learn Python, the transition to Python3 was just beginning, take 2 random code samples from the internet and they won't work together, python2to3 didn't work either.

It's kind of a shame how this transition took more than 10 years. The good part: we learned that we don't have to go through the same again. Of course I dropped Python (I was like 18 years old and didn't know anything about programming). I had to learn Python last year and you can still see a lot of stuff that makes the experience kind of annoying.

Anyway, I'm pretty impressed how Python still is in the first positions of most charts, I still fail to see how great the language is, but hopefully I will be able to someday soon :)

I just avoid Python and any job descriptions that contain it. It's never been a good experience for me, whether as an end user or a developer. It's nice that they may have ironed out their issues, but I've been done a long time ago. There's nothing Python does I can't do better with some other toolchain.

I wouldn't avoid a python program right from the start, but simply not bother if it actually wants 2 and only then start looking for an alternative. If anything that debian move might accelerate deprecation or upgrades of stuff that still uses 2. And then we might live in a perfect world with only python 3, because uuuuuunicooooode.

It's ok. At least our children won't have to go through this dreaded dependency hell. Also, the role of Python 3 in popularizing Python was worth all the pain. Without version 3 it would not have been where it stands today.

Without version 3 it would not have been where it stands today.

I fail to see how that is true. Python was already catching on in a big way back during python 2.7 and I'd guess it probably lost at least 18 month worth of momentum building with python 2->3.

I'd say Pythons popularity today is at best unrelated to python 3.

Python lost a lot more due to the constant flamewars about the transition, how many people wanted to just not care about py3 for many long years, etc.

languages that are actually growing in userbase/popularity are not static. this is hard to accept for many users of popular languages.

It sounds like you do use Python, but when you do, you face this problem...

That's fine, but I wanted to clarify - you intend to not use Python, because you have tried and it's not gone well.

Python didn't come out of the woodwork and attack you, so much as you stepped into a (momentarily) troubled ecosystem.

I understand they were dependencies of other projects. By using those, they became your dependencies.

Take some ownership of your problems, even if the ecosystem provides them -- they are chosen, and can be managed.

Distributions do a decent job distributing modules...

Utilities have an annoying tendency to rely on SomeStableModule==SpecificVersionForNoReason. This compounds the mess, sadly.
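
For the auditing part, a rough sketch of how you could at least list the exact pins in a requirements-style file. The regex is deliberately simplistic and `exact_pins` is a hypothetical helper, not part of pip or any packaging tool.

```python
import re

# Matches lines like "SomeStableModule==1.2.3" and captures name and version.
PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")


def exact_pins(requirements_text):
    """Return {name: version} for every line pinned with '=='."""
    pins = {}
    for line in requirements_text.splitlines():
        match = PIN.match(line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


example = "SomeStableModule==1.2.3\nother>=2.0\n# a comment\n"
found = exact_pins(example)
```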

Those should be routinely addressed as part of your auditing, along with the whole of the supply chain

I can't really prescribe much... this is just generic advice.

I understand the woes, it's just not as unique as this is painted

> Take some ownership of your problems, even if the ecosystem provides them -- they are chosen, and can be managed.

Huh? This has a strong "No, it's the children who are wrong" vibe.

But to be more fair to your point, I think this is a dismissal of the OP's concerns. A troubled ecosystem should be concerning to the Python community as a whole because it threatens future adoption if not addressed.

Users don't want a troubled ecosystem and will find languages that avoid these problems...and Go, Nim, and to a smaller extent Zig, Rust, D are strong alternatives for the different workloads currently provided by Python tooling.

Python has a good moat with both ML and mind-share of developers...but at one time Perl had a similar moat...

It's a fair criticism, I know it's a bit harsh! I appreciate both it and the fairer take.

> Users don't want a troubled ecosystem and will find languages that avoid these problems

It's true, I'm very late to Python myself -- for the exact same reason. The distribution part of the ecosystem is a nightmare.

People want utilities, but they learn more than they ever cared to because of dependencies.

I'm only learning how to manage it now because work demands it.

Prior, all of my needs had fortunately been met by my Linux distribution of choice -- the Python part was abstracted away, for the most part.

Developers of these utilities could save some (shared) effort by offering package specs/finding a maintainer, but I understand why it's not common.

(I may even build/maintain it for you, to sweeten the pot)

I guess the fuel of this thing is, 'misery loves company' -- I'm dealing with it, and I can't just whine about it.

The "I don't use it and it still affects me part" was the bait -- they do use it

> Take some ownership of your problems, even if the ecosystem provides them -- they are chosen, and can be managed.

Not sure where you're going with this. I didn't whine to someone else to fix it. I did take ownership. Over and over. I fixed these problems, as unrelated to my main goal as they were. I then complained about it after the fact. Seems like a totally fair ordering of events to me.

Separately, I will state that Python is an environment that seems to go out of its way to make it hard to meticulously manage dependencies if you want to. There's all sorts of competing environments and package managers. I'm sure it's straightforward if that's your main environment, but it doesn't help if you're coming from the outside.

> Those should be routinely addressed as part of your auditing, along with the whole of the supply chain

This assumes that the only available option is "manage this dependency better". That's not the only option. There is also "consider using something different that is easier to manage." Most of these utilities exist in a competitive market. And if there are utilities that allow me to choose the cadence and separately make that auditing simpler, then I will choose that. That is the point of my statement. I am not choosing to go on a hunger strike or avoiding some no-other-choice tool. Quite the contrary, a lot of these tools were chosen by other members of the team at some point, either because it's what they were familiar with or some other unimportant reason, and now they have, to their own detriment, made themselves the squeaky wheel, and opened up the range of possibilities to everything from "put in the work to maintain this tool that has currently demonstrated itself to be brittle" to "maybe I should use something else". Generally speaking, it's a bad thing to only draw attention to yourself when you are doing something bad.

> can't really prescribe much... this is just generic advice.

And yet this is a specific problem.

> I understand the woes, it's just not as unique as this is painted

It very much is unique. I'm literally not experiencing this with any other language that I don't directly use. Do I experience this with the libraries I directly use in my language? Of course. I signed up for that. I know the risks I take using beta builds and have a keen sense of the current internal transitions, etc. But I absolutely do not become aware of the latest argument in the C++ language spec, or a contentious Swift proposal, or end up becoming vaguely aware of the string implementation choices of a language I don't use. That is a unique Python thing. The fact of the matter is, Python has (recently? momentarily perhaps?) proven itself to require more maintenance than other tools I use. The community can decide who's at fault for this (language devs, library devs, end users, whatever), and even whether or not it matters, but if the message to the people outside the community is to deny that this is unique or troubling, then don't be surprised if the way they choose to take ownership is to count Python as a point against a project they are considering to use.

I apologize for being so... combative, at first.

It was very knee-jerk to the interpretation that I had: "distance has been kept from Python and yet it's been a problem"

There hasn't been that much distance, truth be told. You did the same care for Python as you did the libraries, but it cost too much.

We're in complete agreement on dependency/manager aspects of Python -- my main point is, that's the core of it, I guess.

Everyone goes through this, it's awful and annoying, and my original post is a poor response to the presentation of that

The Python 2 vs 3 thing is settled, upgrade to a recent LTS and it's done... until a random relic utility comes around. Then obviously discard it

This is the top comment? In the year 2023?

OMG, that's a parody I hope!

I like Python 3. It eliminated a whole category of encoding/decoding errors. Even Python 2 codebases benefited from it, as libraries were updated to handle Unicode better in an effort to achieve compatibility with Python 3. I didn't experience much pain migrating codebases to it, but I'm just speaking for myself here.

Congratulations to Debian on upgrading to Python 3!

Yeah, coming from C#, Python 2's unicode support was so bad I almost abandoned it immediately as a Chinese speaker (and to make it worse, I use Windows). You literally can't use IDLE for learning/testing properly half of time due to encoding issues.

And what surprised me most is that every time I mentioned this, there would be lots people telling me how this is a superior design because you can operate string like bytes. I mean, it of course has its upside, but I don't think it's worth it if you care even slightly beyond ASCII.

We had crazy amounts of code handling unicode support and conversion from our ecommerce site to our ERP system (running on Windows using some Windows code page thing). With Python 3 all that went away, you can now just seamlessly parse text from one system to another.
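
Roughly what that boils down to in Python 3, as a sketch: decode with the source system's encoding, encode with the target's. I'm assuming cp1252 as the Windows code page purely for illustration (the actual one isn't stated above), and `transcode` is a made-up helper name.

```python
def transcode(raw, source_encoding, target_encoding="utf-8"):
    """Convert bytes from one encoding to another via a str in between."""
    text = raw.decode(source_encoding)   # bytes -> str, explicit about the source
    return text.encode(target_encoding)  # str -> bytes, explicit about the target


# "Größe" as the hypothetical cp1252 system would have stored it.
from_erp = b"Gr\xf6\xdfe"
for_web = transcode(from_erp, "cp1252")
```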

For me, the unicode handling alone was worth the time spent migrating from Python 2. That was a decade ago, so finding that the "python" command still launches a Python 2.7 interpreter in 2023 is just beyond belief. Personally I feel like they should have yanked Python 2 in Jessie (Debian 8) in 2015, more realistically in Stretch in 2017.

The biggest danger in Python 2's unicode handling is that incorrect things somewhat worked (until you got a non-ascii character at which point it exploded or produced incorrect behaviour).

I'm sure you could do things well in Python 2 with proper combinations of encode/decode, but it wasn't obvious where you even needed those because with ascii text, things "just worked" transparently. With Python 3 it's very obvious where you need encoding/decoding because bytes != str.
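
A small Python 3 illustration of "bytes != str", in case it helps anyone reading along. The names are just for the example.

```python
def join_py3(label, payload):
    """Concatenate text and bytes the Python 3 way: decode the bytes first."""
    return label + payload.decode("utf-8")


payload = "Hänsel".encode("utf-8")  # bytes, as you would get from a socket or file

try:
    "Name: " + payload  # str + bytes is refused instead of silently "working"
    mixed_ok = True
except TypeError:
    mixed_ok = False

joined = join_py3("Name: ", payload)
```

In Python 2 the equivalent mix of a unicode string and a byte string went through an implicit ASCII decode, which is exactly the "works until a non-ascii character shows up" behaviour described above.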

You could do things correctly in Python 2, but as soon as you used any third-party library in your project, chances were it is going to explode underneath you anyway.

In the early 2000s, I maintained a py2 wxPython app whose users had the system encoding win-1250; the effort to patch this was unbelievable. The migration to python3-style handling forced everyone to think about these issues, not just the few people for whom things were crashing. Even just popularizing the issue was great; until then, many maintainers of third-party libraries didn't even understand the problem you were trying to fix in their libs with a "needlessly complicated" patch.

> That was a decade ago, so finding that the "python" command still launches a Python 2.7 interpreter in 2023 is just beyond belief.

The problem is not the end-user invoking the command.

The problem is scripts expecting `#!/usr/bin/env python` to invoke python-2.
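
If you want to find those scripts before flipping anything, a sketch like this flags shebang lines that name the unversioned `python`. The helper names are hypothetical and the parsing is intentionally naive (it skips `env` options and variable assignments, nothing more).

```python
import os
import shlex


def shebang_interpreter(first_line):
    """Return the interpreter name from a shebang line, or None."""
    if not first_line.startswith("#!"):
        return None
    parts = shlex.split(first_line[2:].strip())
    if not parts:
        return None
    if os.path.basename(parts[0]) == "env":
        # Skip env options and VAR=value assignments to reach the command.
        args = [p for p in parts[1:] if not p.startswith("-") and "=" not in p]
        return os.path.basename(args[0]) if args else None
    return os.path.basename(parts[0])


def is_unversioned_python(first_line):
    """True when the script relies on whatever `python` happens to mean."""
    return shebang_interpreter(first_line) == "python"
```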

> finding that the "python" command still launches a Python 2.7

apt install python-is-python3

I don't think my co-workers would like that :-)

dpkg-reconfigure co-workers?

apt purge co-workers

I've broken a bunch of stuff when I tried to replace python 2 with 3.

> Personally I feel like they should have yanked Python 2 in Jessie (Debian 8) in 2015, more realistically in Stretch in 2017.

For example GnuRadio started supporting Python 3 with GnuRadio 3.8 released in 2019, and then you had to port all your programs using it to this version. So no, in 2017, the ecosystem was not ready.

FWIW, in Linux, this problem does not exist. Everything is UTF-8 and Python 2 would work just fine (and always did).

In order to support Windows better, Python 3 introduced support for UCS-4 (or worse, UTF-16) strings (depending on a compilation setting when Python was compiled) and they had to introduce extra string types to distinguish readable strings from binary strings ("bytes").

These extra types made Python 3 a lot harder to teach (I teach 30 person classes every year).

So it's not all roses now.

In the end, I got used to it, BUT I just gave up asking encode()/decode() questions at the exams. Very few people understand it, or care enough (and I understand why--it's a ridiculous thing to have). You only need it if your OS somehow slept through the introduction of UTF-8, which is backward compatible with ASCII, resilient even if there are transfer errors and can encode all unicode characters.
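
The two UTF-8 properties mentioned above (ASCII compatibility and surviving a damaged byte) can be checked in a few lines of Python 3. This is just an illustration; the damaged byte string is a made-up example.

```python
# ASCII text has the same bytes in ASCII and in UTF-8.
ascii_text = "plain ASCII"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# "a\u00e9 b" is b"a\xc3\xa9 b" in UTF-8; pretend the second byte of the
# two-byte sequence was lost in transfer.
damaged = b"a\xc3 b"

# Only the broken sequence is replaced (with U+FFFD); the decoder picks up
# again at the next valid character.
recovered = damaged.decode("utf-8", errors="replace")
```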

Encoding problem used to be really common in UNIX (and before that, in mainframes), but with the introduction of UTF-8, all encoding problems I had vanished and never appeared again.

Even Windows 10 has an UTF-8 mode now and the Windows API functions that end in "A" can be made to use UTF-8.

Now, in a sense, Python 3 has this entire complication for no reason.

That said, Python 3 is ok to use now--and, conceptually, distinguishing byte strings from unicode strings is better (for example so that you don't accidentally print the former to the terminal). It just uses up brain cycles that you could be using for solving your actual problems.

> I just gave up asking encode()/decode() questions at the exams. Very few people understand it, or care enough (and I understand why--it's a ridiculous thing to have).

I get it from the "pass the exam" perspective, since that's one more thing to worry about.

But from my experience in teaching others, doing the conversion between bytes and string implicitly (à la Python 2's way) hinders actual understanding of this very important concept, and it's quite harmful in further study.

Bytes should be considered as a separate, more low-level thing, away from int/float/strings; at the very least, it should be considered as bits/hex numbers. If you want strings, you explicitly encode/decode them in a way, even if everything is UTF-8.

On top of that, "byte string" is just a confusing concept. It might work for English speakers (by "it's a ridiculous thing to have" I assume you mean "'english'.encode() is just b'english', why bother?"), but not at all for Chinese speakers, even in UTF-8. There is no b'中文' -- only b'\xe4\xb8\xad\xe6\x96\x87', which has no meaning on its own.
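
For anyone who wants to see this in a Python 3 interpreter, a short sketch (the helper name is made up):

```python
text = "中文"
encoded = text.encode("utf-8")

# Two characters, six bytes: the byte view says nothing readable.
lengths = (len(text), len(encoded))


def bytes_literal_compiles(source):
    """Check whether a bytes literal is accepted by the Python 3 compiler."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


english_ok = bytes_literal_compiles("b'english'")
chinese_ok = bytes_literal_compiles("b'中文'")
```

`chinese_ok` comes out False because Python 3 bytes literals may only contain ASCII characters.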

And even from an easy-to-use perspective: most people don't even work on bytes often nowadays. A more abstract "string" type is all they need, without worrying about how it works under the hood (and if they do, they need to understand how encode/decode works properly anyway).

>doing the conversion between bytes and string implicitly

There was no conversion. `bytes` and `str` were the same type.

http://docs.python.org/whatsnew/2.6.html#pep-3112-byte-liter... says:

> Python 2.6 adds bytes as a synonym for the str type, and it also supports the b'' notation.

I just checked in Python 2.7:

    >>> bytes is str
    >>> print("Hänsel")
    >>> "Hänsel"
I'm working with Germans, Japanese and Polish that use a lot of special characters, including Kanji, umlauts, extra quote characters etc. I need the non-ASCII parts and had no problem with them in Python 2 on Linux (now, C++ libraries that reinvented their own string classes: many problems; C libraries: no problems).

The point is when bytes is str, everything works just fine in Python 2 on Linux with a UTF-8 locale (which is used in all modern Linux distributions). No need to have a distinction between bytes and str.

That's how the rest of the OS works, too. Even a lot of Gtk, Glib and so on (for example the GNOME desktop environment) assume that you are in a UTF-8 locale for file names, for example.

> A more abstract "string" type is all they need, without worrying about how it works under the hood (and if they do, they need to understand how encode/decode works properly anyway).

Ehh, we had students write drivers for measurement apparatuses and they all used Python 2 str (without being prompted to do so). No encode or decode anywhere. Of the students, almost no one who tried Python 3 for that stayed with it (instead they were using Python 2). There was just no upside for this use case.

I agree that, long term, having a distinction str vs bytes makes sense. But then you ARE juggling things that the OS doesn't need--it's basically busywork in Linux.

I'm not trying to minimize your experience--but I don't think it would happen if you tried python2 on Linux today. Not sure it was worth it breaking compat for that.

> FWIW, in Linux, this problem does not exist. Everything is UTF-8 and Python 2 would work just fine (and always did).

That's not true at all. I remember all kinds of encoding errors when dealing with the FS, the network or any user input when using Linux.

Unless you're talking specifically about IDLE?

> Congratulations to Debian on upgrading to Python 3!

Debian has had Py3 for ages. This wasn't an upgrade (that happened ages ago), just the removal of old packages kept for compatibility.

Honestly, I just found Python 3 to introduce a whole bunch of complexity when working with text. Why can't it just be a byte buffer? Why must you complicate interacting with the OS so much? Text is just a buffer of mostly UTF-8 encoded bytes, why make it needlessly hard?

Glad this is finally mostly over, but... that was bad. I wonder if the energy that had to be put into this migration by everyone involved was worth what seems to be relatively small improvements.

The print-as-a-statement was ugly but convenient and didn't seem like a big deal, the integer division was something that you could live with once you knew about it (and you still need to know what the current behavior is), and the new string types make code more verbose, and the way they were handled introduced countless hard to find bugs during the migration.
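
For reference, the Python 3 side of those two changes, as a quick sketch (in Python 2, `7 / 2` on ints gave `3`):

```python
import io

# True division is the default; floor division is spelled explicitly.
true_div = 7 / 2      # 3.5
floor_div = 7 // 2    # 3
neg_floor = -7 // 2   # -4, floors toward negative infinity

# print is an ordinary function, so separators and targets are arguments.
buffer = io.StringIO()
print("a", "b", sep="-", end="", file=buffer)
printed = buffer.getvalue()
```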

IMO you're oversimplifying things without appreciating the scope of what went into Python 3.

Core developer Brett Cannon's talk "Python 3.3: Trust Me, It's Better than 2.7" goes over most of the differences at the time (2013):


> "Python 3.3: Trust Me, It's Better than 2.7"

The fact that such a talk exists shows that the improvements aren't worth the switch for most users. If they were, they wouldn't need convincing.

Yes, Python 3 is (mostly) better than Python 2. No, it's not even remotely worth the amount of confusion and work it caused. If Python 3 had brought massive performance improvements, or proper support for multiple threads, then maybe it would have been. As it stands, Python's biggest weaknesses remain unaddressed.

> No, it's not even remotely worth the amount of confusion and work it caused.

Have you worked with pervasively non-ASCII text in Python 2? Like on a machine where any file might suddenly turn out to be non-ASCII (even if it’s only in a comment), not just data inside a few carefully-patrolled fences? Outside of a few well-behaved libraries (Flask), my experience was that it was utterly impossible. More than half of my time debugging straightforward code was spent chasing down and fixing UnicodeErrors, knowing that some will still remain. And that includes code that used nothing but the standard library.

Now you might think that’s not important, and though I disagree (especially when talking about “most users”) that would be fair. But it is a substantial improvement going from 2 to 3.

Look, the breaking change to python3 was a disaster.

There’s no need to sugar coat it.

It was incompetently managed, in a way that made the technical success of the work look like a failure.

That’s on them. They screwed up.

I lived through it; I feel no particular need to smile and nod and say “it wasn’t that bad”. It was bad.

Things are good now! Python 3 is great, it’s well supported and the people involved all learnt a lot about how important managing communities is, as well as writing good code.

So, all round, a success!

…but, if you don’t acknowledge failures in the past, you’re doomed to repeat them.

> Now you might think that’s not important

I don’t think your experience with Python 2 is not important; I had similar issues with it.

No one is seriously going back to Python 2 at this point.

However, the process is a lesson worth studying; and the take away is not “what a technical success!”

… it is: never do this.

Big scary breaking migrations are bad, they hurt your community, and leave bitterness behind them.

Even if the result is better, technically; there are other factors that need to be considered too.

> No one is seriously going back to Python 2 at this point.

No, the community is stupid and wrong. My machine learning professor (actual professor, not a teaching assistant) only used Python 2 in 2017. Imagine how many people had their opinions shaped by someone like them.

> … it is: never do this.

By that standard, we would never get things like Angular 2 or Raku.

Would it be better if we renamed Python 3 to some other name and pretended it was a completely new language?

> Would it be better if we renamed Python 3 to some other name and pretended it was a completely new language?

Quite possibly yes? It doesn't run python2 code, it can't be inter-linked with python2 code, so is it really the same language? The hybrid dialect that works in both does exist but you have to have compatibility shims in a few places.
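For anyone who never had to write the hybrid dialect, here is roughly what such a shim looked like. This is only a sketch: `PY2`, `text_type`, `binary_type` and `ensure_text` are names made up for illustration (libraries like six shipped similar helpers), not anything from the standard library.

```python
import sys

# Hypothetical compatibility shim: pick the right type names once,
# then use them everywhere instead of str/unicode/bytes directly.
PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
    binary_type = str
else:
    text_type = str
    binary_type = bytes


def ensure_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return value


print(repr(ensure_text(b"abc")), repr(ensure_text("abc")))
```

The point is that every string boundary in the codebase has to go through something like this, which is why the hybrid dialect felt like a third language.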

By comparison, most C compilers will let you use C89 and C17 code in the same project provided you set the compiler options, and C# provided the "netstandard" target for libraries to smooth the transition between Framework and Core runtimes.

> No, the community is wrong.


I mean, Russ basically said that about all the work people were doing on the golang package manager and did his own thing, and it worked out. Good job! Sometimes, a smart person can do something that is better when they're deeply steeped in the domain.

> No, the community is stupid.

I personally find that most communities have a lot of very clever, very thoughtful people in them, and if you listen, you can learn a lot from them.

Just because they don't agree with you, doesn't make them stupid.

I think that it's basic respect to listen to your community, acknowledge them and think about what they say, even if you go your own way.

...but, if you (as a maintainer) just think they (your community of users) are just a bunch of idiots, do whatever you like. They are idiots if they hang around in a community like that.

IMO Angular 2 was a huge fail in usability. Angular 1 was a joy to use and 2 was a disaster in terms of writing code in it.

However, I am firmly in the Python 3 camp. It is a better language (all the newfangled bullshit notwithstanding).

There is another universe where Python 3 only fixed the unicode strings, and the transition was a huge success and over in a couple of years.

The print() function was absolutely not worth all the carnage it caused. They could have just kept the print statement around and introduced a printing function with a different name, for example.

The fact that they removed syntax for Python 3 meant that all kinds of scientific computing code had to be transitioned even if it didn't use any strings!

The print function was the simplest part of the transition. For one you could find and replace all instances of it fairly safely across your codebase and you could quickly discover any places where things broke because of a bad replace. That part took like an hour for a large codebase.

The Unicode handling on the other hand took ages because string handling is not easily searchable and replacing it requires understanding what the piece of code actually does.

So you have it exactly backwards: if the only migration was the print function then it would be a quick and relatively painless process. But fixing the hard thing which was the broken str/byte model was the hard transition.

I am curious if you actually went through this transition or if your opinion on this is more based on observation of others’ work.

Python is a scripting language. Imagine if bash suddenly changed its syntax so that instead of printing using echo, you now have to write echo(). Sure it’s just a pair of parentheses. But you probably have thousands of shell scripts on your computer. Are you really going to get your hands dirty and risk editing them without knowing what they do?

Of course not, you’re just going to revert back to the version of bash that doesn’t require parentheses and sit on it for as long as you possibly can.

Code exists in three states: actively maintained, done/mature, and obsolete.

Python 3 attacked mature code, forcing developers to either go back into active maintenance or to obsolete it early.

I did transition some Python 2 code to 3. But mostly it was uneconomical and risky, so the vast majority of it sat on Python 2.7 until it was no longer needed.

I feel that most of the people who had a positive experience with Python 3 were people doing webdev using Django. They were working on monolithic actively maintained projects and so transitioning to v3 was just a normal maintenance activity.

You are correct that my experience is mostly, though not exclusively, with Django projects. I guess maintaining your code pays off in the long run.

Again, you are arguing that running 2to3 and testing your code was hard. I guess that may be true. But having hunted for a ton of subtle Unicode bugs that problem was very worth fixing.

Wait, if it didn't use any strings, then it didn't use the print statement either, right?

For printing it's not a problem because a string containing unicode will be transparently printed in both Python 2 and 3 (at least on Linux/Mac, not sure how Windows deals with it).

It's also not a problem if you only have ASCII strings.

> There is another universe where Python 3 only fixed the unicode strings, and the transition was a huge success and over in a couple of years.

Didn't the unicode changes go on until at least Python 3.6? The issue that most C APIs did not care about their byte strings having any encoding at all seemed to cause one headache after another, especially on Posix, where everything touching the operating system could return random bytes. Reducing the scope to unicode wouldn't have sped up the transition one iota.

Yes, I have and it was simple but with plenty of footguns for developers who had not grokked how you were supposed to work with strings. I am happy we moved away from that but I am not happy with the solution Python 3 picked. I think e.g. Ruby 1.9 came up with a nicer solution to the problem and one which broke much less code.

Uh, Python 2 was absolutely fine at working with non-ASCII text as long as the only operations you try to do on it are "take the length" and "reproduce a string verbatim". Notably, it did not forcibly explode when encountering a character appearing in a comment or string literal or whatever that couldn't be interpreted as utf-8. This was a real step back for code that, oh, say, used filesystem interfaces to look at untrusted filenames.

Not if “reproduce the [possibly Unicode] string verbatim” includes concatenating it with others in order to insert it into some sort of context, that will absolutely blow up if you’re not careful and unfortunate input data comes in (I hope that happens before the code enters production!).
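To make the concatenation point concrete, here is a small Python 3 sketch (the variable names and the sample name are mine). In Python 3 the mix of bytes and text fails at the concatenation itself, whatever the data is, whereas Python 2 would implicitly decode as ASCII and only fail once non-ASCII data showed up.

```python
name_bytes = "Борис".encode("utf-8")  # what a file or socket typically hands you
greeting = "Hello, "

error = None
try:
    message = greeting + name_bytes  # mixing text and bytes
except TypeError as exc:
    # Python 3 refuses up front, regardless of what the bytes contain.
    error = str(exc)

# The explicit version: decode at the boundary, then work with text.
message = greeting + name_bytes.decode("utf-8")
print(error)
print(message)
```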

For comments and string literals in Python code, note that Python 3 changed the default from Latin-1 to UTF-8, but you can still use -*- coding: -*- to change that. Bytes literals outside of ASCII range you’ll still have to escape. As for program input that’s possibly invalid UTF-8 but not in places you care about, you’ll need to set errors='surrogateescape' or similar explicitly (or possibly set an 8-bit encoding like Latin-1, though not if there’s a chance of UTF-8 in places you do care about).
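A minimal sketch of the `errors='surrogateescape'` approach, assuming input that is mostly UTF-8 with a stray Latin-1 byte in it (the sample bytes are made up). The undecodable byte is smuggled through as a lone surrogate and restored on the way back out.

```python
raw = b"caf\xe9 \xe2\x82\xac"  # a Latin-1 'é' followed by a valid UTF-8 euro sign

try:
    raw.decode("utf-8")  # strict decoding rejects the stray 0xE9 byte
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("utf-8", errors="surrogateescape")
print(ascii(text), round_tripped == raw)
```

The catch is that the resulting string contains lone surrogates, so encoding it without the same error handler will raise.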

Paths are a bit painful, yes, both because there’s nothing on Unix systems that precludes /home/alice having filenames in UTF-8 and /home/boris in KOI8-R, and because NT paths are not byte sequences at all (they are WCHAR, that is 16-bit-number, sequences, in no way required to be representable in the current CP_ANSI). Having uninterpreted byte sequences as filenames would work to solve the former issue, and it does: I think most functions in os will accept bytes paths as input and treat that as a signal that you want bytes paths as output (if any). For the latter the cleanest solution is probably an abstract Path type—and you get one, in the standard library’s pathlib. That didn’t work all that well in early Python 3 versions, but it does now. (And IIUC inspired Rust’s abstract-type solution to the problem.)
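A quick sketch of the bytes-in, bytes-out behaviour, using `os.listdir` as the example; the temporary directory and file name are only there to make it self-contained.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "example.txt").write_text("hi", encoding="utf-8")

    # str path in, str names out.
    str_names = os.listdir(tmp)

    # bytes path in, bytes names out: no decoding is attempted.
    bytes_names = os.listdir(os.fsencode(tmp))

    # os.fsdecode applies the filesystem encoding and error handler.
    decoded = [os.fsdecode(name) for name in bytes_names]

print(str_names, bytes_names, decoded)
```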

This is true, almost all "string" operations require you to know extra information about the string's "type"—which, yes, includes its encoding.

Python 3 takes this fact and runs with the stance that strings should, by default, carry their encoding with them (and that encoding is specifically, more or less, a sequence of quasi-Unicode code points with unspecified internal representation). But this is an intensely half-baked solution to the wrong problem, because you need to know much more about a string than its encoding for just about any string operation you'd want to do. Just with Unicode code point concatenation, you can run into plenty of trouble with:

- strings containing control characters
- strings with unpaired surrogates (which okay, shouldn't really be there, but get used for various reasons anyway)
- strings containing combining characters
- strings in different natural languages
- strings in languages with ligatures
- strings with ligature-like emoji

(and this is obviously not remotely close to exhaustive.)
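The combining-character case is easy to show. A small sketch (variable names are mine): two strings that render identically but differ at the code point level until you normalize them.

```python
import unicodedata

composed = "\u00e9"     # 'é' as a single code point
decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

# Same rendering, different code point sequences and lengths.
same_without_normalizing = composed == decomposed
lengths = (len(composed), len(decomposed))

# Concatenation can silently change the previous character:
# appending a combining accent to "cafe" yields something that renders as "café".
glued = "cafe" + "\u0301"

# Normalizing both sides to NFC makes them comparable.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
print(same_without_normalizing, lengths, same_after_nfc)
```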

Not the one you replied to, but we were doing things mostly without strings (audio processing, stuff with bits) and python 2 worked perfectly fine. We had started this project as late as 2013 because the important libraries had not been ported yet and when the stuff wrapped up in 2017... I'm not sure if we could have switched by then.

So personally I'm ok with Python 3 but I actually never ran into the described problems despite using Python 2 for years, so it was at least not a universal problem that needed to be solved in such a breaking way.

So not a single Cyrillic path in your INI or CSV files, or in the user input (CP_ANSI) to be printed onto the Windows console (CP_OEM, unless you’re in IDLE or redirecting the output, in which case CP_ANSI)? (The standard library’s CSV module was particularly atrocious at Unicode as I remember.) Not a single Chinese comment in your hardware descriptions? I envy you, but that’s still a very specific corner of the world.

We were working exclusively with filenames (and files) that we generated ourselves, moving them around in our own infrastructure, and writing all of our own configs, yes :P

I didn't insinuate that there weren't problems or that I wanted to keep Python 2 - just that it was perfectly normal to use it and not run into problems. I think I only remember a handful of encoding problems with my other python 2 adventures...

Python 2 strings pretty much just worked with UTF-8, didn't it? Or did they manage to make Python 2's string handling worse than C's?
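For context on "just worked": a Python 2 `str` was a byte string, so length and slicing counted bytes, not characters. A rough Python 3 illustration of the same behaviour, using `bytes` to stand in for the old `str` (the sample word is mine):

```python
word = "na\u00efve"         # "naïve", five characters
utf8 = word.encode("utf-8")  # what a Python 2 str would have held

char_count = len(word)   # counts code points
byte_count = len(utf8)   # counts bytes; 'ï' takes two of them in UTF-8

# Slicing bytes can cut a character in half.
try:
    utf8[:3].decode("utf-8")
    slice_ok = True
except UnicodeDecodeError:
    slice_ok = False

print(char_count, byte_count, slice_ok)
```

So passing UTF-8 through untouched was fine, but anything that indexed, sliced, or measured the string was operating on bytes.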

Agreed. Unicode change alone would be sufficient to do Python 3.

> The fact that such a talk exists shows that the improvements aren't worth the switch for most users. If they were, they wouldn't need convincing.

That's reading far too much into a tongue in cheek presentation title.

Besides, good devs are typically reasonably skeptical of new and shiny.

And when you've got a large Python 2.7 codebase, and Python 3 is backwards incompatible, you'd be looking at a lot of work to upgrade, so damn straight you'd need convincing to make the investment.

Hell, look at how many Java codebases are still 1.8 because of the upgrade cost of going past that, even though modern Java has many compelling features.

And expecting Python 3 to remove the GIL is incredibly unrealistic. How much Python code in the wild implicitly depends on the behaviour of the GIL? All of it.

OCaml 5 recently solved the global lock problem by introducing concurrent domains in which one or more threads share a lock. So old code keeps spawning threads with a shared lock, but new code can work with single-thread domains and manage which domains old code runs in.

> How much Python code in the wild implicitly depends on the behaviour of the GIL?

About as much as depended on Python 2-specific features?

Sure, but the GIL behaviour is inherent to all concurrent (for a given value of concurrent) Python code. And code that could be run concurrently.
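As a hedged illustration of what "implicitly depends on the GIL" can mean: plenty of threaded code appends to a shared list with no lock, relying on CPython making a single `list.append` effectively atomic. The sketch below (all names are mine) spells that assumption out with an explicit lock, which is roughly the kind of audit a GIL removal would force on every such call site.

```python
import threading

results = []
results_lock = threading.Lock()


def worker(worker_id, count):
    for i in range(count):
        # The explicit lock states the assumption out loud instead of
        # leaning on the interpreter to serialize the append.
        with results_lock:
            results.append((worker_id, i))


threads = [threading.Thread(target=worker, args=(n, 1000)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results))
```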

Removing the GIL would be bloody fantastic, but it's far more involved than 2 to 3.

A lot of codebases could be converted from Python 2 to Python 3 with just some fairly mechanical work and light refactoring on top. GIL would be a thing of a completely different magnitude.

It’s so trivial! All that needs to happen is someone else putting in the work. And for free, please!

2to3 was always available to do that work, for free, yes.

That's a bit cynical since I think if we were still dealing with the issues python 3 fixed today, I'd be very disappointed in python overall.

And, the latest iterations of Python3 have real usability improvements - the error message improvements alone make development a lot nicer. And the perf improvements and jit coming I think in 3.13 will really make more people consider py3 for their new projects.

>The fact that such a talk exists shows that the improvements aren't worth the switch for most users.

More like it shows that many people like GP are only aware of the superficial changes.

I bet that you did not watch the presentation.

Also, the fact that there is a talk with such a name does not mean anything other than that someone is out there trying to clarify what the transition is about.

(There could have been a video "Windows 7: Trust me, it's better than Windows XP" and you probably still would say that it showed that improvements aren't worth the switch for most users.)

This talk's title probably comes from the awful Python 3.0+ versions. 3.0 was so bad that they had to fix and revive certain things in the later versions. IIRC 3.3 was the first version that was considered decent enough that you could use it for more serious things.

I think the issue here is that Python 3.3 was more like 3.0-dev-rc3

The first python 3 versions were not great and were missing some things. I think 3.3 didn't even have pip built in

Python 3 paved the path for the changes that you mentioned you wished for, some of which are being tackled now. It's not only about what Python 3 offered at the time, but also about what it made possible in the long run.

How does print-as-function help anyone remove the GIL?

Damn, get over it. Who cares that print is a function now?

While that specific example is stupid the general question is very valid. I can't see what paved the way. The most invasive change was the unicode one and even that seems irrelevant to eg the GIL.

simpler code is always important to pave the way for innovation

Would you still feel that way if Python had a series of upgrade versions that "hand-held" the migration and giving warnings about behavior that was going to break?

I ask, because I know I would have wanted that.

The changes Python made were made in Perl without breaking compatibility; you just wrote, say,

    use v5.24
in the header to use a given feature set (it defaulted to something old so as not to break old stuff)

The Py3 approach was terrible and wasted an untold number of hours just because you had to migrate everything; you couldn't just upgrade the codebase piece by piece like in the case of Perl.

At the very least they should've just put the new one under .py3 and made the Python 3 interpreter transpile Py2 code at runtime

Isn't that effectively what was done with Python? You'd do "#!/usr/bin/env python" for old code and "#!/usr/bin/env python3" for new code. Rather than it being wrapped up in a single entry-point, you had the different runtimes and library sets.

> You'd do "#!/usr/bin/env python" for old code and "#!/usr/bin/env python3" for new code

.. but you can't use old libraries in new code.

Just to clarify, in Perl you can have your main code be in Perl 6 (or whatever), and then "import" and use libraries that "use v5.26"?

No, Perl 6 is a completely different language (now called Raku, because otherwise it sounds like a newer version of Perl 5, and never intended to be compatible with Perl 5). But because of the Perl 6 situation, even a change the size of the Python 2 to 3 change would be a "Perl 5" version update.

I think having Python 2 vs. 3 is a clearer demarcation, myself, but the Perl community cares a LOT about backwards compatibility.

No, it's completely different. The newer Perl versions correctly continued to execute the old but already working scripts.

The Python mess was, IMO, a typical example of bureaucracy inventing for itself new but previously unnecessary work to justify its existence, so I agree that the decisions of how to introduce Py3 features caused (and still cause) a waste of an immense number of hours world-wide that could have been used more productively. Sad.

> No, it's completely different. The newer Perl versions correctly continued to execute the old but already working scripts.

There was some breakage between 5.6, 5.8, and 5.10; Unicode is hard, and it takes time to get it right. But I think the key difference I've heard as a mostly python avoider is the intent was for code written for 5.6 to probably work in 5.8 and 5.10 and if it doesn't, for there to be a way to have one file that works for all versions.

From what I understand, it's not easy to have a python module/script that works in 2 and 3, and you can't go to 3 unless your dependencies do, so if you have a lot of dependencies (as is modern), you're stuck on 2. Your dependencies won't want to move to 3 either, because their users are stuck on 2, so if they just switch to 3, they're dropping users; instead they need to support two parallel versions of their code. Most perl modules didn't have to do anything special to support 5.6 and 5.8, but if they did, it was usually small and it could be done within the file with conditional compilation --- I don't think that was an option for python.

For sure. I'd argue it was debatable to even be "normalizing" all the built in packages to be consistent.

I think very few people had trouble remembering them, and if you're using one, you'll be looking up documentation for anything with more than a few methods.

It really is frustrating in these situations with high level languages. It's super possible to do the majority of changes they did in non breaking ways.

Hell, they do it already with the __future__ package. Where did all that compatibility work go in python3?
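For reference, the `__future__` module records when each opt-in feature became available and when it became the default. A small sketch that just reads that metadata (the `releases` dict is my own name); it shows that several Python 3 behaviours were opt-in on late Python 2 releases.

```python
import __future__

releases = {}
for name in ("print_function", "unicode_literals", "division", "absolute_import"):
    feature = getattr(__future__, name)
    optional = feature.getOptionalRelease()    # first release where you could opt in
    mandatory = feature.getMandatoryRelease()  # release where it became the default
    # Keep only the major version numbers for a compact summary.
    releases[name] = (optional[0], mandatory[0])

for name, (opt_major, mand_major) in releases.items():
    print(f"{name}: opt-in since {opt_major}.x, default since {mand_major}.x")
```

What this mechanism never offered was the reverse direction: there is no switch that makes a Python 3 interpreter accept Python 2 semantics for one file.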

The people working on Perl still support running in old mode. You can grab the latest Perl 5 release and run old-style Perl code in it. The people behind Python don't support Python 2.

Just because the old way was deprecated doesn't mean that it wasn't possible to run old and new code next to each other.

Also, Perl 5 is a very different example. While I haven't followed Perl for quite a while, ISTR that Perl had a very similar transition from 4->5 (I personally remember spending a lot of time trying to get old code working on newer OSes). Also, AFAIK but I may be wrong, Perl 6 has not really taken off like Python 3 did.

Yes; and now we have to keep every version of Perl around in our package managers because they're all used by some code.

That's terrible practice.

You don't. The directives in Perl code tell the Perl interpreter, even the very latest one and it's the only one installed on your machine, how it should interpret loading the code.

If you don't write anything special in your code, Perl doesn't have a "say" method (because it didn't have one in Perl 5.8). If you declare in the code that your code is for Perl 5.10, magically there are all the Perl 5.10 features, including "say". You can also declare you want specific features, rather than declare a specific Perl version.

  $ perl -v | head -2 | tail -1
  This is perl 5, version 30, subversion 3 (v5.30.3) built for x86_64-cygwin-threads-multi
  $ perl -e 'say 1'
  Number found where operator expected at -e line 1, near "say 1"
          (Do you need to predeclare say?)
  syntax error at -e line 1, near "say 1"
  Execution of -e aborted due to compilation errors.
  $ perl -e 'use 5.010; say 1'
  $ perl -e 'use feature "say"; say 1'
The best thing about this is you can mix and match libraries from earlier and newer Perl versions. If a library says they want Perl 5.10, they get Perl 5.10, even if you want both Perl 5.30 and that library.

If only Python were as reliable for backwards compatibility.

The Perl community is going through their own Py2->3 moment, and they're doing it with great care. They're going to create Perl 7, which will be Perl 5 where the "defaults" make the language look more like Perl 5.32 rather than Perl 5.8 [1]. Modern Perl scripts don't need so much boilerplate to enable modern features. They know exactly the trouble this will cause; if Perl 7 ever becomes the default Perl interpreter, the only one, it _will not run_ lots of very old Perl scripts untouched, because the defaults will make years worth of language changes suddenly visible to those scripts. Even if the fix is just to slap "use 5.008" on every script, they worry about it. That's a much better attitude than the Python developers.

[1] https://www.perl.com/article/announcing-perl-7/

> The directives in Perl code tell the Perl interpreter, even the very latest one and it's the only one installed on your machine, how it should interpret loading the code

> That's a much better attitude than the Python developers.

You do realize that this comes at a cost, right?

A lot of decisions in Python3 (and further) are based on the desire to keep the interpreter simple.

Python Core Team doesn't have unlimited resources, and even if they did, the cost of complexity grows non-linearly.

py2 -> py3 transition did not go as planned, but that doesn't mean that following Perl route would have had better outcomes.

That means the piece of software will work also in the future.

Personally I only use python for throwaway scripts. Whatever is written in Python likely won’t work in 10 years anymore. So they’re all lost in time. Whereas I can continue some of my C++ eternity projects and they still work perfectly fine.

Unless I'm missing something, what in here is supposed to justify a decade of pain? Hell, most of it could have been easily done in Python 2, slowly transitioning codebases via warnings and errors.

The removal of implicit conversion between bytes and strings via ascii did not introduce bugs, it showed where you already had bugs that you maybe did not yet notice.

That's only true if you ever received Unicode input. There are plenty of uses of strings that never do - enums, DNS domain names, URLs, HTTP parsing, email addresses (from any sane provider) etc.

Strings are still strings in Python 3; if you do 'foo' == EnumValue then that will work fine in Python 2 and 3. If 'foo' is from an unknown source: yeah, you might get a bytes type in Python 3 and an error, but that's the entire point. Turns out that in practice, it can contain >0xff more often than you'd think.
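One detail worth spelling out, as a sketch with made-up names: by default a bytes-versus-str comparison in Python 3 does not raise, it just evaluates to False. You only get a warning or an error if you run the interpreter with `-b` or `-bb`.

```python
EXPECTED = "foo"   # a text constant, as in the comparison above
incoming = b"foo"  # the same value arriving as bytes from an unknown source

# Equal-looking values of different types never compare equal.
matches_raw = incoming == EXPECTED

# Decoding at the boundary makes the comparison meaningful again.
matches_decoded = incoming.decode("ascii") == EXPECTED

print(matches_raw, matches_decoded)
```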

Certainly today DNS domain names, URLs, and email addresses can – and do – contain >0xff input, and for some of these that was the case in 2008 as well (URLs and email addresses – IDN didn't come until a bit later).

The Python 2 situation was untenable and led to a great many bugs; "decode errors" were something I regularly encountered in Python programs as a user. In hindsight, the migration effort for existing codebases was understated and things could have been done better with greater attention to compatibility, but the problem it addressed was very real.

I've seen way too many cases (possibly resulting from 2to3 autoconversion) where the code ran without errors, you just couldn't log in because the xsrf token was "b'123456'" instead of "123456".
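That failure mode is easy to reproduce. A minimal sketch (the token value is taken from the comment, the variable names are mine): calling `str()` on bytes, or formatting bytes into a string, gives you the repr rather than the text.

```python
token = b"123456"

broken = str(token)                # the repr, including the b'' wrapper
also_broken = "{}".format(token)   # string formatting does the same thing
fixed = token.decode("ascii")      # what the code actually wanted

print(broken, also_broken, fixed)
```

Nothing raises here, which is exactly why it slips through: the value is a perfectly valid string, just the wrong one.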

> DNS domain names, URLs, and email addresses can – and do – contain >0xff input

Is that true? My understanding is that DNS never supported non-ASCII and so punycode was invented.

DNS-the-protocol still doesn't support non-ASCII input, but DNS-as-people-use-it does. I expanded on that in another comment I just posted, so I won't repeat it here: https://news.ycombinator.com/item?id=34230218

I mean, in the most basic sense you could always have non-ASCII chars in your filenames on a webserver, which could be part of a URL then.

But it is also very rare that you ask directly for domain name resolution in the first place; you are usually instead using something directly or indirectly that happens to be remote, and that eventually happens to have a punycode encoded non-ascii hostname or top level domain. But there's no guarantee that you (or the libraries, or the libraries that your libraries use...) are only handling the ascii punycode.

I can count on my fingers the number of places I invoked manual DNS lookups inside production code.

I mean...




Just because you can get away with ignoring the non english speaking world doesn't mean Python should pretend it can.

You can't use Unicode characters in HTTP messages - IDNs and IRIs are encoded into ASCII before being sent on the wire (using punycode for IDNs and percent encoding for IRIs).
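A small sketch of both encodings using only the standard library. The hostname and path are illustrative values of mine, and note that Python's built-in `idna` codec implements the older IDNA 2003 rules, so treat this as a demonstration of the idea rather than a complete IDN implementation.

```python
from urllib.parse import quote, unquote

host = "b\u00fccher.example"  # "bücher.example", a made-up hostname
path = "/caf\u00e9"           # "/café", a made-up path

# Punycode (via the idna codec) turns each non-ASCII label into an xn-- label.
ascii_host = host.encode("idna")

# Percent-encoding turns the UTF-8 bytes of the path into %XX escapes.
ascii_path = quote(path)

print(ascii_host, ascii_path)
print(ascii_host.decode("idna"), unquote(ascii_path))
```

So the wire format is ASCII, but any code that accepts or displays these values still has to handle the Unicode form on either side of the conversion.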

As for RFC6531, my understanding is that virtually no email provider implements it, because of the same risks that make browsers often show IDNs as their punycode version - Unicode is extremely easy to use to confuse people maliciously or accidentally, since it contains vast amounts of duplicate characters (e.g. Latin a and Cyrillic а), even larger amounts of similar looking characters (e.g. Latin r and Cyrillic г), and even characters that are ambiguous unless you choose a locale! (the CJK problem, where the same Unicode codepoint can represent different characters based on the locale - whose locale to use when communicating between two machines being the implementer's problem).

Also, I'm not saying that Python shouldn't have had proper support for separating byte arrays from encoded strings. I was only pointing out that there were actual legitimate use cases where a valid Python 2 program was broken by the Python 3 Unicode string transition, whereas the GP was claiming that the Python 2 program had to have been buggy already.

Edit: reading around more, it seems that RFC6531 is getting some traction, and many providers accept sending to/receiving from internationalized emails even if they don't themselves allow you to have a Unicode email (e.g. you can't have айа@gmail.com, but you can correspond with someone having such an email at a different provider). So, email was a bad example in my list. The rest still stand.

No doubt some things broke "needlessly", or that things couldn't have been done better, but I don't see how it could have been avoided since there is no way to distinguish between "I know that this string will always be ASCII" vs. "this string can contain non-ASCII".

For example, what if I want to enter "ουτοπία.δπθ.gr" in an application, via direct user input, a config file, or something else? Or display that to the user from punycode? No one expects users to convert that manually to "xn--kxae4bafwg.xn--pxaix.gr", and no one will understand that when being displayed, so any generic IDN/domain name library will have to deal with non-ASCII character sets.

The same holds true for email addresses: "ουτοπία <a@example.com>" is an email address. Sure, this may get encoded as "=?UTF-8?q?…?=" in the email header (not always, "bare" UTF-8 is becoming more common) but you still want to display that, accept it as user input, etc. People sometimes forget that the name part of an email address is widely used and that any serious email system will have to deal with it, and non-ASCII input has been common there for a long time.
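A rough sketch of that round trip with the standard library's `email` package, using the address from the comment. Which encoded-word form comes out (base64 or quoted-printable) is up to the library, so the sketch does not depend on it.

```python
from email.header import decode_header, make_header
from email.utils import formataddr, parseaddr

display_name = "ουτοπία"

# formataddr encodes a non-ASCII display name as an RFC 2047 encoded-word.
header_value = formataddr((display_name, "a@example.com"))

# Going the other way: split name and address, then decode the encoded-word.
encoded_name, address = parseaddr(header_value)
decoded_name = str(make_header(decode_header(encoded_name)))

print(header_value)
print(decoded_name, address)
```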

In specific applications you can often "get away" by ignoring non-ASCII input because you sometimes don't need it. For example I'm working on some domain name code right now which can because everything is guaranteed to be ASCII or "punycode ASCII", so it's all good. But in generic libraries – such as those in Python's stdlib – that's much harder.

In Germany, there are two competing flight comparison engines with very similar names: fluege.de and flüge.de.

How would you send an email to the customer support of flüge.de? How would you parse that domain?

IDN is here, and it's here to stay.

Even a decade ago, you would have needed to support Unicode if you handled any of those strings. IDN domain names existed as far back as 2003, so unless you could guarantee that everything was in A-label form already, you would need to worry about that (which affects URLs and email addresses as well). URL paths might be Unicode if no one normalized it to percent-encoding first. And HTTP headers--like email headers--could well be non-ASCII despite the standard prohibiting unencoded non-ASCII text because the real world is full of shitty implementations that don't follow standards, and the internet community generally runs on the principle that it's better to force everybody else to try to make sense of the result than tell those people to fix their code.

Indeed, I just had to port some code that had been running happily for years at the south pole to py3 (due to EL8 upgrade and not wanting to install the py2 world). It was something talking to a HV supply over a serial port which of course only spits out bytes, but then needed to parse using string handling. It wasn't that hard to port but it took a few tries to find all the places necessary, requiring debugging over a rather slow ssh connection (that is only up when the satellite is up).
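This is the typical shape of that kind of port, as a sketch. The reply format and the `parse_reading` helper are invented for illustration; the real device protocol is not described above.

```python
def parse_reading(raw: bytes) -> float:
    """Parse a reply such as b'VOLT 1234.5\\r\\n' (format invented for this sketch)."""
    # Python 3 serial reads return bytes, so decode once at the boundary...
    text = raw.decode("ascii").strip()
    # ...and do all the string handling on text from here on.
    label, value = text.split()
    if label != "VOLT":
        raise ValueError("unexpected reply: {!r}".format(text))
    return float(value)


print(parse_reading(b"VOLT 1234.5\r\n"))
```

The tedious part in practice is finding every place where bytes flow into string handling, since each one needs the same treatment.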

For speakers of languages that cannot be written in ASCII-only, the importance of the improved unicode handling really cannot be overstated, especially for people new to programming.

Absolutely. Frankly, it's 2022. Putting some useful subset of unicode in is mostly a solved problem.

At least python3 got that right, unicode by default.

Heh, I still regularly write workarounds to deal with the Python 3 transition. These days, the main cause of problems is the disagreement of what the python 3 shebang is. A lot of especially Google code "supports" Python 3, but expects `#!/usr/bin/env python` to work. That means messing with environment variables and files so that `python` is available as a wrapper script which just executes `python3`.

I don't understand why there apparently wasn't a clearly stated preference about `python` being Python 2 and `python3` being Python 3 from the Python org.

strings, bytes, utf-8 and more have continued to stymie me while converting code over.

Honestly curious if it would have been possible to make it better/faster.

i think it’s clear this was a huge mistake. none of these changes were critical to python’s current success. python stalled for 10 years if not more because of these non-BC changes. it will be fragmented indefinitely. this is a good case study of how one big ego can completely derail a massive project for decades

It won't be fragmented indefinitely, that claim might've carried some weight five years ago, but these days I see very few Python libraries that still support Python 2.7. And, if they do, it strongly implies that they've not been maintained for some time. (See also, a dependency on six) - with the caveat that sometimes you don't need to continually upgrade a library that fills its desired niche perfectly.

Hell, if you're still using 3.6 you're limited in your ability to upgrade dependencies, 3.8 is generally the minimum supported Python version these days.

The people it lost are going to things other than python3.

Even if that's true, it's extremely obvious that Python is the language of choice for some very hot subfields at the moment—things like ML, computer vision, etc.

It may have lost a lot of people who got frustrated by the 2->3 transition, but Python is clearly not in a bad place as a language today.

Is that true? I have little data, but I'd imagine most are just working on small/mid size scripts that can be upgraded easily. There really aren't many big projects your average joe will be coding in python, from what I've seen.

Where would they go from python? Node?

Not disagreeing.

Yeah, I mean personally the only thing that would keep me in the community is wxPython, which AFAIK still only supports Py2.

The last wxPython release that even supported py27 was from close to three years ago, and you would have known if you spent half a minute looking into it before making an absurd claim.

> Assume good faith.

I tried using the library less than a year ago. I must have missed something. This would be great news… if the way you told me didn’t make me feel like shit.

Python is still fragmented 12 years on. It might eventually get fixed, but it still isn't.

Most actively maintained Python code works with Python 3 now, but there's still a whole lot of even maintained code which uses `#!/usr/bin/env python`.

Can you explain how exactly Python stalled as a result?

A lot of devs were busy with the busywork of migration rather than improving their libraries. For a long while you often got stuck between one critical library that has no alternative only supporting 2, with a second critical library for some other thing that also has no alternative only supporting 3. So doing your project in Python wasn't feasible.

I really feel like the community was a difficult pain in the ass during the 2->3 migration. There were breaking changes but I’m sure the Python maintainers didn’t expect the community to react so badly.

Are there some valid reasons that made 2->3 migration insanely hard for some projects? I remember seeing blog articles whining about print vs print(), but surely there must be some more important stuff.

The python maintainers went out of their way to break things - and were smug about it. That’s what it felt like to the community.

A big part of 3 was Unicode strings. In 2 you could mark a string as Unicode with a `u""` prefix. Keeping that would have been a great way to let libraries and code work with both 2 and 3. They banned it, but you still had to use `b""`. This was the attitude, and it's just one example.

The list of stuff that got hard / slow was long. Migration path was unnecessarily difficult as was compatible code. Somewhere around 3.4 it was like a light switch flipped and they started being more reasonable. ASCII handling improved (yes, not all internet protocol stuff is Unicode), they began making it easier to target 2/3 etc. but it was horrible to start. I’ve thankfully forgotten some of the details :)

> Migration path was unnecessarily difficult as was compatible code.

I feel this take is outright wrong. Python provided its 2to3 tool[1], which took care of the bulk of the work required to port Python2 code to Python3. The only code it did not handle was code with egregious errors that worked by coincidence, such as handling bytes as strings and vice-versa. Porting old code to Python3 was a breeze that consisted of running 2to3, running the tests with python3, and if anything broke then just touching up the code to get it to work. I know it because I personally ported half a dozen projects throughout the years.

I'll go as far as to claim that most problems porting old python code to 3 were either upstream dependencies dragging their feet or internal human/organizational issues.

[1] https://docs.python.org/3/library/2to3.html

This is totally false - code produced by 2to3 no longer ran on 2. So you ended up w a chicken and egg problem. Things like `u""` would allow folks to keep compatibility with 2 while working in 3.

Your comment is a perfect illustration of the issues. Lots of user blaming. No actual solution. As I said, it did start to get massively better at some point. Instead of condescending lectures on org issues they, for example, began allowing `u""` in 3, which did not mess up Unicode handling in 3.
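
For reference, the `u` prefix came back in Python 3.3 (PEP 414), where it is accepted and does nothing. A quick check, assuming a 3.3+ interpreter:

```python
# On Python 3.3+ the u prefix is accepted and ignored: both literals are plain str.
legacy = u"caf\u00e9"
modern = "caf\u00e9"

print(type(legacy) is str)   # True
print(legacy == modern)      # True

# Bytes still need their own prefix, which Python 2.6+ also accepts.
raw = b"caf\xc3\xa9"
print(raw.decode("utf-8") == legacy)   # True
```

So a module using `u""` and `b""` consistently can have literals that mean the same thing on 2.6+ and 3.3+.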

> This is totally false - code produced by 2to3 no longer ran on 2.

I don't see where GP made a claim that it would. Why would one expect code written for a newer version, using features that don't exist in an older version, to run under the older version?

This was the issue. Why should a new version make it hard to run old code for no particular reason except to break things?

Folks using `u""` were already being careful with Unicode. Supposedly Unicode was important enough for 3 to default to it.

The cost benefit of then destroying code using `u""` made no sense to me. It was unnecessary. This was followed by the oh just upgrade thing. I’d love to see the “quick” upgrade Guido’s employer did. My guess is that it’s a total lie that it was easy, and it probably took years and VC type money.

It’s not about using new features, it’s about not unnecessarily breaking old or allowing for cross compatibility.

> using features that don't exist in an older version

Except they broke the old way for existing features, required you to use the new way, and in most cases you could manually write code so it would still work with both. Meanwhile the only tooling provided by the python 3 crowd screwed over anyone with an existing customer base stuck on python 2, which at least early on should have been a foreseeable problem. Not to forget that anyone stuck on python 2 was literally Satan and projects were outright "shamed" to drop python 2 support.

Given dependency chains the business case for these “quick” switches was often poor as well. Yes, Dropbox had $$ and guido - but that wasn’t everyone and most code was 2 at the time.

Many folks would have been ok updating to 3 compatible approaches if it didn’t blow up their 2 story. That’s what ultimately happened in lots of cases when it became more doable. The rip and replace everything at once was a weird goal

If you published libraries on pypi you were in exactly that nightmare scenario. How do you support users on both versions of python (because let's be real, it took _ten years_ to change so you had to support both major versions) without two totally separate codebases and all the massive headaches that entails? There were some partial solutions like the library six and all kinds of related tricks to write python that was compatible with both 2 and 3 at the same time.
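
To give a flavour of those tricks, here is a minimal stdlib-only sketch of a single-source shim. `PY2` and `text_type` mirror names that six exposes; `to_text` is a made-up helper for illustration, not part of six.

```python
from __future__ import print_function, unicode_literals

import sys

PY2 = sys.version_info[0] == 2

# Pick the text type by interpreter; `unicode` is only evaluated on Python 2.
text_type = unicode if PY2 else str  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)


print(to_text(b"abc"), to_text("abc"), to_text(42))
```

The `__future__` imports are no-ops on Python 3 and switch Python 2 to the print function and unicode literals, so the same file behaves the same way on both.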

> This is totally false - code produced by 2to3 no longer ran on 2.

That's what "breaking changes" and "backwards incompatible" means.

What exactly is hard to grasp?

> So you ended up w a chicken and egg problem.

You really don't. Your code and your upstream dependencies need to be ported to python3. Once your dependencies are updated, all that's missing is you doing your job.

Python2 has been on the path to deprecation for how long? A decade?

You're fabricating problems where there are none.

> Your comment is a perfect illustration of the issues. Lots of user blaming.

There is no issue. When I had to port projects to Python3, I just ported them. No drama, no hangup. You're pretending there were problems where there were none, and you're throwing a tantrum when this fact is pointed out to you.

There is really no justification for this. It's high time people like you stop making up excuses and start to own up their misconceptions, misjudgements, and mistakes.

Some projects were rewritten from scratch in entirely different tech stacks in less time than the likes of you spent complaining that they could not update their projects from python2 to python3.

“It's high time people like you stop making up excuses and start to own up their misconceptions, misjudgements, and mistakes.”

What is it about python 3 that brings out this lecture stuff.

I would really like to see the inside of this “easy” change at even the bigger most well resourced places. Google / Facebook / Dropbox etc.

At small places where a program is working and programmer is gone they are not investing in ports.

Anyways - they did eventually get a clue. It’s much easier NOW to handle this, but it wasn’t at the start. Yes, they did put `u""` back in. No it did not destroy the world.

> Yes, they did put `u""` back in. No it did not destroy the world.

This saga isn't completely over, over here: We still have code running python2 that we don't have the room to upgrade, and libraries that have to work with both python2 and python3. Meanwhile, people are pushing "code consistency" to the point where people outside our team are taking those libraries and running the "black" formatter on them - but "black" doesn't understand python2, so it removes the "u" prefix and breaks the library.

I like black and python 3 - but that’s pretty funny! In fairness black is opinionated on purpose - even if it understood `u""` it’d remove it if it’s targeting 3 probably because it tries to reformat to one style. Amazing how long the ripples last though.

>What is it about python 3 that brings out this lecture stuff.

I think some people convinced themselves that using "" without the u"" was disenfranchising to people who don't want to use ASCII to represent their language. This transformed a practical argument over syntax into a moral crusade.

Ok, so what happens when several of your dependencies never ported?

This isn't strawmanning - this is literally the case for a codebase I'm working with today.

You should not be using those dependencies today. Looks like they have been unmaintained for a decade and can contain a ton of vulnerabilities.

Though I find it hard to believe that there are some useful dependencies that have not been ported and don't have better alternatives. Typically a "never ported" dependency is never ported because it is deprecated in favor of something better.

Does no one work in a business? Does no one work with science instruments purchased more than 5 years ago? Businesses have lots of useful dependencies that have not ported. If you are running a payroll system and it’s working, there is a real aversion to going to a new code base, and the cost benefit is not there for many businesses. You do realize python 2 is downloaded 3 million times per month now? Yes that’s down - but it’s not nothing - probably 300 - 400 thousand plus users easily

We're in a rather niche market. It's geocoding related stuff that only talks to a trusted endpoint.

> Ok, so what happens when several of your dependencies never ported?

If you're consuming dependencies which were never updated in the past decade then you have more worrying problems to deal with than porting your code to python3.

That’s not true at all - in high compliance industries if the dependency has been validated and is working changing it brings risk.

So why did, for example, Ruby not have such a painful transition from 1.8 to 1.9/2.0? My theory is that the Ruby people made the migration in many steps: it started at least as far back as Ruby 1.8.6, and they took a lot of care to make the transition as painless as possible, while the Python team just broke everything and bet that tools like 2to3 would magically solve it.

Right. I think a key miss was underestimating value of code that would work w both 2 and 3. This avoids need to upgrade all at once.

Relax things a tiny bit in 3, back port to 2 via futures, get a library like six going. It could have been a ton easier.

In fairness by 3.5 maybe they’d realized this, but a lot of lecturing was done before. Initially recommendation was to upgrade all dependencies and programs to 3 and not try and make dependencies work w both

I think Ruby's TDD culture at the time also probably helped. My (purely unsupported) opinion is that the test suites were likely more fleshed out on Ruby, so it's less of a risk to switch. That said, I still saw lots of 1.8.6 installs in the wild for quite some time after 2.0 hit.

> The python maintainers went out of their way to break things - and were smug about it. That’s what it felt like to the community.

The other side of that coin is that the python maintainers went out of their way to push back the Python 2 EOL. They kept on pushing the date back and back again, as per the sunset page:

"We did not want to hurt the people using Python 2. So, in 2008, we announced that we would sunset Python 2 in 2015, and asked people to upgrade before then. Some did, but many did not. So, in 2014, we extended that sunset till 2020."

The drawn out demise of Python 2 was, frankly, painful.

I have no time for whinging snowflakes complaining 12 years (2020-2008) was not enough time to migrate their code to Python 3. Hell, even the original 7 years (2015-2008) should have been long enough for 99.999999% of the community.

> I have no time for whinging snowflakes complaining 12 years (2020-2008) was not enough time to migrate their code to Python 3. Hell, even the original 7 years (2015-2008) should have been long enough for 99.999999% of the community.

I'm old enough to have read this when it came out...and it changed my view on backwards compatibility (from Joel Spolsky of Trello, FogBugz, and StackOverflow fame)

"Code doesn't rust": https://www.joelonsoftware.com/2000/04/06/things-you-should-...

If a company has a production Python2 application / service (with hundreds of thousands of LOC), what business value does it bring to migrate it to Python3?

At that point, if you've got to make severe changes, folks might decide to use a language that doesn't impose breaking changes (and business cost) on them. YMMV.

I mean, code does rust in many ways.

Not just talking ASCII was one of the first ways.

Drift in 3rd party library support is another.

Security support of the language and libraries is a massive one.

Simply put as hard as you try to stand still the rest of the world is not going to.

If they hadn’t extended there was talk of actually forking python. Basically people needed a 2.x compatible python, and most features being put into 3 could have been backported. So you’d maybe keep `u""`, keep print, add futures stuff for folks targeting both and then pull in 3 stuff to reduce pressure to move. They were smart not to drop 2 cold.

It's not the other side of the coin, it's a direct consequence.

They had to push back the EOL because they screwed up.

it's a volunteer project. nobody stepped up to maintain 2.x as far as I know

> and were smug about it

I feel like a lot of the people who had this reaction were reacting to being told (correctly) that how they handled strings was broken. A big reason that projects had difficulty upgrading, especially difficulty using the automatic 2->3 upgrade tooling, was because of broken handling of unicode and broken handling of binary strings. I remember reading a quick assessment of a large open-source project that there were thousands of calls to a single string-handling function that would each need to be inspected by hand to determine whether it was correct usage that would be correctly translated by the 2->3 converter or buggy usage that needed to be fixed before running the 2->3 converter, with some people guessing a 50/50 split. The effort was going to be gargantuan, and though the ultimate cause and responsibility was poor design choices in Python 2, the immediate cause was that the code was wrong, and some people didn't appreciate hearing that.

No one really blamed programmers for getting strings wrong in Python 2. It was unreasonably hard and extremely rare for people to get it right, so there was not really any shame in Python 2 programs and libraries being pervasively broken in that way. But some people still felt it was smug and condescending to make reference to this elephant in the room. This put the Python maintainers in a bad position, because they were trying to explain the benefits of upgrading and also the costs and difficulties, both of which were intimately tied to the brokenness of current codebases.

As a user and spectator to the whole drama, therefore, I felt that some maintainers got overly defensive about issues that virtually every codebase suffered from. And some of them took a "so what" attitude towards bugs involving non-Western languages, saying that if they personally didn't care whether their code handled Chinese or Bengali filenames then it wasn't a bug to do random things with them, which sounded just as bad ten years ago as it does now.

> I feel like a lot of the people who had this reaction were reacting to being told (correctly) that how they handled strings was broken

Can you be very precise about what you mean by "broken" or "incorrect" code?

In python2 you can easily mix encoded and unencoded strings. This doesn't actually work, but at worst it'll produce mangled strings. You used to see this sort of thing online quite a lot with webpages having all sorts of weird characters and question marks showing up.

Python3 enforces the difference between encoded and unencoded strings. It forces you to deal correctly with unicode. If your code base was already handling unicode correctly it wasn't much hassle to migrate. If your code base was making a mess of unicode handling --like many at the time were-- you'd run into that headfirst.
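
A small sketch of both halves of that on Python 3 (the strings are arbitrary examples):

```python
encoded = "caf\u00e9".encode("utf-8")   # bytes: b'caf\xc3\xa9'

# Python 3 refuses to mix the two types instead of guessing an encoding.
try:
    "name: " + encoded
except TypeError as exc:
    mixing_error = exc
    print("refused:", exc)

# The mangled-text failure mode: decoding UTF-8 bytes with the wrong codec.
mangled = encoded.decode("latin-1")
print(mangled)   # cafÃ©
```

The second half is the classic source of the stray "Ã©" style characters on old webpages: the bytes were fine, the codec used to read them was not.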

There are many codebases in the world that never need to deal with Unicode. Therefore, new language features which deal with Unicode are unwelcome when they replace the existing string API (which was often used to handle raw bytes).

I think this was the right decision for the long run though. What we have now is the right situation where the strings we work with in the language are 'magic' and we don't need to think about representation or encoding or whatever. And then at the periphery we encode/decode based on what the outside world is working on. It has become clear that this is the 'correct' approach for programming languages and is almost universally the approach in new languages. E.g. Rust does something similar (they don't hide that the internal encoding is UTF-8, but it's consistent).
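
In code, the decode-at-the-edge pattern looks something like this sketch. The `io.BytesIO` objects are stand-ins for a socket or file, and `shout` is a made-up function representing the program's internal logic:

```python
import io


def shout(text: str) -> str:
    """Pure text logic: no encodings in sight."""
    return text.upper()


# Stand-ins for the outside world: raw bytes in, raw bytes out.
incoming = io.BytesIO("h\u00e9llo\n".encode("utf-8"))
outgoing = io.BytesIO()

line = incoming.readline().decode("utf-8")       # decode at the edge
outgoing.write(shout(line).encode("utf-8"))      # encode at the edge

print(outgoing.getvalue().decode("utf-8"))
```

Everything between the two edges only ever sees `str`, so the encoding is chosen in exactly two places.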

I'm glad that this is what we have for Python now, despite the pain it caused. We'll be using Python for another few decades probably, so we'll reap the benefits.

> eat your vegetables

Nah, I'm good.

The era of "move fast and break things" was disastrous for a lot of things.

12 years is moving fast?

Guido has done some pycon keynotes in the recent past and said himself he takes some of the blame for the transition. He says he greatly underestimated the impact of breaking changes and the inertia to get the community to change code. In particular not shipping tools to help automate the migration, or even thinking through a migration path with targeted backwards compatibility/back ports from the start (these eventually came quite a few releases and years into the transition). I don't think it's fair to say the community was entirely at fault for the length of time.

edit: This talk he did from pycascades 2018 in particular I remember seeing and being a good retrospective: https://www.youtube.com/watch?v=Oiw23yfqQy8

I think both of you are looking at some of the same problems from different angles, and there are some interesting ways in which the community shifted over the same time period. I suspect some of Guido’s instincts were guided by the earlier community which was more early-adopters, and simply younger. The 1 to 2 migration was easier because the median developer was far more motivated to switch - you weren’t using it if you weren’t a fan.

Python’s rise brought tensions there in multiple ways - business users started wanting longer-term support and tools for migrating large code bases, but relatively few would pay for them, and some maintainers had either drifted away or retired, or simply had less time available due to things like starting families or getting promoted. The 2 to 3 migration really highlighted the gap between commercial open source and hobbyist projects, and I’m sure now anyone would fundraise for support having seen how that worked out.

The other thing we’ve been seeing is a switch to faster releases industry wide and I think that showed some serious tech debt. Projects which have CI & decent tests had a much easier time shipping updates or having two release series for a while, and the 2 to 3 migration really highlighted that. Some of the griping was really showing that older projects often had the frictional disincentive of that tool. That’s still relevant because of the way security issues are forcing more frequent updates now.

Codemods aren't a thing in Python?

They are a thing, the ast module was created to help with the transition and tooling to automate some of the conversion: https://docs.python.org/3/library/ast.html IIRC it wasn't there at the start though.

What is a "codemod"?

A codemod is an automated refactor. Codemods are typically done by writing scripts that manipulate the AST of code in files or by writing a regex to find and replace.

The name originates from an internal script at Facebook called codemod. It had a public open source fork if you are curious.
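
A toy AST-based codemod, as a sketch only: it renames calls from one function to another. `old_sum` and `new_sum` are placeholder names, and `ast.unparse` needs Python 3.9 or later.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite calls to one function name into calls to another."""

    def __init__(self, old: str, new: str) -> None:
        self.old, self.new = old, new

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # handle nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            node.func = ast.Name(id=self.new, ctx=ast.Load())
        return node


source = "total = old_sum(old_sum(1, 2), 3)\n"
tree = RenameCall("old_sum", "new_sum").visit(ast.parse(source))
rewritten = ast.unparse(ast.fix_missing_locations(tree))
print(rewritten)   # total = new_sum(new_sum(1, 2), 3)
```

One caveat: round-tripping through `ast` drops comments and original formatting, so codemods meant for real codebases usually work on a syntax tree that preserves them.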


Thanks for the explanation!

> Are there some valid reasons that made 2->3 migration insanely hard for some projects?

Print was a non-issue once you could do "from __future__ import print_function". It was the unicode migration.

All string handling from I/O required at least checking.

The dependency issue was the most crippling, though: because you can't load Python 2 code from Python 3, before your project can begin migrating you have to wait for all your dependencies to update as well. This may involve them changing the types that get passed over the API from 'str' to 'bytes' or vice versa, which is potentially a breaking change.

Sounds like a circular problem

The early Python 3 releases were abysmal, much slower than Python 2 even without considering the bugs. That alone was bad. But there were also real-world problems, like how some syscall wrappers would not take bytestrings where they should have, so you literally could not open files unless you had a valid unicode representation of their file name. That's fun when all you want to do is some data processing.
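
For context, Python 3 later got the `surrogateescape` error handler (PEP 383) for exactly this kind of file name. A sketch with a made-up name containing a byte that is not valid UTF-8:

```python
# A file name as the kernel sees it on POSIX: bytes that are not valid UTF-8.
raw_name = b"report-\xff.txt"

# Strict decoding fails, which is the pain described above.
try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape carries the bad byte through str and back, losslessly.
as_text = raw_name.decode("utf-8", errors="surrogateescape")
round_trip = as_text.encode("utf-8", errors="surrogateescape")

print(strict_ok, round_trip == raw_name)   # False True
```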

The core team was swamped with other work around the migration and the community found their response to real world problems lacking. They did things like declaring Python 2 dead long before 3 was usable. Then they refused to fix SSL problems with Python 2, which everyone desperately needed, and went out of their way to oppose a community-led release to fix those problems.

It all ended well. There was another Python 2 release, and Python 3.4 came and fixed most problems people had. It was still slower, but performance picked up and by 3.7 it was just as good if not better. But it was a good ten years during which the response to real world issues could have been a lot better.

All in all, the migration could have happened much more smoothly. They could have learned from other languages. But ten years is not unreasonable for backwards incompatible changes for a major programming language. The idea that the process could be accelerated by neglecting the old version does not work.

In my personal opinion, the single thing that could have made the transition better for the community would have been the possibility to run Python 2 code in Python 3. It was discussed a lot before the transition and was deemed impossible because of conflicting data types. But it's not impossible, it's just (a lot of) work. It's another type conversion, and Python can already do those. The impossibility to link to old libraries made the conversion a flag day for most code bases, and those are really hard.

Migrating where there’s a 1-1 mapping of old API to new API is easy, such as the print to print() change. The hard part is when there’s a 1-N mapping, where a single name in the old API becomes multiple names in the new API. These are the most useful changes to make, because they can split apart unrelated concepts that had been erroneously represented by the same object, but require a human to determine which use case applies.

For python2, the str class could either contain a sequence of characters, or could contain a sequence of bytes. These are often conflated when using ASCII encoded text, but are very different for any other encoding. In python3, these are represented by the str class for a sequence of characters, and by the bytes class for a sequence of bytes. It’s so much nicer to work with, but required manually separating out the earlier use case.
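
A tiny Python 3 illustration of the difference, using an arbitrary non-ASCII word:

```python
text = "na\u00efve"               # str: a sequence of characters
data = text.encode("utf-8")       # bytes: a sequence of bytes

print(len(text), len(data))       # 5 6
print(text[2], data[2])           # ï 195
```

With pure ASCII the two lengths and the indexing line up, which is how the conflation went unnoticed for so long.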

Python2 had the 'unicode' class for characters, and the 'str' class for bytes.

The whole 2 to 3 debacle was only because somebody thought that naming your string class 'unicode' doesn't sound nice enough or something.

It’s not that simple: as the person you’re responding to explained, this was leaky because in simple cases you could pretend those were equivalent but then things would break around interface points when someone introduced things like filenames with Unicode.

I had clean Python 2 codebases which had been Unicode-safe for years and had good test coverage, so porting to Python 3 was trivial and mostly automated. The projects which were hard were the sloppy ones which conflated bytes and Unicode, and the poor development culture which led to that also tended to mean they had limited testing, orphaned dependencies, monkey patching, etc. which made the migration hard. I also found a fair number of cases where that process fixed other bugs which had been ignored for years.

Also, the default (what a no-prefix string like "foo" meant) changed between the two versions.

Right, essentially the Python developers decided to define a sizable portion of their developer base as "bad people" due to their particular use of the string API and then spent years going to war with them.

The war continues to this day apparently with this Debian news. Somehow you can use Fortran 77 compilers without people getting angry but you can't use Python2.

The biggest problem is fundamental to Python and a lot of other scripting languages: no static typing. Running Python 2 code directly on Python 3? All looks good, until _somewhere_ you use a method of str that isn't on bytes, or vice versa, or something that was a list before is now an iterable and it blows up in your face... but _only_ if code execution reaches that line. So you have to test _everything_. And if you're a library/module, that means testing with _every user_.
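
A contrived sketch of both failure modes on Python 3; the function names are made up. Nothing is flagged when the functions are defined, only when the affected line executes:

```python
def first_upper(names):
    upper = map(str.upper, names)   # a list on Python 2, a lazy iterator on Python 3
    return upper[0]                 # only fails when this line actually runs


def describe(payload, verbose=False):
    if verbose:
        # bytes has no .format(); nothing complains until verbose=True is used
        return payload.format("x")
    return len(payload)


print(describe(b"{}"))              # fine: the broken branch is never reached

failures = []
for call in (lambda: first_upper(["a", "b"]), lambda: describe(b"{}", verbose=True)):
    try:
        call()
    except (TypeError, AttributeError) as exc:
        failures.append(type(exc).__name__)
print(failures)                     # ['TypeError', 'AttributeError']
```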

Secondly, it seems to me the Python team didn't want to make it possible to write code that works correctly on both Python 2 and 3 simultaneously. They wanted you to write code that only works on Python 3 and tell every single Python 2 user to get lost. Essentially, they wanted to co-opt you as their agent of change; burn your own reputation by abandoning your userbase (or worse, start proselytising Py3 at them) or double your own workload to maintain two codebases. Who wants that?

It's fundamentally an attitude problem. A desire for there to be only one right way to do things, a change of mind about what the right way is, and kicking the chair out from under anyone doing it the old right way; the only compromise being to kick the chair very very slowly. In Java, you can still run code that uses Vector and Hashtable from 1996; they're not even deprecated. They've been used by _very_ little code since generic List and Map APIs superseded them in 2004, but they're still there.
