I really wish these big bulky enterprise and commercial solutions wouldn't prop up this technical debt. Things like ArcGIS and other $10k+ software packages have built-in Python2 interpreters. If all of them decide to keep "supporting" py2, a ton of security updates are going to fall through the cracks or become non-uniform across tools.
They've literally had years and years to upgrade their tooling. But considering some of this enterprise crap still has support for VBA (Visual Basic for Applications) we'll probably see Python2 in the wild well into 2030. Makes you wonder how much COBOL is still out there.
As an alternative viewpoint, it's bizarre that we can't support these things. Mathematics textbooks from 400 years ago are relatively intelligible, and those from 100 years ago are easily readable. But with computer science, a domain which consists entirely of problems we ourselves invented, we've wedged ourselves into a situation where we are apparently so bad at computer language design (let alone systems engineering/systems architecture) that something surviving for a decade is frankly notable.
Programming languages are more like spoken languages than mathematical languages. They evolve fast. Programming languages also are versioned, something that's nonexistent and blurry in spoken languages.
There's no support for Latin in almost any software today, and books in Latin can be read by an extremely small minority. Even an English speaker from the 1700s would have trouble with today's English.
Apples to oranges. I don't think you'd be as surprised if the statement were reworded to "the human brain has more flexibility in parsing to understand intent from text than a computer". The issue is very unlikely to be poor language design as, as you allude, people can parse old code just fine. The thing we're actually bad at is building computers that reason intent from content like we do.
Money talks, and RedHat is a business like any other. If customers are willing to pay more than it costs to continue supporting it, why wouldn't they do that?
As to why companies can't move forwards: writing tests sucks. Whether it's the scaffolding or perfect being the enemy of the good, writing tests is harder than it ought to be if nobody has put in the effort to set up the codebase to accept them. Tests are often skipped, especially in older codebases that predate today's culture of testing, and even now they are often wrongly skipped in favor of expediency. Thus refactoring or rewriting old codebases is fraught with fear that it will cause regressions.
For a large deployment of our SaaS software last year, in order to connect to the raw data for their POS, we discovered they were on mainframes running UNIX. Not Linux, Unix. We had to go to a local library, in the old archived basement to find books that guided us on what the hell we were doing.
50 years from now, with completely different new languages in vogue, some poor engineer is going to have to find a book for Python2.
There are plenty of Unix systems out there, such as AIX (and arguably FreeBSD, though it's not certified), that are derived from and could be called UNIX. These systems are still maintained today and generally have the philosophy of "if it ain't broke, don't fix it". And there's definitely some sense in that, and I wish more software projects followed that philosophy.
If you take C code from K&R C, it'll likely still compile with current C compilers. Likewise, I'm sure most C projects from the 80s still compile with current compilers, perhaps with minor changes.
FreeBSD often chooses to improve existing tools instead of replacing them, so it would look more familiar to an old Unix expert than Linux, and there's some solid logic to that.
I think it's a little sad that we have these big disruptive changes like Python 2 -> 3, when Python 2 still works fine and can still be improved without breaking backwards compatibility. Honestly, I think it makes sense to have disruptive changes like that be treated as a new language instead of completely dropping support. I like how the Rust project is approaching it by building compatibility for older versions of the language into current compilers, so you can get the best of backwards compat and modern changes.
I'm sure by then there'll be brainwave controlled programming and the idea of using a keyboard with defined syntaxes will be as silly as punching cards sound to us today haha
I get that enterprise-y stuff is on a more LTS slow schedule, but:
Python 3 was released a DECADE ago - literally. If you've got mission critical stuff running on Python 2 and are panicking now then it's time to re-evaluate your life choices.
In fairness to those developers (of which I was one): Python 3 was released a decade ago, and support for Python 3 in commonly used libraries was released, depending on the library, some time between a decade ago and "not yet".
This has required a lot of engineering work to make libraries function with Python 2 and Python 3 (e.g. pandas, numpy, scipy), to make frameworks work properly on both (e.g. django), and to ensure that the libraries that everything relies on (e.g. mysqldb) get updated, forked, or replaced with rewritten alternatives.
Sure, I as a developer "should have" been writing Python 3 for the last ten years, but for most of those years it would have required rewriting large libraries that I, and other developers, rely on to be productive.
Unfortunately this has meant that instead of starting Python 3 projects over the last ten years, people have been starting Python 2 projects so that they could get their work done in those intervening years.
I don’t do a lot of Python, and have been exclusively using v3, but I thought you could make your Python 2 somewhat compatible with v3 by using libraries, etc.
How close can you come to a Python code base that runs in v2 and v3?
> How close can you come to a Python code base that runs in v2 and v3?
Using the `six` module, this is (nearly) trivial for 99% of code. If you're doing certain weird things, it's more difficult but possible. (I say this having done py2-3 conversion or dual support for lots of vanilla code, as well as a significant amount of weird code: c/c++ extensions and code that uses advanced features like code object introspection and dynamic imports/rewriting/etc).
On the plus side it does serve as a powerful reminder of the costs and technical debt incurred by dependencies and why they should be minimized as much as practical.
We've got some projects based on abandoned libraries that will never be upgraded to python 3, essentially leaving us paralyzed on python 2. These projects were also meant to be deprecated a decade ago so no path forward was ever created, but they cling on for dear life.
> Unfortunately this has meant that instead of starting Python 3 projects over the last ten years, people have been starting Python 2 projects so that they could get their work done in those intervening years.
Joke's on them: some of the extensions you listed didn't even wait until 2020 and have already dropped Python 2 support.
This just means there was a network effect amplifying the general reluctance; there may have been a greater impetus on the library developers than the end-users, but the community as a whole still failed to prepare for the future.
Not that I'm against python 3 but please, this dismissive arrogant attitude says more about your experience with python than about the people who took their time. Python until at least 3.4 was changing way too fast to be considered a stable platform to port to. The actual window for migration has been more like 3-5 years. Yes this has been sufficient but not so long that people need to question their life choices for being cautious.
>The actual window for migration has been more like 3-5 years.
Fair - the 2-->3 transition was indeed painful. So a 3-5 year actual migration window, with advance notice of "Python3 is coming" for another 5-7 years.
Whichever way you look at it - that's a decade of advance notice. In an industry that moves exceptionally fast.
If you want to judge me as "dismissive arrogant attitude" for that - so be it.
>people need to question their life choices for being cautious.
The life choices phrasing was a poor choice on my part & I apologize for that. Message is unchanged though. To me waiting till support window runs out and then thinking of next step is the exact opposite of "being cautious".
You're operating from the assumption that people own their own code base, and are actually in a position to deal with that migration. It's a nice bubble you live in, but it is a bubble.
There is a lot of industry specific software out there that runs on all sorts of odds and ends of versions of stuff. Some critical software for a particular industry I'm aware of, still relies on CentOS 4, for example. It can't cope, and isn't certified to run on anything newer.
The end users often find their hands tied, because there's only one vendor producing software that fits their needs, and those vendors rarely give much of a damn about their end users running out of date software that doesn't get any security patches.
>and then thinking of next step is the exact opposite of "being cautious".
I don’t think you fully grok the risk of code changes in some of these large legacy systems. EOL with no security patches on a closed system that processes trusted inputs is nearly 0-risk.
Oh I get the risk. What I don't get is someone sitting on this for a decade and at EOL starting to wonder how this is gonna work.
>nearly 0-risk
That's cool too. I mean there are *nix servers with years of uptime that haven't seen a proper update since ever. All good (sorta).
I coded in Python 2 last week - nothing against it. It's the bizarre "this decade long transition just snuck up on us" tone of the question that gets me.
Something to consider: there hasn't been a great way to get Python 3 on RHEL / CentOS until this year. The IUS repo is an option, but it makes distributing a Python 3 package difficult: you have to get EPEL installed, then IUS, then you can finally get your own package in.
I'm happy to be able to drop Python 2 support in my projects, but I've had to hold off on some of them that run on RHEL / CentOS.
I use ansible heavily at work, which doesn't really support python 3 very well yet. If it were an app I wrote from scratch the choice would be easy, but not all the code I work with and rely on (like ansible) was written recently or is easy to migrate.
Not sure what you mean by "doesn't support python 3 very well yet"- Ansible has had support and testing for Python 3 since Ansible 2.2, and has been officially support-with-a-capital-S Supported on Python 3 since Ansible 2.5.
Using Ansible against Python3-only targets used to require manually setting ansible_python_interpreter in the inventory, but since 2.8 Ansible also supports automatic interpreter discovery.
You're correct now, but it wasn't officially supported until quite recently. I remember reading last year on the official website that the server executing the playbook should be using python 2
And some features such as redis cache were broken on python 3
Totally. If you've got a legacy Python2 codebase, I'd encourage taking a long hard look at porting to a non-Python family language as an alternative. It's not going to be right for everyone, but it's worth keeping in mind.
well, in theory you're right, but it isn't like the transition from python 2 to 3 requires a complete rewrite. it was fairly easy to write (mostly) compatible code, via __future__, u"", and maybe six. it's a bit unrealistic to think that if someone hasn't put the effort in to do that, they'll expend the effort on porting it to a different language, unless it also makes sense for other reasons
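For anyone who hasn't seen what that looks like in practice, here's a rough sketch of code that should run unchanged on 2.7 and 3.x. The `text_type` / `binary_type` aliases and the `to_text` and `greeting` helpers are names I made up for illustration; they only mimic the kind of shim that `six` provides, and this isn't meant as the one true way to do it.

```python
import sys

PY2 = sys.version_info[0] == 2

# Hypothetical aliases, similar in spirit to six.text_type / six.binary_type.
if PY2:
    text_type = unicode  # noqa: F821  (this name only exists on Python 2)
    binary_type = str
else:
    text_type = str
    binary_type = bytes


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings if needed."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return text_type(value)


def greeting(name):
    # The u"" prefix is valid on 2.x and on 3.3+, and always yields text.
    return u"hello, " + to_text(name)


# A single-argument print(...) behaves the same on both versions. Multi-argument
# calls need `from __future__ import print_function` at the top of the module.
print(greeting(b"world"))
```

The part that takes actual thought is deciding, at each boundary, whether a value is text or bytes; the rest is mostly mechanical.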
Depends how good your test coverage is, I guess. The legacy Py2 codebase I have the most familiarity with is probably more than 50% C extensions with zero tests, and migrating those to Python 3 correctly is non-trivial. If you've now got to write a bunch of tests, you might as well test a reimplementation in a language with a history of more API stability.
i hear you, but at that point, it isn't really a Python codebase. i think that qualifies as a good reason to migrate away from Python, since if you've ended up with that many C extensions, Python probably wasn't the best fit to begin with. bitching about unicode, the print function, etc, i'm less sympathetic.
the only thing that's weird to me is if you have C extensions, it's super easy writing tests for those in Python. getting coverage is possible via gcov/lcov. retrofitting tests is hard and painful in any language, period. but so is a rewrite. pick your poison.
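To make that concrete, a minimal sketch of pinning down a C-backed module's behaviour from Python before porting it. I'm using the standard library's `zlib` purely as a stand-in for your own extension; the test class and `run_suite` helper are hypothetical names, and real wrappers with system-level side effects will need more setup than this.

```python
import unittest
import zlib  # stand-in: a C-backed module from the standard library


class ZlibCharacterizationTest(unittest.TestCase):
    """Pin down observable behaviour of a C-backed module before touching it."""

    def test_round_trip(self):
        payload = b"hello, world" * 100
        self.assertEqual(zlib.decompress(zlib.compress(payload)), payload)

    def test_compression_shrinks_repetitive_input(self):
        payload = b"a" * 10000
        self.assertLess(len(zlib.compress(payload)), len(payload))

    def test_rejects_garbage(self):
        with self.assertRaises(zlib.error):
            zlib.decompress(b"not a zlib stream")


def run_suite():
    """Run the tests programmatically and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(ZlibCharacterizationTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Tests like these only assert what the module does today, so the same file can be run against the Python 2 build and the Python 3 port to spot differences.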
> i hear you, but at that point, it isn't really a Python codebase.
I don't think that's really true. Yes, our overall product is a mixed language codebase, with most of the core functionality in C. But quite a lot of the business logic lives in Python. The C modules are all just manually implemented wrappers of C libraries (as opposed to, say, rewrites to speed up slower Python hot paths).
By sheer quantity, it's something like 110k lines of manual C wrappers and 2.0 million lines of Python2. So from a numbers perspective, it seems like the Python part is very much a Python codebase. (Though the whole product is about 19 million lines of C, 3 million lines of C++, and negligible percentages of other languages beyond Python2.)
> if you've ended up with that many C extensions, Python probably wasn't the best fit to begin with.
Not sure I'd agree with that, either :-). Although I guess it depends on how you define "begin."
The initial manual wrappers were written in the early 2000s. At the time, Python (like, version 2.3?) made a lot of sense for CLI / business logic / initial web interface backend, interacting with the core libraries via the wrappers. It's not like any of the contemporaneous languages the original authors (now long gone) could have picked would have fit better or held up better over time. C++98 isn't the C++1y we have today and simply isn't a replacement for Python; other scripting languages were demonstrably worse (Perl, Tcl, ...). I don't know why those folks chose to manually implement wrappers (maybe it predated SWIG), but it's what we're working with now.
> the only thing that's weird to me is if you have C extensions, it's super easy writing tests for those in Python.
Same as anything else, really. If your code is pure algebraic functions, unit testing is trivial; if it has lots of system-level side effects or isn't decomposed into testable units, it's difficult to retrofit.
> retrofitting tests is hard and painful in any language, period. but so is a rewrite. pick your poison.
PoC||GTFO regularly publishes its missives in polyglot files that are valid in multiple languages/interpretations; that doesn't mean those things are the same. E.g., a zip isn't a PDF, except sometimes with PoC||GTFO, it is.
Python 2+3 is less extreme, but still: those nontrivial programs that run in both environments are just polyglot programs that happen to run in two different scripting languages.
Given the experience with Python 2->3, is there any reason to trust the PSF not to make a similar mess with a 3->4 transition? That's my concern porting any Python2 code to 3. Are Python.org done making huge API breaks, or will they do it again?
To quote from chapter 1 of K&R C: "The only way to learn a new programming language is by writing programs in it. The first program to write is the same for all languages:"
Print the words
hello, world
The program to do that is literally the very first program shown in the very first chapter of K&R C.
99% of Python programs before Python3 was announced used the print statement.
Python3 broke 99% of existing programs.
And your rebuttal is there are---now---nontrivial programs that run in both languages? How many of those nontrivial programs existed before python3 was announced?
Zero. Zero programs like that existed.
This historical revisionism astonishes me. Just because it is now, after the fact possible to write programs that behave the same in both Python2 and Python3 doesn't mean that they are the same language.
Works for me. But maybe OpenBSD 6.2 isn't "current" enough for you?
$ cat hello.c
main()
{
	printf("hello, world\n");
}
$ cc hello.c
hello.c:1:1: warning: type specifier missing, defaults to 'int' [-Wimplicit-int]
main()
^
hello.c:3:2: warning: implicitly declaring library function 'printf' with type
'int (const char *, ...)' [-Wimplicit-function-declaration]
printf("hello, world\n");
^
hello.c:3:2: note: include the header <stdio.h> or explicitly provide a declaration for 'printf'
2 warnings generated.
$ ./a.out
hello, world
$ uname -a
OpenBSD foo.example.com 6.2 GENERIC.MP#0 amd64
I think this illustrates my point. OpenBSD did what was necessary to keep the vintage 1978 hello.c program working on a recent system. If only Guido and his colleagues did the same.
Not criticizing OpenBSD, but this has nothing to do with OpenBSD in particular (which uses Clang now).
Grandparent 'takeda is just flat wrong. hello.c compiles and runs with extremely modern Clang and GCC versions (8.0.1 RC and 9.2.0, respectively). The former is a development RC and the GCC was released August 2019.
And we can expect this going forward! The C standard is extremely stable and loath to deprecate anything. gets(3) is finally gone in C11 and it has been deprecated since the 1999 standard. That's the only ABI removal in C history.
There is absolutely no reason to expect that tiny program which doesn't invoke gets(3) to ever stop being compilable by standards-conforming C compilers.
While you should move on to python3 at this point, I still maintain that the whole thing was a giant waste of time. 'Better' unicode support was not worth the mess. Everything else could have been done in python2.
That's only if you are dealing with a US English userbase who never copy/paste from Microsoft Word. Anything else, and the better unicode support is a godsend. Better to have them pop up as type errors during development than as uncaught exceptions whenever a user happens to first put unicode somewhere you don't expect.
I strongly disagree: decoding and recoding strings to-from UTF8 was a monumental waste of time IMO. Never has one language given so much trouble for simple tasks like reading a database. Python3 just works seamlessly.
Implementation details of u"" could be changed without rototilling the language. Breaking the print statement does not improve unicode support, for example.
All the people that are crying about print() never did any migration. That's the easiest part, it can be fully automated. In fact you will almost never use print in any serious program.
Unicode is the most difficult part of the migration, because most python 2 code is actually broken. It can't be automated because you need to know, in each case, whether you want text or bytes.
Sorry but I don’t agree this is a bug. You need to decode your bytes to properly write a string. Otherwise you get the unexpected result you are showing because it has no way of knowing how your bytes are encoded.
But bytes are not strings. Unless you convert them to a string first, they could be an image, a sound, a certificate, etc.
Since ASCII strings are always one byte per character, they can be manipulated as bytes, but UTF8 strings aren't. In Python3 you operate with strings and always get strings, you operate with bytes and always get bytes.
The string = byte array pattern is an atavism that should not exist this century.
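A small Python 3 illustration of the distinction, for what it's worth. The sample string and the `mixing_is_rejected` helper are just made up for this example.

```python
# Python 3: text (str) and binary data (bytes) are distinct types.
text = "na\u00efve caf\u00e9"   # "naive cafe" with two accented characters
data = text.encode("utf-8")

# The lengths differ once non-ASCII characters are involved:
# each of the two accented characters takes two bytes in UTF-8.
print(len(text), len(data))


def mixing_is_rejected():
    """Return True if concatenating str and bytes raises TypeError."""
    try:
        text + data
    except TypeError:
        return True
    return False


# Decoding with the encoding that was used to encode round-trips cleanly.
assert data.decode("utf-8") == text
print(mixing_is_rejected())
```

In Python 2 the equivalent mix would silently attempt an ASCII decode and only blow up when non-ASCII data showed up at runtime.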
> That wasn't done, but for reasons that amount to "we don't want to"...
Well, and giant glaring differences in internals. Python3 and python2 have incompatible bytecodes. Swapping compilers on a per-file basis in a language as dynamic as python isn't possible. Python allows dynamic re-writing of modules and source code. How do you decide which version a module is run with if the module is modified at runtime?
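One visible symptom of the bytecode incompatibility, sketched for CPython 3.4+: every compiled `.pyc` file starts with a version-specific magic number, and an interpreter refuses bytecode whose magic number doesn't match its own. The `can_load_pyc` helper is a hypothetical name; the real import machinery checks more than this.

```python
import importlib.util
import sys

# First four bytes of every .pyc file written by this interpreter.
magic = importlib.util.MAGIC_NUMBER


def can_load_pyc(header):
    """Return True if a .pyc header starts with this interpreter's magic number."""
    return header[:4] == magic


# CPython 2.7 wrote this magic number at the start of its .pyc files.
PY27_MAGIC = b"\x03\xf3\r\n"

print(sys.version_info[:2], magic)
print(can_load_pyc(PY27_MAGIC + b"\x00" * 4))
```

This only shows that compiled files aren't interchangeable; it says nothing about whether a per-module source-level switch could have been designed.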
Given what people pay RH for, this makes sense. Given why people like me no longer choose to run RH, this makes perfect sense, as a reason why we don't run RH: The money we pay RH to maintain support for old code can be better invested in converting old code to run on new platforms.
There is a class of client who needs what RH is selling and I am glad they sell it, and that people like my bank can invest in that, because longterm stability of fintech cores for a banking product is very important: People ran applications developed for LEO, in emulation under GEORGE and then subsequently re-emulated on later operating systems because ICL committed to that to meet customer needs. People emulated PDP-11 RSTS inside Vax/VMS for the same reason. People emulate OS360 inside RS6000 based machines.
TL;DR RH supporting Python2 makes perfect sense. If you don't use RH, then this doesn't have to matter to you.
The money you pay supporting security and new/broken features of fast moving upstream targets could be mostly saved with a small portion paid to RH freeing you up to improve your product instead of supporting ever-changing/insecure upstreams.
I have experienced far too many emergencies and worried about relying on far too much insecure old code in environments without RH support.
With RH I could always trust that packages wouldn't disappear overnight, if I built something against RH it would actually still be there in 6m or 6y and that I could really trust that if a security hole was discovered I would be notified and it would be fixed in a way that wouldn't break my stack.
All of that is super valuable compared to moving fast and breaking things myself while dealing with upstreams that move fast and break things too.
In many many circumstances, it is really expensive to stay on the bleeding edge of everything and does not add business value, other than pleasing and attracting magpie programmers.
I worked for a company which paid for RH licenses (they don't do it anymore).
Being conservative is nice, if it comes with stability, but during RHEL6, 6.4 was the first version that was actually stable.
As for breaking your stack, this actually is not that hard. All you have to do is not tie your application to the system.
Since we're talking about Python, if you use RedHat or CentOS, I highly recommend the IUS repo[1]. There you can pick which version of Python you want to use (older if you want to ensure stability, newer if you need cutting edge), and you can also install multiple versions side by side. Your application is no longer tied to the OS, which means switching the underlying OS or upgrading to the next major version (for example to RHEL8) is no longer something that should affect your application.
>As for breaking your stack, this actually is not that hard. All you have to do is not tie your application to the system.
My point is the opposite, I want to tie my application to a reliable system so I don't have to deal with something else (pip, gem, npm, mvn, etc.) because those things break, I can't rely on them being available, I can't rely on packages always being there, I can't rely on the security of what is out there.
I can rely on the system python and associated packages, I can trust that security issues will be patched, and I can trust that compatibility will be maintained. Doing that I would be stuck with an aged and aging upstream perhaps blocking adoption of new technologies, but it buys me stability and peace of mind. There are tradeoffs, for sure.
I've had some good luck migrating a couple of things to `pypy`, specifically "Squeaky's portable Linux binaries." [0]
"How long will PyPy support Python2? -- Since RPython is built on top of Python2 and that is extremely unlikely to change, the Python2 version of PyPy will be around “forever”, i.e. as long as PyPy itself is around." [1]
I mean, it's not like CPython 2 is going to disappear on January 1, either. You can keep using it.
What you lose is a) potential security fixes (you might not care, for your use case) and b) the ability to use other libraries that are Python 3 only, including security updates to libraries whose Python 2 versions are no longer maintained. Using PyPy (or any other Python 2 implementation) definitely doesn't get you b and arguably isn't much improvement on a.
Tellingly, they go out of their way to say security fixes are always at their discretion and come from upstream, right after explaining that having no feature updates is fine because upstream will be effectively dead. This ain't Python's problem, Red Hat; this has been slated for years.