As I mentioned the last time this came up, I recently converted a medium-sized (about 30,000 lines) system from Python 2 to Python 3. The problem is not the language changes. Those are minor, and most of the new forms worked in Python 2.7.9. It's the state of the external modules. Important old Python 2 modules have been discontinued, and their replacements are not used enough or tested well enough to use reliably. I kept finding bugs that should have been found in the first months of heavy use of a module.[1] It took an extra month of work to find and work around all the external module bugs. These aren't obscure modules; they're ones most large server-side applications would need.
Porting was at least possible. Two years ago, I took a look at doing this, and there weren't enough modules ready to even attempt it. Now there are. There is progress. Slow progress.
The download statistics tell the story: "We can see that in the past year Python 3.x has grown from roughly 2% of the total downloads from PyPI to roughly 5-6%."[2] That's why the external modules don't work reliably on Python 3. 95% of the package use is on Python 2. Python 2 use is probably higher than that, since those are download statistics; PyPI doesn't see packages downloaded long ago and still running.
There's extensive denial about this among Python fans. That's a big part of the problem. Another problem is that PyPI is just a link farm - unlike CPAN, it doesn't centralize code hosting, bug tracking, support forums, and quality control. This allows Python fans to be unaware of how buggy the packages are.

[1] https://news.ycombinator.com/item?id=9378898, search for "quality control"

[2] https://caremad.io/2015/04/a-year-of-pypi-downloads/
Number one reason why I haven't switched yet?... print. Admittedly it is the dumbest possible reason; it's irrational; I would never admit it in a face-to-face conversation; I would deflect to "something something third-party library...", but deep down inside, if I'm really honest with myself, it's print. I want asyncio, I want better Unicode support, but all I think about are SyntaxErrors in what I __strongly__ consider to be valid Python.
My stubbornness is rooted in the simplicity of print. It is so simple and stupid; why change it? No production app has ever relied heavily on print; it's just a dumb way to check shit while developing. I mean, sweet jesus, if it ain't broke, right? I know I'm wrong in thinking this kind of thing happened everywhere in the 2->3 increment, but as I said earlier, I'm under no illusion of rationality. Python 3 broke hello world, and that just skeeves me out.
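For concreteness, the entire break amounts to this (a minimal illustration; the file below runs under both 2.x and 3.x):

```python
# Python 2 accepts the statement form; Python 3 rejects it at compile time:
#     print "hello, world"        # SyntaxError in Python 3

# The function form works in both, and the __future__ import makes
# Python 2 treat print as a real function (keyword arguments and all):
from __future__ import print_function
print("hello, world")
```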
Eventually I'll write new stuff in 3, probably sooner rather than later, especially with asyncio et al., but for now I'll stick with 2.
I'm also embarrassed about it, but I too strongly dislike the change to print. Combined with a lot of little things like the removal of __cmp__ (apparently without even a relevant mixin in the standard library!), the removal of the cmp argument to sort, the removal of encode('hex') and encode('base64'), and doing Unicode wrong[1], Python 3 really feels like a pointless quality-of-life decline. For this reason I still stick with Python 2 for my general scripting needs.
But cmp is so backwards and non-intuitive. key is a vastly better API, and more performant too, because you can generate keys once and then do comparisons using native types.
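A minimal sketch of the difference (functools.cmp_to_key is the stdlib bridge for old-style comparison functions):

```python
from functools import cmp_to_key

people = [('carol', 41), ('alice', 34), ('bob', 27)]

# key: the key is computed once per element, then compared natively.
by_age = sorted(people, key=lambda p: p[1])

# Old cmp-style logic still works via cmp_to_key, but the comparison
# function runs on every pairwise comparison instead of once per item.
by_age_cmp = sorted(people, key=cmp_to_key(lambda a, b: a[1] - b[1]))

assert by_age == by_age_cmp
```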
Again, with the encoding the API is different, but it's still really simple.
Key... well, I should elaborate on that. I don't usually use cmp in my own code, and while I can think of situations where I might want to, using a wrapper class implementing comparison methods would be easy enough (modulo the missing mixin issue!). It's semantically a lot more complex to create a new object for comparison than to specify a comparison function when the English description is "sort by X" (of course, when X is just a property, key=lambda a: a.x is enough), but since this is a rare situation, I wouldn't mind. However, I ran into that issue when trying to teach my brother programming, and for learners, semantic overhead is really hard to deal with. Now, Python has no responsibility to optimize for learners over actual programmers (indeed, I'd argue it's always been worse as a first language than people think), but this was a case of removing a feature that had minimal additional API surface or implementation complexity; it really felt like removing things for the sake of removing things.
For encoding it's mainly annoying because now I have to manually add an import, which is more typing than before. To be fair, there is a cogent argument that my issues with both print and this mean what I really want is a different language altogether: one that, respectively, allows omitting parentheses for all function calls, and has some kind of implicit import. But other languages have their own problems, and this still feels like a paper cut.
Oh, and the annoyance is exacerbated by the interfaces of binascii.hexlify and base64.b64encode being broken. They both return bytes objects, which is dumb since the whole point of hex and base64 encoding is to represent binary data as text, and the most common thing to do with encoded strings is to insert them in the middle of other text. 'foo %s' % hexlify(b'bar') => "foo b'626172'"; to get the "foo 626172" I want, I have to do even more typing and append .decode('ascii').
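Concretely, on Python 3:

```python
import binascii

encoded = binascii.hexlify(b'bar')          # b'626172' -- bytes, not str
print('foo %s' % encoded)                   # foo b'626172' (the repr leaks in)
print('foo %s' % encoded.decode('ascii'))   # foo 626172
```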
For me there's a lot less mental overhead with the Key form.
Encoding-wise, hex and base64 are not something I do too often, so their new interfaces don't worry me too much. In the cases you've mentioned, you could create a simple wrapper function to deal with it for you.
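A minimal sketch of such wrappers (the helper names here are made up):

```python
import base64
import binascii

def hexstr(data):
    # Hypothetical helper: hex-encode bytes and hand back text.
    return binascii.hexlify(data).decode('ascii')

def b64str(data):
    # Hypothetical helper: base64-encode bytes and hand back text.
    return base64.b64encode(data).decode('ascii')

print('foo %s' % hexstr(b'bar'))    # foo 626172
print('foo %s' % b64str(b'bar'))    # foo YmFy
```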
I guess there's some reason for using bytes instead of strings as the output of the interface, but, as you say, it's definitely non-obvious.
There are lots of things that are a little different in Python 3; I would say overwhelmingly for the better, on the whole. There just doesn't seem to be much point fighting against those minor changes when they're easy enough to adapt to.
I'm seeing an awful lot of green there. Most of the things that are red (which might influence someone to use Python 2 on a greenfield project) have alternatives that are green.
5000 packages converted, 55,000 to go. If you can't port until somebody else does, then projects that depend on you can't convert. Guido talks about this, and says more than "looks green to me".
Some packages are ported, but that does not mean people use Python 3. If you look at the download numbers, it's pretty clear that they do not indicate actual use: https://caremad.io/2015/04/a-year-of-pypi-downloads/
The situation for greenfield projects is much better than it has been in past years. I would feel just fine about, for example, starting a new Django web app in Python 3.
However, the vast, overwhelming majority of software in the world is legacy systems. Software lifecycles are long, and the cost of migrating from Python 2 to Python 3 is very high for those systems.
Bloomberg's Open API is not on there, and it's pretty much the only thing keeping me on 2.7. I don't know the ecosystem as well as others, but I imagine there will be more enthusiasm for 3.x in finance once that's done.
And I'm not sure about the fine-grained accuracy here. For example, NetworkX is listed green, but its graphics subsystem, according to their documentation, does not support Python 3.
Also, OAuth2 seems like a big one, and would block many other packages that might use it.
It is all self-reporting based on the classifiers given to PyPI (NetworkX lists itself as supporting Python 3, specifically Python 3.2 through Python 3.4). It seems like both it and Twisted would benefit from a "Python 3 - partial" classifier.
Hope it works for you! I found that it behaved exactly like Gevent for what I was doing (which I gave up on due to Dropbox rate limits).
I've been looking at porting all the flask-websocket stuff over by using guv as a base. Haven't done anything other than reading codebases but it looks like it should make it easy.
I remember, when Py3 first came out, everything was incompatible -- unnecessary incompatibilities like dropping the u"" notation for Unicode string literals. Unnecessary incompatibilities in the C-extension-module implementation layer. And so on. The list of incompatibilities was just huge.
Later several of them were reverted, like the string-literal trouble ... but by then the damage was already done. Many extension modules were never lifted to the new version, since the overhead was too big.
I think many more projects would have adopted Py3 if more extension modules supported it.
The huge library of extension modules was always the strength of Python. Now we have many projects still running on Py2, because Py3 ignored this strength.
I think that if it had cared more about compatibility from the beginning, much of the damage done would have been avoided.
It raises some good points. It makes me think of other projects I've been a part of where the working, existing system was starved for resources in favor of a new system which didn't exactly dominate in performance or capability.
I've been programming for almost 20 years but am new to Python. I'm currently working on an API for a product and testing it against 2.7 and 3.4. It's not the end of the world, and if it works in 3.4 it pretty much works in 2.7. Granted, it's not the world's most complex or deep piece of Python, but is there a real problem in making 2.x code run on 3.x? Is it really that big of a change?
It's easy to get your own code to run on 3.x. The challenge is making sure all your C library dependencies run on it. And Python is nothing without its vast library ecosystem.
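For the pure-Python side, the usual single-source idioms are indeed small; a minimal sketch (the `text_type` alias is just a common convention, not a stdlib name):

```python
# Typical single-source idioms for code that must run on 2.7 and 3.x.
from __future__ import absolute_import, division, print_function

import sys

if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa -- this name only exists on Python 2

def describe(value):
    # isinstance checks go through the alias instead of a hard-coded type.
    return 'text' if isinstance(value, text_type) else 'other'

print(describe(u'hi'))  # text
```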
I ported APSW to Python 3 during the Python 3 beta (2008 IIRC) and thought I was being late! Porting the C code was easy - just a few #if sections selecting between Python 2 and 3. What took considerably more effort was porting my test suite. In addition to normal functionality, it does far more testing of boundary conditions and errors. I finally ended up with a codebase that compiles unaltered on Python 2.3 all the way through 3.5 (except 3.0), and the same Python source file does the testing against all those versions and the popular platforms with issues (i.e. Windows). To date I have never had a bug report where the sample code was for Python 3 - it is always Python 2.
I also provide binaries for all the Python versions in both 32 bit and 64 bit (where relevant) for Windows. When hosted on Google Code, their download stats showed the vast majority on Python 2.7 32 bit with a slow increase in 2.7 64 bit. Python 3.x downloads were roughly similar to Python 2.4 downloads! Sadly since moving to Github I have no idea what the download rates are.
BTW the only reason for not supporting Python 3.0 was that the test suite needed a module that wasn't provided (base64 or something similar), which I could have worked around, but the 3.0 release was end-of-lifed so I didn't bother.
For me, with the headaches of Unicode and international content, Python 3 is great and well worth the switch. However, I know most don't have that big a win, and there is really no business reason for them to switch.
If they'd solved the multithreading problem in Python 3 so that you could get true multithreading, that would have been a tremendously motivating force to get people to switch. But that didn't happen, and it's really undercut a lot of the pressure.
The pydata subcommunity, the fastest-growing and probably already the biggest one, doesn't care about asyncio. A good multithreading solution, on the other hand, would have been decisive.
Honestly, 3.5 is getting there. I've been pretty frustrated by Python 3, but starting with 3.5 it actually looks /possible/ to port Mercurial without tons of grotesque hacks.
I used this analogy to describe the state of Python 2/3 to a friend who knows nothing about programming. Like many Python programmers, he too would continue using the old road.
Wow, that's remarkably unconvincing. I'll think of this the next time I hear of some complex ethical dilemma solved by asking a small child.
I really wish we as a community just set a date and ported. As it is right now there's almost zero chance that the project I'm working on will get ported to Python 3. I'm not sure what it'll take, but I really wish we were past this stage.
This is the "problem" with not having things controlled by some central figure. Following the road example, the Dept of Transportation would simply open the new road, and close the old. You wouldn't have a chance to continue traveling down the old. This happens (to a large extent) with Java. Oracle opens the new road, then stops fixing potholes on the old one. You're not "forced" to move over, but eventually you stop wanting to fix your suspension.
That said, who is central enough to put an "end" date on Python and "force" people to move over?
> That said, who is central enough to put an "end" date on Python and "force" people to move over?
Guido van Rossum?
I'm not in the Python community - I barely know the language itself - but from the outside he seems to be a central figure? Doesn't he have enough authority to pull this off?
Imagine you have a company that does something not related to software. You have some important business tools that are implemented in Python. What is your incentive to spend money and take the risk of new bugs to port to Python 3? It doesn't make business sense.
That's why there is still lots of COBOL in the world, and why Python 2.x is going to live for many years, far past the predicted five-year end of life. It will be far cheaper to make new releases of Python 2.x than to port the millions of lines of existing code.
Python 3 is nice, take a look at it. It probably will be easier porting than you think. However, don't feel like you are forced into moving to it. Python 2.x will continue to be around.
>I really wish we as a community just set a date and ported.
The thing is there is no "community". There's a core team, smaller teams around some projects with their own ideas, and a huge number of individuals and companies using Python, each with their own needs, use cases and timelines...
Exactly. And those smaller communities are setting their own timelines. For example, Fedora just made Python 3 the default for their next version. Arch Linux did this already, and other groups will change over when they're ready.
I understand the desire, but there is no "we" in that sense. Those of us using python are a "community" in that we have shared interests, but not in the sense that we have shared risks and rewards. Each person/team has to deal with the constraints of their own specific situation. So, the bottom line is the tool either gains organic adoption or it doesn't. I really don't think having a king who can decree a switch is the answer. The answer would have been not forcing the choice in the beginning. Now the answer is probably going to be that more people look at alternative ecosystems that haven't suffered from this sort of schism.
The metaphor doesn't appear to make sense - at first, the new road is only "slightly wider and speed limit is fractionally higher" but then later on it appears that the road needs "enough new lanes" for all the cars on the old road.
I thought the premise of building the new road was that the old road's ground wasn't good enough, not that there was too much traffic and the road needed more lanes.
Anyway, the whole thing hinges on the author's assertion that the new road isn't that much better, so it's really up to the informed developer/user to decide whether or not to take the new road.
I tried to stay away from a which-is-better battle. I think the numbers speak for themselves as to which road everyday users of Python are using (scripts to scrape cat pictures don't count ;).
One of the nice things about Python 2 is that there will be no breaking changes introduced by an episode of CADT syndrome in the language or library design.
I'll take the Python 3 road when the big stable distributions (Ubuntu LTS, RHEL & CentOS) start providing onramps to it by default, and include the packages we depend on in their default repositories.
I just can't get myself interested in rolling my own packages for the dozens of libraries I depend on, especially since they already "just work" with Python 2.
If you mean "adoption of Python 3" in the sense of new Python programmers choosing 3 over 2, then yes, there has been progress towards learning Python 3. It's pushed in all of the online "beginning Python" guides I've looked at as well. From everything I've read about it, learning 3 is the way to go, since if you learn 3 you can always code in 2 if necessary. When the time comes that you'll need to know 3, you'll already know it.
If you mean adoption by businesses and entities using it in their infrastructure, then no, the author is correct: We're not seeing much progress there.
That's the same real world where I routinely type "mkvirtualenv …", install a bunch of things and don't notice that I've been using Python 3 until the first time I toss a print statement in for debugging?
Which virtualenvs, regrettably, depend on OS-level packages. So when you go to deploy your app to production, you find you installed some system dependency long ago and promptly forgot about it. The real world is messy.
I'm not arguing there aren't upgrade pains, but --no-site-packages has been on by default in virtualenv for over three years (https://virtualenv.pypa.io/en/latest/changes.html#id31). There are certainly issues when it comes to compatibility between the languages, but I'm not sure that's one of them any more.
I've actually done this for years in the real world. Yes, you can make mistakes. No, it's not a particularly hard problem to avoid and, as jonafato noted, --no-site-packages has been the default for a while, and I had it enabled by default for many years before then.
If this is a frequent challenge, it sounds like there's some technical debt to pay down elsewhere in your testing/deployment practice because it really doesn't need to be that hard.
I could be doing something wrong; I'm open to the possibility. And yet...
`psycopg2` is a pip package that won't install without system dependencies. Yes, I use Ansible and the system dependencies are managed. But, out of the box, on a clean system, `pip install psycopg2` doesn't just work.
The approach I use is to have the non-Python dependencies managed externally – i.e. use APT/YUM for the postgres, mysql, redis, memcache, etc. C libraries, since those are very stable on a normal distribution – so a simple `pip install` will work as a non-privileged user inside a virtualenv.
I realise that 3 has the nice async stuff, but I get a little miffed that certain improvements won't make it into a 2.7.x (or even a 2.8).
Performance improvements for one.
It seems like the same attitude as the GNOME developers... (disclaimer: I use both GNOME and Python, I must be a masochist).
Why hasn't someone written a good 2-to-2/3 converter which also updates old C libraries? Wouldn't that speed up conversion considerably? It shouldn't be too hard to do it for the C libraries as well, or am I missing something?
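For the pure-Python half, such a converter exists: the stock `2to3` tool ships with CPython and mechanically rewrites the syntax-level changes. Roughly (`mymodule.py` is just a placeholder name):

```python
# Python 2 input, as fed to `2to3 -w mymodule.py`:
#     print "total:", total
#     if d.has_key(k):
#         for i in xrange(10): ...
#
# 2to3's mechanical output (valid Python 3):
total, d, k = 3, {'a': 1}, 'a'
print("total:", total)        # print statement  -> print() function
if k in d:                    # d.has_key(k)     -> k in d
    for i in range(10):       # xrange           -> range
        pass
```

What it can't touch is the C side: the C API itself changed between 2 and 3 (module initialization, the str/bytes split at the C level), and those changes don't map mechanically.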
Python 3's asyncio / co-routines is the reason to write any new project in Python 3.
The API is tiny, and the result is beautiful.
If you haven't seriously tried using it to implement something that really benefits from this kind of computer-science innovation, then you have simply seen nothing of Python 3 yet. There are many types of logic that are vastly simplified by such a programming paradigm. Faster to code, and far simpler to debug. By orders of magnitude. One example you could try, if you cannot envision one, is to implement a high-performance network protocol. I, for instance, implemented the SSH protocol from scratch (from the RFCs) in a matter of weeks. I thought it would take months before I tried, especially since I had never programmed in Python before! As I said, the API is tiny. Learning it (Python and asyncio at once) was so easy and took such an insignificant amount of time compared to the vast quantities of time and headache saved that, well, I just had to write this post with all that extra time and energy!
Check it out! You will find that it is well worth addressing whatever fears or problems you have with making the switch.
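As a taste of how small the core API is, here is a minimal echo-server sketch in the 3.4-era generator style (host and port are arbitrary):

```python
import asyncio

@asyncio.coroutine
def handle_echo(reader, writer):
    # Echo lines back to the client until it disconnects.
    while True:
        data = yield from reader.readline()
        if not data:
            break
        writer.write(data)
        yield from writer.drain()
    writer.close()

loop = asyncio.get_event_loop()
server = loop.run_until_complete(
    asyncio.start_server(handle_echo, '127.0.0.1', 8888))
try:
    loop.run_forever()
finally:
    server.close()
    loop.close()
```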
I'm sure most people will start to use Python 3 for one of those large new features, but to me it's all the small QoL (quality-of-life) changes to the standard library, which I constantly see when searching for tricky problems and finding the answers on SO.
asyncio seems to me like it should be the biggest killer app for Python 3 - a universal replacement for the sprawling, confusing, and competing async options out there for Python 2.x (Twisted, Tornado, gevent...).
I'm hoping that aiohttp, or a similar library, further matures and we start seeing apps built on it. Developing asynchronous HTTP code in Python 2.x is significantly harder than throwing together a Node app, where things are "asynchronous by default" (unlike, say, Flask or Django). Having a universal, built-in event loop removes a huge point of confusion and learning for async development in Python, and can hopefully bring interest from developers who would normally look to Node.
1) Debugging is way better in asyncio: with ipdb you can step into a coroutine. In Twisted, you will end up somewhere in its internals.
2) asyncio is the end of unhandled errors in Deferreds :) (otherwise you need to remember to add addErrback everywhere, even after addCallbacks and addBoth) - see the sketch after this list.
3) With trial, testing is frustrating: sometimes an error is raised in the wrong test because of the global event loop (for instance, if you forgot to mock an external call). In the asyncio world the loop is passed explicitly, and this is good for testing.
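A minimal sketch of point 2 (`fetch_data` here is a made-up stand-in for real async I/O):

```python
import asyncio

@asyncio.coroutine
def fetch_data():
    # Hypothetical stand-in for real async I/O that fails.
    yield from asyncio.sleep(0.1)
    raise IOError('boom')

# Twisted style, for contrast: error handling is wired on by hand.
#     d = fetch_data()
#     d.addCallback(process)
#     d.addErrback(log_failure)   # forget this and the failure is swallowed

# asyncio style: plain try/except inside the coroutine.
@asyncio.coroutine
def main():
    try:
        data = yield from fetch_data()
        print('got', data)
    except IOError as exc:
        print('handled:', exc)

asyncio.get_event_loop().run_until_complete(main())
```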
Well here's a great blog post on how to use the @asyncio.coroutine decorator to do asynchronous/non-blocking I/O in this style, as opposed to JS-style callbacks. I think the control flow is a lot clearer with coroutines as opposed to callbacks. I've never used Twisted though so I'm not sure how asyncio compares to that library.
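A rough sketch of what "clearer control flow" means in practice - the code reads top to bottom, but each `yield from` suspends instead of blocking:

```python
import asyncio

@asyncio.coroutine
def fetch_head(host):
    # Open a connection and read the first line of an HTTP response;
    # control flow stays sequential, with no callback nesting.
    reader, writer = yield from asyncio.open_connection(host, 80)
    writer.write(b'HEAD / HTTP/1.0\r\nHost: ' + host.encode() + b'\r\n\r\n')
    status = yield from reader.readline()
    writer.close()
    return status

loop = asyncio.get_event_loop()
print(loop.run_until_complete(fetch_head('example.com')))
```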
Is there a thorough getting started guide for asyncio floating someplace on the web? I tried to pick up asyncio a few weeks ago. The official documentation assumes a lot of background that I don't possess, despite being a long-time Python user. Each method and class seems to be documented, but conceptual and end-to-end examples were much less common. Thanks in advance!
asyncio is just a nice API for async programming and event-loop management. You should start with something high-level in order to absorb asyncio's ideas, like scraping [1], or try web development with aiohttp [2]. And then, if you want to write some library or port a database driver, look at the examples in the aio-libs [3] GitHub organization.
I don't think so, especially since it still uses bytes for output to a webserver; even Go is way better for this kind of tiny library.
For bigger projects I would still go the Java route, especially since there is Karyon from Netflix.