As I mentioned the last time this came up, I recently converted a medium-sized (about 30,000 lines) system from Python 2 to Python 3. The problem is not the language changes. Those are minor, and most of the new forms worked in Python 2.7.9. It's the state of the external modules. Important old Python 2 modules have been discontinued, and their replacements are not used enough or tested well enough to use reliably. I kept finding bugs that should have been found in the first months of heavy use of a module.[1] It took an extra month of work to find and work around all the external module bugs. These aren't obscure modules; they're ones most large server-side applications would need.
Porting was at least possible. Two years ago, I took a look at doing this, and there weren't enough modules ready to even attempt it. Now there are. There is progress. Slow progress.
The download statistics tell the story: "We can see that in the past year Python 3.x has grown from roughly 2% of the total downloads from PyPI to roughly 5-6%."[2] That's why the external modules don't work reliably on Python 3. 95% of the package use is on Python 2. Python 2 use is probably higher than that, since those are download statistics; PyPI doesn't see packages downloaded long ago and still running.
There's extensive denial about this among Python fans. That's a big part of the problem. Another problem is that PyPI is just a link farm - unlike CPAN, it doesn't centralize code hosting, bug tracking, support forums, and quality control. This allows Python fans to be unaware of how buggy the packages are.

[1] https://news.ycombinator.com/item?id=9378898, search for "quality control"

[2] https://caremad.io/2015/04/a-year-of-pypi-downloads/
Number one reason why I haven't switched yet?... print. Admittedly it is the dumbest possible reason; it's irrational; I would never admit it in a face-to-face conversation; I would deflect to "something something third-party library...", but deep down inside, if I'm really honest with myself, it's print. I want asyncio, I want better Unicode support, but all I think about are SyntaxErrors in what I __strongly__ consider to be valid Python.
My stubbornness is rooted in the simplicity of print. It is so simple and stupid; why change it? No production app has ever relied heavily on print; it's just a dumb way to check shit while developing. I mean, sweet jesus, if it ain't broke, right? I know I'm wrong in thinking this kind of thing happened everywhere in the 2->3 increment, but as I said earlier, I'm under no illusion of rationality. Python 3 broke hello world, and that just skeeves me out.
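For concreteness, the entire break amounts to this (a minimal illustration; the file below runs under both 2.x and 3.x):

```python
# Python 2 accepts the statement form; Python 3 rejects it at compile time:
#     print "hello, world"        # SyntaxError in Python 3

# The function form works in both, and the __future__ import makes
# Python 2 treat print as a real function (keyword arguments and all):
from __future__ import print_function
print("hello, world")
```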
Eventually I'll write new stuff in 3, probably sooner rather than later, especially with asyncio et al., but for now I'll stick with 2.
I'm also embarrassed about it, but I too strongly dislike the change to print. Combined with a lot of little things like the removal of __cmp__ (apparently without even a relevant mixin in the standard library!), the removal of the cmp argument to sort, the removal of encode('hex') and encode('base64'), and doing Unicode wrong[1], Python 3 really feels like a pointless quality-of-life decline. For this reason I still stick with Python 2 for my general scripting needs.
But cmp is so backwards and non-intuitive. key is a vastly better API, and more performant too, because you can generate keys once and then do comparisons using native types.
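A minimal sketch of the difference (functools.cmp_to_key is the stdlib bridge for old-style comparison functions):

```python
from functools import cmp_to_key

people = [('carol', 41), ('alice', 34), ('bob', 27)]

# key: the key is computed once per element, then compared natively.
by_age = sorted(people, key=lambda p: p[1])

# Old cmp-style logic still works via cmp_to_key, but the comparison
# function runs on every pairwise comparison instead of once per item.
by_age_cmp = sorted(people, key=cmp_to_key(lambda a, b: a[1] - b[1]))

assert by_age == by_age_cmp
```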
Again, with the encoding the API is different, but it's still really simple.
Key... well, I should elaborate on that. I don't usually use cmp in my own code, and while I can think of situations where I might want to, using a wrapper class implementing comparison methods would be easy enough (modulo the missing mixin issue!). It's semantically a lot more complex to create a new object for comparison than to specify a comparison function when the English description is "sort by X" (of course, when X is just a property, key=lambda a: a.x is enough), but since this is a rare situation, I wouldn't mind. However, I ran into that issue when trying to teach my brother programming, and for learners, semantic overhead is really hard to deal with. Now, Python has no responsibility to optimize for learners over actual programmers (indeed, I'd argue it's always been worse as a first language than people think), but this was a case of removing a feature that had minimal additional API surface or implementation complexity; it really felt like removing things for the sake of removing things.
For encoding it's mainly annoying because now I have to manually add an import, which is more typing than before. To be fair, there is a cogent argument that my issues with both print and this mean what I really want is a different language altogether: one that, respectively, allows omitting parentheses for all function calls, and has some kind of implicit import. But other languages have their own problems, and this still feels like a paper cut.
Oh, and the annoyance is exacerbated by the interfaces of binascii.hexlify and base64.b64encode being broken. They both return bytes objects, which is dumb since the whole point of hex and base64 encoding is to represent binary data as text, and the most common thing to do with encoded strings is to insert them in the middle of other text. 'foo %s' % hexlify(b'bar') => "foo b'626172'"; to get the "foo 626172" I want, I have to do even more typing and append .decode('ascii').
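Concretely, on Python 3:

```python
import binascii

encoded = binascii.hexlify(b'bar')          # b'626172' -- bytes, not str
print('foo %s' % encoded)                   # foo b'626172' (the repr leaks in)
print('foo %s' % encoded.decode('ascii'))   # foo 626172
```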
For me there's a lot less mental overhead with the Key form.
Encoding-wise, hex and base64 are not something I do too often, so their new interfaces don't worry me too much. In the cases you've mentioned, you could create a simple wrapper function to deal with it for you.
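A minimal sketch of such wrappers (the helper names here are made up):

```python
import base64
import binascii

def hexstr(data):
    # Hypothetical helper: hex-encode bytes and hand back text.
    return binascii.hexlify(data).decode('ascii')

def b64str(data):
    # Hypothetical helper: base64-encode bytes and hand back text.
    return base64.b64encode(data).decode('ascii')

print('foo %s' % hexstr(b'bar'))    # foo 626172
print('foo %s' % b64str(b'bar'))    # foo YmFy
```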
I guess there's some reason for using bytes instead of strings as the output of the interface, but, as you say, it's definitely non-obvious.
There are lots of things that are a little different in Python 3; I would say overwhelmingly for the better, on the whole. There just doesn't seem to be much point fighting against those minor changes when they're easy enough to adapt to.
I'm seeing an awful lot of green there. Most of the things that are red (which might influence someone to use Python 2 on a greenfield project) have alternatives that are green.
5000 packages converted, 55,000 to go. If you can't port until somebody else does, then projects that depend on you can't convert. Guido talks about this, and says more than "looks green to me".
Some packages are ported, but that does not mean people use Python 3. If you look at the download numbers, it's pretty clear that they do not indicate actual use: https://caremad.io/2015/04/a-year-of-pypi-downloads/
The situation for greenfield projects is much better than it has been in past years. I would feel just fine about, for example, starting a new Django web app in Python 3.
However, the vast, overwhelming majority of software in the world is legacy systems. Software lifecycles are long, and the cost of migrating from Python 2 to Python 3 is very high for those systems.
Bloomberg's Open API is not on there, and it's pretty much the only thing keeping me on 2.7. I don't know the ecosystem as well as others, but I imagine there will be more enthusiasm for 3.x in finance once that's done.
And I'm not sure about the fine-grained accuracy here. For example, NetworkX is listed green, but its graphics subsystem, according to their documentation, does not support Python 3.
Also, OAuth2 seems like a big one, and would block many other packages that might use it.
It is all self-reporting based on the classifiers given to PyPI (NetworkX lists itself as supporting Python 3, specifically Python 3.2 through Python 3.4). It seems like both it and Twisted would benefit from a "Python 3 - partial" classifier.
Hope it works for you! I found that it behaved exactly like Gevent for what I was doing (which I gave up on due to Dropbox rate limits).
I've been looking at porting all the flask-websocket stuff over by using guv as a base. Haven't done anything other than reading codebases but it looks like it should make it easy.
I remember, when Py3 first came out, everything was incompatible -- unnecessary incompatibilities like dropping the u"" notation for Unicode string literals. Unnecessary incompatibilities in the C-extension-module implementation layer. And so on. The list of incompatibilities was just huge.
Later several of them were reverted, like the string-literal trouble ... but by then the damage was already done. Many extension modules were never lifted to the new version, since the overhead was too big.
I think many more projects would have adopted Py3 if more extension modules supported it.
The huge library of extension modules was always the strength of Python. Now we have many projects still running on Py2, because Py3 ignored this strength.
I think that if it had cared more about compatibility from the beginning, much of the damage done would have been avoided.
It raises some good points. It makes me think of other projects I've been a part of where the working, existing system was starved for resources in favor of a new system which didn't exactly dominate in performance or capability.
I've been programming for almost 20 years but am new to Python. I'm currently working on an API for a product and testing it against 2.7 and 3.4. It's not the end of the world, and if it works in 3.4 it pretty much works in 2.7. Granted, it's not the world's most complex or deep piece of Python, but is there a real problem in making 2.x code run on 3.x? Is it really that big of a change?
It's easy to get your own code to run on 3.x. The challenge is making sure all your C library dependencies run on it. And Python is nothing without its vast library ecosystem.
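For the pure-Python side, the usual single-source idioms are indeed small; a minimal sketch (the `text_type` alias is just a common convention, not a stdlib name):

```python
# Typical single-source idioms for code that must run on 2.7 and 3.x.
from __future__ import absolute_import, division, print_function

import sys

if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa -- this name only exists on Python 2

def describe(value):
    # isinstance checks go through the alias instead of a hard-coded type.
    return 'text' if isinstance(value, text_type) else 'other'

print(describe(u'hi'))  # text
```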
I ported APSW to Python 3 during the Python 3 beta (2008 IIRC) and thought I was being late! Porting the C code was easy - just a few #if sections selecting between Python 2 and 3. What took considerably more effort was porting my test suite. In addition to normal functionality, it does far more testing of boundary conditions and errors. I finally ended up with a codebase that compiles unaltered on Python 2.3 all the way through 3.5 (except 3.0), and the same Python source file does the testing against all those versions and the popular platforms with issues (i.e. Windows). To date I have never had a bug report where the sample code was for Python 3 - it is always Python 2.
I also provide binaries for all the Python versions in both 32 bit and 64 bit (where relevant) for Windows. When hosted on Google Code, their download stats showed the vast majority on Python 2.7 32 bit with a slow increase in 2.7 64 bit. Python 3.x downloads were roughly similar to Python 2.4 downloads! Sadly since moving to Github I have no idea what the download rates are.
BTW the only reason for not supporting Python 3.0 was that the test suite needed a module that wasn't provided (base64 or something similar), which I could have worked around, but the 3.0 release was end-of-lifed so I didn't bother.
For me, with the headaches of Unicode and international content, Python 3 is great and well worth the switch. However, I know most don't have that big a win, and there is really no business reason for them to switch.
If they'd solved the multithreading problem in Python 3 so that you could get true multithreading, that would have been a tremendously motivating force to get people to switch. But that didn't happen, and it's really undercut a lot of the pressure.
The pydata subcommunity, the fastest-growing and probably already the biggest one, doesn't care about asyncio. A good multithreading solution, on the other hand, would have been decisive.
Honestly, 3.5 is getting there. I've been pretty frustrated by Python 3, but starting with 3.5 it actually looks /possible/ to port Mercurial without tons of grotesque hacks.
I used this analogy to describe the state of Python 2/3 to a friend who knows nothing about programming. Like many Python programmers, he too would continue using the old road.
Wow, that's remarkably unconvincing. I'll think of this the next time I hear of some complex ethical dilemma solved by asking a small child.
I really wish we as a community just set a date and ported. As it is right now there's almost zero chance that the project I'm working on will get ported to Python 3. I'm not sure what it'll take, but I really wish we were past this stage.
This is the "problem" with not having things controlled by some central figure. Following the road example, the Dept of Transportation would simply open the new road, and close the old. You wouldn't have a chance to continue traveling down the old. This happens (to a large extent) with Java. Oracle opens the new road, then stops fixing potholes on the old one. You're not "forced" to move over, but eventually you stop wanting to fix your suspension.
That said, who is central enough to put an "end" date on Python and "force" people to move over?
> That said, who is central enough to put an "end" date on Python and "force" people to move over?
Guido van Rossum?
I'm not in the Python community - I barely know the language itself - but from the outside he seems to be a central figure? Doesn't he have enough authority to pull this off?
Imagine you have a company that does something not related to software. You have some important business tools that are implemented in Python. What is your incentive to spend money and take the risk of new bugs to port to Python 3? It doesn't make business sense.
That's why there is still lots of COBOL in the world, and why Python 2.x is going to live for many years, far past the predicted five-year end of life. It will be far cheaper to make new releases of Python 2.x than to port the millions of lines of existing code.
Python 3 is nice, take a look at it. It probably will be easier porting than you think. However, don't feel like you are forced into moving to it. Python 2.x will continue to be around.
>I really wish we as a community just set a date and ported.
The thing is there is no "community". There's a core team, smaller teams around some projects with their own ideas, and a huge number of individuals and companies using Python, each with their own needs, use cases and timelines...
Exactly. And those smaller communities are setting their own timelines. For example, Fedora just made Python 3 the default for their next version. Arch Linux did this already, and other groups will change over when they're ready.
I understand the desire, but there is no "we" in that sense. Those of us using python are a "community" in that we have shared interests, but not in the sense that we have shared risks and rewards. Each person/team has to deal with the constraints of their own specific situation. So, the bottom line is the tool either gains organic adoption or it doesn't. I really don't think having a king who can decree a switch is the answer. The answer would have been not forcing the choice in the beginning. Now the answer is probably going to be that more people look at alternative ecosystems that haven't suffered from this sort of schism.
The metaphor doesn't appear to make sense - at first, the new road is only "slightly wider and speed limit is fractionally higher" but then later on it appears that the road needs "enough new lanes" for all the cars on the old road.
I thought the premise of building the new road was that the old road's ground wasn't good enough, not that there was too much traffic and the road needed more lanes.
Anyway, the whole thing hinges on the author's assertion that the new road isn't that much better, so it's really up to the informed developer/user to decide whether or not to take the new road.
I tried to stay away from a which-is-better battle. I think the numbers speak for themselves as to which road everyday users of Python are using (scripts to scrape cat pictures don't count ;).
One of the nice things about Python 2 is that there will be no breaking changes introduced by an episode of CADT syndrome in the language or library design.
I'll take the Python 3 road when the big stable distributions (Ubuntu LTS, RHEL & CentOS) start providing onramps to it by default, and include the packages we depend on in their default repositories.
I just can't get myself interested in rolling my own packages for the dozens of libraries I depend on, especially since they already "just work" with Python 2.
If you mean "adoption of Python 3" in the sense of new Python programmers choosing 3 over 2, then yes, there has been progress towards learning Python 3. It's pushed in all of the online "beginning Python" guides I've looked at as well. From everything I've read about it, learning 3 is the way to go, since if you learn 3 you can always code in 2 if necessary. When the time comes that you'll need to know 3, you'll already know it.
If you mean adoption by businesses and entities using it in their infrastructure, then no, the author is correct: We're not seeing much progress there.
That's the same real world where I routinely type "mkvirtualenv …", install a bunch of things and don't notice that I've been using Python 3 until the first time I toss a print statement in for debugging?
Which virtualenvs, regrettably, depend on OS-level packages. So when you go to deploy your app to production, you find you installed some system dependency long ago and promptly forgot about it. The real world is messy.
I'm not arguing there aren't upgrade pains, but --no-site-packages has been on by default in virtualenv for over three years (https://virtualenv.pypa.io/en/latest/changes.html#id31). There are certainly issues when it comes to compatibility between the languages, but I'm not sure that's one of them any more.
I've actually done this for years in the real world. Yes, you can make mistakes. No, it's not a particularly hard problem to avoid and, as jonafato noted, --no-site-packages has been the default for a while, and I had it enabled by default for many years before then.
If this is a frequent challenge, it sounds like there's some technical debt to pay down elsewhere in your testing/deployment practice because it really doesn't need to be that hard.
I could be doing something wrong; I'm open to the possibility. And yet...
`psycopg2` is a pip package that won't install without system dependencies. Yes, I use Ansible and the system dependencies are managed. But, out of the box, on a clean system, `pip install psycopg2` doesn't just work.
The approach I use is to have the non-Python dependencies managed externally – i.e. use APT/YUM for the postgres, mysql, redis, memcache, etc. C libraries, since those are very stable on a normal distribution – so a simple `pip install` will work as a non-privileged user inside a virtualenv.
I realise that 3 has the nice async stuff, but I get a little miffed that certain improvements won't make it into a 2.7.x (or even a 2.8).
Performance improvements for one.
It seems like the same attitude as the GNOME developers... (disclaimer: I use both GNOME and Python, I must be a masochist).
Why hasn't someone written a good 2-to-2/3 converter which also updates old C libraries? Wouldn't that speed up conversion considerably? It shouldn't be too hard to do it for the C libraries as well, or am I missing something?
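For the pure-Python half, such a converter exists: the stock `2to3` tool ships with CPython and mechanically rewrites the syntax-level changes. Roughly (`mymodule.py` is just a placeholder name):

```python
# Python 2 input, as fed to `2to3 -w mymodule.py`:
#     print "total:", total
#     if d.has_key(k):
#         for i in xrange(10): ...
#
# 2to3's mechanical output (valid Python 3):
total, d, k = 3, {'a': 1}, 'a'
print("total:", total)        # print statement  -> print() function
if k in d:                    # d.has_key(k)     -> k in d
    for i in range(10):       # xrange           -> range
        pass
```

What it can't touch is the C side: the C API itself changed between 2 and 3 (module initialization, the str/bytes split at the C level), and those changes don't map mechanically.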
Python 3's asyncio / co-routines is the reason to write any new project in Python 3.
The API is tiny, and the result is beautiful.
If you haven't seriously tried using it to implement something that really benefits from this kind of computer-science innovation, then you have simply seen nothing of Python 3 yet. There are many types of logic that are vastly simplified by such a programming paradigm. Faster to code, and far simpler to debug. By orders of magnitude. One example you could try, if you cannot envision one, is to implement a high-performance network protocol. I, for instance, implemented the SSH protocol from scratch (from the RFCs) in a matter of weeks. I thought it would take months before I tried, especially since I had never programmed in Python before! As I said, the API is tiny. Learning it (Python and asyncio at once) was so easy and took such an insignificant amount of time compared to the vast quantities of time and headache saved that, well, I just had to write this post with all that extra time and energy!
Check it out! You will find that it is well worth addressing whatever fears or problems you have with making the switch.
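As a taste of how small the core API is, here is a minimal echo-server sketch in the 3.4-era generator style (host and port are arbitrary):

```python
import asyncio

@asyncio.coroutine
def handle_echo(reader, writer):
    # Echo lines back to the client until it disconnects.
    while True:
        data = yield from reader.readline()
        if not data:
            break
        writer.write(data)
        yield from writer.drain()
    writer.close()

loop = asyncio.get_event_loop()
server = loop.run_until_complete(
    asyncio.start_server(handle_echo, '127.0.0.1', 8888))
try:
    loop.run_forever()
finally:
    server.close()
    loop.close()
```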
I'm sure most people will start to use Python 3 for one of those large new features, but to me it's all the small QoL (quality-of-life) changes to the standard library, which I constantly see when searching for tricky problems and finding the answers on SO.
asyncio seems to me like it should be the biggest killer app for Python 3 - a universal replacement for the sprawling, confusing, and competing async options out there for Python 2.x (Twisted, Tornado, gevent...).
I'm hoping that aiohttp, or a similar library, further matures and we start seeing apps built on it. Developing asynchronous HTTP code in Python 2.x is significantly harder than throwing together a Node app, where things are "asynchronous by default" (unlike, say, Flask or Django). Having a universal, built-in event loop removes a huge point of confusion and learning for async development in Python, and can hopefully bring interest from developers who would normally look to Node.
1) Debugging is way better in asyncio: with ipdb you can step into a coroutine. In Twisted, you will end up somewhere in its internals.
2) asyncio is the end of unhandled errors in Deferreds :) (otherwise you need to remember to add addErrback everywhere, even after addCallbacks and addBoth) - see the sketch after this list.
3) With trial, testing is frustrating: sometimes an error is raised in the wrong test because of the global event loop (for instance, if you forgot to mock an external call). In the asyncio world the loop is passed explicitly, and this is good for testing.
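A minimal sketch of point 2 (`fetch_data` here is a made-up stand-in for real async I/O):

```python
import asyncio

@asyncio.coroutine
def fetch_data():
    # Hypothetical stand-in for real async I/O that fails.
    yield from asyncio.sleep(0.1)
    raise IOError('boom')

# Twisted style, for contrast: error handling is wired on by hand.
#     d = fetch_data()
#     d.addCallback(process)
#     d.addErrback(log_failure)   # forget this and the failure is swallowed

# asyncio style: plain try/except inside the coroutine.
@asyncio.coroutine
def main():
    try:
        data = yield from fetch_data()
        print('got', data)
    except IOError as exc:
        print('handled:', exc)

asyncio.get_event_loop().run_until_complete(main())
```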
Well here's a great blog post on how to use the @asyncio.coroutine decorator to do asynchronous/non-blocking I/O in this style, as opposed to JS-style callbacks. I think the control flow is a lot clearer with coroutines as opposed to callbacks. I've never used Twisted though so I'm not sure how asyncio compares to that library.
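A rough sketch of what "clearer control flow" means in practice - the code reads top to bottom, but each `yield from` suspends instead of blocking:

```python
import asyncio

@asyncio.coroutine
def fetch_head(host):
    # Open a connection and read the first line of an HTTP response;
    # control flow stays sequential, with no callback nesting.
    reader, writer = yield from asyncio.open_connection(host, 80)
    writer.write(b'HEAD / HTTP/1.0\r\nHost: ' + host.encode() + b'\r\n\r\n')
    status = yield from reader.readline()
    writer.close()
    return status

loop = asyncio.get_event_loop()
print(loop.run_until_complete(fetch_head('example.com')))
```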
Is there a thorough getting started guide for asyncio floating someplace on the web? I tried to pick up asyncio a few weeks ago. The official documentation assumes a lot of background that I don't possess, despite being a long-time Python user. Each method and class seems to be documented, but conceptual and end-to-end examples were much less common. Thanks in advance!
asyncio is just a nice API for async programming and event-loop management. You should start with something high-level in order to absorb asyncio's ideas, like scraping [1], or try web development with aiohttp [2]. And then, if you want to write some library or port a database driver, look at the examples in the aio-libs [3] GitHub organization.
I don't think so, especially since it still uses bytes for output to a webserver; even Go is way better for this kind of tiny library.
For bigger projects I would still go the Java route, especially since there is Karyon from Netflix.