Python 3 in Science: the great migration has begun (astrofrog.github.io)
86 points by ngoldbaum 864 days ago | 91 comments



> The main reason for Python 2 users to not switch to Python 3 is the lack of motivation/killer features. We need to therefore be more proactive in encouraging people to switch to Python 3 by (a) making sure that any new users are always directed to the latest Python 3 version, and (b) releasing, in the near future, new major versions of packages for Python 3 only, while maintaining long term bugfix support for Python 2 versions.

That's the evil right there. Read carefully what's written.

"The main reason for Python 2 users to not switch to Python 3 is the lack of motivation/killer features". Which says that users are familiar with what Python 3 has to offer but consider it not good enough. Why would you then jump to the conclusion that you have to be more proactive in directing people to switch to Python 3? Or, even better, stop adding features to what 81% of people use?


In my experience the main reason is actually people who just looked at 3.0, decided it didn't have anything new and required work to port over, and then never looked at 3.1, 3.2, 3.3, 3.4 or the in-progress 3.5.

Which means those people don't know about/don't get to use, among other things:

* Vastly improved unittest module, including mock objects

* Dictionary-based logging config

* The sysconfig module

* The pathlib module

* Built-in enums

* Single-dispatch generic functions

* The statistics module

* The asyncio module

* The "yield from" syntax to delegate to generators

* The lzma module

* The ipaddress module
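As a quick taste of two of the listed features, here is a minimal sketch (an invented example, not from the thread) combining single-dispatch generic functions with `yield from`:

```python
from functools import singledispatch  # new in Python 3.4

@singledispatch
def describe(value):
    # fallback implementation for unregistered types
    return "object"

@describe.register(int)
def _(value):
    return "int"

def inner():
    yield 1
    yield 2

def outer():
    yield 0
    yield from inner()  # delegate to the sub-generator (PEP 380, 3.3+)

print(describe(3))    # int
print(list(outer()))  # [0, 1, 2]
```

Neither is impossible to emulate in Python 2, but the stdlib versions remove a fair amount of boilerplate.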

And that's without getting into 3.5, which adds things like the matrix multiplication operator that science-y packages will care about and probably port for.
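For illustration, any class can opt into the new `@` operator (PEP 465) by defining `__matmul__`; a toy sketch with an invented class name, standing in for what numpy arrays will do natively:

```python
class Vec:
    """Toy vector type (invented for illustration); '@' calls __matmul__."""
    def __init__(self, *xs):
        self.xs = xs

    def __matmul__(self, other):
        # dot product standing in for real matrix multiplication
        return sum(a * b for a, b in zip(self.xs, other.xs))

print(Vec(1, 2, 3) @ Vec(4, 5, 6))  # 32
```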


You can probably scratch the statistics module as something scientists might be interested in; as much as its simple set of functions is useful for new users who are overwhelmed by numpy, it doesn't add any new capability to Python.

With that said - I love it.


Almost none of these things are used in scientific computing though. Maybe asyncio and yield from.


I suspect enums have uses, as do the improved async code constructs (and there's more of that in the pipeline).


Yes, I do use Python for scientific applications, and enums are one of the things I miss - I've worked around it in Python 2.x using this construct:

    def enum(*sequential, **named):
        enums = dict(zip(sequential, range(len(sequential))), **named)
        return type('Enum', (), enums)

    Init = enum("NOTHING_LOADED",
                "DEPENDANT_ENTRYNODE_ATTRIBUTES_LOADED",
                "ROUTINENODE_ATTRIBUTES_LOADED",
                "DECLARATION_LOADED")

    myState = Init.NOTHING_LOADED

3.x is still a non-starter for me, since many clusters where I want my software to work still only come with 2.6 - e.g. I can't even use dictionary comprehensions. Adoption for scientific software is heavily influenced by how many external dependencies you have - requiring users without root access to compile Python 3.x in their cluster home directory is a no-go.
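For comparison, a sketch of the same states using the enum module that entered the standard library in Python 3.4 (and is available on 2.x as the enum34 backport on PyPI):

```python
from enum import Enum  # stdlib in 3.4+; enum34 backport for 2.x

class Init(Enum):
    NOTHING_LOADED = 0
    DEPENDANT_ENTRYNODE_ATTRIBUTES_LOADED = 1
    ROUTINENODE_ATTRIBUTES_LOADED = 2
    DECLARATION_LOADED = 3

my_state = Init.NOTHING_LOADED
print(my_state.name, my_state.value)  # NOTHING_LOADED 0
```

Unlike the `type()`-based workaround, stdlib enum members carry a name, a value, and a useful repr, and can be looked up by value.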


No offense meant, but if you're at a stage where you use a pre-2.7 version because that's what comes with the OS, then it's not really a case of "looked at 3.x and decided it didn't have anything".

OS vendors who will be supporting ancient Python versions until the next decade are a blight on the entire ecosystem.


> it's not really a case of "looked at 3.x and decided it didn't have anything"

I agree, this isn't the main reason why I'm not looking at 3.x.


Yup, enums I use as well, you're right. Still... like 3 trivial things that can be overlooked.


First of all, it's not evil so let's not get crazy.

> "The main reason for Python 2 users to not switch to Python 3 is the lack of motivation/killer features". Which says that users are familiar with what Python 3 has to offer but consider it not good enough.

You could have lack of motivation because you're familiar and decide that you don't need it - or - you could have lack of motivation out of ignorance or misunderstanding about the benefits. Both are reasonable and if it's the second one, there would be benefit from being more proactive about directing people to switch and explaining the reasons.

> Why would you then jump to a conclusion that you have to be more proactive in directing people to switch to Python 3. Or even better, to stop adding features to what 81% of people use.

Maintaining two lines of development does have a cost for the entire Python community involved. I think overlooking this was part of the problem with choosing this Python 3 strategy to begin with. I personally would not have done it this way, and I get what you're saying about continuing the 2.7 line, but I think it's too late, and now 2.7 and 3.4 have improved to the point where the porting task is reasonable.

I think at this point the community should just get it over with. I say this as someone who primarily develops on python 2.7 (mainly due to library dependencies) so I get where you're coming from.


That pretty much sums it up for me. I wouldn't have done it this way, but we are too far along on the Python 3 line and it's probably time to let go of the debate and move on to Python 3, at least for new projects.

Needless to say, the Python web developer community was set back by a couple of years due to diverted attention. When Python should have been looking at concurrency and async support, everyone had to get busy maintaining duplicate code. The biggest issue was probably for the library developers (Django, Flask, Jinja, pymongo, etc.) to create new Py3 packages. But now that most of the important libraries and frameworks have moved to Python 3, application developers can probably move along with the change. It's not that much work for the app developers.


> by (a) making sure that any new users are always directed to the latest Python 3 version, and (b) releasing, in the near future, new major versions of packages for Python 3 only

How about a major speedup? Even a 10% speed up would be a big motivation!

You don't need a fancy JIT, just less costly function calls and modern interpreter optimizations.


Maybe for the sake of having a single version to focus maintenance efforts on?


I've never seen it happen. (having a single version to focus maintenance efforts on)

1. Waiting for other favorite libraries to be ported over.

2. Waiting for other people to work the bugs out.

3. Waiting for a new project to work on. I have work ongoing in Python 3 and 2.7, and I have no plans to migrate existing work from 2.7 to 3. I would not migrate existing projects that are working; let sleeping dogs lie. Seriously, you'd have to fight a lot of scientists if you told them you wanted to change anything in their (working) DAQ, just for the hell of it.


Directing people to the newest version is unsolved. If you type "apt-get install python [or pypy]" you get 2.7. This is obviously a problem.

Basically, everything here suggests that 2.7 is the official version and 3 is some kind of risky beta.


It's anecdotal, but Arch has had Python 3 as default for over a year. I think Fedora is coming close to this too.


Almost five years[1] in fact. To correct some other common misunderstandings elsewhere, it is not correct to say that python3's "changes are extremely breaking." It is easy to write code that runs under both 2 and 3, and 2to3 can convert 99% of code I've thrown at it[2]. The last one percent is straight up bad code that does crazy things like override the list() function with a variable.

The other pain point is hybrid python-C libraries that monkey around with Python internals - but those often have trouble keeping up with mere point releases. Given everything done to make the transition easy, if a library doesn't work under python3 by now it is a code smell. Either the library does weird things under the hood or it is practically unsupported.

Regarding "lack of killer features", there are some pretty compelling ones[3]. I've been writing all my code in py3 (or at least making it support both 2 and 3) for years now.

[1] https://www.archlinux.org/news/python-is-now-python-3/

[2] About two dozen packages I maintain and a bunch of personal projects.

[3] http://asmeurer.github.io/python3-presentation/python3-prese...


So is Ubuntu.


Because Python 3 made breaking changes that make the language easier to use for 99% of people. But the changes are extremely breaking.


This has been my takeaway too.


Yep, stick to VAX/VMS.


It's not that it isn't good enough, it's that the cost of switching is larger than the value of what they believe they would gain. I believe this is very common in academic settings (consider, for instance, how many people still see no reason to switch away from Fortran).


> It's not that it isn't good enough, it's that the cost of switching is larger than the value of what they believe they would gain.

That's exactly what "not good enough" means. Getting rid of the GIL, or a 5x performance increase, or optional static typing might have been worth breaking everyone's code. Somewhat better Unicode handling wasn't.


Jython and IronPython both remove the GIL, and PyPy is often roughly 5x faster than CPython. Strangely enough, these projects don't seem to be as widely used as I'd expect.

By the way, optional static typing is available in any version of Python by using MyPy.


Because none of the alternatives support many C extensions, especially numpy, without which no scientist would look at Python at all.


All three of those bundled into the official release might have done it.


but with Fortran, the reason is generally because there's a large highly optimised legacy codebase that few understand or want to understand. Also it's still difficult to beat Fortran in terms of performance.

Python has neither characteristic. Switching to Python 3 is really easy, and it's easy to understand both. There are only a few projects where it would take more than a week or two to move from Python 2 to non-idiomatic Python 3 (assuming that Python 2 needn't be supported afterwards). Sure, for some universities or whatever it wouldn't be in the budget, but where active development occurs it's by this point largely due to stubbornness (or the need to support Python 2 due to the stubbornness of others).


I worked for a physicist for a number of years who refused to give up on Fortran. It wasn't for (computer) performance reasons. It was because he could quickly get his ideas into code with Fortran. He wasn't interested in computing for the sake of computing.


Most scientists use computers as a tool to prove that something works. Since few scientists are required to make commercial tools based on these ideas, there is no reason to use the latest software engineering fads/techniques. I know a guy who writes his software in VBA+Excel, and for him it is all he needs to get data for his scientific papers.


>I believe this is very common in academic settings

It's even more common in enterprise settings, where the cost of switching also includes real money, not just some postgraduates porting code over...


Reminds me of Mozilla and HTTPS.


Just include it by default in Ubuntu repositories and distributions.


I'm a scientific user with no incentive to switch; I'm still on Python 2.7.

But the release of the @ matrix multiplication operator in Python 3.5 gives me a strong incentive to switch. I'll be switching as soon as Anaconda supports Python 3.5.


I'm a scientific user, and work in a R&D environment for scientific instrumentation. I'm probably the most advanced Python user at my site, but have inspired a number of colleagues to give it a try. We use it for pretty much everything but developing our actual commercial software.

Perhaps an added dimension worth studying is that Python is so widely used by people who have little chance of even understanding the differences, aside from the print() function. I'm not sure that I could clearly articulate them myself.

I migrated to 3.4, just so I could find out and then tell people from my own experience that 3.4 is not broken, and that it supports a sufficiency of packages. In other words I'm using myself as a guinea pig. At the same time, I offer to help people by maintaining my own programs in a way that lets them run on 2.7 or 3.4 systems, e.g., with some kind of "if version > 3" verbiage.
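A minimal sketch of what that "if version > 3" verbiage often looks like in practice (the helper name `ensure_text` is invented for illustration):

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str
else:
    text_type = unicode  # noqa -- only defined on Python 2

def ensure_text(value, encoding="utf-8"):
    """Return a text object on both 2.x and 3.x (invented helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

print(ensure_text(b"abc"))  # abc
```

Libraries like six package up exactly this kind of shim, so hand-rolling it is only worthwhile for small scripts.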

Given that I know my audience, I can tell them confidently that they will not get hung up by the 2 vs 3 dichotomy, and that I will help them if it ever becomes an issue. I think the benefits of Python, including the giant ecosystem of packages, outweigh the risks of choosing the wrong version for most of us.

Edit: One more thing for beginners, a lot of us have working Python installations on our computers, but we don't quite know how or why. A useful migration tool might be a program that analyzes your 2.x installation, tells you what you've got, and maybe even offers to build an identical 3.x installation for you.


>But this is all wrong – we should be teaching new users to use Python 3! New users won't thank you if you teach them Python 2 and they have to migrate all their scripts to Python 3 in a few years...

Believe me, they won't have to. Python 2 will stay with us for a loooong time, and most NEW web stuff AND scientific work is done with it.

>However, the Python developers have now stated that there will be no Python 2.8 release. Essentially, no new features are going to be added to Python 2. In fact, after 2020 (which is not so far in the future), Python 2 will no longer be supported.

Yeah, let's see how this works out. After all it's the same guys who said that Python 3 transition would end with 3 being the default choice by 2014.


Whatever the result of this, the Python 2->3 transition will remain one of the most troubled ever for a mainstream programming language (of course, nothing beats Perl 5 in this area). If you consider that Java is on version 8 and C++14 is here, we can see that even old languages have handled this in a much more organized way.


Isn't Python the only one with breaking changes out of this group?


Which group are you referring to? Mainstream languages? The languages mentioned above? Perl 6 was designed from the start to be very breaking, but even putting Perl 6 aside, Ruby had quite a few breaking changes going from 1.8 to 1.9 - and that was merely a point release.

Breaking changes are tough, but many libraries with a large codebase managed to update to Python 3 and even maintain dual-version support using compatible features, compatibility libraries and automated conversion tools.

For everyday users things are much simpler, and hence the main reason for not converting is not "too many breaking changes" but rather "not enough compelling changes".


It's probably reasonable to compare the breaking changes in Ruby 1.9 to Python 1.6/2.0, when Python first added language support for Unicode. There wasn't much in the way of blowback.

If at some point there is a decision to significantly change the semantics of string handling in Ruby (I think tagged strings are an addition to string handling, not a change to it), expect just as much whining as Python has seen (though I think people will look at how it has gone and avoid doing it).


I was replying to

"If you consider that Java is on version 8 and C++14 is here, we can see that even old languages have done this in a much more organized way"

C++14 and Java have extremely minor breaking changes.

I'm not saying it has to be that way, but the demographic that uses Ruby is different from the demographic that uses those other languages and Python.

But the main thing is probably that Python 2 is the default version on OSX and Ubuntu.

Updating to a newer version requires going to the Python site or installing Brew/Macports.


> Firstly, most users are using either Python 2.7 or 3.4

That was an amusing statement. His data shows 81% using 2.7, so that 3.4 could be replaced with any version and the statement would remain true.


An amusingly redundant statement judged by the rules of formal logic.

A highly informative and not-problematic-at-all statement judged by the rules of human conversation.


No, it's a highly problematic statement by the rules of human conversation, the same way shady media or advertisers play tricks with statistics to misrepresent an issue or product.

In fact, whether intentional or not, it's a trick with statistics itself. In that sentence he's presenting the huge majority and a much smaller minority selection as equals in what "most" people use.


The usage of 3.4 is 16% and combined with the 81% for 2.7 that gives us 97%. You can't replace 3.4 with "any version" and keep the statement true.


"Firstly, most users are using either Python 2.7 or 100." That is still 81% (with 0% on version 100) using Python 2.7 or 100. AFAIK anything over 50% counts as "most" :)


Check his use of "most" and "or", given that 2.7 is already "most", and 3.4 is 5 times less used.


Both 2.7 and 3.4 have a special status in that they are the latest of their respective branches, so they will always be included. The author is writing for people who are interested in what they have to do to support Python 3. People want to know:

I already have to support 2.7 and 3.4 to 'do the right thing' according to the plan. Is that sufficient or should I worry about supporting more?

This is very similar to the process web developers go through in deciding which browsers to support and when to cut off support.


Why can't those who want to use Python 3 just use 3? Why do they have to push the rest of us over? This reads as a war against the silent majority to me.

The transition has failed. Python is now 2 separate communities and we only have the 'leadership team' to thank for that. Until they make more compromises this will continue.

That ~17% userbase is your new Python3 community.


There's a definite "screw the 2.x users" attitude in the Python community. Report a bug in 2.7 and see what happens. This attitude is encouraged by Python's little tin god.[1] The big excuse for Python 3 is that it does Unicode. Python has done Unicode since Python 2.6; it just wasn't the default. I've been writing all-Unicode Python for about five years.

The killer problem is that many package developers, faced with the 2->3 conversion, abandoned their packages. New packages were written by others to replace them. Users are thus forced to convert to using different packages, packages with different bugs and a much smaller user base. I wrote previously on converting a medium-sized production system from Python 2 to Python 3. I was finding bugs in third-party Python 3 packages, bugs so blatant that they would have been found years ago if the packages were being widely used. This was in the web space; the article indicates similar problems in the scientific space.

[1] https://www.python.org/dev/peps/pep-0404/


Worse, it becomes an emotional plea: "do the right thing". As if not switching to 3.x were a moral issue. And the myth that 3.x is in our best interests or "inevitable". The author is fooled into this even though his own numbers show Python 3 can't crack 20%, despite 3.0 having been released in late 2008.

Using those numbers, even if the Python3 community doubles in another 7 years they'll still be at 34%. I think that would be an optimistic outlook as many people are simply finding something else to use instead of move to 3. I'd be as interested in Ruby 3.0 or Go as I would Python3. Many programmers for some reason cannot resist technical churn. To me that's what Python3 is. All things considered not better, not worse. Just different. But with the library issue.

My guess is Pyston will catch on and maintain the 2.x line after 2020. Pyston may end up the innovation that Py3 wasn't, but without breaking everyone's code. LLVM JIT compiler with C extension support. It's music to my ears and transitioning from being a Python programmer to a Pyston programmer sounds good. I suspect by 2020 it'll be ready as a spiritual 2.8, and the 80%+ will move to that.

Only this core dev team could've achieved defeat from the jaws of success by shooting itself in the face.


The successor to Python 2 may, in practice, be Go. That's the direction Google is going. Google used Python internally for non-speed-critical tasks, but the performance was too low for anything that had to scale. A few years ago Google hired van Rossum, and Google had a project, "Unladen Swallow", to produce a faster Python.

It failed.[1]

Van Rossum is no longer with Google. Google hired others to develop Go, which seems to be a good language for doing server-side web-related work. It's fast, memory-safe, scales well on multiprocessors, and has lower development costs than C++. That's what Google needed in their business.

Google maintains many of the key Go libraries. They're well-exercised production code. This is not the case for Python 3, as I spent a painful month discovering. I have some technical criticisms of Go, but when I write something in Go, it usually works as expected without surprises. You can use Go for important work with confidence. Python 3, six years on, isn't there yet.

[1] https://en.wikipedia.org/wiki/Unladen_Swallow


Failing something like Pyston taking over for 2.x, that is my plan. Going from Py2 (what I currently use) to Py3 doesn't really do anything for me at all. There's the constant threat online from people about 'support' and guilt trips, but the support that matters (library support) isn't going away anytime, if ever, with a 80%+ userbase in 2015.

So I'm going to stick with Python2, and if I migrate to anything it'll be Go. Though it'll likely just be also using Go. I've already worked through a book on Go a couple years ago. I have my complaints about it, but I also have my complaints about Python.

Nothing is perfect, but Go IS really easy to get started with, which is worth a lot to an individual programmer or to a team. I'm not sure it's flexible enough as a drop-in replacement for Python, but if forced it's good enough to completely replace Python, if this Python2/3 debacle isn't resolved.

I'm seeing Python as 2 separate communities now. Of course the Python3 diehards see that as the worst-case outcome, since their community is always the short stick. But I think we're here now and it'll remain this way.


Van Rossum worked on App Engine; he didn't work much on Unladen Swallow. You can see it here:

https://code.google.com/p/unladen-swallow/source/list

Google hired the Go guys years prior to Unladen Swallow even being an idea, and the first open source versions of Go were available around the time that the Unladen Swallow project plan was put out:

http://golang.org/doc/faq#history


Py2 and Py3 aren't actually that different. And the library ecosystem has caught up. I switched one year ago, and I do not regret it. Better Unicode support is great for people like me living in a non-ASCII world, and asyncio (Tulip) has a sizeable async ecosystem now.
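As a taste of that ecosystem, a minimal asyncio sketch (using the async/await syntax added in 3.5 and asyncio.run from 3.7; the 3.4-era spelling used @asyncio.coroutine and loop.run_until_complete instead):

```python
import asyncio

async def fetch(name, delay):
    # simulate an I/O-bound operation without blocking the event loop
    await asyncio.sleep(delay)
    return name

async def main():
    # run both "requests" concurrently; gather preserves argument order
    results = await asyncio.gather(fetch("a", 0.01), fetch("b", 0.01))
    print(results)  # ['a', 'b']

asyncio.run(main())
```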


Python 2.7 is an extremely productive language, for a lot of us. I want to move to 3.x at some point, but there's very little I can't do with 2.7, and it is the default on OSX, and many linux platforms.

The one killer feature that would make me move almost straight away would be making it a lot faster.

But since I have pypy, I'm much more interested in installing that as my 'extra python version' than I am in going to 3.x straight away.


Is the high % of mac users still on 2.7 partly because that is what OS X ships with?


Python 3.2 is the latest version that is PyPy compatible, so support for 3.2 shouldn't be dropped if that's a concern.


I'm currently working on a new project using Python 2.6. You see, I'm creating an analysis tool that absolutely must run on an old, isolated (no internet connection) RedHat system, and have access to only a very limited number of packages I can install. Which, in my case, means Python 2.6, fairly old versions of NumPy and SciPy, and Tk for the GUI.

This is actually my first significant project in Python, so naturally I wanted to use the latest and greatest (learning opportunity and all), but no such luck.


I've never quite been convinced by this argument. If you can deploy and run .py files, surely you could deploy and run a local copy of python3? It's not like it needs root.

The target system not having an internet connection is a best-case for a local python3 package, since the lower attack surface makes package security updates less urgent.


Perhaps off topic, but this was the most interesting to me:

http://astrofrog.github.io/images/survey_plots/os.svg

People really use Linux in scientific communities? Like, non-computer people? In the Netherlands Linux usage (or anything other than Windows and OS X), even among university software engineering students, is almost non-existent. Three out of seventy students I know use it (4%), including myself.


Linux is very common in the physical sciences. Back in the 80s and 90s a Unix box from Sun Microsystems might have been more common, but Linux has been firmly in place for more than a decade now.

The audience for this survey is also probably somewhat biased, since it was mostly promoted on twitter. That said, none of my colleagues use Windows. I'd say 75% Mac with the rest various flavors of Linux.


Currently I am doing some consulting work in life sciences for well known multinationals.

On my specific case, Linux is only used in their HPC clusters and for hosting some DB servers.

All researchers use Windows systems as their desktops and control systems for their automated robotic systems.


I know a ton of (mech, civil, aerospace) engineers who use Linux, including quite a few engineering graduate students who use emacs from the command line as their primary development environment.


I have a friend just starting to get into astronomy research, and he had to learn how to use basic Linux CLI. A ton of tools are written explicitly for Linux!


The scientific community should focus on moving to PyPy and helping to excise CPython C extensions from more scientific libraries.

That's at least a change that will benefit them.


How about...no?

PyPy is a nice way to make most Python programs 20-100% faster, often at the cost of using slightly more memory. Writing in C/C++/Cython etc, we typically see 100x run-time and memory improvements.

These extensions were written in C for a reason. If I had to write in pure Python, I'd find another language for the bulk of my work.


The reason is that PyPy didn't exist, and C extensions were the way to go back then. That is no longer true. Anything else is misinformation.

I don't use numpy (or really very many scientific libraries), and talking about generalities like you mentioned just isn't meaningful, but here's a really old benchmark which is (small) evidence that it's just not true that these things need to be written in C anymore: http://morepypy.blogspot.com/2012/01/numpypy-progress-report...


Why are you trying to push this point, when you plainly admit you don't know much about this? And so strongly, too! "Anything else is misinformation." I've seldom seen someone so confidently wrong.

PyPy can address Python's slow loop performance and slow numeric calculations, in particular examples. This can help a lot. But if the problem requires you to carefully manage your data structures so that you get good cache performance, PyPy cannot help much at all.

Consider this: there are very few non-toy examples of PyPy giving order of magnitude efficiency improvements over CPython, while other languages do typically run that much faster than Python implementations.

Finally, if you don't want to talk about generalities...Consider my NLP library, spaCy: http://honnibal.github.io/spaCy/#speed-comparison . It's the fastest library with this functionality in the world, among any language.

This library is an example where you need C data structures just to make the working set manageable. You would need an enormous amount of memory just to load the models. In pure Python, the library would not be of much use to anyone --- so I would have written it in Java or C++.


I did not admit I don't know much about this, although I will now, but that's not the important thing in these discussions. Very few people in the scientific community know what's out there, and many are misinformed in the exact same ways.

You seem to be suggesting that everything is fine, and that all the C code that exists right now in the scientific Python community is justified. This is incorrect.

Needing "C data structures" (I'm guessing you mean memory management, but I'd love to have a look at your library tomorrow morning) is certainly reasonable -- and there are better ways to do so than the CPython API.

My point before, which I am pushing, yes, is that the scientific Python community is using old tools, and it would benefit from new ones that would directly help it - which PyPy will. CFFI is a better choice than ctypes or the CPython API for far more agreeable reasons than whether Python 3 is a better choice than Python 2, and pure Python is a better solution than both where it's applicable. Exactly where that is doesn't matter to me; my point is "it's more than what we currently have in pure Python".


Numba is faster than PyPy and compatible with CPython.


The scientific community will use whatever makes it easier to achieve its own goals. The idea that you need to use something because it is cooler, more open, fancier, etc., works only in the programming community, where software itself is the end goal.


At a minimum, that would require largely rewriting scipy, pandas, and any other piece of software that uses the CPython C extension interface rather than the FFI.

That's a pretty tall order...


SciPy and Pandas use Cython, so if that can ever be made runnable semi-directly in PyPy, those two are dealt with. Only NumPy uses the CPython API directly but this is being worked on as NumPyPy (see http://morepypy.blogspot.de/2015/02/linalg-support-in-pypynu...).


SciPy uses Cython, but it also uses other C interfaces. The most definitive statement I found in a couple minutes is here:

http://www.scipy.org/scipylib/faq.html#how-can-scipy-be-fast...

> SciPy uses a variety of methods to generate "wrappers" around these algorithms so that they can be used in Python. Some wrappers were generated by hand coding them in C. The rest were generated using either SWIG or f2py. Some of the newer contributions to SciPy are either written entirely or wrapped with Cython.


> f2py

This is a Fortran-to-Python interface, and has some support for Python 3 so far as I can tell.


Actually, most of the time, the performance-critical computations happens in the highly optimized C modules already, so there's not /that/ much to gain. That being said, the PyPy team is working on a NumPy integration (NumPyPy).

In many cases, you can use something like python-bond to call PyPy from a regular CPython program: https://pypi.python.org/pypi/python-bond (or just use multiprocessing).


The C extensions shouldn't be cut out of Python. I get that PyPy is better, but cutting out C extensions and rewriting them in pure Python would hurt performance more.


There are several other options. The extension can be written as a pure C extension, with CFFI for the bindings, and code in Python to present the public API. Or write the extension in Cython, which has backends for both CPython and PyPy's cpyext (I have no experience with this option).


Personally, I would really like to see more Python extensions written using the ctypes foreign function library. This has two advantages:

* Supported on more than just CPython. I.e. you can use them on PyPy as well.

* The extensions are far easier to install, since one doesn't need both a full C compiler as well as all development headers for the library installed on the server where you install the extension.
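As a toy illustration of the approach (assuming a POSIX system, where `CDLL(None)` makes the process's libc symbols visible), here is a ctypes wrapper around `strlen` with the C signature declared explicitly:

```python
import ctypes

# POSIX only: CDLL(None) looks up symbols in the running process, which is
# linked against libc — no C compiler or development headers required.
libc = ctypes.CDLL(None)

# Declare the C signature so ctypes converts arguments and results correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def c_strlen(s):
    """Pythonic wrapper exposing the C function."""
    return libc.strlen(s)

print(c_strlen(b"hello"))  # 5
```

The same script runs unmodified on PyPy, which is exactly the portability advantage being described.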


ctypes overhead is horrendous though, at least in CPython (I don't know about PyPy). Fine if you're doing array operations on huge arrays, but not so if you have lots of small objects.

I wanted to use ctypes to wrap a C library for a recent project (for the ease of installation and development that you mention) but had to give up when it turned out to be more than 10x slower than a wrapper written in Cython, and barely faster than doing a pure Python implementation of the C library itself.
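To make the per-call overhead concrete, a quick (unscientific) micro-benchmark on CPython comparing a ctypes call against a builtin doing comparable work; absolute numbers vary by machine, but the FFI marshalling cost dominates for tiny operations:

```python
import ctypes
import timeit

libc = ctypes.CDLL(None)  # POSIX only: libc symbols from the running process
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

s = b"x" * 64
n = 100000

# Each ctypes call converts arguments through the FFI layer, which on
# CPython typically costs far more than the wrapped C work itself.
t_ctypes = timeit.timeit(lambda: libc.strlen(s), number=n)
t_builtin = timeit.timeit(lambda: len(s), number=n)

print("ctypes strlen : %.4fs" % t_ctypes)
print("builtin len   : %.4fs" % t_builtin)
```

This is why ctypes works fine when each call does a lot of work (big array operations) but falls over when crossing the boundary millions of times for small objects.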


It's not clear that that's true (that performance would be worse), but regardless, there are many other, better ways to run C code from within Python (CFFI currently being the easiest and nicest).


On a related note, Apache Spark recently landed Python 3 support in master, which will be released as part of Spark 1.4 in the next month or so. [0]

[0] https://issues.apache.org/jira/browse/SPARK-4897


Does this[1] type of plot have a name?

[1] http://astrofrog.github.io/images/survey_plots/python_vs_exp...


hmm... I would describe it as something like a '2D histogram square-bin plot', a 'categorical plot', or perhaps a variant of a 'mosaic plot' using color density instead of area.

http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html

http://en.wikipedia.org/wiki/Mosaic_plot


Thanks, that is a lot more than I hoped for yet very appropriate considering what I plan to do with it.


In all honesty, I'd call it a table. This plot stood out for me as being unnecessarily a plot. It's tabular data, plain and simple. The x and y axes are simply used to order the margins of the table. They've added colors, which I guess are a little plotty, though I'm not sure the color adds or detracts here.


My conclusion is: if there were no Python 3, Python would be much better


Only 10% Windows? Has ESO switched to Linux?


And half are astrophysics?


I'm assuming you chose science as the basis of this analysis because scientists would be more likely to provide thoughtful replies, if any reply at all, correct? I did like the analysis, but once the proselytizing kept popping up, it started to lose value and actually stoked some of the existing tensions on this matter (we should stop teaching Python 2??)

So now we have to come down from the clouds into reality. For one, having limited yourself to scientific fields (sort of), you have literally chosen the tip of the iceberg. Let's ignore the fact that the stats clearly show abysmal adoption of Python 3 among those who make a living by being on the cutting edge and can most afford to adopt. But 786 replies?? And we should stop Python 2 for THAT (even though ~700 in that set aren't using it)???

If you venture into the commercial world, this entire analysis becomes irrelevant. My field of finance and derivatives trading is one of the big up-and-comers in the Python space for a number of reasons, but largely due to cost: rapid prototyping, powerful testing, ease of training, etc. And I can say with a fair degree of confidence that there are more Python developers at my firm than in this entire survey set, and all of them use Python 2.6. Or at least they did until a multi-year upgrade migration finally brought us to Python 2.7 just this year (and cost millions to achieve). Most are still using 32-bit, develop on Windows (in in-house-built IDEs), and split deployment between Windows client-side projects (GUIs) and Linux server deployment.

This is not to say we are simpletons: we have vast and complex systems, globally distributed compute grids with tens of thousands of cores (and equivalent disaster recovery clusters), reactive caching graphs, time-shifting time series, and cutting-edge data stores manipulating massive data sets in the tens of billions of rows, and we basically run huge swaths of the HUNDREDS OF TRILLIONS in notional derivatives that drive the entire global economy. These systems have BILLION-dollar budgets. And mine is one firm out of thousands employing tens (if not hundreds) of thousands of well-compensated software developers.

If my point isn't abundantly clear by now, then to put it frankly: WE (and many other industries like ours) ARE the Python community. And we shift all your stats to the point where Python 3 isn't even statistically significant. So perhaps it would be wise to work with the elephant in the room instead of trying to pull the rug out from under him. Because elephants are big and don't like that shit. Moreover, as a multitude of examples can attest, any features left out of Python 2.x ultimately just get developed in house and become closed source — not only a huge inefficiency in reinventing the wheel again and again, but also a way of locking up the critical knowledge of the best way to do it, backed by proven production deployments with big money at stake.

I don't want to dissuade this sort of analytical review, but if it's going to be science, it should bear at least some resemblance to it.


I wonder if you and I have crossed paths at some point in the industry.



