PyPy gets funding from Mozilla for Python 3.5 support (morepypy.blogspot.com)
316 points by progval on Aug 9, 2016 | 103 comments

Weird how suddenly everyone seems to have coalesced around 3.5. I myself have experienced this just recently -- the new async stuff is too nice to pass up. Really looking forward to new formatted string literals in 3.6 as well! One of those ideas that's so obvious in hindsight, you can't believe it took us this long to come up with it.

Not weird. This phenomenon is literally called a "tipping point," and it has been well documented in many different settings and disciplines. Example:


We're at the tipping point between Python 2.7 and 3.x.

One of the arguments against switching to Python 3 has always been that you just don't gain enough to justify the hassle, so maybe 3.5 finally hit that critical mass of things people really want.

Don't gain enough? Static type checking and asyncio? Sign me up!

Static checking? Python 3.5. Asyncio? Python 3.4.

If these had been available in Python 3.2 or even better 3.0, the switch would have been far easier for corporate users that need benefits before accepting the cost of change...

3.4 gave a good virtualenv by default. That helps a lot.

I still use normal virtualenv, what am I missing by not using pyenv?

3.4 gave us pip installed by default https://docs.python.org/3/library/venv.html

Before that, it would give you an empty venv, not even setuptools.

Yes, you can use the Python 2 virtualenv on Py3, but I remember that there were some problems.

The usage is essentially the same, just a different command. A huge benefit, though, is that if you have Python 3.4+ installed you already have venv and pip installed, so things are much simpler.

Which one is that? I'm not aware of it.

    python -m venv
I'm not sure what the differences are from the old virtualenv, but the biggest feature is that it works out of the box as long as you have Python 3.4+. No more googling "how to install virtualenv" or "easy_install pip; pip install virtualenv" stuff.

Unless you're on Ubuntu, where they ship a non-functional version of venv and require you to "sudo apt install" python-venv to get the working version, defeating the entire point of a simple module in the stdlib that lets you manage your Python environment as a user.

The big improvement on 16.04 is that at least the error message explains what's going on.

In addition to what others wrote, you also have the pyvenv command, which is a much more convenient way of using it on the CLI (the usage is the same as virtualenv's).

I would say: Optional type checking outsourced to a third party tool (mypy) that catches perhaps 90% of the errors if you are lucky.

You still need something like MyPy to actually validate types. What happened is that there's now a standard for typing annotations - now, PyCharm, MyPy, Wing IDE and anyone else can share the same syntax.

I was disappointed in the Python core developers when they decided to accept static typing into Python 3. It bothers me more than the formatting PEP.

It's optional. How can that possibly bother you? Don't use it if you don't like it.

It's not only optional, it does nothing by default. What was introduced into Python is a standardized way to do type annotations. It consists mostly of a new standard library module, and as far as I know it doesn't even introduce changes to the language, as it uses comments, additional files, and the rarely used function annotations. The interpreter happily ignores them anyway, but external tools like mypy don't have to invent their own incompatible syntax.

I was more excited about them when I believed they were actually enforced. I guess they help with static analysis in IDEs, but you can still get type errors at runtime, so Python is still missing out on the main advantage of static typing.

there is a type checker available: http://mypy-lang.org/

> as far as I know it doesn't even introduce changes to the language as it uses comments

That's not the case. See for example https://www.python.org/dev/peps/pep-0484/#type-definition-sy...

Because I'll need to support people that use it and explain it to newbies who read it. More work to do, without much benefit (to me; I understand it has much benefit to some people).

It's only without benefit if you don't write Python. Otherwise, it's like having tests that take almost no effort to write. What's not to love about it?

Edward Tufte might call it chart junk. It reduces the information to ink ratio. I generally get by just fine by choosing good names, so I don't need extra type checking beyond the standard duck typing.

Why add static typing to a language that has long stood for duck typing? There should really be one way to write in Python.

A few things.

i) It's not a type checker; it cannot actually ensure type correctness. So maybe a more appropriate description would be "a very smart linter".

ii) It's optional.

iii) Even when switched on, it does nothing unless you are explicitly using its features.

iv) The "there is only one way to do things in Python" is just vapid marketing. It is not true and has never been true.

"there is only one way to do things in Python" is just vapid marketing.

It's not even that, the actual quote is "There should be one, and preferably only one, obvious way to do things". The whole "there is only one way to do it" was a joke response to Perl's "There's more than one way to do it" motto.

The explicit version: "There should be an obvious way to do something. Ideally only one way will be obvious."

The original version was too implicit. Unless you are Dutch.

I've had great experiences with optional static typing from tools like flow[1] and mypy. I've found you can gain most of the bug-catching and readability benefits of static typing, while keeping the power of dynamic typing when you come across a truly dynamic problem.

It's been almost a decade; a lot of bugs, annoyances and slowness have been ironed out. Most important libraries are py3, which is a biggie, and py3 changes were backported via __future__ to py2, so you can easily begin getting used to the syntax in current apps.

So now that you are used to using print("hello") instead of print "hello" and not having to use u'whatever' all over the place, it was a perfect time to experiment with toy apps in Python 3, since they are mostly backward compatible with Python 2 without all the 2to3 nonsense. Even Ubuntu made it the default.

Which leads us to asyncio, the killer feature - people were prepped to move...but didn't have a really compelling reason to...until now.

Totally not weird at all.

Users need a very compelling reason to switch off any kind of platform, and usually one of the most compelling reasons is better performance.

The reception of Python 3 would have been very different had it launched with everything Python 3.5 currently has: a more performant GIL, OrderedDict, faster decimal operations, the new asyncio, etc.

> One of those ideas that's so obvious in hindsight, you can't believe it took us this long to come up with it.

If by "come up with" you mean finally acquiesced after decades and decided that string interpolation wasn't such a bad idea... sure.

What's the difference between `"%s" % foo` and `"{}".format(foo)`? Why is one interpolation and the other not?

The format() method is more capable as it plays nice with heterogeneous duck-typed objects. It is type aware and you can supply your own custom formatters through __format__ methods that can receive parameters from the format string.

% will spew TypeErrors if you don't get the format specifier right.

Python 3.6 is adding `f"{foo}"`, which is rather different than the previous offerings.

Yes, but to clarify, it's only rather different than previous python offerings. There exist things more or less equivalent in other languages, so it's not as if there's much of a reason it took this long.

Wow that is horrible.

I really hate the fact that there are now 3 ways to do it. And that it looks like PHP.

To be honest, to me "%" seems the most Pythonic way to do it. format does not seem easier to understand to me. String formatting just cannot be transformed into natural language. I at least don't think "I need to format this string", I think "this variable needs to be in there", and the "in there" might as well be called "%". I think ["a is %s" % a for a in all_as] just fits.

Although named parameters are nicer with format. In no way should there be a third way to do it. Isn't that the basic principle of Python, to care about things like this? I mean, this whole Python 3 problem was to a small extent caused by people rather having an incompatible syntax than two ways to print a string. Going through those troubles and then inflating the language with new syntax for the same thing seems like a waste.

In that respect, don't list comprehensions fall under the same category of multiple ways to do something? They didn't exist in the language until 2.0[1], and it's just a convenient way to encode a couple for loops, is it not[2]? At some point you have to accept that it's a trade-off, and the more complex the topic, the less well served it will be by only one or two ways to do it. In this case, there were actually already three ways to do it, since you could always use plain string concatenation with `+`. If python were truly about one way to do it more than anything else, that would be the only way to build complex strings from variables, but since convenience is also important, numerous shortcuts were deemed acceptable and implemented.

1: https://en.wikipedia.org/wiki/History_of_Python#Version_2.0

2: And for my money, significantly more confusing (and less powerful) than something like Perl's `map`.

List comprehension is the one obvious way to do a quick map; a for loop would be used for more complex stuff and side effects. A real map is more niche, but probably necessary when doing eval stuff, or stylistically nicer in a module where you do several functional tasks.

String concatenation has one use-case where it is the obvious way, too: when you have to concatenate already existing strings without modification, since "%s%s" % (a, b) is just stupid. Beyond that you would use string formatting.

Maybe there are unique obvious use-cases for the different formatting options as well, but I don't know them. I'll grant you that "%" is bad for named arguments, but why wouldn't you just improve that, then? I think they do the exact same thing and are good for the exact same contexts. If somebody knows more than me, I would be very happy to be proven wrong.

> I'll grant you that "%" is bad for named arguments, but why wouldn't you just improve that, then?

Perhaps they should. I'm not making a case specifically for the proposed strategy (interpolation), just that if they are going to implement it, that it took this long is odd, and somewhat laughable. Personally, I think the named string format is superior to this new feature in many respects except the most simplistic of cases, but I don't often write python, so my opinion may be colored by that.

List comprehensions are interesting and really come into their own when seen as part of the whole family of comprehensions. In addition to list comprehensions, you also have set, dict and generator comprehensions that all share the same syntax.

Python also has map(), but the lack of lightweight block syntax (like Perl's and Ruby's) makes it more cumbersome to use. I don't see how list comprehensions are less powerful, though.

> Python also has map(), but the lack of lightweight block syntax (like Perl's and Ruby's) makes it more cumbersome to use.

Yes. Additionally, I prefer the clear directionality of control and transform denoted by the dual capabilities of `map` and `grep` over list comprehensions, especially when chaining. Even better, IMO, is Perl 6's ability to use the feed operator to go left-to-right, so the map block can be read in the same direction as the flow of list items.[1] Having to scan back and forth to determine what is happening is a real detriment to the readability of these types of expressions, IMO.

> I don't see how list comprehensions are less powerful, though.

You're correct. Not strictly less powerful, just that the single expression for transforming the item may require multiple comprehensions where a single slightly more complex map block could get away with a temporary variable. Given that I dislike the way multiple list comprehensions read, this may be a bigger deal for me than most.

1: E.g. my @new = ( @original ==> grep { ... } ==> map { ... } ==> sort { ... } );

just checking to be sure, your example is essentially

original.filter(...).map(...).sort(), which in python would be `sorted([... for x in original if ...], ...)`.

I'll agree that sometimes the functional/chainable approach is nicer, but often it also isn't necessary

> just checking to be sure, your example is essentially original.filter(...).map(...).sort()

Yes, a method-chaining format (such as Ruby's and JS's, and which can be enabled with a module in Perl) does approximate the control flow I'm referring to. The only difference from the feed operator in Perl 6 is that the feed works on a per-item basis, allowing the list to be lazily generated, which also means it's a closer approximation to how you would code it in a for loop, as statements are applied in succession to each item.


    my @source = 1..Inf;
    my @elevensies = ( @source ==> map { $_*11 } );
    my @odd-elevensies = ( @elevensies ==> grep *%2 );
    my @root-of-odd-elevensies = ( @odd-elevensies ==> map *.sqrt );
Although in this case the left feed operator may be clearer.

> which in python would be `sorted([... for x in original if ...], ...)`.

> I'll agree that sometimes the functional/chainable approach is nicer, but often it also isn't necessary

Well, the example I gave was simplistic and to show direction. A more complex example would require a comprehension of a comprehension, which is where I start to have real issues with the format, and which is not represented from that example, which wasn't meant to illustrate that particular problem.

Your example translated to Python is also lazily generated (it uses the lazy equivalent of list comprehensions, called generator expressions):

  import math
  from itertools import count

  source = count(1)
  elevensies = (x * 11 for x in source)
  odd_elevensies = (x for x in elevensies if x % 2)
  root_of_odd_elevensies = (math.sqrt(x) for x in odd_elevensies)
To me, the examples seem fairly equivalent, with the exception that Python's is, in my opinion, more readable for a newbie, since it doesn't require knowing map() or grep() - just for loops and ifs.

I was aware of generators in Python, but I wasn't aware of the lazy list comprehensions, which are very nice.

I agree list comprehensions in the simple form are easier for a newbie to recognize, but I also like the way map and filter make you think about your actions, and think they are clearer in the less trivial cases (as I mentioned above). I say "filter" because while I am completely familiar with that terminology, I accept that it's an unneeded departure from the common terminology for that task.

Well, map and filter may be more clear in certain cases, but I still don't see how they are more powerful. Seems like you can do a 1-for-1 translation between the two.

Oh, that's because you must have missed a few comments back where I agreed with you and backed out on any claim they were strictly more powerful. ;)

Ah, uh, sorry ;)

> Weird how suddenly everyone seems to have coalesced around 3.5

I wonder if v3.5 being included in Ubuntu 16.04 LTS has helped there a little.

I wouldn't think so. It's more likely that 3.5 just hits all the right bullet points and people are excited about it.

I'll be impressed by the new async features when they abstract the manual declaration of the event_loop in ways similar to Elixir or Golang. That's still a very confusing part for me and I find it hard to implement async properly into my programs in such a way that it's worth the added complexity.

This is fantastic news. Right now, PyPy is probably the biggest reason why many users don't migrate to Python 3 yet. Python 3 comes with a lot of very nice language features, and even though CPython 3 is already faster than 2.x in most cases, those speed increments are very small compared with the speedups of PyPy, which in my experience brings Python's speed to NodeJS level. I'm looking forward to their first release!

> Right now, PyPy is probably the biggest reason why many users don't migrate to Python 3 yet

I've met hundreds of Pythonistas, in all sorts of different places. A lot of them wanted to migrate to Python 3.x, but not a single one told me that PyPy was a reason.

I fall into that camp too. I worked on a project where I really wanted to migrate to Python 3, mostly to have a saner Unicode story, but couldn't do it because it was deployed on Google App Engine :(

You may already know, but you can use 3.4 if you switch from the "Standard Environment" to the "Flexible Environment".


Yep. Already tried that, but I found so many disadvantages in the "flexible" environment (still in beta, bad documentation, much slower deploys, a minimum of one instance always running, etc.) that I preferred to stay out of it for the time being.

Similarly, Python 3.5 is (for me and probably others) the biggest reason for not using PyPy

Agreed. Looking forward to the 3.5 port.

You mean you don't use 3.5 because PyPy does not support it.

No, it seems like they are in the same position as myself.

They (and myself) are not using PyPy because PyPy does not support Python 3.5

> brings Python's speed to NodeJS level

Does anyone have a benchmark? I hear this claim a lot, but I haven't seen NodeJS being that much faster.

The Techempower benchmarks show them each having their strengths, with 3 'wins' apiece:


The usual Techempower caveats apply: I have no idea about the implementations or any possible constraints on these, YMMV.

Thank you, that seems useful. I must be reading it wrong, though, because I see wsgi completing three times more responses than node.js, which seems wrong. I also can't find the "three wins each" that you reference.

EDIT: Ah, I just noticed the different benchmark tabs at the top, thanks.

NodeJS/V8 is faster than CPython for obvious reasons - V8 is JIT compiled while CPython is not. PyPy closes the gap.

LuaJIT made the news a while ago for having an interpreter that beat V8's JIT. There are a lot of traits of languages and their implementations that make looking at their compiler features a poor way of determining speed.

> Right now, PyPy is probably the biggest reason why many users don't migrate to Python 3 yet.

Does PyPy have that many users?

Doesn't seem that way to me.

For actual real world apps with a lot of IO, your PyPy benefit is kinda small.

For actual number crunching, I guess. Which is the whole scientific side of Python.

Python is weird in that it is 50% webapp hipsters à la Ruby, and 50% scientists in lab coats sitting in the lab crunching numbers.

Well, PyPy does not actually help much with raw number crunching -- a scientist will use numpy for native matrix math, or numba to JIT a tight loop that can't be written in numpy easily. PyPy really helps in the in-between cases: a few weeks ago I had to write a script to pull a few million rows from a database, process them nontrivially, and dump them to a file. What do I use here? CPython and Ruby will take days. Perl will probably take days too, unless I am just doing text processing, which Perl is optimized for. I can write C or C++, and have the headache of debugging memory errors in a script that will only run once. Or, I can write PyPy and have immensely convenient syntax that will run reasonably quickly (LuaJIT also occupies this space for me).

Well, if this task is trivial to parallelize... I would spin up 100 GCE machines for 1 penny each per hour after today's price drop (a combined total of $1 per hour for 100 CPUs of action) and pull those few million rows and dump them to a cloud drive in 1 hour flat.

It's a tricky sweet spot PyPy has. Certain tasks that need CPU power but not too much CPU power and you want to spend time to make it run in parallel but not too much time to make it run across many machines.

(By no means am I a PyPy or CPython expert. Just toyed with them and want to understand the bounds...)

But now you have to manage that deployment which is also nontrivial.

Depends on your setup, I guess? I have a GCE cluster setup where I can:

1) Point a metadata variable at my git repo

2) Spin up 100 GCEs

3) Each will pip install requirements.txt from the git checkout

4) Each will read a metadata variable to get the command and run it. (Could hardcode run.py or something.)

Realize not everyone has this sitting around, but you can do it in an afternoon. It then gives you an instant Python supercomputer for anything you want across many projects.

The setup for that sounds significantly nontrivial, although interesting.

Scientific Python benefits more from (properly) vectorized numpy code, Cython, Numba and Numexpr than what it gains from PyPy.

From my experience, one of the biggest issues with PyPy is that it's not a magic pill you take and suddenly all your Python code runs faster -- that's how they "sell" it. But in many cases, the idiomatic code for CPython results in slow PyPy execution, and idiomatic PyPy code results in slow(er) CPython execution.

That means that you have to refactor your code, and whereas you won't be breaking compatibility with CPython, your CPython performance will (most often) suffer. So, unless you're the sole user of your code, or you have control or decision power over most of your users, (or you just don't care to keep good CPython performance) your project might not benefit: "yeah, in PyPy it runs 2x faster, but 80% of our clients use CPython and are complaining over a 5x slow down -- ditch that code and get back to the previous version" may end up being the feedback from someone above you in hierarchy.

> Python is weird that it is 50% webapp hipsters ala Ruby, and 50% scientists in lab coats sitting in the lab crunching.

Don't forget about the ops guys using it for automation!

Oh yes my ansible brothers and their lack of Python3 support ;)

Where brothers == inbred cousins

Hahaha kinda.. but if Ansible brothers are inbred cousins, Puppet and Chef brothers are certified insane asylum.

Well, it weighs in favor of Python 2, and considering that until recently the incentives for Python 3 were still light... It's not really about having users, but about having the possibility of switching from CPython to PyPy if you face performance issues.

PyPy is still not really a performance option for numerical computing.

Are you not satisfied with NumPy and associated tools?

Very satisfied, and that is why I can't really switch to PyPy

I wonder if uvloop will work with this PyPy. The speed would be awesome!

It's a possibility. uvloop is written in Cython so it will depend on whether PyPy3 supports what uvloop uses through Cython: http://docs.cython.org/en/latest/src/userguide/pypy.html

We'll find a way. Worst case it's possible to rewrite it using cffi.

This thread presents a great opportunity for someone to sweep in and present a foolproof path to persuade enthusiastic newbies to jump in and help...

I'm glad to hear about this funding. It's also a reminder for myself to bring the topic to my company about donating to open source.

Oh thank goodness. The last thing that would make me consider using Python 2 for new projects is about to go away.

I'd love to switch to PyPy, but Theano doesn't seem to be supported, Tensorflow is an unknown.

Does anyone know what is the limiting factor for supporting deep learning frameworks on PyPy?

Why would you ... umm, use pypy with these frameworks? After all they do the heavy lifting via very specialized low-level libraries, right? So python is just the glue holding together and driving the data pipeline.

Why is Mozilla doing this? They're already doing Rust, and their products don't use Python internally. The build process for Mozilla add-ons was even converted from Python to Javascript to eliminate Python from the build chain.

I'd rather see Mozilla support Thunderbird, which is useful.

Their sites use Python extensively. https://blog.mozilla.org/webdev/2013/02/22/the-restful-marke...

The code push infrastructure also runs on PyPy specifically: https://github.com/mozilla-services/autopush

Mozilla uses Python for many things. To name just one example of many, the mach build script for Firefox and Servo is written in Python.

In addition to their sites (which have already been mentioned), their build tool "mach" uses Python[1].

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Developer_g...

Pedantry time: mach isn't a build tool, it's a command dispatch framework. So yes, |mach build| will build Firefox (or servo), but, for example, |mach try| will push code to the try server, and |mach mdn| will search MDN.

However, to address the original point, Python is indeed widely used at Mozilla, not just for websites but for test harnesses, build tools, analysis scripts, etc. It wouldn't surprise me if it was second only to JavaScript on the basis of "number of developers who have written >0 lines of code in this language for Mozilla projects".

Ha, thanks for pointing that out. You're obviously correct about it being a command dispatch framework. I've only ever used it as an interface to build and never realized what it actually did behind the scenes. And after reading a little more about it after you pointed out my mistake I have learned about an awesome tool.

I don't know how many people say this, but I appreciate the pedantry!

Mozilla's websites notably use Python and Django.

Same comment and replies from four days ago (though not made by you) https://news.ycombinator.com/item?id=12233988

> The build process for Mozilla add-ons was even converted from Python to Javascript to eliminate Python from the build chain.

Which part of the build process? I don't think the build process is written in Javascript / NodeJS.
