Grumpy: Go running Python (googleblog.com)
1411 points by trotterdylan on Jan 4, 2017 | 451 comments



- Amusingly, it runs Python 2.7, even though this project started long after Python 3.x came out.

- It's a hard-code compiler, not an interpreter written in Go. That implies some restrictions, but the documentation doesn't say much about what they are. PyPy jumps through hoops to make all of Python's self modification at run-time features work, complicating PyPy enormously. Nobody uses that stuff in production code, and Google apparently dumped it.

- If Grumpy doesn't have a Global Interpreter Lock, it must have lower-level locking. Does every built-in data structure have a lock, or does the compiler have enough smarts to figure out what's shared across thread boundaries, or what?


> Amusingly, it runs Python 2.7, even though this project started long after Python 3.x came out.

Basically, we needed to support a large existing Python 2.7 codebase. See discussion here: https://github.com/google/grumpy/issues/1

> It's a hard-code compiler, not an interpreter written in Go. That implies some restrictions, but the documentation doesn't say much about what they are. PyPy jumps through hoops to make all of Python's self modification at run-time features work, complicating PyPy enormously. Nobody uses that stuff in production code, and Google apparently dumped it.

There are restrictions. I'll update the README to make note of them. Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.

> If Grumpy doesn't have a Global Interpreter Lock, it must have lower-level locking. Does every built-in data structure have a lock, or does the compiler have enough smarts to figure out what's shared across thread boundaries, or what?

It does fine grained locking. Mutable data structures like lists and dicts do their own locking. Incidentally, this is one reason why supporting C extensions would be complicated.
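
Conceptually, it's as if each mutable object carries its own lock. A rough Python sketch of the idea (purely illustrative -- the real thing happens in the Go runtime):

  import threading

  class LockedList(object):
      def __init__(self):
          self._lock = threading.Lock()
          self._items = []

      def append(self, item):
          # each container serializes access to its own state,
          # instead of one global lock serializing everything
          with self._lock:
              self._items.append(item)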


> Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.

What about stuff like literal_eval? Or even just monkeypatching with name.__dict__[param] = value ?

> It does fine grained locking. Mutable data structures like lists and dicts do their own locking. Incidentally, this is one reason why supporting C extensions would be complicated.

Would there be a succinct theoretical description of exactly how that's implemented anywhere? What about things like numpy arrays?


> > Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.

> What about stuff like literal_eval? Or even just monkeypatching with name.__dict__[param] = value ?

literal_eval could in principle be supported I think. name.__dict__[param] = value works as you'd expect:

  $ make run
  class A(object):
    pass
  a = A()
  a.__dict__['foo'] = 'bar'
  print a.foo
  bar
EDIT: fixed formatting


Hmm, numpy isn't pure python, is it? If I read correctly this only works with pure python.


By volume numpy is mostly assembler written to the Fortran ABI (it's a LAPACK/BLAS-etc wrapper).


NumPy is a library that provides typed multidimensional arrays and functions that run atop them. It does provide a built-in LAPACK/BLAS or can link externally to LAPACK/BLAS, but that's a side effect of providing typed arrays and is nowhere near the central purpose of the library.

Also, NumPy is implemented completely in C and Python, and makes extensive use of CPython extension hooks and knowledge of the CPython reference counting implementation, which is part of the reason why it is so hard to port to other implementations of Python.


Having typed arrays without efficient functions over them would be rather pointless.


Are you sure you aren't mistaking numpy for scipy?


numpy is the foundation of scipy.


Is there not a single namedtuple in the entire Google codebase? That's strange :o


Heh, I came across the namedtuple exec thing the other day when I was trying to get the collections module working :\

namedtuple will have to be implemented differently. I think it can be accomplished by defining the class with type()? Maybe with a metaclass...


> I think it can be accomplished by defining the class with type()?

I've done it using more or less that method. The code is in the "coll" sub-package of my plib.stdlib project; the Python 2 version is here on bitbucket:

https://bitbucket.org/pdonis/plib-stdlib/src


You won't get exact compatibility, but a metaclass implementation would give almost all the features. I can't remember what exactly you give up, but I did that once and I lost some introspection friendliness.


Nevermind, all you need is type(). Metaclass unnecessary.
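
Something like this minimal sketch (field names split on whitespace, no defaults or _replace; the names here are hypothetical):

  import operator

  def simple_namedtuple(typename, field_names):
      fields = tuple(field_names.split())

      def __new__(cls, *args):
          if len(args) != len(fields):
              raise TypeError('expected %d arguments' % len(fields))
          return tuple.__new__(cls, args)

      namespace = {'__new__': __new__, '__slots__': ()}
      for i, name in enumerate(fields):
          # each field is a read-only view onto a tuple slot
          namespace[name] = property(operator.itemgetter(i))
      return type(typename, (tuple,), namespace)

  Point = simple_namedtuple('Point', 'x y')
  p = Point(1, 2)
  print p.x, p.y  # 1 2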


Are namedtuples that popular? They always felt awkward to me. For a temp variable with multiple values inside a loop, I use either a normal tuple or a dict. For passing data around, a dict or a real class. I never got the huge win from namedtuples.


namedtuples are tuples, meaning they are stored efficiently and are immutable (thus they can also be used as dictionary keys). Unlike regular tuples, they can be accessed like a class/dictionary for readability, but they require far fewer allocations (compared to a dict/class), so they're much faster. Also, as they are tuples, you get well-defined methods (printing, comparison, hashing, etc.) that you'd have to implement yourself for a dict/class.

If you like writing in functional style, namedtuples are much more natural than dicts or classes, and more efficient to boot.
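
A quick illustration of those points:

  from collections import namedtuple

  Point = namedtuple('Point', 'x y')
  p, q = Point(1, 2), Point(1, 2)
  print p == q        # True: comparison comes free from tuple
  print {p: 'ok'}[q]  # hashable, so usable as a dict key
  print p.x, p[0]     # named access and plain tuple indexing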


Attrs (https://attrs.readthedocs.io/) replaced namedtuple for us (and many others). It's slightly more verbose but allows all class goodness such as methods, attribute validation, etc.


Doesn't work for everything, but you can subclass a namedtuple:

  from collections import namedtuple
  
  class Foo(namedtuple("Foo", "a b c")):
      @property
      def sum(self):
          return self.a + self.b + self.c
  
  
  f = Foo(1,2,3)
  print f.sum


That doesn't look super awesome to me. I.e. classes or attrs both seem better.


Aaaaaarrrrrgggggh! I've had that particular itch for every one of my ten years with python, and at last I get to scratch it!

Thanks so much for bringing it up.


We use them extensively in our API client code to pass back immutable, well-defined data structures. Dictionaries and classes are mutable and then each layer of code tends to sloppily change them however is convenient, meaning the underlying data can end up being represented differently in different code flows.

Namedtuples are a way to preserve the data unless the consuming code _really_ wants to change it, which is sometimes legitimate.

I'm not totally sold, as in some cases dictionaries or classes would add nice value. But namedtuples have a rigidity that makes you think twice before tampering with retrieved data.


In every introductory Python course, tuples are presented as just immutable lists. However, a more accurate way of describing tuples is to think of them as records with no field names. When you see tuples as records, the fact that they are immutable makes sense, since the order and quantity of the items matters (it remains constant). Records usually have field names, and here is where namedtuples come in handy. It also helps to clarify what tuples are (see https://youtu.be/wf-BqAjZb8M?t=44m45s, just a 2-minute clip). If you are thinking "why not just define a class?", I will give you a couple of reasons:

1) You know beforehand that the number of items won't be modified and that the order matters, since you are handling records. So it is a simple way of enforcing that constraint.

2) Because they extend tuple, they are immutable too and therefore don't store attributes in a per-instance __dict__; field names are stored on the class, so if you have tons of instances you save a lot of space.

Why create a class if you probably just need read-only access? But what if you need some method? Then you can extend your namedtuple class and add the functionality you want. If, for example, you want to validate the field values when creating the namedtuple, you can create your own namedtuple by overriding __new__ (sketch below). At that point it is worth taking a look at https://pypi.python.org/pypi/recordclass.
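
For example, a toy version of that __new__ trick (hypothetical Point):

  from collections import namedtuple

  class Point(namedtuple('Point', 'x y')):
      __slots__ = ()  # keep the no-per-instance-__dict__ savings

      def __new__(cls, x, y):
          # validate field values at construction time
          if x < 0 or y < 0:
              raise ValueError('coordinates must be non-negative')
          return super(Point, cls).__new__(cls, x, y)

  print Point(1, 2)  # Point(x=1, y=2)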


(1) A grep can identify all the occurrences; a sed might even fix them. (translated to Google internal tools obviously)

(2) Apparently setting a __dict__ key works; they could be implemented like that.


There are some defined in built-in modules. Even in Python 2.7, where sys.version_info is a namedtuple.


Yeah, one of the motivations for adding namedtuple to stdlib was a drop-in compatible upgrade of existing interfaces returning tuples. Notable atrocities included `time.localtime()` returning a 9-tuple, and `os.stat()` returning a 10-tuple...
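
Both now return struct sequences, so you get names instead of magic indices:

  import os
  import time

  st = os.stat('.')
  print st.st_mtime          # instead of st[8]
  t = time.localtime()
  print t.tm_year, t.tm_mon  # instead of t[0], t[1]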


> There are restrictions. I'll update the README to make note of them. Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.

I'm guessing pretty much the entire AST module is a no-go?


I think the CPython AST module is written as a C extension module so currently it's a no-go. I don't think there's a fundamental reason Grumpy couldn't run a pure Python AST module, though.


The ast module itself is in Python, but it imports the _ast module which is an extension module. This actually isn't that big of a deal, though, as the entire AST is defined in a DSL (see https://cpython-devguide.readthedocs.io/en/latest/compiler.h... for some details), so you just have to write some code to generate _ast in Python instead of C (which PyPy may have already done).
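
For illustration, the split under CPython (the helpers live in ast.py, the node classes come from the _ast extension):

  import ast

  tree = ast.parse('x = 1')    # parse helper lives in ast.py
  print type(tree).__module__  # _ast: the nodes are extension types
  print ast.dump(tree)         # Module(body=[Assign(...)])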


So I take it that means Grumpy can't run itself?


Correct, Grumpy cannot yet run Grumpy :)


> I'll update the README to make note of them.

I managed to run into 2 trying to build a 5 line program :-)

  $ cat t.py; ./tools/grumpc t.py  > t.go;go build t.go;echo '----';./t
  import sys
  print sys.stdin.readline()
  ----
  AttributeError: 'module' object has no attribute 'stdin'
  $

  $ cat t.py ;./tools/grumpc t.py
  c = {}
  top = sorted(c.items(), key=lambda (k,v): v)
  Traceback (most recent call last):
    File "./tools/grumpc", line 102, in <module>
      sys.exit(main(parser.parse_args()))
    File "./tools/grumpc", line 60, in main
      visitor.visit(mod)
    File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
      return visitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/stmt.py", line 302, in visit_Module
      self._visit_each(node.body)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/stmt.py", line 632, in _visit_each
      self.visit(node)
    File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
      return visitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/stin visit_Assign
      with self.expr_visitor.visit(node.value) as value:
    File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
      return visitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/expr_visitor.py", line 101, in visit_Call
      values.append((util.go_str(k.arg), self.visit(k.value)))
    File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
      return visitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/expr_visitor.py", line 246, in visit_Lambda
      return self.visit_function_inline(func_node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/expr_visitor.py", line 388, in visit_function_inline
      func_visitor = block.FunctionBlockVisitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/block.py", line 432, in __init__
      args = [a.id for a in node_args.args]
  AttributeError: 'Tuple' object has no attribute 'id'


Ugh, sorry about that. There's a couple issues here:

1. Lambda tuple args are not yet supported -- I actually didn't know that was a thing :\ -- https://github.com/google/grumpy/issues/17

2. Even if that worked properly, sorted() is not yet implemented: https://github.com/google/grumpy/issues/16


Yeah.. It also used to work with def, but it was removed in python3. You can do this in 2.7:

  def func((a,b)):
      return b

  mytuple = 1,2
  print func(mytuple)
in py3 you need

  def func(t):
      a,b = t
      return b
Not sure if

This is probably the cleaner way to write that:

  key=operator.itemgetter(1)
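
So the failing example from upthread would become, in full:

  import operator

  c = {'a': 3, 'b': 1}
  # no tuple-unpacking lambda, so this is also valid Python 3
  top = sorted(c.items(), key=operator.itemgetter(1))
  print top  # [('b', 1), ('a', 3)]

(Though per the comments above, sorted() itself still needs implementing in Grumpy.)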


sorted() is widely used; adding it will extend coverage considerably.


> Basically, exec and eval don't work.

Couldn't they be supported with a slower runtime implementation? Either way, I still love the idea.


> - Amusingly, it runs Python 2.7, even though this project started long after Python 3.x came out.

Python 2.7 is what's running at Google. Not really surprising they're looking at this considering the fast approaching end of (core dev) support for Python 2.7.

Write an interpreter in another language and programmatically port modules to Go. Seems pretty sensible to me.


Given the failure of their unladen-swallow work to make it into CPython, I think Google is probably tired of trying to make Python faster. Some of their stated goals with Go were to be a faster, compiled Python, so this makes a lot of sense for their use case. They face the choice of fixing all their existing Python code to run in Python 3 (which won't make anything faster), or just porting everything to a different language. They chose the latter, and this lets them incrementally convert Python code to Go. I don't know that this makes sense for anyone but Google, just like Hack probably doesn't make sense for most PHP development that's not at Facebook.


> I don't know that this makes sense for anyone but Google

I'm not sure; as a Go developer, I kind of like the idea of having access to the Python library ecosystem from Go, without being forced to create an IPC bridge and building up the requisite release-management and deploy-time goop.

Plus, I'm just not a Python developer; in the case where the only library that exists to do something is written in Python, I'd much rather write Go that calls that Python library than Python that calls that Python library.


IIUC, it's not about accessing Python libs from Go. It's for accessing Go libs from your Python program, transpiling that Python code to Go source, and compiling it with the Go toolchain.

Eg: python code (from blog post)

  from __go__.net.http import ListenAndServe, RedirectHandler

  handler =  RedirectHandler('http://github.com/google/grumpy', 303)
  ListenAndServe('127.0.0.1:8080', handler)


sure but the reverse should be equally feasible. It's transpiling Python to Go, so theoretically we should be able to (eventually) "convert" Python libs to Go and call them from Go. A lot of utility libs are available in Python... the Go library ecosystem is relatively sparse


I'm not sure if that's already possible. Is it?


> I kind of like the idea of having access to the Python library ecosystem from Go

I'd like to see an example of this, because from the blog post I get the impression that this mostly allows accessing the Go ecosystem from Python, rather than the other way around. For example, how would Python classes be handled from Go?


Embed a CPython interpreter into the Go runtime?


Statically linked Python interpreter sounds pretty great to be honest.


> Python 2.7 is what's running at Google. Not really surprising they're looking at this considering the fast approaching end of (core dev) support for Python 2.7.

I'd prefer that all new Python tools that need to support 2.x also support 3.x. It's an additional development cost, but IMHO, a worthwhile investment in the future.


Well, the difference here is that Google seems to be looking at Golang as the future for their internal tooling currently implemented in Python 2.x, instead of Python 3.x. I'm curious to know how much additional work might be necessary for this to support 3.x, but it doesn't sound like that's part of their use case.


While python2 may be the past, I think that for many python3 is not the future.


Highly depends on the use case.

Python 3.5 with uvloop+sanic can be faster than node.js without any JIT:

https://github.com/channelcat/sanic#benchmarks


Still slower than OCaml, Haskell, Java or .NET.


Mmm, "faster than node.js" isn't a great benchmark out in the wide world. Although node.js being as fast as it is remains an astonishing thing.


Who cares about the speed of the interpreter? The interpreter's job is to orchestrate high-performance components written in some high-performance language. If your interpreter is dominating execution time, you should move some of your logic to native code.


Better not to use an interpreter in the first place, but rather a language with a REPL that allows compilation straight to native code.


Why? Native code is costly: machine code generally has a much bigger footprint than interpreter bytecode. In some cases, interpreted code can be faster due to cache effects and reduced IO load making it faster to be smaller.


I am yet to see such benefits in action.

The fact that Google has started this project to migrate away from Python to an AOT compiled language, shows where the performance wins are.


>I am yet to see such benefits in action.

Here you go:

https://morepypy.blogspot.com/2011/08/pypy-is-faster-than-c-...

https://morepypy.blogspot.com/2011/02/pypy-faster-than-c-on-...

For more examples, just search "pypy faster than c".

Also, here is an article from the Python wiki about why speed doesn't really matter a lot of the time:

https://wiki.python.org/moin/PythonSpeed

And, my own two cents:

Speed is relative. Does every piece of code need to be as performant as possible? No. I would argue that, in most cases, speed of development is far more important than speed of execution. This is, of course, not true for things like drivers or statistical analysis.

Writing a web application? Speed isn't that important as the whole process is i/o bound anyways.

Writing a machine learning algorithm? Depends.

Web scraping? I/O bound, speed not really important.

Image processing? Speed matters at least a little bit.

Writing networking glue for distributed systems? Speed probably doesn't matter.

It's all relative. If it needs to be fast, it needs to be fast. Most things don't really need to be fast. For the things that don't need to be fast, why build them with C/C++/Rust/Go when you could spend half the time building them in Python/Ruby/js/etc?


PyPy isn't an interpreter, you are just validating my assertion about JIT/AOT compilers.

I usually ignore it when talking about Python, because that is what the community does, by gathering around CPython.

> For the things that don't need to be fast, why build them with C/C++/Rust/Go when you could spend half the time building them in Python/Ruby/js/etc?

Because one can use languages like OCaml, Haskell, Lisp, Scheme, Racket, F#, C#,... thus having both the productivity of a REPL environment and the execution speed of native code.


>PyPy isn't an interpreter, you are just validating my assertion about JIT/AOT compilers.

I thought we were talking about Python implementations exclusively. My mistake.

>OCaml

I have no experience with OCaml, so I won't make any comments regarding its efficacy.

>Haskell

Well-thought-out language. I like the purity, but it's too academic for real-world use outside of scientific computing. FP isn't for everyone, and my personal belief regarding it is that it is better used as a tool alongside other paradigms than all by itself as the only paradigm.

>Lisp

Lisp is useful for a lot of things. It's also not very popular for new projects as far as I've seen. There are also a ton of different versions, so I don't know if "Lisp" is really a good descriptor.

>Scheme

As far as I know, Scheme is the de facto teaching language for most comp sci programs. Or, at least, it was for a long time. Once again, FP is not for everyone. A lot of people also dislike Lisp-style syntax, myself included.

>Racket

Same issues as Scheme.

>F#

F# is a fantastic language. There's not really a whole heck of a lot to complain about other than the .NET implications. The only detriment relative to Python is F#'s much smaller ecosystem and community.

>C#

Once again, some people just don't like .NET stuff. A lot of people also see static typing to be a detriment in many use cases.

Relative to Python, these languages also share several other problems when it comes to real-world application: lack of competent developers, stagnating ecosystems, lack of third-party libraries, ecosystem lock-in, and cross-platform compatibility issues. In the case of languages like Haskell, they could even be considered "esoteric".

I'll give you that many of these languages are more "pure" or "logical" than Python. I'll even give you that most of them are designed much better than Python. None of that changes the fact that Python is overall easier to read, easier to learn, easier to write, has a better ecosystem, is platform and file system agnostic, has a very non-restrictive license, and is, overall, very pleasant to work with.


Actually, they said they would keep programming in python.


Well... yes. That's basically my point. Comparing speed to node.js is pretty much useless because if you really care about speed you're not using node in the first place.


From Google's POV, Go is the future, not Python 3, they created it for this reason. As mainly a sysadmin these days, I tend to agree with them for their use-case. For deployment, performance and overhead, Go is great for system tools, which is reflected in the new toys in the "sysops" toolbox. Pretty much every-one of them is written in Go these days, where that used to be Python or for a brief period Ruby or C/C++ for the more performance sensitive stuff.


Python3 was never the future of Python. It never got over the hump all new languages need to clear to reach relevance.

It's likely code using lots of C-extensions will continue with CPython2 and new code will be written in Grumpy (pure Python2).


That is pretty ridiculous. Pretty much all major libraries are Python 3 compatible and everyone is writing Python 3 (or should be). (Yes, I'm still on Python 2 but moving soon).


My python course at uni focused on python3. We talked about differences and toyed with both interpreters, but ultimately wrote all projects and assignments in 3. I'm sure this is the case for students at other schools too.


I've been writing Python 3 for years.


That's top-down change from the PSF/core dev team, and some major library developers added dual 2/3 support. The users never arrived, and next December it will be 10 years since Python3 became available. I'd like to unify Python again, but we as users didn't break it either. After 10 years, any reasonable person in charge would hang it up or change course.

Grumpy is pretty much what most everyone would actually want out of a new Python and may have arrived just in time.


> some major library developers added dual 2/3 support

> The users never arrived

That's the way it used to be a few years ago - it changed a lot the last few years. Pretty much all libraries are ported and many new libraries are Python 3-only.

asyncio is nice.

All Python devs I personally know moved to Python 3. Porting is a lot less painful than it used to be.


Gevent is nice. AsyncIO is terrible[0].

[0]http://lucumr.pocoo.org/2016/10/30/i-dont-understand-asyncio...


It's actually ridiculously nice, especially with the new async/await constructs.


Python 3 is largely cross-compatible with Python 2. If you're not working with unicode, not reliant on true division, and not using asyncio, then chances are what you produce will work just fine on Python 2.


>(Yes, I'm still on Python 2 but moving soon).

I'm like really new to programming and I'm still just learning the basics, but I see this little addendum a lot from people who say everyone should be writing Python 3.


Me too. I migrated a big project from 2 to 3 two years ago. And let's get this straight: innovation happens on Python 3. So staying on 2 would feel like riding a dead horse. I know I did the right thing.

And if Google can make an interpreter for 2, then sooner or later, one for 3 will pop up. Since Google made some restrictions on what Grumpy supports from python 2, I'm sure someone somewhere will be able to do the same stuff for three.


It's true, but moving our huge codebase to Python 3 is a big undertaking. We're making progress towards it by using Python 3 constructs for new files, etc. For my personal projects I'm already in the process of moving over.


So is it just a time issue, or are there compatibility reasons why not all of your code is Python 3?

I ask because I started to learn to code with Python 2 because that's what was preloaded on my system. Is one over the other a big enough deal at a beginner level that I should switch to 3 now? How much of a learning curve am I in for?


> Is one over the other a big enough deal at a beginner level

No, at a beginner level it's not, there are many guides that explain the differences at a beginner level, and you can go through those in a few hours at most, for example http://python-future.org/compatible_idioms.html

But, if you start working now on a Python 2 project and that project starts growing significantly, then it will be hard to convert the codebase. That's why you can see people saying that they didn't switch yet, it's not that they don't know Python 3, it's that upgrading large legacy code bases is hard (not only in Python).


For me, the hardest part was migrating from Python 2 string to Python 3 Unicode string. But at the same time, this was a huge improvement for my code base because I work with several languages and unicode makes that much easier/safer. So it was a good thing.

Now the rest of upgrades were a bit painful (I was using some functional programming stuff, httprequest libraries, etc.)


Okay. Thank you for the advice.

My biggest program is like 100 lines of code, maybe. So I will go ahead and switch now. But it's like 10pm where I'm at, so here's hoping I don't play too much...


Just a time issue. I don't believe the learning curve is that big; most of it would be in how you deal with strings and the `print` statement, which isn't that heavily used in web apps anyway.
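
For instance, the obvious beginner-visible differences fit in a few lines (Python 2 shown, Python 3 in the comments):

  print "hello"  # py3: print("hello")
  print 7 / 2    # py2: 3 (floor division); py3: 3.5 (true division)
  s = 'bytes'    # py2 str is a byte string; py3 str is unicode text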


What system is that? A Mac? If so, then sadly you do have to get py3 yourself. If a Linux distro, then you likely already have Python 3 preinstalled. `/usr/bin/python` won't point to python3 anytime in the near future.


Haha it's Ubuntu. After I downloaded Python 3.6 last night, I found out I had Python 3.4 already on my machine. I got excited about new software and forgot to check. It's a crutch :/


Here, I'll give you a counterexample:

We're on Python 3 for all new stuff, and are migrating the old whenever we can.


Python 2.7 is EOL in 2020. Make of that what you will :/


The community will fork the 2.7 codebase and continue to support it, even if Python.org EOLs it.


It looks like all Python 2.8 is missing is a new name:

https://www.naftaliharris.com/blog/why-making-python-2.8/


I can see "the community" doing security fixes, maybe some bug fixing and a few back ports, but so far I havn't seen much effort from any community to bring active development of new features.


It seems like the 2.7 community is happy enough without the new features -- just need the bugfixes and continued backwards compatibility to keep that segment happy.

From a new features perspective, the other reply's Placeholder is fascinating. (I haven't looked into it thoroughly yet.)


It is certainly interesting if it truly materializes. But it looks like the plan is mostly to backport stuff from py3, so py3 is still the future, with Placeholder getting some of those features eventually. And that is great from a legacy-codebase perspective.


Right.


That means nothing. Python 3 was also expected to be mainstream by 2015, but it's nowhere near even 30% yet.


It's mainstream for new projects. Which is what was expected and aimed for.


Citation needed. Most companies I know that have 2.7 older projects also do new projects in 2.7. They don't want to introduce 2 different versions of the language, set of dependencies etc to their production.


absolutely true. imagine porting a decade or more of code for only the promised benefit of a more "pure" language. asyncio is nice, but excluding it and enhanced generator syntax from 2.7+ is >policy<, not engineering. python 3 is bootheel style top-down engineering >management<, not good engineering. The BFDL is fallible. Placeholder looks like a great Python 2.8+ to me. Runs all my old code and gives me new syntax, without rejiggering the stdlib for purity's sake? Twist my arm.


>and everyone is writing Python 3 (or should be)

You'd be surprised.

If anything, the numbers show the opposite. The vast majority of Python codebases, legacy or new, are 2.7 or older.


Which numbers? People keep saying things like that, but things like the Python 3 Wall of Superpowers https://python3wos.appspot.com/ don't seem to support it. Do you have something more concrete?


The "Wall" is just numbers for Python3, without context for how it compares to Python 2.

For 2.7 Pypy reported 419,227,040 downloads for 2016.

At the same time, for ALL 3.x versions combined (up to 3.6) there are just: ~52 million downloads.

That's 1/8th of the Python 2 downloads.


Given that there are only 7.5 billion humans on the planet, and that rather significantly fewer than 1 in 20 people are PyPy-using developers, perhaps those numbers should be taken with a grain of salt?

The message I would take from those statistics is that needing a fresh download of Pypy is less common among 3.x users than among 2.7 users, who apparently needed to reinstall from the web at least a few times a day during 2016.


> The message I would take from those statistics is that needing a fresh download of Pypy is less common among 3.x users than among 2.7 users, who apparently needed to reinstall from the web at least a few times a day during 2016.

Occam's Razor would suggest that there are fewer Python 3 users


It should be PyPI rather than PyPy in the parent, FWIW.


Oops, mea culpa.


>Given that there are only 7.5 billion humans on the planet, and that rather significantly fewer than 1 in 20 people are PyPy-using developers, perhaps those numbers should be taken with a grain of salt?

Those are not downloads of PyPi, but of packages. It's not like "number of downloads == number of individual developers". Those are packages, including package updates. A single developer can download 50 deps across his codebase, and update them to later versions 2-3 times a year.


> A single developer can download 50 deps across his codebase, and update them to later versions 2-3 times a year.

As if individual developers are the reason behind the bulk of the downloads. I wonder how many downloads Travis alone counts for?

Your hate of Python 3 in every discussion about it is frankly baffling.


>As if individual developers are the reason behind the bulk of the downloads. I wonder how many downloads Travis alone counts for?

Travis runs/tests user projects, so there's nothing about it that's especially partial to Python 2 over Python 3.

>Your hate of Python 3 in every discussion about it is frankly baffling.

Or, you know, my pragmatic assessment of its popularity.

That you'd even use the word "hate" (when in fact, I like Python 3 over 2.7, even if it's mostly tame updates over what 2.7 offers) shows that you're probably too partisan. I was enthused with Python 3 even when it was only a vision called Python3K back in 2000-ish. My personal preference has nothing to do with whether I see more people using it or not.

The situation is not unlike the perennial "next year is when Linux dominates the desktop", which has been every year since 1999.


> even if it's mostly tame updates over what 2.7 offers

> The situation is not unlike the perennial "next year is when Linux dominates the desktop", which has been every year since 1999.

Your bias is showing, as it does in every comment section on this site regarding Python 3, as you make comment after comment about how inferior Python 3 is and how nobody is using it at all because your sample of 2 companies shows this and how it personally hurt your family or whatever. You don't stop. Either you hate it or you hate something else and use Python 3 as a vent.


If by "your bias" your mean my assesment of the state of Python 3 vs 2, that doesn't change depending on whether I like the language or not, then we agree.

>as you make comment after comment about how inferior Python 3 is

Actually, I've never made any such comment. In fact, "tame updates" means that IT IS an update over 2.x, only not as much as it could be. Which most people I've read agree with, or at least agreed with until the async stuff.

>and how nobody is using it at all because your sample of 2 companies shows this and how it personally hurt your family or whatever.

Notice how I never said that, but actually gave concrete numbers that put those using it at far fewer (around 1/8) of those using 2.x?

So why the lie? Fewer is not the same as "nobody at all", and that doesn't change by itself just because you really, really wish more people used 3.

>You don't stop.

Yeah, I continue expressing my opinion and my argumentation. I should stop because you happen not to like it?

Please don't bring "the feelz" into technical and community discussions. It cheapens the argumentation. If anything, it's you who are biased: 80% of your submissions on HN are for Python stories.

One can acknowledge that D is way less popular than Golang or that Perl 6 failed to gain traction over 5, without hating Perl 6. Ditto for Python.


How can you accuse him of that? You're rebutting everyone with Python3 criticism, including myself. This is a prime example of projection. You're the one on a rampage, against Python2.

From what I've seen of his posts, he's only talking about the reality of the situation.. not "how it should be".

Go look at the stats on PyPI and other metrics. Python3 failed; there is a cutoff time for adoption. It's no different than the first 24 hours of a missing person report. You don't get eternity to see if something is going to pan out or not. We're past that point for Python3. It may survive as its own (smaller) thing, but Python2 isn't going to die either, and that's more assured than Python3's fate.

And coldtea is right, but we're not going to do your research for you. What I'm saying needed to be said to you, but you need to find better ways to contribute than just rebutting everyone who has something to say about Python3. Talking about how he hates "something else" and using Python3 as a vent is just ridiculous.


That response is talking about something else because your original comment accidentally said "PyPy", which is an implementation of Python, instead of "PyPI", the package repository.


A huge confounding factor: newer Py3 codebases are more likely to be built with newer pipeline tooling like devpi (to cache PyPI downloads), wheel (to cache locally-built packages), and Docker (which caches all the things).

Our legacy Python 2 build pipelines that we're actively moving off of hit PyPI far more often than our Py3 processes.


>A huge confounding factor: newer Py3 codebases are more likely to be built with newer pipeline tooling like devpi (to cache PyPI downloads), wheel (to cache locally-built packages), and Docker (which caches all the things).

Maybe in your case, but from what I've seen, I seriously doubt use of Docker or Devpi makes any dent in newer Py3 codebase dependency downloads. Besides, tons of new codebases for greenfield projects are still done in 2.x Python.

Not sure how it is in scientific computing area, but for enterprise/web apps, any company that has legacy 2.x code and libs in production (which is most of them) will continue to write new parts (including new projects) in 2.7 for compatibility with their Python production setup.

3.x is either from companies that didn't already have significant 2.x Python code in production (generally newer companies that for some reason went with Python instead of Node or Go that the cool kids use) or new programmers that just get started and start with 3.x.


The PyPy statistics aren't worth much since they're counting all sorts of automated downloads/dependencies/etc.

That's why packages like supervisor and graphite - which aren't libraries - are among the top downloads.


>The PyPy statistics aren't worth much since they're counting all sorts of automated downloads/dependencies/etc.

Those would exist for both 2.x and 3.x so it's not a differentiating factor.


There's plenty of 2.7 out there, but we're moving over slowly, basically due to the nice function annotation/type checking work. That's, to me, the first really compelling reason to use 3.


Hell, even OpenStack (which is giant tangled mess of code written by like a few dozen teams) is making good progress.


> - It's a hard-code compiler, not an interpreter written in Go. That implies some restrictions, but the documentation doesn't say much about what they are. PyPy jumps through hoops to make all of Python's self modification at run-time features work, complicating PyPy enormously. Nobody uses that stuff in production code, and Google apparently dumped it.

Does it really imply many restrictions? Common Lisp, for example, is probably more dynamic than Python and it's been a compiled language for ~20 years.


> ~20 years.

Common Lisp was designed for interpretation and compilation from day one. The first implementations from 1984/85 had already compilers.

> Common Lisp, for example, is probably more dynamic than Python

Some parts are more dynamic than Python, some not. For example everything that uses CLOS+MOP is probably more dynamic. Also some stuff one can do when using a Lisp interpreter may be more dynamic. CL is more static, where one uses non-extensible functions, type declarations, static compilation, inlining, ... The parts where a CL compiler achieves good runtime speed may not be very 'dynamic' anymore.


20 years only takes us back to 1997, several years after Common Lisp was finally standardized. The ancestral dialects of Common Lisp were compiled as far back as the early 60's.


I didn't mean to imply that Lisp was only 20 years old or that common lisp was precisely 20 years old (I used the `~` to indicate that Common Lisp was approximately 20 years old), I just wasn't sure whether Lisp has always been a compiled language, so I restricted my claim to being a claim about Common Lisp and estimated its age conservatively.


In regards to the third point: the global interpreter lock exists because Python's GC scheme (reference counting) is not thread safe. The GIL does not coordinate accesses across threads, and therefore Grumpy's replacement would not either. In Grumpy the GIL is replaced by Go's GC implementation, which is specifically tuned for multithreaded execution. Any additional synchronization would need to be done with individual locks, etc.


The GIL is not just for GC; it does coordinate access across threads, albeit at a very low level -- if two threads execute "mylist.append(v)" at the same time, it is the GIL that makes sure it actually works as expected, and from comments above it seems grumpy uses per-object locks for that.


That's not a good characterization of the GIL. It doesn't prevent races or make multithreaded mylist.append safe. It makes sure that mylist.append doesn't cause a segfault as the bytes in RAM are in an inconsistent state during an update. Beyond that, it doesn't really protect you from your bad threaded code.


(assuming mylist is a standard python list) it does prevent races inside mylist.append. It does make mylist.append safe.

When I wrote "append from two threads .. as expected" I meant "two items will be added, which one first is unspecified", and the GIL certainly takes care of that.

I agree it does not protect you from your bad threaded code - but then, nothing short of STM does (and even STM doesn't guarantee freedom from starvation in the general case - nothing can).


An unintended side effect of the GIL is that all calls to a C implemented function are atomic and single threaded, provided that the C function doesn't release the GIL. In practice this means lists and dicts are thread safe and existing Python code relies on this.
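
A toy demonstration of that reliance (CPython semantics; per the comments above, Grumpy gets the same result from per-object locks instead of the GIL):

  import threading

  items = []

  def worker():
      for _ in xrange(100000):
          items.append(1)  # atomic under the GIL; no explicit lock

  threads = [threading.Thread(target=worker) for _ in range(4)]
  for t in threads:
      t.start()
  for t in threads:
      t.join()
  print len(items)  # 400000 every time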


No support for 3 makes sense. This is all about building an off-ramp to put Python behind them.


I was in my second year of college when the Python 2 vs Python 3 fight had already been running for a couple of years. Is this fight *still* not resolved? I'm not a Python developer, so I'm out of the loop.


The arguing will continue for years after Python 2 is legitimately dead, but the shift has been happening and will continue to happen. Python3 is the future, and more and more new projects are being started with it, and more and more legacy 2.7 codebases are being moved to Python 3 or deprecated in favor of Python 3 replacements.


> Nobody uses that stuff in production code

Nobody uses the features of Python which make it a dynamic language? Google must write some really weird Python if their compiler is that strict.


In Python, you can get at a variable, or even code, in another thread with "getattr()". You can monkey-patch another thread while it is running. This is not very useful, but it's easy to implement in a naive interpreter such as CPython. Part of the price for this is the Global Interpreter Lock, so you don't really have two threads running at once. PyPy has a huge amount of machinery so that stuff will work.

Grumpy doesn't even seem to try to implement that. That's a good thing. If you restrict Python a little, it's much easier to compile.


> That's a good thing. If you restrict Python a little, it's much easier to compile.

Isn't that more or less what RPython does? https://rpython.readthedocs.io/en/latest/architecture.html I mean, I know that starting with a full-fledged(?) Py27 codebase rules out _actually_ using RPython for the stated goals of Grumpy, but I think the two projects agree in principle and differ about the definition of "restricted" :-)


RPython is a restricted (hence R) language specifically for VM development, it is not a general-purpose language.


>Nobody uses the features of Python which make it a dynamic language?

Python has TONS of dynamicity besides those (eval and co), which are seldom used by anyone anyway....

If you think eval is what makes Python dynamic you're doing it wrong...


I'm not sure what is meant by "dynamic language" in that sentence, but examining the compiler output, it supports the features of Python which I think of when I think of "dynamic language" (e.g., a class `Foo` gets compiled into a runtime `*Object` with collections of properties and methods, not into a `type Foo struct` with fixed fields and methods).


Anybody running Django uses this. It uses the pattern of specifying plugins as class paths in strings in the config, which are then looped over and instantiated at runtime.

Frameworks do lots of such dynamic tricks in order to provide nice DSLs for building apps.


'exec' and 'eval'? No it's not, the importlib machinery is used (which doesn't just eval(read('import.py')))


I have never used exec or eval in production Python, and I doubt I could get them past code review because of the possible security impact.


Their python is weird if it doesn't use eval?


or getattr/setattr, or dynamically building classes with type(), or probably more features I can't remember now.


  > getattr/setattr, or dynamically building classes with type()
I think Grumpy handles both those things fine.
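
For reference, the kind of dynamism in question (plain Python 2, hypothetical names):

  class Config(object):
      pass

  c = Config()
  name = 'debug'                 # attribute name chosen at runtime
  setattr(c, name, True)
  print getattr(c, name, False)  # True

  # building a class dynamically with type():
  Point = type('Point', (object,), {'x': 0, 'y': 0})
  print Point().x                # 0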


What does "hard-code compiler" mean?


It seems to be that it pesudo-transpiles python to go and compiles that down using a normal go toolchain.


What's the difference between transpiling and pesudo-transpiling? (Even if you meant pseudo-transpiling, I still don't know what the difference between that and transpiling is.)


That was a typo. I meant pseudo. I don't actually know that there is any difference in this case. I made that statement hastily. It transpiles, nothing pseudo about it.


Transpilers with no runtime.


It's written to run Python 2.7 because these problems are largely solved in Python 3, and need solving for people on Python 2.x versions.

"Upgrade to Python3" is the usual defense to that, but it's not really practical for large companies with software such as YouTube completely written in Python 2.x.


No, it's written to support 2.7 because the majority of Google's code they want to overhaul is in 2.7. I don't think much of this is fixed by Python 3; at the very least, the speed benefits you get don't even compare. See the graph in the OP comparing # of threads using CPython and Grumpy.


How is it not practical to upgrade to Python 3, yet it is practical to rewrite in Go?


As someone who works on both python and go day to day, I find this to be quite interesting.

Just tried this out on a reasonably complex project to see what it outputs. Looks like it only handles individual files and not any python imports in those files. So for now you have to manually convert each file in the project and put them into the correct location within the build/src/grumpy/lib directory to get your dependencies imported. Unless I missed something somewhere.. The documentation is a bit sparse.

Overall I think the project has a lot of potential and I'm hoping it continues to be actively developed to smooth out some of the rough edges.


Thanks for trying it out! And sorry about the lacking documentation. I'll be fleshing it out over the next little while.

Your assessment is right: the grumpc compiler takes a single Python file and spits out a Go package. Incidentally, this means you can import a Python module into Go code pretty easily.

I don't have a ready solution for building a large existing project but I'll write up a quick doc to outline the process. The trickiest bit is that the Python statement "import foo.bar" translates to a Go import: import "prefix/foo/bar". Currently prefix always points at the grumpc/lib directory so that's one way to integrate your code, but I need to make it more configurable.


I hope this is a well thought out solution that can evolve into something great... and not just something built for a single purpose.

I question the transpiler. I think I'd much prefer a solution like Jython.


I'm confused because Jython runs on the JVM, but Go is a compiled language. Can you clarify?


Jython is a Python interpreter written in Java. Grumpy is a Python transpiler that converts Python to native Go object code.

Edited to add: The difference is that Jython doesn't convert Python to JVM bytecode.


What's the advantage of writing an interpreter? Go already has an excellent runtime (scheduler, GC, etc)--why should this project reimplement it?


The interpreter could still use that stuff.

One advantage of an interpreter in general is that one important use case for Python is interactive scripting, as data scientists do.


Fair point. I would think it shouldn't require too much work to build a REPL on top of this. Rather than transpiling, you would parse the Python AST into the same runtime Objects that Grumpy constructs statically. Seems straightforward conceptually, though I'm sure it would be complex in practice.


I'm sure as this gets hacked on, they'll be able to support consuming imports and doing all the conversion to Go recursively.


For those who are interested, I've used grumpy to compile the following Python code and placed it at https://play.golang.org/p/YP1SP7WsdR . (Note the playground can't run this, it just had convenient formatting support for Go; the generated source wasn't 100% gofmt compliant.)

    class Test(object):
        def __init__(self, value):
            self.value = value

        def method(self):
            print(self.value)

    class Test2(Test):
        pass

    t = Test("hello")

    t.method()
Pythonistas, note I had to have "class Test(object):" and not just "class Test:". The former compiled successfully into a Go program but that program then failed at runtime with "TypeError: class must have base classes".


Thanks for trying it out!

Yeah, Grumpy does not currently support old-style classes. Since all of our code internally requires new-style classes, this was not a high priority feature. It is something that we'll get to.


I did not mean that as a complaint against your very young codebase, I meant that as a defense against Python people complaining about my code. :)


I don't think they'd do that. All Python 2 code I've seen uses `class Foo(object)`, at least since 2.2 came out.


That's fascinating. It's creating run-time data structures similar to CPython's for data, and manipulating them with very general code. There seems to be a type comparable to Python's internal CObject, and it's used for most (all?) data. It's not generating Go that looks anything like human-written Go. There's no sign of type inference, although it's hard to tell from such a simple example. It's a lot like a Python run time environment, where everything is a CObject. Still, once you can do that, you can start optimizing, such as inferring that something is an integer and using ordinary Go arithmetic types.

All that stuff with "switch" seems to be to handle Python exceptions in a language that doesn't have exceptions. Maybe later, analysis can tell that some function can't raise an exception, and translated calls for such functions can be simpler.


> Still, once you can do that, you can start optimizing, such as inferring that something is an integer and using ordinary Go arithmetic types.

I was hoping for something more aggressive even, like compiling Python classes to Go structs so long as the program doesn't need the dynamic behavior. Alternatively, Grumpy could support declaring native Go types via some sort of pragma or a new `struct` keyword or some such, which would be treated like a normal Go object (rather than defining your Go objects in a separate Go package).


I'd expect to see that in time. If you analyze the whole program to find all the fields of an object and verify the absence of code which dynamically adds a field, you can then make it a struct of "CObject" like entries. Then, try type inference on the fields. Some will clearly be integers, booleans, floats, or strings. Those can be represented with type-specific representations.

If you can identify the built-in types, that's most of the potential win; you get to do hardware arithmetic. If you represent integers as 64 bits and check for overflow, you probably don't need bignum promotion outside of crypto code.


Are Go's integers unbounded? If not, proving the value never exceeds the range in order to silently convert what is essentially a BigInteger into an int might be hard.


Go tends to use machine sizes like C, but this implementation appears to properly handle it. The following Python code has identical output for me under Python and grumpy:

    two_32 = 4294967296
    print(two_32 * two_32)
    print(type(two_32 * two_32))
And I tested some other things I won't burden HN with, but promotion is implemented, yes.


Interesting to note that `print()` is supported out of the box--no need to `from __future__ import print_function`.


Sure, but that works without `from __future__ import print_function` in Python 2 as well.

But it gives different output; the statement print() prints a tuple whereas the function print() prints a newline.

Also compare print(1) and print(1,2) with and without the __future__ import.
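
Concretely, in a stock Python 2 interpreter:

  print(1)     # 1        -- parens are just grouping
  print(1, 2)  # (1, 2)   -- a tuple, not two values
  print()      # ()       -- an empty tuple, not a blank line

  # whereas with "from __future__ import print_function" at the top
  # of the module, print(1, 2) prints "1 2" and print() prints a
  # blank line.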


Oh, weird. I could have sworn I've seen errors for using parens with `print` in Python 2...


That's a print statement with an expression surrounded by parentheses. You can try it yourself in a Python 2 interpreter.


I can't help but see this balkanization of Python as a sign that the core language is falling apart.

How many interpreters are there now? And how many of them have even close to 100% compatibility with Python 2.7 or 3.N? Guido has lost control of the language, but since he's still officially the BDFL, there's no real standardization body. His stubborn views on functional mechanisms have held the language back syntactically, and Python 3 broke backwards compatibility without fixing the language's fundamental problems... it really feels like Python is lost in the desert.

Which doesn't mean the language is dead, but it's rudderless. I think we were all hopeful when Guido joined Google that we'd see real direction for Python, but that obviously didn't happen.

Not that Python is dead, obviously - still lots of great projects are written in Python. But I don't like the language's future.


Why do you think multiple implementations are a problem? We've got multiple Ruby runtimes, multiple BASICs, multiple JavaScript engines, multiple C compilers, etc. None of those languages are falling apart because of it.


It's not that different from what's happening with Java (via Android) or Go (with GopherJS, for example), or even C (many, many extensions). A popular language attracts implementors, even if they don't implement the whole thing on their target platform.


Actually, historical evidence suggests the opposite of your argument: multiple implementations of a language typically mean it is succeeding, not failing.


Really then, it suggests that Python 2 is succeeding, not Python 3.


GVR, the PSF and the core dev team overestimated their influence. They still truly believe the majority will come around to Python3. I agree with your sentiments and balkanization is the right word. Guido won't even read these comments. He thinks it's all some unjust slander and nonsense that will be forgotten in 3 years. :)

It is a good time to jump off the Python train in general, and I say that as someone invested in Python who loves it. If possible I'd recommend people reach for Go or Elixir depending on their needs or requirements.

I will admit I'm a little shocked how much of a failure Python3 adoption has been. I think if it had been Grumpy from the start it would've been a huge success. This is exactly what people want and Google should be commended for sharing this.

Here's to hoping Grumpy takes on a life of its own and is the new de facto Python.


Both my workplace and the one big open source project I use, homeassistant, use Python 3. Do you have any data to back up Python 3 adoption being a failure? I know it was certainly painful for many years.

As a Python 3 user, everything seems fine on my end. Though 3 has its own new warts. They are smaller and more forgivable warts for now, but it's probably not a good sign.

I do agree the direction Python is heading is not very interesting anymore, but that doesn't mean it's dead or useless now.


We try to use Python 3 at work but have to run Python 2 as well, because there are still packages that weren't upgraded and it's too much work rewriting them all for no direct benefit.

We also have third-party vendors that only support Python 3 in experimental versions, and even then not recent versions (Bloomberg is a great example).

I really like Python 3 features, but the pain of using them drives me towards other languages. I hope Julia will be stable and mature enough soon so that I can dump Python altogether. I really like Julia, but currently the language changes too fast and there are constantly incompatibilities with packages that don't update fast enough. But I'm reasonably certain that this will be fixed once they reach 1.0.

I'm sure Python is far from dead, but can imagine that Julia has the potential to kill it in many domains.


Do you do data science?


Yes, mostly for finance.


What python 3 features are painful and why?


> Here's to hoping Grumpy takes on a life of its own and is the new de facto Python.

I can appreciate you have that opinion, but I'll be livid if that's true - the last thing in the entire world I want is to do battle with dependencies and the very, very strict/opinionated Go build system.

If your idea is that all Py27 is just transpiled into Go and then jettisoned, that's fine, but keeping one foot in each world sounds terrible.


Go compiles down to native code (x86, ARM assembly, etc.). That's what Grumpy generates as well, but the code still needs to be maintained in the original Python 2 or Go (depending on what your original source is). What I'm suggesting is that Python 3 is jettisoned and Grumpy takes the forsaken throne that Python left behind.

One thing is clear with all these new compilers/runtimes: you want to be writing Python 2 syntax because that's where all the action is. I hope Grumpy succeeds, gains new features, and becomes its own ecosystem that plays nicely with Go code. These folks at Google have really done what Guido & Co should've done.

This is Python3 as most of us wanted it to be, it's worth rewriting all your code for... but you don't even have to do that. Valid Python2 is Grumpy already. I don't know what else I'd want. It compiles existing Python2 AND offers a legitimate upgrade from CPython at the same time.

As far as all of the lost C extensions? You won't need them with the performance that the Go runtime has. That's been the answer all this time, not maintaining C-extension compatibility.

They nailed this thing; it's the answer to "what's the future of Python?" that everyone has been wondering about for the past 9 years.


> you want to be writing Python2 syntax because that's where all the action is.

> This is Python3 as most of us wanted it to be.

> Valid Python2 is Grumpy already.

> It compiles existing Python2 AND offers a legitimate upgrade from CPython at the same time.

> As far as all of the lost C extensions? You won't need them

None of those statements are true. You seem to be very confused about what Grumpy can and can't do, and about what the needs of actual Python developers are.

The only benefit of Grumpy is speed (I don't think go "interop" counts). Now, that's a pretty big benefit for some, but comes with significant drawbacks and probably always will. Even though CPython is only the reference implementation, many clever people have worked to make it faster. Getting rid of the GIL is also very difficult. The easiest way to gain speed is to limit Python to a subset of features and then optimise for that. While this is a fair approach, hailing it as the future of Python is terribly misguided.


Citations needed for your assertion on each point being false.

Grumpy proves Python 2 is where the action is. Everyone wanted a speed improvement with a new Python; that's the ultimate carrot... instead, Python 3 was, and still is in some ways, slower than 2. Other than exec, eval and C extensions, Python 2 is valid Grumpy.

You didn't provide any reasoning or proof that my points, which were just reiterated, were false. If you're going to "port" anywhere from Python2, removing C extensions (which no language should have to be dependent upon anyway, so it's an improvement) and exec/eval usage is a bigger win than Python3.

The future of Python is what the users decide, not what the PSF decided. I recognize there's a lot of confusion and propaganda surrounding that. This is open source, not top-down control.


The lead on the project said he would like to support Python 3 at some point.


That would be great. But the real issue is what people do that have all this mass of Python 2 source. The Python 3 people are off doing what they want to do on that, which I consider the experimental branch. There are just so many new mistakes made with Python 3 that it's not a slam dunk for people to move to. At this point, it's become more of a social pressure / political thing (2020?) than a logical decision to move to 3. Something like Grumpy is definitely going to take the throne that Python 3 abandoned.


You are upset about Python 3 adoption, so you advise reaching for much less popular Elixir or for brand new Google-specific Grumpy? If insufficient popularity were the problem, those choices wouldn't make any sense.


> They still truly believe the majority will come around to Python3.

People will upgrade if they make py3 more appealing; something like a 20% speed boost would be nice.


> His stubborn view on functional mechanisms

Reference, for a non-Python dev who hasn't kept up with it?


Guido has said that he wishes Python didn't have lambdas.

Also, reduce was removed from the standard global namespace and moved into the functools module (map stayed a builtin, though it now returns an iterator).



Python needs a new runtime. This talk shows what bad shape it's really in.

https://www.youtube.com/watch?v=qCGofLIzX6g&list=PLRdS-n5seL...

Basically, the language doesn't have a "spec" per se. The language is whatever the de facto CPython implementation happens to do within its giant eval loop.

Another great talk about CPython internals:

http://pgbovine.net/cpython-internals.htm


> Basically, the language doesn't have a "spec" per-se.

It does[1]. And the process of improving it is called the PEP process[2].

[1]: https://docs.python.org/

[2]: https://en.wikipedia.org/wiki/Python_(programming_language)#...


Uh, what? Claiming that your [1] is in any way a specification for the language is utterly absurd. It's far too vague. (Compare to even an IETF RFC, and you'll see what I mean. If you want to compare to a real language spec, compare to ISO C++.)


> to a real language spec, compare to ISO C++.

No thanks. Written specs can always have interesting implications or undefined behaviour. Just because it's written in a more verbose language (English) doesn't mean it's less vague.

E.g. GCC is the de facto C spec for many. Code/platforms as spec makes more sense and is easier to maintain/update, with quicker iteration on language features (cf. Ruby/Python vs. C++).


My issue isn't necessarily with the fact that it's in English. It's that it's hopelessly imprecise English. Maybe you'd have less of an issue with the Java Language Spec? (Which, IIRC, even left out some memory model problems until recently.)


CPython is the spec (or really more the CPython test suite). Just like the Ruby MRI. It's a simple, plain interpreter without many frills, and to add or remove a feature you have to submit a PEP which goes through a specification process.

Python started as a one-man-band project and of course didn't have a specification.


Yes, but that's what the PP was arguing wasn't the case (by linking to a "spec"). I'm not sure why you're restating the obvious (the OP's position).


> Python started as a one-man-band project and of course didn't have a specification.

C started as a one-man-band project and of course does have a specification.

JavaScript started as a one-man-band project and of course does have a specification.


C wasn't a one-man-band project (it was at least a two-man-band one at the start!) and neither was JavaScript. C also didn't have a formal specification for over 20 years (only an informal one), and JavaScript had a strong selection bias for interpreters that roughly conformed to the specification. But that didn't exactly help it; JS is/was notorious for differing implementations of browser APIs.

Each of them also has a strong need for a specification, as there are many differing compilers and interpreters. There are a few for Python, but they are specialized; the CPython interpreter is good enough for 90% of cases.


That reflects a narrow-minded understanding of what a programming language specification is; see: https://en.wikipedia.org/wiki/Programming_language_specifica...

You can say it's imprecise or lack of ratification from one or many international organization(s), but you cannot say it doesn't exist. End of story.


The reference [1] is a decent spec. It might not be as formally rigorous as an ISO standard, but it's probably as good as Go's [2], which is also a "reference".

[1] https://docs.python.org/3/reference/index.html

[2] https://golang.org/ref/spec


In my experience with both languages, while the Go spec is incredibly readable, navigable, and succinct, the Python reference is a sprawling mess that is difficult to navigate or even to Ctrl-F in.


To be fair, Go is also a much smaller language, which hasn't gone through the process of collecting and shedding multiple layers of legacy, and exposes far fewer implementation details.


CPython is the reference implementation, so it makes sense that it's:

A) Not well optimised.

B) Touting features before the spec/standard.

EDIT: people really dislike that I said this, and I'm having trouble finding my original citation - it was in one of the many Python books I own. Most likely "Learn Python The Hard Way", but I'll dig out the exact chapter where they compare PyPy to CPython and mention that because CPython is the reference implementation it values code clarity over performance optimisation.


Who says CPython is not well-optimized?

CPython is 25 years old -- people have been making it faster for a long time. Python 3.6, the latest release, has many performance improvements, cf. http://www.infoworld.com/article/3120952/application-develop...


Interestingly, achieving performance parity with CPython is one of the biggest challenges of this project. There are certain things CPython does very fast like allocating and freeing many small objects.


So, Grumpy currently isn't faster than CPython?


Notably, some of the 3.6 performance improvements were merged in from PyPy :-)


It makes sense that the official, primary and by far most popular implementation of one of the most used languages in existence is not well optimized?

(edit: I'm just being polemic about your statement here. CPython is reasonably optimized within the constraints it currently has).


It only makes sense because it's Python: a language where style is part of the syntax, and where readability is one of the things many libraries focus on, the so-called "pythonic" way.

It makes sense that the reference implementation mirrors the same patterns as the language itself.


Someone else already posted this link: https://www.youtube.com/watch?v=qCGofLIzX6g&list=PLRdS-n5seL...

It explains why CPython can't improve on many things.


Please watch that first video, it's a good one. It explains how CPython essentially _is_ the spec because its internals leak into the spec when they have no business being there.


Pretty much the same is true about most other languages that have a single main implementation. This is even true, to some degree, for Java, which had competing implementations relatively early on.


At some point it leaked pretty hard as well. Package scope was an unspecified implementation behaviour that only became standard later. (If I recall the story correctly.)


The main reason I moved from coding in Python to Go as my main language many years back is that concurrency was such a pain in standard Python (the other reason was compile-time error checking).

It's interesting to see that the same pain has now caused the runtime itself to be implemented in Go.

It's a pity C extensions (often used in scientific computing) are not supported, but Go does have C support via cgo, so maybe some approach can be worked out to access C routines in the future.


I "grew up" on Python, then wrote a whole bunch of Go for my job. Then this past Autumn re-visited Python to implement a networked terminal based game[1].

With what I learned about Go and concurrency, I would say that currently in Python, writing concurrent code is not very hard, and is as close to Go as you can get without actually just writing Go.

Now, you may be saying "but Python has the GIL, how can concurrency be easy in Python?" I'd say, you're definitely not wrong that the GIL is a problem, but it's not much of a problem for concurrency.

This goes back to the heart of Rob Pike's classic talk, "Concurrency Is Not Parallelism"[2]. To quote Wikipedia:

        In computer science, concurrency is the decomposability property of a
        program, algorithm, or problem into order-independent or partially-ordered
        components or units.
In Python, you can pretty easily emulate the conceptual properties of Goroutines and Go channels with Python threads and queues. The problem is that doing this in Python won't net you the performance increases you get with Go. And I believe that is an important distinction. There are plenty of cases where you don't care so much about the performance benefits of parallelism, but you want the conceptual and implementation benefits of concurrency.
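
To make that concrete, here's a minimal sketch of a "goroutine" and a pair of "channels" built from one thread and two queues (Python 2.7 to match the rest of the thread; on Python 3 the module is queue):

    import threading
    from Queue import Queue  # "queue" on Python 3

    def worker(inbox, outbox):
        # the "goroutine": receive from one channel, send on another
        for n in iter(inbox.get, None):  # treat None as a close signal
            outbox.put(n * n)
        outbox.put(None)

    inbox, outbox = Queue(), Queue()
    threading.Thread(target=worker, args=(inbox, outbox)).start()

    for n in range(5):
        inbox.put(n)    # "send"
    inbox.put(None)     # "close" the channel

    for result in iter(outbox.get, None):
        print result    # 0, 1, 4, 9, 16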

In closing, concurrency in Python is pretty easy to work with, it just performs very poorly.

[1] - https://github.com/lelandbatey/defuse_division

[2] - https://blog.golang.org/concurrency-is-not-parallelism


I wrote a little task module that does precisely that and provides you with a go() function and "channels":

https://github.com/rcarmo/python-utils/blob/master/taskkit.p...

Obviously, it wasn't amazingly performant. But it did help a lot for doing concurrent stuff, and I've been pondering re-doing it for asyncio.


Using Gevent[1] is quite similar to Go when it comes to concurrency in Python.
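
Something like this, for instance, assuming a stock gevent install (a minimal sketch, not production code):

    import gevent
    from gevent.queue import Queue

    ch = Queue()  # plays the role of a Go channel

    def producer():
        for n in range(3):
            ch.put(n)  # "send"
        ch.put(None)   # sentinel standing in for close()

    def consumer():
        while True:
            n = ch.get()  # "receive": blocks this greenlet only
            if n is None:
                break
            print 'got', n

    gevent.joinall([gevent.spawn(producer), gevent.spawn(consumer)])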

[1] http://sdiehl.github.io/gevent-tutorial/


Concurrency is even easier with the new async/await syntax.


I'm not philosophically opposed to supporting C extensions. The additional complexity just was not deemed to be warranted since the YouTube frontend doesn't use a lot of C extensions.

In principle it's possible to implement something like JyNI (http://jyni.org/) or CPyExt (https://morepypy.blogspot.de/2010/04/using-cpython-extension...) to bridge the CPython and Grumpy APIs. In practice, marshalling data across the interface can be very expensive.


Out of curiosity, what C extensions does YouTube use?

If this is good enough to run YouTube's python code already it's honestly super impressive. Well done.


To be clear, Grumpy cannot yet run YouTube's Python codebase. There's still a lot of work to do on the standard library.

There are a handful of C extensions for JSON, protobufs, etc that YouTube uses, but mostly they're small utility functions written by us to optimize particularly hot code paths.


Is the plan to use grumpc to transpile the code to Go and then work in Go in the future, or is the plan to keep coding in Python and add a grumpy step before deploying?

(If the former, you could just update the code to use the standard Go JSON/etc packages..)


The idea is to continue to write code in Python. The transpiled code is not suitable for working with directly. That said, there is the possibility of rewriting bits and pieces in Go (e.g. performance-critical stuff) and then calling into them from Python. Sort of a hybrid approach.


I"m also curious about this


Will these be rewritten in Go then?


Does anyone know the technical reasons why C extensions are not (at least easily, apparently) supported by Go? Is it to do with Go's being a GC'd language? I would have thought that should not be a reason per se, since Python also has GC, but has plenty of extensions written in C. But I'm not a language internals expert.

Also, further signs that GC may not be the reason, is that D also has GC, but can link to C libraries somewhat easily (not sure about all cases or how far the ease goes).


It's because Go uses a different stack structure, called "segmented stacks", in order to enable cheap goroutines. Basically, Go stacks start tiny (8 KiB, as opposed to much larger C stacks), then it grows them in small segments. Additionally, Go code runs inside an event loop, which enables excellent I/O performance without kernel context-switches, and ordinary C function calls conflict with this event loop.


Segmented stacks in Go went away in Go 1.3 (https://golang.org/doc/go1.3#stacks; June 2014).

The alternate stack structure is indeed one issue. The bigger one is the GC, though; the Go runtime needs to know which pointers it is responsible for freeing, and which are the responsibility of the C code.


> The bigger one is the GC; the Go runtime needs to know which pointers it is responsible for freeing

That is not the bigger issue, and AFAIK already handled for C types.

The stack/calling conventions are the reason why cgo is "not go"; cgo calls have significantly more overhead than just about every other FFI (the overhead of a cgo call is ~2 orders of magnitude more than a "native" Go call, or was as of around this time last year; that is, you could perform ~100 no-op, non-inlined native calls to a do-nothing function in the time needed for a single cgo call to the same).


The original question was about why C extensions are not supported by Go. It's not a matter of performance; it's a matter of correctness.


>Additionally, Go code runs inside an event loop, which enables excellent I/O performance without kernel context-switches

Interesting, didn't know this (that Go code runs in an event loop). Is the reason something to do with goroutines and channels? Something like: a routine gets info that data is available for it to read (on a channel, sent by another goroutine), via an event it receives?

Also, can you explain this point:

"which enables excellent I/O performance without kernel context-switches" ?


Since we're also talking about Python: if you've ever used asyncio, you will have seen that programming with it is a bit different from how you usually write code without it.

Before you can call any coroutine you first need to start an event loop and schedule something in it. This essentially enables the language to schedule another async function each time you use await.

Since Go is always async by default, before your main function is called it sets up the event loop and then calls your main, which technically is also a coroutine. Your code appears to be sequential, but it is not executed that way.
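
A minimal sketch of the asyncio side of that (Python 3.5+; the names here are made up for illustration):

    import asyncio

    async def fetch(n):
        await asyncio.sleep(0.1)  # stand-in for real I/O; yields to the loop
        return n * n

    async def main():
        # the three coroutines make progress concurrently on one thread
        results = await asyncio.gather(fetch(1), fetch(2), fetch(3))
        print(results)            # [1, 4, 9]

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())  # nothing runs until the loop drives it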


Interesting ...


One of the reasons an asynchronous I/O event loop can be faster than a threaded model is that the CPU spends more of its time in a single userspace thread per core, switching between clients that are ready. A threaded server will incur a kernel-space context switch each time, while an asynchronous loop will keep the processing time in userspace.


Go has cgo, which works fine for most purposes; native code interop is not an issue for Go.

Grumpy likely doesn't support C extensions due to time, and due to the complexity of having to actually emulate the GIL: Python does not have fine-grained locking for its structures, so C extensions that work with Python data structures need to hold the GIL first.


> Does anyone know the technical reasons why C extensions are not (at least easily, apparently) supported by Go?

It's because Python's C API is inherently non-thread-safe. The API doesn't pass an interpreter pointer as a parameter (as Lua's API does, for example). So Python is forced to use a terrible thread-local-storage hack involving the Global Interpreter Lock to swap interpreter instances, which is insanely inefficient and limits compute-bound programs to a single thread.

Python 3.x had a chance to fix the API and do away with the GIL once and for all, but inexplicably they did not. There was a misguided notion that C extensions between 2.x and 3.x could be interoperable.


I never saw Python as anything more than a scripting language to portably automate tasks across UNIX and Windows environments, even back in the Zope days.

My experience with Tcl taught me to stay away from languages that don't have either a JIT or an AOT compiler in their reference implementation.


I think that that's rather dismissive of a language that runs huge web, scientific, and general purpose applications daily.


It might be, but that is usually a consequence of not knowing any better or of making use of existing libraries.

Just like people learned 8-bit BASIC and went on to do business applications and games in it. I went with Z80 ASM instead.

Personally, I would only use Python for shell scripting, and would advise using Julia instead.

Of course, others see it differently.


Julia? Look, I enjoy playing with Julia, but have you actually used it? Because most of the people I meet who turn up their nose at Python and say nice things about Julia haven't ever used Julia. It's got great potential but has some major gaps and warts, and is just plain dog shit slow for certain things compared to SciPy, which, you may not be aware, is basically C and Fortran code wrapped in a Python API.


> is just plain dog shit slow for certain things compared to SciPy

Do you have real examples here or is this just FUD? SciPy is not known for using absolute state of the art algorithms or perfectly optimized implementations. It can be pretty easy to improve on the naively implemented or legacy pieces of SciPy, e.g. http://tullo.ch/articles/python-vs-julia/


I do follow Julia development and I am aware that it isn't quite there, but at least their community does embrace JIT compilation, unlike Python, where PyPy is just yet another project, ignored by the reference implementation.

> SciPy, which, you may not be aware, is basically C and Fortran code wrapped in Python API.

Which for me personally means that I would rather use C and Fortran directly or, better yet, a C++, .NET or Java binding to them.


> Which for me personally means, that I would rather C and Fortran directly or better yet, a C++, .NET or Java binding to them.

Being able to describe things with a syntax that looks almost like pseudocode and runs highly-optimized C/Fortran code to do heavy lifting has huge, huge advantages.


Scala, Clojure and F# also allow it, with the added benefit of industrial-strength JIT/AOT compilers.


Except without the great Python ecosystem and without great projects like Numpy/Scipy


NumPy/SciPy are only relevant to a minor set of computer users, and even then, there are alternatives like LAPACK and BLAS.

Also, the Java and .NET ecosystems are just a little bigger than Python's.


> there are alternatives like LAPACK and BLAS.

LAPACK + BLAS = SciPy

> Numpy/Scipy are only relevant to a minor set of computer users..

> ... Java and .NET ...

Being able to describe things with a syntax that looks almost like pseudocode and runs highly-optimized C/Fortran code to do heavy lifting has huge, huge advantages.


> LAPACK + BLAS = SciPy

Thanks, I already knew that.

> Being able to describe things with a syntax that looks almost like pseudocode and runs highly-optimized C/Fortran code to do heavy lifting has huge, huge advantages.

Hence we are back at Scala, Clojure, F#, enjoying the respective AOT/JIT native code compilers, and integrating with that highly-optimized C/Fortran code.


I use Scala and Clojure. I like F#. Have you ever worked at an actual business that needs to hire developers that know these languages WELL? I can tell you that the significant added cost can't be justified by the not-very-big benefits compared to just using Scipy.


> Scala, Clojure, F#

Functional languages. The "year of Linux on the desktop" of programming languages. Also none of those look like pseudocode (Scala does if you ignore bits and squint).


The number of job postings says otherwise.


No they don't. On Stack Overflow Jobs, F# has 14 posts, Clojure 28 and Scala ~100. Python has 400+.

Other sites have similar totals, but they are filtered for London/UK/Europe, so that might be a bias.


This made me wary of Julia: http://danluu.com/julialang/


Julia's nice, but it's playing catch up to R & Python. Name any statistical algorithm, and R probably has it. Name any scientific field, and there's probably a Python library for it.


Some people are very productive in Django and Rails (and the like), in non-performant languages. Many sites never grow to the size where running on the JVM (or AOT-compiled Go) would save you money from fewer servers vs. more coder-hours.


The scientific community thinks otherwise.


The scientific community uses it as a tool to automate specification of tasks to be performed inside of C libraries. It's a case pjmlp may not have explicitly named, but it's of the same kind.


PyPy is making good progress at implementing and JITing numpy as native Python code.


They've been working on NumPyPy since at least 2011, and to my knowledge it's still not in serious use.


AOT can be done with Cython, and it's fully compatible with the reference implementation as far as I know.


It seems interesting but my concern is that it's just another Unladen Swallow.


There's no guarantee that it will succeed, but the fact that they're willing to sacrifice some considerable degree of compatibility to achieve other important goals may make the project a lot easier to pull off.

I still think Unladen Swallow should have been based on V8, but as I recall that project had very strict compatibility goals that would have made a V8-based implementation impossible.


Even ignoring the extension-compatibility goal, at the time Unladen Swallow began, V8 was nothing more than an AST-based JITing interpreter, hence there were far fewer generally applicable parts of the code compared with today.


Well, if Google is pushing it and actually using it internally...

They also have the benefit of being able to push features into the Go core that they might need for this.


Google has pretty specific needs, so this project may be very successful internally even if it gets zero traction externally.

A lot of big software companies do this nowadays. Google and Facebook both have a lot of purpose-built software, some of which gets released as open source, that meets their needs well but is hard to use for other purposes. I guess it's still strictly better than them not open-sourcing the code, but it's definitely an existence proof that just making something open source doesn't make magic happen.


Perhaps Golang will get a VM or something similar to enable stuff like exec/eval to be implemented with rapid response times in Grumpy. The only way I've ever seen a REPL implemented for Go is by recompiling the source code via a call to "os/exec".Command for each statement or declaration entered, which gives a 1-second delay on many computers.


> Well, if Google is pushing it and actually using it internally...

It will be silently abandoned in 18 months....


Not if it turns out to be usable and to have a significant impact on performance. You need to realize that at the scale YouTube operates, even a small performance boost translates to huge savings in server cost.

So the worst case here is if they never improve this past their own needs (which is a pretty limited subset). But if it's successful for them, and it's open source, I could very easily see other people who run heavy stuff on Python contributing to it and helping it grow.

That's the thing with open source: even if Google doesn't actively work on it, others can (if it has actual value and is useful to people).


To me it looks like Google essentially is moving everything to Go, and this project is to help with it.

They port their Python libraries so they can reference them from their Go code, and then, module by module, rewrite everything in pure Go.


>Google essentially is moving everything to Go ...[snip]... and then module by module will rewrite it in pure Go.

I can see this happening for some core modules sure, but I think you've underestimated the work required to convert the sheer amount of Python code at Google. There is a tool inside Google which graphs the number of lines of each language in Piper; I don't think I can quote numbers from my time there but it would be no small feat, even for Google.


It just needs to happen one piece at a time, even if it takes a few years.


s/years/decades, then sure.


>To me it looks like Google essentially is moving everything to Go, and this project is to help with it.

Interesting - first time I have heard such an opinion. Why do you think it may be so? One reason I can think of is that they get more control over the languages they use.


Go seems to be developed for their needs (well, it is coming from them). It's simple to learn and very opinionated: how code is stored, how it is formatted, etc. It seems perfect for a company that hires a lot of college graduates.

Now with Python, they still seem to stick to Python 2.7 and don't show any effort to move. TensorFlow was released for Python 2.7, and only later was Python 3 support added. Grumpy is for Python 2.7, and the first issue opened asking about Python 3 support was closed with a documentation change stating that only Python 2.7 is supported.

To me it seems like they wouldn't stick to Python 2.7 if they were planning to continue using Python. It seems to coincide with Python 2's deprecation in 2020.


I made an agreement with myself to never use any Google language, tool, or library after they decided to shut down Google Code. Google has a history of creating all of these crazy projects and then just completely abandoning them, leaving the early adopters to fend for themselves. It's really bad practice and leaves a nasty taste in my mouth. That's why I haven't even tried Go. I have no interest in it. As soon as Google develops some other language they'll drop Go like a hot potato, and everyone using it in production will be forced to rewrite or stagnate.

Look at AngularJS for a more recent example.

Add to this that Google has some of the worst versioning practices I've ever seen and you get a recipe for destruction.


How does this compare to Jython? The blog post says they looked at alternative runtimes but didn't like that they had tradeoffs (implying that Grumpy has no tradeoffs??). But Jython has been around for quite a while and also has no GIL:

http://www.jython.org/jythonbook/en/1.0/Concurrency.html

It can also handle Python's dynamic aspects.


Every time I've tried to use Jython, I've found it won't work with pre-existing code all that well.

In part it has exposed CPython "implementation quirks" that people were wittingly or otherwise taking advantage of. In other cases there don't seem to be obvious reasons for the differences, and they have required special-casing the Python code to handle them.

It has been great with code written from scratch, specifically for it.


That sounds like the kind of issue you'd have with any reimplementation of CPython though. It sounds like Grumpy doesn't support quite a few things, so I still wonder how they compare and why developing a new runtime was considered easier than reusing Jython.


There might be a strategic element: Jython development has been erratic throughout the years.


Google's own projects have been developed erratically throughout the years. They could have simply forked it.


I wonder why the Grumpy Fibonacci is so much slower than CPython for 1 thread. Seems weird given Grumpy is compiled.


Although Grumpy is compiled, it is just as dynamic as Python, in that method dispatch involves dictionary lookups, etc.

The main reason why Grumpy's slower for most single threaded benchmarks is that most Python workloads involve creating and freeing a bunch of small Python objects. In Go, these objects are garbage that need to be GC'd in a very general way. In CPython, there are free lists, arenas and other optimizations for allocating small (especially immutable) objects. And cleaning up garbage in CPython involves pushing unreferenced objects back onto the free lists for later reuse.
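
To give a feel for what "just as dynamic" means, here is a rough sketch of the lookup work hiding behind every obj.method() call in a Python-semantics runtime (simplified: it ignores data-descriptor precedence, __slots__, __getattr__ fallbacks, and CPython's internal caches):

    def lookup(obj, name):
        if name in obj.__dict__:             # instance dict probe
            return obj.__dict__[name]
        for klass in type(obj).__mro__:      # one dict probe per class
            if name in klass.__dict__:
                attr = klass.__dict__[name]
                if hasattr(attr, '__get__'): # bind functions as methods
                    return attr.__get__(obj, type(obj))
                return attr
        raise AttributeError(name)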


> Although Grumpy is compiled, it is just as dynamic as Python, in that method dispatch involves dictionary lookups, etc.

Right, I suppose I assumed that straightforward numerical-looking code would be translated to Go numerical code. Perhaps they just aren't that ambitious yet.


We hope to implement optimizations like you're describing eventually. But the core functionality needs a lot of work before we start down that path.


Guido insists that single threaded performance of Python has the highest priority - http://www.artima.com/weblogs/viewpost.jsp?thread=214235

I am not sure any implementation of Python will beat the single-threaded performance of CPython.


CPython is still a naïve interpreter. PyPy beats it in most cases, sometimes by a lot. Of course, PyPy still has a GIL for the same reasons CPython does.


So your complaint is that GVR doesn't want patches that would (significantly) regress performance for the majority of use cases, yes?


IIRC, IronPython single-threaded performance was competitive with CPython.


Multiple ones already do.


From the post it looks like it is optimized for multithreading from the start, so they possibly used the multithreaded version with thread count = 1. When you care about parallel workloads it's not so important whether your single-threaded code is the fastest.


There's no mention of whether Grumpy passes the CPython test suite. Until it doesn't, it's not a Python runtime, it's a compiler for a language with Python-like syntax.

Compilers like this, from almost-Python to say C/C++, have existed for a while: Cython, Shedskin, Nuitka are some examples.


It does not, yet. The standard library is very incomplete at this point.


experimental ~= alpha, give the team a break


You're responding to the principal developer. :)


I know, but I figure people will read his comments first and then see this.


I'm not particularly concerned about the standard library, but about the more dynamic features of the language. Not just exec/eval, but for example the complex dispatching logic behind magic methods, especially the various __getattr__, __getattribute__. They are where alternative runtime implementations usually stumble, as if they were an intrinsic bottleneck to Python runtime performance.
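
For example, a conforming runtime has to get both of these hooks right, including the rule that one of them only fires on failed lookups (a small sketch; the class names are made up):

    class LazyProxy(object):
        def __init__(self, target):
            self._target = target
        def __getattr__(self, name):
            # only called when normal attribute lookup fails
            return getattr(self._target, name)

    class Paranoid(object):
        def __getattribute__(self, name):
            # called for *every* attribute access, even existing ones
            print 'intercepted', name
            return object.__getattribute__(self, name)

    p = LazyProxy([1, 2, 3])
    p.append(4)  # no 'append' on LazyProxy, so __getattr__ forwards to the list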


exec/eval do not work, but all of the other dispatching and magic methods are supposed to work exactly like CPython (and if they don't, it's a bug/not yet implemented).


> Until it doesn't, it's not a Python runtime...

True Python runtimes fail the CPython test suite. They have some work to do!


This seems like an odd engineering choice. Presumably the effort to create a Python-to-Go translator is non-trivial. Why not just start rewriting components in Go and migrating them out of Python, leaving Python as essentially the presentation layer at most?


Ask yourself how much Python code you think Google has. Then multiply it by some large single-digit number, minimum. Then compare the effort of spending 30 seconds per line on that code vs. writing a new Python interpreter/compiler/runtime, especially when you can trivially get people who are capable of doing that.

There's a reason why the large companies often end up working on new runtimes/interpreters/compilers like HipHop for PHP, Hack, and so on, rather than working on the code bases written in those languages. It is very easy for it to be not just easier to work at that level, but an order of magnitude or two easier. Or three.


It's interesting to think of the dynamics of developing an application (presumably) many times more complex than its host language. At a certain point, changing the language implementation becomes cheaper than making sweeping changes to the application.


I don't know PHP, but in the case of Python you will spend the next ten years ironing out subtle incompatibilities between CPython and this new thing.

Added up, it is easily more than 30s per line.


"Added up, it is easily more than 30s per line."

30s per line of current Python code, not compiler code.

And I'm just providing a number to put some numbers out there. 30s/line to convert a Python program to something fundamentally different like Go is probably an underestimate, yes, but then, if it's an underestimate that means the budget for the Python reimplementation is that much larger.

It also really helps the "subtle issues" when you control both the implementation and the code running on it; it means for every subtle issue discovered you have the option of either fixing the implementation or fixing the original code not to tickle the corner case. It's much harder when you only control one side or the other. It's not Google that will be having massive problems with corner cases.


I wouldn't think that a tool for transpiling Python to Go would be harder than creating a whole new runtime. Most of the work is already done; you can get the compiled ASTs directly out of Python itself.

Of course, if it started out as a "this seems like a cool project", that skews the "a is more efficient than b" ratio significantly.


Sorry, I think my post may have been excessively specific; I've edited it to say "writing a new Python interpreter/compiler/runtime" instead of just "writing a new Python interpreter". I was trying to make a generic point, not one specifically about a Python interpreter. (At least, if I'm interpreting your reply correctly.)


I mean, you can get something out of transpilers, but not typically maintainable code.


I take it you've never worked on a large code base with a large team? Even ignoring the actual task of rewriting the logic in a different language with different idioms, the change is going to be a massive hit to productivity for many months as people learn the new architecture of the rewrite, the tooling, and idiomatic Go.

For a project the size of YouTube, that would be millions of dollars of engineering hours and weeks/months of lost productivity for an unknown gain and almost guaranteed bugs. It's a terrible value proposition, so it's better to squeeze every last drop of performance out of the code base you have, which at the scale of this project includes paying engineer(s) to work on a completely new runtime.


Because Python running in Grumpy can import Go packages, I'd assume (I work for Google, but not on YouTube) that Python libraries could then be migrated pretty seamlessly and incrementally to Go.


It's been said a couple times in this thread already but basically, the YouTube Python codebase is very large and the cost of a rewrite is prohibitive.


Statistics would be very interesting to the "useless internal trivia" types.


Creating a new Python runtime in Go is obviously non-trivial, but so is rewriting the heart of YouTube in Go. Heck, YouTube may be a bigger codebase than Python for all I know.

And there's a lot more side benefit in a better multithreading Python runtime for Google for other Python code Google has (or hosts), whereas the benefits of a YouTube rewrite are more narrowly limited to YouTube.


Just because the site is popular doesn't mean the codebase is big. It's a simple site. I could probably write YouTube in Go in an afternoon and I don't even know Go.


You could perhaps write a feature-bare clone. Here are some basic features:

    - search (the various search pages)
    - trending
    - channels
    - subscription
    - video uploads
    - history
    - comments
    - likes
    - upload
    - video editing (most of this is in browser, but still)
    - livestream
    - video analytics
    - payment and ad management
    - video comment review and moderation
    - translation
    - captioning
    - video replication, multiscaling, caching, and resolution management based on network speed
    - video recommendation
    - cards, overlays, annotations, etc.
    - music and sound effect search
    - antispam
    - dmca and copyright tooling
And that was 5 minutes of poking around.


> resolution management based on network speed

Not so much a feature of YouTube as it is a feature of HLS/Dash... but yes, it means you've gotta transcode the source video into multiple different bitrates.


Even outside of that, I can override the bitrate in some instances. But yeah, I was mainly talking about the necessary replication, cache, and transcoding (and I actually forgot about the multiple codecs, which they also provide, in addition to multiple resolutions/bitrates)


This comment made me wonder what you consider to be a complex site.


Not a video host with comments. That's practically CRUD. A very solved problem.


Well, go on then, why spend just an afternoon when you could spend 5 whole days. Make a YouTube that's 10x better, host it from your closet and watch the megabucks roll in as you take down Google.


But, really, what site cannot be boiled down to practically CRUD? That does not help us understand what you believe is a complex site.

Is this supposed to be some kind of no true Scotsman?



That's a question I have as well. I'm guessing they have a very massive CPython codebase, and the trade-off was worth it.


I'd assume that's true, but they also didn't know how well this would work. A "little" project to make Grumpy is probably much easier to sell to management/throw away if too slow than "Let's rewrite all our code in language X to see if it goes faster".


The thing is, you cannot use Python in production at Google. YouTube was acquired a long time ago, and a rewrite of the inherited Python code base has not happened, for very good reasons.


Maybe there is too much to rewrite.


It's not just a matter of changing the implementation language from C to Go, they wanted to remove the GIL as well. For that you can't even use the existing CPython design.


I think he's talking about moving their Python codebase to Go, so they don't have to run Python anymore.


You're probably right. He seemed to be questioning the translator so I thought he was suggesting replacing CPython piece by piece with a Go implementation.


It could be part of the auto migration effort.


This seems like a win for Go, at least to me.

I wrote a random sentence generator in Python several years ago. A bit later, I wrote my blog using Java EE. Early on, I had an idea: put that generator in Jython, and spit out a random sentence on every request. It's probably the one feature that I was OK to let go, should I switch platforms.

Since Oracle has only gotten more evil and Java more stagnant over the years (especially in light of TLS features), I've been thinking of possible alternatives. I've been intrigued by newer compiled languages, and it's come down to either Go or Rust, but I've yet to dig too far into them. I might have a winner.


"The downside is less development and deployment flexibility, but it offers several advantages. For one, it creates optimisation opportunities at compile time via static program analysis."

Optimisation. This is a smart move, though a hard one. A compiler, written well, allows the back end to improve the code, so the whole code base can improve with improved analysis.

"The biggest advantage is that interoperability with Go code becomes very powerful and straightforward: Grumpy programs can import Go packages just like Python modules!"

Extending Python (youtube codebase) with Go modules. That's interesting.


I'm curious as to why they started this project when there is already low-hanging fruit that can speed up Python (e.g. PyPy). What makes this better, other than satisfying the inner Go fanboy?


Because they want to get off of Python completely, and forever. They're not looking for speed, they're looking to ditch the liability of supporting Python and its ecosystem.


Yeah, they didn't at all say that.


Of course, that doesn't necessarily mean it is not true. There are many reasons why Google may not want to reveal long-term plans so explicitly.

It's clear that existing Python codebases will be maintained for the foreseeable future – there would be no reason to build this otherwise – but this may signify a shift away from Python for new extensions to the project, as this now makes it possible to integrate Go packages with relative ease.


It's right there in the post:

> To solve this problem, we investigated a number of other Python runtimes. Each had trade-offs and none solved the concurrency problem without introducing other issues.


Pypy still has a GIL. GIL-free seems to be one of the explicit goals of this project.


>Pypy still has a GIL.

Not strictly true, http://doc.pypy.org/en/latest/stm.html. In general for the main project however, this is true.


PyPy still has a GIL. PyPy is experimenting with non-GIL versions, and in fact there are people actively working to remove the GIL from PyPy. But the same can be said of CPython (the gilectomy), and yet I don't think that you'd say "not strictly true" to "CPython still has a GIL".


GIL still exists in pypy


In case anyone wanted the github repo: https://github.com/google/grumpy


This is actually pretty great. Even if you never run your code on Grumpy, it underlines the status of Python. Having large corps investing in Python will help pull new talent into the Python environment, and it also strengthens the Python ecosystem.


Actually, I would interpret this as "CPython is broken for scalability. It's not worth trying to fork/fix it, we'd rather just roll our own runtime".

Not exactly an endorsement.

Sadly, the code was just dumped into a new Git repo, so no way to tell how many people contributed internally so far.


I agree and disagree ... on one side, this may signal a shift away from Python: something to help the migration effort, where the ultimate goal is to write everything in Go.

... but then I don't see Python going anywhere anytime soon. Didn't Microsoft just start a project to get Python's runtime to use CoreCLR's JIT?

There was an article, can't find it now, about an upcoming Python renaissance saying there may be an influx of new interpreters. There's PyPy, Microsoft's CoreCLR thing, now this, etc. It seems people really want to program in Python so there is an effort to make it faster.

*edit: found the article: https://lwn.net/Articles/691070/


Devs love Python and will continue to do so, no question there.

I'm just afraid the surge of different compilers and interpreters will bring up plenty of issues in the medium term.

There is no formal spec of Python like there is for, e.g., JavaScript (which is of course driven by multiple, VERY engaged adopters).

How long until subtle and not-so-subtle differences creep in between different implementations, leading to incompatibilities and a continuous fragmentation of the ecosystem?


Microsoft had IronPython, which was then semi-abandoned...


That looks like a super interesting runtime. Seems to target 2.7 only; I hope they're open to supporting 3.x as well.


I'd like to support 3.x at some point. See https://github.com/google/grumpy/issues/1


Have you thought about using type hints to help type inference?

Some work on that is discussed here. I would love a Dropbox/Google collaboration (though also targeting 3.x :) )

https://github.com/python/mypy/issues/1862
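
For a 2.7 codebase that would presumably mean PEP 484's comment-style annotations, which keep the source runnable on plain CPython 2.7 while giving a compiler something to chew on. A made-up sketch of the kind of hint I mean (typing is a PyPI backport on 2.7):

    from typing import List

    def dot(xs, ys):
        # type: (List[float], List[float]) -> float
        # with hints like these, a compiler could in principle emit
        # unboxed float operations instead of generic object dispatch
        total = 0.0
        for x, y in zip(xs, ys):
            total += x * y
        return total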


Yes, leveraging type hints for optimization purposes is a long term goal. Thanks for pointing me to that issue, I'll keep an eye on it.

One of the goals of open sourcing was to get feedback and work with outside folks so I'm definitely open to collaboration!


Awesome!


I wonder if they intend to maintain it long term or if they're just going to use it as a bridge while they rewrite things natively in Go.


Seems like a huge effort for the interim... unless they really need interoperability between Python and Go during the switch.


From looking at the code, this is my favorite part so far: https://github.com/google/grumpy/blob/75a35b4b30afb049b9cfff... Making Python threads as lightweight as goroutines must be a breeze!


* I would love to know how big the codebase is.

* It seems like writing a translator to deal with all the use cases is so much more work, and riskier, than iteratively rewriting portions (in whatever faster, more concurrent language) and using some form of microservice/process message passing to communicate with legacy pieces.

* I'd love to know how they compose async operations currently. Is it some sort of object (e.g. futures, promises, observables)? Is Grumpy going to have some sort of language difference (relative to Python) for composing async stuff (e.g. async and await)?

Of course, being biased towards the JVM (since I know it so well), I'd note they could get really fast concurrency today with Jython if they wanted. Most of the Python tools already work with Jython (assuming 2.7).

With Jython you could always drop down into Java (or any other JVM language) if you need more speed, just as with C for CPython (or even C from Java). It is unclear what you do with Grumpy for performance-critical code. Can you interface with Go code, or is the plan C?


> I would love to know how big the codebase is.

Sorry, can't be very specific, but rewriting all the frontend code would take a lot more effort than writing a new Python runtime :)

> It seems like writing a translator to deal with all the use cases is so much more work and risky than iteratively rewriting portions (in whatever faster more concurrent language) and using some form microservice/process message passing to communicate with legacy pieces.

We do iteratively rewrite components as well. We are pursuing multiple strategies.

> Love to know how they compose async operations currently? Is it some sort of object (e.g. Futures, promises, observables, etc)?

Most async operations are performed out-of-process by other servers.

> Is Grumpy going to have some sort of language difference (to Python) to compose async stuff (e.g. async and await)?

I'd love to support async and await at some point.

> Of course being biased towards the JVM (since I know it so well) they could get really fast concurrency if they want with Jython today. Most of the Python tools already work with Jython (assuming 2.7).

We did also do an evaluation of Jython but there were a number of technical issues that made it unsuitable for our codebase and workload. One such example is this longstanding issue: http://bugs.jython.org/issue527524. I just noticed the very recent update on that thread that implemented the workaround outlined in 2010 by Jim Baker. We tried that workaround and found we got a huge performance hit on affected code. There were a few other general performance problems as well but I can't recall all the details.

Please note I'm not at all bashing Jython, I think it's a great project with a sound design, it just wasn't right for us.

> With Jython you could always drop down into Java (or any other JVM lang) if you need more speed as well C for cpython (or even C from Java). It is unclear what you do with Grumpy with performance critical code. Can you interface with Go code or is the plan C?

You can interface with Go code directly, e.g. from the blog post:

  from __go__.net.http import ListenAndServe, RedirectHandler
  handler = RedirectHandler('http://github.com/google/grumpy', 303)
  ListenAndServe('127.0.0.1:8080', handler)


Very interesting project and technical solution. Is this in use at Google? The described example problem is YouTube with millions of requests per second, but the post doesn't say whether Grumpy was put in production (and what performance gains were then achieved).


As has been said by others, Grumpy is not used in production at Google currently. There's still a lot of work to do -- especially on the standard library -- to support large real world codebases.


I'm thinking it isn't in production yet, based on wording like "we're excited about the prospects" and "although it's still alpha software".


Yeah, without performance metrics I'm not sure what to think about this. I'm surprised they even used Python to begin with for their front-end server, for such high volumes of traffic.


I wonder if future work might support C extensions in the same way the JRuby Truffle/Graal implementation plans to.

Here's a great article from a couple of years ago by Chris Seaton on this topic.

http://chrisseaton.com/rubytruffle/cext/


I see some parallels with the work done to automatically convert the Go compiler's C code to Go. Strategy-wise, I'd rather have the software do the transpilation to the new language and then, after extensive testing, ditch the old Python codebase. But they probably don't agree with my preference for Go over Python.


Haha, now that's what I call an interesting project.

The biggest surprise for me is that the Go runtime would be a good fit for Python, performance-wise, considering the very different object and dispatch models.

The post also mentions runtime reflection, which used to be painfully slow the last time I used it (Go 1.5, I think).

Has this improved in the latest releases?


A huge part of the reason Python (and Perl, PHP, etc. in their original interpreters) is so slow is that it is like running a Go program that does everything through runtime reflection, even just to add two integers. If your code is already in Python, this level of performance is apparently not a problem for you.

If you know Go or are willing to learn about Go and reflection, you can learn a lot about how dynamic languages work under the hood by implementing:

    func Add(a, b interface{}) interface{} { ... }
using the reflect module to accept all types of numbers, including for a bit of extra fun the math.Big* number types, and returning upgraded numbers as appropriate, or panicking on types you can't Add with. That's not all there is to writing a dynamic language interpreter, but I'd say you can learn the core idea this way, shorn away from a lot of accidental complexity and with a lot of the grunt work plumbing of setting up (type, value) pairs already done for you.
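
If Go isn't handy, roughly the same lesson can be sketched from the Python side, since Python exposes its own dispatch hooks. This is approximately the dance behind every a + b (a simplification; real CPython also gives the right operand priority when its type is a subclass, and has fast paths):

    def dynamic_add(a, b):
        add = getattr(type(a), '__add__', None)
        if add is not None:
            result = add(a, b)               # e.g. int.__add__(2, 3.5)
            if result is not NotImplemented:
                return result
        radd = getattr(type(b), '__radd__', None)
        if radd is not None:
            result = radd(b, a)              # fall back to float.__radd__
            if result is not NotImplemented:
                return result
        raise TypeError('unsupported operand types')

    print dynamic_add(2, 3.5)  # 5.5, via the __radd__ fallback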


Reflection in Go is used minimally in Grumpy because it is slow. Currently importing Go packages into Python code is accomplished via the reflect module, but I think this will have to change to make such integration useful.


Does anyone know if there is an easy way to get involved in open source projects like this? The readme mentions adding PRs, but as someone that doesn't have much experience working with open source projects like this, I don't even know where to begin. It sounds like an incredible learning opportunity though.


The first stop is usually a CONTRIBUTING.md. Here's theirs: https://github.com/google/grumpy/blob/master/CONTRIBUTING.md

There's not a ton there. The next place I'd go is open issues: https://github.com/google/grumpy/issues

It looks like at the moment they're mostly random bug reports from people who have tried Grumpy since this announcement, rather than ones filed by people working on Grumpy since before it was made public. So that's a bit trickier.

The last place I'd look, then, is the README: https://github.com/google/grumpy. Not a ton there about ways they wish to have people contribute, either.

At this point, what I'd do personally is open an issue asking how you can get involved; possibly by improving this documentation on how to get involved!

Anyway, that's what I'd do. Hope that helps!


That's super helpful; much appreciated!


No worries. Open source can be tricky to get into, but it's largely about just keeping at it: A story I like to tell is how my first ever PR to Rust was actually rejected based on a procedural issue. Now I'm on its core team. It might take you a while, but if you keep at it, I'm sure you'll figure it out.


"much" could mean anything, from "what's a PR?" https://guides.github.com/activities/hello-world/

to "what's good etiquette for contributing to open source projects?", I'd say: add as much information as possible to your PR, why it does what it does, how it does it, etc.


You can probably mail a maintainer or developer of a project you'd be interested in for hints. Or, better, you can send a question to the project's mailing list. People there are often very helpful. I've done that. It worked. (I haven't tried contributing to any Google-managed projects, though.)


Thanks! This is what I'll do. Much appreciated.


Check out OpenHatch: https://openhatch.org/


I'm a bit disappointed that the blog post doesn't explain why the authors didn't choose PyPy instead.


The objective of the authors is to efficiently exploit multi-core parallelism. PyPy still has a GIL. They have been doing some experiments with transactional memory, but performance is quite bad.


> These efforts have borne a lot of fruit over the years, but we always run up against the same issue: it's very difficult to make concurrent workloads perform well on CPython.

> To solve this problem, we investigated a number of other Python runtimes. Each had trade-offs and none solved the concurrency problem without introducing other issues.


How does Grumpy handle Decimal in Python? As far as I am aware there isn't an equivalent in Go.


The decimal module in Python 2.7 is implemented in pure Python: https://github.com/python/cpython/blob/2.7/Lib/decimal.py (in Python 3 there is an accelerated extension module that's used when available, falling back on the pure Python version otherwise).
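
For reference, the module's observable behavior in a stock CPython session, which is what a pure-Python port has to reproduce:

    >>> from decimal import Decimal
    >>> 0.1 + 0.2                        # binary float artifact
    0.30000000000000004
    >>> Decimal('0.1') + Decimal('0.2')  # exact decimal arithmetic
    Decimal('0.3')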


So, does Decimal work with Grumpy (you're implying it does)?

For that matter, does Grumpy match CPython's Float behavior exactly?

ie.:

    >>> 2.2 + 3.1
    5.300000000000001
    >>>


I'm implying that they can get the decimal module into Grumpy by implementing enough of Python to support the module (i.e. they don't need to implement the decimal module from scratch because the module isn't implemented only in C).


Got it. Thanks for the clarification, Brett.


> Once we started going down the rabbit hole, Go seemed like an obvious choice...

Good? It's funny that a rabbit hole is the place you go to make obvious choices. This culture must seem strange to outsiders.


It's all about performance and concurrency here, but I'm quite happy about this for another reason: static compilation. If I needed really performant code, I'd write it in Go/Cython/[insert your fav language] in the first place. But if I have some existing code that I just want to run without worrying about the Python runtime being installed/correct, this is a good solution.

(Sure, none of my code will work, because it's all Python 3 and usually uses C/asm, but it's still early and I'm hopeful.)



Reminds me of the (old I guess?) ShedSkin project to automatically translate Python to C++.

I suppose the easy concurrency and the ability to interoperate with Go libraries are really the drivers for Go over that.


Shed Skin is still living at https://shedskin.github.io/ , but hasn't changed much since the last release in 2013.


One of my complaints about Go is that it seems to be designed with the (mistaken, in my opinion) notion that what we really need is just a better C.

One way to leverage C's widespread availability and high performance while side-stepping some of its deficiencies was simply to use it as a target for a different language. The Cfront C++ compiler, which generated C code, is probably the most famous example, but I recall that there were many others.

Maybe Go, like C, will make a fruitful target for other language implementations.


>One of my complaints about Go is that it seems to be designed with the (mistaken, in my opinion) notion that what we really need is just a better C.

It was designed with the notion that all some people really need is just a better C.

For other people there's Swift, Rust, etc.


For better or worse, I think Go sucked up a lot of the metaphorical oxygen for similar languages. Swift and Rust are in different niches and don't compete head to head with Go, so this isn't a real problem for them.

What I've long wanted is a better C++, not a better C. It would be hard for any such language to succeed today, because it will have to compete with Go, and Go has a fairly mature toolchain and widespread adoption and a lot of mindshare.

D literally is designed to be a better C++, and it hasn't really been a big success. Maybe that's because there really wasn't as big a demand for a better C++ as there was for a better C. On the other hand it may have been because Go sucked up a lot of the oxygen that D was going to need to succeed, and maybe that was because Go was a product of Google and D wasn't. (If this was my primary point, I'd make a more nuanced argument, though.)

I'm now wondering if the best way to get to a better C++ might be by piggy-backing on the Go toolchain. To get back to my original point, I'm wondering if Go might in fact be a good target for all sorts of better (for some definition) programming languages.


I think Go being a Google product is overemphasized. Dart is far more a Google product and has achieved modest success in industry.

Rust is putting a lot of effort into better marketing because they want much broader usage than the bits of high-perf, low-level code that cannot tolerate a GC pause.

Similarly, Swift is putting effort into general server-side coding. But I think it will remain limited to the server-side bits of macOS/iOS applications, etc.


It seems likely to me that Rust is your better C++. I'm also quite positive that there is (more than) enough oxygen for Rust to succeed.

As for D, I think it has numerous other issues that don't apply to Rust.


The stuff I've done in C++ would have worked as well or better in a language with garbage collection. On the other hand if you want to write the next OS kernel, then you need a "better C++" that is designed to work without GC. In the latter case, I think there is no question that Rust is your best bet today. I'd like a programming language that is statically compiled, with good performance, garbage collection, and decent support for abstraction. It's the last part where I think Go comes up short, and I think the Go designers think that's a feature not a defect.


> I'd like a programming language that is statically compiled, with good performance, garbage collection, and decent support for abstraction.

It feels like you're mixing actual requirements (like "good performance" and "decent support for abstraction") here with things that are more like implementation details ("garbage collection", "statically compiled"). I don't really understand why you want "garbage collection", as such, though I could understand wanting some kind of increased productivity from not having to think about memory management. However, if your real requirement is as I describe it here, then I would still contend that Rust fits the bill.


> I'd like a programming language that is statically compiled, with good performance, garbage collection, and decent support for abstraction.

Well, if you really are tired of "better C" (in varying guises), then pull the fire alarm and go off-piste with OCaml or Haskell.


Go and D both made the design choice to have garbage collection heavily intertwined with the language (or at least the standard library), which makes them not really suitable for replacing C++ in most applications. Rust doesn't have a GC, so it should do better in that niche.

It's always seemed like Google wrote Go for the purpose of having a "faster Python": where they were writing components in Python that weren't fast enough, they could write them in Go instead of having to resort to C++.


I'm not certain that the goal was a substitute for Python, but either way I think you can make the case that it's worked out that way. It seems like a lot of people who ended up moving to Go started out with Python or maybe Ruby rather than one of the statically typed languages.


For my money, a better C++ is in development: it's called C++20.


C++ with more stuff isn't the same as "better C++".

C++ has more features than any one person could ever need, and it seems like the design philosophy is that anything possible in any language should be added to C++. In reality this is a huge problem for productivity.


If you think that languages iterate simply to add stuff, I don't think you understand language development as well as you think.

Maybe you're being glib, and that's fine. But improving build times, metaprogramming (which greatly simplifies many libraries), adding lazy ranges, more powerful type inference, better error messages: these are all improvements.

The fact is, better metaprogramming means C++ gets smaller, because you don't have to learn separate languages for run-time vs compile-time computation.

If you think the development of C++ in the past few years is anything short of amazing, you're quite mistaken.


I don't know that it was made with the assumption that everyone needs a better C, but certainly that many could benefit from one. I think they were largely correct; I certainly appreciate Go's simplicity and ease of use.


Question: better threading performance seems to be the main motivation for this project. Could they not just use multiprocessing instead of threading for the CPU-heavy parts of the YouTube codebase, and threading or asyncio for the I/O-bound parts of it?

Edit: for example, the fib benchmark they cite is CPU-bound. If the Python code used multiprocessing, the performance would scale almost linearly with the number of processes; see the sketch below.
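
A minimal sketch (hypothetical, Python 2.7) of what I mean; each worker process has its own interpreter and its own GIL, so the CPU-bound work actually runs in parallel:

    from multiprocessing import Pool

    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    if __name__ == '__main__':
        pool = Pool(4)  # one worker per core
        # Each fib(30) runs in a separate process, sidestepping the GIL.
        print pool.map(fib, [30] * 8)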


> Could they not just use multiprocessing instead of threading for the cpu-heavy parts of the YouTube codebase, and threading or asyncio for the io-bound parts of it?

Probably, but consider how much bare iron Google has lying around, much of it likely with few cores per CPU. Using it efficiently and not accelerating its obsolescence is probably a priority for them.

BTW, does anyone have data that would suggest how long it takes an org at the scale of an Amazon, Google, or Facebook to entirely replace their HW? I assume that it isn't only through attrition, and that Google for example currently has no servers running that date back to Y2K, but I have no idea what the "half-life" of a server is at their scale.


Whatever the haters say here, this is really cool!


I would love to see this support Python 3.5+, specifically for asyncio. Concurrency via threads in Python is far less appealing.
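
For context, this is the kind of single-threaded concurrency that 3.5+ syntax buys you (a generic asyncio sketch, nothing Grumpy-specific):

    import asyncio

    async def fetch(i):
        await asyncio.sleep(0.1)  # stand-in for a real network call
        return i * i

    loop = asyncio.get_event_loop()
    # Ten "requests" make progress concurrently on one thread, no locks.
    print(loop.run_until_complete(
        asyncio.gather(*[fetch(i) for i in range(10)])))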


The code initially landed on Github 16 days ago[0]. How long was it developed internally before that?

[0] https://github.com/google/grumpy/commit/f60ee257db9d7996e3f8...


The blog post says that YouTube is the inspiration for this runtime, but does YouTube run on Grumpy yet?


YouTube does not run on Grumpy. There is a lot of work left to do before Grumpy can run a large existing codebase.


I tried a simple `http.Get()` example, but the resultant `Response` object appears to not allow access to any of the struct fields as attributes. How does one access a Go object's fields from Python? For example, I want to `io.Copy(os.Stdout, rsp.Body)`.
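
For reference, roughly what I'm attempting (a sketch, assuming Grumpy's __go__ import convention from the README, and assuming Go's multiple return values and package-level variables map over naturally):

    from __go__.io import Copy
    from __go__.net.http import Get
    from __go__.os import Stdout

    rsp, err = Get('http://example.com/')
    Copy(Stdout, rsp.Body)  # accessing rsp.Body is what fails for me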


Unfortunately the native interface is still pretty immature so I can't guarantee things will work as they should. In this case, I think the problem is that you have a *Response object which is not itself a struct. I've rewritten this part of the code a couple times and I think this functionality was lost. I've filed: https://github.com/google/grumpy/issues/13


Whoops, just noticed you (or somebody) had already filed it: https://github.com/google/grumpy/issues/12


Yeah, that was me. I should have updated my post here accordingly. Apologies for the inconvenience!


It's all good. Thanks for filing the issue. I'll get that fixed.


Facebook: PHP -> HipHop

Google: Python -> Go


Insightful discussion in the comments here about performance and JITs: https://lwn.net/Articles/710634/


Will Grumpy be a good fit for Go's scientific packages, such as gonum?


I don't know. I hope so. I think this is an area where Go can succeed and integration with existing Python libraries could be useful.


This is amazing work. Looking forward to the integration between Go and Python.


It's actually a transpiler written in Python that generates Go code.


Nice. I've been toying with doing something similar for the JVM [0]. I think people are coming to realize that while Go is not a great/powerful language (IMO), the runtime is great. I would not be surprised if more and more people start targeting Go if they want a GC and cross-platform static compilation without LLVM complications, and get a nice stdlib for free.

0 - https://github.com/cretz/goahead


The first to write a language that is basically Go + generics wins.


Will you continue maintaining this indefinitely (barring orders from higher up), or is it meant as a one-shot translate-and-dump?


On a related note: given my affection for Python, and having had no idea until just now that YouTube runs Python (albeit 2.7) at Google scale (a language not known for scaling well, given the GIL and such), I now more than ever would love to work on something like that. Welp, back to learning and hacking with Python.


>> "The front-end server that drives youtube.com and YouTube’s APIs is primarily written in Python, and it serves millions of requests per second! YouTube’s front-end runs on CPython 2.7, so we’ve put a ton of work into improving the runtime and adapting our application to work optimally within it."

So has YouTube already migrated over to using Grumpy (and no longer running python in production)?


God, will Python 2 just die already?


May I suggest GoPy as a name?


Maybe you don't like Grumpy as a name. I think it was named after NumPy (pr. ˈnʌmpaɪ), so I guess it's pronounced ˈgrʌmpaɪ rather than ˈgrʌmpiː.

Around 2011 the beta releases of the static typing additions to Apache Groovy were called "grumpy", but the name was dropped after objections from the Grails crowd. I think the quality of the product is more important than the name, so Grumpy should do OK regardless, if it's built and maintained properly.


With click.py, this could make for amazing command-line utilities.



Any chance you guys have tested this with Numba?

That would be awesome.


I'd love something like this for JavaScript


This is the best news to come out of the Python space in a very long time. Python on the Go runtime, no GIL, more performance, is exactly what people wanted out of a new Python.


+1 to cooperative multitasking


No C extensions boo.


But isn't that really what lets them do this? From following PyPy (and other alternate-Python discussions), it seems like eliminating the GIL and getting better performance out of Python, even in a C implementation, isn't that hard if you drop the C extensions.

As soon as something can see into the guts of the interpreter, you have to maintain compatibility with those guts, which is a pain/waste.

Worse, that view of the interpreter wasn't designed for multi-threading, which is why the GIL exists. The C extensions weren't designed to be multi-threaded, because that wasn't a thing in Python, so they're not safe. You either have to drop them, define a new interface layer that would be safe, or somehow sandbox their little view of the world while keeping it coherent between threads.

If you have a codebase where you can make the choice to drop C extensions and you're trying to accelerate Python, it seems like a very smart choice.


The irony is that some C extensions exist because they have better performance than Python.

On the other hand, there are many C extensions which are just interop/wrappers for existing C code. I think no language is naive enough to think they can get away without C interop.

C extensions are actually pretty nice because you can wrap the C code in idiomatic Python. With ctypes, you have to, e.g., maintain two parallel structure definitions, which is very problematic for some codebases.
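
To make that duplication concrete, a minimal sketch (the C struct here is hypothetical): the ctypes Structure must mirror the C definition field-for-field, by hand, and silently goes stale if the C side changes:

    # C side, defined elsewhere: struct point { int32_t x; int32_t y; };
    import ctypes

    class Point(ctypes.Structure):
        # Must be kept in sync with the C header manually.
        _fields_ = [('x', ctypes.c_int32),
                    ('y', ctypes.c_int32)]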

> In following PyPy (and other alternate Python discussions) it seems like eliminating the GIL and getting better performance out of Python, even in C, isn't that hard if you drop the C extensions.

AFAIK, many projects initially struggle to even achieve performance parity with CPython. The GIL isn't evil, it's a simple solution to a hard problem. But at this point, a complex solution to a hard problem is better if it's faster, and some people care more about speed than C interop (and vice versa).

What would be awesome is a Python runtime that could run with fine-grained locking until a C extension is loaded, and then continue with coarse locking. But the problem remains that there's no upgrade path for C interop.

The only thing I can think of is using type annotations and something like Cython's cdef to write Python-implementation-independent C interop that doesn't suck as badly as ctypes. Then the Python runtime could also lock the arguments at a very fine level while the function is being called.


Wish there was one for Ruby as well.


Sometimes I really wonder what Google has against Ruby.


2.7, haha. First TensorFlow, now Grumpy. I can't stop laughing as all the 3.x zealots try to squirm out of this. Talk about awkward.

Yes, yes, I know TF now "does" 3, but we all know what Google really cares about.

"I'd like to support 3.x at some point" - trotterdylan.

Read: nice to have but when it gets down to brass tacks, 2.7 is where it's at for Google.


"So we asked ourselves a crazy question: What if we were to implement an alternative ?"

As always with Google...

What if you humbly contribute to open source projects rather than creating new stuffs, labelled with your own brand, controlled by your own engineers, and with your own design choices, however good they may be ?


They tried that with CPython, it didn't work: https://www.python.org/dev/peps/pep-3146/



Interesting, I hadn't seen that post! I wonder if there are any posts that elaborate on the internal Google bits more. I'd love to know what "they found other ways to solve their performance problems" wound up meaning in practice.


> I'd love to know what "they found other ways to solve their performance problems" wound up meaning in practice.

Me too, especially as this was already several years after the YouTube acquisition.

Also interesting to note that potential internal customers were put off by having to upgrade to 2.6.1 in order to use Unladen Swallow (presumably they got over that reluctance at some point, as 2.7 is now standard).


How entitled can you be? Why are they required to do anything of the sort?


Is this the final nail in the coffin for Python3? Seems like it.

Who would use 3 if you could have CPython2 for existing code and write new code in Grumpy Python? This is the dream language for me. Python on the Go runtime.


Everyone who likes the numerous new language features, or new syntax for old features, in Python 3?


Those can be built into Grumpy if anyone cares to do so. You don't need Python 3 for that stuff. A single developer recently released a "Python 2.8" that backported almost every new Python 3 feature.

I'm hoping Grumpy becomes a permanent fork of Python 2 that uses the Go runtime. That would really be great, and it's essentially what they've got now.


If you backport every new Python3 feature to Python2, it becomes Python3, by definition.


Except that this "Python 3" would compile and run Python 2 source code. So it would be "Python 3", but backwards compatible. Which is exactly what everyone wanted (that, and a performance boost).


It could become Python 2.8, wherein the standard library has both the legacy versions and the new versions, and any cross-compatible syntax is allowed.

This leaves us with more than one way to do things, like metaclass declaration.

There's a lot in Python 3 where changes were made to the syntax for 'clarity'; those warts weren't removed for any technical reason, but because of the thought that since backwards compatibility was being broken anyway, we might as well get the most bang for our buck.


> It could become Python 2.8, wherein the standard library has both the legacy versions and the new versions, and any cross-compatible syntax is allowed.

So two `builtin` modules, then?

What if someone does `sys.modules['builtin']`? Or any other kind of explicit string-based lookup?

How does pickle figure out which types to instantiate? There'd be a lot of types with same qualified names but different implementations with this approach...

It feels like the only way this would work reliably, is if you completely isolate the Py2 and Py3 universes. So if you e.g. pickle from Py2 code, it only looks at modules and types that Py2 universe knows, and vice versa.

But then what happens when code using the old library interacts with the new one (e.g. tries to pass objects around)? If that is prohibited, then you effectively still have two different languages, just with a single shared implementation - but no ability to gradually replace bits and pieces of code, for example, which would seem to be the biggest motivation for such a thing.
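
To make the pickle problem concrete: a pickle records only a module path and a class name, so lookup on load is purely by qualified name (sketch; standard CPython behavior):

    import pickle, pickletools

    class Foo(object):
        pass

    data = pickle.dumps(Foo())
    # The disassembly shows a GLOBAL opcode referencing '__main__ Foo';
    # nothing could distinguish a Py2-universe Foo from a Py3-universe one.
    pickletools.dis(data)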


Python 3 is a more modern, better language; that's why. Hopefully Grumpy will support it soon enough, though it's understandable that they started with 2.7 to support Google's huge older codebase.


That's a bit of hyperbole. Python 3 is technical churn, not technical innovation. They just stirred the pot and added some features that have no barrier to being added to Python 2, and indeed were, with Naftali Harris's "Python 2.8".

But the bigger problem, beyond the lack of wide-scale user adoption, is that the primary reason for Python 3 (Unicode) was botched. Go got this right: everything is a byte string and the assumed encoding is UTF-8.



