
> Amusingly, it runs Python 2.7, even though this project started long after Python 3.x came out.

Basically, we needed to support a large existing Python 2.7 codebase. See discussion here: https://github.com/google/grumpy/issues/1

> It's a hard-code compiler, not an interpreter written in Go. That implies some restrictions, but the documentation doesn't say much about what they are. PyPy jumps through hoops to make all of Python's self modification at run-time features work, complicating PyPy enormously. Nobody uses that stuff in production code, and Google apparently dumped it.

There are restrictions. I'll update the README to make note of them. Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.

> If Grumpy doesn't have a Global Interpreter Lock, it must have lower-level locking. Does every built-in data structure have a lock, or does the compiler have enough smarts to figure out what's shared across thread boundaries, or what?

It does fine grained locking. Mutable data structures like lists and dicts do their own locking. Incidentally, this is one reason why supporting C extensions would be complicated.
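To illustrate the idea only: the sketch below is not Grumpy's runtime (which is written in Go), just a minimal Python picture of a container that guards its own state with a per-object lock rather than relying on one global lock. `LockedList` and `worker` are hypothetical names made up for this example.

```python
import threading


class LockedList(object):
    """Hypothetical list wrapper: every instance carries its own lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def append(self, item):
        # Only this object is locked; other containers stay available.
        with self._lock:
            self._items.append(item)

    def __len__(self):
        with self._lock:
            return len(self._items)


def worker(shared, count):
    for i in range(count):
        shared.append(i)


shared = LockedList()
threads = [threading.Thread(target=worker, args=(shared, 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(shared))
```

The point of the design is that contention is limited to threads touching the same object, at the cost of a lock per mutable structure.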




> Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.

What about stuff like literal_eval? Or even just monkeypatching with name.__dict__[param] = value ?

> It does fine grained locking. Mutable data structures like lists and dicts do their own locking. Incidentally, this is one reason why supporting C extensions would be complicated.

Would there be a succinct theoretical description of exactly how that's implemented anywhere? What about things like numpy arrays.


> > Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.

> What about stuff like literal_eval? Or even just monkeypatching with name.__dict__[param] = value ?

literal_eval could in principle be supported I think. name.__dict__[param] = value works as you'd expect:

  $ make run
  class A(object):
    pass
  a = A()
  a.__dict__['foo'] = 'bar'
  print a.foo
  bar
EDIT: fixed formatting
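For context on why literal_eval is a much smaller ask than eval: in CPython, ast.literal_eval only accepts literal syntax (strings, numbers, tuples, lists, dicts and so on) and rejects anything that would execute code. The sketch below shows that behaviour in CPython; it says nothing about what Grumpy supports, and the input strings are made up for illustration.

```python
import ast

# A literal expression is parsed into plain Python data.
config = ast.literal_eval("{'retries': 3, 'hosts': ['a', 'b']}")
print(config['retries'])

# A function call is not a literal, so it is refused rather than run.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True
else:
    rejected = False
print(rejected)
```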


Hmm, numpy isn't pure python, is it? If I read correctly this only works with pure python.


By volume numpy is mostly assembler written to the Fortran ABI (it's a LAPACK/BLAS-etc wrapper).


NumPy is a library that provides typed multidimensional arrays and functions that run atop them. It does provide a built-in LAPACK/BLAS or can link externally to LAPACK/BLAS, but that's a side effect of providing typed arrays and is nowhere near the central purpose of the library.

Also, NumPy is implemented completely in C and Python, and makes extensive use of CPython extension hooks and knowledge of the CPython reference counting implementation, which is part of the reason why it is so hard to port to other implementations of Python.


Having typed arrays without efficient functions over them would be rather pointless.


Are you sure you aren't mistaking numpy for scipy?


numpy is the foundation of scipy.


Is there not a single namedtuple in the entire Google codebase? That's strange :o


Heh, I came across the namedtuple exec thing the other day when I was trying to get the collections module working :\

namedtuple will have to be implemented differently. I think it can be accomplished by defining the class with type()? Maybe with a metaclass...


> I think it can be accomplished by defining the class with type()?

I've done it using more or less that method. The code is in the "coll" sub-package of my plib.stdlib project; the Python 2 version is here on bitbucket:

https://bitbucket.org/pdonis/plib-stdlib/src


You won't get exact compatibility, but a metaclass implementation would give almost all the features. I can't remember what exactly you give up, but I did that once and I lost some introspection friendliness.


Nevermind, all you need is type(). Metaclass unnecessary.
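A rough sketch of what the type() approach could look like, assuming you only need positional construction, field access by name, and a readable repr. `simple_namedtuple` is a hypothetical helper, not the stdlib implementation and not the plib.stdlib code mentioned above; it leaves out keyword arguments, _replace, _asdict and field-name validation.

```python
from operator import itemgetter


def simple_namedtuple(typename, field_names):
    """Hypothetical exec-free stand-in for collections.namedtuple."""
    fields = tuple(field_names.split())

    def __new__(cls, *args):
        if len(args) != len(fields):
            raise TypeError('expected %d arguments, got %d'
                            % (len(fields), len(args)))
        return tuple.__new__(cls, args)

    def __repr__(self):
        pairs = ', '.join('%s=%r' % (n, v) for n, v in zip(fields, self))
        return '%s(%s)' % (typename, pairs)

    namespace = {
        '__slots__': (),        # no per-instance __dict__
        '__new__': __new__,
        '__repr__': __repr__,
        '_fields': fields,
    }
    # One read-only property per field, backed by the tuple index.
    for index, name in enumerate(fields):
        namespace[name] = property(itemgetter(index))
    # Build the class dynamically instead of exec-ing a source template.
    return type(typename, (tuple,), namespace)


Point = simple_namedtuple('Point', 'x y')
p = Point(1, 2)
print(repr(p), p.x + p.y)
```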


Are namedtuples that popular? They always felt awkward to me. For a temp variable holding multiple values inside a loop, I use either a normal tuple or a dict. For passing data around, a dict or a real class. I never saw the huge win from namedtuple.


namedtuples are tuples, meaning they are stored efficiently and are immutable (so they can also be used as dictionary keys). Unlike regular tuples, their fields can be accessed by name, like a class or dictionary, for readability, but they require far fewer allocations than a dict or class, so they are much faster. Also, because they are tuples, you get well-defined methods (printing, comparison, hashing) that you'd have to implement yourself for a dict or class.

If you like writing in functional style, namedtuples are much more natural than dict or classes, and more efficient to boot.
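A small illustration of those points with the stdlib namedtuple; the `Point` type and the values are made up for the example.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')

# Hashable (as long as the fields are), so usable as a dictionary key.
labels = {Point(0, 0): 'origin'}
print(labels[Point(0, 0)])

# Equality, ordering and repr all come from tuple for free.
equal_to_plain_tuple = Point(1, 2) == (1, 2)
ordered = sorted([Point(2, 1), Point(1, 5)])
print(equal_to_plain_tuple, ordered)
```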


Attrs (https://attrs.readthedocs.io/) replaced namedtuple for us (and many others). It's slightly more verbose but allows all class goodness such as methods, attribute validation, etc.


Doesn't work for everything, but you can subclass a namedtuple:

  from collections import namedtuple
  
  class Foo(namedtuple("Foo", "a b c")):
      @property
      def sum(self):
          return self.a + self.b + self.c
  
  
  f = Foo(1,2,3)
  print f.sum


That doesn't look super awesome to me. I.e. classes or attrs both seem better.


Aaaaaarrrrrgggggh! I've had that particular itch for every one of my ten years with python, and at last I get to scratch it!

Thanks so much for bringing it up.


We use them extensively in our API client code to pass back immutable, well-defined data structures. Dictionaries and classes are mutable and then each layer of code tends to sloppily change them however is convenient, meaning the underlying data can end up being represented differently in different code flows.

Namedtuples are a way to preserve the data unless the consuming code _really_ wants to change it, which is sometimes legitimate.

I'm not totally sold, as in some cases dictionaries or classes would add nice value. But namedtuples have a rigidity that makes you think twice before tampering with retrieved data.
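To make the "think twice" part concrete, here is a minimal sketch with a made-up `User` record: direct assignment fails, and a deliberate change goes through _replace, which returns a new object and leaves the original alone.

```python
from collections import namedtuple

User = namedtuple('User', 'name email')
user = User('ada', 'ada@example.com')

# Casual mutation is refused.
try:
    user.email = 'other@example.com'
except AttributeError:
    mutated = False
else:
    mutated = True

# An intentional change produces a new record instead.
updated = user._replace(email='ada@example.org')
print(mutated, user.email, updated.email)
```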


In every introductory Python course, tuples are presented as just immutable lists. However, a "more accurate" way of describing tuples is to think of them as records with no field names. Once you see tuples as records, the fact that they are immutable makes sense, since the order and number of the items matter (they remain constant). Records usually have field names, and this is where namedtuples come in handy. They also help to clarify what the tuple's items are (see https://youtu.be/wf-BqAjZb8M?t=44m45s, just a 2-minute clip). If you are wondering why not just define a class, here are a couple of reasons:

1) You know beforehand that the number of items won't change and that the order matters, since you are handling records. A namedtuple is a simple way of enforcing that constraint.

2) Because they extend tuple, they are immutable too, and instances don't store attributes in a per-instance __dict__; field names are stored on the class, so if you have tons of instances you save a lot of space.

Why create a class if you probably just need read-only access? And what if you need a method? Then you can subclass your namedtuple and add the functionality you want. If, for example, you want to control the values of the fields when creating the namedtuple, you can do so by overriding __new__. At that point it is worth taking a look at https://pypi.python.org/pypi/recordclass.
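A short sketch of the subclassing idea, assuming a made-up `Temperature` record: __new__ checks the value before the tuple is built, a property adds derived data, and the empty __slots__ keeps the subclass from growing a per-instance __dict__.

```python
from collections import namedtuple


class Temperature(namedtuple('Temperature', 'kelvin')):
    __slots__ = ()  # keep instances as compact as the base namedtuple

    def __new__(cls, kelvin):
        # Validate before the immutable tuple is created.
        if kelvin < 0:
            raise ValueError('kelvin must be non-negative')
        return super(Temperature, cls).__new__(cls, kelvin)

    @property
    def celsius(self):
        return self.kelvin - 273.15


t = Temperature(300)
print(t.kelvin, round(t.celsius, 2))

try:
    Temperature(-1)
except ValueError:
    rejected = True
else:
    rejected = False
```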


(1) A grep can identify all the occurrences; a sed might even fix them. (translated to Google internal tools obviously)

(2) Apparently setting a __dict__ key works; they could be implemented like that.


There are some defined in built-in modules. Even in Python 2.7, where sys.version_info is a namedtuple.


Yeah, one of the motivations for adding namedtuple to stdlib was a drop-in compatible upgrade of existing interfaces returning tuples. Notable atrocities included `time.localtime()` returning a 9-tuple, and `os.stat()` returning a 10-tuple...
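For example, in CPython these results still behave like the old plain tuples while also exposing named fields. (Strictly speaking they are C-level struct sequences rather than collections.namedtuple classes, but the compatibility idea is the same.)

```python
import os
import stat
import sys
import time

# Old style: positional index. New style: named attribute.
now = time.localtime()
year_matches = now[0] == now.tm_year

info = os.stat('.')
size_matches = info[stat.ST_SIZE] == info.st_size

major_matches = sys.version_info[0] == sys.version_info.major

print(year_matches, size_matches, major_matches)
```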


> There are restrictions. I'll update the README to make note of them. Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.

I'm guessing pretty much the entire AST module is a no-go?


I think the CPython AST module is written as a C extension module so currently it's a no-go. I don't think there's a fundamental reason Grumpy couldn't run a pure Python AST module, though.


The ast module itself is in Python, but it imports the _ast module which is an extension module. This actually isn't that big of a deal, though, as the entire AST is defined in a DSL (see https://cpython-devguide.readthedocs.io/en/latest/compiler.h... for some details), so you just have to write some code to generate _ast in Python instead of C (which PyPy may have already done).


So I take it that means Grumpy can't run itself?


Correct, Grumpy cannot yet run Grumpy :)


> I'll update the README to make note of them.

I managed to run into 2 trying to build a 5 line program :-)

  $ cat t.py; ./tools/grumpc t.py  > t.go;go build t.go;echo '----';./t
  import sys
  print sys.stdin.readline()
  ----
  AttributeError: 'module' object has no attribute 'stdin'
  $

  $ cat t.py ;./tools/grumpc t.py
  c = {}
  top = sorted(c.items(), key=lambda (k,v): v)
  Traceback (most recent call last):
    File "./tools/grumpc", line 102, in <module>
      sys.exit(main(parser.parse_args()))
    File "./tools/grumpc", line 60, in main
      visitor.visit(mod)
    File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
      return visitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/stmt.py", line 302, in visit_Module
      self._visit_each(node.body)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/stmt.py", line 632, in _visit_each
      self.visit(node)
    File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
      return visitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/stin visit_Assign
      with self.expr_visitor.visit(node.value) as value:
    File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
      return visitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/expr_visitor.py", line 101, in visit_Call
      values.append((util.go_str(k.arg), self.visit(k.value)))
    File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
      return visitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/expr_visitor.py", line 246, in visit_Lambda
      return self.visit_function_inline(func_node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/expr_visitor.py", line 388, in visit_function_inline
      func_visitor = block.FunctionBlockVisitor(node)
    File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/block.py", line 432, in __init__
      args = [a.id for a in node_args.args]
  AttributeError: 'Tuple' object has no attribute 'id'


Ugh, sorry about that. There's a couple issues here:

1. Lambda tuple args are not yet supported -- I actually didn't know that was a thing :\ -- https://github.com/google/grumpy/issues/17

2. Even if that worked properly, sorted() is not yet implemented: https://github.com/google/grumpy/issues/16


Yeah.. It also used to work with def, but it was removed in python3. You can do this in 2.7:

  def func((a,b)):
      return b

  mytuple = 1,2
  print func(mytuple)
in py3 you need

  def func(t):
      a,b = t
      return b
Not sure if

This is probably the cleaner way to write that:

  key=operator.itemgetter(1)
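Spelled out as a complete snippet (the `counts` data is made up), this form avoids the tuple-unpacking lambda entirely and also runs on Python 3:

```python
import operator

counts = {'go': 3, 'python': 7, 'c': 1}

# Python 2 only: sorted(counts.items(), key=lambda (k, v): v)
# Portable: pick element 1 (the value) of each (key, value) pair.
top = sorted(counts.items(), key=operator.itemgetter(1))

# Equivalent lambda without tuple parameters.
same = sorted(counts.items(), key=lambda kv: kv[1])

print(top)
```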


sorted() is widely used; adding it will extend coverage considerably.


> Basically, exec and eval don't work.

Couldn't they be supported with a slower runtime implementation? I mean, I still love the idea.



