Exposing Python 3.6's Private Dict Version (jakevdp.github.io)
89 points by lamby on May 28, 2017 | 14 comments

> Python uses many dictionaries in the background: among other things, local variables

Just a minor correction: Python doesn't use dicts for local variables. `locals()` is a lie; it creates a dict from scratch. Internally, each frame has an array of locals whose values correspond to the names in the code object's co_varnames.
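To make this concrete, here is a small sketch (the function `add` is made up for illustration) showing that the compiler records local names on the code object, in `co_varnames`, and that the number of local slots is fixed at compile time:

```python
def add(a, b):
    # 'a', 'b' and 'total' each get a fixed slot in the frame's locals array.
    total = a + b
    return total


code = add.__code__
local_names = code.co_varnames  # arguments first, then the other locals
slot_count = code.co_nlocals    # number of local slots, fixed at compile time

print(local_names, slot_count)
```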

> `locals()` is a lie, it creates a dict from scratch.

Are you sure? The behavior I see suggests otherwise...

    >>> a = 1
    >>> d = locals()
    >>> d
    {'a': 1, 'd': {...}}
    >>> b = 2
    >>> d
    {'a': 1, 'd': {...}, 'b': 2}
    >>> d['b'] = 123
    >>> b
    123
    >>> d is locals()
    True

See PyFrame_FastToLocals() in the CPython source.

Try wrapping your code in a function and notice that the changes are no longer mirrored. Calls to exec() and eval() cause PyFrame_FastToLocals() to be applied afterwards, as used by the interactive console, but this doesn't happen during normal execution.
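For what it's worth, the module-level behavior can be reproduced outside the REPL. In this sketch (the names `source` and `namespace` are mine), exec() runs the code with a single dict as both globals and locals, so writes through `locals()` really do create variables:

```python
source = """
a = 1
d = locals()
d['b'] = 123
result = b             # found in the namespace dict
same = d is globals()  # at this level, locals() and globals() are one dict
"""

namespace = {}
exec(source, namespace)  # one dict serves as both globals and locals

print(namespace['result'], namespace['same'])
```

Nothing here says anything about function scope, which is where the difference shows up.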

The REPL essentially operates at the global scope, which is represented as a dictionary. Variables local to a function are not stored in a dictionary, however:

    def main():
        a = 1
        d = locals()
        print d
        d['b'] = 123
        print b
        print d
        print d is locals()

Which prints

    {'a': 1}
    Traceback (most recent call last):
      File "test.py", line 23, in <module>
        main()
      File "test.py", line 19, in main
        print b
    NameError: global name 'b' is not defined

That's because the compiler doesn't know the locals() dictionary will be modified at runtime, so it treats b as a global variable (which happens to be undefined).
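You can check how the compiler classified `b` without running the function. A minimal Python 3 sketch of the same example (only the `return` is new): `co_varnames` lists the names that get a local slot, while `co_names` holds global, builtin, and attribute names.

```python
def main():
    a = 1
    d = locals()
    d['b'] = 123
    return b  # compiled as a global lookup, since b is never assigned here


code = main.__code__
is_local = 'b' in code.co_varnames  # names with a slot in the frame
is_global = 'b' in code.co_names    # global, builtin, and attribute names

try:
    main()
    outcome = "returned"
except NameError:
    outcome = "NameError"

print(is_local, is_global, outcome)
```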

To show that not even changes to existing local variables work, try

  def main():
      a = 1
      d = locals()
      print d
      b = None
      d['b'] = 123
      print b
      print d
      print locals()
      print d is locals()

which prints

  {'a': 1}
  None
  {'a': 1, 'b': 123}
  {'a': 1, 'b': None, 'd': {...}}
  True

I.e. changes to the dictionary returned by locals() are ignored, and calling locals() again overwrites the previous values.
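A Python 3 sketch of the same experiment, trimmed to the part that matters. Whether repeated calls to locals() return the same dict object depends on the Python version, so this version doesn't rely on that; it only checks that the write never reaches the real local.

```python
def main():
    b = None
    d = locals()
    d['b'] = 123  # changes only the dict, not the local variable's slot
    # Read the real local, then ask locals() again for its view of b.
    return b, locals()['b']


value, refreshed = main()
print(value, refreshed)
```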

Why this happens has been explained, but more generally, what you posted doesn't really prove anything. The dictionary created from scratch could be made an updatable view by hooking __setitem__.
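To illustrate the idea, here is a purely hypothetical sketch (this is not how CPython implements locals(); `WriteThroughDict`, `slots`, and `names` are invented names). A freshly built dict can still write through to separate storage, so seeing changes mirrored doesn't show that the dict is the storage:

```python
class WriteThroughDict(dict):
    """A dict built from scratch that mirrors writes into a list of slots."""

    def __init__(self, slots, names):
        super().__init__(zip(names, slots))
        self._slots = slots
        self._index = {name: i for i, name in enumerate(names)}

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        # Write through only for names that have a slot.
        if key in self._index:
            self._slots[self._index[key]] = value


slots = [1, None]  # the "real" storage, standing in for a frame's array
view = WriteThroughDict(slots, ('a', 'b'))
view['b'] = 123

print(slots, view)
```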

If you're going to contradict something so objectively answered by the source with "are you sure?", it might help to consider whether your example actually demonstrates what you're trying to prove.

Raymond Hettinger's presentation at PyCon this year was also interesting.

Modern Python Dictionaries: A confluence of a dozen great ideas (PyCon 2017) - https://www.youtube.com/watch?v=npw4s1QTmPg

I watched an earlier version of Raymond's talk and skimmed through the other one. One thing I think they both mention is that, to take advantage of some of these improvements, you need to set every attribute that any instance of your class will ever use in the __init__ method.

This is probably a good practice anyway. I'm curious now how I can test and see if I'm taking advantage of this or not. I wonder if using the attrs module will also make use of this new feature.
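One rough way to look for the effect, offered as my own assumption rather than something from the talks: compare the size of an instance `__dict__` when all attributes are set in `__init__` against one that gains an attribute later. The classes `InitAll` and `Late` are made up, and the numbers are CPython-specific and vary between versions, so treat them as a hint at best.

```python
import sys


class InitAll:
    def __init__(self):
        self.a = 1
        self.b = 2


class Late:
    def __init__(self):
        self.a = 1


early = InitAll()
late = Late()
late.b = 2  # attribute added outside __init__

# sys.getsizeof reports the dict object's own size, not its contents.
size_early = sys.getsizeof(vars(early))
size_late = sys.getsizeof(vars(late))

print(size_early, size_late)
```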

I'm starting to wonder whether it wouldn't be a better design to forget about classes for most use cases, instead only use namedtuples and have a bunch of functions defined that take one type of namedtuple in their first argument. You can easily bind this by doing my_applied_func = functools.partial(my_func, my_named_tuple). This design IMO has a few advantages compared to classes:

* __repr__ comes for free.

* converting all members to a list or dict comes for free.

* converting a list or dict TO a namedtuple comes for free.

* more restrictive in what you can do with data structures inside the namedtuple - may lead to a cleaner design where you think about pulling apart the data into multiple tuples at an earlier point in the prototype phase.

* You by design won't make functions into instance methods that depend on multiple data tuples.

* Need to extend a function to deal with different types of tuples? You now don't need to refactor an instance method into a first class function, instead just dispatch and be done.

* Inheritance can be solved with the above method as well. This may actually be more maintainable than class-based inheritance, since you see all the logic for a specific function explicitly (or explicitly dispatched) in one place.

The only reasons I can think of to still use a class: you can profit heavily from inheritance, or you need to program against an "interface" such as dict- or list-like behavior or comparison operators.
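A minimal sketch of what this style could look like, using only the standard library. The `Point` and `Circle` types and the `describe` function are invented for illustration, and `functools.singledispatch` is just one way to do the dispatching mentioned above.

```python
from collections import namedtuple
from functools import partial, singledispatch

Point = namedtuple("Point", ["x", "y"])
Circle = namedtuple("Circle", ["center", "radius"])


@singledispatch
def describe(shape):
    raise TypeError("unsupported shape: %r" % (shape,))


@describe.register(Point)
def _describe_point(shape):
    return "point at (%d, %d)" % (shape.x, shape.y)


@describe.register(Circle)
def _describe_circle(shape):
    return "circle of radius %d around %s" % (shape.radius, describe(shape.center))


p = Point(3, 4)
as_text = repr(p)                      # __repr__ comes for free
as_dict = dict(p._asdict())            # namedtuple -> dict
from_list = Point._make([5, 2])        # list -> namedtuple
from_dict = Point(**{"x": 5, "y": 2})  # dict -> namedtuple
bound = partial(describe, p)           # "binding" a function to a tuple

print(as_text, as_dict, from_list, bound())
```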

A lot of your points are solved with attrs. I suggest reading up on it if you haven't already. http://www.attrs.org/en/stable/

I think organizing / refactoring a bunch of named tuples would be a mess. In one code base I maintain I switched a bunch of namedtuples to attrs classes and the code instantly became more readable / maintainable.

The downside to the namedtuple is that it's a tuple and can be unpacked as one.

    for x, y in points: print x, y

    # vs.

    for p in points: print p.x, p.y

    # ... now go and add a z to the point class and see what happens to your code.

I don't get your point about unpacking yet to be honest. Both your code examples work fine with namedtuples I think. Adding additional attributes is better in the second example, which is why it is to be preferred, but that works with namedtuples as far as I understand.

Adding a member to the tuple requires updating all code that unpacks it.

Adding an attribute doesn't impact any code that doesn't use the attribute.
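A short sketch of that difference, assuming a `Point` namedtuple that has just gained a `z` field (the data is made up). Attribute access keeps working, while two-name unpacking raises ValueError:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y", "z"])  # z is the newly added field
points = [Point(3, 4, 0), Point(5, 2, 1)]

# Attribute access is unaffected by the extra field.
by_attribute = [(p.x, p.y) for p in points]

# Unpacking into two names no longer matches the tuple length.
try:
    by_unpacking = [(x, y) for x, y in points]
except ValueError as exc:
    by_unpacking = "ValueError: %s" % exc

print(by_attribute)
print(by_unpacking)
```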

Yes, that's why you usually don't use unpacking for a namedtuple. This works just fine without me adding additional 3rd party packages:

    from collections import namedtuple

    Point = namedtuple('Point', ['x', 'y'])
    points = [Point(3,4), Point(5,2)]

    for p in points:
        print(p.x, p.y)

Thank you for this, it was a very interesting read.
