
The Descriptor Protocol and Python Black Magic - avyfain
http://faingezicht.com/projects/2016/04/26/python-black-magic/
======
pdknsk
A related, albeit simpler, Python quirk.

    
    
      >>> a = 100
      >>> b = 100
      >>> a is b
      True
      >>> a = 1000
      >>> b = 1000
      >>> a is b
      False

~~~
sophacles
Actually not related. The case in the article is about late-binding methods to
object instances (creating a new object on each access). In Python 2, the
"unbound method" was likewise created lazily on access, whereas in Python 3 an
unbound method is just a reference to the bare function in the class's
namespace. The lazy evaluation comes from Python's descriptor protocol (see the
links in the article).
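A minimal Python 3 sketch of that difference, assuming an illustrative class `A` with a method `b` (these names are not taken from the article):

```python
class A:
    def b(self):
        pass

a = A()

# Python 3: class access returns the plain function stored in the class
# namespace, so repeated lookups give back the same object.
class_access_same = A.b is A.b
is_plain_function = A.b is A.__dict__["b"]

# Instance access builds a fresh bound-method object on every lookup.
first, second = a.b, a.b
instance_access_same = first is second    # False: two distinct objects
instance_access_equal = first == second   # True: same function, same instance
```

The two names `first` and `second` keep both bound methods alive, so the identity comparison is between two objects that exist at the same time.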

The case you point out happens because CPython interns (caches and reuses)
some values - e.g. certain string literals, small integers, and so on.
More on interning:
[https://en.wikipedia.org/wiki/String_interning](https://en.wikipedia.org/wiki/String_interning)

~~~
pdknsk
I meant related in terms of the strictness of the `is` operator.

~~~
deathanatos
The `is` operator is equally strict in all examples here and in the
article: it compares the identities (in CPython, this is essentially the
memory address) of two objects.

The reason, in your example, that `is` returns different results is that small
integers are "interned" (a CPython implementation detail); i.e., the literal
100 always references the exact same integer object, whereas each 1000 will
cause a new integer to be allocated. (So a and b, while both representing
1000, are stored twice in memory.)
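A rough sketch of the same effect outside the REPL, assuming CPython. The integers are built with `int()` at runtime so the compiler cannot merge equal literals, and 100000 is used as a stand-in for "a value well outside the small-integer cache", since the exact cache bounds are an implementation detail:

```python
# Build the integers at runtime so equal literals are not merged at compile time.
small_a, small_b = int("100"), int("100")
big_a, big_b = int("100000"), int("100000")

# Identity: CPython hands back the cached object for small values only.
small_same = small_a is small_b   # True on CPython
big_same = big_a is big_b         # False on CPython: two separate objects

# Equality compares values and does not depend on caching.
values_equal = big_a == big_b     # True
```

This is why `==` is the right comparison for numbers; `is` only answers whether two names point at one object.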

~~~
SEJeff
And in Python 3, you can do this manually for strings with sys.intern()
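For illustration, a small Python 3 sketch of `sys.intern()`, which accepts strings only. The sample text is arbitrary, and the strings are joined at runtime so that they start out as separately built objects:

```python
import sys

# Build two equal strings at runtime.
parts = ["descriptor", " ", "protocol"]
s1 = "".join(parts)
s2 = "".join(parts)

equal_values = s1 == s2

# sys.intern returns the canonical copy from the interned-strings table,
# so equal strings map to the same object.
i1 = sys.intern(s1)
i2 = sys.intern(s2)
interned_same = i1 is i2
```

After interning, `i1 is i2` holds regardless of whether `s1` and `s2` were the same object to begin with.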

~~~
sophacles
In py2 it's a builtin, so just: intern(thing)

------
rbistolfi
Note that the descriptor protocol itself is not causing this behavior. What
actually happens here is that functions are descriptors (they implement
__get__()) that generate the instancemethod object on attribute access. More
info here:
[https://wiki.python.org/moin/FromFunctionToMethod](https://wiki.python.org/moin/FromFunctionToMethod)
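To make that concrete, here is a hedged Python 3 sketch that calls the function's `__get__()` by hand, which is roughly what attribute lookup on an instance does. The class `A` and method `b` are illustrative, and in Python 3 the resulting object is of type `method` rather than `instancemethod`:

```python
class A:
    def b(self):
        return "called on " + type(self).__name__

a = A()
func = A.__dict__["b"]          # the raw function stored on the class

# Functions implement __get__, which is what makes them descriptors.
has_get = hasattr(func, "__get__")

# Invoke the descriptor manually instead of writing a.b.
bound = func.__get__(a, A)

same_as_lookup = bound == a.b         # equal to what normal lookup produces
wraps_func = bound.__func__ is func   # the bound method wraps the function
wraps_instance = bound.__self__ is a  # ...and remembers the instance
result = bound()                      # no explicit self needed
```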

------
avyfain
Author here. Questions/comments/feedback are very much appreciated! AMA

~~~
re
Very minor point: be careful comparing IDs the way you do in the REPL (i.e.,
without keeping a reference to the previous object). Since the old object is
released, its memory address may be reused, so the new object can end up with
the same ID even though it is a different object, which gives misleading
results:

    
    
        Python 2.7.10 (default, Oct 23 2015, 19:19:21) 
        [GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
        Type "help", "copyright", "credits" or "license" for more information.
        >>> class A:
        ...   def b(self): pass
        ... 
        >>> A.b
        <unbound method A.b>
        >>> hex(id(A.b))
        '0x104e2c0a0'
        >>> hex(id(A.b))
        '0x104daeb90'
        >>> hex(id(A.b))
        '0x104daeb90'
    

So the `x is y` approach is "safer." :)
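A short Python 3 sketch of the pitfall, using bound methods on an illustrative class `A`. Only the comparisons made while both objects are alive are reliable; the last one depends on how the interpreter reuses memory:

```python
class A:
    def b(self):
        pass

a = A()

# Keep both bound-method objects alive by naming them.
first = a.b
second = a.b

different_objects = first is not second
# Two objects alive at the same time never share an id.
different_ids = id(first) != id(second)

# Without references, each temporary may be freed before the next one is
# created, so the two ids are allowed to coincide. Do not rely on the result.
temp_ids_match = id(a.b) == id(a.b)
```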

~~~
avyfain
Interesting, had not considered that at all. So basically, to ensure the old
reference is not released, just naming it would do the trick? AKA:

    
    
        >>> some_unbound_method = A.b
        >>> hex(id(some_unbound_method))
        '0x104e2c0a0'

~~~
phasmantistes
Yep, that's right. To lay out the example the way you did in the article, it
would look like:

    
    
        >>> foo = A.b
        >>> bar = A.b
        >>> hex(id(foo))
        '0xdeadbeef'
        >>> hex(id(bar))
        '0xdecafcab'
        >>> foo is bar
        False

------
voltagex_
>In 3, this distinction between bound and unbound doesn’t exist, but
strangely, the docs for Python 3 are not up to date

Is there a place for raising documentation bugs?

~~~
jcl
It is linked in the second update at the bottom of the article:

[https://bugs.python.org/issue23702](https://bugs.python.org/issue23702)

