

Python quirks - luu
http://www.lshift.net/blog/2009/10/29/python-quirks

======
kgm
Some of these quirks are explained incorrectly. I find the real explanations
interesting, so please allow me to provide them.

> 999+1 is not 1000

The examples given in this section do not actually show what the poster
thinks! Python's interning of small integers is not relevant. It only applies
to integers in the range -5 through 256.

The real explanation for why "1000 is 1000" evaluates to True has to do with
the Python compiler. When the expression is compiled as a single statement in
the interactive interpreter, the compiler notices that the constant value 1000
appears more than once, and so it can reuse the same object.

But if you provide the value in more than one statement, the compiler is
unable to do this:

    
    
      >>> a = 256
      >>> b = 256
      >>> a is b
      True
      >>> a = 257
      >>> b = 257
      >>> a is b
      False
    

Note that this behavior exists because each statement in the interactive
interpreter is compiled separately. If you place the code within a function
instead, the entire function is compiled at once, and the object is reused
once more:

    
    
      >>> def f():
      ...     a = 257
      ...     b = 257
      ...     return a is b
      ... 
      >>> f()
      True
    

> Ellipsis?

 _Apparently Ellipsis is always “bigger” than anything, as opposite to None,
which is always “smaller” than anything._

Not so. Ellipsis follows Python 2's default rules for the comparison of
unrelated types.

    
    
      >>> Ellipsis < ()
      True
    

The default is documented as comparing objects "consistently but
arbitrarily"[1]. The actual rules are:

1) None is the smallest object.

2) Followed by numbers.

3) Followed by all other objects. Objects of distinct types are compared by
the lexical ordering of their type names.

This can easily lead to senseless orderings, when two types define an ordered
relationship between themselves, but another type happens to have a name that
is lexically between them, as with str, tuple, and unicode:

    
    
      >>> "def" < (1,)
      True
      >>> (1,) < u"abc"
      True
      >>> u"abc" < "def"
      True
    

The Ellipsis constant is an instance of a type named "ellipsis", and so it is
smaller than instances of most of the other non-numeric builtin types; dict is
an exception, since "dict" sorts before "ellipsis".

The actual use of Ellipsis has nothing to do with recursive containers
printing an ellipsis in their repr. It's part of a wacky special syntax that
exists for NumPy's benefit:

    
    
      >>> d = {}
      >>> d[...] = None
      >>> d
      {Ellipsis: None}
    

[1] [http://docs.python.org/2/reference/expressions.html#not-in](http://docs.python.org/2/reference/expressions.html#not-in)

~~~
agf
[http://stackoverflow.com/questions/7969552/why-does-4-3-retu...](http://stackoverflow.com/questions/7969552/why-does-4-3-return-true-in-python-2)

------
emef
I'm surprised using a mutable value as a default keyword argument wasn't
mentioned. This certainly tripped me up early in learning python:

    
    
        def fn(x, my_dict={}):
            my_dict[x] = x * 2
            return my_dict
    
        >>> fn(1)
        {1: 2}
        >>> fn(2)
        {1: 2, 2: 4}
    

(I would have expected the second call to return {2: 4} when I was learning
python)
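
The usual workaround is the None-default idiom, which gives the behavior I
expected. A sketch in modern syntax (the function name `fn` is just the
example from above):

```python
def fn(x, my_dict=None):
    # A fresh dict is created on every call, not once at definition time.
    if my_dict is None:
        my_dict = {}
    my_dict[x] = x * 2
    return my_dict

print(fn(1))  # {1: 2}
print(fn(2))  # {2: 4}, no leftover state from the first call
```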

~~~
kyllo
This is pretty bizarre, and seems like it would cause a memory leak if you
don't know what you're actually doing. You're allocating a my_dict object on
the heap, in a field of the function object, so my_dict never goes out of
scope and never gets garbage collected until the function fn itself does,
right?

Whereas if you did this instead:

    
    
        def fn(x):
            my_dict={}
            my_dict[x] = x * 2
            return my_dict
    

You'd get what you'd expect--a new my_dict object gets created, returned, and
then goes out of scope every time fn is called, so it would get garbage
collected once there are no more references to that return value. (I think...)

(I don't know that much about how memory allocation and GC works in Python
yet, just trying to learn!)
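
You can actually see where that dict lives: CPython stores a function's
default values on the function object itself, exposed as `__defaults__` in
Python 3 (`func_defaults` in 2.x). A quick sketch, reusing the `fn` example
from above:

```python
def fn(x, my_dict={}):
    my_dict[x] = x * 2
    return my_dict

fn(1)
fn(2)

# The single shared dict is stored on the function object, so it is kept
# alive for as long as fn itself is.
print(fn.__defaults__)  # ({1: 2, 2: 4},)
```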

~~~
baq
this is not at all bizarre, you just need to know what dynamic means. the best
answer i could come up with is "definition is execution". for an enlightening
moment, see this piece of code:

    
    
        def a():
            print "a called"
            return []
    
        def fn(x=a()):
            x.append(1)
            print x
    
        fn()
        fn()
        fn()
    

i suggest typing this directly into the interpreter instead of a script for
better effect.

~~~
akkartik
You got me thinking about how python differs from lisp in this respect, and
the subtle way that a functional API has helped this specific situation.

While lisp suffers from the same "definition is execution" gotcha, the effects
are far rarer in practice because () is immutable and interned while [] is
mutable and usually generated afresh each time it's executed.

    
    
      $ python
      >>> [] is []
      False
    
      $ sbcl
      * (eq () ())
      t
    

Since [] is mutable, appends of [] can do superficially 'the right thing'.

    
    
      >>> a = []
      >>> a.append(4)
      >>> a
      [4]
    
      * (setq a ())
      * (nconc a '(34))
      (34)
      * a
      ()
    

But there's a reason lisp does seemingly the wrong thing. According to the
spec, _nconc_ skips empty arguments
([http://www.lispworks.com/documentation/HyperSpec/Body/f_ncon...](http://www.lispworks.com/documentation/HyperSpec/Body/f_nconc.htm#nconc))
Reading between the lines, I'm assuming this makes sense from a perspective in
lisp where we communicate even with 'destructive' operations through their
return value. This is more apparent when you consider _nreverse_ :

    
    
      * (setq a '(1 2 3 4))
      * (nreverse a)
      (4 3 2 1)
      * a
      (1)
    

Destructive operations can reuse their input, but they're not required to
maintain bindings. There is no precise equivalent of python's _.reverse()_.
Instead, a common idiom is:

    
    
      * (setq a (nreverse a))
    

It seems like a weird design decision, but one upshot of it besides
encouraging a more functional style is that this optional-arg gotcha loses a
lot of its power in lisps. It's very rare to define a default param of a non-
empty list, and empty lists can't be modified without assigning to them.

------
njharman
> Tuples constructor

Confusion is avoided by understanding that comma ',' not paren '()' is the
tuple constructor.
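A quick illustration (variable names are just for the example):

```python
t = 1,           # the comma makes the tuple
assert isinstance(t, tuple)

n = (1)          # parens alone are just grouping; this is an int
assert isinstance(n, int)

empty = ()       # the one exception: the empty tuple does need the parens
assert isinstance(empty, tuple)
```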

> Python doesn’t have multiline comments. Instead, multiline strings are used.

No they are NOT! Don't use strings for comments, dammit.

~~~
jmduke
PEP 257 recommends using triplequoted strings for multiline docstrings:

[http://www.python.org/dev/peps/pep-0257/#multi-line-docstrin...](http://www.python.org/dev/peps/pep-0257/#multi-line-docstrings)

~~~
forsaken
A docstring is not a comment. A docstring is actually a string that gets saved
on the object, whereas a comment is not used at all.

~~~
quacker
Right, and comments and documentation serve different purposes. But
appropriate docstrings will greatly reduce the need for comments.

------
csdigi
The section titled "Inconsistent get interface" compares get() to getattr(),
which are two unrelated functions. getattr() is the same as attribute lookup
on an object (person.name); get() is a method defined by some types which
returns a stored value.

In the provided example he calls get on an empty dictionary for the key 1,
then calls getattr of 'a' on an int. Finally he calls it again with an
optional default argument of None.

The difference is made apparent by the example:

    
    
      In [13]: test = {'values': 1}
    
      In [14]: getattr(test, 'values')
      Out[14]: <function values>
    
      In [15]: test.get('values')
      Out[15]: 1
    

------
leephillips
Many of these "quirks" are just odd expectations on the author's part that
were not fulfilled, but he does describe a few interesting odds and ends. And
I agree completely about modules, importing, eggs, namespaces, and all that
mess. One of the most refreshing things about learning Clojure after using
Python for a while was the relative absence of confusion surrounding these
issues, aided by Leiningen.

~~~
craigyk
I think these "quirks" hit on legitimate inconsistencies and confusing
interpretations. Most of his expectations seem pretty commonplace.

------
aaronharnly
One of my favorite tomfooleries in Python is this:

>>> True, False = False, True

It doesn't have much practical effect, since most logical tests don't use the
True and False constants directly. But it's a good way to perplex the unwary.

~~~
wicknicks
In Python 3.x, this gives me a "SyntaxError: assignment to keyword" error.

~~~
bluewres
They changed the language definition for Python 3 to make True and False
keywords, unlike in previous versions.
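
You can watch the Python 3 compiler reject it without running anything, via
compile(); the exact error message varies by version:

```python
# In Python 3, assigning to True/False is a compile-time SyntaxError.
try:
    compile("True, False = False, True", "<test>", "exec")
except SyntaxError as e:
    print("rejected:", e.msg)
else:
    print("accepted (the Python 2 behavior)")
```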

------
barik
I don't consider the sort example to be a quirk at all, though some of the
other examples are reasonable. It's more a definition of what methods are
supposed to do in the first place: a behavior that is applied on the instance
of the object. Unless you're doing functional programming (or implementing
certain specialized design patterns), you wouldn't expect dog.move() to return
a new dog, so why should list.sort() return a new list?
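
And when you do want a new list, the builtin sorted() exists precisely for
that:

```python
nums = [3, 1, 2]
result = nums.sort()      # in-place; returns None by convention
assert result is None
assert nums == [1, 2, 3]

words = ["b", "c", "a"]
assert sorted(words) == ["a", "b", "c"]  # new list
assert words == ["b", "c", "a"]          # original untouched
```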

~~~
manojlds
I think this is where Ruby shines with the ! to indicate destructive.

~~~
petercooper
It would be, if only it were true. Consider Array#delete_if or Array#pop, for
example, although there are many others too.

Matz has written about this - [https://www.ruby-forum.com/topic/176830#773946](https://www.ruby-forum.com/topic/176830#773946)
\- and said _"The bang (!) does not mean "destructive" nor lack of it mean
non destructive either. The bang sign means "the bang version is more
dangerous than its non bang counterpart; handle with care"._

This is one of the most commonly misunderstood things about Ruby in my
experience (enough so that some library developers do apply a ! == destructive
naming system) and would certainly make an equivalent "Ruby quirks" list IMHO!
:-)

------
lmm
The one that always confuses me is that multiple "for"s in the same
comprehension work the opposite way from how you expect:

    
    
        >>> [[(a, b) for a in [1, 2, 3]] for b in [4, 5, 6]]
        [[(1, 4), (2, 4), (3, 4)], [(1, 5), (2, 5), (3, 5)], [(1, 6), (2, 6), (3, 6)]]
        >>> [(a, b) for a in [1, 2, 3] for b in [4, 5, 6]]
        [(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]

~~~
sp332
The second one makes more sense to me. My brain reads it as:

    
    
      for a in [1, 2, 3]:
        for b in [4, 5, 6]:

~~~
lmm
I don't because the statement itself flips compared to for style, so it seems
like the whole thing should flip. i.e.

    
    
        for XXXXX:
          for YYYYY:
            ZZZZZ
    

it would kind-of make sense to me if this translated into a for comprehension
as

    
    
        [ZZZZZ for YYYYY for XXXXX]
    

but instead it translates to

    
    
        [ZZZZZ for XXXXX for YYYYY]
    

which seems decidedly middle-endian.

------
727374
When I learned python I was surprised by the absence of a length() member on
collections. But, apparently this was far from an accident.
[http://effbot.org/pyfaq/why-does-python-use-methods-for-some...](http://effbot.org/pyfaq/why-does-python-use-methods-for-some-functionality-e-g-list-index-but-functions-for-other-e-g-len-list.htm)

~~~
est
You could always use .__len__()
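
len() just delegates to __len__ under the hood, so the two are equivalent.
A sketch with a hypothetical container class:

```python
class Bag:
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # len(bag) ends up calling this method
        return len(self.items)

bag = Bag([1, 2, 3])
assert len(bag) == 3
assert bag.__len__() == len(bag)
```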

------
tveita
Ellipsis and slice are used for slicing. e.g:

    
    
      class MyContainer(object):
          def __getitem__(self, key):
              return key
      >>> c = MyContainer()
      >>> c[1]
      1
      >>> c[1:]
      slice(1, None, None)
      >>> c[1:2, 1:2:3, ...]
      (slice(1, 2, None), slice(1, 2, 3), Ellipsis)
    

NumPy uses this to allow advanced slicing of multidimensional arrays:

[http://docs.scipy.org/doc/numpy/reference/arrays.indexing.ht...](http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)

------
valtron
print in 2.x writes each argument as soon as it is evaluated, left to right,
rather than evaluating them all first.

Given

    
    
        def p(x):
            print x
            return x
    

then

    
    
        print 'a', p('b')
    

is not equivalent to

    
    
        a1 = 'a'
        a2 = p('b')
        print a1, a2

------
sbierwagen
Article is from 2009.

