
Anti-Patterns in Python Programming - aburan28
http://lignos.org/py_antipatterns/
======
tribaal
The more frequent and dangerous pitfalls are, in my humble opinion:

\- Bare except: statements (which catch _everything_, even Ctrl-C)

\- Mutables as default function/method arguments

\- Wildcard imports!

~~~
SEJeff
Couldn't agree more! One of my all-time favorite Python interview questions
trips up a surprisingly large number of developers.

Given a function like:

    
    
        def append_one(l=[]):
            l.append(1)
            return l
    
    

What does this return each time?

    
    
        >>> append_one()
        >>> append_one()
        >>> append_one()
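
For anyone who wants to check their answer, here is a minimal sketch of what happens, plus the usual `None`-default fix. It assumes Python 3, and the name `append_one_fixed` is made up for illustration.

```python
def append_one(l=[]):
    # The default list is created once, when the function is defined,
    # so every call without an argument mutates the same object.
    l.append(1)
    return l


def append_one_fixed(l=None):
    # Hypothetical fixed variant: build a fresh list on each call.
    if l is None:
        l = []
    l.append(1)
    return l


first = append_one()    # the shared default list, [1] at this point
second = append_one()   # the same object again, now [1, 1]
print(first is second)  # True
print(second)           # [1, 1]

print(append_one_fixed())  # [1]
print(append_one_fixed())  # [1]
```

So the three calls in the question return `[1]`, `[1, 1]` and `[1, 1, 1]`, all of them the same list object.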

~~~
rmrfrmrf
At what level would you test an interviewee with this kind of question: Python
guru, Python expert, Python ninja, Python rockstar, or merely "is familiar
with Python"? Your example is a very common gotcha that has been covered ad
nauseam, but IMO it's still the kind of bug that would be caught immediately
in code review and is very easily fixed.

~~~
lilsunnybee
I thought we were past using trick questions like this anyway! You know, it
would be easy for an experienced yet anxious programmer to get tripped up on
this, while someone who just browsed "python interview questions 101" breezes
on through. It also selects against experienced multi-language developers,
since language-specific quirks like this are not generally useful information
to keep front-loaded, but are trivial to become re-familiar with in a work
environment, or even _gasp_ learn for the first time from a co-worker or
helpful article.

If the industry as a whole cared about evidence-based, non-superstitious, non-
monoculture-reinforcing hiring practices, we'd realize that tripping people up
and judging programming capability based on minutiae is as unfair as it is
self-defeating.

------
shadowmint
Mmm... you should always use 'if x is not None:' imo.

It's very common for libraries to make values evaluate to False, and very easy
to get bugs if you just lazily test with 'if x'.

Sqlalchemy springs to mind immediately as one of the common ones where using
any() and if x: is a reeeeeallly bad idea; there are plenty of others.

I'm pretty skeptical about modifying your coding behavior based on what
libraries you happen to be currently using.

'If x' isn't your friend.
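
A minimal sketch of the kind of bug this avoids, assuming Python 3. Both function names are hypothetical and exist only to show the two tests side by side.

```python
def describe_limit(limit=None):
    # Hypothetical helper where 0 is a legitimate value for `limit`.
    if limit is not None:
        return "limit is {}".format(limit)
    return "no limit"


def describe_limit_truthy(limit=None):
    # Buggy variant: 0 is falsy, so it is treated like a missing value.
    if limit:
        return "limit is {}".format(limit)
    return "no limit"


print(describe_limit(0))         # limit is 0
print(describe_limit_truthy(0))  # no limit
```

The same thing happens with empty strings, empty containers, and any library object that defines its own truth value.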

~~~
tomp
Especially taking into account the more bizarre bugs (features?) of python:

    
    
        (bool(datetime.time(0)), bool(datetime.time(1))) == (False, True)
    

I _always_ consider `if x:` a bug, unless x can only be a boolean.
Furthermore, it seriously hinders readability and clarity of the code.

~~~
Zancarius
I was specifically looking through these comments for your example.

I got bit by this once, and it's certainly... _strange_ (it's also
surprising). It's considered "behavior consistent with the rest of Python" [1]
(which I can agree with) even if it makes little sense in terms of immediate
readability to someone who hasn't previously encountered it. Fortunately, the
workaround is easy, and it _is_ documented.

There are at least a couple of spats on the mailing list regarding this feature
that are of interest to the curious or at least those who are interested in
the history of such behavior.

[1] [http://bugs.python.org/issue13936](http://bugs.python.org/issue13936)

------
msvalkon
Check out Raymond Hettinger's "Transforming Code into Beautiful, Idiomatic
Python" talk on YouTube [1].

Great talk on avoiding some of the common pitfalls new python developers step
in. Exposes some nice language features.

[1]:
[https://www.youtube.com/watch?v=OSGv2VnC0go](https://www.youtube.com/watch?v=OSGv2VnC0go)

~~~
gshubert17
And the slides are here:

[https://speakerdeck.com/pyconslides/transforming-code-
into-b...](https://speakerdeck.com/pyconslides/transforming-code-into-
beautiful-idiomatic-python-by-raymond-hettinger-1)

------
sophacles
The only thing I disagree with is the "use nested comprehensions" thing. In my
mind: x = [letter for word in words for letter in word]

is inside-out or backwards. I want the nested for to be the less
specific case:

    
    
       x = [letter for letter in word for word in words]
    

makes more sense in my mind.

(It's also my first answer to the "what are some warts in your language of
choice?" question.)
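
For reference, a small sketch of how Python actually reads the clauses, assuming Python 3. The `for` clauses of a comprehension run in the same order as the equivalent nested loops, outermost first; the sample `words` list is made up.

```python
words = ["ab", "cd"]

# The for clauses read left to right, outermost loop first.
flat = [letter for word in words for letter in word]

# The equivalent explicit loops, in the same clause order.
flat_loop = []
for word in words:
    for letter in word:
        flat_loop.append(letter)

print(flat)               # ['a', 'b', 'c', 'd']
print(flat == flat_loop)  # True
```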

~~~
njharman
I'm in the camp that if your list comp needs more than one for clause, it's
complicated enough to be broken out into actual for loop.

~~~
jevgeni
Everybody, listen to this person!

~~~
jules
Then it turns into this:

    
    
        x = []
        for word in words:
           for letter in word:
              x.append(letter)
    

Which in addition to being far more verbose and less readable, is also less
efficient.

~~~
sophacles
I'll tell you what I tell my team: it's barely more verbose, and the
readability is up for extremely serious debate (I mean, everyone understands
nested for loops, but the nested list comprehension thing is weird... why is
something that appears way way before the innermost "for" the same as
something in that for loop's declaration?)

It may also be less efficient. When I'm shown numbers that the difference
between the comprehension and the for loops (in each specific instance, or
in aggregate for the program in question) is above statistical noise AND it's
a significant factor in overall runtime (I won't ever worry about a
millisecond when the runtime is 1s), then I'll gladly say: put them in.

Until then, just use the loops. Use of really strange language features that
are surprising, not exactly idiomatic (this argument is common for this case)
and not shown to be of actual benefit, are detrimental in a polyglot
environment.

~~~
jules
When I see a list comprehension I can see with a single glance what it's
doing. Not so with the 4 line for loop. Comprehensions aren't a strange
language feature in Python either...it's one of the central features of
Python.

Don't use something until it's proven to yield a great benefit is a very
conservative approach. That may be appropriate in some cases, but I'm very
glad that I am not in such a team since that would be incredibly frustrating.
I much prefer an approach where you go with the choice that's most likely the
better one, even if it's not 100% proven better or not a big difference.

~~~
sophacles
I'm talking strictly about multi "for" comprehensions. They just are too
confusing to me and most of the people I've worked with ever. But we also use
lots (most) python features fully, just that one has been the source of dozens
of bugs in this one codebase, not to mention others I've worked on with other
people. It is a shitty non-intuitive syntax.

Nested for loops, flatten(), various itertools functions and chained generator
expressions all suffice, and I have yet to see them provide measurable
slowdown to actual code compared to good algorithms and decent factoring. Like
I said, I'll even use multi-for comprehensions if there is a measurable
difference over nested for-loops.

Also, I think you are intentionally misrepresenting what I said - when I said
don't use "weird stuff" I explicitly excluded idiomatic language things. That
includes (for python) single for comprehensions. The multi-for comprehension
is something I rarely come across in the wild despite its long-time existence
in python - it's a weird one.

~~~
hueving
I think you are trying to justify your strange preferences after the fact. How
exactly were there bugs caused by nested for loops that you encountered? It's
not like if you mess up the order it will actually run without throwing an
exception. Nested list comprehensions are idiomatic python. It's really
strange that you don't let your team use them because you are afraid of them.

~~~
jevgeni
Well, Google doesn't recommend it either ([http://google-
styleguide.googlecode.com/svn/trunk/pyguide.ht...](http://google-
styleguide.googlecode.com/svn/trunk/pyguide.html?showone=List_Comprehensions#List_Comprehensions)),
but I guess they are all bloody noobs or whatever.

I mean, single level comprehension is good. Nested list comprehension is OK
only in the most trivial cases. In my opinion, if I see how a person uses list
comprehension, I can tell what kind of person this is.

There are people who, for example, do this

    
    
        def all_is_okey_dorey(lst):
            return all([some_predicate_fn(x) for x in lst])
    

instead of this

    
    
        def all_is_okey_dorey(lst):
            for x in lst:
                if not some_predicate_fn(x):
                    return False
            return True
    

and can live with themselves somehow.
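
A middle ground worth sketching, assuming Python 3: passing a generator expression to `all()` keeps the one-liner and still stops at the first failure. The predicate below is a hypothetical stand-in that records its calls so the short-circuit is visible.

```python
calls = []


def some_predicate_fn(x):
    # Hypothetical predicate; it logs each value it is asked about.
    calls.append(x)
    return x > 0


def all_is_okey_dorey(lst):
    # No square brackets: all() pulls values one at a time and
    # stops as soon as the predicate returns False.
    return all(some_predicate_fn(x) for x in lst)


result = all_is_okey_dorey([1, -2, 3])
print(result)  # False
print(calls)   # [1, -2]  (3 is never tested)
```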

Or there are people, who refuse to acknowledge the existence of anything
besides Python 3.x and when forced to write in 2.x use list comprehensions
instead of generator expressions.

Thing is, the _validity_ of using nested list comprehension depends not on the
amount of for loops you have, but on the thing you want to do with the item.
If it's just selection, then it might be ok. If you want to apply some kind of
function to it, then it's most probably the case of trying to be too clever.

~~~
hueving
If you think two loops is not a "simple case" of a list comprehension as
Google suggests to use them for, perhaps you shouldn't be doing code reviews.
It sounds like your team would be held down by your weak grasp of the
language.

------
cuu508
> write a list comprehension (...) code just looks a lot cleaner and what
> you're doing is clearer.

I know how to use list comprehensions, but often avoid using them and use the
standard for loops. List comprehensions look nice and clean for small
examples, but they can easily get long and become mentally hard to parse. I
would rather go for three 30 character lines instead of one 90 character line.

~~~
bkeroack
Depends on how you want to program: imperative vs functional.

Personally I think list comprehensions are the most beautiful part of Python,
though sometimes I use map() when I'm trying to be explicitly functional (I
realize it's allegedly slower, etc).

Generally I think list comprehensions are cleaner and allow you to write purer
functions with fewer mutable variables. I disagree that deeply nested for
loops are necessarily more readable.

~~~
SEJeff
Map is not allegedly slower, it is demonstrably slower.

    
    
        $ python -mtimeit -s'nums=range(10)' 'map(lambda i: i + 3, nums)'
        1000000 loops, best of 3: 1.61 usec per loop
        
        $ python -mtimeit -s'nums=range(10)' '[i + 3 for i in nums]'
        1000000 loops, best of 3: 0.722 usec per loop
    

Function calls have overhead in Python. List comprehensions are implemented
with this in mind and avoid it, so the heavy lifting ultimately happens in C
code.
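
If you would rather measure from inside a script, here is a rough sketch using the `timeit` module, assuming Python 3. Since `map()` is lazy there, it is wrapped in `list()` so that both statements build a list. The numbers depend on the machine and interpreter, so none are claimed here.

```python
import timeit

setup = "nums = list(range(10))"

# list() forces evaluation, because map() returns a lazy iterator in Python 3.
map_time = timeit.timeit(
    "list(map(lambda i: i + 3, nums))", setup=setup, number=100000
)
comp_time = timeit.timeit("[i + 3 for i in nums]", setup=setup, number=100000)

print("map + lambda:  {:.3f} s".format(map_time))
print("comprehension: {:.3f} s".format(comp_time))
```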

~~~
bkeroack
Fair enough. I said "allegedly" because I had never personally measured the
performance difference.

Even though you could construe map as "half as fast" (or twice as slow) as the
equivalent comprehension, I don't see a difference of ~1 usec making any
difference in my code thus far. Good to know, though.

~~~
SEJeff
Yup, for very large calculations, or certain use cases, it can make much
larger differences. It all depends on your use case.

------
moretti
I always struggle to understand why a list comprehension

    
    
      alist = [foo(word) for word in words]
    

is considered more Pythonic than _map_

    
    
      alist = map(foo, words)

~~~
blossoms

        alist = [foo(word) for word in words if word.startswith('a')]
        alist = map(foo, filter(lambda word: word.startswith('a'), words))
    

Which reads better?

~~~
LyndsySimon
I don't use FP practices in Python much, but if I did I'd define the filter
outside the map, like so:

    
    
        begins_with_a = lambda x: x.startswith('a')
        alist = map(foo, filter(begins_with_a, words))

~~~
nilved
Even with named functions, Python's use of global functions instead of methods
for iterators forces you to read the expression from the inside out. I think
Lisp languages nailed this with their threading macros, which allow natural
left-to-right reading, but Ruby's strategy is better than Python's, too, while
maintaining very similar syntax.

    
    
        ;; clojure
        (let [begins-with-a #(.startsWith % "a")
              foo #(do-some-stuff-with %)]
          (->> words (filter begins-with-a) (map foo)))
    
        # ruby
        words.filter { |e| e.start_with?(?a) }.map { |e| foo(e) }
    

It doesn't really make sense for things like `len` and `map` to be global
functions in object-oriented languages.

------
ggchappell
This is a nice little article, but I wonder about some of the design
decisions. In particular:

> The simplifications employed (for example, ignoring generators and the power
> of itertools when talking about iteration) reflect its intended audience.

Are generators really that hard? (Not a rhetorical question!)

The article mentions problems resulting from the creation of a temporary list
based on a large initial list. So, why not just replace a list comprehension
"[ ... ]" with a generator expression "( ... )"? Result: minimal storage
requirements, and no computation of values later than those that are actually
used.
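
A minimal sketch of that swap, assuming Python 3; the squares are just a placeholder computation.

```python
import sys

numbers = range(1000000)

squares_list = [n * n for n in numbers]  # builds all one million values up front
squares_gen = (n * n for n in numbers)   # computes values only when asked

# The generator object stays small no matter how long the input is.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# Only the first value is computed here.
print(next(squares_gen))  # 0
```

The trade-off is that a generator can be consumed only once and does not support `len()` or indexing.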

And then there is itertools. This package might seem a bit unintuitive to
those who have only programmed in "C". But I think the solution to that is to
_give examples_ of how itertools can be used to create simple, readable,
efficient code.

------
omegote
Point 3 of the iteration part is not good advice. With [1:] you're making a
copy of the list just to iterate over it...

~~~
omaranto
You're right. I still wouldn't recommend looping over indices, but rather
using itertools.islice(xs, 1, None) instead of xs[1:].
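
A short sketch of that, assuming Python 3; the list contents are made up.

```python
from itertools import islice

xs = ["header", "row1", "row2"]

# xs[1:] would copy the rest of the list; islice walks the original lazily.
for item in islice(xs, 1, None):
    print(item)
# prints row1, then row2
```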

------
msl09
I find that the testing for empty is a bit misguided if you want to be
rigorous with types. For instance:

    
    
         >>> def isempty(l):
         ...     return not bool(l)
         ...
         >>> isempty([])
         True
         >>> isempty(None)
         True
    

If embedded within your program logic this kind of pattern can waste precious
time with debugging. You can catch your errors much more quickly if you are
explicit with your comparisons.

------
elandybarr
Failing to use join is a big one.

I have seen countless instances of people writing the logic to output commas
in between items (like for CSV export) that they want to concatenate into a
string.

    
    
        header_line = ','.join( header for header in headers )
        csv_line    = ','.join( str(dataset[key]) for key in dataset.keys() )
    

Example for a case of a dictionary mapping a string to a bunch of numbers.

~~~
marcinw
Proper Python would use the csv module for this operation, as your CSV export
would break if `header` or `dataset[key]` contains a comma.
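
A minimal sketch with the `csv` module, assuming Python 3. The `dataset` contents are invented, and one key deliberately contains a comma to show the quoting.

```python
import csv
import io

# Hypothetical data: one measurement name contains a comma.
dataset = {"temp, outside": 21.5, "humidity": 40}

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(dataset.keys())    # header line
writer.writerow(dataset.values())  # single data line

print(buffer.getvalue())
# "temp, outside",humidity
# 21.5,40
```

For a real file, `open(path, "w", newline="")` takes the place of the `StringIO` buffer.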

~~~
elandybarr
Yeah. This was a special case for a one line CSV that was requested by my
client. It was a dictionary with a bunch of single measurements.

------
yeukhon
> PEP 8 is the universal style guide for Python code.

> If you aren't following it, you should have good reasons beyond "I just
> don't like the way that looks."

Core devs and Guido have said many times that PEP 8 is not holy.

See [https://mail.python.org/pipermail/python-
dev/2010-November/1...](https://mail.python.org/pipermail/python-
dev/2010-November/105681.html)

In essence, a "stupid reason" like "I don't like it" is a valid reason not to
adopt PEP 8.

In fact, I don't like the PEP 8 recommendation on docstrings. I like Google's
docstring style (aka napoleon in the Sphinx contrib modules).

[http://sphinxcontrib-
napoleon.readthedocs.org/en/latest/exam...](http://sphinxcontrib-
napoleon.readthedocs.org/en/latest/example_google.html)

------
orf
Interesting read, a couple of things I noticed though:

1\. In "Checking for contents in linear time" both examples are the same.
Perhaps remove the list entirely in the second example

2\. Itertools.islice helps if you need to slice a list with a bajillion
elements

------
gr3yh47
    
    
        # Avoid this
        lyrics_list = ['her', 'name', 'is', 'rio']
        words = make_wordlist() # Pretend this returns many words that we want to test
        for word in words:
            if word in lyrics_list: # Linear time
                print word, "is in the lyrics"
    
        # Do this
        lyrics_list = ['her', 'name', 'is', 'rio']
        lyrics_set = set(lyrics_list) # Linear time set construction
        words = make_wordlist() # Pretend this returns many words that we want to test
        for word in words:
            if word in lyrics_list: # Constant time
                print word, "is in the lyrics"
    

The second example should read ... if word in lyrics_set: ...

~~~
darkxanthos
Just to point out, if you really will have a tiny list and that's knowable,
it's possible this example would be best with a straight linear time check. It
could be fewer operations than hashing a string and looking it up. Practically
pedantry though.

------
wodenokoto
I use python for datamining, and most of my work is done exploring data in
iPython.

> First, don't set any values in the outer scope that aren't IN_ALL_CAPS.
> Things like parsing arguments are best delegated to a function named main,
> so that any internal variables in that function do not live in the outer
> scope.

How do I inspect variables in my main function after I get unexpected results?
I always have my main logic live in the outer scope because I often inspect
variables "after the fact" in iPython.

How should I be doing this?

~~~
hyperion2010
If you are using the interpreter directly then that particular bit of advice
is hard to follow since you basically live in global all the time. For that
reason I would say that this advice applies mainly to .py files.

~~~
LyndsySimon
Agreed.

There's a big difference between "scripting" and "writing software" in terms
of best practices.

If you're writing some ETL scripts in an IPython notebook, it would be
overkill to encapsulate everything to keep your global scope clean.
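
For .py files, one possible compromise is to have the main function return whatever you might want to look at afterwards. This is only a sketch, assuming Python 3; the argument handling and the returned dictionary are invented for illustration.

```python
def main(argv):
    # Hypothetical script body: names defined here stay local to main().
    numbers = [int(arg) for arg in argv]
    total = sum(numbers)
    # Returning the intermediate values keeps them inspectable from a shell.
    return {"numbers": numbers, "total": total}


# In a real script this call would normally sit under
# `if __name__ == "__main__":` and receive sys.argv[1:].
result = main(["1", "2", "3"])
print(result["total"])  # 6
```

From an interactive session you can then import the module, call `main` yourself, and poke at the returned values without anything leaking into the module's global scope.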

------
parham
Instead of this:

    
    
        # Do this
        lyrics_set = set(lyrics_list) # Linear time set construction
        words = make_wordlist()
        for word in words:
            if word in lyrics_set: # Constant time
                print word, "is in the lyrics"
    

You could do this:

    
    
        lyrics_set = set(lyrics_list)
        words = set(make_wordlist())
        matched_words = list(lyrics_set & words)
        for word in matched_words:
            print word, "is in the lyrics"

~~~
LyndsySimon
Or, off the top of my head:

    
    
        for word in (set(lyrics_list) & set(words)):
            print('{} is in the lyrics'.format(word))

~~~
parham
Even shorter, nice. How about this one-liner? The last bit is looking a bit
messy; any ideas?

    
    
        print " is in the lyrics\n".join(set(lyrics_list) & set(words)), "is in the lyrics"

~~~
MattConfluence
How about

    
    
      >>> lyrics_list = ["her", "name", "is", "rio"]
      >>> words = ["is", "rio"]
      >>> print '\n'.join("{} is in the lyrics".format(word) for word in set(lyrics_list) & set(words))
      rio is in the lyrics
      is is in the lyrics

~~~
LyndsySimon
Heh, I just replied with pretty much this exact code to someone else trying to
make it a one-liner.

Yes, it's possible. I would never, ever publish code like this, though. It's
opaque.

------
LyndsySimon
I can't think of a single case where using sentinel values is necessary or
appropriate in Python.

Generally speaking, one should just return from within the loop.

~~~
jessaustin
I get what you're saying, but even when sentinels are used inside a function,
returning a _-1_ to the caller seems like a pretty bad API. It's OK to raise
_ValueError_! I had thought the idiomatic sentinel value was an instance of
_object()_ you could _is_ against, anyway.
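
A rough sketch of both ideas, assuming Python 3. The names `find_index`, `get_setting` and `_MISSING` are hypothetical and not from the article.

```python
_MISSING = object()  # unique sentinel; nothing else can be identical to it


def find_index(items, target):
    # Hypothetical lookup that raises instead of returning -1.
    for index, item in enumerate(items):
        if item == target:
            return index
    raise ValueError("{!r} is not in the list".format(target))


def get_setting(settings, key, default=_MISSING):
    # The sentinel distinguishes "no default given" from an explicit None.
    if key in settings:
        return settings[key]
    if default is _MISSING:
        raise KeyError(key)
    return default


print(find_index(["a", "b"], "b"))     # 1
print(get_setting({}, "color", None))  # None
```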

~~~
LyndsySimon
I agree - _especially_ when you're returning a list index, since `[-1]` is a
valid index in Python.

------
u124556
> Consider using xrange in this case.

Is xrange still a thing? Doesn't range use a generator instead of creating a
list nowadays?

~~~
omaranto
Well, I'd say no, not in "Python".

It seems to me that when people say "Python" they still mostly mean Python 2,
where range returns a list and xrange a generator. In Python 3 there is no
xrange and range returns a generator, but I think people still most often call
that language "Python 3", not "Python".
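
One nuance, sketched under the assumption of Python 3: strictly speaking, `range` there returns a lazy sequence object rather than a generator, so it supports `len()`, indexing and repeated iteration without ever building a list.

```python
r = range(10 ** 9)  # no list of a billion integers is built

print(len(r))       # 1000000000
print(r[-1])        # 999999999
print(500 in r)     # True, worked out arithmetically for ints
print(list(r[:3]))  # [0, 1, 2]

# Unlike a generator, a range can be iterated more than once.
small = range(3)
print(list(small), list(small))  # [0, 1, 2] [0, 1, 2]
```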

------
me_myself_and_I
I don't think there is such a thing as a pattern per language, unless the
language is really unique. IMO what does exist is Language Bad Practices,
which are actually tied to the language itself.

An (anti)pattern is something abstract and can be applied to any other similar
language.

------
yoo-interested
Speaking of find_item, is the `for..else` loop (which can be used to write
find_item in another way) considered Pythonic? I personally like `for..else`
loops but I don't know where the consensus is at.
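
For readers who haven't met the construct, a sketch of what that could look like, assuming Python 3. This `find_item` is a hypothetical stand-in; the signature and the `None` return value are assumptions, not the article's code.

```python
def find_item(item, items):
    # Hypothetical stand-in for the article's find_item.
    for index, candidate in enumerate(items):
        if candidate == item:
            break
    else:
        # Runs only when the loop finished without hitting break.
        return None
    return index


print(find_item("b", ["a", "b", "c"]))  # 1
print(find_item("z", ["a", "b", "c"]))  # None
```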

~~~
PeterGriffin
[http://en.wikipedia.org/wiki/No_true_Scotsman](http://en.wikipedia.org/wiki/No_true_Scotsman)

Don't ask what's "more Pythonic" or "less Pythonic", Python is not a cult,
it's a very practical scripting language. Ask for benefits and weaknesses of a
given approach in given circumstances.

~~~
dlitz
Speaking a language in the same way as other speakers of that language makes
you easier to understand.

~~~
PeterGriffin
When it comes to basic formatting, symbol naming and high-level code
organization, sure.

But computer languages, unlike human languages, are precise. Their intent is
clear. And you'll never encounter a case where the Python 3 interpreter hasn't
heard of that particular Python 3 keyword you're using.

It's also not an excuse to avoid certain features of a language, when using
them leads to a better and simpler solution, just because they're _less
popular_. Programming is not an exercise in popularity.

On the other hand, human language is fuzzy and full of phrases that consist of
statements having nothing to do with their meaning. Such as me saying "your
argument doesn't hold water".

Human languages also have additional layers entirely separate from the primary
meaning of a conversation, such as sending social cues like "how smart am I",
"do I like you", "do I fit in this group", and "am I a leader or a follower".
Each layer of concern drives a certain way of expression and imitation, none
of which occurs (or _should_ occur) when writing computer code.

A better example to compare to programming code would be mathematical
notation. As long as you express your intent shortly, using the available
mathematical notation, people will be fine, and your intent will be clear.

I've never seen someone ask in a math forum if their formula is more
Mathematic one way, or another way.

------
metaphorm
an anti-pattern is a design pattern gone bad. these aren't anti-patterns.
they're just noobie mistakes and non-idiomatic expressions.

~~~
ascotan
is it an anti-pattern to call something an anti-pattern because the author
isn't sure what an anti-pattern is?

------
greenspider
read this right after using a range in a loop in order to get the index. Can't
believe I went this long without knowing about enumerate.
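
For anyone else in the same boat, a quick before-and-after sketch, assuming Python 3; the list of names is made up.

```python
names = ["ann", "bob", "cy"]

# Index-based loop, the pattern being replaced.
for i in range(len(names)):
    print(i, names[i])

# enumerate yields (index, item) pairs directly.
for i, name in enumerate(names):
    print(i, name)

# An optional start argument shifts the numbering.
numbered = list(enumerate(names, start=1))
print(numbered)  # [(1, 'ann'), (2, 'bob'), (3, 'cy')]
```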

~~~
maxerickson
It first mentions enumerate as a footnote after looping with a range:

[https://docs.python.org/2/tutorial/controlflow.html#for-
stat...](https://docs.python.org/2/tutorial/controlflow.html#for-statements)

But the tutorial in the Python docs has pretty high information density and
good coverage of things like this.

(I think there is some risk that this comment will be interpreted as _If you
don't know enumerate you need to look at the tutorial._ That isn't what I
intend, I just want to point out that the tutorial is a reasonably dense
resource that hits on a lot of stuff like enumerate.)

------
noobermin
Question: how do you iterate over multiple long lists (in python 2), especially
if zip itself takes a long time to zip them?

~~~
takeda
You use Python 3.

J/K. While this is technically a limitation of Python 2, there actually is
izip in the itertools package, which is a generator and works in a similar way
to zip in python 3.
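
A small sketch of the lazy behaviour, written for Python 3, where the built-in `zip` already works this way. On Python 2 the same loop would use `itertools.izip`. The two endless counters are only there to show that nothing is materialized up front.

```python
import itertools

# Two conceptually endless inputs; a list-building zip would never finish.
evens = itertools.count(0, 2)
odds = itertools.count(1, 2)

# In Python 3, zip is lazy. In Python 2, itertools.izip plays this role.
pairs = zip(evens, odds)

print(next(pairs))  # (0, 1)
print(next(pairs))  # (2, 3)
```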

~~~
noobermin
Awesome! I knew there was some lazy zip-ish thing for python 2. I personally
hate using range(len(foo)) as much as anyone else.

------
mimighost
I would say that the leaky outer scope problem combined with those really
similar typos is perhaps the most difficult to spot.

------
me_and
> Things like parsing arguments are best delegated to a function named main,
> so that any internal variables in that function do not live in the outer
> scope.

Why a function named `main`? We're not writing C here, there's no need for a
function named `main`. Let's call it something that's actually useful, like
`parse_cmd_arguments`

------
jimmaswell
Set membership checking is constant-time? Somehow I don't believe that.

~~~
kstrauser
Sets are basically hash tables that store only keys and not values. You can
easily write your own Set implementation by subclassing `dict` and stripping
out references to `.values()`, etc.

As such, testing for membership is an O(1) hash table lookup. If you're
skeptical:

    
    
        $ python -m timeit -s 'nums=range(1000000)' '100000 in nums'
        1000 loops, best of 3: 1.4 msec per loop
        
        $ python -m timeit -s 'nums=range(1000000)' '500000 in nums'
        100 loops, best of 3: 7.15 msec per loop
        
        $ python -m timeit -s 'nums=range(1000000)' '900000 in nums'
        100 loops, best of 3: 13 msec per loop
        
        $ python -m timeit -s 'nums=set(range(1000000))' '100000 in nums'
        10000000 loops, best of 3: 0.0572 usec per loop
        
        $ python -m timeit -s 'nums=set(range(1000000))' '500000 in nums'
        10000000 loops, best of 3: 0.057 usec per loop
        
        $ python -m timeit -s 'nums=set(range(1000000))' '900000 in nums'
        10000000 loops, best of 3: 0.0584 usec per loop

------
natmaster
islice should be used, not xrange

