
Python Idioms [pdf] - benn_88
http://safehammad.com/downloads/python-idioms-2014-01-16.pdf
======
crazygringo
Interesting philosophical points.

To me personally, testing for 'truthy' and 'falsy' values, or relying on
exceptions rather than checking values in advance, feels like sloppy and
imprecise programming.

A string being empty or not, or an array having items or not, or a boolean
being true or false, are all qualitatively totally different things to me --
and just because Python _can_ treat them the same, doesn't mean a programmer
_should_ take advantage of that fact. Sometimes it's possible to
_over_-simplify things in a way that obfuscates instead of clarifying.

When I read:

    
    
        if name and pets and owners:
    

I have no intuitive idea of what that means, of what's going on in the
program. When I read

    
    
        if name != '' and len(pets) > 0 and owners != {}:
    

I understand it exactly.

But by this point, I've come to understand that, for a lot of people, it seems
to be the opposite. It seems to be more of a philosophical difference, not
right/wrong.

~~~
andrewfong
I think part of the reasoning behind the truthy / falsy mechanic is that it's
more robust. If, for whatever reason, we did:

    
    
      name = None
    

instead of

    
    
      name = ''
    

Then the second conditional would fail, whereas the first would still be fine.
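
To make the difference concrete, here is a minimal sketch comparing the two styles of check when `name` is `None` rather than `''`. The helper names are made up for illustration and are not from the slides.

```python
def is_set_truthy(name):
    # Truthiness check: None and '' are both treated as "no name".
    return bool(name)


def is_set_explicit(name):
    # Explicit comparison: only the empty string counts as "no name".
    return name != ''


for value in ('Alice', '', None):
    print(repr(value), is_set_truthy(value), is_set_explicit(value))
```

With `None`, the explicit comparison reports that a name is present, so the code carries on as if it had one. Whether that is a bug or a feature depends on whether `None` was ever a legal value.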

~~~
Too
Checking for empty strings can be done with len(mystring)==0 for this reason.
In many other languages this method is standard and recommended practice.

Relying on implicit conversions is just sloppy. What if that variable was
never supposed to be None in the first place? Better with an exception than
continuing with corrupt data.

Remember another python motto: Explicit is better than implicit.

~~~
masklinn
> Checking for empty strings can be done with len(mystring)==0 for this
> reason.

`len` blows up on None, so this blows up completely instead of just failing.

> Relying on implicit conversions is just sloppy.

There is no implicit conversion. Truthiness is a protocol, it does not convert
anything anywhere.
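
A small sketch of what "protocol" means here, assuming a made-up `Basket` class: when a class defines `__len__` but no `__bool__` (`__nonzero__` on 2.x), `bool()` asks the object for its length rather than converting it to anything.

```python
class Basket(object):
    """Hypothetical container used only for illustration."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        # With no __bool__ defined, bool() falls back to __len__.
        return len(self.items)


print(bool(Basket()))          # False
print(bool(Basket(['egg'])))   # True

try:
    len(None)
except TypeError as exc:
    # None has no __len__, so len() raises instead of returning 0.
    print('len(None) raised TypeError:', exc)
```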

~~~
Too
> `len` blows up on None, so this blows up completely instead of just failing

That's the whole point! As I said, better with an exception than continuing
with corrupt data. Or maybe I should say unsupported data type rather than
corrupt data.

~~~
danudey
At the point where you are doing your validation (or where you are doing the
actual work with the variable) you can make that decision.

It's entirely possible that upstream code is content with any type as long as
it's coercible. A lot of Python code just wants an iterable, for example, or
something that can be coerced to a string.

It's definitely possible to have a situation where you _need_ a string
specifically (or an integer specifically), but it's generally better to have
your code coerce the data whenever possible so as to allow for more logical
code.

e.g. 's.send(str(object))' (if object has a relevant __str__()) makes perfect
sense in code. You don't necessarily need a string, you just need something
that can behave as a string in a sensible manner.

You also end up with scenarios where None is a 'valid value' in e.g. the SQL
sense of 'value not specified'; for example, company_name might be '' (empty
string provided) or None (no value provided); either way company_name is
Falsey, and your logic should work the same.

------
csense
I disagree with promoting try / except. Exceptions like ValueError can really
happen almost anywhere, so it is usually better to sanitize your inputs.

E.g. something like:

    
    
        try: 
            something = myfunc(d['x'])
        except ValueError:
            something = None
    

The programmer's intent is probably to only catch errors in the key lookup
d['x'], but if there is some bug in the implementation of myfunc() or any of
the functions called by myfunc() which causes a ValueError to unintentionally
be raised, it will be caught by the except.

For dictionary lookups specifically, get() is usually preferable:

    
    
        something = d.get('x')
        if something is not None:
            something = myfunc(something)
    

Or, if the dictionary may contain None values:

    
    
        if 'x' in d:
            something = myfunc(d['x'])
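
If you do want to keep the exception style, one option is to make the try body as small as possible and move the rest into an else clause. A minimal sketch, assuming the goal is to handle only a missing key (which raises KeyError); `myfunc` and `lookup` here are stand-ins:

```python
def myfunc(value):
    # Stand-in for the real function; it may raise its own exceptions.
    return int(value)


def lookup(d):
    try:
        raw = d['x']          # only the key lookup is guarded
    except KeyError:
        return None
    else:
        return myfunc(raw)    # errors raised here propagate to the caller


print(lookup({'x': '42'}))
print(lookup({}))
```

A ValueError raised inside `myfunc` is no longer swallowed, because the call sits outside the guarded block.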

~~~
drakeandrews
That example would surely be better as:

    
    
        something = myfunc(d['x']) if 'x' in d else None

~~~
calroc
In this form you perform the lookup twice: once to test 'x' in d and then
again to actually get the value d['x']. Try clauses in Python are very
inexpensive if they pass (don't raise an exception), so often the try..except
version would be preferable.

In any event don't optimize prematurely and use a profiler rather than
guessing if performance is an issue. ;-)
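
For completeness, a sketch of a single-lookup version that also copes with None values, using a private sentinel object as the default for get(). The names `_MISSING` and `lookup_once` are invented for this example; whether it is measurably faster than the alternatives is exactly the kind of thing to profile.

```python
_MISSING = object()  # unique sentinel; cannot collide with any stored value


def lookup_once(d, key, func):
    value = d.get(key, _MISSING)   # one dictionary lookup
    if value is _MISSING:
        return None
    return func(value)


print(lookup_once({'x': None}, 'x', repr))
print(lookup_once({}, 'x', repr))
```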

------
Walkman
For point 10:

'_' is often aliased as gettext to ease translation of strings:

    
    
        from django.utils.translation import ugettext as _
    
        translated_str = _('Something to translate')
    

so using '_' as a throwaway variable will overwrite the alias. Instead, you can
use '__' (double underscore), as ncoghlan suggests below his answer [1]; or you
can use the 'unused_' prefix, as the Google Python Style Guide suggests [2]; or
you can change your code so that you don't need '_' at all, as Alex Martelli
suggests in his answer [3].

[1]:
[http://stackoverflow.com/a/5893946/720077](http://stackoverflow.com/a/5893946/720077)

[2]: [http://google-styleguide.googlecode.com/svn/trunk/pyguide.ht...](http://google-styleguide.googlecode.com/svn/trunk/pyguide.html?showone=Lint#Lint)

[3]:
[http://stackoverflow.com/a/1739541/720077](http://stackoverflow.com/a/1739541/720077)

~~~
falcolas
I have personally never seen this before. I'd be wary of using it, since it
breaks the "_ is a throwaway" idiom, as well as the REPL "_ is the result of
the last expression" function.

Aliasing it to "t" or "txl" seems like a saner way, if I'm honest.

~~~
thruflo
This has been established practise in Zope, Plone et al for many years.

[http://developer.plone.org/i18n/internationalisation.html#ma...](http://developer.plone.org/i18n/internationalisation.html#marking-translatable-strings-in-python)

~~~
dalke
It's standard Python practice since Martin von Löwis's work in support for
internationalization, which he presented at the IPC6 Python conference in
1997.

Quoting from
[http://www.python.org/workshops/1997-10/proceedings/loewis.h...](http://www.python.org/workshops/1997-10/proceedings/loewis.html)
:

> Here are several ways of producing the message catalogs. This is the one
> suggested by the GNU gettext documentation. First, all translatable message
> strings in the source code must be marked. In order to disturb readability
> as little as possible, the wrapper function around each string should be
> called _ (underscore). This use of the underscore usually does not interfere
> with its meaning in the interactive mode. A gettextized module would then
> begin with
    
    
        import intl
        
        _=intl.gettext

------
udev
One of the handy things not mentioned:

    
    
       with open("x.txt") as f:
           data = f.read()
           # do something with data

~~~
theophrastus
absolutely! that one is quite high up on my python_idioms_to_import file

------
jzwinck
The very first one, "Make a script both importable and executable," needs some
caveats. It's useful sometimes, but people often use it in places where it is
not a great idea. Here's why:

1) If you are in the mindset of "I want a single file which is both a library
and a program," how will you name it? Files to import must end with ".py" and
follow C-style name rules, so they cannot start with a number or contain
hyphens. This leads many people to break conventions when naming their
programs, because on Unix, hyphens are conventional in program names but
underscores are not (count the examples of each in /usr/bin on any system).
And naming your program something like "2to3" is impractical if you want it to
be importable also.

2) It is unclear where to install files which are both programs and libraries.
Programs usually go in some sort of "bin" directory (again, on Unix systems),
but libraries do not. Where do you put a file which is both?

3) Sometimes the __name__ == '__main__' trick is used to include unit tests
within a module. That's not bad, but consider using Python's excellent doctest
feature instead, which often serves the same need but in a richer way.
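
For anyone who hasn't used doctest, a minimal sketch of the two combined. The function `double` is made up for illustration; the examples in its docstring are the tests, and running the file directly checks them.

```python
def double(x):
    """Return twice the argument.

    >>> double(2)
    4
    >>> double('ab')
    'abab'
    """
    return x * 2


if __name__ == '__main__':
    import doctest
    # Runs every example found in the module's docstrings.
    doctest.testmod()
```

Importing the module runs nothing extra; executing it with `python` stays silent when all the examples pass.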

~~~
analog31
I use the __name__ == '__main__' thing for unit testing.

I don't know if this is a generally applicable technique, but a lot of my
modules interact with hardware or physical measurements, so I have to "see"
the results in order to believe that the units are working. Often, what I'm
looking for is problems with what's actually being measured, and the effect of
changing operating conditions, not just my own copious programming bugs.

For this reason, my unit tests can be pretty elaborate, with GUI, graphs, and
other stuff. The unit test also functions as a "demo" of the module.

~~~
jzwinck
Absolutely--that's a great example of a library module that may also be
usefully executed. I had something similar today: a module that sends email.
It's useful to be able to run it (by explicit "python foo.py", not chmod +x)
and see a sample email in my inbox.

Unfortunately, for every good use of this trick, there seem to be two poor
ones. Oh well. Most of the things in TFA are more generally applicable.

------
simon_weber
> 9\. Create dict from keys and values using zip

In 2.7+, I'd recommend a dictionary comprehension instead.

~~~
aaren
In this case,

    
    
        {k: v for k, v in zip(keys, values)}
    

edit:

As I mention below, this becomes useful when you want to do e.g.

    
    
        {f(k): g(v) for k, v in zip(keys, values)}

~~~
epistasis
I'd disagree, as

    
    
      dict(zip(keys, values))
    

is more concise, doesn't introduce extra variables, doesn't repeat itself,
and explicitly names a dict rather than using a symbol.

~~~
aaren
I like list comprehensions and to me dict comprehensions feel like a natural
extension of this. It means that you can do things like

    
    
        {k.upper(): v ** 2 for k, v in zip(keys, values)}
    

Or generally

    
    
        {f(k): g(v) for k, v in zip(keys, values)}
    

I find this very extensible. In terms of conciseness, they both fit on a
single line and I like the whitespace inside a dict comprehension.
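
A quick runnable sketch showing that the two spellings build the same dict, and where the comprehension starts to pay off. The sample `keys` and `values` are invented for the example.

```python
keys = ['a', 'b', 'c']
values = [1, 2, 3]

plain = dict(zip(keys, values))
comprehension = {k: v for k, v in zip(keys, values)}

# Transforming keys or values needs no extra machinery in the comprehension.
transformed = {k.upper(): v ** 2 for k, v in zip(keys, values)}

print(plain == comprehension)
print(transformed)
```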

------
agentultra
As an implementor of Hy (a homoiconic lisp frontend to Python) I've found
certain Python idioms to be rather infuriating of late.

In particular:

    
    
        >>> 0 == False
        True
    

Which makes the idiom of testing truthiness quite annoying in parsing code
such as:

    
    
        def is_digit_char(s):
            """ Return a parsed integer from 's'."""
            try:
                return int(s)
            except (ValueError, TypeError):
                return None
    

Which is harmless enough except that as a predicate it sucks because parsing
"0" returns 0, which is falsy, in a context where I'd rather know whether I
parsed an integer or not... which leads to non-idiomatic code.

This is mainly because True/False are essentially aliases for 1/0 and as such
won't identify so:

    
    
        >>> 0 is False
        False
        >>> 0 is 0
        True
    

So it's important to remember another Tim Peters-ism: _Although practicality
beats purity._ As read in the Zen of Python it seems he's referring to special
cases which this might be.

As a shameless aside, you should see what we're working on in Hy. There will
likely come a point where we'll be able to do kibit-style suggestions of
idiomatic transformations to your Python code.

 _Update_ : I ran into this while trying to write token parsers for a parser-
combinator lib in Hy.

~~~
riquito
Your function is_digit_char(s) is peculiar. From the name I would expect it to
return a boolean, instead it returns a number or None.

For the falsiness of 0 to be a problem, you must be using the return value
both as a conditional expression and as an integer value, e.g.

    
    
        digit = is_digit_char('0')
        if digit: # fail
            print(digit)
        else:
            print('not a digit')
    

You should then check whether it is None:

    
    
        digit = is_digit_char('0')
        if digit is None: # pass
            print('not a number')
        else:
            print(digit)
    

But I would argue that you were asking for trouble when you wrote an
is_something() function that doesn't return a boolean. That is not idiomatic.
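
One way to split the two roles, as a sketch: keep a parser that returns the value or None, and build a real boolean predicate on top of it. The names `parse_digit` and `is_digit` are suggestions, not anything from Hy.

```python
def parse_digit(s):
    """Return int(s), or None if s cannot be parsed."""
    try:
        return int(s)
    except (ValueError, TypeError):
        return None


def is_digit(s):
    """Boolean predicate built on top of parse_digit."""
    return parse_digit(s) is not None


print(parse_digit('0'), is_digit('0'))
print(parse_digit('x'), is_digit('x'))
```

Callers that need the value compare against None explicitly; callers that only need a yes/no answer get a real bool, so the falsiness of 0 never comes into it.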

p.s. Hy is too crazy :-)

~~~
agentultra
It's a trivial example and I wouldn't focus on it too much.

It's not that peculiar -- instead of parsing the same character twice you
simply return the value that you parsed or None. My inspiration was from the
CLHS predicate function, DIGIT-CHAR-P [0].

The real point I was making is that Python has warts that make writing
idiomatic code impractical in some situations. I suggest that practicality
take precedence over purity. There are some situations that lead to non-
idiomatic code and that's okay.

[0]
[http://clhs.lisp.se/Body/f_digi_1.htm](http://clhs.lisp.se/Body/f_digi_1.htm)

 _Update_ : forgot the link. :)

 _Update update_ : Perhaps peculiar to Python because every integer is truthy
except for 0, whereas in another language that doesn't have this
wart, anything that isn't False is True... even 0. In other words, anything
that isn't False is True. :D

~~~
masklinn
> In other words, anything that isn't False is True. :D

Except for `nil` (lisps, ruby), and possibly a host of other things depending
on the language (empty strings in javascript).

~~~
agentultra
Scheme is the only PL I know that has an explicit #f value.

CL, for all intents and purposes, treats nil as False... but some find the
conflation of nil and the empty list runs into the same issue when operating
on s-exprs.

    
    
        CL-USER> (if '() 1 0)
        0
    

vs

    
    
        scheme> (if '() 1 0)
        1

------
jackmaney
Although many of them boil down to preferences and philosophical points of
view, I find these kinds of idioms useful. Whenever I write code in a new
language, I want to "write as a native" so that I can maximize the effect that
the language has on how I think about programming.

For Python in particular, Jeff Knupp's "Writing Idiomatic Python"[1] (not
free, but not expensive, either) goes into detail on a lot of the concepts in
the OP's slides. (I'm not affiliated with Jeff in any way, just a satisfied
customer.)

[1] [http://www.jeffknupp.com/writing-idiomatic-python-ebook](http://www.jeffknupp.com/writing-idiomatic-python-ebook)

~~~
tbirdz
What level would you say the idioms in the book are at? I have already been
programming in python for a little while, and I wouldn't want to pay for
something which I already know. It would be nice if there were more sample
idioms on that site so you could have a better idea of what the rest of the
book was like.

~~~
e12e
Given that the author is willing to give away copies to those that can't
afford the book, I'm sure he'll consider a preview/promise refund if you send
him an email? (See bottom of page).

Note: Also not affiliated in any way.

[edit: given:

[http://www.goodreads.com/book/show/17354838-writing-idiomati...](http://www.goodreads.com/book/show/17354838-writing-idiomatic-python-2-7-3)

it doesn't look like it's of all that much value to an experienced python
developer. I'd love to hear some other comments, though.]

------
jfischer
Alex Martelli gave a nice talk called "Permission or Forgiveness" about the
exception handling style recommended by OP:
[http://pyvideo.org/video/1338/permission-or-forgiveness-0](http://pyvideo.org/video/1338/permission-or-forgiveness-0). He
has some nice insights into this issue.

That being said, I think there are some situations where you want to check for
problems up front (possibly in addition to exception handling). In particular,
if you are parsing some data from outside the program, you may want to provide
some context about what was wrong. KeyError is not very helpful to your users.
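
A rough sketch of that kind of up-front check, assuming a hypothetical record format with two required fields. The field names, the `source` label and the function name are all invented for illustration.

```python
REQUIRED_FIELDS = ('name', 'email')   # hypothetical schema


def validate_record(record, source='<input>'):
    # Report every missing field at once, with where the data came from.
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        raise ValueError(
            '%s: missing required field(s): %s' % (source, ', '.join(missing))
        )
    return record


try:
    validate_record({'name': 'Ann'}, source='users.csv line 3')
except ValueError as exc:
    print(exc)
```

The message tells the user which file, which line and which field, instead of a bare `KeyError: 'email'`.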

------
Vaskivo
As a huge Python fan, I'm ashamed to admit but I don't get the

    
    
        while True:
            break
    

What's the problem? I suppose the use case is

    
    
        while True:
            # do stuff
            if some_condition:
                break
    

What is the alternative? 'while some_condition'? That means we must have the
'some_condition' variable outside of the loop. And if we have multiple exit
points it may become a mess.

~~~
falcolas
Personally, because I find infinite loops to be a real PITA, I prefer to do:

    
    
        for _ in xrange(100000):
            break
        else:
            logging.error("ran into an infinite loop")
    

unless I really do need an infinite loop for things like event handler loop,
which is admittedly quite rare.

~~~
meowface
That seems incredibly silly, and looks like it could lead to very infrequent
bugs (the worst kind). `while True` is shorter, simpler, and conveys the
actual purpose better.

Maybe having a statement like that when testing is okay, but in production
code that looks insane.

~~~
falcolas
I'd argue that `while True` would be fine in testing, but problematic for
production, particularly in the types of programs I write most frequently
(long running, minimal supervision). In these daemons, stalling on infinite
loops is significantly more painful than dropping out of a loop early
occasionally (and with proper logging that it did fall out of the loop, to
boot).

It's certainly not as idiomatic, but it's more correct in the long run. My
eyes were opened to this when reading through the NASA C guidelines. Closing
the door on infinite loops lets programs recover gracefully and do the correct
thing for the duration of the program, as opposed to the thing that _may_ be
correct for that moment of operation: i.e. looping 1-2 more times beyond the
limits of the xrange counter.
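
Spelled out a bit more fully, the pattern might look like the sketch below. It is written with `range` so it runs on Python 3 (on 2.x you would use `xrange`); the cap, the `poll` name and the `step` callback are all placeholders.

```python
import logging

MAX_ITERATIONS = 100000   # arbitrary cap chosen for illustration


def poll(step, limit=MAX_ITERATIONS):
    """Call step() until it returns True or the cap is reached."""
    for _attempt in range(limit):
        if step():
            break
    else:
        # The else clause runs only if the loop finished without break.
        logging.error('gave up after %d iterations', limit)
        return False
    return True


print(poll(lambda: True))
print(poll(lambda: False, limit=10))
```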

------
aaren
I would add generator expressions:

    
    
        (f(x) for x in list_of_inputs)
    

Just like a list comprehension, but with (...) rather than [...] and with lazy
evaluation.

These are useful when you don't need to evaluate all of the inputs at once but
still want to iterate over them at some point later on.
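
A small sketch that makes the laziness visible by recording when `f` actually runs. The `calls` list exists only for the demonstration.

```python
calls = []


def f(x):
    calls.append(x)      # record when the function actually runs
    return x * x


lazy = (f(x) for x in [1, 2, 3])
print(calls)             # [] : nothing has been evaluated yet
print(next(lazy))        # 1  : evaluates only the first item
print(calls)             # [1]
print(list(lazy))        # [4, 9] : consumes the rest
```

Note that a generator can only be consumed once; after the last line, `lazy` is exhausted.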

~~~
runarberg
Not to mention how awesome they become when you feed them as a set of values
to a function.

    
    
        values = (f(x) for x in list)
        g(*values)

------
kerpal
Thanks for this write up. I didn't know about enumerate. I never thought of
swapping variables as in example 4 either.

I noticed one small mistake in section 9:

    
    
      d[keys] = values[i] 

Should be:

    
    
      d[key] = values[i]

~~~
aaren
For a good few months I was doing

    
    
        for i in range(len(input_list)):
            v = input_list[i]
            # ...do something with v and i...
    

enumerate is much nicer!

    
    
        for i, v in enumerate(input_list):
            ...

------
calroc
This is a great, if somewhat basic, list. My $0.02: Python is a tool for
thinking clearly. That you can run your "thoughts" as written is a very nice
bonus, of course.

There are some really interesting things that Python allows:

    
    
        >>> d = {}
        >>> d[23], d[14], d['hmm'] = 'abc'
        >>> d
        {'hmm': 'c', 14: 'b', 23: 'a'}

~~~
calroc
I couldn't resist. Moar funky-cool Python:

    
    
        >>> from string import ascii_letters
        >>> ors = ['|'] * (len(ascii_letters) * 2 - 1)
        >>> ors[::2] = ascii_letters
        >>> ''.join(ors)
        'a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z|A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z'
    

(Obviously, in this example '|'.join(ascii_letters) would suffice, but if the
objects weren't strings...)

~~~
beambot
Not gonna lie... your snippet made me say "WTF?" Why not just do:

    
    
      from string import ascii_letters
      '|'.join( ascii_letters )
    

(I also like list slicing... but only when necessary.)

~~~
calroc
Yeah, I realized after posting it. (See edit.) The original code was using
non-string objects. I basically came up with this to get the same behavior as
str.join(). ;-)

~~~
jerf

        '|'.join(str(x) for x in y) # or
        '|'.join(map(str, y))

~~~
calroc
The original use for that code was to build a list of parsing objects to
create a single parsing object that was the OR'ing of the basic ones.

Aw, heck, code speaks louder than words:

[https://github.com/PhoenixBureau/PigeonComputer/blob/master/...](https://github.com/PhoenixBureau/PigeonComputer/blob/master/pigeon/cola/bubbles.py#L200)

~~~
jerf

        intersperseM or $ map chartok whitespace
    

Oh... wait... sorry... wrong language.

------
hyperbovine
His last slide could be written more idiomatically as

    
    
        ''.join('Thanks!')

~~~
Walkman
And this would be more idiomatic:

    
    
        'Thanks!'
    

(It is faster also! :D)

~~~
hyperbovine
Ahh but that is a mere expression, not an idiom :-)

------
mangeletti
Srsly, what Python programmer writes the code in the "Bad" examples therein?
This list looks like it's from 2005 or something.

~~~
deckiedan
Lists like this aren't for experienced pythonistas, but more for new people to
know what things to avoid, and what not to copy if they do come across it
online (in an article from 2005...) :-)

------
malkia
On page 20, there is a slight mistake - count is used in the second ("NOT SO
GOOD") example but "i" is printed.

~~~
okwa
Or was it really a mistake?

~~~
malkia
Are there any anaphoric variable names in python, like cmd.exe?

------
juanuys
On page 20, didn't he/she mean "print(count, name)" in the BAD example?

On page 18, the Java comment: it depends. E.g. see [1]

[1] [http://stackoverflow.com/questions/299068/how-slow-are-java-...](http://stackoverflow.com/questions/299068/how-slow-are-java-exceptions)

------
gabipurcaru
Another one that I find useful -- using `map` instead of list comprehensions
when you want to apply a function to every element. So instead of:

    
    
        [str(x) for x in array]
    

Do this:

    
    
         map(str, array)
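
One caveat worth a sketch: on Python 3, `map` returns a lazy iterator rather than a list, so the two forms are only interchangeable if you wrap the call in `list()`. The sample `array` is made up.

```python
array = [1, 2.5, None]

as_comprehension = [str(x) for x in array]
as_map = list(map(str, array))   # list() is needed on Python 3, harmless on 2

print(as_comprehension == as_map)
```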

------
drewblay
        pets = ['Dog', 'Cat', 'Hamster']
        for pet in pets:
            print('A', pet, 'can be very cute!')

This may be nit picking but I prefer output like this:

        print 'A %s can be very cute!' % (pet)

~~~
rsfinn
Possibly because you haven't moved to Python 3?

~~~
pekk
Python 3 doesn't dictate that style, just the parentheses

------
Redoubts
I don't understand the truth table on slide 9 for "- vs None" and
"__nonzero__ (2.x) __bool__ (3.x) vs __nonzero__ (2.x) __bool__ (3.x)"

~~~
Arnor
I'm pretty sure the "-" vs "None" meant that there wasn't a truthy alternative
to None. More accurate would have been "N/A" vs "None". Not clear on the
__nonzero__/__bool__ stuff...

~~~
falcolas
__nonzero__/__bool__ are the methods called on objects when evaluating
truthiness. So, if you're implementing custom objects, those are what you
would use, and return either True or False, based on their truthiness value.

As an example, you might implement __bool__ on a database connection object to
reflect whether there is a live connection, allowing you to:

    
    
        if not conn:
            conn = MySQLdb.Connect()
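
A minimal sketch of such a class, to show the mechanics. This `Connection` is a made-up wrapper, not the MySQLdb API, and `alive` stands in for whatever liveness check a real driver would do.

```python
class Connection(object):
    """Hypothetical connection wrapper; not the MySQLdb API."""

    def __init__(self):
        self.alive = False

    def open(self):
        self.alive = True

    def __bool__(self):          # Python 3
        return self.alive

    __nonzero__ = __bool__       # Python 2 looks for this name instead


conn = Connection()
if not conn:
    conn.open()
print(bool(conn))
```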

------
myle
String concatenation is pretty fast nowadays with the + operator.

~~~
elbear
Do you have a source? Also, I would argue that concatenation with + is less
readable than string formatting.

~~~
NigelTufnel
I think that concatenation with + is okay when you're concatenating two
strings. And CPython can optimize x = x + y or x += y calls.

[http://docs.python.org/2/library/stdtypes.html](http://docs.python.org/2/library/stdtypes.html)

------
mrfusion
I'd love to see a writeup like this for Perl or Javascript. Has anyone come
across one?

How about Java?

------
rafekett
i agree with all but #2. this seems to embrace conciseness as simplicity or
understandability. it forgets the more cardinal value that explicit is better
than implicit.

