
The Elements of Python Style - pixelmonkey
https://github.com/amontalenti/elements-of-python-style
======
Walkman
This guy got a bunch of things wrong:

> encode = lambda x: x.encode("utf-8", "ignore")

Why not rather call it s (short for str)?

> With tuple unpacking, using _ as a throwaway label is also OK.

Use __ (double underscore) instead so it won't clash with the gettext
convention:

    
    
        from django.utils.translation import ugettext as _
    

> always use *args and **kwargs for variable argument lists

No, no, no, no, no! Why not clarify what those accepted arguments are?

What is the following code doing?

    
    
        def shuffle(*args):
            pass  # do something
    

It shuffles SOMETHING.

And this?

    
    
        def shuffle(*cards):
            pass  # do something
    

It shuffles CARDS. Big difference IMO.

> Use parens (...) for fluent APIs

I don't like that.

> Use implicit continuations in function calls

PEP8 suggests: "The preferred place to break around a binary operator is after
the operator, not before it."

Following this logic, this should be:

    
    
        return set((key.lower(), val.lower()) for
                    key, val in mapping.iteritems())
    

(for at the end of the first line)

> Rarely create your own exception types

> ... and when you must, don't make too many.

Sorry but that's plain stupid.

> if item vs if item is not None

Yes, you should absolutely care. Not all the time, but sometimes a "NULL"
value is different from a filled but empty value. See zero: it's falsy, but
a totally acceptable input...
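
A minimal sketch of that difference (the function names `describe` and `describe_strict` are made up for illustration):

```python
def describe(count=None):
    # Truthiness check: treats 0 the same as "no value given".
    if count:
        return "got %d" % count
    return "nothing given"


def describe_strict(count=None):
    # Identity check: only None means "no value given"; 0 is valid input.
    if count is not None:
        return "got %d" % count
    return "nothing given"


print(describe(0))         # nothing given
print(describe_strict(0))  # got 0
```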

~~~
pixelmonkey
I'm "this guy" who "got a bunch of things wrong" :)

Style guides are opinionated, and not everyone will agree. I intentionally
made the style guide part of a GitHub repository so I could consider feedback
from the community, and I've already incorporated plenty. As I mention below in
the thread, I think you have a good point about "args" vs "cards" for your
shuffle example.

Re: "The preferred place to break around a binary operator is after the
operator, not before it", I think you misread this PEP8 suggestion. A binary
operator is something like "and" or "+", not the "for" keyword inside a
listcomp.

Re: your other disagreements, they amount to style choices of their own. When
a style guide picks between two ways of doing things, there will inevitably be
detractors, and I am OK with that.

~~~
Walkman
If you don't use "if item" properly, it's bad code, not just style choice.

------
kevin_thibedeau
> Use reST for docstrings

I would suggest also using the Napoleon [1] extension now included with recent
versions of Sphinx. It allows you to use Google or Numpy-style reST formatting
to describe parameters and types without all the javadoc-ish noise from
:param:, :type:, and others.

[1] [https://sphinxcontrib-napoleon.readthedocs.org/en/latest/](https://sphinxcontrib-napoleon.readthedocs.org/en/latest/)
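
For comparison, a rough sketch of the same docstring in both styles. The functions `scale` and `scale_google` are hypothetical; the Google-style sections (`Args:`, `Returns:`) are the kind Napoleon parses once `sphinx.ext.napoleon` is listed in the Sphinx `extensions` setting.

```python
def scale(value, factor=2):
    """Multiply a value by a factor (reST field-list style).

    :param value: The number to scale.
    :type value: int
    :param factor: The multiplier.
    :type factor: int
    :returns: The scaled number.
    :rtype: int
    """
    return value * factor


def scale_google(value, factor=2):
    """Multiply a value by a factor (Google style).

    Args:
        value (int): The number to scale.
        factor (int): The multiplier.

    Returns:
        int: The scaled number.
    """
    return value * factor
```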

~~~
pixelmonkey
Thanks -- we are actually discussing this possible change in a Pull Request
here: [https://github.com/amontalenti/elements-of-python-style/pull...](https://github.com/amontalenti/elements-of-python-style/pull/4)

------
ot
> If the strict 79-character line length rule in flake8 bothers you, feel free
> to ignore or adjust that rule.

Thanks for pointing this out. Too many people use flake8/pep8/pylint in commit
hooks.

From the BDFL himself:

> "I personally hate with a vengeance tools named after style guide PEPs that
> claim to enforce the guidelines from those PEPs."

> "[stylechecker] tools' rigidity and simplicity reflects badly on the [style
> guide] PEPs, which try hard not to be rigid or simplistic"

> GvR

[https://twitter.com/raymondh/status/683793667996303360](https://twitter.com/raymondh/status/683793667996303360)

[https://twitter.com/raymondh/status/683809696902332416](https://twitter.com/raymondh/status/683809696902332416)

------
t0mk
My big problem when reading Python code is "from" and "as" in imports.

To illustrate: I read a function doing a few calls in a file, and I want to
understand what happens in one of the calls (a name without any dot prefix).
So I do a forward search for the name in the current file (a bit silly when
thinking of it, but forward slash is so close to my index finger; and in web
browser the search is forward by default), and it only finds it in imports in
the beginning of the file after "from". At this point I lose the context of
the function I was originally reading, because I have seen other chunks of
code which I try to understand a bit. Also I lose the position of the original
function in the file. If I want to dive deeper to the calls, I look up the
module in question.

If the name were used with its full namespace prefix, like

    
    
      import datetime
      datetime.datetime.utcnow()
    

^ I could immediately see which module it's coming from and go there. Straight
up! I wouldn't need finger acrobatics, and it would be readable in e.g. github
code viewer.

If you are really bothered by the length of the dot-prefixed names, why not do

    
    
      now = datetime.datetime.utcnow()
      print(now())
    

^ That's all clear, and most likely you will fit within 79 chars per line.

Namespaces are a good idea, and explicit over implicit, right? I understand
that there are some conventions (like _ for gettext), but I see importing
names without dot prefixes as killing readability.

~~~
vadskye
Minor nitpick: I think you mean

    
    
       now = datetime.datetime.utcnow # no parentheses
    

More generally, I think the strongest argument for using "from" is when you're
dealing with redundant "foo.foo.bar" names. Would you really be confused at
seeing "datetime.utcnow()" instead of "datetime.datetime.utcnow()"?
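
To make the nitpick concrete, a small sketch (it uses `now` rather than `utcnow`, since newer Python versions deprecate `utcnow`; the variable names are only for illustration):

```python
import datetime

# Alias the callable itself (no parentheses); each call gives a fresh time.
now = datetime.datetime.now
first = now()
second = now()

# Calling at assignment time stores a datetime object, which is not callable.
stamp = datetime.datetime.now()

print(callable(now), callable(stamp))  # True False
```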

------
jqm
Good piece.

A (minor) nitpick.

In the section titled '''Prefer "pure" functions and generators''', two "dedupe"
functions are shown. They are stated to be the same but aren't. The first
function returns the number of duplicated elements excluded (an int). The
second function returns a set excluding the duplicated elements. It's true (in
the first function) that the original list "items" will be modified to exclude
the duplicated elements, but an additional unshown step is needed in the second
to determine the number, i.e. len(items) - len(set(items)).
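
A rough sketch of the distinction. These are hypothetical stand-ins written for this comment, not the guide's actual functions:

```python
def dedupe_in_place(items):
    """Remove duplicates from ``items`` in place; return how many were removed."""
    seen = set()
    kept = []
    for item in items:
        if item not in seen:
            seen.add(item)
            kept.append(item)
    removed = len(items) - len(kept)
    items[:] = kept  # mutate the caller's list
    return removed


def dedupe_pure(items):
    """Return a new set of unique items; the input is left untouched."""
    return set(items)


data = [1, 2, 2, 3, 3, 3]
unique = dedupe_pure(data)
# The pure version needs the extra step to recover the count.
count = len(data) - len(unique)
print(count)                  # 3
print(dedupe_in_place(data))  # 3
print(data)                   # [1, 2, 3]
```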

~~~
pixelmonkey
I agree. I opened PR #19 to address this:

[https://github.com/amontalenti/elements-of-python-style/pull...](https://github.com/amontalenti/elements-of-python-style/pull/19)

Would love to hear what you think of the changes.

------
entee
I've always been curious about docstring style. I've usually gone with numpy-
style docstrings:

[https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMEN...](https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt)

They tend to be easier to read than reST, and happen to be used by the
packages I use most (numpy, scipy). They are also easily worked into
Sphinx/Napoleon.

2 questions:

1.) Why use reST instead?

2.) Why isn't there a python-wide standard or best practices as is found for
Scala/Java/others?

~~~
wcarss
From [1], the "Docstring Standard" section of your link, "Our docstring
standard uses re-structured text (reST) syntax". It seems like they use reST?

1 -
[https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMEN...](https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt#docstring-standard)

~~~
yeukhon
That refers to the fact that the syntax in the docstring format they developed
is supported only by reST, rendered through Sphinx's reST integration. For example,

    
    
        .. image:: filename 
    

You don't get that in Markdown.

------
amyjess
> The biggest offender here is the bare except: pass clause. Never use these.
> Suppressing exceptions is simply dangerous.

This doesn't apply if you're using exceptions as control flow, which is
idiomatic and recommended Python behavior.

Let's say you want to make sure some file doesn't exist before you start doing
something. Specifically, you want to wipe it out if it exists, and you don't
care if it doesn't exist. You do:

    
    
        if os.path.exists(temp_file):
            os.remove(temp_file)
    

This isn't idiomatic Python behavior. Instead, you want to use exceptions:

    
    
        try:
            os.remove(temp_file)
        except OSError:
            pass
    

Mind you, an actual bare "except:" (without specifying an exception type) is
still bad, because that'll catch things you don't want being caught, like a
SystemExit or KeyboardInterrupt, but it's perfectly acceptable to silently
pass in this case.

~~~
shoyer
Even better (with Python 3.4 or newer):

    
    
        from contextlib import suppress
    
        with suppress(OSError):
            os.remove(temp_file)
    

More generally, context managers are the right way to refactor repeated
try/except/finally clauses.
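
As a sketch of that refactoring, assuming a hypothetical helper named `scratch_file`, the cleanup logic lives in one place and every caller gets it through `with`:

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_file(suffix=".tmp"):
    """Yield a temp file path and always remove the file afterwards."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    try:
        yield path
    finally:
        # Cleanup runs whether or not the body raised.
        try:
            os.remove(path)
        except OSError:
            pass


with scratch_file() as path:
    with open(path, "w") as handle:
        handle.write("hello")
    print(os.path.exists(path))  # True

print(os.path.exists(path))  # False
```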

~~~
amyjess
Well, I have yet another reason to migrate from 2 to 3 now.

------
teddyh
> _With tuple unpacking, using _ as a throwaway label is also OK._

I see this sometimes, but the problem is that this breaks internationalization
using gettext, commonly imported under the name _:

    
    
       from gettext import gettext as _
    

(For those who’ve never seen it, “_” is then used as a function which wraps
strings, like _("Book"); translation tools can then _parse the source code_ ,
deduce that “Book” is a term which needs translating, and at runtime the “_”
function can return the translated term instead of the original string when a
different language is needed. The name “_” for this functionality predates
Python and is a common feature across most programming languages using the
gettext method of internationalization. It is therefore unfortunate that some
Python users have taken “_” to be a kind of throwaway variable.)

~~~
Spiritus
I thought it was mostly imported as __ (double underscore).

~~~
teddyh
No, I’ve never seen that. The normal single-underscore name is used in the
Python documentation:

[https://docs.python.org/3.6/library/gettext.html](https://docs.python.org/3.6/library/gettext.html)

~~~
Spiritus
You're right, I was probably thinking of some other
library/framework/language.

------
aaronchall
I agree with a lot of the criticisms. In addition,

1. The prefixed underscore affects exports if you're providing an API in a
package - see the standard library for lots of examples of prepended
underscores.

2. Don't use lambdas where a regular function definition works.

3. Only use *args and **kwargs where these args are completely generic,
otherwise be specific as to the semantics.

4. Complex list comps should be rewritten to append to lists.

5. Multiline condition statements should use indentation other than 4 spaces
(2 or 8, for example).

6. Don't use `if foo` when you mean `if foo is not None`.

Good luck with your style guide, I wish more people would make good style
important to them.
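
As a small illustration of point 2, here is a sketch based on the `encode` lambda quoted earlier in the thread (`encode_lambda` is a made-up name for the comparison):

```python
# Assigning a lambda to a name still leaves the function object anonymous...
encode_lambda = lambda text: text.encode("utf-8", "ignore")


# ...while a def carries its own name (useful in tracebacks) and a docstring.
def encode(text):
    """Encode text as UTF-8 bytes, ignoring unencodable characters."""
    return text.encode("utf-8", "ignore")


print(encode_lambda.__name__)  # <lambda>
print(encode.__name__)         # encode
```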

------
krick
Most of it is pretty obvious. However, what I am still struggling with in
Python is project layout/module structure.

In languages with C++-like OOP model (Java, PHP) it's most of the time
reasonable to settle on "1 file, 1 class" and to use directory structure the
same way you use namespaces. Actually, in some such languages it's enforced.

For PHP it still leaves a problem with where you should put/expect to find
anything besides classes, but it's fine, more or less. I hope some day we'll
decide even more generally what works best, but for now it's decided per
project/framework and it's mostly bearable.

For Ruby it's mostly the same as for PHP, even though the language is a bit
more flexible. But it's still normal to use class as the main entity, so
pretty much the same guidelines/problems apply.

For Python… I cannot decide yet, and what's worse, looking at the code on,
say, Github, it doesn't seem like it's less of a problem for many others.
Because some of "solutions" are seriously horrible. No project structure
whatsoever, tens of classes in the same file. Classes, functions and even app-
specific code (like GUI, CLI or HTTP-routing) all in the same file. Multi-
thousand LOC files are considered completely normal. Ridiculous amount of code
in __init__.py, that completely re-arranges submodule structure or even
defines a bunch of new code. Hell, I've seen projects (more than 1) almost
entirely written in the __init__.py! Please don't say you think it's normal.

I would say that things remind me of "project-structure" in C, but (it's scary to
suddenly realize that) it would actually be a compliment for Python, because
even though large files are considered normal as well, with modern practices
of writing C-code and all that "C-lang OOP" model stuff it's mostly clear
where you should put something. With Python I'm never sure. The only way to
find something is to use grep.

With things like that in Python, it's even no use to speak of a "project
structure" in the more general sense, aside from what's enforced by the
frameworks like Django.

So, any suggestions?

~~~
encoderer
In PHP you create classes for everything because there is no other
encapsulating scope. In that circumstance, with the class loader hook the
language gives you, one class per file is a clear win.

In Python, with module-level scope, it makes sense to group like-functionality
in the same module (aka _file_ ). Maybe it only seems "wrong" to you because
it's different than other languages you're more familiar with?

~~~
krick
Yes, it makes sense "to group like-functionality in the same module", but that
seems to be the very problem. Because I already described what it leads to,
and compared with state of things in languages with more rigid code structure
I cannot say it satisfies me.

And, just to clarify, I consider myself quite familiar with Python, because
it's probably my first language I'm still using a lot, even though the
language itself changed a lot during this time. That's about 10 years. And I
am _still_ never sure where I'm supposed to put anything.

When 1 file contains everything there is to say about `time` — it will contain
a lot, because there really is a lot to say about time. It's much easier to
move around the codebase, when there are many files with clear semantic scope.
Say, there's a module Time with a file (and a class) Timezone in it. Then,
even though the class Timezone may be 4000 lines long — it doesn't bother me.
If I assume that there's no crazy person working on that project, that would
write something about product energy value in the class named "Timezone" —
there's pretty much zero cognitive load of keeping that file in my text
editor. Because I know, that even though 4000 LOC is quite a lot — there's
nothing else in that file. There's no "before" and "after" that class. If I am
not finding something here — it isn't here.

And the best thing is, that the directory structure can be quite nested, but
feel pretty flat. The directory tree is completely semantic, there's no 1
directory with 200 files in that, there's clear meaning in that:
src/Time/Timezone, src/Time/Datetime, src/Food/Components/Protein,
src/Food/Components/Fat. I can execute `tree -f` in the console, and I already
know a lot about what I can expect to find in the codebase. Discoverability is
important.

It is also easier to review commits in CVS with a project structure like that.
I can look only at files changed and I know, that somebody changed something
in Timezone class, not just something "time-related".

With python it's a great deal harder to follow project structure like that. If
everything can be grouped by file based just on "related-ness" principle —
files can grow huge and I have to apply constant, significant effort to notice
that something can become a sub-module, not just a file with a bunch of
functionality. And looking at the file with the same 4000 LOC becomes way
harder: I don't know what else there is in that class, judging only by
filename, assumption of developers being more or less sane and even the piece
of code I'm looking at right now. There can be many things "related to time"
before or after that piece of code. 4000 suddenly seems much larger than
before, and the cognitive load now is quite significant.

That's why I'm looking for some "project-structure style guide" for Python.
There needs to be some set of simple rules to follow, so that python project
would not become the mess it always becomes. If we think it's reasonable to
have a style-guide about "backslash vs parens in multi-line expressions" it's
unreasonable to assume that project structure will grow just fine all by
itself. Well, it contradicts my observations.

~~~
encoderer
Python gives you packages for domains as large as "time". You would create a
directory (for the uninitiated: with an empty __init__.py file in it) and now
that directory is a package. The files within it are modules. There would be a
time package with a timezone module within. That module would have the
Timezone class.

The language has a feature -- a second level of scope -- that does need to be
managed if used. You're right about that and it's a legitimate concern I
suppose if it affects your throughput.

I suppose to me it's a win because my Python code has fewer classes than say
my php code because I'm able to make use of module scope to provide
encapsulation for things that don't need to be a class.
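
A runnable sketch of that layout. The package name `timekit` is hypothetical (chosen to avoid shadowing the standard library's `time`), and the files are written to a temp directory only so the example is self-contained:

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
package_dir = os.path.join(root, "timekit")  # hypothetical package name
os.mkdir(package_dir)

# An empty __init__.py marks the directory as a package.
open(os.path.join(package_dir, "__init__.py"), "w").close()

# timezone.py is a module inside the package; it holds the Timezone class.
with open(os.path.join(package_dir, "timezone.py"), "w") as module_file:
    module_file.write("class Timezone(object):\n    name = 'UTC'\n")

sys.path.insert(0, root)
importlib.invalidate_caches()
timezone = importlib.import_module("timekit.timezone")
print(timezone.Timezone.name)  # UTC
```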

------
eugenekolo2
I can't take a guide seriously when things like this are sprinkled all over
it:

    
    
        return [x
             for x in items
             if x.endswith(".py")]
    
        if (response and
            "data" in response and
            response["data"]):
            return response["data"]
    

Just use 2 lines instead of using up valuable lines and making the code
harder to read.

ex:

    
    
        pyfiles = [x for x in items if x.endswith(".py")]
        return pyfiles

~~~
pixelmonkey
It's a fair point, but the style guide is meant to be illustrative.

The former example is indented the way longer lines might very well be e.g. if
your filter clause were a 2-part bool, you might not be able to fit it in a
one-liner. The indentation also clarifies the difference between it and the
imperative example above it.

The latter example is a rewrite of code written with 2 nested if statements,
so the indentation serves to show that now the two nested statements were
incorporated into the bool evaluation. If I made it a one-liner, someone might
make the argument that the nested version is "more readable", but as written
all you can say is that the single-if version is "less nested", which is the
point I was trying to get across.

------
mpdehaan2
At first read this is a pretty reasonable doc for some things to consider. I
also recommend watching the linked "PEP8" YouTube presentation from PyCON:
[https://www.youtube.com/watch?v=wf-BqAjZb8M](https://www.youtube.com/watch?v=wf-BqAjZb8M)

From past projects, I got way more adherence-to-the-letter type patches versus
"make this code pythonic and elegant" type patches, and the latter IMHO is
more of the point of where things should go.

It's a bit of a long/slow talk, so if you find yourself getting bored, switch
it up to 2x speed on youtube and I think many of you will like it.

My personal PEP8 ignore list was:

-pep8 -r --ignore=E501,E221,W291,W391,E302,E251,E203,W293,E231,E303,E201,E225,E261,E241 lib/ bin/

Which is quite a lot of it :)

Codebases are easier to understand when everybody's consistent within the
codebase, but things need to be idiomatic well beyond PEP8 and it's easy to
have a forest vs. trees scenario.

------
user9756
>But another good example is rewriting an if/else chain as a dictionary lookup
or repetitive code as a tuple of operations followed by a for loop.

I'm familiar with the dictionary lookup method but not the tuple. Perhaps it's
obvious but could someone please give an example?
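
One possible reading of the guide's suggestion, as a minimal sketch (all function names here are made up for illustration): instead of repeating near-identical statements, list the operations once in a tuple and loop over them.

```python
def strip_whitespace(text):
    return text.strip()


def collapse_spaces(text):
    return " ".join(text.split())


def lowercase(text):
    return text.lower()


# Instead of three near-identical "text = step(text)" statements,
# list the operations once and loop over them.
operations = (strip_whitespace, collapse_spaces, lowercase)


def normalize(text):
    for operation in operations:
        text = operation(text)
    return text


print(normalize("  Hello   World "))  # hello world
```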

------
ivan_ah
All great tips, except I still feel I could write docstrings in Markdown.

Is there a way to use Sphinx but with markdown?

------
SFjulie1
The grammar capos of python are back.

Can we get rid of the stupid indent vs tab stuff and have errors visible on
screen without black magic?

Oh and maybe why not braces to really control variable lifetime?

    
    
        from __future__ import braces
        SyntaxError: not a chance
    

Ah ah what a joke.

Just remember that use strict exists in Perl and in python you have

    
    
        dyslexia=0
        if True:
            dylesxcia=1
    

will raise no error.

The strawman fallacy of Python purity of style

