

Google's Python style guide - cr4zy
http://google-styleguide.googlecode.com/svn/trunk/pyguide.html

======
nikcub
I don't like Google's import style, and I have noticed it in a lot of their
code. They nicely namespace all of their packages and modules only to dump
methods and classes into a single namespace when being imported and used.

For example:

    
    
        from sound.effects import echo
        echo.EchoFilter(input, output)
    

What happens is that you then end up importing all of these methods and very
quickly you start getting name conflicts. Let's say you want to support a
third-party echo function:

    
    
        from sound.effects import echo
        from vendor.soundutil.effects import echo as soundutil_echo
    

You see this all the time in SDK and web API packages. Dozens of modules
called 'auth' (which auth? twitter? facebook?) or 'oauth' or 'request'.

Let's say you have a user page that integrates with social networks. Would you
rather:

    
    
        from facebook.api.auth import auth
        from twitter.api.auth import auth as twitter_auth
    

etc. etc. or

    
    
        import facebook.api
        import twitter.api
        ..
        facebook.api.auth()
        twitter.api.auth()
    

You end up doing 'import as' hacks and a lot of renaming. Code is a lot
clearer to read when you see full method names such as facebook.api.auth
rather than just 'auth' and 'echo' everywhere. You also don't lose
documentation paths.

My general rule of thumb is to use 'from' infrequently, never do import *,
retain the part of the path that keeps namespacing sane and clear to the
developer and, as the doc says, never do relative imports.

It means you can scan any part of the code and understand what is going on
without going back up to the top of the file. It also makes search/replace
easier (s/sound.effects.echo/mynewpackage.echo/ rather than s/echo/echo_new/).

The other one I didn't see mentioned is nesting levels and method lengths.
Python isn't well suited to deep-nested and long methods. Especially if your
coding style is to comment out blocks of code during development as you test
things, you always end up commenting out parts and then having to re-indent
the rest of it.

The same usually applies if you have long 'and'/'or' clauses in ifs that span
multiple lines and make the code harder to understand. I usually wrap those
tests into separate methods (if you are using them once, you will probably use
them again).
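A minimal sketch of that idea, assuming a made-up `User` type and made-up rules
(none of these names come from the style guide): the long boolean test gets a
name, and the caller reads like a sentence.

```python
from dataclasses import dataclass


@dataclass
class User:  # hypothetical type, for illustration only
    active: bool
    email_verified: bool
    failed_logins: int
    is_admin: bool = False


def is_trusted(user):
    """Named predicate replacing a multi-line and/or condition."""
    return user.active and user.email_verified and user.failed_logins < 3


def can_post(user):
    # The caller no longer has to parse a wall of boolean operators.
    return user.is_admin or is_trusted(user)
```

The predicate can then be reused and tested on its own.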

But for nesting, I try to stick to 2 levels max. If you go beyond that it is
usually a hint that you can refactor the codepath and perhaps even separate
it out into another method.
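One common way to get back under two levels is to turn each nested check into
an early return. A rough sketch, using a hypothetical port-parsing helper as
the example:

```python
def parse_port_nested(value):
    # Three levels of nesting before the real work happens.
    if value is not None:
        if value.isdigit():
            port = int(value)
            if 0 < port < 65536:
                return port
    return None


def parse_port_flat(value):
    # Same behaviour, with each check written as an early return.
    if value is None or not value.isdigit():
        return None
    port = int(value)
    if not 0 < port < 65536:
        return None
    return port
```

Both functions return the same results; the second one never goes deeper than
one level.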

I just happened to be doing this a few hours ago while writing an option and
argument parser for a command line utility that has sub-commands. A quick
refactor made the code, all the different options, and which options apply to
which sub-commands a lot easier to understand.

Edit: just further on breaking up code and bounds checking into methods, it
makes life easier for other developers and for your future self. There is
nothing more exhausting than trying to debug a module and finding a 3-page
long method called 'run', which you end up having to break down yourself
anyway. Separate all the bounds checking into one or two line methods, break
everything else up, document it, write some tests for it and then forget about
it - that is done and it works. Get on with important things.

Checking nesting levels and method length is almost something I would want to
put in a linter.

~~~
oddthink
I find it's easier to have short names, defined at the beginning, rather than
clogging up my code with full.path.to.a.module. I imagine preferences vary.

What I don't understand is why they prefer:

    
    
        from thepackage.subpackage import amodule
    

over

    
    
        import thepackage.subpackage.amodule as amodule
    

I always found the second to be clearer, since it doesn't mix the idea of
package-resolution with the idea of picking-things-out-of-a-module. On the
downside, I end up typing "amodule" twice.
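To make the distinction concrete, here is a small sketch using standard
library modules as stand-ins for `thepackage.subpackage.amodule`. For a real
submodule both spellings bind the same object, but only the `from` form will
also accept a non-module attribute.

```python
from xml.etree import ElementTree
import xml.etree.ElementTree as ElementTree_alias

# Both spellings bind a local name to the same module object.
same_module = ElementTree is ElementTree_alias

# "from ... import" will pick any attribute out of a module...
from os.path import join

# ...while "import ... as" only resolves modules, so this one fails.
try:
    import os.path.join as join_module  # join is a function, not a module
except ImportError:
    import_as_rejected_function = True
else:
    import_as_rejected_function = False
```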

------
scorpion032
"List Comprehensions: Ok to use for simple cases (only)":
<http://google-styleguide.googlecode.com/svn/trunk/pyguide.html#List_Comprehensions>

Really, Google? I find the following far more convenient to read:

    
    
        result = [(x, y) for x in range(10) for y in range(5) if x * y > 10]
    

than the alternative:

    
    
        result = []
        for x in range(10):
            for y in range(5):
                if x * y > 10:
                    result.append((x, y))

~~~
ironchef
I found the alternative easier to read. I read it once. I had to scan yours
twice. I'd say yours is borderline simple case.

Also, I think part of the difference can be found when you're dealing with
large amounts of code in maintenance. I'd much rather read something dead
simple (if somewhat verbose) than something that makes me think at all
(another example here, in the Ruby world, would be the use of !unless).

Like anything that has to do with taste, to each their own (I prefer a
vinegary bbq while you might like a smokier bbq).

~~~
RyanMcGreal
I didn't like list comprehensions when I first encountered them, but after
getting accustomed to them, I now strongly prefer to write and read a
comprehension over a for-in loop.

~~~
ironchef
I agree, but it depends on how complex the operation within the comprehension
is.

------
qznc
I like the idea to include a name with TODOs, like

    
    
      # TODO(qznc) check Unicode handling
    

This never occurred to me, but it provides a good pointer to whom to ask for
details. On the other hand, git-blame should be able to provide the same info.

~~~
viraptor
I hate those. Unless you're familiar with the whole team you don't know who
qznc is. Git/hg/svn blame will tell you that more precisely. Inserting your
name in the comment is only slightly less annoying than people insisting on
putting "author" comments at the top of the file (or even a function).

~~~
tonfa
And you don't have an easy way to find out who he is? Or mail him based on his
username?

~~~
viraptor
If his username matches his email, or if he uses it consistently -
maybe. But probably he wrote that note ages ago, someone else rewrote the
function since then and left the comment because the content is still valid -
i.e. you're contacting the wrong person.

In short: metadata which is not updated automatically is most likely not up to
date, and may not be correct in general.

~~~
tonfa
You can configure the linter to check for valid usernames.
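As a rough idea of what such a check involves (this is a standalone sketch,
not the configuration of any particular linter; the roster and function name
are made up):

```python
import re

TODO_PATTERN = re.compile(r"#\s*TODO\((?P<user>[^)]*)\)")
KNOWN_USERS = {"qznc", "viraptor", "tonfa"}  # hypothetical team roster


def unknown_todo_owners(source, known_users=KNOWN_USERS):
    """Return (line number, username) for TODOs naming an unknown user."""
    problems = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = TODO_PATTERN.search(line)
        if match and match.group("user") not in known_users:
            problems.append((lineno, match.group("user")))
    return problems
```

Note that this only validates that the name exists. It cannot tell whether
that person still owns the code, which is the staleness problem raised above.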

------
waleedka
Interestingly, the Python code I've seen from Google uses 2-space indents
rather than 4 as the style guide recommends. And that includes code written by
Guido himself (AppStats and NDB, tools used in App Engine). I prefer 2 spaces
as well, and I was hoping that the official style guide would match what's
being used most commonly.

~~~
AncientPC
I don't mean to start yet another tabs vs spaces debate, but I've always felt
spaces dictate how others see the code while tabs allowed others to see code
however they like (2/4/8 spaces).

One exception is some projects are strict about lining up code properly in
multi-line statements, and spaces are more consistent in that respect.

I prefer tabs, but most Python code I've seen has been 4 space indents.

~~~
toyg
I used to prefer tabs (less bytes, better abstraction etc), but I've long
given up on it. Most tools are not smart enough to switch mode when required,
and reformatting entire files every time you want to make a small change on
somebody else's code is a real pain (and dangerous).

Community consensus is 4-spaces and that's it.

~~~
raverbashing
Yeah

I used to prefer tabs as well, and I still stand by it.

But you can always configure your editor to use 4 spaces. And 4 spaces looks
like 'less wasted space'.

I'll keep using tabs for some things, and 4 spaces for most professional
projects.

About 2 spaces I'll just say one thing: NO

------
jdevera
Why oh why the 80 character limit? It's the 21st century, screens are huge!
I'm not saying let's put the limit at 300, but 100 or 120 is good enough to
fit side by side diffs in one screen.

~~~
raverbashing
There are 2 issues there

1 - yes, screens are huge, but it doesn't mean people can/will use small fonts
or will scroll the screen

With today's big/wide screens it's more useful to have code side by side.

2 - Abuse. The 80 character limit is a pretty good indicator that you should
be doing something else instead of having your code go over 80 characters.

Long lines are confusing, and you most likely can split the logic in several
lines, facilitating maintenance.

~~~
jdevera
I see both points but regarding (2), there are common cases where you are not
necessarily doing anything wrong but the 80 character limit will make your
code less readable, especially when using four spaces or more for indentation.
For example you might end up in the fourth level of indentation wanting to
write a list comprehension that would be perfectly readable in one line but
have to break it down because of this rather short hard limit.

I think having a soft and a hard limit makes more sense, if anything, I would
make 80 characters the soft limit and perhaps 100 a hard limit, although I'd
prefer them to be 100 and 120.

~~~
bonzoesc
If you're in the fourth level of indentation, you might already have a
readability problem.

~~~
jdevera
"Might" being the key word. You might, if you are writing a simple script, but
if you are writing some more complex code, class and method already take two
levels, and that is your base line, I don't see how having two more levels is
a readability problem.

------
jbarham
Much as I like the general consistency of Python code with or without formal
style guides, I prefer Go's "style guide" even more, which is to just run "go
fmt": <http://golang.org/cmd/go/#Run_gofmt_on_package_sources>.

~~~
slurgfest
You could just run pep8, you know.

~~~
rbanffy
I'll continue surrounding "="s in keyword arguments and default parameter
values with spaces regardless of what PEP-8 says.

------
yuvadam
Weird. The recommendation for a shebang line is to use #!/usr/bin/python

That will definitely break stuff. Why not #!/usr/bin/env python?

~~~
nikcub
env is nothing more than running it in a new shell, so slightly slower for
each exec (since it starts a shell and then searches through PATH), may pick
the wrong interpreter (because PATH is dependent on the user and a lot of
other things outside of your control) and since this is all running on Google
infrastructure they know what /usr/bin/python is.

Their goal isn't to write portable code, it is to write fast code that runs on
Google servers.

~~~
etanol
Sorry, but I fail to see where the "env" command spawns a shell:

<http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/env.c?id=v8.13#n153>

Could you please explain what I am missing?

~~~
nikcub
What I mean is that it is just like running 'python' in a shell, not that each
instance spawns a new one.

~~~
zwp
> like

One significant difference: when running python from a shell there's a fork()
and an exec(). env(1) doesn't fork: there are not two processes. (In other
words: the shell does not exit when you run a command, whereas env vanishes.)

There is obviously still some overhead to using env (and I just learnt from
the source that env has argument processing). I tried to replicate AncientPC's
test but on my machine both invocations take around 0.015s. (Perhaps their
username is an indication as to why they see a (2%!) difference...).

But okay. What I really came here to say is: security. You can make all the
efforts in the world to ensure that all programs get called with full
pathnames but then one env shebang and you're suddenly open to running
whatever's first in the user's $PATH and happens to call itself "python".

EDIT: eg
<http://portaudit.freebsd.org/d42e5b66-6ea0-11df-9c8d-00e0815b8da8.html>
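A small sketch of the PATH problem, assuming a POSIX system. It uses
`shutil.which` as an approximation of the lookup that env performs, and the
temporary directory stands in for any directory an attacker can write to.

```python
import os
import shutil
import tempfile


def resolve_on_path(command, search_path):
    """Return the first executable named command found on search_path."""
    return shutil.which(command, path=search_path)


with tempfile.TemporaryDirectory() as untrusted_dir:
    impostor = os.path.join(untrusted_dir, "python")
    with open(impostor, "w") as script:
        script.write("#!/bin/sh\necho 'not the interpreter you expected'\n")
    os.chmod(impostor, 0o755)

    # An attacker-controlled (or merely misconfigured) directory comes first.
    search_path = os.pathsep.join([untrusted_dir, os.environ.get("PATH", "")])
    resolved = resolve_on_path("python", search_path)
    impostor_wins = resolved == impostor
```

Anything called "python" that appears earlier in the search path shadows the
real interpreter, which a hard-coded `#!/usr/bin/python` avoids.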

~~~
nikcub
That last point is an important one which I forgot to mention, especially in
operating systems that have '.' first in PATH.

I notice the difference anecdotally without measuring it, but I don't know if
that is confirmation bias because I know env should be slower.

The best way would be an autoconf script in your package and an install run
that finds and verifies the local framework.

I have to admit that I have never done this though. I have a few Python
scripts with decent distribution and just rely on the direct path (and a batch
file for win32)

------
shawnps
Regarding this one:

<http://google-styleguide.googlecode.com/svn/trunk/pyguide.html?showone=Default_Argument_Values#Default_Argument_Values>

I commented about this on another post. There's a nice explanation here as to
why using mutable objects as default values in function/method definitions is
bad:

<http://effbot.org/zone/default-values.htm>

In short, it can be bad to set default arguments to mutable objects because
the function keeps using the same object in each call.
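A short illustration of the gotcha and the usual workaround (the function
names are made up for the example):

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Conventional fix: use None as a sentinel and build a new list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)  # same list object as first, now [1, 2]
```

With `append_fresh`, each call that omits `bucket` starts from an empty list.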

~~~
toyg
That's a classic Python gotcha, mentioned in all books and tutorials nowadays.

------
iloveponies
Interesting they make no reference to PEP 8.

~~~
slurgfest
"Interesting" isn't the word I'd use - PEP 8 is pretty clearly ratified by the
Python community (more than any other style) and yet Google Knows Best.

~~~
thristian
The Google style guide seems to match up with PEP 8 pretty closely, from my
brief review of it. Better yet, it actually includes explicit guidance on
things like 'how to name local variables and class properties' which PEP 8 is
mysteriously silent on.

------
plq
> Never use catch-all except: statements, or catch Exception or StandardError,
> unless you are re-raising the exception or in the outermost block in your
> thread (and printing an error message). Python is very tolerant in this
> regard and except: will really catch everything including Python syntax
> errors. It is easy to hide real bugs using except:.

What kind of SyntaxErrors are caught by the except: handler? Not all, I
presume:

    
    
      try:
        a b
      except:
        pass
    

This fails with SyntaxError on Python 2.7.2 on my machine.

~~~
mxey
That code is never executed because it fails to parse. I think you can catch
SyntaxError if you import a file with broken syntax.
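The same thing can be shown without a second file: when source is compiled at
runtime, the SyntaxError is raised while the surrounding code is already
executing, so it can be caught. A minimal sketch using the built-in
`compile()` (the helper name is made up):

```python
def syntax_error_message(source):
    """Compile source at runtime and report the SyntaxError, if any."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as error:
        return error.msg
    return None


caught = syntax_error_message("a b") is not None
clean = syntax_error_message("a = 1") is None
```

A bare `except:` around an import or `compile()` call would swallow this error
too, which is presumably what the guide is warning about.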

------
maybe_someday
I wonder why pyChecker and not pyLint or pep8. Anyone got an insight?

~~~
tonfa
Pylint is used internally:
<http://comments.gmane.org/gmane.comp.python.logilab/1075>

------
willvarfar
Fun fact: Google recommend _against_ using map/reduce :)

<http://google-styleguide.googlecode.com/svn/trunk/pyguide.html?showone=Deprecated_Language_Features#Deprecated_Language_Features>

Oh, not different map/reduce? ;)

~~~
AncientPC
One of the nice things about using map() is multiprocessing.Pool.map() can be
a drop-in replacement.

OTOH, map() does otherwise have a little overhead compared to a list
comprehension.
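A sketch of that drop-in quality, with made-up function names. The call shape
is the same (function first, iterable second), though `Pool.map` returns a
list while the builtin `map` returns an iterator in Python 3.

```python
from multiprocessing import Pool


def square(n):
    # Must live at module level so it can be pickled for worker processes.
    return n * n


def serial_squares(numbers):
    return list(map(square, numbers))


def parallel_squares(numbers, workers=2):
    # Same arguments as the builtin map, spread over worker processes.
    with Pool(processes=workers) as pool:
        return pool.map(square, numbers)
```

When trying this in a script, call `parallel_squares` from under an
`if __name__ == "__main__":` guard, since some platforms start workers by
re-importing the main module.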

------
droctopu5
Wait, I thought Python WAS a style guide.

------
lclarkmichalek
The last point is definitely the most important, and applies to any language

------
silon4
Personally, in my own code I also like to put pass at the end of all blocks.
It's more consistent, and it also makes auto indent in the editors work
properly. Anyone else?
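One reading of that convention, as a small sketch (the function is made up;
the trailing `pass` statements do nothing at runtime and only mark where each
block ends):

```python
def label_parity(numbers):
    labels = []
    for n in numbers:
        if n % 2 == 0:
            labels.append("even")
            pass
        else:
            labels.append("odd")
            pass
        pass
    return labels
```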

------
danso
For an offhand comparison: Google's unofficial Ruby style guide:

<http://www.caliban.org/ruby/rubyguide.shtml>

