
Profiling Python Like a Boss - bryanh
https://zapier.com/engineering/profiling-python-boss/
======
kgm
The built-in profiler can be told to dump the results of the profile to a
"pstats" file. A simple way to do this is to run your code under the module
itself, as in:

    
    
      $ python -m cProfile -o foo.pstats foo.py
    

This will run your script (foo.py) inside of the profiler and save the profile
to the file foo.pstats once the script is done.
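The same dump can also be produced from inside your code when you only want to profile one call rather than the whole script. A minimal sketch, assuming a stand-in `expensive_function` and the `foo.pstats` filename from above:

```python
import cProfile

def expensive_function():
    # Stand-in for the code you actually want to profile.
    return sum(i * i for i in range(100_000))

# Equivalent in spirit to `python -m cProfile -o foo.pstats foo.py`,
# but scoped to a single call instead of the whole script.
profiler = cProfile.Profile()
profiler.enable()
expensive_function()
profiler.disable()
profiler.dump_stats("foo.pstats")
```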

Once you have a pstats file, you can do some interesting things with it. You
can load it into an interactive pstats session with this:

    
    
      $ python -m pstats foo.pstats
    

This will allow you to sort and display the profile data in a number of ways.
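The same sorting and display is available programmatically through the `pstats` module. A sketch (the profile is generated inline here so the snippet stands alone; an existing foo.pstats loads the same way):

```python
import cProfile
import pstats

# Generate a sample profile to load; this plays the role of foo.pstats.
cProfile.run("sum(i * i for i in range(100_000))", "foo.pstats")

# Load it, collapse directory paths, sort by cumulative time, and
# show the ten most expensive entries.
stats = pstats.Stats("foo.pstats")
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
```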

You can also use the third-party gprof2dot script[1] to turn the pstats file
into a GraphViz dot file, and thence into a visualization of your code's
performance. For example:

    
    
      $ gprof2dot.py -f pstats foo.pstats | dot -Tpdf -o foo.pdf
    

That will turn your foo.pstats file into a PDF. I like the PDF format here
because it is vectorized, renders quickly (more quickly than SVG, at least),
and you can search for specific text in it (which is useful if you have a
large profile and want to find a specific function).

[1] https://code.google.com/p/jrfonseca/wiki/Gprof2Dot

~~~
dagw
RunSnakeRun[1] is another simple interactive GUI tool for visualizing .pstats
files that I've used quite a bit in the past.

[1] http://www.vrplumber.com/programming/runsnakerun/

------
bsimpson
If you're using Python to serve a website using Flask or Django, you need to
take a look at:

https://github.com/phleet/flask_debugtoolbar_lineprofilerpanel
(includes screenshot)

or https://github.com/dmclain/django-debug-toolbar-line-profiler

It provides a very intuitive way to figure out where, exactly, your view is
slow. I know the Chrome and Firefox teams are both hard at work on building
more robust profiling tools, but I would love to see them integrate something
for JavaScript that's this easily accessible.

~~~
phleet
Sweet - always good to see people using my stuff in the wild :)

------
thebutter21
Another good walkthrough using Run Snake Run:
https://speakerdeck.com/rwarren/a-brief-intro-to-profiling-in-python

~~~
hershel
Run Snake Run is very easy to use, and powerful. In my opinion it should
have been part of the standard library.

------
bdarnell
You can also use a statistical/sampling profiler like
https://github.com/bdarnell/plop. This is useful for getting profile data
from a live server because it has much lower overhead than the other options
and can be turned on and off while the application is running.
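The idea behind a sampling profiler can be sketched in a few lines: instead of tracing every call, periodically record the current stack on a timer and count how often each stack appears. This is a toy illustration in the spirit of plop, not plop's actual API, and it relies on POSIX signals (main thread, Unix-like systems only):

```python
import collections
import signal

# Counter of observed call stacks (tuples of function names).
samples = collections.Counter()

def _sample(signum, frame):
    # Walk the current frame stack and record it, outermost first.
    stack = []
    while frame is not None:
        stack.append(frame.f_code.co_name)
        frame = frame.f_back
    samples[tuple(reversed(stack))] += 1

# Sample roughly every millisecond of consumed CPU time.
signal.signal(signal.SIGPROF, _sample)
signal.setitimer(signal.ITIMER_PROF, 0.001, 0.001)

def busy():
    # Stand-in CPU-bound workload.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

busy()
signal.setitimer(signal.ITIMER_PROF, 0)  # stop sampling

# Functions that dominate CPU time show up in the most stacks.
for stack, count in samples.most_common(3):
    print(count, " -> ".join(stack))
```

Because sampling only costs a signal handler every few milliseconds, the overhead stays low regardless of how hot the profiled code is, which is what makes this approach viable on a live server.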

------
kmfrk
The most compelling example of profiling I ever saw was Iconfinder's use with
Django:
https://speakerdeck.com/nickbruun/lessons-learned-defying-joel-spolsky-with-django?slide=41

The best sales pitch for profiling there is, perhaps. :)

------
cabalamat
Couldn't this code:

    
    
            def inner(func):
                def nothing(*args, **kwargs):
                    return func(*args, **kwargs)
                return nothing
    

Be simplified to:

    
    
            def inner(func):
                return func

------
shoyer
In my experience, the easiest way to analyze Python performance is with the
%time, %timeit and %prun magic commands in IPython, an alternative Python
shell that is far more powerful than the original.

With IPython, you can simply write:

    
    
      %time result = expensive_function()
    

and the time it takes to execute expensive_function() is printed to the
console. You can also directly substitute %timeit or %prun for %time to time
repeated runs in a loop or to invoke the built-in profiler. No need to write
any of your own boilerplate!
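Outside IPython, the standard-library `timeit` module gives a comparable measurement to %timeit. A minimal sketch, with `expensive_function` as a stand-in workload:

```python
import timeit

def expensive_function():
    # Stand-in workload to measure.
    return sum(i * i for i in range(10_000))

# Roughly what %timeit does: call the function many times and report
# total and per-call elapsed time.
elapsed = timeit.timeit(expensive_function, number=100)
print(f"100 runs: {elapsed:.4f}s total, {elapsed / 100:.6f}s per call")
```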

------
Erwin
I use timers, saving the results in memory, dumping them occasionally, and
then aggregating. Each timer has a category name (like an SQL query, or one
of the many higher-level tasks in the app) plus an optional argument (so the
actual SQL query, or the thing that had some work performed on it). Each
timer just saves min-max-avg time. In the aggregate view you can see that
e.g. "foo'ing" took 20% above the average Foo time this week after a new
software release and can check that out.

For profiling, I've occasionally swapped out some key function with odd
characteristics for one that runs under a profiler. You can create a new
Profiler instance and patch your weird function to instead call
profiler.runcall(func). After you've exercised it a few times, you swap back
to the old version and dump the stats.
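That swap-in trick can be sketched like this; the names are illustrative, with `weird_function` standing in for the function with odd characteristics:

```python
import cProfile
import pstats

def weird_function(n):
    # Stand-in for the function with odd performance characteristics.
    return sorted(range(n), reverse=True)

profiler = cProfile.Profile()

# Patched version that routes calls through the profiler; swap this in
# for weird_function, exercise it a few times, then swap back.
def patched_weird_function(n):
    return profiler.runcall(weird_function, n)

for _ in range(3):
    patched_weird_function(10_000)

# Dump the accumulated stats once done.
pstats.Stats(profiler).sort_stats("tottime").print_stats(5)
```

The advantage over whole-program profiling is that only the suspect function pays the tracing overhead; the rest of the application runs at full speed.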

If you have multiple proxying layers, multiple processes etc., a useful
option is to make e.g. Apache save the request time in the main or another
log file (the CustomLog %D option). Then you can easily find out that e.g.
/foo/bar requests take so and so long on average, with 95% below some number,
perhaps even doing real-time alerting when they start to increase.

------
bryanh
OP here, happy to answer any questions. These tools have helped us many times
(from speeding up various Django extensions to some really complex code deep
inside Zapier).

~~~
sigil
Good stuff. If you're going to indulge the optimization fetish, at least
profile first! Corollary: learn to use gprof and/or your language's profiler,
it pays off bigtime.

Question. It's been a while since I used them, but what happens when you use
cProfile or kernprof in multithreaded / forking code? I vaguely recall wiring
up a custom trace event logger via sys.settrace() to work around this.

~~~
bryanh
I haven't a clue! That is an excellent question. Most of our multiprocessing
is done via some task-based & message-queue-based map-reduce, so we've not
needed to do per-process or per-thread profiling.

------
vaidik
Check out this PyCon talk as well for profiling Python programs:
http://pyvideo.org/video/1770/python-profiling

