

Python Performance Tips - dhotson
http://wiki.python.org/moin/PythonSpeed/PerformanceTips

======
sophacles
These tips are ok, particularly the ones about profiling. I have a few more
that I have found over time. First though: as always, a good algorithm
improvement can beat any of these. Trick 0 is about that; the rest are more
about raw performance.

0\. Generator expressions and list comprehensions are your friends.
Sometimes one will make things like woah fast -- generators are lazy
evaluation at its finest (if you don't follow that find a Haskell fan and ask
them), this can be a real boon to your app. Sometimes you need the list
comprehension. If you can't reason out which would be better for you, or
reasoning suggests it shouldn't matter, try both anyway, just to be sure :).

1\. Function calls are expensive. Sometimes when you call a function a lot (on
the order of thousands of calls in a normal run) it's better to bite the
bullet and just inline it by hand. I have seen improvements of 50% from just
this.

2\. Dictionaries are fast. Classes are great, objects make life easy on
programmers, but when dealing with large datasets, sometimes nested list/dict
combos are way better than objects (which are just dicts, but the syntactic
sugar of objectness can eat up quite a few method calls, see #1 above). Also,
sets are extremely useful and based on dicts (therefore also fast), so if the
semantics of a set fit your needs, use them.

3\. Python regex is fast. It is usually frowned upon, but sometimes a complex
regex is way better than pyparsing or python string methods. This saved my
butt on one very memorable occasion -- #1 and #2 above didn't really do
enough, so I replaced the core processing bits with a very complex regex and
got the speedup.

4\. The gc module can really do a LOT of good. Short running scripts with lots
of data can really be sped up by turning off gc (if you have the memory
capacity). Even tuning generational parameters can really affect performance
-- almost shockingly so.

5\. Any type of bit twiddling sucks. Take the time to do it in C and make the
extension. This includes most IP ops -- if you are working with large lists of
addresses look into dnet. Similar libs exist for a lot of other projects.

6\. __slots__ prove very useful sometimes. So does the struct module. Learn
about both, they can really do wonders for your code in both readability and
performance if used carefully.

7\. All of the above are wrong in some contexts -- they are not hard and fast
rules, but guidelines. In some cases they work great; in others they don't
help at all.
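
A rough sketch of tips 0, 2, 4 and 6 in one place, in case seeing them helps. It assumes CPython 3 (`sys.getsizeof` is not meaningful everywhere), and the names `build_records`, `known_ids` and `Point` are made up for illustration, not taken from any real codebase. It shows the mechanics only; whether any of it is faster for you is something to measure.

```python
import gc
import struct
import sys

N = 100_000

# Tip 0: a list comprehension builds every element up front, while a
# generator expression produces them lazily, one at a time.
squares_list = [i * i for i in range(N)]
squares_gen = (i * i for i in range(N))
list_bytes = sys.getsizeof(squares_list)  # grows with N
gen_bytes = sys.getsizeof(squares_gen)    # small and independent of N
total = sum(squares_gen)                  # a generator can be consumed only once

# Tip 2: membership in a set is a hash lookup rather than a linear scan.
known_ids = set(range(N))                 # hypothetical data
hit = (N - 1) in known_ids

# Tip 4: pause the cyclic garbage collector around an allocation-heavy
# section, then restore whatever state it was in before.
def build_records(n):
    """Hypothetical workload that allocates many small containers."""
    return [{"id": i, "tags": [i]} for i in range(n)]

gc_was_enabled = gc.isenabled()
gc.disable()
try:
    records = build_records(10_000)
finally:
    if gc_was_enabled:
        gc.enable()

# Tip 6: __slots__ removes the per-instance __dict__, and struct packs
# values into a fixed-size bytes object.
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
has_dict = hasattr(p, "__dict__")         # no __dict__ on slotted instances
packed = struct.pack("<ii", p.x, p.y)     # two little-endian 32-bit ints
unpacked = struct.unpack("<ii", packed)
```

The generational tuning mentioned in tip 4 goes through `gc.get_threshold()` and `gc.set_threshold()`; good values depend entirely on the workload, so none are suggested here.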

~~~
bbb
_5\. Any type of bit twiddling sucks. Take the time to do it in C and make the
extension._

I recently had to do something like this and ended up using SWIG to generate
the glue code. However, I noticed that my code had to spend considerable time
(many iterations) in the C++ library before the overall execution time dropped
(but then it did, by a factor of 20x-40x).

Do you happen to have some advice on how much speedup could be gained by
replacing SWIG with handwritten wrappers?

~~~
sophacles
My personal preferences are: Boost.python for C++[1], and Pyrex/Cython for C
wrappers. Both make things pretty nice. I never really got into swig, so I'm
not sure if there is a noticeable speedup between any of these. As for handwritten
wrappers, I have not had any personal experience trying to eke the extra speed
from not using a code generator type wrapper. hth

~~~
dagw
Swig solves a somewhat different problem than Boost.python/pyrex. Swig works
better when you have an existing C++ codebase you want to call from python
while making as few changes to the C++ side as possible, while Pyrex/Boost
work better when you are writing a C or C++ module from scratch to be called
from python.

------
pkrumins
I wonder if any of them are still current.

"At the time I originally wrote this I was using a 100MHz Pentium running
BSDI."

~~~
utku_karatas2
Most of the tips here seem to be based on "do stuff in such a way that fewer
Python C API calls that do lookups get involved". The Python API is more or
less the same Python API, so I'd assume most of the tips are current.
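
To make "fewer lookups" concrete, here is a minimal sketch of the kind of rewrite those tips describe: binding a method to a local name once instead of resolving it on every pass through the loop. The two function names are hypothetical. Both versions return the same result; how much the second one gains on a current interpreter is worth measuring rather than assuming.

```python
def upper_all_dotted(words):
    """Resolves result.append and str.upper on every iteration."""
    result = []
    for w in words:
        result.append(str.upper(w))
    return result


def upper_all_hoisted(words):
    """Resolves both names once, then uses plain local variables."""
    result = []
    append = result.append  # bound method, looked up a single time
    upper = str.upper
    for w in words:
        append(upper(w))
    return result


sample = ["spam", "eggs", "ham"]
same = upper_all_dotted(sample) == upper_all_hoisted(sample)
```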

------
jparise
The timeit module (<http://docs.python.org/library/timeit.html>) is an
invaluable tool for taking and comparing performance measurements.
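
As a small illustration, this sketch uses `timeit.repeat` to compare a loop that calls a tiny helper against the same loop with the helper's body written inline (the function-call tip from earlier in the thread). The helper `add_one` and both loop functions are invented for the example, and the numbers will differ per machine and Python version, so no expected timings are shown.

```python
import timeit


def add_one(x):
    """Hypothetical tiny helper, the kind that gets called a lot."""
    return x + 1


def with_call(n=1000):
    total = 0
    for i in range(n):
        total += add_one(i)
    return total


def inlined(n=1000):
    total = 0
    for i in range(n):
        total += i + 1  # body of add_one written out by hand
    return total


# repeat() returns one total time per run; the minimum is the least
# noisy figure to compare.
t_call = min(timeit.repeat(with_call, number=200, repeat=5))
t_inline = min(timeit.repeat(inlined, number=200, repeat=5))
print(f"with call: {t_call:.4f}s  inlined: {t_inline:.4f}s")
```

The same measurement works from the shell with `python -m timeit`, which is handy for one-liners.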

------
admn_is_traitor
Python performance tip #1: don't use it if performance is a design goal.

~~~
dhotson
I wouldn't go so far as to avoid it altogether.

Python makes a great glue language even in applications where performance is
important. You can write all the performance critical stuff in C and then glue
it together in Python.

It's a pretty common approach and it works really well. Game engines often do
this, with the main engine in C++ and scripting in Lua.

