
Why Python is Slow: Looking Under the Hood - underyx
http://jakevdp.github.io/blog/2014/05/09/why-python-is-slow/
======
gregwebs

        Dynamic typing makes Python easier to use than C
    

I disagree with the implication. The main reason Python is easier to use is
independent of the type system: not having to manually manage memory, for
example! Overall it is a higher-level language, whereas C is designed for
maximum performance.

Dynamic typing is probably preferable to a 40-year-old type system. But Python
could be easier to use (catch bugs ahead of time!) and execute faster by
adding a modern type system. Optional typing (like TypeScript or Facebook's
new PHP implementation) would probably be appropriate.

~~~
Spittie
I'd love to see optional typing in Python; I wonder if there are any official
reasons why it never got introduced.

I very rarely change the type of a variable, so it would be essentially free
speed for me.

I actually wonder, do people use dynamic typing that often? I mean, it's nice
to do "variable = int(variable)" when I know that I'm getting an integer in a
string, but that's probably the only use case I can think of that doesn't
just reuse variables for something else.
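For what it's worth, Python 3's function annotations (PEP 3107) already give optional typing a syntax; CPython just stores them without enforcing anything, so a checker would have to be layered on top. A minimal sketch:

```python
def add(a: int, b: int) -> int:   # annotations are stored, not enforced
    return a + b

add.__annotations__   # {'a': int, 'b': int, 'return': int}
add('x', 'y')         # CPython happily accepts strings anyway
```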

~~~
collyw
I started programming in Java, and moved to Perl and now Python. At first it
seemed a bit weird, dynamic typing, but now I embrace it and enjoy the
flexibility it offers. I see that many people prefer static typing and I am
not sure why.

Can someone give me a concrete example of when static typing would be
beneficial? (The majority of my work revolves around databases, so I guess I
rely on that as a kind of type checking to a degree).

~~~
maxlybbert
In Javascript (which is also dynamically typed) I've had a few times where
adding 1 and 0, say, yielded "10" because either the 1 or the 0 originally
came from a string. OK, I can live with that, but I find the fact that you can
use that string in numeric comparisons potentially confusing.

The general argument is that if you write enough unit tests, you can guarantee
that your code doesn't make those kinds of mistakes. The statically-typed
response is "why should you have to write unit tests when the compiler can
make that guarantee for you?"

~~~
ajanuary
Adding strings to integers is usually referred to as weak typing, which is a
separate and orthogonal concept to dynamic typing. The runtime of a dynamic
language could look at the two types of the values, decide they're different
and throw a type error. Similarly, a static type checker could look at the
types of the values and decide it knows how to do an implicit cast from one to
the other.
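Python itself illustrates the distinction: it is dynamically but strongly typed, so the runtime does exactly what's described above and throws a type error rather than coercing. A quick sketch:

```python
# Python refuses to mix str and int implicitly...
try:
    result = "1" + 1          # TypeError at runtime
except TypeError:
    result = int("1") + 1     # ...an explicit conversion is required

# JavaScript, being weakly typed, would instead coerce: "1" + 1 === "10"
```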

~~~
maxlybbert
That's a valid point. But, being dynamically typed, you don't find out about
the type error until runtime. It's possible, although unlikely, that a Python
web application can run for months before hitting a particular code branch
where a string and an int are added together, and an exception is thrown. (In
Python, it's also possible to get syntax errors long after you start your
program).

It's possible that the compiler would have had enough information to determine
this, if the compiler were to check types. Again, the proposed solution is to
simply write enough tests to exercise all code paths.
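A tiny sketch of how such a latent error can hide (illustrative names):

```python
def report(verbose, n):
    if verbose:                   # rarely-taken branch
        return "count: " + n      # bug: n is an int, so this raises TypeError
    return n

report(False, 41)   # runs fine; the bug stays hidden until verbose is ever True
```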

For what it's worth, I enjoy working in Perl. But I realize when I do that
some things are deferred until runtime. I also enjoy working in C and C++,
partly because I can push some things to compile time (type checks, sure, but
also arithmetic (
[http://en.cppreference.com/w/cpp/numeric/ratio/ratio](http://en.cppreference.com/w/cpp/numeric/ratio/ratio)
), unit checking (
[http://www.boost.org/doc/libs/1_55_0/doc/html/boost_units.ht...](http://www.boost.org/doc/libs/1_55_0/doc/html/boost_units.html)
), a decent amount of reflection (
[http://en.cppreference.com/w/cpp/header/type_traits](http://en.cppreference.com/w/cpp/header/type_traits)
), some assertions (static_assert), etc.). I don't necessarily push it all to
compile time, but C++ allows me to.

------
amit_m
The bigger picture is that CPython's core team simply does not care too much
about performance. Performance has never been a fundamental requirement, but
merely an afterthought.

The biggest reminder of this was Python 3, which for me was a complete
disappointment. They could have limited Python's super-dynamic behaviors (e.g.
changing builtin functions, patching classes on the fly, etc.) or made them
optional. They could've added optional typing annotations a la Cython, or even
changed the builtins and language syntax to allow more in-place operations and
preallocation, so that temporary results wouldn't have to be allocated on the
heap over and over again. All of these changes would have made Python faster
and more JIT-able. None of them happened. Performance-wise, Python 3 is no
step forward.

Python+Cython is still a powerful combination, but eventually Julia or similar
languages will eat python's lunch with respect to scientific computing.

~~~
GFK_of_xmaspast
I think if python3 had made major performance improvements (not to mention the
GIL) there wouldn't be nearly as many 2.7 holdouts.

~~~
ris
And it still wouldn't be done.

------
pnathan
One thing that bothers me - and has for a long time - is why Python (Perl,
Ruby, etc), never have really leveraged the work Common Lisp systems have done
(CMUCL, SBCL, etc), which provide very good performance without sacrificing
dynamic typing or the REPL.

~~~
WaxProlix
What work specifically are you referencing?

~~~
agentultra
Maybe:

[http://www.flownet.com/gat/papers/lisp-
java.pdf](http://www.flownet.com/gat/papers/lisp-java.pdf)

[http://www.iaeng.org/IJCS/issues_v32/issue_4/IJCS_32_4_19.pd...](http://www.iaeng.org/IJCS/issues_v32/issue_4/IJCS_32_4_19.pdf)

------
VeejayRampay
Python (and Ruby, for that matter) is slow because there are no billion-dollar
companies stuck with an initial choice of language that impedes their ability
to grow. PHP and JavaScript used to be extremely slow, and now, after dozens
of millions of dollars thrown at JITs, rewrites, forks and redesigns, they're
starting to get much, much faster.

~~~
hyperbovine
Not so. Google uses Python extensively in-house (it's one of the four
"blessed" languages) and, in fact, employed GVR until he was recently hired
away by Dropbox--another billion-dollar company which relies heavily on
Python. At one point Google even spearheaded an ambitious effort
([https://code.google.com/p/unladen-
swallow/](https://code.google.com/p/unladen-swallow/)) to make CPython much
faster. It failed.

At the point where even Google can't make it happen, it really starts to look
like Python performance is limited at some very fundamental level to what we
see today. Personally I think this is fine. I use Python for everything day-
to-day and offloading the intensive stuff to a C extension (a la Numpy) works
just great. There are very few instances where I find myself wishing Python
was faster.

~~~
ihnorton
Unladen Swallow started as an internship project and didn't get very much
support at Google [1]. When US started (~2009), Google had already been
writing its core, high-performance systems in Java and C++ for many years, so
there was no internal or existential need to make Python faster.

[1] [http://qinsb.blogspot.com/2011/03/unladen-swallow-
retrospect...](http://qinsb.blogspot.com/2011/03/unladen-swallow-
retrospective.html)

~~~
ZenoArrow
On the other hand, there may have been applications higher up the stack that
would've benefited from a more efficient Python. I don't know if it still is,
but I heard GMail was written in Python; you'd think an application of that
scale would benefit from better performance.

~~~
ihnorton
The internet says it is written in Java:
[http://en.wikipedia.org/wiki/Gmail](http://en.wikipedia.org/wiki/Gmail)
(possibly with some Python build tools, for which performance doesn't matter)

~~~
mike_hearn
The Gmail frontend is Java. The backend (mail routing, spam filtering, virus
checking and parts of storage etc) is C++.

------
PythonicAlpha
Yes, Python is slower in execution than some other languages, but:

    
    
       ... efficient use of development time ...
    

That is the reason why Python counts (not only references). Python has many
very good libraries, is a good OOP language, easy to learn, but still very,
very powerful. You can express some things in a single line of Python where
you would need hundreds of lines in C++ or other languages.

The few percent of running speed that you might lose are negligible in most
cases against the win in development speed.

In many applications you don't need the full CPU power; often you are limited
by e.g. disk speed or other factors ... and then you don't lose anything by
being a little bit slower in some minor tasks.

~~~
nly
> You can express in Python some things in a single line, where you need
> hundreds of lines in C++ or other languages.

citation needed.

~~~
Borogove
Example: x = {'foo':34,'bar':[1,2,3],'baz':'quux'}

The headers alone for STL maps are hundreds of LOC.

~~~
zmmmmm
Counting headers is rather unfair.

C++11:

map<int, char> x = {{1, 'a'}, {3, 'b'}, {5, 'c'}, {7, 'd'}};

Any not-ancient C++ using boost Assign:

map<int, char> x = map_list_of (1, 'a') (3, 'b') (5, 'c') (7, 'd');

Shamelessly stolen from here:

[http://stackoverflow.com/questions/138600/initializing-a-
sta...](http://stackoverflow.com/questions/138600/initializing-a-static-
stdmapint-int-in-c)

------
rch
> Python ends up being an extremely efficient language for the overall task of
> doing science with code.

This is pretty close to how I characterize my time, 'doing X with code', and
Python yields great returns in these terms.

------
mrfusion
While everyone is on the subject, I thought I'd mention some weird behavior I
saw today.

I just switched my program from using a defaultdict to a regular dict.

I.e., from defaultdict(lambda:'NA') to regular dict using get(val, 'NA') for
access.

And I got something like a 100x speedup. It runs in two minutes instead of
two hours. I had no idea a defaultdict would be so much slower.

Unless there's something funny going on in my program and it's unusual
behavior.

~~~
zo1
The docs for the defaultdict are here:
[https://docs.python.org/2/library/collections.html#collectio...](https://docs.python.org/2/library/collections.html#collections.defaultdict)

And they pretty much say that there is only one method that's been overridden
from the normal dict class. And all it does is execute the anonymous function
(lambda) you supplied, instead of returning the default value you pass in.

Also, this depends on your usage, and what you have in the lambda expression.
I'm curious about your code, care to paste/link some of it for us?

~~~
cefstat
I would guess that the code tries to get the value for keys that do not exist
in the defaultdict. If the key does not exist, it gets added with the default
value 'NA', so you can end up with a huge dictionary where most of the values
are just 'NA'.

~~~
mrfusion
Oh, wait, so defaultdicts actually insert missed keys?? I thought they simply
gave you a default value instead of a keyerror. If you're right, then that
would certainly explain it.
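A quick sketch showing that they do (illustrative names):

```python
from collections import defaultdict

d = defaultdict(lambda: 'NA')
d['missing']                  # the factory runs AND the key is inserted
len(d)                        # 1

plain = {}
plain.get('missing', 'NA')    # returns 'NA' but inserts nothing
len(plain)                    # 0
```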

~~~
Someone
They have to, because they cannot, in general, decide whether the lambda you
gave it is idempotent _and_ returns an immutable value.

For example, if you provide a function that returns an empty list, and do:

    
    
        l = foo['bar']
        l.append('baz')
        print len(foo['bar'])
    

That should print 2 (apologies if this isn't correct Python)

If one had a defaultdict that took a default value in a language with enough
reflection, you might be able to deduce that the lambda always returns a
simple value such as 3.1415927, and not store copies in the dictionary.

~~~
maxerickson
What you have there will print 1.

------
williamstein
This is convenient, since it's almost exactly what I plan to talk about in the
undergrad class I'm teaching today
([https://github.com/williamstein/sage2014](https://github.com/williamstein/sage2014)).
I view this sort of article as good motivation for Cython, and the value of
this article is merely that it aims at typical undergraduates _not_ in
computer science.

------
codegeek
Good read. Whenever I read stuff like this, I always wonder whether it is
always true that dynamically typed languages are slower than statically typed
ones. Also, do we have to take for granted that the higher-level the language,
the slower it will be? Or are there exceptions?

Also, it is worth asking whether, for the majority of use cases of Python for
data/analysis, the ease and flexibility outweigh the slowness.

~~~
YZF
Always is a bit of a strong word but yes, it's always true.

Virtual functions in C++ which allow some form of dynamic behaviour are slower
than static function calls because they inherently involve another level of
indirection. Static calls are known at compile time, they can be inlined by
the compiler, they can be optimized in the context in which they're called.
Now, nothing prevents the C++ run-time from trying to do the same thing at
run-time, but you can relatively easily see that it'll have to make some other
compromises to do so. Nothing prevents a C++ program from generating C++ code
at run-time, compiling it, and loading it into the current process as a .so.
Now that's a pretty dynamic behaviour but there's again an obvious price. You
can also write self modifying code. At any rate, static languages are capable
of the same dynamic behaviour that dynamic languages are capable of but you
often have to implement that behaviour yourself (or embed an interpreter...).

Fundamentally, a dynamic language can't make the kinds of assumptions a more
static language can make, it can try and determine things at run-time (ala
JIT) but those take time and still have to adapt to the dynamics of the
language. The same line of code "a = b + c" in Python can mean something
completely different every time it's executed, so the run-time has to figure
out what the types are and invoke the right code. Now the real problem is that
if you take advantage of that, then no one can actually tell what this code is
doing and it is utterly unmaintainable.
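A minimal sketch of that polymorphic dispatch:

```python
def add(b, c):
    return b + c     # one line of code; three different operations below

add(1, 2)        # integer addition
add('a', 'b')    # string concatenation
add([1], [2])    # list concatenation
```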

To compound the problems facing dynamic languages is the fact that CPUs are
optimized for executing "predictable" code. When your language is dynamic
there are more dependencies in the instruction sequence and things like branch
prediction may become more difficult. It also doesn't help that some of the
dynamic languages we're discussing have poor locality in memory (that's an
orthogonal issue though, you could give a dynamic language much better control
over memory).

EDIT: One would think it should be possible to design a language with both
dynamic and static features, where code restricted to the static portion runs
just as fast as in any other statically compiled language, while still
allowing you to switch to more dynamic concepts and pay the price when you do.

~~~
logicchains
It's not always true in practice. See for instance:
[http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?t...](http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?test=all&lang=fsharp&lang2=sbcl&data=u64q)

Common Lisp on SBCL is often faster than F# on Mono, even though F# is
statically typed, because the former runtime has had a lot more work put into
optimising it.

Or, compare F# on Mono with Clojure on the OpenJDK:
[http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?t...](http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?test=all&lang=fsharp&lang2=clojure&data=u64q)

Clojure again is faster, due to a more optimised runtime.

~~~
Solarsail
Notably, though, both SBCL and Clojure have optional types, so the number of
indirections necessary is already substantially reduced, as per the GP. For
this to hold in practice, though, type annotations have to actually be
commonly used in both lisps.

A more interesting counter example would be LuaJIT, since it has to support
completely dynamic code with a fair bit of monkey patching going on. But that
is more of a showcase for tracing JITs being fairly powerful (for hard to
predict code bases) than that indirections are cheap.

~~~
logicchains
That's a good point. SBCL at least attempts type inference though, so I often
find the code is quite fast without annotations. But then I suppose SBCL with
type inference is probably closer to a statically typed language in that
sense.

------
Fede_V
Jake's blog is pretty great. I really suggest reading more stuff in there if
you care about scientific computing.

------
dmoney
Related: "Python is only slow if you use it wrong"
[http://apenwarr.ca/diary/2011-10-pycodeconf-
apenwarr.pdf](http://apenwarr.ca/diary/2011-10-pycodeconf-apenwarr.pdf) [pdf]

------
kghose
The 'slow' part wasn't so new to me, but the `id` command and the attendant
understanding that simply typing 110 in the interpreter creates an OBJECT,
and when you assign a=110, `a` points at that object (and when you reassign
a=30, a new object is created and a points at that), blew my mind. Thanks for
this!

Previously I had thought that doing a=110 creates an object 'a' that stores
the value 110 (and when we do b=a, b simply points to a). I had no idea there
is a third object in play.
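A short sketch of the names-point-at-objects model:

```python
a = 110        # creates an int object; the name a points at it
b = a          # b points at the SAME object, not at the name a
assert id(a) == id(b)

a = 30         # rebinding a points it at a different object; b is untouched
assert b == 110
```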

~~~
stuki
That's pretty central to any OO language. "Everything's an object." Veering
away from that, only serves to turn the language into a mess of boxing and
unboxing.

That's not to say that OO is necessarily the ideal language paradigm, but it
has certainly been the most dominant in the era Python has existed.

~~~
theseoafs
OO languages do not have to do this. Ruby doesn't, for instance (that is, it
turns the language into a "mess of boxing and unboxing", but that's invisible
to the programmer and makes working with integers so much more efficient).
There's simply no reason to allocate integers on the heap -- it's a bad design
decision.

~~~
icebraining
How does Ruby know that 'a' is an integer and not any other type? Java and C#
can use the variable types to track this, but I don't see how a dynamic
language would work without tagging the value.

~~~
gregwebs
My understanding from 5 years ago is that in MRI the leading bit is reserved
to indicate whether something is an integer. This of course reduces the size
of numbers that fit into a machine word but it seems like a pretty good
tradeoff.

~~~
icebraining
Yeah, apparently they're called tagged pointers. _The More You Know_.

EDIT: And here's Guido explaining why he didn't want that in Python:
[https://mail.python.org/pipermail/python-
dev/2004-July/04614...](https://mail.python.org/pipermail/python-
dev/2004-July/046147.html)

------
rpearl

      > binary_add<int, int>(a, b)
    

no... there is simply no function call overhead for adding two integers in C.

~~~
zhemao
It's just notation, to make it clear that two integers (and not some other
type of object) are being added.

~~~
rpearl
It's just not good notation at all, though. It strongly implies "call". And
"object". This is misleading.

------
ape4
This is hardly breakthrough stuff.

~~~
phorese
True. It also never claims to be.

The very first part says that this is only a writeup for people who are not
intimately familiar with why "dynamically typed" might slow down Python.

~~~
falcolas
Based on my own experience, dynamic typing has much less to do with slowness
than the plethora of string copying that occurs in a typical Python script.
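A common illustration of that copying cost (a sketch; note CPython can sometimes optimize += on strings in place):

```python
parts = ['x'] * 10000

# Naive +=: each step may copy the whole accumulated string so far,
# making the total work quadratic in the worst case.
s = ''
for p in parts:
    s += p

# ''.join builds the result in a single pass over the parts.
t = ''.join(parts)
```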

~~~
blt
but for scientific computing with numbers - the author's target audience -
dynamic typing is the main bottleneck.

~~~
freyrs3
Boxing and memory locality are the main bottlenecks in scientific computations
with Python; dynamic typing is orthogonal.
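A sketch of the boxing overhead, using only the stdlib (`array` stores raw doubles contiguously, while a list stores pointers to boxed float objects):

```python
import sys
from array import array

xs = [float(i) for i in range(1000)]   # list of pointers to boxed floats
packed = array('d', xs)                # 1000 raw doubles, stored contiguously

sys.getsizeof(1.0)   # per-object overhead: header plus the 8-byte value
packed.itemsize      # 8 bytes per element, no per-element header
```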

~~~
andreasvc
If a language is dynamically typed, it is very likely that it will use boxing,
so they do not seem orthogonal to me.

