

Mixpanel: Scaling to the Internet - Writing C extensions for Python - suhail
http://code.mixpanel.com/building-c-extensions-in-python/

======
StavrosK
This is needlessly complicated. You can just use Cython and annotate the
types, or, even better, use Shedskin to compile your Python code, usually with
no changes, to C++ and make a module out of it.

There are many options to speed up your Python code by ~50 _times_
effortlessly. See my blog posts:

<http://www.korokithakis.net/node/109>

<http://www.korokithakis.net/node/117>

~~~
samuel
I don't understand why stock Python doesn't include some sort of optional type
annotation, as SBCL and other Lisps do. Some years ago there was a lot of fuss
in the Python community about including optional type checking in Python 3...
what happened? In the end, I found most changes from version 2 cosmetic and
not really exciting.

~~~
illumen
There are now annotations. But they are not used for anything by default.
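
A quick sketch of what those Python 3 function annotations are (and aren't):

```python
def add(x: int, y: int) -> int:
    return x + y

# The annotations are just stored metadata on the function object...
assert add.__annotations__ == {'x': int, 'y': int, 'return': int}

# ...and the interpreter itself never checks or uses them.
assert add(2, 3) == 5
assert add("a", "b") == "ab"  # no type error; annotations are ignored at runtime
```

External tools (type checkers, or compilers like Cython) are what give them meaning.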

------
kqueue
>So imagine: You want to stick to Python because it’s so fast to develop in
but need the performance of C/C++. Let me introduce you to C extensions in
Python.

Let me introduce you to Cython.

------
j_baker
Does anybody write C extensions like this anymore? Most of the time it's
easier to use swig or cython. If it's really simple, you can also use ctypes.
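
For the really simple case, ctypes needs no compilation step at all - a minimal sketch calling libm's `sqrt` from Python, assuming a Unix-like system:

```python
import ctypes
import ctypes.util

# Locate and load the C math library. If find_library returns None,
# CDLL(None) searches the current process, which also exposes libm
# symbols on most Unixes.
path = ctypes.util.find_library("m")
libm = ctypes.CDLL(path)

# ctypes needs the C signature spelled out: double sqrt(double)
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(9.0))  # prints 3.0
```

No C code to write or compile, but also no help from the compiler if you get the signature wrong.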

~~~
jnoller
Actually, a lot of people do write them this way. When you take something you
can write quickly and with no dependencies other than Python, and instead pull
in SWIG or Cython, you now have two problems instead of one.

As the author shows, writing a Python C extension is fundamentally easy, which
is why Python has such a huge ecosystem of them to begin with.

~~~
cdavid
Writing C extensions is only easy if you do trivial things. As soon as you do
something significant, writing all this C code by hand becomes increasingly
complex, especially because of reference counting and error handling
(<http://docs.python.org/release/2.5.2/ext/thinIce.html>).
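
The reference counts the C API makes you manage by hand are visible from Python too; a small CPython sketch of what an extension must keep balanced with Py_INCREF/Py_DECREF:

```python
import sys

x = []
# getrefcount reports one extra reference: its own argument.
base = sys.getrefcount(x)

y = x  # a second name for the same list bumps the count
assert sys.getrefcount(x) == base + 1

del y  # dropping the name decrements it again
assert sys.getrefcount(x) == base
```

In a C extension, every one of those increments and decrements is your responsibility; a missed decref is a leak, and an extra one is a crash.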

If an extra dependency at distribution time is an issue, you can just ship the
generated C code (as we do in NumPy and SciPy, as a matter of fact). Also,
Cython can generate code compatible with both Python 2 and 3.

~~~
mturmon
It depends on what kind of "trivial". I have a lot (> 30000 SLOC; hundreds of
functions) of C code that does intricate math/stats/optimization, but that is
simple from a control-flow and memory-allocation point of view. This kind of
code is a perfect fit for the simple C extension API of Python described in
the OP.

In fact, if you set up the gateway interface to your C code cleanly, you can
have the same C code callable directly from (e.g.) Matlab via the "mex"
mechanism.

~~~
cdavid
Yep, in that case it's relatively easy - but Cython is even better. Cython is
used a lot in the SciPy community - if you don't know it, you may want to look
at it. Since I started using Cython, I try to avoid writing against the raw C
API as much as possible; there is just no point.

~~~
slug
I wrote a few things in Cython to give it a try, and it's amazing what static
typing can do for the efficiency of the code. Just don't look at the
auto-generated .c file; it looks scary, but that seems to be common with code
generators (SWIG comes to mind).

------
hartror
My programming knowledge is strongest in C++ and Python. I always start
writing projects in Python. Then, if Psyco hasn't gotten me the performance I
need in the (often very small) sections of code that do the heavy lifting, a
quick C++ extension is whipped up.

This method is so productive it always surprises me how quickly things get
done!

~~~
StavrosK
It's the best of both worlds, really. You write everything in a language
that's fast to write, then you write the bottlenecks in a language that's fast
to execute.

~~~
cdavid
This only works if your bottlenecks are a small portion of your code. For
example, if your whole Twisted application is slow, you will have a hard time
optimizing it by writing a bit of C.

~~~
hartror
Confused. Why would replacing larger chunks of code be a problem?

~~~
cdavid
Because of the Python <-> C marshalling cost. Say you have some code that
uses a lot of small objects which interact with each other; maybe each object
holds a small list with a few integers. Rewriting this in C would make sense
because you can be much faster than Python, but you still pay the cost of
interfacing with Python (creating the objects, boxing and unboxing integers,
etc.).

That's why something like NumPy manages to be so fast when used well: it
internally uses an efficient (C-like) representation. But as soon as you need
to touch every item in your array from Python, it is not just much slower - it
can be even slower than using standard Python containers, because of this
cost.
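
The boundary cost is easy to see in a sketch (assuming NumPy is available): the vectorized call stays inside NumPy's C loops, while the per-item loop crosses the C/Python boundary once per element:

```python
import numpy as np

a = np.arange(100_000, dtype=np.int64)

# One boundary crossing total: the whole reduction runs in C.
fast = a.sum()

# One boundary crossing per element: each a[i] is boxed into a
# fresh Python-level scalar object before it can be added.
slow = 0
for x in a:
    slow += x

assert fast == slow  # same result, very different cost
```

Profiling the two versions shows the per-item loop losing by a wide margin, which is exactly the marshalling cost described above.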

~~~
jrockway
What language do you think CPython's internal data structures are implemented
in?

I know when I write an XS extension in Perl, though, that getting the C struct
associated with my Perl object requires one C function call -- no overhead at
all. Passing an object to Perl involves allocating memory and adjusting the
pointers. All very fast. If this sort of thing is your bottleneck, it's time
to step back and rethink what you are trying to achieve.

~~~
cdavid
Being implemented in C does not make things fast. Integers are implemented in
C in Python, but they are really slow because of the boxing/unboxing: you need
to chase a pointer to get the actual C int out of the PyIntObject, and that
cost alone is high. NumPy is much faster for this kind of thing because it
does not pay that cost: it runs the core loops and function calls in pure C,
without going through the Python runtime _at all_.

~~~
jrockway
Yes, exactly. As you context switch between the high-level and low-level
language, you sacrifice performance. If you want to write a fast extension,
you don't do this; you let the high-level language call your pointer an
object, and when you invoke a function, you invoke it on that pointer. This
way, you don't marshal back and forth (except perhaps incidental parameters
and results or exceptions).

~~~
cdavid
What I am saying is that the whole approach - heavy logic in the high-level
language, heavy lifting in the C implementation - cannot work when your logic
depends on handling many small objects.

For example, a few years ago I wrote a small audio app to track frequency
lines in audio. Each track was a list of positions (integers), generally a few
tens of items at most, and I needed to handle millions of such lists. Just
moving the track object to C is not that effective, because I needed to access
the contents of each list from the high-level logic.

Basically, if your logic needs to handle those millions of small objects, you
end up writing almost everything in the compiled language. Abstracting this
away becomes very difficult.
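
One common workaround (a sketch, assuming NumPy; the track data is made up) is to give up on millions of small Python lists and flatten them into two big arrays - values plus per-track offsets - so the hot loops can stay in C:

```python
import numpy as np

# Three short "tracks" of integer positions...
tracks = [[3, 5, 8], [2, 2], [7, 1, 4, 4]]

# ...stored as one flat value array plus offsets marking track boundaries.
values = np.concatenate([np.asarray(t, dtype=np.int64) for t in tracks])
offsets = np.cumsum([0] + [len(t) for t in tracks])

def track(i):
    """A view (no copy) of the i-th track."""
    return values[offsets[i]:offsets[i + 1]]

assert list(track(1)) == [2, 2]

# Per-track reductions without ever materializing list objects:
sums = np.add.reduceat(values, offsets[:-1])
assert list(sums) == [16, 4, 16]
```

The trade-off is exactly the one described above: the storage layout leaks into the high-level logic, which is what makes the abstraction hard.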

~~~
jrockway
I see. Incidentally, Haskell has a nice solution to this problem:

<http://hackage.haskell.org/packages/archive/vector/0.7/doc/html/Data-Vector-Storable.html>

~~~
cdavid
Oh, Python has a nice solution in this problem space - NumPy. But again, it
only works in some cases. Unfortunately, the model falls apart
performance-wise once you use a non-native type - think an array of
arbitrary-precision floats. Haskell being much more powerful and expressive, I
would expect it to express those concepts more elegantly, but I could be wrong
(I know next to nothing about Haskell).
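
The non-native-type case is easy to demonstrate in a sketch (assuming NumPy): an array of arbitrary-precision values falls back to dtype=object, i.e. an array of pointers to boxed Python objects, so every operation goes back through the interpreter:

```python
from fractions import Fraction
import numpy as np

native = np.array([0.5, 0.25], dtype=np.float64)    # unboxed C doubles
boxed = np.array([Fraction(1, 2), Fraction(1, 4)])  # falls back to object dtype

assert native.dtype == np.float64
assert boxed.dtype == object

# Arithmetic still works, but each element is a full Python object and
# each multiplication is a Python-level call, so the "loop in C"
# advantage is gone.
assert (boxed * 2).tolist() == [Fraction(1, 1), Fraction(1, 2)]
```

The NumPy array here is little more than a container of pointers, with the same per-element cost as a plain list.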

------
metamemetics
Is there any scalable/cloud Python hosting that allows C extensions? I know
App Engine doesn't, unless they changed it recently. I'm guessing a VPS is
necessary?

~~~
gpjt
I think you can upload your own extensions to PiCloud.

------
nphase
Wow, this looks so much easier than writing a PHP extension!

~~~
slug
Until you forget a decref/incref somewhere ;) Personally, I prefer to use
Boost.Python, Cython, ctypes, or SWIG. Although writing C routines that
operate directly on NumPy arrays is fairly easy.

