
Replacing Celery with Elixir in a Python Project - klibertp
https://klibert.pl/statics/python-and-elixir/
======
asksol
Author of Celery here. This is an interesting presentation!

I want to clarify something about Celery and RAM usage.

When writing web crawlers, and other (mostly) I/O-bound tasks, you should be
using the eventlet/gevent execution pools instead of the multiprocessing one.
This will _drastically_ reduce memory use, and perform better.

If you have four CPU cores you can start four worker instances with 1000
threads each (for a total of 4k threads): `celery multi start 4 -A proj -P
gevent -c 1000`

This will utilize all the CPU/cores in your system, working around the GIL.

One of the new features coming in Celery 4 is a message protocol with support
for multiple languages, maybe we could have an Elixir worker soon.

~~~
klibertp
Hi, thanks for reading. I need to stress that this is just an example project
I came up with _after_ I decided to play with Elixir and Python integration.
The project itself doesn't even make much sense - as you and others point out,
there are many other, more pythonic ways of handling this task. I chose web
scraping because it was easy to split into two subtasks: one IO-bound and one
CPU-bound.

> The slides mention problems with Celery and RAM usage when writing crawlers,
> but since this is a mostly I/O-bound task you should be using the
> eventlet/gevent execution pools instead of the multiprocessing one.

Here: [https://klibert.pl/statics/python-and-elixir/#/5/6](https://klibert.pl/statics/python-and-elixir/#/5/6)
in line 18 you can see a `_do_some_real_processing_function`. The whole premise
of the project is that this function is CPU-bound. My processor has four cores,
so I create a pool of 4 Python processes
([https://klibert.pl/statics/python-and-elixir/#/5/2](https://klibert.pl/statics/python-and-elixir/#/5/2),
line 10).

Fetching pages is an IO-bound task, so it's done by Elixir. There we have a
pool (for rate limiting) of 10 processes (the Erlang ones - important
distinction) that do the downloading.

I think the closest analogy to what happens in this project is a Twisted app
(EDIT: or one in any other concurrent, but not parallel, framework) which uses
a pool of processes for CPU-bound tasks. Here the Twisted part is replaced with
Elixir.

EDIT: Also, we use Celery extensively at work, and it works great; there is no
real need to replace it with anything! Again, this project is just a tech demo;
it doesn't make (much) sense on its own. But there are other possible
integration patterns, where Elixir and Python have different roles, which
actually do make sense. I think.

~~~
asksol
I found your project very interesting! I'm very familiar with Erlang/OTP, and
have been meaning to play with Elixir for some time. Since you're separating
downloading from processing, maybe IPC is a better term for what you're doing,
since for data locality you don't want these two steps separated over the
network.

I only wanted to clarify the RAM situation described in your slides, as it's
not widely known that you can use eventlet/gevent with Celery.

------
raphinou
Am I the only one who dislikes slides with both vertical and horizontal
navigation? Going through the whole presentation requires going in the right
direction (down, unless it's the last slide in a section, in which case it's
right...)

~~~
klibertp
You can use Space, which should take you to the correct next slide, without
thinking about directions. I found it a neat format to structure a talk into
segments. Also, you can press Esc and get a "map" of the slides.

~~~
lqdc13
Nope. I used space. After the Erlang slide it moves right to "tools used",
which is the slide at the same level in the next column.

This along with back button hijacking makes for the single worst presentation
format I've encountered so far.

EDIT: it appears that only happens when you have been in that column before.
This is still pretty surprising behavior.

~~~
klibertp
Sorry about that. Maybe some of this behavior is configurable in Reveal.js,
I'd need to check.

It was fine for me because these are the slides for a talk and I was the one
showing them to people.

I'd be happy to post the video of the talk
([https://www.facebook.com/events/211449562541292/](https://www.facebook.com/events/211449562541292/)),
but after three months I still haven't heard anything from the people behind
the event...

~~~
lqdc13
No reason to apologize. The content was definitely still viewable.

------
simon_acca
Interesting article!

I would just like to point out that concurrent downloads can be handled much
more efficiently in Python 3.4+ thanks to the asyncio library. For an example,
look at Guido van Rossum's crawler [0].

[0]:
[https://github.com/aosabook/500lines/tree/master/crawler](https://github.com/aosabook/500lines/tree/master/crawler)
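To make that pattern concrete, here is a minimal, self-contained sketch of
asyncio-style concurrent fetching. The `fetch` stub is hypothetical and only
simulates latency; a real crawler would make an aiohttp request there.

```python
import asyncio

# Hypothetical stand-in for an aiohttp request, so the sketch runs offline.
async def fetch(url):
    await asyncio.sleep(0.01)  # simulate network latency without blocking
    return f"<html>{url}</html>"

async def crawl(urls, limit=10):
    # The semaphore caps in-flight requests, like a fixed-size download pool.
    sem = asyncio.Semaphore(limit)

    async def bounded_fetch(url):
        async with sem:
            return await fetch(url)

    # All downloads run concurrently on a single thread: the coroutines are
    # only ever waiting on I/O, so there is no GIL contention to speak of.
    return await asyncio.gather(*(bounded_fetch(u) for u in urls))

pages = asyncio.run(crawl([f"https://example.com/{i}" for i in range(50)]))
```

The semaphore plays the same role as a pool of ten downloader workers, except
everything stays on one thread inside one event loop.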

~~~
tejinderss
Asyncio doesn't support HTTP yet, but that can be fixed with a library like
aiohttp.

~~~
talideon
The linked crawler uses aiohttp.

------
rtpg
The multiprocessing slide mentions the RAM usage issues with things like Celery
(because you start many instances of Python and load in dependencies). Does
this solve them?

If so, how does it get around the whole GIL thing and whatnot? Or maybe I'm
misunderstanding at what level things are happening?

Is it that you still have one Python process, but the bottleneck/URL fetching
happens inside your Elixir stuff?

Super interested in this; we have this problem with Celery workers and would
love to not be bound by RAM for worker count.

~~~
klibertp
Unfortunately, no, this doesn't solve the problem of RAM usage by worker
Python processes: they are still separate OS processes, although they look
like "normal"[1] Erlang/Elixir processes from that side.

By "replacing" I mean pretty literally getting the same effect as with Celery.
[EDIT: BTW, in the recent post about Celery it was praised for letting you
compose background/async tasks. Of course, Erlang has something like that, too:
[https://chrisb.host.cs.st-andrews.ac.uk/skel-test-master/tut...](https://chrisb.host.cs.st-andrews.ac.uk/skel-test-master/tutorial/bin/tutorial.html)]
There really is no reliable way around the GIL in Python other than
multiprocessing.

The main idea here is that we should keep IO-bound code on the Elixir side, as
it's simply more efficient and easier to write there, but delegate CPU-bound
code to a pool of external processes. This is only one of many possible
integration patterns, though: I can imagine a Django project where most of the
logic is on the Python side and Elixir only handles WebSockets/long polling
connections. Thanks to ErlPort, passing data and calling functions between
Elixir and Python is effortless: you can call an Elixir function from Python
just as easily as you can make Elixir call a Python function in a different
process.

Moreover, ErlPort supports Ruby, so you can have workers written in it, too.
There's no problem with workers in other languages either, as long as you write
the glue code yourself (or generate it using Elixir macros).

[1] It bears repeating: Erlang processes _are not_ OS-level processes. The
Erlang virtual machine, BEAM, runs in a single OS-level process. Erlang
processes are closer to green threads, or to tasklets as known from Stackless
Python. They are extremely lightweight, implicitly scheduled user-space tasks
which share no memory. Erlang schedules its processes on a pool of OS-level
threads for optimal utilization of CPU cores, but this is an implementation
detail. What's important is that Erlang processes provide isolation in terms of
memory use and error handling, just like OS-level processes. Conceptually both
kinds of processes are very similar, but their implementations are nothing
alike.

~~~
jerf
If you're having RAM problems, you don't want to start up each processor
independently as a "port". Instead, write a Python server that listens on a
socket, start a single instance of it, and let it fork (since you fully control
the system, probably once per connection rather than worrying about
"preforking" or other complicated scenarios for when you don't control the
concurrency), then write the glue code to talk across that socket. [1]

That way, your Python processes load all their modules and do all their
initialization once, the forking automatically gives you copy-on-write
semantics, and you end up with more or less one copy of them in RAM.

[1]: I don't know if there's a "perfect" off-the-shelf solution for you, but
there's enough existing code in the world with all the pieces that it's pretty
easy to do that nowadays. For instance, one very easy protocol that leaps to
mind is:

    4 bytes to say how much JSON there is
    a JSON object containing the metadata for your connection
    4 or 8 bytes to say how long the incoming webpage is
    the contents of the webpage you're telling Python to process

It's not perfectly efficient, but it has great bang-for-the-buck, and will
carry you a long way before you start needing to do anything else fancier.

You might also be able to bash ErlPort into compliance and get the client and
server sides going, but this isn't very difficult code to write, and it doesn't
take much "screwing around with opaque library code that doesn't really want to
be used separately" before you could have just written it yourself.

------
falcolas
For something as straightforward as downloading files in parallel, why not
just use Python threads? Since most of the IO will occur in C without holding
the GIL, it seems silly to be forking off entire new processes for this sole
purpose.

The processing of the file in Python would hold the GIL, but this could still
be resolved with a multiprocessing pool of workers.

In other words, while academically interesting, mixing Elixir and Python for
web crawling doesn't make much actual sense.

~~~
klibertp
> In other words, while academically interesting, mixing Elixir and Python for
> web crawling doesn't make much actual sense.

That's right. It wasn't supposed to be a real project; it's just a tech demo of
sorts. I will have some more real-world uses for the integration in my
Raspberry project, I think, but I'm not there yet.

------
plainOldText
I personally like to bridge Elixir and Python via nanomsg with MessagePack
serialization.

Here are some useful libraries:

[http://nanomsg.org/](http://nanomsg.org/)

[https://hex.pm/packages/exns](https://hex.pm/packages/exns)

[https://github.com/walkr/nanoservice](https://github.com/walkr/nanoservice)

------
lbn
Can we avoid the overhead of starting and shutting down processes by running a
single Python process and communicating using something like grpc [0] (or even
JSON-RPC for maximum simplicity)?

How do web frameworks like Flask handle multiple concurrent requests? Would
performance increase if we started multiple instances of this Python web
server on the same machine and load balanced them? The code would be much
simpler if there was no need to handle process management.

[0]: [http://www.grpc.io/](http://www.grpc.io/)

~~~
jeremyjh
The whole point here is that we need parallel processing, and Python cannot
provide it in a single process due to the GIL. Flask applications depend on the
workload being I/O-bound, so they can achieve concurrency where parallelism is
not required. If you built a CPU-bound endpoint in Python code in Flask, you'd
find it could not achieve much concurrency at all.

~~~
bpicolo
There are plenty of multiprocessing WSGI servers, Flask's built-in one
included.

------
lrem
I think I misunderstood your presentation at first glance. Elixir "processes"
are actually green threads. Thus, you actually have Python interpreters in
separate OS processes, right?

------
elktea
While BEAM is indeed great I'm wondering why you didn't use Scrapy? It handles
concurrency well and is a battle tested production scraper.

~~~
klibertp
The same question was asked when I gave the talk. I'm afraid the only answer
is: because I could :-) I thought up a project where it would make some sense
to use Elixir and went with it. It's not a practical or "real world" project at
all!

The point is that Python<->Elixir integration can be very tight, with little
overhead when using each language for the things it's good at.

~~~
eggy
> because I could

Love that!

I will look it over and, because I favor LFE (Lisp Flavored Erlang) over
Elixir, try to do something neat like this. Although Elixir is growing on me
compared to my initial take on it a year ago.

------
arms
Thanks for sharing.

First my complaint: the slides are really annoying :) A traditional left-right
stack would've been nicer.

That said, this is something I've been looking at lately as I've got a bunch
of python code that I want to parallelize, and a strong interest in Elixir. I
found your code samples very helpful.

Edit: I see the other comments mention the slides, and how to navigate them w/
space. Disregard my complaint.

------
tbrooks
OP mentioned that Elixir is better at concurrency and parallelism. Python is
better at processing (more libraries/toolsets available).

As someone who wants to use Elixir more and see its community flourish, what
libraries does Elixir need so this project could be done without Python?

I'm guessing some sort of Beautiful Soup or Nokogiri equivalent?

~~~
klibertp
No, actually, there is _nothing_ lacking in the Elixir ecosystem if you want to
write a web scraper! That's the point. I used HTTPoison for sane (and
efficient) HTTP requests, and I could have used (as I have in some other
projects) Floki
([https://github.com/philss/floki](https://github.com/philss/floki)) for HTML
parsing and querying.

However, there are things like generating PDFs with graphs based on the tabular
data on the pages, or running some more involved Pandas TimeSeries
transformations, which are simply not available in Elixir. Nor should they be,
I think: reportlab and Pandas are already written and do a good job at what
they are meant to do. That is the idea: we write a crawler in Elixir and
delegate processing to something else. Anything else, in fact: Python was
chosen because of ErlPort and how easy it is to integrate, but your workers can
in practice be written in anything that understands JSON.
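For instance, a worker in any language could speak a trivial line-delimited
JSON protocol. A hypothetical Python version (the field names and `handle`
logic are made up) might look like:

```python
import json
import sys

# Made-up processing step: in the real project this would be the CPU-bound
# part (PDF generation, Pandas transformations, ...).
def handle(task):
    return {"id": task["id"], "length": len(task["page"])}

def serve(infile=sys.stdin, outfile=sys.stdout):
    # One JSON object per line in, one JSON result per line out. Any
    # language with a JSON library can implement the same loop and act
    # as the processing side.
    for line in infile:
        if line.strip():
            json.dump(handle(json.loads(line)), outfile)
            outfile.write("\n")
            outfile.flush()
```

The Elixir side would then only need to spawn the worker, write task lines to
its stdin, and read result lines back.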

------
agounaris
Interesting use case, but is it actually wise to mix technologies in your stack
just for this? And why reinvent a solution for something that already works?
Just my 2 concerns :)

------
emson
Fantastic. I recently used Elixir for scraping courses off Udemy. I've put the
results of this into a site,
[http://www.coursenut.com](http://www.coursenut.com)

Also I've added an Elixir course promotion to:

[http://www.coursenut.com/courses/3692](http://www.coursenut.com/courses/3692)

~~~
ci5er
What is the purpose of the scrape?

------
assaflavie
Does this solve the state-sharing difficulties of the python solution?

------
spraak
This is a cool way to use both tools together.

------
vegabook
I have been looking at exactly this type of solution, as I want to use Elixir
for distribution of python(numpy)-based computations to clients. Just a quick
question.... is ErlPort maintained? It looks like it's been in alpha for 3
years...

~~~
klibertp
There's little development, it seems, but it's not dead. There are discussions
under open issues, PRs are accepted, and an occasional commit lands in master.

It was stable when I used it. It's also rather simple and short (less than ~1k
LOC on the Python side, for example), so I think it wouldn't be a heavy burden
to maintain, even if the current maintainers and users all dropped dead
tomorrow :-)

~~~
vegabook
Enough for me. I'm going to use this, because the combo of Python (more
accurately, _numpy and pandas_) plus Elixir is pure goodness.

