
Why and How I Switched from Python to Erlang - rodmena
https://ourway.ir/Why_And_How_I_Switched_From_Python_To_Erlang.html
======
rdtsc
I use both Python and Erlang. Python is my get-small-stuff-done-quickly
language. And "small" doesn't have to be just a demo or hello_world
example; it can be the whole back-end of a business.

(And btw when I say Erlang I also mean Elixir, they both share the same VM so
most things apply to both).

But Erlang is the secret sauce (so to speak) for a high-concurrency,
fault-tolerant backend. "Fault tolerant" should probably go first. The reason is the
observation that the higher the concurrency or complexity in the system, the
higher the need for fault tolerance. If you only serve 10 connections and the
backend segfaults, 10 clients have lost connectivity, you get 10 angry phone
calls. If you serve 1M connections and your backend segfaults, you get 1M
phone calls. Exaggerating here of course to get the idea across. But this is
not just a marketing gimmick, this translates to money in the pocket in
operational costs. Some subsystem crashes and restarts? Fine, let it do that
if it is 4am, no need to wake people up, will fix in the morning. I've heard
stories of subsystems crashing and restarting for years with the main
service staying up.
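
To make the "crashes and restarts, fix it in the morning" idea concrete, here is a rough sketch in Python, for illustration only. The names `supervise`, `make_flaky_worker` and `max_restarts` are made up for this example, and a real OTP supervisor does far more (restart strategies, process isolation, restart intensity windows).

```python
import logging


def supervise(worker, max_restarts=5):
    """Run worker(); if it raises, log the crash and start it again.

    Returns (result, restarts). Re-raises the last error once
    max_restarts restarts have been used up, loosely like a
    supervisor's restart limit.
    """
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception as exc:
            if restarts >= max_restarts:
                raise
            restarts += 1
            logging.warning("worker crashed (%s); restart #%d", exc, restarts)


def make_flaky_worker(failures):
    """Hypothetical worker that crashes `failures` times, then succeeds."""
    state = {"left": failures}

    def worker():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("subsystem crashed")
        return "ok"

    return worker
```

Calling `supervise(make_flaky_worker(3))` keeps restarting the worker until it succeeds; nobody gets paged, and the warnings in the log are there to read in the morning.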

The lack of centralized state might seem minor, and unless you have
debugged a shared-memory system with locks, threads and manual memory
allocation, with classes that have pages of attributes defined, while trying to
understand why it crashes on Wednesdays only at customer A's site, it is easy
to miss the benefit. It comes from using small lightweight processes, each with
an isolated heap, and from using functional constructs.

Then in general, the ability to reason about concurrency using OTP constructs
(Erlang's standard library) and processes is like going from Assembly to C in
terms of distributed and concurrent systems. Can express ideas at a higher
level. This means having less code to look through and maintain.

There are other niceties like good garbage collection performance, awesome
tracing and hot code loading capability (used this a few times, so started to
appreciate it more now).

Now, individually all of those features can probably be found in other systems
and frameworks, but they are just not integrated or not quite there -- Java
has code loading but it is not the same. You can always spawn an OS process to
get fault tolerance, but you can only spawn so many before the system falls
down. Go has goroutines, but they share memory, so fault tolerance is not
there. Other languages have garbage collection, but most still have to stop
the world sometimes, and so on.

~~~
emluque
You seem to be familiar with Elixir. May I ask you some questions?

. What kind of web applications are you building with it? I'm asking what kind
of web apps or scenarios do you think Elixir is particularly well suited for?

I saw a thread on Elixir a couple of days ago and it piqued my interest and I
saw a couple of videos that were posted there, one from some Ruby Conf that
claimed that Elixir was giving better results (in request time) than rails. He
never explained how that was possible or what the results would have been if he
had been using a cache (it's always faster to hit an in-memory cache
than a db that has to touch disks, so if he sped things up without
a cache he would speed things up even more with one). Then I watched another
video from some conference in Oslo or something, and from what I could
understand he was doing away with the db completely.

. So I have another question: how do you architect your application in Elixir
to keep application state across multiple requests (sessions) on multiple
boxes without using something like memcached or redis (or a network disk)?

. Even if it's running on only one Box? Where does data reside if you are not
using a db?

I have a basic understanding of Erlang processes (what's explained here:
[http://stackoverflow.com/questions/2708033/technically-why-a...](http://stackoverflow.com/questions/2708033/technically-why-are-processes-in-erlang-more-efficient-than-os-threads))
and how it's particularly well suited for concurrency. My questions are about
Web Apps and Elixir and scaling.

~~~
pmontra
This is a benchmark I did with the versions of Rails and Phoenix that were
current in October 2015.

select * from visits, plus conversion to JSON and delivery to a client on
local loop. About 5000 records.

* Phoenix 140 ms

* Rails 248 ms

* Ruby without AR 219 ms

* PostgreSQL 2.97 ms, with no JSON generation and no delivery

select started_at, duration from visits -- JSON and delivery

* Phoenix 74 ms

* Rails 116 ms

* Ruby without AR 88 ms

* PostgreSQL 3.47 ms, no JSON no delivery

Single process, so maybe Phoenix could get a larger advantage as the number of
processes/requests increases. For the typical no- to low-traffic site there is
little difference; the tool the programmer is more familiar with wins.

Edit: improved formatting.

~~~
emluque
Thank you for your reply. I can't edit my comment any more but where it says:

>Ruby Conf that claimed that Elixir was giving better results (in request
time) than rails

I meant:

Ruby Conf that claimed that Elixir was giving better results (in request time)
without a cache than rails with a cache

~~~
rimantas
IIRC, Rails app mentioned in the talk was very old. Not a fair comparison,
IMHO.

------
erikb
Yes, for certain programming jobs (i.e. work) Erlang is much better. Let's
just assume we can mostly agree on that.

But "I switched" means you switched your main language, right? I wouldn't
switch my main tool to something with a specific usecase. Your main language
should be something general purpose, because most of your life's problems have
a wide range of usecases to consider.

Therefore I would have expected an argument for why Erlang may be a better
general purpose language or a headline like "Why <project/company/service x>
switched from Python to Erlang".

~~~
abrookewood
Yes, the article was pretty badly written I thought. It was more a list of
complaints about Python than any real exploration of why Erlang was better.

------
coltonv
If you're going to set pixel based margins on the left and right side of your
page, you need a media query to disable it or at least lower it to 5 pixels on
small screens. This is unreadable on mobile :/

~~~
zzleeper
I'm always surprised that more people don't just do F12, ctrl+shift+M (on
Chrome) while writing their websites. Just four keystrokes :)

~~~
abrookewood
Awesome! Had no idea you could do this! For those wondering, it renders a
mobile-like view of the current site. Thanks ;)

------
smcl
I'd be curious as to why exactly the SQLAlchemy query was slower than the raw
one. I get that there's gonna be some overhead, but did you figure out exactly
what was slowing it down?

~~~
Qantourisc
I have been comparing the two; at third glance they don't even appear to be the
same query.

Second, for performance, as much of the SQL building as possible should
happen outside the function. Also, if the query is that static, you should
probably use plain SQL anyway ;) Or maybe even build the SQL query string with
SQLAlchemy outside the function, and then use that generated SQL inside the
function (if you don't like pure SQL, for example for compatibility
reasons).
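
A minimal sketch of "build it once outside the function", using only the standard library's `sqlite3` so it runs anywhere. The `build_select` helper stands in for whatever query builder you use, and the `tasks` table and its columns are hypothetical (the article's actual schema is not assumed here).

```python
import sqlite3


def build_select(table, columns, filter_column):
    """Stand-in for a query builder: returns a parameterized SQL string.

    Only call this with trusted, hard-coded identifiers.
    """
    cols = ", ".join(columns)
    return f"SELECT {cols} FROM {table} WHERE {filter_column} = ? ORDER BY id"


# Built once, at import time, not on every request.
TASKS_BY_PROJECT_SQL = build_select("tasks", ["id", "title"], "proj_id")


def tasks_for_project(conn, proj_id):
    """Hot path: no string building, just the prebuilt SQL plus one parameter."""
    return conn.execute(TASKS_BY_PROJECT_SQL, (proj_id,)).fetchall()
```

The function body now does nothing but execute a constant string with a bound value, which also makes it easy to check that two variants being benchmarked really run the same query.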

------
ngrilly
Two comments:

- It looks like the SQL queries produced by pure_python() and simple_sql()
are structurally different, which would make any comparison worthless.

- How does the author solve the long running tasks (> 500 ms in the article)
problem with Erlang? In Python, it looks like he was using Celery with Redis
then RabbitMQ. What does he use in Erlang? (Erlang message queues have no
built-in persistence.)

~~~
loxs
Erlang can have long-lived processes (green threads) and that's usually how
we solve problems. There is absolutely no problem (it's encouraged) with having
thousands of processes living for minutes or hours if we need to.

~~~
ngrilly
Yes, but how do you manage fault tolerance at the machine level? If the
machine is rebooted for some reason, the pending messages will be lost, and
will never be processed. A persistent queue is necessary for this reason. This
is the reason why people use tools like Celery or RabbitMQ (which is written
in Erlang by the way). I don't see how replacing Python with Erlang changes
anything in this regard.
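
To illustrate what "persistent" buys you here, a toy sketch of a disk-backed job queue using the standard library's `sqlite3`. The `DurableQueue` class and its `jobs` table are invented for this example; it is not how Celery or RabbitMQ work internally, only the same basic idea: a job is written to disk before it is acknowledged, and marked done only after the work succeeds.

```python
import sqlite3


class DurableQueue:
    """Tiny job queue stored in a SQLite file (hypothetical, for illustration)."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "payload TEXT NOT NULL, "
            "done INTEGER NOT NULL DEFAULT 0)"
        )
        self.conn.commit()

    def put(self, payload):
        # The transaction is committed before put() returns,
        # so an accepted job is on disk.
        with self.conn:
            self.conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))

    def peek(self):
        """Return (id, payload) of the oldest unfinished job, or None."""
        return self.conn.execute(
            "SELECT id, payload FROM jobs WHERE done = 0 ORDER BY id LIMIT 1"
        ).fetchone()

    def ack(self, job_id):
        # Mark a job finished only after the work has succeeded.
        with self.conn:
            self.conn.execute("UPDATE jobs SET done = 1 WHERE id = ?", (job_id,))

    def close(self):
        self.conn.close()
```

If the process dies between `peek()` and `ack()`, the job is still marked unfinished when the queue is reopened from the same file, which is exactly what an in-memory mailbox cannot give you.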

~~~
abrookewood
You handle that with multiple nodes that sync their state.

------
flocial
Too bad the author doesn't show specific examples. I can sympathize with the
sentiment. The problem with scaling languages like python and ruby in my
experience is the number of moving parts and jump in deep knowledge required
to scale when your application takes off. One day you're happily writing
compact code in your favorite language and suddenly you're rewriting core
parts in C/Go/Rust while learning a new language and bolting on a variety of
moving parts like redis, memcached, etc. to keep your business from becoming a
victim of its own success.

A lot of the problems are more or less solved problems with well known
workarounds but it's still a pain and probably why there's so much buzz around
finding the "next (insert your favorite language/framework)".

Erlang generated a lot of buzz several years ago but aside from the recent
success of WhatsApp it never quite stuck. I'm curious to know how this author
thinks differently.

~~~
themartorana
Now is a fantastic time for languages. Python and Ruby get you where you need
to be fast, and if you get far enough and need to scale, there are a huge
number of options, from stalwart holdouts like Java and C# (now on Linux!) to
newer languages on rock-solid run times like Clojure and Elixir, to whole new
languages like Go and Rust.

All these languages have their advantages and followers, but the thing they
all have in common is first-class concurrency, which everyone now realizes
they need (and old Erlang devs will tell you "duh").

Honestly, while I write Go almost exclusively these days, I think Elixir is the
most exciting. The more I learn about BEAM the more magical it becomes in my
eyes, considering just how old and stable it is. Phoenix is a game changer in
many ways.

~~~
ceslami
> if you get far enough and need to scale

Alright -- I'll bite.

Hardly anyone "needs to scale." Plenty of $1B businesses can succeed on a
fistful of c4.large's running Ruby on Rails. The same Ruby on Rails that ran
their MVP. Scaling via language change isn't the path to victory, unless you
picked a language that was poorly suited to your domain to begin with.

Too many companies end up following this cargo-cult advice, and spend more
time grappling with their tools than innovating. Because "concurrency."
Someday.

Optimizing your choice of language for anything other than a linear
relationship between Real Complexity and Implementation Time is a fool's
errand and a fiduciary travesty.

In all fairness, Elixir/Phoenix could become as well-learned as Ruby on Rails.
We'll end up in a best-of-both-worlds scenario. And at that point, I'll
happily eat my words.

But in the meantime, solve your "scaling" problems by measuring and
optimizing, instead of re-writing your app in the flavor-of-the-week.

~~~
themartorana
I can't think of a single billion dollar Corp that can run Rails on a few AWS
boxes. Time and again companies have rewritten systems from MVP to handle data
safer, better, and faster, from Dropbox to Twitch to CloudFlare to Uber to
Mail.ru...

Then there are plenty of 7 figure companies that have also had scale issues.
Game companies come to mind first and foremost but they're hardly the only
ones.

You want to run your marketing website on Rails? Yeah of course. But billion
dollar companies on Rails?

~~~
ceslami
For the sake of discovering a middle ground between our arguments, let's look
at an example of a >$1B company running Rails: Github.

Most of Github's stack is Ruby on Rails, with specialized components/sub-
systems written in C. This is a common theme amongst companies that use Rails
or Python at scale.

There's a reason why people keep those languages around. It's the same reason
why they tend to be used for MVPs: the tool gets out of your way so you can
focus on solving the hard problems of your domain. The longer you can preserve
those ergonomics, the better.

~~~
vacri
My guess is that Github uses a few more cpus than "a fistful of c4.larges"
suggests.

------
mianos
You might notice a difference in performance when your sqlalchemy query is
different from your raw query. If you don't understand sqlalchemy then you are
probably best off not using it. How about going to mongo?

------
kccqzy
An SQL injection issue right there.

Edit: to be fair the variable proj_id sounds like it's not taken from
untrusted input, but nevertheless a bad idea.
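
For anyone wondering what the difference looks like, a small sketch with the standard library's `sqlite3`. The `tasks` table and the function names are hypothetical; only the variable name `proj_id` comes from the article.

```python
import sqlite3


def setup():
    """Create a throwaway in-memory database with a few rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, proj_id TEXT, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO tasks (proj_id, title) VALUES (?, ?)",
        [("a", "render"), ("a", "encode"), ("b", "upload")],
    )
    return conn


def tasks_unsafe(conn, proj_id):
    # Vulnerable: proj_id is pasted straight into the SQL text.
    sql = "SELECT title FROM tasks WHERE proj_id = '%s'" % proj_id
    return [row[0] for row in conn.execute(sql)]


def tasks_safe(conn, proj_id):
    # proj_id travels as a bound parameter, never as SQL text.
    sql = "SELECT title FROM tasks WHERE proj_id = ?"
    return [row[0] for row in conn.execute(sql, (proj_id,))]
```

Passing `"x' OR '1'='1"` to `tasks_unsafe` turns the filter into a condition that matches every row, while `tasks_safe` simply looks for a project with that odd literal id and finds none.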

------
ben_jones
The title should be: Mentioning some scaling problems I heard about in Python
and might have brushed up against in my own projects (likely not production
ones). Oh and there's this thing called Erlang and it's fast!

I sincerely hope nobody is making changes to a working production stack based
on posts like this.

------
jboogie77
didn't really explain how.... لطفا بیشترتو ضيح بدید

~~~
daveguy
لطفا بیشترتو ضيح بدید ? I tried Google for translation. It wasn't very
helpful.

Edit: ok, close enough --

"Please explain me Byshtrtv"

Edit2: replier and op were much more helpful than google, thank you!

~~~
rodmena
In Persian. He means: "Explain More"

~~~
jboogie77
don't forget please :)

~~~
rodmena
Well, it was version 1! I will try to explain more in the coming days. I
write these articles to rest my mind between coding sessions.

~~~
daveguy
Looking forward to the follow up!

~~~
rodmena
of course

