How and Why We Switched from Erlang to Python (2011) (mixpanel.com)
135 points by zerr on May 19, 2013 | 43 comments

Not trying to be mean, but I was honestly confused by a couple things:

1. The server for the company's core product is written in Erlang, but no one at the company is an Erlang expert?

2. The company decides to rewrite the core of its product in python, and gives the job to an intern?

I feel like I must be missing something....

This happened on Facebook's chat system too. For some reason, and I know no other language where this is true to quite this degree (except maybe Lisp), Erlang has a tendency to make people believe they understand how to build an app with it when in reality they can do nothing of the sort, at least in Erlang. It's common for folks to 'do it in Erlang' because it's seen as different and somewhat elitist. (People think of Erlang engineers as 'hardcore'; I shudder to attribute skill to a language rather than a person, but I digress...)

In the case of FB, as told to me, a bunch of folks in the valley said "of course we know Erlang!" and built the app, but when push came to shove the only hardcore Erlang guys were in Poland, and that made working on chat difficult. I'm not sure how much of that is true, but it's a recurring theme in Erlang projects, one that I've personally heard time and again.

For some reason, Erlang inspires confidence, perhaps even foolhardiness.

Side note: I work at an Erlang shop and our lead engineers speak at Erlang Factory every year. It's a great language and quite powerful but it can be difficult to understand.

So I heard a bit differently about the FB Chat thing. Facebook likes to move its engineers around to different teams from time to time (if only temporarily) to spread institutional knowledge around and keep bus factors from getting too low. This is also a way to alleviate burnout. There simply weren't enough engineers at FB who knew Erlang to make this practical for chat, so they rewrote it in C++. The Erlang bits were reasonably written and working fine, but they didn't fit into this policy. No non-mainstream language would fit; this isn't about Erlang in particular.

Erlang is a great fit for a number of problem domains, but if you silo it into a small part of your organization, you're going to run into issues such as these. You have to be willing to train people in it. It isn't that hard to pick up if it's someone's full time job to do so, as long as they're smart. And you want to hire smart people anyway, right?

FWIW, the Erlang bits (a rather small chunk of code considering its scale) took over a year to effectively replace and led to numerous chat SEVs as people learned the hard way that their C++ replacement was missing some little "seldom used but very nice to have when things get dicey" component or behavior in OTP. Given how people move around projects, maintaining an Erlang team for just chat is hard to justify, but the move to C++ for the chat back-end service says more about FB's internal structure than it does about Erlang.

I think the truth is that whenever you migrate a code base, it's trivial to get 90% of the functionality quickly, but it turns out that things you thought of as unimportant, or edge use cases, are actually not rare events at all. I agree that maintaining an internal Erlang team for one piece of code isn't smart, but I tend to think that in light of the move to broader media services (chat, video and voice streaming) perhaps it might make sense to look at the language once again.

Maybe this time they'll do it in Go.

Erlang has gotten some hype about how "amazingly scalable" it is. The reality is that, as with anything, it takes work to make it scale, and once things get Really Big, it gets difficult.

My sneaking suspicion is that, in some cases, people pick it because they read that it's "web scale" or something like that without really knowing the details, and when things fall apart, they revert to something that they know better.

BTW, reference for the FB chat thing being rewritten?

JHeiliger was talking about it at an interview I went to discussing his years at Facebook.

On the note of Erlang's complexity, it's not so much complex as it is different. I think all languages get hard at scale; Erlang just has some nice native concurrent processing capabilities that make more sense with each passing day as we move towards more cores in processors. It's easier to parallelize Erlang than, say, C++.

Hey Jonah, I can clear this up.

The api server was one of the very first things written, back when Mixpanel was a fun side project for some college kids, well before YC. We picked erlang because it sounded cool.

It was actually pretty stable, but the only piece of erlang in our stack. Replacing it was one of those important but nonurgent things that are hard to prioritize.

We want interns to have meaningful work, though, and "important + nonurgent" is just about the ideal thing for an intern to work on. They get to work on something interesting, but there's plenty of time for review and refactoring.

I had the same reaction. This is really not a fair critique of any weaknesses in Erlang or any strengths in Python. In his description of how they implemented maintenance of some global state, this line caught my attention: "Our code was not set up this way at all, and it was clearly crippled by being haphazardly implemented in a functional style."

In addition to the statement that they had no Erlang experts currently on staff, it sounds like the people who wrote the system were not Erlang experts either. I wonder if they used OTP to build the system? The post doesn't say, but I'd bet they wrote a lot of their code with Erlang's message passing primitives, if they were using messaging at all. This will get hairy, and that's why most people should use OTP to build server applications in Erlang.

Lesson learned: you build better systems using languages and frameworks you know vs. using something trendy that you don't understand.

Usually in cases like this, the people who wrote the original implementation probably left the company.

Old discussion here[0]. This is the third time this has been submitted.

[0] https://news.ycombinator.com/item?id=2852415

He just ran Python benchmarks and said that it was good enough. It's rather disappointing not to see performance compared to Erlang.

I consider that to be a bright spot in this post. What would a performance comparison for this very specific app prove? The way I read it, the only goal of the rewrite was a more easily maintainable code-base with performance that was "good enough". It sounds like he was successful in meeting that goal.

With that said, comparing Python and Erlang on the basis of performance is a bit like looking for the world's tallest midget, isn't it? Performance isn't the point at all, within a certain threshold.

Yes. The OP was careful and modest in its rationale for why they reimplemented this module.

they didn't switch from Erlang because of some language or implementation shortfall, but because after two years no one on staff was proficient in the language. all that tells us is there are more persons familiar with python than erlang.

film @ 11...

If I run an Erlang server and am no professional Erlang developer after 2 years, what does that say about me?

He said that Python is the actual language of choice in the company. That's why they don't have Erlang professionals, I guess. Maybe good enough to throw together this API server, but not good enough to make it scale and be reliable?

That it's time to learn Rails?


> but because after two years no one on staff was proficient in the language

Hm, I wonder if they tried to train someone or if they waited to see if someone would study Erlang at home for free.

Why is this getting upvoted?

I read the title (sounds familiar), look at the domain mixpanel.com: didn't they do this a while ago?

Turns out to be same post from 2011.

Come on! Stop posting the same thing over and over again.

I realised that on HN it's quite acceptable to do this (although normally posts from the past would have a (YEAR) prepended or appended). I believe the reasoning is that not everyone has seen these articles, and as they are more than just news, many of them are worthy of occasional re-discussion over time. For example, this blogpost (https://news.ycombinator.com/item?id=5204324) about Julia-lang was posted in 2012 and then in 2013, and a comparison of HN's discussion over time is quite interesting (to me, at least).

That's because people like me haven't read the post.

The main dealbreaker for me is 8 bytes (16 bytes on x64) per character in strings, because a string is a list of integers. Each element takes two machine words: one for the character and one for the next pointer.

Why couldn't the Erlang community come up with something like Haskell's ByteString? I mean, there is a bit syntax in Erlang, but with very limited functionality compared to what you would expect for string handling.
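To make the arithmetic in that complaint concrete, here is a minimal Python sketch. It assumes each list element costs exactly two machine words, and that a binary costs roughly one byte per ASCII/Latin-1 character with any fixed header overhead ignored. The function names are made up for illustration and are not part of any Erlang tooling.

```python
def list_string_bytes(n_chars: int, word_bytes: int) -> int:
    """Approximate size of an n-character string stored as a linked list.

    Assumption: every character occupies one list cell made of two
    machine words (the character itself and the next pointer).
    """
    return n_chars * 2 * word_bytes


def binary_string_bytes(n_chars: int) -> int:
    """Approximate payload size of the same text stored as a binary.

    Assumption: one byte per character (ASCII/Latin-1), header ignored.
    """
    return n_chars


# A hypothetical 1 MB log file held entirely in memory.
n = 1_000_000
as_list_32 = list_string_bytes(n, word_bytes=4)   # 32-bit words
as_list_64 = list_string_bytes(n, word_bytes=8)   # 64-bit words
as_binary = binary_string_bytes(n)

print(as_list_32 // n, "bytes/char on 32-bit")
print(as_list_64 // n, "bytes/char on 64-bit")
print(as_list_64 // as_binary, "x larger than a binary on 64-bit")
```

Under these assumptions the per-character figures match the 8 and 16 bytes quoted in the comment.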

We have the equivalent of Haskell's (lazy) ByteString in the types binary() and iolist(). We have had that for at least 6 years. Their space efficiency is also on par with Haskell's. If you process big amounts of data as a string(), then you are probably doing something wrong in your architecture.

Well, Erlang has had binaries for quite a while, and many use them as high-performance strings. You can also use lists for strings, but why would you if you don't like the characteristics of lists?
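For readers who have not met iolist(): it is a possibly nested list of binaries and single bytes that can be built cheaply and written out without first concatenating everything into one string. The following Python sketch is only a rough analogy of that idea, not how the Erlang runtime implements it; `iter_chunks` and `flatten_iolist` are hypothetical helper names.

```python
from typing import Iterator


def iter_chunks(data: object) -> Iterator[bytes]:
    """Yield the byte chunks of a nested iolist-like structure in order."""
    if isinstance(data, bytes):
        yield data
    elif isinstance(data, int):
        # A bare integer stands for a single byte (0..255).
        yield bytes([data])
    elif isinstance(data, list):
        for item in data:
            yield from iter_chunks(item)
    else:
        raise TypeError(f"not an iolist element: {data!r}")


def flatten_iolist(data: object) -> bytes:
    """Concatenate all chunks once, at the very end."""
    return b"".join(iter_chunks(data))


# Build a response by nesting pieces instead of repeatedly copying strings.
path = [b"/", b"index"]
request_line = [b"GET ", path, 32, b"HTTP/1.1"]  # 32 is the space character
print(flatten_iolist(request_line))
```

The point of the design is that appending to the structure is just wrapping it in another list; the bytes are only copied when the data is finally written.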

do you need to do much "rich" string processing in erlang? it certainly is not designed for that usecase.

you _can_ do quite a bit of string processing in erlang's bytestrings (note: not unicode-aware), see e.g. cowboy (which has nice to read sourcecode).

as to why erlang does not have a haskell-like ByteString, i guess it's because haskell has very fast (but potentially unsafe) c-bindings. erlang's ports are too slow to use them for something that small-grained. i'm not sure why no BIFs were introduced to bring something similar, honestly.

> it certainly is not designed for that usecase.

That's a misleading statement, I think; it's like saying Python was not designed for web application development (which is true but also misleading).

It's more correct to say that Erlang was designed without much thought given to string processing. Erlang was designed as a fairly general-purpose language, just one with non-encoding-aware strings; the real problem is that it hasn't caught up, even though Unicode has existed since 1991 and has long been incorporated into most languages and most software by now.

Instead of adapting, Erlang seems to be stagnating in one area that users are frequently complaining about. In this day and age, I would argue that string processing is quite important for the things that Erlang can, or should, be used for.

Anecdotally, when I first tried Erlang I tried to create a naive, parallel log processor which read lines and spawned off lines to a pool of workers. As a newbie I was quickly stumped because Erlang's file I/O is abysmally designed. I eventually gave up the project, and later I found that Tim Bray, also an Erlang noob, had struggled with the exact same issue [1][2]. You would think that this is something Erlang would excel at, but apparently it's not.

I have been disappointed with Haskell's string support, too -- it's all over the place (String vs. Text types, the weird Data.Encoding module, horrible regexp library, etc.) -- but at least it's fully Unicode-aware and it has a fast ByteString type.

[1] https://www.tbray.org/ongoing/When/200x/2007/09/22/Erlang

[2] https://www.tbray.org/ongoing/When/200x/2007/09/21/Erlang
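The naive design described in the anecdote above (read lines, hand each one to a pool of workers) can be sketched in Python with the standard library. This is not the commenter's code or Tim Bray's; `process_line` is a hypothetical stand-in for whatever per-line work the log processor does, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def process_line(line: str) -> int:
    """Hypothetical per-line work: count whitespace-separated fields."""
    return len(line.split())


def process_log(lines: Iterable[str], workers: int = 4) -> List[int]:
    """Fan lines out to a pool of workers and collect results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands each line to a worker and yields results in order.
        return list(pool.map(process_line, lines))


sample = [
    "127.0.0.1 GET /index.html 200",
    "127.0.0.1 GET /missing 404",
    "",
]
print(process_log(sample))
```

In CPython, threads mainly help when the per-line work waits on I/O; for CPU-bound parsing, swapping in `ProcessPoolExecutor` from the same module is the usual alternative. A file object can be passed directly as `lines`, since iterating over it yields one line at a time.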

I like Erlang a lot, and am currently working on a project that utilizes Chicago Boss, but there's a lot of truth in what you write.

Erlang is somewhat strange for a language in that it got put to use in some very important critical systems before it got big. Things like Ruby, Tcl and Python slowly got popular and changed along the way. Doing an Erlang 2.0 that improves on some of the warts is probably not so easy...

If you are interested in the history of Erlang, this is fascinating: https://docs.google.com/viewer?url=http%3A%2F%2Fwebcem01.cem...

That's a good observation. I don't know much about Erlang's history beyond what is on Wikipedia, but I suspect that the number of people who used Erlang before it was open source was relatively small and limited to the technical community within Ericsson. That small, focused user group explains the language's weird and weirdly antiquated style and feature set, but it doesn't quite explain why the language has not evolved to become more modern. (Of course, the fact that something so ingenious could come out of such a small community is also very impressive and wonderful.)

Unicode is a big thing in R16 and R17 where more of the system will use the unicode support we do have. Erlang modules will use utf-8 encoding and so on.

As for a built-in bstring() type, it is being discussed, but there are other things more important to get into the language, maps for instance. Erlang is rather conservative in the speed with which we add stuff to the language.

Erlang has some of the fastest I/O in a runtime. But it is not easy to use correctly. The same goes for string processing. Erlang can be blindingly fast at that, but you must understand how to make it fast.

Having recently run into some non-obvious crappy performance characteristics with I/O in Erlang, I'm curious if you have any pointers to docs or code that shows how to use Erlang I/O correctly so it's fast?

More important things...

please please have a record replacement already. There are like a couple of proposals for record replacement.


You can use binaries. In Elixir strings are by default binaries containing UTF-8 codepoints, which are already nicely handled by binary matching and construction.

At first I was like "Whooot???"

Then I read: >>No one on our team is an Erlang expert, and we have had trouble debugging downtime and performance problems.<<

Now it is clear. As always: you must know your stuff, and you had better be an expert.

This struck me when I was watching a talk by some web company about the scaling issues with their Erlang stack. You could have replaced Erlang with Django or Rails and it would have made zero difference. People are essentially making the very same mistakes that the first startup I worked at, 6 years ago, was making with PHP.

That being said, I really like Erlang, and I can think of a few problem domains where Erlang would indeed be a really good fit. But serving HTML in my opinion is so trivial that all the fancy stuff that Erlang offers matters little in that regard.

Another two months and we should see the headline, "How and Why We Switched from Python to Go."

Followed by "How and Why We Switched from Go to Erlang" completing the cycle.

That transition (Go->Erlang) doesn't seem likely. Why? My impression is that this company's primary motivations for switching to Python were 1. maintainability (how maintainable Erlang was for them) and 2. performance.

1. Maintainability: Go is ALGOL/C-family-esque, and it will be much easier for a Java/C#/Python/<insert very common lang> programmer to ramp up on it than on Erlang. Go code is sometimes called boring because it tends to be very straightforward and pragmatic; this makes for very maintainable large systems.

2. Performance: Go performs much better than Python or Erlang[1-3].

Computer Language Benchmarks Game:

1: Go vs Python: http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?t...

2: Go vs Erlang: http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?t...

TechEmpower Web Framework Benchmarks:

3: http://www.techempower.com/benchmarks/

> Go performs much better than Python or Erlang[1-3].

I'm going to go out on a limb and say that performance is more subtle than can be reduced to a single "is better than" about throughput in HTTP requests or computing mandelbrot fractals in the shootout.

All of the languages have a specific area where they shine. Erlang in largely concurrent systems. Python also has a sizable mindshare in the HPC world.

>>I'm going to go out on a limb and say that performance is more subtle than can be reduced to a single "is better than" about throughput in HTTP requests or computing mandelbrot fractals in the shootout.

Sure, and I'd agree with that. But when it comes to empirical data, microbenchmarks are one of the best sources of data we have to look at. The best source will always be your specific application of course. Benchmarks will never be perfect, but they should not be so easily discarded.

>>All of the languages have a specific area where they shine.

Agreed. I do think it is worth pointing out that the Computer Lang. Bench. Game has a wide variety of programs just for the reason you point out, and it gathers both execution time and memory usage data.

>>Erlang in largely concurrent systems.

Very true; however, concurrency is also a strong point of Go.

>>Python also has a sizable mindshare in the HPC world.

While I'm sure this is true, that could be for reasons other than performance. I know some astronomers who use Python because, as non-CS people, it's easier for them to use than C or Fortran. However, if they have a really expensive program and they bring in someone with formal CS training to port it to C, Fortran, or Java, the resulting improvement is almost always huge.

LOL, this would be hilarious.

how cheap
