
I saw this deck mid-2014, and it single-handedly made me want to learn Erlang. I felt that any system that can handle COD's load is, at the very least, worth checking out.

About a year later, I got a job doing Erlang full-time, and OTP is about as good as that slideshow indicated. So, if anyone at Demonware is reading this: thank you! You introduced me to my second great love (after my wife)!




How did you get into Erlang, project-wise?

I ask because I've played around with RabbitMQ plugins, a port of the NeHe OpenGL examples, and an Erlang notepad someone implemented. I also own both O'Reilly hard copies.

But other than that, I don't really have a project I can see myself using Erlang for (one-man shop).


The Cowboy webserver is fairly easy to get started with and I really recommend checking it out, since a web server feels like a "very Erlang problem" to me.

To answer your question though, I got started by playing with TCP and building a basic chat server. I spawned a new Erlang process for each user, used Erlang's internal messaging to share messages between processes, and then passed those messages along over TCP to the user.
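For anyone who wants to see the shape of that without installing Erlang, here is a rough sketch in Python with asyncio. It is an analogy, not the original code: asyncio tasks stand in for Erlang processes, an `asyncio.Queue` per user stands in for the process mailbox, and a plain list (`delivered`) stands in for the TCP socket. All of the names (`ChatRoom`, `user_loop`, `chat_demo`) are made up for illustration.

```python
import asyncio


class ChatRoom:
    """Registry of per-user mailboxes (a stand-in for process ids)."""

    def __init__(self):
        self.mailboxes = {}  # user name -> asyncio.Queue

    def join(self, name):
        self.mailboxes[name] = asyncio.Queue()
        return self.mailboxes[name]

    def broadcast(self, sender, text):
        # Deliver to every mailbox except the sender's own.
        for name, box in self.mailboxes.items():
            if name != sender:
                box.put_nowait((sender, text))


async def user_loop(name, mailbox, delivered):
    """One task per user: wait on the mailbox, forward to the 'socket'."""
    while True:
        msg = await mailbox.get()
        if msg is None:  # hypothetical shutdown signal for the demo
            return
        sender, text = msg
        # A real server would write to the user's TCP connection here.
        delivered.append(f"{sender} -> {name}: {text}")


async def chat_demo():
    room = ChatRoom()
    delivered = []
    tasks = [
        asyncio.create_task(user_loop(name, room.join(name), delivered))
        for name in ("alice", "bob", "carol")
    ]
    room.broadcast("alice", "hi")
    for box in room.mailboxes.values():
        box.put_nowait(None)  # ask every user task to stop
    await asyncio.gather(*tasks)
    return sorted(delivered)
```

Running `asyncio.run(chat_demo())` delivers alice's message to bob and carol only. What the sketch does not give you is Erlang's isolation: these tasks share one heap and one event loop.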

A basic rule of thumb for whether something is a good Erlang problem: can you visualize it as a bunch of tiny, self-contained things (like cells), where you don't want the failure of one cell to risk breaking anything else?


And it turns out a lot of problems actually fit that if you try to think about them that way. Sometimes it's worth -trying- to change your visualization; you'll find it works out better.

We had to write task scheduling software where I worked, and someone unfamiliar with Erlang said something to the effect of "Oh, that's easy, priority queue". Well, no; there are all -sorts- of sharp edges to doing that. Instead we just had one Erlang process per task, with timers (slight gotcha: you shouldn't use the timer module; use erlang:send_after instead). So every task scheduled in the next (time period) was a sleeping Erlang process, listening for updates or for the timer message; on the timer message it would go and actually start executing. Super easy to test, reason about, debug, etc. (because the correctness of the task execution was largely isolated from the correctness of the task scheduling), and a far better solution than what the non-Erlang mindset would naturally reach for (since in most languages, more concurrency = bad).
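To make the "sleeping process per task" idea concrete, here is a minimal sketch in Python with asyncio. It is not our actual code and not Erlang: an asyncio task loosely plays the role of the Erlang process, and `asyncio.wait_for` on a queue loosely plays the role of a receive combined with a timer message. The message shapes (`"reschedule"`, `"cancel"`) and function names are assumptions made up for this example.

```python
import asyncio


async def scheduled_task(mailbox, delay, action):
    """Sleep until the deadline, waking early only for update messages."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + delay
    while True:
        timeout = max(0.0, deadline - loop.time())
        try:
            kind, value = await asyncio.wait_for(mailbox.get(), timeout)
        except asyncio.TimeoutError:
            # No message before the deadline: the "timer message" case.
            return action()
        if kind == "reschedule":
            deadline = loop.time() + value  # new delay, counted from now
        elif kind == "cancel":
            return None


async def scheduler_demo():
    fired = []
    boxes = {name: asyncio.Queue() for name in ("a", "b", "c")}
    tasks = [
        asyncio.create_task(
            scheduled_task(boxes["a"], 0.2, lambda: fired.append("a"))),
        asyncio.create_task(
            scheduled_task(boxes["b"], 10.0, lambda: fired.append("b"))),
        asyncio.create_task(
            scheduled_task(boxes["c"], 0.01, lambda: fired.append("c"))),
    ]
    boxes["b"].put_nowait(("reschedule", 0.01))  # pull b forward in time
    boxes["c"].put_nowait(("cancel", None))      # c never runs
    await asyncio.gather(*tasks)
    return fired
```

In the demo, b is moved ahead of a and c is cancelled, so only b and then a fire. Updating a task is just a message to that task; there is no shared queue to reorder.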


I'm just curious, what are some of the disadvantages of a priority queue?


Okay. Let's start with a perfectly naive priority queue. We have a default data structure (never mind the implementation for now), and a thread that pulls items off the queue and executes them. Works perfectly in test.

But we notice something; our executions take a non-zero amount of time, and so if we have two events happening simultaneously, they don't both get fired off in time. So now we have to switch to a thread pool, that gets handed events from the priority queue. Task runners, in effect. So, great. Everything works now.

But then we have to start accepting updates to tasks. That is, we are able to move a task forwards or backwards in time. Now, suddenly, we care about the underlying data structure of the queue; can we both be updating the order of events -and- be pulling from it? Does it require locking to do that?

Maybe it does require locking, in which case are we certain that we got ~that~ implemented right? Not just in the queue library: in the event of an exception in our own code, do we correctly free the lock? And how long does it take to rebalance the queue? Is that acceptable, since we can't be pulling from the queue during that time?
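For a feel of how much machinery the locked version already needs, here is a sketch in Python. It is one possible design, not what anyone in this thread shipped: it uses `heapq` under a `threading.Lock`, "moves" a task by tombstoning the old entry and pushing a new one (so there is no mid-heap rebalancing), and adds a sequence counter so entries with equal due times never compare by task id. The class and method names are hypothetical.

```python
import heapq
import itertools
import threading


class LockedTaskQueue:
    def __init__(self):
        self._lock = threading.Lock()
        self._heap = []                # entries: [due_time, seq, task_id]
        self._live = {}                # task_id -> its current entry
        self._seq = itertools.count()  # tie-breaker for equal due times

    def schedule(self, task_id, due_time):
        """Add a task, or move an existing one to a new due time."""
        with self._lock:  # 'with' releases the lock even on an exception
            old = self._live.pop(task_id, None)
            if old is not None:
                old[2] = None  # tombstone; skipped when it reaches the top
            entry = [due_time, next(self._seq), task_id]
            self._live[task_id] = entry
            heapq.heappush(self._heap, entry)

    def pop_due(self, now):
        """Return the next task id due at or before `now`, else None."""
        with self._lock:
            while self._heap and self._heap[0][0] <= now:
                _, _, task_id = heapq.heappop(self._heap)
                if task_id is not None:
                    del self._live[task_id]
                    return task_id
            return None
```

Even this small version has to decide what a tombstone is, who cleans it up, and how ties are ordered, and it still says nothing about who runs the task once it is popped.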

Maybe it doesn't require locking, such as when using immutable data structures where we build a copy and swap it out, but then we might have a chance of executing a task twice (i.e., our task launcher thread reads copyA, while we're modifying copyB, and so the head of the queue in both gets executed), so now we have to keep tabs on what we launched last. But that assumes a stable priority queue in the event of two items having the same time; -that's- a fun edge case to miss.

But maybe we get all that working, by getting a lock-free priority queue in place. We have one thread reading from a priority queue, and handing tasks off to a thread pool. What size is the thread pool? What happens if a task is long running? Can we exhaust the thread pool?

What happens if we need to scale out a bit, because we're exhausting our thread pool? We only have one thing reading off the queue; can it hand those off to another machine in a timely enough fashion? Probably not, especially if we're getting GC pauses; how do we solve that? Maybe we change it so that we launch things a little early, either locally or remotely, and the thread sleeps until it's time to start the task. That will work, but what happens when we want to update the task? Now the logic exists in two places; the task could still be on the priority queue in which case we have to modify the queue, or the task could be sitting in its own thread, and we have to modify it there (arguably this existed before; if a task had started you modified it in the thread it was executing in, if the task hadn't started you had to modify the priority queue, but this makes it far less determinate).

Etc. In Erlang, you skip past all of these issues, and implement the end solution as your first, rather more easily. There's still a data structure somewhere saying what tasks have to execute, and when (probably just in your database, though, nothing needed in memory), but you don't have to worry about managing it beyond "hey, make sure the items for the next (time period) are running in a thread (Erlang process)". You can do that with one process launching the others, and (naive) distribution across machines is incredibly easy (the same complexities of handling partitions apply to both solutions). Individual tasks are built from the beginning to handle sleeping until start time, and updates are trivial, being just a message to them that they respond to, and a modification of your data structure (database). There's no tight coupling between the two. You end up with just "whenever a change in events happens (creation or update), send a message to that process if it exists, and update the DB". And the process scheduler in Erlang handles all the concurrency. There's no locking you have to worry about, and latency will be spread equally, rather than running the risk of choking out a single task.

And this is the -naive- Erlang implementation. That's the benefit. For this particular use case, with priority queues you start running into issues, and as you solve them, your implementation looks more and more like the naive Erlang implementation.


Thank you so much for the write up, I'm just getting into programming at school and this was a fantastic and interesting read! I've always been interested in how data structures work in real-life applications and I'm thankful for the explanation.


Got a good book or website recommendation for learning Erlang?

How does it compare/contrast with something like Akka (which I've got rudimentary experience with in Scala)?


Akka was basically the Scala implementation of Erlang's actor model.

The key differences, for me: Erlang is easier to reason about. This is mostly how my mind works; in Akka it felt like everything was reactive, that nothing does any work until it gets a message, while in Erlang it felt like the opposite, that every process is doing work until it decides it needs a message, and then it can check its mailbox and optionally block until something ends up in it. And because of Erlang's functional nature, things that require special keywords in Akka just fall out 'for free'. For instance, 'become': in Erlang, 'become' is just calling another function that has different behavior. Erlang also has stronger reliability characteristics (not as efficient, but no stop-the-world garbage collection, and shared-nothing, so the only way an exception in one actor can affect another is via an OOM, or by specifically hooking up links/monitors so that it does). Akka, however, gives you the speed of the JVM (which -is- better than Erlang at pure number crunching), more libraries (though Erlang is no slouch there, and they all kind of fit the Erlang model), and OOness, if that's your thing.
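To illustrate the 'become' point, here is a toy sketch in Python rather than Erlang or Akka. Each behavior is a plain function that handles one message and returns the function that should handle the next one, which is roughly what an Erlang receive loop does when it calls a different function. The turnstile example and all names are invented for illustration.

```python
def locked(msg, log):
    """Behavior while locked: a coin switches behavior, a push is blocked."""
    if msg == "coin":
        return unlocked  # "become" unlocked: just hand back another function
    log.append("blocked")
    return locked


def unlocked(msg, log):
    """Behavior while unlocked: a push goes through and re-locks."""
    if msg == "push":
        log.append("passed")
        return locked
    log.append("refunded")  # extra coin while already unlocked
    return unlocked


def run(behavior, messages):
    """Feed messages through whichever behavior is current."""
    log = []
    for msg in messages:
        behavior = behavior(msg, log)
    return log
```

Calling `run(locked, ["push", "coin", "coin", "push", "push"])` gives `["blocked", "refunded", "passed", "blocked"]`. There is no special keyword involved; the state is simply which function is currently running.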



If you could share that code, it would help a lot of people see the power of Erlang as perceived by a novice coder in the language.


I sadly do not have access to that code anymore. I wrote a basic MVC framework using the aforementioned Cowboy server early last year though: https://github.com/Tombert/Frameworkey-Erlang

... Keep in mind I've gotten a lot better since I wrote this!



