It bears repeating: Erlang processes are not OS-level processes. The Erlang virtual machine, BEAM, runs in a single OS-level process. Erlang processes are closer to green threads, or tasklets as they're known in Stackless Python. They are extremely lightweight, implicitly scheduled, user-space tasks which share no memory. Erlang schedules its processes on a pool of OS-level threads for optimal utilization of CPU cores, but this is an implementation detail. What's important is that Erlang processes provide isolation in terms of memory use and error handling, just like OS-level processes. Conceptually the two kinds of processes are very similar, but their implementations are nothing alike.
Also, if an Apache process dies, it will not crash other requests.
Instead, it's a "web application server", like Puma is for Rails apps: a server that runs your code in its own address space, where your code can crash the active worker process (which is usually handling multiple requests) for that server.
Puma ordinarily runs around eight worker processes, and a crash will kill one of them, forcing it to take the time to re-fork and discarding any other sessions that were also scheduled on that worker. Erlang runs a million worker processes, and one of them crashing doesn't cause anything big to happen at all.
Can you have 2M Apache processes on your machine?
The whole point of this article is that there is an architecture behind this and that this architecture is believed to be generally useful.
MaxConnectionsPerChild defines how often the process is recycled
Now you're just moving the goalposts. We're talking about Erlang processes - you can have millions of them simultaneously.
> It wouldn't make much sense to recycle the process for each request however.
It does make a great deal of sense from the fault-tolerance perspective. It also helps security.
I have an unpleasant feeling that you don't know much about Erlang and its philosophy. The "one process per connection" architecture has many benefits; it's a canonical way of getting concurrency and parallelism on Unix systems, for example. The problem is the way OSes handle and schedule processes: it simply doesn't work with a very large number of them. Erlang implements its own kind of processes, which enforce the same constraints as OS-level processes without taking megabytes of RAM each and without straining the system once you spawn more than a few thousand of them.
No. It's simply the terminology used in Erlang; there's nothing misleading in it once you know the definitions. I mean, there is a very specific thing called a "server" in Erlang: http://erlang.org/doc/man/gen_server.html and "processes" are described here: http://erlang.org/doc/reference_manual/processes.html
What's highly misleading is using more general definitions of such words instead of Erlang-specific meanings. This is a conversation about Erlang and it makes sense to use the terms accepted in Erlang; you can't blame anyone for the fact that some of these terms have meanings different than you'd expect.
And there is a good argument in favor of calling them "processes". Conceptually, both provide very strong separation/isolation and cleanup guarantees, which is not true of any of the other constructs you mentioned. The range of operations you can perform on a process - spawning, monitoring, sending signals - is the same for both OS and Erlang processes. You can very much think about and use both kinds in exactly the same way, other than the rather excessive memory usage of the OS-level kind.
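That spawn/monitor/signal vocabulary fits in a few lines of Elixir (a sketch; the sleep and exit reason are arbitrary):

```elixir
# Spawn a process, monitor it, and receive its exit signal --
# the same operations you'd perform on an OS process.
pid =
  spawn(fn ->
    :timer.sleep(100)
    exit(:boom)
  end)

ref = Process.monitor(pid)

receive do
  {:DOWN, ^ref, :process, ^pid, reason} ->
    IO.puts("process exited: #{inspect(reason)}")
end
```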
I actually have a somewhat related talk tomorrow, about - in essence - replacing Celery in a Python project with Elixir. Integration via ErlPort makes it natural to think of Python worker processes as simply (Erlang) processes - you spawn and link and monitor them just like you would normal (Erlang) processes. It even works with Poolboy out of the box.
Fast forward to today. Look at projects like LING, where the Erlang VM runs directly on Xen, without a traditional operating system. Processes in Erlang are 100% analogous to OS processes.
Also look at something like VirtualBox or VMware. Is your process running in a VM less of a process than the OS process powering the VM itself?
How is that even slightly misleading? That's exactly what many web servers written in Erlang do.
The Erlang virtual machine has what might be called "green processes" – they are like operating system processes (they do not share state like threads do) but are implemented within the Erlang Run Time System (erts). These are sometimes referred to as "green threads", but have significant differences from standard green threads.
The long way of saying "No, I can't, but I really don't want to accept that there might be a better model for this than my out of the box Apache setup."
It's no secret that Apache isn't the leanest of HTTP daemons, but that's really a separate point from the argument of whether you can reasonably describe a process as a web server.
"In computer terms, Smalltalk is a recursion on the notion of computer itself. Instead of dividing “computer stuff” into things each less strong than the whole – like data structures, procedures, and functions which are the usual paraphernalia of programming languages – each Smalltalk object is a recursion on the entire possibilities of the computer. Thus its semantics are a bit like having thousands and thousands of computers all hooked together by a very fast network." -- The Early History of Smalltalk
I also personally like the following: a web package tracker can be seen as a function that returns the status of a package when given the package id as argument. It can also be seen as follows: every package has its own website.
I think the latter is vastly simpler/more powerful/scalable.
What's interesting is that both of these views can exist simultaneously, both on the implementation and on the interface side.
Yes! I remember reading a quote by Alan Kay about how pretty much every modern OOP language gets it entirely wrong, because OOP is not supposed to be - in Kay's view - about classes or inheritance, but about objects sending messages to each other.
If one thinks of processes as objects and sending messages as "method calls", it is very object-oriented indeed.
"Smalltalk is not only NOT its syntax or the class library, it is not even about classes. I'm sorry that I long ago coined the term "objects" for this topic because it gets many people to focus on the lesser idea.
The big idea is "messaging" -- that is what the kernal of Smalltalk/Squeak is all about (and it's something that was never quite completed in our Xerox PARC phase). The Japanese have a small word -- ma -- for "that which is in between" -- perhaps the nearest English equivalent is "interstitial". The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be." 
My contribution towards this is Objective-Smalltalk, where I am working on making connectors user definable. So far, it seems to be working.
I was very happy a couple of years back when I was building a toy project in C and Lua to come across a Lua library called concurrentlua that implemented Erlang-style message passing (without pattern matching, though, IIRC). It even allowed you to send messages across machines.
間 means "a gap", but I don't see the relation to programming.
Try 絆 and 繋がる ("bonds" and "connect") out if you'd like some words popular in cheesy pop songs!
It also leads to better code by being functional, with lots of amazing syntactic sugar like the |> operator, and OTP makes it easy to move processes (which have very little overhead) to different machines as you scale. Pattern matching and guards are also incredible.
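As a small illustration (a sketch; the module and data are made up), the |> operator, pattern matching, and guards look like this:

```elixir
defmodule Stats do
  # Pattern matching in function heads, with guards.
  def describe([]), do: "empty"
  def describe([x]) when is_number(x), do: "single: #{x}"

  def describe(list) when is_list(list) do
    # |> threads each result into the next call.
    list
    |> Enum.filter(&(&1 > 0))
    |> Enum.sum()
    |> to_string()
  end
end

Stats.describe([1, -2, 3])  # => "4"
```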
I really do not want to write anything else!
Yes, it's a great default to be shared-nothing, and it's great to have a VM that supports this; the differences in affordances are important, and I think Erlang was a milestone language. Very serious about that. But when it comes down to it, the practical difference between nginx (as in, a finished piece of software that exists right now, not the hypothetical space of future C programs) and an Erlang webserver is not much.
So when the Erlang community tries to rewrite the definition of "server" to be "an Erlang process", it is not an unreasonable response to point out that there are plenty of other web servers that have similar levels of isolation that just happen to be written in other languages, and that we don't run around saying "Oh, my nginx has two hundred thousand web servers in it!"
This is bad advocacy, and I'd really suggest that Erlangers stop trying to defend this point. It's not a defensible position. There's no reason to try to redefine "server" to be "the number of isolated processes", because even if you do, Erlang will not have some sort of unique claim to being able to run lots of such processes. Any two "processes" that don't write into each other's space and can't crash each other are "isolated", even if some implementation work had to be done to get it that way. And of all things, webservers are the definitive programs that have had that isolation bashed on and banged on to the nth degree; I wouldn't be surprised if nginx's isolation is more tested than Erlang's own.
Well, due to Turing-completeness your parenthetical comment is wrong (since any language can do anything any other language can, if only by embedding an interpreter for the other), and your first comment isn't really relevant: sure, it's possible to write code which implements Erlang's ideas in C, or in assembler, or in AppleScript for that matter — but is it probable? Is one fighting the language, syntax and semantics every step of the way, or are they helping one along? Is it even possible to see the forest for the trees when writing Erlang-style CSP code in C?
No, Erlang is not magic, and yes, it's possible to write an Erlang emulator in any language. But there are huge advantages to using a language meant to do something rather than a language which can be made to do something.
Yes, we are all familiar with Turing equivalence, but it's an exceedingly pedantic point when discussing programming languages.
I'm not referring to Turing equivalence at all. I'm 100% discussing practical considerations here.
Armstrong was making an analogy, and it's useful to be charitable with analogies to get the most out of them. All analogies, descriptions, models and metaphors break down somewhere. He's been doing pretty well as an Erlang advocate so far.
"it can't actually do anything that other many languages can't do too"
I personally think it's really difficult to scale systems without losing messages. Having OTP behaviours available as GenServer and GenEvent in Elixir is amazing and provides a nice way of creating your own servers. A good example of these primitives in use is the Redis driver; it uses two processes - one to send messages to Redis and one to receive them. This means that essentially all messages going to Redis are sent in a fire-and-forget way. Most other clients (even in Erlang) block and wait for a response from Redis.
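The fire-and-forget side of that pattern is just GenServer.cast/2; here's a minimal sketch (the module name and IO placeholder are made up, not the actual Redis driver):

```elixir
defmodule Pipe do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # cast/2 enqueues the command and returns immediately;
  # the caller never blocks waiting for a reply.
  def command(pid, cmd), do: GenServer.cast(pid, {:command, cmd})

  def init(:ok), do: {:ok, nil}

  def handle_cast({:command, cmd}, state) do
    # A real driver would write to the Redis socket here.
    IO.inspect(cmd, label: "sending")
    {:noreply, state}
  end
end
```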
Now I am not saying that it's impossible to write this stuff well in other languages/virtual machines but I am saying that Erlang/Elixir/the OTP makes it easier.
"Yes, it's a great default to be shared-nothing"
I'm not certain you could call Elixir shared-nothing. That's simply wrong; look at the Elixir primitive/server called Agent, which is a specific construct for sharing state.
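Though it's worth noting how Agent "shares": the state lives in one process and is only reached by message passing, e.g.:

```elixir
# State held by a dedicated process; callers send it messages
# to read or update the value -- no shared memory involved.
{:ok, counter} = Agent.start_link(fn -> 0 end)
Agent.update(counter, &(&1 + 1))
Agent.get(counter, & &1)  # => 1
```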
"it is not an unreasonable response to point out that there are plenty of other web servers that have similar levels of isolation"
I'm not certain you understand what OTP gives you over a webserver or an operating-system process. It's much more about having a message-queue system baked in and easy to use, one that happens to be supervised and can be moved to different machines transparently.
"This is bad advocacy, and I'd really suggest that Erlangers stop trying to defend this point"
You are arguing against a straw man here. Maybe the original poster cares a bit about this, but I couldn't care less about the number of processes. I'm much more concerned that Elixir allows me to think about building better and more scalable software with a lot of edge cases solved for me. Once you get that every section of your program can be thought of as a server which you can send messages to, and that this is super efficient - you don't need to wait for anything to block anymore - it really is a better model.
"Erlang will not have some sort of unique claim to being able to run lots of such processes"
It can definitely be done in other languages but it's much harder.
"I wouldn't be surprised that nginx's isolation is more tested than Erlang itself's."
It's never been about isolation; it's about not blocking and sending messages instead. And getting supervision and robustness and fault tolerance and ease of use. Yes, you could put 300 RabbitMQ queues in front of every part of your program, but I'd say it's easier and more scalable to just use Elixir/Erlang/OTP.
I've programmed in Erlang for somewhere around six years, which really makes things like "I'm not certain you understand what the OTP is giving you" rather silly.
You're falling into the exact trap I was talking about, which is implicitly assuming there's some sort of magic sauce that only Erlang brings to you. You also are so busy disagreeing that you didn't notice that I already said pretty much everything you said. (What is it about Erlangers explaining back my own points to me, only angrily as if I didn't get the points I just made? I get this more from Erlang than any other community.) I'm not kidding about Erlang being a milestone language. But it's not magic.
"You are arguing against a straw man for me here."
What I'm arguing against is Joe Armstrong's attempt to redefine terms to suit his advocacy. If that's a "straw man", take it up with him, not me.
See also http://www.jerf.org/iri/post/2930 . And for goodness' sake, please don't reply with another laundry list of the points I already made in that post about how that isn't exactly what Erlang has, as if I hadn't made them already. I've already been through that. (You know the differences, because that post tells you exactly what they are.)
I'm glad you clearly get the OTP but that is definitely not clear to me from your previous response. Good luck.
Process isolation isn't complete, though: one Erlang process can exhaust any of several limited resources and take the whole VM down with it. A single process can't use all the CPU, but it can use all the memory, all the atoms, all the file descriptors, all the ETS tables, and so on - basically anything with a system limit.
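At least the memory case can be bounded per process: since OTP 19 you can cap a process's heap so a runaway process is killed before it exhausts the VM (sizes here are arbitrary):

```elixir
# Kill this process if its heap grows past ~100k words,
# instead of letting it eat all of the VM's memory.
spawn(fn ->
  Process.flag(:max_heap_size, %{size: 100_000, kill: true, error_logger: true})
  # Grow an unbounded list until the limit trips.
  Enum.reduce(1..10_000_000, [], fn i, acc -> [i | acc] end)
end)
```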
Calling it simple is a compliment -- a normal person can look at the VM code and understand it, if they need to.
Could you point me at the right place to read (or read about) the beam sources? I'm looking at https://github.com/erlang/otp/wiki/Routemap-source-tree -- is there something better?
No it won't. Not if you mean an error in an Erlang process. It almost sounds impossible, but that is the beauty of the BEAM virtual machine. It is really a marvel of engineering.
Now if you mean an error in the Erlang VM itself, then yeah, if that crashes it will bring down all the connections. But even for that there is an answer -- distributed Erlang. An Erlang VM can talk to processes on other Erlang VMs, even if those are running on a server on the other side of the planet.
Are you talking about the Erlang runtime here? It's not very clear what you are implying. Can you please elaborate?
If I take your statement to mean the Erlang runtime, then this is no different than saying that if there is a bug in glibc which gets tickled when I do something, then my process will crash; or if there is a bug in the JVM, then the application hosted on the JVM will crash; and so on.
I don't see the difference between spawning a thread for each connection in other languages and spawning a process for each connection in Erlang, so I was curious if the word 'process' implied some extra level of insulation. Reading all the comments here it's still not clear to me what benefit is being obtained by having "2 million web servers" rather than 2 million connections.
Since there's no shared memory, performance characteristics change. Garbage collection is done per process and so "stop the world" doesn't happen, meaning each individual "web server" isn't at the mercy of other requests for performance.
It's lots of little things like that.
That is a huge difference. Otherwise, why wouldn't Erlang just make pthread calls and spawn a thread? Why bother doing all that work?
The reason is that Erlang processes do not share heap memory. That is one of the most important features of the Erlang runtime. A crashing process can just crash on its own without leaving the other 1,999,999 processes in an undetermined state.
Additionally, if a process terminates before a garbage collection cycle is run the VM just reclaims all of its memory without any additional computation needed.
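Both properties are easy to see from an IEx shell (a sketch; the spawns are unlinked, so the crash stays contained):

```elixir
victim   = spawn(fn -> raise "boom" end)
survivor = spawn(fn -> :timer.sleep(:infinity) end)

# Give the first process a moment to crash.
:timer.sleep(50)
Process.alive?(victim)    # => false -- it crashed alone
Process.alive?(survivor)  # => true  -- its heap was never touched
```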
Fortunately this more or less seems to be the case.
If you have one traditional application server and want to do some form of cross-user interaction - let's say chat - you can do that trivially, put it in a queue for that user in a global map. Now when you outgrow that server, you need to rewrite all your code to understand the concept of users being on other servers or use an external message queueing system.
In Erlang, all of this is built in by default - if you write code for the OTP framework (the standard library for dealing with messaging, process supervision, etc), all you need to do is connect the two servers together and point them at the same shared user->process mapping process (which you have to build whether you're dealing with one server or 20, as there's no global data otherwise).
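Mechanically, that's about two calls (node names here are made-up assumptions):

```elixir
# On either node: join the cluster...
Node.connect(:"app@host2")

# ...and register the shared user->process mapping under a
# cluster-wide name using the built-in :global registry.
{:ok, registry} = Agent.start_link(fn -> %{} end)
:global.register_name(:user_registry, registry)

# Any connected node can now find and message it:
pid = :global.whereis_name(:user_registry)
```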
Of course, if you have absolutely no direct interaction between users, it's trivial to scale anything - fire up a new server and direct some portion of your traffic at it. Erlang's trick is to make it that easy even when you do have direct interaction between users. And of course that works for even backend workloads - if you have a backend server that your frontend servers talk to and need to scale it, if you've coded in Erlang and put as much logic as possible in per-connection processes, you're probably a significant chunk of the way there.
So he's not really affiliated with any company, but rather a voice for Erlang itself.
One of the nice things about Erlang's general approach is that this was a lot less painful than in many other languages. (I have no idea, though, how much work it was to get the Erlang VM itself to use multiple threads.)
And running multiple Erlang VMs on one machine - from the Erlang application's perspective - is no different from running them on several machines. The really nice thing about Erlang is that it scales across multiple nodes very smoothly.
Tier 1: a very small master program which ran in one process. Its job was to open the listening socket and maintain a pool of connection-handler processes.
Tier 2: connection handler processes, forked from the master program. When they started they would load up the web application code, then wait for connections on the listening socket or for messages from the master process. They also monitored their own health and would terminate if they thought something went wrong. (ex: this protected them from memory leaks in the socket handling code.) When an http connection came in on the socket, they would fork off a process to handle the request.
Tier 3: request handlers. These processes would handle one http request and then terminate. When they started, they had a pristine copy of the web application code (thanks to Copy-On-Write memory sharing of forked processes) so I knew that there was no old data leaked from previous requests. And since they were designed to terminate after a single request, error handling was no problem; those would terminate too. In cases where a process consumed a lot of memory it would get released to the OS when the process ended. We also had a separate watchdog process that would kill any request handler that consumed too much cpu, memory, or was running much longer than our typical response time.
This scaled up to handling hundreds of concurrent requests per (circa 2005 Solaris) server, and around six million requests per day across a web farm of 4 servers. That was back in 2010; I don't know how much the traffic has grown since then but I know the company is still running my web app. This was all very robust; before I left I had gotten the error rate down to a handful of crashed processes per year in code that was more than one release old.
BTW, while my custom http server code could handle the entire app on its own, and was used that way in development, for production we normally ran it behind an Apache server that handled static files and would reverse-proxy the page requests to the web app server. So those 6 million requests per day were for the dynamic pages, not all of the static files. That also meant that my web app didn't have to handle caching or keep-alive, which simplified the design and makes the one-request-then-die approach more viable.
I'm especially interested in how they manage state. Because when you do not have to manage state, everything becomes easy and scalable.
By state I mean, for example, a status message for a particular user.
It's called Phoenix.Presence. They used a combination of CRDTs and heartbeats to implement it.
You are right, it is a hard problem because of the distributed nature of Erlang/Elixir. That's why Chris provided a framework level solution for it.
 https://youtu.be/XJ9ckqCMiKk?t=921 (Erlang Factory SF 2016 Keynote Phoenix and Elm – Making the Web Functional)
Whenever a request comes in, you then route the request to that session process (even when it's on another physical machine). This not only provides a clearer mapping between your external and internal APIs, but also allows you to fairly easily ensure that a user can't perform concurrent actions.
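A sketch of that shape in Elixir (module, registry name, and reply are hypothetical):

```elixir
defmodule Session do
  use GenServer

  # One process per user session, registered by user id.
  # Requires: Registry.start_link(keys: :unique, name: Sessions)
  def start_link(user_id),
    do: GenServer.start_link(__MODULE__, user_id, name: via(user_id))

  # Routing every request for a user through this one process
  # serializes their actions -- no concurrent-action races.
  def handle(user_id, request),
    do: GenServer.call(via(user_id), {:handle, request})

  defp via(user_id), do: {:via, Registry, {Sessions, user_id}}

  def init(user_id), do: {:ok, %{user: user_id}}

  def handle_call({:handle, request}, _from, state),
    do: {:reply, {:ok, request}, state}
end
```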
Edit: Not sure if that's exactly what you mean, so let me know if you have more questions :)
Ruby is known to be a slow language; most things will easily outperform it.
Actual bottlenecks are the database, and your webserver/architectural choices.
Also, much better tooling around Erlang processes than goroutines.
Processes are meant to model the real world. I'd say the truth is closer to "Any time you would instantiate an object in Java, you spawn a process." That's not always going to be true, but more true than you might expect.
BEAM has a very efficient, fair, multi-core scheduler. All 2M processes share time on all available cores. Because scheduling is preemptive, the starvation problems one might see on an event loop are avoided: a CPU-intensive process can run alongside low-latency IO processes and they are all fairly scheduled.
In the analogy, it's like having 2M web servers, but where only, say, 1000 of them are actually running. As a developer having independent web servers helps you to think about what should be done instead of how it's going to be done. After that, the fact that a server is actually running or it is stalled because it's reading from disk or it is waiting for available CPU is an implementation detail and something that monitoring will show you.
Each Erlang process has its own (Erlang) heap and stack, and there is no way within the Erlang language to access memory by address: a process can only access its own memory, plus special types of shared memory (such as the shared large-binary heap).
From the OS perspective, it's one process, with one memory space, and several threads.
A realistic Erlang example where you would have 2 Million processes is something like a chat server (1 process per connected user), either via a traditional tcp protocol or a websocket protocol for a browser. In another language, you would likely need to multiplex multiple users onto each thread, in Erlang each user gets their own process and the code ends up being quite straight forward.
You can do actor-model concurrency on Java (and Scala) with the Akka framework (http://akka.io). It's a pretty mature one as well. I've been using it for a while now.
Erlang processes are very lightweight (only a few KB of memory each), yet they share with OS processes the property of keeping their data isolated.
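You can check the footprint yourself from a shell; an idle process is on the order of a couple of KB:

```elixir
pid =
  spawn(fn ->
    receive do
      :stop -> :ok
    end
  end)

# Total memory for the process, in bytes (the exact number
# varies by OTP release and architecture).
Process.info(pid, :memory)
```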
Think about it like an operating system. In most programming platforms today you are essentially programming in Windows 3.1 or DOS. So when the calculator process crashes, it could take down your game or your word processor because they all share their memory space. That is all cool, but it is cool for 1994, not for 2016.
When programming in Elixir/Erlang/LFE, it is like moving up the ladder and using a real OS (preemptive execution, processes that don't mess with each other's memory, and so on).
It is marketing material for my consulting activity, but some of you may find it interesting.
The PDF is here: https://github.com/siscia/intro-to-distributed-system/blob/m...
The source code is open, so if you find a better way to describe things feel free to open an issue or a pull request...
I'm no Erlang programmer, though.
I doubt that Erlang spawns millions of OS processes, because that would be extremely inefficient due to CPU context switching. So in reality, all Erlang is doing behind the scenes is closing the connection and destroying the session... It's not actually crashing and restarting any processes... You can easily implement this behavior with pretty much any modern server engine such as Node.js or Tornado.
 restricting global shared memory, enforcing fault isolation through the process model, and supervisors
You are correct, Erlang uses something akin to "green processes" where it manages its own threads but they do not share anything in the way that normal threads would.
Unless that connection managed to write over some memory already. Then you can't just close the connection. You have to restart the whole server OS process.
> I doubt that erlang spawns millions of OS processes because that would be extremely inefficient due to CPU context switching
Erlang does spawn millions of processes. You can try it yourself in a few lines of code. They are not OS processes, though, although they are similar in that neither OS processes nor Erlang ones share memory with their peers -- a very useful property indeed.
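The "few lines" look like this in Elixir (you may need to raise the VM's process limit first, e.g. `elixir --erl "+P 2000000"`):

```elixir
parent = self()

# Spawn a million processes, each sending one message back.
for i <- 1..1_000_000 do
  spawn(fn -> send(parent, {:done, i}) end)
end

# Wait for all of them to report in.
for _ <- 1..1_000_000 do
  receive do
    {:done, _} -> :ok
  end
end
```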
So Erlang, in a way, can have its cake and eat it too. It gets the benefit of OS processes at a much lower cost.
He's referencing processes on the Erlang virtual machine, not OS processes. A BEAM process is much lighter weight.
With Erlang (or rather, the actor model), you can build stateful pieces of code that are also resilient to crashes, without those crashes cascading across the entire system, by having a standard pattern of restarting individual actors from a known clean state through a clean supervisor hierarchy.
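The restart pattern itself is only a few lines (a sketch; the child module and its initial state are made up):

```elixir
defmodule Counter do
  use Agent
  def start_link(_), do: Agent.start_link(fn -> 0 end, name: __MODULE__)
end

# If Counter crashes, the supervisor restarts it from its known
# clean initial state (0) -- the crash never cascades upward.
{:ok, _sup} = Supervisor.start_link([Counter], strategy: :one_for_one)
```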