
Elixir RAM and the Template of Doom - revorad
http://www.evanmiller.org/elixir-ram-and-the-template-of-doom.html
======
rdtsc
I like and use Erlang full time, but I am very glad to see Elixir gain
traction.

BEAM is really a marvel of engineering.

I've mentioned this before, but I've had experienced engineers show
disbelief at what it can do -- lightweight processes with isolated heaps, the
ability to have millions of them, isolated faults (if one crashes it won't
scribble over the memory of the others in any way), very low latency garbage
collection, awesome monitoring and tracing facilities, live code replacement,
and so on.

It really feels like having superpowers when you use it, so it's nice to see
a whole new ecosystem of languages on top of it.

~~~
MichaelGG
The isolated faults thing: how is this different than every other managed
language? Especially if shared state isn't used?

~~~
rdtsc
Start multiple threads. They each write some data. One of them crashes. Is it
safe to just restart that thread and continue?

Not always, because that thread might have written to some shared data
structure and left it in an inconsistent state. Restarting that one thread
might seem to work, but because there is no guarantee, the safest way is to
restart the whole OS process.

It might not even be your code. Maybe your RPC library or some other package
did it internally. You might restart and 99% of the time it will work, but it
is not guaranteed.

Now, of course, the equivalent counterpart to this is not threads: you can do
it with OS processes, and I've done that in Python. You can use pipes,
file-system writes, and sockets to create a reliable system where some
processes can crash without bringing down the whole service. But it is
cumbersome.

But imagine the server handles a million connections. Certainly doable in
Erlang, with each one handled by one lightweight process. A connection does
something unusual -- maybe it uses a new feature which was added last night.
It crashes; that's fine. The other 999,999 are OK. If this were a C++ program,
it might have segfaulted and caused the other 999,999 connections to be
dropped.
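The contrast can be sketched in Python, with OS processes standing in for
BEAM's lightweight ones (everything here -- `handle_connection`, the failing
connection id -- is invented for illustration):

```python
import subprocess
import sys

def handle_connection(conn_id):
    """Run one 'connection' in its own OS process.  A crash in one process
    cannot scribble over the memory of the others, so the rest survive."""
    # Connection 2 hits the buggy new feature and dies with a nonzero exit.
    code = f"import sys; sys.exit(1 if {conn_id} == 2 else 0)"
    proc = subprocess.run([sys.executable, "-c", code])
    return "crashed" if proc.returncode != 0 else "ok"

# One connection crashes; the others are untouched and keep their state.
statuses = {i: handle_connection(i) for i in range(4)}
```

With threads instead of processes, the crash in "connection 2" could have
left shared state inconsistent; here the kernel guarantees the isolation.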

I've seen systems with a periodically crashing subsystem that was being
restarted behind the scenes for a while, and most of the system stayed up
without noticeable issues. In business terms that means a smaller ops team; it
means not having to wake up at 4am to answer pages (if the subsystem heals
itself, you can just fix it in the morning) and so on.

Moreover, live code hot-patching is often touted as a gimmick. But I've used
it on large production clusters while they were serving tens of thousands of
requests per second, without stopping them. Sure, it was scary, and by that
point something had already gone wrong (to need the fix), but it was nice not
having to shut everything down just to add an extra trace or log statement.

~~~
illumen
I don't think it's so cumbersome with modern Python. There are first-class
coroutines, futures, process pools and such now. Also, there are a number of
queue systems in Python which easily allow running failed tasks again. One
project I use is basically task(func, args). That's it. Then if a task fails
we can have it automatically restart, or wait for review, etc. This works over
multiple machines too. It's only about 60 lines of code, including nice GUI
interfaces to manage the tasks and deploy the whole lot to various machines.
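Since the project isn't named, here is a purely hypothetical toy version of
that task(func, args) pattern -- run a task, retry it on failure, then park
it for review:

```python
def run_task(func, args, retries=2):
    """Toy task runner: retry a failed task a few times, then hand it
    over for manual review instead of silently losing it."""
    last_error = None
    for _attempt in range(retries + 1):
        try:
            return ("ok", func(*args))
        except Exception as exc:
            last_error = exc
    return ("needs-review", last_error)

attempts = {"n": 0}

def flaky_job():
    # Fails once, then succeeds -- simulating a transient error.
    attempts["n"] += 1
    if attempts["n"] < 2:
        raise ValueError("transient failure")
    return "done"

status, result = run_task(flaky_job, ())
```

A real system would persist the queue and distribute it across machines, but
the core restart-on-failure logic is about this small.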

Erlang is amazing, of course. I just wanted to say that some of those
patterns are now available in Python and easily usable. They're not common to
all Python code, which is where I think Erlang wins out: it puts this stuff
front and center and has first-class support. Looking at a random Erlang code
base, you'll probably see it there. Random Python code bases... not so much
(OK, queue systems are quite common in Django/Flask projects). It's very rare
to see Python greenlets/eventlets in the wild, for example, but generators and
async stuff are becoming quite commonplace.

Also, Python has single dispatch now (built in, not in a third-party
library) -- another thing which Erlang does well (pattern matching). Again,
it's not so commonly used except in modern Python shops. Combined with
QuickCheck for Python (Hypothesis) and gradual typing, these have really made
modern Python a much happier place. Erlang deserves some big respect for
spreading good ideas.
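For reference, the built-in single dispatch being referred to is
functools.singledispatch; a minimal example (the `describe` function is just
an illustration):

```python
from functools import singledispatch

@singledispatch
def describe(value):
    # Fallback for types without a registered implementation.
    return "something else"

@describe.register
def _(value: int):
    return f"int {value}"

@describe.register
def _(value: list):
    return f"list of {len(value)}"
```

It's dispatch on the type of the first argument only -- much weaker than
Erlang's pattern matching on value shapes, but the same flavor of idea.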

~~~
rdtsc
Sure. Individual aspects of Erlang/OTP are all present in other systems: OS
processes, queuing backends, lightweight coroutines; Java has some code
reloading too.

But it makes a difference when all of it is built into one
language/framework. It makes tracing/debugging/developing easier.

As with Python: yes, you can use a queuing subsystem and submit jobs, but
that is another service to configure and manage. You can use multiprocessing
(and I've done that), but you can't launch a million sub-processes, which
changes how you develop. Python has green-thread coroutine support via
greenlet (eventlet & gevent), but those coroutines share memory and a heap,
and if you do any CPU-intensive work they will block each other.

~~~
vvanders
I'd say it's a stretch that Java has code reloading. Sure, you can reload the
bytecode, but unless you're marshaling the data to the new version like Erlang
does, you're just introducing more errors than you're fixing.

------
bjfish
Another interesting read is how BEAM (the Elixir/Erlang VM) does scheduling:
[http://jlouisramblings.blogspot.co.uk/2013/01/how-erlang-doe...](http://jlouisramblings.blogspot.co.uk/2013/01/how-erlang-does-scheduling.html)

~~~
jlouis
The post is old, but still somewhat true. The most important change is that
memory carriers can nowadays be moved between schedulers/cores, which improves
the TLB miss rate and makes for better system locality.

------
dvcrn
I am very glad that I made the decision to pick Elixir as my next language
instead of other options in the pool.

Elixir is fantastic. Erlang's concurrency model takes some time to get used
to, but it's unlike anything I've seen before -- things like state being
managed in a separate server process, or the ease of building fault-tolerant
systems with supervision trees.

Plus, for me the best part, Elixir is just "fun" to write. I haven't had this
much fun writing software since I first started programming years ago.

If anyone is looking to get started, check out the book "Elixir in Action".
It's a great book that brings you up to speed with Elixir, OTP, process
handling and a little bit of the inner workings.

------
melling
Elixir doesn't get much attention. I'm keeping a list of links for languages
that I'd eventually like to learn, which includes Elixir:

[https://github.com/melling/ComputerLanguages/blob/master/eli...](https://github.com/melling/ComputerLanguages/blob/master/elixir.org)

~~~
bpicolo
It's been getting a massive amount of attention lately, really. Well deserved
too. Phoenix is the sanest out-of-box web framework I've experienced.

~~~
melling
How's the editor support? I'm open to any editor or IDE.

~~~
supernintendo
I use Emacs with Alchemist [1] and I love it. The integration of common tasks
like module reloading, testing and inline evaluation has increased my
productivity when writing Elixir. Code completion, Phoenix support and the
ability to quickly jump to function definitions within a project are all nice
features as well.

[1]
[https://github.com/tonini/alchemist.el](https://github.com/tonini/alchemist.el)

------
andy_ppp
Can people _please_ stop telling everyone about Elixir, at least until my
startup is built in it.

[http://www.paulgraham.com/avg.html](http://www.paulgraham.com/avg.html)

~~~
MichaelGG
It'd be great if: A, he could show a demo of what he means, like what kind of
features were so much easier to build; and B, what does YC's data show?
Certainly they have tons of data, and simply knowing the basic tech choices
of all those companies would say something. Were people using better
languages more successful? Did they, I dunno, have a higher demo-day ratio?

Seems a waste not to look at those things. (My guess: it is probably totally
outweighed by luck, assuming better-language users are as driven as, say, JS
or PHP users.)

~~~
andy_ppp
Oh, it's definitely not true that the language will affect your startup much
in terms of the application itself, but the right language choice _will_
affect the type of people you attract and the philosophy of the company :-)

------
renox
I know nothing about Elixir, but I found this curious: "You can also see an
extra tiny optimization performed by the regex engine — notice that the final
string uses the ampersand from the original string, rather than from the
replacement string". OK, but why does it need 4 pointers instead of 3, with
the beginning slice's length incremented (to 7 bytes)?

~~~
jerf
Probably because that optimization involved a subroutine call to "something
expecting to return an iolist", and that subroutine is what noticed it could
reuse the ampersand, then tacked on the replacement, then returned the iolist
["&", "amp;"], which the calling routine tacked on to its running list of the
result, which was the final return.

That's part of what makes iolists nice to work with in the face of immutable
strings in a strict language: you can make subroutine calls that may return a
string or an arbitrarily nested list of strings, and you simply write code
that unconditionally gathers whatever was returned into a list, instead of
sitting there appending all the time. The common "concatenation" operation
becomes a list consing instead of a laborious copy.
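The same shape can be sketched in Python (the function names here are
invented): a "replacement" subroutine returns a nested list that reuses
slices of the original string, and the caller flattens everything exactly
once at the end.

```python
def replace_amp(s):
    """Return an iolist for escaping the first '&' in s.  The '&' element
    is a slice of the original string, mirroring the reuse the article
    observed in BEAM's regex engine."""
    i = s.index("&")
    return [s[:i], [s[i:i + 1], "amp;"], s[i + 1:]]

def flatten(iolist, out=None):
    """Gather a string or arbitrarily nested list of strings into a flat
    list -- the single pass that replaces repeated concatenation."""
    out = [] if out is None else out
    if isinstance(iolist, str):
        out.append(iolist)
    else:
        for part in iolist:
            flatten(part, out)
    return out

result = "".join(flatten(replace_amp("fish & chips")))
```

In BEAM the flat list would go straight to writev rather than being joined,
so even the final copy is avoided.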

------
gnufied
I love Elixir very much, but most Unixy languages have access to `writev`.
Yes, it is not the default for most IO, but programmers can typically choose
it.

I have done some benchmarks of `writev` versus concatenating smaller strings
with memcpy before sending them to the network stream (to avoid kernel
context switches), and the difference is pretty small.
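For what it's worth, Python exposes this as os.writev (POSIX only); a minimal
sketch of handing several buffers to the kernel in one syscall, demonstrated
through a pipe:

```python
import os

# Several separate buffers, written without concatenating them first.
buffers = [b"DOOM " * 4, b"and ", b"gloom\n"]

r, w = os.pipe()
try:
    written = os.writev(w, buffers)   # one syscall, no big memcpy
    data = os.read(r, written)
finally:
    os.close(r)
    os.close(w)

assert data == b"".join(buffers)      # same bytes as the memcpy approach
```

The benchmark question is exactly this trade: one gather-write syscall versus
one copy into a contiguous buffer followed by a plain write.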

------
twotwotwo
As a curiosity, a Linux iovec seems to have the same layout as an array of Go
strings (a bunch of (pointer, length) pairs). Looks like it'd be
straightforward to implement a `func WriteStrings` taking a pointer to os.File
and a []string that, on Linux, only makes one syscall.

Or you could make your own struct of byte pointers and lengths to make an
iovec you can fill with content from []byte's, with the caveat that the caller
needs to leave those bytes untouched between when it adds them to the vec and
the actual writev. To mix byte and string output, you probably need to use
unsafe.

I don't see a way to tie writev in with the stdlib text/template libraries,
which stick to plain io.Reader and Writer, without forking them.

Curious how writev compares to the usual approach of just pointing
text/template at a *bufio.Writer (which also passes the "generate 40GB of the
word DOOM and don't crash" test).

Regardless of the specifics, BEAM and the ecosystem around it are one of the
more interesting places to crib ideas from, in that it's pretty different from
a lot of stuff out there but has shown some value and longevity in production.

~~~
jerf
It would not be safe to provide a Go io.Writer that wraps some file
descriptor and backs onto writev. The text of the io.Writer interface forbids
the io.Writer from modifying the passed-in []byte, which strongly implies that
the caller is still considered to own it. Also, the fact that I said
"implies" is awfully scary too. You can't buffer things easily because you
can't reliably know when the last .Write is called, and all the solutions I
can come up with would annihilate any conceivable performance advantage (i.e.,
if you're communicating across to some other goroutine, you just lost the
performance). Especially since, if you don't write right away, you almost
certainly must take a copy to be safe, and, oops, there went all the
performance advantage again. The need to take a copy also prevents you from
just declaring the writer has to be flushable; it's legal for something using
a Writer to mutate a buffer and pass chunks of it in multiple times, and I'm
pretty sure I've already got code that does that. There's no point in
buffering up several slices if the caller is going to mutate them between
.Write calls.

What you could do is create an object that implements, say, interface{
Multiwrite([][]byte) (int, error) }, and that also implements io.Writer as a
fallback in terms of the Multiwrite, so the object is generally useful. But
that still gets you no performance win for text/template unless text/template
is modified to use this new interface.

However, that does seem like at least a possibility; there are, for instance,
already optimizations in net/http for serving up files via the Linux sendfile
kernel call, and this would be pretty similar. Providing implementations of
Multiwrite for a file and a socket might not be a tough sell; complexifying
text/template to actually use it in the core library might be, though. It
depends on what kind of performance numbers you can show.

~~~
twotwotwo
Yeah, writev seems different enough from write to merit a different Go
interface -- hence the hypothetical WriteStrings method, for example, instead
of building around Writer.

Whether it'd be worth the futzing to get templating with less copying depends
on whether you can measure much practical benefit over bytes.Buffer, I guess.
It takes round tuits to see.

Mostly I just thought writev was neat; it seems cool that it's theoretically
possible to write Go code using it too.

------
nebulous1
Just to note, he's not actually right that JavaScript deals with string
literals inside closures in the same way (JS will reuse the same memory for
all instances of "Foobar" in his map example). I'm pretty sure he just means
that the general concept of per-closure-instance allocations will be
familiar, though, and that is correct.

------
notbzzcompliant
All the capabilities the author points to are part of BEAM, which has nothing
to do with Elixir. You could write the same code in LFE and it'll perform in
the same way.

