
Hot code reloading with Erlang - kansi
https://medium.com/@kansi/hot-code-loading-with-erlang-and-rebar3-8252af16605b#.985ei76fw
======
skrebbel
I wonder what HN's devops people think about this wrt the current trend of
containers and immutable infrastructure. Hot code reloading seems to be
directly at odds with the idea of immutable infrastructure, because essentially
the application code becomes state. So your container becomes stateful,
instead of you swapping out your old appserver container for a new one.

What's your opinion? Ditch Docker and put the Erlang VM on the host OS? Ditch
hot code loading and swap containers the usual way? Some middle ground?

~~~
greenleafjacob
Hot code reloading is always more work than just blue-green and should be
avoided if possible. For example, the author of Learn You Some Erlang writes [1]:

> if you can avoid the whole procedure (which will be called relup from now
> on) and do simple rolling upgrades by restarting VMs and booting new
> applications, I would recommend you do so.

Erlang grew out of challenges faced by the telecoms industry, such as: what do
you do when blue-green isn't an option? Think of an in-use packet switch that is
the only point of contact between two networks. There is no way to take the
switch down for maintenance without _some_ interruption in service, which gets
messy when dealing with timeouts. In his thesis, Armstrong gives another example
[2]:

> Usually in a sequential system, if we wish to change the code, we stop the
> system, change the code and re-start the program. In certain real-time
> control systems, we might never be able to turn off the system in order to
> change the code and so these systems have to be designed so that the code
> can be changed without stopping the system. An example of such a system is
> the X2000 satellite control system developed by NASA.

This power comes at a cost, though. LYSE again:

> It is said that divisions of Ericsson that do use relups spend as much time
> testing them as they do testing their applications themselves. They are a
> tool to be used when working with products that can imperatively never be
> shut down.

The point being, hot code reloading is an additional feature that can come in
handy but for most of HN's audience probably won't be relevant; the cost
outweighs the benefits of just blue-green deploying it.

[1] http://learnyousomeerlang.com/relups#the-hiccups-of-appups-and-relups

[2] http://www.erlang.org/download/armstrong_thesis_2003.pdf

------
Fuddh
Attended a talk by one of the creators of Erlang a couple of weeks ago. Very
passionate about achieving maximum uptime for applications written in his
language. This is one of the features that makes that possible... Fascinating
stuff.

------
stevegh
Interesting. The hot code reloading functionality in Erlang led me to
investigate Ruby (my preferred dev language) a bit more.

You can do a hot code load in Ruby using the Kernel#load() call. It won't
alter functionality currently on the call stack, but it will change the
functionality of everything not on the call stack. With some sympathetic
design, you can achieve hot code loading for high availability in Ruby.
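
A minimal sketch of the call-stack behaviour described above, not a production reloader. The file names, the `greet` method and the `$reload_log` global are all hypothetical and exist only for this illustration. A method reloads its own definition while it is still running: the active call finishes with the old body, and the next call picks up the new one.

```ruby
require "tmpdir"

# Hypothetical global used only to record which version of the code ran.
$reload_log = []

Dir.mktmpdir do |dir|
  v1 = File.join(dir, "greeter_v1.rb")
  v2 = File.join(dir, "greeter_v2.rb")

  # Version 1 reloads the method definition halfway through its own body.
  File.write(v1, <<~'RUBY')
    def greet(reload_path)
      $reload_log << "v1: start"
      load(reload_path)           # redefines greet while this call is on the stack
      $reload_log << "v1: end"    # the active frame keeps running the old body
    end
  RUBY

  # Version 2 is the replacement definition.
  File.write(v2, <<~'RUBY')
    def greet(reload_path)
      $reload_log << "v2"
    end
  RUBY

  load(v1)
  greet(v2)   # runs the v1 body to completion, swapping the definition midway
  greet(v2)   # a fresh call dispatches to the v2 definition
end

puts $reload_log
```

The log ends up as "v1: start", "v1: end", "v2". Unlike Erlang, Ruby gives no notion of "old" and "current" versions that can be inspected or purged, so any long-running loop has to be designed to re-dispatch through a method call if it is ever to pick up new code.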

~~~
pmontra
You can use that to replace code by monkey patching:

    
    
        $ cat hi.rb 
        def method
          puts "hi"
        end
        method
        load("hello.rb")
        method
    
        $ cat hello.rb 
        def method
          puts "hello"
        end
    
        $ ruby hi.rb 
        hi
        hello
    

You must engineer your application to execute the load method and that's it.
However I wonder if this is really equivalent to what Erlang does. I remember
http://rvirding.blogspot.it/2008/01/virdings-first-rule-of-programming.html

~~~
pmontra
Interesting post at http://blog.rkh.im/code-reloading

------
amelius
> Hot code loading is the art of replacing an engine from a running car
> without having to stop it.

Except you can clone the car into a controlled environment, and test the whole
procedure, before doing the actual replacing.

------
guiomie
This is cool. In what type of scenarios could you not afford a few seconds of
downtime on a server? For example, why not simply remove a machine from the
cluster/nlb and upgrade it, then add it back ...?

~~~
yetihehe
When your server has several gigs of state. It's VERY useful on a dev server.
Instead of waiting several minutes for reload, I just load in new code
manually (typically I change only 1-2 files per reload). If something breaks -
hey, it's only a dev server. Erlang's other feature - almost everything works
alone - helps with this. If something breaks, it breaks only in one place, so
most of the time I only need to make small changes and reload once more. Rest
of the system does what it needs without any downgrades.

~~~
simoncion
> Instead of waiting several minutes for reload, I just load in new code
> manually (typically I change only 1-2 files per reload). If something breaks
> - hey, it's only dev server.

Someone wrote a module for Elixir that uses inotify (and similar) to -I think-
watch .beam files for modification and perform the required hot-reloads
automatically.

I would be reluctant to run this in production, and I can see situations (even
in development) where this could trigger unwanted code purging and would be
disastrous, but it's a pretty neat thing to have and -it seems- a must for Web
Dev people.

~~~
toast0
This would be terrible in production -- often there's an order in which you
need to load the beam files, and I wouldn't want to add that to the compile
step; it's very simple to load in the proper order. You could pretty easily use
code:soft_purge/1 prior to loading to avoid killing lingering processes
though, and then it would probably be reasonable for development.

~~~
simoncion
> This would be terrible in production...

Yeah it could be. Frankly, I'd likely reach for Erlang Releases before I
reached for this when updating software in production.

However, for a _large_ variety of dev work, this automatic module reloading
thingie works pretty well. :)

> You could pretty easily use code:soft_purge/1 prior to loading to avoid
> killing lingering processes though...

Mmm. Okay. So, I'm not 100% on how this works, so please bear with me and my
inaccurate terminology. :(

In any given Erlang system, there can be two versions of a module running, the
"current" one, and the "old" one, right?

So, if you call code:soft_purge/1 when there is no "old" code loaded, it
should return true, yes? (In addition to returning true when there's no
process running the "old" code.) [0]

So, would this be a way to write an auto-loader that doesn't purge in-use
code?

* code:soft_purge(?MODULE)

* if false, wait a while then retry

* if true, code:load_file(?MODULE)

I guess maybe you'd want to build up a list of all the modules that have been
modified, and wait until code:soft_purge/1 returns true for all of them before
loading the modules. (maybe.)

You also -obviously- want an override that allows for the purging of in-use
code.

[0] Testing _indicates_ that it does, but it's often good to double-check. :)

~~~
toast0
Yes, you've got the concepts and implementation correct.

The exact strategy for reloading (wait for all at once, load whatever is
ready, how long to wait, etc) is left as an exercise for the reader. For dev, I
use a function in the shell that loads everything that changed (no soft
purge); for prod, I have a function that goes in order and checks soft purge,
then loads (if the 2nd module doesn't soft purge, it will have already loaded
the first module, but it will stop before trying the 3rd).

With most things in gen_servers, there's not a lot of opportunity for
lingering code, but sometimes it happens.

------
jgalt212
You know what would be really amazing is if you could restart the Erlang VM,
or load a new VM without interrupting any of the running code modules.

~~~
simoncion
What -exactly- do you want to do when you say you want to restart the Erlang
VM?

I'm asking because I don't have enough context to know why _you_ want to do
what you're asking to do.

~~~
jgalt212
I would assume the longer any VM is running the higher the chances of a
service degrading. I guess this is mostly due to memory leaks or bit rot. I
have no Erlang VM experience, so my comment was geared towards VMs in general.

~~~
simoncion
> I guess this is mostly due to memory leaks or bit rot.

"Bit rot"? The only defense against bit rot that I'm aware of is ECC RAM.

Anyway. AFAIK (and I'm no Erlang expert, so there's probably something
pertinent that I don't know) unless there's a resource leak in core Erlang
code, resource leaks can be fixed by restarting the leaking application, or
killing the leaking process. [0]

[0] Erlang software is often broken up into Applications. [1] An application
is a collection of code with a well-known entry point that (ideally) does a
particular thing. An application can depend on other applications and the
services that they provide. Applications _can_ be started and stopped
independently of all others in the system, but -in order to keep running-
dependent applications need to be designed to handle the temporary absence of
an application that they depend on.

[1] http://learnyousomeerlang.com/building-applications-with-otp

------
Grue3
Seems like a lot of work. In Common Lisp I can just press C-c C-c in SLIME
over the changed function and it goes live.

