
Hot-swapping Python 3 code - metahost
https://github.com/say4n/hotreload
======
orf
Hot reloading is next to impossible to do safely in the general case in
Python, and this library is really basic and doesn’t really do anything
special.

The only case where it does work is if you maintain a single reference to the
module you wish to reload, _and_ the target module doesn’t load any C
extensions, _and_ it doesn’t create state outside its module (very easy to do
accidentally).

That’s why every autoreloader implementation restarts the whole process when
any change is made, because how would you even detect that it was safe to hot
reload something in the general case let alone actually hot reload it safely?

If you need to add an autoreloader to your project use Hupper[1]. If you’re
weird like me and you’re interested in autoreloaders, check out the Django[2]
and Flask[3] implementations or my talk[4] for more details.

1\. [https://pypi.org/project/hupper/](https://pypi.org/project/hupper/)

2\.
[https://github.com/django/django/blob/master/django/utils/au...](https://github.com/django/django/blob/master/django/utils/autoreload.py)

3\.
[https://github.com/pallets/werkzeug/blob/master/src/werkzeug...](https://github.com/pallets/werkzeug/blob/master/src/werkzeug/_reloader.py)

4\. [https://youtu.be/IghyoR6ld60](https://youtu.be/IghyoR6ld60)

~~~
lunixbochs
I got it working well in a large project but I basically designed auto
reloading into deep parts of the framework. It's not something you can safely
bolt on, and I'm honestly not sure I recommend the sanity impact of trying to
build it properly (e.g. ran into a bunch of memory leaking even once I had it
working safely). But there are some kinds of programs you can't really restart
on every change.

The biggest caveat of these kinds of single file reloaders is they completely
fall apart when you import between your modules. For example, if `a` does
`from b import obj` and `b` does `from c import obj`, and you modify `c.py`,
`a` and `b` will both have a stale reference to obj (which might be a
primitive or even interned type like a string or an int, so it's not like you
can mutate it as a solution).
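
A minimal sketch of that stale-reference problem, assuming two throwaway modules written to a temp directory (the names `stale_b` and `stale_c` are made up for illustration). Only `stale_c` is reloaded, which is roughly what a single-file reloader would do:

```python
import importlib
import pathlib
import sys
import tempfile

# Assumption: skip .pyc caching so a quick rewrite of the source is always seen.
sys.dont_write_bytecode = True


def demo_stale_import():
    """Return (value seen via stale_c, value seen via stale_b) after reloading stale_c."""
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        (root / "stale_c.py").write_text("obj = 'old'\n")
        (root / "stale_b.py").write_text("from stale_c import obj\n")
        sys.path.insert(0, tmp)
        try:
            import stale_b
            import stale_c

            # Edit c.py on disk and reload only that module.
            (root / "stale_c.py").write_text("obj = 'new'\n")
            importlib.invalidate_caches()
            importlib.reload(stale_c)

            # stale_b still holds the name it bound at import time.
            return stale_c.obj, stale_b.obj
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("stale_b", None)
            sys.modules.pop("stale_c", None)
```

`demo_stale_import()` returns `('new', 'old')`: the reloaded module sees the new string, while the importer keeps the old one until it is reloaded too.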

~~~
orf
I updated my comment to say “impossible in the general case” which is more
accurate, but I’d be interested in knowing more about that project - how big
was it, and how did you handle some of the tricky bits?

Like, module A creates a class/function that is consumed by module B and C. If
you change module A, how did you know to also reload B and C (recursively)?

~~~
lunixbochs
This is for a scriptable desktop accessibility program. It is fairly big, is
very actively used by end users, and many users have >80 modules and edit them
_constantly_ during daily use. So the reloading system isn't just commonly
used, it's a critical path of using the app. (I also have not gotten any
complaints related to the module reloading in ~months, when I last fixed a bug
in the long dependency chain tracking, and reloading hasn't caused any major
problems in general).

I slightly break the mold of conventional Python imports - script folders
don't have a single entry point, the entire user script directory tree is
recursively imported (in alphabetical order) and autoreloaded on change.

Anytime I call into a module it is marked as the active module in its thread
(a sort of call stack but at the module level and only for certain kinds of
events).

"Call into" generally means the module is being imported, or I'm calling a
callback that was registered by the module.

Anytime code creates a global object (such as opening a window, registering
new voice commands, or registering a callback for some kind of event), cleanup
for the object is registered on the active module. Additionally, all imports
are tracked, so I have a graph of which modules import which other modules.

The import tracking allows me to reload dependent module chains (even long
chains, trees, or import loops) at the correct time. The object tracking
allows me to gracefully replace a module without it requiring any explicit
cleanup.

This graph extends to more than just Python files. There are special functions
for opening or reading files that mark those files as a module's dependencies
as well, so you can modify something like a CSV and the script that uses it
will get reloaded (and any module chains that depend on that script).
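
A rough sketch of what the import tracking could look like, not the actual implementation from that program. `ImportGraph`, `record_import` and `reload_order` are hypothetical names; the idea is only to record who imports whom and then walk the dependents of a changed module so each one is reloaded after the things it imports:

```python
from collections import defaultdict


class ImportGraph:
    """Tracks which modules import which, to decide what to reload (illustrative only)."""

    def __init__(self):
        # Maps a module name to the set of modules that import it.
        self._importers = defaultdict(set)

    def record_import(self, importer, imported):
        self._importers[imported].add(importer)

    def reload_order(self, changed):
        """Return `changed` plus all transitive importers, dependencies first."""
        order = []
        visited = set()

        def visit(name):
            if name in visited:  # also stops recursion on import loops
                return
            visited.add(name)
            for importer in sorted(self._importers.get(name, ())):
                visit(importer)
            order.append(name)  # post-order: importers are appended before `name`

        visit(changed)
        order.reverse()  # so `changed` comes first and its importers follow
        return order


graph = ImportGraph()
graph.record_import("a", "b")  # a imports b
graph.record_import("b", "c")  # b imports c
graph.record_import("a", "c")  # a also imports c directly
```

With this graph, `graph.reload_order("c")` gives `['c', 'b', 'a']`. For an import loop the order is only best effort, since no strict ordering exists.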

This all working well is tightly coupled to how the framework behaves. Scripts
won't generally spawn threads or spin in long loops, they'll mostly register
for event callbacks. For example, instead of a script spawning a thread or
sleeping if it wants to run code later, there's a cron system that works
something like javascript setTimeout/setInterval. Instead of storing something
in a global, you can put it in the persistent storage layer if you want it to
survive reloads (and restarts).

You can definitely do a few things that aren't reloadable, like spawning your
own thread, but most things you want to do end up being reloadable, or have an
equivalent or alternative API that is reloadable.

There are a few other nice touches and APIs that help glue all of this
together. Generally it means you can write your scripts as short, flat, mostly
declarative Python (with perhaps a few event handlers and setup code that
don't need to go in a __main__ guard) and not worry about common cases for
reloading. (There are also subsystems specifically watching out for edge
cases, such as a user script running an infinite loop)

Honestly the most annoying part of all of this (that I do generally handle
now) was reliably monitoring file changes when users put file or directory
symlinks at arbitrary places in the tree (which it turns out a lot of users
like to do).

------
BiteCode_dev
Because of the nature of Python, using importlib.reload() is going to lead to
terrible bugs at some point. Use importlib.reload() only in the shell, for
convenience.

If you want autoreload for dev (e.g: like django dev server), better restart
the entire process.

There is a 7-year-old project that can help you with that called watchdog:

    
    
        https://pythonhosted.org/watchdog/
    

It's a library that uses the fastest way available (e.g: inotify,kqueue...) to
react to file changes (you can filter by types such as create, delete, write,
etc) and triggers whatever code you want, passing you the metadata.

If you just want something very simple, you don't have to code, you can
install watchdog with its optional watchmedo dependency:

    
    
        pip install watchdog[watchmedo]
    

It will give you the watchmedo command, that lets you do something like this:

    
    
        watchmedo shell-command --patterns="*.py" --recursive \
        --command='python your_script_to_rerun.py' .

~~~
X-Istence
If you are working on code that runs a web server or some other service that
is generally a long running application, take a look at hupper:
[https://docs.pylonsproject.org/projects/hupper/en/latest/](https://docs.pylonsproject.org/projects/hupper/en/latest/)

It is built specifically to restart things like servers as it was implemented
for the Pylons Project (Pyramid web framework).

~~~
BiteCode_dev
Can hupper restart on changes to non-Python files?

~~~
X-Istence
There is an API to pass it non-python files (so that alongside your Python app
it can watch your configuration files for instance/HTML files).

However it does not work for just watching a directory + running a command, it
is specifically for watching Python code.

------
sillysaurusx
Does this "just work"? How does it handle classes?

I've been writing my python code in a style that can be hotswapped. Roughly, I
use global functions and pass around structs, like C. The function calls are
made with `api.foo(data)` instead of `data.foo()`. Then I can just replace
`api.foo`'s definition with whatever I want after attaching with a debugger.
Example:
[https://github.com/shawwn/gpt-2/blob/e4868225382e8a475ef3c3d...](https://github.com/shawwn/gpt-2/blob/e4868225382e8a475ef3c3d6cfdc3192aba0f2b4/train_multi.py#L475-L499)
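
A stripped-down sketch of that style, not taken from the linked file. `api`, `greet_v1`, `greet_v2` and `run_once` are hypothetical names, and a `SimpleNamespace` stands in for a real module:

```python
import types

# Stand-in for a module of plain functions.
api = types.SimpleNamespace()


def greet_v1(data):
    return "hello " + data["name"]


def greet_v2(data):
    return "hi there, " + data["name"]


api.greet = greet_v1


def run_once(data):
    # The function is looked up on `api` at call time, so a replacement
    # assigned to api.greet is picked up on the next call.
    return api.greet(data)


def demo_swap():
    data = {"name": "world"}  # plain struct-like data, no methods
    before = run_once(data)
    api.greet = greet_v2  # the "hot swap": rebind the name on the namespace
    after = run_once(data)
    return before, after
```

`demo_swap()` returns `('hello world', 'hi there, world')`. This only works because callers go through `api.greet` each time; anything that did `greet = api.greet` once would keep the old function.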

I assume there's a much better way to do this, but the class system seems to
make it hard to swap things out. Doesn't "isinstance" break if you redefine
the class?

EDIT: Oh, this isn't actual hot-reloading for python. Darn. I'm really hoping
someone will tackle that... Cool project though!

EDIT2: Actually, this project _does_ seem to reload the modules:
[https://github.com/say4n/hotreload/blob/0cf3a0b466466f99f8c7...](https://github.com/say4n/hotreload/blob/0cf3a0b466466f99f8c7ccdb546fcef4553617f9/hotreload/reloader.py#L47-L48)
... so the original question stands.

By the way, you may want to use `import traceback; traceback.print_exc()` to
log exceptions. I'm not sure how that plays with python's `logger` class
though.

~~~
oefrha
importlib.reload does not "just work".
[https://docs.python.org/3/library/importlib.html#importlib.r...](https://docs.python.org/3/library/importlib.html#importlib.reload)
documents what is done and not done.

> Doesn't "isinstance" break if you redefine the class?

With importlib.reload, yes.
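
A small sketch showing the breakage, assuming a throwaway module (`reload_demo_mod` is a made-up name) written to a temp directory. The source is not even edited between import and reload:

```python
import importlib
import pathlib
import sys
import tempfile


def demo_isinstance_after_reload():
    """Return (isinstance against old class, isinstance against reloaded class)."""
    with tempfile.TemporaryDirectory() as tmp:
        (pathlib.Path(tmp) / "reload_demo_mod.py").write_text("class A:\n    pass\n")
        sys.path.insert(0, tmp)
        try:
            import reload_demo_mod

            old_class = reload_demo_mod.A
            old_instance = old_class()

            # Re-executing the module body creates a brand new class object
            # and rebinds reload_demo_mod.A to it.
            importlib.invalidate_caches()
            importlib.reload(reload_demo_mod)

            return (
                isinstance(old_instance, old_class),
                isinstance(old_instance, reload_demo_mod.A),
            )
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("reload_demo_mod", None)
```

`demo_isinstance_after_reload()` returns `(True, False)`: the existing instance still belongs to the old class object, which the module no longer exposes.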

You might be interested in IPython's autoreload implementation, which does not
break isinstance:
[https://github.com/ipython/ipython/blob/f8c9ea7db42d9830f163...](https://github.com/ipython/ipython/blob/f8c9ea7db42d9830f16318a4ceca0ac1c3688697/IPython/extensions/autoreload.py#L253-L417)
However, you could still end up with semi-broken instances, for instance if
you update

    
    
      class A:
        pass
    

to

    
    
      class A:
        def __init__(self):
          self.attr = "something"
    
        def __str__(self):
          return self.attr
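
A simplified imitation of that situation, not IPython's actual code. `patch_class_in_place` is a hypothetical helper that copies the new methods onto the old class object, which keeps `isinstance` working but leaves existing instances without the attributes the new `__init__` would have set:

```python
import inspect


class A:
    pass


class NewA:
    """Stands in for the edited definition of A."""

    def __init__(self):
        self.attr = "something"

    def __str__(self):
        return self.attr


def patch_class_in_place(old_cls, new_cls):
    # Copy only plain functions (methods); the old class object keeps its identity.
    for name, value in vars(new_cls).items():
        if inspect.isfunction(value):
            setattr(old_cls, name, value)


def demo_semi_broken_instance():
    old_instance = A()  # created before the "edit", so __init__ never set attr
    patch_class_in_place(A, NewA)
    still_instance = isinstance(old_instance, A)
    try:
        str(old_instance)
        str_failed = False
    except AttributeError:
        str_failed = True
    return still_instance, str_failed, str(A())
```

`demo_semi_broken_instance()` returns `(True, True, 'something')`: the old instance passes `isinstance` but fails in `__str__`, while a freshly created instance works.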
    

> By the way, you may want to use `import traceback; traceback.print_exc()` to
> log exceptions. I'm not sure how that plays with python's `logger` class
> though.

The standard library's logging module has built-in integration with exception
tracebacks. See Logger.exception [1] or the exc_info keyword param to
Logger.debug/info/warning/error/critical/log [2].
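
A short sketch of both options. The logger name `hotreload_example` and the messages are made up, and the output is captured in a `StringIO` only so it can be inspected:

```python
import io
import logging


def log_failure():
    """Log one exception twice, via Logger.exception and via exc_info=True."""
    stream = io.StringIO()
    logger = logging.getLogger("hotreload_example")  # hypothetical logger name
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output out of the root logger
    try:
        try:
            1 / 0
        except ZeroDivisionError:
            # Logs at ERROR level and appends the current traceback.
            logger.exception("reload failed")
            # Any level can carry the traceback with exc_info=True.
            logger.warning("reload failed (warning level)", exc_info=True)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

Both calls append the full traceback to the message, so the captured text contains two `Traceback (most recent call last):` sections ending in `ZeroDivisionError: division by zero`.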

[1]
[https://docs.python.org/3/library/logging.html#logging.Logge...](https://docs.python.org/3/library/logging.html#logging.Logger.exception)

[2]
[https://docs.python.org/3/library/logging.html#logging.Logge...](https://docs.python.org/3/library/logging.html#logging.Logger.debug)

~~~
aidos
I couldn’t live without ipython autoreload. You need to know where the
limitations are that require you to create your instances again. Things like
sqlalchemy model changes run into issues because of the meta class magic (I
think, not looked too deeply).

I feel like this library is a bit misnamed. It seems to be more like
“autorun”.

Edit: here’s the ipython config I use to turn it on
[https://github.com/aidos/dotfiles/blob/master/_config/ipytho...](https://github.com/aidos/dotfiles/blob/master/_config/ipython/profile_default/ipython_config.py)

------
oefrha
For comparison, here’s Werkzeug’s reloader implementation:
[https://github.com/pallets/werkzeug/blob/master/src/werkzeug...](https://github.com/pallets/werkzeug/blob/master/src/werkzeug/_reloader.py)
(when you flask run or call app.run with reloader turned on, it’s handled by
this module.)

~~~
sillysaurusx
Thanks! That `_iter_module_files()` function is handy.

Unfortunately, it looks like it simply restarts Python each time the code
changes:
[https://github.com/pallets/werkzeug/blob/75215972967c1c00c15...](https://github.com/pallets/werkzeug/blob/75215972967c1c00c15b8b5c3c661c2278423360/src/werkzeug/_reloader.py#L156)
which isn't quite the same thing as reloading it.

~~~
the_mitsuhiko
It used to do hot reloading back in the day but unfortunately that doesn’t
really work in Python because of all the import side effects. It’s better to
restart everything.

Werkzeug keeps the socket open so you don’t observe the reload from the
outside.

------
brodul
As others have pointed out, the best way is to restart the process. If you are
looking for hot reloading you should look at programming languages that run in
BEAM like Erlang and Elixir[1].

There are some gotchas with hot swapping, but it's a designed-in feature that
has been used in production for many years [2].

If somebody knows other languages/systems that do let me know.

1\. [https://elixir-lang.org/](https://elixir-lang.org/)

2\.
[https://stackoverflow.com/questions/37368376/how-does-erlang...](https://stackoverflow.com/questions/37368376/how-does-erlang-hot-code-swapping-work-in-the-middle-of-activity)

~~~
dig1
> If somebody knows other languages/systems that do let me know.

Clojure can do this via namespace live reload. ClojureScript is already doing
this inside the _figwheel_ [1] environment. Common Lisp has had this since the
dawn of time. Kawa [2] can do it, but because it aggressively optimizes the
code, you need to be careful. Racket can do it as well [3].

[1] [https://github.com/bhauman/lein-figwheel](https://github.com/bhauman/lein-figwheel)

[2] [https://www.gnu.org/software/kawa/](https://www.gnu.org/software/kawa/)

[3] [https://github.com/tonyg/racket-reloadable/tree/master#readm...](https://github.com/tonyg/racket-reloadable/tree/master#readme)

EDIT: added Kawa and Racket.

~~~
bjourne
The languages you mention can't update data in flight, which is something
Erlang can. Suppose you have the following code running in a thread:

    
    
        def print_name(d):
            print(d['name'])
        d = {'name' : 'hello'}
        while True:
            print_name(d)
    

Now you hot-swap it while the while-loop is running:

    
    
        def print_name(d):
            print('hello', d['first_name'])
        d = {'first_name' : 'there'}
        while True:
            print_name(d)
        

Erlang can do this. No other language that I'm aware of can.

~~~
dig1
Clojure has all sorts of facilities to do that (atoms, agents or refs). You
can even listen for changes [1] and compare old and new values.

[1] [https://clojuredocs.org/clojure.core/add-watch](https://clojuredocs.org/clojure.core/add-watch)

~~~
bjourne
Interesting, I didn't know that. I don't understand how the function you link
to can be used to accomplish that. Can you give me an example?

~~~
dig1
Sure; here is example with atom:

    
    
        ;; define a map
        (def myvar (atom {:foo 1}))
        
        ;; add watcher for 'myvar'; it will call anonymous
        ;; function and set old-state with previous value and new-state
        ;; with new value, when 'myvar' was changed
        (add-watch myvar :dummy 
          (fn [_ _ old-state new-state]
            (println "changed! - old: " old-state " new: " new-state)))
        
        ;; Run a separate thread and update :foo value to 2, after 6 seconds.
        ;; No scheduling is necessary, this is so it can be evaluated in REPL easily.
        (future
          (Thread/sleep 6000)
          (swap! myvar assoc :foo 2))
        
        ;; Infinite loop. Sleep for 0.5 seconds just for printing purposes.
        (loop []
          (println @myvar)
          (Thread/sleep 500)
          (recur))
    

and you will see output like this:

    
    
        {:foo 1}
        {:foo 1}
        {:foo 1}
        {:foo 1}
        {:foo 1}
        {:foo 1}
        {:foo 1}
        {:foo 1}
        changed! - old:  {:foo 1}  new:  {:foo 2}
        {:foo 2}
        {:foo 2}
        {:foo 2}
        {:foo 2}
        {:foo 2}
        {:foo 2}

~~~
bjourne
That is an example of thread messaging - not hot-swapping. Had myvar been
local to the thread it wouldn't have worked.

~~~
dig1
I used different thread just for printing/add-watch purposes.

But using myvar or an example like yours (hot loading a function) in the same
thread works without any problems as well - put the function in a loop, change
the function body and reload the namespace where myvar/function exists - the
loop will pick that up instantly.

~~~
bjourne
In your example, you modified the assoc while another thread was accessing it.
That is a standard feature of languages with threading support and is not
impressive.

Let me give you a clearer example than my first one:

    
    
        import threading

        def fn():
            d = {'name' : 'hello'}
            while True:
                print(d['name'])
        # start() runs fn in a new thread; run() would call it in the current one
        threading.Thread(target=fn).start()
    

This causes a thread to be started which prints "hello" indefinitely. Clojure
cannot, while the thread is running, change the execution to, say:

    
    
        def fn():
            d = {'name' : 'there'}
            while True:
                print(d['name'])
                time.sleep(0.5)
    

Clojure cannot change the value of the local variable d because it has no
reference to it. Clojure cannot add the line time.sleep(0.5) to the loop
because it has no facility for "upgrading" running threads.

~~~
fulafel
Reading up on the Erlang system, it doesn't upgrade function-local state this
way either. It relies on a convention where this state is stored externally to
the function by the gen_server framework and passed in, right?

(going by
[https://stackoverflow.com/questions/1840717/achieving-code-s...](https://stackoverflow.com/questions/1840717/achieving-code-swapping-in-erlangs-gen-server))

~~~
bjourne
Yes and no. Like everything in Erlang, hot-swapping relies on code being
structured as servers that respond to messages. In this case, the VM sends a
message to the process telling it about the code upgrade. The process has to
handle this message "manually" and do what it needs with it. For example,
changing its internal state or updating database schema if it uses external
storage.

But you don't have to use the gen_server framework to accomplish this. It's
perfectly fine (but not recommended as there are lots of details you want to
get right) to write your own code for handling hot-swapping. See
[http://erlang.org/documentation/doc-4.9.1/doc/design_princip...](http://erlang.org/documentation/doc-4.9.1/doc/design_principles/gen_server.html)
for some details on server principles.

Immutability in combination with Erlang's process centric view is what makes
it possible.
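
A loose Python analogy of that convention, not Erlang semantics and not something Python gives you for free. A worker keeps its state outside the handler function and treats an "upgrade" message as the point where both the handler and the state shape are swapped. All names here (`worker`, `handle_v1`, `handle_v2`, `migrate_v1_to_v2`) are hypothetical:

```python
import queue
import threading


def handle_v1(state, msg):
    # Version 1 state: a plain message counter.
    return state + 1


def handle_v2(state, msg):
    # Version 2 state: a dict that also remembers the last message.
    return {"count": state["count"] + 1, "last": msg}


def migrate_v1_to_v2(state):
    return {"count": state, "last": None}


def worker(inbox, outbox, handler, state):
    while True:
        kind, payload = inbox.get()
        if kind == "stop":
            outbox.put(state)
            return
        if kind == "upgrade":
            # The process handles the upgrade itself: new code plus state migration.
            handler, migrate = payload
            state = migrate(state)
        else:
            state = handler(state, payload)


def demo_upgrade():
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox, handle_v1, 0))
    thread.start()
    for message in [
        ("msg", "a"),
        ("msg", "b"),
        ("upgrade", (handle_v2, migrate_v1_to_v2)),
        ("msg", "c"),
        ("stop", None),
    ]:
        inbox.put(message)
    thread.join()
    return outbox.get()
```

`demo_upgrade()` returns `{'count': 3, 'last': 'c'}`: two messages were counted by the old handler, the state was migrated, and the third was handled by the new one without restarting the thread.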

~~~
fulafel
Thanks for the explanation.

There's an idea in there for a Clojure library :)

------
david_draco
You can do something like this on Linux:

    
    
        inotifywait -m myfile.py -e 'CLOSE_WRITE' | 
        while read line; do python3 myfile.py; done
    

It will be more responsive than a busy while-sleep loop.

------
pronoyc
How is this different from using Python Watchdog?

[https://pythonhosted.org/watchdog/](https://pythonhosted.org/watchdog/)

~~~
erikschoster
Looks like it starts a thread that repeatedly hashes the content of the target
script to poll for changes. Watchdog uses OS features like inotify and
FSEvents and only uses polling and comparing hashes as a fallback.
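
For reference, the hash-polling approach boils down to something like the sketch below. This is an illustration of the technique, not the library's code; `content_digest` and `has_changed` are made-up names and SHA-256 is an arbitrary choice:

```python
import hashlib
import pathlib


def content_digest(path):
    """Hash the full file contents; every poll re-reads the whole file."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()


def has_changed(path, last_digest):
    """Return (changed, current_digest) so the caller can keep the new baseline."""
    current = content_digest(path)
    return current != last_digest, current
```

A poller would call `has_changed` in a loop with a sleep in between, which is the re-reading and re-hashing cost that inotify or FSEvents avoid.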

------
ed25519FUUU
Looks like it uses hashlib to check the file for changes, which is a nice way
to make it cross-platform compatible.

For those who run on linux, you can use the inotify tools to set up a watch on
a file (or directory or recursive directories). inotify uses kernel events and
won't require re-reading and re-hashing the file.

------
hacknat
Don’t hot swap. Just don’t do it. 99.99% of us should never need to do it.
Learn sound network topology and load balancer practices instead. Embedded?
Your hardware documentation will tell you how to offload this problem to it.
There’s no earthly reason for most of us to be learning this.

------
hathym
having an infinite loop reading the file and calculating its sha checksum
every 1s is not ideal. I'd suggest using the OS primitives to get
notifications when the file is touched (inotify on linux and
ReadDirectoryChangesW on windows), or simply use the watchdog lib

------
glic3rinu
When I add autoreload functionality to my programs I just spawn a thread that
checks for last modified date and then re-executes the program with
os.execlp(sys.argv[0], *sys.argv)

I found my solution superior in several ways:

\- I can force an autoreload just by saving a file (no file changes needed to
force an md5 diff)

\- exec() doesn't suffer from the side effects of not restarting the python
interpreter, clean start every time :)

\- it is also quite portable and doesn't require extra dependencies like
inotify/fswatch/etc.
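
A sketch of that approach under a couple of assumptions: it re-executes through `sys.executable` with `os.execv` rather than the `os.execlp(sys.argv[0], *sys.argv)` call mentioned above (so the script does not need to be executable itself), and `snapshot_mtimes` and `restart_on_change` are made-up names:

```python
import os
import sys
import threading
import time


def snapshot_mtimes(paths):
    """Map each path to its last-modified time in nanoseconds."""
    return {path: os.stat(path).st_mtime_ns for path in paths}


def restart_on_change(paths, interval=1.0):
    """Start a daemon thread that re-executes the program when any path changes."""

    def watch():
        baseline = snapshot_mtimes(paths)
        while True:
            time.sleep(interval)
            if snapshot_mtimes(paths) != baseline:
                # Replace the current process with a fresh interpreter,
                # keeping the original command-line arguments.
                os.execv(sys.executable, [sys.executable] + sys.argv)

    thread = threading.Thread(target=watch, daemon=True)
    thread.start()
    return thread
```

Typical use would be calling `restart_on_change([__file__])` near the top of a script. Because only the modification time is compared, saving a file without editing it is enough to trigger a restart.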

------
jbverschoor
Sorry, but that's not really hot-reloading. It's only reloading a file.

Why do we only get hot-code-replacement and edit-code-and-continue-debugging
from microsoft and sun(oracle)?

~~~
julvo
Have a look at
[https://github.com/julvo/reloading](https://github.com/julvo/reloading), it's
a little library I wrote that can give you edit-code-and-continue-debugging in
Python loops. A more general solution for this sort of development in Python
would be neat though.

~~~
jbverschoor
So this should be posted instead of the OP

------
sfgweilr4f
Not keen on hashing the file every second but I understand why... I'd rather
it used filesystem modification events and responded accordingly. But that
might be more complex than what you intend.

But for simple scripts being live-coded that need to be quickly "up" and hot-
swapped on rapid modification this looks fine. I could always increase the
delay to a minute or hour or more if the files were only occasionally
changed.

------
pmontra
By the way, we had this discussion about Erlang's hot reloading five years
ago.
[https://news.ycombinator.com/item?id=10669131](https://news.ycombinator.com/item?id=10669131)

Erlang runs on a VM designed for hot reloading so the comparison is somewhat
unfair to Python which was not designed for that.

------
underdeserver
For those using Bazel, there's ibazel:

[https://github.com/bazelbuild/bazel-watcher](https://github.com/bazelbuild/bazel-watcher)

It's great, I use it all the time, especially with unit tests.

------
xiaodai
This is like Revise.jl in Julia

------
adenozine
Wait, this is actually just reloading a file upon filesystem changes?

That's not hot-swapping at all.

How grimy.

------
redis_mlc
You can use netfilter/iptables on linux to switch code behind a listening
port. I'd rather do that than a busy-wait utility, etc.

~~~
dividuum
How so? I assume adding -j DROP temporarily while the code restarts? Or is
there any other trick?

------
raymondh
This seems like an anti-pattern to me.

------
ranman
Why was this upvoted? This does nothing with “hot swapping”...

