In the first place, "we can't compile our code on every change because it takes too long" is a really awful situation to be in. Are developers not building and testing their changes before deploying them? Can Facebook not afford a continuous integration system that can run builds in parallel? It sounds like this problem is only happening because the application is a giant monolith, but for some reason splitting it up would slow down development even more... I'm not sure I buy that reasoning.
The article says that "Haskell’s strict type system means we’re able to confidently push new code knowing that we can’t crash the server", which is a real stretch. In addition to all of the usual ways a computation can diverge, this hot-swapping system adds a whole new variety of failure modes. The article talks about how the code needs to be carefully audited to prevent memory leaks, but it doesn't even mention the weird things that can happen when mutable state is preserved across code modifications. Debugging is a pain when your data structures can get into states that aren't reachable with any single version of the code. (This is a well-known issue in Linux kernel live-patching, for instance.)
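The stale-state hazard isn't specific to Haskell or to Facebook's system. As a minimal stand-in, here's a Python sketch using `importlib.reload` (the `hotmod` module and its `Counter` class are invented for illustration): objects created before a reload keep running the old code, so the process can end up in a combined state that no single version of the code could produce.

```python
# Hedged sketch: Python stand-in for hot-swapping, showing how mutable
# state created before a reload outlives the code that made it.
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale bytecode between rewrites
tmp = pathlib.Path(tempfile.mkdtemp())
sys.path.insert(0, str(tmp))

# Version 1 of the module: bump() steps the counter by 1.
(tmp / "hotmod.py").write_text(
    "class Counter:\n"
    "    def __init__(self):\n"
    "        self.n = 0\n"
    "    def bump(self):\n"
    "        self.n += 1\n"
)
import hotmod
old = hotmod.Counter()
old.bump()

# Version 2: bump() now steps by 10. Reload the module in place.
(tmp / "hotmod.py").write_text(
    "class Counter:\n"
    "    def __init__(self):\n"
    "        self.n = 0\n"
    "    def bump(self):\n"
    "        self.n += 10\n"
)
importlib.reload(hotmod)

new = hotmod.Counter()
new.bump()
old.bump()  # the pre-reload instance still runs version-1 code

print(old.n, new.n)                     # 2 10
print(isinstance(old, hotmod.Counter))  # False: its class is the stale one
```

The `isinstance` check failing is exactly the kind of "unreachable with any single version" weirdness described above: the old instance belongs to a class object that no longer exists in the live module.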
We use a monorepo for all of the benefits it has, and deploying fast business-logic updates this way helps mitigate one of its downsides (particularly when you've maximally parallelized the build). I've found https://danluu.com/monorepo/ gives a quick overview of how chopping up the repo would introduce its own, separate downsides.
The section about "Sticky Shared Objects" speaks directly to mutable state across code modifications, just with a Haskell-minded focus.
> make you more comfortable with this stuff
Which stuff are you referring to?
Overall I'd love it if all builds were significantly faster, so we contribute improvements upstream to GHC as we come across opportunities. Our platform has a deployment SLA that we strive to maintain as our "target build time".
Improving GHC compile times and reducing the binary size would be better, but presumably a lot of work has already gone into those problems and if it were easy someone would have done it by now. As for myself, I really like using Haskell and I'm glad whenever I hear about it being used in industry.
The article describes the hot-swapped module as containing frequently changing business logic, which sounds like it’s something they can probably do via an interface with well-constrained or no mutability.
I wonder why that wasn't an option for facebook.
From this podcast: http://www.haskellcast.com/episode/002-don-stewart-on-real-w...
You're reading a blog post, you do not know all they have tried, nor the various intricacies they're dealing with.
These kinds of designs typically emerge over a long and winding history and, for someone who was part of that process, it's difficult to coherently describe the final state to an outsider. Good textbook authors have this skill. Most tech blog authors do not. (I think that part of the problem is that people don't respect just how difficult it actually is.)
My guess: restarting a large fleet of processes is a pain. The rollout will typically be throttled to avoid connection churn, among other things. For risky code changes, you probably want a slow rollout anyway, but if you're just tweaking abuse detection rules (almost just a config change), it's nice to have your changes take effect more quickly. Dynamic loading seems like one reasonable way to achieve that goal.
Tangent: people, please stop making analogies to mechanical engineering feats that are WAY more difficult than what you did. People have been loading shared libraries forever; it's like adding an AUX port, not swapping out the engine. It's not even in the same league as Ksplice or the JVM's dynamic loading/deoptimization.
Isn't this exactly the problem Go was invented to solve?
In my opinion the time wasted debugging Go issues that could have been statically prevented is better spent waiting for a slightly longer compile cycle to finish.
The Go authors might have designed it with the pain of waiting for C++ builds in mind, but slow compiles were never much of a problem for those using other programming languages.
Erlang (and Elixir) define hotswapping very well. It is a standard way to upgrade code in production in some places. And even with it being well defined it is still very hard and there are enough corner cases to handle.
But when used correctly, it is really magical and can achieve nice properties.
Besides just upgrading code, hotswapping (at least in Erlang) can be used for debugging -- you can update the running code with extra log statements to catch sneaky corner cases. Maybe it is a customer setup that is very hard to replicate.
Or you can use it for local development: as you edit code, the module gets auto-reloaded (with a helper).
It can also be used to deliver hot fixes. Say the fix is simple and the customer cannot wait for a full release to be built; you can update their system on the spot to tide them over. Not ideal, but I've seen it save the day many times.
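To make the hot-fix scenario concrete without pretending to show Erlang's actual release machinery, here's a minimal Python stand-in using `importlib.reload` (the `rates` module and its bug are invented for illustration): the buggy function is patched on disk and reloaded into the running process, no restart required.

```python
# Hedged sketch: shipping a one-line hot fix to a "running" process by
# reloading a module in place (Python stand-in for Erlang code loading).
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale bytecode between rewrites
tmp = pathlib.Path(tempfile.mkdtemp())
sys.path.insert(0, str(tmp))

# Buggy version currently "in production": divides without a guard.
(tmp / "rates.py").write_text(
    "def per_item(total, count):\n"
    "    return total / count\n"
)
import rates

# Hot fix: guard against count == 0, applied without restarting anything.
(tmp / "rates.py").write_text(
    "def per_item(total, count):\n"
    "    return total / count if count else 0.0\n"
)
importlib.reload(rates)

print(rates.per_item(10.0, 0))  # 0.0 after the fix, instead of crashing
```

Erlang's version of this is far more disciplined (old and current code coexist, processes migrate on the next fully-qualified call), but the day-saving effect is the same.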
This is a huge feature for me in my Elixir development. I mostly use Elixir for some server code that manages many connections to external network entities. It would be a huge hassle to bring down my server application every time I want to make a change.
With Elixir (yeah, Erlang), I can normally recompile the module I'm working on and deploy it in the running server. Not only is it a good way to constantly observe Erlang hot-swapping in action on my dev machine, it's a huge time saver.
Edit: As a sibling commenter notes this is most eminently doable with e.g. Common Lisp and BEAM (Erlang/Elixir), but more folks are (publicly) attempting this in other environments now (I've experimented with a number of approaches to this the last few years, so I'm trying to keep score - would love to see any comments on other attempts below).
Quote: "At the core of the redesign is a Dynamic Scripting Platform which provides us the ability to inject code into a running Java application at any time. This means we can alter the behavior of the application without a full scale deployment."
Clojure as well right?
You'd be surprised how many languages can do this. Though it's hard to beat lisp (and Erlang), where it is the default.
Loading / unloading code is straightforward. The trick is in getting the code called from existing code.
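That last point is the crux. A small Python sketch (the `logic` module is made up for illustration) shows the difference between a caller that captured a direct reference to the old function and one that re-resolves through the module on each call -- only the latter picks up reloaded code:

```python
# Hedged sketch: loading new code is easy; existing code only sees it
# if the call site re-resolves the reference (late binding).
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale bytecode between rewrites
tmp = pathlib.Path(tempfile.mkdtemp())
sys.path.insert(0, str(tmp))

(tmp / "logic.py").write_text("def answer():\n    return 1\n")
import logic

captured = logic.answer  # early-bound: holds the old function forever

def late_bound():
    return logic.answer()  # re-resolved through the module on each call

# "Deploy" version 2 and reload it into the running process.
(tmp / "logic.py").write_text("def answer():\n    return 2\n")
importlib.reload(logic)

print(captured(), late_bound())  # 1 2: only the indirect caller sees v2
```

This is why hot-swap systems route calls through some level of indirection (a module table, a registry, a function pointer) rather than letting callers hold direct references.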
From interning at Facebook, the only project I'm aware of that uses Haskell is Sigma. On the other hand, numerous projects use OCaml: Infer, HHVM, Flow, ReasonML, Pfff, etc.
However, it's Haskell that gets all the attention on Hacker News.