
A Primer on Automating Chaos - kiyanwang
https://blog.gremlininc.com/a-primer-on-automating-chaos-84ff4b053be0
======
im_down_w_otp
The article suggests writing the supporting chaos tooling in something like Go
or Rust. FWIW, using Distributed Erlang as the basis for a chaos
orchestration/control plane was fairly straightforward back when that was a
thing I did. It alleviated the need to figure out how to deal with
communication (push & pull), process migration, debugging, tracing, etc. The
self-hosting debugging and tracing facilities in Erlang were useful for
runtime instrumentation and for keeping track of the injected faults/probes,
which made it easier to correlate chaos with observed behavior.
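
To illustrate the bookkeeping idea only (this is not the Erlang tooling described here), a minimal Python sketch of recording injected faults so that an observed event can be matched against whatever was active at that moment. `FaultLog`, `FaultRecord`, and the method names are all hypothetical, made up for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FaultRecord:
    """One injected fault: what it was, where it ran, and when (hypothetical type)."""
    name: str
    target: str
    started: float
    ended: Optional[float] = None  # None while the fault is still active


class FaultLog:
    """Keeps a timeline of injected faults for later correlation (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self.records: List[FaultRecord] = []

    def inject(self, name: str, target: str) -> FaultRecord:
        # Record the start time; the actual fault injection would happen here.
        rec = FaultRecord(name, target, self._clock())
        self.records.append(rec)
        return rec

    def clear(self, rec: FaultRecord) -> None:
        # Mark the fault as finished.
        rec.ended = self._clock()

    def active_at(self, ts: float) -> List[FaultRecord]:
        # Faults whose [started, ended] window contains the observed timestamp.
        return [
            r for r in self.records
            if r.started <= ts and (r.ended is None or ts <= r.ended)
        ]
```

Given a timestamp from an error log or a latency spike, `active_at(ts)` returns the faults that were in flight at that moment, which is the correlation step in its simplest form. The timestamp has to come from the same clock the log uses.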

Integrating with other system/library facilities through NIF bindings to
things like `libfiu` was also fairly intuitive.

Somewhat ironically, I suppose, the next go-around for this stuff for me will
actually be in Rust, but for completely different reasons.

In any case this series is great. It's always a joy to read.

------
marknadal
Also check out:

\- Chaos Monkey by Netflix
([https://github.com/Netflix/chaosmonkey](https://github.com/Netflix/chaosmonkey))

\- Jepsen Tests by Aphyr ([http://jepsen.io/](http://jepsen.io/))

\- PANIC by us
([https://github.com/gundb/panic-server](https://github.com/gundb/panic-server))

~~~
jwhitlark
Also:

\- Automating resilience testing with Clojure and Docker - EuroClojure 2016
video
([https://www.youtube.com/watch?v=xQeatvHejHU](https://www.youtube.com/watch?v=xQeatvHejHU))

