
Conductor: A framework for testing distributed systems - luu
https://github.com/gvnn3/conductor
======
deadgrey19
I looked seriously at Conductor, but its world is very much tied to static
configuration files, so it's not very programmable. This works OK if your
experiments are simple, but not so much if you have more complex needs. So I
wrote something inspired by Conductor and PSSH
([https://code.google.com/p/parallel-ssh/](https://code.google.com/p/parallel-ssh/)),
but which is basically a little embedded DSL in Python called ReDo.

[https://github.com/mgrosvenor/redo](https://github.com/mgrosvenor/redo)

I use ReDo to set up and run experiments, much like Conductor, but I can
program custom code around each experiment. I run distributed systems
experiments on a cluster of 20 machines (installing, configuring and running
firmware and software for multi-machine tests), and I only ever interact with
that cluster through ReDo.
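
To give a rough idea of what "programmable" means here, this is a minimal sketch of the general pattern: fan a command out to a set of hosts in parallel, then put ordinary Python around the results. It is not ReDo's actual API. The names `run_on_hosts`, `make_argv` and `local_argv` are hypothetical, and the default transport assumes plain `ssh` with key-based login already set up.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def local_argv(host, command):
    # Dry-run transport (hypothetical): print what would be run for this
    # host instead of contacting it. Handy for trying out experiment logic.
    return [sys.executable, "-c", "import sys; print(sys.argv[1])",
            f"{host}: {command}"]


def run_on_hosts(hosts, command, make_argv=None, timeout=60):
    """Run `command` for every host in parallel.

    Returns {host: (returncode, stdout)}.
    """
    if make_argv is None:
        # Default transport: plain ssh, assuming key-based login works.
        make_argv = lambda host, cmd: ["ssh", host, cmd]

    def run_one(host):
        proc = subprocess.run(make_argv(host, command), capture_output=True,
                              text=True, timeout=timeout)
        return host, (proc.returncode, proc.stdout.strip())

    # One worker thread per host; the threads just wait on subprocesses.
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        return dict(pool.map(run_one, hosts))


if __name__ == "__main__":
    hosts = ["node1", "node2", "node3"]
    # Custom code around the experiment: sweep a parameter, report failures.
    for rate in (10, 100):
        results = run_on_hosts(hosts, f"start-test --rate {rate}",
                               make_argv=local_argv)
        failed = [h for h, (rc, _) in results.items() if rc != 0]
        print(rate, "failed hosts:", failed)
```

The point is only that loops, conditionals and retries come for free once the experiment is a program rather than a static config file.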

~~~
contingencies
I think the key problem here, as you point out, is that these types of
solutions make assumptions about the complexity and control channels of your
infrastructure.

The intelligent place to install a distributed infrastructure testing
automation framework is on top of a capable general purpose infrastructure
abstraction and automation system.

Such a system would include the ability to fully deploy and configure any
number of component services (i.e. continuous deployment) on any type of
requisite infrastructure. It would also be better placed to see which
variables are available to tweak (for example, testing latency or loss on a
specific inter-service link at a given OSI layer). Simulation is one
thing, simulation on real hardware another. A competent system would handle
both with the same codebase through a pluggable architecture. To date, there
is no broadly accepted open source or commercial solution in this space.
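
As a rough sketch of what "same codebase, pluggable architecture" could look like: the experiment is written once against a small interface, and the backend decides whether it touches a simulation or real machines. Everything here is hypothetical (`Backend`, `SimBackend`, `NetemBackend`, `experiment`). The hardware backend only builds Linux `tc`/netem command strings and collects them; it does not run anything, and the `eth0` device name is an assumption.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical plug-in interface: one experiment, many infrastructures."""

    @abstractmethod
    def degrade_link(self, host, delay_ms, loss_pct):
        """Add latency and packet loss on the given host's link."""


class SimBackend(Backend):
    """In-memory simulation: just records the requested link settings."""

    def __init__(self):
        self.links = {}

    def degrade_link(self, host, delay_ms, loss_pct):
        self.links[host] = (delay_ms, loss_pct)


class NetemBackend(Backend):
    """Real-host backend: builds tc/netem commands, but does not run them."""

    def __init__(self, dev="eth0"):  # device name is an assumption
        self.dev = dev
        self.commands = []

    def degrade_link(self, host, delay_ms, loss_pct):
        cmd = (f"tc qdisc replace dev {self.dev} root netem "
               f"delay {delay_ms}ms loss {loss_pct}%")
        self.commands.append((host, cmd))


def experiment(backend, hosts):
    # The experiment itself never knows which kind of backend it is driving.
    for host in hosts:
        backend.degrade_link(host, delay_ms=50, loss_pct=1)
```

Calling `experiment(SimBackend(), hosts)` and `experiment(NetemBackend(), hosts)` runs the same test logic; only the plug-in differs.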

If anyone wants to do a startup that builds one, email me.

