

Ask HN: When does it make sense to have a chaos monkey? - cirwin

I've read about companies having chaos monkeys to check that failed machines don't take your site down. When do they typically start doing that? It doesn't seem to make sense at our scale.
======
erbdex
The 'size' at which you unleash the chaos monkey is, I think, largely
subjective. The QoS you are _expected_ to deliver is one factor: can you
_afford_ to fail?

Formally, there are two 'requirements' to meet before you start randomly killing your servers:

    
    
      1. You have Highly Available infrastructure.
      2. You have fail-over established across the architecture.
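
To make the two requirements concrete, here is a toy sketch of what a chaos-monkey round checks. It is not how any real tool works: the server names, the `min_healthy` threshold, and every function name are made up for illustration, and "killing" a server just flips a flag in a dict.

```python
import random

def kill_random_server(servers, rng):
    """Mark one randomly chosen healthy server as down; return its name."""
    healthy = [name for name, up in servers.items() if up]
    if not healthy:
        return None  # nothing left to kill
    victim = rng.choice(healthy)
    servers[victim] = False
    return victim

def site_is_up(servers, min_healthy=1):
    """Hypothetical health rule: the site survives if enough servers remain up."""
    return sum(servers.values()) >= min_healthy

def chaos_round(servers, rng, min_healthy=1):
    """Kill one server, then report whether the remaining ones keep the site up."""
    victim = kill_random_server(servers, rng)
    return victim, site_is_up(servers, min_healthy)

if __name__ == "__main__":
    rng = random.Random(0)
    redundant = {"web-1": True, "web-2": True, "web-3": True}
    single = {"web-1": True}
    print(chaos_round(redundant, rng))  # site stays up, two servers remain
    print(chaos_round(single, rng))     # the only server dies, site is down
```

With three redundant servers the round passes. With a single server it fails every time, which is the situation where a chaos monkey only tells you what you already know.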

