> Configuration bugs, not code bugs, are the most common cause
> I’ve seen of really bad outages. When I looked at publicly available
> postmortems, searching for “global outage postmortem” returned
> about 50% outages caused by configuration changes. Publicly
> available postmortems aren’t a representative sample of all
> outages, but a random sampling of postmortem databases also
> reveals that config changes are responsible for a disproportionate
> fraction of extremely bad outages. As with error handling, I’m
> often told that it’s obvious that config changes are scary, but
> it’s not so obvious that most companies test and stage config
> changes like they do code changes.
PS. On HN, quote text by italicizing it with asterisks rather than prefixing lines with >.