
Intercom Incident Report for 2017-01-19 outage - aidos
https://status.intercom.com/incidents/bb4c8n1pn1pm
======
aidos
Interestingly, Intercom are blaming a change in the query execution plan on
MySQL (RDS Aurora), which raises the question: how can you prepare for such an
eventuality?

It seems like the sort of thing that might happen when the table statistics
give the optimizer the wrong information for planning the query (rather than
any sort of change to the server/db engine).

Does anyone know how to best protect your service from this?

~~~
zzzcpan
Well, you are supposed to use supervisors for something like that, i.e.
monitor for naughty queries and kill them.
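
A minimal sketch of what such a supervisor might look like, in Python. It is
not anything Intercom described. It assumes a DB-API cursor from a driver
that uses the %s parameter style (pymysql or mysqlclient, for example), and
the 30 second threshold and the "only kill SELECTs" rule are made-up policy
choices. On RDS/Aurora the master user usually cannot run KILL on other
users' threads directly, so the sketch defaults to the mysql.rds_kill_query
procedure that RDS provides.

```python
from typing import List, Tuple

# Hypothetical threshold: statements running longer than this are "naughty".
MAX_QUERY_SECONDS = 30

# information_schema.processlist lists every connection and its current statement.
PROCESSLIST_SQL = (
    "SELECT id, time, info FROM information_schema.processlist "
    "WHERE command = 'Query' AND time > %s"
)


def find_runaway_queries(cursor, max_seconds: int = MAX_QUERY_SECONDS) -> List[Tuple[int, int, str]]:
    """Return (thread id, seconds running, sql text) for long-running statements."""
    cursor.execute(PROCESSLIST_SQL, (max_seconds,))
    return [(int(pid), int(secs), sql or "") for pid, secs, sql in cursor.fetchall()]


def kill_runaway_queries(cursor, max_seconds: int = MAX_QUERY_SECONDS,
                         use_rds_procedure: bool = True) -> List[int]:
    """Kill long-running SELECTs and return the thread ids that were killed."""
    killed = []
    for pid, _secs, sql in find_runaway_queries(cursor, max_seconds):
        # Assumed policy: only kill reads, never interrupt writes or DDL.
        if not sql.lstrip().upper().startswith("SELECT"):
            continue
        if use_rds_procedure:
            # RDS/Aurora: kills the statement, keeps the connection open.
            cursor.execute("CALL mysql.rds_kill_query(%s)", (pid,))
        else:
            # Plain MySQL equivalent; pid is an int, so formatting it in is safe.
            cursor.execute("KILL QUERY %d" % pid)
        killed.append(pid)
    return killed
```

You would run kill_runaway_queries from a cron job or a small loop every few
seconds, and log or alert on whatever it returns. Note that this only limits
the damage once a bad plan shows up; it does nothing to stop the plan from
changing in the first place.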

