Handling third-party provider outages

lawrjone · on July 4, 2022

Hey folks,

Sharing an article written by a colleague about how to handle third-party outages (think AWS/GCP going down).

As an engineer, you get used to people hammering on your door during an AWS outage, asking "why don't we switch to X? I want to be multi-cloud by next quarter!"

Recency bias and our human inclination to underestimate the pain of alternatives mean we get reactive when a painful outage happens, but the ideal response is measured and takes a lot more into account than the pain of last week.

This post provides useful prompts for thinking about the risk of provider outages, and explains why the obvious answer of "multi-cloud!" might not be the best solution.