> AWS has their entire operations team working at all hours of the day and night supporting their infrastructure. They also offer some of the most highly redundant services in the world.
And yet, a couple times a year perhaps, we have discussions right here on HN about the latest AWS outage that took down half the Internet.
No group is infallible. But imagine a world where cloud providers didn't exist (AWS or otherwise) and every company had to build and maintain all of its infrastructure itself. If I had to guess, I'd wager the combined number of availability and durability issues, among others, would far outpace what we have had.
That's not even considering the boost to software development and innovation that we get from commodity cloud services. This is hand-wavy, of course, but I'd stand by it.
You mean the incident where a small percentage of EC2 instances were unavailable for 30 minutes in a single AZ in US East 1? I see your definition of major incident is pretty loose. I remember that incident. I had services running there. It was so minor that my auto-scaling picked it up and my service impact was nothing.