If they don't obfuscate the downtime (they will, of course), this outage would put them at, what, two nines? That's very much outside their SLA.
People also keep talking about it as if it's one region, but there are reports in this thread of internal AWS dependencies affecting various services in unrelated regions (Route 53 updates, for example).
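For anyone sanity-checking the "two nines" figure, here's a quick back-of-envelope in Python (plain arithmetic over a 30-day month, nothing taken from AWS's actual SLA documents):

    # Downtime allowed per 30-day month at a few availability levels.
    HOURS_PER_MONTH = 30 * 24  # 720

    for availability in (99.99, 99.9, 99.0, 95.0):
        allowed_hours = HOURS_PER_MONTH * (1 - availability / 100)
        print(f"{availability:>5}% uptime allows {allowed_hours:.1f} hours of downtime")

Two nines over a month works out to roughly seven hours of accumulated downtime; 95% is about 36 hours.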
It sounds like you think the SLA is just toilet paper, when in reality it's a contract that defines AWS's obligations. So the lesson here is that they broke their contract, big time. So yes, shaming is the right approach. Also, you seem to have missed the 1700+ other comments agreeing with the shaming.
I wouldn't go that far. The SLA is a contract, and they are clear on the remedy (up to 100% refund if they don't hit 95% uptime in a month).
Just like a medication's list of side effects, they are letting you know that downtime is possible, albeit unlikely.
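As a rough illustration of the remedy as described here (a single 95% threshold with a full credit below it; real SLAs usually add intermediate partial-credit tiers that I'm ignoring), a minimal sketch:

    def sla_credit_pct(monthly_uptime_pct: float) -> float:
        """Service credit owed under the simplified two-tier model above:
        full credit below 95% monthly uptime, nothing otherwise."""
        return 100.0 if monthly_uptime_pct < 95.0 else 0.0

    print(sla_credit_pct(99.0))  # 0.0 -- a "two nines" month earns no credit here
    print(sla_credit_pct(94.5))  # 100.0 -- below the threshold, fully credited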
All of the documentation and training programs explain the consequence of single-region deployments.
The outage was a mistake. Let's hope it doesn't indicate a trend. I'm not defending AWS. I'm trying to help people translate the incident into a real lesson about how to proceed.
You don't have control over the outage, but you do have control over how your app is built to respond to a similar outage in the future.
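On the "how your app responds" point, here's a minimal sketch of client-side regional failover, assuming two interchangeable endpoints for the same service (the URLs, path, and timeout are made up for illustration):

    import urllib.request
    import urllib.error

    # Hypothetical regional endpoints for the same service; placeholders, not real hosts.
    ENDPOINTS = [
        "https://us-east-1.api.example.com",
        "https://us-west-2.api.example.com",
    ]

    def fetch_with_failover(path="/health", timeout=2.0):
        """Try each regional endpoint in order and return the first successful response.
        If every region fails, re-raise the last error so the caller can degrade gracefully."""
        last_error = None
        for base in ENDPOINTS:
            try:
                with urllib.request.urlopen(base + path, timeout=timeout) as resp:
                    return resp.read()
            except (urllib.error.URLError, TimeoutError) as err:
                last_error = err  # this region is unavailable; try the next one
        raise last_error

A real setup would push this into DNS health checks or a load balancer and deal with data replication, but the point stands: the failover path is something you design and test ahead of time, not something a single-region deployment gives you for free.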