AWS operates on a huge scale. Mindbogglingly huge, in fact.
What you see as an outage, AWS may know to be a single rack in a single AZ. One rack out of hundreds of thousands in a single AZ doesn't constitute an outage, and it affects such a tiny percentage of customers that there is no logical reason for them to update their service status dashboard.
Despite all the redundancy built into the system and all the protections they can manage, these incidents are a common event, simply because of the sheer scale AWS operates at.
If AWS were to start surfacing these on their status page, it would essentially never leave the Green-I state, except to occasionally dip into Yellow or Red.
Things that you and I, with smaller infrastructures, would think of as one-in-a-million or even one-in-a-billion odds are an absolute certainty at their scale. To give context, 3 years ago S3 announced they'd passed the 2 trillion object mark, after having hit the 1 trillion object mark 4 years ago.
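To put rough numbers on that, here's a back-of-the-envelope sketch. The one-in-a-billion per-object probability is made up for illustration, the events are assumed to be independent, and `prob_at_least_one` is just a name I picked; only the 2 trillion figure comes from the S3 announcement mentioned above.

```python
import math


def prob_at_least_one(p: float, n: int) -> float:
    """Chance that an event with per-trial probability p happens
    at least once across n independent trials."""
    # 1 - (1 - p)**n, written with log1p/expm1 so tiny p stays accurate
    return -math.expm1(n * math.log1p(-p))


p = 1e-9  # hypothetical "one in a billion" odds per object
for n in (1_000, 1_000_000, 2_000_000_000_000):
    chance = prob_at_least_one(p, n)
    expected = n * p  # expected number of occurrences
    print(f"{n:>20,d} objects: P(at least one) = {chance:.6f}, expected = {expected:g}")
```

At a thousand objects the event is effectively never seen. At 2 trillion you'd expect it about 2,000 times, so "rare" stops meaning much.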
I'll be the first to tell you that their dashboard underestimates impact, but in this case it was totally accurate.
The system was working normally, i.e. you could still use us-west-1 without issue if you were in another zone.
Perhaps the co-founder at Ably used "post-truth" because he knows Bezos is not a supporter of our current administration, but I wish he wouldn't because it waters down the bald-faced lies of Kellyanne Conway, Sean Spicer, and The Donald.
The missteps of our government should not become as trivial and normalised as the bodega not stocking organic almond milk.
In all the AWS documentation you are reminded time and time again that if you want any guarantee of availability: choose two zones. If you want more availability: choose two regions. If you're operating globally, you probably should anyway.
Your best bet is to monitor your own systems, and have enough monitoring in place to tell you that one zone is unavailable without having to rely on AWS to tell you.
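As a minimal sketch of what "enough monitoring" could look like: tag each of your own health-check results with the zone it came from, then flag any zone where most checks fail. The function name, the 50% threshold, and the zone labels are all assumptions for illustration; how you actually collect the checks is up to your own tooling.

```python
from collections import defaultdict


def unhealthy_zones(results, threshold=0.5):
    """results: iterable of (zone, ok) pairs from your own health checks.
    Returns the zones whose failure fraction is at or above threshold."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for zone, ok in results:
        totals[zone] += 1
        if not ok:
            failures[zone] += 1
    return sorted(z for z in totals if failures[z] / totals[z] >= threshold)


# Hypothetical check results; the zone names are illustrative only.
checks = [
    ("us-west-1a", True), ("us-west-1a", True), ("us-west-1a", False),
    ("us-west-1b", False), ("us-west-1b", False), ("us-west-1b", False),
]
print(unhealthy_zones(checks))
```

One flaky instance in a zone doesn't trip it, but a zone where everything fails does, and that's the signal you want without waiting on the status page.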
Their dashboard has no effect on their bonuses, BTW (at least it didn't the last time I asked). It is slow to update because it is purposely gated by a human to avoid false positives: that human has to manually verify the problem before reporting it, which takes time.
http://d0.awsstatic.com/whitepapers/architecture/AWS_Well-Ar...
"Best practices: Multi-AZ / Region. Distribute application load across multiple Availability Zones / Regions"
What AZ were they supposed to fail over to? Another one reporting green?
Though to be fair, most sufficiently popular projects don't even need a real status page. One that simply reported its own traffic volume would be enough to tell whether the service itself is down (crowdsource your status!).
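A rough sketch of that idea: compare the latest request count against a recent baseline and call it "degraded" when traffic falls off a cliff. The function name, the median baseline, and the 50% drop ratio are arbitrary choices for illustration, not anything a real status page is known to use.

```python
from statistics import median


def traffic_status(history, current, drop_ratio=0.5):
    """history: recent per-minute request counts; current: the latest count.
    Reports "degraded" when current falls below drop_ratio * median(history)."""
    if not history:
        return "unknown"  # nothing to compare against yet
    baseline = median(history)
    if baseline == 0:
        return "unknown"  # no meaningful baseline for an idle service
    return "degraded" if current < baseline * drop_ratio else "ok"


recent = [100, 110, 90, 105]  # made-up requests per minute
print(traffic_status(recent, 95))
print(traffic_status(recent, 20))
```

This only works if the traffic is steady enough that a sudden drop really means something, which is why it suits popular services best.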
So, I picture some rich business executive, technical or not, who ultimately would decide if there should be a status page and how it would look, saying to themselves, "Why would we tell the whole world our service is down? This would cause panic among everyone, rather than just annoyance to those who truly care."
Of course, it becomes an even bigger problem at that point to blatantly say "ALL SYSTEMS GO" with green checks when that clearly isn't the case.
So, Amazon, like any other multi multi multi billion dollar business empire, muddies the waters and seeks to control the perspective. "This wasn't downtime, as per your SLA. It was an inability for some to access our servers, which were powered on the whole time!"
Anything to shore up stock price is to be pursued. Anything to bring it down is to be avoided.
