For legacy customers, it's hard to move regions, but in general, if you have the chance to choose a region other than us-east-1, do that. I had the chance to transition to us-west-2 about 18 months ago and in that time, there have been at least three us-east-1 outages that haven't affected me, counting today's S3 outage.
EDIT: ha, joke's on me. I'm starting to see S3 failures as they affect our CDN. Lovely :/
Q: Why don't computers crash at the same time?
A: Because network connections are not fast enough.
(I think we are starting to get there)
Perspective is everything.
What are the odds of the server with your repo and your own hard drive crashing at the same time?
Quite interesting really!
I would suggest that any single event capable of taking down both my machine and GitHub's/Bitbucket's servers would be of such magnitude that I would no longer be worried about my project, being more focused on basic survival...
I think the problem is that globally accessible APIs are impacted. As others have noted, if you can use region/AZ-specific hostnames to connect, you can get through to S3.
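For anyone wondering what a region-specific hostname looks like in practice, here is a minimal sketch. It assumes the virtual-hosted-style form `<bucket>.s3.<region>.amazonaws.com`; the helper name `regional_s3_url` and the bucket and key in the usage line are made up for illustration, and this only builds the URL, it does not sign or send a request.

```python
from urllib.parse import quote


def regional_s3_url(bucket: str, key: str, region: str) -> str:
    """Build an S3 object URL that names the region explicitly.

    Hypothetical helper: it assumes the virtual-hosted-style hostname
    <bucket>.s3.<region>.amazonaws.com rather than the global
    s3.amazonaws.com endpoint.
    """
    # quote() percent-encodes unsafe characters but leaves "/" intact,
    # so key prefixes keep their path-like shape.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


# Example with a made-up bucket and key:
url = regional_s3_url("example-bucket", "logs/2017 02 28.txt", "us-west-2")
print(url)
```

Whether this actually gets you around a given outage depends on which part of S3 is failing, so treat it as something to try rather than a guaranteed workaround.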
CloudFront is faithfully serving up our existing files even from buckets in US-East.
EDIT: less arrogant. I need a coffee.
Even data replication has options for this, too.
And I work in Ops.
EC2: why are you replicating EC2 instances or AMIs across regions? Why aren't you using build tools to automatically create AMIs for you out of your CI processes?
ELB: Eh? Why do I need ELBs to be multi-regional? I'm a little confused by this one, sorry.
EBS: My systems tend to be stateless, storing as much log, audit, or data in external systems such as RDS, DynamoDB, S3, etc. Storing things on the local system's storage is a bit risky, but if you have to there are disk replication solutions available. EFS comes to mind for making that easier. Backups also come to mind in the event of data loss.
VPC: Why does a VPC need to be cross regional? This one is also lost on me.
RDS: Replication is easy -- it's done for you. Convincing developers their application needs to potentially work with a backup endpoint to the data is harder than data replication problems at times. More often than not, it's simply a case of switching to a read-only mode whilst you recover the write copy of your RDS instance, but this is the role of the developers, not ops.
Lambda, ElastiCache, API Gateway... all these things aren't arguments against my original point: architect correctly. Yes it involves more work (from the developer's perspective, mostly), but more often than not in the event of a failure you're left head and shoulders above your nearest competition and left soaking up the profits as a result.
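To make the RDS point about switching to a read-only mode a bit more concrete, here is a rough sketch of what that can look like on the application side. Everything in it is hypothetical: `DictStore` is an in-memory stand-in for a real database client, and `FailoverStore`, `PrimaryUnavailable`, and `ReadOnlyMode` are names I made up. It is not how any particular AWS SDK or database driver behaves.

```python
class PrimaryUnavailable(Exception):
    """Raised by the (hypothetical) store when it cannot be reached."""


class ReadOnlyMode(Exception):
    """Raised when a write is attempted while only the replica is usable."""


class DictStore:
    """In-memory stand-in for a database endpoint, for illustration only."""

    def __init__(self, data=None):
        self.data = dict(data or {})
        self.available = True

    def read(self, key):
        if not self.available:
            raise PrimaryUnavailable()
        return self.data[key]

    def write(self, key, value):
        if not self.available:
            raise PrimaryUnavailable()
        self.data[key] = value


class FailoverStore:
    """Reads fall back to the replica; writes are refused once degraded."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self.read_only = False

    def read(self, key):
        if not self.read_only:
            try:
                return self.primary.read(key)
            except PrimaryUnavailable:
                # Primary is gone: degrade and serve from the replica.
                self.read_only = True
        return self.replica.read(key)

    def write(self, key, value):
        if self.read_only:
            raise ReadOnlyMode("primary unavailable, writes are disabled")
        try:
            self.primary.write(key, value)
        except PrimaryUnavailable:
            self.read_only = True
            raise ReadOnlyMode("primary unavailable, writes are disabled") from None
```

The sketch never leaves read-only mode on its own. A real application would need some way to detect that the write copy has recovered, and that is exactly the developer-side work I was referring to.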
Based on your responses, however, I think we can safely agree to disagree and move on.
Have a great day! I hope you weren't too badly affected by the S3 outage!
Exactly to avoid single region outages?
Our webservers were hit by this outage. In order to make these cross-regional, I'd need to set up VPCs properly, security groups, instances, datastores (several databases), so on and so forth. I don't store anything on the local disk, but I'm not going to run a server in Europe hitting my db servers in us-east-1. AWS doesn't offer all the databases we use. CloudFormation isn't trivial to use once you get past the tutorial examples either.
Basically, your comment is a version of "you're holding it wrong!"
Some solutions present more difficulties than others, that's for sure. From the limited information you've given me, your solution is far from being a unique situation that poses many difficulties.
CloudFormation in YAML format is pretty easy. I recommend Terraform, however, which is much nicer again for this kind of stuff. It makes it rather "trivial" to get a multi-region solution in place.
As for the database replication: I highly doubt the solutions you're using don't offer replication, and if they don't, and they're not some very esoteric, highly specialised engines, then I would replace them with something that does.
It reads to me as though your primary point of contention is your databases. Not an easy problem to solve, I'll admit, but not impossible, either.
HashiCorp's Terraform makes it a lot easier to go multi-cloud, and abstracting away configuration of the OS and applications/state with Ansible makes the whole process a lot easier too.
Disclosure: I work on Google Cloud (and didn't test this, but some other comment makes that clear).
EDIT: Found my answer. "Just to stress: this is one S3 region that has become inaccessible, yet web apps are tripping up and vanishing as their backend evaporates away." -- https://www.theregister.co.uk/2017/02/28/aws_is_awol_as_s3_g...