The problem is that you can't really simulate the systemic effects of a really major quake (like a scenario where Mountain View is out of power for weeks).
The DiRT drills include the simultaneous loss of whole offices, including the complete obliteration of the Bay Area. Participants are instructed to pretend that they've been eaten by zombies and not communicate with anyone in a remote office, and that's backed up (as described in the article) by cutting the network links to & from the Mountain View corporate office.
Now, there may certainly be larger systemic effects over long time periods caused by, e.g., the death of the company CEO, the loss of new product development, or the complete destruction of the company's market (in, for example, a global thermonuclear apocalypse that sets us back to the stone age). But the tests absolutely do simulate the simultaneous death of 15,000+ employees, destruction of the Mountain View headquarters, and loss of all datacenters, data, and power in the Bay Area, and verify that remote teams can continue the day-to-day operations in such an event.
What happens if Google goes down for a while? Is it really that bad? Yes, Gmail and Drive may cause all kinds of chaos but I think I can survive quite a while without the search engine and YouTube.
The data could be lost. Part of DiRT is ensuring that the result of losing N data centers is not "oops, we lost 20% of Gmail users' data permanently".
On the scale of devastation this article is describing, none of Google's services being obliterated really matters, but I'm personally pretty glad Google is prepared for multiple, massive-scale disasters to happen at once.
The financial disaster would be horrendous. Look at the number of companies using Google for email.
When the towers went down in 2001, the first business priority of many companies was getting email back online. It's an incredibly important part of a business.
Reminds me of when I was doing some consultancy for a large oil/energy trading company in London. There was a huge DR initiative going on at the time, and one thing they realized after surveying the business is that, if worst came to worst, they could still conduct business if they lost every other business system, as long as e-mail was available.
If Google App Engine goes down, many sites will stop working. OAuth services will also start to fail. So will custom search provided to 3rd party sites (ok, that may not sound too bad, but you never know). Then there is cloud storage, which in many cases is used as a backup mechanism. And how about the gazillions of sites that load js libraries from Google's CDN? Sure, a day or two is bearable, but make it a week and then the shit hits the fan.
The one thing you should bear in mind about earthquakes is that in many cases the aftershocks do more damage than the main one, because damage accumulates. Which means that critical infrastructure could take weeks to become fully operational.
>I think I can survive quite a while without the search engine
The search engine that holds the vast majority of the world's traffic? The search engine that everyone is going to call on minutes after the earthquake to get news, but isn't going to be there, so all the other search engines slow to a crawl? If Google search just disappeared for a week, a considerable number of businesses would go out of business. You see it now in one-off cases where a site is delisted and its traffic drops 99%.
That said, Google has datacenters around the nation and the world so that scenario is unlikely.
I wonder how it would play out if all the major search engines went down. I think people would start communicating on whatever sites they knew the URL for that were still up. Probably some ad-hoc manually curated indexes would pop up in comment threads on various sites and then a few people would probably aggregate these and host them on their local boxes.
> Yes, Gmail and Drive may cause all kinds of chaos but I think I can survive quite a while without the search engine and YouTube.
Perhaps you can, but how many companies are now wholly dependent on Google for their groupware functionality? What happens if you're one of the gaggle of people who make a living off YouTube?
In the grand scheme of things, not a lot of businesses use Apps for their email, but I'd hazard a guess that you're still looking at hundreds of millions of dollars in losses if that were to suddenly disappear.
It's the cascading service failures and related economic costs that would be the most damaging (and hardest to really quantify until a scenario really happens).
Wow, I had no idea that Google Apps were about 15% of Google's revenue. That is way bigger than I would have guessed. (Hundreds of millions x 2 orders of magnitude is >= 10B; total revenue in 2014 was 66B.)
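For anyone who wants to check the back-of-envelope math in that parenthetical, here is a rough sketch in Python. It assumes "hundreds of millions" is taken at its lower bound of $100M and uses the 66B revenue figure quoted above; the function name is made up for illustration.

```python
def implied_revenue_share(loss_estimate, scale_factor, total_revenue):
    """Share of total revenue implied by scaling a loss estimate up."""
    return (loss_estimate * scale_factor) / total_revenue

# Assumption: "hundreds of millions" read as its lower bound, 100 million.
loss_estimate = 100e6
# "2 orders of magnitude" means a factor of 100.
scale_factor = 100
# 2014 total revenue as quoted in the comment above.
total_revenue = 66e9

share = implied_revenue_share(loss_estimate, scale_factor, total_revenue)
print(f"implied share of revenue: {share:.1%}")
```

With those inputs the scaled figure is 10B, which is roughly 15% of 66B, so the number in the comment is at least internally consistent with its own assumptions.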
Maybe Google is distributed enough to stay up, but I'm pretty sure Microsoft and Amazon are not, and I doubt Google has the spare capacity to handle Azure load, much less Amazon load.
Utilization and availability are not directly related. Much of their utilization is probably not super time critical and could be proportionately shed or suspended in an emergency situation like the one in question.
I bet Google could absorb 100% of the emergency evacuation of AWS in a pinch. Remember also that not 100% of services will be moved, and those that are will not be running at 100% capacity - everyone would be suspending their batch/analytics/warehousing jobs that week (or month).
EDIT: Also, lock-in would stop all the Redshift/Dynamo/etc. users from migrating, and would limit people to just what they could throw up on GCE/GAE.
Most large companies do/have done exactly that. Arguably 9/11 was the big catalyst for DR planning.
There were Wall St firms that literally had to get the military to escort them into Lower Manhattan in the days after to rescue servers, tapes, etc. and cobble something together in a satellite office.
No one wants a repeat of that, and you'll notice subsequent disasters like Sandy have barely been a blip for most as a result of policy changes.
At this point basically any major company will/should have procedures in place for "perfect" DR: the ability for everyone to work from home indefinitely (with alternate offices/infrastructure waiting for the few people who need something special, like a trading desk), as well as, obviously, backup data centers with full replication.
Sure you can. Just look for the keys to the main breakers to those buildings. Of course, paying for that amount of vacation for 15 days is probably not entirely feasible for any company.
https://queue.acm.org/detail.cfm?id=2371516