
Google Compute Engine Is Down - jread
https://status.cloud.google.com/incident/compute/15045
======
lern_too_spel
Google Cloud Platform is a cluster, in the
[http://www.urbandictionary.com/define.php?term=Cluster&defid...](http://www.urbandictionary.com/define.php?term=Cluster&defid=1424405)
sense.

Amazon eats their own dogfood with AWS services, while Google does not, and no
amount of marketing is going to make up for that. Every time I give it a try,
I find random bugs everywhere, and I know there are not enough internal
engineers feeling the pain to get those bugs fixed.

~~~
ngrilly
AWS has had its own share of massive outages. I had a look at the AWS and GCE
incident histories, and it looks like the frequency and magnitude of incidents
are quite similar. Do you have any factual information that contradicts this?

~~~
orenbarzilai
AWS asks you to design your architecture for failure and work across multiple
availability zones. It's complicated and expensive, but that way you can avoid
outages. With Google, on the other hand, you can't, and that's the price you
pay when you choose PaaS over IaaS.

~~~
tapirl
Google Compute Engine is also an IaaS. You can also design your architecture
for failure and work across multiple availability zones by using GCE. GCE is
not GAE.

~~~
plnk22
GCE had a multi-region outage, so running across multiple zones within a
region wouldn't have helped in this case.

~~~
ngrilly
I agree. This is the reason why this outage is more serious than others.

------
nateweiss
We found that we could still go into their console, SSH into the boxes (from
the browser-based thing), and reboot the boxes from there. When they came back
up, they worked. May have just been good luck, just happening to come back up
on hardware that's not behind the bad routers or whatever it is.

Edit: Also, we found that our instances were reachable (they mostly provide a
JSON-based API over HTTP), in the sense that they were getting the incoming
HTTP/HTTPS traffic that they normally do. But the responses were not getting
back out to whoever requested them... lost on the way back out or something.

~~~
sixbit
This worked for me too, about an hour ago. So not just good luck it seems :-)

------
jread
All our status VMs are unreachable:
[https://cloudharmony.com/status-for-google](https://cloudharmony.com/status-for-google)

------
mnml_
I was going to ask them if they will offer compensation, but I can't contact
them as I don't have "silver support" :/

------
nitinics
In post-incident reports, will we ever move past the most common and ambiguous
piece of technology lingo, "network issue"? If it was identified as a network
issue (route black hole, prefix hijacking, resource depletion, consistency
issues, etc.), then you probably know enough to elaborate on the specifics.

------
aceperry
Preliminary cause is described here:
[https://status.cloud.google.com/incident/compute/15045](https://status.cloud.google.com/incident/compute/15045)

------
dvh
"Incident began at 2015-02-19 15:59"

Is that in the future, or am I missing something?

~~~
rey12rey
I believe it's a typo as all other references seem to point to 22:59 Feb 18
2015.

It's 22:59 Feb 18 2015 also for Google Cloud SQL:
[https://status.cloud.google.com/incident/cloud-sql/17006](https://status.cloud.google.com/incident/cloud-sql/17006)

Edit: It turns out this was the case. Updated now.

------
sixbit
SSH'ing from the GCE web console seemed to make my instances reachable.
Afterward, SSH from the terminal and web access worked.

------
jsprogrammer
0:30 passed and no update?

Who would have guessed that? Anyone?

~~~
thezilch
There are updates and clear times for when to expect the next update.

~~~
jsprogrammer
Right, when I posted it said the next update would occur at 0:30, but it was
after 0:30 and there was no update.

------
iamspoilt
The page says that all issues are resolved now.

