
Why you should not use Google Cloud - samdung
https://medium.com/@serverpunch/why-you-should-not-use-google-cloud-75ea2aec00de
======
throwaway10701
As someone who is currently struggling with Google Cloud's mediocre support,
this is not surprising. We pay lots of money for support and have multiple
points of contact but all tickets are routed through front-line support who
have no context and completely isolate you from what's going on. For highly
technical users the worst support is to get fed through the standard playbook
("have you tried turning it off and on again?") when you're dealing with an
outage. Especially since the best case is your support person playing go-
between with the many, siloed teams trying to troubleshoot an issue while they
apparently try to pass the buck.

Not to mention the lack of visibility in changes - it seems like everything is
constantly running at multiple versions that can change suddenly with no
notice, and if that breaks your use case they don't really seem to know or
care. It feels like there's miles of difference between the SRE book and how
their cloud teams operate in practice.

~~~
jazoom
I'd just like to take this opportunity to praise Vultr. I've been using them
for years and their support has always been good, and contrary to every other
growing company, has been getting better over time.

I had an issue with my servers 2 days ago and I got a reply to my ticket
within 1 minute. Follow-up replies were also very fast.

The person I was talking to was a system administrator who understood what I
was talking about and could actually solve problems on the spot. He is
actually the same person who answered my support requests last year. I don't
know if that's a happy accident or if they try to keep the same support staff
answering for the same clients. He was answering my requests consistently for
2 days this time.

I am not a big budget customer. AWS and GCP wouldn't think anything of me.

Thank you Vultr for supporting your product properly. And thanks Eric. You are
very helpful!

~~~
com2kid
Google Cloud provides more than just VMs and Containers. It has a bunch of
services baked in, from a variety of databases such as Firebase (that have
powerful built in subscription and eventing systems) to fully baked in Auth,
(Google will even handle doing two factor for you!) to assisting with certain
types of machine learning.

Vultr looks like they provide more traditional services with a few extra
niceties on top.

Within Google's infrastructure, I can deploy a new HTTPS REST endpoint with a
.js file and 1 console command.

Could I set up an ecosystem on a Vultr VM to do the same? Sure, it isn't
magic. But GCP's entire value prop is that they've already done a lot of that
work for you, and as someone running a startup, I was able to go from "I have
this idea" to "I have this REST endpoint" in a couple of days, without worrying
about managing infrastructure.

That said, articles like this always worry me. I've never seen an article that
says "Wow Google's support team really helped out!"

~~~
sorenjan
Using such proprietary features sounds like a great way to subject yourself to
vendor lock in and leave you vulnerable to your cloud provider's every whim. I
understand that using ready made features is alluring, but at what point are
you too dependent on somebody else? All these cloud services remind me a bit
of left-pad, how many external dependencies can you afford? Maybe I'm too
suspicious and cynical, but then I read articles like these from time to
time...

~~~
bonesss
The difference, IMO, is that you're generally leveraging the cloud provider's
_platform_ in addition to using their hosting.

There are ways to make the hosting relatively agnostic, but choosing a pub/sub
solution (for example), that operates at 'web scale' will have a distinct
impact on your solutions and force you into their ecosystem to maximize value.
Why bother with BigCorp's UltraResistant services if you're only going to use
some small percentage of the capabilities?

I've made systems that abstract away the difference entirely, but I think the
'goldilocks zone' is abstracted core domain logic that will run on anything,
and then going whole-hog on each individual environment. Accept that "cloud"
_is_ vendor lockin, and mitigate that threat at the level of deployment
(multi-cloud, multi-stack), rather than just the application.

------
aleph-
So piggybacking on this, I have a similar story to tell. We had a nice young
startup, infra entirely built out on Google Cloud. Nicely, resiliently built,
good solid stuff. Because a keyword was picked up by their auto-moderation
bot, our entire project was shut down immediately, and we weren't able to
bring it back up for several hours. Thank god we hadn't gone live yet, as we
were then told by support that because of the grey area of our tech, they
couldn't guarantee this wouldn't keep happening. In fact, they told us
straight out that it would and that we should move.

So maybe think about which hosting provider to go with. Don't get me wrong, I
like their tech, but their moderation does need a more human element; to be
frank, all their products do. Simply ceding control to algorithmic judgement
just won't work in the short term, if ever at all.

~~~
setquk
I’m starting to favour buying physical rack space again and running everything
2005 style with a light weight ansible layer. As long as your workload is
predictable, the lock in, unpredictability, navigation through the maze of
billing, weird rules and what-the-fuckism you have to deal with on a daily
basis is merely trading one vendor specific hell for another. Your knowledge
isn’t transferable between cloud vendors either so I’d rather have a hell I'm
totally in control of and of which the knowledge has some retention value and
will move around vendors no problems. You can also span vendors then thus
avoiding the whole all eggs in one basket problem.

~~~
tootie
You can federate Kubernetes across your own rack and one or more public cloud
providers.

~~~
scurvy
Have you actually done this, or are you repeating stuff off the website?
Because everyone I've talked with about kubernetes federation says it's really
not ready for production use.

~~~
ipedrazas
The approach we have taken is to create independent clusters with a common
LoadBalancer.

Basically, the LB decides which kubernetes cluster will serve your request and
once you're in a k8s cluster, you stay there.

You don't have the control-plane that the federation provides and a bit of
overhead managing clusters independently, but we have automated the majority
of the process. On the other hand, debugging is way easier and we don't suffer
from weird latencies between clusters (weird because sometimes a request will
go to a different cluster without any apparent reason <-- I'm sure there's
one, but none that you could see/expect, hence debugging).

My people's time is more important than your complex system.
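
To make the routing idea concrete, here is a minimal sketch, assuming the load
balancer pins each client to a cluster by rendezvous hashing. That technique,
the cluster names, and the `pick_cluster` helper are all illustrative
assumptions, not a description of any real load balancer configuration:

```python
import hashlib

# Hypothetical cluster names; a real setup would list its own endpoints.
CLUSTERS = ["k8s-onprem", "k8s-gcp", "k8s-aws"]


def pick_cluster(client_id: str, healthy: list[str]) -> str:
    """Pin a client to one healthy cluster using rendezvous hashing.

    Every (cluster, client) pair gets a score and the highest score wins,
    so a client only moves when its own cluster leaves the healthy list.
    """
    if not healthy:
        raise RuntimeError("no healthy clusters available")

    def score(cluster: str) -> bytes:
        # Digests compare as byte strings, which is enough for ranking.
        return hashlib.sha256(f"{cluster}:{client_id}".encode("utf-8")).digest()

    return max(healthy, key=score)


if __name__ == "__main__":
    print(pick_cluster("user-42", CLUSTERS))
```

The same client always lands on the same cluster, which is what "once you're
in a k8s cluster, you stay there" needs, and no shared control plane is
involved.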

------
warrentr
This is very concerning but can happen on AWS as well. July 4th last year at
about 4PM PST Amazon silently shut down our primary load balancer (ALB) due to
some copyright complaint. This took out our main api and several dependent
apps. We were able to get a tech support agent on the phone but he wasn't able
to determine why this happened for several hours. Eventually we figured out
that another department within amazon was responsible for pulling down the alb
in an undetectable way. Ironically we are now in the process of moving from
aws -> gcp.

~~~
halbritt
You're going to have to go multi-cloud if you truly want to insulate
yourselves from this sort of problem.

If and when you do, give serious consideration to how you handle DNS.

~~~
brightball
Fwiw, Ansible makes the multicloud thing pretty straightforward as long as you
aren’t married to services that only work for a specific cloud provider.

For that, you should consider setting up multiple accounts to isolate those
services from the portable ones.

~~~
weiming
Wouldn't that be Terraform (perfect for setting up cloud infrastructure) vs.
Ansible (can do all, but more geared to provisioning servers you already
have)?

~~~
brightball
Ansible uses Apache Libcloud to run just about anything you need on any cloud
provider in terms of provisioning. Once provisioned, it will handle all of
your various configuration and deployment on those.

Also plays really nicely with Terraform.

------
dilyevsky
No truly production and especially revenue critical dependency should go on
the card. Have your lawyer/licensing person sign an agreement with them with
an actual SLA and customer support. If it’s not worth your time you shouldn’t
complain when you lose it.

~~~
Latteland
That's a great point. These cloud hosting companies don't make this a natural
evolution though, because there's no human to talk to, you start tiny and
increase your usage over time. But every company depending on something and
paying serious money should have a specific agreement. I wonder if this could
still happen though, even if you have a separate contract.

~~~
keypusher
> These cloud hosting companies don't make this a natural evolution though,
> because there's no human to talk to

This is not true at all. Once you start spending real money on GCP or AWS,
they will reach out to you. You will probably sign a support contract and have
an account manager at that point. Or you might go with enterprise support
where you have dedicated technical assets within the company that can help
with case escalation, architecture review, billing optimization, etc.

~~~
Latteland
It makes sense that would happen. So they just didn't have the contact info
for the people here? Maybe they just were spending a little, but their whole
business still depended on it.

------
jpollock
This is a standard risk with any attempt to remain anonymous with a supplier.
The supplier, since they don't know you, and therefore can't trust you, will
not offer much credit.

Cards get skimmed all the time. When a card gets skimmed, the issuer informs
everyone who is making recurring purchases with that card "Hey, this card was
skimmed, it's dead".

If someone has a recurring charge attached to that account, the recurring
charge will go bad. If this is an appreciable number of cloud services which
are billed by the second, this can happen very, very quickly and without you
knowing. Remember, sometimes the issuer informs you that the card was skimmed,
which you will receive after all the automated systems have been told.

So, the cloud provider gets the cancel, and terminates the card. It then looks
around sees the recurring charge, takes a look at your servers racking up $$
they can't recoup and the system goes "we don't know this person, they buy
stuff from us, but we haven't analysed their credit. Are they good for the
debt? We've never given them credit before. Better cut them off until they get
in touch."

If only they had signed an enterprise agreement and gotten credit terms. It
could still be paid with a credit card, but the supplier would say "They're
good for $X, let it ride and tell them they'll be cut off soon". They can even
attach multiple methods of payment to the account, where, for example, a
second card with a different bank is used as a backup. Having a single card is
a single point of failure in the system!

In closing, imagine you're a cryptocoin miner who uses stolen cards to mine on
cloud services. What does that look like to the cloud provider?

Yep, someone signs up for cloud services, starts racking up large bills and
then the card is flagged as stolen.

~~~
hknd
It looks like they used a personal GCP account for their multi-million dollar
business.

Would be interested to see what would've happened if they would've used a
business account.

------
balls187
While not a cloud platform, I had an experience along the same vein with
Stripe.

We're a health-care startup, and this past Saturday I got an email saying that
due to the nature of our business, we were prohibited from using their payment
platform (Credit Card companies have different risk profiles and charge
accordingly--see Patreon v Adult Content Creators).

Rather than pull the plug immediately, they offered us a 5-day wind down
period, and provided information on a competitor that takes on high-risk
services.

Fortunately, the classification of our business was incorrect (we do not offer
treatment nor prescription/pharma services), and after contacting their
support via Email & Twitter, we resolved the issue in less than 24-hours.

So major kudos to Stripe for protecting their platform, _WHILE_ also trying to
do the right thing for the customers who run astray from the service
agreement.

------
mikeyr0x
Please remember Google Cloud is a multi-tenant public cloud and in order to run a
multi-tenant environment providers have to monitor usage, users, billing, and
take precautionary measures at times when usage or account activity is sensed
to be irregular. Some of this management is done automatically by systems
preserving QoS and monitoring for fraud or abuse.

This seems like a billing issue. If they had offline billing and monthly
invoicing (enterprise agreement) I do not believe this issue would have
happened.

If you are running an enterprise business and do not have enterprise support
and an enterprise relationship with the provider, you may be doing something
wrong on your end. It sounds like the author of this post does not have an
account team and hasn't taken the appropriate steps to establish an enterprise
relationship with their provider. They are running a consumer account which is
fine in many many cases, but may not be fine for a company that requires
absolutely no service interruptions.

IMO, the time this issue was resolved by the automated process (20 mins) is
not too bad for consumer cloud services. Most likely this issue could have
been avoided if the customer had an enterprise relationship (offline
billing/invoicing, support, TAM, account flagging, etc, etc) with Google
Cloud.

~~~
ngrilly
A "consumer account"? I don't know what you're talking about. This is Google
Cloud, not Spotify. I don't know a lot of "consumers" spending hundreds of
dollars, thousands of dollars or more per month on Google Cloud. And paying
bills by wire transfer instead of credit card doesn't change anything to the
issue discussed here.

~~~
namibj
In Germany, maybe even all of Europe, you need a tax ID which you, at least as
far as the type they require is concerned, only get as a business, not a
consumer. I actually tried, due to the relatively easy way to get a fancy,
reliable network (I kind of admire their global SDN that can push within 5% of
line rate with no meaningful added packet loss, apart from the minimal amount
due to random cosmic rays and similar baseline effects).

~~~
DataWraith
They actually relented on that. You can now register "Individual" accounts
that don't need a tax ID:
[https://cloud.google.com/billing/docs/resources/vat-
overview](https://cloud.google.com/billing/docs/resources/vat-overview)

------
markbnj
I can't speak to the specific incident. We've been running almost 400 servers
(instances and k8s cluster nodes) for over a year on GCP and we've been quite
happy with the performance and reliability, as well as the support response
when we have needed it. I did want to address this comment...

> What if the card holder is on leave and is unreachable for three days? We
> would have lost everything — years of work — millions of dollars in lost
> revenue.

You should never be in this position. If this were to happen to us we would be
able to create a new project with a different payment instrument, and
provision it from the ground up with terraform, puppet and helm scripts. The
only thing we would have to fix up manually are some DNS records and we could
probably have everything back up in a few hours. Eventually when we have moved
all of our services to k8s I would expect to be able to do this even on a
different cloud provider if that were necessary.

~~~
ma2rten
Huh? You can provision new servers, but you can't just easily move over all
the data, can you?

~~~
lotyrin
Why not? You should have a backup strategy with business-acceptable RPO/RTO.
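
For anyone unfamiliar with the terms: RPO (recovery point objective) is the
largest window of data loss the business will tolerate, and RTO (recovery time
objective) is the longest acceptable time to restore service. A minimal sketch
of checking both, where the function names and example numbers are
hypothetical:

```python
from datetime import datetime, timedelta, timezone


def meets_rpo(last_backup: datetime, now: datetime, rpo: timedelta) -> bool:
    """True if the newest backup is recent enough to keep data loss within the RPO."""
    return now - last_backup <= rpo


def meets_rto(measured_restore: timedelta, rto: timedelta) -> bool:
    """True if a rehearsed restore finished within the RTO."""
    return measured_restore <= rto


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Assumed example: last backup 3 hours ago, business accepts 4 hours of loss.
    print(meets_rpo(now - timedelta(hours=3), now, timedelta(hours=4)))
    # Assumed example: the last restore drill took 2 hours against a 6 hour RTO.
    print(meets_rto(timedelta(hours=2), timedelta(hours=6)))
```

The restore duration only means something if it comes from an actual restore
drill onto fresh infrastructure rather than from an estimate.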

------
londons_explore
This fraud flag is caused by your credit card being found in a leaked list of
card numbers somewhere.

They suspect you are a fraudster because you are using a stolen card.

Either sign a proper SLA agreement with Google (which gives you 30 days to pay
their bills by any form, and therefore you get 30 days notice before they pull
the plug), or have two forms of payment on file. Preferably, don't use your
GCP credit card at dodgy online retailers too...

~~~
ikiris
At least someone gets it.

~~~
sn41
I may be missing something, so help me out here... I get the impression that
the author was not told the precise reason why the activity was suspicious.
Wouldn't a precise error message, if not actually a human interface, have been
helpful? Why the generic "suspicious activity" warning?

It seemed very Kafkaesque to me, getting tried and convicted without any
mention of the crime or charge. I think the author is justified in his
disapproval.

------
SoulMan
I can echo the sentiment here. There have been a few times when they have
broken backward compatibility, resulting in a production outage for us without
even a new deployment. For example, the BigQuery client library suddenly
started breaking because they had rolled out some changes to the API contract
the library was calling. When we reached out to support they took it very
lightly, asking why we were even using "the ancient version of library". OK,
fair enough, we upgraded the library to the recommended version, but alas! The
Dataflow library started breaking due to this new upgrade. For the next few
hours support just kept on playing binary search for a version which was
compatible with both BigQuery and Dataflow while production was down.

The worst part is that when we did a post-mortem and asked Google why the
support resolution was so slow despite our being "the privileged" customer,
their answer was that the P1 SLA was only to respond within 15 minutes; there
is no SLA for resolution. Most of the "responses" we were getting were that a
new support guy had taken over in a new time zone, which is the most useless
information for us.

We are seriously thinking of moving to another cloud vendor.

~~~
user5994461
In my experience, the support from the other clouds is equally useless if not
worse.

AWS would never admit that anything is wrong from their side.

------
shaneos
I wonder how prevalent this behavior is. Mozilla behaves the same towards
browser extensions, which our business depends on. They removed our extension
multiple times, each time before asking for something different, be it
uncompressed source code, instructions for how to build it, a second privacy
policy separate from our sites policy and more. Each time we would have
happily responded to a request promptly, but instead you find out when you’ve
been shut down already.

Grace periods that respect your business should be a standard that all service
providers hold themselves to

~~~
ryan-c
It sounds to me like Mozilla identified your extension as potentially
malicious and, prioritizing user safety, shut you down first.

As far as I know, Mozilla has no business relationship with extension
developers, so I would actually be very concerned if their first action
_wasn't_ to cut you off.

~~~
jazoom
I can confirm Mozilla handles this very poorly. I had the exact same
experience with them. It was so bad that I actually just left the extension
off their store and now focus on Chrome.

There is nothing dodgy about the extension. Mozilla was just being ridiculous.

~~~
davidgerard
What was the extension? Specifically.

~~~
jazoom
A companion extension to my price comparison website.

~~~
ryan-c
That entire class of browser extensions is shady. Do you make money on
referrals to shopping sites?

~~~
jazoom
Not from the extension (not that it would be against Mozilla's ToS if it did).
It has other nice features to make our users' lives better.

Thank you for judging my business without even knowing it.

~~~
ryan-c
Browser extensions that say they help with comparison shopping are a very
common type of "Potentially Unwanted Application" (PUA - aka malware with a
legal team). The infamous Superfish is an example of this type of thing, and
there are many others.

I don't know anything about your business or the extension, I'm just pointing
out that you're in a space that makes you suspicious by association.

~~~
jazoom
Fair enough. But this has nothing to do with Mozilla's actions. It was as GP
said. It includes things like their incompetence in dealing with a build
process that creates transpiled/minified code. Even when I gave them all the
source and the build instructions (npm run build) they still couldn't
comprehend what was going on. Yes, I know it's strange since Mozilla makes a
browser with a JavaScript engine.

Edit: I should add that after 2 weeks of back and forth emails the dude was
finally able to build it then blamed me for not mentioning he needed to run
"npm run build", even though I did mention it AND it's in package.json AND
it's mentioned in the (very short and concise) readme.txt.

So after this exasperating experience he just took down the extension without
warning and said it's because it contains Google Analytics.

I would have happily removed Google Analytics from the extension. The dude had
my source for 2 weeks and could have told me about that at any time, but
decided to tell me after 2 weeks of mucking around, after he had already
removed the extension.

It was me that decided it was not worth the hassle to have the extension on
their store. I just left it off.

------
sargun
I wonder if OP paid for Support?
[https://cloud.google.com/support/?options=premium-
support#op...](https://cloud.google.com/support/?options=premium-
support#options)

And had they converted their project to monthly invoicing:
[https://cloud.google.com/billing/docs/how-to/invoiced-
billin...](https://cloud.google.com/billing/docs/how-to/invoiced-billing)

~~~
lisper
What difference does that make? There's no justification for intentionally
shutting down a potentially critical service with no warning.

~~~
olefoo
"Oh hey, it looks like $customer suddenly started a bunch of coinminers on
their account at 10x their usual usage rate. Perfectly fine. Let them rack up
a month's billing in a weekend; why not?"

A hypothetical but not unheard of scenario in which immediate shutdown might
be warranted.

It's a rough world and different providers have optimised for different threat
models. AWS wants to keep customers hooked; GCP wants to prevent abuse,
Digital Ocean wants to show it's as capable as anyone else.

If you can afford it, you build resilient multicloud infrastructure. If you
can't yet do that; at the very least ensure that you have off-site backups of
critical data. Cloud providers are not magic; they can fail in bizarre ways
that are difficult to remedy. If you value your company you will ensure that
your eggs are replicated to more than one basket and you will test your
failover operations regularly. Having every deploy include failing over from
one provider to another may or may not fit your comfort level; but it can be
done.

~~~
halbritt
"If you can afford it"

There's a degree of complexity that comes with multi-cloud that's ill-suited
for most early stage companies. Especially in the age of "serverless" that has
folks thinking they don't need people to worry about infrastructure.

My point is that the calculus has more to it than just money. The prudent
response, of course, is to do as you described. Have a plan for your provider
to go away.

Offsite backups and the necessary config management to bring up similar infra
in another region/provider is likely sufficient for most.

~~~
olefoo
> There's a degree of complexity that comes with multi-cloud that's ill-suited
> for most early stage companies. Especially in the age of "serverless" that
> has folks thinking they don't need people to worry about infrastructure.

I just heard a dozen founders sit up and think "Market Opportunity" in glowing
letters.

CockroachDB has a strong offering.

But multi-cloud need not be complicated in implementation.

A few ansible scripts and some fancy footwork with static filesystem
synchronization and you too can be moving services from place to place with a
clear chain of data custody.

~~~
halbritt
A few ansible scripts? Nah.

Everything I have runs in kubernetes. The only difficulty I have to deal with
is figuring out how to deploy a kubernetes cluster in each provider.

From there, I write a single piece of orchestration that will drop my app
stack in any cloud provider. I'm using a custom piece of software and
event-driven automation to handle the creation and migration of services.

Migrating data across providers is hard as kubernetes doesn't have snapshots
yet.

There are already a lot of startups in this space doing exactly the kind of
thing that I just described. Most aim to provide a CD platform for k8s.

------
ccleve
This is fatal. I have a small pilot project on Google Cloud. Considering
putting up a much larger system. Not now.

The costs of Google may be comparable or lower than other services, but they
don't seem to get that risk is a cost. Risk can be your biggest cost. And
they've amplified that risk unnecessarily and shifted it to the customer.
Fatal, as I said.

~~~
timc3
Making a decision purely based upon some posts on HN and the original article
isn’t a good idea either, as there is little data on how often this happens
(and pulling the plug could happen with another IaaS). You need to weigh up
your options for risk management based upon how critical your project is and
the amount of time/money you have to solve the issues.

You might never see this happen to your GCP account in its lifetime.

------
flossball
This is a hallmark of Google's lack of customer service. They used to use the
same filtering alg on customer search feeds as public. The system was a grey
list of some sort and the client was worth about 1m in ads a day to them.
Nevertheless, once a month it would get blocked. Sometimes for over a day
before someone read the email complaint and fixed it. We had no phone, chat,
or any other access to them. They have no clue how to run a business nor do
they care. Never partner with them.

------
glogla
There's quite a lot of people talking about how this is their own fault, that
they should have expected it, that they should have been prepared. Victim
blaming, some would say, even.

But even if you assign blame to the OP for not expecting this, it doesn't look
good, because the lesson here is "you shouldn't use google and if you do,
expect them to fuck you over, for no reason, at any time".

~~~
wpietri
Exactly. The whole point of using AWS, Google Cloud, etc, is that you get to
stop thinking about certain classes of problems. An infrastructure provider
that is unreliable cancels most of the value of using them for infrastructure.

~~~
mmt
Worse, they can potentially more than cancel it out, if they merely remove the
"worrying about hardware" (yes, and network and load balancers and everything
else) aspects, which are, at least, well understood by some of us out on the
market, and replace it with "worrying about the provider" where a failure
scenario is, not only more opaque, but potentially catastrophic, since it's a
single vendor with _all_ the infrastructure.

It reminds me of AWS's opacity-as-antidote-to-worry with respect to hardware
failures. If the underlying hardware fails, the EC2 instance on it just
disappears (I've heard GCP handles this better, and AWS might now, as well). I
like to point out that this doesn't differ much from the situation of running
physical hardware (while ignoring the hardware monitoring), both from a
"worry" burden perspective and from a "downtime from hardware failure"
perspective.

------
manigandham
Google just doesn't have the talent, skills, or knowledge for dealing with
business customers. They don't have competition in adtech and so never
learned, but that doesn't work with GCP. They have great technical features
but don't realize that's not what matters to a customer who wants their
business to run smoothly.

We've gone through several account teams of our own that seem to be eager to
help only to turn into radio silence once we actually need something. We have
already moved mission-critical services to AWS and Azure, with GCP only
running K8S and VMs for better pricing and performance.

GCP has good leadership now but it's clearly taking longer than it should to
improve unfortunately.

~~~
paulie_a
I generally agree with you, but there is one exception: Google Fi has amazing
support. I am surprised GCP wouldn't have similar support considering the
obvious cost differences, though.

~~~
manigandham
Google Fi is for consumers.

~~~
paulie_a
And? Businesses should get even better support.

~~~
manigandham
>> Google just doesn't have the talent, skills, or knowledge for dealing with
_business_ customers.

------
cameldrv
This is the problem with being excessively metrics-driven. They have a fraud
problem, and there's some low-dollar customer that their algorithm determines
has, say, a 20% chance of fraud. They know that 80% of the non-fraudulent people
will just upload their ID or whatever immediately, and they shut down all the
fraud right away. Their fraud metrics look great, and the 20% of customers
that had a problem have low CLV so who cares? It's not worth the CSR to sort
it out, and anyhow, the CSR could just get socially engineered. The problem is
that the 20% are going to talk to other people about their nightmare
experiences.

It may not be expensive to Google to lose the business, but it's very
expensive for the customer. Google's offering is now non-competitive if you
aren't doing things at enterprise scale. Of course many of Google's best
clients will start out as small ones. The metrics won't capture the long-term
reputational damage that's being done by these policies.
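
A rough worked version of those numbers, assuming 1,000 flagged accounts (a
made-up figure) and reading the 80% as the share of legitimate customers who
verify right away. The function name is hypothetical:

```python
def flagged_outcomes(flagged: int, fraud_rate: float, verify_rate: float) -> dict[str, int]:
    """Split a batch of flagged accounts into fraud, verified, and stranded."""
    fraud = round(flagged * fraud_rate)       # accounts that really are fraudulent
    legit = flagged - fraud                   # false positives
    verified = round(legit * verify_rate)     # legit customers who upload ID at once
    stranded = legit - verified               # legit customers left shut down
    return {"fraud": fraud, "legit": legit, "verified": verified, "stranded": stranded}


if __name__ == "__main__":
    # Assumed inputs: 1,000 flagged accounts, 20% fraud, 80% quick verification.
    print(flagged_outcomes(1000, 0.20, 0.80))
    # {'fraud': 200, 'legit': 800, 'verified': 640, 'stranded': 160}
```

Under those assumptions the fraud metric looks perfect (all 200 shut down),
while 160 legitimate customers are the ones left telling the story elsewhere.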

~~~
nimbosa
This exactly describes Google's policy on small fish. By neglecting their
concerns, Google gets a lot of negative reputation from small startups who
spread the word on forums like this, making their technical innovation largely
irrelevant to their future success.

------
samfisher83
We seem to hear a lot of bad google customer support stories. I guess it
really shouldn't be surprising. Amazon grew as a company that put customers
first. Google is kind of known for not doing that. They shut down services all
the time. They don't really put an emphasis on customer support.

------
bhouston
I had the company visa blocked temporarily for suspicious activity twice in
the last 6 years and no one shut their service off, but I got a lot of
warnings. Seems like a really shitty thing for google to do.

Maybe for critical accounts you need to have a backup visa on file with Google
Cloud in case the first dies for security reasons.

A single visa is a single point of failure in an otherwise redundant system.

~~~
mark_l_watson
I was thinking the same thing. For my consumer Google account (google play
music, YouTube red, buying movies, app engine) I can add additional payment
methods [https://cloud.google.com/billing/docs/how-to/payment-
methods](https://cloud.google.com/billing/docs/how-to/payment-methods)

After reading this article, I am probably going to do this.

------
jdietrich
If you use cloud services, a crucial scenario in your disaster recovery
planning is "what if a cloud provider suddenly cuts us off?". It's a single
point of failure akin to "what if a DC gets demolished by a hurricane?" or
"what if a sysadmin gets hit by a bus?". If you don't have a plan for those
scenarios, you're playing with fire.

[https://libcloud.apache.org/](https://libcloud.apache.org/)
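
A minimal sketch of what the "provider cuts us off" branch of such a plan can
look like in code. This is not the libcloud API: `Provider`,
`ProviderUnavailable` and `deploy_with_failover` are hypothetical names used
only for illustration, and a real `deploy` would call the provider's own SDK.

```python
from dataclasses import dataclass


class ProviderUnavailable(Exception):
    """Hypothetical error: the provider refuses or cannot serve the request."""


@dataclass
class Provider:
    """Stand-in for a real cloud account (illustrative only)."""
    name: str
    suspended: bool = False

    def deploy(self, app: str) -> str:
        # A suspended account rejects every request.
        if self.suspended:
            raise ProviderUnavailable(f"{self.name}: account suspended")
        return f"{app} running on {self.name}"


def deploy_with_failover(app, providers):
    """Try each provider in order and return the first successful deployment."""
    errors = []
    for provider in providers:
        try:
            return provider.deploy(app)
        except ProviderUnavailable as exc:
            errors.append(str(exc))
    raise RuntimeError("all providers failed: " + "; ".join(errors))


if __name__ == "__main__":
    primary = Provider("primary", suspended=True)
    backup = Provider("backup")
    print(deploy_with_failover("monitor", [primary, backup]))
```

The point is only that the fallback path exists and gets exercised before the
day it is needed.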

------
lioeters
Sensible complaint/explanation for how this customer was treated: mission-
critical systems getting shut down without prior notice.

------
kuwze
I wonder what support option[0] they had from GCP.

[https://cloud.google.com/support/?options=premium-
support#op...](https://cloud.google.com/support/?options=premium-
support#options)

------
xt00
I’ve used AWS support many times and it’s actually really awesome. You can ask
them basically anything and they have experts on everything. Really
impressive. Yes you pay every month for it but it’s really good.

~~~
sargun
It sounds like this person didn’t pay for support.

~~~
balls187
AWS Billing support is Free. And it's also equally awesome.

Paid support tiers (as far as I know) are for deeper system level diagnostics.

------
shaohua
I have a similar story. I submitted a ticket to increase my GPU quota. Then my
account was suspended because the CSR thought the account was committing
fraud. At that point, I had a valid payment method and had been using GCP for
a couple of weeks. Only after I prepaid $200 and uploaded a bunch of documents,
including credit card pictures and ID pictures, was my account restored.

You heard me right, I prepaid them so that my account could be restored.

I miss AWS sometimes...

~~~
jopsen
Honestly, if you're not big enough to set up invoicing, an option to always
prepay wouldn't be so bad, if it reduces the risk of interruptions.

------
Bahamut
This story gives evidence to something I have seen from Google, and why I
refuse to pay them cash for a service ever again, much less critical ones -
Google customer service is bad by design. I have never seen a company so
arrogant and opaque/untransparent as them

------
jgalvez
This happened to me in a project too. Everything went down due to their bogus
fraud detection. I had a Kubernetes cluster down for over a day. Very
unfortunate as I loved GCP :(

~~~
masterleep
Same here. Total nuke of the project with no warning even though we were an
established paying customer, and there was no fraud involved.

------
pge
This highlights one of the challenges Google has going into the cloud market -
they don’t have a history of serving enterprise customers and the
organizational structure and processes to do that well. I think one of the
reasons Microsoft has gotten cloud market share so quickly with Azure (in
addition to bundling licenses with other products!) is that they have the
experience with enterprise customers and the organization to serve them well
(regardless of how Azure compares to GCP as a product). Supporting enterprise
customers is what they have always done - not so with Google (and Amazon had
to learn that as well).

------
stickfigure
Terrifying.

I'm curious though, did you have multiple credit cards on file with your
Google billing account? I'm under the impression that this is part of their
intended strategy for avoiding service interruption, but I'd like to know if
it actually works that way.

(I took this as a reminder to add a second card to my account)

------
londons_explore
GCP allows multiple payment methods.

You should _always_ have at least two payment methods on file with them for
anything important.

That way if one gets flagged for fraud, services won't be suspended.

------
oppositelock
This has happened to me. Google's billing system had a glitch, and all of a
sudden, an old bill which was paid years ago became unpaid. Google immediately
tore down everything in my account without notice due to non-payment.

If something like this ever happens in AWS, they email you, call you, give you
a grace period, and generally, do their best to avoid affecting your
infrastructure.

GCP is getting better, but it's not ready for anything other than science fair
experiments.

------
jacquesm
That's pretty bad of Google. I just looked in detail at a company using Google
Cloud exclusively for their infra and the application is somewhat similar to
what these guys are doing. I'll pass the article on to them. Thanks for
posting this.

------
JumpCrisscross
> _assets need to be monitored 24/7 to keep up/down with the needs of the
> power grid and the power purchase agreements made_

If you're doing something this mission critical, you have to have an SLA with
your cloud provider.

------
Animats
Does Google Cloud offer a service where they are bound by contract not to do
this? If not, no business should use them.

~~~
sargun
Yes

~~~
Animats
Is it a lot more expensive?

~~~
SuperQue
Probably cheaper, as you can start negotiating discounts when you get to this
size.

------
mark-ruwt
If maximum uptime is critical to the business, your infrastructure should be
cross-provider.

I've been running three providers as peers (DO, Linode, Vultr) as a one-man
shop for years, and I sleep better at night knowing that no one intern can
fatfinger code that takes me offline.

~~~
PretzelFisch
At that point, wouldn't hosting it yourself on something like VMware vSphere
be a simpler option? At least you would have a nice hardware abstraction and a
consistent API to build your tooling on.

~~~
mark-ruwt
I hear you. For me, the abstraction is the Linux distro. Build scripts
abstract out creating a clean, secure box before installing any custom
software, so regardless of the provider, every machine is exactly the same.
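
A minimal sketch of that idea, assuming Debian/Ubuntu-style images with `ufw`
available. The step list, `BASELINE_STEPS` and `render_bootstrap` are
illustrative names, not the actual scripts described above: the same baseline
is rendered for every machine, and only the hostname differs.

```python
import shlex

# Assumed baseline for a Debian/Ubuntu-style image; adjust per distro.
BASELINE_STEPS = [
    "apt-get update && apt-get -y upgrade",
    "ufw default deny incoming",
    "ufw allow OpenSSH",
    "ufw --force enable",
]


def render_bootstrap(hostname, extra_steps=()):
    """Render one shell script; nothing in it depends on the provider."""
    lines = [
        "#!/bin/sh",
        "set -eu",
        f"hostnamectl set-hostname {shlex.quote(hostname)}",
    ]
    lines += BASELINE_STEPS
    lines += list(extra_steps)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical hostnames, one per provider.
    for host in ("web-do", "web-linode", "web-vultr"):
        print(render_bootstrap(host))
```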

------
janment
Google's customer service is ridiculous. They once suddenly emailed me that my
merchant earnings (accumulated over about three years) would be escheated to
the national government unless I provided valid payment information within one
month. However, for some reason, it would take me more than one month to get
valid payment information.

Then they really did escheat my earnings to some national government after 30
days. However, when I asked them for the escheatment ID so that I could
contact that government to get my money back, they said they didn't have the
ID! They escheated my money without any record! Which is almost the same as
throwing my money in the ocean.

------
halbritt
I think this is another instance of GCP just being really, really terrible at
interacting with customers. I'm biased a bit (in favor of GCP), I suppose in
that I have a fair bit of infra in GCP and I really like it. I'll share a
couple anecdotes.

Last year my team migrated most of our infrastructure out of AWS into GCP.
We'd been running k8s with kops in AWS and really liked using GKE. We also
developed a bizdev arrangement.

As I was scaling up my spend in GCP, I began purchasing committed use
discounts. Roughly similar to reserved instances in AWS. I'd already made one
purchase for around $5k a month and these are tied to a specific region and
can't be moved. I went to purchase a second $5k block, and typo'd the request
ending up with $5k worth of infra in us-central rather than us-west. The
purchase doesn't go into effect until the following day and showed as
"pending" in the console. No big deal, I thought, I'll just contact support
and they'll fix it right away, I'm sure. I had this preconceived notion based
on my experiences with AWS. I open a support request and about an hour later I
get a response that basically tells me that once I've clicked the thing,
there's no undoing it and have a nice day.

I've literally just erroneously spent $5k for infra in us-central that I can't
use and their response was basically, "tough". $5k is a sufficiently large
loss that I'd be inclined to burn a few hours of my legal team's time dealing
with this issue, something I shared with the support person. After much
hemming and hawing over the course of a few days, they eventually fixed the
issue.

More recently, I've been dealing with an issue that is apparently called a
"stockout".

Unlike AWS, GCP does not randomize their AZ names for each customer. This
means that for any given region, the "A" availability zone is where most of
the new infrastructure is going to land by default. Some time in May, we
started seeing resource requests for persistent SSD failing in us-west1-a. The
assumption is that it would clear up pretty quickly after an hour or two, but
persisted. After about a day of this, we opened up a support case asking what
was going on and explaining the need for some kind of visibility or metrics
for this kind of issue. The response we received was that this issue was
"exceedingly rare" which was why there was no visibility and that it would be
rectified shortly, but couldn't be given any specific timeline.

I followed up with my account rep, he escalated to a "customer engineer" who
read the support engineer's notes and elaborated how "rare" this event was and
how unlikely it was to recur. Again, I contacted my account rep and explained
my unhappiness with the qualitative responses from "engineers" and that I
needed quite a bit more information on which to act. He was sympathetic
throughout this whole process escalating inside the organization as needed and
shared with me some fairly candid information that I couldn't get from anyone
else.

Apparently, the issue is called "stockout" and us-west1-a had a long history
of them. The issues had been getting progressively worse from the beginning of
the year and at this time, this AZ was the most notorious of any in any
region. Basically, the support engineer either patently lied or was just
making stuff up. Also, I shared with my GCP rep how AWS load-balances across
AZs. He promised to pass that along.

The moral of the story is that if you want to be in us-west, then maybe try
us-west1-c. Also, GCP is a relatively young arm of a well-established company
that has a terrible reputation for communicating with consumers.
They'll eventually figure it out, it will just take some time.
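
In practice the workaround is to avoid pinning provisioning to a single zone.
A minimal sketch, assuming a hypothetical `create_disk` callable that raises a
hypothetical `StockoutError` when a zone has no capacity (neither is a real
GCP client API):

```python
class StockoutError(Exception):
    """Hypothetical error: the zone has no capacity for the request."""


def create_in_first_available_zone(create_disk, zones):
    """Try each zone in order; return (zone, resource) for the first success."""
    exhausted = []
    for zone in zones:
        try:
            return zone, create_disk(zone)
        except StockoutError:
            exhausted.append(zone)
    raise StockoutError("no capacity in any of: " + ", ".join(exhausted))


def fake_create_disk(zone):
    # Stand-in for a real provisioning call; pretends us-west1-a is full.
    if zone == "us-west1-a":
        raise StockoutError(zone)
    return f"ssd-in-{zone}"


if __name__ == "__main__":
    print(create_in_first_available_zone(
        fake_create_disk, ["us-west1-a", "us-west1-c"]))
```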

------
cabaalis
Am I alone in thinking critical infrastructure monitoring like this should be
run on the metal in your own center? Sure, offload some data for processing
and reporting to cloud providers. But I'm slightly worried that electrical
grid technology is using something they cannot control, and freaking uptime
robot (I use it also) instead of proper IT in a controlled facility.

------
snowwindwaves
Usually there are a bunch of comments about power plants being connected to
the internet. I doubt the connections from the control rooms or cloud back to
the machines are read only unless they have protocol level filters to remove
write commands from the wan to plant networks.

Just the way it is, unless it's a nuclear plant, probably.

------
chevman
This is a rookie mistake - create and host a mission critical part of your
business (not just your backend infrastructure, your entire business) with a
vendor that you have no real, vetted contract in place with?

Total clown show.

Take this as a lesson in risk management and don't fuck it up again.

------
wiorcite
I lost my Gmail account (locked due to phone verification; I lost my SIM card
on holiday). All my vacation photos and important email are gone (I had done a
backup the day before), and that was about 3 years ago. I'm still waiting for
an explanation from Google: 20 emails sent, no response.

------
nasmorn
Google once shut down my employer's App Store account for 3 days for some
routine review. That didn’t just mean we didn’t get any money, which could
even be argued. It meant we were simply off the store. Because someone needed
to see a TPS report.

------
FloNeu
No doubt this is horrible practice and customer service on Google's side, but
as for the hypothetical question (what if the card holder is unreachable, blah
blah, we lose years of work, etc.): well, if you are not professional enough
to back up your system regularly, you should lose everything, and it's your
fault to begin with... For real, Google is not the only one who should step
up their game... Just because it's in the cloud now doesn't mean you should
ignore decades-old best practices... servers die, hard drives die, people
with passwords die... handling stuff like that is part of your job...

------
gcpsupport
I work here in Google Cloud Platform Support.

First, we sincerely apologize for the inconvenience caused by this issue.
Protecting our customers and systems is top priority. This incident is a good
example where we didn’t do a good job. We know we must do better. And to
answer OP’s final message: GCP Team is listening.

Our team has been in touch with OP over what happened and will continue
digging into the issue. We will be doing a full review in the coming days and
make improvements not only to detection but to communications for when these
incidents do occur.

~~~
seanvk
I think it starts with ensuring that you have staff that actually review an
action before it takes place. Relying on automation can be catastrophic for
someone like OP.

------
chappi42
Horrible. This reminds me of travels in 3rd world countries where sometimes
the electricity just dropped out. Ha, never would have thought that the one
Google would occasionally become like this...

------
ConsultantEU
We are currently conducting a study on behalf of the European Commission in
order to have a better understanding of the nature, scope and scale of
contract-related problems encountered by Small and Medium Sized Enterprises in
EU in using cloud computing services. The purpose is to identify the problems
that SMEs in the EU are encountering in order to reflect on possible ways to
address them. To assist the European Commission with this task you are kindly
invited to contact: eu.study.cloudcomputing@ro.ey.com

~~~
ClaudiaPopescu
This sounds interesting! I have founded an SME and I have experienced issues
w/ cloud computing, mainly data storage and problems with the contracts from
providers. I will write you an email! Thanks!

------
awinder
Whether AWS or Google Cloud, if you’re running a real business with real
downtime costs, you need to pay for enterprise support. You’ll get much faster
response times so that you can actually meet your SLA targets, and you’ll get
a wealth of information and resources from live human beings that can help
even into the design phases.

Feel free to budget that into your choices on where to host, but getting into
any arrangement where you rely on whatever free tier of support is nonsensical
once you’re making any kind of money.

~~~
johlindenbaum
Totally agree. There’s so many missing redundancies on this project. CTO /
Dir. Eng should have their own card. They should have some contact with a
Google account rep, and at least the basic support package.

We’re not a big Google customer, couple thousand a month, but when we migrated
we instantly reached out to account reps and have regular quarterly check ins.

~~~
halbritt
True.

Even still, I feel like it's still reasonably likely that the robot would shut
down a project for "suspicious activity".

------
Lectem
I wouldn't trust Google with anything anymore. Their customer support
(enterprise or not) has always sucked. They're always "right". You can only
suffer the damage silently unless you're worth millions to them. People should
come to realize that, for years now, Google has NOT been your friend.

------
Upvoter33
The weird thing here is that inside google, support is great. Meaning that
when people are building products for other google engineers, they go way
above and beyond what is needed to help each other out. Somehow, they are not
able to transition that to external customers.

------
shacharz
What really underlines this blog and thread is that as of the moment of
writing this comment, there's no official answer from Google, or even a non
official one from an employee. The feeling I get as a customer is that they
just don't care.

~~~
gcpsupport
I did make a post below actually a few hours back and will copy and paste it
below for reference. Rest assured that we are working on this one and are in
contact directly with the customer. We hope to have more of an official
response soon.

from below: I work here in Google Cloud Platform Support. First, we sincerely
apologize for the inconvenience caused by this issue. Protecting our customers
and systems is top priority. This incident is a good example where we didn’t
do a good job. We know we must do better. And to answer OP’s final message:
GCP Team is listening.

Our team has been in touch with OP over what happened and will continue
digging into the issue. We will be doing a full review in the coming days and
make improvements not only to detection but to communications for when these
incidents do occur.

~~~
masterleep
My company was another victim if you need another non-fraud case to look into.

------
whyagaindavid
Sad to hear of such events. Can anyone comment on what number of such events
(%) would be the tipping point that moves people to avoid any SaaS? Has anyone
experienced such tipping points leading to an overall change in trend,
especially in other industries?

------
dzimine
On top of "this sucks": such "fully automated" response is a violation of GDPR
compliance that requires "right to obtain human intervention on the part of
the controller".

Stunning that Google cloud can get away with this.

------
myrandomcomment
The first requirement of any production system is redundancy. It sounds like,
on top of the layers for CPU, storage, network and application, the new
requirement is redundant cloud platform providers.

------
anon12360238
The writer - Twitter handle @serverpunch - not only doesn't identify
him/herself, they seem to have created a Twitter account just to shit on GCP.

What is the activity which got flagged as suspicious?

------
kerng
Wow, haven't heard of a story like this yet where a cloud provider shuts down
all resources without warning...

Luckily, with AWS and Azure there is great competition to move to.

------
matt_wulfeck
Does Google allow you to preauthorize your purchasing card? It seems like part
of this is that they all of a sudden get suspicious about your billing
information.

------
exabrial
I guess Google treats their _paying_ cloud subscribers like their revenue
generating YouTube content generators. Moderately surprised...

------
SurenTer
Google services are great - unless something goes wrong. It seems googlers
have never googled "support" :)

------
oooooof
Wow if that’s possible on google cloud then I agree wholeheartedly.... don’t
risk your business by using google cloud.

------
danielrhodes
Larger companies on AWS will spread their infra across multiple accounts to
deal with this risk.

------
VendicarKahn
I experienced exactly the same thing earlier this month. Due to "suspicious
activity" on YouTube, Google suspended not only my YouTube account, but also
Gmail, Google Chat, and ALL other services they provide, including their Cloud
services.

They provide no explanation as to why the accounts are closed, and provide
only a single form to appeal the situation.

In my case they refused to release the locks on their services so all
information, all contacts, all files, all histories, all YouTube data, and
everything else they store is now effectively lost with no means of getting
the data back.

This was done without warning, and the account was locked due to "suspicious
activity on YouTube".

Through Experimentation with a new account I found that the "suspicious
activity" they were referring to was Criticism of the Trump Administration
policy of kidnapping children from their parents.

Posting such criticism to the threads that follow stories by MSNBC and other
news sources triggered Google to block YouTube and all other services they
provide and to do so without warning or any explanation.

------
emilfihlman
Here's the harsh truth:

If your stuff relies on "cloud" you are on your own.

------
ediazrod
Sorry, but this happened because Google and AWS lack the vision needed to run
true enterprise companies. I have seen this situation on Azure many times, and
in the end they try to locate the commercial people behind the customer... it
is a lack of enterprise vision.

------
javajosh
Well that's a real Horror Story and I hope Google's listening. I guess Amazon
web services really is the way to go.

~~~
jacquesm
Does it really matter whether you get bitten by the dog or the cat?

~~~
MemphisTN1
False equivalence. Amazon built their products on putting customers first
while Google is like a giant robot overlord without a human interface

~~~
quickthrower2
And dogs are supposed to be our best friends.

------
freedomben
Has anyone experienced anything like this on PaaS services like Heroku or
Gigalixir?

------
iosDrone
Wow, that is insane. Hope this gets distributed widely to Google Cloud
decision-makers.

------
some_account
I don't get it. Why use Google at all? There are tons of better things out
there.

~~~
dewey
I’d say that really depends on what you need. If you want a full platform with
a lot of integrated services there’s really only GCP, AWS, Azure, Tencent,
AliBaba and maybe something else I forgot. So I wouldn’t say it’s a ton of
better things that are available. Sure if you only need a bunch of VMs you
have a lot of options.

~~~
laegooose
I had an AliCloud experience similar to OP's, but they gave me a 24-hour
deadline.

"We have temporarily closed your Alibaba Cloud account due to suspicious
activity. Please provide the following information within [24 hours] by email
to compliance_support@aliyun.com in order to reopen your account: ... If you
fail to provide this information your account will be permanently closed, and
we may take other appropriate and lawful measurers. Best regards, Alibaba
Cloud Customer Service Center "

I provided the documents in ~30 hours because that's when I saw the email.
There was no further communication from Alibaba. I assumed everything was OK,
but 2 weeks later my account was terminated.

------
caiocaiocaio
I think I have been linked to at least a few hundred Medium articles, and this
is maybe the third one that was good.

------
dangero
Wow as a CTO this is a nightmare scenario. Is this common? I guess this means
google.com does not use Google Cloud because I’m sure they have uptime
targets. They cannot handle incidents like this and expect people to take them
seriously as a cloud provider.

~~~
kkirsche
Read the SRE book and learn about the companies you are talking about…

~~~
fwdpropaganda
I'm not the person you were responding to, but I'd be curious to hear more.
Could you please explicitly say whatever it was that you were implying?

~~~
sls
I think the reference is to this book, "Site Reliability Engineering" [1]. As
you will see on the main page [2], it is part of Google's effort to describe
"How Google Runs Production Systems", which is basically what the parent
comment was asking about.

[1]
[https://landing.google.com/sre/book.html](https://landing.google.com/sre/book.html)
[2] [https://landing.google.com/sre/](https://landing.google.com/sre/)

------
deltateam
> We would have lost everything — years of work

Okay that sounds like a greater systemic problem.

You should have been able to deploy your git repository on another system
pretty quickly, as well as have your own backups of your database.

The most time consuming thing should be setting up the environment variables.

Let me see, what else would be tricky: if you are using google analytics that
data might be gone, but your other metrics package should have had many
snapshots of that data too

~~~
hedora
From a Google shareholder’s perspective, this approach is unacceptable:

The only way to use their product safely is to engineer your entire business
so that cloud providers are completely interchangeable.

Forcing the entire industry to pay the cost of transparently switching upfront
completely commoditizes cloud providers, which means they’ll no longer be able
to charge a sustainable markup for their offerings.

This is fiscally negligent. Upper management should be fired.

However, it’s great for the rest of the industry — Google nukes a few random
startups from orbit, some VCs take a bath, early and mid-range adopters bleed
money engineering open source workarounds, and everyone else’s cloud costs drop
to the marginal costs of electricity and silicon.

~~~
deltateam
It's not even a premature optimization to make server code completely
interchangeable these days.

The shareholders should be proud that such naivete towards vendor lock-in is
still rampant.

------
iblaine
"Why you should not use Google Cloud" if you're a small business. A large
business will have contacts, either on their own or through a consulting firm,
that can call Google employees and get help. As a small company, you're at the
mercy of what Google thinks is adequate support for the masses.

