
Kubernetes on Google, Azure and AWS Compared - stevenacreman
https://kubedex.com/google-gke-vs-microsoft-aks-vs-amazon-eks/
======
GordonS
> Azure is something I’ve avoided since using it for a few months last year. I
> was working as a Microsoft partner so it was unavoidable back then. Parts of
> it are alright but the user experience coming from an Amazon background is
> worlds apart.

Wat? IMO, the Azure portal is _amazing_ to work with, especially compared to
the old-fangled, inconsistent UI that AWS provides.

> Show them how some things on Azure need to be done in a clunky web UI

Years ago, this was true, but it hasn't been for a long time - the Azure web
UI of today is fast, consistent and looks great.

> some things on Azure need to be done in a clunky web UI, other things need
> Powershell and other random stuff uses the CLI

I've been using Azure for years, and I'm not aware of anything that can only
be done in the UI. I also don't believe there is anything that only works with
Powershell, or only works with the CLI.

> how that effects the design of DevOps pipelines and automation in general.
> Yes, you can make it work, but why make life hard for yourself

Eh? Azure DevOps pipelines are great to work with, and there's a huge library
of tasks available.

After reading this, it doesn't sound like the author has actually worked with
Azure recently, so I'm really not sure why he bothered including it in this
article.

~~~
maximilianburke
I've been in the process of migrating from AWS to Azure to take advantage of
start-up credits. Previous to this I was working at a large company on AWS-
only infrastructure. I feel like I've got enough experience with both at all
levels. In Azure we've been using Kubernetes for a number of months now.

I, too, think the Azure portal is amazing to work with. It feels much more
coherent than the AWS portal. I like the Azure command line package as well,
in preference to the AWS CLI. I've never touched the Azure PowerShell tools
and have never felt like I had to.

Creating resources via the portal or command line is something I only do as a
last resort anyways; we use Terraform for all of our cloud resource creation
purposes.

~~~
ladzoppelin
We are trying to be all Terraform but it's not moving fast enough. We still
need the Azure/AWS console/CLI to do much of the work. I switched from an
AWS shop to a "migrating to Azure from AWS" shop. The new job is all
Kubernetes work so I love it, but I cannot get comfortable with Azure. I was so
much more productive and confident with AWS, it's crazy. I miss S3, EBS,
Snapshots, Route53, IAM and even fucking AWS role policy. I understand MS
realizes this, which is why they change their API every 4 months, but it's
getting ridiculous: we can't even use AKS (instead of ACS) because for some
reason they don't roll out services to the US North Central Region. WTF?

~~~
maximilianburke
Yeah, there's a definite hurdle to get over when moving to Azure. I've been
finding analogs to most of the services I need, like Blob storage instead of
S3, managed disks instead of EBS.

There are some things I definitely prefer in Azure to AWS too; I find the AD-
based authentication to be much easier to understand and implement compared to
IAM. The SMB shares (File storage) are great, integrate well with Kubernetes,
and are a lot faster than EFS.

Non-premium VM IO performance is abysmal though, and I really wish I could
store SSH public keys in Azure AD.

------
sofaofthedamned
Interesting post. I'm a Linux guy but I'm in a new job so I'm having to ramp up on
Azure DevOps. It looked good at first, and I actually praised Microsoft, but:

1\. Creating a project timed out after 5 minutes or so. This was in the GUI,
but I had no adblocker or anything similar. Refreshed page = no project.

2\. Went to a different machine, logged in, no project still.

3\. Went to original machine - refreshed - no project. Logged out and back in
again - project was there.

Azure DevOps was awesome at first but I cannot trust something like this when
I work with it all day.

~~~
switch007
I've had similar experiences. Can't recall a similar issue with AWS, ever. And
those panes needing horizontal scrolling: wat?

~~~
illvm
XBox Dashboard designers moved to Azure :)

------
inscrutable
Fully agree with his comments as someone who's used GKE, AKS, EKS and Hetzner
for kubernetes clusters.

GCP's UX is so nice... e.g. compare the equivalent as command line option in
the UI vs the automation script that Azure gives you.

~~~
krn
> [...] as someone who's used GKE, AKS, EKS and Hetzner for kubernetes
> clusters.

How does Hetzner compare here? In terms of both general experience and costs.

~~~
cardine
> How does Hetzner compare here? In terms of both general experience and
> costs.

We use Hetzner for this purpose. It's hard to compare it with the other
providers since with Hetzner you are renting unmanaged servers, not turning up
cloud instances.

But it is absolutely the best value you can get. Our bill would be at least
10x higher if we were using Google/Amazon/Microsoft.

Considering we spent a lot on servers as is, that makes it well worth it for
us. It might not be if your product is very computationally light.

~~~
segmondy
I don't know your location, but if you are in the US and running your own k8s
on bare metal, I would reckon you should have at least 1 full-time person on
staff to handle all of it. Let's say that person costs only $50k/yr. Will your
workload on Google cost more than $50k? Google pretty much does all of
the management for GKE. Just deploy your app and run.

~~~
cardine
> I would reckon you should have at least 1 full time person on staff to handle
> all of it.

What exactly would that person be handling? We probably spend no more than a
couple hours per week handling everything involved. That's the point of
something like Kubernetes in the first place.

> Will your workload on Google cost more than $50k?

Our server bill with GKE would far exceed $50k/yr (it would exceed $50k/mo!).

As a result, Hetzner is a no-brainer for us. As mentioned before, that might
be different for you if you are doing something that is computationally light.
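
The break-even argument in this exchange can be put into a few lines of arithmetic. This is only a sketch built from the figures quoted in the thread (a $50k/yr operator, a GKE bill above $50k/mo, and a price gap of "at least 10x"); the function name and the derived bare-metal figure are illustrative assumptions, not numbers anyone here reported.

```python
def annual_saving(cloud_monthly_usd, price_ratio, ops_salary_usd):
    """Yearly saving from self-managing on cheaper hardware.

    cloud_monthly_usd: estimated managed-cloud bill per month
    price_ratio:       how many times more the cloud costs than bare metal
    ops_salary_usd:    yearly cost of the person running the cluster
    """
    cloud_yearly = cloud_monthly_usd * 12
    bare_metal_yearly = cloud_yearly / price_ratio
    return cloud_yearly - bare_metal_yearly - ops_salary_usd


# Thread figures: >$50k/mo on GKE, roughly 10x cheaper on Hetzner,
# and a hypothetical $50k/yr full-time operator.
print(annual_saving(50_000, 10, 50_000))  # 490000.0
```

With a small bill the sign flips: at around $4k/mo the same formula goes negative, which matches the caveat about computationally light products.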

------
stevenacreman
Author here.

I can say that this has pretty much gone as I'd expected.

Microsoft have 80,000+ developers. Their partner ecosystem is absolutely
massive. I've watched them hire hundreds of developer advocates that talk at
events. It's therefore quite hard to write anything online that's critical
without at least half of the comments coming from a biased source.

It is interesting to see the gap between views in the comments here.

Edit: I watch this comment go up and down a lot as those with an agenda and
those without vote against each other.

Weirdly, I did criticise AWS a bit for their EKS offering in the blog but I've
not had anywhere near the toxicity from people about that.

~~~
GordonS
> It's therefore quite hard to write anything online that's critical without
> at least half of the comments coming from a biased source

Some stuff is perceptual, other stuff is factual. The issue I personally took
with your post is that you stated things that are blatantly untrue, such as:

    
    
> some things on Azure need to be done in a clunky web UI, other things need
> Powershell and other random stuff uses the CLI
    

And now I also don't like how you are basically saying that anyone who
disagrees or downvotes you is some kind of shill for Microsoft!

~~~
stevenacreman
The spreadsheet linked in the blog is almost entirely factual and I made quite
a point of adding comments with links to various sources.

If you want to be pedantic about the sentence you took offense at, it is
actually true. There are operations that can only be performed using
Powershell. People have given examples in this thread. But that isn't the
point.

What I should have written for that part was that most people google for
solutions. I've done this in the past on Azure and you get a random assortment
of answers. Click here, Powershell this there, do whatever. It's disjointed
and takes you out of mental context. I could update the blog with more clarity
but I don't think anyone will change their mind.

I've used Azure, GKE, AWS. It was played down a little in the blog, but I have
used the non AKS parts of Azure pretty extensively. Have you tried GKE? My
suspicion is that you've been stuck in a Microsoft world for a while.

Not everyone is a paid shill but some people have a lot invested in the
Microsoft ecosystem and I can only assume that's the reason for some of the
comments here. I do 100% believe the Microsoft PR team has posted in here at
least once though :)

~~~
bk24
If you honestly believe that there is such a thing as "unbiased" and that this
article is an absolute objective demonstration of that, I don't know how much
higher your horse could get. When someone gets paid or invests time in
anything, whether it is to build, use, market, or sell a product, then that
person has become biased. Anyone who can convince themselves that he/she is
immune from this is delusional.

------
dfischer
I tried Azure lately – it felt like I booted into Windows 98. I couldn't stand
it. A lot of the CLI UX was buggy for me too.

AWS works. A lot of the UI is dated but the new designs are nice.

GCP is the best. Google has done a great job on Dev UX both CLI and Web
Console.

~~~
amf12
> I tried Azure lately – it felt like I booted into Windows 98. I couldn't
> stand it.

Care to elaborate?

~~~
dfischer
The sidebar immediately took me back aesthetically. UX wise I felt it was
terrible drilling down into resources I was looking for.

Of course UX is also based on my bias (AWS and GCP).

Nonetheless I was surprised at the presentational bar for the Azure UI. It
felt very dated. In the end I got the hang of it but again, it still felt like
using an old PC.

CLI wise I had a lot of errors spinning up resources. My support experience
getting it figured out was sucky. I had to abandon azure as a decision quickly
with that experience. Too risky.

~~~
bk24
a man of many subjective complaints

~~~
dfischer
a person lacking constructive comments

------
bsaul
A little off topic because it’s not a comparison: I recently had my first
experience using GKE (and Kubernetes in general), and although I managed to
get something working in the end, I would say it’s still pretty rough...

Documentation is a mess: the general layout for Google Cloud is really a pain
to read and navigate, but in addition to that you often have to jump between
the Kubernetes docs and the Google docs, with some information being in both.
Don’t do that please: either make it obvious that people need to get familiar
with certain chapters of the Kubernetes docs, or provide all the info (my
preference would actually go to the first option).

It’s quite hard to guess what you can do on the GKE web interface and what you
can’t. You can feel it’s meant for people who really know Kubernetes, and not
people who are discovering both GKE and Kubernetes at the same time.

And for the life of me I couldn’t get the load balancer to manage HTTPS. I’ve
read this was possible, but never saw the actual page explaining how. I ended
up using CloudFront but lost the ability to see the end-user IP in my logs in
the process. (Logging is also a real beast to tame on its own, with no obvious
way to know what is available by default, what is a paid option, what should
be configured on the Stackdriver website, and what should be coded.)

~~~
alpb
TLS on GKE with a BYO cert isn't very clear, I agree. It's basically in the
Kubernetes Ingress docs:
[https://kubernetes.io/docs/concepts/services-networking/ingr...](https://kubernetes.io/docs/concepts/services-networking/ingress/#tls)
For GKE-specific features, multiple certs per Ingress are explained here:
[https://cloud.google.com/kubernetes-engine/docs/how-to/ingre...](https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-multi-ssl),
and LetsEncrypt here:
[https://github.com/ahmetb/gke-letsencrypt](https://github.com/ahmetb/gke-letsencrypt)

If you use GKE ingress, you can access end-user IP in X-Forwarded-For header.
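
As a rough illustration of reading that header in application code, here is a minimal sketch. It assumes the load balancer appends the client address followed by its own address to `X-Forwarded-For`, so the client is the entry just before the trusted hops; that layout and the `trusted_hops` default are assumptions to check against the load balancer documentation, and `client_ip_from_xff` is a hypothetical helper, not part of any GKE or Kubernetes API.

```python
import ipaddress


def client_ip_from_xff(header_value, trusted_hops=1):
    """Return the client IP from an X-Forwarded-For header value, or None.

    trusted_hops is the number of right-most entries appended by proxies
    you control (assumed here: 1, the load balancer itself). Anything to
    the left of the returned entry was supplied by the client and can be
    spoofed, so it is ignored.
    """
    parts = [p.strip() for p in header_value.split(",") if p.strip()]
    if len(parts) <= trusted_hops:
        return None
    candidate = parts[-(trusted_hops + 1)]
    try:
        # Validate and normalise the address before logging it.
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        return None


# Documentation-range addresses used purely as sample input.
print(client_ip_from_xff("203.0.113.7, 198.51.100.1"))  # 203.0.113.7
```

Counting from the right rather than taking the first entry is the design choice here: a client can prepend arbitrary values to the header, but cannot change what the trusted proxy appends.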

~~~
bsaul
But then does that mean you’ll need to manually log the incoming IP from the
header? Or is there a way to see them in the default logs of either the
cluster or the load balancer?

------
craig_asp
"As it stands today I’ve personally used EKS, and AWS in general a lot. I’ve
used GKE a bit but only with my own personal credits doing Kubernetes The Hard
Way and spinning up a very quick GKE test cluster a while ago."

"I’m being serious when I say this: if the company I’m working for decided to
migrate to Azure I’d find a new job."

"It needs to be fast and bug free so that I can build cool automation on top.
Working on something like Azure, especially after having worked on AWS for
years, would be extremely depressing."

If the article is supposed to be an _unbiased_ comparison between cloud hosted
Kubernetes providers, I'd say it's a bit of a fail. For some it would be a
completely different experience because they have experience with Microsoft
technologies. And those people might as well quit if their company moves to
AWS or a non-Azure platform.

~~~
bk24
Well said, this article is the furthest thing from unbiased. OP obviously
spent a lot of years being paid to maintain infrastructure running crappy
software and is still feeling the scars. I'm no fan of MS myself but I do
acknowledge that they've improved a lot from the duct-taped and glued NT days
of old.

~~~
geezerjay
...yet you've addressed none of the points made in the article, and only
resorted to personal attacks.

------
wskinner
> Networking is the other reason. Google is miles ahead of everyone here.
> Similar story with HA and scaling.

Does anyone know what the author is referring to with this claim? I don't see
anything in the sheet to back this up. At least from a high level, all three
options support network policies via CNI, and GKE and EKS use the same one,
Calico.

~~~
stevenacreman
Hi, there's a comment on the cross region networking for GKE.

"Each cluster receiving an IP range for nodes and another for the containers
inside, which are directly routable across your private network, other
clusters, and regions."

AWS and Azure don't have a flat global network. You have to set up VPNs and
complicated overlay networks.

~~~
halbritt
Couple things:

Cross-region networking. To this day, I'm flabbergasted this isn't something
that AWS has. All the other networking bits seem more sensible to me as well.

Another is 2Gbps per core up to 8 cores, which is _way_ more throughput than
I've seen on any AWS or Azure instance.

On workloads where I care about Network IO:CPU ratio, I'm using 8 core nodes
and seeing 16Gbps throughput between them consistently.
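
The rule of thumb quoted here (2Gbps per core, capped at 8 cores) is simple enough to write down. The sketch below just restates the figures from this comment; the function and parameter names are made up for illustration, and the real limits for any given machine type should be taken from the provider's documentation.

```python
def egress_cap_gbps(cores, per_core_gbps=2, max_cores=8):
    """Throughput ceiling under the '2Gbps per core up to 8 cores' rule."""
    return min(cores, max_cores) * per_core_gbps


for cores in (2, 4, 8, 16):
    print(cores, "cores ->", egress_cap_gbps(cores), "Gbps")
```

Under that rule an 8 core node tops out at 16Gbps, which is consistent with the throughput reported between 8 core nodes.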

~~~
ti_ranger
> Cross-region networking. To this day, I'm flabbergasted this isn't something
> that AWS has. All the other networking bits seem more sensible to me as
> well.

AWS tries to avoid letting customers create multi-region or global failure
modes, like:

* [https://status.cloud.google.com/incident/compute/18005](https://status.cloud.google.com/incident/compute/18005) 22-hour incident with GLOBAL impact resulting in many VMs getting duplicate IPs (thus no network connectivity) on 2018-06-15

* [https://status.cloud.google.com/incident/cloud-networking/18...](https://status.cloud.google.com/incident/cloud-networking/18001) 3-hour (seemingly GLOBAL) impact to load-balancers (no updates or creations) on 2018-01-03

* [https://status.cloud.google.com/incident/compute/16007](https://status.cloud.google.com/incident/compute/16007) \- 18-minute GLOBAL outage on 2016-04-11

* [https://status.cloud.google.com/incident/compute/15055](https://status.cloud.google.com/incident/compute/15055) \- 5-minute GLOBAL outage on 2015-08-04

* [https://groups.google.com/forum/#!topic/gce-operations/fynnX...](https://groups.google.com/forum/#!topic/gce-operations/fynnXnb2OFs) 43-minute multi-region packet loss due to multi-region configuration deployment on 2015-03-07

* [https://groups.google.com/forum/#!topic/gce-operations/1uw-q...](https://groups.google.com/forum/#!topic/gce-operations/1uw-qEqjBdo) unscheduled reboot of 28% of instances in multiple regions in ~1.5h on 2014-09-17

There have been recent changes in many AWS services (e.g. DynamoDB, S3,
Aurora) to allow cross-region replication without introducing any multi-region
dependency, precisely to make it easier to implement multi-region
infrastructure to tolerate a single-region failure (however rare that is on AWS
compared to global outages on GCP).

> Another, is 2Gbps per core up to 8 cores which is way more throughput than
> I've seen on any AWS or Azure instance.

Which instance types did you use on AWS? Many recent instance types (r4, i3,
c5, r5, z1) support up to 10Gbps with one core (2 vCPU instances). However, you
may need to use placement groups (
[https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placemen...](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html)
) in large regions in order to get the full throughput on low TCP
connection numbers. The only reason I can think of that would explain why GCP
doesn't have this problem is they don't have any regions anywhere near as
large as large AWS regions ...

Edit: white-space only formatting changes

~~~
halbritt
Your point about global failure modes is well-taken. Even still, one could
reasonably expect inter-region networking not to be so difficult.

As for the instances I tested, they were m4 and r4. To get full 10GE out of
either, I needed to use m4.10xl and r4.8xl, both roughly around $900 per
month.

By comparison an n1-standard-8 is about $60/mo and the n1-highcpu-8 is $50/mo.
I've tested the former and got something like 15.8 or 15.9Gbps using iperf.

IMHO, that's pretty significant.
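
To put a number on "pretty significant", the prices and throughputs quoted above can be turned into a cost per Gbps. This is back-of-the-envelope arithmetic using only the approximate figures from this comment; it treats the whole instance price as the cost of bandwidth, which is a simplifying assumption, and `usd_per_gbps` is just an illustrative helper.

```python
def usd_per_gbps(monthly_usd, gbps):
    """Monthly instance price divided by achievable network throughput."""
    return monthly_usd / gbps


aws = usd_per_gbps(900, 10)    # m4.10xl / r4.8xl: roughly $900/mo for 10Gbps
gcp = usd_per_gbps(60, 15.8)   # n1-standard-8: about $60/mo, ~15.8Gbps measured

print(f"AWS: ${aws:.2f}/Gbps, GCP: ${gcp:.2f}/Gbps, ratio {aws / gcp:.1f}x")
```

On these figures the gap works out to a bit over 20x per Gbps, before accounting for the extra CPU and RAM the larger AWS instances include.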

------
andyfleming
I'm curious how the DigitalOcean offering compares. I think it's in limited
availability now.

------
kubenaught
My company is in the process of migrating off of a managed kubernetes
provider. Sure, it's nice to have someone else manage the operations of the
master. At the same time, a single customer is entirely insignificant to them.

We've experienced multiple outages from forced upgrades. We usually can find
out the reason through GitHub, but it may not be a priority for them to
provide the fix. If they do, it could take days or weeks for it to become
available. Much revenue has been lost because they can't do something at the
speed at which we could do it.

------
api
We are preparing to move to Google. Our app is CPU intensive and a killer
feature was the ability to spec nodes with a lot of cores and little RAM. We
save a lot of money by not paying for RAM we don't need. Google also had a
location in LA and we are in LA. 5ms to our cloud is nice.

------
Arnavion
> Show them how some things on Azure need to be done in a clunky web UI, other
> things need Powershell and other random stuff uses the CLI.

This, at least, is wrong. Everything related to Azure's hosted Kubernetes can
be done via the CLI without ever touching the web portal or PS.

------
drewmassey
After several weeks of swimming upstream on a greenfield EKS project I
switched gears to GKE. The paradigm just felt way better. For large
enterprises that need IAM integration on EKS I suppose that wouldn’t have been
an option but GKE just feels way more paradigmatic, for obvious reasons.

Did I mention we are hiring devops engineers? If you are a kubernetes guru
email me :-)

------
paxy
The service that launched on Azure in 2017 was ACS (Azure Container Service),
which is very different from what AKS is today.

------
capkutay
Slightly unrelated, but what would be the best way to migrate a kubernetes
application from one cloud to the other?

~~~
scirocco
OpenShift is a Kubernetes-based container platform that can run on AWS, Azure,
Google...

------
mverwijs
Anyone here want to share their experiences running k8s on IBM Bluemix
(formerly Softlayer)?

~~~
po
I use and like IBM Cloud (Softlayer) for k8s stuff... I find they are making
improvements to it (including docs) on a regular basis and I can often find
and talk directly to their devs in a slack channel. Some parts are a bit
clunky and I feel like I can see their legacy stuff poking through the
abstractions but overall I find it to be good.

I tried Azure and had some issues and didn't really like it. I haven't tried
AWS or GKE for k8s yet.

------
nbevans
His hatred for Azure ruins the article. There seems to be a massive
correlation at the moment between Kubernetes and hyped up magpie egomaniac
developers.

~~~
FlorianRappl
I thought 100% the same. While it is known that GKE yields the best managed
Kubernetes experience I thought the whole Azure bashing was not only
pointless, but also flawed with many inaccurate and even false statements. So
much for "honest".

~~~
halbritt
It correlates with my experience pretty closely.

Granted, I haven't really used Azure in six months.

------
lawrence143
Good post, Steven. Keep going..

------
conradk
For "Maximum pods per node", it shows GKE at 100, AKS at 110 but still puts
GKE in green. How come?

~~~
cobookman
Does anyone even have more than 20 pods / node? I've not personally seen
anything near the 100 pods / node limit in the real world.

~~~
regnerba
We don't have 100, but definitely more than 20. A quick look at a random node
from our cluster:

\- 3x ingress pods (1 for an internal load balancer, 2 for an external load
balancer)

\- 1x cert-manager

\- 2x production version of internal app01

\- 2x staging version of internal app02

\- 1x fluentd

\- 1x elastalert

\- 1x kubewatch

\- 1x prometheus node exporter

\- 1x redis for sentry

\- 1x sentry web

\- 2x sentry worker

\- 4x sourcegraph language servers (there are 9 language servers running
across the 4 nodes, this node seems to especially like them)

\- 1x thelounge

\- 1x staging version of app02

\- 2x production version of app02

\- 1x kube proxy

\- 1x kubedns

\- 1x couchdb (small footprint just for testing some things)

The node is just over half provisioned and we are due to add another node to
the cluster soon.
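
For a sense of scale against the limits discussed upthread, the list above can be tallied in a few lines. The counts are copied from the list; the per-node limits (100 for GKE, 110 for AKS) are the spreadsheet values as quoted earlier in this thread, and the short labels are only shorthand for the entries above.

```python
# (label, pod count) pairs copied from the node listing above.
pods_on_node = [
    ("ingress", 3), ("cert-manager", 1), ("app01 production", 2),
    ("internal app02 staging", 2), ("fluentd", 1), ("elastalert", 1),
    ("kubewatch", 1), ("prometheus node exporter", 1), ("redis for sentry", 1),
    ("sentry web", 1), ("sentry worker", 2), ("sourcegraph language servers", 4),
    ("thelounge", 1), ("app02 staging", 1), ("app02 production", 2),
    ("kube proxy", 1), ("kubedns", 1), ("couchdb", 1),
]

# Maximum pods per node, as quoted from the spreadsheet in this thread.
LIMITS = {"GKE": 100, "AKS": 110}

total = sum(count for _, count in pods_on_node)
print("pods on this node:", total)  # 27
for provider, limit in LIMITS.items():
    print(f"{provider}: {total}/{limit} pods ({total / limit:.0%} of the limit)")
```

So this node is above 20 pods but still at roughly a quarter of either limit.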

~~~
deepsun
What does the Prometheus node exporter collect from within a pod? I was under
the impression that to collect node stats you absolutely need to be "outside"
any container, i.e. using the prometheus-k8s integration so that it pulls node
stats from the k8s API, not the nodes themselves.

~~~
halbritt
The kube-state-metrics service runs as part of the kubernetes API and provides
container metrics. It replaces heapster.

The node exporter runs as a daemonset on each node and provides node specific
metrics like CPU, memory, disk IO, network IO, etc.

The node metrics come from the node itself.

