
Manage Kubernetes Clusters on AWS Using Kops - betahost
https://aws.amazon.com/blogs/compute/kubernetes-clusters-aws-kops/
======
bryanlarsen
The Kubernetes team has been doing a lot of work to make these admin tools
less and less necessary. More and more pieces of it can be run from within K8s
itself. For example, etcd used to need to be set up and managed externally;
now it just runs inside. And extensions are growing up too; see CRDs in 1.7.

And unlike setup & management tools, it appears that we have a clear "winner"
for K8s app management: helm. And there's more overlap than you'd expect. For
instance, I recently typed "helm install prometheus" and not only did it
install prometheus, it installed it with all the hooks necessary to monitor
the K8s cluster.
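
For context, a minimal sketch of what that one-liner looks like, assuming
Helm's stable chart repository (the exact chart name here is illustrative and
may differ from what I actually typed):

    # installs Prometheus along with hooks for monitoring the cluster itself
    helm install stable/prometheus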

I'm not sure why I can't do the same to get an elasticsearch, logstash &
kibana stack (or a similar competing stack) set up as a cluster logging
solution. AFAICT right now you have to have the right flags set in your
kubelet startup script to do this, but that's the sort of thing that I hope
and believe K8s is making better.

And setting up a glusterfs cluster to use as a storage provider also did a
surprising amount of its setup in k8s.

Obviously K8s setup can't quite be reduced to a simple `apt-get install
kubelet`, but hopefully it eventually won't be much more than that.

~~~
justinsb
100% agreed, and looking forward to that - kops is a big part of that effort,
actually. If you look under the hood you'll see that kops uses a lot of the
kubernetes apimachinery. There's obviously a tricky chicken & egg situation
with creating the cluster in the first place, but the hope is that a lot of
the post-installation activities you do with kops could be done through
kubectl (e.g. adding groups of nodes of a different instance type), talking to
those same kops API objects on the k8s apiserver. The apimachinery team has
been doing great work to enable this, which seems to have come to fruition in
k8s 1.7.

------
shitloadofbooks
It's such a pity that AWS don't offer a managed Kubernetes cluster as a
service.

Our team has "wasted" a fair bit of time researching and implementing all the
bits and pieces to build pre-prod and prod Kubernetes environments, whereas
Azure's (and, obviously, GCE's) out-of-the-box solutions make it so much
easier.

~~~
tyingq
It would be nice, but it's easy to see why they don't.

For the customer set that wants a supported orchestration tool, it would
allow those AWS customers to more easily leave for Azure or GCE. Perhaps more
importantly, it would allow existing customers to have credible on-prem dev,
test, or disaster-recovery environments.

When you have the sort of market share AWS has, there's no real incentive to
open those doors. I suspect this won't change until/if this market is more
balanced.

~~~
atonse
Perhaps for some AWS services (ECS, etc.). But I would still stay on Amazon
for all the other surrounding services that are crucial: SQS, S3, occasionally
DynamoDB, Lambda.

I would love to have managed k8s on amazon. It wouldn't move me away because
of all those surrounding technologies.

~~~
tyingq
That's a good point, though there are on-prem and "other vendor"
API-compatible services for some of those, like S3.

Minio, for example:
[https://github.com/minio/minio](https://github.com/minio/minio), or Google's
Cloud Storage. Both are compatible with the S3 API.
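
As a rough illustration, standard S3 tooling can be pointed at a Minio
endpoint; this sketch assumes Minio's default port 9000 and the stock AWS CLI,
and the bucket name and endpoint are illustrative:

    # Minio credentials go in the usual AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
    # environment variables; --endpoint-url redirects the CLI away from AWS
    aws --endpoint-url http://localhost:9000 s3 mb s3://my-bucket
    aws --endpoint-url http://localhost:9000 s3 cp backup.tar.gz s3://my-bucket/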

~~~
toomuchtodo
It's only a matter of time until these AWS primitive replacements are mature
within Kubernetes, which is definitely a good thing.

AWS is wonderful, but you always have to have tooling ready to not be held
hostage by a vendor. And any vendor will hold you hostage once they have
enough market dominance or enough of your business.

------
joshma
Is anyone here running k8s in production with kops? Are there any missing
pieces that require "manual" work, like rotating certificates? How feasible is
it to run, say, 30 clusters out of the box with a 2-person team?

~~~
justinsb
I'm one of the kops authors, and I will say that a _lot_ of people run k8s
clusters created by kops in production - I don't want to name others, but feel
free to join sig-aws or kops channels in the kubernetes slack and ask there
and I'm sure you'll get lots of reports. In general kops makes it very easy to
get a production-suitable cluster; there shouldn't be any manual work required
other than occasionally updating kops & kubernetes (which kops makes an easy
process).

But: we don't currently support rotating certificates. There used to be a bug
in kubernetes which made "live" certificate rotation impossible, but that bug
has now been fixed, so it's probably time to revisit this. We create 10-year
CA certificates, though, so it isn't something you have to do except as good
security practice.

If you file an issue
([https://github.com/kubernetes/kops/issues](https://github.com/kubernetes/kops/issues))
for certificate rotation and any other gaps / questions we'll get to them!

~~~
erikb
How do you handle the fact that kubernetes requires the eth0 IP in no_proxy?
Do you set that automatically?

How do you handle the fact that DNS in a corporate network can get weird - for
instance, in Ubuntu 16.04 the NetworkManager setting for dnsmasq needs to be
deactivated?

How do you report dying nodes caused by the kernel version and docker version
not being compatible?

Do you report why pods are pending?

Does kops wait for a successful health check before it reports a successful
deployment (in contrast to helm, which reports success before the docker image
has even finished pulling)?

Do you run any metrics on the cluster to see if everything is working fine?

Edit: Sorry to disturb the kops marketing effort, but some people still hope
for a real, enterprise-ready solution for k8s instead of just another layer of
fluff added on a shaky foundation.

~~~
justinsb
kops is an open source project that is part of the kubernetes project; we're
all working to solve these things as best we can. Some of these issues are not
best solved in kops; for example, we don't try to force a particular
monitoring system on you. That said, I'm also a kubernetes contributor, so
I'll try to answer quickly:

* no_proxy - kops is getting support for servers that use http_proxy, but I think your issue is a client issue with kubectl proxy and it looks like it is being investigated in #45956. I retagged (what I think are) the right folks.

* DNS, docker version/kernel version: if you let it, kops will configure the AMI / kernel, docker, DNS, sysctls, everything. So in that scenario everything should just work, because kops controls everything. Obviously things can still go wrong, but I'm much more able to support or diagnose problems with a kops configuration where most things are set correctly than in a general scenario.

* why pods are pending: `kubectl describe pod` shows you why. Your "preferred alerting system" could be more proactive though.

* metrics are probably best handled by a monitoring system, and you should install your preferred system after kops installs the cluster. We try to only install things in kops that are required to get to the kubectl "boot prompt". Lots of options here: prometheus, sysdig, datadog, weave scope, newrelic etc.

* does kops wait for readiness: actually not by default - and this does cause problems. For example, if you hit your AWS instance quota, your kops cluster will silently never come up. Similarly if your chosen instance type isn't available in your AZ. We have a fix for the latter and are working on the former. We have `kops validate` which will wait, but it's still too hard when something goes wrong - definitely room for improvement here (see the sketch after this list).
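
A quick sketch of the two commands mentioned above; the pod name is
illustrative, and you may need to pass kops' global --name/--state flags if
they aren't already configured in your environment:

    # the Events section at the bottom explains why a pod is still Pending
    kubectl describe pod my-pending-pod

    # polls the cluster and reports whether the masters and nodes are healthy
    kops validate cluster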

In general though - where there are things you think we could do better, do
open an issue on kops (or kubernetes if it's more of a kubernetes issue)!

~~~
erikb
Nice, thanks. My feeling is that this is about 75% of what we want, and may
therefore really be the best solution there is right now. I'll bring your
responses into my next team meeting.

------
AlexB138
I've set up a handful of K8s clusters on AWS, some of them using Kops. I would
say it's the easiest time I've had, and it's what I chose for production. My
only real complaint is that it requires extremely broad IAM permissions. Great
tool, and a good community.

~~~
rellimevad
I know scoping the permissions is something they've been working on:
[https://github.com/kubernetes/kops/issues/1873](https://github.com/kubernetes/kops/issues/1873)

------
Pirate-of-SV
If you're using (or want to use) Terraform and are considering running k8s on
AWS, take a look at tectonic-installer[1] and its `vanilla_k8s` setting. My
opinion is that it's far better than the kops `-target=terraform` output. It
also uses CoreOS rather than Debian, which seems reasonable.

[1] [https://github.com/coreos/tectonic-installer](https://github.com/coreos/tectonic-installer)

~~~
bogomipz
Isn't Tectonic also based on kubeadm and Terraform? Kops and Tectonic felt
very similar to me.

~~~
lsvx
Tectonic is built on Terraform but is not based on kubeadm. However, a
Tectonic vanilla k8s cluster may be similar to a kubeadm one since they both
leverage Bootkube to provide self-hosted k8s.

------
mnutt
I recently started testing kubernetes using kops and was pleasantly surprised
by how explicit it was about exactly what changes it would make in AWS. If you
get past the testing phase and want to use it in conjunction with your
existing production infrastructure, I found AWS's VPC peering to be a handy
way to let your k8s cluster talk to existing services while keeping the
kops-created VPC separately managed.

~~~
jdc0589
The dedicated VPC approach definitely works. But, in case you weren't aware,
kops can launch a cluster in an existing VPC pretty trivially:

1\. Create your cluster specifying the existing VPC ID and CIDR; do not use
the --yes flag. (I'm not sure the VPC CIDR is really even used with this
approach.)

2\. Edit the cluster definition (kops edit cluster xxxxxxxxx) and just change
the subnet CIDRs to whatever nonexistent ones you would like kops to create
and use (see the sketch below).
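
A rough sketch of those two steps, using the `--vpc` and `--network-cidr`
flags; the cluster name, VPC ID, and CIDR values here are illustrative:

    # 1. create the cluster config against the existing VPC
    #    (no --yes, so nothing is provisioned yet)
    kops create cluster \
      --name=mycluster.example.com \
      --zones=us-east-1a \
      --vpc=vpc-12345678 \
      --network-cidr=10.0.0.0/16

    # 2. change the subnet CIDRs to unused ranges inside the VPC, then apply
    kops edit cluster mycluster.example.com
    kops update cluster mycluster.example.com --yes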

------
meddlepal
We wrote a similar guide a few months back at Datawire.io
([https://www.datawire.io/guide/infrastructure/setting-kuberne...](https://www.datawire.io/guide/infrastructure/setting-kubernetes-aws/))

It's a little bit out of date now, but we're working on an updated version
that deals with managing upgrades, CI/CD for infrastructure, and adding other
pieces of cloud infrastructure in a way that the Kube cluster can access them
seamlessly.

~~~
jdc0589
Can you comment on why you suggest using a public Route 53 hosted zone? Using
a public rather than a private zone has always felt pretty dirty to me.

~~~
meddlepal
Reducing friction. At the time, private zones in Kops were more of a PITA to
set up, and I would have had to explain a bunch more stuff in the article.

I might look at that again when I redo the guide. I'm also considering
switching to the new gossip-based discovery mechanism so that Route 53 isn't
even a requirement. I was recently setting up a cluster for a customer and
they complained about having to use Route 53.

~~~
jdc0589
Just learned about the gossip support; pretty interested in it too.

------
yebyen
How are people doing autoscaling clusters on AWS?

I just want to create a simple, single-AZ cluster with one master, and one
autoscaling group for workers. Ideally when there are no jobs to run, the ASG
scales down to zero.

(I have a Jenkins service that can spawn pods, and it works great on the
single-master-only cluster with no workers. There's even room for it to spawn
one pod alongside it on the master node; I just removed the dedicated taint
annotation. But as soon as I schedule one more pod, I wish I had an ASG for
worker nodes, because I'm requesting more CPU and RAM than is available and
I'm stuck waiting for the first pod to complete and resources to free up.)

If I had more nodes (or the possibility of more nodes), I'd likely put the
dedicated taint back and give my Jenkins master server pod a toleration so
that it can land there when the resources are available, or use a label
selector in the pod spec to ensure it winds up on the stable master, or both.
I think I know enough to put this combination together, but I'm sure I can't
be the first, and I don't want to miss the "golden path" if there is one; this
thread seemed like a good place to ask about it.
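
For what it's worth, a rough sketch of that combination (taint + toleration +
label selector), assuming the master is tainted with
`dedicated=master:NoSchedule`; the node name, pod name, and image below are
illustrative:

    # re-apply the dedicated taint to the master node
    kubectl taint nodes my-master-node dedicated=master:NoSchedule

    # jenkins-pod.yaml - tolerates the taint and pins the pod to that node
    apiVersion: v1
    kind: Pod
    metadata:
      name: jenkins-master
    spec:
      tolerations:
      - key: dedicated
        operator: Equal
        value: master
        effect: NoSchedule
      nodeSelector:
        kubernetes.io/hostname: my-master-node
      containers:
      - name: jenkins
        image: jenkins/jenkins:lts

    # then: kubectl apply -f jenkins-pod.yaml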

This seems like it would be a really common configuration, but I haven't been
able to find a lot of documentation or even blogs about setting up ASGs this
way.

I found kube-aws[1], which seems to be the only K8S orchestrator that actually
mentions creating AutoScaling Groups for worker nodes in the setup docs (??).
I found cluster-autoscaler[2] which appears to be the piece you need to get
the cluster to scale up an ASG when unsatisfied K8S resource requests have
left pending scheduled pods waiting, or scale down when nodes are idle for too
long. I'm totally mystified why this does not appear to be a supported
configuration of Tectonic or Mantl or Stackpoint, or... what's your favorite
K8s orchestrator and how does it handle this?

[1]: [https://github.com/kubernetes-incubator/kube-aws](https://github.com/kubernetes-incubator/kube-aws)

[2]: [https://github.com/kubernetes/autoscaler/tree/master/cluster...](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/aws)

~~~
philipcristiano
kops creates an autoscaling group for the worker nodes as part of the cluster
creation process. Adding the cluster autoscaler is as simple as deploying the
autoscaler (as mentioned in your linked docs) pointed at the correct ASG. The
IAM permissions for the autoscaler can be added with `kops edit cluster
$CLUSTER`.
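
A rough sketch of what that looks like, assuming the `additionalPolicies`
field in the kops cluster spec and the cluster-autoscaler's
`--nodes=min:max:asg-name` argument; the ASG and cluster names below are
illustrative:

    # 1. give the worker nodes the ASG permissions the autoscaler needs
    kops edit cluster $CLUSTER
    #    ...and add under spec:
    #    additionalPolicies:
    #      node: |
    #        [{"Effect": "Allow",
    #          "Action": ["autoscaling:DescribeAutoScalingGroups",
    #                     "autoscaling:DescribeAutoScalingInstances",
    #                     "autoscaling:SetDesiredCapacity",
    #                     "autoscaling:TerminateInstanceInAutoScalingGroup"],
    #          "Resource": ["*"]}]
    kops update cluster $CLUSTER --yes

    # 2. deploy cluster-autoscaler with its container args pointed at the
    #    kops-created ASG, e.g. --nodes=1:10:nodes.mycluster.example.com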

~~~
yebyen
Thanks! I have been using kubeadm for my "cluster" with some home-made ansible
playbooks, so I hadn't just gone ahead and tried it out yet. Good to know I'm
headed down a path that others have already been down.

~~~
jdc0589
I've spent a solid amount of time reading docs and looking into this. It
sounds like this might not be an issue for you, but the major gotcha is that
k8s does not yet support rescheduling pods when new nodes join the cluster to
re-balance things (there was some stuff in the proposal phase to address it,
though).

So, if your workload is made up of lots of short-lived, ephemeral stuff you
are good to go, but otherwise you may have to manually step in to rebalance
stuff onto new nodes.

The autoscaler addon might have addressed some of this, but I'm not seeing
anything obvious after a cursory overview of the docs.

~~~
philipcristiano
The autoscaler will add minions when there is a pod that cannot be scheduled.
It doesn't help in balancing existing things, but at least the new pod should
be able to go there.

~~~
jdc0589
I saw that and was confused; I thought that would work out of the box with k8s
since (I think?) it continually tries to keep scheduling pods. I must be
wrong, though.

If balancing gets implemented, am I correct that it would probably happen in
k8s core?

~~~
yebyen
I think I wouldn't assume that, myself; at least not after my stint in
VietMWare (PTSD from a prior vSphere/vSAN experience).

We had VMTurbo/Turbonomic and VMWare's internal DRS that would sometimes
compete against each other to decide where VMs should be scheduled.

You could handle balancing from inside of k8s core, or not. All you need is to
evict the pod on the over-provisioned node, and to arrange for provisioning of
the replacement node before the original pod is fully evicted. It should be
the same for Kubernetes, if I had to guess.

------
jacques_chester
If you need a management system that's not tied to AWS, I'd recommend folks
look at Kubo. We (Pivotal) and Google have been working on it for a while now.

[https://www.youtube.com/watch?v=uOFW_0J9q70](https://www.youtube.com/watch?v=uOFW_0J9q70)
for the motivation (tl;dr Kubernetes solves container orchestration, but
doesn't solve its own management).

[https://github.com/cloudfoundry-incubator/kubo-deployment](https://github.com/cloudfoundry-incubator/kubo-deployment) for the main repo.

Disclosure: I work for Pivotal.

~~~
justinsb
kops isn't actually tied to AWS; there's early support for GCE and very early
support for VMWare, with more on the way.

~~~
jacques_chester
I didn't realise! That's good.

------
ShakataGaNai
Running K8s on AWS isn't a new concept to me. Obviously lots of people are
doing it with Kops and other tools (e.g. Tectonic). What I find curious is
that Amazon is advertising this on their blog. K8s is essentially a direct
competitor to their own ECS service. Arguably K8s is much better, has much
larger community adoption, and is improving significantly faster.

Maybe AWS is gonna give up the ECS ghost? Or at least offer K8s as an option.
After all, GCE offers it in GKE, and so does Azure.

------
ciguy
We (www.startopsgroup.com) have been rolling out a lot of Kops clusters lately
for clients that need COLO + Cloud or multi-cloud environments. So far we
haven't found a better tool for the purpose, even if there are a few rough
edges.

------
machbio
Wish there were some solution for not having to use AWS Route53 -
[https://github.com/kubernetes/kops/issues/794](https://github.com/kubernetes/kops/issues/794)

~~~
justinsb
We should update that issue. There is support for private route53 domains,
which means you don't need a DNS domain name, and there is early experimental
support for gossip discovery (i.e. no Route53 at all) - just create a cluster
with a name of `<name>.k8s.local`.
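
A minimal sketch of the gossip path (no Route 53 at all); the cluster name and
zone here are illustrative:

    # the .k8s.local suffix switches kops to gossip-based discovery
    kops create cluster \
      --name=mycluster.k8s.local \
      --zones=us-east-1a \
      --yes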

~~~
jdc0589
Unsolicited feedback: I know you guys beefed up the documentation with one of
the most recent releases, but this is an area that probably needed some work
circa v1.5. Most of it is pretty self-explanatory if you are familiar with
Route 53, but it would be nice to have some more examples of stuff like
"here's what a reference cluster using a private Route 53 hosted zone and a
private network topology looks like, and here's why you might want to do
that".

Is there documentation on the gossip support anywhere? I'm not seeing much
aside from "just name your cluster with a k8s.local suffix and you are good to
go".

