I was able to set up an entire ecosystem from scratch in a week that scales well and can be managed in one place.
When I first looked at Kubernetes, the complicated part was setting it up on a cluster. If you use Kubernetes on GCE or Azure, you don't have to do that step; everything else is ready to go for you!
- Automatic scaling of your application
- Service discovery
- Secrets and config management
- Logging in one central dashboard
- Able to deploy various complicated, distributed pieces of software very easily using Helm (Jenkins, Kafka, Grafana + Prometheus)
- Able to add new nodes to the cluster easily
- Health checks and automatic restarts
- Able to deploy any container to your cluster in a really simple way (if you look at Deployments, it's really simple.)
- Switch between cloud providers and still maintain the same workflow.
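To give a taste of the last few points, a minimal Deployment really is short. This is a sketch (the `hello-web` name and nginx image are illustrative, not from the original comment):

```shell
# Apply a minimal Deployment: three replicas of one container, self-healing
# and rolling-update behavior included for free.
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: hello-web
        image: nginx:1.25
        ports:
        - containerPort: 80
EOF
```

Scaling is then just a matter of changing `replicas` and re-applying.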
I won't ever touch Ansible again; I really prefer the Kubernetes way of handling operations (it's like a live organism instead of something you apply changes to).
Also, the entire argument that you probably don't need Kubernetes because your organization doesn't have tens or hundreds of nodes just doesn't make sense after using it.
Having a Kubernetes cluster with 3 nodes is 100% worth it, even for rather simple applications in my opinion. The benefits are just way too good.
PS - Could you share which books/websites/resources you used to get up to speed to the point where you're at now?
It is also nice that I could do this on Macs and Linux. I used my personal MBP for some of it and my office machine (a System76 running Ubuntu) for the rest.
The best advice would be to create, test, delete, over and over and over. I used personal AWS funds to do this while in grad school, so the added pressure of making the best use of limited funds was great -- I would spin up dozens and dozens of machines, join, run, and tear down in short timeframes. As usual, choose a simple project. I chose multi-machine TensorFlow inference of a model, since that is an obvious candidate for elasticity.
Funny story - I once spun up a K8s cluster, tried to delete the machines manually on AWS, and witnessed how incredibly robust the cluster is (new nodes kept getting re-instantiated). Lessons learned: 1. K8s is robust; 2. use kops to tear down the cluster, too.
AWS has their own tutorial, but I didn't like it -- it was too focused on tangential details.
Someone has to build your Docker images for Jenkins, Kafka, and Grafana + Prometheus.
I also like Kubernetes, but I don't think it is necessary for small, controlled environments.
Ansible just works, with the help of Galaxy. I set up Jenkins, Kafka, and Grafana + Prometheus faster with Ansible than with Kubernetes, with more and easier control, specifically when I take care of things inside those services that are not yet accounted for in the Docker container.
Also, small companies just don't need the scaling.
There is a level of complexity to Kubernetes.
And if you needed to leave GKE for another provider or a packaged distribution, how would that look?
Thanks for any answers!
> The Kubernetes API should now be available at http://localhost:8001, and the dashboard at this rather complicated URL. It used to be reachable at http://localhost:8001/ui, but this has been changed due to what I gather are security reasons.
I was playing around with GCE Hosted Kubernetes about a year ago, and things were pretty clear as far as I recall. I've read lots of positive things, and figured it's a good way to start.
Then I tried again recently, and I couldn't even get to the dashboard. Eventually, after several cryptic StackOverflow copy-pastes, I managed to load it (I don't even remember how), only for the session to expire after 10 minutes or so... It was utterly frustrating. As a result, I didn't actually get to the more interesting part I was planning to play with...
People say that there's a learning curve, and I get it. And I'm not even trying to install Kubernetes on my own, just to use a hosted service. I also like to think I'm pretty switched on when it comes to security and trying new things, but some things feel like too much of an obstacle for me, unfortunately.
On their website they list the following:
> Caution: As of September 2017, Kubernetes Dashboard is deprecated in Kubernetes Engine. You can use the dashboards described on this page to monitor your clusters' performance, workloads, and resources.
This stuff moves very fast. You can be excused for blinking, IMHO!
PS. McCloud? I sense nominative determinism at play.
I've been at one shop with a large-scale DC/OS installation. You can run a k8s scheduler on DC/OS, but by default it uses Marathon. DC/OS has its own problems for sure, and both tools require a full-time team of at least 3 people (we had 8-10). There are also a lot of things that will probably need to be customized for your shop (which labels to use, scripts to set up your ingress/egress points in AWS, HAProxy configuration or marathon-lb configuration, which is just an HAProxy container/wrapper), but I think I still prefer Marathon.
I briefly played with Nomad and wish I had spent more time with it. I know people from at least one startup around where I live using it in production. It seems to be a bit more minimal and potentially more sane.
The thing I hate about all of these is that there is no 1-to-n scaling. For a simple project, I can't just set up one node with a minimal scheduler. DC/OS is going to cost you ~$120 a month for one non-redundant node:
I hear people talk about Minikube, but that's not something you can expand from one node to 100, right? You still have to build out a real k8s cluster at some point. All of these tools are just frontends around a scheduling and container engine (typically Docker and VMs) that track which containers are running where and handle networking between nodes (and you often still have to choose and configure that networking layer: Weave Net, Flannel, etc.).
I know someone will probably mention Rancher, and I should probably look at it again, but last time I looked I felt it was all point-and-click GUI and not enough command-line flags (or at least not enough documented CLI) to really be used in an infrastructure-as-code fashion.
I feel like there's still a big missing piece of the Docker ecosystem: a really simple scheduler that can easily be stood up on new nodes, attach them to an existing cluster, and has a simple way of handling public IPs for web apps/HAProxy containers. I know you can do this with K8s, DC/OS, etc., but there is a lot of prep work that has to be done first.
Well, that's a gross simplification of what Kubernetes is. (I don't know about Marathon.)
Kubernetes is a "choreographer" of cluster operations. At its core it's a consistent object store that contains a description of the state you want your cluster to be in. Various controllers monitor this store and try to "reconcile" the real world with the desired state. Operations include things like creating persistent volumes, setting up networking rules, and, of course, running applications. To say that it's a frontend for a container engine is a bit misleading, since Kubernetes can control so much more.
It's a nicely layered system — a "pod" describes the desired state of a single instance of an app, a "replica set" describes the desired state of a set of pods, a "deployment" describes the desired incremental rollout of a replica set, and so on. It's also a design that scales down to a single node (hence the popularity of Minikube as well as Docker for Mac, which includes Kubernetes), as well as up.
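That layering can be watched in action from the objects a single Deployment creates underneath itself (the `demo` name is illustrative):

```shell
# Create a Deployment and scale it to two replicas.
kubectl create deployment demo --image=nginx
kubectl scale deployment demo --replicas=2

# The Deployment manages a ReplicaSet, which in turn manages the Pods;
# all three object kinds show up, each at its own layer.
kubectl get deployments,replicasets,pods
```

Deleting a Pod from that list just causes the ReplicaSet controller to reconcile the state back to two replicas, which is the "reconcile the real world with the desired state" behavior described above.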
It's also a design that means that with a few exceptions, your configuration can target any Kubernetes cluster, not a specific cloud vendor. Without a single modification, I can deploy my app to the local Kubernetes on my laptop, or to our production cluster on Google Cloud. While migrating to Docker/Kubernetes took nearly a year, migrating away from GCP would take us probably less than a week (most of it involving pointing DNS to new load balancers, and moving persistent volumes over).
Beyond Google Kubernetes Engine and various other clouds (Azure is apparently very good), there's a bunch of tools now that do the heavy lifting of creating a cluster somewhere. Kubeadm and Kops are both popular.
Have you tried GKE? https://cloud.google.com/kubernetes-engine/
It abstracts away all the work of setting up a close to production HA cluster, so you can jump quickly into developing & deploying your app. You can start with 1 node and ask GKE to scale to N when you want it.
kubeadm can do that:
it basically comes down to bootstrapping it like normal and then removing the 'node-role.kubernetes.io/master' taint so that things can run on the master node.
The one area in kubeadm that is still being worked on is bootstrapping a HA cluster, but if you don't mind having a single master node, you can easily bootstrap a cluster and then add nodes to it later.
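As a sketch of what that looks like (assuming kubeadm is installed and a CNI plugin will be applied afterwards; the taint name is the one from the comment above, as used in kubeadm-era releases):

```shell
# Bootstrap a single-node cluster like normal...
kubeadm init

# ...then allow ordinary workloads to schedule on the master
# by removing its taint (the trailing '-' removes it):
kubectl taint nodes --all node-role.kubernetes.io/master-
```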
Kelsey's tutorial is a bit outdated (Oct 2 2017 with k8s v1.8, v1.11 just got released). Here is a link to the official kubeadm guide for Creating a single master cluster with kubeadm:
It basically is just running
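The elided commands are presumably along these lines (placeholders left as-is, since `kubeadm init` prints the exact join command for you):

```shell
# On the master:
kubeadm init

# On each additional node, the join command that `kubeadm init` printed:
kubeadm join <master-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
```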
Note: this is not a production-ready cluster (it has a single master), and you should have some basic understanding of k8s, which the OP provides. I also highly recommend digging around kubernetes.io/docs - good material there.
I started with kubeadm some days before the release of k8s v1.11, which made some stuff I wrote obsolete, oh well... :) I really like the new kubeadm phase stuff, though.
There is also an official guide for Creating Highly Available Clusters with kubeadm (it's updated for v1.11) which I just went through:
I opted for the "stacked masters" approach (run HA etcd right on the k8s master nodes), wrote some Ansible tasks to automate the boring stuff like copying configs/certs, etc., and am currently (re-)exploring add-ons and advanced functionality (Helm, network policies, ingress controllers, Ceph via Helm, ...).
Let's see how far I get this time!
I've been through this! Did you decide not to publish it because it's outdated?
Let someone review it - I'd love to read it, and getting feedback will be cathartic, I promise! Post the source link!
Disclosure: I'm one of the founders
To be fair, stateful containers in general are a relatively new thing in K8s, and support has been improving.
Also, K8s is trying to do and abstract away a lot; it is more like a distributed operating system by itself. So it is more complicated than Swarm.
Are we doing it wrong?
We're using hyperkube and k8s 1.8, which came out around Q4 of last year.
Almost all of these I can trace back to user error (i.e., we told folks to do X, they didn't, and stuff broke). We're now having to write a preflight checklist of sorts that the app runs through to make sure a bunch of stuff is "ok." That in itself becomes brittle in my experience, so I'm reluctant to do it.
Even if you can’t use it for production, it’s highly worth setting up a prototype environment on GKE to see what it gives you. I believe they now have GPU support, including support for preemptible GPUs (much cheaper).
Also GKE does a good job of staying up to date with releases. They are on 1.10, with support expected soon for 1.11. They _fully_ manage the upgrade of both the Kubernetes and etcd masters, and the worker nodes.
As mentioned in the linked blog post, Kubernetes is a fast moving project, and to use it you should plan and allocate significant resources in your team to keeping up with the new releases. There are a large number of fixes and improvements since 1.8 and I would look very seriously at both upgrading, and changing your processes to allow you to stay closer to the current release version.
The Kubernetes project does not, and has no current plans to, have a long term support release.
We then ask to rerun the script. If it fails, we fix the script.
If you want to try a stripped-down version on your local machine, kubernetes-core can be installed into LXD containers via conjure-up with a snap of the fingers - nice for playing around.
If so, that will be harsh, and I am not sure how to help you there. There are many things that could go wrong, as you say.
If you do have control over the OS (you are providing an ISO or virtualization image) then it should mostly "just work". My company is doing a similar thing, only we ship boxes with everything pre-installed. There is another scenario for hybrid cloud, but even then they download a VMWare image.
Also, if you do have control over the OS: can you use CoreOS instead? It is very well suited to running K8s and has fewer things that can go wrong. Red Hat bought them anyway. With Ignition (or even old-fashioned cloud-init with a config drive), it is a no-touch deployment (you do have to generate and inject the certificates beforehand).
One thing that sounds weird is that you are "telling folks to do X". Can you avoid telling them anything and have it automated?
Also, the stuff you describe does not look like user error.
BTW, you should not expose the deployment YAML/JSON to "users"/"developers".
You should have a CI that just runs `kubectl set image deployment/name pod-name=IMAGE` and keep all deployment descriptors, etc., in a separate source repository.
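A hypothetical CI deploy step along those lines (every name here - registry, app, variable - is made up for illustration):

```shell
# Build and push a uniquely tagged image, then roll the Deployment to it.
IMAGE="registry.example.com/myapp:${GIT_SHA}"
docker build -t "$IMAGE" .
docker push "$IMAGE"

# Point the Deployment's container at the new image and wait for the
# rolling update to finish (non-zero exit if the rollout fails).
kubectl set image deployment/myapp myapp="$IMAGE"
kubectl rollout status deployment/myapp
```

Developers never touch the descriptors directly; the CI is the only writer.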
We support provisioning servers, building overlay networks with Vxlan, BGP & Wireguard, distributed storage and rolling out things like service discovery, load balancers and HA.
It may be worth exploring for those struggling with some of the complexity around container deployments. At the minimum it will help you understand more about containers, networking and orchestration.
Some, like Cassandra, are easier since it is multi-master, so you could just use local storage; if a node goes down, it's not the end of the world.
There is an interesting solution for MySQL: https://vitess.io which apparently is used by YouTube and Slack.
When you mature from using Docker to high-volume production, you should ditch containers altogether; they are good for prototyping and testing, but not for production loads and production security.
Also containers are better for production security than just having apps sitting side by side on the same disk.
I can provide more specific details if there’s questions.
In general ECS is fine when you have 1 or 2 services, but once you have 10s or 100s it gets pretty unusable.
Logging: While ECS ships everything to CloudWatch, actually accessing the logs is a nightmare and usually more effort than it's worth. It's very easy on k8s to get logs shipped from every container to Elasticsearch and browse them with Kibana.
Autoscaling: Possibly less of an issue now that Fargate's a thing - but Fargate's expensive. Kubernetes makes it easy to set up both node-level autoscaling and deployment-level autoscaling.
Ingress: With ECS you have to set up load balancers for every service individually - in k8s, once the controller's properly set up, every service defines its own ingress, and the only manual change you have to make is creating a DNS entry (and only if you can't use a wildcard).
Metrics: Prometheus is very ingrained in Kubernetes, and you get a wealth of information from every service almost "for free" once you've installed it.
And more: service meshes, secret control (through e.g. vault), declarative definitions, provider agnostic
Kubernetes has a huge startup cost and learning curve, but it is incredibly powerful once you're up and running.
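For instance, the deployment-level autoscaling mentioned above is a one-liner once resource metrics are available (the `myapp` name and thresholds are illustrative):

```shell
# Create a HorizontalPodAutoscaler targeting an existing Deployment:
# keep between 2 and 10 replicas, scaling on CPU utilization.
kubectl autoscale deployment myapp --min=2 --max=10 --cpu-percent=80

# Inspect the autoscaler's current target and replica counts.
kubectl get hpa myapp
```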
Logging: Subscribe the Elasticsearch Lambda to the CloudWatch Logs group to have all logs in an Elasticsearch domain. Kibana is included in the 'AWS::Elasticsearch::Domain' resource.
Ingress: Yes, one ALB per autoscaled ECS service; I don't see that as a problem.
Metrics, and more: not sure I see the benefits. CloudWatch has a myriad of metrics; you can get lost there :)
It might look hard, but here is a great starting point: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGui....
Yes, if you want to use AWS, use CloudFormation templates; there is no way around it :)
Also, this is not an argument _against_ containerization. You make no argument and show no evidence of performance benefits of bare metal vs. Docker.
It's in this kind of a world that elastic compute provided by a cluster scheduler with efficient packing looks really appealing. And our jobs, while meaty in time and space, are stateless once the initial data is poured in, up until the results pop out - they are big functions, basically. Good match for a stateless container.
So k8s looks quite attractive.
If your app, running in a Docker container, is compromised, e.g., by a PHP webshell, it might try to escape the container. What capabilities have you granted to your containers - CAP_SYS_ADMIN, CAP_NET_ADMIN - or do you have no idea? This is just one example of escaping: https://www.twistlock.com/2017/12/27/escaping-docker-contain... And let's talk about namespaces, like the user namespace. Is root inside the container the root user on the host system?
Does that feel like a production-ready system?
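To the capabilities question: Kubernetes at least makes the answer explicit in the pod spec. A sketch of a locked-down pod (the name, image, and UID are hypothetical):

```shell
# Run as a non-root user, drop all capabilities, and forbid privilege
# escalation, so a webshell has far less to work with.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: locked-down
spec:
  containers:
  - name: app
    image: registry.example.com/myapp:1.0
    securityContext:
      runAsNonRoot: true
      runAsUser: 10001
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
EOF
```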
If that's not enough, it's easy enough to ensure that a Pod is scheduled to run by itself in an otherwise unoccupied VM.
If that's not enough, the IaaS providers can be paid extra to ensure yours is the only VM on the physical machine.
You can have the same expensive guarantees, if you need them, but with a uniform control plane for all workloads. That's pretty attractive.
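One illustrative way to give a Pod a VM to itself is a dedicated node with a taint that only that Pod tolerates (the node name and label key here are made up):

```shell
# Reserve a node so that nothing schedules there by default:
kubectl taint nodes isolated-node dedicated=sensitive:NoSchedule
kubectl label nodes isolated-node dedicated=sensitive

# The sensitive Pod then carries a matching toleration and nodeSelector
# in its spec (fragment shown as a comment):
#   tolerations:
#   - key: dedicated
#     operator: Equal
#     value: sensitive
#     effect: NoSchedule
#   nodeSelector:
#     dedicated: sensitive
```

The taint keeps everyone else off the node, and the nodeSelector pins the sensitive workload onto it, all through the same uniform control plane.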
1. Every system has vulnerabilities. You can defend against them.
2. Any improperly configured system can be abused. In particular, the exploit you linked can be completely stopped with a litany of ways. https://news.ycombinator.com/item?id=16030107
Your argument going from "containers are unfit for production, you'll mature out of them one day" to "here's a small, preventable vulnerability" seems more like a security non sequitur than an actual argument against containerization.
Further, the claim that containers are not production-ready is empirically and literally negated by their use, in production, by the largest tech companies that have ever existed.
2. Not running Docker means you can lock down your httpd with chroot, a FreeBSD jail, or OpenBSD pledge.