As someone who doesn't know enough about K8s, this looks like an amazingly easy way to get stuck in.
Can anyone who has experience with real deployments advise on what pain-points may be encountered by growing something like this further than N nodes (and talk about what N might be?)
One of the challenges with Kubernetes is that it's pretty fast moving, so it's a good idea to work with resources that are up to date. I know a last commit from May doesn't seem very old, but that's going to at least be missing 1.7, and could be missing 1.6 or earlier as well.
Just to parrot... the governance and development model for Kubernetes is amazing. There are releases every three months and it's like clockwork. That said, I usually wait for the first point release to ship before upgrading; 1.x.0 releases always feel like RCs, and there's usually some critical bug fixed between 1.x.0 and 1.x.1.
Project plug: I am building a free service and tool for instantly spinning up ephemeral, cloud-hosted Kubernetes clusters, called Kubernaut. If you just want to play around with Kubernetes and start learning, it is a great tool. It's also useful for CI via Travis: https://github.com/datawire/kubernaut
I've found a few not-so-obvious pain points in my very limited k8s experience.
1) k8s makes a lot of sense for stateless applications (such as your website) but not so much for stateful applications that require a client to connect to the same container every time (there are ways to do it, but they are a pain in the ass.)
2) Tooling is getting better with time, but it's still pretty green. Packages for your usual orchestration tools like Puppet and Ansible are volunteer work, so they easily get out of sync or require more work than you'd expect to get going. Using their suggested YAML format leads to another problem: there's no easy way to keep secrets outside of the configuration files, unless you build your own process around it.
3) Some pieces, like a replicated DB, might be easier to keep outside of the k8s cluster. You can technically run them there, but they weren't designed for that kind of environment and sometimes it shows.
4) The CI/CD story using Jenkins pipelines is far from solved. There are several packages that provide some solutions, but the documentation tends to be horrible, and that leads to days of debugging through trial and error.
5) Leaky abstractions everywhere. As an example, the Jenkins plugins for building on your k8s cluster suggest using the "credentials" system, but you need to add the credentials manually after you boot the Jenkins service. Then your slaves stop receiving them and you have to reboot Jenkins (I had to reboot it on average every 3 or 4 builds.)
Don't let that discourage you from using k8s as a "run" cluster if your app is shared-nothing and stateless. It's so much easier to set up (especially on Azure and GCP), and it obviates the need for setting up Puppet + Sensu + load balancing just to make sure your service keeps running when a node dies.
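To make the "keeps running when a node dies" point concrete, here is a minimal sketch of that kind of setup (all names and the image are made up, and the API version is the pre-1.8 one): the Deployment keeps a fixed number of replicas scheduled on healthy nodes, and the Service load-balances across whichever pods are currently alive.

    apiVersion: apps/v1beta1      # Deployment API group in the 1.6/1.7 era
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3                 # if a node dies, its replica gets rescheduled elsewhere
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: web
            image: example/web:1.0
            ports:
            - containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: web
    spec:
      type: LoadBalancer          # on GCP/Azure this provisions a cloud load balancer for you
      selector:
        app: web
      ports:
      - port: 80
        targetPort: 8080

That's the whole "Puppet + Sensu + load balancing" story for a stateless service: the scheduler handles node failure and the Service handles the traffic.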
> Using their suggested YAML format leads to another problem: there's no easy way to keep secrets outside of the configuration files, unless you build your own process around it
Isn’t that what the Secrets resources are for? Just use those, and mount them into your containers as files or expose them as environment variables.
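As a rough sketch of that pattern (names and values are made up): the Secret holds the key/value pairs, and the pod spec pulls them in either as an environment variable or as files under a mount path.

    apiVersion: v1
    kind: Secret
    metadata:
      name: db-credentials
    type: Opaque
    stringData:                    # convenience field; the API server stores it base64-encoded
      username: myapp
      password: s3cr3t
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp
    spec:
      containers:
      - name: myapp
        image: example/myapp:1.0
        env:
        - name: DB_PASSWORD        # exposed as an environment variable
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
        volumeMounts:
        - name: creds              # and/or mounted as files under /etc/creds
          mountPath: /etc/creds
          readOnly: true
      volumes:
      - name: creds
        secret:
          secretName: db-credentials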
Kubernetes "secrets" aren't actually...well...secret. They're stored unencrypted in etcd with no ACLs. You have to go outside the Kubernetes system if you want them kept actually secret.
If I control a physical cluster of machines end-to-end, I might consider Kubernetes (because I don't have any better options and doing the work to actually secure it is probably less work than my alternatives), but it's also one of the reasons--though far from the only reason--that I couldn't begin to consider k8s if I'm running in AWS or another cloud environment.
> Kubernetes "secrets" aren't actually...well...secret. They're stored unencrypted in etcd with no ACLs. You have to go outside the Kubernetes system if you want them kept actually secret.
Work is in progress on encrypting the etcd database at rest. You can experimentally turn it on by following this doc:
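If it helps, the 1.7-era alpha looks roughly like this (it's experimental, so the flag and kind names may well shift): you point kube-apiserver at an encryption config via --experimental-encryption-provider-config and list which resources get encrypted with which provider.

    kind: EncryptionConfig
    apiVersion: v1
    resources:
      - resources:
        - secrets                  # encrypt Secret objects before they hit etcd
        providers:
        - aescbc:
            keys:
            - name: key1
              secret: <base64-encoded 32-byte key>
        - identity: {}             # fallback so existing plaintext data can still be read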
Regarding #2, I think it's hard to integrate k8s with the rest of your infrastructure. If you are using, say, a combination of Terraform + Ansible or Puppet to keep everything in shape, right now you have two options:
1) Write a bunch of bash scripts around kubectl and a bunch of YAML files. While painful, this is the way I ended up going (plus blackbox to GPG encrypt/decrypt secrets in the repo).
2) Try to use your usual tools (Ansible/Puppet) as a replacement for kubectl. This is the dream, but the plugins for Ansible and Puppet only support subsets of the latest k8s features, or require annoying stuff like setting your API endpoint on every task.
In other words, getting reproducible deploys between different k8s clusters (say, one for staging and one for production) is not really a thing yet. I guess there's an argument for saying that one should use k8s-native solutions for that (such as namespaces), but what about having clusters in different data centers?
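For what it's worth, the "bash scripts plus YAML files" approach from option 1) doesn't have to be much; a sketch of the shape it tends to take (all names are made up):

    # environments/staging.yaml -- the production twin differs only in the name
    apiVersion: v1
    kind: Namespace
    metadata:
      name: myapp-staging
    ---
    # The wrapper script is little more than kubectl pointed at the right context:
    #   kubectl --context=staging apply -f environments/staging.yaml
    #   kubectl --context=staging --namespace=myapp-staging apply -f manifests/
    # (same two lines with "production" for the other cluster)
    # Secrets stay GPG-encrypted in the repo (blackbox) and get decrypted into
    # manifests/ right before the apply, so plaintext never lands in git.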
Regarding CI/CD, we use GitLab pipelines to build and push containers and then run 'helm upgrade --install' to update the charts in the cluster to use the newly pushed images. It works very well and is not at all difficult to set up.
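In case it's useful as a starting point, a stripped-down version of that setup might look like the following .gitlab-ci.yml (the chart path, release name and registry login are illustrative, and cluster credentials/helm setup are assumed to be handled elsewhere):

    stages:
      - build
      - deploy

    build:
      stage: build
      script:
        - docker login -u gitlab-ci-token -p "$CI_JOB_TOKEN" "$CI_REGISTRY"
        - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
        - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"

    deploy:
      stage: deploy
      script:
        # --install makes the first deploy and every later upgrade the same command
        - helm upgrade --install myapp ./chart --set image.tag="$CI_COMMIT_SHA"
      only:
        - master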