101 products from 81 certified vendors. 33 completely separate, independent, certified hosted environments, plus every other entry on the CNCF listing. And no fewer than 12 different ways to install it yourself, on resources that you own one way or another. I think we're past landmark status already; OpenStack never did all this.
As a developer, I feel I am ready to go with this approach.
It's my ops teams that can't cope with that degree of choice – they're apprehensive to choose, knowing that with 90+ options and almost all of them acceptable to me and my team, there's non-zero risk that we're going to choose the wrong one! We'll have to switch. And who knows why? We'll find out, if we settle on one.
The operational expense for us to set up Kubernetes is already great enough. The prospect of eventually learning that we picked the wrong one, and then needing to switch to another, seems too large for them, I think.
Why not wait for the market to die down a little bit, or for that list to get just a little bit shorter first? Seems like I'll be waiting forever. If I narrow it down to only options that have been certified since K8S v1.9, maybe the choices will look a little bit more constrained.
I really want to convince them.
Just choose the cloud vendor you already trust the most or boot up a cluster on your own. It's just a set of systemd services. The level of fear regarding K8S offerings among developers is staggering and I _cannot_ figure out where it comes from. What would you "get wrong" that can't be easily changed? There aren't that many deep engineering pits to get yourself into that would take ages to get out of...
In particular, there are a number of options for the networking layer and the one you choose, and the way you configure it, can have significant performance implications.
There are 20+ choices for a cluster networking implementation here: https://kubernetes.io/docs/concepts/cluster-administration/n...
Setting up Kubernetes is a breeze if everything goes well, but the moment there’s even one error I just have no idea how to resolve it.
You also have to be extremely careful with your affinity / anti-affinity rules. The interactions get really complicated really fast.
I didn’t really consider that, but that’s another thing.
Before you can use it effectively, you more or less have to learn to speak its language. Gradual ramp-up isn’t really a thing.
Almost everyone already knows the Control Plane. That's where your Kube API is served from, and it potentially includes the etcd service maintaining the cluster state. The language has changed here, but this is still the most familiar example for anyone who has run a Kube cluster at any scale.
This taint on a node means that only pods which tolerate the taint may be scheduled onto that node. This is how you get so-called "dedicated masters", also known as your Control Plane. You can remove the dedicated taint in a single-node cluster to get a "minikube-like" experience without necessarily fanning out, while still keeping that option open. I think it's better to start with only a single node; that's how I gained much of my experience, at least. All of the reliability calculations are much easier when you don't need to divide by anything.
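To make the taint/toleration rule concrete, here is a minimal Python sketch of the check as I understand it. It is a simplified model, not the real scheduler: it only looks at the `NoSchedule` effect and ignores details such as empty-key tolerations. The comment above doesn't say which taint it means, so the key below is an assumption (it is the one kubeadm puts on control-plane nodes in recent releases).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str          # e.g. "NoSchedule"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Exists"   # "Exists" or "Equal"
    value: str = ""
    effect: str = ""           # empty string matches any effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """True if this single toleration matches this single taint."""
    if tol.key != taint.key:
        return False
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        return True
    return tol.value == taint.value


def can_schedule(node_taints, pod_tolerations) -> bool:
    """A pod may land on a node only if every NoSchedule taint is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in pod_tolerations)
        for taint in node_taints
        if taint.effect == "NoSchedule"
    )


# Assumed key: the taint kubeadm applies to control-plane nodes.
CONTROL_PLANE = "node-role.kubernetes.io/control-plane"
control_plane_node = [Taint(CONTROL_PLANE, "NoSchedule")]

ordinary_pod = []                                # no tolerations at all
system_pod = [Toleration(CONTROL_PLANE)]         # tolerates the taint

print(can_schedule(control_plane_node, ordinary_pod))  # False
print(can_schedule(control_plane_node, system_pod))    # True
print(can_schedule([], ordinary_pod))                  # True: taint removed
```

The last line is the single-node case: once the taint is gone, ordinary pods can share the node with the control plane.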
Practically nobody but cloud vendors really needs to care about masters or the Control Plane anymore, since so many cloud vendors have a cost-saving solution called "Managed Kubernetes" where you just consume the Kubernetes API and pay for your own application workloads, receiving the masters with High-Availability at low (or no) cost.
But that's the most basic way to explain or set up anti-affinities that I can think of. You can set up taints and tolerations for anything, say you have your own dedicated "Routing Mesh" or nodes that are used as load balancers, there'll most certainly be a taint you may use for that, or feel free to invent and supply your own. (Another thing we don't need to do, since cloud vendors provide LB services. At some layer you'll still find a place for this concept if you think about the architecture of your system or product, I suspect. But all of my boilerplate examples are stale.)
I think affinities are usually handled in other ways, like StatefulSet, but I am not really sure how to explain pod affinities. I'm still avoiding most stateful workloads, so my biggest piece of advice is to be sure that you are setting resource requests and limits on your pods, and that you have a system in place for refining those definitions. If you make sure you do that, then out-of-the-box Kubernetes will take care of a lot of the rest for you. Pods will effectively have an affinity for nodes that have enough resources available for them, so long as you remember to give the scheduler an estimate (the request) and maybe also a hard cap (the limit) of the resource usage for each pod deployed.
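As a rough illustration of what "an estimate and maybe also a hard cap" looks like, here is a small Python sketch. The `resources.requests` / `resources.limits` layout mirrors the pod spec fields; the container names and numbers are made up, and `parse_cpu` only handles the two most common CPU notations (whole cores like "2" and millicores like "500m").

```python
def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "2" into cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000   # millicores
    return float(quantity)


# Hypothetical pod spec fragment, written as a plain dict.
pod_spec = {
    "containers": [
        {
            "name": "web",
            "resources": {
                "requests": {"cpu": "500m"},   # the estimate
                "limits": {"cpu": "1"},        # the hard cap
            },
        },
        {
            "name": "sidecar",
            "resources": {
                "requests": {"cpu": "250m"},
                "limits": {"cpu": "500m"},
            },
        },
    ]
}


def total_cpu(spec: dict, kind: str) -> float:
    """Sum the CPU "requests" or "limits" across all containers in the pod."""
    return sum(parse_cpu(c["resources"][kind]["cpu"]) for c in spec["containers"])


print(total_cpu(pod_spec, "requests"))  # 0.75
print(total_cpu(pod_spec, "limits"))    # 1.5
```

The summed request is the number that matters for placement; the limit only matters once the pod is running.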
This was the major advantage of early Kubernetes when it first started putting CoreOS's Fleetd out of business: resource-aware scheduling. You can be explicit about node affinity with node selectors, like "the database server should run on the one node pool whose nodes are provisioned with 24 cores." But if your next-largest machine has only 8 cores, it might have been enough to just say in a resource request, "the database pod itself requires at least 12 cores." The effect is not exactly the same, but close. You might also prefer to use a taint/toleration/node selector combo to be sure that no other workloads wind up on that node, where they might cause performance cross-talk with the database.
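Here is the 24-core vs 8-core example as a toy Python model, to show why the two approaches end up in nearly the same place. This is only a sketch of the filtering idea, not how the real scheduler is implemented; the node names and the `pool` label are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: float
    labels: dict = field(default_factory=dict)


@dataclass
class Pod:
    name: str
    cpu_request: float = 0.0
    node_selector: dict = field(default_factory=dict)


def fits(pod: Pod, node: Node) -> bool:
    """A node is a candidate if its labels satisfy the selector and it has enough free CPU."""
    labels_ok = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    return labels_ok and node.free_cpu >= pod.cpu_request


def candidates(pod: Pod, nodes: list) -> list:
    return [n.name for n in nodes if fits(pod, n)]


nodes = [
    Node("big", 24, {"pool": "db"}),          # hypothetical label
    Node("small-a", 8, {"pool": "general"}),
    Node("small-b", 8, {"pool": "general"}),
]

explicit = Pod("db-by-selector", node_selector={"pool": "db"})
implicit = Pod("db-by-request", cpu_request=12)
other = Pod("some-web-app", cpu_request=2)

print(candidates(explicit, nodes))  # ['big']
print(candidates(implicit, nodes))  # ['big']
print(candidates(other, nodes))     # ['big', 'small-a', 'small-b']
```

The third result is the cross-talk problem: nothing stops an unrelated pod from landing on the big node, which is where the taint/toleration combo comes in.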