I was genuinely surprised that k8s turned out to be pretty straightforward and very sensible after years of never having anything to do with it and just hearing about it on the net. Turns out opinions are just opinions, after all.
That being said, what people tend to build on top of that foundation is a somewhat different story.
Unfortunately, people (cough, managers) think k8s is some magic that makes distributed-systems problems go away and automagically enables unlimited scalability.
In reality it just makes the mechanics a little easier and more centralized.
Getting distributed systems right is usually difficult.
I asked ChatGPT the other day to explain Kubernetes to me. I still don't understand it. Can you share what clicked for you, or resources that helped you?
A controller in charge of a specific type of object watches a database table representing that object type. The table represents the desired state of things. When entries in the table are CRUD-ed, that represents a change to the desired state of things. The controller then interacts with the larger system to bring the actual state of things into alignment with the new desired state.
"The larger system" is more controllers in charge of other object types, doing the same kind of work for their own object types.
There is an API implemented for CRUD-ing each object type. The API specification (model) represents something important to developers, like a group of containers (Pod), a load balancer with a VIP (Service), a network volume (PersistentVolume), and so on.
Hand wave hand wave, Lego-style infrastructure.
None of the above is exactly correct (e.g. the DB is actually a k/v store), but it should be conceptually correct.
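To make the desired-state idea concrete, here's a minimal sketch of one such object written out as YAML (the field names are the standard Deployment ones; the object name, labels, and image are just placeholders):

```yaml
# Desired state: "I want three copies of this container running."
# The Deployment controller's whole job is to notice when reality disagrees
# with this spec and to create or delete things until the two match again.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app            # placeholder name
spec:
  replicas: 3                  # the desired state: three running copies
  selector:
    matchLabels:
      app: example-app
  template:                    # the Pod template the copies are stamped from
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: web
        image: nginx:1.25      # any OCI image will do; nginx is a placeholder
```

Edit the `replicas` field and the controller converges the cluster toward the new number; delete the object and it tears the copies down.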
No, there are many controllers. Each is responsible for its own object types.
>What happens if [it] goes down?
CRUD operations on the object types it manages have no effect until the controller returns to service.
>If multiple controllers, how do they coordinate?
The database is the source of truth. If one controller needs to "coordinate" with another, it will CRUD entries of the object types those other controllers are responsible for. e.g. Deployments beget ReplicaSets beget Pods.
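Concretely, "Deployments beget ReplicaSets" just means the Deployment controller writes a ReplicaSet object into the store, and the ReplicaSet controller, watching its own object type, then creates the actual Pods. A rough sketch of that intermediate object (the name suffix and owner UID are placeholders):

```yaml
# Created by the Deployment controller; consumed by the ReplicaSet controller.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: example-app-5d4f8c7b9        # real names carry a generated hash suffix; this one is invented
  ownerReferences:                   # records which object this one exists to serve
  - apiVersion: apps/v1
    kind: Deployment
    name: example-app
    uid: "<uid of the Deployment>"   # placeholder; the real object stores the owner's UID
    controller: true
spec:
  replicas: 3                        # copied down from the Deployment's spec
  selector:
    matchLabels:
      app: example-app
  # the Pod template is carried over from the Deployment and omitted here
```

No controller calls another controller directly; they only read and write the shared store.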
The k/v store offers primitives to make that happen, but for non-critical controllers you don't want to deal with things like that: they can simply go down and will be restarted (locally by kubelet/containerd) or rescheduled. Whatever resources they monitor will just not be touched until they get restarted.
What clicked with me is having ChatGPT go line by line through all of the YAML files generated for a simple web app—WordPress on Kubernetes. Doing that, I realized that Kubernetes basically takes a set of instructions on how to run your app and then follows them.
So, take an app like WordPress that you want to make “highly available.” Let’s imagine it’s a very popular blog or a newspaper website that needs to serve millions of pages a day. What would you do without Kubernetes?
Without Kubernetes, you would get yourself a cluster of, let’s say, four servers—one database server, two worker servers running PHP and Apache to handle the WordPress code, and finally, a front-end load balancer/static content host running Nginx (or similar) to take incoming traffic and route it to one of the two worker PHP servers. You would set up all of your servers, network them, install all dependencies, load your database with data, and you’d be ready to rock.
If all of a sudden an article goes viral and you get 10x your usual traffic, you may need to quickly bring online a few more worker PHP nodes. If this happens regularly, you might keep two extra nodes in reserve and spin them up when traffic hits certain limits or your worker nodes’ load exceeds a given threshold. You may even write some custom code to do that automatically. I’ve done all that in the pre-Kubernetes days. It’s not bad, honestly, but Kubernetes just solves a lot of these problems for you in an automated way. Think of it as a framework for your hosting infrastructure.
On Kubernetes, you would take the same WordPress app and split it into the same four functional blocks. Each would become a container. It can be a Docker container or a Containerd container—as long as it’s compatible with the Open Container Initiative, it doesn’t really matter. A container is just a set of files defining a lightweight Linux virtual machine. It’s lightweight because it shares its kernel with the underlying host it eventually runs on, so only the code you are actually running really loads into memory on the host server.
You don’t really care about the kernel your PHP runs on, do you? That’s the idea behind containers—each process runs in its own Linux virtual machine, but it’s relatively efficient because only the code you are actually running is loaded, while the rest is shared with the host. I called these things virtual machines, but in practice they are just jailed and isolated processes running on the host kernel. No actual hardware emulation takes place, which makes it very light on resources.
Just like you don’t care about the kernel your PHP runs on, you don’t really care about much else related to the Linux installation that surrounds your PHP interpreter and your code, as long as it’s secure and it works. To that end, the developer community has created a large set of container templates or images that you can use. For instance, there is a container specifically for running Apache and PHP—it only has those two things loaded and nothing else. So all you have to do is grab that container template, add your code and a few setting changes if needed, and you’re off to the races.
You can make those config changes and tell Kubernetes where to copy and place your code files using YAML files. And that’s really it. If you read the YAML files carefully, line by line, you’ll realize that they are nothing more than a highly specialized way of communicating the same type of instructions you would write to a deployment engineer in an email when telling them how to deploy your code.
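As a deliberately minimal sketch of what those instructions look like in practice, here is roughly what the YAML for the PHP/WordPress tier might contain, using the community wordpress image; the names, replica count, and password are placeholders, and a real deployment would also need the database, Secrets, persistent storage, and an Ingress:

```yaml
# "Run two copies of the WordPress container and point them at the database."
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
spec:
  replicas: 2                        # the two PHP/Apache "worker servers"
  selector:
    matchLabels:
      app: wordpress
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
      - name: wordpress
        image: wordpress             # community image: Apache + PHP + WordPress preinstalled
        ports:
        - containerPort: 80
        env:
        - name: WORDPRESS_DB_HOST    # hostname of the database Service
          value: mysql
        - name: WORDPRESS_DB_PASSWORD
          value: change-me           # placeholder; a real setup would use a Secret
---
# "Give those copies one stable address and spread incoming traffic across them."
apiVersion: v1
kind: Service
metadata:
  name: wordpress
spec:
  selector:
    app: wordpress
  ports:
  - port: 80
    targetPort: 80
```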
It’s basically a set of instructions to take a specific container image, load code into it, apply given settings, spool it up, monitor the load on the cluster, and if the load is too high, add more nodes to the cluster using the same steps. If the load is too low, spool down some nodes to save money.
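The "add more when load is too high" part is itself just another object to CRUD. A HorizontalPodAutoscaler roughly like the sketch below tells Kubernetes to keep adding or removing copies of the WordPress container so that average CPU stays near a target; the thresholds here are arbitrary:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: wordpress
spec:
  scaleTargetRef:                  # which object's replica count to adjust
    apiVersion: apps/v1
    kind: Deployment
    name: wordpress
  minReplicas: 2                   # never drop below the usual two workers
  maxReplicas: 10                  # ceiling for the viral-article days
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70     # aim for roughly 70% average CPU across the copies
```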
So, in theory, Kubernetes was supposed to replace an expensive deployment engineer. In practice, it simply shifted the work to an expensive Kubernetes engineer instead. The benefit is automation and the ability to leverage community-standard Linux templates that are (supposedly) secure from the start. The downside is that you are now running several layers of abstraction—all because Unix/Linux in the past had a very unhealthy disdain for statically linked code. Kubernetes is the price we pay for those bad decisions of the 1980s. But isn’t that just how the world works in general? We’re all suffering the consequences of the utter tragedy of the 1980s—but that’s a story for another day.