
> There's an (increasingly small) group of software developers who don't like "magic" and want to understand where their code is running and what it's doing. These developers gravitate toward open source solutions like Kubernetes

Kubernetes is not the first thing that comes to mind when I think of "understanding where their code is running and what it's doing"...






Indeed, I have to wonder how many people actually understand Kubernetes. Not just as a “user”, but exactly what it is doing behind the scenes…

Even an “idle” Kubernetes system is a behemoth to comprehend…


I keep seeing this opinion and I don't understand it. For various reasons, I recently transitioned from a dev role to running a 60+ node, 14+ PB bare metal cluster. 3 years in, and the only thing ever giving me trouble is Ceph.

Kubernetes is etcd, apiserver, and controllers. That's exactly as many components as your average MVC app. The control-loop thing is interesting, and there are a few "kinds" of resources to get used to, but why is it always presented as this insurmountable complexity?

I ran into a VXLAN checksum offload kernel bug once, but otherwise this thing is just solid. Sure it's a lot of YAML but I don't understand the rep.


“etcd, apiserver, and controllers.”

…and containerd and csi plugins and kubelet and cni plugins and kubectl and kube-proxy and ingresses and load balancers…


And system calls and filesystems and sockets and LVM and...

Sure, at some point there are too many layers to count, but I wouldn't say any of this is "Kubernetes". What people tend to be hung up on is the difficulty of Kubernetes compared to `docker run` or `docker compose up`. That is what I am surprised about.

I never had any issue with kubelet, or kube-proxy, or CSI plugins, or CNI plugins. That is after years of running a multi-tenant cluster in a research institution. I think about those about as much as I think about ext4, runc, or GRUB.


But you just said that you had issues with Ceph? How is that not a CSI problem?

And CNI problems are extremely normal. Pretty much anyone that didn't just use weavenet and call it a day has had to spend quite a bit of time figuring it out. If you already know networking by heart it's obviously going to be easier, but few devs do.


Never had a problem with the CSI plugin, I had problems with the Ceph cluster itself. No, I wouldn't call Ceph part of Kubernetes.

You definitely can run Kubernetes without running Ceph or any storage system, and you already rely on a distributed storage system if you use the cloud whether you use Kubernetes or not. So I wouldn't count this as added complexity from Kubernetes.


I'm not sure I can agree with that interpretation. CSI is basically an interface that has to be implemented.

If you discount issues like that, you can safely say that it's impossible to have any issues with CSI, because it's always going to be with one of its implementations.

That feels a little disingenuous, but maybe that's just me.


So if you run Kubernetes in the cloud, you consider the entire cloud provider's block storage implementation to be part of Kubernetes too?

For example you'd say AWS EBS is part of Kubernetes?


In the context of this discussion, which is about the complexity of the k8s stack: yes.

You're ultimately gonna have to use storage of some form unless you're just a stateless service/keep the services with state out of k8s. That's why I'd include it, and the fact that you can use multiple storage backends, each with their own challenges and pitfalls, makes k8s indeed quite complex.

You could argue that a multinode PaaS is always going to be complex, and frankly, I'd agree with that. But that was kinda the original point. At least as far as I interpreted it: k8s is not simple and you most likely don't need it either. But if you do need a distributed PaaS, then it's probably a good idea to use it. Doesn't change the fact that it's a complex system.


So you're comparing Kubernetes to what? Not running services at all? In that case I agree, you're going to have to set up Linux, find a storage solution, etc. as part of your setup. Then write your app. It's a lot of work.

But would I say that your entire Linux installation and the cloud it runs on is part of Kubernetes? No.


> So you're comparing Kubernetes to what? Not running services at all?

Surprisingly there were hosted services on the internet prior to Kubernetes existing. Hell, I even have reason to believe that the internet may possibly predate Docker.


That is my point! If you think "just using SystemD services in a VM" is easy but "Kubernetes is hard", and your reason that "Kubernetes is hard" is Linux, cgroups, cloud storage, mount namespaces, ... then I can't comprehend that argument, because those are things that exist in both solutions.

Let's be clear on what we're comparing or we can't argue at all. Kubernetes is hard if you have never seen a computer before, I will happily concede that.


ah I apologize for my snark then, I interpreted your sentence as _you_ believing that the only step simpler than using Kubernetes was to not have an application running

I see how you were asking the GP that question now


Next you’re going to claim the internet existed before Google too.

There are various options around for simple alternatives; the simplest is probably just running a single node.

Maybe with failover for high availability.

Even that's fine for most deployments that aren't social media sites, aren't developed by multiple teams of devs and don't have any operations people on payroll.


Because CSI is just a way to connect a volume to a pod.

Ceph is its own cluster of kettles filled with fishes


Very fair, although with managed services which are increasingly available, you don't typically need to think about CSI or CNI.

Hence

> Kubernetes is not the first thing that comes to mind when I think of "understanding where their code is running and what it's doing"...


CSI and CNI do about as much magic as `docker volume` and `docker network`.

People act like their web framework and SQL connection pooler and stuff are so simple, while Kubernetes is complex and totally inscrutable for mortals, and I don't get it. It has a couple of moving parts, but it is probably simpler overall than SystemD.


I was genuinely surprised that k8s turned out to actually be pretty straightforward and very sensible after years of never having anything to do with it and just hearing about it on the net. Turns out opinions are just opinions, after all.

That being said, what people tend to build on top of that foundation is a somewhat different story.


it’s not k8s. It’s distributed systems

Unfortunately people (cough, managers) think k8s is some magic that makes distributed systems problems go away, and automagically enables unlimited scalability

In reality it just makes the mechanics a little easier and centralized

Getting distributed systems right is usually difficult


I asked ChatGPT the other day to explain Kubernetes to me. I still don't understand it. Can you share with me what clicked with you, or resources that helped you?

A controller in charge of a specific type of object watches a database table representing that object type. The database table represents the desired state of things. When entries in the table are CRUD-ed, that represents a change to the desired state of things. The controller interacts with the larger system to bring the state of things into alignment with the new desired state.

"The larger system" is more controllers in charge of other object types, doing the same kind of work for their object types.

There is an API implemented for CRUD-ing each object type. The API specification (model) represents something important to developers, like a group of containers (Pod), a load balancer with VIP (Service), a network volume (PersistentVolume), and so on.

Hand wave hand wave, Lego-style infrastructure.

None of the above is exactly correct (e.g. the DB is actually a k/v store), but it should be conceptually correct.
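To make that loop concrete, here is a rough Go sketch of the observe/compare/act cycle described above. Everything in it (desiredState, observedState, converge) is a made-up stand-in for the real API and watch machinery, not actual Kubernetes code:

    // Illustrative control loop only: the helpers below are hypothetical
    // stand-ins, not the real client-go/apiserver machinery.
    package main

    import (
        "fmt"
        "time"
    )

    type State struct{ Replicas int }

    func desiredState() State  { return State{Replicas: 3} } // what the "table" says
    func observedState() State { return State{Replicas: 2} } // what is actually running

    func converge(desired, observed State) {
        switch {
        case desired.Replicas > observed.Replicas:
            fmt.Println("starting", desired.Replicas-observed.Replicas, "pod(s)")
        case desired.Replicas < observed.Replicas:
            fmt.Println("stopping", observed.Replicas-desired.Replicas, "pod(s)")
        }
    }

    func main() {
        for { // observe, compare, act, repeat
            converge(desiredState(), observedState())
            time.Sleep(5 * time.Second)
        }
    }

The real controllers do the same thing, just driven by watches on the apiserver (with periodic resyncs) instead of naive polling.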


Is there only a single controller? What happens if it goes down?

If there are multiple controllers, how do they coordinate?


>Is there only a single controller?

No, there are many controllers. Each is in charge of its own object types.

>What happens if [it] goes down?

CRUD of the object types it manages has no effect until the controller returns to service.

>If there are multiple controllers, how do they coordinate?

The database is the source of truth. If one controller needs to "coordinate" with another, it will CRUD entries of the object types those other controllers are responsible for. e.g. Deployments beget ReplicaSets beget Pods.


The k/v store offers primitives to make that happen, but for non-critical controllers you don't need to deal with things like that: they can go down and will be restarted (locally by kubelet/containerd) or rescheduled. Whatever resource they monitor just won't be touched until they get restarted.

What clicked with me is having ChatGPT go line by line through all of the YAML files generated for a simple web app—WordPress on Kubernetes. Doing that, I realized that Kubernetes basically takes a set of instructions on how to run your app and then follows them.

So, take an app like WordPress that you want to make “highly available.” Let’s imagine it’s a very popular blog or a newspaper website that needs to serve millions of pages a day. What would you do without Kubernetes?

Without Kubernetes, you would get yourself a cluster of, let’s say, four servers—one database server, two worker servers running PHP and Apache to handle the WordPress code, and finally, a front-end load balancer/static content host running Nginx (or similar) to take incoming traffic and route it to one of the two worker PHP servers. You would set up all of your servers, network them, install all dependencies, load your database with data, and you’d be ready to rock.

If all of a sudden an article goes viral and you get 10x your usual traffic, you may need to quickly bring online a few more worker PHP nodes. If this happens regularly, you might keep two extra nodes in reserve and spin them up when traffic hits certain limits or your worker nodes’ load exceeds a given threshold. You may even write some custom code to do that automatically. I’ve done all that in the pre-Kubernetes days. It’s not bad, honestly, but Kubernetes just solves a lot of these problems for you in an automated way. Think of it as a framework for your hosting infrastructure.

On Kubernetes, you would take the same WordPress app and split it into the same four functional blocks. Each would become a container. It can be a Docker container or a Containerd container—as long as it’s compatible with the Open Container Initiative, it doesn’t really matter. A container is just a set of files defining a lightweight Linux virtual machine. It’s lightweight because it shares its kernel with the underlying host it eventually runs on, so only the code you are actually running really loads into memory on the host server.

You don’t really care about the kernel your PHP runs on, do you? That’s the idea behind containers—each process runs in its own Linux virtual machine, but it’s relatively efficient because only the code you are actually running is loaded, while the rest is shared with the host. I called these things virtual machines, but in practice they are just jailed and isolated processes running on the host kernel. No actual hardware emulation takes place, which makes it very light on resources.

Just like you don’t care about the kernel your PHP runs on, you don’t really care about much else related to the Linux installation that surrounds your PHP interpreter and your code, as long as it’s secure and it works. To that end, the developer community has created a large set of container templates or images that you can use. For instance, there is a container specifically for running Apache and PHP—it only has those two things loaded and nothing else. So all you have to do is grab that container template, add your code and a few setting changes if needed, and you’re off to the races.

You can make those config changes and tell Kubernetes where to copy and place your code files using YAML files. And that’s really it. If you read the YAML files carefully, line by line, you’ll realize that they are nothing more than a highly specialized way of communicating the same type of instructions you would write to a deployment engineer in an email when telling them how to deploy your code.

It’s basically a set of instructions to take a specific container image, load code into it, apply given settings, spool it up, monitor the load on the cluster, and if the load is too high, add more nodes to the cluster using the same steps. If the load is too low, spool down some nodes to save money.
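For what it's worth, the scale-up/scale-down decision the built-in Horizontal Pod Autoscaler documents is roughly desired = ceil(current * observed/target). A tiny sketch of that arithmetic (the function name and values here are just illustrative, not the real implementation):

    // Sketch of the documented HPA formula:
    // desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
    package main

    import (
        "fmt"
        "math"
    )

    func desiredReplicas(current int, observed, target float64) int {
        return int(math.Ceil(float64(current) * observed / target))
    }

    func main() {
        // e.g. 4 PHP pods running at 180% of their CPU target -> scale to 8
        fmt.Println(desiredReplicas(4, 1.8, 1.0))
    }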

So, in theory, Kubernetes was supposed to replace an expensive deployment engineer. In practice, it simply shifted the work to an expensive Kubernetes engineer instead. The benefit is automation and the ability to leverage community-standard Linux templates that are (supposedly) secure from the start. The downside is that you are now running several layers of abstraction—all because Unix/Linux in the past had a very unhealthy disdain for statically linked code. Kubernetes is the price we pay for those bad decisions of the 1980s. But isn’t that just how the world works in general? We’re all suffering the consequences of the utter tragedy of the 1980s—but that’s a story for another day.


> People act like their web framework and SQL connection pooler and stuff are so simple

I'm just sitting here wondering why we need 100 billion transistors to move a piece of tape left and right ;)


Well, and the fact that in addition to Kubernetes itself, there are a gazillion adjacent products and options in the cloud-native space. Many/most of which a relatively simple setup may not need. But there's a lot of complexity.

But then there's always a lot of complexity and abstraction. Certainly, most software people don't need to know everything about what a CPU is doing at the lowest levels.


These components are very different in complexity and scope. Let's be real: a seasoned developer is mostly familiar with load balancers and ingress controllers, so this will be mostly about naming and context. I agree, though, that once you learn about k8s it becomes less mysterious, but that also suggests the author hasn't pushed it to its limits. Outages in the control plane can be pretty nasty, and it is easy to cause them by operating under the illusion that everything in k8s is basically free.

A really simple setup for many smaller organisations wouldn't have a load balancer at all.

No load balancer means... entering one node only? Doing DNS RR over all the nodes? If you don't have a load balancer in front, why are you even using Kubernetes? Deploy a single VM and call it a day!

I mean, in my homelab I do have Kubernetes and no LB in front, but it's a homelab for fun and learning K8s internals. But in a professional environment...


No code at all even - just use excel

typical how to program an owl:

step one: draw a circle

step two: import the rest of the owl


... and kubernetes networking, service mesh, secrets management

You aren't forced to use a service mesh and complex secrets management schemes. If you add them to the cluster, it's because you value what they offer you. It's the same thing as kubernetes itself - I'm not sure what people are complaining about; if you don't need what kubernetes offers, just don't use it.

Go back to good ol' corosync/pacemaker clusters with XML and custom scripts to migrate IPs and set up firewall rules (and if you have someone writing them for you, why don't you have people managing your k8s clusters?).

Or buy something from a cloud provider that "just works" and eventually go down in flames with their Indian call centers doing their best but with limited access to engineering to understand why service X is misbehaving for you and trashing your customers' data. It's trade-offs all the way.


> …and containerd and csi plugins and kubelet and cni plugins (...)

Do you understand you're referring to optional components and add-ons?

> and kubectl

You mean the command line interface that you optionally use if you choose to do so?

> and kube-proxy and ingresses and load balancers…

Do you understand you're referring to whole classes of applications you run on top of Kubernetes?

I get that you're trying to make a mountain out of a molehill. Just understand that you can't argue that something is complex by giving as your best examples a bunch of things that aren't really tied to it.

It's like trying to claim Windows is hard, and then your best example is showing a screenshot of AutoCAD.


How are kubelet and CNI “optional components”? What do you mean by that?

CNI is optional; you can have workloads bind ports on the host rather than use an overlay network (though CNI plugins and kube-proxy are extremely simple and reliable in my experience: they use VXLAN and iptables, which are built into the kernel and which you already use in any organization that might run a cluster, or the basic building blocks of your cloud provider).

CSI is optional, you can just not use persistent storage (use the S3 API or whatever) or declare persistentvolumes that are bound to a single or group of machines (shared NFS mount or whatever).

I don't know how GP thinks you could run without the other bits though. You do need kubelet and a container runtime.


kubelet isn't, but CNI technically is (or can be abstracted to a minimum; I think old network support might have been removed from kubelet nowadays)

Because the root comment is mostly but not quite right: there is indeed a large subset of developers who aren't interested in thinking about infrastructure, but there are many subcategories of those people, and many of them aren't fly.io customers. A large number of people in that category aren't happy to let someone else handle their infra. They're not interested in infra in the sense that they don't believe it should be more complicated than "start process on Linux box and set up firewall and log rotation".

For some applications these people are absolutely right, but they've persuaded themselves that that means it's the best way to handle all use cases, which makes them see Kubernetes as way more complex than is necessary, rather than as a roll-your-own ECS for those who would otherwise truly need a cloud provider.


Feels like software engineers are talking past each other a lot on these topics.

I assume everyone wants to be in control of their environment. But with so many ways to compose your infra that means a lot of different things for different people.


I use k8s and wouldn't call it simple, but there are ways to minimize the complexity of your setup. Mostly, what devs see as complexity is that k8s packages a lot of system fundamentals, like networking, storage, name resolution, distributed architectures, etc., and if you have mainly spent your career in a single lane, k8s becomes impossible to grasp. Not saying those devs are wrong; not everyone needs to be a networking pro.

K8s is meant to be operated by one class of engineers and used by another. Just like you have DBAs, sysadmins, etc., maybe your devops folks should have more systems experience besides Terraform.


"Kubernetes is etcd, apiserver, and controllers....Sure it's a lot of YAML but I don't understand the rep."

Sir, I upvoted you for your wonderful sense of humour.


I consider a '60+ node' kubernetes cluster very small. Kubernetes at that scale is genuinely excellent! At 6000, 60000, and 600000 nodes it becomes very different and goes from 'Hey, this is pretty great' to 'What have I done?' The maintenance cost of running more than a hundred clusters is incredibly nontrivial, especially as a lot of folks end up taking something open-source and thinking they can definitely do a lot better (you can.... there are a lot of "but"s there though).

OK, but if you think Kubernetes is too much magic, what is the alternative when you want to operate hundreds of clusters with tens of thousands of nodes?

Some bash and Ansible and EC2? That is usually what Kubernetes haters suggest one does to simplify.


At a certain scale, let's say 100k+ nodes, you magically run into 'it depends.' It can be kubernetes! It can be bash, ansible, and ec2! It can be a custom-built vm scheduler built on libvirt! It can be a monster fleet of Windows hyper-v hosts! Heck, you could even use Mesos, Docker Swarm, Hashicorp Nomad, et al.

The main pain point I personally see is that everyone goes 'just use Kubernetes', and this is an answer, however it is not the answer. The way it steamrolls all conversations leads to a lot of the frustration around it, in my view.


Hashicorp Nomad, Docker Swarm, Apache Mesos, AWS ECS?

I love that the Kubernetes lovers tend to forget that Kubernetes is just one tool, and they believe that the only possible alternative to this coolness is sweaty sysadmins writing bash scripts in a dark room.


I’m absolutely not a Kubernetes lover. Bash and Ansible etc. is just a very common suggestion from haters.

I thought Mesos was kinda dead nowadays, good to hear it’s still kicking. Last time I used it, the networking was a bit annoying: it was not able to provide virtual network interfaces, only ports.

It seems like if you are going to operate these things, picking a solution with a huge community and in active development feels like the smart thing to do.

Nomad is very nice to use from a developer perspective, and it’s nice to hear infrastructure people preferring it. From the outside, the reason people pick Kubernetes seems to be the level of control infra and security teams want over things like networking and disk.


Can you describe who a Kubernetes hater is? Or show me an example. It's easy to stigmatise someone as a Kubernetes lover or hater, and then use it to invalidate their arguments.

I would argue against Kubernetes in particular situations, and even recommend Ansible in some cases where it is a better fit for the given circumstances. Do you consider me a Kubernetes hater?

Point is, Kubernetes is a great tool. In particular situations. Ansible is a great tool. In particular situations. Even bash is a great tool. In particular situations. But Kubernetes could even be the worst tool if you choose unwisely. And Kubernetes is not the ultimate infrastructure tool. There are alternatives, and there will be new ones.


HashiCorp Nomad?

The wheels fall off Kubernetes at around 10k nodes. From my experience, one of the main limitations is etcd; Google recently fixed this problem by making Spanner offer an etcd-compatible API: https://cloud.google.com/blog/products/containers-kubernetes...

Etcd is truly a horrible data store, even the creator thinks so.


At that point you probably need a cluster of k8s clusters, no?

For anyone unfamiliar with this, the "official limits" are here; as of 1.32 it's 5000 nodes, max 300k containers, etc.

https://kubernetes.io/docs/setup/best-practices/cluster-larg...


Yes, this is what I'm referring to. :)

Maintaining a lot of clusters is super different than maintaining one cluster.

Also, please don't actually try to get near those limits; your etcd cluster will be very sad unless you're _very_ careful (think few deployments, few services, few namespaces, not using etcd events, etc).


Hey, fellow k8s+Ceph on bare metaler! We only have a 13-machine rack and 350 TB of raw storage. No major issues with Ceph after 16.x and all-NVMe storage though.

Genuinely curious about what sort of business stores and processes 14 PB on a 60 node cluster.

Research institution.

The department saw more need for storage than Kubernetes compute so that's what we're growing. Nowadays you can get storage machines with 1 PB in them.


Yeah, that's an interesting question, because it sounds like a ton of data vs not enough compute, but, aside from this all being in a SAN or large storage array:

The larger Supermicro or Quanta storage servers can easily handle 36 HDDs each, or even more.

So with just 16 of those with 36x24TB disks, that meets the ~14PB capacity mark, leaving 44 remaining nodes for other compute tasks, load balancing, NVMe clusters, etc.


We have boxes with up to 45 drives yes.

Yeah, I'm sure there are tricky details as in anything, but the core idea doesn't sound that complicated to me. I've been looking into it a bit after seeing this fun video a while ago where a DOS BBS is run on Kubernetes.

https://youtu.be/wLVHXn79l8M?si=U2FexAMKd3zQVA82


I think "core" kubernetes is actually pretty easy to understand. You have the kubelet, which just cares about getting pods running, which it does by using pretty standard container tech. You bootstrap a cluster by reading the specs for the cluster control plane pods from disk, after which the kubelet will start polling the API it just started for more of the same. The control plane then takes care of scheduling more pods to the kubelets that have joined the cluster. Pods can run controllers that watch the API for other kinds of resources, but one way or another, most of those get eventually turned into Pod specs that get assigned to a kubelet to run.

Cluster networking can sometimes get pretty mind-bending, but honestly that's true of just containers on their own.

I think just that ability to schedule pods on its own requires about that level of complexity; you're not going to get a much simpler system if you try to implement things yourself. Most of the complexity in k8s comes from components layered on top of that core, but then again, once you start adding features, any custom solution will also grow more complex.

If there's one legitimate complaint when it comes to k8s complexity, it's the ad-hoc way annotations get used to control behaviour in a way that isn't discoverable or type-checked like API objects are, and you just have to be aware that they could exist and affect how things behave. A huge benefit of k8s for me is its built-in discoverability, and annotations hurt that quite a bit.


Well, the point is you don't have to understand it all at the same time. Kubernetes really just codifies concepts that people were doing before. And it sits on the same foundations (Linux, IP, DNS etc). People writing apps didn't understand the whole thing before, just as they don't now. But at some level these boxes are plugged into each other. A bad system would be one where people writing business software have to care about what box is plugged into what. That's absolutely not the case with Kubernetes.

> Indeed, I have to wonder how many people actually understand Kubernetes. Not just as a “user” but exactly all what it is doing behind the scenes…

I would ask a different question. How many people actually need to understand implementation details of Kubernetes?

Look at any company. They pay engineers to maintain a web app/backend/mobile app. They want features to be rolled out, and they want their services to be up. At which point does anyone say "we need an expert who actually understands Kubernetes"?


When they get paged three nights in a row and can’t figure out why.

> I have to wonder how many people actually understand Kubernetes.

I have to wonder how many people actually understand when to use K8s or Docker. Docker is not a magic bullet, and can actually be a footgun when it's not the right solution.


I have been at this compute thing since 1986, with a focus mostly on distributed systems since 2000, and I keep my Kubernetes cheat sheet always close.

> Kubernetes is not the first thing that comes to mind when I think of "understanding where their code is running and what it's doing"...

In the end it's a scheduler for Docker containers on a bunch of virtual or bare metal machines. Once you get that into your head, life becomes much easier.

The only thing I'd really love to see from an ops perspective is a way to force-revive crashed containers for debugging. Yes, one shouldn't have to debug cattle, just haul the carcass off and get a new one... but I still prefer to know why the cattle died.


Yeah. In the whole cattle/pet discourse, the fact that you need to take some cattle to the vet for diagnosis got lost. Very operator-centric thinking; I get where it’s coming from, but it went a bit too far.

One may think Kubernetes is complex (I agree), but I haven't seen an alternative that simultaneously allows you to:

* Host hundreds or thousands of interacting containers across multiple teams in a sane manner

* Manage and understand how it is all done, to the full extent.

Of course there are tons of organizations that can (and should) easily drop one of these requirements, but if you need both, there isn't a better choice right now.


But how many orgs need that scale?

Something I've discovered is that if you're a small team doing something new, off the shelf products/platforms are almost certainly not optimized to your use case.

What looks like absurd scale to one team is a regular Tuesday for another, because "scale" is completely meaningless without context. We don't balk at a single machine running dozens of processes for a single web browser, we shouldn't balk at something running dozens of containers to do something that creates value somehow. And scale that up by number of devs/customers and you can see how thousands/hundreds of thousands can happen easily.

Also the cloud vendors make it easy to have these problems because it's super profitable.


You can run single-node k3s on a VM with 512MB of RAM and deploy your app with a hundred lines of JSON, and it inherits a ton of useful features that are managed in one place and can grow with your app if/as needed. These discussions always go in circles between Haters and Advocates:

* H: "kubernetes [at planetary scale] is too complex"

* A: "you can run it on a toaster and it's simpler to reason about than systemd + pile of bash scripts"

* H: "what's the point of single node kubernetes? I'll just SSH in and paste my bash script and call it a day"

* A: "but how do you scale/maintain that?"

* H: "who needs that scale?"


The sad thing is there probably is a toaster out there somewhere with 512MB of RAM.

It's not sad until it becomes self-aware.

A very small percentage of orgs, a not-as-small percentage of developers, and at the higher end of the value scale, the percentage is not small at all.

I think the developers who care about knowing how their code works tend to not want hyperscale setups anyway.

If they understood their system, odds are they’d realize that horizontal scaling with fewer, larger services is plenty scalable.

At those large orgs, the individual developer doesn’t matter at all and the EMs will opt for faster release cycles and rely on internal platform teams to manage k8s and things like it.


Exact opposite - k8s allows developers to actually tailor containers/pods/deployments themselves, instead of opening tickets to have it configured on a VM by the platform team.

Of course there are simpler container runtimes, but they have issues with scale, cost, features or transparency of operation. Of course they can be a good fit if you're willing to give up one or more of these.


> k8s allows developers to actually tailor containers/pods/deployments themselves

Yes, complex tools tend to be powerful.

But when I say “devs who care about knowing how their code works” I’m also referring to their tools.

K8s isn’t incomprehensible, but it is very complex, especially if you haven’t worked in devops before.

“Devs who care…”, I would assume, would opt for simpler tools.

I know I would.


We're almost 100 devs in a few teams - works well. There's a bunch of companies of our size even in the same city.

What's a bit different is we're creating our own products, not renting people out to others, so having a uniform hosting platform is an actual benefit.


Most of the ones that are profitable for cloud providers.

> Host hundreds or thousands of interacting containers across multiple teams in sane manner

I mean, if that's your starting point, then complexity is absolutely a given. When folks complain about the complexity of Kubernetes, they are usually complaining about the complexity relative to a project that runs a frontend, a backend, and a postgres instance...


In my last job we ran centralized clusters for all teams. They got X namespaces for their applications, and we made sure they could connect to the databases (handled by another team, though there were discussions of moving them onto dedicated clusters). We had a basic configuration set up for them and offered "internal consultants" to help them onboard. We handled maintenance, upgrades and, if needed, migrations between clusters.

We did not have a cluster just for a single application (with some exceptions because those applications were incredibly massive in pod numbers) and/or had patterns that required custom handling and pre-emptive autoscaling (which we wrote code for!).

Why are so many companies running a cluster for each application? That's madness.


I mean, a bunch of companies that have deployed Kubernetes only have 1 application :)

I migrated one such firm off Kubernetes last year, because for their use case it just wasn't worth it - keeping the cluster upgraded and patched, and their CI/CD pipelines working, was taking as much IT effort as the rest of their development process.


I agree with the blog post that using K8s + containers for GPU virtualization is a security disaster waiting to happen. Even if you configure your container right (which is extremely hard to do), you don't get seccomp-bpf.

People started using K8s for training, where you already had a network isolated cluster. Extending the K8s+container pattern to multi-tenant environments is scary at best.

I didn't understand the following part though.

> Instead, we burned months trying (and ultimately failing) to get Nvidia’s host drivers working to map virtualized GPUs into Intel Cloud Hypervisor.

Why was this part so hard? Doing PCI passthrough with the Cloud Hypervisor (CH) is relatively common. Was it the transition from Firecracker to CH that was tricky?


This has actually brought up an interesting point. Kubernetes is nothing more than an API interface. Should someone be working on building a multi-tenant Kubernetes (so that customers don't need to manage nodes or clusters) which enforces VM-level security (obviously you cannot safely co-locate multiple tenants' containers on the same VM)?

Yeah, I think this really exemplifies the "everyone more specialized than me doesn't get the bigger picture, and everyone less specialized than me is wasting their time" trope. Developers who don't want to deal with the nitty gritty in one area are dealing with it in another area. Everyone has 24 hours in a day.

The difference between a good developer and a bad one is understanding the stack. Not necessarily being an expert, but I spend a lot of time debugging random issues, and it could be DNS or a file locking issue or a network or an API or parsing EDI, whatever. Most recently I found a bug in software that had to do with how Windows runs 32-bit mode on 64-bit. I've never used Windows professionally and I have only had Unix machines since I got a free Ubuntu CD. Yet I figured it out in like 20 minutes by exploring the differences between the paths when running in the two scenarios. Idk, maybe I'm a genius, I don't think so, but I was able to solve the problem because I know just barely enough about enough things to poke shit and break them or make them light up. Compare that to a dev on my team who needed help writing a series of command line prompts to do a simple bit of textual adjustments and pipe some data around.

I'm not even a good developer. But I know enough to chime in on calls and provide useful and generally 'wizarding' knowledge. Like a detective with a good hunch.

But yeah just autocomplete everything lol


It's great that you were able to debug that. It may have come at an opportunity cost of being able to solve some more specialized problem within your domain.

In my job I develop a React Native app. I also need to have a decent understanding of iOS and Android native code. If I run into a bug related to how iOS runs 32 bit vs 64 bit software? Not my problem, we'll open a ticket with Apple and block the ticket in our system.


I guess I never have enough leverage to order Apple to fix stuff. I'm like water and gravity. It's just a random example though, and I agree you do give up a lot by being a generalist. However, for most people the problems aren't really new or hard. It's a lot of spaghetti.

I don't think of it as spaghetti but as messy plumbing.

I don't disagree with you, but I do think it's important to acknowledge that this approach requires someone else to do it. If you're at a big company where there are tons of specialists, then perhaps this is just fine because there is someone available to do it for you. If you find yourself in a different situation, however, where you don't have that other specialist, you could end up significantly blocked for a period of time. If whatever you're working on is not important and can afford to be blocked, then again no problem, but I've been in many situations where what I was doing absolutely had to work and had to work on a timetable. If I had to offload the work to someone else because I wasn't capable, it would have meant disaster.

> we'll open a ticket with Apple and block the ticket in our system.

Wouldn't it be annoying to be blocked on Apple rather than shipping on your schedule?


If we're blocked on Apple, so is everyone else. A key consideration in shipping high-level software is to avoid using niche features that the vendor might ignore if they're broken.

Great opportunity for someone ballsy to write a book about kubernetes internals for the general engineering population.

Bonus points for writing a basic implementation from first principles capturing the essence of the problem kubernetes really was meant to solve.

The hundred-page Kubernetes book, Andriy Burkov style.


You might be interested in this:

https://github.com/kelseyhightower/kubernetes-the-hard-way

It probably won't answer the "why" (although any LLM can answer that nowadays), but it will definitely answer the "how".


I actually took the time to read the tutorial and found it helpful.

Thanks for taking the time to share the walk through.


That's nice but I was looking more for a simple implementation of the concept from first principles.

I mean an understanding from the view of the internals and not so much the user perspective.



This is actually cool, thanks.

Kubernetes in Action book is very good.

I actually have the book and I agree it is very good.

> Great opportunity for someone ballsy to write a book about kubernetes internals for the general engineering population.

What would be the interest of it? Think about it:

- kubernetes is an interface and not a specific implementation,

- the bulk of the industry standardized on managed services, which means you actually have no idea what the actual internals driving your services are,

- so you read up on the exact function call that handles a specific aspect of pod auto scaling. That was a nice read. How does that make you a better engineer than those who didn't?


I don't really care about the standardized interface.

I just want to know how you'd implement something that would load your services and dependencies from a config file, bind them all together, distribute the load across several local VMs, and make it still work if I kill a service or increase the load.

In less than 1000 lines.


> I don't really care about the standardized interface.

Then you seem to be confused, because you're saying Kubernetes but what you're actually talking about is implementing a toy container orchestrator.


This is one of the truest comments I have ever read on here

I almost started laughing at the same comment. Kubernetes is the last place to know what your code is doing. A VM or bare metal is more practical for the persona that OP described. The git pushers might want the container on k8s

If you have a system that's actually big or complex enough to warrant using Kubernetes, which, to be frank, isn't really that much considering the realities of production, the only thing more complex than Kubernetes is implementing the same concepts but half-assed.

I really wonder why this opinion is so commonly accepted by everyone. I get that not everything needs most Kubernetes features, but it's useful. The Linux kernel is a dreadfully complex beast full of winding subsystems and screaming demons all over: eBPF, namespaces, io_uring, cgroups, SELinux, and so much more, all interacting with each other in sometimes surprising ways.

I suspect there is a decent likelihood that a lot of sysadmins have a more complete understanding of what's going on in Kubernetes than in Linux.


> If you have a system that's actually big or complex enough to warrant using Kubernetes (...)

I think there's a degree of confusion over your understanding of what Kubernetes is.

Kubernetes is a platform to run containerized applications. Originally it started as a way to simplify the work of putting together clusters of COTS hardware, but since then its popularity drove it to become the platform instead of an abstraction over other platforms.

What this means is that Kubernetes is now a standard way to deploy cloud applications, regardless of complexity or scale. Kubernetes is used to deploy apps to Raspberry Pis, one-box systems running under your desk, your own workstation, one or more VMs running on random cloud providers, and AWS. That's it.


I'm not sure what your point is.

> I'm not sure what your point is.

My point is that the mere notion of "a system that's actually big or complex enough to warrant using Kubernetes" is completely absurd, and communicates a high degree of complete cluelessness over the whole topic.

Do you know what's a system big enough for Kubernetes? It's a single instance of a single container. That's it. Kubernetes is a container orchestration system. You tell it to run a container, and it runs it. That's it.

See how silly it all becomes once you realize these things?


First of all, I don't really get the unnecessary condescension. I am not a beginner when it comes to Kubernetes and don't struggle to understand the concept at all. I first used Kubernetes at version 1.3 back in 2016, ran production workloads on it, contributed upstream to Kubernetes itself, and at one point even did a short bit of consulting for it. I am not trying to claim to be any kind of authority on Kubernetes or job scheduling as a topic, but when you talk down to people the way that you are doing to me, it doesn't make your point any better, it just makes you look like an insecure dick. I really tried to avoid escalating this on the last reply, but it has to be said.

Second of all, I don't really understand why you think I'd be blown away by the notion that you can use Kubernetes to run a single container. You can also open a can with a nuclear warhead, does not mean it makes any sense.

In production systems, Kubernetes and its ecosystem are very useful for providing the kinds of things that are table stakes, like zero-downtime deployments, metric collection and monitoring, resource provisioning, load balancing, distributed CRON, etc. which absolutely doesn't come for free either in terms of complexity or resource utilization.

But if all you need to do is run one container on a Raspberry Pi and don't care about any of that stuff, then even something stripped down like k3s is simply not necessary. You can use it if you want to, but it's overkill, and you'll be spending memory and CPU cycles on shit you are basically not using. Literally anything can schedule a single pod on a single node. A systemd Podman unit will certainly work, for example, and it will involve significantly less YAML as a bonus.

I don't think the point I'm making is particularly nuanced here. It's basically YAGNI but for infrastructure.


Kubernetes is an abstraction of VMs so that a single container can be deployed in the absence of a code package. The container is the binary in this circumstance. Unfortunately they lose control of blame shifting if their deployment fails. It can no longer be the VM's fault. What is deployed in lower environments is what is in Prod, physically identical outside of configuration.

It's the first thing that came to the mind of the person who wrote the comment, which is positively terrifying.

Really? There are plenty of valid criticisms of kubernetes, but this doesn't strike me as one of them. It gives you tons of control over all of this. That's a big part of why it's so complex!

IMO, it's rather hard to fully know all of kubernetes and what it's doing, and the kind of person who demands elegance in solutions will hate it.

This mainframe system from the 1990s was so much simpler

https://www.ibm.com/docs/en/cics-ts/6.x?topic=sysplex-parall...

even if it wasn't as scalable as Kube. On the other hand, a cluster of 32 CMOS mainframes could handle any commercial computing job that people were doing in the 1990s.


Seems the causality is going the wrong direction there. Commercial jobs were limited by mainframe constraints, so that's where job sizes topped out.

It's not simple but it's not opaque.

It gives you control via abstractions. That’s fine, and I like K8s personally, but if you don’t understand the underlying fundamentals that it’s controlling, you don’t understand what it’s doing.

It's very easy to understand once you invest a little bit of time.

That's assuming you have a solid foundation in the nuts and bolts of how computers work to begin with.

If you just jumped into software development without that background, well, you're going to end up in the latter pool of developers as described by the parent comment.


Fly.io probably runs it on Kubernetes as well. It can be something in the middle, like RunPod. If you select 8 GPUs, you'll get a complete host for yourself. Though there is a lot of stuff lacking at RunPod too. But Fly.io... First of all, I've never heard about this one. Second, the variety of GPUs is lacking. There are only 3 types, and the L40S on Fly.io is 61.4% more expensive than on RunPod. So I would say it is about marketing, marketplace, long-term strategy, and pricing. But it seems at least they made themselves known to me (I bet there are others who heard about them for the first time today too).

We do not use K8s.

Yeah no I wouldn't touch Kubernetes with a 10' pole. Way too much abstraction.

If my understanding is right, the gist seems to be that you create one or more Docker containers that your application can run on, describe the parameters they require, e.g. RAM size/CUDA capability/when you need more instances, and Kubernetes provisions them out to the machines available to it based on those parameters. It's abstract but very tractably so IMO, and it seems like a sensible enough way to achieve load balancing if you keep it simple. I plan to try it out on some machines of mine just for fun/research soon.

It's systemd but distributed across multiple nodes and with containers instead of applications. Instead of .service files telling the init process how to start and monitor executables, you have charts telling the controller how to start and monitor containers.

It's worth noting that "container" and "process" are pretty similar abstractions. A lot of people don't realize this, but a container is sort of just a process with a different filesystem root (to oversimplify). That arguably is what a process should be on a server.

No, they are not. I'm not sure who started this whole "a container is just a process" thing, but it's not a good analogy. Quite a lot of things you spin up containers for have multiple processes (databases, web servers, etc).

Containers are inherently difficult to sum up in a sentence. Perhaps the most reasonable comparison is to liken them to a "lightweight" VM, but the reasons people use them are so drastically different from VMs at this point. The most common use case for containers is having a decent toolchain for simple, somewhat reproducible software environments. Containers are mostly a hack to get around the mess we've made in software.


Having multiple processes under one user in an operating system is more akin to having multiple threads in one process than you think. The processes don't share a virtual memory space or kernel namespaces and they don't share PID namespaces, but that's pretty much all you get from process isolation (malware works because process isolation is relatively weak). The container adds a layer that goes around multiple processes (see cgroups), but the cgroup scheduling/isolation mechanism is very similar to the process isolation mechanism, just with a new root filesystem. Since everything Linux does happens through FDs, a new root filesystem is a very powerful thing to have. That new root filesystem can have a whole new set of libraries and programs in it compared to the host, but that's all you have to do to get a completely new looking computing environment (from the perspective of Python or Javascript).

A VM, in contrast, fakes the existence of an entire computer, hardware and all. That fake hardware comes with a fake disk on which you put a new root filesystem, but it also comes with a whole lot of other virtualization. In a VM, CPU instructions (eg CPUID) can get trapped and executed by the VM to fake the existence of a different processor, and things like network drivers are completely synthetic. None of that happens with containers. A VM, in turn, needs to run its own OS to manage all this fake hardware, while a container gets to piggyback on the management functions of the host and can then include a very minimal amount of stuff in its synthetic root.


> Having multiple processes under one user in an operating system is more akin to having multiple threads in one process than you think.

Not than I think. I'm well aware of how "tasks" work in Linux specifically, and am pretty comfortable working directly with clone.

Your explanation is great, but I intentionally went out of my way to not explain it and instead give a simple analogy. The entire point was that it's difficult to summarize.


> I'm not sure who started this whole container is just a process thing, but it's not a good analogy. Quite a lot of things you spin up containers for have multiple processes (databases, web servers, etc).

It came from how Docker works: when you start a new container, it runs a single process in the container, as defined in the Dockerfile.

It's a simplification of what containers are capable of and how they do what they do, but that simplification is how it got popular.


If a container is "a process", then an entire linux/unix os (pid 1) is simply "a process"

Not just the kernel and PID 1, we also tend to refer to the rest of the system as "linux" as well, even though it's not technically correct. It's very close to the same simplification.

> Containers are inherently difficult to sum up in a sentence.

Super easy if we talk about Linux. It's a process tree being spawned inside its own set of kernel namespaces, security measures and a cgroup to provide isolation from the rest of the system.
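As a rough illustration of that (Linux only, needs root, and deliberately skipping the cgroup, new root filesystem, user namespace and seccomp parts a real runtime sets up), the Go standard library can spawn such a namespaced process tree directly:

    // Minimal sketch: run a shell in new UTS, PID and mount namespaces.
    // A real container runtime also sets up cgroups, a new root filesystem,
    // user namespaces, capabilities, seccomp profiles, and so on.
    package main

    import (
        "os"
        "os/exec"
        "syscall"
    )

    func main() {
        cmd := exec.Command("/bin/sh")
        cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
        }
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }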


If someone doesn't understand "container", I'm supposed to expect them to understand all the namespaces and their uses, cgroups, and the nitty gritty of the wimpy security isolation? You are proving my point that it's tough to summarize by using a bunch more terms that are difficult to summarize.

Once you recursively expand all the concepts, you will have multiple dense paragraphs, which don't "summarize" anything, but instead provide full explanations.


If you throw out the Linux tech from my explanation, it would become a general description which holds up even for Windows.

Core kubernetes (deployments, services, etc.) is fairly easy to understand. A lot of other stuff in the CNCF ecosystem is immature. I don't think most people need to use all the operators, admission controllers, otel, service mesh though.

If you're running one team with all services trusting each other, you don't have the problems solved by these things. Whenever you introduce a CNCF component outside core kubernetes, invest time in understanding it and why it does what it does. Nothing is "deploy and forget"; everything will need to be regularly checked and upgraded, and when issues come up you need some architecture-level understanding of the component to troubleshoot it, because there are so many moving parts.

So if I can get away writing my own cronjob in 1000 lines rather than installing something from GitHub with a helm chart, I will go with the former option.

(Helm is crap though, but you often won't have much choice).


Having a team that runs Kubernetes for you and being on the receiving end is indeed super easy. Need another microservice? Just add another repository, add a short YAML file, push it to CI and bam, it's online!

But setting it up is not a trivial task and often a recipe for disaster.

I've seen a fair share of startups who drank too much kool-aid and wanted to parrot FANG stacks, just to discover they were burning tons of money trying to deploy their first hello world application.


The irony is the whole devops and cloud sales pitch was that developers can do all this themselves and you no longer need a sysadmin team. Turns out you still do, it’s just called the devops/cloud team and not the sysadmin team.

Maybe not Kubernetes, but what about Docker Compose or Docker Swarm? Having each app be separate from the rest of the server, with easily controllable storage, networking, resource limits, restarts, healthchecks, configuration and other things. It's honestly a step up from well crafted cgroups and systemd services etc. (also because it comes in a coherent package and a unified description of environments) while the caveats and shortcomings usually aren't great enough to be dealbreakers.

But yeah, the argument could have as well just said running code on a VPS directly, because that also gives you a good deal of control.


Based on the following I think they also meant _how_ the code is running:

> The other group (increasingly large) just wants to `git push` and be done with it, and they're willing to spend a lot of (usually their employer's) money to have that experience. They don't want to have to understand DNS, linux, or anything else beyond whatever framework they are using.

I'm a "full full-stack" developer because I understand what happens when you type an address into the address bar and hit Enter - the DNS request that returns a CNAME record to object storage, how it returns an SPA, the subsequent XHR requests laden with and cookies and other goodies, the three reverse proxies they have to flow through to get to before they get to one of several containers running on a fleet of VMs, the environment variable being injected by the k8s control plane from a Secret that tells the app where the Postgres instance is, the security groups that allow tcp/5432 from the node server to that instance, et cetera ad infinitum. I'm not hooking debuggers up to V8 to examine optimizations or tweaking container runtimes but I can speak intelligently to and debug every major part of a modern web app stack because I feel strongly that it's my job to be able to do so (and because I've worked places where if I didn't develop that knowledge then nobody would have).

I can attest that this type of thinking is becoming increasingly rare as our industry continues to specialize. These considerations are now often handled by "DevOps Engineers" who crank out infra and seldom write code outside of Python and bash glue scripts (which is the antithesis to what DevOps is supposed to be, but I digress). I find this unfortunate because this results in teams throwing stuff over the wall to each other which only compounds the hand-wringing when things go wrong. Perhaps this is some weird psychopathology of mine but I sleep much better at night knowing that if I'm on the hook for something I can fix it once it's out in the wild, not just when I'm writing features and debugging it locally.


> I can attest that this type of thinking is becoming increasingly rare as our industry continues to specialize.

This (and a few similar upthread comments) sum the problem up really concisely and nicely: pervasive, cross-stack understanding of how things actually work and why A in layer 3 has a ripple effect on B in layer 9 has become increasingly rare, and those who do know it are the true unicorns in the modern world.

A big part of the problem is the lack of succession/continuity at the university level. I have been closely working with very bright, fresh graduates/interns (data science, AI/ML, software engineering – a wide selection of very different specialisations) in the last few years, and I have even hired a few of them because they were that good.

Talking to them has given me interesting insights into what and how universities teach today. My own conclusion is that the reputable universities teach very well, but what they teach is highly compartmentalised and typically there is little to no intersection across areas of study (unless the prospective student gets lucky and enrolls in elective studies that cut across the areas of knowledge). For example, students who study game programming (yes, it is a thing) do not get taught CPU architectures or low-level programming in assembly; they have no idea what a pointer is. Freshly graduated software engineers have no idea what a netmask is and how it helps in reading a routing table; they do not know what a route is, either.

So modern ways of teaching are one problem. The second (and I think a big one) is that computing hardware has become heavily commoditised and appliance-like in general. Yes, there are a select few who still assemble their own racks of PC servers at home or tinker with Raspberry Pi and other trinkets, but it is no longer an en masse experience. Gone are the days when signing up with an ISP also required building your own network at home. That had an important side effect of instilling cross-stack knowledge, which today can only be gained by deliberately taking up a dedicated uni course.

With all of that disappearing into oblivion, the worrying question that I have is: who is going to support all this «low level» stuff in 20 years' time, without a clear plan for the cross-stack knowledge to be passed on from the current (and last?) generation of unicorns?

So those who are drumming up the flexibility of k8s and the like miss one important aspect: without that succession of cross-stack knowledge, k8s is a risk for any mid- to large-sized organisation, because it relies heavily on the unicorns and rockstar DevOps engineers who are few and far between. It is much easier to palm the infrastructure off to a cloud platform where supporting it becomes someone else's headache whenever there is a problem. But the cloud infrastructure usually just works.


> For example, students who study game programming (yes, it is a thing) do not get taught the CPU architectures or low-level programming in assembly; they have no idea what a pointer is. Freshly graduated software engineers have no idea what a netmask is and how it helps in reading a routing table; they do not know what a route is, either.

> So modern ways of teaching are one problem.

IME school is for academic discovery and learning theory. 90% of what I actually do on the job comes from self-directed learning. From what I gather this is the case for lots of other fields too. That being said I've now had multiple people tell me that they graduated with CS degrees without having to write anything except Python so now I'm starting to question what's actually being taught in modern CS curricula. How can one claim to have a B.Sc. in our field without understanding how a microprocessor works? If it's in deference to more practical coursework like software design and such then maybe it's a good thing...


> […] self-directed learning.

And this is whom I ended up hiring – young engineers with curious minds, who are willing to self-learn and are continuously engaged in the self-learning process. I also continuously suggest interesting, promising, and relevant new things to take a look into, and they seem to be very happy to go away, pick the subject of study apart, and, if they find it useful, incorporate it into their daily work. We have also made a deal with each other that they can ask me absolutely any question, and I will explain and/or give them further directions of where to go next. So far, such an approach has worked very well – they get to learn arcane (it is arcane today, anyway) stuff from me, they get full autonomy, they learn how to make their own informed decisions, and I get a chance to share and disseminate the vast store of knowledge I have accumulated over the years.

> How can one claim to have a B.Sc. in our field without understanding how […]

Because of how universities are run today. A modern uni is a commercial enterprise, with its own CEO, COO, C<whatever other letter>O. They rely on revenue streams (a previously unheard-of concept for a university), they rely on financial forecasts, and, most importantly of all, they have to turn profits. So a modern university is basically a slot machine – the outcomes it yields depend entirely on how much cash one is willing to feed it. And, because of that, there is no incentive to teach across areas of study, as it does not yield higher profits or is a net negative.


Maybe in the US. Any self-titled engineer in Europe with no knowledge of CPUs, registers, stacks, concurrency, process management, scheduling, O-notation, dynamic systems, EE, and a bigass chunk of math from linear to abstract algebra would be insta-bashed down on the spot with no degree at all.

Here in Spain, at the most basic uni, you end up almost being able to write a Minix clone from scratch for some easy CPU (RISC-V maybe) with all the knowledge you get.

I am no engineer (trade/voc arts, just a sysadmin) and I can at least write a small CHIP-8 emulator....


I am not based in the US, and I currently work for one of the top 100 universities of the world (the lower 50 part, though).

Lol. Yes. I scoffed.


