Some of the arguments against Kubernetes and for simpler systems/approaches, like Docker Compose, systemd or others, actually gave me the push I needed to collect some of my thoughts and past experiences in a blog post, "My journey from ad hoc chaos to order (a tale of legacy code, services and containers)": https://blog.kronis.dev/articles/my-journey-from-ad-hoc-chao...
(I'm linking it as a top-level comment here, since responding to multiple comments would feel spammy, but the discussion is interesting)
In short:
- Docker Compose is lovely for local development, but it isn't necessarily the only option for running containers easily
- Docker Swarm is what I chose, and it has been working wonderfully with software like Portainer
- K3s is a good Kubernetes distro that cuts down on resource usage and is slightly easier to work with
- that said, Kubernetes isn't always the right tool for the job, with which I agree, especially for smaller teams
- systemd services can also be perfectly passable, but only as long as you control everything around the service itself (e.g. the environment, OS updates, etc.)
- regardless of whether you use containers or not, there are benefits to configuration management systems like Ansible, which I suggest you look into if you haven't already
Unfortunately not Kubernetes, because it decoupled pod lifecycles from load balancers by design. K8S has no idea how to coordinate with load balancers to ensure new requests are flowing to new pods and to allow in-flight requests on moribund pods to complete before terminating them. This leads to at least a small number of 5xx errors being returned to clients on almost every deployment unless you implement workarounds like delaying pod shutdowns with arbitrary sleep times.
I've been managing a high traffic website running as 100s of Kubernetes microservices. We have 0 issues with 500s due to deployments.
As long as your probes are set up well and old pods have enough time to fulfil lingering requests before going away, there's no reason you should see 500s at all.
There’s no question that it can be worked around by adding arbitrary delays in places, but finding the right delay then becomes a guessing game and a trade-off between safety and paying to run pods that will do no more work. Different applications and L4/L7 transports also require different behaviors. It also requires that the author of the podspec know to implement this workaround, instead of having this knowledge baked into the orchestrator where it rightly belongs. I suspect your early attempts at setting this up safely also failed.
From my experience, it's enough if the application is able to finish existing requests when it receives the signal to terminate. Kubernetes removes the terminating pod from the service endpoints, so it won't receive new connections. Therefore, if the application can serve all remaining queries during the grace period (30s by default), it should be fine.
Probes are useful for ensuring long term health and also for determining when to send traffic to a starting pod. They aren't strictly necessary in the termination process.
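To make that concrete, here is a rough sketch of the kind of pod spec I have in mind (names, paths and numbers are placeholders, adjust for your app): a readiness probe so traffic only goes to pods that are ready, a preStop sleep to give endpoint removal time to propagate, and a termination grace period long enough to finish in-flight requests.

    # Sketch of a Deployment pod template for graceful rollouts (hypothetical names/values)
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          terminationGracePeriodSeconds: 60   # time for in-flight requests to finish
          containers:
            - name: my-app
              image: registry.example.com/my-app:1.2.3
              ports:
                - containerPort: 8080
              readinessProbe:                 # traffic is only routed once the pod reports ready
                httpGet:
                  path: /healthz
                  port: 8080
                periodSeconds: 5
              lifecycle:
                preStop:                      # small delay so endpoint removal can propagate
                  exec:
                    command: ["sleep", "10"]

The preStop sleep is essentially the "arbitrary delay" workaround discussed above; it only buys time for endpoint removal to propagate, and the app still has to keep serving existing requests after SIGTERM for the grace period to be of any use.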
> I suspect your early attempts at setting this up safely also failed.
IMHO Kubernetes is too complicated for many use cases. Zero-downtime deployment may require some testing, but it's definitely supported. It's not a rare requirement.
First, although you are correct that the endpoint is removed from the proxies, the removal is asynchronous and so the proxy configurations may not be up to date before the shutdown is initiated. Also, more and more load balancers are bypassing node ports these days, preferring instead to route traffic directly to pods, and so it’s becoming a less effective mechanism.
Second, the challenge is that app developers are rarely aware of the need, let alone how to write SIGTERM handlers that finish handling existing requests before exiting. For them it’s yet another cognitive and testing burden that is usually out of their business expertise domain, and to my knowledge few of the common web service frameworks implement a reliable shutdown handler out of the box.
On the other hand, if the orchestrator simply waited for the LB to signal that the pod was removed from the backend list before initiating the shutdown process, then it would be more robust by design.
>"Also, more and more load balancers are bypassing node ports these days, preferring instead to route traffic directly to pods, and so it’s becoming a less effective mechanism"
Could you or someone else say which LBs are currently bypassing node ports? Also, how does an LB actually bypass a NodePort, since a Service of type LoadBalancer in Kubernetes always uses a NodePort? How exactly does an external load balancer connect to a pod without first connecting to a port on the physical host?
The AWS load balancer controller for Elastic Load Balancers is capable of configuring routing directly to pods in an EKS cluster because those pods get a routable VPC subnet IP address by default. They’re not stuck behind bridges connected to non-routable subnets, so no NAT needed.
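If I recall the annotations correctly, the relevant switch is the target type on the Ingress; a rough sketch, with placeholder names and hosts:

    # Sketch: ALB registers pod IPs directly as targets instead of going through NodePorts
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: my-app
      annotations:
        alb.ingress.kubernetes.io/target-type: ip        # pod IPs as targets, no NodePort hop
        alb.ingress.kubernetes.io/scheme: internet-facing
    spec:
      ingressClassName: alb
      rules:
        - host: app.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: my-app
                    port:
                      number: 80

With target-type: ip the ALB sends traffic straight to pod IPs on the VPC network, so it never touches a port on the node itself.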
In our current non-web application we have a launcher, which allows users to select between the current and a few previous versions. This helps us iterate faster due to fewer show-stoppers. If a user experiences a blocking bug in the current version, they log out and back into an older version.
For our web version I was thinking of doing something similar by having a launcher page, and then having each version run in its own container on a subdomain.
So far I tested this using Docker Compose with Traefik routing the requests to the appropriate app container, and it worked fine. My deployment script modifies the docker-compose.yml config file to add a new container, and then Compose brings the additional container online.
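Roughly, the Compose + Traefik side of that can look something like this (placeholder image names and domains, simplified):

    # Sketch: each app version is its own service, routed by subdomain via Traefik labels
    services:
      app-v1:
        image: registry.example.com/my-app:1.0.0
        labels:
          - "traefik.enable=true"
          - "traefik.http.routers.app-v1.rule=Host(`v1.example.com`)"
          - "traefik.http.services.app-v1.loadbalancer.server.port=8080"
      app-v2:
        image: registry.example.com/my-app:2.0.0
        labels:
          - "traefik.enable=true"
          - "traefik.http.routers.app-v2.rule=Host(`v2.example.com`)"
          - "traefik.http.services.app-v2.loadbalancer.server.port=8080"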
Current users get a notification that a new version is available but can continue working in the current version, and new logins have the new version selected by default (but can be overridden as mentioned).
Would something like this not work with Kubernetes? I'm not yet familiar with Kubernetes but I've been assuming we might end up using it.
There are ingress controllers for kubernetes that support setting up routing rules that would fit your needs. For example, the user's browser could set a header with a desired version and the ingress controller could route the request to a set of containers running that version. There are many other ways to skin this cat (all the way up to complex systems like GraphQL). Pick one that matches the size and complexity of your organization.
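Or, mirroring the subdomain-per-version idea you already use with Traefik, a plain Ingress with one host rule per version also does the job; a rough sketch with placeholder hosts and Service names:

    # Sketch: route each version's subdomain to its own Service
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: app-versions
    spec:
      rules:
        - host: v1.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: app-v1
                    port:
                      number: 80
        - host: v2.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: app-v2
                    port:
                      number: 80

Each version is then just its own Deployment + Service, and your deploy script adds a rule (or a separate Ingress) for the new subdomain instead of editing docker-compose.yml.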
You can do this in Kubernetes by combining Envoy's draining support and graceful shutdown in the backend Pod (via handling SIGTERM or adding a preStop hook to delay termination). The big users of Envoy+Kubernetes are mostly running their own custom ingress controllers to coordinate this across the ingress system (easier than it sounds).
Honestly, it depends on what's in your stack, how everything fits together, and what the architecture looks like.
Almost any orchestrator (Nomad, Swarm, Kubernetes) can do a rolling deploy, as well as drain particular nodes if you ever need that.
Currently my homepage runs on a few containers (horizontal scaling) with a load balancer in front of them, and redeploys have very little impact on the overall experience: https://kronis-dev.betteruptime.com/
That said, for write-heavy workloads you'll even need to manage your DB migrations as a multi-step process, so that both old and new schema versions work during a redeploy, whereas in other circumstances you'll need to think more about how to roll back the DB, the app containers/instances, or even events in your system (depending on the architecture and how much can go wrong).
In short:
- horizontal scaling can help, as can having a load balancer with health checks (see the sketch after this list)
- changes not being breaking between any two releases can help (e.g. multi-stage DB migrations)
- complications arise mostly with stateful services and data stores
- apps and APIs being able to cope with others being offline, if only for a bit (e.g. request replay), can also help
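For the health check point, in Compose/Swarm terms it's just a healthcheck block on the service, so the orchestrator and load balancer only treat a container as up once it actually responds; a rough sketch (command and timings are placeholders, and it assumes curl exists in the image):

    # Sketch: health check so new containers only count as healthy once they serve requests
    services:
      web:
        image: registry.example.com/my-homepage:latest
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
          interval: 10s
          timeout: 3s
          retries: 3
          start_period: 15s     # give the app time to boot before counting failures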
I am currently using Terraform almost like Docker Compose to manage ~5 services + a database + Redis in Docker containers. When I do a deploy, I swap version 0.0.1 for version 0.0.2, for example, and it brings 0.0.1 down, then brings 0.0.2 up (i.e. not very rolling-deploy-like).
How would I migrate that to Swarm? I'm not using AWS/GCP/any of that. Just locally hosted Docker containers.
You'll probably want to read up on the Compose specification and the values that Docker Swarm and Docker Compose support, since last I checked there were some differences between v2 and v3 (which many people still use): https://docs.docker.com/compose/compose-file/
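The main Swarm-specific addition over a plain Compose file is the deploy: section, which plain docker-compose mostly ignores but Swarm uses for replicas and rolling updates; roughly something like this, with placeholder names and values:

    # Sketch: Swarm-specific deploy settings in a v3 Compose file
    version: "3.8"
    services:
      api:
        image: registry.example.com/api:0.0.2
        deploy:
          replicas: 2
          update_config:
            order: start-first    # bring the new task up before stopping the old one
            parallelism: 1
            delay: 10s
          restart_policy:
            condition: on-failure
    # deployed with: docker stack deploy -c docker-compose.yml mystack

Combined with a healthcheck like the one further up, that gets you rolling deploys instead of the down-then-up swap you described.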
Of course, if you want a less steep learning curve for actually running Swarm, depending on your personal preferences (GUI vs CLI), you might want to have a look at Portainer, which is a lovely and simple dashboard: https://www.portainer.io/
> Kubernetes isn't always the right tool for the job, with which I agree, especially for smaller teams
I think this is the key point. Kubernetes is great when used correctly, in the right scenario. And I feel that most of the hate comes from situations where it's not suitable. Kubernetes is just one of the many platforms available to run your code. My point extends to more systems, like Ansible, Chef and all the others. TL;DR it's not a tool for developers, it's a tool for operations and systems integrators.
A developer working on an application shouldn't care about the platform it's being run on, only about defining its requirements - think of a big enterprise system that just consumes data, processes it and stores it. The moment an engineer starts thinking "I need X platform", they're not developing the app anymore, they're integrating it too.
The system integrator should be able to take those requirements and match them to the right platform - choose the OS, native package vs container, scalability, fault tolerance, etc. But the system integrator doesn't need to care about the long-term maintenance of the platform, just how to integrate the application onto it. k8s can be considered here for long-term support, ease of integration, portability, etc.
The systems maintenance people are responsible for ensuring the system has the resources to run the application and for fixing any faults that appear in the infrastructure - this is where k8s shines the most, because I can drain a node, update it and uncordon it without worrying about where I'm going to move the containers running on that node, because k8s will do that for me. Gone are the days when we need to plan taking a machine offline for an hour and make sure the right failover procedures are in place, because k8s should be demonstrating that failover regularly anyway.
Essentially, Kubernetes/Ansible/etc. should not be a part of the development process. It doesn't solve problems that an application developer should care about. It's an operations tool, able to monitor and manage deployed applications. The hate occurs when the wrong people are using the tools, or the right people are using the tools incorrectly.
And of course, this kind of separation of concerns can only really happen in larger teams, on larger projects: dedicating a few people to maintaining a deployment, a different group to continuous integration, and a third group to actually developing the functionality. That's just not possible in every case, and most of the hate seems to come from those in-between scenarios where there are too many people to call it a small team, but too few to enable this separation of concerns. Meanwhile, most of the love seems to come from those who can differentiate between the many hats they need to wear in a small team.
> A developer working on an application shouldn't care about the platform being run on, only defining its requirements - think big enterprise system that just consumes data, processes and stores it. The moment an engineer starts thinking "I need X platform" they're not developing the app anymore, they're integrating it too.
This feels like a statement that begs more discussion because it's an interesting argument, albeit probably a polarizing one as well.
My experience has been the complete opposite: when developers also think about how the app will run, the outcomes are the best.
No system runs in a vacuum or exists separately from its runtime environment, so attempts to decouple the two can lead to countless problems and inconsistencies down the road. For example, if you create a Java app that expects configuration to come from XML files, that may make containerization difficult and no longer follows best practices. If you create an app that stores session data in memory instead of Redis, then it won't be horizontally scalable. The same goes for shared data, like using the file system for something - that will create a lot of problems with multiple instances, versus using something like S3.
Even if you don't pick the platform yourself, you should make sure that the app integrates with it smoothly; the 12 Factor App principles can help in many of these cases.
> Essentially, Kubernetes/Ansible/etc. should not be a part of the development process. It doesn't solve problems that an application developer should care about. It's an operations tool, able to monitor and manage deployed applications. The hate occurs when the wrong people are using the tools, or the right people are using the tools incorrectly.
In my experience, this leads to knowledge silos, problematic communication, Ops concerns not being considered during development (either due to a missing set of skills or a lack of ownership of the outcome), and a generally slower pace of development.
Now, there probably is a lot of benefit in specialization (e.g. Ops specialists that consult across different teams and projects to make everything more consistent), but in my eyes everyone should know a bit of everything and communicate freely.
If you don't do that, developers use ORMs in their apps and create N+1 problems or huge numbers of DB calls, due to a lack of joins or no consideration for indices, because they're not DBAs after all. Other times people don't bother with instrumentation in their apps and introduce problems with proxying configuration, because they don't care what the ingress will look like, etc.
I actually recall seeing an issue at work about an Ops-related problem and recommending a few steps to take care of it, only for the responsible dev to tell me: "That's not in my job description, I just write Java."
Now, not making people learn about an extremely vast array of topics and spend evenings/weekends doing so would also be nice, but then again, that person didn't solve the issue, I did. The end result is what matters; optimizing the road to it is up to the people in control.
That said, "rockstar devs" or "DevOps" specialists (or whatever the current term is) also carry risks in regards to a low bus factor (everything hinging on them in certain orgs, vs spreading the knowledge), so it's all probably pretty situational and scalable to different degrees.
> when developers also think about how the app will run, the outcomes will be the best
Absolutely agree, but that becomes "engineering" in my view. I'm talking specifically about implementing functionality, not long term views and project direction etc.
> Even if you don't pick the platform yourself, you should make sure that the app integrates with it smoothly, hence the 12 Factor App principles can help for many of these cases.
Which is the role of the system integrator - to integrate the app with the platform it's to be run on.
> In my experience, this leads to knowledge silos, problematic communication, not thinking about Ops concerns during development either due to a missing set of skills or just not caring due to a lack of ownership about the outcome, and just generally a slowed down pace of development.
> Now, there probably is a lot of benefit in specialization (e.g. Ops specialists that consult across different teams and projects to make everything more consistent), but in my eyes everyone should know a bit of everything and communicate freely
Absolutely agree. And I hate that this happens, which is why I try to learn about all levels of the tech stack and not just one domain. But I do feel like this is a people problem, not a tech problem or project problem.
If the team cannot communicate effectively, they're never going to produce the best product, regardless of who knows what, budgets, etc. Ultimately, these roles aren't a waterfall structure. Developers shouldn't throw things over the fence to system integrators, and the same goes for operations.
Instead, the whole team works together, in the same space, on the same backlog. If ops has a problem, it gets prioritised on the backlog. If developers are unsure of how to approach something, they talk to the integration or ops people. In an ideal world, the teams would rotate between roles regularly, so that everybody is aware of the issues being faced that can't be seen from elsewhere.
Based purely on my own experience, far too much priority is put on budget constraints and making money instead of making a product that a customer actually wants. And this often leads to the one-person-many-hats issue, which then cascades into a mashup of roles and chasing our tails. I've had a lot more success on projects where we've clearly defined who owns which layers of the stack than when the team is just left to fend for itself.
> That said, "rockstar devs" or "DevOps" specialists (or whatever the current term is) also carry risks in regards to a low bus factor (everything hinging on them in certain orgs, vs spreading the knowledge), so it's all probably pretty situational and scalable to different degrees
Specialists absolutely have their place, but they are not a fixed requirement. Just because a project chooses k8s doesn't mean it needs a k8s guru to own that layer, merely that the project needs some knowledge in that area to assess the scale required and make the call on keeping it small and manageable, or getting a consultant in for a short time to design and upskill the team, or getting a dedicated DevOps person to make the impossible happen.
But again, there's no reason why all of this couldn't happen with a team size of 1, with that 1 person wearing many hats. The important thing in those situations is knowing which hat is which and prioritising appropriately. My suggestion is a high level model that can be morphed to fit the situation, not a fixed framework that everybody must follow until the end of time.