Software is a garden that needs to be tended. LTS (and, to a lesser extent, demands for large amounts of backwards compatibility) arguments are the path to ossification and to orgs running 10+ years out-of-date, unsupported legacy garbage that nobody wants to touch, and nobody can migrate off because it's so out of whack.
Don’t do this. Tend your garden. Do your upgrades and releases frequently, ensure that everything in your stack is well understood and don’t let any part of your stack ossify and “crust over”.
Upgrades (even breaking ones) are easier to handle when you do them early and often. If you let them pile up, and then have to upgrade all at once because something finally gave way, then you’re simply inflicting unnecessary pain on yourself.
That’s great if you have control over all the moving parts, but a lot of real-world (i.e. not exclusively software-based) orgs have interfaces to components and external entities that aren’t so amenable to change. Maybe you can upgrade your cluster without anybody noticing. Maybe you’re a wizard and upgrades never go wrong for you.
More likely, you will be constrained by a patchwork quilt of contracts with customers and suppliers, or by regulatory and statutory requirements, or technology risk policies, and to get approval you’ll need to schedule end-to-end testing with a bunch of stakeholders whose incentives aren’t necessarily aligned to yours.
That all adds up to $$$, and that’s why there’s a demand for stability and LTS editions.
@ji6 I have to tell you, we have been using Kubernetes since the very early versions.
If you automate everything you will never have an issue with K8s; you can deploy all the required dependencies in one go. You can run the tests, and if done correctly this literally takes 10 minutes.
To me, this argument for an LTS version is hot air. It's the nuclear-reactor kind of argument: we had the money to buy uranium to burn, but now we can't afford to dispose of it properly.
Kubernetes is about flexibility, not about big companies that want to keep using their software development processes from the '90s...
I'm sorry, this is simply not true. K8s relatively often introduces breaking changes. They are announced well in advance, which is very nice, but the solution will not take 10 minutes. Take Pod Security Policies and the migration to Kyverno or OPA Gatekeeper. Even if you have a relatively simple cluster and choose to stick to defaults rather than write your own rules, it will still take time to understand the new system, to choose the policies you need, and to update and test it. In a complex cluster this takes weeks, and in some cases even months, especially if you are legally obliged to follow change management.
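For a sense of why even the "stick to defaults" path isn't a ten-minute job, here is a minimal sketch of the simplest PSP replacement, the built-in Pod Security Admission (the Kyverno/Gatekeeper route mentioned above involves strictly more work). The namespace name and chosen levels are placeholders, and real migrations repeat this per namespace, usually in warn/audit mode first.

    # Hedged sketch: label one namespace for the built-in Pod Security Admission
    # controller (the in-tree replacement after PodSecurityPolicy was removed in 1.25).
    # "payments" and the chosen levels are placeholders; audit/warn first, enforce later.
    kubectl label namespace payments \
      pod-security.kubernetes.io/audit=restricted \
      pod-security.kubernetes.io/warn=restricted \
      pod-security.kubernetes.io/enforce=baseline

    # One place violations surface once "enforce" starts rejecting controller-created pods.
    kubectl get events -n payments --field-selector reason=FailedCreate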
I'm curious, help me understand where the breakdown happens. Abstracting a few layers away, can we assume that you have system input, your system does something, and it has output? If this is correct, can we also then say that inputs are generated by systems you don't control that might not play nice with updates within your system? Similarly, changing outputs might break something downstream.
If all that holds, my question then becomes: why does your system not have a preprocessing and postprocessing layer for compatibility? Stuff like that is easier than ever to build, and would allow your components to grow with the ecosystem.
It’s all about risk. If you have a simple enough system, you might be able to hide it behind an abstraction layer that adequately contains the possible effects of change.
But many interesting useful real-world systems are difficult to contain within a perfect black box. Abstractions are leaky. An API gateway, for example, cannot hide increased latency in the backend.
People accountable for technology have learned, through years of pain, not to trust changes that claim to be purely technical with no possible impact on business functionality. Hence testing, approval and cost.
To paraphrase someone else's reaction, I'm violently shaking my head in disagreement. What you're saying only works when you have 100% full control of everything (including the customer data). As someone who spent years in the enterprise space, what you're describing is akin to 'Ivory Tower Architecture'.
LTS is a commitment. That is all. If someone is uncomfortable with such a commitment, then that's fine; let the free market sort it out. But what LTS does is tell everyone (including paying customers) that the formats/schemas/APIs/etc. in that version will be supported for a very long time, and that if I adopt it, I won't have to think about it or budget too much for its maintenance for a period of time measured in months/years.
I would go the extra mile here and say that offline formats should be supported FOREVER. None of that LTS bs for offline data, ever. LTS means that you're accountable for some of the costs in your partnership with the customers. If you move fast and break things, they will have to work extra just to keep up. If you move fast but stay mindful of back-compat, you will work extra but your customer will be happier. That is all.
Re-reading your comment gives me chills and reinforces my belief that I will never pay money to Google (they have a similar gung-ho attitude against 'legacy garbage') or have any parts of my business depend on stuff which reserves the liberty of breaking shit early and often.
> Don’t do this. Tend your garden. Do your upgrades and releases frequently, ensure that everything in your stack is well understood and don’t let any part of your stack ossify and “crust over”.
You can't see me, but I'm violently nodding in agreement. Faithfully adhering to these best practices isn't always possible, though; management gonna manage how they manage, and not how your ops team wants them to manage.
> Upgrades (even breaking ones) are easier to handle when you do them early and often. If you let them pile up, and then have to upgrade all at once because something finally gave way, then you’re simply inflicting unnecessary pain on yourself.
How different could things be if k8s had a "you don't break userland!" policy akin to the way the Linux kernel operates?
Is there a better balance between new stuff replacing the old and never shipping new stuff that would make more cluster operators more comfortable with upgrades?
Consider the places where people would use something for a long time and want to keep it relatively stable for long periods. Planes, trains, and automobiles are just a few of the examples. How should over-the-air updates for these kinds of things work? Where are all the places that k8s is being used?
If we only think of open source in datacenters we limit our thinking to just one of the many places it's used.
You don't need Kubernetes for over-the-air updates. Kubernetes is also not suitable for that. It scales up alright, but not down. As TA expounds, there are simply too many moving parts that require expertise to operate. And that's fine. Not every software has to be able to accommodate all use cases.
That is because the pressure on defense to adopt an approach and technologies that 'work' in commercial space has become dogma. Agile is a disease that fuels recipe approaches.
Recently worked with a team doing edge architecture and design that has had at least 5 iterations of k8s design and vendors. Why is that? What was the solution?
My opinion at the end (even though I love cloud managed k8s for commodity/insecure in the commercial space) is not to use k8s for edge and secure. Stop pretending. It's a mess and it won't get better.
If the elevator company can put in a display, run Linux, and cage an Electron app / web browser, I think that's a good idea. What I don't think is great is the frontend people I work with having "map" explained to them because they don't know it. VS Code proves "JS bad" wrong over and over; crap software can be written in any language.
I disagree: if they can get a product with great UX through the door but it has to reboot the system on a timer (maybe once it knows it's empty from a PLC input) every so often, I'd call it a solution. In fact, Boeing thinks rebooting an airplane is a solution to integer overflow[0].
While I do not appreciate everything being a Web app, it's also a very robust platform to build upon; there aren't many projects that get as many eyes as browsers and their components.
It's not a garden. Gardens are horizontal, with no dependencies between plants.
Infrastructure is a high-rise building. Long-term planning and careful maintenance are needed. And it shouldn't be necessary to replace the foundation every year or two.
Yeah imagine if someone built a bridge and was like, yeah, we need to do regular updates or else it will fail and come crashing down.
I feel like software engineers who preach that everything must be connected to the internet and update from the mothership regularly are fundamentally disconnected from reality. If your design is robust to begin with you should be able to depend on it without constantly fiddling with everything.
lol. have you ever done an update with breaking api changes and cluster global exposures/rbac for helm charts? It's like switching out the bolts in a running engine.
of course it's more work than simply applying the new charts, but the nice thing about k8s is that you can dump out the stuff from the working one, use k3s/kind/minikube, try the upgrade, and you are as good to go as with a dist-upgrade or similar.
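A hedged sketch of that rehearsal loop, assuming kind; the node image tag and the dumped resource list are just examples, not an exhaustive export.

    # Dump (a subset of) the live objects and replay them into a throwaway cluster
    # running the target version. Resource list and node image tag are illustrative.
    kubectl get deploy,sts,ds,svc,ing,cm --all-namespaces -o yaml > workloads.yaml

    kind create cluster --name upgrade-rehearsal --image kindest/node:v1.29.0
    kubectl --context kind-upgrade-rehearsal apply -f workloads.yaml   # watch for rejects/warnings

    kind delete cluster --name upgrade-rehearsal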
> Gardens are horizontal with no dependencies between plants.
Two words: companion plantings
Also, there's a world of incompatible plants. Some plants actively harm the growth of others, and some simply can't grow in the same place because of different soil and nutrient requirements.
It shows you do not garden. There are some pretty stringent dependencies between plants. Certain plants protect one another from specific pests; others won't tolerate the same soils. Others require planting in specific seasons. What for some plants is underwatering is for other plants overwatering. Plants aren't just things we stick in the ground and watch grow. Successful gardening requires careful consideration of each plant's requirements, or dependencies if you will.
That being said, you build gardens with soil and highrises with concrete. Kubernetes is soil, not cement.
I happen to agree. Many "new features" aren't necessarily needed by a large segment of the community that uses K8s.
Some of the attitudes in here, that software needs to be constantly evolving, are just odd. Many, many systems run on legacy software and hardware because, for the most part, they just work.
Innovation and evolution are important, yes. But churn for the sake of churn does not fit every (most) use case...
Kubernetes feels like JavaScript has reached the sysadmins: new updates, libraries, and build tools every week.
It's mainly good for keeping high paid people employed, not keeping your servers stable.
I ran a cluster in production for 1.5 years, and it took me so much energy. Especially that one night when a DigitalOcean managed cluster forced an update that crashed all servers, and there was no sane way to fix it.
I'm back to stability with old school VPS; it just works. Every now and then you run a few patches. Simple & fast deploys; what a blessing.
I am not understanding how you say you ran a cluster when you were using a managed instance, so you weren't really running it. Now, going to a VPS makes you effectively the MSP. I don't see how that addresses the LTS issues since k8s is updating so often that you have to keep pace with the updates (that you now need to do).
I use a managed instance and manage my own private instance.
> I don’t see how that addresses the LTS issues since k8s is updating so often that you have to keep pace with the updates (that you now need to do).
They're not saying they moved to running k8s on a VPS - they're saying they moved to using a VPS instead of k8s (i.e. have escaped the k8s upgrade cycle).
I agree with the very principle of what you're laying out here, but reality is rarely, if ever, in tandem with principles and "best practices".
Manufacturing comes to mind: shutting a machine down to apply patches monthly is going to piss off the graph babysitters, especially if the business is a 24/7 operation, and most currently are.
In an ideal world there would be times every month to do proper machine maintenance, but that doesn't translate to big money gains for the glutton of shareholders who don't understand anything, let alone understand that maintenance prolongs processes as opposed to running everything ragged.
I don't think the comparison to taking your car to a mechanic is a good one. When I take my car to a mechanic they are returning it to a stock configuration. They don't need to update to a new, slightly different power steering this month, new brakes next, then injectors....
Sometimes I get this feeling that a lot of developers kind of want to be in their code all the time, tending to it. But it’s just not a good use of my time. I want it to work so I can ignore it and move on to the next best use of my time.
I trial upgraded a years old Django project today to 5.0 and it took almost zero work. I hadn't touched versions (other than patch) in over a year. That's the way I want it. Admittedly this was less about an LTS and more about sensible design with upgrading in mind.
> I trial upgraded a years old Django project today to 5.0 and it took almost zero work.
Django is great at backwards compatibility, but to be honest they haven't added many revolutionary features to Django for years, except a half-baked implementation of async support.
Yup. At my last company, Kubernetes was the only place where software was not 2+ years behind upstream because of this, and I appreciate it a lot. And its API deprecation policy [1] made upgrades not so painful (if upgrades frequently break your stuff, check whether your infra people really understand this policy, and whether your software vendors are compliant).
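As a sketch of what that policy means day to day: once a resource has GA'd, most "breaking" upgrades reduce to bumping apiVersion in your manifests before the old version is removed (Ingress losing its v1beta1 form in 1.22 is the classic example). kubectl-convert is a separate plugin, so treat the exact invocation below as an assumption to check against your installed version.

    # Find objects still written against old API versions and rewrite the manifests.
    kubectl get ingress --all-namespaces   # served via networking.k8s.io/v1 on modern clusters

    # Rewrite an old manifest to the current API version; review the output before committing.
    kubectl convert -f legacy-ingress.yaml --output-version networking.k8s.io/v1 > ingress-v1.yaml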
this sounds great in theory. after 5 years of running a startup, you learn to pick your battles. that db library you upgraded? works great except for this one edge case where they changed the interface in an obscure portion that affects 5% of your users. it got past QA but that 5% of users get REAL VOCAL about it. now it's in production and you're planning an emergency revert of that dependency after you verify that you aren't using any of the new features of the library.
sounds doable? now multiply that by the number of dependencies your app has.
I agree that it's good to periodically update deps and allocate time for keeping your system up to date, but it's easy to let tracking all your deps suck up all the time your startup would be better off spending on adding features that bring new users in and add to your bottom line.
I suggested upgrading like it’s the wild-west where?
When your rock-solid dependency releases a new major version, you update it when you're ready, fix all the things you need to in a single PR, take that to prod when you're ready, and roll it back if it's unhappy, or your favourite variation of out-and-back.
What I'm advocating against is picking a version and then sitting on it for so long that it inevitably goes out of support, and then finding yourself having a really bad, un-fun time, probably at some awful time of day, discovering that the work you need to do to get out of the current system is so much more than you initially realised, because you're so far behind.
Breaking changes happen anyway at some point, you may as well stay abreast of them.
At the end, of course, the decision will be based on business and market analysis, as even the cloud native foundation abides more or less by the same rules as everyone else does... Introducing LTS for Kubernetes, however, would be a huge step towards pushing it down the enterprise-products alley, where software is selected more for the amount of people available in the market capable of working with it / operating it, and for the running costs generated, than for satisfying an actual need of the business.
Chances are that if your org needs LTS for kubernetes, then kubernetes is not the right solution to your problems. Which is probably the case anyway... but that's a whole different story.
I've spent so much time chasing down performance problems and bugs where the problem was due to an outdated dependency. Simply spending a few hours a month upgrading dependencies is a big win to avoid those situations
At the far end of the update-frequency spectrum from LTS is Google's "live at head" philosophy. Google's Abseil C++ library, for example, recommends that consumers update to Abseil's latest commit from the master branch as often as possible. Abseil has no tagged releases besides master and an LTS branch (updated every 6-9 months).
I like the idea, and with something like say Chrome, this is excellent.
However each upgrade of k8s needs planning. Need to check through the list to see what APIs are broken, and figure out if anything needs changing, preparing for it, including stuff the cloud provider has chucked in the mix.
Probably to the point I feel like a cluster of clusters would be safer, so you can slowly roll it out.
Even if you could get everyone to tend the garden, it encourages unnecessary breaking changes or fragile design. LTS provides some much-needed backpressure. If a new feature truly requires breaking things, people who really want it will put in the effort.
That's why Kubernetes provides versioned APIs with proper lifecycles and levels of stability.
A stable API is not going to change randomly on you, and if it does, it's going to come with a deprecation policy so you have time to deal with it.
Personally I did have a few "painful" k8s upgrades, but every time it happened it was due to a long-announced, large-scale change... that another team didn't want to upgrade past, despite delaying so long that the last compatible version became EOL.
In theory, I agree. Tend your garden, do small upgrades often instead of major upgrades less often.
In a homelab this is easy to implement. In a small organization it is too.
But in the facetious "real world", things are a lot more complicated. I work as an SRE Manager and my team is basically ALWAYS upgrading Kubernetes. New releases drop about as fast as we can upgrade the last ones.
When you work on a large cluster, doing an upgrade isn't a simple process. It requires a ton of testing, several steps, and being very slow and methodical to make sure it is all done properly. Where I currently work, we have 2 week sprints and infrastructure changes must align with sprint cycles. So to promote an upgrade at the fastest possible schedule it requires:
- Week 0: Upgrade Dev environment
- Week 2: Upgrade QA Environment
- Week 4: Upgrade Sandbox Environment
- Week 6: Upgrade Prod Environment
That is the fastest possible schedule. That assumes we do a cluster upgrade every sprint, which is 2 weeks. It also ignores other clusters for other business units. We have 4 primary product lines, so multiply all that work times 4. Plus we have supporting products (like self-hosted gitlab, codescene, and custom tools running in k8s clusters).
I say fastest possible schedule because, we can't keep up with this schedule, but even if we could it is the fastest we could go and still maintain our deployment and infrastructure promotion policies.
With new releases every 3-4 months (12-16 weeks), we are essentially in a constant state of always upgrading kubernetes. Right now my team is 2 versions behind. Skipping versions doesn't make sense because you can't guarantee a safe upgrade when skipping versions.
This is why LTS releases are nice. When you run systems at large scale, it is impractical to upgrade that often. I'd prefer to limit upgrades to no more than twice a year, and personally I find annual upgrade cycles to be the best balance between "tending the garden" and "not drowning in upgrade work". LTS releases are usually meant to let you skip upgrades, so that companies can go from LTS release to LTS release without needing to worry about upgrading through every minor version in between.
Remember, upgrading K8s clusters isn't what my bosses want to hear my team spends its time on. They want to know that observability is improving, devs are getting infrastructure support, we are building out new systems, deploying hardware for the product team, running our resiliency tests, etc. Sure, upgrading is part of the job, but I can't ALWAYS be upgrading. I have a lot of other responsibilities.
I don't know what your internal architecture looks like, but on the face of it, arguably you should be running Dev and QA loads on the same cluster. Either you have an organization where an SRE team is responsible for running clusters for teams, in which case why not run Dev and QA on the same cluster, or you are responsible for last-line-of-support for teams responsible for running their own clusters, in which case you say, here's a new version of Kubernetes (e.g. a Terraform module) and in this sprint you are responsible for deploying it through dev, QA, pre-prod, and production clusters. Especially if you have separate Kubernetes clusters per product line (do you really need separate dev Kubernetes clusters per product line?).
unless you have some strict security reasons, why even have separate prod clusters per product? We definitely don’t. seems to kinda miss the point of k8s - aka running all kinds of mixed workloads with separation within a cluster if you need it.
Agreed. Separate prod clusters per product usually fix an organizational problem first (lack of trust that running shared workloads would/could be safe, versus giving each team their own servers). A lot of organizations, sadly, prefer to pay for separate clusters rather than set up RBAC, ResourceQuotas, etc.
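For what "RBAC, ResourceQuotas, etc." amounts to in the simplest case, a minimal sketch; the names, numbers, and group binding are all placeholders.

    # Minimal per-team isolation inside a shared cluster: a namespace, a quota,
    # and a binding to the built-in "admin" ClusterRole. All names/values are placeholders.
    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Namespace
    metadata:
      name: team-a
    ---
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: team-a-quota
      namespace: team-a
    spec:
      hard:
        requests.cpu: "20"
        requests.memory: 64Gi
        pods: "200"
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: team-a-admins
      namespace: team-a
    subjects:
    - kind: Group
      name: team-a
      apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: ClusterRole
      name: admin
      apiGroup: rbac.authorization.k8s.io
    EOF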
Hypothetically, since the scaling limit for Kubernetes clusters is ~10,000 nodes (last I checked), you could have multiple product lines that took up 10k+ nodes each. Then there's no reason why not to split by product line. But in the beginning it should be fine.
There's also edge cases - Kubernetes doesn't support setting resource requests or limits for networking or I/O, which you usually solve by setting up taints/tolerations/affinity to manually schedule those workloads onto nodes where you've manually run the numbers. But still not usually a reason to prefer separate production clusters.
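A rough sketch of that manual-scheduling trick, assuming you've sized the nodes yourself; the label/taint key, node name, and pod are placeholders.

    # Reserve nodes for I/O-heavy workloads with a label + taint, then opt pods in.
    kubectl label node node-io-1 workload-class=io-heavy
    kubectl taint node node-io-1 workload-class=io-heavy:NoSchedule

    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Pod
    metadata:
      name: io-heavy-job
    spec:
      nodeSelector:
        workload-class: io-heavy
      tolerations:
      - key: workload-class
        operator: Equal
        value: io-heavy
        effect: NoSchedule
      containers:
      - name: worker
        image: busybox:1.36
        command: ["sleep", "3600"]
    EOF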
Arguably, the design of kubernetes is for clusters to be based on physical clusters - to the point of a cluster per aisle (or multiple aisles), and resources of those clusters then being used to deploy applications even across clusters. Or in the small, like Chick-Fil-A, with kubernetes running locally at every restaurant.
Not just past the 10k node limit - it's also because the k8s design has an unstated assumption about being used to arrange physical sets of machines, similar to how the Borg paper would talk about a cluster being equivalent to an aisle in a datacenter.
Are you just referring to how you assign labels and taints to nodes, which in a physical datacenter could include labels like which aisle the baremetal server is in?
Of course that's feasible, to tell the Kubernetes scheduler that you only want to schedule the workload on a very specific server, but that's not really best-practice...
Sometimes certain use cases require clusters running in different regions of the world, or have to run in a specific region. There's also failover. Personally I always strive to use a "mono" cluster unless I can't avoid otherwise. It does happen.
we have hundreds of clients, most on k8s or OpenShift. Many of them are shifting to smaller, team- or division-oriented clusters so that they can move at their own pace in consuming capabilities, tools, security practices, etc. It also simplifies distribution of operational expenses; yeah, there are tools out there to help with that, but it's a whole lot easier to hand someone a sub-account in AWS and call it a day. Additionally it reduces your fault boundary. It isn't as sexy as saying you have a 5000-node cluster with 100k+ pods running, but it can make life a whole lot less stressful when it comes to upgrades and changes.
So basically, doing to infrastructure the microservices approach?
a) if you're going to intentionally make the trade-off to have higher costs in exchange for reducing ops load and simplifying administration within an account, where services make network calls across accounts, then adopting ECS/Fargate is probably a significant improvement over Kubernetes.
b) The underlying engineering / financial reality is still there and very much a leaky abstraction that will show up on your AWS bill. Cross-VPC, cross-region, and cross-AZ networking costs are very real overhead that you must consider; you either still have centralized network planning with shared VPCs and subnets or you basically decide to give up and let AWS send you the bill when teams create their own VPCs, their own subnets, etc.
c) setting up DNS / service discovery and network controls within a single cluster is simple. Doing so across account boundaries is not, and deciding to set up solutions like AWS Transit Gateway incur their own costs.
I'm sure it simplifies some things for some people, but Finance is going to get upset pretty quickly. Once you start to go down that path, it's very difficult to back out.
Everything’s benefits and tradeoffs. With this model there’s a clear ownership model. I’m not suggesting it’s right for everyone but it is a trend we’ve noticed with a number of clients over the last 2-3 years.
imo you've just traded one set of centralized costs (developing tools and security practices that work for all, and educating people) for a distributed set of costs, with everyone managing their own clusters and duplicating all this work. You've just pushed the cost onto other teams - which can be a fine approach for reducing the central team's stress, but is net more costly to the business. EDIT: I'm not saying it's necessarily a bad choice, just much more expensive.
It's all swings and roundabouts, and has been since the broad adoption of computers. I expect in a few years' time, when everyone's forgotten the pain points that drove them to decentralize, there'll be a big move to centralize again. Someone will get a gold star for coming up with the idea and being able to demonstrate the "cost savings". Of course it'll completely ignore the general disruption to all of the product teams as they adapt to the new world order, but what's a company without a little busy work.
IME mixing financial concerns with engineering concerns puts you into an architectural corner. Suddenly the business wants an integration between the two projects - which cluster does the integration run on? Do you spin up a whole new cluster / account just for the integration? Inside a project, you need to run dedicated infrastructure for large customers. How do you report that for billing?
Just set up tagging correctly.
> For very small projects, one cluster per product is expensive, so mutualisation makes sense.
If you're not working for a BigCo, every project starts out very small ;) You shouldn't optimize prematurely.
I've worked on ETL integrations in the past where there's an API for service A that's read-only and an API for service B that's write-only; service A is provided by a third-party vendor (so you're not exactly going to get the vendor's code to alter or fork it), and politically the people who work on the integration are different from the people who are responsible for service B; the integrators neither have permission to make alterations to service B, nor will they get service B engineers to take backlog items to integrate read functionality or otherwise integrate with the vendor for service A directly. The integrators simply need to build a completely separate integration that lives outside of service A and service B to ETL from service A into service B.
If service A is in one cluster, and service B is in another cluster, where billing is per-cluster, in which cluster do you put the ETL integration? It belongs to either both or neither, depending on how Finance wants to bill it.
yeah, billing is pretty simple with a bit of investment in tagging. If it gets complicated there are third-party tools to help. Much better than introducing significant complexity into your systems architecture just for the accounting team.
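One hedged shape that tagging investment can take inside the cluster itself: consistent labels on namespaces that cost tooling (or plain kubectl) can group by. The label keys below are an illustrative convention, not a standard.

    # Label namespaces with ownership/cost metadata once, then report against it.
    kubectl label namespace team-a cost-center=CC-1234 product=payments
    kubectl get namespaces -L cost-center,product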
Why don't you update dev+qa at the same time? Or qa+sandbox? Or dev+qa+sandbox? Separating these 3 might make sense for the applicative part (and even then, I think 4 environments is too many), but I'm not sure it really makes sense to follow it on the infra side.
I'm also not sure why you manage all these clusters; why not merge dev, qa and sandbox into a single cluster with namespace partitioning? It would be way less work and probably more cost-effective.
For sure, with scale comes slow-downs and new problems, that’s just an organisational reality, totally get that.
I am slightly curious why an upgrade takes 2 weeks? At my current work (self-managed clusters), rolling out an upgrade on the "happy path" is a fairly low-intensity task: we kick it off, nodes gracefully drain, upgrade and come back online. No intervention necessary. The unhappy path just requires another command to roll them back. Prod is the same, but happens slightly slower because there's more nodes (and more independent prod clusters).
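Spelled out, that per-node happy path is a short loop; the node name and timeout are placeholders.

    # Per-node upgrade loop: stop scheduling, evict workloads, upgrade/replace, re-admit.
    kubectl cordon node-01
    kubectl drain node-01 --ignore-daemonsets --delete-emptydir-data --timeout=10m
    # ...upgrade or replace the node here...
    kubectl uncordon node-01
    kubectl get nodes -o wide   # confirm version and Ready before moving to the next node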
This, like a recent LTS discussion I saw for a different tool, ignores one tiny little detail that makes the whole discussion kind of moot.
LTS doesn't mean it's immune to bugs or security vulnerabilities. It just means that the major release is updated and supported longer - but you still need to be able to apply patches and security fixes to that major release. Yes, it's easier to go from 1.20.1 to 1.20.5 than to 1.21, because there's less chance of breakage and less things that will change, but the process is pretty much the same - check for breaking changes, read changelogs, apply everything. The risk is lower and it might be slightly faster, but fundamentally it's the same process. If the process is too heavy and takes you too long, having it be slightly faster won't be a gamechanger.
So LTS brings slight advantages to the operator, while adding potentially significant complexity to the developer (generally backporting fixes into years old versions isn't fun).
The specific proposed LTS flavour is also hardcore, without an upgrade path to the next LTS. The exact type of org that needs an LTS will be extremely reluctant to have to redo everything 2 years later, with potentially drastic breaking changes making that change very hard.
That's not how LTS is supposed to work. You should be able to upgrade effortlessly with minimum risk.
If you're at a point where a patch for LTS looks like an upgrade to the new version you've screwed up LTS.
Also, getting to the point of having an LTS and actually providing the support is expensive. You need experts that can backport security fixes and know the product inside out.
> That's not how LTS is supposed to work. You should be able to upgrade effortlessly with minimum risk.
How do you do that on something as complex and with as many moving parts as Kubernetes? And how do you as an operator update that many things without checking there's no breaking changes in the patch?
We upgrade our distros pretty much fearlessly, all the time. While I have had breakage from Kernel upgrades, they've been very rare (and generally related to third party closed drivers). Kubernetes is _not_ more complicated than the Linux kernel, but it is much more dangerous to upgrade in place.
> Kubernetes is _not_ more complicated than the Linux kernel, but it is much more dangerous to upgrade in place.
eh, the kernel is an incredibly mature project with 1 machine scope. The kernel also has decades of operating systems research and literature to build on. Kubernetes in comparison is new, distributed and exploring uncharted territory in terms of feature set and implementation. Sometimes bad decisions are made, and it's fair to not want to live with them forever.
The kernel project looks very different today than it did in 1999.
There is a happy medium though, and Kubernetes is kinda far from it.
Erlang and its runtime discovered and solved most of these problems in the '80s. We are slowly rediscovering this, the same way React rediscovered the event loop that Windows had in the '90s.
Erlang solved the problem by making a custom VM that abstracts the network away for the most part and is pretty opinionated about how you do that. Kubernetes is not that. I don't see how Erlang is relevant here. You can run Erlang applications on Kubernetes, not the other way around.
My answer is simple: don't. Use something far simpler and with fewer moving parts than Kubernetes, and something where crucial parts of the ecosystem required to make things even basically work are not outsourced to third party projects.
I don't see anywhere that GP said an LTS patch would take effort. They said the upgrade path to the next LTS would.
If you are talking about upgrade from LTS to LTS, can you give an example project where that is effortless? And if so, how do they manage to innovate and modernize without ever breaking backwards compatibility?
Here: "it's easier to go from 1.20.1 to 1.20.5 than to 1.21, because there's less chance of breakage and less things that will change, but the process is pretty much the same"
LTS to LTS is another story. But the point is that L=LongTerm so in theory you're only going to do this exercise twice in a decade.
> manage to innovate and modernize without ever breaking backwards
yeah. fuck backwards compatibility. that is for suckers. how about stopping the madness for a second and thinking about what you are building when you build it?
> in theory you're only going to do this exercise twice in a decade.
So I've seen things like this in corporations many times and it typically works like this...
A well-trained team sets up the environment. Over time team members leave and only less senior members remain. They are capable of patching the system and keeping it running. Eventually the number of staff even capable of patching the system diminishes. The system reaches end of life and the vendor demands upgrading. The system falls out of security compliance and everything around it becomes an organizational exception in one way or another. Eventually, at massive cost from outside contractors, the system gets upgraded and the cycle begins all over again.
Not being able to upgrade these systems is about the lack of and loss of capable internal staff.
Fossilization and security risk is the cost. I'm dealing with one of these systems that's been around like 5 and a half years. It no longer gets security updates, so it has risk exceptions in the organization. But the damn thing is like a spider, woven into dozens of different systems, and migrating to a newer version is going to take, I'm estimating, hundreds to thousands of hours of work on updating those integrations alone. Then you have the primary application and the multitude of customizations: what would have been a stepped upgrade, changing a little bit of functionality at a time, now requires massive rewrites.
The cost either way was likely millions and millions of dollars. But now they are having to do it all at once and risk breaking workflows for tens of thousands of people in a multitude of different ways.
Just upgrading the kernel on one of those "LTS" systems so that developers could start being ready for a kernel that wasn't 3.10 (and it turned out that a core component of our app crashed due to... a memory layout bug that accidentally worked on old kernels)...
I had to start by figuring out all the bits necessary to build not just the kernel, but also external modules and attendant tools, using a separate backported compiler, because the then-current LTS kernel wouldn't compile with the distro-supplied GCC.
I've recently worked in a high-profile company where it took them a long and painful effort to move from CentOS 6 to 7 (over a year-long effort, IIRC, finished for prod in 2021? but with some crucial corp infra still on 6 in 2022).
In 2022 they had to start a new huge effort to deal with the migration off CentOS 7, and the problems were so painful that it was considered reasonable to build a Linux distro from scratch and remove all traces of distro dependency from the product (SaaS).
that sounds really interesting, can you elaborate on the challenges? why was it so important for them to move off CentOS7, and why didn't they move to RHEL or Alma or Rocky or whatever similar?
The US Government woke up to the fact that allowing vendors waivers on requirements for upgrades ends up with nothing ever happening. CentOS 7 is EOL'd next year.
Additionally, there was fun of FIPS-140 and OpenSSL older than 3.0.
Alma and Rocky were considered, but that would still involve a (possibly similarly painful) migration, as with CentOS 6 -> 7.
Have you seen pricing for RHEL? We're talking hundreds of thousands of systems. I never saw raw stats, but I would have been totally unsurprised to see them hit a million instances across all the clouds used, at least occasionally.
Decoupling the software from distro dependencies was seen as a way to future-proof the deployment story and avoid situations like we had with CentOS 7, where they really, really would have liked to upgrade some stuff for newer APIs, but couldn't due to the mess with OS-provided dependencies.
decoupling meant something like using "distroless" or static builds (musl?) or simply shipping everything on an alpine/ubuntu/debian/whatever image? (and previously there was no containerization, but now there is)
> but the process is pretty much the same - check for breaking changes,
Unless you're relying on buggy behaviour, there should be no breaking changes in an LTS update.
(...of course, there's no guarantee that you're not relying on buggy (or, at least, accidental) behaviour. People relying on `memcpy(3)` working as expected when the ranges overlap, simply because it happened to do so with historic versions of the `libc` implementation they most commonly happened to test with, is one example. But see also obxkcd https://xkcd.com/1172/ and https://www.hyrumslaw.com/ )
It’s impossible to avoid the occasional breaking change in an LTS, especially for software like this. Security fixes are inherently breaking changes— just for users we don’t like.
> Or a security vulnerability has forced a breaking change.
Theoretically, I suppose?
Do you have a historic example in mind?
I've been running Debian "stable" in its various incarnations on servers for over a decade, and I can't remember any time any service on any installation I've run had such an issue. But my memory is pretty bad, so I might have missed one. (Or even a dozen!) But I have `unattended-upgrades` installed on all my live servers right now, and don't lose a wink of sleep over it.
This happens all the time on systems that are running hundreds of thousands of apps across hundreds of customers.
The worst one I know: for a while basically all Cloud Foundry installations were stuck behind a patch release because the routing component upgraded their Go version and that Go version included an allegedly non-breaking-change that caused it to reject requests with certain kinds of malformed headers.
The Spring example app has a header with exactly this problem, so it was impacted. And the vast majority of Cloud Foundry apps are Spring apps, many of which got started by copying the Spring example app.
So upgrading CF past this patch release required a code change to the apps running on the platform. Which the people running Cloud Foundry generally can’t get — there’s usually a team of like 12 people running them and then 1000s of app devs.
OpenSSL isn't necessarily the best at LTS, but 1.0.1 released a series of changes to how they handled ephemeral Diffie-Hellman generation, which could be hooked in earlier releases but not in later ones.
For the things I was doing on the hooks, it became clear that I needed to make changes and get them added upstream, rather than doing it in hooks, but that meant we were running OpenSSL with local patches in the interim of upstream accepting and releasing my changes. If you're not willing to run a locally patched security critical dependency, it puts you between a rock and a hard place.
Comparing a single function to an entire ecosystem is crazy. Making an LTS imposes a skew of compatibility and support on all downstream vendors as well as the core team. The core team has done a great job of keeping GAed resources stable across releases. I understand there's more to it than that, but you should be regularly upgrading your dependencies as par for the course, not swallowing an elephant every 2 years or whenever a CVE forces your hand. The book Accelerate highlights this quite succinctly.
From my perspective as a former developer on a kubernetes distribution that no longer exists.
The model seems to largely be: the CNCF/Kubernetes authors have done a good job of writing clear expectations for the lifetime of their releases, but there are customers who for various reasons want extended support windows.
This doesn't prevent a distribution from offering or selling extended support windows, so the customers of those distributions can put the pressure on the distribution authors. This is something we offered as a reason to use our distribution: that we could backport security fixes or other significant fixes to older versions of Kubernetes. This was especially relevant for the customers we focussed on, who had lots of clusters installed in places without remote access.
This created a lot of work for us though, as whenever a big security announcement came out, I'd need to triage on whether we needed a backport. Even our extended support windows were in tension with customers, who wanted even longer windows, or would open support cases on releases out of support for more than a year.
So I think the question really should be: should LTS be left to the distributions, many of which will choose not to offer longer support than upstream but allow for more commercial or narrow offerings where it's important enough to a customer to pay for it? Or should it be the responsibility of the Kubernetes authors, and in that case what do you give up in project velocity with more work to do on offering and supporting LTS?
I personally resonate with the argument that this can be left with the distributors, and if it's important enough for customers to seek out, they can pay for it through their selected distribution, or by switching distributions.
But many customers lose out, because they're selecting distributions that don't offer this service, because it is time consuming and difficult to do.
As an industry, we need to get back to having security releases separate from other sorts of releases. There are tons of people who don't want to, or can't, take every feature release that comes down the pike (particularly since feature updates happen so insanely often these days), and this would be a huge win for them.
Maybe more importantly, you could get a distribution to support you but what about upstream projects? It'd be a big lift (if not impossible) to get projects like cert-manager cilium whatever to adopt the longer release cycle as well.
Is it normal for a distribution to also package upstream projects that customers want?
> Maybe more importantly, you could get a distribution to support you but what about upstream projects? It'd be a big lift (if not impossible) to get projects like cert-manager cilium whatever to adopt the longer release cycle as well.
Exactly this. I see a lot of parallels between k8s releases and OS releases.
Even if you're paying Microsoft for patches to Windows XP, I'm not seeing any of that, and the Python runtime that most of my software relies on also isn't seeing their cut, so... I guess upgrade to at least Python 3.10 and then call me back?
I would prefer to see the conversation turn more to "what can be done to reduce reluctance to upgrading? How can we make k8s upgrades painless so there's minimal incentive to stick with a long out of date release?"
> It'd be a big lift (if not impossible) to get projects like cert-manager cilium whatever to adopt the longer release cycle as well.
It's a great point which probably should be part of the discussion. Say even if Kubernetes project offered LTS, how would that play into every other project that is pulled together.
> Is it normal for a distribution to also package upstream projects that customers want?
I suspect it differs by distribution. The distribution I worked on included a bunch of other projects, but it was also pretty niche.
> But many customers lose out, because they're selecting distributions that don't offer this service, because it is time consuming and difficult to do.
Sure, but if they really need that service they will gravitate to distributions that do provide it, so, I think, no harm done here. It's to me like JDK distributions: some give you six months, some give free LTS, and others give you LTS with a support contract. LTS with backports is work, someone has to pay for it, so let those who really need it pay. Everyone else can enjoy the new features.
tl;dr: I'm with you in the camp that you can leave it to the distributors.
Good overview! I'd personally rather have better tooling for upgrades. Recently the API changes have been minimal, but the real problem is the mandatory node draining that causes downtime/disruption.
In theory, there's nothing stopping you from just updating the kubelet binary on every node. It will generally inherit the existing pods. Nomad even supports this[1]. But apparently there are no guarantees about this working between versions. And in fact some past upgrades have broken the way kubelet stores its own state, preventing this trick.
All I ask is for this informal trick to be formalized in the e2e tests. I'd write a KEP but I'm too busy draining nodes!
> but the real problem is the mandatory node draining that causes downtime/disruption.
> ...
> In theory, there's nothing stopping you from just updating the kubelet binary on every node.
I am pretty sure Kubernetes itself does not mandate node draining. I have been doing upgrades for bare-metal clusters for years, and like you said, it's mostly just replacing the kubelet binary and bouncing it.
However, I do understand that in public cloud it's usually recommended to perform a rolling node update instead of modifying online nodes in place. Actually, I prefer this way because of the benefits of immutable infrastructure. The downtime is unfortunate, but so far I have enjoyed working with devs to better design reliable apps that tolerate node issues like this.
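For reference, a hedged sketch of that in-place path on a Debian/Ubuntu bare-metal node; the package versions and version-suffix format are assumptions (check your configured repo), and the kubelet/API server version-skew policy still applies.

    # In-place kubelet bump on one node (assumes Debian/Ubuntu packages and that the
    # node name matches the hostname). Version strings are illustrative.
    apt-get update
    apt-get install -y kubelet=1.28.6-1.1 kubectl=1.28.6-1.1
    systemctl daemon-reload
    systemctl restart kubelet

    kubectl get node "$(hostname)" -o wide   # node should rejoin with existing pods intact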
> the real problem is the mandatory node draining that causes downtime/disruption.
This sounds a lot like "We don't actually patch the OS" which is quite common among many companies.
As a former enterprise kubernetes distro maintainer, I can tell you with certainty that most on-premise kubernetes customers aren't patching their machines between kubernetes releases, and try to stay on kubernetes releases for 18+ months.
I have hope for that problem to be solved too, with some combination of minimal kernels and live patching. I really don't think running two copies of everything and hammering CPU/RAM/Disk/Network with constant drain operations is a permanent solution for applying patches.
100% as someone who used to support Kubernetes commercially, long term support is an engineering nightmare with kube. My customers who could upgrade easily were more stable and easily supported. The customers that couldn’t handle an upgrade were the exact opposite - long support cases, complex request processes for troubleshooting information, the deck was probably stacked against them from the start.
Anyway, make upgrades less scary and more routine and the risk bubbles away.
Rotating out nodes during an upgrade is slow and potentially disruptive, however your systems should be built to handle this, and this is a good way of forcing it.
You definitely don't need to drain your nodes. I have never drained the nodes on my personal cluster; I just update and restart the control-plane components.
The procedure is more of a cloud-ism where people don't upgrade their nodes in place but rather get entirely new nodes.
No open source package that's given away for free needs to or should pursue LTS releases. People who want LTS need a commercially-supported distribution so that they can pay people to maintain LTS versions of the product they're using.
Not saying that companies shouldn't pay for extended support, but many other open source software have LTS releases with multi-year support (e.g. Ubuntu/Debian 5 years for LTS releases, and Node.js for 2.5 years.)
Additionally, I think one of the major reasons for LTS is that K8s (and related software) regularly introduces breaking changes. Out of all the software that we use at work, K8s probably takes the most development time to upgrade.
For a group so devoted to "cattle, not pets", so many responses here indicate an almost constant need for hands-on effort from upgrade testing, UAT, right through to post-upgrade hypercare.
I'd like a slightly longer LTS purely so I'm not having to spend all my time spinning the plates to keep things up. I don't need 10 years LTS, I need three so I can work with the rest of the enterprise that moves even slower.
To borrow from reliability engineering, software failures in practice can approximate a "bathtub curve".
That is: an initial high failure rate (teething problems), a low failure rate for most of the lifespan (when it's actively maintained), then gradually increasing failure rate (in hardware this is called wear-out).
Unlike hardware, software doesn't wear out but the interfaces gradually shift and become obsolete. It's a kind of "gradually increasing risk of fatal incompatibility". Something like that.
I wonder if anyone has done large-scale analysis of this type. Could maybe count CVEs, but that's just one type of failure.
Perhaps it's because I work in a small software shop and we do only B2B, but 99% of our applications consist of a frontend (JS served by an nginx image), a middleware (RoR, C#, Rust), nginx ingress and cert-manager. Sometimes we have PersistentVolumes, for 1 project we have CronJobs. SQL DBs are provisioned via the cloud provider. We monitor via Grafana Cloud, and haven't felt the need for more complex tools yet (yes, we're about to deploy NetworkPolicies and perform other small changes to harden the setup a bit).
In my experience:
- AKS is the simplest to update: select "update cluster and nodes", click ok, wait ~15m (though I will always remember vividly the health probe path change for LBs in 1.24 - perhaps a giant red banner would have been a good idea in this case)
- EKS requires you to manually perform all the steps AKS does for you, but it's still reasonably easy
- All of this can be easily scripted
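Roughly what "easily scripted" looks like on AKS (EKS has equivalents via eksctl or the AWS CLI); the resource group, cluster name and version are placeholders.

    # Check what the control plane can move to, then upgrade cluster + node pools.
    az aks get-upgrades --resource-group my-rg --name my-cluster --output table
    az aks upgrade --resource-group my-rg --name my-cluster \
      --kubernetes-version 1.28.5 --yes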
I totally agree with the other comments here: LTS releases would doom the project to supporting >10-year-old releases just because managers want to "create value" but don't want to spend a couple of weeks a year caring for the stuff they use in production. Having reasonably up-to-date, maintainable infrastructure IS value to the business.
1. Pay someone else to do it for you (effectively an LTS)
2. Don't use it
Software is imperfect, processes are imperfect. An LTS doesn't fix that, it just pushes problems forward. If you are in a situation where you need a frozen software product, Kubernetes simply doesn't fit the use case and that's okay.
I suppose it's pretty much all about expectations and managing those, instead of trying to hide mismatches, bad choices and ineptitude (most LTS use cases). It's essentially x509 certificate management all over again: if you can't do it right automatically, that's not the certificate lifetime's fault, it's the implementor's fault.
As for option 1: that can take many shapes, including abstracting away K8S entirely, replacing entire clusters instead of 'upgrading' them, or having someone do the actual manual upgrade. But in a world with control loops and automated reconciliation, adding a manual process seems a bit like missing the forest for the trees. I for one have not seen a successful use of K8S where it was treated like an application that you periodically manually patch. Not because it's not possible to do, but because it's a symptom of a certain company culture.
As someone working in managing a bunch of various Kubernetes clusters - on-prem and EKS - I agree a bit. Managing Kubernetes versions can be an utter PITA, especially keeping all of the various addons and integrations one needs to keep in sync with the current Kubernetes version.
But: most of that can be mitigated by keeping a common structure and baseline templates. You only need to validate your common structure against a QA cluster and then roll out necessary changes onto the production cluster... but most organizations don't bother and let every team roll their own k8s, pipelines and whatnot. This will lead to tons of issues inevitably.
Asking for a Kubernetes LTS is in many cases just papering over organizational deficiencies.
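One hedged shape such a "common structure with baseline templates" setup (mentioned above) can take is a shared kustomize base with thin per-environment overlays, validated against the QA cluster before prod. The paths, context names and referenced patch files are illustrative scaffolding, not a complete layout.

    # Shared base + per-environment overlays; only the overlays differ per cluster.
    mkdir -p base overlays/qa overlays/prod

    cat <<'EOF' > base/kustomization.yaml
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
    - deployment.yaml          # the shared, baseline manifests live here
    EOF

    cat <<'EOF' > overlays/prod/kustomization.yaml
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
    - ../../base
    patches:
    - path: replica-count.yaml # per-environment tweaks only
    EOF

    # Render and server-side dry-run against the QA cluster before rolling to prod.
    kubectl kustomize overlays/prod | kubectl --context qa apply --dry-run=server -f -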
2 years is still not very long for large enterprises; they'll be conservative on the uptake - 3-6 months after it gets released - so they only have maybe 12-18 months before they're planning the next upgrade. It's better than vanilla kube, but compare that to RHEL, where you get 10 years of maintenance and can often extend that even further.
So in other words: no it doesn't. The CNCF does its thing, and if you want something else, you can give money to AWS or Azure or GCP and have your cake and eat it too.
I'd rather not see the resources in the Kubernetes project being re-directed to users who are in a situation where they aren't able to do a well-known action at planned intervals two or three times per year.
Two of the reasons we gave an ultimatum to teams to get off K8s are that the CNCF doesn't like LTS and loves introducing breaking changes about every other version.
Unless you have a third party product with a helm chart that really needs to run on EKS, our dev teams have been told to use ECS
Sounds like you are into some '90s top-down management mentality right there. Good luck with that (in my experience, that no longer works well, and is a poor solution applied in highly regulated markets and mainframe land). That doesn't make your lack of high-speed software supply chain management go away, and with ECS you'll just get mandatory Fargate update notifications - unless you're using ECS on EC2, in which case, what are you even doing?
We don't allow systems that can't be mass-managed and don't have an active lifecycle that keeps up with demand. Sometimes that doesn't mean K8S, but often it does, because it's the only widely adopted system with a high-speed reconciler. But the teams get to choose that on their own.
Edit: let me rephrase that a little.
It seems that software lifecycle speed and business speed interact to a degree where, if the software is faster than the business, you need a different product or you need to speed up the business. For me, if neither is an option, I'd change jobs, because it doesn't align with my standards.
Oneplane sounds right to me. Perhaps to you "order" is the most important thing, rather than teams' efficiency, effectiveness and innovation. Which, if so, sort of proves the '90s point.
Unless your company has one product ("orderliness"), that seems not all that relevant. It's also highly unlikely that some sort of 'boss at the top' knows everything and therefore is the only one with the capability to make choices.
AKS is usually like one version behind, and they deprecate older versions every 3 months IIRC. With a one-year window of versions, that's not necessarily long-term at all. The upgrade process is surprisingly kind of okay though, which is already a lot better than what I usually expect from Azure.
The economic math is very simple: organizations that release faster are more responsive to the market, have lower operational costs, and are therefore more efficient. Market players that release slower will get squeezed by more efficient players. It may happen later rather than sooner but it is inevitable.
As a business, you can decide to become more efficient, or you can decide to try to support the status quo that enjoys internal political equilibrium. Smart businesses go for the former, most businesses go for the latter.
Honestly, rather than an LTS, I think k8s needs a much better upgrade process. Right now it is really poorly supported, without even the ability to jump between versions.
If you want to migrate from 1.24 to 1.28, you need to upgrade to 1.25, then 1.26, then 1.27, and only then can you go to 1.28. This alone is a significant impediment to the normal way a project upgrades (lag behind, then jump to latest), and would need to be fixed before any discussion of an actual LTS process.
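For what it's worth, the one-minor-at-a-time rule is at least easy to script if you're on kubeadm. A rough sketch only (it assumes kubeadm-managed control plane nodes; the patch versions are made up, and installing the matching kubeadm binary, draining, and the kubelet/worker upgrades for each step are deliberately left out):

    #!/usr/bin/env python3
    """Rough sketch: walk a kubeadm control plane through each minor version in
    sequence, since Kubernetes does not support skipping minors."""
    import subprocess

    # Hypothetical path from 1.24 to 1.28 - adjust the patch versions to
    # whatever your distribution actually ships.
    UPGRADE_PATH = ["v1.25.16", "v1.26.15", "v1.27.16", "v1.28.12"]

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    for version in UPGRADE_PATH:
        # kubeadm refuses to jump more than one minor, so every hop is a full
        # plan/apply cycle ("--yes" skips the confirmation prompt).
        run(["kubeadm", "upgrade", "plan", version])
        run(["kubeadm", "upgrade", "apply", version, "--yes"])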
Your problem here is waiting until 1.28 is released before starting your upgrades from 1.24. A version or two behind is one thing, but four versions behind and you're just asking to be constantly struggling to keep your head above water.
The cynical voice inside me says it works as intended. The purpose of k8s is not to help you run your business/project/whatever; it's a way to ascend to DevOps Nirvana. That means a never-ending cycle of upgrades for the sake of upgrading.
I guess too many people are using k8s where they should have used something simpler. It's fashionable to follow the "best practices" of FAANGs, but I'm not sure that's healthy for the vast majority of other companies, which are simply not at the same scale and don't have armies of engineers (guardians of the holy "Platform").
This is sorely needed. The weird upgrade schemes in GKE (or any other vendor that builds on top of Kubernetes) cause service disruption for weeks. There are too many options with non-intuitive defaults for how the control plane and the node pools get repaired or upgraded. Clusters get upgraded without our knowledge and break the service arbitrarily. And if you are making infrastructure-level changes yourself, you have to put in extreme effort just to keep up with the upgrades.
IMHO, infrastructure is too low-level for frequent updates. Older versions need LTS.
Our cluster has reasonable complexity and the YAMLs work well so far; maybe we will run into what you're describing if we scale further. Kubernetes lets us maintain a much more complex cluster with fewer developers, and the software is well tested and reliable.
But not having an LTS is really difficult. It is impossible to keep rewriting things. And all the related components - Terraform, Helm, GKE, etc. - also update too quickly for us to manage with a small team. We do some lower-level infrastructure work, so these problems bite us harder.
There are tools to tell you if any of your deployed infrastructure uses a deprecated API. I mean, even if the tool didn't exist you could view the deprecation guide, scroll through the kubernetes versions you'll be upgrading through, and inspect your cluster for any objects using a Kind defined by any of those APIs. It's a burden but when is maintaining infrastructure not a burden? https://kubernetes.io/docs/reference/using-api/deprecation-g...
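If you'd rather not adopt yet another tool, even a crude script over kubectl covers the built-in APIs. A sketch only - the group/version list here is illustrative (cross-check it against the deprecation guide for the versions you're upgrading through), and note the caveat in the comments about server-side conversion:

    #!/usr/bin/env python3
    """Sketch: report objects still reachable through API versions slated for
    removal. Caveat: the API server converts on read, so a hit means the
    deprecated version is still being served, not necessarily that your
    manifests were written against it. It also says nothing about third-party
    CRDs, which follow their own deprecation cycles."""
    import subprocess

    # Fully-qualified resource.version.group names that kubectl understands.
    DEPRECATED = [
        "ingresses.v1beta1.networking.k8s.io",           # removed in 1.22
        "cronjobs.v1beta1.batch",                        # removed in 1.25
        "poddisruptionbudgets.v1beta1.policy",           # removed in 1.25
        "horizontalpodautoscalers.v2beta2.autoscaling",  # removed in 1.26
    ]

    for resource in DEPRECATED:
        out = subprocess.run(
            ["kubectl", "get", resource, "--all-namespaces", "-o", "name"],
            capture_output=True, text=True,
        )
        if out.returncode != 0:
            print(f"{resource}: not served (already removed, or never enabled)")
        elif out.stdout.strip():
            print(f"{resource}: still served for these objects:\n{out.stdout}")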
I mean, CRD API changes/deprecations are, by nature, not reported by the kubernetes project because they're maintained by whatever software provider you installed them from.
Unless you're saying that the CRD creates generic Kubernetes resources in the cluster that are owned by the CR. In that case the tool should definitely pick up the generic object at least, but it won't necessarily tell you that you need to update your version of the CRD, again because the CRD is owned by a third party.
Third-party CRDs are ubiquitous in every single Kubernetes-based infra I've seen so far in my career. I wasn't making a statement about how it should be, just that dependencies like this have been a huge headache on every K8s upgrade I've done (I started on 1.12, I just moved my current project to 1.27, and have done most upgrades in between).
My only point is that, given how fast the Kubernetes dev cycle is and how frequent API changes are, combined with the fact that the Kubernetes API is designed to be built on top of, this tends to be a real hassle in the real world.
Compare that to the "old" way of doing it: some monolith running in a container on a VM - that stuff could run until the cows come home with very few issues or interventions.
Yeah, that's pretty fair. Though I'd say that's something I struggle with across the board today: keeping up with software changes among all my infrastructure dependencies. The old solution used to be, "just don't ever update it and you'll have no problems until the server is 30 years old and fails to boot." Nowadays you can't really get away with that, so we update all of the things all of the time, constantly juggling support matrices between the various pieces with interlocking dependencies.
I don't think this is a problem that is unique to kubernetes, is what I'm getting at. You never said that it was, I'm mostly thinking out loud.
In my opinion, if you need a stable production environment, MicroK8s stands out as the Kubernetes LTS choice. Canonical managed to create a successful, robust, and stable Kubernetes ecosystem.
Despite its name, MicroK8s isn't exactly 'micro'—it's more accurately a highly stable and somewhat opinionated version of Kubernetes.
It's not really an issue of LTS, but of the fact that Kubernetes moves too fast and breaks backwards compatibility too often. I don't care about the particular version as long as I can leave auto-upgrades running at night and not worry about breakage.
I found it a little weird that the acronym was not spelled out in full.[1]
After all, we could also look for a discussion in the article about the need for an STS, and how that should be defined before GA or after the RC. Definitely at least before the EA versions of Kubernetes, and at least before EOL of the product....
We actually did just that, with just apiserver + SQLite-based storage, and it works pretty well. Of course you now have the same problem with the outer cluster, but if you're careful about locking your entire state into the "inner" cluster you can escape that.
I think you're saying this sarcastically, but every Kubernetes component except the kubelet already can, and usually does in most distros, run as a Pod managed by Kubernetes itself. Edit the manifest, systemctl restart kubelet, and there you go, you just upgraded. Other than kube-proxy, which has a dependency on iptables, I don't even think any other component has any external dependencies, and the distroless containers they run in are pretty close to scratch. Running them in Pods is more for convenience than dependency management.
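To make the "edit the manifest" part concrete: on a kubeadm-built node the control plane components are static Pod manifests under /etc/kubernetes/manifests, and at that layer an in-place component upgrade is little more than bumping an image tag. A simplified sketch - the path is the kubeadm default, the target tag is made up, and in practice kubeadm upgrade rewrites these files for you:

    #!/usr/bin/env python3
    """Sketch: bump the kube-apiserver image tag in a kubeadm-style static Pod
    manifest. The kubelet watches the static Pod directory and restarts the
    Pod when the file changes. Purely illustrative."""
    import re
    from pathlib import Path

    MANIFEST = Path("/etc/kubernetes/manifests/kube-apiserver.yaml")  # kubeadm default location
    NEW_TAG = "v1.28.12"  # hypothetical target version

    text = MANIFEST.read_text()
    # e.g. registry.k8s.io/kube-apiserver:v1.27.6 -> registry.k8s.io/kube-apiserver:v1.28.12
    updated = re.sub(r"(kube-apiserver:)v[\d.]+", rf"\g<1>{NEW_TAG}", text)
    MANIFEST.write_text(updated)
    print(f"kube-apiserver image set to {NEW_TAG}; the kubelet will restart the Pod")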
The actual issue that needs to be addressed when upgrading is API deprecations and incompatibility. For better or worse, when people try to analogize this to the way Red Hat or Debian works, Kubernetes is not Linux. It may not always perfectly achieve the goal, but Linux development operates on the strict rule that no changes can ever break userspace. Kubernetes, in contrast, flat out tells you that generally available APIs will stay available, but if you use beta or alpha APIs, those may or may not exist in future releases. Use them at your own peril. In practice, this has meant virtually every cluster in existence and virtually every cluster management tool created by third parties, uses at least beta APIs. Nobody heeded the warning. So upgrades break stuff all the time.
Should they have not ever released beta APIs at all? Maybe. They actually did start turning them off by default a few releases back, but the cultural norms are already there now. Everybody just turns them back on so Prometheus and cert-manager and Istio and what not will continue working.
I know it rarely happens in practice, but I disagree that Kubernetes needs an LTS, since the Kubernetes cluster itself should be "cattle, not a pet" and thus you should "just" spin up a new cluster on the target version as your upgrade strategy.
The whole point of Kubernetes is to move all of the machine management work to one place - the Kubernetes cluster nodes - so that your services are cattle, not pets.
But there is no equivalent system for whole clusters. To transition from one cluster to another, you have to handle everything that k8s gives you without relying on k8s for it (restarting services, rerouting traffic, persistent storage etc). If you can do all that without k8s, why use k8s in the first place?
So no, in practice clusters, just like any other stateful system, are not really cattle, they are pets you care for and carefully manage. Individual cluster nodes can come and go, but the whole cluster is pretty sacred.
Global load balancing and traffic shifting are real things.
But my preferred way is an in-place upgrade, as it's easy: upgrade one control plane node after the other, then upgrade every worker node, or create new ones and move your workload over.
The power of k8s is of course lost when you only move legacy garbage into the cloud, since those apps aren't HA at all.
But luckily people are slowly learning how to write cloud-native apps.
Even with the best cloud-native apps, ones that really can be moved easily from one node to another, cluster upgrades are not easy.
Upgrading one control plane node after another can still break your cluster when you have your control plane nodes running incompatible versions.
If you get a big worker node failover during the control plane upgrade, which of the control plane nodes should provision the new worker nodes, which versions of containers should kubelet pull, etc?
The upgrade process for a running multi-node system is an extremely difficult technical challenge, and it has to be designed very carefully to have a good chance of working. Breaking changes are almost impossible to introduce if the goal is to upgrade gracefully: you probably need a special compatibility version to run while the upgrade is in progress, one that is neither the old version nor the target version but can handle both. You also need to make sure all APIs are versioned in a way that makes it easy to tell whether a request comes from an old client (so you can serve the old format) or a new one, and so on.
Kubernetes has done some work on this, but it still often requires manual work. And many common plugins have just not put in this work and will break if you try to upgrade a live cluster.
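To make the "serve both old and new clients" point concrete, here's a toy sketch of the idea (this is not Kubernetes code; every name in it is made up): objects are stored in one internal form, and each published version gets its own converters, so a server mid-upgrade can answer old and new callers from the same storage.

    """Toy sketch of versioned API conversion: one internal ("hub") representation
    with per-version ("spoke") converters, so old and new clients can coexist
    during an upgrade."""
    from dataclasses import dataclass

    @dataclass
    class InternalWidget:  # the hub: what actually gets stored
        name: str
        replicas: int

    # Spoke <-> hub converters, one pair per published API version. The write
    # path would use the from_* functions to normalise whatever a client sends.
    def to_v1beta1(w: InternalWidget) -> dict:
        return {"apiVersion": "example.io/v1beta1", "name": w.name, "count": w.replicas}

    def from_v1beta1(obj: dict) -> InternalWidget:
        return InternalWidget(name=obj["name"], replicas=obj["count"])

    def to_v1(w: InternalWidget) -> dict:
        return {"apiVersion": "example.io/v1", "name": w.name, "replicas": w.replicas}

    def from_v1(obj: dict) -> InternalWidget:
        return InternalWidget(name=obj["name"], replicas=obj["replicas"])

    SERVE = {"v1beta1": to_v1beta1, "v1": to_v1}

    def handle_get(stored: InternalWidget, requested_version: str) -> dict:
        # An old client asking for v1beta1 still sees the old field names;
        # a new client asking for v1 sees the new ones - same stored object.
        return SERVE[requested_version](stored)

    print(handle_get(InternalWidget("web", 3), "v1beta1"))
    print(handle_get(InternalWidget("web", 3), "v1"))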
> But there is no equivalent system for whole clusters. To transition from one cluster to another, you have to handle everything that k8s gives you without relying on k8s for it
Ah, but there are. Maybe not a cluster orchestrator that would let you hot swap a single cluster, but if you're only running one cluster it's inherently a pet already.
One common approach I've personally used to solve this problem is Federated Clusters (kubefed): run multiple clusters with global load balancing, so that even when one entire cluster goes away, its traffic is simply shifted to the other cluster (whether because of upgrades or an unexpected control plane outage).
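kubefed's own CRDs aside, the bare-bones version of the pattern is just applying the same manifests to every cluster context and letting the global load balancer shift traffic. A minimal sketch, with the context names and manifest path obviously made up:

    #!/usr/bin/env python3
    """Sketch: the low-tech version of multi-cluster "cattle" - apply identical
    manifests to every cluster and rely on global load balancing for traffic
    shifting. Context names and the manifest directory are assumptions."""
    import subprocess

    CONTEXTS = ["prod-us-east", "prod-eu-west"]  # hypothetical kubeconfig contexts
    MANIFESTS = "deploy/"                        # hypothetical manifest directory

    for ctx in CONTEXTS:
        # Identical desired state everywhere, so any one cluster can be drained,
        # upgraded, or replaced while the others keep serving.
        subprocess.run(
            ["kubectl", "--context", ctx, "apply", "--recursive", "-f", MANIFESTS],
            check=True,
        )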
Global load balancing is not enough, though, is it? If a request to save a new piece of data is served by one cluster, it still needs to update all the other clusters so they can retrieve it later, so now you also need global data replication. The data storage service shards need names that are easily resolvable, so you need global name resolution across your clusters. You need something to monitor the workloads so it can trigger failover, so you need global health checks too. And when you do an upgrade, you need to coordinate cordoning off the clusters, so you need global state as well.
So overall, you need all, or most, of the features of kubernetes in your cluster federation solution. So why then not make each of these individual kubernetes clusters really small, say a single-node cluster each, and rely on the global infra to manage between them? Hey-presto, your "cattle farm" is now a single big pet.
That's what we do, and it's great. It doesn't just apply to Kubernetes either: you should be able to swap out most components of your infrastructure without manual work. That's why we invented all this modern software, or even hardware for that matter: redundant power supplies! If we can do it there, why is everyone so scared of doing the same for other systems?
Many of the comments here disagreeing with an LTS just because it won't be up to date are missing a critical point.
When people run a rolling release on their server, the original intent is "yay, I'll force myself to stay up to date". In reality, they hit conflicts with installed third-party software on every upgrade. What ends up happening is that they freeze at some point in time, without even security patches, for a long while.
k8s is like an OS; it's not just the core components, there's also the ingress, gateway, mesh, overlays, operators, admission controllers, and third-party integrations like Vault, autoscalers, and so on. Something or other breaks with each rolling release. I've grown really tired of the way GKE v1.25 pretends to be a "minor" automated upgrade when it removes or changes god knows how many APIs.
This is what is happening in Kubernetes land. The broken-upgrade fatigue is real, but it keeps being waved away with wishful thinking.
Well, that, and there appears to be at least some commercial market for an LTS, judging by the announcements from Red Hat, Azure, and now Amazon.
Azure is the one that effectively posed the question to the community: hey, since we all appear to be doing this anyway, is there any value in collaborating on it in the open by rebooting the LTS working group?
Now that said, while you are absolutely right that third-party solutions, whether commercial or open source, are the biggest boat anchor keeping users from upgrading, they are also the reason an LTS doesn't necessarily fix much: those things then need an LTS too, and/or they decide "hey, this is great, we'll pin to this one," and it becomes an even bigger lift when the ISV eventually does need to bring their stuff up to latest.