Kubernetes Cluster API v1.0, Production Ready (infoq.com)
82 points by linuxfreakerr 55 days ago | 59 comments



Some of the arguments against Kubernetes and for simpler systems/approaches, like Docker Compose, systemd or others, actually gave me the push i needed to collect some of my thoughts and past experiences in a blog post "My journey from ad hoc chaos to order (a tale of legacy code, services and containers)": https://blog.kronis.dev/articles/my-journey-from-ad-hoc-chao...

(i'm linking it as a top level comment here, since responding to multiple comments would feel spammy, but the discussion is interesting)

In short:

  - Docker Compose is lovely for local development, but isn't necessarily the only option to do containers easily
  - Docker Swarm is what i chose and has been working wonderfully with software like Portainer
  - K3s is a good Kubernetes distro that cuts down on the resource usage and allows you to work slightly more easily with it
  - that said, i agree that Kubernetes isn't always the right tool for the job, especially for smaller teams
  - systemd services can also be perfectly passable, but only as long as you control everything around the service itself (e.g. the environment with the updates etc.)
  - regardless of whether you use containers or not, there are benefits to configuration management systems like Ansible, which i suggest you look into if you haven't already


what is good to use if you need to do a rolling deploy midday without interruption for clients?


Unfortunately not Kubernetes, because it decoupled pod lifecycles from load balancers by design. K8S has no idea how to coordinate with load balancers to ensure new requests are flowing to new pods and to allow in-flight requests on moribund pods to complete before terminating them. This leads to at least a small number of 5xx errors being returned to clients on almost every deployment unless you implement workarounds like delaying pod shutdowns with arbitrary sleep times.


I've been managing a high traffic website running as 100s of Kubernetes microservices. We have 0 issues with 500s due to deployments.

As long as you have well set up probes and old pods have enough time to fulfil lingering requests before going away, there's no reason why you should have 500s at all.


There’s no question that it can be worked around by adding arbitrary delays in places, but finding the right delay then becomes a guessing game and a trade-off between safety and paying to run pods that will do no more work. Different applications and L4/L7 transports also require different behaviors. It also requires that the author of the podspec know to implement this workaround, instead of having this knowledge baked into the orchestrator where it rightly belongs. I suspect your early attempts at setting this up safely also failed.


From my experience, it's enough if the application is able to finish existing requests when it receives the signal to terminate. Kubernetes removes the terminating pod from the service endpoints, so it won't receive new connections. Therefore, if the application can serve all remaining queries during the grace period (30s by default), it should be fine.

Probes are useful for ensuring long term health and also for determining when to send traffic to a starting pod. They aren't strictly necessary in the termination process.

> I suspect your early attempts at setting this up safely also failed.

IMHO Kubernetes is too complicated for many use cases. The zero downtime deployment may require some testing, but it's definitely supported. It's not a rare requirement.

https://kubernetes.io/docs/concepts/workloads/pods/pod-lifec...
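
To make that concrete, here's a rough sketch of the pieces both sides are talking about, as a plain Deployment manifest (image, names and paths are made up, and the preStop sleep assumes the image ships a sleep binary): a readiness probe so traffic only arrives once the pod is ready, a slightly longer grace period, and the much-debated preStop delay to cover the asynchronous endpoint removal.

  kubectl apply -f - <<'EOF'
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: web
  spec:
    replicas: 3
    selector:
      matchLabels: {app: web}
    template:
      metadata:
        labels: {app: web}
      spec:
        # default grace period is 30s; raise it if draining takes longer
        terminationGracePeriodSeconds: 60
        containers:
        - name: web
          image: example/web:latest
          ports:
          - containerPort: 8080
          # only receive traffic once the app answers its health endpoint
          readinessProbe:
            httpGet: {path: /healthz, port: 8080}
            periodSeconds: 5
          lifecycle:
            preStop:
              # small delay so proxies/LBs can drop the endpoint before SIGTERM arrives
              exec:
                command: ["sleep", "10"]
  EOF

Whether the preStop sleep is needed at all is exactly the disagreement in this subthread.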


First, although you are correct that the endpoint is removed from the proxies, the removal is asynchronous and so the proxy configurations may not be up to date before the shutdown is initiated. Also, more and more load balancers are bypassing node ports these days, preferring instead to route traffic directly to pods, and so it’s becoming a less effective mechanism.

Second, the challenge is that app developers are rarely aware of the need, let alone how to write SIGTERM handlers that finish handling existing requests before exiting. For them it’s yet another cognitive and testing burden that is usually out of their business expertise domain, and to my knowledge few of the common web service frameworks implement a reliable shutdown handler out of the box.

On the other hand, if the orchestrator simply waited for the LB to signal that the pod was removed from the backend list before initiating the shutdown process, then it would be more robust by design.


>"Also, more and more load balancers are bypassing node ports these days, preferring instead to route traffic directly to pods, and so it’s becoming a less effective mechanism"

Could you or someone else say which LBs are currently bypassing node ports? Also, how does an LB actually bypass a NodePort, since a Service of type LoadBalancer in Kubernetes always uses a NodePort? How exactly does an external load balancer connect to a pod without first connecting to a port on the physical host?


The AWS load balancer controller for Elastic Load Balancers is capable of configuring routing directly to pods in an EKS cluster because those pods get a routable VPC subnet IP address by default. They’re not stuck behind bridges connected to non-routable subnets, so no NAT needed.
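
For reference, that direct-to-pod mode is selected per Ingress; a minimal sketch, assuming the AWS Load Balancer Controller is installed and with an illustrative Ingress name:

  # the "ip" target type registers pod IPs with the ALB directly,
  # instead of routing through a NodePort on each instance
  kubectl annotate ingress my-app \
    alb.ingress.kubernetes.io/target-type=ip --overwrite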


I can agree with that. :-)


In our current non-web application we have a launcher, which allows users to select between the current and a few previous versions. This helps us iterate faster due to fewer show-stoppers. If a user experiences a blocking bug in the current version, they log out and back into an older version.

For our web version I was thinking of doing something similar by having a launcher page, and then having each version run in its own container on a subdomain.

So far I tested this using Docker Compose with Traefik routing the requests to the appropriate app container, and it worked fine. My deployment script modifies the docker-compose.yml config file to add a new container, and then Compose brings the additional container online.

Current users get a notification that a new version is available but can continue working in the current version, and new logins have the new version selected by default (but can be overridden as mentioned).

Would something like this not work with Kubernetes? I'm not yet familiar with Kubernetes but I've been assuming we might end up using it.


There are ingress controllers for kubernetes that support setting up routing rules that would fit your needs. For example, the user's browser could set a header with a desired version and the ingress controller could route the request to a set of containers running that version. There are many other ways to skin this cat (all the way up to complex systems like GraphQL). Pick one that matches the size and complexity of your organization.
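
As one concrete shape of this, the community ingress-nginx controller can do "canary by header" routing; a rough sketch below (hostname, service name and header are my own placeholders, and it assumes a primary Ingress for the same host already routes to the default version):

  kubectl apply -f - <<'EOF'
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: app-v2
    annotations:
      # requests carrying "X-App-Version: v2" go to the v2 service,
      # everything else falls through to the primary Ingress
      nginx.ingress.kubernetes.io/canary: "true"
      nginx.ingress.kubernetes.io/canary-by-header: "X-App-Version"
      nginx.ingress.kubernetes.io/canary-by-header-value: "v2"
  spec:
    ingressClassName: nginx
    rules:
    - host: app.example.com
      http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service: {name: app-v2, port: {number: 80}}
  EOF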


You can do this in Kubernetes by combining Envoy's draining support and graceful shutdown in the backend Pod (via handling SIGTERM or adding a preStop hook to delay termination). The big users of Envoy+Kubernetes are mostly running their own custom ingress controllers to coordinate this across the ingress system (easier than it sounds).

https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overv...

Source: I helped build a system like this for a major tech company that handles many tens of billions of requests daily.


That sounds like a great deal of work that ought to have been needless. And Envoy is a second system, not part of K8S proper.


Honestly, it depends on what's in your stack, how everything fits together and what the architecture looks like.

Almost any orchestrator (Nomad, Swarm and Kubernetes) can do a rolling deploy, and can also drain particular nodes if you ever want to do that.

Currently my homepage runs on a few containers (horizontal scaling) and a load balancer in front of it, and redeploys have very little impact on the overall experience: https://kronis-dev.betteruptime.com/

That said, for write heavy workloads you'll even need to manage your DB migrations as a multi step process to allow both old and new schema versions to work during a redeploy, whereas in other circumstances you'll need to think more about how to roll back either the DB, the app containers/instances or even events in your system (depending on the architecture and how much can go wrong).

In short:

  - horizontal scaling can help, as can having a load balancer with health checks
  - keeping changes non-breaking between any 2 releases can help (e.g. multi-stage DB migrations)
  - complications mostly arise with stateful services and data stores
  - apps and APIs being able to cope with others being offline, if only for a bit (e.g. request replay) can also help


I am currently using Terraform almost like Docker Compose to manage about 5 services + a database + Redis in Docker containers. When I do a deploy, I swap version 0.0.1 for version 0.0.2, for example, and it brings 0.0.1 down, then brings 0.0.2 up (aka, not very rolling-deploy-like).

How would I migrate that to Swarm? I'm not using AWS/GCP/any of that. Just locally hosted Docker containers.


This is probably a good starting point: https://docs.docker.com/engine/swarm/swarm-tutorial/

Essentially, if you already have a node or multiple nodes that have Docker, you can initialize a swarm with:

  docker swarm init --advertise-addr <MANAGER-IP>
And, optionally, add nodes as workers to the cluster:

  docker swarm join --token <JOIN-TOKEN> <MANAGER-IP>:2377
And then you can deploy stacks:

  docker stack deploy --compose-file docker-compose.yml my_stack
Here's a tutorial about stacks: https://docs.docker.com/engine/swarm/stack-deploy/

You'll probably want to read up on the Compose specification and the values that Docker Swarm and Docker Compose support since last i checked there were some differences in v2 / v3 (which many people still use): https://docs.docker.com/compose/compose-file/

Of course, if you want a less steep learning curve for actually running Swarm, depending on your personal preferences (GUI vs CLI), you might want to have a look at Portainer, which is a lovely and simple dashboard: https://www.portainer.io/

From there, it should mostly be a matter of running multiple replicas of each of your services and doing rolling updates, while having a stable load balancer in front of everything: https://docs.docker.com/compose/compose-file/compose-file-v3...

Also, you'll probably want health checks because Swarm can then route traffic to containers only after they have truly started: https://docs.docker.com/compose/compose-file/compose-file-v3...
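
Putting the last two points together, the relevant bits of a Swarm stack file look roughly like this (image, port and health endpoint are placeholders, and the health check assumes curl is available in the image); "order: start-first" brings a new replica up before the old one is stopped:

  cat > docker-compose.yml <<'EOF'
  version: "3.8"
  services:
    web:
      image: registry.example.com/web:latest
      healthcheck:
        test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
        interval: 10s
        timeout: 3s
        retries: 3
      deploy:
        replicas: 3
        update_config:
          parallelism: 1
          delay: 10s
          order: start-first
  EOF
  docker stack deploy --compose-file docker-compose.yml my_stack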


To be clear, Kubernetes does not know anything about draining traffic from pods or nodes. See my comment elsewhere.


>Kubernetes isn't always the right tool for the job, with which i agree, especially for smaller teams

I think this is the key point. Kubernetes is great when used correctly, in the right scenario. And I feel that most of the hate comes from situations where it's not suitable. Kubernetes is just one of the many platforms available to run your code. My point extends to more systems, like Ansible, Chef and all the others. TL;DR it's not a tool for developers, it's a tool for operations and systems integrators.

A developer working on an application shouldn't care about the platform being run on, only defining its requirements - think big enterprise system that just consumes data, processes and stores it. The moment an engineer starts thinking "I need X platform" they're not developing the app anymore, they're integrating it too.

The system integrator should be able to take those requirements and match them to the right platform - choose the OS, native package vs container, scalability, fault tolerance etc. But the system integrator doesn't need to care about the long term maintenance of the platform, just how to integrate the application onto the platform. k8s can be considered here for long term support, ease of integration, portability etc.

The systems maintenance team is responsible for ensuring the system has the resources to run the application and for fixing any faults that appear in the infrastructure - this is where k8s shines the most, because I can drain a node, update it and uncordon it without worrying about where I'm going to move the containers running on the node, because k8s will do that for me. Gone are the days where we need to plan taking a machine offline for an hour, making sure the right failover procedures are in place, because k8s should be demonstrating that regularly.

Essentially, kubernetes/Ansible/etc. should not be a part of the development process. It doesn't solve problems that an application developer should care about. It's an operations tool, able to monitor and manage deployed applications. The hate occurs when the wrong people are using the tools, or the right people are using the tools incorrectly.

And of course, these kinds of separations of concern can only really happen in larger teams, on larger projects: dedicating a few people to maintaining a deployment, a different group to continuous integration, and a 3rd group to actually developing the functionality. That's just not possible in every case, and the most hate seems to come from those in-between scenarios where there are too many people to call it a small team, but too few to enable this separation of concern. Meanwhile, the most love seems to come from those that can differentiate between the many hats they need to wear in a small team.


> A developer working on an application shouldn't care about the platform being run on, only defining its requirements - think big enterprise system that just consumes data, processes and stores it. The moment an engineer starts thinking "I need X platform" they're not developing the app anymore, they're integrating it too.

This feels like a statement that begs more discussion because it's an interesting argument, albeit probably a polarizing one as well.

My experience has been the complete opposite of this - when developers also think about how the app will run, the outcomes will be the best.

No system runs in a vacuum or exists separately from its runtime environment, so attempts to decouple the two can lead to countless problems and inconsistencies down the road. For example, if you create a Java app that expects configuration to come from XML files, that may make containerization difficult and no longer follows best practices. If you create an app that stores session data in memory instead of Redis, it won't be horizontally scalable. Same for shared data, like using the file system for something - that will create a lot of problems with multiple instances, compared to using something like S3.

Even if you don't pick the platform yourself, you should make sure that the app integrates with it smoothly, hence the 12 Factor App principles can help for many of these cases.

> Essentially, kubernetes/Ansible/etc. should not be a part of the development process. It doesn't solve problems that an application developer should care about. It's an operations tool, able to monitor and manage deployed applications. The hate occurs when the wrong people are using the tools, or the right people are using the tools incorrectly.

In my experience, this leads to knowledge silos, problematic communication, not thinking about Ops concerns during development either due to a missing set of skills or just not caring due to a lack of ownership about the outcome, and just generally a slowed down pace of development.

Now, there probably is a lot of benefit in specialization (e.g. Ops specialists that consult across different teams and projects to make everything more consistent), but in my eyes everyone should know a bit of everything and communicate freely.

If you don't do that, developers use ORMs in their apps and create N+1 problems or large numbers of DB calls due to a lack of joins or no consideration for indices, because they're not DBAs after all. Other times people don't bother with instrumentation in their apps, or introduce problems with proxying configuration because they don't care about what the ingress will look like, etc.

I actually recall seeing an issue at work about an Ops related problem, recommended a few steps to take care of it, whereas the responsible dev simply told me: "That's not in my job description, i just write Java."

Now, not making people learn about an extremely vast array of topics and spend evenings/weekends doing so would also be nice, but then again, that person didn't solve the issue, i did. The end result is what matters, optimizing the road to it is up to the people in control.

That said, "rockstar devs" or "DevOps" specialists (or whatever the current term is) also carry risks in regards to a low bus factor (everything hinging on them in certain orgs, vs spreading the knowledge), so it's all probably pretty situational and scalable to different degrees.


> when developers also think about how the app will run, the outcomes will be the best

Absolutely agree, but that becomes "engineering" in my view. I'm talking specifically about implementing functionality, not long term views and project direction etc.

> Even if you don't pick the platform yourself, you should make sure that the app integrates with it smoothly, hence the 12 Factor App principles can help for many of these cases.

Which is the role of the system integrator - to integrate the app with the platform it's to be run on.

> In my experience, this leads to knowledge silos, problematic communication, not thinking about Ops concerns during development either due to a missing set of skills or just not caring due to a lack of ownership about the outcome, and just generally a slowed down pace of development.

> Now, there probably is a lot of benefit in specialization (e.g. Ops specialists that consult across different teams and projects to make everything more consistent), but in my eyes everyone should know a bit of everything and communicate freely

Absolutely agree. And I hate that this happens, which is why I try to learn about all levels of the tech stack and not just one domain. But I do feel like this is a people problem, not a tech problem or project problem.

If the team cannot communicate effectively, they're never going to produce the best product, regardless of who knows what, budgets, etc. Ultimately, these roles aren't a waterfall structure. Developers shouldn't throw things over the fence to system integrators, and the same goes for operations.

Instead, the whole team works together, in the same space, on the same backlog. If ops has a problem, it gets prioritised on the backlog. If developers are unsure of how to approach something, they talk to the integration or ops people. In an ideal world, the teams would rotate between roles regularly, so that everybody is aware of the issues being faced that can't be seen from elsewhere.

Based purely on my own experience, far too much priority is put on budget constraints and making money instead of making a product that a customer actually wants. And this often leads to the one person, many hats issue, which then cascades into a mashup of roles and chasing our tails. I've had a lot more success on projects where we've clearly defined who owns which layers of the stack than when the team are just left to fend for themselves.

> That said, "rockstar devs" or "DevOps" specialists (or whatever the current term is) also carry risks in regards to a low bus factor (everything hinging on them in certain orgs, vs spreading the knowledge), so it's all probably pretty situational and scalable to different degrees

Specialists absolutely have their place, but are not a fixed requirement. Just because a project chooses k8s, doesn't mean it needs a k8s guru to own that layer, merely that the project needs some knowledge in that area to assess the scale required and make the call on keeping it small and manageable, or getting a consultant in for a short time to design and upskill the team, or getting a dedicated DevOps person to make the impossible happen.

But again, there's no reason why all of this couldn't happen with a team size of 1, with that 1 person wearing many hats. The important thing in those situations is knowing which hat is which and prioritising appropriately. My suggestion is a high level model that can be morphed to fit the situation, not a fixed framework that everybody must follow until the end of time.


I wonder what this means for projects like https://crossplane.io/, a tool that lets you provision cloud resources with Kubernetes manifests. I guess this cluster API project is maybe something Crossplane would use under the hood?

I haven't used Crossplane yet but I like the idea around it and could see things moving in this direction over time. Mainly because:

I use Terraform now to set up cloud resources and ArgoCD to control what runs in the cluster, but there are pain points in having things exist in both worlds as dependencies. For example, if I want to Helm install the AWS Load Balancer controller into my cluster using ArgoCD, it requires having a few IAM policies created by Terraform beforehand, and if you use something like EKS Fargate, your load balancer controller also needs to know your VPC ID at install time, which is something only Terraform knows since it created the VPC.

So now you end up writing glue code to extract outputs from Terraform and use sed to replace them into Kubernetes YAML files. Another example would be having access to useful Kubernetes add-ons like External DNS to update public DNS records automatically but you still need to create your AWS Hosted Zone with Terraform beforehand for new domains.
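
For what it's worth, that glue usually ends up looking something like this (the output names and the chart's value keys are assumptions on my part, so treat it as a sketch rather than a recipe):

  # pull values out of Terraform state...
  VPC_ID=$(terraform output -raw vpc_id)
  ROLE_ARN=$(terraform output -raw lb_controller_role_arn)

  # ...and feed them into the in-cluster install
  helm upgrade --install aws-load-balancer-controller eks/aws-load-balancer-controller \
    --namespace kube-system \
    --set clusterName=my-cluster \
    --set vpcId="$VPC_ID" \
    --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="$ROLE_ARN"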

It would be way cleaner and nicer if everything was provisioned by the same tool. I know Terraform can Helm install things but I don't like this pattern where a tool that's dedicated to setting up my cloud infrastructure is now responsible for application level things. It would be like trying to use Terraform to replace Ansible, it just doesn't feel like the right tool for the job.

But a unified tool that is a reconciliation master through Kubernetes seems like a no-brainer evolution in how we deal with IaC. Not just to keep things in the same universe, but also for applying changes and drift detection.


> I guess this cluster API project is maybe something Crossplane would use under the hood?

It’s more likely the other way around: Cluster API needs a so-called infrastructure provider that creates cloud resources to get a Kubernetes cluster running. The logic to join/create a Kubernetes cluster within the VMs created by that is managed by a separate provider, the bootstrap provider.

So, I guess it would be possible to externalise the creation of cloud resources to Crossplane and write a small shim infrastructure provider that just translates your cluster specification into Crossplane resources.

Crossplane has a different use case than Cluster API (create any cloud resources vs create a Kubernetes cluster), so Crossplane using Cluster API doesn’t make a lot of sense. Crossplane basically is „what if Terraform was a Kubernetes Operator“, so that somewhat sounds like what you’re interested in with the later parts of your post.


Genuinely interested -- why would I choose this over running Terraform as part of CI/CD?


If terraform crashes during apply, it leaves behind an inconsistent state by design: The lock is still set, and some resources which were created already are not yet in the statefile. Trying to re-run terraform after a crash during apply will generally lead to an error: Even if the lock is removed, resources may still conflict if they already exist.

In contrast, when a kubernetes controller or operator crashes, it can be expected to continue seamlessly where it left off.

It is easier to write kubernetes controllers that are able to continue seamlessly than to write terraform providers that do so, because of the granularity of the persistence of the state machine. Terraform locks the remote state, then applies all resources in the current root module, then unlocks the remote state again. In contrast, kubernetes operators can granularly update individual objects after each API call that is performed.
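
In practice, recovering from an interrupted apply looks something like this (the lock ID comes from the error message; the resource address and cloud-side ID are examples):

  terraform force-unlock <LOCK-ID>                 # release the stale state lock
  terraform plan                                   # see what the state thinks is missing
  terraform import aws_s3_bucket.assets my-bucket  # adopt a resource created before the crash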


That is an (incredibly) poor explanation of Terraform plugin separation from state management (or perhaps you’re using a weird locking backend that has more issues than correctly implemented ones). If created resources don’t end up in the state file, that is a provider bug (they failed to call `SetId` soon enough).

Source: was a core developer of Terraform and some of the largest providers for many years.


If the explanation is incredibly poor, why is this issue still open: https://github.com/hashicorp/terraform/issues/20718


I have never seen Terraform crash in the years I’ve been using it. Optimizing based on a lottery event seems counter-productive

EDIT: Alright guys, good points all around. Mainly what I meant is the ratio of the usefulness of Terraform vs its shortcomings. Not hating on the Cluster API pattern, but imho it's better to stick to a standardized approach.


That's a very shortsighted comment. Just because you've never seen it happen doesn't mean that it doesn't. While terraform itself may not crash (maybe not in the manner I assume you're referring to), you are at the mercy of the implementation of the specific providers, for which I've (and many others as well, I'm sure) seen plenty of issues. Even in "mature" ones like aws/gcp.

Besides that, the control-loop pattern, which the parent comment describes is a very sound design, which is employed not only in kubernetes and has nothing to do with "Optimizing based on a lottery event".


> That's a very shortsighted comment.

Is it though? GP claims to have been using terraform for years.

I have been using terraform for some time too, and from what i've seen well written providers will error out in a safe and controlled manner instead of making the whole thing crash.

> Even in "mature" ones as aws/gcp.

Can't comment for the gcp provider, but I've worked at companies where the aws provider is used quite extensively and really can't recall a real crash that created serious problems. And i've also been using the openstack provider, again quite extensively, without much trouble.

So in conclusion I think that it's your argument that's really short-sighted, because it's based on FUD and situations that are theoretically possible but sufficiently remote in the real world that they can be ignored.


Doing a very quick search [1] on the aws provider github repo produces `92` open and `306` closed issues with various bugs in the provider. I agree that most of them might not count as creating serious problems, but I as well have been using it quite extensively and can definitely remember cases where I have had to manually deal with corrupted state in various ways.

I'm not sure which part of my comment was "FUD" - I never said that you shouldn't use terraform, I was pointing out that issues exist, always. I think of it as the nature of the work we're doing (I say we, as I imagine you are also in the SWE field based on your comments). Show me any piece of software without bugs, especially as complex as a tf provider, and I'm buying you whatever you want.

[1] https://github.com/hashicorp/terraform-provider-aws/issues?q...


Anecdotally I've had to untangle lots of half applied state. Sometimes a provider has a bug, sometimes somebody on the other side of your company changed your terraform plan without migrating the running state, sometimes there are network issues that prevent your plan from applying correctly, sometimes the cloud provider said it deployed the resources but it actually didn't, etc, etc.

It's extremely painful and delicate to do and it may not happen frequently, but in my experience it happens frequently enough when I'm dealing with a lot of terraform.


I've had Terraform break several times when my AWS STS token expired halfway through a deployment. When that happened, I ended up having to manually delete all the created resources, because the Terraform state was corrupted to the point where I couldn't reapply to create the remaining resources, but also couldn't delete the existing resources with Terraform.

That said, I agree in general that this wouldn't be the main reason to use the Cluster API over Terraform.


"Optimizing based on a lottery event seems counter-productive"

I think it depends on how often you "win" the lottery. If a lottery event happens once or twice a year and it's fairly hard to resolve, we classify it as a land mine (or a sea mine, for the more maritime-oriented people among us). These issues get a lower priority, but we'll try to work through them to avoid emergencies or lower their impact. So, maybe not optimizing, but increasing resilience against lottery events.


If you don't need to dive into the rabbit hole that is Hashicorp and Terraform and can manage with just clusterctl and kubectl. Or if you prefer yaml with no templating :)

Terraform isn't a wrong decision, it's just that CNCF now has a more stable API and CLIs you can work with, which Terraform also needs to integrate with.
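
For context, the workflow is roughly the one from the Cluster API quick start (infrastructure provider, versions and machine counts below are just illustrative):

  clusterctl init --infrastructure aws      # install the CAPI and provider controllers
  clusterctl generate cluster my-cluster \
    --kubernetes-version v1.22.3 \
    --control-plane-machine-count 3 \
    --worker-machine-count 3 > my-cluster.yaml
  kubectl apply -f my-cluster.yaml          # the management cluster reconciles it from here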


I guess it depends where you're starting from: I find the entire Hashicorp ecosystem much better documented, supported, more stable and generally less of a "rabbit hole" than the continuously shifting and expanding universe of k8s/"CNCF" projects.

Unsurprising because all the Hashicorp stuff is from a single vendor, and there is the risk of lock-in.


We use it to almost fully replace terraform (we still use it for some basic stuff) for a couple of reasons:

* simplified tooling - you can use all the same tools you already use to manage your k8s configs

* more standardized setup for multi-cloud/onprem

* it’s based on vanilla kubeadm and cloud-init/systemd images - debugging problems is easier bc it’s all pretty standard, battle-tested tech


Many Kubernetes operators are sufficiently mature to use them to manage external infrastructure.

When it works, it's nice to be able to deal with one templating language and one system to manage all of the elements of your deployments. Deploying a single helm chart is a nicer developer experience than managing both a helm chart and a terraform plan.


my understanding is that the machinery underlying it is a bit more complex than what terraform would do.

if anything, that would be because the kubernetes operators implementing the Cluster API for your specific cloud provider run as a continuously running control loop, thus picking up changes... whereas with terraform you would have to run terraform from time to time.

edit: plus of course all the other features coming from kubernetes like mutating admission, validation, rbac (eg: only allow some people to create clusters, or allow people to create clusters without granting access to the underlying cloud provider) etc etc...


or over running Joyent's Triton?


It's exciting to read the comments - why are we comparing Compose vs Kubernetes?


Because Docker Compose is much simpler and does 90% of what everyone actually needs, but it's uncool, whereas Kubernetes is extremely trendy and does 100% of things that you probably will never need.


Kubernetes does:

  - secret management
  - storage management
  - network policies
  - autoscaling
  - extensibility (cert-manager, ingress controllers, ...)
Each of those things is needed even for small orgs, except maybe autoscaling.

Docker Compose does:

  - run containers
You still need to put your secrets somewhere safe, you still need to allocate storage, you still need to build your firewall, your reverse proxy, your certificate renewal workflow, etc...

The fact that k8s is so widespread is not because "so cool, much hype, wow", it's because it solves real problems and helps you scale without having to rewrite everything every 3 months.

I can have a managed Kubernetes on DigitalOcean with all of my infrastructure set up in less than 30 minutes. But yeah, I'm a hype victim...

Kubernetes IS NOT only running containers. And you NEED more than "docker run" to serve an application. Claiming the opposite is a proof of ignorance and incompetence.


k8s doesn't do any of those things you mention (except autoscaling) without a bunch of extra components. The one virtue of k8s (that you hint at when you mention Digital Ocean) is that it provides a somewhat cloud-agnostic framework for plugging in all those pieces you mention.


Is k8s still trending?


I hope not. I'm trying to get rid of a legacy kubernetes cluster I've inherited and it's a nightmare. Is there some sort of "how to get rid of k8s" service or tutorial out there?

P.S. The kicker is that this cluster is only used to deploy a couple of three-page static websites, run a few simple cron jobs (literally using crontabs) and launch a couple of docker images straight from Dockerhub.


Using K8s for what you described is so overkill it's hilarious.


Someone got their resume pushed...


I once went with k8s for a similarly simple setup, because ultimately the pages weren't exactly static, weren't compatible with running on one server, and we also had to take care of certs and other stuff. K8s meant that we could push this onto much smaller and simpler infrastructure than otherwise.


I wish it would go away completely. Most orgs could get away with docker-compose.


Situation A:

  We've built our infrastructure on top of docker-compose on a single node server
  We've grown enough so that we need more servers and now k8s seems like a good fit
  Uh oh, we need to rewrite everything or not use k8s and reinvent the wheel
Situation B:

  We've deployed a single node k8s cluster and built our infrastructure on top of it
  We've grown enough so that we need more servers
  Good, just need to migrate to the new cluster, no need to rewrite anything
> I wish it would go away completely

Ain't gonna happen, because Docker Compose is only about running containers, there is no:

  - secret management
  - network policies
  - container scheduling with node affinity etc...
  - storage management
  - extensibility with other k8s operators (hello cert-manager, kubedb, kubevault, tekton, kubirds, etc...)
  - ...
I love how HN hates k8s for no valid reason.


I wouldn't go so far as "no valid reason", but yeah, it is always fun to come on HN and see how everyone using k8s is just doing "resume-driven development" or is a mindless automaton manipulated by "marketing." I always find it amusing because a lot of these takes seem like they are at least 5 years out of date. If you are on any one of the major cloud vendors, setting up a k8s cluster is almost trivial at this point. And that comes with a fully-managed control plane and a vibrant ecosystem of drop-in solutions for common deployment patterns.

Do you always need k8s? Of course not, but it really is getting to the point where all but the most trivial of deployments could benefit from running on k8s. Are there other ways to do it? Sure, but the complexity of using k8s has been sanded down through widespread adoption and really nice tooling.


I'll preface this with: I don't hate K8s. I see it as a solution and something to be used when the situation calls for it. I've used it quite a bit, from directly interacting with it using Go and its Go modules, to using its REST API, via kubectl, and via those that provide front-ends for it (Rancher).

I see it as all about use-case, staffing, and what kind of instance you are using. Cloud-based? A lot more freedom in being able to just deploy a cluster and not worry about much. On-prem? Well, dang, now you've signed up for updating those nodes yourself - hope you are a systems admin as well and have the time to perform said updates.

Situation A - I'd like to introduce you to Docker Swarm.

Secret management - sure K8s can do it, but I'm almost certain your organization, as well as mine, has adopted a vault of some sort outside K8s for end users and services to use. Now you just have to write your applications to use that.

Network policies - while K8s can do this, I don't know if this is the strongest argument to use K8s or just a nice cherry on top. It feels like just a shift in responsibility or adding more granularity from your network security team.

When does the situation call for it (in my opinion)? Your company has embraced the methodology, has the staffing for it, and you have more than a handful of apps on more than a handful of nodes.


> I love how HN hates k8s for no valid reason.

There's lots of valid reasons to not use K8s. I don't hate it. What I strongly disagree with is its use-by-default (not unlike how software is now Microservices-by-default.)

K8s offers lots of functionality (that most folks don't require) in exchange for operational complexity. If you're Google, Microsoft, or a huge organization that can truly afford a DevOps team, then more power to you. But for most small/medium sized businesses they're simply not getting the ROI for the resources they're putting into it. We don't bat an eyelash at those costs because "that's just how you deploy apps in the cloud" today. Same goes for Microservices. Monoliths aren't perfect, but at some places that blindly pray at the altar of Microservices wind up building distributed monoliths, adding complexity and friction that is simply not worth their cost.

Unpopular opinion: K8s and Microservices have hurt more than helped our industry at companies where scaling is not a huge factor (99% of companies.)


Most orgs could get away with systemd.


We use both, but I much prefer docker-compose, because systemd scatters deployment files in system directories, which makes them a bit of a pain to find and backup. Not sure if we're doing it wrong.

docker-compose is just one folder per service, much more practical.


The equation gets a bit different the further away you get from the spending bonanza of silly valley


Will this thing let me provision a database yet?


Postgres operators are mature now. This is something you install on top of Kubernetes, and it allows you to declare a fully replicated HA cluster with backup and restore in a few lines of YAML. It's actually quite cool how easy it is to create/destroy those clusters.

Same for Kafka, mongodb, redis, ...
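
For a sense of how little YAML that is, here's a rough sketch with one such operator (the Zalando postgres-operator; names, sizes and version are illustrative, and the operator itself has to be installed first):

  kubectl apply -f - <<'EOF'
  apiVersion: "acid.zalan.do/v1"
  kind: postgresql
  metadata:
    name: acid-minimal-cluster
  spec:
    teamId: "acid"
    numberOfInstances: 2   # primary + replica, failover handled by the operator
    volume:
      size: 5Gi
    postgresql:
      version: "13"
  EOF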



