Falling for Kubernetes (freeman.vc)
164 points by icyfox on Aug 9, 2022 | 184 comments



Anyone managing a k8s cluster who is fatigued with memorizing and reciting kubectl commands should definitely take a look at k9s[0]. It provides a curses-like interface for managing k8s, which makes it really easy to operate and dive into issues when debugging. You can move from grabbing logs for a pod, to being at a terminal on the container, and then back out to looking at or editing the YAML for the resource definition in only a few key presses.

[0] https://k9scli.io/
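If it helps anyone get started, a minimal sketch (the package manager, context and namespace names here are just examples; k9s also ships plain release binaries):

    brew install k9s                           # or grab a binary from the releases page
    k9s --context my-cluster -n my-namespace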


I've used k9s every day for the last 6 months and it's really superior to everything if you have any vi-fu. It even plays nice with the terminal emulator's colour scheme. It's simply an all-around pleasant experience in a way no dashboard is.


I like Lens, as more of a GUI fan and very occasional k8s-er. It has saved me a lot of time.


Lens has been bought by another company, the same one that bought Docker, and they are not playing nice with the community.

Some people have forked it to remove the newly added trackers, the forced login, and the remote execution of unknown code, but I sadly guess that it will become a corporate nightmare and the forks will not be popular enough to really take over the development.


Which fork do you recommend?


There are a few OpenLens forks and one LibreLens, and so far they only remove the newly added controversial features.



That’s the one I use but the fork owner does not seem interested in maintaining the whole project. His fork is mostly a patch.


Owner here. I have stated the current situation and why I'm not making a full fork at the moment.

https://github.com/lensapp/lens/issues/5444#issuecomment-120...

Currently resolving other important issues like binary signing.


For those who use emacs, I'd also recommend the very nice `kubel` plugin - an emacs UI for K8S, based on Magit.


I had to look up k9s because I wondered what you meant by "curses-like interface" - it couldn't be where my mind went: "f*ck_u_k8s -doWhatIwant -notWhatISaid"

And upon lookup I was transported back to my youth of ASCII interfaces.



K9s made my learning of k8s way, way, way easier. I still use it every single day and I absolutely adore it. The terminal user interface was so absolutely fantastic that it genuinely sparked my motivation to build more TUIs myself.


Do you pronounce it 'canines' or 'K-9-S'?


The logo is a dog, and everybody I know calls it "canines".


I use k9s every day, love it. Only problem is that the log viewing interface is buggy and slower than kubectl logs. Still love it though.


The larger the log buffer is, the slower k9s gets, unfortunately. For me the built-in log viewer is useful for going back short time periods, or for pods with low log traffic.

You can work around it by using a plugin to invoke Stern or another tool when dealing with pods with high log traffic.
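For anyone who hasn't used Stern, a hedged sketch of the kind of invocation such a plugin might wrap (the pod query, namespace and durations are placeholders):

    # tail recent logs across all pods whose name matches the query
    stern my-app -n production --since 15m --tail 200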


Can’t vouch for k9s enough, it’s great and I think it helped me to gain a much better understanding of the resource/api system.


k9s is by far the most productive tool in my toolshed for managing & operating k8s clusters.


Has anyone on AWS gotten k9s to work with Awsume [0] authentication? I miss using it but I can't auth to different AWS accounts and access my EKS clusters with it unfortunately.

[0] https://awsu.me/

edit: I figured it out! You need to use autoawsume which is triggered by awsume $profile -a


I've been using and recommending k9s to everyone and it just works. I love it and use it enough that I'm a sponsor.

It's an amazing project by a solo dev, please consider sponsoring. My guess is anyone using kubernetes can afford to get their org to shell out $10 for it.

(I'm not affiliated with k9s in any way except as a happy user)


It's been a smash hit at work. There is a bit of a learning curve, but nothing compared to kubectl.

    :        change "resource" types (and a few other miscellaneous things)
    /        filter
    shift-?  see more commands


It's really better than any GUI I've used. I like it better than Weave Scope, Lens, and Rancher.


And oh-my-zsh's kubectl plugin, to abbreviate more commands; it also lets you add the current cluster to your prompt and gives a convenient alias to switch between clusters (kcuc <config context>).


Works even better with fzf (aka "what's this subcommand called again ?").


Lots of people complain about Kubernetes complexity but I have found it is as complex as you make it. If you are running some simple workloads then once you have the pipeline setup, there is almost no maintenance required.

When people complain and then start talking about super complex configuration, bespoke networking functionality and helm charts that have "too many options" then surely that just means you don't have the skills to use the system to that degree?

I could say that .Net is too complicated because it has MSIL and library binding sequences involving versions and public keys, and the fact that you cannot always link e.g. netfx with netstandard, but these are either just things you need to learn, or things that you can't use until you do learn them.

It's like someone complaining that a Ferrari is too complicated because you can't work out how to change the cylinder head when most people will just drive it.


Where people collide and disagree about complexity depends on their roles.

If you're a consumer, then yes, it's as complex as you make it. If you keep it super simple you may lose out on some features, but that's a reasonable trade-off.

If you're the person responsible for running and maintaining the Kubernetes cluster, then you're kinda out of luck. It's honestly not that bad to install; you can do that in an afternoon. Where I find Kubernetes to be exceedingly complex is in debuggability. I'm not sure there's any way around that; it's basically a software-defined datacenter, with all the complexity that brings... For some of us it's a software-defined datacenter on top of an actual datacenter, just to make things worse.

When I read about a company that just spins up a new Kubernetes cluster because it's quicker than debugging the existing one, I get concerned. For running payloads, absolutely: just use the subset of the features you're comfortable with and build from there. Still, I'd argue that most of us will never have problems large enough or complex enough that Kubernetes is a hard requirement.


This is a bit like people being apologetic for PHP. Sure, technically, it is possible to write good PHP code. It doesn't have to turn into a spaghetti.

I have several issues with Kubernetes that superficially look like I'm just avoiding the complexity, but I've dealt with systems that are much more complex with ease.

1. In most orgs and environments, it's a cloud-on-a-cloud. A few years ago I used to joke with people that the virtualisation in the cloud is 7 layers deep and no human can understand or troubleshoot it any longer. Adding something like Docker adds 1-2 layers. Kubernetes doubles the layers. Everything you do with the underlying cloud is duplicated in Kubernetes, but incompatible. E.g.:

    Azure has:        Kubernetes has:

    Resource Groups   Namespaces
    Tags              Labels
    Disks             PVs & Container Images
    VMs               Nodes
    (various)         Pods
    Load balancers    Ingress
    NSGs & FWs        (various)
    Policy            Policies
    Key Vault         etcd
    ARM Templates     Helm charts
    JSON APIs         gRPC APIs
    Azure AD          Pluggable auth
    Azure Metrics     Prometheus
    Log Analytics     (various)
    PowerShell        cli tool
These interact in weird and wonderful ways. Azure NATs all traffic, and then Kubernetes NATs it again by default. There's "security" at every layer, but just to be irritating, all Kubernetes traffic comes from unpredictable IPs in a single Subnet, making firewalling a nightmare. You're running cloud VMs already, but Windows containers run in nested VMs by default on Kubernetes. Kubernetes has its own internal DNS service for crying out loud!

2. Trying to do everything means being less than optimal for everyone. There are four distinct ways of managing a cluster, and they're not compatible: you can run imperative commands, upload Helm charts, sync the cluster with an entire folder of stuff, or use a plugin like Flux to do GitOps (sketched below). But if different people in a large team mix these styles, it causes a giant mess. (To be fair, this is an issue with all of the major public cloud providers also.)
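To make those four styles concrete, a rough sketch (chart paths, repo names and the Flux invocation are placeholders, not a recommendation):

    # 1. imperative commands
    kubectl create deployment web --image=nginx:1.25
    # 2. packaged charts
    helm install web ./charts/web -f values-prod.yaml
    # 3. syncing a folder of manifests
    kubectl apply -f ./manifests/
    # 4. GitOps via a controller such as Flux
    flux bootstrap github --owner=my-org --repository=fleet --path=clusters/prod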

3. Google-isms everywhere. Every aspect of Kubernetes uses their internal shorthand. I'm not an ex-Googler. Nobody at any of my customers is. Nobody around here "speaks this dialect", because we're 12,000 kilometres from Silicon Valley. I'm sure this is not deliberate, but there are definite "cliques" with distinct cultures in the IT world. As FAANG employees flush with cash jump ship to start projects and startups like Kubernetes, they take their culture with them. In my opinion, mixing these together at random into a larger enterprise architecture is generally a mistake.

4. Kubernetes is not much better than bare VMs for developers, especially when compared with something like Azure App Service. The latter will "hold your hand for you" and has dozens of slick diagnostic tools integrated with it. On Kubernetes, if you want to do something as simple as capture crash dumps when there's a memory leak detected, you have to set this up yourself.

5. Microservices and RPC-oriented by default. Sure: you're not forced to implement this pattern, but it's a very steep slippery slope with arrows pointing downhill. In my experience, this is unnecessary complexity 99.99% of the time. Just last week I had to talk a developer out of adding Kubernetes to a trivial web application. Notice that I said "a" developer? Yes, a solo developer was keen on adopting this fad "just because". He was seriously planning on splitting individual REST endpoints out into containerised microservices. He's the third solo developer I've had to talk out of adopting Kubernetes this year.

6. Startup culture. Kubernetes shipped too early in my opinion. Really basic things are still being worked out, and it is already littered with deprecation warnings in the documentation. It's the type of product that should have been fleshed out a bit better at one or two large early adopter customers, and only then released to the general public. But its authors had a lot of competition (Docker Swarm, etc...) so they felt a lot of pressure to ship an MVP and iterate fast. That's fine I suppose, for them, but as an end-user I have to deal with a lot of churn and breakage. A case-in-point is that the configuration file formats are so inadequate that they have spawned a little ecosystem of config-file-generator-generators. I don't even know how deep those layers go these days. (Again, to be fair, Azure now has Bicep -> ARM as a standard transpilation step.)

7. Weak security by default because containers aren't security boundaries as far as Linus Torvalds or Microsoft Security Response Center are concerned. Everyone I've ever talked to about Kubernetes in the wild assumes the opposite, that it's magically more secure than hypervisors.

I get the purpose of Kubernetes in the same way that I get the purpose of something like Haskell, coq, or hand-rolled cryptography. They all have their legitimate uses, and can be useful in surprising ways. Should they be the norm for typical development teams? I'm not convinced.

Maybe one day Kubernetes v3 will be mature, stable, and useful for a wider audience. Especially once the underlying cloud is removed and there are smaller providers offering "pure Kubernetes clouds" where node pools are bare metal and there's no NAT and there isn't an external load balancer in front of the Kubernetes load balancer to make things extra spicy when diagnosing performance issues late at night across multiple incompatible metrics collector systems...


k8s is deceptively simple (or is that deceptively complex?). Anyway, what I mean is that spinning up a basic cluster isn't hard. Maintaining a cluster on premises while following every existing infosec and ops guideline is. It's not that you can't do this, it's just a very non-trivial amount of work.


This is what infra/ops people are there for. I think a lot of the problems here are devs with no ops background having to maintain these platforms. It’s understandably daunting for those in this position.


> once you have the pipeline setup, there is almost no maintenance required.

You could apply this to a traditional deployment: once you set up all the CI/CD, there's no maintenance required.

But the non-Kubernetes option would probably be cheaper.


Depends on many details, and how you use k8s.

I mainly use it to save money as I can pay for less cloud resources.


You’re paying for resource isolation. You can put many things on 1 server and it will be cheaper. We have done that for decades.


Managing that can quickly get cumbersome, especially when you deal with software that wrecks the easy mode of "just install mod-php from distro packages". Ensuring resource allocation was also much more annoying.

Letting the scheduler handle allocation of many services onto a few servers is how I got down to paying only 10% of the previous cost.


...or deploy your code on Google App Engine, Heroku, Elastic Beanstalk, Digital Ocean App Platform, Fly.io (etc etc) and spend all your time implementing user-facing features instead of maintaining infrastructure.

Yeah, I get it, compared to maintaining bare metal, k8s is amazing. But you're still wasting your time working on plumbing.


Google Cloud Functions for the win. Very reasonable pricing model too. Doing 55 requests per second, 24/7, and it is about $100 a month, including the managed Cloud SQL postgres instance.


That's what I thought as well, but now I do have some long-running jobs that exceed GCF's 60min limit. So I'm stuck with docker on Compute Engine, where GCP treats you like a 2nd class citizen as the OP found out.


I've worked on systems that did that and it was a huge huge mess, especially as the company grew. When jobs run that long, any failure means that they have to start over again and you lose all that time. Even worse, is that it stacks up. One ETL job leads into the next and it becomes a house of cards.

It is better to design things from the start to cut things up into smaller units and parallelize as much as possible. By doing that, you solve the problem I mention... as well as the problem you mention. Two birds.


you need to split those jobs into smaller ones that read their parameters from a queue. Then it will fit in serverless and also be more reliable


I'm not sure why this would be more reliable. It would probably fit, but at the cost of additional complexity.


When you split a job up into smaller ones, you have to design them to work in the face of retries and parallel execution. It's a bit of complexity, but the end result is a scalable and self-healing system that can handle live code updates, features which make the full workflow inherently reliable and scalable.

If you have a big >1h job you have to add locks, make sure deploys don't interrupt the job, handle retries of the whole job, maintain serverless + not serverless, and then inevitably rewrite the whole thing when it takes too long to be viable. All in all a lot of work and complexity as well that is wasted on making a bad design work.


60+ minute jobs are already complex.


And much harder to maintain and understand...


We're doing that with cloud functions, pubsub and pulumi; the infra code to set that up is trivial, and it is actually a lot easier to maintain since it's fully serverless and you get retries and parallelism 'for free'. With cron jobs on VMs the job itself might be a bit easier to code, but everything around it is a lot harder. (What happens if your 5h job crashes in the middle, who restarts it? How do you manage locks to prevent concurrent execution? How do you prevent that job from overloading the system? etc.)

Just to clarify, our setup is:

- 1 pubsub 'job' queue
- 1 cloud function, triggered by a scheduled event, that populates the job queue
- 1 idempotent cloud function to handle a job, triggered by events on the queue
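For the curious, a rough gcloud sketch of that shape (topic, job and function names, runtime and schedule are placeholders, and it's slightly simplified: here the scheduler publishes straight to the job topic instead of going through a separate fan-out function):

    # the job queue
    gcloud pubsub topics create jobs
    # scheduled trigger that enqueues work once an hour
    gcloud scheduler jobs create pubsub enqueue-jobs \
        --schedule="0 * * * *" --topic=jobs --message-body="{}"
    # idempotent worker, retried automatically on failure
    gcloud functions deploy handle-job \
        --runtime=python311 --trigger-topic=jobs --retry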


AWS Lambda > Google Functions!


I have a lot of experience with both clouds.

AWS < GCP


Would love to know why.

AWS supports more languages, faster deployments, faster execution, it's cheaper, and it lets you roll your own image.


Agree, especially at early stage you don't need to overcomplicate your infrastructure.


Amazon EKS + Fargate.

No bare metal to manage. Control plane complexity is abstracted away. With Fargate namespaces + profiles, there is no worker node configuration.

EKS will cost $90/m. Thereafter you only pay for whatever CPU/memory limits you assign to your deployments/pods.

Otherwise, why bare metal? For your basic needs, bare metal, self-managed control planes, etc. are definitely overcomplicating things.
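If you go that route, a minimal sketch with eksctl (cluster name and region are placeholders):

    # creates an EKS cluster with a Fargate profile for the default and
    # kube-system namespaces, so there are no worker nodes to manage
    eksctl create cluster --name demo --region eu-west-1 --fargate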


If you're abstracting away most of the complexity of k8s, why not just go use ECS and spend nothing on the cluster? You will probably have to do some rewriting of your deployment scripts when you move off of EKS anyway (just like when you move off of ECS), so you might as well use ECS and save the $90/m (and it's generally easier to use).


Using Fargate for long / permanently running workloads in EKS is only an option when the costs are none of your concern.


As opposed to?

Fargate is cheap, you can have permanently running workloads and still pay less than non-fargate options.


> This deployment needed to serve docker images on boot-up, which instance templates do support. However they don't support them via API, only in the web console or the CLI.

Not exactly true. I got around this limitation by sending a startup script[1] in the API metadata which basically just invokes `docker run ...`, and it works just fine. This allows spinning up/down container-based VMs via the API only, which is nice.

[1] https://cloud.google.com/compute/docs/instances/startup-scri...
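Roughly along these lines (the template, project and image names are made up for the example):

    gcloud compute instance-templates create my-worker-tpl \
        --machine-type=e2-small \
        --image-family=cos-stable --image-project=cos-cloud \
        --metadata=startup-script='#! /bin/bash
    docker run -d --restart=always gcr.io/my-project/my-image:latest'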


Yeah, you can do the same thing with any image that supports cloud-init or Ignition (e.g. Fedora CoreOS) and have the exact same setup deployed in almost any cloud.


> Keep it managed since that legitimately becomes a reliability headache.

This is the thing that I think will always give me pause. If I have to pay a third party to manage my cluster orchestration backplane, that seems like a pretty big piece of overhead.

Sure, I can do it myself, but then I have to deal with said reliability headache. It seems ironic that a cluster management framework -- that touts its ability to reliably keep your applications running -- has its own reliability issues.


This may not be a surprise to some, but when folks talk about reliability of the control plane, they usually think failure means their web service goes down. That's not true. If you shoot the Kubernetes control plane, the individual servers can't talk to it anymore - so they do nothing. Docker images that were running stay running. They even get restarted if they crash (via restartPolicy). Services that had specific other pods they were referencing continue referencing those pods. In net: everything except for kubectl and other Kubernetes internals keeps working.

That said, one piece that isn’t talked about frequently is the network overlay. Kubernetes virtualizes IPs (so each pod gets an IP), which is awesome to work with when it works. But if your overlay network goes down - god help you. DNS failures are the first to show up, but it’s all downhill from there. Most overlays take great care to degrade well, so they’re not tied to the control plane, but I have yet to find one that’s perfect. The overlay is the part of kube that truly isn’t failure tolerant in my experience.


> Kubernetes virtualizes IPs (so each pod gets an IP), which is awesome to work with when it works

Kubernetes does no such thing.

Weave Net, which is likely the most used CNI, does. There are other options however, and some of them use baremetal routers via bridging or even VLANs for example.

https://kubernetes.io/docs/concepts/cluster-administration/n...


The fact that each pod has an IP is a core assumption of Kubernetes. Sure, the CNIs are responsible for actually implementing this, but it is a required part of their contract to provide 1 unique IP per pod (or, more precisely, either 1 IPv4 or 1 IPv6 or both per virtual NIC per pod - to cover dual-stack support in 1.24+ and Multus).


That's probably true, but also irrelevant to the question of whether Kubernetes virtualizes IPs. But now that I'm rereading my comment: it does look as if I'm also talking about each pod having one IP. That was bad quoting/phrasing on my part, as I wasn't contesting that at all.

With flannel you could provision the IP through DHCP by bridging the network adapter of the pod to the physical interface to get an IP from a router appliance for example.

It's probably also possible to dedicate actual network adapters to the pod, but I've never attempted that... and that obviously wouldn't scale, as it's hardware.


Oh, you were focusing on the explicit notion of virtualizing IPs. I thought you were pointing out that Kubernetes itself is not the one generating the IPs, since it's the CNIs that do so, which are not built-in...

Either way, we are in agreement I believe. Kubernetes mandates for CNIs that they must allocate unique IPs, but they do so through a variety of mechanisms, sometimes even using external infrastructure.


Exactly. We are building these incredible open source tools... but they grow so complex that we need to pay others in order to use them effectively?

What would you say if you had to pay Google to use Golang in an effective way (because the language had become so complex that it's difficult to handle on your own)? Crazy.

I wanted to take a look at how to use K8s on my own cluster, and damn it, installing the whole thing is not that straightforward. So, now to keep my sanity I need to pay a cloud provider to use k8s! I guess that's the trick: build some open source monster that's very hard to install/maintain/use but has cool features. People will love it and they'll pay for it.


I think people are overestimating the difficulty of setting up k8s. It's not that hard. Grab Debian, install containerd.io, set one sysctl, load one kernel module and you're all set. Install a few more packages, run kubeadm init and that's all.
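A hedged sketch of those steps (assumes the containerd and Kubernetes apt repositories are already configured; the pod CIDR is a placeholder):

    # the one kernel module and the one sysctl
    sudo modprobe br_netfilter
    echo 'net.ipv4.ip_forward=1' | sudo tee /etc/sysctl.d/k8s.conf
    sudo sysctl --system
    # runtime plus the kubeadm toolchain
    sudo apt-get install -y containerd.io kubelet kubeadm kubectl
    # stand up the control plane
    sudo kubeadm init --pod-network-cidr=10.244.0.0/16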

The only thing that I've found truly hard is autoscaling VMs. Managed clouds have it easy; I haven't been able to pull it off so far. But it doesn't seem that hard either, I just need to write one particularly careful bash script, I just don't have time for that yet.

Google does have some secret sauce. For example, there's horizontal scaling, which spins up more pods as load grows. There's vertical scaling, which adjusts resources for individual pods. Those are generally incompatible with each other, so people adjust resources manually. GKE Autopilot has multi-dimensional scaling, which they didn't open source; it does both.
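The horizontal part, at least, is a one-liner (deployment name and thresholds are just examples):

    kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=70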

Maybe I just haven't been hit with something nasty yet. But so far I think those complaints about managing k8s are somewhat strange. Yes, if you manage thousands of VMs, it might become another full-time job, but that's a matter of scale. I manage a few VMs and it's not an issue.


Something that isn't appreciated enough is how reliability issues demolish your team's throughput. Got a random cluster-restart heisenbug taking customers offline? Good luck debugging it for the next 6 weeks. Alternately, ignore the problem while your engineers get woken up at night until they quit...

The progress of software has been towards managed offerings for a reason. A company can make it their entire business to own fixing these reliability issues for you and keeping them at bay. Do you really want to be in the business of managing a cloud on your cloud?


I don't fully understand; there are benefits to using a managed service in cases where the control plane is something you only interact with but don't manage. Not every ops team will have a CKA-certified administrator at hand to delve into etcd or the controller manager. Open a ticket and it's generally fixed.

Then there are situations where you want full control over the control plane itself. I've worked with companies that had clusters installed on bare metal in their stores in the UK. A CKA engineer is essential in this case, but that brings its own reliability headaches.


I don't disagree with you, but if you can reliably trade your dataplane outages for control plane outages, that's still usually a good tradeoff.


That's a really good point. Certainly you don't want either to go down, but outages are more tolerable if customer requests are still getting serviced regardless.


Vanilla k8s is pretty good. But once the 8 trillion vendors have you 'curl | helm'-ing, you end up with a knot of a system.

Keep it simple, use GitOps (ArgoCD is great), let k8s do what it's good at, managing workloads, not as a delivery mechanism for a vendor.

As an aside, the existence of the '{{ | indent 4 }}' function in helm should disqualify it from any serious use. Render, don't template.


> As an aside, the existence of the '{{ | indent 4 }}' function in helm should disqualify it from any serious use. Render, don't template.

This. My first thought when I saw the indentation hack was "it can't be serious, production-ready software".

My take on this is as follows.

If you have a simple use case, write your K8s manifests directly.

If you have a complex use case, Helm is often more pain than it's worth. Use alternatives, for example Jsonnet[0] with kubecfg[1]. Or emit manifests from your language of choice. Just don't use Helm.

[0]: https://jsonnet.org/ [1]: https://github.com/kubecfg/kubecfg


For a complex use case: cdk8s is cool https://cdk8s.io/


It is shocking that such a clearly bad design choice has stuck, and that Helm has become so popular in spite of it. I've had my eye on jsonnet, I'll have to try that next time I do something in k8s.


It makes sense if you consider how poor K8s' design already is. They somehow made it overcomplicated, yet lacking in critical features, with terrible security, no multitenancy, unnecessarily duplicated concepts, confusing and redundant configuration, a nightmarish setup process, no VCS integration, with mutable state, and lock-in reliance on specific external software components, while also having microservices, when the components were never really intended to be replaced, and misbehave when their assumptions change between versions. Then they invented and sunsetted multiple iterations of the same concept (plugins, essentially), redesigned other concepts (Roles) without consolidating them, and decided long-term support/stable ABIs are for chumps, so that anyone using it was stuck doing infinite migrations. It's up there with Jenkins and Terraform as one of the worst designed systems I've seen. The fact that you need a million integrations to make it do anything useful at scale is proof that it's more of a development toy than a practical tool.


You don’t like terraform? The version churn and things like the .lock file are pains in the ass, but overall it’s gotten pretty elegant and well documented.


Google "terraform wrapper". Terraform's interface is so bad that a million people have written their own interface to deal with how terrible Terraform's is. Most people settled on Terragrunt, but in some ways that's actually worse (DRY configuration is actually impossible with Terragrunt; I wrote my own wrapper (https://github.com/pwillis-els/terraformsh) to fix that problem).

Terraform's design is that of a half-baked configuration management program. It's not even declarative programming, because declarative programming requires the program actually try to achieve the desired end state. Terraform won't do that. It will try to create or destroy something based on what it thinks the world looks like, rather than what the world actually looks like. If anything is inconsistent, it simply dies, rather than try to fix or work around the inconsistency. Which is the opposite of what configuration management is supposed to do. It's supposed to fix the system state, not run away crying. In many ways, Puppet, Chef or even (ugh) Ansible would be better than Terraform at handling infrastructure state.

For example, if you tell Terraform you want an S3 bucket, and it tries to create one, but it already exists, Terraform dies. In that case they want you to manually use an 'import' command to import the existing S3 bucket (which btw is a different syntax for every single resource). Why doesn't it automatically import resources? Hashicorp's "philosophy" is that automation is bad and why would we want a computer to make a human's life easier when we can make the human serve the computer? This basic but critical feature is so important that Google created an entire project dedicated to it, called Terraformer.
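For readers who haven't hit this, the manual step being described looks roughly like this (the resource and bucket names are placeholders):

    # Terraform refuses to adopt the existing bucket on its own;
    # you have to hand it the mapping explicitly
    terraform import aws_s3_bucket.assets my-existing-bucket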

If you write some Terraform HCL, you may be able to create your resources with it. But it's also very likely Terraform will not be able to change or destroy those same resources. Every provider has its own rules about what you can do with a resource, and none of those providers are required to apply those rules throughout the resource's lifecycle. So you deploy something to prod, and then need to change it, only to find out that change is impossible, half way through a terraform apply in production. Leaving you with broken production infrastructure. Many provider resources also have rules that require you to add a Terraform lifecycle policy, or it will be impossible to create, destroy, rename, etc a resource. Rather than make those lifecycle policies the default for a given resource, Terraform requires you to know in advance that that resource requires that lifecycle policy and to add it to every instance of that resource. But it will cheerfully create resources without the policy, with a big gaping hole in the floor waiting for your infra to fall through when you next make a change.

This is a very brief list of the insane stupidity that is Terraform. A blindfolded configuration management tool that can't handle changes and doesn't like automation, validation, or sane defaults.


Any popular software has wrappers written around it, regardless of actual quality.


> Terraform's design is that of a half-baked configuration management program. It's not even declarative programming, because declarative programming requires the program actually try to achieve the desired end state.

terraform is fully declarative; if I want a thing of the XYZ kind, I ask terraform «give me a thing of the XYZ kind and I don't care about how you will get it for me». Declarative programming operates with intentions without focusing on details of how the intention will actually be expressed; «give me XYZ» is an example of such an intention.

Declarative programming and the «desired end state» are orthogonal, and the desired end state may or may not even be the goal of a declarative program. For example, expert systems are all declarative, and they don't deal with «worlds» and can be «entered into» from «anywhere». And, since their «worlds» keep on constantly changing by introducing or retracting facts, at best it is only possible to think about a snapshot of such a world at a given moment in time.

> Terraform won't do that. It will try to create or destroy something based on what it thinks the world looks like, rather than what the world actually looks like.

Can you unpack that? terraform keeps the current state of «world» as a dependency graph in a state file and compares it with intentions expressed in .tf file(s) that under the hood translate into one or more new or changed graph nodes. If there are differences in two dependency graphs, it modifies the current in-memory dependency graph, applies changes to the infrastructure and incrementally updates the state file to reflect the dependency graph update progression in the infrastructure.

terraform is not a sentient being, therefore it can't solve philosophical problems nor can it reason about anything that goes beyond its state file. It is, essentially, a «make» for the infrastructure management with each provider supplying built-in build rules.

> For example, if you tell Terraform you want an S3 bucket, and it tries to create one, but it already exists, Terraform dies. In that case they want you to manually use an 'import' command to import the existing S3 bucket (which btw is a different syntax for every single resource). Why doesn't it automatically import resources?

Why? Because it is a feature.

If an S3 bucket already exists, it may mean one of the following:

  1. It is already managed by another terraform project;
  2. It has another owner who might have created it using a different tool that manages the S3 bucket lifecycle differently;
  3. A coding mistake, e.g. a .tf file has been copied and pasted into;
  4. Something else.
If terraform were allowed to auto-import existing resources, case (1) would become a ticking time bomb until the S3 bucket got updated by another run of the terraform project that also manages it; case (2) would likely result in a homicide, with the rightful S3 bucket owner chasing you around with a hammer; case (3) is a wake-up call to not rampantly copy and paste .tf files; case (4) can be anything else.


We are using helm at work but I've never touched it, so I cannot comment on it being bad. However, cargo-cult habits have made quite a few technologies take off over the years when they never should have, just because someone with the right sort of intelligence and/or charisma made people believe they were good, and then the ripple effect did the rest.


I had someone on here assume we were a small dev shop because we don't use microservices the other day. I don't think they were expecting 2k devs to be what I responded with. Apparently monoliths scale a lot bigger than you'd think.


I like helm because it's like a package manager. I don't know what's inside rpm or dpkg; I suspect they're terrible inside, either way. But the fact that I can apt install gimp makes it awesome.

Same about helm. I haven't written a single Helm chart yet. But I like packages that I can helm install, and I'd probably avoid software without first-class Helm support. Helm is good for users: I can install it, upgrade it, configure it with values. Awesome.
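That user-side flow, roughly (the chart is just an example; any repository works the same way):

    helm repo add bitnami https://charts.bitnami.com/bitnami
    helm install my-redis bitnami/redis -f values.yaml
    helm upgrade my-redis bitnami/redis -f values.yaml
    helm rollback my-redis 1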


I judiciously delete any and all helm I can find, and fight text-based templating where possible :/

Especially since k8s doesn't even use "text" for the format; it just happens that JSON is the serialization. Use data structures and serialize them, dammit :/


I'm fairly convinced that helm in production is an anti-pattern, with you instead having all K8s manifests checked into your Git repository and CI/CD to handle any changes.

Helm just has too much auto-magic and removes one of k8s' best features: git-blame / git-diff for infrastructure.


I don't understand why this isn't the prevailing philosophy. I'm over here Terraforming helm deployments and having no idea what is actually happening under the covers.


We have our CI run a helm --dry-run --debug before the deploy step, so we can see the exact kubernetes resources that are going to be applied. You can even save that as an artifact so you can diff it later.


There's a Helm plugin (https://github.com/databus23/helm-diff) that shows diff results for you, for example:

    helm diff upgrade <release> <repo> --namespace <namespace>


Helm still saves you an incredible amount of work when setting up all the third-party services.

If your team is big enough you can just write your own configs, but that takes a lot of time and often quite a bit of knowledge about the relevant service.

Rollbacks and resource cleanup are not to be underestimated either, if you are not getting that from other tools like Argo.

Note: You can still use the Git-centric approach by generating configuration via `helm template`.
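i.e. something like this (release and chart names are placeholders):

    # render once, commit the plain manifests, let CI/CD apply them
    helm template my-release bitnami/redis -f values.yaml > manifests/redis.yaml
    git add manifests/redis.yaml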


I'm convinced that the only reason Helm saves you time with 3rd-party services is that 3rd-party services only provide a Helm chart, or don't provide any means of deployment to k8s at all.

I've been using kustomize to deploy some things with ArgoCD and it's so much easier. Now I'm trying to never use helm for 3rd-party stuff.

However, for your internal things, Helm is hard to replace. It's easy to start a chart that is capable of deploying most of your internal services, but maintaining it is a nightmare.

Actually, using helm, as in `helm` the binary, directly sounds bonkers to me and I wouldn't wish this upon anyone.


I use Helm only for installing "supporting applications": the Elasticsearch operator, JupyterHub, etc. Our normal deployments are standard K8s configs. These apps use Helm because a lot of the settings are complicated, co-dependent, etc.

Absolutely would not write helm charts from scratch for normal deployments, and if I got these apps in a better format than helm, I’d probably drop it immediately.


It’s the fact that they are complicated which warrants manual configuration. If the org can’t write the configuration, how will they support it when something goes wrong? It’s a problem waiting to happen.


I can't edit this anymore - but to anyone reading, I ought to specify that I do _not_ use Helm directly, only via Flux, our CD tool. It ablates a lot of the issues of dealing with helm charts.


> not as a delivery mechanism for a vendor

Amen. I got turned off from k8s following a tutorial that used Helm. I ran it and watched mysteries and line noise scroll past and walked away for a year. I thought "no, I will never put this squirming mass of tapeworms in front of anyone."

Then I took up with k3s and got underway.


> As an aside, the existence of the '{{ | indent 4 }}' function in helm should disqualify it from any serious use. Render, don't template.

Yeah, I think helm will be the death of Kubernetes. Some other workload management tool will come out, it will have a computer-science-inspired templating system (think Lisp macros, not C #defines) that is also easy to use, and the increased maintainability will be a breath of fresh air to people in helm hell and they'll switch over, even if the workload management fundamentals are worse.

It is a shame that ksonnet was abandoned. jsonnet is a very pleasant programming language for problems of similar scope. I think that people have to see some adoption or they give up; so Helm stays motivated to continue existing despite the fact that the design is fundamentally flawed, while alternatives give up when nobody uses them. If you're looking for a lesson in "worse is better", look no farther than helm. Easy things are hard, but everything is possible. That turns out to be important.

I also liked kustomize a lot, but it definitely sacrificed features for security and a clean overall design posture. I don't really know what it's missing, but it must be missing something. (I use kustomize for everything, but it seems rare in the industry. And I don't use it for complicated things like "make a namespace and app deployment for every PR"; I think to manage things like that it's missing features people want. I say just run your development branch on your workstation and commit it when it works. Running the app you write shouldn't be rocket science, and so a computer program that configures it shouldn't be necessary. The industry definitely disagrees with me there, though.)

One of the biggest problems I have with helm is that it's not easy to make local modifications to charts. That needs to be a first class feature, and something everyone knows how to use. As a vendor who ships a helm chart, I feel like almost all of my work these days is hearing from users "our k8s security team requires that every manifest set field X", and then I have to go add that for them and release it. If kustomize were the default, they'd just add that themselves instead of asking. But hey, opportunity to provide good customer service, I guess. (Luckily, most of the requests are compatible with each other. At one point we ran as root because that's the default, a customer required running as a user, and now everyone can run a locked down version; nobody NEEDED to run as root. So these requests generally improve the product, but it's a little tedious to make them ask.)


kustomize supports using a helm chart as a resource

https://kubectl.docs.kubernetes.io/references/kustomize/kust...


Can you please elaborate on “render, don’t template”?


With templating, you treat the YAML as text, with a YAML-unaware templating engine (like Golang's text/template). You need to make sure that the end result is valid YAML.

With rendering, you use something that is aware of yaml, and you feed it data and it outputs valid yaml.
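A tiny illustration of the difference (a sketch only; yq is shown as just one of many YAML-aware options):

    # templating: the engine has no idea this is YAML, so indentation is on you,
    # e.g. helm's  {{ .Values.config | toYaml | indent 2 }}

    # rendering: a YAML-aware tool builds the structure and indentation for you
    yq -n '.data.key = "value" | .data.nested.a = 1'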


I don't understand this comment. How else are you going to deploy pieces of k8s infra into k8s if not with Helm and Helm charts? Sure, you can use Argo to deploy and sync Helm charts into k8s, but... you're still going to be using Helm (even if indirectly via Argo), and you will inevitably need to template things that need to be dynamically configured at render time.


I don't use templates for manifests and avoid them like the plague.

I use my preferred language to emit manifests and built tooling around it. It does not template; instead it generates the manifest by transforming a data structure (hash) into JSON. I can then use whatever language feature or library I need to generate that data structure. This is much easier to work with than trying to edit template files.

I don't need to save the output of these because when I use the tooling, it generates and feeds that directly into kubectl. There's also a diff tool that works most of the time that lets me see what I am about to change before changing it.

In fact, I ended up adding a wrapper for Helm so that I can have all the various --set and values, source chart, chart repo, chart version pinning all versioned in git, and use the same tooling to apply those with the flags to install things idempotently turned on by default. It sets that up and calls helm instead of kubectl.

That tooling I wrote is open-source. You never heard of it because (1) I don't have the time and energy to document it or promote it, and (2) it only makes sense for teams that use that particular language. Helm is language-agnostic.

EDIT: and reading this thread, someone mentioned Kustomize. If that had been around in '16, I might not have written my own tool. It looks like it also treats YAML manifests as something to be transformed rather than templates to be rendered.
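Not the commenter's tool, but a minimal sketch of the same idea using jq: build the manifest as a data structure and pipe it straight to kubectl (names and image are placeholders):

    jq -n --arg name web --arg image nginx:1.25 '{
      apiVersion: "apps/v1",
      kind: "Deployment",
      metadata: {name: $name},
      spec: {
        replicas: 2,
        selector: {matchLabels: {app: $name}},
        template: {
          metadata: {labels: {app: $name}},
          spec: {containers: [{name: $name, image: $image}]}
        }
      }
    }' | kubectl apply -f -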


Just kubectl apply the manifests. You can even use kubectl -k for the Kustomize configuration engine that can more or less replace most of what helm does today.


So what, I'm going to have a big Makefile or something with a bunch of kubectl applies? For each environment too? What if one of my dependencies (cert-manager for example) doesn't support directly applying via kubectl but have to be rendered with Helm? How do I manage versions of these dependencies too?

For better or for worse, Helm is the de facto standard for deploying into k8s. Kustomizing toy apps or simple things may work, but I have yet to see a large production stack use anything but Helm.


You can prerender helm charts into plain old manifests and apply those. How you want to handle applying them is up to you; even helm doesn't recommend or want people to run it as a service that auto-applies charts anymore. Most folks check manifests into a git repo and set up automation to apply the state of the repo to their clusters; there are tons of tools to do this for you if you want.

Definitely check out kustomize, it's not a toy and it can easily handle one main manifest with specializations for each unique deployment environment. It's a very nice model of overriding and patching config, instead of some insane monster template and YAML generation like helm.
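A sketch of that override model (paths, image name and tag are placeholders):

    # overlays/prod/kustomization.yaml layered on top of a shared base
    cat > overlays/prod/kustomization.yaml <<'EOF'
    resources:
      - ../../base
    patches:
      - path: replica-count.yaml
    images:
      - name: myapp
        newTag: "1.4.2"
    EOF
    kubectl apply -k overlays/prod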


I've worked 4 years for a client where everything was either plain K8s YAML or Kustomizations, mostly the latter.

Large clusters, about 80 service teams and many hundreds of apps.

We (the platform team) managed roughly 30-40 services or controllers, including cert-manager, our own CRD for app deployments, ElasticSearch, Victoria Metrics, Grafana and others.

It was (still is, only I'm sadly not there anymore!) a high performing team and organisation with a lot going on.

We reviewed the decision to not use Helm many times and every time it was pretty much unanimous that we had a better solution than the opaque and awkward templating.


"kubectl apply -f https://github.com/cert-manager/cert-manager/releases/downlo..."

Taken straight from their docs.


I'm not using Helm for deploying applications, though I use it for vendor manifests. It's not a small production stack, nor is it a toy app.

I'm not using Makefile either.

Helm is a kind of least-common-denominator software that's better than doing nothing, but the template paradigm leaves a lot to be desired. Its main advantage is being language-agnostic.


Why would you need a Makefile? You have to run helm to apply helm charts, how is `kubectl apply -f .` any more complicated then that?

The entire existence of helm is superfluous. The features it provides are already part of Kubernetes. It was created by (and for) people who understand the package manager metaphor but don't understand how Kubernetes fundamentally works.

The metaphor is wrong! You are not installing applications on some new kind of OS. Using helm is like injecting untracked application code into a running application.

At best helm is just adding unnecessary complexity by re-framing existing features as features that helm adds.

In reality helm's obfuscation leads to an impenetrable mess of black boxes, that explodes the cost and complexity of managing a k8s cluster.

First off if you are packaging and/or publishing apps using helm charts, stop it!

There is a purpose to the standardization of the Kubernetes configuration language. Just publish the damn configuration with a bit of documentation... You know, just like every other open source library! You're building the configuration for the helm chart anyway, so just publish that. It's a lot less work than creating stupid helm charts that serve no purpose but to obfuscate.

Here are your new no-helm instructions: We've stopped using helm to deploy our app. To use our recommended deployment, clone this repo of yaml configs. Copy these files into your kubernetes config repo, change any options you want (see inline comments). Apply with `kubectl apply -f .`, or let your continuous deployment system deploy it on commit.

What have you lost?


The only thing helm provides is awkward templating, IME. Ideally you'd never use a text template library to manipulate YAML or JSON structured objects. Instead you'd have scripts that generate and commit whole YAML files, or you'd just update the YAML manually (less optimal), and then you'd write those to the k8s API directly or through a purpose-built tool.

(Or, hell, helm but with no templating whatsoever).


> How else are you going to deploy pieces of k8s infra into k8s if not with Helm?

kustomize? raw manifests? "sed s/MY_PER_DEPLOYMENT_VALUE/whatever/ < manifest.yaml | kubectl apply -f -"? "jsonnet whatever.jsonnet | kubectl apply -f -"?

But yeah, a lot of people think Helm is mandatory and ask for it by name, and if you don't provide a chart they won't use your thing.


You can use envsubst instead of sed
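E.g. (the variable and file names are just illustrative):

    export IMAGE_TAG=1.4.2
    envsubst '$IMAGE_TAG' < deployment.yaml.tpl | kubectl apply -f -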


You don't like text templating, so you're suggesting text templating, right?


I mean, I like text templating. Was just pointing out a way to do what the commenter was saying easier.


Yeah I think it's awfully unfair to say "you can't post the name of a relevant tool unless you want to go on record as LOVING TEXT TEMPLATING". I thought your reply was useful.


> How else are you going to deploy pieces of k8s infra into k8s if not with Helm and Helm Charts?

kubectl apply -f $MANIFESTS

> you're still going to be using Helm (if not indirectly via Argo) and you will inevitably need to template things that need to be dynamically configured at render-time.

Use Kustomize for dynamic vars and keep it at a minimum. Templating is the root of all evil.

Helm mostly adds unnecessary complexity and obscurity. Sure, it's faster to deploy supporting services with it, but how often do you actually need to do that anyway? The time you initially gain by using Helm might cost you an order of magnitude more time in maintenance later on, because you've created a situation where the underlying mechanics are both hidden from and unknown to you.


> kubectl apply -f $MANIFESTS

How do you configure it? Like, you're installing a new version: do you go over the manifests and edit them by hand over and over, every update? Do you maintain some sed scripts?

helm is awesome because it separates configuration.


I actually used to sed configurations; now I use yq[0] whenever I need to programmatically edit YAML/JSON. It has far fewer side effects.
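For instance (file and image are placeholders):

    # bump an image tag in place, without treating the YAML as text
    yq -i '.spec.template.spec.containers[0].image = "myapp:1.4.2"' deployment.yaml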

But for Kubernetes manifests specifically, the right tool for the job is Kustomize[1] (which in that case does what Helm does for you, keeping dynamic variables separate). It ships with kubectl and I'm a big believer in using default tools when possible.

> Like you're installing new version, do you go over manifests and edit those by hand over and over every update?

I check the patch notes, diff the configuration files to see if anything new popped up, make the required changes if necessary, bump the version number and deploy.

It sounds laborious but it's really not that much work most of the time, and more importantly it forces you to have a good sense of what everything does and how it works. Plus it allows you to have a completely transparent, readable environment. Both are important for keeping things running in a production environment. Otherwise you might find yourself debugging incomprehensible systems you've never paid attention to in the middle of the night, with 2 hours left before traffic starts coming in.

[0]: https://github.com/mikefarah/yq

[1]: https://kustomize.io


Well, that makes sense, but it sounds like I would spend much more time than necessary for many packages. If that were my full-time job, I guess it would work, but I just need to install something and move on to other things. Some packages install thousands of lines of YAML; I guess I would need more than one day to grok it. Installing it with helm is easy and it usually works.

Something simple like GitLab probably would be impossible to understand for a simple person.


Yeah, I might be thinking of it as simpler than it really is, just out of habit. Though it has to be said that most Kubernetes resources are pretty pedestrian; with habit you know where to look for the meaningful bits (much like any other form of source code).



So instead of Go Templates, I'm going to use Dhall? Why? I'd be losing interop with the entire Helm ecosystem too, so there go dependencies and such to maintain my IaC.

That blog post doesn't alleviate any issues one might have using the traditional Go Templates + Helm.


https://www.pulumi.com/docs/get-started/kubernetes/

Gives you a TypeScript API for generating resources. The difference is that these templates are semantic (via the type system), not just syntactic as in text templates. Relatedly, they also factor and compose better because of functions and imports.


It’s probably best you avoid using Kubernetes in production for a long while. At the very least until you understand why your comment is being so heavily downvoted.


I’ve been using Kubernetes in production for over 4 years. I think I fully understand what’s going on, it’s just I have a different opinion than those downvoting me.


There are a few dozen ways to work with k8s that don't involve helm. I understand that hasn't been your experience, but to deny they exist shows a lack of it.


I’m not denying they exist, I’m arguing they are either inferior or do not fully substitute functionality.


It's true they do less, but that's mostly a good thing. I never use helm, just scripts and kustomize.

The one thing it misses is a way to easily delete resources.


I really think this is why most people dislike and misunderstand the value of kube. If you raw dog it, it’s pretty amazing what you can build. It’s not very hard to roll your own Heroku (tailored to your workflows and workloads) with kube, if you shy away from all the noise of helm and other vendors as you say.


We are currently building a Database-as-a-service platform (tidbcloud.com) using Kubernetes. I have to say it is a love-and-hate story.

On the bright side, k8s is almost the only option of an abstraction layer on top of different Clouds, for a complex system with tens of components. Database is more than masters and workers, there are so many components you need to take care of. For example, we may need monitoring agents, certificate managers, health checkers, admin proxies, etc. Without k8s, you have to be the owner of a kindergarten.

On the other side, k8s IS complicated. It's like an unexplored garden: people just enter it and try to use whatever they see, and cause all kinds of problems. What we ran into:

* Trying to apply the operator pattern to everything; debugging it is really painful and the learning curve is steep.
* Small services still cost a lot. VPA is not mature enough and many tiny services may be better off on Lambda.
* k8s is not really designed for small per-tenant clusters. Managing a fleet of clusters is no easy job, but it is something SaaS companies have to deal with.


> We are currently building a Database-as-a-service platform (tidbcloud.com) using Kubernetes.

I worked for a company that did exactly that (database-as-a-service on k8s); no one in the entire company knew how to run a cluster from scratch. This is a real problem if your developers want to do crazy bizarre advanced stuff like run tests because no one knows how anything fits with anything. At least, I thought it was a real problem as it wrecked any semblance of productivity, but no one else seemed to mind much and thought "it kind-of works on the CI on a good day if the planetary alignments are good" was fine. But hey, VC money so free to piss it all away on a small army of devs being wildly unproductive shrug

Also, the reliability of it all was nothing short of embarrassing, and debugging issues was hard because it was all pretty darn complex. Any technology is fantastic if it works, but I think that's the wrong metric. You need to look at how well things will go for you when it doesn't work – for any reason – because you only set your infrastructure up once, but you will "read" (debug) them countless times.

I hope it works out better for you, but I always felt that the advantages that k8s gave us could have been implemented from scratch significantly better by a few devs working on it for a few months. The other day I spent about an hour writing a shell script (and then spent a bit of time fixing a bug a few days later) to automate a ~15 minute task. k8s kinda felt like that.


It actually works better for us. The system is definitely complex, but we still have some ways to debug and develop locally. For example:

* You can test different parts of the system individually, via APIs.

* With k8s operator model, it's more like a white-box debugging experience. It is not ideal but do-able.

* You can have the rest of the system running remotely, but only your own piece locally. As long as the piece can have access to k8s api server, it just works.

The best thing k8s offers is repeatability. Scripts are so fragile once the system becomes more complicated. (with monitoring, management agents, etc.) And the product is a distributed database, which itself has so many running parts...


> no one in the entire company knew how to run a cluster from scratch

How is that even possible? I've worked for 2 dbaas companies that both used k8s and standing up a cluster with a control plane and data plane was as simple as a bash/terraform script. The only thing that was pretty annoying, as I recall, was cert manager because debugging it if you didn't configure it properly was painful, but once it worked it was great.

I mean even the non-dbaas companies I worked at didn't have that issue. It sounds like you would have had that problem even if you _didn't_ use k8s.


It's actually been an issue everywhere I've seen k8s deployed. Part of the problem isn't k8s per se, but rather "microservices" and the fact that no one really has a good view of how everything ties in with everything else, so running a complete product from scratch becomes really hard, and IMO k8s adds to this confusion and complexity.

No doubt there would have been issues if k8s hadn't been used, but at least that's (often) easier to understand, so it's 'only' "a mess I can understand and fix" rather than "a mess and wtf, I have no idea".


Probably the next closest is just plain VMs (and potentially backplane/management layer running on k8s or whatever)

But yeah... Even then each cloud has quirks with Kubernetes and there's still quite a few resources just to stand up a cluster. Kubernetes can partially solve the initial provisioning but you generally need the cluster running with nodes before you can use something like CAPI or Crossplane (so you still need Terraform or Pulumi or scripts or whatever)

Having worked with a similar system, shared tenancy with tenant per namespace is just as bad but in a different way (if you use the classic operator pattern with 1 operator per cluster, you potentially have a massive blast radius). Then there's security...


One operator per cluster is not ideal, since most clusters are "stable" and don't need much care. Having plenty of them would be a headache.

An operator crash on our side does sound scary. But for a DBaaS, as long as the blast radius doesn't touch the data plane, it's manageable.


From the Google Trends data[0] and the historical data of the Kubernetes repo on GitHub[1], K8s has crossed the chasm to become the dominant choice for infrastructure. Whether that is the result of several companies working together or of developers' own choices, I think K8s will remain mainstream until there are big changes in the infrastructure world.

[0] https://trends.google.com/trends/explore?date=2014-06-01%202... [1] https://ossinsight.io/analyze/kubernetes/kubernetes


Seems like everyone is forgetting about PaaS, and I don't understand why...

For many use-cases it's going to be much simpler and cheaper than managed k8s.

There's no more lock-in with Cloud Run than with GKE. (The actual lock-in comes with proprietary databases and the like.)

edit: Missed the GPU part, which might make the OP's project the exception to the rule.

People also forget about auto-scaling groups of VMs, such as Managed Instance Groups in GCP: https://cloud.google.com/compute/docs/instance-groups/


"Azure is the only major provider that still has free control panels"

Oracle Cloud Infrastructure does as well. Perhaps it does not yet qualify as major... It's major to Oracle, that's for sure.


This is one of the pet peeves of HN submitters and readers :)

Sure, here's my two cents FWIW: Kubernetes is complex for some folks but not for others. So the answer is, it depends: on a lot of external factors beyond just the technical capabilities of the team.

Kubernetes solves many non-trivial challenges, but it's not a silver bullet. I could certainly learn from the Mercedes platform/infra team's "story from the trenches" (they reportedly run 900+ k8s clusters in production :)


The fact that there is a think piece every day either extolling or cursing Kubernetes is a key indicator this bit of tech has some serious teething issues.


I still believe most companies are better off just deploying containers on VMs in autoscaling groups with an LB in front and some kind of terraform + ansible | CI-pipeline deployment to manage it.

Consider all the complexity you have to buy into with k8s: you have to update it frequently, update all the charts, fix all the breaking things in a fast-moving ecosystem, and still deal with the hard parts of dynamic certs, storage, and networking. And you would still need to dedicate "normal" VMs to host databases, monitoring, and storage, because if your cluster goes down, so does your management layer.

I have been with k8s since 1.3, and it is so disproportionate that I will not touch it most of the time.


For very bare-minimum playing around, I use k3s on Linode (1 shared CPU, 1 GB RAM, 25 GB disk for a master and another for a worker node) at $10/month.


This sounded like an article from 2017.


What's changed since?


> Bare metal is always my first choice both for past projects and startups.

Is this a new Tech Hipster thing? Like writing a letter with a typewriter rather than an ink pen or computer/printer? "You don't understand, man; a virtual machine is, like, not authentic, man."


There are benefits to using bare metal, but generally only for situations where you really need them, e.g. if you want to run hardware-accelerated apps, run your own virtualization/VMM (like Firecracker), etc.

So yes, for the most part it is a hipster thing.


Surely hardware acceleration can be exposed via Kubernetes?


It can, and you can use taints, tolerations, and affinities to steer those workloads onto the right nodes.
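
For illustration, a pod that needs a GPU only takes a few extra lines. This is a minimal sketch, assuming the NVIDIA device plugin is installed and that GPU nodes carry a hypothetical gpu=true:NoSchedule taint plus an accelerator=nvidia label (names and image are just examples):

  apiVersion: v1
  kind: Pod
  metadata:
    name: cuda-smoke-test            # hypothetical name
  spec:
    tolerations:
      - key: "gpu"                   # assumes GPU nodes are tainted gpu=true:NoSchedule
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
    nodeSelector:
      accelerator: nvidia            # assumes GPU nodes carry this label
    containers:
      - name: cuda
        image: nvidia/cuda:11.8.0-base-ubuntu22.04   # example CUDA base image
        command: ["nvidia-smi"]
        resources:
          limits:
            nvidia.com/gpu: 1        # GPU resource exposed by the NVIDIA device plugin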


Bare metal can still have containers, I assumed it just meant on their own hardware?


Nope, you can follow the link where they describe the stack and they're very proud of the exclusion of containerization


OP here. To clarify, the blog doesn't use containers simply because it didn't really call for them. Ended up creating a simpler CI pipeline to just push updated static files directly. For everything else that I work on, I'm 100% in on containers. Just those containers run mainly on bare metal and, newly, on k8s.


Ah! I'm all for owning hardware but docker is the best tech from the last couple years IMO for development.


Probably.

If you need to run bare metal, you know it, because it is a money question.

The big reasons that make it make sense include performance, storage, connectivity, and security. If you're doing interesting things on any of the first three, the economics of owning your own hardware tend to make a lot of sense. The fourth is usually less a technical than a liability concern, and tends to be externally imposed. But sometimes firms are really just that secretive.


I think all the major cloud providers have bare metal offerings that work similarly to VMs. On the other hand, I think they're all running accelerator hardware to offload the VM overhead, so the performance is basically the same.


My problem with Kubernetes is my problem with front-end web frameworks - they introduce too much complexity to the point of being esoteric for simple systems.

If you have a simple website built on boring technologies like HTML, CSS, and vanilla JS, then nearly anyone can read, understand, and make changes to it, even backend developers. If you instead wrote it in React/Webpack/etc. then suddenly only frontend experts can understand and contribute and debug.

Same with k8s. If you make a cloud backend using boring technologies like plain old programs calling APIs then nearly anyone can read, understand, and make changes to it. But if you instead make it use a big pile of configuration files with a million switches then suddenly only k8s experts can understand and contribute and debug.

I'm not saying don't use ReactJS or Kubernetes, I'm just saying make sure that the benefits you get from switching to it outweigh the new complexity and therefore expertise required to understand and debug it.


In my opinion, the complexity is something that's entirely your own making.

k8s can be as simple or as complex as you like it to be and really isn't any more complex than setting up a VM (provided you are using cloud-provided k8s).

Want to run an application? Write a deployment file. Want to expose that application to other applications? Write a service file. Want to expose the application to people outside the k8s cluster? Write an ingress. Want some configuration? Write a config file.

With that, you've got the basics of k8s. None of this would be made easier without k8s. If you had just a VM, well, now you need to figure out how to get your deployable in the image. If you want to expose your app to others, you need to configure the routing. If you want it private, you need to configure the VPC correctly. If you want to configure the app, now you need to figure out how to set that up.

After the initial learning curve, k8s ends up being, frankly, simple. What I've listed here is like 90% of what any dev needs to learn or use in k8s. Everything else can be learned on the fly based on specific requirements.
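
To make that concrete, here's a rough sketch of the deployment + service pair described above; the names, registry, and port are placeholders, not anything from a real setup:

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: my-app                     # hypothetical app name
  spec:
    replicas: 2
    selector:
      matchLabels:
        app: my-app
    template:
      metadata:
        labels:
          app: my-app
      spec:
        containers:
          - name: my-app
            image: registry.example.com/my-app:1.0   # placeholder image
            ports:
              - containerPort: 8080
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: my-app
  spec:
    selector:
      app: my-app
    ports:
      - port: 80
        targetPort: 8080

An ingress and a config map follow the same pattern: one more small YAML document each.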


> (provided, you are using cloud provided k8s).

Well, there you cheated ;)

K8s is easy if:

- you pay for cloud provided k8s

- you don't deviate from the happy paths

- you have someone who's going to fix your cluster when you deviate from the happy path

Infrastructure is hard. K8s is a black box wrapping such complexity, but the moment something goes wrong with, let's say, DNS, well, you have to know about DNS to fix the problem (and on top of that you have to know how to fix the problem the k8s way)


> you pay for cloud provided k8s

From what I've seen, cloud-provided k8s isn't much more expensive than the VMs for the nodes themselves. Amazon's EKS, for example, is $0.10 per hour per cluster on top of the EC2 instance price for the nodes you select.
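
At that rate a cluster's control plane works out to roughly $0.10 x 730 hours ≈ $73/month, which is usually small next to the node bill.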

So why wouldn't you do that?

> you don't deviate from the happy paths

Agreed, but if you are deviating from happy paths, it is, IMO, valid to question if k8s is even the right choice. For example, I think k8s is a bad choice if you want to host stateful infrastructure like a DB.

Very large swaths of infrastructure are covered by stateless services.

> you have someone who's going to fix your cluster when you deviate from the happy path

This is pretty much axiomatic of any infrastructure. If someone screws up your vm, you need someone who's going to fix your vm when you deviate from the parameters given for allocating vms.

> Infrastructure is hard.

Agreed

> K8s is a black box wrapping such complexity, but the moment something goes wrong with, let's say, DNS, well, you have to know about DNS to fix the problem (and on top of that you have to know how to fix the problem the k8s way)

When DNS goes wrong anywhere, you have to know how to fix DNS the envoy way, the consul way, the coredns way, the pihole way, the ubuntu way, the freebsd way.

The happy path for k8s is really happy. The sad path is no worse than the sad path of any other deployment infrastructure. The default bells and whistles of k8s solve way more problems than are created by needing to learn a new system (for example, liveness and readiness checks, or cert handling).
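
To illustrate the probes point, it's only a few lines per container. A minimal sketch, with a made-up image and health endpoint:

  apiVersion: v1
  kind: Pod
  metadata:
    name: probe-example              # hypothetical
  spec:
    containers:
      - name: api
        image: registry.example.com/api:1.0   # placeholder image
        ports:
          - containerPort: 8080
        readinessProbe:              # gate traffic until the app reports ready
          httpGet:
            path: /healthz           # assumed health endpoint
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:               # restart the container if it stops responding
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 30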

But not only that, because k8s is becoming (rightfully) so popular, it means that issues you run into are almost always a google away. There are a ton of resources and patterns that are globally applicable to pretty much any k8s cluster.


> k8s is a bad choice if you want to host stateful infrastructure like a DB

You cheated again ;) Now we're only running the easy stateless stuff on k8s?

I thought k8s gives us simple abstractions that make running stateful services a breeze.


Why is that cheating?

I honestly don’t see why focusing on the strengths of a tech and avoiding the weaknesses is considered “cheating”. The whole point of my initial post was “k8s can be as simple or as complex as you want to make it”.

It’s not “cheating” to focus on the parts of k8s that make it simple. It’s not “cheating” to use a tech for its strengths and not its weaknesses.

You could technically use (and some people do use) redis as your primary data store. Is it “cheating” to suggest that’s not a good idea? Is it a bad idea to pull it into your infrastructure even though it’s not great at permanent storage?

K8s biggest value add is stateless applications. I’ve never suggested otherwise. You can get in the weeds of managing volumes and state if you like, but that adds a fair bit of complexity that, IMO, isn’t worth it vs something like a managed db instance from your cloud provider.

I honestly don’t get the “you are cheating by using and benefiting from the simplest features and setup of k8s like you said others should do!”


Maybe "cheating" isn't a good way to put it. It just seems like you're narrowing the scope of what k8s is often used for, and then calling it simple and easy. It'd be like someone saying: "<language> can be as simple or complex as you want it do be: it's actually really easy to write hello world. You should steer clear for anything more complex than that". Well sure, but that's not a great way to convey it's overall complexity.

But I actually agree with your underlying point. I don't see the benefit or running a db on k8s either. Seems like a lot of extra complexity for little to no benefit.


> It just seems like you're narrowing the scope of what k8s is often used for, and then calling it simple and easy.

I mean, to me, that's what it means to simplify.

Even in this post, I see people saying "Hey, it's not actually hard to run postgres in k8s" which, ok, maybe? IDK, seemed like not a great idea to me with the complexity of volume mounts and drivers along with adding extra networking in order to hook it all together.

> <language> can be as simple or complex as you want it to be: it's actually really easy to write hello world. You should steer clear of anything more complex than that

To some extent, that's exactly what I'm saying. I'd argue you can make things more featureful than "hello world" with stateless pods and you can get a lot of benefit out of those. Just like I'd tell a new Java dev "Hey, java is simple, so long as you don't use features like the LambdaMetaFactory, Reflection, or the annotation processor. Some people use those, but IMO they are mostly off limits"

Just because something HAS complex features doesn't mean you should use them (or that using them is a great idea)

> But I actually agree with your underlying point. I don't see the benefit or running a db on k8s either. Seems like a lot of extra complexity for little to no benefit.

And who knows, maybe this is my (our?) inexperience with k8s shining through. Maybe running postgres on k8s is actually highly beneficial. I've just not been convinced that's better than doing something like amazon's RDS or Aurora. The headache of setting up ceph or rook or whatever just seems like a pretty high burden for a pretty low payback.

To me, it's features like deployment rollouts, common logging infrastructure, autoscaling, certificate management/distribution, and recovery that make k8s attractive, and they all come mostly out of the box.


Fair enough. I suppose if more people precisely qualified how & why they use k8s while advocating for it, I'd probably have a better overall view of it rather than assuming it's the blessed solution for every use case it targets.


I'm running postgres inside k8s. It's very easy. I don't know why people think that k8s is for stateless.


I used to think of stateful applications on top of k8s as an anti-pattern until I learned a bit more about some database use cases and changed my entire opinion about what k8s is.

If you're coming from the container world, it makes sense to think of orchestration as a way to scale stateless services.


I'd be glad to hear those usecases. I'm still in the camp of thinking it's not a great idea.

I have a hard time thinking coordination between something like postgres and the volume wouldn't be an absolute nightmare to handle. What sorts of volume drivers do you use there? How do you get over the network latency? How does scaling end up working?


I've not used postgres-on-k8s in any real production workload so my opinion on the matter isn't worth glozing about. I'd rather recommend the blog posts and docs fly.io did on it[1]. They themselves rely on Stolon[2].

AFAIK the secret sauce is postgres' ability to do streaming replication[3], i.e. pass data across instances as it comes in. Add some glue and you've got a cluster.

As far as storage goes, k8s supports topology awareness[4] and enables colocating pods and PVs. There are CSI drivers for most file systems you'd encounter in the enterprise[5][6].
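
A minimal sketch of the topology-aware part; the AWS EBS CSI driver here is just an example provisioner, and the zones are made up. The key bit is volumeBindingMode: WaitForFirstConsumer, which delays PV binding until the pod is scheduled so the volume lands in the same zone as the pod:

  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: topology-aware-ssd         # hypothetical name
  provisioner: ebs.csi.aws.com       # example CSI driver; any topology-aware CSI works
  volumeBindingMode: WaitForFirstConsumer   # bind the PV only after the pod is scheduled
  allowedTopologies:
    - matchLabelExpressions:
        - key: topology.kubernetes.io/zone
          values: ["us-east-1a", "us-east-1b"]   # example zones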

> How do you get over the network latency?

Ultimately there are incompressible limits you'll hit (CAP theorem and all that). So you do what everybody does in computing : you cheat by exploiting human perception and treating your reads and writes differently[7].

I'm by no means an expert on the subject, but I hope I've answered some of your questions.

---

[1]: https://fly.io/blog/globally-distributed-postgres/

[2]: https://github.com/sorintlab/stolon

[3]: https://wiki.postgresql.org/wiki/Streaming_Replication

[4]: https://kubernetes.io/blog/2018/10/11/topology-aware-volume-...

[5]: https://kubernetes.io/blog/2019/01/15/container-storage-inte...

[6]: https://kubernetes-csi.github.io/docs/drivers.html

[7]: https://fly.io/blog/globally-distributed-postgres/


You can run the database on a local volume and not deal with the network at all if you don't like it. You can dedicate a node to your postgres instance. Or you can start simple until you hit a performance wall and then gradually implement those things. Kubernetes is pretty flexible in that regard.


> - you pay for cloud provided k8s

At this point, k8s is a cloud abstraction layer that is flexible enough that what's beneath it doesn't have to be a cloud, but in the vast majority of cases is a cloud. The various managed k8s flavors sit in between bare VMs (you control infrastructure) and any serverless offerings (what's infrastructure?), abstracting away a bunch of hard things while still offering a fair amount of flexibility and control, so you can layer other abstractions on top to make it look and function like serverless functions or a batch pipeline runner to your users or automate operations, governance and compliance tasks.

The primary value (at least as I see it) that k8s adds lies in its API and the built-in automation it layers on top of the cloud (or non-cloud) below, which I find is a surprisingly good fit for a very wide range of needs. That's what many orgs that run k8s on bare metal are really after: Providing infrastructure users with a single well-known, high-quality API and making use of the wealth of compatible tooling and third-party products written against that API.

Once you get to the point where you have a use for a fair amount of k8s features, I've found that home-grown setups tend to approximate a low-quality k8s clone anyway, so running k8s may end up the less painful route in some cases, but bare metal k8s clusters are a lot more rare than cloud-managed ones for a reason. Simpler setups can of course be run with less effort, and in a lot of cases, that's plenty good enough. A single VM with a monolithic app can go a long way, depending on the requirements.


> something goes wrong with, let's say, DNS

I ran into exactly this issue with cert-manager obtaining a LetsEncrypt certificate for our cluster. (The ACME URL was failing inside the pod: no route to host.) It took me quite a long time just to figure out it was a DNS-related issue, because K8s buries the technology behind so many layers. How can I make a simple edit to /etc/hosts? Who knows? Oh, and you are not supposed to do it that way anyway. The solution ended up being forcing cert-manager to run on a node with no DNS issues, using such Byzantine terms as "tolerations". It's that bizarre vocabulary where they are obviously trying really hard to abstract something away from you that you would just rather be able to get your hands on. It reminds me of other absurdly complex technologies from the Java world, like J2EE and whatnot.


> How can I make a simple edit to /etc/hosts? Who knows?

In case somebody else needs to do this, you actually can:

https://kubernetes.io/docs/tasks/network/customize-hosts-fil...

But as you said, you are "not supposed to". I'd say that if you need to do that, most of the time there's an underlying issue that needs to be fixed instead, and editing the hosts file should be considered a workaround.
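
For anyone landing here, that doc boils down to the hostAliases field on the pod spec. A minimal sketch, with placeholder IP and hostname:

  apiVersion: v1
  kind: Pod
  metadata:
    name: hosts-example              # hypothetical
  spec:
    hostAliases:                     # kubelet appends these entries to the pod's /etc/hosts
      - ip: "203.0.113.10"           # placeholder IP
        hostnames:
          - "acme.example.org"       # placeholder hostname
    containers:
      - name: app
        image: busybox:1.36
        command: ["sleep", "3600"]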


K8s very much isn’t a black box: it’s observable on every level and has no internal APIs.


How do you make Kubernetes as simple as you'd like it to be? Even trying to deploy simple apps is several times more complicated than any other similar system I've used.


It depends on your requirements. For a basic development setup, you just need a modern computer with Docker installed, k3d (which runs in Docker containers), and Tilt to build and deploy your Docker image to the local k3d registry, plus a little k8s configuration.

You can stick to the basic concepts: if all you need is a container that exposes your application, you can prepare a Dockerfile, a Tiltfile, and a k8s YAML file with a deployment and a service (LB), and let Tilt automatically apply it to the cluster whenever you make source code changes. Need to run some script once or periodically? A single k8s YAML file with a Job/CronJob might be enough.
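
For the periodic-script case, a CronJob really can be that small. A sketch, with a placeholder image and schedule:

  apiVersion: batch/v1
  kind: CronJob
  metadata:
    name: nightly-cleanup            # hypothetical
  spec:
    schedule: "0 3 * * *"            # every day at 03:00
    jobTemplate:
      spec:
        template:
          spec:
            restartPolicy: OnFailure
            containers:
              - name: cleanup
                image: registry.example.com/cleanup:1.0   # placeholder image
                command: ["/bin/sh", "-c", "./cleanup.sh"]   # hypothetical script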

Often that's all that is needed, but maybe you need to additionally provide some env variables, use some secret values, create/mount a volume for your container, deploy supporting infrastructure (databases/message queues/authentication server/CI/CD/monitoring/tracing/alerting), or get into advanced use cases like chaos engineering, service mesh, serverless, or a workflow engine (CI/ML).

Kubernetes requires knowledge and experience to use all of its features and ecosystem, but you don't need to know it all at expert level from the start. However, you do need to actually learn Kubernetes if you want to scale your solution, because eventually more people/teams get involved, the need for cooperation increases, and non-functional requirements become more important; that's when you really start to appreciate Kubernetes.


Expired cert, but that is how simple it can be: https://k8syaml.com/

Fill out the web form and apply the YAML to the cluster via the CLI. It really is that simple for simple apps.


K8s provides a cloud-agnostic solution for (mostly stateless) configuration management.

You have to record somewhere which ports your applications listen on, which subdomains are used for what, how to access that storage bucket with the data files in it, etc. The moment you have more than a couple of individual services, managing this all by hand becomes a nightmare. You need a really strong culture of good practice to make sure that the exact production config actually ends up in source control.

You can be a k8s expert, or an expert in your own home-spun configuration management/git-ops workflow. But not having any workflow here just isn't an option in my opinion. You too easily end up with spaghetti infrastructure that everybody is too scared to touch. With k8s I can take my repo of yaml and change cloud providers in an afternoon.
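
As a small illustration (keys and values here are made up), that kind of record ends up as a ConfigMap in the same repo:

  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: my-app-config              # hypothetical
  data:
    LISTEN_PORT: "8080"
    PUBLIC_HOSTNAME: "api.example.com"
    DATA_BUCKET: "s3://example-data-bucket"

The deployment then pulls it in with envFrom/configMapRef, so the exact production config lives in source control rather than in someone's head.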


The problem is that Kubernetes doesn't include any notion of obvious things I'd like to do, like parameterizing my deployment - even though all the providers want to sell me Kubernetes clusters as a service.


Out of curiosity, what kind of things are you looking for here? “Parametrising what” I guess is the question.


Off the top of my head? Namespaces, base images, configMaps (this should be pretty obvious), monitoring.

Now all of this can be done, but Kubernetes does this thing where it talks extensively about applications and then goes "oh so anyway, um, like, I guess just put that stuff in git?"

Which you can, but with what separation? The way my configuration mounts into my applications is probably somewhat application-determined, but "where" stuff runs is infrastructure.

None of it matters because K8S doesn't have an answer for any of it: no standard for emitting state from clusters to allow self-configuration, no system to define axes of freedom in configuration shape, no well-defined system for, say, plugging your application's network requirements into a cluster's policy of what networking should be. No way to expose a set of permissions for reconfiguring my infrastructure to the cluster so RBAC can define a sensible ops model.

K8S is essentially a platform-for-platforms which is why it feels complicated: the moment you have it setup, you realize just how much is actually missing if you try to use it like it bills itself as.


The article makes a similar point in the section titled “A Dictionary Date”: there are a lot of new concepts and resource types to learn with Kubernetes that force you to think about your application in a different way. It’s pretty powerful and logical once you learn it, but it does take a different mindset.

Fortunately, the API is programmatic, so it’s possible to build abstractions on top to make dealing with those concepts easier. But that can also take a lot of work to build from scratch…


> My problem with Kubernetes is my problem with with front-end web frameworks - they introduce too much complexity to the point of being esoteric for simple systems.

Having seen all the ages of the web, I wouldn't support the attempt to shame frameworks like this. Fiddling with the DOM is slow & causes bugs; the quasi-immediate-mode version of the web we get so often now kicks ass.

But to the discussion at hand: the fact that you are comparing something as singular & without peer as Kubernetes to web frameworks, of which there are many, many, many, betrays the insolidity of your argument. Web frameworks have their own style & choices, many of which are indeed fairly arbitrary. Kubernetes is nothing more than a single control loop pattern, applied again & again & again. Its opinion is simpler than most critics dare to understand.

It's also a very very small pile of configuration to launch a webapp, typically. And this practice & understanding will scale not just "zero to one" but also "one to many". The time spent trying to avoid good useful tools & convince oneself to be selective & narrow in focus seems like a waste to me.


I'm a full-stack dev and I use React for the front-end, and for me React is not hard. Why? Hear me out. All this talk about k8s being hard is real, but not a bad thing. Why? Because our tech and our jobs are hard. Like you said, to a non-front-end person, React looks hard from their POV, just as k8s is hard for a back-end or front-end guy. And that's how you get so many articles about X and Y being so complex and hard, and 90% of the time they are written by people who never really used it and it's not even their job. But ask a front-end dev if he loves React and if it's hard for him? At the beginning maybe, but now? No. A back-end dev with Spring/Rails/Next.js... all the same: maybe at the beginning, but now? No. But try to cross the bridge without already being good at the other stack and without it being your daily job; it's always hard.

As a side example, I played fighter jet simulators (Falcon BMS and DCS) and then stopped playing them for a year. Before, I could fly and do a full cold start of the F-14 [1], the F-16, the KA-50, the SU-27 (ok, this one is super easy), and I was learning the A-10. Now I am looking at all these buttons and I am like: how was I able to do this?? Dang! But when I was into it, it was super easy. That's really the same here. You have the opposite of a honeymoon period. At first it's (super) hard, and then it gets easier and easier and you look like a wizard to others.

And it was the same for me back in the day when people were doing crazy stuff with Mesos. They looked like wizards to me. Then I dived into Mesos, Raft, Peloton, etc... and now it no longer looks so crazy and scary.

I am sure you are doing stuff that look so easy to you but are super hard to others who work on a different part of the stack.

[1] https://www.youtube.com/watch?v=i7oC_Qwq0-w (because it's so cool)


> If you have a simple website built on boring technologies like HTML, CSS, and vanilla JS, then nearly anyone can read, understand, and make changes to it, even backend developers

Ehhhh, or it could be a pile of shit and be unmaintainable. The upside of the ubiquity of react is that once you learn it well, dropping into a react based project is much simpler.

How simple of a website are you talking about? Something so simple not to need a framework could also be equally simple in React, maybe more so because of the ecosystem, dev tools, testing, etc.


To use a concrete example - Swagger UI 2.0 was plain old html, css, and JS. I was able to easily navigate the code and customize it for my previous company with some extra features we wanted. Swagger UI 3.0+ was completely rewritten in React, and I had a super hard time navigating the new source code and trying to figure out how to achieve the same basic things I did with 2.0 with some minor JS tweaks. I'm sure if I invested a few days/weeks becoming a React expert I would be able to do it. But I wasn't willing to do that, so as a result, we are still on Swagger UI 2.0.

Judge for yourself:

Swagger UI 2.0: https://github.com/swagger-api/swagger-ui/tree/2.x

vs

Swagger UI 4.0 https://github.com/swagger-api/swagger-ui


I finally got to look at the source code.

There's a lot going on, and it's pretty well organized into components and small files. The worst thing is the flat file structure.

Diving into a React codebase definitely requires some process and familiarity with dev tools, common patterns and React in general, but the upside is that the process is repeatable. Doesn't do you a lot of good for diving into a random codebase, but as a pro React dev, I can hit the ground running much faster than with a project of similar complexity.


If you're using docker, then your project is already complex enough to migrate to k8s. You can install kubernetes on a single node, without any clustering, just to get its API and run your workloads there. It'll cost you 2GB of RAM. It won't be HA, but it'll be much more sane than docker.

I'm saying it as a person who avoided k8s as much as I could, but in the end I loved it. Yes, it has some learning curve, but after that it's good.


See, I find the most compelling part of k8s being that it makes it a lot easier to integrate different systems. It provides a standard way to define your system/service, which allows easy interoperability.


Quite frankly, the only difference between a developer who can understand React and a developer who cannot: one read the doc and the other didn't.


> My problem with Kubernetes is my problem with with front-end web frameworks - they introduce too much complexity to the point of being esoteric for simple systems.

I used to think the exact same thing but came to the realization that the complexity doesn't come from the tools, it comes from the people who misuse them.

Having a set of basic config files for Kubernetes is very easy, there are a few concepts to understand but it's something anyone can learn in a matter of hours (images, containers/pods, deployments, services, volumes and ingress) and the power of abstracting your whole infrastructure in a bunch of reusable YAML lines is pretty damn cool.

The same goes for things like React, once you understand the basic semantic blocks (components, props, state and more recently effects) you're 80% of the way there.

Problems arise when the "I'm so smart and look what I can do!" kind of dev starts telling you that using basic boilerplate is not good enough and that you should specify all the optional values as well because "cOnTrOl", or that `create-react-app` is not good enough because "you don't really understand what happens behind the scene", but guess what? I don't care what happens behind the scenes, the same way I don't care what cc and ld do to compile and link my C code. `create-react-app` is more than enough for 99% of production grade apps.

Problems arise when frontend devs need (or feel the need) to become experts in the build systems (one of the reasons bun has become popular so quickly: a standardized, included build system) or to introduce unnecessarily verbose conventions.

There are two keywords in your comment that perfectly describe what I'm talking about: "Webpack" and "big pile of configuration files". Neither of those things are necessary when using K8s or React.

It's overzealous devs who think micro-optimizing 5 ms off of the build process is worth writing 15 more YAML files that contribute to this false sense of "complexity". But when you think about how much you need to write in K8s (for most SQL-DB-backed web APIs we're talking about 3 or 4 files of ~10 lines each) vs what you get in return (you can spin up a 1:1 copy of your entire infrastructure in a few seconds on any K8s-compatible cluster, without worrying about the underlying hardware, to an extent), the tradeoff is clearly in favor of using it.

As an example, I'm working on a project where, if I needed another worker, I could just add a few lines to the K8s configuration, commit, push, and magically have that worker up and running in the cluster in no time. Whereas traditionally you'd have to contact someone from the infra team, specify the characteristics of the machine and operating system, set it up, connect it to the network, etc. With K8s everything is handled in those few files. IMO it's better than magic.
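
For the curious, "a few lines" really is about this much; a sketch with a made-up worker image and entrypoint, following the same pattern as any other deployment:

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: queue-worker               # hypothetical worker
  spec:
    replicas: 3
    selector:
      matchLabels:
        app: queue-worker
    template:
      metadata:
        labels:
          app: queue-worker
      spec:
        containers:
          - name: worker
            image: registry.example.com/worker:1.0   # placeholder image
            command: ["python", "worker.py"]         # hypothetical entrypoint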


> If you have a simple website built on boring technologies like HTML, CSS, and vanilla JS, then nearly anyone can read, understand, and make changes to it, even backend developers. If you instead wrote it in React/Webpack/etc. then suddenly only frontend experts can understand and contribute and debug.

The problem with this thinking is you assume that the backend developers know HTML, CSS and vanilla JS, but don't make that assumption for React/Webpack/etc because then your argument wouldn't be valid.



