Maybe You Don't Need Kubernetes (2019) (endler.dev)
281 points by WolfOliver 8 months ago | 251 comments



Previous discussion: https://news.ycombinator.com/item?id=19467067 (315 comments).



All these "maybe you don't need X" posts die in an instant when the user already knows how to do X, i.e. when the learning-curve argument is gone.

Let's get it right:

Kubernetes is really, really cheap. I can run 20 low-volume apps in a kubes cluster with a single VM. This is cheaper than any other hosting solution in the cloud if you want the same level of stability and isolation. It's even cheaper when you need something like a Redis cache: if my cache goes down and the container needs to be spun up again, it's not a big issue, so for cheap projects I can save even more by running some infra, like a Redis instance, as a container too. Nothing beats that. It gets even better: I can run my services in different namespaces and have different environments (dev/staging/etc.) isolated from each other while still running on the same number of VMs. When you calculate the total cost saving here compared to traditional deployments, it's just ridiculously cheap.

Kubernetes makes deployments really easy. docker build + kubectl apply. That's literally it. A deployment is two commands and it's live, running in the cloud. It's elastic, it can scale, etc.
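As a sketch of that two-command flow (the image name, registry, and manifest path here are placeholders, not anything from a real project):

```shell
# Build and push an image, then apply the manifest that references it.
# Kubernetes reconciles the cluster toward the declared state.
docker build -t registry.example.com/myapp:v2 .
docker push registry.example.com/myapp:v2
kubectl apply -f deployment.yaml
```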

Kubernetes requires very little maintenance. Kubernetes takes care of itself. A container crashes? Kubes will bring it up. Do I want to roll out a new version? Kubes will do a rolling update on its own. I am running apps in kubes and for almost 2 years I haven't looked at my cluster or vms. They just run. Once every 6 months I log into my console and see that I can upgrade a few nodes. I just click ok and everything happens automatically with zero downtime.

I mean, yes, theoretically nothing needs Kubernetes; the internet was the same before we had Kubernetes, so it's certainly not "needed". But it makes life a lot easier. Especially for a cheap, lazy developer who doesn't want to spend any time on ops, Kubernetes is really the best option out there next to serverless.

If learning Kubernetes is the reason why it's "not needed" then nothing is needed. Why use a new programming language? Why use a new db technology? Why use anything except HTML 4 + PHP, right?

BTW, learning Kubernetes can be done in a few days.


All of this glosses over the biggest issue with Kubernetes: it's still ridiculously complex, and troubleshooting the issues that arise (and they will arise) can leave you struggling for days, poring over docs, code, GitHub issues, Stack Overflow... All of the positives you listed rely on super complex abstractions that can easily blow up without a clear answer as to "why".

Compared to something like scp and restarting services, I would personally not pay the Kubernetes tax unless I absolutely had to.


Exactly. A year or so ago I thought, hey, maybe I should redo my personal infrastructure using Kubernetes. Long story short, it was way too much of a pain in the ass.

As background, I've done time as a professional sysadmin. My current infrastructure is all Chef-based, with maybe a dozen custom cookbooks. But Chef felt kinda heavy and clunky, and the many VMs I had definitely seemed heavy compared with containerization. I thought switching to Kubernetes would be pretty straightforward.

Surprise! It was not. I moved the least complex thing I run, my home lighting daemon, to it; it's stateless and nothing connects to it, but it was still a struggle to get it up and running. Then I tried adding more stateful services and got bogged down in bugs, mysteries, and Kubernetes complexity. I set it aside, thinking I'd come back to it later when I had more time. That time never quite arrived, and a month or so ago my home lights stopped working. Why? I couldn't tell: a bunch of internal Kubernetes certificates had expired, so none of the commands worked. Eventually I just copy-pasted stuff out of Stack Overflow and randomly rebooted things until it started working again.

I'll happily look at it again when I have to do serious volume and can afford somebody to focus full-time on Kubernetes. But for anything small or casual, I'll be looking elsewhere.


At work we're building an entire service platform on top of managed kubernetes services, agnostic to cloud provider. We had already had bad experiences running K8s ourselves.

Going into it we knew how much of a PITA it would be but we vastly underestimated how much, IMO.

Would not do again -- I would quit first.


Fire and Motion https://www.joelonsoftware.com/2002/01/06/fire-and-motion/

Written 18 years ago, so obviously not about Kubernetes, but it does explain the same phenomenon. Replace Microsoft with cloud providers and it's more or less the same argument.


> Long story short, it was way too much of a pain in the ass.

Kubernetes has a model for how your infrastructure and services should behave. If you stray outside that model, then you'll be fighting k8s the entire way and it will be painful.

If, however, you design your services and infrastructure to be within that model, then k8s simplifies many things (related to deployment).
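For a sense of what "within the model" looks like in practice, here is a minimal hedged sketch of a stateless service described declaratively (the names, image, and port are all made up):

```yaml
# A service that fits the k8s model: stateless, replicated,
# health-checked, and fully described by this manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: registry.example.com/web:v1
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
```

Because everything about the service is in the manifest, k8s can replace any replica at any time; a service that instead depends on local disk state or on a specific node is the kind that fights the model.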

The biggest issue I have with k8s as a developer is that while it simplifies the devops side of things, it complicates the development/testing cycle by adding an extra layer of complication when things go wrong.


So it sounds like it doesn't "just run" after all.


I run my home automation and infrastructure on kubernetes, and for me that is one of the smoothest ways of doing it. I find it quite easy to deal with, and much prefer it to the “classic” way of doing it.


What dark magic are you using? (Not joking.) I've tried learning Kubernetes several times and given up. Maybe I'm not the smartest. Can you point to guides that helped you get up and running smoothly? This is probably something I should put more effort into in the coming months.


I think this is really hard; it's a bit like how we talk about learning Rails in the Ruby community: "Don't do it."

Not because it's bad or especially hard, but because there's so much to unpack, and it's so tempting to unpack it all at once, and there's so much foundational stuff (Ruby language) which you really ought to learn before you try to analyze in detail exactly how the system is built up.

I learned Kubernetes around v1.5 just before RBAC was enabled by default, and I resisted upgrading past 1.6 for a good long while (until about v1.12) because it was a feature I didn't need, and all the features after it appeared to be something else which I didn't need.

I used Deis Workflow as my on-ramp to Kubernetes, and now I am a maintainer of the follow-on fork, which is a platform that made great sense to me, as I was a Deis v1 PaaS user before it was rewritten on top of Kubernetes.

Since Deis left Workflow behind after they were acquired by Microsoft, I've been on Team Hephy, which is a group of volunteers that maintains the fork of Deis Workflow.

This was my on-ramp, and it looks very much like it did in 2017, but now we are adding support for Kubernetes v1.16+ which has stabilized many of the main APIs.

If you have a way to start a Kubernetes 1.15 or earlier cluster, I can recommend this as something to try[1]. The biggest hurdle of "how do I get my app online" is basically taken care of for you. Then, once you have an app running in a cluster, you can start to learn about the cluster and practice understanding the different failure modes, as well as how to proceed with development in your new life as a cluster admin.

If you'd rather not take on the heavyweight burden of maintaining a Workflow cluster and all of its components right out of the gate (and who could blame you) I would recommend you try Draft[2], the lightweight successor created by Deis/Azure to try to fill the void left behind.

Both solutions are based on the concept of buildpacks, though Hephy uses a combination of Dockerfiles and Heroku Buildpacks; by comparison, Draft has its own notion of a "Draftpack", which is basically a minimalistic Dockerfile tailored for whatever language or framework you are developing with.

I'm interested to hear if there are other responses; these are not really guides so much as "on-ramps" or training wheels, but I consider myself at least marginally competent, and this is how I got started myself.

[1]: https://web.teamhephy.com

[2]: https://github.com/azure/draft


Thank you, this is very helpful.


I hope so! Please drop by our Team Hephy Slack if there are any gaps or if you need help.


So, assuming that Kubernetes was installed using kubeadm, this was the document you needed to find:

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/...

Moreover, if you are keeping pace with kubeadm upgrades at all (minor releases are quarterly, and patches are more frequent), then as of the most recent minor release, Kubernetes 1.17, certificate renewal as an automated part of the upgrade process is enabled by default. You would have to do at least one cluster upgrade per year to avoid expired certs. tl;dr: this cert expiration thing isn't a problem anymore, but you do have to maintain your clusters.

(Unless you are using a managed k8s service, that is...)

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/...
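For clusters that did fall behind, the manual renewal path looks roughly like this; the `alpha` prefix applies to the kubeadm versions discussed here (newer releases promoted these subcommands to plain `kubeadm certs`), so adjust to your version:

```shell
# Run on the control-plane node.
kubeadm alpha certs check-expiration   # see which certs are near expiry
kubeadm alpha certs renew all          # renew every certificate at once
# Restart the control-plane components so they pick up the renewed
# certs, then confirm the API server answers again:
kubectl get nodes
```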

The fact also remains that this is the very first entry under "Administration with Kubeadm", so if you did use kubeadm and didn't find it, I'm going to have to guess that either the docs have improved since your experience, or you really weren't looking to administer anything at all.


I appreciate the links, but for my home stuff I'll be ripping Kubernetes out.

The notion that one has to keep pace with Kubernetes upgrades is exactly the kind of thing that works fine if you have a full-time professional on the job, and very poorly if it's a sideline for people trying to get actual productive work done.

Which is fine; not everything has to scale down. But it very strongly suggests that there's a minimum scale at which Kubernetes makes sense.


Or, that there is a minimum scale/experience gradient behind which you are better served by a decent managed Kubernetes, when you're not prepared to manage it yourself. Most cloud providers have done a fairly good job to make it affordable.

I think it's fair to say that the landscape of Kubernetes proper itself (the open source package) has already reached a more evolved state than the landscape of managed Kubernetes service providers, and that's potentially problematic, especially for newcomers. It's hard enough to pick between the myriad choices available; harder still when you must justify your choice to a hostile collaborator who doesn't agree with part or all.

IMO, the people who complain the loudest about the learning curve of Kubernetes are those who have spent a decade or more learning how to administer one or more distributions of Linux servers, who have made the transition from SysV init to systemd, and who in many cases are now neck-deep in highly specialized AWS services. In many cases they have used those services successfully to extricate themselves from the nightmare-scape where one team called "System Admins" is responsible for broadly everything that runs or can run on any Linux server (or otherwise): databases, vendor applications, monitoring systems, new service dev, platforming apps that were developed in-house, you name it...

I basically don't agree that there is a minimum scale for Kubernetes, and I'll assert confidently that declarative system state management is a good technology that is here to stay. But I respect your choice, and I understand that not everyone shares the unique experiences that led me to be comfortable using Kubernetes for everything from personal hobby projects to my own underground skunkworks at work.

In fact, it's a broadly interesting area of study for me, "how do devs/admins/people at large get into k8s", since the learning curve is so steep and this has all happened so fast. There is so much to unpack before one can start to feel comfortable that there isn't really that much more complexity buried beyond what you have already deeply explored and understood.


It sounds like we both agree there's a minimum scale for running your own Kubernetes setup, or you wouldn't be recommending managed Kubernetes.

But a managed Kubernetes approach only makes sense if you want all your stuff to run in that vendor's context. As I said, I started with home and personal projects. I'd be a fool to put my home lighting infrastructure or my other in-home services in somebody's cloud. And a number of my personal projects make better economic sense running on hardware I own. If there's a managed Kubernetes setup that will manage my various NUCs and my colocated physical server, I'm not aware of it.


> there's a minimum scale for running your own Kubernetes setup

I would say there is a minimum scale that makes sense, for control plane ownership, yes. Barring other strong reasons that you might opt to own and manage your own control plane like "it's for my home automation which should absolutely continue to function if the internet is down"...

I will concede you don't need K8s for this use case. Even if you like containers and want to use them, but don't have much prior experience with K8s, then from a starting position of "no knowledge" you will probably have a better time with Compose and Swarm. There is a lot for a newcomer to learn about K8s, but the more you have already learned, the less likely I would be to recommend using Swarm, or any other control plane (or anything else).

This is where the fact I mentioned, that the managed k8s ecosystem is not as evolved as it will likely soon become, is relevant. You may be right that no managed Kubernetes setup will handle your physical servers today, but I think the truth is somewhere between: they're coming / they're already here but most are not quite ready for production / they are here, but I don't know what to recommend strongly.

I'm leaning toward the latter (I think that if you wanted a good managed bare metal K8s, you could definitely find it.) I know some solutions that will manage bare metal nodes, but this is not a space I'm intimately familiar with.

The solutions that I do know of are in an early enough state of development that I hesitate to mention them. It won't be long before this gets much better. The bare metal Cluster API provider is really something, and there are some really amazing solutions being built on top of it. If you want to know where I think this is going, check this out:

WKS and the "firekube" demo, a GitOps approach to managing your cluster (yes, even for bare metal nodes)

https://github.com/weaveworks/wks-quickstart-firekube

I personally don't use this yet, I run kubeadm on a single bare metal node and don't worry about scaling, or the state of the host system, or if it should become corrupted by sysadmin error, or much else really. The abstraction of the Kubernetes API is extremely convenient when you don't have to learn it from scratch anymore, and doubly so if you don't have to worry about managing your cluster. One way to make sure you don't have to worry, is to practice disaster recovery until you get really good at it.

If my workloads are containerized, then I will have them in a git repo, and they are disposable (and I can be sure, as they are regularly disposed of as part of the lifecycle). Make tearing your cluster down and standing it back up a regular part of your maintenance cycles until you're ready to do it in an emergency situation with people watching. It's much easier than it sounds, and it's definitely easier than debugging configuration issues only to end up starting over anyway.
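Assuming a single kubeadm node and workloads kept as manifests in a git repo (the paths here are hypothetical), the tear-down-and-rebuild drill described above amounts to something like:

```shell
# Practice run: destroy and rebuild the whole cluster.
kubeadm reset -f                 # tear the control plane down
kubeadm init                     # stand it back up (fresh certs included)
export KUBECONFIG=/etc/kubernetes/admin.conf
# (re)install your CNI plugin of choice here, then redeploy everything
# straight out of version control:
kubectl apply -R -f ./manifests/
```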

The alternative that I would recommend for production right now, if you don't like any managed kubernetes, is to become familiar with the kubeadm manual. It's probably quicker to read it and study for CKA than it would be to canvas the entire landscape of managed providers for the right one.

I'm sure it was painful debugging that certificate issue; I have run up against that exact issue before myself. It was after a full year or more of never upgrading my cluster (shame on me): I had refused to learn RBAC, kept my version pinned at 1.5.2, and at some point, after running "kubeadm init" and "kubeadm reset" over and over again, it became stable enough (I stopped breaking it) that I didn't need to tear it down anymore, for a whole year.

And then a year later certs expired, and I could no longer issue any commands or queries to the control plane, just like yours.

Once I realized what was happening, I tried to renew the certs for a few minutes, I honestly didn't know enough to look up the certificate renewal docs, I couldn't figure out how to do it on my own... I still haven't read all the kubeadm docs. But I knew I had practiced disaster recovery well over a dozen times, and I could repeat the workloads on a new cluster with barely any effort (and I'd wind up with new certs.) So I blew the configuration away and started the cluster over (kubeadm reset), reinstalled the workloads, and was back in business less than 30 minutes later.

I don't know how I could convince you that it's worth your time to do this, and that's OK (it's not important to me, and if I'm right, in 6 months to a year it won't even really matter anymore, you won't need it.) WKS looks really promising, though admittedly still bleeding edge right now. But as it improves and stabilizes, I will likely use this instead, and soon after that forget everything I ever knew about building kubeadm clusters by hand.


Kubernetes, once you know it, is significantly easier than cobbling together an environment from "classical" solutions that combine Puppet/Chef/Ansible, homegrown shell scripts, static VMs, and SSH.

Sure, you can bring up a single VM with those technologies and be up and running quickly. But a real production environment will need automatic scaling (both of processes and nodes), CPU/memory limits, rolling app/infra upgrades, distributed log collection and monitoring, resilience to node failure, load balancing, stateful services (e.g. a database; anything that stores its state on disk and can't use a distributed file system), etc., and you end up building a very, very poor man's Kubernetes dealing with all of the above.

With Kubernetes, all of the work has been done, and you only need to deal with high-level primitives. "Nodes" become an abstraction. You just specify what should run, and the cluster takes care of it.

I've been there, many times. I ran stuff the "classical" Unix way -- successfully, but painfully -- for about 15 years and I'm not going back there.

There are alternatives, of course. Terraform and CloudFormation and things like that. There's Nomad. You can even cobble together something with Docker. But those solutions all require a lot more custom glue from the ops team than Kubernetes.


The majority of what you posted reiterates the post I responded to, and it doesn't address the complexity of those features or their implementation. Additionally, I challenge your assertion that "real production environments" need automatic scaling.


You missed my point. I was contrasting Kubernetes with the alternative: Critics often highlight Kubernetes' complexity, forgetting/ignoring that replicating its functionality is also complex and often not composable or transferable to new projects/clusters. It's hard to design a good, flexible Puppet (or whatever) configuration that grows with a company, can be maintained across teams, handles redundancy, and all of those other things.

Not all environments need automatic scaling, but they need redundancy, and from a Kubernetes perspective those are two sides of the same coin. A classical setup that automatically allows a new node to start up to take over from a dysfunctional/dead one isn't trivial.

Much of Kubernetes' operational complexity also melts away if you choose a managed cloud such as DigitalOcean, Azure, or Google Cloud Platform. I can speak from experience, as I've both set up Kubernetes from scratch on AWS (a fun challenge, but I wouldn't want to do it often) and am administering several clusters on Google Cloud.

The latter requires almost no classical "system administration". Most of the concerns are "hoisted" up to the Kubernetes layer. If something is wrong, it's almost never related to a node or hardware; it's all pod orchestration and application configuration, with some occasional bits relating to DNS, load balancing, and persistent disks.

And if I start a new project I can just boot up a cluster (literally a single command) and have my operational platform ready to serve apps, much like the "one click deploy" promise of, say, Heroku or Zeit, except I have almost complete control of the platform.
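On GKE, for example, the "single command" really is about this short (the cluster name and zone are placeholders):

```shell
# Create a managed cluster, fetch credentials, deploy an app.
gcloud container clusters create my-cluster --zone us-central1-a
gcloud container clusters get-credentials my-cluster --zone us-central1-a
kubectl apply -f app.yaml
```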

In my opinion, Kubernetes beats everything else even on a single node.


I'd argue that those who adopt kubernetes spend way more on ops than those who don't.


Source?


> All of this glosses over the biggest issue with Kubernetes: it's still ridiculously complex, and troubleshooting issues that arise (and they will arise), can leave you struggling for days poring over docs, code, GitHub issues, stackoverflow.

It's probably good at this point to distinguish between on-prem and managed installations of k8s. In almost four years of running production workloads on Google's GKE we've had... I don't know, perhaps 3-4 real head-scratchers where we had to spend a couple of days digging into things. Notably, none of these issues have ever left any of our clusters or workloads inoperable. It isn't hyperbole to say that, in general, the system just works, 24x7x365.


Agreed. We moved from ECS to GKE specifically because we didn't have the resources to handle what was supposed to be a "managed" container service with ECS; we constantly had agent issues where we couldn't deploy. It did take a little while to learn k8s, no doubt. But now it requires so few changes that I usually have to think for a minute to remember how something works, because it's been so long since I needed to touch it.


Maybe, but the point with containers and Kubernetes is to treat your infrastructure like cattle, not pets.

If something blows up or dies, then with Kubernetes it's often faster to just tear down the entire namespace and bring it up again. If the entire cluster is dead, then just spin up a new cluster and run your yaml files on it and kill your old cluster.

Treat it like cattle, when it doesn't serve your purpose anymore then shoot it.

This is one of the biggest advantages of Kubes, but often overlooked because traditional Ops people keep treating infrastructure like a pet.

The only thing you should treat like a pet is your persistence layer, which is presumably outside Kubes, something like DynamoDB, Firestore, Cosmos DB, SQL Server, whatever.
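The "shoot it and replace it" recovery described above is short enough to script; assuming the app's manifests live in a ./k8s/ directory (a hypothetical layout), it's roughly:

```shell
# Tear down everything in the namespace, then re-create it from
# the declarative config in version control.
kubectl delete namespace myapp --wait
kubectl create namespace myapp
kubectl apply -n myapp -f ./k8s/
```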


This is not good engineering. If somebody told me this at a business, I’d not trust them anymore with my infrastructure.

So, you say that problems happen, and you consciously don't want to know about or solve them. A recurring problem, in your view, is solved by constantly rebuilding new K8s clusters and your whole infrastructure in them, every time?! Simple example: a microservice that leaks memory... let it keep restarting as it crashes?!

I remember one of my first jobs, at a healthcare system for a hospital in India. Their Java app was so poorly written that it kept leaking memory, bloated beyond what GC could fix, and would crash every morning around 11 AM and then again around 3 PM. The end users (doctors, nurses, pharmacists) knew about this behavior and scheduled breaks around it. Absolute bullshit engineering! It's a shame on those who wrote that code, and a shame on whoever is reckless enough to suggest ever-rebuilding K8s clusters.


> let it keep restarting as it crashes?!

Yes, "let it keep restarting while it crashes and while I investigate the issue" is MUCH preferred to "everything's down and my boss is on my ass to fix the memory issue."

The bug exists either way, but in one world my site is still up while I fix the bug and prioritize it against other work and in another world my site is hard-down.


That only works if the bug actually gets fixed. When you have normalized the idea that restarting the cluster fixes a problem, then all of a sudden you don't have a problem anymore. So now your motivation to get the bug properly fixed has gone away.

Sometimes feeling a little pain helps get things done.


You and I wish that's what happened in real life. Instead, people now normalize the behavior thinking it'll sort itself out automatically over time without ever trying to fix it.

Self-healing systems are good but only if you have someone who is keeping track of the repeated cuts to the system.


This is something that has been bothering me for the last couple of years. I consistently work with developers who no longer care about performance issues, assuming that k8s and the ops team will take care of it by adding more CPU or RAM or just restarting. What happened to writing reliable code that performed well?


Business incentives. It's the classic tension between spending more time on nicer code that does the same thing versus building more features. Code expands to fill its performance budget, and all that.

At least on the backend you can quantify the cost fairly easily. If you bring it up to your business people, they will notice the easy win and push the devs to write more efficient code.

If it's a small $$ difference, though, the devs are probably prioritizing correctly.


I've witnessed the same thing; however, there is nothing mutually exclusive about having performant code running in Kubernetes. There's a trade-off between performance and productivity, and maintaining a sense of pragmatism is a good skill to have (that's directed at those who use scaling up/out as a reason for being lax about performance).


Nothing is this black and white. I tried to emphasise just a simple philosophy that life gets a lot easier if you make things easily replaceable. That was the message I tried to convey, but of course if there is a deep problem with something it needs proper investigation + fixing, but that is an actual code/application problem.


That's not what cattle vs pets is. Treating your app as cattle means that it deploys, terminates, and re-deploys with minimal thought at the time of where and how. Your app shouldn't care which Kubernetes node it gets deployed to. There shouldn't be some stateful infrastructure that requires hand-holding (e.g. logging into a named instance to restart a specific service). Sometimes network partitions happen, a disk starts going bad, or some other funky state happens and you kill the Kubernetes pod and move on.

You should try to fix mem leaks and other issues like the one you described, and sometimes you truly do need pets. Many apps can benefit from being treated like cattle, however.

This article touched on the distinction and has plenty of associated links at the bottom: https://medium.com/@Joachim8675309/devops-concepts-pets-vs-c...


The problem is that your "cattle" has mad cow disease and no amount of restarting will help.


So what if there is a recurring issue? Are you never going to debug it?


When cattle are sick, you need to heal them, not shoot them in the head and bring in new cattle. If your software behaves badly, you need to FIX THE SOFTWARE.

Just doing the old "restart everything" is typical Windows-admin behavior and a recipe for building bad, unstable systems.

Kubernetes absolutely does do strange things, crash on strange things, and not tell you about it.

I like the system, but to pretend it's this unbelievably great thing is an exaggeration.


You fundamentally misunderstood what I said.

I agree with you, treat your software like a pet.

I am saying though, treat your infrastructure like cattle.

Infrastructure problems <> Software problems.

So yeah, if you have severe bugs in your app, go ahead and fix it, but that has nothing to do with Kubes or not Kubes anymore.


Agree that the k8s tax, as described, is a huge issue. But I think the biggest issue is immaturity of the ecosystem, with complexity coming in second. You can at least throw an expensive developer at the complexity issue.

But when it comes to reliable installations (even Helm charts for major software are a coin flip in terms of whether they'll work), fast-moving versioning that reminds me of the JavaScript Wild West (the recent RBAC-on-by-default implementation comes to mind, even if it's a good thing), and unresolved problems around provider-agnostic volumes and load balancing... those are headaches that persist long after you've learned the difference between a ReplicaSet and a Deployment.


To further this point about the ecosystem (and this is AWS-specific): you need, or have needed, to install a handful of extra services/controllers onto your EKS cluster to get it to integrate the way most would expect with AWS. Autoscaling? Install and configure the autoscaler. IAM roles? Install kube2iam. DNS/ALB/etc.? etc. etc. etc.

After a slog you get everything going. Suddenly a service is throwing errors because it doesn't have IAM permissions. You look into it and it's not getting the role from the kube2iam proxy. Kube2iam is throwing some strange error about a nil or interface cast. Let's pretend you know Go like I do. The error message still tells you nothing specific about what the issue may be. Google leads you to github and you locate an issue with the same symptoms. It's been open for over a year and nobody seems to have any clue what's going on.

Good times :) Stay safe everyone, and run a staging cluster!


Kubernetes can be very complex.. or it can be very simple. Just like a production server can be very simple, or extremely complex, or a linux distro, or an app..

Kubernetes by itself is a very minimal layer. If you install every extension you can into it, then yes, you'll hit all kinds of weird problems, but that's not a Kubernetes problem.


You could use this argument for literally anything, though. I spent days poring over docs, googling, SOF, github issues, making comments, the whole works when I learned any new software/technology. The argument doesn't hold water, IMO.


True. I'm currently in the middle of writing a paper on extending Kubernetes Scheduler through Scheduler Extender[1]. The process has been really painful.

[1]: https://github.com/kubernetes/community/blob/master/contribu...
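For context, wiring in a Scheduler Extender at that time meant pointing the scheduler at an external HTTP service via a policy file, roughly like the sketch below (the URL, port, and weight are placeholders):

```json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [
    {
      "urlPrefix": "http://127.0.0.1:8888/scheduler",
      "filterVerb": "filter",
      "prioritizeVerb": "prioritize",
      "weight": 1,
      "enableHttps": false
    }
  ]
}
```

The scheduler is then started with `--policy-config-file` pointing at this file; for every pod it schedules, it calls the extender's filter and prioritize endpoints over HTTP, which is part of why the round trips (and debugging them) can be painful.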


You're saying a feature that's in alpha, released 2 months ago, is painful? You should at least wait until a feature is in beta before expecting it to be easier to use.


Scheduler Extender was initially released over 4 years ago[1]. What you are referring to is Scheduling Framework[2], which indeed is a new feature (and will replace/contain Scheduler Extender).

[1]: https://github.com/kubernetes/kubernetes/pull/13580

[2]: https://github.com/kubernetes/enhancements/blob/master/keps/...

EDIT: formatting


Sorry about that. Can you explain the problems using the extender?


You can make an argument that Linux is ridiculously complex, and that troubleshooting issues that arise can leave you struggling for days poring over docs, code, etc., and that MS-DOS is a much simpler system, and be sort of right.


> can leave you struggling for days poring over docs, code, GitHub issues, stackoverflow

I've had that when running code straight on a VM, when running on Docker, and when running on k8s. I can't think of a way to deploy code right now that lets you completely avoid issues with systems that you require but are possibly unfamiliar with, except maybe "serverless" functions.

And of those three, I much preferred the k8s failure states, simply because k8s made running _my code_ much easier.


> I can't think of a way to deploy code right now that lets you completely avoid issues with systems that you require but are possibly unfamiliar with, except maybe "serverless" functions.

This is basically the same comment I was going to write, so I'll just jump onto it. But whenever I hear people complain about how complex XXX solution is for deployment, I always think, "ok, I agree that it sucks, but what's the alternative?"

Deploying something right now with all of its ancillary services is a chore, no matter how you do it. K8s is a pain in the ass to set up, I agree. But it seems to maintain itself the best once it is running. And long-term maintainability cannot be overlooked when considering deployment solutions.

When I look out at the sea of deployment services and options that exist right now, each option has its own tradeoffs. One service might eliminate or minimize another's tradeoffs, but it then introduces its own. You are trading one evil for another. And this makes it nearly impossible to say "X solution is the best deployment solution in 2020". Do you value scalability? Speed? Cost? Ease of learning? There are different solutions optimizing for each of these schools of thought, but it ultimately comes down to what you value most, and another developer isn't going to value things the same way, so for them, another solution is better.

The only drop-dead simple, fast, scalable deployment solution I have seen right now is static site hosting on tools like Netlify or AWS Amplify (among others). But these only work for statically generated sites, which were already pretty easy to deploy, and they are not an option for most sites outside of marketing sites, landing pages, and blogs. They aren't going to work for service-based sites, nor will they likely replace something being deployed with K8s right now. So they are almost moot in this argument, but I bring them up because they are, arguably, the only "best deployment solution" right now, if you are building a site that can meet their narrow criteria.


You could try k3s which is much easier to deploy.


If you're running on a single host anyways, why not just use init scripts or unit files? All Kubernetes is giving you is another 5-6 layers of indirection and abstraction.

EDIT: Quick clarification: still use containers. However, running containers doesn't require running Kubernetes.

> learning Kubernetes can be done in a few days

The basic commands, perhaps. But with Kubernetes' development velocity, the learning will never stop - you really do need (someone) dedicated part time to it to ensure that a version upgrade doesn't break automation/compliance (something that's happened to my company a few times now).


> If you're running on a single host anyways, why not just use init scripts or unit files?

You're absolutely right. Init scripts and systemd unit files could do every single thing here. With that said, might there be other reasons?

The ability to have multiple applications running simultaneously on a host without having to know about or step around each other is nice. This gets rid of a major headache, especially when you didn't write the applications and they might not all be well-behaved in a shared space. Having services automatically restart and having their dependent services handled is also a nice bonus, including isolating one instance of a service from another in a way that changing a port number won't go around.

Personally, I've also found that init scripts aren't always easy to learn and manage either. But YMMV.


> The ability to have multiple applications running simultaneously on a host without having to know about or step around each other is nice.

If you're running containers, you get that for free. You can run containers without running Kubernetes.

And unit/init files are no harder (for simple cases like this, it's probably significantly easier) than learning the Kubernetes YAML DSL. The unit files in particular will definitely be simpler, since systemd is container aware.


I'm extremely cynical about init scripts. I've encountered too many crusty old systems where the init scripts used some bizarre old trick from the 70s.

Anyway. Yes, you're once more absolutely correct. Everything here can be done with unit files and init scripts.

Personally, I've not found that the YAML DSL is more complex or challenging than the systemd units. At one point I didn't know either, but I definitely had bad memories of managing N inter-dependent init scripts. I found it easier to learn something I could use at home for an rpi and at work for a fleet of servers, instead of learning unit scripting for my rpi and k8s for the fleet.

It's been my experience that "simple" is generally a matter of opinion and perspective.


Let's get concrete. Here's about as complex of a unit file as you will find for running a docker container:

    [Unit]
    Description=My Container
    After=docker.service
    Requires=docker.service
    
    [Service]
    Restart=always
    ExecStart=/usr/bin/docker start -a mycontainer
    ExecStop=/usr/bin/docker stop mycontainer
75% of this is boilerplate, but there's not a lot of repetition, and most of it is relevant to the service itself. (The -a flag attaches to the container so systemd can track the process.) The remaining lines describe how you interact with Docker normally.

In comparison, here's a definition to set up the same container as a deployment in Kubernetes.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: mycontainer
      labels:
        app: myapp
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec: 
          containers:
            - name: my-container
              image: mycontainer
Almost 90% of this has meaning only to Kubernetes, not even to the people who will have to view this object later. There's a lot of repetition of content (namely labels and nested specs), and the "template" portion is not self-explanatory (what is it a template of? Why is it considered a "template"?)

This is not to say that these abstractions are useless, particularly when you have hundreds of nodes and thousands of pods. But for a single host, it's a lot of extra conceptual work (not to mention googling) to avoid learning how to write unit files.


That's a great example! Thank you very much for sharing.

That said, it's been my experience that a modern docker application is only occasionally a single container. More often it's a heterogeneous mix of three or more containers, collectively comprising an application. Now we've got multiple unit files, each of which handles a different aspect of the application, and now this notion of a "service" conflates system-level services like docker and application-level things like redis. There's a resulting explosion of cognitive complexity as I have to keep track of what's part of the application and what's a system-level service.

Meanwhile, the Kubernetes YAML requires an extra handful of lines under the "containers" key.

Again, thank you for bringing forward this concrete example. It's a very kind gesture. It's just possible that use-cases and personal evaluations of complexity might differ and lead people to different conclusions.


> There's a resulting explosion of cognitive complexity as I have to keep track of what's part of the application and what's a system-level service.

If you can start them up with additional containers (the ones in a pod), each is just another small unit file that calls Docker with a different container name. (systemd only allows multiple ExecStart lines for Type=oneshot services, so each container gets its own unit, tied together with Requires=/PartOf=.)

EDIT: You do have to think a bit differently about networking, since the containers will have separate networks by default with Docker, in comparison to a k8s pod. You can make it match, however, by creating a network for the shared containers.
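To sketch that workaround (all names here are hypothetical): a small oneshot unit can own the shared network, and each container's unit then depends on it and joins it.

```ini
# myapp-network.service -- hypothetical oneshot unit owning the shared network
[Unit]
Description=Shared Docker network for myapp
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
# The "-" prefix tells systemd to ignore the error if the network already exists
ExecStart=-/usr/bin/docker network create myapp-net
ExecStop=/usr/bin/docker network rm myapp-net
```

Each container unit then adds After=myapp-network.service and Requires=myapp-network.service, and passes --network myapp-net on its docker run/create line, which gets you pod-like shared networking.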

If, however, there's a "this service must be started before the next", systemd's dependency system will be more comprehensible than Kubernetes (since Kubernetes does not create dependency trees; the recommended method is to use init containers for such).

As a side note, unit files can also do things like init containers using the ExecStartPre hook.
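A rough sketch of that hook, extending the earlier unit file (the migration image and command are made up for illustration):

```ini
[Service]
Restart=always
# Rough analogue of an init container: a one-shot job that must
# succeed before the main container is started
ExecStartPre=/usr/bin/docker run --rm myimage migrate
# -a attaches so systemd tracks the container process
ExecStart=/usr/bin/docker start -a mycontainer
ExecStop=/usr/bin/docker stop mycontainer
```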


For multi-container systems I really like Docker Compose and Swarm - a simple yaml file can define everything. It really is wonderfully simple.
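For illustration, a minimal Compose file for a hypothetical web app plus Redis (image and service names made up):

```yaml
version: "3.8"
services:
  web:
    image: myapp-web:latest
    ports:
      - "8080:8080"
    depends_on:
      - redis
  redis:
    image: redis:6-alpine
    volumes:
      - redis-data:/data
volumes:
  redis-data:
```

docker-compose up -d runs it locally; the same file deploys to Swarm with docker stack deploy -c docker-compose.yml myapp.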


> I've encountered too many crusty old systems where the init scripts used some bizarre old trick from the 70s.

Scripts sure - but modern Linux systems use systemd units, not shell scripts for this.


Even here on HN, one need not look far to find someone who resents everything about systemd and insists on using a non-systemd distro. Such people run systems in real life, too.


Should you run into such a system, you're still just writing code, and interacting with a daemon that takes care of the hardest parts of init scripts for you.

There are no pid files. There are no file locks. There is no "daemonization" to worry about. There is no tracking the process to ensure it's still alive.

Just think about how you would interact with the docker daemon to start, stop, restart, and probe the status of a container, and write code to do exactly that.

Frankly, Docker containers are the simplest thing you could ever have to write an init script for.


Do you have evidence that they run real systems?


Evidence I'm comfortable citing in public? No. Experience suggesting that at least one such person runs at least one real system? Yes.


It gives you good abstractions for your apps. I know exactly what directories each of my apps can write to, and that they can't step on each others' toes. Backing up all of their data is easy because I know the persistent data for all of them is stored in the same parent directory.

Even if the whole node caught on fire, I can restore it by just creating a new Kubernetes box from scratch, re-applying the YAML, and restoring the persistent volume contents from backup. To me there's a lot of value over init scripts or unit files.


> I know exactly what directories each of my apps can write to, and that they can't step on each others' toes

You can do this with docker commands too. Ultimately, that's all that Kubernetes is doing, just with a YAML based DSL instead of command line flags.

> Even if the whole node caught on fire, I can restore it

So, what's different from init/unit files? Just rebuild the box and put in the unit files, and you get the same thing you had running before. Again, for a single node there's nothing that Kubernetes does that init/unit files can't do.


> You can do this with docker commands too. Ultimately, that's all that Kubernetes is doing, just with a YAML based DSL instead of command line flags.

Well, I mean, mostly. You're gonna be creating your own directories and mapping them into your docker-compose YAMLs or Docker CLI commands. And if you have five running and you're ready to add your sixth, you're gonna be SSHing in to do it again. Not quite as clean as "kubectl apply" remotely and the persistent volume gets created for you, since you specified that you needed it in your YAML.
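For what it's worth, the declarative version looks roughly like this (names and sizes hypothetical): the claim lives in the same YAML you kubectl apply, and the cluster provisions the storage for you.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

The deployment's pod template then mounts it via a volumes entry with persistentVolumeClaim.claimName: myapp-data and a matching volumeMounts on the container.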

> So, what's different from init/unit files? Just rebuild the box and put in the unit files, and you get the same thing you had running before. Again, for a single node there's nothing that Kubernetes does that init/unit files can't do.

Well you kinda just partially quoted my statement and then attacked it. You can do it with init/unit files, but you've got a higher likelihood of apps conflicting with each other, storing things in places you're not aware of, and missing important files in your backups.

It's not about what you "can't" do. It's about what you can do more easily, and treat bare metal servers like dumb container farms (cattle).


> You're gonna be creating your own directories and mapping them into your docker-compose YAMLs or Docker CLI commands.

You don't have to create them, docker does that when you specify a volume path that doesn't exist. You do have to specify them as a -v. In comparison to a full 'volume' object in a pod spec.

> And if you have five running and you're ready to add your sixth, you're gonna be SSHing in to do it again

In comparison to sshing in to install kubernetes, and connect it to your existing cluster, ultimately creating unit files to execute docker container commands on the host (to run kubelet, specifically).

> apps conflicting with each other

The only real conflict would be with external ports, which you have to manage with Kubernetes as well. Remember, these are still running in containers.

> storing things in places you're not aware of, and missing important files in your backups.

Again, they are still containers, and you simply provide a -v instead of a 'volume' key in the pod spec.
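To make that comparison concrete, the rough k8s equivalent of `docker run -v /srv/myapp:/data mycontainer` (paths and names hypothetical) is a hostPath volume in the pod spec:

```yaml
spec:
  containers:
    - name: my-container
      image: mycontainer
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      hostPath:
        path: /srv/myapp
```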

> treat bare metal servers like dumb container farms

We're not talking about clusters though. The original post I was responding to was talking about 1 vm.

I will agree that, when you move to a cluster of machines and your VM count exceeds your replica count, Kubernetes really starts to shine.


"BTW, learning Kubernetes can be done in a few days." - Learning something in a few days and being able to run it in production are two completely different things. Security, upgrades, and troubleshooting cannot be learned in a couple of days.


Anyone who starts doing anything "in production" based on a "maybe you don't need k8s" article should step back and think about whether they are the right person to put things into production.

Are you bootstrapping your stealth-mode side-project? Pick whatever you think is best, but think about the time value of operations. (So maybe just pick a managed k8s.)

Are you responsible for a system that handles transactions worth millions of dollars every day? Then maybe, again you should seek the counsel of professionals.

Otherwise these articles are just half-empty fuel cans for an (educated?) dumpster fire.

That said HashiCorp stuff is almost guaranteed to be amazing. I haven't even looked at Nomad, but I think anybody starting out with orchestration stuff should give it a go, and when they think they have outgrown it they will know what's next. Maybe k8s, maybe something else.


Sure, but "how to toss existing docker containers into GKE" is IMO less than three days. Unless you have a reason to manage your own k8s cluster, k8s is extremely easy.


So let’s take it step by step:

1. Learn shell scripts.

2. Learn Docker configuration.

3. Learn YAML (maybe Helm, and the plethora of acronyms designed specially for k8s).

4. Combine the spaghetti of 1, 2, and 3 into container image scripts.

5. Tie it into a CI/CD process, for the adventurous.

6. Learn to install and manage a k8s cluster, or learn the proprietary, non-open-source APIs of Google, Amazon, or Azure.

7. Constantly patch and manage a plethora of infrastructure software besides your application code.

Now, with all this, you're using a tool designed for million-user applications on an application that will be used by hundreds to thousands of users. I think k8s is designed for Google-scale problems and is overkill for 80-90% of deployments and applications.

Maybe just use a simple deployment with Ansible, Puppet, Chef, Nix, GNU Guix, etc. to deploy and manage software on a single VM, and extend it to a large cluster of bare metal, VMs, or containers, in a cloud-agnostic manner, if necessary.

Not sure when technology fashion overtook the infrastructure space the way complex web-app tooling overtook the JavaScript world. K8s has its place at the scale required by the handful of organizations with Google-level traffic and load; for most traditional companies, a simple cluster-management and configuration-management tool will work wonders, with fewer moving parts and less cognitive load.


Then you still have to learn Ansible, so you're still learning a tool. And you probably want to use Docker anyways, unless you're shipping statically linked binaries.

Also, you said VM which implies that instead of Kubernetes you're going to use a VM platform, which comes with every bit of the complexity Kubernetes has.

I agree with you that it's a simpler deployment method when you don't need HA. As soon as you need HA, then all of a sudden you need to be able to fail over to new nodes and manage a load balancer and the backends for that load balancer. Kubernetes makes that easy. Kubernetes makes easy things harder than they should be, and hard things easier than they should be.

The number of things I've managed that don't need HA is vanishingly low.


I think you're starting to get at what has annoyed me about a lot of the anti-k8s folks.

I really just don't think they understand the tool, because any production environment should have many of the features Kubernetes helps provide. So the argument becomes "I know how to do it this other way, so learning a new tool is too complex."

Kubernetes helps standardize a lot of these things - I can very easily hop between different clusters running completely different apps/topologies and have a good sense of what's going on. A mish-mash of custom solutions re-inventing these wheels is, in my opinion, far more confusing.

> Kubernetes makes easy things harder than they should be, and hard things easier than they should be.

This is really the crux, I think. I think a lot of people look at Kubernetes, try to learn it by running their blog on a K8s platform, and decide it's overly complex to run something that Docker probably solves (alone) for them. When you need HA for many services, don't want to have to handle the hassle of your networking management, design your applications with clear abstractions, etc., and really need to start worrying about scale, Kubernetes starts to really shine.

/rant


> I can very easily hop between different clusters running completely different apps/topologies and have a good sense of what's going on. A mish-mash of custom solutions re-inventing these wheels is, in my opinion, far more confusing

Kinda like jumping between Rails projects (assuming those rails projects don't diverge heavily from the "convention") vs jumping around between custom PHP scripts ;)

Or for Java Dev... kinda like jumping between Maven (again, assuming mostly follow the Maven convention) vs random Ant scripts to build your Java-based systems.

There will always be naysayers no matter what because they're used to the previous tech.


I am that old curmudgeon.

There's a cycle in tool ecology:

- simple useful tool

- adds formats, integrations, config, etc

- codebase is large and change is slow

- the skijump has become an overhang on the learning curve

- a frustrated person writes a simple useful tool

- goto 10

I haven't used k8s, but it looks useful for uncommon workloads.

I expect some new shiny thing will replace it eventually.


I'm the kind of guy who just got lucky enough to land in a workplace that moved away from previously duct-taped custom build scripts to something not necessarily new, but since accepted as better, standardized tooling.


I don’t have to go down the Docker or Kubernetes path for anything complex. I can still use the more secure LXD containers, with the same system configuration tools and scripts used for VMs and bare metal.

Indeed, due to a problem with Docker and the CRI, all Kubernetes clusters were recently vulnerable and needed a security patch, since Docker containers do not run in user namespaces the way LXD's do.

So for HA, too, traditional methods are better, and a functional approach like Guix or Nix, generating LXD container images, VM images, or bare-metal deployments to run the application, is far superior and more secure than the spaghetti of inscrutable black-box images popular in the Docker world.


You have either deliberately misrepresented "Docker" in your post, or don't know enough about the vulnerability (and the affected software, runc) to make the claims you are making. The vulnerability was in runc, and Docker et al have had the capability to utilize user namespaces for a number of years.


No, I did not misrepresent Docker; I've used it since version 0.96 and after 1.0. I also commented when, under new investment, it reinvented the wheel by moving away from its LXC base to write its own libcontainer, and missed out on the unprivileged containers that landed in LXC 1.0. Since then it's always been a security nightmare.

Docker, although inferior to and less secure than LXC/LXD, became popular due to marketing driven by VC money, not on technical merits.

Check the old thread discussing the security issue with docker not supporting unprivileged container. [1]

[1] https://news.ycombinator.com/item?id=20487625


> Kubernetes is really really cheap

In terms of infra costs, I can believe it. But what about engineer resources to setup and maintain your infra in k8s clusters?

Where I work, we have a full time DevOps team that's almost the same size as the main product teams. That's really not cheap.


Weird, we are 3 working on infra/CI/tooling supporting some 80+ engineers working on the product which didn't write a single line of config for Docker, CICD, Infra automation, K8s.

In fact, it was done by one person until the product teams reached some 40 engineers.

We aren't 10x guys either. What's different though is that we don't believe in the microservice hypetrain so I don't have 200 codebases to watch after.


Do you run on managed k8s from a cloud provider, or do you maintain your own cluster(s)?

If it's the former, that sounds reasonable; the latter sounds nearly implausible.


For whatever accounting/asset/tax reason, corporations seem happy to spend vast sums on monthly costs such as AWS, yet loathe purchasing physical computers to do the same job.


Seems like companies that have just begun to adopt k8s have less experience, so the task looks humongous, hence the need to hire "more people".


Agree / disagree. I’ve been using k8s in large deployments (1000s of nodes) for about 3 years.

It’s easy to get started using GKE, EKS, etc.; it’s difficult to maintain if you’re bootstrapping your own cluster. Three years in, and despite working with k8s at a pretty low level, I still learn more about its functionality every single day.

I do agree it’s great tooling wise. I personally deploy on docker for desktop k8s day one when starting a new project. I understand all the tooling, it’s easier than writing a script and figuring out where to store my secrets every damn time.

The big caveat is - kubernetes should be _my_ burden as someone in the Ops/SRE team, but I feel like you frequently see it bleed out into application developer land.

I think that the CloudRuns and Fartgates* of the world are better suited to the average developer and I think it’s Ops responsibility to make k8s as transparent as possible within the organization.

Application developers want a Heroku.

Edit: * LOL


> Kubernetes requires very little maintenance. Kubernetes takes care of itself. A container crashes? Kubes will bring it up. Do I want to roll out a new version? Kubes will do a rolling update on its own. I am running apps in kubes and for almost 2 years I haven't looked at my cluster or vms. They just run. Once every 6 months I log into my console and see that I can upgrade a few nodes. I just click ok and everything happens automatically with zero downtime.

> BTW, learning Kubernetes can be done in a few days.

This is true if you are using managed k8s from a provider or have an in-house team taking care of this. Far, far from the truth if you also need to set up and maintain your own clusters.

I'm very comfortable with k8s as a user and developer, would not be comfortable setting up a scalable and highly available cluster for production use for anything more serious than my homelab.


If you are running K8s on a single VM, you by definition do not need K8s. Run your apps with Docker and save yourself 50 layers of complexity.

> Kubernetes makes deployments really easy. docker build + kubectl apply.

The parts between "docker build" and "kubectl apply" are literally CI versus CD; they're more complicated than two steps. And when there's a problem with either, K8s is not going to fix it for you. You'll have to be notified via monitoring and begin picking through the 50 layers of complexity to find the error and fix it. Which is why we have deployment systems that do things like validate all the steps and dependencies in the pipeline, so you don't end up with a broken prod deploy.

> Kubernetes requires very little maintenance

Whatever you're smoking, pass it down here... Have you ever had to refresh all the certs on a K8s cluster? Have you ever had to move between breaking changes in the K8s backend during version upgrade? Crafted security policies, RBACs, etc to make sure when you run the thing you're not leaving gaping backdoors into your system? There's like 50 million different custom solutions out there just for K8s maintenance. Entire businesses are built around it.


I'm not very familiar with Kubernetes, but from what I've seen, to me it looks quite complicated. I'm wondering, when people say they use Kubernetes and consider it easy/simple, does that typically include operating the underlying Kubernetes container infrastructure as well? Or does "using Kubernetes" usually mean deploying containers to someone else's Kubernetes hosting (like Google)?


When I say it (and I would guess outside of people who either contribute to Kubernetes or are using it to build their own PaaS) I mean a managed service like GKE. When I attempted to set my own cluster up all the confusion was around all of the choices to make, what container networking stack, etcd, and so on. GKE gives you an easy button for simpler architectures where you just have some containers and you want them to go in a cluster.


You will probably use someone's recommended Kubernetes stack, like k3s. In practice it doesn't make a ton of difference whether you let someone else host your control plane or run it on your own hardware, but you probably want instance-level control over your actual nodes.

Over time you'll probably grow your own customizations and go-tos on kubernetes that you layer on top of k3s or what have you.


I couldn't agree more. I've built a lot of sideprojects on Kubernetes and couldn't be happier with it. It's incredibly cheap (basically free in terms of resource usage), low-touch, and abstracts away so many pesky problems that I had to deal with before. My cluster has been running for 12+ months without a single application-level outage - containers simply restart and get rescheduled when something goes wrong, and external logging and analytics solutions are a breeze to integrate.


I disagree here - there's definitely some minimum amount of resources you want to use to make k8s worth it memory- and CPU-wise. You may be able to get by with 1 CPU and 1GB of RAM for your master (which is just coordinating, not running any workloads), and there's some overhead on the workers as well.

I've been looking at running k8s at some raspberry pis at home, but anything smaller than the recently released 4 is just not worth it IMO (though I've seen several people run clusters of 3B+s).


You can’t really compare deploying Kubernetes to a bunch of Raspberry Pis with people deploying to cloud hosting services or using managed services.

Totally different use case


I am not. GP doesn’t mention running managed. When talking about “basically free” and side-projects, I don’t really get your point. The minimum cost for a one-node n1-standard-1 cluster is a subsidized ~25 USD per month on GKE (assuming you're not using preemptibles). After a year you’re above the price of comparable hardware (there are pretty decent $300 laptops these days).

I’m a happy customer of GKE but it’s not for everyone and everything. Like you say, different use-cases.


> BTW, learning Kubernetes can be done in a few days.

As someone who has deployed K8S at scale several times, this is nonsense. Learning K8S deeply enough to deploy AND MAINTAIN IT is a huge undertaking that requires an entire team to do right. Sure, you can deploy onto AWS via KOPS in a day, and you can deploy a hello world app in another day. Easy.

But that only gets you to deployment, it doesn't mean you can maintain it. There are TONS of potential failure modes, and at this point you don't understand ANY of them. When one of them crops up, who do you page at 3AM, how do you know it's even down (monitoring/alerting isn't "batteries included"), how do you diagnose and fix it once you _do_ know what's broken?

Not to mention the fact that you _have_ to continuously upgrade K8S as old releases go out of maintenance in under a year. If you're not continuously testing upgrades against production equivalent deploys, you're going to footgun spectacularly at some point, or be stuck with an old release that new/updated helm charts won't work against.

TL;DR: If you can afford a team to deploy and maintain K8S, and have a complex enough application stack to need it, it's awesome; but it's not free in either time or staff.


How much effort is it to maintain k8s provided by a cloud service provider instead of something you have set up yourself?


As far as I have seen there is still updating overhead where you have to initiate upgrades to a known stable/supported version at a regular cadence. EKS suggests having an E2E CI pipeline to test it before updating production, and I feel like that's the only way to do it. There is nonzero churn even though it's managed.


> Kubernetes is really really cheap. I can run 20 low volume apps in a kubes cluster with a single VM. This is cheaper than any other hosting solution in the cloud if you want the same level of stability and isolation. It's even cheaper when you need something like a Redis cache. If my cache goes down and the container needs to be spun up again then it's not a big issue, so for cheap projects I can even save more cost by running some infra like a Redis instance as a container too. Nothing beats that. It gets even better, I can run my services in different namespaces, and have different environments (dev/staging/etc.) isolated from each other and still running on the same amount of VMs. When you caculate the total cost saving here to traditional deployments it's just ridiculously cheap.

If you're running 20 apps on a kubes cluster with a single VM you are running twenty apps on a single VM. There's no backup, scalability or anything else. There's no orchestration.

Your deployment is a hipster version of rsync -avH myapp mybox:/my/location/myapp followed by an HTTP call telling monit/systemd to restart your apps. It is a perfectly fine way of handling apps.

k8s shines when you have a fleet of VMs and a fleet of applications that depend on each other and have dynamic constraints, but that's not how most k8s installations are used.

> I mean yes, theoretically nothing needs Kubernetes, because the internet was the same before we had Kubernetes, so it's certainly not needed, but it makes life a lot easier. Especially as a cheap lazy developer who doesn't want to spend time on any ops Kubernetes is really the best option out there next to serverless.

Only in a throw-production-code-over-the-fence sense.


You can get up and running with Kubernetes very quickly. If you have a managed k8s cluster available to you, one that somebody else manages for you, then sure, it is all upside. All the stuff you describe is great!

The thing is, if you have to manage the complexity and lifecycle of the cluster yourself, the balance tips dramatically. How do you provision it? How do you maintain it? How do you secure it? How do you upgrade it?

So I agree, k8s is great for running all manner of project, big and small. If you already have a k8s you will find yourself wanting to use it for everything! However if you don't have one, and you aren't interested in paying somebody to run one for you, then you should think long and hard about whether you're better off just launching a docker-compose from systemd or something.
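That last option can be as small as one unit file; a minimal sketch (paths and names hypothetical):

```ini
# /etc/systemd/system/myapp.service -- hypothetical unit launching a compose stack
[Unit]
Description=myapp compose stack
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/srv/myapp
ExecStart=/usr/bin/docker-compose up
ExecStop=/usr/bin/docker-compose down
Restart=always

[Install]
WantedBy=multi-user.target
```

systemctl enable --now myapp then brings the whole stack up at boot and restarts it on failure.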


> How do you provision it? How do you maintain it? How do you secure it? How do you upgrade it?

https://kubernetes.io/docs/setup/production-environment/tool...

https://kubernetes.io/docs/setup/production-environment/tool...

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/...

https://kubernetes.io/docs/concepts/security/overview/

Of course, it's not as easy as a managed solution, but it's not exactly black magic.

Docker compose from systemd is not bad, but maybe then instead of that using k3s is a better middle ground: https://github.com/rancher/k3s


> Kubernetes is really really cheap. I can run 20 low volume apps in a kubes cluster with a single VM.

Nomad is just as cheap, if not cheaper.


Have you looked into Erlang/the BEAM VM? It provides many of these same benefits.

> This is cheaper than any other hosting solution in the cloud if you want the same level of stability and isolation.

Erlang provides total process isolation and can theoretically also run on only a single machine.

> It's even cheaper when you need something like a Redis cache. If my cache goes down and the container needs to be spun up again then it's not a big issue, so for cheap projects I can even save more cost by running some infra like a Redis instance as a container too. Nothing beats that.

In Erlang, each process keeps its own state without a single point of failure (ie a single Redis instance) and can be restarted on failure by the VM.

> Kubernetes takes care of itself. A container crashes? Kubes will bring it up.

Erlang VM takes care of itself. A process crashes? The VM will bring it up.

> Do I want to roll out a new version? Kubes will do a rolling update on its own.

Ditto for the Erlang VM, with zero-downtime deployments via hot code reloading.

> I just click ok and everything happens automatically with zero downtime.

Erlang is famous for its fault tolerance and nine nines of uptime! And it's been delivering that for quite a bit longer than K8s.


I've looked at k8s a while ago and it looked insanely complex to me. Maybe it's one of those things that you need to look at several times to understand it?


Kubernetes addresses a bunch of infrastructure concerns, so IMO it's more equivalent to learning a new cloud platform (AWS, Azure) than anything else. People who already know one or more cloud platforms, and are familiar with the kind of issues that come up with distributed and HA systems on cloud (e.g. networking, secrets management) may find Kubernetes much easier.


Docker swarm gets you this without all the complexity of Kubernetes. Hell even a largish docker-compose setup could get you there.


"BTW, learning Kubernetes can be done in a few days."

How do you troubleshoot an api-group that is failing intermittently?

How do you troubleshoot a CSI issue? Because CSI isn't a simple protocol like CNI.

What do you look at if kubectl gets randomly stuck?

What do you do if a node becomes notReady?

What do you do if the container runtime of several nodes starts failing?

What do you do if one of the etcd nodes doesn't start?

What if you are missing logs in the log aggregator, or metrics?

What if when you create a project it's missing the service accounts?

What if kube-proxy randomly doesn't work?

What if the upgrade fails and ends up in an inconsistent state?

What if your pod is actually running but shows as pending?

Sure, you can learn how to deploy Kubernetes and an application on top of it in a couple of days, but learning how to run it in production will take way longer than that.


One side of this - our base k8s config is 44k lines of yaml, leading to a bunch of controllers and pods running, in a rather complex fashion. Not to mention k8s complexity and the codebase itself.

It blackboxes so many things. I still chuckle at the fact that Ansible is a first-class citizen of the operator framework.

It can certainly implode on you!

In my experience running nomad & consul is a more lightweight and simple way of doing many of the same things.

It’s a bit like the discussion raging in regards to systemd and the fact that it’s not “unixy”. I get the same feeling with k8s, whereas the hashicorp stuff, albeit less features, adheres more to the unix philosophy. Thus easier to maintain and integrate.

Just my ¢2.


> our base k8s config is 44k lines of yaml

Something has gone horribly wrong.

> In my experience running nomad & consul is a more lightweight and simple way of doing many of the same things.

If you can replace what you're getting out of your 44k lines of yaml with those two products, then you are using the wrong tools.


Edit, sorry - I missed the dot - I meant to write 4.4k lines, but grepping through the templates dir it's actually close to 12k lines.

Ah, no, it's not about replacing functionality. It's about opening up for general integrations and ease of use.

If you've set up a fully fledged infrastructure on k8s with all the bells and whistles, there's a whole lot of configuration going on here. Like a whole lot!

I most certainly can't replace all of the above with those two tools, but they make it easier to integrate in any way I see fit. What I'm saying is that Nomad is a general-purpose workload scheduler, whereas k8s schedules k8s pods only. Consul just provides "service discovery"; do with it what you want. And so on...

Having worked a couple of years using both these setups I'm a bit torn. K8s brings a lot, no doubt, but I get the feeling that the whole point of it is for Google to make sure you _do not ever invest in your own datacenters_. k8s on your own bare metal at least used to be not exactly straightforward.


> k8s on your own bare metal at least used to be not exactly straightforward.

I actually just deployed k8s on a raspberry pi cluster on my desk (obviously as a toy, not for production) and it took about an hour to get things fully functional minus RBAC.

> What I'm saying is that Nomad is a general purpose workload scheduler

Yeah, Nomad and k8s are not direct replacements at all. Nomad is a great tool for mixed workloads, but if you're purely k8s then there are more specific tools you can use.

> I meant to write 4.4K lines

Just a small difference! Glad no one wrote 44k lines of yaml, that's just a lot of yaml to write...

> close to 12k lines

Our production cluster (not the raspis running on my desk!) runs in about 4k lines, but we have a fairly simple networking and RBAC layer. We also started from just kubernetes and grew organically over time, so I'm sure someone starting today has a lot of advantage to get running more easily.


If you want ”cloud style” ingress, you’ll probably use metalLB and bgp etc. Here’s where it gets fun.

I mean, don’t get me wrong, it works - now at least. Never liked it until 1.12 tbh, which is when a bunch of things settled.

The article is about “maybe you don’t need...” and as an anecdote I helped build a really successful $$$ e-commerce business with a couple of hundred micro-services on an on-prem LXC/LXD “cloud” using nomad, vault & consul.

You can use these tools independently of each other or have them work together - unixy.

I have anecdotes from my last couple of years on k8s as well, and... it just ends up with a much more narrow scope.


Sort of similar to the fact that I basically always have both tmux and dtach installed, the latter for "just make this detachable", the former for "actually I'd like some features today".


Something like that. I want service discovery, where Consul really shines - cause containers ain't all we're doing, mkay. K8s forces etcd on you, for service discovery, and only within the cluster. So, sync it then... more code more complexity, no "pipe" ("pipe" not to be taken literally in this context) and simple integrations.

Not to mention, consul is already a full DNS server, but in k8s, we need yet another pod for coredns. Is YAP a thing? =)

For example, I love how easily Envoy integrates with any service discovery of your liking, even your own - simply present it with a specific JSON response from an API. Much like how the Ansible inventory works. It makes integrating, picking and choosing, as well as maintaining and extending your configuration management, just so much more pleasant.


K8s is only cheap if someone else is running it for you and you never have problems.


It definitely can't be done in a few days. It took me 2-3 months to go through everything and run my own cluster at home. And yes, you need to go through about every major concept (architecture, object types, API, authentication, services, deployments, updates, etc, etc, etc) in order to make sense of Kubernetes overall. It's a very complex piece of software. You can learn a simple programming language like Go in a couple of days, but definitely not Kubernetes.


Easy deployments/upgrades, and automagic external DNS and certificate management sealed the deal for me.


Which network fabric do you use and how did you set up DNS/cert management? For me certificates have been one of the pain points - I have been using cert-manager with LetsEncrypt for some time but it has been notoriously unstable and they have introduced plenty of breaking changes between releases. (That being said I haven't tried the more recent releases, maybe things have gotten more stable in the past couple of months)

Google recently released managed certs for those running on GKE, but those are limited to a single domain per cert.


I use the external-dns and cert-manager tools. cert-manager uses lets-encrypt but fully automates everything, you just add an annotation to your ingress resource. Been using it in prod for around 6 months now with no problems.
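For anyone curious, the whole thing is roughly the manifest below (host, issuer and service names are placeholders, and the exact annotation key depends on your cert-manager version):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp                                       # placeholder
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod  # your ClusterIssuer's name
spec:
  tls:
    - hosts: [myapp.example.com]
      secretName: myapp-tls     # cert-manager creates and renews this secret
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp     # placeholder service
                port:
                  number: 80
```

cert-manager watches for the annotation, does the ACME dance with Let's Encrypt, and keeps the TLS secret current.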


Ah, sounds like they’re stabilizing then - I’ve had a lot of stability and upgrading issues with older versions. Just the fact that you couldn’t configure it for automatic renewal with anything else than 24h before expiry and these renewals would fail half the time...

But I will give it another try at some point.


Can second this. For most cases it's very low touch. When things break it has always been because I deployed bad code or configs. My weekends have freed up.


Your single VM is a single point of failure. You probably want to run this in 3 VMs, one in each data center. ECS gives this to you out of the box, rolling deployments and health checks included.


An awful lot of server systems can tolerate a hardware failure on their one server every couple years given 1) good backups, 2) "shit's broken" alerts, and 3) reliable push-button re-deploy-from-scratch capability, all of which you should have anyway. Lots of smaller shops trying to run to k8s and The Cloud probably have at least that much downtime (maybe an hour or two a year, on average) due to configuration fuck-ups on their absurd Rube Goldberg deployment processes anyway.

[EDIT] oh and of course The Cloud itself dies from time to time, too. Usually due to configuration fuck-ups on their absurd Rube Goldberg deployment processes :-) I don't think one safely-managed (see above points) server is a ton worse than the kind of cloud use any mid-sized-or-smaller business can afford, outside certain special requirements. Your average CRUD app? Just rent a server from some place with a good reputation, once you have paying customers (just host on a VPS or two until then). All the stuff you need to do to run it safely you should be doing with your cloud shit anyway (testing your backups, testing your re-deploy-from-scratch capability, "shit's broken" alerts) so it's not like it takes more time or expertise. Less, really.


Not to mention there are now servers available for purchase today that have 128 x86 cores. And 2-4 TB of RAM.

That's a lot of "cloud" right there in a single server.


Business services generally need high availability goals, so often that doesn't cut it. And your single server doesn't autoscale to load.

AWS gives you availability zones, which are usually physically distinct datacenters in a region, and multiple regions. Well designed cloud apps failover between them. Very very rarely have we seen an outage across regions in AWS, if ever.


In practice I see a lot of breakage (=downtime), velocity loss, and terrible "bus factor" from complex Cloud setups where they're really not needed—one beefy server and some basic safety steps that are also needed with the Cloud, so aren't any extra work, would do. "Well designed" is not the norm and lots of the companies are heading to the cloud without an expert at the wheel, let alone more than one (see: terrible bus factor)


Businesses always ask for High Availability, but they never agree on what that actually means. I.e., does HA mean "Disaster Recovery", in which case rebuilding the system after an incident could qualify? Does it require active-active runtimes? Multiple data centers? Geographic distribution?

And by the way, how much are they willing to spend on their desired level of availability?

I still need a better way to run these conversations, but I'm trying to find a way to bring it back to cost. How much does an hour of downtime really cost you?


Agree - different business functions have different availability goals. A system that computes live risk for a trading desk might have different availability goals from an HR services portal.


I once ran a Linux server on an old IBM PC out of a run-down hotel's closet with a tiny APC battery for 10 years without a reboot. Just because I got away with it doesn't make it a great idea. (It failed because the hard drive died, but for a year and a half nobody noticed)

> An awful lot of server systems can tolerate a hardware failure on their one server every couple years given 1) good backups, 2) "shit's broken" alerts, and 3) reliable push-button re-deploy-from-scratch capability, all of which you should have anyway

Just.... just... no. First of all, nobody's got good backups. Nobody uses tape robots, and whatever alternative they have is poor in comparison, but even if they did have tape, they aren't testing their restores. Second, nobody has good alerts. Most people alert on either nothing or everything, so they end up ignoring all alerts, so they never realize things are failing until everything's dead, and then there goes your data, and also your backups don't work. Third, nobody needs push-button re-deploy-from-scratch unless they're doing that all the time. It's fine to have a runbook which documents individual pieces of automation with a few manual steps in between, and this is way easier, cheaper and faster to set up than complete automation.


> Just.... just... no. First of all, nobody's got good backups. Nobody uses tape robots, and whatever alternative they have is poor in comparison, but even if they did have tape, they aren't testing their restores. Second, nobody has good alerts. Most people alert on either nothing or everything, so they end up ignoring all alerts, so they never realize things are failing until everything's dead, and then there goes your data, and also your backups don't work.

But you should test your backups and set up useful alerts with the cloud, too.

> Third, nobody needs push-button re-deploy-from-scratch unless they're doing that all the time. It's fine to have a runbook which documents individual pieces of automation with a few manual steps in between, and this is way easier, cheaper and faster to set up than complete automation.

Huh. I consider getting at least as close as possible to that, and ideally all the way there, vital to developer onboarding and productivity anyway. So to me it is something you're doing all the time.

[EDIT] more to the point, if you don't have rock-solid redeployment capability, I'm not sure how you have any kind of useful disaster recovery plan at all. Backups aren't very useful if there's nothing to restore to.

[EDIT EDIT] that goes just as much for the cloud—if you aren't confident you can re-deploy from nothing then you're just doing a much more complicated version of pets rather than cattle.


> more to the point, if you don't have rock-solid redeployment capability, I'm not sure how you have any kind of useful disaster recovery plan at all. Backups aren't very useful if there's nothing to restore to.

As Helmuth von Moltke Sr said, "No battle plan survives contact with the enemy." So, let's step through creating the first DR plan and see how it works out.

1) Login to your DR AWS account (because you already created a DR account, right?) using your DR credentials.

2) Apply all IAM roles and policies needed. Ideally this is in Terraform. But somebody has been modifying the prod account's policies by hand and not merging it into Terraform (because reasons), and even though you had governance installed and running on your old accounts flagging it, you didn't make time to commit and test the discrepancy because "not critical, it's only DR". But luckily you had a recurring job dumping all active roles and policies to a versioned write-only S3 bucket in the DR account, so you whip up a script to edit and apply all those to the DR account.

3) You begin building the infrastructure. You take your old Terraform and try to apply it, but you first need to bootstrap the state s3 and dynamodb resources. Once that's done you try to apply again, but you realize you have multiple root modules which all refer to each other's state (because "super-duper-DRY IaC" etc) so you have to apply them in the right sequence. You also have to modify certain values in between, like VPC IDs, subnets, regions and availability zones, etc.

You find odd errors that you didn't expect, and re-learn the manual processes required for new AWS accounts, such as requesting AWS support to allow you to generate certs for your domains with ACM, manually approving the use of marketplace AMIs, and requesting service limit increases that prod depended on (to say nothing of weird things like DirectConnect to your enterprise routers).

Because you made literally everything into Terraform (CloudWatch alerts, Lambda recurring jobs, CloudTrail trails logging to S3 buckets, governance integrations, PrivateLink endpoints, even app deployments into ECS!) all the infrastructure now exists. But nothing is running. It turns out there were tons of whitelisted address ranges needed to connect with various services both internal and external, so now you need to track down all those services whose public and private subnets have changed and modify them, and probably tell the enterprise network team to update some firewalls. You also find your credentials didn't make it over, so you have to track down each of the credentials you used to use and re-generate them. Hope you kept a backed up encrypted key store, and backed up your kms customer key.
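(The state bootstrap from step 3, for reference, is roughly this backend block, where the bucket and lock table are placeholder names, and both must already exist before terraform init will accept them:)

```hcl
terraform {
  backend "s3" {
    bucket         = "dr-terraform-state"     # placeholder; create by hand or with a bootstrap module first
    key            = "global/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "dr-terraform-locks"     # placeholder; used for state locking
    encrypt        = true
  }
}
```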

All in all, your DR plan turns out to require lots of manual intervention. By re-doing DR over and over again with a fresh account, you finally learn how to automate 90% of it. It takes you several months of coordinating with various teams to do this all, which you pay for with the extra headcount of an experienced cloud admin and a sizeable budget accounting gave you to spend solely on engineering best practices and DR for an event which may never happen.

....Or you write down how it all works and keep backups, and DR will just be three days of everyone running around with their heads cut off. Which is what 99% of people do, because real disaster is pretty rare.


This is kind of what I'm talking about WRT the cloud being more trouble than it's worth if your app sits somewhere in between "trivial enough you can copy-paste some cloud configs then never touch them" on the one end and "so incredibly well-resourced you can hire three or more actual honest-to-god cloud experts to run everything, full time". Unless you have requirements extreme/weird enough that you're both not-well-resourced but also need the cloud to practically get off the ground, in which case, god help you. I think the companies in that middle ground who are "doing cloud" are mostly misguided and burning cash & harming uptime while thinking they're saving and improving them, respectively.


You nailed it in the first sentence. The blog post pretty much boils down to "k8s looks good but we were too lazy to learn how to use the thing, so we opted into something else".

Fair enough, legit argument, but trying to make a "counter-k8s" case based on that is not very convincing.


> but we were too lazy

This is unnecessarily dismissive and contributes to what makes HN a toxic place for discussions. They've already addressed the reasons.


Point taken. However, the part about "why not kubernetes" reads:

[...] we started adding ever-more complex layers of logic to operate our services.

As an example, Kubernetes allows [...] this can get quite confusing [...].

[...] this can lead to tight, implicit coupling between your project and Kubernetes.

[...] it’s tempting to go down that path and build unnecessary abstractions that can later bite you.

[...] It takes a fair amount of time and energy to stay up-to-date with the best practices and latest tooling. [...] the learning curve is quite steep.

So in short "it is complex so this and that may happen if you don't learn it properly".

Ok, this reasoning applies a priori to any tool.


Not every tool is equally complex or requires the same amount of learning. K8s has a reputation for being really high on the scale, so a reasonable team could consider it and then decide to use something less complex.


I am a one-person team running Kubernetes on Google Cloud. It costs around $62 per month before tax. (One n1-standard-1 node for $35.34 per month, HTTP Load Balancing $18.60 per month and SSD persistent disks for $8.50 per month). Kubernetes gives me the ability to scale up quickly when the need comes (which is soon, hopefully).

I evaluated AWS Fargate & Kubernetes, Azure and GKE before settling on GKE. Amazon charges $148 per month just for cluster management alone. Google and Azure charge $0 for cluster management. This ruled out AWS for me. AWS appears to be a reluctant adopter of Kubernetes. They seem to want you to use Fargate instead. I tried it and found it to be crap--very hard to get things running.

I was able to get Kubernetes running on Azure and GKE fairly easily. There were minor hiccups on both those clouds. With Azure initial creation of a AKS cluster failed because some of the resource providers weren't "registered" on my subscription. On GKE it was hard to get ingress working. Static IPs take a long time to take effect, and in the mean time you are fiddling with your yaml files trying to figure out what you may have done wrong, not realizing it is a GKE issue.

The awesome part of Kubernetes is that I didn't need to learn almost anything about Google cloud to get everything working. I only had to learn Kubernetes. If Google raises prices I can easily switch to Azure without learning any Azure technologies. My knowledge as well as my application is completely portable. I can't imagine doing any of this as a one-person team without Kubernetes.


Second the comment about ingress. By the way, if you have any idea how to get ingress to update without taking down the services behind it, that would be amazing. I think it would obviously work if we used a separate ingress/load balancer for each host, but that seemed kind of wasteful since we are OK with scheduling downtime in off hours for our project.


I believe the Kubernetes/Docker way is to not update in place but to create new instances. Can you spin up a new node pool/cluster and redirect traffic there?


Hey this sounds super interesting. I'm a PM at Upbound.io and would love to learn more about your experiences working with K8s. What's a good way of contacting you?


Kubernetes is here to stay.

What will change or be enhanced:

- Minimum requirements to actually run it (see k3s)

- More managed services (GKE, Azure and AWS exist, but also DigitalOcean)

- More/better handling of stateful services

- A simple solution for write-once-read-many (relevant for caching and for CI/CD)

For such a young project to be moving at this pace already - yeah, this is great. This is huge.

And no one needs to migrate to Kubernetes right away! But it already does a few things out of the box which reduce complexity:

- easy cert management

- internal load balancing

- autoscaling

- blue/green deployments

- deployments

But you do see how the industry is struggling with certain problems: with Kubernetes we are now moving into a cloud native era.

Everyone now has Kubernetes available. There was no managed Mesos service from Google, Azure or AWS. There was no managed Docker Swarm from Google, Azure or AWS.


> But you do see how the industry is struggling with certain problems: with Kubernetes we are now moving into a cloud native era.

We were moving into a cloud native era way before Kubernetes.


I'm usually fairly buzzword-averse, and I didn't follow the NoSQL/Spark/Big-Data phase when it was hip. I was also hostile to Kubernetes until I had to work with it against my will for a project.

Since then, I completely changed my mind about Kubernetes. This is a very good technology for one reason (and it's NOT container orchestration): portability. K8s is the missing piece that allows you to create a network of cooperating computers independently of hardware, OS, and even architecture.

If I have a home-made k8s cluster on my Raspberry Pis at home, it's not because it's lightweight and easy to manage (k8s adds significant overhead), it's because it gives me the ability to unplug one of them, take the SD card, format it and plug in another board without any interruption or configuration. I could plug my Intel laptop into that network and have some pods running on it without changing a single line in my configuration. Finally, I can zip a folder and email a bunch of yaml files to my friends (or have them git clone the repo), and they will be able to replicate an exact copy of my home cluster with all, or some, of the services. This is truly amazing.
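To be concrete, that folder is just plain manifests; each service is something like this (names and image are illustrative, not from an actual repo):

```yaml
# blog.yaml - one service out of the folder of manifests
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blog                # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels: {app: blog}
  template:
    metadata:
      labels: {app: blog}
    spec:
      containers:
        - name: blog
          image: myregistry/blog:latest   # a multi-arch image, so arm boards and x86 laptops both run it
          ports:
            - containerPort: 8080
```

Whoever clones the repo runs `kubectl apply -f .` against their own cluster and gets the same services, whatever hardware is underneath.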


I literally don't understand what you are achieving with your k8s raspberry cluster thing.


automatic failover obviously


This post should probably have (2019) in the title.

I'm the Nomad Team Lead and can try to answer any questions. Since this post was made the team has expanded, and the task/job restarting issue they link has been (mostly) addressed. Also new since this post is our Consul Connect integration which can accomplish similar goals to k8s network policies, albeit opt in and with the actual discovery/networking code living in Consul/Envoy respectively.


I absolutely loved Nomad when I used it. We were stuck with both Windows, legacy apps and command line scripts and no time to learn Docker or anything like K8s.

We were already using Consul so bringing in Nomad to schedule everything - batch files, .Net executables, some legacy programs, etc was a godsend.

Unfortunately, as much as I loved the Nomad+Consul combination, I really couldn’t suggest it today. It is so much easier to find qualified K8s people than Nomad+Consul people I couldn’t in good conscience recommend it.

But this is all a moot point to me. While if I were leading another on-prem project I would use K8s, we are all in on AWS+ECS+Fargate where I work now and we really don't care about the lock-in boogeyman.

Given a choice, I would still say at least if you’re on AWS, use the native offerings. The value of hypothetically being able to migrate a large infrastructure “seamlessly” is vastly overrated.


> I absolutely loved Nomad when I used it.

Glad to hear it!

> Unfortunately, as much as I loved the Nomad+Consul combination, I really couldn’t suggest it today.

This is a fair critique and a problem any project living in a world defined by a strong incumbent suffers. You made me realize we need resources to help k8s users translate their existing knowledge to Nomad as many people looking at Nomad will have k8s experience these days.

So thanks for this comment. Maybe with the right docs/resources we can at least minimize the very real cost of using a non-dominant tool.

> But this is all a moot point to me. We are all in on AWS+ECS+Fargate where I work now and we really don’t care about the lock in boogie man.

This was me in past lives/jobs! HashiCorp's entire product line (with the exception of maybe Packer and Vagrant) becomes much more compelling for multi-cloud, hybrid-cloud, and on-prem orgs.


In my world, there are two types of “architects” and “consultants”. You have the kind that are “Smart People” (tm) and tell their clients that “this is the way that it should be done” and you have those that “will meet the clients where they are”.

From my experience with Nomad and the little I know about K8s, Nomad is the latter. If you can run it from the command line, you can run it with Nomad. This in and of itself is a great value proposition.

But, Nomad does have the disadvantage, I posted about above and has to fight the “no one ever got fired for buying IBM”. I was able to get buy in only after I told the CTO, “it’s made by the same people who make Consul and Terraform”.


Kudos to your team. We have had great success with Nomad/Consul/Fabio. My only issue has been that there isn't a lot of insight into what is going on with Fabio. There aren't really any configuration options. I admit we are a year(plus) behind on versions so maybe things have changed.


> My only issue has been that there isn't a lot of insight into what is going on with Fabio. There aren't really any configuration options.

We have load balancing docs for 4 of the most popular load balancers including Fabio: https://www.nomadproject.io/guides/load-balancing/load-balan... We're monitoring Frank's search for a maintainer but unfortunately don't have the resources to commit to it at the moment. If Fabio becomes abandoned and we're still unable to take over we'll remove it from our docs.

Please feel free to open an issue if you have a specific idea or question. We triage every issue and appreciate user feedback immensely!


Wow we are really behind. I did not realize that Fabio was in danger of being abandoned. Do you have a recommendation on what we should be doing? I guess we can move to nginx as we have experience with that. Fabio was really nice cuz it just worked.

You guys really can't spare the resources to pick up such a necessary part of a container orchestration system?

I guess I can see it both ways, where you think that that domain already has very good solutions and you don't want to waste time reinventing the wheel.

But then it's much easier for people to pick up your stack and decide to use it when it has load balancing automatically included. Just saying.


Yeah there's definitely lots of internal discussion to see what can be done. Traefik is getting quite popular and is arguably the heir apparent. Nginx and HAProxy also work and are as battle tested as software comes.

We may also add a first class notion of ingress load balancing in the future and Envoy would be the natural choice there because we already use it for Nomad's Consul Connect integration.


In the past I tried to follow the Nomad+Consul set-up guides, but on my local machine rather than 3 separate machines, and I found it nearly impossible to get everything running. I ran into a dozen different errors and it seemed to be a lot more haphazard than the docs suggested. I really wanted to get into it, but that experience soured me on the idea that it would be a "simple" solution after all. (I'm willing to dig up and replicate my findings if you're interested)


I'm sorry to hear that! We definitely want to improve our cross-product bootstrapping situation.

Running "nomad agent -dev" and "consul agent -dev" on your local machine in 2 different terminals should work. Running more than one Nomad agent on the same machine is not recommended as you have to carefully configure ports and paths not to conflict.

There are some demo clusters available via Vagrant or Terraform, but we should really do a better job of going from the dev agent to a cluster:

- https://github.com/hashicorp/nomad/tree/master/demo

- https://github.com/hashicorp/nomad/tree/master/terraform


I feel like a lot of the problems K8s solves are fake problems that we have because all our software links/imports so many libraries/packages it's impossible to tell what's going on. It feels like we have built an entire infrastructure around hopefully capturing that one time we got it working, and being able to reboot to that state when things go wrong.


You’re not wrong, but containerization is pragmatic, not idealistic. You can still write simple, clean software packaged in a SCRATCH container - but you can also isolate whatever crap you have to work with to get the job done. Both the elegant app and the horrible one now share a reliable init system.


Reminds me of the excitement around Java Beans


I am very happy working with Kubernetes in a very small team, I really like the abstraction of the different entities and the declarative ways to configure them.

I think the learning curve is not that steep if you have already had to do the same things with other alternatives. In my own experience I have been discovering a lot of features that are very helpful, not just in production but also in development environments, where things were a real pain before.

I have a lab/cluster/blog running on Kubernetes. I am in charge of it alone, it is open source [1], and I version everything that goes to the cluster, so you can see the evolution of the Kubernetes entities, the config, the containers, and the code. I started this from scratch, improving it feature by feature; I think that might be a big factor in my positive experience with Kubernetes.

I wonder if an issue with adopting Kubernetes is trying to migrate a big system into it in a very limited time frame, pushing/forcing features to work the way they were handled in the previous approach?

[1]: https://github.com/vicjicaman/microservice-realm


A good friend and old coworker has been hitting me up for how to do stuff with Docker since his job now uses containers. He has about 15 containers (on 2 hosts) that do not change too much and was asking about setting up k8s (it was a buzzword his manager heard). i talked him into just setting up swarm.

It took all of about an hour (over text messaging) to get it set up and all stacks/services running. They could not be happier.

It comes down to the right tool for the job. If you don't need all the bells and whistles, then keep it stupid simple. I realize Swarm is not a 100% "enterprise" solution; however, before this they were just issuing docker commands after each reboot.


> i talked him into just setting up swarm.

Not contesting the heart of your comment but, given the current state of Docker, recommending Swarm to someone strikes me as bad advice. Nomad may be a better call.

Mirantis has openly said Swarm's future is a two year sunset with a transition path to k8s.


I agree. I looked into Swarm and the routing was a mess. Even discounting the bugs, lockups, and moments of total disconnection, you had to rely on ugly DNS hacks to know the live endpoints, making any kind of stateful service a nightmare to set up, while Kubernetes just gives you an endpoints API. And even then, there were no guarantees for local services that the Swarm mesh would route calls to the local instances, while in Kubernetes you can control precisely how services are resolved by grouping them in pods, so they don't needlessly clog the pipes.

I find it very hard to come up with a Swarm use case that a WAN, VPN, private segment, or Kubernetes cluster can't handle better.


I've been using Docker Swarm in production for about 2 years now... processing about 1TB of data / month across 30+ containers. The networking and routing has been rock solid except for that one day that the Docker dev team, in one release, accidentally added random hashes to the internal DNS names of services. Ever since that day I've used the docker-compose network alias for internal routing https://docs.docker.com/compose/compose-file/#aliases
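For reference, the alias workaround described above looks roughly like this in a compose file (service, network, and alias names here are made up for illustration):

```yaml
version: "3.7"
services:
  api:
    image: example/api:latest
    networks:
      backend:
        # Stable internal DNS name, independent of how Swarm
        # happens to name the service internally
        aliases:
          - api.internal
networks:
  backend:
    driver: overlay
```

Other services on the `backend` network can then always reach this one at `api.internal`.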

Discovering bugs in a technology you just started "looking into" actually sounds like the learning curve.


are you sure you're not mixing swarm and swarm mode?


Apparently out of Swarm, Swarmkit, and SwarmNext, my good experience has been with SwarmNext. So now this is even more confusing.


Ok, there is a difference? Do you have links to docs of both? This sounds hard to search for online.



Surprised to hear this, as I've been running a 3-node Swarm cluster for a couple of years now, and it's worked perfectly - and it's so much simpler than k8s.


Giving someone advice to use Nomad is bad advice as well. As much as it may not be my personal choice, k8s won by a long shot. You just can't rely on Nomad being around in 2 years, but k8s surely will be.


Is that swarm in general, or v1 swarm that predated the current "swarmkit"?

https://forums.docker.com/t/is-there-a-roadmap-for-docker-sw...

https://github.com/docker/swarmkit/issues/2665


Apart from Swarm being slowly decommissioned, what would the benefits of Nomad vs. Swarm be?


Disregarding of Swarm's merits why would anyone care what Mirantis has to say? As any other vendor Mirantis has its own agenda and promotes their vision of cloud stack. Does not mean it is the best solution for any particular user.


> Disregarding of Swarm's merits why would anyone care what Mirantis has to say? As any other vendor Mirantis has its own agenda and promotes their vision of cloud stack. Does not mean it is the best solution for any particular user.

Because Mirantis now owns almost all of Docker's IP. https://news.ycombinator.com/item?id=22035084


Nomad, even deployed in single-node mode, is much more pleasant than dealing with Swarm in my experience.


I agree Nomad is probably a better solution, however my friend is an Oracle DBA who now manages the servers and docker since their System Administrator left and he was thrown into it.

I was addressing the:

"The takeaway is: don't use Kubernetes just because everybody else does."

line in the article, and agreeing (did not mean to start a holy war about swarm).

On a side note, I used to run a docker swarm at home up until about 4 months ago when I switched it to K8s, I really didn't have any bad mesh routing issues, and it was pretty stable. But to be a hypocrite I switched to k8s because everyone else DOES use it and I wanted to kind of stay relevant.


Mirantis is just some cloud hosting provider and from what I can tell it has no connection to Docker. I'm generally interested in why all the FUD around Docker and Swarm. Can you support these FUD statements with some legit news stories or blog posts from the people involved at Docker?


> Mirantis is just some cloud hosting provider and from what I can tell it has no connection to Docker. I'm generally interested in why all the FUD around Docker and Swarm. Can you support these FUD statements with some legit news stories or blog posts from the people involved at Docker?

https://www.mirantis.com/blog/mirantis-acquires-docker-enter...

> Today we announced that we have acquired the Docker Enterprise platform business from Docker, Inc. including its industry leading Docker Enterprise and 750 customers.

> What About Docker Swarm?

> The primary orchestrator going forward is Kubernetes. Mirantis is committed to providing an excellent experience to all Docker Enterprise platform customers and currently expects to support Swarm for at least two years, depending on customer input into the roadmap. Mirantis is also evaluating options for making the transition to Kubernetes easier for Swarm users.

Mirantis now owns essentially all of Docker outside of Docker Desktop (someone correct me here if I'm wrong). They are saying that Swarm is not the future of Docker. It's entirely possible that the remainder of Docker, now a developer tooling company, will continue with Swarm, but it seems unlikely. It's also possible the community will keep it alive. None of those maybes are things I'd bet my platform on, though.


You are honestly the first person who has been able to articulate this issue to me. Thank you.


If you look at docker swarm and other solutions, kubernetes is the first one ever to be adopted by all big companies out there.

And it doesn't have to be that one Kubernetes; your solution only has to be Kubernetes-certified. This allows us all to use the Kubernetes API and features with different underlying implementations (as far from, or as close to, the original as they can get).

This is new.


but swarm mode is now part of "regular docker"


Personal opinion, but for something that small, I would probably not even use an orchestration tool at all, just some init scripts or unit files.


I have an actual production setup using docker-compose. It took 5m to setup and any idiot can read the compose file and reason about what's going on.
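A minimal version of such a setup might look like this (image and service names are hypothetical):

```yaml
version: "3.7"
services:
  web:
    image: example/web:latest
    ports:
      - "80:8080"
    restart: unless-stopped   # Docker restarts the container if it crashes
  cache:
    image: redis:6-alpine
    restart: unless-stopped
```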

But I would not go spreading blog posts about it until I've been maintaining this thing for at least a year.

Can I hot-deploy (without downtime)? can I rollback? does it autoscale? can I monitor this thing? where are my logs? how do I create cronjobs? does it autorestart (ok that's an easy one...) will I end up with some half-assed deployment one day? can I trivially maintain dev/stage/prod envs? and so on and so forth.

The thing about k8s is that though it's (very) complex, it also solves a really broad spectrum of deployment issues OOTB.



Yep, still doing plain old VMs.

Just like NoSQL, BigData,..., in a couple of years k8s will be slowly forgotten as everyone updates their CVs to the Next Big Thing™.


Never bought into the VM fad myself. The physical hardware is chugging along quite nicely over here.


Owning physical hardware is a fad. I rent big iron mainframes from IBM and pay their consulting fees to punch new cards every time I want to modify a program.


Wondering if things will swing back to physical hardware with new chips like the AMD EPYC Rome: 64 cores and ridiculous I/O on a single chip. That, compounded with the fact that people seem to be gravitating back towards compiled, higher-performance languages, means you could run a medium-traffic site on one machine. Hello, mainframe era 2.0.


Depends on your use case. VMs give you more flexibility and a reasonable level of isolation that is needed in some scenarios.


Agreed. At this point HN could have a post about the merits of plain old dedicated servers and it would seem novel.


Just wait. In ten more years IBM and Deloitte will re-brand fat clients and on-prem infrastructure as something else and they'll start selling it to everyone. Replace "mainframe" and "dumb terminal" with "the cloud" and "web app" and we had this same conversation 40 years ago. Then in another 20 years we'll swing back to some version of consolidated+remote servers and lightweight client access portals.


Some technologies come and go. Williams-Kilburn tubes, for example, are now fully and finally obsolete. Ditto mercury delay line memories.

Other technologies merely expand and shrink. Big Data went from being a buzzword to a simple, established fact... albeit established in places you don't work. Does Google have Big Data? Does Amazon have Big Data? Do most of us work at either of those companies?

Big Data isn't obsolete in the same way dump trucks aren't obsolete.


Same deal over here. We wrote a little bit of tooling over the last few years to help automate various tasks regarding getting our code into production.

We have found that minimal touch is the best approach when it comes to infrastructure and tooling around it. The core of our business value is ultimately in our codebase. The infrastructure only serves to host it for our end customers, so we don't like to spend too much time on it. By keeping things very simple, we can hop between instances and even cloud vendors without much headache. We built self-contained deployments at the code level, not at the infrastructure level (i.e. Docker).

How many businesses (aside from hyperscalers) derive their principal business value from infrastructure itself? Is it worth spending all this time and frustration on various competing declarative infrastructure approaches? Surely with the amount of time it takes to get all this mess set up, someone in your organization could have purpose-built something from zero that accomplishes the same or better for your specific business concerns.


If you've forgotten about NoSQL or BigData, chances are your shop really just overshot how much data and traffic they have.


The vast majority of shops have less data than the NoSQL vendors wanted to convince them they had. (With the ability to handle big data, some shops have started creating huge datasets e.g. as part of the targeted advertising arms race, but it's not clear they really need to.)


Scale: the problem (almost) nobody has, and (almost) everybody wants.


Devs working at the vast majority of shops:

Easily handle production load on 3-5 servers / VMs while reading all about how to script quickly spinning up 1,000 VMs if they really needed to.


My servers are named after my pets and I'm not ashamed to admit it!


There’s different kinds of scale. I work for a company that has “only” hundreds of thousands of daily active users, but we ingest and persist a lot of data for them, on the order of millions of events per day for some of the larger users. We definitely run into big data problems in select services, despite not having a crazy number of users. Many companies are similar.


One modern PC server with PostgreSQL can handle about 1 million transactions per second.

http://akorotkov.github.io/blog/2016/05/09/scalability-towar...


In the comments the guy says the hardware the benchmark was running on had 3TB of RAM.

That's not average hardware. It's a couple hundred connections max running select statements as fast as possible on the best hardware they could afford to test against.

That doesn't translate into your CRUD app handling a million requests per second.


They've been "forgotten" because they've become so mainstream that they're now part of the norm.


Containers are a mild improvement over existing technology. Nobody cares whether your cloud provider uses containers, older tech, or newer tech.


Clarifying above: VMs, containers, and MicroVMs are all a box to run code and some surrounding orchestration mechanisms to standardize and make configuration portable.

They're all part of the same generation of tech - portable instances. This is a leap over having physical servers, but a mild improvement over each other. Think git vs mercurial - the difference isn't significant enough to move from one to the other.

Serverless, for example, would be a leap over the instance model.


Why not containers? They seem more efficient than VMs. ("Process with restrictions" vs "full operating system and kernel with repeated context switching".)


"Not containers" for lots of reasons, depending on what you mean by "container". Docker is a crime against the computing industry. Other systems have implemented containers more competently, both within Linux (LXC et al) and without (FreeBSD jails), but per usual, without the cute mascot and the bonfire of VC money, these things go unheeded by the hype cycle.

Still, we have enough security issues with traditional virtualization, which only worsen every year as new side channel attacks are discovered. You'll find that container platforms like Fargate actually run everything in its own micro-VM because there's no other way for them to provide the basic isolation guarantees they peddle.


I wouldn't run untrusted code in a container either. I'm talking about the case where you control and trust the software.

Also, what do you mean by "Docker is a crime against the computing industry"?


NoSQL is definitely not obsolete. When you go to Amazon, it's not looking in PostgreSQL for your shopping cart; it's looking in Dynamo. As someone who previously worked at a Fortune 10 company, I can tell you that most production APIs were being driven off NoSQL or were being migrated to NoSQL (specifically Cassandra and Elasticsearch).


You said it quite clearly, Fortune 10, which rules out 99% of all companies out there.


And 5 if not 9 out of those Fortune 10 companies' NoSQL DBs were just tied with duct tape ('Our frontend team needs something they can understand - ah, okay then') to the mainframe, with a previous generation of NoSQL - like ADABAS, IMS, or z/TPF - and a ton of COBOL code where all the work has been done since forever.


Not for OLTP I assume? Or are they using transactional NoSQL?


idk, there's some value in Kubernetes way beyond instancing VMs: it provides a nice system that ties beautifully into configuration management, in a way that plain VMs, Dockerfiles, or even good old Puppet scripts can't.


Can't tell if you're joking, but it's not like NoSQL, Big Data, etc, went anywhere. The hype died down but the uptake is probably even bigger now.


Just like NoSql, BigData, renewables, and the Fax machine.


Kubernetes is akin to writing unit tests: plenty of naysayers, because the benefit only shows once the user makes it part of their everyday dev/working cycle.

k8s won't go away and this comes from a dev who hates infrastructure.


Agreed. k8s is too bloated at this point, yet alternatives like Nomad and Swarm are missing some fundamental features, so we have had to adopt k8s, unfortunately.

For example, Swarm still has no fault tolerance, and Nomad relies on Vault, another product from HashiCorp, which is in a similarly limited state wrt documentation.


Try ECS on AWS. Rock solid and reasonably simple and does everything you need.


I don't think the correct answer is to change your stack to fit some vendor.


It's just Docker; your containers remain exactly the same, only the orchestration changes!


You are pushing ECS quite hard - why so? It's nothing particularly amazing in my opinion: good for what it does, but it requires all your other infra to reside in AWS too, which is not that cheap if you're really concerned about cost. E.g., NAT is 5 cents per hour, and you need at least two of them if you want to use private subnets.


Maybe ECS has gotten better, but when I used it a few years back it had a lot of rough edges, and wasn't really "simple" to use. On top of that, you're locking yourself into a single provider.

A managed k8s service (all 3 big providers offer this) really isn't that much more complex, and has much better documentation / no vendor lock-in.


I think this post misses the point of where Kubernetes brings the most value: multi-team environments. If you are a single team, then Nomad, non-distributed Docker, systemd units, etc. are probably a better choice today due to the operational complexity of running Kubernetes (GKE/AKS/EKS do lower the complexity, but you still need Kubernetes experts on call).

However, if you are in an organization with multiple teams (as are the vast majority of developers), then Kubernetes provides a common language for deploying, operating, and securing your applications, which lets you go from a process that could take days, weeks, or even months to provision and configure a VM, to minutes or hours to provision a Kubernetes namespace.


It's funny, we use GKE and couldn't be happier; once your CI/CD is set up you don't have to think about it, and it makes your developer life much, much easier.

I don't even think for one second about going back to uploading tar.gz or DEB/RPM packages using scripts, Puppet, etc.


I wish this article was titled 'Nomad is pretty great' since that was really the point being made here. Nomad is not a true alternative to Kubernetes. It does not support autoscaling out of the box, it does not have most of the bells/whistles that Kubernetes has.

However, it is just a single executable and it just works. Combined with Consul and Fabio, you can run a container orchestration cluster with very little fuss that has service discovery, internal load balancing and cluster-wide logging.

It is a feature-rich task scheduler with a pretty good CLI. I highly recommend it, but if you need stuff like autoscaling, you need to use something else.


(Full disclosure: Nomad Team Lead here, so I'm biased!)

> Nomad is not a true alternative to Kubernetes.

You're right! We explicitly try not to be a standalone dropin replacement for k8s. Our comparison page goes into this a little bit (but now I realize it's in need of a refresh!): https://www.nomadproject.io/intro/vs/kubernetes.html

- Nomad relies on Consul for service discovery

- Nomad relies on Consul Connect (Envoy) for network policies

- Nomad relies on Vault for secrets

- Nomad relies on third party autoscalers - https://github.com/jippi/awesome-nomad#auto-scalers - although there's more we can do to enable them.

- Nomad relies on Consul Connect (Envoy) or other third parties for load balancing

- Nomad does not provide any cluster log aggregation (although logging plugins are planned which should make third party log aggregators easier to use)

Nomad still has many missing features such as CSI (coming in 0.11!), logging plugins, audit logging, etc, but we never intend to be as monolithic a solution as k8s. We always hope to feel like "just a scheduler" that you compose with orthogonal tools.


It's really odd that around 30% of the people who commented defending kubernetes are merely echoing stuff that the article itself mentioned.

Also, it's weird that the argument of another 30% of people defending kubernetes boils down to: "Using kubernetes is really easy, just hire Google/Azure/etc to do it for you."

Can't begin to wrap my head around that one.

But what do I know, I prefer to KISS and I like nomad. In fact, I'd be using swarm if its future wasn't spotty.


I agree with the sentiment, but I think the hardest part of Kubernetes is dealing with the complexity of the resulting cloud. You end up having to manage that complexity no matter what you're doing; the upfront cost of figuring out Kubernetes is that maybe it takes a couple of days longer to learn than something else, but then you have a huge ecosystem of technology and people solving the same problems you have.


docker compose and a jwilder proxy is all I need for most of my projects


I see Kubernetes also being used for ML workflows: yes, while you haven't even figured out how to run an ML pipeline yet, add k8s (the KubeFlow recommendation). As an ML researcher, I've found solutions like KubeFlow very painful; I don't want to learn k8s, I need to keep focusing on ML work and figure out my pipeline before I deploy to k8s.


There are lots of options out there. I'm currently exploring Cloud Foundry for our platform, since our core API is written in Spring Boot. This page from AquaSec has lots of links to articles comparing Cloud Foundry to Kubernetes. https://www.aquasec.com/wiki/display/containers/Kubernetes+v...

I also found this book to be helpful https://www.amazon.com/gp/product/B07T1Y2JRJ/ref=ppx_yo_dt_b...


I think most Nomad users are well aware that Kubernetes can do all it can - and vastly more.

If anything is wrong with Kubernetes, it would be the complexity of it and that it has a steep learning curve.

It seems it's best to have a small team of people to manage it and to solve solutions to problems that arise.

We started using Nomad last year and one thing I can say is that it's relatively easy to use and works well for its intended purpose, especially for us hybrid-cloud folks.


The part about all of this containerization technology I don't understand is that it was originally sold as a way (among other benefits) to unburden devs of having to think about ops stuff and just code.

Well, my experience with Docker, k8s, and related technologies, is that I now need to be an expert with these things just to get through my day. It's exhausting.


Before I started with k8s, I checked out Nomad many times but eventually went with k8s. Nomad's learning curve didn't seem any gentler, the docs felt sparse, and the much bigger ecosystem around k8s shouldn't be undervalued.


If you don't need container orchestration, you don't need Kubernetes. If you need reliability and, instead of using k8s, you orchestrate your containers manually, you have issues.


Given all the talk here about Kubernetes’ learning curve, does anyone have any resources for learning it for those that haven’t yet? Whether blogs, books, video courses, etc.


What are you talking about, my blog that has an astounding 2 visitors a month (one or both of which are accidental clicks) definitely needs to be on kubernetes!


I built a big globally scalable system without Kubernetes or containers.

It’s very easy to understand and is highly reliable.


Personally, I've found Chef Habitat to be a sane alternative. I don't have any experience with k8s, but if you are not using containers especially, Habitat gets you:

- packaging

- deploys

- configuration management

- supervision

- service discovery

- isolation

in a more classic "VM" environment, using unix jails. For anyone who isn't interested in learning k8s, but also wants a relatively modern, all-in-one solution to these problems, I recommend checking it out.


Nomad is awesome. Just like the original author, our tiny team went with nomad for on-prem container orchestration instead of k8s. Since then we haven't spent much time on any nomad/consul maintenance, and all that time we got back - we spent on developing the product.

Basically, I agree with others that k8s is both complex and not fully mature yet.


Anecdotally, about the worst place I've seen Kubernetes used is for CI/CD pipelines where every pipeline gets a fresh cluster. It's truly a special kind of hell--it takes at least 10 minutes to get a cluster up and running (depending on cloud provider), and it causes all kinds of indeterminism and frustration. You wait for a cluster to be spun up, wait for all of the extra services to be added, have one of them fail (completely unrelated to your PR), and have to do the whole thing over again. It's truly the new version of the "compiling" XKCD.

If you ever want to see how brittle and bizarre Kubernetes can be, use it in a 100% automated fashion and hope for the best.


I heard the word Kubernetes one too many times from listening to podcasts put out by The Changelog. The overhype bit got flipped in my head and now I'm predisposed against it. Marketing kills!


I know it's an unpopular opinion but I don't see the point of kubernetes. It's a complexity monster which is not providing any real advantage over way simpler solutions.

Even pod to pod communication, which would be trivial to do using any sane solution is a huge pain in kubernetes.

I would actually be frightened by running kubernetes in production. The simplest things are so hard to do that I don't even want to think about how to fix a weird issue when something goes wrong...


> I know it's an unpopular opinion but I don't see the point of kubernetes. It's a complexity monster which is not providing any real advantage over way simpler solutions.

The main selling point of k8s over any other tool is that it provides a single, unified, secured, multi-tenant API endpoint for managing all of production. Your developers updating their production app use the same API as a CI system that wants to spawn some worker containers, and the same API as an operator service maintaining a Redis cluster. All of this results in a single view into production. If things go well, the end result is that you swap daily interaction with a handful of different tools with disparate state (Terraform, Ansible/Chef/Salt/Puppet, shell scripts, and proprietary tools) for just managing payloads on k8s.

> Even pod to pod communication, which would be trivial to do using any sane solution is a huge pain in kubernetes.

How is it a pain? A pod behind a service provides a DNS name that allows running requests to it - this handles the bulk of production traffic. If you want to contact a particular pod that is not behind a service just use the k8s API to retrieve details about it (like Prometheus does via k8s pod service discovery).
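Concretely, a minimal Service manifest (names are placeholders) gives a set of matching pods a stable DNS name:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-api
spec:
  selector:
    app: my-api        # routes to pods carrying this label
  ports:
    - port: 80         # port the service exposes
      targetPort: 8080 # port the pods listen on
```

Other pods in the same namespace can then reach it at `http://my-api`, or at `my-api.<namespace>.svc.cluster.local` from anywhere in the cluster.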


The killer feature for us is autoscaling. Allows us to easily adapt to changing load throughout the day without having to overprovision. The ability to scale on custom metrics is really nice.
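As a sketch, scaling on a custom metric looks something like this with the `autoscaling/v2beta2` HPA API (the deployment and metric names are hypothetical, and a metrics adapter such as the Prometheus adapter must already be serving the custom metric):

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: queue_depth        # custom metric exposed via a metrics adapter
        target:
          type: AverageValue
          averageValue: "30"       # aim for ~30 queued items per pod
```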


> Even pod to pod communication, which would be trivial to do using any sane solution is a huge pain in kubernetes.

You create a service and then use the DNS name? How is that hard?


Of course most software companies (ie, where most of HN work) don't need Kubernetes. At this point it's become a meme for wasted engineering.

Are you creating a cloud provider?

- If you are creating a cloud provider, then yes you might want Kubernetes. If you are making your own cloud provider, you should question why you are doing that.

- If you're not creating a cloud provider (for example, you're a software company), then use whatever VM / container / MicroVM / etc your cloud provider gives you rather than layering your own unnecessary complexity that adds questionable value on top of what your pay for from your cloud provider.


Normally cloud providers provide k8s to their customers.


Yes, that's the biggest use of Kubernetes. Most cloud providers provide Xen VMs, containers (via whatever orchestration mechanism) or MicroVMs. Most people making software shouldn't care which. It's a box to run code.


I get really sick of these types of posts because they help fuel a weird contrarian and/or fear-based avoidance of these technologies. Then the company goes down a rabbit hole of inventing their own thing that is basically an in-house version of Kubernetes, but worse. I think Kubernetes is pretty easy; I'm not sure why companies are so against it, other than wanting to be "even cooler than the cool kids".


To start: not everyone is a company. Not everything is about companies. When people bring overly complex, mostly useless (in this context) tech like Kubernetes to a personal environment, it's silly. But whatever, it's what they know.

But when they start telling other people to learn Kubernetes because it's so useful, it gets annoying. And there are lots of people advocating for using Kubernetes on small personal projects. When they get pressed, they all fall back to the justification "you should use Kubernetes in personal projects to learn Kubernetes," which is back to silly again.

That's where these kinds of critical articles come from.


Most likely they themselves adopted a previous "cool" technology that later bit them after dying off, or being realized as just another fad. I think that apprehension of a new technology is a healthy, pragmatic response in most cases.



