Let's get it right:
Kubernetes is really, really cheap. I can run 20 low-volume apps in a kubes cluster with a single VM. This is cheaper than any other hosting solution in the cloud if you want the same level of stability and isolation. It's even cheaper when you need something like a Redis cache: if my cache goes down and the container needs to be spun up again, it's not a big issue, so for cheap projects I can save even more by running some infra, like a Redis instance, as a container too. Nothing beats that. It gets even better: I can run my services in different namespaces and have different environments (dev/staging/etc.) isolated from each other while still running on the same set of VMs. When you calculate the total cost savings compared to traditional deployments, it's just ridiculously cheap.
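To make the Redis point concrete, here's a minimal sketch of a disposable in-cluster cache; the namespace and names are made up:

apiVersion: v1
kind: Pod
metadata:
  name: redis-cache
  namespace: dev
spec:
  restartPolicy: Always   # if the container dies, it just comes back (empty)
  containers:
    - name: redis
      image: redis:5      # no volume attached; the cache is disposable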
Kubernetes makes deployments really easy. docker build + kubectl apply. That's literally it. Deployments are two commands and it's live, running in the cloud. It's elastic, it can scale, etc.
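A sketch of that flow, with a hypothetical registry and manifest path (strictly there's a push in between):

docker build -t registry.example.com/myapp:v1 .
docker push registry.example.com/myapp:v1
kubectl apply -f deploy.yaml   # the manifest points at the image above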
Kubernetes requires very little maintenance. Kubernetes takes care of itself. A container crashes? Kubes will bring it up. Do I want to roll out a new version? Kubes will do a rolling update on its own. I have been running apps in kubes for almost 2 years and I haven't looked at my cluster or VMs. They just run. Once every 6 months I log into my console and see that I can upgrade a few nodes. I just click OK and everything happens automatically with zero downtime.
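For example, a rolling update can be as little as this (names hypothetical):

kubectl set image deployment/myapp myapp=registry.example.com/myapp:v2   # kicks off the rolling update
kubectl rollout status deployment/myapp                                  # watch it converge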
I mean yes, theoretically nothing needs Kubernetes, because the internet was the same before we had Kubernetes, so it's certainly not needed, but it makes life a lot easier. Especially for a cheap, lazy developer who doesn't want to spend time on any ops, Kubernetes is really the best option out there next to serverless.
If learning Kubernetes is the reason why it's "not needed" then nothing is needed. Why use a new programming language? Why use a new db technology? Why use anything except HTML 4 + PHP, right?
BTW, learning Kubernetes can be done in a few days.
Compared to something like scp and restarting services, I would personally not pay the Kubernetes tax unless I absolutely had to.
As background, I've done time as a professional sysadmin. My current infrastructure is all Chef-based, with maybe a dozen custom cookbooks. But Chef felt kinda heavy and clunky, and the many VMs I had definitely seemed heavy compared with containerization. I thought switching to Kubernetes would be pretty straightforward.
Surprise! It was not. I moved the least complex thing I run, my home lighting daemon, to it; it's stateless and nothing connects to it, but it was still a struggle to get it up and running. Then I tried adding more stateful services and got bogged down in bugs, mysteries, and Kubernetes complexity. I set it aside, thinking I'd come back to it later when I had more time. That time never quite arrived, and a month or so ago my home lights stopped working. Why? I couldn't tell. A bunch of internal Kubernetes certificates had expired, so none of the commands worked. I ended up copy-pasting stuff out of Stack Overflow and randomly rebooting things until it started working again.
I'll happily look at it again when I have to do serious volume and can afford somebody to focus full-time on Kubernetes. But for anything small or casual, I'll be looking elsewhere.
Going into it we knew how much of a PITA it would be but we vastly underestimated how much, IMO.
Would not do again -- I would quit first.
Written 18 years ago, so obviously not about Kubernagus, but it does explain the same phenomenon. Replace Microsoft with cloud providers and that's more or less the same argument.
Kubernetes has a model for how your infrastructure and services should behave. If you stray outside that model, then you'll be fighting k8s the entire way and it will be painful.
If however you design your services and infrastructure to be within that model, then k8s simplifies many things (related to deployment).
The biggest issue I have with k8s as a developer is that while it simplifies the devops side of things, it complicates the development/testing cycle by adding an extra layer of complication when things go wrong.
Not because it's bad or especially hard, but because there's so much to unpack, and it's so tempting to unpack it all at once, and there's so much foundational stuff (Ruby language) which you really ought to learn before you try to analyze in detail exactly how the system is built up.
I learned Kubernetes around v1.5 just before RBAC was enabled by default, and I resisted upgrading past 1.6 for a good long while (until about v1.12) because it was a feature I didn't need, and all the features after it appeared to be something else which I didn't need.
I used Deis Workflow as my on-ramp to Kubernetes, and now I am a maintainer of the follow-on fork, which is a platform that made great sense to me, as I was a Deis v1 PaaS user before it was rewritten on top of Kubernetes.
Since Deis left Workflow behind after they were acquired by Microsoft, I've been on Team Hephy, which is a group of volunteers that maintains the fork of Deis Workflow.
This was my on-ramp, and it looks very much like it did in 2017, but now we are adding support for Kubernetes v1.16+ which has stabilized many of the main APIs.
If you have a way to start a Kubernetes 1.15 or earlier cluster, I can recommend this as something to try. The biggest hurdle of "how do I get my app online" is basically taken care of for you. Then once you have an app running in a cluster, you can start to learn about the cluster, and practice understanding the different failure modes as well as how to proceed with development in your new life as a cluster admin.
If you'd rather not take on the heavyweight burden of maintaining a Workflow cluster and all of its components right out of the gate (and who could blame you) I would recommend you try Draft, the lightweight successor created by Deis/Azure to try to fill the void left behind.
Both solutions are based on a concept of buildpacks, though Hephy uses a combination of Dockerfile or Heroku Buildpacks and by comparison, Draft has its own notion of a "Draftpack" which is basically a minimalistic Dockerfile tailored for whatever language or framework you are developing with.
I'm interested to hear if there are other responses, these are not really guides so much as "on-ramps" or training wheels, but I consider myself at least marginally competent, and this is how I got started myself.
Moreover, if you are keeping pace with kubeadm upgrades at all (minor releases are quarterly, and patches are more frequent), then as of the most recent minor release, Kubernetes 1.17, certificate renewal is enabled by default as an automated part of the upgrade process. You would have to do at least one cluster upgrade per year to avoid expired certs. tl;dr: this cert expiration thing isn't a problem anymore, but you do have to maintain your clusters.
(Unless you are using a managed k8s service, that is...)
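For reference, on a kubeadm cluster of that era the renewal commands looked roughly like this (they moved out of "alpha" in later releases):

kubeadm alpha certs check-expiration   # list certificates and expiry dates
kubeadm alpha certs renew all          # renew them all manually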
The fact remains also that this is the very first entry under "Administration with Kubeadm", so if you did use kubeadm and didn't find it, I'm going to have to guess that either docs have improved since your experience, or you really weren't looking to administrate anything at all.
The notion that one has to keep pace with Kubernetes upgrades is exactly the kind of thing that works fine if you have a full-time professional on the job, and very poorly if it's a sideline for people trying to get actual productive work done.
Which is fine; not everything has to scale down. But it very strongly suggests that there's a minimum scale at which Kubernetes makes sense.
I think it's fair to say that the landscape of Kubernetes proper itself (the open source package) has already reached a more evolved state than the landscape of managed Kubernetes service providers, and that's potentially problematic, especially for newcomers. It's hard enough to pick between the myriad choices available; harder still when you must justify your choice to a hostile collaborator who doesn't agree with part or all.
IMO, the people who complain the loudest about the learning curve of Kubernetes are those who have spent a decade or more learning how to administer one or more distributions of Linux servers, who have made the transition from SysV init to systemd, and in many cases who are now neck-deep in highly specialized AWS services. In many cases they have used those services successfully to extricate themselves from the nightmare-scape where one team called "System Admins" is responsible for broadly everything that runs or can run on any Linux server (or otherwise): databases, vendor applications, monitoring systems, new service development, platforming apps that were developed in-house, you name it...
I basically don't agree that there is a minimum scale for Kubernetes, and I'll assert confidently that declarative system state management is a good technology that is here to stay. But I respect your choice, and I understand that not everyone shares the unique experiences that led me to be comfortable using Kubernetes for everything from personal hobby projects to my own underground skunkworks at work.
In fact, "how do devs/admins/(people at large) get into k8s" is a broadly interesting area of study for me. The learning curve is steep, and this has all happened so fast; there is a lot to unpack before you can feel confident that there isn't much more complexity buried beneath what you've already deeply explored and understood.
But a managed Kubernetes approach only makes sense if you want all your stuff to run in that vendor's context. As I said, I started with home and personal projects. I'd be a fool to put my home lighting infrastructure or my other in-home services in somebody's cloud. And a number of my personal projects make better economic sense running on hardware I own. If there's a managed Kubernetes setup that will manage my various NUCs and my colocated physical server, I'm not aware of it.
I would say there is a minimum scale that makes sense, for control plane ownership, yes. Barring other strong reasons that you might opt to own and manage your own control plane like "it's for my home automation which should absolutely continue to function if the internet is down"...
I will concede you don't need K8s for this use case. Even if you like containers and want to use them, if you don't have much prior experience with K8s, then from a starting position of "no knowledge" you will probably have a better time with Compose and Swarm. There is a lot for a newcomer to learn about K8s, but the more you have already learned, the less likely I would be to recommend Swarm, or any other control plane (or anything else).
This is where the point I made earlier, that the managed k8s ecosystem is not as evolved as it will likely soon become, feels relevant. You may be right that no managed Kubernetes setups will handle your physical servers today, but I think the truth is somewhere between: they're coming / they're already here but most are not quite ready for production / they are here, but I don't know what to recommend strongly.
I'm leaning toward the latter (I think that if you wanted a good managed bare metal K8s, you could definitely find it.) I know some solutions that will manage bare metal nodes, but this is not a space I'm intimately familiar with.
The solutions that I do know of are at an early enough stage of development that I hesitate to mention them. It won't be long before this gets much better. The bare metal Cluster API provider is really something, and there are some really amazing solutions being built on top of it. If you want to know where I think this is going, check this out:
WKS and the "firekube" demo, a GitOps approach to managing your cluster (yes, even for bare metal nodes)
I personally don't use this yet, I run kubeadm on a single bare metal node and don't worry about scaling, or the state of the host system, or if it should become corrupted by sysadmin error, or much else really. The abstraction of the Kubernetes API is extremely convenient when you don't have to learn it from scratch anymore, and doubly so if you don't have to worry about managing your cluster. One way to make sure you don't have to worry, is to practice disaster recovery until you get really good at it.
If my workloads are containerized, then I will have them in a git repo, and they are disposable (and I can be sure, as they are regularly disposed of, as part of the lifecycle). Make tearing your cluster down and standing it back up a regular part of your maintenance cycles until you're ready to do it in an emergency situation with people watching. It's much easier than it sounds, and it's definitely easier than debugging configuration issues to start over again.
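On a single kubeadm node, such a drill is roughly this sketch (the manifest path is made up):

kubeadm reset -f              # wipe the old cluster state from the node
kubeadm init                  # fresh control plane, fresh certs
# point kubectl at /etc/kubernetes/admin.conf and reinstall your CNI plugin, then:
kubectl apply -f manifests/   # re-create the workloads tracked in git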
The alternative that I would recommend for production right now, if you don't like any managed kubernetes, is to become familiar with the kubeadm manual. It's probably quicker to read it and study for CKA than it would be to canvas the entire landscape of managed providers for the right one.
I'm sure it was painful debugging that certificate issue; I have run up against that issue in particular myself. It was after a full year or more of never upgrading my cluster (shame on me). I had refused to learn RBAC and kept my version pinned at 1.5.2, and at some point, after running "kubeadm init" and "kubeadm reset" over and over again, it became stable enough (I stopped breaking it) that I didn't need to tear it down anymore, for a whole year.
And then a year later certs expired, and I could no longer issue any commands or queries to the control plane, just like yours.
Once I realized what was happening, I tried to renew the certs for a few minutes, but I honestly didn't know enough to look up the certificate renewal docs and couldn't figure out how to do it on my own... I still haven't read all the kubeadm docs. But I knew I had practiced disaster recovery well over a dozen times, and I could repeat the workloads on a new cluster with barely any effort (and I'd wind up with new certs). So I blew the configuration away and started the cluster over (kubeadm reset), reinstalled the workloads, and was back in business less than 30 minutes later.
I don't know how I could convince you that it's worth your time to do this, and that's OK (it's not important to me, and if I'm right, in 6 months to a year it won't even really matter anymore, you won't need it.) WKS looks really promising, though admittedly still bleeding edge right now. But as it improves and stabilizes, I will likely use this instead, and soon after that forget everything I ever knew about building kubeadm clusters by hand.
Sure, you can bring up a single VM with those technologies and be up and running quickly. But a real production environment will need automatic scaling (both of processes and nodes), CPU/memory limits, rolling app/infra upgrades, distributed log collection and monitoring, resilience to node failure, load balancing, stateful services (e.g. a database; anything that stores its state on disk and can't use a distributed file system), etc., and you end up building a very, very poor man's Kubernetes dealing with all of the above.
With Kubernetes, all of the work has been done, and you only need to deal with high-level primitives. "Nodes" become an abstraction. You just specify what should run, and the cluster takes care of it.
I've been there, many times. I ran stuff the "classical" Unix way -- successfully, but painfully -- for about 15 years and I'm not going back there.
There are alternatives, of course. Terraform and CloudFormation and things like that. There's Nomad. You can even cobble together something with Docker. But those solutions all require a lot more custom glue from the ops team than Kubernetes.
Not all environments need automatic scaling, but they need redundancy, and from a Kubernetes perspective those are two sides of the same coin. A classical setup that automatically allows a new node to start up to take over from a dysfunctional/dead one isn't trivial.
Much of Kubernetes' operational complexity also melts away if you choose a managed cloud such as DigitalOcean, Azure, or Google Cloud Platform. I can speak from experience: I've set up Kubernetes from scratch on AWS (a fun challenge, but I wouldn't want to do it often), and I'm also administering several clusters on Google Cloud.
The latter requires almost no classical "system administration". Most of the concerns are "hoisted" up to the Kubernetes layer. If something is wrong, it's almost never related to a node or hardware; it's all pod orchestration and application configuration, with some occasional bits relating to DNS, load balancing, and persistent disks.
And if I start a new project I can just boot up a cluster (literally a single command) and have my operational platform ready to serve apps, much like the "one click deploy" promise of, say, Heroku or Zeit, except I have almost complete control of the platform.
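On GKE, for instance, that single command is roughly this (cluster name made up):

gcloud container clusters create my-cluster --num-nodes 3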
In my opinion, Kubernetes beats everything else even on a single node.
If something blows up or dies, then with Kubernetes it's often faster to just tear down the entire namespace and bring it up again. If the entire cluster is dead, then just spin up a new cluster and run your yaml files on it and kill your old cluster.
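Something like this sketch, assuming your manifests live in git (names made up):

kubectl delete namespace myapp         # everything in it is gone
kubectl create namespace myapp
kubectl apply -n myapp -f manifests/   # back to the state described in git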
Treat it like cattle: when it doesn't serve your purpose anymore, shoot it.
This is one of the biggest advantages of Kubes, but often overlooked because traditional Ops people keep treating infrastructure like a pet.
The only thing you should treat like a pet is your persistence layer, which is presumably outside Kubes, something like DynamoDB, Firestore, CosmosDB, SQL Server, whatever.
So, you say that problems happen, and you consciously don't want to know about or solve them. A recurring problem, in your view, is solved by constantly rebuilding new K8s clusters, and your whole infrastructure in them, every time!?!
Simple example - A microservice that leaks memory.... let it keep restarting as it crashes?!
I remember at one of my first jobs, at a healthcare system for a hospital in India, their Java app was so poorly written that it kept leaking memory, bloated beyond what GC could help with, and would crash every morning at around 11 AM and then again at around 3 PM. The end users - doctors, nurses, pharmacists - knew about this behavior and took breaks during that time. Absolutely bullshit engineering! It's a shame on those who wrote that shitty code, and shame on whoever is reckless enough to suggest endlessly rebuilding K8s clusters.
Yes, "let it keep restarting while it crashes and while I investigate the issue" is MUCH preferred to "everything's down and my boss is on my ass to fix the memory issue."
The bug exists either way, but in one world my site is still up while I fix the bug and prioritize it against other work and in another world my site is hard-down.
Sometimes feeling a little pain helps get things done.
Self-healing systems are good but only if you have someone who is keeping track of the repeated cuts to the system.
At least on backend you can quantify the cost fairly easily. If you bring it up to your business people they will notice easy win and then push the devs to make more efficient code.
If it's a small $$ difference, though, the devs are probably prioritizing correctly.
You should try to fix mem leaks and other issues like the one you described, and sometimes you truly do need pets. Many apps can benefit from being treated like cattle, however.
This article touched on the distinction and has plenty of associated links at the bottom: https://medium.com/@Joachim8675309/devops-concepts-pets-vs-c...
Just doing the old "just restart everything" is typical Windows admin behavior and a recipe for making bad, unstable systems.
Kubernetes absolutely does do strange things, crashes on strange things, and doesn't tell you about it.
I like the system, but to pretend it's this unbelievably great thing is an exaggeration.
I agree with you, treat your software like a pet.
I am saying though, treat your infrastructure like cattle.
Infrastructure problems <> Software problems.
So yeah, if you have severe bugs in your app, go ahead and fix it, but that has nothing to do with Kubes or not Kubes anymore.
It's probably good at this point to distinguish between on-prem and managed installations of k8s. In almost four years of running production workloads on Google's GKE we've had... I don't know perhaps 3-4 real head-scratchers where we had to spend a couple of days digging into things. Notably none of these issues have ever left any of our clusters or workloads inoperable. It isn't hyperbole to say that in general the system just works, 24x7x365.
After a slog you get everything going. Suddenly a service is throwing errors because it doesn't have IAM permissions. You look into it and it's not getting the role from the kube2iam proxy. Kube2iam is throwing some strange error about a nil or interface cast. Let's pretend you know Go like I do. The error message still tells you nothing specific about what the issue may be. Google leads you to github and you locate an issue with the same symptoms. It's been open for over a year and nobody seems to have any clue what's going on.
Good times :) Stay safe everyone, and run a staging cluster!
Kubernetes by itself is a very minimal layer. If you install every extension you can into it, then yes, you'll hit all kinds of weird problems, but that's not a Kubernetes problem.
I've had that when running code straight on a VM, when running on Docker, and when running on k8s. I can't think of a way to deploy code right now that lets you completely avoid issues with systems that you require but are possibly unfamiliar with, except maybe "serverless" functions.
And of those three, I much preferred the k8s failure states simply because k8s made running _my code_ much easier.
This is basically the same comment I was going to write, so I'll just jump onto it. But whenever I hear people complain about how complex XXX solution is for deployment, I always think, "ok, I agree that it sucks, but what's the alternative?"
Deploying something right now with all of its ancillary services is a chore, no matter how you do it. K8s is a pain in the ass to set up, I agree. But it seems to maintain itself the best once it is running. And long-term maintainability cannot be overlooked when considering deployment solutions.
When I look out at the sea of deployment services and options that exist right now, each option has its own tradeoffs. Another service might eliminate or minimize another's tradeoffs, but it then introduces its own. You are trading one evil for another. And this makes it nearly impossible to say "X solution is the best deployment solution in 2020". Do you value scalability? Speed? Cost? Ease of learning? There are different solutions to optimize each of these schools of thought, but it ultimately comes down to what you value most, and another developer isn't going to value things in the same way, so for them, another solution is better.
The only drop-dead simple, fast, scalable deployment solution I have seen right now is static site hosting on tools like Netlify or AWS Amplify (among others). But these only work for statically generated sites, which were already pretty easy to deploy, and they are not an option for most sites outside of marketing sites, landing pages, and blogs. They aren't going to work for service-based sites, nor will they likely replace something being deployed with K8s right now. So they are almost moot in this argument, but I bring them up because they are arguably the only "best deployment solution" right now, if you are building a site that can meet their narrow criteria.
EDIT: Quick clarification: still use containers. However, running containers doesn't require running Kubernetes.
> learning Kubernetes can be done in a few days
The basic commands, perhaps. But with Kubernetes' development velocity, the learning will never stop - you really do need (someone) dedicated part time to it to ensure that a version upgrade doesn't break automation/compliance (something that's happened to my company a few times now).
You're absolutely right. Init scripts and systemd unit files could do every single thing here. With that said, might there be other reasons?
The ability to have multiple applications running simultaneously on a host without having to know about or step around each other is nice. This gets rid of a major headache, especially when you didn't write the applications and they might not all be well-behaved in a shared space. Having services automatically restart and having their dependent services handled is also a nice bonus, including isolating one instance of a service from another in a way that changing a port number won't go around.
Personally, I've also found that init scripts aren't always easy to learn and manage either. But YMMV.
If you're running containers, you get that for free. You can run containers without running Kubernetes.
And unit/init files are no harder (for simple cases like this, it's probably significantly easier) than learning the Kubernetes YAML DSL. The unit files in particular will definitely be simpler, since systemd is container aware.
Anyway. Yes, you're once more absolutely correct. Everything here can be done with unit files and init scripts.
Personally, I've not found that the YAML DSL is more complex or challenging than the systemd units. At one point I didn't know either, but I definitely had bad memories of managing N inter-dependent init scripts. I found it easier to learn something I could use at home for an rpi and at work for a fleet of servers, instead of learning unit scripting for my rpi and k8s for the fleet.
It's been my experience that "simple" is generally a matter of opinion and perspective.
[Service]
# -a keeps the process attached so systemd can actually track the container
ExecStart=/usr/bin/docker start -a mycontainer
ExecStop=/usr/bin/docker stop mycontainer
In comparison, here's a definition to set up the same container as a deployment in Kubernetes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-container
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-container
  template:
    metadata:
      labels:
        app: my-container
    spec:
      containers:
        - name: my-container
          image: myimage
This is not to say that these abstractions are useless, particularly when you have hundreds of nodes and thousands of pods. But for a single node, it's a lot of extra conceptual work (not to mention googling) to avoid learning how to write unit files.
That said, it's been my experience that a modern docker application is only occasionally a single container. More often it's a heterogeneous mix of three or more containers, collectively comprising an application. Now we've got multiple unit files, each of which handles a different aspect of the application, and now this notion of a "service" conflates system-level services like docker and application-level things like redis. There's a resulting explosion of cognitive complexity as I have to keep track of what's part of the application and what's a system-level service.
Meanwhile, the Kubernetes YAML requires an extra handful of lines under the "containers" key.
Again, thank you for bringing forward this concrete example. It's a very kind gesture. It's just possible that use-cases and personal evaluations of complexity might differ and lead people to different conclusions.
If you can start them up with additional lines in a docker file (containers in a pod), it's just another ExecStart line in the unit file that calls Docker with a different container name.
EDIT: You do have to think a bit differently about networking, since the containers will have separate networks by default with Docker, in comparison to a k8s pod. You can make it match, however, by creating a network for the shared containers.
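A sketch of that, with made-up names; containers on the same user-defined network can reach each other by name, somewhat like containers in a pod:

docker network create myapp-net
docker run -d --name redis --network myapp-net redis:5
docker run -d --name web --network myapp-net myapp   # can reach the cache at redis:6379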
If, however, there's a "this service must be started before the next", systemd's dependency system will be more comprehensible than Kubernetes (since Kubernetes does not create dependency trees; the recommended method is to use init containers for such).
As a side note, unit files can also do things like init containers using the ExecStartPre hook.
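For instance, a line like this (hypothetical image) runs to completion before ExecStart fires, much like an init container would:

ExecStartPre=/usr/bin/docker run --rm myapp-migrations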
Scripts sure - but modern Linux systems use systemd units, not shell scripts for this.
There are no pid files. There are no file locks. There is no "daemonization" to worry about. There is no tracking the process to ensure it's still alive.
Just think about how you would interact with the docker daemon to start, stop, restart, and probe the status of a container, and write code to do exactly that.
Frankly, Docker containers are the simplest thing you could ever have to write an init script for.
Even if the whole node caught on fire, I can restore it by just creating a new Kubernetes box from scratch, re-applying the YAML, and restoring the persistent volume contents from backup. To me there's a lot of value over init scripts or unit files.
You can do this with docker commands too. Ultimately, that's all that Kubernetes is doing, just with a YAML based DSL instead of command line flags.
> Even if the whole node caught on fire, I can restore it
So, what's different from init/unit files? Just rebuild the box and put in the unit files, and you get the same thing you had running before. Again, for a single node there's nothing that Kubernetes does that init/unit files can't do.
Well, I mean, mostly. You're gonna be creating your own directories and mapping them into your docker-compose YAMLs or Docker CLI commands. And if you have five running and you're ready to add your sixth, you're gonna be SSHing in to do it again. Not quite as clean as "kubectl apply" remotely and the persistent volume gets created for you, since you specified that you needed it in your YAML.
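For comparison, that claim in the YAML is just a small PersistentVolumeClaim block, something like this (name and size made up):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi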
> So, what's different from init/unit files? Just rebuild the box and put in the unit files, and you get the same thing you had running before. Again, for a single node there's nothing that Kubernetes does that init/unit files can't do.
Well you kinda just partially quoted my statement and then attacked it. You can do it with init/unit files, but you've got a higher likelihood of apps conflicting with each other, storing things in places you're not aware of, and missing important files in your backups.
It's not about what you "can't" do. It's about what you can do more easily, and treat bare metal servers like dumb container farms (cattle).
You don't have to create them, docker does that when you specify a volume path that doesn't exist. You do have to specify them as a -v. In comparison to a full 'volume' object in a pod spec.
> And if you have five running and you're ready to add your sixth, you're gonna be SSHing in to do it again
In comparison to sshing in to install kubernetes, and connect it to your existing cluster, ultimately creating unit files to execute docker container commands on the host (to run kubelet, specifically).
> apps conflicting with each other
The only real conflict would be with external ports, which you have to manage with Kubernetes as well. Remember, these are still running in containers.
> storing things in places you're not aware of, and missing important files in your backups.
Again, they are still containers, and you simply provide a -v instead of a 'volume' key in the pod spec.
> treat bare metal servers like dumb container farms
We're not talking about clusters though. The original post I was responding to was talking about 1 vm.
I will agree that, when you move to a cluster of machines and your VM count exceeds your replica count, Kubernetes really starts to shine.
Are you bootstrapping your stealth-mode side-project? Pick whatever you think is best, but think about the time value of operations. (So maybe just pick a managed k8s.)
Are you responsible for a system that handles transactions worth millions of dollars every day? Then, again, maybe you should seek the counsel of professionals.
Otherwise these articles are just half-empty fuel cans for an (educated?) dumpster fire.
That said, HashiCorp stuff is almost guaranteed to be amazing. I haven't even looked at Nomad, but I think anybody starting out with orchestration should give it a go, and when they think they have outgrown it they will know what's next. Maybe k8s, maybe something else.
1. Learn shell scripts.
2. Learn Docker conf.
3. Learn YAML (maybe Helm and the plethora of acronyms specially designed for k8s).
4. Combine the spaghetti of 1, 2, and 3 to build container image scripts.
5. Tie it into a CI/CD process if you're adventurous.
6. Learn to install and manage a k8s cluster, or learn the proprietary, non-open-source APIs of Google, Amazon, or Azure.
7. Constantly patch and manage a plethora of infrastructure software besides your application code.
Now, with all this, use a tool designed for million-user applications on an application which will be used by hundreds to thousands of users. I think k8s is designed for Google-scale problems and is overkill for over 80-90% of deployments and applications.
Maybe just use a simple deployment with Ansible, Puppet, Chef, Nix, GNU Guix, etc. to deploy and manage software based on necessity on a single VM, and extend it to a large cluster of bare metal, VMs, or containers in a cloud-agnostic manner if necessary.
Also, you said VM which implies that instead of Kubernetes you're going to use a VM platform, which comes with every bit of the complexity Kubernetes has.
I agree with you that it's a simpler deployment method when you don't need HA. As soon as you need HA, then all of a sudden you need to be able to fail over to new nodes and manage a load balancer and the backends for that load balancer. Kubernetes makes that easy. Kubernetes makes easy things harder than they should be, and hard things easier than they should be.
The number of things I've managed that don't need HA is vanishingly low.
I really just don't think they understand the tool, because any production environment should have many of the features Kubernetes helps provide. So the argument becomes "I know how to do it this other way, so learning a new tool is too complex."
Kubernetes helps standardize a lot of these things - I can very easily hop between different clusters running completely different apps/topologies and have a good sense of what's going on. A mish-mash of custom solutions re-inventing these wheels is, in my opinion, far more confusing.
> Kubernetes makes easy things harder than they should be, and hard things easier than they should be.
This is really the crux, I think. I think a lot of people look at Kubernetes, try to learn it by running their blog on a K8s platform, and decide it's overly complex to run something that Docker probably solves (alone) for them. When you need HA for many services, don't want to have to handle the hassle of your networking management, design your applications with clear abstractions, etc., and really need to start worrying about scale, Kubernetes starts to really shine.
Kinda like jumping between Rails projects (assuming those rails projects don't diverge heavily from the "convention") vs jumping around between custom PHP scripts ;)
Or for Java Dev... kinda like jumping between Maven (again, assuming mostly follow the Maven convention) vs random Ant scripts to build your Java-based systems.
There will always be naysayers no matter what because they're used to the previous tech.
There's a cycle in tool ecology:
- simple useful tool
- adds formats, integrations, config, etc
- codebase is large and change is slow
- the skijump has become an overhang on the learning curve
- a frustrated person writes a simple useful tool
- goto 10
I haven't used k8s but it looks useful for uncommon workloads.
I expect some new shiny thing will replace it eventually.
Indeed, due to a problem with Docker and the CRI, all Kubernetes installations were recently vulnerable and needed a security patch, because Docker containers do not run in a user namespace the way LXD containers do.
So for HA, too, traditional methods are better, and a functional approach like Guix or Nix, generating LXD container images, VM images, or bare-metal deployments to run applications, is far superior and more secure than the spaghetti of inscrutable black-box images popular in the Docker world.
Docker, although inferior and less secure than LXC/LXD, became popular due to marketing driven by VC money, not on technical merits.
Check the old thread discussing the security issue with Docker not supporting unprivileged containers.
In terms of infra costs, I can believe it. But what about the engineering resources to set up and maintain your infra in k8s clusters?
Where I work, we have a full time DevOps team that's almost the same size as the main product teams. That's really not cheap.
In fact, it was done by one person until the product teams reached some 40 engineers.
We aren't 10x guys either. What's different, though, is that we don't believe in the microservice hype train, so I don't have 200 codebases to look after.
The former sounds reasonable; the latter sounds nearly implausible.
It’s easy to get started using GKE, EKS, etc. It’s difficult to maintain if you’re bootstrapping your own cluster. Three years in, and despite working with k8s at a pretty low level, I still learn more about its functionality every single day.
I do agree it’s great tooling wise. I personally deploy on docker for desktop k8s day one when starting a new project. I understand all the tooling, it’s easier than writing a script and figuring out where to store my secrets every damn time.
The big caveat is - kubernetes should be _my_ burden as someone in the Ops/SRE team, but I feel like you frequently see it bleed out into application developer land.
I think that the CloudRuns and Fartgates* of the world are better suited to the average developer and I think it’s Ops responsibility to make k8s as transparent as possible within the organization.
Application developers want a Heroku.
Edit: * LOL
> BTW, learning Kubernetes can be done in a few days.
This is true if you are using managed k8s from a provider or have an in-house team taking care of this. Far, far from the truth if you also need to set up and maintain your own clusters.
I'm very comfortable with k8s as a user and developer, would not be comfortable setting up a scalable and highly available cluster for production use for anything more serious than my homelab.
> Kubernetes makes deployments really easy. docker build + kubectl apply.
The parts between "docker build" and "kubectl apply" are literally CI versus CD; they're more complicated than two steps. And when there's a problem with either, K8s is not going to fix it up for you. You'll have to be notified via monitoring and begin picking through the 50 layers of complexity to find the error and fix it. Which is why we have deployment systems to do things like validating all the steps and dependencies in the pipeline, so you don't end up with a broken prod deploy.
> Kubernetes requires very little maintenance
Whatever you're smoking, pass it down here... Have you ever had to refresh all the certs on a K8s cluster? Have you ever had to move between breaking changes in the K8s backend during version upgrade? Crafted security policies, RBACs, etc to make sure when you run the thing you're not leaving gaping backdoors into your system? There's like 50 million different custom solutions out there just for K8s maintenance. Entire businesses are built around it.
Over time you'll probably grow your own customizations and go-tos on kubernetes that you layer on top of k3s or what have you.
I've been looking at running k8s on some Raspberry Pis at home, but anything smaller than the recently released 4 is just not worth it IMO (though I've seen several people run clusters of 3B+s).
Totally different use case
I’m a happy customer of GKE but it’s not for everyone and everything. Like you say, different use-cases.
As someone who has deployed K8S at scale several times, this is nonsense. Learning K8S deeply enough to deploy AND MAINTAIN IT is a huge undertaking that requires an entire team to do right.
Sure, you can deploy onto AWS via KOPS in a day, and you can deploy a hello world app in another day. Easy.
But that only gets you to deployment, it doesn't mean you can maintain it. There are TONS of potential failure modes, and at this point you don't understand ANY of them. When one of them crops up, who do you page at 3AM, how do you know it's even down (monitoring/alerting isn't "batteries included"), how do you diagnose and fix it once you _do_ know what's broken?
Not to mention the fact that you _have_ to continuously upgrade K8S as old releases go out of maintenance in under a year. If you're not continuously testing upgrades against production equivalent deploys, you're going to footgun spectacularly at some point, or be stuck with an old release that new/updated helm charts won't work against.
TL;DR: If you can afford a team to deploy and maintain K8S, and have a complex enough application stack to need it, it's awesome; but it's not free in either time or staff.
If you're running 20 apps on a kubes cluster with a single VM, you are running twenty apps on a single VM. There's no backup, no scalability, nothing else. There's no orchestration.
Your deployment is a hipster version of rsync -avH myapp mybox:/my/location/myapp followed by a restart done via HTTP to tell monit/systemd to restart your apps. It is a perfectly fine way of handling apps.
k8s shines when you have a fleet of VMs and a fleet of applications that depend on each other and have dynamic constraints, but that's not how most k8s installations are used.
> I mean yes, theoretically nothing needs Kubernetes, because the internet was the same before we had Kubernetes, so it's certainly not needed, but it makes life a lot easier. Especially as a cheap lazy developer who doesn't want to spend time on any ops Kubernetes is really the best option out there next to serverless.
Only in a throw-production-code-over-the-fence sense.
The thing is, if you have to manage the complexity and lifecycle of the cluster yourself, the balance tips dramatically. How do you provision it? How do you maintain it? How do you secure it? How do you upgrade it?
So I agree, k8s is great for running all manner of project, big and small. If you already have a k8s you will find yourself wanting to use it for everything! However if you don't have one, and you aren't interested in paying somebody to run one for you, then you should think long and hard about whether you're better off just launching a docker-compose from systemd or something.
Of course, it's not as easy as a managed solution, but it's not exactly black magic.
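A minimal sketch of the docker-compose-from-systemd idea, assuming the compose file lives in /opt/myapp:

[Unit]
Description=myapp compose stack
Requires=docker.service
After=docker.service

[Service]
WorkingDirectory=/opt/myapp
ExecStart=/usr/bin/docker-compose up
ExecStop=/usr/bin/docker-compose down
Restart=always

[Install]
WantedBy=multi-user.target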
Docker compose from systemd is not bad, but maybe then instead of that using k3s is a better middle ground: https://github.com/rancher/k3s
Nomad is just as cheap, if not cheaper.
> This is cheaper than any other hosting solution in the cloud if you want the same level of stability and isolation.
Erlang provides total process isolation and can theoretically also run on only a single machine.
> It's even cheaper when you need something like a Redis cache. If my cache goes down and the container needs to be spun up again then it's not a big issue, so for cheap projects I can even save more cost by running some infra like a Redis instance as a container too. Nothing beats that.
In Erlang, each process keeps its own state without a single point of failure (ie a single Redis instance) and can be restarted on failure by the VM.
> Kubernetes takes care of itself. A container crashes? Kubes will bring it up.
Erlang VM takes care of itself. A process crashes? The VM will bring it up.
> Do I want to roll out a new version? Kubes will do a rolling update on its own.
Ditto for the Erlang VM, with zero-downtime deployments via hot code reloading.
> I just click ok and everything happens automatically with zero downtime.
Erlang is famous for its fault tolerance and nine nines of uptime! And it’s been doing so for quite a bit longer than K8s.
How do you troubleshoot an api-group that is failing intermittently?
How do you troubleshoot a CSI issue? Because CSI isn't a simple protocol like CNI.
What do you look at if kubectl gets randomly stuck?
What do you do if a node becomes notReady?
What do you do if the container runtime of a several nodes starts failing?
What do you do if one of the etcd nodes doesn't start?
What if you are missing logs or metrics in the log aggregator?
What if when you create a project it's missing the service accounts?
What if kube-proxy randomly doesn't work?
What if the upgrade fails and ends up in an inconsistent state?
What if your pod is actually running but shows as pending?
Sure, you can learn how to deploy Kubernetes and an application on top of it in a couple of days, but learning how to run it in production will take way longer than that.
It blackboxes so many things. I still chuckle at the fact that Ansible is a first-class citizen of the operator framework.
It can certainly implode on you!
In my experience running nomad & consul is a more lightweight and simple way of doing many of the same things.
It’s a bit like the discussion raging in regards to systemd and the fact that it’s not “unixy”. I get the same feeling with k8s, whereas the HashiCorp stuff, albeit with fewer features, adheres more to the Unix philosophy.
Thus easier to maintain and integrate.
Just my 2¢.
Something has gone horribly wrong.
> In my experience running nomad & consul is a more lightweight and simple way of doing many of the same things.
If you can replace what you're getting out of your 44k lines of yaml with those two products, then you are using the wrong tools.
Ah, no, it's not about replacing functionality. It's about opening up for general integrations and ease of use.
If you've set up a fully fledged infrastructure on k8s with all the bells and whistles, there's a whole lot of configuration going on here. Like a whole lot!
I most certainly can't replace all of the above with those two tools, but they make it easier to integrate in any way I see fit. What I'm saying is that Nomad is a general-purpose workload scheduler, whereas k8s schedules k8s pods only.
Consul is just providing "service discovery", do with it what you want. And so on...
Having worked a couple of years using both these setups I'm a bit torn. K8s brings a lot, no doubt, but I get the feeling that the whole point of it is for google to make sure you _do not ever invest in your own datacenters_.
k8s on your own bare metal at least used to be not exactly straightforward.
I actually just deployed k8s on a raspberry pi cluster on my desk (obviously as a toy, not for production) and it took about an hour to get things fully functional minus RBAC.
> What I'm saying is that Nomad is a general purpose workload scheduler
Yeah, Nomad and k8s are not direct replacements at all. Nomad is a great tool for mixed workloads, but if you're purely k8s then there are more specific tools you can use.
> I meant to write 4.4K lines
Just a small difference! Glad no one wrote 44k lines of yaml, that's just a lot of yaml to write...
> close to 12k lines
Our production cluster (not the raspis running on my desk!) runs in about 4k lines, but we have a fairly simple networking and RBAC layer. We also started from just kubernetes and grew organically over time, so I'm sure someone starting today has a lot of advantage to get running more easily.
I mean, don’t get me wrong, it works - now at least. Never liked it until 1.12 tbh, which is when a bunch of things settled.
The article is about “maybe you don’t need...” and as an anecdote I helped build a really successful $$$ e-commerce business with a couple of hundred micro-services on an on-prem LXC/LXD “cloud” using nomad, vault & consul.
You can use these tools independently of each other or have them work together - unixy.
I have anecdotes from my last couple of years on k8s as well, and... it just ends up with a much more narrow scope.
Not to mention, consul is already a full DNS server, but in k8s, we need yet another pod for coredns. Is YAP a thing? =)
For example, I love how easily Envoy integrates with any service discovery of your liking, even your own: simply present it with a specific JSON response from an API, much like how the Ansible inventory works. It makes integrating, picking and choosing, as well as maintaining and extending your configuration management just so much more pleasant.
Google recently released managed certs for those running on GKE, but those are limited to a single domain per cert.
But I will give it another try at some point.
[EDIT] oh and of course The Cloud itself dies from time to time, too. Usually due to configuration fuck-ups on their absurd Rube Goldberg deployment processes :-) I don't think one safely-managed (see above points) server is a ton worse than the kind of cloud use any mid-sized-or-smaller business can afford, outside certain special requirements. Your average CRUD app? Just rent a server from some place with a good reputation, once you have paying customers (just host on a VPS or two until then). All the stuff you need to do to run it safely you should be doing with your cloud shit anyway (testing your backups, testing your re-deploy-from-scratch capability, "shit's broken" alerts) so it's not like it takes more time or expertise. Less, really.
That's a lot of "cloud" right there in a single server.
AWS gives you availability zones, which are usually physically distinct datacenters in a region, and multiple regions. Well designed cloud apps failover between them. Very very rarely have we seen an outage across regions in AWS, if ever.
And by the way, how much are they willing to spend on their desired level of availability?
I still need a better way to run these conversations, but I'm trying to find a way to bring it back to cost. How much does an hour of downtime really cost you?
> An awful lot of server systems can tolerate a hardware failure on their one server every couple years given 1) good backups, 2) "shit's broken" alerts, and 3) reliable push-button re-deploy-from-scratch capability, all of which you should have anyway
Just.... just... no. First of all, nobody's got good backups. Nobody uses tape robots, and whatever alternative they have is poor in comparison, but even if they did have tape, they aren't testing their restores. Second, nobody has good alerts. Most people alert on either nothing or everything, so they end up ignoring all alerts, so they never realize things are failing until everything's dead, and then there goes your data, and also your backups don't work. Third, nobody needs push-button re-deploy-from-scratch unless they're doing that all the time. It's fine to have a runbook which documents individual pieces of automation with a few manual steps in between, and this is way easier, cheaper and faster to set up than complete automation.
But you should test your backups and set up useful alerts with the cloud, too.
> Third, nobody needs push-button re-deploy-from-scratch unless they're doing that all the time. It's fine to have a runbook which documents individual pieces of automation with a few manual steps in between, and this is way easier, cheaper and faster to set up than complete automation.
Huh. I consider getting at least as close as possible to that, and ideally all the way there, vital to developer onboarding and productivity anyway. So to me it is something you're doing all the time.
[EDIT] more to the point, if you don't have rock-solid redeployment capability, I'm not sure how you have any kind of useful disaster recovery plan at all. Backups aren't very useful if there's nothing to restore to.
[EDIT EDIT] that goes just as much for the cloud—if you aren't confident you can re-deploy from nothing then you're just doing a much more complicated version of pets rather than cattle.
As Helmuth von Moltke Sr said, "No battle plan survives contact with the enemy." So, let's step through creating the first DR plan and see how it works out.
1) Login to your DR AWS account (because you already created a DR account, right?) using your DR credentials.
2) Apply all IAM roles and policies needed. Ideally this is in Terraform. But somebody has been modifying the prod account's policies by hand and not merging it into Terraform (because reasons), and even though you had governance installed and running on your old accounts flagging it, you didn't make time to commit and test the discrepancy because "not critical, it's only DR". But luckily you had a recurring job dumping all active roles and policies to a versioned write-only S3 bucket in the DR account, so you whip up a script to edit and apply all those to the DR account.
3) You begin building the infrastructure. You take your old Terraform and try to apply it, but you first need to bootstrap the state s3 and dynamodb resources. Once that's done you try to apply again, but you realize you have multiple root modules which all refer to each other's state (because "super-duper-DRY IaC" etc) so you have to apply them in the right sequence. You also have to modify certain values in between, like VPC IDs, subnets, regions and availability zones, etc.
You find odd errors that you didn't expect, and re-learn the manual processes required for new AWS accounts, such as requesting AWS support to allow you to generate certs for your domains with ACM, manually approving the use of marketplace AMIs, and requesting service limit increases that prod depended on (to say nothing of weird things like DirectConnect to your enterprise routers).
Because you made literally everything into Terraform (CloudWatch alerts, Lambda recurring jobs, CloudTrail trails logging to S3 buckets, governance integrations, PrivateLink endpoints, even app deployments into ECS!) all the infrastructure now exists. But nothing is running. It turns out there were tons of whitelisted address ranges needed to connect with various services both internal and external, so now you need to track down all those services whose public and private subnets have changed and modify them, and probably tell the enterprise network team to update some firewalls. You also find your credentials didn't make it over, so you have to track down each of the credentials you used to use and re-generate them. Hope you kept a backed up encrypted key store, and backed up your kms customer key.
All in all, your DR plan turns out to require lots of manual intervention. By re-doing DR over and over again with a fresh account, you finally learn how to automate 90% of it. It takes you several months of coordinating with various teams to do this all, which you pay for with the extra headcount of an experienced cloud admin and a sizeable budget accounting gave you to spend solely on engineering best practices and DR for an event which may never happen.
....Or you write down how it all works and keep backups, and DR will just be three days of everyone running around with their heads cut off. Which is what 99% of people do, because real disaster is pretty rare.
Fair enough, legit argument, but trying to make a "counter-k8s" case based on that is not very convincing.
This is unnecessarily dismissive and contributes to what makes HN a toxic place for discussions. They've already addressed the reasons.
[...] we started adding ever-more complex layers of logic to operate our services.
As an example, Kubernetes allows [...] this can get quite confusing [...].
[...] this can lead to tight, implicit coupling between your project and Kubernetes.
[...] it’s tempting to go down that path and build unnecessary abstractions that can later bite you.
[...] It takes a fair amount of time and energy to stay up-to-date with the best practices and latest tooling. [...] the learning curve is quite steep.
So in short "it is complex so this and that may happen if you don't learn it properly".
Ok, this reasoning applies a priori to any tool.
I evaluated AWS Fargate & Kubernetes, Azure and GKE before settling on GKE. Amazon charges $148 per month just for cluster management alone. Google and Azure charge $0 for cluster management. This ruled out AWS for me. AWS appears to be a reluctant adopter of Kubernetes. They seem to want you to use Fargate instead. I tried it and found it to be crap--very hard to get things running.
I was able to get Kubernetes running on Azure and GKE fairly easily. There were minor hiccups on both clouds. With Azure, the initial creation of an AKS cluster failed because some of the resource providers weren't "registered" on my subscription. On GKE it was hard to get ingress working: static IPs take a long time to take effect, and in the meantime you are fiddling with your yaml files trying to figure out what you may have done wrong, not realizing it is a GKE issue.
The awesome part of Kubernetes is that I didn't need to learn almost anything about Google cloud to get everything working. I only had to learn Kubernetes. If Google raises prices I can easily switch to Azure without learning any Azure technologies. My knowledge as well as my application is completely portable. I can't imagine doing any of this as a one-person team without Kubernetes.
What will change or be enhanced:
- Minimum requirements to actually run it (see k3s)
- More managed services (GKE, Azure, and AWS already exist, but also DigitalOcean)
- More/better handling of stateful services
- Simple solution for write once read many (relevant for caching and for CI/CD)
That we're already at this pace with such a young project is great. This is huge.
And no one needs to migrate to Kubernetes yet! But it already does a few things out of the box which reduce complexity (see the sketch after this list):
- easy cert management
- internal load balancing
- green/blue deployment
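As a rough sketch of the last two items (all names invented): an ordinary ClusterIP Service load-balances internally across matching pods, and a green/blue cutover can be as simple as editing the selector:

    apiVersion: v1
    kind: Service
    metadata:
      name: myapp
    spec:
      type: ClusterIP        # internal load balancing across pod replicas
      selector:
        app: myapp
        version: green       # flip to "blue" to shift traffic back
      ports:
        - port: 80
          targetPort: 8080

Cert management is similarly declarative once an add-on like cert-manager is installed, though that one is not part of core Kubernetes.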
But you do see how the industry is struggling with certain problems: with Kubernetes we are now moving into a cloud-native era.
Everyone now has Kubernetes available. There was no managed Mesos service from Google, Azure, or AWS. There was no managed Docker Swarm from Google, Azure, or AWS.
We were moving into a cloud-native era way before Kubernetes.
Since then, I completely changed my mind about Kubernetes. This is a very good technology for only one reason, and it's NOT container orchestration: portability. K8s is the missing piece that allows you to create a network of cooperating computers independently of hardware, OS, and even architecture.
If I have a home-made k8s cluster on my Raspberry Pis at home, it's not because it's lightweight and easy to manage (k8s adds significant overhead); it's because it gives me the ability to unplug one of them, take the SD card, format it, and plug in another board without any interruption or configuration. I could plug my Intel laptop into that network and have some pods running on it without having to change a single line in my configuration. Finally, I can zip a folder and email a bunch of yaml files to my friends (or have them git clone the repo), and they will be able to replicate an exact copy of my home cluster with all (or some) of the services. This is truly amazing.
I'm the Nomad Team Lead and can try to answer any questions. Since this post was made the team has expanded, and the task/job restarting issue they link has been (mostly) addressed. Also new since this post is our Consul Connect integration which can accomplish similar goals to k8s network policies, albeit opt in and with the actual discovery/networking code living in Consul/Envoy respectively.
We were already using Consul, so bringing in Nomad to schedule everything (batch files, .NET executables, some legacy programs, etc.) was a godsend.
Unfortunately, as much as I loved the Nomad+Consul combination, I really couldn’t suggest it today. It is so much easier to find qualified K8s people than Nomad+Consul people that I couldn’t in good conscience recommend it.
But this is all a moot point to me. While I would use K8s if I were leading another on-prem project, we are all in on AWS+ECS+Fargate where I work now and we really don’t care about the lock-in boogeyman.
Given a choice, I would still say at least if you’re on AWS, use the native offerings. The value of hypothetically being able to migrate a large infrastructure “seamlessly” is vastly overrated.
Glad to hear it!
> Unfortunately, as much as I loved the Nomad+Consul combination, I really couldn’t suggest it today.
This is a fair critique and a problem any project living in a world defined by a strong incumbent suffers. You made me realize we need resources to help k8s users translate their existing knowledge to Nomad as many people looking at Nomad will have k8s experience these days.
So thanks for this comment. Maybe with the right docs/resources we can at least minimize the very real cost of using a non-dominant tool.
> But this is all a moot point to me. We are all in on AWS+ECS+Fargate where I work now and we really don’t care about the lock-in boogeyman.
This was me in past lives/jobs! HashiCorp's entire product line (with the possible exceptions of Packer and Vagrant) becomes much more compelling for multi-cloud, hybrid-cloud, and on-prem orgs.
From my experience with Nomad and the little I know about K8s, Nomad is the latter. If you can run it from the command line, you can run it with Nomad. This in and of itself is a great value proposition.
But Nomad does have the disadvantage I posted about above, and it has to fight the “no one ever got fired for buying IBM” effect. I was able to get buy-in only after I told the CTO, “it’s made by the same people who make Consul and Terraform”.
We have load balancing docs for 4 of the most popular load balancers including Fabio: https://www.nomadproject.io/guides/load-balancing/load-balan... We're monitoring Frank's search for a maintainer but unfortunately don't have the resources to commit to it at the moment. If Fabio becomes abandoned and we're still unable to take over we'll remove it from our docs.
Please feel free to open an issue if you have a specific idea or question. We triage every issue and appreciate user feedback immensely!
You guys really can't spare the resources to pick up such a necessary part of a container orchestration system?
I guess I can see it both ways: you might figure that domain already has very good solutions and you don't want to waste time reinventing the wheel.
But then it's much easier for people to pick up your stack and decide to use it when it has load balancing automatically included. Just saying.
We may also add a first class notion of ingress load balancing in the future and Envoy would be the natural choice there because we already use it for Nomad's Consul Connect integration.
Running "nomad agent -dev" and "consul agent -dev" on your local machine in 2 different terminals should work. Running more than one Nomad agent on the same machine is not recommended as you have to carefully configure ports and paths not to conflict.
There are some demo clusters available via Vagrant or Terraform, but we should really do a better job of documenting the path from the dev agent to a full cluster.
I think the learning curve is not that steep if you've already had to solve the same problems Kubernetes solves using other alternatives. In my own experience, I have been discovering a lot of features that are very helpful, not just in production but also in development environments, where things were a real pain before.
I have a lab/cluster/blog running on Kubernetes. I am in charge of it alone, it is open source, and I version everything that goes to the cluster, so you can see the evolution of the Kubernetes entities, the config, the containers, and the code. I started this from scratch, improving it feature by feature; I think that might be a big factor in my positive experience with Kubernetes.
I wonder if an issue with adopting Kubernetes is trying to migrate a big system into it on a very tight timeline while pushing/forcing features to work exactly as they were handled in the previous approach?
It took all of about an hour (over text messaging) to get it set up and all stacks/services running. They could not be happier.
It comes down to the right tool for the job. If you don't need all the bells and whistles, then keep it stupid simple. I realize Swarm is not a 100% "enterprise" solution; however, before this, they were just issuing docker commands by hand after each reboot.
Not contesting the heart of your comment but, given the current state of Docker, recommending Swarm to someone strikes me as bad advice. Nomad may be a better call.
Mirantis has openly said Swarm's future is a two year sunset with a transition path to k8s.
I find it very hard to come up with a Swarm use case that a WAN, VPN, private segment, or Kubernetes cluster can't handle better.
Discovering bugs in a technology you just started "looking into" actually sounds like the learning curve.
Because Mirantis now owns almost all of Docker's IP. https://news.ycombinator.com/item?id=22035084
I was addressing the:
"The takeaway is: don't use Kubernetes just because everybody else does."
line in the article, and agreeing (did not mean to start a holy war about swarm).
On a side note, I used to run a docker swarm at home up until about 4 months ago when I switched it to K8s, I really didn't have any bad mesh routing issues, and it was pretty stable. But to be a hypocrite I switched to k8s because everyone else DOES use it and I wanted to kind of stay relevant.
> Today we announced that we have acquired the Docker Enterprise platform business from Docker, Inc. including its industry leading Docker Enterprise and 750 customers.
> What About Docker Swarm?
> The primary orchestrator going forward is Kubernetes. Mirantis is committed to providing an excellent experience to all Docker Enterprise platform customers and currently expects to support Swarm for at least two years, depending on customer input into the roadmap. Mirantis is also evaluating options for making the transition to Kubernetes easier for Swarm users.
Mirantis now owns essentially all of Docker outside of Docker Desktop (someone correct me here if I'm wrong). They are saying that Swarm is not the future of Docker. It's entirely possible that the remainder of Docker, now a developer-tooling company, will continue with Swarm, but it seems unlikely. It's also possible the community will keep it alive. None of those maybes are things I'd bet my platform on, though.
And it doesn't have to be that one Kubernetes; your solution only has to be Kubernetes-certified. This allows us all to use the Kubernetes API and features with different underlying implementations (as far from or as close to the original as they can get).
This is new.
But I would not go spreading blog posts about it until I had been maintaining this thing for at least a year.
Can I hot-deploy (without downtime)? Can I roll back? Does it autoscale? Can I monitor this thing? Where are my logs? How do I create cronjobs? Does it autorestart (OK, that's an easy one...)? Will I end up with some half-assed deployment one day? Can I trivially maintain dev/stage/prod envs? And so on and so forth.
The thing about k8s is that though it's (very) complex, it also solves a really broad spectrum of deployment issues OOTB.
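To give a flavor of how much of that list is stock Deployment config rather than custom tooling, here is a minimal sketch (image and names hypothetical):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: api
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 0   # hot deploy: never drop below full capacity
          maxSurge: 1
      template:
        metadata:
          labels:
            app: api
        spec:
          containers:
            - name: api
              image: example.com/api:1.2.3

Rollback is then "kubectl rollout undo deployment/api", autoscaling is a separate HorizontalPodAutoscaler object, cronjobs are a CronJob object, and crashed containers restart by default.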
Just like NoSQL, Big Data, ..., in a couple of years k8s will be slowly forgotten as everyone updates their CVs to the Next Big Thing™.
Other technologies merely expand and shrink. Big Data went from being a buzzword to a simple, established fact... albeit established in places you don't work. Does Google have Big Data? Does Amazon have Big Data? Do most of us work at either of those companies?
Big Data isn't obsolete in the same way dump trucks aren't obsolete.
We have found that minimal touch is the best approach when it comes to infrastructure and tooling around it. The core of our business value is ultimately in our codebase. The infrastructure only serves to host it to our end customers, so we don't like to spend too much time on it. By keeping things very simple, we can hop between instances and even cloud vendors without much headache. We built self-contained deployments at the code level, not at the infrastructure-level (I.e. docker).
How many businesses (aside from hyperscalers) derive their principal business value from infrastructure itself? Is it worth spending all this time and frustration on various competing declarative infrastructure approaches? Surely with the amount of time it takes to get all this mess set up, someone in your organization could have purpose-built something from zero that accomplishes the same or better for your specific business concerns.
They could easily handle production load on 3-5 servers/VMs while reading all about how to script quickly spinning up 1,000 VMs if they really needed to.
That's not average hardware. It's a couple hundred connections max running select statements as fast as possible on the best hardware they could afford to test against.
That doesn't translate into your CRUD app handling a million requests per second.
They're all part of the same generation of tech - portable instances. This is a leap over having physical servers, but a mild improvement over each other. Think git vs mercurial - the difference isn't significant enough to move from one to the other.
Serverless, for example, would be a leap over the instance model.
Still, we have enough security issues with traditional virtualization, which only worsen every year as new side channel attacks are discovered. You'll find that container platforms like Fargate actually run everything in its own micro-VM because there's no other way for them to provide the basic isolation guarantees they peddle.
Also, what do you mean by "Docker is a crime against the computer industry"?
k8s won't go away and this comes from a dev who hates infrastructure.
For example, Swarm still has no fault tolerance, and Nomad relies on Vault, another product from HashiCorp that is also in the same limited state with regard to documentation.
A managed k8s service (all 3 big providers offer this) really isn't that much more complex, and has much better documentation / no vendor lock-in.
However, if you are in an organization with multiple teams (as the vast majority of developers are), then Kubernetes provides a common language for deploying, operating, and securing your applications, which lets you go from a process that could take days, weeks, or even months to provision and configure a VM to minutes or hours to provision a kubernetes namespace.
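To make that concrete, "provision a kubernetes namespace" is roughly this much YAML (team name and limits made up for illustration):

    apiVersion: v1
    kind: Namespace
    metadata:
      name: team-payments
    ---
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: team-payments-quota
      namespace: team-payments
    spec:
      hard:
        requests.cpu: "8"      # cap the team's aggregate CPU requests
        requests.memory: 16Gi
        pods: "50"

Access control is then a matter of RBAC role bindings scoped to that namespace.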
I don't even think for one second about going back to uploading tar.gz or DEB/RPM packages using scripts/Puppet etc...
However, it is just a single executable and it just works. Combined with Consul and Fabio, you can run a container orchestration cluster with very little fuss that has service discovery, internal load balancing and cluster-wide logging.
It is a feature-rich task scheduler with a pretty good CLI. I highly recommend it, but if you need stuff like autoscaling, you need to use something else.
> Nomad is not a true alternative to Kubernetes.
You're right! We explicitly try not to be a standalone dropin replacement for k8s. Our comparison page goes into this a little bit (but now I realize it's in need of a refresh!): https://www.nomadproject.io/intro/vs/kubernetes.html
- Nomad relies on Consul for service discovery
- Nomad relies on Consul Connect (Envoy) for network policies
- Nomad relies on Vault for secrets
- Nomad relies on third party autoscalers - https://github.com/jippi/awesome-nomad#auto-scalers - although there's more we can do to enable them.
- Nomad relies on Consul Connect (Envoy) or other third parties for load balancing
- Nomad does not provide any cluster log aggregation (although logging plugins are planned which should make third party log aggregators easier to use)
Nomad still has many missing features such as CSI (coming in 0.11!), logging plugins, audit logging, etc, but we never intend to be as monolithic a solution as k8s. We always hope to feel like "just a scheduler" that you compose with orthogonal tools.
Also, it's weird that the argument of another 30% of people defending kubernetes boils down to: "Using kubernetes is really easy, just hire Google/Azure/etc to do it for you."
Can't begin to wrap my head around that one.
But what do I know, I prefer to KISS and I like nomad. In fact, I'd be using swarm if its future wasn't spotty.
I also found this book to be helpful https://www.amazon.com/gp/product/B07T1Y2JRJ/ref=ppx_yo_dt_b...
If anything is wrong with Kubernetes, it's the complexity and the steep learning curve.
It seems it's best to have a small team of people to manage it and to work out solutions to problems that arise.
We started using Nomad last year and one thing I can say is that it's relatively easy to use and works well for its intended purpose, especially for us hybrid-cloud folks.
Well, my experience with Docker, k8s, and related technologies, is that I now need to be an expert with these things just to get through my day. It's exhausting.
It’s very easy to understand and is highly reliable. It handles:
- configuration management
- service discovery
in a more classic "VM" environment, using unix jails. For anyone who isn't interested in learning k8s, but also wants a relatively modern, all-in-one solution to these problems, I recommend checking it out.
Basically, I agree with others that k8s is both complex and not fully mature yet.
If you ever want to see how brittle and bizarre Kubernetes can be, use it in a 100% automated fashion and hope for the best.
Even pod to pod communication, which would be trivial to do using any sane solution is a huge pain in kubernetes.
I would actually be frightened by running kubernetes in production. The simplest things are so hard to do that I don't even want to think about how to fix a weird issue when something goes wrong...
The main selling point of k8s over any other tool is that it provides a single, unified, secured, multi-tenant API endpoint for managing all of production. Your developers updating their production app use the same API as a CI system that wants to spawn some worker containers, and the same API as an operator service maintaining a Redis cluster. All of this results in a single view into production. If things go well, the end result is that you swap daily interaction with a handful of different tools with disparate states (Terraform, Ansible/Chef/Salt/Puppet, shell scripts, and proprietary tools) for just managing payloads on k8s.
> Even pod to pod communication, which would be trivial to do using any sane solution is a huge pain in kubernetes.
How is it a pain? A pod behind a service gets a DNS name you can send requests to; this handles the bulk of production traffic. If you want to contact a particular pod that is not behind a service, just use the k8s API to retrieve details about it (like Prometheus does via k8s pod service discovery).
You create a service and then use the DNS name? How is that hard?
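For the record, the whole thing is roughly this much YAML (names invented); every pod in the namespace can then reach it at the DNS name backend, or backend.default.svc.cluster.local from elsewhere in the cluster:

    apiVersion: v1
    kind: Service
    metadata:
      name: backend
      namespace: default
    spec:
      selector:
        app: backend     # any ready pod with this label receives traffic
      ports:
        - port: 5432
          targetPort: 5432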
Are you creating a cloud provider?
- If you are creating a cloud provider, then yes you might want Kubernetes. If you are making your own cloud provider, you should question why you are doing that.
- If you're not creating a cloud provider (for example, you're a software company), then use whatever VM / container / MicroVM / etc. your cloud provider gives you rather than layering your own unnecessary complexity, which adds questionable value, on top of what you pay for from your cloud provider.
But when they start telling other people to learn kubernetes because it's so useful, it gets annoying. And there are lots of people advocating for using kubernetes on small personal projects. When pressed, they all fall back to the justification "you should use kubernetes in personal projects to learn kubernetes," which is back to silly again.
That's where these kinds of critical articles come from.