Doesn't look like the author knows what he is talking about. His point that early-stage startups should not use K8s is fine. But the next piece of advice, about not using a different language for frontend and backend, is wrong. I think the most appropriate advice is to choose a stack the founding team is most familiar with. If that means RoR then RoR is fine. If it means PHP then PHP is fine too. Another option is to use a technology which is best suited for the product you are trying to build. For example, if you are building a managed cloud service, then building on top of K8s, Firecracker, or Nomad can be a good choice. But then it means you need to learn the tech being used inside out.
Also, he talks about all of this and then gives the example of WhatsApp at the end. WhatsApp chose Erlang for the backend, and their frontend was written in Java and Objective-C. They could have chosen Java for the backend to keep the frontend and backend languages the same, but they didn't. They used Erlang because they based their architecture on Ejabberd, which was open source and built with Erlang. Also, WhatsApp managed all their servers themselves and didn't even move to managed cloud services when those became available. They were self-hosting until FB acquired them and moved them to FB data centres later on (Source: http://highscalability.com/blog/2014/2/26/the-whatsapp-archi...).
I don't think their advice about not using it in a startup is correct either. You just need to somewhat know what you're doing.
I know of such a case, where a single engineer could leverage the helm chart open source community, and set up a scalable infrastructure, with prometheus, grafana, worker nodes that can scale independently of web service, a CI/CD pipeline that can spin up complete stacks with TLS automated through nginx and cert-manager, do full integration tests, etc.
I found that to be quite impressive, for one person, one year, and would probably be completely impossible if it wasn't for k8s.
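To give a rough idea of what "leveraging the helm chart community" means in practice, the core of such a setup is roughly the following few commands. This is a sketch: the chart names and namespaces are the usual community ones, not necessarily exactly what that engineer used, and all versions/values are omitted.

  # add the community chart repos
  helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
  helm repo add jetstack https://charts.jetstack.io
  helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  helm repo update

  # ingress controller for routing
  helm install ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx --create-namespace

  # cert-manager for automated TLS (installs its CRDs too)
  helm install cert-manager jetstack/cert-manager -n cert-manager --create-namespace --set installCRDs=true

  # Prometheus + Grafana + Alertmanager in one chart
  helm install monitoring prometheus-community/kube-prometheus-stack -n monitoring --create-namespace

Everything after that is configuring values files and wiring the application charts into CI/CD.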
The thing is, unless using those technologies was somehow core to what the single engineer was trying to do, it might be technically impressive but might not have actually provided value for users.
Users don't really care if you have a really impressive stack with cool technologies if it doesn't offer anything more than a couple of web servers and a DB server.
Right on. Previous devs at a company I joined wanted to play DevOps cowboys. They used Ansible scripts to spin up various AWS services, costing the company over 100K/yr.
New lead came in, got rid of that crap by using 3rd party services to spin up infrastructure. Got a load balancer, a few VMs + DB. Reduced the cost down by 85% and greatly simplified the entire stack.
I learned a really valuable lesson without having to make that mistake myself.
I understand why people get excited about tooling. It's cool to learn new things and automate stuff away. I'm prone to that myself and do this on my own server when I get that itch.
Having said that, it's wrong to foist this stuff onto an unsuspecting company where the owners don't know any better about tech; that's why they hire other people to handle it for them. Seeing that just left a bad taste in my mouth for overcomplicated setups.
I get that SV is different; that's why tools like K8s are made, and I would jump on those tools in a heartbeat as needed.
But for other smaller businesses, the truth is they just need a boring monolithic load-balanced app with a few VMs and a DB, sprinkled with 3rd party services for logging or searching or other stuff not core to the business.
I know this utterly misses the larger point of your comment, but:
> They used Ansible scripts to spin up various AWS services
This seems less about using the "cool/new" tech... rather it's about using the "right" tech. Config management tools like Ansible/Chef/Puppet are very much previous-generation when it comes to cloud infrastructure.
They... can manage cloud infrastructure, but they were created prior to the ubiquity of cloud deployments, and the features are glued on. Not choosing a more modern IaC framework tells me they (those devs) were going to be making sub-optimal implementation decisions regardless.
Yeah, this project was several years old. Take this with a grain of salt since I'm not familiar with the timeline of k8s, but I would guess that it had not yet risen to the popularity it has in more recent years.
Rocket science isn't hard if you know it. Should we all build spaceships to deliver groceries? Good luck finding a few local rocket scientists in a pinch.
You can find plenty of auto mechanics though. Cars are cheaper and ubiquitous. Maybe they can't drive to the moon, but they can get most things done.
Unless your business is flying to the moon, stick to cars and trucks over spaceships.
I largely agree with the previous poster. If a single brief yaml file is too complicated, you're asking for too much from your tooling. I've seen hideous configuration management systems that did 1/100th of what kubernetes can do. There is some additional complexity around containers which is not a k8s complexity issue, and there are advanced features that most devs will never need to touch. Regardless, I'll take the complexity saved by k8s over custom built tooling or oversimplified PaaS any day.
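For context, the kind of "single brief yaml file" being talked about is roughly the manifest below. The names and image are placeholders; applying it with kubectl gives you a replicated, self-healing deployment.

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: webapp
  spec:
    replicas: 2
    selector:
      matchLabels:
        app: webapp
    template:
      metadata:
        labels:
          app: webapp
      spec:
        containers:
          - name: webapp
            image: registry.example.com/webapp:1.0.0   # placeholder image
            ports:
              - containerPort: 8080

That's the whole interface for a basic stateless service; a Service and Ingress on top of it are similarly small.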
His point is to use whatever simplifies workflow / reduces operational overhead. To some people, that indeed would be k8s. To you, that may be "managing a couple of webservers and a db". And that is great for you.
yeah I use k8s for basic webapps and it works wonderfully, and is way way easier than anything else, and yes I started developing in the 90s so I've seen it all. There is a bit of overhead in learning k8s, but once you know it, it's dead simple for every use case I've found and takes you way further than anything else.
> I know of such a case, where a single engineer could leverage the helm chart open source community, and set up a scalable infrastructure, with prometheus, grafana, worker nodes that can scale independently of web service, a CI/CD pipeline that can spin up complete stacks with TLS automated through nginx and cert-manager, do full integration tests, etc. I found that to be quite impressive, for one person, one year, and would probably be completely impossible if it wasn't for k8s.
But that's the thing though: they didn't do it alone. You literally pointed out that this wasn't true almost immediately: "leverage the helm chart open source community". They used the work of others to get to the result.
Also, I highly doubt they could debug something if it went wrong.
I simply cannot believe anyone would advocate, or believe, that because a Helm chart makes it simple to create a complex piece of infrastructure, it must also be true that maintaining it is simple. It really isn't.
Once you understand the concepts it's not hard to debug. It's fair to acknowledge that kubernetes is complex, but also we should not ignore the real work that has been done in the past few years to make this stuff more accessible to the developers that want to learn it.
Also, saying it's not "alone" in this example I think is not fair. What would you count as "alone"? Setting up the Kubernetes cluster from scratch and writing your own Helm charts? By that same logic, nothing counts as alone, because someone else designed and built the hardware it's running on. I think it's fair to say that if someone, independent of coaching and regardless of the underlying infrastructure, produced production-grade infrastructure by themselves, they certainly did it alone.
Using the Helm charts from the community is arguably still doing it alone. There isn't any back and forth with other people; it's just the right tool for the right job. But this starts being about language and semantics. It's like saying that following best practices on how to configure nginx isn't doing it alone, because someone else wrote them. Helm charts very often just expose what needs to be configured, and otherwise follow those best practices.
As for debugging, you do have a point that it becomes more difficult. But this also holds true for any of the alternatives discussed here (lambdas, terraform). I'd argue that, when it all comes down to it, because you can spin up the entire infrastructure locally on things like minikube, it is many times easier to debug than other cloud-provider-only solutions.
How long did it take him to do this setup? A year, you say, and that is impressive? I am not trying to be cute here; my question comes from a genuine place of curiosity. I'd love to learn to spin up a system like that, but from the tech/sales talks I see, I am made to believe this can be done in a day. Expectation management is important: if people say ops is just a solved problem, then I expect this to take very little time and to be easy to learn. Maybe I am learning the wrong thing here, and should learn Helm or something more high-level.
It took a year, but that was somewhat on the side of also building an OpenAPI-based web service and the gRPC-based workers. So it wasn't just the infrastructure stuff. If I were to estimate how much time went into just the infrastructure and devops tooling, then two months. It's been up and running with less than 15 minutes of downtime over the course of two years.
I do consider this impressive. And, to be clear, I wouldn't say this is because of a "super-developer". In fact, he had no prior k8s experience. Rather, there are thousands upon thousands of infrastructure hours devoted to the Helm charts, often maintained by the people who develop the services themselves. It is almost mind-boggling how much you get for almost free, usually with very good and sensible default configurations.
In my previous workplace, we had a team of 5 good engineers purely devoted to infrastructure, and I honestly believe that all five would have been able to spend their time doing much more valuable things if k8s had existed.
As for whether or not such devops solutions could be done in a day: hm, I don't know. These things should be tailored to the problem. If you've done all of this a few times, then maybe you can adjust a bunch of charts that you are already familiar with and do what took a couple of months and impressed me in a couple of weeks. A lot more goes into architecting a scalable solution than just "helm install, done": implementing monitoring, alerting and logging, load testing, etc.
That seems like a very negative take, in my opinion. This 'simpler operational tech' would still need to be able to scale, correct? If you think that a good and easy way of deploying 10-15 services, all of which can scale, and all of it defined in rather neat code, is anything but "simple operational tech", then I believe you are confusing "solving a complex problem" with "simplifying the requirements of a complex problem", the latter of which has been stripped of many important features. K8s isn't anything magic, but it certainly isn't a bad tool to use. At least not in my experience, though I've heard the horror stories.
That does remind me that when that employee started, the existing "simple operational tech" was in fact to SSH into a VM and kill the process, git pull the latest changes, and start the service.
The only way you can solve the actual problem (not a simplified one) would in my opinion either be k8s or terraform of some kind. The latter would mostly define the resources in the cloud provider system, most of which would map to k8s resources anyways. So, I honestly just consider k8s to better solve what terraform was made for.
I'm sure the "simpler operational tech" meets few requirements for short disaster recovery. Unless you have infrastructure as code, I don't think that is possible.
>That seems like a very negative take, in my opinion. This 'simpler operational tech' would still need to be able to scale, correct?
Premature optimization is a top problem in startup engineering. You have no idea what your startup will scale to.
If you have 1,000 users today and a 5-year goal of 2,000,000 users, then spending a year building infrastructure that can scale to 100,000,000 is an atrociously terrible idea. A good principal engineer can set up a working git hook, CircleCI integration, etc., capable of automated integration testing and rather close to CI/CD, in about a weekend. You can go from an empty repo to serving a web app as a startup in a matter of days. A whole year is just wasteful insanity for a startup.
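For a sense of scale, the "weekend" version is roughly this much CircleCI config; the image tag and commands are illustrative, not a recommendation.

  # .circleci/config.yml
  version: 2.1
  jobs:
    test:
      docker:
        - image: cimg/node:18.17   # illustrative convenience image
      steps:
        - checkout
        - run: npm ci
        - run: npm test
  workflows:
    build-test:
      jobs:
        - test

Add a deploy job gated on the main branch and you have something close enough to CI/CD for an early-stage product.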
The reality for start-ups running on investor money with very specific plans full of OKRs and sales targets is very different: you need to be building product as fast as possible and not giving any fuck about scale. Your business may pivot 5 times before you get to a million users. Your product may be completely different and green-fielded two times before you hit a million users.
I can't imagine any investor being ok with wasting a quarter of a million+ and a year+ on a principal engineer derping around with k8s while the product stagnated and sales had nothing to drive business -- about as useful as burning money in a pit.
You hire that person in the scale-up phase during like the third greenfield to take you from the poorly-performing 2,000,000 user 'grew-out-of-it' stack to that 100,000,000+ stack, and at that point, you are probably hiring a talented devops team and they do it MUCH faster than a year
If you have a website with 1000 users today and product is going to be re-designed 5 times, it's probably best just to use sqlite and host on a single smallish machine. Not all problems are like this however.
Yeah, to be honest, I run a k8s cluster now for my SaaS. But it's about 4 times more expensive than my previous company, which I ran on a VPS.
And scaling is the same: that VPS I could scale just as easily by running a resize in my hosting company's panel. (I don't use autoscaling atm.)
Only if I hit about 100x the numbers would I get the advantage of k8s, but even then I could just split customers across different VPSes.
CI/CD can be done well or badly with both.
And in practice K8s is a lot less stable. Maybe because I'm less experienced with K8s, but also because I think it's more complex.
To be honest, k8s is one of those dev tools that has to reinvent every concept again, so it has its own jargon. And then there are these ever-changing tools on top of it. It reminds me of JS a few years ago.
Any startup that knows what their product is and are done with PoCs, should be able to deal with the consequence of succeeding, without failing. Scaling is one of those things that should be in place before you need it. In our case, scaling was a main concern.
and ... you might be justified in that concern. However... after having been in the web space for 25+ years, it's surprising to me how many people have this as a primary concern ("we gotta scale!") while simultaneously never coming close to having this concern be justified.
I'm not saying it should be an either/or situation, but... I've lost count of how many "can it scale?" discussions I've had where "is it tested?" and "does it work?" almost never cross anyone's lips. One might say "it's assumed it's tested" or "that's a baseline requirement" but there's rarely verification of the tests, nor any effort put in to maintaining the tests as the system evolves.
EDIT: so... when I hear/read "scaling is a main concern" my spidey-sense tingles a bit. It may not be wrong, but it's often not the right questions to be focused on during many of the conversations I have.
> I'm not saying it should be an either/or situation, but... I've lost count of how many "can it scale?" discussions I've had where "is it tested?" and "does it work?" almost never cross anyone's lips.
Also, discussions about rewrites to scale up service capacity, but nobody has actually load tested the current solution to know what it can do.
Just keep it simple, and if you take off scale vertically while you then work on a scalable solution. Since most businesses fail, premature optimisation just means you're wasting time that could have gone on adding more features or performing more tests.
It's a trap many of us fall into - I've done it myself. But next time I'll chuck money at the problem, using whatever services I can buy to get to market as fast as possible to test the idea. Only when it's proven will I go back and rebuild a better product. I'll either run a monolith or 1-2 services on VPSs, or something like Google cloud run or the AWS equivalent.
I should perhaps have clarified, but the 10-15 are not self-maintained services. You need nginx for routing and ingress; you set up cert-manager and the ingress endpoints are automatically configured with TLS; you deploy Prometheus, which comes with node-exporter and alert-manager; you deploy Grafana.
So far, we're up at 6 services, yet still at almost zero developer overhead cost. Then add the SaaS stack for each environment (api, worker, redis) and you're up at 15.
Sometimes it's faster to implement certain features in another language and deploy them as a microservice instead of fighting your primary language/framework to do it. Deploying a microservice in k8s is as easy as writing a single yaml file.
I am not privy to the details of the case, but a rule-of-thumb I heard once is that if it's far enough from your core, a SaaS can be used (obviating the whole question), and if it's part of the core, start by developing it as a separate functionality before moving it to another service.
In a lot of cases it's pattern abuse. I'm dealing with this all the time. People like to split things that can work perfectly as one whole, just for the sake of splitting it.
for example lambda (not microservices, running mini monoliths per lambda function)
yes by simple I mean covering high availability requirements, continuous deployment, good DORA measures - not simple as in half-baked non-functional operations (such as manually sshing to a server to deploy)
Ah, I see. Well, lambdas are also a nice tool to have, but they certainly do not fit all applications (same as with k8s). I'd also point out that lambdas replace a rather small subset of k8s's capabilities and of the types of systems you can put together. You would end up needing to set up the rest either through a terrible AWS UI or terraform, neither of which I find simplifies things all that much, but perhaps this is a matter of taste.
In our case, the workers were both quite heavy in size (around 1 GB), and heavy in number crunching. For this reason alone (and there are plenty more), lambdas would be a poor fit. If you start hacking them to keep them alive because of long cold starts, you would lose me at the simple part.
Having very recently done this (almost, another dev had half time on it) solo, It's not _too_ terrible if you go with a hosted offering. Took about a month/month and a half to really get set up and has been running without much of a blip for about 5 months now. Didn't include things like dynamic/elastic scaling, but did include CD, persistent volumes, and a whole slew of terraform to get the rest of AWS set up (VPCs, RDS, etc). I'd say that it was fairly easy because I tinkered with things in my spare time, so I had a good base to work off of when reading docs and setting things up, so YMMV. My super hot take, if you go hosted and you ignore a ton of the marketing speak on OSS geared towards k8s, you'll probably be a-ok. K8s IME is as complex as you make it. If you layer things in gradually but be very conservative with what you pull in, it'll be fairly straightforward.
My other hot take is to not use Helm but rather something like jsonnet or even cue to generate your yaml. My preference is jsonnet because you can very easily make a nice OO interface for the yaml schemas with it. Helm's approach to templating makes for a bit of a mess to try and read, and the values.yml files _really_ leak the details.
With 1YoE I did most of that in about 3 months. Had a deadline of 6 months to get something functional to demonstrate the proposed new direction of the company, and I did just that. If I were to do it today I could probably rush it to a week, but that would mean no progress on the backend development that I was doing in parallel. A day is probably doable with more on-rails/ batteries included approaches.
Not because I'm amazing, but there's a frankly ridiculous amount of information out there, and good chunks of it are high quality too. I think I started the job early January, and by April I had CI/CD, K8s for backend/frontend/DBs, Nginx (server and k8s cluster), auto-renewing certs, Sentry monitoring, Slack alerts for ops issues, K8s node rollback on failures, etc.
The best way to learn, is to do. Cliche, but that's what it really comes down to. There's a fair few new concepts to grasp, and you probably have picked some of these up almost by osmosis. It sounds more overwhelming than it is, truly.
The problem is never spinning things up, it's in maintenance and ops. K8s brings tons of complexity. I wouldn't use it without thinking very carefully for anything other than a very complex startup while you're finding product-market fit.
You can get a majority of those things "running" in few days. If you don't want it to fall over every other day, then you need to have a ton of ancillaries which will take at least several months to set up, not to mention taking care of securing it.
Use a managed k8s cluster (EKS, AKS or GKE). Creating a production-ready k8s on VMs or bare metal can be time consuming. Yes, you can do lambda, serverless, etc., but k8s gives you the same thing and is generally cheaper.
It's actually pretty easy to do these days, even on bare metal servers. My go to setup for a small bare metal k8s cluster:
- initial node setup: networking configuration (private and public network), sshd setup (disallow password login), setting up docker, prepping an NFS share accessible on every node via the private network
- install RKE and deploy the cluster, deploy the nginx ingress controller (a minimal cluster.yml sketch follows below)
- (optional) install Rancher to get the rest of the goodies (grafana, istio, etc.). These eat a lot of resources though, so I usually don't do this for small clusters
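For the RKE step, the cluster definition itself is tiny, something like the sketch below; addresses and user are placeholders and all the optional knobs are left at their defaults.

  # cluster.yml (minimal sketch)
  nodes:
    - address: 10.0.0.1
      user: ubuntu
      role: [controlplane, etcd, worker]
    - address: 10.0.0.2
      user: ubuntu
      role: [worker]
  ingress:
    provider: nginx

  # then, from the same directory:
  #   rke up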
I agree with that. Setting up k8s on bare metal took me 2 days, and we needed it to deploy Elastic and some other helm charts as quickly as possible without losing our minds maintaining nodes with some clunky shell scripts.
It also immediately bought us an easy approach to building GitLab CI/CD pipelines + different environments (dev, staging, production) on the same cluster. It took me a week to set everything up completely, and it has saved our rapidly developing team a lot of time and headache since then. But the point is, I knew how to do it, focused on the essentials and delivered quick, reasonable results with large leverage down the road.
> deploy elastic and some other helm charts as quickly as possible
Bad culture alert! No one needs Elastic "as quickly as possible" unless their business, or the business they work for, is being very poorly run.
I would also argue that you might have got it running quickly, but how are you patching it? Maintaining it? Securing it? Backing it up? Have you got a full D/R plan in place? Can you bring it back to life if I delete everything within 3-6 hours? Doubt it.
> maintaining nodes with some clunky shell scripts.
Puppet Bolt, Ansible, Chef, ...
There are so many tools that are easy to understand that solve this issue.
That's all solved for you: helm upgrade in CI/CD and bumping versions has been straightforward, and if not, there's snapshot rollback via Longhorn, which also covers DR. Accidentally deleted data => restore the last snapshot and it's back in 5 minutes (and of course there is CI/CD in place for new code, devs have no write permissions on the cluster, and "sudden" data deletion is somewhat rare).
The Elastic use case is crawling a crazy amount of data and making it searchable, aggregatable and historically available; I don't know of any solution other than Elastic that has reasonable response times and easy-to-use access (plus we can add some application logging and APM).
> Puppet Bolt, Ansible, Chef, ...
Helm chart values.yaml and you’re all set, security + easy version bump included.
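To be fair, what that looks like in practice is a few lines in a chart's values.yaml. The keys below are generic and hypothetical, since every chart defines its own schema; the point is that version bumps and hardening options live in one small file.

  # values.yaml for some hypothetical chart
  image:
    tag: "8.5.1"            # version bumps happen here
  replicas: 3
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
  securityContext:
    runAsNonRoot: true      # the kind of hardening option charts commonly expose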
I believe Elastic is available as a service from AWS and elastic.co; if you need it fast, use that. If you need it long term, it may be worthwhile to deploy your own for cost and flexibility purposes.
> with prometheus, grafana, worker nodes that can scale independently of web service, a CI/CD pipeline that can spin up complete stacks with TLS automated through nginx and cert-manager, do full integration tests, etc.
> I found that to be quite impressive, for one person, one year, and would probably be completely impossible if it wasn't for k8s.
I've always found this interesting about web-based development. I have no idea what Prometheus, Grafana, etc. do. I've never used k8s.
And yet, as a solo dev, I've written auto-scaling architecture using, for example, the AWS EC2 APIs that let you launch, configure and shut down instances. I don't know what else you need.
Really, the only advantage I see to this morass of services is that you get a common language, so other devs can have a slightly easier time picking up where someone left off. As long as they know all the latest bs.
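For what it's worth, the EC2-API version of "launch and shut down instances" boils down to calls like these; the AMI, instance IDs, and bootstrap script are placeholders.

  # launch an extra worker when load is high
  aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type t3.medium \
    --count 1 \
    --user-data file://bootstrap.sh

  # shut it down again when load drops
  aws ec2 terminate-instances --instance-ids i-0123456789abcdef0

The part k8s (or an autoscaling group) gives you on top is deciding when to make those calls and keeping the desired count healthy.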
In short: Prometheus is a worker that knows about all your other services. Each service of interest can expose an endpoint that Prometheus scrapes periodically. So the services just say what their current state is, and Prometheus, since it keeps asking, knows and stores what happens over time. Grafana is a web service that uses Prometheus as a data source and can visualize it very nicely.
Prometheus also comes with an Alertmanager, where you can set up rules for when to trigger an alert, which can end up as an email or a Slack notification.
They are all very useful, and give much-needed insight into how things are going.
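To make that concrete, an alert is just a small rule file that Prometheus evaluates; the metric name and threshold below are made up for illustration.

  groups:
    - name: example-alerts
      rules:
        - alert: HighErrorRate
          # made-up metric and threshold: 5xx ratio above 5%
          expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
          for: 10m
          labels:
            severity: page
          annotations:
            summary: "5xx error ratio above 5% for 10 minutes"

Alertmanager then decides where that alert goes (email, Slack, PagerDuty, etc.), and Grafana can graph the same expression.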
> And yet, as a solo dev, I've written auto-scaling architecture using, for example, the AWS ec2 apis that let you launch configure and shutdown instances. I don't know what else you need.
This is fine, if you’re on AWS and can use AWS APIs. If you’re not (especially if you’re on bare metal), something like K8s can be nice.
I don't know the AWS EC2 APIs, and I'm certainly not capable of writing auto-scaling architecture. This is the reason why I default to K8s. I have used it easily and successfully by myself for the last 4 years. It just keeps on running and hasn't given me any problems.
I've seen places hire a dev to write all the ops stuff so they could scale awesomely... I mean, if they had just purchased 100 servers full time on Amazon, they would have spent a fraction of the cost of that scaling setup, but hey, they could scale.
In 5 years I think they've never once had to come even near the 100 servers.
At the same time. I can scale heroku to 500 servers, and still be under the cost of one ops person. I can make that change and leave it there. I can do that all in under 30 seconds. Oh. And CICD is built in as a github hook. Even with blue-green deploys.
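For the curious, that 30-second scale change really is a single CLI command; the app name is a placeholder.

  heroku ps:scale web=500 --app your-app-name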
I think his point was most start-ups don't need to scale more than a site like heroku can offer. If you need more than 500 servers running full time then it's time to start looking to "scale"
> At the same time. I can scale heroku to 500 servers, and still be under the cost of one ops person. I can make that change and leave it there. I can do that all in under 30 seconds. Oh. And CICD is built in as a github hook. Even with blue-green deploys.
And then Heroku shuts down.
If you're building something that needs to scale up rapidly if it succeeds, k8s is worth thinking about. Either you don't succeed, in which case it doesn't matter what your stack was, or you do, in which case you'll be glad that you can scale up easily, you'll be glad you are using a common platform which is easy to hire competent people in, and, if you were smart about how you used k8s, you'll be glad that you can relatively easily move between clouds or move to bare metal.
I think the set of cases where "we need to scale up rapidly if it succeeds" and "Kubernetes solves all of our scaling needs and we aren't going to have problems with other components" is almost empty. On the other hand, there are quite a lot of startups that fail because they put too much focus on the infrastructure and Kubernetes and the future and too little on the actual product for the users. Which is the point of the article, I think. Ultimately what matters is whether you sell your product or not.
> I think the set of cases where "we need to scale up rapidly if it succeeds" and "Kubernetes solves all of our scaling needs and we aren't going to have problems with other components" is almost empty.
I agree, but so what? K8s isn't magic, it won't make all your problems go away, but if you have people who are genuinely skilled with it, it solves a lot of problems and generally makes scaling (especially if you need to move between clouds or move onto bare metal) much smoother. Of course you'll still have other problems to solve.
Given that most startups never need to scale up much, it's not surprising that k8s is mostly used where it's not needed. But people usually prefer not to plan for failure, so it's also not surprising that people keep using it.
I mean, you still have to invest time on putting k8s to work, get people skilled with it, maintain and debug the problems... If Kubernetes didn't cost anything to deploy I'd agree that using it is the better idea, but it costs time and people, and those things might be better invested in features that matter to the users.
It depends. There are many things that carry a cost early but pay for themselves many times over later. Whether that will be the case for your startup depends whether you end up needing to scale quickly or not.
It's also worth considering that appropriate use of k8s can quite likely save you time and money early on as well. It standardises things, making it very easy for new ops people to onboard, and you might otherwise end up spending time reinventing half-baked solutions to orchestration problems anyway.
> It depends. There are many things that carry a cost early but pay for themselves many times over later. Whether that will be the case for your startup depends whether you end up needing to scale quickly or not.
Well, precisely what I said is that 99.9% of startups won't find themselves in a situation where they need to scale quickly and the only scale problems they find can be solved with Kubernetes.
> It's also worth considering that appropriate use of k8s can quite likely save you time and money early on as well. It standardises things, making it very easy for new ops people to onboard, and you might otherwise end up spending time reinventing half-baked solutions to orchestration problems anyway.
The point is that you might not even need orchestration from the start. Instead of thinking how to solve an imagined scenario where you don't even know the constraints, go simple and iterate from that when you need it with the actual requirements in hand. And also, "make it easier for new ops people to onboard" doesn't matter if you don't have a viable product to support new hires.
You seem to be describing very early stage companies, and if so I agree, host it on your laptop if you need to, it makes zero difference. But it's not binary with Netflix on one side and early stage on the other.
There are a lot of companies in the middle, and following dogma like "you don't need k8s" leads them to reinvent the wheel, usually badly, and consequently waste enormous amounts of time and money as they grow.
Knowing when is the right time to think about architecture is a skill; dogmatic "never do it" or "always do it" helps nobody.
What about CD of similar but not identical collections of services to metal? There's no scaling problem, other than the number of bare metal systems growing, and potentially the variety of service collections. For instance, would you recommend k8s to Tesla for the CD of software to their cars?
Meanwhile, random_pop_non-tech_website exploding in traffic wasn't set up to scale despite years actively seeking said popularity through virtually any means and spending top dollar on hosting, and it slows down to a crawl.
"Why no k8s?", you ask, only to be met with incredulity: "We don't have those skills", says the profiteering web agency. Sure, k8s is hard… Not. Nevermind that it's pretty much the only important part of your job as of 2022.
Obviously not, I was just pointing out that infra like k8s even under-the-hood for intermediaries (like web agencies) is still not always the norm given the real-world failures. There's this intermediary world between startups and giant corporations, you know. ;-)
>infra like k8s even under-the-hood for intermediaries (like web agencies) is still not always the norm
That's because 'the norm' for web agencies is a site that does basically zero traffic. If a company hires a 'web agency' that's by definition because the company's business model does not revolve around either a web property or app.
Whether that's a gas station company or a charity or whatever, the website is not key to their business success and won't be used by most customers apart from incidentally.
With that in mind most agencies know only how to implement a CMS and simple deployment perhaps using Cloudflare or a similar automated traffic handling system. They don't know anything about actual infrastructure that's capable of handling traffic, and why would they?
A lot of agencies are 100% nontechnical (i.e. purely designers) and use a MSP to configure their desktop environment and backups and a hosting agency to manage their deployed sites.
I very much agree with you. I must have been unnecessarily critical in my initial comment, I did not mean it as a rant, more like an observation about where-we're-at towards what seems an inevitable conclusion to me. Sorry that came out wrong, clearly I got carried away.
In asking if "Kubernetes is a red flag signalling premature optimisation", you correctly explain why we're yet on the "yes" side for the typical web agency category.
[Although FWIW I was hinting at a non-trivial category who should know better than not to setup a scale-ready infra for some potentially explosive clients; which is what we do in the entertainment industry for instance, by pooling resources (strong hint that k8s fits): we may not know which site will eventually be a massive hit, but we know x% of them will be, because we assess from the global demand side which is very predictable YoY. It's pretty much the same thing for all few-hits-but-big-hits industries (adjust for ad hoc cycles), and yes gov websites are typically part of those (you never know when a big head shares some domain that's going to get 1000x more hits over the next few days/weeks), it's unthinkable they're not designed to scale properly. Anyway, I'm ranting now ^^; ]
My unspoken contention was that eventually, we move to a world where k8s-like infra is the de facto norm for 99% of infrastructure out there, and on that road we move to the "no" side of the initial question for e.g. web agencies (meaning, we've moved one notch comparable to the move from old-school SysAdmin to DevOps maybe, you know those 10 years circa 2007-2018 or so).
[Sorry for a too-terse initial comment, I try not to be needlessly verbose on HN.]
>My unspoken contention was that eventually, we move to a world where k8s-like infra is the de facto norm for 99% of infrastructure out there, and on that road we move to the "no" side of the initial question for e.g. web agencies (meaning, we've moved one notch comparable to the move from old-school SysAdmin to DevOps maybe, you know those 10 years circa 2007-2018 or so).
This is very very hard to parse BTW. I don't want to reply to what you've written because I can't determine for sure what it is that you're saying.
Essentially I mean: scalable infra may be premature optimization today in a lot of cases, but eventually it becomes the norm for pretty much all systems.
You could similarly parse the early signs of a "devops" paradigm in the mid-2000's. I sure did see the inception of the paradigm we eventually reached by 2018 or so. Most of it would have been premature optimization back then, but ten-ish years later the landscape has changed such that a devops culture fits in many (most?) organizations. Devops being just one example of such historical shifts.
I anticipate the general k8s-like paradigm (generic abstractions on the dev side, a full 'DSL' so to speak, scalable on the ops side) will be a fit for many (most?) organizations by 2030 or so.
> Either you don't succeed, in which case it doesn't matter what your stack was, or you do, in which case you'll be glad that you can scale up easily
This take brushes right past the causes of success and failure. Early stage success depends on relentless focus on the right things. There will be 1000 things you could do for every 1 that you should do. Early on this is going to tend to be product-market fit stuff. If things are going very well then scalability could become a concern, but it would be a huge red flag for me as an investor if an early stage company was focusing on multi-cloud.
I certainly wouldn't recommend that anyone "focus on multi-cloud" in an early-stage company (unless of course multi-cloud is a crucial part of their product in some way).
Kubernetes is basically an industry standard at this point. It's easy to hire ops people competent in it, and if you do hire competent people, it will save you time and money even while you are small. As an investor "we use this product for orchestration rather than trying to roll our own solutions to the same problems, so that we can focus on $PRODUCT rather than reinventing half-baked solutions to mundane ops problems" should be music to your ears.
I agree with all of that. That said, I don't think competence is a binary proposition, and if you hire people who have only worked at scale they will be calibrated very differently to the question of what is table stakes. One of the critical components of competence for early stage tech leadership is a keen sense of overhead and what is good enough to ratchet up to the next milestone.
As many problems as containerization solves, it's not without significant overhead. Personally I'm not convinced the value is there unless you have multiple services which might not be the case for a long time. You can get huge mileage out of RDS + ELB/EC2 using a thinner tooling stack like Terraform + Ansible.
The overhead of containerisation is mostly in the learning curve for teams that are not already familiar with it (and the consequent risk of a poor implementation). A well designed build pipeline and deployment is at least as efficient to work with as your Terraform+Ansible.
If you have such a team, it can of course make sense to delay or avoid containerisation if you don't see obvious major technical benefits.
But those teams will get rarer as time goes on, and since we're talking about startups, honestly it would be questionable to build a new ops team from people with no containers knowledge in 2022.
Success is rarely so rapid that you can't throw money at a problem temporarily and build something more robust.
No one is advocating for a single server running in your closet, but a large and well funded PaaS can handle any realistic amount of growth at least temporarily, and something like Heroku is big enough (and more importantly, owned by a company big enough) that shutting down without notice is not a possibility worth considering.
Almost every k8s project I've looked at in the last few years is database bound. k8s is not really going to solve their scaling needs. They needed to plan more up front about what their application needed to look like in order to avoid that.
Yes, if your application looks like a web application that is cache friendly, k8s can really take you a long way.
In case it's not clear, nothing in my comment suggests that k8s will magically solve all your problems. It just provides abstractions that make growth (in size and complexity) of systems easier to manage, and helps to avoid lock-in to a single cloud vendor. The wider point is that thinking about architecture early will make scaling easier, and for most companies, k8s is likely to end up being a part of that.
The "web application" / cache-friendly part of your comment doesn't make much sense to me; k8s is pretty well agnostic to those kinds of details. You can orchestrate a database-bound system just as well as you can anything else, of course.
> if you were smart about how you used k8s, you'll be glad that you can relatively easily move between clouds or move to bare metal.
I'd argue you should definitely consider a multi-cloud strategy from the get-go in 2022. Something like Terraform helps with statically setting up k8s clusters on most clouds. Especially for startups, it's better to default to vanilla stuff and only complicate on a need-to basis.
Yes, completely agreed. Multi-cloud is really not that difficult nowadays, and it puts you in a better negotiating position (when you end up spending enough to be able to negotiate), as well as giving you more location flexibility and the ability to pick and choose the best services from each cloud.
Oh yes, negotiation is a strong argument in that context. One that makes or breaks a CTO's mission, me thinks, if that company expects a lean path to ROI.
A multi-cloud paradigm is also a great way to teach you about your application and about those clouds themselves. A good reminder that "implementation is where it's at", and "the devil is in the details".
The fact that they purchased 100 nodes has nothing to do with k8s but with their incompetence. You can run it on one machine. Also, you can easily set up auto-scaling based on whichever parameters you choose.
That isn't the point. If he had a whole year, was there a tangibly better use of his time to get a product to market faster? What might the business implications be for doing or not doing so?
It seems many are focused on the time estimate. That was for creating the overall solution. About two months of it went into setting up the infrastructure mentioned.
These things often get developed side by side: GitLab, unit tests, api-server, nginx, cert-manager, deployments, integration tests, prometheus, metrics in services, grafana, alert-manager, log consolidation, worker services and scaling, etc.
Just spinning up a cluster, nodepool, nginx, cert-manager with a Let's Encrypt cluster issuer, prometheus and grafana can easily be done in a day. So time estimates depend entirely on what you mean by them.
Spinning up prometheus and grafana with automatic service discovery: one day.
Making good metrics, visualizations, and alerts: anything from a week to a month or two. So, take the time estimates with a grain of salt.
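And the cert-manager/Let's Encrypt cluster issuer mentioned above is itself just one small manifest, roughly as below; the email is a placeholder and the http01/nginx solver is the common default rather than anything project-specific.

  apiVersion: cert-manager.io/v1
  kind: ClusterIssuer
  metadata:
    name: letsencrypt-prod
  spec:
    acme:
      server: https://acme-v02.api.letsencrypt.org/directory
      email: ops@example.com          # placeholder
      privateKeySecretRef:
        name: letsencrypt-prod-account-key
      solvers:
        - http01:
            ingress:
              class: nginx

Once that exists, annotated Ingress resources get certificates issued and renewed automatically.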
As a rule, for anything in a startup, adding that "scalable" adjective is a waste.
Of course, exceptions exist, but it's one of those problems that if you have them, you probably know it, and if you have them and don't know it, you won't succeed anyway. So any useful advice has the form of "don't".
In a typical startup you are going to be strapped for cash and time. You need to optimize and compromise. In general, if you don't know how do something, then figure out if you need to learn it, or whether you can get by with a "good enough" solution, because there will be a queue of other things you need to do that might be more business-critical.
So if you already know Kubernetes, then great, use it. Leverage your existing skills. If you don't, then just use Heroku or fly.io or whatever, or go with AWS if that's your competence. Maybe revisit in a year or two and maybe then you'll have funding to hire a devops person or time to spend a week or two learning to do it yourself. Right now you want to get your SAAS MVP in front of customers and focus on the inevitable product churn of building something people want to pay for. The same advice goes for anything else in your stack. Do you know it well enough? Do you need it right now? Or is there a "good enough" alternative you can use instead?
And yet, to me it sounds like NIH since it's a pretty standard stack; couldn't they just get something like google app engine and get all of that from day one? Because did any of those things mentioned result in a more successful company?
I'd argue that using helm charts is the exact opposite of NIH. The things that take time are not the stack themselves, but the software and solutions. K8s just makes the stack defined in code, and written and managed by dedicated people (helm maintainers) as opposed to a bit "all over the place" and otherwise in-house, directly using cloud provider lock-in resources.
I'm sure there are plenty of use cases where that makes sense, and is a better approach. But, I disagree that k8s suggests a NIH-mindset.
Most startups get basic security for networking and compute wrong, K8s just adds even more things to mess up. Odds are even if you use an out of the box solution, unless you have prior experience you will get it wrong.
I will always recommend using whatever container / function-as-a-service offering, e.g. ECS, GCF, or Lambda, any day over K8s for a startup. With these services it's back to more familiar models of security, such as networking rules, dependency scanning, and authorization and access...
So question then - is it possible to found a tech startup without paying rent to a FAANG? Before I get the answer that anything is possible, I should say is it feasible or advisable to start a company without paying rent to the big guys?
The reality is unless you’re some rich dude who can borrow dad’s datacenter (And that’s cool if so), you’re either going to be renting colo space, virtual servers, etc.
It’s always a challenge in business to avoid the trap of spending dollars to save pennies.
IMO, you’re better off working in AWS/GCP/Azure and engineering around the strengths of those platforms. That’s all about team and engineering discipline. I’m not in the startup world, but I’ve seen people lift and shift on-prem architecture and business process to cloud and set money on fire. Likewise, I’ve seen systems that reduced 5 year TCO by 80% by building to the platform strengths.
I'm aware that no man is an island in some sense, but I'm not comfortable with locking myself into one of 3 companies who need to increase their revenue by double digits year over year. And as you say, a lift and shift is basically setting money on fire. Currently I run sort of a hybrid approach with a small IaaS provider and a colo. It seems to work well for us both technically and financially though that seems to go contrary to what is considered conventional wisdom these days.
That’s awesome. The most important thing is to understand why you’re making the decisions that you do.
Where I work, we can deliver most services cheaper on-prem due to our relative scale and cloud margins. But… we're finding that vendors in the hardware space struggle to meet their SLAs. (Look at HPE — they literally sold off their field services teams and only have 1-2 engineers covering huge geographic regions.) So increasingly, critical workloads make the most sense in the cloud.
If your priorities are 'which companies do my values align with, among generally very high integrity companies to begin with', then you might want to reconsider.
Google is not evil. They're just big, and define some practices which we might think should change in the future.
Once you have the thing up and running, you can start to think about hosting your own.
Also, you don't need to use fancy services because most startups can run just fine on a single instance of whatever, meaning, there are a lot of cloud providers out there.
If and only if your business model depends on it. A startup's job is mostly to find product market fit; if being decoupled from AWS isn't part of your market, you are spending money on a non-problem.
There is nothing stopping you from hosting your own OpenStack, managed k8s, and all that, on your own hardware. You would need a good reason to not let someone else deal with all of this though.
For a small enough company you could even just use k3s + offsite backups. Once you grow large enough, you can set up machines in 2-4 locations across the land mass where your users exist. If you have enough of them, a hardware fault in one isn't an emergency, and you'd be able to fly out to fix things if needed.
Realistically, on all flash, you are very unlikely to need to maintain anything on a server for a few years after deployment.
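The k3s route really is the documented one-liner per node, plus joining agents with the server's token; the hostname and token below are placeholders.

  # on the first (server) node
  curl -sfL https://get.k3s.io | sh -

  # on each additional (agent) node
  curl -sfL https://get.k3s.io | K3S_URL=https://server-1.example.com:6443 K3S_TOKEN=<node-token> sh -

From there it behaves like any other small cluster, and offsite backups of etcd/state plus your data volumes cover the disaster case.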
That is probably a good idea for many startups. However, once you get into the world of audits and compliance certifications, things become a lot harder. But then again, at that point, I suppose it is easy enough to transition to some managed hardware.
Agree the author is wrong on that specific point, though thankfully the bulk of the article content deals with the headline, and is mostly fine wrt k8s.
Rather than the author "not knowing" what they're talking about, I suspect they're taking narrow experience and generalising it to the entire industry. Their background is selling k8s as a solution to small/medium enterprises: it strikes me that there may be a strong correlation between startups interested in that offering and those deploying failed overengineered multilang micro-architectures. Suspect the author has seen their fair share of bad multilang stacks and not a lot of counter examples.
The whole advice of using the same language is especially silly - iOS is stuck with Swift, and the web is stuck with JS, and maybe you need an application that scales using actors across multiple machines with Golang or Java, or maybe you need to plug into Windows tightly and need C#.
Kubernetes is not 'harder' if all you need is to host a webapp. Where it falls on the hardness spectrum depends on what you are trying to do, and what is the alternative. I am very fluent with Kubernetes but have no skills in managing traditional virtual machines.
> The whole advice of using the same language is especially silly - iOS is stuck with Swift, and the web is stuck with JS, and maybe you need an application that scales using actors across multiple machines with Golang or Java, or maybe you need to plug into Windows tightly and need C#
And you're also forgetting Android and macOS and Linux.
That's why cross-platform frameworks like Electron and React Native are so popular. The time wasted in going native for every single platform is just infeasible for most non-huge companies.
But you could also have 2 people working on React Native and have 1 person each for getting it to play nice with iOS/Android, and eliminate the need for an extra engineer.
Well, if React native is anything like the many react websites, then this isn't too far off actually. "modern" websites can already send your CPU puffing, when you hover over some element with your mouse pointer and it triggers some JS emulated style for :hover.
tss.. some people don't like being reminded that their favourite tech performs worse on an Nvidia 3090 than WinForms did on an 800 MHz CPU running Windows 98
Choosing more than one language as a startup can become really expensive quickly. As long as your tribes are small, chances are high that you one day run out of, e.g., Python developers while you still have a lot of Java guys (or vice versa). This introduces unnecessary pain. (And obviously, you should have used Rust or Haskell from the get-go for everything.)
The sole exception to this rule I would make is javascript which is more or less required for frontend stuff and should be avoided like the plague for any other development. As soon as you can get your frontend done in Rust, though, you should also switch.
Idk, I am someone who has looked at many programming languages, including all of those you mentioned. But a capable developer can be expected to learn a new language over the course of a few weeks if needed. I don't see how you could "run out of devs of language x" if you have capable devs on board, especially when those languages are all in the same programming language family/club.
Even the most capable developer that learns a new language in a few weeks will not be an expert in it. The difference in productivity and quality of the code will be huge. This is because in different languages things can be done very differently, it is not about the syntax as much as the best ways to do things.
I also thought WhatsApp was a bad example. Not only did they host everything themselves, they used solely FreeBSD (as far as I know) on their servers (which, don't get me wrong, I find great as a FreeBSD sysadmin myself).
Using WhatsApp as an example of a lean engineering org should almost be banned at this point. WhatsApp had a high performing engineering team that used basically the perfect set of tools to build their application (which also had a narrow feature scope; plaintext messaging). Even with hindsight there is very little you could do to improve on how they executed.
Just because WhatsApp scaled to almost half a billion users with a small engineering team doesn't mean that's the standard, or even achievable, for almost all teams.
>Doesn't look like the author knows what he is talking about.
This was my first thought too, and I was about to comment so, but saw you already did. The only reason we see this comment is because HN has an irrational hatred of K8s; for those of us who do run things in production at scale, k8s is the best option. The rest is either wrapped in licenses or lacks basic functionality.
I suspect a lot of the gripes and grousing about Kubernetes comes from SMEs trying to run it themselves. That will often result in pain and cost.
Kubernetes is a perfectly good platform for any size of operation, but until you are a large org, just use a managed service from Google/Amazon/DigitalOcean/whoever. Kubernetes, the data plane, is really no more complex than, e.g., Docker Compose, and with managed services the control plane won't bother you.
K8s allows composability of apps/services/authentication/monitoring/logging/etc in a standardised way, much more so than any roll-your-own or 3rd-party alternative IMO, the OSS ecosystem around it is large and getting larger, and the "StackOverflowability" is strong too (ie you can look up the answers to most questions easily).
So, TLDR, just use a managed K8s until you properly need your own cluster.
Exactly! I chose k8s managed by Google years ago. As a solo developer you can have a cluster up and running with a web server, fronted by a Google LB with Google managed certificates in under an hour. Just follow one of the hundred tutorials. Just as quick and easy as setting up a single VM. But that really isn't the point is it. If that is all I needed, yes I'd use a $10 VPS. But for an "application", not a web site, you always need more.
My k8s "cluster" is a single node that doesn't cost me any more than a VM would. I don't need the scalability at the moment. But I do need different environments for dev, qa, and prod. And all three are running identically next to each other using Namespaces. Saved us a ton of maintenance and cost.
Any project that grows has its needs change. GKE gives you a ton of integrated tools right from the start including logging, alerting, metrics, easy access to hosted databases, pub/sub, object storage, easy and automatic network setup, easy firewall setup, dns management, and a lot more. k8s is no different than using any other hosted service. It provides a great set of features that you configure using fairly consistent yaml configuration files. And it is all accessible from the web based "Google Console" as well.
Learning the k8s yaml format and some basic kubectl commands is all you need to get going and it saves a TON of time that can go back into developing your application rather than dealing with configuring disparate pieces with their own configuration methods.
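For instance, the namespace-per-environment setup mentioned above is just a handful of kubectl commands; the manifest directory is a placeholder.

  kubectl create namespace dev
  kubectl create namespace qa
  kubectl create namespace prod

  # same manifests, different namespace per environment
  kubectl apply -f k8s/ -n dev
  kubectl apply -f k8s/ -n prod

Per-environment differences (hostnames, resource sizes) then live in a small overlay or values file rather than in separate infrastructure.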
I was fairly early to k8s while they were still competing with other similar solutions and other tools like Puppet and Chef. I tested all of them and truthfully, k8s was the easiest to learn, implement, and maintain my app with. Using GKE of course. I would NEVER as a one man or even small team of developers take on managing an installation of k8s myself.
> I think the most appropriate advice is to choose a stack which the founding team is most familiar with. If that means RoR then RoR is fine. If it means PHP then PHP is fine too.
Taking human resource information into consideration sounds very wise. Although learning a new language is generally not that huge a barrier, changing your whole stack once the minimum-viable-product cap is passed can be very expensive. And if you need to scale the team, the available developer pool is not the same depending on which technology you have to stick with.
It doesn’t invalidate your point, but maybe it brings some relevant nuances.
> But the next advice about not using a different language for frontend and backend is wrong.
Being charitable, what I think they are getting at is more about having fully separated frontend and backend applications (since the front-end examples he gives are not languages but frameworks/libraries). Otherwise it seems really backwards - I'm definitely an advocate of not always needing SPA-type libraries, but using literally zero JavaScript unless your backend is also JS seems like taking it to an extreme.
Re: single language, there's a grain of truth to it - see http://boringtechnology.club/ - but that one mainly says there is a cost to adding more and more languages. When it comes to back- and frontend though, I would argue there is a cost to forcing the use of a single language. e.g. NodeJS is suboptimal, and web based mobile apps are always kinda bleh.
"I think the most appropriate advice is to choose a stack which the founding team is most familiar with."
I'd think that's exactly what happens most of the time. But the degree of stack lock-in that occurs with startups still surprises me, even when it's clear a better choice could have been made. Mostly it's due to management not being prepared to grant the necessary rewrite time.
sounds like it just boils down to: try to choose the technology your team is familiar with, not what other teams are successful with
Of course there's some balance needed. If your team is familiar with some niche language then long term that might not be a good strategy if you intend to bring more devs on board later.
One side of this which I don't think is discussed often is the fun of choosing new technology. How do you balance having fun and being realistic at the same time?
Fun meaning trying new technology, learning as you go, setting up systems that make you feel proud, etc. It can lead to failure, but I think having fun is important too.
I like to call what the author is referring to "What-If Engineering". It's the science of thinking you'll be Google next week, so you build for that level of scale today. It involves picking extremely complicated, expensive (both in compute and skilled labour) technologies to deploy a Rails app that has two features. And it all boils down to "but what if..." pre-optimising.
It happens at all levels.
At the individual unit level: "I'll make these four lines of code a function in case I need to call it more than once later on - you know, what if that's needed?"
It also happens at the database level: "What if we need to change the schema later on? Do we really want to be restricted to a hard schema in MySQL? Let's use MongoDB".
What's even worse is that Helm and the like make it possible to spin up these kinds of solutions in a heartbeat. And, as witnessed and evidenced by several comments below, developers think that's that... all done. It's a perfect solution. It won't fail because K8s will manage it. Oh boy.
Start with a monolith on two VMs and a load balancer. Chips and networks are cheaper than labour, and right now, anyone with K8s experience is demanding $150k + 10% Superannuation here in Australia... minimum!
I've told this story before on HN, but a recent client of mine was on Kubernetes. He had hired an engineer like 5 years ago to build out his backend, and the guy set up about 60 different services to run a 2-3 page note taking web app. Absolute madness.
I couldn't help but rewrite the entire thing, and now it's a single 8K SLOC server in App Engine, down from about 70K SLOC.
My most recent job, and the job before that and the job before that all have one thing in common:
Someone convinced the right person to “put stuff on kubernetes” and then booked it for another job/greener pastures with barely any production services, or networking configured.
Thus an opening for a new SRE and once again I find myself looking at a five headed monster of confusing ingress rules, unnamed/unlabeled pods, unclaimed volume specs…
absolutely nuts to think what that would look like. Is he building services to abstract out English grammar? Have one service called VerbManager that returns a boolean if a given word is a verb, have another one called AdjectiveManager that does the same, and so on.
>App Engine have actually been around for a long time
That means very little, I hope you realize. Reader, Voice, Chat, etc.[0] were all around a long time.
>Also, there's no inherent lock-in
GAE has plenty of proprietary APIs you can depend on. Whether or not you do is up to the programmer.
0 - A comment below notes that Voice and Chat aren't deprecated yet. Voice was announced as deprecated, and Google has had so many chat apps I'm not sure which ones are gone. Anyway, here is a more complete list of things Google has abandoned: https://killedbygoogle.com/
> It's the science of thinking you'll be Google next week,
There's other reasons to use K8s than just thinking of massive scale.
Setting up environments becomes a massive PITA when working directly with VMs. The end result is custom scripts, which are a messier version of Terraform, which in turn ends up being messier than just writing a couple of manifest files for a managed k8s.
> anyone with K8s experience is demanding $150k + 10% Superannuation here in Australia... minimum!
sheds a tear for CAD salaries and poor career decisions
> Setting up environments becomes a massive PITA when working directly with VMs. The end result is custom scripts, which are a messier version of Terraform, which in turn ends up being messier than just writing a couple of manifest files for a managed k8s.
The author advocates using a high-level PaaS. For sure, working directly with VMs is the wrong answer to premature optimization, but as an early-stage startup there are plenty of services around that you basically just need to point your Git repo at, and you'll have a reasonable infrastructure set up for you.
OP here, I agree. You can get super far with a service like fly.io/Heroku/Netlify/Vercel etc. (pick the one that works with your stack). VMs are an anti-pattern as well for the early stages of an application or startup.
I find those solutions end up being harder once you have to integrate something new you didn't know you needed. And the cost, at least for something like Heroku, is MORE than using something like GKE, where I don't need a new dyno for every service. I consider GKE (and DO's and AWS's equivalent k8s solutions) to be on the same platform level as fly.io/Heroku/etc.
> Setting up environments becomes a massive PITA when working directly with VMs.
I guess I've just never found this to be true.
My main goal when engineering a solution is always, "Keep It Super Simple (KISS), so that a junior engineer can maintain and evolve it."
Working directly with operating systems, VMs, networking, etc. (purely in the cloud - never on-prem... come on, it's 2022!) is the simplest form of engineering and is much easier than most claim.
>> Helm and the likes make it possible to spin up these kinds of solutions in a heart beat
Genuine question, why is this bad? Is it because k8s can spin it up but becomes unreliable later? I think the industry wants something like k8s - define a deployment in a file and have that work across cloud providers and even on premise. Why can't we have that? It's just machines on a network after all. Maybe k8s itself is just buggy and unreliable but I'm hopeful that something like it becomes ubiquitous eventually.
Oh, it's most certainly NOT bad! It's very, very good. But it's not the end of the story. Imagine if you could press a button in a recipe book and a stove, a pan, some oil, and some eggs appeared and the eggs started cooking... amazing! But that's not the end of the story. You don't have scrambled eggs yet. There's still work to be done, and after that there's yet more work to be done, the washing up being one example.
It's everything that comes afterwards that gets neglected. Not by everyone, granted, but by most.
It's bad because Kubernetes has a learning curve, plus you'll typically need someone to deal with it constantly. If your app is simple, you don't need that - a simple docker-compose should suffice.
According to the following link, that literally sounds nothing like Kubernetes. Perhaps a more appropriate analogy to older tech is something like LSF or Slurm.
Sorry, but managed k8s is really simple and a wildly better pattern than just running VMs. You don't need Google scale for it to help you, and spinning things up without understanding the maintenance cost is just bad engineering.
If you need a service to manage K8s for you, then that's a red flag already (regarding K8s, not you personally.) If the service is so complicated that experienced engineers tell me constantly that only managed K8s is the way to do it, that tells me enough about why it's going to be a rough journey that should probably be avoided with IaaS or PaaS.
> ... and a wildly better pattern than just running VMs.
I've never had an issue running VMs, personally. And when I join a firm and begin helping them, I find I can very quickly and easily come up to speed with their infrastructure when it's based on straight IaaS/SaaS/PaaS. If it's K8s, it's often way more complicated ("Hey! We should template our YAML configs and then use Ansible to produce the Helm charts and then run those against the infra!" - ha ha!)
> If you need a service to manage K8s for you, then that's a red flag already
It's really not, k8s does a ton for you, trying to do the same with VMs would be unbelievably complex
> that should probably be avoided with IaaS or PaaS.
PaaS is great till it isn't, seen plenty of companies hit the edges of PaaS and then need to move to k8s
> infrastructure when it's based on straight IaaS/SaaS/PaaS
Again, this is great till it isn't (see Heroku), and then people move to k8s. Having control of and understanding the underlying infrastructure is important unless you're just running some basic web app.
Yeah, isn't Heroku dead? PaaS is great till it isn't, and then you're screwed; I've seen plenty of companies be forced to move off of PaaS to k8s because of the edges. It's fine if you are running a basic web app.
Level 1) you package your service into a zip/rpm/deb/etc. and have an agent on the machine that periodically pulls (a minimal sketch of such an agent is below)
Level 2) you pack your software into an AMI and update the ASG config. You can periodically "drain" the ASG of old instances
Level 3) you deploy your stack again, with the new stack referencing the AMI you built at level 2. You start shifting traffic between the old stack and the new stack. You monitor and roll back if something is wrong.
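For illustration, a minimal sketch of the level 1 "agent that periodically pulls", written in Go. The release URL, file paths, and systemd unit name are all hypothetical, and error handling is mostly elided:

    package main

    import (
        "io"
        "net/http"
        "os"
        "os/exec"
        "time"
    )

    // Hypothetical locations; swap in your own artifact URL and state file.
    const (
        releaseURL = "https://releases.example.com/myapp/latest.tar.gz"
        statePath  = "/var/lib/myapp/etag"
    )

    func lastDeployedETag() string {
        b, _ := os.ReadFile(statePath)
        return string(b)
    }

    func deploy(etag string) error {
        resp, err := http.Get(releaseURL)
        if err != nil {
            return err
        }
        defer resp.Body.Close()

        out, err := os.Create("/var/lib/myapp/latest.tar.gz")
        if err != nil {
            return err
        }
        if _, err := io.Copy(out, resp.Body); err != nil {
            out.Close()
            return err
        }
        out.Close()

        // Hand the new artifact to the init system, then remember what we deployed.
        // (Unpacking the tarball into place is omitted here.)
        if err := exec.Command("systemctl", "restart", "myapp").Run(); err != nil {
            return err
        }
        return os.WriteFile(statePath, []byte(etag), 0o644)
    }

    func main() {
        for {
            // Compare the artifact's ETag with the last one we deployed.
            if resp, err := http.Head(releaseURL); err == nil {
                etag := resp.Header.Get("ETag")
                resp.Body.Close()
                if etag != "" && etag != lastDeployedETag() {
                    _ = deploy(etag)
                }
            }
            time.Sleep(time.Minute) // poll interval
        }
    }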
I find it's easier to use Ansible/Salt/Puppet Bolt and Packer to bake an AMI every night, update the launch template in a DB (which Terraform pulls the value from, so there is no drift), and update the ASG. Then you just force a drain.
Now you've got automatic, constantly updating VMs every night if you want them. And a new deployment is just committing code to master and pushing, and that whole pipeline triggers for you.
People like to overcomplicate things, Mirceal. You're on the right path :-)
I'll be honest I haven't fully explored AMIs as a solution but how do you run the AMI in your local dev environment? I can replicate the same K8s with docker images easily in local dev.
that's the crux of the problem. people no longer know or understand (or want to) what their software is vs what is around their software. they treat docker and k8s as a way of packaging software and just ignore all the lessons that generations of software engineers have learned about how to properly manage your dependencies and how to correctly package your software so that it's resilient and runs anywhere.
we also live in a world that does not appreciate well-crafted software, and a lot of things are driven by the desire to build a resume. I've maintained code that was decades old, was amazing to work with, and was still generating ridiculous amounts of money. I've also worked on code that was just written, used all the possible bells and whistles, and where development speed ground to a halt once it had been around for more than a couple of months.
My worst case scenario is having to work on code where the original developer didn't understand what they were doing and they just wanted to use X. Double the trouble if they didn't master X when the thing was put together.
The truth is at scale the last thing you want is a nest of unmanaged complexity, so it’s also the wrong instinct there. It’s usually contact with the real world that dictates what needs the extra engineering effort, and trying to do it ahead of time just means you’ll sink time up front and in maintenance on things that didn’t turn out to be your problem.
I think at scale, K8s is a good choice. I run a Discord server with like, 3,400 members now, and some of them are working at mental scale. They've claimed the same as you: K8s at scale is the only way.
I would very likely agree in all cases.
However, they only represent 4-5 users out of all 3,400. And that's the issue - only a small fraction of the industry operates at that scale :-)
You're forgetting that many people will want to use K8s for a project because they want it on their CV to get the high paying jobs. I saw the term on HN a couple of weeks ago -- CVOps
I'm not forgetting that fact, I'm simply choosing to ignore such people. They're not really what the industry is about. That's not in the spirit of a healthy society. That's just leeching.
Good luck to them, but they're not going to occupy time and space in my mind.
We'd like that (I'd like that), but resume-driven choices are a very large driver of technology direction, unfortunately.
It means those of us who want to build something very maintainable and very stable using the most boring (stable, secure) technology possible are often outnumbered.
Maybe it's just my organization, but I see the behavior across the corporation far more than I'd like, and inevitably these people move on leaving a complex mess in their wake that long term support staff have to deal with.
We seem to mostly manage to avoid that in my department, but we have a very low turnover.
> I'll make these four lines of code a function in case I need to call it more than once later on - you know, what if that's needed?
The code example is not always apt, though.
Beware: if you know it will be needed, you might as well make it a function now. Likewise, if you think it will probably be needed, why not make it a function now?
It’s not a good review comment or rejection to say “yeah but I don’t want to do that because it’s not yet needed”. Sure, but what if you are just being lazy and you don’t appreciate what it should look like long term?
The "I don't want to write a function that isn't needed yet" case is not a clear-cut example.
You're making a gamble either way. The article you linked is correct that duplication is usually cheaper than abstraction. So if you really have no idea what your code will do in the future, then cheaper is the way to go. But an experienced dev can start to fortune-tell: they know which parts tend to be reused, and which abstractions are common and powerful. And if you also plan your architecture before you code, you can see which abstractions might be needed ahead of time. If you are sure an abstraction is better, then duplication is tech debt.
A simple example. If you are making a multi-page website that contains a header on all pages, you can separate the header into a component from the get go, instead of duplicating the header for each page and then abstracting it in the future (where it starts to become more work).
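A rough sketch of that, using Go's html/template (the page names and fields here are hypothetical): the header is defined once and referenced from every page template, so changing it later touches one place.

    package main

    import (
        "html/template"
        "os"
    )

    // The shared header is extracted into its own template from the get-go,
    // instead of being duplicated on every page.
    var pages = template.Must(template.New("site").Parse(`
    {{define "header"}}<header><h1>{{.Title}}</h1></header>{{end}}
    {{define "home"}}{{template "header" .}}<p>Welcome.</p>{{end}}
    {{define "about"}}{{template "header" .}}<p>About us.</p>{{end}}
    `))

    func main() {
        // Both pages render the same header definition.
        _ = pages.ExecuteTemplate(os.Stdout, "home", map[string]string{"Title": "Home"})
        _ = pages.ExecuteTemplate(os.Stdout, "about", map[string]string{"Title": "About"})
    }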
>> Start with a monolith on two VMs and a load balancer. Chips and networks are cheaper than labour,
Kudos to you! You are a dangerous man for you opine the truth.
My advice is generally, "Build something. Then see if you can sell it." or "Sell something and then go build it." Either way, it all starts soooo small that the infrastructure is hardly a problem.
If you "get lucky" and things really take off. Sure. Yeah. Then get a DevOpps superstar to build what you need. In reality, your business will very probably fail.
You can’t just hire 1 DevOps superstar though because they need to sleep and not burnout. You’ll need ~7 people on a rotation if you need to really support anything worth really supporting. DevOps is about giving Developers a dedicated System Operations job for some small fraction of their time.
> DevOps is about giving Developers a dedicated System Operations job for some small fraction of their time.
No. DevOps is about the development and operations disciplines working together in a cross functional way to deliver software faster and more reliably.
In a small enough startup both disciplines may be represented by a single person, though.
I respectfully disagree. I’ve scaled these teams myself, and it’s about giving developers in a small organization the job of deploying the solution and ensuring that it runs. In larger organizations, DevOps becomes impossible and it naturally splits into Dev and Ops. It’s important to understand where it works and when it stops working to effectively manage the transition as the business grows.
> In larger organizations, DevOps becomes impossible
News to me. I work at one of the biggest companies around, and our DevOps posture is pretty great; that was achieved by cross-functional teams drawn from the historic dev and ops teams doing all the things that Accelerate codified.
You don't go from zero to needing global 24x7 support overnight.
Hiring 1 DevOps superstar is exactly what we did a few startups back and it worked great. Of course there was no after-hours support, it's a small startup. Eventually the team grew.
Fair enough. If you don’t need 24/7 and the bus factor risk is tolerable, then you might not need a rotation.
I’d just caution that when it comes to the mental health of the one person in this role, even if they seem like they are doing ok, check in frequently.
I’ve run DevOps and know from experience the pitfalls. I’m sorry that you’ve interpreted my general agreement and elaboration of your comment as nitpicking foolishness.
I lean conservative in my tech choices but I just don't see the big issue with Kubernetes. If you use a managed service like GKE it is really a breeze. I have seen teams with no prior experience set up simple deployments in a day or two and operate them without issues. Sure, it is often better to avoid the "inner platform" of K8s and run your application using a container service + managed SQL offering. But the difference isn't huge and the IaC ends up being about as complex as the K8s YAML. For setting up things like background jobs, cron jobs, managed certificates and so on I haven't found K8s less convenient than using whatever infrastructure alternatives are provided by cloud vendors.
The main issue I have seen in startups is premature architecture complexity. Lambda soup, multiple databases, self-managed message brokers, unnecessary caching, microservices etc. Whether you use K8s or not, architectural complexity will bite your head off at small scales. K8s is an enabler for overly complicated architectures but it is not problematic with simple ones.
>Did users ask for this?
Not an argument. Users don't ask for implementation details. They don't ask us to use Git or build automation or React. But if you always opt for less sophisticated workflows and technologies in the name of "just getting stuff done right now" you will end up bogged down really quickly. As in, weeks or months. I've worked with teams who wanted to email source archives around because Git was "too complicated." At some point you have to make the call of what is and isn't worth it. And that depends on the product, the team, projected future decisions and so on.
As a startup founder that's not VC funded, I would totally recommend you look into building with Kubernetes from the get-go. The biggest learning curve is for the person setting up the initial deployments, services, ingress, etc. Most other team members may just need to change the image name and kubectl apply to roll things out. Knowing that rollouts won't bring down prod, and that they can be tested in different environments consistently, is really valuable.
I started out with a PaaS, namely Google App Engine, and we suffered a ton with huge bills, especially from our managed DB instance. Once the costs were too high we moved to VMs. Doing deployments was fine but complicated enough that only seasoned team members could do it safely. We needed to build a lot of checks, monitoring, etc. to do this safely. A bunch of random scripts existed to set things up, and migrating the base operating system etc. required a ton of time. Moving to Kubernetes was a breath of fresh air and I wish we'd done it earlier. We now have an easy, repeatable process. Infra is easier to understand. Rollouts are safer and, honestly, the system is safer too. We know exactly what ports can allow ingress, what service boundaries exist, what cronjobs are configured, their state, etc., with simple kubectl commands.
Using Kubernetes forces you to write configurable code and is very similar to testing: it sounds like it'll slow you down and shouldn't be invested in until the codebase is at a certain size, but we've all learned from experience how it actually speeds everything up, makes larger changes faster, makes customer support cheaper, and saves you from explaining why a certain feature has been broken for 10 without anyone's knowledge
> The biggest learning curve is for the person setting up the initial deployments, services, ingress, etc. Most other team members may just need to change the image name and kubectl apply to roll things out.
This is a huge red flag.
It's basically admitting that you expect most later employees not to understand k8s or how it's being used. You may think they don't need to because it works, but you have to think about what happens when it doesn't work.
The shops I've been to all had the same mindset: the docker/k8s infra was set up by one guy (or two) and no one else on the team understands what's going on, let alone has the ability to debug the configuration or fix problems with it.
Another thing that might happen is some team members understand just barely enough to be able to "add" things to the config files. Over time the config files accumulate so much cruft, no one knows what configuration is used for what anymore.
You have this with literally every deployment mechanism, except with Kubernetes the boundary is standardized and you can easily find/hire/teach new team members to work on the complicated parts.
Custom VM/cloud/$randomSaaS deployments are much worse when it comes to "the one guy who understands the intricate details is on vacation".
Of course, if your deployment mechanism needs to be complicated, standardizing on something everyone knows is useful.
The underlying assumption behind my comment is that you really really want to simplify your deployment as much as you can.
Unfortunately this is nearly impossible with the currently accepted set of standard best practices for developing web applications, where you use a scripting language and something like six different database systems (one as the source of truth for data, one for caching, one for full-text search, and who knows what else everyone is using these days; I honestly can't keep track).
There are many ways to make the deployment process simple. My rule of thumb is to use a compiled language (Go) and an embedded database engine.
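A minimal sketch of that rule of thumb, assuming the pure-Go modernc.org/sqlite driver (an assumption on my part; any embedded engine works): one static binary plus one database file is the whole deployment.

    package main

    import (
        "database/sql"
        "log"
        "net/http"

        _ "modernc.org/sqlite" // pure-Go SQLite driver: no cgo, so the binary stays self-contained
    )

    func main() {
        // The database lives in a local file next to the binary; nothing else to provision.
        db, err := sql.Open("sqlite", "app.db")
        if err != nil {
            log.Fatal(err)
        }
        if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)`); err != nil {
            log.Fatal(err)
        }

        http.HandleFunc("/notes", func(w http.ResponseWriter, r *http.Request) {
            // A deliberately tiny handler: store whatever was passed as a query parameter.
            if _, err := db.Exec(`INSERT INTO notes (body) VALUES (?)`, r.URL.Query().Get("body")); err != nil {
                http.Error(w, err.Error(), http.StatusInternalServerError)
                return
            }
            w.WriteHeader(http.StatusCreated)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }

Deploying is then just copying one binary (and its data file) to a VM and restarting the service.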
>It's basically admitting that you expect most later employees to not understand k8s or how its being used.
That's also exactly what would happen if you homebrewed your own system. You need to centralize some part of the expertise around infra at some point, but hopefully around more than two people.
It's healthy to depend on your coworkers for their specific knowledge. The landscape is too large for everyone to know everything, and honestly, if I heard someone say this in an interview I would chalk it up to social deficits, because this is not how life, businesses, etc. work.
I think the key with this though is that it's good when everyone on a team has a working knowledge of something, and one person has expert knowledge. If one person knows everything there is to know, and everyone else knows nothing, you've created a massive dependency on the single person (which in the case of infrastructure code, could easily be a near existential problem for the company).
It's unrealistic to expect the entire team to know how to build and safely operate IPv6 BGP anycast with an HTTP/2 TLS load balancer, authentication and authorization, integrated with a modern observability platform and a zero-downtime deployment CI process.
It is realistic to expect a small team to build the same in an industry-standard way and hand off a clear, well-documented API to the rest of the team.
Bespoke solutions need to deal with this complexity somehow, k8s provides a standard way to do so.
I'd consider knowing how to use the API to configure services and deploy new services to be a good definition of "working knowledge". You're right that everyone doesn't need to know the ins and outs of everything for sure, but if you observe at your company that everyone is relying on "the Kubernetes guy" to do everything related to Kubernetes, you've just re-invented an old-school ops team in an especially brittle way.
Just like how you don't really expect later devs to learn the intricacies of your hand rolled deployment setup.
The difference is, kubernetes is pretty standardized and therefore learnable in a repeatable way, unlike the Frankenstein of tooling you might otherwise cobble together.
> I started out with a PaaS, namely Google App Engine, and we suffered a ton with huge bills, especially from our managed DB instance
Are you factoring in the salaries of the people setting up Kubernetes? And the cost of those people/salaries not working on the actual product? And the cost of those people leaving the company and leaving a ton of custom infrastructure code behind that the team can't quickly get up to speed with?
> ton with huge bills, especially from our managed DB instance
This doesn't have much to do with App Engine, right? Last time I used it, we were using a PostgreSQL instance on AWS and had no problems with that.
> Doing deployments was fine but complicated enough that only seasoned team members could do it safely
I just plain don't believe this. I bet you were doing something wrong. How is it possible that the team finds it too difficult to do an App Engine deployment but is able to set up a full Kubernetes cluster with all the stuff surrounding it? It's like saying I'm using React because JavaScript is too difficult.
> Using Kubernetes forces you to write configurable code and is very similar to testing: it sounds like it'll slow you down and shouldn't be invested in until the codebase is at a certain size, but we've all learned from experience how it actually speeds everything up, makes larger changes faster, makes customer support cheaper, and saves you from explaining why a certain feature has been broken for 10 without anyone's knowledge
This is far, far, far from my own experience.
Some other questions:
How did you implement canary deployment?
How much time are you investing in upgrading Kubernetes, and the underlying nodes operating systems?
How did kubernetes solve the large database bills issue? How are you doing backups and restoration of the database now?
If I were to found a company, especially one not VC funded, dealing with Kubernetes would definitely be far down my list of priorities. But that's just me.
(I agree with your overall points, but when the GP said "Doing deployments was fine but complicated enough that only seasoned team members could do it safely" they were referring to their post-App Engine, pre-Kubernetes "manually managed VMs". I have no problem believing that deploys are very complicated in that scenario)
If setting up a PostgreSQL instance behind a VPC with an EC2 instance in front of it was too difficult, there is now a serverless database product from AWS that costs 90% less than it used to.
No more load balancers, no more VMs, no more scaling up or down to match demands.
Please don’t do this. I’m dealing with the mess caused by following this line of thinking.
One guy (okay it was two guys) set up all the infrastructure and as soon as it was working bounced to new jobs with their newfound experience. The result is that dozens of engineers have no idea what the heck is going on and are lost navigating the numerous repos that hold various information related to deploying your feature.
In my opinion (and I’m sure my opinion has flaws!), unless you have global customers and your actual users are pushing 1k requests per second load on your application servers/services, there is no reason to have these levels of abstractions. However once this becomes reality I think everyone working on that responsibility needs to learn k8s and whatever else. Otherwise you are screwed once the dude who set this up left for another job.
And honestly.. I’ve built software using node for the application services and managed Postgres/cache instances with basic replication to handle heavy traffic (10-20k rps) within 100ms. It requires heavy use of YAGNI and a bit of creativity tho which engineers seem to hate because they may not get to use the latest and shiniest tech. Totally understand but if you want money printer to go brrr you need to use the right tool for the job at the right time.
The mistake people make thinking about Kubernetes is that it's about scale, when really it's just a provider for common utility, with a common interface, that you need anyway. You still need to ingress traffic, you still need to deploy your services, etc.
Kubernetes isn't just about global scale that most people will never need, which would agree with the article. It is about deploying new apps to an existing production system really quickly and easily. We can deploy a new app alongside an old app and proxy between them. Setting up a new application on IIS or a new web server to scale is a mare, doing the same on AKS (managed!) is a breeze. It is also really good value for money, because we can scale relatively quickly compared to dedicated servers.
It is also harder to break something existing with a new deployment because of the container isolation. We might not need 1000 email services now but we could very quickly need that kind of scale and I don't want to be running up 100s of VMs at short notice as the business starts taking off when I can simply scale out the K8S deployment and add a few nodes. There is relatively little extra work (a dockerfile?) compared to hosting the same services on a web server.
> Setting up a new application on IIS or a new web server to scale is a mare.
I disagree.
With Octopus Deploy I can add a step, set a package and hostname, press a button, and have a new deployment of a new API or website in IIS pushed out to however many servers currently exist, in a few minutes.
There are many ways to manage deploying services, and scaling, without K8S or containers in general.
While I could set up a new Octopus Deploy quickly, it would be like you setting up something in AKS. We are both good at the tools we know. But saying my way or your way is wrong - that's the thing that's wrong.
Servers get a Tentacle (the Octopus agent), which has an environment and tags. So you could say one server is in the test environment and tagged with main-web-app, while another server is in production with the same tag. You promote a package from test to production. You configure your setup to say a package is deployed to servers tagged main-web-app.
Octopus is an orchestration tool, so it isn't responsible for scaling up. But in AWS with an auto-scaling group you can configure Octopus to auto-detect a new server, tag it, and deploy. Part of the deployment would add it to the load balancer.
As a side note you can deploy containers using octopus too. Tho I’ve never found a reason to use containers in production yet.
The only rationale for doing what you described is if, and only if, you have outside capital. If you are spending your hard-earned bootstrapped cash on this, I'm sorry, but it's a poor business decision that won't really net you any technical dividends.
Again, I really see this as the result of VC money chasing large valuations, and of business decisions influencing technical architecture; a sign of our times, of exuberance and senselessness.
Engineering has to raise the cost of engineering to match it (a 280-character-limit CRUD app on AWS Lambda with 2 full-stack developers vs 2000 devs in an expensive office).
Why should using different languages for front-end and back-end be a problem? I rather think that it is better to use languages that are appropriate for the given problem. It is not premature optimization to have parts of a back-end implemented in C/C++/Go/whatever else if high performance is needed. It would rather be a waste of resources, money and energy not to use a high-performance language for high-performance applications. Of course, using the same language for the front-end might make no sense at all.
That's sad; JavaScript was already not great for the front-end, and now we get it in the backend and even at the edge.
Most job offers are for a mythical full-stack developer who'll master web design, CSS/HTML, front-end interactions and code, networking, back-end architectures, security... You end up with people who don't have time to build enough expertise to write and build clean stuff. Hacking poor JavaScript code everywhere.
With the same language, you may think you can somehow reuse stuff between front and back. A bad idea in most projects; it will typically create more problems than it solves.
I'm far from a full-stack developer, but really, how much code would actually be common across the front and back end? I would have thought maybe some validation code; not sure how much else.
If you do server-side rendering with something like Next.js, then it's quite a lot of code.
With trpc you can share types without going through an intermediary schema language (https://trpc.io/) although I think it would've still benefitted from separating the schema from the implementation (hope you don't forget the type specifier and import your backend into your frontend by accident)
For business logic, as always, the answer is "it depends". Does your app benefit from zero-latency business logic updates? Do you need rich "work offline" functionality combined with rich "run automation while I'm away" behaviour for the same processes? etc.
Even when using Next.js, it's still pretty common to have a separate API backend that doesn't also have to be in JavaScript/Node. There are parts of backend code (HTML generation) that very strongly benefit from being unified with the frontend code, and there are parts of backend code (like database persistence code) that benefit much less strongly, if at all. Many people split these parts of backend code up between a separate Next.js backend service and an API backend service, and Next.js lends itself well to this.
I'm a big Nextjs fan; I just think it's useful to emphasize that using it doesn't necessarily have to mean "only javascript/typescript on the backend".
> While Remix can serve as your fullstack application,
Isn't that where most teams should start, and where they should stay unless they have a really good reason to get more complicated? This is what I was thinking of while reading the GP comment about Next.js.
Sounds about right. I have an API service written in Go for which we also offer a client library in Go. We moved the common ground between the two into a separate module to avoid code duplication. That module ended up having the type declarations that appear on the API in JSON payloads, plus some serialization/validation helper methods, and that's it.
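For anyone curious, a stripped-down sketch of what such a shared module tends to look like (the package, type, and field names here are hypothetical, not the actual API):

    // Package apitypes is the shared module that both the server and the Go
    // client library import, so the JSON wire types are declared exactly once.
    package apitypes

    import (
        "encoding/json"
        "errors"
    )

    // CreateNoteRequest is a payload that appears on the API as JSON.
    type CreateNoteRequest struct {
        Body string   `json:"body"`
        Tags []string `json:"tags,omitempty"`
    }

    // Validate is the small bit of logic worth sharing between client and server.
    func (r CreateNoteRequest) Validate() error {
        if r.Body == "" {
            return errors.New("body must not be empty")
        }
        return nil
    }

    // DecodeCreateNoteRequest is a serialization helper used on both sides.
    func DecodeCreateNoteRequest(data []byte) (CreateNoteRequest, error) {
        var req CreateNoteRequest
        if err := json.Unmarshal(data, &req); err != nil {
            return req, err
        }
        return req, req.Validate()
    }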
True, that can be a valid use case. Another solution is to leverage language-agnostic API description formats, generating the serialisation code and part of the validation code (Swagger/OpenAPI/...). This way you provide nice documentation and don't have to care about language differences, while keeping good productivity.
Serialization/deserialization and templates are a huge pain to keep identical. The rest, not so much.
So, if you can keep all of your templates in a single place, and don't have a lot of data diversity, you won't have a problem. But if that's not your case, you do have a problem.
Personally, I think the frontend code should be more generic on its data and organized around operations that are different from the ones the backend sees. But I never could turn that ideal into a concrete useful style either.
In my mind it is not code reuse between frontend and backend, but expertise and standard library reuse that is the winner.
Better to have a full stack developer that can concentrate on becoming an expert and fluent in just one language rather than being kinda-ok in 3 or 4 IMHO.
The struggle is not learning a new language nor becoming fluent.
The real expertise is being a front-end expert: authoring efficient architecture around user interactions, browsing MDN without effort, mastering the DOM... Then, being a back-end expert: knowledgeable about scaling, cloud architectures, and security issues, and able to author good API design...
If you can be considered an expert in all of this, and also edge computing, I don't think switching language would be an issue for you. Language is a tiny fraction of required expertise and you might be more productive by switching to an appropriate one.
It's not a question of "if" a developer can learn a new language, but of "how long" it takes. And it's not a question of how long it takes to become moderately productive, but how long it takes to reach an expert level of proficiency.
The impression I've gotten from some of my co-workers is that in bootcamps and college they only learned one language, remember the time and effort that went into that, and assume learning another language will take the same time and effort. Because they haven't really put effort into a second one, they don't yet realize just how much conceptually transfers between the languages.
While concepts transfer over between languages, the language is only 10% of it. The rest is the standard library, ecosystem, build system, and all kinds of intricacies you have to know about that are specific to the language (ecosystem).
This is an underestimation of how hard it is to learn to program.
Someone learning a first language isn't just learning a new language: they're learning how to program. It's a new profession, a new hobby, a new superpower.
The rest of the stuff (standard library, ecosystem, buildsystem and all kinds of intricacies) is just a mix of trivia and bureaucratic garbage you gotta fill your brain with but will all be replaced in 10 years anyway. Sure it takes time but it's nowhere near as important as actually knowing how to program.
Even changing paradigms (imperative to functional) isn't as hard as learning the first language.
I think a lot of people here have been doing this programming thing for so long we've forgotten we once had trouble understanding things like:
    x = 1
    for i in [1, 2, 3] {
        x = i
    }
What is the value of "x" at the end? Assuming block scope, it will be 1, or assuming it doesn't have block scope (or that it uses the already defined "x" in this pseudo-example) it will be 3.
A lot of beginning programmers struggle with this kind of stuff, as did I. Reading this fluently and keeping track of what variables are set to takes quite a bit of practice.
I've been hired to work on languages I had no prior experience on, and while there was of course some ramp-up time and such, overall I managed pretty well because variables are variables, ifs are ifs, loops are loops, etc.
The concepts still roughly translate and help, though. You have projections/mappings in FP. Or if you want to go deeper, recursion. Understanding loops before those will definitely make them easier since they are equivalent.
The argument that learning a second language is as difficult as learning the first doesn't really hold water in practice either; lots of people have done it.
I'm not saying there are zero differences or that $other_language never has any new concepts to learn, but in functional languages variables are still variables, functions are still functions, conditionals are still conditionals, etc. Important aspects differ, but the basics are still quite similar as is a large chunk of the required reasoning and thinking.
There is also an issue with conceptual leakage, most noticeably, I've found, with devs well-versed in one language bending another language into behaving like the former.
Agreed. You see this a lot with people with a C# and/or Java background using TypeScript with annotation based decorators and libraries/frameworks that implement dependency injection, etc. They don't really embrace the nature of the new language and ecosystem. And I can sympathise, it is simpler to do things as you have been doing them before.
You also see it when people who've mostly done imperative programming tries their hands with a lisp or some ml based language. I've been there myself. You still find yourself using the equivalent of variable bindings, trying to write in an imperative rather than declarative style.
I guess when trying to learn a new language in a different paradigm you also need to unlearn a lot of the concepts from the former language.
Well, that is the reason why, outside the HN bubble, Angular still wins the hearts of most enterprise consulting shops, as it's quite close to JEE/Spring and ASP.NET concepts.
> ...the standard library, ecosystem, buildsystem and all kinds of intricacies you have to know about and what not which is specific to the language (ecosystem)
IME a lot of the conceptual stuff transfers pretty well there too.
I think the argument was/is not "it's a problem that we cannot have JavaScript" but "if we can have JavaScript everywhere, we only need to hire JavaScript devs and only need to care about JavaScript tooling", which is a fair point.
That does ignore the fact that an experienced frontend JS dev is not necessarily also a good productive backend JS dev, but at least they know the language basics, can use the same IDE, etc.
Whether that's worth it depends on what you are trying to achieve, I guess. I personally would not pick JS (nor TS) for the backend.
I think this has been a huge failing of our industry of late.
The rise of the "fullstack developer" has mostly reduced quality across the board. When you hire a "fullstack developer with 5 years experience" you aren't getting someone who is as good as a frontend developer with 5 years AND a backend developer with 5 years, but someone whose 5 years are split between those two endeavours, and probably with less depth as a result of switching.
(as a side note I also think it's contributed to developer title inflation)
Learning a new language and its tooling is easy compared to learning the domain knowledge required to be effective in a new area; i.e., transitioning from frontend to backend, or vice versa, has very little to do with the language or tooling.
Your average frontend dev doesn't know squat about RDBMS schema design or query optimisation, probably isn't familiar with backend observability patterns like logging, metrics and tracing, and most likely has very little experience thinking about consistency and concurrency in distributed systems, etc., etc.
Just like the backend dev is going to struggle with the constraints of working in a frontend context, understanding UI interactions, optimizing for browser paint performance, dealing with layout and responsiveness, etc.
Meanwhile, if you know, say, Java and some scripting language like Python, and you end up in a new job doing backend in JS, it's not going to take long for you to pick up JS and hit the ground running, because you are going to encounter exactly the same stuff, just with different syntax and a different runtime.
Backend being substantially divorced from frontend isn't a bad thing, it's generally a good thing that results in nice clean separation of concerns and healthy push-pull in design decisions about where certain logic and responsibilities should lie etc.
> The rise of the "fullstack developer" has mostly reduced quality across the board.
Having a solo developer that can do it all well enough also allows useful products to reach users faster (and for me it's really about solving user problems, not just making money), without getting derailed by communication overhead, bureaucracy, fighting within the team, etc. Just make sure you don't pick a developer who cares too much about things that mostly matter to other nerds, because then they might get derailed with premature optimization. Yeah, I've been there.
Most 'fullstack' positions don't have nearly the complexity where worrying about concurrency etc. is actually that relevant. The idea that most frontend devs have any knowledge about optimising for browser paint performance beyond using the correct CSS or framework is funny ;-)
I know, we have seen insane developer title inflation over the last 10 years.
The bar is generally just a lot lower now across the board. We have people with the title "Senior Engineer" that still regularly need mentorship and peer-programming sessions.
The title I have now (Principal Engineer) I feel like I don't deserve, it was reserved for folks I looked up to when I was starting and I don't think I have reached their level. Yet at the same time it's essentially necessary to distinguish myself from what is now considered "Senior".
I have a separate rant for the lack of proper juniors and lack of structured mentoring etc but that is for another day.
The chunking unit is the programming language, which means that as long as you can label the task with a specific programming language, people will believe it's the same kind of work.
In reality you have a hammer expert and are in the business of making everything look like a nail.
So now we have complicated, bloated applications of JavaScript in places where it's absolutely inappropriate, and you need far more skill to navigate those waters than someone who uses more appropriate tools.
It's a perversion in the name of simplicity: because we're forcing too coarse a model onto too fine a problem, everything explodes in complexity, takes too long, is too expensive, and comes out working like trash.
We could do better, but first we have to eat crow and admit how foundationally wrong we were, and frankly things aren't bad enough to make that mandatory, so the circus continues, just like it did when we tried to make everything OOP and it produced the same smell.
They're useful tools, but that's their perimeter; they aren't zeitgeists... well they shouldn't be.
Which is ironic as the DOM interface was designed as an abstract interface (the IDL used in the spec is more interested in compatibility with Java than JS).
In practice though, the main reason is that to have decent DOM bindings you need to stabilize many other specs first (unless you do an ultra-specific DOM-only extension, but nobody wants that).
I have been very successful in replacing a JS browser application with Rust. It has been great, because Rust is much easier to change, and since the application is complex, I need to change it a lot.
But I wouldn't recommend it in general, because JS (and, for slightly more complex things, TS) is much easier to create in, for all the old Rust issues that surface every time you mention it. And most GUI code is simple enough that you only have to write it once, maybe fix a few details, and forget about it.
Wasm would be much more compelling if it were targeted by more higher-level languages.
The sooner they accept that there is no such thing as one language to rule them all, the better developers they become. I have never seen the "isomorphic" claim seriously analyzed. One example: how much of the logic behind the wall actually overlaps with the optimistic UI logic? Some logic may seem reusable, but it may not be. It was insane when I saw a popular JS framework author on Twitter say that JavaScript is the language of the web and other backend languages are not (not exact words). Like WTH.
It needs to be all JavaScript, because JavaScript in itself is already at least 5 languages: ES6, browser runtimes, Node, ESM and CJS, TypeScript, some CoffeeScript remnants, and the list doesn't end. There is no end to complexity.
It's about hiring talent that can drive customer value without concern for anything else and the business can eventually hire more experienced individuals to fix the mud mountain if there's market fit and continued need.
It is far easier to find more affordable talent that has come out of a boot camp knowing JavaScript, or more specifically "React/Node.js", which lets them work on both frontend and backend. They rarely know best practices, or how to properly debug and troubleshoot a problem that isn't a Google search away, etc., but they will be hungry and work their butts off to ship their CRUD-style features.
Most developers don't seem to want to learn more than "one" thing. Once they do one tutorial, it seems they're done for life with learning.
And they don't really have time to learn new stuff, as they spend too much of their time on Hacker News complaining there's too many new frontend frameworks or something like that.
Sounds like "consultant talk", too much fluff and big words but no nuance
"Using different languages" is just Tuesday in most places. Sure, don't use languages needlessly, but it's not a big hurdle unless you're just a "nodejs bro"
Honestly these are not optimizations at all, but rather architectural decisions. Large architectural decisions are generally best made at the outset, based on problem domain analysis. They are costly to change later.
Like so many others, the author appears to be latching on to the phrase "premature optimization" as a popular buzzword (buzz...phrase?). This is so far from what Knuth actually wrote in his book that it hurts.
> It would rather be a waste of resources, money and energy not to use a high-performance language for high-performance applications.
Given that most developers probably support the push to tackle climate change, they seem to be making no effort to ensure their apps execute in as short a time as possible using as few resources as possible.
You would expect that people would actually embrace doing things in C or Go to save energy.
Maybe cloud providers should think of showing the carbon footprint as a first class performance metric.
Just like with code optimization, we should first make sure that making code run slightly more efficiently really has an impact on the climate. Because I very much doubt it does.
When you want to save the climate there are many, many low-hanging fruits. The choice of programming language is likely not one of them, as much as I like efficient code.
If you are building a SaaS company and build your site in RoR, but also have experience in, say, Go, and decide "hmm, instead of using RoR I'll use Go for this backend thing so it's faster"...
That's fine.
Premature optimization would be saying:
I don't know Go/C++/C... but I know it's fast, so instead of using what I know to get up and running quickly, I'll waste time building it in something I don't know, which probably won't be as fast as doing it in something I know well.
The thing is, when you're building it, no one is using it! If it's slower in RoR than Go, who cares, get it up and running, and fix it later when people are actually using it.
> It is not premature optimization to have parts of a back-end implemented in C/C++/Go/whatever else if high performance is needed.
But the overwhelming majority of the time you don't need it, at least not yet. I would say that unless you have actual evidence that your other language would not be adequate - i.e. an implementation of your system or some representative subset of it, in your main language, that you spent a reasonable amount of time profiling and optimizing, and that still proved to have inadequate performance - then it is indeed premature.
That assumes it's harder to build it in the other language, though. Maybe if that language is C then that will be the case, but building in, say, Go may be just as easy as building in JavaScript (or close enough that it doesn't really matter), whereas rewriting it later would be a massive undertaking.
This is very different from, say, starting with a microservice architecture, which imposes relatively high overheads with little benefit, as splitting up a well-designed monolith is easy.
> That assumes it's harder to build it in the other language, though. Maybe if that language is C then that will be the case, but building in, say, Go may be just as easy as building in JavaScript (or close enough that it doesn't really matter)
The cases where Go is significantly faster than JavaScript are vanishingly small.
> whereas rewriting it later would be a massive undertaking.
This is vastly overstated IME. Porting existing code as-is between languages is actually pretty easy.
> This is very different from, say, starting with a microservice architecture, which imposes relatively high overheads,
Disagree. You're imposing a huge overhead on hiring (if you want people with both languages) or on people's ability to work on the whole system (if you're happy hiring people who only cover one side or the other). Debugging also gets a lot harder. There's essentially twice as many tools to learn.
> This is vastly overstated IME. Porting existing code as-is between languages is actually pretty easy.
Yep. We always implemented (parts of) embedded software in Java first, and later in JS, and then ported it. If no additional functionality is added, this is trivial and saves a lot of work and errors, as you have already tested, debugged and fixed the logic parts.
The experts in how to profile/optimize/etc a system aren't going to be JS devs though. They're going to be people who are used to dealing with systems that need to be written in languages from the machine-code-compiled lineage.
Which is to say ... while JS developers do know how to profile code, people who are routinely exposed to this problem are not going to be JS developers. The people who are good at identifying when a system needs to be re-written in Go for performance reasons are probably already Go developers and people who have lots of experience writing performant code.
Plus writing in that sort of language from the start means there is a chance to segue into performant code without needing to rewrite things in a new language.
> The experts in how to profile/optimize/etc a system aren't going to be JS devs though. They're going to be people who are used to dealing with systems that need to be written in languages from the machine-code-compiled lineage.
Sure they are. The skills aren't really language-dependent, and nowadays machine code is so far away from the actual hardware behaviour that it doesn't help a lot anyway. Besides, the biggest speedups still come from finding errors or inappropriate algorithms or data structures, and that's, if anything, easier to spot in a higher-level language where there's less ceremony to get in your way.
> finding errors or inappropriate algorithms or data structures
when I interview developers, folks coding in Java can usually tell me what an appropriate data structure would be, while half the JS developers can't explain how an array is different from a linked list. Because most of the time, JS developers don't have to think about it much.
That makes sense assuming you have already written your backend in JS. In that case, yeah the bar for rewriting in another language should be high (as it should be for any ground-up rewrite). But it's not a "premature optimization" when you are deciding on the tech stack to begin with.
Using a second language for performance (which is what the post I replied to was suggesting) is a premature optimization - you're paying all the costs of having your code in two different languages, for a benefit that will only start paying off when you're much bigger if at all.
You can't really tell for a Web App whether it'll be faster in JS or Python, but you can definitely expect a Computer Vision application with lots of heavy number crunching to be a lot faster in C++ than in Python. We have actually also made comparisons, and even if you use things like numpy and Python bindings for OpenCV, you won't reach the speed that a C++ application achieves easily without optimization.
That depends a lot on what that CV application does and what hardware it runs on. Naive C++ (that runs on CPU) is usually much slower than using Python as glue for libraries that run on GPU.
For a lot of applications there are pretty straightforward calculations to meet a desired frame rate.
Also, I've seen UX research that established some guidelines for how long something can take while still feeling responsive and allowing a user to remain focused.
Knowing you can achieve the required/desired performance in any given language is mostly a matter of experience solving similar problems.
We try to limit the amount of languages we use, but we have high performance Computer Vision code that is written in C++, we're interfacing that with Python for simplicity, and a web app in JS. Right tool for the job!
I really don't understand all these complaints about how Kubernetes is so complex, how it's an investment, etc. I am a single developer that uses Kubernetes for two separate projects, and in both cases it has been a breeze. Each service gets a YAML file (with the Deployment and Service together), then add an Ingress and a ConfigMap. That's all. It's good practice to still have a managed DB, so the choice of Kubernetes vs something running on EC2 doesn't change anything here.
Setting up a managed kubernetes cluster for a single containerized application is currently no more complicated than setting up AWS Lambda.
What you get out of it for free is amazing though. The main one for me is simplicity - each deployment is a single command, which can be (but doesn't have to be) triggered by CI. I can compare this to the previous situation of running "docker-compose up" on multiple hosts. Then, if what you've just deployed is broken, Kubernetes will tell you and will not route traffic to the new pods. Nothing else comes close to this. Zero-downtime deployments are a nice bonus. Simple scaling too: just add or remove a node, and you're set.
Oh, and finally, you can take your setup to a different provider, and only need some tweaks on the Ingress.
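To give an idea of scale, the day-to-day workflow I'm describing is roughly this (a sketch; the service name is made up):

    # One file holds the Deployment and Service; a deploy is a single command.
    kubectl apply -f my-service.yaml

    # Kubernetes only routes traffic to pods that pass their readiness checks,
    # so a broken build never receives traffic; this waits and reports either way.
    kubectl rollout status deployment/my-service

    # And if the new version is misbehaving anyway, roll back.
    kubectl rollout undo deployment/my-service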
I agree. I consider Kubernetes to be a simplification. I have two apps running at my company. The first is a forest of PHP files and crontabs strewn about a handful of servers. There are weird name resolution rules, shared libs, etc. Despite my best efforts it's defied organization and simplification for 2.5 years.
The second is a nice, clean EKS app. Developers build containers, and I bind them with configuration and drop 'em like they're hot, right where they belong. The builds are simple. The deployments are simple. Most importantly, there are clear expectations for both operations/scaling and development. This makes both groups move quickly and with little need for coordination.
It's really complicated when something goes wrong. That is my only criticism, particularly in the various CNI layers out there. You really have to know exactly how everything works to get through those days, and that is beyond the average person who can create a Docker container and push it into the cluster, which is the usual success metric.
Networking is complex, unfortunately, and cloud networking has a legacy of trying to support things that never should be supported (stretched L2s, 10./8 everywhere, etc.)
Things get much simpler if you try to limit CNI complexity by going towards at least conceptually simpler tooling that matches the original, pre-CNI design of k8s, IMHO.
99% of people aren't going to use a different CNI plugin to what their managed distribution ships with. Same goes for peeking under the covers of storage plugins, kubelet config, etc.
You pay AWS/GCP for that these days and just use the API.
In the limit, there are some startups that could run production on a single Linux host - I recently helped one get off Heroku and their spend went from ~$1k/mo to ~$50/mo and it made debugging and figuring out performance issues so much easier than what they were doing previously...
I've worked at many places where everything was run off single instance (usually windows) VMs.
These systems were rarely constrained in performance by scaling issues that would have been solved by scaling horizontally, and in many cases the added latency of doing so would have caused more problems than it would have solved.
And as you say, having everything on one or two VMs is not just orders of magnitude cheaper to host, but also comes with the benefit of much easier debugging and performance monitoring.
These weren't tiny start-ups either, these were long running services contracted out to clients including the government and other major companies.
It doesn't scale infinitely, but I'd wager that the traffic just isn't there to justify these kinds of setups in 99% of cases, and that some of the scaling needed is because all the added latency and service discovery etc is adding overhead that wouldn't be needed without it.
I've often found it odd that people who strive for YAGNI at the code level don't apply the same to the system architecture level.
I'm very much with you on this, but I do understand that it's one of those things that is just not feasible when your team has no sysadmin/devops experience.
You were able to do it, but what happens to them when you're not around? Does their team have the required experience to handle it? That's the difference in cost. It's like DIY - yes, if I have all the skills and experience I can do everything myself incredibly cheaply, but... I don't. So I gotta pay.
But if your team does have the skills and experience, it's definitely worth looking into.
I do think people deeply underestimate what can be achieved with a single (or two, if you need a standby) dedicated Linux server these days. A single server can easily go up to 2TB of RAM and 128+ cores. Long before you ever get to a scale where that's a limitation, you'll have more than enough resources to figure things out.
Funnily enough, a small number of servers that you want to utilize as much as possible is pretty much one of the original use cases for Kubernetes.
My first deployment involved virtual machines, but we were essentially packing as much as possible into the lowest number of VMs, then doubling it up so that we had failover. This way we had clear visibility into how many resources we were using and could allocate them according to how much we had.
> You were able to do it, but what happens to them when you're not around?
This is why you pay someone. Saving $950/month means it's well worth spending $500 for a day of someone's time occasionally. You don't have to do everything internally when you run a startup. Buying in services that you only need occasionally is well worth the money.
Are there contractors out there that will take on-call shifts? Because it seems unlikely, and if your proposal is "put into production a system that you will have to spend $500/day on every time it goes down and wait 2-4 business days for a resolution", then you're a braver person than I am.
Obviously not. You don't pay someone to just set it up. You pay to help do what you'd do if you had a dedicated devOps teams. You pay someone to set up the system with your team so they understand it, train your team to use it, write some documentation about it, script a rollback procedure, maybe help on developing playbooks, etc.
Besides, there are people out there who offer on-call services for a small retainer.
This is optimistic to say the least. I've worked as an SRE for 5 years and apart from the others in the team the devs don't have nearly as much knowledge. There's no way I'd rely on them to fix an outage.
And even on a small retainer you'd better hope they retained the knowledge of how all that stuff works if you're only calling on them every now and again.
The idea is devops as _culture_ - I come in as an expert and set it up, then show them how I did it, then run through various disaster recovery scenarios so they learn it and can handle the vast majority of problems.
And you'd be surprised how few problems you might have - I've had many VMs with literally years of uptime running without issue.
Most people focus on "devops" as a job - and they never bother to teach the rest of the team anything about how stuff works. Worse than that, modern clouds encourage you to build opaque and complicated systems which probably only someone working on them full-time has a hope to understand...
If it was so easy for devs to pick up SRE companies wouldn't be struggling to find good people.
Culture doesn't mean they'll know how to fix <some weird edge case> at 3am. The SRE with constant ops exposure is likely to have a much better chance, if only because they (ought to) really know how to debug it.
> I've had many VMs with literally years of uptime running without issue.
I hope they all have fully patched base libraries and kernels, security auditing is getting to a much more common requirement these days even among very small companies. For example, anybody using Facebook Login.
> And even on a small retainer you'd better hope they retained the knowledge of how all that stuff works if you're only calling on them every now and again.
This is why we document things at the company I work for. If you're serious about a project and want it to exist for a long time, there will be things that only come up every few years, and you won't remember them. You can either write things down, or you have to relearn how things work every time. Writing things down is much easier.
> I'm very much with you on this, but I do understand that it's one of those things that is just not feasible when your team has no sysadmin/devops experience.
But this applies to everything. You also need Heroku or Kubernetes or whatever experience to maintain those systems, right?
The question is how much you need to know. You need much less knowledge to run something in Heroku than in k8s; dramatically less in the "onboard a CRUD app" case. I'd argue that running k8s effectively is no less knowledge-intensive than running VMs.
> I'd argue that running k8s effectively is no less knowledge-intensive than running VMs.
LOL - are you f'ing kidding?
Running VMs is much closer to running your app locally than containers and k8s - it's not that hard to do badly, and only marginally harder to do well.
> I'm very much with you on this, but I do understand that it's one of those things that is just not feasible when your team has no sysadmin/devops experience.
Exactly - $950/mo is nowhere near enough to pay for those skills if you don't have them. It's good value for money.
I say it's over-rated. You don't need a huge amount of "sysadmin/devops", that's only now becoming a thing since we started calling it that. Before it used to be that backend devs just had intimate knowledge of how their service was running on a system, and most likely had to login and debug a multitude of issues with it. 99% of backend devs used to (maybe not so anymore now with "devops" et al) be more than capable of administering a system and keeping it chugging along as well as setting it up. It might not be 100% bulletproof or consistent or whatever, but more than enough for a company or service starting up.
We've lost that, and now everyone thinks we need "sysadmin/devops" for simple and moderately complicated deploys. Heck, most of the guides are out there; follow them. Also, ready-made images and Docker containers are amazing these days, with config built in. If you look for or hire devops, you get K8s and all the "formal" items they'll bring with them. You don't need CI/CD for the first 6 months; just PoC the damn thing, get an MVP out and start getting users/customers, and the rest will follow.
I've worked in DevOps for a while and if I could pay $950 to not run and maintain a server then I'd consider it money well spent.
There's always 1-2 comments in these threads that advise using a Linode VM or Hetzner dedicated server in order to save money; but they are really skipping over the headaches that come with building and maintaining your own servers.
- Are the server provisioning scripts source controlled?
- Is there a pipeline if the server needs to be recreated or upgraded?
- How is the server secured?
- Are the server logs shipped somewhere? Is there monitoring in place to see if that is working?
- Does the server support zero-downtime deployments?
- Are backups configured? Are they tested?
I imagine the answer to a lot of these questions is no, and to be fair, not all PaaS systems provide all of these features (at least not for free).
The server becomes a pet, issues inevitably arise, and that $950 starts to look like a false economy.
I think this all ignores the opaqueness of most PaaS providers - when everything is on a single box, you have infinite observability, standard Linux perf analysis tools, etc.
If you do it correctly, it is loads easier to reason about and understand than a complex k8s deployment.
The amount of traffic you can serve on a bunch of Linode boxes is pretty high. I know this sounds like a boomer yelling at the clouds --see what I did here?
Kubernetes is the right solution to a difficult problem you may or may not have in your future.
> In the limit, there are some startups that could run production on a single Linux host
I guess redundancy is not really a thing then?
With serverless offerings you can get rather good deals. I don't think you need your own K8S cluster if you can get away with a single Linux host, but a single Linux host is pretty pricey maintenance-wise compared to Google Cloud Spanner and Cloud Run.
Most of the time it really doesn't need to be. In the end what you care about is uptime and cost. A redundant solution doesn't have a perfect uptime just because it's redundant, in fact sometimes it might have even less uptime because of failures in the redundancy mechanism. Of course if you need to be always up it might be worth it. But for a lot of situations some downtime is acceptable and most of the time it's better to be on a simpler setup that's easier to debug and less costly to maintain, than to be on a complex one that requires more maintenance and still doesn't have perfect uptime.
Also, I wouldn't underestimate the cost of maintenance in managed solutions compared to self-hosted. It wouldn't be the first time that the managed solutions screw something up or apply some changes and you don't know whether it's your fault or theirs. You also have to add the cost of adapting your solution to their infrastructure, which is not trivial either.
> A redundant solution doesn't have a perfect uptime just because it's redundant, in fact sometimes it might have even less uptime because of failures in the redundancy mechanism
I'm fairly sure that google does a better job keeping cloud spanner and cloud run working, and their redundancy mechanisms working, than whoever runs your single linux box will do.
> most of the time it's better to be on a simpler setup that's easier to debug and less costly to maintain
Keeping a whole linux box running is more complicated and requires more maintenance than Cloud Run and Cloud Spanner.
> Also, I wouldn't underestimate the cost of maintenance in managed solutions compared to self-hosted.
It is not 0, it is just that if you manage cloud run and cloud spanner you don't manage all the other things you have to manage when you self host, and managing cloud run and spanner is really not a lot of effort, it is a lot less effort than managing a standalone database at least.
> You also have to add the cost of adapting your solution to their infrastructure, which is not trivial either.
Cloud Run can run stock standard Docker containers. You will have a bad time if your processes are not stateless, though, and you will have the best time if you have a 12-factor app, but I would not count that as adapting to infrastructure.
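For reference, deploying a stock container there is roughly one command (a sketch; the service, project, image and region names are placeholders):

    # Deploy a stateless container image to Cloud Run and expose it publicly.
    gcloud run deploy my-service \
      --image gcr.io/my-project/my-service:latest \
      --region us-central1 \
      --allow-unauthenticated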
> > A redundant solution doesn't have a perfect uptime just because it's redundant, in fact sometimes it might have even less uptime because of failures in the redundancy mechanism
> I'm fairly sure that google does a better job keeping cloud spanner and cloud run working, and their redundancy mechanisms working, than whoever runs your single linux box will do.
You'd be surprised - the major cloud providers have outages all the time.
Google in particular will have some random backing service firing 502's seemingly randomly while their dashboards say "all good".
> I'm fairly sure that google does a better job keeping cloud spanner and cloud run working, and their redundancy mechanisms working, than whoever runs your single linux box will do.
Most of the uptime loss won't come from your provider but from your applications and configuration. If you use Cloud Run and mess up the configuration for the redundancy, you'll still have downtime. If your application doesn't work well with multiple instances, you'll still have downtime.
> Keeping a whole linux box running is more complicated and requires more maintenance than Cloud Run and Cloud Spanner.
Is it? I keep quite a few Linux boxes running and they don't really require too much maintenance. Not to mention that when things don't work, I have complete visibility into everything. I doubt Cloud Run provides you with full visibility.
> It is not 0, it is just that if you manage cloud run and cloud spanner you don't manage all the other things you have to manage when you self host, and managing cloud run and spanner is really not a lot of effort, it is a lot less effort than managing a standalone database at least.
Managing a simple standalone database is not a lot of effort. For most cases, especially the ones that can get away with running production on a single box, you'll be fine with "sudo apt install postgresql". That's how much database management you'll have to do.
> Cloud run can run stock standard docker containers, you will have a bad time if your processes are not stateless though, and you will have the best time if you have a 12-factor app, but I would not count that as adapting to infrastructure.
That definitely counts as adapting to infrastructure. For example, if I want to use Cloud Run my container should start fairly quickly; if it doesn't, I need an instance running all the time, which increases costs.
I'm not saying Cloud Run/Spanner are bad. They have their use cases. But for simple deployments it's more complexity and more things to manage, and also more expensive. If doing "apt install postgres; git pull; systemctl start my-service" works and serves its purpose, why would I overcomplicate it by going to redundant systems, managed environments and complex distributed platforms? What do I stand to gain, and at what cost?
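To be concrete, that entire "pipeline" can be a script along these lines (a sketch; the service name and paths are made up):

    #!/usr/bin/env bash
    # The whole deploy for a hypothetical single-box app on Debian/Ubuntu.
    set -euo pipefail

    sudo apt-get install -y postgresql          # standalone database
    cd /srv/my-service && git pull              # fetch the new version
    sudo systemctl restart my-service           # pick it up
    journalctl -u my-service -n 20 --no-pager   # quick sanity check of the logs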
I'm getting tired of the "You don't actually need Kubernetes while starting out!" crowd, despite being part of it. Of course you don't. Of course if you don't know Kubernetes, learning it as you try to get a company going on a minimum headcount is not the most efficient approach.
But for Pete's sake, man, if you have used K8s before, know what you're doing and you're running on cloud, just shove a couple of off-the-shelf Terraform modules at the problem and there are your execution environment needs solved, with easy and reliable automation available for handling certs, external access, load balancing, etc., all of it reasonably easy to get going once you have done it before.
Stop pretending Kubernetes is this humongous infrastructure investment that mandates a full time job to keep up at low scale. Of course if you have done this multiple times before you don't need to be told this, but people new to it shouldn't be fed a barrage of exaggerations.
I've seen projects fail or take 10x as long due to choosing to go with k8s day 1 instead of starting with even a basic VM.
It may be easy enough to get k8s itself setup. However, without an in-house expert sat within the dev team, all the access/permissions to resources to/from the cluster were hell. NFS disk Read/Write, S3 Read/Write/List, RDS DB access, Kafka access, corp network access to web server within k8s.. and then all the daemons you need to actually be able to monitor your logs/health of apps/etc.
Each of these things in isolation would be no big deal, except that they'd take a different subset of 10 different infra/devops team members 1-2 weeks to sort the first time, in some fragile manner that frequently fell over.
All of these for an application that was replacing something basically owned by 1 dude as 10% of his job, running on a 10 year old server under cron, with a big disk attached.
It's not like VMs have magical simplicity guardrails. I run all my personal stuff in K8s because it's what I know, and it's all pretty set-it-and-forget-it. If it really is that simple, or you're comfortable with Ansible, great! But I've seen lots of issues on VMs with mountains of custom deploy scripts, or OSes that can't be upgraded, or manual configurations people don't understand.
This sounds like projects struggling because they chose to go with k8s when the team lacked k8s expertise.
Honestly I feel like the original blog post shouldn't be talking about k8s at all. The real lesson, as far as I can tell, is just "when you're starting out, choose simple tech that you already know how to use."
If you already know how to use Kubernetes, then feel free to use it. If not then don't.
Don't you have to do all of this stuff even if you're not running k8s? Or are you complaining about a re-write/replacement of something that didn't need to be?
As a k8s novice, it seems apparent to me that access to physical resources from pods requires more levels of access configuration than bare metal having access to the same physical resources.
For example NFS you need to make sure the underlying hardware the k8s is running on has the mount, and then that the pods within have that access.
Whatever host level config you need to do on the VM needs to be basically replicated within each of your container images for the pods you want to run, no?
Then you have the same update cycle except across N container images instead of 1 VM?
> Stop pretending Kubernetes is this humongous infrastructure investment that mandates a full time job to keep up at low scale.
It is a full time job. The cost of using something is not just the setup cost. The same way that software development is not just the cost of writing the software.
Indeed. Hosted k8s has been low maintenance in my experience, much lower than managing a bunch of VMs. I think the common thread between people who have horror stories using k8s is either pains from self-hosting or teams who adopted it without experience and used the ample rope it gives to hang themselves.
[OP here] It feels bizarre saying this, having spent so much of my life advocating for and selling a distribution of Kubernetes and consulting services to help folks get the most of out it, but here goes! YOU probably shouldn't use Kubernetes and a bunch of other "cool" things for your product.
Most folks building software at startups and scale-ups should avoid Kubernetes and other premature optimisations. If your company uses Kubernetes, you are likely expending energy on something that doesn't take you towards your mission. You have probably fallen into the trap of premature optimisation.
Please don't take this post to be only aimed against Kubernetes. It is not. I am directing this post at every possible bit of premature optimisation engineers make in the course of building software.
Even if they know it, unless they absolutely need to implement it, why would you waste such valuable resources on of all things infra? (Unless your product IS infra). No engineer I know who’s smart enough to effortlessly deploy k8s on their own would want to do that as the job. There’s a million other interesting things (hopefully?) that they can be doing.
Not at all. If you know Kubernetes well and your needs are fairly simple it takes no time at all.
On a project a couple of years ago my cofounder and I opted to use Kubernetes after running into constant frustration after following the advice to keep it as simple as possible and just use VMs.
Kubernetes makes many complicated things very easy and lets you stop worrying about your infra (provided you’re using a managed provider).
On our next project we used a PaaS offering because we had a bunch of credits for it and it was painful in comparison to what we had with Kubernetes. It was way more expensive (if we didn’t have credits), the deployments were slow, and we had less flexibility.
Kubernetes isn’t perfect, far from it. But for those who know it well it is usually a good option.
Often to get access to things that integrate well with kubernetes.
Personally I don't see too much difference between kube yamls and systemd units and cloudformation or whatever and the "cluster maintenance" itself is not the burden it used to be if you stay on the "paved road" provided by your cloud.
You can get away without an orchestrator right up until about when your ARR hits $10mm
Hiring people with k8s experience in 2022 is not a difficult task, it's not an obscure technology anymore, and when your initial crop of devops decides to leave, the new guys coming in can scan your deployments and namespaces and pretty much hit the ground running within 48-72 hours. That's a big, important part of running a business. Being able to hire people with the right skill set, and being able to keep the lights on.
Bespoke systems built on top of EC2 require time, extra documentation and a steep learning curve, not to mention the fact that the bespoke system probably isn't getting security audits or being built to modern standards or best practices, tooling isn't being kept up to date. I can build a mechanical wristwatch in my garage, but if I had full access to the Rolex factory floor for free, I'd probably take that option.
I just came out of a project where Kubernetes performance issues involved wild guesses and blind tuning until the so-called experts actually found out why the cloud cluster was behaving strangely, including support from the cloud vendor.
And good luck making sense of all the YAML spaghetti available for bootstrapping the whole cluster from scratch.
If your devops guys are struggling, you are hiring the wrong devops folks, or at the wrong end of the pay band. Most devops guys I know are paid in the same band as a senior developer.
Sure, and it is like bug-free C code: it is only a matter of having the cream of the crop. Pity there aren't enough of them in the world, including on cloud vendor support teams.
Running a self-hosted Kubernetes is indeed... questionable, but a managed Kubernetes? That's a pretty sane thing to do IMO. The alternatives are running your Docker containers manually on some EC2 or other bare-metal server, which makes deployments a nightmare, or using something like Elastic Beanstalk, which is even worse.
For me at least, Kubernetes has become something like a universal standard: if you're already running Docker or other containerization, "using Kubernetes" is nothing more than clear, "single source of truth" documentation of what infrastructure your application expects. Certainly beats that Confluence document from 2018, last updated 2019, on how the production server is set up, which didn't even match reality in 2020, much less today.
Of course, managed means you don't take care of say, creating your own X.509 CA and issuing certificates, tending to their expiry, installing Tiller, setting up pod networking etc. etc.
All of these are much more annoying and harder than `helm upgrade --install`ing some charts into your own cluster.
To be fair, installing Tiller hasn't been a thing for years. And pretty much everyone is using cert-manager and Let's Encrypt, which makes the whole X.509 story pretty much a no-brainer.
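For anyone who hasn't seen it, the whole X.509 story boils down to roughly this once cert-manager is installed (a sketch; the email, issuer name and ingress class are placeholders):

    # Create a cluster-wide Let's Encrypt issuer.
    kubectl apply -f - <<'EOF'
    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: letsencrypt-prod
    spec:
      acme:
        server: https://acme-v02.api.letsencrypt.org/directory
        email: ops@example.com
        privateKeySecretRef:
          name: letsencrypt-prod-account-key
        solvers:
          - http01:
              ingress:
                class: nginx
    EOF
    # From here, any Ingress annotated with
    #   cert-manager.io/cluster-issuer: letsencrypt-prod
    # and carrying a spec.tls section gets its certificate issued and renewed automatically.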
What about the standardisation that comes with using a framework like Kubernetes? While not using k8s, you end up with ad-hoc deployment methods and clunky workarounds for handling networking, policies, secrets, etc. Using Kubernetes or even ECS signals that the team or the developer is looking to use a fixed set of rules for infrastructure. Also, k8s scales well even for smaller apps.
Seriously, I use k8s for the same reason I use docker: it's a standard language for deployment. Yeah I could do the same stuff manually for a small project.. but why?
> Imagine spending a lot of time and money picking out the best possible gear for a hobby before actually starting the hobby.
Haha, this is exactly what “hobby” means to a lot of people. Less judgmental: thinking and dreaming about the right tools in disproportion to the need is something people do a lot, presumably because it is a source of joy.
That bit was pretty interesting to me as well, because that's exactly what many people do. I don't agree that this is what "hobby" means, it's more that we tend to try to buy our way into a life style.
I feel that Kubernetes is exactly that for a large number of developers (operations people less so): they want to be part of a professional, world-class, trendy environment/community. So they try to leverage Kubernetes to elevate themselves to that Silicon Valley type of tech company.
There's absolutely value to be had from Kubernetes, but I agree with the article, it's not for the majority. Except I think there is one valid reasoning: If you purely see Kubernetes as tooling for deployment, it's hard to present any good alternatives for deploying to VMs or physical hardware.
You know these types of people who like to wear army clothes to look bad-ass even though they are wimps?
I see them as the same type of folks as the ones using FAANG tooling :) just so they can feel better about their job, or feel more important, when they work on business-line CRUD.
I am working on business line CRUD and I like it with boring servers :)
I don't know. I default to GKE for all new deployments. It really reduces the mental overhead of infrastructure for me.
I can add services, cronjobs and whatnot as I see fit in a standardized manner and not worry about it. A couple of YAMLs get you started and you can scale as you see fit. Anything you would like to deploy can share 1-2 vCPU. The whole thing is also relatively portable across clouds, or even on-premise if you want it. Since everything is in containers, I have an escape hatch to just deploy on fly.io, Vercel or whatnot.
I get the criticism, that people over complicate projects that maybe don't even have traction yet. Bashing k8s is the wrong conclusion here IMHO.
It feels a bit like people bashing React and SPAs just because what they have built so far never needed it. Just stick to your guns and stop evangelizing your way of doing things.
I've seen cases where we started off as simply as possible with no k8's. We built the initial product really quickly using a ton of managed services. Whilst it was great to get us going, once we hit "growth" things just didn't scale. (1) The cost of cloud was getting astronomical for us (and growing with each new deployment) and (2) it was totally inflexible (whether that be wanting to deploy in a different cloud, or extend the platform's featureset) because we didn't own the underlying service.
We ended up porting everything to k8's. That was a long & arduous process but it gave us total flexibility at significant cost savings. The benefits were great, but not everyone has access to the engineers/skillset needed to be successful with k8's.
That's why we built Plural.sh – it takes the hard work out of k8's deployments. I've seen people go from zero to a full production deployment of a datastack on k8's in just 2 weeks. It deploys in your cloud, and you own the underlying infra and conf so you have total control of it. And because we believe in being open, you can eject your stack out of plural if you don't like it and keep everything running.
I think it's funny that everyone uses/loves k8s ends up building a product to make it easier to use. There are a lot of examples in the comments here. To me, that's enough of a red flag that k8s doesn't understand their users or can't meet them where they're at.
> We ended up porting everything to k8's. That was a long & arduous process but it gave us total flexibility at significant cost savings. The benefits were great, but not everyone has access to the engineers/skillset needed to be successful with k8's.
I think the argument from most people advocating against using k8s from the get go isn't so much that you'll never need it, but more that it's better to pay this cost later when you have a demonstrated problem that needs to be solved, even if it's a bit more expensive (and I would bet that the total time expenditure isn't really all that much more moving to k8s later, it's just in one chunk instead of spread over your history).
By deferring the cost of building more complex infrastructure:
- You can use those resources to advance your product (arguably some companies that have hit a wall with PaaS vendors may never have gotten to the point where they outgrew those vendors if they spent more of their time on infrastructure vs. company value)
- You'll have a better idea of what you'll need and where your scaling hotspots are going to be if you've already encountered them. Building infrastructure that allows for flexibility in a few key areas is infinitely easier than building infrastructure that can scale in every way imaginable.
- You can potentially take advantage of new technologies if you defer the decision to when you need it. It would be silly to assume k8s is the final word. If you wait a few years, things will have evolved and you gain the advantage of using those few years of advancement rather than cementing a choice early on.
Had the same exact experience! Just posted a comment. I think cloud companies have done a good job marketing how cheap it is to get started on cloud. Once things scale, the bills change and it's no longer as cheap to use managed infra.
The night and day difference is hard to explain when people haven't had to deal with all these issues on top of scaling, uptime, performance issues etc. I just can't recommend k8s enough
I feel like most of these rants come from people who never built the alternative to kubernetes to support a modern workflow (CI/CD with branch deploys, monitoring, access control etc).
I love Kubernetes because I don't need to build bespoke platforms at every company I join. I probably would have switched careers by now if I still had to deal with site-specific tooling that all essentially implements a worse version of what Kubernetes has to offer.
I'll be honest - i've worked places where kubernetes was a thing, and places where it wasn't. Both within the last 5 years.
Kubernetes is a layer of complexity that just isn't warranted for most companies. Hell even Amazon still sticks with VMs. Autoscaling and firecracker solve most things.
It's nice to have cluster (pod?) management as a first-class concept, but you get that with tagging of instances just as well for most use cases.
Amazon is IMHO an anti-example - they have a huge installed base of tooling to handle VMs, and most importantly, they have a huge scale of capital available. Throwing single-use VMs at the wall is cheap for them.
Meanwhile, I was chugging along with k8s because it was a balance between "easy to deploy our workloads with minimal time spent" and "increasing the cost by a few pizzas will cause noticeable loss, so pack those VMs tight".
I don't get these complaints AT ALL. I don't use Kubernetes, simply because I am running apps in managed environments, and have been using docker-compose with VS Code remote to emulate those environments. But being able to define your resources and how they are linked via a schema makes sense even from a dev perspective. Isn't that all that Kubernetes is doing at its most basic? Sounds like that saves time to me over manually setting everything up for every project you work on.
From what I've seen with Kubernetes, the problem is that in order to define the resources and links with a schema, it needs several abstractions and extra tooling. So you not only need to define the schema (which isn't that trivial) but also understand what Kubernetes does, how to work with it, and how to solve problems. It's a tool that adds complexity and difficulty, if you're not using the advantages it provides (mainly multi-node management and scale capabilities) you're just handicapping yourself.
This is a common misconception. k8s isn't about scale, multi-node or even reliability/resiliency. We had solutions for all of that before it came along.
It's about having a standard API for deployment artifacts.
The k8s manifests are trivial (if verbose); the complexity comes from running the underlying layer, which, when you are small, you simply outsource to AWS or GCP.
There is some k8s know-how that is table stakes for a good experience, namely knowing what extra stuff you need on top of a base k8s cluster, e.g. cert-manager and external-dns.
Overall it's a lot less required knowledge than it takes to manipulate lower level primitives like GCP/AWS directly or be capable of setting up standalone boxes.
Before anyone starts with "but serverless!", you need to consider the spaghetti you end up with if you go down that route: API Gateway, 100x Lambdas and thousands upon thousands of lines of Terraform boilerplate does not a happy infra team make.
> Overall it's a lot less required knowledge than it takes to manipulate lower level primitives like GCP/AWS directly or be capable of setting up standalone boxes.
I do not agree with this. I have tried to start Kubernetes and it's far more confusing to set up than just setting up something on a standalone box, mainly because it looks like the set of knowledge I need to set up anything on Kubernetes is a superset of what I need to set up the same thing in bare Linux.
It's really not unless you are doing a half-assed job of setting up bare boxes.
Let's see, to get something even half-reasonable on a bare box you need at minimum:
process monitor: For this you can use systemd, supervisord, etc.
logging: rsyslog or similar.
http reverse proxy: nginx or haproxy
deployment mechanism: probably scp if you are going this ghetto, but git pull + build and/or pull from S3 are all common at this level, plus a bunch of bash to properly restart things.
backups: depends on persistence; Litestream for SQLite these days, pg_dump/whatever for an RDBMS, WAL shipping if you are running the DB on your own boxes (see the sketch after this list).
access control: some mechanism to manage either multiple users + homes or shared user account with shared authorized_keys.
security: lock down sshd, reverse proxy, etc. stay on top of patching.
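To make just one of those bullets concrete, here is roughly what the backups item alone turns into (a sketch, assuming a hypothetical local Postgres; the DB name and paths are made up):

    # Install a nightly logical backup job, keeping the last 7 days.
    cat <<'EOF' | sudo tee /etc/cron.daily/backup-mydb
    #!/bin/sh
    sudo -u postgres pg_dump mydb | gzip > /var/backups/mydb-$(date +%F).sql.gz
    find /var/backups -name 'mydb-*.sql.gz' -mtime +7 -delete
    EOF
    sudo chmod +x /etc/cron.daily/backup-mydb

Every other bullet needs its own little pile of glue like this.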
So at the bare minimum there are a ton of concepts you need to know; you are essentially regressing back into the days of hardcore sysadmins... just without all the hardcore sysadmins to do the job correctly (because they are now all busy running k8s setups).
Meanwhile, to deploy a similarly simple app to k8s (assuming you are buying managed k8s) you need to know the following things (a concrete sketch follows below):
deployment: Describes how to run your docker image, what args, how much resources, does it need volumes, etc. You can go ghetto and not use service accounts/workload-identity and still be in better shape than bare boxes.
service: how your service exposes itself, ports etc.
ingress: how to route traffic from internet into your service, usually HTTP host, paths etc.
Generally this is equivalent but with way less moving parts.
There is no SSH server; patching is now mostly just a matter of pressing the upgrade button your provider gives you. You now only need to know Docker, kubectl and some YAML instead of systemd unit files, sshd_config, nginx.conf, bash, rsync/tarsnap/etc.
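Concretely, all three objects for a small stateless web app are on the order of this (a sketch; the image, hostname and resource numbers are made up):

    kubectl apply -f - <<'EOF'
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 2
      selector:
        matchLabels: { app: my-app }
      template:
        metadata:
          labels: { app: my-app }
        spec:
          containers:
            - name: my-app
              image: registry.example.com/my-app:1.0.0
              ports: [{ containerPort: 8080 }]
              resources:
                requests: { cpu: 100m, memory: 128Mi }
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: my-app
    spec:
      selector: { app: my-app }
      ports: [{ port: 80, targetPort: 8080 }]
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: my-app
    spec:
      rules:
        - host: app.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: my-app
                    port: { number: 80 }
    EOF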
K8s reputation as being hard to run is deserved.
K8s reputation as being hard to use is grossly undeserved.
I mean, container + registry is possibly more complex to start and maintain than just creating a service file with systemd. Deployment can be as easy as "systemctl restart service", and it seems to me that configuring all of those resources in Kubernetes is far more difficult than just setting up a simple service on a bare box. Not to mention that you could use Docker too.
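For reference, the kind of service file I mean is tiny (a sketch; the service name, user and paths are made up):

    # Install and start a hypothetical service under systemd.
    cat <<'EOF' | sudo tee /etc/systemd/system/my-service.service
    [Unit]
    Description=My web service
    After=network.target

    [Service]
    User=myservice
    WorkingDirectory=/srv/my-service
    ExecStart=/srv/my-service/bin/server
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF
    sudo systemctl daemon-reload
    sudo systemctl enable --now my-service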
And by the looks of it, Ingress doesn't seem trivial: not only do you need to understand how NGINX works as a reverse proxy, but also how Kubernetes interacts with NGINX. You also ignored backups for Kubernetes; if you're just doing full-disk backups, you can do that with regular boxes too, especially if they're just VMs.
> There is no ssh server, patching is now mostly just a matter of pressing the upgrade button your provider gives you.
No SSH server might be an advantage for you, but for me it means it's far more difficult to know what's happening when Kubernetes doesn't do what I want it to do.
> You now only need to know docker, kubectl and some yaml instead of systemd control files, sshd_config, nginx.conf, bash, rsync/tarsnap/etc.
But I still need to know how to set up all the applications I use, which is the most important part. And understanding systemd service files, sshd_config, nginx.conf and rsync is far, far easier than understanding Kubernetes. Kubernetes manages those things, so you actually need to understand both the underlying concepts and the abstractions Kubernetes is making.
You are also comparing self-hosted with managed, and then only mentioning the disadvantages of self-hosted. Of course with a managed setup you don't need to worry about SSH, but you have to worry about actually dealing with the manager service. If you were to set up self-hosted Kubernetes you'd still need to learn about SSH, or about account management. That's not something about Kubernetes but about self-hosted or managed.
You don't run your own registry (except in very rare circumstances where that makes sense). You author Dockerfile (or better yet use a tool like jib that creates containers automatically without even a Docker daemon), then you push to hosted registry.
Ingress is trivial to use. Internally it's less trivial but you don't need to peek inside unless you manage to break it which generally speaking, you won't; even if you think whatever you are doing is very special it probably does it already. That is the benefit of literal thousands of teams using the exact same abstraction layer.
You don't have SSH (this is objectively good), instead you can get exec on a container if you need it (and that container contains a shell). You can now control on a fine grained and integrated fashion exactly who is allowed to exec into a container and you get k8s audit events for free. (and if you are using a hosted system like GKE then it automatically flows into said providers audit system).
The point is you can buy managed k8s, there isn't an equivalent for old school sysadmin that doesn't amount to outsourcing to a body shop.
Given that you haven't managed to identify any of k8s' real downsides, here they are:
It's a bunch of yaml. Yeah I don't like that either but there are tools like Tanka that make that suck less.
It's complex under the hood. When your process runs, it's in network, fs and pid namespaces, and you have cgroups managing resources. There is probably some sort of overlay network, or some other means of granting each container an IP and making it routable (BGP, OSPF, etc.).
Interaction between ingress, service and pods is through an indirection layer called endpoints that most people don't even realize exists.
Scheduling is complex: there are tunables for affinity and anti-affinity, and it also needs to interact with cluster autoscaling. Autoscaling itself is a complex topic where things like hot capacity etc. aren't exactly sorted out yet.
QoS, i.e. pod priority, isn't well understood by most. When will a pod be pre-empted? What is the difference between eviction and pre-emption? Most people won't be able to tell you.
This means when things go wrong there are a lot of things that could be wrong, thankfully if you are paying someone else for your cluster a) it's their problem b) it's probably happening to -all- of their customers so they have a lot of incentive to fix it for you.
Multi-cluster is a mess, no reasonable federation options in sight. Overlay networking makes managing network architecture when involving multiple clusters more difficult (or external services outside of said cluster).
Ecosystem has some poor quality solutions. Namely helm, kustomize, Pulumi, etc. Hopefully these will die out some day and make way for better solutions.
Yet for all of these downsides it's still clearly a ton better than managing your own boxes manually.
Especially because you are -exceedingly- unlikely to encounter any of the problems above at any scale where self-hosting would have been tenable.
I think the best way to think about k8s is it's the modern distributed kernel. Just like Linux you aren't expected to understand every layer of it, merely the interface (i.e resource API for k8s, syscalls/ioctl/dev/proc/sysfs for Linux). The fact everyone is using it is what grants it the stability necessary to obviate the need for that internal knowledge.
Just want to note that we have a mix of ECS and stuff still running on bare VMs using tools like systemd, bash script deploys, etc..., and I 100% agree with you. Once someone understands container orchestration platform concepts, deploying to something like k8s or ECS, is dead simple.
> You don't run your own registry (except in very rare circumstances where that makes sense). You author Dockerfile (or better yet use a tool like jib that creates containers automatically without even a Docker daemon), then you push to hosted registry.
Dockerfiles aren't trivial. I've helped migrate some services to Docker and it's not "just put it in Docker and that's it", especially because most applications aren't perfectly isolated.
> Ingress is trivial to use. Internally it's less trivial but you don't need to peek inside unless you manage to break it which generally speaking, you won't; even if you think whatever you are doing is very special it probably does it already. That is the benefit of literal thousands of teams using the exact same abstraction layer.
Literal thousands of teams also use Nginx, and that still doesn't mean there aren't configuration errors, issues and other things to debug. Not to mention that a lot of applications can run without requiring a reverse proxy.
> You don't have SSH (this is objectively good), instead you can get exec on a container if you need it (and that container contains a shell). You can now control on a fine grained and integrated fashion exactly who is allowed to exec into a container and you get k8s audit events for free. (and if you are using a hosted system like GKE then it automatically flows into said providers audit system).
And that's cool if you need that, but "ssh user@machine" looks far easier.
> The point is you can buy managed k8s, there isn't an equivalent for old school sysadmin that doesn't amount to outsourcing to a body shop.
Look, the other day I had to set up a simple machine to monitor some hosts in a network. I have Ansible to automate it, but ultimately it boils down to "sudo apt install grafana prometheus; scp provisioning-dashboards; scp prometheus-host-config; scp unattended-upgrades-config" plus some hardening configs. I don't need redundancy; uptime is good enough with that. If Prometheus can't connect to the hosts, I can test from the machine, capture traffic or whatever, and know that there isn't anything in the middle. If Grafana doesn't respond, I don't need to worry about whether the ingress controller is working well or if I missed some configuration.
Could I run that with Kubernetes? Well, managed k8s is already out of the window because the service is internal to an enterprise network. And self-hosted Kubernetes? I still need to do the same base Linux box configuration, plus set up k8s, plus set up the infra and networking within k8s, plus the tools I actually want to use.
And that's the point. I am not trying to identify any k8s downsides; what I said is that setting up k8s looks far more confusing than setting up a standard Linux box, and from the things you say I am even more convinced than before that it indeed is. I still need to understand and run the programs I want (which is most of the cost of deployment), but I also need to understand Kubernetes and, if I am using a managed platform, how the managed platform works, how to configure it, how to link it with my system... In other words, not only do I need the general knowledge for application deployment but also its specific implementation in Kubernetes.
It would be great if otherwise smart people also learned to recognize when what's trivial to them is not trivial to the masses. K8s is not trivial, not to me at least. I'm not some super-duper engineer, but I do alright? That's all I can say.
Have you made a good faith attempt to use k8s? Or are you just regurgitating how hard it is to use based on what you hear on the Internet?
My experience is even mediocre engineers are capable of understanding k8s concepts with minimal assistance assuming things are either sufficiently standard (i.e google-able) or well documented with any company specific practices.
I have made a good faith attempt at using it, yes. Is spending an entire week on it good faith?
I did get everything running, but I also saw so many settings and features I had to take on faith as being handled, without understanding them. I just chose not to bother further because this seems like a minefield of problems in the future as we onboarded other engineers. The number of times I've seen production deployments go down due to k8s misconfiguration on other teams has only validated my concerns.
So what you are saying is everything worked? That sounds like k8s did its job.
You aren't meant to fully understand it in a week, any more than you are expected to fully understand all of sysadmin in a week.
Just because you don't know what every single directive in nginx config does doesn't mean you can't use it effectively and learn what they mean when the time comes.
k8s isn't much different. You don't need to know what a liveness probe is the first time you use it; you can learn about it as you go (most likely when you run into a badly behaved program that needs forced restarts when it hangs).
Of course, if you are running it yourself that is entirely different; you really do need to know how it works to do that. But you should be using hosted k8s unless you have hardcore systems folk and actually need that.
You're right that nginx is also complicated, but it seems like we are talking about different alternatives.
The solution I went with, given my limited knowledge of some of these things, was to use elastic beanstalk. You write your flask application, upload the zip and that’s it pretty much. You get a UI to do all the configurations and for the most part nothing there is hard to decipher or google. The only hiccup might be when you’re trying to connect it to RDS and to the external internet but even that is straightforward as long as you follow the right instructions. We run apps that power entire SaaS organizations and this system seems to be more than sufficient. Why complicate further? We have other fish to fry anyway.
I am not an SDE, but I did try one weekend to set up a self-managed k8s cluster at my previous company. I did have some previous knowledge of using k8s in GCP. But when trying to set it up on a completely new cluster outside of GCP, I ran into some issues where I felt I was out of my depth. I think I unknowingly exposed the whole cluster to the public internet.
On the other hand, with Docker Swarm, I built a cluster in less than a day. So yeah, setting up k8s isn't as trivial as you are making it out to be.
Guys, Kubernetes is a container platform for multiple nodes. I know it seems hard to understand from the outside, but it's really not. You would naturally come up with ALL the same components if you were to take your container strategy onto multiple computers.
What if you don't need multiple servers? Well, go the single-node approach and have a flexible, tried and tested way to spin up containers, which can and should be able to crash whenever they have to.
Is using containers premature optimization too? Maybe I should get my typewriter.
Well, I grew up provisioning and using dedicated servers, and nowadays you get much tighter security from the container ecosystem than I would on a dedicated server.
I have just provisioned another bare-metal k8s cluster, taking software off dedicated servers, and... well, if I don't have to, I'm not running bare metal anymore in 2022.
If you want to just "get a computer in there", have a look at Harvester; it's the best of both worlds.
Having repeatable infrastructure from day 1 is great. Kubernetes is the simplest way to have that. It's not the only way, but it's provider-agnostic, has a lot of well-maintained and well-understood tooling around it, and cleanly splits artifacts from deployment (i.e. no scripts configuring stuff after startup; everything is packaged and frozen when you build and upload the container image).
> Solve problems as they arise, not in advance.
While this does make sense for supporting varying requirements the lean way, it fails to address the increased costs that rearchitecting a solution mid-lifecycle incurs.
> Do more with less.
Goddamn slogan-driven blogging. What is the proposed solution? It doesn't say. Are we supposed to log in to every prod/test/dev machine and maintain the deps by hand with yum/apt? Write our own Chef/Puppet scripts? How is that better than Docker images running on Kubernetes? The comparison between solutions is the interesting part.
The OP never says. Guess "works on his PC" is enough for him; we can only assume he envisions a battery of Mac laptops serving pages from a basement, with devs cautiously tiptoeing around network cables to deliver patches via external USB drives.
There are a lot of valid reasons why a start-up might prefer Kubernetes to other solutions.
A simple and good enough reason is the need for variable compute:
You might need more computing sporadically (say for example you're doing software in the sports industry; your demand changes with the number and size of events, as well as on weekends).
Another reason might be a start-up that allows its customers to execute some type of arbitrary code (e.g. a no-code product). This can vary from customer to customer, and it can also vary within each customer's use cases.
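To make the variable-compute case concrete: on a managed cluster with the cluster autoscaler enabled, the application side of it can be a one-liner (a sketch; the deployment name and thresholds are made up):

    # Scale a hypothetical deployment between 2 and 20 replicas based on CPU usage;
    # the cluster autoscaler then adds or removes nodes to fit the replicas.
    kubectl autoscale deployment event-ingest --min=2 --max=20 --cpu-percent=70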
Imagine having to manage all these storage, networking and compute resources manually... Or going full circle and managing it with a bunch of puppet/ansible/shell scripts. Now you slap some automated scale triggers on top that fires off these scripts.
Congratulations! We've built something that looks like Kubernetes, if we squint. There's some smoke coming out of it. Documentation? Lol, we don't have time for that. We are a company that gets shit done; we don't toy around with Shcubernetes and documentation! Error handling/monitoring/reporting? Eh, just read the code if something fails. Need to add cert issuance? Yeah, let's implement our own ACME integration. Network ingress? Let's just have a 500-line haproxy config in front of a 2000-line nginx config; no big deal. DNS? Just append to named; who cares?
The name for the "we get shit done" crowd should probably be "we don't give a shit about anything and have others solve problems we created because of our lack of thinking and foresight", but it doesn't sound quite as memorable.
It's just people who are comfortable cutting corners at other people's expenses. When they have to own the shit they made, they start blaming others and leave the company.
I remember working with a client in the last 5 years that demanded a Kubernetes cluster to run custom analytics on very fast incoming data streams (several GB per hour). By "custom analytics" I mean, python scripts that loaded a day's worth of data, computed something, wrote the results to disk, and quit.
During development of the scripts, the developers/data scientists wrote and tested everything outside of the cluster, since they were simple scripts. They had no problem grinding through a day's worth of data in their testing. But going into prod, it had to be shoved into the cluster. So now we had to maintain the scripts AND the fscking cluster.
Why?
"What if the data volume increases or we need to run lots of analytics at once?"
"You'll still be dominated by I/O overhead and your expected rate of growth in data volume is <3% per year. You can just move to faster disks to more than keep up. Also there's some indexing techniques that would help..."
Nope, had to have the cluster. So we had the cluster. At the expense of 10x the hardware, another rack of equipment, a networking guy, and a dedicated cluster admin (with associated service contracts from a support vendor). It literally all ran fine on a single system with lots of RAM and SSDs -- which we proved by replicating all of the tasks the cluster was doing.
I just don’t understand this type of article. If you understand how to use and deploy kubernetes and you have confidence in it… you should use it. What you spend in extra infra costs is trivial.
…and if you don’t understand how to use and deploy kubernetes just what in the fresh hell are you doing? Stick with technology you know and then move to kubernetes later if or when you need it.
We’re a relatively small shop, and we’re using k8s. All of our senior staff know and understand it. The pipeline is fully automated. If we’re prematurely optimizing anything, it’s developer output over saving money.
“So you want to run a bunch of stuff on one computer, why?”
In a quest to get closer to the metal, Kubernetes keeps you far away, which is the opposite of what any production service should want.
What is the purpose of adding layers when unikernels and exokernels give you better isolation and better performance?
Your cloud provider already runs your virtual machines OS on a hardware hypervisor. Then running Kubernetes on top of the OS and then a zillion containers on it is a recipe for poor performance.
What is the logic here that I am clearly missing?
Cloud providers’ native Kubernetes stacks don’t improve performance or pricing compared to their plain compute-instance virtual machines - a dedicated virtual machine per would-be container, which thankfully doesn’t have to share processing resources with others.
What gives? Why on earth would anyone run production processes in Kubernetes?
Merely making scriptable infrastructure doesn’t require Kubernetes or containers… Ansible or its ilk will do just fine.
Deploy cloud instances like you currently deploy containers.
Save money, get better performance, have real networking.
*I have used Kubernetes and understand how to use it as intended. I feel like I am missing the motivation for its current widespread usage in web apps.
The advantage of K8s isn't really in the virtualization technique used, but in the orchestration it can be made to perform for you. You can for sure configure K8s to use a host per container, if this is what you want.
An example of something that is pretty straightforward in K8s and much less straightforward outside of it:
1. For compliance reasons, you need to make sure that your underlying OS is patched with security updates.
2. This means you need to reboot the OS every X time.
3. You want to do this without downtime.
4. You have a replicated architecture, so you know you have at least two copies of each service you run (and each can handle the required traffic).
In K8s, this operation can be as simple as:
1. Mark your old nodes as unschedulable.
2. Drain your old nodes (which will grab new nodes from your cloud provider, installing the updated machine image).
3. Delete your old nodes.
The exact steps will differ based on your use case, but that's basically it.
Steps you didn't need to think about here:
1. If I'm about to take a node down, do I have enough redundancy to handle its loss? K8s understands the health of your software, and you have a policy configured (a PodDisruptionBudget; see the sketch after this list) so it understands whether taking down a container will cause an outage (and avoids doing so). Note: third-party tech will be naturally compatible - if you use Elastic's cloud-on-k8s operator to run ES, it'll appropriately migrate from host to host too, without downtime. Likewise, the same script will run on AWS, Azure, and GCP.
2. How fast can I run this? If building this logic yourself, you'll probably run the upgrade one node at a time so as to not have to think about the different services you run. But if it takes 15 minutes to run a full upgrade, you can now only upgrade 100 hosts each day. K8s will run whatever it can, as soon as it can without you having to think about it.
3. What happens if concurrent operations need to be run (e.g. scale-up, scale-down)? With K8s, this is a perfectly reasonable thing to do.
4. Does this need to be monitored? This is a fairly standard K8s workflow, with most components identical to standard scale-up/scale-down operations. Most components will be exercised all the time.
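For point 1, the policy in question is a PodDisruptionBudget. A minimal sketch, assuming a hypothetical deployment whose pods are labelled app: my-api and which runs two replicas; with this in place, a drain will never evict both copies at once:

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: my-api-pdb
    spec:
      minAvailable: 1          # eviction during a drain must always leave at least one pod running
      selector:
        matchLabels:
          app: my-api          # matches the pods of the hypothetical my-api deployment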
Generally I've been impressed by how straightforward it's been to remove the edge cases, to make complex tech fit well with other complex tech.
A while back we upgraded between two CentOS versions. In such a case it's recommended to reinstall the OS - there's not a clear upgrade path. In K8s, this would have been the same set of steps as the above. In many orgs, this would be a far more manual process.
It deduplicates the kernel memory and system image base disk.
The minimum virtual machine size for a Windows server that is at all useful for anything is 4 GB of memory. Okay, okay, so you can technically boot it up on 2 GB and some roles will work fine, but that lasts only until some dingbat remotes into it over RDP from a 4K monitor and it starts swapping to disk.
Even if you use Server Core and block port 3389, it still needs a ton of memory just to start.
Running in a container it uses a few hundred megabytes.
Similarly, the minimum system disk size you can get away with is 32 GB if it is a discardable / ephemeral instance. You need 64 GB minimum if you ever intend to run Windows Update on it.
With containers, the unique parts of the image might be just a few hundred megabytes, even for complex apps.
My experience is with Windows, but from what I hear Linux VMs vs Linux containers have vaguely similar ratios.
So with containers, a single host can run dozens of applications, all sharing the same base disk, and all sharing the same OS kernel. The savings can be staggering.
At $dayjob, the admins are very much stuck in the dedicated VMs for every role mentality, and they're burning through enormous piles of taxpayer money to run them at literally 0.1% load.
Having said that, Kubernetes has its own problems. As you said, layering it on top of cloud VMs is a bit silly, and can easily result in the container running in a nested hypervisor at molasses speeds. Similarly, every single typical process changes dramatically: Deployment, updates, monitoring, auditing, etc...
Combine the above with the incompatible underlying cloud layer and things get really messy really quickly.
In my experience 90% of the world just isn't ready for the learning curve. Windows as an operating system isn't ready, certainly. Microsoft Azure isn't really ready either. Their AKS managed offering is still undergoing massive churn and seems to have more preview features than stable features. Even in the Linux world I hear more horror stories than success stories. It seems that everyone who says they love Kubernetes is using it on like... one machine. Come back and tell me how you feel after troubleshooting a failed upgrade on a cluster managing $100M of finance transactions.
What I would like to see is "native Kubernetes clouds" where the hosts are bare metal and there is no impedance mismatch between K8s and the cloud provider APIs because K8s is the API. Instead of the Azure portal or the AWS console you literally log into a Kubernetes console.
IMHO that would allow a true commoditisation of the public cloud and start to erode the near-duopoly of AWS and Azure.
I think exokernels and isokernels solve many of these issues where containers are currently used; check the OCaml community for examples.
They run on hardware.
Ultimately, there needs to be a singular scheduling system running on hardware and a singular HAL-like driver layer, and exo- or iso-kernels deliver just that, vs. LXC containers provided by OS services.
Even if they are configured with fractional CPU (which mostly causes stupid throttling issues) they are still headaches to maintain. I used to run 4000 pods in a single cluster, and using Kubernetes was great. Otherwise, nowadays I'd just go with Fargate/Cloud Run/Fly.io etc.
Having worked with Kubernetes, it's great - I think even smaller setups can benefit from it and its approach (especially when using a managed provider).
But for startups and/or simpler setups, ECS/Lambda is so much less work, while usually being powerful enough.
I've looked at it a few times but haven't been able to get my head around the concept. Reading the intro documentation I haven't been able to map the concept of "nodes" to a server, database, and GPU server.
* You describe what type of state you need (Deployment = one or more Pods running your container, Service = expose the deployment)
* Kubernetes runs a control loop that reconciles the current cluster's state with your desired state
Nodes are the virtual machines on which everything runs; you don't need to interact with them unless you have special requirements (e.g. GPU instances). In that case, you add a label to the GPU nodes (like `has-gpu=true`), then in your deployment you add a node affinity saying it needs to run on nodes that have that label. Kubernetes will schedule it for you if there's any node matching.
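A minimal sketch of that, assuming the has-gpu=true label above and a hypothetical gpu-worker image; the nodeSelector shown here is the simplest form of node affinity:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: gpu-worker                 # hypothetical workload
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: gpu-worker
      template:
        metadata:
          labels:
            app: gpu-worker
        spec:
          nodeSelector:
            has-gpu: "true"            # only schedule onto nodes carrying this label
          containers:
          - name: worker
            image: registry.example.com/gpu-worker:latest   # hypothetical image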
The nodes are just the individual machines in your K8s cluster.
Which one(s) run your API/server, your database, etc. is basically arbitrary. You’re free to let the K8s scheduler do as it sees fit, or you can get increasing degrees of control by adjusting things like node and pod affinity/anti-affinity, etc.
K8s signals the team values infra and sees it as an important part of delivering their product.
It's an investment like any other and for some teams it makes sense to set infra on a solid path sooner rather than later.
You can move fast on things like serverless, but the limitations of those systems can force architecture decisions that are sub-optimal, whilst control of your own infrastructure allows broader interpretation of problems and more holistic solutions.
i.e. it opens up building things "the right way" sooner in many cases, and that can save a ton of time and frustration.
And in the case where it's obvious you will outgrow serverless, it saves on a sometimes painful migration effort.
> K8s signals the team values infra and sees it as an important part of delivering their product.
But isn't "infra[structure]" what you run your product on, and not part of the product itself? If your "product" is so big it needs stuff like that built into it, then it feels to me that by defintion that stuff isn't "infra".
All in all, your sentence feels like "The Excel-Dell Humongo-XPS bundle signals that Microsoft values infra and sees it as an important part of delivering their product" to me. No, that only signals that A) Dell and Microsoft are using the dominant marketing position of Excel to milk money from naive customers, and B) If Excel can't run on more modest hardware, it's an unnecessarily bloated and inefficient product; look around for another spreadsheet.
First of all k8s isn't bloated, it's actually surprisingly compact if you think about what you are getting.
Second it doesn't imply the product -needs- it. Anything you can run on k8s can be run on bare boxes.
Just doing so probably means you either lack the scale to need k8s (i.e my spreadsheet doesn't have many users) or you lack the competence to understand you have reached the scale that it's required (i.e your spreadsheet is run by n00bs).
So it's much more of a positive than negative signal if you are seeking a vendor.
I think the product relationship thing is a poor analogy from that side though, it -mostly- matters for people that want to work there on/in/around the product itself.
As a primarily infra guy I have a preference for k8s because it allows me to have some reasonable set of services I can rely on being there if a place is "using k8s". If not I just have to hope either the existing infra team is competent enough to have things like service discovery, secret management, process monitoring, etc under control or that they are amenable to efforts to get it under control.
If I am approaching it from a backend developer perspective where most of that isn't my problem then it gives me some confidence the infra team has some idea what they are doing and that I won't need to deal with some sort of bespoke deployment system that has endless quirks and company-specific bugs that I shouldn't need to concern myself with.
Generally k8s just eliminates uncertainties by increasing standardisation so I don't need to watch people solve the same problem poorly over and over again for no reason at all.
Personally I believe you should try to avoid as many services as possible. Why do you want to rely on a service? What if the service says "bye bye"? You'd be like... ok, I guess our software doesn't work anymore... RIP. I also feel like many developers use services to hide the fact that they just suck at solving certain problems. E.g. we need to deploy... ok, we need this 3rd-party service that will deploy for us, we'll give them access to all our servers and therefore all our data, and that's absolutely not weird, plus you'll have to read their docs for a few hours to know how it works... ok... how about I write a simple deploy script in 5 minutes? "NO WTF! Are you insane??? Everything will crash and burn unless we use this service." That's basically how my conversations go with these other devs. Really annoying. I just gave up on them and let them do what they want; you can't change their minds.
Tim Hockin (one of Kubernetes' creators) supports the idea of using something as managed and automatic as possible if you don't have time for it [1]. He is probably referring to using Cloud Run [2] over a (managed?) Kubernetes cluster (such as EKS or GKE).
The big missing piece of this article is a sense of at what scale and why should a startup decide to invest in a piece of infrastructure like Kubernetes.
The author mentions other things he considers red flags such as using a different language for backend and frontend development with no additional context.
Is the author talking about a startup in the context of one person who just knows JavaScript working on their own building a prototype? Is he talking about a series B company with 500k MAUs?
Some additional context would improve the article a lot. I think the author should have had a few people read over the article and given feedback before publication.
While I partially agree with the premise, I perceive the situation a bit differently. By not adequately prioritizing performance at the early technology-selection phase of a development project, teams often find themselves forced to prematurely adopt highly-complex deployment systems and other approaches to scale their low-performance system.
In my experience, a well-functioning new team will include performance in their selection process. Doing so allows a project to defer optimizations such as higher-complexity deployment (e.g., cluster orchestration, highly sophisticated caching, and so on) for much longer than those who build on low-performance platforms and frameworks.
Choosing a low-performance platform and framework sets an artificially low performance ceiling. Developers dealing with a low performance ceiling will either instinctively (through received or learned patterns) or reactively (through user complaints, issue reports, troubleshooting, and tuning) deal with performance challenges they should not need to deal with so early in a product's lifespan.
Ironically, some would say that including performance in your technology selection criteria is "premature optimization," but I argue the opposite: considering performance a feature early allows you to defer costly optimizations (such as the matter at hand, Kubernetes deployment), perhaps even indefinitely.
While I tend to agree with the conclusion on premature optimization, I disagree with the assumption that it is premature for most startups. In fact, it's reasonable insurance for startups: if they do succeed, it'll solve the problem of scale at that point without needing huge changes.
BTW, Google open-sourced Kubernetes not for charity (like all businesses, they also want to make money); they knew they had lost the cloud war, with Amazon/Azure gulping up almost 80-90% of the market. So they wanted a platform with which they could lure users back to Google Cloud once they started providing kick-ass services (avoiding the much-famed vendor lock-in). And since Docker had solved dependency management in a reasonable way (not in a very esoteric way like nix) and was dipping its toes into distributed orchestration, they took it as an open fundamental unit on which to solve orchestration of distributed systems.
But yes, Google succeeded in convincing dev/ops to use k8s by frightening them with vendor lock-in. The most ironic thing I see about k8s is that HPA, restart-on-crash and all those things are being touted as great new features. These features have existed in Erlang for decades (supervisors and spawning actors). I'm not sure why Google did not try to leverage the Erlang ecosystem - it might have been much faster to market (maybe NIH).
> ... it'll solve the problem of scale at that point without needing to make huge changes.
This is incorrect. It's a common mistake to pre-optimise for a future that may never come. I call it "What-If Engineering".
You should always just build a monolith and scale it out as and when you understand what it is that needs scaling. Choosing the right languages and frameworks is where you need to put your effort. Not K8s.
What you say is true, but that is how insurance works - you pay a premium for "what if something unexpected happens"; there is a nine-nines chance that it won't happen, but we still keep paying. K8s is similar.
It's because OTP does not integrate with anything not running on the Erlang VM, whereas k8s derives from a different family tree of general, language-independent schedulers/environments.
Language independence is not a trait of k8s; it's an artifact of Docker packaging Java/C++/Perl/Python/Go/Rust etc. as an arch-dependent image. TBH I find k8s support for languages other than Golang pretty poor (there have been attempts to get Java into k8s by Red Hat with native-image, but it seems to have not made it big).
Language independence is a trait of k8s in the sense that none of its interfaces are in any way specific to a language - the most restrictive in that are the few APIs that are based on gRPC, because state of gRPC libraries is still poor in some places.
Unless you want to embed another language in-process of some k8s components, but the need to do that is disappearing as things are moved out of process.
Build for today's needs but have a notional roadmap for how to scale your architecture as the company grows. The author is mostly correct that K8s is complete overkill for most early/mid-stage companies and startups; however, extensibility is a key architectural concern and one that cannot be ignored at the outset. If you need to re-write large amounts of code from scratch every time your business scales up, that's a good sign that you aren't thinking far enough ahead. For instance, here are some straightforward things you can probably do today that require little additional effort but will make a migration to k8s much less painful down the road (should you ever get there):
1) Use domain-driven design to properly abstract your code
2) Make services and auth stateless wherever possible/practical
3) Containerize your codebase and leverage a container registry as part of your CI/CD process (a CI sketch follows below)
BLUF: always have both a current-state architecture and a future-state architecture in mind, and a plan for how to gradually evolve from one to the other over time.
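For point 3, a minimal sketch of the CI half, assuming GitHub Actions and a hypothetical registry, repository and secret name; the image CI pushes here is the same artifact a later k8s (or ECS, or plain Docker) deployment would pull:

    # .github/workflows/build.yml -- hypothetical workflow; adjust registry and secrets
    name: build-and-push
    on:
      push:
        branches: [main]
    jobs:
      image:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Log in to the container registry
            run: echo "${{ secrets.REGISTRY_TOKEN }}" | docker login registry.example.com -u ci --password-stdin
          - name: Build and push an immutable, commit-tagged image
            run: |
              docker build -t registry.example.com/myapp:${{ github.sha }} .
              docker push registry.example.com/myapp:${{ github.sha }}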
There is a lot to agree on in this article. Though teams should be able to choose whatever stack (languages, frameworks, databases, queues) they want. What’s important is there is a coherent way to deploy, build, operate, scale, observe these services, without relying on one or two team members. Getting a cluster setup and a few deployments is pretty simple, however scaling that cluster, setting up secure runtime (network policies, mTLS, Kata, KVM, gVisor), prometheus, CI/CD, preview, staging and production environments is a huge investment (personnel, cloud costs, delay to production delivery).
Teams should be able to benefit from Kubernetes and the surrounding cloud-native ecosystem without directly consuming or modifying it. You don’t need to reinvent the wheel, and that’s what you could say a lot of platform teams are doing today.
That’s what we’re working on at https://northflank.com a next-generation deployment platform using Kubernetes either in a secure multi-tenant environment, traditional PaaS or deployed directly into your GKE, EKS and AKS clusters. (disclaimer: Northflank co-founder)
It’s kinda curious that the HN sentiment in the comments has lately switched from “kubernetes is hell” to “kubernetes is quite useful and well worth it”
To me, kubernetes is the kind of giant, complicated, hard to understand software that is ripe for HN disdain.
I'm pretty sure the "kubernetes is hell" proponents were always just a vocal minority. But as always technology choices have to be made in the context of each business (available experience etc).
Modern CPUs are giant, complicated pieces of hardware. But they’re useful, and have a well defined interface that you can interact with it at a higher level.
Same with k8s. People conflate running k8s from scratch with administering it. The former is very hard; you should buy it from a cloud provider. The latter is IMO not much harder than VMs at small scale, and much simpler as you grow bigger.
What it is, is a low-level abstraction that probably wants something over the top of it to hide the messy details. I don't think anyone's really cracked that yet.
Maybe I'm missing something. I'm learning about Kube right now and I like it. It's like docker-compose done right. Managing a Kube cluster might not be that easy, but that's why managed offerings are there. And you can easily scale with Kube. And by scale I don't necessarily mean scale up, because you can scale down as well, and that's important to save your precious money.
A few other Kube goodies:
1. Plenty of apps are available as helm charts. It's like a pre-built docker image, but even better and easier to install and manage (!).
2. There are some marvelous operators, like the Postgres operators, which will do the hard work of managing a Postgres database for you, including replication, backups and upgrades. On the downside, when it breaks you need expertise... So maybe a managed database is still better for small scale.
I'm coming to Kube as someone who spent some time dealing with docker-compose servers. And I feel like Kube will be better.
For new projects, even on a small scale, it's likely that I'll choose Kube.
Both of them seem to have about equally many proponents as contrarians, so in each case it's half due to the community's ignorance and arrogance, half to its wisdom and humility.
Our startup has a webapp that acts as a UI for a machine learning application. We have two different types of heavy workloads (ML and something else). The workers for these run on Kubernetes, which makes them easy to scale (which we do a lot, automatically). The app itself doesn't run on Kubernetes yet (simple VM). It would be better if it did, though! We keep building a lot of functionality ourselves (e.g. deployment without downtime) that would be easier with K8s. I think what you really want to avoid is Microservices, not K8s. K8s gives you many advantages, such as the entire config in Git, containerized workloads, deploy management, etc., which sooner or later, you'll find yourself wanting. Depending on how you get started, the learning curve can be steep, and some of the tools are not mature but supposedly used by everyone (ArgoCD for deployment?).
What is it giving you over an ec2 auto scaling group or ECS/Fargate?
Both can scale as much as you like, your config can live as cdk/cloudformation/terraform code?
I'm not an expert on auto scaling groups, but we have used the Google Cloud equivalent for a time. The biggest issue for us was that deployment is not as easy. Can you update the software on the ec2 instance without turning it on, for example? With K8s, we can leave the deployment scaled to 0 and just patch the image of the deployment to perform a release while the workers are all shut down. Similarly, we don't have to write code to wait for the workers to finish their current job before we shut them down in order to be replaced by a newer version; this is all managed by K8s, and the configuration for it lives in Git.
As others have already pointed out, it is also important for us to remain independent from Google Cloud / AWS.
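A rough sketch of the pattern described above, with a hypothetical worker deployment: the replica count stays at 0 while the workers are idle, and a release is just CI (or kubectl) rewriting the image tag.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: worker                     # hypothetical worker deployment
    spec:
      replicas: 0                      # scaled to zero while idle; raised again when there is work
      selector:
        matchLabels:
          app: worker
      template:
        metadata:
          labels:
            app: worker
        spec:
          containers:
          - name: worker
            image: registry.example.com/worker:v42   # a release patches only this tag, even at 0 replicas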
Apache Spark can use Kubernetes as a scheduler out of the box. I don't know if OP is using Spark.
A lot of data tools are starting to target Kubernetes directly as a runtime, so using them with GKE/EKS is a bit simpler as it's officially supported, and lets you run locally and on the cloud with no vendor lock-in.
ECS in a scaling group works well if your app is stateless, but as soon as you scale workers dynamically, do service discovery, or need orchestration, you end up building some of the features Kubernetes provides.
The flipside of this philosophy is that you have to be able to revisit your decisions when your requirements change.
This is a superpower that you want to have individually as an engineer, but you need organisationally as a culture too. If you have that power, then when you need to write some similar code you can copy and paste, because you can trust that when you have a third use you will actually refactor properly. It means that you can get the service up and running quickly on a box you've got, because you can trust that if the service actually gets used, your org will allow you the time to put it on the cloud properly.
It's the most important part of agile - make the right decision for what you've got now, and then make a different one later.
Comparison to internationalization might be interesting. Supporting multiple languages almost always expands the potential customer base, sometimes by a large amount. But even with the potential business impact it is extremely rare for applications to be localized early on. Doing the work early makes it much easier and pays off in organizing communication with users. Is this because the tools that are used are not impressive or powerful enough? Maybe needing to know multiple languages and team up with others is the blocker? Working on localization it is often remarkable how great effort often gets put into pretty much everything else and then translations are added hastily and with as little sophistication as possible.
I don't use it much right now, but I've found Kubernetes (at least the managed-service variety) fairly straightforward compared to some of the alternatives.
It beats the pants off the random rubrics of Ansible and Puppet projects I've run into in the past, anyway.
I often hear a lot about you should vs you shouldn't. How about just highlighting your experience with? Not all companies, people or circumstances are the same. To me, it's akin to comparing athletes and trying to debate who is the goat. Just talk about what you did and let people figure it out. If their startup failed because of it, then they got what they were after, to learn and grow for the next one. Or, maybe to not do one again. I can guarantee you that for every claim you make, there's someone who was successful in doing the opposite.
One stable Kafka cluster is worth more than 1000 kubernetes clusters (for free) to me. Kubernetes to me is kind of useless. I don't use it for the sake of learning it. What's the point ?
I write code for humans, to deliver real benefit/profit from a business standpoint. I build value and trust based on what I delivered, not on what I "want to learn".
Currently, I go all in on unidirectional architecture. Think redux all the way, not just for the frontend.
The point here is, it's the architecture that drives business profit, not Kubernetes. Kubernetes is not your architecture.
I am curious - there are markets where usage of managed platforms like AWS is restricted due to data protection laws (no data center on a country's soil) or what if a large client wants an on-premises installation (we have plenty of such clients) - what's the alternative to k8s if you want your solution to be "portable" across providers? From what I understand, our DevOps team leveraged k8s to streamline it using a standardized way to define resources and to scale them, but I'm not an ops guy so maybe I'm missing something.
I can understand developers and devops - overengineering allows them to play with technologies on the company's money. Tech managers should control the level of overengineering, but in many cases they just miss it.
First, to respond directly to the post: having a (stable, one you don't tinker with every other day) k8s setup isn't bad; it's only bad when you have to pay the setup overhead.
I built a boilerplate app with:
- authentication (Google auth)
- Cloud Run backend
- CDK
- CI/CD (GitHub Actions)
- full-boilerplate frontend
- full-boilerplate backend
- Signal error reporting
- Postgres
- GraphQL
- all my other preferences like git hooks, linter rules, vscode workspace, local environment, etc.
Now, whenever I need to build an app, I just git clone this repo and it's plug in credentials and play. My backend cluster doesn't ever run more than one instance, so I guess having a scalable backend is pointless, but who cares? It's free, time-wise (and almost money-wise, since Cloud Run is pretty efficient). I built it a year ago and don't touch most of the devops for any app, except to maybe rip out authentication when I don't need it.
Imo, I think this is a highly effective method of engineering. Everyone has their favorite stack, so it might be worth it for you to take a weekend and fully build out your boilerplate app, so that if you ever need to, you can get straight to building the actual features.
Speaking from an infrastructure-focused perspective, as someone who writes code to make provisioning faster, I understand why software-focused individuals like Kubernetes, both on-prem and in the cloud: it's super easy to deploy to once the platform is provided to you. On the other hand, getting the platform available in the first place - which is a REQUIREMENT, no matter the method - can be a laborious chore that doesn't need to exist unless you actually need it.
on-prem: procuring hardware, installing hardware, installing the host OS / deploying the host virtual machines, networking, deploying K8s, CNI, CSI, etc.
if cloud: increasing quotas, verifying that CPU sizes are available for more nodes or new node pools, setting up virtual networking and subnetting for the K8s clusters, etc.
both: securing everything from RBAC to networking to image validation, plus upgrades - you made sure to have a scalable app, right? If not, fingers crossed on availability.
Would the chore exist outside of Kubernetes? Of course there would be a chore; I'm just not sure it would be as great. While deploying to Kubernetes is simple, setting up an on-prem platform is not whatsoever, and if you go with public cloud you now have to be aware of all the caveats and gotchas that aren't handcuffs with on-prem.
Sort of agree. There are caveats of course, but I tend to think you can just start with a bunch of manually set up instances and roll from there. If you find yourself spending hours per day starting up and shutting down instances manually, then yes, automate that part.
Nothing I've ever done has taken off at all so automating DevOps would have been a total waste of time. Then again the projects themselves have been a complete waste of time.
I think it depends on whether you automate it so that it goes out of your way so you can spend more time on things crucial to your work - or you're just pontificating because you hate your job anyway.
The problem with this sort of argument is that it sets up a straw-man, without describing what you should do instead. Kubernetes could be better - of course - but what better approach is the author recommending to use instead? It would be better if we could use one language on frontend and backend - of course - but what is this one language that works everywhere? It would be better - of course - if there was a SaaS that could do everything for us, is cheap and is open source - but what is that SaaS?
It's very easy to say something could be improved, but much harder to propose something to replace it that doesn't have its own shortcomings.
It's a pity because this article actually makes a good point, that we should think about what the actual user/business goal is before making technical decisions, and be careful lest we spend all our time on technology. I've heard this idea described as "innovation tokens". It's disappointing that the article chose to wrap this idea in clickbait, but I guess we wouldn't be talking about it otherwise!
Disclaimer: I have no end of personal biases in favor of kubernetes!
> The problem with this sort of argument is that it sets up a straw-man, without describing what you should do instead. Kubernetes could be better - of course - but what better approach is the author recommending to use instead?
The article explicitly advocates for high-level PaaS tools as an alternative to Kubernetes.
Maybe it could be read that way, but I think it's a good indication of the problem that a strict reading doesn't include that. The article advocates that you should be using Heroku or Vercel or Netlify or Fly as not doing so is an antipattern, but it then says that instead of k8s "Most organisations should consider some of the higher-level building blocks available via cloud providers", which seems contradictory.
I quite liked the actual message of the article (which I take as "be mindful of your technology choices") - but I think that it is really devalued by the less coherent but more click-baity arguments (e.g. "javascript is the only rational language choice")
Edit: Actually, re-reading the article yet again, I think I can see your reading (though I wish you had written the article, as your writing is much clearer). As I understand it, the article says you should use a PaaS-style solution until you outgrow it, when you should move to something from the cloud providers. For me, that would be CloudRun -> GKE-Autopilot -> GKE (skipping CloudRun if you don't want to learn two systems), but there may be some google-bias there!
"Another time our Data Science team told us that they needed an orchestration tool for their data pipelines. I steered the selection process towards Argo Workflows (which runs inside Kubernetes) instead of Prefect (a SaaS offering), which they already used for a PoC. There were all sorts of justifications in my head for this decision. Unfortunately, they were all based on premature optimisations. In the end, our team needed to build a new set of Terraform and Helm charts to automate the deployment of Argo Workflows and integrate it into our SSO etc. I regret this decision. I think we lost weeks or even months shipping something to end-users because of this decision. Premature optimisation!"
There is absolutely NO issue in adopting Kubernetes and then using SaaS for everything possible. In the above case, if you had a data-pipeline-proficient engineer on the team, you would not have spent months setting one up. It comes down to making decisions based on your current situation (business urgency, team skills, future plans).
When I see k8s I automatically think o13g (overengineering), due to witnessing it being tried a couple of times for the coolness factor, not because it was actually required.
I maintain a free, open source, fast paced shooter that really only works with <100ms pings. To this end I have a dozen servers around the globe. Each server runs a docker-compose with four instances of the game server (each for 24 players), along with some auxiliary services that do things like download updates, upload match demos and statistics.
I frequently need to re-provision new servers and close existing ones.
I've been doing all this with the help of docker-machine (now deprecated), and a collection of handy bash scripts. While this has worked, it's become more and more fragile and I need something better.
For years I've resisted Kubernetes because I keep hearing "premature optimisation", but at this point I'm not sure I have any other options. So before I dive headlong into Kubernetes out of what feels like necessity, how else could I possibly get:
- Push button provisioning of new hosts across a number of cloud providers
- Monitoring so I can restart services under certain conditions.
- Notification of failures etc.
- Automatic / continuous deployment across all servers and regions.
What you are describing is fairly trivially accomplished using a combination of VM image provisioning software (e.g. Packer) and a traditional configuration management system (e.g. Ansible).
It will be simpler, more straightforward and most likely use fewer computing resources. It's also a decades-old technique at this point that is extremely battle-hardened.
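As a rough illustration (a sketch with assumed inventory names and paths, not a drop-in solution), the Ansible half could be a playbook that pushes the compose file to every game server and restarts the stack:

    # deploy.yml -- hypothetical playbook; assumes a "game_servers" inventory group
    # and docker + docker compose v2 already installed on the hosts
    - hosts: game_servers
      become: true
      tasks:
        - name: Ensure the deploy directory exists
          ansible.builtin.file:
            path: /opt/game
            state: directory

        - name: Copy the compose file describing the four game-server instances
          ansible.builtin.copy:
            src: files/docker-compose.yml
            dest: /opt/game/docker-compose.yml

        - name: Pull updated images and restart the stack
          ansible.builtin.shell: |
            docker compose pull
            docker compose up -d
          args:
            chdir: /opt/game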
You don’t say it outright but you imply that you are doing this as a single operator. That removes giant swathes of the advantages K8s brings to the table. Further your implication that there are multiple edge clusters is a case that is fairly new to K8s and the best practice would be to run n K8s clusters.
If you are looking for a more managed/turn key option I’m a huge fly.io fan boy.
I feel like this is probably good advice for startups and AWS-using companies. But there remains a lot of us who have, for many reasons, some good, some bad, a lot of investment in physical infrastructure already. For those, the Hashicorp or Kubernetes stack makes a lot of sense, if only to help standardise the insanity of what you’re running in-house.
It strikes me that this particular philosophy, which I associate (for no other reason than a gut feeling) with US-American culture, is one of the biggest reasons for our shitty software development quality. Every successful team that I have seen so far had solved the worst problems in advance, and thanks to this investment ended up with a much higher capacity for times of crisis. The assumption that problems can be solved (really solved, not somehow worked around until things explode in your face next Tuesday) when they show up in practice looks like fake optimism.
Of course it takes good judgement to distinguish actual future problems from mere inconveniences and phantom issues. But that's a skill that people should try to cultivate not bullshitting their way out of responsible management with terms like "Pareto principle" and "agile".
Man these opinion pieces get boring fast and very rarely add value.
I honestly couldn't give a toss if some guy thinks using some < language / tech / library > is <blah>, it's a technology, a tool, try it, use it, if it works for you good, if not use something else. Move on.
The only time it's allowed is if it's bashing JIRA :)
The same-language argument is plain wrong. It is easier to change between languages within the same domain than to swap domains using the same language. If you've done web-services programming with NodeJS you can easily switch to Go. Starting with React will be much harder.
How does everyone think about Kubernetes on-prem, where the requirement is to run an application/services hands-off, without any platform team to maintain it, etc.? Is Kubernetes also the right platform for that sort of environment?
The part about Kubernetes and startups is 100% correct, I believe. It is very hard for some people to not own a thing. People can be very narrow-minded about buying a service, even though at the same time they gladly buy work-hours to offload their personal workload.
Setting up complete envs for dev, test and prod with Git, full CI/CD and everything you need like databases and storage is less than 20 hours of work in a modern cloud. This is something that comes with low maintenance as well. If you think containerization and Kubernetes is a better option, you are basically incompetent when it comes to startups.
I'm not sure if I can agree with the author. We moved from plain AWS EC2 to K8s to improve our continuous delivery pipeline, and it worked. Before, we had a dozen custom terraform scripts and custom runners on GitLab, and it became much cleaner with K8s. Another aspect that improved was monitoring and introspection, I think mainly because K8s offers a rich set of tools and patterns helping exactly in those matters. It comes with its own costs, but I can definitely say that we went from a tailor-made solution to a standard set of techniques and tools, which is a huge improvement, IMO.
It depends how experienced you are with it. If your team has no experience with it, don't do it. I'm pretty experienced, having set up a bunch of infras, so I'd run a blog on it now
Indeed - having used it for few years, I'd start the initial infra (for testing/playground/whatever) as small k8s on a local VM or similar solution so that we could prototype against the target abstractions from day one.
Much, much less work to then move the deployments elsewhere, including building and teardown of integration testing environments.
When I get a contract to move a startup's infra to k8s, the majority of the work is actually untangling the services and making them accept configuration and communication in a sane way; the actual YAMLs are easy. Starting with k8s actually enforces some structure and standards and makes for a much friendlier environment to work in; usually you can run the whole infra on your laptop.
I feel most places overcomplicate K8s. The foundational ideas are sane, especially with managed k8s in GCE, AKS, DigitalOcean, etc.
The basic idea is that of "be like this spec" via a yaml/json file - a controller is continuously monitoring an object and making adjustments until spec matches actual.
I've seen many startups have one person who sets up and maintains their k8s cluster. If you know what you're doing, it has pretty solid ROI.
I think it depends on where your "target" architecture is no? If you're doing monolith-first but know you eventually need to move to microservices, it might make sense to put in some effort to learning k8s in the simplest case first and expanding it out.
I think the issue with people that use k8s first is that they go for super trendy architectures before really nailing down the business logic first. This leads to technical debt and a lot of confusion around the tool.
I guess you wanted attention, congratulations you got it.
As far as the advice goes. No not really, lots of people are familiar enough with Kubernetes that it makes it the fastest choice possible for them. Engineers like to pretend that tech choices are oh so important and can be discussed in a vacuum, but really most tech choices are made once you know the people who are working on the project.
Pick stuff that is familiar to you unless you have a good reason to do otherwise.
With the advent of serverless product offerings, I agree 95% here. The other 5% are the on-premise, own-their-own-server enterprises, which I know are rare.
It no longer makes sense to spend so many resources on orchestration or on distributing across multiple pieces of hardware.
Not when you can simply write code and have it running globally distributed on edge. You would be doing yourself a disservice to not follow this new paradigm shift.
I'm a developer that uses Kubernetes in production purely because I want to be able to use the same Docker images that I use in development.
I am not a Kubernetes advocate but what else is there that handles all of the issues faced when deploying containers? Such as scaling, deployment, configuration etc?
There are alternatives such as Hashicorp's Nomad, but I don't see how this is any better/worse than K8s.
I'm aware of these products, but they do not fit my use case.
I need to run some services on premises and have set up a self hosted Kubernetes instance on a physical server in a rack.
It could be overkill and maybe I could use something like Docker Swarm. Apart from this I am unsure what I can use that isn't K8s to orchestrate my containers on site.
Nomad is a lot easier to get into, all the while providing almost the same functionality as k8s, including a lot of the edge cases like orchestrating VMs, LXC, Firecracker and of course workloads on Windows and FreeBSD too. The weakest point of Nomad compared to k8s, IMO, is the size of the community.
Docker Swarm Rocks
" .... I would recommend for teams of less than 200 developers, or clusters of less than 1000 machines.
This includes small / medium size organizations (like when you are not Google or Amazon), startups, one-man projects, and "hobby" projects. " https://dockerswarm.rocks/
Agreed, it's a shame that k8s won the mindshare, and other solutions don't get a second glance. Swarm will do most of what you need, and can help avoid the premature optimisation trap. Going from Swarm to Fargate or K8S is a small step on a journey.
Thank you, I've been struggling with the decision to move away from k8s as I am spending much time making it work, and much more again every time it needs to change, or be updated, when everything is already fine on metal. I just need to find a way to keep or convert the dockerfiles so that it just runs on metal and without docker.
I have read that this article isn't about k8s, but in my experience, companies that try to avoid k8s in favor of other specialized solutions are leaning more into premature optimisation. The company that used Nomad spent weeks in analysis of a simple auth service, but every company that used k8s went with whatever and moved on.
You can use Elastic Beanstalk and RDS. You get scaling, reliability and backups. And it's not complicated, and you can set up easy CI/CD with just GitHub Actions.
If you don't need scaling, just use Lightsail and Litestream.
You just showcased my point! People will spend weeks searching for solutions that implement part of the required functionality, instead of using whatever they're most familiar with and moving on.
What's a good back-of-the-envelope break-even for when I should move to K8s from a set of hand-rolled scripts and manual processes to manage a bunch of VMs? It definitely feels like more than 10, but how much more than 10? And do people manage database servers using K8s, or is it better to hand-manage those?
The most starry of starry-eyed startups will want a team that is already fit for scale, so they build for scale before they need it - possibly convinced by VC behaviour and getting as much from each funding round as possible. If you have to scrap your 1st-round team and re-employ for the 2nd at scale, it's going to be a hard sell.
Why build it when you can buy it? Right? The decision is more involved than that. The real question is...
What is the shortest route to a sustainable, defensible revenue stream?
In the '80s IBM built a computer super quickly out of pieces you could buy at Radio Shack. The ultimate solution! Why incur all that upfront manufacturing cost when you can assemble it out of existing parts? They learned the answer really quickly: the lower barriers to entry meant anyone could recreate the product, and so a million clones popped up using the same parts. For a decade Apple ate their lunch by investing in a proprietary platform (which a lot of people said was madness, using the exact reasoning on display in this article).
SaaS is great, but you have to have some sort of moat for whatever you are building. Sometimes that moat can be as simple as the idiosyncrasies of your over-engineered, prematurely optimized product. I had a hand in the development of one of the major VPS players during the 2010s. There were things we could do just because of the weird decisions people had made in the past. That translated directly into competitive advantage.
The startup I am working for saw its entire engineering team fired after they hired a lot of people, spun up an enormous amount of infrastructure, spent tons of cash and produced absolutely nothing useful to the company (presumably because they were engrossed in spinning up the infrastructure). The company isn't very large and does not provide software products to clients. The systems are mostly business intelligence, billing, ERP-like, etc.
Now we are restarting lean (but wiser for the experience). 10x fewer people, and the technology accomplishes more, faster.
Most companies with that kind of problem DO NOT need Kubernetes because they don't have problems for which Kubernetes is the cheapest solution.
Most companies do need to focus on keeping things simple and minimising the amount of stuff, to make it possible to hire a development team that has a shot at understanding what is happening.
For example, we made the decision to restrict ourselves to only one cloud provider (AWS), only one backend language, one frontend language, etc.
Our prod environment is simple, too. No effing microservices. One application on one application server easily serves the needs of a company of over 200 employees and our clients.
Simplifying the environment did wonders for productivity in many different ways, the most important being that discussion now focuses on what the company needs and the functionality that would solve the problem, rather than on technical trivia.
All this makes possible following things:
* having one development team where everybody can contribute to everything (no need to Zoom with local K8s guru to accomplish anything or spend time in endless meetings with "frontend people".)
* hiring a relatively simple developer profile. This does not mean hiring bad developers. But it means I am looking for people who know X and Y very well rather than X, Y, Z, A, B, C each a little bit... It is stupid to assume a person knows every one of the 50 different technologies listed on their LinkedIn profile well -- most likely they only really know a couple of things and everything else only superficially.
* hiring from a talent pool that has a lot of experience, but not necessarily in the stuff that other companies require. Other companies may have overlooked them because they don't know a bunch of new tech, but we don't care. We only care that they are great people who get shit done, fit our company, and know how to design and implement functionality in their core programming language.
* engineers being able to accomplish their individual tasks from start to end, right after joining the company. No need for a long learning period.
* keeping the company's needs the main center of attention, rather than spending a large part of the effort on technical trivia.
My take? If you need to add something, first ask yourself what are the costs and better understand what you get in return for that cost.
Anything can be a red flag signalling premature optimisation, depending on the scope of the problem you're solving. Like any dogma, "you don't need k8s" will sometimes help you, sometimes harm you.
Kubernetes seems like subversion before git came out. Overly complex but got the job done. When git came out, all other VCSs were quickly abandoned. Just waiting for the day when the git of container management comes out.
k8s can be easy to set up, and if you know how to use it then you should. You shouldn't go to the trouble of learning it, but if you do know it already, then there is no reason not to go with services like gke autopilot.
At this point I'm pretty much convinced these repeated "Kubernetes is bad for startups" rants are some kind of FUD campaign (probably grass roots from people who've missed the containerisation and declarative infra train). Kubernetes is actually pretty great for the vast majority of use-cases. Sure if you don't have experience with kubernetes leave learning it until after you've hit product-market fit. But rejecting the advances made in this ecosystem is a red flag upon itself.
My ultimate MVP starter pack is probably a big fat VM running a single node K3s instance and a cloud provider managed database. Software wise nginx-ingress / cert-manager / argo-cd is a pretty good default set.
> Kubernetes is actually pretty great for the vast majority of use-cases.
From my brief glances at it, it seems like a great solution for very complex use-cases. But "vast majority"? The cost-benefit seems way off unless you have a very different sense of "typical use-cases" to me.
Most people are building simple web sites and apps that run on a single server, surely?
Agreed. OC is overstating it. It’s not that the anti-K8s crowd is necessarily anti-containerization; it’s that containerization doesn’t necessarily require something as complex as K8s.
I’m invested in Cycle.io, which is a company that simplifies containerization and infrastructure (major oversimplification of the company). But a lot of customers are refugees from K8s, whose appetite exceeded common sense. Their infrastructure ops teams are costing them 6-7 figs that they can shed with something like Cycle, because K8s is simply really complex.
This is going to become really really important in the next 3-4 lean years we have coming our way. If you’re a K8s user, and spending money you can’t afford on an ops team to manage, you owe it to your company, investors, and customers to explore options to reduce this type of burn.
P.S. Intentionally not making a hyperlink for Cycle.io because I’m honestly not trying to shill here.
> Kubernetes is actually pretty great for the vast majority of use-cases
It's a pain in the arse, and you're on the hook for looking after it.
I'm captain infra, so when Managed DBs came along, I felt threatened. But then when I used it, I realised that, yes, they are expensive, but I don't have to fucking worry about them. One touch deploy, proper backups are an option away, as is proper role based auth. K8s is like having a DB admin. Everything revolves around them, and it can be limiting, unless you have a specific use case.
K8s is useful for a specific role, but for most people who've outgrown a single machine/bunch of machines, ECS is good enough. Yes, there are fancy things you can't natively do, but you probably shouldn't do those just yet anyway.
Where K8s comes in useful is probably limited to:
1) mixture of real steel and cloud
2) A cluster shared by a large number of teams, each deploying services that rely on other people's stuff
I've been managing clusters in one form or another for many years. K8s is certainly better than Swarm, but it's really not that great as a general-purpose "datacentre OS".
Kubernetes is the React of devops. You won't get fired for picking it. And there will be a tonne of support available online, as well as courses and experienced people you can hire. Also there is cloud-managed k8s, which takes most of the pain out of it. And plenty of out-of-the-box stuff.
It is not a bad choice if you want to do the ops yourself for some reason. If you don't, then use a BaaS or PaaS; that might be easier and probably not much more expensive, but it will lock you in.
This has always been my observation/thought as well. It’s “ no one ever got fired for picking IBM” all over again. Cute advertisement slogan, but it’s based on a true/real mentality, and a dangerous one at that (in terms of reaching potential and managing burn).
It is like asking for something between vanilla JS and React. Or something between zipping your src folder and git.
The problem is, while this is in theory good, the weight of knowledge and support for React (and git) make it worth putting up with a bit more complexity. That support covers: Stackoverflow, Colleagues, New Hires, Cloud Support of these things (Vercel for example using both Github and React!)
You learn the tool, then you are set for 10+ years, probably, in both cases.
In the Kubernetes case, the killer thing is that it is very easy to set up a cluster on cloud platforms. I am not so sure about, say, Docker Compose or the other ones. I am in the Azure world, and Azure dropped support for direct container running, so now you need to use k8s if you want something cloud-managed. (As far as I know.)
"Containerization and declarative infra" are only a small fraction of what could be done with real network distributed OS's as early as the 1980s and 1990s, and with far lower levels of overall complexity. That "train" has left the station a long time ago, it's going full speed and there are no brakes on it either. Of course you can keep chugging along with clunky K8s and call that a "good default choice". But that's just being complacent about what real progress looks like.
Could you give some examples of these network distributed OS's that are going full steam ahead? Also interested in why you think those are less clunky than k8s. :)
Plan9 basically invented the idea of namespaces, which are fundamental building blocks of containers. Plan9 also invented the 9P file protocol, which is used in WSL2.
Plan9 is the ultimate distributed OS. Running programs on different computers is as easy as mounting a remote /dev/cpu file to your process' namespace. Virtual desktop is as easy as mounting /dev/draw. And so on. Many such complicated things that require ad-hoc solutions on Linux are solved on a fundamental level in Plan9.
I wouldn't say that it's going full steam ahead, since its use in the modern world is fairly limited, but the ideas were there, and they have influenced the development of Linux.
A distributed OS hides the distributed nature from the application. Kubernetes does not try to hide it; it allows the developer to embrace it. Hiding network latency is a leaky abstraction, and Kubernetes gets this right: it does not pretend that there's no network between your services. It's the other way around: there's always a network between your services. Sometimes it's a loopback network, sometimes it's a real network, but you're aware of it.
I don't think that the engineers who built Kubernetes were looking at Plan9 and saying "ok guys, it's too easy to do things, we must make it harder so people are aware of the network".
I think the more believable scenario is "hey let's see what we can do with the existing Linux systems and sockets and stuff" and then importing the network into the model because it was the simplest solution.
Then again, whether or not applications should be network-transparent by default is a discussion in itself. I believe they should be. Every program (that does not drive hardware directly) is just a piece of code that glues various OS APIs together. In the case of Plan9, the APIs are the filesystem, and the filesystem can be modified with the 9P protocol. If we can use the same local program to perform an operation on a remote machine without any modification, why the hell shouldn't we.
The PostgreSQL operators are definitely usable, pretty much none of their limitations are going to hit you on an MVP. I think a lot of the FUD about "don't run databases on kubernetes" came from a time before we had statefulsets and persistent volumes.
I'm using the Zalando one. It only took a couple of hours to get it running (first time running it) and integrated with Grafana. It includes incremental backups to S3. It really feels like a cloud-independent managed database.
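For anyone curious what that looks like, a minimal cluster manifest for the Zalando operator is roughly the following (a sketch from memory, so treat the names, sizes and Postgres version as placeholders; the S3/backup wiring lives in the operator's own configuration rather than per cluster):

    apiVersion: "acid.zalan.do/v1"
    kind: postgresql
    metadata:
      name: acid-minimal-cluster
    spec:
      teamId: "acid"
      numberOfInstances: 2          # one primary plus a streaming replica
      volume:
        size: 10Gi                  # persistent volume per instance
      users:
        app_user: []                # role created inside the cluster
      databases:
        app_db: app_user            # database owned by that role
      postgresql:
        version: "14"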
> probably grass roots from people who've missed the containerisation and declarative infra train
I am curious. How much work would it take to get on this train in 2022? Sounds like more of a cult than a valuable technology proposition when presented in this manner.
could you elaborate on the "ideal" argocd workflow?
let's say devs push to gitlab, which starts a pipeline, tests run, container images get built, then in the last step in the gitops repo a version bump happens and then argocd will pick it up automatically?
(and if one wants to be able to rollback then that's the same workflow just instead of version "bump" the pipeline tags images with the commit hash and at the last step sets the version to the right tag?)
Not the parent, but this is exactly what happens. ArgoCD is pointed at an "application" chart which just points to the path of a Helm chart (of your app/service) in Git. So when your CI changes the image hash in the values file of your Helm chart, ArgoCD will notice that and change the deployment resource's image hash. It is just a nice system for keeping resources in sync between your Git repo and what you have live in the cluster. Of course, if you change anything else, ArgoCD will change it too.
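As a rough sketch (the repo URL, paths and names here are made up, not the parent's actual setup), the Application resource that ArgoCD watches looks something like this; the automated sync policy is what makes it pick up the version bump on its own:

    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: my-service
      namespace: argocd
    spec:
      project: default
      source:
        repoURL: https://gitlab.example.com/org/gitops-repo.git
        targetRevision: HEAD
        path: charts/my-service       # the Helm chart your CI bumps
        helm:
          valueFiles:
            - values.yaml             # contains the image tag/hash
      destination:
        server: https://kubernetes.default.svc
        namespace: my-service
      syncPolicy:
        automated:
          prune: true
          selfHeal: true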
You're also right: rolling back just means changing the hash back to whatever you had, or to a new hash which is the result of a git revert or whatever. The other good thing is that if the newly deployed service is very broken (so broken that it doesn't pass the k8s health check), ArgoCD will hold on to the old ReplicaSet and will not let your service die because of it.
What's also nice about ArgoCD is that you can play a bit with some service/application in a branch. Say you have some live service and you want to adjust some configuration of it. Usually third-party services have a lot of options to set, and the problem is that you're not 100% sure how to get what you want, so doing a pull request for every small change can be very slow and exhausting. To work around that, you can point your ArgoCD application chart at a chart which is in a branch, test/dev/fiddle by just pushing to that remote branch, and when you're satisfied, you merge your branch and at the same time point your ArgoCD application manifest back at master/HEAD for that chart. In effect, at this step only your Git repo will be updated; your service already has all the changes, so ArgoCD will do nothing. That way you can iterate faster, and undo whatever regression you've introduced just by pointing ArgoCD back at master instead of your branch (or you can just reset your branch to be identical to master).
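Concretely, that trick is just the targetRevision field of the same Application resource (the branch name below is hypothetical):

    source:
      repoURL: https://gitlab.example.com/org/gitops-repo.git
      path: charts/my-service
      targetRevision: feature/tune-config   # while iterating on the branch
      # targetRevision: HEAD                # switch back once merged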
For deployments using Argo, yeah, that's pretty much it. You can set it to automatically deploy to your target environment as soon as it detects changes to your k8s manifests, or you can require a manual "approve" step, which is a push of a button for configs that are out of sync with what is applied to k8s.
For rollbacks, you have a button in Argo that lists previous deployments, and you can just choose which previous deployment you want to go back to. It works pretty well those (thankfully few) times I have had to roll back any deployments.
For automated testing the argo-workflows project is amazing and worth looking into.
I think when you look at the entire argo suite you start seeing something that could really disrupt the way we use products like gitlab, particularly for startups.
Kubernetes is the standard for infrastructure; everyone should use it instead of VMs or any other one-off infrastructure. All other options are costly in the mid and long term.
I don’t think so.
I run 2 dedicated-CPU cloud instances that will handle any load you throw at them with high availability, and I can spin up an entire 3-node production cluster in a few minutes with good old Ansible and some cloud provider module.
Kubernetes brings me overhead and wastes system resources.
I already have infrastructure as code.
Why would I want to containerize systems I already have YAML to describe?
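A hypothetical sketch of what that kind of playbook looks like (the amazon.aws collection and every name/ID here are stand-ins for whatever provider module the parent actually uses):

    - hosts: localhost
      connection: local
      gather_facts: false
      tasks:
        - name: Provision three nodes for the production cluster
          amazon.aws.ec2_instance:
            name: "web-{{ item }}"
            region: eu-west-1
            instance_type: t3.medium
            image_id: ami-0123456789abcdef0   # placeholder AMI
            key_name: deploy-key
            tags:
              role: web
          loop: [1, 2, 3]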
People forget that real hardware has to run this stuff, and layers of schedulers are not helpful.
Look at the disaster of fibers, threads and processes already!
I have previously talked both about the advantages and disadvantages of Kubernetes, but some of the points in this article seem interesting to say the least. I know that a lot of folks are already disagreeing with some of them, but I wouldn't suggest that the author is plain wrong, merely that there is a lot more nuance to it.
> Companies using Kubernetes for an application that is a web app.
There is probably a whole spectrum, with manually deploying code to shared hosting through SFTP being on one end and having full Infrastructure as Code running on Kubernetes with CRDs and other functionality on the other.
Personally, I think that you need to find the sweet spot:
- use containers, otherwise your app environments will rot over time, or at the least become inconsistent
- consider something like Ansible or another GitOps approach for managing the app config, and consider the same for server setup
- use some sort of container orchestration, whatever is easier and whatever you're more familiar with (Docker Swarm, Hashicorp Nomad, K3s)
- if you do go for Kubernetes (K3s or otherwise) make it as simple as possible, don't go for fancy service meshes or bunches of operators etc.
- document what you can: with Markdown or whatever for describing things that need to be known and done manually, code for the rest
This approach has allowed me to add any server to a cluster easily, to distribute the load of all the app services, and even to treat many projects with different tech stacks the same way: just launch the container, configure resource limits, parameters, storage, networking/certificates, and it's done. Things crash? Automatic restarts and liveness probes/health checks. Want the logs? Just look at the output or easily configure shipping them somewhere. Need to deliver a new version to clients? Just push to Nexus/Artifactory/Harbor. And should your org have separate Ops people? Well, they won't really have to care about someone wanting to bump the JDK version or whatever; let the devs care about what's inside of the app themselves. Security concerns? Scan the whole container with Trivy. Clients have a different environment? They can probably run OCI containers somehow.
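To make that concrete, the per-service boilerplate is essentially just a small manifest like the one below (a minimal sketch; the image, port and /healthz path are placeholders, and the same idea maps onto Swarm/Nomad equivalents):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: my-app
              image: registry.example.com/acme/my-app:1.0.0
              ports:
                - containerPort: 8080
              resources:
                requests:
                  cpu: 100m
                  memory: 256Mi
                limits:
                  cpu: 500m
                  memory: 512Mi
              livenessProbe:           # restart the container if this fails
                httpGet:
                  path: /healthz
                  port: 8080
                initialDelaySeconds: 10
                periodSeconds: 15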
Many of those advantages also apply to monolithic apps, like some garbage that may only run on JDK $OLD_VERSION or $OLD_DISTRO but should still have its attack surface limited as much as possible. If done right, developing apps like this will be a breeze without getting too overcomplicated. If done wrong, you will get nothing done, will have to suffer through figuring out how to get your Kubernetes cluster up and running properly, and will then have problems with resource usage etc.
> More than one language for your application. For example, a backend in Golang, Ruby, PHP etc. and a Frontend Web App in React, Vue etc.
This is probably the most controversial argument: I will admit that using something like Ruby on Rails or Laravel, or any other server side rendering option will probably be a pretty decent and fast way to get up and running.
Yet, nowadays people like developing APIs so that other apps can be connected to those relatively easily, which is somewhat handled by most of these options as well, though developing a separate front end and dogfooding the API yourselves will perhaps be the better way to ensure that everything works as expected.
What's more, in my personal experience, trying to develop both the front end and the back end as a single bundle can sometimes be a major pain in the behind, and it gets infinitely worse when you have some sort of full web app that you try to embed and serve from the back-end container/process/deployment as static files, because of issues with permissions, threads/performance etc. If anyone is curious, doing that with Spring (not Boot) was really annoying and kind of brittle, quite apart from always having to deploy the two together and put up with unreasonably long startup times.
So I can't help but lean towards a split setup in most cases (rough sketch after the list):
- RESTful API for the back end, can be used by the front end or other integrations (I don't think that GraphQL is all that good, but it's also an option)
- some sort of a separate web app front end (Vue, React, Angular) that is served by Nginx/Caddy/Apache and may later be replaced/migrated over to another solution etc.
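For what it's worth, a minimal sketch of that split as a compose file (image names and ports are placeholders): the API runs as its own container, and Nginx just serves the pre-built front-end bundle.

    version: "3.8"
    services:
      api:
        image: registry.example.com/acme/backend-api:1.4.2   # REST API container
        ports:
          - "8080:8080"
      frontend:
        image: nginx:1.25-alpine
        volumes:
          - ./frontend/dist:/usr/share/nginx/html:ro         # pre-built SPA assets
        ports:
          - "80:80"
        depends_on:
          - api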
> Not using a cloud service to host your app. Examples are Heroku, Vercel, Netlify and Fly.io. Most product teams will have over-architected their solution if they have to have an ops or infra team.
This probably depends on corporate policies or whatever people are comfortable with. Using those cloud services will result in vendor lock-in which, as Heroku showed, isn't always cost effective or even that good of an idea long term. Currently I pay around 260 Euros/year for my servers (5 VPSes) on which I run my containers; some people pay more than that per month on certain platforms. Because I settled on OCI containers as the common runtime format and the aforementioned approaches to orchestration, I can switch between whatever hosts I want.
Previously, I've used DigitalOcean, Vultr, Scaleway, Hetzner, Contabo and Time4VPS (my current one) and can easily move whenever a better offer comes along. Of course, for many startups with SV funding that doesn't matter much, but there is definitely a lot of merit to avoiding vendor lock-in and focusing on open source standards. Or, you know, you decide to work in an industry where the law book gets thrown at you and you have to do on-prem.