Not everyone needs K8s, not everyone needs multi-region. But as far as manual deployment goes... it's all fun and games until someone loses an eye. Often what you find, if you have a process that can't be automated as-is, is that you have a process that's got problems. Maybe those problems are bugs. Maybe those problems are "only one person knows how to run this, and if they want to take a vacation or move on to something else, we're hosed." Automation is a good thing at any scale.
I once dealt with a company that spent close to 2 years developing 2 different generations of a robotics system to glue together two plastic pieces. One of the pieces changed slightly near the end, making both robots useless. The pieces ended up being glued together manually for next to no labor cost, because a person could actually do the gluing quite fast.
Premature automation wrecks budgets and production lines and can kill entire companies.
Spending a few hours setting up an automated build on a SaaS CI server will start bringing benefits immediately and pay off very quickly. Not only quantitative (subsequent deployments will take less human time) but also qualitative (less chance of errors/bugs/mistakes, and it's easier to deploy more frequently).
Obligatory XKCD: https://xkcd.com/1205/
It does make sense to automate - after you have the design. But the type of automation also changes with scale. Ideally, at a small scale, your automation can be something like: a script to dump the database for its regular backup; a script to automate uploading new PHP scripts and site assets; a script to reinstall the dependencies on a fresh image; some version control and another backup layer over all of this. And then a huge test suite running the gamut: user workflows, failure conditions, site attacks, data integrity, backups.
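For concreteness, here is what one of those small-scale scripts might look like, as a minimal sketch in Python. The database name, backup directory, retention count, and the choice of pg_dump/PostgreSQL are all placeholder assumptions, not anything from the comment itself.

```python
# A sketch of small-scale backup automation: build a timestamped pg_dump
# command and decide which old backups to prune. Names are hypothetical.
import datetime
import pathlib
import shlex

def backup_command(db_name: str, backup_dir: str, now=None) -> str:
    """Build a pg_dump command that writes a timestamped, compressed dump."""
    now = now or datetime.datetime.utcnow()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    dest = pathlib.Path(backup_dir) / f"{db_name}-{stamp}.sql.gz"
    return f"pg_dump {shlex.quote(db_name)} | gzip > {shlex.quote(str(dest))}"

def prune_old_backups(paths, keep=7):
    """Return the backups to delete, keeping the `keep` most recent."""
    newest_first = sorted(paths, reverse=True)  # timestamped names sort lexically
    return newest_first[keep:]

cmd = backup_command("app_db", "/var/backups", datetime.datetime(2019, 1, 2, 3, 4, 5))
print(cmd)  # pg_dump app_db | gzip > /var/backups/app_db-20190102-030405.sql.gz
```

Dropping the generated command into cron, plus a call to prune the old dumps, is essentially the whole "automation layer" at this scale.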
It's the follow-through into automating the QA processes that makes the product great, not the underlying stack (which in most cases is going to have to be treated like a placeholder in the event of real scale). And it mostly isn't substantial engineering challenges at any level: they're "tick off the boxes" exercises, they just punch above their weight in terms of delivered value.
It's the tendency to add more things to configure and more systemic non-linearity ("in this mode it does X, in that mode it does Y, but when Z is enabled both modes do Q") that creates a serious IT headache, and that in turn kills your progress as an independent. Oftentimes you have to give up on using the new stuff, the theoretically good stuff, because the additional layers of automation mean that you end up doing original R&D for an apparently ordinary problem, and you can't actually get a build off the ground. Or you fight through numerous issues and get it working but with no established "best practice" to follow, you only half-understand how to configure it properly, leading to technical debt and future unknowns in the risk profile.
Cleaning up dead/deprecated code and working to consolidate code paths provides opportunity to clean up obsolete configuration settings, remove special cases, and improve the reasonability and maintainability of the software into the future.
Well, financially, if the project is generating revenue worth one person's salary, then it doesn't make sense to build things that would require more personnel to improve maintainability, since you can't afford it. You just have to live with the risk.
Complexity is the enemy in software engineering as much as it is in operations engineering. Sustainable reliability is about doing as little as you can get away with to achieve the reliability results that make your customers happy. That does mean following the advice of TFA and avoiding going straight to k8s for a small indie site that could be deployed with a simple Ansible/Puppet script. However, it also means defining what your targets are and having indicators, so that you're not relying on your gut instincts either: aim for 60% toil, 40% automation at first. Whatever the right balance is, review it regularly as you scale your project up.
You can do a lot with a couple of beefy VPSs and a solid database these days.
"A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system."
 - Systemantics, John Gall
Some searching today found a wikiquote page which led me here also claiming an EWD498 handwritten annotation as the source.
Do you know definitively whether the annotation appears in EWD498?
(I thought at one time I actually found an image with the handwritten annotation, but I can't find it at the moment... Maybe I dreamed that...)
Automation should follow the same rule as other things in a business. Only automate when it becomes painful not to. Or if it is trivially simple to.
For decades, I've been saying this a bit differently: "Software engineering is primarily an exercise in complexity management".
2 years ago we needed to move from the cloud to an internal network. Because of time pressure we got a PC as our first server, where we installed a whole set of tools for our embedded developers: GitLab, Jenkins, an LDAP backend, Nagios, Rocket.Chat, Crowd, file sharing, nginx, Volumerize backups, NFS sharing, build artifact storage, a private Docker registry; we even run a build slave instance on it.
This was supposed to be an intermediate solution until we moved to the real infra. In the new infra they have all of the things: several stages of load balancers, tons of firewalls between the servers, the slaves are physically in a different network; everything is insanely complex and takes at least 20 times longer to set up than you'd expect. This is why we still run on that old PC (we added 7 more as build slaves), and about 200 people have been using it daily for 2 years now, which seems pretty weird, but it just works.
Lately we saw that the network card started hanging and we needed to do a hard reboot, which is not nice. The guys who are close to that PC (we use it remotely) had a USB network card lying around, and we asked them to connect it, because the old one in the tower has most probably reached its end of life after being used so much during the last 2 years.
We're still onboarding more and more people, and it's not clear when we will be able to move to the new, very complex infra.
Also, who is responsible for updating all those apps? Is that a full time job for somebody?
When it's just everything on one server there is surprisingly little to do, so for that we have a rotating sysadmin where one of the team members is responsible for the servers for one sprint and then the next person takes over. For some time we had a dedicated sysadmin but it was never enough work for them so the rotating sysadmin does it on the side right now.
In the new infra this is changing significantly and there it already is much more work and we will grow with at least two more teams to handle it full time.
What you should focus on is finding out what your use cases are, and then building the simplest thing that meets them. For some folks, high availability and zonal resiliency is an absolute must. For other folks, like Peter, it might not be. These context-less platitudes are pretty useless outside of the context in which they're made.
Point being: it's a balance. I tell any startup who will listen: App Engine or Heroku. That'll get you REALLY far and strike a good balance between autoscaling, simplicity, and redundancy.
If you're still using a single server, you're doing something horribly wrong. This random guy's strategy isn't something to be proud of. He just doesn't know that there are better options out there. Simplicity isn't a VM you have to maintain, update, and secure. It's a fully managed PaaS.
NomadList and RemoteOK (from the article) have both had downtime before. They're both much more profitable than whatever startup of the day has decided they desperately need complicated infrastructure to run their CRUD app.
I've fallen for this trap also. I've written an article about deploying a Next.js app to Elastic Beanstalk, when I probably could've just stood up a server and SSH'd in to deploy. I've used Firebase when I could've just stood up a simple REST API and PostgreSQL.
There's nothing horribly wrong with using a single server, he knows there are other (not better) options out there.
Hell, even "two VMs on a third-tier infrastructure provider" gives you much better than 2x the availability of a single VM, because there are discrete events which impact a single VM and don't impact two.
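The back-of-the-envelope math behind that claim, assuming (hypothetically) that each VM is independently up 99% of the time:

```python
# Two independent 99%-available VMs fail together far less often than one
# VM fails alone, so the pair is much better than "2x" the single VM.
p_down = 0.01                 # assumed probability one VM is down
single = 1 - p_down           # availability of one VM: 0.99
both_down = p_down ** 2       # both down at once (independence assumed)
pair = 1 - both_down          # availability of the pair: 0.9999

hours_per_year = 24 * 365
print(round((1 - single) * hours_per_year, 1))  # ~87.6 hours/year down
print(round((1 - pair) * hours_per_year, 2))    # ~0.88 hours/year down
```

Downtime drops by roughly 100x, not 2x. The caveat is correlated failures (a provider-wide outage takes out both VMs at once), which is exactly what the "discrete events" wording is getting at.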
Startups should be taking the easiest route possible to Happy Customers. That's the goal. There are two parts to that: "Happy" and "Customers". You need Customers. That's product-market fit. You also need them to be Happy. Availability is a part of that.
No, a startup does not need a multi-regional strategy behind redundant global ELBs with edge caching. I never said that. You know what's pretty damn easy though? HEROKU. You spin up multiple dynos and you get multi-AZ redundancy, NO EXTRA WORK. App Engine is the same way.
Startups, everywhere, for the love of god, stop spinning up servers. If you can SSH into it, that's a smell. If you HAVE to SSH into it, you're wasting resources. There are so many different "serverless" options out there. Focus on the product. The infrastructure can wait. It's not going anywhere.
It's just a different world outside of the startup, "My total possible user-base is all 7 billion people in the world!" world. This is HN, so startups are the correct, default assumption, but I do see a lot of this worldview leaking into the non-startup worlds as well.
Before NoSQL you had to do expensive/difficult stuff like manual database sharding and buying expensive dedicated hardware. Now you can instantly spin up a few instances.
Considering how much easier it is to put the infrastructure in place IF you ever need it, its odd to me how focused on it people seem to be.
This is so far from the truth. If you built your fairly new app such that it requires multiple servers, then you're overthinking infrastructure at a time when you don't need it. Having multiple servers is great, but having a lean app that can run on a single server requires a different skillset. You don't need HA, multi-AZ AWS deployments behind a load balancer with Route 53 DNS load balancing, powered by auto-scaling fleets, with CI/CD Jenkins and Slack chatops, when you only have a small number of users.
Managing servers is something no startup should be doing. Period. If you can SSH into it, you're wasting time on infrastructure that could be spent finding product-market fit. The number of servers isn't the relevant point; it's the fact that he's using ANY servers.
Guess what, managing a small amount of servers (two) isn't that hard and doesn't take a lot of time, and $40/month gets you really fast CPUs and lots of RAM these days.
As always, generalized absolute statements make no sense: everything should be analyzed in context, and based on specific metrics (in my case: money).
That may sound circular, like I'm saying "PaaS is the best option for startups, because if it doesn't make sense for you, you're not a startup," but I mean it legitimately.
Startups are VERY DIFFERENT from normal businesses, in almost every conceivable way. Great startup engineers do not necessarily make great "normal business" engineers. Great startup infrastructure looks nothing like great normal infrastructure. The same solutions do not work for both.
Startups, generally, have "large" amounts of money and "small" amounts of humans. So when you say "based on my specific metrics (money) PaaS doesn't make sense"... uh, yeah, duh. You're not a startup. Being a small business does not make you a startup. Writing code does not make you a startup.
Startups burn money like hell in the pursuit of a 100x product-market fit. "Burn Rate" is thrown around a lot as a metric. Do you think the term "burn" was chosen by accident? Startups don't "invest" money. They don't spend it. They BURN it, with the hope that the kindling will explode and they can worry about the damage they caused later.
I just had an issue with that blanket statement that it's not kosher to be using a server still. It's totally fine for a startup to be managing their own infrastructure if they've done the cost benefit analysis on it.
An engineer costs, let's say, $7000/month.
Now, all you have to ask is: how much time will the engineer be investing in the infra every month? Let's say you SSH in and update the system packages. 10 minutes? That's $6. Hope nothing goes wrong during the upgrade; remember, you're on the clock. Every minute you spend with that shell open is wasting money. Need to add SSH access for the new guy? Oof, we're gonna have to exchange some SSH keys; that might take 20 minutes. $12. Just got an email from AWS about a new Ubuntu AMI that fixes a security vulnerability... let's plan an upgrade. 1 hour. $30. We want to be more resilient against AWS maintenance downtime on instances, so let's bake an AMI and create an ASG. That's not too hard. Maybe a couple of hours.
Oh... we just paid the difference for Heroku for an entire year. AND our engineers were able to focus on the product.
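Worked out end to end, that arithmetic looks roughly like this. The $7000/month salary and the task times are the comment's own illustrative figures; the 200 working hours a month is my assumption (it gives about $35/hour, close to the $6-per-10-minutes rate above).

```python
# The engineer-cost arithmetic from the comment, with assumed hours.
monthly_salary = 7000
hours_per_month = 200                              # assumption, not from the comment
rate_per_hour = monthly_salary / hours_per_month   # 35.0

task_minutes = {
    "update system packages": 10,
    "add SSH key for new hire": 20,
    "AMI security upgrade": 60,
    "bake AMI + create ASG": 120,
}
total_minutes = sum(task_minutes.values())          # 210 minutes
monthly_toil = total_minutes / 60 * rate_per_hour   # 122.5 dollars

print(f"${rate_per_hour:.2f}/hour, ${monthly_toil:.2f}/month of toil")
```

At these assumed rates the toil is only a bit over $100/month; the thread's real point is that the interruption and coordination cost, not the dollar figure, is what hurts a small team.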
Managing infrastructure is a solved problem for any startup. Simply spin something up on a heroku-like platform and be done. Anything else is wasting precious time and money.
As soon as you add in any amount of sysadmin time, security work, etc. it becomes absurdly better.
But, again, cost doesn't matter. What really matters is that engineer had to WASTE a day working on it. That's a day that could have been spent advancing the product toward market fit. That's a day when that engineer wasn't coordinating with the rest of the team. That sets back her work by a day, which might block another engineer by a day, and it ripples down and down. That's not something startups want to happen. Because Engineers are expensive, yes, but more importantly, they're hard to find. You can't just throw money at hiring talent like you can at Heroku (or App Engine, or Lambda, or whatever works best for your business).
I disagree with this if you actually intend it in the blanket form you present it as. A fully managed PaaS can be the simplest solution overall, but it can also be overly complex.
It all depends on what you're trying to do and what your needs actually are.
How does App Engine compare? From a bird's eye view, it seems a lot of configuration is still needed to communicate within Google's ecosystem, and as Google also pushes k8s, I wonder how many resources they are putting into App Engine to have it run smoothly, vs k8s, where they are the main driving force and have a natural advantage over competitors.
In some businesses, your SLOs will force you to implement high-availability solutions. In other businesses, your users are not paying for high availability, and you are doing nothing wrong by using a single server, assuming you can recover from a failure in a predictable amount of time, meet your SLOs and not lose data in the process.
I agree with the article author: engineers tend to go way overboard, and often nobody asks the really important question: who will pay for those tight SLOs?
While I agree with the general sentiment of using the bare minimum to get the job done, the gratuitous complexity problem is usually caused more by the people using the tools than by the tools themselves.
If you'd please review https://news.ycombinator.com/newsguidelines.html and take the spirit of this site to heart, we'd be grateful. These other links might be helpful for that too:
That seems unnecessarily harsh.
But I do agree with your other points. Kubernetes introduces a lot of moving parts, even if the minikube interface is relatively straightforward. Over the years, experience has taught me that black boxes don't stay opaque for long - something breaks and you ultimately have to learn the internals, usually in an emergency break/fix scenario. At FAANG, they have teams whose specialism is running those sorts of orchestration systems.
I trust systems I understand, and magic scares me.
Conceptually, Kubernetes is a trivial system, a Plan 9 of orchestration systems. Everything is an executor watching a key/value store and triggering on various state configurations. What trips people up is the network/DNS layer, which is also implemented as a bunch of executors watching the key/value store. When it breaks, you're as helpless as a novice, plus debugging networks is painful for anyone. If Kubernetes had a '--network-driver=none' that just used the host network as-is, akin to Minikube's '--vm-driver=none', we'd never have had this 'complexity' argument thrown around.
Fortunately, once you understand the architecture, debugging the network is straightforward: poke around the system network/dns executor logs and the culprit will soon reveal itself.
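The "executors watching a key/value store" model can be sketched as a toy reconcile loop. This is an illustration of the pattern only, not real Kubernetes code; the resource names and state shapes are invented.

```python
# Toy reconcile loop: compare desired state (the key/value store) against
# observed state (the cluster) and emit the actions needed to converge.
def reconcile(desired: dict, observed: dict) -> list:
    """Return the (action, name, ...) tuples needed to make observed match desired."""
    actions = []
    for name, spec in desired.items():
        if observed.get(name) != spec:
            actions.append(("apply", name, spec))   # create or update
    for name in observed:
        if name not in desired:
            actions.append(("delete", name))        # garbage-collect extras
    return actions

store = {"web": {"replicas": 3}}     # desired state in the k/v store
cluster = {"web": {"replicas": 1}}   # what's actually running
print(reconcile(store, cluster))     # [('apply', 'web', {'replicas': 3})]
```

Every controller, including the network/DNS executors mentioned above, is conceptually this loop run forever against the API server's store, which is why reading the right executor's logs usually reveals the culprit.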
When I started using Linux around 20 years ago, with Red Hat 6, manually compiling the kernel was the only way to enable all sorts of hardware support that's enabled by default today. And it was a great learning experience.
Though I do get your point about complexity, and I agree that the kernel is (arguably necessarily) very complex. Unfortunately there are few mainstream, well-supported and broadly compatible alternatives. I miss the days when computers were simple and a single person could understand everything that goes on inside, all the way down to the hardware.
With kubernetes and other container orchestration platforms, I feel that there are viable, less complex alternatives for many use cases. Simpler platforms mean fewer points of failure, and more chance that a small team of generalists would easily be able to fix any issues and get back to producing value for the business.
There are definitely cases where kubernetes et al are warranted, but in order to provide any semblance of robustness would require a dedicated and well versed resource within the business to look after the system. And most businesses (I'd guess > 99%) don't have the scalability or reproducibility requirements to necessitate it. The extra layer of abstraction just isn't worth it for most teams.
This is why working in infra is so awful, people follow this principle for years and then invest a million a year in an infra team whose hands they tie.
Resilience is a feature, though I agree at early stage folks often need less than they think, a single auto-scaling group and load balancer in AWS or something similar isn't much heavier than a single linode VPS, except that it has substantially improved resilience.
Yep, having to wait for development resource to build a scalable system because the legacy is working just fine... and then the legacy falls over and burns because it exceeded capacity, like you warned it would a year before it did, and the business guys suddenly want you to fix it yesterday because now every account manager is missing their end-of-month reports...
A significant downside of Scrum as a methodology is that it assumes product owners listen to the engineering team as well as the sales people yelling at them 24/7 for latest feature X before making prioritisation decisions.
Sticking to that order is critical.
Remember Soylent boasting about their elaborate compute infrastructure, for a business that made a few sales per minute? I once pointed out that they could handle their sales volume on a HostGator Hatchling account with an off the shelf shopping cart program, for a few dollars a month. But then they wouldn't be a "tech" company.
Soylent is apparently still around, competing with SlimFast and Ensure.
As the CTO of a small startup, I can tell you 0.1% downtime translates into vast quantities of lost trust and irreversible damage to our brand. If Netflix goes offline for a bit, they have some angry customers and maybe lose some cash, but they'll still be chugging along. However, if we're down for a while, our customers, who are already taking a risk by trusting a new company, may disappear forever, and as a smaller/newer company the overall brand damage is far worse.
While I largely agree with the premise that companies overdo it on infrastructure, I strongly disagree that my uptime is less important than that of FAANG companies.
In a long chain of managers the manager above will only hear "there was an issue and it's fixed". The upper manager doesn't really have the time and the will to dig into the details.
Whether you're talking about a new Twitter or Reddit, which used to go down all the time (like, constantly), or some business startup, a downtime of 30 mins or so no-one really cares about, in my experience. At worst you'll get a phone call or two, a few emails, but if you handle them compassionately you'll be fine.
You can run a startup serving thousands or tens of thousands of customers on a single server, with no microservices and a simple server-side MVC setup, and never have a $600-a-year dedicated server even break 15% CPU.
I once took down a server by flooding the email server with error emails, which generated more error emails, which ran out of disk space, etc., etc. Took an hour to get the site up again.
Barely a blip in revenue. Client finally coughed up for proper email hosting rather than running their own server on the same box as their site, as I'd been advising them for 4 years.
This is a gross over-generalization. There are many kinds of problems in computing that can be approached by a startup AND which require large amounts of compute. The soup du jour is machine learning problems.
As for downtime, OP has a perfectly valid point. If you're building a B2B application, the customer has already taken a risk on you. If you go down in the middle of a busy workday, even for 30 minutes, you can be damned sure that someone at your client is getting some heat for taking that risk rather than going with $BIG_CO.
The early adopters are going to give you slack.
As for the tiny number of startups solving ML business problems, compared to the thousands of web apps launched daily on product hunt, if your USP is computing power and special tech, then obviously this advice does not apply.
Edit: You should really disclose you work on AWS when discussing this sort of stuff
Yikes. Is this irrelevant attempt to disarm them why we have these obnoxious "disclaimers" all over HN?
It's an absolutely meaningless gesture.
You're imagining the inconsequential downtime of Twitter or Reddit instead of a small B2B company trying to hook its first customers.
I agree, but context is key: if you're bootstrapping a start up, you don't need these things. You need to prove your product, then you scale.
But automation != scale. Having a process that streamlines your delivery, regardless of scale, can be helpful. I've screwed up enough single-box deployments to learn that lesson.
Stepping back: our industry is pretty horrible about creating tools that can start small and scale up. I like where CockroachDB is going for that reason (just as an example). It would be great to start with a single database, and have a clear path to scale it horizontally across multiple nodes and data centers.
Kubes might get there... I'm not sure how focused they are on making small things work well, though... any examples of that?
Honestly, I do not even agree with his premise that you should start at bare metal with manual deployments. Getting some basic automation set up is STUPID EASY between Travis CI, Jenkins, Google App Engine, etc. I feel that toiling to deploy your services is a massive waste of time.
Obviously, the reality of this lands somewhere in the middle. I feel like Kubernetes is the whipping boy for "over-complicated infrastructure" undeservedly. Hosting it yourself I am sure is a bear, but there are a LOT of great hosted solutions available.
Google Hosted Kubernetes makes my job easier.
I write a few Deployments, Services, and Ingress controllers, set up keel.sh to update my deployments based on docker image uploads, and BOOM: Awesome, absurdly automated infrastructure.
Log aggregation? Comes out of the box with Stackdriver logging.
Monitoring and alerts? Comes out of the box with Stackdriver monitoring.
My developers can edit the Kubernetes resources through the Google Cloud GUI.
Deploying to an environment is as simple as pushing a docker image with the correct tag and letting Keel take care of the rest.
We have looked at alternatives, including bare metal, MULTIPLE TIMES, but in the end, we keep on deciding that Kubernetes is doing a lot for us and we do not want to stop using it.
I remember chatting with someone intimately familiar with k8s and docker. We were talking about an app I was working on which deployed to heroku. He asked how many dynos and I told him (it was under 10) and he said: "yup, you'll not need k8s".
10 dynos and a big database can serve an awful lot of users.
I've been in meetings where the combined cost of the time taken to discuss whether we should use a thing was more than the cost of the thing. Utterly infuriating.
Even when I worked for an agency there seemed to be an automatic discounting of the cost of people's time vs spending actual cash (and cashflow wasn't an issue).
I appreciate that there are often good reasons to DIY, but when there are not, I will always favour something off-the-shelf unless it is significantly more expensive.
The mindset shouldn't be about building an ideal infrastructure; it should be about having a reliable infrastructure instead. That surely doesn't require having every brand-new cool thing that was advertised on HN within the last 2 weeks.
But a fully automated pipeline for code delivery, and configuration as code, are essential. It doesn't require that much time (especially if you reuse one of the thousands of examples on GitHub), but it will save you later. Even for a single node on Linode.
Even though it won't help you build new features and attract new users, just think about it as a necessary action to keep your existing users. Nobody wants to use the thing that isn't available because a developer messed up a deployment command, didn't notice it, and left home.
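As a hedged sketch of what such a pipeline can mean at single-node scale, here is a deploy expressed as an ordered, repeatable list of steps; the host, paths, and service name are made-up placeholders, not anyone's real setup.

```python
# A deploy as data: upload, atomically switch a symlink, reload the app.
# Running these steps via a script replaces the error-prone
# "type commands by hand over SSH" flow the comment warns about.
def deploy_steps(host: str, release: str) -> list:
    root = f"/srv/app/releases/{release}"
    return [
        f"rsync -az ./build/ {host}:{root}/",              # upload new code
        f"ssh {host} 'ln -sfn {root} /srv/app/current'",   # atomic switch
        f"ssh {host} 'systemctl reload app'",              # pick up new code
    ]

for step in deploy_steps("deploy@prod1", "2019-06-01"):
    print(step)
```

Because the symlink switch is atomic, a botched upload never leaves the site half-deployed, and rolling back is just re-pointing 'current' at the previous release directory.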
After the last time this came up, I watched a video of the Nomadlist guy, Pieter Levels, talking about it, and he said:
>I've woken up so many times at 4:00 a.m. to just check if my website is down and I have to do all this stuff and then I'm awake for three hours because the server crashed. (https://www.youtube.com/watch?v=6reLWfFNer0&feature=youtu.be...)
So maybe "just PHP on Linode" has some drawbacks.
They employ 100s of people because, during budget allocation, they often get a lot of money to spend in a short period of time. Multi-year projects are hard to keep people interested in.
The reason for multiple vendors is because if you have just one they screw you. You actually need the internal competition.
And most of the complexity in tech stacks is actually coming as a result of the move to agile. You don't have a large upfront requirements-gathering and architecture process across all of the teams anymore. Instead each team is given just enough for that sprint and told to design it themselves, and then people are surprised when there are 20 different implementation styles.
I wish it was sarcastic
But as even the author allows, at some point you do need this stuff. The real truth is closer to "You don’t need all that complex/expensive/distracting infrastructure... until you do."
The other thing I might posit, although I haven't sat down and worked out a complete argument for it, is that sometimes, even at a very early stage, a bit of more complex infrastructure (automation in particular) can be very helpful... specifically when it serves to let you run more experiments per unit of time/effort.
>That single @linode VPS takes 50,000,000+ requests per month for http://nomadlist.com , http://remoteok.io
So it's not that early stage. I think remoteok.io is currently the worlds #1 remote working site. ("Remote OK is the #1 remote jobs board in the world trusted by millions of remote workers" it says)
The fact that he uses PHP with PHP-FPM solves the problem of deploying a new version (starting the new version, switching new connections to the new version, draining existing connections, stopping old version).
But when using a single host machine, you still have the issue of updating the kernel and the OS, and this is sometimes better done in an "immutable" way, by setting up a new VM and switching traffic to it. This is where things become a bit more complex. You have to use things like AWS EBS or Google Cloud Persistent Disk, detach them from the old VM and reattach them to the new VM. You also need to use floating IP or a load balancer.
In other words, baby-sitting the machine and its OS, either done manually or automatically, is a real pain.
Maybe the real simplicity lies in using something like Google App Engine, Heroku, Clever Cloud or Scalingo.
Realistically developers are compensated by a range of different things - free gourmet meals might be one - another even more important one being career development.
If you let developers who want to use kubernetes use kubernetes even if it's not strictly necessary it might be a net positive for the company.
Hell, even if kubernetes is a slight negative compared to the simpler equivalent it could be a net positive because it gave somebody who wanted it a career boost from experience with a "hot" technology which made them happy.
Now, only if RDS is going to cause a massive headache would I be seriously against it - provided we walk into this situation with open eyes.
Obviously, you shouldn't prematurely optimize, but you also shouldn't wait until you are getting multiple hours of downtime every day while you are scrambling to get a better solution in place.
And I think many times we underestimate the server power we will need as we grow.
After a few hundred users, I used to extrapolate that I would never really need more than a large VM and a database instance. But then we added more and more functions and features, and more and more compute power is needed per user now.
Then we started running into issues where certain procedures would take too long so they need to be queued, well then you have the overhead of a queuing system and on and on.
It's kind of the same thing that happens with companies as you hire employees. You think, oh, we have 5 developers now. We will never need more than 15 devs and some customer service and sales. But then you need someone to handle HR, and maybe a bookkeeper, and then people to manage the non developers, and then someone to manage the project and it just kind of grows exponentially.
I haven't heard this term before. I love it
(I agree with the whole "don't run k8 et al" for your side project, though obv.)
This article resonated with me as well.
We are currently running a fairly busy web app on a single Linux cloud-based VM instance, occasionally spinning up new instances during higher loads, with a deployment pipeline based on small Python and shell scripts. Maybe rudimentary by some current standards, but it's been working sufficiently well.
Fully second that. I started some of my best projects on the small Intel Atom Linux server at my residence. And we all know the garage story of Facebook (and similar)
Is essentially the TL;DR of that article. Being dismissive while not really substantially expounding on the faults of other processes is just bland contrarianism. Bald assertions need only be met with bald assertions.
Allow me to write a retort:
You do need all of that infrastructure and you should spend even more time making sure your pipelines are a well oiled machine to reduce deployment fears and establish confidence in your infrastructure. Make it nice to use. That new tool that came out? It was written for a reason, and is probably tackling a problem you weren't even aware you'd be facing. So stick it in and call it a day with the knowledge that you've avoided spending time on a problem someone else has already solved.
Your turn, cynical blog person.
I've seen / worked with a number of startups that used the author's advice and just had a simple setup "that worked". Until it didn't.
Here's how it typically plays out:
Our "dev" setup "the server" for us at <AWS/GCP/DO/Linode/etc> and everything worked fine until the provider restarted the server, or an OS upgrade happened, or we fired the dev we found on Upwork and he shut down the server. Now X doesn't work. We don't know how to reproduce what he did.
Now you are left with trying to go through someone else's bash history and decipher what steps they used to build the server. Did they forget to tell a service to autorun? Who knows.
I agree with OP that it's possible to over-engineer a fancy CI / CD pipeline and matching infrastructure for a founder that just has an idea and zero users when you should be getting product/market fit, esp. when you just have one developer working on the system. However, the opposite is also true. It's possible to under-engineer the infrastructure where developer productivity is dramatically slowed when you're spending a significant amount of your dev's time doing deploys and blocking other work from happening at the same time. This can happen fast when you hire dev #2 and #3. This isn't even getting into the perils around security and scaling when you play fast and loose with the infrastructure side of the house.
My last job was at a financial start-up worth about 200M at the time. And our server setup was dead simple: 8 servers and a load balancer. No containers, no bullshit, just servers running JVMs.