The number of companies that use K8s with no business or technological justification for it is staggering. It is the number one blocker in moving to bare metal/on-prem when cloud costs become too high.
Yes, on-prem has its gotchas, just like the EKS deployment described in the post, but everything is so much simpler and more straightforward that the on-prem side of things is much easier to grasp.
I've come at this from a slightly different angle. I've seen many clients running k8s on expensive cloud instances, but to me that is solving the same problems twice. Both k8s and cloud instances solve a highly related and overlapping set of problems.
Instead you can take k8s, deploy it to bare metal, and get much, much more power for a much lower cost. Of course this requires some technical knowledge, but the benefits are significant (lower costs, stable costs, no vendor lock-in, all the postgres extensions you want, response times halved, etc.).
k8s smooths over the vagaries of bare metal very nicely.
If you'll excuse a quick plug for my work: We [1] offer a middle ground for this, whereby we set up and manage all of this for you. We take over all DevOps and infrastructure responsibility while also cutting spend by around 50% (cloud hardware really is that expensive in comparison).
>Instead you can take k8s, deploy it to bare metal, and get much, much more power for a much lower cost. Of course this requires some technical knowledge, but the benefits are significant (lower costs, stable costs, no vendor lock-in, all the postgres extensions you want, response times halved, etc.).
>all the postgres extensions you want
You can run Postgres in any managed K8s environment (say AWS EKS) just fine and enable any extensions you want as well. Unless you're conflating this with managed Postgres solutions like RDS, which would imply that the only way to run databases is by using your cloud of choice's managed service, which obviously isn't true.
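To illustrate, a rough sketch only (the names are placeholders, and in production you'd add persistent volumes, credentials via Secrets, and probably an operator like CloudNativePG rather than a hand-rolled StatefulSet):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: pg
    spec:
      serviceName: pg
      replicas: 1
      selector:
        matchLabels:
          app: pg
      template:
        metadata:
          labels:
            app: pg
        spec:
          containers:
            - name: postgres
              image: postgres:16   # official image; bake extra extensions into your own image if needed
              # server flags are passed straight through the entrypoint
              args: ["-c", "shared_preload_libraries=pg_stat_statements"]
              env:
                - name: POSTGRES_PASSWORD
                  value: change-me   # use a Secret in a real setup
              ports:
                - containerPort: 5432

Then a plain CREATE EXTENSION pg_stat_statements; inside the database does the rest. No cloud feature involved, just the image and its config.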
> You can run Postgres in any managed K8s environment (say AWS EKS) just fine and enable any extensions you want as well.
You absolutely can do this, and we do indeed run Postgres in-cluster.
We generally see that people prefer a managed solution when it comes to operating their databases, which means that on their (e.g.) AWS EKS clusters they often use RDS rather than running the DB in-cluster.
Our service is also a managed service, and it comes with in-cluster databases. So clients still get a managed service, but without the limitations of (e.g.) RDS.
Could you expand a bit on the point of K8S being a blocker to moving to on-prem?
Naively, I would think it would be neutral, since I would assume that if a customer gets k8s running on-prem, then apps designed for running in k8s should have a straightforward migration path?
I can expand a little bit, but based on your question, I suspect you may know everything I'm going to type.
In cloud environments, it's pretty common that your cloud provider has specific implementations of Kubernetes objects, either by creating custom resources that you can make use of, or just building opinionated default instances of things like storage classes, load balancers, etc.
It's pretty easy to not think about the implementation details of, say, an object-storage-backed PVC until you need to do it in a K8s cluster that doesn't already have your desired storage class. Then you've got to figure out how to map your simple-but-custom $thing from provider-managed to platform-managed. If you're moving into Rancher, for instance, it's relatively batteries-included, but there are definitely considerations you need to make for things like how machines are built from a disk storage perspective and where Longhorn drives are mapped, for instance.
It's like that for a ton of stuff; a whole lot of the Kubernetes/outside-infrastructure interface works this way. Networking, storage, maybe even certificate management: those all need consideration if you're migrating from cloud to on-prem.
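To make the storage case concrete, a rough sketch (the class name and parameters are made up for illustration; the provisioners are the stock AWS EBS CSI driver and Longhorn's CSI driver):

    # What a cloud-managed class looks like on EKS (EBS CSI driver):
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: app-data
    provisioner: ebs.csi.aws.com
    parameters:
      type: gp3
    ---
    # The on-prem equivalent you have to provide yourself, e.g. with Longhorn:
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: app-data
    provisioner: driver.longhorn.io
    parameters:
      numberOfReplicas: "3"

The PVCs on the app side stay the same, but the class behind them, and everything it implies about disks, replication, and where the data actually lives, is now yours to rebuild.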
I think K8s distributions like k3s make this way simpler. If you're wanting to run distributed object storage on bare metal then you're in for a lot of complexity, with or without k8s.
I've run 3-server k3s clusters on bare metal and they work very well with little maintenance. I didn't do anything special, and while it's more complex than some Ansible scripts and HAProxy, I think the breadth of tooling makes it worth it.
I ran k3s locally during the pandemic, and the only issue at the time was getting PVs/PVCs provisioned cleanly; I think Longhorn was just reaching maturity, and five years ago the docs were pretty sparse. But yeah, k3s is a dream to work with in 2025: the docs are great, and as long as you stay on the happy path and your network is set up, it's about as effortless as cluster computing can get.
I've been running one for a couple of years now, and even in that short a time Longhorn has made huge leaps in maturity. It was/is definitely the weakest link.
Cost-wise it's a no-brainer. Three servers with 64 GB ECC and 6 cores each for the price of three m5.large instances. So 192 GB and 18 cores for the price of 24 GB and 6 cores.
I think one of the reasons k8s can get a bad rap is how expensive it is to even approach doing it right with cloud hosting, but to me it seems like a perfect use case for bare metal, where there is no built-in orchestration.
Here is your business justification: K8s/Helm charts have become the de facto standard for packaging applications for on-premise deployments. If you choose any other deployment option on a setup/support contract, the supplier will likely charge you for additional hours.
This is also what we observe while building Distr. ISVs need a container registry to hand over these images to their customers. Our container registry will be purpose-built for this use case.
> The amount of companies who use K8s when they have no business nor technological justification for it is staggering.
I remember a guy I used to work with telling me he'd been at a consulting shop and they used Kubernetes for everything - including static marketing sites. I assume it was a combination of resume and bill padding.
Out of interest do you recommend any good places to host a machine in the US? A major part of why I like cloud is because it really simplifies the hardware maintenance.
I'm running Kubernetes on DigitalOcean. It was under $100/mo until last week, when I upgraded a couple of nodes because memory was getting a bit tight. That was just a couple clicks so not a big deal.
We've been with them for over 10 years now. Mostly pretty happy. They've had a couple of small outages.
Hmm, not sure what you're asking then. You want a physical computer that you have access to?
DigitalOcean is just a hosting provider. You can buy regular 'Droplets' (plain virtual machines) and host whatever you want on there, no serverless cloud junk. They also offer managed Kubernetes. I use both.
Yeah, we know about the ZFS encryption with send/receive bug; it's frustrating our attempts to get really nice HA support on our logging system. But so far it appears that just deleting the offending snapshot and creating a new one works, and we're funding some research into the issue as well.
I run BareMetalSavings.com[0], a toy for ballpark-estimating bare-metal/cloud savings, and the companies that have it hardest to move away from the cloud are those who are highly dependent on Kubernetes.
It's great for the devs but I wouldn't want to operate a cluster.
HBM or not, those latest server chips are crazy fast and efficient. You can probably condense 8 servers from just a few years ago into one latest-gen Epyc.
I run BareMetalSavings.com[0], a toy for ballpark-estimating bare-metal/cloud savings, and the things you can do with just a few servers today are pretty crazy.
Core counts have increased dramatically. The latest AMD server CPUs have up to 192 cores. The Zen1 top model had only 32 cores and that was already a lot compared to Intel. However, the power consumption has also increased: the current top model has a TDP of 500 W.
Does absolute power consumption matter, or would it not be better to focus on per-core power consumption? E.g. running six 32-core CPUs seems unlikely to be better than one 192-core CPU.
Yes, per-core power consumption, or better, performance per Watt, is usually more relevant than the total power consumption. And one high-core-count CPU is usually better than the same number of cores spread across multiple CPUs. (That is, unless you are trying to maximize memory bandwidth per Watt.)
What I wanted to get at is that the pure core count can be misleading if you care about power consumption. If you don't, and just look at performance, the current CPU generations are monsters. But if you care about performance per Watt, the improvement isn't that large. The Zen1 CPU I was talking about had a TDP of 180 W. So you get 6x as many cores, but the power consumption increases by roughly 2.8x.
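Rough math with the numbers above (ignoring per-core IPC and clock gains, so it understates the real performance-per-Watt improvement):

    Zen1 top model:     32 cores / 180 W ≈ 0.18 cores per Watt
    Current top model: 192 cores / 500 W ≈ 0.38 cores per Watt
    Net improvement:   ≈ 2.2x more cores per Watt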
That could be an interesting site when it's done, but I couldn't see where you factor in the price of electricity for running bare metal in a 24/7 climate-controlled environment, which I would expect is the biggest expense by far.
The first FAQ question addresses exactly that: colocation costs are added to every bare metal item (even storage drives).
Note that this isn't intended to be used for accounting, but for estimating, and it's good at that. If anything, it's more favorable to the cloud (e.g., no egress costs).
If you're on the cloud right now and BMS shows you can save a lot of money, that's a good indicator to carefully research the subject.
To expand on this, I run BareMetalSavings.com[0], and the most common reason people stay with the cloud is that it's very hard for them to maintain their own K8s cluster(s), which they want to keep because they're great for any non-ops developer.
So those savings are possible only if your devs are willing to leave the lock-in of comfort.
This is compelling but it would be useful to compare upfront costs here. Investing $20,000+ in a server isn't feasible for many. I'd also be curious to know how much a failsafe (perhaps "heatable" cold storage, at least for the example) would cost.
You pay a datacenter to put it in a rack and connect power and uplinks, then treat it like a big EC2 instance (minus the built-in firewall). Now you just need someone who knows how to secure an EC2 instance and run your preferred software there (with failover and stuff).
If you run a single-digit number of servers and replace them every 5 years, you will probably never get a hardware failure. If you're unlucky and it still happens, get someone to diagnose what's wrong, ship replacement parts to the data center, and pay their tech to install them in your server.
Bare metal at scale is difficult. A small number of bare metal servers is easy. If your needs are average enough you can even just rent them so you don't have capital costs and aren't responsible for fixing hardware issues.
Are you going to risk your entire business over "probably never get a hardware failure" that, if it hits, might result in days of downtime to resolve? I wouldn't.
Just pay 2x for the hardware and have a hot standby, 1990s-style. Practice switching between the boxes every month or so; it should be imperceptible to the customers and nearly a non-event for the ops.
How many hours of labor does that take every month you fail over? What about hot hard drive spares? Do you want networking redundancy? How about data backups? A second set of hot servers in another physical data center?
All of that costs money and time. You're probably better off using cloud hosting and focusing on your unique offering than having that expertise and coordination in house.
Sounds like an opportunity for someone (you?) to offer an abstraction slightly above bare metal that handles the stuff you said to do, charging more than bare metal but less than the other stuff. How much daylight is there between those prices?
If anything this makes me feel better, since my workload doesn't require very beefy machines and the amount I'd be saving is basically irrelevant compared to my labor costs.
Yes, bare metal is not a panacea. Some use cases require zero personnel changes when going bare metal (some even see reduced labor), and some are very much the opposite.
Love the tool and UI you built. I homelab, and while it's not always on 24/7, it's way more affordable to run on my own bare metal than to pay a cloud provider. I also get super fast local speeds.
Didn't know there was a verb for it! I "homelab" too and so far am very happy. With a (free) CDN in front of it, it can handle spikes in traffic (which are rare anyway), and everything is simple and mostly free (since the machines are already there).
I rent a moderately sized Hetzner box running FreeBSD and just spin up a jail (ZFS helps here) or, if necessary, a bhyve VM per 'thing.'
I'd fire a box up at home instead, but at ~£35/mo I can never quite find the motivation compared to spending the time hacking on one of my actual projects.
(I do suspect if I ever -did- find the motivation I'd wonder why I hadn't done so sooner; so it goes)