The number of companies that use K8s with no business or technological justification for it is staggering. It is the number one blocker in moving to bare metal/on-prem when cloud costs become too high.
Yes, on-prem has its gotchas, just like the EKS deployment described in the post, but everything is so much simpler and more straightforward that the on-prem side of things is much easier to grasp.
I've come at this from a slightly different angle. I've seen many clients running k8s on expensive cloud instances, but to me that is solving the same problems twice. Both k8s and cloud instances solve a highly related and overlapping set of problems.
Instead you can take k8s, deploy it to bare metal, and get much, much more power for a much lower cost. Of course this requires some technical knowledge, but the benefits are significant (lower costs, stable costs, no vendor lock-in, all the postgres extensions you want, response times halved, etc.).
k8s smooths over the vagaries of bare metal very nicely.
If you'll excuse a quick plug for my work: We [1] offer a middle ground for this, whereby we set up and manage all of this for you. We take over all DevOps and infrastructure responsibility while also cutting spend by around 50% (cloud hardware really is that expensive in comparison).
>Instead you can take k8s, deploy it to bare metal, and get much, much more power for a much lower cost. Of course this requires some technical knowledge, but the benefits are significant (lower costs, stable costs, no vendor lock-in, all the postgres extensions you want, response times halved, etc.).
>all the postgres extensions you want
You can run Postgres in any managed K8s environment (say AWS EKS) just fine and enable any extensions you want as well. Unless you're conflating this with managed Postgres solutions like RDS, which would imply that the only way to run databases is by using your cloud of choice's managed service, which obviously isn't true.
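To illustrate, a rough sketch only (the names are placeholders, and in production you'd add persistent volumes, credentials via Secrets, and probably an operator like CloudNativePG rather than a hand-rolled StatefulSet):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: pg
    spec:
      serviceName: pg
      replicas: 1
      selector:
        matchLabels:
          app: pg
      template:
        metadata:
          labels:
            app: pg
        spec:
          containers:
            - name: postgres
              image: postgres:16   # official image; bake extra extensions into your own image if needed
              # server flags are passed straight through the entrypoint
              args: ["-c", "shared_preload_libraries=pg_stat_statements"]
              env:
                - name: POSTGRES_PASSWORD
                  value: change-me   # use a Secret in a real setup
              ports:
                - containerPort: 5432

Then a plain CREATE EXTENSION pg_stat_statements; inside the database does the rest. No cloud feature involved, just the image and its config.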
> You can run Postgres in any managed K8s environment (say AWS EKS) just fine and enable any extensions you want as well.
You absolutely can do this, and we do indeed run Postgres in-cluster.
We generally see that people prefer a managed solution when it comes to operating their databases, which means that on their (e.g.) AWS EKS clusters they often use RDS rather than running the DB in-cluster.
Our service is also a managed service, and it comes with in-cluster databases. So clients still get a managed service, but without the limitations of (e.g.) RDS.
Could you expand a bit on the point of K8S being a blocker to moving to on-prem?
Naively, I would think it would be neutral, since I would assume that if a customer gets k8s running on-prem, then apps designed for running in k8s should have a straightforward migration path?
I can expand a little bit, but based on your question, I suspect you may know everything I'm going to type.
In cloud environments, it's pretty common that your cloud provider has specific implementations of Kubernetes objects, either by creating custom resources that you can make use of, or just building opinionated default instances of things like storage classes, load balancers, etc.
It's pretty easy to not think about the implementation details of, say, an object-storage-backed PVC until you need to do it in a K8s cluster that doesn't already have your desired storage class. Then you've got to figure out how to map your simple-but-custom $thing from provider-managed to platform-managed. If you're moving into Rancher, for instance, it's relatively batteries-included, but there are definitely considerations you need to make for things like how machines are built from a disk storage perspective and where Longhorn drives are mapped, for instance.
It's like that for a ton of stuff; a whole lot of the Kubernetes/outside-infrastructure interface works this way. Networking, storage, maybe even certificate management: those all need consideration if you're migrating from cloud to on-prem.
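To make the storage case concrete, a rough sketch (the class name and parameters are made up for illustration; the provisioners are the stock AWS EBS CSI driver and Longhorn's CSI driver):

    # What a cloud-managed class looks like on EKS (EBS CSI driver):
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: app-data
    provisioner: ebs.csi.aws.com
    parameters:
      type: gp3
    ---
    # The on-prem equivalent you have to provide yourself, e.g. with Longhorn:
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: app-data
    provisioner: driver.longhorn.io
    parameters:
      numberOfReplicas: "3"

The PVCs on the app side stay the same, but the class behind them, and everything it implies about disks, replication, and where the data actually lives, is now yours to rebuild.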
I think K8s distributions like k3s make this way simpler. If you're wanting to run distributed object storage on bare metal then you're in for a lot of complexity, with or without k8s.
I've run 3-server k3s clusters on bare metal and they work very well with little maintenance. I didn't do anything special, and while it's more complex than some Ansible scripts and HAProxy, I think the breadth of tooling makes it worth it.
I ran k3s locally during the pandemic, and the only issue at the time was getting PVs/PVCs provisioned cleanly; I think Longhorn was just reaching maturity, and five years ago the docs were pretty sparse. But yeah, k3s is a dream to work with in 2025: the docs are great, and as long as you stay on the happy path and your network is set up, it's about as effortless as cluster computing can get.
I've been running one for a couple of years now, and even in that short a time Longhorn has made huge leaps in maturity. It was/is definitely the weakest link.
Cost-wise it's a no-brainer. Three servers with 64 GB ECC and 6 cores each for the price of three m5.large instances. So 192 GB and 18 cores for the price of 24 GB and 6 cores.
I think one of the reasons k8s can get a bad rap is how expensive it is to even approach doing it right with cloud hosting, but to me it seems like a perfect use case for bare metal, where there is no built-in orchestration.
Here is your business justification: K8s/Helm charts have become the de facto standard for packaging applications for on-premise deployments. If you choose any other deployment option on a setup/support contract, the supplier will likely charge you for additional hours.
This is also what we observe while building Distr. ISVs need a container registry to hand over these images to their customers. Our container registry will be purpose-built for this use case.
> The amount of companies who use K8s when they have no business nor technological justification for it is staggering.
I remember a guy I used to work with telling me he'd been at a consulting shop and they used Kubernetes for everything - including static marketing sites. I assume it was a combination of resume and bill padding.
Out of interest do you recommend any good places to host a machine in the US? A major part of why I like cloud is because it really simplifies the hardware maintenance.
I'm running Kubernetes on DigitalOcean. It was under $100/mo until last week, when I upgraded a couple of nodes because memory was getting a bit tight. That was just a couple clicks so not a big deal.
We've been with them for over 10 years now. Mostly pretty happy. They've had a couple of small outages.
Hmm, not sure what you're asking then. You want a physical computer that you have access to?
DigitalOcean is just a hosting provider. You can buy regular 'Droplets' (plain virtual machines) and host whatever you want on there, no serverless cloud junk. They also offer managed Kubernetes. I use both.
Yeah, we know about the ZFS encryption with send/receive bug; it's frustrating our attempts to get really nice HA support on our logging system. But so far it appears that just deleting the offending snapshot and creating a new one works, and we're funding some research into the issue as well.
I run BareMetalSavings.com[0], a toy for ballpark-estimating bare-metal/cloud savings, and the companies that have it hardest to move away from the cloud are those who are highly dependent on Kubernetes.
It's great for the devs but I wouldn't want to operate a cluster.
HBM or not, those latest server chips are crazy fast and efficient. You can probably condense 8 servers from just a few years ago into one latest-gen Epyc.
I run BareMetalSavings.com[0], a toy for ballpark-estimating bare-metal/cloud savings, and the things you can do with just a few servers today are pretty crazy.
Core counts have increased dramatically. The latest AMD server CPUs have up to 192 cores. The Zen1 top model had only 32 cores and that was already a lot compared to Intel. However, the power consumption has also increased: the current top model has a TDP of 500 W.
Does absolute power consumption matter, or would it not be better to focus on per-core power consumption? E.g. running six 32-core CPUs seems unlikely to be better than one 192-core CPU.
Yes, per-core power consumption, or better, performance per Watt, is usually more relevant than the total power consumption. And one high-core-count CPU is usually better than the same number of cores spread across multiple CPUs. (That is, unless you are trying to maximize memory bandwidth per Watt.)
What I wanted to get at is that the pure core count can be misleading if you care about power consumption. If you don't, and just look at performance, the current CPU generations are monsters. But if you care about performance per Watt, the improvement isn't that large. The Zen1 CPU I was talking about had a TDP of 180 W. So you get 6x as many cores, but the power consumption increases by roughly 2.8x.
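Rough math with the numbers above (ignoring per-core IPC and clock gains, so it understates the real performance-per-Watt improvement):

    Zen1 top model:     32 cores / 180 W ≈ 0.18 cores per Watt
    Current top model: 192 cores / 500 W ≈ 0.38 cores per Watt
    Net improvement:   ≈ 2.2x more cores per Watt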
That could be an interesting site when it's done, but I couldn't see where you factor in the price of electricity for running bare metal in a 24/7 climate-controlled environment, which I would expect is the biggest expense by far.
The first FAQ question addresses exactly that: colocation costs are added to every bare metal item (even storage drives).
Note that this isn't intended to be used for accounting, but for estimating, and it's good at that. If anything, it's more favorable to the cloud (e.g., no egress costs).
If you're on the cloud right now and BMS shows you can save a lot of money, that's a good indicator to carefully research the subject.
To expand on this, I run BareMetalSavings.com[0], and the most common reason people stay with the cloud is that it's very hard for them to maintain their own K8s cluster(s), which they want to keep because they're great for any non-ops developer.
So those savings are possible only if your devs are willing to leave the lock-in of comfort.
This is compelling but it would be useful to compare upfront costs here. Investing $20,000+ in a server isn't feasible for many. I'd also be curious to know how much a failsafe (perhaps "heatable" cold storage, at least for the example) would cost.
You pay a datacenter to put it in a rack and connect power and uplinks, then treat it like a big EC2 instance (minus the built-in firewall). Now you just need someone who knows how to secure an EC2 instance and run your preferred software there (with failover and stuff).
If you run a single-digit number of servers and replace them every 5 years, you will probably never get a hardware failure. If you're unlucky and it still happens, get someone to diagnose what's wrong, ship replacement parts to the data center, and pay their tech to install them in your server.
Bare metal at scale is difficult. A small number of bare metal servers is easy. If your needs are average enough you can even just rent them so you don't have capital costs and aren't responsible for fixing hardware issues.
Are you going to risk your entire business over "probably never get a hardware failure" that, if it hits, might result in days of downtime to resolve? I wouldn't.
Just pay 2x for the hardware and have a hot standby, 1990s-style. Practice switching between the boxes every month or so; it should be imperceptible to the customers and nearly a non-event for the ops.
How many hours of labor does that take every month you fail over? What about hot hard drive spares? Do you want networking redundancy? How about data backups? A second set of hot servers in another physical data center?
All of that costs money and time. You're probably better off using cloud hosting and focusing on your unique offering than having that expertise and coordination in house.
Sounds like an opportunity for someone (you?) to offer an abstraction slightly above bare metal that handles the stuff you said to do, charging more than bare metal but less than the other stuff. How much daylight is there between those prices?
If anything this makes me feel better, since my workload doesn't require very beefy machines and the amount I'd be saving is basically irrelevant compared to my labor costs.
Yes, bare metal is not a panacea. Some use cases require zero personnel changes when going bare metal (some even see reduced labor), and some are very much the opposite.
Love the tool and UI you built. I homelab, and while it's not always on 24/7, it's way more affordable to run on my own bare metal than to pay a cloud provider. I also get super fast local speeds.
Didn't know there was a verb for it! I "homelab" too and so far am very happy. With a (free) CDN in front of it, it can handle spikes in traffic (which are rare anyway), and everything is simple and mostly free (since the machines are already there).
I rent a moderately sized Hetzner box running FreeBSD and just spin up a jail (ZFS helps here) or, if necessary, a bhyve VM per 'thing.'
I'd fire a box up at home instead, but at ~£35/mo I can never quite find the motivation compared to spending the time hacking on one of my actual projects.
(I do suspect if I ever -did- find the motivation I'd wonder why I hadn't done so sooner; so it goes)