Hacker News new | past | comments | ask | show | jobs | submit login

The nice thing about running a DB inside a cluster is running your entire application, end to end, through one unified declarative model. It's really really easy to spin up a brand new dev or staging environment.

generally though in production, you're not going to be taking down DBs on purpose. If it's not supposed to be ephemeral, it doesn't fit the model

We run dev environments and some staging entirely inside Kubernetes, but in prod we run all of our stateful components outside K8s (generally using things like RDS and ElastiCache).

Anecdotally, keeping stateful components outside of K8s makes running your cluster and application so much simpler and it is much easier to maintain and troubleshoot. The burden is increased configuration friction though, so often you don't want to do it for your ephemeral deployments (eg. dev environments, integrated test runners, temporary staging instances).

You can use tools like kustomize to keep your configuration as clean as possible for each deployment type. Only bring in the configurations for the stateful services when needed.

I feel like this is the "right" way for smaller teams to do K8s, assuming it's already a good fit for the application.

My biases are probably 5+ years old, but it used to be that running PostGres on anything other than bare metal (for example, running it in Docker) was fraught with all sorts of data/file/io sync issues that you might not be able to recover from. So, I just got used to running the databases on metal (graph, postgres, columnar) with whatever reliability schemes myself and leaving docker and k8s outside of that.

Has that changed? (It may well have, but once burned, twice shy and all that).

My "run-of-the-mill" setup for local toy apps is Postgres + [some API service] in Docker Compose, and for non-trivial apps (IE more than 3-4 services) usually Postgres in k8s.

I've never had a problem with Postgres either in Docker or in k8s. Docker Compose local volumes, and k8s persistent volume claims work really well. But I'm no veteran at this so I can only speak for what little time I've used them.

The whole reason I do this is because it lets you put your entire stack in one config, and then spin up local dev environment or deploy to remote with zero changes. And that's really magical.

In production I don't use an in-cluster Postgres, and it's a bit of a pain in the ass to do so. I would rather use an in-cluster service, but the arguments you hear about being responsible in the event of a failure, and the vendor assuming the work for HA and such seems hard to refute.

Probably you could run Postgres in production k8s and be fine though. If I knew what I was doing I likely wouldn't be against it.

Ha! That's probably the problem - I don't know what I am doing.

Why oh why did I ever leave silicone...

I might have lost count but I think they reimplemented the file storage twice in local Docker installs and twice in Kubernetes... it’s at a point now where if you trust cloud network storage performance guarantees, you can trust Postgres with it. Kubernetes and Docker don’t change how the bits are saved, but if you insist on data locality to a node, you lose the resilience described here for moving to another node.

Here’s a different point to think about: is your use of Postgres resilient to network failures, communication errors or one-off issues? Sometimes you have to design for this at the application layer and assume things will go wrong some of the time...

As with anything, it could vary with your particular workload... but if I knew my very-stable-yet-cloud-hosted copy of Postgres wasn’t configured with high availability, well, you might have local performance and no update lag but you also have a lot of risk of downtime and data loss if it goes down or gets corrupted. The advantage to cloud storage is not having to read in as many WAL logs, and just reconnect the old disk before the instance went down, initialize as if PG had just crashed, and keep going... even regular disks have failures after all...

That's great to know. Thank you. The stuff in your 3rd paragraph is something I have always tried to do, but it largely depended on the largely reliable single node instances. I tend to (currently) handle single disk failures with master-slave configs locally, and it has worked at my volumes to date, but I am trying to learn how to grow without getting increasingly (too) complicated or even more brittle.

Slack, YouTube, and lots of huge tech companies use Vitess for extreme scale in mysql it’s started to move towards being much less of a headache than dealing with sharding and proxies

Assuming you run kubelet/Docker on bare metal, containers will of course run on bare metal as well.

A container is just a collection of namespaces.

There's only network overhead when running Postgres in Docker container (and obv data is mounted, not ephemeral), but it's a performance hit. This in itself can be enough reason not to run pg in Docker/k8s in production.

It hasn't, if your DB is sufficiently large, or under a lot of writes, the shutdown time will be larger than the docker timeout before killing pg. Data loss occurs in that case.

In my previous gig (2017 timeframe), we moved from AWS instances with terraform plus chef which took a couple hours of focused engineer time to generate a new dev environment. This was after some degree of automation, but there were still a great deal of steps one had to get through and if you wanted to automate all that how would you glue it all together? Bash scripts?

We transition to k8s, with PG and other data stores in cluster, specifically RabbitMQ, and Mongo, which runs surprisingly well in k8s. In any case, after the whole adoption period and a great deal of automation work against the k8s APIs, we were able to get new dev environment provisioning down to 90 seconds.

There was clearly some pent up demand for development resources as we went from a few dev environments to roughly 30 in one month's time.

Following that, the team added the ability to "clone" any environment including production ones, that is, the whole data set and configuration was replicated into a new environment. One could also replicate data streaming into this new environment, essentially having an identical instance of a production service with incoming data.

This was a huge benefit for development and testing and further drove demand for environments. If a customer had a bug or an issue, a developer could fire up a new environment with a fix branch, test the fix on the same data and config, and then commit that back to master making its way into production.

These are the benefits of running data stores governed by one's primary orchestration framework. Sounds less threatening when put that way, eh?

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact