
Ask HN: Do you run databases on Kubernetes? - extra_rice
I'm curious if development teams prefer to run databases in or out of
Kubernetes. For those who do, how do you make it work? What are the key points
you think anyone who is considering doing the same should think about before
going for it? For those who eventually decided against it, what were the main
factors for the decision?

I was at Kubecon earlier this year, and the impression I got was that, in
general, persistence on Kubernetes is still somewhat of a challenge, and I
think that's true given what I see first hand.

I work in a very small team that, sadly, does not have enough expertise with
maintaining databases. I mean, we can use them, but we're not at the level
where we can make databases sing. I think somewhere in our organization there
are database people, but currently none of them are involved directly in our
project. We are, at the moment, running MongoDB in our Kubernetes cluster, but
we recently ran into some issues with it causing problems for the rest of the
deployments. I was wondering if it's time for us to consider moving it
elsewhere so Kubernetes doesn't have to worry about it.
======
davismwfl
Can you or someone else elaborate on what issues you run into when running a
database within Kubernetes? To be transparent, I have always run my databases
on dedicated servers or on AWS instances. I am interested in understanding
what specific issues you have seen running DB instances within Kubernetes.

~~~
extra_rice
For the specific incident I mentioned in the original post, I didn't quite get
the exact details of the problem, but I think the gist of it was that our
MongoDB deployment was using too much memory for its cache. We're running it
on our Kubernetes cluster, which uses m5.xlarge EC2 nodes, and it was causing
problems for other applications deployed on the same node. Kubernetes was,
understandably, having a hard time managing resources and scheduling the
deployments.
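
For anyone wondering how the cache can get that large: MongoDB's documented
default for the WiredTiger cache is the larger of 50% of (RAM - 1 GiB) and
256 MiB. Below is a rough Python sketch of that arithmetic. The 16 GiB figure
is the RAM of an m5.xlarge; the 4 GiB container limit and the helper
`cache_size_gb_for_limit` are made up for illustration, and whether mongod
sees the node's RAM or the container's limit depends on the MongoDB version
and how the pod is configured, which I'm not assuming anything about here.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(visible_ram_bytes: int) -> int:
    """Documented default: the larger of 50% of (RAM - 1 GiB) and 256 MiB."""
    return max(int(0.5 * (visible_ram_bytes - GIB)), 256 * MIB)


def cache_size_gb_for_limit(container_limit_bytes: int, fraction: float = 0.5) -> float:
    """Hypothetical helper: derive an explicit cacheSizeGB from a memory limit.

    Applies the same "fraction of (limit - 1 GiB)" rule to the container limit
    and never goes below 0.25, the minimum MongoDB accepts for cacheSizeGB.
    """
    cache_bytes = max(fraction * (container_limit_bytes - GIB), 0.25 * GIB)
    return round(cache_bytes / GIB, 2)


if __name__ == "__main__":
    node_ram = 16 * GIB          # m5.xlarge
    container_limit = 4 * GIB    # assumed limit, purely for illustration

    # What the default works out to if mongod sizes the cache from node RAM.
    print(default_wiredtiger_cache_bytes(node_ram) / GIB)   # 7.5

    # A value one might pass as --wiredTigerCacheSizeGB
    # (storage.wiredTiger.engineConfig.cacheSizeGB in the config file).
    print(cache_size_gb_for_limit(container_limit))          # 1.5
```

So if the cache is sized from the whole node, the default alone can claim
close to half of the node's memory, which would fit the symptoms described.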

I think one of the challenges with running databases on Kubernetes is that, at
the time of writing, Kubernetes does not really have an abstraction for
databases, so it manages them just like any other resource (of a given
type). Operators are meant to address this, though, and the people at Kubecon
seemed to be excited about them, but I haven't used one personally.

------
hjacobs
I work at Zalando where we run hundreds of PostgreSQL database clusters on
Kubernetes (on AWS) using our Postgres Operator
([https://github.com/zalando/postgres-operator](https://github.com/zalando/postgres-operator)).
This gives us some added flexibility, quick startup (e.g. for e2e), and the
latest PG features. That being said, I would be careful about recommending any
specific stateful workload approach without a good understanding of the whole
setup (true for whatever cloud/k8s/onprem environment).

~~~
extra_rice
Yeah, sounds like operators really are a perfect fit for this use case. For
your e2e, I imagine you use the operators to spin up new database instances
for every run? I really like the idea of being able to do that, which is
something I imagine would be difficult outside of Kubernetes.

For production though, do you have any specific scheduling logic to ensure
that database resources don't cause issues with other services in the cluster,
or is it just a matter of configuring the database service?

------
drad
Your main issue will be IO, unless you use a host-only PV, and if you do that
you are likely limiting your db instance to a specific node, which can have
scaling and/or HA impacts. Most people will go with a network-based FS to back
their db data; in that case, network IO will likely impact your db
performance. For a dev or test env this might not be a problem, but for prod
it is usually a blocker.
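
To make the node pinning concrete, here is a minimal Python sketch that
builds a `local` PersistentVolume manifest. A local PV has to carry a
`nodeAffinity` block, and that block is what ties the volume (and any pod
using it) to one node. The PV name, node name, path, size, and storage class
name are all hypothetical placeholders.

```python
import json


def local_pv_manifest(name: str, node: str, path: str, size: str) -> dict:
    """Build a local PersistentVolume bound to a single node."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "local-storage",  # hypothetical class name
            "local": {"path": path},
            # Required for local volumes: only this node can serve the data,
            # so pods that claim the volume can only be scheduled there.
            "nodeAffinity": {
                "required": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {
                                    "key": "kubernetes.io/hostname",
                                    "operator": "In",
                                    "values": [node],
                                }
                            ]
                        }
                    ]
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = local_pv_manifest("db-data", "node-a", "/mnt/disks/db", "100Gi")
    # kubectl accepts JSON as well as YAML, e.g. `kubectl apply -f pv.json`.
    print(json.dumps(manifest, indent=2))
```

If that node goes away, the data goes with it unless the database itself
replicates to volumes on other nodes, which is the HA impact mentioned above.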

~~~
extra_rice
> unless you use a host-only PV, and if you do that you are likely limiting
> your db instance to a specific node, which can have scaling and/or HA impacts

This is exactly what we do, and we're having trouble because of it, not only
in terms of scaling the database, but also because it runs on the same node as
all our other Kubernetes resources. We are considering spinning up a dedicated
node for it, but I thought that at that point we should probably just buy an
actual dedicated database solution instead, and with it the expertise we
don't currently have.
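
For reference, the dedicated-node option usually comes down to a taint on the
node plus a matching toleration and node selector on the database pod. The
Python sketch below builds such a pod spec fragment under assumed names: the
`dedicated=database` taint, the `workload=database` label, the image tag, and
the resource sizes are all hypothetical. It also sets requests equal to
limits, which is what puts a pod in the Guaranteed QoS class.

```python
import json

# Hypothetical taint and label, applied to the dedicated node with e.g.:
#   kubectl taint nodes <node-name> dedicated=database:NoSchedule
#   kubectl label nodes <node-name> workload=database


def database_pod_spec(image: str, cpu: str, memory: str) -> dict:
    """Pod spec fragment that only lands on the tainted, labelled node."""
    resources = {"cpu": cpu, "memory": memory}
    return {
        # Steer the pod to the dedicated node...
        "nodeSelector": {"workload": "database"},
        # ...and allow it past the taint that keeps other workloads off.
        "tolerations": [
            {
                "key": "dedicated",
                "operator": "Equal",
                "value": "database",
                "effect": "NoSchedule",
            }
        ],
        "containers": [
            {
                "name": "mongodb",
                "image": image,
                # Equal requests and limits for both CPU and memory.
                "resources": {
                    "requests": dict(resources),
                    "limits": dict(resources),
                },
            }
        ],
    }


def is_guaranteed_qos(pod_spec: dict) -> bool:
    """Simplified check: every container sets cpu/memory with requests == limits."""
    for container in pod_spec["containers"]:
        res = container.get("resources", {})
        requests, limits = res.get("requests", {}), res.get("limits", {})
        if not {"cpu", "memory"} <= limits.keys() or requests != limits:
            return False
    return True


if __name__ == "__main__":
    spec = database_pod_spec("mongo:4.2", cpu="2", memory="8Gi")
    print(json.dumps(spec, indent=2))
    print(is_guaranteed_qos(spec))  # True
```

This only covers scheduling and resource isolation. It does not address the
operational expertise gap, which is the argument for a managed offering.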

------
streetcat1
Do not try to run a database using bare Kubernetes objects. See whether one
of the operators fits your needs.

~~~
extra_rice
Yeah, this is one of the things we are considering at the moment. However, if
I understand correctly, for MongoDB specifically, you need to be using their
enterprise offering to be able to utilize their operator.

~~~
streetcat1
You can use Postgres's jsonb feature. You do not need Mongo.

~~~
extra_rice
Not sure what exactly you meant by this. Are you suggesting that we migrate
over to Postgres, or that we can somehow use the Postgres operator with
MongoDB? I'm honestly open to switching databases, but I don't think
management will be thrilled with the idea (not for now, at least).

