There could not be an application worse suited to running on Kubernetes and the like than a traditional database. Anyone claiming that ramming this square peg into that round hole is "production ready" is showing that they're an empty husk and shouldn't be trusted near anything important.
Note the downvotes already rolling in less than two minutes after I posted this. This subject is a major third rail here. It goes against the agenda of very powerful people and my account has been censured in the past specifically for making this particular argument, that database workloads and Kubernetes don't mix. Keep that in mind when you're asking HN for their experience on this (or any other topic that YC considers critical to the interest of their investments -- they've shown that they're willing to taint the discussion if it gets too dicey).
That being said, I also disagree that Docker isn't suited to running a DBMS, assuming you actually have a large enterprise (or cloud) datacenter backing your Docker daemon. In such cases:
• You'll probably have a large enough pool of Docker machines (k8s or not) that you're going to be deploying your DBMS container in a way that reserves an entire instance just for it (or it + its accessory containers);
• You'll probably have a SAN, and you'll have many enterprise-y reasons (e.g. live VM migration) to prefer backing your DBMS with said SAN, rather than with local instance storage.
If both of those are true, then Docker has no disadvantages compared to deploying your DBMS as a raw VM.
Cloud-ish infrastructure is often good for running distributed decentralized databases, but try running Oracle in a bunch of Docker containers on a crappy OpenStack cluster and soon you'll be crying into your scotch.
These efforts to make people think it's a good idea to run databases on K8s are misleading people, and god help those poor teams that waste years trying to stabilize something that a fancy web page and a YouTube tutorial said was a great idea.
That said -- no, that's not the same thing at all. Barring anomalous conditions, VMs run as long as you keep them running. They won't be reaped and rescheduled onto some other node in the cluster, whether by automated rebalancing processes or by manual `kubectl delete po...` or `kubectl drain`. You can easily set up a VM that will behave more-or-less like conventional hardware if we ignore the perf hit.
This is a pretty simple thing. The reason people say you need to make your apps "12 factor" when you go to k8s is because it doesn't work well if your app cares about state. Databases care deeply about state. You can't just kill a DB server and spin up a new one to pick up where it left off. You can't parallelize a DB workload by spinning up 8 little DB nodes. It's not a web server and it just doesn't work like that. Things like CockroachDB exist specifically because normal databases don't work like that.
This is where people usually bring up things like annotations, labels, StatefulSets, etc. First, note that the facilities that accommodate stateful workloads are not priorities for Kubernetes and are generally not well-tested or consistent. This wouldn't be a news story or an independent project if they were.
Second, please realize you're doing all of that work to try and make Kubernetes do something it's not really designed to do, with potential negative impact on the availability and scheduling of the applications that do work well on Kubernetes, when you could just spin up a VM and avoid all of these issues entirely. There's no reason to put a production DB on k8s other than cargo culting.
Just like any other tool that makes some things easier, Kubernetes also makes it easier to shoot yourself in the foot. Just like any solution, you have to know the system well enough to reason about it. There is still a lot that can be done to improve how we explain, document, and describe the system. But people run stateful workloads on Kube all the time, and they do it because it makes their lives easier on the balance.
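For readers unfamiliar with the machinery being debated: this is roughly what the StatefulSet approach looks like. A minimal sketch only; the names, image, and sizes are illustrative, and a real deployment needs much more (probes, resources, backup tooling). The key piece is the `volumeClaimTemplates` section, which gives each pod a stable identity and a persistent volume that follows it across restarts.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: pg            # illustrative name
spec:
  serviceName: pg
  replicas: 1
  selector:
    matchLabels:
      app: pg
  template:
    metadata:
      labels:
        app: pg
    spec:
      containers:
      - name: postgres
        image: postgres:15
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data      # each replica gets its own claim, e.g. data-pg-0
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 50Gi
```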
If you aren't managing your state, then yeah you will run into a nightmare when trying to containerize stateful apps... or running them at all. You will literally have the same problems with a VM or physical hardware.
It's important to separate state management from process management.
A stateful application is absolutely not harder to containerize than a stateless one. It is simply harder to run stateful applications in any environment.
I would personally argue that it is easier to run a stateful app with a container manager. I know it sounds crazy, but keep in mind that container tools are centered around what each individual application requires, and the tooling tends to make it easier to express, and assist in managing, the state requirements of that application.
For that matter you can even prevent the scheduler from scheduling your stateful app on a new node, which seems to be the answer for the crux of the argument against containerizing a stateful app.
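As a sketch of that last point (node name is hypothetical): setting `spec.nodeName` bypasses the scheduler entirely, so the pod only ever runs on that one machine and will not be rescheduled elsewhere, which is exactly the pinned, VM-like behavior being asked for.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  nodeName: db-node-1   # hypothetical; pod runs here or not at all, never rescheduled
  containers:
  - name: postgres
    image: postgres:15
```

The trade-off, of course, is that if `db-node-1` dies, the pod stays down until an operator intervenes, the same as a database on a dedicated machine.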
I agree, which is why I specifically avoided that language. Containers don't have to be implemented without regard for state -- but if you're talking about Docker or k8s, they are. Docker throws away anything not explicitly cemented in the image or designated as an external volume.
LXC, zones, and jails are containerization techniques that respect state. It's fine to run a database in these if desired. They behave just like real VMs; they have an init process, they get real IPs, they don't automatically destroy the data written to them, and they generally don't mysteriously shut down or get rescheduled. You can't be confident about any of that with Docker or k8s.
Statefulness is not a primary use case for Kubernetes. It took two years for StatefulSets to leave beta and there was a substantial false start in PetSets. As recently as April, which is the last time I seriously looked, there were still competing APIs for defining access to local volumes.
If you want to run a production database workload in a jail or a zone, that sounds fine to me. It's not about containerization in the abstract. It's about the way that Kubernetes and Docker do it.
(I mention Docker and k8s together because for most of k8s history Docker was the only supported runtime. It supposedly can use other runtimes now, but they're not widely used afaik, and behave similarly re: state anyway)
The trick is to express your state requirements. And yeah, you will be burned badly if you don't do this... and maybe docs and such should call this out better to make sure people don't set themselves on fire just because they didn't dig in deeply enough.
But Docker and k8s do provide a means to assist in managing this state for you (Swarm, not so well, just because the work hasn't been done).
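Concretely, "expressing your state requirements" in k8s usually means declaring a PersistentVolumeClaim rather than letting data land on the container's scratch filesystem. A minimal sketch, with illustrative names and sizes:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data       # hypothetical name; a pod then mounts this claim
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi
```

Once declared, the cluster's storage provisioner is responsible for binding the claim to durable storage that outlives any individual pod.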
Why would an app dev be pushing changes to the db deployment (outside of data manipulation itself)?
Just because the app dev wants to spin up a db in dev to shove their data into doesn't mean that's how it should be deployed in prod.
Once you’ve achieved that, whether your database runs on a VM or in Kubernetes doesn’t really make a difference.
Granted, if you’re not at that scale, running a database in Kubernetes is probably not the best of ideas. That has nothing to do with Kubernetes, though; it’s because running a stateful service with working backup, recovery, and automated failover is difficult in any case. If that’s not your job, you’re probably better off using RDS or something equivalent.
Honestly, it sounds like you're arguing that because the Kubernetes API is easier and more accessible, it's more dangerous to run state on that layer. That, plus a community attitude of being more willing to accept failure. Some would argue that's a good thing, others not so much, but I prefer the position in the SRE book: failure is inevitable, and putting your databases inside their Kube equivalent saves toil time and hardens your setup.
That said, I would argue most folks on cloud should just use a managed Postgres anyway, but we're not always on cloud, and I don't think claiming that putting state in Kube is inherently wrong is fair.
I take it you've never managed a large VM hypervisor (e.g. vSphere) cluster. If your VMs aren't being pinned to particular hypervisor nodes by persistent claims on local instance storage or the like, they end up "floating around" on each restart in pretty much the same way k8s containers do. Especially so if you have live VM migration enabled, in which case you're probably doing the equivalent of `kubectl drain` all the time to deprovision and repair hardware.
(It’s not impossible in specific cases, mind you. I’m still waiting on tenterhooks for the moment someone introduces an Erlang-node operator where you can apply hot-migration relups through k8s itself.)
I recall a few nasty GitHub issues about data loss or unmountable volumes for early adopters, with the official answer along the lines of "implementation is in progress".
There are still bugs, I don't disagree. Data-loss bugs are treated as top priority, and I am not aware of any such open bugs against the EBS driver.
You'll excuse me but no time to go through the history and dig up the tickets for reference.
But I fully agree that Kubernetes and containers are not well suited to running production databases. In theory they could achieve parity with a dedicated machine or VM, but they're still a long way from that, and they make it very easy to lose your data. I was once recovering a database where the persistent volume wasn't set up right and the container got killed and restarted. It was just before the holidays, and it was a nightmare because everyone was on vacation.
Yeah you could get into that kind of problem with a VM or dedicated machine, but the bar is a lot higher, you'd need some kind of hardware failure. Kubernetes makes it really easy to shoot yourself in the foot when running databases.
In other words, your database application was using scratch storage instead of persistent volumes?
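That's the likely failure mode. In pod-spec terms (claim name hypothetical), the difference is one stanza:

```yaml
volumes:
- name: data
  # emptyDir: {}            # scratch: wiped when the pod is killed or rescheduled
  persistentVolumeClaim:
    claimName: db-data      # durable: survives pod deletion and restarts
```

With `emptyDir` (or no volume at all, writing into the container layer), a restart on another node starts from an empty data directory, which matches the anecdote exactly.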
What this anecdote shows is that the developers or admins responsible for setting up the database didn't do it properly.
Also, testing failures and data recovery should be your priority before going to production.
I don't see how you could blame that on software.
The point is not that it wasn't human error - clearly it was - but it's an error that wouldn't have been as easy to make without Kubernetes. There's a cost to running a database on k8s that people largely ignore. And that's before you get to backups and recovery, which also become harder and more manual on k8s, with more potential for error.