Did I misunderstand that the original post was regarding being able to do it with Kubernetes, not a few physical hosts? I didn't see "single" in the comment when I originally responded.
"You can easily get a box that has this many (or more) cores. I wouldn't be surprised if our cloud provider analyzed the topology of the connections, and decided to put the whole thing on the same server."
And the comment you replied to refers to "the server".
VMware can do it by live migrating the VM. You will incur a short pause, though, and the networking is a bit tricky to set up. This of course doesn't happen during unexpected downtime; in that case it's a cold boot on another node.
I have never seen this go smoothly on a production server; it's always WAY slower than expected (if you use any significant amount of memory) and something always gets f'd up wrt the network connectivity, broken caches, etc.
VMware has had a high availability mode for over a decade. It keeps checkpoints of system state elsewhere and synchronously replicates them.
If the primary catches fire or crashes, the secondary boots the VM quickly without data loss.
If the primary reboots, it checkpoints RAM to the secondary, pauses the VM, and unpauses it on the secondary a few milliseconds later.
Note that this transparently handles storage replication, which is something that is notoriously difficult in Kubernetes.
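For anyone who wants the mechanism spelled out, here's a toy model of the checkpoint-and-failover idea described above. It's a sketch of the general technique, not how VMware implements it: `ToyVM`, `Host`, `replicated_write`, `failover`, and `planned_handoff` are all made-up names, and a plain dict stands in for the VM's RAM.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Host:
    """Hypothetical secondary host; it only stores the latest checkpoint."""
    name: str
    checkpoint: Optional[dict] = None


@dataclass
class ToyVM:
    """Stand-in for a VM: its whole 'RAM' is a single dict (an assumption)."""
    running_on: str
    state: dict = field(default_factory=dict)
    paused: bool = False


def replicated_write(vm: ToyVM, secondary: Host, key: str, value: object) -> None:
    """Apply a write, then copy the state to the secondary before returning.

    Returning only after the copy exists is what makes this synchronous.
    """
    vm.state[key] = value
    secondary.checkpoint = copy.deepcopy(vm.state)


def failover(secondary: Host) -> ToyVM:
    """Primary died unexpectedly: start a new VM from the last checkpoint."""
    if secondary.checkpoint is None:
        raise RuntimeError("no checkpoint has been replicated yet")
    return ToyVM(running_on=secondary.name,
                 state=copy.deepcopy(secondary.checkpoint))


def planned_handoff(vm: ToyVM, secondary: Host) -> ToyVM:
    """Primary reboots on purpose: pause, take a final checkpoint, resume elsewhere."""
    vm.paused = True
    secondary.checkpoint = copy.deepcopy(vm.state)
    return failover(secondary)


primary_vm = ToyVM(running_on="primary")
standby = Host(name="secondary")
replicated_write(primary_vm, standby, "orders", 42)

recovered = failover(standby)  # simulate the primary crashing
print(recovered.running_on, recovered.state)
```

The point of the toy is that every acknowledged write already exists on the secondary, so a crash loses nothing that was acknowledged. The cost is that every write waits on the copy, and a real system has to do this for memory, devices, and storage rather than a dict.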
If your cluster fits on one machine, you've paid for a lot of (currently) unnecessary complexity up front, both in dev time and in hardware cost.
If your app scales to need the cluster, congrats. Sometimes delaying time to market to allow for a smoother ramp makes sense. Sometimes it does not.
(Wikipedia is a good example of succeeding without ever needing to scale out the back end. I doubt they'd have won out over their competition if they had taken an extra 12-24 months to launch.)