
Data migration with Kubernetes and Flocker - lewq
https://clusterhq.com/blog/data-migration-kubernetes-flocker/
======
lewq
We're super excited about bringing portable volumes to lots of different
orchestration frameworks, including Kubernetes. Any questions, we'll be
hanging out here, so ask away ;)

~~~
mbreese
Are there any plans to migrate volumes that aren't ZFS backed? (or does it do
this already?)

I'd think the workflow would be similar to migrating ZFS snapshots: just
extract the volumes from Docker directly and send tarballs to the new node,
instead of using the built-in ZFS send/receive. It probably wouldn't be as
efficient, but it would make Flocker an easier sell to more people.
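
Roughly something like this, say, assuming plain docker, tar and ssh are
available on both hosts (the volume path and host name below are made up):

    import subprocess

    # Hypothetical values for illustration only; assumes the container is
    # stopped and the same volume path exists (or can be created) on the target.
    VOLUME_PATH = "/srv/volumes/mysql-data"
    TARGET_HOST = "node2.example.com"

    def migrate_volume(volume_path: str, target_host: str) -> None:
        """Tar up the volume directory and unpack it on the target node over ssh."""
        cmd = (
            f"tar -C {volume_path} -czf - . | "
            f"ssh {target_host} 'mkdir -p {volume_path} && tar -C {volume_path} -xzf -'"
        )
        subprocess.run(cmd, shell=True, check=True)

    if __name__ == "__main__":
        migrate_volume(VOLUME_PATH, TARGET_HOST)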

Another question - about what size do you think portable volumes need to be
before they stop making sense and you need to move to some other kind of
either shared storage (NFS) or remote object storage?

~~~
lewq
We are working on block device backends which integrate Flocker with
OpenStack, EBS and more, as well as ZFS on local storage.

Doing a tarball-based export/import backend would be an interesting addition ;)

With regard to large volumes, adding more backends to Flocker will definitely
open up the scope for volumes larger than a ZFS pool (zpool) you'd want to
have on a single node. Generally though, once your data outgrows that, it's
better to use a distributed data store which can shard it across a larger
number of smaller stateful containers, which can also be managed by Flocker,
but of course that's application-dependent.

~~~
wernerb
Great to see development in storage with docker!

I like how this migration works. What about live migrations? Maybe I
misunderstood, but I think in the tutorial the service is offline for some
time. What about applying backpressure (at the TCP level) to stall requests
while the migration is underway?
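
What I have in mind is roughly a tiny TCP proxy in front of the service: it
stops forwarding new connections while the volume is being moved and lets
them through again once the container is back up. A rough sketch (addresses
are made up):

    import socket
    import threading

    # Hypothetical addresses for illustration.
    LISTEN_ADDR = ("0.0.0.0", 15432)
    BACKEND_ADDR = ("10.0.0.2", 5432)

    # Cleared while the volume/container is being moved, set once it is back up.
    SERVICE_READY = threading.Event()
    SERVICE_READY.set()

    def pipe(src: socket.socket, dst: socket.socket) -> None:
        # Copy bytes one way until the connection closes.
        try:
            while True:
                data = src.recv(4096)
                if not data:
                    break
                dst.sendall(data)
        except OSError:
            pass
        finally:
            dst.close()

    def handle(client: socket.socket) -> None:
        # New connections queue up here instead of failing during the migration.
        SERVICE_READY.wait()
        backend = socket.create_connection(BACKEND_ADDR)
        threading.Thread(target=pipe, args=(client, backend), daemon=True).start()
        threading.Thread(target=pipe, args=(backend, client), daemon=True).start()

    def main() -> None:
        server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind(LISTEN_ADDR)
        server.listen(128)
        while True:
            client, _ = server.accept()
            threading.Thread(target=handle, args=(client,), daemon=True).start()

    if __name__ == "__main__":
        main()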

Edit: If we are doing tarball migrations, then maybe I'd rather have an rsync
backend :)

Oh, and if we are migrating large amounts of data (gigabytes), maybe even use
something akin to what BitTorrent Sync does? The scenario (this is a
theoretical example..) would be that you continuously one-way sync to the new
host (or maybe multiple hosts..??), then pause/hold traffic when satisfied,
complete a final sync, and you are migrated.
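
The same shape of idea can be sketched with plain rsync: keep syncing while
the old container is still serving, then pause traffic and do one last pass so
the final delta (and the downtime window) stays small. Paths and hosts below
are made up:

    import subprocess

    # Hypothetical source and destination; rsync must be installed on both ends.
    SRC = "/srv/volumes/mysql-data/"
    DEST = "node2.example.com:/srv/volumes/mysql-data/"

    def rsync_pass() -> None:
        subprocess.run(["rsync", "-a", "--delete", SRC, DEST], check=True)

    def migrate(warm_passes: int = 3) -> None:
        # Pre-copy while the old container keeps serving traffic; each pass
        # only transfers what changed since the previous one.
        for _ in range(warm_passes):
            rsync_pass()
        # ... pause/hold traffic and stop the old container here ...
        rsync_pass()  # final pass over a quiesced volume; this is the downtime window
        # ... start the container on the new host and resume traffic ...

    if __name__ == "__main__":
        migrate()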

~~~
mbreese
I had a similar question, but my impression from the tutorial (and my
pondering) is that you couldn't really do a live migration with containers.
I'm not sure it will ever be supported. I don't think that the host would have
enough information about the processes in order to capture their state and
migrate running processes to a different host.

I'm not sure this would even be possible outside of a virtual machine. VM
hosts have significantly more information about, and control over, the VM than
the container host has over the container. For example, a VM host knows
exactly what memory is hot and what is cold, which enables sub-second
switchover times.

It might be possible to add such support to the Linux kernel itself to support
process migration between hosts, but that's the level of work required.

The alternative I was thinking about is using your standard high-availability
tools to create a new worker on the new host, then gradually remove the
existing workers on the old node. I think that might be the only way to really
make "live" migrations work from a practical standpoint. For web-like
services, this would work. For others, it may not be practical.
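
As a rough illustration of what I mean (the load-balancer calls here are
entirely hypothetical, standing in for whatever HA tooling is actually in
use):

    import time

    def rotate_workers(lb, old_workers, new_host, spawn_worker, health_check):
        # lb, spawn_worker and health_check are placeholders for whatever load
        # balancer / orchestration tooling is actually in use.
        for old in old_workers:
            new = spawn_worker(new_host)       # start a replacement on the new host
            while not health_check(new):       # wait until it can take traffic
                time.sleep(1)
            lb.add_backend(new)                # start sending traffic to the new worker
            lb.drain_backend(old)              # stop sending new requests to the old one
            # (in practice, wait here for in-flight requests to finish)
            lb.remove_backend(old)             # then retire it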

EDIT: Looks like checkpointing running processes might work after all!
(see: [http://en.wikipedia.org/wiki/CRIU](http://en.wikipedia.org/wiki/CRIU))
I must admit that I'm still a little skeptical, but would love to see this
working!

~~~
jsmthrowaway
CRIU has worked fine for a long time. I've used CRIU under LXC in a lab, and
CRIU alone in production.

This problem arises because Docker positions their intrinsics as novel
containers, even though there was an entire field of prior art long before
Docker showed up. Hence your confusion and worry that checkpointing will never
be supported; it already was, but you didn't know about it, because to you
Docker == containers. That's one of my many problems with Docker, because they
feed that perception.

Look how easy it is: [http://criu.org/LXC](http://criu.org/LXC)
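
The core flow is just a dump and a restore. A bare-bones sketch of driving the
plain criu CLI for a simple process tree (the PID and image directory are made
up, and a real LXC container needs more options, as that wiki page shows):

    import os
    import subprocess

    # Hypothetical values; a real container checkpoint needs additional options.
    PID = 4242
    IMAGES_DIR = "/tmp/checkpoint"

    def checkpoint(pid: int, images_dir: str) -> None:
        os.makedirs(images_dir, exist_ok=True)
        # criu stops the dumped process tree by default after the dump.
        subprocess.run(
            ["criu", "dump", "-t", str(pid), "-D", images_dir, "--shell-job"],
            check=True,
        )

    def restore(images_dir: str) -> None:
        subprocess.run(
            ["criu", "restore", "-D", images_dir, "--shell-job"],
            check=True,
        )

    if __name__ == "__main__":
        checkpoint(PID, IMAGES_DIR)
        restore(IMAGES_DIR)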

~~~
mbreese
From that example, it's pretty clear you can checkpoint and restart a
container with CRIU, but can you migrate that state/checkpoint file to a
different host? I can see restarting it on the same host, but a different host
sounds "difficult". That's what I have the most questions about. I'm not sure
how that would work between systems, particularly with open file handles,
sockets, etc. If you have a "share-nothing" architecture, I think that would
be easier, but if you have to maintain some kind of state within the
container, things get real hairy fast.
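
The naive share-nothing version would presumably be: dump, copy the image
directory across, and restore on the other box (values below are made up, and
open sockets or host-specific file handles are exactly where I'd expect this
to fall apart):

    import subprocess

    # Hypothetical values; assumes an identical filesystem layout on the target.
    PID = "4242"
    IMAGES_DIR = "/tmp/checkpoint"
    TARGET = "node2.example.com"

    def naive_cross_host_migrate() -> None:
        # Dump the process tree, ship the image directory, restore remotely.
        subprocess.run(
            ["criu", "dump", "-t", PID, "-D", IMAGES_DIR, "--shell-job"],
            check=True,
        )
        subprocess.run(["scp", "-r", IMAGES_DIR, f"{TARGET}:/tmp/"], check=True)
        subprocess.run(
            ["ssh", "-t", TARGET, "criu", "restore", "-D", IMAGES_DIR, "--shell-job"],
            check=True,
        )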

I have to agree though that it seems strange to me that Docker has gotten a
lot of support so quickly when things like LXC, FreeBSD Jails, or Solaris
Zones have been established for so long. I've only played with Jails a bit,
having done more work with full virtual machines in Linux. The other
containers, I at least had a familiarity with before Docker (nothing in
production). However, I had not heard about CRIU until today. But it looks
really interesting!

------
errordeveloper
Well done, Kai! Cannot wait to try it out ;)

