

Containers and persistent data - vezzy-fnord
https://lwn.net/Articles/646054/

======
MarkSweep
It sounds like if you smash together Project Governor and Flocker you get
Joyent's Manatee[1]. It uses ZooKeeper to automatically manage a PostgreSQL
cluster. It also uses a separate ZFS dataset to hold the database, so backup
and restore can be accomplished just using "zfs send" and "zfs recv". It's
open source[2] JavaScript if you want to check it out.

[1]: [https://docs.joyent.com/sdc7/troubleshooting-sdc7/manatee](https://docs.joyent.com/sdc7/troubleshooting-sdc7/manatee)
[2]: [https://github.com/joyent/manatee](https://github.com/joyent/manatee)
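
For anyone who hasn't used them, here is a rough sketch of what a "zfs send" / "zfs recv" backup looks like. It only builds the command lines and does not run them. The dataset, snapshot, and target names (tank/pgdata, backup1, backup/pgdata) are made up for illustration and are not Manatee's actual layout, and the helper functions are hypothetical.

```python
import shlex


def zfs_backup_commands(dataset, snapshot, target):
    """Return the three zfs invocations for a snapshot-based backup.

    All names passed in are illustrative assumptions; nothing is executed.
    """
    snap = f"{dataset}@{snapshot}"
    return [
        ["zfs", "snapshot", snap],  # take a point-in-time snapshot
        ["zfs", "send", snap],      # serialize the snapshot to stdout
        ["zfs", "recv", target],    # recreate it as the target dataset
    ]


def as_shell_pipeline(dataset, snapshot, target):
    """Render the commands as one shell line: snapshot, then send piped into recv."""
    snap_cmd, send_cmd, recv_cmd = zfs_backup_commands(dataset, snapshot, target)
    return f"{shlex.join(snap_cmd)} && {shlex.join(send_cmd)} | {shlex.join(recv_cmd)}"


if __name__ == "__main__":
    print(as_shell_pipeline("tank/pgdata", "backup1", "backup/pgdata"))
```

Restoring is the same pipe in the other direction, and the receiving end can sit behind ssh on another host.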

------
Drdrdrq
A bit off-topic: I always wondered why Docker recommends volumes. Given that
they are removed when no container uses them anymore, they seem a poor choice
for persistent data to me. Shared directories are a much better choice imho.
Would love to hear why this is not so though...

EDIT: looks like volumes are (now?) persistent:
[https://docs.docker.com/userguide/dockervolumes/](https://docs.docker.com/userguide/dockervolumes/)
Still don't see any advantage in using them though... Am I missing something?

~~~
shykes
Docker volumes are persistent. Docker never removes anything unless you ask it
to, specifically to allow you to decide what is persistent and what isn't. It
certainly has never removed volumes when containers stop using them.

However, Docker lacks a good volume management UI, so that fact is not always
clear.

To answer your question, the reason volumes are useful is that they allow you
to be explicit about which part of the container's filesystem should have a
lifecycle of its own, across container upgrades.
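
A minimal sketch of what that looks like in practice, assuming a Docker version that supports named volumes with "-v name:/path". It only builds the command lines and does not run them. The image tags (myapp:1, myapp:2), container name, volume name, and mount path are all hypothetical, as are the helper functions.

```python
def run_with_volume(image, name, volume, mount_path):
    """Build a `docker run` command that mounts a volume at mount_path."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "-v", f"{volume}:{mount_path}",  # this path lives outside the container's own filesystem
        image,
    ]


def upgrade_commands(name, new_image, volume, mount_path):
    """Build the commands to replace a container while keeping its volume.

    `docker rm` is deliberately called without -v, so the volume is left alone
    and the new container picks up the same data.
    """
    return [
        ["docker", "stop", name],
        ["docker", "rm", name],
        run_with_volume(new_image, name, volume, mount_path),
    ]


if __name__ == "__main__":
    print(" ".join(run_with_volume("myapp:1", "app", "appdata", "/var/lib/app")))
    for cmd in upgrade_commands("app", "myapp:2", "appdata", "/var/lib/app"):
        print(" ".join(cmd))
```

The container is thrown away and recreated from the new image, while whatever is under the mounted path carries over untouched.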

------
phildougherty
Check out [https://containership.io](https://containership.io). It supports
moving persistent data between servers in a cluster via
[https://github.com/containership/codexd](https://github.com/containership/codexd),
and also allows you to back up, restore, and migrate entire clustered databases
between hosting providers. Currently there is support for Crate.io and Apache
Cassandra, with more on the way.

------
UserRights
All these problems went away once I started using LXC. I lost so much time
with Docker, which is definitely the wrong way to do it.

However it is funny to see a whole industry emerging around artificially
created problems.

------
jaz46
At Pachyderm (github.com/pachyderm/pfs), we're also working on data-aware
scheduling of Docker containers as it applies to analytics.

------
lewq
Watch out for some announcements around this topic at DockerCon in a couple of
weeks. We are close to Flocker 1.0 :)

