
Docker Data Containers (2018) - faizanbashir
https://faizanbashir.me/docker-data-containers-cb250048d162
======
mbreese
I've never quite understood this pattern.

What is the benefit of keeping data in a container when the main point of
containers is that you can destroy them and rebuild them at any time? It seems
like the Docker ecosystem isn't built to support keeping state in a container.

For me, the big thing is that once the data is in a container, it seems
difficult to move around. How do you copy that container to a different
system? In order to get to your data from the outside of a Docker container,
do you need to use import/export? And even then, you just get a tar file?

Why not use a volume mount to a local directory to begin with? Then you can
still use the programs from within a container and still have easy access to
your data when you are done.

What am I missing? What is the advantage of using a data container?
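
For reference, here is a minimal sketch of the pattern being asked about,
written as Python that only builds the Docker CLI invocations and does not
run them. The container name "datastore", the path "/data", the busybox
image and the host directory "/tmp/backup" are all assumptions chosen for
illustration. The third command shows the usual way to get data out: a
throwaway container mounts the data container's volumes and writes a tar
file to a host directory.

```python
import shlex


def data_container_commands(name="datastore", path="/data", image="busybox"):
    """Build (but do not run) Docker CLI calls for the data-container pattern.

    All names and paths are hypothetical defaults for illustration.
    """
    # 1) Create a container whose only job is to own an anonymous volume.
    create = ["docker", "create", "-v", path, "--name", name, image]
    # 2) Any other container can mount that volume with --volumes-from.
    use = ["docker", "run", "--rm", "--volumes-from", name, image, "ls", path]
    # 3) Back the data up to the host by tarring it from a throwaway container.
    backup = [
        "docker", "run", "--rm", "--volumes-from", name,
        "-v", "/tmp/backup:/backup", image,
        "tar", "cvf", "/backup/backup.tar", path,
    ]
    return [create, use, backup]


for cmd in data_container_commands():
    print(shlex.join(cmd))
```

Note that the backup step still needs a host mount (or a copy step) to land
the tar file somewhere, which is part of what the question is getting at.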

~~~
fulafel
Volume-mounting host directories breaks isolation and is a security risk.
Data containers make it more explicit that what happens in containers stays
in containers.

(I'm not saying data containers are super for everything, just that volume
mounts from the host are a lot worse.)

~~~
koffiezet
That would all be nice if volumes weren't exactly the same thing, the only
difference being that the location on the filesystem is managed by the Docker
daemon rather than by the admin. In the end, both are just bind mounts to
existing folders on the filesystem.
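
One way to check this claim on a Linux host is to look at the "Mountpoint"
field that `docker volume inspect` reports. Below is a small sketch that
parses that JSON output. The sample string is illustrative only (a
hypothetical volume called "testdata" using the local driver), not output
captured from a real system.

```python
import json

# Illustrative sample shaped like `docker volume inspect testdata` output.
# The volume name and path are assumptions, not real captured output.
SAMPLE = """
[
  {
    "Driver": "local",
    "Mountpoint": "/var/lib/docker/volumes/testdata/_data",
    "Name": "testdata"
  }
]
"""


def mountpoints(inspect_json):
    """Map each volume name to the host directory the daemon manages for it."""
    return {vol["Name"]: vol["Mountpoint"] for vol in json.loads(inspect_json)}


print(mountpoints(SAMPLE))
```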

~~~
fulafel
Not necessarily: in the storage volume scenario, the data you want to share
with the container doesn't have to live on the same host machine; you could,
e.g., share it over a registry. But even in the same-host case:

1) The most common use case of host volume mounts is usually to share your
workspace with the container. That usually means mounting things from your
project directory, maybe in your "src" or home directories, into the
container, by default letting it write there too. This scenario is bad from
both a workflow hygiene and a security POV, and there are clear UX advantages
in the mental model of using a more clearly bounded way of sharing data
unidirectionally and in an explicit, controlled way. Plus you won't be asking
other users to blindly trust your container with their host-side files.

2) In many contexts (e.g. the Mac and Windows Docker apps) "docker" means
containers running in virtual machines isolated from the host.

3) In Linux-native production setups with containers, the storage driver uses
separate file systems, which keep many kinds of file-descriptor and
shared-UID-namespace attacks from crossing over to other mounted filesystems
(like the host OS and home directories).
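
As a side note on point 1: if a host mount can't be avoided, one way to get
closer to unidirectional sharing is a read-only mount (the ":ro" suffix on
"-v"). A rough sketch that builds such a command without running it is
below. The helper name, host path, container path and image are all
hypothetical.

```python
import shlex


def readonly_mount_run(host_dir, container_dir, image, *command):
    """Build (but do not run) a `docker run` call with a read-only bind mount."""
    # ":ro" makes the mount read-only from inside the container.
    spec = f"{host_dir}:{container_dir}:ro"
    return ["docker", "run", "--rm", "-v", spec, image, *command]


cmd = readonly_mount_run("/home/me/project/src", "/src", "busybox", "ls", "/src")
print(shlex.join(cmd))
```

This only limits writes. The container can still read everything under the
mounted directory, so it doesn't address the trust concern in full.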

------
aeyes
For the longest time I used this pattern to get test data into developers'
local environments, but volumes_from was removed in the Compose v3 file
format.

There is of course no problem in still using the older config file formats or
Docker without Compose.

I switched over to multi-stage builds, for example with one stage that
includes the test data and one that doesn't.
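
A rough sketch of what that could look like on the build side, assuming a
Dockerfile with two named stages. The stage names "base" and
"with-testdata" and the image name "myapp" are hypothetical. The code only
builds the `docker build --target` invocations and does not run them.

```python
import shlex


def stage_build_commands(image="myapp", stages=("base", "with-testdata")):
    """Build (but do not run) one `docker build` call per named stage.

    Assumes the Dockerfile in the current directory declares these stages
    with `FROM ... AS <stage>`; all names here are made up for illustration.
    """
    return [
        ["docker", "build", "--target", stage, "-t", f"{image}:{stage}", "."]
        for stage in stages
    ]


for cmd in stage_build_commands():
    print(shlex.join(cmd))
```

Developers would then pull or build the "with-testdata" tag locally, while
the plain tag goes everywhere else.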

