
Docker Storage: An Introduction - nslater
https://deis.com/blog/2016/docker-storage-introduction/
======
rburhum
Glad to see more articles like this. I find tons of writings about stateless
containers, but hardly any about best practices for stateful containers. Just
yesterday I was going over the Django tutorial in the official Docker docs.
Everything there made sense except that it completely ignores how to handle
"media" folders (i.e. the Django folder that, among other things, contains
user uploads). Yes, the database is in a volume so I am glad that survives
creating/destroying the postgres container, but I kind of need the other
files, too.

~~~
atmosx
Well in today's world you'll handle the media folder by hosting files in the
_cloud_ (e.g. an S3 bucket).

Most of the time, if you use volumes, it's because of poor design[1] more than
anything else.

[1] modern web-app design guidelines:
[http://12factor.net/](http://12factor.net/)

~~~
cyphar
You're just punting on the problem. Now storage is someone else's problem, who
can't use containers because you decided to dump your state on them. Storage
with containers is something that we need to solve properly (and Docker isn't
doing a great job right now).

~~~
Annatar
_Storage with containers is something that we need to solve properly_

...Or you could just use SmartOS zones, and then all these artificial problems
go away, as zones natively reside on the ZFS filesystem inside of the zpool
underneath them. Why would one want to waste one's time on technology which is
clearly not finished and doesn't offer any advantages over zones? As a logical
being, I'm genuinely perplexed by the overall insistence on Docker. What is
the cause of it?

~~~
cyphar
Here we go again.

As we've discussed before, there are problems that are inherent to designing
distributed systems that are not solved by SmartOS's magic sauce. Please stop
pretending that every possible problem that is faced by GNU/Linux technologies
has already been solved by SmartOS, it's getting quite tiresome.

"Storage with containers" refers to correctly decoupling the state from your
application and storing it in a way that allows for horizontal scaling. It's not
as simple as proclaiming "we have a magical filesystem, therefore all problems
are solved and you should use SmartOS kthxbai". There are harder problems
here, and no amount of SmartOS shilling will get around that.

~~~
Annatar
You completely ignored my question: what is the cause of _insistence_ on
Docker?

Docker storage does not solve the scaling problem either, indeed, no known
technology in existence solves it: this is one of the unsolved problems in
computer science, still largely terra incognita. The only way to design for
horizontal scalability is in the application, where the application keeps
track of the global state across all nodes it runs on. Even if one were to use
a distributed storage engine like Oracle RAC with the automatic storage manager
and synchronous multimaster replication, one's application would still have to
contain logic about ejecting/re-integrating failed RAC nodes, as well as
retrying the transaction on a different node, and that is in addition to the
internal load balancing it would have to perform, especially if the goal is
high availability with exactly zero downtime or zero loss of service at all
times.
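
The "eject the failed node and retry the transaction elsewhere" logic described above can be sketched in a few lines. This is only an illustration under stated assumptions: `NodeDown`, `run_with_failover` and the node names are hypothetical and do not come from Oracle RAC or any real client library.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


class NodeDown(Exception):
    """Hypothetical error a node client raises when its node is unreachable."""


def run_with_failover(nodes: List[str], txn: Callable[[str], T]) -> Tuple[T, List[str]]:
    """Run txn against each node in order until one succeeds.

    Returns the transaction result plus the list of nodes that were ejected
    along the way, so the caller can decide when to re-integrate them.
    """
    ejected: List[str] = []
    for node in nodes:
        try:
            return txn(node), ejected
        except NodeDown:
            ejected.append(node)  # eject the failed node, try the next one
    raise RuntimeError("all nodes failed: %s" % ", ".join(ejected))


def demo_txn(node: str) -> str:
    """Stand-in transaction: pretends that node 'db1' is down."""
    if node == "db1":
        raise NodeDown(node)
    return "committed on " + node


result, ejected_nodes = run_with_failover(["db1", "db2", "db3"], demo_txn)
```

A real application would also need idempotent transactions and a policy for re-admitting ejected nodes, which this sketch leaves out.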

As far as my "pretending", show me one problem which GNU/Linux has solved that
SmartOS already hasn't. Just one.

In closing, I mentioned SmartOS on purpose: perhaps someone reading this will
look it up, and try it out. And perhaps they'll like it (Linux became popular
the same way). What I get out of it eventually is a job market with
opportunities, but most importantly, I get to _sleep through the night without
having to deal with idiotic problems in GNU/Linux solved anywhere from 11 to
25 years ago in SmartOS (depending on the problem)._ I really do not want to
waste any more time on Linux, and certainly not on Docker. SmartOS can do
Docker, just so you know, although it doesn't need it to provide
containerization, at all, but one does have that choice.

So if you really want to run Docker at all costs, which advantages does
GNU/Linux offer you over SmartOS? Let's talk technology.

~~~
cyphar
> what is the cause of insistence on Docker?

Because it targets developers and is based on technology usable on a _very_
popular platform. SmartOS zones have neither of these properties.

> show me one problem which GNU/Linux has solved that SmartOS already hasn't

Packaging software and software updates. SmartOS uses NetBSD's pkgsrc, which
is an interesting choice given the fact that BSDs have a very bad track record
for packaging. FreeBSD still doesn't package their base system, and the
recommended way of dealing with software on BSD systems _is to compile it from
source_. GNU/Linux has solved this problem such a long time ago that I'm
honestly surprised that SmartOS decided to use pkgsrc over something much more
powerful like zypper+rpm, dnf+rpm or apt+dpkg.

So there's one problem that SmartOS didn't solve first, and I'd argue hasn't
solved yet either. To be clear, I have no problem with SmartOS -- it has a lot
of very deep technology. But claiming that it's magic pixie dust that has
solved every problem that GNU/Linux has solved (and is working on solving) is
being facetious and dishonest.

Also, as far as I'm aware there isn't nearly as much auditing of SmartOS going
on as there is of GNU/Linux. Does SmartOS have support for all the features of
grsecurity+PaX? What about support for UEFI? Or even support for many
different architectures and hardware drivers? GNU/Linux may have many
problems, but it beats every other free software operating system in quite a
few areas (and beats a few proprietary operating systems in quite a few other
areas too).

> SmartOS can do Docker, just so you know, although it doesn't need it to
> provide containerization, at all, but one does have that choice.

Yes, I know that. I currently am working as part of the OCI on container
standardisation and am happy to see people working on that from the Solaris
camp (we need to work together on standardising the workflows we want to use).
But the one thing that people I work with don't do when working on container
standards and canonical implementations of those standards is start screaming
about how everyone should abandon a very popular platform because "I hate
supporting GNU/Linux because it's not the operating system I like". Because
that's just childish.

~~~
Annatar
_Packaging software and software updates. SmartOS uses NetBSD's pkgsrc, which
is an interesting choice given the fact that BSDs have a very bad track record
for packaging. FreeBSD still doesn't package their base system, and the
recommended way of dealing with software on BSD systems is to compile it from
source._

Compiling from source? pkgsrc, and by extension SmartOS, fully supports
installing binary packages. The command to do this is called pkg_add; pkg_delete
uninstalls a binary package. SmartOS even went a step further and uses pkgin,
which works exactly the same way as apt-get. Please read pkg_add's and pkgin's
manual pages before arguing further on this point.

Also here is a short document which clearly illustrates how a binary package
is created and installed with pkgsrc:

[http://www.perkin.org.uk/posts/creating-local-smartos-packages.html](http://www.perkin.org.uk/posts/creating-local-smartos-packages.html)

and here is another document pointing out how SmartOS has a nearly 14,000
package library of the latest version of software which normally runs on
Linux; versions of which are usually newer than on Linux, and it is built
fresh every day into binary packages, completely automatically:

[https://www.perkin.org.uk/posts/building-packages-at-scale.html](https://www.perkin.org.uk/posts/building-packages-at-scale.html)

 _So there's one problem that SmartOS didn't solve first, and I'd argue
hasn't solved yet either._

And based on your response, I'd argue that you haven't seen or used anything
but Linux.

I'm far from being dishonest, and in fact I communicated exactly what my
motivation is, and I'm sorry but it's really not my fault you haven't read
SmartOS and Solaris documentation. That would be akin to me claiming that
Linux is better than FreeBSD but without knowing enough about FreeBSD.

For example, your question about grsecurity+pax is nonsensical in the context
of SmartOS: being Solaris based, it _doesn't need_ Linux specific technology
like grsecurity. That was precisely my point, if you use a well designed
system, all of this Linux hacked-up nonsense goes away, because it was
nonsense to begin with. SmartOS has a secure kernel by design, and further
delineation can be achieved with role based access control.

It also _doesn't need_ a lightweight virtual machine consisting of a single
application because it has zones, and also because there would be nothing to
reap your process at the end, and you'd end up with a zombie, an unreaped
orphan process. I'm surprised that you don't know that Docker had to re-invent
a kludgy copy of init precisely because of this problem, or else you would not
have asked me that.

As for insisting on Docker because it's popular, it's flawed picking something
based on popularity; if you pick a solution, it should be because it's
technically sound and therefore robust, so that you can sleep through the
nights when you're on-call without incidents, and read newspapers and drink
your espresso during the day because the damn thing just runs and runs without
needing any babysitting, like Linux does all the time.

To claim that SmartOS has no development tools, when it readily offers all the
popular frameworks, languages, and has the most advanced linkers and compilers
in existence is beyond obscene, I'm afraid. Please read the manual pages on
the link editor, ld, and the Sun Studio compilers to get an inkling of what
I'm writing about. It will tremendously help the quality of our discussion.

~~~
cyphar
> your question about grsecurity+pax is nonsensical in the context of SmartOS

... no? Because many of the grsecurity+pax improvements apply to any kernel
that runs on a CPU (it provides _active_ protections against certain forms of
kernel vulnerabilities caused by bugs). This includes illumos, thus the
question is valid. Unless you're claiming the illumos cannot ever have a
security bug.

> pkgsrc, and by extension SmartOS, fully supports installing binary packages.

Apologies, I was thinking of a different packaging system in FreeBSD. However,
from my reading of the blog you linked pkgsrc only had support for signatures
of packages in _2014_. GNU/Linux has had this for a very long time.

My impression about package management being a shit-show on other operating
systems is that every single podcast or blog post I read about those operating
systems is celebrating that "package management is easy now with pkg*" --
while it's actually not IMO as good as certain GNU/Linux package managers.

> and here is another document pointing out how SmartOs has a nearly 14,000
> package library of the latest version of software which normally runs on
> Linux; versions of which are usually newer than on Linux,

Only 14000? Also, what distribution of GNU/Linux, what version of the
distribution, how much automated QA happens before releases, etc?

> I'd argue that you haven't seen or used anything but Linux.

Untrue.

> but it's really not my fault you haven't read SmartOS and Solaris
> documentation.

It's not my fault that you haven't read the source code of runC before
stating an ignorant opinion about how it works, based on a mix of outdated
information and pure fabrication.

> That would be akin to me claiming that [...]

SmartOS is better than GNU/Linux without knowing about the Linux technology
you're arguing about?

> I'm surprised that you don't know that Docker had to re-invent a kludgy copy
> of init precisely because of this problem, or else you would not have asked
> me that.

This is all not true, and you should know better. Docker/runC doesn't have an
init process in containers. You _can_ run an init process, but it isn't
necessary. In addition, the zombie problem doesn't exist because of sub-reapers,
which are a Linux kernel feature. You might not accept the existence
of such features, but that's your prerogative.
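
For readers who have not met sub-reapers: the kernel feature in question is the `PR_SET_CHILD_SUBREAPER` option of `prctl(2)`, available since Linux 3.4. The sketch below is a Linux-only demonstration, not code taken from runC or Docker. It marks the current process as a subreaper, orphans a grandchild on purpose, and checks that the orphan is re-parented to this process (which can then reap it) instead of to PID 1. The function names are made up for the example.

```python
import ctypes
import os
import time

# Constant from <linux/prctl.h>; the option exists since Linux 3.4.
PR_SET_CHILD_SUBREAPER = 36

_libc = ctypes.CDLL(None, use_errno=True)


def become_subreaper() -> None:
    """Mark the calling process as a child subreaper for its descendants."""
    rc = _libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1),
                     ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def orphan_lands_on_us(timeout: float = 5.0) -> bool:
    """Orphan a grandchild and report whether it was re-parented to us."""
    me = os.getpid()
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(read_fd)
        middle = os.getpid()
        if os.fork() == 0:
            # Grandchild: wait until the middle process is gone, then
            # report "<new parent pid> <own pid>" through the pipe.
            deadline = time.monotonic() + timeout
            while os.getppid() == middle and time.monotonic() < deadline:
                time.sleep(0.01)
            os.write(write_fd, ("%d %d" % (os.getppid(), os.getpid())).encode())
            os._exit(0)
        os._exit(0)  # middle process exits at once, orphaning the grandchild
    os.close(write_fd)
    os.waitpid(child, 0)  # reap the middle process
    new_parent, grandchild = (int(x) for x in os.read(read_fd, 64).split())
    os.close(read_fd)
    try:
        os.waitpid(grandchild, 0)  # reap the adopted orphan: no zombie left
    except ChildProcessError:
        pass  # it was adopted by some other process, so it is not ours to reap
    return new_parent == me


if __name__ == "__main__":
    become_subreaper()
    print("orphan re-parented to us:", orphan_lands_on_us())
```

Whichever process takes the subreaper role still has to call `wait()` on the orphans it adopts; the feature only decides who gets that job.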

I asked you because I guessed that you didn't know about the GNU/Linux side of
things. Thanks for not letting me down on that one.

> As for insisting on Docker because it's popular, it's flawed picking
> something based on popularity;

I answered your question of why Docker was popular, now you're complaining
that I am talking about Docker because it's popular? What. In addition, I am a
maintainer of runC and actually care much more about the OCI than Docker.
There are some cool things coming from the Solaris and illumos folks, too bad
that you're stuck in your ways and won't even consider the possibility that
any GNU/Linux technology is good. That's just _insane_.

> To claim that SmartOS has no development tools,

... when did I claim that? I claimed that it doesn't have anything like Linux
when it comes to security frameworks like grsecurity+pax, UEFI support,
hardware and driver support. You haven't addressed those arguments (claiming
that grsecurity+pax isn't useful for SmartOS is showing that you don't know
what it is or how it works).

~~~
Annatar
_... no? Because many of the grsecurity+pax improvements apply to any kernel
that runs on a CPU (it provides active protections against certain forms of
kernel vulnerabilities caused by bugs). This includes illumos, thus the
question is valid._

grsecurity is a set of patches for the GNU/Linux kernel. illumos is based on
Solaris, so the entire argument about grsecurity is nonsense. illumos uses red
zones to prevent buffer overflows, which grsecurity is attempting to address.
grsecurity is a commercial product by the way. A single license costs $19,000
USD.

Enhanced auditing and process control which grsecurity provides have been part
of Solaris, and therefore illumos, since Solaris 10, some even earlier.
GNU/Linux is still playing catch-up, and as long as people like myself, Bryan
Cantrill, Adam Leventhal and the rest of the former Sun kernel engineers live,
it will be playing catch-up forever.

 _My impression about package management being a shit-show on other operating
systems is that every single podcast or blog post I read about those operating
systems is celebrating that "package management is easy now with pkg" --
while it's actually not IMO as good as certain GNU/Linux package managers._

In order for that opinion of yours to actually mean anything, the question is:
how many packaging formats do _you_ know how to produce packages for? Only then
will you be in a position where you would actually be competent to make such a
statement, and where your opinion would actually make a difference. I myself
have packaged for HP-UX, IRIX, Solaris, GNU/Linux (RPM), GNU/Linux (DPKG), and
SmartOS (pkgsrc), so I'm in a position to make such statements, and yet I
didn't. You, on the other hand, apparently have no such inhibitions.

 _Only 14000?_

"Only". If you had packaged, you would have known that the same body of
software could be delivered by an arbitrary number of packages, since that
depends on the packager(s), respectively on the architecture.

 _Also, what distribution of GNU/Linux, what version of the distribution, how
much automated QA happens before releases, etc?_

The same thing which happens on GNU/Linux; if you think that packages on
GNU/Linux get tested like a classic UNIX vendor would test it, you're naive.

 _This is all not true, and you should know better. Docker/runC doesn't have
an init process in containers. You _can_ run an init process, but it isn't
necessary._

Is that right?
[https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/](https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/)

 _In addition, the zombie problem doesn't exist because of sub-reapers which
are a Linux kernel feature._

As it stands, I've worked with Linux extensively and I've never heard of such
a thing. Please show me that code, for if you actually manage to show me that,
I will have learned something new.

 _In addition, I am a maintainer of runC and actually care much more about the
OCI than Docker._

Aaahhh, so that's why you keep on blindly lobbying for Docker. Now we finally
get to the bottom of the thing. Couldn't you just be honest from the onset
like I was on what your motivation is, so everybody knows where everybody
stands?

So basically, you're trying to re-invent project Kevlar, Solaris zones, in a
completely generic way so as to be everything and not be anything in
particular to anyone. Did you not study UNIX, and the old Henry Spencer's
saying

 _those who do not understand UNIX are condemned to re-invent it -- badly_?

You could just bite the bullet and benefit from a complete, enterprise, battle
tested solution with a decade of use behind it, and use Solaris zones by using
vmadm(1M) in SmartOS, you know. No need to re-invent the wheel. Again.

How many more solutions does GNU/Linux need in order to be able to run
lightweight virtual machines? Can't you people _engineer_ one solution which
_actually works properly_ , like we have it in illumos? Apparently that's too
much to ask. Or you could just use zones, which have been working for a
decade, and actually let one run _lightweight virtual servers running at the
speed of bare metal in production, today:_

[https://smartos.org/man/1m/vmadm](https://smartos.org/man/1m/vmadm)

[https://smartos.org/man/1m/zoneadm](https://smartos.org/man/1m/zoneadm)

[https://www.youtube.com/watch?v=hgN8pCMLI2U](https://www.youtube.com/watch?v=hgN8pCMLI2U)

As for "OCI", I have no idea what you're writing about. To me, an oldschool
UNIX guy, "OCI" stands for "Oracle Call Interface":

[http://www.oracle.com/technetwork/database/features/oci/index.html](http://www.oracle.com/technetwork/database/features/oci/index.html)

Finally, a word on your question about setting up Manta. Apparently it's open
source:

[https://github.com/joyent/manta](https://github.com/joyent/manta)

...which means you can set it up seven ways 'till Sunday, and do with it
whatever you please, however you please. I myself don't use it, because I
design my availability into the application I write, and my interest is in the
infrastructure layer anyway, where things like DNS are designed to be highly
available from the onset (again availability in the application, and not the
OS layer).

[http://dtrace.org/blogs/dap/2013/07/03/fault-tolerance-in-manta/](http://dtrace.org/blogs/dap/2013/07/03/fault-tolerance-in-manta/)

------
andrewstuart2
So I was going to be snarky about the author not having their Linux user as a
member of the "docker" group (thus requiring sudo for every command), but that
sparked a line of thought, so I'll ask:

Is it better practice to leave yourself out of the docker group, thus forcing
explicit use of sudo, since the daemon runs as root? Is there a better daemon
auth model that's not in use so you can at least have longer-lived tokens,
etc?

Also, in case you do want to skip the sudo every time (careful with the
potential security risk):

    
    
        sudo usermod -aG docker $(whoami)
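
One practical wrinkle with the command above: `usermod -aG` only updates the group database, and a session that is already running keeps its old group list until the user logs in again. The following sketch, using only the Python standard library, checks both views. The helper names are made up for this example, and "docker" is simply the group name used in the command above.

```python
import grp
import os
import pwd


def user_in_group_db(user: str, group: str) -> bool:
    """True if the group database lists `user` as a member of `group`."""
    try:
        entry = grp.getgrnam(group)
        primary_gid = pwd.getpwnam(user).pw_gid
    except KeyError:
        return False  # unknown group or unknown user on this host
    return user in entry.gr_mem or primary_gid == entry.gr_gid


def session_has_group(group: str) -> bool:
    """True if the *current process* already carries `group` in its credentials."""
    try:
        gid = grp.getgrnam(group).gr_gid
    except KeyError:
        return False  # the group does not exist on this host
    return gid in os.getgroups() or gid == os.getegid()


if __name__ == "__main__":
    me = pwd.getpwuid(os.getuid()).pw_name
    print("listed in group database:", user_in_group_db(me, "docker"))
    print("active in this session:  ", session_has_group("docker"))
```

If the first line prints True and the second False, the membership exists but the current session predates it, and logging out and back in should pick it up.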

~~~
atmosx
Only a developer could ask such a question!!!! I'm joking :-P

From a sysadmin standpoint sudo is the right choice. Sudo is an established,
well defined, (mostly) bug-free program, designed specifically for that
task. It's the right tool for the job. You can create specific policies[1],
keep track of who, what, when, allow _this_ and deny _that_.

I reckon that this case is a bit tricky though and most people just use sudo
to get 'root', so if you're going to do just that, then I guess it's the same.

[1] [http://linux.die.net/man/5/sudoers](http://linux.die.net/man/5/sudoers)

~~~
cyphar
> I reckon that this case is a bit tricky though and most people just use sudo
> to get 'root', so if you're going to do just that, then I guess it's the
> same.

sudo is still better in that case, because sudo leaves an audit trail in your
system log. Docker doesn't keep a detailed audit log for every request made by
a user.

~~~
alrs
[https://github.com/a2o/snoopy](https://github.com/a2o/snoopy)

