
Ask HN: What are the disadvantages of Docker? - codegeek
docker is hot these days and it is everywhere. But as someone who likes to tinker with servers using scripting like bash etc, I don't get docker. OK, I get that it allows you to "containerize" things so you can reuse them anywhere with the same set of stuff. But what are some of the disadvantages of "containerization", specifically using docker?

Just trying to convince myself to start using it but so far, my run_install.sh script beats everything. Why the hassle of containerization? What overheads does it add which may not be worth it in some cases?
======
di4na
Lots and lots.

Build: Docker's layering does not fit the model of building software and its
dependencies, which makes its caching really brittle. At this point I
discourage having layers at all.
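For context on why the caching is brittle: the usual mitigation is ordering Dockerfile instructions so rarely-changing steps come before frequently-changing ones. A minimal sketch, assuming a hypothetical Node.js app:

```dockerfile
FROM node:8

WORKDIR /app

# Copy only the dependency manifest first, so this layer's cache
# survives source-code changes...
COPY package.json package-lock.json ./
RUN npm install

# ...but any change to the manifest (or any instruction above it)
# invalidates every layer below, which is what makes the cache brittle.
COPY . .

CMD ["node", "server.js"]
```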

Networking: this is a mess; even k8s does not solve it completely. There is a
huge market of third-party providers selling solutions for it.

Stability: Docker is built on a lot of still-unstable APIs and tools. I get a
kernel crash per month in prod from Docker, and strange stuff happens.
Additionally, Docker breaks its own API regularly without respecting semver.

Disk/FS speed. This is a pain.

GC: Docker fills your disk faster than a Java logger, and that is saying
something. A friend filled 100GB just trying k8s for a day...

UX: The docker CLI got better, but it's still far from good.

Debuggability: containers crash without saving a core dump, it's a pain in the
ass to load debugging tools, etc.

All in all, we are getting rid of it at work. We've spent the past 3 months
deleting it from all projects in active development.

~~~
cztomsik
I would also mention (VPS) hosting costs. Some applications need many Docker
processes (postgres, redis, elastic, nginx, node.js backend) and each one of
them has overhead in both CPU and memory (~15 MB). CPU is usually not a
problem for small projects, but memory is: 5 processes and you're at 75 MB
just for Docker, then there is the operating system, and if you need Java,
you're screwed.

BTW: Docker is written in a garbage-collected language (Go), which is why it
takes up so much memory for a compiled language.

BTW2: if anybody knows about a memory-cheap VPS (using Linode currently), I'd
love to hear about it.

------
bdcravens
For certain apps, if you're on OSX, file system performance is horrible when
developing, especially on apps with a lot of file churn. I wrote about it
here, and how I was able to get most of the performance back:
[https://medium.com/@bdcravens/fixing-docker-for-mac-and-rails-performance-baf35f554bc7](https://medium.com/@bdcravens/fixing-docker-for-mac-and-rails-performance-baf35f554bc7)

Even so, this seems like a silly problem to have to worry about, since I could
easily run the app on my machine at full performance with no workarounds.

I think the biggest disadvantage is the herd mentality. Rather than using
Docker where it provides a compelling advantage, we're being told to Dockerize
all the things and accept any pains that come with it as the price of
progress. To me it's as if you had a purely static site yet were told you
needed to use React and webpack. Solve for the pain you have, not what others
have.

> But someone who likes to tinker with servers using scripting like bash etc,
> I don't get docker.

I will say that there is a point where if you like tiny utilities and the
"unix" way, Docker starts to make sense. Rather than have a giant OS with all
its varied dependencies, making tiny containers that do one thing well and are
tied together (for instance, I have a tiny container that monitors a networked
folder and uploads to s3) - you start seeing Docker not as VMs, but as
isolated but connected processes.

~~~
qorrect
> but as isolated but connected processes.

How are they connected ?

~~~
imauld
You can create virtual networks and have all the related containers running
on one of those networks. They can communicate directly with each other in the
docker-compose style (define a service `foo` and any container on that network
can reach it at `http://foo/some/endpoint`), or you can have some other
message bus (redis, rabbitmq, zeromq, kafka, whatever you want really) running
on the docker network that the other containers use to communicate.
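A minimal docker-compose sketch of that service-name lookup (service names and endpoint are hypothetical); Compose puts both services on a default network where each resolves by name:

```yaml
version: "3"
services:
  foo:
    image: nginx                # reachable from other services as http://foo/
  bar:
    image: alpine
    command: sh -c "wget -qO- http://foo/"
    depends_on:
      - foo
```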

Is this the best solution always? Probably not.

An example of using a Docker container as an isolated system process would be
Spotify's `docker-gc`.

[https://github.com/spotify/docker-gc](https://github.com/spotify/docker-gc)

------
spooneybarger
There are several benchmarks floating around about system call overhead for
Docker. That overhead is generally low; however, IO overhead in the form of
filesystem and network can be quite high.

Over at Wallaroo Labs, we've seen some workloads in Docker that have an order
of magnitude worse performance. However, in those cases, it was comparing raw
performance on OSX vs running in Docker on OSX. On Linux, the IO overhead,
while it exists, has generally been much lower.

Talking about "specifically using Docker" is a little difficult as Docker is
mostly providing a UI over existing Linux technologies and thus, isn't that
much different from other LXC based technologies.

If you want to talk about "docker specifics", you should be looking at things
like the overlay networking and Docker Swarm.

If you want to provide someone an environment where you know your software
will work because you've tested that specific environment then containers are
a very nice way to accomplish that.

I'm not sure what you consider the "hassle of containerization". I'm not a big
fan of Docker. The UX irritates me, as I can never remember how to do anything
even if I've done it tons of times before (and in this way, it's very similar
to git for me), but the creation of containers is, in my mind, quite easy.
Certainly no harder than putting together a `bootstrap.sh` with Vagrant.

Without knowing what your run_install.sh does, it's hard to really say much
beyond that.

~~~
arseraptor
I would be very interested to know which network applications show high
overhead. Do you have any apps/numbers you could share? Does high overhead
mean they are generally slower, or that they consume more CPU, or both? (I ask
this as a researcher in computer science looking at container overheads.)

~~~
spooneybarger
The number of bytes we can push through a network connection (with a single
thread), using the same test application, is lower when running in Docker than
without. It's particularly noticeable with Docker for OSX, which shouldn't be
surprising as it's running a VM.

The overhead is much lower on Linux, as one would expect.

I don't have anything handy. I brought up the OSX overhead as I wanted to be
comprehensive given the somewhat ambiguous nature of the original question.

Docker Swarm inter-container networking is pretty bad. It consistently
collapses on us, with throughput dropping by orders of magnitude. I know we
weren't alone in this, as we found an issue that had been open for the problem
for quite some time. (And thus ended our quick experiment with using Docker
Swarm as a demo environment.)

~~~
idunno246
All of the OS X performance difference is from the BusyBox VM, not from
Docker. And for networking, if you use a different network driver than the
default, the performance issues go away - the bridge/proxy stuff is slower,
but you can expose raw interfaces.

That said, I'm also not a fan of Docker except in certain cases (i.e. if you
need polyglot deployment). The development story is worse: OS X disk
performance is bad, and good luck getting a debugger or most tooling to work.
It's an extra layer to worry about - now you manage the number of containers
and the number of servers instead of just servers.

~~~
spooneybarger
"if you use a different network driver than default the performance issues go
away - the bridge/proxy stuff is slower" <-- "issues go away" does not match
our experience. "Better than bridge", yes; "goes away", however, does not
match our experience.

~~~
idunno246
We found macvlan did, but I'm sure our tests aren't exhaustive. Also, I'm not
talking about Swarm networking; I know nothing about that.

~~~
spooneybarger
In our experience, Swarm intercontainer networking is best avoided.

------
gravypod
There are two pain points which I have experienced while using Docker.

Cons:

    1. Disk usage
    2. Permissions issues from volume mounts

Docker is really bad at garbage collection, and it's not uncommon to have
Docker using more than 40GB of disk on my machine at any given time.

In development it's sometimes handy to have volume mounts for frequently
changing files. This causes a lot of permissions issues with the containers.
There's some support in native Docker to handle this (--user), but it isn't
handled well by things like Docker Compose.
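For what it's worth, the `--user` workaround mentioned above looks roughly like this (image name and paths are hypothetical); running the container as the host user means files written to the bind mount aren't owned by root:

```shell
# Run as the current host user/group so files created in the
# mounted directory keep sane ownership
docker run --user "$(id -u):$(id -g)" \
  -v "$PWD/src:/app/src" \
  my-dev-image
```

The catch is that the image must still work when not running as root (e.g. no writes to root-owned paths inside the container).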

Pros:

    1. Easy to develop with
    2. Easy to version
    3. Easy to deploy
    4. Easy to migrate to another server

Gone are the days when you have to copy and paste the manual "apt-get
install....." steps from readmes. Just install Docker, clone the monorepo, run
"docker-compose up -d", and the entire development environment is up and
running. You can roll back your entire system to any point in time, build
everything, and push it into production with amazing tooling around
everything.

You also never have to worry about developing two networked services that both
want to use the same port. If I have Web Server A and Web Server B, I can run
them with production-like configs (80/443) in development, all on my laptop.
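A compose sketch of that port story (image names and host ports are hypothetical): each container binds 80 inside its own network namespace, and only the host-side mappings need to differ:

```yaml
version: "3"
services:
  web-a:
    image: web-server-a        # hypothetical image
    ports:
      - "8080:80"              # both containers listen on 80 internally...
  web-b:
    image: web-server-b        # hypothetical image
    ports:
      - "8081:80"              # ...only the host ports must be distinct
```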

~~~
imauld
I feel your pain on the disk usage, especially when rapidly making changes and
building with `docker-compose up --build`. However, `docker system prune` is
your friend here.

It would be great if it were automated and a little smarter - it basically
leaves no survivors - but against all the advantages of Docker this isn't so
bad.
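For reference, the prune commands involved; the `-a`/`--volumes` variant is the "no survivors" part, so check what it will delete first:

```shell
docker system df                     # see what is using the space
docker system prune                  # remove stopped containers, dangling images, unused networks
docker system prune -a --volumes     # also remove ALL unused images and volumes
```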

~~~
gravypod
There are some community written garbage collectors out there that are good
but none mainlined yet.

I do agree that Docker's pros are worth the cons. Anyone who has ever migrated
a large application whose installation or configuration process they didn't
have good documentation for (especially ones you wrote yourself and forgot
about) knows how valuable good, self-contained packaging is.

~~~
imauld
I've used docker-gc by Spotify for production nodes that needed cleaning up
and it's worked pretty well.

------
BjoernKW
Additional, often unnecessary, complexity and the tendency to use it
indiscriminately even if it's not pertinent to solving the problem at hand.

I've seen teams spend an inordinate amount of time just servicing Docker and
its surrounding infrastructure when they should've been working on their
actual products instead.

------
bradknowles
In my experience, Docker can be hard to debug.

There are lots of utilities that have been developed over the decades for Unix
and Unix-like OSes, but almost all of them assume you’re running on the bare
hardware and not inside of a jail or container.

So, if you're going to use Docker for your deployments, you can basically
throw out all the tools you might use to try to figure out what is going on in
the system, unless they have been specifically developed for use with
containers or at least adapted to be container-aware - which rules out almost
all of them.

If you don’t have access to the Docker host, then the only debugging
facilities you have available to you are the ones you explicitly build into
your container — you can’t assume that there will be any kind of debugging
facilities or tools available to you from the Docker host, because you don’t
have access to the Docker host.

In my experience, that’s orders of magnitude worse.

~~~
tetha
Yup, that's going to be the big bucket of icy water for our devs if we
evaluate Docker for production deployments. I don't mind too much; telegraf
and elasticsearch integrate easily with container setups, so I get the data I
use regularly without a problem.

All the Java-based monitoring and profiling we're currently tacking onto the
artifact during deployment? That'll be gone, and you get to do all of that
yourself. Or (and sadly, that's the more probable outcome in our shop), I'll
need to build my own images.

------
throwaway180421
Lack of good IPv6 support. Getting origin IPs only returns the IPv4 address of
the Docker gateway, and the only way to get around this is to set the network
mode to host.
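The two workarounds usually suggested, for what it's worth: host networking per container, or enabling IPv6 on the daemon (the address range below is the IPv6 documentation prefix; substitute a real routable range):

```shell
# Option 1: share the host's network stack, so the container
# sees real client addresses (at the cost of network isolation)
docker run --network host nginx
```

Option 2 goes in the daemon's `/etc/docker/daemon.json`:

```json
{
  "ipv6": true,
  "fixed-cidr-v6": "2001:db8:1::/64"
}
```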

------
chatmasta
Docker makes a lot of things easy, but the tradeoff is hidden complexity, with
default configuration that may not be the best choice for what the user needs.

Let's look at networking as an example. Docker makes basic networking setup
very easy with the default docker0 bridge setup. However, this is really a
"solve 90% of cases" default that can really damage the 10% of cases that it
is ill-suited for. Developers unfamiliar with Linux networking are unlikely to
even realize it's a bottleneck. Concrete examples of where it becomes a
bottleneck depend on use case, but some can be (a) unnecessary ARP table
overflows when scaling to thousands of containers, (b) heavy TCP connections
between containers (think appContainer<->redisContainer). The reason for the
bottleneck seems to be an overreliance on iptables and ebtables for filtering
container-to-container communication.

The default container-to-container communication is so bad that I actually
switched to a shared socket (mounted in a named volume) for communication with
redis instead of using the default docker networking. I didn't do any formal
benchmarking, but the socket communication was significantly faster than TCP
communication for high-throughput reads from redis (50 MB+).
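A compose sketch of that shared-socket approach (volume and socket paths hypothetical): redis listens on a unix socket inside a named volume that the app container also mounts:

```yaml
version: "3"
services:
  redis:
    image: redis
    # Listen on a unix socket in the shared volume instead of TCP
    command: redis-server --unixsocket /sockets/redis.sock --unixsocketperm 777
    volumes:
      - sockets:/sockets
  app:
    image: my-app              # hypothetical; connects to /sockets/redis.sock
    volumes:
      - sockets:/sockets
volumes:
  sockets:
```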

Since Docker uses netns + veth under the hood, I really wish it were possible
to create a netns and launch a Docker container into it with something like
--net-ns MYNETNS, like you can launch it into a cgroup with --cgroup-parent.
Unfortunately it's not possible without some ugly hacks AFAIK.

Of course, you can mitigate any issues like this, but it requires networking
knowledge and awareness of tradeoffs you are committing to by going "off the
reservation" in terms of Docker setup.

------
x0x0
kubernetes: SO MUCH COMPLEXITY.

If you have a giant app spread across multiple microservices, it definitely
could make sense.

For an app I worked on that was basically divided into two classes of
instances - front-end Rails and a back-end API in Python/Pylons - it was an
utter waste of time and energy. Just go with EBS plus a nice base AMI and
build something useful with all the time you would have wasted on k8s.

~~~
imauld
For a setup like that, yeah k8s is probably overkill unless you're planning to
further decompose the application.

But FWIW k8s != docker. Kubernetes is a piece of software for orchestrating
containers and not only docker containers. Most people that are using k8s are
using docker but I don't think most people using docker are using k8s (just a
guess).

------
Can_Not
I have no idea if your point of view is sysadmin, devops, developer (front
end/backend?), employee, and/or freelancer. I like docker because I am all of
those, despite never actually using docker on the production/deployment side.
I'll go over how docker helps me. I use OSX and Debian/Ubuntu.

I currently freelance for a project that uses MongoDB. I use MongoDB for
nothing else, nor do I want to. In the past, I would infect my system with
MongoDB PPAs that would give me some version of MongoDB (who really knows
which version) and would break my apt-get update randomly. I like to tinker
with servers. I consider tinkering to imply fun, interesting, or
accomplishing something. Fixing MongoDB PPAs breaking my system is the
opposite of fun and interesting. It occupies time I should spend working for
money or tinkering for fun. To refer to "unfun tinkering", we'll use the term
"yak shaving".

I fixed this by adding a docker-compose yaml file that specifies the same
MongoDB version as used in production. I run docker-compose up when I'm
working on the project. I can work on the project faster now and keep my
system safe from MongoDB.
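That whole setup is only a few lines of yaml; a sketch (the version tag is hypothetical, pin whatever production runs):

```yaml
version: "3"
services:
  mongo:
    image: mongo:3.4           # exact version, matching production
    ports:
      - "27017:27017"
    volumes:
      - mongo-data:/data/db    # data survives `docker-compose down`
volumes:
  mongo-data:
```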

With my other projects, I have a docker-compose yaml file that specifies which
versions of Postgres and Redis I'm using. It's the fastest and easiest way to
get the exact versions I want without commingling one project's databases with
another's (this is huge for freelancing). It's faster than Vagrant (though I
think I heard Vagrant is getting support for Docker as a VM backend).

Finally, a Dockerfile is basically a bash script with extra caveats. You put
your build script there, or you can run your existing bash script from it.
There might be some caching advantage to converting it; I'm not sure.
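A sketch of the wrap-your-existing-script approach, using the OP's run_install.sh as the example (the base image is an assumption):

```dockerfile
FROM ubuntu:16.04

# Reuse the existing install script as-is; the RUN step becomes a
# cached layer, which is where any caching advantage would come from
COPY run_install.sh /tmp/run_install.sh
RUN chmod +x /tmp/run_install.sh && /tmp/run_install.sh
```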

Knowing nearly nothing about what you actually do or work with, I'd guess that
there is a reasonable chance that the cost of learning docker could be higher
than the value you would get out of learning it. It was worth it for me
though.

------
fpoling
Nginx can be updated and restarted without losing any connections. With Nginx
under Docker that is not possible.

~~~
marcc
Why isn't this possible using Docker? I would assume you could still issue an
nginx reload command to the container running nginx, since it's the same nginx
software running.

~~~
idunno246
While it may be possible, it's frowned upon. The Docker way is immutable
containers, so if you want to edit the nginx config you "can't" just do it in
place; you need to make a new container with the new config and a new process,
and kill all the existing containers (and hence processes/connections).

Docker is basically at the point today where MapReduce was a few years ago:
while it may solve some problems, it's not a thing that solves everything.

~~~
marcc
You could mount the nginx.conf files into the container so they are editable.
The “correct” way to run a Docker container is to supply the config at
runtime, not at image build time.
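A sketch of that runtime-config pattern (paths and container name hypothetical); since the mounted file is editable from the host, nginx's usual zero-downtime reload still works:

```shell
# Mount the config at runtime instead of baking it into the image
docker run -d --name web \
  -v "$PWD/nginx.conf:/etc/nginx/nginx.conf:ro" \
  -p 8080:80 nginx

# After editing nginx.conf on the host, reload without dropping connections
docker exec web nginx -s reload
```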

------
stackzero
It can be abit of a pain if you're not using linux OS as the server. Setting
up secure image registries or registries in general is also tricky. It can be
tricky learning how to debug dockerised applications because of the extra
layer of indirection.

