Hacker News
Docker Misconceptions (valdhaus.co)
76 points by lsm on July 30, 2015 | hide | past | favorite | 20 comments


It seems to me that a lot of the "Docker isn't good for production" stuff boils down to "Docker is a base layer that's not sufficient for production, and you need other tooling around it."

Like, if you're using Docker in conjunction with AWS's suite of tools (Elastic Beanstalk, CloudWatch, etc.), a lot of these concerns are taken care of, you know?

So Docker doesn't solve everything, but it can be part of the solution.


I totally agree.

Docker on ECS, with a VPC for network and instance isolation, an ELB for load balancing, and Kinesis for log streaming, is working extremely well.

Docker feels really great here as common tooling between development and production systems.

Disclaimer: I'm working on a project, Convox, that automates setting up this type of system. https://docs.convox.com/


I have this sinking feeling that a lot of what is happening with Docker as a specific tool is going to be replaced in 4 or 5 years with unikernels.

My hope is that the orchestration/scheduling tools (mesos, kubernetes, etc.) mature in such a way that the switch from Docker to unikernels is largely transparent to most people.


For all we know, that unikernel might be Docker-branded.


Why? Unikernels are quite opinionated, IMO.


If you look at it from the perspective of the typical deployed application, say, with 4 or 5 VMs working together, it probably doesn't amount to much.

But if you are Google or Amazon, who have to build massive data centers to host thousands and thousands of those apps, alongside much larger-scale applications, you could achieve much more significant density (and therefore reduced costs) running unikernels as opposed to VMs. Perhaps some of that cost difference gets passed on to the customer, both for competitive reasons and as an incentive to upgrade.

That said, even for a small-time app, consider the weight of trying to run a complicated micro-service-based system on a developer laptop. Having to orchestrate a bunch of VMs is an unmitigated disaster. Having to orchestrate a bunch of containers in one or more VMs is an improvement, but not much.

If you could instead run unikernels, there's considerably less overhead. Especially since the unikernels are typically able to run hosted inside a standard host-OS process.

Don't get me wrong, the world isn't really there. But when you consider a kubernetes cluster of docker containers that you never SSH into ... why bother with all those added layers of OS and runtime cruft?


Given the amount of money Docker, Inc. has raised (>50M, three series rounds, etc.), I somewhat cynically think that this buzz about Docker may just be the result of a lot of marketing money.

I'm not really comfortable with such widespread adoption of a tool that is primarily a VC baby--NPM is setting itself up to fail (I think) in a similar fashion.

I do hope I'm wrong.


"Misconception: You should have only one process per Docker container!"

As soon as you start treating Docker images as anything other than isolated, statically compiled executables, you're not going to get the best out of Docker.

If you are bundling inits, crons, and companion apps into a single container, you need to stop, go back, and either refactor your code or go to full-on VMs.

Why?

Because the networking is terrible. There are three great advantages to using real VMs over containers:

o Networking

o Isolation

o Hot migration and resource allocation

Networking:

Every instance of a service can have its own IP and can be trivially tied to DNS automatically: scoped service discovery of a kind that is only sort of possible with containers now, and only via immature tools with limited professional experience behind them. DNS and DHCP with subdomains mean images can be dropped in without any hard work.

Isolation:

It's far harder to break out of a VM than out of a container, especially if you are dealing with persistent storage and need to allow a container to write outside of its own chroot.

Hot migration:

This is the killer. Hardware fails. Having a cluster that automatically migrates around contention and hardware failure, without the app having to worry about it, is worth many thousands of man-hours. Yes, making your own clustering system is fun, but it's really quite hard to do well. Why bother when the hypervisor can do it for you?

There are three things going for docker:

Configuration library:

There is a rich library of prebuilt images

Baked in fudges:

You can bake your dirty hacks into the container; as long as you script them into your build job, they're repeatable.

Speed:

Yes, there is less overhead. But let's be honest: how often have you hit VM speed issues that came down to your machine using too much CPU/memory? (If you're on AWS: no, you haven't. AWS is dogshit slow, and expensive.)

Everything else, like immutable builds, easy dev environments, et al., can already be achieved, and without much work.
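The one-process-per-container pattern the parent is defending can be sketched with a hypothetical Compose file (service names and images are illustrative, not taken from the thread): instead of bundling crons and companion apps into one image, each process gets its own container.

```yaml
# Hypothetical docker-compose.yml (original v1 format, current at the time).
# The app and its scheduled jobs share an image but run as separate
# containers, each with a single foreground process.
web:
  image: example/web-app:1.0      # illustrative image name
  ports:
    - "8080:8080"
cron:
  image: example/web-app:1.0
  command: run-scheduled-jobs     # same image, different entry command
db:
  image: postgres:9.4
```

With this layout there is no init or cron daemon inside any container; the scheduler and the web process can be restarted, scaled, and logged independently.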


I think you're being overly dismissive when it comes to easy dev environments. I was recently fixing the reddit vagrant environment and it was absolutely excruciating:

- Creating a scratch VM (not even provisioning it) is a speed bump when you want to re-run scripts on a pristine environment to validate them. Starting a Docker container takes ~1s.

- Provisioning is slow! Reddit suggests installing a plugin (vagrant-cachier) to keep you sane. I ended up downloading a plugin to take VirtualBox snapshots of my VM and even that was depressingly slow. Docker commit takes maybe 20s on a big layer.

- VirtualBox shared folders are pretty bad, so apparently I should install vagrant-bindfs (and NFS packages). Docker's -v is effectively zero effort.

- I had to keep on shutting down and starting up the VM to tweak memory settings - too big and my laptop died, too small and nothing would start. You get far more flexible controls in Docker.

I couldn't help but think all the way through that putting everything in a Docker container would just be a far nicer experience, even if it does violate the single process rule - for this use case, Docker solves some papercuts very effectively.
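The papercuts above each map to a single Docker flag or command. A sketch of the workflow (the image name and script are hypothetical, and `<container-id>` is a placeholder for whatever `docker ps` reports):

```
# Bind-mount the source tree (vs. VirtualBox shared folders + vagrant-bindfs)
# and cap memory without rebooting anything (vs. editing Vagrantfile settings):
$ docker run -v "$PWD":/src -w /src -m 512m ubuntu:14.04 /src/install.sh

# Snapshot the provisioned state (vs. slow VirtualBox snapshots):
$ docker commit <container-id> reddit-dev:provisioned

# Throw the environment away and restart from the snapshot in about a second:
$ docker run -it reddit-dev:provisioned bash
```

This is a CLI transcript, not a runnable script: it assumes a local Docker daemon and an `install.sh` standing in for the actual provisioning scripts.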


I hear ya, but there are tools for this sort of thing

- Snapshots: provision your VM once, snapshot it, power it off.

- Clone a new machine from that snapshot (it's still powered on, so do what you will with it).

Bonus points for attaching an ephemeral drive.

- Provisioning is slow; that's why you have hot spares to clone from or use directly.
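The snapshot-then-clone workflow described above can be sketched with VirtualBox's CLI (VM and snapshot names are made up for illustration):

```
# Provision once, snapshot, power off
$ VBoxManage snapshot "dev-vm" take "provisioned"
$ VBoxManage controlvm "dev-vm" poweroff

# Clone a fresh machine from that snapshot; a linked clone is fast
# because it shares the base disk with the snapshot
$ VBoxManage clonevm "dev-vm" --snapshot "provisioned" --options link --name "dev-vm-2" --register
$ VBoxManage startvm "dev-vm-2" --type headless
```

This is a transcript sketch, not a script; it assumes VirtualBox is installed and a VM named "dev-vm" exists.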

To minimise hassle, I use the very same infrastructure that I deploy on in prod. This means that there is no difference between prod and dev.

I understand the need to run stuff on laptops, but for me it's really not worth it. Using the same systems and machine sizes as in prod makes life so much simpler. Plus, I can hand over a machine to another dev really easily.



lol i just deployed reddit on docker. very, very nice. will put it up on github if you are interested.


This post is a year old. Many of its points are still valid, but others are not. For example, orchestration has been simplified with hosted services like Tutum and Cloud66.

I do however agree that not everything is ready to be containerized, but we are starting to get close.


>orchestration has been simplified with hosted services like Tutum and Cloud66.

Ah, so you need to use proprietary SaaS in order to have decent orchestration? Not good news.


Also not true. Consider kubernetes and mesos. Both are open source.


Oh yeah, that's a good point.


I didn't even notice this was posted a year ago until I got to the bottom (though I did feel some tools/ideas were left out, which the date explains). That said, by and large this is a really good resource, and as someone who is going all-in with Docker on a side project, I found it a very useful read!


Interesting: while the OP says they like Docker, they pretty much recommend against using Docker for the purposes that most Docker hype recommends it for.


I think it's more the case that he recommends against it...unless you know what you're doing!


A lot of these articles are correct. I would agree that Docker probably isn't ready for production. But containers provide a TON of benefits, and you should absolutely be thinking about how to containerize your applications now. Just because it's not currently ready for production doesn't mean you shouldn't start getting ready to move to a container solution. The ecosystem will mature, companies will offer solutions for these problems, and it will eventually be ready for production. When it is ready, you should be too.

The big problem that Docker solves is the dependency problem. Specifically, it ties multiple levels of dependencies together with application code in a way that makes no assumptions about your environment and how well-maintained it is. It means that your CI system can test on the exact same versions of binaries -- and every dependency down to the kernel level -- that you will run on your production systems.

Many bigger companies will have multiple Yum/Apt/Maven/Git repositories, and with Docker, it doesn't matter. Whatever is built into the container is what gets run. Most importantly, it puts control of those things into the hands of the development team, not the system administration team. It allows you to more cleanly separate your infrastructure ops from your application engineering/devops, which is the prime benefit IMO, because those two groups have never worked together well.
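The dependency-pinning argument above can be sketched as a hypothetical Dockerfile (image, file names, and commands are illustrative): everything is resolved at build time and shipped inside the image, so CI and prod run from identical inputs regardless of which internal repository serves them.

```dockerfile
# Hypothetical Dockerfile: all dependencies, from OS packages to
# application libraries, are baked into the image at build time.
FROM ubuntu:14.04                        # base OS pinned to a tag (or a digest)
COPY requirements.txt /app/
RUN apt-get update && \
    apt-get install -y python python-pip && \
    pip install -r /app/requirements.txt # exact versions listed in the file
COPY . /app
CMD ["python", "/app/server.py"]
```

Because the image is the unit of deployment, the development team controls every layer of this stack without touching the hosts the sysadmin team manages.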



