
As someone who is more developer than ops, I feel like the docker stuff is still changing fast and that the way you would use docker today will be very different a year from now; but containers seem to be the way of the future. If I have no pressing need to change my server architecture, does it make sense to wait for things to settle, or would it be more beneficial to get in and learn now and experience the changes and why they were necessary?



Docker will hopefully be a little different in a year, as it will offer solid security isolation (which is not the case now).

Right now, if you run something as root inside a Docker container and that application gets compromised, the attacker can break out of that container and alter the master (because of the way Docker links container-root and system-root, a root user in a container is effectively a root user on the whole system).

Docker is working on allowing containers to run entirely in user-mode (thanks to improvements in LXC). This would mean that you can run a process as root within a container, and if that gets compromised there is near zero chance of leveraging that into damaging the master OS (since it will just have normal user privileges).
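
To give a rough idea of how the user-namespace side of this works in the kernel (the numbers and the $CONTAINER_INIT_PID variable below are purely illustrative, not Docker's actual implementation):

# An unprivileged user gets a range of subordinate uids on the host:
$ grep someuser /etc/subuid
someuser:100000:65536

# Inside a user-namespaced container, uid 0 is mapped onto that range,
# so "root" in the container is just uid 100000 on the host:
$ cat /proc/$CONTAINER_INIT_PID/uid_map
         0     100000      65536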

Here's an article about the progress toward usermode containers:

http://s3hh.wordpress.com/2013/07/19/creating-and-using-cont...

To quote Docker's own documentation[1]:

> However, it has been pointed out that if a kernel vulnerability allows arbitrary code execution, it will probably allow to break out of a container — but not out of a virtual machine.

In other words, right now, a root process is likely able to escape a Docker container. You can use SELinux, AppArmor, and similar to somewhat mitigate that when it happens, but none of them is nearly as powerful as having that usermode isolation on the master.

If Docker is able to get usermode containers working, it will be very difficult for a Docker container to alter either other Docker containers or the master system (other than over the network, maybe).

[1]http://blog.docker.com/2013/08/containers-docker-how-secure-...


It has nothing to do with LXC, but with the Linux kernel. LXC simply bundles all of the Linux kernel namespace features together into one shiny thing to make containers. Conceptually, Docker, via libcontainer, does the exact same thing.
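
You can see those building blocks directly on any reasonably recent (3.8+) kernel; LXC and libcontainer are both just assembling them:

$ ls /proc/self/ns
ipc  mnt  net  pid  user  uts
# ...plus cgroups (under /sys/fs/cgroup) for resource limits.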


It is definitely changing fast, but that is not a reason to wait before trying out the technology in order to learn about it. If nothing else, the community will welcome your feedback :-)


Precisely because you're a developer, you should embrace docker with open arms.

Because docker removes all the trouble of running applications that you need for your development: databases, application servers, queues...

I love the fact that I can focus on my code and not on all those details that used to steal so much of my time.


I'd like to see a post/writeup (or even an essay!) on how to do the magic "backing services" (http://12factor.net/backing-services) with Docker. And that's where I find Deis and other Docker orchestration systems lacking very much.

Sure, you can run MySQL in Docker, but it's a far cry from running it on native xfs with aligned partitions and whatever fancy tuning you feel like configuring. And since docker containers are very reusable, whereas backing data should be persistent, my impression is that it's too easy to accidentally remove a docker container.


At the expense of sounding like I'm just plugging my own company, this is what we're working on in the open-source project Flocker (https://github.com/ClusterHQ/flocker). We think that data services like databases, queues, key-value stores, and anything else with state should be able to run inside docker containers too. Yes, you can already run a database in a docker container, but from an ops perspective this is a nightmare and very far from what you want to be able to do in a production system. Would love your feedback on what we're building, and even more for you to get involved; Flocker is licensed under Apache 2.0.


You say that running a database in a Docker container is "a nightmare and very far from what you want to be able to do in a production system," but you do not explain why. Perhaps you might substantiate such a claim?


This blog post about the problems running databases in container-based PaaS is a good starting point: http://blog.lusis.org/blog/2014/06/22/feedback-on-paas-reali...

Generally, you want to be able to answer these questions when it comes to operating your databases:

What are the failure points?
What is the impact of each failure point?
What are the SINGLE points of failure?
What is my recovery pattern?
What is my upgrade experience?
What is the operational overhead in the applications running ON the product?
What is my DR strategy?
What is my HA strategy?

Neither pure Docker nor any other tool in the docker/container ecosystem that we are aware of provides really good answers to these questions when it comes to databases. That is what I meant when I said running databases in containers is a nightmare. It is possible today, for sure, but it is extremely complex operationally, and that's why it is so rare to see databases running in containers in production today.


Nothing is stopping you from running MySQL in Docker on native xfs with aligned partitions: bind-mount whatever partition you want into the container by defining a volume in Docker.
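
A minimal sketch, assuming /data/mysql lives on your tuned xfs partition (the image and container names are just placeholders):

# The host directory is on xfs, tuned however you like; the container
# just sees it as its data directory:
docker run -d --name mysql-prod -v /data/mysql:/var/lib/mysql some-mysql-image

# Removing the container later leaves /data/mysql on the host untouched:
docker rm -f mysql-prod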

This will be persistent, and will survive when you destroy the container. I use this to, e.g., share a /home directory between a dozen experimental dev containers I use to run my various projects: each container ensures I keep track of exact dependencies for each individual project, while I get to have a nice "comfortable" swiss-army-knife container with my dev tools and all project files.

I also run a number of database containers that use volumes where I bind-mount host directories to ensure persistence, so I can wipe and rebuild the containers themselves without worrying about touching the data.


Last time I tried was around 0.8, and aufs was not happy with xfs. I haven't tried having /var/lib/docker on ext4 and "/volumes" on xfs; I'll try it if the need arises.

I run a few MongoDBs with volumes, but I'm not confident that I won't accidentally start two with the same volume, or that someone won't accidentally delete the volume, or .. or .. or.

As I've written to a sibling comment, I don't consider it a hard problem, but it hasn't been taken care of .. yet!


I am intrigued (and, again, showing my current early-stage understanding of LXCs): can you link the same data store container to multiple application containers? As in, have both a beta application and a production application pulling data from the same core DB?

And do you simply define the container as a volume to ensure it stays persistent? That was the feeling I got from the docs, but again, might just be flagging how little I know at the minute...


docker run --name=my-data -v /host/data:/container/data data-container

docker run --volumes-from=my-data app-beta-container

docker run --volumes-from=my-data app-prod-container

That would share the data store; however, the real way you'd do this would be:

docker run --name=my-data -v /host/data:/container/data data-image

docker run --volumes-from my-data --name my-database database-image

docker run --link=my-database beta-app

docker run --link=my-database prod-app

Doing --link will allow those two containers to communicate over the network, and you should only be communicating with your database over the network anyway.
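
In practice you would usually give the link an explicit alias; the app then finds the database through the environment variables and hosts entry Docker injects. A rough sketch, assuming the database image EXPOSEs 3306:

docker run -d --name my-database database-image
docker run --link my-database:db beta-app

# Inside beta-app, Docker injects roughly:
#   DB_PORT_3306_TCP_ADDR=172.17.0.5
#   DB_PORT_3306_TCP_PORT=3306
# plus (in newer Docker versions) a "db" entry in /etc/hosts,
# so the app connects to host "db" on port 3306.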


Yes, you can definitely share volumes between any number of containers, it's a common usage pattern.

You don't need to make a container persistent: Docker, by design, will never remove anything unless you explicitly ask it to. If you want to separate the lifecycle of a directory within your container, so that it stays behind after you explicitly remove the container, or to share it between containers - that's when volumes are useful.
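
Concretely (container name made up):

docker rm my-app       # removes the container, leaves its volumes behind
docker rm -v my-app    # pass -v if you also want its docker-managed volumes gone
# (bind-mounted host directories are never deleted by docker either way)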


Unfortunately I haven't read enough 12fa, but I know I can address most of your questions with one factoid: Volumes. You are absolutely right that Docker containers are meant to be disposable, and should not contain backing data. That is what Volumes are for. I haven't done enough with volumes to give you a real primer on the use of them, but volumes can run on whatever backing store you want and they are not so intertwined with the container that they would be deleted along with it.

It looks like volumes have evolved significantly since the feature was introduced; you might want these links (sorry, I haven't reviewed them myself):

https://docs.docker.com/userguide/dockervolumes/

http://crosbymichael.com/advanced-docker-volumes.html

(I actually do keep my backing data in the containers, we have institutionalized backups where all of the important data is already kept in git anyway, so instance clones are in fact disposable for me even though they have all of the important backing data in them.)


I'm familiar with volumes, and here's how I see the problem:

On a docker host you have the docker daemon, whatever auxiliary stuff you need to orchestrate the containers or the host (update docker itself, and so on), and space for /var/lib/docker, and that's it. Volumes always end up somewhere on the host, like /host/data. That means you have to make up a scheme and convention, cook up scripts, and add it all to your already quite dynamic mental model.

If you want to manage volumes, you need something for that, and currently everyone and their cat has their own solution (there is the one they claim to use, the one they actually use, and the one they hack on to use later). I'm not claiming it's a hard problem, just that it hasn't been taken care of yet.
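
To make it concrete, here is the kind of wrapper everyone ends up writing (the paths and the mysql-image name are made up):

#!/bin/sh
# usage: ./run-db.sh mysql-prod  (the name doubles as the volume directory)
set -e
name=$1
vol=/srv/volumes/$name
mkdir -p "$vol"
docker run -d --name "$name" -v "$vol:/var/lib/mysql" mysql-image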

Maybe Flocker will deliver; I haven't checked it, since it was only posted 5 minutes ago :)


Yeah, the problem isn't that you can't do it. The problem is there are too many ways that mostly do what you need.


Docker's design makes it incredibly easy, or at least it makes it difficult to treat your backing services as anything but attached resources, therefore forcing you into some sort of 12factor-esque design. There's no magic to "backing services".


So, basically, you're saying that you don't want to mess with that setup and would rather use existing stuff... which you could achieve with anything as long as someone else does the setup for you. I don't see where Docker is better for a developer in this case... providing a vagrant box is exactly the same.


Startup time for a docker container is way, way faster than for a VM. Also, you can run the exact binary state of production, which is helpful if you run into "works on my machine" types of problems.


Agreed, but my point was that Docker doesn't reduce dev setup time. Give a good vagrant config file to a dev and tell him to do vagrant up and you have the same result as what you're saying. You can replicate production state with vagrant too (and bash scripts if we stretch this) and avoid "works on my machine" problems.

I'm not saying that vagrant > docker. The way I see it, docker is great if your infrastructure is using it all the way. If your prod setup is not dockerized, using docker in dev seems to me more counterproductive than spinning up a VM and provisioning it with ansible or puppet to achieve production replication. As @netcraft said, I don't see why I should "change my server architecture" to use docker in dev.


If you have a complex stack (multiple services, different versions of Ruby/Python/etc, DB, search engine, etc), it's a real pain to shove them all into a single VM. Once you have 2 VMs running, you have already lost to Docker on memory/space efficiency and start-up time.


I have yet to see real, complex, distributed applications that share the exact same config in dev and production. I know that having the same versions of system libs in dev and prod can be a problem in some contexts and docker can help with that, but it's not the only solution and does not take care of the whole landscape (e.g., npm package.json, pip requirements.txt, etc.).

I totally agree that startup time of a container is far less than a VM, but I don't see how docker "removes all the trouble of running applications that you need for your development: databases, application servers, queues"

You still need to install and configure these services, make sure that the containers can talk to each other in a reliable and secure way, etc.


First, I'm a dilettante. I haven't used docker in production. I've really only set up a handful of containers.

That said, all of those fiddly library dependencies are where I struggle the most at work. If I could just build a docker image and hand that off, it would save me a lot of grief with regard to getting deployment machines just right.

I do have a great deal of experience with legacy environments, and it seems like the only way to actually solve problems is to run as much as possible on my machine. Lowering that overhead would be valuable. Debugging simple database interaction is fine on a shared dev machine. A WebLogic server that updates Oracle that's polled by some random server that kicks off a shell script... ugh. Even worse when you can't log into those machines and inspect what a dev did years ago.

If you've got a clean environment, there's probably not as much value to you.


I hear you about legacy systems. Two years ago, I had to support a Python 2.4 system that used a deprecated crypto C library, and I did not want to "pollute" my clean production infrastructure. Containers would definitely help with this scenario. The thought never occurred to me that docker could be used to reproduce/encapsulate legacy systems, thanks!


At the company I work for, we went through all the trouble of getting our distributed backend application running in Vagrant using Chef so that we could have identical local, dev, and production environments.

In the end, it's just so slow that nobody uses it locally. Even on a beefy Macbook Pro, spinning up the six VMs it needs takes nearly 20 minutes.

We're looking at moving towards docker, both for local use and production, and so far I'm excited by what I've seen, but multi-host use still needs work. I'm evaluating CoreOS at the moment and I'm hopeful about it.


I don't see how Docker solves the speed problem without a workflow change that could already be accomplished with Vagrant.

* Install your stack from scratch in 6 VMs: slow
* Install your stack from scratch via 6 Dockerfiles: slow
* Download prebuilt vagrant boxes with your stack installed: faster
* Download prebuilt docker images with your stack installed: fastest

The main drawback of Vagrant is that afaik it has to download the entire box each time instead of fetching just the delta. That may not matter much on a fast network.


Running 6 VMs has non-trivial overhead though. That just isn't there using containers.


I have to disagree, although I'll admit that what's "trivial" is subjective. Sure, a container means you don't have to run another kernel. If the container is single-purpose like docker encourages, you skip running some other software like ssh and syslog as well. That software doesn't use much CPU or memory though. I just booted Ubuntu 12.04 and it's using 53MB of memory. Multiplied across 6 VMs that's 318MB, not quite 4% of the 8GB my laptop has. I'd call that trivial.

On the last project where I had to regularly run many VMs on my laptop, the software being tested used more than 1GB. Calling it 1GB total per VM and sticking with the 53MB overhead, switching to containers would have reduced memory usage by 5%. Again, to my mind that's trivial.



