> I don’t need to worry about the version of node, nor of the dependencies nor anything else. If it’s worked for them, it’ll work for me. As simple as that!
This isn't true as far as I can tell. The Dockerfile will have a series of lines like this:
RUN apt-get install x
RUN apt-get install y
RUN apt-get install z
RUN echo "config line" >> /etc/config.conf
I'm aware that this file is to generate a "run anywhere" image, but I worry people might be treating it as a huge step on from installation scripts when it's very similar. The image part afterwards, however, is a huge step onwards.
You might be interested in MDM, which is a general-purpose dependency manager for binary blobs.
Specifically for container images, you also might be interested in hroot -- it separates the concept of the image and transport out from the containerization system.
I agree wholeheartedly that it's the image permanence that's the interesting part about containers right now. In the last 24 hours I actually had an experience where a docker setup full of apt-get's failed to reproduce an image (new deps were added upstream that broke the system). Fortunately with hroot, I had the exact filesystems I had previously produced in a permanent, transportable system, and all covered by a hash so my production system could fetch exactly the correct version. I could have done this all manually with tars, but that's a pain for nontrivial use cases, and I could have done it with a docker registry, but I'm too much of a security nut to use the public one, and I already have git infrastructure set up, so it's actually easier to use that than try to spin up a private docker registry and secure it, etc.
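For anyone wondering what "covered by a hash" buys you, here's a minimal sketch in Python. To be clear, this is not how hroot works internally; it only illustrates checking a filesystem tarball against a known SHA-256 digest before using it. The function names and the idea of pinning the digest in your deploy config are my own assumptions for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large image tarballs don't need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path, expected_digest):
    """Return True only if the file's content matches the pinned digest."""
    return sha256_of_file(path) == expected_digest
```

If production only accepts an image whose digest matches the one recorded at build time, an upstream change can break a rebuild but can't silently change what gets deployed.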
It was mostly a clarification for people reading that the dockerfile doesn't guarantee repeatable builds.
Thanks for the post :)
This is a perfect example of how we're trying to design Docker: by looking for the right balance between evolution and revolution.
Evolution means it has to fit into your current way of working and thinking. Revolution means it has to make your life 10x better in some way. It's a very fine line to walk.
I think a lot of bleeding edge tools sacrifice evolution because it involves too many compromises - there's a kind of "if they don't get it, their application is not worthy of my tool" mentality, and as a result the majority of developers are left on the side of the road. I see several tools named in this thread which suffer from this problem, and as a result will never get a chance to solve the problem at a large scale.
In this example of build repeatability, "evolution" means we can't magically make every application build in a truly repeatable way overnight. However, we can frame the problem in such a way that lack of repeatability becomes more visible, and there's an easy and gradual path to making your own build repeatable.
Sure, you can litter your Dockerfile with "run apt-get install" lines, and that does partially improve build repeatability: first with a guaranteed starting point, second with build caching, which by default will avoid re-running the same command twice. Your build probably wasn't repeatable to begin with, and in the meantime you benefit from all the other cool aspects of Docker (repeatable runtime, etc), so it's already a net positive.
Later you can start removing side effects: for example by building your dependencies from source, straight from upstream. In that case your dependencies are built in a controlled environment, from a controlled source revision, and you can keep doing this all the way down. The end result is a full dependency graph at the commit granularity, comparable to nix for example - except it's not a requirement to start using docker :)
I agree, this is the right way to go about it. Someone with a nicely repeatable build can go ahead and get that with docker too; someone without one still gets a nicely distributable image. Docker seems to have taken off quickly as there's a benefit very soon after you start using it, and very little to get in the way of you having something running.
There's an issue in that people see the claims of one part and think they apply to the whole (I don't think the poster thinks that, but people reading it might get that impression), but this is a problem of education, not a technical one.
Yes, the procedure of generating the image (Dockerfile) is basically a glorified installation script.
But I don't understand why you use the properties of the image creation tool to refute runtime properties of docker images.
EDIT: It's like discussing the properties of a perfect, headache-free binary package management system and then pointing out that the code is still not guaranteed to be the same, because two builds of the same package could be slightly different. The purpose of the binary packaging scheme is to use the built artifact. Repeatable builds are a secondary goal.
I'm not. My concern is that the language used suggested that having a Dockerfile means that if it builds for one person it will build for everyone, which isn't the case.
That's not a problem with docker, as it's not something docker is trying to (or claiming to) solve. I'm worried that some people might think it is, so thought I'd post here to clarify things.
In my experience (Linux on the desktop), bundling dependencies is an indicator of poor software quality, but I realize that the situation may be different on other systems.
In this case, I think that the solution is to use the distro's package manager to pin minor versions and rely on the distribution's updates for security fixes. Hosting your own repo is a bad idea, since it means you won't get any software updates. Software updates are really important -- they have security patches and bug fixes. If you're worried about the distro changing something from under your feet, you should pick something more stable. Debian or CentOS are good choices.
Of course, updating the dependencies (whether for patches or more consequential updates) in subsequent releases is fine... but within a given release cycle, one otherwise runs the risk of unexpected inconsistency, between e.g. a developer build and a CI build done a short time later.
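To make the "developer build vs. CI build" inconsistency concrete, here's a rough Python sketch that compares the resolved dependency versions from two builds. The function name and the dict-of-versions format are assumptions for illustration, not something any particular package manager emits.

```python
def dependency_drift(dev, ci):
    """Return {name: (dev_version, ci_version)} for every package that differs.

    Both arguments map package names to resolved version strings.
    A package missing from one build shows up with None on that side.
    """
    drift = {}
    for name in sorted(set(dev) | set(ci)):
        dev_version, ci_version = dev.get(name), ci.get(name)
        if dev_version != ci_version:
            drift[name] = (dev_version, ci_version)
    return drift
```

With pinned versions the result should be empty for any two builds in the same release cycle; with floating versions it can change between builds done a short time apart.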
I think this discussion illustrates the problem and might be more articulate than I'm being: https://github.com/bower/bower/pull/538
Thanks @hdevalence -- and anyone else who cares to comment. :)
If your application is simple, then sure, you can get away with almost any deployment and provisioning approach and it'll work "well enough". But these linking capabilities and products like Chef exist for more complex scenarios, and it would do you well to investigate the rationale behind them before being so dismissive.
I currently have a requirement to run 100s of applications provided by mutually untrusted 3rd parties, and co-ordinate startup/shutdown (for backup) and RPC access to these applications.
I need to be able to start an arbitrary combination of these applications on a node, depending on load (I cannot foresee the bandwidth/CPU requirements of each application without running it, and it will change unpredictably over time, sometimes to the point where a 1Gb/s link will be saturated by a single application for a few hours, and then change again to a trickle).
Sometimes I need to start multiple copies of this infrastructure for independent services that I may need to bring up/down independently.
In my scenario, using Docker alone to deploy the whole caboodle is not a maintainable solution. Using Docker to package the untrusted applications and selectively expose just the volumes for backup and a single port to just the host control process (keeping applications from talking to one another), and Chef to deploy/undeploy applications to nodes in arbitrary and constantly varying configurations that automatically rewire themselves is very maintainable.
The way I use these tools:
- Chef/Puppet/etc. = infrastructure deployment and configuration management
- Docker = application deployment and confinement
This separation is useful because the operations people can do their job, and the developers can do theirs, without stepping on each other's toes and with minimal co-ordination. If you do everything in Docker, the ops team has a nightmare managing change in complex applications; if you do everything in Chef, your developers suddenly have to become Chefs, which is overkill and will waste time co-ordinating with the ops people.
My example above is child's play compared to what some organisations need to deploy and manage.
I have seen people discussing using that link feature where it seemed like they were just setting up a database or something for a single application and then linking that database container. It seemed like it would be easier to set up the database in the same container as the application, if possible.
I wasn't saying you can't use Puppet or Chef, just commenting on that particular case with using links for things like database dependencies for a single application.
The use case you describe obviously is not something you would try to manage with Docker alone.
From what you are saying it sounds like you have a good solution.
One thing that I remembered when you mentioned "requirement to run 100s of applications" was this new devops tool called Salt (saltstack.com). I actually don't know much about it but it sounded a little bit like what you are talking about. What do you think of Salt compared to Chef?
Obviously some people have good reasons to use links, like they need to run lots of databases on different servers or something. But for most installations that don't need to scale to serve millions of people, putting all of the application dependencies in one container makes a lot more sense.
- I can host some containers on other physical hosts, so I don't need to keep thinking "what if I fill up the biggest droplet on Digital Ocean?"
- If one of the services dies or needs kicking, 5/6 of the stack is unaffected
- Orchestration is really only a matter of network endpoints getting written to environment variables, which is not too taxing
Running everything in one container also has its advantages - like being able to push the whole stack as one image.
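As a rough illustration of "network endpoints getting written to environment variables", here's a small Python sketch of how an application might read one. It assumes a variable of the shape Docker's container links set, e.g. `DB_PORT=tcp://172.17.0.5:5432`; the alias `db`, the address, and the function name are made up for the example.

```python
import os
from urllib.parse import urlparse


def endpoint_from_env(alias, environ=None):
    """Return (host, port) parsed from a variable like DB_PORT=tcp://172.17.0.5:5432."""
    environ = os.environ if environ is None else environ
    key = f"{alias.upper()}_PORT"
    value = environ.get(key)
    if value is None:
        raise KeyError(f"{key} is not set; is the service linked?")
    parsed = urlparse(value)
    if parsed.hostname is None or parsed.port is None:
        raise ValueError(f"{key}={value!r} does not look like scheme://host:port")
    return parsed.hostname, parsed.port
```

The application only ever asks for `endpoint_from_env("db")`, so moving the database to another host is a matter of changing what the orchestration layer writes into the environment.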
Extremely annoying to read.
Another version of the same font or the same font from another source may work. Webfont sites often tweak the hinting tables to make the fonts look better, but it breaks stuff all the time.
Chrome Version 32.0.1700.72 m on Windows 8.1
Docker as an educational tool can be pretty powerful. One of the most annoying parts of CS courses is the initial install/configure/dependency wrangling you have to do to install required applications in whatever courses you happen to be taking that semester. Since courses may have different and conflicting requirements, just preparing your machine to use for coursework can be a nightmare.
Docker solved this problem for me as a student, and I can imagine it being solved easily for others if professors would just latch on to it and provide Dockerfiles for their students. Sure, OS X and Windows users may have the initial hassle of setting up VirtualBox or whatnot, but I think the trade-off is worth it. And when the course is over, there's no longer a lot of development software sitting around your hard drive that you may never use again. Take any source you developed, the Dockerfile you used, throw it all in a repo and then you can easily replicate the build environment if you need it later on.
As a developer, I use Docker to replicate "large scale" deployments on my own machine. Typically this is just a database container, a Node.js server container, and a container for my web application code. However, as an exercise I've spun up a container with NGINX to act as a load balancer for multiple running instances of my webapp container. It was simple, repeatable, and can be easily replicated on production servers.
Finally, onboarding of new developers becomes MUCH simpler with Docker. I developed bash scripts to quickly spin up containers for development and production workflows, so onboarding new developers to my codebase is fairly easy. I distribute the source code of the project, a repo that contains Dockerfiles and bash scripts, and a small readme. Developers are typically up and running in less than an hour, regardless of their operating system of choice.
I'm not sold on using Docker for development though. I haven't attempted to set up multiple Vagrant machines so maybe that's why I'm not seeing the value, but setting up single dev machines through Vagrant is just so simple and straightforward.
Let's say I'm on a VPS and I'm running multiple instances of CoreOS each hosting multiple containers. Can etcd be used in this case?
Etcd also provides distributed locking for the cluster through a module. If you need to prevent an action from happening more than once, you can take a lock on a specific key to prevent others from processing that item.
Related to locking is the leader election module which offers an easy way to choose a new leader for a distributed service. Module docs are here: https://github.com/coreos/etcd/blob/master/Documentation/mod...
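To show what "take a lock on a specific key" means in practice, here's a toy Python sketch. It's an in-process stand-in and NOT etcd's API: the class, method names, and TTL behaviour are assumptions chosen to illustrate the semantics (one holder per key, and the lock expires if the holder disappears).

```python
import time


class ToyLockTable:
    """In-memory stand-in for a keyed lock service; not etcd's API."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._locks = {}  # key -> (owner, expiry time)

    def acquire(self, key, owner, ttl):
        """Take the lock on key unless a different owner holds an unexpired one."""
        now = self._clock()
        holder = self._locks.get(key)
        if holder is not None and holder[0] != owner and holder[1] > now:
            return False
        # Free, expired, or already ours: (re)take it and reset the expiry.
        self._locks[key] = (owner, now + ttl)
        return True

    def release(self, key, owner):
        """Drop the lock, but only if owner is the current holder."""
        holder = self._locks.get(key)
        if holder is not None and holder[0] == owner:
            del self._locks[key]
            return True
        return False
```

A worker would call `acquire("job-42", "worker-a", ttl=30)` before processing an item and skip the item if that returns False. Leader election can be thought of as the same idea applied to one well-known key: whoever holds it is the leader until the lock is released or expires.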
Am just starting to explore Docker.
Docker is a pretty modern piece of tech that has very little wrong with it, and seems to work exactly as designed. So you are kind of shoehorning the reference...
I'm a pretty smart person; I've pored over the Docker documentation, run through the interactive tutorial twice now, and I still don't have a great sense of 1) what it really is for, 2) how to really use it, or 3) how to handle slightly non-trivial use cases.
For #3, reading through a few examples online of how to get MySQL or Redis up and running in a container... honestly makes my head hurt. And those represent just one or two parts of the system I'm thinking could some day run on Docker - I don't have the time or patience right now to figure out how to get Nginx, Node.js, Redis, MySQL, RabbitMQ, and a few other things here or there - stitched together into a dockerfile.
If I had to make a guess at what the "bad parts" of Docker are, it's that it's complex and not all that understandable - yet. Maybe there aren't that many things that are technically wrong with it, but at this point, it's pretty painful to wrap one's mind around (IMO), and at least I personally think that's a "bad" part.
Docker feels like one of those things that is going to be indispensable and incredibly useful in a year or two. I'm certainly keeping my eye on it, but I'm staying away from the diving board for now.
That goes for anyone else reading this. I love answering questions and helping people. It's like pure bliss, so don't worry about being a bother.
I'd be happy to take whatever I learn, apply it to my app, and contribute it back to the community, maybe as a quick tutorial up on GitHub, or something along those lines.