We do this by combining (1) Docker in "client mode", which connects to (2) a super-lightweight Linux VM built with boot2docker.
The details are on http://docs.docker.io/en/latest/installation/mac/
Once I got that ironed out, everything is running very smoothly, and I don't have to ssh into the VM to do things. Nicely done.
My wish for 0.9 is a more streamlined installation process, possibly by simply incorporating these steps into a Homebrew formula.
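Until then, here is a hedged sketch of the manual steps a Homebrew formula could wrap (the boot2docker subcommands and the API port are from the 0.8-era docs; the formula name is an assumption):

```shell
# Install a client-only docker binary, create and boot the tiny VM, then
# point the client at the VM's daemon. Commands that need the tools
# actually installed are shown commented.
#   brew install docker          # hypothetical client-only formula
#   boot2docker init             # create the VirtualBox VM
#   boot2docker up               # boot it
export DOCKER_HOST=tcp://127.0.0.1:4243   # 4243 was the default API port then
#   docker version               # should now report client AND server versions
echo "docker client will talk to $DOCKER_HOST"
```

Once DOCKER_HOST is set, the local client drives the VM's daemon transparently, which is why no ssh into the VM is needed.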
» docker version
Client version: 0.8.0
Go version (client): go1.2
Git commit (client): cc3a8c8
2014/02/05 23:10:55 unexpected EOF
docker@boot2docker:~$ docker version
Client version: 0.8.0
Go version (client): go1.2
Git commit (client): cc3a8c8
Server version: 0.8.0
Git commit (server): cc3a8c8
Go version (server): go1.2
Last stable version: 0.8.0
See also https://github.com/steeve/boot2docker/issues/48 which has some more information about this specific issue.
Absolutely not! I have to support my parents using a computer, and I bought them a Mac Mini 6-7 years ago because they would always get their Windows machine into an unusable state where I couldn't even remotely connect to help. Using a Mac, they can do the things they need to do with almost no problems: store photos, backup, email, web browsing, FaceTime, iChat. That would be impossible for them on a Windows machine.
MENU LABEL boot2docker v0.5.2
APPEND initrd=boot2docker/v0.5.2/initrd.img loglevel=3 user=docker
$ docker login https://registry.example.com
2014/02/05 14:36:20 Invalid Registry endpoint: Get https://registry.example.com/v1/_ping: x509: failed to load system roots and no roots provided
Update: Filed a bug against docker, others are having the same issue. https://github.com/dotcloud/docker/issues/3946
We've confirmed the instructions still work with Docker 0.8 (make sure to change the checkout branch though :))
What is your development workflow? I am working on a Rails app, so my instinct is to have a shared folder between OS X and boot2docker, but afaik this is not supported, as boot2docker doesn't include the VirtualBox Guest Additions.
It turns out that shared folders are not a sustainable solution (independently of whether boot2docker supports them), so the best practices are converging towards this:
1) While developing, your dev environment (including the source code and method for fetching it) should live in a container. This container could be as simple as a shell box with git and ssh installed, where you keep a terminal open and run your unit tests etc.
2) To access your source code on your host machine (e.g. for editing on your mac), export it from your container over a network filesystem: Samba, NFS or 9p are popular examples. Then mount that from your mac. Samba can be natively mounted from Finder with Cmd-K. NFS and 9p require MacFUSE.
3) When building the final container for integration tests, staging and production, go through the full Dockerfile + 'docker build' process. 'docker build' on your mac will transparently upload the source over the docker remote API as needed.
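A hedged sketch of those three steps as commands (the image names, share path, and VM address are placeholders, not anything prescribed here):

```shell
# 1) the dev environment lives in a container: a shell box with git, ssh
#    and smbd that also holds the source
#      docker run -d -p 445:445 mydev/devbox      # hypothetical image
# 2) that container exports /src over Samba; mount it from the mac with
#    Finder's Cmd-K using a URL like this:
SRC_SHARE="smb://192.168.59.103/src"              # placeholder VM address
# 3) the shippable image is still built the normal way from the repo:
#      docker build -t myapp .
echo "edit via $SRC_SHARE, run tests inside devbox, ship with docker build"
```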
There are several advantages to exporting the source from the container to the host, instead of the other way around:
- It's less infrastructure-specific. If you move from virtualbox to vmware, or get a Linux laptop and run docker straight on the metal, your storage/shared folders configuration doesn't change: all you need is a network connection to the container.
- Network filesystems are more reliable than shared folders + bind-mount. For example they can handle different permissions and ownership on both ends - a very common problem with shared folders is "oops the container creates files as root but I don't have root on my mac", or "apache complains that the permissions are all wrong because virtualbox shared folders threw up on me".
That said, we need to take that design insight and turn it into a polished user experience - hopefully in Docker 0.9 this will all be much more seamless!
As Docker evolves, it would be great to have some kind of official resource to get suggestions for optimal workflows as new features become available (the weekly docker email is my best resource right now). Searching the internet for info has been a huge chore as most of the resources (including the ones hosted by docker.io) are woefully out of date.
Yes! We are trying to figure this out. Our current avenue for this is to dedicate a new section in the docs to use cases and best practices.
As you point out, our docs (and written content in general) are often inaccurate. We need to fix this. Hopefully in the coming weeks you will start seeing notable improvements in these areas.
Thanks for bearing with us!
Thank you for taking the time to write this, just to emphasize these two pain points. I've been using Docker since 0.5 and my current setup is still based around sharing from host to guest. The problems you mention obviously aren't deal breakers (at least for me), but the accumulated effort of dealing with these issues (especially having to modify permissions) adds up over time.
Here's a concern and a hypothetical, though, and I'd like some insight (or a facepalm) from others if I'm wrong...
Say I'm collaborating with a few people on a Rails app, we all work within a Docker container we build from a Dockerfile kept in our source control, and we use the guest-to-host setup you outline. What happens if one of my developers accidentally pushes that container to Docker's public registry? Is my billion dollar ( ;) ) Rails app stored in that container and suddenly available to anyone who wants to pull it?
I would hope the above is a far-fetched example, but with host-to-guest sharing I at least have some safeguard in knowing that my data is decoupled from my configuration. Is such decoupling worthwhile in your opinion?
In the absence of "best practices", even a discussion thread somewhere that allows Docker users to pick apart and discuss configurations would be helpful. Pretty much all I've been able to find is a smattering of blog posts.
What would be the recommended way? How would you install the required software on either side, with the host on OS X or on Linux?
On a recent Linux it seems that "modprobe 9p" loads the required module, and then "mount -t 9p serverIP /mountpoint" does the trick.
But what about the OS X side?
What's the overhead?
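For the Linux side described above, a minimal hedged sketch of the 9p client mount (it needs root and a running 9p server, so the privileged commands are commented out; the server address and mountpoint are placeholders, and 564 is the conventional 9p port):

```shell
SERVER=192.168.59.103   # placeholder address of the exporting container/VM
MNT=/mnt/src            # placeholder local mountpoint
# load the 9p client modules, then mount the export over TCP:
#   modprobe 9p && modprobe 9pnet_tcp
#   mount -t 9p -o trans=tcp,port=564 "$SERVER" "$MNT"
echo "9p: $SERVER mounted at $MNT"
```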
You cannot treat a docker container like a virtual machine – code running in the container has almost unfettered access to the parent kernel, and the millions of lines of often-buggy C that involves. For example with the right kernel configuration, this approach leaves the parent machine vulnerable to the recent x86_32 vulnerability (http://seclists.org/oss-sec/2014/q1/187) and many similar bugs in its class.
The algorithms in the running kernel are far more exposed too - instead of managing a single process+virtual network+memory area, all the child's resources are represented concretely in the host kernel, including its filesystem. For example, this vastly increases the likelihood that a child could trigger an unpatched DoS in the host, e.g. the directory hashing attacks that have affected nearly every filesystem implementation at some point (including btrfs as recently as 2012).
The containers code in Linux is also so new that trivial security bugs are being found in it all the time – particularly in sysfs and procfs. I don't have a link right now, though LWN wrote about one a few weeks back.
While virtual machines are no security panacea, they diverge in what classes of bugs they can be affected by. Recent QEMU/libvirt supports running under seccomp, ensuring that even if the VM emulator is compromised, the host kernel's exposure remains drastically limited. Unlike QEMU, you simply can't apply seccomp to a container without massively reducing its usefulness, or using a seccomp policy so liberal that it becomes impotent.
You could use seccomp with Docker by nesting it within a VM, but at that point Docker loses most of its value (and could be trivially replaced by a shell script with a cute UI).
Finally when a bug is discovered and needs to be patched, or a machine needs to be taken out of service, there is presently no easy way to live-migrate a container to another machine. The most recent attempt (out of I think 3 or 4 now) to add this ability to Linux appears to have stalled completely.
As a neat system for managing dev environments locally, it sounds great. As a boundary between mutually untrusted pieces of code, there are far better solutions, especially when the material difference in approaches amounts to a few seconds of your life at best, and somewhere south of 100mb in RAM.
If a web application has a vulnerability that allows arbitrary code execution then Docker is only a mild help.
BUT, it can help mitigate a certain set of security problems. It is a very simple way to provide pretty good protection against file-traversal type vulnerabilities, even when combined with privilege escalation.
People shouldn't view Docker as a security "silver bullet". But at the same time it does provide an additional layer of security, and that layer can be useful.
The Docker people have a good post about the Docker security model, and they list two future improvements they see as important:
"map the root user of a container to a non-root user of the Docker host, to mitigate the effects of a container-to-host privilege escalation;"
"allow the Docker daemon to run without root privileges, and delegate operations requiring those privileges to well-audited sub-processes, each with its own (very limited) scope: virtual network setup, filesystem management, etc."
I think most people would agree these are important goals.
Note: there will probably be an OpenVZ backend available for Docker at some point :)
You list various facts that are mostly correct, but your conclusion is wrong. Docker absolutely does not reduce the range of security mitigations available to you.
Your mistake is to present docker as an alternative to those security mitigations. It's not an alternative - it presents you with a sane default which can get you pretty far (definitely further than you are implying). When the default does not fit your needs, you can fit Docker in a security apparatus that does.
The current default used by docker is basically pivot_root + namespaces + cgroups + capdrop, via the lxc scripts and a sane locked down configuration. Combined with a few extra measures like, say, apparmor confinement, dropping privileges inside the container with `docker run -u`, and healthy monitoring, you get an environment that is production-worthy for a large class of payloads out there. It's basically how Dotcloud, Heroku and almost every public "paas" service out there works. It's definitely not a good environment for all payloads - but like I said, it is definitely more robust than you imply.
So your first mistake is to dismiss the fact that linux containers are in fact an acceptable sandboxing mechanism for many payloads out there.
Your second mistake is to assume that if your payloads need something other than linux containers, you can't use Docker. Specifically:
> You cannot treat a docker container like a virtual machine – code running in the container has almost unfettered access to the parent kernel, and the millions of lines of often-buggy C that involves. For example with the right kernel configuration, this approach leaves the parent machine vulnerable to the recent x86_32 vulnerability (http://seclists.org/oss-sec/2014/q1/187) and many similar bugs in its class.
> The containers code in Linux is also so new that trivial security bugs are being found in it all the time – particularly in sysfs and procfs. I don't have a link right now, though LWN wrote about one a few weeks back.
> While virtual machines are no security panacea, they diverge in what classes of bugs they can be affected by. Recent Qemu/libvirt supports running under seccomp.. ensuring even if the VM emulator is compromised, the host kernel's exposure remains drastically limited. Unlike qemu, you simply can't apply seccomp to a container without massively reducing its usefulness, or using a seccomp policy so liberal that it becomes impotent.
Of course you're right, sometimes a container is not enough for sandboxing and you need a VM. Sometimes even a VM is not enough and you need physical machines. That's fine. Just install docker on all of the above, and map containers to the underlying machines in a way that is consistent with your security policy. Problem solved.
> You could use seccomp with Docker by nesting it within a VM, but at that point Docker loses most of its value (and could be trivially replaced by a shell script with a cute UI).
That's your judgement to make, but I'm going to go out on a limb and say that you haven't actually used Docker that much :) Docker is commonly used in combination with VMs for security, so at least some people find it useful.
> Finally when a bug is discovered and needs to be patched, or a machine needs to be taken out of service, there is presently no easy way to live-migrate a container to another machine. The most recent attempt (out of I think 3 or 4 now) to add this ability to Linux appears to have stalled completely.
In my opinion live migration is a nice-to-have. Sure, for some payloads it is critically needed, and no doubt the day Linux containers support full migration those payloads will become more portable. But in practice a very large number of payloads don't need it, because they have built-in redundancy and failover at the service level, so an individual node can be brought down for maintenance without affecting the service as a whole. Live migration also has other issues; for example, it doesn't work well beyond the boundaries of your shared storage infrastructure. Good luck implementing live migration across multiple geographical regions! Service-level redundancy has been established as an ops best practice, so over time the number of payloads which depend on live migration will diminish.
> As a neat system for managing dev environments locally, it sounds great. As a boundary between mutually untrusted pieces of code, there are far better solutions, especially when the material difference in approaches amounts to a few seconds of your life at best, and somewhere south of 100mb in RAM.
To summarize: Docker is primarily a system for managing and distributing repeatable execution environments, from development to production (and not just for development as you imply). It does not implement any security features by itself, but allows you to use your preferred isolation method (namespaces, hypervisor or good old physical separation) without losing the benefits of repeatable execution environments and a unified management API.
In the default configuration (and according to all docs I've seen), regardless of some imagined rosy future, today docker is a wrapper around Linux containers, and Linux containers today are a very poor general purpose security solution, especially for the kind of person who needs to ask the question in the first place (see also: the comment I was originally replying to)
You're right. But what it does is provide anecdotal evidence that your views are not shared by a large and growing number of experienced engineers.
> In fact I've really no idea what purpose your reply was hoping to serve.
It's pretty simple: you made an incorrect statement, I'm offering a detailed argument explaining why.
> In the default configuration (and according to all docs I've seen) [...] today docker is a wrapper around Linux containers
> [...] regardless of some imagined rosy future [...]
I only described things that are possible today, with current versions of Docker. No imagined rosy future involved :)
> and Linux containers today are a very poor general purpose security solution
I guess it really depends on your definition of "general purpose", so you could make a compelling argument either way.
But it doesn't matter because if you don't trust containers for security, you can just install Docker on a bunch of machines and make sure to deploy mutually untrusted containers on separate machines. Lots of people do this today and it works just fine.
In other words, Docker can be used for deployment and distribution without reducing your options for security. Respectfully, this directly contradicts your original comment.
> In other words, Docker can be used for deployment and distribution without reducing your options for security. Respectfully, this directly contradicts your original comment.
If I understand services like Heroku correctly, they give customers standard access to run arbitrary code inside a container as a standard user. Therefore, I expect it would be standard and unavoidable to have many different customers' applications running on the same machine, leading to the ability to exploit vulnerabilities similar to the recent x32 one. If they instead used a VM for each application, they would have to pierce the VM implementation, potentially plus seccomp in some cases, which is the mitigation the parent was referring to. The choice to use Docker instead of VMs limits the security options available.
>If they instead used a VM for each application, they would have to pierce the VM implementation, potentially plus seccomp in some cases, which is the mitigation the parent was referring to. The choice to use Docker instead of VMs limits the security options available.
The parent is suggesting you can use Docker as a supplement to any additional security measure one might choose (to quote: "Docker is commonly used in combination of VMs for security, so at least some people find it useful").
In your example, a person would run Docker on top of the VM, and gain "a system for managing and distributing repeatable execution environments".
AFAICT through the wall of text, the only problem you have with what I said is that Docker loses its value when combined with a VM. That's fair enough, but that was 1% of my comment.
If you're replying, please don't quote yet another wall of text, it's almost impossible to read.
It solves the "I want to run two Apaches, how do I stop them conflicting" problem, not the "I don't trust what is being run here" problem. It has never been marketed as that, and you are presenting it as if it was.
Docker is a great way to build up a machine, and logically define a machine's capabilities.
Your gripe about security is just completely irrelevant, it'd be like complaining that iPhoto doesn't increase OS X Security.
Edit: So the answer to the original "Is docker good for security?" I would say "maybe, but that's not its intention or focus".
There’s near zero overhead, because there’s no virtualization.
Also, it's not really Docker doing that, it's LXC. Docker is an API around it.
However, the analogy I just made is only true at a point in time. LXC is not the only way on Linux to manage cgroups/namespaces; it was just the most convenient at the time for the target audience. That is a fleeting position, soon remedied.
If you'd like to know more, I'd encourage you to get involved with Docker development.
thanks for nothing really.
We'd like to ship a set of utilities as a docker container, but unless the sysadmin gives everyone 'sudo' privileges on the server (unlikely and insecure), they can't run the container and its utilities.
Future versions of the Docker API will natively support scoping. This means that each API client will see a different subset of the Docker engine depending on the credentials and origin of the connection. This will be implemented in combination with introspection, which allows any container to open a connection to the Docker engine which started it.
When you combine scoping and introspection, you get really cool scenarios. For example, let's say your utility is called "dockermeister". Each individual user could deploy his own copy of dockermeister, in a separate container. Each dockermeister container would in turn connect to Docker (via introspection), destroy all existing containers, and create 10 fresh redis containers (for reasons unknown). Because each dockermeister container is scoped, it can only remove containers that are its children (ie that were created from the same container at an earlier time). So they cannot affect each other. Likewise, the 10 new redis containers will only be visible to that particular user, and not pollute the namespace of the other users.
Of course scoping works at arbitrary depth levels... so you could have containers starting containers starting containers. Containers all the way down :)
Add silly disclaimer: yes, Docker has some notion of portable container plugins, but it uses LXC atm, and the feature is in LXC upstream.
The official installation process seems more complicated, and I don't really see an advantage.
This "native sandboxing for own-ABI containers where available, VM for everything else" approach would extend to any other platform as well, I'd think (Windows, for example). I'm surprised that this isn't where Docker is going, at least for development and testing of containers.
(Though another alternative, probably more performant for production, would be something like having versions of CoreOS for each platform--CoreOS/Linux, CoreOS/Darwin, CoreOS/NT, and so on--so you'd have a cluster of machines with various ABIs, where any container you want to run gets shipped off to a machine in the cluster with the right ABI for it.)
Longer term we do need to support multiple ABIs, if only because a lot of people want to use Docker on x86-32 and ARM. Having ELF binaries built on Linux isn't of much help if they're built for another arch :) So at the very least we will need to support 3 ABIs in the near future.
The good news is that it can be done in a way which doesn't hurt the repeatability of Docker's execution environment. Think of it this way: every container has certain requirements to run. At the very least it needs a certain range of kernels and archs (and yes, it's possible, although uncommon, for a binary to support multiple archs). It may also require a network interface to bind a socket on a certain TCP port. It may require certain syscalls to be authorized. It may require the ability to create a tun/tap device. And so on.
Docker's job is to offer a portable abstraction for these requirements, so that the container can list what it needs on the one hand, the host can list what it offers on the other, and docker can match them in the middle. If the requirements listed by a given container aren't met ("I need CAP_SYS_ADMIN on a 3.8 linux kernel and an ARM processor!") then docker returns an error and a clear error message. If they are met, the container is executed and must always be repeatable.
TLDR: ABI requirements are just one kind of requirements. Docker can handle multiple requirements without breaking the repeatability of its execution environment.
Considering that the overwhelming majority of Unix servers are running Linux, I think it's better to say that Docker is Linux-based, end of discussion.
Not exactly the same but closer.
I'll stick to my statement. I want something Docker-like for Windows so that I can easily move things from one machine to another.
So to me that doesn't fit in with the Semantic Versioning contract. I think the product is too young yet to use a version scheme that assumes relative API stability.
Edit: prelim zfs driver work is here https://github.com/gurjeet/docker/tree/zfs_driver
Swappable storage engines will be easier to create over time, not less. There's also a ZFS branch, but the reality is people spent time and resources on getting BTRFS (which has been experimental for >6mos) instead of ZFS.
Docker development works a lot like Linux development (just on a much, much smaller scale.) If there's an area where you're comfortable committing, the barrier to entry is minimal. All you need is 2 maintainers to agree to your addition and its merged in. So get on it!
Edit - docker's great. if I were an investor, how would you guys monetize it? prof svcs, support? I could see folks paying for a dashboard, cloud controller w/ api and an easy-to-deploy openstack-like setup.
Basically - investing more in open source, investing in the docker platform, investing in commercial capabilities
So as you can imagine, we're meeting with a ton of companies and hiring really good people who want to be part of something pretty amazing.
It's common and easy to mount your host FS into the container, putting mutating data where you can take full advantage of the superior architecture and ops capabilities of whichever FS you prefer.
The images' internal AUFS/BTRFS layers are then only for keeping your binaries-at-rest and static configuration straight. They may as well be in highly indexed ZIP files, for all you care.
Btrfs is the response to Sun picking an incompatible license; if that obstacle were removed, ZFS might get more interesting for a lot of people.
Has btrfs jumped ahead of zfs in ways I haven't heard about?
Edit - this is my first search result:
This about settles the question for me. Assuming that the implication that btrfs performs otherwise holds true.
And from what I can tell, clone operations are also handled atomically.
Though I kinda wonder what exactly is meant by atomic writes.
I might have to dig into it a bit more.
Testing will come with time.
With raidz, sure, it's great, but the vdevs being immutable is really rather annoying. The way btrfs handles multi-device stuff is significantly better: replication is not between 2 devices but closer to the file (it allocates a chunk of space and decides where to put the other replica in the pool).
Though I wish the erasure coding stuff would land faster.
Btrfs has had send/receive for a while.
I haven't needed to dig into the btrfs man pages yet, so I can't comment on how accurate this is.
Btrfs also uses barriers.
Log devices and cache devices are awesome. I hope btrfs adds them.
The block device thing is a limitation of btrfs and annoys me, though I've slowly moved to just having files. (In anything largish I would probably be moving to a distributed FS anyway.)
The sharing stuff is good, but I think that's a tooling issue, not an FS issue.
Btrfs has out-of-band dedup, allowing you to run periodic dedup without the memory penalty of live dedup (though costing disk).
* ZFS is a RAID manager.
* ZFS is also a filesystem.
* Writes are handled in transaction groups (TXGs).
* Every transaction group is written atomically.
* ZFS keeps a revision history of the past 128 transactions written to disk.
* ZFS is a Copy on Write filesystem.
* As such, due to the previous 2 features, snapshots are free.
* Snapshots are first class, read-only filesystems.
* Snapshots can be upgraded to read-write clones.
* Snapshots can be sent and received to other locations.
* ZFS uses block-level deduplication.
* ZFS supports transparent compression.
* Every metadata and block data is checksummed with SHA256 by default.
* Other checksum algorithms are supported.
* ZFS uses a "slab allocator" to minimize fragmentation.
* ZFS implements an "intent log" for synchronous writes.
* The intent log can be migrated to a fast SSD or NVRAM drive.
* ZFS uses advanced caching, combining MRU and MFU caches (the ARC).
* A secondary cache (outside of RAM) can be installed on fast SSDs.
* ZFS uses dynamic striping with its RAID arrays.
* ZFS supports triple parity RAID.
* ZFS autoheals bit rot when a block does not match its checksum, if the pool is redundant.
* ZFS fully supports advanced format disks (4k blocks and beyond).
* In fact, block sizes are dynamic from 512 bytes to 128K (or 1M in the proprietary ZFS).
* In the proprietary release of ZFS, native encryption is supported.
* In the Free Software release of ZFS, "feature flags" have been introduced to add on "plugins" without changing the core of the filesystem.
* ZFS supports native NFS, allowing the mount to be available before the export.
* ZFS supports native SMB for the same reason.
* ZFS supports native iSCSI, also for the same reason.
* ZFS can create static sized block devices called "ZVOLS".
* ZFS pools can be exported and imported.
* ZFS "scrubs" data to find blocks that do not match their checksum.
* The Free Software release of ZFS is supported on GNU/Linux, OpenIndiana, SmartOS, FreeBSD, and many other operating systems.
* Administration of ZFS is done via 3 commands: zpool(8), zfs(8) and zdb(8).
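To make that three-command claim concrete, here is a hedged sampler of the admin surface (standard zpool/zfs invocations; they need root and real disks, so they are shown commented, with "tank" as the usual example pool name and the device names as placeholders):

```shell
POOL=tank   # conventional example pool name
#   zpool create  $POOL mirror sda sdb       # pool + RAID in one step
#   zpool add     $POOL log   nvme0n1        # put the intent log on fast media
#   zpool add     $POOL cache nvme0n2        # secondary (L2ARC) cache device
#   zfs create    $POOL/home                 # a filesystem inside the pool
#   zfs snapshot  $POOL/home@monday          # CoW snapshot, effectively free
#   zfs clone     $POOL/home@monday $POOL/scratch   # read-write clone
#   zfs send      $POOL/home@monday | ssh backup zfs receive backup/home
#   zpool scrub   $POOL                      # re-verify every checksum
echo "pool, snapshot, clone, send/receive, scrub: all via zpool/zfs on $POOL"
```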
L2ARC and ZIL can each have their own volume configuration (mirror, etc.), for example using different types of SSDs for each. http://forums.freenas.org/index.php?threads/zfs-and-ssd-cach...
zfs send & receive ... Send snapshots around like a fancy SAN.
raidz (N+1 - like raid5)
raidz2 (N+2 - like raid6)
It's also way faster and cheaper to put together boxes from commodity enterprise server hardware, making hardware raid cards basically expensive shelf dust catchers along with overpriced SANs and NASes.
(Extra shout out for iXsystems, not because they use lots of Python, but because of massive awesomeness supporting FreeBSD and FreeNAS. Also their parties put Defcon afterparties to shame.)
Conclusion: Full ZFS is often better than a SAN, NAS and/or hardware solutions. Also protip: Direct attached is way, way faster than 10 GbE, FC or IB, especially if images are directly available to compute nodes.
If anyone else is using Boxen, I packaged up a quick Puppet module to get up and running with Docker on OS X: https://github.com/morgante/puppet-docker
> Docker is an open-source engine that automates the deployment of any application as a lightweight, portable, self-sufficient container that will run virtually anywhere.
Like an internet browser for executables? I don't understand how this can be useful...
Docker is a mechanism to bundle an application together inside a container (think VM instance) in a way that makes it easier to distribute.
Say you have a python/rails/nodejs/c++/whatever app. Sometimes getting all the dependencies on the system is cumbersome and hard to manage. This is true both for developers and for the people deploying these apps. I can't count how many times I've had a Python app fail to build on a new box because of some C extension and a forgotten package on CentOS, Ubuntu, Debian, etc.
Docker lets me do all of this once for my app, with a Docker image, and now when I want to deploy it, all the system needs is Docker installed. This means on my laptop, on our staging server, and on our production server, all they need is Docker, and I will have the exact same environment in each place; deploying the app is exactly the same everywhere.
There's a ton of other bonuses, like each container's dependencies and processes being isolated from the others. I also get a ton of Docker features that let containers communicate with each other and set up service discovery between them (e.g. your database container can expose information to the app container via environment variables).
Tons of other good reasons too though, you should check it out.
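A hedged sketch of that lifecycle (the image and app names are made up; flag spelling varied across early releases, e.g. 0.8-era docker used single-dash long options like -name, so the daemon-dependent commands are shown commented):

```shell
APP_PORT=80   # where the app gets published on the host
# build the image once, from the Dockerfile in your repo:
#   docker build -t myapp .
# run a database container, then link the app container to it:
#   docker run -d -name db redis
#   docker run -d -p $APP_PORT:3000 -link db:db myapp
# inside myapp the link shows up as environment variables, e.g.
#   DB_PORT_6379_TCP_ADDR and DB_PORT_6379_TCP_PORT
echo "myapp published on :$APP_PORT, discovering db via link env vars"
```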
An example of where it's useful, for me, is in system integration tests. Unit tests are designed to run without changing the machine they run on, but for system integration you need to build, install, configure and run a system, so you really need something like a VM. Docker gives you that isolation, but with lower overheads.