Compose has a lot of users, and hundreds of thousands of Compose files on GitHub, so we wanted to open up development as much as possible.
(I work at Docker)
Is there any "supported deployment" for docker? I had the impression that swarm has been sidelined, and docker-compose was discouraged. Is there a path to deployment on kubernetes that uses only compose? (I have a hard time seeing how that'll work with ingress, and "compose" well with other kubernetes deployments?)
We have some on-prem stuff that would work on single-node swarm - but I've felt that compose has been in a bit of limbo this past year or so... Is some kind of production-style deployment coming back?
There are quite a few ways now, and we will be helping make them better. These include Kompose (the maintainer, Hang Yan, is involved with Compose spec now) which converts to Kube Yaml, some work on Compose-Helm integrations that is ongoing, Amazon ECS has Compose support, compose-on-kubernetes (we think this needs rewriting to use a CRD again, after having changed it to use API extensions), and some other projects which are being worked on that I have heard about.
It is also an open source project, so even if the corporate overlords abandon it, the code could live on!
An option like swarm fits a certain on-prem scenario nicely (e.g. what gitlab now does with its horrible (working, but still horrible) omnibus installer would be a good fit for swarm).
I never did understand why docker-compose wasn't treated "better" by docker - I guess they focus on non-linux devs? Docker without an orchestrator is a bit like a daemon without a supervisor - if everything fits in a single container - why not just run it as a Linux process?
And compose was a perfectly OK orchestrator - it did the minimal, sensible things you need to compose a couple of containers. But it felt scary that you never knew what the upstream intentions were.
It might be a VM either way for Mac and Windows, but Windows developers are likely to always be running a VM with Docker integration soon due to WSL 2, and on Linux it's still an incredible productivity multiplier that beats the pants off everything else when you have dependency issues. Only MacOS is left behind with the worst developer experience out of the three.
What makes you think that? Only difference I can think of is use of hemorrhaging edge kernel primitives in your application. Even networking is handled by docker-compose and you don't have to worry about port collision.
This was the first time I ever heard anything of the sort. Docker has been not only solid but by far the most performant system available, and running stuff on VMs always felt like a half-baked solution in comparison.
Can you shed some light on what led you to form your opinion?
It was even very difficult to set up a test cluster on my MacBook Pro to try out multi-node support (you have to find a good distribution, something that supports Mac for multi-node, etc.). Whereas for Swarm, it was super easy.
I would argue that Docker Swarm is a wonderful orchestration system for all practical real-world applications, with the notable exception of really global and gigantic multi-region deployments.
Docker Swarm works superbly out of the box and it's trivial to set up a multi-node heterogeneous cluster of COTS hardware and/or VM instances.
Higher level abstractions can work and need to start somewhere. It really depends what tools are developed.
If you've been diligent in striving for stateless containers, that will help you - but pretty much everything else in your compose file is useless.
Docker-compose gets you the "pod" level in k8s, but doesn't really help with ingress/services etc. So no one can reach your app, certainly not via a load balancer that terminates SSL for you.
This is no longer a game of "beating the nginx container until it gives up and does what you need it to".
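To make the gap concrete: this is roughly the kind of Ingress object you still end up writing by hand on Kubernetes, and nothing in a compose file maps to it (a sketch only; the myapp name, host, secret and port are made up):

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: myapp                       # hypothetical app name
    spec:
      tls:
        - hosts: [myapp.example.com]
          secretName: myapp-tls         # cert lives in a Secret, SSL terminated here
      rules:
        - host: myapp.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: myapp         # the Service sitting in front of your pods
                    port:
                      number: 8000

And that's on top of the Service and Deployment behind it, which compose doesn't express either.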
See https://github.com/compose-spec/compose-spec/issues/12 and also https://github.com/compose-spec/compose-spec/issues/13 for making it feature-driven rather than version-number driven.
But just this past week - after a ton of pain previously trying to get a working GeoDjango stack running locally on my Mac for experimentation - I discovered a docker-compose file that gave me everything I wanted in about 5 minutes. A sample app. Django, nginx, redis, PostGIS and all the complex dependencies - all working together. I literally typed "docker-compose up" and I had a working GIS-enabled web app running on localhost. And the fact that the deployment to staging is hopefully just as simple makes me smile.
Of course - production is a different story as I then have to worry that all the magic pieces aren't full of backdoors and security holes but I guess that's a job that needs doing anyway.
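For reference, a stack like that is roughly this much compose YAML (a sketch of the shape, not the actual file I found; the images, gunicorn command and env vars are illustrative):

    version: "3"
    services:
      db:
        image: postgis/postgis:12-3.0    # Postgres with the PostGIS extension baked in
        environment:
          POSTGRES_PASSWORD: example
        volumes:
          - pgdata:/var/lib/postgresql/data
      redis:
        image: redis:5
      web:
        build: .                         # the Django app
        command: gunicorn myproject.wsgi:application --bind 0.0.0.0:8000
        depends_on:
          - db
          - redis
      nginx:
        image: nginx:1.17                # proxy config pointing at web omitted here
        ports:
          - "80:80"                      # only nginx is published on localhost
        depends_on:
          - web
    volumes:
      pgdata: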
I couldn't imagine deploying a Python application without Docker nowadays.
But the containerization aspect gives you the option to spin up and destroy services without worrying about polluting the host you're on.
In my experience only the simplest applications actually don't need anything outside of vanilla Python pip. Suddenly the client wants PDF reports and you're setting up e.g. some TeX-based printing system with a bunch of host packages needed. Only containers give you the peace of mind that all dependencies, current and future, can be described as part of the codebase.
Couldn't the same thing be done via the package manager and an RPM spec or deb file where all the necessary dependencies are listed and installed as part of the package? It could be done on a VM, or on a machine by keeping track of what dependencies are installed when installing the application and handling uninstallation by removing all newly installed dependencies along with the application.
The package manager can handle removing pretty much any file it installs when uninstalling, so the host really doesn't get "polluted".
> or go back to VM overhead per application
In a development environment hosted on a VM, several applications can be installed on the VM (rather than having one per VM) to reduce overhead. Then testing and making code changes could be done by running an editable install (pip install -e) and modifying code in the working directory, or by making the change, repackaging, reinstalling, and restarting the daemon.
With a container, at least in my experience, you need to re-build it each time you make a change to the code, which actually takes longer than modifying an editable install or re-building the wheel/RPM and reinstalling it.
In any case, the point I was trying to make is that the development cycle with containers, in my experience, is slower because you have to go through the build step every time you make a change. For an interpreted language like Python, that shouldn't be necessary until close to the end, where you test a fresh build before submitting the changes for review.
I literally wasted nearly a day and filled up my drive with compilation artifacts last time I tried getting GeoDjango working. All those binary dependencies, subtle OS config changes and custom Postgres extensions can be a pain.
So given my need for local deploy and remote deploy it's hard to think of a better solution.
I usually spin up a local lxc container and point the ansible scripts at it using a dev inventory file. Mount the source code from the host filesystem onto the container and I'm ready to go.
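The dev inventory is nothing special; a sketch of the shape (the host name, address, user and path are made up, and the container is reached over SSH like any other host):

    # inventories/dev.yml (hypothetical path)
    all:
      hosts:
        app-dev:
          ansible_host: 10.0.3.120        # the local lxc container's IP
          ansible_user: ubuntu
          ansible_python_interpreter: /usr/bin/python3

Then it's the same playbooks as production, just run with -i pointing at this file.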
I'm sure there's a historical reason for it not being `docker compose ...`, but it sounds like it's a good time to promote it.
Of course that was six years ago, so it is a bit surprising they haven't done a deeper integration since then.
Apparently Apple came very close to using ZFS too, instead of building APFS. Oh what could have been.
We're currently using a mix of docker-compose for the service and native development for the main application (requiring separate instructions for win, mac, and linux). I'd love to be able to transition this to a simpler all-docker-compose setup.
Okay, that sounds confusing, I know. We use Vagrant, because you can put every kind of ancient incantations into the Vagrantfile to check for whatever misconfiguration on the host, and so on. Plus you don't have to wrap your app in Docker this way, so debugging becomes a bit easier.
Basically we have a fair bit of Ruby in the Vagrantfile setting up a dev env on the host (scoop + pip + poetry) and provision files setting up the env inside the VM. (Let's say a DB + pip + poetry + uWSGI.) This is necessary, because for code completion you want the dependencies accessible for your IDE. (Yes, PyCharm Ultimate Pro Gold Edition has a way to use remote interpreters, but then you need to guarantee that for every dev too, and it's still not as fast as just having the files on the host in a venv.)
The VBoxSF shared filesystem has a lot of quirks, so we minimize the shared stuff. (We tried NFS and CIFS/SMB too. Both were amazing disasters.)
For NodeJS stuff we only have the runtime dependencies in the VM (database, other backend services), but you have to be careful to script everything in a cross-platform way in package.json (so basically don't script anything there; start the NodeJS process as soon as you can, and work inside that).
And the provision scripts are simple, because they just set up docker containers. Usually without compose, because meh, it's easier to just type a few lines of Bash to rm -f the container and run a new one, and maintaining a useful compose file is unnecessary at that point.
I just wish there was some SaaS platform which allows you to upload a docker-compose file and run the containers for you. That way, you don’t have to touch the underlying OS.
v2 even allows you to create the underlying networking resources (VPC), provision an application load balancer, etc.
I know their actual configurations aren't equivalent, but they are similar (Stack supports config options for distribution, etc.).
You have this problem where you want to set up infrastructure in an abstract way, so you come up with a way to write some infrastructure requirements and now you have a program that does all the actions necessary to take your setup from wherever it is, to what you specified.
Then you have multiple of those systems, and it gets confusing. So you come up with a generic way to specify the abstractions that can be sort of compiled into any of the flavors, so that you can set up infrastructure using any of the sub-abstractions available, instead of only a specific one.
This has the smell of a FactoryFactoryFactory sort of thing. Like, if this whole cottage industry were my architecture project I would be saying to myself "This is too much, a rickety, foot-gun-laden tower, destined to crumble. I've obviously chosen a poor abstraction, lemme back way the hell up."
Am I just totally wrong here? I feel like I'm taking crazy pills.
I’ve had similarly adverse reactions to the seemingly “unnecessary” complexity of many things: docker, webpack, autotools, react, AWS, just to name a few.
But I’ve found that, upon closer inspection, the software is usually complicated because the problem domain is complicated. Further, as an inviolable rule, the most popular tools for any problem are the ones that don’t make you rewrite all your code; popular tools build on top of the mess we already have. This can give the illusion that the tool is somehow responsible for all the preexisting complexity. Tools which attempt a “clean slate” approach are mostly doomed for obscurity.
In the case of docker-compose, I can say after years of resistance, I tried docker compose and was an immediate convert. Setting up, documenting, and sharing development environments has never been simpler.
I work in a shop where various developers work on Linux, OS X and Windows (useful for dogfooding - we support deploying to multiple platforms), and there Docker Compose has ironically turned out to be our chief source of cross-platform headaches, due to annoying differences in how it works on each host platform. Particularly with the networking subsystem.
I've seen the product they are selling, I've watched a docker command build me a multi-part infrastructure, automatically source databases and assets from afar, and do it all quickly. But then it failed the second time I tried it. I've not found it to be a reliable tool at all.
Which is the whole point of it. If it's not, it's just a bad abstraction. Docker-compose itself becomes the only variable.
Since moving to Docker, we now have more cross-platform problems than we used to.
We continue to use it anyway, because there are some other benefits that we feel outweigh shortcomings like these. But still, it's annoying.
Local development also sucks. The Docker engine is essentially unusable on my colleague's Mac, consuming 100%+ of all available CPU while sitting idle. On my Linux box, either Docker breaks my networking, or IT firewall rules break Docker's networking. It's even worse with WFH, because our VPN is incompatible with Docker. Local dev just happens in an Anaconda env instead. So what's the point?
VMs have none of these problems.
Docker is the leakiest abstraction I've come across yet.
- How would EC2s compare to Fargate? I encountered situations where, running the numbers, I much preferred having my own setup with images ready to deploy + adapting my EC2 instances to the task, instead of dealing with Fargate's convoluted restrictions and definitions
- Has your colleague tried increasing the disk size given to Docker? I have seen on a few MacBooks that this made a world of difference for CPU usage, and I thought I'd pass on the information in case it hasn't already been tried
Particularly, as a species, we discount the future heavily, so as developers we do not accurately estimate the variance introduced by something that works by magic until it doesn't. So on some unnamed day in the future when you had other plans, you will be completely blocked while having to learn a bunch of things we were avoiding learning in the first place.
Yes. I think so.
docker-compose is just a tool to run multiple containers together. It is extremely simple, readable and straightforward (if you're familiar with docker). It is not an infrastructure abstraction at all.
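To illustrate how little there is to it, a complete two-service setup is about this much YAML (service names and images here are only an example):

    version: "3"
    services:
      web:
        build: .
        ports:
          - "8000:8000"     # host:container; compose wires up the network for you
        depends_on:
          - redis
      redis:
        image: redis:5

docker-compose up starts both on their own network, with redis reachable from web by its service name.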
The other part of this problem, which isn't applicable to other things generally, is that this is a fairly unique problem with cgroups. By design cgroups don't have a singular implementation or solution, so there are a lot of "competing" solutions. I think some amount of unix philosophy is at play here - do one thing (or really, fewer things) - and that's why it seems insane how many systems you have to hook up together to make the thing work the way it should.
I'm fairly devops centric, but I'd still take this with a grain of salt b/c containers aren't my area of expertise.
Oops. Now Nginx is still serving the old version through its cache. So you need to clear the cache, restart, shut this down, reconfigure that, etc... After a while this gets tiring.
Docker is not the perfect solution (one can wonder if it's actually a solution) but it's a symptom of the real issue rather than the issue itself.
You should try it before you knock it. There are a lot of flavors but the abstractions of any single flavor isn't that deep.
Containers themselves aren't even an abstraction; they're just namespaced apps.
1. Writing all the YAML to deploy an application into Kubernetes is a lot of work. If one does everything fully filled out it's pretty normal to have over a thousand lines of YAML... or much more. With lots of repeating yourself. Tools built from a simpler spec that can "compile" to that would be so great. Like writing in a higher level language instead of machine code or assembly.
2. There are a variety of platforms now. I first thought of Kubernetes vs Nomad. But, Kubernetes using Istio is a different platform from Kubernetes where applications use Ingress. There is more diversity showing up. Being able to define an application in one spec and "compile" for different situations would be useful.
Just my 2 cents. As a Kubernetes user I'm happy to see the higher level tool bar moving. It's desperately needed.
This. After almost a year working with kubernetes I came to the same conclusion. I am/was studying for the kubernetes certification (cka) but in some way it's useless.
I mean, it would prove basic proficiency, but the whole ecosystem is so varied that wouldn't necessarily mean actual and practical proficiency.
It would probably help passing through vogon recruiters but a technical interviewer that knows his way around kubernetes wouldn't necessarily buy that.
Getting into Kubernetes, having Kompose (https://kompose.io/) generate Kubernetes specifications (and Helm charts) from Docker Compose files was a godsend.
1. Bring back extends (see the sketch after this list)
2. Specify services that are only intended to be run (i.e. invoked with docker-compose run, not started by docker-compose up)
Maybe #2 is possible now and I just don't know it?
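For anyone who never used it, extends in the old v2.x file format looked roughly like this (file and service names here are made up) - one service inheriting its base definition from another file:

    # docker-compose.yml (file format 2.1)
    version: "2.1"
    services:
      web:
        extends:
          file: common-services.yml   # shared base definitions
          service: app                # which service in that file to inherit from
        environment:
          - DEBUG=1                   # local additions layered on top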
I wrote a wrapper at a previous job around docker-compose that made it easy to define aliases within the compose file for complete run commands... e.g. to run migrate in your Rails container.
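The wrapper itself was internal, but the idea was just to stash the aliases in an x- extension field (which compose ignores) and have the wrapper expand them into docker-compose run commands. A hypothetical sketch:

    # docker-compose.yml - keys prefixed with x- are ignored by compose
    x-aliases:
      migrate: web bundle exec rails db:migrate
      console: web bundle exec rails console

    # the wrapper turns "mytool migrate" into:
    #   docker-compose run --rm web bundle exec rails db:migrate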
(Disclaimer, I maintain https://composerize.com/, hoping this helps us track parity)
That said, I don't personally believe there exists a single specification that can work (optimally) for both local-dev and production deployments.
Disclaimer: ex-Docker engineer, still an active maintainer on the main docker project.
What's going to happen is, you (or the dev) will fill this thing out to the best of your ability. Then someone's going to run it in one env; it won't work, but Ops will modify it to make it work. So then you'll commit that as your one Compose file. Then someone will try to run it in a different env, and it won't work; they'll fix it to make it run. Now you have two Compose files to maintain, with 3 different use cases. So to make it easier to manage, you go back to one for dev, and one for each env. All this because the different components could not be defined independently and loosely coupled, because the spec (and tools) don't support that.
I like Compose for what it's designed for: let a dev mock something up locally. But it's going to become a pain in any environment that scales and will need a different solution.
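To be fair, compose does have a stock answer for the per-env sprawl: a base file plus override files merged with -f, which is exactly the "one for dev, one for each env" shape described above. A sketch (the image and settings are illustrative):

    # docker-compose.yml - the shared base
    version: "3"
    services:
      web:
        image: myapp:latest        # hypothetical image
        ports:
          - "8000:8000"

    # docker-compose.prod.yml - merged on top with:
    #   docker-compose -f docker-compose.yml -f docker-compose.prod.yml up
    version: "3"
    services:
      web:
        environment:
          - DEBUG=0
        deploy:
          replicas: 3              # used when deploying as a swarm stack

It doesn't solve the loose-coupling complaint, but it keeps the divergence in small overlays rather than full copies.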
Writing a spec is a good first step! If work is underway to make this a shared multi-service dev environment that works across Kubernetes, ECS, etc, that would be even better.
When I read the spec, there are ominous words about partial implementations. That makes me nervous. But I'm cautiously optimistic.
Swarm is a (sadly) soon-to-be-abandoned way to spin up a cluster of multiple hosts with Docker daemons; don't waste your time on it. The comparable solutions are AWS ECS or [to a lesser extent] HashiCorp Nomad.
Docker Compose is a console app that takes a config file and starts multiple Docker containers, creates volumes, and configures Docker networks - on a single Docker instance, so it's only really useful for testing stuff locally that requires multiple containers running at once.
It seems this inevitable abandonment has been abandoned as of Feb 24, 2020...
Basically, you may want to use swarm (or k8s or anything else) in order to orchestrate your containers across a multi-machine cluster and compose when defining a set of services running on a single host.
Yes, it focuses a lot on shared resources, but the negative space defined by where resources are not shared certainly informs any human-defined networking topology. I'm not so sure a tool couldn't make a pretty good stab at doing the same thing.
- docker - it is just a packaging mechanism for your app and a deployment unit if you use Kubernetes, ECS or similar
- docker-compose - if you have an app that spreads over multiple docker containers (i.e. you're doing micro services) it allows you to spin all of them with a single command allowing these components to be interconnected. It is essentially your local testing environment.
- swarm - as far as I know it is a dead project; it was supposed to compete with Kubernetes and similar, but it failed
So unless there genuinely is a commitment to having each orchestration engine unreservedly, guaranteed, take any definition possible in docker-compose and deploy it, then there is no value to be gained here. None. Without that guarantee, for every system you attempt to deploy a compose file to, you must DEEP DIVE into its internals to understand its limitations and costs and tradeoffs.
This gigantic massive timesink of a process is something I have always had to do. And I always discover all kinds of things that would have absolutely blocked practical deployments. This is the real thing that's bothering me, not that k8s and azure and docker all can sorta read some subset of a configuration. I don't care at all about that. It's not useful.
I have a bunch of apps. Some of them talk to each other, some do not. In general, if I deploy 5 copies of something, you should assume I mean "never on the same box" by default. (I'm looking at you, Kubernetes. WITAF.)
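For reference, getting "never on the same box" out of Kubernetes today means spelling it out yourself in every pod template; roughly this (the app label is made up):

    # inside the Deployment's pod template
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: myapp                           # hypothetical label
            topologyKey: kubernetes.io/hostname      # hard rule: one replica per node

Nothing infers it from "5 copies"; you opt in, per workload.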
You could, as a vendor, do a lot of sophisticated work on the spanning tree to reduce the amount of network traffic versus loopback traffic. You might try to spread out unshared volumes across as many motherboards as possible. You could differentiate on how to most efficiently and/or stably migrate from previous config to new config, or vice versa. You could do a bunch of knapsack work, including (and this is a pet peeve of mine) pre-scaling for cyclical traffic patterns.
If you've ever looked at the Nest thermostat, one of several defining features was that it figures out the thermal lag in your heating and cooling system and it can make the house be the correct temperature at a particular time, rather than waiting until the appointed time to do anything. If a hockey puck on my wall can do shit like this then why doesn't every cloud provider do this day 1?
Tack onto this some capacity planning infographics, and a system to schedule bulk, low priority operations around resource contention, and I could probably help get you a meeting with my last boss, my current boss, and probably at least the one after that.