A bit of context missing from the article. We are a small shop with a few hundred servers. At the core, we're running a financial system moving around millions of dollars per day (or billions per year).
It's fair to say that we have higher expectations than average and we take production issues rather (too?) seriously.
I updated and fixed a few points mentioned in the comments. (Notably: CoreOS).
Overall, it's "normal" that you didn't experience all of these issues if you're not using docker at scale in production and/or if you didn't use it for long.
I'd like to point out that these are issues and workarounds happening over a period of [more than] a year, summarized all together in a 10-minute read. It does amplify the dramatic and painful aspect.
Anyway, the issues from the past are already in the past. The most important section is the Roadmap. That's what you need to know to run Docker (or use auto scaling groups instead).
Did you ever wonder how much traffic a site can get from Hacker News?
Here's the number after 10 hours on the front page: http://imgur.com/1AJcYij
I think it's the responsibility of Docker users to understand the technologies they are using.
Docker has got into the bad habit of wrapping open source Linux technologies and promoting them in a way that makes it feel like Docker invented it. They did it to LXC and they are doing it to aufs and overlayfs. The HN community is far too vested in Docker to offer any real scrutiny and is very much a part of this hijack.
What is a Docker overlayfs driver? How is simply mounting a ready-made overlayfs filesystem already present in the kernel, or aufs, a driver? These terms not only mislead but prevent recognition of the work of the authors of overlayfs and aufs.
They also require scrutiny, as layered filesystems have tons of issues, and the only way these can be resolved is by engaging with the developers, who most Docker users don't even know about. Docker can't solve these issues, only work around them.
We build all our images on top of RHEL containers that have undergone hardening to meet internal + regulatory compliance. The end result is that to some extent we can say to developers "don't worry about getting your applications production ready right away, just write reasonably stable code quickly and we'll strip out all the stuff we don't trust you to do and handle it at the infrastructure level".
End result is that we can start to eliminate some of the unsuitable uses for traditional middleware systems like our DataPower infrastructure that, while being great for specific use cases, is usually too difficult to work with for your average front end developer who doesn't give a damn about SOAP headers and broker clusters.
As far as our architecture leads are concerned, and I'm inclined to agree, docker is a great packaging format for developer outputs because it puts everything you need to run an application naked (i.e. no complex logging, HA, networking etc.) into a single versionable, reviewable, very much disposable asset. But it is not, and should not be, a replacement for proper systems of record that require any measure of stability.
That said, there are quite a few mainframes that aren't going to be replaced any time soon, by either of OpenShift or Cloud Foundry.
So far, this has been stable for us, and based on your post it sounds like it will continue to be a viable strategy as we move to kernel 3.x or 4.x.
Maybe you had trouble, but I think that those that aren't using containers need to be aware that many are using them successfully. Use of containers vs. VMs/dedicated servers can also result in reduced energy usage and can be less expensive.
I think people get hung up on what containers aren't and where they fail, rather than where they excel.
The CDN that is actually pushing the traffic runs FreeBSD and would be using, if anything, jails.
On 28 Oct 2016 https://twitter.com/aspyker/status/792137233264418820 mentions "Lots of changes in the big cloud providers and container related leadership as of late."
Containers are becoming a big thing in the enterprise, which means they are here to stay.
Rackspace CDN-enables containers: https://support.rackspace.com/how-to/differences-between-rac...
For what it's worth, I agree with you that a lot of the value of Container Engine comes from having a team ensure the bits all work together. That should be (one of) the positives of any managed service, but I think the Container Engine team has done a particularly good job in a fast moving and often bumpy space.
Disclosure: I work on Google Cloud ;).
We have stuff that must stay on AWS, too dangerous to make a move now :(
We'll ignore the "switch to auto scaling" bit. Teammates have already started testing CoreOS & Kubernetes (on AWS) while I was writing the article. We'll figure it out and get it in production soon, hopefully.
We have a subsidiary which has no lock-in on AWS and could use an infra refresh. They already have Google accounts and use one or two products. They'll be moved to GCE in the coming months, hopefully.
Why drop autoscaling though? Note that you can use Autoscaling Groups with Kubernetes on EC2: https://github.com/kubernetes/contrib/blob/master/cluster-au... and the team would happily take bug reports (either as GH Issues or just yelling at them on slack/IRC).
(Sadly http://kubernetes.io/docs/getting-started-guides/aws/ doesn't point to this...)
I suppose that auto scaling will be back on the table later, to auto scale kubernetes instances. Maybe.
As others (including downthread) have pointed out, Google definitely has a bad overall support reputation, but that's because nearly every (traditional) Google service is free and lacks this structured support model. Just like Cloud, if you're even a moderately large Ads customer, you get pretty good support!
Disclosure: I work on Google Cloud and want your business ;).
One thing often not considered ("Why not just have someone keep it running?") is that you really have to keep a team of people on it (in case there's a CVE or something) or at least familiar enough with the code to fix it. There's also the double standard of "you haven't added new features in forever!" (Maybe this wouldn't have applied to Reader though).
But, I agree if we could have kept it on life support somehow, we wouldn't have (as many) people asking "What if they shut down Cloud?!?". Conveniently, as people get a sense of how serious Google Cloud is about this business, even on HN I'm seeing this less.
tl;dr it makes money, will make a lot more, they're going to support it.
- Docker do not own CoreOS and did not create the distro
- Docker did not write OverlayFS
- There are issues with storage drivers. Don't use AUFS for the reasons outlined. I've had no issues with Overlay2, but note there is also devicemapper (which Red Hat have poured a lot of time into) and BTRFS. And ZFS now IIRC.
- It's not correct to say Docker was designed "not" to do persistent data. You do need to be aware of what you're doing, and I suggest not running DBs in containers until you have solved the FS issues and have experience with orchestration.
- Kubernetes has been around for a while now and is fairly stable in my view. Swarm isn't rubbish, but the new version is very young and you may run into issues in the short term.
- With microservices, when you are running lots of small containers, you have to expect them to crash now and again. It's just the law of averages. You need to design your application to accept this and deal with it when it happens.
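To make the last point a bit more concrete: one common way to "accept and deal with" a container that crashes now and again is to have callers retry with backoff instead of assuming the peer is always up. A minimal sketch in Python, assuming the failure shows up as a `ConnectionError`; the function name, parameters and defaults are all made up for illustration and not taken from any particular tool:

```python
import random
import time


def call_with_retries(operation, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff.

    All names and defaults here are illustrative assumptions.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            # Double the delay on each attempt, plus jitter so that many
            # clients do not all retry at the same instant.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
```

The `sleep` parameter is only there so the waiting can be swapped out in tests. This only helps for operations that are safe to repeat; whether that holds is up to the application.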
If you hit this many problems with any given tech, I would suggest you should be looking for outside help from someone that has experience in the area.
> It's not correct to say Docker was designed "not" to do persistent data.
Actually...looking at a lot of things "missing from" or "difficult in" Docker can more or less be traced back to Docker's overriding philosophy of container runtimes. Things like not having an init process and optimizing user experience around single "Run" commands.
Having relatively recently done a container retrospective, following the path of:
OpenVZ -> LXC -> Docker -> now (CoreOS, Rancher, LXD, Mesos, etc.)
I can't help but feel as if some of the "limitations" of Docker were more or less designed.
I'm not saying it's a bad thing, being a Python enthusiast I see value in simplicity/having only one way of doing things...but I just wanted to point out that where you draw the line for what is considered "designed" starts to get blurry in areas of Docker's chosen implementation.
Docker was specifically designed to NOT impose a model of ephemeral containers. For example, by default 'docker run' preserves all data.
Might not be the True And Approved Docker Way, but who cares? it works well, and has been in production for a number of years now. No data loss ever.
You know that trick the post mentions about how you have to keep removing images to stop your machines running out of disk space? Yeah, removing images doesn't actually reclaim disk space on devicemapper (!!):
Oh, and devicemapper is the only natively supported storage backend on the AWS official AMI.
How many filesystems has the Linux world created in the last 20 years? Of those how many are rotting piles of tire fire?
Filesystems are hard to get right.
ZFS is the way forward. For cross compatibility, for reliability, for stability, for lots of use cases.
Btrfs is another tire to throw on the burning pile of other filesystems that won't work out in Linux.
Let the down votes begin.
But Linux has some major architectural issues that eventually the Linux faithful will have to admit to.
Observability. Though bless him Brendan Gregg is trying his hardest to help here.
And this is what is so painful for the rest of us to watch or deal with. This inability for the Linux folks to admit these glaring architectural problems. Problems they refuse to look outside their bubble to see how others solved these problems and adopt solutions that have already solved these problems. They just want to continue to bury their heads in NIH soil. And double down on trying to solve these issues, poorly, on their own without any awareness of how others have solved these issues. People outside Linux land just might have the right ideas on how to solve these problems!
But I expect no one in that camp to acknowledge this and down vote away.
Frustration, pain are more appropriate terms from my pov than inflammatory. But ymmv.
It's like "Come on folks! You can do better than this! Please! Just stop! Think! Please. I'm begging you."
I'm not entirely convinced we should settle on ZFS just yet. It's fantastic and quite possibly the best option right now but it has a few limitations:
- The Linux implementation seems to have issues with releasing memory from the ARC back to the system
- The only way to expand a zpool is to add a new vdev. Pools are essentially a RAID0 of vdevs, so if a single vdev fails, your entire pool fails. You can mirror or RAID within a vdev, but the reliability of your entire pool is only as good as that of its weakest vdev. The problem here is that you can't just add 1 or 2 new disks, since adding a 1 or 2 disk vdev would be data-suicide. For smaller servers, this is silly.
BTRFS looked like it was getting there but it proved buggy and unreliable. Personally, I'm waiting for bcachefs: https://www.patreon.com/bcachefs
If instead I ran RAID6, I'd have 80% of the disk available and I could add disks in single disk increments.
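For anyone who wants to check the capacity numbers, here is a rough sketch in Python. It assumes equal-sized disks and ignores filesystem overhead; the 10-disk array used in the note below is my assumption (it is the size at which RAID6 gives 80%), and the function names are made up:

```python
def raid6_usable_fraction(disks):
    """RAID6 spends two disks' worth of capacity on parity."""
    if disks < 4:
        raise ValueError("RAID6 needs at least 4 disks")
    return (disks - 2) / disks


def mirrored_usable_fraction(copies=2):
    """A pool built from n-way mirrors keeps one disk's worth per mirror."""
    if copies < 2:
        raise ValueError("a mirror needs at least 2 disks")
    return 1 / copies
```

With 10 disks, `raid6_usable_fraction(10)` comes out at 0.8, versus 0.5 for a pool of two-way mirrors, which is the gap being complained about.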
I think ZFS makes great sense for businesses that can throw money at disks but for smaller businesses or home servers it's kinda bad.
It was kind of a pain to configure (albeit quite flexible) but it's been pretty nice overall, already survived 1 disk failure and a capacity upgrade (during which I had to resilver after every individual disk upgrade, which was time-consuming, but after the last disk got upgraded, the extra space finally showed up)
Disk is cheap. There's no reason to design like it's not.
Plus, disk may be cheap but servers to house it are not (the kind of servers you'd run in your house, I know you can get cheap SC846 off eBay).
I'm sure it's not an issue in enterprise environments, but I've personally been bit by ZFS's inability to shrink or rebalance pools (even offline) several times now through personal use.
Those sort of use cases need to be handled if ZFS is to become a more general use filesystem.
Alright, the suspense is killing me: what OS is the one that gets it right? I ask this to "the rest of us" as you put it.
I'm not sure what reality you live in where Linux users like, actively deny things like lacking an appropriate COW filesystem with low latency, or things like the fact it took us a decade to get only close to DTrace. Your own mind? 6 month Ubuntu users? Internet forums where you talk to other BSD users and nobody else?
> Let the down votes begin.
Don't worry, I'm sure people will oblige.
However, XFS is perfectly good for storage. I've used it on 15 PB worth of fileservers with no particular nonsense.
I wouldn't touch BTRFS with a barge pole, not so much because of any inherent stability issues (although that is a big factor) but because of the utter nastiness of the tooling.
ZFS is a joy to set up, and it's simple, logical and the man pages tell you useful things.
BTRFS, not so much
1. filesystem is corrupted, once again
2. try to repair it
3. oh right, the repair tool can not replay the journal
4. try to mount it
5. admit after 3 hours of nothing that the journal-replay code triggered on mount actually can really not deal with corruption
6. reboot the server to get the filesystem unstuck
7. rerun repair, this time throwing away the journal
8. look at the empty filesystem with everything in lost+found
9. restore from backup
The team I'm in only runs 1,400 servers, yet this happens regularly.
But mostly it happens because the fileservers don't have UPSes. (The pipeline tools are almost exclusively COW, and backups are tested many times a week.)
I think we paid for support on XFS from either redhat or SGI, but I can't remember, I left that place a year or so ago.
I've never had the balls to run ZFS on large (100 TB+) arrays. Last time I tried, the way slab handling was translated from Solaris caused many problems (but that was more than 4 years ago). Plus the support is a bit odd: you either go with Oracle (fuck that), or one of the OpenZFS lot.
To get the best performance/stability you ideally need to let ZFS do everything, instead of letting the enclosure do the raid 6 (4 * 14 disk raid 6 with 4 spares). This of course is a break from the norm and to be treated with utmost suspicion.
Have you seen the GPFS RAID replacement? That's quite sexy.
I run ZFS at home, because it works, and has bitrot checking.
Just as long as it doesn't get anywhere near capacity.
What a useless comment. If you are running out of capacity, adding disks to a zpool is incredibly, mind-numbingly easy.
"zpool add poolname mirror disk1 disk2"
Every filesystem has performance hit the floor when you consume all available capacity. This is not unique to ZFS; it's, well, reality.
Both get you ZFS and something lighter-weight than VMs. Need Linux? Bite the bullet and use VMs.
So if you have the skill, FreeBSD. Otherwise SmartOS makes a killer setup for containers unmatched by anyone else.
Maybe not so important in the container context, but sooner or later, somewhere you need persistence, and tape offers certain persistence features you pretty much can't get elsewhere, especially at its media price points.
(Plus, if they got this wrong, when it's really not that hard to get right (I've done SCSI at this level before), I wonder what else they have.)
No killing yourself necessary, really ;)
Docker has been actively trying to become the household brand name for containers, so it's difficult for me to give them a pass with your line of reasoning.
> Re Persistent Data: You are probably going to have a tough time if you try to do persistent data. Containers really excel at stateless tasks(or temporary state like caching). Certainly it's doable but for someone early in their adoption the proper toe in the water is probably the stateless business logic not the state layer.
Not a company of good taste in my book.
Any monkey can run a bunch of stateless systems. It's the stateful ones that have the worthwhile problems.
I suggest using something that works for you and your business instead. If the only thing that works is $SOLUTION, but you have these difficulties, then yes by any means look for help.
Edit: Ok, I feel compelled to extend my reply as I've been downvoted without comment, which is terrible.
Here's the thing: You're not forced to use Docker just because it's the next hyped thing. It is great and yes, I use it. However, there is more than one way of doing things and the only reason to choose one technology over another is if it's better for the business.
Anything else is just ignoring reality.
Edit: case in point
> Please resist commenting about being downvoted. It never does any good, and it makes boring reading.
> Please don't bait other users by inviting them to downvote you or announce that you expect to get downvoted.
And since it is such a frequent occurrence, would that not point to the fact that people do enjoy talking about being down voted?
Actually CoreOS produced pretty stable stuff: etcd, fleet. Actually, with fleet/etcd combined with some cgroups I see barely a reason to use Docker. I mean, on clouds it's actually another useless layer, at least for me, maybe not at Google scale.
It seems that they started to use docker while it was changing a lot, I didn't have any issue in the past few months with docker, but I don't use it in production.
2) You are not running it in production. Poster is. All sorts of issues you don't care about on your dev box matter a lot in production.
IOW, your experience can best be summarized by "just trying out Docker in a totally different way and it works great!" Irrelevant to this post.
Not just that, CoreOS is basically made to push systemd-nspawn as a container framework as hard as possible.
Unplanned downtime is the main drawback to both hosting your own OSes and using leading-edge tooling for your critical systems. It doesn't matter what the underlying system is; stuff like this happens. This stuff is complicated. You will find major warts in every new system. Everything will break, often, leaving your users stranded. It takes a very long time for software to mature.
That's why you see engineers with 15+ years of experience patiently waiting for new technologies to mature before putting them in place in our critical infrastructure. Sure, we'll play with them, but putting new tech in production that could easily make your entire system unavailable is too risky. Yes, three year old tech is new.
Most folks don't realise how simple their requirements are. Most folks need to put files on a server and start a process. What they don't need is a whole mass of rickety scaffolding and control plane infrastructure that gets them halfway (and only halfway) to running an in-house PaaS.
So having kicked the tyres on Docker and been appalled by the messy design and dreadful tool quality, I went back to using OS-native packages and isolating our microservices via (gasp) user IDs.
Judging by their product direction I suspect Docker's board want them to challenge VMware, which (rubbing my crystal ball) suggests to me that their future is a bloated and overcomplicated piece of Enterprise, targeted at companies who think you can buy devops.
Meaning that rather than moving a whole OS from hardware to hardware, or spin them up or down as load required, you would spin up or down processes and their runtime requirements from a known functioning image.
Thus removing the overhead of the VMs hardware emulation and the need to run so many kernels.
The response from the VM world to containerization seems to be to push unikernels. Effectively going back to the DOS days where you have just the minimal kernel and userland, all living "happily" in the same (virtual) memory space.
As for performance, if it's critical then get a real server. If you've got decent change management (i.e. ansible, puppet et al. with no manual intervention) then it's no effort.
(seriously network boot, and vlans are your friend here.)
How do you define "DevOps"? Because configuration automation (e.g. cfengine) and deployment automation (e.g. Capistrano) were more or less widely known and (less widely) used before the word (portmanteau) was popularized.
If you change paradigms from the imperative sequence of mutations in a Dockerfile to a declarative specification that produces immutable results and a package dependency graph structure, you get a much better cache, no need for disk image layering, a real GC, etc. For example, the GNU Guix project (purely functional package manager and GNU/Linux distro) has a container implementation in the works named call-with-container. Rather than using overlay file systems, it can just bind mount packages, files, etc. inside the container file system as read-only from the host. Not only is this a much simpler design, but it allows for trivial deduplication of dependencies. Multiple containers running software with overlapping dependency graphs will find that the software is on disk exactly once. Since the results of builds are "pure" and immutable, the software running inside the container cannot damage those shared components. It's nice how some problems can simply disappear when an alternative programming paradigm is used.
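The deduplication claim is easy to picture with a toy dependency graph. The sketch below is plain Python, not Guix code; the package names and the `deps` mapping are invented for illustration. It computes the full dependency closure for each container and shows that the store only needs the union of those closures:

```python
def closure(package, deps):
    """Return the set of packages reachable from `package`, itself included."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(deps.get(current, ()))
    return seen


def store_contents(roots, deps):
    """Packages that must exist on disk to run every root, each stored once."""
    needed = set()
    for root in roots:
        needed |= closure(root, deps)
    return needed


# Hypothetical graph: two containers sharing most of their dependencies.
DEPS = {
    "web": ["python", "openssl"],
    "worker": ["python"],
    "python": ["libc", "openssl"],
    "openssl": ["libc"],
    "libc": [],
}
```

Here `closure("web", DEPS)` and `closure("worker", DEPS)` each contain four packages, but `store_contents(["web", "worker"], DEPS)` contains only five, because `python`, `openssl` and `libc` are shared rather than copied per container.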
Storage-wise, it had LVM2, ZFS, and loopback drivers. Never bothered with overlay, just used a generic clone API to the storage driver to spin up identical VMs before modification. Very easy with snapshot/thin-provisioning capable backends, like LVM2 and ZFS. Loopback just used cp, but because you could have these in memory they could also be very fast (but memory-hungry). Cloud-wise, we'd done a few but I was also iterating an orchestration system based upon internal requirements (high security/availability) and existing, proven solutions (pacemaker/corosync). It was designed for fully repeatable builds, something docker only began to add at a later date.
With Guix, any sub-graph that is shared is naturally deduplicated, because we have a complete and precise dependency graph of the software, all the way down to libc. I find myself playing lots of games with Docker to take the most advantage of its brittle cache in order to reduce build times and share as much as possible. Furthermore, Docker's cache has no knowledge of temporal changes and therefore the cache becomes stale. Guix builds aren't subject to change with time because builds are isolated from the network. Docker needs the network, otherwise nothing would work because it's just a layer on top of an imperative distro's package manager. Docker will happily cache the image resulting from 'RUN apt-get upgrade' forever, but what happens when a security update to a package is released? You won't know about it unless you purge the cache and rebuild. Docker is completely disconnected from the real dependencies of an application, and is therefore fundamentally broken.
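To illustrate the stale-cache complaint, here is a toy model in Python of a build cache that looks a layer up purely by its parent and the instruction text. This is a deliberately simplified sketch, not Docker's actual implementation; the `LayerCache` class and the `run` callback are made up:

```python
import hashlib


class LayerCache:
    """Toy build cache keyed on (parent layer id, instruction text) only."""

    def __init__(self):
        self._layers = {}

    def build(self, parent_id, instruction, run):
        key = hashlib.sha256(f"{parent_id}\n{instruction}".encode()).hexdigest()
        if key not in self._layers:
            # Cache miss: actually execute the instruction and keep the result.
            self._layers[key] = run()
        # On a cache hit the stored result is returned unchanged, even if
        # running the instruction again today would produce something else.
        return self._layers[key]
```

If `run` reads from a package repository that later receives a security update, a second `build` call with the same parent and the same `RUN apt-get upgrade` text still returns the old result. In this model only changing the key (a different parent or different instruction text) or throwing the cache away forces a rebuild.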
Clearly, Docker itself has a few annoyances, like no garbage collection and a poor tagging process, but we're using it on a daily basis, and nobody complains about it (nothing a few bash scripts can't fix).
Finally, the biggest benefits of not just docker but containers in general are consistency and immutability which brings me to the final point I agree with the author: Your database shouldn't be containerized. It's not your pet (from pets vs cattle), it's your BFF.
Nobody writes rant posts when things work, I guess.
2) As disclosed by other comments, Google Container Engine doesn't use Docker. It uses containerization technology made by google for google, with a docker-interface added on top.
That's why you don't have problems with Docker. You're not using Docker at all! ;)
For the record, orchestration in Docker (1.12.x) is actually quite good. Its built-in service discovery and mesh networking make setup a breeze.
There's plenty wrong with a Pod that writes to PV - if your Pod/Container somehow gets corrupted, you're left with a corrupted database. It's unlikely, but as the original article states, Docker is young and things happen. Are you willing to risk it?
So in every case you need to configure replication and master selection. Are you willing to trust the pods to do the right thing without manual intervention on data that is the core of your business? I'm not.
It's certainly had its challenges (different versions of Docker clients causing havoc with different manifest versions, race conditions in the engine etc etc), but this article takes things a little far.
If you take pretty much any new technology like this and make it a base part of your platform, you'll want support from a big vendor. To think otherwise would be naive.
There is definitely a debate to be had about Docker Inc's approach to production/enterprise rollouts though, but as a technology I'd say it's developing pretty much as I'd expect.
I've also seen Docker succeed in production despite the technical challenges. If you don't like the heat...
(1) See my blog: https://medium.com/@zwischenzugs, especially: https://medium.com/zwischenzugs/a-checklist-for-docker-in-th... and https://medium.com/@zwischenzugs/docker-in-the-enterprise-ec...
What combination of Docker version + distro + kernel + filesystem are you running, and how stable is it?
"Many attempts can be found on the internet, none of which works well. There is no API to list images with dates"
http://portainer.io/ seems to be able to do it, docker images lists it... I mean, I don't want to call bullshit without knowing the full story, but...
"So, the docker guys wrote a new filesystem, called overlay"
https://github.com/torvalds/linux/commit/e9be9d5e76e34872f0c... written in 2014 by the Linux kernel team / Linus, but OK cool story
"It affects ALL systems on the planet configured with the docker repository"
OK, this is why people use AWS' container repo, or they use the open source software and maintain their own repo... this happens with any of the public repo services, and it was 7 hours.
"The registry just grows forever"
S3 is a supported backend, highly would advise anyone running their own repo to use it (or have a similar expanding-space story). There's also a config flag for the repo to allow removing images. Obviously wouldn't want to use this if you're hosting a public repo but internally, go for it, it's off by default, seems sane enough.
"Erlang applications and containers don’t go along"
I'm certain that people are running erlang containers successfully.
"Docker is a dangerous liability that could put millions at risk. It is banned from all core systems."
Daily FUD allowance exhausted.
I guess the tone of this article really bugged me because there's obviously a point to be made from this experience that running Docker is more difficult & error prone than it should be. And maybe it wasn't a great fit for this company culture. But this article crossed the "we didn't have a good experience, therefore, it's BS software" line, and frankly, that attitude may very well have been to blame for the lack of success just as much as docker's shortcomings were...
Also there is DCGC from yelp!
I looked at Kubernetes before Rancher but it was hard for me to really understand where to start and how it all fit together. Now that I understand much better what I am doing I should probably take another look at it.
Rancher itself is not an alternative to Kubernetes. It can control a Kubernetes, AWS, Mesos or very simple Cattle cluster. We started with Cattle with a plan to one day use something like Kubernetes, a step at a time, because we kinda felt intimidated (I left that company) by articles like these.
I mean, he's right on the cron part. That's how all the various open source docker cleaners are working.
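For what it's worth, the selection logic such a cron job needs is small. Here is a sketch in Python of the "which images are old enough to delete" step only; it assumes you already have a listing of image ids with creation times from somewhere, and the function name, parameters and sample data are all made up:

```python
from datetime import datetime, timedelta, timezone


def images_to_remove(images, now, max_age_days=30, keep=frozenset()):
    """Return ids of images older than max_age_days, skipping ids in keep.

    `images` is an iterable of (image_id, created) pairs with timezone-aware
    datetimes; how that listing is obtained is left to the caller.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        image_id
        for image_id, created in images
        if created < cutoff and image_id not in keep
    ]


# Example with invented image ids and dates.
NOW = datetime(2016, 11, 1, tzinfo=timezone.utc)
SAMPLE = [
    ("aaa111", datetime(2016, 10, 30, tzinfo=timezone.utc)),
    ("bbb222", datetime(2016, 8, 1, tzinfo=timezone.utc)),
    ("ccc333", datetime(2016, 7, 1, tzinfo=timezone.utc)),
]
STALE = images_to_remove(SAMPLE, NOW, max_age_days=30, keep={"ccc333"})
```

The `keep` set is there so that base images still in use are never selected, however old they are. Actually deleting the selected images, and checking that no container still uses them, is deliberately left out.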
*) Docker could maybe do better here, but tbh - I think documentation and getting the word out about how things work is one of the things Docker Inc does pretty ok.
The worst part of this problem is when you do run out of disk space docker's image/container database often gets corrupted resulting in the great cryptic 'could not find container entity id' error. The only fix I'm aware of is rebuilding all your images or mucking around with the docker sql database. The lack of basic features to manage the problem creates a worse and harder to troubleshoot problem.
It's an easy enough problem to avoid in the first place so it's really not that big of a deal. It hurts docker as a company more than the users in the long run because it makes them look bad. There are a lot of other small things like this with the docker cli tools especially. They aren't deal breakers but they are definitely eyebrow raisers. For example if you mess up your docker run command it usually just fails in strange ways instead of telling you what's wrong with your command.
I'm actually pretty happy with docker overall but they really need to fix some of these basic usability problems.
However, as it turned out, these problems had nothing to do with Docker. Containerization is a great idea, and Docker's approach is sane (one container per service — fine).
The offender was docker-swarm. Like many others, I chose it as the default container management approach (it is written by the same team, so it should work the best, right? Wrong.) Docker-swarm is indeed buggy and not ready for production, or at least it wasn't 10 months ago. And if you use "one container per service" approach, orchestrating large groups of containers is a necessity.
Then I discovered Kubernetes, and became significantly happier. It's not without its own set of quirks, but it's an orders of magnitude smoother experience compared to docker-swarm. And it works with AWS just fine. (I didn't get to use DCOS, but I heard some nice things about it, too).
Tl;dr the source of all "Docker is not ready for production" rants seems to be docker-swarm, at least from my experience. Use Kubernetes or DCOS and everything will be so much better.
> Docker first came in through a web application. At the time, it was an easy way for the developers to package and deploy it. They tried it and adopted it quickly.
Unless you've got a lot of spare time on your hands, it's never a good idea to adopt unfamiliar tools where they're not needed. Stick to your traditional deployment techniques unless you're equipped, and have the time, to take a roller coaster ride.
Docker is young software.
That said, it seems the author did experience many legitimate breaking changes and abandoned projects. It would be great to hear a perspective from someone on the Docker team about how breaking changes are managed, how regressions are tracked, how OS packaging is handled (e.g. the dependencies the author mentioned wrt filesystems, as well as the bad GPG signature), etc.
That's so broad, it's meaningless. Nothing is ever "needed". Given enough time, you could run everything on a stack limited to technology released before 1995. And without "adopting unfamiliar tools", they'll remain unfamiliar forever.
And I say that while thinking Docker does a really crappy job, having gone through so much pain myself. But the whole article shows a lack of understanding and of a sound approach. I wish we could discuss all the issues Docker does have more constructively.
The one thing I do want to respond to is the notion that Docker cannot run stateful apps. Among the stateful apps I have run on Docker in production: elasticsearch, redis, postgresql, mysql, and mongodb. Containers and application state are orthogonal topics. The contained process is reading and writing a file system on persistent media just like all the other processes on the host and that file system doesn't mysteriously disappear when the contained process exits. Naturally stateless apps are simpler. They're simpler to run on VMs or bare metal too. When you need data in a container you do the same thing you do when you need it on a VM or on bare metal: you mount appropriate storage for it.
The main issue with most stateful applications isn't the existence of the state. The main issue is that if you have state then it's probably important, you probably want to make it durable and highly available, you probably want clustering and replication, and now you're into discovery and peering, and that can be an actual issue with Docker, especially for finicky beasts like redis.
I'm excited by containers and all, but this point has always stopped me from going forward. If I were a huge company running bare metal, then yes I'd want to squeeze every last ounce out of my hardware. But if I'm running inside a hypervisor and somebody else is paying the cost, what are the benefits?
As for the article: it comes across as a blend of frustration and real technical challenges. Some good points, though the hostile tone weakens the argument.
I don't think Docker should worry too much about backwards compatibility right now, just power forward until it's got a solid formula. In the meantime, caveat emptor!
However, if you need to run, say, 20 copies of a 50MB application that uses 100MB of memory at runtime in a somewhat isolated environment, you would only provision a machine with 20x100MB+50MB=2050MB plus whatever the OS needs (~100MB?) to keep everything in memory. If you made a VM for each of them, you would need 20x100MB+20x50MB+20xOS overhead=5000MB'ish, roughly 2.3 times as much in this case.
Also, starting up a container is a lot faster than starting a VM. If the image is in the local cache, your application likely starts loading within 500ms. VM startup times are usually measured in minutes.
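To make that comparison easier to replay with your own numbers, here is a minimal sketch of the arithmetic. The function names are made up for illustration, the 100MB OS overhead is the rough guess from the comment above, and it assumes (as the comment does) that containers share a single copy of the application image while every VM carries its own copy plus its own guest OS.

```python
def container_host_mb(copies, runtime_mb, image_mb, os_mb):
    """Memory for one host running `copies` containers.

    Assumption: the application image and the host OS are counted once,
    because all containers share them.
    """
    return copies * runtime_mb + image_mb + os_mb


def vm_fleet_mb(copies, runtime_mb, image_mb, os_mb):
    """Memory for `copies` VMs, each with its own image copy and guest OS."""
    return copies * (runtime_mb + image_mb + os_mb)


# Figures from the comment: 20 copies, 100MB at runtime, 50MB application,
# and a guessed ~100MB of OS overhead.
containers = container_host_mb(20, 100, 50, 100)
vms = vm_fleet_mb(20, 100, 50, 100)
print(containers, vms, round(vms / containers, 2))
```

The ratio is sensitive to the OS overhead figure, so a heavier guest OS widens the gap quickly.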
cf scale -i 100
VMs booted up from images are much slower. And cloning checkpointed VMs is apparently not something AWS or GCP can do.
Disclosure: I work for Pivotal, we're one of the companies who make `cf scale` possible.
Half of the article is various complaints about AUFS, and then Overlay. There are at least 6 storage drivers off the top of my head, and I feel like there are one or two more I am forgetting about. What made you choose AUFS? What were your thoughts on Overlay? Instead of going with the default and complaining, maybe research your options? Even worse though, the filesystem driver is completely beside the point. If you want to store things on the filesystem from a Docker container, you don't use the copy-on-write storage system! You create a volume, which completely bypasses the storage driver. There are even a bunch of different volume drivers for different use cases- say mounting EBS volumes automatically for your container to persist data to? His whole rant about data magically disappearing has nothing to do with the tool, but that he didn't take the time to learn how to use it correctly.
Further, while I am sure he is telling the truth and has machines randomly suffering kernel panics and crashing multiple times per day, it seems hard to believe that it is docker causing this. He is running 31 containers spread across 31 aws hosts... I am running thousands, across hundreds of aws hosts, and have been for a long time, and cannot recall a single kernel panic caused by docker.
I had to laugh at this one, then again I am an old dog.
I could also add the several solutions done by mainframes, or other UNIXes like Tru64.
I had created a bunch of solaris zones and then had the great idea to patch them.. everything broke.
I'm actually using smartos at home now. kvm and debian lx zones. works great. A solid platform with a UX that doesn't make me want to kill myself.
Still implies that it was always good..
It was absolutely terrible in Solaris 10. Maybe it's better now. Maybe with Solaris 11 they fixed it so you didn't have to choose between a full zone that took 5GB or a 400MB sparse zone that would break in fun ways when you tried to do anything.
At the time I was using xen+debian and a 'full' vm for me was about 200MB. Basically base system+sshd+whatever daemon I needed to run. Attempting to do the same thing using zones was a complete disaster.
all solaris and aix was ripped out and replaced with linux.
I'd be more interested in articles that actually tested it running how it was meant to. Or, if you can't, just stop there and say "there's a showstopper bug, it wouldn't work for us" - those are useful articles too, and require much less effort.
About databases, mostly anything¹ you put between them and the hardware will hurt you. VMs are bad too. What I can't imagine is how you would create and destroy database instances at will - I might live in a different reality, but around here you'll always want as few database nodes as possible, and as a consequence, nodes that are as powerful and reliable as possible, which means as few moving parts as possible.
1 - What won't hurt is a disk abstraction layer designed for performance and as expressive as the original physical stuff. This is widely used with the "SAN" name, and I'm willing to bet you can mount one inside a container. But well, on AWS you really won't.
Everything was fine in testing, everything was fine in production... at first.
Fast forward a bit. More applications were put in Docker, with a lot more load in both testing and production. Things go wonky.
If we ran our current systems in Docker/Debian from January, a host would crash every hour and we'd have figured out the instability. Instead the Docker adoption happened slowly, while a few patches (and temporary regressions) were released over the year.
Even worse, it was a fad that adds a whole lot of complexity to the system. Anything that adds complexity always gets a jaded glance from me. Unless there's a very clear and good reason for it, increasing complexity is typically a sign that something is wrong. (I use this rule in my own code and designs too... if I'm building a tower of Babel then I'm not seeing something or making a mistake.)
"Let's take preconfigured Linux system images and treat them like giant statically linked binaries."
Don't get me wrong-- it is a somewhat interesting hack and does have some utility. (and it IS a hack) But the problems it solves are in fact a lot harder than they seem and stem from fundamental 30+ year old issues in OS design and the Unix way of doing things. Our OSes date from the days when a "box" was a very long lived stationary thing that was provisioned and maintained manually, and they're full of things that today are bad design decisions because of this... like sprawling filesystems full of mutable data with no clear mutable/immutable division.
But the hype on this thing was just insane. I've never quite seen anything like it except maybe Java in the 1990s and that had more substance (a popular Smalltalk-ish language). For a long time I've joked that devops conference conversation was sounding like:
"Docker docker Docker docker docker docker Docker docker docker?"
"Docker docker docker."
"Docker docker docker Docker docker."
Luckily it seems to be dying down and a lot of people are realizing it's not a silver bullet. This will probably inoculate a generation of devs and admins against over-hyped fads, but never fear... a new generation of devs and admins are coming in now and are eager to rewrite everything again since this time they'll be able to make a few trivial rearrangements of things that are already there and this time they'll get it right.
So far I've been bitten by the inability to clean up images of certain age.
Another really annoying thing is the inability to tag an image directly in a registry (AFAIK). You need to pull, tag and push back again. Given that images can be GBs in size, you end up with really heavy network traffic for a simple task.
For instance, you may want to run an image for a while on a staging server before re-tagging it as the current stable release.
Per publishable branch (e.g. master to prod, dev to QA, shared integration that isn't ready for dev to secondary QA):
- Publish image to private docker registry with tag in the format of <branch>-latest (e.g. master-latest, dev-latest). Also add labels with git revision hash and CI build number to reconcile what is in the image both in CI and in git repo.
- Capture the digest of the published image and put in .txt file as an artifact from CI.
- Add a tag to the git repo with the build number, treated as version number in the form of v<build-number>-<branch> (e.g. v1-master).
- Maintain list of applications' docker image digests that should be running in each environment. We currently have this automatically determined by pulling the digest artifact from the passing-build digest. We do allow for manual overrides, however.
- We have a CI project that updates the docker swarm with these digests and other settings (replicas, mounts, environment variables). We wrote a small tool that does this similar to where stacks/dab files are going but with more functionality.
With the labels and published digests we can go from build to digest, digest to build, etc.
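The naming scheme in the steps above can be sketched roughly as follows. This is only an illustration of the conventions described, not the poster's actual tooling: the helper names, the label keys, and the registry host in the usage lines are all hypothetical. The only Docker-specific detail relied on is that an image can be referenced by digest as `name@sha256:...`.

```python
def image_tag(branch):
    """Registry tag for the latest build of a branch, e.g. master-latest."""
    return f"{branch}-latest"


def git_tag(build_number, branch):
    """Git tag treating the CI build number as a version, e.g. v1-master."""
    return f"v{build_number}-{branch}"


def image_labels(git_revision, build_number):
    """Labels tying an image back to its source and build.

    The label keys are hypothetical; the thread does not name them.
    """
    return {"git.revision": git_revision, "ci.build": str(build_number)}


def pinned_reference(registry, repository, digest):
    """Image reference pinned by digest instead of by a mutable tag."""
    return f"{registry}/{repository}@{digest}"


# Hypothetical usage: registry host, repository and digest are placeholders.
print(image_tag("dev"))
print(git_tag(42, "dev"))
print(pinned_reference("registry.example.internal", "shop/api", "sha256:" + "0" * 64))
```

Deploying by digest rather than by `<branch>-latest` is what makes the per-environment list reproducible: the tag can move, the digest cannot.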
This is working really well so far and we aren't fighting the tools. We may separate the non-master registry later in life, and that's really easy given that docker treats the private registry hostname as part of the tag.
You obviously have a process to build new images... why doesn't the same process prune old ones ?
> We are using Debian stable with backports, in production. We started running on Debian Jessie 3.16.7-ckt20-1 (released November 2015). This one suffers from a major critical bug that crashes hosts erratically (every few hours in average).
If you're stuck on an older system that's poorly supported by Docker then it may be a bad choice for you.
Read: Even if Ubuntu 16-LTS and Centos/RHEL 7 may be up to date enough now (not sure), they won't keep up with the latest minor kernel updates and that will become a problem in 6 months when docker 1.1X will require THE newest kernel.
After all, about 1/2 the gripes stemmed from the fact that Docker itself is basically the engineering branch minus any serious regression/etc testing. Of course things are going to break from release to release, that is why test departments exist.
Honestly, I liked this article because they had been using Docker for more than just a couple weeks.
AUFS on production? Seriously? Create a BTRFS partition.
Docker isn't a shortcut to learning a linux based OS, it is a useful tool in the hands of people who know what they are doing.
If your company lost thousands of man-hours on a tool and has nothing more to show than a blog article with half the facts coming out of the author's head and the other half boasting about bad production practices, would you prefer me to sugar coat it?
A hire that knew linux and docker, would solve the problem.
I have worked with Docker since 2013. I came to a team with devs that already used Docker. I didn't know anything about it at the time. They gave me a Docker image with a JVM based app that had a two-year-old JDK, ran as root, and the container was run with privileges enabled.
Did I have to know docker to tell them that this was nuts? Nope, I had to know ops. For the same reason I knew AUFS before I learned about docker and I knew very well that I would trust AUFS on my NAS for media storage but not as storage for a production database.
Did I ever tell the devs how to write java? No. It isn't on my job description for a reason.
A linux guru will know that AUFS is unstable... but the problem is mentioned nowhere and Docker still uses it as the default filesystem for [most of] all cases.
The linux guru will know to avoid AUFS... but the replacement [overlay2] was rolled out very recently and it's only available in the latest OS and systems.
My choice is btrfs and it has been rock solid for me since the start. Of course one has to know a bit about btrfs and always leave a healthy percentage (I'd say 10% or more) of the filesystem free.
I think AUFS and overlay2 do have an advantage of their own though (apart from working out of the box). Due to the way they work, you can have shared memory between containers. Thus if I had to run many containers with a JVM app, I would give them a try to lower the tax on RAM.
CentOS & RHEL use device-mapper with a loopback device by default (and log "Usage of loopback devices is strongly discouraged for production use. Either use --storage-opt dm.thinpooldev or use --storage-opt dm.no_warn_on_loop_devices=true to suppress this warning"). Also related article
CoreOS has used overlayfs by default for a while now
Gotta find the right combination of OS & filesystem to have a stable docker.
I'd pick a hire who also knew a platform. I like Cloud Foundry because I work on it, Red Hat folk would prefer OpenShift, there's also Deis and Convox and others I am cruelly neglecting.
Containers are a building block. An important one, they make new architectures practicable. But past a certain point, rolling a custom PaaS that you will have to maintain at your own expense, with no commercial or opensource support whatsoever, forever, doesn't make much sense.
CoreOS is the company behind a competing container tech called rkt (rocket) and CoreOS Linux.
But it took me under a day to containerize an application I'm working on, and get it up and running flawlessly on Google's container engine. And this is coming from having zero Docker experience.
Is this just fear mongering, or are there any actual issues with overlay2? We have been trying out overlay2 on our less critical systems and haven't experienced any issues so far, but we haven't stressed it enough to know for sure.
That was a 7 hours interplanetary outage because of Docker. All
that’s left from the outage is a few messages on a GitHub
issue. There was no postmortem. It had little (none?) tech news
or press coverage, in spite of the catastrophic failure.
Hi everyone. I work at Docker.
First, my apologies for the outage. I consider our package
infrastructure as critical infrastructure, both for the free
and commercial versions of Docker. It's true that we offer
better support for the commercial version (it's one of its
features), but that should not apply to fundamental things like
being able to download your packages.
The team is working on the issue and will continue to give
updates here. We are taking this seriously.
Some of you pointed out that the response time and use of
communication channels seem inadequate, for example the
@dockerstatus bot has not mentioned the issue when it was
detected. I share the opinion but I don't know the full story
yet; the post-mortem will tell us for sure what went wrong. At
the moment the team is focusing on fixing the issue and I don't
want to distract them from that.
Once the post-mortem identifies what went wrong, we will take
appropriate corrective action. I suspect part of it will be
better coordination between core engineers and infrastructure
engineers (2 distinct groups within Docker).
Thanks and sorry again for the inconvenience.
Why did this happen? There wasn't any explanation given. I'd love to see a postmortem from Docker.
Our containers are completely stateless. We use AWS's RDS and S3 to store state.
> The impossible challenge with Docker is to come with a working combination of kernel + distribution + docker version + filesystem.
Amazon has met this challenge with the ECS Optimized AMI. It works, gets frequent upstream updates, and you can open tickets against it and get great support.
* Amazon Linux
* Linux Kernel 4.4.23-31.54.amzn1.x86_64
* Docker version 1.11.2, build b9f10c9/1.11.2
* Device mapper
When Docker works, it’s great. The concept of being able to install software on the user's behalf in the container and have the user install the container really does make life easier. The unwieldy command lines are nicely abstracted with docker-compose(.yml). Software upgrades via containers become a nearly trivial process.
I work on scientific software that can be a huge pain to install and configuring the packages correctly often times requires the help of the software creator. The fact that you can codify this “dark knowledge” in a Dockerfile is of tremendous benefit. And generally speaking, I like the concept of the Dockerfile. Whatever container technology ends up winning, I hope there will always be something like the Dockerfile.
- Docker version upgrades never go smoothly. Firing up docker after apt-get update && apt-get upgrade never works with respect to Docker. You get some obscure error and after spending 30 minutes in Google you end up having to rm -rf /var/lib/docker/aufs. My solution is sometimes to throwaway the VM and start over, but this is unacceptable.
- File ownership issues inside versus outside the container are (or at least used to be) a huge pain in the ass! I am referring to files mounted from the Docker host file system into the container. (And I am also referring to running docker as unprivileged user that is part of the docker group.) The solution I settled on is to enter the container as root, run chmod/chown on the appropriate directories and drop down to a “regular” user with gosu. At that point, you proceed on your merry way by starting your service, process, etc. This work flow solves permission problems inside the container, but outside the container is a different story. At least on earlier versions of Docker, you could lose permissions of files created via the container because at that point they are owned by root as seen from outside the container! More recent versions of Docker seem to solve this problem as I have gleaned empirically and from experimentation but I have yet to see adequate documentation about any of this and I have spent a long time searching! With more recent versions of docker I have observed that the files in question are no longer owned by root (again as viewed outside the container), but by the user that started the container, the behavior I would expect. I’m also not crazy about the gosu solution which seems like a bolted on rather than baked in solution.
Having worked all of 2015 in the container space, I can state that this comment is false. While it may not have been possible for this person, the technology has existed for quite some time to see what processes are running in a container, debugging became possible with sysdig, and ssh'ing into a container has been a thing for YEARS.
If they were running this on a single machine with backups they could restore to a time before the incident.
Also, querying and downloading from the public registry every time seems like a waste of bandwidth.
Anyone who has used docker at all knows the above statement is 200% rubbish. FFS
While Kubernetes seems to be the orchestration choice du jour, I had been eyeing http://deis.io/ (which I believe to be at least as mature). Can anyone comment on their experience or thoughts on Deis vs Kubernetes? TIA.
If this is "real" filesystem for disks, not proxy for other filesystems, mostly read-only.
The concept of getting away from snowflake machines and making your infrastructure more code-like with deployment scripts is the way to go, but you certainly don't have to use docker for that. At the end of the day, it's more about forcing your team to be regimented in their processes and having a working CI system for your systems related stuff. You should be able to start from nothing and with a single command have a developer's environment up, all the way to a fully scaled prod system.
Docker is just a tool to try and get to that point of evolution (albeit a poor one).
Even if people try to sell it as a flexible silver bullet, made for everything, like any other tool around it is subject to some sort of CAP theorem.
Because under the hood we're just talking about how to move bits.
For instance, I use containers like AWS Lambda functions. And I use tar streams to pass the contents inside them (so no virtualised mounted filesystem is needed).
By that I mean you can find different ways to use them. In short: the law of the instrument.
If you're wise (and according to your article you have a ton of experience), you can find a good purpose for it.
Docker is just 'meant' to make your life 'easier'; like any other tool, it does not always satisfy expectations.
On the other hand, you gained experience with it. I'm sure you will use containers again for a better purpose, where they can perform better for you. There is nothing to worry about, this is just experience.
Thanks for sharing.
Is it how Docker should have been designed to embrace production in the first place?
Heck, it is even possible to remove images from the docker repository without corrupting it, just not from a single API endpoint. FWIW I think that it is a big failing of the v2 repository project to not have that functionality from the get go.
It's also worth noting that I've only run Docker in production using CoreOS and Amazon's ECS AMI. Both have their drawbacks, but nothing so dramatic as to keep me from recommending Docker in production for "cattle" style applications.
> Containers are not meant to store data. Actually, they are meant by design to NOT store data. Any attempt to go against this philosophy is bound to disaster.
You can do worse than use Red Hat Enterprise Linux 7 (RHEL7) that has a tested and supported blend of those components. Or CentOS if you just want to try it out and don't care about support.
Unsurprisingly the linked article is fraught with glaring errors and misinformation. This has become typical of most things I read that are docker-related; it's a circus.
It saddens me to see the Linux community get dragged into and overwhelmed by this mess.
Once you go down this path, it seems there is a very limited number of providers capable of providing the infrastructure you require (at least compared to vps/dedicated/datacenter providers) and they can just keep pushing you into using more and more fragile infrastructure components that you would never want to maintain yourself.
Eventually, you may very well find yourself paying so much for all the hardware/virtualization/os/container/orchestration maintenance and resource overhead, with a number of backup instances of everything due to the unreliability of all these components, that you wish you could just go back to placing a few pieces of metal in a few datacenters and call it a day.
docker rmi $(docker images -f dangling=true -q --no-trunc)
This command won't help you clean up disk space if you build/pull images each with their own unique tags.
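For the uniquely tagged case, one option is to choose what to delete by age rather than by the dangling filter. The following is a minimal sketch of just the selection step. It assumes you have already collected (reference, creation time) pairs from somewhere, for example by parsing local image metadata, which is left out here; `images_to_prune` and its parameters are hypothetical names, and actually removing the result (for instance by passing it to `docker rmi`) is left to the caller.

```python
from datetime import datetime, timedelta, timezone


def images_to_prune(images, max_age_days, now, keep=()):
    """Return references of images older than max_age_days.

    images: iterable of (reference, created) pairs, with timezone-aware datetimes.
    keep:   references that must never be removed, e.g. what is currently deployed.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [ref for ref, created in images if created < cutoff and ref not in keep]


# Hypothetical inventory; in practice this comes from your own tooling.
inventory = [
    ("shop/api:build-101", datetime(2016, 1, 10, tzinfo=timezone.utc)),
    ("shop/api:build-250", datetime(2016, 10, 28, tzinfo=timezone.utc)),
]
now = datetime(2016, 11, 1, tzinfo=timezone.utc)
print(images_to_prune(inventory, max_age_days=30, now=now))
```

Keeping the selection as a pure function makes it easy to dry-run and review the list before anything is deleted.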
Almost every time I've hit a problem, it has had nothing to do with docker (though everyone likes to blame it). 99% of the time the issues I've encountered are due to docker configuration (and yes the learning curve is steep) or the app itself crashing.
The one time I've encountered an issue with docker with databases (used elasticsearch) was a kernel issue (where the host kernel was out of date). That was the only time I had to care about the host OS in debugging a docker issue.
One thing that I recommend for anyone getting into docker - use docker-compose for everything. Don't use the docker CLI.
One of the major problems I have with docker is that (especially for beginners) the docker command line tool can cause unintended consequences.
Every time you run docker run <image> you are technically creating a new container from an image. Once that finishes, the container still exists on disk. Using the --rm flag will cause the container to actually be removed after running it. I sort of wish docker would emphasize that flag, because often when developers are debugging or trying things, they will continue to run docker run <image> multiple times - not realizing that they are creating a ton of containers.
Yes, the containers are usually very light, but it caused a lot of confusion for me when I started out. And part of my superstitious self believes that the fewer dead containers you have, the less likely you'll have filesystem issues.
A second source of confusion I've noticed for new devs is the concept of volumes. "Why would I ever do COPY in my docker file if I can just volume it?"
Once you've worked with Docker, like any tool, you become more adept at avoiding a lot of the mistakes mentioned in this article. Which leads me to a point I can't disagree with the OP about - upgrading docker is annoying as hell.
Almost every time I upgrade docker I end up having some obscure error. When I look on docker github I notice I'm not the only one, and often times the issues are just a few hours old.
But, what the OP didn't seem to realize, is that to avoid these issues you need to lock your docker version in production to one that you know works. Additionally, you need to not build your dockerfile in production. Build your docker containers with a build system and upload them as artifacts. You can either use an internal docker hub or do docker save to save them as a .tar.
Building while in production is a real no-no, even though it may seem more attractive than moving around 300-500MB images. You never know if that one dependency you apt-get in your dockerfile is offline for some reason. And you can't always depend on devs of dockerfiles properly versioning them anyway.
The Garden container API which underpins Concourse CI and Cloud Foundry has switched to runC as the container runtime. So far works fine, with no Docker daemon required.
Disclosure: I work for Pivotal, we sponsor Concourse and donate the majority of engineering to Cloud Foundry.
You need to use external volumes
I was assuming he was complaining about them even with data volumes.
This was a fun read :)
No... losing a day of development is on their infra team.. not docker.
This is a typical case of blaming the tool. There are so many bad practices mentioned I don't know where to start. No orchestration... One container a host... Not using volumes correctly... Apt-get installing docker every time ci runs... Not using an optimized base os for containers... It's amazing what people will blame their problems on when they don't do their due diligence with a new platform.
As said in the post, this was a consequence of Docker failing them when running multiple containers on a host.
> Not using an optimized base os for containers...
On a server that might be possible, but I thought Docker's advantage was providing a reproducible system that can also be run on your dev machine? Sorry, if Docker doesn't run stable on my dev machine, no way I'm trying it.
And for this you need a "real" base OS, which will be either apt-flavor (debian/ubuntu) or rpm-flavor (rhel, sles), depending on the organization.
While true in general, you always end up introducing latency (simply because it's on a different host) and possible packet losses/retransmits (e.g. when the switch buffer memory overruns). Given the "hft" in the blog name (high-frequency trading), both can equal losses of money.
The rule is "an" is used in front of a word that starts with a vowel sound. "An hour" is correct because the "h" is silent. All modern (native) English dialects pronounce the "h" in history, ergo "an history" is incorrect grammar.
from the first sentence.
> not a high quality reference
> proceeds to reference Wikipedia
Well, the grammar rule got a name. It doesn't say whether it applies to "a[n] history" though.
2. not everyone is a native anglophone. the web is global.
3. nobody is an actual grammar expert, including you.
4. avoiding actual discussion is obvious