Disclaimer: I'm new to containers, so if it sounds like I'm doing something wrong, let me know; it certainly feels like I'm missing something right now.
Disclaimer: I am an Apcera employee and Kurma is an open-source project sponsored by Apcera
It generally works if:
* you don't use it to store data
* don't use 'ambassador', 'buddy', or 'data' container patterns.
* use tooling available to quickly and easily nuke and rebuild docker hosts on a daily or more frequent basis.
* use tooling available to 'orchestrate' what gets run where - if you're manually running containers you're doing it wrong.
* wrap docker pulls with 'flock' so they don't deadlock (see the sketch after this list)
* don't use swarm - use mesos, kube, or fleet (simpler, smaller clusters)
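To illustrate the flock point, a minimal sketch (the lock file path and image name are placeholders):
# serialize pulls so two units starting at once can't wedge each other
$ flock /var/lock/docker-pull.lock docker pull registry.example.com/myapp:1.2.3
# or bail out after ten minutes rather than waiting forever
$ flock -w 600 /var/lock/docker-pull.lock docker pull registry.example.com/myapp:1.2.3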
It "geneerally works if" you " rebuild docker hosts on a daily or more frequent basis."
Perhaps I'm misunderstanding, but needing to rebuild my prod env several times a day seems pretty "not ready for prime time" to me.
That's like when we'd say that Rails ran great in production in 2005, as long as you had a cron task to bounce fastCGI processes every hour or so.
So, can you elaborate on why rebuilding the containers is good advice?
The theory runs that attackers need time to accrue and compound their incomplete positions into a successful compromise.
But if you keep patching continuously, attackers have fewer vulnerabilities to work with. If you keep rotating keys frequently, the keys they do capture become useless in short order. And if you rebuild the servers frequently, any system they've taken control of simply vanishes and they have to start from scratch.
I'm not completely sold on the difference between repair and repave, myself. And I expect that sophisticated attackers will begin to rely more on identifying local holes and quickly encoding those in automated tools so that they can re-establish their positions after a repaving happens.
But it raises the cost for casual attackers, which is still worthwhile.
The rest: not so much.
Rebuilding continuously for security is not something I would recommend.
So that I understand, could you elaborate?
Particularly, do you mean "not recommend" as in "recommend against" or "not worth the bother"?
It's not crazy to periodically rotate keys, but attackers don't acquire keys by, you know, stumbling over them on the street or picking them up when you've accidentally left them on the bar. They get them because you have a vulnerability, usually in your own code or configuration. Rebuilding will regenerate those kinds of vulnerabilities. Attackers will reinfect in seconds.
The win to rotating them is not so much because you'll be regularly evicting attackers you didn't know had your keys, but because when you do have a fire, you won't be finding out for the first time that you can't actually rotate them.
It also forces you to design things much more reliably which helps continuity in non-security scenarios.
After you redeploy and realize that Todd has to ssh in, hand-edit that one hostname, and fix the symlink that was supposed to be temporary so the new version of A can talk to B, that fix is going to get rolled into the automation pretty quickly. Large operations not doing this tend to quickly end up in the "nobody is allowed to touch this pile of technical debt because we don't know how to re-create it anymore" problem.
After a bunch of harrowing experiences with clients, I'm pretty close to believing "using packages for critical infrastructure is a bad idea".
In that case, you might be interested in bosh: http://bosh.io/docs/problems.html (the tool that enables the workflow jacques_chester was describing). It embraces the idea of reliably building from source for the exact reasons you've mentioned.
- Any infrastructure with lots of data. Data just takes time to move; backups take time to restore.
- You're on bare metal because running node on VMs isn't fast enough.
- You're in a secure environment, where the plain old bureaucracy will get in the way of a full rebuild.
- Anytime you have to change DNS. That's going to take days to get everything failed over.
- Clients (or vendors) whitelist IPs, and you have to work through with them to fix the IPs.
- Amazon gives you the dreaded "we don't have capacity to start your requested instance; give us a few hours to spin up more capacity"
> Imagine an outage in one of the 3 datacenters you are running your infra in the same region. You need to move 1/3 of the capacity to the remaining 2 datacenters.
Oh, this is very different. If your provider loses a datacenter, and your existing infrastructure can't handle it, you're already SOL - the APIs for spinning up instances and networking are going to be DDoSed to death by all of the various users.
Basic HA dictates that you provision enough spare capacity that a DC (AZ) can go down and you can still serve all of your customers.
I used to work in the team that runs Amazon.com. All of the systems serving the site can be rebuilt within hours, and nothing can serve the site that cannot be rebuilt within a very tight SLA. However, I understand that not all companies have this requirement. This capability is only relevant when site downtime hurts the company so much that it cannot be allowed.
Reflecting on your points:
- Lots of data -> use S3 with de-normalized data, or something similar
- Running a VM has 3% overhead in 2016, scalability is much more important than a single node performance
- High security environments are usually payment processing systems, downtime there can be a bit more tolerated, delaying transactions is ok
- Amazon uses DNS for everything, even for datacenter moves. It is usually done within 5 minutes
- This is a networking challenge, using something like EIP (where the public facing IP can be attached to different nodes) makes this a non-issue (example below)
- Amazon has an SLA, they extremely rarely have a full region outage, so you can juggle capacity around
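For the EIP point above, the reattachment is roughly a one-liner (assuming a VPC EIP; the IDs are placeholders):
# move the public IP to a healthy instance without touching DNS
$ aws ec2 associate-address --allocation-id eipalloc-0abc1234 --instance-id i-0def56789abcdef01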
Losing one DC out of three does not require work because you can't handle the load; the work is needed to restore the same properties (the same spare capacity, for example) you had before. Spinning up instances should not DDoS anything; it puts a fairly constant load on the supporting infrastructure.
The last point I agree with.
> Lots of data -> use S3 with de-normalized data, or something similar
S3's use case does not match up with many different computing models (hadoop clusters, database tables, state overflowing memory), and moving data within S3 between regions is painful. Also, not all cloud providers have S3.
> Running a VM has 3% overhead in 2016, scalability is much more important than a single node performance
Not when you have a requirement to respond to _all_ requests in under 50ms (such as with an ad broker).
> High security environments are usually payment processing systems
Or HIPAA, or government.
> delaying transactions is ok
Not really. When I worked for Amazon, they were still valuing one second of downtime at around $13k in lost sales. I can't imagine this has gone down.
> Amazon uses DNS for everything, even for datacenter moves. It is usually done within 5 minutes
Amazon also implements their own DNS servers, with some dynamic lookup logic; they are an outlier. Fighting against TTL across the world is a real problem for DR type scenarios.
> EIP (where the public facing IP can be attached to different nodes) makes this a non-issue
EIPs are not only AWS-specific, they also cannot traverse regions, and they rely on AWS' API being up, which has not historically always been the case.
> they extremely rarely have a full region outage, so you can juggle capacity around
Not always. Sometimes, you can. But not always. Some good examples from the past - anytime EBS had issues in us-east-1, the AWS API would be unavailable. When an AZ in us-east-1 went down, the API was overwhelmed and unresponsive for hours afterwards.
> Spinning up instances should not DDoS anything; it puts a fairly constant load on the supporting infrastructure.
See above. There's nothing constant about the load when there is an AWS outage; everyone is scrambling to use the APIs to get their sites back up. There's even advice to not depend on ASGs for DR, for the very same reason.
AWS is constantly getting better about this, but they are not the only VPS provider, nor are they themselves immune to outages and downtime which requires DR plans.
Exactly. Don't put data in Docker. Files go in an object store, databases need to go somewhere else.
OP's first point is 'don't put data in docker'. Docker is not for your data. But more to the point, if you're rebuilding your data store a couple of times every day, a couple of hours downtime isn't going to be feasible.
> You're on bare metal because running node on VMs isn't fast enough
In such a situation, you should be able to image bare metal faster than 2 hours. DD a base image, run a config manager over it, and you should be done. Small shops that rarely bring up new infra wouldn't need this, but anyone running 'bare metal to scale' should.
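A rough sketch of what I mean, with the device, image path, and choice of config manager all just examples:
# lay the golden image onto the new box's disk (from a PXE/rescue environment)
$ dd if=/images/base.img of=/dev/sda bs=4M
# then layer host-specific configuration on top, e.g. with Ansible
$ ansible-playbook -i newhost.example.com, site.yml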
Isn't part of the infra rebuild per se.
> Anytime you have to change DNS. That's going to take days
Depends on your DNS timeouts, but this is config, not infra. Even if it is infra, 48-hour DNS entries aren't a best-practice anymore (and if you're on AWS, most things default to a 5 min timeout)
> Clients (or vendors) whitelist IPs, and you have to work through with them to fix the IPs
I'd file this under 'bureaucracy' - it's part of your config, not part of your prod infra (which the GP was talking about).
> Amazon gives you the dreaded...
Well, yes, but this is on the same order as "what if there's a power outage at the datacentre". Every single deploy plan out there has an unknown-length outage if the 'upstream' dependencies aren't working. "What if there's a hostage event at our NOC?" blah blah.
The point is that with upstream working as normal, you should be able to cover the common SPOFs and get your prod components up in a relatively short time.
I agree, but I (and the GP, from my reading) was not speaking about only Docker infrastructure.
> Isn't part of the infra rebuild per se.
I can see your point, and perhaps these points don't belong in a discussion purely about rebuilding instances. That said, I have a very hard time focusing just on the time it takes to rebuild capacity when discussing a DC going down; there are just too many other considerations that someone in Operations must consider.
When I have my operations hat on, I consider a DC going down to be a disaster. Even if the company has followed my advice and the customers do not notice anything, we're now at a point where any other single failure will take the site down. It's imperative to get everything that went down with that DC back up, and it's going to take more than an hour or two.
Many larger companies can't do this; my company has 70+ datacenters with tens of thousands of servers. We can't re-build our prod infra in minutes or hours. We are still doing devops right :D
Like I said, I know you aren't talking about my situation when you made your statement... I just get frustrated when people act like there are hard and fast rules for everyone.
I work at MindGeek, depending on the time of the year it would be fair to say we rank within the top 25 bandwidth users in the world. We are not even close to that amount of servers and we deal with some of the largest traffic in the world. What company is running in 70+ datacenters!? World's largest VPN provider? Security company providing all the data to the NSA?
Maybe it is just my broad assumptions but I would hope that the major big 10 that come to mind such as Google, Amazon, Microsoft, etc would be able to rebuild their production regions in hours.
I don't work for one of those two, but I do work for a very large CDN.
"I'd have thought I've had heard of them..."
One Wiki search later... yup. I've heard of them.
People running edge networks and therefore need servers local to everywhere in the world to keep latencies down. Maybe it's not as much of an issue for MindGeek (the parent company for a lot of video streaming sites). I would guess you guys need a lot of throughput but latency isn't so much of a problem. Or you simply don't need to serve some parts of the world where it might be illegal to distribute some types of content.
FWIW, Cloudflare has 86 data centers: https://www.cloudflare.com/network-map/
Short version: That's a video streaming website, which is rather simple, yet bandwidth intensive.
Outsourcing the caching and video delivery means MindGeek can get by with few servers and a few locations.
Nonetheless the CDN you're outsourcing to does need a lot of servers at many edge locations.
Actually, if we think in terms of "top bandwidth users in the world", it's possible that your company is far from being on the list. It's likely dominated by content delivery / ISP / and other providers, most of which are unknown to the public.
I am in distributed systems and try to work exclusively on hard problems. So when I say "simple", that is biased on the high end of the spectrum.
If you go to pornhub.com and look at "popular keywords", you'll only find thousands or tens of thousands of videos. In a way, there is not that much content on pornhub.
All major websites have challenges. Pornhub is a single-purpose website and a lot of the challenge is in video delivery, which can be outsourced to a CDN nowadays.
"simple" is maybe too strong a word. I am trying to convey the idea that it has limited scope and [some of] the problems it's facing are understood by now and have [decent] solutions.
That's not to say it's easy ;)
Edit: didn't notice someone else had already said this, opened this tab like 15 minutes ago
Sure, you want to be able to deploy quickly. But if there's no reason to, then don't.
And I would be very scared if Docker images had a 1 day uptime max
We also see them mostly in non-prod environments where we have greater container/image churn. We use AWS autoscale and Fleet so containers just get moved to other hosts when we terminate them. We have actually thought about scheduling a logan's run type job that kills older hosts automatically - it's in the backlog.
While I sincerely hope I'm wrong, I assume it's because you reset the clock on the probability something goes very wrong.
I am currently fighting an ongoing battle at work to point out that the plans for our Mesos cluster have not factored in that the first outage we have will be when someone fills up the 100gb OS SSD because no one's given any thought to where the ephemeral container data goes.
It also naturally rewards optimizing around time-to-redeploy, probably a lot of benefits there.
That's like "my car works fine as long as you spray WD-40 into the engine block every 48 hours..."
These are not necessarily container centric.
Also hosts shouldn't need to be rebuilt THAT often. Of course, your infrastructure should automatically nuke and replace failing hosts (and there's various ways to establish what's "failing"), and like any good infrastructure team you should be keeping your hosts up to date with security patches, etc, which is best done by replacing them with new versions (as then you can properly test and apply a CI lifecycle), but docker itself is stable enough these days that you don't need to be that aggressive with nuking.
Agree (especially with the comments about orchestrating with kube/mesos) with all the other points, though.
"you don't use it to store data" - What is wrong with Docker volumes specifically? What issues did you run into?
"wrap docker pulls with 'flock' so they don't deadlock" - I have never had a problem with docker pull, can you elaborate on this?
I got bit hard by this issue:
when CoreOS stable channel updated Docker. All my data volume containers broke during the migration and migration could not be reverted.
As for the second issue: when two systemd services that pull containers with the same base images would start simultaneously, they would deadlock pretty reliably. Had to flock every pull as a result. This might be fixed by now though.
The issues here are the real telling story: I spent close to 12 hours yesterday trying to get a fairly simple node app to run on my mac. Turned out I had to wipe out docker completely and reinstall. Keep in mind this is their stable version that's no longer in beta. I've just run into too many documented bugs for me to consider it stable. I wouldn't even say it should be out of beta.
The issues here are the real telling story. https://github.com/docker/for-mac/issues
I love docker, it's amazing when it works. It's just really not there yet. I get that their focus is on making money right now, but they need to nail their core product first. I honestly don't care about whatever cloud platform they're building if their core app doesn't even work reliably.
- Stable? That's honestly laughable. It is nowhere near stable. As an example, I was trying to upload images to a third-party image registry but the upload speed was ridiculously slow. It took me forever to figure out but it turned out I needed to completely reinstall docker for mac.
- They had a command-line tool called pinata for managing daemon settings in docker for mac. They chose to get rid of it. Not only did we lose a way to declaratively define and set configuration, but the preferences window has nowhere near all of the daemon settings that are available.
- The CPU usage is still crazy. I regularly get 100% CPU usage on 3 of my 4 CPU cores while starting up just a few containers. Even after the containers have started it will idle at 100% on 1 of 4 cores.
- It needs to be reinstalled regularly if you are using it on a daily basis. Otherwise it will get slower and slower over time. See my first complaint.
- The GUI (kitematic) will randomly disconnect from the daemon forcing me to restart the GUI repeatedly.
- They really need some sort of garbage collector with adjustable settings. With the default settings the app will just keep building and building images and eventually fill up, crash, slow down, etc. How is that acceptable? What other apps do that?
Like I said, I like docker in general. I think they are tackling some very hard problems and definitely experiencing some growing pains from such crazy growth. However, at some point they need to take a step back and focus on the core of what they offer and make it as simple and rock solid as possible. As another example, they still haven't added a way to compress and/or flatten docker images. No wonder docker for mac slows down after regular use when it's building 1GB+ images for simple things.
There's also the Docker.qcow2 file ballooning in size. Only way is to do a "factory reset" or run a couple of commands to clear out old images.
Also, the CoW disk volume they use basically grows unbounded, even if you purge layers and old containers.
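I assume the "couple commands" are the usual cleanup incantations, roughly the following (pre-"docker system prune"; they free space inside the VM, though in my experience the qcow2 file itself doesn't shrink back):
# remove exited containers, dangling images, and orphaned volumes
$ docker rm $(docker ps -aq -f status=exited)
$ docker rmi $(docker images -q -f dangling=true)
$ docker volume rm $(docker volume ls -qf dangling=true)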
It's cute that they're trying to port it other places, but if you're trying to run with stability, don't use a mac, use a Linux box.
Why they did this, I have no idea, but it's a horrible idea.
If nothing else use VirtualBox, install Linux, and use docker that way.
It makes me sad that you believe that. We're not perfect but we try very hard to keep our users happy.
Are there specific issues that you could point out, so that I can get a sense of what you saw that you didn't like?
Keep in mind that anyone can participate in github issues, not just Docker employees, and although we have a pretty strict code of conduct, being dismissive about another participant's use case is not grounds for moderation.
EDIT: sorry if this came across as dismissive, that wasn't the intention. We regularly get called out for comments made by non-employees, it's a common problem on the github repo.
But this isn't about you, or your feels, or what you think is going on.
All that's happening is people are trying to communicate with you and you're not listening. You're doing your best, I don't doubt that, but you've got to step back, regroup, and come at the problem from another angle.
Don't get defensive, don't make excuses. "When you make a mistake, take off your pants and roll around in it." ;-) Give people the benefit of the doubt that they mostly know what they're complaining about (even if they don't.)
It's a pain-in-the-ass, but it's the only way to deal with this kind of systemic (mis-?)perception in your community.
I don't even think it's intentional, really, but I have never once seen a response by Solomon to criticism of Docker that did not dismiss the content and messenger in some way. These things are hard to get right and I'm not perfect, so I don't even really know what advice to offer and I'm far from qualified. I am very frequently dismissive of criticism as well and have had to put a lot of work into actively accepting it, so I can at least understand how hard it is.
I outright told him his initial reaction to Rocket, to use an example, directly caused me to plan for a future without Docker. Companies are defined by their executives, and a lot of Docker's behaviors become clearer when you consider some of the context around Solomon's personal style.
The most productive example being https://news.ycombinator.com/item?id=8789181
I told my sister last night, jokingly, "I think I made a co-founder of Docker cry on HN today." (I also explained what Docker and Hacker News are a little. She likes me so she didn't call me a nerd to my face.)
Honestly, I was just amused at the reply evidencing (or seeming to) the very attitude the OP was complaining of, it was incredible.
Docker may be on top now but if you don't cater to the needs of your nonpaying users, it will be short lived. This includes taking seriously feature requests which don't necessarily serve enterprise.
You can find me on GitHub.
If there is a problem with the way the Docker open source project is run, it is with the amount of small but serious bugs that affect subsets of your users. This is probably due to you guys overextending.
Responding that you are saddened by this guy's issue with his perception of the 'culture' in your team is just craziness. No one will be helped by this. The only thing that will help is showing that you guys are running a tight ship. Maybe hire an extra person or two to maintain your GH issue garden and proceed with making Docker the successful business we all believe it can be.
I have seen many instances of individuals with no affiliation to an open-source project being negative or rude and therefore giving a negative impression of the community or parent company. As a project owner or maintainer this is a really difficult thing to see happening, and can be difficult to address and prevent. So I personally think shykes makes a valid point. I wouldn't take this as dismissive. Give him the benefit of the doubt.
I also think we should cut maintainers a bit of slack. Imagine you are managing an open source project with over 10,000 logged issues and even more in the community forums. It is draining to spend all day dealing with complaints and issues, and it can be a thankless job. Maintainers often try to put their best foot forward, but it isn't easy, and they are human, and make mistakes and say things they regret. I'm not saying it is justified, just try to put yourself in their shoes.
I've personally logged a few issues with the Docker team and have also found the interactions to be respectful. There are times when I've asked for features and they have certainly played devil's advocate, but that is to be expected with any project that is trying to prioritize their work and constantly fighting feature creep.
The third paragraph particularly is spectacular, as you manage to be dismissive about being dismissive.
It looks like now is not a good time for me to participate in this discussion. Instead I'll make a list of things we can improve so that the next HN post about Docker is a more positive one.
You should just be like, "Hey folks, we're sorry as hell we let you down. I'm here now and I'm listening. What has made you sad, and what can we improve?"
People love that sh--stuff.
That's just my advice. I don't even use Docker. I just like to see warm-fuzzy success, eh?
- You shouldn't have to read through pages of posts to find potential workarounds.
- You shouldn't have to count comments to guess if there are a lot of people affected by an issue.
- You shouldn't have to read comments, then wonder which posts were an official developer response or a comment from another end user.
- There should be some way to communicate easily that "this issue has our attention, but dealing with it is de-prioritized because ..."
In this instance, you can see it's a security issue and it will be tough to convince Docker to change it, but at this point it's two years later and people are still griping about it, so maybe it's time to put some attention on it. There are a few suggestions for seemingly reasonable updates, but nobody is championing any of them, presumably because there's no indication other than comment activity that Docker will consider any updates on this issue whatsoever.
I'm not sure myself what the right thing to do from Docker's perspective would be, but this is clearly one of those issues that has worn through the attention span of the development team. Someone either needs to stand up and say "we're not changing anything here" or "we're looking for a solution" - no bug report should be open that long.
This was a reply by @tiborvass in https://github.com/docker/docker/issues/8887
Can you possibly get any more dismissive?
1) you have been called a mouth breathing moron
2) it is confirmed that docker is run by assholes
3) the founder does not care about users (by asking for examples of poor conduct by Docker employees)
4) docker is run by a lousy leader who cannot demonstrate care, again by asking where people get a sense of docker not caring.
Internet comments lack a lot of tone so it's easy to misread, but when there's this kind of fanatic backlash over a direct connection to the founder of a pretty big service, the luster of any of these sites dims considerably. You have given nothing actionable to the person who could have done something positive for docker, and just made it that much more difficult for the organization to take legitimate criticisms at face value.
A more constructive response than yours would be to post examples of where Docker has failed in community engagement, rather than just re-stating his post with the most negative spin possible.
Can you please add Facebook connectivity to Docker?
If people are picking up a "users are stupid" vibe, combined with what looks like a coverup culture, I find Docker's blog posts less and less credible.
For example, there was a recent blog post about an independent security review comparing Docker to other similar technologies, including rkt. The conclusion of the post was that Docker is secure by default. (I'll leave it to the reader's own opinion whether that is true or not; this is not the issue I am pointing out).
What is so weird is that, nearly two years after the fact, the tone in that blog post was still as if CoreOS had betrayed them. And while I get there are hurt feelings involved, this is not a high school popularity contest. When combined with arrogance and coverup culture, I can see Docker moving in a direction that drives them further and further from relevancy.
Based on what I'm hearing here about Docker Swarm, Swarm is basically a great advertisement for K8S or Mesos. No one there is pretending that multi-node orchestration is an easy problem. If the divergence from the community continues, I can see lots of people getting fed up and going over to rkt, or something else that works just as well.
Which particular problem? The crappy performance/CPU usage one(s), or something else? Having used various different approaches (docker in vmware linux/virtualbox), the d4m (osxfs?) one seemed to be the least broken for general dev stuff.
Hopefully not more issues I need to keep a lookout for.
There is also the bit about host networking. Since Docker for Mac runs everything inside a transparent Linux VM, host networking attaches there instead of to the Mac. That thread ended up in a big, roaring silence.
I'm not saying this will necessarily stop me from using a project, but it definitely does not create any loyalty.
Compare this to say Rails, where I've had really positive experiences with the maintainers.
There are parts of docker that are relatively stable, have many other companies involved, and have been around for a while. There are also "got VC money, gotta monetise" parts that damage the reputation of stable parts.
Let me ask: WTF are people doing that <100 machines is a "small cluster"? I ran a Top 100 (as measured by Quantcast) website with 23 machines, and that included kit and caboodle -- Dev, Staging and Production environments. And quite a few of those were just for HA purposes, not because we needed that much... Stackexchange also runs about two dozen servers. Yes, yes, Google and Facebook run datacenters, but there's a power-law kind of distribution here and the iron needs fall very, very fast as you move from, say, the Top 1 website to the Top 30.
But if you need to store a lot of data, or need to look up data with very low latency, or do CPU-intensive work for every request, you will end up with a lot more servers. (The other thing to consider is that SaaS companies can easily deal with more traffic than even the largest web sites, because they tend to aggregate traffic from many websites; Quantcast, for example, where I used to work, got hundreds of thousands of requests per second to its measurement endpoint.)
Also, some sites are simply larger than StackExchange, and you never heard of them. There's a huge spectrum between StackExchange and Google.
Usually my cluster is very small, unless the pipeline is in progress. At this very moment I am not doing any processing, so I only have 3 machines up.
- If you don't use google container engine (hosted kubernetes, also known as GKE), kubernetes has a reputation of being hard to set up. I am running on GKE, so I can't comment too much.
- It's hard to see how all of the orchestration/docker/etc. things play together when entering the area. I expect many people hear "docker" and want to just try "docker", not being aware that there exist alternatives for some parts, that some parts are more reliable than others, etc. E.g. the article we are discussing seems to be doing this.
You still have to deal with the other drawbacks of those platforms (slow networks, disks etc.) but that's not really a k8s issue.
Yeah, but there's no shortage of things wrapping Kuber (e.g. OpenShift).
Let's give it a try and see if I can find a good tutorial for kubernetes to do what I want. (I haven't tried this experiment in a couple months, since before nomad and swarm got my interest.)
Ok, I'm back. I went to kubernetes.io. Their "give it a try" has me creating a google account and getting set up on google container engine. Due to standard issue google account hassles, I quickly got mired in quicksand having nothing to do with kubernetes.
I have no interest in google container engine. Let me set it up using vagrant, or my own VPSes, or as a demo locally, whatever, I'll set up virtual boxes.
They lost me there.
It's a local setup to let you give Kubernetes a try. We haven't made it the default in the "give it a try" dialog yet, but we're considering it.
Disclosure: I'm an engineer at Google and I work on Minikube.
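For anyone who wants to kick the tires, the local flow is roughly this (assuming VirtualBox or another supported VM driver is installed; the nginx deployment is just an example workload):
# start a single-node cluster in a local VM and point kubectl at it
$ minikube start
$ kubectl get nodes
$ kubectl run hello --image=nginx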
Kubernetes is definitely more to learn before getting started than swarm, but that's mostly because it has different and more powerful primitives and more features built in.
K8s might seem more unwieldy than swarm, but from that feature set you can expect things to work the way they are explained.
Swarm on the other hand has made my entire team question whether 1.12 is even worth upgrading to.
Ok, let's see, not only do I immediately get diverted to another page, but I feel like every OS+Cloud combination isn't represented. I guess the CLOSEST thing to working is Ubuntu+AWS. Click.
YAY! Juju, a new technology I need to learn. Hey guess what, this only works with Ubuntu. Closes browser. I spent weeks trying to map this out in my head. I can't understand why Kub doesn't just "install" like Docker does.
Ok, back to the deployment guide. Let's see there's a GIANT TABLE OF LINKS based on cloud+OS+whatever. So I think it's a massive understatement to say that Kub is more unwieldy than Swarm 1.12.
Is the guide that I was talking about. From your post it was not clear that you weren't using GCE.
I would love for our docs to be better - good people are working on it though documentation is always hard. In the meantime, the community is a wonderful resource!
We hired a contractor to do this part. My team is so resource constrained that I don't have time for this, so we farmed it out. But now I'm thinking the risk of project failure is much higher than I thought, made worse by the fact that the contractor is also showing signs of having poor communication skills. (I would rather be updated on things going wrong than have someone try to be the hero or cowboy and figure it all out.)
The only reason why we didn't go ahead is because of broken logging in k8s - https://github.com/kubernetes/kubernetes/issues/24677
This is the blocker for me. I can't switch to GKE yet because I use AWS postgresql. But I want to use k8s :(
If you were not using AWS EBS .. what were you doing ?
I was on AWS. I sidestepped the issue by using AWS RDS (postgresql).
I had tried to get the nascent EBS stuff working, but when I realized that I'd have to get a script to check if an EBS volume was formatted with a filesystem before mounting it in K8S, I stopped. This might have been improved by now.
On the logging front, kube-up comes up with automatic logging via fluentd to an ElasticSearch cluster hosted in k8s itself. You can relatively easily replace that ES cluster with an AWS ES cluster (using a proxy to do the AWS authentication), or you can reconfigure fluentd to ship to AWS ES. Or you can pretty easily set up something yourself using daemonsets if you'd rather use something like splunk, but I don't know if anyone has shared a config for this!
A big shortcoming of the current fluentd/ES setup is that it also predates PetSets, and so it still doesn't use persistent storage in kube-up. I'm trying to fix this in time for 1.4 though!
If you don't know about it, the sig-aws channel on the kubernetes slack is where the AWS folk tend to hang out and work through these snafus together - come join us :-)
From what you wrote, it seems that lots of people consider logging in k8s to be a solved issue. I'm wondering, then, why there is a detailed spec for all the journald stuff, etc.
From my perspective, it would be amazing if k8s could manage and aggregate logs on the host machine. It's also a way of reducing complexity to get started. People starting with 1-2 node setups start with local logs before tackling the complexity of fluentd, etc.
Is that the reason for this bug?
If you want logs to go into ElasticSearch, k8s does that today - you just write to stdout / stderr and it works. I don't love the way multi-line logs are not combined (the stack trace problem), but it works fine, and that's more an ElasticSearch/fluentd issue really. You'll likely want to replace the default ES configuration with either one backed by a PersistentVolume or an AWS ES cluster.
Could it be more efficient and more flexible? Very much so! Maybe in the future you'll be able to log to journald, or more probably be able to log to local files. I can't see a world in which you _won't_ be able to log to stdout/stderr. Maybe those streams are redirected to a local file in the "logs" area, but it should still just work.
If anything I'd say this issue has suffered from being too general, though some very specific plans are coming out of it. If writing to stdout/stderr and having it go to ElasticSearch via fluentd doesn't meet your requirements today, then you should open a more specific issue I think - it'll likely help the "big picture" issue along!
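Concretely, the simple path looks like this (pod and container names are placeholders):
# per-pod logs straight from the kubelet, no ElasticSearch required
$ kubectl logs my-pod
# follow a specific container in a multi-container pod
$ kubectl logs -f my-pod -c my-container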
At the outset it had the look of something that wasn't an advance over standard issue virtualization, in that it just shuffled the complexity around a bit. It doesn't do enough to abstract away the ops complexity of setting up environments.
I'm still of the mind, a few years later, that the time to move on from whatever virtualization approach you're currently using for infrastructure and development (cloud instances, virtualbox, etc), is when the second generation of serverless/aws-lambda-like platforms arrive. The first generation is a nice adjunct to virtual servers for wrapping small tools, but it is too limited and clunky in its surrounding ops-essential infrastructure to build real, entire applications easily.
So the real leap I see ahead is the move from cloud servers to a server-free abstraction in which your codebase, from your perspective, is deployed to run as a matter of functions and compute time and you see nothing of what is under that layer, and need to do no meaningful ops management at all.
https://news.ycombinator.com/item?id=12364123 (217 comments)
The only points made against Docker proper are rather laughable. You shouldn't be remotely administering Docker clusters from the CLI (use a proper cluster tool like Kubernetes), and copying entire credentials files from machine to machine is extremely unlikely/esoteric.
Docker, with Kubernetes or ECS, is totally suitable for production at this point. Lots and lots of companies are successfully running production workloads using it.
Docker Swarm is advertised as stable, production ready cluster management solution. Then, if you actually try to use it, it is very NOT. Kubernetes is great, but it feels like adding another layer to the system, and it is not always a good thing (especially if you are on AWS and have to work with _their_ infrastructure management too).
I use rancher with it, and it's retarded simple using rancher/docker compose.
For a quick run-down see: https://github.com/forktheweb/amazon-docker-devops
More advanced run-down of where I'm going with my setup:
The only problems I've had with Docker container are those where processes get stuck inside the container and the --restart=always flag is set. When this happens it means that if you can't force the container to stop, when you reboot the defunct container will restart anyway and cause you the same issue...
My solution to this has been to just create a clean AMI image with ubuntu/rancher/docker and then nuke the old host when it gives me problems. This is made even easier if you use EFS because it's literally already 100% setup once you launch a replacement instance.
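Another thing worth trying before nuking the host, assuming a reasonably recent Docker (the container name is a placeholder): clear the restart policy so a reboot doesn't bring the zombie back, then remove it.
$ docker update --restart=no stuck-container
$ docker rm -f stuck-container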
Also, you can do automatic memory-limiting and cpu-limiting your nodes using rancher-compose and health-checks that re-route traffic with L7 & HAProxy:
The only thing even comparable to that in my mind would be Consul health checks with auto-discovery:
If it's truly a problem with the containerization format / stability with the core product, I'm not sure what a good alternative would be. I see a lot of praise for rkt but the ecosystem and tooling around it are so much smaller than that for Docker.
Run the previous version of the cli in a container on your local machine. https://hub.docker.com/_/docker/
$ docker run -it --rm docker:1.9 version
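Extending that a bit (assuming, as the command above implies, that the image's entrypoint is the docker client; host and port are placeholders):
# drive the local daemon by mounting its socket into the old client
$ docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock docker:1.9 ps
# or point the old client at a remote daemon (plain TCP shown; adjust for TLS)
$ docker run -it --rm -e DOCKER_HOST=tcp://prod-host:2375 docker:1.9 ps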
I'd rather get out of IT and go into farming if this is considered a valid recourse.
1. TFA complains that versions of servers require corresponding client tools to be installed.
2. TFA says he loves docker in dev, but not in prod.
3. I state (omitting many steps): install the latest docker in your dev environment (or wherever you're connecting to prod), spin up a small docker image to run the appropriate version cli. This is cheap, easy, easy to understand why it works, consistent with points 1 and 2.
What am I missing?
At my current job we run nearly all our services in docker.
I've replied to this type of comment on here at least a dozen times; it has nothing to do with docker, it is a lack of understanding of how it all works.
Understand the history, understand the underlying concepts and this is no more complex than an advanced chroot.
Now on the tooling side, I personally stay away from any plug-ins and tools created by the docker team, they do docker best, let other tools manage dockers externalities.
I've used weave since it came out, and it's perfect for network management and service discovery.
I prefer to use mesos to manage container deploys.
There is an entirely usable workflow with docker but I like to let the specialists specialize, and so I just use docker (.10.1 even), because all the extra stuff is just making it bloated.
I'm testing newer versions on a case by case basis, but nothing new has come out that makes me want to upgrade yet.
And I'll probably keep using docker as long as it stays separate from all the cruft being added to the ecosystem.
One can use Ubuntu LXD which is Linux containers built on top of LXC but with ZFS as storage backend. LXD can also run Docker containers.
One can also use Linux containers via Kubernetes by Google.
The container part of Docker works well. And they've ridden that hype wave to try to run a lot of other pieces of your infrastructure by writing another app and calling it Docker Something. Now everybody means a different subset when they say "Docker".
I'm pretty sure that as long as the CLI version is >= the server version, you can set the DOCKER_VERSION env var to the server version and everything works.
I haven't used this extensively, so maybe there are edge cases or some minimal supported version of backwards compatibility?
zones in SmartOS provide full-blown UNIX servers running at the speed of the bare metal, but in complete isolation (no need for hacks like "runC", containers, or any other such nonsense).
Packaging one's software into OS packages provides for repeatability: after the packages are installed in a zone, a ZFS image with a little bit of metadata can be created, and imported into a local or remote image repository with imgadm(1M).
That's it. It really is that simple.
DC/OS is emerging as the go-to way to deploy docker containers at scale in complex service combinations. It 'just works' with one simple config per service.
I am shocked at how fragile etcd is in this way. I was hoping docker swarm was better, but I'm not surprised (alas) to find out that it has the same problem.
I'm about ready to build my own solution, because I know a way to do it that will be really robust in the face of partitions, and it doesn't use Raft (you probably should not be using Raft; I've seen lots of complaints about zookeeper too). I've done this before in other contexts, so I know how to make it work, but so have others, so why are people who don't know how to make it work reinventing the wheel all the time?
It definitely sounds like Swarm is not ready but I wouldn't say this is representative of running Docker in production: instead you should be running one of the many battle tested cluster tools like ECS, Mesos or Kubernetes (or GCE).
We use Cloud66 (disclaimer - not associated with them) to help with the deployment issues if any arise.
Also we don't store DB in containers.
Take for example k8s - I just started exploring it as something we could move to. https://github.com/kubernetes/kubernetes/issues/24677 - logging of application logs is an unsolved problem.
And most of the proposals talk about creating yet another logger...rather than patching journald or whatever exists out there.
For those of you who are running k8s in production - how are you doing logging? Does everyone roll their own?
The Github issue talks a lot about "out of the box" setup via kube-up. If you're not using kube-up (and I wouldn't recommend using it for production setups), the problem is rather simple.
The logging story could be better, but it's not "unsolved".
We don't want to stream our logs or set up fluentd, etc. I just want to make sure my logs are captured and periodically rotated. Now, the newer docker allows me to use journald as the logger (which means all logs are sent to journald on the HOST machine)... but I can't seem to figure out how to do this in k8s.
Also, as an aside: for production deployment on AWS, what would you suggest? I was thinking that kube-up is what I should use.
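For reference, the plain-docker side of the journald bit is just a log driver setting; the open question is the k8s wiring, since kubectl logs has historically assumed the json-file driver. A sketch:
# per container
$ docker run --log-driver=journald my-image
# or daemon-wide, e.g. in /etc/docker/daemon.json:
# { "log-driver": "journald" }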
(As an aside, I don't understand why more Unix software doesn't consistently use syslog() — we already have a good, standard, abstract logging interface which can be implemented however you want on the host. The syslog protocol is awful, but the syslog() call should be reliable.)
As for kube-up: It's a nice way to bootstrap, but it's an opaque setup that doesn't give you full control over your boxes, or of upgrades. I believe it's idempotent, but I wouldn't trust it to my production environment. Personally, I set up K8s from scratch on AWS via Salt + Debian/Ubuntu packages from Kismatic, and I recommend this approach. I have a Github repo that I'm planning to make semi-generic and public. Email me if you're interested.
If you're interested to see more about what it does behind the scenes, it is very similar to the full manual guide for setting up Kubernetes: https://coreos.com/kubernetes/docs/latest/getting-started.ht...
None of the container frameworks on Linux are ready, but that doesn't mean that's the case in general.
But if you read the bug, this doesn't work all that well in the wild. In fact, an entire spec is being written on it, and right now the debate is whether to forward to journald and have it proxy further, or to reimplement a new logging format.
After about 2 years of giving it another shot on and off, we just gave up. And it's not like we were doing something crazy, just a typical documented "run this rails app" type thing. I would definitely not use this in production for anything based on my experience.
I can imagine it being harder on some Linux distributions because your host system will have to have all of the kernel bits etc. Ubuntu 14.04 and above or current versions of Fedora should be good - the distributions explicitly support Docker, and Docker, Inc. provide packages so that you can run either the distribution packages or the Docker, Inc. ones.
We haven't touched Swarm and are staying a couple versions behind. With that said, we're pretty happy with the setup right now.
I have no complaints about ECS now, and in fact I'd say that ECS solves a lot of the problems that the author complains here. I'd say that Swarm is not production ready, but docker itself as an engine for running a container is fine. You just need to build your own orchestration layer, or use something like ECS.
More details on Airtime's docker setup here: https://techblog.airtime.com/microservice-continuous-integra...
If one container needs to talk to another container of a certain type it just uses the load balancer, and the load balancer takes care of directing the request to the other container which may be on the same machine, or another machine, or maybe even in a different cluster.
Obviously this forces you into a fully stateless design because the load balancer round robins all the requests across random containers, but its very fail proof, and gives you high resiliency.
The ECS approach is a pretty minimal to glue together a cluster of instances running docker and ELB, while Kubernetes has a lot more baked into the core implementation.
Yes, in an absolutely ideal world no one ever should touch a production environment, and ideally in every situation you find yourself having to touch a production system you should then be creating at the very least a backlog item to figure out how to avoid doing it in the future, but the reality is some kinds of troubleshooting are hard to do without having access. Generally the kinds of situations where you need that access are ones when you want access the quickest and easiest, i.e. the world is burning somehow.
We've been using CentOS for the production hosts and CentOS/OSX for developing on. There were a couple of minor issues when we started but the benefits outweighed the negatives then. It's been completely painless since Docker came out of beta. We have some containers that are deployed daily and others that run for months without being touched.
I've never quite understood the negative attitude towards Docker. Perhaps we are too small to see many of the issues that people often complain about, but moving to Docker was a great decision in our office.
I predict the Docker fad will end in 2-3 years, fingers crossed.
I work for what is, ostensibly, a competitor and I don't agree.
Containers are here to stay. Docker will pretty much have a brand lock on that for a long time.
You can do this, for sure, but, tbh, I'd still rather use Vagrant. I'm already provisioning things like my datastores using Chef (and not deploying those inside of Docker in production); I might as well repurpose the same code to build a dev environment.
I do agree regarding using containers (I use rkt rather than Docker for a few reasons) for CI being a really good jumping-off point, though.
I've used both Vagrant and Docker for dev, and find Docker to be super fast and light. (though I've definitely run into some headaches with Docker Compose that I didn't see in using just Docker)
Running datastores in Docker is bonkers in the first place, which still leaves a significant place for an actual CM tool--so unless you are a particular strain of masochist, when you are replicating your local environment, you'll probably need it anyway.
If you do a lot of active machine development, where you're iterating on that machine setup, it's an extremely slow process. Perhaps it would make sense to iterate in Docker, and then convert that process to Vagrant once you finish.
> similarly worrying about "light" when you have 16GB of RAM and a quad-core processor
It's a concern when you're launching many machines at once to develop against. Docker tends to only eat what it needs.
Also, while I think everyone should have a 16GB machine, there's quite a few developers out there running 8 or 4GB machines.
Containers are definitely here to stay. Google, Heroku, and others have been using them for years.
Obviously Docker is just a subset, and it might fade, but I think it's pretty well established as a defacto standard.
I think docker itself may fade but I view its popularity as jails finally catching on rather than a new bit of hype that's going to die down soon.
I don't. All my configuration management is done with OS packages. And once you start making OS packages to do mass-scale configuration, the whole premise of Docker becomes pointless. If I have OS packages, and I can provision them using KickStart, or JumpStart, or imgadm(1M) + vmadm(1M), without using or needing any code, then what exactly do I need Docker for?
Obviously this person has never spent a good deal of time dealing with AutoCAD...
Which is to say: this is a sign that the involved technologies are still rapidly maturing.
You can export and import machines with this handy node js tool: https://www.npmjs.com/package/machine-share
Their mindset is very tool driven: if there's a problem, let me just write a new tool to do that.
Ease of use and KISS aren't part of their philosophy.
While versions are an issue, it's at least a reasonable way to work around it.
It's not surprising that docker can't always work, but it's nice to see that programmers are winning. I guess future OS designers and developers will try to encourage more inter-compatibility where possible. That really touches a big nerve.