I’ve had all sorts of difficulties installing Docker. By hand it’s not trivial to get a secure install. Docker Machine is great except it’s often broken. The Docker Machine dev team is a tired, understaffed bunch that’s always playing a Sisyphean whack-a-mole against dozens of cloud providers and very needy posters on GitHub, myself included.
Kubernetes on the other hand is trivial with GKE.
It’s great for single node deployments. I run a single node on GKE and it’s awesome, easy, and very cheap. You can even run preemptible instances. The myth that kubernetes is complicated is largely perpetuated by the same kind of people who say React is complicated: the people who’ve not tried it.
And like React, once you try kubernetes you never go back. Kubernetes is actually the orchestration equivalent of React.
You declare what should be true, and Kubernetes takes care of the rest.
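To make the "declare what should be true" idea concrete, here is a toy sketch of a reconcile step. It is not Kubernetes code: the function names `reconcile` and `apply_actions` and the model of state as a dict of name to replica count are made up for illustration.

```python
def reconcile(desired, actual):
    """Return the actions that move `actual` toward `desired`.

    Both arguments are dicts mapping a workload name to a replica count.
    This is a hypothetical model, not how Kubernetes stores state.
    """
    actions = []
    # Start whatever is missing, stop whatever is surplus.
    for name, want in sorted(desired.items()):
        have = actual.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Anything running that was never declared gets stopped entirely.
    for name, have in sorted(actual.items()):
        if name not in desired and have > 0:
            actions.append(("stop", name, have))
    return actions


def apply_actions(actual, actions):
    """Apply the actions to a copy of `actual` and return the new state."""
    state = dict(actual)
    for verb, name, count in actions:
        delta = count if verb == "start" else -count
        state[name] = state.get(name, 0) + delta
        if state[name] == 0:
            del state[name]
    return state


desired = {"web": 2, "db": 1}
actual = {"web": 1, "cache": 1}
actions = reconcile(desired, actual)
new_state = apply_actions(actual, actions)
```

You only ever edit `desired`. Running the loop again once the state matches yields no actions, which is the part that feels like React.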
And the features it provides are useful for any-sized application! If you try kubernetes you quickly discover persistent volumes and statefulsets, which take most of the complexity out of stateful applications (i.e. most applications). You also discover ingress resources and controllers, which make trivial so many things that are difficult with Swarm, like TLS termination. Swarm doesn’t have such features, which any non-trivial app (say, Django, WordPress, etc.) benefits from tremendously.
How do I install GKE on my servers? ;)
> By hand it’s not trivial to get a secure install.
The default install (basically, adding a repo and apt-get install docker-ce on Debian and derivatives - trivial to automate with Ansible) is reasonably secure if you view Docker as a tool for packaging and task scheduling with some nice extras and don't buy the marketed isolation properties. It only listens for commands on a local socket, and permissions are sane. I haven't looked into Swarm mode protocol traffic, though, but I don't think it's tweakable anyway.
> The myth that kubernetes is complicated is largely perpetuated by the same kind of people who say React is complicated: the people who’ve not tried it.
I've tried K8s. I've set up a test cluster, it worked, I wrote some YAML, it worked, all good. So I worsened the conditions (explicitly going into "I want things to break" territory) and made it fail. Then I researched how hard it is to diagnose the problem and fix it - it turned out to be complicated. At least, for me. It just felt like "if something goes wrong here, I'll have a bad time trying to fix it". Surely, this is not the case on GKE, where you don't run and don't manage the cluster.
I had somewhat similar experience with Docker and Docker Swarm mode, and it was significantly easier to dive into the code, find out the relevant parts and see what's going on.
> difficult with Swarm, like TLS termination
YMMV, but I just add some labels to the service and Traefik does the rest. ;)
(But, yeah, Traefik with Swarm requires some reasonable but not exactly obvious networking logic. May take one deployment of "why I'm getting 504s?!" to figure it out. And Traefik needs access to manager nodes to work with built-in Swarm service discovery.)
Great question! All major Cloud providers offer managed Kubernetes services:
The choice is Cloud + Kubernetes vs. roll everything on your own hardware.
Running your own hardware is a major IT effort. Kubernetes is just a part of that effort. Then you have to take care of planning, provisioning, logging, monitoring, alerting, auditing, networking, storage, and on-call.
You either run Cloud + Kubernetes and get rolling in 15 minutes, or you hire IT headcount.
For Enterprise nerds VMWare has also just launched a Kubernetes service, so the #1 Enterprise VM supplier just added k8s on-prem for everyone who is in a hybrid cloud situation.
You can always rent dedicated hardware – much cheaper than renting similar amounts of virtual servers, and the same administration and operations costs. It’s often even significantly cheaper than what GCE offers – but of course, setting up Kubernetes is more complicated.
If you are a small startup looking to validate market fit, your best bet is Cloud + Kubernetes. If you are an established business with millions of daily customers and serious IT headcount budget, you may look into roll-your-own. The best orchestrator at that scale is, again, Kubernetes.
For a startup, building a datacenter isn’t required in the beginning.
I meant, literally renting existing hardware, in an existing datacenter. This is just a single step from renting VMs at AWS or GCE, but already improves costs significantly.
Edit: got user names wrong
I've deployed a handful of servers automatically myself this way, and I'm still a student.
Saving about an order of magnitude in terms of costs, gaining flexibility, and the only additional work is so low that it costs a compsci student half an hour once (aka ~10€ in wages) — there's no reason not to do it.
I agree with what you say. I'm not trying to say people should all jump to k8s. Having options on the market is great.
But I was trying to refute the notion that Kubernetes has no advantages unless you're running a huge cluster. My main points were:
* It works great with 1 node.
* It comes with many features that Swarm does not have that are useful even at 1 node (PersistentVolumes, StatefulSets are biggest for me, though there are _many_ more I wouldn't want to go without anymore).
* Docker is not trivial to set up, either.
> How do I install GKE on my servers? ;)
Yes, of course. I was just saying there's a solid option to start exploring quickly.
> It only listens for commands on a local socket.
This is kind of a non-starter, isn't it? Of course it's easy to apt-get install docker, but then you want to control it remotely, right? Once you realize how nice it is to control Docker remotely, it's hard to imagine life before.
However, Swarm is undeniably simpler to work with unless you have very specific requirements that only K8S provides. The yml file is incredibly simpler.
Docker Swarm is the Kotlin to Kubernetes' Java. It's a much pleasanter and much less intimidating way to build production systems that scale pretty well.
Kubernetes needs you to have load balancers setup which can talk K8S ingress. Bare metal setup for k8s is a huge effort (and I have stuck long enough on #sig-metal to know) as compared to Swarm.
You should try out Swarm - you might choose not to use it, but you will enjoy the experience.
Do you have a concrete example of what you ran into? What do you mean by "secure install"?
On OS X it's just a matter of installing docker4mac, and Linux distributions have pre-made packages. I am a Linux noob and I was able to set up a 20-machine cluster with Swarm trivially on CentOS 7.
I was actually surprised that I was able to do that so trivially, given I have minimal Linux admin experience, if any.
You're absolutely correct. Kubernetes has its advantages, even in a single-node setup. What many others are pointing out is that it also has significant disadvantages, too.
> but then you want to control it remotely, right?
By the way, I do talk to Docker (Swarm mode or standalone) deployments remotely, forwarding the socket via SSH.
ssh -nNT -L /tmp/example.docker.sock:/run/docker.sock example.org
docker -H unix:///tmp/example.docker.sock info
But, really, there is Ansible for this. Declare `docker_service` and get it deployed (or updated) as a part of the playbook. And ELK (with logging.driver=gelf for the containers or application-level) for logging.
(BTW, what's nice about K8s is that IIRC it lets you exec into pods with just `kubectl exec`. With Swarm you need some script to discover the right node and container ID, SSH into it and run `docker exec` there - no remote access.)
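For the Swarm side, the "discover the node and container, then SSH" script could look roughly like the sketch below. Assumptions, all mine: it runs on a manager node, the node names reported by Swarm are reachable as SSH hosts, and the helper names (`parse_tasks`, `exec_command`, `find_and_exec`) are hypothetical.

```python
import subprocess


def parse_tasks(ps_output):
    """Parse '<task-id> <node>' lines into (task_id, node) tuples."""
    tasks = []
    for line in ps_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            tasks.append((parts[0], parts[1]))
    return tasks


def exec_command(node, container_id, command):
    """Build the ssh + docker exec argv for one container on one node."""
    return ["ssh", "-t", node, "docker", "exec", "-it", container_id, *command]


def find_and_exec(service, command):
    """Exec into the first running task of a service (assumes a manager node)."""
    ps = subprocess.run(
        ["docker", "service", "ps", "--filter", "desired-state=running",
         "--format", "{{.ID}} {{.Node}}", service],
        check=True, capture_output=True, text=True,
    ).stdout
    tasks = parse_tasks(ps)
    if not tasks:
        raise RuntimeError(f"no running tasks found for service {service!r}")
    task_id, node = tasks[0]
    # The task object records the ID of the container backing it.
    container_id = subprocess.run(
        ["docker", "inspect", "--format",
         "{{.Status.ContainerStatus.ContainerID}}", task_id],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    return subprocess.call(exec_command(node, container_id, command))
```

Something like `find_and_exec("web", ["sh"])` would then drop you into a shell in one replica. It works, but it is clearly more ceremony than a single `kubectl exec`.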
Please get some experience with that, then re-evaluate whether Docker or Kubernetes is easier for small deployments.
But with your easier Swarm setup, how do you then attach cloud disks directly to your docker container, like a PersistentVolume affords? That one feature makes basically anything worth it, IMO. Most apps are stateful.
Usually I just pin containers to hosts. This is fine on a small setup. In fact, many of my small setups are just a single host.
Try ProxMox. They have it built-in.
Yes, true, but that's apple vs oranges. Swarm doesn't compete against GKE, it competes against Kubeadm. Because sometimes one actually can't or won't use cloud services.
I don't understand your comment. Yeah, using something preinstalled in the cloud is easier than installing something on your own.
$13 per month per egress rule, bandwidth costs a factor of 100 more expensive than dedicated hosters, and costs for hardware 10x of what dedicated hosters offer.
I’m not sure what your definition of "more expensive" is, but compared to renting dedicated hardware (e.g., from Hetzner), or colocating, GKE is significantly more expensive.
Of course, if you’re on the scale of Spotify, you can get much better deals – but from what I see on https://cloud.google.com/products/calculator/, it’s not cost-effective.
That’s why I disagreed with the "GKE isn’t more expensive".
Depends on what scale you're at, and on the fact that you're vendor-locked: you can't use GKE in other data centers.
I regularly use a script like this, and it's never failed me unless it runs out of disk space: https://github.com/coventry/InformationDropout/blob/master/p...
I haven't used but I have used DC/OS, Mesos, and Marathon extensively, which is not a setup I'd do for a small number of nodes personally.
But I stress it's not about the scaling. It's about the features even at a single node. I wouldn't be spending time writing this stuff had it not been revelatory for me.
PS. If you're not able to try it on GKE, there's also https://github.com/kubernetes/kops. GKE is great for trying, at least, though. Just get a cluster running and get rolling with the concepts. Then you'll know what it's all about. There isn't that much to it, and what there is to it is great and well documented with a great community.
I've personally tried both orchestration options and much prefer Swarm for smaller deployments. Sure, k8s could prepare you better for the future if your app takes off to the moon, but that's just premature optimization.
Also I challenge the $10 because what cloud instances plus ancillary services add up to $10? I’m talking 2 CPUs and 7.5gb RAM.
I would like to note that DC/OS now has Kubernetes [Beta]. You can manage Kubernetes resources on your DC/OS cluster now.
Here are the instructions on Github to get started: https://github.com/mesosphere/dcos-kubernetes-quickstart
You can also join the Slack to get help: http://chat.dcos.io/ (the #kubernetes channel)
Would you use DC/OS-marathon (vs k8s) if you were to make that decision now?
I've heard that mesos stack is good for machine learning/big data stacks but how does marathon compare for deploying webapps.
Despite going with mesos I really had to contend with the fact that k8s just has way, way more developer support - there are so many rich applications in the k8s sphere. Meanwhile I can probably name all the well supported Mesos frameworks offhand. Next, marathon "feels" dead. They recently killed their UI, as I imagine that they are having trouble giving resources to marathon. 3 years ago I wanted a reverse proxy solution that integrated with mesos as well as non-mesos services, so I hacked Caddy to make that work. 3 years later, I was looking for a similar solution and found traefik. It claimed to work with mesos/marathon, but the marathon integration was broken and the mesos integration required polling even though mesos had an events API, so I hacked traefik to make that work. On the other side of the fence, you have companies like Buoyant who rewrote a core piece of their tech (Linkerd) just to support K8s (and only K8s). This has a compounding effect, where over the years things will just become more reliant on assuming you are running k8s.
That "cost" you pay to setup Mesos/k8s is usually a one time cost on the order of a month. I feel however, that k8s is going to give you a better ROI (Unless, you are managing 100s of nodes with Spark/YARN/HDFS, then Mesos continues to be the clear winner).
The DC/OS stack is nice for web apps, APIs, scheduled tasks, etc once you get it up and running. If I were just deploying small APIs and web apps, I would use something more managed like Lambda or Heroku/similar myself depending on the use case. That said, I'm more of an app dev than an ops person.
(Disclosure: I run CNCF, which hosts Kubernetes and funded the EdX course.)
Make sure you do a context-switch for kubectl too.
I see some people talking about Swarm vs Kubernetes. Swarm has always maintained a less modular approach which made it simpler to work with - it's obviously not going anywhere since there are customers relying on it in production.
Out of the two it's the only one that can actually fit into and run on a Raspberry Pi Zero because K8s has such high base requirements at idle in comparison. For a better comparison see my blog post - https://blog.alexellis.io/you-need-to-know-kubernetes-and-sw...
While the two solutions are obviously with different goals in mind, one being to fully run the kubernetes setup locally, and the other one to run a few docker containers who talk to one another, if it's for the purpose of running a simple-ish dev environment, in my experience, docker-compose is much faster and simpler than minikube.
It's basically an extended version of native k8s yaml with some smart defaults and grouping resources similar to compose.
As for bug reports - zero feedback from anyone on that thread from maintainers. If you are one of the maintainers - it might be good to write this comment on that thread instead of HN
* - I understand that web developers might be a small percentage of users and my case doesn't represent everybody
If you want to help send in diagnostic reports for the Docker guys.
Find out what is causing the CPU spikes with
Docker for Mac/Windows is a great product - it has allowed me to roll out Docker to developers who wouldn't otherwise deal with Vagrant or other solutions
Right now, I am running 10 containers on my Mac, and my highest CPU consumer is IntelliJ. Docker is running consistently at about 5-10% of the CPU.
I just installed Docker edge. We'll see how it goes in the long run but so far with k8s enabled hyperkit uses 25% of a virtual core.
Anyways, I envision this being very useful for development, may even replace my docker-compose based test setup.
Many people ask why would someone use an orchestrator on a small cluster (dozens of hosts). Why not? Swarm is very easy to manage and maintain, using Puppet or Ansible is not less complicated at all.
The future of Docker, Inc. is of course Docker EE, and the future of Docker EE is _not_ the scheduler, it's everything around it.
The idea that dockerized software somehow is less dependent on configuration management seems to be a popular and completely misguided one. The two trends are completely separate, but I would argue from experience that unless you have absolutely nailed the configuration and integration of all your moving parts, don't even look at containers yet.
Containers tend to lead to more moving parts, not less. And unless you know how to configure them, and perhaps even more importantly how to test them, that will only make matters worse.
This is more than naive. As long as your software needs any kind of configuration, there is a need for configuration management. There will be access tokens, certificates, backend configuration, partner integration of various kinds, and monitoring and backup configuration and you will want guarantees that these are consistent for your various testing and staging environments. You will want to track and bisect changes. You can either roll your own framework for this or use Ansible/Puppet.
Whether you distribute your software pieces with tarballs, Linux packages or Docker images is completely orthogonal to how you tie these pieces into a working whole. And the need for configuration management absolutely increases when moving towards containerized solutions, not because of the change in software packaging format but because of the organizational changes most go through, where more people are empowered to deploy more software, which can only increase integration across your environment.
I see organizations that have ignored this because they believe this magic container dust will alleviate the need of keeping a tight grip over what they run, and find themselves with this logic spread over their whole toolchain instead. That's when they need help cleaning up the mess.
Containerization does not result in less moving parts in a given environment; but it does result in less moving parts across the whole development flow (from a developer laptop to a production cluster).
Then everything you do on top of that is handled by the container scheduler, and your containers.
I’ve not seen PXE used anywhere that the DC wasn’t O&O (or essentially close to it). As that’s the exception to the rule these days, isn’t your premise a bit cavalier?
I’ve used PXE a lot in my past [1] to great benefit (well, more specifically iPXE through chainloading), so I’m not detracting from it, just saying its applicability is limited for most folks.
[1] I wrote “Genesis” for Tumblr, which handled initial machine provisioning from first power-on after racking to being ready for deploys.
Really? Is that a new thing? I don’t remember that last time I was reading the docs, but maybe I missed it (that it’s the “preferred” method rather than being just “an option”).
Seems rather odd to limit your audience like that in the age of “cloud everything” as I think it’s generally more rare that folks fully control their layer 2, but what do I know. :)
But in either case, you want the config to be centrally stored, so you can modify it without having to ssh onto every machine.
If you’re on-prem with VMware or OpenStack there are better options you can use, but that’s not exactly bare metal. In those environments CoreOS recommends using different provisioning options more suited to those providers.
I’d love to know who these mythical folks are?
1) dozens of hosts (heck, hosts >1) is exactly why you need orchestration
2) while there are huge deployments across the globe, I wouldn’t consider “dozens of hosts” small by any means. That’s actually probably above average.
3) k8s is actually easier to maintain than you allude. I see these comments about Swarm over k8s generally from folks who never even tried it (or did so years ago), is that the case here?
It just sounds accurate to say it is like trying to be like Google. Well if it is but less complicated then there is no down side.
The complexity overhead of kubernetes may be premature when you don't have massive requirements
Azure also comes with a Swarm-mode deployment template which is pretty great (https://github.com/Azure/acs-engine/blob/master/docs/swarmmo...). However, setting up and running Swarm has been a pleasure.
UCP = Universal Control Plane (web app to manage clusters and containers)
HRM = Host routing mesh (I think). If I recall correctly, it’s used to control ingress to the cluster via virtual host routing, etc.
They are paid Docker Inc. products. I worked for Docker once upon a time so I’m happy to hear there are happy customers :)
Other than that, how have you found EE? We have considered paying for it, but after seeing Docker Cloud at $15 per node per month, we are considering switching to it.
However, it does not do termination. You should trivially include a haproxy/nginx pod/service/vm to do the termination for you.
One of the cool new infrastructure features is Docker Cloud, which allows you to "bring your own nodes" for $15/node/month and set up Swarm. https://docs.docker.com/docker-cloud/cloud-swarm/using-swarm...
I believe it benched a bit worse than the comp but for most cases you're not going to run into issues.
All you need to do is create a docker service out of traefik/nginx/haproxy/whatever and bind the ports of the service to type "ingress". From then on, all traffic is internal - and you deal with it normally from one docker service to another.
I've been looking for a Dokku replacement for running my side-projects on with easy scalability (by provisioning another server), but I haven't found anything that's suitable for running multiple apps together in a heroku-like way but still able to scale beyond a single server. Do you know of anything like that?
For example, I have ten applications, and each requires a database, a redis instance, a celery instance and two web workers. Dokku lets me deploy these independently of each other, but uses the same nginx instance and proxies them transparently.
As I understand it, Swarm has no notion of multiple projects. Each swarm is running a single deployment, where all containers are equal, is that correct?
Basically, Dokku is a self-hosted Heroku, which is what I need (I want to be able to easily create a project that I can run semi-independently of the others on the same server). My understanding is that, to do that with Swarm, I'd have to have a single termination container that would connect to every app, but apps wouldn't be any more segregated than that. Maybe I'm complicating things, though. Have you used Swarm for such a use case?
I tried the official tutorial, but couldn't get it to work, as the instructions appeared outdated and didn't work for single-host deployments, and were geared more towards Windows and Mac than Linux. Would you happen to have a good "getting started" document? All my apps are already using docker-compose.
EDIT: Also, a machine that's a manager doesn't want to join the swarm as a worker as well, that's why I'm saying that it doesn't appear good for single-server deployments:
> Error response from daemon: This node is already part of a swarm. Use "docker swarm leave" to leave this swarm and join another one.
Also, if you are dead-set on making them entirely separate, every separate "application" is a separate "Stack". So you can stack deploy them separately.
If I had to do what you just told me - a single nginx proxying to two different "applications" - I would do this:
1. stack 1 - application 1 + network 1
2. stack 2 - application 2 + network 1
3. stack 3 - nginx + network 1
Now you can deploy any of them independently. You can make this even more sophisticated by having each stack on a different overlay network (encrypted as well), with nginx bridging between them.
Not sure why you are facing problems with the official tutorial - btw, a manager is a worker ;) I have fairly large dev swarms on a single node.
We rolled our own go program that just does go tpl substitution in yaml with overrides like helm. Works on charts out of the box, but instead of talking to a service, it outputs a yaml manifest ready to kubectl apply.
We've thought about open sourcing it. It took us literally a day to put together and has worked without flaw for 8 months in production.
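The general shape of "template substitution with overrides" is small enough to sketch. This is not the Go program described above: it swaps Go templates for Python's `string.Template`, and the names (`deep_merge`, `flatten`, `render`, the `MANIFEST` fragment and its values) are invented for illustration.

```python
from string import Template


def deep_merge(base, override):
    """Return `base` with `override` layered on top, merging nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def flatten(values, prefix=""):
    """Turn {'image': {'tag': 'x'}} into {'image_tag': 'x'} for Template."""
    flat = {}
    for key, value in values.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "_"))
        else:
            flat[name] = value
    return flat


def render(template_text, defaults, overrides):
    """Render a manifest; a missing value raises KeyError instead of passing silently."""
    values = flatten(deep_merge(defaults, overrides))
    return Template(template_text).substitute(values)


# Illustrative fragment only, not a complete Kubernetes manifest.
MANIFEST = """\
metadata:
  name: $name
spec:
  replicas: $replicas
  image: $image_repo:$image_tag
"""

defaults = {"name": "web", "replicas": 1, "image": {"repo": "example/web", "tag": "1.0"}}
overrides = {"replicas": 3, "image": {"tag": "1.1"}}
rendered = render(MANIFEST, defaults, overrides)
```

The rendered text is plain YAML, so it can be reviewed or diffed before it is handed to `kubectl apply`, with no in-cluster service involved.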
* Render the chart and run it through kube-cloud-build to make sure my containers are all built before I deploy: `helm template /path/to/chart | kube-cloud-build -r repo`
* Deploy with helm: `helm upgrade RELEASE_NAME /path/to/chart`
Has been working really well for me.
Swarm was always much simpler to get started with, so some people will prefer it. But it's clear to me which one won.
I really wish I could remember what SPOF was pertaining to, but I just can't remember. Does anyone have any idea if this is still relevant/accurate information?
He told me this maybe 2-3 years ago, so I was wondering how things have changed since then, or if anyone knows what he might have been talking about.
Note that even if you don't have HA, Kubernetes being a SPOF isn't necessarily critical. Barring some kind of catastrophic, cascading fault that affects multiple nodes and requires rescheduling pods to new nodes, a master going down doesn't actually affect what's currently running. Autoscaling and cronjobs won't work, clients using the API will fail, and failed pods won't be replaced, but if the cluster is otherwise fine, pods will just continue running as before. By analogy, it's a bit like turning off the engines during spaceflight. You will continue to coast at the same speed, but you can't change course.
If you want a standalone installation of Kubernetes on a Mac try minikube.
guillaumerose was referring to all versions, so both "all OS X versions" and "all macOS versions" would be wrong, no...?
For example, Wikipedia has a page called "OS X Yosemite" which describes it as "A version of the macOS operating system", and the Wikipedia article on macOS says it was first released in 2001.
Because it's a bother to keep in mind when they introduced the new name? Or release names? It's hard to imagine that someone would actually be talking about certain number of releases this way.
PS: Apple's stupid marketing bs strikes again.
Because I already have Docker for Mac installed to be able to build and test images I think it's useful to have this local k8s integration.
Given that running docker on a Mac has been a long and bumpy journey, I’m not sure I’d want to bundle the two together.
What am I missing?
If Docker's local Kubernetes install provides a way to connect to custom registries (e.g. GCR) without installing a third party plugin onto every minikube, I'd consider that a major win.
Docker is primarily for running Linux applications on Linux (yes, I know there are things like Joyent SDC, Docker Engine on Windows etc).
But anyway, thanks for the info :)
* Docker is useful for production and has various other benefits, and Docker for Mac is a nice way to develop locally with Docker even if it's not as efficient as on Linux.
* Docker for Mac uses some built-in virtualization tools in macOS to share network and filesystem more efficiently than you could do with the older VirtualBox approach. So it's maybe a little closer to native OS support than you're thinking.
* A typical configuration has a single Linux VM holding many Docker containers, which is better than the alternative of many VMs.
I assumed the main idea of Docker was reproducible builds on different machines and wanted to use it for building iOS apps.
macOS doesn't support native containers. Windows does have Windows containers. Docker on Windows can run either Linux or Windows containers.