Docker for Mac with Kubernetes (docker.com)
285 points by watoc on Jan 6, 2018 | 164 comments

So confused by all the posts from people who say they run Swarm because kubernetes is too complicated or is only for huge deployments.

I’ve had all sorts of difficulties installing Docker. By hand it’s not trivial to get a secure install. Docker Machine is great except it’s often broken. The Docker Machine dev team is a tired, understaffed bunch that’s always playing a Sisyphean game of whack-a-mole against dozens of cloud providers and very needy posters on GitHub, myself included.

Kubernetes on the other hand is trivial with GKE. It’s great for single node deployments. I run a single node on GKE and it’s awesome, easy, and very cheap. You can even run preemptible instances. The myth that kubernetes is complicated is largely perpetuated by the same kind of people who say React is complicated: the people who’ve not tried it.

And like React, once you try kubernetes you never go back. Kubernetes is actually the orchestration equivalent of React. You declare what should be true, and Kubernetes takes care of the rest.

And the features it provides are useful for any-sized application! If you try Kubernetes you quickly discover persistent volumes and StatefulSets, which take most of the complexity out of stateful applications (i.e. most applications). You also discover Ingress resources and controllers, which make trivial so many things that are difficult with Swarm, like TLS termination. Swarm doesn’t have such features, which any non-trivial app (say, Django, WordPress, etc.) benefits from tremendously.
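As a rough sketch of the kind of Ingress resource this refers to (the hostname, secret, and service names here are hypothetical, and the API version is the one current as of early 2018), TLS termination can be declared like this:

```yaml
# Hypothetical sketch: an Ingress that terminates TLS for a web app.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: django-app
spec:
  tls:
    - hosts:
        - app.example.com
      secretName: app-example-com-tls   # cert + key stored as a Secret
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: django-app   # existing Service to route to
              servicePort: 8000
```

An ingress controller (nginx, Traefik, or GKE's built-in one) watches these resources and configures the actual proxy.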

> Kubernetes on the other hand is trivial with GKE

How do I install GKE on my servers? ;)

> By hand it’s not trivial to get a secure install.

The default install (basically, adding a repo and apt-get install docker-ce on Debian and derivatives - trivial to automate with Ansible) is reasonably secure if you view Docker as a tool for packaging and task scheduling with some nice extras and don't buy the marketed isolation properties. It only listens for commands on a local socket, and permissions are sane. I haven't looked into Swarm mode protocol traffic, though, but I don't think it's tweakable anyway.

> The myth that kubernetes is complicated is largely perpetuated by the same kind of people who say React is complicated: the people who’ve not tried it.

I've tried K8s. I've set up a test cluster, it worked; I wrote some YAML, it worked; all good. So I worsened the conditions (explicitly going into "I want things to break" territory) and made it fail. I researched how hard it would be to diagnose the problem and fix it - it happened to be complicated. At least, for me. It just felt like "if something goes wrong here, I'll have a bad time trying to fix it". Surely, this is not the case on GKE, where you don't run and don't manage the cluster.

I had somewhat similar experience with Docker and Docker Swarm mode, and it was significantly easier to dive into the code, find out the relevant parts and see what's going on.

> difficult with Swarm, like TLS termination

YMMV, but I just add some labels to the service and Traefik does the rest. ;)

(But, yeah, Traefik with Swarm requires some reasonable but not exactly obvious networking logic. It may take one deployment of "why am I getting 504s?!" to figure it out. And Traefik needs access to manager nodes to work with built-in Swarm service discovery.)
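For readers curious what "just add some labels" looks like, here is a hedged sketch of a Traefik 1.x setup on Swarm; the service name, network name, and host are made up, and note that on Swarm the labels go under `deploy.labels`:

```yaml
version: "3.4"
services:
  app:
    image: myorg/app:latest            # hypothetical image
    networks:
      - traefik-net                    # overlay network shared with Traefik
    deploy:
      labels:
        traefik.enable: "true"
        traefik.port: "8000"                     # container port to route to
        traefik.frontend.rule: "Host:app.example.com"
        traefik.docker.network: "traefik-net"    # network Traefik should use
networks:
  traefik-net:
    external: true
```

Omitting `traefik.docker.network` when a service is attached to more than one network is the classic source of the 504s mentioned here.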

> How do I install GKE on my servers? ;)

Great question! All major Cloud providers offer managed Kubernetes services: Google's GKE, Amazon's EKS, and Azure's AKS.

The choice is Cloud + Kubernetes vs. roll everything on your own hardware.

Running your own hardware is a major IT effort. Kubernetes is just a part of that effort. Then you have to take care of planning, provisioning, logging, monitoring, alerting, auditing, networking, storage, and on-call.

You either run Cloud + Kubernetes and get rolling in 15 minutes, or you hire IT headcount.

EKS isn't generally available

> Running your own hardware is a major IT effort. Kubernetes is just a part of that effort.

For Enterprise nerds, VMware has also just launched a Kubernetes service, so the #1 Enterprise VM supplier just added k8s on-prem for everyone who is in a hybrid cloud situation.

> You either run Cloud + Kubernetes and get rolling in 15 minutes, or you hire IT headcount.

You can always rent dedicated hardware – much cheaper than renting similar amounts of virtual servers, and the same administration and operations costs. It’s often even significantly cheaper than what GCE offers – but of course, setting up Kubernetes is more complicated.

Exactly. Hardware rental is one way to tackle provisioning. You're still left with all the other tasks required to bootstrap your own datacenter. As you build up the roll-your-own solution, you end up in the same place: hire IT headcount.

If you are a small startup looking to validate market fit, your best bet is Cloud + Kubernetes. If you are an established business with millions of daily customers and serious IT headcount budget, you may look into roll-your-own. The best orchestrator at that scale is, again, Kubernetes.

> Exactly. Hardware rental is one way to tackle provisioning. You're still left with all the other tasks required to bootstrap your own datacenter. As you build up the roll-your-own solution, you end up in the same place: hire IT headcount.

For a startup, building a datacenter isn’t required in the beginning.

I meant, literally renting existing hardware, in an existing datacenter. This is just a single step from renting VMs at AWS or GCE, but already improves costs significantly.

I think pacala's point still stands - if you rent hardware, there is still a larger effort involved around running scalable services (DR, auto-scaling/load balancing, deployments, etc.). If you rent hardware, you will likely need a larger share of resources dedicated to maintaining that than with OOTB service providers.

Edit: got user names wrong

You still have to go from bare metal in someone else's DC to having your software running on top of it. Who's going to do all that config work? You're spending resources either way.

As I said before, thanks to Container Linux, that's 3 lines in a JSON config.

I've deployed a handful of servers automatically myself this way, and I'm still a student.

Saving about an order of magnitude in costs, gaining flexibility, and the additional work is so small that it costs a compsci student half an hour once (aka ~10€ in wages) — there's no reason not to do it.
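For context, the Container Linux workflow mentioned here is driven by an Ignition file; a minimal sketch (the SSH key is a placeholder) really is little more than a version declaration plus a user:

```json
{
  "ignition": { "version": "2.1.0" },
  "passwd": {
    "users": [
      {
        "name": "core",
        "sshAuthorizedKeys": ["ssh-rsa AAAA...placeholder"]
      }
    ]
  }
}
```

Ignition runs once on first boot and provisions users, disks, and systemd units from this single file.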

That's not true in a whole lot of industries. Many (!) small shops simply can't choose the cloud because of various reasons (regulations, for example).

Thanks for the reply.

I agree with what you say. I'm not trying to say people should all jump to k8s. Having options on the market is great.

But I was trying to refute the notion that Kubernetes has no advantages unless you're running a huge cluster. My main points were:

* It works great with 1 node.

* It comes with many features that Swarm does not have that are useful even at 1 node (PersistentVolumes, StatefulSets are biggest for me, though there are _many_ more I wouldn't want to go without anymore).

* Docker is not trivial to set up, either.
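To make the second bullet concrete, here is a hedged sketch of the StatefulSet + PersistentVolume pattern; the image, names, and sizes are hypothetical:

```yaml
# Hypothetical sketch: a single-replica PostgreSQL StatefulSet whose data
# directory lives on a dynamically provisioned PersistentVolume.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:10
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:                # one PVC per replica, auto-created
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

On GKE the claim binds to a GCE persistent disk that survives pod rescheduling.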

> How do I install GKE on my servers? ;)

Yes, of course. I was just saying there's a solid option to start exploring quickly.

> It only listens for commands on a local socket.

This is kind of a non-starter, isn't it? Of course it's easy to apt-get install docker, but then you want to control it remotely, right? Once you realize how nice it is to control Docker remotely, it's hard to imagine life before.

Actually if you are using StatefulSets and PV, then kubernetes is a better fit for you.

However, Swarm is undeniably simpler to work with unless you have very specific requirements that only K8s provides. The YAML file is far simpler.

Docker Swarm is the Kotlin to Kubernetes' Java. It's a much more pleasant and much less intimidating way to build production systems that scale pretty well.

Kubernetes needs you to have load balancers set up which can talk to a K8s Ingress. Bare-metal setup for k8s is a huge effort (and I have stuck around long enough on #sig-metal to know) compared to Swarm.

You should try out Swarm - you might choose not to use it, but you will enjoy the experience.

Completely agree. Tried them both and found Docker Swarm to be much simpler to set up and maintain.

> Docker is not trivial to set up, either.

Do you have a concrete example of what you ran into? What do you mean by "secure install"?

On macOS it's just a matter of installing Docker for Mac, and Linux distributions have pre-made packages. I am a Linux noob and I was able to set up a 20-machine cluster with Swarm trivially on CentOS 7.

I was actually surprised that I was able to do that so trivially, given that I have minimal Linux admin experience, if any.

I’m referring to all the stuff Docker Machine does beyond apt-get install docker. It sets up TLS certs and sockets so you can control the Docker daemon remotely. When Docker Machine works it’s great, but when it doesn’t it’s frustrating.

It's built into Docker Swarm, AFAIK. You don't need any special setup for TLS.


> Kubernetes has no advantages unless you're running a huge cluster.

You're absolutely correct. Kubernetes has its advantages, even in a single-node setup. What many others are pointing out is that it also has significant disadvantages.

> but then you want to control it remotely, right?

By the way, I do talk to Docker (Swarm mode or standalone) deployments remotely, forwarding the socket via SSH.

    ssh -nNT -L /tmp/example.docker.sock:/run/docker.sock example.org
    docker -H unix:///tmp/example.docker.sock info
(Needs user to have access to the socket, of course. If sudo with password-requirements is desirable, `ssh + sudo socat` is a viable alternative.)

But, really, there is Ansible for this. Declare `docker_service` and get it deployed (or updated) as a part of the playbook. And ELK (with logging.driver=gelf for the containers or application-level) for logging.
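A hedged sketch of what that `docker_service` usage might look like (Ansible 2.x module; the image, host group, and GELF endpoint are made up):

```yaml
- hosts: docker_hosts
  tasks:
    - name: Deploy (or update) the app stack
      docker_service:
        project_name: app
        definition:
          version: "2"
          services:
            web:
              image: myorg/web:latest
              ports:
                - "80:8000"
              logging:
                driver: gelf                 # ship container logs to ELK
                options:
                  gelf-address: "udp://logs.example.org:12201"
        state: present
```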

(BTW, what's nice about K8s is that IIRC it allows you to exec into pods with just `kubectl exec`. With Swarm you need some script to discover the right node and container ID, SSH into it, and run `docker exec` there - no built-in remote exec.)

I appreciate where you’re coming from, but you can’t bring GKE into a conversation about the challenges of Kubernetes ops. GKE does everything for you.

I guess you are correct. What about Docker Machine vs Kops, then?

You're running on GKE. Of course that's easy, they're handling all the difficult parts for you! It's still not that difficult running it on your own servers, but there are many more pain points.

Please get some experience with that, then re-evaluate whether Docker or Kubernetes is easier for small deployments.

Right, I agree.

But with your easier Swarm setup, how do you then attach cloud disks directly to your docker container, like a PersistentVolume affords? That one feature makes basically anything worth it, IMO. Most apps are stateful.

You don't, because you're running on your own hardware, and there is no concept of cloud disks. There are some solutions like Ceph, but they're tough to set up.

Usually I just pin containers to hosts. This is fine on a small setup. In fact, many of my small setups are just a single host.
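Pinning a container to a host in a Swarm stack file is a one-line placement constraint; a sketch with made-up names:

```yaml
version: "3.4"
services:
  db:
    image: postgres:10
    volumes:
      - /srv/postgres:/var/lib/postgresql/data   # host-local storage
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == db-host-1           # pin to a single node
```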

> Ceph tough to setup

Try Proxmox. They have it built in.

NFS mounts, provisioned via the same configuration management platform the hardware is provisioned by (Saltstack in my case)

"Kubernetes on the other hand is trivial with GKE."

Yes, true, but that's apples vs. oranges. Swarm doesn't compete against GKE, it competes against kubeadm. Because sometimes one actually can't or won't use cloud services.

I guess you are right, it's apples vs. oranges. If we take GKE out of the picture, in terms of setup, is it accurate to say it competes with Kops?

Kops seems (from a quick overview) to be focused on AWS and/or other cloud providers. Several times I've been limited either to actual hardware in a room, or a contracted provider giving me something along the lines of OpenStack. Unless there's a project I've missed, at the moment that means I either have to deal with kubeadm, or just roll from scratch myself (umm... nope).

GKE=Google cloud?

I don't understand your comment; yeah, using something preinstalled in the cloud is easier than installing something on your own.

GKE = Google Kubernetes Engine (aka Google Container Engine for Kubernetes on Google Cloud Platform)


Docker isn't easy to install (see my OP). Kubernetes is probably more difficult to install, but it's not more expensive to use GKE than to roll your own Kubernetes elsewhere. And once you understand Kubernetes from a user perspective, it's easier to set up on your own.

> it's not more expensive to use GKE than to roll your own Kubernetes elsewhere.

$13 per month per egress rule, bandwidth costs a factor of 100 higher than dedicated hosters', and hardware costs 10x what dedicated hosters offer.

I’m not sure what your definition of "more expensive" is, but compared to renting dedicated hardware (e.g., from Hetzner), or colocating, GKE is significantly more expensive.

Of course, if you’re on the scale of Spotify, you can get much better deals – but from what I see on https://cloud.google.com/products/calculator/, it’s not cost-effective.

Maybe it’s my inexperience. I’ve not had to define egress rules yet. I run 1 preemptible n1-standard-2 node and pay for ancillary services. It’s $35-40/month.

I have the equivalent of 2 n1-standard-4 nodes (a bit more performant, actually), rented as dedicated servers, for $30 a month. Including egress and bandwidth.

That’s why I disagreed with the "GKE isn’t more expensive".

"but it's not more expensive to use GKE than to roll your own Kubernetes elsewhere."

Depends on what scale you're at, and the fact that you're vendor-locked - you can't use GKE in other data centers.

I'm not proposing the following as a solution to your problem, which I'm sure is on a much larger scale, affording many opportunities for failure. I'm curious how it breaks in that context, though.

I regularly use a script like this, and it's never failed me unless it runs out of disk space: https://github.com/coventry/InformationDropout/blob/master/p...

I agree for the most part, but Kubernetes orchestrates Docker (for now). Docker is also much easier to install than full-blown k8s.

Do you really find k8s useful for a single node deployment or were you just making an example?

I haven't used it, but I have used DC/OS, Mesos, and Marathon extensively, which is not a setup I'd do for a small number of nodes personally.

Yes. Though I should emphasize that I'm using GKE. So it's zero config, zero ops. I am running a single n1-standard-2 node on GKE for a smallish app right now. Comes to 30-40 dollars a month all in (egress traffic, cloud storage, other services). I am working hard on getting people to use this app, and it's great to know that scaling will be best-in-class if I succeed.

But I stress it's not about the scaling. It's about the features even at a single node. I wouldn't be spending time writing this stuff had it not been revelatory for me.

PS. If you're not able to try it on GKE, there's also https://github.com/kubernetes/kops. GKE is great for trying, at least, though. Just get a cluster running and get rolling with the concepts. Then you'll know what it's all about. There isn't that much to it, and what there is to it is great and well documented with a great community.

Yet with a Docker Swarm setup, you could get away with probably 10 dollars/month. K8s likely taking more resources than your app containers is why k8s may not be the best option for single node deployments.

I've personally tried both orchestration options and much prefer Swarm for smaller deployments. Sure, k8s could prepare you better for the future if your app takes off to the moon, but that's just premature optimization.

This might all very well be true, but Swarm doesn’t let you connect cloud disks directly to your containers, to my knowledge.

Also, I challenge the $10, because what cloud instances plus ancillary services add up to $10? I’m talking 2 CPUs and 7.5 GB RAM.

A Digital Ocean droplet @ 1 GB runs at $10/mo which I have Docker Swarm + my app containers running on.

I've found that both Swarm and k8s - but _especially_ the latter - makes the day-to-day application deployment really smooth compared to almost anything I used before (git-deploy, fabric, rsyncing things around). Docker containers are a large part of that, of course, but the tooling and configuration/secret handling help as well.

As a disclosure: I work at Mesosphere, which engineers DC/OS.

I would like to note that DC/OS now has Kubernetes [Beta]. You can manage Kubernetes resources on your DC/OS cluster now.

Here are the instructions on GitHub to get started: https://github.com/mesosphere/dcos-kubernetes-quickstart You can also join the Slack to get help: http://chat.dcos.io/ (the #kubernetes channel)

>I have used DC/OS, Mesos, and Marathon extensively

Would you use DC/OS/Marathon (vs. k8s) if you were to make that decision now?

I've heard that the Mesos stack is good for machine learning/big data stacks, but how does Marathon compare for deploying web apps?

I've run 2 Mesos stacks in production and have experience setting up a k8s stack (on-prem). First off, in my experience k8s ops is way more complex than the DC/OS stack. I recently set up a new DC/OS deployment (80% of the cluster resources was Spark, which works natively with Mesos, and I'd rather run the ancillary services on Marathon than spend another 80% of my time on k8s). If I didn't have the Spark requirement I would have gone k8s.

Despite going with mesos I really had to contend with the fact that k8s just has way, way more developer support - there are so many rich applications in the k8s sphere. Meanwhile I can probably name all the well supported Mesos frameworks offhand. Next, marathon "feels" dead. They recently killed their UI interface as I imagine that they are having trouble giving resources to marathon. 3 years ago I wanted a reverse proxy solution that integrated with mesos as well as non-mesos services so I hacked Caddy to make that work [1]. 3 years later, I was looking for a similar solution and found traefik. It claimed to work with mesos/marathon, but the marathon integration was broken and the mesos integration required polling even though mesos had an events API, so I hacked traefik to make that work [2]. On the other side of the fence, you have companies like Buoyant who rewrote a core piece of their tech (Linkerd) just to support K8s (and only K8s). This has a compounding effect, where over the years things will just become more reliant on assuming you are running k8s.

That "cost" you pay to setup Mesos/k8s is usually a one time cost on the order of a month. I feel however, that k8s is going to give you a better ROI (Unless, you are managing 100s of nodes with Spark/YARN/HDFS, then Mesos continues to be the clear winner).

[1] https://github.com/mholt/caddy/pull/40 [2] https://github.com/containous/traefik/pull/2617

From me, an eternal thank-you for upgrading Caddy's proxy middleware!

I think I need to try k8s in prod to give a good answer.

The DC/OS stack is nice for web apps, APIs, scheduled tasks, etc once you get it up and running. If I were just deploying small APIs and web apps, I would use something more managed like Lambda or Heroku/similar myself depending on the use case. That said, I'm more of an app dev than an ops person.

Have any recommended resources for learning Kubernetes? I've looked at it a few times, but it always seemed rather intimidating.

If you like videos, I recommend the free EdX course. If you don't I recommend the Katacoda interactive sessions.


(Disclosure: I run CNCF, which hosts Kubernetes and funded the EdX course.)

Interactive tutorials from katacoda: https://www.katacoda.com/courses/kubernetes

The Manning book Kubernetes in Action was an easy (albeit a bit tedious) read.

This! I wonder if people say it because it's what they heard, or from actual experience? I found K8s far easier, but I've seen people say otherwise and suspect there is a bit of an echo chamber happening.

I can't believe that you came to that conclusion after trying them both. Try comparing the config files for Swarm and k8s for the same system, the former are much shorter and simpler.

I normally use minikube for openfaas development - I appreciate the efforts of the project; it's an invaluable tool. The DfM integration works very well for local development.


Make sure you do a context-switch for kubectl too.

I see some people talking about Swarm vs Kubernetes. Swarm has always maintained a less modular approach which made it simpler to work with - it's obviously not going anywhere since there are customers relying on it in production.

Out of the two it's the only one that can actually fit into and run on a Raspberry Pi Zero because K8s has such high base requirements at idle in comparison. For a better comparison see my blog post - https://blog.alexellis.io/you-need-to-know-kubernetes-and-sw...

I find that Swarm is the only thing that fits my use case of just shipping. There's so much complexity and overhead with kubernetes and often I just want something that works so I can ship it. You just can't beat Swarm for building an insta-cluster that works well and quickly gets you where you want to be. I'm sure kubernetes is great if you have forty datacenters around the globe, but I don't, and neither does anybody I'm building services for.

Have you ever tried docker-compose?

While the two solutions obviously have different goals in mind, one being to fully run the Kubernetes setup locally and the other to run a few Docker containers that talk to one another, if it's for the purpose of running a simple-ish dev environment, in my experience docker-compose is much faster and simpler than minikube.

docker-compose trades a better initial UX for far less flexibility and, funnily enough, higher practical complexity in the long run. It's great if you want to get from zero to MVP with as little thinking about what your infrastructure needs will be as possible. It's pretty awful, however, when you want to truly productionize what you've done and you find out that in order to do so you'll have to use the newest Compose format, and your docker-compose.yml (and possibly additional supporting Compose files) are not at all easier to read or simpler to write than e.g. k8s objects in YAML.

If you want a single file configuration on k8s, take a look at kedge.org

It's basically an extended version of native k8s yaml with some smart defaults and grouping resources similar to compose.

One could also just use one file with multiple YAML objects delimited by `---` lines. The Kedge stuff looks moderately interesting, and I definitely would have found it useful a year ago. Now, though, I seem to have developed a begrudging admiration for the “native” YAML format. Maybe it’s Stockholm Syndrome, but once I finally began to understand Kubernetes I began to find the verbose YAML format to be a benefit rather than a barrier.
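For illustration, the single-file approach is just standard YAML document separators (names here are hypothetical); `kubectl apply -f` processes every document in the file:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myorg/web:latest
```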

Thank you for your blog, it's a great source of knowledge for docker, kubernetes, and swarm, also love the openfaas

Docker for Mac is not usable today because of high CPU usage due to IO [1]

[1] https://github.com/docker/for-mac/issues/1759

It is usable for the vast majority of users. If it is causing an issue please file a detailed bug report, with diagnostic ID, also try the Edge releases, and give some information about what you are actually running, for example how to replicate it. Most of the bug reports in that thread are totally unhelpful. Quite likely it is not even the same cause for different people, as some people said it was fixed on Edge while others did not. Even a single well thought out detailed bug report would make it much easier to investigate the issue.

Are you sure it works for the vast majority of users? At least in my circle of developers who use macOS for web* development, all of them have issues with Docker's high CPU usage due to IO. Some use docker-sync to work around the issue.

As for bug reports - zero feedback on that thread from maintainers. If you are one of the maintainers, it might be good to write this comment on that thread instead of HN.

* - I understand that web developers might be a small percentage of users and my case doesn't represent everybody

I am not a maintainer, but I do work on LinuxKit, which Docker for Mac uses. If docker-sync helps, then that suggests you have an issue specifically related to file sharing. Please file a new issue that explains how to reproduce your development setup; do not add to this one. Different setups work very differently (eg polling vs notifications), and people use things very differently; there is no one set of tooling and setup that is "web development". But it sounds like in your company you all use similar tools and setups, so it is not surprising you all have the same issue. We have a performance guide here https://docs.docker.com/docker-for-mac/osxfs/ that covers some use cases.

My team of 10 use it extensively for local testing of our stack. None of us have seen the CPU issue.

I have 60-65% CPU usage at idle - it's not ideal for battery life, but it is usable, and I'd say it's easier to use than minikube (which is something that I also use a lot). This is very early for DfM + K8s - I'd say the usability outweighs the teething issues.

If you want to help, send in diagnostic reports for the Docker guys.

60% at idle is a big no no for laptops.

I run 10-15 containers on my Mac and don't notice it after fixing particular containers (I don't doubt there is a more general issue)

Find out what is causing the CPU spikes with

    docker stats
or screen into the Docker for Mac VM and diagnose with top etc.

    screen ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty
I found particular containers were causing the issues and fixed it with alternate builds, prior versions, or resource limits on containers.

Docker for Mac/Windows is a great product - it has allowed me to roll out Docker to developers who wouldn't otherwise deal with Vagrant or other solutions

I haven't run into that issue, but another issue that will likely trip up a lot of devs is the extremely slow shared volume support. [1] Mounting a database directory from the host is pretty much a no-go. Mounting source trees (e.g. React app with hot code reloading) less so, but still much slower than native. Many devs have resorted to workarounds such as NFS or rsync.

[1] https://github.com/docker/for-mac/issues/77
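One common mitigation from Docker's osxfs performance guide is relaxing the consistency of bind mounts with the `cached` or `delegated` flags (available in Docker for Mac 17.04 and later); the paths and images here are made up:

```yaml
version: "3.4"
services:
  web:
    image: myorg/web:dev
    volumes:
      - ./src:/app/src:cached                        # host-authored source tree
  db:
    image: postgres:10
    volumes:
      - ./pgdata:/var/lib/postgresql/data:delegated  # container-authored data
```

`cached` tolerates delays before host writes become visible in the container; `delegated` additionally tolerates delays in the other direction, which suits write-heavy workloads like databases.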

I've not had any problem with it, and the high productivity win of being able to tell people to just turn Kubernetes in Docker on is amazing so far.

Right now, I am running 10 containers on my Mac, and my highest CPU consumer is IntelliJ. Docker is running consistently at about 5-10% of the CPU.

Docker for Mac runs in a VM anyway; these kinds of issues related to slow IO have been around for more than a year. That's the main reason I run Linux on my MacBook.


Not appropriate

I've never noticed this issue with the stable version. However I'm experiencing high CPU usage whenever I start minikube on my Mac.

I just installed Docker edge. We'll see how it goes in the long run but so far with k8s enabled hyperkit uses 25% of a virtual core.

This is a Kubernetes issue; the same issue appears on Minikube and Docker for Mac. The Kube API server uses a lot of CPU when idle. We are trying to diagnose what is happening.

Yeah, CPU usage goes down if I disable Kubernetes.

Also noticed this with the version of Kubernetes that comes with DfM. hyperkit consumes around 35% CPU while idle, which amounts to around double the idle battery drain on my MBP.

Subscribed. Thanks for sharing this thread as Docker for Mac has always had performance issues for me too. Even when no containers are running I find it using significant resources all the time.

Oh man. Thanks for sharing this. I thought it was just my Mac.

I have Dropbox running inside a container on my Mac. Even though there is minimal activity on the account, it drains the battery like crazy. The native Dropbox client isn’t great in that regard, but it feels like there is an order of magnitude difference.

I am using it to run Microsoft SQL Server for Linux and haven't had a problem with high CPU usage.

Wow! Did Docker give up on Swarm? I thought there was a time when Docker didn't like the existence of k8s all that much.

Anyways, I envision this being very useful for development, may even replace my docker-compose based test setup.

No, it didn't. Yes, k8s has 'won' in large-scale deployments, but if you're working at a small shop, then just imitating what Google does with millions of servers is dumb. Do what works at your scale -- and Swarm is extremely easy to manage.

Many people ask why someone would use an orchestrator on a small cluster (dozens of hosts). Why not? Swarm is very easy to manage and maintain, and using Puppet or Ansible is not less complicated at all.

The future of Docker, Inc. is of course Docker EE, and the future of Docker EE is _not_ the scheduler, it's everything around it.

> Swarm is very easy to manage and maintain, using Puppet or Ansible is not less complicated

The idea that dockerized software somehow is less dependent on configuration management seems to be a popular and completely misguided one. The two trends are completely separate, but I would argue from experience that unless you have absolutely nailed the configuration and integration of all your moving parts, don't even look at containers yet.

Containers tend to lead to more moving parts, not less. And unless you know how to configure them, and perhaps even more importantly how to test them, that will only make matters worse.

If you design your infrastructure and choose your tooling well, then containerized (not "dockerized") software is far less dependent upon configuration management; indeed, using Chef/Puppet/etc can be completely unnecessary for the containerized workload. To be clear, however, there is absolutely still a need for the now-traditional configuration management layer at the level of the hosts running your containerized workloads. What's kind of exciting about this is that the giant spaghetti madness that our configuration management repo has become—and I'm pretty sure it's not just us ;-)— at our org is going to be reduced in complexity and LOC by probably an order of magnitude as we transition to Kubernetes-based infrastructure.

> indeed, using Chef/Puppet/etc can be completely unnecessary for the containerized workload

This is more than naive. As long as your software needs any kind of configuration, there is a need for configuration management. There will be access tokens, certificates, backend configuration, partner integrations of various kinds, and monitoring and backup configuration, and you will want guarantees that these are consistent across your various testing and staging environments. You will want to track and bisect changes. You can either roll your own framework for this or use Ansible/Puppet.

Whether you distribute your software pieces as tarballs, Linux packages, or Docker images is completely orthogonal to how you tie these pieces into a working whole. And the need for configuration management absolutely increases when moving towards containerized solutions - not because of the change in software packaging format, but because of the organizational changes most go through, where more people are empowered to deploy more software, which can only increase integration across your environment.

I see organizations that have ignored this because they believe this magic container dust will alleviate the need to keep a tight grip on what they run, and they find themselves with this logic spread over their whole toolchain instead. That's when they need help cleaning up the mess.

I never said anything about magic container dust, nor did I say anything about having less of a grip over our operations. I was attempting to make a point about how your workloads themselves (applications/jobs) can be free of a direct need for Chef/Puppet/etc, which can dramatically simplify your configuration management layer. I never intended to claim that somehow magically our pods need no configuration bits at all, and honestly I’m not sure where you got that idea.

The statement was that containerized workloads are less dependent on configuration management. That could easily be interpreted as saying configuration management gets less important when you containerize, which is an idea that seems to spread easily on its own, while I have found the complete opposite to be true. That's why the number one guideline is to get a grip on your infrastructure and configuration before you move to containers. Otherwise you will end up with a mess worse than before.

No, it's not 'less dependent'. The way it helps is that you can standardize configuration management across environments relatively easily.

Containerization does not result in fewer moving parts in a given environment; but it does result in fewer moving parts across the whole development flow (from a developer laptop to a production cluster).

Thanks to CoreOS Container Linux, you don’t really need Puppet or Ansible. You specify your Ignition config and just drop it as a tiny partition on the servers, drop the OS onto a separate, adjacent partition, and then have local storage separately as well. Commonly you use PXE to avoid having these locally.

Then everything you do on top of that is handled by the container scheduler, and your containers.
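To make that concrete, here's a minimal Ignition sketch (spec v2.x JSON; the user and SSH key here are placeholders, so check the Container Linux docs for your version):

```shell
# Write a minimal Ignition config (spec v2.1.0; SSH key is a placeholder).
cat > ignition.json <<'EOF'
{
  "ignition": { "version": "2.1.0" },
  "passwd": {
    "users": [
      {
        "name": "core",
        "sshAuthorizedKeys": ["ssh-ed25519 AAAA...placeholder"]
      }
    ]
  }
}
EOF
# Serve this over HTTP/PXE or drop it on its own small partition;
# Container Linux reads it once on first boot.
grep -q '"version": "2.1.0"' ignition.json && echo "ignition config written"
```

Everything beyond first-boot provisioning is then left to the scheduler, as described above.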

> Commonly you use PXE to avoid having these locally.

I’ve not seen PXE used anywhere that the DC wasn’t O&O (or essentially close to it). As that’s the exception to the rule these days, isn’t your premise a bit cavalier?

I’ve used PXE a lot in my past [0] to great benefit (well, more specifically iPXE through chainloading), so I’m not detracting from it, just saying its applicability is limited for most folks.

[0] I wrote “Genesis” for Tumblr which handled initial machine provisioning from first power-on after racking to being ready for deploys.

Using PXE (or, rather, iPXE) is the recommended deploy mode for container linux, that's why I mentioned it.

> Using PXE (or, rather, iPXE) is the recommended deploy mode for container linux

Really? Is that a new thing? I don’t remember that last time I was reading the docs, but maybe I missed it (that it’s the “preferred” method rather than being just “an option”).

Seems rather odd to limit your audience like that in the age of “cloud everything” as I think it’s generally more rare that folks fully control their layer 2, but what do I know. :)

Well, you can always still use the disk install, and instead mount the ignition config as elastic block storage read-only. That's what most people do in the cloud.

But in either case, you want the config to be centrally stored, so you can modify it without having to ssh onto every machine.

Network booting combined with good out-of-band management controls allows fully automated provisioning of nodes in a bare-metal setup, so I don’t see how this is a negative? The alternative is a live CD/USB or attaching a disk with something already preinstalled.

If you’re on-prem with VMware or OpenStack there are better options you can use, but that’s not exactly bare metal. In those environments CoreOS recommends using different provisioning options more suited to those providers.

> people ask why would someone use an orchestrator on a small cluster (dozens of hosts)

I’d love to know who these mythical folks are?

1) dozens of hosts (heck, hosts >1) is exactly why you need orchestration

2) while there are huge deployments across the globe, I wouldn’t consider “dozens of hosts” small by any means. That’s actually probably above average.

3) k8s is actually easier to maintain than you suggest. I see these comments about Swarm over k8s generally from folks who never even tried it (or did so years ago); is that the case here?

Disagree. I don't find using k8s any more complicated, and I believe "it's more complicated" is a myth that just gets repeated over and over until it's believed, rather than something based on actual experience.

It just sounds clever to say that using it is trying to be like Google. Well, if it is, but it's not actually more complicated, then there is no downside.

The future of Docker, Inc. is being an Oracle acquisition.

I can't believe I'm saying this, but if some giant corporation is going to buy Docker, Inc. then I really, really hope it's Microsoft. An Oracle acquisition would be an absolute nightmare.

Your docker-compose YML is trivially reusable for Swarm. We run it in production and it is spectacular.
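To make that concrete, here's a hedged sketch (service name and image are made up): the same compose v3 file drives both local docker-compose and a swarm stack.

```shell
# A compose v3 file works for local dev and for `docker stack deploy`.
cat > docker-compose.yml <<'EOF'
version: "3.3"
services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    deploy:            # swarm-only section; docker-compose ignores it
      replicas: 2
EOF
# Local dev:  docker-compose up
# Swarm:      docker stack deploy -c docker-compose.yml app
grep -q "replicas: 2" docker-compose.yml && echo "compose file written"
```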

The complexity overhead of kubernetes may be premature when you don't have massive requirements

That's the thing: we use k8s in production, and I just used docker-compose for testing in my dev environment for the sake of simplicity. Though over time it became not so simple to maintain two parallel configs as the system grew.

That's one of the reasons we decided to jump on Swarm. And seeing how Swarm has matured and added features, we are very happy.

Azure also comes with a Swarm-mode deployment template which is pretty great (https://github.com/Azure/acs-engine/blob/master/docs/swarmmo...). However, setting up and running Swarm has been a pleasure.

This is very interesting. I've been looking for middle ground between Dokku and Kubernetes, I'll give swarm a try. Does it handle ingress at all? vhosts, TLS certificates, etc?

If you're paying for EE, then UCP can do a lot for you. We're using the HRM heavily, and it's very handy.

I don't know what any of those mean :(

EE = Enterprise Edition

UCP = Universal Control Plane (web app to manage clusters and containers)

HRM = HTTP Routing Mesh (I think). If I recall correctly, it’s used to control ingress to the cluster via virtual host routing, etc.

They are paid Docker Inc. products. I worked for Docker once upon a time so I’m happy to hear there are happy customers :)

Ah I see, thank you for clarifying! I'm not familiar with the products, but I'll research them.

Docker has added ingress routing in 17.06 that works at a lower level than HTTP Routing Mesh. You should try that now.

Other than that, how have you found EE? We have considered paying for it, but after seeing Docker Cloud at $15 per node per month, we are considering switching to it.

Ingress is the most beautiful part of Swarm - it is built in.

However, it does not do termination. You can trivially include a haproxy/nginx pod/service/VM to do the termination for you.

One of the cool new infrastructure features is Docker Cloud - which allows you to "bring your own nodes" for $15/node/month and set up Swarm. https://docs.docker.com/docker-cloud/cloud-swarm/using-swarm...

I'm using https://github.com/containous/traefik with Swarm for termination, Let's Encrypt and routing with success. Configuration is handled with service labels.
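A sketch of that setup, assuming Traefik 1.x label names (the hostname and network name are made up; check the Traefik docs for your version). In swarm mode, Traefik reads labels from the `deploy:` section of each service:

```shell
# Compose snippet: Traefik discovers the service via its deploy labels.
cat > app-stack.yml <<'EOF'
version: "3.3"
services:
  web:
    image: nginx:alpine
    networks: [proxy]
    deploy:
      labels:
        - traefik.enable=true
        - traefik.port=80
        - traefik.frontend.rule=Host:app.example.com
        - traefik.docker.network=proxy
networks:
  proxy:
    external: true
EOF
# docker stack deploy -c app-stack.yml app
grep -q "traefik.frontend.rule" app-stack.yml && echo "stack file written"
```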

+1. Traefik is leaps and bounds more ergonomic than haproxy etc.

I believe it benched a bit worse than the comp but for most cases you're not going to run into issues.

Thanks for the shoutout. Would you happen to have a writeup on how to do that, or a few lines on whether it's Swarm-aware somehow? I guess it can't be too hard to figure out, but if there's something that will speed that up, that's better.

No - you don't need it to be swarm aware at all.

All you need to do is create a docker service out of traefik/nginx/haproxy/whatever and publish the ports of the service in "ingress" mode. From then on, all traffic is internal - and you deal with it normally from one docker service to another.
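A hedged sketch of such an edge service (image and network names are placeholders; the long port syntax needs compose file format 3.2+):

```shell
# Publish the edge proxy's port on the swarm ingress mesh.
cat > edge-stack.yml <<'EOF'
version: "3.3"
services:
  proxy:
    image: haproxy:alpine
    ports:
      - target: 443
        published: 443
        mode: ingress   # reachable on every swarm node
    networks: [app-net]
networks:
  app-net:
    driver: overlay
EOF
# docker stack deploy -c edge-stack.yml edge
grep -q "mode: ingress" edge-stack.yml && echo "edge stack written"
```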

Ah, okay, so pretty agnostic. Sounds like this is geared towards single project deployments, though (ie it wouldn't be suited in a multitenant scenario), but that's good to know.

I've been looking for a Dokku replacement for running my side-projects on with easy scalability (by provisioning another server), but I haven't found anything that's suitable for running multiple apps together in a heroku-like way but still able to scale beyond a single server. Do you know of anything like that?

But that is what Swarm is - Swarm is fully production-ready to scale to tons of servers and works brilliantly on a single server as well. Not sure why you think Swarm wouldn't fit the bill?

From what I saw, it's not very easy to have a deployment with multiple applications on the swarm, is that wrong?

For example, I have ten applications, and each requires a database, a redis instance, a celery instance and two web workers. Dokku lets me deploy these independently of each other, but uses the same nginx instance and proxies them transparently.

As I understand it, Swarm has no notion of multiple projects. Each swarm is running a single deployment, where all containers are equal, is that correct?

Basically, Dokku is a self-hosted Heroku, which is what I need (I want to be able to easily create a project that I can run semi-independently of the others on the same server). My understanding is that, to do that with Swarm, I'd have to have a single termination container that would connect to every app, but apps wouldn't be any more segregated than that. Maybe I'm complicating things, though. Have you used Swarm for such a use case?

I tried the official tutorial, but couldn't get it to work, as the instructions appeared outdated and didn't work for single-host deployments, and were geared more towards Windows and Mac than Linux. Would you happen to have a good "getting started" document? All my apps are already using docker-compose.

EDIT: Also, a machine that's a manager doesn't want to join the swarm as a worker as well, that's why I'm saying that it doesn't appear good for single-server deployments:

> Error response from daemon: This node is already part of a swarm. Use "docker swarm leave" to leave this swarm and join another one.

The docker stack deploy command does automatic diffing of which services have changed, so your deploys are automatically optimized. This is generally the philosophy of Swarm vs Kubernetes - everything is built in. You can argue this is less powerful, but in general it works brilliantly. As for separating out the different "applications", you simply put them on separate overlay networks (encrypted if you want).

Also, if you are dead-set on making them entirely separate, every separate "application" is a separate "Stack". So you can stack deploy them separately.

If I had to do what you just told me (single nginx proxying to two different "applications"), I would do this:

1. stack 1 - application 1 + network 1
2. stack 2 - application 2 + network 1
3. stack 3 - nginx + network 1

now you can deploy any of them independently. You can make this even more sophisticated by having each stack on a different overlay network (encrypted as well). And nginx bridging between them.
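The three-stack layout above can be sketched like this (file, image, and network names are all made up):

```shell
# One attachable overlay network shared by independently deployed stacks:
#   docker network create --driver overlay --attachable shared-net
cat > app1-stack.yml <<'EOF'
version: "3.3"
services:
  web:
    image: myorg/app1:latest
    networks: [shared-net]
networks:
  shared-net:
    external: true    # created outside any stack, so every stack can join
EOF
# docker stack deploy -c app1-stack.yml app1
# ...repeat for app2 and for the nginx stack; each deploys independently.
grep -q "external: true" app1-stack.yml && echo "stack file written"
```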

Not sure why you are facing problems with the official tutorial - btw, a manager is also a worker ;) I have fairly large dev swarms on a single node.

Swarm is nice as-is, but really shines with support like docker-swarm-proxy. Please check out http://proxy.dockerflow.com/. It allows for automatic service discovery as well as routing/termination etc. It also runs as a stack on Swarm itself. Supports custom TLS termination as well as SNI routing.

Helm makes this easy. It’s great, just use the templating features and not the package management stuff.

We found the same thing after the package management stuff caused issues in prod.

We rolled our own go program that just does go tpl substitution in yaml with overrides like helm. Works on charts out of the box, but instead of talking to a service, it outputs a yaml manifest ready to kubectl apply.

We've thought about open sourcing it. It took us literally a day to put together and has worked without flaw for 8 months in production.

Cool! I just don't use the package management stuff. My workflow also has local manifest template rendering, but I use helm-template[0]. My chart is just a local directory, not a package or anything. Then, to release:

* Render the chart and run it through kube-cloud-build[1] to make sure my containers are all built before I deploy: `helm template /path/to/chart | kube-cloud-build -r repo`

* Deploy with helm: `helm upgrade RELEASE_NAME /path/to/chart`

Has been working really well for me.

[0] https://github.com/technosophos/helm-template

[1] https://github.com/dminkovsky/kube-cloud-build

Officially, both Swarm and Kubernetes are valued features of Docker.

Swarm was always much simpler to get started with, so some people will prefer it. But it's clear to me which one won.

Kubernetes is also coming to docker-ce: https://github.com/docker/docker-ce/releases/tag/v18.01.0-ce...

One of my engineering friends told me he didn't use Kubernetes in the past because there was a single point of failure with it for distributed setups.

I really wish I could remember what SPOF was pertaining to, but I just can't remember. Does anyone have any idea if this is still relevant/accurate information?

He told me this maybe 2-3 years ago, so I was wondering how things have changed since then, or if anyone knows what he might have been talking about.

Kubernetes supports HA masters now: https://kubernetes.io/docs/admin/high-availability

Note that even if you don't have HA, Kubernetes being a SPOF isn't necessarily critical. Barring some kind of catastrophic, cascading fault that affects multiple nodes and requires rescheduling pods to new nodes, a master going down doesn't actually affect what's currently running. Autoscaling and cronjobs won't work, clients using the API will fail, and failed pods won't be replaced, but if the cluster is otherwise fine, pods will just continue running as before. By analogy, it's a bit like turning off the engines during spaceflight. You will continue to coast at the same speed, but you can't change course.

Interesting, well thats great to know. Thanks for the ELI5 explanation.

Multi-master? Kubespray supports multi-master deployments now.

Does this obviate minikube?

For me, yes. I just deleted my minikube and minishift directories.

sounds good to me!

Anyway to get this feature without updating to High Sierra first?

Which feature?

If you want a standalone installation of Kubernetes on a Mac try minikube.


Kubernetes is enabled for all OSX versions

Minor nitpick, it's called macOS now.

I don't understand why people refer to pre-Sierra releases as macOS. Sierra and onwards is called macOS.

guillaumerose was referring to all versions, so both "all OS X versions" and "all macOS versions" would be wrong, no...?

Apple rebranded "Mac OS X" to "OS X" and later rebranded that to "macOS". It's not like they're different lines of operating systems; it was a rename of the whole line, so my impression is it's fine to use the term "macOS" to refer to any of the versions since 2001, or it's also fine to use the name that was given at release when referring to a specific version. In other words, probably best to not worry about any particular phrasing, and not try to put exact technical meaning on any of these terms. :-)

For example, Wikipedia[1] has a page called "OS X Yosemite" which describes it as "A version of the macOS operating system", and the Wikipedia article on macOS[2] says it was first released in 2001.

[1]: https://en.wikipedia.org/wiki/OS_X_Yosemite

[2]: https://en.wikipedia.org/wiki/MacOS

>I don't understand why people

Because it's a bother to keep in mind when they introduced the new name? Or release names? It's hard to imagine that someone would actually be talking about a certain number of releases this way.

PS: Apple's stupid marketing bs strikes again.

You don't need High Sierra for this. Just install the Edge channel of Docker for Mac.

Thank you that was the issue.

I don't understand the use-case here. You want to use Docker+Kubernetes but can't work out the bits to run it in VMs on your own?

Running Kubernetes locally isn’t really trivial. I feel like this makes a clicky clicky install a lot more possible and opens the doors to more people. And that’s what Docker’s been good at since day one, democratizing otherwise esoteric technologies.

This is a replacement for minikube, i.e. a local dev k8s cluster.

If you need to experiment with k8s locally, it's pretty easy to install Kubernetes in one or two VMs on your laptop with kubeadm, but then you need to install Docker inside that VM. Minikube also installs another Docker daemon inside its own VM.

Because I already have Docker for Mac installed to be able to build and test images I think it's useful to have this local k8s integration.

minikube is still my default for running a local kubernetes instance since it supports more customisation of the k8s settings.
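One convenience either way: each local cluster is just a kubectl context, so switching between them is cheap. The context names below are assumptions ("docker-for-desktop" is what the edge release is reported to register); verify the real names on your machine with `kubectl config get-contexts`.

```shell
# Hypothetical context names; list the real ones with:
#   kubectl config get-contexts
for ctx in docker-for-desktop minikube; do
  echo "kubectl config use-context ${ctx}"
done
# After switching, `kubectl get nodes` confirms which cluster you target.
```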

A use case would be for devops to be able to just point developers to a simple, documented method of running this environment in development, without having to invest the time into creating and maintaining it themselves or expecting the developers to invest in the operations aspects.

I don’t get it either. minikube is trivially installed in my experience with homebrew.

Given that running docker on a Mac has been a long and bumpy journey, I’m not sure I’d want to bundle the two together.

What am I missing?

If I don't have to run minikube, and instead get something equivalent just by installing Docker for Mac, that's a minor win.

If Docker's local Kubernetes install provides a way to connect to custom registries (e.g. GCR) without installing a third party plugin onto every minikube, I'd consider that a major win.
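For what it's worth, on a plain cluster you can usually wire up GCR with an imagePullSecret rather than a plugin. A hedged sketch (the secret name, project, and key file path are all made up; `_json_key` is GCR's documented service-account username):

```shell
# Create a registry secret from a GCP service-account key (placeholders):
#   kubectl create secret docker-registry gcr-pull \
#     --docker-server=https://gcr.io \
#     --docker-username=_json_key \
#     --docker-password="$(cat key.json)" \
#     --docker-email=you@example.com
# Then reference it from the pod spec:
cat > pod-snippet.yaml <<'EOF'
spec:
  imagePullSecrets:
    - name: gcr-pull
  containers:
    - name: app
      image: gcr.io/my-project/app:latest
EOF
grep -q "imagePullSecrets" pod-snippet.yaml && echo "snippet written"
```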


Thanks for the suggestion. We will look at it.

Once Kubernetes becomes stable in Docker CE, can one attempt to run their own cluster on bare metal?

The phrase "Docker for Mac" is super misleading. If we run Docker in a Linux VM on macOS, I don't think it counts as "Docker for Mac", IMHO.

Docker is primarily for running Linux applications on Linux (yes, I know there are things like Joyent SDC, Docker Engine on Windows etc).

Half-OT: Is it possible to run Xcode CLI tools from inside Docker for Mac?

No. Docker for Mac runs a Linux VM.

Doesn't that defeat the purpose of Docker?

But anyway, thanks for the info :)

Running a Linux VM on Mac defeats some of the purpose of Docker, but it's still valuable:

* Docker is useful for production and has various other benefits, and Docker for Mac is a nice way to develop locally with Docker even if it's not as efficient as on Linux.

* Docker for Mac uses some built-in virtualization tools in macOS to share network and filesystem more efficiently than you could do with the older VirtualBox approach. So it's maybe a little closer to native OS support than you're thinking.

* A typical configuration has a single Linux VM holding many Docker containers, which is better than the alternative of many VMs.

I see. okay.

I assumed the main idea of Docker was reproducible builds on different machines and wanted to use it for building iOS apps.

That's certainly a use case, it's just not compatible with Apple's restrictions.

No, because the purpose of Docker is not 'to run XCode CLI tools'...

I think the GP means that the purpose of Docker is to provide containers, which are a substitute for VMs. If you are going to be running a VM anyway, why run a container?

The purpose of running Docker on Mac is to run Linux containers, mainly for developing or testing containers that will be run on Linux servers.

macOS doesn't support native containers. Windows does have Windows containers. Docker on Windows can run both Linux or Windows containers.

Yes. thank you.
