Oh, you mean like 50% of the work?
As for the reverse proxy: Traefik all the way! :D
Just a suggestion to the OP: it's not hard to set up and share a 5-node Vagrant cluster on your laptop. Give concrete examples that people can run locally so they can test your assertions themselves. Once that foundation is laid, you can extrapolate to 10 nodes, 100 nodes, 1000 nodes.
Anyone who has deployed a cluster of that size knows that the article is missing a bunch of items, including but not limited to the following:
- Overhead instances (manager, service discovery, logging, etc.)
- Configuration Management
- Security Implications
- Failure mitigation (it's going to happen at that scale)
- Update strategy at this scale
For those who are interested, one official doc, and a good place to start when learning how to deploy a large Docker 1.12 cluster, is this guide by Docker.
We are just deploying our first Kubernetes cluster in production, and anything more than a basic hello world would be welcome: how to configure networking in production, how to route traffic to containers, how to provide volume storage (backups, etc.).
I mean, we'll get there of course, but we opted against Swarm because the available information is even more lacking than it is for Kubernetes.
Overall I think Docker is heading in the right direction, but for now Kubernetes, ECS, etc... are better solutions for orchestration. I was hoping to only use Docker for my current project, but I think I'll have to wait until the next one rolls around and Docker releases a few more updates.
...which really means "I have no clue where this will break when scaling".
Cute, but not terribly insightful, and possibly risky in an age where following recipes off the Internet is too often the first step towards production :)
EDIT: I had been misusing the term bare-metal, thanks for picking up on it. Examples should hold on both bare metal and VMs though.
Buy 1000 bare metal servers
This one is easy. Pick your favourite cloud provider and buy lots of servers
> Basically you will run docker swarm init on the first node and then docker swarm join on all the other nodes. There are a few other arguments that you'll need to add to those commands but if you follow the docs you'll have no problems at all.
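As a concrete sketch of those steps (the IP address and token below are placeholders, not real values):

```shell
# On the first node: initialize the swarm, advertising this node's address.
# 10.0.0.1 is a placeholder for your manager's IP.
docker swarm init --advertise-addr 10.0.0.1

# The init output prints join commands with tokens. On every other node,
# run the suggested join command (this token is a made-up placeholder):
docker swarm join --token SWMTKN-1-<ca-hash>-<secret> 10.0.0.1:2377
```

Port 2377 is the default swarm management port; the manager and worker join tokens differ, so which token you use decides the new node's role.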
worst part of the setup is building the cluster's node store in a way that is redundant and reliable, since provisioning it for HA is largely undocumented and left as an exercise for the reader
From here (https://docs.docker.com/engine/swarm/swarm-mode/#view-the-jo...):
"...starts an internal distributed data store for Engines participating in the swarm to maintain a consistent view of the swarm and all services running on it"
Here are the docs for "swarm init" in docker 1.12:
That goes to some Docker-owned server, asks for nodes, and joins/creates the swarm as necessary. I wouldn't build the cornerstone of an infrastructure on this, ever.
There's an undocumented feature I just discovered that apparently lets you use a local token server, but then you're back to square one.
> documentation is scarce
For a container platform, the documentation is not just scarce but outright insufficient, especially with regard to its failure and recovery modes.
Here's how it works:
- When you run `docker swarm init` it initializes the current node as a "manager" which has a datastore of cluster configuration and is responsible for assigning tasks to "worker" nodes (including itself). This init process also generates a cluster CA and two secrets authorizing manager and worker joins. The output of this command will be two tokens which you can use to join new nodes to the cluster as either a "manager" or a "worker". These tokens are structured like this:
SWMTKN-1-<cluster CA hash>-<manager or worker secret>
- If the new node is joining as a manager then the cluster configuration is replicated to the new manager. Docker uses the Raft consensus algorithm to maintain consistency of the configuration data. As long as a majority of your manager nodes are available, the managers will be able to coordinate and issue work to the available workers in the cluster.
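To make the token structure above concrete, here's a small sketch that splits a sample token into its parts (the CA hash and secret here are invented values, not a real cluster's):

```shell
# Sample join token; the cluster CA hash and secret are made up.
token="SWMTKN-1-1kxftv4ofnc6mt30lmgipg6ny-8vxv8rssmk743ojnwacrr2e7c"

# Split on '-' into prefix, version, CA hash, and secret.
IFS='-' read -r prefix version ca_hash secret <<EOF
$token
EOF

echo "prefix:  $prefix"
echo "version: $version"
echo "CA hash: $ca_hash"
echo "secret:  $secret"
```

A joining node can use the CA hash to verify the manager's certificate before trusting it, and presents the secret to authorize the join as either a manager or a worker.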
I hope this helps you better understand how the cluster is secured.
"The create argument makes the Swarm container connect to the Docker Hub discovery service and get a unique Swarm ID, also known as a “discovery token”. The token appears in the output, it is not saved to a file on the host"
Are there two types of tokens now?
So this article talks about how you would deploy a 1000-node cluster without actually doing it? Why not say this is how to deploy a 100,000-node cluster?
If you are doing something like this, please keep in mind that this kind of DNS failover is, at best, unreliable. You have no control over how DNS is cached on the client side, or over whether the client will switch to the next IP in the cycle if the previous one is unavailable.
The proper way to do HA would be to use some kind of VIP + load balancer combination (e.g., keepalived + HAProxy), which lets you fail over the IP itself instead of just relying on the hostname. However, if you also have a database backend to think about, then you will most likely need something like Pacemaker to ensure you don't end up with data inconsistency (a split-brain scenario).
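For illustration, here's a minimal sketch of the keepalived half of that setup; the interface name, VIP, and password are placeholder values, and HAProxy would then listen on the VIP and balance across your backends:

```
# /etc/keepalived/keepalived.conf (sketch; all values are placeholders)
vrrp_script chk_haproxy {
    script "pidof haproxy"   # node is only eligible while HAProxy is running
    interval 2
}

vrrp_instance VI_1 {
    state MASTER             # use BACKUP on the standby node
    interface eth0           # placeholder interface name
    virtual_router_id 51
    priority 101             # lower on the standby, e.g. 100
    authentication {
        auth_type PASS
        auth_pass changeme   # placeholder
    }
    virtual_ipaddress {
        10.0.0.100/24        # the floating VIP clients connect to
    }
    track_script {
        chk_haproxy
    }
}
```

If the master node or its HAProxy process dies, the standby claims the VIP via VRRP, so clients keep using the same IP rather than depending on DNS behaviour.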
So it's up to the reader to imagine what kind of application would need a thousand web frontends without any form of persistent storage...