If there's one area that is in dire need of improvement, though, it's the documentation. If you look around, there is essentially no documentation that starts from first principles and goes through the different components (and their lifecycle, dependencies, requirements and so on) one by one, irrespective of the cloud environment. There is a "Kubernetes from scratch" document, but it's just a bunch of loose fragments that lack almost all the necessary detail, and it has too many dependencies. (Tip: ask the user to install from source, and leave out how to use images, cloud providers and other things that obscure the workings of everything.)
Almost all of the documentation assumes you're running kube-up or some other automated setup, which is of course convenient, but it hides a huge amount of magic in a bunch of shell scripts, Salt config and so on that prevents true understanding. If you run it for, say, AWS, you'll end up with a configuration you don't understand. It doesn't help that much of the official documentation is heavily skewed towards GCE/GKE, where certain things have a level of automatic magic that you won't benefit from when you run on bare metal, for example. kube-up will help someone get it up and running fast, but it does not help someone who needs to maintain it in a careful, controlled manner.
Right now, I have a working cluster, but getting there involved a bunch of trial and error, a lot of open browser tabs, source code reading, and so on. (Quick: what version of Docker does Kubernetes want? Kubernetes doesn't seem to tell us, and it doesn't even verify it on startup. One of the reefs I ran aground on was that 1.11 didn't work, and I had to revert to 1.9 based on a random GitHub issue I found.)
Likely, if I had to choose today or this quarter, we would go the Empire route and build on top of ECS. Though our model and requirements are a bit different, so we'd have to heavily modify it or roll our own.
All you need to do, in broad strokes, is:
* Set up a VPC. Defaults work.
* Create an AWS instance. Make sure it has a dedicated IAM role with a policy like this, so that it can do things like create ELBs.
* Install Kubernetes from binary packages. I've been using Kismatic's Debian/Ubuntu packages, which are nice.
* Install Docker >= 1.9, < 1.10 (apparently).
* Install etcd.
* Make sure your AWS instance has a sane MTU ("sudo ifconfig eth0 mtu 1500"). AWS uses jumbo frames by default, which I found does not work with Docker Hub (even though it's also on AWS).
* Edit /etc/default/docker to disable its iptables magic and use the Kubernetes bridge, which Kubelet will eventually create for you on startup:
DOCKER_OPTS="--iptables=false --ip-masq=false --bridge=cbr0"
* Edit the /etc/default/kube* configs to set DAEMON_ARGS in each. Read the help page for each daemon to see what flags it takes. Most have sane defaults or are ignorable, but you'll need some specific ones.
* Start etcd, Docker and all the Kubernetes daemons.
* Verify it's working with something like: kubectl run test --image=dockercloud/hello-world
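To make the DAEMON_ARGS step above more concrete, here's a minimal sketch of what the /etc/default/kube* files might end up looking like on a single-master AWS setup. The exact flags vary by Kubernetes version, and the addresses and CIDR here are illustrative assumptions, not values from my cluster:

```
# /etc/default/kube-apiserver (illustrative values)
DAEMON_ARGS="--etcd-servers=http://127.0.0.1:2379 \
  --service-cluster-ip-range=10.16.0.0/16 \
  --insecure-bind-address=127.0.0.1 \
  --cloud-provider=aws"

# /etc/default/kubelet (illustrative values)
DAEMON_ARGS="--api-servers=http://127.0.0.1:8080 \
  --configure-cbr0=true \
  --cloud-provider=aws"
```

The --configure-cbr0 flag is what makes Kubelet create the cbr0 bridge mentioned in the Docker step above.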
Unless I'm forgetting something, that's basically it for one master node. For multiple nodes, you'll have to run Kubelet on each. You can run as many masters (kube-apiserver) as you want, and they'll use etcd leases to ensure that only one is active.
In some cloud environments (e.g. DigitalOcean), there's no private subnet shared between hosts, so Kubernetes can't just hand out unique IPs to pods and services. So you need something like Flannel, which can set up a virtual network using UDP encapsulation or VXLAN.
Flannel also has a backend for AWS, but all it does is update the routing table for your VPC. That can be useful, but it can also be accomplished without Flannel. It's also limited to about 50 nodes and only one subnet, as far as I know. I don't see the point of using it myself.
AWS won't know this IP range and won't route it. So K8s automatically populates your routing table with the routes every time a node changes or is added/removed.
K8s will give a /24 CIDR to each minion host, so the first will get 10.0.1.0/24, the next 10.0.2.0/24, and so on. Each pod will get 10.0.1.1, 10.0.1.2, etc.
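That per-node carve-up is just sequential subnetting of the cluster CIDR. A quick illustrative sketch — the cluster range and the starting offset are assumptions to match the numbering above; the real controller simply picks free CIDRs from the range:

```python
import ipaddress

def allocate_node_cidrs(cluster_cidr, count=3, prefix=24):
    """Hand out one /24 per node from the cluster CIDR, in order.

    Starts at the second /24 to mirror the 10.0.1.0/24-first numbering
    described above; the real controller just assigns unused CIDRs.
    """
    subnets = ipaddress.ip_network(cluster_cidr).subnets(new_prefix=prefix)
    return [str(s) for s in list(subnets)[1:count + 1]]

print(allocate_node_cidrs("10.0.0.0/16"))
# → ['10.0.1.0/24', '10.0.2.0/24', '10.0.3.0/24']
```

Each pod then gets an address out of its node's /24, which is why the routing table needs one route per node.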
Obviously having an additional IP/interface per box adds complexity, but I don't know if K8s supports any other automatic mode of operation on AWS.
(Note: Kubernetes expects AWS objects that it can control — security groups, instances, etc. — to be tagged with KubernetesCluster=<your cluster name>. This also applies to the routing table.)
If you're adding a routing rule for every minion, then you will also hit the 50-route limit on AWS routing tables.
Flannel is just one of many different options if you need to go beyond 50 nodes. It seems some people use Flannel to create an overlay network, but this isn't necessary. You can use the host-gw mode to accomplish the same thing as Kubernetes' default routing-table-updating behaviour, but with routing tables maintained on each node instead.
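For reference, if you do use flannel's host-gw backend, its network config is a small JSON document written to the /coreos.com/network/config key in etcd. A sketch — the CIDR is illustrative and should match your cluster range:

```json
{
  "Network": "10.0.0.0/16",
  "SubnetLen": 24,
  "Backend": {
    "Type": "host-gw"
  }
}
```

With host-gw, each flanneld maintains plain kernel routes to its peers instead of encapsulating traffic, so there's no overlay overhead; the trade-off is that all nodes must share an L2 segment.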
On a side note, it's pretty awesome how Docker embedded the key-value store into the main binary. Appears to reduce complexity quite a bit.
However, using dedicated masters (by which I mean mostly kube-apiserver) separate from worker nodes is a good idea to avoid high load impacting the API access.
(Just keep in mind that the Kismatic packages I referred to won't support this — you can't install kubernetes-master and kubernetes-node at the same time. But as you discovered, you can run everything except kubelet as pods. On the other hand, kube-apiserver needs a whole bunch of mounts as well as host networking, so to me it seems like you don't gain all that much.)
What is this Docker key-value store you mention?
They are using a Raft-based store inside the engine now, so there is no external etcd dependency. IIRC they are using etcd's Raft implementation.
I think rkt is making some good decisions and is worth keeping an eye on. I'm not sure I love the tight coupling to systemd, but the fact that it avoids the PID 1 problem and lets containers be their own processes (separate from the "engine", which can choreograph containers through the systemd API, building on all of its process-handling logic) is an improvement over Docker. In fact, rkt uses the same networking model as Kubernetes.
ExecStart=/usr/bin/docker daemon --exec-opt native.cgroupdriver=cgroupfs --iptables=false --bridge=cbr0 --ip-masq=false
Disclosure: I work at Google on Kubernetes
In short, I want and need to understand how it's put together so that I can use it.
There was someone on the #kubernetes-novices Slack today who described his approach as: run kube-up, then try to deconstruct everything that kube-up did into a repeatable recipe. I went the other route, trying to understand what kube-up does and then replicating it from scratch. I'm still working through things I missed or did wrong.
To be honest, I think Google's approach here is wrong. Kubernetes is being developed at a frenetic pace, but documentation is not being maintained (it's pretty lacking even if you're on GCP!), and users are understandably frustrated with the obscurity of the whole thing. It works, but it takes weeks to gather enough of an understanding of the system, and that's entirely due to lack of documentation.
The documentation is lacking at both a high and a low level. At no point does it offer a big-picture view of how everything works together, nor does it offer low-level descriptions of the stack.
I also think the strong focus on kube-up is a mistake, given the lack of docs. I'm sure it works great, but it's not an option for production use, in my opinion. Terraform would have been better here. You're also using Salt — honestly, it would have been so much cooler if kube-up could just take a few inputs ("what cloud?", "what are your credentials?" etc.) and generate a finished Salt config for you, with a separate salt-cloud orchestration config for the provisioning. The current Salt config is a bit of a mess, and not really something you can build on.
Feel free to reach out to me (@atombender) on the Kubernetes Slack if you want to chat.
This is the best infrastructure I've ever used in twenty years of doing ops and leading ops teams.
Also, working with k8s will probably spoil you; it's pretty annoying to "go back" to other environments, where you're confronted with problems that would be effortlessly solvable in Kubernetes.
I know Silicon Valley folks are infinitely pessimistic and/or grandiose, but this is LITERALLY the reason I got into this job.
I will tell you that the economics are most definitely not there. This is a common misconception amongst the HN crowd in general: that public cloud infra is cheaper. For small footprints, public cloud makes sense, but once you get into larger footprints (300+ instances), it's far cheaper to lease dedicated hardware or DIY in colocation. We're running on approximately 40 dedicated rackmount servers for OpenStack and 6 for Kubernetes. To get the equivalent amount of disk and RAM, we would pay 2-3x at AWS or GCE. We could probably cut our cost by an additional 30% by moving what we have to colo, but we would lose some flexibility and would have to take on additional headcount.
From a maintainability standpoint, GCE makes Kubernetes easy which is a good thing if you've never run it before. It's not that hard to run it yourself, though. A senior-level systems engineer will be a Kube master after about two months of use. Just guessing, I think it takes about 1/4 of an engineer-week to support our Kube cluster for a week. I think we could grow our cluster 20x without a significant workload increase for our ops team.
We are in the process of automating the last few manual aspects of our Kubernetes infra: load balancing and monitoring. We're building these in the style that we've built the rest of our pipeline: YAML files in a project repo. Simply drop your Datadog-style monitoring/metrics config and your load balancer spec in your project's Github repo and the deployment pipeline will build out your monitoring, metrics, and LB automatically for you.
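To illustrate the shape of that kind of repo-driven config — the file layout, field names, and schema below are entirely hypothetical, not our actual format:

```yaml
# ops/monitoring.yaml in the project repo (hypothetical schema)
monitors:
  - name: http-5xx-rate
    query: "avg:myapp.http.5xx{env:prod}"   # Datadog-style metric query
    threshold: 0.01

# ops/loadbalancer.yaml (hypothetical schema)
loadbalancer:
  port: 443
  healthcheck: /healthz
```

The deployment pipeline reads these at deploy time and reconciles the monitors and load balancer against what's declared, the same way it reconciles the Kubernetes objects themselves.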
If you are interested in some of the things that we helped get into this release, see our "preview" blog post from a few weeks ago covering RBAC, the rkt container engine, a simpler install, and more: https://coreos.com/blog/kubernetes-v1.3-preview.html
Can't wait to continue the success with v1.4!
Watching issues like https://github.com/kubernetes/kubernetes/issues/23478 and https://github.com/kubernetes/kubernetes/issues/23174, I'm not super interested in "kicking the tires"; I'm evaluating replacing all our environment automation with a version built around Kubernetes. Easy-up scripts that hide a ton of nasty complexity won't do the trick.
Following the issues, I'm getting the impression that too much effort is being put into CM-style tools vs. making the underlying components more friendly to set up and manage. Did anyone see how easy it is to get the new Docker orchestration running?
Then there is the AWS integration documentation. I'm following the hidden aws_under_the_hood.md updates, but I'm still left with loads of questions, like: how do I control the created ELB's configuration (cross-zone load balancing, draining, timeouts, etc.)?
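For context, the ELB in question is the one Kubernetes creates when you declare a Service of type LoadBalancer, and the manifest gives you almost no knobs for the ELB's behaviour, which is exactly the problem (names below are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp          # placeholder name
spec:
  type: LoadBalancer   # Kubernetes provisions and manages the ELB itself
  selector:
    app: myapp
  ports:
    - port: 443
      targetPort: 8443
# There is no field here for cross-zone balancing, connection draining,
# or idle timeouts; as of 1.3 those ELB settings sit outside the Service spec.
```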
I re-evaluate after every update and there are some really nice features being added, but at the end of the day ECS is looking more and more like the direction to go for us. Sure, it's lacking a ton of features compared to Kubernetes, and it's nigh on impossible to get any sort of information about roadmaps out of Amazon... But it's very clear how it integrates with ELB and how to manage the configuration of every underlying service. It also doesn't require extra resources (service or human) to set up and manage the scheduler.
"Federation" in this context is across clusters, which is not something other systems really do much of, yet. You certainly don't want to gossip on this layer.
"evaluating replacing" really does imply "kicking the tires". Put another way - how much energy are you willing to invest in the early stages of your evaluation? If a "real" cluster took 3 person-days to set up, but a "quick" cluster took 10 person-minutes, would you use the quick one for the initial eval? Feedback we have gotten repeatedly was "it is too hard to set up a real cluster when I don't even know if I want it".
There are a bunch of facets of streamlining that we're working on right now, but they are all serving the purposes of reducing initial investment and increasing transparency.
> how easy it is to get the new Docker orchestration running
This is exactly my point above. You don't think that their demos give you a fully operational, secured, optimized cluster with best-perf networking, storage, load-balancing etc, do you? Of course not. It sets up the "kick the tires" cluster.
As for AWS - it is something we will keep working on. We know our docs here are not great. We sure could use help tidying them up and making them better. We are just BURIED in things to do.
Thanks for the feedback, truly.
Whatever you may think of my level of knowledge or weak knees for consensus and gossip protocols, these problems (perceived or otherwise) with setup, documentation, and management seem pretty widely reported.
I hope this doesn't sound too negative. Kubernetes IS getting better all the time. I only write this to give a perspective from somebody who would like to use Kubernetes but has reason for pause. Our requirements are likely not standard; our internal bar for automation and ease of use is quite high. We essentially have an internal, hand-rolled, Docker-based PaaS with support for ad-hoc environment creation (not just staging/prod). We would like to move away from holding the bag on our hand-rolled stuff and adopt a scheduler :) Deciding to pull the trigger on any scheduler, though, would be committing us to a rather large amount of integration effort to reach a parity that doesn't seem riddled with regressions over the current solution.
So: there was a big discussion about whether a single k8s cluster should span multiple AZs (which shipped in 1.2), or whether we should allow the API to target multiple independent clusters (federation, the first version of which is shipping in 1.3). The core of the argument is that multi-zone is simpler for most users, but with only one control plane it is less reliable than a federation of totally independent clusters. Federation also brings other benefits, like solving the problem of running in clusters that are not in a single "datacenter" i.e. where you need to worry about non-uniform latency. I haven't seen anyone else make a serious attempt at solving this.
So, remember that the issue tracker is filled with the unvarnished discussions that come from true open source development. I think it is an asset for you, because you don't discover those things 3 months into using your chosen product; but it is definitely a liability for k8s, because we rely on you realizing this in your initial evaluation and weighting appropriately (the devil you know vs the devil you don't). I think k8s is likely much better than you think it is, and you should come talk to us on slack and make sure of that fact! It certainly sounds like you have an interesting use case that we'd like to hear about and consider.
But yes, our docs should be better!
Regarding the multi-AZ support issue - this is mostly because an EBS volume can only be attached to EC2 instances in the same AZ, and since Kubernetes has great support for persistent data volumes, you're pretty much limited to a single AZ if you're using persistent data volumes and want them to be remounted on a different instance in case of a failure. I think a more viable solution for persistent data volumes is to leverage EFS and use Convoy NFS to mount them. Now you have highly available, scalable, persistent data volumes, and you can stretch your cluster across multiple AZs.
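From Kubernetes' point of view, an EFS-backed volume is just an NFS mount, so the PersistentVolume is straightforward. A sketch — the server address is a placeholder for whatever your EFS mount target resolves to:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-data
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany    # NFS allows mounts from many nodes, across AZs
  nfs:
    server: fs-12345678.efs.us-east-1.amazonaws.com   # placeholder EFS mount target
    path: /
```

Unlike an EBS-backed PV, this volume isn't pinned to one AZ, so a pod can be rescheduled onto an instance in any zone and still remount it.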
The debate between automation vs simplification is one that has gone on since k8s 1.0 and likely will continue to be had. But I think to an extent it is a false choice: I created a new k8s installation/ops tool (i.e. did work on the "automation" side), and out of that I got a fairly clear road-map for how we could simplify the installation dramatically in 1.4. In other words, some of the simplification you ask for comes by embedding the right pieces of the automation. k8s is open-source, so I have to persuade others of this approach, but I think that's a good thing, and I'd hope you'd join in that discussion also (e.g. #sig-cluster-lifecycle on the k8s slack).
To be clear, nothing is "masterless" - please go check out the production deployments for other container management solutions, they all require a separate control plane when running in production with a cluster of any reasonable (>64 nodes) size. FYI, it's a best practice when running a cluster of any size to separate the control plane.
To your direct question, with the other orchestration tools, how would you manage your ELB? Wouldn't you have your own management? They don't (to the best of my knowledge) do any sort of integration - not even the minimum level that Kubernetes does.
As others in the thread mentioned, this was the cut of the binary, we'll be talking a lot more about it, updating docs and sharing customer stories in the coming weeks.
Thanks, and please don't hesitate to let me know if you have any questions!
Disclosure: I work at Google on Kubernetes.
Disclosure: I do not work at Google
The experience, definitely something I’m looking forward to, needs a lot of improvement if your laptop has an Apple logo on it. Hopefully some part of the team is working on that :)
Disclosure: I work at Google, on minikube.
Sounds pretty interesting, especially all the part about service discovery & node health/replacement.
Anyone using it for production?
Otherwise, there's a list at http://kubernetes.io/community/, including: New York Times, eBay, Wikimedia Foundation, Box, Soundcloud, Viacom, and Goldman Sachs, to name a few.
A final build of 1.3 was tagged, with an accompanying changelog and announcement post. I found it weird that there was no more ceremony than that, and no prior submission on HN; since it had been announced through the kubernetes-announce mailing list 17 hours earlier, I figured its existence would be interesting to the community, so I submitted it in good faith.
In any case, kudos to everybody working on it and congratulations on the release, whether it's this week or the next.
My understanding is that with the timing of the US holiday, it made more sense to hold off on the official announcement for a few days. So that's why there aren't more announcements / release notes etc; and likely there won't be as many people around the community channels to help with any 1.3 questions this (long) weekend.
You should expect the normal release procedure next week! And if you want to try it out you can, but most of the aspects of a release other than publishing the binaries are coming soon.
I'm the new executive director of the Cloud Native Computing Foundation, which hosts Kubernetes. We have end user members like eBay, Goldman Sachs and NCSoft, but we're in need of startup end users (as opposed to startup vendors, of which we have many).
Please reach out to me at dan at linuxfoundation.org if you might like to be featured in a case study.
Great to see an openstack provider's been added, too.
Does anyone have examples of how they are managing deployments? I.e. deploying app update, running db migrations perhaps?
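One common pattern as of 1.2/1.3 is a Deployment object per app: you bump the image tag and re-apply, and the Deployment controller performs a rolling update. DB migrations are trickier and are often run as a one-off pod or job before the rollout. A minimal sketch, with all names and the registry URL as placeholders:

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:v42   # bump this tag to roll out
```

Then `kubectl apply -f deployment.yaml` rolls the new version in, replacing pods gradually rather than all at once.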
Not that it's very exciting to anyone who is familiar with Services + Pod networking, but there's a video demo: https://asciinema.org/a/48294
Kudos to them, and awesome to see people working to get Kubernetes to work on Azure.
> Support for ap-northeast-2 region (Seoul)
In addition, different regions support different AWS features/products and being a newer region usually means the least amount of support. So any setup tooling or infrastructure integration needs to account for those differences and use alternatives if certain services aren't available.