However, I'm using Mesos, Marathon, and Chronos to manage a production environment, with service-discovery glue based on Route53.
Using Docker to ship an application to a well-configured environment is just a delight; the amount of configuration needed is absolutely minimal.
However, I think people need to realize that it's only "easy" if your services aren't talking to each other or dependent on one another in some way. If service X is calling service Y directly (via HTTP), it gets a bit more challenging.
The way I like to configure micro-services is based on messaging: you send a message to a queue, and multiple satellite services can consume that message and act on it.
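A minimal in-process sketch of that fan-out pattern; the `Bus` class and topic name below are hypothetical stand-ins for a real broker (RabbitMQ, SQS, etc.):

```python
from collections import defaultdict

class Bus:
    """Toy stand-in for a message queue: satellite services subscribe
    to a topic and each subscriber receives every published message."""
    def __init__(self):
        self.subs = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subs[topic].append(handler)

    def publish(self, topic, message):
        for handler in self.subs[topic]:
            handler(message)

bus = Bus()
seen = []
# Two independent satellite services consume the same event.
bus.subscribe("recommendation.created", lambda m: seen.append(("detect-lang", m)))
bus.subscribe("recommendation.created", lambda m: seen.append(("index-search", m)))
bus.publish("recommendation.created", {"text": "great beach"})
```

The publisher never knows who is listening, which is what keeps each service isolated and independently deployable.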
If your services are dependent on one another, the configuration gets trickier and the maintenance gets a bit harder.
Good job by Box, also, contributing back to the core of Kube based on what they needed; since it got merged, I'm guessing other people will find it useful as well.
We have built some tools for port discovery (talking to Mesos to figure out which port service Y is running on), but even with all our tools, we recently did a complete cloud migration (to GCP), and it was as easy as backing up and deploying Zookeeper on the new nodes. Once the slaves were up, everything was running as if nothing had changed, in under an hour.
Disclosure: I work at Google on Kubernetes.
I agree 100%. Kube answers everything I'm missing with the Mesos/Marathon combination; that's why I'm planning to start moving new services over.
It's just a matter of my personal comfort: I like to use isolated services that just use messaging and fire a message when they're done with their role.
Even if Kube handles everything perfectly, it's still harder to maintain applications with inter-service communication; it's hard to follow problems/errors/stack traces, etc.
How do you handle gatewaying traffic into Kubernetes from non-K8s services? I've been trying to get a basic cluster out the door with one of our most stateless services, but I'm having a hard time just getting the traffic into it.
The mechanism I'm using is having dedicated K8s nodes that don't run pods hold onto a floating IP and act as gateway routers into k8s. They run kube-proxy and flannel so they can reach everything else, but ksoftirqd processes are maxing out CPU cores on relatively recent CPUs while handling about 2Gbps of traffic (2Mpps), which is a bit below the traffic level the non-k8s version of the service is handling. netfilter runs in softirq context, so I figure that's where the problem is.
Are you using Calico+BGP to get routes out to the other hosts? What about kube-proxy?
Our network setup is constantly evolving due to a number of internal networking limitations related to nearly static IP addressing and network ACLs. I'll describe our current setup and then describe where we'd like to go. The big piece of context is that we already have a number of services managed via Puppet, and a smaller number of new and transitioned services in Kubernetes, so we need to allow interop through a number of different mechanisms.
We are currently using Flannel for ip-per-pod addressability within our cluster. No services are communicating inside the cluster, so they aren't using kube-proxy yet. For services outside the cluster talking into the cluster, we are using a heavily modified version of the service-loadbalancer (https://github.com/kubernetes/contrib/tree/master/service-lo...), which we haven't contributed back yet. It supports SNI and virtual hosts, and we get HA and throughput for the individual load balancers by using anycast.
We have a number of internal services outside the cluster slowly moving to SmartStack, so I assume we will be figuring out interop with that and running it as a sidecar at some point. We would like to move to Calico, as we have some fairly high-throughput services running outside the cluster that we need to avoid bottlenecking on a load balancer. We have a separate project running internally to move our network ACLs from network routers to every host via Calico.
Hope that is more helpful than confusing.
You can either bind the container to a host port and register the IP of the node (or use the k8s DNS or API to find the IPs), or register a Service with a NodePort, so that all the nodes accept traffic and load-balance internally.
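For the NodePort option, a sketch of what the Service manifest might look like, built here as a plain Python dict and dumped to JSON (the service name, selector label, and port numbers are hypothetical):

```python
import json

# Hypothetical NodePort Service: every node in the cluster listens on
# nodePort and forwards to targetPort on pods labeled app=myservice.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "myservice"},
    "spec": {
        "type": "NodePort",
        "selector": {"app": "myservice"},
        "ports": [{
            "port": 8080,        # cluster-internal service port
            "targetPort": 8080,  # container port on the pods
            "nodePort": 30080,   # opened on every node (default range 30000-32767)
        }],
    },
}

print(json.dumps(service, indent=2))
```

kubectl accepts JSON as well as YAML, so you could feed this straight to `kubectl create -f`; external systems then only need the IP of any node plus the nodePort.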
You can get a list of ips from the DNS (instead of just the service ip), and I think that interacts appropriately with host ports.
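A sketch of that DNS-based discovery, assuming a headless Service so the cluster DNS returns one A record per pod; the service name shown in the comment is hypothetical:

```python
import socket

def resolve_endpoints(service_dns, port):
    """Return every IPv4 address behind a DNS name. Against a headless
    k8s Service, the cluster DNS hands back one record per ready pod."""
    infos = socket.getaddrinfo(service_dns, port,
                               socket.AF_INET, socket.SOCK_STREAM)
    return sorted({sockaddr[0] for _, _, _, _, sockaddr in infos})

# Inside a cluster this would be something like:
#   resolve_endpoints("myservice.default.svc.cluster.local", 8080)
```

This lets a client do its own per-pod load balancing instead of going through the single virtual service IP.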
We dropped the receive queues down to 12, from 48, and hit line rate. More info here:
Off the top of my head:
Have you thought about putting flanneld on the machines hosting the non-K8s services? Probably impractical, but it's something to consider.
The other is to treat the services inside the cluster as if they were in a different datacenter and explicitly expose NodePorts for the services the outside systems need. If you're using HTTP as the transport, you could run an HTTP proxy inside the cluster to forward requests to the services within it. That's how I got an AWS ELB to talk to the services in a cluster I set up.
I have considered just writing a quick daemon that does only the work of syncing routes without getting a lease (or trying to modify flanneld to do so).
The service in this case is memcache with a bunch of mcrouter pods in front of it to handle failure and cold cache warming. I still need to get traffic to the mcrouter instances and that's where I'm running into the bottleneck.
Fronting the mcrouter pods with a service and using a node port (http://kubernetes.io/docs/user-guide/services/#type-nodeport) is not workable?
My question is: what about network security? How is that part managed?
Disclaimer: I work for Box on Kube.
For example, for us at Gogobot, every time a user submits a recommendation we detect its language. This can be a service instead of a worker, and the code can live separately.
If you do this often enough and aggressively enough, you end up with dozens of services. Once you have an environment that makes it easy to test and launch these, it's much more efficient to launch a service than to replicate your monolith and assign workers to things.
Design it first, then split it where it makes sense.