
Kubernetes: The Future of Deployment - bashtoni
http://www.bashton.com/blog/2015/kubernetes-future-of-deployment/
======
falcolas
Worth remembering: Kubernetes was built for Google's needs, and Google runs
with a shared network space on any given VM and assigns an entire /24 to the
VM running Docker. Each container gets one of those addresses. [1] This
probably won't work for everyone - be sure to read up on the fine-grained
details before drinking the Kool-Aid.

They're also at least two build versions behind Docker.[2]

[1]
[https://github.com/GoogleCloudPlatform/kubernetes/blob/maste...](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/networking.md)

[2]
[https://github.com/GoogleCloudPlatform/kubernetes/blob/maste...](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/cluster/saltbase/salt/docker/init.sls)
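
To make the addressing model in [1] concrete, here's a minimal Python sketch (illustrative names and ranges, not actual Kubernetes code): the cluster owns a large range, each node is handed a /24 out of it, and every container gets a routable address from its node's block.

```python
import ipaddress

# Illustrative sketch of the model in [1], not Kubernetes internals:
# carve a cluster range into per-node /24s and hand each container
# one routable address out of its node's block.
cluster_range = ipaddress.ip_network("10.244.0.0/16")  # assumed range
node_blocks = cluster_range.subnets(new_prefix=24)     # one /24 per node

node_block = next(node_blocks)  # this node's /24 (254 usable addresses)
free_addrs = node_block.hosts()

containers = {}
for name in ("nginx", "redis", "app"):
    containers[name] = next(free_addrs)  # one cluster-routable IP each

for name, addr in containers.items():
    print(f"{name}: {addr}")
```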

~~~
xorcist
Google runs Docker in a VM? Why would you do that?

I understand you might want to play with it in a lab environment, but running
it in production at scale (especially Google scale) sounds very strange.

~~~
brendandburns
Users of the Google Cloud run Docker in VMs, since VMs are what the Google
Cloud Platform sells.

(as does every public cloud provider [e.g. AWS])

For now, VMs are required to ensure a security barrier between different
users' containers on the same physical machine. See some of Dan Walsh's posts
on the subject (e.g. [https://opensource.com/business/14/9/security-for-
docker](https://opensource.com/business/14/9/security-for-docker)) for more
context.

~~~
jganetsk
Google Container Engine runs containers that are in Docker format. The user
does not have to deal with Docker or a VM.

[https://cloud.google.com/container-
engine/](https://cloud.google.com/container-engine/)

There's also Amazon EC2 Container Service

[http://aws.amazon.com/ecs/details/](http://aws.amazon.com/ecs/details/)

So Google and Amazon don't just sell VMs. They sell "CMs" as well (Container
Machines).

~~~
gobengo
It's most likely that even the "CMs" from both providers are actually virtual
machines running on a hypervisor on bare metal. You just can't tell, and you
don't need to care (for most workloads).

~~~
falcolas
Yup, you can even SSH to them and poke around yourself.

~~~
jganetsk
How does that prove it's a VM? How do you know it's not cgroup isolation with
a chroot jail? Also known as containers?

~~~
takeda
Because you're the one setting them up. Basically, you run the Amazon-provided
agent on an EC2 instance and ECS sees that instance as a host.

Also, Amazon bills you for that EC2 instance like any other instance.
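
You can see the mapping directly from the API; a quick boto3 sketch (assuming configured credentials and a cluster named "default"):

```python
import boto3  # assumes AWS credentials and region are configured

ecs = boto3.client("ecs")

# Every ECS "container instance" is just an EC2 instance that runs the
# Amazon-provided agent and registered itself with the cluster.
arns = ecs.list_container_instances(cluster="default")["containerInstanceArns"]
if arns:
    detail = ecs.describe_container_instances(
        cluster="default", containerInstances=arns
    )
    for ci in detail["containerInstances"]:
        # ec2InstanceId is the ordinary, separately billed EC2 instance
        print(ci["ec2InstanceId"], ci["status"])
```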

Personally, I have a hard time understanding the benefits of running Docker in
a public cloud: you still run a VM, and you still pay for that VM. It's just
one extra abstraction layer, which increases the complexity of your
infrastructure and also reduces performance.

I do understand the benefits of using containers in your own data center, when
you run them on bare hosts. There's simplicity, and there's lower cost: because
you don't have the VM layer, you have more resources, which lets you run more
containers on a host than you could VMs.

~~~
scprodigy
People use Docker in a public cloud (VM) primarily to simplify the deployment
pipeline, not for LXC.

Given this, it actually makes sense to combine VMs with Docker; check out
www.hyper.sh

~~~
takeda
My problem is that I don't believe you can use Docker without using
containers. And if you want to simplify the pipeline, why not just use
rpm-maven-plugin[1]? You can easily deploy, dependencies included; it's fast;
and you can easily upgrade or downgrade. And there's no need to figure out the
complexities imposed by involving LXC.

[1] [http://mojo.codehaus.org/rpm-maven-plugin/](http://mojo.codehaus.org/rpm-
maven-plugin/) (the website does not seem to be available at the moment due
to the recent Codehaus shutdown)

------
lobster_johnson
I'd love for someone to explain how Kubernetes compares to Mesos. Every
article I find on the subject says they are mutually beneficial, not
competitors — that you would typically run Kubernetes as a Mesos framework —
yet Kubernetes also seems like it duplicates much of Mesos' functionality on
its own.

~~~
SEJeff
Kubernetes (k8s) makes for an amazing developer story. Mesos is much more bare
metal, but its scheduler scales a loooot better than the still relatively
immature k8s scheduling component. One of the original authors of Mesos wrote
a paper on scheduling:
[https://www.cs.berkeley.edu/~alig/papers/drf.pdf](https://www.cs.berkeley.edu/~alig/papers/drf.pdf).
Mesos is one of the first "two-level" schedulers. I also highly recommend this
article for a sense of why that design is a good idea:
[http://www.umbrant.com/blog/2015/mesos_omega_borg_survey.htm...](http://www.umbrant.com/blog/2015/mesos_omega_borg_survey.html)
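
The core of DRF is small enough to sketch; here's a toy Python version of the paper's running example (my own illustration, not Mesos code):

```python
# Toy Dominant Resource Fairness (DRF), after the paper's running
# example: 9 CPUs and 18 GB total; user A's tasks need <1 CPU, 4 GB>,
# user B's need <3 CPUs, 1 GB>. My own illustration, not Mesos code.
capacity = {"cpus": 9.0, "mem": 18.0}
demand = {"A": {"cpus": 1.0, "mem": 4.0}, "B": {"cpus": 3.0, "mem": 1.0}}
alloc = {u: {"cpus": 0.0, "mem": 0.0} for u in demand}

def dominant_share(user):
    # A user's dominant share is their largest fractional use of any resource.
    return max(alloc[user][r] / capacity[r] for r in capacity)

def fits(user):
    return all(
        sum(alloc[u][r] for u in alloc) + demand[user][r] <= capacity[r]
        for r in capacity
    )

# Repeatedly offer resources to the user with the lowest dominant share.
while True:
    user = min(demand, key=dominant_share)
    if not fits(user):
        break
    for r in capacity:
        alloc[user][r] += demand[user][r]

print(alloc)  # A runs 3 tasks, B runs 2 -- dominant shares equalize at 2/3
```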

The k8s upstream was forward-thinking enough to make the scheduler pluggable,
which allows the (imo) holy grail of something like this:
[https://github.com/mesosphere/kubernetes-
mesos](https://github.com/mesosphere/kubernetes-mesos). This gives you the
nice logical "pod" description for services, running on the battle-tested
Mesos scheduler.

There are many 10k+ node bare-metal Mesos deployments (Apple, Twitter, etc.).
There aren't yet many Kubernetes deployments of that scale. They truly are
mutually beneficial: Mesos makes ops happy, and k8s makes devs happy. Together
you have a relatively easy-to-set-up internal PaaS (your own Heroku without a
ton of work), more or less.

Disclaimer: I'm a heavy mesos and apache aurora user.

~~~
lobster_johnson
Thanks for the explanation. Sounds like Kubernetes should work just fine for
small (<20 nodes) clusters, though.

I'm still not quite understanding what utility Kubernetes brings to the table
if you can also use it with Mesos. If you use Mesos, why involve Kubernetes at
all, and not some Mesos-specific framework like Marathon or Aurora? Is
Kubernetes simply a competitor to those frameworks?

My concern about Mesos is mainly footprint and complexity. You need to run
ZooKeeper, the master, the slaves, and then each framework. Only Mesos itself
is written in C++; everything else is JVM, which is a pretty significant
memory hog. By installing Mesos you just increased the complexity of the
deployment/ops stack by a huge margin; you reap many benefits, of course, but
Mesos is a lot more opaque and complex than a few daemons and some SSH-based
scripts.

~~~
thockingoog
Kubernetes supports 100 nodes with ease, and we expect to handle much more
than that very quickly. We just had to pick some target to start with.

------
dcosson
Does anyone have resources about security/isolation best practices for running
multiple applications on Kubernetes (or Mesos or similar)?

For instance, in a non-cloud-native app that runs in VMs, you might have one
app per VM and firewalls between VMs that don't need to talk to each other.
Then if a non-critical app got compromised and an attacker got remote
execution or SQL injection or something, they couldn't get to your other app
servers or databases.

If all your apps are in a cluster, the non-critical compromised app might be
running on the same host as a critical app, in which case the only thing
keeping the attacker from your database credentials or other secrets is
Docker's container isolation, which, if I understand correctly, is not assumed
to be secure the way VM isolation is.

What are people doing to address this? Or are my assumptions wrong, and it's
not actually a problem to worry about? My initial impression of Mesos was
that you'd only use it if you're at big enough scale that you're running a
huge number of instances of the same app, or you're running a lot of different
data-processing tasks that all access the same data, so no isolation is needed
between them. Now I see Kubernetes being discussed frequently as a great way
to run all your different microservices at any scale (e.g. "The Future of
Deployment"), but I've never seen this aspect of security discussed.

~~~
jacques_chester
You might prefer Cloud Foundry, which is switching its underlying container
scheduling fabric to Lattice[1].

In particular, Cloud Foundry has more advanced security-group features,
because it's mostly being marketed to enterprise customers.

Disclaimer: I have worked on CF and I work for a company which is a major
contributor to CF.

[1] [http://lattice.cf/](http://lattice.cf/)

~~~
dcosson
Lattice looks interesting; I'm looking forward to checking it out more.

------
NotOscarWilde
A slightly off-topic comment, but as an early-stage PhD student in theoretical
CS whose thesis topic is approximation algorithms for scheduling, I would like
to know whether there are theoretical problems related to these VM schedulers
used in practice. If there is somebody knowledgeable about what is
theoretically open (an unknown tight approximation ratio, for instance) AND
very useful to the people building Kubernetes et al., I would be really happy
to learn more.

(The natural advice is to "hit the books", actually read the papers related to
Kubernetes and find out what is both theoretical and useful to this area. I
intend to do that soon, but sifting through "practical papers" and looking for
something interesting in theory is a lot of work, and I just hoped there might
be somebody who could provide a shortcut.)

~~~
philip1209
It was more of an amusement, but I used integer programming at a company
hackathon to build a better image scheduler:

[https://engineering.opendns.com/2015/05/06/docker-
container-...](https://engineering.opendns.com/2015/05/06/docker-container-
scheduling-as-a-bin-packing-problem/)

With a large, fairly homogeneous environment it didn't outperform random
assignment by much, though. It worked best with small, inhomogeneous loads.
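
For the curious, the core of such a formulation is only a few constraints; here's a minimal PuLP sketch of containers-to-hosts bin packing (illustrative data, not the actual code from the linked post):

```python
# Containers-to-hosts bin packing as an ILP, sketched with PuLP
# (pip install pulp). Illustrative data, not the OpenDNS code.
from pulp import LpBinary, LpMinimize, LpProblem, LpVariable, lpSum

mem = {"web": 3, "db": 5, "cache": 2, "worker": 4}  # GB per container
hosts = ["h0", "h1", "h2", "h3"]
capacity = 8  # GB per host

prob = LpProblem("container_bin_packing", LpMinimize)
x = LpVariable.dicts("place", (mem, hosts), cat=LpBinary)  # container on host?
y = LpVariable.dicts("used", hosts, cat=LpBinary)          # host opened?

prob += lpSum(y[h] for h in hosts)  # objective: use as few hosts as possible

for c in mem:    # each container is placed exactly once
    prob += lpSum(x[c][h] for h in hosts) == 1
for h in hosts:  # never exceed an opened host's memory
    prob += lpSum(mem[c] * x[c][h] for c in mem) <= capacity * y[h]

prob.solve()
for h in hosts:
    placed = [c for c in mem if x[c][h].value() == 1]
    if placed:
        print(h, placed)
```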

~~~
NotOscarWilde
Right, ILP is a great tool for solving NP-complete problems relatively fast
(depending on the solver, but there are some very good ones out there).
However, as a theoretical tool it probably is not that exciting unless you're
ready to tackle P vs. NP this way. (Unless you move to semidefinite
programming and the SDP hierarchies, where the progress is very exciting but
not yet that applicable to scheduling, to the best of my knowledge.)

 _> With a large, fairly homogeneous environment it didn't outperform random
assignment by much, though. It worked best with small, inhomogeneous loads._

Yes, that's probably a piece of the puzzle that I don't have yet -- a
theoretical model that is both useful in practice and in which
greedy/randomized assignment is not "good enough" for practical uses.

------
sferoze
An interesting side note: the Meteor development group is contributing to
Kubernetes and will be using it to help scale Meteor with their upcoming paid
service, Galaxy.

~~~
josephjacks
It's great to see independent software companies like Meteor embrace
Kubernetes as the platform on which to build their next-generation services
[0].

[0] [http://info.meteor.com/blog/meteor-and-a-galaxy-of-
container...](http://info.meteor.com/blog/meteor-and-a-galaxy-of-containers-
with-kubernetes)

------
dchuk
What's the deal with the name "Kubernetes"? Does it mean anything, or have
some tech significance, or is it really just because it basically means
"ruler" in Greek?

~~~
thebeardisred
It means "Helmsman" in ancient Greek. Similarly it's related to the word
"Governor"

e.g: "kubernan" in ancient greek means to steer "kubernetes" is helmsman

"gubernare" means to steer or to govern in Latin "gubernator" is "governor" in
Latin

Which then leads into the modern word "Gubernatorial", et al.

~~~
henrikschroder
It's also a pun on Borg Cubes.

------
sunyc
Bundling with (unholily immature) SDN is the most damning thing for its
adoption. SDN is thought to be needed for "live migration", but I don't see
myself needing that anytime soon, because we run on virtual machines anyway.

IaaS providers are not going away; paying the cost of SDN now, for features
that don't even exist yet, is insane.

~~~
brendandburns
(kubernetes contributor here)

SDN isn't required for k8s. What is required is that each Pod (a group of
containers) gets its own IP address, and that the IP address is routable in
the cluster. In many cases the easiest way to achieve this is via an SDN, but
it is also achievable by programming traditional routers.

The reason for wanting an IP address per pod is that it eliminates the need
for port mangling, which dramatically simplifies wiring applications together.

~~~
sunyc
All applications were already designed to be port-based. I don't see how this
would drastically change that.

~~~
brendandburns
The problem with port mangling is that your application starts running on
random ports, so in addition to requiring discovery for IP addresses, you now
also have to do discovery for ports, which pretty much requires custom code
and infrastructure linked into your binaries (how do you convince
nginx/redis/... to use your lookup service for ports?).

And ports are different between different replicas of your service, since
they're chosen at random during scheduling.

It also makes ACLs and QoS harder to define for the network, since you don't
have a clean network identity (e.g. an IP address) for each application.
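
A tiny before/after illustration of that point (hypothetical addresses, not the Kubernetes API):

```python
# With an IP per pod, every replica listens on its service's canonical
# port, so clients only need name -> IP resolution (plain DNS suffices).
# Hypothetical addresses for illustration, not the Kubernetes API.
ip_per_pod = {"redis-1": "10.244.1.7", "redis-2": "10.244.2.3"}
for name, ip in ip_per_pod.items():
    print(f"{name} -> {ip}:6379")  # same well-known port everywhere

# With a shared host IP and port mangling, each replica is assigned a
# random host port at schedule time, so clients need a custom lookup
# for (ip, port) pairs -- and off-the-shelf servers don't know how to
# ask your registry which port they were given.
port_mangled = {"redis-1": ("10.240.0.5", 31642),
                "redis-2": ("10.240.0.5", 30218)}
for name, (ip, port) in port_mangled.items():
    print(f"{name} -> {ip}:{port}")  # port differs per replica
```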

------
NateDad
Wow, you could totally s/kubernetes/juju/ in this article and still be 100%
correct. ([https://jujucharms.com](https://jujucharms.com) for those not aware
of juju)

~~~
saryant
Not cool. At least disclose that you're one of the devs behind Juju.

~~~
NateDad
Sorry, I didn't think of it. I've mentioned it multiple times on HN, but
you're right, I should have added a disclosure.

I seriously didn't mean to be pushing Juju; I was just surprised at how
similar Kubernetes is to it. I had always sort of assumed Kubernetes was
Google Cloud only, and/or containers only, etc.

