
Borg: The Predecessor to Kubernetes - brendandburns
http://kubernetesio.blogspot.com/2015/04/borg-predecessor-to-kubernetes.html
======
anonymuse
I think it'll be quite interesting to see how the smaller players organize
themselves around the multitude of cluster resource management tools emerging
as a natural reaction to Kubernetes growing out of the work Google's done on
Borg.

I am curious to see how long a shake-out period will exist before there's a
de facto stack of "compute resource" tooling, or whether there's always going
to be a highly fragmented and diverse set of ways to accomplish your goals.
Just off the top of my head (and there's way more), I'm thinking about
Tectonic [1], Mesosphere [2], Rocket [3], and Kismatic [4] as a few examples.

As a technologist and a planner, it's been challenging to see far enough into
the future to decide on what tools to devote myself to learning at this point.
I do think we're certainly in a "post-public cloud" timeline where we're
getting good enough (or will be in 6-12 months) at abstracting virtualization
right up to a millimeter or two below the application layer of our stacks. How
we choose to do so seems to be currently up in the air.

In my mind, this opens up the possibility of compute as a resource much wider
than had previously been possible. We'll be less reliant upon Azure, AWS, and
GCP's mixture of PaaS and IaaS and much more interested in compute as a
resource, likely from bare metal or private cloud providers.

I'm looking forward to the increased efficiency (in both compute power and
cost) and security available in moving from application-level virtualization
to operating-system-level virtualization.

[1] https://coreos.com/blog/announcing-tectonic/
[2] https://github.com/mesosphere
[3] https://github.com/coreos/rkt
[4] https://github.com/kismatic

~~~
brendandburns
Yeah, I think that, sadly, there is going to be a bit of an inevitable
equivalent to the Unix wars of the early '80s. The sooner we can reach a
standard place, the better it's going to be for the container community and
for developers more generally.

One of the reasons I pushed hard to get Kubernetes open-sourced is the hope
that we could get out in front of this and allow the developer community to
rally around Kubernetes as an open standard, independent of any provider or
corporate agenda.

~~~
rjeaster
There needs to be a compelling way to run it on AWS for this to happen.

~~~
brendandburns
Please check out:
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/getting-started-guides/aws.md

for turn-up instructions on AWS. It's as easy as:

    export KUBERNETES_PROVIDER=aws; wget -q -O - https://get.k8s.io | bash

~~~
sneak
Ugh, to think people responsible for running large infrastructure
installations are still piping random webpages to their shell in 2015.

It's not just them - doing things this way makes it seem like this is _in any
way acceptable_. It's not. Stop it.

No wonder it's so easy for TAO.

~~~
hackerboos
Just download the script and check it first before running it then.
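
In practice, that looks something like the following sketch (the get.k8s.io URL is taken from the one-liner above; the filenames are illustrative, and the actual install step is deliberately left commented out until you've read the script):

```shell
# Fetch the installer to a file instead of piping it straight into bash
# (the || fallback just keeps this sketch going if the download fails).
wget -q -O kube-install.sh https://get.k8s.io || : > kube-install.sh

# Actually read what it does before executing anything.
head -n 40 kube-install.sh    # or open the whole thing in $PAGER

# Record a checksum of the exact bytes you reviewed, so a later
# re-download can't silently change underneath you.
sha256sum kube-install.sh | tee kube-install.sh.sha256

# Only once you're satisfied, run it with the same environment
# as the one-liner:
#   export KUBERNETES_PROVIDER=aws
#   bash kube-install.sh
```

The same shape works for any curl-pipe-bash installer; the point is simply that download, review, and execute become separate, auditable steps.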

~~~
sneak
It's not me; I don't run scripts written by people who think it's okay to exec
shit from the network.

It's for the people who don't know any better and see this anti-pattern
everywhere and thereby begin to think it's okay or accepted. It's not.

------
jqgatsby
I was at Google from 2006-2012 and like most Googlers, used Borg extensively.
Since leaving, I've been generally impressed by the AWS ecosystem, but have
been sorely missing Borg-like functionality. It's felt like the rest of the
industry is about 10 years behind where Google is.

I think the crucial question for us is going to be adoption and support within
the AWS ecosystem. It checks out (to me at least) as the technically superior
option, but Amazon clearly wants to compete in this space as well, and they
have the home-turf advantage.

Like @brendandburns, I just want the best technology to win and become the
standard. It would be a shame if the Amazon/Google rivalry got in the way of
something that important.

Can someone from Amazon chime in on this? Is there anything the Google team
could do that would make Kubernetes a neutral project that Amazon would
support? I feel that there's a ton of raw knowledge that Google engineers have
accumulated on cluster management, and Kubernetes is an opportunity for that
not to go to waste.

------
thinkersilver
Kubernetes is going to become the standard API for container orchestration
only because no other tools out there are trying to do as much. There was a
vacuum around container orchestration tooling and Google got there first.
Kubernetes components can be swapped out for other community-driven efforts;
take Mesos as an example, which can be used to replace the default k8s
scheduler. With k8s you can avoid lock-in with different cloud providers. I
think Google is hoping that we end up on their cloud platform, but it's nice
to see that it is being built from the ground up to be used with other cloud
platforms.

~~~
InTheArena
I'm really not sold on this yet. We've done a number of projects testing and
using Kubernetes, CoreOS, Mesos, and most recently Docker's Swarm. It's been
interesting to see how and why the technology space is evolving, but a couple
of general thoughts:

1) The concept of container as primitive, especially the thorough
implementation that Docker put together, is extraordinarily powerful.

2) The Swarm idea - which provides an API matching the Docker API - is a
really neat idea, even if it lacks the HA and scheduling functions to really
make things work well.

3) I think the next evolution is really to iron out the network stack here.
Kubernetes needs flannel in most circumstances, and the process is not as
seamless or simple as Docker.

I'd also love to know the split between this, Omega, and Kubernetes at Google.

~~~
jamesblonde
I have heard that Borg is still quite widely used at Google, and Omega hasn't
taken over as had previously been expected. Omega is a distributed scheduler,
and we can only speculate as to why it hasn't taken over. My speculation would
be that the optimistic concurrency control in Omega leads to storms at very
high loads - attempts to allocate containers that have to be rolled back
because of contention. With pessimistic concurrency control (PCC), you get
guaranteed progress even at high loads.

~~~
nostrademons
There's also just plain legacy inertia. Many existing systems are on Borg;
many of their dependencies are on Borg; most Google engineers are much more
familiar with Borg than Omega, and the teammates they might ask for help &
advice are also more familiar with Borg than Omega.

Think of how long the Python 2->3 transition has taken (outside Google, not
speaking in Google terms anymore). It's been _six_ years, and we're only now
reaching the point where Python 3 may be a better choice for green-field
projects than Python 2, and Python 3 may _never_ be a better choice for legacy
installs. The Borg -> Omega transition has a similar dependency issue
(everything runs in the cloud at Google), the learning curve is _worse_ than
Python 2->3, and all of Google's code is legacy. That's independent of any
technical differences between them, and also irrelevant to whether an
organization just getting onto the cloud would be better off with Docker,
Mesos, or Kubernetes.

~~~
jamesblonde
That's an issue, I guess, for many apps. However, Google tends to make
company-wide technical decisions, and then the entire engineering crowd goes
there. How many other companies have one SCM instance? None. If there were
irrefutable economic gains to be made by moving to Omega today, my guess is
they would do it.

The technically interesting question is whether decentralized scheduling at
large scale is a solved problem or not. Can we do it better than centralized
scheduling today?

~~~
nostrademons
Google absolutely does not make company-wide technical decisions and then the
entire engineering crowd goes there. Rather, they make company-wide technical
decisions, and over a period of 3-5 years the entire engineering crowd
gradually gets there. As we used to say: "There are two ways to do everything
at Google: the deprecated one and the one that doesn't work yet." In some
cases I've seen up to 3 deprecated systems in flight, plus one that doesn't
work yet. _Borg's_ predecessor was finally removed from production shortly
before I left in 2014, despite being deprecated around 2005.

