
Large-scale cluster management at Google with Borg - plantain
http://research.google.com/pubs/pub43438.html
======
obstinate
Oh, fun, they finally published this. How many times I've had to catch myself
from saying the word "Borg" I don't even know.

The performance isolation is good, but I wouldn't really seek after high
utilization unless compute costs are significant to your business. We've seen
some crazy things where nominally non-interfering jobs cause significant
performance degradation to other jobs on the same node. There's work yet to do
here.

~~~
juliangregorian
Why is everyone acting like the name was some big secret? Everyone who has
been using Mesos knows Borg, for crying out loud. Big writeups were published
in Wired and The Verge over 2 years ago!

~~~
dekhn
It was a secret. You weren't allowed to mention Borg outside Google. Note that
John Wilkes was generally careful about not saying the name was 'Borg' or
acknowledging it as such, and instead referred to 'Omega'.

~~~
juliangregorian
Perhaps you were told it was secret. It was not secret. Here's one of the
articles I was referring to:
[http://www.wired.com/2013/03/google-borg-twitter-mesos/](http://www.wired.com/2013/03/google-borg-twitter-mesos/)
John Wilkes mentions Borg, acts coy about the name for some reason (seems like
a pattern), and mentions Omega, its nascent replacement.

~~~
dekhn
John does not mention Borg by name publicly (well, until the paper was
published). That's what I meant (consistent with what I was replying to). The
article you linked to confirms that.

"According to Wilkes, Google plans to publish a research paper on Borg (though
he still won’t use the name)."

Wilkes won’t even call it Borg. “I prefer to call it the system that will not
be named,”

(I work with John and have contributed to Borg)

------
kozyraki
All of us who have used Borg over the years are very appreciative of the
technology and its capabilities. Congrats to all the Googlers, current and ex,
that contributed to it. This paper should be toward the top of the reading
list for anyone working on the topic.

Nevertheless, there are many open questions in large-scale cluster management
for researchers and developers to address. Here are some of my favorites:

- The curse of overprovisioning: Borg and many other systems rely on
  reservations, which users systematically exaggerate. Right-sizing these
  reservations is one way to go beyond the 40-50% usage shown in the Borg paper
  (see fig 12). A promising way of doing this is Christina Delimitrou's work
  using classification techniques (see
  [http://goo.gl/vFf8oN](http://goo.gl/vFf8oN)).
- Oversubscription using better isolation mechanisms: this is what the Borg
  paper calls resource reclamation. Take unused (but reserved) resources from
  priority jobs and use them for best-effort analytics. David Lo
  ([http://web.stanford.edu/~davidlo/](http://web.stanford.edu/~davidlo/)) has
  a very interesting paper coming up on how to coordinate cpu sets, cache
  partitioning, Linux TC, and RAPL/DVFS (power management) to run websearch
  clusters at >90% utilization by packing them with analytics without causing
  ANY glitch on search. And that is Google search.
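
To make the reclamation idea concrete, here is a minimal Python sketch of the arithmetic: a priority job holds a reservation, actually uses less, and whatever is left beyond its usage plus a safety margin could be lent to best-effort work. The function name `reclaimable`, the 10% margin, and the job numbers are all made up for illustration; none of them come from the paper or from Borg itself.

```python
def reclaimable(limit: float, usage: float, safety_margin: float = 0.1) -> float:
    """Return how much of a priority job's reserved resource could be lent out.

    limit:         what the job reserved (e.g. CPU cores)
    usage:         what the job is actually using right now
    safety_margin: fraction of the limit held back as headroom (assumed value)
    """
    # Hold back current usage plus headroom, but never more than the limit.
    held_back = min(limit, usage + safety_margin * limit)
    return limit - held_back


if __name__ == "__main__":
    # Hypothetical jobs: name -> (reserved cores, cores in use)
    jobs = {"search": (16.0, 6.0), "ads": (8.0, 7.5)}
    for name, (limit, usage) in jobs.items():
        print(f"{name}: {reclaimable(limit, usage):.1f} cores available for best effort")
```

The hard part, of course, is not this subtraction but predicting usage well and isolating the borrower so the priority job never notices.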

There are definitely more interesting questions. Exciting times.

~~~
mdaniel
> [http://goo.gl/vFf8oN](http://goo.gl/vFf8oN)

goes to:
[http://web.stanford.edu/~cdel/2014.asplos.quasar.pdf](http://web.stanford.edu/~cdel/2014.asplos.quasar.pdf)

It's my understanding that URL shorteners are frowned upon in HN posts or
comments.

~~~
kozyraki
I learn something every day :)

------
WestCoastJustin
There was a great talk by John Wilkes (Google Cluster Management) re: Omega in
2011 at Google Faculty Summit [1]. Absolutely fascinating to see the scope of
the problems they are dealing with.

[1]
[https://www.youtube.com/watch?v=0ZFMlO98Jkc](https://www.youtube.com/watch?v=0ZFMlO98Jkc)

Edit: remove error in my comment re: borg/omega order.

~~~
nostrademons
Omega actually comes after Borg (hence the name). It's just that Borg was
quite confidential until now. I'm pretty surprised they let the name slip.

~~~
tonfa
The name was public since Omega was publicized. The paper itself is great,
there's everything about Borg in one place (scheduling, quota, isolation,
etc).

------
chuhnk
Was always in awe of Borg and Omega while at Google. Really nice to see them
finally publish a paper on this. Guess they've moved far enough along now that
it makes sense to do so. Omega will be a far superior beast to Borg and the
open source Kubernetes, but I have high hopes for the future of Kubernetes.

------
mandeepj
Apache Yarn[1] looks like the same thing as Borg/Omega. A plus point with Yarn
is that we can get our hands on it.

[1]
[http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yar...](http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)

~~~
samkone
Yarn tries to do the same thing, but is quite different. It's skewed toward
short-running jobs, like batch jobs; it was meant for Hadoop and co. after
all. Borg/Omega seems more like a combination of Mesos at the scheduler layer
and Marathon/Kubernetes on top. It's funny to see how many services they run
on top of it though.

~~~
presspot
This is exactly right. Marathon or Kubernetes running on Mesos or Mesosphere's
DCOS is functionally similar to Borg.

------
bbrazil
> cc would be reachable via 50.jfoo.ubar.cc.borg.google.com.

I've implemented something similar at my current job, as that sort of naming
is very convenient.
[http://www.boxever.com/using-google-apps-openid-connect-with...](http://www.boxever.com/using-google-apps-openid-connect-with-apache)
has a sketch of how to do this with Apache as a reverse proxy with Google
Auth, though we're now using a PAC file pointing at an HTTPS forward proxy to
avoid the limitations of SSL wildcard certs.
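
For anyone who hasn't read that part of the paper: the quoted name is the task index, job, user, and cell, in that order, under a fixed suffix. A rough Python sketch of composing and splitting names with that layout follows. The function names are hypothetical (not a Borg or Google API), and the parser assumes none of the components contains a dot.

```python
SUFFIX = ".borg.google.com"


def borg_dns_name(task_index: int, job: str, user: str, cell: str) -> str:
    """Compose a name in the <task>.<job>.<user>.<cell>.borg.google.com layout."""
    return f"{task_index}.{job}.{user}.{cell}{SUFFIX}"


def parse_borg_dns_name(name: str) -> dict:
    """Split such a name back into its parts (assumes no dots inside a part)."""
    if not name.endswith(SUFFIX):
        raise ValueError(f"not under {SUFFIX}: {name}")
    parts = name[: -len(SUFFIX)].split(".")
    if len(parts) != 4:
        raise ValueError(f"expected task.job.user.cell, got: {name}")
    task, job, user, cell = parts
    return {"task": int(task), "job": job, "user": user, "cell": cell}


if __name__ == "__main__":
    name = borg_dns_name(50, "jfoo", "ubar", "cc")
    print(name)
    print(parse_borg_dns_name(name))
```

The appeal is that the address is derivable from things you already know about the task, so nothing has to be looked up or remembered.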

~~~
dekhn
Keep it up, Brian :-)

~~~
bbrazil
If that was the craziest thing I was doing, I'd be doing well :)

------
mark_l_watson
Nice writeup - I just saved the PDF in my keep-forever PDF repo.

When I contracted at Google in 2013 I loved their infrastructure. For my task
I had to run huge Borg jobs, and the job submission, monitoring, and logging
systems were very easy to use. I really liked the summary of hardware failures
that occurred - hardware really is not very reliable when running at scale.

After not using AppEngine for a few years I have started using it recently for
two personal projects. Using AppEngine's logging and scaling features is a
tiny bit like using Google's internal infrastructure - makes me a little
nostalgic.

------
nextweek2
The Register [1] for years has referred to Cisco as the Borg because of their
acquisition strategy.

The title implied to me that Google and Cisco were working together.

[1]
[http://www.theregister.co.uk/2015/04/16/borg_routers_open_to...](http://www.theregister.co.uk/2015/04/16/borg_routers_open_to_repeat_remote_dos_attack/)

------
jnpatel
How does Borg compare with Google's Kubernetes?

~~~
tonfa
There's a whole section about it in the paper (lessons learned and future
work).

------
mrfusion
Is this similar in concept to Oracle Grid Engine (SGE)? How is it different,
superior?

~~~
mitchell_h
I started doing SGE stuff in ~2001... It still amazes me how well that thing
worked and how often a lot of its features are reinvented.

~~~
mrfusion
I agree. It's really simple to get started (short learning curve) and it's
completely language agnostic.

I'm amazed how often it gets ignored. There's even StarCluster so you can
automatically set up a cluster on EC2.

------
stox
I wonder if disruptive jobs are named Hugh?

------
mkr-hn
You will be scalesimilated.

