Large-scale cluster management at Google with Borg (research.google.com)
150 points by plantain on April 16, 2015 | 35 comments

Oh, fun, they finally published this. How many times I've had to catch myself from saying the word "Borg" I don't even know.

The performance isolation is good, but I wouldn't really chase high utilization unless compute costs are significant to your business. We've seen some crazy things where nominally non-interfering jobs cause significant performance degradation to other jobs on the same node. There's work yet to do here.

Why is everyone acting like the name was some big secret? Everyone who has been using Mesos knows Borg, for crying out loud, big writeups were published in Wired and The Verge over 2 years ago!

Whether I'm allowed to speak about a project's codename has little to do with whether it has been leaked through nonofficial channels.

It was a secret. You weren't allowed to mention borg outside Google. Note John Wilkes was generally careful about not saying the name was 'Borg' or acknowledging it as such and instead referred to 'Omega'.

Perhaps you were told it was secret. It was not secret. Here's one of the articles I was referring to. http://www.wired.com/2013/03/google-borg-twitter-mesos/ John Wilkes mentions Borg, acts coy about the name for some reason (seems like a pattern), and mentions Omega, its nascent replacement.

John does not mention Borg by name publicly (well, until the paper was published). That's what I meant (consistent with what I was replying to). The article you linked to confirms that.

"According to Wilkes, Google plans to publish a research paper on Borg (though he still won’t use the name). "

Wilkes won’t even call it Borg. “I prefer to call it the system that will not be named,"

(I work with John and have contributed to Borg)

Perhaps you two are using different definitions of the word "secret". You are probably using the word to mean "a fact that no one outside of a select group of people knows". dekhn is probably using it to mean "a fact which my company has said we are not allowed to publicly acknowledge".

All of us who have used Borg over the years are very appreciative of the technology and its capabilities. Congrats to all the Googlers, current and ex, that contributed to it. This paper should be toward the top of the reading list for anyone working on the topic.

Nevertheless, there are many open questions in large-scale cluster management for researchers and developers to address. Here are some of my favorites:

- The curse of overprovisioning: Borg and many other systems rely on reservations, which users systematically exaggerate. Right-sizing these reservations is one way to go beyond the 40-50% usage shown in the Borg paper (see fig 12). A promising way of doing this is Christina Delimitrou's work using classification techniques (see http://goo.gl/vFf8oN).

- Oversubscription using better isolation mechanisms: this is what the Borg paper calls resource reclamation. Take unused (but reserved) resources from priority jobs and use them for best-effort analytics. David Lo (http://web.stanford.edu/~davidlo/) has a very interesting paper coming up on how to coordinate cpu sets, cache partitioning, Linux TC, and RAPL/DVFS (power management) to run websearch clusters at >90% by packing them with analytics without causing ANY glitch on search. And that is Google search.

There are definitely more interesting questions. Exciting times.
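To make the reclamation idea above concrete, here is a rough sketch of the arithmetic, not Borg's actual algorithm. It assumes each task has a user-requested limit and a measured usage, and that the scheduler holds back usage plus a safety margin. The `Task` class, the `reclaimable_cores` function, the 20% margin, and the sample numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    limit_cores: float  # what the user asked for (hypothetical field)
    usage_cores: float  # what the task is observed to use (hypothetical field)


def reclaimable_cores(tasks, safety_margin=0.2):
    """Sum the cores that could be lent to best-effort work.

    Assumption: the amount held back for a task is its usage plus a
    multiplicative safety margin, capped at the user's limit.
    """
    total = 0.0
    for task in tasks:
        held_back = min(task.limit_cores, task.usage_cores * (1 + safety_margin))
        total += task.limit_cores - held_back
    return total


# Made-up numbers: one heavily overprovisioned task, one right-sized task.
tasks = [
    Task("websearch", limit_cores=8.0, usage_cores=3.0),
    Task("frontend", limit_cores=4.0, usage_cores=3.5),
]

if __name__ == "__main__":
    print(f"reclaimable cores: {reclaimable_cores(tasks):.1f}")
```

The overprovisioned task contributes nearly all of the reclaimable capacity, which is the point of the first bullet: the larger the gap between what users reserve and what they use, the more there is to win back.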

> http://goo.gl/vFf8oN

goes to: http://web.stanford.edu/~cdel/2014.asplos.quasar.pdf

It's my understanding that URL shorteners are frowned upon in HN posts or comments.

I learn something every day :)

There was a great talk by John Wilkes (Google Cluster Management) re: Omega in 2011 at Google Faculty Summit [1]. Absolutely fascinating to see the scope of the problems they are dealing with.

[1] https://www.youtube.com/watch?v=0ZFMlO98Jkc

Edit: remove error in my comment re: borg/omega order.

If you want to see an up-to-date talk about this from John, he will be a speaker at http://dotscale.io on June 8!

Omega actually comes after Borg (hence the name). It's just that Borg was quite confidential until now. I'm pretty surprised they let the name slip.

The name was public since Omega was publicized. The paper itself is great, there's everything about Borg in one place (scheduling, quota, isolation, etc).

Ah, thanks! Updated my comment.

It is the other way: Omega was presented as the successor of Borg.

Was always in awe of Borg and Omega while at Google. Really nice to see them finally publish a paper on this. Guess they've moved far enough along now that it makes sense to do so. Omega will be a far superior beast to Borg and the open source Kubernetes, but I have high hopes for the future of Kubernetes.

Apache Yarn[1] looks like the same thing as Borg/Omega. A plus point with Yarn is that we can get our hands on it.

[1] http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yar...

Yarn tries to do the same thing, but is quite different. It's skewed toward running short running jobs, like batch jobs, it was meant for hadoop and co after all. Borg/Omega seems more like a combination of Mesos at the scheduler layer and Marathon/Kubernetes on top. It's funny to see how many services they run on top of it though.

This is exactly right. Marathon or Kubernetes running on Mesos or Mesosphere's DCOS is functionally similar to Borg.

> cc would be reachable via 50.jfoo.ubar.cc.borg.google.com.

I've implemented something similar at my current job, as that sort of naming is very convenient. http://www.boxever.com/using-google-apps-openid-connect-with... has a sketch of how to do this with Apache as a reverse proxy with Google Auth, though we're now using a PAC file pointing at an HTTPS forward proxy to avoid the limitations of SSL wildcard certs.
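For anyone curious what that naming scheme looks like in code, here is a minimal sketch. It assumes the layout is task index, job, user, cell, then a fixed suffix, which matches the example quoted from the paper. The function names are hypothetical and this is not how Borg's name service is implemented.

```python
def task_dns_name(task_index, job, user, cell, suffix="borg.google.com"):
    """Build a per-task hostname in the <index>.<job>.<user>.<cell>.<suffix> layout."""
    return f"{task_index}.{job}.{user}.{cell}.{suffix}"


def parse_task_dns_name(name, suffix="borg.google.com"):
    """Split a hostname built by task_dns_name back into its four parts."""
    if not name.endswith("." + suffix):
        raise ValueError(f"{name!r} does not end with .{suffix}")
    # Drop the suffix and its leading dot, then split the remaining labels.
    labels = name[: -len(suffix) - 1].split(".")
    if len(labels) != 4:
        raise ValueError(f"expected 4 labels before the suffix, got {len(labels)}")
    index, job, user, cell = labels
    return int(index), job, user, cell


if __name__ == "__main__":
    name = task_dns_name(50, "jfoo", "ubar", "cc")
    print(name)
    print(parse_task_dns_name(name))
```

Because the name encodes everything needed to find a task, a wildcard DNS entry plus a proxy that parses the labels is enough to route to the right backend, which is roughly the setup described above.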

Keep it up, Brian :-)

If that was the craziest thing I was doing, I'd be doing well :)

Nice writeup - I just saved the PDF in my keep-forever PDF repo.

When I contracted at Google in 2013 I loved their infrastructure. For my task I had to run huge Borg jobs, and the job submission, monitoring and logging systems were very easy to use. I really liked the summary of hardware failures that occurred - hardware really is not very reliable when running at scale.

After not using AppEngine for a few years I have started using it recently for two personal projects. Using AppEngine's logging and scaling features is a tiny bit like using Google's internal infrastructure - makes me a little nostalgic.

The Register [1] for years has referred to Cisco as the Borg because of their acquisition strategy.

The title implied to me that Google and Cisco were working together.

[1] http://www.theregister.co.uk/2015/04/16/borg_routers_open_to...

How does Borg compare with Google's Kubernetes?

There's a whole section about it in the paper (lessons learned and future work).

Is this similar in concept to Oracle Grid Engine (SGE)? How is it different, superior?

Yes, it's similar, but: it has a concept of services, which are jobs that run forever. It has a concept of allocs, which are permastorage that resides on a machine, where jobs that share the same storage can be rescheduled repeatedly. It scales to a much larger size and is better at managing the fleet of machines and scheduling work there.

I started doing SGE stuff in ~2001... It still amazes me how well that thing worked and how often a lot of its features are reinvented.

I agree. It's really simple to get started (short learning curve) and it's completely language agnostic.

I'm amazed how often it gets ignored. There's even StarCluster so you can automatically set up a cluster on EC2.

SGE = Sun Grid Engine, I'm guessing?

Indeedarino. I never know the right name anymore.

I wonder if disruptive jobs are named Hugh?

You will be scalesimilated.
