Borg: The Predecessor to Kubernetes (kubernetesio.blogspot.com)
233 points by brendandburns on Apr 23, 2015 | 38 comments

I think it'll be quite interesting to see how the smaller players organize themselves around the multitude of cluster resource management tools emerging as a natural reaction to Kubernetes growing out of the work Google's done on Borg.

I am curious to see how long a shake-out period there will be before there's either a de facto stack of "compute resource" tooling, or whether there will always be a highly fragmented and diverse set of ways to accomplish your goals. Just off the top of my head (and there are way more), I'm thinking of Tectonic [1], Mesosphere [2], Rocket [3], and Kismatic [4] as a few examples.

As a technologist and a planner, it's been challenging to see far enough into the future to decide on what tools to devote myself to learning at this point. I do think we're certainly in a "post-public cloud" timeline where we're getting good enough (or will be in 6-12 months) at abstracting virtualization right up to a millimeter or two below the application layer of our stacks. How we choose to do so seems to be currently up in the air.

In my mind, this opens up the possibility of compute as a resource much wider than had previously been possible. We'll be less reliant upon Azure, AWS, and GCP's mixture of PaaS and IaaS and much more interested in compute as a resource, likely from bare metal or private cloud providers.

I'm looking forward to the increased efficiency (both in compute power and cost) and security available in moving from application-level virtualization to operating-system-level virtualization.

[1] https://coreos.com/blog/announcing-tectonic/ [2] https://github.com/mesosphere [3] https://github.com/coreos/rkt [4] https://github.com/kismatic

Disclosure: I work at Google and was a co-founder of the Kubernetes project.

I think your observations are interesting. From my (somewhat biased) viewpoint I don't think we will enter into a 'post cloud' world. There are very real efficiency gains from running at public cloud provider scale, and the economics you see right now are not what I would consider 'steady state'. Beyond that the systems we are introducing with Kubernetes are focused on offering high levels of dynamism. They will ultimately fit your workload precisely to the amount of compute infrastructure you need, hopefully saving you quite a lot of money vs provisioning for peak. It will make a lot of sense to lease the amount of 'logical infrastructure' you need vs provisioning static physical infrastructure.
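To put rough numbers on the peak-vs-demand point, here's a back-of-the-envelope comparison. All figures are purely illustrative assumptions for the sake of the arithmetic, not real provider pricing:

```python
# Illustrative comparison: paying for peak capacity around the clock
# vs leasing only what the workload actually uses. All numbers are
# made up for this example.
peak_machines = 100    # capacity needed at the busiest hour
avg_machines = 30      # average demand across the day
hourly_rate = 0.50     # assumed $/machine-hour

static_cost = peak_machines * hourly_rate * 24   # provisioned for peak, 24/7
elastic_cost = avg_machines * hourly_rate * 24   # fit to actual demand
savings = 1 - elastic_cost / static_cost
print(f"static ${static_cost:.0f}/day, elastic ${elastic_cost:.0f}/day, {savings:.0%} saved")
# -> static $1200/day, elastic $360/day, 70% saved
```

With these made-up numbers, fitting the lease to demand rather than peak cuts the bill by the ratio of average to peak utilization, which is the gap dynamic scheduling is trying to capture.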

There are however legitimate advantages to our customers in being able to pick their providers and change providers as their needs change. We see the move to high levels of portability as a great way to keep ourselves and other providers honest.

-- craig

Since we have someone who worked on these projects here, there was a report a couple of years ago about Borg and its successor, then called Omega. Is Kubernetes related to / a renamed Omega?

Edit: Wired story: http://www.wired.com/2013/03/google-borg-twitter-mesos/

Omega is a separate system from both Borg and Kubernetes.

Kubernetes is heavily inspired by both Borg and Omega, and incorporates many of the ideas from both, as well as lessons learned along the way. Many of the engineers who work on Kubernetes at Google also worked on Omega and Borg.

Hi Craig!

Please feel free to respond to me at your leisure, but are you *sure* we will never enter a post-cloud world?

Not to say that there will be no cloud infrastructure, per se, just as mainframes still exist today.

On the other hand, I imagine someday we will have "datacenter in your pocket" type devices. The challenge will be who has the data -- obviously Google has already identified this as a key strategic advantage. The challenge will *not* be who has enough resources to compute it.

These pocket devices seem natural as a way to place strong AI at your fingertips, Siri-like agents, autonomous robots, etc. The first ones, which we have now, either use a data connection or are optimized to have small data sets, but the need for larger data sets is obvious. Once data-set size becomes the primary limiter, I think it will only be a matter of time before "big data" is decoupled from the cloud and personal computing retakes its dominant position. Some will use laptops, some will use phones, but the effect will be the same.

There are also the privacy benefits from managing large datasets on your own device -- solutions are already available for things like how to back up your data, how to sync large sets of common data among a network of untrusted peers, and how to curate that data.

Cloudlets might be the herald of the post-cloud world.

Disclaimer: I work on Google Cloud but not Kubernetes or GKE. Also, Satya was my PhD advisor.

Good AI tends to run on massive clusters. Barring some quantum leap in computing technology, I don't see how computation on local devices would fill our computing requirements.

Can you comment a little bit more on where you see the steady state economics of public cloud going? From where we are today, what factors (other than the dynamic provisioning you mentioned) will lead to better economics?

Thanks for commenting on this thread!

Yeah, I think that, sadly, there is going to be a little bit of an inevitable equivalent to the Unix wars of the early '80s. The sooner we can reach a standard place, the better it's going to be for the container community and developers more generally.

One of the reasons I pushed hard to get Kubernetes open sourced is the hope that we could get out in front of this and allow the developer community to rally around Kubernetes as an open standard, independent of any provider or corporate agenda.

Disclosure: co-founder of Kismatic

We've spent a lot of time working with the Kubernetes community. I can only speak to our experience, but Brendan, Craig, and the rest of the team at Google have 100% lived up to the commitment of treating the Kubernetes project as truly open and independent.

Our Kubernetes dashboard was recently merged into Kubernetes [1]. We brought our own vision of a web UI to the project, and we could have gotten bogged down defending technology decisions and philosophical nits. Instead, the response from Google, Red Hat, and others in the community was basically "Awesome! How soon can we get it in?"

All of the key players have the right approach, and that gives me confidence in the project's longevity.

[1] UI Demo video - https://www.youtube.com/watch?list=PL69nYSiGNLP2FBVvSLHpJE8_...

"allow the developer community to rally around Kubernetes as an open standard, independent of any provider or corporate agenda"

I look forward to Kubernetes becoming an independent project outside of Google then :)

I'm curious, @caniszczyk, why would it need to become independent outside of Google? It's already an Apache-licensed open-source project hosted on GitHub.

In essence, having diversity in ownership can help the project have a long life instead of being governed by one entity. There's a lot of risk that the main entity in charge will act in its own self-interest rather than the interest of the project (and its constituency) over the long term.

Independent ownership and proper governance will set up the project for long-term success, and as a small company, you should prefer it to be that way.

Disclosure: co-founder of Kismatic

I'm extremely pleased that Kubernetes has been open sourced by Google. It truly seems to me that the developer community is, and will remain, able to rally around Kubernetes as an open standard both today and in the future, without fear of any outside agendas, as Brendan so eloquently stated. I for one applaud Google's level of transparency when it comes to the future of the project and the overall product vision.

I'm wondering if it was intentional or subconsciously accidental that you went with the "I, for one" construction... which is of course usually suffixed with "welcome our new [adjective] overlords".

"I, for one" was a common construction before, and remains common outside of, that text-meme.

What's the largest k8s deployment you guys have observed? You're not using k8s for anything major yet, right?

Thanks for building k8s! Even if it doesn't "win" in the end, it's been an extremely useful and reliable solution for my needs.

There needs to be a compelling way to run it on AWS for this to happen.

Please check out: https://github.com/GoogleCloudPlatform/kubernetes/blob/maste...

for turn-up instructions on AWS. It's as easy as:

export KUBERNETES_PROVIDER=aws; wget -q -O - https://get.k8s.io | bash

Ugh, to think people responsible for running large infrastructure installations are still piping random webpages to their shell in 2015.

It's not just them - doing things this way makes it seem like this is in any way acceptable. It's not. Stop it.

No wonder it's so easy for TAO.

Just download the script and check it first before running it then.

It's not me; I don't run scripts written by people who think it's okay to exec shit from the network.

It's for the people who don't know any better and see this anti-pattern everywhere and thereby begin to think it's okay or accepted. It's not.

What is a better alternative?
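The alternative already suggested upthread is simply to split the fetch from the execution so there's a point at which you can inspect what you're about to run. A minimal sketch of that pattern, shown here against a local stand-in script so it runs offline (in practice the first line would be `wget -q -O kube-up.sh https://get.k8s.io`; the filename is illustrative):

```shell
# Stand-in for: wget -q -O kube-up.sh https://get.k8s.io
printf 'echo "cluster turn-up would start here"\n' > kube-up.sh

cat kube-up.sh         # 1. actually read the script before running anything
sha256sum kube-up.sh   # 2. record a checksum (compare against a published one, if available)
bash kube-up.sh        # 3. only now execute it, deliberately
```

Same end result as the one-liner, but the script you inspected is provably the script you ran, and you keep a copy for later auditing.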

You mean like the work that Meteor is doing and hiring for https://www.meteor.com/jobs/core-developer-cloud-systems-eng... ?

Disclaimer: I work on Google Cloud but not Kubernetes or GKE.

It's important to note that some of the items in your list complement Kubernetes rather than replace it.

Think of a cluster of VMs running CoreOS + Tectonic as an alternative to Google Container Engine.

Kismatic apparently calls itself "the Kubernetes Company."

Disclaimer: I work on Google Cloud but not Kubernetes or GKE.

I'm also very curious which direction things will move. I think I'm less convinced than you are that it'll be away from AWS and the like though, they're innovating at least as fast as the open-source container cluster tools (at least it seems that way to me).

I can imagine a future where it gets easier and more common to build an arbitrarily complex backend by just hooking together AWS services, using Lambda (or something that evolves from it) to write all your custom business logic without ever thinking about a server, VM, or container. I'm working on a greenfield app and very seriously considered this route, but we ended up deciding the uncertainty, versus doing it the way we know, wasn't quite worth it. It feels very close to the tipping point to me though.

Either way, it's definitely an exciting time.

>just hooking together AWS services, using Lambda (or something that evolves from it) to write all your custom business logic without ever thinking about a server, VM, or container.

you risk reawakening the ghost of the Application Server.

There are quite a few different tools in this space. I made a list of them at http://datacenteroperatingsystem.io/ -- feel free to submit a pull request to add your own.

I was at Google from 2006 to 2012 and, like most Googlers, used Borg extensively. Since leaving, I've been generally impressed by the AWS ecosystem, but have sorely missed Borg-like functionality. It's felt like the rest of the industry is about 10 years behind where Google is.

I think the crucial question for us is going to be adoption and support within the AWS ecosystem. It checks out (to me at least) as the technically superior option, but Amazon clearly wants to compete in this space as well, and they have the home-turf advantage.

Like @brendandburns, I just want the best technology to win and become the standard. It would be a shame if the Amazon/Google rivalry got in the way of something that important.

Can someone from Amazon chime in on this? Is there anything the Google team could do that would make Kubernetes a neutral project that Amazon would support? I feel that there's a ton of raw knowledge that Google engineers have accumulated on cluster management, and Kubernetes is an opportunity for that not to go to waste.

Kubernetes is going to become the standard API for container orchestration only because there are no other tools out there trying to do as much. There was a vacuum around container orchestration tooling, and Google got there first. Kubernetes components can be swapped out for other community-driven efforts; take Mesos as an example, which can be used to replace the default k8s scheduler. With k8s you can avoid lock-in with different cloud providers. I think Google is hoping that we end up on their cloud platform, but it's nice to see that it is being built from the ground up to be used with other cloud platforms.

I'm really not sold on this yet. We've done a number of projects testing and using Kubernetes, CoreOS, Mesos and, most recently, Docker's Swarm. It's been interesting to see how and why the technology space is evolving, but a couple of general thoughts:

1) The concept of container as primitive, especially the thorough implementation that Docker put together, is extraordinarily powerful.

2) The Swarm idea - which provides a matching API to the Docker API - is a really neat idea, even if it lacks the HA and scheduling functions to really make things work well.

3) I think the next evolution is really to iron out the network stack here. Kubernetes needs Flannel in most circumstances, and the process is not seamless or as simple as Docker.

I'd also love to know the split between this, Omega, and Kubernetes at Google.

I have heard that Borg is still quite widely used at Google, and Omega hasn't taken over as had previously been expected. Omega is a distributed scheduler, and we can only speculate as to why it hasn't taken over. My speculation would be that the optimistic concurrency control in Omega leads to storms at very high loads: attempts to allocate containers have to be rolled back because of contention. With pessimistic concurrency control (PCC) at high loads, you at least get progress.
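The rollback-storm speculation can be sketched with a toy simulation: several schedulers plan against the same stale snapshot of the cluster, and any claim that collides at commit time is rolled back. Everything here (names, numbers, the claim mechanism) is illustrative, not Omega's actual design:

```python
import random

def schedule_round(num_machines, num_schedulers, claims_each, seed=0):
    """Toy optimistic scheduler: each scheduler picks machines from a shared
    snapshot; claims that collide with an earlier scheduler's are rolled back."""
    rng = random.Random(seed)
    committed, rolled_back = 0, 0
    taken = set()
    for _ in range(num_schedulers):
        # Each scheduler plans against the same (stale) view of the cluster.
        plan = rng.sample(range(num_machines), claims_each)
        for machine in plan:
            if machine in taken:      # conflict detected at commit time
                rolled_back += 1      # optimistic claim must be retried
            else:
                taken.add(machine)
                committed += 1
    return committed, rolled_back

# Low load: conflicts are rare, optimism is nearly free.
print(schedule_round(num_machines=10_000, num_schedulers=5, claims_each=100))
# High load: the same pool is heavily contended, so rollbacks dominate.
print(schedule_round(num_machines=1_000, num_schedulers=50, claims_each=100))
```

In the low-load case almost every claim commits; in the high-load case most of the work becomes rollbacks and retries, which is the "storm" behavior speculated about above.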

There's also just plain legacy inertia. Many existing systems are on Borg; many of their dependencies are on Borg; most Google engineers are much more familiar with Borg than Omega, and the teammates they might ask for help & advice are also more familiar with Borg than Omega.

Think of how long the Python 2->3 transition has taken (outside Google, not speaking in Google terms anymore). It's been six years, and we're only now reaching the point where Python 3 may be a better choice for green-field projects than Python 2, and Python 3 may never be a better choice for legacy installs. The Borg -> Omega transition has a similar dependency issue (everything runs in the cloud at Google), the learning curve is worse than Python 2->3, and all of Google's code is legacy. That's independent of any technical differences between them, and also irrelevant to whether an organization just getting onto the cloud would be better off with Docker, Mesos, or Kubernetes.

That's an issue, I guess, for many apps. However, Google tends to make company-wide technical decisions, and then the entire engineering crowd goes there. How many other companies have one SCM instance? None. If there were irrefutable economic gains to be made by moving to Omega today, my guess is they would do it.

The technically interesting question is whether decentralized scheduling at large scale is a solved problem or not. Can we do it better than centralized scheduling today?

Google absolutely does not make company-wide technical decisions and then the entire engineering crowd goes there. Rather, they make company-wide technical decisions, and over a period of 3-5 years the entire engineering crowd gradually gets there. As we used to say: "There are two ways to do everything at Google: the deprecated one and the one that doesn't work yet." In some cases I've seen up to 3 deprecated systems in flight, plus one that doesn't work yet. Borg's predecessor was finally removed from production shortly before I left in 2014, despite being deprecated around 2005.

I'd encourage you to check out:


It's a fairly straightforward getting started experience.

Also, if you want to turn up a cluster in a cloud provider, it's as simple as https://get.k8s.io

Agreed. I am deeply impressed by how "approachable" Kubernetes actually is, considering what it does. The overall design concepts are quite simple and the reasoning behind them is clear. It's a small set of self-contained components (API server, controller, scheduler, kubelet, proxy, sitting on CoreOS's etcd), so the complexity is fairly manageable. Peeking into the source code of the components won't give you the creeps, and the build system (cross-compiling) could not be any easier.

I have not yet tried any other Docker orchestration framework (there seem to be a few popping up right now), but as far as clustering goes: in comparison, Mesos appears intimidating to me (there is certainly not the two-minute "I get this" experience I've had with tools like etcd and Kubernetes), and I remember building clusters with technology like Heartbeat, Corosync, OpenAIS, and DRBD not so long ago. Compared to that, distributed computing has become incredibly easy.

My advice for starters would be to pick some ready-to-go Vagrant CoreOS setup and get it running on your workstation; this should be pretty straightforward. (We are running k8s on OpenStack/Rackspace, and there were too many moving parts involved to get the included starter scripts to reliably bootstrap a Kubernetes installation.)

Then look at that project's user-data/cloud-init and try to rebuild things on your preferred stack from the bottom up, step by step - I feel a lot more in control when doing that. The components' log files are actually helpful when you assemble things. It also helps to look at the generated (and documented - thanks for that) iptables NAT rules when you have problems with service discovery/communication.
