

Mesos, Omega, Borg: A Survey - r4um
http://www.umbrant.com/blog/2015/mesos_omega_borg_survey.html

======
menage
One important point that the author seems to have misunderstood is that Borg
was the _predecessor_ to the other two systems, not the successor. Borg went
into production (running a bunch of websearch dedicated clusters) in late
2004, long before Mesos or Omega were around. Omega is/was an experimental
replacement for Borg that was started much later, although I'm not sure how
much production load it actually took over.

~~~
umbrant
Author here. I ordered them based on time of publication, and evaluated them
based on the contents of the paper. My summary wasn't meant to be a substitute
for actually reading the papers either. The Borg paper states that it started
out as a centralized scheduler and has evolved over time, and also that it's
been in production for over a decade. Clearly, it predates Mesos and Omega.

I've heard varied things about the use of Omega in production. The Borg paper
mentions that it runs 98% of machines at Google, but that number is apparently
dated. One person said that Omega runs all the batch work, and is being rolled
out further. However, I've also heard it's being phased out.

~~~
menage
We did try to write a paper on Borg way back in ~2008 but it got bogged down
by internal disagreements on the style and approach ...

------
mckoss
See also Google's blog post summarizing Borg -> Kubernetes improvements.

[http://blog.kubernetes.io/2015/04/borg-predecessor-to-kubern...](http://blog.kubernetes.io/2015/04/borg-predecessor-to-kubernetes.html)

------
KaiserPro
It's interesting to see how other industries tackle the same problem.

VFX has essentially the same problem as Google: a huge bunch of tasks that
need to run all at once.

However, VFX studios tend to have only one data center, so they don't need or
want a clustered scheduler.

[https://github.com/mikrosimage/openrendermanagement](https://github.com/mikrosimage/openrendermanagement),
Alfred and Tractor from Pixar, and Framestore's FQ (which is faster and more
efficient than Borg at job dispatch) are a few good examples of task
management.

------
presspot
I know a lot about Mesos and Mesosphere's DCOS, so can comment on those:

* There are users of these systems that get 90+% cluster utilization.

* Pre-emptable tasks (e.g., best effort scheduling vs guaranteed SLA scheduling) will be landing in Mesos.

* Mesosphere is building advanced scheduling plug-ins that will use the new scheduling models to do oversubscription of a cluster, helping to drive utilization to the 90%+ range without the need for any special tooling. You can get an idea of some of the algorithms being employed by checking out the Kozyrakis/Delimitrou Quasar paper[1].

[1]
[http://csl.stanford.edu/~christos/publications/2014.quasar.a...](http://csl.stanford.edu/~christos/publications/2014.quasar.asplos.pdf)
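
To make the best-effort vs. guaranteed distinction in the list above concrete, here is a toy sketch of a single node that accepts preemptable tasks into spare capacity and evicts them when a guaranteed task needs the room. This is not Mesos's API and not the Quasar algorithm; the `Task` and `Node` classes, the `guaranteed` flag, and the largest-first eviction order are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    cpus: float
    guaranteed: bool  # True = SLA-backed, False = best effort (preemptable)


@dataclass
class Node:
    capacity: float
    tasks: list = field(default_factory=list)

    def used(self) -> float:
        return sum(t.cpus for t in self.tasks)

    def place(self, task: Task) -> list:
        """Place a task and return the best-effort tasks evicted to make room.

        Raises RuntimeError if the task cannot fit (hypothetical policy).
        """
        free = self.capacity - self.used()
        if task.cpus <= free:
            self.tasks.append(task)
            return []
        if not task.guaranteed:
            # Best-effort work only ever uses spare capacity.
            raise RuntimeError(f"no room for best-effort task {task.name}")

        # Guaranteed task: pick best-effort victims, largest first, until it fits.
        preemptable = sorted(
            (t for t in self.tasks if not t.guaranteed),
            key=lambda t: t.cpus,
            reverse=True,
        )
        victims = []
        for victim in preemptable:
            if task.cpus <= free:
                break
            victims.append(victim)
            free += victim.cpus
        if task.cpus > free:
            raise RuntimeError(f"no room for guaranteed task {task.name}")

        for victim in victims:
            self.tasks.remove(victim)
        self.tasks.append(task)
        return victims


node = Node(capacity=8.0)
node.place(Task("web", 4.0, guaranteed=True))
node.place(Task("batch", 4.0, guaranteed=False))  # fills the spare capacity
evicted = node.place(Task("db", 3.0, guaranteed=True))
print([t.name for t in evicted])
```

In this sketch the node runs at full utilization while the batch task is present, and the guaranteed `db` task still gets placed by evicting it. A real scheduler would also need to requeue the evicted work and decide how much to oversubscribe, which this example leaves out.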

------
jefe78
Is anyone using these at scale but with a small team to support it? We have a
5-6k fleet of servers across 3 DCs + another 1.5k in AWS. I tried deploying
Mesos with mixed results. I also experimented with CoreOS. Considering
re-exploring Xen/VMware.

------
sysk
I'm not a sysadmin but recently started using CoreOS to deploy small web apps.
Could anyone explain to me like I'm 5 what's the difference between those
cluster schedulers and something like CoreOS' fleet
([https://github.com/coreos/fleet](https://github.com/coreos/fleet))?

~~~
gtirloni
I think they are trying to achieve the same thing, with differences in API and
richness of each ecosystem.

[http://www.slideshare.net/teemow1/container-orchestration](http://www.slideshare.net/teemow1/container-orchestration)

[https://groups.google.com/forum/#!msg/coreos-dev/nHK8irdnmM0...](https://groups.google.com/forum/#!msg/coreos-dev/nHK8irdnmM0/BSwZpV1SNisJ)

~~~
nrr
Actually, having spoken with one of the CoreOS guys recently about this, it
seems that their concerns are a bit lower-level. Where these resource managers
actually do concern themselves with the problem space of resource management
as well as orchestration, Fleet is taking the position of a "distributed
systemd" in a way without much else in terms of provided porcelain.

For those of us who are old school HPC people, it's probably more reasonable
to think of Fleet as a dynamic always-running manifestation of xCAT or,
perhaps, Fabric (of fabfile.py fame) or similar. For example, I've heard of
people installing and running Mesos with it, which might seem like a bit of
cluster scheduler self-satisfaction at first until one understands the
reasoning.

