

Google: Multiplex Multiple Works Loads to Increase Utilization and Save Money - iamtechaddict
http://highscalability.com/blog/2013/11/13/google-multiplex-multiple-works-loads-on-computers-to-increa.html

======
nisa
Cool. That Google built cgroups makes even more sense now.

~~~
WestCoastJustin
Yeah, cgroups are pretty awesome! I created a screencast about them @
[http://sysadmincasts.com/episodes/14-introduction-to-linux-c...](http://sysadmincasts.com/episodes/14-introduction-to-linux-control-groups-cgroups)
if you want to learn more about them and have never used them before.
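If you just want a zero-setup peek at cgroups, you can inspect which cgroups your own process belongs to. Here is a minimal sketch (it assumes a Linux host; elsewhere /proc/self/cgroup does not exist, so it returns an empty mapping):

```python
# Minimal sketch: list the cgroups the current process belongs to.
# Assumes a Linux host; on other systems /proc/self/cgroup does not
# exist, and we simply return an empty mapping.
def current_cgroups(path="/proc/self/cgroup"):
    """Parse a /proc/<pid>/cgroup file into {subsystems: cgroup_path}."""
    cgroups = {}
    try:
        with open(path) as f:
            for line in f:
                # cgroup v1 lines look like "4:memory:/user.slice";
                # cgroup v2 uses a single "0::/some/path" line.
                hierarchy_id, subsystems, cgroup_path = line.strip().split(":", 2)
                cgroups[subsystems] = cgroup_path
    except FileNotFoundError:
        pass  # not Linux, or /proc not mounted
    return cgroups

if __name__ == "__main__":
    for subsys, cg in current_cgroups().items():
        print(f"{subsys or 'v2'}: {cg}")
```

Running it inside a container or under systemd shows exactly which memory/cpu groups the scheduler has placed you in.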

There is also a great talk by John Wilkes (Google Cluster Management, Mountain
View) about job scheduling via Omega at Google Faculty Summit 2011 @
[http://www.youtube.com/watch?v=0ZFMlO98Jkc](http://www.youtube.com/watch?v=0ZFMlO98Jkc)

~~~
nilsimsa
Nice website. Will be viewing the rest of your screencasts...

~~~
WestCoastJustin
Thanks! If you have any suggestions or episode ideas, please let me know!

------
fidotron
It feels like we've gone in one big circle, where first we move the DB on to a
separate machine for performance, yet now more computation will go back to
being done nearer the data (like Hadoop) and we'll try to pretend it's all
just one giant computer again.

~~~
madhusudancs
Actually, I think this is not exactly a circle. The reason we moved databases
to separate machines for performance was that we did not understand how to
schedule tasks or jobs.

Also, it comes from the age-old debate over whether an operating system is
sufficient to perform a certain type of resource management, or whether the
application, which knows its own semantics, should build its own resource
manager. One popular incarnation of this is Michael Stonebraker's paper
"Operating System Support for Database Management" -
[http://www.eecs.harvard.edu/~margo/cs165/papers/stonebraker-...](http://www.eecs.harvard.edu/~margo/cs165/papers/stonebraker-81.pdf)
where he discusses whether database management systems should build their own
buffer/caching mechanisms or whether the mechanisms provided by filesystems
are good enough. This can be extended to the problem in question now, which is
scheduling jobs on a huge cluster seen as a single computer.

One of the major reasons we moved databases to separate machines was that OSes
did a bad job of scheduling tasks. Rather than saying bad, I should say that
it is difficult for operating systems to understand the types of workloads
they are running. In this particular case, however, Google knows a lot more
about its workloads and hence can do a better job of scheduling them on the
same machine.

Also, it is very important to note that Google has been able to do this
because they run thousands of machines next to each other and everything is
distributed. If they can't schedule a task on a particular machine because it
is overloaded, they always have the choice of scheduling that task on the next
machine in the rack. This may not be the case for most people outside Google.
If they are using a single machine to host both the database and the
computation itself, and if for some reason that machine is already running at
its peak, there is no option but to make those tasks wait, because there is
nowhere else to schedule them.

Moving DBs to separate machines was really to keep the utilization at low
levels so that there would be cycles available when needed.

EDIT: Typo fixes.

------
MikeTLive
I have manually operated multiple JVMs per server, maintaining /etc/services
to track the ports, with Apache + mod_proxy as the front end: start/stop the
services and let Apache detect the active state. With Mesos and cgroups we get
an even more dynamic assurance that services have their minimum guaranteed
resources and can scale across the entire compute cloud. We will soon be able
to scale up/down on demand, maintaining 85% or greater utilization while
reducing allocation. With cgroups we remove the overhead of the multiple OSes
that traditional VMs bring.
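The /etc/services bookkeeping described above is easy to script. A minimal, illustrative parser (the service names below are invented) that resolves which port a JVM instance was assigned, the way a start script or a mod_proxy config generator might:

```python
# Minimal sketch of looking up ports from /etc/services-style entries.
# The service names ("app-jvm-1", etc.) are hypothetical examples.
def parse_services(text):
    """Parse /etc/services-format lines into {name: (port, protocol)}."""
    services = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        name, port_proto = fields[0], fields[1]
        port, proto = port_proto.split("/")
        services[name] = (int(port), proto)
    return services

sample = """
# locally assigned JVM ports (hypothetical)
app-jvm-1   8081/tcp   # first app server instance
app-jvm-2   8082/tcp   # second app server instance
"""

print(parse_services(sample)["app-jvm-1"])  # -> (8081, 'tcp')
```

In practice you would read the real file with `open("/etc/services")` instead of a sample string.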

~~~
thrownaway2424
Do you use ZooKeeper for addressing and assigning ports? One of the things
most lacking in the outside-Google datacenter world is an addressing scheme
like DNS that allows you to find some resource by name, including the port.

Using well-known ports severely restricts scheduling flexibility.
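The usual pattern with a coordination service is for each task to bind an OS-assigned ephemeral port and publish its name -> (host, port) mapping, which clients resolve by name instead of relying on well-known ports. A toy in-memory sketch of that contract (a real implementation would use ZooKeeper ephemeral nodes or DNS SRV records; the service name here is invented):

```python
# Toy in-memory sketch of a name -> (host, port) service registry,
# standing in for ZooKeeper ephemeral nodes or DNS SRV records.
# The service name "db-shard-1" is invented for illustration.
import socket

class ServiceRegistry:
    def __init__(self):
        self._entries = {}

    def register(self, name, host, port):
        # A real registry would make this an ephemeral record that
        # disappears when the registering task dies.
        self._entries[name] = (host, port)

    def resolve(self, name):
        """Return (host, port) for a service name, like an SRV lookup."""
        return self._entries[name]

# A task binds port 0 so the OS picks any free port, then registers
# wherever the scheduler happened to place it.
registry = ServiceRegistry()
sock = socket.socket()
sock.bind(("127.0.0.1", 0))  # port 0 = let the OS choose
host, port = sock.getsockname()
registry.register("db-shard-1", host, port)

print(registry.resolve("db-shard-1"))  # e.g. ('127.0.0.1', 54321)
sock.close()
```

The point is the lookup contract: clients ask for "db-shard-1" and get both an address and a port, so the scheduler is free to place the task anywhere.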

