There is also a great talk by John Wilkes (Google Cluster Management, Mountain View) about job scheduling via Omega at Google Faculty Summit 2011 @ http://www.youtube.com/watch?v=0ZFMlO98Jkc
Also, it comes from the age old debate of whether an Operating System is sufficient to perform a certain type of resource management function or should the application which knows about its own semantics build its own resource manager. One popular incarnation of this is Michael Stonebraker's paper on "Operating system support for database management" - http://www.eecs.harvard.edu/~margo/cs165/papers/stonebraker-... where he discusses if Database management systems should builds its own buffer/caching mechanism or those mechanisms provided by filesystems are good enough. This can be extended to the problem in question now, which is about scheduling of jobs on a huge cluster seen as a single computer.
One of the major reasons (of many others) why we moved Databases to separate machines was because OSes did a bad job at scheduling tasks. Rather than saying bad, I think I should say that it is difficult for the operating systems to understand the types of workloads. However, in this particular case, Google knows a lot more about its workloads and hence can do a better job at scheduling them on the same machine.
Also, it is very important to note that Google has been able to do this because they run 1000s of machines next to each other and everything is distributed. If they can't schedule a task on a particular machine because it is overloaded, they always have the choice of scheduling that task on a next machine in the rack. This may not be the case for most people outside Google. If they are using a single machine to host both the database and the computation itself and if for some reason that machine is already running at its peak, there is no other option than making the tasks to be scheduled wait, because there is no where else to schedule those tasks.
Moving DBs to separate machines was really to keep the utilization at low levels so that there would be cycles available when needed.
EDIT: Typo fixes.
~2002: move everything off to its own dual-cpu server
~2007: move everything to its own dual core vm on an 8-core server
~2012: put everything on each 16/32 core box but manage its resources via cgroups (or just overprovisioning)
Using well-known ports severely restricts flexibility of scheduling.