
Google offers researchers 1 billion computing core-hours - albemuth
http://googleblog.blogspot.com/2011/04/1-billion-computing-core-hours-for.html
======
ChuckMcM
A rancorous debate once broke out inside Google about the suitability of
Google's infrastructure for solving 'real' problems. The people who felt it
was inadequate pointed out that everyone who built "super" computers did so
with lots of shared state and epic low latency bandwidth, whereas Google's
computers were designed with 'shared nothing' in mind; the architecture [1]
clearly supports web search but, they argued, is useless for scientific
computing.

I and a few others were of the opinion that the scientific community didn't
build the equivalent of LINPACK or scientific simulation packages on
'Google-like' architectures because they _didn't have access_ to such
architectures; rather, they had "Beowulf" clusters [2], which had been built
to be more like supercomputers. It wasn't that such problems couldn't be
worked on shared-nothing architectures, it was just that nobody was making
any real progress along those lines.

This offering reads like a typical Google response to such an argument: "OK,
if we made some hardware available for this sort of thing, would academics be
willing to apply some serious thinking to it?" No doubt buried in the details
somewhere there will be some language about Google owning, or at least getting
free perpetual access to, any code or techniques that emerge out of this
experiment.

You have to admit, if they could pull it off it would put a huge crimp in any
supercomputer type system.

But perhaps more interestingly, after I left Google one of the things that
caught my eye was some of the articles that have been written about what
quantum computing might look like. And I realized that, with enough cores
sitting around, the lessons learned programming them for 'real' problems
might inform how you would program a quantum computer.

[1] <http://labs.google.com/papers/googlecluster-ieee.pdf>
[2] <http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book/beowulf_book/node9.html>

~~~
VladRussian
>I and a few others were of the opinion that the scientific community didn't
build the equivalent of LINPACK or scientific simulation packages on
'Google-like' architectures because they didn't have access to such
architectures; rather, they had "Beowulf" clusters [2], which had been built
to be more like supercomputers. It wasn't that such problems couldn't be
worked on shared-nothing architectures, it was just that nobody was making
any real progress along those lines.

What is the difference between 'Google-like' and Beowulf or other similar
Linux clusters for scientific calculations? A Beowulf cluster is just software
that clusters together a bunch of cheap computers connected by whatever
Ethernet is currently available for normal money (of course you can throw more
money at it if you have it). On the other hand, people have no problem running
physics or bioinformatics calculations on hundreds of Amazon nodes.

>it would put a huge crimp in any supercomputer type system.

That's what Beowulf and the like already did 10 years ago. It is one of the
reasons why "supercomputers" (SMP nodes connected by extremely fast backplanes
in big cabinets, or as you said "lots of shared state and epic low latency
bandwidth") had such a low rate of survival into the 21st century. Of course
there are still Top500 supercomputers: big rooms with a lot of racks and,
frequently, very expensive/fast networks. Yet if your distributed program
depends significantly on the speed of the interconnect, i.e. it has a
significant message-passing component, it usually won't scale effectively; it
may scale, but with quickly diminishing returns, Amdahl's-law style, even with
an extremely low-latency/expensive interconnect.
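The diminishing-returns point can be made concrete with Amdahl's law. A
minimal sketch in Python (the 95% parallel fraction is just an illustrative
assumption, not a figure from any real code):

```python
# Amdahl's law: speedup of a program whose fraction p is parallelizable,
# run on n cores. The serial fraction (1 - p) caps the speedup at
# 1 / (1 - p), no matter how many cores or how fast the interconnect.
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# Even a 95%-parallel code can never beat 20x:
for n in (16, 256, 4096):
    print(n, round(amdahl_speedup(0.95, n), 1))  # -> 9.1, 18.6, 19.9
```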

------
nyellin
Read the ending; it sounds like a cloud computing offer:

 _In the future, we think this service could also be useful for businesses in
various industries, like biotech, financial services, manufacturing and
energy. If your business can benefit from hundreds of millions of core-hours
to solve complex technical challenges and you want to discuss potential
applications, please contact us._

------
hugh3
Interesting! I have a few ideas on some awesome things I could do with a
billion CPU hours, though I'm not one hundred percent sure whether I can get a
proposal together by the end of May.

The eligibility is a little unclear though... at one point it says "up to 10
distinguished researchers and postdoctoral scholars worldwide", while
elsewhere it says "Awardees will participate through Google’s Visiting Faculty
Program; faculty members need to have full-time status at an academic
institution". So are postdocs invited to apply or not?

edit: Oh, the page could also use some information about memory per node.

~~~
TheEzEzz
Postdocs are full-time faculty members. They just aren't permanent.

~~~
gammarator
No, postdocs are not faculty. They may be classified as staff, as fellows, or
even as students [1-2], but they aren't members of faculty senates.

[1]
[http://sciencecareers.sciencemag.org/career_magazine/previou...](http://sciencecareers.sciencemag.org/career_magazine/previous_issues/articles/2000_11_10/noDOI.10424127276070392774)
[2]
[http://sciencecareers.sciencemag.org/career_magazine/previou...](http://sciencecareers.sciencemag.org/career_magazine/previous_issues/articles/2000_11_17/noDOI.10424127276085878637)

~~~
TheEzEzz
At my current university all postdocs in my department carry the official
title of Visiting Assistant Professor. They are listed under the tab 'Visiting
Faculty' on the departmental website. At least in name they are faculty, and
they are professors.

~~~
bbgm
That's unique. I have many friends who were, or still are, postdocs, and they
were almost always staff, not faculty in any sense (in the US at least).

------
bdb
Really interesting that your submission has to be in the form of something
that can run on Native Client. I wonder if they're using this to stress-test
Native Client implementations.

~~~
tlrobinson
Or they're just using NativeClient as the sandboxing mechanism.

~~~
rwg
It's almost certainly this, and I would love to see them release _lightweight_
grid software that uses NaCl on compute hosts. The footprints of existing grid
software stacks, both in terms of volume of software and sysadmin time to
setup/maintain, are ridiculous.

~~~
phaedrus
I and another computer science student spent an entire semester internship
just trying to get Globus properly installed and configured in the CS lab for
one of our professors. It (the cluster computing software infrastructure of
Globus) was a ridiculous sprawling mess of poorly integrated components. I've
gone through both Gentoo and Linux-From-Scratch and can honestly say it is
easier to _assemble the parts for an entire operating system_ than to get
Globus working (at least that was the case in 2008). I could see a definite
need for a lightweight solution with a simple install package, but I wasn't in
the position to write one.

------
cabacon
You can also go for the INCITE or ALCC programs from the DOE:
<http://www.doeleadershipcomputing.org/>

They award O(1 billion) hours to O(100) projects, and you don't need DOE
funding to apply (or even be a US-based project). The catch is that your code
has to be very scalable (to roughly 40k cores), and your science problems
should be in line with the DOE mission.

------
entangld
Has Google been putting out really cool stuff consistently over the last few
years or is it just because I'm constantly on HN that I see this stuff?

Anyway, I'm really starting to like Google again.

------
juiceandjuice
That's effectively 114200 cores devoted continuously for a year. Not bad. An
experiment I work with uses ~600 cores over 85 hosts, although the average
load is probably only around 20%, though there are times when it's pegged for
a few months.
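The figure checks out, to rounding; a quick back-of-the-envelope in Python:

```python
# How many cores does 1 billion core-hours buy if run flat-out for a year?
core_hours = 1_000_000_000
hours_per_year = 365 * 24        # 8760
print(core_hours // hours_per_year)  # -> 114155
```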

~~~
Retric
_That's effectively 114200 cores devoted continuously for a year._

Or 256 x Quadro 6000s @ 99.6% uptime. Granted, Google is handing out real CPU
time, not just a bank of GPUs. Still, I don't think it will be that long
before someone pulls that "switch".

~~~
DarkShikari
Quadros don't have 448 cores each. nVidia's "core" numbers count each portion
of a vector processor (a "thread warp") as a core; by this definition, a
single Core i7 core has 48 "cores".

------
pama
This is a truly _amazing_ offering. I hope the applications in the area of
biochemistry research will find creative ways to work around the two important
limitations of Google's infrastructure: its GFS filesystem and its latency.

------
gojomo
The question I would like to research is: 1 billion core-hours on Google's
infrastructure yields exactly how many bitcoins?

~~~
wladimir
Well, if the Google infrastructure is so massive that it would dominate
bitcoin generation: 50 coins every 10 minutes.

Not worth the electricity, I think. Let's talk real science instead of
self-enrichment for a change.

~~~
gojomo
I'd heard the bitcoin system auto-adjusts for more 'mining'... but is its
production rate really constant, no matter how much new computing power
arrives under a single authority?

(My 'proposal' was a thought experiment; I'd definitely agree that given that
much computing-power there are more beneficial and/or profitable things to
do... among them running a market-dominating search engine.)
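For reference, the production rate is held roughly constant by the difficulty
retarget. A simplified Python sketch (the real client adjusts a 256-bit
target encoded in the block header rather than a float difficulty, but the
2016-block window and the 4x clamp are the actual rules):

```python
# Bitcoin recomputes difficulty every 2016 blocks so that blocks keep
# arriving ~every 10 minutes regardless of total network hashrate --
# hence ~50 BTC per 10 minutes no matter how much compute shows up.
TARGET_SECS = 2016 * 10 * 60   # two weeks at 10 minutes per block

def retarget(old_difficulty, actual_secs):
    # The adjustment per period is clamped to a factor of 4 either way.
    ratio = max(0.25, min(4.0, TARGET_SECS / actual_secs))
    return old_difficulty * ratio

# If a huge new miner halves the time to mine 2016 blocks,
# difficulty doubles for the next period:
print(retarget(1.0, TARGET_SECS // 2))  # -> 2.0
```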

