
What's faster - a supercomputer or EC2? - timf
http://ianfoster.typepad.com/blog/2009/08/whats-fastera-supercomputer-or-ec2.html
======
jacquesm
What's faster, a sports car or a truck? It depends on what you are trying to
move.

A load of bricks will be moved faster by truck, even though in an absolute
sense the sports car is faster...

If you are doing vector processing and have a hard-to-parallelize problem then
a supercomputer is probably the only way to go.

Otherwise EC2 will possibly be faster (it still depends on lots of subtle
factors, such as how compute intensive vs communications intensive your
application is).

The only way to know for sure is to do a limited benchmark on the core of your
problem in order to figure out which architecture works best.
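In practice that benchmark can be as simple as timing the kernel of your
workload on each candidate platform. A minimal sketch (the `kernel` function
here is a made-up stand-in, not anything from the article):

```python
import time

def kernel():
    # Stand-in for the core of your real problem; swap in your own hot loop.
    return sum(i * i for i in range(1_000_000))

start = time.perf_counter()
result = kernel()
elapsed = time.perf_counter() - start
print(f"kernel: {elapsed:.3f} s")  # compare this number across platforms
```

Run the same script on a supercomputer node, an EC2 instance, and your
desktop, and the subtle factors above stop being guesswork.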

~~~
kschults
Reminds me of the joke about the highest bandwidth available being a truckload
of drives barreling down the highway. While true, not necessarily useful.

~~~
kirubakaran
Certainly useful when you don't care about latency.

<http://aws.amazon.com/importexport/>

------
ajross
It's a reasonably informative article. But...

Does a 25-second job on a 32-node cluster _really_ count as a
"supercomputer" task? Even at perfect scaling that's only about 13 minutes of
serial work, call it 15 minutes on your desktop. If that's really your task,
and these are your numbers, and latency-to-answer is your only metric (as
opposed to other stuff, like "cost", that people in the real world worry
about) then sure: EC2 makes sense. But these look like some terribly cooked
numbers to me. I mean, how much time are you really going to spend optimizing
something you can run while you trot to Starbucks for coffee?
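The back-of-the-envelope math behind that desktop claim, assuming perfect
linear speedup (real speedup is sublinear, so a desktop run would take at
least this long):

```python
# Serial-equivalent runtime of the 25 s / 32-node benchmark,
# assuming perfect linear speedup.
cluster_seconds = 25
nodes = 32

serial_seconds = cluster_seconds * nodes   # 800 s
serial_minutes = serial_seconds / 60       # ~13.3 min, i.e. roughly 15 minutes
print(serial_seconds, round(serial_minutes, 1))
```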

I'd be curious what the answers (including cost) are when you scale the
problem to days of runtime on hundreds of nodes. I rather suspect the answers
would be strongly tilted in favor of the institutional supercomputer
facilities, who after all have made a "business" out of optimizing this
process. EC2 has other goals.

~~~
scott_s
The NAS Parallel Benchmarks are used to represent tasks that appear often
in scientific computing. They were designed so that running them on a
computing platform tells us how amenable that platform is to scientific
computing.

~~~
ajross
Yes, but it's a benchmark of the hardware environment. This is more than
anything else a test of resource allocation latency. The poster posits that
EC2 can get the boxes working faster and get you your answers sooner.

But the result here is that the benchmark will complete with "high
probability" within about 6.5 minutes on EC2 for a task that _only takes 15
minutes to run on your desktop CPU_. That model is _wildly_ overestimating the
impact of latency on the computation cost.

------
asciilifeform
_"If you were plowing a field, which would you rather use?... Two strong oxen
or 1024 chickens?"_

\- Seymour Cray (<http://en.wikiquote.org/wiki/Seymour_Cray>)

~~~
wmf
<rant>These Seymour Cray quotes have been obsoleted by changes in technology,
and trotting them out yet again just displays ignorance of those
changes.</rant>

There are no "strong oxen" in today's world; both supercomputers and EC2 are
clusters using more or less the same processors.

~~~
asciilifeform
> There are no "strong oxen" in today's world

Some problems _demand_ shared-memory architectures. The fact that those who
work on such problems lack the funding to support a large market in shared-
memory machines does not negate the problems' existence.

~~~
wmf
I assume that Cray was talking about a uniprocessor, not SMP. Besides, it's
irrelevant to this article which was comparing NCSA's shared-nothing cluster
against EC2's shared-nothing cluster.

~~~
asciilifeform
> it's irrelevant to this article which was comparing NCSA's shared-nothing
> cluster...

I merely object to the modern redefinition of the word "supercomputer" to mean
a warehouse full of cheap PCs.

There once were _actual_ supercomputers.

------
keefe
I question the premise of this question and analysis! There is an enormous
breadth of work done on clusters vs supercomputers already. It turns out that
sufficiently parallelizable (
<http://en.wikipedia.org/wiki/Embarrassingly_parallel> ) tasks can be
accomplished much more efficiently on clusters. They've been in vogue in
academia for years; I remember playing with the cluster at ND in 2002. Just
wow.
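To make "embarrassingly parallel" concrete, a minimal sketch (the `score`
function is a made-up placeholder): the work items share no state, so they
can be scattered across nodes with essentially no coordination cost.

```python
from multiprocessing import Pool

def score(x):
    # Stand-in for one independent work item (one frame, one simulation run...)
    return x * x

if __name__ == "__main__":
    # No inter-task communication, so throughput scales almost linearly
    # with the number of workers - the case where clusters shine.
    with Pool(4) as pool:
        results = pool.map(score, range(10))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```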

------
sophacles
And... if you don't need lots of communication between nodes, I'd guess EC2
probably is the best way to go in terms of $/computation too.

~~~
tyvkiuiyi
Depends on utilisation. If you own a cluster and are able to keep it busy
24x7 then it pays off pretty quickly. Amazon has to keep enough spare idle
capacity to handle unexpected customer load and still make a profit - you are
paying for that.

~~~
sophacles
Good point. I wonder at what % utilization a cluster beats EC2 in terms of
$/computation. Actually I'm sure there are all sorts of variables, and it
becomes an optimization problem, but I think it would be cool to see an
analysis of this in terms of $, computation power (time/jobsize or something),
etc.
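A first-order sketch of that break-even (all prices here are made-up
placeholders, not actual EC2 or hardware rates):

```python
# Toy model: at what utilization does an owned node beat on-demand cloud?
owned_cost_per_hour = 0.10   # hypothetical: amortized hardware + power + admin
cloud_cost_per_hour = 0.40   # hypothetical on-demand rate

# The owned node costs money 24x7 whether busy or idle; the cloud node is
# paid per busy hour. Owning wins once utilization * cloud_rate > owned_rate:
break_even_utilization = owned_cost_per_hour / cloud_cost_per_hour
print(break_even_utilization)  # 0.25 -> owning pays off above ~25% utilization
```

The real analysis would fold in admin staff, queueing delay, and spot
pricing, which is exactly the optimization problem mentioned above.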

~~~
tyvkiuiyi
Cluster scheduling is a huge area. I used to work on MPI clusters, and it is
an art to balance CPU, bandwidth, and propagation time to pick the optimum
number of processors for a particular algorithm.

Especially on commodity Ethernet-based MPI, which doesn't do broadcast,
shipping a gigabyte common dataset to 64 nodes can take a lot longer than
actually doing the calculation.
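A rough sense of why that hurts, with toy numbers (a 1 GB dataset over
gigabit Ethernet to 64 nodes; the figures are illustrative assumptions, not
measurements):

```python
import math

dataset_bits = 8e9        # 1 GB = 8 gigabits
link_gbps = 1.0           # commodity gigabit Ethernet
nodes = 64

one_send = dataset_bits / (link_gbps * 1e9)      # ~8 s per point-to-point copy

# Head node unicasts to every node in turn (no broadcast available):
serial_seconds = one_send * nodes                # 512 s, about 8.5 minutes

# A tree broadcast, where receivers re-forward, needs only log2(n) rounds:
tree_seconds = one_send * math.ceil(math.log2(nodes))  # 6 rounds -> 48 s
print(serial_seconds, tree_seconds)
```

Which is why MPI implementations put so much effort into collective
operations like MPI_Bcast rather than naive sequential sends.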

~~~
sophacles
Strange -- I always just sort of assumed that since they are making big
clusters, they could spend the extra $$ for a good multicast switch, and that
MPI did ip multicast. (a quick googling shows me to be wrong...).

~~~
jacquesm
I'm pretty sure a lot of people would donate for a statue in your likeness if
you solved that.

------
maurycy
What's better? Rock, paper or scissors? (I can't resist)

