
I Build Supercomputers in My Spare Time - stevesalevan
http://www.parallella.org/2014/06/03/my-name-is-brian-and-i-build-supercomputers-in-my-spare-time/
======
onalark
This is a misleading title. Brian builds Supercomputer Models, not
Supercomputers. It would be like if I had a model of a Ferrari I put together
at home and wrote an article about how I built sports cars in my spare time.

Yes, these are awesome, full-featured models, but the difference between this
and a supercomputer, which costs tens of millions of dollars, requires high-
density power and cooling, features multi-dimensional, low-diameter networks,
and contains hundreds of thousands to millions of compute cores, is... _quite_
vast.

~~~
vidarh
I get what you are saying, but what is a supercomputer today is a pedestrian
little home computer tomorrow.

The cylinder design he's using is inspired by the early Cray models. Cray 1
had a performance of 80 MFLOPS. Cray X-MP had a performance of 800 MFLOPS. The
Cray 2 (which looked substantially different) reached 1.9 GFLOPS in 1985.

From 1993 to 1996, the Numerical Wind Tunnel, a 140-CPU vector computer, was
at the top most of the time. It reached its all-time peak at around 235.8
GFLOPS.

Even ASCI Red, which held the top spot until the end of the 20th century,
only reached 1.3 TFLOPS.

So unlike if you built a model of a Ferrari at home, this thing actually
substantially outperforms the fastest supercomputers up until the mid 90's.
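For a rough sense of scale, here's a back-of-the-envelope comparison in plain Python, using only the figures quoted in this thread (the ~200 GFLOPS cluster figure appears further down, from the project's docs):

```python
# Back-of-the-envelope FLOPS comparison, using figures quoted in the thread.
cray_1 = 80e6      # Cray 1, 1976: 80 MFLOPS
cray_x_mp = 800e6  # Cray X-MP: 800 MFLOPS
cray_2 = 1.9e9     # Cray 2, 1985: 1.9 GFLOPS
nwt = 235.8e9      # Numerical Wind Tunnel, mid-90s all-time peak
cluster = 200e9    # the Parallella cluster, per its docs

print(round(cluster / cray_2))  # ~105x a Cray 2
print(cluster > cray_2)         # True: beats anything pre-NWT
print(cluster > nwt)            # False: just shy of NWT's peak
```

So "up until the mid 90's" is almost right: it clears every machine before the Numerical Wind Tunnel, but falls just short of NWT's own peak.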

~~~
onalark
This is a little bit of a straw man. As Jack Dongarra will happily tell you,
an iPad can outperform some of the supercomputers from the beginning of the
Top500 benchmark. You don't get to be a supercomputer today by beating
supercomputers from two decades ago; that's not how technology works. I'm
taking umbrage at the linkbait title because I'm a grumpy old man, not because
I think this project isn't cool (and admirable!)

~~~
vidarh
My point is that the "super computer" term in itself is quite meaningless.
It's pretty much just saying "we thought this thing was fast when it came
out".

And while this thing doesn't really meet _that_ label at its present scale, it
is conceptually far closer to those early supercomputers than an iPad is, both
in how it's structured, its parallel nature (16 ARM cores; 128 Epiphany
cores), the shared memory (within each Parallella), etc.

So yes, you're being grumpy about a title where it takes about 10 seconds to
figure out that this isn't _actually_ about someone building stuff aiming for
the Top500.

~~~
onalark
An iPad _does_ resemble the supercomputers of yore, with its superscalar
cores, vector processing (GPU), and multicore architecture.

The Parallella brings distributed-memory programming in, which is a very
important development.

You and I strongly disagree on the meaningfulness or definition of the term
supercomputer. Here's an easy definition: A supercomputer is any single,
unified, computer system that is currently one of the fastest 500 in the
world.

~~~
jacquesm
Another grumpy old man here. I disagree with your classification. By your
analogy, you'd be building a _classical Ferrari_ capable of the speeds of that
classical Ferrari, not a model. The fact that it can't measure up to _today's_
supercomputers does not mean that it isn't constructed along some of the same
lines and shares a lot of traits with them. Back when Beowulf clusters first
came into vogue, all the 'real' supercomputer people were saying 'but that
isn't a real supercomputer, you're using multiple CPUs', and we all know how
that discussion ended.

Give the man a break; wait a couple of years and he'll give you a 4K-core
_supercomputer_ for little money. That needs to be encouraged, not talked
down. It's early days.

------
supermatt
Having read through the documents, I see that it is capable of ~200 GFLOPS.
Obviously this outshines my MacBook's ~50 GFLOPS, but it is substantially
less than my workstation's GPU, an Nvidia GTX 770, capable of ~3000 GFLOPS.

I suppose my question is why would I choose to build something like this over
using a GPU?

~~~
bri3d
Here's some previous discussion:

[https://news.ycombinator.com/item?id=4702456](https://news.ycombinator.com/item?id=4702456)

In terms of raw GFLOPS, you shouldn't.

Parallella offers something halfway between a multi-CPU cluster (differing in
terms of memory access) and a GPU (differing in that each core is a real core
that can independently make calls, branch, etc.).

Another point to note is that every GPU architecture is different, and some
support different degrees of control-flow parallelism vs. data parallelism.

Ultimately it depends on the kernel you're working with - if you're at a point
where you've got strictly linear SIMD and you depend entirely on floating-
point throughput or memory bandwidth, it wouldn't make sense not to use a GPU
instead.
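To make the control-flow point concrete, here's a toy cost model in plain Python (not real GPU or Epiphany code; the branch costs are made up) of why a divergent branch favors independent cores over lockstep SIMD lanes:

```python
# Toy cost model: SIMD lanes execute both sides of a divergent branch
# (inactive lanes are masked off), while independent MIMD-style cores
# each run only the path they actually take.

def simd_cycles(inputs, cost_a, cost_b):
    """Lockstep lanes: if any lane diverges, all lanes pay both paths."""
    takes_a = [x % 2 == 0 for x in inputs]
    if all(takes_a):
        return cost_a
    if not any(takes_a):
        return cost_b
    return cost_a + cost_b  # divergence: both branches serialized

def mimd_cycles(inputs, cost_a, cost_b):
    """Independent cores: total time is just the slowest single core."""
    return max(cost_a if x % 2 == 0 else cost_b for x in inputs)

data = list(range(16))            # mixed even/odd -> divergent branch
print(simd_cycles(data, 10, 50))  # 60: pays for both paths
print(mimd_cycles(data, 10, 50))  # 50: each core runs only its path
```

With strictly uniform control flow the two models cost the same, which is the case where the GPU's raw throughput wins outright.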

~~~
moconnor
Although if you like, or have code suitable for, a multicore architecture, you
would get better performance from a Xeon Phi or the forthcoming Knights
Landing from Intel, both of which run x86_64 Linux.

Between ARM, OpenPower and Intel's Phi successors this is becoming a hyper-
competitive space. Interesting times ahead!

------
coreymgilmore
For anyone running into the DB connection error and the page not loading:

[http://webcache.googleusercontent.com/search?q=cache:http://...](http://webcache.googleusercontent.com/search?q=cache:http://www.parallella.org/2014/06/03/my-name-is-brian-and-i-build-supercomputers-in-my-spare-time/)

------
contingencies
Isn't the architecture of real world supercomputers essentially dependent on
their expected load?

The very notion of a 'general purpose processing supercomputer' essentially
conjures one of those modern data center visions: a large array of identical
consumer-grade hardware, chosen for its high price/performance ratio, that is
accessioned, wired, tested, commissioned, allocated workloads, managed over
time, and finally decommissioned by a combination of carefully developed human
procedures and highly automated processes.

For instance, nobody in their right mind would install an OS on every such
node by hand: it has to be PXE or similar (can you boot root-on-iSCSI direct
from BIOS these days?).
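As a rough sketch of what "PXE or similar" looks like in practice, here's a minimal dnsmasq configuration acting as a combined DHCP/TFTP boot server; the address range, paths, and the pxelinux.0 bootloader name are illustrative assumptions, not anything from the article:

```
# /etc/dnsmasq.conf - minimal PXE boot server (illustrative values)
enable-tftp
tftp-root=/srv/tftp            # must contain pxelinux.0 and its config
dhcp-range=192.168.1.100,192.168.1.200,12h
dhcp-boot=pxelinux.0           # filename handed to PXE clients
```

Every node that netboots from this then pulls its kernel and root filesystem over the network, so no per-node OS install is ever done by hand.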

I'm curious how many fellow HN'ers out there are doing this.

------
hunt
I can see one fan at the top of the cylinder- is only one fan really
sufficient? It looks like this would generate a large amount of heat.

I would be quite interested to see the temperatures this gets up to.

~~~
vidarh
A single 16-core Parallella can run with just passive cooling. It takes next
to no cooling to bring both the ARM and Epiphany chips down to near room
temperature.

~~~
BaryonBundle
While the 16-core Parallella can run with just passive cooling, you still
need to ensure proper airflow over the unit.

Units have been known to overheat with only passive cooling, and they even
advise installing a fan with the official case (which they sell in their
store / provided to backers), even though there is nowhere to screw a fan in.

Cases and Cooling: [http://www.parallella.org/2014/04/30/cases-and-cooling/](http://www.parallella.org/2014/04/30/cases-and-cooling/) (parallella.org) _April 30, 2014_

The point that a cluster of parallellas is _relatively_ easy to cool is
definitely aided by the form factor.

------
tomberek
[http://webcache.googleusercontent.com/search?q=cache:nBNJrRz...](http://webcache.googleusercontent.com/search?q=cache:nBNJrRzmjUMJ:www.parallella.org/2014/06/03/my-name-is-brian-and-i-build-supercomputers-in-my-spare-time/+&cd=1&hl=en&ct=clnk&gl=us)

------
lemcoe9
Your LEDs use 20W of power? I would immediately scrap those - they serve no
real purpose and use an insane amount of power compared to their computing
counterparts.

------
deutronium
Wouldn't it be quite hard to benchmark something like the Parallella, as it
contains an FPGA, an ARM chip, and their own multi-core chip?

~~~
rjsw
Not really, you have to make it explicit where any program would run.

------
manuw
"Error establishing a database connection" *scnr

