As you can see from the pictures, each of the devices uses its own power supply, so there should be considerable room to improve overall power consumption.
Anybody with more experience using MPI?
Of course it's not fair to compare this to an InfiniBand cluster (that's not the point of this exercise), but I'd really be interested to see a cluster built on $0.50 ARM chips with at least a gigabit ethernet interconnect. A couple of years from now -- given the low entry cost and lower infrastructure costs (cooling/power consumption/etc) -- that could be a game changer.
Dell's "Project Copper" - http://content.dell.com/us/en/enterprise/d/campaigns/project...
Boston Viridis - http://www.boston.co.uk/solutions/viridis/default.aspx
One of my favorite systems questions is to have someone walk through the design and implementation of a system where all the machines in the system respond to a query Q based on the contents of a linked list L. The system has an API consisting of L <- M(op) (mutate the list), R <- Q(id) (respond to a query based on the contents of the list), and R <- S() (report on the stability of the list). Start with M(op) being idempotent, then non-idempotent, etc. Folks who've had a good introduction to state machines will immediately recognize a number of problems that arise as you control for correctness. If folks get through the whole sequence, we're talking about a function f(C) that takes a correctness coefficient C from 1.0 (fully correct) to 0.0 (unspecified), and we look at the performance of the system across that range.
That kind of stuff you could easily do on a 48 node Pi Cluster.
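To make the exercise concrete, here's a minimal single-node sketch of that M/Q/S API in Python, using a plain list in place of the linked list. The class name, the op-id-based idempotence scheme, and the "no pending ops" notion of stability are all my assumptions for illustration; the real exercise is distributing this across machines.

```python
# Hypothetical single-node reference for the M/Q/S API described above.
class ListStateMachine:
    def __init__(self):
        self._list = []
        self._applied = set()   # op ids already seen, to make M(op) idempotent
        self._pending = 0       # ops not yet acknowledged (always 0 on one node)

    def M(self, op_id, value):
        """Mutate the list. Idempotent: re-applying the same op_id is a no-op."""
        if op_id in self._applied:
            return list(self._list)
        self._applied.add(op_id)
        self._list.append(value)
        return list(self._list)

    def Q(self, idx):
        """Respond to a query based on the contents of the list."""
        return self._list[idx] if 0 <= idx < len(self._list) else None

    def S(self):
        """Report on the stability of the list (here: no pending ops)."""
        return self._pending == 0

sm = ListStateMachine()
sm.M(1, "a")
sm.M(1, "a")          # duplicate delivery; idempotence absorbs it
sm.M(2, "b")
print(sm.Q(0), sm.Q(1), sm.S())   # a b True
```

The interesting problems only show up once M(op) stops being idempotent and the set of applied ops has to be agreed on across nodes.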
I implemented neural network feature creation for a backgammon agent ("Automated feature selection to maximize learning in artificial intelligence").
Nowadays, I mainly do parallel machine learning on machines with higher network latency. I haven't used MPI since.
With the sort of work I do nowadays (large-scale ML and NLP), I generally need very little synchronization, i.e. my tasks are usually embarrassingly parallel. I typically save final results in a centralized store (DB or NFS) and look at it there.
I also use Hadoop where appropriate.
My conclusion: the RPi doesn't even win on FLOPS/Watt, let alone $/FLOPS.
If you have less than $100 to spend, a better $/FLOPS ratio on a MacBook (or anything else, really) doesn't matter. One is available, one plainly is not.
For people who CAN spend enough, this is usually just a (third? nth?) gadget to play with, like in the article (because it's really just a neat way of playing with gadgets and Lego, not 'useful' in any sense that can be quantified).
In that case you'd want to make it easy to get funding, and be optimising for initial cost. What would be the cost of 64 of your laptops?
Certainly laptops aren't the ideal platform to compare against anyway. I'd compare to a 1U node that you would buy for a compute cluster. One can buy a 64-core 1U node with 64 GB of RAM for about $7k. If one wants to play around with cluster computing, you can emulate an entire cluster on one of those!
For comparison, you could buy a motherboard, 32 GB of RAM, and an Intel i5 processor for $500 that will do over 20 GFLOPS.
So, it doesn't really stack up well from a price/performance standpoint. The value of these systems lies more in teaching students how to work with parallel code.
There are no practical applications to this, it's just to show that it can be done.
You can spawn 64 processes on a Core i7 and it would be about the same, just faster. Or 64 VMs.
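For what it's worth, the single-box version of this is a few lines with Python's stdlib multiprocessing; the worker count and the toy workload (summing ranges) are arbitrary placeholders.

```python
# Sketch of the "64 processes on one box" idea with multiprocessing.Pool.
from multiprocessing import Pool

def partial_sum(bounds):
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    n, workers = 1_000_000, 8   # bump workers to 64 to mimic the cluster
    step = n // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]
    with Pool(workers) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total == n * (n - 1) // 2)   # True: matches the closed-form sum
```

Of course this only exercises the programming model, not the network effects that make a real cluster interesting.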
Maybe the real value of these systems is to teach students to design and calculate before building it.
The application of this is a teaching model. It's a lot easier to demonstrate parallelism gains on this type of platform. Scaling beyond a single ARM core is going to give you immediate performance benefits. Scaling further out to the entire cluster will continue to show returns.
With a single desktop, once you go beyond ~4 cores the gains drop off too quickly. You just won't be able to see gains out to 64 threads on a single CPU, whereas on this you should.
It also doesn't hurt to have a quirky architecture to get students excited about. And yes, you could also spend some time discussing the architectural trade-offs and why this is not a cost-effective system for production use.
I made a bad assumption because your account is almost one year old. I'm sorry. Now I see that information is not given anywhere on this site. I think this voting method was taken from Reddit.
One interesting thing is that an activated USB port (even when idle) can blow the CPU's own power consumption out of the water.
On the other hand, stop sucking up all the supply for Raspberry Pis! My order keeps getting delayed and it's making me sad.
The amazing thing about this project? Someone managed to get 64 RasPi's all in the same place!
"Please submit the original source. If a blog post reports on something they found on another site, submit the latter. "