
The Raspberry Pi as a Poor Man's Transputer - robin_reala
http://jacquesmattheij.com/raspberry-pi-as-a-poor-mans-transputer
======
jdboyd
I don't think that using a Raspberry Pi as a poor man's Transputer will be very
satisfactory.

The INMOS T800-20 ran at 20 MHz and did 10 MIPS. However, it had four links,
each doing 20 Mb/s, so it could communicate at about the same speed it
processed data.

By comparison, a Raspberry Pi is ~700 MHz and 847 MIPS, but proportionally its
I/O is extremely starved. It can barely keep up with 100 Mb Ethernet, which
gives only slightly more bandwidth than the Transputer had 30 years ago.
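
To put that balance in rough numbers (a back-of-envelope sketch using the figures above, not datasheet measurements):

```python
# I/O bandwidth available per unit of compute, using the figures above.
# All numbers are approximate and taken from the comment, not datasheets.

def io_per_mips(link_mbps, mips):
    """Megabits per second of link bandwidth per MIPS of compute."""
    return link_mbps / mips

# INMOS T800-20: four 20 Mb/s links, ~10 MIPS
t800 = io_per_mips(4 * 20, 10)   # 8 Mb/s per MIPS

# Raspberry Pi: one 100 Mb/s Ethernet port, ~847 MIPS
rpi = io_per_mips(100, 847)      # ~0.12 Mb/s per MIPS

print(f"T800 was ~{t800 / rpi:.0f}x better balanced for clustering")
```

By this crude measure the Transputer had roughly 70x more interconnect bandwidth per instruction than the Pi does.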

There are numerous cool things about the Raspberry Pi, like the graphics, the
community support for cool software, and the HD camera option (and the company
making an HDMI input adapter board to connect to the camera interface on the
RPi). However, it strikes me as a terrible choice for clustering, and I feel
sad whenever I see someone talking about doing that.

A much better choice would be an ARM chip with built-in gigabit Ethernet. One
possibility would be to find PogoPlugs. They aren't just a bare board, but
they can often be found for under $20 with hardware gigabit Ethernet, an
800 MHz or 1.2 GHz processor, and 128-512 MB of RAM. Another option that is a
little more expensive would be a Biostar J1800NH with 2 GB RAM sticks and a
pico PSU, booting from a cheap flash drive; that could probably be put
together for $105 per node.

~~~
chillingeffect
Yes, this is a trope amongst the amateur fan base of the Raspberry Pi and it
is familiar in many other avocations.

"I know my hobby X is not good enough, but if I got enough of them, it might
be way better than things actually designed to do X!"

When massively connecting RPis was first proposed to me a year or two ago, I
quickly did the math and showed that a single top-of-the-line graphics card
(I did the numbers for a Radeon 7970) outperformed RPis by a factor of
something like 10 to 1 in price/GFLOPS. That's without even considering
interconnections.
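
The arithmetic behind a "10 to 1" price/GFLOPS figure might look like this. The Radeon number below is its published peak single-precision rate; the RPi number is the commonly cited VideoCore IV GPU peak. Both are peaks, not sustained throughput, so treat this as a sketch:

```python
# Back-of-envelope price-per-GFLOPS comparison. Peak figures only;
# sustained throughput and interconnect costs would widen the gap.

def dollars_per_gflops(price_usd, gflops):
    return price_usd / gflops

radeon = dollars_per_gflops(550, 3790)  # Radeon 7970: ~$550, ~3.79 TFLOPS SP peak
rpi = dollars_per_gflops(35, 24)        # RPi Model B: ~$35, ~24 GFLOPS GPU peak

print(f"RPi costs ~{rpi / radeon:.0f}x more per GFLOPS")
```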

I know Jacques Mattheij is something of a folk hero here on HN, so I'll tread
lightly, but I find this weak: it's years overdue and comes complete with a
non-committal call for someone else to do the work, which is not a value in
this community. He implies his time is worth more than other people's:

"Doing this will be [...] well worth it." "If anybody builds this let me know"

~~~
rwmj
Those who haven't read Kumar, Grama, Gupta and Karypis[1] are doomed to
rediscover it, painfully.

[1] [http://www.amazon.com/Introduction-Parallel-Computing-
Analys...](http://www.amazon.com/Introduction-Parallel-Computing-Analysis-
Algorithms/dp/0805331700)

~~~
chillingeffect
Do you happen to know if there's a particular term or metric that describes
the cost of the overhead of interconnecting elements?

~~~
rwmj
The book (above) has a comprehensive series of equations you can use to
measure the cost of interconnections in parallel algorithms. I can tell you I
did an MSc-level course in this, and it is _very_ complex, one of the hardest
courses I did on what was a very difficult MSc.

Edit: But basically your comment about interconnecting RPis with ethernet is
completely true. It's a huge waste of time.

------
rwmj
I'd rather use something from AllWinner, e.g. an A10 is cheaper ($30 in a
packaged system-on-a-USB-key) and a bit more powerful. The A20 (e.g. the
Cubietruck) is more expensive but is dual core and each core is much more
powerful.

~~~
userbinator
There also appears to be more hardware documentation available, which is
always a good thing.

~~~
rjsw
I would have thought that there was more documentation on the Raspberry Pi now
that the GPU is open.

------
lttlrck
I remember getting excited about the Atari Transputer [1] in my 520ST days -
later on I was lucky enough to do a unit on Occam running on a real T800 - way
back in 1992. They could have based an entire semester on the concepts INMOS
introduced with that little processor.

[1]
[http://en.wikipedia.org/wiki/Atari_Transputer_Workstation](http://en.wikipedia.org/wiki/Atari_Transputer_Workstation)

~~~
kabdib
Ha. The Atari Transputer (the Abaq) was an utter surprise to us folks in Atari
engineering in Sunnyvale. One day there was a press release, and some Atari
folks (that I never knew existed) had apparently decided to ship a Transputer-
based personal computer.

I thought it was cool, but the Tramiels had other ideas.

~~~
jacquesm
As a former owner of a 520 and a 1040: thanks a ton! That was my first machine
with a linear address space of more than 64K; it opened up a whole world for
me.

~~~
sitkack
Sorry to turbo the thread, but perhaps the device you were wondering about in
your post is from
[http://www.xmos.com/products/xkits](http://www.xmos.com/products/xkits) ?

The folks that created the Transputer have created a similar embedded
processor.

~~~
jacquesm
Yep, that's the one!

------
zhemao
Are you suggesting connecting adjacent Pis directly over GPIO? I've tried
this before and it doesn't really work out very well. At some point I ended
up with the ground voltages drifting and then everything was just borked.
Plus, the speed of SPI, I2C, or bit-banged GPIO is several orders of
magnitude slower than Ethernet.

Communication over GPIO is a cool idea but ultimately pointless.

------
dsl
Check out the new Raspberry Pi Compute Module
[http://www.raspberrypi.org/raspberry-pi-compute-module-
new-p...](http://www.raspberrypi.org/raspberry-pi-compute-module-new-product/)

------
protomyth
A lot of what made the Transputer cool was Occam
[http://en.wikipedia.org/wiki/Occam_(programming_language)](http://en.wikipedia.org/wiki/Occam_\(programming_language\))

~~~
ZenoArrow
The spiritual successor to the Inmos Transputer is the XMOS line of processors
(David May is a key figure in both). Would be interested to hear your thoughts
on XC and XMOS... [http://www.xmos.com](http://www.xmos.com)

~~~
protomyth
That is amazing; it's going to take me a while to go through their stuff.
Thanks for the link.

------
0xdeadbeefbabe
Speaking of many connected computers, GreenArrays sells a chip (GA144 [0])
that has 144 CPUs, and the interior ones are connected 4 ways. They also have
a simulator that runs fine in Wine. While the official eval board is $450,
there is a poor man's option for $60 [1].

[0] [http://www.greenarraychips.com/](http://www.greenarraychips.com/)

[1]
[http://www.greenarraychips.com/home/documents/budget.html](http://www.greenarraychips.com/home/documents/budget.html)

------
ChuckMcM
This, like the C64 compute module [1], would be a good way to get a good
understanding of where the problems are. A long time ago, when I was looking
at parallelization issues in storage, I came across some excellent work done
by Garth Goodson at CMU in their "Active Disks" [2] project. They were
connecting what were essentially RPi-class devices together.

What emerges is a sort of calculus for describing parallel systems which takes
the amount of compute per second a node can do, the amount of data that node
can "consider" during its computation, and the time it takes for all of the
nodes to "react." In this case a node considers a data structure when it
evaluates its state (say the head node of a list, or the median element of an
array) and is constrained from taking action by a threshold of assurance that
the data it is considering is accurate. The reaction time is the time between
making a change to the data and the time at which all elements in the cluster
considering that data consider it true.

I used to describe that to people with the analogy of those big mechanical
train signs at stations with all the letters on flip wheels. People look at
the sign and consider it; it is stable and 'true'. Then a train leaves the
station and all of the letters start flipping as the sign changes to show the
new truth about what is happening. People with no knowledge are stuck waiting
for the sign to settle down before they can go to their track; people with
knowledge can already be heading for their track, but if the track used for
their arriving train is changed they will be put in motion again. The length
of time it takes to change the sign is equivalent to the time it takes for a
cluster to react. And a train station cannot usefully serve trains faster
than the passengers can figure out which train to be on, nor can a cluster
usefully process structured data faster than the truth of the relationships
in that data can be ascertained.

It all collapses down to Amdahl's law of course but along the way can help you
figure out where the inefficiencies are going to crop up in the system.
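
That reaction-time bound can be sketched as a toy formula (the parameter values below are made up purely for illustration):

```python
# Toy model of the reaction-time bound described above: a cluster cannot
# usefully accept updates to shared structured data faster than the time
# it takes every node considering that data to converge on the new truth.

def max_update_rate_hz(propagation_s_per_hop, hops_to_farthest_node):
    reaction_time_s = propagation_s_per_hop * hops_to_farthest_node
    return 1.0 / reaction_time_s

# e.g. an assumed 100 us per hop and 4 hops to the farthest node:
# updates arriving faster than ~2500/s outpace the cluster's reaction.
print(f"{max_update_rate_hz(100e-6, 4):.0f} updates/s")
```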

[1] [http://www.vintage.org/projects.php](http://www.vintage.org/projects.php)

[2]
[http://www.pdl.cmu.edu/Active/index.shtml](http://www.pdl.cmu.edu/Active/index.shtml)

~~~
joshu
Garth Gibson.

~~~
ChuckMcM
argh, and thanks. In my defense, I was reading this paper
([https://www.usenix.org/legacy/event/osdi00/full_papers/strun...](https://www.usenix.org/legacy/event/osdi00/full_papers/strunk/strunk.pdf)
\-- Self securing storage systems) by Garth Goodson, _also_ of the CMU
Parallel Data Lab.

------
zwieback
I remember Transputers - they were very popular in Germany due to Parsytec
([http://en.wikipedia.org/wiki/Parsytec](http://en.wikipedia.org/wiki/Parsytec)).

Other than for fun and learning - what are the advantages of a fabric vs.
something like a GPU system? From my experience with micros it seems there's
so much power lost in the peripherals that it's inefficient to use anything
with a small number of cores for massively parallel systems.

~~~
jacquesm
GPUs are not very good at problems that you can't model with SIMD, though
they're getting better at such problems.

Also, there is no isolation between the nodes, it's all one memory space
(that's what triggered this to begin with), computing fabrics have security
implications because the nodes are isolated from each other.

------
chiph
In college I did some work with an Intel iPSC/1 Personal Supercomputer that
had been donated to the university. It had 128 nodes, each of which was an
80286 with 512K of RAM, connected in a hypercube configuration. Being a
novice, all I did with it was sort numbers. :)

But it strikes me that the Pi compute module has way more power than the iPSC
did, and consumes far less electricity. Switched gigabit ethernet would likely
suffice -- you wouldn't need something like InfiniBand. Not sure if you'd want
an out-of-band management network for something like this, but it'd be nice to
be able to monitor/restart it if the network stack gets wedged.

You'd want some sort of backplane to host the SODIMMs and route data lines.
Packaging density would mean convection cooling probably wouldn't cut it, so a
fan would be needed. How many could you fit into a 1U rack, hmmm?

~~~
jacquesm
Why go all the way to Ethernet if all it does is add cost & complexity? You
could simply connect the GPIO lines of one 'pi' to its neighbours! Pull an
Ethernet link out of each backplane maybe, but use direct communications
within the backplane?

> How many could you fit into a 1U rack, hmmm?

Lots :)

~~~
zokier
> You could simply connect the GPIO lines of one 'pi' with the neighbours!

How fast do you reckon a GPIO-based bus would be? Especially if the RPis are
required to do something more than just shuffle data around.

~~~
undersuit
[http://codeandlife.com/2012/07/03/benchmarking-raspberry-
pi-...](http://codeandlife.com/2012/07/03/benchmarking-raspberry-pi-gpio-
speed/)

I was wondering this too, but I never took the time to work out what this
guy's numbers would translate to in terms of actual throughput.
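
One way to translate a raw toggle rate into link speed, as a sketch (the toggle rate and overhead factor below are placeholder parameters, not numbers from the linked benchmark):

```python
# Rough conversion from GPIO toggle rate to bit-banged link throughput.
# Assumes one data bit per toggle, derated for clocking/framing overhead.

def bitbang_throughput_mbps(toggle_hz, overhead_factor=0.5):
    return toggle_hz * overhead_factor / 1e6

# With an assumed 20 MHz toggle rate and half the cycles lost to
# clock/protocol overhead, you'd see on the order of 10 Mb/s:
print(f"~{bitbang_throughput_mbps(20e6):.0f} Mb/s")
```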

~~~
fhars
These measurements are irrelevant to the question at hand, as they toggle the
pins in software. If you were using them in a communication setting, you
would use pins that can do serial communication using DMA; consult the
processor datasheet for details. It looks like Adafruit is able to push pixel
data at 80 Mb/s (that's Mb, not MiB) over SPI to a display:
[https://learn.adafruit.com/adafruit-pitft-28-inch-
resistive-...](https://learn.adafruit.com/adafruit-pitft-28-inch-resistive-
touchscreen-display-raspberry-pi/faq#faq-3)

------
raphman_
from the article:

> There was a thread a while ago about a little board that you could connect 4
> ways to its neighbours (for the life of me I can’t find it…),

That might be: [https://www.indiegogo.com/projects/pshdl-
board](https://www.indiegogo.com/projects/pshdl-board)

~~~
jacquesm
That wasn't it but very interesting!

------
moron4hire
I've been interested in doing a similar thing with the TI MSP430
microcontroller. You can get 16 MHz MSP430s for about $1.00 USD/ea in bulk
(and one particular 8 MHz version at $0.10 USD, IIRC), and it takes only a
resistor, a capacitor, and a power source to run them after they have been
flashed. The GPIO has a number of useful features built in, including a
number of serial communication protocols. They are very small, even for a DIP
package. They require very little power (I ran one actively polling two
ultrasonic sensors and twiddling two RGB LEDs for two days straight on 3 AA
batteries). And they are very easy to program.

~~~
bobowzki
The absolute bulk of the energy was used on the sensors and LEDs in your
example.

~~~
moron4hire
correct, that is the point.

------
zokier
This sort of idea pops up occasionally, but the math does not seem to work
out that well. The cheapest x86 motherboards with an integrated Celeron CPU
cost 70-80 dollars; add in 20-30 dollars' worth of RAM and you've got a
device that is maybe 10 times faster than an RPi while costing only four
times more. And you have gigabit Ethernet on those x86 boards while the RPi
is limited to 100 Mbps, which will be a significant factor in a cluster
configuration.

So while 5 off-the-shelf x86 computers on a GbE network do not sound as cool
as 30 RPi modules with a custom backplane, they should perform significantly
better while costing about the same.
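
Spelled out with the thread's rough prices (RPi at an assumed ~$35/unit, an x86 node at ~$100 with an assumed 10x per-node speed):

```python
# Back-of-envelope cluster comparison using the comment's rough figures.
# Performance is in arbitrary "RPi units"; prices exclude the backplane,
# switch, PSUs etc., which would hit the RPi option hardest.

def cluster(nodes, cost_per_node, perf_per_node):
    return {"cost": nodes * cost_per_node, "perf": nodes * perf_per_node}

x86 = cluster(nodes=5, cost_per_node=100, perf_per_node=10)
rpi = cluster(nodes=30, cost_per_node=35, perf_per_node=1)

print(x86)  # {'cost': 500, 'perf': 50}
print(rpi)  # {'cost': 1050, 'perf': 30}
```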

~~~
jacquesm
Until you factor in power consumption.

~~~
zokier
The Intel chip I was looking at while getting the numbers was the J1800, with
a TDP of 10 watts. Wikipedia quotes 3.5 watts for the RPi (Model B). Assuming
my estimate of 10x performance for Intel holds, the perf/watt situation looks
pretty equal.

------
lesingerouge
I believe there might be some interesting results in using the RasPi as a
kind of building block for massively parallel computing, but I find that path
very hard to merge with the "ubiquitous computing" trend. The latter seems so
much more promising from the point of view of "processing power available to
the masses".

~~~
jacquesm
Mark Weiser's world and the world of hacking have fairly little overlap; the
RasPi is solidly in the hacking camp for now. I don't expect consumers to buy
RasPis any more than I expect them to buy 1U servers or the racks to mount
those in.

Processing power available to the masses is intimately connected to mobile
phones and tablets. RasPis are only one small step removed from a girl or a
guy wielding a soldering iron.

~~~
lesingerouge
On the other hand, you would expect consumers to buy hard drives and storage
devices. So maybe that will be the way to go for processing power too?

------
spullara
This is the poor mans transputer:
[http://www.parallella.org](http://www.parallella.org)

------
zokier
In this day and age, I'd go for at least gigabit Ethernet for the "fabric".
Maybe just buy a bunch of cheap GbE wifi routers and plug them together. I
wonder what the network performance would be like if the built-in four/five
port switches were used.

~~~
happycube
If the SATA controllers can deal with it, I don't see why SATA couldn't be a
fast point-to-point connection.

