

256 DSP core at 28nm, 200 GFLOPs with 5 Watts, designed by KALRAY - agumonkey
http://blog.thinkteletronics.com/?p=399

======
jws
The hardware is interesting, but it also looks like they've put a lot of
effort into programming it. You can approach it from C, or they have a way for
you to specify a dataflow and get it mapped onto the hardware. They also
appear to have lots of performance analysis tools in their developer kit
(which has no price, so I'll just assume 5 figures or vapor).

It isn't aimed at GPU work and it is strange enough not to be a general
purpose computer. The obvious hacker markets are password cracking and bitcoin
mining; the coins per joule might be a game changer. I'm sure someone on Wall
Street can use them to harvest fractional pennies at astounding speed. Low
power, high throughput, and big compute sounds about right for an NSA network
sniffing appliance. The application will have to be big enough to get someone
over the learning curve to justify the strangeness. "Hello World!" won't be
hard, but optimizing will be. Something on the Google indexing or Siri scale
would be an obvious market if the computation per watt justifies it.

• 16 clusters of 16 processors all tied together with a fast on chip network.

• 32MB of local memory for each cluster.

• Two DDR3 interfaces

• Two 40Gbps ethernet or eight 10Gbps (though I only see two on the diagram).

• Two eight lane PCI Express interfaces
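
For a rough sense of scale, here is a back-of-the-envelope sketch using only the headline figures (200 GFLOPS, 5 W) and the list above. It assumes the 200 GFLOPS is a peak number spread evenly across all cores, which is my assumption and not something the article states; the variable names are just for illustration.

```python
# Back-of-the-envelope figures from the headline and the spec list.
# Assumption: the 200 GFLOPS headline is a peak figure shared evenly by all cores.
CLUSTERS = 16
CORES_PER_CLUSTER = 16
PEAK_GFLOPS = 200.0  # headline figure
POWER_WATTS = 5.0    # headline figure

total_cores = CLUSTERS * CORES_PER_CLUSTER
gflops_per_watt = PEAK_GFLOPS / POWER_WATTS
gflops_per_core = PEAK_GFLOPS / total_cores

# Either Ethernet configuration gives the same aggregate bandwidth.
ethernet_gbps_two_40g = 2 * 40
ethernet_gbps_eight_10g = 8 * 10

print(f"cores:            {total_cores}")
print(f"GFLOPS per watt:  {gflops_per_watt:.1f}")
print(f"GFLOPS per core:  {gflops_per_core:.3f}")
print(f"Ethernet (Gbps):  {ethernet_gbps_two_40g} or {ethernet_gbps_eight_10g}")
```

So each core is individually modest; the pitch is the aggregate throughput per watt, not single-thread speed.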

------
vonmoltke
This little chip could be a significant boon to signal and image processing. A
number of those applications, particularly surveillance radar,
seismic/hydrographic survey, and sensor fusion[1], have significant data
parallelism. Many have multiple independent data streams that must be
processed and integrated in real time. With the proper algorithms, an
architecture like this is much more efficient than a general-purpose multi-
core or multi-machine system.

One of the key limits to these sensors is the data pipe between the various
pieces that run the signal processing. This is especially true when one of the
hops is over a relatively slow link, such as with an ROV/UAV. The size and
weight of this chip would allow more of the signal processing to be embedded
on the platform or in the sensor, thus reducing the volume of data passed over
the slow link and allowing more information to be sent.

It does, of course, require a different programming paradigm to use
effectively. Specialized applications like the ones I have mentioned are worth
the extra effort and reduced portability, and so will see value in this device.
No doubt some of the other applications mentioned so far (crypto, finance, etc.)
will see value in it as well.

[1] I have a strong background in and knowledge of military sensors, and
significant exposure to seismic and hydrographic survey in the oil and gas
industry.

~~~
tadfisher
Bitcoin.

------
xyzzy123
That's cool.

Although in the hardware world, I have to say that there are a lot of cool
things, and often the main thing is, can I actually buy it?

If you ordered one and had the chance to use it, I think that would give you
the right to crap on the kickstarter project.

------
mmariani
Here's the tech specs <http://www.kalray.eu/products/mppa-manycore/mppa-256/>

------
hn_is_vile
1\. I don't see a price anywhere in the article. Adapteva plans to sell the
64-core version for $199 and that includes the board.

2\. Adapteva plans to open source the hardware and software; does KALRAY?

3\. Adapteva has actually built chips. The Kickstarter project is to scale up
manufacturing to bring down the cost. I think $99 for 16 cores is pretty good.

Still, if the price is right, more is better, so it's worth keeping an eye on.

------
Jetlag
I get deja vu every time I see that Parallella schematic:
[http://www.tilera.com/sites/default/files/productbriefs/PB01...](http://www.tilera.com/sites/default/files/productbriefs/PB010_TILE64_Processor_A_v4.pdf)

------
modeless
Has there been any research into running neural networks on DSP architectures?
I'd love to try my hand at programming a DSP. Are there any low-cost options
for hobbyists?

~~~
stusmall
Keep an eye out for TI tech days. They have them in most major cities. It's a
series of classes layered with tons of TI advertising. I usually walk out of
them having learned something.

The reason I bring them up is they usually give out coupons for 95% off
anything under $500 from their estore. I picked up a nice DSP dev board with
one of them not too long ago.

------
xmpir
It's a nice science experiment, but parallelization has its limits. There are
very few examples of (non-mathematical) algorithms that can be parallelized to
256 cores...

[http://en.wikipedia.org/w/index.php?title=File:AmdahlsLaw.sv...](http://en.wikipedia.org/w/index.php?title=File:AmdahlsLaw.svg&page=1)
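
To put numbers on that graph, here is a minimal sketch of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the core count. The function name and the sample fractions are mine, chosen for illustration only.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup per Amdahl's law.

    parallel_fraction: share of the runtime that can be parallelized (0..1).
    cores: number of cores the parallel part is spread over.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative parallel fractions, all evaluated at 256 cores.
    for p in (0.50, 0.90, 0.95, 0.99):
        print(f"p = {p:.2f}: at most {amdahl_speedup(p, 256):.1f}x on 256 cores")
```

Even with 95% of the work parallelizable, the bound can never exceed 1 / (1 - 0.95) = 20x no matter how many cores you add, so most of the 256 cores only pay off when the serial part is tiny.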

~~~
tonfa
And DPI and communication analysis are among them.

The tech spec hints clearly that it is a DPI chip, especially the 80Gbps
Ethernet. (And the company is funded by the French public sector.)

~~~
hollerith
What's DPI?

~~~
tonfa
Deep packet inspection (network analysis).

