
A Deep Learning USB Stick - zzzzFrog
http://www.movidius.com/news/movidius-announces-deep-learning-accelerator-and-fathom-software-framework
======
randyrand
Lots of misunderstanding in this comments section.

Movidius makes low-power neural network processors for mobile applications. The
Myriad V1 is used in Google Tango, and the V2 (what the USB stick has) is used
in the new DJI Phantom 4.

[http://www.theverge.com/2016/3/16/11242578/movidius-myriad-2-chip-computer-vision-dji-phantom-4](http://www.theverge.com/2016/3/16/11242578/movidius-myriad-2-chip-computer-vision-dji-phantom-4)

The Myriad chips are interesting because they combine MIPI camera interface
lanes on the same chip as a general-purpose NN/CV processor, along with an SDK
suite of hardware-accelerated computer vision functions (edge detection,
Gaussian blur, etc.).
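
As a concrete illustration (plain Python, not the actual Movidius SDK, which runs these on dedicated hardware), here's roughly what two of those primitives boil down to in software:

```python
# A software sketch of two primitives the Myriad 2 hardware-accelerates:
# Gaussian blur and Sobel edge detection, written as plain 3x3 convolutions.
# Illustrative only; the real SDK runs these on dedicated accelerator cores.

def conv3x3(img, kernel):
    """Valid-mode 3x3 convolution over a 2D list-of-lists image."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * img[y + ky - 1][x + kx - 1]
            row.append(acc)
        out.append(row)
    return out

# 3x3 Gaussian kernel; weights sum to 1
GAUSSIAN = [[1/16, 2/16, 1/16],
            [2/16, 4/16, 2/16],
            [1/16, 2/16, 1/16]]

# Sobel kernel for horizontal gradients (a simple edge detector)
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

if __name__ == "__main__":
    # Tiny grayscale test image with a hard vertical edge
    image = [[0, 0, 255, 255] for _ in range(4)]
    print(conv3x3(image, GAUSSIAN))  # smoothed interior pixels
    print(conv3x3(image, SOBEL_X))   # strong response at the vertical edge
```

The point of the hardware is that it does exactly this kind of per-pixel multiply-accumulate loop at camera frame rates within a ~1 W budget.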

Here's the white paper for the chip:
[http://uploads.movidius.com/1441734401-Myriad-2-product-brief.pdf](http://uploads.movidius.com/1441734401-Myriad-2-product-brief.pdf)

Because programming these chips essentially requires having the hardware, and
because the hardware was very hard to come by, programming these chips was
mostly limited to Google, DJI, and other big partners.

With this release the everyday developer has access to these vision processing
chips, and the barrier to development entry is considerably lower.

This is not meant to replace your Titan X GPU.

~~~
revelation
This is their own press release. What does that kind of hardware for CV
primitives have to do with deep learning?

(Also, of course, this stick doesn't seem to have any kind of connectivity
besides the USB to the host computer. How do I connect my camera? Having to
shuffle the data from a camera to the stick passing the host computer somewhat
defeats the point.)

~~~
krasin
>What does that kind of hardware for CV primitives have to do with deep
learning?

They have hardware convolutions on 12 SHAVE cores (kind of a DSP core). It
means the chip can run a useful subset of convolutional neural networks very
fast and energy-efficiently.
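
For a sense of scale (my numbers, purely illustrative, not a spec of the Myriad 2's actual workload), counting the multiply-accumulates in a single conv layer shows why dedicated convolution hardware matters:

```python
# Rough multiply-accumulate (MAC) count for one convolutional layer.
# Illustrative figures only, not taken from the Myriad 2 documentation.

def conv_layer_macs(h, w, c_in, c_out, k):
    """MACs for a k x k convolution, stride 1, 'same' padding."""
    return h * w * c_in * c_out * k * k

# A modest 3x3 layer: 56x56 feature map, 64 input channels, 128 output channels.
macs = conv_layer_macs(56, 56, 64, 128, 3)
print(f"{macs / 1e6:.0f} M MACs for a single layer")  # ~231 M MACs
```

At 30 fps that one layer alone is roughly 7 GMACs/s, which is presumably the kind of load the 12 SHAVE cores are there to absorb rather than a phone-class CPU.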

They also have 2 general-purpose SPARC cores, which lets you have a
"normal" program running there. Not sure how locked down the USB stick is
going to be, or whether running your own program would be an option.

>How do I connect my camera?

The chip itself has a couple of MIPI lanes. The USB stick likely does not
expose them. And I agree, that's suboptimal.

------
hartator
I don't fully get the negativity of the HN comments toward this product based
on the sole fact that it's currently a USB stick.

I think it can make sense to push this kind of chip to market early rather
than wait for it to be bundled into another device. Kudos for Caffe support.

~~~
randyrand
This is intended for development. I think HN commenters are missing that.

Given the limited availability of phones and products with the Myriad 2, and
the fact that you might not be targeting a phone anyway, being able to buy and
develop on a cheap USB stick is an incredibly smart move.

------
hatsunearu
I don't see why you would need deep learning NN on a phone.

If this is for CV, why not use the myriad DSPs from every silicon company
ever?

Seems like a buzzword fueled product.

~~~
randyrand
Tell that to Google and DJI, who use them in their own products.

They aren't just for NNs; they are general-purpose image processing chips with
NNs as one option.

------
magicfractal
Do you guys know if the Myriad V2 processor is locked down (just neural
networks) or if it will be programmable?

~~~
manav
Well, it would be suited to other computer vision tasks, based on the
hardwired accelerator cores.

------
babo
Haven't found any info on availability or price. Interesting concept
anyway.

~~~
akramhussein
<$100[0]

[0] [http://www.tomshardware.com/news/movidius-fathom-neural-compute-stick,31694.html](http://www.tomshardware.com/news/movidius-fathom-neural-compute-stick,31694.html)

------
davesque
Wouldn't this be limited to something like 10Gb/s throughput? That's not much
compared to a GPU bus. Cool idea though. I wish this press release wasn't so
light on specs.

------
manav
It looks like they have developed an ASIC for ML. The obvious use case would
be mobile devices, where performance/power and latency (versus going to the
cloud) are concerns.

------
eggy
Can you connect it to a phone with a micro-B USB adapter, and then use it to
run pre-trained networks with your phone's camera, like I imagine it is used
in the DJI Phantom 4? I know the USB to the camera and onboard CPU will not be
the same as the DJI's bus, but it would make for an interesting mobile testing
platform for me to learn on.

~~~
NKCSS
Use a Surface tablet?

------
loeg
This page absolutely wrecks Firefox 45.0.2. UI thread gets blocked
indefinitely and spins a core. :-(

------
dharma1
This collab with FLIR for thermal cameras is interesting too. The video gives
a bit more info about the Myriad 2:

[https://www.youtube.com/watch?v=hsopAM8FexE](https://www.youtube.com/watch?v=hsopAM8FexE)

------
veeragoni
How about having this kind of extra processor come in a mobile phone,
improving regular camera vision, all the health sensors, and everything else
we can do with mobile phones?

~~~
blakes
Someday they will be, I'm sure. Probably as part of the SoC.

But not yet; the software isn't ready for it.

~~~
jklontz
Seems like the perfect opportunity to insert a plug for my project:
[http://liblikely.org/](http://liblikely.org/)

------
pboutros
Betting on local instead of cloud is always an interesting gamble.

Pros:

- Security

- Control

Cons:

- Resource limitations

- (...)

~~~
skykooler
Local is also useful in situations with limited or no internet access. Say
you're trying to do deep recognition on a live video feed: many places where
this would be useful simply do not have the bandwidth available to stream
video.

~~~
teddyknox
Also useful in latency-sensitive applications, such as the drone flight demo
they use in their video.

------
brian_herman
How powerful is it?

~~~
LogicFailsMe
Not very...

At 15 inferences per second in fp16 for GoogLeNet, I'd guesstimate 50-60
GFLOPs. That would give it very roughly 2x perf/W over the Titan X.

~~~
exDM69
Given that figure and the Titan X at 250W TDP would put it somewhere around
1/125th of the performance. Which is probably more than good enough for
inference given a network but you still need something beefy for the training.
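
Running the arithmetic behind those two estimates (using my own assumed figures of ~3.2 GFLOPs per GoogLeNet forward pass and ~6.1 TFLOPs / 250 W for the Titan X, neither of which is from the article):

```python
# Back-of-the-envelope check of the guesstimates above. Assumed figures:
#   GoogLeNet forward pass: ~3.2 GFLOPs (multiplies and adds counted separately)
#   Titan X: ~6.1 TFLOPs peak at ~250 W TDP; Fathom: ~1 W
googlenet_gflops = 3.2
fathom_gflops_s = 15 * googlenet_gflops          # 15 inferences/s -> ~48 GFLOPs/s
titanx_gflops_s = 6100.0

perf_ratio = titanx_gflops_s / fathom_gflops_s                       # ~127x
perf_per_watt = (fathom_gflops_s / 1.0) / (titanx_gflops_s / 250.0)  # ~2x

print(f"Fathom: ~{fathom_gflops_s:.0f} GFLOPs/s, "
      f"{perf_ratio:.0f}x slower, {perf_per_watt:.1f}x better perf/W")
```

With these assumptions the numbers land close to both the "1/125th of the performance" and "2x perf/W" figures quoted above.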

It's still pretty interesting, though, since you only need to do the training
once.

------
mirekrusin
I almost closed the tab with this page, the classifier in my head marked it as
an ad.

------
pietrasagh
Looks like fraud.

~~~
castis
Hey, I'll bite. Why do you think that?

~~~
tP5n
Their product is based on a buzzword and uses unknown hardware; the specs
seem underwhelming (1 W + USB = ?). Their page is filled with videos of models
and nature shots, and a bunch of news categories that have a few articles from
2013/14. There's no way to order anything, yet they put out some wild claims:

'With Fathom, every robot, big and small, can now have state-of-the-art vision
capabilities'

'It means the same level of surprise and delight we saw at the beginning of
the smartphone revolution'

'With more than 1 million units of Myriad 2 already ordered'

tl;dr: because, to an outsider, it sure does look like fraud.

~~~
lawlessone
Their hardware is integrated into many devices, like Google's Tango.

------
intrasight
Just saying - I'm digging the idea of a USB stick that turns my laptop into an
artificial intelligence.

------
nxzero
Thought TensorFlow already did this; how is this USB stick different?

It seems like the more logical approach would be a widget that app developers
could use to easily deploy embedded TensorFlow builds on Android and iPhone.
Has anyone looked into doing this or found someone already doing it?

~~~
danvoell
My understanding is that it allows you to take your TensorFlow model, transfer
it to the USB stick, and plug it into (interface with) your third-party
hardware (other than your phone), which isn't connected to the cloud.

------
tacos
This is a dev board, not a consumer product. And (contrary to the title) the
press release explicitly says that it is not intended for "deep learning."

Acceleration is needed for training -- not running the models themselves. A
quick glimpse of the power used (1 watt) lets you know exactly how much
"acceleration" is going on in here. This is meant for tiny devices.

EDIT: My point is that this is a small-run dev board for a chip for some
future $19 nannycam. It's not an "accelerator" you install on your PC to put
your graphics card to shame running TensorFlow.

EDIT #2: This is another one of those HN threads that's overrun by
enthusiasts. Jamming a chip onto a stick is simply how they sell embedded crap
now.

Here's a crypto chip that'll really get you guys going:
[http://www.atmel.com/tools/AT88CK590.aspx](http://www.atmel.com/tools/AT88CK590.aspx)

~~~
rm999
>Acceleration is needed for training -- not running the models themselves

This isn't true. Running neural networks (including CNNs) can be
computationally and power intensive, and lends itself to the vector operations
of GPUs, FPGAs, and ASICs. Putting the computations on dedicated hardware
could enable embedded applications that simply aren't possible otherwise.

Here's a whitepaper by Microsoft about using FPGAs to speed up CNNs:
[http://research.microsoft.com/apps/pubs/?id=240715](http://research.microsoft.com/apps/pubs/?id=240715)

Article by Google explaining the importance of optimizing neural networks to
run on mobile phones: [http://googleresearch.blogspot.com/2015/07/how-google-translate-squeezes-deep.html](http://googleresearch.blogspot.com/2015/07/how-google-translate-squeezes-deep.html)

~~~
zxv
Are there any good examples of FPGA implementations of CNNs?

I see one example in VHDL on GitHub: [https://github.com/ziyan/altera-de2-ann/blob/master/src/ann/ann.vhd](https://github.com/ziyan/altera-de2-ann/blob/master/src/ann/ann.vhd)

~~~
dharma1
RNN - [http://arxiv.org/abs/1511.05552v4](http://arxiv.org/abs/1511.05552v4)

