
NNVM Compiler: A New Open End-To-End Compiler for AI Frameworks - ydereky
https://aws.amazon.com/blogs/ai/introducing-nnvm-compiler-a-new-open-end-to-end-compiler-for-ai-frameworks/
======
apendleton
This seems really neat. I haven't really dug in yet, but a couple of things
I'd be curious about are whether the OpenCL backend facilitates
GPU-accelerated deep learning work (either training or inference) on
non-NVIDIA GPUs, which most current frameworks support poorly out of the box,
and whether it eases deployment for inference in environments where large,
complex dependencies are a pain, like Lambda.
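For what it's worth, targeting OpenCL looks like a one-line change at compile
time. A minimal sketch using the NNVM Python API from the announcement (the
model, input name, and shapes here are placeholder assumptions):

    # Compile an imported model for a non-NVIDIA GPU via the OpenCL backend.
    import nnvm.frontend
    import nnvm.compiler

    # `mxnet_model` is a stand-in for whatever framework model you start from.
    sym, params = nnvm.frontend.from_mxnet(mxnet_model)
    graph, lib, params = nnvm.compiler.build(
        sym, target="opencl",
        shape={"data": (1, 3, 224, 224)}, params=params)

    # For dependency-light deployment (e.g. Lambda), the compiled module can
    # be exported as a single shared library and run with just the TVM runtime.
    lib.export_library("model.so")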

------
loser777
The Raspberry Pi results are really cool to see. Sure, for a highly tuned
library like cuDNN the existing operators might be close to as fast as you can
get, but it's unlikely that every platform where it may be interesting to
deploy a deep learning model can get the same amount of attention. I hope
these results mean that in many cases we can get highly optimized
implementations without exhaustive manual effort.
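For the Pi case specifically, the flow appears to be ordinary cross-
compilation. A sketch, assuming the NNVM/TVM Python API and a guessed LLVM
target triple (adjust per board):

    # Cross-compile a model for an ARM CPU (Raspberry Pi class).
    import nnvm.frontend
    import nnvm.compiler

    sym, params = nnvm.frontend.from_mxnet(mxnet_model)  # placeholder model
    target = "llvm -target=armv7l-linux-gnueabihf"        # assumed triple
    graph, lib, params = nnvm.compiler.build(
        sym, target=target,
        shape={"data": (1, 3, 224, 224)}, params=params)

    # A .tar export packs the object files without needing a local ARM linker;
    # copy it to the Pi and load it with the lightweight TVM runtime there.
    lib.export_library("net.tar")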

~~~
dr_zoidberg
Up until recently at work I had to use Keras on CPU only, and the easiest way
to get things working was the TensorFlow backend. Earlier this year the 1.3
update gave me a roughly 15% performance increase just from

    pip install tensorflow --upgrade

Fortunately I got a new machine with a cuDNN-capable GPU, but I'll be testing
the NNVM backend for Keras when it's working. We might get to squeeze some
epochs out of my old machine, now in the hands of a secretary who won't really
notice we're training a net while he replies to some emails.
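If the Keras path ends up mirroring the existing frontends, usage would
presumably look something like the sketch below; `from_keras`, the input name,
and the NCHW shape are assumptions on my part:

    # Hypothetical: import a Keras model into NNVM and compile it for CPU.
    from keras.applications.mobilenet import MobileNet
    import nnvm.frontend
    import nnvm.compiler

    model = MobileNet(weights="imagenet")
    sym, params = nnvm.frontend.from_keras(model)  # assumed frontend entry point
    graph, lib, params = nnvm.compiler.build(
        sym, target="llvm",
        shape={"data": (1, 3, 224, 224)}, params=params)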

------
modeless
Huh, so it's a fork of Halide that replaces the frontend with a set of
adapters for various neural net frameworks. Halide is super cool and really
ought to be better known. I wonder if they tried to collaborate with the
Halide people at all or if they're just doing their own thing?
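For anyone who hasn't seen it, the core Halide idea that carries over is
separating the algorithm (what to compute) from the schedule (how to loop,
tile, vectorize). A minimal sketch with TVM's Python API, which is the layer
NNVM lowers graphs into:

    # Halide-style split: define the computation, then schedule it separately.
    import tvm

    n = tvm.var("n")
    A = tvm.placeholder((n,), name="A")
    B = tvm.compute((n,), lambda i: A[i] * 2, name="B")  # the algorithm

    s = tvm.create_schedule(B.op)                        # the schedule
    xo, xi = s[B].split(B.op.axis[0], factor=64)
    s[B].vectorize(xi)                                   # one of many knobs

    f = tvm.build(s, [A, B], target="llvm")

The same algorithm can then be re-scheduled per target without rewriting it,
which is exactly the property you want when fanning out to CUDA/OpenCL/ARM.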

~~~
_abadams_
Calling it a fork isn't really fair. It just reuses bits and pieces from
Halide where it made sense for them to do so, and it's appropriately credited.
We're happy for people to build on our stuff.

- One of the Halide people.

------
alfalfasprout
This is going to be a really useful tool for us-- production ML systems are
incredibly difficult to build when you need to support a bunch of different
frameworks. ONNX is nice as a standardized interchange format... but being
able to recompile a model into another final target would be amazing.
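Concretely, the ONNX-in, compiled-artifact-out path would presumably look
something like this sketch (assuming nnvm's ONNX frontend; the file and input
names are placeholders):

    # Recompile an ONNX model into one self-contained deployable library.
    import onnx
    import nnvm.frontend
    import nnvm.compiler

    onnx_model = onnx.load("model.onnx")
    sym, params = nnvm.frontend.from_onnx(onnx_model)
    graph, lib, params = nnvm.compiler.build(
        sym, target="llvm",  # or "cuda", "opencl", ...
        shape={"input_0": (1, 3, 224, 224)},  # input name depends on the graph
        params=params)
    lib.export_library("model.so")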

------
tree_of_item
How feasible is CPU-only deep learning? I keep hearing about outrageous
training times on the order of weeks even with something like 4-8 GPUs; is
anyone actually using CPUs instead?

~~~
Mr_P
> I keep hearing about outrageous training times for GPUs on the order of
> weeks with something like 4-8 GPUs

This is for training a competitive model from scratch on a fundamental problem
like image recognition. If you don't care about the last 1-2%, it's possible
to train _a_ useful model in a few hours (but still on a GPU).

> is anyone actually using CPUs instead

There are useful things you can do without a GPU. For example, "Transfer
learning", which can be as simple as chopping off the last layer of someone
else's GPU-trained model and substituting your own, can be done on a CPU in
reasonable time. This is because you typically need less data and because far
fewer parameters need to be fit.

For example, HBO's "Hotdog/Not-hotdog" app was done this way. See their
description of "V1" here: [https://medium.com/@timanglade/how-hbos-silicon-valley-built...](https://medium.com/@timanglade/how-hbos-silicon-valley-built-not-)
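To make the recipe concrete, here's a minimal Keras-style sketch of the "chop
off the last layer" approach; the base network, class count, and settings are
arbitrary placeholders:

    # Transfer learning: freeze a pre-trained feature extractor,
    # train only a small new classification head (CPU-feasible).
    from keras.applications.mobilenet import MobileNet
    from keras.layers import Dense, GlobalAveragePooling2D
    from keras.models import Model

    base = MobileNet(weights="imagenet", include_top=False,
                     input_shape=(224, 224, 3))
    for layer in base.layers:
        layer.trainable = False          # reuse GPU-trained weights as-is

    x = GlobalAveragePooling2D()(base.output)
    out = Dense(2, activation="softmax")(x)   # e.g. hotdog / not hotdog
    model = Model(base.input, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    # model.fit(...) on a modest labeled set is the only training needed.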

------
outlace
Does this mean I can finally train deep learning models on my laptop's Intel
GPU?

