
Introducing Pytorch for fast.ai - bryananderson
http://www.fast.ai/2017/09/08/introducing-pytorch-for-fastai/
======
aabajian
I get the motivation behind fast.ai's lessons, but I question whether it's the
right approach. Their goal is to make deep learning accessible to programmers
by reducing the mathematical background required. An analogous situation is
the engineer who uses Autodesk to design car parts: he or she may not need to
know the detailed implementation of the underlying 3D graphics to use the
software effectively.

The difference is that Autodesk relies on a mature, _deterministic_ technology
(3D graphics rendering). Deep learning is a stochastic process that depends on
the data and the model. The training code, and especially the framework hooks,
is the _least_ important part. The example they give is three lines of code to
train a cat vs. dog classifier. I've tried this classifier on a different
binary image classification task: livers with and without tumors. It didn't
work very well. There are lots of possible reasons: little variability between images,
grey-scale images, different resolutions, etc. You can tweak the network,
throw in more middle layers, try different kinds of layers, whatever, to get
better results. All of that is guesswork if you don't understand what the CNN
is doing at each stage. At this point in time you _do_ need a formal education
in linear algebra, calculus and statistics to investigate why a model
does/does not work. It's not enough to know how to use the libraries.
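
For concreteness, here is roughly what one round of that tweaking looks like in
PyTorch (a minimal sketch; the architecture and hyperparameters are
illustrative, and nothing here tells you _which_ change will actually help on
grayscale medical images):

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a pretrained ResNet and swap the head for 2 classes
# (tumor / no tumor). Whether the pretrained features transfer to
# grayscale medical images is exactly the question that needs theory.
model = models.resnet34(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)

# One common "tweak": freeze the pretrained body, train only the head.
for p in model.parameters():
    p.requires_grad = False
for p in model.fc.parameters():
    p.requires_grad = True

optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
```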

On the flipside, you _also_ need to know how to manipulate data and parse it
into the correct format. This generally requires a year or two of programming
practice in a good scripting language like Python. I will echo their thoughts
that Ian Goodfellow's Deep Learning Book is remarkably lacking in this area.
As a simple example, you cannot even use AlexNet without pre-processing your
images to 227x227 (or 224x224 for GoogLeNet). That's 10,000 images resized,
labeled and loaded into the model before training can take place.
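
For instance, a typical torchvision preprocessing pipeline looks something like
this (a sketch; the directory layout and normalization constants are the usual
ImageNet conventions, not anything from the book):

```python
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

# Resize and normalize every image before it ever reaches the model.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),  # 227x227 for the original AlexNet
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Assumes images are laid out as data/train/<label>/*.jpg
dataset = ImageFolder("data/train", transform=preprocess)
loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=4)
```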

tl;dr IMHO in terms of being a competent user of deep learning: mathematics >=
programming >>> knowing how to use a framework

~~~
jacquesm
How many people using 'jpeg' productively to create everyday applications do
you think understand the DCT?

~~~
JustFinishedBSG
I wasn't able to do a PhD with Stéphane Mallat, so now I can never use JPEG 2000 :(

~~~
jacquesm
You could petition Ingrid Daubechies as a means of last resort.

------
cs702
I love seeing this, and not just because I love PyTorch[1].

It's also because I believe it's in everyone's best interest to have more than
one widely used framework, rather than a single one controlled by a single
company (TensorFlow).

Also, I think fast.ai's approach to teaching deep learning is the right one
for the vast majority of developers: start with practical, immediately useful
know-how instead of theoretical underpinnings. People who want to delve
deeper, say, so they can develop innovative architectures, can always do so at
their own pace after taking fast.ai's course. There are a ton of _other_
online resources for learning subjects like linear algebra, multivariate
calculus, statistics, probabilistic graphical models, etc.

[1] Here's why I love PyTorch:
[https://news.ycombinator.com/item?id=14947076](https://news.ycombinator.com/item?id=14947076)

~~~
xanon
I do love PyTorch as well. I'm also somewhat confused, though: isn't PyTorch
controlled by a single company (Facebook) too?

~~~
apaszke
It's not controlled by Facebook in any way. It's true that a large part of the
core team works there, but development is public and guided by community needs
first.

------
IshKebab
I literally started this course yesterday. One of the annoying things is the
software setup. Yes they provide an AWS image but it's fairly expensive and I
already have a powerful GPU on my desktop. Unfortunately the Windows setup
instructions are long, complicated, and use out-of-date software. You have to
install a lot of different pieces of software, including Anaconda, which
apparently is yet _another_ package manager (seriously?).

It's not quite as bad as the JavaScript npm/gulp/bower/whatever insanity but
it's not too far off. Get it together ML people!

~~~
jacquesm
Anaconda is the thing that makes it a lot easier. Under Linux this is all a
lot simpler than under Windows.

------
edshiro
I am currently learning TensorFlow and Keras. I found the learning curve quite
steep with TensorFlow and the whole static computation graph thing puzzling at
first (I thought it was dynamic...). Now I am a lot more comfortable with
both.
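
For anyone else who was puzzled, the difference in a nutshell (a sketch,
assuming the TF 1.x API that was current at the time):

```python
import tensorflow as tf  # 1.x-style API
import torch

# TensorFlow (static): describe the graph first, then execute it in a session.
x = tf.placeholder(tf.float32, shape=[None, 3])
y = tf.reduce_sum(x * 2.0)
with tf.Session() as sess:
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))

# PyTorch (dynamic): the graph is built as ordinary Python executes.
a = torch.tensor([[1.0, 2.0, 3.0]], requires_grad=True)
b = (a * 2.0).sum()
b.backward()  # gradients are available immediately
print(b.item(), a.grad)
```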

I really welcome new libraries and frameworks that make deep learning more
accessible. My only fear is that the field becomes cluttered with a myriad of
frameworks like the JS world. It would add a lot of confusion and apprehension
for people entering the field IMHO since they would not know what to use and
where to start.

------
gcarvalho
The fact that I hadn't heard about PyTorch wrappers before always made me
feel like it had nailed the balance between expressiveness and
customizability.

Jeremy Howard also states that PyTorch is hard [1], which does not seem to be HN's
overall opinion. So I guess they found Keras was limited in customization and
PyTorch required some boilerplate for loading/processing data and training
loops, and this new framework tries to fill the gaps?

I'm looking forward to their follow-up posts.

[1]:
[https://twitter.com/jeremyphoward/status/906653539161694208](https://twitter.com/jeremyphoward/status/906653539161694208)

------
leesec
When can we expect this class? Also, Jeremy Howard recently commented that
they were redoing the first Practical Deep Learning for Coders class, is this
going to be the same course or will they be separate?

Have loved your material, thank you very much!

~~~
rdrey
The way I understood his tweet is that they are basically rewriting the first
course on top of PyTorch, so I'm guessing the content will be identical, just
with the notebooks now using PyTorch in the background.

------
eanzenberg
On Keras:

> On the other hand, it tends to make it harder to customize models, especially
> during training. More importantly, the static computation graph on the
> backend, along with Keras’ need for an extra compile() phase, means that it’s
> hard to customize a model’s behaviour once it’s built.

What does that mean, customize models during training?

Also, how are dynamic-graph architectures performing vs. models where the
architecture doesn't change? Are they winning competitions?

~~~
JustFinishedBSG
I imagine he means customizing the optimizer stage?
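
Could be. In PyTorch both the forward pass and the training loop are plain
Python, so both are easy to change on the fly. A rough sketch (the model and
numbers are made up):

```python
import torch
import torch.nn as nn

class DynamicNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(10, 10)
        self.head = nn.Linear(10, 2)

    def forward(self, x):
        # Data-dependent control flow: the depth varies per batch,
        # which a static, pre-compiled graph can't express directly.
        for _ in range(int(x.abs().mean().item() * 3) + 1):
            x = torch.relu(self.layer(x))
        return self.head(x)

model = DynamicNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
x = torch.randn(32, 10)
target = torch.randint(0, 2, (32,))

for step in range(100):
    if step == 50:
        # "Customizing the optimizer stage" mid-training: just mutate it.
        for group in optimizer.param_groups:
            group["lr"] = 0.01
    optimizer.zero_grad()
    loss = criterion(model(x), target)
    loss.backward()
    optimizer.step()
```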

------
naveen99
I like the fast.ai forums [http://forums.fast.ai](http://forums.fast.ai) also.

Would be nice to see use of cluttered MNIST instead of COCO as an initial toy
example for teaching: [https://github.com/kevinjliang/tf-Faster-RCNN/blob/master/README.md](https://github.com/kevinjliang/tf-Faster-RCNN/blob/master/README.md)

------
misophist
The dynamic graph makes a lot of sense to me for models that combine ideas not
traditionally explored in the NN mainstream, e.g. topology.

May I suggest "entelechy" (the realization of potential) as a candidate name
for the framework?

~~~
kjhughes
I respectfully request that you not use _entelechy_ for the name of the
framework.

Kenneth J Hughes (Engineer and Founder, Entelechy Corporation)

~~~
misophist
Good choice :) It's a cool name I have been using in my private projects for
more than a decade.

------
kusmi
And what's even better than PyTorch? Just Torch.

------
sagivo
> Much to our surprise, we also found that many models trained quite a lot
> faster on pytorch than they had on Tensorflow.

Would love to see some benchmarks for that claim.

~~~
ActsJuvenile
I benchmarked Keras+TF vs PyTorch CNNs back in May 2017:

1) Compilation speed for a jumbo CNN architecture: TensorFlow took 13+ minutes
to start training every time the network architecture was modified, while
PyTorch started training in just over 1 minute.

2) Memory footprint: I was able to fit a 30% larger batch size with PyTorch
than with TensorFlow on Titan X cards, with the exact same jumbo CNN
architecture.

Both frameworks have had major releases since May, so these metrics may well
have changed by now. Either way, I ended up adopting PyTorch for my project.
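
For anyone who wants to reproduce that kind of measurement, the PyTorch side of
a per-step timing looks roughly like this (a sketch; the model is a stand-in,
not the jumbo CNN, and CUDA calls are asynchronous, so synchronize before
reading the clock):

```python
import time
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.randn(32, 3, 224, 224, device="cuda")

model(x).sum().backward()  # warm-up so cuDNN autotuning isn't measured

torch.cuda.synchronize()
start = time.time()
for _ in range(10):
    optimizer.zero_grad()
    model(x).sum().backward()
    optimizer.step()
torch.cuda.synchronize()  # wait for the GPU before stopping the timer
print((time.time() - start) / 10, "seconds per step")
```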

------
0xbear
I like PyTorch better as well. Note, however, that one of its dependencies,
gloo, comes with the infamous PATENTS addendum. You won't be using it unless
you do distributed training, though.

~~~
apaszke
gloo is only one of the three currently supported backends. One can easily
switch to MPI and pick an implementation that comes with a license you're
comfortable with.
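
Selecting the backend is a one-liner at initialization (a minimal sketch;
assumes a PyTorch build with MPI support, launched via mpirun):

```python
import torch.distributed as dist

# Use the MPI backend instead of gloo; rank and world size
# are supplied by the MPI launcher.
dist.init_process_group(backend="mpi")
print(dist.get_rank(), "of", dist.get_world_size())
```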

