
Deep Learning with PyTorch – An Unofficial Startup Guide - williamtrask
https://iamtrask.github.io/2017/01/15/pytorch-tutorial/
======
ulucs
I really don't get why people are so scared of Lua; it's an absolutely simple
language. I learned it when I was trying to implement the Fast Neural Style
algorithm back when Justin Johnson hadn't released the source yet, and trying
to learn Lua was the least of my problems. (I had an idea that starting from
his Neural Style implementation would be a better way, and it proved to be
true.) Plus, being able to read a bigger portion of the released ML source code
proved to be a much bigger help than learning a new language was a roadblock.

So I plead with anyone starting to learn ML: please do not shy away from new
languages. It's really a much smaller effort than you imagine, and the rewards
are much bigger than you expect them to be.

------
nl
_In the past, I have advocated learning Deep Learning using only a matrix
library. For the purposes of actually knowing what goes on under the hood, I
think that this is essential, and the lessons learned from building things
from scratch are real gamechangers when it comes to the messiness of tackling
real world problems with these tools._

It's interesting. The fast.ai MOOC takes the complete opposite approach:

 _We have spent as much time studying the research into effective education
techniques as we have studying the research into deep learning—one of the
biggest differences that you'll see as a result is that we teach "top down"
rather than "bottom up". For instance, you'll learn how to use deep learning
to solve your problems in week 1, but will only start to learn why it works in
week 2! And you'll spend a lot more time learning how to write effective code
and use effective processes than you will on learning mathematical
formalisms._[1]

Personally, I much prefer the fast.ai approach, which I'd characterize as
"develop a shallow understanding, and then go deep". Either way, one thing is
for sure: the claim that _lessons learned from building
things from scratch are real gamechangers when it comes to the messiness of
tackling real world problems with these tools_ doesn't bear out in reality.
The fast.ai course is much, much more focused on performance in real world
situations than any other course I've looked at.

[1] [http://course.fast.ai/about.html](http://course.fast.ai/about.html)

~~~
joshvm
The full quote is:

 _In the past, I have advocated learning Deep Learning using only a matrix
library. For the purposes of actually knowing what goes on under the hood, I
think that this is essential, and the lessons learned from building things
from scratch are real gamechangers when it comes to the messiness of tackling
real world problems with these tools. However, when building neural networks
in the wild (Kaggle Competitions, Production Systems, and Research
Experiments), it's best to use a framework._

Which makes a bit more sense in context. Certainly I think taking CS231N has
put me at an advantage going through fast.ai because the quality of the notes
(and discussion on backprop, how neural nets work, etc.) is significantly
better. Andrej Karpathy and Justin Johnson have done an amazing job with the
whole website.

The downside of CS231N is the lack of truly real-world practice if you
want to work with image data; the implementation examples only focus on
CIFAR-10. I think the two courses are definitely complementary, and if
I'd _only_ taken fast.ai I would have been left a bit unsatisfied, not knowing
what I'd been doing when twiddling parameters.

~~~
nl
I'd never tell anyone not to do CS231N!

However, my comment was about how much the theory helps in the real world (and
I don't think my quote was misleading; the author still advocates
learning theory before using the frameworks).

I think that the fast.ai approach of showing solutions, then explaining the
theory is really good.

For example, in lesson 2 fast.ai shows how to fine-tune a VGG model to get
(real!) state-of-the-art performance on a real-world, two-class image
recognition problem by adding an additional dense layer[1].
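
To make the idea concrete, here is a rough sketch in PyTorch (the course
itself does the equivalent in Keras, and the 4096 width below is just
torchvision's VGG16 default): load a pretrained VGG16, freeze its weights,
and swap the final ImageNet classifier layer for a new two-way dense layer.

    import torch
    import torch.nn as nn
    from torchvision import models

    vgg = models.vgg16(pretrained=True)

    # Freeze all the pretrained weights; only the new layer will train.
    for p in vgg.parameters():
        p.requires_grad = False

    # Replace the final 1000-way ImageNet layer with a 2-way dense layer
    # (e.g. cats vs. dogs).
    vgg.classifier[6] = nn.Linear(4096, 2)

    optimizer = torch.optim.Adam(vgg.classifier[6].parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()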

In CS231N (which I haven't done, although I have glanced through the notes)
this isn't really said anywhere. It's kind of implied in a lot of places, and
if you study it you'll be able to work it out for yourself.

That isn't bad, but I'm unconvinced it gives you the huge "real world"
advantage, given the amount of time it takes to get to that point.

I strongly agree they are complementary, though.

[1]
[http://wiki.fast.ai/index.php/Lesson_2_Notes#Adding_a_Dense_...](http://wiki.fast.ai/index.php/Lesson_2_Notes#Adding_a_Dense_Layer)

~~~
joshvm
Sure :) Perhaps I should have made clear that I was referring to the fact that
CS231N follows the recommendation in the blog post - i.e. lots of exercises
where you implement things in primitive NumPy - rather than making a broad
"it's a better course" claim. CS231N is frustrating in that real-world respect,
although you're given some general recipes, e.g. use ReLU, use Adam, etc.
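
For anyone who hasn't seen those exercises, a rough sketch of the style (not
an actual assignment; the sizes and data here are made up): a two-layer
network with a ReLU, forward and backward pass written out by hand in NumPy.

    import numpy as np

    X = np.random.randn(64, 100)          # 64 examples, 100 features
    y = np.random.randn(64, 10)           # 10 regression targets
    W1 = np.random.randn(100, 50) * 0.01  # first layer weights
    W2 = np.random.randn(50, 10) * 0.01   # second layer weights

    for step in range(200):
        # Forward pass: linear -> ReLU -> linear, squared-error loss.
        h = np.maximum(0, X @ W1)
        pred = h @ W2
        loss = np.square(pred - y).sum()

        # Backward pass: the chain rule applied by hand, layer by layer.
        dpred = 2.0 * (pred - y)
        dW2 = h.T @ dpred
        dh = dpred @ W2.T
        dh[h <= 0] = 0                    # gradient through the ReLU
        dW1 = X.T @ dh

        # Plain gradient descent update.
        W1 -= 1e-4 * dW1
        W2 -= 1e-4 * dW2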

That said, I'd be interested to see some actual results from people taking
what they learned in fast.ai (with no prior knowledge) and applying it to a new
test set. It's easy to be impressed with the examples where you're walked
through it, but figuring out what works on your data is a whole new kettle of
fish!

------
opaque_salmon
I enjoyed this tutorial as a first step into the world of deep learning
frameworks. For context, I recently finished the Machine Learning course on
Coursera.

I liked the parallel construction of the neural network and the transition
from linear algebra to the framework. I really appreciate the ease of use of
PyTorch, which pushed me over the edge into actually doing something useful
with deep learning.

I managed to get through this tutorial and make a submission to the Kaggle
digit recognizer competition in the span of a few hours. I'm excited to figure
out how to train a model more efficiently, which seems to be the difficult
problem of choosing network hyperparameters.
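
For what it's worth, the simplest place to start on that problem is a plain
grid search. A minimal sketch (the model, data, ranges, and the
train_and_evaluate helper are all made up for illustration, and a real search
would score on a held-out validation set rather than the training data):

    import itertools
    import torch
    import torch.nn as nn

    # Fake flattened 28x28 digit images and labels, for illustration only.
    X = torch.randn(512, 784)
    y = torch.randint(0, 10, (512,))

    def train_and_evaluate(lr, hidden):
        # Hypothetical helper: train a tiny MLP for a few steps, return accuracy.
        model = nn.Sequential(nn.Linear(784, hidden), nn.ReLU(),
                              nn.Linear(hidden, 10))
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(50):
            opt.zero_grad()
            loss_fn(model(X), y).backward()
            opt.step()
        return (model(X).argmax(dim=1) == y).float().mean().item()

    # Try every combination and keep the best-scoring one.
    best = max(itertools.product([1e-2, 1e-3, 1e-4], [64, 256]),
               key=lambda cfg: train_and_evaluate(*cfg))
    print("best (lr, hidden):", best)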

~~~
williamtrask
Man that's awesome! Thank you so much for telling me. As a heads up, there's a
completely revamped version of PyTorch being released soon (possibly today).
You can see some documentation for it here:
[http://pytorch.org/docs/](http://pytorch.org/docs/)

------
mastazi
Is Keras actually sponsored by Google, as the article suggests? I know Chollet
works at Google, but the fact that the creator of X works at Y doesn't make X
"sponsored by Y".

~~~
m_ke
He mentioned a few times recently that Keras will get pulled into TensorFlow.

[https://twitter.com/fchollet/status/820746845068505088](https://twitter.com/fchollet/status/820746845068505088)

~~~
mastazi
Thanks, I had missed that tweet.

~~~
mastazi
I was googling that and it seems Chollet also confirmed that in a Reddit
comment:

[https://www.reddit.com/r/MachineLearning/comments/5jg7b8/p_d...](https://www.reddit.com/r/MachineLearning/comments/5jg7b8/p_deep_learning_for_coders18_hours_of_lessons_for/dbhaizx/)

Some background is in this thread, currently featured on r/MachineLearning:
[https://www.reddit.com/r/MachineLearning/comments/5o9vfx/d_k...](https://www.reddit.com/r/MachineLearning/comments/5o9vfx/d_keras_will_be_added_to_core_tensorflow_at_google/)

------
matrix2596
Will this mean Torch will also be added to the backends supported by Keras,
alongside the already-supported TensorFlow and Theano? That would be a great
idea.

------
mariusb_
Anyone using PyTorch in a production setting in a startup environment? If so,
for which purpose?

