Deep Learning with PyTorch – An Unofficial Startup Guide (iamtrask.github.io)
84 points by williamtrask on Jan 15, 2017 | 15 comments



I really don't get why people are so scared of Lua; it's an absolutely simple language. I learned it when I was trying to implement the Fast Neural Style algorithm back when Justin Johnson hadn't released the source yet, and trying to learn Lua was the least of my problems. (I had an idea that starting from his Neural Style implementation would be a better way, and it proved to be true.) Plus, being able to read a bigger portion of released ML source code proved to be a much bigger help than learning a new language was a roadblock.

So I plead with anyone starting to learn ML: please do not shy away from new languages. It's really a much smaller effort than you imagine, and the rewards are much bigger than you expect them to be.


In the past, I have advocated learning Deep Learning using only a matrix library. For the purposes of actually knowing what goes on under the hood, I think that this is essential, and the lessons learned from building things from scratch are real gamechangers when it comes to the messiness of tackling real world problems with these tools.
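
For anyone wondering what "only a matrix library" looks like in practice, here's a minimal sketch (mine, not the article's) of a tiny two-layer network trained with plain NumPy on made-up toy data:

    import numpy as np

    # Toy problem (made up for illustration): learn XOR of the first two inputs.
    X = np.array([[0., 0., 1.],
                  [0., 1., 1.],
                  [1., 0., 1.],
                  [1., 1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    np.random.seed(1)
    W0 = 2 * np.random.random((3, 4)) - 1   # input -> hidden weights
    W1 = 2 * np.random.random((4, 1)) - 1   # hidden -> output weights

    for _ in range(10000):
        # Forward pass
        hidden = sigmoid(X.dot(W0))
        output = sigmoid(hidden.dot(W1))

        # Backward pass: chain rule written out by hand
        output_delta = (y - output) * output * (1 - output)
        hidden_delta = output_delta.dot(W1.T) * hidden * (1 - hidden)

        # Weight updates (plain gradient descent on squared error)
        W1 += hidden.T.dot(output_delta)
        W0 += X.T.dot(hidden_delta)

    print(output.round(2))  # should approach [[0], [1], [1], [0]]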

It's interesting. The fast.ai MOOC takes the complete opposite approach:

We have spent as much time studying the research into effective education techniques as we have studying the research into deep learning—one of the biggest differences that you'll see as a result is that we teach "top down" rather than "bottom up". For instance, you'll learn how to use deep learning to solve your problems in week 1, but will only start to learn why it works in week 2! And you'll spend a lot more time learning how to write effective code and use effective processes than you will on learning mathematical formalisms.[1]

Personally, I much prefer the fast.ai approach, which I'd characterize as "develop a shallow understanding, then go deep". Either way, one thing is for sure: the claim that lessons learned from building things from scratch are real gamechangers when it comes to the messiness of tackling real-world problems with these tools doesn't bear out in reality. The fast.ai course is much, much more focused on performance in real-world situations than any other course I've looked at.

[1] http://course.fast.ai/about.html


The full quote is:

In the past, I have advocated learning Deep Learning using only a matrix library. For the purposes of actually knowing what goes on under the hood, I think that this is essential, and the lessons learned from building things from scratch are real gamechangers when it comes to the messiness of tackling real world problems with these tools. However, when building neural networks in the wild (Kaggle Competitions, Production Systems, and Research Experiments), it's best to use a framework.

Which makes a bit more sense in context. Certainly I think taking CS231N has put me at an advantage going through fast.ai because the quality of the notes (and discussion on backprop, how neural nets work, etc.) is significantly better. Andrej Karpathy and Justin Johnson have done an amazing job with the whole website.

The downside with CS231N is the lack of real-world practice if you want to work with image data. The implementation examples only focus on CIFAR-10. I think the two courses are definitely complementary, and if I'd only taken fast.ai I would be left a bit unsatisfied about knowing what I'd been doing when twiddling parameters.


I'd never tell anyone not to do CS231N!

However, my comment was about how much the theory helps in the real world (and I don't think my quote was misleading; I think the author still advocates learning theory before using the frameworks).

I think that the fast.ai approach of showing solutions, then explaining the theory is really good.

For example, in lesson 2 fast.ai shows how to finetune a VGG model to get (real!) state-of-the-art performance on a real-world, two-class image recognition problem by adding an additional dense layer[1].
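
For what it's worth, the general recipe (take a pretrained VGG, bolt a small new dense layer on top, train only that) looks roughly like this in PyTorch; this is my own sketch of the idea, not the fast.ai code, and the two-class setup is just an assumption matching the lesson's description:

    import torch.nn as nn
    import torch.optim as optim
    from torchvision import models

    # Load a VGG16 pretrained on ImageNet and freeze all of its existing weights.
    model = models.vgg16(pretrained=True)
    for param in model.parameters():
        param.requires_grad = False

    # Replace the final 1000-way ImageNet classifier with a new 2-class dense layer.
    # (The new layer's parameters are trainable by default.)
    model.classifier[6] = nn.Linear(4096, 2)

    # Train only the new layer on the two-class dataset.
    optimizer = optim.Adam(model.classifier[6].parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

From there it's an ordinary training loop over batches of the two-class images.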

In CS231N (which I haven't done, although I have glanced through the notes) this isn't really said anywhere. It's kind of implied in a lot of places, and if you study it you'll be able to work it out for yourself.

That isn't bad, but I'm unconvinced it gives the huge "real world" advantage, given the amount of time it takes to get to that point.

I strongly agree they are complementary though.

[1] http://wiki.fast.ai/index.php/Lesson_2_Notes#Adding_a_Dense_...


Sure :) Perhaps I should have made clear that I was referring to the fact that CS231N follows the recommendation in the blog post - i.e. lots of exercises where you implement things in plain NumPy - rather than making a broad "it's a better course" claim. CS231N is frustrating in the real-world respect, although you're given some general recipes, e.g. use ReLU, use Adam, etc.
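
(For concreteness, those two recipes amount to something like this in PyTorch; the layer sizes are arbitrary and this is just my illustration, not anything from CS231N:)

    import torch.nn as nn
    import torch.optim as optim

    # "Use ReLU": ReLU nonlinearities between the dense layers.
    model = nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    )

    # "Use Adam": an adaptive-learning-rate optimizer that tends to work out of the box.
    optimizer = optim.Adam(model.parameters(), lr=1e-3)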

That said, I'd be interested to see some actual results from people taking what they learned in fast.ai (with no prior knowledge) and applying it to a new test set. It's easy to be impressed with the examples where you're walked through it, but figuring out what works on your own data is a whole new kettle of fish!


I enjoyed this tutorial as a first step into the world of deep learning frameworks. For context, I recently finished the Machine Learning course on Coursera.

I liked the parallel constructions of the neural network and the transition from linear algebra to framework. I really appreciate the ease of use of PyTorch, which pushed me over the edge into actually doing something useful with deep learning.

I managed to get through this tutorial and make a submission to the Kaggle digit recognizer competition in the span of a few hours. I'm excited to figure out how to train a model more efficiently, which seems to be the difficult problem of choosing network hyperparameters.
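
In case it's useful to anyone else, the submission step itself is only a few lines; a rough sketch under some assumptions (a trained PyTorch model taking flattened 28x28 inputs, and the competition's usual test.csv / ImageId,Label format - double-check on Kaggle):

    import numpy as np
    import pandas as pd
    import torch

    # Assumes `model` is an already-trained PyTorch classifier taking flat 784-pixel rows
    # and that the competition's test.csv sits in the working directory.
    test = pd.read_csv('test.csv').values.astype(np.float32) / 255.0
    with torch.no_grad():
        preds = model(torch.from_numpy(test)).argmax(dim=1).numpy()

    # The digit recognizer competition expects ImageId (1-based) and Label columns.
    pd.DataFrame({'ImageId': np.arange(1, len(preds) + 1),
                  'Label': preds}).to_csv('submission.csv', index=False)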


Man, that's awesome! Thank you so much for telling me. As a heads up, there's a completely revamped version of PyTorch being released soon (possibly today). You can see some documentation for it here: http://pytorch.org/docs/


Is Keras actually sponsored by Google, as the article suggests? I know Chollet works at Google, but the fact that the creator of X works at Y doesn't make X "sponsored by Y".


He mentioned a few times recently that Keras will get pulled into TensorFlow.

https://twitter.com/fchollet/status/820746845068505088


Thanks, I had missed that tweet


I was googling that and it seems Chollet also confirmed that in a Reddit comment:

https://www.reddit.com/r/MachineLearning/comments/5jg7b8/p_d...

Some background in this article currently featured in r/MachineLearning: https://www.reddit.com/r/MachineLearning/comments/5o9vfx/d_k...


If you maintain a large open source project during work hours then that arguably is "sponsorship". Looking at the Keras commit log it seems he pushes frequently.


Thanks, I was assuming it was a side project for him.


Will this mean Torch will also be added to the backends supported by Keras, alongside the already-supported TensorFlow and Theano? That would be a great idea.


Anyone using PyTorch in a production setting in a startup environment? If so, for what purpose?



