Learn TensorFlow and deep learning, without a Ph.D. (cloud.google.com)
720 points by ShanaM on Jan 23, 2017 | hide | past | favorite | 43 comments

I spent some time learning the high level concepts first—which I found to be a very useful initial orientation—but recently I've wanted to solidify my foundations and learn the math properly.

I've found the math primer (two sections: "Linear Algebra" and "Probability and Information Theory") in this free book to be excellent so far: http://www.deeplearningbook.org/ It's a little under 50 pages for both sections.

I've seen the basics of linear algebra covered in many different places, and I think this is the most insightful yet concise intro I've come across. I haven't started the probability section yet, so I can't comment on it.

I have been doing the same thing. I've also augmented each section with video lectures from Khan Academy or other sources. For instance, their videos on the Jacobian were excellent for getting an intuitive understanding of it [1].

I also search for problems on the topic to help solidify my knowledge. You can almost always find a class that has posted problems for a section with answers.

[1] https://www.khanacademy.org/math/multivariable-calculus/mult...

There seem to be a few of these "learn deep learning without heavy math" courses popping up, for which I'm profoundly grateful.

Have people found any of them to be particularly outstanding? I'd be interested - and I suspect many HN readers would be - to hear recommendations.

I've been really enjoying http://course.fast.ai/

Within a few hours of starting the course you'll have submitted an entry to the Kaggle Dogs vs. Cats competition that scores in the top 50% of entries and achieves 97% accuracy. It's designed for coders who don't have a Ph.D. in math. It takes a very top-down approach, where you only get into mathematical details once you understand the high-level models being used.

I find the lectures from Jeremy Howard [1] very helpful. He points out what seems to work and what doesn't and explains things very well. I'm up to lecture 4 and enjoyed them all.

I also like Andrej Karpathy's thorough explanation of backprop in his Lecture 4 cs231n video. The links to the videos have been removed from the course page [2] for some reason; just google "cs231n video" and you will find the YouTube links. The page on CNNs is pretty good.

[1] http://course.fast.ai/

[2] http://cs231n.github.io/

Andrew Ng's Machine Learning on Coursera is good for building some solid foundations in ML in general. Keep in mind it's not limited to neural networks. The math is kinda light, you either seek to understand it or trust that it works and focus on the ML principles.

> The math is kinda light, you either seek to understand it or trust that it works and focus on the ML principles.

This is accurate and also why I dropped out pretty quickly.

What I'd like is a Learn TensorFlow and deep learning with a mathematics/physics background.

I have a background in mathematics and physics. In this course, I try to give a physical explanation whenever possible to give people good intuitions about what is happening. Level of math required: knowing how a matrix multiply works (and it gets re-explained).

I feel you. Try Nando de Freitas's course; I liked it a lot.

Thank you for the reference. I didn't know of it and will certainly have a look.

The Udacity / Google course by Vanhoucke is, I think, the most popular for learning TensorFlow proper - https://www.udacity.com/course/deep-learning--ud730

Definitely recommended. It's the best course available in a 3-month format.

They say 3 months? I would have thought each of the 4 modules was 1-2 full days if you want to blast through it, maybe 8 weeks of Sundays if you do it that way. But you can definitely spend a lot of time on them and on the TensorFlow docs.

LAFF or the Andrew Ng Machine Learning courses are true semester courses, but I'm not actually sure this one is.

There is also a renormalization-group theoretical approach.

Coming from a programmer background with very little knowledge of mathematics, I found Keras helped me create some quite efficient CNNs without all the code TensorFlow requires.

This simplicity implies limits for advanced users, but the tool is fantastic for grasping deep learning.
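Under the hood, the convolution layers that Keras wraps up are just sliding dot products. A toy numpy sketch of a single 2D convolution (the kernel and image here are made up for illustration) shows what each layer computes:

```python
import numpy as np

def conv2d(img, kernel):
    # "Valid"-mode 2D convolution (really cross-correlation, as CNNs use it):
    # slide the kernel over the image and take a dot product at each position.
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

edge = np.array([[1.0, -1.0]])          # toy horizontal edge detector
img = np.array([[0.0, 0.0, 1.0, 1.0],
                [0.0, 0.0, 1.0, 1.0]])
result = conv2d(img, edge)
print(result)  # fires (-1) exactly where the image jumps from 0 to 1
```

A real CNN layer just does this with many kernels at once, learning the kernel values instead of hand-picking them.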

I watched 15 minutes of the first video.

"To help more developers embrace deep-learning techniques, without the need to earn a Ph.D". Oh, good, I can do this.

"These fundamental concepts are taken for granted by many, if not most, authors of online educational resources about deep learning". Yup, true with this one as well.

A lot of the concepts in this talk are introduced very simply in this blog post, which creates a working neural network in 9 lines of Python code: https://medium.com/technology-invention-and-more/how-to-buil...

Of course, 9 lines is a little dense, even with numpy. In practice, I got more understanding out of the slightly longer version that clocks in at 74 lines including comments and empty lines. This is an enormously simple neural network: a single layer with just 3 neurons. My son described its intelligence as being less than a cockroach after it'd been stepped on.

It works though. It's able to accurately guess the correct response for the trivial pattern it's given. You can follow the logic through so you understand each simple step in the process. In a follow up blog post, there's a slightly smarter neural network with a second layer and a mighty 9 neurons.

These examples are very approachable. It's about as simple a neural network as you can get. If you're new to machine learning, understanding how it works helps illuminate the more sophisticated networks described in Martin Görner's presentation.
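For anyone curious, the core of such a single-layer network fits in a handful of numpy lines. This is my own reconstruction of the blog's setup, not its exact code: the target output happens to equal the first element of each input row.

```python
import numpy as np

# Trivial training pattern: the output equals the first input element.
X = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T

rng = np.random.default_rng(1)
w = 2 * rng.random((3, 1)) - 1  # one layer: 3 inputs -> 1 neuron

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for _ in range(10_000):
    out = sigmoid(X @ w)
    # Error scaled by the sigmoid's slope, pushed back onto the weights.
    w += X.T @ ((y - out) * out * (1 - out))

preds = np.round(sigmoid(X @ w)).ravel()
print(preds)  # matches y: [0. 1. 1. 0.]
```

After training, the largest weight sits on the first input, which is exactly the "logic" you can follow through by hand.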

Is an integral really that hard?

Not if you take it a little bit at a time.

A very little bit at a time - the amount of time between now and then, as now is about to become then.

Taking it from left to right is not that hard. From top to bottom (or bottom to top) is where the devil lies. :P

Perhaps somebody here can help me with a sideproject that I'm working on. I'm trying to figure out the topology of a neural network that is capable of detecting the location and orientation of a given object. Say, a wrench. I don't want to use heatmaps (e.g. [1]) because they give just the location of the object and not the orientation. So the problem is basically how to choose the output quantities and how to encode them. The x and y coordinates of the head of the wrench could be quantities, but how to encode them? Should I use multiple output neurons per coordinate? And encoding the orientation is a similar problem. Would it even make sense to decompose the output in this way? Thanks in advance!

PS: More generally, is there a guide that explains how to robustly encode real numbers as output of neurons? I've tried to search for it, but couldn't find it.

[1] https://github.com/heuritech/convnets-keras

There are algorithms for stuff just like that in OpenCV. Maybe you could find some inspiration or clarification by reading through the source code for those algorithms?

Yes, that's a good idea, thanks!

But I was hoping for a more scientific answer. Like how do researchers approach this problem typically? And is there a strong consensus in this area among researchers?

It seems like such a general problem.

Hm, there are problems in the field of computer vision that might help structure your cost/training algorithm; maybe pose estimation, the perspective-n-point problem, and point-set registration.
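For the encoding sub-question specifically, one trick I've seen (the names below are my own) is to regress normalized coordinates directly and encode the orientation as the unit vector (cos θ, sin θ), which avoids the wraparound discontinuity at 0/2π and decodes with atan2:

```python
import numpy as np

def encode_pose(x, y, theta, img_w, img_h):
    # Coordinates normalized to [0, 1]; angle as a unit vector so the
    # network never has to learn the 0 / 2*pi jump.
    return np.array([x / img_w, y / img_h, np.cos(theta), np.sin(theta)])

def decode_pose(output, img_w, img_h):
    nx, ny, c, s = output
    theta = np.arctan2(s, c)  # works even if (c, s) drifts off unit length
    return nx * img_w, ny * img_h, theta

enc = encode_pose(32.0, 48.0, 1.0, 64, 64)
x, y, theta = decode_pose(enc, 64, 64)
print(x, y, theta)  # ~ (32.0, 48.0, 1.0)
```

With that encoding you'd train the network with a plain regression loss on the four outputs rather than one neuron per discretized value.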

So, in other words, learn to slap layers'n'shit together. I thought that's what everyone is already doing in deep learning.

Nope, look at the videos. I try to give as much background information as possible within 3h. The goal, on the contrary, is to help you understand the basics so that you can build on a solid foundation rather than slappin' layers together! Not that slappin' layers'n'shit together is against my religion or anything though - sounds like fun actually :-)

This is probably the most effective 3 hours I have spent trying to get my head around TensorFlow (and NNs in general somewhat). Hell, I even get what CNNs and RNNs -are- now.

Fancy math is useful for explaining why it works.

But this sort of content is good for explaining to engineers -how- it works. Which is ultimately how I need to understand things before the why is interesting to me.

I just published the outline of this course as a Twitter moment: https://twitter.com/martin_gorner/status/823527027357655041

I've found Siraj Raval has a great YouTube channel [1] about these topics, also in the "without a Ph.D" style; he explains dense topics in a fun way (maybe not for everyone). He also has practical videos [2] for building things from scratch (in Python) to better understand the basic concepts.

[1] https://www.youtube.com/channel/UCWN3xxRkmTPmbKwht9FuE5A

[2] https://www.youtube.com/watch?v=h3l4qz76JhQ

BTW, worth mentioning that Martin (the author of this tutorial) will present it in person at Google Cloud NEXT [1] in SF on March 8.

[1] https://cloudnext.withgoogle.com/schedule#target=tensorflow-...

I watched a version of this course a few weeks ago and it has cleared up a lot of things for me. Martin doesn't waste time on basic concepts and covers a lot of ground in 3 hours. It's probably the tutorial I learned the most from so far.


I'm glad to see the shout out to http://colah.github.io/ (even if the linked post is described as "monster stories")

huh, totally unintentional! colah is a colleague. His blog posts are great and were part of the source material for building these sessions.

Any versions of this course planned for R Programmers?

Python is one of the simplest languages to learn. I do this course as a hands-on lab with people who discover Python programming at the same time as they discover neural networks. I ask them to read this "Python 3 in 15 min" primer beforehand and they are good to go: https://learnxinyminutes.com/docs/python3/

I appreciate the link. I haven't used it in a couple of years. It will be a good refresher and will look forward to the course.

Thx :)

Very few people use R for deep learning.

There's a TensorFlow R port, but it requires setting up a working version of TensorFlow in Python. And once you have that set up, all the documentation, error messages, etc. are for Python - so you might as well just use Python.

I understand and will try to reproduce in R what I learn in the course as I have an interest in R, too.

Thx for the info.

Without a Ph.D?! Really?

This is of course a matter of opinion, but the most complex piece of math is a matrix multiply, which I re-explain anyway. That's an end-of-high-school level of mathematics.
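For calibration, that matrix multiply is essentially the whole forward pass of one layer. A numpy sketch, with illustrative shapes of my own choosing (a batch of flattened 28x28 images, 10 digit classes):

```python
import numpy as np

# One dense layer over a batch of 4 flattened 28x28 images.
X = np.random.rand(4, 784)   # batch of inputs, one image per row
W = np.random.rand(784, 10)  # one weight column per output neuron
b = np.zeros(10)

logits = X @ W + b           # the matrix multiply the course re-explains
print(logits.shape)          # (4, 10): one score per image per class
```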
