Date: 2016-12-01
I was studying machine learning from Andrew Ng's CS229 (the class videos are online; I think they date from 2008 or thereabouts). There is no way you can progress beyond lecture 2 (out of 20) without a solid probability background. A solid background in probability/statistics probably means a good first course in probability, or maybe the first five chapters of "Statistical Inference" by Casella and Berger. Similarly, for SVMs you need a solid background in linear algebra, and so on. You probably also need a background in linear optimization. Here are the recommendations by Prof. Michael Jordan: https://news.ycombinator.com/item?id=1055389
Not a lot of people want to dive in this deep. They have got things to do, and who cares about proofs anyway? The thinking goes like "Most of the mathematics is abstracted away by libraries like scikit-learn. Let's get shit done." Well, I think a lot of the competitive advantage of Google/Facebook in ML comes from staffing their engineering teams with people who have studied these things for years (through a PhD). Compare that to Flipkart's recommendations.
However, I don't think this problem is unique to ML/Data Science. It is equally bad in "Distributed systems". Let's use Docker, that's the future!
I think what Andrew Ng would say is that without a rigorous statistical background, you will be limited in your ability to use ML, and you will certainly be more liable to blow your foot off by using it improperly. That being said, in a subset of cases, you may be able to achieve non-trivial insights through the techniques he teaches in the course.
So how I would rephrase your assertion is that a hacker can probably get a lot more out of ML techniques if they are willing to learn the math underlying them.
> Oftentimes, you do need a solid theoretical/mathematical background. Most people seem to approach ML like they approach programming tools or libraries - learn just enough to get the job done and move on.
I've been coming across this on the HN front page and it's worrisome to an extent.
The free resources now available for learning machine learning/deep learning are plentiful and easy to comprehend (indeed, Andrew Ng's Coursera class is very good). And running ML code is even easier, with libraries like Tensorflow/Theano to abstract the ML gruntwork (and Keras to abstract the abstraction!)
I suspect that there may be a machine learning knowledge crash, where the basics are repeated endlessly but there is less unique, real-world application of the knowledge learned. I've seen many Internet testimonials saying how "I followed an online tutorial and now I can classify handwritten digits, AI is the future!" The meme that Kaggle competitions are a metric of practical ML skill encourages budding ML enthusiasts to look at minimizing log-loss or maximizing accuracy without considering time/cost tradeoffs, which doesn't reflect real-world constraints.
Unfortunately, many successful real-world applications of ML/DL are not covered in tutorials because they are trade secrets (this is also the case with "big data" literature, to my frustration). OpenAI is a good step toward transparency in the field, but that won't stop the ML-trivializing "this program can play Pong, AI is the future!" thought pieces (https://news.ycombinator.com/item?id=13256962).
- Too little labeled data, or garbage labels (70000 digits? How about only 1000 objects of a complex class?)
- Obscure bugs in a custom implementation (yeah, my custom layer works and the gradient is correct... or wait, why does it diverge after 10k iterations? hmm).
- Timing/RAM constraints (it should segment an image in under 10 ms on a Jetson TX1; well, good luck with GoogLeNet)
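On the second item above: one common way to check that a hand-written gradient is correct is to compare it against a finite-difference estimate. The sketch below is only an illustration, not anything from the thread. The toy logistic loss, the example values in X and Y, and all function names are assumptions made up for the example, and passing this check does not rule out the later divergence the item describes.

```python
import math

# Hypothetical toy setup (an assumption for illustration): a single training
# example with features X and label Y, scored with a logistic loss.
X = [1.0, -2.0, 0.5]
Y = 1.0


def loss(w):
    """Logistic loss log(1 + exp(-y * w.x)) for the single example."""
    z = sum(wi * xi for wi, xi in zip(w, X))
    return math.log(1.0 + math.exp(-Y * z))


def analytic_grad(w):
    """Hand-derived gradient of the loss with respect to w."""
    z = sum(wi * xi for wi, xi in zip(w, X))
    scale = -Y / (1.0 + math.exp(Y * z))
    return [scale * xi for xi in X]


def numeric_grad(f, w, eps=1e-5):
    """Central-difference estimate of the gradient of f at w."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grad.append((f(w_plus) - f(w_minus)) / (2.0 * eps))
    return grad


def max_abs_diff(a, b):
    """Largest absolute componentwise difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


w = [0.3, -0.1, 0.8]
diff = max_abs_diff(analytic_grad(w), numeric_grad(loss, w))
print("max abs difference:", diff)
```

A tiny difference suggests the derivation and the code agree at that point; a large one usually means a sign or indexing bug in the hand-written gradient.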
1. Intro to deep learning, with a bit of theory and intuition building while applying it to a toy problem:
http://neuralnetworksanddeeplearning.com/index.html
2. A video series walkthrough on how to replicate some of the recent advances:
http://course.fast.ai/lessons/lessons.html
3. More theoretical background:
http://www.deeplearningbook.org/
4. Tensorflow tutorials with practical applications:
https://www.tensorflow.org/tutorials/
Specific applications:
Deep Learning for Vision:
https://www.youtube.com/playlist?list=PLkt2uSq6rBVctENoVBg1T...
Deep Learning for NLP:
https://www.youtube.com/playlist?list=PLIiVRB6G_w0i-uOoS6cDh...
