
Learning Machine Learning: A beginner's journey - deafcalculus
http://muratbuffalo.blogspot.com/2016/12/learning-machine-learning-beginners.html
======
saurabhjha
I think this "machine learning for hackers" approach is just not enough.
Oftentimes, you do need a solid theoretical/mathematical background. Most
people seem to approach ML like they approach programming tools or libraries
- learn just enough to get the job done and move on.

I was studying machine learning from Andrew Ng's CS229 (the class videos are
online; I think they date from 2008 or thereabouts). There is no way you can
progress beyond lecture 2 (out of 20) without a solid probability background.
A solid background in probability/statistics probably means a good first
course in probability, or maybe the first five chapters of "Statistical
Inference" by Casella and Berger. Similarly, for SVMs, you need a solid
background in linear algebra, and so on. You probably also need a background
in linear optimization. Here are the recommendations by Prof. Michael Jordan:
[https://news.ycombinator.com/item?id=1055389](https://news.ycombinator.com/item?id=1055389)

Not a lot of people want to dive in this much. They have got things to do,
and who cares about proofs anyway? The thinking goes like: "Most of the
mathematics is abstracted away by libraries like scikit-learn. Let's get shit
done." Well, I think a lot of the competitive advantage of Google/Facebook in
ML is because they have staffed their engineering teams with people who have
studied these things for years (in PhD programs). Compare that to Flipkart's
recommendations.

However, I don't think this problem is unique to ML/Data Science. It is
equally bad in "Distributed systems". Let's use Docker, that's the future!

~~~
stevehiehn
Every time there is a paradigm shift, there is always that voice: if you
don't understand the paint at a chemical-compound level, you can't make a
beautiful painting. Wait, what?

~~~
nercht12
Eh, let's revise that analogy. It's more like not understanding the bricks
means you can't make a good building. You can get by on good intuition, but
it won't be spot-on perfect (as it would be by calculating all the physics),
and you'll need more luck the higher you build.

~~~
stevehiehn
I do take your point. I guess I'm just trying to say I've had good success
just diving in head first and working backwards :)

------
theCricketer
Thanks for sharing. Here's a set of deep learning resources I've found useful
to give you a good theoretical background as well as start applying techniques
to real world problems:

1. Intro to deep learning; a bit of theory and intuition building while
applying it to a toy problem:

[http://neuralnetworksanddeeplearning.com/index.html](http://neuralnetworksanddeeplearning.com/index.html)

2. A video series walkthrough on how to replicate some of the recent
advances:

[http://course.fast.ai/lessons/lessons.html](http://course.fast.ai/lessons/lessons.html)

3. More theoretical background:

[http://www.deeplearningbook.org/](http://www.deeplearningbook.org/)

4. Tensorflow tutorials with practical applications:

[https://www.tensorflow.org/tutorials/](https://www.tensorflow.org/tutorials/)

Specific applications:

Deep Learning for Vision:

[https://www.youtube.com/playlist?list=PLkt2uSq6rBVctENoVBg1T...](https://www.youtube.com/playlist?list=PLkt2uSq6rBVctENoVBg1TpCC7OQi31AlC)

Deep Learning for NLP:

[https://www.youtube.com/playlist?list=PLIiVRB6G_w0i-uOoS6cDh...](https://www.youtube.com/playlist?list=PLIiVRB6G_w0i-uOoS6cDh_5nkUyxy_hxe)

------
minimaxir
> So I am doubling down on ML/DL.

The amount of free resources now available for learning machine learning/deep
learning is robust and easy to comprehend (indeed, Andrew Ng's Coursera class
is very good). And running ML code is even easier, with libraries like
Tensorflow/Theano to abstract the ML gruntwork (and Keras to abstract the
abstraction!).

I suspect that there may be a machine learning knowledge _crash_, where the
basics are repeated endlessly, but there is less _unique, real-world
application_ of the knowledge learned. I've seen many Internet testimonials
saying "I followed an online tutorial and now I can classify handwritten
digits, AI is the future!" The meme that Kaggle competitions are a metric of
practical ML skill encourages budding ML enthusiasts to focus on minimizing
log-loss or maximizing accuracy _without considering time/cost tradeoffs_,
which doesn't reflect real-world constraints.
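To make that metric concrete, here is a minimal sketch of binary log-loss in plain Python (illustrative code and numbers, not any particular library's implementation):

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# A sharper model scores better on log-loss...
print(log_loss([1, 0, 1], [0.9, 0.1, 0.8]))  # ~0.145
# ...but the leaderboard says nothing about the model's latency or serving cost.
```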

Unfortunately, many successful real-world applications of ML/DL are the ones
_not_ taught in tutorials, as they are trade secrets (this is also the case
with "big data" literature, to my frustration). OpenAI is a good step toward
transparency in the field, but that won't stop the ML-trivializing "this
program can play Pong, AI is the future!" thought pieces
([https://news.ycombinator.com/item?id=13256962](https://news.ycombinator.com/item?id=13256962)).

~~~
Dzugaru
Pretty much this. In the real world you have to deal with:

- Too few labeled / garbage-labeled data (70,000 digits? How about only 1,000
complex-class objects?)

- Obscure bugs in custom implementations (yeah, my custom layer works and the
gradient is correct... or wait, why does it diverge after 10k iterations?
hmm).

- Timing/RAM constraints (it should segment an image in under 10ms on a
Jetson TX1; well, good luck with GoogLeNet)
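A cheap defense against the second of those bullets is a finite-difference gradient check. Here is a minimal sketch in plain Python, using a made-up toy function rather than any framework's API:

```python
def numerical_grad(f, x, h=1e-6):
    """Central-difference estimate of df/dx at each coordinate of x."""
    grads = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grads.append((f(xp) - f(xm)) / (2 * h))
    return grads

# Suppose our "custom layer" computes f(x) = x0^2 + 3*x1, with hand-derived
# gradient [2*x0, 3]. Compare the analytic gradient against the estimate:
f = lambda x: x[0] ** 2 + 3 * x[1]
analytic = [2 * 2.0, 3.0]                       # at x = [2, 5]
numeric = numerical_grad(f, [2.0, 5.0])
print(all(abs(a - n) < 1e-4 for a, n in zip(analytic, numeric)))  # True
```

If the two disagree, the hand-derived gradient (or the layer's backward pass) is the first suspect.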

~~~
epberry
Nailed it. Neural net execution speed is so critical for many production
systems, and it's very difficult to hit the sweet spot on trade-offs, but I
never hear about these issues in the wild.

~~~
Eridrus
> Nailed it. Neural net execution speed is so critical for many production
> systems, and it's very difficult to hit the sweet spot on trade-offs, but I
> never hear about these issues in the wild.

That's probably because you're not listening. There's plenty of literature on
scaling down neural nets to smaller devices, because everyone knows it's an
issue and anyone can trivially get a smaller device; see techniques such as
quantization, distillation, and pruning, or architectures specifically
designed for the task, such as YOLO.
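As an illustration of the first of those techniques, here is a toy sketch of symmetric 8-bit weight quantization in plain Python (the function names and weights are made up, and real frameworks use more elaborate schemes):

```python
def quantize_int8(weights):
    """Linearly map float weights onto signed 8-bit integers (symmetric scheme)."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [v * scale for v in q]

w = [0.42, -1.27, 0.005, 0.88]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Each weight now fits in one byte instead of four, at a small accuracy cost
# (the round-trip error is bounded by scale / 2):
print(max(abs(a - b) for a, b in zip(w, w_hat)))
```

Distillation and pruning attack the same size/latency problem differently: by training a small network to mimic a large one, and by deleting near-zero weights, respectively.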

------
Philipp__
Distributed Systems and ML are probably the two most interesting things I
have on the radar, and they have me scared to the point where I do not know
where to start, and most importantly, for what?! Most of my free time (time
spent on personal projects) went into writing physics simulations in Java,
playing with Lisp, and doing some backend development. Nothing amazing. A
year and a half ago I got really interested in operating systems (I tried
FreeBSD and it blew my mind) and played with Docker. And at the end of this
year, I'm asking: "OK Philipp, what shall I focus on for the year to come?"

The thing is, if I choose to go the AI route, I do not know where to start (I
consider my math background to be pretty good; I was studying EE before I
dropped out after two years and enrolled in CS, having done all of the math
courses, which were pretty rough). AI/ML looks interesting, but it seems so
high-level to program and so abstract to understand. It really looks like
arcane magic to me. With distributed systems, I have the feeling it is more
of an "engineering" and "industrial" thing, where you can't do much by
yourself at home besides reading, writing some backend code in the relevant
languages (sometimes lower level), and learning about systems and computer
innards. And the third option is to go and play with Erlang/Elixir, which is
the most attractive since results will come pretty soon, and it may be
relevant to my interest in distributed systems.

~~~
chestervonwinch
> If I choose to go Ai route, I do not know from where to start ...

This type of comment is often made in machine learning (ML) related
submissions.

The pre-req list is long: calculus, linear algebra, stats, probability,
numerical methods (for optimization, linear algebra, maybe interpolation),
etc. BUT, you don't really need to go through the _entirety_ of each subject
for ML. For example, in calculus, you probably only need to focus on the
aspects necessary for optimization, rather than integral techniques,
convergence of sequences, etc. The trouble is that it is difficult to know
_which_ subtopics of each subject are worth spending time on unless you
_already know_ machine learning (or you have the luxury of someone with
experience guiding you).
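As a tiny example of the "calculus for optimization" subtopic: for ML purposes, the one indispensable idea is following the negative gradient. A minimal sketch with a made-up one-dimensional objective:

```python
# Minimize f(w) = (w - 3)^2 by gradient descent; its derivative is f'(w) = 2*(w - 3).
def grad_descent(lr=0.1, steps=100):
    w = 0.0
    for _ in range(steps):
        w -= lr * 2 * (w - 3)  # step against the gradient
    return w

print(grad_descent())  # converges toward the minimum at w = 3
```

Knowing why this converges (and when it doesn't, e.g. with too large a learning rate) matters far more for ML than, say, integration techniques.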

The latter difficulty is compounded by the fact that there seem to be many
more resources (at least posted as popular submissions on the web) for
learning neural nets, or for learning some specific framework to implement
neural networks, than for learning the mathematical and statistical
foundations of ML. This is fine -- neural nets are a popular and powerful
model, and people like to work on something tangible to get acquainted with a
topic.

I wonder if people might enjoy a well-written textbook covering the basic
math for ML -- something like "All the Math You Missed (But Need to Know for
Machine Learning)" [1]. I might enjoy working on such an ebook if there was
desire for one, but my time is pretty limited (like most people's).

[1]: [https://www.amazon.com/All-Mathematics-You-Missed-Graduate/dp/0521797071](https://www.amazon.com/All-Mathematics-You-Missed-Graduate/dp/0521797071)

~~~
bronxbomber92
Here's what I've found along the lines of "mathematics for machine learning":

* _DS-GA 1002: Statistical and Mathematical Methods_ ([http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall15/inde...](http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall15/index.html)) by Carlos Fernandez-Granda of NYU

(There is a 2016 version of the course with different lecture notes as well.)

* _Numerical Algorithms_ ([http://people.csail.mit.edu/jsolomon/share/book/numerical_bo...](http://people.csail.mit.edu/jsolomon/share/book/numerical_book.pdf)) by Justin Solomon

* _Math for Intelligent Systems, 2016_ ([https://ipvs.informatik.uni-stuttgart.de/mlr/teaching/maths-...](https://ipvs.informatik.uni-stuttgart.de/mlr/teaching/maths-ws1516/)) by Marc Toussaint & Hung Ngo of University Stuttgart

* _Math for Intelligent Systems, 2015_ ([https://ipvs.informatik.uni-stuttgart.de/mlr/teaching/mathem...](https://ipvs.informatik.uni-stuttgart.de/mlr/teaching/mathematics-for-intelligent-systems/)) by Nathan Ratliff of University Stuttgart

* _Mathematics for Inference and Machine Learning_ ([http://wp.doc.ic.ac.uk/sml/teaching/mathematics-for-machine-...](http://wp.doc.ic.ac.uk/sml/teaching/mathematics-for-machine-learning-autumn-2016/)) by Stefanos Zafeiriou and Marc Deisenroth of Imperial College London

Most of these are lecture notes. Some are very detailed, but I think there is
still space for a completely fleshed-out book on the subject.

------
legulere
A counterpoint: deep learning is currently hyped, which can keep you from
considering other techniques that might work better, or that are simpler and
work just as well. Deep learning might have a limited scope and turn out to
be a dead end for areas other than the ones already examined.

~~~
habitue
> making you not consider other techniques that might work better

Currently, DL is the most powerful technique for many problem types. I think
that for a beginner, learning it is a safe bet; the "next thing" is likely to
be an elaboration on DL. It's good if some people ignore the hype; they may
come up with the next paradigm by going in a completely different direction
from DL. But the people doing that won't be beginners. They'll be people with
levels of experience similar to Yann LeCun, Geoffrey Hinton, etc. Those
well-rounded people with deep theoretical knowledge will make the big
breakthroughs. Beginners should start with what we know works well now, and
expand out.

> [Other techniques may be] simpler and work just as well

This is a much better reason to learn things other than DL: efficiency and
lower complexity, if you know the problem domain is amenable to the
technique.

------
jupiter90000
I have an almost opposite problem. I spent years learning a lot of ML and
worked at a job doing this kind of work for a couple of years or so. I think
the issue was that the data we had at the organization, and the internal
politics, made it difficult to use ML in a way that mattered to the business.
I grew frustrated at having spent a lot of time learning things that were
exciting, then realizing it didn't really matter if some manager could just
say "we're doing it this other way that makes sense to me" (based not on
data, but on gut feelings).

I'm not sure what to do with that. ML probably works best in organizations
and situations that are on board with using ML to make decisions for the
business. Here's the other thing -- finding a business where ML is core to
its decision making that will hire a person with no formal ML-related
education may be difficult. Perhaps I'm wrong about that and have just given
up on ML after my frustrating experience.

Now I'm building data systems that the business uses on a daily basis to get
things done. I feel a lot better doing that than ML stuff, even though I
loved playing with data and ML. I guess I've given up on ML for now; maybe
I'll find my way back to it again.

~~~
laughfactory
I agree. Most of what I've seen suggests that everyone _thinks_ they need and
want machine learning specialists. But mostly they need people who are
flexible in how they combine business acumen with statistics, a little ML,
analysis, and programming. Business owners usually insist they need ML soooo
much, but they're rarely willing to go all the way and actually deploy ML.
Plus, often they don't really need it... or even modeling of any sort. They
may need automated dashboards (for keeping an eye on important KPIs),
decision support platforms, etc. -- lots of things which require something
more than analysts, statisticians, programmers, and the like. In essence they
need highly capable jacks-of-all-trades who can, on occasion, bring to bear
advanced algorithms and stats. But that's a lot rarer than you'd be led to
believe by job postings and all the news articles.

------
ankurdhama
Any ML tutorial should start with: it's not about machines, and it's not
about learning.

~~~
sabertoothed
It is about both. Machines AND learning.

~~~
ankurdhama
It is about data and algorithms, and if calling them "machines" and
"learning" or cognition or intuition or thinking or whatever makes you happy,
then nobody can stop you from doing so.

~~~
sabertoothed
You might want to look up the definition of 'learning' from Arthur Samuel /
Tom Mitchell. That might help you in your confusion.

------
ak93
I too recently started with ML/DL, but my approach is more theoretical. I
started with Andrew Ng's course while also working through the Python Machine
Learning textbook and testing myself on Kaggle. I hope to build some
interesting system soon. The only thing I am worried about is getting a
full-time job, which I think always requires someone with 2+ years of
experience.

------
iaw
Admirable intentions by the author, but I hope (s)he changes the
font/formatting style.

The current font with dense paragraphs makes it hard for me to read without a
headache; sparser sentences (either via bullet-pointed lists or illustrative
images) are much easier for me to parse.

------
soufron
The main question is: what for?

------
ermik
@muratbuffalo Your graph has left the building.
[http://imgur.com/a/kKkjC](http://imgur.com/a/kKkjC)

------
aspiringme
Machine learning... is the new avenue mankind can boast of.

------
angry_octet
Unfortunately, the author begins by citing the fraud Taleb. After that I have
to doubly examine everything he writes for signs of subtle nonsense, and it's
just necessary to close the tab.

