
Ask HN: How do I go about learning machine learning? - zach_d
I've finally found an interesting field that gets me really excited to program! I learned python a month ago but haven't had any interesting ideas but I just found out about Machine learning, AI, and data science. Really interesting stuff so I'm wondering what HN would suggest as a general learning path! What math should I know to really understand this stuff? What books do I need to read? Some beginner project ideas? Anything at all! I'm a HS Senior and I plan to really focus on machine learning, data science, and AI once I'm in college.

Thank you.
======
murbard2
A lot of the advice you will get is going to be in the spirit of the American
way of teaching various scientific disciplines. I would describe it as a
hands-on approach that gets you started really fast and doing exciting things. This
is true of computer science, statistics, and physics as far as I know. It's
probably a good way to go about learning these things.

The French method for teaching these things (the one I'm used to) is to first
do a lot of mathematics and then to introduce these topics through powerful
mathematical formalisms, from the get go. All else equal, for most people,
it's probably a worse approach.

However, the American way does err a little bit on the side of too little
mathematics, and it can come and bite you. My suggestion would thus be to
follow all the advice given, but give 50% more emphasis on mathematics than
people would otherwise suggest. Fortunately, the mathematics of machine
learning aren't too complicated, but you want to be able to breathe that
stuff. In particular this means:

\- linear algebra

\- real and complex analysis

\- multivariate calculus

\- topology (just a bit)

\- probability theory

It's like touch typing. It may be tedious at first, but once you know this
stuff really well, learning new things will be a breeze.

~~~
eli_gottlieb
I recommend _Understanding Machine Learning_ for a textbook, as I rather
enjoyed a class using it.

[http://www.cambridge.org/al/academic/subjects/computer-
scien...](http://www.cambridge.org/al/academic/subjects/computer-
science/pattern-recognition-and-machine-learning/understanding-machine-
learning-theory-algorithms)

~~~
KodrAus
As someone who came into this field with only an average high-school-level
grasp of mathematics I found that particular book incredibly worthwhile. It's
taken me almost 12 months to start understanding it enough to actually apply
it but it's worth it.

------
chollida1
I wrote about this here:
[https://news.ycombinator.com/item?id=8767092](https://news.ycombinator.com/item?id=8767092)
and here:
[https://news.ycombinator.com/item?id=9433316](https://news.ycombinator.com/item?id=9433316)

Long story short, the biggest mistake I see people making is not actually
rolling up their sleeves and learning the math.

People are often content to watch hour after hour of Udacity, Khan Academy and
Coursera videos but the applied follow-up is where most people drop off. At
the very least any course work should be followed up by something practical
like a Kaggle exercise to prove that you can apply the technique you just
learned. Consider the benefit of just watching videos vs doing actual applied
work.

On one hand, if you just watch videos you might learn a lot, but how do you prove
that to someone hiring you? On the other hand, if you sit down and spend a week
attacking a Kaggle exercise then at the very least you have something to point
people to, to show that you can apply machine learning techniques.

My recommendation has always been to read the first 5 chapters of Introduction
to Statistical Learning:
[http://www-bcf.usc.edu/~gareth/ISL/](http://www-bcf.usc.edu/~gareth/ISL/)

and if you fly through it then sample Elements of Statistical Learning
[http://statweb.stanford.edu/~tibs/ElemStatLearn/](http://statweb.stanford.edu/~tibs/ElemStatLearn/)
for the topics that you want to learn.

If intro to statistical learning is too advanced, then go to Khan academy and
work your way through their statistics videos.

From my experience you can bucket people into skill level by looking at how
they attack a new problem.

Beginners tend to start by saying they'll need a Hadoop cluster and spend the
next week setting up a pipeline.

Intermediate people tend to jump into R or scikit and try to model the problem
with a small subset of data and the library and technique they know best.

The advanced people tend to flesh out their hypothesis first and then work out
the math and then jump to modelling with a small set of data and finally move
to a cluster.
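
As a rough illustration of the "small subset first" step, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the data is synthetic (a hypothetical `y = 3x + 2` relationship with noise), `fit_line` is a hand-written helper, and a straight-line least-squares fit stands in for whatever technique you actually know best.

```python
import random


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical "full" data set: y = 3x + 2 plus Gaussian noise.
rng = random.Random(0)
full_data = [(x, 3.0 * x + 2.0 + rng.gauss(0.0, 0.5)) for x in range(10_000)]

# Model a small random subset before touching the full data or a cluster.
subset = rng.sample(full_data, 200)
slope, intercept = fit_line(subset)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

If the fit on 200 points already recovers something close to the relationship you hypothesized, that is a cheap signal the approach is worth scaling up.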

------
playing_colours
You can try "Data Science from Scratch" [0] to get some taste. It uses Python
to teach essentials of data science and ML algorithms. The code quality is
very good, and there is an introduction to Statistics, Maths and Python to
start.

Then you can continue with improving your maths (Linear Algebra [1], Calculus
[2], [3]) and moving on with Statistical Learning [4] [5]. I am personally
working through this plan now.

[0]
[http://shop.oreilly.com/product/0636920033400.do](http://shop.oreilly.com/product/0636920033400.do)

[1] [http://www.amazon.com/Linear-Algebra-Right-Undergraduate-
Mat...](http://www.amazon.com/Linear-Algebra-Right-Undergraduate-
Mathematics/dp/3319110799)

[2] [http://www.amazon.com/Calculus-4th-Michael-
Spivak/dp/0914098...](http://www.amazon.com/Calculus-4th-Michael-
Spivak/dp/0914098918)

[3] [http://www.amazon.com/Calculus-Manifolds-Approach-
Classical-...](http://www.amazon.com/Calculus-Manifolds-Approach-Classical-
Theorems/dp/0805390219)

[4] [http://www-bcf.usc.edu/~gareth/ISL/](http://www-bcf.usc.edu/~gareth/ISL/)

[5]
[http://statweb.stanford.edu/~tibs/ElemStatLearn/](http://statweb.stanford.edu/~tibs/ElemStatLearn/)

~~~
misframer
_An Introduction to Statistical Learning_ [0] is also good. It's a little less
technical than _The Elements of Statistical Learning_. We used it for our
statistical learning course at my university. The full PDF is available for
free as well [1].

[0] [http://www-bcf.usc.edu/~gareth/ISL/](http://www-bcf.usc.edu/~gareth/ISL/)

[1] [http://www-
bcf.usc.edu/~gareth/ISL/ISLR%20Fourth%20Printing....](http://www-
bcf.usc.edu/~gareth/ISL/ISLR%20Fourth%20Printing.pdf)

------
stonemetal
Machine learning is applied statistics. Elements of Statistical Learning was
written by profs at Stanford and is free on the internet. It might be a bit
hard since they don't really teach stats in high school. O'Reilly has a few ML
books. I liked Programming Collective Intelligence; it is beginner friendly and
would probably help you come up with project ideas.

------
moserware
See previous discussion at
[https://news.ycombinator.com/item?id=9432952](https://news.ycombinator.com/item?id=9432952)
. My advice in particular is at
[https://news.ycombinator.com/item?id=9433981](https://news.ycombinator.com/item?id=9433981)

------
lovelearning
I recommend starting with Andrew Ng's ML course on Coursera. He teaches and
presents the subject in a way that is both fun and non-intimidating to a
beginner. You get a good overview of techniques, and get to try out fun
exercises like digit recognition.

------
yen223
Get your fundamentals down. Focus heavily on linear algebra and probability
theory.

If you think you've nailed Python, go ahead and look at NumPy and SciPy -
they're different enough from Python, and they crop up very often in ML.

------
LukeFitzpatrick
One possible resource for you is www.codecloud.me, where people can code
together on projects. The focus is on learning, so unfortunately there is no
payment. They do have some machine learning projects on there which could help
you get started, and it comes with mentors.

------
rayalez
Hi! I've compiled a list of the best resources that I'm aware of:
[http://digitalmind.io/post/deep-learning](http://digitalmind.io/post/deep-learning)

------
bra-ket
[https://www.quora.com/How-do-I-learn-machine-learning-1](https://www.quora.com/How-do-I-learn-machine-learning-1)

