
Dive into Machine Learning with Jupyter and Scikit-Learn - stared
https://github.com/hangtwenty/dive-into-machine-learning
======
olympus
I'll add my thoughts on the topic of "how do you know when you're out of the
danger zone and can start marketing yourself as a machine learning expert?"

Do competitions on kaggle (or find them on other sites, but kaggle is
definitely the best place to start). Once you get past the point where you are
finishing in the middle of the pack (multiple top 10% or 25% finishes and
maybe a prize win) then you are an expert. That is proof that you are separate
from the hackers who just throw scikit-learn algorithms at a matrix. The
people in the master tier use clever feature engineering and/or code up custom
learning algorithms to get themselves above the masses. Looking at a problem
and figuring out the correct modeling approach is what the experts do. They
don't just create a data frame and run down the list of classification
algorithms that they have access to. Read the "No Free Hunch" reports on how
the winners did it and you'll quickly see the difference between yourself and
the experts.
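
To make the feature engineering point concrete, here is a minimal sketch on made-up data (it is not taken from any competition or winner's write-up). The dataset, the variable names, and the choice of logistic regression are all assumptions for illustration. The label depends on the sign of the product of two columns, so the model should do poorly on the raw columns and much better once the product is added as a feature.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic, made-up data: the label depends on the sign of x1 * x2,
# a pattern a linear model cannot separate from the raw columns alone.
rng = np.random.default_rng(0)
X_raw = rng.uniform(-1.0, 1.0, size=(400, 2))
y = (X_raw[:, 0] * X_raw[:, 1] > 0).astype(int)

# Engineered feature: append the product of the two raw columns.
X_eng = np.column_stack([X_raw, X_raw[:, 0] * X_raw[:, 1]])

# Same model both times; only the features differ.
model = LogisticRegression(max_iter=1000)
raw_acc = cross_val_score(model, X_raw, y, cv=5).mean()
engineered_acc = cross_val_score(model, X_eng, y, cv=5).mean()

print(f"raw features:        {raw_acc:.2f}")
print(f"with product column: {engineered_acc:.2f}")
```

Swapping in a different classifier on the raw columns is the "run down the list" approach; changing what the model sees is the part that takes understanding of the problem.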

~~~
hangtwenty
This is a fantastic contribution, thank you!

I opened an issue ... [https://github.com/hangtwenty/dive-into-machine-
learning/iss...](https://github.com/hangtwenty/dive-into-machine-
learning/issues/11)

I'll wait a bit in case you want to add a note on this in your own words (via
PR). Otherwise, tonight or tomorrow I'll paraphrase you or something. Whether
you or I make the change I want it in a branch, and then I'll try to get a bit
of review for that branch ...

THANKS AGAIN, the guide really needs some insight like this.

------
hangtwenty
TIME SENSITIVE note --

The guide's primary recommended course is Andrew Ng's Machine Learning course.
Current session started November 2nd, you must enroll by the 7th. Another
session is starting November 30th.

[https://www.coursera.org/learn/machine-
learning](https://www.coursera.org/learn/machine-learning)

Cheers!

~~~
argonaut
It's not time-sensitive at all. After the course has ended, you can still
enter the course and do everything; you just won't get the certificate (which
doesn't matter).

~~~
metamet
What exactly is the appeal of the certificate? Do they hold _any_ weight?

~~~
cgag
Precommitting to finish the course for the certificate is a good motivational
hack.

------
Bouncingsoul1
If the author is here, thank you very much for providing this. I wanted to
look into Jupyter and machine learning, and this is probably the right way
to start. I tried the Udacity course on machine learning (Python,
Scikit-Learn), but it is not my way of learning things, since I like to fiddle
around instead of going the straight way. If anyone is interested in an
alternative, check out the Udacity course [https://www.udacity.com/course/machine-learning-
supervised-l...](https://www.udacity.com/course/machine-learning-supervised-
learning--ud675).

~~~
hangtwenty
Author here. Thank you! Glad you find it useful!

------
aprdm
Would you say that Machine learning is something only for PhDs and very
experienced people or can a dev pick it up and be hired as one?

~~~
xtacy
Like any topic/skill, it can be learnt, but only if you spend significant time
and effort by doing projects, exercises, asking questions (stackexchange,
etc.). It's very important to pay attention to fundamentals and thinking from
scratch rather than mastering a laundry list of tips/tricks, because
fundamental ideas can be composed in different ways and adapted to a new
situation. The fundamentals here would be probability, statistics, linear
algebra, optimisation.

------
_of
For those interested in picking up machine learning who already know R, I
recommend this book:

[http://www.amazon.com/Applied-Predictive-Modeling-Max-
Kuhn/d...](http://www.amazon.com/Applied-Predictive-Modeling-Max-
Kuhn/dp/1461468485)

It's loaded with useful R snippets and practical examples.

------
bobmichael
Could someone with industry/academic experience in ML comment on the quality
and reliability of the resources in the repo?

~~~
tdaltonc
The quality of Scikit-Learn? It's not bleeding-edge but it's very well tested
and documented. Quite good quality.

No one gets fired for using Scikit.

~~~
hangtwenty
I'm curious if you can speak more to this, or share any resources about it. It
seems clear that scikit-learn is a good fit for this kind of hacking-learning.
If there's a way I can throw in a sentence (with link to more detail), giving
context about where it sits in the eyes of experts ... Would be nice.

~~~
argonaut
What is there to be worried about? scikit-learn is a solid, tested
implementation of most machine learning algorithms. If you're doing work in
Python and want to run your data through a standard ML algorithm, and the algo
is implemented by scikit-learn, then just use scikit-learn. If it isn't
implemented by scikit-learn, you find some other implementation or implement
it yourself.

Experts use all sorts of things: MATLAB, R, Python (with scikit-learn), etc.
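
As a rough illustration of "just use scikit-learn", here is a minimal sketch that runs a bundled toy dataset through a standard classifier. The dataset (iris), the model (logistic regression), and the split parameters are assumptions picked for brevity, not a recommendation for any particular problem.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset as feature matrix X and label vector y.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows to estimate accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a standard algorithm and score it on the held-out rows.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

print(f"held-out accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators follow the same `fit` / `predict` / `score` pattern, so trying another algorithm is usually a one-line change.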

~~~
hangtwenty
What you're saying -- actually every sentence of your comment -- was my
existing impression.

tdaltonc said "No one gets fired for using Scikit." Maybe I read too much into
this comment, but it seemed to have a negative tone. So I got the impression
that tdaltonc might have more to say about it. Maybe not though!

------
manish_gill
I'd love one such list about AI in general and other sub-fields like
NLP/Computational Linguistics as well. I've recently started the Berkeley AI
course on EdX along with Russell & Norvig's standard textbook. :)

~~~
rahmaniacc
[https://github.com/josephmisiti/awesome-machine-
learning](https://github.com/josephmisiti/awesome-machine-learning)

This is a really good list of resources on Machine learning and has a section
dedicated to NLP/Text mining.

~~~
hangtwenty
Thank you, now I've added a link to this in the appendix about finding
libraries.

------
peterhadlaw
How does an academic introduction and study in Machine Learning compare to a
self-taught one? I know it's a shallow question but there has to be some sort
of line where the difference opens and closes opportunities.

~~~
will_pseudonym
Another way to look at it would be that the difference opens opportunities to
improve the self-taught path.

------
sandGorgon
Are there any quality ML courses (of Norvig or Ng quality) that use Python or
Java?

I would love to learn ML concepts, but I really don't have the cognitive
bandwidth to learn a new language, which I most likely will never use in my
day job (Python, Ruby, Java).

When I last looked, most of the top quality courses use some variant of
proprietary tools or MATLAB, but production code is in python or java (with R
sometimes).

~~~
GFK_of_xmaspast
Learning a new language is very easy compared to learning all sorts of other
stuff, including ML.

~~~
sandGorgon
I agree - the problem is that there are some problems at my work that I can
probably solve by applying some concepts of ML. But I don't think I can do that
through MATLAB.

I have been having a bad day on HN, so before I get misconstrued - there is
nothing wrong with MATLAB. I was just hoping for a go-to-production language
like Java or Python for learning ML.

~~~
argonaut
Is there a reason you can't read data into MATLAB (by reading an exported
csv/other format file, or querying a database)?

That being said, you can do ML completely in Python.

~~~
sandGorgon
1\. I don't have MATLAB, and I don't want to buy it. 2\. When you go into
production (say, predicting top customers for an ecommerce site), you are not
going to run MATLAB on the server.

yes - python/pandas/scikit is pretty popular for writing production ML code.
The question really is - any good _courses_ ? Most of the top courses I see
are using some variant of Matlab to teach.

~~~
argonaut
Good point. I don't think it's too much trouble to translate MATLAB to the
equivalent numpy/scipy/Python library calls.
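
For a sense of what that translation looks like, here is a minimal sketch of an ordinary least squares fit via the normal equation, with the MATLAB line shown as a comment above each numpy equivalent. The toy data and the MATLAB snippets are assumptions for illustration (in MATLAB `x` and `y` would be column vectors), not code from any specific course.

```python
import numpy as np

# Toy data generated from y = 1 + 2x, so the expected fit is known.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

# MATLAB: X = [ones(m, 1), x];
X = np.column_stack([np.ones_like(x), x])

# MATLAB: theta = pinv(X' * X) * X' * y;
theta = np.linalg.pinv(X.T @ X) @ X.T @ y

# MATLAB: predictions = X * theta;
predictions = X @ theta

print("theta:", theta)
```

The main things to watch are that `*` is element-wise in numpy (use `@` for matrix products), `'` becomes `.T`, and indexing starts at 0 instead of 1.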

------
TDL
Thank you for this! I've started digging in to ML and this looks absolutely
awesome.

