If you like to study/read: the famous Coursera Andrew Ng machine learning course: https://www.coursera.org/learn/machine-learning
If you just want course materials from UC Berkeley, here's their 101 course: https://news.ycombinator.com/item?id=11897766
If you want a web based intro to a "simpler" machine learning approach, "decision trees": https://news.ycombinator.com/item?id=12609822
Here's a list of top "deep learning" projects on Github and great HN commentary on some tips on getting started: https://news.ycombinator.com/item?id=12266623
If you just want a high level overview: https://medium.com/@ageitgey/machine-learning-is-fun-80ea3ec...
Unlike some of the other complicated tools, sklearn is just a "pip install" away and includes all sorts of examples of different problems. Classification? Regression? Clustering? Representation learning? Perceptual embedding? Odds are, some part of sklearn covers all of that.
This means that you can set up a train and test set and swap in and out random forest, SVM, naive Bayes, logistic regression, and various others.
Read about them one by one, try to understand the algorithms generally, test them out, see how they perform differently on different data sets.
It all depends on how you like to approach a new subject, but I think this is more fun and motivating than going straight into the mathematics behind the algorithms right away (which is more along the lines of Andrew Ng's excellent course). I'd say once you're into it and using the algorithms, then dig deeper into the core mathematics; you'll have a better context for it.
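A minimal sketch of the swap-in/swap-out idea above, assuming scikit-learn is installed. The iris dataset, the 70/30 split and the hyperparameters are illustrative choices, not something the comment prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Load a small built-in dataset and hold out 30% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Every estimator exposes the same fit/score interface, so swapping is trivial.
models = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": SVC(),
    "naive bayes": GaussianNB(),
    "logistic regression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out set
    print(f"{name}: {scores[name]:.3f}")
```

Because every model is trained and scored on the same split, the numbers are directly comparable, which is what makes this kind of experiment a quick way to build intuition.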
 – https://en.wikipedia.org/wiki/Iris_flower_data_set
The first lesson is a classifier to separate apples and oranges.
If you work through the data, you'll find things like women, children and first class passengers had a higher survival rate than men with lower class tickets.
This matches exactly the stories of what happened: Staff took first class passengers to the lifeboats first, then women and children. Then they ran out of lifeboats.
So the data shows correlation, and eyewitness accounts show causation. That's close to the ideal combination: eyewitness accounts can be unreliable because we can't know how widespread they are, and correlation doesn't show causation.
But the combination of them both is pretty much the best case for studying something which can't be replicated.
 See examples like https://www.kaggle.com/omarelgabry/titanic/a-journey-through...
But any attempt to frame it as a "prediction", an accurate model of the event or adequate description of reality is just nonsense.
To call things by their proper names (precise use of language) is the foundation of the scientific method. This is a mere oversimplified, non-descriptive toy model of one aspect of a historical event, made from statistics of a partially observable environment. A few inferred correlations reflect that there was not total chaos, but some systematic activity. No doubt about it. But it is absolutely unscientific to say anything else about the toy model, let alone claim that any predictions based on it have any connection to reality.
There is a clear correlation between gender and survival rates. Given the data, a decent prior would absolutely take that into account.
Yes, there are other factors. But the foundation of statistical models is simplification, and descriptive statistics are an important foundation of that.
In any case, it isn't exactly clear that there are magical hidden factors which predicted survival. It appears you may be unfamiliar with the event, because basically those who got into a lifeboat survived, and those who didn't, didn't survive.
To quote Wikipedia:
Almost all those who jumped or fell into the water drowned within minutes due to the effects of hypothermia.... The disaster caused widespread outrage over the lack of lifeboats, lax regulations, and the unequal treatment of the three passenger classes during the evacuation..... The thoroughness of the muster was heavily dependent on the class of the passengers; the first-class stewards were in charge of only a few cabins, while those responsible for the second- and third-class passengers had to manage large numbers of people. The first-class stewards provided hands-on assistance, helping their charges to get dressed and bringing them out onto the deck. With far more people to deal with, the second- and third-class stewards mostly confined their efforts to throwing open doors and telling passengers to put on lifebelts and come up top. In third class, passengers were largely left to their own devices after being informed of the need to come on deck.
Even more tellingly:
The two officers interpreted the "women and children" evacuation order differently; Murdoch took it to mean women and children first, while Lightoller took it to mean women and children only. Lightoller lowered lifeboats with empty seats if there were no women and children waiting to board, while Murdoch allowed a limited number of men to board if all the nearby women and children had embarked
All this behavior matches exactly what the model tells us about the event.
I'd be very interested if you can point to something specific that is wrong about it.
All models are wrong, but some are useful.
Now, it so happens that that correlated heavily with class. But, not as much as with sex. Though, there were some places where being male hurt your chances (as you point out in the one officer not allowing men on boats), by and large these were secondary and correlated with success, not predictors of it.
All predictions are wrong and make no sense for partially observable, multiple causation, mostly stochastic phenomena. It will never be the same.
The Titanic's sister ship (the Britannic) was torpedoed during WW1 and sank. However, the lesson of the Titanic (too few lifeboats) had been learnt, and only 26 people died.
I don't know what point you are trying to make - yes, I agree that history never repeats, but lessons can be learnt from it, and they can be quantified and they can be useful.
This happened because they made a _statistical_ model of the Titanic disaster, and learned from it? Like, they actually crunched the numbers and plotted a few curves etc, and then said "aha, we need more boats"?
I kind of doubt it, and if it wasn't the case then you can't very well talk about a "model", in this context. It's more like they had a theory of what factor most heavily affected survival and acted on it. But I'd be really surprised to find statistics played any role in this.
No - statistics as the discipline that we think of today wasn't really around until the work of Gosset and Fisher which was done a few years after this.
I'm sure you noted that I was very careful with what I claimed: "the lesson of the Titanic (too few lifeboats) had been learnt".
These days we'd quantify the lesson with statistics. Then, they didn't have that tool.
Instead, we have testimony relaying the same story: "Just one question. Have you any notion as to which class the majority of passengers in your boat belonged? - (A.) I think they belonged mostly to the third or second. I could not recognise them when I saw them in the first class, and I should have known them if there were any prominent people. (Q.) Most of them were in the boat when you came along? - (A.) No. (Q.) You put them in? - (A.) No. Mr. Ismay tried to walk round and get a lot of women to come to our boat. He took them across to the starboard side then - our boat was standing - I stood by my boat a good ten minutes or a quarter of an hour. (Q.) At that time did the women display a disinclination to enter the boat? - (A.) Yes."
So yes, I agree - it was a theory, which our modern modelling tools can show matched well with what the statistics showed happened.
My whole point is that this is very useful, unlike the OP who dismissed it as useless.
OK, tell me, please, what it is that you can predict? That some John Doe, having the first class ticket in a cabin next to the exit would survive the collision of the next Titanic with a new iceberg? That being a woman gives you better chances to secure a seat in a lifeboat? What is the meaning of the word "predict" here?
Most of the models mimic and simulate (very naively) that observable behavior, not its origin.
When people cite "the map is not the territory" they mean this. Simulation is not even an experiment. It is merely an animation of a model - a cartoon.
Simulation can be a very beneficial experiment. See for instance: https://papers.nips.cc/paper/5351-searching-for-higgs-boson-...
Simulations are not experiments. It is an animation of a formalized imagination, if you wish.
The process of learning could be defined as the task of extracting relevant information (knowledge) about reality (the shared environment), not the mere accumulation of fancy nonsense or false beliefs.
And reality like: The actual sinking of the Titanic?
If your model concludes that nobility, traveling first class, close to the exits, without family, has a higher chance of surviving, then this is fancy nonsense or a false belief?
You make a really strange case for your view.
In real life it is also very rare to have your free pick of the variables you want. Some variables have to be substituted with available ones.
The Titanic story is to make things interesting for beginners. One could leave out all the semantics of this challenge, anonymize the variables and the target, and still use this dataset to learn about going from a table with variables to a target. In fact, doing so teaches you to leave your human bias at the door. Domain experts get beaten on Kaggle, because they think they need other variables, or that some variables (and their interactions) can't possibly work.
Let the data and evaluation metric do the talking.
That sounds a bit iffy. A domain expert should really know what they're talking about, or they're not a domain expert. If the real deal gets beaten on Kaggle it must mean that Kaggle is wrong, not the domain expert.
Not that domain experts are infallible, but if it's a systematic occurrence then the problem is with the data used on Kaggle, not with the knowledge of the experts.
I mean, the whole point of scientific training and research is to have domain experts who know their shit, know what I mean?
> Since our goal was to demonstrate the power of our models, we did no feature engineering and only minimal preprocessing. The only preprocessing we did was occasionally, for some models, to log-transform each individual input feature/covariate. Whenever possible, we prefer to learn features rather than engineer them. This preference probably gives us a disadvantage relative to other Kaggle competitors who have more practice doing effective feature engineering. In this case, however, it worked out well.
> Q: Do you have any prior experience or domain knowledge that helped you succeed in this competition? A: In fact, no. It was a very good opportunity to learn about image processing.
> Do you have any prior experience or domain knowledge that helped you succeed in this competition? I didn't have any knowledge about this domain. The topic is quite new and I couldn't find any papers related to this problem, most probably because there are not public datasets.
> Do you have any prior experience or domain knowledge that helped you succeed in this competition? M: I have worked in companies that sold items that looked like tubes, but nothing really relevant for the competition. J: Well, I have a basic understanding of what a tube is. L: Not a clue. G: No.
> We had no domain knowledge, so we could only go on the information provided by the organizers (well honestly that and Wikipedia). It turned out to be enough though. Robert says it cannot happen again, so we’re currently in the process of hiring a marine biologist ;).
> Through Kaggle and my current job as a research scientist I’ve learnt lots of interesting things about various application domains, but simultaneously I’ve regularly been surprised by how domain expertise often takes a backseat. If enough data is available, it seems that you actually need to know very little about a problem domain to build effective models, nowadays. Of course it still helps to exploit any prior knowledge about the data that you may have (I’ve done some work on taking advantage of rotational symmetry in convnets myself), but it’s not as crucial to getting decent results as it once was.
> Oh yes. Every time a new competition comes out, the experts say: "We've built a whole industry around this. We know the answers." And after a couple of weeks, they get blown out of the water.
Competitions have been won without even looking at the data. Data scientists/machine learners are in the business of automating things -- so why should domain knowledge be any different?
Ok, sure it can help, but it is not necessary, and can even hamper your progress: You are searching for where you think the answer is -- thousands are searching everywhere and finding more than you, the expert, can.
And if you see classification as a form of hypothesis testing, then cross-validation is a valid way of testing whether the hypothesis holds on unseen data.
This would be like sampling all coins from my pockets and thinking you could build a predictive model of year printed to value of coin. Probably could for the change I carry. Not a wise predictor, though.
This dataset is more in line with what you are looking for: https://www.kaggle.com/saurograndi/airplane-crashes-since-19...
It shows only that a given set of variables (observable and inferred) could be used to build a model. The given data set is not descriptive, because it does not contain the more relevant hidden variables, so any predictions or inferences based on this data set are nothing but a story, a myth made from statistics and data.
Can you know nothing about ML, AI, data analysis, and stats, then give TensorFlow some input, have it give you some output, and pretty much apply it to your app?
Or do you have to know these subjects before even starting with TensorFlow?
It's OK to jump in and try it without having background information. See how far you get and start researching when you hit a wall or find sudden interest.
Thanks for the link
I read all of these architectures in research papers, but I'd really love to learn how to start iterating on them for a particular domain.
I strongly advise:
- using Python in the interactive environment Jupyter Notebook,
- starting with classical machine learning (scikit-learn), NOT from deep learning; first learn logistic regression (a prerequisite for any neural network), kNN, PCA, Random Forest, t-SNE; concepts like log-loss and (cross-)validation,
- playing with real data,
- it is cool to add neural networks afterwards (here bare TensorFlow is a good choice, but I would suggest Keras).
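To make the log-loss concept from the list above concrete, here is a small sketch in plain Python. The labels and predicted probabilities are made up for illustration; the function implements the standard binary cross-entropy formula.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Average binary log-loss (cross-entropy) over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so that log(0) never happens
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical true labels and three sets of predicted probabilities.
labels = [1, 0, 1, 0]
confident_and_right = [0.9, 0.1, 0.8, 0.2]
unsure = [0.5, 0.5, 0.5, 0.5]
confident_and_wrong = [0.1, 0.9, 0.2, 0.8]

for name, probs in [
    ("confident and right", confident_and_right),
    ("unsure", unsure),
    ("confident and wrong", confident_and_wrong),
]:
    print(name, round(log_loss(labels, probs), 3))
```

The point to take away is that log-loss punishes confident wrong answers much harder than hesitant ones, which is why it is a stricter metric than plain accuracy.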
Instead, learn decision trees and more importantly enough statistics so you aren't dangerous.
Do you know what the central limit theorem is and why it is important? Can you do 5-fold cross validation on a random forest model in your choice of tool?
Fine, now you are ready to do deep learning stuff.
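For reference, the 5-fold cross-validation question above might look like this in scikit-learn. This is only a sketch: the iris dataset and the forest settings are assumptions chosen to keep it short.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=0)

# cv=5 splits the data into 5 folds; each fold is used once as the test set
# while the model is trained on the other four.
fold_scores = cross_val_score(model, X, y, cv=5)

print("per-fold accuracy:", fold_scores)
print("mean accuracy:", fold_scores.mean())
```

The spread between the folds is as informative as the mean: a large spread suggests the score depends heavily on which rows happened to land in the test fold.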
The reason I say not to do neural networks first is because they aren't very effective with small amounts of data. When you are starting out you want to be able to iterate quickly and learn, not wait for hours for a NN to train and then be unsure why it isn't working.
Of course it's important to get a broad horizon eventually but starting with the theory without the applications is not how most humans learn best. Learning by doing is.
The problem with diving into neural networks is that they are slow to train (with large amounts of data anyway), and difficult to debug. This means it isn't really a great place to start.
"All of Statistics" is really a great book if you have time to work through the exercises.
http://www.inference.phy.cam.ac.uk/itila/book.html (freely accessible online)
1. Machine Learning by Andrew Ng (https://www.coursera.org/learn/machine-learning) /// Class notes: (http://holehouse.org/mlclass/index.html)
2. Yaser Abu-Mostafa’s Machine Learning course which focuses much more on theory than the Coursera class but it is still relevant for beginners.(https://work.caltech.edu/telecourse.html)
3. Neural Networks and Deep Learning (Recommended by Google Brain Team) (http://neuralnetworksanddeeplearning.com/)
4. Probabilistic Graphical Models (https://www.coursera.org/learn/probabilistic-graphical-model...)
5. Computational Neuroscience (https://www.coursera.org/learn/computational-neuroscience)
6. Statistical Machine Learning (http://www.stat.cmu.edu/~larry/=sml/)
If you want to learn AI:
If you don't understand something in the book, back up and learn the pre-reqs as needed.
Personally, I studied Duda & Hart's pattern recognition and Casella & Berger's statistics text simultaneously. This took about the equivalent of 2 semesters. Duda's text gets the main ideas across without being as heavy on the probability theory / stats.
Afterwards, I studied "Elements ..." by Hastie et al., which was far more readable after going through Casella & Berger's text. Now Hastie et al. is my go-to reference. I also should note that this all assumes that you also have the requisite math background: up to calc 3, linear algebra, and maybe some exposure to numerical methods (in particular, optimization).
Everyone keeps linking ESL, but really ISLR is much easier to understand, provides more important clarifying context, and covers more or less the same information.
ESL is more like a reference and prototype for ISLR.
Not sure whether I should do this too.
That's the key question.
Also Fermat's Library is going to be annotating the book, which should make it even more accessible:
In addition to the linear algebra and statistics MOOCS mentioned, I'll also add:
* No bullshit guide to Linear Algebra: https://gumroad.com/l/noBSLA
* Statistical Models: Theory and Practice: https://www.amazon.com/Statistical-Models-Practice-David-Fre...
Khan Academy looks like a good beginning for linear algebra:
MIT 6.041SC seems like a good beginning for probability theory:
Then, for machine learning itself, pretty much everyone agrees that Andrew Ng's class on Coursera is a good introduction:
If you like books, "Pattern Recognition and Machine Learning" by Chris Bishop is an excellent reference of "traditional" machine learning (i.e., without deep learning).
"Machine Learning: a Probabilistic Perspective" book by Kevin Murphy is also an excellent (and heavy) book:
This online book is a very good resource to gain intuitive and practical knowledge about neural networks and deep learning:
Finally, I think it's very beneficial to spend time on probabilistic graphical models. Here is a good resource:
So you could start with some really simple example code for demand forecasting, but put in your own data and your own signals. In this way you can learn what you need to solve a particular problem, 'getting lucky' by only having to adapt examples. Sure, it might be nice to learn all the fundamentals first, but it is sometimes nice to scratch an itch. Every company has plenty; choose one, see how far you get, and learn along the way.
Really great content from Andrej and his coworkers. This guy is great.
You can easily find all classes videos on YouTube too.
Has some great links if you already have some knowledge about software engineering and want to get into Machine Learning
Josh Gordon from Google also has an extremely nice hands-on "how to start with Machine Learning" course on YouTube featuring scikit-learn and TensorFlow:
If you're a python dev, maybe download scikit-learn and see what kinds of things you can put together after a few lectures.
I too felt like ML is something new to try, but the lack of real-world use cases on a small scale (not Google, Microsoft, ...) has kept me from trying/doing.
I only saw the farm with image recognition for vegetables as an example for now.
Does anyone have other examples?
Numerous ML competitions also provide enough fun to get started.
Why numerical methods?
* They might produce the right answer
* They frequently do
* They are easy to visualize or imagine
* You get used to working with a routine that is fallible but quite simple, and remarkably able to work in a wide variety of situations. This is what machine learning does, but there are more sophisticated routines.
At some point you need to decide whether to go down the road more focused on analysis & modelling or the one focused on machine learning & prediction. It's not that the two are exclusive, but they really do address very different forks in the problem space of using a computer to eat up data: give me predictions, or give me correct answers.
Google needs lots of prediction to fill in holes where no data may ever exist. Analysis and modeling can really fall down when there is no data to confirm a hypothesis or regress against.
An engineer needs a really good model or the helium tank in the Falcon 9 will explode one time in twenty vs one time in a trillion. The model can predict, based on the simulation of the range of parameters that will slip through QA, how many tanks will explode. Most prediction methods are not trying to solve problems like this and provide little guidance on how to set up the model.
On the prediction side, you will learn all the neural net and SVM stuff.
On the analysis and modelling side, get ready for tons of probability and Monte Carlo stuff.
They are all fun.
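As a taste of the Monte Carlo side, here is a toy sketch of estimating a failure rate by simulation. Everything about the "tank" is hypothetical: the normally distributed strength and load and their parameters are invented so that the exact answer is known and can be compared against.

```python
import math
import random


def estimate_failure_rate(n_trials, seed=0):
    """Simulate n_trials hypothetical tanks and return the fraction that fail."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_trials):
        strength = rng.gauss(100.0, 5.0)  # made-up burst strength of one tank
        load = rng.gauss(80.0, 5.0)       # made-up load it experiences
        if load > strength:
            failures += 1
    return failures / n_trials


estimate = estimate_failure_rate(200_000)

# For this toy model the exact answer is available in closed form:
# strength - load is Normal(20, sqrt(50)), so P(failure) = 0.5 * erfc(2).
exact = 0.5 * math.erfc(20.0 / math.sqrt(50.0) / math.sqrt(2.0))

print(f"simulated: {estimate:.5f}  exact: {exact:.5f}")
```

Note that plain sampling like this only works for events that are not too rare. A one-in-a-trillion failure would never show up in a few hundred thousand trials, which is where the heavier probability machinery comes in.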
Newton's method and other similar numerical methods are the hello world of a branch of mathematics known as 'numerical analysis' and scientific computing. This is not Machine Learning.
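For anyone who has not seen it, Newton's method fits in a dozen lines. This sketch finds the square root of 2 as the root of f(x) = x^2 - 2; the function, starting point and tolerance are illustrative choices.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Find a root of f by repeatedly following the tangent line down to zero."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)  # fails if the derivative is zero at x
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


# Solve x^2 - 2 = 0 starting from x = 1.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(root)
```

It usually converges in a handful of iterations, but a bad starting point or a zero derivative breaks it, so it is simple and fallible at the same time.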
I'd also check out Alice Zheng's books:
You can learn on your own, of course, but a university course will focus your learning, provide rich feedback, and give you a strong foundation on which to build. You'll also get to learn from other students, which is not often the case in MOOCs. And there's nothing like having a teacher on your payroll (which is essentially what paying for a course is) to answer your questions, clarify obscure areas in books and generally support you throughout the course.
For the record- I did exactly what I say above. After five years working in the industry as a dev, I took a Masters part-time, sponsored by my employer. I think I got a good foundation as I say above, and I certainly didn't have the time, or the focus, to learn the same things on my own.
And I did try on my own, with MOOCs-and-books for a while. I did learn useful stuff (the introductory AI course from Udacity for instance, was really helpful) but after starting the Masters it felt like all this time I'd been crawling along without aim, and now I was running.
Mathematical Monk - https://www.youtube.com/user/mathematicalmonk#p/c/0/ydlkjtov... (includes a probability primer)
Awesome Courses - https://github.com/prakhar1989/awesome-courses - it's a very extensive list of university courses including subjects apart from Machine Learning as well
Programming Collective Intelligence - http://www.amazon.com/programming-collective-intelligence-bu... - heard very good reviews about this
Many other resources available apart from the above. You can access more such resources at http://www.tutorack.com/search?subject=machine%20learning
I think it's a good idea to go through one or more beginner level courses like that offered by Andrew Ng on Coursera and then do an actual project.
[Disclaimer - I work at tutorack.com mentioned in the comment]
and then continue with https://www.coursera.org/learn/neural-networks/
I think it's important for people to know where to go for good resources, but this exact question keeps coming up incessantly.
Now practical: I think the best way to learn is pick an algorithm & representation and implement it in your favorite language. Bonus if you have your own language to work with.
I would start by looking into decision trees: implement them, and then implement some use cases on top of them. Do this for other approaches, like an ANN, which you can have beat you at checkers, which is strangely satisfying.
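To give an idea of how small the core of a decision tree is, here is a sketch of its basic building block: finding the best single split on one numeric feature using Gini impurity. The toy data is made up, and a full tree would apply this recursively over several features.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(xs, ys):
    """Return (threshold, weighted impurity) for the best split x <= threshold."""
    pairs = sorted(zip(xs, ys))
    best = (None, gini(ys))  # start from the impurity of not splitting at all
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical toy data: small values are "no", large values are "yes".
xs = [1, 2, 3, 10, 11, 12]
ys = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(xs, ys)
print(threshold, impurity)
```

Growing a tree is then a matter of calling `best_split` on each side of the chosen threshold until the leaves are pure or too small to split further.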
But keep in mind Minsky. I think he is like Archimedes doing "Calculus"-type approaches without fully realizing. Maybe you could be Newton?
Start with a tutorial/pre-made script for one of the Kaggle Knowledge competitions. Move on to a real Kaggle competition and team up with someone who is in the same position on the learning curve as you. Use something like Skype or a Github repo to learn new tricks from one another.
HackerRank (YC S11) has one coming up in 2 weeks (filter by Domains > AI).
I plan to participate as well just to explore the space. Feel free to shoot me a message if you'd like to discuss more.
i.e : Tutorial, Getting Started, ...
"Your package manager for knowledge".
(mostly focused around ML)
There are basic things I think you must know before jumping into a framework or into any specific algorithm. The first thing you will probably have to do is collect the data and clean it. To do this correctly you need some basic statistics. For example, you need to know what a Gaussian distribution is and how to collect samples in a way that is representative of your problem. Then you may need to clean the samples, remove outliers, fill in missing data, etc. So it is essential that you know some statistics to do this right. I have seen people with a lot of knowledge of tools who are not able to create a train/test/validation set correctly, and the experiment is completely invalid from there on, no matter what you do next (http://stats.stackexchange.com/questions/152907/how-do-you-u..., https://www.youtube.com/watch?v=S06JpVoNaA0&feature=youtu.be ). You also need to know how you are going to test your results, so again you need to know how to use a statistical test (F-test, t-test). So, first thing: jump into statistics to understand your data.
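As an illustration of the train/validation/test point, here is a minimal sketch in plain Python. The 60/20/20 proportions and the helper name `train_val_test_split` are assumptions for the example; the important property is that the data is shuffled once and the three parts never overlap.

```python
import random


def train_val_test_split(items, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then cut the data into three disjoint parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# 100 hypothetical sample ids.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

The usual discipline is to fit on `train`, tune on `val`, and look at `test` only once at the very end. A plain shuffle like this is not appropriate for time series or grouped data, where the split has to respect the ordering or the groups.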
The next step, I think, is to know some common things in machine learning, such as the no free lunch theorem, the curse of dimensionality, overfitting, feature selection, how to select the correct metric to assess your model, and common pitfalls. I think the only way to learn this is by reading a lot about machine learning and making mistakes on your own. At least now you have some things to search for on Google and start learning.
The third step would be to understand some basic algorithms and get a feeling for the types of algorithms, so you know when a clustering algorithm is needed or when your problem is related to classification or prediction. Sometimes a simple random forest or logistic regression is enough for your problem and you don't need to use TensorFlow at all.
Once you know the landscape of the algorithms, I think it is time to improve your maths skills and try to understand better how the algorithms work internally. You might not need to know completely how a deep network works, but you should understand how a neural network works and how backpropagation works. The same goes for algorithms such as k-means, ID3, A*, Monte Carlo tree search, or the most popular algorithms that you are probably going to use in day-to-day work. In any case you are going to need to learn some calculus and algebra. Vectors, matrices and differential equations are almost everywhere.
You will probably have seen some examples while learning all the stuff I talked about; then it is time to go to real examples. Go to Kaggle and read some tutorials, and read articles about how the Kaggle community has approached and won competitions. From here on it is just practice and reading.
You can jump directly into a framework, learn to use it, have 99% accuracy in your test and 0% accuracy with real data. This is the most probable scenario if you skip the basics of machine learning. I have seen people do this and end up very frustrated because they don't understand why their awesome model with 99% accuracy doesn't work in the real world. I have also seen people use very complex things such as TensorFlow on problems that can be solved with linear regression. Machine learning is a very broad area and you need maths and statistics for sure. Learning a framework is useless if you don't understand how to use it, and it might lead you to frustration.
Forget about the code part. It's the least difficult part.
With modern tools and frameworks you can start learning and applying what you know in practice almost immediately.
Check out Keras and the book "Deep Learning with Python". They have enabled me to train my first ANN in 2 days, and get to the point of building an MNIST recognizer in a month (and I was reading it pretty slowly).
Sure, if you're coding it from scratch and must understand every single detail, you do need like 10 years and 3 PhDs. But that's not a wise way to learn.
I recommend taking the simplest tools and applying them to practical projects immediately. That will give you a general overview of how things work, and then you will learn the details as needed.
Yes, you can take a library and implement it in 10 minutes, but then you're really not learning machine learning, are you?
I will argue you do not need four years of math by any stretch, though. The stumbling block will be notation more than anything else. Relatively basic calculus and linear algebra will suffice.
They were right about one thing: the code is the least important part.
As an undergrad, I was doing all those easy ML tutorials and took an undergrad level ML course. I thought I would be useful in actual practice, but knowing the whats/hows of neural nets/clustering/etc. is not enough. Feature engineering/math is the most difficult part. In a corporate setting, if it was a straightforward solution, you wouldn't be doing that work because the solution would be trivial and already implemented.
As an engineer with only a bachelor's on an ML team full of PhDs, there is a definite difference in skill. I've been reduced to a monkey (a content one) that works on the data pipeline. Learning to deal with real world ML problems would take me years of work that I am not sure I would be willing to do, especially when the pay increase per effort expended learning ML is much lower than with regular software/distributed systems/etc.
On the interest part, you're right that I would never have tried to learn ML if I had known the amount of work that is required to actually be good or if I tried learning the math first. That's the real world though. The useful ML engineers did learn the math. The efficient way to learn ML is to learn the math/statistics first.
For special applications, it is totally OK to learn as you go.
I think as well it really depends where you are coming from / what your background is. The reason i say this is i have recently gone through a similar transition into machine learning 'from scratch' except once i got there i realised i knew more than i thought. My academic background is in psychology / biomedical science which involved a LOT of statistics. From my perspective once i started getting into the field i realised there are a lot of things i already knew from stats with different terms in ML. It was also quite inspiring to see many of the eminent ML guys have backgrounds in Psychology (for instance Hinton) meaning i felt perhaps a bit more of an advantage on the theoretical side that many of my programming peers don't have.
I realise most people entering the field right now have a programming background so will be coming at things from an opposite angle. For me i find understanding the vast majority of the tests and data manipulation pretty standard undergraduate stuff (using python / SK Learn is incredible because the library does so much of the heavy lifting for you!). Where i have been struggling is in things that an average programmer probably finds very basic - it took me 3 days to get my development environment set up before i could even start coding (solved by Anaconda - great tool and lessons learned). Iterating over dictionaries = an nightmare for me (at first anyway, again getting better).
I think (though I may be biased) it's easier to go from programming to ML than the other way around, because so much of ML is contingent on having decent programming skills. If you have a decent programming skill set you can almost 'avoid' the math component, in a sense, thanks to the libraries available and the support online. There are some real pluses to ML compared to traditional statistics: the checks that are normally run in stats to confirm you are able to apply a test (e.g. shape of the data: skewness / kurtosis, multicollinearity, etc.) become less of an issue, as the algorithm's role is to deliver an output given the input.
I would still recommend some reading into the stats side of things to get a sense of how data can be manipulated to give different results, because I think this will give you a more intuitive feel for parameter tuning.
This book does not look very relevant but it's actually a really useful introduction to thinking about data and where the numbers we hear about actually come from
In conclusion, if you can program, have a good attitude towards learning, and are diligent in your efforts, I think this should be a simple transition for you.
The very first thing you should do is play! Identify a dataset you are interested in and get the entire machine learning pipeline up and running for it. Here's how I would go about it.
1) Get Jupyter up and running. You don't really need to do much to set it up. Just grab a Docker image.
2) Choose a dataset.
I wouldn't collect my own data first thing. I would just choose something that's already out there. You don't want to be bogged down by having to wrangle data into the format you need while learning NumPy and Pandas at the same time. You can find some interesting datasets here:
And don't go with a neural net first thing, even though it is currently in vogue. It requires a lot of tuning before it actually works. Go with a gradient-boosted tree. It works well enough out of the box.
3) Write a classifier for it. Set up the entire supervised machine learning pipeline. Become familiar with feature extraction, feature importance, feature selection, dimensionality reduction, model selection, hyperparameter tuning using grid search, cross-validation, and so on.
For this step, let scikit-learn be your guide. It has terrific tutorials, and the documentation is a better educational resource than beginning coursework.
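To make step 3 concrete, here is a rough sketch of such a pipeline in scikit-learn. The choices in it are my assumptions, not a recipe: the built-in Iris data stands in for whatever dataset you picked, `SelectKBest` is just one way to do feature selection, and the parameter grid is deliberately tiny.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Assumption: Iris is used only as a small stand-in dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Feature selection followed by a gradient-boosted tree classifier.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("model", GradientBoostingClassifier(random_state=0)),
])

# Small, illustrative grid; step names prefix the parameter names.
param_grid = {
    "select__k": [2, 4],
    "model__n_estimators": [50, 100],
    "model__max_depth": [2, 3],
}

# Grid search with 5-fold cross-validation on the training set only.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Held-out accuracy and the importances of the selected features.
test_accuracy = search.score(X_test, y_test)
importances = search.best_estimator_.named_steps["model"].feature_importances_

print("best params:", search.best_params_)
print("test accuracy:", round(test_accuracy, 3))
print("feature importances:", importances)
```

Keeping the test split out of the grid search is the point of the design: the cross-validation picks the hyperparameters, and the held-out score tells you how well that choice generalises.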
4) Now you've built out the supervised machine learning pipeline all the way through! At this point, you should just play:
4a) Experiment with different models: Bayes' nets, random forests, ensembling, hidden Markov models, and even unsupervised learning models such as Gaussian mixture models and clustering. The scikit-learn documentation is your guide.
4b) Let your emerging skills loose on several datasets. Experiment with audio and image data so you can learn about a variety of different features, such as spectrograms and MFCCs. Collect your own data!
4c) Along the way, become familiar with the SciPy stack, in particular, NumPy, Pandas, SciPy itself, and Matplotlib.
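As a sketch of what "just play" in step 4 can look like, the snippet below swaps a few scikit-learn classifiers through the same cross-validation call and then fits a Gaussian mixture model without labels. The particular models, their settings, and the use of the Iris data are my own picks for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Assumption: Iris again, purely as a small stand-in dataset.
X, y = load_iris(return_X_y=True)

# Every estimator shares the same fit/predict interface, so swapping is trivial.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "naive_bayes": GaussianNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Mean 5-fold cross-validated accuracy per model.
cv_accuracy = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, acc in cv_accuracy.items():
    print(f"{name}: {acc:.3f}")

# Unsupervised: a Gaussian mixture model assigns each row to a component
# without ever seeing the labels y.
gmm = GaussianMixture(n_components=3, random_state=0)
clusters = gmm.fit_predict(X)
print("cluster assignments for the first 10 rows:", clusters[:10])
```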
5) Once you've gained a bit of confidence, look into convolutional and recurrent neural nets. Don't reach for TensorFlow. Use Keras instead. It is an abstraction layer that makes things a bit easier, and you can actually swap out TensorFlow for Theano.
6) Once you feel that you're ready to learn more of the theory, then go ahead and take coursework, such as Andrew Ng's course on Coursera. Once you've gone through that course, you can go through the course as it has actually been offered at Stanford here (it's more rigorous and more difficult):
I will also throw in an endorsement for Cal's introductory AI course, which I think is of exceptionally high quality. A great deal of care was put into preparing it.
There are other good resources that are more applied, such as:
I hope this helps. What I am trying to impart is that you will understand and retain coursework material better if you've already got experience, or better yet, projects in progress that are related to your coursework. You don't need to undergo the extensive preparation that is being proposed elsewhere before you can start PLAYING.