Ask HN: What free resources did you use to learn how to program ML/AI?
415 points by acalderaro on Aug 12, 2017 | 51 comments

Firstly, while I think it's beneficial to learn multiple languages (Python, R, MATLAB, Julia), I'd suggest picking one to avoid overwhelming yourself and freaking out. I'd suggest Python because there are great tools and lots of learning resources out there, plus most of the cutting-edge neural network action is in Python.

Then for overall curriculum, I'd suggest:

1. Start with basic machine learning (not neural networks) and in particular, read through the scikit-learn docs and watch a few tutorials on YouTube. Spend some time getting familiar with Jupyter notebooks and pandas and tackle some real-world problems (Kaggle is great, or google around for datasets that excite you). Make sure you can solve regression, classification and clustering problems and understand how to measure the accuracy of your solution (understand things like precision, recall, MSE, overfitting, train/test/validation splits).

2. Once you're comfortable with traditional machine learning, get stuck into neural networks by doing the fast.ai course. It's seriously good and will give you confidence in building near cutting-edge solutions to problems

3. Pick a specific problem area and watch a Stanford course on it (e.g. cs231n for computer vision or cs224n for NLP)

4. Start reading papers. I recommend Mendeley to keep notes and organize them. The Stanford courses will mention papers. Read those papers and the papers they cite.

5. Start trying out your own ideas and implementations.
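To make the evaluation terms in step 1 concrete, here is a minimal sketch in plain Python, written so it runs without any libraries. The function names and toy labels are made up for illustration; in practice scikit-learn provides equivalents such as `train_test_split`, `precision_score`, `recall_score` and `mean_squared_error`.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def mse(y_true, y_pred):
    """Mean squared error for regression predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy usage: hold out a quarter of the data, then score some made-up predictions.
train, test = train_test_split(range(20))
print(len(train), len(test))
print(precision_recall([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

The point of the held-out split is that a model scoring well on the training rows but badly on the test rows is overfitting.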

While you do the above, supplement with:

* Talking Machines and O'Reilly Data Show podcasts

* Follow people like Richard Socher, Andrej Karpathy and other top researchers on Twitter

Good luck and enjoy!

For those who like videos, I would highly recommend utilizing Andrew Ng's Coursera ML videos for step one. I found his lectures to be good high level overviews of those topics.

The course in general lacks rigor, but I thought it was a very good first step.

I strongly disagree with this recommendation.

Andrew Ng's Coursera course is probably good for some backgrounds. But if your background is as someone who has mostly been programming for the last few years, I feel that Andrew Ng's course has two big drawbacks:

1. It's not very hands-on or practical. You won't actually get the feeling of building anything for a while.

2. It's very math oriented. If the last time you took a math class for your CS degree was a few years ago, you run the risk of not really remembering the background material well.

I'd personally recommend doing two things in parallel, if your background is in programming with less math training:

1. Look for a very hands-on/practical course to try out some examples.

2. At the same time, start refreshing (or learning) some maths that you might not remember, specifically probability and statistics. After that, linear algebra and maybe calculus.

I'm going to disagree with this about the difficulty of the math in Andrew Ng's course. Do you remember how to differentiate a function? Look up partial derivatives if you don't remember how they work; it shouldn't take longer than an hour. You're probably going to be fine.
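As a quick refresher on what a partial derivative is, here is a small sketch that checks hand-derived partial derivatives against a numerical estimate. The function f(x, y) = x^2 * y + 3y is an arbitrary example chosen for illustration, not something taken from the course.

```python
def f(x, y):
    """Example function of two variables: f(x, y) = x^2 * y + 3y."""
    return x ** 2 * y + 3 * y


def df_dx(x, y):
    """Partial derivative with respect to x (treat y as a constant): 2xy."""
    return 2 * x * y


def df_dy(x, y):
    """Partial derivative with respect to y (treat x as a constant): x^2 + 3."""
    return x ** 2 + 3


def numeric_partials(func, x, y, eps=1e-5):
    """Estimate both partial derivatives with central finite differences."""
    dx = (func(x + eps, y) - func(x - eps, y)) / (2 * eps)
    dy = (func(x, y + eps) - func(x, y - eps)) / (2 * eps)
    return dx, dy


# Compare the analytic and numerical values at one point.
print(df_dx(2.0, 5.0), df_dy(2.0, 5.0))
print(numeric_partials(f, 2.0, 5.0))
```

Nudging one input while holding the other fixed is all a partial derivative measures, which is why the two results should agree closely.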

If you never took calculus it's probably going to be hard going, but almost all modern machine learning requires basic calculus.

I would really recommend going through the first part of the course about linear regression if you haven't encountered it before, it was really eye opening for me.

Linear regression is incredibly important, but I think it's much better understood either practically (by implementing it or using it), or if you want to understand it mathematically, at the "end" of a statistics course. There's a reason that when learning probability/statistics, you usually encounter Linear Regression near the end of an introductory course, not in the beginning.

Again, this really depends on how mathematically competent you already are. I'm just basing this on how I felt coming to the course after having finished my degree about 10 years ago, therefore not really having most prob/statistics fresh in my mind.

You can certainly complicate the hell out of linear regression, but Andrew Ng introduces it in the setting of optimization/stochastic gradient descent, which I think is both mind blowing and a much simpler introduction than most statistics courses.
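For anyone curious what "linear regression as optimization" looks like, here is a minimal sketch under my own assumptions: the data, learning rate and step count are made up, and it uses plain batch gradient descent (the stochastic variant updates on one sample at a time instead of the whole dataset). It is not code from the course.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every sample under the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of (1 / 2n) * sum(error^2) w.r.t. w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # Step downhill, against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should recover w ~ 2, b ~ 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
print(fit_line(xs, ys))
```

The same loop (predict, measure error, follow the gradient) is what neural network training does at a larger scale, which is part of why this framing is a useful first step.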

It's the very first bit of the course, I think everyone who is interested should try learning it. If not it's fine, but I wouldn't want anyone to not even try to spend a few hours on it because someone on the internet said it would be too hard.

That's certainly reasonable. And I totally agree with "try it out and gauge for yourself whether it's valuable for you".

My worry is that people will be put off from the field of machine learning if, 3 lessons into Andrew Ng's course, they find that they don't understand anything, and that it's not practical to boot.

So my advice (generally applicable) is to try a few different things, because different resources click for different people.


* https://www.udacity.com/course/intro-to-artificial-intellige...

* https://www.udacity.com/course/machine-learning--ud262

Deep Learning:

* Jeremy Howard's incredibly practical DL course http://course.fast.ai/

* Andrew Ng's new deep learning specialization (5 courses in total) on Coursera https://www.deeplearning.ai/

* Free online "book" http://neuralnetworksanddeeplearning.com/

* The first official deep learning book by Goodfellow, Bengio, Courville is also available online for free http://www.deeplearningbook.org/

Two good ebooks. Go well with R.

Introduction to Statistical Learning http://www-bcf.usc.edu/~gareth/ISL/

Elements of Statistical Learning https://web.stanford.edu/~hastie/ElemStatLearn/

* Course: fast.ai (http://course.fast.ai). Practical, to the point, theory + code.

* Book: Hands-On Machine Learning w/ Scikit-Learn & TensorFlow (http://amzn.to/2vPG3Ur). Theory & code, starting from "shallow" learning (e.g. linear regression) on scikit-learn, pandas, numpy, then moving to deep learning with TF.

* Podcast: Machine Learning Guide (http://ocdevel.com/podcasts/machine-learning). Commute/exercise backdrop to solidify theory. Provides curriculum & resources.

Online courses recommended in this thread are great resources to get your feet wet. If you want to actually be able to build ML powered applications, or contribute to an MLE team, we've written a blog post which is a distillation of conversations with over 50 top teams (big and small) in the Bay Area. Hope you find it helpful!


Disclaimer: I work for Insight

Andrew Ng's tutorials[1] on Coursera are very good.

If you're into python programming then tutorials by sentdex[2] are also pretty good and cover things like scikit, tensorflow, etc (more practical less theory)

[1] https://www.coursera.org/learn/machine-learning

[2] https://pythonprogramming.net/data-analysis-tutorials/

This doesn't actually answer the question, but I always think that people who want to study neural nets should read Marvin Minsky and Seymour Papert's Perceptrons. It's an academic work. It's short. It's incredibly well written and easy to understand. It shaped the history of neural net research for decades (err... stopped it, unfortunately :-) ). You should be able to find it at any university library.

Although this recommendation doesn't really fit the requirements of the poster, I think it is easy to reach first for modern, repackaged explanations and ignore the scientific literature. I think there is a great danger in that. Sometimes I think people are a bit scared to look at primary sources, so this is a great place to start if you are curious.

"Learn AI the Hard Way". It's actually just reading a bunch of papers and trying to implement them, and anytime you don't understand something spend as much time as needed until you get it.

1. Udacity: Machine Learning

2. Deep Learning Summer School Montreal 2016 https://sites.google.com/site/deeplearningsummerschool2016/h...

3. selfdrivingcars.mit.edu + youtube playlist "MIT 6.S094: Deep Learning for Self-Driving Cars" (https://youtu.be/1L0TKZQcUtA?list=PLrAXtmErZgOeiKm4sgNOknGvN...)

4. Coursera: Machine Learning with Andrew Ng

5. Stanford cs231n (https://www.youtube.com/watch?v=g-PvXUjD6qg&list=PLlJy-eBtNF...)

6. Deep Learning School 2016 (https://www.youtube.com/playlist?list=PLrAXtmErZgOfMuxkACrYn...)

7. Udacity: Deep Learning (https://www.udacity.com/course/deep-learning--ud730)

I created a blog (http://ai.bskog.com) to have as a notepad and study backlog. There I keep track of what free courses I am currently taking and which one I will take next.


Although video courses are good, everyday life sometimes makes it difficult to follow videos on YouTube while, for instance, doing chores around the house or working out, because you often need to (a) see the slides/code examples and (b) put it into practice right away. Podcasts, on the other hand, are good for giving you a flow of information.

Linear Digressions, Data Skeptic and (thanks to this thread, I have now discovered) Machine Learning Guide.

Don't be discouraged if there is stuff you do not understand, or if you feel like "I can never remember these terms or that algorithm". Just be immersed in the information and stuff will fall into place. Later, when you hear about that thing again, it will make more sense. I tend to use a breadth-first approach to learning, where I get exposed to everything before digging into details, thus getting an overview of what I need to learn and where to start.

A study group meetup (Every Tuesday evening in Austin, TX): https://www.meetup.com/cppmsg_ai/

Just Q&A - no presentations. Study from whatever books (http://amlbook.com/ and http://www.deeplearningbook.org/ are popular in our group) or courses (Andrew Ng's are also popular) you like throughout the week and then show up with any questions you have. We've been meeting for a couple of months now and new folks are always welcome no matter where you are in your studies!

I did the "early years" of both statistics and tiny neural networks/perceptrons in college a long time ago. It also helps that I use math at work (anything from simulated 3D physics to DSP.)

Since then, I've used Wikipedia and Mathworld when work has needed it. Regression, random forest, simulated annealing, clustering, boosting and gradient ascent are all on the statistics/ML spectrum.

But the best resource was running NVIDIA DIGITS, training some of the stock models, and really looking deeply at the visualizations available. You could do this on your own computer, or these days, rent a spot GPU instance on EC2 for cheap.

I highly recommend going through the DIGITS tutorials if you want a crash course in deep learning, and make sure to visualize all the steps! Try a few different network topologies and different depths to get a feel for how it works.

For deep learning, and ConvNets in particular, cs231n can't be beat.

Geoff Hinton's Coursera course was what got me into it. It's not for the faint of heart. I might recommend Andrej Karpathy's cs231n as a more up to date source today.

This is only the tip of the iceberg, but I found this introduction to naive bayes classification assumed little prior knowledge and successfully helped me build a basic classifier: https://monkeylearn.com/blog/practical-explanation-naive-bay...
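To give a flavour of how little code a basic naive Bayes text classifier needs, here is a minimal sketch. It is not the code from the linked post; the class name, the whitespace tokenizer and the toy training sentences are my own assumptions, and it uses word counts with add-one (Laplace) smoothing.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.class_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()

    def fit(self, documents, labels):
        for text, label in zip(documents, labels):
            self.class_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.class_counts.items():
            # Start from the log prior, then add one log likelihood per word.
            score = math.log(count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing keeps unseen words from zeroing the score.
                smoothed = self.word_counts[label][word] + 1
                score += math.log(smoothed / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy usage with made-up training sentences.
model = NaiveBayes()
model.fit(
    ["free prize click now", "win money free",
     "meeting at noon", "lunch with the team at noon"],
    ["spam", "spam", "ham", "ham"],
)
print(model.predict("free money now"))
print(model.predict("team meeting at noon"))
```

The "naive" part is the assumption that words are independent given the label, which is what lets the per-word log probabilities simply be added up.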

For the math: MIT OCW Scholar and maybe Klein's Coding the Matrix.

For AI specifically, MOOCs on Coursera, edX, and Udacity will give you plenty of options. The ones by big names like Thrun, Norvig, and Ng are great places to start.

It really helps to already be comfortable with algorithms. Princeton's MOOCs on Algorithms by Bob Sedgewick on Coursera would be a great place to start.

Think Bayes and Python Data Science Handbook are a good starting point. Below is the list of free books to learn ML/AI


Fast.ai is absolutely wonderful

If you were to spend a year or so going through many of the resources presented here, and probably knew your stuff pretty well (or at least as well as you could after a year), would anyone actually give you a job?

Nobody is "given" a job; you "earn" a job by convincing the hiring manager that you can do what they need done.

If you're any good, and have good results to show and talk about, yes, you could totally be employed.

If you show that you're extra willing to do all the heavy data preparation and labeling work yourself as well as the infrastructure that runs the models, you'll have an even easier time. Most people just want to play with models, and believe data preparation is "beneath" them, but that's actually where the meat is and where the success of the model is made or destroyed.

It depends what sort of a job you have in mind. If you wanted the sort of job where you spend all day every day doing ML/DL/AI stuff then no, that's a pure research job and probably needs a PhD. But the life of an ordinary working data scientist isn't like that: you would spend 75% of your time acquiring and cleaning/pre-processing data (including all the organizational tasks of finding it and persuading people to give you it), 20% of your time trying to shepherd what you had created/discovered into a real, working production system, and maybe 5% if you are lucky on this sort of thing. You absolutely can learn everything you need to get to this level through MOOCs. The rest is down to your interview skills.

The free Azure ML tutorials are pretty cool.


There are too many resources from which to choose. It would be thoughtful of anyone to share AI learning pathways, like a syllabus, using those resources.


For Deep Learning, deeplearning.ai has launched a free course on Coursera, which you may want to check out.

So who else has signed up for the deeplearning.ai course then? (I just did)

arxiv.org to learn the models, SemanticScholar to find connections between papers, GitHub search to find other people's implementations

...and Andrej Karpathy's Sanity Preserver website to search and review Arxiv papers more easily: http://www.arxiv-sanity.com/

Wow, that is really cool, thanks for sharing that! The YouTube video explaining it is here, well worth the short watch: https://m.youtube.com/watch?v=S2GY3gh6qC8

I've always thought that ML/AI for me was about learning the languages that could express my idea of how it could work. In order to do that myself, I started reading about algorithm types.


There was one particular study piece that I remember reading that I believe was written in the late '70s or early '80s, but I can't remember its name. It was an unformatted HTML uni coursework document, and the guy who wrote it said he'd just keep changing it as required. I really wish I could remember his name.

I have a slightly different bent on what is discussed here, because my particular implementation reflects what I think is important. There are an infinite number of variations. It depends on what you think it might be good for.

Are there good Deep Learning tutorials or blog posts with code (github) in Java, NodeJS, PHP, Lua, Swift or Go ?

You could do it in those languages. But it would be uphill all the way and you'd look back in a year and realise you'd expended 10x the effort trying to hammer a square peg into a round hole than it would have taken to just learn R, Python or MATLAB upfront.

I can code in Julia, Python, R and Matlab. It's just the ignorance of some from the AI field. R and Matlab might be good for prototyping but not for long-running, stable web server applications. Julia is still too new. Python makes whitespace part of its syntax. And Python destroyed its community by splitting it in half: 2 and 3 are still around more than a decade later, and many projects will never upgrade to 3 but have moved on to Go instead.

Lua is used in Torch. Some are in C++. Unfortunately too many entry-level samples are in Python. It's like the jQuery plague was to JavaScript not too long ago, when every other question on StackOverflow was answered with a slow-as-hell jQuery snippet instead of vanilla JavaScript. Good that people moved on. I welcome the same for Deep Learning.

> Java, NodeJS, PHP, Lua, Swift or Go ?


>I can code in Julia, Python, R and Matlab

Find one course you like, and convert it to Java, NodeJS, PHP, Lua, Swift or Go. You will learn the course inside out, and you will build the tutorial you are looking for.

Note that you have TensorFlow for Java, Go and C. For Java, you can also look at deeplearning4j.

>R and Matlab might be good for prototyping

this is what you do in ML...


If you are into watching programming videos, I would recommend Siraj Raval Youtube channel - https://www.youtube.com/channel/UCWN3xxRkmTPmbKwht9FuE5A

It is quirky, funny and above all very short and crisp and gives you a quick overview of things. Most of his videos are related to AI/ML.

I'm with the others on this. Never mind the cringe - he's all show, so much so I think he's bluffing (doesn't know ML). He amps up on "character" so much you're excited for the knowledge drop - when it comes, it's so fast and technical there's nothing to gain from it. The adage "if you can't explain something simply you don't understand it" applies. I was hoping he understood ML enough to boil things down; instead he spews equations and jargon so fast (1) you don't catch it, (2) I think he's just reading from a source. He doesn't go for essence, he goes for speed - and that's not helpful.

Again, the cringe isn't the problem directly; but that it's a cover for his bluff. The result is a not-newbie-friendly resource.

I just checked out the "About" section of his Youtube channel.

> I've been called Bill Nye of Computer Science, Kanye of Code, Beyonce of Neural Networks, Usain Bolt of Learning, Chuck Norris of Python, Jesus Christ of Machine Learning, but it's the other way. They are the Siraj Raval of X

I mean, seriously?

I was watching a twitch livestream where he was coding an RL thing. His code was just wrong (I paused it and looked through it), but it compiled anyways and started outputting stats, so he declared "I'm such a baller! It's learning!" and then quickly concluded the program. It's one thing to find his style annoying, but he is neither a strong thinker nor coder.

I've personally found him to be more of a "showman" and a youtube "star" rather than someone technically adept with data sciences. He is good at what he does - which is building cool things using cool tools/api.

But I wouldn't recommend him as a good resource to learn core ML from or to figure out how stuff works internally.

That rapping video with three women dancing in the background is cringe worthy. No comment on whether the AI lessons are any good.

I think Siraj is a tool user, not a tool maker.

He just pipes input through a bunch of libraries that are available off the shelf. Does that produce a useful output? Sure. Could he write any of them himself, or explain how any of them work beyond a superficial overview? I doubt it.
