Ask HN: What is your go-to resource for learning about big data, ML, AI etc?
144 points by vijayr on Oct 28, 2016 | 40 comments

HN has a great, and I mean absolutely great, search feature via Algolia (https://hn.algolia.com). This particular question keeps springing up every now and then, yet no one seems to use the feature despite the search bar being at the bottom of every page.

Edit: removed "inb4 downvotes".
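For anyone who would rather script the search than use the box in the footer, here is a minimal sketch that only builds a query URL. It assumes the public HN Search API endpoint (`/api/v1/search`) and its `query` and `tags` parameters, none of which are mentioned in this thread, so check them against the API documentation before relying on them. The helper name `build_hn_search_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public HN Search API (not stated in the thread).
HN_SEARCH_ENDPOINT = "https://hn.algolia.com/api/v1/search"


def build_hn_search_url(query, tags="story"):
    """Return a search URL for `query`, restricted to items with the given tag."""
    params = urlencode({"query": query, "tags": tags})
    return f"{HN_SEARCH_ENDPOINT}?{params}"


url = build_hn_search_url("machine learning resources")
print(url)
```

Fetching that URL should return JSON, which you can then filter however you like; the sketch stops at URL construction so that it runs without network access.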

I never knew it was at the bottom and I use this site all the time. Thanks for pointing that out. However, it does raise some questions about the UI in this case. Can't we put the search box up in the header where people expect it to be?

I only use search now and again, and since it's not always used I prefer it not take up initial screen space. Also, I think the majority of HN users are able to figure out a way of searching this site relatively easily; I see it as a non-issue.

I think the happy medium would be to add a link to search in the top bar but not the actual search bar.

Indeed, I also didn't know about this. In general I don't think it is easy to find information about how to use Hacker News; everything is hidden in weird places.

I tend to run Google searches like these - site:news.ycombinator.com 'AI'

HN search has been one of the most helpful resources (among many) to my personal and professional life. +1 for that alone. It sowed the seeds for a career path (went from a yeoman replaceable scripter to a guy with a reliable paycheck that can comment with angst on HN.) HN-algolia is snappy. I've been developing a behavior where I default to searching HN before I search Google (for better or worse).

Also, I really apologize for this but, please don't say things like: "Time for me to get down voted to oblivion".

You've spoken your mind (and helpfully so) with the end of your comment (which is otherwise good).

Self-referencing how one expects comment voting to go is a behavior that I wish people would refrain from. It makes the comment "about" itself --- rather than the content. It's a primer that stems from perceptions about how it will be interpreted by the community, which in turn manipulates voting behavior about the comment. (<insert-discussion> voting systems on community forums. is voting itself a good system? </insert-discussion>).

That's true.

If you need books as your learning resources, I would recommend searching Hacker News Books [1]. That site scrapes books from the links shared in HN comments and ranks them.

[1]: http://hackernewsbooks.com/

Great resource, thanks. Would love to learn how to make a website like this: not necessarily detecting books, but detecting [anything] and counting it, still with the scraping features.
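The "detect and count" half of a site like that can be sketched in a few lines. This is not how Hacker News Books actually works (the thread doesn't say); it is a rough illustration that assumes you already have the comment texts as strings and want to rank linked domains by how many comments mention them. The function name, the regex, and the sample comments are all made up for the example, and the scraping step is left out.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Rough URL matcher: stops at whitespace and a few closing characters.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def count_linked_domains(comments):
    """Count, per domain, how many comments contain at least one link to it."""
    counts = Counter()
    for text in comments:
        # Use a set so a domain is counted at most once per comment.
        domains = {urlparse(url).netloc.lower() for url in URL_PATTERN.findall(text)}
        counts.update(domains)
    return counts


# Hypothetical sample data standing in for scraped comments.
sample_comments = [
    "I liked https://www.coursera.org/learn/ml-foundations a lot.",
    "See http://hackernewsbooks.com/ and https://www.coursera.org/learn/machine-learning",
    "No links here.",
]

ranking = count_linked_domains(sample_comments).most_common()
print(ranking)
```

Swapping the domain extraction for something more specific (say, matching book store product URLs) would get you closer to a book ranker; the counting part stays the same.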

This is amazing! thanks!

Reading the responses here, I wonder what a revamped HN homepage would look like if there were a search bar at the top of the page.

The user story for search has been solved. What hasn't been solved, it sounds like, is feature discoverability.

I actually do not care for the Algolia search functionality. The previous search worked far better.

Algolia has built-in suggestion features that cannot be disabled (synonyms? autocorrect?), so it can return content that does not match what the user really wants when they want an exact search. This behavior is especially important to developers, since our terminology often does not match the vocabulary of English (the language of HN). Try searching for the product "logsene", which is simply an example. Quoting words, as you would on Google, does not work all the time.

Whelp, I just learned HN has a search box!

For complete newbies (but with programming experience), I would recommend this UW Coursera course to get introduced to ML Basics: https://www.coursera.org/learn/ml-foundations

Earlier this year Apple acquired Turi for $200 million. It was founded by Carlos Guestrin, one of the professors teaching the course.

We (Class Central) are also working on a six-part, Wirecutter-style guide to learning Data Science online. Here is part 1: https://www.class-central.com/report/best-programming-course...

Feedback would be appreciated (on the format as well as content)!

I'm a huge fan of the rest of this Coursera specialization (or was, until they started charging to submit assignments for it mid-specialization, but I digress...)

Carlos and Emily do a great job diving deeper than most other online courses into the math behind different algorithms without making the math too theoretical. I'm a grad student in engineering, so I wanted to understand not only how to run these algorithms but also how they work and these courses were great for learning in a mathematically rigorous but still approachable sort of way.

The only criticism I've heard of this series is that it uses Turi/Dato/Graphlab instead of SciKit-Learn. I did the courses that exist so far using GraphLab, but I'm starting to redo the assignments using SciKit now so that I learn that toolkit as well.

I think they start charging after the second course.

I am in the same boat as you. I am currently doing Udacity's Machine Learning Nanodegree. But I think I would have felt lost if I hadn't done the first two courses of that Coursera Specialization.

Just started, but it seems that Pandas and SciKit-Learn are very similar to Dato/Graphlab from a usage perspective.

It depends on your focus, of course. Andrew Ng's Coursera course is famous, and it's ideal for someone who wants to get into the mathematics behind various ML algorithms. However, this class will take you into implementing algorithms, but is less about applying them.

If you want to just try them out, I'd honestly recommend just going through the scikit-learn documentation. Almost all of the algorithms provide an example, and the API is pretty consistent across different ML algorithms, to the extent that it can be.
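To make the point about API consistency concrete, here is a small sketch that trains three unrelated classifiers through the same `fit` / `score` calls. The dataset (iris, bundled with scikit-learn), the choice of models, and the hyperparameters are assumptions picked for illustration, not recommendations from this thread.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Small bundled dataset, so nothing needs to be downloaded.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Three different algorithms behind the same estimator interface.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                  # same call for every model
    scores[name] = model.score(X_test, y_test)   # mean accuracy on held-out data

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Swapping in another classifier is usually just one more entry in the dictionary, which is what makes browsing the documentation examples a quick way to try things out.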

People learn differently: some prefer to get into the math right away, others will never be interested in it. I'm interested, but I tend to be more motivated when I've used the algorithms, start to learn about how and why they perform well or poorly under various circumstances, and then dig into the mathematics specifically to find out why.

Also, I'm not going to be creating new ML algorithms. So, you know, that also influences my level of interest. I do care about the mathematics involved, because I do want to genuinely understand why some outputs are available for random forests but not naive Bayes or logistic regression, why performance and/or accuracy is great in some circumstances and not others, and I don't want to have to rely on too much hand waving. But if you want to actually develop and research novel ML algorithms, you'd need to get considerably deeper into the math.

A similar question was asked just 2 weeks ago:

    Ask HN: How to get started with machine learning?

The scikit-learn documentation is solid:


Udacity has a free Introduction to Machine Learning course (which uses scikit-learn and Python). They also have nanodegrees, which are paid.

For big data, 'Big Data' by Nathan Marz was an excellent read. The conceptual chapters are top notch, and the implementation chapters give you a good look into the tools used for the field at the time of publishing.

I enjoy the way this site is written and its focus on getting developers up and running quickly while still instilling conceptual basics.



There are a lot of good online courses to get started; I like the Stanford CS231n lectures - http://www.youtube.com/watch?v=F-g0-6_RRUA&list=PLLvH2FwAQhn...

For keeping up with the latest research once you know what you are doing, reading papers on arXiv daily or weekly is a great approach; nearly everything gets published there.

http://blog.yhat.com/: Tutorials, example apps, and other stuff.

Shameless plug: LearnDataScience http://learned.com is a git repo with Jupyter Notebooks, data and instructions. It's meant for programmers, assumes no math background and addresses data cleaning issues which most classes ignore. Having said that, Andrew Ng's class on Coursera is gold.

I think you meant http://learnds.com/

Ignore the domain but... try this:


It is a remarkably high signal to noise community.

I think it's time for someone to write the equivalent of http://norvig.com/21-days.html for ML/big data :-)

Yikes. I didn't know it was possible to feel embarrassed by watching a TensorFlow video.

For an ML intro, Coursera's machine learning course https://www.coursera.org/learn/machine-learning is great. I have not been through the entire course, but for someone who has no background in it, it's a good intro, as the videos themselves are solid.

I know you didn't ask for this, but here is a gentle introduction to ML http://www.soc.napier.ac.uk/course-notes/sml/manual.html :P

Conferences like WWW, KDD, and ICML for the latest research, Coursera for basics, and textbooks like Pattern Recognition and Machine Learning by Bishop.

A classic reference is Pattern Recognition and Machine Learning by Bishop

METACADEMY is pretty good: short summaries + prerequisite graph



Excellent book for starting with NN and DL.

The best way to learn is by doing, imo. Just go join a Kaggle competition. Maybe people know of others that are as easy to take part in?

Nice. I was looking for ML resources.
