A Few Useful Things to Know about Machine Learning [pdf] (washington.edu)
213 points by alrex021 on May 18, 2015 | 18 comments

When I first read this paper, I found it to be immensely helpful. I'm new to Machine Learning but I was so inspired by this paper when I first found it that I wanted to build up resources around it.

Here's the result: https://github.com/hangtwenty/dive-into-machine-learning

I want this guide to be a good resource for other people like me, who are curious to get into Machine Learning by this process:

1) hacking
2) coming to understand what you hacked
3) more structured, in-depth learning

It can be intimidating to approach Machine Learning this way. For a long time it felt like I couldn't do steps 1 or 2... and had to start with 3. That's intimidating!

Pull requests welcome, I want this to be a good resource! Thanks all.

Well done - I'm sure this approach is helpful to many people who, like myself, learn best by using the process you summarised, so here's another compliment to the list :)

This is fantastic! I was looking for something like this to get started. Thank you so much for putting this together. I am taking some machine learning courses next semester and will use this to get familiar with the field over the summer.

Really nice, thanks for posting.

Pedro Domingos also has a fantastic MOOC at https://www.coursera.org/course/machlearning

Whoa, I didn't realize this, thank you!

This is a very useful guide, although I'd imagine you need at least some experience with ML before you can truly appreciate what's being explained in the paper.

I'm assuming you meant to respond to me. Thanks! I agree with you, and I'm sure it's riddled with mistakes. But, sometimes it takes a beginner to make a beginner's guide. So I'm hoping it can be valuable in that way, and that contributions can correct my mistakes.

This is great, wish this was around about 6 months ago. I'm going to go over this and fill in some knowledge gaps.

It was written in 2012 and is a fairly well-known and accessible text! Pedro is an excellent speaker; if you ever get the chance to hear him, I highly recommend it.

Is anyone else getting an untrusted cert?

I got it too.

Ok cool. 2 hours into my comment, and I was beginning to think I was being MiTM'd.

"So there is ultimately no replacement for the smarts you put into feature-engineering."

Recently, deep learning changed this. Finding the right network architecture allows the net to learn the features by itself.

That's the goal of representation learning, but we're not quite there yet. From a previous comment:

syllogism 264 days ago

Deep learning needs feature engineering too.

You still need to transform your context into a vector of boolean or real values, somehow. And that transform is going to encode assumptions about what information is relevant to the problem, and what's not.
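To make that concrete, here is a minimal sketch of such a transform for the house-price case. The field names (`sqft`, `bedrooms`, `has_garage`, `neighborhood`) and the neighborhood list are made up for illustration; they are assumptions, not anything from the paper or this thread. The point is only that choosing which fields go into the vector, and how, already encodes a view of what matters.

```python
from typing import Dict, List

# Hypothetical category set, assumed for illustration only.
NEIGHBORHOODS = ["downtown", "suburb", "rural"]


def encode_house(house: Dict[str, object]) -> List[float]:
    """Turn a raw house record into a flat vector of real and 0/1 values."""
    vec = [
        float(house["sqft"]),                 # real-valued feature
        float(house["bedrooms"]),             # real-valued feature
        1.0 if house["has_garage"] else 0.0,  # boolean feature as 0/1
    ]
    # One-hot encode the neighborhood: one 0/1 slot per known category.
    vec.extend(1.0 if house["neighborhood"] == n else 0.0 for n in NEIGHBORHOODS)
    return vec


example = {"sqft": 1200, "bedrooms": 3, "has_garage": True, "neighborhood": "suburb"}
features = encode_house(example)
print(features)
```

Anything left out of `encode_house` (distance to a school, say) is invisible to the model, deep or not.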

Let's say you're trying to predict house prices. There's no end of geo-tagged data you might pull in. And if you have a cleverer idea than the next guy, your model will be more accurate. And, probably, if the next guy's at least competent, it'll be your feature ideas that set you apart.

In a linear model, you need to come up with a clever set of conjunction features that balances bias and variance. You don't need to do that for a deep learning model, and that's a big advantage. But that's not the same as saying there's no feature engineering.
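As a rough illustration of what "conjunction features" means here, the sketch below adds the pairwise AND of every pair of boolean features. The feature names and the `add_conjunctions` helper are hypothetical. A linear model can only weight each input separately, so an interaction like "garage AND near a school" has to be handed to it as its own input.

```python
from itertools import combinations
from typing import Dict


def add_conjunctions(base: Dict[str, bool]) -> Dict[str, bool]:
    """Return the base features plus one AND feature per pair of base features."""
    out = dict(base)
    # Sort the names so the generated feature names are deterministic.
    for a, b in combinations(sorted(base), 2):
        out[f"{a}&{b}"] = base[a] and base[b]
    return out


# Hypothetical boolean features for one house.
base = {"has_garage": True, "near_school": True, "on_main_road": False}
expanded = add_conjunctions(base)
print(expanded)
```

Adding every pair blindly grows the feature count quadratically, which tends to reduce bias but increase variance; picking only the conjunctions that matter is the "clever" part.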

Who said there is no feature engineering?

This is true: deep learning can make feature selection and engineering easier. That being said, a deep learning method can be overkill for a large number of the problems that ML is used to solve. The amount of training data and computational power it needs is often unavailable or takes a huge effort to obtain. I believe in keeping things simple if possible and putting a little more thought into feature construction. However, that isn't best for all problems.

Just finished reading this. Brilliant!
