
Machine Learning Crash Course: Part 1 - rafaelc
https://ml.berkeley.edu/blog/2016/11/06/tutorial-1/
======
aub3bhat
As a counter-argument: linear regression is to ML what the "goto" statement is
to programming.

Linear regression looks great on paper since you can derive residuals and
slopes, compare the individual "effects", etc. But that's unnecessary, and in
some cases wrong, when the goal is mere prediction and not explanation. The big
difference between ML and statistics is that the latter selects a "correct"
linear model and then assumes a distribution for the "errors" due to pesky
reality. The effects are used for explanations (538 Nate Silver style
wonk/punditry). Machine learning, on the other hand, tries to predict as close
to the observations as possible without imposing a model or caring about an
explanation.

The simplest introductory machine learning approach should not be linear
regression but rather a 1-nearest-neighbor model.

E.g. rather than handing out data about house prices and square footage, the
questions should be "How do you predict the price of a house in a given
location?", "What are the relevant features?" (location, location, location,
school district, number of rooms, sq ft, etc.) and "How would you collect
labels/data?" (Zillow, excluding prices older than 2-3 years).

The simplest answer would be that the price is the same as that of the
neighboring house (closest lat/long) with similar square footage that sold
recently. This can then be implemented as a weighted distance metric and tested
using leave-one-out cross-validation (I know, not the best metric). But
consider how nearest neighbors allows us to incorporate location information in
a natural manner. That is very important and cannot be incorporated in an
elegant manner in a linear regression model.
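
A minimal sketch of the nearest-neighbor idea above, in plain Python. The
houses, the weights, and the function names are all made up for illustration;
nothing here comes from real Zillow data, and the weights would need tuning in
practice.

```python
import math

# Hypothetical toy data: (latitude, longitude, square_feet, sale_price).
# These numbers are invented for illustration, not real listings.
HOUSES = [
    (37.870, -122.270, 1500, 900_000),
    (37.871, -122.271, 1550, 920_000),
    (37.800, -122.400, 1500, 1_300_000),
    (37.801, -122.401, 2400, 1_800_000),
    (37.872, -122.269, 2300, 1_250_000),
]

# Assumed weights that trade off location against size; in practice
# these would be tuned, e.g. with the cross-validation below.
W_LOCATION = 1000.0  # per degree of lat/long
W_SQFT = 0.01        # per square foot


def distance(a, b):
    """Weighted Euclidean distance over (lat, long, sq ft)."""
    d_loc = math.hypot(a[0] - b[0], a[1] - b[1])
    d_sqft = abs(a[2] - b[2])
    return math.hypot(W_LOCATION * d_loc, W_SQFT * d_sqft)


def predict_price(query, houses):
    """1-nearest neighbor: return the price of the closest known house."""
    nearest = min(houses, key=lambda h: distance(query, h))
    return nearest[3]


def leave_one_out_error(houses):
    """Mean absolute error when each house is predicted from all the others."""
    errors = []
    for i, held_out in enumerate(houses):
        rest = houses[:i] + houses[i + 1:]
        errors.append(abs(predict_price(held_out, rest) - held_out[3]))
    return sum(errors) / len(errors)


if __name__ == "__main__":
    print(predict_price((37.8701, -122.2701, 1510), HOUSES))
    print(leave_one_out_error(HOUSES))
```

Location enters the model only through the distance function, so adding
another feature (say, school district) means adding one more term there
rather than re-deriving a model.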

A big part of ML is applying a varied set of methods across several domains.
Thus for beginners, teaching ML should not be about teaching linear models or
gradient descent but rather about how to start thinking from an ML perspective.

------
minimaxir
The whole "machine learning is just fancy statistics" discussion that happens
on Hacker News endlessly is often pedantic semantics. However, in the case of
linear regression, this _is_ basic statistics that is an analysis life skill
and has many practical applications outside of the hardcore TensorFlow blog
posts. (Case in point: I first learned linear regression during my undergrad
in a "Statistics for Business" class.)

~~~
madenine
Linear regression is dope.

As a data scientist, I often walk new clients through a linear regression
exercise to convey some key concepts about the engagement and demystify what
I'll be doing for them.

I'm often dealing with people who, much like you, haven't done much with stats
since a college "Business Stats" course, so I get a lot of "oh yeah, I vaguely
remember this" - but going through it again gives me a good foundation to
relate back to as things get more complicated.

~~~
sickrumbear
Are there any resources you would suggest for someone in the same boat to get
easily reacquainted and rebuild a good foundation?

~~~
JshWright
The Coursera Machine Learning course just started (I assume you could still
join). I just finished the second week (I'm trying to keep a week ahead due to
the somewhat unpredictable nature of my schedule lately), and have been
enjoying it so far.

The first couple weeks are all about univariate and multivariate linear
regression (as well as an optional linear algebra refresher on matrix
operations).

[https://www.coursera.org/learn/machine-learning/](https://www.coursera.org/learn/machine-learning/)

~~~
cr0sh
I second this course; I took it when it was first offered in 2011 in
association with Stanford, and called the "ML Class" - its success was the
catalyst for the creation of Coursera.

------
CN7R
What's the best way for college freshmen to learn about ML? -- A.I. and ML
aren't really topics talked about until upper-divs, which means a year or two
out for me.

~~~
pyromine
You can certainly learn to use the APIs that are available, but in terms of
really understanding, I'd say wait a bit.

Take classes in probability and linear algebra; from there you'll begin to
have the level of mathematics to be able to really dive in. You'll also have
the computer science maturity to better understand the libraries in use, and
in truth the field will have advanced a bit in the time it takes for you to
get those foundations.

There was just a good reddit thread[1] about which topics in linear algebra
and probability you should be paying attention to, because those two subjects
are largely the mathematical foundations of machine learning.

[1]:[https://www.reddit.com/r/MachineLearning/comments/5klywi/d_w...](https://www.reddit.com/r/MachineLearning/comments/5klywi/d_which_aspects_of_probability_and_linear_algebra/)

~~~
CN7R
Thanks for the info! I'm taking linear algebra this upcoming semester so this
will be useful. :D

------
rrggrr
Seriously... thank you for posting. This was the intro I've been looking for
but never found.

