
Machine Learning 101 slidedeck: 2 years of headbanging, so you don't have to - flor1s
https://docs.google.com/presentation/d/1kSuQyW5DTnkVaZEjGYCkfOxvzCqGEFzWBy4e9Uedd9k/preview?imm_mid=0f9b7e&cmp=em-data-na-na-newsltr_20171213&slide=id.g183f28bdc3_0_90
======
kmax12
As someone who works with a lot of people new to machine learning, I
appreciate guides like this. I especially like the early slides that help
frame AI vs ML vs DL so that people can have a realistic understanding of what
these technologies are for.

For my part, one of the biggest realizations I had after many years of
applying machine learning was that I got too caught up in the machine learning
algorithms themselves. I was often way too eager to guess and check across
different algorithms and parameters in search of higher accuracy. Fortunately,
there are automated tools today that can do that for you.

However, the key piece of advice I'd give someone new to machine learning is
not to get caught up in the different machine learning techniques (SVM vs
random forest vs neural network, etc). Instead, (1) spend more time on
translating your problem into terms a machine can understand (i.e., how are
you defining and generating your labels) and (2) figure out how to perform
feature engineering so that the right variables are available for machine
learning to use. Focusing on these two things helped me build more accurate
models that were more likely to be deployed in the real world.

Feature engineering in particular has become a bit of a passion of mine since
that realization. I currently work on an open source project called
Featuretools
([https://github.com/featuretools/featuretools/](https://github.com/featuretools/featuretools/))
that aims to help people apply feature engineering to transactional or
relational datasets. We just put out a tutorial on building models to predict
what product a customer will buy next, which is a good hands-on example for
beginners to learn from:
[https://github.com/featuretools/predict_next_purchase](https://github.com/featuretools/predict_next_purchase)
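The "translate your problem into features" step can be as simple as rolling
raw transactions up into per-entity aggregates before any algorithm is
chosen. Here is a minimal, dependency-free sketch of that idea (the customer
IDs and amounts are made up for illustration; Featuretools automates this
kind of aggregation at much larger scale):

```python
from collections import defaultdict

# Hypothetical transaction log: (customer_id, amount) pairs.
transactions = [
    ("alice", 20.0), ("alice", 30.0), ("bob", 5.0),
    ("alice", 10.0), ("bob", 15.0),
]

def customer_features(txns):
    """Roll raw transactions up into per-customer feature vectors
    (count, total, mean) that any classifier or regressor can consume."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cid, amount in txns:
        sums[cid] += amount
        counts[cid] += 1
    return {
        cid: {
            "n_transactions": counts[cid],
            "total_spent": sums[cid],
            "mean_spent": sums[cid] / counts[cid],
        }
        for cid in counts
    }

features = customer_features(transactions)
print(features["alice"])
# {'n_transactions': 3, 'total_spent': 60.0, 'mean_spent': 20.0}
```

The point is that this table of derived variables, not the choice of
algorithm, usually decides how well the final model performs.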

~~~
Fiahil
Don't you think people are, sometimes, just applying ML to their problem
"because of hype"?

One example I have in mind was a contest where participants were given a
series of satellite pictures and asked to write a classifier to detect
icebergs and cargo ships (the two are quite similar). As someone else pointed
out, trying to use classical computer vision and machine learning on these
images will always have some error rate during identification. However, if we
were able to extract the speed and trajectory of all objects in the picture
and mix them with AIS data, finding which ones are ships, which ones are
giant pieces of ice, and which ones are non-moving structures to be avoided
becomes easy.

So, you have to choose between a black box that will give you potential
results with a given error rate, and a predictable algorithm that anyone can
audit. Seems like a no-brainer situation to me. For what other reason would
you choose the first solution, except hype-related decisions?

~~~
radarsat1
Your comparison seems like a false dichotomy, and I think you are agreeing
with OP. OP says, spend less time worrying about the algorithm and more time
worrying about what data you are feeding the algorithm. You are saying, what
if you had to choose between dataset A with algorithm A and dataset B with
algorithm B.

You claim, (probably correctly) that dataset B, which includes velocity and
trajectory, is more correct for the problem at hand, and given dataset B, I
would suggest that either algorithm A or B would probably do just fine.

You also claim that algorithm A has "some error rate during identification."
But so will algorithm B, and so will either algorithm on dataset A and B!

The question you should ask is: how much do I care about "black box" vs.
"white box", and is there a trade-off? If the black-box solution (algorithm
A, the "ML" solution) gives you 10% higher accuracy, and that accuracy is
going to save lives, you bet I'd choose it. Or maybe I decide that
interpretability is really important due to external audit reasons, so I need
the white-box solution. But maybe I'd deploy both: the interpretable one, with
the uninterpretable one as a flag for "a human should look at this." Or maybe
I'd combine the results of both algorithms to get even higher accuracy.

There are just so many ways to configure a solution to the problem you
propose, and you are only distinguishing between two of them. In the end the
appropriate choice depends on context.

------
minimaxir
The presentation goes straight from linear regression and classification to
_computer vision and reinforcement learning_.

The practical value of ML/AI is what’s _in between_, which is something that
isn’t often discussed amid all the hype. ML/AI can be used to build models
that work well with nontabular data (e.g. text and images), and can solve
such regression/classification problems more cleanly. (And with tools like
Keras, they’re as easy to train and deploy as a normal model.)

~~~
flor1s
I think slide 12 touches on this. Even in the case of an image we can process
it pixel by pixel, but that would be lunacy!

For text, great results have been achieved using automata, but they only work
for structured strings and break if you add even a little bit of noise.

I feel like ML should be considered whenever programming something would
require you to deal with many different cases, you have a lot of example data
available, and having some false positives / false negatives is not a big
problem.

------
rubenfiszel
If I understand correctly, those are slides from a Googler (not sure if the
slides have corporate approval), which probably have as a side goal to
showcase that Google is a fun place to do ML.

Not that I am judging or anything, but the author's personal website
[http://www.jasonmayes.com/](http://www.jasonmayes.com/), whose link is
displayed multiple times, is a giant ad to get hired elsewhere and shows at
least some desire for other career opportunities. Not sure that reflects
well on the company.

~~~
npgatech
Checking his website, it reeks of narcissism. There are better ways to assert
yourself than to do all the corny things he has done on his self promotion
website.

~~~
jasonkester
Are you honestly slagging a guy off for talking about himself on his
_resume_???

I mean yeah, we computer folk are supposed to be all self-deprecating and
all. But if there is one place we should stop mumbling and talking ourselves
down for a second, that is it.

At some point if you want people to know what you do, you're going to have to
tell them.

~~~
npgatech
I found his approach tacky, loud and insincere.

Of course you should be talking about yourself on your resume, but a couple
of things are different here:

- What's up with the music?

- The 51%/49% thing.

- Publicly asking to be hired, which reflects poorly on his current job at
Google.

- Excessively loud self-marketing.

Why not have a simple site with your accomplishments? Why all the excess
bullshit?

~~~
GFischer
I think that's definitely a cultural thing. It would be in poor taste in my
country, but seems very American.

And many of the things he did do sound like he could be a good contributor - I
guess you can't know about his personality without an interview.

------
vowelless
Slide 64: _A whole tonne of stuff going on in robotics right now. Just take a
look at Boston Dynamics YT channel for some mind blowing research, most of
which is driven by ML._

I _highly_ doubt that BD is doing _any_ ML work right now ... Can the author
link to specific research that they are doing using ML?

~~~
natch
You mean public work perhaps? I imagine they are doing a lot with vision, gait
learning, object manipulation, task planning, autonomy, multi-robot
coordination, etc. all of which can be enabled by or at least helped along by
machine learning, no? Your request for links is valid, I just am surprised
anyone would doubt that they are doing ML research unless you are thinking of
a strangely narrow definition of ML.

~~~
fnl
Most of robotics is about reinforcement learning.

EDIT: Oh, and expert systems/rules. Lots of em.

EDIT2: Well, and engineering, obviously... :-) Heck, just check Wikipedia on
the topic...

~~~
natch
>Most of robotics is about reinforcement learning.

is != always will be.

But forget that, let's check Wikipedia, as you suggest.

What are the first few words of the article on Reinforcement Learning, hmm.
The very first few words at the very beginning of the article:

"Reinforcement learning (RL) is an area of machine learning..."

Read that and tell me what the last two words are. "Machine learning."

------
mmanfrin
Good slides, got me back into the fever of wanting to learn; although a lot
of the credit goes to the linked 3Blue1Brown videos (whose linear algebra
series is excellent), which were a lot more technical but no less
approachable.

Question to those versed in ML: I want to work on an AI that plays a video
game (aspirations of playing something like Rocket League, but I know I need
to start smaller with something like an old NES game). I understand these are
usually done with Recurrent Neural Networks, but I'm a little lost as to how
to get data into the RNN -- will I need to make another AI or CNN to read the
screen and interpret it (including the score)? My 30k ft view is that if I
can define a 'score', give it a 'reset' button, and define 'inputs (decision
targets)', then I just need to give it the screen and let it do its thing.
But getting the 'score' is the part I can't figure out, short of adding
another layer to the classifier.
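For intuition on that score/reset/inputs loop, here is a minimal tabular
Q-learning sketch on a made-up toy environment (the corridor game, reward,
and hyperparameters are all illustrative, not from the slides; a real game
agent would swap the toy state for screen pixels and a learned score reader):

```python
import random

# Toy stand-in for a game: a 1-D corridor where reaching the end scores
# a point. Real game agents (e.g. DQN on Atari) would replace this with
# screen pixels plus a score read from emulator memory or a classifier.
class Corridor:
    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):            # the 'reset button'
        self.pos = 0
        return self.pos

    def step(self, action):     # inputs: 0 = move left, 1 = move right
        self.pos = max(0, min(self.length, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.length
        reward = 1.0 if done else 0.0    # the 'score' signal
        return self.pos, reward, done

def greedy(qvals):
    # Break ties randomly so the untrained agent actually explores.
    best = max(qvals)
    return random.choice([a for a, q in enumerate(qvals) if q == best])

random.seed(0)
env = Corridor()
Q = [[0.0, 0.0] for _ in range(env.length + 1)]   # one row per state
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(200):
    state, done = env.reset(), False
    while not done:
        action = random.randrange(2) if random.random() < epsilon else greedy(Q[state])
        nxt, reward, done = env.step(action)
        # Standard Q-learning update toward reward + discounted best next value.
        Q[state][action] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt

policy = [greedy(Q[s]) for s in range(env.length)]
print(policy)   # learned policy: always move right
```

The key point: once you can define reward, reset, and actions, the learning
loop itself is generic; the hard engineering is usually extracting the state
and score from the game, exactly as you suspected.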

~~~
allenguo
You should check out Berkeley's deep reinforcement learning course[1]. There
are lecture videos, slides, and homework assignments, and it's all very
up-to-date.

[1]
[http://rll.berkeley.edu/deeprlcourse/](http://rll.berkeley.edu/deeprlcourse/)

------
nblavoie
The document is awesome, but the animated backgrounds are distracting.

~~~
ausjke
Exactly, I stopped at the second slide because of that. "I ask for your
undivided attention for two hours" is what it says, but the background
animation doesn't seem to help that goal, quite the opposite.

------
Jaruzel
Reading normally, and skipping the videos, the whole deck takes about 15
minutes. The last third of the slides is basically promotional material for
the various Cloud ML services that are out there.

It's a nice deck, but I'd hoped the blue slides would go more technical
without dropping out to various videos. If I wanted videos, I'd go to YouTube
directly. Not everyone wants to learn through watching people talk. I learn
best when I read; it's unfortunate that youngsters these days think the
written word is now a poor cousin to flashy video.

<rant>

In the same way that new clothes are no longer for me, and new music is no
longer for me, and all good TV shows and films are full of people half my age,
I also now feel that I'm being aged off the internet.

I was here first, you young whippersnappers! It's MY lawn.

</rant>

------
candiodari
Cool presentation ... but there are a million like this. We don't need yet
another basic introduction to machine learning, we need detailed practical
studies of real problems.

~~~
gregorymichael
Easier to curse the darkness than to light a match.

------
Animats
Loading...

Google Slides is really slow. That's why this needs two hours.

Most of the real content is in linked videos.

------
charlysl
"There is no golden road to geometry"

If you really want to understand you would be much better off starting here:

[https://work.caltech.edu/telecourse.html](https://work.caltech.edu/telecourse.html)

------
hprotagonist
I was absolutely convinced by the title that this would be a link to a
research blog post about an analysis of hair motion at metal shows.

~~~
elicox
Me too; after I read "Perceptron" I fast-forwarded, lol.

------
fartcannon
This ends up not being much more than an advertisement for google. Wikipedia's
articles on these subjects have more depth.

------
kbart
The information is great, but it would be much more readable in simple text
form or as a PDF. It's strange that a senior creative engineer at Google
doesn't know the basics of making a presentation.

~~~
quacker
It's not surprising that a Google engineer would use Google docs. It's at
least easily shareable and there are complementary embedded videos that aren't
suitable for text/PDF anyway.

Though, the options to export as a PDF didn't work for me (either via download
or as an export to Google Drive). I'm assuming the presentation is too big.

~~~
qazpot
Can't download the PDF either.

~~~
ratsimihah
This might be intended behavior. If you try to print the deck, it shows the
Google Drive access request page.

------
cerealbad
strobe morphing advertisements that are trained to adapt to your unique
reaction(s) until it detects the same facial response as the last time you
made a purchase. botox, masks, camouflage tattoos, permanent smiling, IR
obscurants. the arms race begins anew. benevolent dictatorship by self
regulating machines is probably going to plague us for several thousand years.
wouldn't that be interesting, if the last evolutionary bottleneck is getting
smart enough to create an enclosed planetary system that fully satisfies your
biological needs and suppresses all intangible ones so there is no reason to
go any further up the chain of exploration.

stopping death means stopping life, every generation is more forgetful of the
past than the last. the light ages will be far more destructive, what could
possibly motivate you to stop a perpetual pleasure machine? how do you prevent
the inevitable conflict between those who insist pain and suffering is an
essential part of the human experiment and those who just force them to feel
good and change their mind? what will happen to these toddlers in 10-15 years?
they will have grown up interfacing with some electronic device for every
single day of their young lives, a different type of consciousness shaped by
destruction of self-confidence in their own knowledge and memories and a
complete trust in the needle-finders of big hay.

this shift will be as important as pre-writing to post-writing, except the
transformation won't take centuries and millennia to propagate itself across
the planet. a post-memory world, with every human enslaved by their base
sensations. the first US president who is an internet addict.

"[Writing] will create forgetfulness in the learners’ souls, because they will
not use their memories; they will trust to the external written characters and
not remember of themselves. The specific which you have discovered is an aid
not to memory, but to reminiscence, and you give your disciples not truth, but
only the semblance of truth; they will be hearers of many things and will have
learned nothing; they will appear to be omniscient and will generally know
nothing; they will be tiresome company, having the show of wisdom without the
reality."

the future is a fate worse than death.

~~~
bpizzi
That's a strongly pessimistic view you've got here. Are you aware that there
are people who think the opposite of you? The thing is: reality is always far
more balanced than what the extremists preach to us.

~~~
cerealbad
is being disagreeable sufficient claim to argument?[0] is there a profit
motive in balance?[1] people with vast weapons of control, deception and war
shape reality, is it balanced?[2]

[0][1][2] no

~~~
bpizzi
I'm sorry, I just don't see what we are arguing about, to be honest.

~~~
purple-again
Dude look at the way he expressed his thoughts. Either a troll or a loon. Best
not to engage him either way.

------
TheNewLab
Nice introduction, but I really don't see how "2 years of headbanging, so you
don't have to" applies.

~~~
pimlottc
I think the author meant "banging my head against the wall" while getting it
all working but didn't realize the term has a very different meaning...

------
donkeyd
I created nearly the same presentation this week. It's good to see that I
didn't miss much, though this one goes deeper, which I deliberately don't do.
I'll probably send attendees this presentation afterwards, for those who want
to go deeper. Very cool!

------
tudelo
Pretty similar information to my semester-long Machine Learning class in
undergrad (with less detail). Good information to get the basics down, though
I can't say I've gotten to apply any of the information I've learned yet...
Still a useful set of slides.

------
rkagerer
Not bad but toward the end it basically just becomes a big pitch for Google's
ML products. It links to 3Blue1Brown's videos which are great!

------
bitamess
I really appreciate you sharing this, as I always try to create a learning
path and make things as clear as possible on top of it. The slides are good
looking (awesome design). The problem is that it sometimes gets overwhelming
when advanced things pop up in a talk, and it's hard to know whether we
should listen to it or whether it's above our current level.

------
mrweasel
I honestly don't really see the value in a slide deck without the
accompanying talk. It's the same as when someone proclaims: "Slides from this
talk are available online". Yeah, that's not really any good without video or
audio.

~~~
nerdponx
Disagree. Slides, even out of context slides, can be an excellent source of
information for people new to a field: they still give a sense of the
structure of the talk (what is related to what), and they give you keywords to
start searching for. And for experienced readers, sometimes they just contain
nice ideas or tips you had not been aware of.

~~~
cyberpunk0
No they aren't. Slides are a skeleton. Just bullet points that 99% of the
time offer no usable information.

------
ForFreedom
How can I download these slides?

~~~
trishmapow2
Going to the /export/pdf link shows "access denied". My quick 3-minute
workaround was to use the FF dev tools: enable screenshots in settings, then
click through and save each page.

~~~
agumonkey
96 x 2 clicks too

such modern

------
mtw
This makes it look like all the interesting ML projects are by Google.

------
swframe2
Probability distributions are approximate world models. There are
mathematical tools to manage them. Once you see the world this way, ML/DL
becomes more intuitive.

------
tziki
>Artificial neuron (aka perceptron)

These aren't synonymous, a perceptron is a type of artificial neuron.

Also, confusingly, 'multi-layer perceptrons' might not contain perceptrons at
all.
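To make the distinction concrete, here is a minimal sketch of a classic
Rosenblatt perceptron: one particular artificial neuron, with a hard step
activation and the perceptron learning rule. Other artificial neurons keep
the weighted-sum structure but use different activations (sigmoid, ReLU),
which is why the terms aren't synonyms. (The AND dataset and hyperparameters
below are just illustrative.)

```python
def step(x):
    # Hard threshold activation: this is what makes it a perceptron,
    # as opposed to a sigmoid or ReLU artificial neuron.
    return 1 if x >= 0 else 0

def train_perceptron(samples, epochs=10, lr=0.1):
    """Perceptron learning rule: nudge weights by the signed error."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = step(w[0] * x1 + w[1] * x2 + b)
            err = target - pred
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# Logical AND is linearly separable, so the perceptron converges.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
preds = [step(w[0] * x1 + w[1] * x2 + b) for (x1, x2), _ in and_data]
print(preds)  # learns AND: [0, 0, 0, 1]
```

Swap the step function for a sigmoid and the update rule for gradient
descent, and you have a different artificial neuron entirely, even though
the weighted sum is identical.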

------
philyalater
As someone who has recently taken an interest in machine learning and AI, this
deck is much appreciated.

------
elephant_burger
Thank you for this. This is an excellent slide deck

------
AdReVice
Why is the file protected? How do I access it?

------
iamwil
What was he headbanging about in these last 2 years? Just a linkbait-y title?

------
XR0CSWV3h3kZWg
I really wish that they hadn't decided on having moving images behind the text
you are supposed to be reading.

~~~
carlmr
_Give us your undivided attention._

Does everything to distract us.

------
catnaroek
The slide titled “A note on dimensionality” reminded me of this xkcd:
[https://www.xkcd.com/547/](https://www.xkcd.com/547/)

“That would be (very) bad.”

------
pruthvishetty
Gold!

