
Reinforcement Learning from scratch - e_ameisen
https://blog.insightdatascience.com/reinforcement-learning-from-scratch-819b65f074d8
======
crapflare
[https://www.alexirpan.com/2018/02/14/rl-hard.html](https://www.alexirpan.com/2018/02/14/rl-hard.html)
Reinforcement learning for the average person is a big waste of time. Probably
for anyone atm.

~~~
kevinwang
I disagree with the conclusion. The article has some critiques of DRL, but I
don't think those invalidate the field as a whole. The article itself even has
a section titled "When could Deep RL work for me?"

------
curiousgal
This is mostly just a preview of a codecamp.

~~~
e_ameisen
Author here. This is aimed at giving an overview of the field, with links to
relevant resources, not a preview of the courses we give at Insight, since
every project here is different!

------
yonkshi
I think "learn" is a bit misleading here, but I do have to say it's a nice and
intuitive overview of RL. RL is quite hard and math-heavy; I don't know if one
can take a shortcut to learning RL without a solid graduate-level math
foundation.

~~~
Buttons840
I disagree. At its core, RL is just updating a table of values, and then using
function approximation (i.e., machine learning) for more complex cases.
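
For what it's worth, here's a minimal sketch of that tabular core: Q-learning
on a made-up two-state toy environment (the environment, rewards, and
hyperparameters are all invented for illustration):

```python
import random

# Toy deterministic environment: states 0 and 1; action 1 moves to state 1,
# action 0 moves to state 0. Landing in state 1 pays reward 1, else 0.
def step(state, action):
    next_state = 1 if action == 1 else 0
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward

alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = [[0.0, 0.0], [0.0, 0.0]]  # Q[state][action]: the "table of values"

random.seed(0)
for episode in range(200):
    state = 0
    for _ in range(10):
        # Epsilon-greedy: mostly exploit the table, sometimes explore.
        if random.random() < epsilon:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] >= Q[state][1] else 1
        next_state, reward = step(state, action)
        # The core update: nudge Q toward reward + discounted best future value.
        Q[state][action] += alpha * (
            reward + gamma * max(Q[next_state]) - Q[state][action]
        )
        state = next_state

# After training, moving toward state 1 should look best from state 0.
print("Q table:", Q)
```

Swap the table for a neural network that maps states to Q-values and you're
most of the way to the "function approximation" case.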

I think it might be perceived as math-heavy because the best resources on the
topic [0][1] use a lot of math notation to express their ideas. The ideas are
subtle and easy to get wrong, though. These books are some of the first that
made me appreciate math notation: it can look scary, but it conveys the ideas
more accurately than words can.

[0] [http://incompleteideas.net/](http://incompleteideas.net/) (his
"Reinforcement Learning: An Introduction" is a great book)

[1] [https://www.manning.com/books/grokking-deep-reinforcement-learning](https://www.manning.com/books/grokking-deep-reinforcement-learning)
looks promising; I've read the few chapters available and was impressed. It's
still an early work with grammatical mistakes, but the layout of the ideas is
clear and organized.

~~~
yonkshi
Yes, the Bellman equation is the fundamental idea behind all RL algorithms
(i.e. updating a table of values), but RL is also much more fragile than
supervised learning methods, so to ensure stability, modern algorithms rely on
complex mathematical tools. I wouldn't say it's the math notation that's
scary; rather, the concepts behind modern algorithms require higher math. For
example, with the Maximum Entropy algorithm for inverse RL, it's essential to
know Shannon information entropy to understand why it works.
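
Since Shannon entropy came up: the core definition is at least compact. A
minimal sketch of the standard formula (not tied to any particular RL
algorithm):

```python
import math

def entropy(probs):
    # Shannon entropy H(p) = -sum_i p_i * log2(p_i), measured in bits.
    # Zero-probability terms contribute nothing, so we skip them.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin: 1.0 bit of uncertainty
print(entropy([0.9, 0.1]))  # biased coin: less uncertainty, under 1 bit
```

Understanding *why* maximizing this quantity regularizes a policy is the
harder part, which is probably the point being made above.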

~~~
moultano
Information theory does not require graduate-level math. Even if it's entirely
new to you, you can pick it up with a few days on Wikipedia.

~~~
yonkshi
Ah yeah, I agree it's not a terribly difficult concept to learn, but perhaps
we have different definitions of graduate-level math. Things like information
theory are, at best, skimmed over in upper-division math classes during
undergrad. Having a solid understanding of things like KL divergence and
information entropy is just an indicator of one's overall math level, and if
you are already there, you can consider yourself at graduate level. Though I
guess the math used in ML is probably child's play compared to what a graduate
mathematician or physicist handles.

------
master_yoda_1
Is this a clickbait article? I wish I had AD BLOCKER plus plus to block this
kind of &*## $#!^ :(

~~~
dna_polymerase
It is on Medium, so it shouldn't have been such a surprise to you.

------
ninjamayo
Just get Sutton and Barto's book.

~~~
e_ameisen
A great book indeed, but a bit long for an introduction!

~~~
eachro
Yeah, "intro" is a bit misleading. Maybe the first 6 chapters (~130 pages) are
enough for a solid intro.

------
setzer22
A small tangential criticism, but using "deep" every other sentence and
especially expressions like "classical deep learning" made me take this
article less seriously.

This is not unique to this author, sadly. I'm tired of seeing the d word
thrown in research papers just for the sake of adding more buzzwords per
buzzword.

Once you've made it clear you are using neural networks with a lot of layers,
you can start using some variation in the discourse. Maybe just call them
neural networks...

------
ogennadi
There were so many technical terms, I'm surprised you could get through even
an overview, and then practicals, in just 4 hours.

Do you know of any resources that list most of the common alternatives? E.g.,
what are the alternatives to A3C for parallelizing, or the alternatives to A2C
for getting policy and value estimates?

