
Spinning Up in Deep RL - samrohn
https://spinningup.openai.com/en/latest/user/introduction.html
======
svalorzen
If you are ever interested in the topic of RL, but wish to start learning the
concepts on simpler algorithms and keep the "deep" part for later, I maintain
a library that has most of the same design goals:

[https://github.com/Svalorzen/AI-Toolbox](https://github.com/Svalorzen/AI-Toolbox)

Each algorithm is extensively commented, self-contained (aside from general
utilities), and the interfaces are as similar as I could make them. One of
my goals is specifically to help people try out simple algorithms so they can
inspect and understand what is happening before moving on to more powerful but
less transparent ones.

I'd be happy to receive feedback on accessibility, presentation, or docs,
suggestions for more algorithms you'd like to see implemented, or general
questions on how things work.

------
GnarlyWhale
Plug for the RL specialization out of the University of Alberta, hosted on
Coursera: [https://www.coursera.org/specializations/reinforcement-learn...](https://www.coursera.org/specializations/reinforcement-learning).
All courses in the specialization are free to audit.

For those unaware, the University of Alberta is Rich Sutton's home
institution, and he approves of and promotes the course.

~~~
infimum
Currently on course 2/4 in the series, and it's great. Every week starts with
a reading assignment from the RL book, followed by a series of videos
(re-)explaining the material. The videos themselves are very nicely
structured, with a clear outline at the start and a summary at the end. Sutton
himself appears in a couple of videos, and there are other great guest
lectures with interesting insights about applications.

Definitely recommended!

------
plants
Asking for my benefit and others', since this is on the front page now: are
there any resources this comprehensive for any other field of study? This
guide is amazing, and I've failed to find anything else like it. I was
specifically interested in biotech (from the perspective of a software
developer, i.e. practically zero biology background), but will take what I can
get.

~~~
unoti
> Are there any resources this comprehensive for any other field of study? ...
> I was specifically interested in biotech

I recommend the fast.ai course on deep learning. Several of their lectures
relate to things their students have done in biotech and medicine. The main
lecturer, Jeremy Howard, has worked for years at the crossroads of medical
technology and AI, and routinely discusses this.

The full fast.ai course is here [1] and free. Here is a blog post and
associated video [2] as an example of fast.ai incorporating biotech into their
work. In this example they use AI to upsample the resolution and quality of
microscope images.

[1] [https://www.fast.ai/](https://www.fast.ai/)

[2]
[https://www.fast.ai/2019/05/03/decrappify/](https://www.fast.ai/2019/05/03/decrappify/)

------
kakadzhun
If you want to play around with Spinning Up in a Docker container, make sure
you `git clone` the repository and then run `pip install -e .` inside it. For
whatever reason, installing it directly with pip doesn't work, at least as of
the last time I tried. Here's a Dockerfile and docker-compose.yaml I created
some time ago:
[https://github.com/joosephook/spinningup-dockerfile](https://github.com/joosephook/spinningup-dockerfile)

------
mementomori
Can anyone recommend some less opinionated introductory resources to learn
reinforcement learning that focus on first principles?

~~~
blahblahblah10
I would highly recommend Sergey Levine's course:

[http://rail.eecs.berkeley.edu/deeprlcourse/](http://rail.eecs.berkeley.edu/deeprlcourse/)

For a more mathematical treatment, there's a beautiful book by Puterman:

[https://www.amazon.com/Markov-Decision-Processes-Stochastic-...](https://www.amazon.com/Markov-Decision-Processes-Stochastic-Programming-dp-0471727822/dp/0471727822/ref=mt_other?_encoding=UTF8&me=&qid=)

------
flooo
RL, including contextual bandits, is becoming more popular for
personalization, i.e. adapting some system to the preferences of (groups of)
individuals.

Plug/Source: I did a lit. review on this topic
[https://doi.org/10.3233/DS-200028](https://doi.org/10.3233/DS-200028)
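
The contextual-bandit flavour of personalization mentioned above fits in a few lines. Here is a hypothetical epsilon-greedy sketch with two user groups (contexts) and two layout variants (arms); the contexts, arms, and click-through rates are invented for illustration and are not taken from the linked review:

```python
import random

random.seed(1)

# Invented example: each (context, arm) pair has an assumed true
# click-through rate that the learner does not know.
TRUE_CTR = {("casual", 0): 0.8, ("casual", 1): 0.2,
            ("expert", 0): 0.3, ("expert", 1): 0.7}
counts = {k: 0 for k in TRUE_CTR}    # pulls per (context, arm)
values = {k: 0.0 for k in TRUE_CTR}  # running mean reward per (context, arm)
EPS = 0.1

def choose(context):
    # epsilon-greedy: mostly exploit the best-looking arm for this context,
    # occasionally explore a random one
    if random.random() < EPS:
        return random.choice((0, 1))
    return max((0, 1), key=lambda arm: values[(context, arm)])

for _ in range(5000):
    ctx = random.choice(("casual", "expert"))
    arm = choose(ctx)
    reward = 1.0 if random.random() < TRUE_CTR[(ctx, arm)] else 0.0
    counts[(ctx, arm)] += 1
    # incremental mean update of the arm's estimated value
    values[(ctx, arm)] += (reward - values[(ctx, arm)]) / counts[(ctx, arm)]
```

In real personalization systems the context is usually a feature vector and the per-arm averages are replaced by a learned model, but the explore/exploit structure is the same.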

------
janhenr
I enormously appreciate the resources OpenAI provides to start out in DRL,
such as this one. However, OpenAI has (purposely?) left out the brittleness of
their algorithms with respect to parameter choices and code-level
optimizations [1] in the past. As a researcher myself, I would be more than
surprised to hear that OpenAI did not explore this behaviour themselves.
Instead, my guess is that these "inconveniences" would harm the marketing of
OpenAI and its algorithms. Such omissions do far more harm to a proper
understanding of DRL and its applications than a nice UI does good, imo.

[1] [https://gradientscience.org/policy_gradients_pt1/](https://gradientscience.org/policy_gradients_pt1/)

------
cbHXBY1D
There was a discussion on r/datascience this weekend about whether anyone uses
RL. Almost no one does.

[https://www.reddit.com/r/datascience/comments/iav3lv/how_oft...](https://www.reddit.com/r/datascience/comments/iav3lv/how_often_do_you_guys_use_reinforcement_learning/)

~~~
p1esk
Why would you ask data scientists about RL? I bet no one there uses deep
learning either. That does not mean much.

------
odomojuli
"Pray, who is the candidate's tailor?" -Hilbert

Who is responsible for OpenAI's UI/UX design? It is immaculate and should be
the standard for the community. I'm always dazzled by the impeccable standards
of OpenAI with regards to tone, presentation, accessibility.

The documentation is both familiar but distinct, an impressive achievement!

I have my own personal qualms on OpenAI's ethics and virtues but am
nevertheless impressed by their aesthetics and regard for their publicity.
It's always delightful to look at their work.

OpenAI has in my opinion, the most appropriate presentation for their ideas
with marketing and branding. It feels exquisitely simple to grasp what goes on
here.

I feel comfortable saying that the biggest obstacle in progress for AI is UI
but projects such as this give me hope.

~~~
colordrops
I assume this comment is generated? The link is a standard Sphinx doc.

~~~
Davidzheng
If that comment is generated, I will quit my current job and work full time on
AI. I don't believe it.

~~~
317070
All the blogs posted by e.g. this user [0] were generated by GPT-3. [1] Some
of those reached the top of HN.

That comment indeed looks a lot like it is generated. It has correlated a
bunch of words, but it did not understand that the link between UI and AI is
tenuous. It is probably one of the few comments where it is so glaringly
obvious. There are likely a lot more comments around which are generated but
which went unnoticed.

This comment is not generated, as the links below are dated after the GPT-3
dataset was scraped.

[0]
[https://news.ycombinator.com/submitted?id=adolos](https://news.ycombinator.com/submitted?id=adolos)

[1] [https://adolos.substack.com/p/what-i-would-do-with-gpt-3-if-...](https://adolos.substack.com/p/what-i-would-do-with-gpt-3-if-i-had)

~~~
amelius
Isn't GPT-3-generated prose always seeded by some words or sentences? I'm
curious what the poster used as a seed, and why (or whether) they chose to
focus on UI.

Would it be possible to use GPT-3 to "beautify" existing prose without
changing its meaning? Now that would be useful!

~~~
rytill
The prompt design likely takes the HN title, and possibly some other metadata,
as input, but not necessarily the first few words of the comment.

