
Spinning Up in Deep RL - stablemap
https://blog.openai.com/spinning-up-in-deep-rl/
======
minimaxir
This developed-and-maintained package is a good approach toward furthering RL
development; as the write-ups state, the biggest problem in RL is subtle
implementation bugs that don't cause an error but tank learning performance.
(Plus loggers/utils to help debug things.)
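A toy illustration of the kind of silent bug in question (my own example, not code from Spinning Up): both functions below run without raising anything, but the buggy one accumulates the discount in the wrong direction, so no timestep ever gets credit for future reward.

```python
# Hypothetical example of a "silent" RL bug: computing reward-to-go.
# Neither version errors out, but only one is correct.

def rewards_to_go(rewards, gamma=0.99):
    """Correct: R_t = r_t + gamma * R_{t+1}, accumulated backwards in time."""
    out = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out

def rewards_to_go_buggy(rewards, gamma=0.99):
    """Buggy: accumulates forward, so R_t only sees *past* rewards."""
    out = []
    running = 0.0
    for r in rewards:
        running = r + gamma * running
        out.append(running)
    return out

rewards = [0.0, 0.0, 1.0]
print(rewards_to_go(rewards))        # reward at t=2 is visible from t=0
print(rewards_to_go_buggy(rewards))  # t=0 sees nothing: learning signal gone
```

A policy gradient trained on the buggy targets would still "run", just badly, which is exactly why side-by-side reference implementations and loggers matter.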

Granted, a lot of RL think pieces/examples on places like Medium.com take an
existing RL implementation without many tweaks, run it on a new task, and see
what happens. A better RL library might make this workflow even more
prevalent, which is why it's so important for researchers to make their
pipelines transparent.

~~~
jrx
I've made some effort to provide a set of similar high-quality implementations
available in PyTorch: [https://blog.millionintegrals.com/vel-pytorch-meets-
baseline...](https://blog.millionintegrals.com/vel-pytorch-meets-baselines/)

In my opinion, PyTorch code is easier for newcomers to understand and debug.
The code is definitely lacking in documentation, but whenever there was a
tradeoff between clarity and modularity, in the end I chose modularity.
Ideally, I would like others to be able to take bits and pieces and
incorporate them into their projects to speed up the delivery of their ideas.

~~~
mike_mg
+1 on that, that's a great project.

PyTorch, with its explicit state that can easily be examined by hand in the
PyCharm debugger, will be way easier for people coming into the field.

------
dimitry12
This is awesome, and I hope it will allow more people to _experiment_ with
algorithms, instead of only re-applying OpenAI's Baselines. Baselines are
great, but they are very hard (for me, at least) to tinker with.

It helps me understand something new if I can controllably break it. In other
words, I progress by predicting the edge conditions under which something
shouldn't work, and then testing whether the algorithm indeed experiences the
expected type of failure. A transparent algorithm implementation is key for
this.
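As a toy sketch of that workflow (my own example, not Spinning Up code): pick an edge condition where the behavior is fully predictable, break the algorithm into that regime, and assert the predicted outcome.

```python
# "Controllably break it": set hyperparameters to degenerate edge values
# and check the code behaves exactly as theory predicts.

def discounted_return(rewards, gamma):
    """Plain discounted return G = sum_t gamma^t * r_t."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

rewards = [1.0, 2.0, 3.0]

# Edge condition 1: gamma = 0 is pure myopia -> only the first reward counts.
assert discounted_return(rewards, gamma=0.0) == rewards[0]

# Edge condition 2: gamma = 1 -> the plain undiscounted sum.
assert discounted_return(rewards, gamma=1.0) == sum(rewards)

print("edge-condition checks passed")
```

If an implementation fails checks like these, the bug is localized before any expensive training run.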

One thing I immediately checked in the spinningup repo is whether it uses TF
Eager. It doesn't. @OpenAI, what's your reasoning for that?

~~~
jachiam
Hi! Primary developer for Spinning Up here. The code for this was developed
mostly in June and July this year, and Eager still felt relatively new to me.
I wanted to wait for Eager to stabilize and hit maturity before investing in
it. I also wanted to see how TF would change on the road to TF 2.0, since that
could change the picture even more.

At the six month review in 2019, we'll evaluate whether it makes sense to
rewrite the implementation examples for TF 2.0. I'll speculate that the answer
will be "yes, it does." Since Eager execution will be a central feature of TF
2.0, the (probable) revamp for Spinning Up will include it.

Good luck with your experiments! And please let us know about your experience
with Spinning Up---we want to make sure it fulfills the mission of helping
anyone learn about deep RL, and user feedback is vital for that.

~~~
dimitry12
Thank you for sharing your thought process!

------
browsercoin
Whenever I see high-quality submissions I bookmark them and promise myself to
come back and spend time learning the material.

This time... I promise myself it's different.

------
pretty_dumm_guy
Hi! I really appreciate you sharing this with the community. The documents and
code look really clear and concise. I do have one question: is it possible to
change the dependency on the MuJoCo engine to something else (e.g.,
Roboschool)?

I don't have access to a computer with a GPU, and I am currently using Google
Colab for my DL projects. I tried installing MuJoCo on Colab but,
unfortunately, the generated computer id seems invalid. Any help is highly
appreciated.

Thank you!

~~~
jachiam
Hi!

A few people had this question on Twitter also. Our response: "Several of us
at OpenAI are thinking seriously about how to make something like this happen!
I can't promise anything, but we definitely want to remove barriers to entry."
([https://twitter.com/jachiam0/status/1060595172285632512](https://twitter.com/jachiam0/status/1060595172285632512))

In the meantime, you can still use Spinning Up with the Classic Control and
Box2d envs in Gym (which don't require any licenses at all). And what's more:
for most of these environments you don't need a GPU! CPU is fine.
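For instance, something along these lines should work with the `spinup.run` CLI (the exact flags are from memory, so double-check them against the Spinning Up docs):

```shell
# Train PPO on a license-free classic-control env, CPU only.
python -m spinup.run ppo --env CartPole-v0 --exp_name cartpole_test
```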

~~~
pretty_dumm_guy
Thank you for replying promptly. I am willing to help with such a change, if
it's planned. Meanwhile, I'll get started with running Spinning Up on my
laptop.

------
nshr
This looks like an awesome initiative! I think it will be very valuable for
people trying to enter the field. I particularly like the clear advice on how
to get started doing RL research. Have you considered setting up a forum for
the community to share their experiences?

------
rcshubhadeep
Discovered two small issues in the doc. Where can I send feedback?

~~~
jachiam
Let us know by opening an issue on Github:
[https://github.com/openai/spinningup/issues/new](https://github.com/openai/spinningup/issues/new)

~~~
rcshubhadeep
Perfect. Thanks

------
wnevets
is there a Dockerfile with everything set up already?

~~~
jachiam
No, but please open up an issue on Github and we'll look into making one! :)

[https://github.com/openai/spinningup/issues/new](https://github.com/openai/spinningup/issues/new)
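In the meantime, a rough sketch of what such a Dockerfile might look like (entirely hypothetical, not an official image; the package names and versions are my guesses, apart from the OpenMPI dependency the docs mention):

```dockerfile
# Hypothetical starting point for a Spinning Up image -- adjust as needed.
FROM python:3.6-slim

# System libraries: OpenMPI (required by Spinning Up) plus common build deps.
RUN apt-get update && apt-get install -y --no-install-recommends \
        git libopenmpi-dev python3-dev zlib1g-dev build-essential \
    && rm -rf /var/lib/apt/lists/*

RUN git clone https://github.com/openai/spinningup.git /spinningup
WORKDIR /spinningup
RUN pip install -e .

CMD ["/bin/bash"]
```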

------
enygmata
I thought this was something about roguelikes. :(

