Hacker News
Key Papers in Deep Reinforcement Learning (openai.com)
178 points by dsr12 5 months ago | 12 comments

Many of these papers have been featured on HN before:

Neural Episodic Control https://news.ycombinator.com/item?id=13843282

Exploration by Random Network Distillation https://news.ycombinator.com/item?id=18346943

Evolution Strategies as a Scalable Alternative to Reinforcement Learning https://news.ycombinator.com/item?id=13953980

Recurrent World Models Facilitate Policy Evolution https://news.ycombinator.com/item?id=16860247

Playing Atari with Deep Reinforcement Learning https://news.ycombinator.com/item?id=8484313

The Spinning Up guide is neat, but it seems to assume access to fairly expensive GPU resources for running models.


> In US regions, each K80 GPU attached to a VM is priced at $0.45 per hour while each P100 costs $1.46 per hour.

The $300 free tier gets you ~600 hours of K80. The Spinning Up guide suggests iterating on models in <5 min, so that's roughly 7200 iterations.

> start with vanilla policy gradient (also called REINFORCE), DQN, A2C (the synchronous version of A3C), PPO (the variant with the clipped objective), and DDPG, ... VPG...

That's 6 algorithms; combined with half a dozen tasks to try, that whittles it down to a few hundred iterations per task/algo combo.
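Spelling out that back-of-the-envelope math (prices and counts as quoted in the comments; the round numbers above are slight under-estimates):

```python
# Back-of-the-envelope budget math from the thread.
budget = 300.00          # Google Cloud free-tier credit, USD
k80_per_hour = 0.45      # quoted K80 price, USD/hour

hours = budget / k80_per_hour             # ~667 GPU-hours
iterations = hours * 60 / 5               # ~8000 five-minute iterations

algos = 6                # VPG, DQN, A2C, PPO, DDPG, ... per the guide
tasks = 6                # "half a dozen tasks"
per_combo = iterations / (algos * tasks)  # a few hundred per task/algo combo

print(round(hours), round(iterations), round(per_combo))  # 667 8000 222
```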

That, combined with a lot of paper-reading and perhaps some clever blogging, is probably enough to get started.

Still, it seems beneficial to democratize DL by making these 5 minute iterations free, doesn't it?

A great free option is Google's Colab notebooks: https://colab.research.google.com/

You can attach a GPU for free and, if I recall correctly, even a TPU. See https://colab.research.google.com/notebooks/gpu.ipynb
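One quick way to confirm the notebook actually got a GPU is via PyTorch's CUDA check (PyTorch is just one option here, not something the thread prescribes; the guard against a missing install is my own addition):

```python
import importlib.util

def gpu_available():
    """Best-effort check: is PyTorch installed, and does it see a CUDA device?"""
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch not installed; can't tell via this route
    import torch
    return torch.cuda.is_available()

print("GPU available:", gpu_available())
```

In Colab you'd enable the accelerator first under Runtime → Change runtime type.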

The guide doesn't assume access to specialized hardware. Experiments and iterations with all of those algorithms can be done on a normal CPU that anyone has. :)


"Iterate fast in simple environments. To debug your implementations, try them with simple environments where learning should happen quickly, like CartPole-v0, InvertedPendulum-v0, FrozenLake-v0, and HalfCheetah-v2 (with a short time horizon—only 100 or 250 steps instead of the full 1000) from the OpenAI Gym. Don’t try to run an algorithm in Atari or a complex Humanoid environment if you haven’t first verified that it works on the simplest possible toy task. Your ideal experiment turnaround-time at the debug stage is <5 minutes (on your local machine) or slightly longer but not much. These small-scale experiments don’t require any special hardware, and can be run without too much trouble on CPUs."

DRL has definitely shown some excellent results, but can someone in the field comment on this paper: "Simple random search provides a competitive approach to reinforcement learning" [1]?

[1] https://arxiv.org/abs/1803.07055

Not a comment on that paper, but for those interested I would highly recommend Ben Recht's (one of the authors of that paper) blog series at http://www.argmin.net/2018/06/25/outsider-rl/.

I really appreciate that they are thinking about beyond-human runs these days. For example: put a few agents into big, number-crunching climate-change models and see whether they can "win" (keep Earth's ecosystem on this side of divergence), how (say, by beating the one or two most critical opponents), and at what cost in implementable actions for mankind (how many resources are needed?). Basically, treat these scenarios as active strategy games with rewards and a final goal, instead of passive, parametric Monte Carlo runs a la think tanks or supranational agencies.

Are there curated guides like these for other fields (e.g. NLP)?

Bibliographies? Yes, most subfields have one somewhere.

I maintain one for the Ruby programming language: https://rubybib.org/

I find this NLP guide helpful, and would love to know of any others that people are aware of: https://nlpprogress.com/

Does anyone know of examples where reinforcement learning has been applied to IoT applications?

I think curated reading lists like this add a lot of value. Otherwise just knowing where to start is an obstacle.
