
OpenAI Procgen: Procedurally Generated Game-Like RL Environments - davidfoster
https://github.com/openai/procgen
======
xamuel
This page has plots of performance of a certain agent on these different
environments: [https://openai.com/blog/procgen-benchmark/](https://openai.com/blog/procgen-benchmark/)

Question to anyone familiar with this stuff: I can't figure out which agent
they're running on these environments to create the plots in the above link.
Is it some well-known agent which is supposed to be clear from context?

~~~
gwern
It is OpenAI so you can safely assume that it's their PPO workhorse agent. But
if you were unsure, they provide repos and papers for further details, both of
which mention the agent early on ([https://github.com/openai/train-procgen#try-it-out](https://github.com/openai/train-procgen#try-it-out) and
[https://cdn.openai.com/procgen.pdf#page=4](https://cdn.openai.com/procgen.pdf#page=4)).

------
d-d
Are we in an environment like one of these?

Oh well, back to work.

------
euske
I don't deny this is a cool project, but how is it related to their
mission of "building a safe and beneficial AGI", backed by a billion dollars
in funding? They look like they're having too much fun getting sidetracked
(which is totally understandable!).

~~~
TaylorAlexander
Reinforcement learning is almost exclusively researched in video-game-like
environments. Being able to create a variety of game-like environments that
all have the same interface will make it easier to test a single algorithm in
many different situations.
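
To make the "same interface" point concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the `Env` protocol, the two toy environments, and the `evaluate` helper are hypothetical and are not Procgen's actual API. The only idea it is meant to show is that one evaluation loop can run unchanged on very different environments once they share `reset` and `step`.

```python
import random
from typing import Callable, Protocol, Tuple


class Env(Protocol):
    """Hypothetical minimal interface shared by every environment."""
    num_actions: int

    def reset(self) -> int: ...
    def step(self, action: int) -> Tuple[int, float, bool]: ...


class Corridor:
    """Toy env: walk right to the end of a corridor whose length depends on the seed."""
    num_actions = 2  # 0 = left, 1 = right

    def __init__(self, seed: int) -> None:
        self.length = random.Random(seed).randint(3, 8)
        self.pos = 0

    def reset(self) -> int:
        self.pos = 0
        return self.pos

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos >= self.length
        return self.pos, (1.0 if done else 0.0), done


class GuessParity:
    """Toy env: one-step episodes, reward for guessing the parity of the observation."""
    num_actions = 2  # 0 = even, 1 = odd

    def __init__(self, seed: int) -> None:
        self.rng = random.Random(seed)
        self.obs = 0

    def reset(self) -> int:
        self.obs = self.rng.randint(0, 9)
        return self.obs

    def step(self, action: int) -> Tuple[int, float, bool]:
        reward = 1.0 if action == self.obs % 2 else 0.0
        return self.obs, reward, True


def evaluate(env: Env, policy: Callable[[int, int], int],
             episodes: int = 5, max_steps: int = 50) -> float:
    """Run one policy on any environment and return the mean episode return."""
    total = 0.0
    for _ in range(episodes):
        obs = env.reset()
        for _ in range(max_steps):
            obs, reward, done = env.step(policy(obs, env.num_actions))
            total += reward
            if done:
                break
    return total / episodes


def always_right(obs: int, num_actions: int) -> int:
    """Trivial stand-in for a learned agent: always pick action 1."""
    return 1


# The same policy and the same loop, on two unrelated environments.
results = {
    "corridor": evaluate(Corridor(seed=0), always_right),
    "parity": evaluate(GuessParity(seed=0), always_right),
}
print(results)
```

Swap `always_right` for a real learning algorithm and the two toy classes for a suite of generated games, and you get the kind of comparison the plots are showing: one agent, many environments, one harness.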

~~~
sansnomme
Also it's a lot easier to justify side projects that improve tooling and
infrastructure when you have billions in funding and not just millions. For
smaller shops, the usual answer is to try RL on existing procedural gen games
e.g. Minecraft.

~~~
TaylorAlexander
Well as your funding goes up, the scope of your mission increases. I don’t see
this as a “side project” so much as a necessary part of solving the bigger
picture. It’s like how building a battery factory is what makes Tesla work.
You can buy batteries from Panasonic and indeed they did, but as they grew
they realized they needed their own factory. In the same way, if you want to
work on developing novel RL algorithms you can’t test them all in one game. You
need a way to test them on many different problems using a standard interface.
That’s how I see this.

