
An Optimistic Perspective on Offline Reinforcement Learning - theafh
https://ai.googleblog.com/2020/04/an-optimistic-perspective-on-offline.html
======
MasterScrat
This is an important direction.

For now, training reinforcement learning agents means that you need to
simulate an environment, for example a car simulation, a robotic simulation,
or a video game.

The simulation can be quite slow. In the worst case, your environment may
actually not be a simulation, but a real-world experiment - in which case each
interaction is even slower.

For now, this is something RL researchers have to deal with. If we could get
"offline" reinforcement learning to work, i.e. learning from pre-recorded
experiences only, this would bring a considerable boost to the field: you
could just run your simulation for a few million or billion frames, then do
your research on that recorded data.

That would be a huge improvement in turnaround time and computation cost.
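
To make the idea concrete, here is a minimal sketch of "record once, learn
offline". Everything in it is a hypothetical toy of my own (a 5-state chain
environment, a random data-collection policy, plain tabular Q-learning), not
the method from the linked post. The point is only that the learning loop
never calls the environment again once the dataset has been recorded.

```python
import random

N_STATES = 5       # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Toy stand-in for a slow simulator or a real-world experiment."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def collect_dataset(num_episodes, seed=0):
    """Run the (expensive) environment once with a random policy and log it."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(num_episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            dataset.append((state, action, reward, next_state, done))
            state = next_state
    return dataset


def offline_q_learning(dataset, sweeps=200, lr=0.5, gamma=0.9):
    """Tabular Q-learning that only replays logged transitions."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for state, action, reward, next_state, done in dataset:
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += lr * (target - q[state][action])
    return q


dataset = collect_dataset(num_episodes=50)   # the only environment interaction
q_table = offline_q_learning(dataset)        # no calls to step() in here
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this toy the random policy covers the state space well, so plain Q-learning
is enough. On real problems the recorded data may not cover the actions the
learned policy wants to take, which is a large part of why offline RL is hard.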

Another aspect is reproducibility. Reinforcement learning is notoriously hard
to benchmark properly (see e.g.
[https://arxiv.org/abs/1709.06560](https://arxiv.org/abs/1709.06560)). One of
the reasons is the stochasticity of the environment: you can often perform the
same action in an environment and end up with different results. And agents
learn from what they see, so a slight difference due to stochasticity in the
beginning can have a huge impact later!

So here again, learning "offline" helps a lot, since a fixed dataset levels
the playing field for different methods: every method sees the same data.

