
Show HN: OpenAI's Cartpole Can Be Optimally Solved in ~10 Random Initializations - eugenhotaj
https://github.com/EugenHotaj/gym/blob/master/cartpole-random.py
======
tlb
Indeed, cartpole should be that easy. The observation vector is dimension 4,
and the action vector is dimension 1. Gym has a range of difficulties, of
which this is the lowest. It's good for debugging because you can verify the
math in your head.

The linked code is quite fragile, though. It samples random models (4-vectors)
whose components are all positive, between 0 and 1. It so happens that there's
a valid solution in this range, but a good RL agent should work equally well
with the signs of the observation and action vectors flipped, and it should be
able to solve it (possibly with more iterations) under any affine
transformation of the two vector spaces.
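
To make that concrete, here is a minimal sketch of this kind of random search. It is not the linked script: it reimplements the usual cart-pole dynamics in plain Python instead of calling Gym, and the constants, the 200-step cap, and the names `step`, `run_episode`, and `random_search` are my own assumptions. The `low`/`high` arguments make the sampling range explicit, so you can switch from [0, 1] to [-1, 1] and check whether the search then needs more tries.

```python
import math
import random

# Assumed constants, chosen to mirror the commonly used cart-pole setup.
GRAVITY, CART_MASS, POLE_MASS = 9.8, 1.0, 0.1
POLE_HALF_LENGTH, FORCE, DT = 0.5, 10.0, 0.02
X_LIMIT, THETA_LIMIT = 2.4, 12 * math.pi / 180
MAX_STEPS = 200  # assumed episode cap


def step(state, action):
    """Advance (x, x_dot, theta, theta_dot) by one Euler step."""
    x, x_dot, theta, theta_dot = state
    force = FORCE if action == 1 else -FORCE
    total_mass = CART_MASS + POLE_MASS
    pole_ml = POLE_MASS * POLE_HALF_LENGTH
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    temp = (force + pole_ml * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass))
    x_acc = temp - pole_ml * theta_acc * cos_t / total_mass
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)


def run_episode(weights, rng):
    """Return how many steps a linear policy keeps the pole up."""
    state = tuple(rng.uniform(-0.05, 0.05) for _ in range(4))
    for t in range(MAX_STEPS):
        # Linear policy: push right (1) if weights . observation > 0, else left (0).
        score = sum(w * s for w, s in zip(weights, state))
        state = step(state, 1 if score > 0 else 0)
        if abs(state[0]) > X_LIMIT or abs(state[2]) > THETA_LIMIT:
            return t + 1
    return MAX_STEPS


def random_search(rng, low=0.0, high=1.0, max_tries=200):
    """Sample weight 4-vectors uniformly from [low, high]; keep the best one."""
    best_weights, best_steps = None, 0
    for tries in range(1, max_tries + 1):
        weights = [rng.uniform(low, high) for _ in range(4)]
        steps = run_episode(weights, rng)
        if steps > best_steps:
            best_weights, best_steps = weights, steps
        if best_steps == MAX_STEPS:
            break
    return best_weights, best_steps, tries


if __name__ == "__main__":
    for low in (0.0, -1.0):
        _, steps, tries = random_search(random.Random(0), low=low)
        print(f"range [{low}, 1.0]: best {steps} steps after {tries} tries")
```

Note that each candidate is scored on a single episode, so a lucky start state can make a mediocre model look good. Averaging over several episodes would be a sturdier check.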

~~~
eugenhotaj
Agreed, the code is not meant to be a good solution.

This was actually my starting point for an implementation of an
"evolution"-based algorithm that I wanted to experiment with. I was surprised
that completely random models were able to optimally solve the Cartpole
environment, so I thought I'd share :). Of course I don't expect this to be
the case for more complex environments.

