
Neural Episodic Control - beefman
https://arxiv.org/abs/1703.01988
======
rhaps0dy
I wonder how they deal with the mapping from states to vectors h produced by
the CNN changing as training advances.

I reckon they _don't_ train the CNN; rather, they use one pre-trained from
DQN or something. In that case, no wonder it learns faster.

~~~
tmesisly
They do train the CNN from scratch; see section 3.4.

------
jedharris
Note that Demis Hassabis is one of the authors and that all authors are at
Google DeepMind.

Without this I'd be quite skeptical of these claims but now I'm wishing for
analysis from someone better qualified than I am.

------
huahaiy
A glance at the paper seems to suggest that they are going back to
traditional RL methods. It would be interesting for the paper to compare with
them explicitly.

~~~
rhaps0dy
Which traditional RL methods do you mean, and what kind of comparison?

If you mean they are returning to tabular Q-learning, then yes, sort of. They
still use function approximation (as required; tabular Q-learning stands no
chance of generalising across states), but their function approximator kind of
looks like a table lookup: a lookup of the 50 stored values whose keys are
closest to some computed key.

It would be totally uninteresting to compare this with tabular Q-learning
though.
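To make that concrete, here is a minimal NumPy sketch of the kind of lookup the paper describes for its Differentiable Neural Dictionary: take the p nearest stored keys to a query embedding and return an inverse-distance-weighted average of their values. The function name, the brute-force distance computation, and the in-memory arrays are my own illustration (the paper uses approximate kd-tree-based nearest-neighbour search); the kernel k(h, h_i) = 1 / (||h - h_i||^2 + delta) and p = 50 follow the paper.

```python
import numpy as np

def dnd_lookup(query, keys, values, p=50, delta=1e-3):
    """Sketch of a DND read, NEC-style: estimate Q(s, a) as a
    kernel-weighted average over the p nearest stored keys.

    query  : (d,)   embedding h of the current state
    keys   : (n, d) stored embeddings h_i
    values : (n,)   stored Q-value estimates Q_i
    """
    # Squared Euclidean distance from the query to every stored key.
    dists = np.sum((keys - query) ** 2, axis=1)
    # Indices of the p closest keys (the paper uses p = 50).
    nearest = np.argsort(dists)[:min(p, len(dists))]
    # Inverse-distance kernel: k(h, h_i) = 1 / (||h - h_i||^2 + delta).
    k = 1.0 / (dists[nearest] + delta)
    # Normalise into attention weights and blend the stored values.
    w = k / k.sum()
    return float(np.dot(w, values[nearest]))
```

Because the kernel weight blows up as a stored key approaches the query, a query that exactly matches a stored key recovers (almost exactly) that key's value, which is why the whole thing behaves like a soft table lookup while still generalising to nearby states.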

