
Totally Model-Free Learned Skillful Coping (2004) [pdf] - xj9
http://www.ieor.berkeley.edu/People/Faculty/dreyfus-pubs/BSTSmodelfree.pdf
======
ced
It's an interesting article. I haven't quite figured out how it differs from
TD-learning, but this part in particular stands out:

 _in the spirit of temporal difference reinforcement learning, I hypothesize
that certain neurons become critic neurons. The activity levels of these
neurons can be interpreted as determining, at each moment, the brain’s critic-
neurons-based expected quality of overall performance of the task given that
the sensed current environment and the brain’s current state, which, in turn,
is a function of what has happened in this particular execution of the task up
to now. Although one need not be conscious of this expectation during
performance of a task, the phenomenology of skillful behavior suggests that
this information is indeed encoded within the brain and is, in fact, available
to the conscious mind. A baseball out fielder seems to know only moments after
a ball is hit whether it is catchable and with what difficulty and likelihood.
A skilled chess player, shown an unfamiliar yet realistic chess position, can
report almost instantaneously the probable outcome of the game if it were
contested by skilled players_

How can that be achieved without a model? I don't understand what "model-free"
is supposed to mean. Even Q-learning "works by learning an action-value
function that ultimately gives the expected utility of taking a given action
in a given state and following the optimal policy thereafter." (Wikipedia).
The expected utility depends on a set of modeling assumptions, so it can
hardly be called "model-free"...
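
For concreteness, here is a minimal sketch of the tabular Q-learning update
that the Wikipedia sentence describes. The toy chain environment, the function
names and the hyperparameters are all assumptions made up for illustration;
none of it comes from the linked paper. The update only ever consumes sampled
transitions (state, action, reward, next state) and never estimates transition
probabilities or a reward function, which is the narrow sense in which the RL
literature usually says "model-free". Whether that settles the question above
is a separate matter.

```python
import random

N_STATES = 5       # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Environment as a black box: the learner only sees sampled outcomes."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    reward = 1.0 if done else 0.0
    return nxt, reward, done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Explore with probability epsilon, and break ties randomly.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.choice(ACTIONS)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2, r, done = step(s, a)
            # Bootstrapped target: reward plus discounted best next value.
            target = r if done else r + gamma * max(q[s2])
            # No transition or reward model is stored, only the Q-table.
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q


if __name__ == "__main__":
    for state, values in enumerate(q_learning()):
        print(state, [round(v, 3) for v in values])
```

The learned table still encodes an expectation about future reward, so the
label is about what the learner represents explicitly rather than about
whether it carries any assumptions at all.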

