
Great observation. The solution to the update problem is relatively simple: it doesn't run the search again on every update. Instead, every time it encounters a change in what it knows, it just modifies the data stored in memory. All it is doing is updating its learned representation. After this it still knows what the other obstacles are, without having to do DFS or BFS over the environment again. If the representation were a graph and it just deleted an edge, it would still know what all the other edges are. If it encounters another change, it updates the state of the graph again.
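A minimal sketch of that idea in Python (all names here are hypothetical, for illustration): the "learned representation" is just an adjacency map held in memory. When an edge disappears, we patch the map in place rather than re-exploring; a fresh plan can then be computed directly on the patched map.

```python
from collections import deque

def bfs_path(graph, start, goal):
    """Return a shortest path from start to goal in the stored map, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nbr in graph.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(path + [nbr])
    return None

# Learned representation: an adjacency map kept in memory.
graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C"},
}

def remove_edge(graph, u, v):
    """Observe a change and update only the affected entry --
    every other edge in the map is untouched."""
    graph[u].discard(v)
    graph[v].discard(u)

path1 = bfs_path(graph, "A", "D")  # either A-B-D or A-C-D
remove_edge(graph, "B", "D")       # the world changed; patch memory
path2 = bfs_path(graph, "A", "D")  # planning on the patched map: A-C-D
```

The search here runs over the in-memory map, not the environment, so deleting an edge costs a single update rather than a fresh exploration.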

As for neural networks: if they are given a reward function that can be dynamically evaluated (in this case, "did I reach the end or not"), they are pretty good at learning without explicit per-step feedback.
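A sketch of what such a dynamically evaluated reward might look like (a toy example with hypothetical names, not any particular library's API): the only signal is a sparse terminal check, "did I reach the end or not", evaluated at the end of each episode.

```python
import random

def terminal_reward(state, goal):
    # Sparse, dynamically evaluated signal: did I reach the end or not.
    return 1.0 if state == goal else 0.0

def run_episode(policy, goal=5, max_steps=20):
    """Walk on an integer line from 0; reward only computed at the end."""
    state = 0
    for _ in range(max_steps):
        state += policy()  # each step is +1 or -1
        if state == goal:
            break
    return terminal_reward(state, goal)

random.seed(0)
policy = lambda: random.choice([-1, 1])  # untrained random policy
returns = [run_episode(policy) for _ in range(100)]
success_rate = sum(returns) / len(returns)
```

No hand-labelled feedback is needed anywhere in the loop; a learner would adjust the policy to push `success_rate` toward 1.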




You make it sound simple, but from my point of view the ability to update one's learned representation requires a representation that can withstand being updated. I mentioned John McCarthy's concept of "elaboration tolerance" in another comment, i.e. the ability of a representation to be modified easily. This was not a solved problem in McCarthy's time and it's not a solved problem today either (see my sibling comment about "catastrophic forgetting" in neural nets). In Shannon's time it was definitely not a solved problem, perhaps not even a recognised problem. That's the 1950s we're talking about, yes? :)

Sorry, I didn't get what you meant about the dynamically evaluated reward function.





