Recurrent models do not simply map from one vector space to another and could very much be interpreted as reasoning about their environment. Of course, they are significantly more difficult to train, and backprop through time seems a bit of a hack.
No they aren't? RNNs have state that gets modified as time goes on. The RNN has to learn what is important to save as state, and how to modify it in response to different inputs. There is no explicit time-stamping.
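To make that concrete, here's a rough sketch of what "state that gets modified" means in a vanilla RNN cell. The names (W_xh, W_hh, step, etc.) are just illustrative, not any particular library's API; the point is that the hidden state h is the only memory, nothing is time-stamped, and training decides what's worth writing into it:

    import numpy as np

    # Minimal vanilla RNN cell sketch: h is the only "memory" the network has.
    rng = np.random.default_rng(0)
    input_dim, hidden_dim = 4, 8
    W_xh = 0.1 * rng.standard_normal((hidden_dim, input_dim))   # input -> state
    W_hh = 0.1 * rng.standard_normal((hidden_dim, hidden_dim))  # state -> state
    b_h = np.zeros(hidden_dim)

    def step(h, x):
        # The new state depends only on the previous state and the current input.
        return np.tanh(W_hh @ h + W_xh @ x + b_h)

    h = np.zeros(hidden_dim)                       # initial state
    frames = rng.standard_normal((10, input_dim))  # 10 inputs, no timestamps attached
    for x in frames:
        h = step(h, x)                             # state modified as "time goes on"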
Not necessarily. Depending on the usage, RNN-based models are sometimes trained in both directions, i.e. every sample (say, a video) is shown to the network in its natural time direction and then also reversed. Some say this is motivated by a desire to eliminate dependence on the specific order of the sequence and instead train an integrator.
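A sketch of that kind of setup, assuming the data is an iterable of (frames, features) arrays; the model and training step are left out (and hypothetical), this only shows the time-reversal augmentation itself:

    import numpy as np

    def both_time_directions(clips):
        # Yield each clip in its natural order and then time-reversed.
        for clip in clips:
            yield clip          # natural arrow of time
            yield clip[::-1]    # same frames, arrow of time reversed

    # Tiny stand-in dataset: 3 "videos" of shape (frames, features).
    rng = np.random.default_rng(0)
    video_dataset = [rng.standard_normal((12, 4)) for _ in range(3)]

    for seq in both_time_directions(video_dataset):
        pass  # e.g. loss = train_step(model, seq)  (train_step/model are hypothetical)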
So, time's arrow can be reversed, and the model can thus extrapolate both forward and backward. Cool!
However, that doesn't actually eliminate the time axis/dimension. Dropping timestamps just makes that dimension unitless (IOW, 'time' tautologically increments at a 'rate' of one frame per frame).