Interesting concept: an algorithm that learns specifically how to learn better from the data it is presented with.
For the meta-learning papers, you may be interested in reading the related-work section of the RL^2 paper https://arxiv.org/pdf/1611.02779.pdf.
Quoted as follows,
"Our work draws inspiration from a particular line of work (Younger et al.,
2001; Santoro et al., 2016; Vinyals et al., 2016), which formulates meta-learning as an optimization
problem, and can thus be optimized end-to-end via gradient descent."
"Another line of work (Hochreiter et al., 2001;
Younger et al., 2001; Andrychowicz et al., 2016; Li & Malik, 2016) studies meta-learning over the
optimization process. There, the meta-learner makes explicit updates to a parametrized model."
Inspired by the same prior work, both groups applied the meta-learning idea to RL problems and met the same ICLR deadline. That still makes sense, right?
To put it more directly: do you think we could somehow get better results? What would those look like?
My hope would be that each of them learned something slightly different while solving the same problem. Eventually things may converge to a single answer, but there is no reason to demand convergence from the start.
So the shame would be if folks did not compare and contrast the different solutions to the same problem. I confess I am guilty here, in that I have not read both papers. But I will, and see whether that helps my understanding.
(disclaimer: I am working from incomplete knowledge, which is a risk)