
Reptile: A Scalable Meta-Learning Algorithm - stochastic_monk
https://blog.openai.com/reptile/
======
JacobiX
Very impressive, but it also has some limitations: the softmax output that is
supposed to indicate confidence is often unreliable (tested empirically). Here
is an example of a misclassification that has a score of 99%:
[https://imgur.com/a/wnr8K](https://imgur.com/a/wnr8K)

~~~
stochastic_monk
I imagine that’s because it’s difficult for the model to decide which features
are important. Acknowledging my human bias toward intuitive explanations: the
“circle” in this example looks most like the circle in the picture it
identified as most similar. I imagine saliency assignment is difficult without
either more examples or injected prior knowledge.

~~~
mortdeus
Is it really that hard to determine what changes more geometrically?

:) :( :| :O :D :/ ;)

~~~
mortdeus
Wouldn't the solution be more accurate if you partitioned the picture into
smaller segments and then ran the algorithm on each part?

------
Eridrus
This work is certainly interesting, but don't let the sophisticated
formulation fool you: meta-learning is not the best-performing option for
few-shot classification. ProtoNets and other simple matching strategies
achieve far better performance.

~~~
twanvl
I looked up your claim. The ProtoNets paper by Snell et al. reports 1-shot
accuracy of 49.42 and 5-shot accuracy of 68.20 on miniImageNet, while the new
Reptile paper reports 48.21 and 66.00, respectively.

I wouldn't call Reptile sophisticated; the method actually looks really simple
(perform a few steps of SGD per task, then use the resulting updates as
gradients in the outer loop).
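That two-loop structure fits in a few lines. A minimal sketch, using a toy
1-D sine-regression task distribution and a linear model as stand-ins (the
task, model, and learning rates here are illustrative assumptions, not the
paper's setup):

```python
import numpy as np

def sample_task(rng):
    # Toy task distribution: regress y = sin(x + phase) with a random phase.
    phase = rng.uniform(0, np.pi)
    xs = rng.uniform(-np.pi, np.pi, size=(20, 1))
    ys = np.sin(xs + phase)
    return xs, ys

def sgd_steps(w, xs, ys, inner_lr=0.02, k=5):
    # Inner loop: a few steps of plain SGD on one task (linear model, MSE).
    for _ in range(k):
        grad = 2 * xs.T @ (xs @ w - ys) / len(xs)
        w = w - inner_lr * grad
    return w

# Outer loop: move the shared initialization toward the task-adapted
# weights, i.e. treat (w_adapted - w) as the meta-"gradient".
rng = np.random.default_rng(0)
w = np.zeros((1, 1))
outer_lr = 0.1
for it in range(1000):
    xs, ys = sample_task(rng)
    w_adapted = sgd_steps(w.copy(), xs, ys)
    w = w + outer_lr * (w_adapted - w)
```

The entire meta-update is the last line; there is no second-order
differentiation anywhere, which is what the "simple" characterization refers
to.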

~~~
Eridrus
OK, I admit I hadn't looked up the actual numbers before I posted that
comment, but the Snell paper wasn't the last word on that line of work; it was
followed by
[https://arxiv.org/abs/1703.05175](https://arxiv.org/abs/1703.05175) (which
doesn't have miniImageNet results). And there may have been further work in
that direction that I'm not aware of.

You're right that Reptile is the simplest recent algorithm in the
meta-learning literature, but I'd argue that's basically my point: they
started from somewhere pretty ambitious (let's learn a learner, or at least an
SGD update rule) and ended up with learning an initialization that can be
updated well by a few steps of SGD.

[EDIT]: I also prefer Matching/ProtoNets-style work as being simpler to
deploy, since you don't need to retrain to add new classes. Maybe one day
meta-learning will be SoTA, but there are a lot of world-class researchers on
it, and the approaches keep tending away from actual meta-learning IMO, so my
money is on the matching approach. Though my money is on integrating with data
stores in general and not needing to squish everything into weights, so I'm a
bit biased here.

~~~
stochastic_monk
I think it’s more about learning across a variety of tasks. And I like the
emphasis on capturing higher-order derivatives with only first-order methods,
which as an abstract idea has a variety of applications.
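For context, here is a sketch of why a purely first-order update can pick up
second-order information, following the Taylor-expansion argument in the
Reptile paper (notation mine; two inner SGD steps with rate $\alpha$ on
minibatch losses $L_1, L_2$):

```latex
% Inner steps: \phi_1 = \phi - \alpha g_1, where
% g_1 = \nabla L_1(\phi), \quad g_2 = \nabla L_2(\phi_1).
% The Reptile "gradient" is the total inner-loop displacement over \alpha:
g_{\mathrm{Reptile}} = g_1 + g_2
  \approx \nabla L_1(\phi) + \nabla L_2(\phi)
          - \alpha \, \nabla^2 L_2(\phi)\, \nabla L_1(\phi) + O(\alpha^2).
% In expectation over the minibatch order, the last term becomes
% -\tfrac{\alpha}{2}\, \nabla_\phi\,
%   \mathbb{E}\big[\nabla L_1(\phi) \cdot \nabla L_2(\phi)\big],
% so descending on g_Reptile also increases the inner product between
% gradients of different minibatches -- a second-order quantity -- while
% computing only first-order gradients.
```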

------
mortdeus
Draw an upside-down triangle and it thinks it's the first picture. Which
means that AI is still kinda dumb.

------
notMick
Is the main benefit of this sampling technique that it reduces the
contribution from rare but high-variance outliers?

