
Do We Train on Test Data? Purging Cifar of Near-Duplicates - seesawtron
https://arxiv.org/abs/1902.00423
======
seesawtron
> "Ideally, researchers and machine learning practitioners aim at comparing
> models with respect to their ability of generalizing to unseen data. With a
> growing number of duplicates (in test data), however, we run the risk of
> comparing them in terms of their capability of memorizing the training data,
> which increases with model capacity. This is especially problematic when the
> difference between the error rates of different models is as small as it is
> nowadays, i.e., sometimes just one or two percent points."

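The concern in the quote is easy to see concretely: if a test image has a near-identical copy in the training set, a high-capacity model can score it correctly by memorization alone. As a minimal sketch (not the paper's actual method, which combines feature-space retrieval with manual annotation), one could flag test images whose nearest training image falls below a raw-pixel distance threshold; the threshold here is an illustrative assumption.

    import numpy as np

    def flag_near_duplicates(train, test, threshold):
        """Return indices of test rows whose nearest training row lies
        closer than `threshold` (Euclidean distance on flattened images)."""
        train = train.reshape(len(train), -1).astype(np.float64)
        test = test.reshape(len(test), -1).astype(np.float64)
        flagged = []
        for i, x in enumerate(test):
            # Squared L2 distance to every training image; keep the minimum.
            d2 = np.min(np.sum((train - x) ** 2, axis=1))
            if np.sqrt(d2) < threshold:
                flagged.append(i)
        return flagged

    # Toy usage with random "images"; with real CIFAR data the arrays would be
    # (N, 32, 32, 3) uint8 and the threshold would need tuning / manual review.
    rng = np.random.default_rng(0)
    train_imgs = rng.integers(0, 256, size=(100, 32, 32, 3))
    test_imgs = np.concatenate(
        [train_imgs[:5], rng.integers(0, 256, size=(20, 32, 32, 3))]
    )
    print(flag_near_duplicates(train_imgs, test_imgs, threshold=50.0))  # -> [0, 1, 2, 3, 4]
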
~~~
seesawtron
A similar study from 2018, "Do CIFAR-10 Classifiers Generalize to
CIFAR-10?":

https://arxiv.org/abs/1806.00451

