The authors make the bold claim of outperforming recent SOTA methods on CIFAR10 and SVHN, but they are using a much bigger architecture: WRN28_10 instead of the standard WRN28_2.
Hi, I am one of the co-authors. Thanks for reading our paper! We are actually using the small architecture WRN28_2.
In the experiment settings, we wrote: "\citet{oliver2018realistic} provided evaluation results of prior works with the same architecture and evaluation scheme, hence we follow their settings and employ Wide Residual Networks~\citep{zagoruyko2016wide} with depth 28 and width 2 as our baseline model."
In the caption of Table 2, we also wrote: "Comparison with existing methods on CIFAR-10 and SVHN with $4,000$ and $1,000$ examples respectively. All compared methods use a common architecture WRN-28-2 with 1.4M parameters except AutoAugment$^*$ which uses a larger architecture WRN-28-10."
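For readers who want to sanity-check the size gap between the two architectures, the parameter count of a WRN-depth-width network can be tallied by hand. The sketch below is only an illustration under stated assumptions, not code from the paper: it assumes the usual pre-activation basic block (two bias-free 3x3 convolutions, BatchNorm with a scale and a shift per channel, a 1x1 projection shortcut wherever the channel count changes), a 3-channel input, and 10 output classes. The function name `wrn_param_count` is hypothetical.

```python
def wrn_param_count(depth: int, widen: int, num_classes: int = 10) -> int:
    """Count trainable parameters of a WRN-depth-widen network.

    Assumed layout (not taken from the paper's code): bias-free convs,
    BatchNorm with 2 parameters per channel, 1x1 projection shortcut
    when the number of channels changes, final linear layer with bias.
    """
    assert (depth - 4) % 6 == 0, "depth must be of the form 6n + 4"
    blocks_per_group = (depth - 4) // 6
    widths = [16, 16 * widen, 32 * widen, 64 * widen]

    total = 3 * widths[0] * 3 * 3              # stem 3x3 conv on RGB input
    in_ch = widths[0]
    for out_ch in widths[1:]:
        for _ in range(blocks_per_group):
            total += 2 * in_ch                 # BatchNorm before first conv
            total += in_ch * out_ch * 3 * 3    # first 3x3 conv
            total += 2 * out_ch                # BatchNorm before second conv
            total += out_ch * out_ch * 3 * 3   # second 3x3 conv
            if in_ch != out_ch:
                total += in_ch * out_ch        # 1x1 projection shortcut
            in_ch = out_ch

    total += 2 * in_ch                         # final BatchNorm
    total += in_ch * num_classes + num_classes # linear classifier with bias
    return total


if __name__ == "__main__":
    print("WRN-28-2 :", wrn_param_count(28, 2))
    print("WRN-28-10:", wrn_param_count(28, 10))
```

Under these assumptions the count for WRN-28-2 comes to 1,467,610, in line with the roughly 1.4M quoted in the caption, while WRN-28-10 comes out at around 36M, about 25 times larger.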