"We’ve applied this approach to two heavily benchmarked datasets in deep learning: image recognition with CIFAR-10 and language modeling with Penn Treebank. On both datasets, our approach can design models that achieve accuracies on par with state-of-art models designed by machine learning experts (including some on our own team!)."
...
"This approach may also teach us something about why certain types of neural nets work so well. The architecture on the right here has many channels so that the gradient can flow backwards, which may help explain why LSTM RNNs work better than standard RNNs."
...
"This approach may also teach us something about why certain types of neural nets work so well. The architecture on the right here has many channels so that the gradient can flow backwards, which may help explain why LSTM RNNs work better than standard RNNs."