I’m curious why Schmidhuber didn’t participate in that competition. Clearly his students were the first to train convnets on GPUs: https://arxiv.org/abs/1202.2745
That paper, discussing the Schmidhuber team's work on MNIST and traffic sign recognition, dates to 2012, the same time Hinton's team (Alex Krizhevsky and Ilya Sutskever, now of OpenAI) was working on ImageNet and about to win that year's ImageNet competition.
The scale of ImageNet (1000 categories, 1 million 256x256+ images) is far more demanding than something like MNIST (10 categories, 60K lo-res 28x28 images), both in terms of compute power and network architecture. Remember, this was done before CuDNN or any NN framework software existed, and the advance of AlexNet over prior work such as Schmidhuber's DanNet (which the AlexNet paper cites) was precisely in all the architectural tweaks and optimizations (incl. hand-written GPU kernels) needed to get a CNN working at this scale.
The introduction to the AlexNet paper clearly sets out what prior work existed and what their own contributions were in taking it to this level.
There's nothing wrong with Schmidhuber expecting prior work to get recognition, but his own work built on that of others as did work that came later. I'm sure he'd have loved to enter ImageNet in 2012, but Hinton's team beat him to it and opened everyone's eyes to the possibility of training neural nets at that scale.
Not sure. But it seems they were at least aware of Alex's work on image recognition systems because of the CIFAR-10 reference. Training on ImageNet at that time involved a lot of engineering work.
My guess is Dan tried ImageNet, decided it was too computationally intensive for the graphics cards he had at the time (probably realized he couldn't fit a decent model on one GPU), and concluded no one was going to do it any time soon.
Nice to see the emphasis on the creation of the dataset.
A great way to do impactful work while avoiding the dumb competition in the ML space is to identify a high-impact application and start making datasets...