
Path-breaking Papers About Image Classification - parths
http://blog.paralleldots.com/technology/deep-learning/must-read-path-breaking-papers-about-image-classification
======
falcolas
Some really cool information, but this concluding bit annoyed me:

> By Moore’s law, we will reach computing power of human brain by 2025 and all
> of the humanity by 2050.

Their graph does show exponential growth, but the data points cut off at the
year 2000. Not surprising, given that Moore's law has reached its end in the
last decade. ML improvements now depend upon better algorithms to make them
more parallel, and the economies of scale which make more parallel computation
units available. I don't think we're anywhere near that exponential graph,
however, and we'll keep getting further from it.

Perhaps quantum computing will become a widespread reality and blow the field
open, but I'm not holding my breath that it will happen in the next few
decades.

~~~
londons_explore
Moore’s law is still alive and well; you just have to move over to parallel
architectures like GPUs.

Considering machine learning now runs almost entirely on GPUs and TPUs, I
think this is still a fair assessment.

~~~
bllguo
You're conflating two things here: Moore's law is about how many transistors
you can pack onto a chip, not about performance. And Moore's law is
unsustainable due to the laws of physics. Transistors are reaching length
scales where quantum effects like tunneling dominate.

Think of a transistor as having two regions separated by a channel. When the
transistor is on, charge carriers flow through the channel between the
regions; when it is off, they do not. But as we move to smaller and smaller
length scales, the channel becomes so thin that charge carriers tunnel
straight through it and reach the other region. How do you distinguish on/off
behavior then?
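To make the "smaller channel, more leakage" point concrete, here is a rough textbook estimate (a WKB approximation for a rectangular barrier; the symbols and the rectangular-barrier simplification are my own illustration, not from the comment):

```latex
% Tunneling probability T through a barrier of width d and height V,
% for a charge carrier of mass m and energy E < V:
T \approx e^{-2 \kappa d}, \qquad
\kappa = \frac{\sqrt{2 m (V - E)}}{\hbar}
```

The key feature is the exponential dependence on the barrier width d: halving the channel length doesn't double the leakage, it squares the tunneling factor, which is why off-state leakage blows up at single-nanometre scales.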

~~~
londons_explore
Today's transistors are still generally built on a plane. If we could find
ways to build ICs stacked thousands or millions of layers thick, we could
really start using the third dimension.

("3D" transistors built on their side don't really count.)

~~~
CodesInChaos
For CPUs/GPUs it probably wouldn't help that much, since you have to get rid
of the heat produced by switching all those transistors.

And for storage the main issue isn't how dense it is, but how cheap it is to
manufacture.

------
AndrewOMartin
The caption for the top graph appears a bit out of whack.

It states "exponential decline in top-5 error rate", but the decline looks
more like diminishing returns to me, especially if you push the 2017 data
point out to where it belongs (they've omitted 2016).

It's nice that the error rate is low, but the caption appears to oversell it.

This graph reminds me of a very closely related one I saw in a talk a few
years ago [1]. It was showing decline in voice recognition error rates over
time, with a highlighted band for "human performance".

The speaker, Roger Moore (the academic, not the actor, and not the Moore with
the law), pointed out that this line, while encouraging, hid two important
points.

1) For linear improvement, exponentially more training data was needed.

2) It gave no insight into how living beings solve the same task.

These aren't necessarily fatal flaws, but they're worth remembering.

[1]
[https://www.youtube.com/watch?v=iYbVsvxd3bE](https://www.youtube.com/watch?v=iYbVsvxd3bE)

------
zitterbewegung
A more accurate way to think about what a computer "sees" is that ML models
learn which parts of the signal to throw away and which to pay attention to.
This is why you can slightly perturb an image so that a human still sees a
picture of two hot dogs, while the ML model is confused into two different
answers (a hot dog and an eggplant).
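The mechanism behind such perturbations can be sketched on a toy model. This is not how attacks on real convnets are run (those use small perturbations against deep networks); it is a minimal numpy illustration of the gradient-sign idea, with made-up weights and a deliberately large step so the flip is visible:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy "model": logistic regression with fixed, made-up weights
w = np.array([1.5, -2.0, 0.5])
b = 0.1

def predict(x):
    return sigmoid(w @ x + b)  # P(class 1, e.g. "hot dog")

# an input the model classifies confidently as class 1
x = np.array([2.0, -1.0, 0.5])
p = predict(x)                 # ~0.995, very confident

# gradient-sign step: for label 1, d(loss)/dx = (p - 1) * w;
# nudge every input dimension by eps in the direction that raises the loss
eps = 1.5
grad = (p - 1.0) * w
x_adv = x + eps * np.sign(grad)

print(predict(x))      # ~0.995: confident "hot dog"
print(predict(x_adv))  # ~0.34: flipped to the other class
```

The surprising part with deep networks is that an eps far too small for a human to notice is often enough, because the perturbation aligns with exactly the signal components the model decided to pay attention to.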

------
kushankpoddar
Sometimes I wonder why the top-5 image classification task is so difficult.
If you gave me 5 chances to look at an image and correctly classify it from
~1000 ImageNet classes, I could surely do better than a 5-10% error rate.

Also, now that the top-5 error rate has been brought down considerably, what
is the next benchmark for the research community to beat? A new dataset, or
top-1 error rate on ImageNet?
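For what it's worth, top-k error is cheap to compute from any model's raw class scores, so comparing top-1 against top-5 on the same predictions is easy. A minimal numpy sketch (the toy scores and labels are made up for illustration):

```python
import numpy as np

def top_k_error(scores, labels, k):
    """Fraction of samples whose true class is NOT among the k highest scores.

    scores: (n_samples, n_classes) array of model outputs
    labels: (n_samples,) array of true class indices
    """
    top_k = np.argsort(-scores, axis=1)[:, :k]     # class indices, best first
    hits = (top_k == labels[:, None]).any(axis=1)  # is the true label in the top k?
    return 1.0 - hits.mean()

# toy example: 2 samples, 3 classes
scores = np.array([[0.10, 0.90, 0.00],    # predicts class 1
                   [0.80, 0.15, 0.05]])   # predicts class 0
labels = np.array([1, 2])
print(top_k_error(scores, labels, 1))  # 0.5 -- second sample missed
print(top_k_error(scores, labels, 3))  # 0.0 -- top 3 covers all 3 classes
```

Run over a real model's ImageNet validation scores, the gap between k=1 and k=5 answers exactly the "what's the next benchmark" question.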

~~~
parths
A large majority of human errors come from fine-grained categories (such as
correctly distinguishing two similar cat species) and class unawareness. I'd
recommend this article by Andrej Karpathy, where he talks about what he
learned from competing against GoogLeNet:
[http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/](http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/)

~~~
AstralStorm
That would be a relatively low-grade error. Specifically, errors have to be
weighted, not just counted.

------
T_D_K
Does anyone have insight into why they're still doing top-5? It seems to me
that the error rates have dropped low enough that they could move on to top-3
or even single-guess challenges. Is there data showing how these same models
perform on such tasks? Though I suppose, if I were motivated, all the tools
needed to find out for myself are available.

------
pulkitkumar1995
Is DenseNet the one that won the best paper award at CVPR this year?

And which framework would you recommend for implementing these?

~~~
parths
Yes! Facebook's DenseNet won the best paper award at CVPR this year. I'd
recommend the PyTorch framework, as it extends the numpy/scipy ecosystem and
is simpler to use.

~~~
mongodude
I'd prefer utility over hype. One has to see how the community evolves around
PyTorch.

------
mongodude
The squeeze-and-excitation network by momenta.ai has been a watershed moment
for Chinese AI prowess, and I'll be watching for Chinese startups like this to
dominate the AI landscape for a while. What puzzles me is why Google hasn't
participated in the last couple of ImageNets.

~~~
tanilama
ImageNet as a competition has been losing its importance ever since 2016. No
idea as widely effective and inspiring as ResNet has come out of it since
that year; I feel people have just over-engineered their network structures
to claim state of the art by marginal gains.

Google has since brought out Neural Architecture Search, which can design
networks automatically and which I think puts them way ahead of the rest of
the competitors here.

------
AndrewKemendo
If anyone is interested here are the official ILSVRC2017 results:

[http://image-net.org/challenges/LSVRC/2017/results](http://image-net.org/challenges/LSVRC/2017/results)

------
lalp2119
It would be great if you could share links to pretrained weights for the
networks mentioned here, in a Python framework.

~~~
sanxiyn
Here are some. They all have pretrained weights available for download.

ResNet: [https://github.com/KaimingHe/deep-residual-networks](https://github.com/KaimingHe/deep-residual-networks)

Wide ResNets: [https://github.com/szagoruyko/wide-residual-networks](https://github.com/szagoruyko/wide-residual-networks)

ResNeXt: [https://github.com/facebookresearch/ResNeXt](https://github.com/facebookresearch/ResNeXt)

DenseNet: [https://github.com/liuzhuang13/DenseNet](https://github.com/liuzhuang13/DenseNet)

