
Deep Residual Learning for Image Recognition - mbartoli
http://arxiv.org/abs/1512.03385
======
mdda
This is Microsoft Research's approach, which romped to first place in the
recent ImageNet challenge [0].

What's neat is that the technique is an almost comically simple way to add
extra layers to a network. It's commonly accepted that deeper networks can
learn better, but they get very unwieldy/difficult to train as they get
deeper.

Roughly speaking (and please correct me if I'm off-base), the paper's
technique is to slot in additional layers that are initially 'identity+',
where each new layer then gets trained to home in on the differences from
'identity'. Training on the residuals alone is more stable, since answers
near each '~0' starting point are at least as good as the original network:
any improvement is a pure win.
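
If it helps, here's a minimal NumPy sketch of the forward pass (my own toy
version, not the paper's code: F is a single fully-connected ReLU layer
standing in for the paper's stacked conv layers, and all the names here are
made up for illustration):

```python
import numpy as np

def residual_block(x, W, b):
    residual = np.maximum(0.0, x @ W + b)  # F(x): the learned residual mapping
    return x + residual                    # y = x + F(x): shortcut adds x back

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W = 0.01 * rng.standard_normal((8, 8))  # near-zero weights, so F(x) ~ 0
b = np.zeros(8)

y = residual_block(x, W, b)
print(np.max(np.abs(y - x)))  # tiny: the block starts out close to identity
```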

So... their winning network has a breathtaking 152 layers (and then ensembles
a few of them together).

[0] [http://image-net.org/challenges/LSVRC/2015/](http://image-net.org/challenges/LSVRC/2015/)

------
transcranial
Really cool insight / results. Like ReLU and dropout, I love it when such
simple techniques bring such big improvements.

------
amelius
This seems like a result that is more general than "Image Recognition".

------
dharma1
This is great. Any implementations available yet?

~~~
cfcef
Looks like Lasagne
([https://github.com/alrojo/lasagne_residual_network](https://github.com/alrojo/lasagne_residual_network))
and a stab at Keras/Theano
([https://github.com/ndronen/modeling/blob/master/modeling/residual.py](https://github.com/ndronen/modeling/blob/master/modeling/residual.py)).
At a guess, we'll see more implementations pop up in coming months as
researchers and grad students recover from NIPS and begin pondering how they
could use residual learning.

~~~
dharma1
thanks! Yep, will prob see more after xmas feasting.

