Deep Residual Learning for Image Recognition (arxiv.org)
47 points by mbartoli on Dec 12, 2015 | 6 comments



This is Microsoft Research's approach, which romped to first place in the recent ImageNet challenge [0].

What's neat is that the technique is an almost comically simple way to add extra layers to a network. It's commonly accepted that deeper networks can learn better, but they get very unwieldy/difficult to train as they get deeper.

Roughly speaking (and please correct me if I'm off-base), the paper's technique is to slot in additional layers that are initially 'identity+', where each new layer is then trained to home in on the difference from 'identity'. Training on the residual alone is more stable: an answer near each '~0' starting point is at least as good as the original network, so any improvement is a pure win.
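To make that concrete, here's a minimal PyTorch-style sketch of what one such block might look like (the channel counts and layer arrangement are my own illustrative assumptions, not the paper's exact configuration):

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        # Computes y = F(x) + x, where F is a small stack of conv layers.
        # If F's output starts out near zero, the block starts out as
        # roughly the identity, so added depth can't hurt before training
        # finds an improvement.
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU()

        def forward(self, x):
            residual = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
            return self.relu(residual + x)  # skip connection adds the identity back

Stacking many blocks like this (modulo downsampling details) is essentially how the very deep networks in the paper are built.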

So... their winning network has a breathtaking 152 layers (and the final submission ensembles a few such networks together).

[0] http://image-net.org/challenges/LSVRC/2015/


Really cool insight and results. Like ReLU and dropout, I love it when such simple techniques yield such great improvements.


This seems like a result that is more general than "Image Recognition".


This is great. Any implementations available yet?


Looks like there's a Lasagne implementation (https://github.com/alrojo/lasagne_residual_network) and a stab at Keras/Theano (https://github.com/ndronen/modeling/blob/master/modeling/res...). At a guess, we'll see more implementations pop up in the coming months as researchers and grad students recover from NIPS and begin pondering how they could use residual learning.


Thanks! Yep, will probably see more after the Xmas feasting.



