
Enabling Continual Learning in Neural Networks - interconnector
https://deepmind.com/blog/enabling-continual-learning-in-neural-networks/
======
cs702
This builds on the ideas behind PathNet, previously discussed at
[https://news.ycombinator.com/item?id=13675891](https://news.ycombinator.com/item?id=13675891)

Whereas PathNet permanently freezes parameters and pathways used for
previously learned tasks, in this case the authors compute how important each
connection is to the most recently learned task, and protect each connection
from future modification by an amount proportional to its importance.
Important pathways tend to persist, and unimportant pathways tend to be
discarded, gradually freeing "underused" connections for learning new tasks.

The authors call this process Elastic Weight Consolidation (EWC). Figure 1 in
the paper does a great job of explaining how EWC finds solutions that are
good for new tasks without incurring significant losses on previous tasks.
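
In code, the core idea boils down to a quadratic penalty that anchors each
weight to its old value, with stiffness proportional to its importance. A
minimal sketch, not the authors' implementation: `theta_star`, `importance`,
and `lam` are names I've made up, with `lam` the old-vs-new tradeoff
hyperparameter and `importance` the per-weight importance estimate (the
diagonal Fisher information in the paper).

    import numpy as np

    def ewc_penalty(theta, theta_star, importance, lam):
        # Pull each parameter toward its value after the previous task,
        # with stiffness proportional to that parameter's importance.
        return 0.5 * lam * np.sum(importance * (theta - theta_star) ** 2)

    def total_loss(theta, new_task_loss, theta_star, importance, lam):
        # Loss on the new task plus the elastic penalty from the old one.
        return new_task_loss(theta) + ewc_penalty(
            theta, theta_star, importance, lam)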

Very cool!

~~~
beaconstudios
that's pretty incredible - I'm no AI expert but it certainly sounds like this
algorithm provides an ANN equivalent of neuroplasticity, which seems like a
big step.

------
rayuela
I'm confused. I don't get what the novelty in this is. It looks like all they
do is include an input that identifies different tasks and then train one
neural network to learn a separate distribution for each task, with some
weight sharing...

~~~
Filligree
It may be an obvious solution, but has anyone done that before? While
retaining the ability to have said weight sharing?

~~~
rayuela
Of course, people have done this before [1]. There is quite a bit of research
looking into multi-task learning. Just look through some of the references in
that Luong et al. paper. DeepMind has been putting out some amazing research
lately, but this paper definitely does not fall in that category.

[1] [http://nlp.stanford.edu/pubs/luong2016iclr_multi.pdf](http://nlp.stanford.edu/pubs/luong2016iclr_multi.pdf)

------
colmvp
Sidenote: On the list of contributors I noticed there are Research Engineers
and Research Scientists. What is the difference between the two?

~~~
jhurliman
Research engineers turn theory, pseudocode, or smaller proofs of concept into
a more fleshed-out implementation. Once a research project exceeds a few
thousand lines of code, it becomes useful to have dedicated engineers doing
architectural design, owning unit testing / backtesting frameworks, code
quality control, etc.

Source: was a research engineer at Intel Labs several years ago.

~~~
taeric
While this is true, it is also sometimes merely based on your degree. I was a
"Research Engineer" doing the same work as "Research Scientists" because my
degree was in "Computer Engineering" not "Computer Science."

So, YMMV.

------
apl
This came out two days ago and uses what they call intelligent synapses to
improve multi-task learning:
[https://arxiv.org/abs/1703.04200](https://arxiv.org/abs/1703.04200)

Seems closely related.

------
spynxic
"..After learning a task, we compute how important each connection is to that
task."

Anyone know if this was expanded on in the whitepaper?

~~~
eutectic
Presumably you could track e.g. the average gradient of the error with respect
to each weight.
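
Something along these lines (a rough NumPy sketch, assuming you've already
collected per-example gradients; the paper actually uses the average squared
gradient, i.e. the diagonal of the Fisher information matrix, rather than
the raw average gradient):

    import numpy as np

    def estimate_importance(per_example_grads):
        # per_example_grads: list of flattened gradient vectors dL/dtheta,
        # one per training example, evaluated at the weights learned for
        # the task. Averaging the squared gradients approximates the
        # diagonal Fisher information, used as per-weight importance.
        return np.mean(np.square(np.stack(per_example_grads)), axis=0)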

