
DeepMind just published a mind blowing paper: PathNet - MichaelBurge
https://medium.com/@thoszymkowiak/deepmind-just-published-a-mind-blowing-paper-pathnet-f72b1ed38d46#.z69ew2s9g
======
hectormalot
When working through Michael Nielsen's deep learning tutorial[1], this was
exactly the question I had. I used to think deep learning involved 1000s of
layers. Turns out that training a network a few layers deep can already be a
real pain.

Currently we seem to be training one-off networks that are very good at a
single thing (e.g. recognising handwritten digits in 28x28 pixels), but we
haven't been able to really connect multiple networks together to get to more
generalised applications.

I didn't see this model coming, but it shows DeepMind is taking an interesting
direction here, one that could potentially lead to more generally applicable ML
capabilities.

[1]:
[http://neuralnetworksanddeeplearning.com](http://neuralnetworksanddeeplearning.com)

------
brudgers
PathNet paper:
[https://arxiv.org/abs/1701.08734](https://arxiv.org/abs/1701.08734)

 _For artificial general intelligence (AGI) it would be efficient if multiple
users trained the same giant neural network, permitting parameter reuse,
without catastrophic forgetting. PathNet is a first step in this direction. It
is a neural network algorithm that uses agents embedded in the neural network
whose task is to discover which parts of the network to re-use for new tasks.
Agents are pathways (views) through the network which determine the subset of
parameters that are used and updated by the forwards and backwards passes of
the backpropagation algorithm. During learning, a tournament selection genetic
algorithm is used to select pathways through the neural network for
replication and mutation. Pathway fitness is the performance of that pathway
measured according to a cost function. We demonstrate successful transfer
learning; fixing the parameters along a path learned on task A and re-evolving
a new population of paths for task B, allows task B to be learned faster than
it could be learned from scratch or after fine-tuning. Paths evolved on task B
re-use parts of the optimal path evolved on task A. Positive transfer was
demonstrated for binary MNIST, CIFAR, and SVHN supervised learning
classification tasks, and a set of Atari and Labyrinth reinforcement learning
tasks, suggesting PathNets have general applicability for neural network
training. Finally, PathNet also significantly improves the robustness to
hyperparameter choices of a parallel asynchronous reinforcement learning
algorithm (A3C)._
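
To make the mechanism concrete, here is a minimal Python sketch of the binary tournament selection over pathways that the abstract describes. The layer/module counts and the fitness function are made up for illustration (a stand-in for actually training the modules on a pathway): pick two pathways at random, evaluate both, then overwrite the loser with a mutated copy of the winner.

```python
import copy
import random

L, M = 3, 10        # layers and modules per layer (illustrative sizes)
N_ACTIVE = 4        # modules a pathway may use in each layer
POP_SIZE = 64

def random_pathway():
    # A pathway is a choice of which modules are active in each layer.
    return [random.sample(range(M), N_ACTIVE) for _ in range(L)]

def mutate(pathway, rate=1.0 / (L * N_ACTIVE)):
    # Independently re-draw each module choice with a small probability.
    new = copy.deepcopy(pathway)
    for layer in new:
        for i in range(len(layer)):
            if random.random() < rate:
                layer[i] = random.randrange(M)
    return new

# Stand-in fitness: in PathNet this would be accuracy (or RL return) after
# training only the modules on the pathway for a few steps; here we simply
# pretend modules 0..3 in every layer are the "good" ones.
GOOD = set(range(N_ACTIVE))
def evaluate(pathway):
    return sum(len(set(layer) & GOOD) for layer in pathway)

population = [random_pathway() for _ in range(POP_SIZE)]
for _ in range(2000):                                 # generations
    a, b = random.sample(range(POP_SIZE), 2)          # binary tournament
    fit_a, fit_b = evaluate(population[a]), evaluate(population[b])
    winner, loser = (a, b) if fit_a >= fit_b else (b, a)
    population[loser] = mutate(population[winner])    # overwrite the loser

print(max(evaluate(p) for p in population))           # best fitness found
```

For transfer, the paper then freezes the parameters along the best path found on task A and re-runs the same loop with a fresh population of paths on task B.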

------
rdlecler1
Biology has figured this out by separating developmental evolution from
learning. Developmental evolution evolves the genotype-phenotype map: for
neurogenesis, evolution has evolved genes that generate the brain. Learning is
then fine-tuning on top of hundreds of millions of years of evolution. We
can't get an ape to speak, but almost no human fails to learn language.
Interestingly, the process of neurogenesis is controlled by genes which
together form gene regulatory networks, and when you represent those
mathematically you discover that they're basically a special kind of neural
network. Basically, neural networks are a universal computational design that
has been functionally reproduced at least twice in nature.
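
For context on the claim that gene regulatory networks are "basically a special kind of neural network": a standard continuous model of gene regulation updates each gene's expression level as a sigmoid of a weighted sum of the other genes' levels, which is exactly the form of a recurrent neural-network unit. A toy sketch with made-up weights (illustrative only, not any specific GRN):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy gene regulatory network: w[i, j] is the (activating or repressing)
# influence of gene j on gene i; b is a basal expression bias.
rng = np.random.default_rng(0)
n_genes = 5
w = rng.normal(scale=1.5, size=(n_genes, n_genes))
b = rng.normal(size=n_genes)

x = rng.random(n_genes)      # current expression levels in [0, 1]
for _ in range(50):          # iterate the regulation dynamics
    x = sigmoid(w @ x + b)   # same update rule as a recurrent layer
print(np.round(x, 3))
```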

~~~
posterboy
> We can't get an Ape to speak

Sign language counts in my opinion.

Edit:

> Basically, neural networks are a universal computational design that has
> been functionally reproduced at least twice in nature

or is that because the concept can be used to describe everything (via the
universal approximation theorem)? Gene regulatory networks still sound very
interesting.

------
mycall
> A plausible requirement for artificial general intelligence is that many
> users will be required to train the same giant neural network on a multitude
> of tasks. This is the most efficient way for the network to gain experience,
> because such a network can reuse existing knowledge instead of learning from
> scratch for each task.

Is this the hypothesis or already well-known? I'm not so sure that one giant
DNN is better than a multitude of localized and specialized DNNs.

------
habitue
PathNet is super interesting, but the authors of the paper don't make the
claims about general AI that this blog post does, and the blog post doesn't
back up those claims well either.

