
Ask HN: Are modular neural networks an interesting avenue for further research? - hsikka
Modular/Multiple Neural Networks (MNNs) revolve around training smaller, independent networks that can feed into each other or into a higher-level network: https://en.wikipedia.org/wiki/Modular_neural_network

In principle, this hierarchical organization could let us make sense of more complex problem spaces and achieve higher functionality, but it seems difficult to find concrete prior research on the topic. I've found a few sources:

https://www.teco.edu/~albrecht/neuro/html/node32.html

https://vtechworks.lib.vt.edu/bitstream/handle/10919/27998/etd.pdf?sequence=1&isAllowed=y

A few concrete questions I have:

Are there any tasks where MNNs have shown better performance than large single nets?

Could MNNs be used for multimodal classification, i.e. train each net on a fundamentally different type of data (text vs. image) and feed forward to a higher-level intermediary that operates on all the outputs? (A rough sketch of what I mean follows below.)

From a software engineering perspective, aren't these more fault tolerant and more easily isolatable on a distributed system?

Has there been any work on dynamically adapting the topologies of subnetworks using a process like Neural Architecture Search?

Generally, are MNNs practical in any way?

Apologies if these questions seem naive; I've just come into ML, and more broadly CS, from a biology/neuroscience background and am captivated by the potential interplay.

I really appreciate you taking the time and lending your insight!
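
To make the multimodal question concrete, here's a rough sketch of the wiring I have in mind: two independently trainable sub-networks, one per modality, feeding a higher-level classifier. This is hypothetical Keras-style code; every layer size and name is a placeholder, not a reference implementation.

```python
# Hypothetical multimodal MNN sketch: a text module and an image module
# feed a higher-level intermediary that sees only their outputs.
from tensorflow.keras import layers, Model

# Text module: embeds token IDs and pools them into a fixed-size vector.
text_in = layers.Input(shape=(100,), name="tokens")
t = layers.Embedding(input_dim=20000, output_dim=64)(text_in)
t = layers.GlobalAveragePooling1D()(t)
text_module = Model(text_in, t, name="text_module")

# Image module: a small convnet producing a fixed-size feature vector.
img_in = layers.Input(shape=(64, 64, 3), name="image")
i = layers.Conv2D(32, 3, activation="relu")(img_in)
i = layers.GlobalAveragePooling2D()(i)
image_module = Model(img_in, i, name="image_module")

# Each module could be pre-trained on its own data and then frozen, e.g.:
# text_module.trainable = False

# Higher-level intermediary: operates only on the modules' outputs.
fused = layers.Concatenate()([text_module.output, image_module.output])
out = layers.Dense(10, activation="softmax")(fused)
combined = Model([text_in, img_in], out)
```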
======
Michaelanjello
Learning Quickly to Plan Quickly Using Modular Meta-Learning
[https://arxiv.org/abs/1809.07878](https://arxiv.org/abs/1809.07878)

Automatically Composing Representation Transformations as a Means for
Generalization
[https://arxiv.org/abs/1807.04640](https://arxiv.org/abs/1807.04640)

Modular meta-learning
[https://arxiv.org/abs/1806.10166](https://arxiv.org/abs/1806.10166)

Omega: An Architecture for AI Unification
[https://arxiv.org/abs/1805.12069](https://arxiv.org/abs/1805.12069)

Cortex Neural Network: learning with Neural Network groups
[https://arxiv.org/abs/1804.03313](https://arxiv.org/abs/1804.03313)

Evolutionary Architecture Search For Deep Multitask Networks
[https://arxiv.org/abs/1803.03745](https://arxiv.org/abs/1803.03745)

~~~
hsikka
Fantastic, thank you! I'm going to add these to the reading list for this week
:)

------
svantana
In general the trend is toward more complex architectures, which increasingly
blurs the line between modular nets and sparsely connected parts of a single
model. IMO it's only meaningful to talk about modularity if the modules can be
separately trained, or at least have their own loss functions. But of course
it's always useful to train or fine-tune the entire system end-to-end.

A semi-recent success story is Tacotron 2, a speech synthesizer that has one
model to convert text to a spectrogram, and a second model to convert that
spectrogram into PCM (waveform) data.
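
A minimal sketch of that kind of two-stage modular pipeline, assuming each stage is a separately trained model exposing a `predict` method (placeholder classes, not the actual Tacotron 2 code):

```python
# Hypothetical two-stage pipeline in the spirit of Tacotron 2: each
# module is trained against its own loss, then composed at inference.
import numpy as np

class TextToSpectrogram:
    """Stage 1: maps text to a mel spectrogram (stub for a trained model)."""
    def predict(self, text: str) -> np.ndarray:
        return np.zeros((80, len(text) * 10))  # (mel bins, frames)

class SpectrogramToWaveform:
    """Stage 2: a vocoder mapping spectrogram frames to PCM samples."""
    def predict(self, spec: np.ndarray) -> np.ndarray:
        return np.zeros(spec.shape[1] * 256)  # ~256 samples per frame

# The interface between modules is just a spectrogram, so either module
# can be retrained or swapped out without touching the other.
tts, vocoder = TextToSpectrogram(), SpectrogramToWaveform()
audio = vocoder.predict(tts.predict("hello world"))
```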

Also, Francois Chollet, the author of the popular deep learning library Keras,
has some interesting thoughts on the future of ML along these lines that I
find compelling: [https://blog.keras.io/the-future-of-deep-learning.html](https://blog.keras.io/the-future-of-deep-learning.html)

~~~
hsikka
Awesome, thank you! I'll look into this. I was certainly thinking about
modularity as independent, separate training of the modules.

------
mark_l_watson
Since non-linear neural models can approximate any function, I think it
makes sense to treat models as functions that are embedded in conventional
applications. There are good deep learning wrappers for functional languages
like Haskell and Clojure. I recently enjoyed a talk on juji.io, which treats
models as functions, leading to an interesting architecture.
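
A toy illustration of that "model as function" view, in Python for concreteness (the model here is a stand-in for any trained network, and all names are hypothetical):

```python
# A trained model treated as an ordinary pure function: data in,
# data out, so it composes with the rest of the application.
from typing import Callable
import numpy as np

Model = Callable[[np.ndarray], np.ndarray]

def sentiment_model(x: np.ndarray) -> np.ndarray:
    """Stub for a trained net; returns a score in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-x.mean(axis=-1)))

def route_ticket(features: np.ndarray, score: Model = sentiment_model) -> str:
    # The model is just another function embedded in app logic.
    return "escalate" if score(features) > 0.5 else "auto-reply"

print(route_ticket(np.array([0.3, 1.2, -0.1])))
```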

~~~
dlivingston
Do you have a permalink to that talk? I tried to go to their site but couldn't
get past the 'Empathy Bot'.

~~~
mark_l_watson
[https://m.youtube.com/watch?v=phA4bMjKvCY](https://m.youtube.com/watch?v=phA4bMjKvCY)

------
andreyk
It's pretty much what Google is looking into doing:

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts
Layer [https://arxiv.org/abs/1701.06538](https://arxiv.org/abs/1701.06538)
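
A stripped-down sketch of the sparsely-gated mixture-of-experts idea from that paper (plain NumPy with toy dimensions; the real layer adds gating noise, load-balancing losses, and learned non-linear experts):

```python
# Toy sparsely-gated mixture-of-experts: a gating network selects the
# top-k experts per input and mixes only their outputs.
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 16, 8, 2
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # linear "experts"
w_gate = rng.normal(size=(d, n_experts))

def moe(x: np.ndarray) -> np.ndarray:
    logits = x @ w_gate                      # one score per expert
    top = np.argsort(logits)[-k:]            # indices of the k best experts
    gate = np.exp(logits[top])
    gate /= gate.sum()                       # softmax over selected experts only
    # Only the chosen experts run; the rest contribute nothing.
    return sum(g * (x @ experts[i]) for g, i in zip(gate, top))

y = moe(rng.normal(size=d))
```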

~~~
hsikka
Thanks andreyk, and thank you for all the cool stuff you do in the AI
community, been following your work for a little while now

------
tabtab
That's largely the idea behind "Factor Tables":
[https://github.com/RowColz/AI](https://github.com/RowColz/AI)

You break big tasks into smaller tasks, which are generally manageable by
people with 4-year degrees. An AI expert may manage or guide the larger
project, but the sub-tasks shouldn't normally require an AI expert.

------
Michaelanjello
[https://github.com/tensorflow/adanet](https://github.com/tensorflow/adanet)
looks effectively modular to me, although I wouldn't call it the epitome of
the idea; one also needs to be able to merge, move, and remove modules.

~~~
hsikka
I've actually been wondering if some sort of NAS process could allow me to
emulate neural plasticity in the modular networks.

