Ask HN: Are modular neural networks an interesting avenue for further research?
43 points by hsikka on Dec 3, 2018 | hide | past | favorite | 12 comments
Modular/multiple neural networks (MNNs) revolve around training smaller, independent networks that can feed into each other or into a higher-level network: https://en.wikipedia.org/wiki/Modular_neural_network

In principle, this hierarchical organization could let us make sense of more complex problem spaces and achieve greater functionality, but it seems difficult to find concrete past research on the topic. I've found a few sources:

https://www.teco.edu/~albrecht/neuro/html/node32.html

https://vtechworks.lib.vt.edu/bitstream/handle/10919/27998/etd.pdf?sequence=1&isAllowed=y

A few concrete questions I have:

Are there any tasks where MNNs have shown better performance than large single nets?

Could MNNs be used for multimodal classification, i.e., train each net on a fundamentally different type of data (text vs. image) and feed the outputs forward to a higher-level intermediary that operates on all of them?

From a software engineering perspective, aren't these more fault-tolerant and easier to isolate on a distributed system?

Has there been any work into dynamically adapting the topologies of subnetworks using a process like Neural Architecture Search?

Generally, are MNNs practical in any way?
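
To make the multimodal question concrete, here is a minimal late-fusion sketch (all names and dimensions are hypothetical, and the weights are random stand-ins for separately trained modules): two independent encoders, one per modality, feed a higher-level head that operates on their concatenated outputs.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class Encoder:
    """A tiny feed-forward module, standing in for a net trained on one modality."""
    def __init__(self, in_dim, out_dim, seed):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (in_dim, out_dim))

    def __call__(self, x):
        return relu(x @ self.W)

class FusionHead:
    """Higher-level intermediary that classifies from concatenated module outputs."""
    def __init__(self, in_dim, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (in_dim, n_classes))

    def __call__(self, z):
        logits = z @ self.W
        e = np.exp(logits - logits.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)  # softmax over classes

text_enc = Encoder(in_dim=300, out_dim=64, seed=1)   # e.g. text embeddings in
image_enc = Encoder(in_dim=512, out_dim=64, seed=2)  # e.g. image features in
head = FusionHead(in_dim=128, n_classes=10)

text_x = np.random.default_rng(3).normal(size=(4, 300))
image_x = np.random.default_rng(4).normal(size=(4, 512))
probs = head(np.concatenate([text_enc(text_x), image_enc(image_x)], axis=-1))
print(probs.shape)  # (4, 10): one class distribution per example
```

Each encoder could in principle be trained on its own task and frozen, with only the fusion head trained on the joint objective.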

Apologies if these questions seem naive, I've just come into ML and more broadly CS from a biology/neuroscience background and am captivated by the potential interplay.

I really appreciate you taking the time and lending your insight!



Learning Quickly to Plan Quickly Using Modular Meta-Learning https://arxiv.org/abs/1809.07878

Automatically Composing Representation Transformations as a Means for Generalization https://arxiv.org/abs/1807.04640

Modular meta-learning https://arxiv.org/abs/1806.10166

Omega: An Architecture for AI Unification https://arxiv.org/abs/1805.12069

Cortex Neural Network: learning with Neural Network groups https://arxiv.org/abs/1804.03313

Evolutionary Architecture Search For Deep Multitask Networks https://arxiv.org/abs/1803.03745


Fantastic, thank you! I'm going to add these to the reading list for this week :)


In general the trend is toward more complex architectures, which increasingly blurs the line between modular nets and sparsely connected parts of a single model. IMO it's only meaningful to talk about modularity if the modules can be trained separately, or at least have their own loss functions. But of course it's always useful to train or fine-tune the entire system end-to-end.

A semi-recent success story is Tacotron 2, a speech synthesizer with one model that converts text to a spectrogram and a second model that converts that spectrogram into PCM (waveform) data.
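
The shape of that pipeline can be sketched as two stub functions composed end to end (these are hypothetical stand-ins, not Tacotron 2 itself; each stage could be trained on its own loss and then composed):

```python
import numpy as np

def text_to_spectrogram(text, n_mels=80):
    """Stand-in for the first model: produces one spectrogram frame per character."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random((len(text), n_mels))

def spectrogram_to_pcm(spec, hop=256):
    """Stand-in for the vocoder: emits `hop` audio samples per spectrogram frame."""
    n_frames = spec.shape[0]
    rng = np.random.default_rng(0)
    return rng.uniform(-1.0, 1.0, n_frames * hop).astype(np.float32)

def synthesize(text):
    # The modular pipeline is just function composition at inference time.
    return spectrogram_to_pcm(text_to_spectrogram(text))

audio = synthesize("hello")
print(audio.shape)  # 5 frames * 256 samples -> (1280,)
```

The clean interface between the stages (the spectrogram) is what makes them separately trainable and swappable.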

Also, Francois Chollet, the author of the popular deep learning library Keras, has some interesting thoughts on the future of ML along these lines that I find compelling: https://blog.keras.io/the-future-of-deep-learning.html


Awesome, thank you! I'll look into this. I was certainly thinking of modularity as independent, separate training of the modules.


Since non-linear neural models can approximate any function, I think it makes sense to treat models as functions embedded in conventional applications. There are good deep learning wrappers for functional languages like Haskell and Clojure. I recently enjoyed a talk on juji.io, which treats models as functions, leading to an interesting architecture.
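
As a minimal sketch of that "models as functions" idea (the model here is a hypothetical stand-in, not a real library's API): wrap a trained model's predict call behind an ordinary pure function, so the rest of the application composes with it like any other code.

```python
from typing import Callable

def make_sentiment_fn(model_predict: Callable[[str], float]) -> Callable[[str], str]:
    """Turn a model's raw score into a plain function the application can call."""
    def sentiment(text: str) -> str:
        return "positive" if model_predict(text) >= 0.5 else "negative"
    return sentiment

# Hypothetical stand-in for a real model's predict method.
fake_model = lambda text: min(1.0, text.count("good") / 2 + 0.25)

sentiment = make_sentiment_fn(fake_model)
print(sentiment("good good day"))  # positive
print(sentiment("bad day"))       # negative
```

The application only depends on the function's signature, so the model behind it can be retrained or swapped without touching the calling code.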


Do you have a permalink to that talk? I tried to go to their site but couldn't get past the 'Empathy Bot'.



It's pretty much what Google is looking into doing:

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer https://arxiv.org/abs/1701.06538
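
The core mechanism in that paper can be sketched in a few lines (a toy illustration of the idea, not the paper's exact implementation; all weights here are random stand-ins): a gating network scores every expert, only the top-k experts actually run, and their outputs are combined with softmax weights.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d, k = 4, 8, 2

# Each "expert" is a small independent network (here, a single tanh layer).
expert_weights = [rng.normal(size=(d, d)) for _ in range(n_experts)]
gate_W = rng.normal(size=(d, n_experts))  # the gating network's weights

def moe(x):
    """Sparsely-gated mixture of experts: score all experts, run only the top-k."""
    scores = x @ gate_W
    top = np.argsort(scores)[-k:]            # indices of the k highest-scoring experts
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                             # softmax over the selected experts only
    return sum(wi * np.tanh(x @ expert_weights[i]) for wi, i in zip(w, top))

y = moe(rng.normal(size=d))
print(y.shape)  # (8,)
```

The sparsity is the point: capacity grows with the number of experts, but per-example compute stays roughly constant because only k experts fire.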


Thanks andreyk, and thank you for all the cool stuff you do in the AI community, been following your work for a little while now


That's largely the idea behind "Factor Tables": https://github.com/RowColz/AI

You break big tasks into smaller tasks, which are generally manageable by people with 4-year degrees. An AI expert may manage or guide the larger project, but the sub-tasks shouldn't normally require an AI expert.


https://github.com/tensorflow/adanet looks effectively modular to me, although I wouldn't call it the full realization of the idea. One also needs to be able to merge, move, and remove modules.


I've actually been wondering whether some sort of NAS network could allow me to emulate neural plasticity in the modular networks.



