
One problem with understanding ANNs is that the weight matrix carries a lot of spurious interactions. Running perturbation analysis, you can see that many of the interactions do not contribute to the information processing of the circuit. The same is true of Gene Regulatory Networks. I wrote a paper published in Nature's "Systems Biology" entitled "Survival of the Sparsest: Gene Networks are Parsimonious". It's been cited ~130 times. There I present an algorithm to evolve the connectivity of the network. What you find is that a network will tend to remove spurious interactions if the system is allowed to evolve. Because there will be very few network topologies that are both sparsely connected and functionally equivalent (think of how many ways you could create a minimally complex 8-bit adder), there is likely only a small handful of non-isomorphic network topologies for any given function. With these sparse networks we should get a better grasp on the functional circuits that drive them. When the networks appear fully connected, at least in each layer, that circuit does not reveal itself.



NVIDIA NIPS paper doing just that for NN.

http://papers.nips.cc/paper/5784-learning-both-weights-and-c...

Not sure if it tells us more about why the network works, but they sure as heck get rid of a lot of connections.


Is perturbation analysis one of the things the mathematicians don't know about or don't accept?


I don't think so - it is fairly common in simulations and statistical analysis as a way to explore the robustness of your results given small changes within the tolerances you expect for your model. Usually it is applied to the input data, but training neural networks can be expensive, so tinkering with the weights is much cheaper. The weights are what the gradient descent algorithm adjusts while it looks for a local minimum, so perturbing them also gives insight into the stability of that minimum.
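To make the weight-perturbation idea concrete, here is a minimal sketch under my own assumptions: a toy linear model, a mean-squared-error loss, and Gaussian noise of a hypothetical `scale`. None of the names come from a particular library; it only illustrates nudging the weights around a minimum and watching how much the loss moves.

```python
import random

def loss(weights, data):
    """Mean squared error of a linear model y = sum(w_i * x_i)."""
    total = 0.0
    for xs, y in data:
        pred = sum(w * x for w, x in zip(weights, xs))
        total += (pred - y) ** 2
    return total / len(data)

def perturbation_sensitivity(weights, data, scale=0.01, trials=200, seed=0):
    """Return (baseline loss, mean loss increase under small Gaussian weight noise)."""
    rng = random.Random(seed)
    base = loss(weights, data)
    increases = []
    for _ in range(trials):
        noisy = [w + rng.gauss(0.0, scale) for w in weights]
        increases.append(loss(noisy, data) - base)
    return base, sum(increases) / trials

# Toy data generated by y = 2*x0 - x1; the third input is irrelevant (assumption).
rng = random.Random(1)
data = []
for _ in range(50):
    xs = [rng.uniform(-1, 1) for _ in range(3)]
    data.append((xs, 2 * xs[0] - xs[1]))

trained = [2.0, -1.0, 0.0]  # the exact minimum of this toy problem
base, mean_increase = perturbation_sensitivity(trained, data)
```

A small `mean_increase` relative to the noise scale suggests a flat, stable minimum; a large one suggests the result is sensitive to small weight changes.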


Isn't this what dropout does?


Could you explain weight matrix "spurious interactions"?


In short, a spurious interaction is a non-zero entry in the weight matrix that makes no (positive) contribution to the network function: if you remove that interaction, the network performs at least as well as it did with it.
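That definition can be checked directly with single knock-outs. The sketch below is only an illustration, not the method from the paper: the tiny threshold network, its weights, and the helper names (`forward`, `error`, `find_spurious`) are all made up for the example.

```python
from itertools import product

def step(z):
    return 1 if z >= 0 else 0

def forward(W, thresholds, x):
    """One layer of threshold units: out_i = step(sum_j W[i][j] * x[j] - t_i)."""
    return [step(sum(w * xi for w, xi in zip(row, x)) - t)
            for row, t in zip(W, thresholds)]

def error(W, thresholds, cases):
    """Number of input cases where the output differs from the target."""
    return sum(forward(W, thresholds, x) != target for x, target in cases)

def find_spurious(W, thresholds, cases):
    """Non-zero entries whose removal leaves the error no worse than baseline."""
    base = error(W, thresholds, cases)
    spurious = []
    for i, row in enumerate(W):
        for j, w in enumerate(row):
            if w == 0:
                continue
            knocked = [list(r) for r in W]
            knocked[i][j] = 0.0  # knock out a single interaction
            if error(knocked, thresholds, cases) <= base:
                spurious.append((i, j))
    return spurious

# Hypothetical target function: out0 = x0 AND x1, out1 = x2.
cases = [(x, [x[0] & x[1], x[2]]) for x in product([0, 1], repeat=3)]
W = [[1.0, 1.0, 0.2],
     [0.0, 0.1, 1.0]]
thresholds = [1.5, 0.5]
spurious = find_spurious(W, thresholds, cases)  # the 0.2 and 0.1 entries
```

Note that this tests one interaction at a time. Two entries that are each removable alone are not guaranteed to be removable together, so in general pruning has to be iterated.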

Most neural network models assume that all neurons in one level are fully connected to all neurons in the next level. This leads to confusion about how ANNs work.

From my research I'd argue that MOST interactions in these networks are spurious. Once you remove them it reveals the (visual) topology (circuit diagram) that's driving the function of that network.

In the paper I wrote, I evolved gene regulatory networks (ANNs have the same mathematical representation) such that interactions between any two nodes could be deleted, created (if W_ij = 0), or modified according to probabilities of deletion, creation, and modification. Given these probabilities, you can calculate the number of interactions that should result when the network reaches equilibrium. However, what I found was that the network evolves fewer interactions than you would expect from the equilibrium calculation. This says that, all things being equal, a network is paying a price for spurious interactions and that these will be removed in an evolutionary environment. Basically, each interaction needs to pay its way; otherwise it leads to unnecessary complexity that reduces the fitness of the network.
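As a rough illustration of the kind of equilibrium calculation meant here, assume (my assumption, not necessarily the paper's exact model) that without selection each existing interaction is deleted with probability `p_delete` per generation and each absent one is created with probability `p_create`. Modification does not change the count, so it is left out. The neutral equilibrium density is then p_create / (p_create + p_delete), and a quick simulation agrees:

```python
import random

def neutral_density(n_entries, p_create, p_delete, generations, burn_in, seed=0):
    """Average fraction of non-zero entries under mutation alone (no selection)."""
    rng = random.Random(seed)
    present = [False] * n_entries
    running = 0.0
    for g in range(generations):
        for k in range(n_entries):
            if present[k]:
                if rng.random() < p_delete:
                    present[k] = False   # existing interaction deleted
            elif rng.random() < p_create:
                present[k] = True        # new interaction created
        if g >= burn_in:
            running += sum(present) / n_entries
    return running / (generations - burn_in)

p_create, p_delete = 0.01, 0.05               # hypothetical rates
expected = p_create / (p_create + p_delete)   # neutral expectation, 1/6 here
simulated = neutral_density(1000, p_create, p_delete,
                            generations=2000, burn_in=500)
```

The claim above is that, once selection on network function is added, the evolved density ends up below this neutral expectation.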


Interesting, thank you.


Very, very roughly a neural network consists of a bunch of connected nodes that you propagate signals through. Each of those connections carries a weight that affects how the signal gets propagated through the rest of the network.

Let's say a specific problem has only one very specific set of connections that matters. You'll eventually end up with weights that reflect that, but nothing prevents a lot of other connections from also having weights set during training, weights that may end up being cancelled out or reduced enough to have no meaningful impact whatsoever on the end result.


Is that why sparsity terms have been introduced into objective functions?
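For anyone unfamiliar with the term, a common form of sparsity term is an L1 penalty added to the loss: objective = MSE + lam * sum(|w_i|). The sketch below is a toy under stated assumptions: a linear model, made-up data, a hypothetical `lam`, and proximal gradient descent with soft-thresholding as the optimizer (one standard choice, not the only one). It shows the weight on an irrelevant input being driven to exactly zero.

```python
import random

def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink towards zero, clip small values to 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1(data, n_weights, lam=0.1, lr=0.1, steps=2000, seed=0):
    """Minimise MSE + lam * sum(|w|) for a linear model by proximal gradient."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(n_weights)]
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * n_weights
        for xs, y in data:
            resid = sum(wi * xi for wi, xi in zip(w, xs)) - y
            for j, xj in enumerate(xs):
                grad[j] += 2.0 * resid * xj / n
        # gradient step on the MSE, then shrinkage from the L1 term
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w

# Toy data: y = 2*x0 - x1 + small noise; x2 is irrelevant (assumption).
rng = random.Random(1)
data = []
for _ in range(100):
    xs = [rng.uniform(-1, 1) for _ in range(3)]
    data.append((xs, 2 * xs[0] - xs[1] + rng.gauss(0.0, 0.05)))

w = fit_l1(data, n_weights=3)
```

The useful weights come out slightly shrunk towards zero, which is the price of the penalty, while the irrelevant one is removed outright.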


noise, biases, collisions, etc. due to imperfect or insufficient data, representational space issues, etc.



