I had always thought of neural nets in terms of a massive connected graph that, in my head, somehow behaved like a machine.
In the end I realized it's just a representation of a massive function, f: R^m -> R^n, which needs to be fitted to match inputs and outputs.
I know this is not precisely correct and glosses over many, many details - but this change in viewpoint is what finally allowed me to increase the depth of my understanding.
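To make the "massive function to be fitted" view concrete, here's a minimal numpy sketch (toy data, made-up layer sizes, hand-rolled gradient descent -- not how you'd train a real net):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: inputs in R^2, outputs in R, from some unknown target.
    X = rng.normal(size=(200, 2))
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

    # f: R^2 -> R, a one-hidden-layer net; the weights are the knobs to fit.
    W1, b1 = rng.normal(size=(2, 16)), np.zeros(16)
    W2, b2 = rng.normal(size=16) / 4, 0.0

    def f(X):
        return np.tanh(X @ W1 + b1) @ W2 + b2

    # "Fitting" = nudging the knobs to match inputs to outputs
    # (plain gradient descent on squared error, backprop by hand).
    lr, n = 0.05, len(X)
    for _ in range(2000):
        H = np.tanh(X @ W1 + b1)
        err = H @ W2 + b2 - y
        gH = np.outer(err, W2) * (1 - H**2)   # chain rule through tanh
        W2 -= lr * H.T @ err / n
        b2 -= lr * err.mean()
        W1 -= lr * X.T @ gH / n
        b1 -= lr * gH.mean(axis=0)

    print(np.mean((f(X) - y) ** 2))   # mean squared error shrinks as we fit

Nothing graph-like anywhere in that loop: it's a parameterized function and an error to push down.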
It's unclear that there is such a thing as an NN as a distinct object, and even if there is, that it is graph-like.
What are the nodes and edges?
There is a computational graph corresponding to any mathematical function -- but it is not the NN diagram, and it is not very interesting (e.g., addition would be a node).
NNs are just polynomial regression when the activations are polynomial, piecewise-linear regression when they are ReLU, and so on.
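The ReLU case is easy to check directly: fix the on/off pattern of the hidden units and the whole net collapses to a single affine map, so the function is a patchwork of linear pieces, one per pattern. A numpy sketch (random weights, 1-D input, everything made up):

    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 1)), rng.normal(size=4)   # R -> R^4
    W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)   # R^4 -> R

    def relu_net(x):
        return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

    # Wherever the on/off pattern D of the hidden units is constant,
    # the net is exactly affine: f(x) = (W2 D W1) x + (W2 D b1 + b2).
    x = np.array([0.3])
    D = (W1 @ x + b1 > 0).astype(float)       # activation pattern at x
    slope = W2 @ (D[:, None] * W1)
    intercept = W2 @ (D * b1) + b2
    assert np.allclose(relu_net(x), slope @ x + intercept)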
An NN is just a highly parameterized regression model -- for better or worse.
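And it gets fitted like any other regression model. A sketch with scikit-learn's MLPRegressor (assuming sklearn is installed; the hyperparameters here are arbitrary):

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(500, 1))
    y = np.sin(X[:, 0])

    # Same .fit/.predict interface as any other sklearn regressor --
    # just with far more parameters under the hood.
    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000,
                         random_state=0).fit(X, y)
    print(model.predict(X[:5]), y[:5])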