> This sounds important and interesting, but isn't "wide" the key word here?
Yes, width is very important for this result. Given the size of modern deep neural networks, I (and by now most people in the deep learning theory community) believe the large-width regime is the right regime in which to study neural networks.
> I mean, the article puts forward a distinct language/model to express neural nets in, which is cool, but are they talking about all or most of the NNs you see today?
Try throwing me an architecture and see if I can't throw you back a GP :)
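To make this concrete, here's a quick empirical sanity check, entirely my own sketch rather than anything from the paper: for a one-hidden-layer ReLU net, the infinite-width covariance has a closed form (the arc-cosine kernel of Cho & Saul, 2009), so we can sample many wide random nets and compare the empirical output covariance against it. The particular setup and all names below are illustrative choices.

```python
# Sanity check of the NN -> GP claim at initialization (my sketch, not the
# paper's general construction): outputs of a wide random one-hidden-layer
# ReLU net over fixed inputs should have covariance close to the analytic
# arc-cosine kernel (Cho & Saul, 2009).
import numpy as np

rng = np.random.default_rng(0)
d, width, n_nets = 3, 2048, 5000    # input dim, hidden width, Monte Carlo nets

# Two fixed inputs; the GP claim concerns the joint law of (f(x1), f(x2)).
X = rng.standard_normal((2, d))

def random_net_outputs(X, width, n_nets, rng):
    """Sample f(x) = v . relu(W x) / sqrt(width) for many random nets."""
    outs = np.empty((n_nets, len(X)))
    for i in range(n_nets):
        W = rng.standard_normal((width, X.shape[1]))   # hidden weights ~ N(0, 1)
        v = rng.standard_normal(width)                 # output weights ~ N(0, 1)
        outs[i] = v @ np.maximum(W @ X.T, 0.0) / np.sqrt(width)
    return outs

def arccos_kernel(x, y):
    """Infinite-width covariance for ReLU: E[relu(w.x) * relu(w.y)], w ~ N(0, I)."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    theta = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return nx * ny * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

emp = np.cov(random_net_outputs(X, width, n_nets, rng), rowvar=False)
theory = np.array([[arccos_kernel(a, b) for b in X] for a in X])
print("empirical covariance:\n", emp)
print("arc-cosine kernel:\n", theory)   # the two matrices should nearly agree
```

The point is that this kind of kernel computation goes through for essentially whatever architecture you hand me, not just this toy MLP.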
> Fo-get-a-bout-it, see SiempreViernes' comment: "I couldn't find any mention about a trained NN, this is strictly about the initial state. "(emphasis added)
Yes. I will have things to say about training, but that requires building up some theory. This paper is the first step in laying it out. Stay tuned! :)
On the pragmatic side, would that GP train faster than the NN? In my limited experimentation with GPs, I found them awfully slow. However, maybe what I tried (it was a black box for me) used some brute-force approach, and there are other, more finely tuned algorithms. Since you are an expert in the area, what's your take?
I wouldn't say I'm an expert at using GPs, so actual GP practitioners, feel free to correct me if I'm wrong :)
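That said, here's my understanding of why off-the-shelf GP tools can feel slow. Exact GP regression has no iterative training loop at all; "training" amounts to one linear solve against the n × n kernel matrix over your dataset, which costs O(n²) memory and O(n³) time. Below is a minimal sketch of exact GP posterior-mean computation (the kernel choice, lengthscale, and noise level are illustrative assumptions, not anyone's recommended defaults):

```python
# Minimal exact GP regression, to show where the cost goes: building the
# n x n kernel matrix is O(n^2) memory and factorizing it is O(n^3) time.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between rows of A and rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-2):
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))  # O(n^2)
    L = np.linalg.cholesky(K)                                        # O(n^3)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))        # K^-1 y
    return rbf_kernel(X_test, X_train) @ alpha                       # O(m n)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))                 # small n: this is fast
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
X_test = np.linspace(-3, 3, 5)[:, None]
print(gp_posterior_mean(X, y, X_test))                # posterior mean at 5 points
```

So on small datasets a GP can be much faster than training an NN by gradient descent, but the cubic scaling bites quickly as n grows. The standard workarounds, as far as I know, are sparse/inducing-point approximations and iterative solvers based on conjugate gradients, but again, GP practitioners will know better than me.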