It is also not controversial that gradient optimisation gets stuck in local
optima. A good exercise to convince yourself of this is to implement a simple
gradient descent algorithm and observe its behaviour.
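A rough sketch of that exercise (my own toy function, not anything from the thread): plain gradient descent on f(x) = x^4 - 4x^2 + x, which has a global minimum near x ≈ -1.47 and a shallower local minimum near x ≈ 1.35. Where the iterate ends up depends entirely on where it starts.

    # Plain 1-D gradient descent on a function with two minima:
    # f(x) = x**4 - 4*x**2 + x has its global minimum near x ~ -1.47
    # and a shallower local minimum near x ~ 1.35.

    def f(x):
        return x**4 - 4*x**2 + x

    def grad_f(x):
        return 4*x**3 - 8*x + 1

    def gradient_descent(x0, lr=0.01, steps=2000):
        x = x0
        for _ in range(steps):
            x -= lr * grad_f(x)
        return x

    # Started at x = 2.0 the iterate settles into the local minimum;
    # started at x = -2.0 it finds the global one. Same algorithm,
    # same learning rate -- only the starting point differs.
    print(gradient_descent(2.0))    # ~  1.35, f ~ -2.62 (local minimum)
    print(gradient_descent(-2.0))   # ~ -1.47, f ~ -5.44 (global minimum)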
>> [1] It should be self-evident because they would hardly be of any use if
unable to extrapolate at all.
That is why deep neural nets are trained with such vast amounts of data
and computing power: it's to compensate for their inability to generalise to unseen
instances.
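A quick way to see the extrapolation problem for yourself (a rough sketch using scikit-learn's MLPRegressor; the toy target, network size and library choice are my own assumptions for illustration): fit y = x^2 on inputs restricted to [-1, 1], then query a point well outside that interval.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Fit a small MLP to y = x**2, but only on inputs drawn from [-1, 1].
    rng = np.random.default_rng(0)
    X_train = rng.uniform(-1.0, 1.0, size=(2000, 1))
    y_train = X_train[:, 0] ** 2

    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
    model.fit(X_train, y_train)

    # Inside the training range the fit is fine; outside it the prediction
    # is whatever the learned (piecewise-linear) function happens to do there.
    print(model.predict([[0.5]]), 0.25)   # close to the true value
    print(model.predict([[3.0]]), 9.0)    # typically nowhere near the true value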
That, too, is not controversial. It is common knowledge that the recent
success of deep learning is due to the availability of larger datasets and
more powerful computers than in past years. For example, see this interview with Geoff Hinton:
http://techjaw.com/2015/06/07/geoffrey-hinton-deep-learning-...
Geoffrey Hinton: I think it’s mainly because of the amount of computation and
the amount of data now around but it’s also partly because there have been
some technical improvements in the algorithms. Particularly in the algorithms
for doing unsupervised learning where you’re not told what the right answer is
but the main thing is the computation and the amount of data.
The algorithms we had in the old days would have worked perfectly well if
computers had been a million times faster and datasets had been a million
times bigger but if we’d said that thirty years ago people would have just
laughed.
I hope this is helpful. I understand there are misconceptions about the
strengths and limitations of deep neural nets that are not easy to spot at
first glance. But it's a good idea to approach such an overhyped technology
with a skeptical mind and, for example, ask yourself: why is all that data
necessary, if they are so good at learning?
For a high-level discussion, see the following article by Francois Chollet, the maintainer of the deep learning library Keras:
https://blog.keras.io/the-limitations-of-deep-learning.html