There are a couple of odd decisions here as far as a discussion is concerned. First of all, the "hype" behind the current generation of DNNs is TensorFlow running on a GPU. The idea isn't necessarily to make an efficient model per se (a small number of weights)... but instead to hack the model so it runs extremely efficiently on a GPU (i.e., hundreds of convolutional layers, all fitting inside GPU register space).
Under these very specific conditions, a neural network can perform 60+ TFLOPS (!!) worth of matrix multiplications on a $500 NVIDIA 2070 Super, hundreds of times faster than the most expensive CPUs on today's market.
Seeing an old-school "shallow"-style neural network is a bit nostalgic. But... that's not what people are doing or getting hyped up about. Modern neural networks are designed for the speed and efficiency of a different computer entirely, one that's cheaper to parallelize and cheaper to scale: the GPU.
I think the overall methodology of the post is sound, but it poorly represents what people are doing with neural nets. I was expecting a TensorFlow model, but we got a single-hidden-layer model instead.
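For context, the kind of single-hidden-layer network the post apparently uses is small enough to sketch in a few lines of NumPy. This is a hypothetical illustration (the layer sizes and activation are my assumptions, not the post's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, b1, W2, b2):
    """One hidden layer with ReLU, then a linear output layer."""
    h = np.maximum(0.0, x @ W1 + b1)  # the single hidden layer
    return h @ W2 + b2                # output layer

# Illustrative sizes, not taken from the post.
n_in, n_hidden, n_out = 4, 16, 1
W1 = rng.standard_normal((n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_hidden, n_out))
b2 = np.zeros(n_out)

x = rng.standard_normal((8, n_in))  # batch of 8 samples
y = forward(x, W1, b1, W2, b2)
print(y.shape)  # (8, 1)
```

Two matrix multiplies over a handful of weights: trivially fast on a CPU, and nothing like the hundreds of stacked convolutional layers that the GPU-era hype is actually about.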