The more chip-makers in the "machine learning" space the better, I would say. But I'd also hope for as much general purpose computing as possible.
That's what I find particularly cool about stuff like Nvidia's DGX systems: they are great at machine learning workloads, but they also deliver ridiculously high performance for more traditional scientific computing (GROMACS, QUDA, etc.). A DGX-1V provides 50+ teraflops of fp64 at a price point that is within reach for many university labs.
> A DGX-1V provides 50+ teraflops for fp64 at a price point that is within reach for many university labs.
Universities do tend to have deep pockets, and research grants are often written to max out the allowable spending on equipment such as hardware. That a university lab can afford a piece of kit doesn't say much about its affordability in general.
Yeah.... Not so much. The only university labs I know that are using DGX had them donated by Nvidia. For my lab, we cobble together consumer cards, as does basically everyone else I know.
Nvidia was rather vocal with "hey ARM, it is you who dragged us into that consumer-market SoC saga in which we failed HARD; now we expect something in return, or we sever our relationship."
ARM needs to get some skin in the accelerator game before RISC-V et al. commoditize its cash cow.
NVDLA is fairly permissively licensed (free for commercial use), but of course Nvidia will steer the greater ecosystem around it. Perhaps ARM can be the Red Hat to NVDLA's Linux, or something like that. Still seems a bit strange to me.
> ARM needs to get some skin in the accelerator game before RISC-V et al. commoditize its cash cow.
For their own sake sure. I look forward to RISC-V taking over a large part of the world ;-) The Esperanto chip looks so promising even without proprietary extensions.
It says the processor is built for efficient convolutions. While that would be great for CNNs, could the same be used to make RNNs more efficient? I.e. can the problem of solving RNNs be reduced to solving CNNs?
Numeric convolution is just a sliding dot product (piecewise vector multiply-and-sum), used heavily in CNNs, but it still seems more specialized than what a GPU has become: more or less a glorified general-purpose parallel processor using the SIMD model (in Flynn's taxonomy).
https://en.wikipedia.org/wiki/Flynn%27s_taxonomy
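To make the sliding-dot-product view concrete, here's a minimal sketch of 1D "valid" convolution in plain Python (the function name and examples are illustrative, not from any particular library). Each output element is an independent dot product of a window against the flipped kernel, which is exactly why the operation maps so naturally onto SIMD hardware:

```python
def conv1d_valid(signal, kernel):
    """'Valid' 1D convolution: only windows fully inside the signal."""
    k = kernel[::-1]  # true convolution flips the kernel; CNNs often skip this
    n, m = len(signal), len(k)
    # Slide the window one step at a time; each sum is an independent
    # dot product, so all output positions can be computed in parallel.
    return [
        sum(signal[i + j] * k[j] for j in range(m))
        for i in range(n - m + 1)
    ]

# A [1, 0, -1] kernel acts as a simple difference (edge-detection) filter:
print(conv1d_valid([1, 2, 3, 4, 5], [1, 0, -1]))  # → [2, 2, 2]
```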