
Cray Has Deep Learning Supercomputer in the Works - jonbaer
https://www.top500.org/news/cray-has-deep-learning-supercomputer-in-the-works/
======
arcanus
Expect this to become a common offering. The big vendors are all working on
this (Cray, IBM, Dell's HPC group,etc.).

Furthermore, the hardware manufacturers are increasingly moving towards
hardware that should effectively resolve this. Nvidia is of course leading in
the space, but Intel and AMD are certainly trying to close the gap with new
offerings, and I suspect ARM is as well (although I don't know any ARM folks
well).

For some weird reason, Nvidia is not selling TitanX for use in AWS or other
cloud offerings, so HPC resources might be the best place to get massive
collections of hardware with high speed interconnects.

I'm an HPC software guy, and so I view all this with a skeptical eye. I don't
think software stacks are sufficiently advanced to enable true HPC-scale
parallelism. Most of the work is being done on small clusters, and until a
truly performant MPI stack (or _perhaps_ a different message passing paradigm)
comes along, building a big (10K+ nodes) computer for deep learning seems
premature.

