Here is a better source:
Interesting bit about Nvidia Tesla V100 GPUs:
Assuming all those nodes are fully equipped, the GPUs alone will provide 215 peak petaflops at double precision. Also, since each V100 also delivers 125 teraflops of mixed-precision Tensor Core operations, the system’s peak rating for deep learning performance is something on the order of 3.3 exaflops.
Those exaflops are not just theoretical either. According to ORNL director Thomas Zacharia, even before the machine was fully built, researchers had run a comparative genomics code at 1.88 exaflops using the Tensor Core capability of the GPUs. The application was rummaging through genomes looking for patterns indicative of certain conditions. “This is the first time anyone has broken the exascale barrier,” noted Zacharia.
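The quoted figures can be sanity-checked with quick arithmetic. This sketch assumes Summit's published configuration (4,608 nodes with six V100s each) and Nvidia's rated per-GPU numbers (~7.8 TF double precision, 125 TF Tensor Core) — none of which are stated in the quote itself. The gross Tensor Core total comes out a bit above the quoted 3.3 exaflops, which is presumably a more conservative system-level estimate:

```python
# Back-of-the-envelope check of the quoted peak figures.
# Assumed configuration (not stated in the article): Summit's
# published 4,608 nodes x 6 V100s, with each V100 rated at
# ~7.8 TF double precision and 125 TF Tensor Core throughput.
NODES = 4608
GPUS_PER_NODE = 6
DP_TFLOPS_PER_GPU = 7.8
TENSOR_TFLOPS_PER_GPU = 125.0

gpus = NODES * GPUS_PER_NODE                          # 27,648 GPUs
dp_petaflops = gpus * DP_TFLOPS_PER_GPU / 1e3         # ~215.7 PF, matches "215 peak petaflops"
tensor_exaflops = gpus * TENSOR_TFLOPS_PER_GPU / 1e6  # ~3.46 EF gross, vs. the quoted ~3.3 EF

print(f"{gpus} GPUs: {dp_petaflops:.0f} PF double precision, "
      f"{tensor_exaflops:.2f} EF Tensor Core peak")
```

The 1.88-exaflop genomics run mentioned above would then be a bit over half of that gross Tensor Core peak, which is a plausible sustained fraction for a real application.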
Knights Landing/Hill/Mill is simply not compelling; Omni-Path was created as an InfiniBand knockoff that doesn’t beat Mellanox. The Cray Gemini/Aries interconnects can be found all over the top of the list (and the Intel acquisition of those interconnects happened in 2012), but you don’t see Omni-Path replacing anything.
Meanwhile, Nvidia comes out with NVLink and begins to build small clusters of GPUs, stitched into larger systems with IBM CPUs and Mellanox interconnects. A vacuum was created, and IBM and Mellanox moved (back) in.
The last few acquisitions by ORNL and LANL have been Crays, while ANL and LLNL were buying IBM Blue Genes. With this generation, it looks like things have switched. As another poster mentioned, it certainly seems like ANL’s next one will be Cray/Intel. It was going to be based on Knights Hill, but Intel cancelling that sort of put the architecture up for grabs.
I don't think there is much doubt that core counts will increase across all segments, and the asymmetric (big.LITTLE-style) core tech currently used in ARM is pretty cool.
I don't see the advantage of mixing Phi and SKX cores. Just use an appropriate balance of different nodes (maybe not all Intel).
And apparently Mellanox (MLNX) is being pressured by some activist investors to reduce R&D expenses and pay more dividends instead.
I can't remember where to look for OPA's features to help MPI implementation, but someone else might be able to comment.
Price isn't really a concern with these computers, though, because the sort of experimental work they are intended to "replace" (using that loosely, since a lot of the things they simulate are impossible to actually do) is far more expensive. Leadership-grade computing is all about enabling new classes of problems to be solved.
AFAIK there isn't a single publicly owned supercomputer in the US that wasn't funded at least in small part by stockpile maintenance budgets, even if it was never used for that purpose.
Also, actual science work that they have to do.
Could you explain
> Also, actual science work that they have to do.
Presumably these clusters aren't 100% utilized 100% of the time. They certainly weren't at the national lab I had access to ages ago...
My question was more about whether there's any red tape preventing the laboratory from paying the bills with mining, if it could do so profitably. Let's just assume there's idle time where this could hypothetically occur.
I'm not even trying to suggest that they should do that, it's just an interesting relatively new possibility and these things are quite expensive.
Sitting idle, the machine is not using nearly the power draw of when it's running full-tilt.
It still depends on scientists to produce code that can run during off hours. (And how well are scientists known to code?)