
NVIDIA Develops NVLink Switch: NVSwitch, 18 Ports For DGX-2 - jsheard
https://www.anandtech.com/show/12581/nvidia-develops-nvlink-switch-nvswitch-18-ports-for-dgx2-more
======
jsheard
I'll just put the follow-up articles here rather than spamming tons of similar
stories:

More DGX-2 information - [https://www.anandtech.com/show/12587/nvidias-
dgx2-sixteen-v1...](https://www.anandtech.com/show/12587/nvidias-
dgx2-sixteen-v100-gpus-30-tb-of-nvme-only-400k)

Quadro GV100 announced - [https://www.anandtech.com/show/12579/big-volta-
comes-to-quad...](https://www.anandtech.com/show/12579/big-volta-comes-to-
quadro-nvidia-announces-quadro-gv100)

Tesla V100 memory bumped to 32GB -
[https://www.anandtech.com/show/12576/nvidia-bumps-all-
tesla-...](https://www.anandtech.com/show/12576/nvidia-bumps-all-
tesla-v100-models-to-32gb)

~~~
lsb
Five and a half years ago, one DGX-2 would have been among the top 10
supercomputers in the world[1], and you'll probably be able to rent one on EC2
for under twenty bucks an hour before the year's out. You can already get a
DGX-1 for under ten bucks an hour right now.

[1] 1920 teraflops of 4x4 matrix multiply-accumulate, and see
[https://www.top500.org/lists/2012/11/](https://www.top500.org/lists/2012/11/)
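
That headline number is just the 16 GPUs' tensor cores added up (a quick sanity check; the 120 TFLOPS FP16 tensor rating per V100 is NVIDIA's quoted spec, not stated in the thread):

```python
# Back-of-the-envelope: DGX-2 aggregate tensor-core throughput.
tensor_tflops_per_v100 = 120  # NVIDIA's quoted FP16 tensor spec (assumption)
num_gpus = 16

aggregate_tflops = tensor_tflops_per_v100 * num_gpus
print(aggregate_tflops)  # 1920 TFLOPS, i.e. roughly 2 PFLOPS of FP16 matrix math
```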

~~~
p1esk
No, it would not. You are comparing FP16 to FP64 performance.

~~~
twtw
It falls around 220th in the November 2012 list, and would make it onto the
top 10 for the November 2007 list (comparing Rpeak) and the June 2008 list
(comparing Rmax).

Systems of comparable performance in the top 10 used 300-500 kW (30-50x the
DGX-2).

Red Storm, which has a listed Rpeak of 127 TFLOPS and placed sixth in the
November 2007 TOP500, was "relatively inexpensive," costing only in the
ballpark of $75M (~180x the price of the DGX-2).
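
The ratios above check out roughly (a sketch; the 7.8 TFLOPS FP64 rating per V100 is NVIDIA's quoted spec, and the 10 kW / $400k DGX-2 figures come from elsewhere in this thread):

```python
# Rough FP64 comparison of the DGX-2 against 2007-era top-10 systems.
fp64_tflops_per_v100 = 7.8        # NVIDIA's quoted FP64 spec (assumption)
dgx2_fp64_tflops = 16 * fp64_tflops_per_v100
print(dgx2_fp64_tflops)           # ~125 TFLOPS, in Red Storm's 127 TFLOPS ballpark

dgx2_kw, dgx2_price = 10, 400_000
print(400 / dgx2_kw)              # a 400 kW system draws 40x the DGX-2's power
print(75_000_000 / dgx2_price)    # Red Storm's ~$75M is ~188x the DGX-2's price
```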

------
everyone
Does anyone know when graphics cards will be available at sane prices for
people who actually want to use one (or at most two) to render graphics?

~~~
Obi_Juan_Kenobi
It's really hard to come up with a timeline, but I'd say the worst has passed.
It will take time to recover inventory, but the alt-coin market (basically,
not-Bitcoin cryptocurrencies) has died down a lot and the rush to acquire
mining capital has likely diminished.

6 months to a year, maybe?

~~~
jamesblonde
If Ethereum does move over to proof-of-stake, then the market will be flooded
with 1080Tis. I expect the DeepLearning11 server will then be a goldmine for
DL researchers - cheap as chips.

~~~
MasterScrat
Can't wait. Fingers crossed.

------
nabla9
Can someone explain the technology edge Nvidia has?

AMD, Intel etc. have not been able to compete in the high-performance GPU
market, so Nvidia must have an edge. How big and sustainable is it?

~~~
dogma1138
They understood really early on that you need to invest in software just as
much as you invest in hardware, if not more.

Open standards tend to be a horse designed by committee: it can take years for
them to evolve and reach any consensus, and they can never match the speed at
which hardware evolves and adapts to market requirements.

So NVIDIA essentially built their own software ecosystem, which can be just as
flexible as their hardware and, more importantly, allows NVIDIA to be
proactive rather than reactive.

~~~
ethbro
They also realized non-CS researchers don't have the time, expertise, budget,
or interest to write optimized ML libraries.

And that if a graphics card company wrote easier-to-use, higher-level
libraries, this would be a competitive advantage.

------
throwaway84742
Without reading the article first, let me guess, $200k?

~~~
wlesieutre
$400k

[https://www.anandtech.com/show/12587/nvidias-
dgx2-sixteen-v1...](https://www.anandtech.com/show/12587/nvidias-
dgx2-sixteen-v100-gpus-30-tb-of-nvme-only-400k)

~~~
throwaway84742
Sheeeit. I’d love some of the stuff they’re smoking. You can build a 100 GPU
rig for half as much. Just scatter it around the office so it’s not “in the
data center”.

~~~
twtw
A 100-GPU rig would not have 512 GB in one address space, accessible by 16 GPUs.
Each GPU can directly address the memory on any of the 16 GPUs.
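
The capacity itself is simple arithmetic (assuming the 32 GB V100s mentioned at the top of the thread):

```python
# Aggregate HBM2 visible in one NVLink-switched address space.
gb_per_v100 = 32   # per the 32 GB V100 bump linked above
num_gpus = 16

print(gb_per_v100 * num_gpus)  # 512 GB shared across all 16 GPUs
```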

~~~
gravypod
For the $300k you'll save, just hire a full-time dev to write you a
distributed implementation of your solver.

~~~
slizard
Haha, good one. Just that by the time that one dev implements everything,
you'll have been overtaken by the competition -- at least that's everyone's
fear, and it's partly warranted.

------
jftuga
10,000 watts...wow!

------
xvilka
Too bad this comes from one of the worst companies in the world when it comes
to its policy towards open source. Too bad Google plays along with them by not
merging OpenCL support into TensorFlow.

