Yeah, but NVIDIA's amazing digging technique that could only be accomplished with NVIDIA shovels is now irrelevant, meaning there are more people selling shovels for the gold rush.
DeepSeek's stuff is actually more dependent on Nvidia shovels. They implemented a bunch of assembly-level optimizations below the CUDA stack that let them efficiently use the H800s they have, which are interconnect-bandwidth-gimped (cut-down NVLink) vs. the H100s they can't easily buy on the open market. That's cool, but it doesn't run on any other GPUs.
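For anyone wondering what "below the CUDA stack" means in practice: the usual way to drop beneath CUDA C++ is inline PTX. Here's a minimal sketch (not DeepSeek's actual code, just an illustration of the mechanism) showing a kernel that reaches for the sm_80+ `cp.async` instruction via `asm volatile`, the kind of bandwidth-oriented control the plain C++ surface doesn't expose as directly:

```cuda
// Hypothetical illustration only; compile with: nvcc -arch=sm_80 example.cu
#include <cuda_runtime.h>
#include <cstdio>

__global__ void copy_tile(const float* __restrict__ src,
                          float* __restrict__ dst, int n) {
    __shared__ float tile[256];
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (i < n) {
        // Convert the generic shared-memory address to the 32-bit .shared
        // window address that the PTX instruction expects.
        unsigned smem = static_cast<unsigned>(
            __cvta_generic_to_shared(&tile[threadIdx.x]));
        // Inline PTX: asynchronous 4-byte copy from global to shared memory,
        // then commit and wait on the copy group.
        asm volatile(
            "cp.async.ca.shared.global [%0], [%1], 4;\n"
            "cp.async.commit_group;\n"
            "cp.async.wait_group 0;\n"
            :: "r"(smem), "l"(src + i));
    }
    __syncthreads();

    if (i < n) dst[i] = tile[threadIdx.x];
}

int main() {
    const int n = 256;
    float *src = nullptr, *dst = nullptr;
    cudaMallocManaged(&src, n * sizeof(float));
    cudaMallocManaged(&dst, n * sizeof(float));
    for (int i = 0; i < n; ++i) src[i] = float(i);

    copy_tile<<<1, 256>>>(src, dst, n);
    cudaDeviceSynchronize();

    printf("dst[42] = %.1f\n", dst[42]);  // expect 42.0
    cudaFree(src);
    cudaFree(dst);
    return 0;
}
```

The point isn't this particular instruction, it's that anything written at this layer is tied to NVIDIA's PTX ISA and wouldn't carry over to AMD or other accelerators without a rewrite.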
Cue all of China rushing to Jensen to buy all the H800s they can before the embargo gets tightened, now that their peers have demonstrated that they're useful for something.
At least briefly, Jensen's customer audience increased.
I was thinking about that, but don't those same optimizations work on H100s? And don't the concepts carry over to every other Nvidia chip and every other manufacturer's chips?
I still think this is bullish: more people will be buying chips once training is cheaper and more accessible, and the models they'll be training will be 1,000% to 10,000% larger.
"Probably possible" is nothing compared to "already implemented." How long will it take to apply those concepts to other chips? Will they be made as widely available as DeepSeek's work has been? By the time those alternatives are implemented, how much further improvement will have been made on Nvidia chips? Worst case, someone implements and open-sources these optimizations for a competitor's chip almost immediately, in which case the competitive landscape is unchanged; in every other scenario, this is a first-mover advantage for Nvidia.
> You can train, or at least run, LLMs on Intel and less powerful chips
The claimed training breakthrough is an optimization targeting Nvidia chips, not something that reduces Nvidia's relative advantage. Even if it is easily generalizable to other vendors' hardware, it doesn't reduce Nvidia's advantage over other vendors; it just proportionately scales down the training requirements for a model of a given capacity. That might, in the very short term, reduce demand from the big existing incumbents, but it also increases the number of players for which investing in GPUs for model training is worthwhile at all, increasing aggregate demand.
It's not an optimization targeting Nvidia chips. It's an optimization of the technique through and through, regardless of chip.
But your point is well taken, and perhaps both my metaphor and the GP's break down.
Either way, we saw a massive spike in demand for Nvidia when crypto mining became huge, followed by a massive drop when we hit the crypto winter. We saw another massive spike when LLMs blew up, and this may just be the analogous drop in demand on the LLM side.
You both seem to be talking past each other. There were a number of optimizations that made this possible. Some were in the model itself and are transferable; others were in the training pipeline and are specific to the Nvidia hardware they trained on.
> What is stopping Huawei or other Chinese vendors from making chips to a DeepSeek specification
What is "deepseek specification"? Deepseek was trained on NVDA chips. If chinese vendors could build chips as good as NVDA it wouldn't have such a dominant position already, that hasn't changed