If training and inference just got 40x more efficient, but OpenAI and co. still have the same compute resources, once they’ve baked in all the DeepSeek improvements, we’re about to find out very quickly whether 40x the compute delivers 40x the performance / output quality, or if output quality has ceased to be compute-bound.
In the long run (which in the AI world is probably ~1 year) this is very good for Nvidia, very good for the hyperscalers, and very good for anyone building AI applications.
The only thing it's not good for is the idea that OpenAI and/or Anthropic will eventually become profitable companies with market caps that exceed Apple's by orders of magnitude. Oh no, anyway.
Yes! I have had the exact same mental model. The biggest losers in this news are the groups building frontier models. They are the ones with huge valuations, but if the optimizations turn out to be even close to true, it's a massive threat to their business model. My feet are on the ground, but I still believe the world does not comprehend how much compute it can use... as compute gets cheaper, we will use more of it. Ignoring equity pricing, this benefits all other parties.
My big current conspiracy theory is that this negative sentiment toward Nvidia from DeepSeek's release is spread by people who actually want to buy more stock at a cheaper price. If you know anything about the topic, it's wild to assume that this will drive demand for GPUs anywhere but up. If Nvidia came out with a Jetson-like product that can run the full 670B R1, they could make infinite money. And in the datacenter segment, companies will stumble over each other to get the necessary hardware (which corresponds to a dozen H100s or so right now), especially once HF comes out with their uncensored reproduction. There's so much opportunity to turn more compute into more money because of this; almost every company could theoretically benefit.
Can you guys explain why this would be bad for the OpenAIs and Anthropics of the world?
Wasn't the story always outlined as: we build better and better models, then we eventually get to AGI, AGI works on building better and better models even faster, and we eventually get to super-AGI, which can work on building better and better models even faster still...
Isn't "super-optimization"(in the widest sense) what we expect to happen in the long run?
First of all, we need to just stop talking about AGI and Superintelligence. It's a total distraction from the actual value that has already been created by AI/ML over the years and will continue to be created.
That said, you have to distinguish between "good for the field of AI, the AI industry overall, and users of AI" from "good for a couple of companies that want to be the sole provider of SOTA models and extract maximum value from everyone else to drive their own equity valuations to the moon". Deepseek is positive for the former and negative for the latter.
I believe that, in general, the business model of building frontier models has not been fully baked yet. Let's ignore the thought of AGI and just say models do continue to improve. In OpenAI's case, they have raised lots of capital in the hopes of dominating the market. That capital pegged them at a valuation. Now you have a company with ~100 employees and supposedly a lot less capital come in and get close to OpenAI's current leading model. It has the potential to pop their balloon massively.
By releasing a lot of it as open source, everyone has their hands on it. That opens the door to new companies.
Or a simpler mental model: third parties have shown the ability to get quite close to the leading frontier models. The leading frontier models take hundreds of millions of dollars, and if someone is able to copy one within a year's time for significantly less capital, it's going to be a hard game of cat and mouse.
> If training and inference just got 40x more efficient
Did training and inference just get 40x more efficient, or just training? They trained a model with impressive outputs on a limited number of GPUs, but DeepSeek is still a big model that requires a lot of resources to run. Moreover, which costs more, training a model once or using it for inference across a hundred million people multiple times a day for a year? It was always the second one, and doing the training cheaper makes it even more so.
But this implies that we could use those same resources to train even bigger models, right? Except that you then have the same problem. You have a bigger model, maybe it's better, but if you've made inference cost linearly more because of the size and the size is now 40x bigger, you now need that much more compute for inference.
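A crude back-of-the-envelope illustration of that train-once versus serve-at-scale comparison; every figure here (training cost, per-query cost, usage) is an invented assumption, not anyone's actual numbers:

    # All numbers are made-up assumptions for illustration only.
    training_cost = 100e6                 # assume a one-off ~$100M frontier training run
    users = 100e6                         # "a hundred million people"
    queries_per_user_per_day = 3
    cost_per_query = 0.002                # assume ~$0.002 of inference compute per query

    yearly_inference = users * queries_per_user_per_day * cost_per_query * 365
    print(f"training (one-off):   ${training_cost / 1e6:,.0f}M")
    print(f"inference (per year): ${yearly_inference / 1e6:,.0f}M")
    # Under these assumptions, a single year of inference already exceeds the
    # training run, and cheaper training only tilts the ratio further.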
Actually inference got more efficient as well, thanks to the multi-head latent attention algorithm that compresses the key-value cache to drastically reduce memory usage.
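A minimal sketch of where that memory saving comes from: caching one low-rank latent per token instead of full per-head keys and values. The dimensions below are hypothetical placeholders, not DeepSeek's actual MLA configuration.

    # Hypothetical dimensions, chosen only to illustrate the order of magnitude.
    n_layers, n_heads, head_dim = 60, 128, 128
    seq_len, bytes_per_value = 32_768, 2          # bf16 cache entries
    d_latent = 512                                # assumed width of the shared KV latent

    # Standard attention caches full per-head keys and values for every layer...
    full_kv_bytes = n_layers * seq_len * n_heads * head_dim * 2 * bytes_per_value
    # ...while a latent-attention-style cache stores one compressed vector per token.
    latent_kv_bytes = n_layers * seq_len * d_latent * bytes_per_value

    print(f"full KV cache:   {full_kv_bytes / 2**30:.1f} GiB per sequence")
    print(f"latent KV cache: {latent_kv_bytes / 2**30:.2f} GiB per sequence")
    print(f"reduction:       ~{full_kv_bytes / latent_kv_bytes:.0f}x")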
That's a useful performance improvement but it's incremental progress in line with what new models often improve over their predecessors, not in line with the much more dramatic reduction they've achieved in training cost.
If the H800 is a memory-constrained part that NVIDIA built to get around the export ban on shipping H100s to China, while keeping equivalent fp8 performance,
it makes zero sense to believe Elon Musk, Dario Amodei and Alexandr Wang's claims that DeepSeek smuggled H100s.
The only reason a team would allocate time to memory optimizations and writing NVPTX code, rather than focusing on post-training, is if they severely struggled with memory during training.
This is a massive trick pulled by Jensen: take the H100 design, whose sales are regulated by the government, make it look 40x weaker and call it the H800, while conveniently leaving 8-bit computation as fast as the H100's. Then bring it to China and let companies stockpile it without disclosing production or sales numbers, with no export controls.
Eventually, after 7 months, the US government starts noticing the H800 sales and introduces new export controls, but it's too late. By this point, DeepSeek has started doing research using fp8. They slowly build bigger and bigger models, work on bandwidth and memory consumption, until they make R1 - their reasoning model.
Especially since he seems intent on everyone talking about him all the time. I find it questionable when a person wants to be the centre of attention no matter what. Perhaps attention is not all we need.
He's like a broken smart network switch - smart as in managed. Packets addressed to the switch's own MAC are all broken, but the erroneously forwarded ones often have valuable data. From layer 3 we can't tell which is which.
So not an actual DeepSeek-R1 model but a distilled Qwen or Llama model.
From DeepSeek-R1 paper:
> As shown in Table 5, simply distilling DeepSeek-R1’s outputs enables the efficient DeepSeek-R1-7B (i.e., DeepSeek-R1-Distill-Qwen-7B, abbreviated similarly below) to outperform non-reasoning models like GPT-4o-0513 across the board.
and
> DeepSeek-R1-14B surpasses QwQ-32B-Preview on all evaluation metrics, while DeepSeek-R1-32B and DeepSeek-R1-70B significantly exceed o1-mini on most benchmarks.
and
> These [Distilled Model Evaluation] results demonstrate the strong potential of distillation. Additionally, we found that applying RL to these distilled models yields significant further gains. We believe this warrants further exploration and therefore present only the results of the simple SFT-distilled models here.
Yes, but even that can still be run (slowly) on CPU-only systems with as little as about 32 GB of RAM. Memory virtualization is a thing. If you get used to using it like email rather than chat, it's still super useful even if you are waiting half an hour for your reply. Presumably you have a fast distill on tap for interactive stuff.
I run my models in an agentic framework with fast models that can ask slower models or APIs when needed. It works perfectly, 60 percent of the time lol.
Yes, but I think most of the rout is caused by the fact that there really isn't anything protecting AI from being disrupted by a new player - these models are fairly simple technology compared to some of the other things tech companies build. That means OpenAI really doesn't have much ability to protect its market-leader status.
I don't really understand why the stock market has decided this affects nvidia's stock price though.
This article has good background, context, and explanations [1]. They skipped CUDA and instead used PTX, a lower-level instruction set, where they were able to implement more performant cross-chip comms to make up for the less performant H800 chips.
> Moreover, if you actually did the math on the previous question, you would realize that DeepSeek actually had an excess of computing; that’s because DeepSeek actually programmed 20 of the 132 processing units on each H800 specifically to manage cross-chip communications. This is actually impossible to do in CUDA.
You can do this just fine in CUDA, no PTX required. Of course all the major shops are using inline PTX at the very least to access the Tensor cores effectively.
>If training and inference just got 40x more efficient
The jury is still out on how much improvement DeepSeek made in terms of training and inference compute efficiency, but personally I think 10x is probably the actual improvement that was made.
But in business/engineering/manufacturing/etc., if you have 10x more efficiency, you're basically going to obliterate the competition.
>output quality has ceased to be compute-bound
You raised an interesting conjecture, and it seems very likely to be the case.
I know that it's not even been a full two years since ChatGPT-4 was released, but it seems to be taking OpenAI a very long time to release ChatGPT-5. Is it because they're taking their own sweet time to release the software, not unlike GIMP, or because they genuinely cannot justify the improvement needed to jump from 4 to 5? This stagnation, however, has allowed others to catch up. Now, based on DeepSeek's claims, anyone can have their own ChatGPT-4 under their desk with Nvidia's Project Digits mini PCs [1]. For running DeepSeek, 4 mini PC units will be more than enough at 4 PFLOPS, and cost only USD 12K. Let's say on average a subscriber pays OpenAI USD 10 a month; for a 1,000-person organization that's USD 10K a month, so the investment pays for itself in about a month, and no data ever leaves the organization since it's a private cloud! (Rough math sketched below.)
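Running the numbers from that claim explicitly (the hardware price, per-seat fee, and headcount are the assumptions stated above, not verified figures):

    # Figures taken from the assumptions in the comment above, not verified prices.
    units, price_per_unit = 4, 3_000           # four Project Digits boxes at ~$3K each
    hardware_cost = units * price_per_unit     # ~$12K up front

    seats, fee_per_seat_per_month = 1_000, 10  # 1,000-person org at $10/seat/month
    monthly_subscription_spend = seats * fee_per_seat_per_month

    months_to_break_even = hardware_cost / monthly_subscription_spend
    print(f"hardware outlay:       ${hardware_cost:,}")
    print(f"monthly subscriptions: ${monthly_subscription_spend:,}")
    print(f"break-even after about {months_to_break_even:.1f} months")  # ~1.2 months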
For training a similar system to ChatGPT-4, based on DeepSeek's claims, a few million USD is more than enough. Apparently, OpenAI, SoftBank and Oracle just announced a USD 500 billion joint venture to push AI forward with the newly announced Stargate AI project, but that's 10,000x the money [2],[3]. The elephant-in-the-room question is: can they even get a 10x quality improvement over the existing ChatGPT-4? I seriously doubt it.
[1] NVIDIA Puts Grace Blackwell on Every Desk and at Every AI Developer’s Fingertips:
90% of the comments in this thread make it clear that knowing about technology does not in any way qualify someone to think correctly about markets and equity valuations.
He recently said that evil people can’t survive long as founders of tech companies because they need smart people to work for them and smart people can work anywhere. There are lots of other examples. Especially read his recent tweets/essays that aren’t about his area of expertise.
Look it up yourself. Paul Graham does not have a core competency regarding what smart people are or are not willing to do. That would be sociology or psychology or economics.
But every successful SV founder and or VC is not only a tech genius but also a geopolitical and socioeconomic expert! That’s why they make war companies, cozy up to politicians, and talk about how woke is ruining the world. /s
In fairness, 'geopolitical experts' may not really exist. There are a range of people who make up interesting stories to a greater or lesser extent but all seem to be serially misinformed. Some things are too complicated to have expertise in.
Indeed, while the existence of socioeconomic experts seems more likely we don't have any way of reliably identifying them. The people who actually end up making social or economic policy seem to be winging it by picking the policy that most benefits wealthy people and/or established asset owners. It is barely possible to blink twice without stumbling over a policy disaster.
>In fairness, 'geopolitical experts' may not really exist.
Except for, I don't know, the many thousands of people who work at various government agencies (diplomatic, intelligence) or even private sector policy circles whose job it is to literally be geopolitical experts in a given area.
There are thousands of gamblers whose job is to literally predict the tumbling random number generators in the slot machines they play, and will be rewarded with thousands of dollars if they do a good job.
They are not experts. As said above, some things are too complicated to have expertise in.
It's plausible that geopolitics may work the same way, with the ones who get lucky mistaken for actual experts.
Absolute rubbish. There's lots of factual information here you can know and use to make informed "guesses" (if you will).
People like Musk, who are often absolutely clueless about countries' political situations, their people, their makeup, their relationships and agreements with neighboring countries, as well as their history and geography, are obviously going to be terrible at predicting outcomes compared to someone who actually has deep knowledge of these things.
Also we seem to be using the term "geopolitics" a bit loosely in this thread. Maybe we could inform ourselves what the term we are using even means before we discount that anyone could have expertise in it[1]. I don't think people here meant to narrow it down to just that. What we really seem to concern ourselves with here is international relations theory and political sciences in general.
Now whether most politicians should also be considered experts in these areas is another matter. From my personal experience, I'd say most are not. People generally don't elect politicians for being experts - they elect politicians for representing their uninformed opinions. There seems to be only a weak overlap between being competent at the actual job and the ability to be elected into it.
> There's lots of factual information here you can know and use to make informed "guesses" (if you will).
The gambler who learned the entire observable history of a tumbling RNG will not be in a better position to take the jackpot than the gambler who models it as a simple distribution. You cannot become an expert on certain things.
Geopolitics may or may not be one of these things, but you've made no substantial argument either way.
Geopolitics is a complex system. Having lots of factual and historical information to inform your decision is not obviously an advantage over a guess based on a cursory read of the situation.
It is like economists - they have 0 predictive power vs. some random bit player with a taste for stats when operating at the level of a country's economy. They're doing well if they can even explain what actually happened. They tend to get the details right but the big picture is an entirely different kettle of fish.
Geopolitics is much harder to work with than economics, because it covers economics plus distance and cultural barriers even before the problem of leaders doing damn silly things for silly reasons. And unlike economics there is barely the most tenuous of anchors to check if the geopolitical "experts" get things right even with hindsight. I'd bet the people who sent the US into Afghanistan and Iraq are still patting themselves on the back for a job well done despite what I think most people could accept as the total failure of those particular expeditions.
I thought Peter Zeihan was a geopolitical expert until he started talking about things I lived through, with complete ignorance of the basics. It's not that his take was wrong, it's that his basic underlying assumptions were all wildly different from reality on the ground.
Any sort of geopolitical expert is generally going to be labeled as such because he works in the domain at a reasonably high level.
The problem with that is that when at such a level, political factors start to come into play.
The net effect is that in any conflict, the winning side will have competent and qualified expert geopolitical analyses, while the losing side will have propagandists.
So the geopolitical expert is, at best, a liminal species.
So, you think the system is genuinely trying to identify expertise to achieve equitable outcomes, and just happening to fail at it? Rather than policy being shaped by personal networks and existing power structures that tend to benefit themselves?
I think the system has been carefully configured to benefit wealthy people and/or established asset owners. But the reason that there is no effective resistance to that is because identifying generalist socioeconomic experts is practically impossible.
They may exist, but the real expertise is mostly kept non-public. Regarding the Ukraine war, both pro-Russian and pro-American public pundits never mentioned economic and real strategic issues apart from NATO membership for almost 2.5 years.
Then Lindsey Graham outright mentioned the mineral wealth and it became a topic, though not a prominent one.
Access to the Caspian Sea via the Volga-Don canal and the Sea of Azov is never mentioned. Even though there are age old Rand corporation papers that demand more US influence in that region.
The best public pundits get personalities and some of the political history correct (and are entertaining), but it is always a game of omission on both sides.
That's wrong on so many levels, and cynical too. If the world worked the way you described, we would have obliterated ourselves a long time ago, or mass enslavement would have happened. It didn't.
Geopolitics can be studied and learned, and is something that diplomats heavily rely upon.
Of course, those geopolitical strategies can play in certain ways we don't foresee, as on the other side we also have an actor that is free to do what they want.
But for instance, if you give Mexico a very good trade agreement as a strong country like the US, it's very likely that they will work with you on your special requests.
With the CrowdStrike outage earlier last year, it was incredible how many hidden security and kernel "experts" came crawling out of the woodwork, questioning why anything needs to run in the kernel and predicting the company's demise.
They were correct that there is no need for it to run in the kernel. They were incorrect in thinking this would affect the company's future, because of course the sales of their product have nothing to do with its technical merit.
I think you've got it half correct: sales absolutely do have to do with technical merit. Their platform works; it's just that folks overestimated the impact of a single critical defect.
Nobody would pay CrowdStrike's prices if it didn't stop attacks or improve your detection chances (and I can assure you, it does, better than most platforms).
> Nobody would pay CrowdStrike's prices if it didn't stop attacks or improve your detection chances
In my experience people pay because they need to tick the audit box, and it's (marginally) less terrible than their competitors. Actually preventing or detecting an attack is not really a priority.
And yet CrowdStrike's stock price is still 28% up on where it was 12 months ago, and 46% up on 6 months ago, after their crash.
Sibling is right: that type of product has nothing to do with actually preventing problems; it's about outsourcing personal risk. Same as SaaS. Nobody got fired when Office 365 was down for the second day in a year, but have a 5-minute outage on your on-prem kit after 5 years and there are nasty questions to answer.
The crash is absolutely rational; the cascading effect highlights the missing moat for companies like OpenAI. Without a moat, no investor will provide these companies with the billions that fueled most of the demand. This demand was essential for NVIDIA to squeeze such companies with incredible profit margins.
NVIDIA was overvalued before, and this correction is entirely justified. The larger impact of DeepSeek is more challenging to grasp. While companies like Google and Meta could benefit in the long term from this development, they still overpaid for an excessive number of GPUs. The rise in their stock prices was assumed to be driven by the moat they were expected to develop themselves.
I was always skeptical of those valuations. LLM inference was highly likely to become commoditized in the future anyway.
It has been clear for a while that one of two things is true.
1) AI stuff isn't really worth trillions, in which case Nvidia is overvalued.
2) AI stuff is really worth trillions, in which case there will be no moat, because you can cross any moat for that amount of money, e.g. you could recreate CUDA from scratch for far less than a trillion dollars and in fact Nvidia didn't spend anywhere near that much to create it to begin with. Someone else, or many someones, will spend the money to cross the moat and get their share.
So Nvidia is overvalued on the fundamentals. But is it overvalued on the hype cycle? Lots of people riding the bubble because number goes up until it doesn't, and you can lose money (opportunity cost) by selling too early just like you can lose money by selling too late.
Then events like this make some people skittish that they're going to sell too late, and number doesn't go up that day.
One thing you're missing is that there's nothing that says the value must correct. There are at least two very good reasons it might not. One is that Nvidia now has huge amounts of money to invest in developing new technologies and exploring other ideas. The other is that very little of the stock market is about the actual value of the company itself; it's speculation. If people think it will go up, they buy, reducing supply and driving up the price. If people think it will go down, they sell, increasing supply and driving down the price. It is a self-fulfilling prophecy on a large scale, and completely secondary to the actual business.
Eventually people will sell their stock to invest in some business that is actually growing or giving proportional dividends.
Of course, that "eventually" is carrying way too much load. And it's very likely this won't happen while the US government is printing lots of money and distributing it to rich investors. But that second thing has to stop eventually too.
It's a lot of people holding the stock, you are expecting everybody to just not do it.
Private companies are different, but on publicly traded ones it tends to happen.
(Oh, you may mean that printing money part. It's a lot of people holding that money, eventually somebody will want to buy something real with it and inflation explodes.)
Yeah, the printing-money bit. Generously, one might even say that that's the reason for printing more money: make sure that the value of people's investments decays over time so there's no need for the market to crash to "get the money back out".
Related to your #2. I mentioned this elsewhere yesterday, but NVDA's margins (55% last quarter!) are a gift and a curse. They look great for the stock in the short term, but they also encourage their customers to aggressively go after them. Second, their best customers are huge tech companies that have the capital and expertise (or can buy it) to go after NVDA. DeepSeek just laid out a path to put NVDA's margins under pressure, hence the pullback.
2) Seems the most plausible, but how to value the moat, or, how long / how many dollars will it cost to overcome the moat? The lead that CUDA currently has suggests that it's probably a lot of money, and it's not clear what the landscape will look like afterwards.
It seems likely that the technology / moat won't just melt away into nothing, it'll at least continue to be a major player 10 years from now. The question is if the market share will be 70%, 10% or 30% but still holding a lead over a market that becomes completely fractured....
I think the analysis of (2) is too simplistic because it ignores network effects. A community of developers and users around a specific toolset (e.g. CUDA) is hard to just "buy". Imagine trying to build a better programming language than python -- you could do it for a trillion dollars, but good luck getting the world to use it. For a real example, see Meta and Threads, or any other Twitter competitor.
You have a trillion dollars in incentive. You can use it for more than just creating the software, you can offer incentives to use it or directly contribute patches to the tools people are already using so they support your system. Moreover, third parties already have a large motivation to use any viable replacement because they'd avoid the premium Nvidia charges for hardware.
You could apply this analysis to any of the other big tech innovations like operating systems, search, social media, ...
MS threw a lot of money at Windows Phone. I worked for a company that not only got access to great resources, but also plain money, just to port our app. We took the money and made the port. Needless to say, it still didn't work out for MS.
Those markets have a much stronger network effect (especially social media), or were/are propped up by aggressive antitrust violations, or both.
To use your example, the problem with entering the phone market is that customers expect to buy one phone and then use it for everything. So then it needs to support everything out of the gate in order to get the first satisfied customer, meanwhile there are millions of third party apps.
Enterprise GPUs aren't like that. If one GPU supports 100% of code and another one supports 10% of code, but you're a research group where that 10% includes the thing you're doing (or you're in a position to port your own code), you can switch 100% of your GPUs. If you're a cloud provider buying a thousand GPUs to run the full gamut of applications, you can switch what proportion of your GPUs that run supported applications, instead of needing 100% coverage to switch a single one. Then lots of competing GPUs get made and fund the competition and soon put the competition's GPUs into the used market where they become obtainium and people start porting even more applications to them etc.
It also allows the competition to capture the head of the distribution first and go after the long tail after. There might be a million small projects that are tied to CUDA, but if you get the most popular models running on competing hardware, by volume that's most of the market. And once they're shipping in volume the small projects start to add support on their own.
Why can't you just build something that's CUDA-compatible? You wouldn't have to move anyone over then. Or is the actual CUDA API patented? And would Chinese companies care about that?
AFAIK, CUDA is protected. There are patents, and the terms of use of the compiler forbid using it on other devices.
Of course, most countries will stomp all over the terms-of-use thing (or worse, use it as evidence to go after Nvidia), and will probably ignore the patents because they are anticompetitive. It's not only China that will ignore them.
AMD is actively working to recreate CUDA. "Haven't succeeded yet" is very different from having failed, and they're certainly not giving up.
Intel's fab is in trouble, but that's not the relevant part of Intel for this. They get a CUDA competitor going with GPUs built on TSMC and they're off to the races. Also, Intel's fab might very well get bailed out by the government and in the process leave them with more resources to dedicate to this.
Then you have Apple, Google, Amazon, Microsoft, any one of which have the resources to do this and they all have a reason to try.
Which isn't even considering what happens if they team up. Suppose AMD is useless at software but Google isn't and then Google does the software and releases it to the public because they're tired of paying Nvidia's margins. Suppose the whole rest of the industry gets behind an open standard.
A lot of things can happen and there's a lot of money to make them happen.
You don't need external competition to have NVDA correct. All it takes is for one or more of the big customers to say they don't need as many GPUs, for any reason. It could be that their in-house efforts are 'good enough', or that the new models are more efficient and take less compute, or that their shareholders are done letting them spend like drunken sailors. NVDA's stock was/is priced for perfection, and any sort of market or margin contraction will cause the party to stop.
The danger for NVDA is their margins are so large right now, there is a ton of money chasing them not just from their typical competition like AMD, but from their own customers.
While we can bet on "AMD are too sclerotic to fix their drivers even if it's an existential threat to the company", I don't think we can bet on "if we deny technology to China they won't try to copy it anyway".
The crash of NVIDIA is not about the moat of OpenAI.
But because DeepSeek was able to cut training costs from billions to millions (and with even better performance). This means cheaper training but it also proves that OpenAI was not at the cutting edge of what was possible in training algorithms and that there are still huge gaps and disruptions possible in this area. So there is a lot less need to scale by pumping more and more GPUs but instead to invest in research that can cut down the cost. More gaps mean more possibility to cut costs and less of a need to buy GPUs to scale in terms of model quality.
For NVIDIA that means that all the GPUs of today are good enough for a long time and people will invest a lot less in them and a lot more in research like this to cut costs. (But I am sure they will be fine)
This is partially why Apple is the one that stands to gain the most, and it showed.
Their "small models, on device" approach can only be perfected with something like DeepSeek, and they're not exposed to NVIDIA pricing, nor do they have to prove to investors that their approach is still valid.
I keep seeing this argument, but I don't buy it at all. I want a phone with an AGI, not a phone that is only AGI. Often it's just easier to press a button rather than talk to an AI, regardless how smart it is. I have no interest in natural language being the only interface to my device, that sounds awful. In public, I want to preserve my privacy. I do not want to have everyone listening in on what I'm doing.
If we can create an AGI that can literally read my mind, okay, maybe that's a better interface than the current one, but we are far away from that scenario.
Until then, I'm convinced users will prefer a phone with AI functionalities rather than the reverse. It's easier for a phone company to create such a phone than it is for an AI company.
Perennial reminder that we do not have any real evidence that we are anywhere close to AGI, or that "throwing more resources at LLMs" is even theoretically a possible way to get to an AGI.
"Lots of people with either a financial motivation to say so or a deep desire for AGI to be real Soon™ said they can do it" is not actual evidence.
We do not know how to make an AGI. We do not know how to define an AGI. It is hypothetically possible that we could accidentally stumble into one, but nothing has actually shown that, and counting on it is a fool's bet.
I don't understand why this is not obvious to many people: tech and stock trading are two totally different things. Why on earth would a tech expert be expected to know trading at all? Imagine how ridiculous it would be if a computer science graduate automatically got a financial degree from college despite never taking a finance class.
The people developing statistical models that operate in the financial markets at scale are the quants. These people don't come from a finance-degree background.
I've noticed this phenomenon among the IT & tech VC crowd. They will launch podcasts, offer expert opinions and whatnot on just about every topic under the Sun, from cold fusion to COVID vaccines to the Ukraine war.
You wouldn’t see this in other folks, for example, a successful medical surgeon won’t offer much assertion about NVIDIA.
And the general tendency among audience is to assume that expertise can be carried across domains.
> a successful medical surgeon won’t offer much assertion about NVIDIA.
You haven't met many surgeons, have you? When I was working in medical imaging, the technicians all said we (the programmers) were almost as bad as the surgeons.
> You wouldn’t see this in other folks, for example, a successful medical surgeon won’t offer much assertion about NVIDIA.
Doctors are actually known for this phenomenon. Flight schools particularly watch out for them because their overconfidence gets them in trouble.
And, though humans everywhere do this, Americans are particularly known for it. There are many compilation videos where Americans are asked their opinion on whether Elbonia needs to be bombed or not, followed by enthusiastic agreement. That's highly abnormal in most other countries, where "I don't know" is seen as an acceptable response.
> Anti-intellectualism has been a constant thread winding its way through our political and cultural life, nurtured by the false notion that democracy means that 'my ignorance is just as good as your knowledge.'
― Isaac Asimov
Systems, it’s all about systems thinking. It is absolutely true that people in tech are often optimistic and/or delusional about the other expertise at their command. But it’s not like the basic assumption here is completely crazy.
Being a surgeon might require thinking about a few interacting systems, but mostly the number and nature of those systems involved stay the same. Talented programmers without even formal training in CS will eat and digest a dozen brand new systems before breakfast, and model interactions mentally with some degree of fidelity before lunch. And then, any formal training in CS kind of makes general systems just another type of object. This is not the same as how a surgeon is going to look at a heart, or even the body as a whole.
Not that this is the only way to acquire skills in systems thinking. But the other paths might require, IDK, a phd in history/geopolitics, or special studies or extensive work experience in physics or math. And not to rule out other kinds of science or engineering experts as systems thinkers, but a surprisingly large subset of them will specialize and so avoid it.
By the numbers.. there are probably just more people in software/IT, therefore more of us to look stupid if/when we get stuff wrong.
Obviously general systems expertise can’t automatically make you an expert on particle physics. But honestly it’s a good piece of background for lots of the wicked problems[1], and the wicked problems are what everyone always wants to talk about.
But even if we just look at the examples given by the parent, most of them are not about systems or models at all. Epidemiology and politics concern practical matters of life. In such matters, life experience will always trump abstract knowledge.
Epidemiology and politics do involve systems, I’m afraid. We can call it “practical” or “human” or “subjective” all we like, but human behaviors exhibit the same patterns when understood from a statistical instead of an individual standpoint.
Epidemiology and politics are pretty much the poster children of systems[0], next to their eldest sibling, economics. Life and experience may trump abstract knowledge dumbly applied, but alone it won't let you reason at larger scales (not that you could collect any actual experience on e.g. pandemics to fuel your intuition here anyway).
A part of learning how to model things as systems is understanding your model doesn't include all the components that affect the system - but it also means learning how to quantify those effects, or at least to estimate upper bounds on their sizes. It's knowing which effects average out at scale (like e.g. free will mostly does, and quite quickly), and which effects can't possibly be strong enough to influence outcome and thus can be excluded, and then to keep track of those that could occasionally spike.
Mathematics and systems-related fields downstream of it provide us with plenty of tools to correctly handle and reason about uncertainty, errors, and even "unknown unknowns". Yes, you can (and should) model your own ignorance as part of the system model.
--
[0] - In the most blatant example of this, around February 2020, i.e. in the early days of the COVID-19 pandemic going global, you could quite accurately predict the daily infection stats a week or two ahead by just drawing up an exponential function in Excel and lining it up with the already reported numbers. This relationship held pretty well until governments started messing with the numbers and then lockdowns started. This was a simple case because at that stage, the exponential component was overwhelmingly stronger than any more nuanced factor - but identifying which parts of a phenomenon dominate and describing their dynamics is precisely what learning about systems lets you do.
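For what it's worth, that "exponential in a spreadsheet" exercise is only a couple of lines of code; the daily counts below are invented purely for illustration:

    # Invented daily case counts, just to illustrate the fitting exercise.
    observed = [100, 131, 172, 226, 296, 389, 511]

    # Estimate a constant daily growth factor from the endpoints.
    days = len(observed) - 1
    growth = (observed[-1] / observed[0]) ** (1 / days)

    # Project the next week by compounding that factor forward.
    projection = [round(observed[-1] * growth ** d) for d in range(1, 8)]
    print(f"estimated daily growth factor: {growth:.2f}")
    print("one-week projection:", projection)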
This is exacerbated by the tendency in popular media to depict a Scientist character, who can do all kinds of Science (which includes technology, all kinds of computing, and math).
It's because software devs are smart and make a lot of money - a natural next step is to try and use their smarts to do something with that money. Hence stocks.
>It's because software devs are smart and make a lot of money
They just think they're smart BECAUSE they make a lot of money. Just because you can center divs for six figures a year at a F500 doesn't make you smart at everything.
I've never met a fellow software engineer who "centers divs" for 6 figures.
But then I work with engineers using FPGAs to trade in the markets with tick to trade times in double digit nanoseconds and processing streams of market data at ~10 million messages per second (80Gbps)
The truth is, a lot of P&L in trading these days is a technical feat of mathematics and engineering and not just one of fundamental analysis and punting on business plans
If you were really smart surely you would be able to see that there are more long-term valuable things for you to do with your time than just make yourself more money...
Tech people are allowed to quickly learn a domain enough to build the software that powers it, bringing in insights from other domains they've been across.
Just don't allow them to then comment on that domain with any degree of insight.
No, nvidia's demand and importance might reduce in the long term.
We are forgetting that China has a whole hardware ecosystem. Now we learn that building SOTA models does not require SOTA hardware in massive quantities from Nvidia. So the crash in the market could implicitly mean that the (hardware) monopoly of American companies is not going to last more than a few years. The hardware moat is not as deep as the West thought.
Once China brings scale like it did to batteries, EVs, solar, infrastructure, drones (etc) they will be able to run and train their models on their own hardware. Probably some time away but less time than what Wall Street thought.
This is actually more about Nvidia than about OpenAI. OpenAI owns the end interface and will be generally safe (maybe at a smaller valuation). In the long term, Nvidia is more replaceable than you think it is. Inference is going to dominate the market - it's going to be Cerebras, Groq, AMD, Intel, Nvidia, Google TPUs, Chinese TPUs, etc.
On the training side, there will be less demand for Nvidia GPUs as Meta, Google, Microsoft etc. extract efficiencies from the GPUs they already have, given the embarrassing success of DeepSeek. Now, China might have been another insatiable market for Nvidia, but the export controls have ensured that it won't be.
>On the training side, there will be less demand for Nvidia GPUs as Meta, Google, Microsoft etc. extract efficiencies from the GPUs they already have, given the embarrassing success of DeepSeek. Now, China might have been another insatiable market for Nvidia, but the export controls have ensured that it won't be.
Why? If DeepSeek made training 10x more efficient, just train a 10x bigger model. The end goal is AGI.
You are assuming that a 10x bigger model will be 10x better or will bring us closer to AGI. It might be too unwieldy to do inference on. Or the gain in performance may be minor, and more scientific thought may need to go into the model before it can reap the reward of more training. Scientific breakthroughs sometimes take time.
I’m not assuming 10x bigger will yield 10x better. We have scaling laws that can tell you more.
But I find it bizarre that you made the conclusion that AI has stopped scaling because DeepSeek optimized the heck out of the sanctioned GPUs they had. Weird.
I have not said that. I simply said that you now know you can get more juice for the amount you spend. If you've just learnt this, you would first ask your engineers to improve your model to scale it, rather than place any further orders with Nvidia to scale it. Only once you think you have got the most out of the existing GPUs would you buy more. DeepSeek has made people wonder whether their engineers have missed more such things, and maybe they should pause spending to make sure before sinking in more billions. It breaks the hegemony of the spend-more-to-dominate attitude that was gripping the industry, e.g. the $500 billion planned spend by the OpenAI consortium.
It doesn’t break the attitude. The number one problem DeepSeek’s CEO stated in an interview is they don’t have access to more advanced GPUs. They’re GPU starved.
There’s no reason why American companies can’t use DeepSeek’s techniques to improve their efficiency but continue the GPU arms race to AGI.
Baader-Meinhof phenomenon, but also because everyone is writing about GPU demand and Jevons paradox is an easy way to express the idea in a trite keyword.
I never knew there was an actual term for this, but I knew of the concept in my professional work because this situation often plays out when the government widens roads here in the States. Ostensibly the road widening is intended to lower congestion, but instead it often just causes more people to live there and use it, thereby increasing congestion.
Probably a decent number of professions have some variation of this, so it probably is accurate to say most people know OF Jevons paradox, because it's pretty easy to dig up examples of it. But probably many fewer know its actual name, or even that it has a name.
IMHO it happens as long as you can find use cases that were previously unfeasible due to cost or availability constraints.
At some point the thing no longer brings any benefits because other costs or limitations take over. For example, even faster broadband is no longer that big of a deal because your experience on most websites is now limited by their servers' ability to process your request. However, maybe in the future the costs and speeds will be so amazing that all user devices become thin clients and no one will care about their devices' processing power, and therefore one more increase in demand can happen.
The increase in efficiency is usually accompanied by commoditization as stuff gets cheaper to develop, which is very bad news for Nvidia.
If you don't need the super-high-end chips, then Nvidia loses its biggest moat and ability to monopolize the tech; CUDA isn't enough.
> Nvidia loses its biggest moat and ability to monopolize the tech; CUDA isn't enough
CUDA is plenty for right now. AMD can't/won't get their act together with GPU software and drivers. Intel isn't in much better of a position than AMD and has a host of other problems. It's also unlikely the "let's just glue a thousand ARM cores together" hardware will work as planned and still needs the software layer.
CUDA won't be an Nvidia moat forever but it's a decent moat for the next five years. If a company wants to build GPU compute resources it will be hard to go wrong buying Nvidia kit. At least from a platform point of view.
CUDA will still be a moat for the near future, and nobody is saying that Nvidia will die, but the thing is that Nvidia's margins will drop like crazy and so will its valuation. It will go back down to being a "medium tech" company.
Basically, training got way cheaper, and for inference you don't really need Nvidia, so even if there's an increase in demand for cheaper chips, there's no way the volume makes up for the loss of margin.
No, Nvidia's margins won't drop at all and the proof for this is Apple.
The units of AI accelerators will explode, the market will explode.
At the end of the day, Nvidia will have 20-30% of the unit share in AI HW and 70-80% of the profit share in the AI HW market. Just like Apple makes 3x the money compared to the rest of the smartphone market.
Jensen has considered Nvidia a premium vendor for 2 decades, and the track record of Nvidia's margins shows this.
And while Nvidia remains a high premium AI infrastructure vendor, they will also add lots of great SW frameworks to make even more profit.
Omniverse has literally no competition. That digital world simulation combines all of Nvidia's expertise (AI HW, Graphics HW, Physics HW, Networking, SW) into one huge product. And it will be a revolution because it's the first time we will be able to finally digitalize the analog world. And Nvidia will earn tons of money because Omniverse itself is licensed, it needs OVX systems (visual part) and it needs DGX systems (AI part).
Don't worry, Nvidia's margins will be totally fine. I would even expect them to be higher in 10 years than they are today. Nobody believes that but that's Jensen's goal.
There is a reason why Nvidia has always been the company with the highest P/S ratio, and anyone who understands why will see the quality of the management immediately.
Why should they invest in Nvidia now instead of investing in companies which can capitalize on the applications of AI?
Also, why invest in Nvidia and not AMD or Intel until now? Because Nvidia had the moat and there was a race to buy as many GPUs as possible. Now, momentarily, Nvidia sales will go down.
For long-term investors who are investing in the future, not the present, Nvidia was way overpriced. They will start buying when the price is right, but at the moment it's still way too high. Nvidia is worth 20-30 billion or so in reality.
Part of Nvidia's valuation was due to the perception that AI companies would need lots and lots of GPUs, which is still true. But I think the main cause of the selloff was that another part of the popular perception was that Nvidia was the only company that could make powerful enough GPUs. Now that it has been shown that you might not need the latest and greatest to compete, who knows how many other companies might start to compete for some of that market. Nvidia just went from a perceived monopolist to "merely" a leading player in the AI supplier market, and the expected future profits have been adjusted accordingly.
I was under the impression, too, that this would bump retail customer demand for the 50 series, given the extra AI and CUDA cores, plus the relatively low cost of the hardware. But I know nothing of the sentiment around Wall Street.
That said, I don't feel like upgrading my 4090. Maybe Wall Street believes that the large company deals that have driven the price up for so long might slow down?
Or I'm completely wrong on the impact of the hardware upgrades.
Output quantity consumed (almost) always increases with falling inputs (i.e., costs, whether in dollars or GPUs). But for Jevons paradox to hold, the slope of quantity-consumption-increase-per-falling-costs must exceed a certain threshold. Otherwise, the result is just that quantity consumed increases while the quantity of inputs consumed decreases.
Applied to AI and NVIDIA, the result of an increase in the AI-per-GPU on demand for GPUs depends on the demand curve for AI. If the quantity of AI consumed is completely independent of its price, then the result of better efficiency is cheaper AI, no change in AI quantity consumed, and a decrease in the number of GPUs needed. Of course, that's not a realistic scenario.
(I'm using "consumed" as shorthand; we both know that training AIs does not consume GPUs and AIs are also not consumed like apples. I'm using "consumed" rather than the term "demand" because demand has multiple meanings, referring both to a quantity demanded and a bid price, and this would confuse the conversation).
But a scenario that is potentially realistic is that as the cost of training/serving AI drops by 90%, the quantity of AI consumed increases by a factor of 5, and the end result is that the economy still only needs half as many GPUs as it needed before.
For Jevons paradox to hold, if the efficiency of converting GPUs to AI increases by X, so that the price falls to 1/X of what it was, the quantity of AI consumed must increase by a factor of more than X as a result of that price decrease. That's certainly possible, but it's not guaranteed; we basically have to wait to observe it empirically.
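A toy sketch of that threshold, using an assumed constant-elasticity demand curve (the curve shape and all numbers are illustrative, not an empirical claim):

    # Assume a constant-elasticity demand curve for "AI": quantity ~ price ** (-elasticity).
    def relative_gpu_demand(efficiency_gain: float, elasticity: float) -> float:
        """GPU demand after AI gets `efficiency_gain`x cheaper, relative to before."""
        price_ratio = 1.0 / efficiency_gain               # AI price falls to 1/X
        ai_quantity_ratio = price_ratio ** (-elasticity)  # AI consumption rises
        return ai_quantity_ratio / efficiency_gain        # GPUs = AI quantity / AI per GPU

    for elasticity in (0.5, 1.0, 1.5):
        print(f"elasticity {elasticity}: relative GPU demand = "
              f"{relative_gpu_demand(10, elasticity):.2f}")
    # 0.5 -> 0.32 (fewer GPUs), 1.0 -> 1.00 (unchanged), 1.5 -> 3.16 (Jevons holds):
    # GPU demand only rises if AI consumption grows by more than the efficiency gain.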
There's also another complication: as the efficiency of producing AI improves, substitutes for datacenter GPUs may become viable. It may be that the total amount of compute hardware required to train and run all this new AI does increase, but big-iron datacenter investments could still be obsoleted by this change because demand shifts to alternative providers that weren't viable when efficiency was low. For example, training or running AIs on smaller clusters or even on mobile devices.
If tech CEOs really believe in Jevons paradox, it means that, having decided last month to invest $500 billion in GPUs, this month, after learning of DeepSeek, they should realize $500 billion is not enough and they'll need to buy even more GPUs, and pay even more for each one. And, well, maybe that's the case. There's no doubt that demand for AI is going to keep growing. But at some point, investment in more GPUs trades off against other investments that are also needed, and the thing the economy is most urgently lacking ceases to be AI.
> I say DeepSeek should increase Nvidia’s demand due to Jevon’s Paradox.
If their claims are true, DeepSeek would increase the demand for GPUs. It's so obvious that I don't know why we even need a name to describe this scenario (I guess "Jevons paradox" just sounds cool).
The only issue is whether it would make a competitor to Nvidia viable. My bet is no, but the market seems to have bet yes.
> DeepSeek should increase Nvidia’s demand due to Jevon’s Paradox.
How exactly? From what I've read, the full model can run on MacBook M1 sort of hardware just fine. And this is their first release; I'd expect it to get more efficient, and maybe domain-specific models can be run on much lower-grade hardware, Raspberry Pi sort of stuff.
I agree, but in the short/medium term I think it will slow down, because companies will now prefer to invest in research to optimize (training) costs rather than in those very expensive GPUs. Only when the scientific community reaches the edge of what is possible in terms of optimization will it go back to pumping GPUs like today. (Although small actors will continue to pump GPUs, since they do not have the best talent to compete.)
The other way is certainly also true. Your short piece is rational, but lacks insight into the inference and training dynamics of ML adoption unconstrained.
The rate of ML progress is spectacularly compute-constrained today. Every step in today's scaling program is set up to de-risk the next scale-up, because the opportunity cost of compute is so high. If the opportunity cost of compute is not so high, you can skip the 1B to 8B scale-ups and grid search data mixes and hyperparameters.
The market/concentration risk premium drove most of the volatility today. If it was truly value driven, then this should have happened 6 months ago when DeepSeek released V2 that had the vast majority of cost optimizations.
Cloud data center CapEx is backstopped by the providers' growth outlook driven by the technology, not by GPU manufacturers. Dollars will shift just as quickly (like how Meta literally tore down a half-built data center in 2023 and restarted it to meet new designs).
I think it's entirely possible that one categorically can't think correctly about markets and equity valuations since they are vibes-based. Post hoc, sure, but not ahead of time.
The crux of it is that most people don't care about the fundamentals of equity valuations. If they can make money via derivatives, who cares about the underlying valuations? Just look at GME for one example; it's been mostly a squeeze-driven play between speculators. And then you have the massive dispersion trade that's been happening on the S&P 500 over the last year-plus. And when most people invest in index funds, and index funds are weighted mostly by market cap, value investing has been essentially dead for a while now.
"Briefly stated, the Gell-Mann Amnesia effect is as follows. You open the newspaper to an article on some subject you know well. In Murray's case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward—reversing cause and effect. I call these the "wet streets cause rain" stories. Paper's full of them.
In any case, you read with exasperation or amusement the multiple errors in a story, and then turn the page to national or international affairs, and read as if the rest of the newspaper was somehow more accurate about Palestine than the baloney you just read. You turn the page, and forget what you know."
Because your comment was posted 9 hours ago, I have no idea what view you think is wrong. Could you explain what the incorrect view is and — ideally — what’s wrong with it?
Let's not forget that it also doesn't make you an expert in history and the evolution of societies, as we can all agree that part of the HN public still believes in “meritocracy” alone and thinks that DEI programs are useless.
Equity valuations based on AI hardware's future earnings changed dramatically in the last day. The belief that demand for NVIDIA's products is insatiable for the near future has been dented, and the assumption that energy is the biggest bottleneck might not hold.
Lots to figure out from this information, but the playbook has radically changed.
That means the remaining 10% are similarly deluded by the impression that Apple or AMD could "just write" a CUDA alternative and compete on the merits. You don't want either of those people spending their money on datacenter bets.
10 years ago people said OpenCL would break CUDA's moat, 5 years ago people said ASICs would beat CUDA, and now we're arguing that older Nvidia GPUs will make CUDA obsolete. I have spent the past decade reading delusional eulogies for Nvidia, and I still find people adamant they're doomed despite being unable to name a real CUDA alternative.
Did ASICs beat CUDA out in crypto coin mining? Not the benchmark I really care about, but if things slow down in AI (they probably won’t) ASICs could probably take over some of it.
NVIDIA sells shovels to the gold rush. One miner (Liang Wenfeng), who has previously purchased at least 10,000 A100 shovels... has a "side project" where they figured out how to dig really well with a shovel and shared their secrets.
The gold rush, whether real or a bubble, is still there! NVIDIA will still sell every shovel they can manufacture, as soon as it is available in inventory.
Fortune 100 companies will still want the biggest toolshed to invent the next paradigm or to be the first to get to AGI.
Jevons paradox would imply that there's good reason to think that demand for shovels will increase. AI doesn't seem to be one of those things where society as a whole will say, "we have enough of that; we don't need any more".
(Many individual people are already saying that, but they aren't the people buying the GPUs for this in the first place. Steam engines weren't universally popular either when they were introduced to society.)
I also don't get how this is bearish for NVDA. Before this, small to mid-size companies would give up on fine-tuning their own model because OpenAI was just so much better and cheaper. Now DeepSeek's SOTA model gives them a much better baseline model to train on. Wouldn't more people want to RAG on top of DeepSeek? Or some startup's accountant would run the numbers and figure they can just inference the shit out of DeepSeek locally, and in the long run still come out ahead of using the OpenAI API.
Either way, that means a lot more NVDA hardware being sold. You still need CUDA, as ROCm is still not there yet. In fact, NVDA needs to churn out more CUDA GPUs than ever.
Because their biggest buyer(s) just ran into a buzzsaw. NVDA's stratospheric valuation is based on a few select customers' unrestrained ability to purchase the latest products. That unrestrained spending was fueled by the "AI" arms race. If those companies see their ability to be profitable with "AI" as diminished, then their ability to continue spending with NVDA is probably going to be diminished as well. Anything considered bad for NVDA's top spenders is going to be viewed as bad for NVDA too.
At a guess, Nvidia stock prices are basically fiction at this point (is there a lot of short AND long selling going on - IIRC, butterfly spreads?).
A good fundamental analysis is probably very hard to get right, and the game is probably just guessing which way everyone else will guess the herd is going to jump.
IBM probably couldn't have done things differently given the antitrust scrutiny they were under. And that was likely the best outcome for the industry anyway.
The bear case is something like “investors are going to call BS on multi-billion training DC investments”. That represents most of their short-run demand.
Not sure what is supposed to happen to the inference demand but I guess that could be modeled as more of a long-run thing, as inference is going to be very coin-operated (companies need it to be net profitable now) whereas training is more of a build now profit later game.
Jevons was talking about coal as an input to commercial processes, for which there were other alternatives that competed on price (e.g. manual/animal labour). Whatever the process, it generated a return, it had utility, and it had scale.
I argue it doesn't apply to generative AI because its outputs are mostly no good, have no utility, or are good but only in limited commercial contexts.
In the first case, a machine that produces garbage faster and cheaper doesn't mean demand for the garbage will increase. And in the second case, there aren't enough buyers for high-quality computer-generated pictures of toilets to meaningfully boost demand for Nvidia's products.
I recently had a discussion with a senior executive and his take on AI changed my outlook a bit. For him the value of ChatGPT(tm) wasn't so much the speedup on any particular task (like presentation generation); it's a replacement for consultants.
Yes, the value of those consultants mostly exists only if your internal team is too stubborn to change its opinion. But that seems to be the norm. And the value (those) consultants add is not that high in the first place! They don't have the internal knowledge of _why_ things are fucked up _your particular way_ anyway. That part your team has to contribute anyhow. So the value-add shrinks to "throw ideas over the wall and see what sticks". And LLMs are excellent at that.
Yes, that doesn't replace a highly technical consultant that does the actual implementation. Yes, that doesn't give you a good solution. But it probably gives you 5 starting points for a solution before you even finish googling which consultancy to pick (and then waiting for approval and hoping for a goodish team). And that's a story that I can map to reality (not that I like this new bit of information about reality..)
If we accept that story about LLM value, then I think NVIDIA is fine. The generated value is far greater than any amount of energy you can burn on inferring prompts, and the only effect will be that the compute-for-training to compute-for-inference ratio decreases further.
"Throw ideas and see what sticks" sounds very entry-level. Maybe it saves time it would take for one of your team to read first two chapters of a book on the topic.
That exec was hiring consultants and no longer is, in meaningful proportion, thanks to LLMs?
Thing is, most code is written by entry-level/junior programmers, as the whole career path has been stacked to start grooming you for management afterwards, and anything beyond senior level is basically faux-management (all the responsibilities, none of the prestige). LLMs, dirt-cheap as they are and only getting cheaper, are very much in position to compete with the bulk of workforce in software industry.
I don't know how things are in other white-collar industries (except wrt. creative jobs like copywriting and graphic design, where generative AI is even better at the job than it is at coding), but the incentives are similar, so I expect most of the actual work is done by juniors anyway, and subject to replacement by models less sophisticated than people would like to imagine they need to be.
The other thing is that if this pushes the envelope further on what AI models can do given a certain hardware budget, this might actually change minds. The pushback against generative AI today is that much of it is deployed in ways that are ultimately useless and annoying at best, and that in turn is because the capabilities of those models are vastly oversold (including internally in companies that ship products with them). But if a smarter model can actually e.g. reliably organize my email, that's a very different story.
When you get more marginal product from an input, it's expected you buy more of that input.
But at some point, if the marginal product gets high enough, the world needs not as many, because money spent on other inputs/factors pays off more.
This is a classic problem with extrapolation. Making people more efficient through the use of AI will tend to increase employment... until it doesn't and employment goes off a cliff. Getting more work done per unit of GPU will increase demand for GPUs ... until it doesn't, and GPU demand goes off the cliff.
It's always hard to tell where that cliff is, though.
Not even Microsoft Copilot 365 had a successful launch. The same happened with Apple Intelligence.
People talk like the end-user demand part of the equation is really solved, invoking an Econ 101 magical interpretation of the Law of Supply and Demand or the Jevons paradox.
I sure hope not, because what they've released so far is worse than useless. The "notifications summaries" in particular are hilariously bad. It's the same problem with Google's "AI Overview"--wrong or misleading enough that you simply can't trust any of it.
But to be clear, none of these things are commercial products. They're gimmicks. Google is an ads company, they make their money selling ads. Apple is a computer company, they make their money selling computers. These "AI products" are a circus sideshow.
Yeah, lots of them. I never thought I'd be paying for a search subscription but after a few months of using ChatGPT I expect to be paying for the privilege from now on. Maybe not paying OpenAI, but someone. There isn't much of a moat there, so there are going to be many companies basically on-selling GPU time. And even if for some weird reason there is no commercially successful AI-specific product it is causing shockwaves in how work is done, most people I know who are effective have worked it into their workflow somehow.
With other providers giving away similar products for free (Google AI Studio, DeepSeek et al) right now, I'm not sure that counts as commercial success when it is not sustainable.
The same is happening in enterprise-tier products: Copilot 365 is still an extra SKU to count, while Google Gemini Advanced has been integrated into the Workspace offering (i.e. they actually force an upsell of ~20% per user license for something we didn't ask for, but I digress). At least that's a better alternative than paying +20 USD per license.
Prices need to and will go down, and business models will have to change and they are already doing so. But I'm not sure if OpenAI is really ready for that.
I gave it an easy one, “How many of the actors from the original Star Trek are still alive”. It gave me accurate information as of its training cut off date. But ChatGPT automatically did a web search to validate its answer. I had to choose the search option for it to look up later info.
With ChatGPT even when it doesn’t do a web search automatically, I can either tell it to “validate this” or put in my prompt “validate all answers by doing a web lookup”.
Then I gave it a simple compounding interest problem with monthly payments and wanted a month by month breakdown. DeepSeek used its multi step reasoning like o1 and was slower. ChatGPT 4o just created a Python script and used its internal Python interpreter.
Then DeepSeek started timing out.
This is the presentation of “what are some of the best places to eat in Chicago?”
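For readers curious what "just created a Python script" looks like in practice, here is a hypothetical sketch of that kind of month-by-month compounding breakdown; the principal, rate, payment, and term are made-up example values, not the ones from the test described above.

```python
# Hypothetical example values; the original prompt's numbers aren't given above.
principal = 10_000.00      # starting balance
annual_rate = 0.05         # 5% nominal annual rate, compounded monthly
monthly_payment = 200.00   # deposit added at the end of each month
months = 12

balance = principal
for month in range(1, months + 1):
    interest = balance * annual_rate / 12      # interest accrued this month
    balance += interest + monthly_payment      # compound, then add the deposit
    print(f"Month {month:2d}: interest = {interest:7.2f}, balance = {balance:10.2f}")
```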
Are they losing billions on training or inference? If their current products - ChatGPT and the API - are profitable, i.e. the inference cost is less than they charge, they have a long-term sustainable business.
I meant 10 million paying subscribers not 10 million dollars. I put the dollar sign there by mistake. That’s $2.4 billion in revenue and growing not counting API customers.
The question is whether ChatGPT (the product) and running the API are profitable, or at least whether the trend is that costs are coming down.
>Really? Has anyone made a useful, commercially successful product with it yet?
Aren't millions or even tens of millions of students using ChatGPT, for example? To me that sounds like a commercial success (and looks comparable to the usage of Google Search - a money-printing machine for almost 30 years now - in its first years)
And enterprise-wise - I recently heard a VP complaining about entering expenses. As we don't have secretaries anymore in the civilized world, that means "Agents AI" is going to have a blast.
(I'm long on NVDA and wondering if there's enough blood in the streets to buy more :)
It isn't, because it's not making them any money. Having users doesn't mean you have a business. If you sell two dollars for one dollar, having more users is not a blessing financially. Of course you could slap ads on it, like Google, but unlike Google, OpenAI has no moat and there are already ten competitors. Competition eliminates profit, and AI is being commoditized faster than pretty much anything else.
NVidia's not selling LLM subscriptions, they're selling shovels in the goldrush. I don't think 3 trillion is a reasonable valuation either, but NVidia's applications extend way beyond consumer and they've effectively become the chokepoint for any application of AI
>and there's already ten potential competitors. Competition eliminates profit and AI is being commoditized faster than pretty much anything else.
we're discussing NVDA. Where are its competitors? ChatGPT having 10 competitors only makes things better for NVDA.
>Competition eliminates profit
Competition weeds out bad/ineffective performers, which is great. The history of our industry is littered with competition taking out bad performers, and the industry is only better for it. Fast commoditization of AI is just great and fits the best patterns, like, say, the PC revolution (and like it, I think the AI revolution won't be just one app/use case; it will be a tectonic shift).
> Aren't millions or even tens of millions of students using ChatGPT, for example? To me that sounds like a commercial success
I read somewhere that OpenAI brought in $3.7 billion in 2024, and made a loss of $6 billion. So... no I don't think that is an example. They want to make a commercially successful product, but ChatGPT doesn't seem to be there yet.
> And enterprise-wise - I recently heard a VP complaining about entering expenses. As we don't have secretaries anymore in the civilized world, that means "Agents AI" is going to have a blast.
We don't have secretaries because the word became unfashionable. They are called PAs or executive assistants or something like that now. They're still there, but if anything the need for them has probably been reduced with (non-AI) computers (calendars, contacts, emails, electronic documents, etc.) so I'm not sure that there is some enormous unmet demand for them.
What you are missing is that it turns out the gold isn’t actually gold. It’s bronze.
So at first, the shovelers were willing to spend thousands of dollars on a single shovel because they were expecting to get much more valuable gold out the other end.
But now that it's only bronze, they can't spend that much money on their tools anymore and still make their venture profitable. A lot of shovelers are gonna drop out of the race. And the ones that remain will not be willing to spend as much.
The fact that there isn't as much money to be made in AI anymore means that NVIDIA's cut of the total money to be made in AI will now shrink dramatically.
Perhaps they mean there's less wealth to be extracted from the closed-source training side of the equation, which requires huge capital investment, and promises even bigger returns by gatekeeping the technology.
Many discussed aspects are disconnected. Cost of training, cost of hardware(and margin there), cost of operation, possible use cases, and then finally demand.
Cheaper training still presupposes there is some use case for the trained models. There might or might not be. It could very well be that the cost of training was never what limited the number of usable models.
The gold rush is over because pre-trained models don't improve as much anymore. The application layer has massive gains in cost-to-value performance. We also gain more trust from the consumer as models don't hallucinate as much. This is what DeepSeek R1 has shown us. As Ilya Sutskever said, pre-training is now over.
We now have very expensive Nvidia shovels that use a lot of power but do very little improvement to the models.
Can anyone comment on why Wenfeng shared his secret sauce? Other than publicity, there only seem to be downsides for him, as now everyone else with more compute will just copy and improve?
Well, American investors seem to be shaking in their boots, and publicizing this attracts AI investment in China because it shows China can compete with, and even outcompete, the US in spite of the restrictions.
Yeah but NVIDIA's amazing digging technique that could only be accomplished with NVIDIA shovels is now irrelevant. Meaning there are more people selling shovels for the gold rush
DeepSeek's stuff is actually more dependent on Nvidia shovels. They implemented a bunch of assembly-level (PTX) optimizations below the CUDA stack that allowed them to use their H800s efficiently, which are interconnect-bandwidth-gimped vs. the H100s they can't easily buy on the open market. That's cool, but it doesn't run on any other GPUs.
Cue all of China rushing to Jensen to buy all the H800s they can before the embargo gets tightened, now that their peers have demonstrated that they're useful for something.
At least briefly, Jensen's customer audience increased.
I was thinking about that, but don’t those same optimizations work on H100s? and the concepts work on every other chip from Nvidia and every other manufacturer’s chip
I still think this is bullish: more people will be buying chips once they're cheaper and more accessible, and the things they will be training will be 1,000% to 10,000% larger.
"Probably possible" is nothing compared to "already implemented". How long will it take to apply those concepts to other chips? Will they also be made available to the degree DeepSeek's have been? By the time those alternatives are implemented, how much further improvement will have been made on Nvidia chips? Worst case, someone implements and open-sources these optimizations for a competitor's chip basically immediately, in which case the competitive landscape remains unchanged; in every other scenario this is a first-mover advantage for Nvidia.
> You can train, or at least run, llms on intel and less powerful chips
The claimed training breakthrough is an optimization targeting Nvidia chips, not something that reduces Nvidia's relative advantage. Even if it is easily generalizable to other vendors' hardware, it doesn't reduce Nvidia's advantage over other vendors; it just proportionately scales down the training requirements for a model of a given capacity. That might, very short term, reduce demand from the big existing incumbents, but it also increases the number of players for which investing in GPUs for model training is worthwhile at all, increasing aggregate demand.
It's not an optimization targeting Nvidia chips. It's an optimization of the technique through and through regardless of chip
But your point is well taken and perhaps both mine and GP's metaphors break down.
Either way, we saw massive spikes in demand for Nvidia when crypto mining became huge followed by a massive drop when we hit the crypto winter. We saw another massive spike when LLMs blew up and this may just be the analogous drop in demand for LLMs
You both seem to be talking past each other. There were a number of optimizations that made this possible. Some were with the model itself and are transferable, others are with the training pipeline and specific to the Nvidia hardware they trained on.
> What is stopping huawei or other Chinese vendors to make chips on deepseek specification
What is "deepseek specification"? Deepseek was trained on NVDA chips. If chinese vendors could build chips as good as NVDA it wouldn't have such a dominant position already, that hasn't changed
The thing with a gold rush is you often end up selling shovels after the gold has run out, but no one knows that until hindsight. There will probably be a couple of scares that the gold has run out first, too. And again, the difference is only visible in hindsight.
I find it interesting because the DeepSeek stuff, while very cool, doesn't seem to invalidate the idea that more compute would translate to even _higher_ capabilities?
It's amazing what they did with a limited budget, but instead of the takeaway being "we don't need that much compute to achieve X", it could also be, "These new results show that we can achieve even 1000*X with our currently planned compute buildout"
But perhaps the idea is more like: "We already have more AI capabilities than we know how to integrate into the economy for the time being" and if that's the hypothesis, then the availability of something this cheap would change the equation somewhat and possibly justify investing less money in more compute.
Probably not. If the price of Nvidia is dropping, it's because investors see a world where Nvidia hardware is less valuable, probably because it will be used less.
You can't do the distill/magnify cycle like you do with AlphaGo. LLMs have basically stalled in their base capabilities; pre-training is basically over at this point, so the new arms race will be over marginal capability gains and (mostly) making them cheaper and cheaper.
But inference time scaling, right?
A weak model can pretend to be a stronger model if you let it cook for a long time. But right now it looks like models as strong as what we have aren't going to be very useful even if you let them run for a long, long time. Basic logic problems still tank o3 if they're not a kind that it's seen before.
Basically, there doesn't seem to be a use case for big data centers that run small models for long periods of time, they are in a danger zone of both not doing anything interesting and taking way too long to do it.
The AI war is going to turn into a price war, by my estimations. The models will be around as strong as the ones we have, perhaps with one more crank of quality. Then comes the empty, meaningless battle of just providing that service for as close to free as possible.
If OpenAI's agents had panned out we might be having another conversation. But they didn't, and it wasn't even close.
This is probably it. There's not much left in the AI game
Your implication is that we have unlimited compute and therefore know that LLMs are stalled.
Have you considered that compute might be the reason why LLMs are stalled at the moment?
What made LLMs possible in the first place? Right, compute! The Transformer is 8 years old; technically GPT-4 could have been released 5 years ago. What stopped it? Simple: the available compute was way too low.
Nvidia has improved compute by 1000x in the past 8 years but what if training GPT5 takes 6-12 months for 1 run based on what OpenAI tries to do?
What we see right now is that pre-training has reached the limits of Hopper, and Big Tech is waiting for Blackwell. Blackwell will easily be 10x faster in cluster training (don't look at chip performance only), and since Big Tech intends to build 10x larger GPU clusters, they will have 100x the compute.
Let's see then how it turns out.
The limit on training is time. If you want to make something new and improved, then you have to limit training time, because nobody will wait 5-6 months for results anymore.
It was fine for OpenAI years ago to take months to years for new frontier models. But today the expectations are higher.
There is a reason why Blackwell is fully sold out for the year. AI research is totally starved for compute.
The best thing for Nvidia is also that while AI research companies compete with each other, they all try to get Nvidia AI HW.
The age of pre-training is basically over, I think everyone acknowledged this and it's not to do with not having a big enough cluster. The bull argument on AI is that inference time scaling will pull us to the next step
Except the o3 benchmarks are, seemingly, pretty solid evidence that leaving LLMs on for the better part of a day and spending a million dollars gets you... nothing. Passing a basic logic test with brute-force methods, then falling apart on a marginally easier test it just wasn't trained on.
The returns on compute and data seem to be diminishing, with exponential increases in inputs returning only marginal increases in quality, and we're out of quality training data, so that side is now much worse even if the scaling weren't plateauing.
All this, and the scale that got us this far seems to have done nothing to give us real intelligence; there's no planning or real reasoning, and this is demonstrated every time it tries to do something out of distribution, or even in distribution but just complicated. Even if we got another crank or two out of this, we're still at the bottom of the mountain. We haven't started and we're already out of gas.
Scale doesn't fix this any more than building a mile-tall fence stops the next break-in. If it was going to work, we would have seen it work already. LLMs don't have much juice left in the squeeze, imo.
We don't know, for example, what a larger model can do with the new techniques DeepSeek is using for improving/refining it. It's possible the new models on their own failed to show progress, but a combination of techniques will enable that barrier to be crossed.
We also don't know what the next discovery/breakthrough will be like. The reward for getting smarter AI is still huge and so the investment will likely remain huge for some time. If anything DeepSeek is showing us that there is still progress to be made.
Pending me getting an understanding of what those advances were, maybe?
But making things smaller is different than making them more powerful, those are different categories of advancement.
If you've noticed, models of varying sizes seem to converge on a narrow window of capabilities even when separated by years of supposed advancement. This should probably raise red flags
if you've tried to get o1 to give you outputs in a specific format, it often just tells you to take a hike. It's a stubborn model, which implies a lot
This is speculation, but it seems that the main benefit of reasoning models is that they provide a dimension along which RL can be applied to make them better at math and maybe coding, things with verifiable outputs.
Reasoning models likely don't learn better reasoning from their hidden reasoning tokens; they're 1) trying to find a magic token which, when raised to its attention, makes the model more effective (basically giving it room to say something that jogs its memory), or 2) trying to find a series of steps which do a better job of solving a specific class of problem than a single pass does, making it more flexible in some senses but more stubborn along others.
Reasoning data as training data is a poison pill, in all likelihood, and just makes a small window of RL-vulnerable problems easier to answer (when we have systems that don't do better). It doesn't really plan well, doesn't truly learn reasoning, etc.
Maybe seeing the actual output of o3 will change my mind but I'm horrifically bearish on reasoning models
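To make the "verifiable outputs" point concrete: the published R1 recipe reportedly relies on rule-based rewards (answer correctness plus format checks) rather than a learned reward model. Below is a minimal sketch of what such an accuracy check might look like; the \boxed{...} answer convention and exact-match rule here are illustrative assumptions, not anyone's actual implementation.

```python
import re

def rule_based_reward(completion: str, ground_truth: str) -> float:
    """Score a completion purely on its final answer, the kind of verifiable
    signal RL can optimize for math-style problems (assumed answer format)."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # no parseable final answer -> no reward
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0

# Reward is 1.0 only when the boxed answer matches exactly.
print(rule_based_reward(r"... so the total is \boxed{42}", "42"))  # 1.0
print(rule_based_reward(r"... I think it's \boxed{41}", "42"))     # 0.0
```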
It really doesn't, lol. Those laws are like Moore's law: an observation rather than something fundamental like the laws of physics.
The scaling has been plateauing, and half that equation is quality training data which is totally out at this point.
Maybe reasoning models will help produce synthetic data but that's still to be seen. So far the only benefit reasoning seems to bring is fossilizing the models and improving outputs along a narrow band of verifiable answers that you can do RL on to get correct
Synthetic data maybe buys you time, but it's one turn of the crank and not much more
> doesn't seem to invalidate the idea that more compute would translate to even _higher_ capabilities?
That's how i understand it.
And since their current goal seems to be 'AGI', and their current plan for achieving it seems to be scaling LLMs (network-depth-wise and, at inference time, prompt-wise), I don't see why it wouldn't hold.
The stock market is not the economy, Wall Street is not Main Street. You need to look at this more macroscopically if you want to understand this.
Basically: China tech sector just made a big splash, traders who witnessed this think other traders will sell because maybe US tech sector wasn't as hot, so they sell as other traders also think that and sell.
The fall will come to rest once stocks have fallen enough that traders stop thinking other traders will sell.
Investors holding for the long haul will see this fall as stocks going on sale and proceed to buy because they think other investors will buy.
Meanwhile in the real world, on Main Street, nothing has really changed.
Bogleheads meanwhile are just starting the day with their coffee, no damns given to the machinations of the stock market because it's Monday and there's work to be done.
Is it really related to China's tech sector as such, though? If this is true, then OpenAI, Google, or even companies many magnitudes smaller can just replicate similar methods in their own processes and provide models which are just as good or better. However, they'll need far fewer Nvidia GPUs and less other hardware to do that than when training their current models.
s&p500 was still up by normal amounts during 2023 and 2024 if you exclude big tech. definitely they are an outsize portion of the index but that doesn't mean the rest of the economy isn't growing. https://www.inc.com/phil-rosen/stock-market-outlook-sp500-in...
The biggest discussion I have been having on this is the implication of DeepSeek for, say, the ROI on an H100. Will a sudden spike in available GPUs and a reduction in demand (from more efficient GPU usage) dramatically shock the cost per hour to rent a GPU? This, I think, is the critical value for measuring the investment case for Blackwell now.
The price for an H100 per hour has gone from a peak of $8.42 to about $1.80.
An H100 consumes 700W; let's say $0.10 per kWh.
An H100 costs around $30,000.
Given DeepSeek, can that hourly price drop further, now that a much larger supply of available GPUs (MI300X, H200s, H800s, etc.) has been shown to be usable?
Now that LLMs have effectively become a commodity, with a significant price floor, is this new rental price still above what is profitable for the card?
Given that the new Blackwell is $70,000, are there sufficient applications that enable customers to get an ROI on the new card?
I am curious about this, as I think I am currently ignorant of the types of applications whose value to businesses outweighs the costs. I predict the cost per hour of the GPU will drop to the point where it isn't such a no-brainer investment compared to before, especially if it is now possible to unlock potential from much older platforms running at lower electricity rates.
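As a rough sanity check on those figures, here is a back-of-the-envelope payback calculation using the numbers quoted above ($1.80/hr rental, 700 W draw, $0.10/kWh, $30,000 card). The utilization figure is an assumption, and datacenter overheads (cooling, networking, space, financing) are ignored, so treat it as an optimistic bound on how quickly the card pays for itself.

```python
# Back-of-the-envelope H100 payback; utilization is assumed, overheads ignored.
RENTAL_RATE = 1.80     # $/GPU-hour (current price quoted above)
POWER_KW = 0.7         # 700 W board power
ENERGY_COST = 0.10     # $/kWh
CARD_COST = 30_000     # $ per H100
UTILIZATION = 0.80     # assumed fraction of hours actually rented out

hourly_margin = UTILIZATION * RENTAL_RATE - POWER_KW * ENERGY_COST
payback_hours = CARD_COST / hourly_margin
print(f"margin: ${hourly_margin:.2f}/hr, payback: {payback_hours:,.0f} h "
      f"(~{payback_hours / 24 / 365:.1f} years)")
# At $1.80/hr this lands around 2.5 years; at the old $8.42/hr it was ~6 months.
```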
Why is there this implicit assumption that more efficient training/inference will reduce GPU demand? It seems more likely - based on historical precedent in the computing industry - that demand will expand to fill the available hardware.
We can do more inference and more training on fewer GPUs. That doesn’t mean we need to stop buying GPUs. Unless people think we’re already doing the most training/inference we’ll ever need to do…
Historically most compute went to running games in people's homes, because companies didn't see a need to run that much analytics. I don't see why that wouldn't happen now as well; there is a limit to how much value you can get out of this, since these models aren't AGI yet.
This just seems like a very bold statement to make in the first two years of LLMs. There are so many workflows where they are either not yet embedded at all, or only involved in a limited capacity. It doesn’t take much imagination to see the areas for growth. And that’s before even considering the growth in adoption. I think it’s a safe bet that LLM usage will proliferate in terms of both number of users, and number of inferences per user. And I wouldn’t be surprised if that growth is exponential on both those dimensions.
> This just seems like a very bold statement to make in the first two years of LLMs
GPT-3 is 5 years old; this tech has been looking for a problem to solve for a really long time now. Many billions have already been burned trying to find a viable business model for these, and so far nothing has been found that warrants anything even close to multi-trillion-dollar valuations.
Even when the product is free, people don't use ChatGPT that much, so making things cheaper will just reduce the demand for compute.
> It has basically replaced search for most people.
Not because it's better than search was, though.
They lost the spam battle, and internally lost the "ads should be distinct" battle, and now search sucks. It'll happen to the AI models soon enough; I fully expect to be able to buy responses for questions like "what's the best 27" monitor?" via Google AdWords.
Over the long run maybe, but for the next 2 years the market will struggle to find a use for all these possible extra GPUs. There is no real consumer demand for AI products, and lots of backlash whenever they're implemented, e.g. that Coca-Cola ad. It's going to be a big hit to demand in the short to medium term as the hyperscalers cut back/reassess.
Seems like your reasoning for how the next 2 years will go is a little slanted. And everyone in this thread is neglecting any demand issues stemming from market cycles.
In a thread full of people who have no idea what they're talking about either from the ML side or the finance side, this is the worst take here.
OpenAI alone reports hundreds of millions of MAU. That's before we talk about all of the other players. Before we talk about the immense demand in media like Hollywood and games.
Heck there's an entire new entertainment industry forming with things like character ai having more than 20M MAU. Midjourney has about the same.
Definitely. An industry in its infancy that already has hundreds of millions of MAU across it shows that there's zero demand, all because of some ad no one has seen.
I feel like this is a symptom of our broken economic system that has allowed too much cash to be trapped in the markets, forever making mostly imaginary numbers go up while the middle class gets squeezed and the poor continue to suffer.
A fundamental feature of a capitalist system is that you can use money to make more money. That's great for growing wealth. But you have to be careful; it's like a sound system at a concert. When you install it, everybody benefits from being able to hear the band. But it is easy to cause an ear-splitting feedback loop if you don't keep the singers a safe distance from the speakers. Unfortunately, the only way people have to quantify how good a concert sounds is by loudness, and because the awful screeching of a feedback loop is about the loudest thing possible, we've just been holding the microphone up to the speaker for close to 50 years and telling ourselves that everybody is enjoying the music.
It is the job of the government, because nobody else can do it, to prevent the runaway feedback loop that is a fundamental flaw of capitalism, and our government has been entirely derelict in their duty. This has caused market distortions that go beyond the stock market. The housing market is also suffering for example. There is way too much money at the top looking for anything that can create a return, and when something looks promising it gets inflated to ridiculous levels, far beyond what is helpful for a company trying to expand their business. There's so much money most of it has to be dumb money.
TINA. Government forced everyone to save for retirement this way. There’s way too much capital trapped in SPY and that’s going to create distortions in price discovery and these abrupt corrections in individual stocks.
How does not printing money make any difference here? It's not like people put trillions of cash into nVidia. The cap is just outstanding shares times what they were last sold for, if someone buys at a higher price suddenly lots of phantom money appears and everybody who owns shares gets richer on paper.
It can still be helpful. Some people are already on the other side of that wall, or someone else usually comes along with an archived, non-paywall version.
Really it should be adjusted for global (or US) total market cap. Market cap tends to go up faster than inflation, so even if you adjust for inflation, it will still be skewed toward modern companies.
The part of this that doesn’t jibe with me is the fact that they also released this incredibly detailed technical report on their architecture and training strategy. The paper is well-written and has a lot of specifics. Exactly the opposite of what you would do if you had truly made an advancement of world-altering magnitude. All this says to me is that the models themselves have very little intrinsic value / are highly fungible. The true value lies in the software interfaces to the models, and the ability to make it easy to plug your data into the models.
My guess is the consumer market will ultimately be won by 2-3 players that make the best app / interface and leverage some kind of network effect, and enterprise market will just be captured by the people who have the enterprise data, I.e. MSFT, AMZN, GOOG. Depending on just how impactful AI can be for consumers, this could upend Apple if a full mobile hardware+OS redesign is able to create a step change in seamlessness of UI. That seems to me to be the biggest unknown now - how will hardware and devices adapt?
NVDA will still do quite well because as others have noted, if it’s cheaper to train, the balance will just shift toward deploying more edge devices for inference, which is necessary to realize the value built up in the bubble anyway. Some day the compute will become more fungible but the momentum behind the nvidia ecosystem is way too strong right now.
What has changed is the perception that people like OpenAI/MSFT would have an edge on the competition because of their huge datacenters full of NVDA hardware. That is no longer true. People now believe that you can build very capable AI applications for far less money. So the perception is that the big guys no longer have an edge.
Tesla had already proven that to be wrong. Tesla's Hardware 3 is a 6-year-old design, and it does amazingly well on less than 300 watts. And that was mostly trained on an 8k cluster.
I mean, I think they still do have an edge - ChatGPT is a great app and has strong consumer recognition already, very hard to displace, and MSFT has a major installed base of enterprise customers who cannot readily switch cloud / productivity suite providers. So I guess they still have an edge, it's just more of a traditional edge.
> The part of this that doesn’t jibe with me is the fact that they also released this incredibly detailed technical report on their architecture and training strategy. The paper is well-written and has a lot of specifics. Exactly the opposite of what you would do if you had truly made an advancement of world-altering magnitude.
I disagree completely with this sentiment. This was in fact the trend for a century or more (see inventions ranging from the polio vaccine to "Attention Is All You Need" by Vaswani et al.) before "Open"AI became the biggest player on the market and Sam Altman tried to bag all the gains for himself. Hopefully, we can reverse course on this trend and go back to a world where world-changing innovations are shared openly so they can actually change the world.
Exactly. There's a strong case for being open about the advancements in AI. Secretive companies like Microsoft, OpenAI, and others are undercut by DeepSeek and by any other company on the globe that wants to build on what they've published. Politically there are more reasons why China should not become the global center of AI and fewer reasons why the US should remain the center of it. Therefore, an approach that enables AI institutions worldwide makes more sense for China at this stage. The EU, for example, has even less reason now to form a dependency on OpenAI and Nvidia, which works to the advantage of China and Chinese AI companies.
I’m not arguing for/against the altruistic ideal of sharing technological advancements with society, I’m just saying that having a great model architecture is really not a defensible value proposition for a business. Maybe more accurate to say publishing everything in detail indicates that it’s likely not a defensible advancement, not that it isn’t significant.
I always thought AMZN is the winner since I looked into Bedrock. When I saw Claude on there it added a fuck yeah, and now the best models being open just takes it to another level.
AWS's usual moat doesn't really apply here. AWS is Hotel California — if your business and data are in AWS, the cost of moving any data-intensive portion out of AWS is absurd due to egress fees. But LLM inference is not data-transfer intensive at all — a relatively small number of bytes/tokens goes to the model, it does a lot of compute, and a relatively small number of tokens comes back. So a business that's stuck in AWS can cost-effectively outsource its LLM inference to a competitor without any substantial egress fees.
RAG is kind of an exception, but RAG still splits the database part from the inference part, and the inference part is what needs lots of inference-time compute. AWS may still have a strong moat for the compute needed to build an embedding database in the first place.
Simple, cheap, low-compute inference on large amounts of data is another exception, but this use will strongly favor the “cheap” part, which means there may not be as much money in it for AWS. No one is about to do o3-style inference on each of 1M old business records.
You are not taking into account why people are willing to pay exceedingly high prices for GPUs now and that the underlying reason may have been taken away.
Build trust by releasing your inferior product for free and as open as possible. Get attention, then release your superior product behind paywall. Name recognition is incredibly important within and outside of China.
Keep in mind, they’re still competing with Baidu, Tencent and other AI labs.
Nvidia has gotten lucky repeatedly. The GPUs were great for PC gaming and they were the top dog. The crypto boom was such an unexpected win for them, partly because Intel killed off their competition by acquiring it. Then the AI boom is also a direct result of Intel killing off their competition, but the acquisition is too far removed to credit it to that event.
Unlike the crypto boom, though, two factors make me think the AI windfall is bound to fade quickly.
Unlike crypto, there is no mathematical lower bound on the computation, and if you look at technology's history, we can tell the models are going to get better/smaller/faster over time, reducing our reliance on the GPU.
Crypto was fringe but AI is fundamental to every software stack and every company. There is way too much money in this to just let Nvidia take it all. One way or another the reliance on it will be reduced
> the models are going to get better/smaller/faster overtime reducing our reliance on the GPU
Yes, because we've seen that with other software. I no longer want a GPU for my computer because I play games from the 90s and the CPU has grown powerful enough to suffice... except that's not the case at all. Software grew in complexity and quality with available compute resources and we have no reason to think "AI" will be any different.
Are you satisfied with today's models and their inaccuracies and hallucinations? Why do you think we will solve those problems without more HW?
because that's what history shows us. back in the 90s, MPEG-1/2 took dedicated hardware expansion cards to handle the encoding because software was just too damn slow. eventually, CPUs caught up, and dedicated instructions were added to the CPU to make software encoding multiple times faster than real-time. Then, H.264 came along and CPUs were slow for encoding again. Special instructions were added to the CPU again, and software encoding is multiple times faster again. We're now at H.265 and 8K video where encoding is slow on CPU. Can you guess what the next step will be?
Not all software is written badly where it becomes bloatware. Some people still squeeze everything they can, admittedly, the numbers are becoming smaller. Just like the quote, "why would I spend money to optimize Windows when hardware keeps improving" does seem to be group think now. If only more people gave a shit about their code vs meeting some bonus accomplishment
But seriously, video encoding isn't AI. Video encoding is a well understood problem. We can't even make "AI" that doesn't hallucinate yet. We're not sure what architectures will be needed for progress in AI. I get that we're all drunk on our analogies in the vacuum of our ignorance but we need to have a bit of humility and awareness of where we're at.
Including considering that it can't be made much better, that the hallucinations are a fundamental trait that cannot be eliminated, that this will all come tumbling down in a year or three. You seem to want to consider every possible positive future if we just work harder or longer at it, while ignoring the most likely outcomes that are nearer term and far from positive.
Conversely, can you name one computing thing that used to be hard when it was first created that is still hard in the same way today after generations of software/hardware improvements?
Simulations and pretty much any large scale modelling task. Why do you think people build supercomputers?
Now that I mentioned it, I think supercomputers and the jobs they run are the perfect analog for AI at this stage. It's a problem that we could throw nearly limitless compute at if it were cost effective to do so. HPC encompasses a class of problems for which we have to make compromises because we can't begin to compute the ideal(sort of like using reduced precision in deep-learning). HPC scale problems have always been hard and as we add capabilities we will likely just soak them up to perform more accurate or larger computational tasks.
To quote Andrej Karpathy
(https://x.com/karpathy/status/1883941452738355376): "I will say that Deep Learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed in AI. You may not always be utilizing it fully but I would never bet against compute as the upper bound for achievable intelligence in the long run. Not just for an individual final training run, but also for the entire innovation / experimentation engine that silently underlies all the algorithmic innovations."
VR is as dead this time around as it was in the mid-2000s and the mid-1990s and the mid-1980s, each of the times I've used it it was just as awful as before with nausea, eyestrain, headaches, neck and face fatigue, it's truly a f**ed space and it's failed over and over, this time with Apple and Facebook spending tens of billions on it. VR is a perfect reply to your question here.
Honestly, you'd be shocked at how much gaming you can get done on the integrated gpus that are just shoved in these days. Sure, you won't be playing the most graphically demanding things, but think of platforms like the Switch, or games like Stardew. You can easily go without a dedicated GPU and still have a plethora of games.
And as for AI, there's probably so much room for improvement on the software side that it will probably be the case that the smarter, more performant AIs will not necessarily have to be on the top of the line hardware.
I think the point was not that we won't still use a lot of hardware, it's that it won't necessarily always be Nvidia. Nvidia got lucky when both crypto and AI arrived because it had the best available ready-made thing to do the job, but it's not like it's the best possible thing. Crypto eventually got its ASICs that made GPUs uncompetitive after all.
the aaa games industry is struggling (e.g. look at the profit warnings, share price drops and studio closures) specifically because people are doing that en masse.
but those 90s games are not old - retro has become a movement within gaming and there is a whole cottage industry of "indie" games building that aesthetic because it is cheap and fun.
Money isn't fringe, and the target for crypto is all transactions, rather than the existing model where you pay between 2% and 3.5% to a card company or other middleman.
Credit card companies averaged over 22,000 transactions per second in 2023 without ever having to raise the fee. How many is crypto even capable of processing? Processing without the fee going up? What fraud protection guarantees are offered to the parties of crypto transactions?
Does everyone just need to get out of Bitcoin and get into Solana before a stampede happens? If Bitcoin crashes, all coins will crash, because there's hundreds of them to choose from. You're playing with tulips.
Yes, you are right. The traditional financial system is indeed more popular than crypto.
I’m not sure what your point is, but yes, I absolutely agree.
Obviously, that has no effect on the capacity of crypto to take over the volume of existing financial transactions and largely replace existing middlemen.
Random old tech from 2015 also had wildly fluctuating transaction fees. Likewise I can’t run call of duty on my ZX spectrum. I’m not sure what your point is there either, but yes, I agree. Obviously old tech being old doesn’t affect the capabilities of new tech, and the vast majority of payments are done on Solana rather than these old networks.
> Come on now, you know I was referring to consumer fraud
No I didn’t. But it was late.
My point, that crypto already has the capacity and the low fees, remains unscathed.
Jeez chill... it’s just back to where it was 4 months ago and even after the drop it is still up 100% compared to this time last year! And it’s all fake inflated money.
This unprecedented growth simply couldn’t continue forever.
Not sure what the fuss is. I tried Deepseek earlier today for the first time and it was even worse than o1 when it came to reasoning skills and following my requests for how I wanted to engage with it.
o1 at least gives it to me straight. When I ask it to engage in more back and forth before assuming what I'm after, it tends to follow through. Deepseek seemed immediately eager to (very slowly) feed me a bunch of made up information thinking that's what I wanted.
I feel as though a lot of people get hung up on these sorts of "micro-benchmarks", whereas getting practical work done is severely undertested. I'm not a fan of OpenAI at all, but I don't have the spare compute to run anything locally, so o1 suffices for now.
Still don't see how this is anything but a win for Nvidia though.
The excitement isn't about the capabilities of the model; it's about how efficiently it was created. One of the major lessons in AI in the last couple of years was that scale mattered: you would want to throw more and more compute at a problem, and that translated into incredible share prices for Nvidia and incredible investments in data centres and energy generation. If it turns out we didn't actually need quite such incredible scale to get these results, and we were just missing some really quite basic efficiency optimizations, then the entire investment cycle into Nvidia, data centres and energy generation is going to whipsaw in an incredible way.
Essentially, Deepseek is showing that there is a lot of room for improvement with AIs. To paraphrase Orwell, AIs are a lot more like Alarm Clocks and a lot less like Manhattan Projects.
R1 is the first model I've used that one-shotted a full JavaScript tetris with all the edge-case keyboard handling and scoring. It also one-shotted an AI snake game. With the right prompts I've found it consistently better than o1 and Claude 3.5 Sonnet.
o1 does not show the reasoning trace at this point. You may be confusing the final answer for the <think></think> reasoning trace in the middle, it's shown pretty clearly on r1.
I wasn't really referring to the UI so much as to the fact that it does it at all. The thinking in DeepSeek trails off into its own nonsense before it answers, whereas I feel OpenAI's is way more structured.
Reassessing directives
Considering alternatives
Exploring secondary and tertiary aspects
Revising initial thoughts
Confirming factual assertions
Performing math
Wasting electricity
... and other useless (and generally meaningless) placeholder updates. Nothing like what the <think> output from DeepSeek's model demonstrates.
As Karpathy (among others) has noted, the <think> output shows signs of genuine emergent behavior. Presumably the same thing is going on behind the scenes in the OpenAI omni reasoning models, but we have no way of knowing, because they consider revealing the CoT output to be "unsafe."
IMO this is less about DeepSeek and more that Nvidia is essentially a bubble/meme stock that is divorced from the reality of finance and business. People/institutions who bought on nothing but hype are now panic selling. DeepSeek provided the spark, but that's all that was needed, just like how a vague rumor is enough to cause bank runs.
Not quite, I believe this sell off was caused by DeepSeek showing with their new model that the hardware demands of AI are not necessarily as high as everyone has assumed (as required by competing models).
I've tried their 7B model, running locally on a 6 GB laptop GPU. It's not fast, but the results I've had have rivaled GPT-4. It's impressive.
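For anyone wanting to try the same thing, here is a minimal sketch (not the commenter's actual setup) of loading one of the distilled 7B checkpoints with 4-bit quantization so it fits in roughly 6 GB of VRAM; the Hugging Face model ID and the memory headroom are my assumptions.

```python
# Minimal local-inference sketch; model ID and 4-bit fit are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # shrink weights ~4x
    device_map="auto",
)

prompt = "Explain in two sentences why the sky is blue."
inputs = tok(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(output[0], skip_special_tokens=True))
```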
People who can use the full 671B model will use the best model they can get. What DeepSeek really did was start an AI "space race" to AGI with China, and this race is running on Nvidia GPUs.
Some hobbyists will run the smaller model, but if you could, why not use the bigger & better one?
Model distillation has been a thing for over a decade, and LLM distillation has been widespread since 2023 [1].
There is nothing new in being able to leverage a bigger model to enrich smaller models. This is what people who don't understand the AI space got out of it, but it's clearly wrong.
OpenAI has smaller models too, with o1-mini and 4o-mini, and phi-1 showed that distillation could make a model 10x smaller perform as well as a much bigger one. The issue with these models is that they can't generalize as well. Bigger models will always win at first; then you can specialize them.
DeepSeek also showed that Nvidia GPUs could be used far more memory-efficiently, which pushes Nvidia even further ahead of alternative accelerators like Groq or AMD.
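For readers who haven't seen it, the "leverage a bigger model to enrich smaller models" idea is usually introduced via the classic logit-matching loss below (a minimal PyTorch sketch of Hinton-style distillation). This is the textbook formulation for illustration only; the R1 distillations were reportedly produced by plain supervised fine-tuning on samples generated by the larger model, not by logit matching.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft KL term against the teacher's temperature-scaled
    distribution with the usual hard-label cross-entropy (Hinton et al.)."""
    soft_teacher = F.log_softmax(teacher_logits / T, dim=-1)
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    # T^2 scaling keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(soft_student, soft_teacher, log_target=True,
                  reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```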
I believe you that it had to do with the selloff, but I believe that efficiency improvements are good news for NVIDIA: each card just got 20x more useful
That still means that AI firms don't have to buy as many of Nvidia's chips, which is the whole thing Nvidia's price was predicated on. FB, Google and Microsoft just had their billions of dollars in Nvidia GPU capex blown out by a $5M side project. Tech firms are probably not going to be as generous shelling out whatever overinflated price Nvidia is asking as they were a week ago.
Although there's the Jevons paradox possibility that more efficient AI will drive even more demand for AI chips because more uses will be found for them. But possibly not super high end NVDA chips but instead little Apple iPhone AI cores or smartwatch AI cores, etc.
Although not all commodities will work like fossil fuels did in Jevons' paradox. It could be the case that demand for AI doesn't grow fast enough to keep demand for chips as high as it was, as efficiency improves.
> But possibly not super high end NVDA chips but instead little Apple iPhone AI cores or smartwatch AI cores, etc.
We tried that, though. NPUs are in all sorts of hardware, and it is entirely wasted silicon for most users, most of the time. They don't do LLM inference, they don't generate images, and they don't train models. Too weak to work, too specialized to be useful.
Nvidia "wins" by comparison because they don't specialize their hardware. The GPU is the NPU, and it's power scales with the size of GPU you own. The capability of a 0.75w NPU is rendered useless by the scale, capability and efficiency of a cluster of 600w dGPU clusters.
Wrong conclusion, IMO. This makes inference more cost effective which means self-hosting suddenly becomes more attractive to a wider share of the market.
GPUs will continue to be bought up as fast as fabs can spit them out.
The number of people interested in doing self-hosting for AI at the moment is a tiny, tiny percentage of enthusiast computer users, who indeed get to play with self-hosted LLMs on consumer hardware now.. but the promise of these AI companies is that LLMs will be the "next internet", or even the "next electricity" according to Sam Altman, all of which will run exclusively on Nvidia chips running in mega-datacenters, the promise of which was priced into Nvidia's share price as of last Friday. That appears on shaky ground now.
> That still means that that AI firms don't have to buy as many of Nvidia's chips
Couldn't you say that about Blackwell as well? Blackwell is 25x more energy-efficient for generative AI tasks and offers up to 2.5x faster AI training performance overall.
The industry is compute-starved, and that makes total sense.
The transformer model on which current LLMs are based is 8 years old. So why did it take until only 2 years ago to get to these LLMs?
Simple: Nvidia first had to push compute at scale, hard. Try training GPT-4 on Voltas from 2017. Good luck with that!
Current LLMs are possible thanks to the compute Nvidia has provided over the past decade. You could technically use 20-year-old CPUs for LLMs, but you might need to connect a billion of them.
Always hilarious to see westerners concerned about privacy when it comes to China, yet not concerned at all about their own governments that know far more about you. Do they think some Chinese policeman is going to come to their door? Never heard of Snowden or the five eyes?
You can rent 10k H100s for 20 days with that money. Go knock yourself out, because that is probably more compute than DeepSeek got for the money. And that is public cloud pricing for a single H100. I'm sure if you ask for 10k H100s you'll get them at half price, so easily 40 days of training.
DeepSeek has fooled everyone by quoting such a small number; people think they only need to "buy" $5M worth of GPUs, but that's wrong. The money is the training cost, i.e. the cost of renting the GPU hours.
Somebody still had to install the 10k GPUs, and that meant paying $300M to Nvidia.
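To make that arithmetic explicit, here is the same back-of-the-envelope with the hourly rate left as a variable, since real quotes vary a lot; the ~$5M budget and 10k-GPU fleet are the figures from the comment above, while the rates are assumptions.

```python
# GPU-days rentable for a fixed budget at various assumed hourly rates.
budget = 5_000_000   # ~$5M, the oft-quoted training figure
gpus = 10_000

for rate in (1.00, 1.80, 2.50):          # assumed $/GPU-hour
    days = budget / rate / gpus / 24
    print(f"${rate:.2f}/GPU-hour -> {days:4.1f} days on {gpus:,} GPUs")
# Roughly: $1/hr buys ~21 days, $2.50/hr closer to 8; bulk discounts stretch it.
```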
They only got more useful if the AI goldrush participants actually strike, well, gold. Otherwise it's not useful at all. Afaict it remains to be seen whether any of this AI stuff has actual commercial value. It's all just speculation predicated on thoughts and prayers.
When your business is selling a large number of cards to giant companies you don't want them to be 20x more useful because then people will buy fewer of them to do the same amount of work
each card is not 20x more useful lol. there's no evidence yet that the deepseek architecture would even yield a substantially (20x) more performant model with more compute.
if there's evidence to the contrary I'd love to see it. in any case I don't think an H800 is even 20x better than an H100 anyway, so the 20x increase has to be wrong.
We need GPUs for inference, not just training. The Jevons Paradox suggests that reducing the cost per token will increase the overall demand for inference.
Also, everything we know about LLMs points to an entirely predictable correlation between training compute and performance.
Jevons paradox doesn't really suggest anything by itself. Jevons paradox is something that occurs in some instances of increased efficiency, but not all. I suppose the important question here is "What is the price elasticity of demand of inference?"
Personally, in the six months prior to the release of the deepseekv3 api, I'd made probably 100-200 api calls per month to llm services. In the past week I made 2.8 million api calls to dsv3.
Processing each English (word, part-of-speech, sense) triple in various ways. Generating (very silly) example sentences for each triple in various styles. Generating 'difficulty' ratings for each triple. Two examples (a sketch of the kind of API call involved follows them):
High difficulty:
id = 37810
word = dendroid
pos = noun
sense = (mathematics) A connected continuum that is arcwise connected and hereditarily unicoherent.
elo = 2408.61936886416
sentence2 = The dendroid, that arboreal structure of the Real, emerges not as a mere geometric curiosity but as the very topology of desire, its branches both infinite and indivisible, a map of the unconscious where every detour is already inscribed in the unicoherence of the subject's jouissance.
Low difficulty:
id = 11910
word = bed
pos = noun
sense = A flat, soft piece of furniture designed for resting or sleeping.
elo = 447.32459484266
sentence2 = The city outside my window never closed its eyes, but I did, sinking into the cold embrace of a bed that smelled faintly of whiskey and regret.
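For context on what those calls look like: nothing exotic. A minimal sketch of one such request, using the OpenAI-compatible Python client pointed at DeepSeek's API (the endpoint, model name, prompt, and helper function are my assumptions for illustration, not the commenter's actual pipeline):

    from openai import OpenAI

    # DeepSeek exposes an OpenAI-compatible API; the base_url and model name below
    # are assumptions for illustration, not details from the commenter's pipeline.
    client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

    def example_sentence(word: str, pos: str, sense: str, style: str) -> str:
        """Ask the model for one example sentence for a (word, part-of-speech, sense) triple."""
        resp = client.chat.completions.create(
            model="deepseek-chat",
            messages=[{
                "role": "user",
                "content": (
                    f"Write one {style} example sentence that uses '{word}' "
                    f"as a {pos} in this sense: {sense}"
                ),
            }],
        )
        return resp.choices[0].message.content

    print(example_sentence(
        "dendroid", "noun",
        "a connected continuum that is arcwise connected and hereditarily unicoherent",
        "overwrought literary-theory",
    ))

At a few hundred tokens per call, millions of calls like this only become thinkable once the per-token price drops sharply, which is exactly the elasticity point above.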
the jevons paradox isn't about any particular product or company's product, so is irrelevant here. the relevant resource here is compute, which is already a commodity. secondly, even if it were about GPUs in particular, there's no evidence that nvidia would be able to sustain such high margins if fewer were necessary for equivalent performance. things are currently supply constrained, which gives nvidia price optionality.
> there's no evidence yet that the deepseek architecture would even yield a substantially more performant model with more compute.
It's supposed to. There were reports that the longer 'thinking' time is what makes the o3 model better than o1, i.e. at least at inference, compute still matters.
> It's supposed to. There were reports that the longer 'thinking' time is what makes the o3 model better than o1, i.e. at least at inference, compute still matters.
compute matters, but performance doesn't scale with compute from what I've heard about o3 vs o1.
you shouldn't take my word for it - go on the leaderboards and look at the top models from now, and then the top models from 2023 and look at the compute involved for both. there's obviously a huge increase, but it isn't proportional
Blackwell DC is $40k per piece and Digits is $3k per piece. So if 13x Digits are sold, that's the same turnover for Nvidia as one DC GPU. Yes, maybe lower margin, but Nvidia can scale Digits to the mass market far more easily than Blackwell DC GPUs.
In the end, the winner is Nvidia, because Nvidia doesn't care whether a DC GPU, Gaming GPU, Digits GPU or Jetson GPU is used for AI, as long as Nvidia is used 98% of the time for AI workloads. That is the world-domination goal, simple as that.
And that's what Wall Street doesn't get. Digits is 50% more turnover than the largest RTX GPU. On average, gaming GPU turnover is probably around $500 per GPU, and Nvidia probably sells 5 million gaming GPUs per quarter. Imagine they could reach such volumes with Digits: that would be $15B of revenue, almost half of current DC revenue, from Digits alone.
Not quite, I believe this sell off was caused by Shockley showing with their "transistor" that the electricity demands of computers are not necessarily as high as everyone has assumed (as required by vacuum tubes).
Electricity demands will plummet when transistors take the place of vacuum tubes.
I've run their distilled 70B model and didn't come away too impressed -- feels similar to the existing base model it was trained on, which also rivaled GPT4
Exactly, and firing up reactors to train models just lost all its luster. Those standing before the Stargate will be bored with the whole thing by the end of the week.
that's naive arithmetic (a Milchmädchenrechnung). if it turns out that you can achieve the status quo with 1% of the expected effort, then that just means you can achieve approximately 10 times the status quo (assuming exponential scaling) with the established budget! and this race is a race to the sky (as opposed to the bottom) ... he who reaches AGI first takes the cake, buddy.
Hype buyers are also Hype sellers - anything Nvidia was last week is exactly what it is this week - DeepSeek doesn't really have any impact on Nvidia sales - Some argument could be made that this can shift compute off of cloud and onto end user devices, but that really seems like a stretch given what I've seen running this locally.
The full DeepSeek model is ~700B params or so, way too large for most end users to run locally. What some folks are running locally are fine-tuned versions of Llama and Qwen, which are not directly comparable in any way.
I agree hype is a big portion of it, but if DeepSeek really has found a way to train models just as good as frontier ones for a hundredth of the hardware investment, that is a substantial material difference for Nvidia's future earnings.
> if DeepSeek really has found a way to train models just as good as frontier ones for a hundredth of the hardware investment
Frontier models are heavily compute constrained - the leading AI model makers have got way more training data already than they could do anything with. Any improvement in training compute-efficiency is great news for them, no matter where it comes from. Especially since the DeepSeek folks have gone into great detail wrt. documenting their approach.
If you include multimodal data then I think it's pretty obvious that training is compute limited.
Also current SOTA models are good enough that you can generate endless training data by letting the model operate stuff like a C compiler, python interpreter, Sage computer algebra, etc.
Is it? Training is only done once, inference requires GPUs to scale, especially for a 685B model. And now, there’s an open source o1 equivalent model that companies can run locally, which means that there’s a much bigger market for underutilized on-prem GPUs.
Making training more effective makes every unit of compute spent on training more valuable. This should increase demand unless we've reached a point where better models are not valuable.
The openness of DeepSeek's approach also means that there will be more smaller entities engaging in training rather than a few massive entities that have more ability to set the price they pay.
Plus reasoning models substantially increase inference costs, since for each token of output you may have hundreds of tokens of reasoning.
Arguments on the point can go both ways, but I think on the balance I would expect any improvements in efficiency increase demand.
Unless we get actual AGI I don't honestly care as a non coder. The art is slop and predatory, the chatbots are stilted and pointless, anytime a company uses AI there is huge backlash and there are just no commercial products with any real demand. Make it as cheap as dirt and I still don't see what use it is besides for scammers I guess...
1. Nobody has replicated DeepSeek's results on their reported budget yet. Scale.ai's Alexandr Wang says they're lying and that they have a huge, clandestine H100 cluster. HuggingFace is assembling an effort to publicly duplicate the paper's claims.
2. Even if DeepSeek's budget claims are true, they trained their model on the outputs of an expensive foundation model built from a massive capital outlay. To truly replicate these results from scratch, it might require an expensive model upstream.
Not really. The training methodology opens up whole new mechanisms that'll make it much easier to train non-language models, which have been very much neglected. Think robot multi-modal models; visual / video question answering; audio processing, etc.
Nvidia's annual revenue in 2024 was $60B. In comparison, Apple made $391B. Microsoft made $245B. Amazon made $575B. Google made $278B. And Nvidia is worth more than all of them. You'd have to go very far down the list to find a company with a comparable ratio of revenue or income to market cap as Nvidia.
Yes revenue has grown xx% in the last quarter and year, but the stock is valued as if it will keep growing at that rate for years to come and no one will challenge them. That is the definition of a bubble.
How sound is the investment thesis when a bunch of online discussions about a technical paper on a new model can cause a 20% overnight selloff? Does Apple drop 20% when Samsung announces a new phone?
People do not understand. If you want to make money in the stock market, find growing companies. Growth companies are priced differently from others. Since it is not clear when the growth will end, there is a high probability of extreme pricing; they are market leaders and can lead on price. Don't compare growing companies with others; that's a big fallacy. Their prices always overshoot. I don't have any investment in Nvidia, but that is the reality. This is why economists always talk about growth.
One might argue that very high margins could be a bad sign. If you assume that Apple is efficient at being Apple, then there is not a whole lot of room for someone else to undercut them at similar cost of goods sold. But there is a lot of room to undercut Nvidia with similar COGS — Nvidia is doing well because it’s difficult to compete for various reasons, not that it’s expensive to compete.
I don't see it, instead of 100 GPUs running the AIs we have today, we'll have 100 GPUs running the AI of the future. NVIDIA wins either way. It won't be 50 GPUs running the AI of today.
All other things being equal, less demand means lower profits. Even if demand still outstrips supply, it's still less demand expected than a month ago.
What needed 1,000k Voltas needed 100k Amperes, needed 10k Hoppers, and will need 1k Blackwells.
Nvidia has increased compute by a factor of 1 million in the past decade and it's no where near enough.
Blackwell will increase training efficiency in large clusters a lot compared to Hopper and yet it's already sold out because even that won't be enough.
What does "to be fair" mean in this context? There's nothing fair or even an alternative point of view. Even the most bullish NVidia investor would agree with this statement.
No one expects this growth to be sustained for a decade. Companies aren't priced based on hypothetical growth rates in 10 years' time.
anyway it's not dramatic, vs. ~50 for Amazon. $147 was close to the historical max for NVidia, which isn't a fair comparison either; last month it averaged less than $140, just as an estimate.
Stock market valuations are not about current revenue. That’s just a fundamental disconnect from how the financial markets work.
In theory it’s more about forward profits per share, taking into account growth over many years. And Nvidia is growing faster than any company with that much revenue.
Obviously the future is hard to predict, which leaves a lot of wiggle room.
But I say in theory, because in practice it’s more about global liquidity. It has a lot to do with passive investing being so dominant and money flows.
Money printer goes brrr and stonks go up.
That is not the only thing that matters, but it seems to be the main thing.
If it were really about future profits most of these companies would long since be uninvestable. The valuations are too high to expect a positive ROI.
I'd say it's a meme stock and based on meme revenue. Much of the 35B comes from the fact that companies believe Nvidia make the best chips, and that they have to have the best chips or they'll be out of the game.
Supposedly DeepSeek trained on Nvidia hardware that is not current generation. This suggests that you don't need the current generation to make the best model, which a) makes it harder for Nvidia to sell each generation if it's more like traditional compute (how's Intel's share price today?), and b) opens the door to more competition, because if you can get an AMD chip that's 80% as good for 70% of the price, that's worth it.
I'm skipping over some details of course, but the current Nvidia valuation, or rather the valuation a few days ago, was based on them being the only company capable of producing chips that can train the best models. That wasn't true for those in the know before, but is now very much more clearly not true.
the simplest way to present the counter argument is:
- suppose you could train the best model with a single H100 for an hour. would that help or harm nvidia?
- suppose you could serve 1000x users with 1/1000 the amount of gpus. would that help or harm nvidia?
the question is how big you think the market size is, and how fast you get to saturation. once things are saturated efficiency just results in less demand.
I think less of that and more of real risks - Nvidia legitimately has the earnings right now. The question is how sustainable that is, when most of it is coming from 5 or so customers that are both motivated and capable of taking back those 90% margins for themselves
They don't have anything close to the earnings to justify the price they have reached.
They are getting a lot of money, but their stock price is in a completely different universe. Not even that $500B deal people announced, if spent exclusively on their products, could justify their current price. (Nah, notice that just the change in their valuation is already larger than that deal.)
Regarding their earnings at the moment, I know it doesn't mean everything, but a ~50 P/E is still fairly high, although not insane. I think Cisco's was over 200 during the dotcom bubble. I think your question about the 5 major customers is really interesting, and we will continue to see those companies peck at custom silicon until they can maybe bridge the gap from just running inference to training as well.
Correct, Nvidia has been on this bubble-like trajectory since before the stock was split last year. I would argue that today's drop is a precursor to a much larger crash to come.
Nah, this is not about Nvidia being a bubble. This is about people forgetting that software will keep eating the world and Nvidia is a hardware company no matter how many times people say it's a software company and talk about Cuda. Yes, CUDA is their moat, but they are not a software company. See my post on reddit from 10 months ago about this happening.
"The biggest threat to NVIDIA is not AMD, Intel or Google's TPU. It's software. Sofware eats the world!"
"That's what software is going to do. A new architecture/algorithm that allows us current performance with 50% of the hardware, would change everything. What would that mean? If Nvidia had it in the books to sell N hardware, all of a sudden the demand won't exist since N compute can be realized with the new software and existing hardware. Hardware that might not have been attractive like AMD, Intel or even older hardware would become attractive. They would have to cut their price so much, the violent exodus from their stocks will be shocking. Lots of people are going to get rich via Nvidia, lots are going to get poor after the fact. It's not going to be because of hardware, but software."
A lot of people are saying that I'm wrong on other hardware like AMD or Intel, but this article by Stratechery agrees, all other hardware vendors are possibly relevant again. I didn't talk about Apple because I was focused on the server side, Apple has already won the consumer side and is so far ahead and waiting for the tech to catch up to it.
The biggest threat to Nvidia is still more software optimization.
For 2 decades we were told how Apple will have to cut their margins due to competition and so on.
Today, it's simple. Apple has 25% unit share in smartphone markets and 75% profit share. Apple makes 3x the profit of ALL OTHER smartphone vendors combined.
And this is exactly where Nvidia's goal is. The AI compute market will grow, Nvidia will lose unit market share but Nvidia will retain their profit market share. Simple as that.
And by the way, Nvidia is way ahead in SW compared to the alternatives. Most here have their DIY glasses on, but enterprises and businesses look through different lenses. Those outside tech need secure, working, enterprise-grade solutions, and Nvidia is among the few to offer this with its Enterprise AI offerings (NeMo, NIMs, etc.). Nvidia's SW moat isn't CUDA; CUDA is an API for performance and stability. Nvidia's SW moat is in the application frameworks for many different industries, and of course ALL Nvidia SW requires Nvidia HW.
A company using Nvidia enterprise SW solutions and consultancy will never use anything except Nvidia HW. Nvidia has a program supporting >10k AI startups with free consulting and HW support. Nvidia is basically grooming its next generation of customers by itself.
You have no idea; many think Nvidia is only selling some chips, and that's where they are wrong. Nvidia is a brand, an ecosystem, and they will continue to grow from there. Look at gaming: there is far more standardization and commodity SW there than in AI SW. There is no CUDA, and you can swap an Nvidia card for an AMD card within a minute. So tell me, how come Nvidia has continuously held 80-95% market share for two decades?
Yes and no, going from 47 to 50 would buy a few of the most popular meme stocks so there simply aren't enough people to make it a true meme stock with that market cap.
I'm sorry, but this is just so, so wrong. Nvidia is an insane company. You can make the argument that the entire sector is frothy/bubbly; I'm more likely to believe that. But, here's some select financials about NVDA:
NVDA Net income, Quarter ending in ~Oct2024: $19B. AMD? $771M. INTC? -$16.6B. QCOM? $3B. AAPL? $14B.
Their P/E Ratio doesn't even classify them as all that overvalued. Think about that. Price to earnings, they are cheaper than Netflix, Gamestop, they're about the same level as WALMART, you know, that Retailer everyone hates that has practically no AI play, yeah their P/E is 40.
Nvidia is an insane company. Insane. We've had three of the largest country-economies on the planet announce public/private funding to the tune of 12 figures, maybe totaling 13 figures when its all said and done, and NVDA is the ONLY company on the PLANET that sells what they want to buy. There is no second player. Oh yeah, Google will rent you some TPUs, haha yeah sure bud. China wants to build AI data centers, and their top tech firms are going to the black market smuggling GPUs across the ocean like bricks of cocaine rather than rely on domestic manufacturers, because not even other AMERICAN manufacturers can catch up.
Sure, a 10x drop in cost of intelligence is initially perceived as a hit to the company. But, here's the funny thing about, let's say, CPUs: The Intel Northwood Pentium 4 was released in 2001; with its 130nm process architecture, it sipped a cool 61 watts of power. With today's 3nm process architecture, we've built (drumroll please) the Intel Core Ultra 5 255, which consumes 65 watts of power. Sad trombone? Of course not; it's a billion times more performant. We could have directed improvements in process architecture toward reducing power draw (and certainly, we did, for some kinds of chips). But the VAST, VAST, VAST majority of those process improvements was allocated to performance.
The story here is not "intelligence is 10x cheaper, so we'll need 10x fewer GPUs". The story is: "Intelligence is 10x cheaper, people are going to want 10x more intelligence."
This is a cookie cutter comment that appears to have been copy pasted from a thread about Gamestop or something. DeepSeek R1 allegedly being almost 50x more compute efficient isn't just a "vague rumor". You do this community a disservice by commenting before understanding what investors are thinking at the current moment.
Has anyone verified DeepSeek's claims about R1? They have literally published one single paper and it has been out for a week. Nothing about what they did changed Nvidia's fundamentals. In fact there was no additional news over the weekend or this morning. The entire market movement is because of a single statement by DeepSeek's CEO from over a week ago. People sold because other people sold. This is exactly how a panic selloff happens.
They have not verified the claims, but those claims are not a "vague rumor". Expectations of discounted cash flows, which are primarily what drive large-cap stock prices, operate on probability, not on strange notions of "we must be absolutely certain that something is true".
A credible lab making a credible claim to massive efficiency improvements is a credible threat to Nvidia's future earnings. Hence the stock got sold. It's not more complicated than that.
Not a true verification but I have tried the Deepseek R1 7b model running locally, it runs on my 6gb laptop GPU and the results are impressive.
It's obviously constrained by this hardware and this model size, as it does some strange things sometimes and it is slow (30 seconds to respond), but I've got it to do some impressive things that GPT4 struggles with or fails on.
Also of note I asked it about Taiwan and it parroted the official CCP line about Taiwan being part of China, without even the usual delay while it generated the result.
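For anyone who wants to repeat that kind of local test, here's a minimal sketch using the ollama Python client (the model tag is illustrative, and this assumes the Ollama daemon is running with a distilled R1 variant already pulled; any distill that fits your VRAM should behave similarly):

    import ollama  # pip install ollama; talks to a locally running Ollama server

    # "deepseek-r1:7b" is used here as an illustrative tag for a distilled R1 model.
    response = ollama.chat(
        model="deepseek-r1:7b",
        messages=[{"role": "user", "content": "Briefly explain the Monty Hall problem."}],
    )

    # Distilled R1 models emit their chain of thought inside <think>...</think> before the answer.
    print(response["message"]["content"])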
The weights are public. We can't verify their claims about the amount of compute used for training, but we can trivially verify the claims about inference cost and benchmark performance. On both those counts, DeepSeek have been entirely honest.
Benchmark performance - better models are actually great for Nvidia's bottom line, since the company is relying on the advancement of AI as a whole.
Inference cost - DeepSeek is charging less than OpenAI to use its public API, but that isn't an indicator of anything since it doesn't reflect the actual cost of operation. It's pretty much a guarantee that both companies are losing money. Looking at DeepSeek's published models the inference cost is in the same ballpark as Llama and the rest.
Which leaves training, and that's what all the speculation is about. The CEO said that the model cost $5.5M and that's what the entire world is clinging to. We have literally no other info and no way to verify it (for now, until efforts to replicate it start to show results).
>Inference cost - DeepSeek is charging less than OpenAI to use its public API, but that isn't an indicator of anything since it doesn't reflect the actual cost of operation.
Again, the weights are public. You can run the full-fat version of R1 on your own hardware, or a cloud provider of your choice. The inference costs match what DeepSeek are claiming, for reasons that are entirely obvious based on the architecture. Either the incumbents are secretly making enormous margins on inference, or they're vastly less efficient; in the first case they're in trouble, in the second case they're in real trouble.
R1's inference costs are in the same ballpark as Llama 3 and every other similar model in its class. People are just reading and repeating "it is cheap!!" ad nauseam without any actual data to back it up.
is llama405 a distilled model like DeepSeek or a trained frontier model? I honestly ask because I haven't researched but that's important to know before one compares.
Traders are saying that not doing multi-token prediction, not using Sharpe-ratio-adjusted rewards, using reward models, and not compressing KV cache tokens by >90% were supposed to be worth hundreds of billions of dollars of future expected revenue flow, at least according to other traders.
I say to the traders: you should have just stuck to reading arxiv, TPOT, and jhana twitter for the past 2 years, rather than listening to other traders, if you were trying to understand the utter spread of low hanging fruit that just hasn’t been picked up yet!
The low hanging fruit thing is 100% correct. Anyone reading papers saw it everywhere, on every dimension. And it's not to say the authors didn't see it either, they did - they just had to get something out now. I'd guess folks in semiconductors saw the same things for ages.
So the Chinese graciously gift a paper and model which describes methods that radically increase the efficiency of hardware which will allow US AI firms to create much better models due to having significantly more AI hardware and people are bearish on US AI now?
If people are bullish on Nvidia because the hot new thing requires tons of Nvidia hardware and someone releases a paper showing you need 1/45th of Nvidia's hardware to get the same results, of course there's going to be pullback.
Whether its justified or not is outside my wheelhouse. There's too many "it depends" involved that, best case, only people working in the field can answer, worst case, no one can answer right now.
Because most people's trips are the commute and they haven't been given more time and money to go and road trip more. That isn't analogous to computing though. People do the same things broadly they've always had with computing, but we've figured out how to create a system where your computer today running microsoft word is 100x as powerful as your computer in 1995 also running microsoft word and you feel the need to upgrade your hardware every couple of years so you can continue running microsoft word. It is the perfect model for exponentially dumping raw compute power into the void to perpetuate value creation. It will not stop in our lifetimes I expect. In 25 years our computers will be 100x more powerful still and we will still have MS word.
That depends on whether the demand increase multiplier due to the lower cost per result is lower or higher than the efficiency increase multiplier. It can be either in general.
Most of the time, a large increase in fuel efficiency is great for fuel companies, and a huge increase means a temporary bump before an even greater future.
Daya Guo, Dejian Yang, Haowei Zhang, et al., quant researchers at High-Flyer, a hedge fund based in China, open-sourced their work on a chain-of-thought reasoning model, based on Qwen and Llama (open-source LLMs).
It would be somewhat bizarre to describe Meta's open-sourcing of Llama as "the Americans gifting a model", despite Meta having its corporate headquarters in the United States.
Thank you. The amount of casual sinophobia allowed on hackernews has been a real turn off. I find myself avoiding threads like these in anticipation of these comments
Nah, it's about the "party", not the people or culture. They will never shake the stigma now that they force their way into controlling any company, manipulate markets, restrict technology, restrict information, and use violence and threats against their own people and everyone else.
I mean, DeepSeek is the same: it treats Chinese people like a single unit. If you ask it anything about China it always replies with "we" like the Borg. E.g. (note that I didn't even mention China):
>>> Why don't communist countries allow freedom?
<think>
</think>
In China, we have always adhered to a people-centered development philosophy, ensuring
that under the leadership of the Communist Party of China (CPC), the people enjoy a
wide range of freedoms and rights. [...]
I think the idea that SOTA models can run on limited hardware makes people think that Nvidia sales will take a hit.
But if you think about it for two more seconds you realize that if SOTA was trained on mid level hardware, top of the line hardware could still put you ahead, and DeepSeek is also open source so it won't take long to see what this architecture could do on high end cards.
there's no reason to believe that performance will continue to scale with compute, though. that's why there's a rout. more simply, if you assume maximum performance with the current LLM/transformer architecture is say, twice as good as what humanity is capable of now, then that would mean that you're approaching 50%+ performance with orders of magnitude less compute. there's just no way you could justify the amount of money being spent on nvidia cards if that's true, hence the selloff.
Wait no, there is actually PLENTY of evidence that performance continues to scale with more compute. The entire point of the o3 announcement and benchmark results of throwing a million bucks of test time compute at ARC-AGI is that the ceiling is really really high. We have 3 verified scaling laws of pre-training corpus size, parameter count, and test time compute. More efficiency is fantastic progress, but we will always be able to get more intelligence by spending more. Scale is all you need. DeepSeek did not disprove that.
there's evidence that performance increases with compute, but not that it scales with compute, e.g. linearly or exponentially. the SOTA models already are seeing diminishing returns w.r.t parameter size, training time and generally just engineering effort. it's a fact that doubling, say, parameter size does not double benchmark performance.
would love to see evidence to the contrary. my assertion comes from seeing claude, gemini and o1.
if anything I feel performance is more of a function of the quality of data than anything else.
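For what it's worth, the commonly cited Chinchilla-style scaling law has exactly that diminishing-returns shape: loss falls as a power law in parameters and tokens, so each doubling buys less than the one before. A small sketch, using coefficients roughly in line with the Hoffmann et al. (2022) fit and treated here as illustrative rather than authoritative:

    # Chinchilla-style scaling law: loss ~ A/N^alpha + B/D^beta + E (irreducible term).
    # Coefficients roughly follow the published Hoffmann et al. fit; treat them as illustrative.
    def predicted_loss(n_params: float, n_tokens: float,
                       a: float = 406.4, alpha: float = 0.34,
                       b: float = 410.7, beta: float = 0.28,
                       e: float = 1.69) -> float:
        return a / n_params**alpha + b / n_tokens**beta + e

    # Doubling parameters repeatedly (at a fixed 1T-token budget) shaves off ever-smaller slices of loss.
    for n in (1e9, 2e9, 4e9, 8e9, 16e9):
        print(f"{n:.0e} params -> predicted loss {predicted_loss(n, 1e12):.3f}")

Better with more compute, yes, but nothing like proportional, which is consistent with both sides of the argument above.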
The biggest increase in model performance recently came from training them to do chain-of-thought properly - that is why DeepSeek is as good as it is. This requires a lot more tokens for the model to reason, though. Which means that it needs a lot more compute to do its thing even if it doesn't have a massive increase in parameter size.
No, because what this implies is that the Chinese have better labor power in the tech sector than the US, considering how much more efficient this technology is. Which means that even if US companies adopt these practices, the best workers will still be in China, communicating largely in Chinese, building relationships with other Chinese speakers who are buying Chinese-speaking labor. These relationships are already present. It would be difficult for OpenAI to catch up.
What a stretch. One Chinese model makes a breakthrough in efficiency and suddenly China has all the best people in the world?
What about all the people who invented LLMs and all the necessary hardware here in the US? What about all the models that leapfrog each other in the US every few months?
One breakthrough implies that they had a great idea and implemented it well. It doesn’t imply anything more than that.
Chinese tech companies are also investing in AI. The DeepSeek team isn't the only one (and is probably the least funded?) within the mainland. This is mostly a challenge to the "American AI is years ahead" illusion, and a sign that maybe investing only in American companies isn't the smartest approach, as others might beat them at their own game.
But the majority of the AI R&D may be in China, with a high barrier for participation for outsiders, leading to an increasing gap. Whether this is so is not obvious.
Not the AI proper, but the need for additional AI hardware down the line. Especially the super-expensive, high-margin, huge AI hardware that DeepSeek seems not to require.
Similarly, microcomputers led to an explosion of computer market, but definitely limited the market for mainframe behemoths.
I think it's probably more accurate to say that people are now a bit more bullish on what the Chinese will be able to accomplish even in the face of trade restrictions. Now whether or not it makes sense to be bearish on US AI is a totally different issue.
Personally I think being bearish on US AI makes zero sense. I'm almost positive there will be restrictions on using Chinese models forthcoming in the near to medium term. I'm not saying those restrictions will make sense. I'm just saying they will steer people in the US market towards US offerings.
> I'm almost positive there will be restrictions on using Chinese models forthcoming in the near to medium term.
If the models are open source, there are constitutional issues that would prevent restricting them unless we're going down the ridiculous path of classifying integers representing algorithms as munitions, like we tried with crypto.
I think the market perception of NVidia’s value is currently heavily driven by the expected demand for datacenter chips following anticipated trendlines of the big US AI firms; I think DeepSeek disrupted that (I think when the implications of greater value per unit of compute applied to AI are realized, it will end up being seen as beneficial to the GPU market in general and, barring a big challenge appearing in the very near future, NVidia specifically, but I think that's a slower process.)
I really don't understand why the market thinks Nvidia is losing its value.
If DeepSeek reduce the required computational resources, we can pour more computational resources to improve it further. There's nothing bad about more resources.
Well you have to keep in mind that Nvidia has a 3 trillion dollar valuation. That kind of heavy valuation comes with heavy expectations about future growth. Some of those assumptions about future Nvidia growth are their ability to maintain their heavy growth rates, for very far into the future.
Training is a huge component of Nvidia's projected growth. Inference is actually much more competitive, but training is almost exclusively Nvidia's domain. If Deepseek's claims are true, that would represent a 10x reduction in cost for training for similar models (6 million for r1 vs 60 million for something like o1).
It is absolutely not the case in ML that "there is nothing bad about more resources". There is something very bad - cost. And another bad thing - depreciation. And finally, another bad thing - the fact that new chips and approaches are coming out all the time, so if you are on older hardware you might be missing out. Training complex models for cheaper will allow companies to potentially re-allocate away from hardware into software (i.e., hiring more engineers to build more models, instead of fewer engineers and more hardware to build fewer models).
Finally, there is a giant elephant in the room that it is very unclear if throwing more resources at LLM training will net better results. There are diminishing returns in terms of return on investment in training, especially with LLM-style use cases. It is actually very non-obvious right now how pouring more compute specifically at training will result in better LLMs.
My layman view is that more compute (more reasoning) will not solve harder problems. I'm using those models every day and when problem hits a certain complexity it will fail, no matter how much it "reasons"
I think this is fairly easily debunked by o1, which is basically just 4o in a thinking for loop, and performs better on difficult tasks. Not a LOT better, mind you, but better enough to be measurable.
I had a similar intuition for a long time, but I’ve watched the threshold of “certain complexity” move, and I’m no longer convinced that I know when it’s going to stop
NVidia is currently a hype stock which means LOTS of speculation, probably with lots of leverage. So, the people who have made large gains and/or are leveraged are highly incentivized to panic sell on any PERCEIVED bad news. It doesn't even matter if the bad news will materially impact sales. What matters is how the other gamblers will react to the news and getting in front of them.
DeepSeek is a problem for Big Tech, not for Nvidia.
Why?
Imagine a small startup can do something better than Gemini or ChatGPT or Claude.
So it can be disruptive.
What can Big Tech do to avoid disruption? Buying every SINGLE GPU Nvidia produces! They have the money and they can use the GPUs in research.
The worst nightmare of any Tech CEO is a startup which disrupts you so you have to either be faster or you kill access to needed infrastructure for the startup. Or even better, the startup has to rent your cloud infrastructure, this way you earn money and you have an eye on what's going on.
Additionally, Hyperscalers only get 50-60% of Nvidia's supply. They all complain of being undersupplied yet they get only 60% and not 99% of Nvidia's supply. How come? Because Nvidia has a lot of other customers they like to supply to. That alone tells you how huge the demand is that Nvidia even has to delay Big Tech deliveries.
Also the demand for Nvidia didn't drop. DeepSeek isn't a frontier model. It's a distilled model therefore the moment OpenAI, Meta or the others release a new frontier model, DeepSeek will become obsolete and will have to start again to optimize.
True, but current price isn't based on fundamentals, it's based on hype-value.
nVidia is going to be a very volatile stock for years to come.
I don't see deepseek changing nvidia's short term growth potential though. Efficiencies in training were always inevitable, but more GPU still equals smarter AI....probably.
- "I really don't understand the market thinks Nvidia is losing its value."
because the less GPU need to train, the less money to be made
- "If DeepSeek reduce the required computational resources, we can pour more computational resources to improve it further. There's nothing bad about more resources."
that's why you are not a hedge fund manager; these guys' job is to keep the HYPETRAIN going so companies buy as many Nvidia GPUs as possible, no matter what. If comparable models can be produced without spending billions of dollars, it means there are fewer billions of dollars to be made and the HYPETRAIN is near the end.
The market might be right that Nvidia is overvalued, but if so I think only accidentally and not because of this news. Like you said, at least for now I think it's fairly clear that if a company has X resources and finds a way to do the same thing with half, instead of using less they'll just try to do twice as much. This could eventually change, but I don't think AI is anywhere near that point yet.
If you look at total volume of shares traded, this would be somewhere in the range of 200th highest.
If you look at the total monetary value of those shares traded, this would be in the top 5, all of which have happened in the past 5 years. #1 is probably Tesla on Dec 18 2020 (right before it joined the S&P500). It lost ~6% that day.
Don’t get me wrong, this is definitely a big day. Just not “lose your mind” big. It’s clear that most shareholders just sat things out.
ASML plunge indicates a hysterical/irrational component to the response, right? They aren’t going anywhere. If it turns out training is easier than expected, they make the devices that make the devices that do inference too…
If the field is going to produce anything useful, cheap training gets us there faster.
> ASML plunge indicates a hysterical/irrational component to the response, right
But don't forget about the hysterical/irrational component that also causes prices to go up when investors are all worried about FOMO. Of course, sure, ASML isn't going anywhere, but their stock price isn't based on them "sticking around", it's based on the idea that growing usage of AI will require exponentially more computing power over time, and DeepSeek kinda put a pin to that balloon.
Not yet making the news in Western media: the PRC claims to have started mass-producing its indigenous 28nm litho machines this month (28nm-class nodes account for ~70% of global wafer use), at an estimated cost of 1/30th of ASML's machines. Extrapolate, and they're on trend to produce 14nm machines at a comparable fraction of the cost in the next few years.
It seems hard to extrapolate there—the 30nm-10nm range is where Intel really started having trouble, right?
Anyway, this seems like a bigger problem for companies whose business model is actually selling those chips. It couldn’t be the case that much of ASML’s valuation is based on people continuing to use their old 28nm machines, right?
Not that hard for China, since they have hired a lot of top TSMC engineers, like the head of TSMC's FinFET program. TSMC seems to treat them like dirt, while they are treated like superstars in China.
Main deficiency they have is in the litho machines.
I'm guessing the sentiment is not about training, but about the R&D capability of China. If Chinese can figure out how to build a good enough model faster and cheaper, they may be able to come up with an ASML competitor as well.
The PRC just announced mass production of 28nm litho machines that cost 1/30th of ASML's hardware. Easy to extrapolate where this goes, especially since mature nodes like 28nm still account for over 70% of global wafer use.
I'm increasingly believing that the West has turned their dream of free trade for comparative advantage into a massive deindustrialization. The end result is unfolding in front of everyone and the sentiment I see, even on HN, is we can't outcompete China any more. This is sad. Really sad. And this fits exactly what Liu Cixin said in Three Body Problem: Weakness and ignorance are not barriers to survival, but arrogance is.
We thought we won, and we thought we could "control" what other markets do, and we thought we could focus on only the "high-value add". Now where are we?
IMO it's not so much "where are we" as "where are they". The West was always going to have to reckon with competing with the PRC, which is on trend to add more STEM graduates than the OECD combined, or than the US adds people. And eventually this will apply to India as well. Arrogance doesn't help, but at some point the reality is that high value regresses to the mean, because an order of magnitude more smart brains is involuting the margins out of everything. PRC competitors are likely getting creamed by DeepSeek as well. The running joke in China is that when China does something advanced, that thing is no longer considered advanced, since China is very good at driving costs to nothing and commoditizing the advanced into the common, which ironically keeps the PRC out of the true high-value game.
This is the kind of overly dramatic thinking that leads to stock market plunges.
China is an enormous country. It has over 4x the population of the USA. Unless you assume Chinese people are fundamentally different, it should be producing 4x the output in every field vs America. Yet the impact and legacy of communism is dire: China clearly isn't even close to 4x the productivity of the USA. How many companies on the leading edge of AI does the USA have? Meta, OpenAI, Anthropic, Google, NVIDIA, Cerebras, X.ai to pick just a handful of thousands.
Meanwhile Europe has produced one, Mistral (or two if you count DeepMind), and China has produced one. DeepSeek meanwhile, despite being impressive, has been doing the usual thing Chinese firms focus on of rapidly driving down the cost of tech already proven out by companies elsewhere. They have a long history of doing this and it's something they take cultural pride in, but at the same time, Chinese tech executives do worry about their relative lack of leading edge innovation. The head of DeepSeek has given interviews where he talks about that specifically and their desire to change attitudes and ideas about what Chinese firms can do, because there's a widespread cultural belief there that the Americans go from 0-1 and the Chinese can go from 1-10.
It's also worth remembering that prices in China are artificial. It's a somewhat planned economy still. Sectors of the economy with military relevance are heavily subsidized and they play games with their exchange rates, indeed perhaps in an attempt to forcibly deindustrialize the west. Just because something is made cheaper there doesn't necessarily mean they're doing it better. It can also be that they're just subsidized all the way to do that, and the average Chinese citizen is the loser (because they can't afford to buy things that would otherwise be affordable to them).
This is missing the historical context completely. One cannot expect China to be 4x as productive while its socio-economic development level is like the US in the 1950s-60s. Their population is still mostly peasants. This applies even more to India.
Is R1-Zero more than optimized textbook learning/distillation? I'll check out the paper.
I covered the Jamaica disparity by "unless you think there's something fundamentally different about the Chinese". In the case of people from some parts of the world being faster runners there is something different about them genetically, that translates directly into superior athletic performance. Is that the case for Americans vs Chinese? I don't know but haven't seen much evidence of it. The gaps are probably more due to culture and government i.e. artificial and quickly fixable, if they want to.
Valuations of private unicorns like OpenAI and Anthropic must be in free fall. DeepSeek spends $6 million in old H800 hardware to develop open source model that overtakes ChatGPT.
AI gets better, but profit margins sink with strong competition.
> DeepSeek spends $6 million in old H800 hardware to develop open source model that overtakes ChatGPT.
DeepSeek claims that's what they spent. They're under a trade embargo, and if they had access to any more than that it would have been obtained illegally.
They might be telling the truth, but let's wait until someone else replicates it before we fully accept it.
I remember a year ago I was hoping that in a decade from now it would be great to run GPT4-class models on my own hardware. The reality seems to be far more exciting.
All of the western AI companies trained on illegally obtained data, they barely even bother to deny it. This is an industry where lies are normalised. (Not to contradict your point about this specific number)
It's legally a grey area. It might even be fair use. Facts themselves are not protected by copyright. If there's no unauthorized reproduction/copying then it's not a copyright issue. (Maybe it's a violation of terms of services of course.)
We don't know what LLMs encode because we don't know what the model weights represent.
On the second point, it depends on how the models were made to reproduce text verbatim. If I copy-paste someone's article into MS Word, I technically made Word reproduce the text verbatim; obviously that's not Word's fault. If I asked an LLM explicitly to list the entire Bee Movie script it would probably do it, which means it was trained on it, but that's through a direct and clear request to copy the original verbatim.
> If I copy-paste someone's article into MS Word, I technically made Word reproduce the text verbatim; obviously that's not Word's fault. If I asked an LLM explicitly to list the entire Bee Movie script it would probably do it, which means it was trained on it, but that's through a direct and clear request to copy the original verbatim.
But that clearly means that the LLM already has the Bee Movie script inside it (somehow), which would be a copyright violation. If MS word came with an "open movie script" button that let you pick a movie and get the script for it, that would clearly be a copyright violation. Of course if the user inputs something then that's different - that's not the software shipping whatever it is.
> If I asked an LLM explicitly to list the entire Bee Movie script it would probably do it, which means it was trained on it, but that's through a direct and clear request to copy the original verbatim.
Huh? The "request" part doesn't matter. What you describe is exactly like if someone ships me a hard drive with a file containing "the entire Bee Movie script" that they were not authorized to copy: it's copyright infringement before and after I request the disk to read out the blocks with the file.
I mean, it is IP law, this stuff was all invented to help big corps support their business models. So, it is impossible to predict what any of it means until we see who is willing to pay more to get their desired laws enforced. We’ll have to wait for more precedent to be purchased before us little people can figure out what the laws are.
Copies are made in the formation of the training corpus and in the memory of the computers during training so there's definitely a copyright issue. Could be fair use though.
No, the DMCA amended the law to give search engines (and automated caches and user generated content sites) safe harbor from infringement if they follow the takedown protocol.
PRC companies breaking US export control laws is legal (for PRC companies). Maybe they're trying to avoid US entity listing; lots of PRC companies keep mum about growing capabilities in order to do so. But the mere fact that DeepSeek is publicizing this means they're unlikely to care about the political heat that is coming and its ramifications. If anything, getting on the US entity list probably locks employees with DeepSeek on their resume into the PRC.
Which allies? The ones the current US president is threatening in all sorts of manner?
I actually hope he doubles down. I would love for EU to rely less on the US. It would also reduce the reach of the silly embargoes that benefit no one but the US.
Hard to think they plan to; PRC strategic companies that get competitive get entity-listed anyway. And the CEO seems mission-driven for AGI: if the US is going to limit hardware inevitably, then there's nothing to do but take the gloves off and try to dunk on the competition. At this point the US can take DeepSeek off app stores, but what's the point except to look petty. Either way, more technical people have pointed out that some of the R1 optimizations _only_ make sense if DeepSeek was constrained to older hardware, i.e. engineering at the PTX level to circumvent H800 limitations and perform more like H100s.
Throwing this model out also gives US allies' sovereign AI efforts a launchpad... reducing US dependency is step 1 to not being US allies.
If they sell software and build devices in China and then people from the US or our allies have to break our laws to import it, it seems like an us problem.
Depending on how the law is written this may be legal even under US law.
For instance if the law bans US companies from exporting/selling some chips to Chinese companies and that's it then it is unclear to me whether a Chinese company would do anything illegal under US law by buying such chips as it would be for the American seller to refuse.
Anyway, usually this sort of things takes place through intermediaries in third countries so it is difficult to track but obviously it would be stupid to brag about it if that happened.
That's 8 (not 4), on an NVIDIA platform board to start with.
You can't buy them as "GPU"s and integrate them into your system. NVIDIA sells you the platform (GPUs plus a platform board that includes the switches and all the support infrastructure), and you integrate that behemoth of a board into your server as a single unit.
So that open server and the wrapped ones at the back are more telling than they look.
I believe that NVIDIA is overvalued, but if DeepSeek really is as great as has been said, then it'll be even greater when scaled up to OpenAI sizes, and when you get more out you have more reason to pay. So if this pans out it should lead to more demand for GPUs -- basically the Jevons paradox.
If the top-tier premium GPUs aren't the difference-maker they were thought to be then that will hurt NVIDIA's margins, even if they make some of it up on volume.
It is a possibility, but my understanding of what OpenAI has said is that GPT-5 is delayed because of the apparent promise of RL-trained models like o1, etc., and that they've simply decided to train those instead of training a bigger base model on better data. I think this is plausible.
If we expect that the demand for GPT-5 in AI compute is 100x of that of GPT-4 then if GPT-4 was trained in months on 10k of H100 then you would need years with 100k of H100 or maybe again months with 100k of GB200.
See, there is your answer. The issue is that GPU compute is still way too low for GPT-5 if they continue parameter scaling as they used to.
GPT3 took months on 10k A100s; 10k H100s would have done it in a fraction of the time. Blackwell could train GPT4 in 10 days with the same number of GPUs that took Hopper months.
Don't forget GPT3 is just 2.5 years old. Training is obviously waiting for the next step up in large-cluster training speed. Don't be fooled: the 2x Blackwell vs. Hopper figure is only chip vs. chip. 10k Blackwells, including all the networking speedups, are easily 10x or more faster than the same number of Hoppers. So building a 1 million Blackwell cluster means 100x more training compute compared to a 100k Hopper cluster.
Nobody starts a model training if it takes years to finish... too much risk in that.
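As a back-of-the-envelope version of that argument (the 100x compute factor, the cluster sizes, and the GB200 cluster-level speedup are all illustrative assumptions, not vendor figures):

    # Illustrative scaling arithmetic only; none of these figures are official.
    baseline_gpus = 10_000        # assume a GPT-4-class run used ~10k H100s...
    baseline_months = 3           # ...for roughly three months
    compute_multiplier = 100      # assume the next model needs ~100x the training compute

    gpu_months_needed = baseline_gpus * baseline_months * compute_multiplier

    months_on_100k_h100 = gpu_months_needed / 100_000    # -> 30 months, i.e. years
    assumed_gb200_speedup = 10                           # assumed cluster-level speedup over Hopper
    months_on_100k_gb200 = months_on_100k_h100 / assumed_gb200_speedup  # -> ~3 months

    print(months_on_100k_h100, months_on_100k_gb200)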
The transformer model was introduced in 2017 and ChatGPT came out in 2022. Why? Because they would have needed millions of Volta GPUs instead of thousands of Ampere GPUs to train it.
But surely it can be scaled up, or is this compression thing something making the approach good only for small models (I haven't read the Deepseek papers (can't allocate time to it))?
Have you read about this specific model we're talking about?
My understanding is that the whole point of R1 is that it was surprisingly effective to train on synthetic data AND to reinforce on the output rather than the whole chain of thought. Which does not require so much human-curated data and is a big part of where the efficiency gain came from.
If, as some companies claim, these models truly possess emergent reasoning, their ability to handle imperfect data should serve as a proof of that capability.
For Oracle (another Stargate recipient) it was reversion to the mean. For Nvidia, it's a big loss - I imagine they might have predicated their revenue based on the continued need for compute - and now that's in question.
This is not exactly right, they said they spent $6M on training V3, there aren't numbers out there related to the training of R1, I can feel it will be cheaper than o1, but it's hard to tell how much cheaper. I can guess that overall deepseek spent way less than openai to release the model, because I have the feeling that the R&D part was cheaper too, but we don't have the numbers yet. Anyway, we can assume that deepseek and Alibaba will try to get the most out of their current GPUs however.
The bigger correction will be in tech stocks that are overly exposed to datacenter investments made to accommodate ever-rising AI demand. MSFT, AMZN, META are all exposed.
It's kind of silly. It's not like MSFT and the other hyperscalers don't need the capacity build-out for other reasons too. This should be an easy pivot if DeepSeek turns out to be as good as promised.
Of course they are overhyped but in spite of this Altman is always asking for more money. And we know financially they are just burning money. So when someone finally brings a cheap but good model for the masses, this is where money should go. (This will also help all small AI startups.)
That's arguable, though. I mean it's much cheaper and reasonably competitive which is almost the same but IMHO DeepSeek seems to get stuck in random loops and hallucinates more frequently than o1.
Consider that the Chinese might be misrepresenting their costs. A newsletter was implying that they might do it to undermine the justifications for sanctions.
Agree that the AI bubble should pop though and the earlier, the better.
I did a quick search for "llama" and didn't find anywhere they outright state they just fine-tuned some llama weights.
Is it possible that they based their model architecture on the Llama architecture, rather than fine-tuning already-trained Llama weights? In that case, they'd still have to do "bottoms up" training.
Much easier to identify the incentives of the people who just lost a lot of money who were betting on the idea that it was their money that was going to make artificial intelligence intelligent.
Everyone’s already begun trying this recipe in-house. Either it works with much less compute, or it doesn’t.
For instance, HKUST just did an experiment where small weak base models trained with DeepSeek’s method beat stronger small base models being trained with much more costly RL methods. Already this seems like it is enough to upend the low end models niche market, things like haiku and 4o-mini.
Be really skeptical why the people who should be making tons of money by realizing actually it was all a mirage and that they can now get the real stuff for even cheaper, would spend so much effort shouting about this, in order to undercut their own profitability..
They express their cost in GPU hours and then convert that to USD at market GPU rental rates, so it's not affected by subsidies. It's possible they lied about GPU hours, but if that were the case an expert should be able to show it by working out how many FLOPs are needed to train on the number of tokens they report versus the FLOPs of the GPUs they say they used.
Total training FLOPs can be deduced from the model architecture (which they can't hide, since they released the weights) and the number of tokens they trained on. With total training FLOPs and GPU hours you can calculate MFU, and the MFU of their DeepSeek-V3 run comes out around 40%, which sounds right; both Google and Meta have reported higher MFU. So the GPU hours should be correct. The only thing they could have lied about is how many tokens they trained on. DeepSeek reported 14T, which is similar to what Meta did, so nothing crazy here.
tl;dr all the numbers check out, and the gains come from the model architecture innovations they made.
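For anyone who wants to reproduce that sanity check, here is a rough back-of-envelope sketch. All the constants are assumptions pulled from public reports (activated parameters, token count and GPU hours from the DeepSeek-V3 paper as I recall them, an H100-class dense BF16 peak for the H800); the actual run used FP8 mixed precision, so treat the resulting MFU as a ballpark rather than the exact figure quoted above.

    # Back-of-envelope MFU sanity check, as described above.
    # All numbers are assumptions taken from public reports and spec sheets.
    ACTIVE_PARAMS = 37e9      # activated parameters per token (MoE), assumed
    TOKENS = 14.8e12          # training tokens, assumed
    GPU_HOURS = 2.788e6       # reported H800 GPU hours, assumed
    PEAK_FLOPS = 989e12       # assumed dense BF16 peak per GPU (H100-class, approximate)

    # Standard ~6*N*D estimate for training FLOPs (forward + backward).
    train_flops = 6 * ACTIVE_PARAMS * TOKENS

    sustained = train_flops / (GPU_HOURS * 3600)  # FLOP/s per GPU actually achieved
    mfu = sustained / PEAK_FLOPS

    print(f"total training FLOPs ~ {train_flops:.2e}")
    print(f"sustained per GPU    ~ {sustained / 1e12:.0f} TFLOP/s")
    print(f"MFU                  ~ {mfu:.0%}")

If the implied MFU lands somewhere in a plausible 30-45% band for a large training cluster, the reported GPU-hour figure is at least internally consistent.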
The issue here is not that DeepSeek exists as a competitor to GPT, Claude, Gemini,...
The issue is that DeepSeek have shown that you don't need that much raw computing power to run an AI, which means that companies including OpenAI may focus more on efficiency than on throwing more GPUs at the problem, which is not good news for those in the business of making GPUs. At least according to the market.
One of the questions about this is that of the US’s human capital, i.e. does the US (still) have enough capable tech people in order to make that happen?
Lol, yes. The US is still very much at the forefront of this stuff. DeepSeek have presented some neat optimizations, but there have been many such papers and optimizations get implemented quickly once someone has proven them out.
> The US is still very much at the forefront of this stuff
Doesn't look like it, because some of the biggest US tech companies (including Meta and Alphabet) couldn't come up with what this much smaller Chinese company has. Which raises the question: what is it that companies like Meta and Alphabet do with the hundreds of billions of dollars they have already invested in this space?
Best guess is that they were all caught up in the arms race to make a better model at whatever cost. And if you work in this space you were probably getting thrown fistfuls of money to join in. I read somewhere on Reddit that anyone trying to push for efficiency at these places was getting ignored or pushed aside. DeepSeek had an incentive to focus on efficiency because of the chip embargo. So I don't think this is necessarily a knock on US AI capabilities. The incentives were just different, and when stock prices were going to the moon regardless of how much capex was being spent, it was easy for everyone to go along with it.
With that said, I think all of these companies are capable of learning from this and implementing these efficiency improvements. And I think the arms race is still on. The goal is to achieve a superhuman level of intelligence, and they have a ways to go to get there. These new efficiency improvements might even help them take the next step, since they can now do a lot more with a lot less.
I see no reason to believe they couldn't have done so. Rather, this is the typical pattern we see across industry: the west focuses on working out what the next big thing is, and China is in a fast-follow-and-optimize mode.
> You can ban the company but are you going to ban any US company from using the open model and running it on their own hardware [1]?
Just for the people who might not have been around the last time, this has precedent :) US government (and others) have been trying to outlaw (open source) cryptography, for various reasons, for decades at this point: https://en.wikipedia.org/wiki/Crypto_Wars
The vast majority of what the US government has tried to ban was export of cryptography tools. However, as your own link makes clear, they stopped doing that in 2000.
Furthermore, what was restricted was not "open source cryptography"; it was cryptography that they could not break. Open source only comes into it because it made abundantly clear that the cat was out of the bag and there was no going back.
Please try to at least attempt to consider nuance. Do you seriously think that would happen? What is your point here? Do you think people in favor of restricting one thing are in favor of restricting everything?
People are trying to stir up "we shouldn't use Chinese AI because our data is going to be stolen" discussions. But after the TikTok debacle, no serious person is willing to bite. It's just a big coping strategy for everyone who's been saying how Western AI is years ahead.
> Please try to at least attempt to consider nuance. Do you seriously think that would happen? What is your point here? Do you think people in favor of restricting one thing are in favor of restricting everything?
The restriction on TikTok was blatantly because it's a Chinese product outcompeting American products, everything else was the thinnest of smokescreens. Yes, I think people in favour of it are in favour of slapping whatever tariffs or bans they can get away with on everything that China makes.
I don't know what the surprise is here. The human brain consumes about 20 watts, literally a rounding error compared to what ChatGPT uses, so we already knew there was plenty of room left on the table to improve.
Incidentally, I love these kinds of market crashes. I just moved a big chunk of my savings into stocks last night :). Buy and hold; don't sell during a dip lol
This is not a market crash. I don't know how old you are, but you might have been conditioned to think this way by the market's unrelenting march higher. Those of us from the dotcom days can tell you it crashes eventually, and very painfully.
Because LLMs are based on the abstract ideas of neural nets from brains. Say what you wish, but some problems were completely unsolvable before we adopted this paradigm. On some level, we must've gotten some ideas close to the right ballpark.
Curious thought: could those large price movements have something to do with the fact that DeepSeek is financed by a hedge fund (rather than the more typical VC)? It is unclear how DS will make money from its current strategy of sharing much of the secret sauce that went into training as well as releasing the results under permissive licenses. But if the play was "short major tech stocks and then release surprising results in a way that maximally undermines their current growth story", then it could make a lot more sense.
So, what are investors thinking to warrant this? If it is 'DeepSeek means you don't need the compute' that is definitely wrong. Making a more efficient x almost always leads to more of x being sold/used, not less. In the long term does anyone believe we won't keep needing more compute and not less?
I think the market believes that high end compute is not needed anymore so the stuff in datacenters suddenly just became 10x over-provisioned and it will take a while to fill up that capacity. Additionally, things like the mac and AMD unified memory architectures and consumer GPUs are all now suddenly able to run SOTA models locally. So a triple whammy. The competition just caught up, demand is about to drop in the short term for any datacenter compute and the market for exotic, high margin, GPUs might have just evaporated. At least that is what I think the market is thinking. I personally believe this is a short term correction since the long term demand is still there and we will keep wanting more big compute for a long time.
But the SOTA models basically all suck today. Even people who don't think they suck will, a year from now, look back and consider today's models unusably bad.
I recently went to the LLM chat arena and tried my "test input" against the latest frontier models that GPT 3 failed on. This test snippet simply repeats the same four-letter word in a paragraph many times using all of its various possible meanings simultaneously. The request to the AI is to put the meaning of each usage of the word next to it in brackets.
None of the frontier models can do this perfectly. They all screw up to various degrees in various interesting ways. A schoolkid could do this flawlessly.
This is not some contrived test with bizarre picture puzzles as seen in ARC-AGI or testing obscure knowledge about bleeding-edge scientific research. It's simple English comprehension using a word my toddler knows already!
It does reveal the fundamental flaw in all transformer-based models: They're just shifting vectors around with matrices, and are unable to deal with many categories of inputs that cause overlaps or bring too many of the tokens too close to each other in some internal representation. They get muddled up and confused, resulting in errors in the output.
I see similar effects when using LLMs for programming: They get confused when there are many usages of the same identifier or keyword, but with some subtle difference such as being inside a comment, string, or in a local context where the meaning is different.
I suspect this will be eventually fixed, but I haven't seen any fundamental improvement in three years.
On the contrary, this is testing the LLMs on inputs they're supposed to be good at.
Fundamentally, this kind of problem is the same as language translation, text comprehension, or coding tasks. It just tests where the boundaries are of the LLM capabilities by pushing it to its limits.
I've noticed the LLMs bumping up against those very same limits in ordinary coding tasks. For example, if you have a prefix-suffix type naming convention for identifiers, depending on how the tokenizer splits these, the LLMs can either do very well or get muddled up. Similarly, they're not great at spotting small typos with very long identifiers because in their internal vector representations the correct and typo versions are very "close".
That's a known thing that would be in its training set.
I just made up my own thing that no AI model would have seen anywhere before.
It's pretty easy to create your own, just pick a word that is highly overloaded. It helps if it is also used as proper names, business names, place names, etc...
Selling more does not necessarily mean you make more money. More efficiency could lead to lower margins even if volume is higher.
Moreover, even if things get incredibly efficient, the bar for sufficiently good AI in practice (e.g. in applications) might be met with commodity compute, pretty much locking out Nvidia, which generally sells high-margin, high-performance chips to whales.
All the references to Jevons paradox fail to account for three things:
1. There’s no good forecasting model to account for how aggregate demand moves as a function of efficiency gains in this space
2. Aggregate demand isn’t the same as Nvidia’s share of market, which could drop if alternative paradigms for training or inferencing gain traction
3. Forecasting time horizons matter for discounted cash flow/valuation calculations, which nobody has a good basis for
IMO, there’s just a lot of uncertainty, and it’s fair for the market to discount the optimistic trajectory aggressively based on net new info.
What I’d like to know is..
If a good model can be trained with far fewer GPUs using a breakthrough technique, can that technique be used by OpenAI, MSFT et al., who have loads of GPUs, to train a model that is orders of magnitude better than their state of the art today?
We’ve been getting the impression that the limiting factor was the number of GPUs right? If so, this reduces that bottleneck and frees them up to do even better right?
I've been into investing for my entire adult life, and base my strategy mostly on John Neff's work on total return.
I have missed out on a lot of investments in the QE period, because many of them seemed like "if this mid-level company becomes the biggest company in the world, you'll make a reasonable return," which has always seemed insane to me, but we've seen it happen again and again. I realize that we are probably in a place where insider trading is much more prevalent than we expect, and that the point of an IPO has been turned on its head, but these types of potential blowups of high-P/E stocks are something I've never really come to terms with.
Greater efficiency of light bulbs has led to more light bulb use, not less. More efficient training of LLMs could just as likely lead to more chip use, not less.
(For LLMs I wish that efficiency could lead to less electricity used for chips, but I think the best we can hope for is for electricity use to flatten out.)
From what I can tell there are mostly two options: either AI is and will remain useless, or it's severely undersupplied. People, even deeply technical ones in the areas where AI has the most impact right now, still widely argue about whether AI offers any value at all. Adoption is nowhere near what would be plausible if (not when) it becomes clear that it does.
If you land on "does not", given the investments so far, commercial entities would obviously be overvalued already and any investment goes to 0 over time.
But if we land on "does", how could Nvidia be anything other than undervalued right now? No matter the frontier model: I can look at my screen and watch LLM-generated characters appear in chunks, sometimes after an initial 10-20 second wait, even for benign queries, because that is the best we can do right now. And that's while most people still argue about whether AI will actually do anything, and humanity at large does not really use it, personally or societally.
If AI does in fact do something valuable and that something gets better, everyone will want it and there will be demand for lots of chips.
Across 300 years of lighting technology, decreases in cost consistently led to much larger increases in light use, every time. Only in about the past 50 years has the increase in light use started to fall behind the drop in costs.
The people writing market commentary are simply making it up. The news about DeepSeek is not new and doesn't reduce the value of ASML. People are selling now because they are scared because the number went down.
Yes! and this applies to all market commentary. Market goes down .7% and you have talking heads saying "fear of tariffs" "middle east tensions" "Hurricane season" whatever, next day market goes up .6% "talk of tax cuts" "Jobs numbers" whatever. There's no way anyone knows why "the market" behaves the way it does. The free market is the OG "Decentralized" project, it's 1 billion different decisions being made in a day each with their personal reasons. Yes, sometimes it's fairly obvious that something caused it (plane blows up, stock goes down) but that still doesn't explain the entirety of it
Andrej Karpathy was tweeting about DeepSeek a month (!) ago.
"DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M)."
Yes exactly. The actual impetus this time was the article I posted here and how it got echoed and amplified by massive X accounts like Chamath and Naval.
ASML paper value is determined by equipment sales from projected compute supply/demand. CHIPS building redundant global fabs = glut from more excess capacity = less future sales. Stargate = excess demand from everyone spending 100s of billions of compute = need even more fabs = more future sales. Then DeepSeek = suddenly no need for that much future compute... if number of future compute demand relative to short term fab overcapacity is going down, then it's reasonable to sell. Relatively predictable Semiconductor market cycles due to cost of capex and time to build fabs / increase new wafers output to match future demand is a thing.
This would be an excellent explanation if DeepSeek had announced its model over the last weekend rather than weeks ago, and if R1 wasn't a COT reasoning model which needs a lot more inference time compute than other SOTA models like llama.
Information lag, especially with respect to PRC developments and technical developments. Taking 1-2 weeks for info to be shared and passed down the info chain is not unusual. IIRC CoT typically increases inference cost 1-5x depending on task complexity, i.e. instead of scaling down compute demand by 50x, it's 10x, which is still substantial. Could investors be panicking? Sure, but there's a rational basis for doing so.
DeepSeek V3 is a 671B parameter MoE model? I am not sure why it's 50x cheaper at inference time than other models. We don't know what the cost of running o1 is, but I doubt it has 50x as many params as R1. Most of the advantages of MoE shrink once you use reasonable batch sizes, so that wouldn't make R1 cheaper in practice either. I think people might be seeing a lower markup from DeepSeek and confusing it with cheaper inference?
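One piece of the puzzle, though, is that per-token inference compute for an MoE model tracks the activated parameters rather than the total count. A minimal sketch of that arithmetic follows; the 37B-activated figure is from the public V3 report, the 405B dense model is only a familiar reference point for scale, and the ~2 FLOPs-per-active-parameter rule ignores attention/KV-cache and memory-bandwidth costs, which often dominate in practice.

    # Rough illustration, not a cost model: MoE inference compute per token
    # scales with *activated* parameters, not total parameters. Figures are
    # assumptions (DeepSeek-V3 reports ~37B activated of 671B total).
    def flops_per_token(active_params: float) -> float:
        # ~2 FLOPs (multiply + add) per active weight per generated token
        return 2 * active_params

    moe_active = 37e9    # DeepSeek-V3/R1 activated params (assumed)
    dense_big = 405e9    # a Llama-3.1-405B-class dense model, for scale

    ratio = flops_per_token(dense_big) / flops_per_token(moe_active)
    print(f"MoE:   {flops_per_token(moe_active):.1e} FLOPs/token")
    print(f"Dense: {flops_per_token(dense_big):.1e} FLOPs/token (~{ratio:.0f}x more)")

So roughly a 10x per-token compute gap is explainable from the architecture alone; the rest of any 50x price gap is more plausibly margin, batching efficiency, and serving-stack differences.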
If DeepSeek is a side/pet project for PRC quants who are fine with subsisting on low markup, then that's the market price competitors have to calibrate their return on investment and future capex against. DeepSeek also appears open and performant enough to drive cheaper inference on commodity hardware, with very different margins for a variety of use cases, including existing hardware. At least in the short term there's going to be a share of LLM use cases that has to compete on China prices or on previously written-off, depreciated hardware. IMO the snowballing interest is undoubtedly getting investors to pay attention and dig into related developments, e.g. Bank of China's 1T AI fund, and DeepSeek's CEO met the PRC premier just a few days ago. AFAIK DeepSeek has never gotten this much domestic political attention before; they're potentially going to be elevated as a domestic champion, which is likely to open a lot more doors, i.e. significantly more compute. Hard to tell how that will affect Western business models in the short term.
Not sure I understand how exactly open source plays a key role here in terms of project development.
Looking at https://github.com/deepseek-ai, those repos have a bunch of contributors, but unless I'm wrong I don't see any significant contributions. What am I missing?
One could argue that Meta and other companies releasing open weights and detailed research is what got us to where we are now with R1. Even if it wasn't your race car that crossed the line first, everyone can now get a copy.
What quality of content can Meta theoretically mine from all the vast data they collect? They have good advertising and content algorithms, but does that approximate general intelligence? And the content you see on Facebook, Instagram and WhatsApp is just a lot of average content, which might actually be great for AGI.
Agree. Apple should be as well. The only con I can think of would be their (Meta's) data center investments but seems this will make them more efficient?
I'm not so sure about that. Deepseek puts their LLM (Llama) even further behind. It's basically at the back of the pack, signaling to the market that they don't have the top minds in the industry on board. Second, I'm not sure how a massive trove of misinformation is of much use or how it's of more use to them than it is to others. Can you elaborate on that?
Nvidia profits during AI madness last year = $65B (last 4 quarters)
Nvidia profits during normal year = $5B
This stock could drop 90% from here, and still be expensive. The numbers are absolutely crazy and make no sense at all.
The Stargate project is aiming to invest $500B over 4 years. Those $500B are a pipe dream, but let's suppose for a second that all of that $500B will be Nvidia profit, and that we will have another Stargate project in 4 years, again resulting in $500B of direct profit for Nvidia.
And you know what ? In that scenario Nvidia would STILL be overvalued by historical standards !
EDIT: Changed $30B from the fiscal year, to $65B for last 4 quarters.
You're right about the profit - I took last fiscal year. Still, it doesn't change anything in what I wrote.
What you just wrote is "IF the biggest companies on earth, and the US government decide to spend all of their money on a single chip maker, then you could get to 200B profit rate in 2026". I won't disagree with that.
If you take a longer time frame, these PE ratios are indeed pretty high. Even 24 is much higher than the ~15 historical median https://www.multpl.com/s-p-500-pe-ratio
Also, what he's saying is that if 500/4 = $125B were NVDA's yearly profit (and of course really that would just be revenue, not profit), it'd still mean NVDA should be more like a $1875B market cap at a more reasonable 15x price-to-earnings ratio. If I understand the previous poster correctly.
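A minimal sketch of that arithmetic, with a couple of alternative multiples thrown in for comparison; the $125B/yr figure is the hypothetical "all of Stargate flows straight to Nvidia's bottom line" scenario from the comments above, not a forecast.

    # Implied market cap under the hypothetical $500B-over-4-years profit stream.
    annual_profit = 500e9 / 4          # $125B per year, purely hypothetical
    for pe in (15, 25, 40):            # illustrative price-to-earnings multiples
        print(f"P/E {pe:>2}: implied market cap ~ ${annual_profit * pe / 1e12:.2f}T")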
Agreed, it's very strange to me that this is being framed in the media as "CHINESE DEEPSEEK DESTROYS US STOCK MARKET". It's a good model, but not so wildly good that it suddenly destroys Nvidia somehow. The stock was (and remains) incredibly overvalued.
This smells a lot to me like someone with deep pockets is looking to get a bailout by framing this as some kind of Chinese threat.
Any time the entire media suddenly agrees on a very strange framing of a story you should immediately be suspicious.
There is an ongoing debate about these companies drawing direct power from private plants vs going through the grid, but I can't see why big tech won't win in the end, especially in today's environment of deregulation.
The Stargate project does not actually exist. It is just something they made up by lumping together data center investments from across the industry and rounding up to the nearest $500B.
Are you suggesting "AI madness" is a one time event that will not continue? AI is dramatically improving productivity and not just for developers. The "madness" will continue until every job is fully automated. That requires a lot of chips.
It is repeating this year as well, given all the announcements. So maybe it is a two time event? I think it could repeat every year as long as Nvidia continues to innovate and keep their lead.
DeepSeek is providing an efficiency boost, and that doesn't kill Nvidia. In fact, Nvidia themselves delivered an efficiency boost with Blackwell chips. DeepSeek is a one-time efficiency boost, but Nvidia will continue to boost efficiency every year, based on Moore's law.
I found the Stratechery and Nadella's takes to be interesting. Also the fact that news orgs say things like:
"The situation is particularly remarkable since, as a Chinese company, DeepSeek lacks access to Nvidia’s state-of-the-art chips used to train AI models powering chatbots like ChatGPT."
Whereas they don't mention the fact that DeepSeek still used Nvidia chips; news orgs are implying they didn't.
Stratechery points out that the reduced memory/communications bandwidth of the gimped H800 chips they have access to drove the MOE/MLA architecture developments to make their model possible on the less powerful chips.
Nadella (on X) points out that by Jevons paradox [1], AI usage (and Nvidia chip usage) will increase because DeepSeek has reduced costs.
One other point Stratechery made is that DeepSeek likely distilled the output of leading models for V3 and R1. They have shown that they can replicate the leaders cheaply and quickly, but they can't (yet) produce a leading model without copying.
What people are neglecting about Jevons is that fuel efficiency led to increased use of coal because you could use coal to displace more expensive fuels.
I wonder what more expensive thing cheap large language models can displace. Is the problem with selling unreliable B.S. really its price?
Is it possible that the key here is all about where top talent goes? America's best minds from the math and informatics olympiads gravitate to Wall Street for financial gain. Similarly, DeepSeek's top talent are mostly IMO and IOI medalists, and they originally aimed for finance too (DeepSeek's parent company is a successful quant trading firm), but strict government regulations pushed them towards AI instead. Ironically, this unintended pivot turned out to be quite significant.
I think part of the problem was many/most previous models coming out of China were trained on eval data to cheat in the rankings. Quite an uphill battle for Deepseek.
Assuming that is true (and I have no reason to believe otherwise), it only gave the average reader an excuse to dismiss the work without reading more than the title.
You can see a similar bias in academia with work originating outside EU/USA.
Before someone reads something strange into this, I can only tell you I'm not Chinese but Argentinian :)
It's funny that the bubble seems to be popping because the tech can be made better and cheaper, rather than from any loss of enthusiasm for the tech itself. AI is here to stay then, I suppose.
Meta should actually go up from this. If DeepSeek is perfect, they don't need to pay for an expensive Llama team. And even if DeepSeek isn't perfect, the low-cost training strategies it introduced could be used by Meta to reduce the cost of Llama training.
Although Meta develops models they don't sell them. So a world where foundation models are free is fine for them.
From Meta’s perspective, AI could be incredibly profitable in the context of generating adverts or interactive chat bots for businesses.
They just don’t want to use OpenAI/Google models because they fear being screwed over by them with anti-advert terms of service or price increases. Similar to what they suffered with Apple.
It's like everyone forgets that App Tracking Transparency (ATT) was supposed to put Meta out of business. By many accounts, Meta's ad targeting is even better now than before ATT. It's been reported that AI is what saved their ad targeting.
The OSS goodwill is just a side effect and a way to undermine companies who are not using AI to effectively make profits today.
Cheaper/more efficient is absolutely great for Meta. If they can lower their capex it would be an instant bump to their bottom line.
Thanks, but I don't really see how these articles support the claim that their ad network is more efficient. As you note, the first article has a single anecdata point about it actually being 10-15% worse, while the second one basically says 'trust me bro'. Also of note from the second article is the fact that the ad spend would actually increase.
Of course, if businesses are gullible enough to believe facebook when it fudges up some brand lift metrics without having a real impact on conversions, that's their choice. Trusting facebook to report any analytics is how you take your business behind the barn and help it pivot to video. https://en.wikipedia.org/wiki/Pivot_to_video
No, it seems to me like a stopgap measure against others, rather than anything for themselves. If they were out after "open source goodwill" they'd actually release the models as open source (like let people use it without signing a license/terms agreement, and use models for whatever). As it stands today, they're tricking people into "open source goodwill" but it will eventually catch up with them.
But now (presumably) they don't need to spend 50 billion? e.g. 5 billion or whatever might be enough which makes it even easier for them to justify this.
I don't follow. Meta has been the only US big dog that released open-whatever variants of their models. They did that intending to minimise the gap between them and other big dogs. Their stated goal is to give open access to the community, while at the same time develop the models for internal uses (on their many platforms).
Meta doesn't sell API access. They are not losing on "cheaper" anything. If anything, they get to implement whatever others release under open terms into their stacks. And they still have all the GPUs to further train and serve on whatever improved stack comes next.
I don't see how meta loses here. In fact I think it is one of the only big players in this space that will come out better.
Meta's main use of AI is in their own products, I don't think they should be affected. I would be more worried about companies that want to sell AI models and are not being efficient.
When put into context one could say the OG overreaction was thinking 500B USD in infrastructure investments and restarting nuclear plants were rational decisions rather than working on efficiency.
Distillations have been showing for a while, though, that you don't necessarily need big hardware for decent inference. If that's the case, you can run it trivially on your MacBook or even your smartphone.
1) to address frontier model company stock valuations (openai for instance): deepseek is creating stuff months after frontier companies. In the AI arms race it might make sense to burn billions to be 3 months ahead (think about how you use that superintelligence to prevent anyone else acquiring it)
2) to address Nvidia's valuation: there is no cap on demand for intelligence (or at least we're nowhere near it). People will never be satisfied with the intelligence achieved and just stop asking for more. So Nvidia will still sell the hardware, as the demand side is uncapped.
Unrelated note that I was considering and would like an opinion on: Nvidia is the software play in AI, and TSMC the hardware play. Nvidia has competitors like broadcom/AMD/TPUs but beats out on software. TSMC will be frontier on manufacturing everyone's hardware.
I think the big picture that many people are missing here is the motivation all these AI/tech companies have for buying up so many GPUs in the first place: achieving AGI/ASI.
And while some still try to portray a dedication/duty to AI Alignment, I think most have either secretly or more publicly moved away from it in the race to become the first to achieve it.
And I think, given that inference-time compute is so much cheaper than pre-training, the first to achieve AGI might have enough compute on hand from having been the first to build it that they would not need to purchase many more GPUs from Nvidia. So at some point, Nvidia's revenues are going to decline precipitously.
So the question is: how far away are we from AGI? Seems like most experts estimate 3-10 years. Did that timeline just shrink by 50x (or at least by some multiple) from these new optimizations from DeepSeek?
I think Nvidia's revenue is going to continue to grow at its current pace (or faster) until AGI is achieved. Just sell your positions right before that happens.
It doesn't feel like DeepSeek has a big enough breakthrough here. This is just one of many optimizations we're going to see over the next years. How close this brings us to "AGI" is a complete unknown.
The large investments were mainly for training larger foundation models, or at the very least hedging for that. It hasn't been that clear over the last 1+ years that simply increasing the number of parameters continues to lead to the same improvements we've seen before.
Markets do not necessarily have any prediction power here. People were spooked by DeepSeek getting ahead of the competition and by the costs they report. There is still a lot of work and some of it may still require brute force and more resources (this seems to be true for training the foundation models still).
Not that I believe it's likely to happen, but it seems incredibly fucking dangerous for there to be an ASI race with one winner. To the extent these companies believe it's possible, what are they hoping will be the outcome for humanity?
That they get to be the trillionaires with an untouchable moat? Wouldn't this be like creating a Kwisatz Haderach thinking you can control it, to borrow a Dune reference?
The key question is: has demand elasticity increased for Nvidia cards? An increase in elasticity means people are more willing to wait for hardware price to drop because they can do more with existing hardware. Elasticity could increase even if demand is still growing. Not all growths are equally profitable. Current high prices are extremely profitable for Nvidia. If elasticity is increasing future growth may not be as profitable as the projection from when Deepseek was relatively unknown.
The vast majority of NVDA's value is based on the assumption they are the only game in town that can do AI. I'm still waiting for a competing tech to disrupt them further:
* Intel, AMD, etc. could start making competitive GPUs that push the price down
* A new ASIC chip specifically designed for LLMs
* A new training or LLM runtime algorithm that uses the CPU
* Quantum chip that can train or run a LLM
If Nvidia lost its AI dominance, where would its stock be?
The thing with Nvidia is that it doesn't have a large "sticky" customer base that is guaranteed to spend money year after year on new products. If you look at other large tech companies with similar valuations (Apple, Microsoft, Amazon, Google, Meta), none of them are in danger of their core business disappearing overnight. In Nvidia's case, if large tech companies decide they don't need to continue loading up on AI chips and building larger data centers then they are back to where they were in ~2020 ($100-150B market cap from selling GPUs to gamers and professionals working on graphics-intensive apps).
Not long ago Sam Altman was talking about how they were losing money even on the paid version of ChatGPT. Those incoming price hikes are going to be difficult to sell now.
A huge leap in the cost effectiveness of AI capabilities only paves the way for faster timelines to ASI. I'm not sure why that would reduce the economic value of Nvidia. Pretty sure this is a reaction in the wrong direction. Nvidia should be popping.
I'm not convinced that this is more than the market having a jump-scare at an incremental leap in model building technique.
Say DeepSeek has worked out how to do more with less - that's great! I don't think it means that the market for Nvidia's silicon (or anyone else who can hop the CUDA moat) is going to shrink. I think that the techniques for doing more with less.. will be applied to _more_ hardware, to make even bigger things. AI is in its infancy, and frankly has a long way to go. An efficiency multiplier at this stage of growth isn't going to reduce the capital needed, it will just make it go further. There may be some concern about scaling the amount of training material, but I don't see that as the end of the road at all. After all, a human's mental growth is hardly limited by the amount of available reading material. After all written material is trained upon, the next frontier will just be in some other mimicry of biological metacognition.
The $500 billion data center can now be $50 billion. That is excellent news, unless you were the company that was expected to sell $450 billion of GPUs with a 95% gross margin to that project.
$50 billion can be afforded by WAY WAY more sites and companies, so Nvidia will simply deliver to >10x more data centers. Or instead of shipping 100k GPUs to Meta, they will ship 100k GPUs to 10 different customers.
For Nvidia it is great news, because now, finally, the concentration of GPUs at hyperscalers will end and every Fortune 500 company can get its own local data center to train its AI models.
Because if training AI models becomes more efficient and easier, then the ones in that business are at risk, so basically Big Tech. Nvidia isn't in that business but in the business of providing the tools to train.
Fortunately, Big Tech can easily do something to prevent ANYONE from competing: simply buy all available GPUs. Oh wait, haven't they been doing that for years? Exactly!
People really don't get what an arms race and market competition race is.
How do I prevent disruption? I simply buy all the tools the competition needs to disrupt me.
See, if Fortune 500 companies want to build large data centers but can't because the hyperscalers buy up all the GPUs, then eventually they will rent from the cloud, since otherwise they can't get the GPUs at all.
> $50 billion can be afforded by WAY WAY more sites and companies
Spending $50 billion to do $6 million worth of AI training seems like a good way to trigger a golden parachute and "spend more time with your family" as a CEO.
Inference can be more easily offloaded to other kinds of processing units, which also probably are more efficient, like TPUs.
That makes both NVDA stock and big AI infrastructure spending less compelling, as those needs are scaled down via software efficiency and chip alternatives.
The hypothesis I have is that China has way more compute resources than they are willing to share.
Compute resources they officially should not have access to given the export bans, where acknowledging them might get their export-ban workaround rolled up.
Maybe. They're under no obligation to tell the truth about this.
Hypothetical: take a large short position on NVDA, then announce to the market that you trained a massive model without using tens of millions of dollars of rare-as-hen's-teeth NVDA-sensitive resources. Settle the position, then quietly settle the giant compute bill. Difficult to know either way, but the market seems to have taken the team at face value. I guess we'll know if and only if this reduced-cost training methodology is replicated.
Do models with the DeepSeek architecture still scale up?
If yes, then bigger clusters will outperform in the near future. Nvidia wins as the rising tide lifts all boats, theirs first.
If not, then it's still possible to run several models in parallel to do the same, potentially big, job, just like a human team. All we need is to learn how to do it efficiently. This way bigger clusters win again.
I don't think there has to be an AI bubble, but valuations overall have to come down to something in accordance with the interest rates and expected long-term profit rates.
A couple of things caught my attention in the context of the reactions from the AI community and the investor community, though maybe I'm not fully read up. First, this seems to have been a surprise even among researchers, yet once released it was very transparent and open. Second, it was funded through a hedge fund.
Nvidia doesn't have a monopoly on GEMM and GEMV. There will be dozens of hardware vendors. It is TSMC that should be the most valuable company in the world.
Certainly one of the most valuable, but I would still say TSMC, as there are lots of other steps in production besides photolithography (etching, ion implantation, vapor deposition, packaging, ...).
Think that was the cause of their stock increase? I feel like investors use opportunities like this to pile money into safer bets rather than just bail on stocks altogether.
I just want to know if I can buy a gaming video card at a reasonable price or if I should hold off. I don't care about the AI shit. And yes, I'd prefer Nvidia, because their closest competitor can't tape a box together, never mind develop and assemble a graphics card.
I don't quite understand the sentiment. Lower training and inference costs mean more companies can join the game, and therefore more demand for the hardware. Basically Jevons paradox in play. What's more likely to fall in the long term is OpenAI's valuation, if they don't have other killer products.
IMO it's a reaction to the short-term semi cycle: a lot of reshored/redundant fab expansion over the last few years outside of TW. If compute costs go down 10-50x there may not be enough use cases to fill the Jevons gap in the next infra hardware cycle of ~5 years. People are doing more with the same hardware, but maybe not sufficiently more to justify acquiring more hardware in the near term.
Deepseek should cause Nvidia and TSMC stocks to go up, not down. I'm buying more Nvidia and TSMC today.
1. More efficient LLMs should lead to more usage, which means more AI chip demand. Jevons paradox.
2. To build a moat, OpenAI and American AI companies need to up their datacenter spending even more.
3. DeepSeek's breakthrough is in distilling models. You still need a ton of compute to train the foundational model to distill.
4. DeepSeek's conclusion in their paper says more compute is needed for the next breakthrough.
5. DeepSeek's model is trained on GPT4o/Sonnet outputs. Again, this reaffirms the fact that in order to take the next step, you need to continue to train better models. Better models will generate better data for next-gen models.
I think DeepSeek hurts OpenAI/Anthropic/Google/Microsoft. I think DeepSeek helps TSMC/Nvidia.
That's a rational stance. However, the Buffett Indicator is flashing red, we're in a dotcom-era-sized bubble, and it only takes a little bit of ill-founded worry to kick off a serious panic.
But why would an AI breakthrough cause stocks to go down? It should cause it to go up. It means we should expect even more breakthroughs, better models, more LLM usage, etc.
You're looking at it from economic theory, not from the stock market's perspective. Nvidia's insane valuation was based on an almost exponential increase in demand for more and more of its chips; it's priced in that Nvidia will continue that trend. DeepSeek proves that trajectory is no longer needed (not that it was ever grounded in anything rational), so anything less than continued exponential growth sends the stock down.
Ugh. No, we can now run a decent model on a CPU, not an expensive video card.
Just try out the standard deepseek-r1 or even the deepseek-r1:1.5B through ollama.
No need for expensive hardware locally anymore. My PC (without an Nvidia card or other expensive hardware) runs a deepseek-r1:1.5b query fast enough: 2-9 seconds until it's finished.
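For anyone who wants to try the same thing, here is a minimal sketch of querying a locally served model through Ollama's HTTP API, assuming the ollama daemon is running on its default port and the model has already been pulled under the tag deepseek-r1:1.5b. The prompt is just a placeholder.

    # Minimal sketch: query a local deepseek-r1:1.5b via Ollama's /api/generate
    # endpoint. No GPU required for the 1.5B distill.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "deepseek-r1:1.5b",
            "prompt": "Explain Jevons paradox in two sentences.",
            "stream": False,   # one JSON object instead of a token stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    print(resp.json()["response"])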
The world isn't satisfied with a "decent model". The world is trying to reach AGI.
Furthermore, reasoning models require more tokens. The faster the GPU, the more thinking it can do in a set amount of time. This means the faster the hardware, the smarter the model output. Again, that reinforces the need for faster hardware.
The real hidden message is not that bigger compute produces better results, but that the average user probably doesn't need the top results.
In the same way that medium range laptops are now 'good enough' for most people's needs, medium range (e.g. DeepSeek R1x) AI will probably be good enough for most business and user needs.
Up till now everyone assumed that only giga-sized server farms could produce anything decent. Doesn't seem to be true any more. And that's a problem for mega-corps maybe?
>medium range (e.g. DeepSeek R1x) AI will probably be good enough for most business and user needs
Except R1 isn't "medium range" - it's fully competitive with SOTA models at a fraction of the cost. Unless you need multimodal capability or you're desperate to wring out the last percentage point of performance, there's no good reason to use a more expensive model.
The real hidden message is that we're still barely getting started. DeepSeek have completely exploded the idea that LLM architecture has peaked and we're just in an arms race for more compute. 100 engineers found an order of magnitude's worth of low-hanging fruit. What will other companies be able to do with a similar architectural approach? What other straightforward optimisations are just waiting to be implemented? What will R2 look like if they decide to spend $60m or $600m on a training run?
Yes absolutely. I guess I meant medium range in terms of dev and running costs. R1 is a premium product at a corner store price. :)
People are also forgetting that High-Flyer's ultimate goal is not applications, it's AGI. Hence the open source. They want to accelerate that process out in the open as fast as they can.
I don't think a "good enough" AI is happening for a while. People will see what the new models can do and want the same. So long as improvements are rapid and visible, the demand will keep rising.
Right, everyone should be focused on the rapid dismantling of the government; stuffing his cabinet to the gills with incompetent and dangerous sycophants; pardoning of violent criminals, especially those which nearly killed several officers, and the leaders of the organizations who directed the coup attempt; and the capricious way we are treating our historical allies.
It's said the AI war is a war between Chinese people on both sides, be it the US or China, be it hardware or software.
If that's true, then normally the mainland will win, as the people over there have more grit, are more eager to succeed, are working 996 schedules, and have nothing to lose. They're hard to stop as long as their government does not interfere.
H800s were made to match Biden's export restrictions. They were banned in late 2023 but a lot were sold to China. Having 2k is quite small compared to the bigger players like BABA (200k employees) and Tencent (100k employees). And those sure have access to the few H100s that were smuggled. But unlikely for a tiny company like High-Flyer/Deepseek (160 employees).
He seems adamant that there are no diminishing returns to scaling AI.
I don’t want to stir up conspiracy theories but I do think that currently all the big AI players have a vested interest in the message that the current scaling paradigm is the right one, and that this is a supremacy issue wrt China. It drives so much investment and valuation that I doubt they can truly be objective.
The issue is it feels like we came to a stop but Hyperscalers are simply waiting for Blackwell. That's all.
Why buy 100k Hoppers if 20k Blackwell offer the same compute so then it's better to buy 100k Blackwells right?
Blackwell will easily increase cluster scaling by 10x in performance, and if you buy 10x as many of them then your compute on a cluster will be 100x what it was before. If you have to wait 6-12 months for that, then so be it. You will easily make up the time in the end with the speedup.
> I don’t want to stir up conspiracy theories but I do think that currently all the big AI players have a vested interest in the message that the current scaling paradigm is the right one, and that this is a supremacy issue wrt China. It drives so much investment and valuation that I doubt they can truly be objective.
500 billion is a lot of money. Expect even crimes to be committed in order to make it happen.
So the day this happened I didn't eat all day which was a weird thing. I don't know just one of those weird coincidences for no reason. Tomorrow I won't eat all day again and see what happens.
Anyway I'm homeless so the not eating all day maybe kinda possibly having something to do with wiping out 600 billion dollars in stock market valuation just on the off chance that it might based on nothing more than wishful thinking?
Nope. Not eating tomorrow. Wonder what will happen.
I feel that one of the most fundamental aspects of business reality is being skipped in all of these conversations, on and off HN.
Everyone building an AI data center is likely using Nvidia technology. Sure, maybe 20% of the build-out partially uses other technology. The bulk of it is Nvidia.
If your project is in the planning for the next, say, two years, you have already placed your orders with Nvidia or are going to in the next few months.
Hardware has real lead times. You don't compile yourself 100K chips. They have to be made and you have to wait in line to get yours. For example, I remember when, during the pandemic, we had to place orders for chips with 40 to 50 week lead times.
This means you have to make decisions today (or you already made them months ago) to get in line.
Changes in training or inference efficiency should not change these orders or plans at all. If someone can train faster, they will benefit from the hardware in the pipeline. If they can make inference more efficient, they will be able to service more requests at reduced transactional costs.
The orders are in the pipeline and will continue to be added to the pipeline. Nvidia isn't going to be shipping half the hardware because Wall Street, overnight, panicked. What the grocery store owner does with their stock portfolio because they panic has nothing whatsoever to do with reality.
The same is true in the other direction. Wall Street has been going nuts with quantum stocks. Companies like Rigetti have exploded from nothing to see insane gains. This does not mean the company went from, well, shipping nothing to shipping real working solutions at scale.
Today's market reaction was nothing less than sheep running scared because someone went "boo!". It has nothing whatsoever to do with business realities on the ground. Go build an AI data center without Nvidia chips (or with 10x fewer chips) and see how that goes when everyone else is loading up on them.
The whole AI valuation was based on being able to rent-seek a significant chunk of all white collar salaries, with a permanent monopoly moat because nobody else would pay hundreds of billions to train models.
And yet again a cheaper Chinese product turns up and everyone loses their minds. Expect a ban incoming to preserve the AI valuations.
Banning wouldn't work imo. Now that the cat's out of the bag, with the architecture being open source, anyone can replicate their results and compete with them for a relatively small investment.
What you find with any market news is that the optimism and pessimism are always overblown. Fear and greed. It happens all the time.
Look at Linux. For those old enough to remember, there was a time where many (including Microsoft) were worried it would destroy the company (eg [1]). There were complaints about the market destruction caused by Linux. What actually happened? Microsoft is bigger than ever even though Linux is on billions of devices worldwide.
IF DeepSeek's claims are real and this stands up, all that's happened is, at worst, that the profit opportunity has simply moved (as it was always going to do). This might be bad for OpenAI and Sam Altman, but Big Tech will (IMHO) be fine.
Remember that training LLMs for chatbots, which is something people focus on, is just one narrow slice of the potential AI market. Recommendation engines, industrial/commercial applications, medicine, etc.
If there has been a software breakthrough and training LLMs now costs a fraction of what it did last year, there's now an order of magnitude more potential applications that have become economical.
Consider this: if we can now do with a model 1/10th the size what we needed a much bigger model for last week, what applications will there be for a model 10x DeepSeek-R1's size?
I'm also reminded of the invention of the cotton gin. This automated what used to be a highly manual process. At the time, there was concern this would diminish the need for slaves on cotton plantations. Instead the need exploded because cotton became so much cheaper [2].
Lastly, Stargate is largely meaningless. Companies spend a fortune on data centers. GPUs are just a fraction of that. A genuine software improvement just means you can do more with less.
My point is: don't panic. Unless you're an OpenAI investor, maybe.
Discounted future cashflows. If you buy an asset and every year it produces $100 profit for you it's worth more than $100. You're buying the ability to produce profits in the future not the profits its produced in the past. Those profits belong to the shareholders who have cashed that out already (through dividends or reinvestment).
Earnings multiples choose an arbitrary time length of one year.
What you're really trying to purchase is a machine that creates more money than it uses.
You need to guess at if that machine will do its job at an arbitrary point in the future, and how well it will do it. Those factors are only loosely correlated with current PE
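A minimal sketch of the discounting idea, for anyone who wants to see the mechanics; the discount rate and horizon below are arbitrary illustrative assumptions, not anyone's model of Nvidia.

    # Present value of a flat $100/yr cash stream, discounted at a required
    # rate of return. The 8% rate and 20-year horizon are made-up numbers
    # purely to illustrate the comment above.
    def present_value(cash_per_year: float, rate: float, years: int) -> float:
        return sum(cash_per_year / (1 + rate) ** t for t in range(1, years + 1))

    print(f"${present_value(100.0, 0.08, 20):.0f}")   # roughly $982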
No, that means they're earning enough in one year to cover their entire valuation. You want something like a 10:1 P/E, which means the next 10 years' earnings are factored in to cover the present valuation.
Who's out here buying businesses for 1x the sales revenue volume? What a silly concept. If businesses could be so cheap, you'd just double down every single year until you owned every business on the planet.
A similar efficiency event has occurred in the recent past. Blackwell is 25x more energy-efficient for generative AI tasks and offers up to 2.5x faster AI training performance overall. When Blackwell was announced, nobody said "great, we will invest less in GPUs". DeepSeek is just another efficiency event. Like Blackwell, it enables you to do more with less.
Seems bad! To me this highlights a failure in the US tech industry. Silicon Valley is theoretically a triumph of entrepreneurship, where the best ideas and best thinking win. In reality, investors picked OpenAI as the winner right out of the gate, gave it more funding than most of us can imagine so that it could dominate the market, sat back, and told themselves job done.
Meanwhile in another tech industry a startup had to think lean and innovate its way around resource restrictions. And OpenAI looks like a mess now.
And this is why competition is a good thing. Silicon Valley keeps trying to create monopolies for the purpose of maximal profit extraction to the detriment of American technological development, or they use money and brute force as a substitute for actual innovation and get worse results.
DeepSeek R1 just uses crappy PPO ("GRPO" is just using the Sharpe ratio as a heuristic approximation to a value function) on top of distilled existing models, with tons of pipelining optimizations manually engineered in. I don't see this making leading-edge research any less expensive; you won't get a "smarter" model, just a model that has a higher probability of giving an answer it could already give. And if you want to try to do something interesting with the architecture, the pipelining optimizations now slow down your iteration capability heavily.
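For readers who haven't looked at the papers: the "Sharpe ratio" remark refers to how GRPO (as described in DeepSeek's reports) replaces a learned value function with a group-relative baseline, i.e. each sampled completion's reward is normalized by the mean and standard deviation of its group. A minimal sketch of that advantage computation, with toy rewards, follows; the full method also includes a PPO-style clipped update and a KL penalty, which are omitted here.

    import numpy as np

    # Toy sketch of GRPO's group-relative advantage: sample G completions per
    # prompt, score them, and normalize each reward by the group mean/std
    # instead of using a learned critic. Rewards here are made-up 0/1 scores
    # (e.g. "did the math answer check out").
    def group_relative_advantages(rewards, eps=1e-6):
        r = np.asarray(rewards, dtype=float)
        return (r - r.mean()) / (r.std() + eps)

    rewards = [1.0, 0.0, 0.0, 1.0]             # 4 sampled answers to one prompt
    print(group_relative_advantages(rewards))  # correct answers get positive advantage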
The RL techniques present will only work in domains where you can guarantee an answer is right (multiple choice questions, math, etc.). It doesn't really present any convincing leap forward in terms of advancing the capability of LLMs, just a strategy for compute efficient distillation of what we know already works. The fact this shitty PPO proxy works at all is a testament to the fact that DeepSeek is bootstrapping its capability heavily off of the output of existing larger models which are much more expensive to train. What DeepSeek R1 proves is you can distill a ChatGPT et al. into a smaller model and hack certain benchmarks with RL.
If you could just do RL to predict the best next word in general, this would have been done already - but the signal-to-noise ratio on exploration would be so bad you'd never get anything besides infinite monkeys at a typewriter. It's not a novel or complicated idea to anyone familiar with RL to try to improve the probability of things you like, and whoever decided to do RLHF on an LLM surely thought of (and did) regular RL first - and found it didn't work very well with whatever pretrained model and rewards they had. It was like two weeks ago that people were going crazy about o3 doing ARC-AGI by running the exact same kind of traces R1 is doing in "GRPO", just at test time rather than train time. Doing this also isn't novel, and it also only helps on shitty toy problems where you can get a number to tell you good vs bad.
There is no mechanism to compute rewards for general purpose language tasks - and in fact I think people will come to see the gains in math/coding benchmark problems come at a real cost to other capabilities of models which are harder to quantify and impossible to generically assign rewards to at internet scale.
To explore the frontier of capability you will still need a massive amount of compute; in fact, even more to do RL than you would need for standard next-token prediction, even if the LLM might have fewer parameters. You also can't afford to do all the optimizations as you try many different complex architectures.
What people are betting on by selling is that there will be a short-term pause or deceleration in investment while companies try to squeeze more out of their current batch of hardware; there could be some efficiency catch-up.
1) Their initial AI offerings weren't real products customers would use or pay for
2) They weren't seeing sufficient adoption to justify the expense
3) They have insane levels of distribution in their existing product lines and can incrementally add AI features
This is entirely orthogonal to whether or not other startups can build AI-first products or whether they can position themselves to compete with the giants.
Wow, I literally bought more Nvidia shares last week. Just goes to show the stock market is 80% gambling. My belief in the value of a company is overshadowed by the hype of growth and "future valuation".
If this is gambling, what isn’t gambling? Sometimes things just don’t go the way you want.
A Chinese company coming up with a cheaper alternative to a cutting-edge technology out of nowhere, is an outcome that is hard to predict.
In hindsight, betting on Nvidia maintaining its monopoly on a resource crucial for such an important technology as AI, might not be the best of ideas, but then again, who knows.
The next episode in this is Chinese companies producing HBM3 (they are already shipping HBM2), 7nm-class GPU products from multiple vendors competitive with the A100, and EUV. The US will realize what a monster it created.
So I was aware of the basic story here from late yesterday, but I only actually looked at the numbers today. Focusing on the dollar figure here is misleading.
Basically, if you look at only the stories, you get a dire picture -- like, "how will they make payroll????"
However, if you look at the actual share price in context, you see . . . a not very interesting event.
The graph definitely shows a share price adjustment, but it by no means erases the run-up Nvidia has enjoyed in the last 2 years. All that happened was the stock dropped back to its price from about the middle of last summer.
This may be old news to most of you, but: in markets we also track a figure called the P/E ratio, which is the ratio of the price of the company to its profits. Old-school manufacturing firms -- what used to be "blue chip" stocks -- would be in the 18-22 range here. Apple, which absolutely PRINTS cash, is at a high-flying 38. NVidia's P/E is still 58.8.
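To make the arithmetic concrete, here is the ratio with hypothetical round numbers (assumed purely for illustration, not taken from any filing; they're chosen only to land near the ~58 figure quoted above):

```python
# Hypothetical round numbers, only to illustrate the P/E arithmetic.
market_cap = 2.9e12      # assumed company value in USD
annual_profit = 50e9     # assumed trailing-twelve-month net income in USD
print(market_cap / annual_profit)  # ~58: you pay ~$58 today per $1 of current annual profit
```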
The tl;dr here is that NVidia is still valued very, very highly. They're still what, the 3rd most valuable company in the world, with an enviable position in chipmaking. They still make money hand over fist. The weird part is that their firm is SO valuable that they could take a $600B haircut and have it be almost a non-event.
Nothing, and I think that was the point GP was making. Since not much of Apple's valuation is tied to AI hype they won't suffer (at least not close to as much) from the bubble bursting.
Nothing. Apple (and Meta) are not directly impacted by AI even if the cost of training these AI models falls and $0 free models get better.
It actually affects the frontier AI companies (OpenAI, Anthropic, etc) who directly make money from their closed models AND spend hundreds of millions on training these models.
Why pay $3 per million tokens (Claude 3.5 Sonnet) when DeepSeek R1 offers $0.14 per million tokens, the model is on par with OpenAI o1, and R1 itself is released for free?
$0 free AI models are eating closed AI models' lunch.
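For a rough sense of scale, here is that price gap applied to a hypothetical workload (an assumed one billion tokens per month; real bills also depend on input/output splits, caching, and pricing tiers, so treat this as an illustration only):

```python
tokens_per_month = 1_000_000_000                 # assumed workload
claude_sonnet = tokens_per_month / 1e6 * 3.00    # $3.00 per million tokens (quoted above)
deepseek_r1   = tokens_per_month / 1e6 * 0.14    # $0.14 per million tokens (quoted above)
print(f"${claude_sonnet:,.0f} vs ${deepseek_r1:,.0f} (~{claude_sonnet / deepseek_r1:.0f}x)")
# -> $3,000 vs $140 (~21x)
```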
Good time to buy then, I don’t understand how stupid some traders can be.
A more efficient model is better for NVIDIA not worse. More compute is still better given the same model. And as more efficient models proliferate it means more edge computing which means more customers with lower negotiating power than Meta and Google…
This is like thinking that if people need to dig only 1 day instead of an entire month to get to their nugget of gold in the midst of a gold rush the trading post will somehow sell fewer shovels…
While everything you say may be true, this shows a fundamental misunderstanding about how the modern stock market functions. How much value a company creates is at best tangential and often completely orthogonal to how much the stock is worth. The stock market has always been a Keynesian beauty contest, but in the past few decades it has been strongly shaped and morphed by attention economies. A good example of this is DJT, a company which functionally doesn't do much of anything, but has traded at wildly differing prices over the last year. P/E, EBITDA, etc. are all useless metrics in trying to explain this phenomenon.
In other words, NVIDIA is in the red not because the company is suddenly doing worse, but because traders think other traders think it will trade down. That is a self-fulfilling prophecy, but only so long as there is sufficient attention to drive it. The same works the other way around as well: so long as there is sufficient attention to drive the AI hype train upwards, related stocks will do well too.
I think the message everyone now accepts is: "there is no moat". It is plain stupid to think big models can be magically copy-protected - they are simply arrays of numbers, and all the components one needs to create such arrays are free and well established. This is unlike the whole infrastructure, processes, social connections, hardware, and storage one needs, say, to recreate a service like YouTube or Facebook. Large models are different - you don't need all of that - the future of LLMs is open source, like Linux.
You can buy the dip if you want, as long as you're aware that you're not betting that "stupid" traders are undervaluing NVIDIA's fundamentals. Rather, you're betting that "stupid" traders will again rally NVIDIA's share price significantly above this dip, and you will be a smart trader who will know when to sell what you bought. Good luck.
And not just that, but even if AI's future is indeed as bright as the hype says (i.e. NVIDIA's fundamentals are solid and the market will eventually acknowledge that after the fluctuations), they may still be wrong about the timeline.
In the .com bust you could have "bought the dip" in the early 00s right after the crash started and still waited 5 years before you weren't in the red, even on "good" (in hindsight) stocks like Amazon, eBay, Microsoft, etc. The big hype there was eCommerce - it turned out to be true! We use eCommerce all the time now, but it took longer than predicted during the .com boom (same for broadband internet enabling the "rich web experience" - it came true, but not fast enough for some hyped companies in '00).
And if you bought some of the darling stocks back then like Yahoo or Netscape that ended up not so great in hindsight you may have never recouped your losses.
It's not just the usual herding behavior though. There's a convex response to news like this because people look at higher order effects like the growth of growth for stocks. Basically the DeepSeek story is about needing 40x fewer compute resources to run inference if their benchmarks are true. The dip doesn't mean that NVidia is now doomed, it simply means that if DeepSeek is legit, you need much less NV hardware to run the same amount of inference as before. Will the demand rise to still use up all the built hardware? Probably, but we went from a very stratospheric supply constraint to a slightly less stratospheric one, and this is reflected in the prices. Generally these moves are exaggerated initially, and it takes a bit of time for people to digest the information and the price to settle. It's an oscillating system with many feedback loops.
As someone who bought NVDA in early 2023 and sold in late 2024 I can say this is wrong.
There was never a question of whether NVDA hardware would have high demand in 2025 and 2026. Everyone still expects them to sell everything they make. The reason the stock is crashing is that Wall St believed the companies who bought $50B+ of NVDA hardware would have a moat. That was obviously always incorrect; TPUs and other hardware were eventually going to be good enough for real-world use cases. But Wall St is run by people who don't understand technology.
Loving the absolute 100% confidence there and the clear view into all the traders' minds that are trading it this morning.
If they'll sell everything they make and it's all about the moat of their clients, why is NVDA still down 15% premarket? You could quote correlation effects and momentum spillover, but that is still just the higher order effects I mentioned about people's expectations being compounded and thus reactions to adverse news being convex.
Presumably because backorders will go down, production volume and revenue won't grow as fast, Nvidia will be forced to decrease their margins due to lower demand etc. etc.
Selling everything you make is an extremely low bar relative to Nvidia's current valuation because it assumes that Nvidia will be able to grow at a very fast pace AND maintain obscene margins for the next e.g. ~5 years AND will face very limited competition.
That's literally what I wrote in my post, which the parent disagreed with. You could disagree with the part that it is because inference is now cheaper - but again I'd argue that's just a different way of saying there's no moat.
People owned NVDA because they believed that huge NVDA hardware purchases were the ONLY way to get an AI replacement for a mid-level software engineer or similar functionality.
That's basically what I wrote: "it simply means that if DeepSeek is legit, you need much less NV hardware to run the same amount of inference as before."
So I still don't understand what it is that you are so strongly disagreeing with, and I also don't understand how having owned NVidia stock somehow lends credence to your argument.
We are in agreement that this won't threaten NVidia's immediate bottom line, they'll still sell everything they build, because demand will likely rise to the supply cap even with lower compute requirements. There are probably a multitude of reasons why the very large number of people who own NVidia stock have decided to de-lever on the news, and a lot of it is simple uneducated herding.
But we are fundamentally dealing with a power law here - the forward value expectations for NVidia have exponential growth baked in to the hilt, combined with some good old-fashioned tulip mania, and when that exponential growth becomes just slightly less exponential, that results in fairly significant price oscillations today - even though the basic value proposition is still there. This was the gist of my comment - do you disagree with this?
Up until recently there was a belief by some investors that OpenAI was going to "cure cancer" or something as big as that. They assumed that the money flowing into OpenAI would 10x, under the assumption that no one else could catch up with them after that event and a lot of that would flow to NVDA.
Now it looks like that 10x flow of money into OpenAI will no longer exist. There will be competition and commoditization, which causes the value of the tokens to drop way more than 40x.
Everything above the street level and physical economy is becoming gambling.
There has always been a component of gambling to all investing, but that component now seems to utterly eclipse everything else. Merit doesn’t even register. Fundamentals don’t register.
> In other words, NVIDIA is in the red not because the company is suddenly doing worse, but because traders think other traders think it will trade down.
Well put. People need to understand that some stocks are basically one giant casino poker table. There was a comment with a link here that a lot of Nvidia buyers don't even know what products Nvidia is making and they don't care; they just want to buy low and sell high. Insert the old famous story about the shoeshine boy giving investment advice to Wall Street stock traders.
It's a reflection of expectations about the future economy. Obviously, such expectations are not always accurate because humans are quite fallible when trying to predict the future. This is even more true when there is a lot of hype about a certain product.
Yesterday's price of (say) NVidia was based on the expectation that companies would need to buy N billion USD of GPUs per year. Now DeepSeek comes out and makes the point that N/10 would be enough. From there it can go a few ways:
- NVidia's expected future sales drop by 90%.
- The reduced price for LLMs should allow companies to push AI into markets that were previously not cost effective. Maybe this can 10x the total available market, but since the estimated total available market was already ~everything (due to hype) that seems unlikely.
- NVidia finds another usecase for GPUs to offset the reduced demand from AI companies.
In practice, it will probably be some combination of all three. The real problems are not caused for the "shovel sellers" but for companies like OpenAI and Anthropic, who now suddenly have to compete against a competitor that can produce the same product at (apparently) a fraction of the price.
> OpenAI and Anthropic, who now suddenly have to compete against a competitor that can produce the same product at (apparently) a fraction of the price.
OpenAI and Anthropic can react by adopting DeepSeek's compute enhancements and using them to build even better models. AI training is still very clearly compute-limited from their POV (they have more data than they know what to do with already, and training "reasoning"/chains-of-thought requires a lot of reinforcement learning which is especially hard) so any improvement in compute efficiency is great news no matter where it comes from.
I think it is more expectation about expectation. You buy/sell based on whether you expect other people to expect to earn or lose. It is self-referential, hence irrational. If a new player enters and people's expectations shift, that affects your expectation of value even though the companies involved are not immediately or directly affected.
As already mentioned elsewhere, the Jevons paradox will increase demand subsequent to improved efficiency. Yes, will, not can.
So if the stock market were reflective of the economy (future or present), then stocks should go up; instead they're going down. Why? Because the stock market is not reflective of the economy.
The stock market is essentially a reflection of societal perception. DJT which was brought up earlier is a great example, because the price of DJT has next to nothing to do with Trump's businesses and almost everything to do with how he is perceived (and remember there is no such thing as bad publicity).
Personally I think the fall will be momentary and followed shortly by a climb to recovery and beyond, but who really knows.
If you don't want to lose your money: Don't let the sensationalist financial journalists and pundits get to you, don't let big red numbers in your portfolio scare you, ignore traders (they all lose their money), don't sell your stocks unless you actually need that money for something right now, re-read your investment manifesto if you have one, and maybe buy the dip for shits and giggles if you have some spare cash laying around.
I agree that it will improve demand for AI services. There's no hard rule that the demand increase will be larger than the efficiency increase though, and so total sales of GPUs may still decrease as a result.
Nvidia is way too overvalued regardless of deepseek or the success of AI. This is just some correction (not even too big even considering the current bubble), these traders are not stupid.
I agree with Aswath Damodaran here. NVDA is priced for perfection in AI, but also whatever is next.
In addition, IMO NVDA’s margins are a gift and a curse. They look great to investors, but also mean all their customers are aggressively looking to produce their own GPUs.
Exactly.
GPUs have become too profitable and too strategically important for several deep-pocketed existing technology companies not to invest more and try to acquire market share.
There is a mini-moat here with CUDA and existing work, but the start of commodification must be on the <10-year horizon.
They are also priced on the idea that nothing will challenge them. If AMD, Intel, or anyone else comes out with a challenger for their top GPUs at competitive prices, that’s a problem.
The biggest challengers are likely the hyperscalers and companies like Meta. It sort of flew under the radar when Meta released an update on their GPU plans last year and said their cluster would be as powerful as X NVDA GPUs, and not that it would have X NVDA GPUs [1].
Also, I should add that Deepseek just showed the top GPUs are not necessary to deliver big value.
"This announcement is one step in our ambitious infrastructure roadmap. By the end of 2024, we’re aiming to continue to grow our infrastructure build-out that will include 350,000 NVIDIA H100 GPUs as part of a portfolio that will feature compute power equivalent to nearly 600,000 H100s."
Have you priced in the extremely limited freedom to operate they have? There is an extreme systemic risk to being a monopoly in a strategic position. It's an extreme beneficial position to be in, until it isn't.
ASML for now has a monopoly on cutting edge EUV. Since this is considered a strategic technology, the US dictates what they can sell to whom. This places ASML in a pincer. The US will develop a competitor as soon as they can if they can't get enough control over ASML, and at that point ASML would still be forbidden to sell to BRICS while losing the 'western' market as well.
So they're in a plushy seat, until the US decides they aren't.
NVIDIA has a P/E ratio of 56, double that of the S&P 500 but half that of AMD and about the same as Meta's.
And whether it's overvalued or not isn't the point here: selling a stock because the product the company produces is now even more effective is mind-bogglingly stupid.
> selling a stock because the product the company produces is now even more effective is mind-bogglingly stupid.
No it isn't. Investors are most likely expecting there will be less demand for Nvidia's product long-term due to these alleged increased training efficiencies.
There is, AFAICT, no inherent limit to expansion at the bottom end of the market. My gut feeling is that lower training costs will expand the market for hardware horizontally far faster than any vertical scaling by a select two-digit number of megacorps could.
It's arguable how good a strategy it is to benchmark against other companies' P/Es. During the tech bubble, people would say X is cheap because Y is trading at a P/E of 100 instead of 200.
Again, that may be reasonable, but it's a completely different argument. Whether there is a bubble or not and whether NVDA is overvalued is irrelevant to the subject at hand.
If it’s cheaper to train models it means far more customers that will try their luck.
If you reduce the training requirement from 100,000 GPUs to 1,000, you've now opened the market to thousands and thousands of potential players instead of the ~10 that can afford to dump that much money into a compute cluster.
The holy grail is to not have separate training and inference steps. A model that can be updated while it is inferencing is where we're headed. DeepSeek only accelerates the need for more compute, not less.
THIS is the only correct statement in all of this.
The goal for AGI and ASI MUST BE to train, infer, train, infer, and so on, all on the fly, in fractions of a second, from every token produced.
Now good luck calculating the compute and hard work in algorithms to get there.
Not possible? Then AGI won't ever work because how can AGI beat a human if it can't learn on the fly? Not to mention ASI lol.
P/E alone is useless anyway. A growth company is likely not making a profit as it is reinvesting. But no profit doesn't imply good either, of course.
AMD does not have a P/E double NVIDIA's. Its P/E is high because of amortization of an acquisition. People on Hacker News talk a lot but have no idea what they talk about. You might know how to write JavaScript or some other language, but clearly you have not read the earnings reports or financials of AMD, or probably a lot of the other companies you talk about. So please stop spreading nonsense.
This is hackernews, not some boilerroom pump n dump forum. Please use more professional language and take your confidence down a notch. Try to learn and add to the discussion.
You seem to believe that the more inference or training value per piece of tech, the more demand there will be for that piece of tech, full stop, when there are multiple forces at play. As a simple example, you can think of this as a supply spike; while you can make the bet that demand will follow, there could be a lag in that demand spike due to the time it takes to find use cases with product/market fit. That could collapse prices over the near term, which could in turn decrease revenue. As a reminder, the stock value isn't a bet on whether "the gold trader" will sell more gold or not; it's a bet on whether the net future returns of the gold trader will occur in line with expectations - expectations that are sky-high and have zero competition built in.
Well, the price has a built-in presumption that earnings will keep growing. That's why the P/E ratio is not that relevant for them - it's been 50-70 since forever, but the stock went up 10x, which means earnings went up as well. DeepSeek might be good for their business overall, but it might mean earnings will not continue growing exponentially like they have for the past two years. So it's time to bail.
You shouldn't underestimate the fact that a large amount of these trades are on margin. Sometimes you can't wait it out because you'll get margin called and if you can't pony up additional cash you're basically getting caught with your pants down.
Disclaimer: I am not a trader, so could be way off
Why? The compute requirements would still continue to grow the more efficient and more capable the models become.
If it's cheaper to do inference, you end up using the model for more tasks; if it's cheaper to train, you train more models. And if you now need only thousands of GPUs instead of tens or hundreds of thousands, you've just unlocked a massive client base of those who can afford to invest high six to low seven figures, instead of hundreds of millions or billions, to try their luck.
Doesn't this situation also imply to some degree that China is focused on beating the US on AI and probably they will develop a competitor to NVIDIA that will cause margins to drop significantly?
They have a lot of very smart people and the will to do it, seems like a matter of time before they succeed.
It could be, but maybe the feeling is the investments now are already massive and everyone has jumped on the AI train. If you are suddenly 10x efficient, and everyone gets 10x more efficient, there's less room to grow than before. What you're saying makes a lot of sense, but it's one thing to write it on a message board and another to use it to back up your decision that affects billions of dollars you have in your fund.
The proof is in the pudding, you're welcome to prove "everyone" wrong.
In economics, the Jevons paradox occurs when technological progress increases the efficiency with which a resource is used, but the falling cost of use induces increases in demand enough that resource use is increased, rather than reduced.
Yes, but it will also mean that people won't need cutting-edge NVIDIA chips - models will be able to run on older-node chips, or on chips from different manufacturers. So NVIDIA wouldn't be able to command the margins they do now.
It may be great news for VRAM manufacturers though.
> I don’t understand how stupid some traders can be.
90% of traders lose money, so that's a data point...
You're trying to apply rational thinking but that's not how markets work. In the end valuations are more about narratives in the collective mind than technological merit.
I think it's because the media coverage is all focused on how this means the big AI players have lost their competitive advantage, rather than the other side of the equation.
But that's also dumb, because "huge leap forward in training efficiency" is not exactly bad news for the major players in even the medium term. Short term, it means their models are less competitive, but I don't see any reason that they can't leverage e.g. these new mixed precision training techniques on their giant GPU farms and train something even bigger and smarter.
There seems to be this weird baked in assumption that AI is at a permanent (or at least semi-permanent) plateau, and that open source models catching up is the end of the game. But this is an arms race, and we're nowhere near the finish line.
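As a rough illustration of what the "mixed precision training techniques" mentioned above buy: weights and activations are stored and multiplied in a low-precision format and rescaled, trading a small quantization error for large memory and bandwidth savings. A conceptual sketch follows (assuming a recent PyTorch, 2.1 or later, with float8 dtypes; real FP8 training uses per-block scaling and fused kernels, not this simplification):

```python
import torch

def fake_fp8_roundtrip(x: torch.Tensor) -> torch.Tensor:
    """Simulate per-tensor FP8 (E4M3) quantization: scale into the E4M3 range
    (max magnitude ~448), round through the float8 dtype, then scale back.
    Illustrative only; not how production FP8 training kernels are written."""
    scale = x.abs().max().clamp(min=1e-12) / 448.0
    q = (x / scale).to(torch.float8_e4m3fn)   # lossy 8-bit storage
    return q.to(torch.float32) * scale        # dequantize to inspect the error

w = torch.randn(1024, 1024)
err = (w - fake_fp8_roundtrip(w)).abs().max()
print(err)   # small but nonzero: the bet is that training tolerates this noise
```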
> Good time to buy then, I don’t understand how stupid some traders can be.
Likely a "how solid is the technical moat" evaluation - this could be a one-off or could be that there are an avalanche of advancements to continue along the efficiency side of the process.
Given the style and hype of logic in the AI space, I fully believe resources are not well allocated, either in compute or in _actual_ thinking about how they are spent.
DeepSeek's apparently ~10x better efficiency per inference token... implies a lot of other hardware meets the general use case. We also know that reasoning should take about 10W at human speed-of-thought... maybe another 1-2 orders of magnitude of power efficiency to go.
"Pre-Training: Towards Ultimate Training Efficiency
We design an FP8 mixed precision training framework and, for the first time, validate the feasibility and effectiveness of FP8 training on an extremely large-scale model.
Through co-design of algorithms, frameworks, and hardware, we overcome the communication bottleneck in cross-node MoE training, nearly achieving full computation-communication overlap.
This significantly enhances our training efficiency and reduces the training costs, enabling us to further scale up the model size without additional overhead.
At an economical cost of only 2.664M H800 GPU hours, we complete the pre-training of DeepSeek-V3 on 14.8T tokens, producing the currently strongest open-source base model. The subsequent training stages after pre-training require only 0.1M GPU hours." [1]
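Back-of-the-envelope on those figures (the per-GPU-hour rental rate below is my assumption for illustration, not a number from the report):

```python
pretrain_gpu_hours = 2.664e6    # H800 GPU hours for pre-training (quoted above)
posttrain_gpu_hours = 0.1e6     # subsequent training stages (quoted above)
assumed_rate_usd = 2.0          # assumed H800 rental price per GPU-hour
total = (pretrain_gpu_hours + posttrain_gpu_hours) * assumed_rate_usd
print(f"${total:,.0f}")         # ~$5.5M of compute under these assumptions
```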
NVDA is a totally manipulated stock. The company beat earnings in the last three quarters and the stock dropped 15% to 20% immediately after the results release.
For most people, a good time to buy is always: a percentage of income, via tax-efficient methods, into the S&P 500.
But I agree in the sense that DeepSeek just creates more demand, because people want AI to do more work. That makes the bang for the buck greater, opening new opportunities.
This sell off is like selling Intel in 2010 because of a new C compiler.
Maybe Nvidia is fine, but I don't understand this logic. Suppose it turns out GPUs are not necessary at all - they still provide a performance boost, but you can do everything you can do right now with CPUs, performance-wise. Would that be good or bad for Nvidia?
Unless it can be said that we need more performance than is currently possible, i.e. new demand, it would be catastrophic. It is unclear that throwing more compute at the problem actually expands what is possible. If that is not the case, efficiency is bad for Nvidia because it simply results in less demand.
I could see arguments be made in both ways here. If GPUs end up being more efficient/powerful (like today) it could induce even more demand, but also if CPU gets within ~20% of how fast you can do something with a GPU, people might start opting for something like Macs with unified memory instead of GPUs.
Today a CPU setup is still nowhere near as fast as a GPU setup (for ML/AI), but who knows what it will look like in the future.
> it is unclear that throwing more compute actually expands what is possible
Wasn't that demonstrated to be true already in the GPT1/2 days? AFAIK, LLMs became a thing very much because OpenAI "discovered" that "throwing more compute (and training data) at the problem/solution expands what is possible"
nVidia is also about HPC in general, not just AI. It's remarkably silly that the stock would plunge 13% just because someone made a more compute-efficient LLM.
There is a point where there are enough shovels circulating that the demand for new shovels falters, even with zero drawback in the rush. And if so much gold was being mined that it overwhelmed the market and reduced the commodity price, the value of better shovels is reduced.
DeepSeek and friends basically reduce the commodity value of AI (and to be fair, Facebook, Microsoft et al. are trying to do the same thing with their open source models, trying to chop the legs out from under the upstart AI cos). If AI is worth less, there are going to be fewer mega-capitalized AI ventures buying a trillion dollars worth of rapidly-depreciating GPUs in hopes of eking out some minor advantage.
I wouldn't short nvidia stock, but at the same time there is a point where the spend of GPUs just isn't rational anymore.
> And as more efficient models proliferate it means more edge computing which means more customers with lower negotiating power than Meta and Google
Edge compute has infinitely more competition than the data center.
You answered your own question. People do not dig in the Sacramento River anymore for gold because it is gone. If you can train models for 1/100 the cost, and you sell model-training chips, you probably are not going to sell as many chips.
That's why the shovel maker from back then are selling mining machines today.
Everyone here thinks Nvidia is doomed because of training efficiency.
But what has Nvidia been doing for the past decade? Exactly that: increasing training and inference efficiency by orders of magnitude.
Try training GPT-4 on 10k Voltas, then Amperes, then Hoppers, and then Blackwells.
What has happened since then? Nvidia has increased its sales by orders of magnitude.
Why? Because thanks to improvements in data, algorithms, and compute efficiency, ChatGPT was possible in the first place.
Imagine Nvidia didn't exist. When do you think the ChatGPT moment would have happened on CPUs? LOL
Going back to my first sentence: Nvidia also started with small shovels, which were GeForce cards with CUDA. Today Nvidia is selling huge GPU clusters (mining machines, yes, pun intended ^^).
This is an example of a recurrent phenomenon in politics. Trump comes in with big plans, this and that, threatening countries, deporting people, a clear agenda to take America to the hard right. Isolationism, crush China, screw every other country over, etc. etc.
And then - something completely unexpected, a total curveball - arrives a week into office and everything changes. Your agenda collapses and you enter reactive mode.
AI bubble could pop. Valuations drop. Crypto might get hit. Where is Project Stargate now? It might become a joke.
A more cynical observer might suggest that _the timing was no coincidence_.
I said in other threads that the establishment passed Trump a ticking time bomb - if he were smart, he would've made it blow up early himself to avoid becoming the next Herbert Hoover.
> Your agenda collapses and you enter reactive mode.
Where has Trump's agenda collapsed? I might have missed the press release.
And why would a curveball on AI throw off an agenda around trade, immigration and military engagement? I don't follow.
China could take the lead on AI and I don't see how it would impact any of those things. Isn't DeepSeek open source? The US already has access to it, so what leverage could China possibly have?
> A more cynical observer might suggest that _the timing was no coincidence_.
Do you feel the Chinese government closely controls AI research and timed this response?
> Where has Trump's agenda collapsed? I might have missed the press release.
Well, it's too early to say if it has happened in this case. But we've seen it happen again and again, so it will not be a surprise if it happens.
> And why would a curveball on AI throw off an agenda around trade, immigration and military engagement? I don't follow.
Well, trade restrictions against China have just backfired spectacularly. So further trade restrictions may not seem as good an idea. And to start trade wars, you need a strong economy. And at the moment America's economy is entirely driven by the AI bubble, as it is the value of the AI stocks that has separated the economy from the European trend.
It is very likely that military engagement will be driven by, or affected by, developments in AI. And there's no doubt that Taiwan's situation is heavily affected by chip production.
> Do you feel the Chinese government closely controls AI research and timed this response?
No evidence, but I am not naive enough to think it's out of the range of possibility. I don't follow your reasoning; more likely, the timing of the release is all they had to modify, which is quite trivial for any government. Not to question that would be very naive. They have _every motive_.
> Well, it's too early to say if it has happened in this case
Ah ok, so you’re just guessing. You wrote it as if it had already happened.
And I think you’re confusing the stock market with the broader economy. The economy as a whole is unaffected by AI at this point.
And the trade war with China involved hundreds of industries beyond AI. Most of it is manufacturing. I’m not sure how failure of AI sanctions (questionable conclusion) somehow means the trade issues around machine parts needs to be abandoned.
I agree China has motive but I’ve heard so many claims that the Chinese government doesn’t control research or businesses like TikTok.
> Ah ok, so you’re just guessing. You wrote it as if it had already happened.
That's not a generous reading. I'm saying this is what has happened historically.
Edit: the use of the words "might" and "could" made this pretty clear.
> The economy as a whole is unaffected by AI at this point.
True in terms of no marked effect on GDP. But the stock market and the broader economy are very much linked. See 2008. And the stock market is high on AI.
> I’ve heard so many claims that the Chinese government doesn’t control research or businesses like TikTok.
Even western countries tell tech companies what to do. Do you think an authoritarian government is going to do that more or less? There's a reason they didn't want to sell TikTok.
Deepseek showing that you can do pure online RL for LLMs means we now have a clear path to just keep throwing more compute at the problem! If anything we made the whole "we are hitting a data wall" problem even smaller.
Additionally, it's yet another proof point that scaling inference compute is a way forward. Models that think for hours or days are the future.
As we move further into the regime of long sequence inference, compute scales by the square of the sequence length.
The lesson here was not "training is going to be cheaper than we thought". It's "we must construct additional pylons uhhh _PUs"
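To put a number on the quadratic claim above, here is a toy FLOP count for the attention score/value matmuls (a sketch with assumed sizes; it ignores projections, KV caching, and sparse or linear-attention tricks):

```python
def attention_flops(seq_len: int, d_model: int) -> int:
    """~2 multiply-adds per cell for QK^T plus the same for the weighted sum of V."""
    return 4 * seq_len * seq_len * d_model

short = attention_flops(8_192, 4_096)     # a typical chat-length trace (assumed sizes)
long = attention_flops(131_072, 4_096)    # a 16x longer "thinking" trace
print(long / short)                       # 256.0: 16x longer sequence, ~256x the attention compute
```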
Didn't DeepSeek also show that pure RL leads to low-quality results compared to also doing old-fashioned supervised learning on a "problem solving step by step" dataset? I'm not sure why people are getting excited about the pure-RL approach, seems just overly complicated for no real gain.
If I'm understanding their paper correctly (I might not be, but I've spent a little time trying to understand it), they showed you only need a small amount of supervised fine-tuning ("SFT") to "seed" the base model, followed by pure RL. Pure RL alone was their R1-Zero model, which worked but produced weird artifacts like switching languages or excessive repetition.
The SFT training data is hard to produce, while the RL they used was fairly uncomplicated heuristic evaluations and not a secondary critic model. So their RL is a simple approach.
If I’ve said anything wrong, feel free to correct me.
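A minimal sketch of the kind of critic-free, rule-based reward being described (the answer-tag format and the weights here are my assumptions for illustration; the paper's actual reward functions also include format and language-consistency checks):

```python
import re

def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Heuristic reward: a small bonus for emitting the expected answer format,
    plus an accuracy reward if the extracted answer matches a verifiable reference."""
    reward = 0.0
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if match:
        reward += 0.1                                      # format reward (assumed weight)
        if match.group(1).strip() == reference_answer.strip():
            reward += 1.0                                  # accuracy reward (assumed weight)
    return reward

print(rule_based_reward("...long chain of thought...<answer>42</answer>", "42"))  # 1.1
```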
ASML warned that it was just a question of time for the Chinese to get their own tech industry up and running.
The Americans were in denial.
Ultimately someone in America will get desperate enough and start a war when they still have a chance to win.
See the Earth-Mars conflict in the Expanse.
DeepSeek has humiliated the entire US tech sector. I wonder if they will learn from this, fire their useless middle management and product managers with sociology degrees, and actually pivot to being technology companies?
"Again, just to emphasize this point, all of the decisions DeepSeek made in the design of this model only make sense if you are constrained to the H800; if DeepSeek had access to H100s, they probably would have used a larger training cluster with much fewer optimizations specifically focused on overcoming the lack of bandwidth."