If training and inference just got 40x more efficient, but OpenAI and co. still have the same compute resources, once they’ve baked in all the DeepSeek improvements, we’re about to find out very quickly whether 40x the compute delivers 40x the performance / output quality, or if output quality has ceased to be compute-bound.
In the long run (which in the AI world is probably ~1 year) this is very good for Nvidia, very good for the hyperscalers, and very good for anyone building AI applications.
The only thing it's not good for is the idea that OpenAI and/or Anthropic will eventually become profitable companies with market caps that exceed Apple's by orders of magnitude. Oh no, anyway.
Yes! I have had the exact same mental model. The biggest losers in this news are the groups building frontier models. They are the ones with huge valuations, but if the optimizations turn out to be even close to true, it's a massive threat to their business model. My feet are on the ground, but I still believe the world does not comprehend how much compute it can use... as compute gets cheaper, we will use more of it. Ignoring equity pricing, this benefits all other parties.
My big current conspiracy theory is that this negative sentiment toward Nvidia from DeepSeek's release is spread by people who actually want to buy more stock at a cheaper price. If you know anything about the topic, it's wild to assume that this will drive demand for GPUs anywhere but up. If Nvidia came out with a Jetson-like product that can run the full 670B R1, they could make infinite money. And in the datacenter segment, companies will stumble over each other to get the necessary hardware (which corresponds to a dozen H100s or so right now), especially once HF comes out with their uncensored reproduction. There's so much opportunity to turn more compute into more money because of this; almost every company could theoretically benefit.
Can you guys explain why this would be bad for the OpenAIs and Anthropics of the world?
Wasn't the story always outlined as: we build better and better models, then we eventually get to AGI, AGI works on building better and better models even faster, and we eventually get to super-AGI, which can work on building better and better models even faster still...
Isn't "super-optimization"(in the widest sense) what we expect to happen in the long run?
First of all, we need to just stop talking about AGI and Superintelligence. It's a total distraction from the actual value that has already been created by AI/ML over the years and will continue to be created.
That said, you have to distinguish between "good for the field of AI, the AI industry overall, and users of AI" from "good for a couple of companies that want to be the sole provider of SOTA models and extract maximum value from everyone else to drive their own equity valuations to the moon". Deepseek is positive for the former and negative for the latter.
I believe that, in general, the business model of building frontier models has not been fully baked yet. Let's ignore the thought of AGI and just say models do continue to improve. In OpenAI's case, they have raised lots of capital in the hopes of dominating the market. That capital pegged them at a valuation. Now you have a company with ~100 employees and supposedly a lot less capital come in and get close to OpenAI's current leading model. It has the potential to pop their balloon massively.
By releasing a lot of it as open source, everyone has their hands on it. That opens the door to new companies.
Or a simpler mental model: third parties have shown the ability to get quite close to the leading frontier models. The leading frontier models take hundreds of millions of dollars, and if someone is able to copy one within a year's time for significantly less capital, it's going to be a hard game of cat and mouse.
> If training and inference just got 40x more efficient
Did training and inference just get 40x more efficient, or just training? They trained a model with impressive outputs on a limited number of GPUs, but DeepSeek is still a big model that requires a lot of resources to run. Moreover, which costs more, training a model once or using it for inference across a hundred million people multiple times a day for a year? It was always the second one, and doing the training cheaper makes it even more so.
But this implies that we could use those same resources to train even bigger models, right? Except that you then have the same problem. You have a bigger model, maybe it's better, but if you've made inference cost linearly more because of the size and the size is now 40x bigger, you now need that much more compute for inference.
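A crude back-of-the-envelope illustration of that train-once versus serve-at-scale comparison; every figure here (training cost, per-query cost, usage) is an invented assumption, not anyone's actual numbers:

    # All numbers are made-up assumptions for illustration only.
    training_cost = 100e6                 # assume a one-off ~$100M frontier training run
    users = 100e6                         # "a hundred million people"
    queries_per_user_per_day = 3
    cost_per_query = 0.002                # assume ~$0.002 of inference compute per query

    yearly_inference = users * queries_per_user_per_day * cost_per_query * 365
    print(f"training (one-off):   ${training_cost / 1e6:,.0f}M")
    print(f"inference (per year): ${yearly_inference / 1e6:,.0f}M")
    # Under these assumptions, a single year of inference already exceeds the
    # training run, and cheaper training only tilts the ratio further.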
Actually inference got more efficient as well, thanks to the multi-head latent attention algorithm that compresses the key-value cache to drastically reduce memory usage.
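A minimal sketch of where that memory saving comes from: caching one low-rank latent per token instead of full per-head keys and values. The dimensions below are hypothetical placeholders, not DeepSeek's actual MLA configuration.

    # Hypothetical dimensions, chosen only to illustrate the order of magnitude.
    n_layers, n_heads, head_dim = 60, 128, 128
    seq_len, bytes_per_value = 32_768, 2          # bf16 cache entries
    d_latent = 512                                # assumed width of the shared KV latent

    # Standard attention caches full per-head keys and values for every layer...
    full_kv_bytes = n_layers * seq_len * n_heads * head_dim * 2 * bytes_per_value
    # ...while a latent-attention-style cache stores one compressed vector per token.
    latent_kv_bytes = n_layers * seq_len * d_latent * bytes_per_value

    print(f"full KV cache:   {full_kv_bytes / 2**30:.1f} GiB per sequence")
    print(f"latent KV cache: {latent_kv_bytes / 2**30:.2f} GiB per sequence")
    print(f"reduction:       ~{full_kv_bytes / latent_kv_bytes:.0f}x")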
That's a useful performance improvement but it's incremental progress in line with what new models often improve over their predecessors, not in line with the much more dramatic reduction they've achieved in training cost.
If the H800 is a memory-constrained part that NVIDIA built to get around the export ban on shipping H100s to China, while keeping equivalent fp8 performance,
it makes zero sense to believe Elon Musk, Dario Amodei and Alexandr Wang's claims that DeepSeek smuggled H100s.
The only reason a team would allocate time to memory optimizations and writing NVPTX code, rather than focusing on post-training, is if they severely struggled with memory during training.
This is a massive trick pulled by Jensen: take the H100 design, whose sales are regulated by the government, make it look 40x weaker and call it the H800, while conveniently leaving 8-bit computation as fast as the H100's. Then bring it to China and let companies stockpile it without disclosing production or sales numbers, with no export controls.
Eventually, after 7 months, the US government starts noticing the H800 sales and introduces new export controls, but it's too late. By this point, DeepSeek has started doing research using fp8. They slowly build bigger and bigger models, work on bandwidth and memory consumption, until they make R1 - their reasoning model.
Especially since he seems intent on everyone talking about him all the time. I find it questionable when a person wants to be the centre of attention no matter what. Perhaps attention is not all we need.
He's like a broken smart network switch - smart as in managed. Packets addressed to the switch's own MAC are all broken, but the erroneously forwarded ones often have valuable data. From layer 3 we can't tell which is which.
So not an actual DeepSeek-R1 model but a distilled Qwen or Llama model.
From DeepSeek-R1 paper:
> As shown in Table 5, simply distilling DeepSeek-R1’s outputs enables the efficient DeepSeek-R1-7B (i.e., DeepSeek-R1-Distill-Qwen-7B, abbreviated similarly below) to outperform non-reasoning models like GPT-4o-0513 across the board.
and
> DeepSeek-R1-14B surpasses QwQ-32B-Preview on all evaluation metrics, while DeepSeek-R1-32B and DeepSeek-R1-70B significantly exceed o1-mini on most benchmarks.
and
> These [Distilled Model Evaluation] results demonstrate the strong potential of distillation. Additionally, we found that applying RL to these distilled models yields significant further gains. We believe this warrants further exploration and therefore present only the results of the simple SFT-distilled models here.
Yes, but even that can still be run (slowly) on CPU-only systems with as little as about 32 GB of RAM. Memory virtualization is a thing. If you get used to using it like email rather than chat, it's still super useful even if you are waiting half an hour for your reply. Presumably you have a fast distill on tap for interactive stuff.
I run my models in an agentic framework with fast models that can ask slower models or APIs when needed. It works perfectly, 60 percent of the time lol.
Yes, but I think most of the rout is caused by the fact that there really isn't anything protecting AI from being disrupted by a new player - these models are fairly simple technology compared to some of the other things tech companies build. That means OpenAI really doesn't have much ability to protect its market-leader status.
I don't really understand why the stock market has decided this affects nvidia's stock price though.
This article has good background, context, and explanations [1]. They skipped CUDA and instead used PTX, a lower-level instruction set, where they were able to implement more performant cross-chip comms to make up for the less performant H800 chips.
> Moreover, if you actually did the math on the previous question, you would realize that DeepSeek actually had an excess of computing; that’s because DeepSeek actually programmed 20 of the 132 processing units on each H800 specifically to manage cross-chip communications. This is actually impossible to do in CUDA.
You can do this just fine in CUDA, no PTX required. Of course all the major shops are using inline PTX at the very least to access the Tensor cores effectively.
>If training and inference just got 40x more efficient
The jury is still out on how much improvement DeepSeek made in terms of training and inference compute efficiency, but personally I think 10x is probably the actual improvement that was made.
But in business/engineering/manufacturing/etc., if you have 10x more efficiency, you're basically going to obliterate the competition.
>output quality has ceased to be compute-bound
You raised an interesting conjecture, and it seems very likely to be the case.
I know that it's not even been a full two years since ChatGPT-4 was released, but it seems to be taking OpenAI a very long time to release ChatGPT-5. Is it because they're taking their own sweet time to release the software, not unlike GIMP, or because they genuinely cannot justify the improvement needed to jump from 4 to 5? This stagnation, however, has allowed others to catch up. Now, based on DeepSeek's claims, anyone can have their own ChatGPT-4 under their desk with Nvidia's Project Digits mini PCs [1]. For running DeepSeek, 4 mini PC units will be more than enough at 4 PFLOPS, and cost only USD 12K. Let's say on average a subscriber pays OpenAI USD 10 a month; for a 1,000-person organization that's USD 10K a month, so the investment pays for itself in about a month, and no data ever leaves the organization since it's a private cloud! (Rough math sketched below.)
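Running the numbers from that claim explicitly (the hardware price, per-seat fee, and headcount are the assumptions stated above, not verified figures):

    # Figures taken from the assumptions in the comment above, not verified prices.
    units, price_per_unit = 4, 3_000           # four Project Digits boxes at ~$3K each
    hardware_cost = units * price_per_unit     # ~$12K up front

    seats, fee_per_seat_per_month = 1_000, 10  # 1,000-person org at $10/seat/month
    monthly_subscription_spend = seats * fee_per_seat_per_month

    months_to_break_even = hardware_cost / monthly_subscription_spend
    print(f"hardware outlay:       ${hardware_cost:,}")
    print(f"monthly subscriptions: ${monthly_subscription_spend:,}")
    print(f"break-even after about {months_to_break_even:.1f} months")  # ~1.2 months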
For training a similar system to ChatGPT-4, based on DeepSeek's claims, a few million USD is more than enough. Apparently, OpenAI, SoftBank and Oracle just announced a USD 500 billion joint venture to push AI forward with the newly announced Stargate AI project, but that's 10,000x the money [2],[3]. The elephant-in-the-room question is: can they even get a 10x quality improvement over the existing ChatGPT-4? I seriously doubt it.
[1] NVIDIA Puts Grace Blackwell on Every Desk and at Every AI Developer’s Fingertips:
90% of the comments in this thread make it clear that knowing about technology does not in any way qualify someone to think correctly about markets and equity valuations.
He recently said that evil people can’t survive long as founders of tech companies because they need smart people to work for them and smart people can work anywhere. There are lots of other examples. Especially read his recent tweets/essays that aren’t about his area of expertise.
Look it up yourself. Paul Graham does not have a core competency regarding what smart people are or are not willing to do. That would be sociology or psychology or economics.
But every successful SV founder and or VC is not only a tech genius but also a geopolitical and socioeconomic expert! That’s why they make war companies, cozy up to politicians, and talk about how woke is ruining the world. /s
In fairness, 'geopolitical experts' may not really exist. There are a range of people who make up interesting stories to a greater or lesser extent but all seem to be serially misinformed. Some things are too complicated to have expertise in.
Indeed, while the existence of socioeconomic experts seems more likely we don't have any way of reliably identifying them. The people who actually end up making social or economic policy seem to be winging it by picking the policy that most benefits wealthy people and/or established asset owners. It is barely possible to blink twice without stumbling over a policy disaster.
>In fairness, 'geopolitical experts' may not really exist.
Except for, I don't know, the many thousands of people who work at various government agencies (diplomatic, intelligence) or even private sector policy circles whose job it is to literally be geopolitical experts in a given area.
There are thousands of gamblers whose job is to literally predict the tumbling random number generators in the slot machines they play, and will be rewarded with thousands of dollars if they do a good job.
They are not experts. As said above, some things are too complicated to have expertise in.
It's plausible that geopolitics may work the same way, with the ones who get lucky mistaken for actual experts.
Absolute rubbish. There's lots of factual information here you can know and use to make informed "guesses" (if you will).
People like Musk, who are often absolutely clueless about countries' political situations, their people, their makeup, their relationships and agreements with neighboring countries, as well as their history and geography, are obviously going to be terrible at predicting outcomes compared to someone who actually has deep knowledge of these things.
Also we seem to be using the term "geopolitics" a bit loosely in this thread. Maybe we could inform ourselves what the term we are using even means before we discount that anyone could have expertise in it[1]. I don't think people here meant to narrow it down to just that. What we really seem to concern ourselves with here is international relations theory and political sciences in general.
Now whether most politicians should also be considered experts in these areas is another matter. From my personal experience, I'd say most are not. People generally don't elect politicians for being experts - they elect politicians for representing their uninformed opinions. There seems to be only a weak overlap between being competent at the actual job and the ability to be elected into it.
> There's lots of factual information here you can know and use to make informed "guesses" (if you will).
The gambler who learned the entire observable history of a tumbling RNG will not be in a better position to take the jackpot than the gambler who models it as a simple distribution. You cannot become an expert on certain things.
Geopolitics may or may not be one of these things, but you've made no substantial argument either way.
Geopolitics is a complex system. Having lots of factual and historical information to inform your decision is not obviously an advantage over a guess based on a cursory read of the situation.
It is like economists - they have 0 predictive power vs. some random bit player with a taste for stats when operating at the level of a country's economy. They're doing well if they can even explain what actually happened. They tend to get the details right but the big picture is an entirely different kettle of fish.
Geopolitics is much harder to work with than economics, because it covers economics plus distance and cultural barriers even before the problem of leaders doing damn silly things for silly reasons. And unlike economics there is barely the most tenuous of anchors to check if the geopolitical "experts" get things right even with hindsight. I'd bet the people who sent the US into Afghanistan and Iraq are still patting themselves on the back for a job well done despite what I think most people could accept as the total failure of those particular expeditions.
I thought Peter Zeihan was a geopolitical expert until he started talking about things I lived through, with complete ignorance of the basics. It's not that his take was wrong, it's that his basic underlying assumptions were all wildly different from reality on the ground.
Any sort of geopolitical expert is generally going to be labeled as such because he works in the domain at a reasonably high level.
The problem with that is that when at such a level, political factors start to come into play.
The net effect is that in any conflict, the winning side will have competent and qualified expert geopolitical analyses, while the losing side will have propagandists.
So the geopolitical expert is, at best, a liminal species.
So, you think the system is genuinely trying to identify expertise to achieve equitable outcomes, and just happening to fail at it? Rather than policy being shaped by personal networks and existing power structures that tend to benefit themselves?
I think the system has been carefully configured to benefit wealthy people and/or established asset owners. But the reason that there is no effective resistance to that is because identifying generalist socioeconomic experts is practically impossible.
They may exist, but the real expertise is mostly kept non-public. Regarding the Ukraine war, both pro-Russian and pro-American public pundits never mentioned economic and real strategic issues apart from NATO membership for almost 2.5 years.
Then Lindsey Graham outright mentioned the mineral wealth and it became a topic, though not a prominent one.
Access to the Caspian Sea via the Volga-Don canal and the Sea of Azov is never mentioned. Even though there are age old Rand corporation papers that demand more US influence in that region.
The best public pundits get personalities and some of the political history correct (and are entertaining), but it is always a game of omission on both sides.
That's wrong on so many levels, and cynical too. If the world worked the way you described, we would have obliterated ourselves a long time ago, or mass enslavement would have happened. It didn't.
Geopolitics can be studied and learned, and is something that diplomats heavily rely upon.
Of course, those geopolitical strategies can play in certain ways we don't foresee, as on the other side we also have an actor that is free to do what they want.
But for instance, if you give Mexico a very good trade agreement as a strong country like the US, it's very likely that they will work with you on your special requests.
With the CrowdStrike outage earlier last year, it was incredible how many hidden security and kernel "experts" came crawling out of the woodwork, questioning why anything needs to run in the kernel and predicting the company's demise.
They were correct that there is no need for it to run in the kernel. They were incorrect in thinking this would affect the company's future, because of course the sales of their product have nothing to do with its technical merit.
I think you've got it half correct: sales absolutely do have to do with technical merit. Their platform works; it's just that folks overestimated the impact of a single critical defect.
Nobody would pay CrowdStrike's prices if it didn't stop attacks or improve your detection chances (and I can assure you, it does, better than most platforms).
> Nobody would pay CrowdStrike's prices if it didn't stop attacks or improve your detection chances
In my experience people pay because they need to tick the audit box, and it's (marginally) less terrible than their competitors. Actually preventing or detecting an attack is not really a priority.
And yet CrowdStrike's stock price is still 28% up on where it was 12 months ago, and 46% up on 6 months ago, after their crash.
Sibling is right: that type of product has nothing to do with actually preventing problems; it's about outsourcing personal risk. Same as SaaS. Nobody got fired when Office 365 was down for the second day in a year, but have a 5-minute outage on your on-prem kit after 5 years and there are nasty questions to answer.
The crash is absolutely rational; the cascading effect highlights the missing moat for companies like OpenAI. Without a moat, no investor will provide these companies with the billions that fueled most of the demand. This demand was essential for NVIDIA to squeeze such companies with incredible profit margins.
NVIDIA was overvalued before, and this correction is entirely justified. The larger impact of DeepSeek is more challenging to grasp. While companies like Google and Meta could benefit in the long term from this development, they still overpaid for an excessive number of GPUs. The rise in their stock prices was assumed to be driven by the moat they were expected to develop themselves.
I was always skeptical of those valuations. LLM inference was highly likely to become commoditized in the future anyway.
It has been clear for a while that one of two things is true.
1) AI stuff isn't really worth trillions, in which case Nvidia is overvalued.
2) AI stuff is really worth trillions, in which case there will be no moat, because you can cross any moat for that amount of money, e.g. you could recreate CUDA from scratch for far less than a trillion dollars and in fact Nvidia didn't spend anywhere near that much to create it to begin with. Someone else, or many someones, will spend the money to cross the moat and get their share.
So Nvidia is overvalued on the fundamentals. But is it overvalued on the hype cycle? Lots of people riding the bubble because number goes up until it doesn't, and you can lose money (opportunity cost) by selling too early just like you can lose money by selling too late.
Then events like this make some people skittish that they're going to sell too late, and number doesn't go up that day.
One thing you're missing is that there's nothing that says the value must correct. There are at least two very good reasons it might not. One is that Nvidia now has huge amounts of money to invest in developing new technologies and exploring other ideas. The other is that very little of the stock market is about the actual value of the company itself; it's speculation. If people think it will go up, they buy, reducing supply and driving up the price. If people think it will go down, they sell, increasing supply and driving down the price. It is a self-fulfilling prophecy on a large scale, and completely secondary to the actual business.
Eventually people will sell their stock to invest in some business that is actually growing or giving proportional dividends.
Of course, that "eventually" is carrying way too much load. And it's very likely this won't happen while the US government is printing lots of money and distributing it to rich investors. But that second thing has to stop eventually too.
It's a lot of people holding the stock, you are expecting everybody to just not do it.
Private companies are different, but on publicly traded ones it tends to happen.
(Oh, you may mean that printing money part. It's a lot of people holding that money, eventually somebody will want to buy something real with it and inflation explodes.)
Yeah, the printing-money bit. Generously, one might even say that that's the reason for printing more money: make sure that the value of people's investments decays over time so there's no need for the market to crash to "get the money back out".
Related to your #2. I mentioned this elsewhere yesterday, but NVDA's margins (55% last quarter!) are a gift and a curse. They look great for the stock in the short term, but they also encourage their customers to aggressively go after them. Second, their best customers are huge tech companies that have the capital and expertise (or can buy it) to go after NVDA. DeepSeek just laid out a path to put NVDA's margins under pressure, hence the pullback.
2) Seems the most plausible, but how to value the moat, or, how long / how many dollars will it cost to overcome the moat? The lead that CUDA currently has suggests that it's probably a lot of money, and it's not clear what the landscape will look like afterwards.
It seems likely that the technology / moat won't just melt away into nothing, it'll at least continue to be a major player 10 years from now. The question is if the market share will be 70%, 10% or 30% but still holding a lead over a market that becomes completely fractured....
I think the analysis of (2) is too simplistic because it ignores network effects. A community of developers and users around a specific toolset (e.g. CUDA) is hard to just "buy". Imagine trying to build a better programming language than python -- you could do it for a trillion dollars, but good luck getting the world to use it. For a real example, see Meta and Threads, or any other Twitter competitor.
You have a trillion dollars in incentive. You can use it for more than just creating the software, you can offer incentives to use it or directly contribute patches to the tools people are already using so they support your system. Moreover, third parties already have a large motivation to use any viable replacement because they'd avoid the premium Nvidia charges for hardware.
You could apply this analysis to any of the other big tech innovations like operating systems, search, social media, ...
MS threw a lot of money at Windows Phone. I worked for a company that not only got access to great resources, but also plain money, just to port our app. We took the money and made the port. Needless to say, it still didn't work out for MS.
Those markets have a much stronger network effect (especially social media), or were/are propped up by aggressive antitrust violations, or both.
To use your example, the problem with entering the phone market is that customers expect to buy one phone and then use it for everything. So then it needs to support everything out of the gate in order to get the first satisfied customer, meanwhile there are millions of third party apps.
Enterprise GPUs aren't like that. If one GPU supports 100% of code and another one supports 10% of code, but you're a research group where that 10% includes the thing you're doing (or you're in a position to port your own code), you can switch 100% of your GPUs. If you're a cloud provider buying a thousand GPUs to run the full gamut of applications, you can switch what proportion of your GPUs that run supported applications, instead of needing 100% coverage to switch a single one. Then lots of competing GPUs get made and fund the competition and soon put the competition's GPUs into the used market where they become obtainium and people start porting even more applications to them etc.
It also allows the competition to capture the head of the distribution first and go after the long tail after. There might be a million small projects that are tied to CUDA, but if you get the most popular models running on competing hardware, by volume that's most of the market. And once they're shipping in volume the small projects start to add support on their own.
Why can't you just build something that's CUDA-compatible? You wouldn't have to move anyone over then. Or is the actual CUDA API patented? And would Chinese companies care about that?
AFAIK, CUDA is protected. There are patents, and the terms of use of the compiler forbid using it on other devices.
Of course, most countries will stomp all over the terms-of-use thing (or worse, use it as evidence to go after Nvidia), and will probably ignore the patents because they are anticompetitive. It's not only China that will ignore them.
AMD is actively working to recreate CUDA. "Haven't succeeded yet" is very different from having failed, and they're certainly not giving up.
Intel's fab is in trouble, but that's not the relevant part of Intel for this. They get a CUDA competitor going with GPUs built on TSMC and they're off to the races. Also, Intel's fab might very well get bailed out by the government and in the process leave them with more resources to dedicate to this.
Then you have Apple, Google, Amazon, Microsoft, any one of which have the resources to do this and they all have a reason to try.
Which isn't even considering what happens if they team up. Suppose AMD is useless at software but Google isn't and then Google does the software and releases it to the public because they're tired of paying Nvidia's margins. Suppose the whole rest of the industry gets behind an open standard.
A lot of things can happen and there's a lot of money to make them happen.
You don't need external competition to have NVDA correct. All it takes is for one or more of the big customers to say they don't need as many GPUs, for any reason. It could be that their in-house efforts are 'good enough', or that the new models are more efficient and take less compute, or that their shareholders are done letting them spend like drunken sailors. NVDA's stock was/is priced for perfection, and any sort of market or margin contraction will cause the party to stop.
The danger for NVDA is their margins are so large right now, there is a ton of money chasing them not just from their typical competition like AMD, but from their own customers.
While we can bet on "AMD are too sclerotic to fix their drivers even if it's an existential threat to the company", I don't think we can bet on "if we deny technology to China they won't try to copy it anyway".
The crash of NVIDIA is not about the moat of OpenAI.
But because DeepSeek was able to cut training costs from billions to millions (and with even better performance). This means cheaper training but it also proves that OpenAI was not at the cutting edge of what was possible in training algorithms and that there are still huge gaps and disruptions possible in this area. So there is a lot less need to scale by pumping more and more GPUs but instead to invest in research that can cut down the cost. More gaps mean more possibility to cut costs and less of a need to buy GPUs to scale in terms of model quality.
For NVIDIA that means that all the GPUs of today are good enough for a long time and people will invest a lot less in them and a lot more in research like this to cut costs. (But I am sure they will be fine)
This is partially why Apple is the one that stands to gain the most, and it showed.
Their "small models, on device" approach can only be perfected with something like DeepSeek, and they're not exposed to NVIDIA pricing, nor do they have to prove to investors that their approach is still valid.
I keep seeing this argument, but I don't buy it at all. I want a phone with an AGI, not a phone that is only AGI. Often it's just easier to press a button rather than talk to an AI, regardless how smart it is. I have no interest in natural language being the only interface to my device, that sounds awful. In public, I want to preserve my privacy. I do not want to have everyone listening in on what I'm doing.
If we can create an AGI that can literally read my mind, okay, maybe that's a better interface than the current one, but we are far away from that scenario.
Until then, I'm convinced users will prefer a phone with AI functionalities rather than the reverse. It's easier for a phone company to create such a phone than it is for an AI company.
Perennial reminder that we do not have any real evidence that we are anywhere close to AGI, or that "throwing more resources at LLMs" is even theoretically a possible way to get to an AGI.
"Lots of people with either a financial motivation to say so or a deep desire for AGI to be real Soon™ said they can do it" is not actual evidence.
We do not know how to make an AGI. We do not know how to define an AGI. It is hypothetically possible that we could accidentally stumble into one, but nothing has actually shown that, and counting on it is a fool's bet.
I don't understand why this is not obvious to many people: tech and stock trading are two totally different things. Why on earth would a tech expert be expected to know trading at all? Imagine how ridiculous it would be if a computer science graduate automatically got a financial degree from college despite never taking a finance class.
The people developing statistical models that operate in the financial markets at scale are the quants. These people don't come from a finance-degree background.
I've noticed this phenomenon among the IT & tech VC crowd. They will launch podcasts, offer expert opinions and whatnot on just about every topic under the Sun, from cold fusion to COVID vaccines to the Ukraine war.
You wouldn’t see this in other folks, for example, a successful medical surgeon won’t offer much assertion about NVIDIA.
And the general tendency among audience is to assume that expertise can be carried across domains.
> a successful medical surgeon won’t offer much assertion about NVIDIA.
You haven't met many surgeons, have you? When I was working in medical imaging, the technicians all said we (the programmers) were almost as bad as the surgeons.
> You wouldn’t see this in other folks, for example, a successful medical surgeon won’t offer much assertion about NVIDIA.
Doctors are actually known for this phenomenon. Flight schools particularly watch out for them because their overconfidence gets them in trouble.
And, though humans everywhere do this, Americans are particularly known for it. There are many compilation videos where Americans are asked their opinion on whether Elbonia needs to be bombed or not, followed by enthusiastic agreement. That's highly abnormal in most other countries, where "I don't know" is seen as an acceptable response.
> Anti-intellectualism has been a constant thread winding its way through our political and cultural life, nurtured by the false notion that democracy means that 'my ignorance is just as good as your knowledge.'
― Isaac Asimov
Systems, it’s all about systems thinking. It is absolutely true that people in tech are often optimistic and/or delusional about the other expertise at their command. But it’s not like the basic assumption here is completely crazy.
Being a surgeon might require thinking about a few interacting systems, but mostly the number and nature of those systems involved stay the same. Talented programmers without even formal training in CS will eat and digest a dozen brand new systems before breakfast, and model interactions mentally with some degree of fidelity before lunch. And then, any formal training in CS kind of makes general systems just another type of object. This is not the same as how a surgeon is going to look at a heart, or even the body as a whole.
Not that this is the only way to acquire skills in systems thinking. But the other paths might require, IDK, a phd in history/geopolitics, or special studies or extensive work experience in physics or math. And not to rule out other kinds of science or engineering experts as systems thinkers, but a surprisingly large subset of them will specialize and so avoid it.
By the numbers.. there are probably just more people in software/IT, therefore more of us to look stupid if/when we get stuff wrong.
Obviously general systems expertise can’t automatically make you an expert on particle physics. But honestly it’s a good piece of background for lots of the wicked problems[1], and the wicked problems are what everyone always wants to talk about.
But even if we just look at the examples given by the parent, most of them are not about systems or models at all. Epidemiology and politics concern practical matters of life. In such matters, life experience will always trump abstract knowledge.
Epidemiology and politics do involve systems, I’m afraid. We can call it “practical” or “human” or “subjective” all we like, but human behaviors exhibit the same patterns when understood from a statistical instead of an individual standpoint.
Epidemiology and politics are pretty much the poster children of systems[0], next to their eldest sibling, economics. Life and experience may trump abstract knowledge dumbly applied, but alone it won't let you reason at larger scales (not that you could collect any actual experience on e.g. pandemics to fuel your intuition here anyway).
A part of learning how to model things as systems is understanding your model doesn't include all the components that affect the system - but it also means learning how to quantify those effects, or at least to estimate upper bounds on their sizes. It's knowing which effects average out at scale (like e.g. free will mostly does, and quite quickly), and which effects can't possibly be strong enough to influence outcome and thus can be excluded, and then to keep track of those that could occasionally spike.
Mathematics and systems-related fields downstream of it provide us with plenty of tools to correctly handle and reason about uncertainty, errors, and even "unknown unknowns". Yes, you can (and should) model your own ignorance as part of the system model.
--
[0] - In the most blatant example of this, around February 2020, i.e. in the early days of the COVID-19 pandemic going global, you could quite accurately predict the daily infection stats a week or two ahead by just drawing up an exponential function in Excel and lining it up with the already reported numbers. This relationship held pretty well until governments started messing with the numbers and then lockdowns started. This was a simple case because at that stage, the exponential component was overwhelmingly stronger than any more nuanced factor - but identifying which parts of a phenomenon dominate and describing their dynamics is precisely what learning about systems lets you do.
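For what it's worth, that "exponential in a spreadsheet" exercise is only a couple of lines of code; the daily counts below are invented purely for illustration:

    # Invented daily case counts, just to illustrate the fitting exercise.
    observed = [100, 131, 172, 226, 296, 389, 511]

    # Estimate a constant daily growth factor from the endpoints.
    days = len(observed) - 1
    growth = (observed[-1] / observed[0]) ** (1 / days)

    # Project the next week by compounding that factor forward.
    projection = [round(observed[-1] * growth ** d) for d in range(1, 8)]
    print(f"estimated daily growth factor: {growth:.2f}")
    print("one-week projection:", projection)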
This is exacerbated by the tendency in popular media to depict a Scientist character, who can do all kinds of Science (which includes technology, all kinds of computing, and math).
It's because software devs are smart and make a lot of money - a natural next step is to try and use their smarts to do something with that money. Hence stocks.
>It's because software devs are smart and make a lot of money
They just think they're smart BECAUSE they make a lot of money. Just because you can center divs for six figures a year at a F500 doesn't make you smart at everything.
I've never met a fellow software engineer who "centers divs" for 6 figures.
But then I work with engineers using FPGAs to trade in the markets with tick to trade times in double digit nanoseconds and processing streams of market data at ~10 million messages per second (80Gbps)
The truth is, a lot of P&L in trading these days is a technical feat of mathematics and engineering and not just one of fundamental analysis and punting on business plans
If you were really smart surely you would be able to see that there are more long-term valuable things for you to do with your time than just make yourself more money...
Tech people are allowed to quickly learn a domain enough to build the software that powers it, bringing in insights from other domains they've been across.
Just don't allow them to then comment on that domain with any degree of insight.
No, nvidia's demand and importance might reduce in the long term.
We are forgetting that China has a whole hardware ecosystem. Now we learn that building SOTA models does not require SOTA hardware in massive quantities from Nvidia. So the crash in the market could implicitly mean that the (hardware) monopoly of American companies is not going to last more than a few years. The hardware moat is not as deep as the West thought.
Once China brings scale like it did to batteries, EVs, solar, infrastructure, drones (etc) they will be able to run and train their models on their own hardware. Probably some time away but less time than what Wall Street thought.
This is actually more about Nvidia than about OpenAI. OpenAI owns the end interface and will be generally safe (maybe at a smaller valuation). In the long term, Nvidia is more replaceable than you think it is. Inference is going to dominate the market - it's going to be Cerebras, Groq, AMD, Intel, Nvidia, Google TPUs, Chinese TPUs, etc.
On the training side, there will be less demand for Nvidia GPUs as Meta, Google, Microsoft etc. extract efficiencies from the GPUs they already have, given the embarrassing success of DeepSeek. Now, China might have been another insatiable market for Nvidia, but the export controls have ensured that it won't be.
>On the training side, there will be less demand for Nvidia GPUs as Meta, Google, Microsoft etc. extract efficiencies from the GPUs they already have, given the embarrassing success of DeepSeek. Now, China might have been another insatiable market for Nvidia, but the export controls have ensured that it won't be.
Why? If DeepSeek made training 10x more efficient, just train a 10x bigger model. The end goal is AGI.
You are assuming that a 10x bigger model will be 10x better or will bring us closer to AGI. It might be too unwieldy to do inference on. Or the gain in performance may be minor, and more scientific thought may need to go into the model before it can reap the reward of more training. Scientific breakthroughs sometimes take time.
I’m not assuming 10x bigger will yield 10x better. We have scaling laws that can tell you more.
But I find it bizarre that you made the conclusion that AI has stopped scaling because DeepSeek optimized the heck out of the sanctioned GPUs they had. Weird.
I have not said that. I simply said that you now know you can get more juice for the amount you spend. If you've just learnt this, you would first ask your engineers to improve your model to scale it, rather than place any further orders with Nvidia to scale it. Only once you think you have got the most out of the existing GPUs would you buy more. DeepSeek has made people wonder whether their engineers have missed more such things, and maybe they should pause spending to make sure before sinking in more billions. It breaks the hegemony of the spend-more-to-dominate attitude that was gripping the industry, e.g. the $500 billion planned spend by the OpenAI consortium.
It doesn’t break the attitude. The number one problem DeepSeek’s CEO stated in an interview is they don’t have access to more advanced GPUs. They’re GPU starved.
There’s no reason why American companies can’t use DeepSeek’s techniques to improve their efficiency but continue the GPU arms race to AGI.
Baader-Meinhof phenomenon, but also because everyone is writing about GPU demand and Jevons paradox is an easy way to express the idea in a trite keyword.
I never knew there was an actual term for this, but I knew of the concept in my professional work because this situation often plays out when the government widens roads here in the States. Ostensibly the road widening is intended to lower congestion, but instead it often just causes more people to live there and use it, thereby increasing congestion.
Probably a decent number of professions have some variation of this, so it probably is accurate to say most people know OF Jevons paradox, because it's pretty easy to dig up examples of it. But probably many fewer know its actual name, or even that it has a name.
IMHO it happens as long as you can find use cases that were previously unfeasible due to cost or availability constraints.
At some point the thing no longer brings any benefits because other costs or limitations take over. For example, even faster broadband is no longer that big of a deal because your experience on most websites is now limited by their servers' ability to process your request. However, maybe in the future the costs and speeds will be so amazing that all user devices become thin clients and no one will care about their devices' processing power, and therefore one more increase in demand can happen.
The increase in efficiency is usually accompanied by commoditization as stuff gets cheaper to develop, which is very bad news for Nvidia.
If you don't need the super-high-end chips, then Nvidia loses its biggest moat and ability to monopolize the tech; CUDA isn't enough.
> Nvidia loses its biggest moat and ability to monopolize the tech; CUDA isn't enough
CUDA is plenty for right now. AMD can't/won't get their act together with GPU software and drivers. Intel isn't in much better of a position than AMD and has a host of other problems. It's also unlikely the "let's just glue a thousand ARM cores together" hardware will work as planned and still needs the software layer.
CUDA won't be an Nvidia moat forever but it's a decent moat for the next five years. If a company wants to build GPU compute resources it will be hard to go wrong buying Nvidia kit. At least from a platform point of view.
CUDA will still be a moat for the near future, and nobody is saying that Nvidia will die, but the thing is that Nvidia's margins will drop like crazy and so will its valuation. It will go back down to being a "medium tech" company.
Basically, training got way cheaper, and for inference you don't really need Nvidia, so even if there's an increase in demand for cheaper chips, there's no way the volume makes up for the loss of margin.
No, Nvidia's margins won't drop at all and the proof for this is Apple.
The units of AI accelerators will explode, the market will explode.
At the end of the day, Nvidia will have 20-30% of the unit share in AI HW and 70-80% of the profit share in the AI HW market. Just like Apple makes 3x the money compared to the rest of the smartphone market.
Jensen has considered Nvidia a premium vendor for 2 decades, and the track record of Nvidia's margins shows this.
And while Nvidia remains a high premium AI infrastructure vendor, they will also add lots of great SW frameworks to make even more profit.
Omniverse has literally no competition. That digital world simulation combines all of Nvidia's expertise (AI HW, Graphics HW, Physics HW, Networking, SW) into one huge product. And it will be a revolution because it's the first time we will be able to finally digitalize the analog world. And Nvidia will earn tons of money because Omniverse itself is licensed, it needs OVX systems (visual part) and it needs DGX systems (AI part).
Don't worry, Nvidia's margins will be totally fine. I would even expect them to be higher in 10 years than they are today. Nobody believes that but that's Jensen's goal.
There is a reason why Nvidia has always been the company with the highest P/S ratio, and anyone who understands why will see the quality of the management immediately.
Why should they invest in Nvidia now instead of investing in companies which can capitalize on the applications of AI?
Also, why invest in Nvidia and not AMD or Intel until now? Because Nvidia had the moat and there was a race to buy as many GPUs as possible. Now, momentarily, Nvidia sales will go down.
For long-term investors who are investing in the future, not the present, Nvidia was way overpriced. They will start buying when the price is right, but at the moment it's still way too high. Nvidia is worth 20-30 billion or so in reality.
Part of Nvidia's valuation was due to the perception that AI companies would need lots and lots of GPUs, which is still true. But I think the main cause of the selloff was that another part of the popular perception was that Nvidia was the only company that could make powerful enough GPUs. Now that it has been shown that you might not need the latest and greatest to compete, who knows how many other companies might start to compete for some of that market. Nvidia just went from a perceived monopolist to "merely" a leading player in the AI supplier market, and the expected future profits have been adjusted accordingly.
I was under the impression, too, that this would bump retail customer demand for the 50 series, given the extra AI and CUDA cores, plus the relatively low cost of the hardware. But I know nothing of the sentiment around Wall Street.
That said, I don't feel like upgrading my 4090. Maybe Wall Street believes that the large company deals that have driven the price up for so long might slow down?
Or I'm completely wrong on the impact of the hardware upgrades.
Output quantity consumed (almost) always increases with falling inputs (i.e., costs, whether in dollars or GPUs). But for Jevons paradox to hold, the slope of quantity-consumption-increase-per-falling-costs must exceed a certain threshold. Otherwise, the result is just that quantity consumed increases while the quantity of inputs consumed decreases.
Applied to AI and NVIDIA, the result of an increase in the AI-per-GPU on demand for GPUs depends on the demand curve for AI. If the quantity of AI consumed is completely independent of its price, then the result of better efficiency is cheaper AI, no change in AI quantity consumed, and a decrease in the number of GPUs needed. Of course, that's not a realistic scenario.
(I'm using "consumed" as shorthand; we both know that training AIs does not consume GPUs and AIs are also not consumed like apples. I'm using "consumed" rather than the term "demand" because demand has multiple meanings, referring both to a quantity demanded and a bid price, and this would confuse the conversation).
But a scenario that is potentially realistic is that as the cost of training/serving AI drops by 90%, the quantity of AI consumed increases by a factor of 5, and the end result is that the economy still only needs half as many GPUs as it needed before.
For Jevons paradox to hold, if the efficiency of converting GPUs to AI increases by X, so that the price falls to 1/X of what it was, the quantity of AI consumed must increase by a factor of more than X as a result of that price decrease. That's certainly possible, but it's not guaranteed; we basically have to wait to observe it empirically.
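A toy sketch of that threshold, using an assumed constant-elasticity demand curve (the curve shape and all numbers are illustrative, not an empirical claim):

    # Assume a constant-elasticity demand curve for "AI": quantity ~ price ** (-elasticity).
    def relative_gpu_demand(efficiency_gain: float, elasticity: float) -> float:
        """GPU demand after AI gets `efficiency_gain`x cheaper, relative to before."""
        price_ratio = 1.0 / efficiency_gain               # AI price falls to 1/X
        ai_quantity_ratio = price_ratio ** (-elasticity)  # AI consumption rises
        return ai_quantity_ratio / efficiency_gain        # GPUs = AI quantity / AI per GPU

    for elasticity in (0.5, 1.0, 1.5):
        print(f"elasticity {elasticity}: relative GPU demand = "
              f"{relative_gpu_demand(10, elasticity):.2f}")
    # 0.5 -> 0.32 (fewer GPUs), 1.0 -> 1.00 (unchanged), 1.5 -> 3.16 (Jevons holds):
    # GPU demand only rises if AI consumption grows by more than the efficiency gain.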
There's also another complication: as the efficiency of producing AI improves, substitutes for datacenter GPUs may become viable. It may be that the total amount of compute hardware required to train and run all this new AI does increase, but big-iron datacenter investments could still be obsoleted by this change because demand shifts to alternative providers that weren't viable when efficiency was low. For example, training or running AIs on smaller clusters or even on mobile devices.
If tech CEOs really believe in Jevons paradox, it means that, having decided last month to invest $500 billion in GPUs, this month, after learning of DeepSeek, they should realize $500 billion is not enough and they'll need to buy even more GPUs, and pay even more for each one. And, well, maybe that's the case. There's no doubt that demand for AI is going to keep growing. But at some point, investment in more GPUs trades off against other investments that are also needed, and the thing the economy is most urgently lacking ceases to be AI.
> I say DeepSeek should increase Nvidia’s demand due to Jevon’s Paradox.
If their claims are true, DeepSeek would increase the demand for GPUs. It's so obvious that I don't know why we even need a name to describe this scenario (I guess "Jevons paradox" just sounds cool).
The only issue is whether it would make a competitor to Nvidia viable. My bet is no, but the market seems to have bet yes.
> DeepSeek should increase Nvidia’s demand due to Jevon’s Paradox.
How exactly? From what I've read, the full model can run on MacBook M1 sort of hardware just fine. And this is their first release; I'd expect it to get more efficient, and maybe domain-specific models can be run on much lower-grade hardware, Raspberry Pi sort of stuff.
I agree, but in the short/medium term I think it will slow down, because companies will now prefer to invest in research to optimize (training) costs rather than in those very expensive GPUs. Only when the scientific community reaches the edge of what is possible in terms of optimization will it go back to pumping GPUs like today. (Although small actors will continue to pump GPUs, since they do not have the best talent to compete.)
The other way is certainly also true. Your short piece is rational, but lacks insight into the inference and training dynamics of ML adoption unconstrained.
The rate of ML progress is spectacularly compute-constrained today. Every step in today's scaling program is set up to de-risk the next scale-up, because the opportunity cost of compute is so high. If the opportunity cost of compute is not so high, you can skip the 1B to 8B scale-ups and grid search data mixes and hyperparameters.
The market/concentration risk premium drove most of the volatility today. If it was truly value driven, then this should have happened 6 months ago when DeepSeek released V2 that had the vast majority of cost optimizations.
Cloud data center CapEx is backstopped by the providers' growth outlook driven by the technology, not by GPU manufacturers. Dollars will shift just as quickly (like how Meta literally tore down a half-built data center in 2023 and restarted it to meet new designs).
I think it's entirely possible that one categorically can't think correctly about markets and equity valuations since they are vibes-based. Post hoc, sure, but not ahead of time.
The crux of it is that most people don't care about the fundamentals of equity valuations. If they can make money via derivatives, who cares about the underlying valuations? Just look at GME for one example; it's been mostly a squeeze-driven play between speculators. And then you have the massive dispersion trade that's been happening on the S&P 500 over the last year-plus. And when most people invest in index funds, and index funds are weighted mostly by market cap, value investing has been essentially dead for a while now.
"Briefly stated, the Gell-Mann Amnesia effect is as follows. You open the newspaper to an article on some subject you know well. In Murray's case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward—reversing cause and effect. I call these the "wet streets cause rain" stories. Paper's full of them.
In any case, you read with exasperation or amusement the multiple errors in a story, and then turn the page to national or international affairs, and read as if the rest of the newspaper was somehow more accurate about Palestine than the baloney you just read. You turn the page, and forget what you know."
Because your comment was posted 9 hours ago, I have no idea what view you think is wrong. Could you explain what the incorrect view is and — ideally — what’s wrong with it?
Let's not forget that it also doesn't make you an expert in history and the evolution of societies, as we can all agree that part of the HN public still believes in “meritocracy” alone and thinks that DEI programs are useless.
Equity valuations based on AI hardware's future earnings changed dramatically in the last day. The belief that demand for NVIDIA's products is insatiable for the near future has been dented, and the assumption that energy is the biggest bottleneck might not hold.
Lots to figure out from this information, but the playbook has radically changed.
That means the remaining 10% are similarly deluded by the impression that Apple or AMD could "just write" a CUDA alternative and compete on the merits. You don't want either of those people spending their money on datacenter bets.
10 years ago people said OpenCL would break CUDA's moat, 5 years ago people said ASICs would beat CUDA, and now we're arguing that older Nvidia GPUs will make CUDA obsolete. I have spent the past decade reading delusional eulogies for Nvidia, and I still find people adamant they're doomed despite being unable to name a real CUDA alternative.
Did ASICs beat CUDA out in crypto coin mining? Not the benchmark I really care about, but if things slow down in AI (they probably won’t) ASICs could probably take over some of it.
NVIDIA sells shovels to the gold rush. One miner (Liang Wenfeng), who has previously purchased at least 10,000 A100 shovels... has a "side project" where they figured out how to dig really well with a shovel and shared their secrets.
The gold rush, whether real or a bubble, is still there! NVIDIA will still sell every shovel they can manufacture, as soon as it is available in inventory.
Fortune 100 companies will still want the biggest toolshed to invent the next paradigm or to be the first to get to AGI.
Jevons paradox would imply that there's good reason to think that demand for shovels will increase. AI doesn't seem to be one of those things where society as a whole will say, "we have enough of that; we don't need any more".
(Many individual people are already saying that, but they aren't the people buying the GPUs for this in the first place. Steam engines weren't universally popular either when they were introduced to society.)
I also don't get how this is bearish for NVDA. Before this, small to mid-size companies would give up on fine-tuning their own model because OpenAI was just so much better and cheaper. Now DeepSeek's SOTA model gives them a much better baseline model to train on. Wouldn't more people want to RAG on top of DeepSeek? Or some startup's accountant would run the numbers and figure they can just inference the shit out of DeepSeek locally, and in the long run still come out ahead of using the OpenAI API.
Either way, that means a lot more NVDA hardware being sold. You still need CUDA, as ROCm is still not there yet. In fact, NVDA needs to churn out more CUDA GPUs than ever.
Because their biggest buyer(s) just ran into a buzzsaw. NVDA's stratospheric valuation is based on a few select customers' unrestrained ability to purchase the latest products. That unrestrained spending was fueled by the "AI" arms race. If those companies see their ability to be profitable with "AI" as diminished, then their ability to continue spending with NVDA is probably going to be diminished as well. Anything considered bad for NVDA's top spenders is going to be viewed as bad for NVDA too.
At a guess, Nvidia stock prices are basically fiction at this point (is there a lot of short AND long selling going on - IIRC, butterfly spreads?).
A good fundamental analysis is probably very hard to get right, and the game is probably just guessing which way everyone else will guess the herd is going to jump.
IBM probably couldn't have done things differently given the antitrust scrutiny they were under. And that was likely the best outcome for the industry anyway.
The bear case is something like “investors are going to call BS on multi-billion training DC investments”. That represents most of their short-run demand.
Not sure what is supposed to happen to the inference demand but I guess that could be modeled as more of a long-run thing, as inference is going to be very coin-operated (companies need it to be net profitable now) whereas training is more of a build now profit later game.
Jevons was talking about coal as an input to commercial processes, for which there were other alternatives that competed on price (e.g. manual/animal labour). Whatever the process, it generated a return, it had utility, and it had scale.
I argue it doesn't apply to generative AI because its outputs are mostly no good, have no utility, or are good but only in limited commercial contexts.
In the first case, a machine that produces garbage faster and cheaper doesn't mean demand for the garbage will increase. And in the second case, there aren't enough buyers for high-quality computer-generated pictures of toilets to meaningfully boost demand for Nvidia's products.
I recently had a discussion with a senior executive and his take on AI changed my outlook a bit. For him the value of ChatGPT(tm) wasn't so much the speedup on any particular task (like presentation generation); it's a replacement for consultants.
Yes, the value of those consultants mostly exists only if your internal team is too stubborn to change its opinion. But that seems to be the norm. And the value (those) consultants add is not that high in the first place! They don't have the internal knowledge of _why_ things are fucked up _your particular way_ anyway. That part your team has to contribute anyhow. So the value-add shrinks to "throw ideas over the wall and see what sticks". And LLMs are excellent at that.
Yes, that doesn't replace a highly technical consultant that does the actual implementation. Yes, that doesn't give you a good solution. But it probably gives you 5 starting points for a solution before you even finish googling which consultancy to pick (and then waiting for approval and hoping for a goodish team). And that's a story that I can map to reality (not that I like this new bit of information about reality..)
If we accept that story about LLM value, then I think NVIDIA is fine. The generated value is far greater than any amount of energy you can burn on inferring prompts, and the only effect will be that the compute-for-training to compute-for-inference ratio decreases further.
"Throw ideas and see what sticks" sounds very entry-level. Maybe it saves time it would take for one of your team to read first two chapters of a book on the topic.
That exec was hiring consultants and no longer is, in meaningful proportion, thanks to LLMs?
Thing is, most code is written by entry-level/junior programmers, as the whole career path has been stacked to start grooming you for management afterwards, and anything beyond senior level is basically faux-management (all the responsibilities, none of the prestige). LLMs, dirt-cheap as they are and only getting cheaper, are very much in position to compete with the bulk of workforce in software industry.
I don't know how things are in other white-collar industries (except wrt. creative jobs like copywriting and graphic design, where generative AI is even better at the job than it is at coding), but the incentives are similar, so I expect most of the actual work is done by juniors anyway, and subject to replacement by models less sophisticated than people would like to imagine they need to be.
The other thing is that if this pushes the envelope further on what AI models can do given a certain hardware budget, this might actually change minds. The pushback against generative AI today is that much of it is deployed in ways that are ultimately useless and annoying at best, and that in turn is because the capabilities of those models are vastly oversold (including internally in companies that ship products with them). But if a smarter model can actually e.g. reliably organize my email, that's a very different story.
When you get more marginal product from an input, it's expected you buy more of that input.
But at some point, if the marginal product gets high enough, the world needs not as many, because money spent on other inputs/factors pays off more.
This is a classic problem with extrapolation. Making people more efficient through the use of AI will tend to increase employment... until it doesn't and employment goes off a cliff. Getting more work done per unit of GPU will increase demand for GPUs ... until it doesn't, and GPU demand goes off the cliff.
It's always hard to tell where that cliff is, though.
Not even Microsoft Copilot 365 had a successful launch. The same happened with Apple Intelligence.
People talk like the end-user demand part of the equation is really solved, invoking an Econ 101 magical interpretation of the Law of Supply and Demand or the Jevons paradox.
I sure hope not, because what they've released so far is worse than useless. The "notifications summaries" in particular are hilariously bad. It's the same problem with Google's "AI Overview"--wrong or misleading enough that you simply can't trust any of it.
But to be clear, none of these things are commercial products. They're gimmicks. Google is an ads company, they make their money selling ads. Apple is a computer company, they make their money selling computers. These "AI products" are a circus sideshow.
Yeah, lots of them. I never thought I'd be paying for a search subscription but after a few months of using ChatGPT I expect to be paying for the privilege from now on. Maybe not paying OpenAI, but someone. There isn't much of a moat there, so there are going to be many companies basically on-selling GPU time. And even if for some weird reason there is no commercially successful AI-specific product it is causing shockwaves in how work is done, most people I know who are effective have worked it into their workflow somehow.
With other providers giving away similar products for free (Google AI Studio, DeepSeek et al) right now, I'm not sure that counts as commercial success when it is not sustainable.
The same is happening in enterprise-tier products: Copilot 365 is still an extra SKU to count, while Google Gemini Advanced has been integrated into the Workspace offering (i.e. they actually force an upsell of ~20% per user license for something we didn't ask for, but I digress). At least that's a better alternative than paying +20 USD per license.
Prices need to and will go down, and business models will have to change and they are already doing so. But I'm not sure if OpenAI is really ready for that.
I gave it an easy one, “How many of the actors from the original Star Trek are still alive”. It gave me accurate information as of its training cut off date. But ChatGPT automatically did a web search to validate its answer. I had to choose the search option for it to look up later info.
With ChatGPT even when it doesn’t do a web search automatically, I can either tell it to “validate this” or put in my prompt “validate all answers by doing a web lookup”.
Then I gave it a simple compounding interest problem with monthly payments and wanted a month by month breakdown. DeepSeek used its multi step reasoning like o1 and was slower. ChatGPT 4o just created a Python script and used its internal Python interpreter.
Then DeepSeek started timing out.
This is the presentation of “what are some of the best places to eat in Chicago?”
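For readers curious what "just created a Python script" looks like in practice, here is a hypothetical sketch of that kind of month-by-month compounding breakdown; the principal, rate, payment, and term are made-up example values, not the ones from the test described above.

```python
# Hypothetical example values; the original prompt's numbers aren't given above.
principal = 10_000.00      # starting balance
annual_rate = 0.05         # 5% nominal annual rate, compounded monthly
monthly_payment = 200.00   # deposit added at the end of each month
months = 12

balance = principal
for month in range(1, months + 1):
    interest = balance * annual_rate / 12      # interest accrued this month
    balance += interest + monthly_payment      # compound, then add the deposit
    print(f"Month {month:2d}: interest = {interest:7.2f}, balance = {balance:10.2f}")
```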
Are they losing billions on training or inference? If their current products - ChatGPT and the API - are profitable, i.e. the inference cost is less than they charge, they have a long-term sustainable business.
I meant 10 million paying subscribers not 10 million dollars. I put the dollar sign there by mistake. That’s $2.4 billion in revenue and growing not counting API customers.
The question is whether ChatGPT (the product) and running the API are profitable, or at least whether the trend is that costs are coming down.
>Really? Has anyone made a useful, commercially successful product with it yet?
Aren't millions or even tens of millions of students using ChatGPT, for example? To me that sounds like a commercial success (and looks comparable to the usage of Google Search - a money-printing machine for almost 30 years now - in its first years)
And enterprise-wise - I recently heard a VP complaining about entering expenses. As we don't have secretaries anymore in the civilized world, that means "Agents AI" is going to have a blast.
(I'm long on NVDA and wondering if there's enough blood in the streets to buy more :)
It isn't, because it's not making them any money. Having users doesn't mean you have a business. If you sell two dollars for one dollar, having more users is not a blessing financially. Of course you could slap ads on it, like Google, but unlike Google, OpenAI has no moat and there are already ten competitors. Competition eliminates profit, and AI is being commoditized faster than pretty much anything else.
NVidia's not selling LLM subscriptions, they're selling shovels in the goldrush. I don't think 3 trillion is a reasonable valuation either, but NVidia's applications extend way beyond consumer and they've effectively become the chokepoint for any application of AI
>and there's already ten potential competitors. Competition eliminates profit and AI is being commoditized faster than pretty much anything else.
we're discussing NVDA. Where are its competitors? ChatGPT having 10 competitors only makes things better for NVDA.
>Competition eliminates profit
Competition weeds out bad/ineffective performers, which is great. The history of our industry is littered with competition taking out bad performers, and the industry is only better for it. Fast commoditization of AI is just great and fits the best patterns, like, say, the PC revolution (and like it, I think the AI revolution won't be just one app/use case; it will be a tectonic shift).
> Aren't millions or even tens of millions of students using ChatGPT, for example? To me that sounds like a commercial success
I read somewhere that OpenAI brought in $3.7 billion in 2024, and made a loss of $6 billion. So... no I don't think that is an example. They want to make a commercially successful product, but ChatGPT doesn't seem to be there yet.
> And enterprise-wise - I recently heard a VP complaining about entering expenses. As we don't have secretaries anymore in the civilized world, that means "Agents AI" is going to have a blast.
We don't have secretaries because the word became unfashionable. They are called PAs or executive assistants or something like that now. They're still there, but if anything the need for them has probably been reduced with (non-AI) computers (calendars, contacts, emails, electronic documents, etc.) so I'm not sure that there is some enormous unmet demand for them.
What you are missing is that it turns out the gold isn’t actually gold. It’s bronze.
So at first, the shovelers were willing to spend thousands of dollars on a single shovel because they were expecting to get much more valuable gold out the other end.
But now that it's only bronze, they can't spend that much money on their tools anymore and still make their venture profitable. A lot of shovelers are gonna drop out of the race. And the ones that remain will not be willing to spend as much.
The fact that there isn't as much money to be made in AI anymore means that NVIDIA's cut of the total money to be made in AI will now shrink dramatically.
Perhaps they mean there's less wealth to be extracted from the closed-source training side of the equation, which requires huge capital investment, and promises even bigger returns by gatekeeping the technology.
Many discussed aspects are disconnected. Cost of training, cost of hardware(and margin there), cost of operation, possible use cases, and then finally demand.
Cheaper training still presupposes there is some use case for the trained models. There might or might not be. It could very well be that the cost of training was never what limited the number of usable models.
The gold rush is over because pre-trained models don't improve as much anymore. The application layer has massive gains in cost-to-value performance. We also gain more trust from the consumer as models don't hallucinate as much. This is what DeepSeek R1 has shown us. As Ilya Sutskever said, pre-training is now over.
We now have very expensive Nvidia shovels that use a lot of power but do very little improvement to the models.
Can anyone comment on why Wenfeng shared his secret sauce? Other than publicity, there only seem to be downsides for him, as now everyone else with more compute will just copy and improve?
Well, American investors seem to be shaking in their boots, and publicizing this attracts AI investment in China because it shows China can compete with, and even outcompete, the US in spite of the restrictions.
Yeah but NVIDIA's amazing digging technique that could only be accomplished with NVIDIA shovels is now irrelevant. Meaning there are more people selling shovels for the gold rush
DeepSeek's stuff is actually more dependent on Nvidia shovels. They implemented a bunch of assembly-level (PTX) optimizations below the CUDA stack that allowed them to use their H800s efficiently, which are interconnect-bandwidth-gimped vs. the H100s they can't easily buy on the open market. That's cool, but it doesn't run on any other GPUs.
Cue all of China rushing to Jensen to buy all the H800s they can before the embargo gets tightened, now that their peers have demonstrated that they're useful for something.
At least briefly, Jensen's customer audience increased.
I was thinking about that, but don’t those same optimizations work on H100s? and the concepts work on every other chip from Nvidia and every other manufacturer’s chip
I still think this is bullish: more people will be buying chips once they're cheaper and more accessible, and the things they will be training will be 1,000% to 10,000% larger.
"Probably possible" is nothing compared to "already implemented". How long will it take to apply those concepts to other chips? Will they also be made available to the degree DeepSeek's have been? By the time those alternatives are implemented, how much further improvement will have been made on Nvidia chips? Worst case, someone implements and open-sources these optimizations for a competitor's chip basically immediately, in which case the competitive landscape remains unchanged; in every other scenario this is a first-mover advantage for Nvidia.
> You can train, or at least run, llms on intel and less powerful chips
The claimed training breakthrough is an optimization targeting Nvidia chips, not something that reduces Nvidia's relative advantage. Even if it is easily generalizable to other vendors' hardware, it doesn't reduce Nvidia's advantage over other vendors; it just proportionately scales down the training requirements for a model of a given capacity. That might, very short term, reduce demand from the big existing incumbents, but it also increases the number of players for which investing in GPUs for model training is worthwhile at all, increasing aggregate demand.
It's not an optimization targeting Nvidia chips. It's an optimization of the technique through and through regardless of chip
But your point is well taken and perhaps both mine and GP's metaphors break down.
Either way, we saw massive spikes in demand for Nvidia when crypto mining became huge followed by a massive drop when we hit the crypto winter. We saw another massive spike when LLMs blew up and this may just be the analogous drop in demand for LLMs
You both seem to be talking past each other. There were a number of optimizations that made this possible. Some were with the model itself and are transferable, others are with the training pipeline and specific to the Nvidia hardware they trained on.
> What is stopping huawei or other Chinese vendors to make chips on deepseek specification
What is "deepseek specification"? Deepseek was trained on NVDA chips. If chinese vendors could build chips as good as NVDA it wouldn't have such a dominant position already, that hasn't changed
The thing with a gold rush is you often end up selling shovels after the gold has run out, but no one knows that until hindsight. There will probably be a couple of scares that the gold has run out first, too. And again, the difference is only visible in hindsight.
I find it interesting because the DeepSeek stuff, while very cool, doesn't seem to invalidate the idea that more compute would translate to even _higher_ capabilities?
It's amazing what they did with a limited budget, but instead of the takeaway being "we don't need that much compute to achieve X", it could also be, "These new results show that we can achieve even 1000*X with our currently planned compute buildout"
But perhaps the idea is more like: "We already have more AI capabilities than we know how to integrate into the economy for the time being" and if that's the hypothesis, then the availability of something this cheap would change the equation somewhat and possibly justify investing less money in more compute.
Probably not. If the price of Nvidia is dropping, it's because investors see a world where Nvidia hardware is less valuable, probably because it will be used less.
You can't do the distill/magnify cycle like you do with AlphaGo. LLMs have basically stalled in their base capabilities; pre-training is basically over at this point, so the new arms race will be over marginal capability gains and (mostly) making them cheaper and cheaper.
But inference time scaling, right?
A weak model can pretend to be a stronger model if you let it cook for a long time. But right now it looks like models as strong as what we have aren't going to be very useful even if you let them run for a long, long time. Basic logic problems still tank o3 if they're not a kind that it's seen before.
Basically, there doesn't seem to be a use case for big data centers that run small models for long periods of time, they are in a danger zone of both not doing anything interesting and taking way too long to do it.
The AI war is going to turn into a price war, by my estimations. The models will be around as strong as the ones we have, perhaps with one more crank of quality. Then comes the empty, meaningless battle of just providing that service for as close to free as possible.
If OpenAI's agents had panned out we might be having another conversation. But they didn't, and it wasn't even close.
This is probably it. There's not much left in the AI game
Your implication is that we have unlimited compute and therefore know that LLMs are stalled.
Have you considered that compute might be the reason why LLMs are stalled at the moment?
What made LLMs possible in the first place? Right, compute! The Transformer is 8 years old; technically GPT-4 could have been released 5 years ago. What stopped it? Simple: the available compute was way too low.
Nvidia has improved compute by 1000x in the past 8 years but what if training GPT5 takes 6-12 months for 1 run based on what OpenAI tries to do?
What we see right now is that pre-training has reached the limits of Hopper, and Big Tech is waiting for Blackwell. Blackwell will easily be 10x faster in cluster training (don't look at chip performance only), and since Big Tech intends to build 10x larger GPU clusters, they will have 100x the compute.
Let's see then how it turns out.
The limit on training is time. If you want to make something new and improved, then you have to limit training time, because nobody will wait 5-6 months for results anymore.
It was fine for OpenAI years ago to take months to years for new frontier models. But today the expectations are higher.
There is a reason why Blackwell is fully sold out for the year. AI research is totally starved for compute.
The best thing for Nvidia is also that while AI research companies compete with each other, they all try to get Nvidia AI HW.
The age of pre-training is basically over, I think everyone acknowledged this and it's not to do with not having a big enough cluster. The bull argument on AI is that inference time scaling will pull us to the next step
Except the o3 benchmarks are, seemingly, pretty solid evidence that leaving LLMs on for the better part of a day and spending a million dollars gets you... nothing. Passing a basic logic test with brute-force methods, then falling apart on a marginally easier test it just wasn't trained on.
The returns on compute and data seem to be diminishing, with exponential increases in inputs returning only marginal increases in quality, and we're out of quality training data, so that side is now much worse even if the scaling weren't plateauing.
All this, and the scale that got us this far seems to have done nothing to give us real intelligence; there's no planning or real reasoning, and this is demonstrated every time it tries to do something out of distribution, or even in distribution but just complicated. Even if we got another crank or two out of this, we're still at the bottom of the mountain. We haven't started and we're already out of gas.
Scale doesn't fix this any more than building a mile-tall fence stops the next break-in. If it was going to work, we would have seen it work already. LLMs don't have much juice left in the squeeze, imo.
We don't know, for example, what a larger model can do with the new techniques DeepSeek is using for improving/refining it. It's possible the new models on their own failed to show progress, but a combination of techniques will enable that barrier to be crossed.
We also don't know what the next discovery/breakthrough will be like. The reward for getting smarter AI is still huge and so the investment will likely remain huge for some time. If anything DeepSeek is showing us that there is still progress to be made.
Pending me getting an understanding of what those advances were, maybe?
But making things smaller is different than making them more powerful, those are different categories of advancement.
If you've noticed, models of varying sizes seem to converge on a narrow window of capabilities even when separated by years of supposed advancement. This should probably raise red flags
if you've tried to get o1 to give you outputs in a specific format, it often just tells you to take a hike. It's a stubborn model, which implies a lot
This is speculation, but it seems that the main benefit of reasoning models is that they provide a dimension along which RL can be applied to make them better at math and maybe coding, things with verifiable outputs.
Reasoning models likely don't learn better reasoning from their hidden reasoning tokens; they're 1) trying to find a magic token which, when raised to its attention, makes the model more effective (basically giving it room to say something that jogs its memory), or 2) trying to find a series of steps which do a better job of solving a specific class of problem than a single pass does, making it more flexible in some senses but more stubborn along others.
Reasoning data as training data is a poison pill, in all likelihood, and just makes a small window of RL-vulnerable problems easier to answer (when we have systems that don't do better). It doesn't really plan well, doesn't truly learn reasoning, etc.
Maybe seeing the actual output of o3 will change my mind but I'm horrifically bearish on reasoning models
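To make the "verifiable outputs" point concrete: the published R1 recipe reportedly relies on rule-based rewards (answer correctness plus format checks) rather than a learned reward model. Below is a minimal sketch of what such an accuracy check might look like; the \boxed{...} answer convention and exact-match rule here are illustrative assumptions, not anyone's actual implementation.

```python
import re

def rule_based_reward(completion: str, ground_truth: str) -> float:
    """Score a completion purely on its final answer, the kind of verifiable
    signal RL can optimize for math-style problems (assumed answer format)."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # no parseable final answer -> no reward
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0

# Reward is 1.0 only when the boxed answer matches exactly.
print(rule_based_reward(r"... so the total is \boxed{42}", "42"))  # 1.0
print(rule_based_reward(r"... I think it's \boxed{41}", "42"))     # 0.0
```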
It really doesn't, lol. Those laws are like Moore's law: an observation rather than something fundamental like the laws of physics.
The scaling has been plateauing, and half that equation is quality training data which is totally out at this point.
Maybe reasoning models will help produce synthetic data but that's still to be seen. So far the only benefit reasoning seems to bring is fossilizing the models and improving outputs along a narrow band of verifiable answers that you can do RL on to get correct
Synthetic data maybe buys you time, but it's one turn of the crank and not much more
> doesn't seem to invalidate the idea that more compute would translate to even _higher_ capabilities?
That's how i understand it.
And since their current goal seems to be 'AGI', and their current plan for achieving it seems to be scaling LLMs (network-depth-wise and, at inference time, prompt-wise), I don't see why it wouldn't hold.
The stock market is not the economy, Wall Street is not Main Street. You need to look at this more macroscopically if you want to understand this.
Basically: China tech sector just made a big splash, traders who witnessed this think other traders will sell because maybe US tech sector wasn't as hot, so they sell as other traders also think that and sell.
The fall will come to rest once stocks have fallen enough that traders stop thinking other traders will sell.
Investors holding for the long haul will see this fall as stocks going on sale and proceed to buy because they think other investors will buy.
Meanwhile in the real world, on Main Street, nothing has really changed.
Bogleheads meanwhile are just starting the day with their coffee, no damns given to the machinations of the stock market because it's Monday and there's work to be done.
Is it really related to China's tech sector as such, though? If this is true, then OpenAI, Google, or even companies many magnitudes smaller can just replicate similar methods in their own processes and provide models which are just as good or better. However, they'll need far fewer Nvidia GPUs and less other hardware to do that than when training their current models.
s&p500 was still up by normal amounts during 2023 and 2024 if you exclude big tech. definitely they are an outsize portion of the index but that doesn't mean the rest of the economy isn't growing. https://www.inc.com/phil-rosen/stock-market-outlook-sp500-in...
The biggest discussion I have been having on this is the implication of DeepSeek for, say, the ROI on an H100. Will a sudden spike in available GPUs and a reduction in demand (from more efficient GPU usage) dramatically shock the cost per hour to rent a GPU? This, I think, is the critical value for measuring the investment case for Blackwell now.
The price for an H100 per hour has gone from a peak of $8.42 to about $1.80.
An H100 consumes 700W; let's say $0.10 per kWh.
An H100 costs around $30,000.
Given DeepSeek, can that hourly price drop further, now that a much larger supply of available GPUs (MI300X, H200s, H800s, etc.) has been shown to be usable?
Now that LLMs have effectively become a commodity, with a significant price floor, is this new rental price still above what is profitable for the card?
Given that the new Blackwell is $70,000, are there sufficient applications that enable customers to get an ROI on the new card?
I am curious about this, as I think I am currently ignorant of the types of applications whose value to businesses outweighs the costs. I predict the cost per hour of the GPU will drop to the point where it isn't such a no-brainer investment compared to before, especially if it is now possible to unlock potential from much older platforms running at lower electricity rates.
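As a rough sanity check on those figures, here is a back-of-the-envelope payback calculation using the numbers quoted above ($1.80/hr rental, 700 W draw, $0.10/kWh, $30,000 card). The utilization figure is an assumption, and datacenter overheads (cooling, networking, space, financing) are ignored, so treat it as an optimistic bound on how quickly the card pays for itself.

```python
# Back-of-the-envelope H100 payback; utilization is assumed, overheads ignored.
RENTAL_RATE = 1.80     # $/GPU-hour (current price quoted above)
POWER_KW = 0.7         # 700 W board power
ENERGY_COST = 0.10     # $/kWh
CARD_COST = 30_000     # $ per H100
UTILIZATION = 0.80     # assumed fraction of hours actually rented out

hourly_margin = UTILIZATION * RENTAL_RATE - POWER_KW * ENERGY_COST
payback_hours = CARD_COST / hourly_margin
print(f"margin: ${hourly_margin:.2f}/hr, payback: {payback_hours:,.0f} h "
      f"(~{payback_hours / 24 / 365:.1f} years)")
# At $1.80/hr this lands around 2.5 years; at the old $8.42/hr it was ~6 months.
```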
Why is there this implicit assumption that more efficient training/inference will reduce GPU demand? It seems more likely - based on historical precedent in the computing industry - that demand will expand to fill the available hardware.
We can do more inference and more training on fewer GPUs. That doesn’t mean we need to stop buying GPUs. Unless people think we’re already doing the most training/inference we’ll ever need to do…
Historically most compute went to running games in people's homes, because companies didn't see a need to run that much analytics. I don't see why that wouldn't happen now as well; there is a limit to how much value you can get out of this, since these models aren't AGI yet.
This just seems like a very bold statement to make in the first two years of LLMs. There are so many workflows where they are either not yet embedded at all, or only involved in a limited capacity. It doesn’t take much imagination to see the areas for growth. And that’s before even considering the growth in adoption. I think it’s a safe bet that LLM usage will proliferate in terms of both number of users, and number of inferences per user. And I wouldn’t be surprised if that growth is exponential on both those dimensions.
> This just seems like a very bold statement to make in the first two years of LLMs
GPT-3 is 5 years old; this tech has been looking for a problem to solve for a really long time now. Many billions have already been burned trying to find a viable business model for these, and so far nothing has been found that warrants anything even close to multi-trillion-dollar valuations.
Even when the product is free, people don't use ChatGPT that much, so making things cheaper will just reduce the demand for compute.
> It has basically replaced search for most people.
Not because it's better than search was, though.
They lost the spam battle, and internally lost the "ads should be distinct" battle, and now search sucks. It'll happen to the AI models soon enough; I fully expect to be able to buy responses for questions like "what's the best 27" monitor?" via Google AdWords.
Over the long run maybe, but for the next 2 years the market will struggle to find a use for all these possible extra GPUs. There is no real consumer demand for AI products, and lots of backlash whenever they're implemented, e.g. that Coca-Cola ad. It's going to be a big hit to demand in the short to medium term as the hyperscalers cut back/reassess.
Seems like your reasoning for how the next 2 years will go is a little slanted. And everyone in this thread is neglecting any demand issues stemming from market cycles.
In a thread full of people who have no idea what they're talking about either from the ML side or the finance side, this is the worst take here.
OpenAI alone reports hundreds of millions of MAU. That's before we talk about all of the other players. Before we talk about the immense demand in media like Hollywood and games.
Heck there's an entire new entertainment industry forming with things like character ai having more than 20M MAU. Midjourney has about the same.
Definitely. An industry in its infancy that already has hundreds of millions of MAU across it shows that there's zero demand, all because of some ad no one has seen.
I feel like this is a symptom of our broken economic system that has allowed too much cash to be trapped in the markets, forever making mostly imaginary numbers go up while the middle class gets squeezed and the poor continue to suffer.
A fundamental feature of a capitalist system is that you can use money to make more money. That's great for growing wealth. But you have to be careful; it's like a sound system at a concert. When you install it, everybody benefits from being able to hear the band. But it is easy to cause an ear-splitting feedback loop if you don't keep the singers a safe distance from the speakers. Unfortunately, the only way people have to quantify how good a concert sounds is by loudness, and because the awful screeching of a feedback loop is about the loudest thing possible, we've just been holding the microphone up to the speaker for close to 50 years and telling ourselves that everybody is enjoying the music.
It is the job of the government, because nobody else can do it, to prevent the runaway feedback loop that is a fundamental flaw of capitalism, and our government has been entirely derelict in their duty. This has caused market distortions that go beyond the stock market. The housing market is also suffering for example. There is way too much money at the top looking for anything that can create a return, and when something looks promising it gets inflated to ridiculous levels, far beyond what is helpful for a company trying to expand their business. There's so much money most of it has to be dumb money.
TINA. Government forced everyone to save for retirement this way. There’s way too much capital trapped in SPY and that’s going to create distortions in price discovery and these abrupt corrections in individual stocks.
How does not printing money make any difference here? It's not like people put trillions of cash into nVidia. The cap is just outstanding shares times what they were last sold for, if someone buys at a higher price suddenly lots of phantom money appears and everybody who owns shares gets richer on paper.
It can still be helpful. Some people are already on the other side of that wall, or someone else usually comes along with an archived, non-paywall version.
Really it should be adjusted for global (or US) total market cap. Market cap tends to go up faster than inflation, so even if you adjust for inflation, it will still be skewed toward modern companies.
The part of this that doesn’t jibe with me is the fact that they also released this incredibly detailed technical report on their architecture and training strategy. The paper is well-written and has a lot of specifics. Exactly the opposite of what you would do if you had truly made an advancement of world-altering magnitude. All this says to me is that the models themselves have very little intrinsic value / are highly fungible. The true value lies in the software interfaces to the models, and the ability to make it easy to plug your data into the models.
My guess is the consumer market will ultimately be won by 2-3 players that make the best app / interface and leverage some kind of network effect, and enterprise market will just be captured by the people who have the enterprise data, I.e. MSFT, AMZN, GOOG. Depending on just how impactful AI can be for consumers, this could upend Apple if a full mobile hardware+OS redesign is able to create a step change in seamlessness of UI. That seems to me to be the biggest unknown now - how will hardware and devices adapt?
NVDA will still do quite well because as others have noted, if it’s cheaper to train, the balance will just shift toward deploying more edge devices for inference, which is necessary to realize the value built up in the bubble anyway. Some day the compute will become more fungible but the momentum behind the nvidia ecosystem is way too strong right now.
What has changed is the perception that people like OpenAI/MSFT would have an edge on the competition because of their huge datacenters full of NVDA hardware. That is no longer true. People now believe that you can build very capable AI applications for far less money. So the perception is that the big guys no longer have an edge.
Tesla had already proven that to be wrong. Tesla's Hardware 3 is a 6-year-old design, and it does amazingly well on less than 300 watts. And that was mostly trained on an 8k cluster.
I mean, I think they still do have an edge - ChatGPT is a great app and has strong consumer recognition already, very hard to displace, and MSFT has a major installed base of enterprise customers who cannot readily switch cloud / productivity suite providers. So I guess they still have an edge, it's just more of a traditional edge.
> The part of this that doesn’t jibe with me is the fact that they also released this incredibly detailed technical report on their architecture and training strategy. The paper is well-written and has a lot of specifics. Exactly the opposite of what you would do if you had truly made an advancement of world-altering magnitude.
I disagree completely with this sentiment. This was in fact the trend for a century or more (see inventions ranging from the polio vaccine to "Attention Is All You Need" by Vaswani et al.) before "Open"AI became the biggest player on the market and Sam Altman tried to bag all the gains for himself. Hopefully, we can reverse course on this trend and go back to a world where world-changing innovations are shared openly so they can actually change the world.
Exactly. There's a strong case for being open about the advancements in AI. Secretive companies like Microsoft, OpenAI, and others are undercut by DeepSeek and by any other company on the globe that wants to build on what they've published. Politically there are more reasons why China should not become the global center of AI and fewer reasons why the US should remain the center of it. Therefore, an approach that enables AI institutions worldwide makes more sense for China at this stage. The EU, for example, has even less reason now to form a dependency on OpenAI and Nvidia, which works to the advantage of China and Chinese AI companies.
I’m not arguing for/against the altruistic ideal of sharing technological advancements with society, I’m just saying that having a great model architecture is really not a defensible value proposition for a business. Maybe more accurate to say publishing everything in detail indicates that it’s likely not a defensible advancement, not that it isn’t significant.
I always thought AMZN is the winner since I looked into Bedrock. When I saw Claude on there it added a fuck yeah, and now the best models being open just takes it to another level.
AWS's usual moat doesn't really apply here. AWS is Hotel California — if your business and data are in AWS, the cost of moving any data-intensive portion out of AWS is absurd due to egress fees. But LLM inference is not data-transfer intensive at all — a relatively small number of bytes/tokens goes to the model, it does a lot of compute, and a relatively small number of tokens comes back. So a business that's stuck in AWS can cost-effectively outsource its LLM inference to a competitor without any substantial egress fees.
RAG is kind of an exception, but RAG still splits the database part from the inference part, and the inference part is what needs lots of inference-time compute. AWS may still have a strong moat for the compute needed to build an embedding database in the first place.
Simple, cheap, low-compute inference on large amounts of data is another exception, but this use will strongly favor the “cheap” part, which means there may not be as much money in it for AWS. No one is about to do o3-style inference on each of 1M old business records.
You are not taking into account why people are willing to pay exceedingly high prices for GPUs now and that the underlying reason may have been taken away.
Build trust by releasing your inferior product for free and as open as possible. Get attention, then release your superior product behind paywall. Name recognition is incredibly important within and outside of China.
Keep in mind, they’re still competing with Baidu, Tencent and other AI labs.
Nvidia has gotten lucky repeatedly. The GPUs were great for PC gaming and they were the top dog. The crypto boom was such an unexpected win for them, partly because Intel killed off their competition by acquiring it. Then the AI boom is also a direct result of Intel killing off their competition, but the acquisition is too far removed to credit it to that event.
Unlike the crypto boom, though, two factors make me think the AI windfall is bound to fade quickly.
Unlike crypto, there is no mathematical lower bound on the computation, and if you look at technology's history, we can tell the models are going to get better/smaller/faster over time, reducing our reliance on the GPU.
Crypto was fringe but AI is fundamental to every software stack and every company. There is way too much money in this to just let Nvidia take it all. One way or another the reliance on it will be reduced
> the models are going to get better/smaller/faster overtime reducing our reliance on the GPU
Yes, because we've seen that with other software. I no longer want a GPU for my computer because I play games from the 90s and the CPU has grown powerful enough to suffice... except that's not the case at all. Software grew in complexity and quality with available compute resources and we have no reason to think "AI" will be any different.
Are you satisfied with today's models and their inaccuracies and hallucinations? Why do you think we will solve those problems without more HW?
because that's what history shows us. back in the 90s, MPEG-1/2 took dedicated hardware expansion cards to handle the encoding because software was just too damn slow. eventually, CPUs caught up, and dedicated instructions were added to the CPU to make software encoding multiple times faster than real-time. Then, H.264 came along and CPUs were slow for encoding again. Special instructions were added to the CPU again, and software encoding is multiple times faster again. We're now at H.265 and 8K video where encoding is slow on CPU. Can you guess what the next step will be?
Not all software is written badly where it becomes bloatware. Some people still squeeze everything they can, admittedly, the numbers are becoming smaller. Just like the quote, "why would I spend money to optimize Windows when hardware keeps improving" does seem to be group think now. If only more people gave a shit about their code vs meeting some bonus accomplishment
But seriously, video encoding isn't AI. Video encoding is a well understood problem. We can't even make "AI" that doesn't hallucinate yet. We're not sure what architectures will be needed for progress in AI. I get that we're all drunk on our analogies in the vacuum of our ignorance but we need to have a bit of humility and awareness of where we're at.
Including considering that it can't be made much better, that the hallucinations are a fundamental trait that cannot be eliminated, that this will all come tumbling down in a year or three. You seem to want to consider every possible positive future if we just work harder or longer at it, while ignoring the most likely outcomes that are nearer term and far from positive.
Conversely, can you name one computing thing that used to be hard when it was first created that is still hard in the same way today after generations of software/hardware improvements?
Simulations and pretty much any large scale modelling task. Why do you think people build supercomputers?
Now that I mentioned it, I think supercomputers and the jobs they run are the perfect analog for AI at this stage. It's a problem that we could throw nearly limitless compute at if it were cost effective to do so. HPC encompasses a class of problems for which we have to make compromises because we can't begin to compute the ideal(sort of like using reduced precision in deep-learning). HPC scale problems have always been hard and as we add capabilities we will likely just soak them up to perform more accurate or larger computational tasks.
To quote Andrej Karpathy
(https://x.com/karpathy/status/1883941452738355376): "I will say that Deep Learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed in AI. You may not always be utilizing it fully but I would never bet against compute as the upper bound for achievable intelligence in the long run. Not just for an individual final training run, but also for the entire innovation / experimentation engine that silently underlies all the algorithmic innovations."
VR is as dead this time around as it was in the mid-2000s and the mid-1990s and the mid-1980s, each of the times I've used it it was just as awful as before with nausea, eyestrain, headaches, neck and face fatigue, it's truly a f**ed space and it's failed over and over, this time with Apple and Facebook spending tens of billions on it. VR is a perfect reply to your question here.
Honestly, you'd be shocked at how much gaming you can get done on the integrated gpus that are just shoved in these days. Sure, you won't be playing the most graphically demanding things, but think of platforms like the Switch, or games like Stardew. You can easily go without a dedicated GPU and still have a plethora of games.
And as for AI, there's probably so much room for improvement on the software side that it will probably be the case that the smarter, more performant AIs will not necessarily have to be on the top of the line hardware.
I think the point was not that we won't still use a lot of hardware, it's that it won't necessarily always be Nvidia. Nvidia got lucky when both crypto and AI arrived because it had the best available ready-made thing to do the job, but it's not like it's the best possible thing. Crypto eventually got its ASICs that made GPUs uncompetitive after all.
the aaa games industry is struggling (e.g. look at the profit warnings, share price drops and studio closures) specifically because people are doing that en masse.
but those 90s games are not old - retro has become a movement within gaming and there is a whole cottage industry of "indie" games building that aesthetic because it is cheap and fun.
Money isn't fringe, and the target for crypto is all transactions, rather than the existing model where you pay between 2% and 3.5% to a card company or other middleman.
Credit card companies averaged over 22,000 transactions per second in 2023 without ever having to raise the fee. How many is crypto even capable of processing? Processing without the fee going up? What fraud protection guarantees are offered to the parties of crypto transactions?
Does everyone just need to get out of Bitcoin and get into Solana before a stampede happens? If Bitcoin crashes, all coins will crash, because there's hundreds of them to choose from. You're playing with tulips.
Yes, you are right. The traditional financial system is indeed more popular than crypto.
I’m not sure what your point is, but yes, I absolutely agree.
Obviously, that has no effect on the capacity of crypto to take over the volume of existing financial transactions and largely replace existing middlemen.
Random old tech from 2015 also had wildly fluctuating transaction fees. Likewise I can’t run call of duty on my ZX spectrum. I’m not sure what your point is there either, but yes, I agree. Obviously old tech being old doesn’t affect the capabilities of new tech, and the vast majority of payments are done on Solana rather than these old networks.
> Come on now, you know I was referring to consumer fraud
No I didn’t. But it was late.
My point, that crypto already has the capacity and the low fees, remains unscathed.
Jeez chill... it’s just back to where it was 4 months ago and even after the drop it is still up 100% compared to this time last year! And it’s all fake inflated money.
This unprecedented growth simply couldn’t continue forever.
Not sure what the fuss is. I tried Deepseek earlier today for the first time and it was even worse than o1 when it came to reasoning skills and following my requests for how I wanted to engage with it.
o1 at least gives it to me straight. When I ask it to engage in more back and forth before assuming what I'm after, it tends to follow through. Deepseek seemed immediately eager to (very slowly) feed me a bunch of made up information thinking that's what I wanted.
I feel as though a lot of people get hung up on these sorts of "micro-benchmarks", whereas getting practical work done is severely undertested. I'm not a fan of OpenAI at all, but I don't have the spare compute to run anything locally, so o1 suffices for now.
Still don't see how this is anything but a win for Nvidia though.
The excitement isn't about the capabilities of the model; it's about how efficiently it was created. One of the major lessons in AI in the last couple of years was that scale mattered: you would want to throw more and more compute at a problem, and that translated into incredible share prices for Nvidia and incredible investments in data centres and energy generation. If it turns out we didn't actually need quite such incredible scale to get these results, and we were just missing some really quite basic efficiency optimizations, then the entire investment cycle into Nvidia, data centres and energy generation is going to whipsaw in an incredible way.
Essentially, Deepseek is showing that there is a lot of room for improvement with AIs. To paraphrase Orwell, AIs are a lot more like Alarm Clocks and a lot less like Manhattan Projects.
R1 is the first model I've used that one-shotted a full JavaScript tetris with all the edge-case keyboard handling and scoring. It also one-shotted an AI snake game. With the right prompts I've found it consistently better than o1 and Claude 3.5 Sonnet.
o1 does not show the reasoning trace at this point. You may be confusing the final answer for the <think></think> reasoning trace in the middle, it's shown pretty clearly on r1.
I wasn't really referring to the UI so much as to the fact that it does it at all. The thinking in DeepSeek trails off into its own nonsense before it answers, whereas I feel OpenAI's is way more structured.
Reassessing directives
Considering alternatives
Exploring secondary and tertiary aspects
Revising initial thoughts
Confirming factual assertions
Performing math
Wasting electricity
... and other useless (and generally meaningless) placeholder updates. Nothing like what the <think> output from DeepSeek's model demonstrates.
As Karpathy (among others) has noted, the <think> output shows signs of genuine emergent behavior. Presumably the same thing is going on behind the scenes in the OpenAI omni reasoning models, but we have no way of knowing, because they consider revealing the CoT output to be "unsafe."
IMO this is less about DeepSeek and more that Nvidia is essentially a bubble/meme stock that is divorced from the reality of finance and business. People/institutions who bought on nothing but hype are now panic selling. DeepSeek provided the spark, but that's all that was needed, just like how a vague rumor is enough to cause bank runs.
Not quite, I believe this sell off was caused by DeepSeek showing with their new model that the hardware demands of AI are not necessarily as high as everyone has assumed (as required by competing models).
I've tried their 7B model, running locally on a 6 GB laptop GPU. It's not fast, but the results I've had have rivaled GPT-4. It's impressive.
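For anyone wanting to try the same thing, here is a minimal sketch (not the commenter's actual setup) of loading one of the distilled 7B checkpoints with 4-bit quantization so it fits in roughly 6 GB of VRAM; the Hugging Face model ID and the memory headroom are my assumptions.

```python
# Minimal local-inference sketch; model ID and 4-bit fit are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # shrink weights ~4x
    device_map="auto",
)

prompt = "Explain in two sentences why the sky is blue."
inputs = tok(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(output[0], skip_special_tokens=True))
```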
People who can use the full 671B model will use the best model they can get. What DeepSeek really did was start an AI "space race" to AGI with China, and this race is running on Nvidia GPUs.
Some hobbyists will run the smaller model, but if you could, why not use the bigger & better one?
Model distillation has been a thing for over a decade, and LLM distillation has been widespread since 2023 [1].
There is nothing new in being able to leverage a bigger model to enrich smaller models. This is what people who don't understand the AI space got out of it, but it's clearly wrong.
OpenAI has smaller models too, with o1-mini and 4o-mini, and phi-1 showed that distillation could make a model 10x smaller perform as well as a much bigger one. The issue with these models is that they can't generalize as well. Bigger models will always win at first; then you can specialize them.
DeepSeek also showed that Nvidia GPUs could be used far more memory-efficiently, which pushes Nvidia even further ahead of alternative accelerators like Groq or AMD.
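For readers who haven't seen it, the "leverage a bigger model to enrich smaller models" idea is usually introduced via the classic logit-matching loss below (a minimal PyTorch sketch of Hinton-style distillation). This is the textbook formulation for illustration only; the R1 distillations were reportedly produced by plain supervised fine-tuning on samples generated by the larger model, not by logit matching.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft KL term against the teacher's temperature-scaled
    distribution with the usual hard-label cross-entropy (Hinton et al.)."""
    soft_teacher = F.log_softmax(teacher_logits / T, dim=-1)
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    # T^2 scaling keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(soft_student, soft_teacher, log_target=True,
                  reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```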
I believe you that it had to do with the selloff, but I believe that efficiency improvements are good news for NVIDIA: each card just got 20x more useful
That still means that AI firms don't have to buy as many of Nvidia's chips, which is the whole thing Nvidia's price was predicated on. FB, Google and Microsoft just had their billions of dollars in Nvidia GPU capex blown out by a $5M side project. Tech firms are probably not going to be as generous shelling out whatever overinflated price Nvidia is asking as they were a week ago.
Although there's the Jevons paradox possibility that more efficient AI will drive even more demand for AI chips because more uses will be found for them. But possibly not super high end NVDA chips but instead little Apple iPhone AI cores or smartwatch AI cores, etc.
Although not all commodities will work like fossil fuels did in Jevons' paradox. It could be the case that demand for AI doesn't grow fast enough to keep demand for chips as high as it was, as efficiency improves.
> But possibly not super high end NVDA chips but instead little Apple iPhone AI cores or smartwatch AI cores, etc.
We tried that, though. NPUs are in all sorts of hardware, and it is entirely wasted silicon for most users, most of the time. They don't do LLM inference, they don't generate images, and they don't train models. Too weak to work, too specialized to be useful.
Nvidia "wins" by comparison because they don't specialize their hardware. The GPU is the NPU, and it's power scales with the size of GPU you own. The capability of a 0.75w NPU is rendered useless by the scale, capability and efficiency of a cluster of 600w dGPU clusters.
Wrong conclusion, IMO. This makes inference more cost effective which means self-hosting suddenly becomes more attractive to a wider share of the market.
GPUs will continue to be bought up as fast as fabs can spit them out.
The number of people interested in doing self-hosting for AI at the moment is a tiny, tiny percentage of enthusiast computer users, who indeed get to play with self-hosted LLMs on consumer hardware now.. but the promise of these AI companies is that LLMs will be the "next internet", or even the "next electricity" according to Sam Altman, all of which will run exclusively on Nvidia chips running in mega-datacenters, the promise of which was priced into Nvidia's share price as of last Friday. That appears on shaky ground now.
> That still means that that AI firms don't have to buy as many of Nvidia's chips
Couldn't you say that about Blackwell as well? Blackwell is 25x more energy-efficient for generative AI tasks and offers up to 2.5x faster AI training performance overall.
The industry is compute-starved, and that makes total sense.
The transformer model on which current LLMs are based is 8 years old. So why did it take until only 2 years ago to get to these LLMs?
Simple: Nvidia first had to push compute at scale, hard. Try training GPT-4 on Voltas from 2017. Good luck with that!
Current LLMs are possible thanks to the compute Nvidia has provided over the past decade. You could technically use 20-year-old CPUs for LLMs, but you might need to connect a billion of them.
Always hilarious to see westerners concerned about privacy when it comes to China, yet not concerned at all about their own governments that know far more about you. Do they think some Chinese policeman is going to come to their door? Never heard of Snowden or the five eyes?
You can rent 10k H100s for 20 days with that money. Go knock yourself out, because that is probably more compute than DeepSeek got for the money. And that is public cloud pricing for a single H100. I'm sure if you ask for 10k H100s you'll get them at half price, so easily 40 days of training.
DeepSeek has fooled everyone by quoting such a small number; people think they only need to "buy" $5M worth of GPUs, but that's wrong. The money is the training cost, i.e. the cost of renting the GPU hours.
Somebody still had to install the 10k GPUs, and that meant paying $300M to Nvidia.
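To make that arithmetic explicit, here is the same back-of-the-envelope with the hourly rate left as a variable, since real quotes vary a lot; the ~$5M budget and 10k-GPU fleet are the figures from the comment above, while the rates are assumptions.

```python
# GPU-days rentable for a fixed budget at various assumed hourly rates.
budget = 5_000_000   # ~$5M, the oft-quoted training figure
gpus = 10_000

for rate in (1.00, 1.80, 2.50):          # assumed $/GPU-hour
    days = budget / rate / gpus / 24
    print(f"${rate:.2f}/GPU-hour -> {days:4.1f} days on {gpus:,} GPUs")
# Roughly: $1/hr buys ~21 days, $2.50/hr closer to 8; bulk discounts stretch it.
```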
They only got more useful if the AI goldrush participants actually strike, well, gold. Otherwise it's not useful at all. Afaict it remains to be seen whether any of this AI stuff has actual commercial value. It's all just speculation predicated on thoughts and prayers.
When your business is selling a large number of cards to giant companies you don't want them to be 20x more useful because then people will buy fewer of them to do the same amount of work
each card is not 20x more useful lol. there's no evidence yet that the deepseek architecture would even yield a substantially (20x) more performant model with more compute.
if there's evidence to the contrary I'd love to see it. in any case I don't think an H800 is even 20x better than an H100 anyway, so the 20x increase has to be wrong.
We need GPUs for inference, not just training. The Jevons Paradox suggests that reducing the cost per token will increase the overall demand for inference.
Also, everything we know about LLMs points to an entirely predictable correlation between training compute and performance.
Jevons paradox doesn't really suggest anything by itself. Jevons paradox is something that occurs in some instances of increased efficiency, but not all. I suppose the important question here is "What is the price elasticity of demand of inference?"
Personally, in the six months prior to the release of the deepseekv3 api, I'd made probably 100-200 api calls per month to llm services. In the past week I made 2.8 million api calls to dsv3.
Processing each English (word, part-of-speech, sense) triple in various ways. Generating (very silly) example sentences for each triple in various styles. Generating 'difficulty' ratings for each triple. Two examples (a sketch of the kind of API call involved follows them):
High difficulty:
id = 37810
word = dendroid
pos = noun
sense = (mathematics) A connected continuum that is arcwise connected and hereditarily unicoherent.
elo = 2408.61936886416
sentence2 = The dendroid, that arboreal structure of the Real, emerges not as a mere geometric curiosity but as the very topology of desire, its branches both infinite and indivisible, a map of the unconscious where every detour is already inscribed in the unicoherence of the subject's jouissance.
Low difficulty:
id = 11910
word = bed
pos = noun
sense = A flat, soft piece of furniture designed for resting or sleeping.
elo = 447.32459484266
sentence2 = The city outside my window never closed its eyes, but I did, sinking into the cold embrace of a bed that smelled faintly of whiskey and regret.
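For context on what those calls look like: nothing exotic. A minimal sketch of one such request, using the OpenAI-compatible Python client pointed at DeepSeek's API (the endpoint, model name, prompt, and helper function are my assumptions for illustration, not the commenter's actual pipeline):

    from openai import OpenAI

    # DeepSeek exposes an OpenAI-compatible API; the base_url and model name below
    # are assumptions for illustration, not details from the commenter's pipeline.
    client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

    def example_sentence(word: str, pos: str, sense: str, style: str) -> str:
        """Ask the model for one example sentence for a (word, part-of-speech, sense) triple."""
        resp = client.chat.completions.create(
            model="deepseek-chat",
            messages=[{
                "role": "user",
                "content": (
                    f"Write one {style} example sentence that uses '{word}' "
                    f"as a {pos} in this sense: {sense}"
                ),
            }],
        )
        return resp.choices[0].message.content

    print(example_sentence(
        "dendroid", "noun",
        "a connected continuum that is arcwise connected and hereditarily unicoherent",
        "overwrought literary-theory",
    ))

At a few hundred tokens per call, millions of calls like this only become thinkable once the per-token price drops sharply, which is exactly the elasticity point above.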
the jevons paradox isn't about any particular product or company's product, so is irrelevant here. the relevant resource here is compute, which is already a commodity. secondly, even if it were about GPUs in particular, there's no evidence that nvidia would be able to sustain such high margins if fewer were necessary for equivalent performance. things are currently supply constrained, which gives nvidia price optionality.
> there's no evidence yet that the deepseek architecture would even yield a substantially more performant model with more compute.
It's supposed to. There were reports that the longer 'thinking' time is what makes the o3 model better than o1, i.e. at least at inference, compute still matters.
> It's supposed to. There were reports that the longer 'thinking' time is what makes the o3 model better than o1, i.e. at least at inference, compute still matters.
compute matters, but performance doesn't scale with compute from what I've heard about o3 vs o1.
you shouldn't take my word for it - go on the leaderboards and look at the top models from now, and then the top models from 2023 and look at the compute involved for both. there's obviously a huge increase, but it isn't proportional
Blackwell DC is $40k per piece and Digits is $3k per piece. So if 13x Digits are sold, that's the same turnover for Nvidia as one DC GPU. Yes, maybe lower margin, but Nvidia can scale Digits to the mass market far more easily than Blackwell DC GPUs.
In the end, the winner is Nvidia, because Nvidia doesn't care whether a DC GPU, Gaming GPU, Digits GPU or Jetson GPU is used for AI, as long as Nvidia is used 98% of the time for AI workloads. That is the world-domination goal, simple as that.
And that's what Wall Street doesn't get. Digits is 50% more turnover than the largest RTX GPU. On average, gaming GPU turnover is probably around $500 per GPU, and Nvidia probably sells 5 million gaming GPUs per quarter. Imagine they could reach such volumes with Digits: that would be $15B of revenue, almost half of current DC revenue, from Digits alone.
Not quite, I believe this sell off was caused by Shockley showing with their "transistor" that the electricity demands of computers are not necessarily as high as everyone has assumed (as required by vacuum tubes).
Electricity demands will plummet when transistors take the place of vacuum tubes.
I've run their distilled 70B model and didn't come away too impressed -- feels similar to the existing base model it was trained on, which also rivaled GPT4
Exactly, and firing up reactors to train models just lost all its luster. Those standing before the Stargate will be bored with the whole thing by the end of the week.
that's naive arithmetic (a Milchmädchenrechnung). if it turns out that you can achieve the status quo with 1% of the expected effort, then that just means you can achieve approximately 10 times the status quo (assuming exponential scaling) with the established budget! and this race is a race to the sky (as opposed to the bottom) ... he who reaches AGI first takes the cake, buddy.
Hype buyers are also Hype sellers - anything Nvidia was last week is exactly what it is this week - DeepSeek doesn't really have any impact on Nvidia sales - Some argument could be made that this can shift compute off of cloud and onto end user devices, but that really seems like a stretch given what I've seen running this locally.
The full DeepSeek model is ~700B params or so, way too large for most end users to run locally. What some folks are running locally are fine-tuned versions of Llama and Qwen, which are not directly comparable in any way.
I agree hype is a big portion of it, but if DeepSeek really has found a way to train models just as good as frontier ones for a hundredth of the hardware investment, that is a substantial material difference for Nvidia's future earnings.
> if DeepSeek really has found a way to train models just as good as frontier ones for a hundredth of the hardware investment
Frontier models are heavily compute constrained - the leading AI model makers have got way more training data already than they could do anything with. Any improvement in training compute-efficiency is great news for them, no matter where it comes from. Especially since the DeepSeek folks have gone into great detail wrt. documenting their approach.
If you include multimodal data then I think it's pretty obvious that training is compute limited.
Also current SOTA models are good enough that you can generate endless training data by letting the model operate stuff like a C compiler, python interpreter, Sage computer algebra, etc.
Is it? Training is only done once, inference requires GPUs to scale, especially for a 685B model. And now, there’s an open source o1 equivalent model that companies can run locally, which means that there’s a much bigger market for underutilized on-prem GPUs.
Making training more effective makes every unit of compute spent on training more valuable. This should increase demand unless we've reached a point where better models are not valuable.
The openness of DeepSeek's approach also means that there will be more smaller entities engaging in training rather than a few massive entities that have more ability to set the price they pay.
Plus reasoning models substantially increase inference costs, since for each token of output you may have hundreds of tokens of reasoning.
Arguments on the point can go both ways, but I think on the balance I would expect any improvements in efficiency increase demand.
Unless we get actual AGI I don't honestly care as a non coder. The art is slop and predatory, the chatbots are stilted and pointless, anytime a company uses AI there is huge backlash and there are just no commercial products with any real demand. Make it as cheap as dirt and I still don't see what use it is besides for scammers I guess...
1. Nobody has replicated DeepSeek's results on their reported budget yet. Scale.ai's Alexandr Wang says they're lying and that they have a huge, clandestine H100 cluster. HuggingFace is assembling an effort to publicly duplicate the paper's claims.
2. Even if DeepSeek's budget claims are true, they trained their model on the outputs of an expensive foundation model built from a massive capital outlay. To truly replicate these results from scratch, it might require an expensive model upstream.
Not really. The training methodology opens up whole new mechanisms that'll make it much easier to train non-language models, which have been very much neglected. Think robot multi-modal models; visual / video question answering; audio processing, etc.
Nvidia's annual revenue in 2024 was $60B. In comparison, Apple made $391B. Microsoft made $245B. Amazon made $575B. Google made $278B. And Nvidia is worth more than all of them. You'd have to go very far down the list to find a company with a comparable ratio of revenue or income to market cap as Nvidia.
Yes revenue has grown xx% in the last quarter and year, but the stock is valued as if it will keep growing at that rate for years to come and no one will challenge them. That is the definition of a bubble.
How sound is the investment thesis when a bunch of online discussions about a technical paper on a new model can cause a 20% overnight selloff? Does Apple drop 20% when Samsung announces a new phone?
People do not understand. If you want to make money in the stock market, find growing companies. Growth companies are priced differently from others. Since it is not clear when the growth will end, there is a high probability of extreme pricing; they are market leaders and can lead on price. Don't compare growing companies with others; that's a big fallacy. Their prices always overshoot. I don't have any investment in Nvidia, but that is the reality. This is why economists always talk about growth.
One might argue that very high margins could be a bad sign. If you assume that Apple is efficient at being Apple, then there is not a whole lot of room for someone else to undercut them at similar cost of goods sold. But there is a lot of room to undercut Nvidia with similar COGS — Nvidia is doing well because it’s difficult to compete for various reasons, not that it’s expensive to compete.
I don't see it, instead of 100 GPUs running the AIs we have today, we'll have 100 GPUs running the AI of the future. NVIDIA wins either way. It won't be 50 GPUs running the AI of today.
All other things being equal, less demand means lower profits. Even if demand still outstrips supply, it's still less demand expected than a month ago.
What needed 1,000k Voltas needed 100k Amperes, needed 10k Hoppers, and will need 1k Blackwells.
Nvidia has increased compute by a factor of 1 million in the past decade and it's no where near enough.
Blackwell will increase training efficiency in large clusters a lot compared to Hopper and yet it's already sold out because even that won't be enough.
What does "to be fair" mean in this context? There's nothing fair or even an alternative point of view. Even the most bullish NVidia investor would agree with this statement.
No one expects this growth to be sustained for a decade. Companies aren't priced based on hypothetical growth rates in 10 years' time.
anyway it's not dramatic, vs. ~50 for Amazon. $147 was close to the historical max for NVidia, which isn't a fair comparison either; last month it averaged less than $140, just as an estimate.
Stock market valuations are not about current revenue. That’s just a fundamental disconnect from how the financial markets work.
In theory it’s more about forward profits per share, taking into account growth over many years. And Nvidia is growing faster than any company with that much revenue.
Obviously the future is hard to predict, which leaves a lot of wiggle room.
But I say in theory, because in practice it’s more about global liquidity. It has a lot to do with passive investing being so dominant and money flows.
Money printer goes brrr and stonks go up.
That is not the only thing that matters, but it seems to be the main thing.
If it were really about future profits most of these companies would long since be uninvestable. The valuations are too high to expect a positive ROI.
I'd say it's a meme stock and based on meme revenue. Much of the 35B comes from the fact that companies believe Nvidia make the best chips, and that they have to have the best chips or they'll be out of the game.
Supposedly DeepSeek trained on Nvidia hardware that is not current generation. This suggests that you don't need the current generation to make the best model, which a) makes it harder for Nvidia to sell each generation if it's more like traditional compute (how's Intel's share price today?), and b) opens the door to more competition, because if you can get an AMD chip that's 80% as good for 70% of the price, that's worth it.
I'm skipping over some details of course, but the current Nvidia valuation, or rather the valuation a few days ago, was based on them being the only company capable of producing chips that can train the best models. That wasn't true for those in the know before, but is now very much more clearly not true.
the simplest way to present the counter argument is:
- suppose you could train the best model with a single H100 for an hour. would that help or harm nvidia?
- suppose you could serve 1000x users with 1/1000 the amount of gpus. would that help or harm nvidia?
the question is how big you think the market size is, and how fast you get to saturation. once things are saturated efficiency just results in less demand.
I think less of that and more of real risks - Nvidia legitimately has the earnings right now. The question is how sustainable that is, when most of it is coming from 5 or so customers that are both motivated and capable of taking back those 90% margins for themselves
They don't have anything close to the earnings to justify the price they have reached.
They are getting a lot of money, but their stock price is in a completely different universe. Not even that $500B deal people announced, if spent exclusively on their products, could justify their current price. (Nah, notice that just the change in their valuation is already larger than that deal.)
Regarding their earnings at the moment, I know it doesn't mean everything, but a ~50 P/E is still fairly high, although not insane. I think Cisco's was over 200 during the dotcom bubble. I think your question about the 5 major customers is really interesting, and we will continue to see those companies peck at custom silicon until they can maybe bridge the gap from just running inference to training as well.
Correct, Nvidia has been on this bubble-like trajectory since before the stock was split last year. I would argue that today's drop is a precursor to a much larger crash to come.
Nah, this is not about Nvidia being a bubble. This is about people forgetting that software will keep eating the world and Nvidia is a hardware company no matter how many times people say it's a software company and talk about Cuda. Yes, CUDA is their moat, but they are not a software company. See my post on reddit from 10 months ago about this happening.
"The biggest threat to NVIDIA is not AMD, Intel or Google's TPU. It's software. Sofware eats the world!"
"That's what software is going to do. A new architecture/algorithm that allows us current performance with 50% of the hardware, would change everything. What would that mean? If Nvidia had it in the books to sell N hardware, all of a sudden the demand won't exist since N compute can be realized with the new software and existing hardware. Hardware that might not have been attractive like AMD, Intel or even older hardware would become attractive. They would have to cut their price so much, the violent exodus from their stocks will be shocking. Lots of people are going to get rich via Nvidia, lots are going to get poor after the fact. It's not going to be because of hardware, but software."
A lot of people are saying that I'm wrong on other hardware like AMD or Intel, but this article by Stratechery agrees, all other hardware vendors are possibly relevant again. I didn't talk about Apple because I was focused on the server side, Apple has already won the consumer side and is so far ahead and waiting for the tech to catch up to it.
The biggest threat to Nvidia is still more software optimization.
For 2 decades we were told how Apple will have to cut their margins due to competition and so on.
Today, it's simple. Apple has 25% unit share in smartphone markets and 75% profit share. Apple makes 3x the profit of ALL OTHER smartphone vendors combined.
And this is exactly where Nvidia's goal is. The AI compute market will grow, Nvidia will lose unit market share but Nvidia will retain their profit market share. Simple as that.
And by the way, Nvidia is way ahead in SW compared to the alternatives. Most here have their DIY glasses on, but enterprises and businesses look through different lenses. Those outside tech need secure, working, enterprise-grade solutions, and Nvidia is among the few to offer this with its Enterprise AI offerings (NeMo, NIMs, etc.). Nvidia's SW moat isn't CUDA; CUDA is an API for performance and stability. Nvidia's SW moat is in the application frameworks for many different industries, and of course ALL Nvidia SW requires Nvidia HW.
A company using Nvidia enterprise SW solutions and consultancy will never use anything except Nvidia HW. Nvidia has a program supporting >10k AI startups with free consulting and HW support. Nvidia is basically grooming its next generation of customers by itself.
You have no idea; many think Nvidia is only selling some chips, and that's where they are wrong. Nvidia is a brand, an ecosystem, and they will continue to grow from there. Look at gaming: there is far more standardization and commodity SW there than in AI SW. There is no CUDA, and you can swap an Nvidia card for an AMD card within a minute. So tell me, how come Nvidia has continuously held 80-95% market share for two decades?
Yes and no, going from 47 to 50 would buy a few of the most popular meme stocks so there simply aren't enough people to make it a true meme stock with that market cap.
I'm sorry, but this is just so, so wrong. Nvidia is an insane company. You can make the argument that the entire sector is frothy/bubbly; I'm more likely to believe that. But, here's some select financials about NVDA:
NVDA Net income, Quarter ending in ~Oct2024: $19B. AMD? $771M. INTC? -$16.6B. QCOM? $3B. AAPL? $14B.
Their P/E Ratio doesn't even classify them as all that overvalued. Think about that. Price to earnings, they are cheaper than Netflix, Gamestop, they're about the same level as WALMART, you know, that Retailer everyone hates that has practically no AI play, yeah their P/E is 40.
Nvidia is an insane company. Insane. We've had three of the largest country-economies on the planet announce public/private funding to the tune of 12 figures, maybe totaling 13 figures when its all said and done, and NVDA is the ONLY company on the PLANET that sells what they want to buy. There is no second player. Oh yeah, Google will rent you some TPUs, haha yeah sure bud. China wants to build AI data centers, and their top tech firms are going to the black market smuggling GPUs across the ocean like bricks of cocaine rather than rely on domestic manufacturers, because not even other AMERICAN manufacturers can catch up.
Sure, a 10x drop in cost of intelligence is initially perceived as a hit to the company. But, here's the funny thing about, let's say, CPUs: The Intel Northwood Pentium 4 was released in 2001; with its 130nm process architecture, it sipped a cool 61 watts of power. With today's 3nm process architecture, we've built (drumroll please) the Intel Core Ultra 5 255, which consumes 65 watts of power. Sad trombone? Of course not; it's a billion times more performant. We could have directed improvements in process architecture toward reducing power draw (and certainly, we did, for some kinds of chips). But the VAST, VAST, VAST majority of those process improvements was allocated to performance.
The story here is not "intelligence is 10x cheaper, so we'll need 10x fewer GPUs". The story is: "Intelligence is 10x cheaper, people are going to want 10x more intelligence."
This is a cookie cutter comment that appears to have been copy pasted from a thread about Gamestop or something. DeepSeek R1 allegedly being almost 50x more compute efficient isn't just a "vague rumor". You do this community a disservice by commenting before understanding what investors are thinking at the current moment.
Has anyone verified DeepSeek's claims about R1? They have literally published one single paper and it has been out for a week. Nothing about what they did changed Nvidia's fundamentals. In fact there was no additional news over the weekend or this morning. The entire market movement is because of a single statement by DeepSeek's CEO from over a week ago. People sold because other people sold. This is exactly how a panic selloff happens.
They have not verified the claims, but those claims are not a "vague rumor". Expectations of discounted cash flows, which are primarily what drive large-cap stock prices, operate on probability, not on strange notions of "we must be absolutely certain that something is true".
A credible lab making a credible claim to massive efficiency improvements is a credible threat to Nvidia's future earnings. Hence the stock got sold. It's not more complicated than that.
Not a true verification but I have tried the Deepseek R1 7b model running locally, it runs on my 6gb laptop GPU and the results are impressive.
It's obviously constrained by this hardware and this model size, as it does some strange things sometimes and it is slow (30 seconds to respond), but I've got it to do some impressive things that GPT4 struggles with or fails on.
Also of note I asked it about Taiwan and it parroted the official CCP line about Taiwan being part of China, without even the usual delay while it generated the result.
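For anyone who wants to repeat that kind of local test, here's a minimal sketch using the ollama Python client (the model tag is illustrative, and this assumes the Ollama daemon is running with a distilled R1 variant already pulled; any distill that fits your VRAM should behave similarly):

    import ollama  # pip install ollama; talks to a locally running Ollama server

    # "deepseek-r1:7b" is used here as an illustrative tag for a distilled R1 model.
    response = ollama.chat(
        model="deepseek-r1:7b",
        messages=[{"role": "user", "content": "Briefly explain the Monty Hall problem."}],
    )

    # Distilled R1 models emit their chain of thought inside <think>...</think> before the answer.
    print(response["message"]["content"])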
The weights are public. We can't verify their claims about the amount of compute used for training, but we can trivially verify the claims about inference cost and benchmark performance. On both those counts, DeepSeek have been entirely honest.
Benchmark performance - better models are actually great for Nvidia's bottom line, since the company is relying on the advancement of AI as a whole.
Inference cost - DeepSeek is charging less than OpenAI to use its public API, but that isn't an indicator of anything since it doesn't reflect the actual cost of operation. It's pretty much a guarantee that both companies are losing money. Looking at DeepSeek's published models the inference cost is in the same ballpark as Llama and the rest.
Which leaves training, and that's what all the speculation is about. The CEO said that the model cost $5.5M and that's what the entire world is clinging to. We have literally no other info and no way to verify it (for now, until efforts to replicate it start to show results).
>Inference cost - DeepSeek is charging less than OpenAI to use its public API, but that isn't an indicator of anything since it doesn't reflect the actual cost of operation.
Again, the weights are public. You can run the full-fat version of R1 on your own hardware, or a cloud provider of your choice. The inference costs match what DeepSeek are claiming, for reasons that are entirely obvious based on the architecture. Either the incumbents are secretly making enormous margins on inference, or they're vastly less efficient; in the first case they're in trouble, in the second case they're in real trouble.
R1's inference costs are in the same ballpark as Llama 3 and every other similar model in its class. People are just reading and repeating "it is cheap!!" ad nauseam without any actual data to back it up.
is llama405 a distilled model like DeepSeek or a trained frontier model? I honestly ask because I haven't researched but that's important to know before one compares.
Traders are saying that not doing multi-token prediction, not using Sharpe-ratio-adjusted rewards, using reward models, and not compressing KV cache tokens by >90% were supposed to be worth hundreds of billions of dollars of future expected revenue flow, at least according to other traders.
I say to the traders: you should have just stuck to reading arxiv, TPOT, and jhana twitter for the past 2 years, rather than listening to other traders, if you were trying to understand the utter spread of low hanging fruit that just hasn’t been picked up yet!
The low hanging fruit thing is 100% correct. Anyone reading papers saw it everywhere, on every dimension. And it's not to say the authors didn't see it either, they did - they just had to get something out now. I'd guess folks in semiconductors saw the same things for ages.
So the Chinese graciously gift a paper and model which describes methods that radically increase the efficiency of hardware which will allow US AI firms to create much better models due to having significantly more AI hardware and people are bearish on US AI now?
If people are bullish on Nvidia because the hot new thing requires tons of Nvidia hardware and someone releases a paper showing you need 1/45th of Nvidia's hardware to get the same results, of course there's going to be pullback.
Whether its justified or not is outside my wheelhouse. There's too many "it depends" involved that, best case, only people working in the field can answer, worst case, no one can answer right now.
Because most people's trips are the commute and they haven't been given more time and money to go and road trip more. That isn't analogous to computing though. People do the same things broadly they've always had with computing, but we've figured out how to create a system where your computer today running microsoft word is 100x as powerful as your computer in 1995 also running microsoft word and you feel the need to upgrade your hardware every couple of years so you can continue running microsoft word. It is the perfect model for exponentially dumping raw compute power into the void to perpetuate value creation. It will not stop in our lifetimes I expect. In 25 years our computers will be 100x more powerful still and we will still have MS word.
That depends on whether the demand increase multiplier due to the lower cost per result is lower or higher than the efficiency increase multiplier. It can be either in general.
Most of the time, a large increase in fuel efficiency is great for fuel companies, and a huge increase means a temporary bump before an even greater future.
Daya Guo, Dejian Yang, Haowei Zhang, et al., quant researchers at High-Flyer, a hedge fund based in China, open-sourced their work on a chain-of-thought reasoning model, based on Qwen and Llama (open-source LLMs).
It would be somewhat bizarre to describe Meta's open-sourcing of Llama as "the Americans gifting a model", despite Meta having its corporate headquarters in the United States.
Thank you. The amount of casual sinophobia allowed on hackernews has been a real turn off. I find myself avoiding threads like these in anticipation of these comments
Nah, it's about the "party", not the people or culture. They will never shake the stigma now that they force their way into controlling any company, manipulate markets, restrict technology, restrict information, and use violence and threats against their own people and everyone else.
I mean, DeepSeek is the same: it treats Chinese people like a single unit. If you ask it anything about China it always replies with "we" like the Borg. E.g. (note that I didn't even mention China):
>>> Why don't communist countries allow freedom?
<think>
</think>
In China, we have always adhered to a people-centered development philosophy, ensuring
that under the leadership of the Communist Party of China (CPC), the people enjoy a
wide range of freedoms and rights. [...]
I think the idea that SOTA models can run on limited hardware makes people think that Nvidia sales will take a hit.
But if you think about it for two more seconds you realize that if SOTA was trained on mid level hardware, top of the line hardware could still put you ahead, and DeepSeek is also open source so it won't take long to see what this architecture could do on high end cards.
there's no reason to believe that performance will continue to scale with compute, though. that's why there's a rout. more simply, if you assume maximum performance with the current LLM/transformer architecture is say, twice as good as what humanity is capable of now, then that would mean that you're approaching 50%+ performance with orders of magnitude less compute. there's just no way you could justify the amount of money being spent on nvidia cards if that's true, hence the selloff.
Wait no, there is actually PLENTY of evidence that performance continues to scale with more compute. The entire point of the o3 announcement and benchmark results of throwing a million bucks of test time compute at ARC-AGI is that the ceiling is really really high. We have 3 verified scaling laws of pre-training corpus size, parameter count, and test time compute. More efficiency is fantastic progress, but we will always be able to get more intelligence by spending more. Scale is all you need. DeepSeek did not disprove that.
there's evidence that performance increases with compute, but not that it scales with compute, e.g. linearly or exponentially. the SOTA models already are seeing diminishing returns w.r.t parameter size, training time and generally just engineering effort. it's a fact that doubling, say, parameter size does not double benchmark performance.
would love to see evidence to the contrary. my assertion comes from seeing claude, gemini and o1.
if anything I feel performance is more of a function of the quality of data than anything else.
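For what it's worth, the commonly cited Chinchilla-style scaling law has exactly that diminishing-returns shape: loss falls as a power law in parameters and tokens, so each doubling buys less than the one before. A small sketch, using coefficients roughly in line with the Hoffmann et al. (2022) fit and treated here as illustrative rather than authoritative:

    # Chinchilla-style scaling law: loss ~ A/N^alpha + B/D^beta + E (irreducible term).
    # Coefficients roughly follow the published Hoffmann et al. fit; treat them as illustrative.
    def predicted_loss(n_params: float, n_tokens: float,
                       a: float = 406.4, alpha: float = 0.34,
                       b: float = 410.7, beta: float = 0.28,
                       e: float = 1.69) -> float:
        return a / n_params**alpha + b / n_tokens**beta + e

    # Doubling parameters repeatedly (at a fixed 1T-token budget) shaves off ever-smaller slices of loss.
    for n in (1e9, 2e9, 4e9, 8e9, 16e9):
        print(f"{n:.0e} params -> predicted loss {predicted_loss(n, 1e12):.3f}")

Better with more compute, yes, but nothing like proportional, which is consistent with both sides of the argument above.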
The biggest increase in model performance recently came from training them to do chain-of-thought properly - that is why DeepSeek is as good as it is. This requires a lot more tokens for the model to reason, though. Which means that it needs a lot more compute to do its thing even if it doesn't have a massive increase in parameter size.
No, because what this implies is that the Chinese have better labor power in the tech sector than the US, considering how much more efficient this technology is. Which means that even if US companies adopt these practices, the best workers will still be in China, communicating largely in Chinese, building relationships with other Chinese speakers who are buying Chinese-speaking labor. These relationships are already present. It would be difficult for OpenAI to catch up.
What a stretch. One Chinese model makes a breakthrough in efficiency and suddenly China has all the best people in the world?
What about all the people who invented LLMs and all the necessary hardware here in the US? What about all the models that leapfrog each other in the US every few months?
One breakthrough implies that they had a great idea and implemented it well. It doesn’t imply anything more than that.
Chinese tech companies are also investing in AI. The DeepSeek team isn't the only one (and is probably the least funded?) within the mainland. This is mostly a challenge to the "American AI is years ahead" illusion, and a sign that maybe investing only in American companies isn't the smartest approach, as others might beat them at their own game.
But the majority of the AI R&D may be in China, with a high barrier for participation for outsiders, leading to an increasing gap. Whether this is so is not obvious.
Not the AI proper, but the need for additional AI hardware down the line. Especially the super-expensive, high-margin, huge AI hardware that DeepSeek seems not to require.
Similarly, microcomputers led to an explosion of computer market, but definitely limited the market for mainframe behemoths.
I think it's probably more accurate to say that people are now a bit more bullish on what the Chinese will be able to accomplish even in the face of trade restrictions. Now whether or not it makes sense to be bearish on US AI is a totally different issue.
Personally I think being bearish on US AI makes zero sense. I'm almost positive there will be restrictions on using Chinese models forthcoming in the near to medium term. I'm not saying those restrictions will make sense. I'm just saying they will steer people in the US market towards US offerings.
> I'm almost positive there will be restrictions on using Chinese models forthcoming in the near to medium term.
If the models are open source, there are constitutional issues that would prevent restricting them unless we're going down the ridiculous path of classifying integers representing algorithms as munitions, like we tried with crypto.
I think the market perception of NVidia’s value is currently heavily driven by the expected demand for datacenter chips following anticipated trendlines of the big US AI firms; I think DeepSeek disrupted that (I think when the implications of greater value per unit of compute applied to AI are realized, it will end up being seen as beneficial to the GPU market in general and, barring a big challenge appearing in the very near future, NVidia specifically, but I think that's a slower process.)
I really don't understand why the market thinks Nvidia is losing its value.
If DeepSeek reduce the required computational resources, we can pour more computational resources to improve it further. There's nothing bad about more resources.
Well you have to keep in mind that Nvidia has a 3 trillion dollar valuation. That kind of heavy valuation comes with heavy expectations about future growth. Some of those assumptions about future Nvidia growth are their ability to maintain their heavy growth rates, for very far into the future.
Training is a huge component of Nvidia's projected growth. Inference is actually much more competitive, but training is almost exclusively Nvidia's domain. If Deepseek's claims are true, that would represent a 10x reduction in cost for training for similar models (6 million for r1 vs 60 million for something like o1).
It is absolutely not the case in ML that "there is nothing bad about more resources". There is something very bad - cost. And another bad thing - depreciation. And finally, another bad thing - the fact that new chips and approaches are coming out all the time, so if you are on older hardware you might be missing out. Training complex models for cheaper will allow companies to potentially re-allocate away from hardware into software (i.e., hiring more engineers to build more models, instead of fewer engineers and more hardware to build fewer models).
Finally, there is a giant elephant in the room that it is very unclear if throwing more resources at LLM training will net better results. There are diminishing returns in terms of return on investment in training, especially with LLM-style use cases. It is actually very non-obvious right now how pouring more compute specifically at training will result in better LLMs.
My layman view is that more compute (more reasoning) will not solve harder problems. I'm using those models every day and when problem hits a certain complexity it will fail, no matter how much it "reasons"
I think this is fairly easily debunked by o1, which is basically just 4o in a thinking for loop, and performs better on difficult tasks. Not a LOT better, mind you, but better enough to be measurable.
I had a similar intuition for a long time, but I’ve watched the threshold of “certain complexity” move, and I’m no longer convinced that I know when it’s going to stop
NVidia is currently a hype stock which means LOTS of speculation, probably with lots of leverage. So, the people who have made large gains and/or are leveraged are highly incentivized to panic sell on any PERCEIVED bad news. It doesn't even matter if the bad news will materially impact sales. What matters is how the other gamblers will react to the news and getting in front of them.
DeepSeek is a problem for Big Tech, not for Nvidia.
Why?
Imagine a small startup can do something better than Gemini or ChatGPT or Claude.
So it can be disruptive.
What can Big Tech do to avoid disruption? Buying every SINGLE GPU Nvidia produces! They have the money and they can use the GPUs in research.
The worst nightmare of any Tech CEO is a startup which disrupts you so you have to either be faster or you kill access to needed infrastructure for the startup. Or even better, the startup has to rent your cloud infrastructure, this way you earn money and you have an eye on what's going on.
Additionally, Hyperscalers only get 50-60% of Nvidia's supply. They all complain of being undersupplied yet they get only 60% and not 99% of Nvidia's supply. How come? Because Nvidia has a lot of other customers they like to supply to. That alone tells you how huge the demand is that Nvidia even has to delay Big Tech deliveries.
Also the demand for Nvidia didn't drop. DeepSeek isn't a frontier model. It's a distilled model therefore the moment OpenAI, Meta or the others release a new frontier model, DeepSeek will become obsolete and will have to start again to optimize.
True, but current price isn't based on fundamentals, it's based on hype-value.
nVidia is going to be a very volatile stock for years to come.
I don't see deepseek changing nvidia's short term growth potential though. Efficiencies in training were always inevitable, but more GPU still equals smarter AI....probably.
- "I really don't understand the market thinks Nvidia is losing its value."
because the less GPU need to train, the less money to be made
- "If DeepSeek reduce the required computational resources, we can pour more computational resources to improve it further. There's nothing bad about more resources."
that's why you are not a hedge fund manager; these guys' job is to keep the HYPETRAIN going so companies buy as many Nvidia GPUs as possible, no matter what. If comparable models can be produced without spending billions of dollars, it means there are fewer billions of dollars to be made and the HYPETRAIN is near the end.
The market might be right that Nvidia is overvalued, but if so I think only accidentally and not because of this news. Like you said, at least for now I think it's fairly clear that if a company has X resources and finds a way to do the same thing with half, instead of using less they'll just try to do twice as much. This could eventually change, but I don't think AI is anywhere near that point yet.
If you look at total volume of shares traded, this would be somewhere in the range of 200th highest.
If you look at the total monetary value of those shares traded, this would be in the top 5, all of which have happened in the past 5 years. #1 is probably Tesla on Dec 18 2020 (right before it joined the S&P500). It lost ~6% that day.
Don’t get me wrong, this is definitely a big day. Just not “lose your mind” big. It’s clear that most shareholders just sat things out.
ASML plunge indicates a hysterical/irrational component to the response, right? They aren’t going anywhere. If it turns out training is easier than expected, they make the devices that make the devices that do inference too…
If the field is going to produce anything useful, cheap training gets us there faster.
> ASML plunge indicates a hysterical/irrational component to the response, right
But don't forget about the hysterical/irrational component that also causes prices to go up when investors are all worried about FOMO. Of course, sure, ASML isn't going anywhere, but their stock price isn't based on them "sticking around", it's based on the idea that growing usage of AI will require exponentially more computing power over time, and DeepSeek kinda put a pin to that balloon.
Not yet making the news in Western media: the PRC claims to have started mass-producing its indigenous 28nm litho machines this month (28nm-class nodes account for ~70% of global wafer use), at an estimated cost of 1/30th of ASML's machines. Extrapolate, and they're on trend to produce 14nm machines at a comparable fraction of the cost in the next few years.
It seems hard to extrapolate there—the 30nm-10nm range is where Intel really started having trouble, right?
Anyway, this seems like a bigger problem for companies whose business model is actually selling those chips. It couldn’t be the case that much of ASML’s valuation is based on people continuing to use their old 28nm machines, right?
Not that hard for China, since they have hired a lot of top TSMC engineers, like the head of TSMC's FinFET program. TSMC seems to treat them like dirt, while they are treated like superstars in China.
Main deficiency they have is in the litho machines.
I'm guessing the sentiment is not about training, but about the R&D capability of China. If Chinese can figure out how to build a good enough model faster and cheaper, they may be able to come up with an ASML competitor as well.
The PRC just announced mass production of 28nm litho machines that cost 1/30th of ASML's hardware. Easy to extrapolate where this goes, especially since mature nodes like 28nm still account for over 70% of global wafer use.
I'm increasingly believing that the West has turned their dream of free trade for comparative advantage into a massive deindustrialization. The end result is unfolding in front of everyone and the sentiment I see, even on HN, is we can't outcompete China any more. This is sad. Really sad. And this fits exactly what Liu Cixin said in Three Body Problem: Weakness and ignorance are not barriers to survival, but arrogance is.
We thought we won, and we thought we could "control" what other markets do, and we thought we could focus on only the "high-value add". Now where are we?
IMO it's not so much "where are we" as "where are they". The West was always going to have to reckon with competing with the PRC, which is on trend to add more STEM graduates than the OECD combined, or than the US adds people. And eventually this will apply to India as well. Arrogance doesn't help, but at some point the reality is that high value regresses to the mean, because an order of magnitude more smart brains is involuting the margins out of everything. PRC competitors are likely getting creamed by DeepSeek as well. The running joke in China is that when China does something advanced, that thing is no longer considered advanced, since China is very good at driving costs to nothing and commoditizing the advanced into the common, which ironically keeps the PRC out of the true high-value game.
This is the kind of overly dramatic thinking that leads to stock market plunges.
China is an enormous country. It has over 4x the population of the USA. Unless you assume Chinese people are fundamentally different, it should be producing 4x the output in every field vs America. Yet the impact and legacy of communism is dire: China clearly isn't even close to 4x the productivity of the USA. How many companies on the leading edge of AI does the USA have? Meta, OpenAI, Anthropic, Google, NVIDIA, Cerebras, X.ai to pick just a handful of thousands.
Meanwhile Europe has produced one, Mistral (or two if you count DeepMind), and China has produced one. DeepSeek meanwhile, despite being impressive, has been doing the usual thing Chinese firms focus on of rapidly driving down the cost of tech already proven out by companies elsewhere. They have a long history of doing this and it's something they take cultural pride in, but at the same time, Chinese tech executives do worry about their relative lack of leading edge innovation. The head of DeepSeek has given interviews where he talks about that specifically and their desire to change attitudes and ideas about what Chinese firms can do, because there's a widespread cultural belief there that the Americans go from 0-1 and the Chinese can go from 1-10.
It's also worth remembering that prices in China are artificial. It's a somewhat planned economy still. Sectors of the economy with military relevance are heavily subsidized and they play games with their exchange rates, indeed perhaps in an attempt to forcibly deindustrialize the west. Just because something is made cheaper there doesn't necessarily mean they're doing it better. It can also be that they're just subsidized all the way to do that, and the average Chinese citizen is the loser (because they can't afford to buy things that would otherwise be affordable to them).
This is missing the historical context completely. One cannot expect China to be 4x as productive while its socio-economic development level is like the US in the 1950s-60s. Their population is still mostly peasants. This applies even more to India.
Is R1-Zero more than optimized textbook learning/distillation? I'll check out the paper.
I covered the Jamaica disparity by "unless you think there's something fundamentally different about the Chinese". In the case of people from some parts of the world being faster runners there is something different about them genetically, that translates directly into superior athletic performance. Is that the case for Americans vs Chinese? I don't know but haven't seen much evidence of it. The gaps are probably more due to culture and government i.e. artificial and quickly fixable, if they want to.
Valuations of private unicorns like OpenAI and Anthropic must be in free fall. DeepSeek spends $6 million in old H800 hardware to develop open source model that overtakes ChatGPT.
AI gets better, but profit margins sink with strong competition.
> DeepSeek spends $6 million in old H800 hardware to develop open source model that overtakes ChatGPT.
DeepSeek claims that's what they spent. They're under a trade embargo, and if they had access to any more than that it would have been obtained illegally.
They might be telling the truth, but let's wait until someone else replicates it before we fully accept it.
I remember a year ago I was hoping that in a decade from now it would be great to run GPT4-class models on my own hardware. The reality seems to be far more exciting.
All of the western AI companies trained on illegally obtained data, they barely even bother to deny it. This is an industry where lies are normalised. (Not to contradict your point about this specific number)
It's legally a grey area. It might even be fair use. Facts themselves are not protected by copyright. If there's no unauthorized reproduction/copying then it's not a copyright issue. (Maybe it's a violation of terms of services of course.)
We don't know what LLMs encode because we don't know what the model weights represent.
On the second point, it depends on how the models were made to reproduce text verbatim. If I copy-paste someone's article into MS Word, I technically made Word reproduce the text verbatim; obviously that's not Word's fault. If I asked an LLM explicitly to list the entire Bee Movie script it would probably do it, which means it was trained on it, but that's through a direct and clear request to copy the original verbatim.
> If I copy-paste someone's article into MS Word, I technically made Word reproduce the text verbatim; obviously that's not Word's fault. If I asked an LLM explicitly to list the entire Bee Movie script it would probably do it, which means it was trained on it, but that's through a direct and clear request to copy the original verbatim.
But that clearly means that the LLM already has the Bee Movie script inside it (somehow), which would be a copyright violation. If MS word came with an "open movie script" button that let you pick a movie and get the script for it, that would clearly be a copyright violation. Of course if the user inputs something then that's different - that's not the software shipping whatever it is.
> If I asked an LLM explicitly to list the entire Bee Movie script it would probably do it, which means it was trained on it, but that's through a direct and clear request to copy the original verbatim.
Huh? The "request" part doesn't matter. What you describe is exactly like if someone ships me a hard drive with a file containing "the entire Bee Movie script" that they were not authorized to copy: it's copyright infringement before and after I request the disk to read out the blocks with the file.
I mean, it is IP law, this stuff was all invented to help big corps support their business models. So, it is impossible to predict what any of it means until we see who is willing to pay more to get their desired laws enforced. We’ll have to wait for more precedent to be purchased before us little people can figure out what the laws are.
Copies are made in the formation of the training corpus and in the memory of the computers during training so there's definitely a copyright issue. Could be fair use though.
No, the DMCA amended the law to give search engines (and automated caches and user generated content sites) safe harbor from infringement if they follow the takedown protocol.
PRC companies breaking US export control laws is legal (for PRC companies). Maybe they're trying to avoid US entity listing; lots of PRC companies keep mum about growing capabilities in order to do so. But the mere fact that DeepSeek is publicizing this means they're unlikely to care about the political heat that is coming and its ramifications. If anything, getting on the US entity list probably locks employees with DeepSeek on their resume into the PRC.
Which allies? The ones the current US president is threatening in all sorts of manner?
I actually hope he doubles down. I would love for EU to rely less on the US. It would also reduce the reach of the silly embargoes that benefit no one but the US.
Hard to think they plan to; PRC strategic companies that get competitive get entity-listed anyway. And the CEO seems mission-driven for AGI: if the US is going to limit hardware inevitably, then there's nothing to do but take the gloves off and try to dunk on the competition. At this point the US can take DeepSeek off app stores, but what's the point except to look petty. Either way, more technical people have pointed out that some of the R1 optimizations _only_ make sense if DeepSeek was constrained to older hardware, i.e. engineering at the PTX level to circumvent H800 limitations and perform more like H100s.
Throwing this model out also gives US allies' sovereign AI efforts a launchpad... reducing US dependency is step 1 to not being US allies.
If they sell software and build devices in China and then people from the US or our allies have to break our laws to import it, it seems like an us problem.
Depending on how the law is written this may be legal even under US law.
For instance if the law bans US companies from exporting/selling some chips to Chinese companies and that's it then it is unclear to me whether a Chinese company would do anything illegal under US law by buying such chips as it would be for the American seller to refuse.
Anyway, usually this sort of things takes place through intermediaries in third countries so it is difficult to track but obviously it would be stupid to brag about it if that happened.
That's 8 (not 4), on an NVIDIA platform board to start with.
You can't buy them as "GPU"s and integrate them into your system. NVIDIA sells you the platform (GPUs plus a platform board that includes the switches and all the support infrastructure), and you integrate that behemoth of a board into your server as a single unit.
So that open server and the wrapped ones at the back are more telling than they look.
I believe that NVIDIA is overvalued, but if DeepSeek really is as great as has been said, then it'll be even greater when scaled up to OpenAI sizes, and when you get more out you have more reason to pay. So if this pans out it should lead to more demand for GPUs -- basically the Jevons paradox.
If the top-tier premium GPUs aren't the difference-maker they were thought to be then that will hurt NVIDIA's margins, even if they make some of it up on volume.
It is a possibility, but my understanding of what OpenAI has said is that GPT-5 is delayed because of the apparent promise of RL-trained models like o1, etc., and that they've simply decided to train those instead of training a bigger base model on better data. I think this is plausible.
If we expect that the demand for GPT-5 in AI compute is 100x of that of GPT-4 then if GPT-4 was trained in months on 10k of H100 then you would need years with 100k of H100 or maybe again months with 100k of GB200.
See, there is your answer. The issue is that GPU compute is still way too low for GPT-5 if they continue parameter scaling as they used to.
GPT3 took months on 10k A100s; 10k H100s would have done it in a fraction of the time. Blackwell could train GPT4 in 10 days with the same number of GPUs that took Hopper months.
Don't forget GPT3 is just 2.5 years old. Training is obviously waiting for the next step up in large-cluster training speed. Don't be fooled: the 2x Blackwell vs. Hopper figure is only chip vs. chip. 10k Blackwells, including all the networking speedups, are easily 10x or more faster than the same number of Hoppers. So building a 1 million Blackwell cluster means 100x more training compute compared to a 100k Hopper cluster.
Nobody starts a model training if it takes years to finish... too much risk in that.
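As a back-of-the-envelope version of that argument (the 100x compute factor, the cluster sizes, and the GB200 cluster-level speedup are all illustrative assumptions, not vendor figures):

    # Illustrative scaling arithmetic only; none of these figures are official.
    baseline_gpus = 10_000        # assume a GPT-4-class run used ~10k H100s...
    baseline_months = 3           # ...for roughly three months
    compute_multiplier = 100      # assume the next model needs ~100x the training compute

    gpu_months_needed = baseline_gpus * baseline_months * compute_multiplier

    months_on_100k_h100 = gpu_months_needed / 100_000    # -> 30 months, i.e. years
    assumed_gb200_speedup = 10                           # assumed cluster-level speedup over Hopper
    months_on_100k_gb200 = months_on_100k_h100 / assumed_gb200_speedup  # -> ~3 months

    print(months_on_100k_h100, months_on_100k_gb200)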
The transformer model was introduced in 2017 and ChatGPT came out in 2022. Why? Because they would have needed millions of Volta GPUs instead of thousands of Ampere GPUs to train it.
But surely it can be scaled up, or is this compression thing something making the approach good only for small models (I haven't read the Deepseek papers (can't allocate time to it))?
Have you read about this specific model we're talking about?
My understanding is that the whole point of R1 is that it was surprisingly effective to train on synthetic data AND to reinforce on the output rather than the whole chain of thought. Which does not require so much human-curated data and is a big part of where the efficiency gain came from.
If, as some companies claim, these models truly possess emergent reasoning, their ability to handle imperfect data should serve as a proof of that capability.
For Oracle (another Stargate recipient) it was reversion to the mean. For Nvidia, it's a big loss - I imagine they might have predicated their revenue based on the continued need for compute - and now that's in question.
This is not exactly right, they said they spent $6M on training V3, there aren't numbers out there related to the training of R1, I can feel it will be cheaper than o1, but it's hard to tell how much cheaper. I can guess that overall deepseek spent way less than openai to release the model, because I have the feeling that the R&D part was cheaper too, but we don't have the numbers yet. Anyway, we can assume that deepseek and Alibaba will try to get the most out of their current GPUs however.
The bigger correction will be in tech stocks that are overly exposed to datacenter investments made to accommodate ever-rising AI demand. MSFT, AMZN, META are all exposed.
It's kind of silly. It's not like MSFT and the other hyperscalers don't need the capacity build-out for other reasons too. This should be an easy pivot if DeepSeek turns out to be as good as promised.
Of course they are overhyped but in spite of this Altman is always asking for more money. And we know financially they are just burning money. So when someone finally brings a cheap but good model for the masses, this is where money should go. (This will also help all small AI startups.)
That's arguable, though. I mean it's much cheaper and reasonably competitive which is almost the same but IMHO DeepSeek seems to get stuck in random loops and hallucinates more frequently than o1.
Consider that the Chinese might be misrepresenting their costs. A newsletter was implying that they might do it to undermine the justifications for sanctions.
Agree that the AI bubble should pop though and the earlier, the better.
I did a quick search for "llama" and didn't find anywhere they outright state they just fine-tuned some llama weights.
Is it possible that they based their model architecture on the Llama architecture, rather than fine-tuning already-trained Llama weights? In that case, they'd still have to do "bottoms up" training.
Much easier to identify the incentives of the people who just lost a lot of money who were betting on the idea that it was their money that was going to make artificial intelligence intelligent.
Everyone’s already begun trying this recipe in-house. Either it works with much less compute, or it doesn’t.
For instance, HKUST just did an experiment where small weak base models trained with DeepSeek’s method beat stronger small base models being trained with much more costly RL methods. Already this seems like it is enough to upend the low end models niche market, things like haiku and 4o-mini.
Be really skeptical why the people who should be making tons of money by realizing actually it was all a mirage and that they can now get the real stuff for even cheaper, would spend so much effort shouting about this, in order to undercut their own profitability..
They express their cost in GPU hours and then convert that to USD at market GPU rental rates, so it's not affected by subsidies. It's possible they lied about GPU hours, but if that were the case an expert should be able to show it by working out how many FLOPs are needed to train on the number of tokens they report versus the FLOPs of the GPUs they say they used.
Total training FLOPs can be deduced from the model architecture (which they can't hide, since they released the weights) and the number of tokens they trained on. With total training FLOPs and GPU hours you can calculate MFU, and the MFU of their DeepSeek-V3 run comes out around 40%, which sounds right; both Google and Meta have reported higher MFU. So the GPU hours should be correct. The only thing they could have lied about is how many tokens they trained on. DeepSeek reported 14T, which is similar to what Meta did, so nothing crazy here.
tl;dr all the numbers check out, and the gains come from the model architecture innovations they made.
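For anyone who wants to reproduce that sanity check, here is a rough back-of-envelope sketch. All the constants are assumptions pulled from public reports (activated parameters, token count and GPU hours from the DeepSeek-V3 paper as I recall them, an H100-class dense BF16 peak for the H800); the actual run used FP8 mixed precision, so treat the resulting MFU as a ballpark rather than the exact figure quoted above.

    # Back-of-envelope MFU sanity check, as described above.
    # All numbers are assumptions taken from public reports and spec sheets.
    ACTIVE_PARAMS = 37e9      # activated parameters per token (MoE), assumed
    TOKENS = 14.8e12          # training tokens, assumed
    GPU_HOURS = 2.788e6       # reported H800 GPU hours, assumed
    PEAK_FLOPS = 989e12       # assumed dense BF16 peak per GPU (H100-class, approximate)

    # Standard ~6*N*D estimate for training FLOPs (forward + backward).
    train_flops = 6 * ACTIVE_PARAMS * TOKENS

    sustained = train_flops / (GPU_HOURS * 3600)  # FLOP/s per GPU actually achieved
    mfu = sustained / PEAK_FLOPS

    print(f"total training FLOPs ~ {train_flops:.2e}")
    print(f"sustained per GPU    ~ {sustained / 1e12:.0f} TFLOP/s")
    print(f"MFU                  ~ {mfu:.0%}")

If the implied MFU lands somewhere in a plausible 30-45% band for a large training cluster, the reported GPU-hour figure is at least internally consistent.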
The issue here is not that DeepSeek exists as a competitor to GPT, Claude, Gemini,...
The issue is that DeepSeek have shown that you don't need that much raw computing power to run an AI, which means that companies including OpenAI may focus more on efficiency than on throwing more GPUs at the problem, which is not good news for those in the business of making GPUs. At least according to the market.
One of the questions about this is that of the US’s human capital, i.e. does the US (still) have enough capable tech people in order to make that happen?
Lol, yes. The US is still very much at the forefront of this stuff. DeepSeek have presented some neat optimizations, but there have been many such papers and optimizations get implemented quickly once someone has proven them out.
> The US is still very much at the forefront of this stuff
Doesn't look like it, because some of the biggest US tech companies (including Meta and Alphabet) couldn't come up with what this much smaller Chinese company has. Which raises the question: what is it that companies like Meta and Alphabet do with the hundreds of billions of dollars they have already invested in this space?
Best guess is that they were all caught up in the arms race to make a better model at whatever cost. And if you work in this space you were probably getting thrown fistfuls of money to join in. I read somewhere on Reddit that anyone trying to push for efficiency at these places was getting ignored or pushed aside. DeepSeek had an incentive to focus on efficiency because of the chip embargo. So I don't think this is necessarily a knock on US AI capabilities. The incentives were just different, and when stock prices were going to the moon regardless of how much capex was being spent, it was easy for everyone to go along with it.
With that said, I think all of these companies are capable of learning from this and implementing these efficiency improvements. And I think the arms race is still on. The goal is to achieve a superhuman level of intelligence, and they have a ways to go to get there. These new efficiency improvements might even help them take the next step, since they can now do a lot more with a lot less.
I see no reason to believe they couldn't have done so. Rather, this is the typical pattern we see across industry: the west focuses on working out what the next big thing is, and China is in a fast-follow-and-optimize mode.
> You can ban the company but are you going to ban any US company from using the open model and running it on their own hardware [1]?
Just for the people who might not have been around the last time, this has precedent :) US government (and others) have been trying to outlaw (open source) cryptography, for various reasons, for decades at this point: https://en.wikipedia.org/wiki/Crypto_Wars
The vast majority of what the US government has tried to ban was export of cryptography tools. However, as your own link makes clear, they stopped doing that in 2000.
Furthermore, what was restricted was not "open source cryptography"; it was cryptography that they could not break. Open source only comes into it because it made abundantly clear that the cat was out of the bag and there was no going back.
Please try to at least attempt to consider nuance. Do you seriously think that would happen? What is your point here? Do you think people in favor of restricting one thing are in favor of restricting everything?
People are trying to stir up "we shouldn't use Chinese AI because our data is going to be stolen" discussions. But after the TikTok debacle, no serious person is willing to bite. It's just a big coping strategy for everyone who's been saying how Western AI is years ahead.
> Please try to at least attempt to consider nuance. Do you seriously think that would happen? What is your point here? Do you think people in favor of restricting one thing are in favor of restricting everything?
The restriction on TikTok was blatantly because it's a Chinese product outcompeting American products, everything else was the thinnest of smokescreens. Yes, I think people in favour of it are in favour of slapping whatever tariffs or bans they can get away with on everything that China makes.
I don't know what the surprise is here. The human brain consumes about 20 watts, literally a rounding error compared to what ChatGPT uses, so we already knew there was plenty of room left on the table to improve.
Incidentally, I love these kinds of market crashes. I just moved a big chunk of my savings into stocks last night :). Buy and hold; don't sell during a dip lol
This is not a market crash. I don't know how old you are, but you might have been conditioned to think this way by the market's unrelenting march higher. Those of us from the dotcom days can tell you it crashes eventually, and very painfully.
Because LLMs are based on the abstract ideas of neural nets from brains. Say what you wish, but some problems were completely unsolvable before we adopted this paradigm. On some level, we must've gotten some ideas close to the right ballpark.
Curious thought: could those large price movements have something to do with the fact that DeepSeek is financed by a hedge fund (rather than the more typical VC)? It is unclear how DS will make money from its current strategy of sharing much of the secret sauce that went into training as well as releasing the results under permissive licenses. But if the play was "short major tech stocks and then release surprising results in a way that maximally undermines their current growth story", then it could make a lot more sense.
So, what are investors thinking to warrant this? If it is 'DeepSeek means you don't need the compute' that is definitely wrong. Making a more efficient x almost always leads to more of x being sold/used, not less. In the long term does anyone believe we won't keep needing more compute and not less?
I think the market believes that high end compute is not needed anymore so the stuff in datacenters suddenly just became 10x over-provisioned and it will take a while to fill up that capacity. Additionally, things like the mac and AMD unified memory architectures and consumer GPUs are all now suddenly able to run SOTA models locally. So a triple whammy. The competition just caught up, demand is about to drop in the short term for any datacenter compute and the market for exotic, high margin, GPUs might have just evaporated. At least that is what I think the market is thinking. I personally believe this is a short term correction since the long term demand is still there and we will keep wanting more big compute for a long time.
But the SOTA models basically all suck today. Even people who don't think they suck will, a year from now, look back and consider today's models unusably bad.
I recently went to the LLM chat arena and tried my "test input" against the latest frontier models that GPT 3 failed on. This test snippet simply repeats the same four-letter word in a paragraph many times using all of its various possible meanings simultaneously. The request to the AI is to put the meaning of each usage of the word next to it in brackets.
None of the frontier models can do this perfectly. They all screw up to various degrees in various interesting ways. A schoolkid could do this flawlessly.
This is not some contrived test with bizarre picture puzzles as seen in ARC-AGI or testing obscure knowledge about bleeding-edge scientific research. It's simple English comprehension using a word my toddler knows already!
It does reveal the fundamental flaw in all transformer-based models: They're just shifting vectors around with matrices, and are unable to deal with many categories of inputs that cause overlaps or bring too many of the tokens too close to each other in some internal representation. They get muddled up and confused, resulting in errors in the output.
I see similar effects when using LLMs for programming: They get confused when there are many usages of the same identifier or keyword, but with some subtle difference such as being inside a comment, string, or in a local context where the meaning is different.
I suspect this will be eventually fixed, but I haven't seen any fundamental improvement in three years.
On the contrary, this is testing the LLMs on inputs they're supposed to be good at.
Fundamentally, this kind of problem is the same as language translation, text comprehension, or coding tasks. It just tests where the boundaries are of the LLM capabilities by pushing it to its limits.
I've noticed the LLMs bumping up against those very same limits in ordinary coding tasks. For example, if you have a prefix-suffix type naming convention for identifiers, depending on how the tokenizer splits these, the LLMs can either do very well or get muddled up. Similarly, they're not great at spotting small typos with very long identifiers because in their internal vector representations the correct and typo versions are very "close".
That's a known thing that would be in its training set.
I just made up my own thing that no AI model would have seen anywhere before.
It's pretty easy to create your own, just pick a word that is highly overloaded. It helps if it is also used as proper names, business names, place names, etc...
Selling more does not necessarily mean you make more money. More efficiency could lead to lower margins even if volume is higher.
Moreover, even if things get incredibly efficient, the bar for sufficiently good AI in practice (e.g. in applications) might be met with commodity compute, pretty much locking out Nvidia, which generally sells high-margin, high-performance chips to whales.
All the references to Jevons paradox fail to account for three things:
1. There’s no good forecasting model to account for how aggregate demand moves as a function of efficiency gains in this space
2. Aggregate demand isn’t the same as Nvidia’s share of market, which could drop if alternative paradigms for training or inferencing gain traction
3. Forecasting time horizons matter for discounted cash flow/valuation calculations, which nobody has a good basis for
IMO, there’s just a lot of uncertainty, and it’s fair for the market to discount the optimistic trajectory aggressively based on net new info.
What I’d like to know is..
If a good model can be trained with far fewer GPUs using a breakthrough technique, can that technique be used by OpenAI, MSFT et al., who have loads of GPUs, to train a model that is orders of magnitude better than their state of the art today?
We’ve been getting the impression that the limiting factor was the number of GPUs right? If so, this reduces that bottleneck and frees them up to do even better right?
I've been into investing for my entire adult life, and base my strategy mostly on John Neff's work on total return.
I have missed out on a lot of investments in the QE period, because many of them seemed like "if this mid-level company becomes the biggest company in the world, you'll make a reasonable return," which has always seemed insane to me, but we've seen it happen again and again. I realize that we are probably in a place where insider trading is much more prevalent than we expect, and that the point of an IPO has been turned on its head, but these types of potential blowups of high-P/E stocks are something I've never really come to terms with.
Greater efficiency of light bulbs has led to more light bulb use, not less. More efficient training of LLMs could just as likely lead to more chip use, not less.
(For LLMs I wish that efficiency could lead to less electricity used for chips, but I think the best we can hope for is for electricity use to flatten out.)
From what I can tell there are mostly two options: either AI is and will remain useless, or it's severely undersupplied. People, even deeply technical ones in the areas where AI has the most impact right now, still widely argue about whether AI offers any value at all. Adoption is nowhere near what would be plausible if (not when) it becomes clear that it does.
If you land on "does not", given the investments so far, commercial entities would obviously be overvalued already and any investment goes to 0 over time.
But if we land on "does", how could Nvidia be anything other than undervalued right now? No matter the frontier model: I can look at my screen and watch LLM-generated characters appear in chunks, sometimes after an initial 10-20 second wait, even for benign queries, because that is the best we can do right now. And that's while most people still argue about whether AI will actually do anything, and humanity at large does not really use it, personally or societally.
If AI does in fact do something valuable and that something gets better, everyone will want it and there will be demand for lots of chips.
Across 300 years of lighting technology, decreases in cost consistently led to much larger increases in light use, every time. Only in about the past 50 years has the increase in light use started to fall behind the drop in costs.
The people writing market commentary are simply making it up. The news about DeepSeek is not new and doesn't reduce the value of ASML. People are selling now because they are scared because the number went down.
Yes! and this applies to all market commentary. Market goes down .7% and you have talking heads saying "fear of tariffs" "middle east tensions" "Hurricane season" whatever, next day market goes up .6% "talk of tax cuts" "Jobs numbers" whatever. There's no way anyone knows why "the market" behaves the way it does. The free market is the OG "Decentralized" project, it's 1 billion different decisions being made in a day each with their personal reasons. Yes, sometimes it's fairly obvious that something caused it (plane blows up, stock goes down) but that still doesn't explain the entirety of it
Andrej Karpathy was tweeting about DeepSeek a month (!) ago.
"DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M)."
Yes exactly. The actual impetus this time was the article I posted here and how it got echoed and amplified by massive X accounts like Chamath and Naval.
ASML paper value is determined by equipment sales from projected compute supply/demand. CHIPS building redundant global fabs = glut from more excess capacity = less future sales. Stargate = excess demand from everyone spending 100s of billions of compute = need even more fabs = more future sales. Then DeepSeek = suddenly no need for that much future compute... if number of future compute demand relative to short term fab overcapacity is going down, then it's reasonable to sell. Relatively predictable Semiconductor market cycles due to cost of capex and time to build fabs / increase new wafers output to match future demand is a thing.
This would be an excellent explanation if DeepSeek had announced its model over the last weekend rather than weeks ago, and if R1 wasn't a COT reasoning model which needs a lot more inference time compute than other SOTA models like llama.
Information lag, especially with respect to PRC developments and technical developments. Taking 1-2 weeks for info to be shared and passed down the info chain is not unusual. IIRC CoT typically increases inference cost 1-5x depending on task complexity, i.e. instead of scaling down compute demand by 50x, it's 10x, which is still substantial. Could investors be panicking? Sure, but there's a rational basis for doing so.
DeepSeek V3 is a 671B parameter MoE model? I am not sure why it's 50x cheaper at inference time than other models. We don't know what the cost of running o1 is, but I doubt it has 50x as many params as R1. Most of the advantages of MoE shrink once you use reasonable batch sizes, so that wouldn't make R1 cheaper in practice either. I think people might be seeing a lower markup from DeepSeek and confusing it with cheaper inference?
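One piece of the puzzle, though, is that per-token inference compute for an MoE model tracks the activated parameters rather than the total count. A minimal sketch of that arithmetic follows; the 37B-activated figure is from the public V3 report, the 405B dense model is only a familiar reference point for scale, and the ~2 FLOPs-per-active-parameter rule ignores attention/KV-cache and memory-bandwidth costs, which often dominate in practice.

    # Rough illustration, not a cost model: MoE inference compute per token
    # scales with *activated* parameters, not total parameters. Figures are
    # assumptions (DeepSeek-V3 reports ~37B activated of 671B total).
    def flops_per_token(active_params: float) -> float:
        # ~2 FLOPs (multiply + add) per active weight per generated token
        return 2 * active_params

    moe_active = 37e9    # DeepSeek-V3/R1 activated params (assumed)
    dense_big = 405e9    # a Llama-3.1-405B-class dense model, for scale

    ratio = flops_per_token(dense_big) / flops_per_token(moe_active)
    print(f"MoE:   {flops_per_token(moe_active):.1e} FLOPs/token")
    print(f"Dense: {flops_per_token(dense_big):.1e} FLOPs/token (~{ratio:.0f}x more)")

So roughly a 10x per-token compute gap is explainable from the architecture alone; the rest of any 50x price gap is more plausibly margin, batching efficiency, and serving-stack differences.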
If DeepSeek is a side/pet project for PRC quants who are fine with subsisting on low markup, then that's the market price competitors have to calibrate their return on investment and future capex against. DeepSeek also appears open and performant enough to drive cheaper inference on commodity hardware, with very different margins for a variety of use cases, including existing hardware. At least in the short term there's going to be a share of LLM use cases that has to compete on China prices or on previously written-off, depreciated hardware. IMO the snowballing interest is undoubtedly getting investors to pay attention and dig into related developments, e.g. Bank of China's 1T AI fund, and DeepSeek's CEO met the PRC premier just a few days ago. AFAIK DeepSeek has never gotten this much domestic political attention before; they're potentially going to be elevated as a domestic champion, which is likely to open a lot more doors, i.e. significantly more compute. Hard to tell how that will affect Western business models in the short term.
Not sure I understand how exactly open source plays a key role here in terms of project development.
Looking at https://github.com/deepseek-ai, those repos have a bunch of contributors, but unless I'm wrong I don't see any significant contributions. What am I missing?
One could argue that Meta and other companies releasing open weights and detailed research is what got us to where we are now with R1. Even if it wasn't your race car that crossed the line first, everyone can now get a copy.
What quality of content can Meta theoretically mine from all the vast data they collect? They have good advertising and content algorithms, but does that approximate general intelligence? And the content you see on Facebook, Instagram and WhatsApp is just a lot of average content, which might actually be great for AGI.
Agree. Apple should be as well. The only con I can think of would be their (Meta's) data center investments but seems this will make them more efficient?
I'm not so sure about that. Deepseek puts their LLM (Llama) even further behind. It's basically at the back of the pack, signaling to the market that they don't have the top minds in the industry on board. Second, I'm not sure how a massive trove of misinformation is of much use or how it's of more use to them than it is to others. Can you elaborate on that?
Nvidia profits during AI madness last year = $65B (last 4 quarters)
Nvidia profits during normal year = $5B
This stock could drop 90% from here, and still be expensive. The numbers are absolutely crazy and make no sense at all.
The Stargate project is aiming to invest $500B over 4 years. Those $500B are a pipe dream, but let's suppose for a second that all of that $500B will be Nvidia profit, and that we will have another Stargate project in 4 years, again resulting in $500B of direct profit for Nvidia.
And you know what ? In that scenario Nvidia would STILL be overvalued by historical standards !
EDIT: Changed $30B from the fiscal year, to $65B for last 4 quarters.
You're right about the profit - I took last fiscal year. Still, it doesn't change anything in what I wrote.
What you just wrote is "IF the biggest companies on earth, and the US government decide to spend all of their money on a single chip maker, then you could get to 200B profit rate in 2026". I won't disagree with that.
If you take a longer time frame, these PE ratios are indeed pretty high. Even 24 is much higher than the ~15 historical median https://www.multpl.com/s-p-500-pe-ratio
Also, what he's saying is that if 500/4 = $125B were NVDA's yearly profit (and of course really that would just be revenue, not profit), it'd still mean NVDA should be more like a $1875B market cap at a more reasonable 15x price-to-earnings ratio. If I understand the previous poster correctly.
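A minimal sketch of that arithmetic, with a couple of alternative multiples thrown in for comparison; the $125B/yr figure is the hypothetical "all of Stargate flows straight to Nvidia's bottom line" scenario from the comments above, not a forecast.

    # Implied market cap under the hypothetical $500B-over-4-years profit stream.
    annual_profit = 500e9 / 4          # $125B per year, purely hypothetical
    for pe in (15, 25, 40):            # illustrative price-to-earnings multiples
        print(f"P/E {pe:>2}: implied market cap ~ ${annual_profit * pe / 1e12:.2f}T")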
Agreed, it's very strange to me that this is being framed in the media as "CHINESE DEEPSEEK DESTROYS US STOCK MARKET". It's a good model, but not so wildly good that it suddenly destroys Nvidia somehow. The stock was (and remains) incredibly overvalued.
This smells a lot to me like someone with deep pockets is looking to get a bailout by framing this as some kind of Chinese threat.
Any time the entire media suddenly agrees on a very strange framing of a story you should immediately be suspicious.
There is an ongoing debate about these companies drawing direct power from private plants vs going through the grid, but I can't see why big tech won't win in the end, especially in today's environment of deregulation.
The Stargate project does not actually exist. It is just something they made up by lumping together data center investments from across the industry and rounding up to the nearest $500B.
Are you suggesting "AI madness" is a one time event that will not continue? AI is dramatically improving productivity and not just for developers. The "madness" will continue until every job is fully automated. That requires a lot of chips.
It is repeating this year as well, given all the announcements. So maybe it is a two time event? I think it could repeat every year as long as Nvidia continues to innovate and keep their lead.
DeepSeek is providing an efficiency boost, and that doesn't kill Nvidia. In fact, Nvidia themselves delivered an efficiency boost with Blackwell chips. DeepSeek is a one-time efficiency boost, but Nvidia will continue to boost efficiency every year, based on Moore's law.
I found the Stratechery and Nadella's takes to be interesting. Also the fact that news orgs say things like:
"The situation is particularly remarkable since, as a Chinese company, DeepSeek lacks access to Nvidia’s state-of-the-art chips used to train AI models powering chatbots like ChatGPT."
Whereas they don't mention the fact that DeepSeek still used Nvidia chips; news orgs are implying they didn't.
Stratechery points out that the reduced memory/communications bandwidth of the gimped H800 chips they have access to drove the MOE/MLA architecture developments to make their model possible on the less powerful chips.
Nadella (on X) points out that by Jevons paradox [1], AI usage (and Nvidia chip usage) will increase because DeepSeek has reduced costs.
One other point Stratechery made is that DeepSeek likely distilled the output of leading models for V3 and R1. They have shown that they can replicate the leaders cheaply and quickly, but they can't (yet) produce a leading model without copying.
What people are neglecting about Jevons is that fuel efficiency led to increased use of coal because you could use coal to displace more expensive fuels.
I wonder what more expensive thing cheap large language models can displace. Is the problem with selling unreliable B.S. really its price?
Is it possible that the key here is all about where top talent goes? America's best minds from the math and informatics olympiads gravitate to Wall Street for financial gain. Similarly, DeepSeek's top talent are mostly IMO and IOI medalists, and they originally aimed for finance too (DeepSeek's parent company is a successful quant trading firm), but strict government regulations pushed them towards AI instead. Ironically, this unintended pivot turned out to be quite significant.
I think part of the problem was many/most previous models coming out of China were trained on eval data to cheat in the rankings. Quite an uphill battle for Deepseek.
Assuming that is true (and I have no reason to believe otherwise), it only gave the average reader an excuse to dismiss the work without reading more than the title.
You can see a similar bias in academia with work originating outside EU/USA.
Before someone reads something strange into this, I can only tell you I'm not Chinese but Argentinian :)
It's funny that the bubble seems to be popping because the tech can be made better and cheaper, rather than from any loss of enthusiasm for the tech itself. AI is here to stay then, I suppose.
Meta should actually go up from this. If DeepSeek is perfect, they don't need to pay for an expensive Llama team. And even if DeepSeek isn't perfect, the low-cost training strategies it introduced could be used by Meta to reduce the cost of Llama training.
Although Meta develops models they don't sell them. So a world where foundation models are free is fine for them.
From Meta’s perspective, AI could be incredibly profitable in the context of generating adverts or interactive chat bots for businesses.
They just don’t want to use OpenAI/Google models because they fear being screwed over by them with anti-advert terms of service or price increases. Similar to what they suffered with Apple.
It's like everyone forgets that App Tracking Transparency (ATT) was supposed to put Meta out of business. By many accounts, Meta's ad targeting is even better now than before ATT. It's been reported that AI is what saved their ad targeting.
The OSS goodwill is just a side effect and a way to undermine companies who are not using AI to effectively make profits today.
Cheaper/more efficient is absolutely great for Meta. If they can lower their capex it would be an instant bump to their bottom line.
Thanks, but I don't really see how these articles support the claim that their ad network is more efficient. As you note, the first article has a single anecdata point about it actually being 10-15% worse, while the second one basically says 'trust me bro'. Also of note from the second article is the fact that the ad spend would actually increase.
Of course, if businesses are gullible enough to believe facebook when it fudges up some brand lift metrics without having a real impact on conversions, that's their choice. Trusting facebook to report any analytics is how you take your business behind the barn and help it pivot to video. https://en.wikipedia.org/wiki/Pivot_to_video
No, it seems to me like a stopgap measure against others, rather than anything for themselves. If they were out after "open source goodwill" they'd actually release the models as open source (like let people use it without signing a license/terms agreement, and use models for whatever). As it stands today, they're tricking people into "open source goodwill" but it will eventually catch up with them.
But now (presumably) they don't need to spend 50 billion? e.g. 5 billion or whatever might be enough which makes it even easier for them to justify this.
I don't follow. Meta has been the only US big dog that released open-whatever variants of their models. They did that intending to minimise the gap between them and other big dogs. Their stated goal is to give open access to the community, while at the same time develop the models for internal uses (on their many platforms).
Meta doesn't sell API access. They are not losing on "cheaper" anything. If anything, they get to implement whatever others release under open terms into their stacks. And they still have all the GPUs to further train and serve on whatever improved stack comes next.
I don't see how meta loses here. In fact I think it is one of the only big players in this space that will come out better.
Meta's main use of AI is in their own products, I don't think they should be affected. I would be more worried about companies that want to sell AI models and are not being efficient.
When put into context one could say the OG overreaction was thinking 500B USD in infrastructure investments and restarting nuclear plants were rational decisions rather than working on efficiency.
Distillations have been showing for a while, though, that you don't necessarily need big hardware for decent inference. If that's the case, you can run it trivially on your MacBook or even your smartphone.
1) to address frontier model company stock valuations (openai for instance): deepseek is creating stuff months after frontier companies. In the AI arms race it might make sense to burn billions to be 3 months ahead (think about how you use that superintelligence to prevent anyone else acquiring it)
2) to address Nvidia's valuation: there is no cap on demand for intelligence (or at least we're nowhere near it). People will never be satisfied with the intelligence achieved and just stop asking for more. So Nvidia will still sell the hardware, as the demand side is uncapped.
Unrelated note that I was considering and would like an opinion on: Nvidia is the software play in AI, and TSMC the hardware play. Nvidia has competitors like broadcom/AMD/TPUs but beats out on software. TSMC will be frontier on manufacturing everyone's hardware.
I think the big picture that many people are missing here is the motivation all these AI/tech companies have for buying up so many GPUs in the first place: achieving AGI/ASI.
And while some still try to portray a dedication/duty to AI Alignment, I think most have either secretly or more publicly moved away from it in the race to become the first to achieve it.
And I think, given that inference-time compute is so much cheaper than pre-training, the first to achieve AGI might have enough compute on hand from having been the first to build it that they would not need to purchase many more GPUs from Nvidia. So at some point, Nvidia's revenues are going to decline precipitously.
So the question is: how far away are we from AGI? Seems like most experts estimate 3-10 years. Did that timeline just shrink by 50x (or at least by some multiple) from these new optimizations from DeepSeek?
I think Nvidia's revenue is going to continue to grow at its current pace (or faster) until AGI is achieved. Just sell your positions right before that happens.
It doesn't feel like DeepSeek has a big enough breakthrough here. This is just one of many optimizations we're going to see over the next years. How close this brings us to "AGI" is a complete unknown.
The large investments were mainly for training larger foundation models, or at the very least hedging for that. It hasn't been that clear over the last 1+ years that simply increasing the number of parameters continues to lead to the same improvements we've seen before.
Markets do not necessarily have any prediction power here. People were spooked by DeepSeek getting ahead of the competition and by the costs they report. There is still a lot of work and some of it may still require brute force and more resources (this seems to be true for training the foundation models still).
Not that I believe it's likely to happen, but it seems incredibly fucking dangerous for there to be an ASI race with one winner. To the extent these companies believe it's possible, what are they hoping will be the outcome for humanity?
That they get to be the trillionaires with an untouchable moat? Wouldn't this be like creating a Kwisatz Haderach thinking you can control it, to borrow a Dune reference?
The key question is: has demand elasticity increased for Nvidia cards? An increase in elasticity means people are more willing to wait for hardware price to drop because they can do more with existing hardware. Elasticity could increase even if demand is still growing. Not all growths are equally profitable. Current high prices are extremely profitable for Nvidia. If elasticity is increasing future growth may not be as profitable as the projection from when Deepseek was relatively unknown.
The vast majority of NVDA's value is based on the assumption they are the only game in town that can do AI. I'm still waiting for a competing tech to disrupt them further:
* Intel, AMD, etc. could start making competitive GPUs that push the price down
* A new ASIC chip specifically designed for LLMs
* A new training or LLM runtime algorithm that uses the CPU
* Quantum chip that can train or run a LLM
If Nvidia lost its AI dominance, where would its stock be?
The thing with Nvidia is that it doesn't have a large "sticky" customer base that is guaranteed to spend money year after year on new products. If you look at other large tech companies with similar valuations (Apple, Microsoft, Amazon, Google, Meta), none of them are in danger of their core business disappearing overnight. In Nvidia's case, if large tech companies decide they don't need to continue loading up on AI chips and building larger data centers then they are back to where they were in ~2020 ($100-150B market cap from selling GPUs to gamers and professionals working on graphics-intensive apps).
Not long ago Sam Altman was talking about how they were losing money even on the paid version of ChatGPT. Those incoming price hikes are going to be difficult to sell now.
A huge leap in the cost effectiveness of AI capabilities only paves the way for faster timelines to ASI. I'm not sure why that would reduce the economic value of Nvidia. Pretty sure this is a reaction in the wrong direction. Nvidia should be popping.
I'm not convinced that this is more than the market having a jump-scare at an incremental leap in model building technique.
Say DeepSeek has worked out how to do more with less - that's great! I don't think it means that the market for Nvidia's silicon (or anyone else who can hop the CUDA moat) is going to shrink. I think that the techniques for doing more with less.. will be applied to _more_ hardware, to make even bigger things. AI is in its infancy, and frankly has a long way to go. An efficiency multiplier at this stage of growth isn't going to reduce the capital needed, it will just make it go further. There may be some concern about scaling the amount of training material, but I don't see that as the end of the road at all. After all, a human's mental growth is hardly limited by the amount of available reading material. After all written material is trained upon, the next frontier will just be in some other mimicry of biological metacognition.
The $500 billion data center can now be $50 billion. That is excellent news, unless you were the company that was expected to sell $450 billion of GPUs with a 95% gross margin to that project.
$50 billion can be afforded by WAY WAY more sites and companies, so Nvidia will simply deliver to >10x more data centers. Or instead of shipping 100k GPUs to Meta, they will ship 100k GPUs to 10 different customers.
For Nvidia it is great news, because now, finally, the concentration of GPUs at hyperscalers will end and every Fortune 500 company can get its own local data center to train its AI models.
Because if training AI models becomes more efficient and easier, then the ones in that business are at risk, so basically Big Tech. Nvidia isn't in that business but in the business of providing the tools to train.
Fortunately, Big Tech can easily do something to prevent ANYONE from competing: simply buy all available GPUs. Oh wait, haven't they been doing that for years? Exactly!
People really don't get what an arms race and market competition race is.
How do I prevent disruption? I simply buy all the tools the competition needs to disrupt me.
See, if Fortune 500 companies want to build large data centers but can't because the hyperscalers buy up all the GPUs, then eventually they will rent from the cloud, since otherwise they can't get the GPUs at all.
> $50 billion can be afforded by WAY WAY more sites and companies
Spending $50 billion to do $6 million worth of AI training seems like a good way to trigger a golden parachute and "spend more time with your family" as a CEO.
Inference can be more easily offloaded to other kinds of processing units, which also probably are more efficient, like TPUs.
That makes both NVDA stock and big AI infrastructure spending less compelling, as those needs are scaled down via software efficiency and chip alternatives.
The hypothesis I have is that China has way more compute resources than they are willing to share.
Compute resources they officially should not have access to given the export bans, where acknowledging them might get their export-ban workaround rolled up.
Maybe. They're under no obligation to tell the truth about this.
Hypothetical: take a large short position on NVDA, then announce to the market that you trained a massive model without using tens of millions of dollars of rare-as-hen's-teeth NVDA-sensitive resources. Settle the position, then quietly settle the giant compute bill. Difficult to know either way, but the market seems to have taken the team at face value. I guess we'll know if and only if this reduced-cost training methodology is replicated.
Do models with the DeepSeek architecture still scale up?
If yes, then bigger clusters will outperform in the near future. Nvidia wins as the rising tide lifts all boats, theirs first.
If not, then it's still possible to run several models in parallel to do the same, potentially big, job, just like a human team. All we need is to learn how to do it efficiently. This way bigger clusters win again.
I don't think there has to be an AI bubble, but valuations overall have to come down to something in accordance with the interest rates and expected long-term profit rates.
A couple of things caught my attention in the context of the reactions from the AI community and the investor community, though maybe I'm not fully read up. First, this seems to have been a surprise even among researchers, yet once released it was very transparent and open. Second, it was funded through a hedge fund.
Nvidia doesn't have a monopoly on GEMM and GEMV. There will be dozens of hardware vendors. It is TSMC that should be the most valuable company in the world.
Certainly one of the most valuable, but I would still say TSMC, as there are lots of other steps in production besides photolithography (etching, ion implantation, vapor deposition, packaging, ...).
Think that was the cause of their stock increase? I feel like investors use opportunities like this to pile money into safer bets rather than just bail on stocks altogether.
I just want to know if I can buy a gaming video card at a reasonable price or if I should hold off. I don't care about the AI shit. And yes, I'd prefer Nvidia, because their closest competitor can't tape a box together, never mind develop and assemble a graphics card.
I don't quite understand the sentiment. Lower training and inference costs mean more companies can join the game, and therefore more demand for the hardware. Basically Jevons paradox in play. What's more likely to fall in the long term is OpenAI's valuation, if they don't have other killer products.
IMO it's a reaction to the short-term semi cycle: a lot of reshored/redundant fab expansion over the last few years outside of TW. If compute costs go down 10-50x there may not be enough use cases to fill the Jevons gap in the next infra hardware cycle of ~5 years. People are doing more with the same hardware, but maybe not sufficiently more to justify acquiring more hardware in the near term.
Deepseek should cause Nvidia and TSMC stocks to go up, not down. I'm buying more Nvidia and TSMC today.
1. More efficient LLMs should lead to more usage, which means more AI chip demand. Jevons paradox.
2. To build a moat, OpenAI and American AI companies need to up their datacenter spending even more.
3. DeepSeek's breakthrough is in distilling models. You still need a ton of compute to train the foundational model to distill.
4. DeepSeek's conclusion in their paper says more compute is needed for the next breakthrough.
5. DeepSeek's model is trained on GPT4o/Sonnet outputs. Again, this reaffirms the fact that in order to take the next step, you need to continue to train better models. Better models will generate better data for next-gen models.
I think DeepSeek hurts OpenAI/Anthropic/Google/Microsoft. I think DeepSeek helps TSMC/Nvidia.
That's a rational stance. However, the Buffett Indicator is flashing red, we're in a dotcom-era-sized bubble, and it only takes a little bit of ill-founded worry to kick off a serious panic.
But why would an AI breakthrough cause stocks to go down? It should cause it to go up. It means we should expect even more breakthroughs, better models, more LLM usage, etc.
You're looking at it from economic theory, not from the stock market's perspective. Nvidia's insane valuation was based on an almost exponential increase in demand for more and more of its chips; it's priced in that Nvidia will continue that trend. DeepSeek proves that trajectory is no longer needed (not that it was ever grounded in anything rational), so anything less than continued exponential growth sends the stock down.
Ugh. No, we can now run a decent model on a CPU, not an expensive video card.
Just try out the standard deepseek-r1 or even the deepseek-r1:1.5B through ollama.
No need for expensive hardware locally anymore. My PC (without an Nvidia card or other expensive hardware) runs a deepseek-r1:1.5b query fast enough: 2-9 seconds until it's finished.
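For anyone who wants to try the same thing, here is a minimal sketch of querying a locally served model through Ollama's HTTP API, assuming the ollama daemon is running on its default port and the model has already been pulled under the tag deepseek-r1:1.5b. The prompt is just a placeholder.

    # Minimal sketch: query a local deepseek-r1:1.5b via Ollama's /api/generate
    # endpoint. No GPU required for the 1.5B distill.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "deepseek-r1:1.5b",
            "prompt": "Explain Jevons paradox in two sentences.",
            "stream": False,   # one JSON object instead of a token stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    print(resp.json()["response"])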
The world isn't satisfied with a "decent model". The world is trying to reach AGI.
Furthermore, reasoning models require more tokens. The faster the GPU, the more thinking it can do in a set amount of time. This means the faster the hardware, the smarter the model output. Again, that reinforces the need for faster hardware.
The real hidden message is not that bigger compute produces better results, but that the average user probably doesn't need the top results.
In the same way that medium range laptops are now 'good enough' for most people's needs, medium range (e.g. DeepSeek R1x) AI will probably be good enough for most business and user needs.
Up till now everyone assumed that only giga-sized server farms could produce anything decent. Doesn't seem to be true any more. And that's a problem for mega-corps maybe?
>medium range (e.g. DeepSeek R1x) AI will probably be good enough for most business and user needs
Except R1 isn't "medium range" - it's fully competitive with SOTA models at a fraction of the cost. Unless you need multimodal capability or you're desperate to wring out the last percentage point of performance, there's no good reason to use a more expensive model.
The real hidden message is that we're still barely getting started. DeepSeek have completely exploded the idea that LLM architecture has peaked and we're just in an arms race for more compute. 100 engineers found an order of magnitude's worth of low-hanging fruit. What will other companies be able to do with a similar architectural approach? What other straightforward optimisations are just waiting to be implemented? What will R2 look like if they decide to spend $60m or $600m on a training run?
Yes absolutely. I guess I meant medium range in terms of dev and running costs. R1 is a premium product at a corner store price. :)
People are also forgetting that High-Flyer's ultimate goal is not applications, it's AGI. Hence the open source. They want to accelerate that process out in the open as fast as they can.
I don't think a "good enough" AI is happening for a while. People will see what the new models can do and want the same. So long as improvements are rapid and visible, the demand will keep rising.
Right, everyone should be focused on the rapid dismantling of the government; stuffing his cabinet to the gills with incompetent and dangerous sycophants; pardoning of violent criminals, especially those which nearly killed several officers, and the leaders of the organizations who directed the coup attempt; and the capricious way we are treating our historical allies.
It's said the AI war is a war between Chinese people on both sides, be it the US or China, be it hardware or software.
If that's true, then normally the mainland will win, as the people over there have more grit, are more eager to succeed, are working 996 schedules, and have nothing to lose. They're hard to stop as long as their government does not interfere.
H800s were made to match Biden's export restrictions. They were banned in late 2023 but a lot were sold to China. Having 2k is quite small compared to the bigger players like BABA (200k employees) and Tencent (100k employees). And those sure have access to the few H100s that were smuggled. But unlikely for a tiny company like High-Flyer/Deepseek (160 employees).
He seems adamant that there are no diminishing returns to scaling AI.
I don’t want to stir up conspiracy theories but I do think that currently all the big AI players have a vested interest in the message that the current scaling paradigm is the right one, and that this is a supremacy issue wrt China. It drives so much investment and valuation that I doubt they can truly be objective.
The issue is it feels like we came to a stop but Hyperscalers are simply waiting for Blackwell. That's all.
Why buy 100k Hoppers if 20k Blackwell offer the same compute so then it's better to buy 100k Blackwells right?
Blackwell will easily increase cluster scaling by 10x in performance, and if you buy 10x as many of them then your compute on a cluster will be 100x what it was before. If you have to wait 6-12 months for that, then so be it. You will easily make up the time in the end with the speedup.
> I don’t want to stir up conspiracy theories but I do think that currently all the big AI players have a vested interest in the message that the current scaling paradigm is the right one, and that this is a supremacy issue wrt China. It drives so much investment and valuation that I doubt they can truly be objective.
500 billion is a lot of money. Expect even crimes to be committed in order to make it happen.
So the day this happened I didn't eat all day which was a weird thing. I don't know just one of those weird coincidences for no reason. Tomorrow I won't eat all day again and see what happens.
Anyway I'm homeless so the not eating all day maybe kinda possibly having something to do with wiping out 600 billion dollars in stock market valuation just on the off chance that it might based on nothing more than wishful thinking?
Nope. Not eating tomorrow. Wonder what will happen.
I feel that one of the most fundamental aspects of business reality is being skipped in all of these conversations, on and off HN.
Everyone building an AI data center is likely using Nvidia technology. Sure, maybe 20% of the build-out partially uses other technology. The bulk of it is Nvidia.
If your project is in the planning for the next, say, two years, you have already placed your orders with Nvidia or are going to in the next few months.
Hardware has real lead times. You don't compile yourself 100K chips. They have to be made and you have to wait in line to get yours. For example, I remember when, during the pandemic, we had to place orders for chips with 40 to 50 week lead times.
This means you have to make decisions today (or you already made them months ago) to get in line.
Changes in training or inference efficiency should not change these orders or plans at all. If someone can train faster, they will benefit from the hardware in the pipeline. If they can make inference more efficient, they will be able to service more requests at reduced transactional costs.
The orders are in the pipeline and will continue to be added to the pipeline. Nvidia isn't going to be shipping half the hardware because Wall Street, overnight, panicked. What the grocery store owner does with their stock portfolio because they panic has nothing whatsoever to do with reality.
The same is true in the other direction. Wall Street has been going nuts with quantum stocks. Companies like Rigetti have exploded from nothing to see insane gains. This does not mean the company went from, well, shipping nothing to shipping real working solutions at scale.
Today's market reaction was nothing less than sheep running scared because someone went "boo!". It has nothing whatsoever to do with business realities on the ground. Go build an AI data center without Nvidia chips (or with 10x fewer chips) and see how that goes when everyone else is loading up on them.
The whole AI valuation was based on being able to rent-seek a significant chunk of all white collar salaries, with a permanent monopoly moat because nobody else would pay hundreds of billions to train models.
And yet again a cheaper Chinese product turns up and everyone loses their minds. Expect a ban incoming to preserve the AI valuations.
Banning wouldn't work imo. Now that the cat's out of the bag, with the architecture being open source, anyone can replicate their results and compete with them for a relatively small investment.
What you find with any market news is that the optimism and pessimism are always overblown. Fear and greed. It happens all the time.
Look at Linux. For those old enough to remember, there was a time where many (including Microsoft) were worried it would destroy the company (eg [1]). There were complaints about the market destruction caused by Linux. What actually happened? Microsoft is bigger than ever even though Linux is on billions of devices worldwide.
IF DeepSeek's claims are real and this stands up, all that's happened is, at worst, that the profit opportunity has simply moved (as it was always going to do). This might be bad for OpenAI and Sam Altman, but Big Tech will (IMHO) be fine.
Remember that training LLMs for chatbots, which is something people focus on, is just one narrow slice of the potential AI market. Recommendation engines, industrial/commercial applications, medicine, etc.
If there has been a software breakthrough and training LLMs now costs a fraction of what it did last year, there's now an order of magnitude more potential applications that have become economical.
Consider this: if we can now do with a model 1/10th the size what we needed a much bigger model for last week, what applications will there be for a model 10x DeepSeek-R1's size?
I'm also reminded of the invention of the cotton gin. This automated what used to be a highly manual process. At the time, there was concern this would diminish the need for slaves on cotton plantations. Instead the need exploded because cotton became so much cheaper [2].
Lastly, Stargate is largely meaningless. Companies spend a fortune on data centers. GPUs are just a fraction of that. A genuine software improvement just means you can do more with less.
My point is: don't panic. Unless you're an OpenAI investor, maybe.
Discounted future cashflows. If you buy an asset and every year it produces $100 profit for you it's worth more than $100. You're buying the ability to produce profits in the future not the profits its produced in the past. Those profits belong to the shareholders who have cashed that out already (through dividends or reinvestment).
Earnings multiples choose an arbitrary time length of one year.
What you're really trying to purchase is a machine that creates more money than it uses.
You need to guess at if that machine will do its job at an arbitrary point in the future, and how well it will do it. Those factors are only loosely correlated with current PE
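A minimal sketch of the discounting idea, for anyone who wants to see the mechanics; the discount rate and horizon below are arbitrary illustrative assumptions, not anyone's model of Nvidia.

    # Present value of a flat $100/yr cash stream, discounted at a required
    # rate of return. The 8% rate and 20-year horizon are made-up numbers
    # purely to illustrate the comment above.
    def present_value(cash_per_year: float, rate: float, years: int) -> float:
        return sum(cash_per_year / (1 + rate) ** t for t in range(1, years + 1))

    print(f"${present_value(100.0, 0.08, 20):.0f}")   # roughly $982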
No, that means they're earning enough in one year to cover their entire valuation. You want something like a 10:1 P/E, which means the next 10 years' earnings are factored in to cover the present valuation.
Who's out here buying businesses for 1x the sales revenue volume? What a silly concept. If businesses could be so cheap, you'd just double down every single year until you owned every business on the planet.
A similar efficiency event has occurred in the recent past. Blackwell is 25x more energy-efficient for generative AI tasks and offers up to 2.5x faster AI training performance overall. When Blackwell was announced, nobody said "great, we will invest less in GPUs". DeepSeek is just another efficiency event. Like Blackwell, it enables you to do more with less.
Seems bad! To me this highlights a failure in the US tech industry. Silicon Valley is theoretically a triumph of entrepreneurship, where the best ideas and best thinking win. In reality, investors picked OpenAI as the winner right out of the gate, gave it more funding than most of us can imagine so that it could dominate the market, sat back, and told themselves job done.
Meanwhile in another tech industry a startup had to think lean and innovate its way around resource restrictions. And OpenAI looks like a mess now.
And this is why competition is a good thing. Silicon Valley keeps trying to create monopolies for the purpose of maximal profit extraction to the detriment of American technological development, or they use money and brute force as a substitute for actual innovation and get worse results.
DeepSeek R1 just uses crappy PPO ("GRPO" is just using the Sharpe ratio as a heuristic approximation to a value function) on top of distilled existing models, with tons of pipelining optimizations manually engineered in. I don't see this making leading-edge research any less expensive; you won't get a "smarter" model, just a model that has a higher probability of giving an answer it could already give. And if you want to try to do something interesting with the architecture, the pipelining optimizations now slow down your iteration capability heavily.
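For readers who haven't looked at the papers: the "Sharpe ratio" remark refers to how GRPO (as described in DeepSeek's reports) replaces a learned value function with a group-relative baseline, i.e. each sampled completion's reward is normalized by the mean and standard deviation of its group. A minimal sketch of that advantage computation, with toy rewards, follows; the full method also includes a PPO-style clipped update and a KL penalty, which are omitted here.

    import numpy as np

    # Toy sketch of GRPO's group-relative advantage: sample G completions per
    # prompt, score them, and normalize each reward by the group mean/std
    # instead of using a learned critic. Rewards here are made-up 0/1 scores
    # (e.g. "did the math answer check out").
    def group_relative_advantages(rewards, eps=1e-6):
        r = np.asarray(rewards, dtype=float)
        return (r - r.mean()) / (r.std() + eps)

    rewards = [1.0, 0.0, 0.0, 1.0]             # 4 sampled answers to one prompt
    print(group_relative_advantages(rewards))  # correct answers get positive advantage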
The RL techniques present will only work in domains where you can guarantee an answer is right (multiple choice questions, math, etc.). It doesn't really present any convincing leap forward in terms of advancing the capability of LLMs, just a strategy for compute efficient distillation of what we know already works. The fact this shitty PPO proxy works at all is a testament to the fact that DeepSeek is bootstrapping its capability heavily off of the output of existing larger models which are much more expensive to train. What DeepSeek R1 proves is you can distill a ChatGPT et al. into a smaller model and hack certain benchmarks with RL.
If you could just do RL to predict the best next word in general, this would have been done already - but the signal-to-noise ratio on exploration would be so bad you'd never get anything besides infinite monkeys at a typewriter. It's not a novel or complicated idea to anyone familiar with RL to try to improve the probability of things you like, and whoever decided to do RLHF on an LLM surely thought of (and did) regular RL first - and found it didn't work very well with whatever pretrained model and rewards they had. It was like two weeks ago that people were going crazy about o3 doing ARC-AGI by running the exact same kind of traces R1 is doing in "GRPO", just at test time rather than train time. Doing this also isn't novel, and it also only helps on shitty toy problems where you can get a number to tell you good vs bad.
There is no mechanism to compute rewards for general purpose language tasks - and in fact I think people will come to see the gains in math/coding benchmark problems come at a real cost to other capabilities of models which are harder to quantify and impossible to generically assign rewards to at internet scale.
To explore the frontier of capability you will still need a massive amount of compute; in fact, even more to do RL than you would need for standard next-token prediction, even if the LLM might have fewer parameters. You also can't afford to do all the optimizations as you try many different complex architectures.
What people are betting on by selling is that there will be a short-term pause or deceleration in investment while companies try to squeeze more out of their current batch of hardware; there could be some efficiency catch-up.
1) Their initial AI offerings weren't real products customers would use or pay for
2) They weren't seeing sufficient adoption to justify the expense
3) They have insane levels of distribution in their existing product lines and can incrementally add AI features
This is entirely orthogonal to whether or not other startups can build AI-first products or whether they can position themselves to compete with the giants.
Wow, I literally bought more Nvidia shares last week. Just goes to show the stock market is 80% gambling. My belief in the value of a company is overshadowed by the hype of growth and "future valuation".
If this is gambling, what isn’t gambling? Sometimes things just don’t go the way you want.
A Chinese company coming up with a cheaper alternative to a cutting-edge technology out of nowhere, is an outcome that is hard to predict.
In hindsight, betting on Nvidia maintaining its monopoly on a resource crucial for such an important technology as AI, might not be the best of ideas, but then again, who knows.
The next episode in this is Chinese companies producing HBM3 (they are already shipping HBM2), 7nm-class GPU products from multiple vendors competitive with the A100, and EUV. The US will realize what a monster it created.
So I was aware of the basic story here from late yesterday, but I only actually looked at the numbers today. Focusing on the dollar figure here is misleading.
Basically, if you look at only the stories, you get a dire picture -- like, "how will they make payroll????"
However, if you look at the actual share price in context, you see . . . a not very interesting event.
The graph definitely shows a share price adjustment, but it by no means erases the run-up Nvidia has enjoyed in the last 2 years. All that happened was the stock dropped back to its price from about the middle of last summer.
This may be old news to most of you, but: in markets we also track a figure called the P/E ratio, which is the ratio of the price of the company to its profits. Old-school manufacturing firms -- what used to be "blue chip" stocks -- would be in the 18-22 range here. Apple, which absolutely PRINTS cash, is at a high-flying 38. NVidia's P/E is still 58.8.
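To make the arithmetic concrete, here is the ratio with hypothetical round numbers (assumed purely for illustration, not taken from any filing; they're chosen only to land near the ~58 figure quoted above):

```python
# Hypothetical round numbers, only to illustrate the P/E arithmetic.
market_cap = 2.9e12      # assumed company value in USD
annual_profit = 50e9     # assumed trailing-twelve-month net income in USD
print(market_cap / annual_profit)  # ~58: you pay ~$58 today per $1 of current annual profit
```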
The tl;dr here is that NVidia is still valued very, very highly. They're still what, the 3rd most valuable company in the world, with an enviable position in chipmaking. They still make money hand over fist. The weird part is that their firm is SO valuable that they could take a $600B haircut and have it be almost a non-event.
Nothing, and I think that was the point GP was making. Since not much of Apple's valuation is tied to AI hype they won't suffer (at least not close to as much) from the bubble bursting.
Nothing. Apple (and Meta) are not directly impacted by AI even if the cost of training these AI models falls and $0 free models get better.
It actually affects the frontier AI companies (OpenAI, Anthropic, etc) who directly make money from their closed models AND spend hundreds of millions on training these models.
Why pay $3 per million tokens (Claude 3.5 Sonnet) when DeepSeek R1 offers $0.14 per million tokens, the model is on par with OpenAI o1, and R1 itself is released for free?
$0 free AI models are eating closed AI models' lunch.
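For a rough sense of scale, here is that price gap applied to a hypothetical workload (an assumed one billion tokens per month; real bills also depend on input/output splits, caching, and pricing tiers, so treat this as an illustration only):

```python
tokens_per_month = 1_000_000_000                 # assumed workload
claude_sonnet = tokens_per_month / 1e6 * 3.00    # $3.00 per million tokens (quoted above)
deepseek_r1   = tokens_per_month / 1e6 * 0.14    # $0.14 per million tokens (quoted above)
print(f"${claude_sonnet:,.0f} vs ${deepseek_r1:,.0f} (~{claude_sonnet / deepseek_r1:.0f}x)")
# -> $3,000 vs $140 (~21x)
```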
Good time to buy then, I don’t understand how stupid some traders can be.
A more efficient model is better for NVIDIA not worse. More compute is still better given the same model. And as more efficient models proliferate it means more edge computing which means more customers with lower negotiating power than Meta and Google…
This is like thinking that if people need to dig only 1 day instead of an entire month to get to their nugget of gold in the midst of a gold rush the trading post will somehow sell fewer shovels…
While everything you say may be true, this shows a fundamental misunderstanding about how the modern stock market functions. How much value a company creates is at best tangential and often completely orthogonal to how much the stock is worth. The stock market has always been a Keynesian beauty contest, but in the past few decades it has been strongly shaped and morphed by attention economies. A good example of this is DJT, a company which functionally doesn't do much of anything, but has traded at wildly differing prices over the last year. P/E, EBITDA, etc. are all useless metrics in trying to explain this phenomenon.
In other words, NVIDIA is in the red not because the company is suddenly doing worse, but because traders think other traders think it will trade down. That is a self-fulfilling prophecy, but only so long as there is sufficient attention to drive it. The same works the other way around as well: so long as there is sufficient attention to drive the AI hype train upwards, related stocks will do well too.
I think the message everyone now accepts is: "there is no moat". It is plain stupid to think big models can be magically copy-protected - they are simply arrays of numbers, and all the components one needs to create such arrays are free and well established. This is unlike the whole infrastructure, processes, social connections, hardware, and storage one needs, say, to recreate a service like YouTube or Facebook. Large models are different - you don't need all of that - the future of LLMs is open source, like Linux.
You can buy the dip if you want, as long as you're aware that you're not betting that "stupid" traders are undervaluing NVIDIA's fundamentals. Rather, you're betting that "stupid" traders will again rally NVIDIA's share price significantly above this dip, and you will be a smart trader who will know when to sell what you bought. Good luck.
And not just that, but even if AI's future is indeed as bright as the hype says (i.e. NVIDIA's fundamentals are solid and the market will eventually acknowledge that after the fluctuations), they may still be wrong about the timeline.
In the .com bust you could have "bought the dip" in the early 00s right after the crash started and still waited 5 years before you weren't in the red, even on "good" (in hindsight) stocks like Amazon, eBay, Microsoft, etc. The big hype there was eCommerce - it turned out to be true! We use eCommerce all the time now, but it took longer than predicted during the .com boom (same for broadband internet enabling the "rich web experience" - it came true, but not fast enough for some hyped companies in '00).
And if you bought some of the darling stocks back then like Yahoo or Netscape that ended up not so great in hindsight you may have never recouped your losses.
It's not just the usual herding behavior though. There's a convex response to news like this because people look at higher order effects like the growth of growth for stocks. Basically the DeepSeek story is about needing 40x fewer compute resources to run inference if their benchmarks are true. The dip doesn't mean that NVidia is now doomed, it simply means that if DeepSeek is legit, you need much less NV hardware to run the same amount of inference as before. Will the demand rise to still use up all the built hardware? Probably, but we went from a very stratospheric supply constraint to a slightly less stratospheric one, and this is reflected in the prices. Generally these moves are exaggerated initially, and it takes a bit of time for people to digest the information and the price to settle. It's an oscillating system with many feedback loops.
As someone who bought NVDA in early 2023 and sold in late 2024 I can say this is wrong.
There was never a question of whether NVDA hardware would have high demand in 2025 and 2026. Everyone still expects them to sell everything they make. The reason the stock is crashing is that Wall St believed the companies who bought $50B+ of NVDA hardware would have a moat. That was obviously always incorrect; TPUs and other hardware were eventually going to be good enough for real-world use cases. But Wall St is run by people who don't understand technology.
Loving the absolute 100% confidence there and the clear view into all the traders' minds that are trading it this morning.
If they'll sell everything they make and it's all about the moat of their clients, why is NVDA still down 15% premarket? You could quote correlation effects and momentum spillover, but that is still just the higher order effects I mentioned about people's expectations being compounded and thus reactions to adverse news being convex.
Presumably because backorders will go down, production volume and revenue won't grow as fast, Nvidia will be forced to decrease their margins due to lower demand etc. etc.
Selling everything you make is an extremely low bar relative to Nvidia's current valuation because it assumes that Nvidia will be able to grow at a very fast pace AND maintain obscene margins for the next e.g. ~5 years AND will face very limited competition.
That's literally what I wrote in my post, which the parent disagreed with. You could disagree with the part that it is because inference is now cheaper - but again I'd argue that's just a different way of saying there's no moat.
People owned NVDA because they believed that huge NVDA hardware purchases were the ONLY way to get an AI replacement for a mid-level software engineer or similar functionality.
That's basically what I wrote: "it simply means that if DeepSeek is legit, you need much less NV hardware to run the same amount of inference as before."
So I still don't understand what it is that you are so strongly disagreeing with, and I also don't understand how having owned NVidia stock somehow lends credence to your argument.
We are in agreement that this won't threaten NVidia's immediate bottom line, they'll still sell everything they build, because demand will likely rise to the supply cap even with lower compute requirements. There are probably a multitude of reasons why the very large number of people who own NVidia stock have decided to de-lever on the news, and a lot of it is simple uneducated herding.
But we are fundamentally dealing with a power law here - the forward value expectations for NVidia have exponential growth baked in to the hilt, combined with some good old-fashioned tulip mania, and when that exponential growth becomes just slightly less exponential, that results in fairly significant price oscillations today - even though the basic value proposition is still there. This was the gist of my comment - do you disagree with this?
Up until recently there was a belief by some investors that OpenAI was going to "cure cancer" or something as big as that. They assumed that the money flowing into OpenAI would 10x, under the assumption that no one else could catch up with them after that event and a lot of that would flow to NVDA.
Now it looks like that 10x flow of money into OpenAI will no longer exist. There will be competition and commoditization, which causes the value of the tokens to drop way more than 40x.
Everything above the street level and physical economy is becoming gambling.
There has always been a component of gambling to all investing, but that component now seems to utterly eclipse everything else. Merit doesn’t even register. Fundamentals don’t register.
> In other words, NVIDIA is in the red not because the company is suddenly doing worse, but because traders think other traders think it will trade down.
Well put. People need to understand that some stocks are basically one giant casino poker table. There was a comment with a link here that a lot of Nvidia buyers don't even know what products Nvidia is making and they don't care; they just want to buy low and sell high. Insert the old famous story about the shoeshine boy giving investment advice to Wall Street stock traders.
It's a reflection of expectations about the future economy. Obviously, such expectations are not always accurate because humans are quite fallible when trying to predict the future. This is even more true when there is a lot of hype about a certain product.
Yesterday's price of (say) NVidia was based on the expectation that companies would need to buy N billion USD of GPUs per year. Now DeepSeek comes out and makes the point that N/10 would be enough. From there it can go a few ways:
- NVidia's expected future sales drop by 90%.
- The reduced price for LLMs should allow companies to push AI into markets that were previously not cost effective. Maybe this can 10x the total available market, but since the estimated total available market was already ~everything (due to hype) that seems unlikely.
- NVidia finds another usecase for GPUs to offset the reduced demand from AI companies.
In practice, it will probably be some combination of all three. The real problems are not caused for the "shovel sellers" but for companies like OpenAI and Anthropic, who now suddenly have to compete against a competitor that can produce the same product at (apparently) a fraction of the price.
> OpenAI and Anthropic, who now suddenly have to compete against a competitor that can produce the same product at (apparently) a fraction of the price.
OpenAI and Anthropic can react by adopting DeepSeek's compute enhancements and using them to build even better models. AI training is still very clearly compute-limited from their POV (they have more data than they know what to do with already, and training "reasoning"/chains-of-thought requires a lot of reinforcement learning which is especially hard) so any improvement in compute efficiency is great news no matter where it comes from.
I think it is more expectation about expectation. You buy/sell based on whether you expect other people to expect to earn or lose. It is self-referential, hence irrational. If a new player enters and people's expectations shift, that affects your expectation of value even though the companies involved are not immediately or directly affected.
As already mentioned elsewhere, the Jevons paradox will increase demand subsequent to improved efficiency. Yes, will, not can.
So if the stock market were reflective of the economy (future or present), then stocks should go up; instead they're going down. Why? Because the stock market is not reflective of the economy.
The stock market is essentially a reflection of societal perception. DJT which was brought up earlier is a great example, because the price of DJT has next to nothing to do with Trump's businesses and almost everything to do with how he is perceived (and remember there is no such thing as bad publicity).
Personally I think the fall will be momentary and followed shortly by a climb to recovery and beyond, but who really knows.
If you don't want to lose your money: Don't let the sensationalist financial journalists and pundits get to you, don't let big red numbers in your portfolio scare you, ignore traders (they all lose their money), don't sell your stocks unless you actually need that money for something right now, re-read your investment manifesto if you have one, and maybe buy the dip for shits and giggles if you have some spare cash laying around.
I agree that it will improve demand for AI services. There's no hard rule that the demand increase will be larger than the efficiency increase though, and so total sales of GPUs may still decrease as a result.
Nvidia is way too overvalued regardless of deepseek or the success of AI. This is just some correction (not even too big even considering the current bubble), these traders are not stupid.
I agree with Aswath Damodaran here. NVDA is priced for perfection in AI, but also whatever is next.
In addition, IMO NVDA’s margins are a gift and a curse. They look great to investors, but also mean all their customers are aggressively looking to produce their own GPUs.
Exactly.
GPUs have become too profitable and too strategically important for several deep-pocketed existing technology companies not to invest more and try to acquire market share.
There is a mini-moat here with CUDA and existing work, but the start of commodification must be on the <10-year horizon.
They are also priced on the idea that nothing will challenge them. If AMD, Intel, or anyone else comes out with a challenger for their top GPUs at competitive prices, that’s a problem.
The biggest challengers are likely the hyperscalers and companies like Meta. It sort of flew under the radar when Meta released an update on their GPU plans last year and said their cluster would be as powerful as X NVDA GPUs, and not that it would have X NVDA GPUs [1].
Also, I should add that Deepseek just showed the top GPUs are not necessary to deliver big value.
"This announcement is one step in our ambitious infrastructure roadmap. By the end of 2024, we’re aiming to continue to grow our infrastructure build-out that will include 350,000 NVIDIA H100 GPUs as part of a portfolio that will feature compute power equivalent to nearly 600,000 H100s."
Have you priced in the extremely limited freedom to operate they have? There is an extreme systemic risk to being a monopoly in a strategic position. It's an extreme beneficial position to be in, until it isn't.
ASML for now has a monopoly on cutting edge EUV. Since this is considered a strategic technology, the US dictates what they can sell to whom. This places ASML in a pincer. The US will develop a competitor as soon as they can if they can't get enough control over ASML, and at that point ASML would still be forbidden to sell to BRICS while losing the 'western' market as well.
So they're in a plushy seat, until the US decides they aren't.
NVIDIA has a P/E ratio of 56, double that of the S&P 500 but half that of AMD and about the same as Meta's.
And whether it's overvalued or not isn't the point here: selling a stock because the product the company produces is now even more effective is mind-bogglingly stupid.
> selling a stock because the product the company produces is now even more effective is mind-bogglingly stupid.
No it isn't. Investors are most likely expecting there will be less demand for Nvidia's product long-term due to these alleged increased training efficiencies.
There is, AFAICT, no inherent limit to expansion at the bottom end of the market. My gut feeling is that lower training costs will expand the market for hardware horizontally far faster than any vertical scaling by a select two-digit number of megacorps could.
It's arguable how good a strategy it is to benchmark against other companies' P/Es. During the tech bubble, people would say X is cheap because Y is trading at a P/E of 100 instead of 200.
Again, that may be reasonable, but it's a completely different argument. Whether there is a bubble or not and whether NVDA is overvalued is irrelevant to the subject at hand.
If it’s cheaper to train models it means far more customers that will try their luck.
If you reduce the training requirement from 100,000 GPUs to 1,000, you've now opened the market to thousands and thousands of potential players instead of the ~10 that can afford to dump that much money into a compute cluster.
The holy grail is to not have separate training and inference steps. A model that can be updated while it is inferencing is where we're headed. DeepSeek only accelerates the need for more compute, not less.
THIS is the only correct statement in all of this.
The goal for AGI and ASI MUST BE to train, infer, train, infer, and so on, all on the fly, in fractions of a second, from every token produced.
Now good luck calculating the compute and hard work in algorithms to get there.
Not possible? Then AGI won't ever work because how can AGI beat a human if it can't learn on the fly? Not to mention ASI lol.
P/E alone is useless anyway. A growth company is likely not making a profit as it is reinvesting. But no profit doesn't imply good either, of course.
AMD does not have a P/E double NVIDIA's. Its P/E is high because of amortization of an acquisition. People on Hacker News talk a lot but have no idea what they talk about. You might know how to write JavaScript or some other language, but clearly you have not read the earnings reports or financials of AMD, or probably a lot of the other companies you talk about. So please stop spreading nonsense.
This is hackernews, not some boilerroom pump n dump forum. Please use more professional language and take your confidence down a notch. Try to learn and add to the discussion.
You seem to believe that the more inference or training value per piece of tech, the more demand there will be for that piece of tech, full stop, when there are multiple forces at play. As a simple example, you can think of this as a supply spike; while you can make the bet that demand will follow, there could be a lag in that demand spike due to the time it takes to find use cases with product/market fit. That could collapse prices over the near term, which could in turn decrease revenue. As a reminder, the stock value isn't a bet on whether "the gold trader" will sell more gold or not; it's a bet on whether the net future returns of the gold trader will occur in line with expectations - expectations that are sky-high and have zero competition built in.
Well, the price has a built-in presumption that earnings will keep growing. That's why the P/E ratio is not that relevant for them - it's been 50-70 since forever, but the stock went up 10x, which means earnings went up as well. DeepSeek might be good for their business overall, but it might mean earnings will not continue growing exponentially like they have for the past two years. So it's time to bail.
You shouldn't underestimate the fact that a large amount of these trades are on margin. Sometimes you can't wait it out because you'll get margin called and if you can't pony up additional cash you're basically getting caught with your pants down.
Disclaimer: I am not a trader, so could be way off
Why? The compute requirements would still continue to grow the more efficient and more capable the models become.
If it's cheaper to do inference, you end up using the model for more tasks; if it's cheaper to train, you train more models. And if you now need only thousands of GPUs instead of tens or hundreds of thousands, you've just unlocked a massive client base of those who can afford to invest high six to low seven figures, instead of hundreds of millions or billions, to try their luck.
Doesn't this situation also imply to some degree that China is focused on beating the US on AI and probably they will develop a competitor to NVIDIA that will cause margins to drop significantly?
They have a lot of very smart people and the will to do it, seems like a matter of time before they succeed.
It could be, but maybe the feeling is the investments now are already massive and everyone has jumped on the AI train. If you are suddenly 10x efficient, and everyone gets 10x more efficient, there's less room to grow than before. What you're saying makes a lot of sense, but it's one thing to write it on a message board and another to use it to back up your decision that affects billions of dollars you have in your fund.
The proof is in the pudding, you're welcome to prove "everyone" wrong.
In economics, the Jevons paradox occurs when technological progress increases the efficiency with which a resource is used, but the falling cost of use induces increases in demand enough that resource use is increased, rather than reduced.
Yes, but it will also mean that people won't need cutting-edge NVIDIA chips - models will be able to run on older-node chips, or on chips from different manufacturers. So NVIDIA wouldn't be able to command the margins they do now.
It may be great news for VRAM manufacturers though.
> I don’t understand how stupid some traders can be.
90% of traders lose money, so that's a data point...
You're trying to apply rational thinking but that's not how markets work. In the end valuations are more about narratives in the collective mind than technological merit.
I think it's because the media coverage is all focused on how this means the big AI players have lost their competitive advantage, rather than the other side of the equation.
But that's also dumb, because "huge leap forward in training efficiency" is not exactly bad news for the major players in even the medium term. Short term, it means their models are less competitive, but I don't see any reason that they can't leverage e.g. these new mixed precision training techniques on their giant GPU farms and train something even bigger and smarter.
There seems to be this weird baked in assumption that AI is at a permanent (or at least semi-permanent) plateau, and that open source models catching up is the end of the game. But this is an arms race, and we're nowhere near the finish line.
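As a rough illustration of what the "mixed precision training techniques" mentioned above buy: weights and activations are stored and multiplied in a low-precision format and rescaled, trading a small quantization error for large memory and bandwidth savings. A conceptual sketch follows (assuming a recent PyTorch, 2.1 or later, with float8 dtypes; real FP8 training uses per-block scaling and fused kernels, not this simplification):

```python
import torch

def fake_fp8_roundtrip(x: torch.Tensor) -> torch.Tensor:
    """Simulate per-tensor FP8 (E4M3) quantization: scale into the E4M3 range
    (max magnitude ~448), round through the float8 dtype, then scale back.
    Illustrative only; not how production FP8 training kernels are written."""
    scale = x.abs().max().clamp(min=1e-12) / 448.0
    q = (x / scale).to(torch.float8_e4m3fn)   # lossy 8-bit storage
    return q.to(torch.float32) * scale        # dequantize to inspect the error

w = torch.randn(1024, 1024)
err = (w - fake_fp8_roundtrip(w)).abs().max()
print(err)   # small but nonzero: the bet is that training tolerates this noise
```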
> Good time to buy then, I don’t understand how stupid some traders can be.
Likely a "how solid is the technical moat" evaluation - this could be a one-off or could be that there are an avalanche of advancements to continue along the efficiency side of the process.
Given the style and hype of logic in the AI space, I fully believe resources are not well allocated, either in compute or in _actual_ thinking about how they are spent.
DeepSeek's apparently ~10x better efficiency per inference token... implies a lot of other hardware meets the general use case. We also know that reasoning should take about 10W at human speed-of-thought... maybe another 1-2 orders of magnitude of power efficiency to go.
"Pre-Training: Towards Ultimate Training Efficiency
We design an FP8 mixed precision training framework and, for the first time, validate the feasibility and effectiveness of FP8 training on an extremely large-scale model.
Through co-design of algorithms, frameworks, and hardware, we overcome the communication bottleneck in cross-node MoE training, nearly achieving full computation-communication overlap.
This significantly enhances our training efficiency and reduces the training costs, enabling us to further scale up the model size without additional overhead.
At an economical cost of only 2.664M H800 GPU hours, we complete the pre-training of DeepSeek-V3 on 14.8T tokens, producing the currently strongest open-source base model. The subsequent training stages after pre-training require only 0.1M GPU hours." [1]
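Back-of-the-envelope on those figures (the per-GPU-hour rental rate below is my assumption for illustration, not a number from the report):

```python
pretrain_gpu_hours = 2.664e6    # H800 GPU hours for pre-training (quoted above)
posttrain_gpu_hours = 0.1e6     # subsequent training stages (quoted above)
assumed_rate_usd = 2.0          # assumed H800 rental price per GPU-hour
total = (pretrain_gpu_hours + posttrain_gpu_hours) * assumed_rate_usd
print(f"${total:,.0f}")         # ~$5.5M of compute under these assumptions
```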
NVDA is a totally manipulated stock. The company beat earnings in the last three quarters and the stock dropped 15% to 20% immediately after the results release.
For most people, a good time to buy is always: a percentage of income, via tax-efficient methods, into the S&P 500.
But I agree in the sense that DeepSeek just creates more demand, because people want AI to do more work. That makes the bang for the buck greater, opening new opportunities.
This sell off is like selling Intel in 2010 because of a new C compiler.
Maybe Nvidia is fine, but I don't understand this logic. Suppose it turns out GPUs are not necessary at all - they still provide a performance boost, but you can do everything you can do right now with CPUs, performance-wise. Would that be good or bad for Nvidia?
Unless it can be said that we need more performance than is currently possible, i.e. new demand, it would be catastrophic. It is unclear that throwing more compute at the problem actually expands what is possible. If that is not the case, efficiency is bad for Nvidia because it simply results in less demand.
I could see arguments be made in both ways here. If GPUs end up being more efficient/powerful (like today) it could induce even more demand, but also if CPU gets within ~20% of how fast you can do something with a GPU, people might start opting for something like Macs with unified memory instead of GPUs.
Today a CPU setup is still nowhere near as fast as a GPU setup (for ML/AI), but who knows what it will look like in the future.
> it is unclear that throwing more compute actually expands what is possible
Wasn't that demonstrated to be true already in the GPT1/2 days? AFAIK, LLMs became a thing very much because OpenAI "discovered" that "throwing more compute (and training data) at the problem/solution expands what is possible"
nVidia is also about HPC in general, not just AI. It's remarkably silly that the stock would plunge 13% just because someone made a more compute-efficient LLM.
There is a point where there are enough shovels circulating that the demand for new shovels falters, even with zero drawback in the rush. And if so much gold was being mined that it overwhelmed the market and reduced the commodity price, the value of better shovels is reduced.
DeepSeek and friends basically reduce the commodity value of AI (and to be fair, Facebook, Microsoft et al. are trying to do the same thing with their open source models, trying to chop the legs out from under the upstart AI cos). If AI is worth less, there are going to be fewer mega-capitalized AI ventures buying a trillion dollars worth of rapidly-depreciating GPUs in hopes of eking out some minor advantage.
I wouldn't short nvidia stock, but at the same time there is a point where the spend of GPUs just isn't rational anymore.
> And as more efficient models proliferate it means more edge computing which means more customers with lower negotiating power than Meta and Google
Edge compute has infinitely more competition than the data center.
You answered your own question. People do not dig in the Sacramento River anymore for gold because it is gone. If you can train models for 1/100 the cost, and you sell model-training chips, you probably are not going to sell as many chips.
That's why the shovel maker from back then are selling mining machines today.
Everyone here thinks Nvidia is doomed because of training efficiency.
But what has Nvidia been doing for the past decade? Exactly that: increasing training and inference efficiency by orders of magnitude.
Try training GPT-4 on 10k Voltas, then Amperes, then Hoppers, and then Blackwells.
What has happened since then? Nvidia has increased its sales by orders of magnitude.
Why? Because thanks to improvements in data, algorithms, and compute efficiency, ChatGPT was possible in the first place.
Imagine Nvidia didn't exist. When do you think the ChatGPT moment would have happened on CPUs? LOL
Going back to my first sentence: Nvidia also started with small shovels, which were GeForce cards with CUDA. Today Nvidia is selling huge GPU clusters (mining machines, yes, pun intended ^^).
This is an example of a recurrent phenomenon in politics. Trump comes in with big plans, this and that, threatening countries, deporting people, a clear agenda to take America to the hard right. Isolationism, crush China, screw every other country over, etc. etc.
And then - something completely unexpected, a total curveball - arrives a week into office and everything changes. Your agenda collapses and you enter reactive mode.
AI bubble could pop. Valuations drop. Crypto might get hit. Where is Project Stargate now? It might become a joke.
A more cynical observer might suggest that _the timing was no coincidence_.
I said in other threads that the establishment passed Trump a ticking time bomb - if he were smart, he would've made it blow up early himself to avoid becoming the next Herbert Hoover.
> Your agenda collapses and you enter reactive mode.
Where has Trump's agenda collapsed? I might have missed the press release.
And why would a curveball on AI throw off an agenda around trade, immigration and military engagement? I don't follow.
China could take the lead on AI and I don't see how it would impact any of those things. Isn't DeepSeek open source? The US already has access to it, so what leverage could China possibly have?
> A more cynical observer might suggest that _the timing was no coincidence_.
Do you feel the Chinese government closely controls AI research and timed this response?
> Where has Trump's agenda collapsed? I might have missed the press release.
Well, it's too early to say if it has happened in this case. But we've seen it happen again and again, so it will not be a surprise if it happens.
> And why would a curveball on AI throw off an agenda around trade, immigration and military engagement? I don't follow.
Well, trade restrictions against China have just backfired spectacularly. So further trade restrictions may not seem as good an idea. And to start trade wars, you need a strong economy. And at the moment America's economy is entirely driven by the AI bubble, as it is the value of the AI stocks that has separated the economy from the European trend.
It is very likely that military engagement will be driven by, or affected by, developments in AI. And there's no doubt that Taiwan's situation is heavily affected by chip production.
> Do you feel the Chinese government closely controls AI research and timed this response?
No evidence, but I am not naive enough to think it's out of the range of possibility. I don't follow your reasoning; more likely, the timing of the release is all they had to modify, which is quite trivial for any government. Not to question that would be very naive. They have _every motive_.
> Well, it's too early to say if it has happened in this case
Ah ok, so you’re just guessing. You wrote it as if it had already happened.
And I think you’re confusing the stock market with the broader economy. The economy as a whole is unaffected by AI at this point.
And the trade war with China involved hundreds of industries beyond AI. Most of it is manufacturing. I’m not sure how failure of AI sanctions (questionable conclusion) somehow means the trade issues around machine parts needs to be abandoned.
I agree China has motive but I’ve heard so many claims that the Chinese government doesn’t control research or businesses like TikTok.
> Ah ok, so you’re just guessing. You wrote it as if it had already happened.
That's not a generous reading. I'm saying this is what has happened historically.
Edit: the use of the words "might" and "could" made this pretty clear.
> The economy as a whole is unaffected by AI at this point.
True in terms of no marked effect on GDP. But the stock market and the broader economy are very much linked. See 2008. And the stock market is high on AI.
> I’ve heard so many claims that the Chinese government doesn’t control research or businesses like TikTok.
Even western countries tell tech companies what to do. Do you think an authoritarian government is going to do that more or less? There's a reason they didn't want to sell TikTok.
Deepseek showing that you can do pure online RL for LLMs means we now have a clear path to just keep throwing more compute at the problem! If anything we made the whole "we are hitting a data wall" problem even smaller.
Additionally, it's yet another proof point that scaling inference compute is a way forward. Models that think for hours or days are the future.
As we move further into the regime of long sequence inference, compute scales by the square of the sequence length.
The lesson here was not "training is going to be cheaper than we thought". It's "we must construct additional pylons uhhh _PUs"
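To put a number on the quadratic claim above, here is a toy FLOP count for the attention score/value matmuls (a sketch with assumed sizes; it ignores projections, KV caching, and sparse or linear-attention tricks):

```python
def attention_flops(seq_len: int, d_model: int) -> int:
    """~2 multiply-adds per cell for QK^T plus the same for the weighted sum of V."""
    return 4 * seq_len * seq_len * d_model

short = attention_flops(8_192, 4_096)     # a typical chat-length trace (assumed sizes)
long = attention_flops(131_072, 4_096)    # a 16x longer "thinking" trace
print(long / short)                       # 256.0: 16x longer sequence, ~256x the attention compute
```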
Didn't DeepSeek also show that pure RL leads to low-quality results compared to also doing old-fashioned supervised learning on a "problem solving step by step" dataset? I'm not sure why people are getting excited about the pure-RL approach, seems just overly complicated for no real gain.
If I'm understanding their paper correctly (I might not be, but I've spent a little time trying to understand it), they showed you only need a small amount of supervised fine-tuning ("SFT") to "seed" the base model, followed by pure RL. Pure RL alone was their R1-Zero model, which worked but produced weird artifacts like switching languages or excessive repetition.
The SFT training data is hard to produce, while the RL they used was fairly uncomplicated heuristic evaluations and not a secondary critic model. So their RL is a simple approach.
If I’ve said anything wrong, feel free to correct me.
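A minimal sketch of the kind of critic-free, rule-based reward being described (the answer-tag format and the weights here are my assumptions for illustration; the paper's actual reward functions also include format and language-consistency checks):

```python
import re

def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Heuristic reward: a small bonus for emitting the expected answer format,
    plus an accuracy reward if the extracted answer matches a verifiable reference."""
    reward = 0.0
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if match:
        reward += 0.1                                      # format reward (assumed weight)
        if match.group(1).strip() == reference_answer.strip():
            reward += 1.0                                  # accuracy reward (assumed weight)
    return reward

print(rule_based_reward("...long chain of thought...<answer>42</answer>", "42"))  # 1.1
```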
ASML warned that it was just a question of time for the Chinese to get their own tech industry up and running.
The Americans were in denial.
Ultimately someone in America will get desperate enough and start a war when they still have a chance to win.
See the Earth-Mars conflict in the Expanse.
DeepSeek has humiliated the entire US tech sector. I wonder if they will learn from this, fire their useless middle management and product managers with sociology degrees, and actually pivot to being technology companies?
"Again, just to emphasize this point, all of the decisions DeepSeek made in the design of this model only make sense if you are constrained to the H800; if DeepSeek had access to H100s, they probably would have used a larger training cluster with much fewer optimizations specifically focused on overcoming the lack of bandwidth."