The people writing market commentary are simply making it up. The news about DeepSeek is not new and doesn't reduce the value of ASML. People are selling now because they are scared because the number went down.
Yes! And this applies to all market commentary. The market goes down 0.7% and you have talking heads saying "fear of tariffs", "Middle East tensions", "hurricane season", whatever; next day the market goes up 0.6% and it's "talk of tax cuts", "jobs numbers", whatever. There's no way anyone knows why "the market" behaves the way it does. The free market is the OG "decentralized" project: it's 1 billion different decisions being made in a day, each for its own personal reasons. Yes, sometimes it's fairly obvious that something caused a move (plane blows up, stock goes down), but that still doesn't explain the entirety of it.
Andrej Karpathy was tweeting about DeepSeek a month (!) ago.
"DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M)."
Yes exactly. The actual impetus this time was the article I posted here and how it got echoed and amplified by massive X accounts like Chamath and Naval.
ASML's paper value is determined by equipment sales, which follow projected compute supply/demand. CHIPS building redundant global fabs = glut from excess capacity = fewer future sales. Stargate = excess demand from everyone spending 100s of billions on compute = need even more fabs = more future sales. Then DeepSeek = suddenly no need for that much future compute... if future compute demand relative to short-term fab overcapacity is going down, then it's reasonable to sell. Relatively predictable semiconductor market cycles, driven by the cost of capex and the time it takes to build fabs / increase wafer output to match future demand, are a thing.
This would be an excellent explanation if DeepSeek had announced its model over the last weekend rather than weeks ago, and if R1 wasn't a CoT reasoning model, which needs a lot more inference-time compute than other SOTA models like Llama.
Information lag, especially with respect to PRC developments and technical developments. Taking 1-2 weeks for info to be shared and passed down the info chain is not unusual. IIRC CoT typically increases inference compute 1-5x depending on task complexity, i.e. instead of scaling down compute demand by 50x, it's 10x, which is still substantial. Could investors be panicking? Sure, but there's a rational basis for doing so.
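To spell out that arithmetic, here is a rough sketch. It assumes the claimed 50x efficiency gain and the 1-5x CoT overhead simply divide; both figures come from the comment above and are unverified, and the function name is made up for illustration.

```python
def net_compute_reduction(base_reduction: float, cot_overhead: float) -> float:
    """Net factor by which compute demand shrinks once CoT overhead is paid.

    base_reduction: claimed efficiency gain (e.g. 50 for "50x cheaper").
    cot_overhead:   extra inference compute from chain-of-thought (1 to 5).
    Assumption: the two effects combine by simple division.
    """
    if base_reduction <= 0 or cot_overhead <= 0:
        raise ValueError("both factors must be positive")
    return base_reduction / cot_overhead


BASE_REDUCTION = 50.0  # figure quoted in the thread, not a measured value

for overhead in (1, 2, 5):
    net = net_compute_reduction(BASE_REDUCTION, overhead)
    print(f"CoT overhead {overhead}x -> net reduction {net:.0f}x")
```

At the top of the quoted overhead range (5x) the net reduction is 10x, which is the number used above; at the bottom (1x) the full 50x survives.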
DeepSeek V3 is a 671B parameter MoE model? I am not sure why it's 50x cheaper at inference time than other models. We don't know what the cost of running o1 is, but I doubt it has 50x as many params as R1. Most of the advantages of MoE are reduced when using reasonable batch sizes, so that wouldn't make R1 cheaper in practice either. I think people might be seeing a lower markup from DeepSeek and confusing it with cheaper inference?
If DeepSeek is a side/pet project for PRC quants who are fine with subsisting on a low markup, then that's the market price competitors have to calibrate return on investment and future capex against. DeepSeek also appears open and performant enough to drive cheaper inference on commodity hardware with very different margins for a variety of use cases, including on existing hardware. At least short term there's going to be some % of LLM use cases that have to compete on China prices or on hardware previously considered depreciated. IMO snowballing interest is undoubtedly getting investors to pay attention and deep dive into related developments, e.g. Bank of China's 1T AI fund, and the DeepSeek CEO just met the PRC premier a few days ago. AFAIK DeepSeek hasn't ever gotten this much domestic political attention before; they're potentially going to be elevated as a domestic champion, and that's likely to open a lot more doors, e.g. significantly more compute. Hard to tell how that will affect Western model makers' business models short term.