Hacker News | dkobia's comments

Zitron is begging for a collapse at this point. Yes, his macro analysis correctly identifies a massive financial risk but his incessant pessimism completely misses the incredible ground-level utility that many of us on HN celebrate every day through undeniable, massive productivity gains.

At this point I'm trying to believe there's a middle ground where the level of individual capability this unlocks leads to major discoveries.


> undeniable, massive productivity gains.

Take any stock index, remove AI stocks, what do you see? That's right! Nothing...

So where is all the productivity going? Where is the value? Where are the massive unemployment stats or the millions of new startups making big $$$?


Writing about AI, destroying the planet for data centers, there's a lot of money to be made.

That being said, AI seems kind of miraculous sometimes.

Similar to cars. So enticing that we make everything else in the world worse in order to maximize the profit, make it indispensable, subsidize it, and make the dependency on it irreversible.

And it's not even something to blame individual people for.

Driving away from all the other cars to spend a weekend feels like freedom.

Using AI to answer a question feels like a "bicycle for the mind".

But in fact it's more like a car. It requires massive resources and creates perverse incentives, and the result is ineffective and corrupt.

Both cars and AI are amazing technology and extremely useful, but using them is not an individual responsibility. It requires societal subsidy.


The environmental impact of answering a question on an obscure topic with an AI model is less than the impact of answering the same question with an hour-long Google search hunting for references, or a drive to the public library.

That's true, and I am not anti-AI. I was not only thinking about the environmental effects of some single prompt or a certain amount of tokens.

Neither did I want to say that a car is always more wasteful than some alternative.

But defaulting to the behemoth is inefficient, unless everyone is driven to do it: then it's in some way reasonable.

By adding "corrupt" and "dependent", as well as the economic terms, I wanted to offer a broader critique and create an analogy, not just talk about energy usage on its own.

What I had in mind was: it's easier to go many places that are a mile or less from me, by car. Because everything is obstructed by cars. And I'm atrophied by lack of movement. Best would be to drive somewhere to move/walk.

People already do that in masses.

And doing shopping by car, because everything else seems unbearable, also takes away your time, apart from wasting energy compared to more, smaller shops that would be reachable by foot, bicycle etc.

I guess you know the argument.

Today, people's thinking atrophies because their LLM is probably right in their summarization of some Wikipedia article, plus 2-3 other random sources.

Or so.

Using the Wikipedia search function is not expensive.

But, I mostly had a bigger picture in mind than what is the cost of inference.


I think it's a good analogy in many ways, and personally I think car-centric society has a lot of flaws. I think the ease that AI brings to tasks may erode mental capabilities in the same way cars have eroded our collective physical health.* That said, it doesn't seem to me that we would be better off without cars altogether, despite all the related issues.

I am concerned about the environmental impacts that AI poses, but they don't seem to me to be so catastrophic. Solar and battery tech has made enormous leaps in the past couple of decades, and we will need to pivot to a clean energy future irrespective of AI.

*This said, I have become gradually more alarmed over the past decade at the lack of epistemological rigor in the general public, as made apparent through the rise of social media. I don't know that AI becoming a truth-seeking crutch for people wouldn't be more good than bad.


> it doesn't seem to me that we would be better off without cars altogether, despite all the related issues

Oh my god, no. I also want the benefits of automobiles! They are strictly more capable than, say, trains. That's where I would derail the discussion completely when going into details, but no, I am not against cars as a technology.

Apart from all the ethical and social arguments (logistics, ambulances, the elderly, etc etc). But that's not where I wanted to go.

I was making a leap here simply because of the whole complex around the prisoner's dilemma, the commons, state economy, and so forth.

Since at least ~100yrs ago, I guess cars and streets as the primary mode of transportation have also "won the vote" / are what the majority wants, so it's also an interesting analogy for diminishing returns maybe.

Building out more car infrastructure is certainly not controversial where there is absolutely none but there are commercial or residential buildings.

Anyway, lots of associations are worth considering here IMO. The ultimate limiting capacity here, when disregarding all environmental or health concerns, is simply space and the positive externalities (cities etc) around existing infrastructure.


> I was not only thinking about the environmental effects of some single prompt or a certain amount of tokens.

Hand-wringing about AI data centers' environmental impact is well and good. We should keep the data centers accountable for their consumption and waste.

I just wish the same people had been upset the last 20 years with poor water resource management in a lot of areas (the western US especially) with urban, ranching and farming development.

> That's true, and I am not anti-AI.

Me neither!


The past may be past, but it's important that even now we point out the relative scale of resource usage, pollution, etc going forward of everything from cars to AI to golf courses to beef.

That might be true, but at least I started asking way more questions since we’ve had competent LLMs.

It's like saying if we didn't have cheap commercial flights people would travel by foot anyways and would consume more resources for food &co. than the plane would consume in fuel...

80% of generative AI queries wouldn't even exist as google searches.


To be clear, your position here is that insurmountable barriers to information are the preferable state of the world?

One claim of the parent comment was that AI is ineffective. For the purpose of finding answers to questions, it is more resource-efficient than the alternatives, and, to your point, capable of answering questions that were impossible to answer via other means before. In what way is that ineffective?


No, they're saying that 80% of genai queries (aka anything sent to an LLM; I won't speak on the validity of the percentage) are not things someone would search on Google. It's things like trial-and-error vibecoding, openclaw-like agentic loops, talking to chatgpt like it's a person, etc. In other words, most genai queries are not for getting "obscure information" or even getting direct information at all. It's about either getting it to do something you don't want to do yourself, or using it as a replacement for someone else (junior dev, therapist, friend, significant other, etc).

A request that isn't asking for information isn't a query

That's just what some people generally use to refer to LLM input string/prompt/message/etc. The only thing the LLM can do is return information...in the form of text, so every request is one for information.

If we want to get really pedantic, every generated token is the answer to the query: what's the next most probable token in this sequence of tokens?


If "query" doesn't imply intent by the user, it ceases to be a useful word. You can acrobat your way to imagine a digital system has agency to ask a question before it receives bits, but then any transfer of data could be called a query.

When I post this http request containing this reply, you could say my machine is querying the server to ask "what did you do with the message I just gave you", but then query stops having any useful semantic value to distinguish it from "request"

Regardless, this is tangential. I don't disagree that a lot of LLM use is not in pursuit of knowledge, but enough of it is for me to think that preferring LLMs not to exist is a hard position to defend, at least without making the case for existential doom.


I do plenty of AI queries, both pragmatic ones and some for entertainment: witnessing talktotransformer was mind-bending already at the time! And since then, I've tried frontier models, local, coding agents, and use plenty of them on the regular.

I am in awe of the capabilities of generative AI.

I also enjoy sitting in or driving a car.

I did not want to make a moral argument, unless you consider each and every form of utilitarianism as moralism.


Vonnegut said in his last living work that the greatest addiction modern people face is the drug of cheap oil.

We got addicted to the convenience and overuse, and have started a mass extinction event because of it.

The perverse incentives will come for us all.


It is exactly this thought, in the form of this sentence, that could replace almost all of my comments in this thread.

It feels depressing, but I think the same. When thinking about the larger world, it becomes increasingly hard to ignore. And of course it is not new.

There were "doomers" already in the midst of the 20th century, but it doesn't mean that they were wrong.


I agree with your message but am not sure about the conclusion. Cars themselves are a commodified luxury available (in the US pretty much required) to everyone, and they do need to be subsidized, both in terms of infrastructure and the lifestyle they require.

But with AI what is the exact price? My understanding is that R&D is extremely expensive, but running non-SOTA models is not that bad. We are getting pretty close to models which can be useful locally in many applications.

Or do you mean that at scale running them locally is not possible and hence the infrastructure price is in data centers, which will be expensive to maintain and scale for demand?


Thanks for asking an open question about my point.

First, because I initially failed to answer your more closed questions (this paragraph is edited in):

> We are getting pretty close to models which can be useful locally in many applications. Or do you mean that at scale running them locally is not possible and hence the infrastructure price is in data centers, which will be expensive to maintain and scale for demand?

I don't think there's a way around making the best of AI capabilities with minimum price and maximum control, and I'd agree this is met by on-prem data centers, just not in a rationally targeted way.

Back to my original comment:

Because it (my conclusion) was not so clear, and maybe I just wanted to highlight some observations without delivering a real argument for or against things [, I thank you for your open question].

The utility/leverage aspect for AI seems more esoteric than the one for cars because, apart from Chatbots, it's more hidden.

And also, similar to cars (or many other phenomena of industrialization), yes, my first vague point was the subsidization of infrastructure. But also, the power gap: that's something not only associated with AI or cars, but with a lot of technologies we all hold dear: sewage, powerline, logistics, etc etc.

What reminds me of cars in the current AI frenzy is the fixation on cementing infrastructure. And also, I think, a lot more people agree on, for example, some kind of universal right to, for example, clean water.

But all of industrialization confronts people with questions of efficiency, inequality, and collective support.

Most people would, for example, support a right to get a minimum amount of clean water when you are living and working in a traditionally inhabited space (if you're on the social-darwinist side) or at least not harming society (if you're more of a social democrat).

And, similar to the buildup of car infrastructure, and the procurement of resources, space etc for maximum building, giant data centers can obstruct people in buying drinking water. Or walking outside (AI obstructs traditional methods of online collaboration).


> So where is all the productivity going? Where is the value?

Infrastructure doesn't produce value overnight. How long did it take the Interstate System to provide measurable value? I asked Gemini. Supposedly increased national productivity by 25% over 39 years[1]. But if you drove on a newly finished interstate in 1959, you saw the same cars just moving a lot faster.

That's what we're seeing right now. People can produce an incredible amount of stuff really quickly with AI. Is it directly connected to measurable productivity across the entire economy? No, because, realizing a mass productivity increase from infrastructure takes time.

[1] - https://www.richmondfed.org/publications/research/econ_focus...


The original point of the stock market was to fund gigantic society-level projects (like railroads). Modern VC has replaced some of that at smaller scales but not all of it at the largest scales. So this could just be the stock market performing the function it was designed to perform -- helping fund something transformative on a societal level.

> Take any stock index, remove AI stocks, what do you see? That's right! Nothing...

Where did all the stock gains go before AI?

FAANG / MAG-7.

Was everything from 2012-2020 fake, too?


They went from ~9% of the sp500 to ~35% over your timeframe...

Literally right here. eComm business turned around from losing money to profitable in less than 12 months after vibecoding a bunch of solutions to various problems we were having.

Not sure what your point is. Stock markets are based on money going into securities based on estimated future value. Even if AI were doubling productivity at a non-AI company, there is more leverage to that money going into an AI company.

The question is, is AI leading to massive productivity gains in companies that implement it? AI productivity gains take time to diffuse, but so far companies in the S&P 500 are seeing very high growth. YOY earnings growth rate for the S&P 500 is 21.7% https://advantage.factset.com/hubfs/Website/Resources%20Sect...


> YOY earnings growth rate for the S&P 500 is 21.7%

Now remove the companies selling the AI shovels: https://pbs.twimg.com/media/HIAjbZxacAARHwD.png

> Not sure what your point is.

My point is that they're selling us Skynet and the end of employment as we know it, things that we shouldn't even have to measure to perceive the results of, yet no one is able to measure any of it

Pointing a finger at nvidia, google, and the other few companies stuck in circular investment schemes that shouldn't even be legal and saying "OOGA BOOGA line go UP, UP GOOD!" doesn't count in my book


Your grandparent comment:

> Take any stock index, remove AI stocks, what do you see? That's right! Nothing...

Parent comment:

> Now remove the companies selling the AI shovels: https://pbs.twimg....

From your linked image, "excluding AI stocks" is "+16%" (the figure with AI stocks is far higher).

Your sole source says +16% excluding AI - in what kind of market is +16% “nothing”?


Charitably the lag time for this technology to have noticeable effects could just be ~5 years away. Similarly to how computers didn't have a big impact for a decade after they were introduced as people got used to using them.

Is the image you provided depicting revenue, or stock value? My point is about revenue.

Revenues don't matter when you sell a dollar for 50ct and half of the deals are circular anyways

So you're claiming that the revenue growth of the S&P 500 over the last few years is largely due to "selling dollars for 50ct" and circular deals?

Yes.

https://insights.som.yale.edu/insights/this-is-how-the-ai-bu...

> AI-related stocks have accounted for 75% of S&P 500 returns, 80% of earnings growth and 90% of capital spending growth since ChatGPT launched in November 2022.


Has it occurred to you that AI companies may be making huge returns because AI is genuinely increasing productivity and driving actual economic growth via their products?

If all these false practices can pull revenue out of nothing, why doesn’t every company do it? How come AI companies seem to be able to pull off financial magic that no other company can match?

All your analyses still ignore the revenue point.


But then why don’t we see this productivity growth in any other statistics? In layoffs or in faster GDP growth or in new software products?

> Take any stock index, remove AI stocks, what do you see? That's right! Nothing...

I mean, do you know what the value of those stocks would be if AI didn't exist? Maybe they would be much more negative. Maybe we would be in a recession. Without a control this type of analysis is meaningless.

And that is even assuming that AI productivity gains are happening now instead of 5-10 years from now.


He has also consistently demonstrated, at least to me, that he doesn't really understand how inference works from a technical perspective, which weakens much of his core thesis for why there should be a collapse.

I do value having some naysayers in the mix generally, because we do need balanced critique in what is otherwise a very frothy hype cycle. I just don't think he's making sound arguments, and that's even assuming you even agree with his premises in the first place.

My biggest gripe with his napkin math is that he treats inference gross margins as something novel that you can't compare to normal SaaS margins. He's right in part: the constant carousel of R&D costs from model training, related infrastructure buildout, and other adjacent costs required to stay competitive do change the analysis a bit.

But he takes this way too far when he says this is structurally different from normal SaaS margins. The business model definitely doesn't look like Dropbox, but it absolutely looks a lot like AWS, especially early AWS, CDNs, telecom, etc. I can speak to the telecom bit personally, since it's been over half of my professional career as an engineer and, in this specific case, also as a founder. You can have a brutally capital-intensive infra business where profitability depends on utilization, oversubscription, peak-capacity planning, segmentation, and recovering capex over time.

The math he presents gets even more questionable as we see explicit segmentation happening for cost-saving reasons. Many forward-thinking orgs are waking up to the fact that they don't need to use the best, most expensive model for every task. They can route easier tasks to cheaper models, use caching, batch non-urgent workloads, and reserve frontier models for the subset of work that actually needs frontier intelligence. That directly undermines his claim that providers always need to chase frontier intelligence in order to maintain current demand, utilization, and pricing curves.
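To make the segmentation point concrete, here is a minimal sketch of tier routing. Everything in it is an assumption for illustration: the tier names, the per-token prices, the `hard` flag, and the 9-to-1 workload mix are made up and do not describe any real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    usd_per_mtok: float  # hypothetical price per million tokens


# Made-up tiers and prices, for illustration only.
CHEAP = Tier("small-model", 0.50)
FRONTIER = Tier("frontier-model", 15.00)


def route(task: dict) -> Tier:
    """Send a task to the frontier tier only if it is flagged as hard."""
    return FRONTIER if task["hard"] else CHEAP


def cost(tasks: list[dict], always_frontier: bool = False) -> float:
    """Total spend in USD, with or without routing."""
    total = 0.0
    for task in tasks:
        tier = FRONTIER if always_frontier else route(task)
        total += task["tokens"] / 1_000_000 * tier.usd_per_mtok
    return total


# Assumed workload: 9 easy tasks and 1 hard task, 1M tokens each.
tasks = [{"hard": False, "tokens": 1_000_000}] * 9 + [
    {"hard": True, "tokens": 1_000_000}
]

routed = cost(tasks)
unrouted = cost(tasks, always_frontier=True)
print(f"routed: ${routed:.2f}, always frontier: ${unrouted:.2f}")
```

Under these invented numbers the routed workload costs a fraction of the always-frontier one. The real ratio depends entirely on actual prices and on how much of the work genuinely needs the frontier tier.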


I think he doesn't need to understand the technology to point out the books are cooked. a business can sink in either way: the technology flops or the finances flop. he's arguing the /finances/ would flop. he doesn't argue that the /technology/ would flop, only that they can't come up with the money to pay their creditors.

There is a piece of this I agree with. That you do not need to be a deep technical expert to notice that a company is burning cash by overcommitting to capex, or relying on heroic revenue projections that may or may not come to pass.

But that is not the full argument he is making. If the claim is that the labs will not be able to pay their creditors because inference is structurally incapable of becoming profitable, then he absolutely needs to be right about the technical economics of inference.

One part of that is the balance-sheet argument (which already shows insanely good margins). But it also depends on how inference-time compute actually works: routing, batching, kv cache reuse, model segmentation, different latency tiers, etc. Many of those details he's just been straight up wrong about in his writing, so as a result I have to call into question the rest of his reasoning as well (in part to avoid Gell-Mann amnesia).


Doesn't this kinda imply its own smoke and mirrors though? Like if the name of the game with inference is already routing things around and caching so you can make money, why is the newest biggest model always the most important critical thing? How does this square with any of their press about it? Also wouldn't that just add more inference? Because you need to pre-judge every prompt to know where to route it?

Also, if there are significant gains from caching, then like.. what are we even doing here? Inputting something and then reading cached pieces of text based on their similarity to the input? Kinda like a search engine?


> That directly undermines his claim that providers always need to chase frontier intelligence in order to maintain current demand, utilization, and pricing curves.

But does it also not mean that they will make less money given that there is already brutal competition for that lower tier from openrouter, Deepseek, Amazon, etc.?

You can't on the one hand say "customers are beginning to understand they can spend less" and on the other hand suggest that this is good for forecasts of revenue.


> that he doesn't really understand how inference works from a technical perspective

Could you share what tells about it? I.e. where he was wrong about it?


There are examples both in his writing and in his appearances on podcasts, interviews, etc.

I'll cherry pick a couple:

“When these new models ‘reason,’ they break a user’s input and break into component parts, then run inference on each one of those parts.” [1]

This is not at all how test-time compute works. At best, this is a very loose metaphor that he may have used out of convenience. This might sound a bit pedantic to point out, but this is a very basic thing that he's getting wrong (presumably at least, again it could be that he just used a poor metaphor).

A less pedantic example would be his claims related to gpt-5/chatgpt auto-routing. He argued that having a router means OpenAI can no longer cache static prompts, because the user prompt has to come before the hidden instructions [2]. This is just not at all how this works at inference-time. There is no evidence that the standard approach of system>developer>user instruction hierarchy has changed, the public API and caching docs maintain this.

But even more broadly, it suggests he is reasoning about kv/prefix caching at the wrong level of abstraction. It's true that conventional prefix caching does require a stable prefix, so yes, if you literally put variable user content before the static prompt, you would destroy the cacheability of that static prompt.

But that is exactly why inference systems are designed to preserve reusable prefixes where possible (via checkpointing or similar), and why serving systems care so much about prefix caching. This is also a big part of how disaggregated prefill/decode infra works where cache-aware routing is critical. His argument treats a bad prompt layout as if it were a necessary consequence of routing, rather than an avoidable implementation choice.

A router can read the user request, decide which model path to use, and then construct a normal downstream model call with stable static instructions first and user content later. Treating that as impossible implies a fundamental architectural misunderstanding.
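A toy sketch of that last point, assuming nothing about how OpenAI actually implements it: the router looks at the user message first, but the downstream prompt is still built with the static instructions in front. Characters stand in for tokens here, and `pick_model`, the model names, and the length rule are hypothetical.

```python
import os

# Stand-in for a long, unchanging system prompt (hypothetical content).
STATIC_INSTRUCTIONS = "You are a helpful assistant. Follow the policy below. " * 20


def pick_model(user_msg: str) -> str:
    """Toy router: a made-up rule that sends long requests to the bigger model."""
    return "big-model" if len(user_msg) > 200 else "small-model"


def static_first(user_msg: str) -> str:
    """Cache-friendly layout: stable instructions first, variable content last."""
    return STATIC_INSTRUCTIONS + "\nUser: " + user_msg


def user_first(user_msg: str) -> str:
    """Cache-hostile layout: variable content placed before the instructions."""
    return "User: " + user_msg + "\n" + STATIC_INSTRUCTIONS


def shared_prefix_len(a: str, b: str) -> int:
    """Length of the common prefix, i.e. what a prefix cache could reuse."""
    return len(os.path.commonprefix([a, b]))


q1, q2 = "Summarize this article.", "Write a haiku about caching."

# Routing happens before prompt construction and does not change the layout.
calls = [(pick_model(q), static_first(q)) for q in (q1, q2)]

good = shared_prefix_len(static_first(q1), static_first(q2))
bad = shared_prefix_len(user_first(q1), user_first(q2))
print(f"static-first reusable prefix: {good} chars; user-first: {bad} chars")
```

With the static-first layout the entire instruction block is shared across both requests. With the user-first layout only the literal `User: ` label is. Routing itself forces neither layout.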

[1] https://www.wheresyoured.at/how-to-argue-with-an-ai-booster/

[2] https://www.wheresyoured.at/how-does-gpt-5-work/


Productivity is not value. It's quite possible for you to experience productivity improvements, and actual value to not be created. That is what I think the most robust data is showing.

https://unessays.substack.com/p/talk-is-cheap


From an economic perspective productivity is defined as the creation of value, isn't it? Then if you "improve productivity" and do not create value in the end, you're not improving productivity at all.

It does depend on how you define productivity. But the way it's commonly used is "I'm going faster, personally, with these tools."

The thing I think people have a hard time seeing is that "I go faster" does not mean "more features get finished".

It's a scale issue, and one scale is better than the other. People only pay for finished features, they do not pay for how much code you emit.


Economists define productivity as GDP per hour worked. Like a lot of other economic measurements, it's mostly a bogus number people use as an argument for why their politics are better than someone else's politics. You can have an efficient business located in a poor country making the same product at the same quality as that same business in a rich country; the rich country will be more "productive" because the local cost of goods is higher there (i.e. a restaurant in NYC is more "productive" than a restaurant in Bangladesh).
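A quick worked version of the restaurant example, with invented numbers. The meal counts, hours, and prices are assumptions, and revenue per hour is used as a simplification of value added per hour.

```python
def labor_productivity(output_value_usd: float, hours_worked: float) -> float:
    """Measured productivity: value of output divided by hours worked."""
    return output_value_usd / hours_worked


# Same physical output and same labor input in both places (hypothetical).
meals, hours = 1000, 400

# Only the local price per meal differs (hypothetical prices in USD).
nyc = labor_productivity(meals * 30.0, hours)
dhaka = labor_productivity(meals * 3.0, hours)

print(f"NYC: ${nyc:.2f}/hour, Dhaka: ${dhaka:.2f}/hour, ratio: {nyc / dhaka:.0f}x")
```

Identical meals and identical hours give a 10x gap in measured productivity, purely because of the price level.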

Sure. But that's not, in my view, how most people use the word productivity when describing LLM use.

In my field - operations - productivity is usually described as some rate of production for a specific asset. 100 widgets / machine / hour - for example.

"My productivity is 3 PRs / day with the LLM as opposed to 1 PR per every three days". That's how I think people are thinking about it.

My point is that's not the same thing as value. I.e. what people will pay for.


You're correct, I just wanted to add that there is another definition that you may see used online, and it is very specific, and it's important to be aware it's NOT exactly the same thing most normal people mean when they say "productivity".

Productivity is defined as revenue per worker hour. And we know worker hours are going down as there are fewer workers with the layoffs.

Also, supposed productivity gains are dubious. I personally experience at best no productivity gains when using LLMs to write code, and sometimes it's an active drain on my productivity. There was that one study a year or so ago showing similar results. People are trying to say the productivity gains are there and undeniable, but that is not true. It is very much a subject of controversy whether AI helps productivity.

I can see an argument that the productivity gains are illusory / don’t translate to economic productivity. I’m not denying the possibility.

However, most of the engineers I respect have gone from being skeptics a year ago to convinced today. I don’t personally know any true holdouts any more. If there are studies that disprove productivity gains more than six months ago, I’m happy to believe that it was true of the AIs that were available at the time. But I’m going to need something much more recent before I disbelieve my lyin’ eyes where it pertains to the AIs available today.


There is an observational study that was published in March 2026 that followed 4000 teams over 2 years. It shows, in my view, exactly that the productivity gains don't translate into economic value.

Here is the report:

https://www.faros.ai/blog/ai-acceleration-whiplash-takeaways

And my commentary:

https://unessays.substack.com/p/talk-is-cheap


If it was published in March 2026, even if the data was collected up to the day the study was published, 7/8ths of it would fail my “within the last six months” test. But I am looking forward to the results of future studies on this topic!

I get wanting to wait for more data. And thinking that LLMs have improved enough that this will change.

My view is that it's not really about how good the models are - it's about how we're using them. Understanding what you've built is an important part of value creation, and LLMs eliminate that.


It's funny, I've noticed the same thing, but did not come to the same conclusion.

I currently don't have work access to Claude Code, but most of my teammates do. Watching from the outside, the cycle seems to look like this:

1. Experience some success, which hooks you into relying on AI.

2. The AI keeps failing at some task, but you don't want to stop. Keep trying over and over again.

3. Run out of tokens and take a break.

Now, sometimes 1 doesn't happen. Sometimes 2 doesn't happen. 3 is a certainty though.

Now, if you told me that the productivity gain from 1 is enough to offset the loss from 2 and 3, I could believe you. But I also wouldn't be surprised if it didn't.


As I work with Claude more and gain a feel for its capabilities, I tend to run into 2 far less often, as I'll decompose my messages more for the current model limitations. The threshold also changes each release.

I’m going back to being a holdout, but it’s nuanced. My theory of why LLMs don’t lead to the colloquial definition of productivity would be something like: if code was never the bottleneck, then generating code faster doesn’t result in more meaningful output.

Even if you take for granted that AI is as good at writing code as the best people say (and I’ve spent a lot of time generating code, so I won’t disagree), the question becomes: does this change your daily incentives such that you reach for code as the solution to your problems rather than something else (coordinating with your colleagues? Product management? Planning and design?)

So from a holistic perspective, I think intentionally limiting your own AI usage is the best approach for maximum long-term productivity.


I’m not completely closed to your idea but if code was never the bottleneck why did so many organizations always feel so chronically low on coders? And of course this requires the AI to be no help at all with what is actually the bottleneck.

That report doesn't match what faros.ai concludes, which is mostly in a paywalled report.

That's possible, sure. But I think the answer is more likely in the numbers, not in just qualitatively saying AI isn't worth anything. Like if I pay $30k for an ounce of gold, I got value. Gold is worth something. But that amount of gold wasn't worth what I spent.

EDIT: In fact, parent comment has a link to some numbers.

[EDIT: Most] people don't want to go through the numbers. Ok. But there's a history here. When people don't want to see the numbers, certain kinds of things tend to happen.


I've posted numbers that indicate that productivity is becoming decoupled from value delivery. If you follow the link in my comment it reviews a pretty robust study of 4000 teams over 2 years. There is no product throughput increase.

Yep.

Code acceleration is great, but.... something precedes that. Vision and strategy re. expansion of offerings and businesses. Once a firm reaches maturity in what it offers and is only touching the edges - this code acceleration is literally useless when you factor in all of the trade-offs.

This is a good thing - it means fat and slow incumbents are sitting ducks to be out-witted by creative and imaginative founders, which is healthy for a well-functioning economy.

Now the economics of existing frontier models are not sustainable - it's looking like a mix of the airline (supersonic vs subsonic) and EV industry with China in the background providing decent offerings at much lower prices.


I think it's worse than that.

I admit that if a small team or an individual uses an LLM, it's likely they can create value faster.

I think as soon as you don't own the responsibility for the defects you generate with an LLM, their use starts to destroy value. Regardless of product maturity.

This is what I think the data says.

https://unessays.substack.com/p/talk-is-cheap


Yeah this part scares me a little. I imagine it scares everyone who is more than a couple of years out of school. I hear that "the solution to LLM tech debt is more LLM." That might be true, but it might not be.

It scares me too.

I actually think this is precisely the reason LLMs can't be the basis for a technological revolution. Because it's only one way.

Like, if you have a compiler, and it has a bug. You can discover if that bug is influencing your code execution and patch it. You can go both up and down the stack.

With LLMs, there is no way to patch its translation function. You have to rely on it to forward process.

I don't think there is any way to avoid us understanding our tech stacks.


You're not really getting it.

If you are producing something that delivers a far better experience, irrespective of what's under the hood (see Claude Code et al), you will decimate an incumbent who is trying to use LLMs in the context of incrementally improving a mature product.

LLMs are suited for the development of revolutionary innovation, not incremental.


I think we mostly agree.

I think I just disagree about the power of the LLM to deliver revolutionary innovation. That's something you do. Not the machine.

And, pretty soon on your journey to scale, the LLM becomes a hindrance rather than a help.


Interesting data, thanks.

He’s been continuously predicting that the collapse was just around the corner, that progress was slowing, and that there was no market for inference, since 2024.

The fact he’s never reflected on the glaring failures in his analysis tells what we need to know about his intellectual integrity. There’s truth in some of his words about financial risk, but if you can’t acknowledge that there’s upside too, you can’t evaluate risk properly either.

I find it difficult to take him seriously.


Progress is slowing, in an important way.

Have a muck about with what Qwen 3.6 or Gemma 4 can do and you'll see. I mean this as an illustration but Qwen just isn't as far behind as I expected, and compared to the data centre hardware it will run on a potato.

The frontier models are losing their undeniable edge over that which is unmetered.

And even putting aside my optimism for the smaller open weights models, there's a huge amount of scope for the larger, hosted open weights models that are only just behind the cutting edge and which cost, what, 1/25th of the price on opencode go, openrouter etc.

Commodification is coming, and with it slimmer profit margins; it's hard to see them making anywhere near the kind of money they need to in a commodified market.


> progress was slowing

Do you think it's not slowing? Do I miss anything really important?

My understanding is that what we have now is incremental improvement on thinking models, which appeared more than a year ago. Of course, a breakthrough might happen, but I don't see one yet.


The most important thing I would point to is Mythos et al and the wave of vulnerabilities that have been discovered in the past couple months. It’s a completely unprecedented event, brought forth almost entirely by improvements in the models themselves.

That said, keep in mind, I’m talking about over the past two years. With Claude Code and the capabilities gained since December of last year, there have been incredible gains in the capabilities that are now available. Demand for inference is higher now than it was a year ago, because capability has improved. A specific criticism that I would hold is that claiming that progress with LLMs is slowing, prior to that point, is embarrassingly wrong in my view.

One could argue that the model capability improvements are slowing, and all the improvements were in harnesses. I think that’s a stronger argument, but I have a few problems with it.

1. Utility is utility. Whether that comes from the model or the harness is irrelevant when making claims about utility. I don’t think that’s a useful distinction most of the time, but especially when talking about the technology as a whole.

2. Marginal intelligence gain is different than marginal utility gain. It’s estimated that intelligence grows logarithmically relative to investment. However, the utility of a marginally more intelligent model may grow exponentially, because once behavior crosses a reliability threshold, it unlocks new capabilities.

3. Even on those terms, it’s not clear to me that frontier capabilities are slowing down. With Mythos and its contemporaries, we have been seeing a vast change in the security industry as vulnerabilities are discovered at an unprecedented rate. OpenBSD vulnerabilities, more Firefox vulnerabilities found in a single month than the past two years, critical Linux vulnerabilities. It’s hard for me to look at the effects there, radical new capabilities baked into the model itself, and see stagnation.

A part of the reason it might feel like it’s slowing down is because we plebs don’t have access to the top models.

The maintainer of curl - who has access to mythos - disagrees [0].

I think it's dangerous to rely on claims made by people who financially profit from you believing them without checking.

[0]: https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v...


The article says in the second section that the author did not have access to Mythos. I think it’s dangerous to rely on claims made by others without even bothering to read them first, let alone check.

It found hundreds of vulnerabilities in Firefox, according to Mozilla: how does Mozilla benefit? It found a 27 year old vulnerability in OpenBSD. How do they benefit from that? Is that made up? Are the maintainers of those codebases lying for the benefit of Anthropic’s IPO? Is copy fail a fabrication by big AI? The 12 OpenSSL vulnerabilities found in January?

https://venturebeat.com/security/mythos-detection-ceiling-se... https://www.wired.com/story/mozilla-used-anthropics-mythos-t... https://cyberscoop.com/copy-fail-linux-vulnerability-artific... https://www.schneier.com/blog/archives/2026/02/ai-found-twel...

I’m not sure whose claims you think I’m relying on. I trust Firefox that they’re not overstating the number of CVEs they’ve found. Same for OpenSSL. The OpenBSD folks definitely don’t seem like the types. I’ve not known Linux to fabricate CVEs either. I think my sources are fine.


That blog post is very clear about the maintainer having no access to Mythos.

Does it matter that somebody else ran it for him?

When it is explicitly an appeal to authority, and the basis for the authority is incorrect? Feels like it matters.

And presumably the GP thought that saying the maintainer had access to Mythos made it a more compelling argument. Otherwise why even mention it?


Do you have access to Mythos?

Nope. Just watching the volume and severity of CVEs coming through since it’s been running. It’s been a busy few months.

> He’s been continuously predicting that the collapse was just around the corner, that progress was slowing, and that there was no market for inference, since 2024.

Old WSB saying: The market can remain irrational for (far) longer than one can remain solvent.

And unfortunately, a lot of the market on the "buyer" side has been acting irrationally. When you see CEOs telling their employees that they don't care about token cost, only about "how much AI do you use" because that is what the stock market wants to hear - that's when you know we're all getting cooked, the question is how long it takes until the bubble bursts.


anyone that takes him seriously at this point... I don't want to say very bad words here...

>undeniable, massive productivity gains.

How can something so undeniable have zero scientific evidence? Are there any large peer reviewed or meta studies confirming your claim?


It’s a very hard experiment to run. You have a population that’s already “treated”. You can’t blind them to the fact that they’re using AI tools. It’s hard to imagine a study that wouldn’t have serious flaws that people would then use to dismiss and form their own conclusions. Sure you have METR but that was very low n with a very old model.

I think the surest sign of productivity gains is the sheer volume of adoption. If you look beyond headlines, adoption is just incredible. Of course adoption does not necessarily point to productivity gains, but if this was some sort of FOMO or smoke and mirrors you would not see this much retention and this feverish a pace of adoption. You would not see a large segment of the profession using coding agents exclusively. All of these companies track productivity, again with imperfect proxies, yet everything points to a pretty consistent picture. Same with benchmarks, again a lot of crappy benchmarks but a lot of high quality ones too and a very diverse collection of tasks and capabilities they probe.


Sheer volume of adoption is fairly forced though - "use it or you're fired, and tokenmaxx the hell out of it". Most of the people I know outside of tech don't seem to be particularly captured by it, if they use it at all.

Your second paragraph appears to be 3 different instances of saying "X does not necessarily point to productivity gains... but in the case of AI, X definitely means productivity" without really saying why that is true or why other explanations do not fit.

Adoption meaning productivity supposes there are no other dominant factors for the AI push nor AI retention. It is possible for practices to be picked up or continued in spite of causing productivity DROPS. What studies have suggested are factors that make for productive work environments and what is actually enforced in the workplace are different things.


It’s 3 different weak but complementary proxies. We form beliefs from imperfect evidence, and I find these fairly convincing when it’s hard to find any hard evidence of no productivity gains and this is exactly the scenario you would expect under the hypothesis that we do see productivity gains. None of this is supposed to be unassailable. If you disagree, I would challenge you: what evidence do you have for that?

Adoption implying at least some significant productivity gains doesn’t contradict there being other factors. You’re seeing entire companies reshaped. The argument is this is all for show or CEOs are in some sort of idiot class?

“It is possible for practices to be picked up or continued in spite of causing productivity drops” well of course. I just find that incredibly far away from Occam’s razor.

My point is: we have lots of evidence that’s highly consistent with real productivity gains, and I don’t see many pieces of evidence to the contrary.


Because even in a field like software engineering where the output of our work is saved in version control, measuring baseline productivity is hard.

LoC: people argue it’s not what’s important

PRs/day: same as LoC

Getting projects done faster: oh but what about the quality.

Solve the technical problems and actually be more productive, and the social systems built around the old way of doing things will hold you back.

Finishing a PR in 10 minutes doesn’t matter if you’re waiting days for a human review.


I do not disagree with what you are saying, but I honestly still believe that most of the utility we experience is gonna become very boring very soon, something we can just run locally... Even if it's a bit slower, who cares, it can just run in the background while you work on other stuff yourself, read up on things, review other work...

It's not that the utility of it is in question. What is however a giant question mark is how the heck any of the big AI companies are ever gonna get that ROI? Given how many of us are becoming more and more fine with local models that run just fine, especially on a good enough computer, which most developers have anyway...


Even more dangerous to the big 2 AI companies is the fact that the 20 different Chinese companies are catching up fast and for a lot lower cost.

Why should someone pick Opus 4.8 when Qwen3.7 Plus produces similar results for about 1/20th the cost.

That sort of pricing disparity is across the board. But further it's becoming more and more apparent that they are doing more with fewer parameters. That's what's giving the local models their super powers.


Because it doesn't. Not for the tasks where using Opus instead of a lower tier model is appropriate, at any rate. Benchmarks show this, as do revealed preferences of actual users. To believe that Qwen is as capable as Opus at 1/20 the cost you have to believe that every person who does not make the choice to use Qwen over Opus for a given task is some mix of ignorant or delusional. This is certainly an opinion you can hold about other engineers, but it's definitely a questionable one at best.

I find myself rarely reaching for Opus nowadays, it's just too slow. I assume there are tricky use-cases where it's really useful though, just not super relevant for my day to day. I much prefer a faster, "weaker" model.

The benchmarks between the two are close and the engineers that have used both (like myself) can attest that the differences aren't so wide as you might believe.

I'd say that yes, ignorance plays a role here because a decent number of engineers are looking strictly at the benchmarks and choosing Opus just for that reason.

But I'd also say that a major factor for Opus use is because Opus is being purchased for the engineers by their employers. They don't get to pick which models they are using.


Even if we assume that everything you said holds true, how is it that we as a crowd can make viable a service that eats some $300bn annually in infrastructure costs? Where would that money come from? Most tech companies these days are cutting their AI budgets because the per token pricing is killing them.

Cite a real source for that last bit, I don’t think that is true. Also, the budgets should be cut; the spend at some places goes beyond any reasonable amount. The strategy there is to hook everything in and find the right processes, then cut the rest. Things then get better and better with each model release.

The way you make a viable service that eats 300bn annually is to have enough demand to service that. Anthropic underbought compute. That tells you something.



When you say "Things then get better and better with each model release."

How far behind are models that can be run locally, and do you expect that this will be widespread?


He has recently made the very good point that actually, the FAANG companies are struggling to put any ROI numbers on that incredible ground-level utility.

Uber, for example, is so unclear there is any ROI, they are cutting their exposure pretty radically.

He points out that one single Anthropic customer — a payments provider — accidentally had to pay Anthropic $500M for one month of token spend.

That is half what Apple is reportedly paying Google for the supply side of their entire consumer AI strategy.


It doesn't matter under Capitalist Realism, the banks were bailed out, the AI companies will be bailed out, and you will pay for it. There is no alternative.

I'm not sure if they would be bailed out. The government tends to help with bank bailouts as they are essentially the hemoglobin of the economy. I see this being more like the dot-com bubble, where they will just let it fall and have the bigger, more entrenched players pick up the scraps for cents on the dollar.

> undeniable, massive productivity gains.

The jury is still out on that.


Yeah they're very much deniable. Raw LOC/hr is much higher, and so is the speed of putting together an MVP, but I've yet to see any evidence that an LLM is capable of doing anything unsupervised, and if you need a human supervising everything it does... why bother having an LLM in the first place?

Because it can perform much faster? Monitoring allows you to multitask more effectively. I would also disagree that you can’t one shot anything…claims like this are weak and I have enough counter examples in my own life that it’s trivially false. The question is more: can it one shot the right things with a low enough failure rate for it to be a good replacement. It’s hard to figure that out a priori.

They are absolutely deniable. Huge swathes of people deny them.

Agreed that he has an extreme POV (or more accurately that he trolls for views/subscriptions). But his central argument is valid: if AI underdelivers financially, this bubble will burst and this bubble is magnitudes larger than what we've seen before, so there could be very rough seas ahead.

The question is: what does "underdeliver" mean here? The pro-AI arguments I am seeing in this thread are equating mass adoption to agentic coding. Er, I don't know of any trillion dollar cap companies that sell dev tools. The point is Zitron doesn't have to be 100% right for his central prediction to come true.


I don’t get this. We already have an insane demand. And yes exactly, this is primarily just with coding agents, but are you aware of what’s coming down the pipeline? It’s not hard to be; you just have to find a decent way to keep up with the literature.

* robotics (need to close data gap and release first viable product to get a data flywheel)

* conversational ai (no one is ready for this and we’re getting closer and closer to natural speech. The quality still isn’t good enough but it’ll be soon).

* other agentic use cases, openclaw adoption was crazy and that had a ton of barriers to entry

* ai products, like the one OpenAI is working on with Jony Ive

Anyone thinking it’s unreasonable to hit whatever revenue requirements is just not that aware of what’s happening. Not to mention we’re capacity constrained already!! This is barely speculation at this point.


I don't think the issue with robotics is a data gap. maybe somewhat, but the real issues are that:

- RL is extraordinarily sample-inefficient.

- distribution shift/catastrophic forgetting aren't solved. only off-policy learning with giant decorrelated batches works.

- the breakout success of transformers as an architecture doesn't neatly translate to robot motion policy models.

the field is missing fundamental breakthroughs.

I also find it very interesting that conversational AI has taken this long. where are the models with good turn-taking? passive listening? the ability not to respond in paragraphs? has Anthropic simply not gotten around to it?


All of these points are great. The first one motivates world models, which lots of labs work on. Not many people tend to understand the strategic value of those “open world” or interactive generation models: it’s robotics and planning. But you’re right, there are complicated problems to solve and the timeline is not totally clear. But where there’s data and compute, there’s a way.

For conversational AI these labs do have lots of things to do lol but you’re right; it likely also requires some architectural improvements but you see the infancy: look at the llama4 speech duplex model. Very unimpressive yet all of the components are there. Just a matter of pushing on them, licensing and commissioning better data, etc. takes time and compute is stretched thin.


I quite like my mechanical spider from Wild Wild West and the coffee it makes with a 50% success rate

Every day people here debate whether or not there are any actual productivity gains from LLM, and it's only in the limited context of software development. While I understand that this place obviously skews heavily towards the software industry, the notion that LLMs are anywhere near as useful in other industries is hubristic (at best).

Perhaps they aren't, but not currently viable !== always unviable.

Is it really worth it to cause a global economic collapse and harm society's well-being to an unimaginable degree just to find out if it is viable?

Why can't it naturally grow and prove its worth?


Just 5 more years and $500 billion more, bro. We're still so early.

And?

> through undeniable, massive productivity gains.

And where are those? They seem particularly hard to actually observe and only appear in anecdotes.

> I'm trying to believe

For every exponential increase in compute capacity you see linear gains in output accuracy. This is a death spiral. Anyways, you see "massive productivity gains" so why is "belief" a function of your viewpoint?


I really like some good drama slop that reads like a thriller, it is entertaining. I don't take any of it THAT seriously, but lately with the IPOs that are about to hit the indices, he has gained a lot of attention. If you look around the internet, most people publish a negative angle on something and then extrapolate it into some grand conspiracy, which is really captivating. It's crazy when you enter some echo chamber you never engage with (movies, gaming, art/comics) and they have their own headcanon for why the world is bad and collapsing. It puts your echo chamber into perspective to see the same patterns of argumentation and presentation spin out in a different way.

Yes. Zitron has been predicting and begging for collapse since 2024. It's not just his brand at this point. It's his entire identity. As such, he cannot back down, he cannot question himself, and he cannot accept any other viewpoint. And he will keep moving his goal posts until something happens that can make him go "aha! I told you guys!!"

This, combined with his extreme ignorance, makes him unreadable. The only reason people read his stuff is because it validates and confirms their own anti-AI beliefs. It's why every time he publishes an article, it reaches the front page in an hour or less.


> This, combined with his extreme ignorance,

Extreme ignorance?


> undeniable, massive productivity gains

How are they undeniable? They're very deniable. One example is the (seemingly) increasing maintenance costs for AI-generated code[1]. Another is the cost incurred by everybody reading AI slop instead of actual communication.

I don't have hard data as to whether these cancel out the benefits, but it's not as rosy as some seem to think.

[1] After years of people understanding that LOC is not only a poor productivity metric but also a negative indicator of code quality (shorter code for the same thing is better), we now have people touting how many LOC their LLM agent is generating. It's like everyone forgot what LOC actually represents and what it means for long term maintenance costs.


> Zitron is begging for a collapse at this point

No, he's not, he's making tons of money every month from his Substack subscriptions. In fact, the AI bubble popping would be the worst thing ever for him, he would be out of a job.

Just like those who have predicted the US dollar will collapse any-moment-now and have pushed gold for decades.

Funny how people always say "oh, you are an AI lab, of course you are going to hype AI", but never "oh, you make sooo much money from predicting the collapse of the AI bubble..."


> undeniable, massive productivity gains.

Just because you keep repeating something doesn't make it an undeniable truth.


[flagged]


i don't think this comment contributes much to the discussion. can you elaborate more than saying "no"?

Thank you for making this whoever you are. There is a wonderful video at https://www.youtube.com/watch?v=8FT-oz9aZU4 that visualizes space travel and time dilation in Hail Mary. What I wished I had immediately after watching it was an interactive stellar chart.


You're welcome! I love that channel so much. Their videos and the blog post I link in the about section/citations on the starmap were inspirations for making this.


The primary issue here is that CEOs and investors are particularly vulnerable to AI psychosis which is then forcibly propagated to the rest of the organization. Understandably, the perceived benefits are almost impossible to ignore, compounded by the FOMO of the AI first/AI native narrative being sold by AI influencers.


This always blows my mind. We are currently breathing in the DNA of the trees, animals, and people around us—and we’re leaving ours behind for them, too. We’re all one big genetic soup.


> This always blows my mind. We are currently breathing in the DNA of the trees,

At this time of year, believe me, I am aware of the inhaled tree DNA setting off my pollen allergies.


The world of the very small is a very strange place https://en.wikipedia.org/wiki/Disappearing_polymorph

We can't make a certain HIV drug anymore because after 2 years, a lower energy crystal state formed, and that state isn't as effective. Now anytime we try to synthesize the drug it finds a molecule of the lower energy crystal state which causes the crystal to also form in the lower energy state.


"Soup" is a good word. Pieces of DNA resulting from destruction by nucleases and other enzymes.


The immune system destroys all the DNA in unexpected places in case it's a https://en.wikipedia.org/wiki/Viroid or something. Better safe than sorry.

One of the important steps in mRNA vaccines was to surround the mRNA with a lipid to ensure it can survive long enough to enter a cell. Naked mRNA would not have worked.


One has to wonder whether destroy is all it does though. Analyzing this as cues about the surroundings seems like it could be pretty useful for successful living, and something evolution could well pick up on. Will we find eventually that some of those nucleic acid fragments were being hauled off for identification in something like an extra inner sense of smell?


No need to wonder if you study the lymphatic system.


> No need to wonder if you study the lymphatic system.

Does it mean yes or no? I think "no", but IANAMD; IANAB, ... I think it only identifies proteins, glycoproteins, and other stuff that is on the surface of the cells/viruses but not the DNA/RNA.


Maybe so, but also maybe not quite my point, unless you know something I don't about it.

Sure, some samples will be off to antigen presentation, but does that inform more than this is an encountered foreign substance and this is how to bind to it for neutralization ? Seems like in principle you could take the overview picture and have something like olfaction, but for things that hadn't been sufficiently cracked open when they passed the olfactory epithelium. Maybe it's starting to be sorted out, but I'm not up to date on what the neural feedback from the immune system carries.


If the quantities are too low for the nose to pick them up I think nature has converged on "too noisy to bother with".


Are the quantities too small though? Really foul-smelling small molecules can be sensed at least down to ppb concentrations. And the recent technical use of "eDNA" demonstrates there's signal to be had.


Not just RNA, but DNA as well.


It seems like the whole world could massively benefit from this much like the other great innovation out of the EU -- the Common Charger Directive (aka USB-C).


I’m running local models with a maxed out M4 but I find local models only useful and reliable for trivial tasks and sensitive items like database optimization work. Local LLMs just don’t come anywhere close to Claude or Codex for heavy work.


Just when you think Egyptology can't get more interesting, it does. No wonder "just a quick search about the Pyramids" turns into a lifelong obsession for many.


I thought Grok in the car was awesome until it went off on a tangent and started praising Elon.


The average price of a new car in the US is now ~$50,000 and the average monthly payment is almost $800. All people want is an affordable car and it is clear that won’t happen any time soon. It isn’t strange at all that prisoners to this system are cheering on the Chinese disruption.


AI Slop or not, these doomer articles have more than a grain of truth and you as a knowledge worker know “Something is happening”.

