
civvv · 2026-05-27T13:33:54 1779888834

>Or as Jean Baudrillard has said:

It is nothing short of profoundly ironic to quote Jean Baudrillard in this context.

robocat · 2026-05-27T17:12:51 1779901971

That is an anglicized version of the meme. I prefer the original:

« Le contexte réel s’effondre au moment même où l’on me cite. » ("The real context collapses at the very moment I am quoted.")

civvv · 2026-05-21T23:03:47 1779404627

Yarvin is a basement dweller and 4chan intellectual, high on his own supply of pseudo-intellectual takes. What is sad and worrying is that these kinds of politics are increasingly moving out of the fringe internet and into pockets of power (e.g. Thiel and Vance). It is problematic that these ideas now linger only one or two steps away from the most powerful and influential person in the world.

jaybrendansmith · 2026-05-22T07:00:58 1779433258

Increasingly moving out? They've taken over the entire government. We are well on our way to a dystopia. Nice job, everyone.

civvv · 2026-05-09T08:54:39 1778316879

There are many indications that model progress is slowing down, so that is not entirely accurate.

aspenmartin · 2026-05-09T11:45:30 1778327130

Please be specific, because outside of anecdotal blog posts by people who don’t know what they’re talking about, it’s not true. Look at scaling laws and composite benchmarks such as the Epoch Capabilities Index: nothing at all suggests “model progress is slowing down”.

StrauXX · 2026-05-09T09:21:23 1778318483

Which indications are those?

nicoburns · 2026-05-09T12:14:14 1778328854

The cost factors on the new models compared to the old models.

jeremyjh · 2026-05-09T13:33:03 1778333583

Qwen3.6 9B is as good as GPT-4o and runs on my M2 MacBook Air. Models are getting stronger and less costly at the same time, but these are somewhat separate branches of research. Frontier labs are spending more because they are still getting marginal returns and there is more capacity to spend than there was a year ago.

gertop · 2026-05-09T14:31:30 1778337090

Qwen 3.6 9B doesn't exist.

If you meant 3.5 9B and you truly believe it's as good as 4o then I can only assume you have a very basic use case.

jeremyjh · 2026-05-09T20:54:57 1778360097

You are right, I was mistaken about the version. I evaluated it in general chat assistant prompts plucked from my history across a range of topics but did not use it for coding - there was never a time when I thought 4o was “good enough” for agentic coding.

bdelmas · 2026-05-09T13:11:23 1778332283

You are conflating cost and progress. Rising costs do not by themselves mean that progress is slowing down.

nicoburns · 2026-05-09T13:34:46 1778333686

They are intrinsically linked beyond a certain point. If we're making progress but costs are spiraling exponentially then it stands to reason that we will soon reach a point where we can no longer afford the increasing costs and thus progress will slow.

(barring some breakthrough that reduces costs, which of course may happen, but recent model improvements are not strong evidence for it)

aspenmartin · 2026-05-09T14:30:44 1778337044

The cost for a specific level of performance decreases 10x per year. This has been a pretty consistent property for a while now.
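
To make the arithmetic behind that claim concrete, here is a minimal sketch. It assumes the 10x yearly decline holds exactly and uses a made-up starting cost of 100 (arbitrary units); the function name is hypothetical and none of the numbers are measurements.

```python
def cost_at_fixed_performance(initial_cost: float, years: float,
                              yearly_factor: float = 10.0) -> float:
    """Cost of reaching the same performance level after `years`.

    Assumption: the cost falls by `yearly_factor` every year
    (the 10x figure quoted above), compounding smoothly.
    """
    return initial_cost / (yearly_factor ** years)


if __name__ == "__main__":
    # Hypothetical starting cost of 100 units for some fixed capability level.
    for years in range(4):
        print(years, cost_at_fixed_performance(100.0, years))
```

Under that assumption, a capability that costs 100 units today costs 1 unit two years later, which is why spending at the frontier and the cost of a fixed capability level can move in opposite directions.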

butlike · 2026-05-12T21:25:03 1778621103

I guess within the domain of AI, a pertinent question would be: "do I want to use anything but the best?" In my eyes, the errors older models make are directly analogous to being stupider.

aspenmartin · 2026-05-13T02:37:29 1778639849

Depends. Many tasks in various pipelines have a reasonable Pareto frontier and diminishing returns beyond a certain level of performance. You may simply have a tight budget (say, YouTube computing ASR subtitles; they are not going to use the best ASR models because that would be expensive). If it’s me with a coding agent, I’m going to get the best thing I can afford.

overfeed · 2026-05-09T09:25:10 1778318710

Investment dollars.

dzhiurgis · 2026-05-09T10:42:10 1778323330

Source for that claim?

lionkor · 2026-05-09T10:25:41 1778322341

Nobody is releasing NEW models

aspenmartin · 2026-05-09T11:54:50 1778327690

…not only is this not true but it also doesn’t matter. Why would this indicate performance saturating?

taneq · 2026-05-09T10:59:05 1778324345

The standard networking connection has been called “Ethernet” for more than thirty years, so networking has stagnated, right?

SlinkyOnStairs · 2026-05-09T11:31:57 1778326317

If higher-bandwidth networking consisted primarily of running more and more Ethernet lines in parallel, you would most certainly agree that "networking has stagnated".

"Reasoning" and now "Agentic" AI systems are not some fundamental improvement on LLMs; they're just running roughly the same prior-gen LLMs multiple times.

Hence the conclusion that LLM improvement has slowed down, if not stagnated entirely, and that we should not expect the improvements of switching to these "reasoning" systems to keep happening.

p1esk · 2026-05-09T12:24:28 1778329468

From TFA:

“ChatGPT came up with an idea which is original and clever. It is the sort of idea I would be very proud to come up with after a week or two of pondering, and it took ChatGPT less than an hour to find and prove”

SlinkyOnStairs · 2026-05-09T12:33:16 1778329996

You misunderstand. I'm not saying that Reasoning/Agentic systems aren't better.

I'm saying they're not an advancement in the tech in the way GPT 1 through 3 were. They're a different kind of improvement.

And as such the rate of improvement cannot just be extrapolated into the future.

p1esk · 2026-05-09T12:55:05 1778331305

The advancements from GPT1 through GPT3 were exactly like using more Ethernet cables in parallel.

All interesting conceptual breakthroughs came after GPT3: RL and reasoning being the main ones.

kstenerud · 2026-05-09T11:49:07 1778327347

What constitutes a NEW model for the purposes of calculating progress?

GardenLetter27 · 2026-05-09T12:07:18 1778328438

What? DeepSeekV3 just came out and is incredible for the price. Mythos is also half-released.

nozzlegear · 2026-05-09T14:24:02 1778336642

Until you or I can actually use Mythos in Claude without an NDA or other strings attached, Mythos is not released and is just an effective marketing tool for Anthropic.

pixl97 · 2026-05-10T16:50:05 1778431805

At least to me this is a pretty sour grapes take. There are all kinds of released products that are expensive or need an NDA. You're just too poor to afford it. But make no mistake, there are governments using this en masse and likely against you.

nxobject · 2026-05-10T22:50:18 1778453418

I think that’s worthy of at least sour grapes, too.

CuriouslyC · 2026-05-09T13:50:41 1778334641

Model progress at spitting out unhallucinated facts is slowing down hard. Model progress at solving hard math challenges/programming tasks doesn't seem to be slowing down, as far as I can tell.

civvv · 2026-05-08T22:09:06 1778278146

Three of my favourite game series as a kid, what a legend.

wrqvrwvq · 2026-05-09T05:39:47 1778305187

unc's thrashing out

civvv · 2026-05-06T20:42:49 1778100169

Not sure about their other departments, but at least in March there was no LLM coding in their Source2 & game teams.

https://x.com/ZPostFacto/status/2035784300575305895

cleaning · 2026-05-07T01:10:59 1778116259

Probably won't stay that way:

https://x.com/ZPostFacto/status/2050780692062376287

account42 · 2026-05-07T10:27:43 1778149663

Looking forward to finally getting HL3 /s

civvv · 2026-04-26T18:19:56 1777227596

This is likely true. I think model quality has stagnated and that it's likely a non-trivial task to find a new improvement vector. Scaling the width of the model (which has been the driving force behind the speed of improvement thus far) seems to have reached its limit.

It will be interesting to see the implications of this. Tooling can only do so much in the long term.

mxwsn · 2026-04-26T18:26:35 1777227995

How do you know that width scaling has been the driving force of improvement?

civvv · 2026-04-27T10:50:58 1777287058

I am no insider and have never even tried to build an LLM, so I can only guess. But the general sentiment seems to be that this is the case. If you are interested, I would recommend you read the MIT paper "Superposition Yields Robust Neural Scaling" [0]. It confirms an interesting trend: models represent more features/concepts than they have clean independent dimensions, so features overlap. Increasing model dimension reduces this geometric interference, which lowers loss in a predictable way, but with diminishing returns.

This has, in my opinion, likely been the primary vector in getting better models thus far, but the MIT paper mathematically proves that it yields diminishing returns for each new dimension added. It will get more and more expensive, and the cost-to-return ratio will make it infeasible, or probably already has.
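
As a rough illustration of what diminishing returns from width looks like, here is a toy sketch. It assumes, as a simplification, that loss is inversely proportional to model width; the constant `c`, the widths, and the function names are all made up for illustration and are not taken from the paper.

```python
def loss(width: int, c: float = 1.0) -> float:
    """Toy loss model. Assumption: loss = c / width."""
    return c / width


def gain_from_doubling(width: int, c: float = 1.0) -> float:
    """Loss reduction obtained by doubling the width, under the toy model."""
    return loss(width, c) - loss(2 * width, c)


if __name__ == "__main__":
    # Hypothetical widths; each doubling buys half the improvement
    # of the previous one while (roughly) doubling the parameters.
    for width in (512, 1024, 2048, 4096):
        print(width, gain_from_doubling(width))
```

In this toy model every doubling of width halves the improvement you get, while the cost of the wider model keeps growing, which is the cost-versus-return squeeze described above.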

Ilya appears to support this sentiment as well. [1]

[0] - https://openreview.net/forum?id=knPz7gtjPW

[1] - https://www.businessinsider.com/openai-cofounder-ilya-sutske...

waterTanuki · 2026-04-26T23:19:10 1777245550

I mean, it's not exactly a PhD-level question. One can infer from the extreme demand for GPUs and DRAM + new data center construction that all the providers are banking on width.

svnt · 2026-04-27T03:54:15 1777262055

No? That could just be fomo, actual adoption, or a number of other things.

civvv · 2026-04-21T22:03:56 1776809036

Do you understand how LLMs work and that they are always behind in their knowledge? Unless Claude does a network call to check its own website, it will give you outdated information. It's a prediction model; it's not magic.

civvv · 2026-04-21T21:53:13 1776808393

If you scroll down, you can clearly see that the Pro plan has an "x" on Claude Code now.

civvv · 2026-04-21T21:50:07 1776808207

Does this mean that for enterprises using per-seat pricing, only the $100 premium seat gets access to Claude Code?

vict7 · 2026-04-21T22:18:25 1776809905

The Team plan still shows “Claude code” in a main bullet point, which would indicate it is part of the Team plan regardless of whether it has premium seats.

But it seems this is all in a state of flux.

And there’s the lovely asterisk at the bottom:

> Prices and plans are subject to change at Anthropic's discretion.

conception · 2026-04-22T01:20:12 1776820812

Enterprise doesn’t have premium. Just API usage.

Business accounts are like max 6x accounts.

civvv · 2026-04-19T22:58:46 1776639526

You’re generalizing too much here. One of the biggest problems with LLMs today is in fact that they are not at the level being advertised. This is not solely a case of regulation standing in the way of a «revolution».