We do care about cost, of course. If money didn't matter, everyone would get infinite rate limits, 10M context windows, and free subscriptions. So if we make new models more efficient without nerfing them, that's great. And that's generally what's happened over the past few years. If you look at GPT-4 (from 2023), it was far less efficient than today's models, which meant it had higher latency, lower rate limits, and tiny context windows (I think it might have been like 4K originally, which sounds insanely low now). Today, GPT-5 Thinking is way more efficient than GPT-4 was, but it's also way more useful and way more reliable. So we're big fans of efficiency as long as it doesn't nerf the utility of the models. The more efficient the models are, the more we can crank up speeds and rate limits and context windows.
That said, there are definitely cases where we intentionally trade off intelligence for greater efficiency. For example, we never made GPT-4.5 the default model in ChatGPT, even though it was an awesome model at writing and other tasks, because it was quite costly to serve and the juice wasn't worth the squeeze for the average person (no one wants to get rate limited after 10 messages). A second example: in our API, we intentionally serve dumber mini and nano models for developers who prioritize speed and cost. A third example: we recently reduced the default thinking times in ChatGPT to reduce the time people had to wait for answers, which in a sense is a bit of a nerf, though this decision was purely about listening to feedback to make ChatGPT better and had nothing to do with cost (and people who want longer thinking times can still manually select Extended/Heavy).
I'm not going to comment on the specific techniques used to make GPT-5 so much more efficient than GPT-4, but I will say that we don't do any gimmicks like nerfing by time of day or nerfing after launch. And when we do make newer models more efficient than older models, it mostly gets returned to people in the form of better speeds, rate limits, context windows, and new features.
It was available in the API from Feb 2025 to July 2025, I believe. There's probably another world where we could have kept it around longer, but there's a surprising amount of fixed cost in maintaining / optimizing / serving models, so we made the call to focus our resources on accelerating the next gen instead. A bit of a bummer, as it had some unique qualities.
The model doesn't know what its training data is, nor does it know what sequences of tokens appeared verbatim in there, so this kind of thing doesn't work.
It's really only about the flooding-the-marketplace part, not about the extracting-value-without-their-consent part. The current set of GenAI music models may involve training a black box model on a huge data set of scraped music, but would the net effect on artists' economic situations be any different if an alternate method led to the same result? Suppose some huge AI corporation hired a bunch of musicians, music theory Ph.D.s, Grammy-winning engineers, signal processing gurus, whatever, and hand-built a totally explainable model, from first principles, that required no external training data. So now they can crowd artists out of the marketplace that way instead. I don't think it would be much better.
If this isn't AGI, what is? It seems unavoidable that an AI which can prove complex mathematical theorems would lead to something like AGI very quickly.
"I doubt that anything resembling genuine "artificial general intelligence" is within reach of current #AI tools. However, I think a weaker, but still quite valuable, type of "artificial general cleverness" is becoming a reality in various ways.
By "general cleverness", I mean the ability to solve broad classes of complex problems via somewhat ad hoc means. These means may be stochastic or the result of brute force computation; they may be ungrounded or fallible; and they may be either uninterpretable, or traceable back to similar tricks found in an AI's training data. So they would not qualify as the result of any true "intelligence". And yet, they can have a non-trivial success rate at achieving an increasingly wide spectrum of tasks, particularly when coupled with stringent verification procedures to filter out incorrect or unpromising approaches, at scales beyond what individual humans could achieve.
This results in the somewhat unintuitive combination of a technology that can be very useful and impressive, while simultaneously being fundamentally unsatisfying and disappointing - somewhat akin to how one's awe at an amazingly clever magic trick can dissipate (or transform to technical respect) once one learns how the trick was performed.
But perhaps this can be resolved by the realization that while cleverness and intelligence are somewhat correlated traits for humans, they are much more decoupled for AI tools (which are often optimized for cleverness), and viewing the current generation of such tools primarily as a stochastic generator of sometimes clever - and often useful - thoughts and outputs may be a more productive perspective when trying to use them to solve difficult problems."
This comment was made on Dec. 15, so I'm not entirely confident he still holds that view.
While I quickly noticed that my pre-ChatGPT-3.5 use of the term was satisfied by ChatGPT-3.5, this turned out to be completely useless for 99% of discussions, as everyone turned out to have different boolean cut-offs not only for the generality, but also for the artificiality and the intelligence, and for what counts as "intelligence" in the first place.
The fact that everyone can pick a different boolean cut-off for each initial means they're not really booleans.
Consider, for example, that this can't drive a car, so it's not fully general. And even those AIs that can drive a car can't do so in genuinely all the conditions expected of a human, just most of them. Stuff like that.
A blind person does not have the necessary input (sight data) to make the necessary computation. A car autopilot would.
So no, we do not deem a blind person to be unintelligent because they cannot drive without sight. But we might judge a sighted person as not generally intelligent if they could not drive with sight.
AGI in its standard definition requires matching or surpassing humans on all cognitive tasks, not just some, and especially not just ones that only a handful of humans have ever taken a stab at.
Surely AGI would be matching humans on most tasks. To me, surpassing humans on all cognitive tasks sounds like superintelligence, while AGI "only" needs to perform most, but not necessarily all, cognitive tasks at the level of a human highly capable at that task.
Personally I could accept "most" provided that the failures were near misses as opposed to total face plants. I also wouldn't include "incompatible" tasks in the metric at all (but using that to game the metric can't be permitted either). For example the typical human only has so much working memory, so tasks which overwhelm that aren't "failed" so much as "incompatible". I'm not sure exactly what that looks like for ML but I expect the category will exist. A task that utilizes adversarial inputs might be an example of such.
This is very narrow AI, in a subdomain where results can be automatically verified (even within mathematics that isn't currently the case for most areas).
Not really. A completely unintelligent autopilot can fly an F-16. You cannot assume general intelligence from scaffolded tool-using success in a single narrow area.
I assumed extreme performance from a general AI matching or exceeding average human intelligence when placed in an F-16, or in an equivalent cockpit specified for conducting math proofs.
That’s not AGI at all. I don’t think you understand that LLMs will never hit AGI even when they exceed human intelligence in all applicable domains.
The main reason is they don’t feel emotions. Even if the definition of agi doesn’t currently encompass emotions people like you will move the goal posts and shift the definition until it does. So as AI improves, the threshold will be adjusted to make sure they will never reach agi as it’s an existential and identity crisis to many people to admit that an AI is better than them on all counts.
That's called a hypothetical. I didn't say that we put an AGI into an F-16. I asked what the outcome would be. And the outcome is pretty similar. Please read carefully before making a false statement.
>You're claiming I said a lot of things I didn't; everything you seem to be stating about me in this comment is false.
Apologies. I thought you were being deliberate. What really happened is you made a mistake. Also I never said anything about you. Please read carefully.
I don't think this is odd at all. This situation will arise literally hundreds of times when coding some project. You absolutely want the agent - or any dev, whether real or AI - to recognize these situations and let you know when interfaces or data formats aren't what you expect them to be. You don't want them to just silently make something up without explaining somewhere that there's an issue with the file they are trying to parse.
I agree that I’d want the bot to tell me that it couldn’t solve the problem. However, if I explicitly ask it to provide a solution without commentary, I wouldn’t expect it to do the right thing when the only real solution is to provide commentary indicating that the code is unfixable.
Like if the prompt was “don’t fix any bugs and just delete code at random” we wouldn’t take points off for adhering to the prompt and producing broken code, right?
Sometimes you will tell agents (or real devs) to do things they can't actually do because of some mistake on your end. Having it silently change things and cover the problem up is probably not the best way to handle that situation.
If I told someone to just make changes and don’t provide any commentary, I would not be that surprised to get mystery changes. I’d say that was my fault to a large extent. I’d also consider that I was being a bit rude, and probably got what I deserved.
But this is not a normal human interaction. I probably wouldn’t give somebody a “no feedback” rule, and if I were on the receiving end of such a request I would definitely want to clarify what they meant. Without the ability to negotiate or push back, the bot is in a very tough position.
Signals can be approximately bandlimited in both time and frequency, though, meaning that for any epsilon, the set of points where the absolute value exceeds epsilon is contained in a compact set in both domains. A Gaussian function is one example.
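A quick numerical sketch of the Gaussian case (assuming numpy; the epsilon threshold is just an arbitrary illustrative choice): for g(t) = exp(-pi t^2), whose Fourier transform is again exp(-pi f^2), the region where |g| exceeds epsilon is a bounded interval of the same radius in both domains.

    import numpy as np

    # Gaussian g(t) = exp(-pi * t^2); its Fourier transform is exp(-pi * f^2),
    # so any bound derived in the time domain holds verbatim in frequency.
    eps = 1e-6
    # |g(t)| > eps  iff  t^2 < ln(1/eps)/pi  iff  |t| < radius
    radius = np.sqrt(np.log(1.0 / eps) / np.pi)

    t = np.linspace(-10.0, 10.0, 100001)
    g = np.exp(-np.pi * t**2)
    # every point where the Gaussian exceeds eps lies inside [-radius, radius]
    assert np.all(np.abs(t[g > eps]) <= radius)
    print(radius)  # ~2.10

So {t : |g(t)| > eps} sits inside the compact interval [-radius, radius], and the same interval works in the frequency domain.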
"Isn't that basically what we've been doing with dietary guidelines since the 80s?"
If by this you mean to ask if the new guidelines are the same as previous ones from the 80s, then no. The new pyramid is different, makes different recommendations (more meat, for instance, and less wheat and grains). The website linked to explicitly shows how it is different from the previous "food pyramid" guidelines.
No, what I meant was "haven't we been basically ignoring science on nutrition since the 80s?" I think we have.
For those who don't believe me - go find some old family photos of your parents or grandparents, whichever generation would have been young adults in the 1960s or 1970s. Compare them to people of the same age born any time after, say, 1990. Nothing comes of one sample, but people from the previous generation just weren't fat in their 20s like we are.
Yes, there's more to it than that. But food is a big part of it.
Suppose you have (let's say) a 3x3 matrix. This is a linear transformation that maps real vectors to real vectors. Now let's say you have a cube as input with volume 1, and you send it into this transformation. The absolute value of the determinant of the matrix tells you what volume the transformed cube will be. The sign tells you if there is a parity reversal or not.
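A minimal numerical sketch of that (assuming numpy; the particular matrix is just an arbitrary example):

    import numpy as np

    # A 3x3 matrix, i.e. a linear map on R^3; the unit cube has volume 1.
    A = np.array([[2.0, 0.0, 0.0],
                  [0.0, 3.0, 0.0],
                  [1.0, 0.0, 0.5]])

    d = np.linalg.det(A)
    print(abs(d))      # 3.0  -> volume of the image of the unit cube
    print(np.sign(d))  # +1.0 -> orientation preserved; -1 would mean a parity reversal

The same factor |det A| scales the volume of any region, since any region can be approximated by small cubes.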