> For my use cases it doesn't appear to be any better than o3/o3-pro.
It doesn't have to be better; it has to be "as close to those as possible" while being cost-efficient to run and serve at scale.
> This probably means that we're in the "slow AI / stagnation" timeline
I'd say we're more in the "ok, we got the capabilities in huge models, now we need to make them smaller, faster, and more scalable" timeline. If they capture ~80-90% of the capabilities of their strongest models while cutting costs substantially (they've gone from $40-60/Mtok to $10/Mtok), then they're starting to approach a break-even point and can slowly make money off serving tokens.
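For a sense of what "break-even" means here, a back-of-the-envelope sketch. Only the prices come from the comment above; the serving costs are invented for illustration, since the real numbers aren't public:

```python
# Back-of-the-envelope margin math. Serving costs below are
# illustrative guesses, not OpenAI's actual numbers.
old_price = 50.0   # $/Mtok, rough midpoint of the old $40-60 range
new_price = 10.0   # $/Mtok, current pricing

# Hypothetical cost to serve a million tokens (GPUs, power, overhead):
old_cost = 55.0    # assumed: huge frontier model served at a loss
new_cost = 7.0     # assumed: smaller distilled model

print(f"old margin: {old_price - old_cost:+.0f} $/Mtok")  # old margin: -5 $/Mtok
print(f"new margin: {new_price - new_cost:+.0f} $/Mtok")  # new margin: +3 $/Mtok
```

If a distilled model keeps most of the capability at a fraction of the serving cost, you can charge a fifth of the old price and still flip the margin from negative to positive.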
There's also a move towards having specialised models (code with Claude, long context with Gemini, etc.), and OpenAI seem to have gone in this direction as well. They've said for a long time that GPT-5 would be a "systems" update and not necessarily a core-model update. That is, they have a routing model that takes a query and routes it to the best model for the task. Once devs figure out how to use this to their advantage, the "vibes" will improve.
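Nobody outside OpenAI knows what that router actually looks like, but the basic idea is a cheap classifier dispatching to specialised backends. A toy sketch (the heuristics, categories, and model names are all made up):

```python
# Toy sketch of query routing: a cheap classifier picks which backend
# model handles each query. Everything here is hypothetical; this is
# not OpenAI's actual implementation.

def classify(query: str) -> str:
    """Cheap heuristic stand-in for a small routing model."""
    q = query.lower()
    if any(kw in q for kw in ("def ", "stack trace", "compile", "bug")):
        return "code"
    if len(query) > 20_000:
        return "long_context"
    if any(kw in q for kw in ("prove", "step by step", "why")):
        return "reasoning"
    return "general"

ROUTES = {
    "code": "gpt-5-code",          # hypothetical model names
    "long_context": "gpt-5-long",
    "reasoning": "gpt-5-thinking",
    "general": "gpt-5-mini",
}

def route(query: str) -> str:
    return ROUTES[classify(query)]

print(route("why does this compile error happen?"))  # -> gpt-5-code
```

The upside is that cheap queries never touch the expensive model; the downside, and likely the source of the mixed "vibes", is that a misroute silently gives you a weaker model's answer.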
Yeah. In my particular narrow field of expertise -- which, broadly speaking, is a subset of mechanical engineering -- I can say with certainty that it's nowhere near PhD-level, and it's not perceptibly smarter than o3-pro. Maybe it's slightly smarter, but honestly it's hard to tell.
I can confirm in head-to-head tests that Kimi is a far better prose stylist. I asked both models to write me a poem in the style of Ezra Pound's Canto I. Kimi's one-shot result was excellent: an "A+" effort at any school in the world, and genuinely professional-poet-tier. (Rather frightening, that.) GPT-5 Pro's result was a disaster so bad it verged on parody, and to add insult to injury it sometimes plagiarized Pound's Canto I word-for-word.
I will say that GPT-5 seems a little bit more imaginative and inventive than previous models. It seems slightly better than all other models at formal logic, to the extent that it's a superhuman analytic philosopher. It's also better and faster at searching the web, and it's a little bit more circumspect than usual about the results it shares. (It doesn't blindly promote every product.)
Ultimately this seems like a very incremental upgrade, but it's an upgrade nonetheless.
Also, GPT-5 Pro is a lot slower than o3-pro. My two most recent queries took 17 and 18 minutes, whereas o3-pro would probably have taken 4-5.
Surprisingly, at generic creative-writing tasks (e.g. composing a poem in the style of X), GPT-5 is still noticeably inferior to DeepSeek and Kimi.
Honestly, I'm tempted to cancel my $200/month subscription.
This probably means that we're in the "slow AI / stagnation" timeline, so at least we're not going to get paperclipped by 2027.