> For all that, GPT-5 is not a terrible model. I played with it for about an hour, and it actually got several of my initial queries right (some initial problems with counting “r”s in blueberries had already been corrected, for example). It only fell apart altogether when I experimented with images.
Spatial reasoning and world modeling are one aspect. Posting bicycle-part memes does not a bad model make. The reality is it's cheaper than Sonnet and maybe about as good as Opus on a decent number of tasks.
> And, crucially, the failure to generalize adequately outside distribution tells us why all the dozens of shots on goal at building “GPT-5 level models” keep missing their target. It’s not an accident. That failing is principled.
This keeps happening lately. So many people want to take a biblically black-and-white stance on whether LLMs can reach human-level intelligence. See the recent interview with Yann LeCun (Meta's Chief AI Scientist): https://www.youtube.com/watch?v=4__gg83s_Do
Nobody has any fucking idea. It might take a hybrid or a different architecture than current transformers, but given the rate of progress in this field, there is absolutely no way to confidently predict that scaling laws won't just let LLMs outpace the negative hot takes.