Always hard to say for sure because I'm not sitting around running the exact same situations through both models in parallel to compare them.
It feels like you can give it a big chunky problem and leave it alone and it gets it done, with fewer questions and fewer design decisions that I wouldn't have made.
In reviewing its code I'm finding less to complain about than with Opus. But it's all vibes, if you want a more scientific comparison you'll have to look elsewhere.
He has early access to Anthropic models, so of course he will hype them up: that way they keep sharing preview models with him (and he gets more traffic to his website). It also doesn't require him to perform any rigorous analysis of model performance, just to share how it feels:
> But it's all vibes, if you want a more scientific comparison you'll have to look elsewhere.
I gave it a complete database migration of our app. Opus failed hard each time... untyped JSONB for some rows, no proper normalisation, falling back to asking me questions in between.
Fable just did it: clean code, one timeout with a hanging bash script, and it fixed a couple of very old, very structural bugs in the codebase.