My friends have been heating their apartments in the winter by mining cryptocurrencies. They're not into crypto, in the sense that they don't do it in the summer; it just helps offset the cost in rentals without heat pumps. They're gamers who've already purchased the GPUs.
My experience is very different from yours. Codex and CC yield very different results, both because of the harness differences and the model differences, but neither is noticeably better than the other.
Personally, I like Codex better just because I don't have to mess with any sort of planning mode. If I imply that it shouldn't change code yet, it doesn't. CC is too impatient to get started.
I guess yes, that's a harness difference, and you can also configure CC as a harness to behave very differently. But even with the same harness and guidance, to me there's still a difference between Opus 4.6 and e.g. GPT 5.4 (or which GPT model do you use?). I've been using Claude Code, Codex and OpenCode as harnesses presently, but for serious long-running implementation I feel like I can only really rely on CC + Opus 4.6.
I came from Cursor before adopting the TUI tools. Opus was nothing short of pathetic in that environment compared to the -codex models. I would only use it for investigations and planning, because it was faster.
Like you've said, though, that could just be a harness issue.
I have the opposite experience. Codex gets to work much faster than Claude Code. Also I've never seen the need to use planning mode for Claude. If it thinks it needs a plan it will make one automatically.
I developed one for a specific personal research topic. Once I answered my question, the initiative petered out.
I've considered starting another based on the idea of getting high off knowledge. I don't see the point as an information store, but as a toy it makes sense; use it to spark curiosity, make neat connections, etc.
Even the article acknowledges that the model used to do this test is flawed.
Google's AI Overview is incredible. It's an instant correct answer for self-verifying questions, and it's right most of the time for reasonably complex questions. If the first page of results would contain the answer to your question, and your question can be answered with only one prompt, it's right almost every time.
> If the first page of results would contain the answer to your question
You can find complaints on this site, from not too long ago, that Google was failing to return good results anymore. I don't feel the ranking has particularly improved since then.
I don't think that "right almost every time" is enough. It's different when you're searching for an answer yourself and you're expecting to dig for the right one among others that aren't so right. But when you get one answer from an LLM, you either trust it or you don't. If you do, you're bound to be lied to from time to time and face the consequences. If you don't, you're back to searching manually anyway.
I find questioning LLMs to be a healthy habit these days.
Considering the accuracy of journalists in general (a contentious subject, but the studies I've seen seem to confirm the Gell-Mann amnesia effect to some degree), I'd say AI Overview isn't bad, at least.