Hacker News

No... sigmoid10 was comparing with o1 (not o1-pro), which is accessible for $20/mo, not $200/mo. So the "Elon factor" in your math is +$20/user/month (2x) for barely any difference in performance (a hard sell), not -$160/user/month. And while we have no clear answer as to whether either company is making a profit at that price, it would be surprising if OpenAI Plus users were not profitable, given the reasonable rate limits OpenAI imposes on o1 access and the fact that most Plus users probably aren't maxing out those limits anyway. o1-pro requires vastly more compute than o1 per query, and OpenAI was providing effectively unlimited o1-pro access to Pro users, so users who want tons of queries gravitate to that subscription. The combination of those factors is certainly why Sam Altman claimed they weren't making money on Pro users.
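To make that arithmetic explicit, here's a minimal sketch using the subscription prices discussed in this thread ($20/mo ChatGPT Plus, $40/mo X Premium+, $200/mo ChatGPT Pro — prices may change, and the variable names are just for illustration):

```python
# Monthly subscription prices in USD, as cited in this thread.
CHATGPT_PLUS = 20    # rate-limited o1 access
X_PREMIUM_PLUS = 40  # Grok 3 access
CHATGPT_PRO = 200    # effectively unlimited o1-pro access

# Comparing Grok 3 against o1, the model it ties with on lmarena:
elon_factor = X_PREMIUM_PLUS - CHATGPT_PLUS
print(elon_factor)                    # +20 per user per month
print(X_PREMIUM_PLUS / CHATGPT_PLUS)  # 2.0, i.e. double the price

# Comparing against o1-pro instead flips the sign:
print(X_PREMIUM_PLUS - CHATGPT_PRO)   # -160 per user per month
```

The point of the sketch: which baseline you subtract (o1 at $20 or o1-pro at $200) determines whether Grok 3 looks like a 2x markup or an $160/month savings.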

lmarena has also become less and less useful over time for comparing frontier models, since all of them can saturate the performance needed for the kind of casual questions typically asked there. On the harder questions, o1 (not even o1-pro) still appears to be tied for first place with several other models... which is yet another indication of just how saturated that benchmark is.



“The impression overall I got here is that this is somewhere around o1-pro capability”.

“Grok 3 + Thinking feels somewhere around the state of the art territory of OpenAI's strongest models (o1-pro, $200/month)”.


The comment I was replying to had replied to an lmarena benchmark link. Perhaps you think that person should have replied to someone else? And, if you want to finish the quote, Karpathy's opinion on this is subjective. He admits it isn't a "real" evaluation.

"[...] though of course we need actual, real evaluations to look at."

His own tests are better than nothing, but hardly definitive.


I understood numpad0 to continue the comparison to o1-pro, after sigmoid10 expressed the opinion that the comparison is warranted.


Yes, numpad0 did... but I was pointing out that this choice was illogical. The lmarena results they were replying to only supported a comparison against o1, since o1 effectively matches Grok 3 on the benchmark being replied to (with o1-pro nowhere to be found), and then they immediately leapt into a bunch of weird value-proposition math. As I said, perhaps you think they should have replied to someone else? Replying to an lmarena benchmark indicates that numpad0 was using that benchmark as part of the justification of their math. I also pointed out the limitations of lmarena as a benchmark for frontier models.

I don't think anyone is arguing that ChatGPT Pro is a good value unless you absolutely need to bypass the rate limits all the time, and I cannot find a single indication that Premium+ includes unlimited access to Grok 3. If Premium+ doesn't offer unlimited usage, then it's simply not comparable to ChatGPT Pro, and other than one subjective comment by Karpathy, we have no benchmarks indicating that Grok 3 might be as good as o1-pro. ChatGPT Plus already gets you 99% of ChatGPT Pro's value, at half the price of Premium+.

numpad0 was effectively making a strawman argument by ignoring ChatGPT Plus here... it is very easy to beat up a strawman, so I am here to point out a bad argument when I see one.


You're the one who came in and told him about the "factor in your math". Like you said, it's his comparison, not yours. If you want to do your own comparison, feel free, but don't come in and tell him he's not allowed to do his. I, for one, like his comparison.


Guys, y'all are forgetting GIGO. First principles.

This thing is produced by Musk.



