Why do people keep taking OpenAI's marketing spin at face value? This keeps happening, like when they neglected to mention that their most impressive Sora demo involved extensive manual editing and cleanup work because the studio couldn't get Sora to generate what they wanted.
It might be because (very few!) mathematicians like Terence Tao make positive remarks. I think these mathematicians should be very careful to use reproducible, controlled setups, which by their nature cannot take place on GPUs in the Azure cloud.
I have nothing against scientists promoting the Coq Proof Assistant. But that's open source, can be run at home and is fully reproducible.
Keep in mind that those mathematicians were kept in the dark about the funding: it is incredibly unethical to invite a coauthor onto your paper and not tell them where the money came from.
It's just incredibly scummy behavior: I imagine some of those mathematicians would have declined the collaboration if the funding had been transparent. More so than data contamination, this makes me deeply mistrustful of Epoch AI.
I can't parse any of this; can you explain it to a noob? I get lost immediately: funding, coauthor, etc. The only interpretation I've come up with is that I've missed a scandal involving payola, Terence Tao, and keeping coauthors off papers.
Wait, I think I somehow knew Epoch AI was getting money from OpenAI. I'm not sure how, and I didn't connect the facts together to see this problem coming in advance.
Because they are completely gullible and believe almost everything OpenAI claims without questioning the results.
With each product they release, more of their top researchers leave.
Everyone now knows what happens when you go against or question OpenAI after working for them, which is why you don't see any criticism, only cult-like worship.
Because the models have continually matched the quality they claim.
E.g., look at how much work "very few" has to do in the sibling comment. It's like saying "very few physicists [Einstein/Feynman/Witten]".
It's conveniently impossible to falsify the implication that the inverse of "very few" have nothing positive to say, i.e. that the vast majority say negative things.
You have to go through an incredible level of mental gymnastics, involving many months of gated decisions where the route chosen involved "gee, I know this is susceptible to confirmation bias, but...", to end up wondering why people think the models are real just because OpenAI has access to data that includes some set of questions.
> Because the models have continually matched the quality they claim.
That's very far from true.
"Yes, I know that the HuggingFace arena and coding assistant leaderboards both say that OpenAI's new model is really good, but in practice you should use Claude Sonnet instead" was a meme for good reason, as was "I know the benchmarks show that 4o is just as capable as ChatGPT4 but based on our internal evals it seems much worse". The latter to the extent that they had to use dark UI patterns to hide ChatGPT-4 from their users, because they kept using it, and it cost OpenAI much more than 4o.
OpenAI regularly messes with benchmarks to keep the investor money flowing. Slightly varying the wording of benchmark problems causes a 30% drop in o1 accuracy. That doesn't mean "LLMs don't work" but it does mean that you have to be very sceptical of OpenAI benchmark results when comparing them to other AI labs, and this has been the case for a long time.
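To make that perturbation claim concrete, here is a minimal sketch (in Python, with a hypothetical `ask_model` stub standing in for whatever API you evaluate; the names and structure are illustrative, not anyone's actual eval harness) of how you would measure sensitivity to reworded benchmark items:

    # Hypothetical sketch: measure how much accuracy drops when benchmark
    # questions are paraphrased but their expected answers stay the same.
    def ask_model(question: str) -> str:
        # Stand-in for a real model/API call; plug in your own client here.
        raise NotImplementedError("wire this up to the model under test")

    def accuracy(items: list[dict], field: str) -> float:
        # items: [{"original": ..., "paraphrase": ..., "answer": ...}, ...]
        correct = sum(ask_model(it[field]).strip() == it["answer"] for it in items)
        return correct / len(items)

    def perturbation_gap(items: list[dict]) -> float:
        # Accuracy on the original wording minus accuracy on the paraphrase.
        # A large positive gap suggests memorization of exact phrasing rather
        # than genuine capability.
        return accuracy(items, "original") - accuracy(items, "paraphrase")

A benchmark whose scores survive this kind of rewording is much harder to game than one that only ever gets run with the canonical phrasing.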
The FrontierMath case just shows that they are willing to go much farther with their dishonesty than most people thought.
https://news.ycombinator.com/item?id=40359425