I think a lot of people got discouraged, seeing how openai solved arc agi 1 by w...

mrshadowgoose · 2025-03-25T03:03:12 1742871792

Strong emphasis on "seems".

I'd encourage you to review the definition of "brute force", and then consider the absolutely immense combinatoric space represented by the grids these puzzles use.
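To put a rough number on that space, here is a minimal sketch. It assumes the commonly cited ARC grid format of 10 colors and grids from 1x1 up to 30x30; those figures are not stated in this thread, and the function names are purely illustrative.

```python
# Rough size of the space of possible ARC-style grids.
# ASSUMPTIONS (not from the thread): 10 colors per cell, grid sides 1..30.
NUM_COLORS = 10
MAX_SIDE = 30


def count_grids(height, width, colors=NUM_COLORS):
    """Number of distinct grids of one fixed shape."""
    return colors ** (height * width)


def count_all_grids(max_side=MAX_SIDE, colors=NUM_COLORS):
    """Number of distinct grids summed over every shape up to max_side x max_side."""
    return sum(
        count_grids(h, w, colors)
        for h in range(1, max_side + 1)
        for w in range(1, max_side + 1)
    )


if __name__ == "__main__":
    largest = count_grids(MAX_SIDE, MAX_SIDE)
    # Python integers are arbitrary precision, so this is exact.
    print("3x3 grids:", count_grids(3, 3))
    print("digits in the 30x30 count:", len(str(largest)))
    print("digits in the all-shapes count:", len(str(count_all_grids())))
```

Under those assumptions a single 30x30 output already has 10^900 possible fillings, so enumerating candidate answers without any structure to guide the search is not feasible.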

"Brute force" simply cannot touch these puzzles. An amount of understanding and pattern recognition is strictly required, even with the large quantities of test-time compute that were used against arc-agi-1.

Davidzheng · 2025-03-25T09:35:40 1742895340

Also, there's no clear way to verify a solution. There could easily be multiple rules that work on the same examples.
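A toy illustration of that ambiguity, as a sketch only: the grids and the two candidate rules below are made up for this example and are not taken from ARC. Both rules reproduce every demonstration pair, yet they disagree on a new input.

```python
# Two hypothetical rules that fit the same demonstration pairs
# but give different answers on a new test input.

def flip_horizontal(grid):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in grid]


def rotate_180(grid):
    """Rotate the grid by 180 degrees (reverse row order, then each row)."""
    return [row[::-1] for row in grid[::-1]]


def consistent(rule, examples):
    """True if the rule reproduces every (input, output) demonstration pair."""
    return all(rule(inp) == out for inp, out in examples)


# Made-up demonstrations. Every input is symmetric top-to-bottom,
# so the examples cannot tell the two rules apart.
examples = [
    ([[1, 2], [1, 2]], [[2, 1], [2, 1]]),
    ([[3, 0, 4], [5, 5, 6], [3, 0, 4]], [[4, 0, 3], [6, 5, 5], [4, 0, 3]]),
]

# A test input without that symmetry exposes the disagreement.
test_input = [[1, 2], [3, 4]]

if __name__ == "__main__":
    for rule in (flip_horizontal, rotate_180):
        print(rule.__name__, consistent(rule, examples), rule(test_input))
```

Agreement on the demonstrations says nothing about which rule was intended; only the hidden test output can settle it.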

fchollet · 2025-03-25T00:28:34 1742862514

It's useful to know what current AI systems can achieve with unlimited test-time compute resources. Ultimately though, the "spirit of the challenge" is efficiency, which is why we're specifically looking for solutions that are at least within 1-2 orders of magnitude of cost from being competitive with humans. The Kaggle leaderboard is very resource-constrained, and on the public leaderboard you need to use less than $10,000 in compute to solve 120 tasks.
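For scale, the public-leaderboard cap works out to a per-task budget. A quick sketch using only the two figures quoted above (the function name is illustrative, and an even split across tasks is an assumption):

```python
# Per-task compute budget implied by the public leaderboard cap.
# Both figures come from the comment above; the even split is an assumption.
BUDGET_USD = 10_000
NUM_TASKS = 120


def budget_per_task(total_usd=BUDGET_USD, tasks=NUM_TASKS):
    """Average dollars of compute available per task."""
    return total_usd / tasks


if __name__ == "__main__":
    print(f"about ${budget_per_task():.2f} per task")
```

That is roughly $83 of compute per task on average.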

Legend2440 · 2025-03-25T04:03:47 1742875427

Efficiency sounds like a hardware problem as much as a software problem.

$10,000 in compute is a moving target: today's GPUs are much, much better than those of 10 years ago.

NitpickLawyer · 2025-03-25T08:10:37 1742890237

> $10000 in compute is a moving target

And it's also irrelevant in some fields. If you solve a "protein folding" problem that was a blocker for a pharma company, that 10k is peanuts now.

Same for coding. If you can spend $100/hr on a "mid-level" SWE agent, but you can literally spawn 100 today and 0 tomorrow and reach your clients faster, then again the cost is irrelevant.