
That doesn't show that they "fall over". All of the degraded performances are still highly non-trivial. And even the paper admits humans would show degraded performance on counterfactuals as well; the authors suggest humans might avoid it only given "enough time to reason and revise", something the LLMs being evaluated don't get to do here.

If you took arithmetic tests in base 8, you wouldn't reach the same accuracy either.
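The base-8 point is easy to check mechanically. A quick Python sketch of what such a counterfactual arithmetic item looks like (the specific example numbers are illustrative, not taken from the paper):

```python
def to_base8(n: int) -> str:
    """Render a non-negative integer in base 8."""
    return format(n, "o")

def add_base8(a: str, b: str) -> str:
    """Add two base-8 numerals and return the base-8 result."""
    return to_base8(int(a, 8) + int(b, 8))

# In base 10, 27 + 35 = 62. Read as base 8, the same digit strings
# denote 23 and 29 decimal; their sum is 52 decimal, written "64".
print(add_base8("27", "35"))  # → "64"
```

The digits look familiar but the carry rules change, which is exactly what makes the counterfactual variant harder for a test-taker trained on base 10.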




Well, sure, but the problem is that LLMs architecturally can't reason and revise. Perhaps we can chain together a system that approximates this, but then it still wouldn't be the LLM itself doing the reasoning.
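The "chain together a system" idea can be sketched as an outer loop that feeds a failed attempt back to the model. This is a minimal illustration only, assuming a hypothetical `llm(prompt)` callable and `check(answer)` verifier (not any real API); the point is that the revision lives in the loop, not in the model:

```python
def reason_and_revise(llm, check, question, max_rounds=3):
    """Generate an answer, then repeatedly ask the model to revise it.

    `llm` and `check` are hypothetical stand-ins: any callable that maps
    a prompt string to an answer string, and any answer verifier.
    """
    answer = llm(question)
    for _ in range(max_rounds):
        if check(answer):
            return answer
        # The "revision" happens here, in the scaffolding around the
        # model, not inside the model's own forward pass.
        answer = llm(f"{question}\nPrevious attempt: {answer}\nRevise it.")
    return answer

# Stub demonstration: this toy "model" only succeeds when asked to revise.
demo_llm = lambda p: "right" if "Revise" in p else "wrong"
print(reason_and_revise(demo_llm, lambda a: a == "right", "2+2?"))  # prints "right"
```

Whether the loop-plus-model composite counts as "the LLM reasoning" is exactly the distinction the comment is drawing.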




