I think we have a bit of a miscommunication here. You are treating these “benchmarks” as both necessary and sufficient for “real intelligence”, when people are only saying they are necessary, not sufficient.
It’s like asking for a drink (legal age 21) and being rebuked with “you aren’t even old enough to drive” (legal age 16). That doesn’t mean you can drink once you turn 16!
Edit: If you want a hard necessary and sufficient condition for “real intelligence”, mine is “when it can do all of our jobs, i.e. wholesale replace every human”.
* “X would demonstrate real intelligence.”
* Some ML model M beats X.
* “Well, actually, M is not really intelligent. Y would prove real intelligence.”
* repeat