"Does X look like Y" is always a continuum. In one test environment, players were judged human at these rates:

* Real humans: 66%

* GPT-4: 49.7%

* ELIZA: 22%

* GPT-3.5: 20%

https://arxiv.org/pdf/2310.20216

(I'm rather surprised by ELIZA beating 3.5, as were the researchers).

Turing's original framing of the test was that an average interrogator would have no more than a 70% chance of making the right identification after 5 minutes of questioning.
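
For what it's worth, here is a rough sketch of how the rates above compare against that 70% figure. It assumes the 70% bound can be read as "the machine gets judged human at least 30% of the time", which is my own simplification: the linked study's setup is not the same as Turing's original game, so treat this as back-of-the-envelope arithmetic only. The names `judged_human` and `clears_threshold` are made up for illustration.

```python
# Rates at which each player type was judged human (from the list above).
judged_human = {
    "Real humans": 0.66,
    "GPT-4": 0.497,
    "ELIZA": 0.22,
    "GPT-3.5": 0.20,
}

# Assumption: "no more than a 70% chance of a right identification" is read
# as "judged human at least 30% of the time". This is a simplification.
TURING_THRESHOLD = round(1 - 0.70, 2)


def clears_threshold(rate, threshold=TURING_THRESHOLD):
    """Return True if a judged-human rate meets or exceeds the threshold."""
    return rate >= threshold


results = {name: clears_threshold(rate) for name, rate in judged_human.items()}

for name, passed in results.items():
    verdict = "clears" if passed else "falls short of"
    print(f"{name}: {judged_human[name]:.1%} {verdict} the {TURING_THRESHOLD:.0%} mark")
```

Under that reading, GPT-4 and the real humans land above the line, while ELIZA and GPT-3.5 land below it.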



