Many of the other replies here are wrong - the primary reason is that the LLMs were used on completely out-of-distribution data (e.g. trained on English, then evaluated on a completely different language that shared only some characters). The points about the connection between compression and understanding are valid, but they are not the main reason the LLMs underperformed relative to naive compression.
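
For concreteness, here is a minimal sketch of why the distribution mismatch matters (assuming the Hugging Face transformers and torch packages, with GPT-2 standing in for an English-trained model and Polish standing in for the mismatched language - illustrative choices, not the setup from the article). An arithmetic coder driven by a language model spends roughly -log2 p(token) bits per token, so the model's cross-entropy is an approximate lower bound on its compressed size, and on text far from the training distribution that bound can exceed what gzip achieves.

    import gzip
    import math

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def llm_bits_per_byte(text, model, tokenizer):
        """Approximate bits per raw byte if an arithmetic coder used the model's token probabilities."""
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            # With labels=ids the model returns mean cross-entropy in nats per predicted token.
            loss = model(ids, labels=ids).loss.item()
        total_bits = loss * (ids.shape[1] - 1) / math.log(2)  # nats -> bits; one prediction per token after the first
        return total_bits / len(text.encode("utf-8"))

    def gzip_bits_per_byte(text):
        raw = text.encode("utf-8")
        return 8 * len(gzip.compress(raw)) / len(raw)

    tok = AutoTokenizer.from_pretrained("gpt2")
    lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    # Toy samples; use longer, non-repetitive text for a fair comparison,
    # since gzip's fixed header overhead dominates on very short inputs.
    english = "The referee paused the match after the goalkeeper collided with a defender near the penalty area."
    polish = "Sedzia przerwal mecz po tym, jak bramkarz zderzyl sie z obronca w poblizu pola karnego."  # shares the Latin script, not the training distribution

    for name, text in [("English", english), ("Polish", polish)]:
        print(f"{name}: LLM ~{llm_bits_per_byte(text, lm, tok):.2f} bits/byte, gzip {gzip_bits_per_byte(text):.2f} bits/byte")

Roughly speaking, one would expect the English sample to come out well under gzip for the model, while on the mismatched language the model's next-token probabilities are much flatter, the per-token cost balloons, and naive compression can win - which is the out-of-distribution effect described above.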


