Absolutely false: at the core of every LLM is a highly compressed text corpus, like the one behind an Internet search engine.

(The wonder here isn't that an LLM succeeds at text retrieval tasks; the wonder is how highly compressed the index turns out to be. But maybe we just severely overestimate our own information complexity.)
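Rough numbers make the point, though every figure below is an assumption for illustration, not from any particular model:

    # Back-of-envelope: raw training text vs. model weights.
    # All figures are illustrative assumptions.
    corpus_tokens   = 10e12   # ~10T training tokens (assumed)
    bytes_per_token = 4       # ~4 bytes of UTF-8 per token (assumed)
    params          = 70e9    # 70B-parameter model (assumed)
    bytes_per_param = 2       # fp16/bf16 weights

    corpus_bytes = corpus_tokens * bytes_per_token  # ~40 TB of text
    model_bytes  = params * bytes_per_param         # ~140 GB of weights
    print(f"~{corpus_bytes / model_bytes:.0f}x smaller")  # ~286x

Even if those guesses are off by an order of magnitude, that's well past the ~3-4x gzip gets on English prose, so whatever the weights are doing, it isn't lossless storage.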




So, you're saying an LLM is just a database that does text retrieval?


Yes, using a statistical model that is, in effect, a very lossy compressor.
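There's a precise version of that claim: a model that assigns probability p to each next symbol can drive an entropy coder that spends about -log2 p bits on it, so better prediction means smaller output. A toy sketch of the idea, with a character bigram model standing in for the LLM and a made-up placeholder string as the "corpus":

    import math
    from collections import Counter, defaultdict

    # Toy "model as compressor": an entropy coder driven by a model's
    # next-symbol probabilities spends about -log2 p(symbol) bits each.
    text = "the quick brown fox jumps over the lazy dog " * 50

    alphabet = sorted(set(text))
    counts = defaultdict(Counter)
    for prev, cur in zip(text, text[1:]):
        counts[prev][cur] += 1

    def prob(prev, cur):
        # Add-one smoothing so unseen pairs keep nonzero probability.
        c = counts[prev]
        return (c[cur] + 1) / (sum(c.values()) + len(alphabet))

    # Shannon ideal code length under the model.
    bits = sum(-math.log2(prob(p, c)) for p, c in zip(text, text[1:]))
    print(f"{bits / (len(text) - 1):.2f} bits/char vs 8.00 raw "
          f"(~{8 * (len(text) - 1) / bits:.1f}x smaller)")

The lossy part is that the weights alone can't regenerate the corpus exactly; they keep the statistics and drop the rest.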


So, what you're telling me is that everything they say has already been said before, completely verbatim? Like, if I asked it to write a story about a dog named Jebediah surfing to planet Xbajahabvash, it would basically just find a link to someone else's story about the same dog surfing to the same planet? That sounds like an infinitely large number of combinations. Perhaps the internet is just infinitely large, squared (or even circled).


So, like a human, then?



