
I think the data is way more important for the success of LLMs than the architecture, although I do think there's something important in the GPT architecture in particular. See this talk for why: [1]

Warning, watch out for hand-waving: The way I see it, cognition involves forming an abstract representation of the world and then reasoning about that representation. It seems obvious that non-human animals do this without language, so it seems likely that humans do too, with language layered on top as a turbo boost. However, it also seems plausible that you could build an abstract representation of the world by studying a vast amount of human language, and that this would be a good approximation of the real world too. Furthermore, it seems possible that reasoning about that abstract representation can take place in the depths of the layers of a large transformer. So it's not clear to me that we're limited by the data we have, or that we necessarily need a different type of data to build a general AI, although that would likely help build a better world model. It's also not clear that an LLM is incapable of the type of reasoning that animals apply to their abstract world representations.

[1] https://youtu.be/yBL7J0kgldU?si=38Jjw_dgxCxhiu7R






I agree we are not limited by data set size: all humans learn language from a much smaller language training set (just look at kids and compare them to LLMs).

OTOH, humans (and animals) do get other data feeds (visual, context, touch/pain, smell, internal balance "sensors"...) that develop as we grow and get tied to learning about language.

Obviously, LLMs won't replicate that, since even adults struggle to describe those sensations verbally.


> However, it also seems plausible that you could build an abstract representation of the world by studying a vast amount of human language, and that this would be a good approximation of the real world too. Furthermore, it seems possible that reasoning about that abstract representation can take place in the depths of the layers of a large transformer.

While I agree this is possible, I don't see why you'd think it's likely. I would instead say that I think it's unlikely.

Human communication relies on many assumptions of a shared model of the world that are rarely if ever discussed explicitly, and without which certain concepts or at least phrases become ambiguous or hard to understand.


The GP's argument seems to be about "thinking" when restricted to knowledge acquired through language, and "possible" is not the same as "likely" or "unlikely"; you are not really disagreeing, since either implies "possible".

The GP said plausible, which does mean likely. It's possible that there's a teapot in orbit around Jupiter, but it's not plausible. And the GP is specifically saying that by studying human language output, you could plausibly learn about the world that gave birth to the internal models that language is used to externalize.

If we are really nitpicking, they said it's plausible you could build an abstract representation of the world by studying language-based data, but only that it's possible it could be made to effectively reason about it too.

Anyway, it seems to me we are generally all in agreement (in this thread, at least), but are now being really picky about... language :)



