
I think the data is way more important for the success of LLMs than the architecture, although I do think there's something important in the GPT architecture in particular. See this talk for why: [1]

Warning, watch out for hand-waving: The way I see it, cognition involves forming an abstract representation of the world and then reasoning about that representation. It seems obvious that non-human animals do this without language, so it seems likely that humans do too, with language layered on top as a turbo boost. However, it also seems plausible that you could build an abstract representation of the world by studying a vast amount of human language, and that this would be a good approximation of the real world too. Furthermore, it seems possible that reasoning about that abstract representation can take place in the depths of the layers of a large transformer. So it's not clear to me that we're limited by the data we have, or that we necessarily need a different type of data to build a general AI, although that would likely help build a better world model. It's also not clear that an LLM is incapable of the type of reasoning that animals apply to their abstract world representations.

[1] https://youtu.be/yBL7J0kgldU?si=38Jjw_dgxCxhiu7R






I agree we are not limited by data set size: all humans learn language from a much smaller language training set (just look at kids and compare them to LLMs).

OTOH, humans (and animals) do get other data feeds (visual, context, touch/pain, smell, internal balance "sensors"...) that develop as we grow and get tied to learning about language.

Obviously, LLMs won't replicate that, since even adults struggle to describe those sensations verbally.


> However, it also seems plausible that you could build an abstract representation of the world by studying a vast amount of human language, and that this would be a good approximation of the real world too. Furthermore, it seems possible that reasoning about that abstract representation can take place in the depths of the layers of a large transformer.

While I agree this is possible, I don't see why you'd think it's likely. I would instead say that I think it's unlikely.

Human communication relies on many assumptions of a shared model of the world that are rarely if ever discussed explicitly, and without which certain concepts or at least phrases become ambiguous or hard to understand.


The GP's argument seems to be about "thinking" when restricted to knowledge acquired through language, and "possible" is not the same as "likely" or "unlikely"; you are not really disagreeing, since either implies "possible".

The GP said plausible, which does mean likely. It's possible that there's a teapot in orbit around Jupiter, but it's not plausible. And the GP is specifically saying that by studying human language output, you could plausibly learn about the world that gave birth to the internal models that language is used to externalize.

If we are really nitpicking, they said it's plausible you could build an abstract representation of the world by studying language-based data, but only that it's possible it could be made to effectively reason about it too.

Anyway, it seems to me we are generally all in agreement (in this thread, at least), but are now being really picky about... language :)



