The purpose of my original comment wasn't to accurately depict LLMs, but to introduce the properties of people that cause us to write, speak, etc., which LLMs aren't sensitive to. The point was to answer the question, "in what ways aren't we just doing the same?"
The point of the LLM bit is that the property of the world that LLMs are sensitive to is the distribution of text tokens in their training data. Regardless of which features of this distribution any given AI model captures, it is necessarily a model of the dataset's actual P(token | preceding tokens).
In the case of LLMs it's a very high-dimensional model, so that P(word | previous words) is actually modelled by something like P(word | prompt embedding, answer-so-far embedding, ...) -- but this makes no difference to the "aren't we doing the same?" question. We don't use frequency associations between parts of a historical corpus when we speak.
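To make the "model of the dataset's conditional distribution" point concrete, here's a deliberately crude sketch: a bigram counter fitted to a toy corpus. This is nothing like a transformer internally, and the corpus and function names are made up for illustration, but the object being estimated is the same kind of thing, a P(next token | context) over the training data.

```python
from collections import Counter, defaultdict

# Toy illustration only: a bigram counter standing in for "a model of the
# dataset's P(token | previous tokens)". Real LLMs condition on long contexts
# through learned embeddings, but the estimated object is still a conditional
# distribution over the training corpus.
def fit_bigram(corpus_tokens):
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        counts[prev][nxt] += 1
    # Normalize counts into conditional probabilities P(next | prev).
    return {
        prev: {tok: c / sum(nxts.values()) for tok, c in nxts.items()}
        for prev, nxts in counts.items()
    }

corpus = "the pig saw the dog and the pig ran".split()
model = fit_bigram(corpus)
print(model["the"])  # e.g. {'pig': 0.67, 'dog': 0.33}
```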
I don't think the question of "are we doing the same?" is meaningful except on the surface, where we focus on the function being performed and ignore what we know about the mechanism performing it.
On the surface, given in-context learning, novel out-of-distribution contexts, and reactive coupling with a world context (a Python REPL, a simulation, a robot, or some other source of empirical feedback), then yes, there is a sense in which LLMs do the same kinds of things and can perform the same kinds of functions.
Given an experimental, out-of-distribution context that no human has seen, an LLM can generate novel hypotheses, experiment to test them, and converge on the truth. It doesn't matter whether this functionality arises from a corpus-conditioned token generator or from a biological network of spiking neurons. It's important to point out that both systems support that function, without appealing to reductions, which in both cases would trivialize and obscure the higher-order functions. If we're physically reductive with LLMs, that throws away the functionalist view and reduces our ability to actually expect, predict, and elicit higher-order functionalities.
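For what it's worth, the kind of reactive coupling I mean can be sketched in a few lines. This is only an illustration of the loop's shape, under assumptions of my own: `ask_model` is a hypothetical stand-in for any LLM call (not a real API), and the empirical context is just executing Python.

```python
import contextlib
import io

# Hedged sketch, not anyone's real API: the "world context" here is simply
# exec'ing Python and capturing its output. The point is the shape of the
# loop: propose, get empirical feedback, revise.
def run_python(code: str) -> str:
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})
    except Exception as e:
        return f"error: {e}"
    return buf.getvalue()

def hypothesis_loop(ask_model, task: str, max_rounds: int = 5) -> str:
    feedback = ""
    for _ in range(max_rounds):
        # The model conditions on the task plus all empirical feedback so far,
        # which is where the reactive coupling comes in.
        code = ask_model(f"Task: {task}\nFeedback so far: {feedback}\nPropose code:")
        feedback = run_python(code)
        if not feedback.startswith("error:"):
            break
    return feedback
```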
Everything you say here makes sense, except the last bit:
> We don't use frequency associations between parts of a historical corpus when we speak.
But that's the thing: it seems we do. Arguably, the very meaning of concepts is determined solely by associations with other concepts, in a way remarkably similar, if not identical, to frequency associations.
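One rough way to make "associations with other concepts" concrete is the distributional picture: represent a word by its co-occurrence profile with other words and compare profiles. The toy corpus and window size below are arbitrary illustrative choices, not a claim about how the brain actually stores meaning.

```python
import numpy as np

# Rough sketch of the distributional idea: words with similar association
# profiles (co-occurrence counts) come out as similar in meaning.
corpus = "the pig ate food the dog ate food the pig slept".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))

window = 2  # arbitrary context window for the toy example
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            counts[idx[w], idx[corpus[j]]] += 1

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "pig" and "dog" end up with similar association profiles.
print(cosine(counts[idx["pig"]], counts[idx["dog"]]))
```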
No, no.. the semantics of words is not other words.
Cavemen wander around, they fall over a pig, they point to the pig and say "pig". Other cavemen observe. Later, when they want a pig, they say "pig". No one here knows anything about pigs other than that there is something in the world which causes people to say "pig", and each caveman is able to locate that thing after a while.
The vast majority of language is nothing more than this: words point outside themselves to the world, and this pointing is grown in us through acquaintance with the world.
Now, in general, the cause of my saying "pig" is not me falling over one. Suppose I say, to a friend, "I've always thought pigs were cute, until I saw a big one!"
So here, "I" points at both me as a body, but also plausibly at my model of myself (etc.), "always" modifies "thought" ... so "I've always thought" ends up being a statement about how my own models of my self over time have changed.. and so on for "pigs" and the like.
We do not know that this is what our words mean. We have no idea what we're referring to when we say "I've always thought" -- the nature of the world that our words refer to requires, in general, science to explain. Words are, at first, just a familiar way of throwing darts at a target which we can see, but neither describe nor explain.
It is this process which is entirely absent in an LLM. An LLM isn't throwing a dart at anything; it isn't even speaking. It's replaying historical darts matches between people.
And this is just to consider reference. There are other causes of our using words that are much more complex than our trying to refer to things; likewise, these are absent from the LLM.