LLMs have access to the space of collective semantic understanding. I don't understand why people expect cognitive faculties that are clearly extra-semantic to just fall out of them eventually.
The reason they sometimes appear to reason is because there's a lot of reasoning in the corpus of human text activity. But that's just a semantic artifact of a non-semantic process.
Human cognition is much more than just our ability to string sentences together.
I might expect some extra-semantic cognitive faculties to emerge from LLMs, or at least be approximated by LLMs. Let me try to explain why. One example of extra-semantic ability is spatial reasoning. I can point to a spot on the ground and my dog will walk over to it — he’s probably not using semantic processing to talk through his relationship with the ground, the distance of each pace, his velocity, etc. But could a robotic dog powered by an LLM use a linguistic or symbolic representation of spatial concepts and actions to translate semantic reasoning into spatial reasoning? Imagine sensors with a measurement-to-language translation layer (“kitchen is five feet in front of you”), and actuators that can be triggered with language (“move forward two feet”). It seems conceivable that a detailed enough representation of the world, expressive enough controls, and a powerful enough LLM could result in something that is akin to spatial reasoning (an extra-semantic process), while under the hood it’s “just” semantic understanding.
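To make that concrete, here is a rough sketch of the loop I have in mind. Everything in it is hypothetical (llm_complete, read_sensors and send_to_actuators are stand-ins, not real APIs); the point is only that the model itself never touches anything except text:

    # Hypothetical sensor -> language -> LLM -> language -> actuator loop.
    # llm_complete(), read_sensors() and send_to_actuators() are stand-ins.
    import re

    def describe(sensors):
        # Translate raw measurements into a sentence the model can read.
        return (f"The kitchen is {sensors['kitchen_distance_ft']} feet ahead. "
                f"You are moving at {sensors['speed_ft_s']} feet per second.")

    def parse_command(reply):
        # Expect replies like "move forward 2 feet"; turn them into actuator calls.
        m = re.search(r"move (forward|backward) ([\d.]+) feet", reply.lower())
        return (m.group(1), float(m.group(2))) if m else None

    def control_loop(llm_complete, read_sensors, send_to_actuators, goal):
        while True:
            prompt = f"Goal: {goal}\nObservation: {describe(read_sensors())}\nAction:"
            reply = llm_complete(prompt)      # the purely semantic step
            command = parse_command(reply)    # back into the physical world
            if command is None:
                break
            send_to_actuators(*command)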
Spatial reasoning is more akin to visualising a 3D "odd shaped" fuel tank from 2D schematics and being able to mentally rotate that shape to estimate where a fluid line would be at various angles.
This is distinct from stringing together treasure map instructions in a chain.
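For what it's worth, here's roughly what that rotation amounts to numerically; the coordinates are made up, and I'm not claiming mental rotation literally runs matrix math:

    # Rotate a (made-up) point on the tank about the z axis and read off
    # where the fluid-line fitting ends up at a given angle.
    import math

    def rotate_z(point, theta):
        x, y, z = point
        return (x * math.cos(theta) - y * math.sin(theta),
                x * math.sin(theta) + y * math.cos(theta),
                z)

    fluid_line_fitting = (1.0, 0.5, 0.2)  # hypothetical coordinates
    print(rotate_z(fluid_line_fitting, math.radians(30)))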
Isn’t spatial navigation a bit like graph walking, though? Also, AFAIK blind people describe it completely differently, and they’re generally confused by the whole concept of 3D perspective and objects getting visually smaller over distance, and so on. Brains don’t work the same for everyone in our species, and I wouldn’t presume to know the full internal representation just based on qualia.
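By "graph walking" I mean something like this toy sketch: rooms as nodes, doorways as edges, a route as a breadth-first search. It's only an illustration, not a claim about how anyone's brain actually represents space:

    # Toy "navigation as graph walking": find a route through a house
    # by breadth-first search over rooms connected by doorways.
    from collections import deque

    house = {
        "bedroom": ["hallway"],
        "hallway": ["bedroom", "kitchen", "living room"],
        "kitchen": ["hallway"],
        "living room": ["hallway"],
    }

    def route(start, goal):
        queue, seen = deque([[start]]), {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for nxt in house[path[-1]]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])

    print(route("bedroom", "kitchen"))  # ['bedroom', 'hallway', 'kitchen']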
I'm always impressed by the "straightedge-and-compass"-flavoured techniques drafters of old used to rotate views of odd 3D shapes from pairs of 2D schematics, in the centuries before CAD software.
I don’t know if you’re correct. I don’t think you know that our brains are that different. We too need to train ourselves on massive amounts of data. I feel like the kinds of reasoning and understanding I’ve seen ChatGPT do are so far beyond something like just processing language.
When I talk to 8B models, it's often painfully clear that they are operating mostly (entirely?) on the level of language. They often say things that make no sense except from a "word association" perspective.
With bigger (400B models) that's not so clear.
It would be silly to say that a fruit fly has the same thoughts as me, only a million times smaller quantitatively.
I imagine the same thing is true (genuine qualitative leaps) in the 8B -> 400B direction.
We do represent much of our cognition in language.
Sometimes I feel like LLMs might be “dancing skeletons” - pulleys & wire giving motion to the bones of cognition.
Do you have any evidence that human cognition (for speaking) is more than just an ability to string sentences together? Do you have any evidence that LLMs don't reason at all?
A perfect machine designed only to string sentences together as perfect responses, with no reasoning built in, is indistinguishable from a machine that builds sentences from pure reasoning.
Either way nobody understands what's going on in the human brain and nobody understands why LLMs work. You don't know. You're just stating a belief.
It is like having Google's MusicLM output an mp3 of saxophone music and then asking what proof there is that MusicLM has not learned to play the saxophone.
In a certain context, one that is only judging the output, the model has achieved what is meant by "play the saxophone".
In another context, of what is normally meant, the idea that the model has learned to play the saxophone is completely ridiculous and not something anyone would even try to defend.
In the context of LLMs and intelligence/reasoning, I think we are mostly talking about the latter and not the former.
"Maybe you don't have to blow throw a physical tube to make saxophone sounds, you can just train on tons of output of saxophone sounds then it is basically the same thing"
Let's limit the discussion to things that can be actually done with an LLM.
Getting one to blow on a saxophone is outside of this context.
An LLM can't blow on a saxophone period. However it can write and read English.
>In the context of LLMs and intelligence/reasoning, I think we are mostly talking about the latter and not the former.
And I'm saying the latter is completely wrong. I'm also saying the former is irrelevant. Look, this is what you're doing: for the former you're comparing something humans can do to something LLMs can't do. That's a completely irrelevant comparison.
For the latter we are comparing things humans and LLMs BOTH can do. Sometimes humans give superior output, sometimes LLMs give superior output. Given similar inputs and outputs, the internal analysis of what's going on, whether it's true intelligence or true reasoning, is NOT ridiculous.
"Ridiculous" is comparing things where no output exists. LLMs do not have saxophone output where they actually blow into an instrument. There's nothing to be compared here.
There's also counter-evidence that animals lack reasoning abilities. Ever see a gorilla attack its own reflection, or a dog chase its own tail? What animals display is contradictory evidence: some behavior shows the ability to reason, and some shows the lack of it.
Figures that the LLM displays the same contradictory evidence.
But none of this evidence proves anything definitively. Much like human cognition, the LLM is a machine that we built but don't understand. No definitive statement can be made about it.
You just argued that, because some animals display errors in reasoning, we can call into doubt the claim that any animal reasons. This does not follow. The reflection test, for example, is passed by some animals; we can test it by putting a red dot on their face and seeing if they mess with it. Either way, it’s not even a test of general reasoning abilities. I think you’re giving animals far too little credit.
I believe the claim that some animals can reason is a very reasonable hypothesis, while the claim that no animals can reason is an unlikely one. The contradictory evidence you cite is pointing at some animals doing dumb things. Those animals can be as dumb as rocks and it wouldn’t matter for my claim, I’d only need to show you one reasoning animal to prove it.
Not to mention all of the developments in mirror testing to account for differences in perception. What they're finding is that self-recognition is more common than assumed.
I'd wager they could not create much if they had no exposure to other art or music in the first place. Creation does not come from nothing. Composers and artists typically imitate; it's very well known.
So where is the original guitar music that all the guitar players imitated? It can't have been created by a human, since humans imitate and can't create new things, as you say. Was it God who created it? Or was it always there?
Humans are really creative and create new stuff. Not sure why people try to say humans aren't.
I find the terminology is used inconsistently*, so it's probably always worth asking.
To me, a "large language model" is always going to mean "does text"; but the same architecture (transformer) could equally well be trained on any token sequence which may be sheet music or genes or whatever.
IIRC, transformers aren't so good for image generators, or continuums in general; they really do work best where "token" is the right representation.
* e.g., to me, if it's an AI and it's general then it's an AGI, so GPT-3.5 onwards counts; what OpenAI means when they say "AGI" is what I'd call "transformative AI"; there's plenty of people on this site who assert that it's not an AGI but whenever I've dug into the claims it seems they use "AGI" to mean what I'd call "ASI" ("S" for "superhuman"); and still others refuse to accept that LLMs are AI at all despite coming from AI research groups publishing AI papers in AI journals.
No. LLMs can take any type of data. Text is simply a string of symbols. Images, video and music are also strings of symbols. The model is the same algorithm, just trained on different types of data.
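As a toy illustration of what "a string of symbols" means here (real image models use patch embeddings or learned codebooks rather than raw pixels, so treat this as a sketch):

    # A tiny grayscale "image" flattened into a sequence of integer tokens,
    # which is the same kind of input a transformer consumes as text.
    image = [
        [0, 255, 128],
        [64, 32, 200],
    ]
    image_tokens = [pixel for row in image for pixel in row]
    print(image_tokens)   # [0, 255, 128, 64, 32, 200]

    # Text goes through the same funnel: characters/subwords become integers.
    text_tokens = [ord(c) for c in "cat"]
    print(text_tokens)    # [99, 97, 116]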
I never said cognition was limited to text. I just limited the topic itself to cognition involving text.
Every culture on earth seems to have figured out the same rudimentary addition and fractions to handle accounts and inheritance partitioning. Why didn't they come up with inconsistent models of numerics if they developed in total linguistic isolation?
It's possible that we are just LLMs with much much more data such that we don't make inconsistencies. And the data is of course just inborn and hardwired into our neural networks rather than learned.
We don't know, so no statement can really be made here.
You can't explain how an LLM does what it does, and you can't explain how humans do what we do either. With no explanation possible but CLEAR similarities between human responses and LLM responses that pass Turing tests... my hypothesis is actually reasonable.
In theory, with enough data and enough neurons we can conceivably construct an LLM that performs better than humans. Neural nets are supposed to be able to compute anything anyway. So none of what I said is unreasonable.
The problem I have with your claim is that it assumes humans use language the way that an LLM does. Humans don’t live in a world of language, they live in the world. When you teach kids vocabulary you point to objects in the environment. Our minds, as a consequence, don’t bottom out at language; we draw on language as a pointer into mental concepts built on sensory experience. LLMs don’t reference something, they’re a crystallization of language’s approximate structure. How do they implement this structure? I dunno, but I do know that they aren’t going to do much more than that because it isn’t rewarded during training. We almost certainly possess something like an LLM in our heads to help structure language, but we also have so, so much more going on up there.
You made a bunch of claims here but you can’t prove any of them to be true.
Also you are categorically wrong about language. LLMs despite the name go well beyond language. LLMs can generate images and sound and analyze them too. They are trained on images and sound. Try ChatGPT.
> Human cognition is much more than just our ability to string sentences together
Animals without similar language capabilities don't seem to be too strong at reasoning. It could well be that language and reasoning are heavily linked together.