It likely is a fact, but we don't really know what we mean by "think".
LLMs have illuminated this point from a relatively new direction: we do not know if their mechanism(s) for language generation are similar to our own, or not.
We don't really understand the relationship between "reasoning" and "thinking". We don't really understand the difference between Kahneman's "fast" and "slow" thinking.
Something happens, probably in our brains, that we experience and that seems causally prior to some of our behavior. We call it thinking, but we don't know much about what it actually is.
I don't think it's useful or even interesting to talk about AI in relation to how humans think, or whether or not they will be "conscious", whatever that might mean.
AIs are not going to be like humans, because they will have perfect recall of a massive database of facts and will be able to do math well beyond any human brain.
The interesting question to me is: when will we be able to give an AI very large tasks, and when will it be able to break those tasks down into smaller and smaller tasks and complete them?
When will it be able to set its own goals, and know when it has achieved them?
When will it be able to recognize that it doesn't know something and do the work to fill in the blanks?
I get the impression that LLMs don't really know what they are saying at the moment, so they don't have any way to test whether what they are saying is true or not.
There was a study where they trained a model to lie. When they looked at what was happening internally, they could see that it knew the truth but just switched things at the output layer to lie.
I think we have a pretty good idea that we are not stochastic parrots - sophisticated or not. Anyone suggesting that we’re running billion parameter models in order to bang out a snarky comment is probably trying to sell you something (and crypto’s likely involved.)
I think you’re right: LLMs have demonstrated that relatively sophisticated mathematics involving billions of params and an internet full of training data is capable of some truly, truly remarkable things. But as Penrose is saying, there are provable limits to computation. If we’re going to assume that intelligence as we experience it is computable, then Gödel’s theorem (and, frankly, the field of mathematics) seems to present a problem.
I've never had any time for Penrose. Gödel’s theorem "merely" asserts that in any consistent system capable of expressing basic arithmetic there are statements which are true but not provable within that system. What this has to do with (a) limits to computation or (b) human intelligence has never been clear to me, despite four decades or more of interest in the topic. There's no reason I can see why we should think that humans are somehow without computational limits. Whether our limits correspond to Gödel’s theorem or not is mildly interesting, but not really foundational from my perspective.
At the end of the day, Penrose's argument is just Dualism.
Humans have a special thingy that makes the consciousness
Computers do not have the special thingy
Therefore Computers cannot be conscious.
But Dualism gets you laughed at these days, so Dualists have to code their arguments and pretend they aren't into that there Dualism.
Penrose's arguments against AI have always felt to me like special pleading that humans (or, to stretch a bit further, carbon-based lifeforms) are unique.
While I don't like Penrose's argument and I think it stands on very shaky ground, I very much disagree it's a form of dualism. His argument is simply that human thinking is not reducible to a Turing machine, that it is a form of hyper-computation.
If this were to be true, it would follow that computers as we build them today would fundamentally not be able to match human problem-solving. But it would not follow, in any way, that it would be impossible to build "hyper computers" that do. It just means you wouldn't have any chance of getting there with current technology.
Now, I don't think Penrose's arguments for why he thinks this is the case are very strong. But they're definitely not mystical dualistic arguments, they're completely materialistic mathematical arguments. I think he leans towards an idea that quantum mechanics has a way of making more-than-Turing computation happen (note that this is not about what we call quantum computers, which are fully Turing-equivalent systems, just more efficient for certain problems), and that this is how our brains actually function.
> I think he leans towards an idea that quantum mechanics has a way of making more-than-Turing computation happen (note that this is not about what we call quantum computers, which are fully Turing-equivalent systems, just more efficient for certain problems), and that this is how our brains actually function.
That was my understanding of Penrose's position as well, which is just a "Consciousness of the Gaps" argument. As we learn more about quantum operations, the space for Consciousness as a special property of humans disappears.
Penrose doesn’t think that consciousness is special to humans. He thinks most animals have it and more importantly to your point, he thinks that there is no reason that we won’t someday construct artificial creations that have it.
I just watched an interview where he made that exact statement nearly word for word.
His only argument is that it is not computable, not that it’s not physical. He does think the physical part involves the collapse of the wave function due to gravity, and that somehow the human brain is interacting with that.
So to produce consciousness, in his view, you’d need to construct something capable of interacting with the quantum world the same way he believes organic brains do (or something similar to it). A simulation of the human brain wouldn’t do it.
He proposed a proof of Platonism: the Mandelbrot set has a stable form and is not subjective, because it doesn't fit in a human mind due to its sheer complexity; consequently, it exists objectively. His beliefs are pretty transparent.
> I think we have a pretty good idea that we are not stochastic parrots - sophisticated or not. Anyone suggesting that we’re running billion parameter models
On the contrary, we have 86B neurons in the brain; the weighting of the connections is the important thing, but we are definitely 'running' a model with many billions of parameters to produce our output.
The theory that the brain works mainly by predicting its next state is called predictive coding, and I would say that I find it pretty plausible. At the very least, we are a long way from knowing for certain that we don't work in this way.
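To make that concrete, predictive coding is usually described as a loop: a generative model predicts its own inputs, and both the internal state and the weights get nudged to shrink the prediction error. Here's a minimal numpy sketch of that idea (a toy of my own, not a model of any real neural circuit; the dimensions and learning rates are arbitrary):

```python
import numpy as np

# Toy predictive-coding loop (illustrative only): a latent state z generates
# a prediction W @ z of the observed input x; the prediction error drives
# updates to both the latent state (fast "inference") and the weights
# (slow "learning").
rng = np.random.default_rng(0)
n_obs, n_latent = 16, 4
W = rng.normal(scale=0.1, size=(n_obs, n_latent))  # generative weights
x = rng.normal(size=n_obs)                         # observed input

z = np.zeros(n_latent)                             # latent state
for step in range(200):
    prediction = W @ z
    error = x - prediction                         # prediction error
    z += 0.05 * (W.T @ error)                      # inference: adjust state to reduce error
    W += 0.01 * np.outer(error, z)                 # learning: adjust weights to reduce error

print("remaining prediction error:", np.linalg.norm(x - W @ z))
```

The only point of the sketch is that "predict, compare, correct" is a simple mechanistic loop; whether the brain actually implements something like it is exactly the open question.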
> On the contrary, we have 86B neurons in the brain
The neurons (cells) in even a fruit fly's brain are orders of magnitude more complex than the "neurons" (theoretical concept) in a neural net.
> the weighting of the connections is the important thing
In a neural net, sure.
In a biological brain, many more factors are important: The existence of a pathway. Antagonistic neurotransmitters. NT re-incorporation. NT-binding sensitivity. Excitation potential. Activity of Na/K channels. Moderating enzymes.
Even what we last ate or drank, how rested, old, or hydrated we are, when our last physical activity took place, and all the interactions prior to an input influence how we analyse and integrate it.
> but we are definitely 'running' a model with many billions of parameters to produce our output.
No, we are very definitely not. Many of our mental activities have nothing to do with state prediction at all.
We integrate information.
We exist as a conscious agent in the world. We interact, and by doing so change our own internal state alongside the information we integrate. We are able to, from this, simulate our own actions and those of other agents, and model the world around us, and then model how an interaction with that world would change the model.
We are also able to model abstract concepts both in and outside the world.
We understand what concepts, memories, states, and information mean both as abstract concepts and concrete entities in the universe.
We communicate with other agents, simultaneously changing their states and updating our modeling of their internal state (theory of mind: I know that you know that I know, ...).
We filter, block, change, and create information.
And of course we constantly learn and change the way we do ALL OF THIS, consciously and subconsciously.
> At the very least, we are a long way from knowing for certain that we don't work in this way.
OK, let me be more clear, because I'm not sure what you're arguing against.
If the process in the brain is modellable at all, then it is certainly a model with, at a minimum, many billions of parameters. Your list of additional parameters if anything supports that rather than arguing against it. If you want to argue with that contention, I think you need to argue that the process isn't modellable, which, if you want to talk about burden of proof, would place a huge burden on you. But maybe I misunderstood you. I thought you were saying that it's ludicrous to say we're using as many as billions of parameters, but perhaps you're trying to say that billions is obviously far too small, in which case I agree.
My second point, which is that there's a live theory that prediction may be a core element of our consciousness, was intended as an interesting aside. I don't know how it will stand the test of time, and I certainly don't know if it's correct or not; I intended only to use it to show that the things you seem to think are obvious are not in fact obvious to everyone.
For example, that big list of things that you are using as an argument against prediction doesn't work at all because you don't know whether they are implemented via a predictive process in the brain or not.
It feels like rather than arguing against modellability or large numbers of parameters or prediction, you're arguing against the notion that the human brain is exactly an LLM, a notion so obviously false that I don't think anyone actually holds it.
> Your list of additional parameters if anything supports that rather than arguing against it.
> perhaps you're trying to say that billions is obviously far too small, in which case I agree.
No, it doesn't, and I don't.
The processes that happen in a living brain don't just map to "more params". It doesn't matter how many learnable parameters you have: unless you actually change the paradigm, an LLM or similar construct is incapable of mapping a brain, period. The simple fact that the brain's internal makeup is itself changeable already prevents that.
> prediction may be a core element of our consciousness
No it isn't, and it's trivially easy to show that.
Many meditative techniques exist where people "empty their mind". They don't think or predict anything. Does that stop consciousness? Obviously not.
Can we do prediction? Sure. Is it a "core element", i.e. indispensable for consciousness? No.
I am not a neuroscientist, but I think it's likely that LLMs (with 10s/100s of billions of parameters) and the human brain (with 1-2 orders of magnitude more neural connections[1]) process language in analogous ways. This process is predictive, stochastic, sensitive to constantly-shifting context, etc. IMO this accounts for the "unreasonable effectiveness" of LLMs in many language-related tasks. It's reasonable to call this a form of intelligence (you can measure it, solve problems with it, etc).
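To make "predictive, stochastic, sensitive to constantly-shifting context" concrete, here is a toy sketch of an autoregressive decoding loop (the scoring function is a made-up stand-in, not any real LLM or library API): at each step the model scores every candidate next token given the whole context so far, samples one, and feeds it back in.

```python
import numpy as np

# Toy autoregressive sampler (illustrative only). `score_next` stands in for
# a real language model: it returns unnormalized scores for each candidate
# next token given the entire context so far.
rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "."]

def score_next(context):
    # Hypothetical stand-in: derive deterministic scores from the context.
    # A real LLM computes these scores with billions of parameters.
    seed = abs(hash(tuple(context))) % (2**32)
    return np.random.default_rng(seed).normal(size=len(vocab))

def sample_continuation(prompt, n_tokens, temperature=1.0):
    context = list(prompt)
    for _ in range(n_tokens):
        logits = score_next(context) / temperature   # predictive: score every next token
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        idx = rng.choice(len(vocab), p=probs)        # stochastic: sample, don't just argmax
        context.append(vocab[idx])                   # context-sensitive: feed it back in
    return context

print(sample_continuation(["the", "cat"], n_tokens=4))
```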
But language processing is just one subset of human cognition. There are other layers of human experience like sense-perception, emotion, instinct, etc. – maybe these things could be modeled by additional parameters, maybe not. Additionally, there is consciousness itself, which we still have a poor understanding of (but it's clearly different from intelligence).
So anyway, I think that it's reasonable to say that LLMs implement one subset of human cognition (the part that has to do with how we think in language), but there are many additional "layers" to human experience that they don't currently account for.
Maybe you could say that LLMs are a "model distillation" of human intelligence, at 1-2 orders of magnitude less complexity. Like a smaller model distilled from a larger one, they are good at a lot of things but less able to cover edge cases and accuracy/quality of thinking will suffer the more distilled you go.
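For what it's worth, "distillation" in the ML sense is a concrete procedure: a smaller student model is trained to match the softened output distribution of a larger teacher, typically by minimizing a KL-divergence term. A minimal sketch with toy logits (no real models involved, just the shape of the loss):

```python
import numpy as np

# Knowledge-distillation loss in miniature (toy numbers only): the student
# is pushed to match the teacher's softened output distribution.
def softmax(logits, temperature=1.0):
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    p_teacher = softmax(teacher_logits, temperature)  # soft targets from the big model
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student): how far the student's distribution is from the teacher's
    return float(np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student))))

teacher = [4.0, 1.0, 0.5, -2.0]  # sharp, detailed preferences
student = [2.0, 1.5, 0.4, -0.5]  # coarser approximation
print("distillation loss:", distillation_loss(teacher, student))
```

The analogy is loose, of course: human text isn't a set of teacher logits, but "smaller model trained on the outputs of a bigger process" is the flavor of the comparison.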
We tend to equate "thinking" with intelligence/language/reason thanks to 2500 years of Western philosophy, and I believe that's where a lot of confusion originates in discussions of AI/AGI/etc.
>I am not a neuroscientist, but I think it's likely that LLMs (with 10s of billions of parameters) and the human brain (with 1-2 orders of magnitude more neural connections[1]) process language in analogous ways
Related is the platonic representation hypothesis where models apparently converge to similar representations of relationships between data points.
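That kind of convergence is usually measured with a representational-similarity metric such as linear CKA, which asks whether two models arrange the same set of inputs with a similar relational structure, regardless of embedding dimension. A toy sketch (random features stand in for two models' activations; purely illustrative):

```python
import numpy as np

# Linear CKA (centered kernel alignment): compares the relational structure
# of two sets of embeddings of the same inputs, ignoring rotation and scale.
def linear_cka(X, Y):
    # X: (n_samples, d1) activations from model A; Y: (n_samples, d2) from model B
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return float(hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")))

rng = np.random.default_rng(0)
shared = rng.normal(size=(100, 8))            # latent structure both "models" pick up
emb_a = shared @ rng.normal(size=(8, 32))     # model A's embeddings of 100 inputs
emb_b = shared @ rng.normal(size=(8, 64))     # model B's embeddings of the same inputs
unrelated = rng.normal(size=(100, 64))        # embeddings with no shared structure

print("CKA(A, B):        ", linear_cka(emb_a, emb_b))      # higher: shared structure
print("CKA(A, unrelated):", linear_cka(emb_a, unrelated))  # lower: no shared structure
```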
Interesting. I'm not sure I'd use the term "Platonic" here, because that tends to have implications of mathematical perfection / timelessness / etc. But I do think that the corpuses of human language that we've been feeding to these models contain within them a lot of real information about the objective world (in a statistical, context-dependent way as opposed to a mathematically precise one), and the AIs are surfacing this information.
To put this another way, I think that you can say that much of our own intelligence as humans is embedded in the sum total of the language that we have produced. So the intelligence of LLMs is really our own intelligence reflected back at us (with all the potential for mistakes and biases that we ourselves contain).
Edit: I fed Claude this paper, and "he" pointed out to me that there are several examples of humans developing accurate conceptions of things they could never experience based on language alone. Most readers here are likely familiar with Helen Keller, who became an accomplished thinker and writer in spite of being blind and deaf from infancy (Anne Sullivan taught her language despite great difficulty, and this was Keller's main window to the world). You could also look at the story of Eşref Armağan, a Turkish painter who was blind from birth – he creates recognizable depictions of a world that he learned about through language and non-visual senses.
Try taking any of the LLM models we have, and making it learn (adjust its weights) based on every interaction with it. You'll see it quickly devolves into meaninglessness. And yet we know for sure that this is what happens in our nervous system.
However, this doesn't mean in any way that an LLM might not produce the same or even superior output to what a human would produce in certain very useful circumstances. It just means it functions fundamentally differently on the inside.
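The failure mode being described has a name in the ML literature: catastrophic forgetting. If you keep nudging all of the weights toward whatever came in most recently, with nothing consolidating what was learned before, the old behaviour gets overwritten. A toy illustration with a tiny logistic-regression "model" and deliberately conflicting data (my own made-up example, nothing to do with any particular LLM architecture):

```python
import numpy as np

# Naive continual learning in miniature: a tiny linear classifier is trained
# on task A, then updated example-by-example on conflicting task B data.
# Nothing protects the old weights, so what was learned for A is overwritten.
rng = np.random.default_rng(0)

def make_task(center):
    # Two Gaussian blobs: around +center (label 1) and -center (label 0).
    pos = rng.normal(loc=center, size=(200, 2))
    neg = rng.normal(loc=-center, size=(200, 2))
    return np.vstack([pos, neg]), np.concatenate([np.ones(200), np.zeros(200)])

def sgd_step(w, b, x, y, lr=0.1):
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # logistic-regression prediction
    w += lr * (y - p) * x                   # gradient step on this single example
    b += lr * (y - p)
    return w, b

def accuracy(w, b, X, y):
    return float(np.mean(((X @ w + b) > 0) == (y == 1)))

task_a = make_task(np.array([3.0, 0.0]))    # task A: blob on the +x side is class 1
task_b = make_task(np.array([-3.0, 0.0]))   # task B: the opposite convention

w, b = np.zeros(2), 0.0
for x, y in zip(*task_a):                   # learn task A
    w, b = sgd_step(w, b, x, y)
print("task A accuracy after training on A:", accuracy(w, b, *task_a))

for x, y in zip(*task_b):                   # keep updating on the new stream
    w, b = sgd_step(w, b, x, y)
print("task A accuracy after the B stream: ", accuracy(w, b, *task_a))
```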
Maybe this is just a conversation about what "fundamentally differently" means then.
Obviously the brain isn't running an exact implementation of the attention paper, and your point about how the brain is more malleable than our current LLMs is a good one, but that just proves they aren't the same. I fully expect that future architectures will be more malleable; if you think that such hypothetical future architectures will be fundamentally different from the current ones, then we agree.