This is a whole lot of words to say "because it just arranges words in roughly the same order as other stuff that was fed into it. It doesn't know or reason about anything."
I have yet to see any explanation more useful or apparently accurate than this one.
This notion of innate concepts of "to know" and the ability to reason smells slightly of linguistics prior to AI (of various kinds), i.e. "Grammar is innate, computers can't do [something]" -> computers now do it.
There are definitely going to be contexts that Transformers just don't work with very well at all, but the idea that you can't get a very good statistical approximation to knowing and reasoning via a computer seems naively prone to anthropocentrism.
Conversely, the idea that concepts like "reasoning" and "knowing" can be approximated by language models seems like a naive result of anthropomorphism.
It was created to be a tool to estimate the next token in a series based on its training data. To say that reasoning and knowing can be approximated in the same way says less about the language models themselves and more about the relationship of "reasoning" and "knowing" to "language".
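To make "estimate the next token in a series based on its training data" concrete, here is a minimal sketch in Python. It is a toy bigram counter, not a transformer; the names `train_bigram` and `next_token_probs` and the tiny corpus are made up for illustration. It only shows what that estimation objective looks like in its crudest form.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token (hypothetical helper)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into estimated probabilities."""
    following = counts.get(prev)
    if not following:
        # Token never seen with a successor in the training data: no estimate.
        return {}
    total = sum(following.values())
    return {tok: n / total for tok, n in following.items()}


# Illustrative "training data"; a real model sees vastly more text.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
probs = next_token_probs(model, "the")
print(probs)
```

A real language model replaces the lookup table with a learned function over the whole preceding context, but the output has the same shape: a probability for each candidate next token.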
That's why I think discussions on whether or not GPT-x can reason/know should be taken as seriously as discussions on the physics of torch drives. They seem to assume that a relationship between statistical approximation, reasoning, and language exists, which isn't proven, much like torch drive discussion assumes working nuclear fusion.
Essentially, I think dismissing the idea of statistical approximations via transformers being able to "reason" and "know" is about as anthropocentric as dismissing the idea that collective consciousnesses should be granted individual rights. There are a lot of things we need to know and decide before we can even start thinking about what that means.
It's hardly anthropocentrism though. Even a student who has studied some epistemology understands that there is a very big difference between pattern matching and operating based on formal logic, and that both are different from operating based on known concepts inferred from perceptual cues.
The reality is LLMs are fantastic at pattern matching and knowledge retrieval (with caveats) but struggle with problems involving uncertainty. Yann LeCun actually has had some great posts on the subject if you're interested.
I've said something like this before, but yes -- you can make a computer sound a lot like a human. Like A LOT.
Also, a really good sculptor can make a statue that looks a LOT like a human. A lot. Good enough to fool people. But so what?
I'm not saying "AI" isn't a big deal. I think it is -- perhaps on the order of the invention of the movie, or the book, or the video game. But I also think those are still FAR from "living beings" or anything LIKE "living beings."
> the idea that you can't get a very good statistical approximation to knowing and reasoning via a computer seems naively prone to anthropocentrism.
I don't think this is being questioned in general right now, but rather the claim is:
You can't get a very good statistical approximation to knowing and reasoning via _just analyzing the language_.
Language is evidently not enough on its own [1]. According to some researchers [2], the system needs to be "grounded" (think of it as being given common sense). That said, there's apparently no consensus [3] among scientists on how to _fundamentally_ solve the shortcomings of current systems.
I suppose I could go further. I don't think anthropocentrism is bad; I think it's actually a VERY GOOD sort of null hypothesis? I'm really comfortable rolling with it given the following: for years and years human beings have been anthropomorphizing non-human things but, with the probable exception of other actual animals at times, nothing has really come close -- despite a whole lot of people suggesting/feeling otherwise. Absent some REALLY impressive evidence (which this is not; it's relatively easy to grok what's going on here) I see no reason not to roll with the null hypothesis of "humans, in fact, are special."
I asked ChatGPT "Why are you so good at making things up?"
Its response is almost exactly what you just said.
> As an AI language model, I am not capable of "making things up" in the traditional sense. Rather, I am designed to generate text based on patterns and relationships that I have learned from the vast amount of language data that I have been trained on. My ability to generate coherent and believable text comes from the sophisticated algorithms and neural networks that power my language processing capabilities. These algorithms enable me to understand the structure and meaning of language, and to generate text that is syntactically and semantically correct. While I may sometimes generate responses that are creative or unexpected, everything I produce is ultimately grounded in the language data that I have learned from.
It is capable of doing tasks that could not possibly be in its training set. I guess this doesn't technically contradict your explanation, but it makes your explanation entirely unhelpful. Even if the AI doomers are somehow right and GPT-5 turns into skynet, we still could not categorically prove that it is doing reasoning.