There's nothing general about GPT-4's intelligence. The single task it is trained on, token prediction, just happens to mimic many other forms of intelligence well.
Famously, GPT-4 can't do math and falls flat on a variety of simple logic puzzles. It can mimic the form of math, and the series of tokens it produces seems plausible, but it has no "intelligent" capabilities.
This tells us more about the nature of our other pursuits as humans than anything about AI. When holding a conversation or editing an essay, there's a broad spectrum of possibilities that might be considered "correct", so GPT-4 can "bluff" its way into appearing intelligent. Its actual capability, token prediction, is indistinguishable from the reading comprehension skills tested by something like the LSAT (the argument could be made, I think, that reading comprehension of the style the LSAT tests *is* just token prediction).
But test it on something where there are objectively correct and incorrect answers and the nature of the trick becomes obvious. It has no ability to verify or reason about even trivial problems. GPT-4 can only predict whether its tokens fit the form of a correct answer. This isn't a general intelligence in any meaningful sense of the word.
I asked ChatGPT to prove that the set of all integers is uncountable (it isn't). What's interesting is that ChatGPT not only spat out the classic diagonalization proof, rephrased around the integers where it doesn't apply, but ended with "This may seem counterintuitive, because we know that the integers are countable, but the proof clearly shows that they are uncountable."
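For what it's worth, the exact spot where the argument breaks for the integers is easy to pin down; here's a rough sketch (my own summary, not something the model produced):

```latex
% Sketch: where diagonalization fails for the integers.
Suppose every integer appears in a list $n_1, n_2, n_3, \dots$, and write each
one as a digit string padded with leading zeros,
\[
  n_k = \cdots d_{k,3}\, d_{k,2}\, d_{k,1}.
\]
The diagonal step builds a string $D$ with $D_j \neq d_{j,j}$ for every $j$, so
$D$ differs from each $n_k$ in at least one digit. For the reals, $D$ is itself
a real number missing from the list, which is the contradiction. For the
integers, $D$ cannot be an integer: if it were, it would be some $n_m$, yet it
differs from $n_m$ in digit $m$. So $D$ is a string with infinitely many
significant digits, not an integer, and its absence from the list contradicts
nothing.
```

ChatGPT reproduced the form of the proof without ever making that check.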
Not only will ChatGPT mess up math on its own, but if you ask it to mess up math, rather than refusing, it cheerfully obliges.
Well, I mean, it's a pretty fair accusation. ChatGPT was demonstrably bad at math. I think it was only recently mentioned that GPT-4 was trained on math. Furthermore, consider what it means to apply the transformer architecture to math problems. I think the tool is a mismatch for the problem. You're relying on self-attention and emergent phenomena to fake computational reduction as symbol transformations. It can probably do some basic maths (all the way up to calculus even) because, in the scope of human life, the maths we deal with are pretty boring. But that's what they made the Wolfram plugin for as well.
I really think people attribute to GPT powers beyond what it really is: a colossal lookup table with great key aliasing.
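To make the "lookup table" point concrete, here's a toy sketch (entirely my own illustration, with made-up training text, and not how GPT works internally): a next-token "predictor" built from prefix counts emits whatever continuation it has seen most often, whether or not the arithmetic is right, while actually evaluating the expression is a different operation altogether.

```python
from collections import Counter, defaultdict

# Toy "colossal lookup table": predict the next token purely from how
# often it followed the same prefix in the (made-up) training text.
training_text = [
    "2 + 2 = 4", "2 + 2 = 4",
    "7 * 8 = 54", "7 * 8 = 54", "7 * 8 = 56",  # the wrong answer dominates the data
]

table = defaultdict(Counter)
for line in training_text:
    tokens = line.split()
    for i in range(1, len(tokens)):
        table[tuple(tokens[:i])][tokens[i]] += 1

def predict(prompt: str) -> str:
    """Return the continuation seen most often after this prefix."""
    counts = table[tuple(prompt.split())]
    return counts.most_common(1)[0][0] if counts else "?"

def compute(prompt: str) -> str:
    """Actually evaluate the arithmetic to the left of '='."""
    return str(eval(prompt.split("=")[0]))  # fine for this toy; never eval untrusted input

print(predict("7 * 8 ="))  # "54" -- whatever the data said most often
print(compute("7 * 8 ="))  # "56" -- what the arithmetic actually says
```

The predictor's answer tracks the statistics of its corpus, not the math, which is roughly the failure mode people keep running into.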