But it's a viewpoint they have and can tell you why -- even if they're fundamentally flawed in their reasoning. LLMs are just 'predict the next word' machines and as such just literally make up strings of words that sound plausible, but are totally wrong.
People keep repeating that LLMs are predicting the next word, but at least with the more recent versions, this isn't true. E.g., LLMs are generating their own intermediate or emergent goals; they're reasoning in a way that is more complex than autocomplete.
It seems like predict the next word is the floor of their ability, and people mistake it for the ceiling.
But ultimately it is predicting the next token. That's the task. Using context from what's already been predicted, what comes before it, attention mechanisms to know how words relate, all of the intermediate embeddings and whatever they signify about the world -- all of that just makes the next-word prediction that much better.
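To make that concrete, here's a minimal greedy-decoding loop -- a sketch only, assuming the Hugging Face transformers library and the public gpt2 checkpoint, not how any particular production system decodes. The point is that all the attention and embedding machinery lives inside one forward call, and the only thing that comes out of it is a probability distribution over the next token.

    # Minimal autoregressive "predict the next token" loop (illustrative sketch).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
    for _ in range(10):
        with torch.no_grad():
            logits = model(ids).logits        # attention, embeddings, etc. all happen in here
        next_id = logits[0, -1].argmax()      # greedy: take the most probable next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)

    print(tokenizer.decode(ids[0]))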
But intelligence *is* being able to make predictions! That's the entire reason we evolved intelligence! (We evolved to predict the world around us, not words, sure, but apparently language makes a pretty good map.)
Prediction is a fraction of cognition. There’s a theory of self, perception, sensory fusion, incremental learning, emotions, a world model, communication and a sense of consequences, desire for self-preservation and advancement, self-analysis and reflection, goal setting, reward-driven behavior, and so many more aspects that are missing from “predict the next word.”
You are confusing the underlying algorithm, such as prediction improved by gradient optimization, with the algorithms that get learned based on that.
Such as all the functional relationships between concepts that end up being modeled, i.e. “understood” and applicable. Those complex relationships are what gets learned in order to accomplish the prediction of complex phenomena, like real conversations and text, about every sort of concept or experience that people have.
Deep learning architectures don’t just capture associations, correlations, conditional probabilities, Markov chains, etc. They learn whatever functional relationships that are in the data.
(Technically, neural network style models are considered “universal approximators” and have the ability to model any function given enough parameters, data and computation.)
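(For reference, the classical statement behind that claim, due to Cybenko and Hornik, is roughly the following, where σ is a sigmoid-like activation and the weights are the "enough parameters":)

    % Universal approximation, one hidden layer (informal statement):
    % for any continuous f on a compact K \subset \mathbb{R}^n and any \varepsilon > 0,
    % there exist N, \alpha_i, w_i, b_i such that
    \left| \, f(x) - \sum_{i=1}^{N} \alpha_i \, \sigma\!\left( w_i^{\top} x + b_i \right) \right| < \varepsilon
    \quad \text{for all } x \in K.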
Your neurons and your mind/knowledge have exactly the same relationship.
Simple learning algorithms can learn complex algorithms. Saying all they can do is the simple algorithm is very misleading.
It would be like saying logic circuits can only do logic: ANDs, ORs, NOTs. But not realizing that this includes the ability to perform every possible algorithm.
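A toy illustration of that point (the names and the 8-bit width are just choices for the sketch): a ripple-carry adder built from nothing but AND, OR, and NOT, i.e. arithmetic emerging from "only logic."

    # Gates compose into algorithms: 8-bit addition from AND/OR/NOT alone.
    AND = lambda a, b: a & b
    OR  = lambda a, b: a | b
    NOT = lambda a: 1 - a
    XOR = lambda a, b: OR(AND(a, NOT(b)), AND(NOT(a), b))   # derived, not primitive

    def full_adder(a, b, carry_in):
        s = XOR(a, b)
        return XOR(s, carry_in), OR(AND(a, b), AND(s, carry_in))   # (sum bit, carry out)

    def add(x, y, bits=8):
        carry, out = 0, 0
        for i in range(bits):                                # ripple the carry bit by bit
            s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
            out |= s << i
        return out

    print(add(23, 42))   # 65 -- arithmetic out of pure Boolean logic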
And how many of those are obvious applications of prediction, where prediction is the hard part?
World model: This is what prediction is based on. That's what models are for.
Sense of consequences: prediction of those consequences, obviously.
Desire for self preservation: prediction; avoiding world states predicted to be detrimental to achieving one's goals.
Goal setting: prediction; predicting which subgoals steer the world towards achieving one's supergoal(s).
Reward-driven behavior: fundamentally intertwined with prediction. Not only is it all about predicting which behaviors are rewarded, the reward (or lack thereof) is then used to update the agent's model to make better predictions (a toy sketch of that update is below).
There's even a theory of cognition that all motor control is based on prediction: the brain first predicts a desired state of the world, and the nervous system then controls the muscles to fulfill that prediction!
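To make the reward point above concrete, here's a temporal-difference toy sketch (the state names, reward, and constants are all invented for illustration): the learning signal is literally a prediction error.

    # Reward-driven learning as prediction: a TD(0)-style update.
    values = {"s0": 0.0, "s1": 0.0}    # predicted long-run reward for each state
    alpha, gamma = 0.1, 0.9            # learning rate, discount factor

    def td_update(state, reward, next_state):
        # prediction error: what we observed (plus what we now predict) minus what we predicted
        error = reward + gamma * values[next_state] - values[state]
        values[state] += alpha * error
        return error

    for _ in range(50):                # repeatedly experience s0 -> s1 with reward 1
        td_update("s0", 1.0, "s1")
    print(values["s0"])                # the prediction creeps toward the true return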
It does matter, because the flat earther isn't likely to make something up about everything they talk about. They can communicate their world view, and you quickly start to build a model of it as you talk to them. None of that is true with an LLM. Any subject matter at all (astronomy, weather, cooking, NFL games, delegate callback methods on iOS classes, restaurants, etc.) can have completely plausible-sounding falsehoods stated as extremely confident fact, and you cannot build a mental model of when it will hallucinate versus be accurate. 100% different from a human who holds a belief system that may be contrary to evidence in a limited domain, and KNOWS that it's an outlier from the norm.
Fair enough. Your point is valid and I hate to be that person, but...
> It does matter, because the flat earther isn't likely to make something up about everything they talk about.
I am less optimistic about this. It seems to me you are vastly overestimating the average person's rationality. Rational types are an overwhelming minority. It always amazes me how even my own thin layer of rationality breaks down so very fast. I used to think we live on top of vast mountains of rationality, but now I feel more like we, deep down, are vast ancient Lovecraftian monsters with a thin layer of human veneer.
I'm not arguing that LLMs today are comparable to how humans can maintain a perspective and contain their own "hallucinations", but I am arguing that it is a matter of quantity, not quality. It's a matter of time (IMO).
If you ask a flat earther where they recommend eating, they’re not going to interweave restaurants that exist with restaurants that don’t, but have plausible sounding restaurant names. Or if you ask for the web address of those restaurants, the flat earther will say “I don’t know, google it.” They won’t just make up plausible sounding URLs that don’t actually exist.
Hallucinations for LLMs are at a different level and touch every subject matter. Because it’s all just “predict the next word,” not “predict the next word, but only if it makes sense to do so, and if it doesn’t, say you’re not sure.”
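As a toy illustration of why that second clause isn't there: gating on the next-token probability, as in this hypothetical sketch, doesn't buy it for you, because a model can be perfectly "confident" in fluent text that is factually wrong (step_fn and the threshold are stand-ins for illustration, not any real API).

    # Naive "say you're unsure" gate on next-token probability (hypothetical sketch).
    def decode(step_fn, prompt, threshold=0.5, max_tokens=3):
        """step_fn(text) -> (next_token, probability); a stand-in for a real model."""
        text = prompt
        for _ in range(max_tokens):
            token, prob = step_fn(text)
            if prob < threshold:           # never fires when the model is confidently wrong
                return text + " [unsure]"
            text += token
        return text

    # A stand-in model that is always confident, right or wrong:
    print(decode(lambda t: (" Paris", 0.97), "The capital of Australia is", max_tokens=1))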
I understand, it’s a failure mode unique to LLMs. What I mean is that it has no bearing on intelligence. Humans have failure modes too, often quite weird and surprising ones, but they are different. It’s just that we’re biased and used to them.
These are not the same thing.