>> Behind this approach is a simple principle often abbreviated as "compression is intelligence", or the model must approximate the distribution of data and perform implicit reasoning in its activations in order to predict the next token (see Solomonoff Induction; Solomonoff 1964)
For the record, the word "intelligence" appears in the two parts of "A Formal Theory of Inductive Inference" (referenced above) [1][2] a total of 0 times. The word "compression" appears a total of 0 times. The word "reasoning" appears once, in the phrase "using similar reasoning".
Unsurprisingly, Solomonoff's work was preoccupied with Inductive Inference. I don't know that he ever said anything about "compression is intelligence"; I believe this is an idea, and a slogan, that was developed only much later. I am not sure where it comes from originally.
It is correct that Solomonoff induction was very much about predicting the next symbol in a sequence of symbols, and not necessarily linguistic tokens, either. The common claim that LLMs are "in their infancy", or similar, is dead wrong. Language modelling is basically ancient (in CS terms) and we have long since crossed into the era of its technological maturity.
It makes perfect sense that intelligence is a form of compression. An inductive model is small but can potentially generate arbitrary amounts of information.
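To make that concrete, here is a minimal Python sketch. It is my own toy illustration, not anything taken from Solomonoff's papers: the names RULE, generate and report are made up for the example, and the "model" is just a one-line rule that writes out the integers 1..n back to back. The rule stays the same size no matter how much output you ask for, while a general-purpose compressor (zlib here) that does not know the rule needs far more bytes to describe the same output.

```python
import zlib

# Hypothetical "model": a one-line rule kept as source text so its size can be measured.
RULE = "''.join(str(i) for i in range(1, n + 1))"


def generate(n: int) -> str:
    """Expand the rule: the integers 1..n written back to back."""
    return eval(RULE, {"n": n})


def report(n: int) -> tuple[int, int, int]:
    """Return (size of the rule, size of its output, size of the zlib-compressed output), in bytes."""
    data = generate(n).encode("ascii")
    return len(RULE), len(data), len(zlib.compress(data, 9))


if __name__ == "__main__":
    for n in (1_000, 100_000):
        rule_size, raw_size, packed_size = report(n)
        print(f"n={n}: rule={rule_size} B, output={raw_size} B, zlib={packed_size} B")
```

The output grows without bound as n grows, but the rule does not. That is the sense in which a small model can "generate arbitrary amounts of information"; it says nothing about how hard it is to find such a rule from the data in the first place.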
_______________
[1] https://raysolomonoff.com/publications/1964pt1.pdf
[2] https://raysolomonoff.com/publications/1964pt2.pdf