Compressionism: A Theory of Mind Based on Data Compression [pdf] (ceur-ws.org)
109 points by optimalsolver 11 days ago | 32 comments

Jurgen Schmidhuber has similar ideas - "History of science is the history of compression progress" https://youtu.be/3FIo6evmweo?t=1537

and (quoted in the paper)


The Arbital entry on Unforeseen Maximums[0] is more pessimistic:

"Juergen Schmidhuber of IDSIA, during the 2009 Singularity Summit, gave a talk proposing that the best and most moral utility function for an AI was the gain in compression of sensory data over time. Schmidhuber gave examples of valuable behaviors he thought this would motivate, like doing science and understanding the universe, or the construction of art and highly aesthetic objects.

Yudkowsky in Q&A suggested that this utility function would instead motivate the construction of external objects that would internally generate random cryptographic secrets, encrypt highly regular streams of 1s and 0s, and then reveal the cryptographic secrets to the AI."

[0] https://arbital.greaterwrong.com/p/unforeseen_maximum/

I finished a short story this year that explores some of these ideas, somewhat heavy-handedly. It is set in a Universal Library, a continuation of Borges' Library of Babel. In a sense it is both for and against the idea of compression as a way of gaining knowledge, attempting to carve out a greyer area.

"Unlike the entirety of the Library, incomprehensible due to its sheer vastness, the book itself was much smaller. Any single one could easily be read hundreds of times over during the lifespan of a librarian. Therefore, the book’s gibberish nature did not come of its own accord but simply sprouted out of its relation to us and our lack of knowledge concerning the Library as a whole. In other words, for any book to make any sense to any librarian, it must have recognizable patterns, like those in language. And since every possible book that could exist did, most of them were simply random, nonsensical strings of characters. Leaving the librarians themselves to search for the language or sets of languages that could bring meaning to them (them:themselves or them:the books no one was sure). This was reflected — maybe the same sort of reflection Tanner was searching for — in the various languages the librarians did know and use; the ever-evolving nature of them, incorporating and condensing larger and larger swaths of ideas that then needed to be condensed even further into shorter acronyms for quicker reference."


I killed it a few days after this post as I am working on submitting it for publication. Which is to say, I am a glutton for punishment and rejection, haha. However, thank you for looking up the archive link. If you do read it, please email me with any notes you may have. Any input, good or bad, would be warmly received.

Prediction and compression as a way to encode data cheaply is a reasonable idea. It's really cheap to encode something that the predictor expected next. This is essentially what GPT-2 and GPT-3 are doing.
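A toy sketch of that cost argument (this is just the standard Shannon code length, not anything specific from the paper): under an ideal entropy coder, a symbol the model assigns probability p costs -log2(p) bits, so well-predicted symbols are nearly free.

```python
import math

def encoding_cost_bits(p_next: float) -> float:
    """Ideal code length, in bits, for a symbol the model assigns probability p_next."""
    return -math.log2(p_next)

# A confident, correct prediction is almost free to encode...
print(encoding_cost_bits(0.99))
# ...while a surprising symbol is expensive.
print(encoding_cost_bits(0.01))
```

This is why a better next-token predictor is, by construction, a better compressor of the text.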

The author seems to be missing some implications of that. One is that a predictor/compressor has a bias towards what it has compressed before. We know that humans tend to over-generalize, and that that's a survival trait. Something like this might be the mechanism behind that.

Over-generalizing means that you will compress worse, as does under-generalizing. You can't use the mere fact that "compression is going on"; you have to look at utility, costs, etc.

How? "All X people behave in Y ways" is an over-generalization and seems to compress very well.

You need to encode the delta between your model and what you observe, and that delta will be expensive to encode if your model over-generalizes.
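One hedged, toy way to see this is a two-part (MDL-style) code: total cost = bits to state the model + bits to patch its mistakes (roughly the binary entropy of the error rate per observation). The model sizes and error rates below are made up purely for illustration.

```python
import math

def total_bits(model_bits: float, n: int, error_rate: float) -> float:
    """Two-part code: bits to state the model plus bits to correct its errors.
    Each of n observations needs ~H(error_rate) bits of corrections."""
    p = error_rate
    if p in (0.0, 1.0):
        h = 0.0  # a perfectly right (or perfectly wrong) model needs no entropy per symbol
    else:
        h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return model_bits + n * h

n = 1000
# Over-general rule ("all X behave in Y ways"): tiny model, wrong 30% of the time.
print(total_bits(model_bits=10, n=n, error_rate=0.30))
# Richer model: more expensive to state, but wrong only 2% of the time.
print(total_bits(model_bits=200, n=n, error_rate=0.02))
```

With enough observations the richer model wins: the over-general rule's cheap description is swamped by the cost of encoding its mistakes.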

Thanks, your comment made me understand a tiny bit more of the machine-learning lingo. I think. :-D

Another useful concept of memory organization is what's called a "schema":

> A schema is any pattern of relationships among data stored in memory. It is any set of nodes and links between them in the spider web of memory that hang together so strongly that they can be retrieved and used more or less as a single unit. [1]

When thinking about something we naturally bring together plenty of related information, so the "compression" described in the paper must somehow be capable of aggregating these relationships into the data as well, prior to compressing it.

[1] Richards J. Heuer Jr. Psychology of Intelligence Analysis. Center for the Study of Intelligence, 1999.

Well I think that the compression part would be storing some type of abstract subgraph that could then be referenced or re-used for similar situations rather than being duplicated.

If neurons that fire together wire together, can we assume there's a shared subprocess, and that this is just a type of compression?

Related short story: Kolmogorov's AI


When I see passages like that quoted below, I wonder if the authors are falling for the economist's fallacy, assuming that objective, quantifiable, repeatable measures are ipso facto measuring what they want to measure. I have not seen anything in this paper persuading me that understanding is a consequence of compression, rather than vice-versa.

A weakness of the Turing test (Turing, 1950) is that a program might pass the test simply by exploiting weaknesses in human psychology. If a given system passes the test we cannot be sure if it was because of the quality of the responses or the gullibility of the judge. In contrast, Hutter’s compression test is more reliable. The more that data is compressed, the harder it becomes to compress it further (Chaitin, 2006). Because there is no way to cheat by using a simple heuristic, data compression presents a reliably hard standard. We argue that this process of identifying deep patterns through compression is what people mean when they attribute both ‘intelligence’ and ‘consciousness’.
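Chaitin's point - that well-compressed data resists further compression - is easy to demonstrate with an off-the-shelf compressor (zlib here, purely as an illustration):

```python
import os
import zlib

def ratio(data: bytes) -> float:
    """Compressed size over original size; below 1.0 means real compression."""
    return len(zlib.compress(data, 9)) / len(data)

regular = b"0101" * 4096   # highly patterned stream
random_ = os.urandom(16384)  # incompressible noise

print(ratio(regular))  # far below 1.0
print(ratio(random_))  # around 1.0 (random data can't be compressed)

# Compressing already-compressed data gains essentially nothing:
once = zlib.compress(regular, 9)
print(len(zlib.compress(once, 9)) / len(once))
```

The patterned stream shrinks dramatically; the random one and the already-compressed one do not. That asymmetry is what makes compression a hard-to-cheat benchmark.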

Compression is important to processes going on in the mind, but I don't think it's a central concept for understanding how the mind works. Compression is relevant inasmuch as our brains perform lossy compression to capture the relevant regularities in our sense data. But this is only the start of the analysis; it doesn't explain anything useful about our cognitive architecture. The problem is that strong compression trades space requirements for computational requirements. Going all-in on compression is optimizing for the wrong thing in the context of a bag-of-neurons computational model.

von Neumann architectures are good at fast and precise state transitions. A bag-of-neurons is good at leveraging relationships to minimize computational requirements. Thus a plausible cognitive architecture for the mind should emphasize relationships rather than computation. A good cognitive architecture is one where relationships in the data are maximally revealed by the architecture, thus minimizing the computational burden to access and utilize the relationship. This explains why the visual system takes up so much volume of the brain. 3-dimensional sense data is packed with interrelationships and these relationships need support from the cognitive architecture to be utilized. The compression view is only useful to a point--after that the constraints on the bag-of-neurons model become dominant and must drive the architecture search.

When it comes to phenomenal consciousness, the question then becomes: is there anything it is like to be a process whose topological structure maximally captures external and internal state relationships? My intuition tells me the answer is yes. One functional requirement of a human brain is that it `believes` it is the author of its decisions. To manifest this `belief` requires a self-representation of itself distinct from the environment, i.e. a target of the attribution of authorship. Facts of this self-representation entail what it is like to be this topological structure.

Compression only cares about representing a set of data, while the brain also needs to account for future experiences. Thus brain representations need to also include information that is useless in the present but could be useful in the future. Compression is definitely a part of the story but not the whole.

When I see a citation to Integrated Information Theory I usually hit back two or three times.

The axioms of it don't look terribly different from those of the Buddhist functional model:


This seems like a reasonable next step after considering the failure of Tononi's IIT. IIT has the "problem" that a diode is a little conscious; the authors embrace and extend this "problematic" viewpoint to cover the informal popular idea that compression is knowledge.

Their view on the Hard Problem is not new. They say that the main trick to the Hard Problem is compressing information about a "self" which is repeatedly re-quantified. However, they then must rely on some notion of human individuality in order to explain why qualia are subjective.

The only problems I have with this approach are the normal criticisms of panpsychic approaches: it's hard to observe, there are few useful predictions, rocks are a little conscious, etc. However, rocks are a little conscious, or at least a little alive, thanks to lithophile bacteria which form microscopic filaments threading through the rock; this wasn't known in Berkeley's time!

That doesn't mean rocks are conscious, it means the things that live on rocks are conscious - an interesting idea that appears to scale well.

There is no CS solution to the hard problem, because the hard problem is not about information, or behaviour, or data compression, or rocks. It's about subjectivity, and there isn't even a working definition of what subjectivity is, never mind any experiment that can be done to confirm that it exists.

Subjectivity is essentially metaphysical in the sense that it's outside of science.

It's impossible to make a testable statement about any phenomenon of any kind that isn't filtered through human perception and all the layers of human cognitive processing.

In a very real sense it's objectivity that's the illusion. Try to make a statement about anything whatsoever that is truly independent of collective human sense perception and human mental processes and see how far you get.

Whatever you think you're doing in science or math, you're looking out through a distorted window whose properties you're not aware of, and trying to correlate your observations with others looking out through windows with similar distortions.

The best you'll get is an interesting list of distortions which everyone agrees on. Even if they have predictive power, that doesn't make them truly objective - it just makes them a spiky median of collective experience instead of an outlier of individual experience.

Though to some extent the problem is purely metaphysical and as irresolvable as the question of divinity or Boltzmann brains, I still think it is a question that is fundamentally inaccessible to science in all its aspects.

Let's assume that at some point in the distant future we have a working understanding of animal and human cognition, to the extent that we can build a program that works as well as a dog or even a human being, and that we can also probe its functioning in a meaningful way, perhaps in a manner similar to how we can today model a complex fluid or a large mechanical system (that is, we understand the general idea, but not all aspects of it).

If we could do all of the above, then we could model and alter states, and identify which "brain" states correspond to which subjective experiences - assuming of course that there is no ghost in the machine. This could be considered reductionist, and it would not be a fully satisfying answer to the hard problem of consciousness - just as a theory of everything or a perfect model of the formation of the universe wouldn't actually settle the question of divinity, though it might make it seem unnecessary to many people.

I agree with your direction, but have qualms with your points.

Suppose for absurdity I replay my original argument, but with humans, and you reply "that doesn't mean humans are conscious; it means that the things which are housed within skulls are conscious". I understand the nuance you're introducing, but don't think that it helps. Part of the difficulty of the Combination Problem is that the boundary of conscious control extends beyond the nexus of thought. I'll accept that rocks are just the substrate, but integrated circuits are just carefully cooked rocks; substrate-vs-consciousness thinking might be wrong.

I agree with you that science is done empirically, and thus doesn't experience objectivity. However, maths is quite formal, and in the past century we managed to achieve "formal formality" with category theory. We now can talk of certain objects as existing universally; in their context, not only do they exist, but they uniquely exist. For example, `1+1=2`, which is to say that there's an equivalence between any pair of objects selected one by one, and any pair taken both at a time; for any particular counting context, there's only one natural numbers object.
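As an aside, the humble end of that formality claim can actually be machine-checked; in the Lean proof assistant, `1 + 1 = 2` holds for the natural numbers by definitional computation (a toy illustration, not a category-theoretic universal-property proof):

```lean
-- 1 + 1 reduces to 2 by unfolding the definition of addition on ℕ,
-- so reflexivity closes the goal.
example : 1 + 1 = 2 := rfl
```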

I am not quite as pessimistic as you regarding the quality of the conclusions that we reach. Rather than saying that our filters make us subjective, I would simply say that the experiences of humanity are essentially human. There's an anthropic bias because humans can't experience anything that isn't human; we can't think truly alien ideas or build truly alien artifacts. Everything we think is non-human is actually human. This doesn't prevent us from drawing conclusions, but it does forbid objectivity.

> I'll accept that rocks are just the substrate, but integrated circuits are just carefully cooked rocks; substrate-vs-consciousness thinking might be wrong

It's a side point, but I think the important difference is that a rock can exist even without bacteria (imagine the rock is somewhere close to the earth's core, just below the temperature that would melt it into lava), while the bacteria on the rock could also exist independently of the rock (they could be blown by the wind and moved to a different rock, for example). So it makes sense to discuss the amount of consciousness in the (sterile) rock itself, and in the (floating) bacteria as well, though of course the system of rock+bacteria can have its own amount of consciousness.

This kind of reminds me of the usual interview question: can you make a fair die from an unfair one?

Similarly, we as humans could possibly have the ability to get past some or all of our distorted senses.

I am not talking about every day life here. It's in principle possible to deal with the logical fallacies that we're programmed with, but it requires effort.

What I am more interested in conveying is: our potential for objectivity is troubled by subjectivity. It is, however, not rendered impossible merely on account of our having subjective distortions. Those can be overcome.

There are many other similar analogies that come to mind; a prominent one: digital systems built from purely analog components.
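The fair-die question above has a classic answer for coins - the von Neumann extractor - and a die can be reduced to the same discard-and-retry idea. A minimal sketch (the 0.8 bias below is an arbitrary example):

```python
import random

def fair_bit(biased_flip) -> int:
    """von Neumann extractor: a fair bit from a biased (but independent) coin.
    Flip twice; the pairs (1,0) and (0,1) are equally likely, so keep the
    first flip when the two differ and discard (retry) when they match."""
    while True:
        a, b = biased_flip(), biased_flip()
        if a != b:
            return a

# A coin that lands "1" 80% of the time:
biased = lambda: 1 if random.random() < 0.8 else 0
bits = [fair_bit(biased) for _ in range(10000)]
print(sum(bits) / len(bits))  # close to 0.5
```

The extractor never needs to know the bias, which is what makes it a nice analogy for recovering something unbiased from distorted instruments.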

This is very interesting and reminds me of something I read 4 years ago. I had wondered about data compression related to brain research after I had read this blog: https://probablydance.com/2016/04/30/neural-networks-are-imp...

Sounds plausible, but I'm biased as it agrees with my theory for the origin of consciousness: https://news.ycombinator.com/item?id=23475069

Ever since I learned about autoencoders, I've been saying compression is at the heart of intelligence.
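For anyone curious, the autoencoder connection can be sketched in a few lines: a linear autoencoder's optimal bottleneck spans the principal subspace of the data, which is computed here in closed form via SVD rather than by gradient training (an illustrative shortcut, not a full autoencoder; the data is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
# An elongated 2-D cloud: almost all variance lies along one direction.
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [0.0, 0.3]])

# Closed-form "trained" linear autoencoder with a 1-D bottleneck:
U, S, Vt = np.linalg.svd(X, full_matrices=False)
encode = lambda x: x @ Vt[:1].T  # 2 numbers -> 1 number (compression)
decode = lambda z: z @ Vt[:1]    # 1 number -> 2 numbers (reconstruction)

X_hat = decode(encode(X))
print(np.mean((X - X_hat) ** 2))  # small: most structure survives the bottleneck
```

Halving the representation while losing almost nothing is exactly the lossy-compression-finds-structure intuition.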

True, all thought is abstraction. But is it necessarily compression? Hypostatic abstraction would expand the data, as would building any kind of new relations. Neuromodulators are brain-wide. We can also change our perception.


The full citation seems to be Maguire, Phil, Mulhall, Oisín, Maguire, Rebecca and Taylor, Jessica (2015) Compressionism: A Theory of Mind Based on Data Compression. Proceedings of the 11th International Conference on Cognitive Science. pp. 294-299. ISSN 1613-0073. It’s annoying that there is no date or publication information in the PDF itself.

Is it vegan to pirate WinRar?
