Jane Austen's concept of information (Not Claude Shannon's) (2013) (bham.ac.uk)
73 points by benbreen 8 months ago | 28 comments



Recent work in information theory has formalized the concept of semantic information, whereas Shannon deliberately avoided semantics and focused purely on the quantity of information transmitted.

They define semantic information as information which helps a system stay away from thermodynamic equilibrium (information that doesn't is no different from entropic noise). Information then has meaning for a receiver if it helps them stay alive. The very same message may be completely "meaningless" to another receiver.

https://arxiv.org/abs/1806.08053 Semantic information, autonomous agency, and nonequilibrium statistical physics

> Shannon information theory provides various measures of so-called "syntactic information", which reflect the amount of statistical correlation between systems. In contrast, the concept of "semantic information" refers to those correlations which carry significance or "meaning" for a given system. Semantic information plays an important role in many fields, including biology, cognitive science, and philosophy, and there has been a long-standing interest in formulating a broadly applicable and formal theory of semantic information. In this paper we introduce such a theory. We define semantic information as the syntactic information that a physical system has about its environment which is causally necessary for the system to maintain its own existence. "Causal necessity" is defined in terms of counter-factual interventions which scramble correlations between the system and its environment, while "maintaining existence" is defined in terms of the system's ability to keep itself in a low entropy state. We also use recent results in nonequilibrium statistical physics to analyze semantic information from a thermodynamic point of view. Our framework is grounded in the intrinsic dynamics of a system coupled to an environment, and is applicable to any physical system, living or otherwise. It leads to formal definitions of several concepts that have been intuitively understood to be related to semantic information, including "value of information", "semantic content", and "agency".


You might be interested in this [1]. It sounds like you might be more knowledgeable about this stuff than I am, but Shannon did have some passing interest in the relationship between his definition of information and 'structural' information.

[1] https://en.wikipedia.org/wiki/Macy_conferences


While their definition is in the ballpark, it's not general enough to handle cases where survival isn't on the line, or communication between non-living systems. A similar effort is Haig's paper, which is broadly compatible with the paper you link but general enough to capture semantics more widely: http://philsci-archive.pitt.edu/13287/1/Strange Inversion and Making sense.pdf

>An interpreter is a mechanism that uses information in choice. The capabilities of the interpreter couple an entropy of inputs (uncertainty) to an entropy of outputs (indecision). The first entropy is dispelled by observation (input of information). The second entropy is dispelled by choice of action (output of decision). I propose that an interpreter's response to inputs (information) be considered the meaning of the information for the interpreter. In this conceptual framework, the designed or evolved mechanisms of interpreters provide the much debated link between Shannon information and semantics.


It's a bit absurdly reductionist, imo.

Not all information is acted upon, and not all actions are sense-response, if we're talking 'starting with Biology'.

In any event, if that thesis is in agreement with the prelude's "there can be competence without comprehension", then it follows that without the need for comprehension there is no requirement for any sense of "meaning" in the "competent mechanism".

Meaning, as we humans experience it, is fully bound up with consciousness and sense of self. And sometimes it takes multiple 'readings' of the same input for us to arrive at "comprehension".

If we resort to black box modeling of sentient organisms, which is fine, we should not be projecting our experience of the phenomena of consciousness and sense of self (agency) to black boxes that can already be modeled as computation. There is zero basis for this, beyond our strong tendency as humans for anthropomorphism.

The really interesting question, imho, is whether meaning is secondary to form [which is the prevalent modern orthodoxy], or whether meaning can have an independent existence from form. (The fact that multiple forms can embody the same meaning is a data point.)

(Agree with your first para.)


>It's a bit absurdly reductionist, imo.

The more reductionist, the better imo.

>Meaning, as we humans experience it, is fully bound up with consciousness and sense of self.

And it is important to avoid giving meaning an anthropocentric framing which will constrain the analysis too much and thus miss important phenomena. Why should meaning have some qualitative experience associated with it? All sorts of organisms and processes behave in meaningful ways that suggest meaning is embedded in these systems without consciousness. A successful definition of meaning should explain the widespread appearance of meaning in the world.

>we should not be projecting our experience of the phenomena of consciousness and sense of self (agency) to black boxes that can already be modeled as computation.

But computation is necessarily meaningful! We cannot take some meaningful information, apply an arbitrary sequence of transformations on it, and expect to get meaningful information out. In other words: garbage transformations, garbage out. We see that a program can meaningfully transform an input sequence, and that same program will produce garbage with another input sequence. The context of the program, i.e. what input it is operating on, determines what those operations mean, or whether they mean anything at all.

The fact that nature picks the useful transformations out of the space of garbage, or that a human programmer picks out the useful transformations, needs to be explained. The explanation is that some transformations are meaningful in a given context, and operating in the space of meaning helps guide one through the space of possibility.


I would much prefer it if [for example] one used 'sense' instead of 'meaning'. For example 'the reaction of the processor is the sense of the input'.

Why? Because language hygiene matters. Terms that are strictly defined within a conceptual model but have far broader scope in general usage will inevitably leak between the two senses.

> But computation is necessarily meaningful!

Damn straight, but it is "meaningful!" to the programmer, not the program. :)


>Damn straight, but it is "meaningful!" to the programmer, not the program. :)

But the meaning of the computation exists independent of the programmer. That is, the program doesn't require the existence of a human interpreter to be meaningful. To explain this, it helps to get clear what a computation really is. Essentially, a program takes some input and transforms it to some output. But not just any transformation counts as a computation. What separates transformations that merely increase entropy in the universe from ones that count as computations is the fact that mutual information is preserved across the transformation. The input has mutual information with some system, and the transformation preserves some amount of this mutual information. This is in contrast to destructive transformations where the mutual information is lost entirely.

We can further refine the concept by noticing that computations are revelatory: they can tell us something we didn't know about some external system. The result of a computation is such that some new information is revealed about some external system that was inaccessible prior to the computation. Thus we can define a computation as a process that uses an input signal to reduce entropy (the unknown information) with respect to some external system.

But mutual information and entropy are not concepts that depend on human interpreters.
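
To make that concrete, here is a minimal Python sketch (my own toy sample-based estimators and a hypothetical invertible map, nothing from the papers cited): an invertible transformation of a sensor reading preserves its mutual information with the environment, while a scrambling transformation destroys it.

    from collections import Counter
    from math import log2
    import random

    def entropy(samples):
        # Shannon entropy H(X) in bits, estimated from a list of samples.
        n = len(samples)
        return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

    def mutual_information(xs, ys):
        # I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples.
        return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

    random.seed(0)
    env = [random.randint(0, 3) for _ in range(100_000)]   # "environment" states
    sensor = env[:]                                         # a perfect sensor reading

    # An invertible map (a "computation"): mutual information with env survives.
    computed = [(x * 5 + 1) % 7 for x in sensor]
    # A destructive transformation: the output says nothing about env.
    scrambled = [random.randint(0, 3) for _ in sensor]

    print(mutual_information(env, computed))    # ~2 bits, same as H(env)
    print(mutual_information(env, scrambled))   # ~0 bits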


Are you not describing 'effect' here?

You, the observer, are seeing side effects of the program. You are a 'reader' of these effects. You ascribe 'meaning' to the transformation. The program itself is just a mechanism.

"But the meaning of the computation exists independent of the programmer."

But not outside of an external "observer". I assert that you still require another "mind" to read/note the "meaningful effects"

let's try r/meaning/effect:

"But the [effect] of the computation exists independent of the programmer. That is, the program doesn't require the existence of a human interpreter to be [effective]."

And that is perfectly reasonable and doesn't seem to nullify your further remarks regarding [processes] that preserve mutual information. And it is interesting and informative and we happen to be in agreement.

My beef is with the use of loaded words like "meaning". Your cite (Dennett) certainly doesn't shy away from a very expansive philosophical program:

"This biological re-grounding of much-debated concepts yields a bounty of insights into the nature of meaning and life."

That's borderline theological.

> We can further refine the concept by noticing that computations are revelatory

And this is interesting since I wanted to mention revelatory experiences in context of -zero- input/information. I'm sure you must have had your share of epiphanies and bolt-out-of-the-blue insights.


>I assert that you still require another "mind" to read/note the "meaningful effects"

The issue is less that you need another mind to "note" something than whether two independent minds will come to the same conclusion when undertaking independent analyses. We think mathematics is objective because two intelligent agents with entirely different histories will ultimately discover much of the same mathematical structure. The consequences for mutual information and entropy of a given process would reveal themselves to a sufficiently determined analysis.

The meaning/effect translation works at first glance because computation is a species of causation. The question is, are there features of computation that a general causation description doesn't capture? I think so.

An example is that causation does not require an input with some amount of mutual information; any "input" will do. Remember, mutual information is the shared information between two systems; self-information is mutual information with itself. Everything has self-information, few systems have mutual information. My definition constrains the analysis of causative processes to inputs with mutual information.

As another example, consider the sentence "the program I wrote to invert a matrix has a bug, its output has no meaning/effect". Replace 'meaning' with 'effect' and the sentence completely changes meaning.

>That's borderline theological.

I think you are misreading Dennett's claim here. He's not using "meaning and life" in the sense of grand notions of the human spirit, he means it very matter-of-factly, in the sense of meaning as the content of communication and life as in biological processes. Dennett is about as far from theology and the woo-peddlers as you can get.

>I'm sure you must have had your share of epiphanies and bolt-out-of-the-blue insights.

I would characterize epiphanies like this as the result of sub-conscious processes ruminating on prior inputs from memory.


> The issue is less that you need another mind to "note" something than whether two independent minds will come to the same conclusion when undertaking independent analyses. We think mathematics is objective because two intelligent agents with entirely different histories will ultimately discover much of the same mathematical structure. The consequences for mutual information and entropy of a given process would reveal themselves to a sufficiently determined analysis.

We agree then that additional external elements are required to speak of "meaning". This is what I meant earlier by "absurdly reductionist". I have nothing against the reductionist approach. (It's too bad that hn doesn't have a diagramming element, as I am curious as to your reductionist diagram of a 'mind' that is 'informed' by 'meaningful' processes. Mine would have at least two boxes.)

I find it acceptable to say that: minds can arrive at meaning/meaningful-theories about a/the world by "[analysing] causative processes to inputs with mutual information".

But I still am not on board with some of the loaded word play. For example:

"the program I wrote to invert a matrix has a bug, its output is not the intended effect". It follows that a 'reading' of garbage out would then prompt the analyst to assert that "this result is meaningless".

> I think you are misreading Dennett's claim here.

That did frame my critical posture to an extent. Good to know.


>We agree then that additional external elements are required to speak of "meaning"

I may or may not be willing to grant this depending on how it's cashed out. For example, I have no problem saying that an external element is required to "reveal" the meaning embedded in some system. But that's not the interesting question in my mind. I take the interesting question to be whether or not 'meaning' is objective, i.e. mind independent in some important sense. Do you agree that two independent investigators coming to the same conclusion entails that the content is objective?

>"the program I wrote to invert a matrix has a bug, its output is not the intended effect"

But it still has a causal effect, i.e. it still has some output. 'Intent' is in the vicinity of meaning and so it's sufficiently distinct from a pure causal description.

>But I still am not on board with some of the loaded word play.

I'm not too wedded to the particular terms used. The point is mainly that there is some analyzable feature of processes involving mutual information that sits in between causation and mental representations on a spectrum of complexity, and this feature has wide applicability. If you just want to keep 'meaning' in the realm of minds, that's fine. But I wonder how that influences your views on what's possible with AI and machine intelligence.


> I take the interesting question to be whether or not 'meaning' is objective, i.e. mind independent in some important sense. Do you agree that two independent investigators coming to the same conclusion entails that the content is objective?

If every independent observer came to the same conclusion, then yes. Given that we can never test that, I would say (safely) that the answer to that question remains undecidable. We cannot assert it.

(But we know for a fact that highly intelligent people have disagreed on the meaning of various observations throughout the ages. A famous recent example is the contention between Einstein and Bohr over the meaning and implications of quantum physics.)

When independent observers reach the same conclusion they share a 'mindset'. This shared mindset may itself be an alteration due to the input of new facts, i.e. it doesn't necessarily imply a shared starting point. In my mind, there simply needs to be a 'mental pathway' from distinct starting points that arrives at the same conclusion. But more interesting to me is the 'temporal proximity' of sudden paradigm shifts by independent minds. For example, there were no impediments to the conception of non-Euclidean geometry. And I find it very interesting that three different people, centuries after Euclid, all arrived at the non-Euclidean concept at approximately the same time. (This to some extent alludes to my views on the nature of mind and the possibilities for AI.)

>>"the program I wrote to invert a matrix has a bug, its output is not the intended effect"

> But it still has a causal effect, i.e. it still has some output. 'Intent' is in the vicinity of meaning and so it's sufficiently distinct from a pure causal description.

Fine, but intent is definitively bounded by the mind of the entity that created the process. The process is causal, restricted as you said to a species of processes, but it is still a causal system. I don't see any legitimate reason to ramp up from 'effect' to 'meaning'. (Because while you may be an honest philosophizing agent, inevitably the 'woo-peddlers' will exploit the loaded terminology to assign 'mind' and 'personhood' to "meaningful processes".)

> I'm not too wedded to the particular terms used. The point is mainly that there is some analyzable feature of processes involving mutual information that sits in between causation and mental representations on a spectrum of complexity, and this feature has wide applicability. If you just want to keep 'meaning' in the realm of minds, that's fine. But I wonder how that influences your views on what's possible with AI and machine intelligence.

Well, I don't want to be a woo-peddler :) but my personal views & reflections on this phenomenon that defines my existence are not congruent with the current orthodoxy. Specifically, I do not accept that we truly understand the nature of materiality (yet). (There are too many red flags with fudge-factor constants and 'dark' aspects that remind me of earlier efforts of humanity to fit a model to observations.)

What this means, in context, is that I am not convinced that our minds can be reduced to 'boxes' with input and output. That notion is offensive at a fundamental core of my being. (If I ever became convinced that I am merely a twisted hydrocarbon chain at the mercy of the winds of entropy I would exit post-haste from this sorry planet, to be quite honest.)

Certainly I reject the notion that complex structure (form) is the basis of consciousness. I reject this as a matter of 'choice' and 'mindset'. (I'm ~with Pascal.) I do however recognize that neither my view nor the prevalent "mind is simply a phenomenon of certain complex structures" view is provably right or wrong.

In my conception (atm), the entire 'Reality' (with capital R) is a universal mind and consciousness is a field of some sort, a property of 'reality' like gravity and EM. Our brains are transducers in this field but our minds are not bound by the body. The body localizes and establishes the point-of-view of observation in time and space. Can a machine have a mind then in my universe? To the extent that its physical makeup shares (an unknown set of) characteristics with ours that allows it to 'interface' with the field. Is that wooie enough for you? /g


What about webpages on suicide techniques? Do those count as information?


I guess death could be viewed as the ultimate move away from the equilibrium (of being alive)?


Doesn't death allow the body to achieve maximal entropy?

Also, this conversation here has nothing to do with keeping me alive, yet I find it informative. Seems the concept of semantic information is missing something, or a lot.


For me, semantic information is information which allows you to predict future world states, or remove uncertainty about current world states, while optimizing for your utility function.

If your utility function is to remove all sense data, then a website with suicide techniques helps you predict future world states (if X then Death), and aligns with your utility function, so it is meaningful information for you.

But then you get into heterophenomenology territory: someone may state they are not hungry, while desperate for sustenance (or vice versa). People may temporarily act counter to their typical utility function. Do we want to allow such craziness to influence the formal value of semantic information?

> has nothing to do with keeping me alive, yet I find it informative.

Imagine having to read the exact same information every day. It won't allow you to adapt to a changing environment or be exposed to new ideas, so you can form new ideas yourself. You'd be stuck in a rut, and as good as non-alive. Soon you'll probably perish out of boredom.


I spend long periods of time without ingesting sources of information, without getting bored or literally dying. Conversely, if I was constantly bombarded with useful information, I would still go crazy, and that could very well lead to literal death. I think staying alive is one piece of meaningful information out of a very large set of meaningful information, most of which has nothing to do with staying alive.

In other words, meaningfulness is orthogonal to staying alive.


I'm still reading this, so I may be getting the wrong end of the stick, but this seems to be conflating Shannon stating that his work is the study of the behaviour of information in the abstract with the notion that information theory is totally uninterested in anything else.

This piece just seems a bit dull to my eyes - maybe there's something interesting trying to get out, but currently it seems like it's fumbling around the edges of Shannon's work and trying to find a dichotomy where there fundamentally isn't one.

The connection with Jane Austen in particular seems to be completely spurious - let's say that the information of finding this in Austen's work doesn't seem to be very high.


Completely agreed.

There was just this:

> But [Shannon's] choice of the label "information" in his publications seems to have confused many highly intelligent people.

But there was nothing further about who was confused or where or when. The article seems to be clearing up a confusion that doesn't exist.


In the past few years I introduced a new concept that provides a simple and universal mathematical way to represent information AND meaning.

The basic idea is to recursively build grammars as trees in 2-D Cartesian space. Then you just count the elements.

https://github.com/treenotation/research/blob/master/papers/...

It's still in its infancy, but it is very pragmatic and very useful.


As someone who is completely ignorant to this field, I found this very readable and interesting. Thanks for your work!


What about this is specific to Austen? All of the quotes in the article are examples of information used as a normal English word, to convey its normal meaning, which everyone knows.


It is a truth universally acknowledged


The site where this article stems from seems like an awesome rabbit hole, a definite flagship for Good Old, Dense and Messy Web: http://www.cs.bham.ac.uk/research/projects/cogaff/

The same goes for Aaron Sloman's web site: https://www.cs.bham.ac.uk/~axs/

I added both links to the wiby.me search engine.


Just hate these long-form articles that take forever to get to the point. They are designed to keep your eyeballs on the page as long as possible, as opposed to transferring the maximum amount of information in a given time.


Very interesting, but nothing really new here; this is what IoT and big data are all about. I think the article is even misleading in saying that Shannon's work is not about the content of the information. The statement about not focusing on content that the article refers to is due to the scope and context of the communication paper. Of course it will not focus on content at the physical layer, but his other works really do (see the networking layers below). He is highly regarded as the pioneer of information theory and AI, including earlier machine learning, e.g. the chess machine, the mouse through mazes, etc.

For a more complete hierarchy of knowledge it is data -> information -> knowledge -> wisdom. From a more meaningful, age-of-AI-and-data-analytics point of view it is descriptive -> diagnostic -> predictive -> prescriptive [1]. As you can see, the information content of Jane Austen's novels mentioned in the article sits at the lower part of the hierarchy. It is the job of information theorists and practitioners to develop inference engines (e.g. GPT-3) that can provide the higher levels of understanding and proper prescriptions. In addition, even if the inference engines are not that capable yet, human-in-the-loop systems can be deployed to significantly reduce the cognitive workload of the human expert (e.g. a cardiologist) to perform better diagnostics for more accurate and timely prediction and prescription.

From a data-centric point of view in electronics and communication engineering, Shannon's famous theorems on sampling and communication are a subset of data conditioning (to remove/overcome/mitigate noise in the system) and data transmission (to remove/overcome/mitigate noise during transmission).

From a TCP/IP networking point of view, Shannon's theorem of communication, albeit a very important concept, is mainly at the physical layer, not even at the data link layer or the layers above (networking, transport, application). The more universal information theory and entropy partly developed by Shannon belong mainly to the application layer, including AI and data analytics. For a more complete treatment refer to the book by the late Dr. David J. C. MacKay entitled Information Theory, Inference and Learning Algorithms [2]. The author has made the book available on his website.

[1]https://www.ciscopress.com/store/iot-fundamentals-networking...

[2]http://www.inference.org.uk/mackay/itila/book.html


Cryptography, error coding, and lossy compression are particularly concerned with what the author identifies as Jane Austen's concept of information but also have a very close relationship to Shannon entropy.

Basically, in cryptography the problem is that plaintext messages reveal important information, and the probability distribution of plaintexts is roughly known. While "attack at dawn" and "attack at noon" may have a lot of letters, those two messages may be the only expected messages, and they each may be close to 50% likely, reducing the Shannon entropy to nearly a single bit. Parties wishing to communicate secretly would like to obscure their communication by revealing no information about the actual probability distribution of the plaintexts, so Shannon entropy is maximized (with respect to tractability) when generating a ciphertext.

Successful cryptography transforms a message channel into one where e.g. 2^N ciphertexts are all equally likely, hiding the semantic content of both the actual plaintext messages and their probabilities.
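
A back-of-the-envelope sketch of those entropy numbers in Python (my own toy calculation, not the cryptographic machinery itself): two roughly equally likely plaintexts carry about one bit, while an ideal cipher's ciphertexts are uniform, so an N-bit ciphertext carries the maximal N bits and reveals nothing about which plaintext was sent.

    from math import log2

    def entropy(probs):
        # Shannon entropy in bits of a discrete distribution.
        return -sum(p * log2(p) for p in probs if p > 0)

    # Only two expected plaintexts, each about 50% likely.
    plaintexts = {"attack at dawn": 0.5, "attack at noon": 0.5}
    print(entropy(plaintexts.values()))   # 1.0 bit, despite the messages being ~14 characters long

    # An ideal cipher makes every N-bit ciphertext equally likely (N = 8 to keep the list small).
    N = 8
    print(entropy([1 / 2**N] * 2**N))     # 8.0 bits: maximal, independent of the plaintext distribution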

On the other hand, error codes strengthen the semantic meaning of the underlying message by significantly expanding the distribution (adding redundancy) while adjusting the probabilities toward favoring the important messages.

Lossy compression (images, audio, video) is kind of the inverse of error coding: it reduces the distribution by choosing which parts of the message are least semantically important and removing the distinction between them, making them indistinguishable.

Roughly, Austen-information is limited by the Shannon entropy as an upper bound and also by a computable, tractable function of the available Shannon information (which we might as well call the semantic function which translates between bits and useful human representation). In cryptography's case P = D_K(C) is the particular function that turns an N-bit ciphertext from 0 bits of Austen information into N bits of Austen information with the proper key. Lossy compression turns an N-bit message into a semantically useful approximation of an N+M-bit message, excluding the production of a large class of semantically indistinguishable resulting messages in the process.

There's another layer that maybe the author was getting at, and that's the human (or any agent) interpretation and response to sensory inputs. At that semantic layer it's again a function from inputs to experiences, responses, or actions. Information theory suggests that such functions are probably most useful if they can extract the most efficiency from all available channels, e.g. distinguish between important states of the world with the fewest bits possible. Going beyond that is a digression into intelligence itself where we have to determine what the most important states of the world are and how to respond.


It's a very important point the paper makes, which is the disconnect between what is commonly called Shannon information (he actually called it entropy) and what we call information in normal parlance. A channel may transmit a high amount of Shannon entropy, but contain zero information of the regular sort, since the transmission is just random noise. A concept that is closer to our intuition is mutual information, so named by Shannon's colleague precisely because Shannon entropy is not concerned with semantics.
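
A quick numerical illustration of that disconnect (my own toy sample-based estimators): a channel carrying pure noise has high Shannon entropy per symbol but essentially zero mutual information with the source, whereas a channel that actually carries the message has mutual information equal to the source entropy.

    from collections import Counter
    from math import log2
    import random

    def entropy(samples):
        # Shannon entropy in bits, estimated from a list of samples.
        n = len(samples)
        return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

    def mutual_information(xs, ys):
        # I(X;Y) = H(X) + H(Y) - H(X,Y)
        return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

    random.seed(1)
    message = [random.randint(0, 1) for _ in range(100_000)]
    noise = [random.randint(0, 1) for _ in message]   # high-entropy transmission unrelated to the message
    faithful = message[:]                             # a channel that actually carries the message

    print(entropy(noise))                             # ~1 bit per symbol of Shannon entropy
    print(mutual_information(message, noise))         # ~0 bits about the source
    print(mutual_information(message, faithful))      # ~1 bit: entropy that is actually informative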



