Hacker News
A Mathematical Theory of Communication (1948) [pdf] (math.harvard.edu)
197 points by alokrai on April 30, 2020 | 39 comments

Probably one of the dozen or so most important publications of the twentieth century. Ironically, though, it would be Norbert Wiener's interpretation of information (exactly Shannon's times minus one) that would cement itself in the popular lexicon, because it is far more intuitive. Where Wiener posits information as negative entropy (or "order"), Shannon interprets it as representing a degree of freedom, or uncertainty. The problem with Wiener's is that the underlying epistemology carries a subjective bent, where information is equivalent to "meaning" or semantic content. Shannon's, meanwhile, is ultimately superior as an objective metric, though far less intuitive (under his interpretation, a random string is technically the most information-saturated construct possible, because it possesses the highest degrees of freedom).
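To make the "random string is the most information-saturated" point concrete, here is a minimal Python sketch. The helper names (`entropy_bits`, `empirical_entropy`) and the sample strings are made up for illustration and are not from Shannon's paper. It only shows that, for a fixed alphabet, the uniform distribution gives the largest entropy per symbol and a constant string gives zero.

```python
import math
from collections import Counter

def entropy_bits(probs):
    """Shannon entropy H = sum(-p * log2(p)) in bits; zero-probability terms are skipped."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def empirical_entropy(s):
    """Entropy per symbol of the empirical symbol frequencies of string s (illustrative helper)."""
    counts = Counter(s)
    n = len(s)
    return entropy_bits(c / n for c in counts.values())

# Four-symbol alphabet: uniform vs. skewed distribution.
uniform = entropy_bits([0.25, 0.25, 0.25, 0.25])
skewed = entropy_bits([0.7, 0.1, 0.1, 0.1])

# A constant string carries no uncertainty; an evenly mixed one reaches the maximum.
constant = empirical_entropy("aaaaaaaa")
mixed = empirical_entropy("abcdabcd")

print(uniform, skewed, constant, mixed)
```

Note that symbol frequencies alone ignore ordering ("abcdabcd" is obviously not random), so this is only a first-order estimate, not a test of randomness.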

To say Wiener and Shannon founded information theory is like saying Lorentz and Einstein founded special relativity. I don't think physicists would consider Lorentz to have made a major advance, and in fact even after relativity was explained some people still didn't understand it.

Wiener made important contributions to mathematics but not to information theory. He wrote a book saying that entropy is related to "information" and is maximized for a Gaussian. That's about his involvement. The paper attached goes way beyond that.

I didn't claim Wiener founded it, only that his popularization of the term is more ubiquitous than Shannon's formulation.

But isn't the information gained from a message necessarily subjective? It depends not just on this message, but on the distribution from which the message was drawn. More to the point, the distribution anticipated by the observer determines the value (information content) of the message. That is exactly how it becomes useful in practice.

This "surprisal" can be parametrized by relative entropy (Kullback-Leibler divergence), which reduces to the Shannon entropy formula only when the observer has the correct anticipated distribution.
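One way to read the comment above, sketched in Python: if outcomes follow a true distribution p but the observer anticipates q, the observer's average surprisal is the cross-entropy H(p, q) = H(p) + D_KL(p || q), which collapses to H(p) when q = p. The distributions `p` and `q` below are made-up examples, and the sketch assumes q is nonzero wherever p is.

```python
import math

def entropy(p):
    """Shannon entropy of p in bits."""
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Average surprisal (bits) when outcomes follow p but the observer anticipates q.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(-pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """Relative entropy D_KL(p || q) in bits, under the same assumption on q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]   # hypothetical true distribution
q = [1/3, 1/3, 1/3]     # hypothetical observer's anticipated distribution

print(entropy(p))            # average surprisal with the correct model
print(cross_entropy(p, q))   # average surprisal with the wrong model
print(kl_divergence(p, q))   # the extra cost of anticipating q instead of p
```

The KL term is never negative, so a mistaken anticipated distribution can only raise the observer's average surprisal.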

I think there is a distinction to be made between information in the ontological sense and information in the pragmatic sense. The former does not have any purchase on "meaning" or "use", only a metric of informational entropy or degrees of freedom, irrespective of observation. However, the sense in which you speak of information is the pragmatic dimension, where an observer receives a datum of semantic content against a given backdrop of pre-established "meaning". This sense is problematic as an objective metric.

I'm trying to find a quote I read about this paper once. To paraphrase, not only did this paper create a new field of academic inquiry, it answered most of that new field's interesting questions, too.

From Jon Gertner's "Idea Factory":

> With Shannon’s startling ideas on information, it was one of the rare moments in history, an academic would later point out, “where somebody founded a field, stated all the major results, and proved most of them all pretty much at once.”

It is attributed to a documentary on Claude Shannon: https://www.youtube.com/watch?v=z2Whj_nL-x8

What I think is telling is how easy Shannon's paper is to read. Even today it is used pretty much as is at many EE colleges to teach communication theory.

Another cool fact: Shannon's master's thesis is most likely the most influential one of all time. In it, he linked Boolean algebra to electrical circuits with switches, essentially inventing digital circuit theory.

Wasn't the idea first pointed out by C. S. Peirce back in the 1880s?

BTW, here's the paper by Hartley cited on the first page. It is also very readable and insightful. Reading it helped me clarify some of the subtler points in Shannon's paper.


Recommended on an earlier thread: "From Aristotle to John Searle and Back Again: Formal Causes, Teleology, and Computation in Nature" by E. Feser.



Source: https://news.ycombinator.com/item?id=12080670

If you're interested in this, check out Cover's Information Theory textbook; the rabbit hole goes much deeper. One of the most interesting examples is that when you're betting on a random event, Shannon entropy tells you how much to bet and how quickly you can compound your wealth. Cover covers (heh) this, and the original paper is Kelly: http://www.herrold.com/brokerage/kelly.pdf
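A minimal sketch of the simplest case, assuming an even-money bet that you win with probability p (the value 0.6 and the function names are illustrative, not taken from Kelly's paper). In that setting the growth-optimal stake is the fraction 2p - 1 of your wealth, and the resulting doubling rate works out to 1 - H(p) bits per bet, where H is the binary entropy.

```python
import math

def binary_entropy(p):
    """Binary Shannon entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def growth_rate(p, f):
    """Expected log2 wealth growth per even-money bet when staking fraction f (0 <= f < 1)."""
    return p * math.log2(1 + f) + (1 - p) * math.log2(1 - f)

def kelly_fraction(p):
    """Growth-optimal stake for an even-money bet; bet nothing without an edge."""
    return max(0.0, 2 * p - 1)

p = 0.6                          # hypothetical win probability
f_star = kelly_fraction(p)       # about 0.2 of current wealth
g_star = growth_rate(p, f_star)  # doubling rate at the Kelly stake

print(f_star, g_star, 1 - binary_entropy(p))
```

Staking more or less than `f_star` lowers the long-run growth rate, which is why the entropy term sets a ceiling on how fast wealth can compound in this toy model.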

Kelly's paper (based on this paper by Shannon) is responsible for fundamentally reshaping equity, commodity and even sports betting markets.

I highly recommend William Poundstone's book, "Fortune's Formula" as a biography of those ideas - it's almost as good as any Michael Lewis book on the subject would be.

The article was renamed "The Mathematical Theory of Communication" in the 1949 book of the same name, a small but significant title change after realizing the generality of this work.

The foreword is also essential reading, even for those uninterested in the math. It's one of the best descriptions of a field of science I've ever read.

What was the original title?

A Mathematical Theory of Communication

So "A" was changed to "The"?


I just recently finished the biography of Shannon called "A Mind at Play."

It was quite a joy and I highly recommend it to anyone interested in these sorts of bios.

Concur. That is a good read.

> The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning ...

I never fail to suppress a chuckle when I read that.

The crucial point is the word "selected". As in "selected from a pre-agreed set of possible messages".

A well-deserved perennial. Some prior discussions:

2016, 53 comments: https://news.ycombinator.com/item?id=12079826

2017, 11 comments: https://news.ycombinator.com/item?id=15095393

There seems to be a great, unresolved difference of opinion about whether information entropy and thermodynamic entropy are commensurable.

In trying to connect Shannon entropy to thermodynamic entropy, I always get stuck on the fact that you need to have a defined alphabet or symbol set.

It seemed clear to me, when I read a few of E. T. Jaynes' papers on classical thermodynamics (especially https://bayes.wustl.edu/etj/articles/theory.1.pdf), that they were commensurable. I'm mildly surprised there is some question about that!

The alphabet is to distinguish states. But this is the case in thermodynamic entropy as well, that each state is distinguishable. What is important is quantifying the set of possible states.

vs. an arbitrarily defined number of dimensions?

There are more exotic mechanisms to define entropy: https://en.wikipedia.org/wiki/Parastatistics

In a thermodynamic sense, information input (say, of photons) is generally going to increase the entropy of a receiving system. Only certain kinds of information can reduce the entropy of the receiving system. Too much information (i.e., too many photons) can literally burn your eyes.

I don't see how to square those manifest physical effects with the immateriality of Shannon information entropy.

Is this not the problem of Maxwell's demon? If an intelligent agent observes and interferes with a particle system to reduce the system's entropy, then overall entropy still increases, because the agent needs to store and calculate information in some physical medium, be it brain matter or silicon.

Not exactly, I don't think.

In the Shannon formulation, there isn't a notion of "too much information" -- it is all symbols.

But in reality, too many photons can burn our eyes, or a denial-of-service attack can break a website. Too much information is a reality, because we are thermodynamic entities.

I don't see how to understand that with Shannon.

Yeah, right, and too much heat can burn my toast. Forget about global warming, it's about universal warming. I don't understand that with Einstein either.

Shannon's work has had vast influence.

Here is a great, short overview of the topic => https://youtu.be/_PG-jJKB_do

There is no entropy regarding question of whether the Jade is an attractive presenter.

> There is no entropy regarding question of whether the Jade is an attractive presenter.

Please don't do this here. It just comes across as super-crass.


I think this gets posted on Hacker News once or twice a year. It's a pleasure every time.

Anyone have good recommendations on LDPC or Turbo Code theory?

David MacKay did some good stuff, and source code is available: http://www.inference.org.uk/mackay/CodesFiles.html

I find MacKay's explanations very clear. Sadly, he passed away far too young.
