Hacker News new | past | comments | ask | show | jobs | submit login
Information Theory: A Tutorial Introduction (arxiv.org)
297 points by teleforce 34 days ago | hide | past | favorite | 26 comments

Shannon's original paper "A Mathematical Theory of Communication" which introduced information theory is so accessible as to be useful as a tutorial itself:


Here is a list of books for those wishing to dig deeper:

1) The Information by Gleick is a book covering everything from biographies of the key historical figures and contributors to contemporary applications of information theory in quantum mechanics.

2) Elements of Information Theory by Joy Thomas and Thomas Cover is a thorough and engaging textbook.

3) Quantum information and quantum computation by Nielsen and Chuang explaining both classical and quantum information with some practical examples.

4) Entropy and Information by Volkenstein a Soviet popular science book with math-heavy examples from biology, chemistry and physics

and as an aside I can also recommend "Willful Ignorance: The Mismeasure of Uncertainty" by Weisberg, which is an engaging history review of probability theory and its key inventors

> 2) Elements of Information Theory by Joy Thomas and Thomas Cover is a thorough and engaging textbook.

Thomas & Cover is THE textbook on information theory -- no doubt -- but you and I have very different definitions of engaging.

For a different kind of engaging, I would recommend Dave MacKay's "Information Theory, Inference, and Learning Algorithms", which can be found on his website:


+1 for the original paper. I used it to pass my information theory class in college because it made more sense than the official materials and my lecture notes.

It’s a fantastic explanation of the theory.

The edition with Weaver's introduction is also very good, especially if you would like the context and ramifications of Shannon's theory.


Except that some of the notation has changed a little. But it's insanely good and impactful for a piece that is only 8 pages long, if I remember correctly.

it's 55 pages

His master's thesis was much smaller, if I remember correctly. Gonna check with my ex teacher :)

His master's thesis was on applying boolean algebra to the design of switching circuits, which was similarly groundbreaking (although discovered in parallel by another).

Is there an annotated version or walk through of this paper available or something? I've read it a few times before but I couldn't really understand what's so ground breaking about it. It could be because the more detailed and technical stuff I couldn't grasp that well, I understood high level the kind of model of information theory he was talking about but I'm pretty sure just that by itself isn't what made it so groundbreaking at the time

As the other reply to your comment said: the high level things you understand were groundbreaking at the time. The fact that arbitrary data can be encoded as bits? Minds blown!

What passes for mundane obviousness today was once a strike of genius nobody else saw coming.

And that's part of what makes the paper so groundbreaking, too. Some things are groundbreaking in terms of explanatory power but don't make it into the fabric of society. What Shannon's paper did was essentially weave the fabric of the digitally powered society.

And it did it so successfully that you think of it as obvious banalities today.

It wasn’t like there were other theories of information at the time. He established a very neat theory for the first time, and showed a very intriguingly idea of entropy that proceeded naturally from it. He showed how the information a channel can transmit is related to the bandwidth. He showed how you can estimate the entropy of languages.

If this seems obvious to you and not remarkable it is only because of hindsight, because his insights have been absorbed into the common culture.

Continuous entropy is more subtle than the direct analogy to the discrete case he uses in the original paper, see:



Another fantastic (and free!) Information Theory resource is David MacKay’s book, which also addresses aspects of Bayesian Statistics and Machine Learning: http://www.inference.org.uk/mackay/itila/

It’s among my favourite textbooks: you can feel his enthusiasm and personality through the pages. The world is a poorer place without him around.

Great textbook, the exercises build up intuition, rather than just mechanically asking you to apply this formula then that formula.

Written some 60 years ago, Information Theory and Coding by Abramson[1] is an absolute gem for those looking to get into info theory. The cover book [2] being the more complete resource (and somewhat of a grad level defacto standard text).

[1] https://www.amazon.com/Information-Theory-Coding-Norman-Abra...

[2] https://www.amazon.com/Elements-Information-Theory-Telecommu...

"To mistake a binary digit for a bit is a category error. In this case, the category error is not as bad as mistaking marzipan for justice, but it is analogous to mistaking a pint-sized bottle for a pint of milk."

That's well put.

But I thought a bit is a binary digit? What does this quote mean?

The next sentence in the paper says: "Just as a bottle can contain between zero and one pint, so a binary digit (when averaged over both of its possible states) can convey between zero and one bit of information."

These definitions are for information theory.

A "binary digit" is either a zero or a one. It's a specific value.

A "bit" is an amount of information: a container that can hold a binary digit.

That may very well be the definition. People certainly do say e.g. "... a 0 bit followed by a 1 bit ...", however.

Excuse me, english isn't my first language and I can't find the meaning of "mistaking marzipan for justice". Would you enlighten me?

It's not an idiom but a literal statement about how incomparable marzipan and justice is

I think of information theory as an entirely new branch of mathematics because it represents a new way of quantitatively thinking. Calculus provided a different way to think about problems separate from geometry. Information theory let’s me think about problems differently.

The author of the paper has a book with the same title:


Applications are open for YC Winter 2022

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact