Shannon's original article "A Mathematical Theory of Communication" is available online, for example at [1]; I find it very readable. The MIT article talks about information theory as applied to our digital world, but does not mention that it's now applied widely in theoretical physics; watch [2] for an intro. For example, the Hawking radiation from a black hole can function as an error correcting (erasure) code, allowing recovery of the information that fell into the black hole.
Information Theory is the ur-science. It's the theory of mapping observed data onto models. In other words, it's how to fuse inductive logic into deductive logic. In physics, this is especially salient, since many of the models are of low-information systems. See, e.g., Wootters' paper "Statistical Distance and Hilbert Space" [1], which gets pretty close to deriving the Dirac formulation of quantum mechanics as the physics of low-information systems, in those explicit terms. (It doesn't seem to explain why it's a complex Hilbert space.)
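For the curious, the central result, as I recall the paper, is that the statistical distance between two probability distributions (their distinguishability per trial) equals the angle between the corresponding rays in Hilbert space; roughly:

    % Statistical distance between distributions p and q,
    % as I recall it from Wootters' paper:
    \[
      d(p, q) = \arccos\!\Big( \sum_i \sqrt{p_i \, q_i} \Big),
    \]
    % which, for the quantum states that generate those distributions,
    % coincides with the Hilbert-space angle:
    \[
      d(\psi, \phi) = \arccos \big| \langle \psi | \phi \rangle \big|.
    \]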
Eh, this can be said about a lot of foundational disciplines/studies/theories. Philosophy of science (or philosophy of mathematics), logic, category theory, model theory, etc. would all fit that description. I'm not saying you're wrong, it's just that information theory isn't unique in this respect.
All other possible ur-sciences you list are purely deductive. But the natural sciences have an element of inference, or inductive reasoning, to them as well. To the extent that scientists aim to capture all observed phenomena into a theory which is capable of making predictions and otherwise performing deduction, they need to bridge that gap. Category theory, e.g., does not do this.
...Not at all? The very definition of, and distinction between, deductive and inductive reasoning is itself a matter firmly tread in the annals of philosophy. Every realized science is just a topical offshoot of natural philosophy, where the corpus of the eventually named science is all the axioms and techne that the particular field concerns itself with.
Even the distinction between that most vaunted science, physics, and its far more fluid, subjective sibling, metaphysics, is squarely settled in the philosopher's forum rather than the scientist's academy. To the scientist, the epistemic question is, by necessity, settled by the choice of field.
Another way to think of it:
The art of good philosophy is learning to identify, and cope with, those patterns of reasoning that degenerate into navel-gazing.
It is why the very tool by which we converse and communicate is itself a matter of great philosophical concern. The ability to elicit a shared understanding of a purportedly common world phenomenon is still hotly debated within philosophical circles. The questions of what the essence of meaning is, and of what it means to be, all happen far outside the borders of any science's capacity to operate, or even to exist as a standalone rational framework.
A science is a rational workshop: it is what it is because of the collection of tools and products of philosophically established ground states that render the space conducive to ongoing thought and reasoning.
Seriously. If you think philosophy is all deductive, crack open some books. Enrich thyself, and marvel.
Information theory is one of very few things out of academia that has really stuck with me and shaped how I look at problems in the world. The current applications for it are so vast and immediate that I wonder how many other places it could still be applied to.
Being able to say, and practice, things like "now let's consider what this might look like in the frequency domain..." can open up radical new approaches to solving problems. "Ohhhh, that signal/channel has a hard roll-off starting at 20 kHz... I probably need to increase/decrease my sample rate accordingly" vs. flying blind with traditional time-domain analysis. In practical terms, frequency-domain work is mandatory for things like high-efficiency video or audio codecs.
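A minimal sketch of that reasoning in Python/numpy (the 15 kHz tone, the 96 kHz rate, and the -60 dB threshold are made-up numbers, just to illustrate the Nyquist argument):

    import numpy as np

    # Hypothetical signal: a 15 kHz tone plus a little noise, sampled at 96 kHz.
    fs = 96_000                               # current sample rate (Hz)
    t = np.arange(0, 0.1, 1 / fs)             # 100 ms of samples
    x = np.sin(2 * np.pi * 15_000 * t) + 0.01 * np.random.randn(t.size)

    # The same signal, viewed in the frequency domain instead of the time domain.
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1 / fs)

    # Highest frequency that still carries significant energy (-60 dB of peak).
    highest = freqs[spectrum > spectrum.max() / 1000].max()

    # Nyquist: the sample rate only needs to exceed twice that frequency,
    # plus some margin for the anti-aliasing filter's roll-off.
    print(f"energy extends to about {highest / 1000:.1f} kHz")
    print(f"a sample rate around {2.5 * highest / 1000:.0f} kHz would suffice")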
Understanding what entropy vs information really means is a huge part of being able to competently build cryptographic primitives from first principles.
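As a toy illustration of that distinction (my own sketch, not from the article): an empirical entropy estimate immediately exposes key material drawn from a biased source, even when it "looks" random to the eye:

    import math
    import os
    from collections import Counter

    def entropy_bits_per_byte(data: bytes) -> float:
        """Empirical Shannon entropy, H = -sum(p * log2(p)), over byte values."""
        counts = Counter(data)
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    # A stream drawn from only 16 distinct byte values carries at most
    # 4 bits of entropy per byte, however "random" it looks.
    biased = bytes(b % 16 for b in os.urandom(4096))
    uniform = os.urandom(4096)

    print(entropy_bits_per_byte(biased))    # ~4.0 bits/byte
    print(entropy_bits_per_byte(uniform))   # ~8.0 bits/byte, minus a little small-sample bias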
In this ~1-hour talk, Dr. Wootters is able to dig more deeply into Reed-Solomon codes (a very popular family of error-correcting codes for nearly 50 years), as well as applications to strange stuff: like using RS codes in "test pooling" to save money on syphilis tests (and probably the same methodology being used for test pooling in today's COVID-19 world).
Dr. Wootters keeps things relatively accessible, never getting too deep into the weeds of the math (and indeed sticks with GF(5), a prime field, instead of talking about the extension fields used in most real systems). Still, extension fields follow mostly the same concepts, and GF(5) is sufficient to cover all of them.
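For anyone who wants to poke at the GF(5) setting from the talk, here's a tiny sketch of the polynomial-evaluation view of Reed-Solomon (my own toy code, not code from the lecture): k message symbols become the coefficients of a polynomial, the codeword is that polynomial's value at n distinct field points, and any k surviving values pin the polynomial back down, so up to n - k erasures can be repaired:

    # Toy Reed-Solomon over GF(5): message symbols are polynomial
    # coefficients; the codeword is the polynomial evaluated at the
    # distinct points of the field.
    P = 5                      # the prime field GF(5)

    def encode(msg, points):
        """Evaluate the message polynomial at each point (mod P)."""
        return [sum(m * pow(x, i, P) for i, m in enumerate(msg)) % P
                for x in points]

    def recover(known):
        """Lagrange-interpolate the polynomial's value at every field point
        from k known (point, value) pairs, i.e. repair erasures."""
        def value_at(x):
            total = 0
            for xi, yi in known:
                num, den = 1, 1
                for xj, _ in known:
                    if xj != xi:
                        num = num * (x - xj) % P
                        den = den * (xi - xj) % P
                total += yi * num * pow(den, P - 2, P)   # Fermat inverse mod P
            return total % P
        return [value_at(x) for x in range(P)]

    msg = [3, 1]                               # k = 2 message symbols
    codeword = encode(msg, range(P))           # n = 5 codeword symbols
    # Lose any 3 symbols; the surviving k = 2 still determine the polynomial.
    survivors = [(0, codeword[0]), (4, codeword[4])]
    print(codeword)
    print(recover(survivors))                  # matches the original codeword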
I've never heard of Prof. Mary Wootters before. I wonder if there's any relation to Prof. William Wootters. Neither mentions the other in their biography, but it's not a terribly common name, and both seem to work at the intersection of physics and information theory.
Hamming codes were among the first codes invented after Shannon's paper.
Hamming codes are very small, but efficient. There are larger (and therefore more powerful) codes today, but Hamming codes are still used where small codes fit the problem (e.g., error-correcting RAM).
Hamming codes are optimal and efficient for their size, but most practical situations call for larger codes. If you need single-error correction with double-error detection (SECDED), though, an extended Hamming code is probably your best bet.
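If you've never looked inside one, Hamming(7,4) is small enough to write out by hand. A quick toy sketch (mine, not a production SECDED implementation): 4 data bits, 3 parity bits, and the 3-bit syndrome directly names the position of any single flipped bit:

    # Toy Hamming(7,4): positions 1..7, with parity bits at 1, 2 and 4.
    # The syndrome, read as a binary number, is the index of the bad bit.

    def encode(d):                       # d = 4 data bits
        c = [0] * 8                      # 1-indexed; c[0] unused
        c[3], c[5], c[6], c[7] = d
        c[1] = c[3] ^ c[5] ^ c[7]        # covers positions with bit 0 set
        c[2] = c[3] ^ c[6] ^ c[7]        # covers positions with bit 1 set
        c[4] = c[5] ^ c[6] ^ c[7]        # covers positions with bit 2 set
        return c[1:]

    def correct(word):                   # word = 7 received bits
        c = [0] + list(word)
        syndrome = 0
        for p in (1, 2, 4):
            if sum(c[i] for i in range(1, 8) if i & p) % 2:
                syndrome += p
        if syndrome:
            c[syndrome] ^= 1             # flip the single bad bit back
        return c[1:]

    data = [1, 0, 1, 1]
    sent = encode(data)
    received = sent.copy()
    received[2] ^= 1                     # flip one bit in transit
    print(correct(received) == sent)     # True: the error is repaired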
[1] http://people.math.harvard.edu/~ctm/home/text/others/shannon...
[2] https://www.youtube.com/watch?v=v5UbN0xx4X0