Hacker News new | past | comments | ask | show | jobs | submit login
Some Fundamental Theorems in Mathematics (arxiv.org)
419 points by postit 5 months ago | hide | past | web | favorite | 71 comments

The second theorem (the generalized Pythagorean theorem, which uses inner product spaces) feels incomplete to me. The exposition is fine, but I don't think the explanation should have concluded with the regular

    a^2 + b^2 = c^2
equality. You wouldn't use that in the context of an inner product space. The generalized Pythagorean theorem actually looks like

    ||x + y||^2 = ||x||^2 + ||y||^2 + 2<x, y>.
I think it would be better if there's less editorializing of the equalities and more relating of their abstract forms to their "simple", familiar forms. Otherwise I don't really understand the exercise - if the first page has a theorem involving inner product spaces, this is clearly targeted at people with at least a full course of linear algebra under their belt. Unless the audience is already aware of how that second equality implies the first equality, this explanation doesn't capture the heart of it.

As another commenter said, this is not a self-contained exposition (and I don't think it realistically could be). But if it's not going to be self-contained, I think it could be improved by more completely showing how the elegant abstractions imply the things we're already familiar with in a neat way.

The Pythagorean theorem is about orthogonal vectors, so it would just be

  |x + y|^2 = |x|^2 + |y|^2
Plus the line just above that tells you: let

  a = |x|, b = |y|, c = |x-y|
I don't think they are obscuring anything. I think they are showing how the a b and c in the familiar a^2+b^2=c^2 can be generalised to |x|, |y|, |x-y| for any orthogonal vectors x and y in an inner product space.

I thought the Pythagorean Theorem assumed that x and y were orthogonal? Then

    2<x, y> = 0
This is such a funny coincidence, by the way. My prof literally proved the Pythagorean Theorem on inner product spaces today in class.

v · w = 0 was one of the premises.

Your version is the more general statement usually called the “law of cosines”. Personally I prefer the geometric algebra version:

Given any vectors u, v: (u + v)² = uu + uv + vu + vv = u² + 2u·v + v²

Where u·v = ½(uv + vu) is just the definition of ·.

Number 20 in logic is incorrect. It’s surprising how often this is misstated. There are lots of complete axiom systems. Indeed, take as your axioms all true statements in a given model. Voila, you now have a complete axiom system. It’s not just a useful one but it is complete. What isn’t complete is a recursively enumerable axiom system that contains arithmetic of the standard model of integers.

Overall it is a nice list and well done.

EDIT: Deleted a sentence that was pointed out to me that was wrong in the context of the assumptions made in the paper.

>Number 20 in logic is incorrect.

>What isn’t complete is a recursively enumerable axiom system that contains arithmetic of the standard model of integers.

That's exactly what #20 says. The "axiom system" there, by definition, includes Peano axioms (2nd sentence of #20).

I didn’t read that part but the statement is still wrong. Take as your model the standard model of integers. Collects all true statements of this model. Make this your axiom system. It is now complete and contains the first order Peano Axioms.

Then that system is not finitely axiomatizable (there are infinitely many truths in that model) so the original claim is still true. It's true for all finitely axiomatized theories.

The author of the article does not stipulate that the axiom system has to be recursively enumerable. From the article:

We assume it to contain the basic Peano axioms of arithmetic. An axiom system is complete, if every true statement can be proven within the system.

The theorem in the article is true for recursively enumerable axioms systems that contain first order arithmetic.

EDIT: Note that the second order Peano Axioms are categorical but not recursively enumerable. This is also an axiomatic system that has finitely many axiom schema.

Not finite, but recursively enumerable, no? Since you can just start from PA and derive conclusions forever, so this is an effective procedure in that any true statement will eventually be thus produced.

No, even if you start with the axioms of PA and enumerate all theorems provable, you'll miss some true sentences, one of them being the Goedel sentence of PA.

Did the author specify “finitely axiomatizable”? Because neither Peano arithmetic nor ZFC are finitely axiomatizable, for example.

Also, “not finitely axiomatizable” doesn’t mean “having infinitely many truths in the (?) model”. Not sure what you meant by that.

> Also, “not finitely axiomatizable” doesn’t mean “having infinitely many truths in the (?) model”. Not sure what you meant by that.

If you take all truths in the standard model of arithmetic, and make them your axioms you'll get a complete and consistent system with infinitely many axioms, and an extension of Q.

If one uses Roser's trick, Godel says a system cannot be all of these at the same time:

* a conservative extension of Q ("minimal amount of arithmetic")

* finitely axiomatizable

* complete

* consistent

It is ok to have any <=3 of these 4.

Note that PA is a conservative extension of Q. So you can replace Q with PA above.

Godel's original proof doesn't generalize this much but using Roser's trick Godel immediately implies this.

Nit: The relevant notion is not finite axiomatizability but recursive axiomatizability.

I came into the link ready to nitpick the choices (for some reason which I really should examine), but it's actually a nice birds-eye-view of the field that is rare to see. I feel like mathematicians, perhaps more than most, tend to stay siloed within our specialized subtopics.

I belive it's because math is such a "clean" field that it doesn't take many people to 'clear' a topic and advance a subfield or branch into sub-subfields. So you don't have a large number of people all rooting around in the same area speaking the same language.

With complex sciences like biology, or worse, wet lab experimental science, it takes so much drudgery just to get data to work with, that you can have 100x people working in the same area, trying to cover the data/processing requirements to support theorizing and cataloging discovered entities.

For a fantastic and deep bird eye view my go-to ref is the Princeton companion to mathematics.

A great book in a similar vein is Proofs from THE BOOK by Martin Aigner, Karl H. Hofmann, Günter M. Ziegler.

It's a reference to Paul Erdős and his notion of "The Book" in which God had written down the most beautiful and elegant proofs.

I read the link title and assumed it was referring to things like the "Fundamental Theorem of Algebra" or the "Fundamental Theorem of Number Theory" etc. It was.

This list is fun, but it's a little bit sloppy:

- The central limit theorem relies on the mean and variance of your random variable existing; there are random variables for which mean and variance don't exist.

- The definition of "measure-preserving" in the ergodic theorem statement is missing the measure.

- dim (ran A)^perp = dim ker A^T can be strengthened by dropping the dim's. If v in ker A^T, then A^T v = 0. Pick any Ax in ran A; then <v, Ax> = <A^T v, x> = <0, x> = 0, so v is in (ran A)^perp.

- Others pointed out that 20 looks busted.

- The definition of Haar measure needs to fix the measure of some nonzero-measure set, or "unique" needs to be replaced with "unique up to scaling" otherwise the theorem isn't true.

- "Sounders Mac Lane"

- 25: x_0 came out of nowhere; it's unclear from the text which "invertible" is meant; the basin of convergence for Newton's method for finding a local continuation can be very small indeed.

- 36: Why is the adjoint of A named T^*?

- 46: You need some assumptions about f.

I think you're being a bit harsh.

The central limit theorem does hold for iid rv (independent identically distributed random variables) with finite mean and variance. Now, those assumptions can be relaxed (the rv need not be independent, but they can't be too dependent, and they need not have finite variance, but they can't be too "far out"), and some of the pertinent proofs are only a few decades old; but you can hardly expect a survey with 135 proofs to cover all the subtleties.

Some of the other points may be more egregious howlers, but, again, come on - this is not the definite reference for any one of those theorems.

There's more than enough neat stuff I don't know in there. Unfortunately it's tough to trust a source with so many errors in the stuff I do know.

I wasn't aware of any CLT for iid random variables with infinite variance. Do you have references?

Check Valentin Petrov Limit Theorems of Probability Theory.

Here [1] is a CLT for RV with infinite variance, Prop 3.1.12, but notice the (larger) scaling coefficient (1/sqrt(n log n)).

Also see the second answer on SO here [2].

[1] https://web.stanford.edu/~montanar/TEACHING/Stat310A/lnotes....

[2] https://stats.stackexchange.com/questions/169611/the-role-of...

EDIT to add: Having said that, the Lindeberg-Feller and the Lyuapunov formulation of the CLT do require finite variance, so maybe I was too quick in stating that that assumption can be relaxed.

In other words, a 130 page write up focused on the overall picture instead of lots of details has the usual number of typos without a couple passes from a skilled editor.

I love Oliver Knill! Not only does he have very broad mathematical interests, he also has this crazy YouTube Channel (https://www.youtube.com/user/knill101/videos) where he sometimes uses a synthetic voice (!) to give talks---supposedly because he cannot talk as fast as he wants to.

He really is quite an interesting charactern!

Thank you for sharing! One thing I noticed is the statement in 20. Logic:

An axiom system is neither complete nor provably consistent.

I think this would benefit from some rephrasing, because such a system could for example be both complete and provably consistent (in the sense given in the paper) if it were, in fact, inconsistent.

Also, IUT is not "Inner-Universal Teichmuller", but rather inter-universal Teichmüller theory.

That one also bothered me, because it is very much wrong.

The correct result is

Any consistent axiom system that describes the natural numbers cannot be complete or provably consistent.

You pointed out that an inconsistent axiom system is neither. And at the opposite extreme, Gödel proved that the axioms for the real numbers are provably consistent. (However those axioms do not allow for induction, and therefore do not allow one to describe the natural numbers!)

This is incorrect as well. Take the standard model of the integers. Now take all true statements in this model. Let this be your axiom system. It is complete. It is not recursively enumerable and therefore not very useful. You can also have second order systems that are categorical. For instance the second order Peano axioms. I believe they are complete.


But honestly, nobody ever convinced me that second order logic wasn't made up BS. :-)

Or that reasoning of a form that can't be verified, even in theory, actually is sensible. (Why yes, I do have Constructivist tendencies, why do you ask?)

Before I took Mathematical Logic in graduate school I believed all kinds of nonsense about Godel’s theorems. Even after the class I still had false ideas about what it really meant. It took a while for the the recursively enumerable part to sink in as to why it is important.

I’ve heard what you say about second order axiom systems before but I don’t know enough about the subject. I naively think, “Why not just use the Second Order Peano Axioms?”. But people far more knowledgeable than me don’t like them so I defer to their judgement.

I don't qualify as far more knowledgeable than you, in fact I'm probably less knowledgeable but with a different point of view.

However with that disclaimer, the fundamental challenge is that when we start reasoning about reasoning, things that "should obviously work" run into trouble. (Most commonly due to variations on the liar's paradox.)

First order reasoning within a first order logic system that we think is good is "obviously correct" reasoning.

Second order reasoning introduces as much reasoning about reasoning as we think we can get away with, hopefully without causing problems. The result is that second order reasoning lets us talk about what kinds of unverifiable statements we can discuss the absolute truth of. But I'm uncomfortable with talking about the absolute truth of any unverifiable statements, which makes it feel like discussing the subtle shades of BS we fool ourselves into believing.


IIRC S. Eilenberg once said "Elegance in mathematics is directly proportional to what you can see in it and inversely proportional to the effort required to see it.". This Knill book is elegant.

Topic by topic for a wide range of pure and well polished applied math, Knill is able and does go right to the most important results and for each gives a short outline of the needed definitions and of a proof, usually with good references to full details, and then states the theorem precisely. Net, for nearly any topic in pure/applied math, go there first.

For many of the topics, I studied them carefully from some of the best sources, but just on a short scan this Knill book often has a better treatment, e.g., has a super nice, simple, practical statement of a fundamental theorem in Fourier series, a super nice statement of the Lebesgue decomposition from the Radon-Nikodym theorem, some nice stuff from Zorn's lemma, and much more, e.g., often some historical notes.

I have to like the connection with Zorn's lemma: At one point I went to a lecture at Indiana University and to the tea before, and an older professor introduced himself as Max Zorn. Since the previous summer I'd taken a course in axiomatic set theory on an NSF summer program at Vanderbilt, I was nearly floored! All I could do was blurt out, "What did Paul Cohen prove?". The next day Zorn gave me his copy of Cohen's paper. I still have it!

At one point I saved FedEx by pleasing two guys from crucial investor General Dynamics by doing a revenue projection with the differential equation

y'(t) = k y(t) (b - y(t))

with y(0), k, and b given. So, the solution is a lazy S curve that starts at t = 0 at y(0) and as t increases rises, has an inflection point, and rises from below asymptotically to b. So, it is a case of growth to a market size of b. With y(t) the current revenue and (b - y(t)) the market revenue yet to be obtained, has the growth rate y'(t) proportional to the current revenue y(t) (number of current customers talking) and the remaining customers (hearing the talking) (b - y(t)). So, it's a simple model of viral growth.

Well, Knill has this differential equation with b = 2 for a case of biological growth!

A supremely amazing collection! Succinctly written, and a joy to read. The curation is fantastic. Some would disagree with the choices, but the task was herculean. (e.g. geometric group theory is not really represented there - but one can't expect to have all of mathematics in 60 pages!)

I ended up going through the list, categorizing these theorems into know/don't know/should know categories. Time well spent.


Mandatory "Hey, there's a typo there" note:

In #77: HOMFLYPT polynomial's list of authors is missing Y: David N. Yetter, a professor in Kansas State University. Hope this omission will be fixed!

I learned knot theory and did research with Yetter in 2007 (which resulted in a paper on Vassiliev Invariants, a subject also mentioned in #77); that was a pivotal point in my life.

Besides mathematics, Yetter's other major interests, to my knowledge, are Orthodox Christianity and anime.

As a math major, this makes me giddy. It's like a family reunion, all my old friends in one place.

The Control Theory (129) section does a real disservice to the field by insinuating it is little more than the study of Kalman filters. While an important model, the core results of control theory are without a doubt stability criterions.

The Lyapunov stability criterion is even quite elegant, despite being powerful: A system is asymptotically stable if there exists a function V where for x/=0, V(x)>0 and [Sum]_i (dV/dx_i) (dx_i/dt) < 0.

Every single subject in the overview is summarized with similar brutality, yet it still reads like an encyclopedia. There's a lot of math out there.

> While an important model, the core results of control theory are without a doubt stability criterions.

I'm guessing you mean "while they (kalman filters) are an important model...". What you've written involves a dangling modifier clause, which makes no sense.

It's amazing to see so many things included and yet so many missing. It really speaks to the breadth and depth of the subject. For instance, for elliptic curves, only a single result is mentioned (their points can be endowed with an abelian group structure), and it's really one of the most basic ideas known about these objects.

Right. If I pick one of these subjects where I know, say, 100x more than what is present, and know of the existence of 1000x more than is present, I can't help but multiply the number of subjects in this overview by that that same 1000x, and then my mind explodes.

>For instance, for elliptic curves, only a single result is mentioned (their points can be endowed with an abelian group structure), and it's really one of the most basic ideas known about these objects.

One might even say it's a fundamental idea.

Which is the whole point of the document :)

Indeed you are right!

The Topos theory example is completely wrong though? > For example, if E is the topos of sets, then the slice category is the category of pointed sets: the objects are then sets together with a function selecting a point as a “base point”.

There's no object X where E/X is pointed sets. If you take the co-slice of a one point set 1\E that would be the category of pointed sets.

The right way to think about E/X is that its objects are "families of sets indexed by X", because a function f : Y -> X defines for each x in X a set f^{-1}(x). If you know about dependent types, this is basically a type dependent on X.

Fundamental Theorem of Group Homomorphisms:

If f is a homomorphism from a group G_1 onto the group G_2, then G_1/ker(f) is isomorphic to G_2

...and that's the tea.

The Church-Turing Theorem is not really a mathematical theorem; It is more like F = Ma in Physics. It's an empirical result.

The statement that "f is generally recursive iff it is Turing computable" is a theorem that can be proven, because we can rigorously define what it means to be Turing computable (computable by a Turing machine which is precisely defined) and generally recursive (which is the smallest set of functions closed under projection, composition, primitive recursion and minimalization). The theorem was proven by Turing and Kleene soon after the concept of Turing computable was introduced.

But if you remove the word "Turing" it is no longer a theorem. It becomes a thesis because we don't have a definition of what it means to be computable. It is in fact an innovation of Turing to identify computable with Turing computable, but that's by no means universal.

I think it's clearer to understand the Church-Turing thesis as two theses, not one. Turing's thesis was that (intuitive) computability equals (formalizable) Turing computability.

So you could completely ignore the recursive functions part (and so, Church's Thesis) and there would still be the same sort of lay confusion about the computability thesis.

It's not a theorem, but that's because it's not proven, or can't be proven really. Unless you're referring to the fact that those three definitions coincide, that is a theorem.

The statement F=ma is either a definition, axiom or theorem depending on what context you're working in. There are more fundamental (empirically tested) axioms that have F=ma as a consequence. Unless you introduce special relativity, then it becomes false.

I don’t think that’s quite the case. The theorem is a mathematical given formalizations of computability (algorithms). Those formalizations were devised to reflect what seems to be the physical limits of computers in the real world, of course. If we discovered more powerful methods of computation in the real world, we would presumably update our mathematical formalizations, but the math behind the current formalizations is correct (in the same way that the math used in Newtonian physics is correct even though we now know that non-Newtonian physics are needed to analyze certain phenomena.

It's not a math theorem.

From http://mathworld.wolfram.com/Church-TuringThesis.html

    There has never been a proof, but the evidence for 
    its validity comes from the fact that every 
    realistic model of computation, yet discovered, 
    has been shown to be equivalent. 

    The Church–Turing Thesis is an extramathematical 
    proposition, not subject to formal proof.

You're right. The thesis is that nothing in the real world can compute a function that a Turing machine cannot, which is not proven (and it's not clear how it even could be proven). What is proven formally is the equivalence of several different formulations.

From #113, Artificial Intelligence

The lebowski theorem of machine superintelligence: No AI will bother after hacking its own reward function.

once the AI has figured out the philosophy of the “Dude” in the Cohen brothers movie Lebowski, also repeated mischiefs does not bother it and it “goes bowling”.

I'm relieved to learn that this is a theorem. :D

The Lebowski theorem: "No superintelligent AI is going to bother with a task that is harder than hacking its reward function" is surprisingly deep.

You can see the theory in action in human history. We have drives programmed by evolution, but we try to to shortcut them. Drugs are deliberate hack for example. TV, entertainment and games can provide rewards faster than effort towards real life goals.

We fully understand that there is difference between the drive's purpose and what we do, but but we don't actually value the purpose of our reward function, only the reward. Catholic church tries to insist that people should limit sexual pleasure to the purpose it was created (by god or evolution) but people don't listen.

>There is no apparent “fundamental theorem” of AI, (except maybe Marvin Minsky’s ”The most efficient way to solve a problem is to already know how to solve it.”

Personally, I consider Richard Feynman's method (1. Write down the problem. 2. Think real hard. 3. Write down the solution.) to be just as fundamental.

The reason being that steps 1 and 3 are actually very important. Step 1 asks that you model the problem you're solving, otherwise you might solve some other problem instead. Step 3 asks that you model the domain of solutions. This helps ensure that it is possible for solutions to even exist.

One of the big problems with unresolved and open-ended philosophical quandaries is that people never specify what they expect a solution to look like. Like if you don't expect a solution to look like a string of plain words, and instead expect it to look like some mystical revelation, then you can't honestly expect to get an intelligent resolution communicated to you.

I'd also say it is integral to understanding the difference between simply solving a problem and intentionally solving a problem.

Naive question: How would one go about learning all about this (or even a subset of this) for the mathematically inclined working engineer? Where does one even start? Just take each theorem from each section and try to follow through the proof? What are the prerequisites required?

The theorems in this list are in many cases central topics in 1+ semester courses, and aren’t really worth worrying about out of context. Overall you are looking at more than 5 years of full-time (40 hours/week) study, maybe more than 10 years. Unless you have extensive background reading and writing proofs, self-study is not likely to be very effective. You could conceivably make your way through with textbooks and the help of an expert tutor/mentor.

For most people the easiest method would be to enroll in an undergraduate mathematics degree at a decent university.

If you wanted to make your way through more of the topics in this list than an undergraduate degree covered, you could follow-up by enrolling in a PhD program.

There are standard texts on the various topics listed in the contents. There are online texts on these topics as well as free to download texts. To learn everything, get a math degree or self-study the curriculum of one. The prerequisite is basically calculus.

The thing about this text is it seems like a more or less random list of theorems - it lists one theory, I think, Zorn's lemma, under topology where Zorn's lemma is more a set-theory-link-to-topology (one of the guises of the Axiom of choice, really). Which to say it has next-to-nothing on point-set topology proper. Other theory listings seem just as random. So I couldn't what order you'd follow trying to learn everything specifically here.

I think it's a great list but the typos bothered me. To me, they seem to indicate sloppiness; and if there is sloppiness, could the mathematical results be sloppy too?

e.g., on page 2 itself:

"Anticipated by Babylonians Mathematicians in examples..."

"Let f be a function of one variables..."

Does anyone have a cheat sheet of sorts for the symbols and syntax used in these?

It doesn't work like that. Some of the same symbols are used in two different places with vastly different meanings, even within this paper.

To give an imperfect programming language analogy, imagine that every symbol is implicitly defined and has lexical scope. The grammar is as far from being context-free as possible. What may sometimes look like individual symbols to you are in fact a collection of inseparable symbols with its own semantic parse.

That's not really helpful. I struggle with the same problem, in fact the notation intimidates me to a point where I decided to forego getting deeper into theoretical computer science and haunts me every time I try to read a paper.

A resource that would explain formal mathematical notation would be great, much like a dictionary.

You can't google those symbols properly either.

Did you know what the words meant? Can you read this and except for the symbols understand everything else?

I don't think the symbols are the problem. They're a very superficial obstacle, at least in this paper.

This isn't meant as a put-down. It's just that I really do think the actual material requires a lot more preparation that consists in a lot more than just learning the definition of the symbols. I could rewrite the whole paper using only text and no symbols. The symbols are just shorthand for English words. I don't think if I were to replace all of the symbols with the English words that they stand for, you would be much closer to understanding.

"The symbols are just shorthand for English words."

OK, so then why would it be so difficult to produce a veritable cheat sheet to aide in reading a proof?

Because the symbols' translation to English words changes on context, culture, and even author. It's like trying to translate the word "rain" only by hearing it without having the context if it really meant "reign" or "rein". With some mathematical training, you don't really need the symbols to know what's being talked about, because you already have enough context to know what should be talked about and what fits into the context, regardless of how it's written down.

Anyway are symbols really the main obstacle for you? When you read "every second countable regular Hausdorff space is metrizable" (theorem 59 in this paper), you have no problem understanding what this means?

Fair point, but you know very well that, for true illustration, you'd have to expand out the meaning of the words "countable", "regular", "Hausdorff", "space" and "metrizable". And I bet you could. I get the sense you're being obstinate for a point.

And don't forget the word "second". I'd have to explain that one too.

And then I'd have to explain the words I used to explain these words.

It all just takes a while. I might be able to do it, but we'd both need to spend some time understanding all of this together, going back and forth, considering examples, and building the foundations.

Hm, that’s fair but I don’t think it’s an excuse for an easy reference to not exist. All you need to do is then provide context with the example. There are many languages with hard an fast rules about usage or formation of words that are products of their context.

It's best to google the theorem mentioned if you want to get more than the gist of what is being discussed.

Some Fundamental Theorems in Mathematics are bigger than others. Some Fundamental Theorems in Mathematics' Mothers are bigger than other Fundamental Theorems in Mathematics' Mothers.

Can you really trust any mathematician's judgment who doesn't use serial commas? ;)

This is not a self-contained exposition, and a lot of the interesting implications of the theorems seem to be omitted (why are these theorems fundamental; and fundamental to what?) It begs the question: why not instead post a list of links to Wikipedia articles? That would allow the reader to click through to definitions that they are not familiar with, as well as explore what these theorems are (practically or theoretically) useful for.

Registration is open for Startup School 2019. Classes start July 22nd.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact