Having 1) studied "naive statistics" when I was young, 2) learned the full-on measure-theory version à la Bourbaki, and 3) now used applied statistics every day, I can say the naive version is the most useful by far. Yes, there is hand waving, but the hand waving works, and you really don't get into that much trouble unless you try to be pathological.
I admit, it would be great to have a version where the naive intuitions (many of which are motivated by empirics, physics, and real-world situations) and the theoretical definitions matched better. Even so, I don't think the formal treatment is appropriate as an introduction -- it's an unnecessary hazing of learning minds when they can get most of the value with a bit of hand waving.
The best way to appreciate the information-theoretic role of sigma algebras is to look at them in the simplest case, where you have a discrete-time, finite-valued process. Then a sigma algebra is equivalent to a partition of the state space and it represents the information that can be gained from an observation; it's like a random variable without specific values, just the discriminating information from different outcomes. To say that a random variable is measurable with respect to the sigma algebra is to say that its value may only depend on information that can be gained from an observation. A filtration of sigma algebras corresponds to a causal series of observations where the observer learns more information over time.
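To make the partition picture concrete, here is a minimal Python sketch (the three-coin-flip example and all names are mine, not from the comment above): a filtration as a sequence of ever-finer partitions, and measurability as "constant on each block".

```python
from itertools import product

# Sample space: three coin flips, e.g. ('H', 'T', 'H').
omega = list(product("HT", repeat=3))

# A filtration as a list of partitions: after t flips, outcomes that
# agree on the first t flips are indistinguishable to the observer.
def partition_after(t):
    blocks = {}
    for w in omega:
        blocks.setdefault(w[:t], []).append(w)
    return list(blocks.values())

filtration = [partition_after(t) for t in range(4)]

# A random variable is measurable w.r.t. a partition iff it is constant
# on each block, i.e. it depends only on the observed information.
def is_measurable(X, partition):
    return all(len({X(w) for w in block}) == 1 for block in partition)

def heads_in_first_two(w):
    return w[:2].count("H")

print(is_measurable(heads_in_first_two, filtration[1]))  # False: one flip isn't enough
print(is_measurable(heads_in_first_two, filtration[2]))  # True
print(is_measurable(heads_in_first_two, filtration[3]))  # True: finer information still suffices
```

The last line illustrates the "causal" point: once a variable is measurable at time t, it stays measurable for every later (finer) stage of the filtration.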
The conditional expectation of a random variable with respect to a sigma algebra (or partition or other random variable) is another random variable that tells you the expectation over the states consistent with a given observation; this new random variable is measurable with respect to the sigma algebra you conditioned on, which as mentioned earlier means it only depends on the information gained from an observation. The conditional expectation is the best least-squares estimator given the information from an observation in the same way that the usual expectation is the best least-squares estimator given no information.
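Continuing with a toy finite example of my own (not from the comment): condition the total head count on the first flip, average over each block of the partition, and check the least-squares claim against one arbitrary alternative block-constant estimator.

```python
from itertools import product

# Uniform probability on three fair coin flips.
omega = list(product("HT", repeat=3))
p = {w: 1/8 for w in omega}

# X = total number of heads (the variable we want to estimate).
X = {w: w.count("H") for w in omega}

# Observation: the first flip only -> a partition into two blocks.
blocks = [[w for w in omega if w[0] == s] for s in "HT"]

# Conditional expectation E[X | first flip]: average X over each block.
cond_exp = {}
for block in blocks:
    mass = sum(p[w] for w in block)
    avg = sum(p[w] * X[w] for w in block) / mass
    for w in block:
        cond_exp[w] = avg

print(sorted(set(cond_exp.values())))  # [1.0, 2.0]: constant on blocks, i.e. measurable

# Least-squares property: its mean squared error beats any other
# estimator that is also a function of the observation alone.
err = sum(p[w] * (X[w] - cond_exp[w])**2 for w in omega)
err_bad = sum(p[w] * (X[w] - (2.5 if w[0] == "H" else 0.5))**2 for w in omega)
print(err < err_bad)  # True
```

The arbitrary competitor (2.5 / 0.5) is just one sanity check, not a proof; the general least-squares statement is the projection theorem in L^2.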
To put it another way - think about all the pathologies you may encounter working with even a "nice" function space like L^2(R). You'll never deal with those pathologies in reality, because every empirical function is much better behaved - finite domain, finite range, and even if you assume a continuous domain, you can choose a model that only has a finite number of discontinuities, is Lipschitz continuous in between them, has finite total variation, etc. And that's why the hand-wavey, "intuitive" approach works so well if you're not a theorist, at least in my opinion.
There are better approaches to measure theory which live in different "foundations". For example, you can build measure and probability theory based on the locale of valuations on a locale instead of a sigma-algebra on a topological space. You can do even better by starting in a constructive metatheory and adding some anti-classical assumptions which are modeled by all computable functions.
The reason we teach classical measure theory as the foundation of probability theory is partly historical and partly that there are no good expositions available for most alternative approaches. It is really not the most straightforward approach.
Before you accuse me of being overly negative: classical measure theory offers a consistent approach to probability theory which is well understood and for which carefully written textbooks are available. If you really need to go back to the definitions to derive something then you need to know at least one consistent set of definitions. So it is useful to teach measure theory, even if it is more complicated than it has to be...
Everything in mathematics is divorced from reality. Unbounded integers are divorced from reality. (Once you move beyond naive realism, bounded integers are divorced from reality, but that's a deeper philosophical debate.) The only question is whether these models are more or less effective for their various theoretical and applied purposes.
> There are better approaches to measure theory which live in different "foundations". For example, you can build measure and probability theory based on the locale of valuations on a locale instead of a sigma-algebra on a topological space.
Better by what definition? According to the practical needs of students, pure and applied mathematicians, etc? I've studied some topos theory and know a little bit about locales from the Topology Via Logic book, but it's hard for me to see that as anything more than a fun curiosity when considering the practical needs of mathematics as a whole. In my mind that kind of thing is much closer to navel-gazing than something like measure theory.
> It is really not the most straightforward approach.
The onus is on critics to do better. Dieudonne/Bourbaki made a valiant and elegant attempt even if they intentionally snubbed the needs of probability theory. And "better" will obviously be judged by the broader community.
Mathematics is an abstraction, but it is still useful for talking about concrete problems. Your mathematical assumptions can be either close or far away from your problem domain. Sometimes we introduce idealized objects, such as unbounded integers, in order to abstract further and simplify our reasoning.
These ideal objects can then either be "compiled away" in specific instances, or really do ignore corner cases which might invalidate your results.
For an example of the former, you can assume that there is an algebraically closed field containing a specific field, give an argument in terms of this closure and then translate this argument to one which does not construct the closure explicitly. The translation is mechanical and does not represent additional assumptions you made.
The second kind of ideal object is something like the real numbers applied to physics. We can think of a real number as an arbitrarily good approximate result. In practice we can only ever work with finite approximations. At the scales we are operating on the difference is usually not relevant, but there might, for example, be unstable equilibria in your solutions which are not physically realizable.
> Better by what definition?
Informally, better because it is "simpler". There are fewer corner cases to consider, theorems are more inclusive, constructions are more direct.
Formally, the theory has more models and is therefore more widely applicable. Theorems have fewer assumptions (but talk about a different and incompatible type of objects).
> The onus is on critics to do better. Dieudonne/Bourbaki made a valiant and elegant attempt even if they intentionally snubbed the needs of probability theory. And "better" will obviously be judged by the broader community.
Oh, sure, but that's not what I want to argue about.
I can tell you with certainty that classical measure theory is complicated by the interplay of excluded middle and the axiom of choice. This is a technical result. You can see this yourself in textbooks every time the author presents an intuitive "proof idea" which then has to be refined because of problems with the definitions. In alternative models, or in a metatheory with alternative assumptions, the simple proof idea usually works out fine.
It's a matter of degree. You can't point to a particular large integer as being "too large" to matter in real life, but there are plenty of objects in measure theory (like unmeasurable sets) that blatantly violate physical intuitions and don't seem to exist in any sense in real life.
That's not entirely fair. People doing controls work with continuous-time models often enough, so there are some practical benefits.
I agree with the rest of your comment though.
I think you're overstating some things, but I mostly agree. My main disagreement is with your implicit premise that the best practical theory should exist at the same level of abstraction as practical applications. The real numbers have been a really successful practical theory. Physicists and applied mathematicians know they don't "really" exist, but more "realistic" alternatives are awkward and messy. The same applies to more extravagant theoretical constructions like Hilbert spaces. They're an extremely nice mathematical setting for applications (e.g. optimal control, approximation theory, finite element methods, quantum mechanics). No-one should be losing much sleep over their ubiquity in applications. If your point is that we shouldn't belabor some of their technical details when teaching them to practitioners, sure, but that's already the case.
It might be possible to develop alternative proofs on purely finite/approximate mathematics, but for a working applied mathematician who already went through standard math grad school curriculum that is probably more trouble than it’s worth.
The users of those mathematical tools (whether software implementors or people just calling some software library) usually don’t need to care about the details of the proofs.
This is similar for other kinds of science/engineering.
Oh for sure, and perhaps this is just a confusion of terms, but I think that's what the thread parent meant by "applied statistics". In academia, "applied math/statistics" can mean "I'm doing theoretical math with an eye towards applications but it still requires heavy mathematical machinery", but it can also mean "I'm using mathematical tools to solve empirical problems, and I'm never going to need to worry about Lebesgue measures".
And by the modern style, I meant starting out with analysis, defining the axioms of what measures are, demonstrating the existence of nonmeasurable sets with the axiom of choice, etc. IIRC, the course did rely on Borel algebras for its buildup, but did not openly build up from sigma-algebra machinery.
Bourbaki's shortcut to Radon measures is very elegant but it's noteworthy that unlike many other Bourbaki innovations I don't think it was picked up by other textbook authors. Already at that point there was a mathematical consensus that measure theory was a valuable part of the foundations of modern mathematics and shouldn't be eliminated or minimized.
Outside probability theory, measure theory is primarily used as a foundation for integration ("expectation"). There are also more specialist subjects like geometric measure theory; there's an excellent introductory textbook called Measure Theory and Fine Properties of Functions, and if you look at its table of contents you can get an idea of the breadth of topics.
His real complaint, as far as I've ever been able to determine, is that he is a highly symbolic thinker, and because of that he won't accept certain assumptions that everyone else takes as a given - usually what the Reals are. I'm very happy to accept that any length in geometry is a number by definition (hence sqrt(2), constructed by a 1-1-sqrt(2) triangle, is clearly a number corresponding to that length). He won't accept sqrt(2) as a number because it can't be represented in Hindu-Arabic notation. This isn't really a logical issue; he just won't use everyone else's definitions.
He's worth listening to because he is good at maths despite that handicap, and his perspective is interesting: it provokes a bit of reflection on what your assumptions are and what infinity really means anyway. His complaints are otherwise unlikely to catch on.
You sure of that? Based on another of his blog posts, his objection seems to be about uncomputable real numbers. Very roughly, a real number R is computable iff there exists a Turing machine that, given a natural number n on its initial tape, terminates with the nth digit of R. See  for a formal definition. Sqrt(2) and all familiar real numbers are computable. Of course, since there are only countably many Turing machines but uncountably many reals, some reals must be uncomputable. Some versions of constructivist mathematics do differ from standard mathematics by rejecting the uncomputable reals and instead defining "real numbers" in such a way that they are essentially the computable reals.
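As an illustration of that definition (my own sketch, not from the post): sqrt(2) is computable because a short program produces any requested decimal digit exactly, using only integer arithmetic.

```python
from math import isqrt

# A real number is computable if some program maps n to its nth digit.
# For sqrt(2), exact integer arithmetic avoids floating-point error:
def sqrt2_digit(n):
    # floor(sqrt(2) * 10^n) computed exactly as isqrt(2 * 10^(2n));
    # its last digit is the nth decimal digit of sqrt(2).
    return isqrt(2 * 10**(2 * n)) % 10

print([sqrt2_digit(n) for n in range(1, 9)])  # [4, 1, 4, 2, 1, 3, 5, 6] -- sqrt(2) = 1.41421356...
```

Nothing here depends on floats or approximation, which is exactly why sqrt(2) counts as computable: the digit stream is a terminating integer computation for every n.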
Eg, "These phoney real numbers that most of my colleagues pretend to deal with on a daily basis ... such as sqrt(2), and pi, and Euler’s number e." 
Even in the article you cite, the irony of a pure mathematician, of all people, complaining that a concept has no tangible link to reality is a bit of a giveaway that he is speaking from the heart rather than the head. That isn't a valid complaint about pure mathematics; the point is patterns for patterns' sake. So what if there are no known examples of your pattern? Study it anyway!
Great case of the flaw maketh the masterpiece; apart from that one little quirk with infinite things he is a lovely character and a force to be reckoned with. And I expect his personality motivates a lot of interesting research from him regardless.
Then I watched a couple of his more advanced videos and (from my limited watching) saw that he seems not so crazy after all. It's just that he seems to like natural numbers and finite constructions a lot, although I didn't really fact check that much. Infinite and more abstract structures _abound_ (it's in their nature :P) in mathematics obviously and can be encoded symbolically just fine.
Seems fine by me, finite structures are very important as well and you can make reasoning about them very rigorous. It's just, maybe he shouldn't be teaching about all those other kinds of topics...
Interestingly, that set of numbers is still very incomplete relative to what we expect to be able to talk about in modern math. It doesn't even include the roots of all polynomials (for example, the unique positive solution to x^3 - 2 = 0 isn't the length of any constructible segment in classical geometry).
You could also ignore polynomials entirely. Since Q(sqrt(2)) is just a two-dimensional vector space over Q, we could define it as the set of ordered pairs Q^2 with an appropriate definition of multiplication and division (the way we often define the complex numbers for high schoolers), but this also gets ugly.
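For what it's worth, here is roughly what that pairs construction looks like in Python (the class name and the printed representation are mine): sqrt(2) never appears as a "number", only as the rewriting rule r^2 = 2 on pairs of rationals.

```python
from fractions import Fraction as Q

class QSqrt2:
    """A pair (a, b) of rationals representing a + b*sqrt(2)."""
    def __init__(self, a, b=0):
        self.a, self.b = Q(a), Q(b)
    def __mul__(self, o):
        # (a + b*r)(c + d*r) = (ac + 2bd) + (ad + bc)*r, using r^2 = 2
        return QSqrt2(self.a * o.a + 2 * self.b * o.b,
                      self.a * o.b + self.b * o.a)
    def __truediv__(self, o):
        # Multiply by the conjugate: 1/(c + d*r) = (c - d*r)/(c^2 - 2*d^2)
        n = o.a * o.a - 2 * o.b * o.b
        return self * QSqrt2(o.a / n, -o.b / n)
    def __eq__(self, o):
        return (self.a, self.b) == (o.a, o.b)
    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"

r = QSqrt2(0, 1)  # the pair playing the role of sqrt(2)
print(r * r)      # 2 + 0*sqrt(2): the defining identity holds on pairs
print(QSqrt2(1, 1) / QSqrt2(1, 1) == QSqrt2(1, 0))  # True: division works
```

Every operation is exact rational arithmetic on pairs, so a finitist can't object to any individual computation; the ugliness is just that the rule r^2 = 2 has to be baked into the multiplication by hand.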
I guess the moral of this post is that 99% of the time, when a finitist or constructivist complains, you can rework your theory into an uglier one that avoids the complaint.
I wonder why he never talks about this – surely as a mathematician he should be aware of intuitionistic logic? (Or is it that as a computer science and linguistics wannabe, I am aware of it but many mathematicians aren't bothered to take a look?)
If you stop thinking of axioms as value judgements and instead as definitions of formal systems (or adopt a more general system that encompasses others, such as Gentzen style sequent calculus with the only axiom being modus ponens), you achieve a certain peace of mind. But obviously his beef is not only that; he would like other people to admit the value of the thinking he prefers. And I agree with that.
Heck, I know professional logicians (mostly model theorists) who have never seen intuitionistic logic, in any context, ever. That said, Norman does know a fair bit about intuitionistic logic and constructive mathematics, and even some type theory. He does not like the underlying philosophy any more than he likes classical mathematics.
> Gentzen style sequent calculus with the only axiom being modus ponens
Sidenote: a calculus where modus ponens is an axiom is definitely not a Gentzen-style sequent calculus.
That said, there are philosophical questions about truth, infinity, unknowable statements and the like. Mathematicians have by and large settled on a set of answers to these. Every statement is true or false, regardless of whether we know the answer, or even whether we can know the answer. Infinite sets exist, and are described by a known set of axioms called ZFC. Almost all real numbers that exist can never, even in principle, be written down or described in any meaningful way. (In what sense do they exist again?)
All of these statements are part of classical mathematics.
Almost every elementary exposition will implicitly assume that they are true. Yet they can all be questioned, and their truth can never be settled in any absolute sense. However, woe betide the student who dares question these in a math class.
I don't think many people see that as a matter of philosophy.
This question strikes at the heart of the debate between Constructivism and Formalism. A debate about what it means for things to exist, statements to be true, and so on. This is very much a matter of philosophy.
To a Constructivist, most of classical mathematics is nonsense. And Constructivism is at least as logically consistent as classical mathematics.
More precisely any contradiction found in Constructivism necessarily will lead to a contradiction in classical mathematics. The converse is only partially true. Gödel did prove that a logical contradiction in the classical handling of infinity will lead to a contradiction in Constructivism. But a flaw in a specific set of classical axioms, such as ZFC, need not lead to a flaw in usual Constructivism.
In real analysis you learn that "almost all" means that the exceptions are a set of measure 0. Since all countable sets have measure 0 (cover the nth point by an interval of length ε/2^n, for a total length of at most ε), the result is trivially true in classical mathematics.
In the constructible universe, you again have measure theory.
Almost all still has a perfectly well-defined meaning. And all sets with enumerations again have measure zero, just like in classical mathematics. But "uncountable" now is a statement about self-referential complexity, not size. Next, "the set of all numbers with finite definitions" is not a well-defined set. And numbers without concrete definitions do not exist.
A profound question to reflect on; and interesting to contrast with the suggestion from modern science that there might not be any continuous objects present in reality. It might be discrete all the way down.
In a sense, it seems like the uncomputable reals are an artifact of assuming continuity, i.e., that between any two numbers located on a line there must be more numbers. Part of the reason it is so unintuitive is that we don't have any real lines to play with at the physical human scale; they fall apart at the atomic level and turn out to be non-continuous approximations.
Maybe "connectedness" is the notion you're trying to get at -- the real numbers are topologically connected, but the rationals aren't. If "A" is the set of rational numbers x with x^2 < 2, and B is the set with x^2 > 2, then the rationals are the union of A and B, and there is a "hole where sqrt(2) should be", so the rationals are disconnected. It's possible to define the word "connected" in a way that makes this notion precise.
A related notion is what's called "(sequential) completeness". The infinite sequence whose terms are (2, 2 + 1/2, 2 + 1/2 + 1/6, 2 + 1/2 + 1/6 + 1/24, ...), where the nth term is obtained by adding 1/(n!) to the previous term, intuitively "should" converge, since its elements get arbitrarily close together as n gets arbitrarily large. Any such sequence converges to a real value (this one converges to the exponential constant "e"). But if our number system is only countably infinite, there must be some sequences that get arbitrarily close together but don't converge. For example, if we restrict ourselves to rational numbers, this is a valid infinite sequence (every element is rational), and its terms get arbitrarily close together as "n" is large, but it doesn't converge to anything.
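You can watch this failure of completeness numerically (a sketch using Python's exact rationals; the cutoff at n = 11 is an arbitrary choice of mine): every partial sum is a bona fide rational, successive terms get arbitrarily close together, yet the limit e lies outside the rationals.

```python
from fractions import Fraction
from math import e, factorial

# Partial sums of 2 + 1/2! + 1/3! + ... , kept as exact rationals.
s, terms = Fraction(2), []
for n in range(2, 12):
    s += Fraction(1, factorial(n))
    terms.append(s)

# The gaps between consecutive terms are 1/n!, so they shrink fast:
gaps = [terms[i + 1] - terms[i] for i in range(len(terms) - 1)]
print(all(gaps[i + 1] < gaps[i] for i in range(len(gaps) - 1)))  # True

# The sequence is Cauchy within Q, but its limit e is irrational,
# so inside the rationals it "converges to nothing".
print(float(terms[-1]), e)  # the rational partial sums approach e
```

Each printed partial sum is an honest fraction; the point is that no fraction is waiting at the end of the sequence, which is exactly what sequential completeness of the reals repairs.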
This is the assertion of many mathematicians, but the justification for it is “this is convenient” and/or “we take this as an article of faith” (or often “I never really thought about it, but it doesn’t much affect my work day to day one way or the other”).
There is no way to prove that infinite sets “exist” by reasonable definitions of “exist”. Indeed, by a conventional definition of “exist” infinite sets pretty clearly don’t qualify.
Instead, mathematicians have redefined the words “exist” and “true”. In mathematics it now means something like “if we accept a particular set of non-obvious and rather handwavey premises, we will also accept any conclusions that result from symbolic manipulations thereof following our established formal rules.” [This is not a full or precise definition of mathematical existence; folks interested can do a search for those keywords and find piles of material.]
* * *
Personally I am happy to accept ZFC or the like because it is convenient and I can’t be bothered to work up an alternative system from scratch and carefully examine all of the conclusions that might follow from that, and whether ZFC is “true” or not doesn’t really affect me. It seems intuitively wrong to me, but I remain agnostic.
In my experience, people won't come out and say it, but this seems to be what everyone is thinking. :)
The problem with this is that it is wrong.
Classical ZFC in particular is a very strong and specific set of assumptions* with a very tenuous link to any practical application. If you actually want to develop a useful bit of mathematics it makes sense to consider the "foundations" as a moving piece. It's a part of the design space for modeling your problem domain, not some god-given notion of truth.
You can translate between different logical theories by building a model of one in another, so it's not like you lose anything. But it's kooky to insist that we should start with ZFC of all things.
*) I mean that second-order ZFC has basically no non-trivial models, so there is no real way of extending ZFC to talk about domain specific aspects of your problems.
> You can translate between different logical theories by building a model of one in another, so it's not like you lose anything. But it's kooky to insist that we should start with ZFC of all things.
I don't see how these two are consistent. Almost everything most mathematicians do can be done both in ZFC and your favourite non-kooky axiom system. Certain Powers That Be seem to have decided that ZFC is the foundation of mathematics, so they say that what they're doing follows from ZFC even if they have a very hazy idea of what it is, but why does it matter? Most mathematics probably won't be formalized in their lifetime anyway, so whether it ends up being formalized on top of ZFC or something else doesn't affect them.
You would be amazed at how many uniqueness and existence theorems in how many areas of mathematics require Zorn's Lemma. Which is, of course, equivalent to the axiom of choice. For example, "Every vector space has a (possibly infinite) basis." Or, "Every Hilbert space has a (possibly infinite) orthonormal basis."
It is rare for mathematicians to think much about choice. But it underpins key results in a surprising number of fields.
> "Almost everything most programmers do can be done both in x86 assembly and your favorite non-kooky programming language. Certain Powers That Be seem to have decided that x86 is the foundation of computer science. [...] Why does it matter?"
The problem is that it is difficult to translate results in a theory built in ZFC to other "architectures". In mathematics, the architectures in question are not different axiom systems, they are different branches of mathematics.
Let me give you an example. There is a large body of work on differential geometry with many useful constructions. Classical differential geometry works directly in a model where manifolds are certain subspaces of (countable products of) R^n. These constructions have been successfully imported into many different areas of mathematics. In most cases people just had to tweak the definitions slightly and adapt the proofs by keeping the basic strategy and changing all the details.
What is happening here is that the underlying ideas of differential geometry are not specific to this particular model.
When faced with such a concrete model, our first instinct should be to abstract from it and ask which assumptions are required. This is difficult in ZFC, because in the end you have to encode everything into sets. It's not possible to reason about "abstract datatypes" directly, without literally building a notion of logic (generalized algebraic theory) and models of it within ZFC. Even then, the existence of choice means that you usually have to exclude unwanted models of your theory.
Coming back to differential geometry: You can generalize a lot of it by working in a differentially cohesive (infinity-)topos. This is terribly indirect (in my opinion) and loses a lot of the intuitions. A topos is literally a model of a certain logic. Alternatively you can work directly in this logic (the "internal language" of the topos), where the differential geometric structure is available in the form of additional logical connectives. You are now talking in a language where it makes sense to talk about two points being "infinitesimally close" and where you can separate the topological from the infinitesimal structure.
At the same time you reap the benefits that there are many more models of differential cohesion than there are models of "R^n modeled in classical set theory". You can easily identify new applications, which might in turn suggest looking into different aspects of your theory. It's a virtuous cycle. :)
This approach is deeply unnatural when working in set theory or arithmetic. You have to encode everything into sets or numbers and then these sets or numbers become the thing you are studying.
I'd make the following alternative analogy: I code in Python on an x86, because that happens to be the machine on my desk. If you told me I should be using POWER instead of x86, I'd probably just shrug: I could do that - my work is portable - but it's also completely irrelevant to my work. I think this would be how most people in say analysis, algorithms or combinatorics feel, for example.
In particular I maintain that any competent logician or set theorist should be able to explain to you the sense in which the statements that I made cannot logically be settled, and also explain to you the extent to which the rest of mainstream mathematics accepts these statements as true.
Mathematics and physics both began with the advent of astronomy: Babylonians and others were curious to trace star patterns and from there both physics and math developed in tandem and influenced each other greatly. Really, neither would have developed without the other.
Calculus was invented for calculations in physics. This gave rise to differential equations, which we use to model so many nontrivial things. The differential equations describe flow and continuity and, arguably, reality. Differentiation and smoothness can't be defined over finite sets in the same way. My philosophical counter-argument to finitists is that clearly we're on the right path to understanding the universe and nature when using the reals. It seems foolish to shy away from this because computers have trouble computing some functions. Statements like "there are only finitely many atoms in the universe" don't improve our understanding of much, but PDEs explain a great deal.
Arguably mathematics began with accounting and surveying and trade, and more generally with counting and measurement.
Tell me. In what sense does an abstract concept exist that has no possible definition or unique description?
It's the same idea for the reals; we don't actually care about the vast majority of the real numbers, but since it's hard to know ahead of time which ones we will care about, we might as well prove things about all of them!
Now compare with the kinds of results that classical mathematics gives us. The Robertson-Seymour theorem (see https://en.wikipedia.org/wiki/Robertson%E2%80%93Seymour_theo... for the theorem) says that certain classes of graphs are characterized by a finite forbidden set. This means that membership can be tested by a polynomial-time algorithm. However, the construction provides no way to actually find that finite set. It also provides no way to find how many members it has. Not only does it provide no way to prove that you actually have all of them for a given class of graphs, but there are classes of graphs for which it is impossible for us to prove that we have a complete list. Not only in practice, but in principle.
So the theorem asserts the existence of a finite set. But in what meaningful way does it exist, or is it finite?
Also the definition that you gave is complete nonsense to a Constructivist. And the reasoning that forces it to be uncountable in classical set theory requires reasoning that also makes implicit assumptions you are probably not aware of.
Also, if we want to be precise, I'd like to hear your definition of "finite definition". The only definition of "definability" I'm familiar with is relative to a model of a theory, and if ZFC has models at all then it has countable pointwise-definable models -- but I guess that won't satisfy your idea of "finite definition".
The fact that this may not be a definition from my point of view is irrelevant - I was making a statement about what classical mathematics implies, and from the point of view of classical mathematics this is a perfectly reasonable definition.
Since the number of such statements is countable, and only some of them define real numbers, the set of such real numbers is countable. Being countable it is a set of measure 0, and therefore the "almost all" that I stated follows immediately.
This said, his complaints about set theory would have been (probably) more convincing if he stated the axioms correctly. The axiom of infinity does not simply say that some nebulous infinite set exists; it states the existence of a set with some very specific properties. His complaints about the undefined nature of a 'property' are also not above criticism: Gödel-Bernays set theory (a conservative extension of ZF) eliminates it completely.
In the early decades of the 20th century there were furious debates about the philosophical foundations of mathematics (between Brouwer's Intuitionism and Russell's Logicism, and also Hilbert's compromise of Formalism), but they all stemmed from an underlying assumption that mathematics gains its validity from some notion of philosophical truth. One could argue to no end about how the truth of mathematics is established, but a different perspective later emerged that avoids this debate altogether: that mathematics takes its validity not from truth but from utility (which is always relative to a specific task). We cannot say that one of those views is unacceptable, and if your position is that mathematical validity stems from utility, we cannot tell you that your foundation is shaky if its utility is established.
Mathematics is about exploring the consequences of your axioms. Axioms are chosen, not dictated by the universe. If your axioms turn out to be inconsistent, well, congratulations on successfully showing that consequence of those axioms.
Case in point: the axiom of choice. There is no objective truth value to it. The physical world gives us no 'answer'. Mathematicians can choose to adopt it, or not, and then explore the consequences. ( https://en.wikipedia.org/wiki/Axiom_of_choice#Statements_con... )
It is also doubtful that this is the perception today. If we want to use mathematics to derive any result about the physical world -- and we most certainly do -- our axioms must be consistent with it. How do you ensure that that is the case?
But such questions are not part of mathematics itself, just as the scientific method is not part of physics, but rather belong in a more fundamental discipline, called foundations of mathematics, or the philosophy of mathematics, which is usually studied by philosophers/logicians.
Brouwer: On the Significance of the Principle of Excluded Middle in Mathematics, 1923
That's a contradiction, and it's easy enough to show it: a mathematician can research the consequences of the axiom of choice, and can research the consequences of its negation. It would be silly to deny that such research is legitimate mathematics.
Brouwer's statement strikes me as circular. Beyond that, it seems to me that the law of excluded middle is patently false. I already gave a counterexample: the axiom of choice. Neither the axiom nor its negation has a derivable truth value to settle upon.
Ah, I see I'm behind the times 
As to Russell, he of course turned out to be wrong. I refer of course to Gödel. I'm not sure I see your point in mentioning him, or for that matter Brouwer - do explain.
> our axioms must be consistent with it
They needn't be. We hope that we have the mathematics to advance physics, but we needn't hope that all mathematics is applicable to physics.
It's perfectly legitimate for a mathematician to explore the consequences of denying the axiom of choice. (I already linked to such.) Such work would presumably never have application in physics, but it's still legitimate mathematics.
> How do you ensure that that is the case?
As you later allude to, that isn't a question of mathematics; it's a question of how and why mathematical discoveries map onto observable realities like physics and statistics. This is a completely legitimate line of questioning. The canonical article about how remarkable this is: Wigner's "The Unreasonable Effectiveness of Mathematics in the Natural Sciences".
> such questions are not part of mathematics itself
Ok, but those pre-Gödel mathematicians were profoundly mistaken about the nature of mathematics. To put it bluntly: they were wrong, so why should I care what they thought?
> It does not seek to mathematically derive theorems from mathematical axioms, but to derive mathematical axioms from philosophical underpinnings
I don't agree. If you don't add new axioms, then you are 'merely' deriving theorems. If you do add new axioms, fine, you're just adopting a new set of axioms to explore through derivation.
If some set of axioms gives rise to interesting mathematical consequences, does that mean these axioms must have philosophical underpinnings? No. Does the 'validity' of the mathematics depend on the philosophical underpinnings of the axioms? No: it counts as mathematics either way.
It would be pure silliness to dismiss non-Euclidean geometry on the grounds that "Hey, you just made that up!"
Perhaps there's a gap somewhere in my account of things, but I'm not seeing one so far.
> to derive mathematical axioms from philosophical underpinnings -- whether they are physical reality, some Platonic reality, or even common sense
Which of these underpins non-Euclidean geometry? How about fields with no practical applications? How about, as I've mentioned several times, research into what happens when you deny the axiom of choice?
They're all still valid fields of mathematics. The 'underpinnings' of the chosen axioms are of no consequence: it's valid mathematics either way.
The miracle is that we're able to be so successful with fields like physics and statistics. Picking the right special sets of axioms, we've been able to derive huge amounts.
> the SEP link I provided above is a great place to start
Looks it - will read when I get a moment.
How do you know they were wrong? Has there been some breakthrough discovery in philosophy? And if they were, Gödel was wrongest of them all. He was the most extreme Platonist of them all.
> If some set of axioms gives rise to interesting mathematical consequences, does that mean these axioms must have philosophical underpinnings? No. Does the 'validity' of the mathematics depend on the philosophical underpinnings of the axioms? No: it counts as mathematics either way.
Again, the points you are raising have been discussed by logicians as part of the philosophy of mathematics. As usual, things are not as simple as you may think.
> Perhaps there's a gap somewhere in my account of things, but I'm not seeing one so far.
Perhaps that's because you've been thinking about this issue for a little bit, while logicians have been grappling with it for over a century. I may have confused you with the reference to physics. It is not the reason for the problem of foundation, but I used it as an example for a situation where something external to mathematics determines our choice of axioms.
The Wikipedia entry is also a good overview, but as usual, contains a few (sometimes serious) inaccuracies: https://en.wikipedia.org/wiki/Philosophy_of_mathematics
That mathematics explores the theorems derivable from axioms tells us little. Is mathematics about the formal manipulation of the symbolic expression of those axioms, some (nondeterministic) Turing machine chewing up and spitting out strings of symbols? Nearly all mathematicians would say not (including Hilbert's Formalism, which some gravely mistake for saying mathematics is just what I described). But if not, this means that the linguistic symbols mean something, and in analytical philosophy, meaning is a sort of mapping from linguistic terms to something outside the language. But if mathematical formulas mean something -- i.e. refer to something outside themselves -- what is the nature of that thing, and, more importantly, how do we learn about it?
Of course there has! Gödel! As you said, Russell believed that all of mathematics could be deduced from laws of logic that are true in the most absolute sense. Gödel showed that this is impossible, and his discovery ended Russell's project overnight. Russell was fundamentally mistaken about the nature of mathematics. No two ways about it.
I already linked to the Wikipedia discussion of why Brouwer's idea about the law of the excluded middle seems no longer to be taken seriously.
> Gödel was wrongest of them all. He was the most extreme Platonist of them all.
Maybe so, I'm afraid I really don't know his philosophy.
> if mathematical formulas mean something -- i.e. refer to something outside them -- what is the nature of that thing, and, more importantly, how do we learn about it?
But the mathematics derives the same with or without associating real-world meaning. Seems to me that this is enough to demonstrate that the mapping to something beyond the mathematics is something that should be treated as quite separate from the mathematics itself (which is 'merely' axioms + consequences).
Indeed, we see this in action with fields of mathematics that find real-world applications only years after the discovery of the mathematics itself.
Suppose a field of mathematics is found to have several applications. Again, the mathematics is unchanged: the discovered applications are a different beast entirely.
Doubtless I'm not the first to take this line of thought.
> including Hilbert's Formalism, which some gravely mistake for saying mathematics is just what I described
I'm afraid you're ahead of me again here -- that's still on my reading list.
Also, perhaps just a nitpick: nondeterministic Turing machines might differ from deterministic ones in terms of space/time complexity (that's an open problem), but they're just as impotent against non-computable problems. Were you thinking of some kind of probabilistic machine?
I also think you're confusing Russell with Hilbert, and it is not accurate that Gödel destroyed Hilbert's project; rather, he changed its nature (or, more precisely, destroyed the original understanding but not subsequent ones). But Hilbert's idea was actually much more in line with yours -- he also believed that mathematics is about the process rather than some absolute truth, but in order to show that the process leads anywhere, it was important for him to prove consistency. After all, if our mathematics can prove any proposition (including, say, 1=0), then it's rather useless, whether or not it's "valid" in some sense.
> I already linked to the Wikipedia discussion of why Brouwer's idea about the law of the excluded middle, seems no longer to be taken seriously.
I think that neither Brouwer nor that Wikipedia article says what you think they do. :) There's a whole branch of logic, intuitionistic logic, and of mathematics, constructive mathematics, largely based on Brouwer's ideas and his rejection of LEM. They are particularly popular among certain kinds of computer scientists, BTW.
> But the mathematics derives the same with or without associating real-world meaning. Seems to me that this is enough to demonstrate that the mapping to something beyond the mathematics, is something that should be treated as quite separate from the mathematics itself
I'm not talking about real world meaning, nor "beyond mathematics", but beyond the formula (as a string). If the strings refer to anything, what is the nature of the things they refer to?
> Also, perhaps just a nitpick: nondeterministic Turing machines might differ from deterministic ones in terms of space/time complexity (that's an open problem), but they're just as impotent against non-computable problems. Were you thinking of some kind of probabilistic machine?
No, I simply meant that deduction in a formal system (like first-order logic) is a nondeterministic computation, and to emphasize the perspective that math is just a manipulation of symbols without meaning I referred to Turing machines. That they cannot settle non-computable problems is irrelevant, because neither can mathematicians, and whatever mathematicians can do, so can a TM.
I'm not sure that's the whole story. The decline of logical positivism was a process of philosophers discovering fatal flaws with it and discarding it, no?
> I also think you're confusing Russell with Hilbert
Oops! Absolutely right!
> it is not accurate that Gödel destroyed Hilbert's project, just changed its nature (or, rather, destroyed the original understanding but not subsequent ones)
Perhaps you're right - I'm not qualified to discuss things like proof theory I'm afraid.
> if our mathematics can prove any proposition (including, say, 1=0) then it's rather useless, whether or not it's "valid" in some sense.
Sure, provided we can maintain the distinction between a situation where the 'principle of explosion' applies, and situations where we're merely adopting exotic axioms like in non-Euclidean geometry. I imagine this distinction is fairly straightforward in practice: either there's contradiction and inconsistency, or there's not. (Ignoring of course the question of how we could know.)
> I think that neither Brouwer nor that Wikipedia article says what you think they do.
Perhaps so, but doesn't the existence of something like the axiom of choice disprove the law of excluded middle? (At least when it's applied universally.)
> beyond the formula (as a string). If the strings refer to anything, what is the nature of the things they refer to?
Ah, right. Well I'm not arguing against 'notions not notations'. The maths derives the same whether we use conventional symbolic representations, or not.
So sure, there's a correspondence to something beyond the string itself (and not merely in the sense that you can substitute each symbol for an alternative, a la alpha equivalence). Presumably the correspondence is between the formula, and some space of facts.
I see a 'problem of interpretation' here - some amount of 'you have to know what it means', such as with clumsy, ambiguous mathematical notations. I don't know if that's a serious problem though.
> deduction in a formal system (like first-order logic) is a nondeterministic computation
I'm afraid I don't follow. You can perform a derivation as many times as you like, you'll always get the same result, no?
> That they cannot settle non-computable problems is irrelevant, because neither can mathematicians, and whatever mathematicians can do, so can a TM.
Agree -- there's no good reason to assume brains cannot be simulated by computers (which is what that question boils down to, roughly speaking), at least in principle.
A surprising number of people disagree on that point though. Wishful thinking, I suspect. The same pattern arises in discussions of free will.
...and we're back to philosophy :P
This particular bit of indefensibly brain-dead notation never ceases to bother me. https://math.stackexchange.com/a/932916/
Few things in philosophy are definitively discarded (and if they are, they can come back in some revised form); they may fall out of fashion.
> Ignoring of course the question of how we could know.
But that was precisely Hilbert's point. The problem was this: either you take Brouwer's intuitionism, which has no inconsistencies but is inconvenient, or you take classical mathematics because it's useful. But if you do, you need to know that it's consistent, and you can't.
> but doesn't the existence of something like the axiom of choice, disprove the law of excluded middle?
No. LEM is an axiom that you either include (in classical mathematics) or not (in constructive mathematics).
> I don't know if that's a serious problem though.
The problem is not exactly that of interpretation, but a more basic one: if the formulas refer to something, what kind of thing is that thing, and if that thing exists independently of the formulas, how do we know that the formulas tell us the truth about those things? Again, there is no definitive answer to this question, just different philosophies.
> You can perform a derivation as many times as you like, you'll always get the same result, no?
If you start with a set of axioms, you can apply any inference rule in any order to any subset of them. There's an infinity of possible derivations, which together yield all the theorems in the theory. We can look at that as a form of nondeterministic computation (one that nondeterministically chooses an inference).
> A surprising number of people disagree on that point though.
Some philosophical ideas seem increasingly unsustainable in the face of mounting scientific knowledge. Vitalism, for instance.
In a similar vein, I'd say we're seeing an ongoing erosion of dualism in the philosophy of mind, one that is unlikely to recover. Like with vitalism, the more we learn from science, the less magic we need to explain ourselves, be it 'life' or consciousness.
In the case of logical positivism, it was a case of showing the position to be unsustainable on its own terms. That's analogous to disproving a scientific hypothesis - I don't see it coming back.
> either you took Brouwer's intuitionism, which has no inconsistencies but is inconvenient, or you take classical mathematics because it's useful. But if you do, you need to know that it's consistent, but you can't.
Neat. I see I'll have to do my homework.
> LEM is an axiom that you either include (in classical mathematics) or not (in constructive mathematics).
But if you do include it, don't you have to commit arbitrarily to either AC or ¬AC?
> if the formulas refer to something, what kind of thing is that thing, and if that thing exists independently of the formulas, how do we know that the formulas tell us the truth about those things?
Isn't this 'just' a question of knowing that the correspondence is valid?
> There's an infinity of possible derivations, that yield all the theorems in the theory. We can look at that as a form of nondeterministic computation (that nondeterministically chooses an inference).
Right, I'm with you. Infinite graph traversal.
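The "infinite graph traversal" picture can be made concrete with a toy proof system. This is a minimal sketch, not any standard library: formulas are plain strings, a hypothetical `modus_ponens` is the only inference rule, and `derive` saturates the theorem set level by level, standing in for a breadth-first traversal of the (in general infinite) derivation graph.

```python
def derive(axioms, rules, rounds=10):
    """Repeatedly apply every inference rule to every pair of derived
    formulas. A real proof search picks one rule application at a time
    (a nondeterministic choice); enumerating all choices level by level
    visits every theorem the axioms can yield."""
    theorems = set(axioms)
    for _ in range(rounds):
        new = set()
        for rule in rules:
            for a in theorems:
                for b in theorems:
                    t = rule(a, b)
                    if t is not None and t not in theorems:
                        new.add(t)
        if not new:          # fixed point: nothing left to derive
            break
        theorems |= new
    return theorems

# Toy 'logic': the only rule is modus ponens over implications
# written as strings of the form "p->q".
def modus_ponens(a, b):
    return b[len(a) + 2:] if b.startswith(a + "->") else None

print(sorted(derive({"p", "p->q", "q->r"}, [modus_ponens])))
# prints ['p', 'p->q', 'q', 'q->r', 'r']
```

For a real first-order theory the set of derivable theorems is infinite, so the traversal never saturates; the `rounds` cap here just keeps the toy terminating.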
> Gödel did.
Roger Penrose too. Honestly I find his position to be almost risibly weak. It's plain old quantum mysticism and wishful thinking, as far as I can tell. Microtubules are meant to be essentially magical? Seriously? Here he is explaining his position (takes about 10 minutes) https://youtu.be/GEw0ePZUMHA?t=660 and here's the Wikipedia article on his book about it: https://en.wikipedia.org/wiki/The_Emperor%27s_New_Mind
OK, but the philosophy of mathematics is a more slippery beast.
> But if you do include it, don't you have to commit arbitrarily to either AC or ¬AC?
No. You can have a theory that's consistent with either.
> Isn't this 'just' a question of knowing that the correspondence is valid?
Correspondence to what? Are those objects platonic ideals? Are they abstractions of physical reality? You can answer this question in many ways. But whichever way, this question of soundness (formal provability corresponds to semantic truth) rests on us knowing what is true.
Some philosophies of mathematics say that the question is unimportant: it doesn't matter if we get the "truth" part right, all that matters is that mathematics is a useful tool for whatever we choose to apply it to (this doesn't even mean applied mathematics; one valid "use" is intellectual amusement).
I think I see your point. If your theory has no way to express the question Is the AC true? then the LEM is no issue, as there's no 'proposition' to worry about at all.
> Are those objects platonic ideals? Are they abstractions of physical reality?
My intuition is to favour the former, as the latter seems a much stronger claim.
It seems reasonable to say that any mathematical system exists as a platonic ideal, but it's plain that not all mathematical systems abstract physical systems, and we shouldn't be quick to say that we know that any mathematical system does so. That's an empirical question that - true to Popper - can never be positively proved.
That's not what I meant. I meant that choice can be neither proven nor disproven. The philosophical fact that, due to LEM, it must either be true or not is irrelevant, because you cannot rely on it being true or on it being false.
> My intuition is to favour the former, as the latter seems a much stronger claim.
Well, Platonism is a popular philosophy of mathematics among mathematicians, but many strongly reject it.
So it's similar to, but not the same thing as, modifying the LEM to permit Unknowable as a third category (beyond 'the proposition is true' and 'its negation is true'). Instead we maintain that it has a truth value one way or the other, but that it happens to be both unknowable and of no consequence.
LEM doesn't need to be modified, and it's not a third category. In classical logic (with LEM), unlike in constructive/intuitionistic logic (no LEM), truth and provability are simply not the same -- in fact, it is LEM that creates that difference.
> Instead we maintain that it has a truth value one way or the other, but that it happens to be both unknowable and of no consequence.
Precisely! That is the difference between classical and constructive logic. In constructive logic truth and provability are the same, and something is neither true nor false until it has been proven or refuted. But before you conclude that this kind of logic, without LEM, is obviously better (philosophically), you should know that any logic has weirdnesses. What happens without LEM, for example, is that it is not true that every subset of a finite set is necessarily finite. ¯\_(ツ)_/¯
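The split between the two logics can even be seen in a proof assistant. A sketch in Lean 4, using only the core library: excluded middle is available as the axiom `Classical.em` rather than as a rule of the underlying constructive logic, and yet the double negation of LEM is provable with no classical axioms at all.

```lean
-- Classically, LEM is an axiom you opt into:
example (p : Prop) : p ∨ ¬p := Classical.em p

-- Constructively (no classical axioms), LEM itself is unprovable,
-- but its double negation goes through:
example (p : Prop) : ¬¬(p ∨ ¬p) :=
  fun h => h (Or.inr (fun hp => h (Or.inl hp)))
```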
> The approaches using equivalence classes of Cauchy sequences ... suffer from an inability to identify when two “real numbers” are the same
Perhaps there's no computable general method, but it seems to highlight an even bigger problem that remained unspoken: there is no general way to tell when a sequence of numbers approaches zero!
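The difficulty is asymmetric, which a small sketch makes vivid. Assume (hypothetically) each real is given as a function from n to a rational within 1/n of the limit. Then inequality of two such reals is semi-decidable -- wait long enough and the approximations separate -- but equality can never be confirmed in finite time. All names here are made up for illustration:

```python
from fractions import Fraction

def apart(x, y, max_n=1000):
    """Semi-decide inequality of two 'Cauchy reals'. x and y map n to a
    rational within 1/n of their limits. If the limits differ, the
    approximations eventually separate by more than the combined error
    bound 2/n; if they are equal, no finite amount of searching settles it."""
    for n in range(1, max_n):
        if abs(x(n) - y(n)) > Fraction(2, n):
            return True   # witnessed: the limits definitely differ
    return None           # unknown: equal, or just not yet separated

third = lambda n: Fraction(1, 3)
point33 = lambda n: Fraction(33, 100)  # the constant 0.33

print(apart(third, point33))  # True (separation witnessed around n = 601)
print(apart(third, third))    # None, no matter how large max_n is
```

The `None` case is exactly the unspoken problem: a sequence approaching zero and a sequence hovering just above zero look identical for any finite prefix.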
Then I would explain that imaginary numbers mean harmonics, exponential functions give you a derivative, one function can serve as input to another, PID control, ... ... and that's about it.
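Both of those first two claims can be checked numerically with nothing but the standard library. A minimal sketch: Euler's formula ties the imaginary exponent to a harmonic oscillation, and differentiating a complex exponential just multiplies it by i*omega, which is why these functions make frequency-domain analysis tractable.

```python
import cmath
import math

# Euler's formula: e^{i*theta} = cos(theta) + i*sin(theta), i.e. a
# complex exponential *is* a harmonic.
theta = math.pi / 3
assert abs(cmath.exp(1j * theta)
           - complex(math.cos(theta), math.sin(theta))) < 1e-12

# Exponentials are eigenfunctions of d/dt: differentiating e^{i*omega*t}
# just multiplies it by i*omega (checked here by a finite difference).
omega, t, h = 2.0, 0.5, 1e-6
numeric = (cmath.exp(1j * omega * (t + h)) - cmath.exp(1j * omega * t)) / h
analytic = 1j * omega * cmath.exp(1j * omega * t)
assert abs(numeric - analytic) < 1e-4
```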