Division by zero in type theory: a FAQ (xenaproject.wordpress.com)
120 points by EvgeniyZh on July 6, 2020 | 82 comments



This strikes me as having some interesting similarity to the practice in Haskell of placing type class constraints on functions, not data types. A deprecated feature of Haskell used to allow you to write data types like:

    data Ord k => Map k v = ...
This defined a data type representing an ordered map from k to v, with the additional restriction that before you can talk about such a map you had to know that the type k is orderable (has an instance of the Ord type class which defines comparison operators).

This is similar to the definition of (/) requiring that its second argument be nonzero.

This style was supplanted by leaving the data type definition 'bare' and placing the constraints on the API's functions instead:

    data Map k v = ...
    insert :: Ord k => k -> v -> Map k v -> Map k v
    lookup :: Ord k => k -> Map k v -> Maybe v
This style is similar to instead placing the nonzero-ness constraint on the theorems which define the API of (/).

In both cases there are basic engineering reasons for the switch:

- Having the constraints in the definition of terminology (Map, /) doesn't save any work, as you still need to prove those constraints to do anything with, or even refer to, the type (in Haskell this meant sprinkling `Ord k =>` into the type of anything that mentioned Map k v).

- In fact it results in doing more work than necessary, as you are sometimes forced to prove the constraints just to mention Map, even when what you are doing does not require them. For instance defining the empty map 'empty :: Map k v' shouldn't require you to know anything about k, because an empty tree doesn't contain any keys, let alone care about their ordering. In this case requiring Ord k in order to mention Map k v would be superfluous.
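A toy sketch of the second style (this is not the real Data.Map, just a minimal unbalanced search tree; `lookup'` avoids clashing with the Prelude): `empty` compiles with no `Ord k` constraint at all, while `insert` and `lookup'` demand it only because they actually compare keys.

```haskell
-- Toy unbalanced search tree; the data declaration carries no constraints.
data Map k v = Tip | Bin k v (Map k v) (Map k v)

-- No Ord k needed: an empty tree never compares keys.
empty :: Map k v
empty = Tip

-- Ord k appears only on the operations that compare keys.
insert :: Ord k => k -> v -> Map k v -> Map k v
insert k v Tip = Bin k v Tip Tip
insert k v (Bin k' v' l r) = case compare k k' of
  LT -> Bin k' v' (insert k v l) r
  GT -> Bin k' v' l (insert k v r)
  EQ -> Bin k v l r

lookup' :: Ord k => k -> Map k v -> Maybe v
lookup' _ Tip = Nothing
lookup' k (Bin k' v l r) = case compare k k' of
  LT -> lookup' k l
  GT -> lookup' k r
  EQ -> Just v

main :: IO ()
main = print (lookup' (2 :: Int) (insert 2 "b" (insert 1 "a" empty)))
```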


Rust tends to take the second approach as well; it tends to place trait bounds on `impl` blocks, and not on the struct itself.

(This is by no means a standard across the community though; I'd say it's a 50/50 split)


A case where it is unavoidable is when you want to refer to an associated type in your struct declaration.

In most other cases, putting bounds on impls leads to less repetition of those bounds and better error messages!

Another related topic is trait methods with method-level type parameters that have bounds. Those have to be repeated on every usage site so it is usually preferable to go for a design that has the type parameters in the actual trait.

Instead of:

    trait Foo {
        fn foo<T: Bar>(&self, bar: T);
    }
do this instead:

    trait Foo<T> {
        fn foo(&self, t: T);
    }

    impl<T: Bar> Foo<T> for XYZ {
        fn foo(&self, bar: T) {

        }
    }
One bound might seem fine in this example, but once things need to be Send, Sync and 'static it gets annoying.


You can still do

    data TC x where
      DC :: Ord x => x -> TC x


Yes, but that doesn't mean quite the same thing. With the constraint on the type (data Ord x => TC x = DC x) you can be sure that `x` implements `Ord` for any type `TC x`. In your GADT, however, the constraint only applies to the DC constructor, so `x` is only guaranteed to implement `Ord` when you actually have a value inside a `DC` constructor. If you don't match on the constructor then there might not be an `Ord` instance:

    -- `Ord x` exists when matching on DC constructor
    ex1 :: TC x -> x -> x -> Ordering
    ex1 (DC _) a b = compare a b

    -- Without constructor match, no Ord instance
    ex2 :: TC x -> x -> x -> Ordering
    ex2 _ a b = compare a b  -- ERROR!

    -- This is why the above ex2 definition is an error
    badUser :: x -> x -> Ordering
    badUser a b = ex2 undefined a b

    -- No `TC x` value also means no Ord instance even
    -- though no arguments are ignored or undefined
    data Proxy x = Proxy
    ex3 :: Proxy (TC x) -> x -> x -> Ordering
    ex3 Proxy a b = compare a b  -- ERROR!


This sounds like the idea that it's fine for C++ templates to not be typechecked, because the expanded template will eventually be checked by the compiler. Perhaps, but it makes for errors a long way away from their causes, which are difficult to debug.

If I ever divide by zero (or by something that might be zero) in my proof, I've almost certainly made a mistake. I'd rather hear about it straight away than later when I try to use some fact about the result of that division.


For completeness: C++ eventually introduced a feature to allow something analogous to type-checking for template metaprogramming, called concepts - https://en.wikipedia.org/wiki/Concepts_(C%2B%2B)


> … it makes for errors a long way away from their causes, which are difficult to debug.

It also means that any type errors that don't affect the expanded template (but may still have an effect on semantics) will not be caught at all.


Using Scott encoding in the Lambda Calculus it is actually possible to define a 'usable' infinite natural number as a recursive construction (essentially its predecessor is defined as itself).

Interestingly enough, constructions for add, (natural) subtract and multiply work as you would expect with this infinite number without needing special handling. E.g. for all n, n + inf == inf, n - inf == 0, n * inf == inf.

Furthermore, the construction for (natural) div yields this infinite term (or at least something that behaves exactly like it) when the denominator is 0. Even more interestingly, 0/0 == inf for this system.

EDIT: Why might this be useful? Well now when you 'take n' from a list, n=inf will give you the whole list, regardless of how long (or even if it is infinite - i.e. lazy). Likewise 'drop n' will return the empty list - except with an infinite list, whereupon the calculation will fail to terminate.

EDIT: inf - inf == non-termination in this system, suggesting that not only inf - inf != 0, but it != inf either, and in fact it is something unreachable (in this system) (note the correspondence between 'subtract' and 'drop').


Mathlib does indeed define the extended real numbers, but they also need to define division on the reals. This is similar to the definition of subtraction having to be done for integers and for naturals (mentioned in the article)


I presume though that Mathlib does not define the extended naturals? If they did then 1:N / 0:N would be valued at this infinity.

What I found interesting about the Scott encoding was that natural infinity was trivial to define, and arises directly from a straightforward construction of the div function (for naturals).

The encoding is as follows (\x indicates lambda abstraction for identifier x. Recursion is assumed, of course it could be made explicit with the Y combinator):

  zero := \f \x x
  succ := \n \f \x f n

  infinite := \n succ (infinite n)
  infinity := infinite zero

  subn := \n \m m (\p n (\q subn q p) n) n

  divn := \n \m (subn m n) (\p zero) (succ (divn (subn n m) m))

  divn (succ zero) zero => infinity
EDIT: Note that infinity does not have a normal form (does not terminate), but:

  subn n infinity => zero
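
The construction transcribes almost directly into Haskell (a sketch; the rank-2 newtype and the names `Nat`, `suc`, `subn`, `toInt` are mine, not from the encoding above). Laziness lets `subn n infinity` terminate with zero even though `infinity` itself has no normal form:

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Scott numeral is its own case analysis: given a zero-branch and a
-- successor-branch (which receives the predecessor), it picks one.
newtype Nat = Nat (forall r. r -> (Nat -> r) -> r)

caseNat :: Nat -> r -> (Nat -> r) -> r
caseNat (Nat f) = f

zero :: Nat
zero = Nat (\z _ -> z)

suc :: Nat -> Nat
suc n = Nat (\_ s -> s n)

-- The infinite numeral: its predecessor is itself.
infinity :: Nat
infinity = suc infinity

-- Truncated subtraction, same structure as the subn term above:
-- peel m; when m runs out, return n; when n runs out, return zero.
subn :: Nat -> Nat -> Nat
subn n m = caseNat m n (\p -> caseNat n zero (\q -> subn q p))

toInt :: Nat -> Int
toInt n = caseNat n 0 (\p -> 1 + toInt p)

main :: IO ()
main = print (toInt (subn (suc (suc zero)) infinity))  -- 2 - inf == 0
```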


The extended reals can also arise directly from the definition of reals either as Cauchy sequences of rationals (as unbounded increasing sequences, etc) or Dedekind cuts (let the lower cut contain all rationals) :) for reals IIRC the main reason to exclude the extended reals is to make it a field.

However I don't think any of this matters for the point the article made. The problem is not that 1:N/0:N has "nothing sensible" to return, the problem is that if you define it in the way most mathematicians do, a lot of your proofs will be littered with (inline) lemmas showing that the result of such-and-such a division is a finite natural / finite real. This is analogous to the problem brought up in the article of having proofs littered with lemmas showing that the result of such-and-such a square root was real.


In fact, there is also a type `enat` in mathlib. However, having `x/y` be a term of a type that is not the type of `x` and `y` comes with its own set of problems. It doesn't compose as smoothly as homogeneous division.


> n * inf == inf

I would expect this to be 0 actually.


Why would you want the product of two non-zero values to equal zero?

Among other things, this would have the unpleasant effect of making expressions like "0÷1" undefined, because the result could equal either zero or infinity.


Apologies, I meant 0 * inf = 0. Since n can be 0 in n * inf, the result should be able to be 0.


You are correct - it should indeed be zero (and * can be constructed accordingly)


I think this behavior makes perfect sense when you view it through the lens of type theory. Functions that return a real number must always return a real number. Some function types are defined by something like ‘R, error’, and then it’s up to the caller to check ‘error’. Others throw exceptions that bubble up the call stack and may never return ‘R’, which in a sense breaks the type contract and has “exception” theory preceding type theory.


(disclaimer: I'm the author of the blog post). The argument the other way is that if a mathematician sees this convention, their reaction is likely to be "that's just silly, 1/0 is obviously 'halt and catch fire'", and any attempt to defend this by saying "it makes perfect sense viewed through the lens of type theory" runs the risk of the response "well I won't be doing my mathematics in type theory then". In the comments to the blog post (and in personal emails) people have suggested that instead of trying to defend the idea that 1/0=0 (which is what I was trying to do, given that I am now very much used to it) I should instead be complaining to the Lean designers to make the front end "work more like a mathematician expects". However this is difficult, because talking to mathematicians it becomes clear that they have different opinions about what "actually happens" when you put garbage in.


> if a mathematician sees this convention, their reaction is likely to be "that's just silly, 1/0 is obviously 'halt and catch fire'"

mathematician here. The expression 1/0 is not "obviously" halt and catch fire. Extended real numbers (either projectively or affinely) are a well-known thing. If you see the expression 1/0 or 1/f(x) where f(x) may take the 0 value you do not halt and catch fire; you assume that the operations are taking place in a place where they make sense. A very common usage is when f(x)=0 for a zero-measure set of x, and then 1/f(x) has dirac masses at these points (weighted by the absolute value of the derivative of f).

On the other hand, I agree that "type theory" would sound like a ridiculously unnecessary abstraction to most of us.


What would “halt and catch fire” look like for a proof assistant? If I’m trying to prove a theorem that performs division over the reals, would it then have to mechanically prove that there cannot exist any inputs that would result in division by zero? Isn’t that then itself a theorem that I’d have to explicitly define using tactics?


Well, if you're doing a proof and (for example) you want to cancel x in the numerator of a fraction with x in the denominator, you then apply the constraint x ≠ 0 to every subsequent step of the proof.

(You may then do a separate proof in the case that x = 0, if you want to prove something for all x including 0.)
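
In Lean/mathlib terms this is exactly the shape of the cancellation lemmas (a sketch; lemma names vary between mathlib versions): the side condition is an explicit hypothesis of the theorem.

```lean
-- Cancelling x on the right requires the side condition x ≠ 0.
example (a b x : ℝ) (hx : x ≠ 0) (h : a * x = b * x) : a = b :=
  mul_right_cancel₀ hx h
```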


That's actually needed for a real proof. One should keep in mind that y = ax is not injective if a=0, so that "cancel a variable" step you're thinking of is most likely incorrect without that side-condition.


...yes?


Except that's a false dichotomy. Sqrt() could instead return Enum { Real | Complex } and the typing enforces that the surrounding math handles the appropriate cases (or proves that negative input is not allowed). Likewise for division by zero, etc.


Sure, but imagine that every time you divide, the result may be optional. At least in a general purpose language that would clutter the code with optional handling even in cases where you know (but the compiler doesn't) that the value absolutely can't be zero.

Relatedly, there is value in non-local error handling (i.e. unchecked exceptions); catching logical errors at a local level makes little sense.


There's a couple of points to be made to that.

First, we should be extending our hardware numerics to support the extended real number line, inclusive of infinity, in a way which causes a lot of these exceptions to disappear. Division by zero should result in an inf, not an exception or NaN.

Second, those exceptions which can't be eliminated DO need to be handled anyway. To say "but then we'd have to handle a bunch of exceptional cases everywhere!" is exactly the point. They do need to be handled.

Finally, language and compiler improvements can make handling exceptional cases easier. E.g. let numerical methods specify domain and range requirements and have these be compiler-enforced. Then the implementation is freed from handling exceptional conditions that only arise with inputs outside of its declared domain.


I feel you're missing the point:

- Division by zero resulting in Inf could be fine in some cases, but might also lead to issues in other situations. The "calculate a slope through two separate points" is a good example: the slope through a single point is either undefined or the derivative, but it's not infinity. In any case, division by zero is sufficiently "weird" or "corner case-y" that you'd want to pay special attention to it. And if the runtime blows up in your face and tells you something is wrong, that can in many cases be better than to continue with wrong values (compile-time checks are always better, but not always feasible).

- First, it's not true that all exceptions need to be handled. This heavily depends on the use case. If you're an app developer, then the app crashing might be a better (!) alternative than e.g. corrupting data, if it happens rarely enough. After all, the user can just restart the app. Even if you write a server side app, you may get away with crashing, as long as you have a supervisor or so restarting unhealthy processes/instances. I'm not saying crashes are good behaviour, but in some cases they are better behaviour than some of the alternatives (and in any case, no code is ever crash-proof, you can always have OOM, stack overflow, etc.). This is the concept of fault tolerance: you might not know which bugs happen, but you want to be able to recover from them somehow. In fact, if I'm not mistaken, Erlang basically takes this philosophy to an extreme.

- Even if you handle exceptions, you don't necessarily want to handle them locally. This is why many languages have unchecked exceptions. You're free to declutter a huge chunk of your application of error handling that would be extremely tedious, and just handle the (rare) exception at the top-level, or whatever intermediate layer best knows how to deal with it. Sometimes you just want to assert something about the state of your program that the compiler doesn't know and throwing an unchecked exception in case the condition is violated and only handling it at the topmost layer of your application is a perfectly reasonable decision.


I think we are talking at cross purposes now? This is a theorem proving language the OP is talking about. Different objectives and needs.


Yes, maybe we are.

The original article was about a theorem prover, but I was under the impression that the discussion had migrated to a more general discussion about division by zero in programming languages. Apologies if that wasn't intended.


Well, if you ask the mathematicians then f(x) = 1/x isn't defined on the real numbers, it's defined on the non-zero real numbers. It always returns a real number but its domain has a different type.

This places the burden on the person using the function to prove that x isn't 0, but that's the price you have to pay to have a proper multiplicative inverse.


Making the domain of f(x)=1/x be the non-zero real numbers allows f to be continuous. If f(0)=0 then continuity is lost.

Alternatively, defining f(0) to be infinity (a projectively extended real number) also allows f to be continuous.

Another possibility is to define f(0) to be some kind of "bottom" value. See the following tweets by Andrej Bauer: https://twitter.com/andrejbauer/status/1268860981943439361

Continuity is important because without it we lose computability. Every computable function is continuous (in an appropriate sense).


f(x) = 1/x is also defined over what we call the "extended reals" which is just R + inf.

But, I don't think what you said is quite right, either. If you define f(x) = 1/x over the non-zero reals, then the range does have the same type as the domain: you can't get zero out of the function, so the range is also the non-zero reals.


> f(x) = 1/x is also defined over what we call the "extended reals" which is just R + inf.

Not quite. The extended reals are ℝ + {inf, -inf}. This has the problem that 1/x approaches inf from the right and -inf from the left.

As ogogmad notes, in order to define a value for 1/x at 0, you need the projectively extended reals.


> Not quite. The extended reals are (...)

You are both right. There are several, different, extensions of the reals. The "projective" extension turns the real line into a (topological) circumference by adding a single point at infinity. The "affine" extension turns it into a closed segment by adding two infinities, one at each end; this corresponds to the convention used by ieee floating point numbers. There are many more extensions, like complex numbers, dual numbers, p-adic numbers, and in many of them (except complex numbers) you have various kinds of divide-by-zero fun.


Extending the real numbers with 1 vs. 2 points at infinity both have applications where they model real situations.

For example, if we take a mercator projection of the globe onto a 2-infinite-ended cylinder, the latitude coordinate could meaningfully be either +∞ or –∞. If we take the stereographic projection of the globe onto the plane, every pair of coordinates with one infinite value corresponds to the same point on the sphere.


The codomain doesn't have to equal the image. The image is the non-zero reals, but the codomain could be any superset of it (including all of R). The word "range" is ambiguous.


Thank you. I've forgotten some of the terminology, since I haven't thought about real math in a few years. :(


There's no real problem with defining the codomain to be too big. Although you can indeed also pick the non-zero reals as the co-domain.


I don't understand what you mean. You still have the different obvious ways of formulating, e.g. division in type theory:

1. The article's way: "R^2->R", but garbage value for 0 in the second argument.

2. What you propose: "R^2->Maybe R"

3. The mathematician's dependent-type-theoretic way "R * (x:R, proof that x!=0) -> R",

4. the conventional way "R * (R\{0}) -> R".

They have their advantages and disadvantages. The first is useful in proof verification since it simplifies most proofs, from which perspective this garbage value of 0 is actually well-behaved, and doesn't need to be handled as a special case. But you allow seemingly nonsense theorems when you forget to condition on (x ≠ 0).

What you propose is similar, except that you are forced to reason about whether the output value is valid.

3 and 4 are tedious, since you always need to prove that the inputs are nonzero. 4 is more tedious since you have merely shifted the problem of handling the zero case into proving properties about the canonical embedding "R->R\{0}" (which must still have a garbage value for 0). On the upside, these functions surject onto R.


By the way, 3 and 4 are pretty much the same in Lean, assuming you define "R\{0}" using complementation rather than somehow defining the nonzero reals from scratch.


> But you allow seemingly nonsense theorems when you forget to condition on (x ≠ 0).

Such as what? I’ve only spent a few hours with lean, but I’m fairly sure that any attempt to prove something silly would fail because it would catch the weird behavior of 0.

For example. If you tried to use a/a = 1 as the article mentions, you’d be unable to prove it without adding the condition that a != 0, and you’d be unable to use it further on without that condition in place.
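
Indeed, in Lean/mathlib the `a / a = 1` lemma carries that hypothesis explicitly (a sketch; lemma names may differ across mathlib versions):

```lean
example (a : ℝ) (h : a ≠ 0) : a / a = 1 := div_self h

-- Without the hypothesis there is no way to close the goal:
-- example (a : ℝ) : a / a = 1 := ...  -- not provable
```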


Here is a "nonsense" theorem that is provable in Lean:

“There exists a real number r such that 1/r = 0.”

If you try to translate this theorem into maths, you will run into trouble at some point. At which point exactly depends on how you want to make things precise... which is exactly what mathematicians avoid when talking about division in fields.
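
Concretely, the witness is r = 0 together with the `x / 0 = 0` lemma (a sketch against mathlib; exact lemma names may vary):

```lean
-- "There exists a real number r such that 1/r = 0."
example : ∃ r : ℝ, 1 / r = 0 := ⟨0, div_zero 1⟩
```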


That’s a perfectly sensible function, it’s just not ordinary division on the reals where x = 0, and you can’t use it that way.


I agree that they all have their advantages and disadvantages. My point was agreeing with the author that a proof assistant returning 0 for n/0 isn’t nonsensical. It’s debatable whether it is the best approach, but it is sound. The reasoning is clear if you consider R to be an inductive type, which many proof assistants including Lean do. Returning anything but an element of an inductive type would be wrong.


> 3 and 4 are tedious, since you always need to prove that the inputs are nonzero.

Or assume it. This is what you have to do in real life anyway. If you can't assume the inputs are nonzero you have to prove it. Otherwise the proof doesn't work.


As the article mentions, it doesn't necessarily work this way in a verified proof. Every time you form an expression involving division, you have to supply the proof. It is this repeated supplying of a proof (not necessarily having to reprove it), which is omitted in regular informal proofs, that is tedious. In certain situations you can write automation to help you with that. But more frequently, you already have automation for other aspects of your proof, and this need to supply proofs trips up your other automation.


What you are saying might make sense in regular programming languages like go/rust/etc but it does not make sense in type theory. Rather the reason that n / 0 = 0 makes sense in type theory is that you can simply define types such as

    (n : Nat) -> (m : Nat) -> Nonzero m -> n / m * m = n
as in you do not need to make your laws hold true for every possible input of / which is why it does not break regular mathematics.
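
This is also how mathlib actually states its division lemmas: the hypothesis sits on the theorem, not on `/` itself (a sketch; lemma names and argument orders vary across mathlib versions):

```lean
example (n m : ℝ) (h : m ≠ 0) : n / m * m = n :=
  div_mul_cancel₀ n h
```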


With humility I will say that I don't like this solution.

In mathematics functions have domains and codomains. Two functions are equal if their domains are equal, their codomains are equal, and for all the values in the domain they return the same values in the codomain.

A mathematician who doesn't know anything about type theory, but hears Lean is good at formalizing math, will expect the division function to have the domain R x (R\{0}). Modifying the domain to be R x R and returning a garbage value for the added points is not a good solution. Maybe it works "in real life", but you always need to carry with you an invisible asterisk and footnote that Lean did the right thing only assuming you asked the right question.

Again, being humble (my Lean experience is about one hour in total, spread across several sessions), I think it would be better to invest a bit more time upfront and always define properly the domain of the function used, rather than introduce edge cases that can bite you in the back twenty five theorems down the road.


The title is a bit sneaky. Division by zero in "type theory" will depend on just that -- your type theory. Some theories use special values for division by zero (e.g. could spit out DNE or +INF/-INF). But then you need to formalize how you do "things" with DNE or infinities. This is, more often than not, an unnecessary rabbit hole, so just returning 0 is simpler. Some theories might require a proof that the denominator isn't zero (essentially making division a three-argument function) -- which is kind of a cool way of thinking about it[1].

[1] https://math.stackexchange.com/questions/334393/how-does-typ...


I don’t think the “in type theory” restriction is correct.

This choice works fine when using type theory in theorem provers, but would be unacceptable if we did it in programming languages that use type theory (or would that be called “type practice” ;-)?)

In Haskell, for example, 1/0 returns Infinity.


That's not exactly right... in Haskell, 1/0 has type Fractional a => a. So, it could be any fractional type. If 1/0 occurs in a context where it'll be interpreted as a floating-point number, you'll get Infinity, yes, as per how standard floating-point works. In other contexts though you may get an error. E.g., if you import Data.Ratio and do 1/0 :: Rational, you'll get an error.


The type signature of division in Haskell merely requires that it takes two fractional `a`s and returns another `a`.

  1 / 0 :: Double
evaluates to Infinity.

  1 / 0 :: Rational
throws a runtime exception. A division by zero producing zero would be perfectly sensible in Haskell, and perhaps even desired if you are totally averse to runtime errors. The Pony language infamously does this.
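
A quick check of both behaviours; in GHC the `Rational` failure surfaces as an `ArithException`, so it can be caught:

```haskell
import Control.Exception (ArithException, evaluate, try)

main :: IO ()
main = do
  -- At Double, IEEE 754 semantics apply.
  print (1 / 0 :: Double)
  -- Forcing 1/0 at Rational raises RatioZeroDenominator instead.
  r <- try (evaluate (1 / 0 :: Rational)) :: IO (Either ArithException Rational)
  print r
```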


In academia, "type theory" almost always means dependent type theory, such as Martin-Löf type theory or homotopy type theory. Type theory is used in most theorem provers as a foundation for mathematics. The title of this blog post is using the term in a standard way.

Programming languages such as Haskell and Java are not based on type theory. They have type systems. Haskell's type system is much closer to type theory than that of Java, but it's still quite crippled compared to what type theorists study.

It is of course possible to define things in type theory such that 1/0 = Infinity or something else, but that is not the norm.


I ran into this argument before.

I can understand that it makes sense for proof assistants. As the post shows, proof assistants don't allow you to misuse that anywhere else.

(But as a sidenote: couldn't you fix this with dependent types somehow?)

I still think it would be wrong for a regular programming language. Afaik, the Pony language does this too, but that is a general-purpose programming language that does not check your logic, and I am very wary of something like that. People might type up some algorithms, or even just try to compute an average, and the system absolutely should yell at you if you try to divide by zero.


Pony now includes both partial division (which raises an error when dividing by zero) and checked division (which returns a tuple of both a result and a boolean which indicates whether an error occurred): https://tutorial.ponylang.io/expressions/arithmetic.html#par...

For what it's worth, I'm willing to argue that Pony is making a strongly motivated choice by making x/0 return 0 by default with respect to their specific error-handling model. In general, I don't think it's as much of a problem as you might imagine. The average function is a perfect example: if I naïvely implement it in such a language, then it would mean… that average([]) returns 0. That doesn't seem terribly bad to me.
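
A sketch of that naïve average under a Pony-style total division (`safeDiv` is a hypothetical name; the x/0 = 0 convention is the point):

```haskell
-- Total division with the x/0 = 0 convention, as Pony does by default.
safeDiv :: Double -> Double -> Double
safeDiv _ 0 = 0
safeDiv x y = x / y

average :: [Double] -> Double
average xs = sum xs `safeDiv` fromIntegral (length xs)

main :: IO ()
main = do
  print (average [1, 2, 3])  -- 2.0
  print (average [])         -- 0.0, not a NaN or a crash
```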


I don't think that the average of an empty list being zero is semantically correct, but I can accept that it may not be so bad in practice (e.g. if it shows up like that in a UI and you display the number of elements anyway).

But I still think in other situations I'd want to be yelled at for trying to divide by zero. E.g. imagine I'm trying to compute the slope of a secant between two points. If both points are the same, I'd like the compiler to yell at me (the problem either makes no sense here, indicating a problem somewhere else in my code, or that I should instead compute the derivative) - in most cases, 0 would be the wrong answer. I think the issue here is that with division by zero you lose continuity (unless you use +Inf) which can contradict intuition.

I'm not saying it doesn't work for Pony, maybe it does, but I don't think I would feel comfortable with that behaviour.

Silent wrapping around on overflow is arguably even worse, that I could definitely see leading to logic errors.


The compiler is unlikely to yell at you, but the runtime normally will. Runtime errors are not usually very helpful; you are better off testing every division yourself (the Pony checked division model). Note that most non-Intel CPUs don't have runtime exceptions for division by zero: Arm returns 0 and AFAIK RISC-V returns -1, but I might have misremembered. Intel is unusual in having an exception.


Yes, you're obviously correct, the compiler won't yell at me. I wasn't thinking clearly when I wrote that. I did mean the runtime. :)

But I disagree that runtime errors aren't helpful. They are very helpful to surface logic errors (if you have proper logging). If you catch a semantic error as early as possible, you have a better chance of making sure it doesn't propagate somewhere else, and it will be easier to debug the eventual issues. This is why I generally like assertions.


In practice I guess it's quite complicated. Even in a safety-critical environment where you don't want any errors, halting on an exception often seems like not the best idea. For example in an airplane, a halting flight controller could be worse than some incorrect values. I generally agree that raising exceptions and logging them is the way to go if your budget allows, though the default behavior (crash, give 0, a large positive number, w/e) seems less clear (as often when choosing between lesser harms).

Another example: if your system is autonomous and re-initializing it will likely lead to the same "dirty" state (i.e. the division by zero), then recovery would not be possible after a crash. Could be a simple game, or perhaps a Mars rover for instance. In that case 1/0 = 0 is a solid choice.


Halting on exceptions of course means that you need to have some fault recovery mechanism, there are many patterns for that.

I'm not saying that 1/0 should just lead to a hard crash. Ideally you can identify subsystems and somehow handle unexpected errors at the boundaries and use some fallback behaviour.

But I agree that it's not easy and very dependent on the specific situation. I wrote a whole whitepaper about error handling and error recovery about the last system I was working on, because it's such a complex topic. It's just that "you should never raise runtime errors" or "all errors need to be handled at the level they are raised" are answers that sound good, but are impractical in certain situations. And as said, you need to deal with potential crashes anyway due to OOM, etc.

The Mars rover thing is completely different, because the engineering standards at NASA are so insanely high as to exactly prevent dumb errors like that. They can develop like that, but most other companies can't even begin to afford such a process heavy development and need to accept that programming errors are going to happen and there needs to be some recovery mechanism.


You could also argue that 0^0=1 is also not semantically correct, but it is the most widespread convention among mathematicians.


It IS semantically correct. The number of functions from a set of 0 elements to a set of 0 elements is exactly 1. No parallel argument exists for division by zero. The two situations are not analogous at all.
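
For what it's worth, programming languages overwhelmingly adopt the same convention, both for repeated-multiplication exponentiation and for IEEE 754 `pow`; in Haskell:

```haskell
main :: IO ()
main = do
  print (0 ^ 0 :: Integer)  -- 1: the empty product
  print (0 ** 0 :: Double)  -- 1.0: IEEE 754 defines pow(0, 0) = 1
```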


Mathematics is full of cases where the same notation means different things. You’ve kind of dodged the question of what 0^0 is by talking about set theory, because it is (in some sense) arbitrary that X^Y means Y -> X or its size (cardinal exponentiation), and in analysis may mean exp(y log x). We accept that for natural X,Y except (0,0) these must agree but that does not imply that they should agree at (0,0), because the definitions are simply not the same.

If you are serious about math, which it seems you are, I would be a bit more careful about “transporting” notation from one field from another and arguing about correctness.

The usual way to transport a definition from a discrete domain to a continuous one is a technique called analytic continuation. I am curious if there is an analytic continuation of the discrete X^Y which contains 0^0=1, and what that would look like, but at that point you’re definitely not talking about exponentiation any more.

See: https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero

See: https://math.stackexchange.com/questions/11150/zero-to-the-z...
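The disagreement between the combinatorial and the analytic reading can be seen concretely. A quick Python sketch (the name `card_exp` is just illustrative):

```python
import math

def card_exp(x, y):
    # Cardinal/combinatorial exponentiation: the number of functions
    # from a y-element set to an x-element set. Python's integer **
    # follows this convention, so 0 ** 0 == 1.
    return x ** y

print(card_exp(0, 0))  # 1 -- the empty function is the unique map {} -> {}

# The analytic definition exp(y * log x) simply has no value at x = 0:
try:
    math.exp(0 * math.log(0))
except ValueError:
    print("exp(y log x) is undefined at (0, 0)")
```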


> it is (in some sense) arbitrary that X^Y means Y -> X or its size (cardinal exponentiation)

Are you referring to the notation? I don’t think notation is what’s being contested.

> in analysis may mean exp(y log x)

Doesn’t work with a base of 0.

> We accept that for natural X,Y except (0,0)

Why “except”?

> that does not imply that they should agree at (0,0), because the definitions are simply not the same.

The definition of exponentiation of reals typically starts with exponentiation of naturals as a given (see Baby Rudin).

> The usual way to transport a definition from a discrete domain to a continuous one is a technique called analytic continuation. I am curious if there is an analytic continuation of the discrete X^Y which contains 0^0=1

Analytic continuation refers to something else, but I get what you’re trying to say.

The answer is simple: Real exponentiation is an extension of natural exponentiation. Hence, it should have the same value as the latter wherever the latter is defined.

Yes, I’ve read that SE question before. It’s good that you brought it up. I recommend reading the comments under the accepted answer and the other answers as well.


Mathematicians not working in type theory generally agree that 1/0 is undefined (or if anything, they would set it to infinity, but even saying that would probably be too cavalier for most mathematicians). They do not necessarily agree for 0^0. There are good algebraic and combinatorial reasons for 0^0=1, while analysts will complain about non-continuity.


I think that’s missing the point. What I’m saying is “this is a convention, and because it is a convention, it may be reasonable to follow a different convention at times”. Different conventions are convenient for mathematicians working in different fields. Whether a convention is “semantically correct” is not a question that even makes sense, because when you define something, your definition is true by definition. The only real question here is whether you can reasonably expect people to understand you if you use different definitions.


This is a view of mathematics and programming that just completely ignores intuition. Of course, what you say is technically correct, yet mathematicians debate definitions all the time, precisely because it matters to them to get them "right" (of course, there's rarely one true answer).

This is even more true in programming. Yes, you can also learn all the type conversion rules of JavaScript, but many people agree that they are insane because they violate the principle of least surprise. Expectations matter, especially when you're fixing a bug at 3am.


Intuition is mostly an internalized notion of convention, and I don’t think that we can appeal to intuition to resolve much.

For students learning division, the fact that 1/0 is undefined is not an intuitive result and must be learned. This is their first time encountering something which is “not defined”.


We'll just have to disagree about this. I think intuition is incredibly important both in mathematics and in programming.


We agree that intuition is important, I think we disagree about where it comes from. Intuition comes from experience. A programming language is “intuitive” because it conforms to your previous experiences with programming languages.


To add to this, while you might argue that average([]) is undefined, it is no less defined than 0^0... that is, you can choose to define it by convention. For various reasons, most mathematicians have agreed that 0^0=1. This is not some obvious mathematical fact, but a choice of convention. For the same reason, we can choose that average([]) is 0.
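A sketch of that choice in Python (the `default` argument here is a convention we pick, not anything canonical):

```python
def average(xs, default=0.0):
    # average([]) is a matter of convention, like 0^0 = 1;
    # here we pick a default value instead of raising ZeroDivisionError.
    return sum(xs) / len(xs) if xs else default

print(average([2, 4, 6]))  # 4.0
print(average([]))         # 0.0
```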


Returning 0 for average([]) is useful. Returning a slope of 0 for a vertical linear regression is suspect.

Any choice of behavior for x/0 will be wrong in some context, and the question is really about how much help you want from the computer (both from the type system in trying to prevent/model division by 0 and at runtime via errors, via choosing a value for the division, etc).
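For illustration, languages already disagree on this: Lean's reals give 0, IEEE 754 floats give ±inf or NaN, and Python raises, even for floats. A minimal Python sketch:

```python
def div_behaviour(a, b):
    # Python raises for every numeric division by zero, where
    # IEEE 754 hardware floats would return inf/nan and Lean returns 0.
    try:
        return a / b
    except ZeroDivisionError:
        return "raises ZeroDivisionError"

print(div_behaviour(1, 0))      # raises ZeroDivisionError
print(div_behaviour(1.0, 0.0))  # raises ZeroDivisionError, even for floats
```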


yeah which is precisely what I'm saying (that last part).


Elm does this for integers (Floats support Infinity, natch) and I support it. Elm is meant for writing declarative UIs, and the only example I could think of for blindly dividing by 0 is taking the average of an empty list without checking first. But any time you might do that, you'd be better served by matching on the empty list _before_ trying to take the average and rendering something helpful for the user.


> couldn't you fix this with dependent types somehow?

quoting from the post:

> the way I had set things up, every time I used the [sqrt] symbol I had to supply a proof that what I was taking the square root of was non-negative.

"Supplying a proof" is how you do it in dependently-typed systems, afaik. The author tried it, and it was inconvenient.
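A rough Lean-style sketch of what "supplying a proof" at each use site looks like (the name `sqrtChecked` is illustrative, not the actual mathlib API):

```lean
-- A hypothetical checked square root that demands a nonnegativity proof.
-- (mathlib's actual Real.sqrt is total and sends negative inputs to 0,
-- which is exactly the design the post defends.)
noncomputable def sqrtChecked (x : ℝ) (h : 0 ≤ x) : ℝ := Real.sqrt x

-- Every use site now owes a proof term:
-- example : ℝ := sqrtChecked 2 (by norm_num)
```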


Well unless the system can somehow infer it for you. But maybe this is too hard a problem for now?


It's impossible to infer in general, and any dependent type system needs to be designed around that fact.


> (But as a sidenote: couldn't you fix this with dependent types somehow?)

Most proof assistants, including Lean, are built on dependent type theory.


Sure but then I guess maybe you can find ergonomic improvements to your dependent type system so as not to run into the problems discussed in the article (e.g. having to reprove that something is not zero over and over again).


Maybe the way to do that would be to prove once that x is nonzero, use that to define r = 1/x, and thereafter use multiplication by r?
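In Lean-ish terms, that suggestion might look like this (illustrative names; only the lemma `mul_inv_cancel₀` is real mathlib, as far as I recall):

```lean
-- Discharge x ≠ 0 once, name the reciprocal, then multiply instead of divide.
noncomputable def recipOf (x : ℝ) (hx : x ≠ 0) : ℝ := x⁻¹

-- Thereafter write `a * recipOf x hx` wherever `a / x` was intended;
-- lemmas like `mul_inv_cancel₀ hx : x * x⁻¹ = 1` do the rest.
```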


Obviously a system can define a symbol like "/" as whatever it wants to. But then it isn't what many mathematicians would expect.

The Metamath Proof Explorer defines "/" so it only has a value if the denominator is a nonzero complex number: http://us.metamath.org/mpeuni/df-div.html

You can argue if that is better, but that is a possible choice.



