
Division by zero in type theory: a FAQ - EvgeniyZh
https://xenaproject.wordpress.com/2020/07/05/division-by-zero-in-type-theory-a-faq/
======
nshepperd
This strikes me as having some interesting similarity to the practice in
Haskell of placing type class constraints on functions rather than data types.
A deprecated feature of Haskell (datatype contexts) allowed you to write data
types like:

    
    
        data Ord k => Map k v = ...
    

This defined a data type representing an ordered map from k to v, with the
additional restriction that before you can talk about such a map you had to
know that the type k is orderable (has an instance of the Ord type class which
defines comparison operators).

This may be considered similar to a definition of (/) that requires its
second argument to be nonzero.

This style was supplanted by leaving the data type definition 'bare' and
placing the constraints on the API's functions instead:

    
    
        data Map k v = ...
        insert :: Ord k => k -> v -> Map k v -> Map k v
        lookup :: Ord k => k -> Map k v -> Maybe v
    

This style is similar to instead placing the nonzero-ness constraint on the
theorems which define the API of (/).

In both cases there are basic engineering reasons for the switch:

- Having the constraints in the definition of terminology (Map, /) doesn't
save any work, as you still need to prove those constraints to do anything
with it or even refer to it (in Haskell this meant sprinkling `Ord k =>` into
the type of anything that referred to Map k v).

- In fact it results in doing _more_ work than necessary, as you are
sometimes forced to prove the constraints just to mention Map, even when what
you are doing does not require them. For instance, defining the empty map
'empty :: Map k v' shouldn't require you to know anything about k, because an
empty tree doesn't contain any keys, let alone care about their ordering. In
this case requiring Ord k in order to mention Map k v would be superfluous,
as the sketch below shows.
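
A minimal sketch of the bare style (a hypothetical tree-based Map, not the
actual containers implementation): `empty` needs no constraint at all, while
`insert` asks for `Ord k` only because it actually compares keys.

    
    
        -- A hypothetical tree-based Map, for illustration only.
        data Map k v = Tip | Bin (Map k v) k v (Map k v)
    
        -- No Ord k needed: an empty tree never compares keys.
        empty :: Map k v
        empty = Tip
    
        -- The constraint appears only where ordering is actually used.
        insert :: Ord k => k -> v -> Map k v -> Map k v
        insert k v Tip = Bin Tip k v Tip
        insert k v (Bin l k' v' r)
          | k < k'    = Bin (insert k v l) k' v' r
          | k > k'    = Bin l k' v' (insert k v r)
          | otherwise = Bin l k v r
    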

~~~
Ar-Curunir
Rust tends to take the second approach as well: trait bounds usually go on
`impl` blocks, not on the struct itself.

(This is by no means a standard across the community though; I'd say it's a
50/50 split.)

~~~
wh33zle
A case where it is unavoidable is when you want to refer to an associated type
in your struct declaration.

In most other cases, putting bounds on impls leads to less repetition of those
bounds and better error messages!

Another related topic is trait methods with method-level type parameters that
have bounds. Those have to be repeated on every usage site so it is usually
preferable to go for a design that has the type parameters in the actual
trait.

Instead of:

    
    
        trait Foo {
            fn foo<T: Bar>(&self, bar: T);
        }
    

do this instead:

    
    
        trait Foo<T> {
            fn foo(&self, bar: T);
        }
    
        impl<T: Bar> Foo<T> for XYZ {
            fn foo(&self, bar: T) {
                // ...
            }
        }
    

One bound might seem fine in this example, but it gets annoying once things
also need to be Send, Sync and 'static.

------
lmm
This sounds like the idea that it's fine for C++ templates to not be
typechecked, because the expanded template will eventually be checked by the
compiler. Perhaps, but it makes for errors a long way away from their causes,
which are difficult to debug.

If I ever divide by zero (or by something that might be zero) in my proof,
I've almost certainly made a mistake. I'd rather hear about it straight away
than later when I try to use some fact about the result of that division.

~~~
MaxBarraclough
For completeness: C++ eventually introduced a feature to allow something
analogous to type-checking for template metaprogramming, called _concepts_:
[https://en.wikipedia.org/wiki/Concepts_(C%2B%2B)](https://en.wikipedia.org/wiki/Concepts_(C%2B%2B))

------
somewhereoutth
Using Scott encoding in the Lambda Calculus it is actually possible to define
a 'usable' infinite natural number as a recursive construction (essentially
its predecessor is defined as itself).

Interestingly enough, constructions for add, (natural) subtract and multiply
work as you would expect with this infinite number without needing special
handling. E.g. for all n, n + inf == inf, n - inf == 0, n * inf == inf.

Furthermore, the construction for (natural) div yields this infinite term (or
at least something that behaves exactly like it) when the denominator is 0.
Even more interestingly, 0/0 == inf for this system.

EDIT: Why might this be useful? Well now when you 'take n' from a list, n=inf
will give you the whole list, regardless of how long (or even if it is
infinite - i.e. lazy). Likewise 'drop n' will return the empty list - except
with an infinite list, whereupon the calculation will fail to terminate.

EDIT: inf - inf == non-termination in this system, suggesting not only that
inf - inf != 0, but that it != inf either; in fact it is something
unreachable in this system (note the correspondence between 'subtract' and
'drop').
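
A loose Haskell analogue of this behaviour, using lazy Peano naturals rather
than the Scott encoding (my own sketch, not the construction described
above):

    
    
        -- Lazy Peano naturals: inf is its own predecessor.
        data Nat = Z | S Nat
    
        inf :: Nat
        inf = S inf
    
        -- Truncated subtraction: subn n inf = Z for finite n, while
        -- subn inf inf fails to terminate, matching the EDIT above.
        subn :: Nat -> Nat -> Nat
        subn n Z         = n
        subn Z _         = Z
        subn (S n) (S m) = subn n m
    
        -- takeN inf xs yields the whole list, even a lazy infinite one;
        -- dropN inf xs returns [] for finite xs and loops on infinite ones.
        takeN :: Nat -> [a] -> [a]
        takeN Z _          = []
        takeN _ []         = []
        takeN (S n) (x:xs) = x : takeN n xs
    
        dropN :: Nat -> [a] -> [a]
        dropN Z xs         = xs
        dropN _ []         = []
        dropN (S n) (_:xs) = dropN n xs
    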

~~~
zodiac
Mathlib does indeed define the extended real numbers, but it also needs to
define division on the plain reals. This is similar to subtraction having to
be defined both for the integers and for the naturals (mentioned in the
article).

~~~
somewhereoutth
I presume though that Mathlib does not define the extended _naturals_? If it
did, then 1:N / 0:N would be valued at this infinity.

What I found interesting about the Scott encoding was that natural infinity
was trivial to define, and _arises directly_ from a straightforward
construction of the div function (for naturals).

The encoding is as follows (\x indicates lambda abstraction over the
identifier x; recursion is assumed, though of course it could be made
explicit with the Y combinator):

    
    
      zero := \f \x x
      succ := \n \f \x f n
    
      infinite := \n succ (infinite n)
      infinity := infinite zero
    
      subn := \n \m m (\p n (\q subn q p) n) n
    
      divn := \n \m (subn m n) (\p zero) (succ (divn (subn n m) m))
    
      divn (succ zero) zero => infinity
    

EDIT: Note that infinity does not have a normal form (does not terminate),
but:

    
    
      subn n infinity => zero

~~~
zodiac
The extended reals can also arise directly from the definition of the reals,
either as Cauchy sequences of rationals (as unbounded increasing sequences,
etc.) or as Dedekind cuts (let the lower cut contain all rationals) :) For
the reals, IIRC, the main reason to exclude the infinities is to make them a
field.

However, I don't think any of this matters for the point the article made.
The problem is not that 1:N/0:N has "nothing sensible" to return; the problem
is that if you define it in the way most mathematicians do, a lot of your
proofs will be littered with (inline) lemmas showing that the result of
such-and-such a division is a finite natural / finite real. This is analogous
to the problem brought up in the article of having proofs littered with
lemmas showing that the result of such-and-such a square root was real.

------
e79
I think this behavior makes perfect sense when you view it through the lens of
type theory. A function that returns a real number must always return a real
number. Some function types are defined by something like ‘R, error’, and it’s
then up to the caller to check ‘error’. Others throw exceptions that bubble up
the call stack and may never return ‘R’, which in a sense breaks the type
contract: “exception” semantics take precedence over the type theory.
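
For instance, the ‘R, error’ style in Haskell terms (a sketch, with a
hypothetical `safeDiv` for illustration):

    
    
        -- Error-return style: the caller must inspect the Left case.
        safeDiv :: Double -> Double -> Either String Double
        safeDiv _ 0 = Left "division by zero"
        safeDiv x y = Right (x / y)
    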

~~~
jhanschoo
I don't understand what you mean. You still have the different obvious ways of
formulating, e.g. division in type theory:

1. The article's way: "R^2 -> R", but with a garbage value for 0 in the
second argument.

2. What you propose: "R^2 -> Maybe R".

3. The mathematician's dependent-type-theoretic way: "R * (x:R, proof that
x != 0) -> R".

4. The conventional way: "R * (R \ {0}) -> R".

They have their advantages and disadvantages. The first is useful in proof
verification, since it simplifies most proofs; from that perspective the
garbage value at 0 is actually well-behaved and doesn't need to be handled as
a special case. But you allow seemingly nonsense theorems when you forget to
add the condition (x != 0).

What you propose is similar, except that you are forced to reason about
whether the output value is valid.

3 and 4 are tedious, since you always need to prove that the inputs are
nonzero. 4 is more tedious, since you have merely shifted the problem of
handling the zero case into proving properties about the canonical map
"R -> R \ {0}" (which must still send 0 to some garbage value). On the
upside, these functions surject onto R.
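
For concreteness, here is how styles 1-3 might look in Lean 4 with Mathlib (a
sketch; `div?` and `div'` are hypothetical names for illustration, and only
the first example reflects Mathlib's actual convention):

    
    
        import Mathlib
    
        -- Style 1 (Mathlib's convention): division is total and x / 0 = 0.
        example : (1 : ℝ) / 0 = 0 := div_zero 1
    
        -- Style 2: signal the zero case in the return type (hypothetical).
        noncomputable def div? (x y : ℝ) : Option ℝ :=
          if y = 0 then none else some (x / y)
    
        -- Style 3: demand a proof that the denominator is nonzero (hypothetical).
        noncomputable def div' (x y : ℝ) (_ : y ≠ 0) : ℝ := x / y
    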

~~~
empath75
> But you allow seemingly nonsense theorems when you forget to add the
> condition (x != 0).

Such as what? I’ve only spent a few hours with Lean, but I’m fairly sure that
any attempt to prove something silly would fail, because it would run up
against the weird behavior of 0.

For example, if you tried to use a/a = 1, as the article mentions, you’d be
unable to prove it without adding the condition that a != 0, and you’d be
unable to use it further on without that condition in place.
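
Mathlib's own lemma for this indeed carries the hypothesis explicitly; a
minimal sketch:

    
    
        import Mathlib
    
        -- `div_self` demands a proof that a ≠ 0; without it, the claim
        -- is false at a = 0, where a / a = 0 under Lean's convention.
        example (a : ℝ) (h : a ≠ 0) : a / a = 1 := div_self h
    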

~~~
leanuser57
Here is a "nonsense" theorem that is provable in Lean:

“There exists a real number r such that 1/r = 0.”

If you try to translate this theorem into maths, you will run into trouble at
some point. At which point exactly depends on how you want to make things
precise... which is exactly what mathematicians avoid when talking about
division in fields.
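
A sketch of that statement in Lean 4 with Mathlib; the witness r = 0 goes
through precisely because of the junk-value convention:

    
    
        import Mathlib
    
        -- Take r = 0; then 1 / r = 0 holds because x / 0 = 0 by convention.
        example : ∃ r : ℝ, 1 / r = 0 := ⟨0, div_zero 1⟩
    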

~~~
empath75
That’s a perfectly sensible function; it’s just not ordinary division on the
reals where x = 0, and you can’t use it that way.

------
credit_guy
With humility I will say that I don't like this solution.

In mathematics functions have domains and codomains. Two functions are equal
if their domains are equal, their codomains are equal, and for all the values
in the domain they return the same values in the codomain.

A mathematician who doesn't know anything about type theory, but hears that
Lean is good at formalizing math, will expect the division function to have
the domain R x (R \ {0}). Modifying the domain to be R x R and returning a
garbage value for the added points is not a good solution. Maybe it works "in
real life", but you always need to carry with you an invisible asterisk and
footnote saying that Lean did the right thing only assuming you asked the
right question.

Again, being humble (my Lean experience is about one hour in total, spread
across several sessions), I think it would be better to invest a bit more
time upfront and always properly define the domain of the function used,
rather than introduce edge cases that can come back to bite you twenty-five
theorems down the road.

------
dvt
The title is a bit sneaky. Division by zero in "type theory" will depend on
just that -- your type theory. Some theories use special values for division
by zero (e.g. could spit out DNE or +INF/-INF). But then you need to formalize
how you do "things" with DNE or infinities. This is, more often than not, an
unnecessary rabbit hole, so just returning 0 is simpler. Some theories might
require a proof that the denominator isn't zero (essentially making division a
three-argument function) -- which is kind of a cool way of thinking about
it[1].

[1] [https://math.stackexchange.com/questions/334393/how-does-type-theory-handle-division-by-zero-and-such](https://math.stackexchange.com/questions/334393/how-does-type-theory-handle-division-by-zero-and-such)

------
Someone
I don’t think the “in type theory” restriction is correct.

This choice works fine when using type theory in theorem provers, but would be
unacceptable if we did it in programming languages that use type theory (or
would that be called “type practice” ;-)?)

In Haskell, for example, _1/0_ returns _Infinity_.

~~~
Sniffnoy
That's not exactly right... in Haskell, 1/0 has type Fractional a => a. So, it
could be any fractional type. If 1/0 occurs in a context where it'll be
interpreted as a floating-point number, you'll get Infinity, yes, as per how
standard floating-point works. In other contexts though you may get an error.
E.g., if you import Data.Ratio and do 1/0 :: Rational, you'll get an error.
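
A GHCi session illustrating both outcomes:

    
    
        ghci> 1/0 :: Double
        Infinity
        ghci> import Data.Ratio
        ghci> 1/0 :: Rational
        *** Exception: Ratio has zero denominator
    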

------
Tainnor
I've run into this argument before.

I can understand that it makes sense for proof assistants. As the post shows,
proof assistants don't allow you to misuse that anywhere else.

(But as a sidenote: couldn't you fix this with dependent types somehow?)

I still think it would be wrong for a regular programming language. AFAIK,
the Pony language does this too, but that is a general-purpose programming
language that does _not_ check your logic, and I am very wary of something
like that. People might type up some algorithms, or even just try to compute
an average, and the system absolutely should yell at you if you try to divide
by zero.

~~~
andolanra
Pony now includes both partial division (which raises an error when dividing
by zero) and checked division (which returns a tuple of the result and a
boolean indicating whether an error occurred):
[https://tutorial.ponylang.io/expressions/arithmetic.html#partial-and-checked-arithmetic](https://tutorial.ponylang.io/expressions/arithmetic.html#partial-and-checked-arithmetic)

For what it's worth, I'm willing to argue that, with respect to its specific
error-handling model, Pony is making a strongly motivated choice in having
x/0 return 0 by default. In general, I don't think it's as much of a problem
as you might imagine. The _average_ function is a perfect example: if I
naïvely implement it in such a language, then it would mean… that
_average([])_ returns 0. That doesn't seem terribly bad to me.
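
Here is a tiny Haskell sketch of that naive implementation, with a
hypothetical `divOrZero` standing in for a division that returns 0 on a zero
divisor, as Pony's default operator does:

    
    
        -- Hypothetical stand-in for division that returns 0 on a zero divisor.
        divOrZero :: Int -> Int -> Int
        divOrZero _ 0 = 0
        divOrZero n d = n `div` d
    
        -- The naive average of an empty list then comes out as 0.
        average :: [Int] -> Int
        average xs = sum xs `divOrZero` length xs
    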

~~~
Tainnor
I don't think that the average of an empty list being zero is semantically
correct, but I can accept that it may not be so bad in practice (e.g. if it
shows up like that in a UI and you display the number of elements anyway).

But I still think in other situations I'd want to be yelled at for trying to
divide by zero. E.g. imagine I'm trying to compute the slope of a secant
between two points. If both points are the same, I'd like the compiler to
yell at me (either the problem makes no sense here, indicating a bug
somewhere else in my code, or I should be computing the derivative instead) -
in most cases, 0 would be the wrong answer. I think the issue here is that
with division by zero you lose continuity (unless you use +Inf), which can
contradict intuition.

I'm not saying it doesn't work for Pony, maybe it does, but I don't think I
would feel comfortable with that behaviour.

Silent wraparound on overflow is arguably even worse; that I could definitely
see leading to logic errors.

~~~
klodolph
You could argue that 0^0 = 1 is also not semantically correct, but it is the
most widespread convention among mathematicians.
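
For what it's worth, this is also the convention Lean and Mathlib adopt; a
quick sketch for natural-number exponentiation:

    
    
        import Mathlib
    
        -- 0 ^ 0 = 1 for naturals; `pow_zero` states a ^ 0 = 1 in any monoid.
        example : (0 : ℕ) ^ (0 : ℕ) = 1 := pow_zero 0
    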

~~~
alentist
It IS semantically correct. The number of functions from a set of 0 elements
to a set of 0 elements is exactly 1. No parallel argument exists for division
by zero. The two situations are not analogous at all.

~~~
klodolph
Mathematics is full of cases where the same notation means different things.
You’ve kind of dodged the question of what 0^0 is by talking about set
theory, because it is (in some sense) arbitrary whether X^Y means the
function set Y -> X or its size (cardinal exponentiation), while in analysis
it may mean exp(y log x). We accept that for natural X, Y other than (0,0)
these must agree, but that does not imply that they should agree at (0,0),
because the definitions are simply not the same.

If you are serious about math, which it seems you are, I would be a bit more
careful about “transporting” notation from one field to another and arguing
about correctness.

The usual way to transport a definition from a discrete domain to a continuous
one is a technique called analytic continuation. I am curious if there is an
analytic continuation of the discrete X^Y which contains 0^0=1, and what that
would look like, but at that point you’re _definitely_ not talking about
exponentiation any more.

See:
[https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero](https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero)

See: [https://math.stackexchange.com/questions/11150/zero-to-the-zero-power-is-00-1/11155](https://math.stackexchange.com/questions/11150/zero-to-the-zero-power-is-00-1/11155)

~~~
alentist
> it is (in some sense) arbitrary that X^Y means Y -> X or its size (cardinal
> exponentiation)

Are you referring to the notation? I don’t think notation is what’s being
contested.

> in analysis may mean exp(y log x)

Doesn’t work with a base of 0.

> We accept that for natural X,Y except (0,0)

Why “except”?

> that does not imply that they should agree at (0,0), because the definitions
> are simply not the same.

The definition of exponentiation of reals typically _starts with_
exponentiation of naturals as a given (see Baby Rudin).

> The usual way to transport a definition from a discrete domain to a
> continuous one is a technique called analytic continuation. I am curious if
> there is an analytic continuation of the discrete X^Y which contains 0^0=1

Analytic continuation refers to something else, but I get what you’re trying
to say.

The answer is simple: Real exponentiation is an _extension_ of natural
exponentiation. Hence, it should have the same value as the latter wherever
the latter is defined.

Yes, I’ve read that SE question before. It’s good that you brought it up. I
recommend reading the comments under the accepted answer and the other answers
as well.
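
Incidentally, Lean's Mathlib makes the same choice even for its real-power
function; a quick sketch:

    
    
        import Mathlib
    
        -- Mathlib's real power gives x ^ 0 = 1 for every real x, including 0.
        example : (0 : ℝ) ^ (0 : ℝ) = 1 := Real.rpow_zero 0
    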

------
dwheeler
Obviously a system can define a symbol like "/" however it wants. But then it
isn't what many mathematicians would expect.

The Metamath Proof Explorer defines "/" so it only has a value if the
denominator is a nonzero complex number:
[http://us.metamath.org/mpeuni/df-div.html](http://us.metamath.org/mpeuni/df-div.html)

You can argue about whether that is better, but it is a possible choice.

