
How to compare two functions for equivalence, as in (λx.2*x) == (λx.x+x)? - tosh
https://stackoverflow.com/questions/17045941/how-to-compare-two-functions-for-equivalence-as-in-%CE%BBx-2x-%CE%BBx-xx
======
dustingetz
See also _Morte: an intermediate language for super-optimizing functional
programs_

"Now suppose there were a hypothetical language with a stronger guarantee: if
two programs are equal then they generate identical executables. Such a
language would be immune to abstraction: no matter how many layers of
indirection you might add the binary size and runtime performance would be
unaffected. Here I will introduce such an intermediate language named Morte
that obeys this stronger guarantee."

[http://www.haskellforall.com/2014/09/morte-intermediate-
lang...](http://www.haskellforall.com/2014/09/morte-intermediate-language-for-
super.html)

[https://en.wikipedia.org/wiki/Superoptimization](https://en.wikipedia.org/wiki/Superoptimization)

~~~
amelius
So then the task is to transpile the source functions into Morte, compile
them, and see if the executables are equal; and somehow we have solved an
undecidable problem (?) There must be a catch somewhere.

~~~
kccqzy
Morte by itself also doesn't support arithmetic.

~~~
dustingetz
Urbit does this cool trick where you use pure lambda calculus arithmetic
(church encoding), use this theoretically correct impl as a basis of your
proofs and reasoning, and then at runtime provide a polyfill that uses
hardware arithmetic operators. The runtime impl is thus not equal since it can
for example overflow, but in practice that doesn't inhibit our ability to make
useful programs.

[https://en.wikipedia.org/wiki/Church_encoding](https://en.wikipedia.org/wiki/Church_encoding)
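
To make the Church-encoding idea concrete, here is a minimal sketch in Python (not Urbit's actual implementation). The helper names `church` and `to_int` are made up for this example. It builds numerals as pure lambdas, defines addition and multiplication on them, and spot-checks that λx.2*x and λx.x+x agree on a few small inputs. It is a check on samples, not a proof.

```python
# Church numerals: the number n is "apply f to x, n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))


def church(k):
    """Build the Church numeral for a small non-negative int (hypothetical helper)."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n


def to_int(n):
    """Decode a numeral by counting how many times it applies its argument."""
    return n(lambda i: i + 1)(0)


two = church(2)
double_mul = lambda x: mul(two)(x)  # λx.2*x
double_add = lambda x: add(x)(x)    # λx.x+x

# Compare both definitions on the numerals 0..9.
results = [
    (to_int(double_mul(church(k))), to_int(double_add(church(k))))
    for k in range(10)
]
```

Decoding with `to_int` plays the role of the hardware polyfill here: the lambdas define what the numbers mean, and the machine integers are only used to observe them.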

Facebook does this also with GraphQL - GraphQL queries are theoretically
dynamic and defined by the client, but you can bake them when you ship, and
then provide optimized implementations of the query that also have knowledge
of the UI and database that they aren't supposed to otherwise have.

------
bazizbaziz
Undecidable in general, but there are two approaches I've seen work based on
AST manipulation and comparison:

1\. Solvers that search for applicable re-write rules to transform X into Y,
such as Cosette for SQL. These may not terminate because undecidability lol
[http://cosette.cs.washington.edu/](http://cosette.cs.washington.edu/)

2\. Canonicalization of the AST. This is a form of #1 but much more
restricted, and the hope is that functions that are equivalent end up
canonicalized in the same way. LLVM and GCC do this for a variety of reasons.
In the example given, you'd hope that both functions get canonicalized to
either the left or right hand side.
[https://gcc.gnu.org/onlinedocs/gccint/Insn-
Canonicalizations...](https://gcc.gnu.org/onlinedocs/gccint/Insn-
Canonicalizations.html)
[http://llvm.org/docs/MergeFunctions.html](http://llvm.org/docs/MergeFunctions.html)
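
A toy illustration of the canonicalization approach, as a sketch only: it is not how GCC or LLVM work internally. It assumes expressions in a single variable `x` built from integer constants, `+`, `-` and `*`, over exact (unbounded) integers. Each expression is normalized to a polynomial stored as a `{power: coefficient}` dict, so two expressions denote the same function exactly when their normal forms are equal. The function names are made up for this example.

```python
import ast


def poly_add(p, q):
    """Add two polynomials stored as {power: coefficient}."""
    out = dict(p)
    for power, coef in q.items():
        out[power] = out.get(power, 0) + coef
    return {k: v for k, v in out.items() if v != 0}


def poly_mul(p, q):
    """Multiply two polynomials stored as {power: coefficient}."""
    out = {}
    for pa, ca in p.items():
        for pb, cb in q.items():
            out[pa + pb] = out.get(pa + pb, 0) + ca * cb
    return {k: v for k, v in out.items() if v != 0}


def canon(node):
    """Normalize a parsed expression to its polynomial form."""
    if isinstance(node, ast.Expression):
        return canon(node.body)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return {0: node.value} if node.value != 0 else {}
    if isinstance(node, ast.Name) and node.id == "x":
        return {1: 1}
    if isinstance(node, ast.BinOp):
        left, right = canon(node.left), canon(node.right)
        if isinstance(node.op, ast.Add):
            return poly_add(left, right)
        if isinstance(node.op, ast.Sub):
            return poly_add(left, poly_mul({0: -1}, right))
        if isinstance(node.op, ast.Mult):
            return poly_mul(left, right)
    raise ValueError("expression is outside the supported fragment")


def same_function(src_a, src_b):
    """Compare two expression strings by comparing their normal forms."""
    return canon(ast.parse(src_a, mode="eval")) == canon(ast.parse(src_b, mode="eval"))
```

With this, `same_function("2*x", "x+x")` holds because both sides normalize to `{1: 2}`. The approach only works because the fragment is tiny; add division, conditionals or loops and a complete normal form is no longer available.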

------
BreakfastB0b
One way is to use Functional Extensionality, which is to say that two
functions are equal if for all possible inputs they return the same value.

In Coq for instance.

    
    
      Axiom functional_extensionality: forall {X Y: Type} {f g : X -> Y},
        (forall (x: X), f x = g x) -> f = g.
    
      Theorem x2_eq_xplus:
        (fun x => 2 * x) = (fun x => x + x).
      Proof.
        apply functional_extensionality. intros.
        destruct x.
        - reflexivity.
        - simpl. rewrite <- plus_n_O. reflexivity.
      Qed.

~~~
theoh
Does that work for X being the real numbers?

~~~
BreakfastB0b
I haven't got up to proving things about floating point numbers in Coq. But my
guess would be probably not.

But Haskell's quickCheck doesn't seem to find a problem.

    
    
      λ quickCheck @(Float -> Bool) $ \x -> 2 * x == x + x
      +++ OK, passed 100 tests.
      λ quickCheck @(Double -> Bool) $ \x -> 2 * x == x + x
      +++ OK, passed 100 tests.
    

It doesn't like associativity though.

    
    
      λ quickCheck @(Double -> _) $ \a b c ->
        (a + b) + c == a + (b + c)
      *** Failed! Falsifiable (after 6 tests and 4 shrinks):
      20.0
      3.5741489348898856
      2.894651135185324
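
The same experiment can be sketched in Python, assuming Python floats are IEEE 754 doubles (true on mainstream platforms). The sample list below is hand-picked for this example rather than randomly generated. Doubling agrees on all of them, including overflow to infinity, but a NaN input makes the `==` comparison false, and the classic 0.1/0.2/0.3 triple shows the associativity failure.

```python
import math


def double_mul(x):
    return 2 * x


def double_add(x):
    return x + x


# Hand-picked edge cases: zeros, a rounding-prone value, overflow, a subnormal, infinities.
samples = [0.0, -0.0, 0.1, 1e308, -1e308, 5e-324, math.inf, -math.inf]
agree = all(double_mul(x) == double_add(x) for x in samples)

# NaN never compares equal to anything, so the two results are not "==" here.
nan_agree = double_mul(math.nan) == double_add(math.nan)

# Addition is not associative for doubles.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
associative_here = left == right
```

So whether 2*x and x+x count as equal on floats depends on what equality means for NaN, which random testing with ordinary values will not reveal.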

~~~
theoh
OK, so those techniques would work equally well on arbitrarily complicated and
intractable code examples -- which is useful in practice but not in the spirit
of a formal determination of equality.

From the perspective of an Agda-phobe: The "2*x vs x+x" example could be a
case of a general question about arithmetic expressions, or it could even just
be about multiplication and addition. Since multiplication of integers can be
defined as repeated addition, proving equality in that particular case for any
numeric type just takes a rewrite of both sides in terms of addition only. If
the coefficient ("2") was not a natural number, things would be a little more
complicated (as other comments mention, you'd have to introduce some way of
getting things into a canonical form). I guess that would be an "intensional"
approach.

The best story I have about extensional definitions is a true one. At a class
on bike repair, somebody asked what a fixed-wheel bike was. The instructor
started to give an extensional definition: "It's like... a unicycle."
Presumably he could have followed this with other examples such as a Penny
Farthing -- but the audience seemed satisfied. It would obviously have been
more helpful to give an intensional definition ("no gears".)

------
seanwilson
This is one of the core reasons dependently typed languages are difficult to
make practical, which I find never gets mentioned. Getting your program to
type check requires proving that arbitrary properties like (λx.2*x) ==
(λx.x+x) hold. When this is undecidable it cannot be automated, so the user
has to help by providing the proof, which can be arbitrarily difficult (think
of pen-and-paper maths proofs but more rigorous, and it's a very different
skill from programming).

~~~
vilhelm_s
I disagree. In Coq (without axioms), it is not the case that (λx.2*x) ==
(λx.x+x), and programs type check fine anyway.

Speaking more generally, one of the desiderata when designing a dependent type
system is to set things up so that type checking is a simpler problem than
unrestricted theorem proving. Of course, if you want to prove an arbitrary
theorem, that can be arbitrarily hard, but that has nothing to do with using
dependent types.

~~~
seanwilson
> I disagree. In Coq (without axioms), it is not the case that (λx.2*x) ==
> (λx.x+x), and programs type check fine anyway.

Type checking may be decidable in Coq but now the user has to do the work of
putting things into a form that type checks. You're just pushing the problem
elsewhere. Even for trivial Coq programs you have to write a large number of
proofs compared to how much code you wrote.

> Speaking more generally, one of the desiderata when designing a dependent
> type system is to set things up so that type checking is a simpler problem
> than unrestricted theorem proving.

Do you know of any examples of this? I know DML limits you to linear
arithmetic so type checking is decidable but that's a big limitation. All
other languages like Coq, Idris and Agda require you to write proofs.

------
ulber
For C the first approach I would try is just using a general software
verification tool. Write a test driver which feeds both functions the same
non-deterministic input and asserts that their results are equal. A
verification tool will attempt to prove that the assertion cannot be
violated. The yearly software verification competition (SV-COMP) [1] has some
very capable tools for verifying C programs. If I remember correctly, SMACK
[2] and Ultimate Automizer [3] both do very well overall.

[1] [https://sv-comp.sosy-lab.org/2017/index.php](https://sv-comp.sosy-
lab.org/2017/index.php)

[2] [https://github.com/smackers/smack](https://github.com/smackers/smack)

[3] [https://monteverdi.informatik.uni-
freiburg.de/tomcat/Website...](https://monteverdi.informatik.uni-
freiburg.de/tomcat/Website/?ui=tool&tool=automizer)
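
When the input type is small, the test-driver idea can be imitated by plain brute force. The sketch below does exactly that; it is not what SMACK or Ultimate Automizer do. It assumes 16-bit unsigned arithmetic modeled with a bit mask, and all function names are invented for the example. The "driver" feeds the same input to both functions and reports the first input where they disagree.

```python
BITS = 16                 # assumed word size for this sketch
MASK = (1 << BITS) - 1    # models unsigned wraparound


def f_mul(x):
    return (2 * x) & MASK


def f_add(x):
    return (x + x) & MASK


def f_rare(x):
    # Deliberately wrong on a single input, to show that the search finds it.
    return 0 if x == 12345 else (x + x) & MASK


def find_counterexample(f, g):
    """Try every 16-bit input; return the first one where f and g differ, else None."""
    for x in range(1 << BITS):
        if f(x) != g(x):
            return x
    return None


no_difference = find_counterexample(f_mul, f_add)
difference = find_counterexample(f_add, f_rare)
```

Exhaustive search is a complete decision procedure on a finite domain, but it stops being feasible quickly as the input width or the number of arguments grows, which is where the symbolic tools earn their keep.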

------
webkike
I actually did some research in this area for my thesis. Obviously algorithm
equivalence is largely undecidable, but it turns out that determining whether
two SSA representations of some function (maybe one is pre-optimization and
the other is post) are equivalent can be reduced to a graph isomorphism
problem.

------
jroesch
If you have a system that allows you to write down the semantics of these
expressions, you can easily prove equivalence. You can use many of the
techniques posted by others to construct such a proof; functional
extensionality, for example, is one method.

This is why dependent type theory is useful for reasoning about program
behavior and program equivalence. For example, it becomes possible to show
that a compiler is semantics-preserving by demonstrating that the input and
output programs are "equivalent" in this way.

One issue with blindly translating a subset of expressions to SMT is that
you may give them behavior different from that of the original source
program. SMT solvers
like Z3 have a builtin semantics for the logic, which may or may not reflect
the source program you wrote. Verification languages that use SMT solvers for
automated reasoning must be very careful about how they "compile" the source
code into SMT queries to ensure that the query actually corresponds to the
original program.

------
ulucs
Depending on the context, it might be a good idea to use the probabilistic
approach: feed both functions a predefined number of random inputs and see if
the results match.

A similar method is used for matrices to check AB =? C in O(n^2) time per
round (you compare A(Br) with Cr for a predefined number of random vectors r).
You get O(n^2.something) if you do the actual multiplication.
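
The matrix check is usually attributed to Freivalds. A minimal sketch, assuming matrices are plain lists of rows with exact integer entries: each round multiplies both sides by a random 0/1 vector, which costs three matrix-vector products instead of a full matrix product. The function names and the default of 20 rounds are choices made for this example.

```python
import random


def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def freivalds(a, b, c, rounds=20, rng=None):
    """Probabilistically check whether a @ b == c.

    Never rejects a correct product. A wrong product survives one round
    with probability at most 1/2, so at most 2**-rounds overall.
    """
    rng = rng or random.Random()
    n = len(c[0])
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(n)]
        # Compare A(Br) with Cr without ever forming AB.
        if mat_vec(a, mat_vec(b, r)) != mat_vec(c, r):
            return False
    return True


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
good = [[19, 22], [43, 50]]   # the true product a @ b
bad = [[19, 22], [43, 51]]    # one entry off
```

Here `freivalds(a, b, good)` is always true, while `freivalds(a, b, bad)` is false except with probability at most 2**-rounds. The error is one-sided, which is what makes the trick usable in practice.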

~~~
tlb
Short functions can be equivalent for all but a vanishingly small subset of
inputs:

    
    
      f1(x) = (x==7239829489) ? 0 : x
      f2(x) = x
    

so randomized tests would probably say they are equivalent.
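
A quick sketch of that failure mode, assuming inputs are drawn uniformly from 64-bit unsigned integers (the thread does not fix the domain) and using a seeded generator so the run is repeatable. The name `randomized_equal` is made up for this example.

```python
import random

MAGIC = 7239829489


def f1(x):
    return 0 if x == MAGIC else x


def f2(x):
    return x


def randomized_equal(f, g, trials=10_000, seed=0):
    """Compare f and g on random 64-bit inputs; True if no difference was seen."""
    rng = random.Random(seed)
    return all(f(x) == g(x) for x in (rng.randrange(2**64) for _ in range(trials)))


looks_equal = randomized_equal(f1, f2)       # random sampling sees no difference
actually_differ = f1(MAGIC) != f2(MAGIC)     # but the functions are not equal
```

With 10,000 samples out of 2**64 possible inputs, the chance of hitting the one bad input is about 5 in 10**16, so the random test passes while the functions still differ.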

~~~
quickthrower2
This is the same problem for unit tests even with 100% code coverage.

------
Norfair
Depending on the language, this is undecidable. See the function equality post
on cs-syd.eu

~~~
SilasX
So it could be [efficiently?] decidable if you forced the function
specification to adhere to certain rules that ensured a standard, easily-
comparable format?

~~~
tikhonj
Not if the language was (still) Turing-complete.

On the other hand, if the rules are restrictive enough to _stop_ the language
from being Turing-complete, you could make it work.

There's an interesting design problem there: how do you make a language that's
powerful and expressive enough to be useful but simple and restricted enough
to solve problems like this reliably?

~~~
seanwilson
> There's an interesting design problem there: how do you make a language
> that's powerful and expressive enough to be useful but simple and restricted
> enough to solve problems like this reliably?

See theorem proving systems like Coq which use languages that are total. The
majority of loops you write while coding obviously terminate (for item in
items...) and Coq lets you prove more complex forms of looping terminate so
it's not as restrictive as you'd think.

~~~
tikhonj
Coq and similar languages are definitely quite expressive, but they're still
complex enough to involve non-trivial manual intervention. Verifying something
with Coq is possible but it's _hard_ ; whole PhD theses have been written
around verifying code for a single program. I know there is _some_
proof-search-based automation available (i.e. auto), but I understand it's
still pretty limited.

Coq, Agda and similar languages are restricted enough to make verification
_possible_ , but complex—and expressive—enough to make it very difficult to
automate.

The opposite extent would be domain-specific languages that are _very_
restrictive but still powerful enough _for their domain_. Designing languages
like that would let us verify non-trivial properties fully automatically and
can save a lot of work for the user, but we end up needing to design a
language for each class of task that we care about.

As I said: it's an interesting design problem.

One recent addition to this is the Lean theorem prover, which tries to be
similarly powerful and general as Coq but with more provisions for automation
via the Z3 SMT solver. I haven't played with it myself but from what I've
heard it's a very promising development in making formal verification more
accessible to non-experts.

~~~
seanwilson
> Coq, Agda and similar languages are restricted enough to make verification
> possible, but complex—and expressive—enough to make it very difficult to
> automate.

Pretty much every theorem proving system out there is expressive enough to
state whatever program property you want. You can't escape having to automate
proofs though.

> Coq and similar languages are definitely quite expressive, but they're still
> complex enough to involve non-trivial manual intervention. Verifying
> something with Coq is possible but it's hard; whole PhD theses have been
> written around verifying code for a single program.

Yes, it's the most challenging aspect of programming I've ever tried. It's
completely impractical and too expensive for most projects right now outside
of domains where mistakes cost even more like aviation, medical devices,
operating systems, CPU etc.

> The opposite extent would be domain-specific languages that are very
> restrictive but still powerful enough for their domain.

Well, see DML where you can only write linear arithmetic constraints so it's
limited but automated:
[https://www.cs.bu.edu/~hwxi/DML/DML.html](https://www.cs.bu.edu/~hwxi/DML/DML.html)

I'm not really sure what the sweet spot is to be honest or if there is one
when you require full automation. Even non-linear arithmetic is undecidable
(basically arithmetic where you're allowed to multiply variables together) so
you quickly get into undecidable territories for what seem like basic
properties. No clever language design idea is going to let you side-step the
fact that proof automation is hard; people have been working on it for
decades. Even writing program specifications requires a skillset most
programmers will have to learn.

------
jandrese
Isn't this basically the halting problem?

~~~
progval
Almost. Its answer is a direct consequence of Rice's theorem (“A Turing
machine can't decide non-trivial semantic properties of Turing machines given
as input”), which is itself proven using the undecidability of the Halting
Problem (“A Turing Machine can't decide whether Turing Machines given as input
stop.”).
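
The link to the halting problem can be sketched as a reduction. Suppose a total decider `equivalent(f, g)` existed; it is purely hypothetical and deliberately not defined below. Then halting could be decided by wrapping any program in a "probe" and comparing the probe with the constant-zero function. The names `make_probe`, `always_zero` and `halting_program` are invented for this illustration.

```python
def always_zero(_):
    """The constant function that returns 0 on every input."""
    return 0


def make_probe(program, arg):
    """Return a function that ignores its input, runs program(arg), then returns 0.

    If program(arg) halts, the probe behaves exactly like always_zero.
    If it does not halt, the probe never returns on any input.
    """
    def probe(_):
        program(arg)  # may run forever
        return 0
    return probe


# With the hypothetical decider, halting would be decidable:
#     halts(program, arg) == equivalent(make_probe(program, arg), always_zero)
# Since halting is undecidable, no such total `equivalent` can exist.


def halting_program(n):
    """A program that obviously halts: count down to zero."""
    while n > 0:
        n -= 1


probe = make_probe(halting_program, 5)
probe_result = probe(123)
```

For a program that halts, the probe is indistinguishable from `always_zero`; the reduction only bites because nothing short of running the program tells you which case you are in.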

------
LightMachine
... why is a question I asked 4 years ago on the first page of Hacker News?

~~~
singold
Because someone found it interesting :)

------
zackmorris
Does someone know a generic way to do this, perhaps with wolframalpha.com? I
was thinking of perhaps having it re-solve for a certain variable and inspect
them visually but there may be no guarantee that it comes up with the same
simplified forms for both equations.

------
te
Term rewriting might help.

[https://en.wikipedia.org/wiki/Pure_(programming_language)](https://en.wikipedia.org/wiki/Pure_\(programming_language\))

------
lngnmn
By providing a heuristic that doubling could be defined as x+x and 2*x. That
one plus one is the same as one two times.

------
crb002
Undecidable in general, but you can compile many to SMTLIB and let Z3 crunch
them.

------
thomastjeffery
Another interesting thought:

Don't x+x and x*2 compile to different CPU instructions?

~~~
TheCoreh
An optimizing compiler will most likely compile both to the same instruction.
(If my intuition is correct, a bitshift.)

