
Differentiable Programming: A Semantics Perspective - matt_d
http://barghouthi.github.io/2018/05/01/differentiable-programming/
======
seanmcdirmid
A decade ago, symbolic differentiation could also be used to compute normal
vectors for mathematically defined surfaces (like isosurfaces). Conal
Elliott (the guy who co-invented FRP) wrote a great paper on this with
fairly well-defined semantics:

[http://conal.net/Vertigo/](http://conal.net/Vertigo/)

And

[http://conal.net/papers/beautiful-differentiation/](http://conal.net/papers/beautiful-differentiation/)
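
The second of those papers builds everything on dual numbers. A minimal
sketch of that idea in Python (the paper itself is in Haskell; the class
and function names here are my own):

    class Dual:
        """A value paired with its derivative: think x + x'*eps, eps**2 == 0."""
        def __init__(self, val, der):
            self.val, self.der = val, der
        def __add__(self, other):
            return Dual(self.val + other.val, self.der + other.der)
        def __mul__(self, other):
            # The product rule falls out of (a + a'*eps) * (b + b'*eps)
            return Dual(self.val * other.val,
                        self.der * other.val + self.val * other.der)

    def derivative(f, x):
        return f(Dual(x, 1.0)).der

    print(derivative(lambda x: x * x * x, 2.0))  # 12.0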

~~~
davidwihl
Automatic differentiation is a little different from symbolic
differentiation. Rather than requiring a separate high-level description of
the computational graph, tools like autograd [1] can differentiate somewhat
arbitrary Python and numpy instructions.

[1] [https://github.com/HIPS/autograd](https://github.com/HIPS/autograd)
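
For example (a minimal sketch using autograd's documented grad function):

    import autograd.numpy as np  # autograd's drop-in numpy wrapper
    from autograd import grad

    def tanh(x):
        return (1.0 - np.exp(-x)) / (1.0 + np.exp(-x))

    d_tanh = grad(tanh)  # the derivative is just another Python function
    print(d_tanh(1.0))   # ~0.3932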

~~~
mehrdadn
> can differentiate somewhat arbitrary Python and numpy instructions.

I hear this a lot but it doesn't make sense to me. How do you differentiate
a conditional or a loop? It doesn't make sense mathematically, so I'm not
sure what sense it makes programmatically.

~~~
noelwelsh
For a loop, you can unroll it and define the derivative that way. But you're
correct that, in general, there are programs with no sensible derivative for
which automatic differentiation will still compute something.
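
Concretely, a sketch with autograd (the power function is my own example;
AD simply traces whichever iterations actually execute):

    from autograd import grad

    def power(x, n=5):
        # x**n written as an ordinary Python loop
        result = 1.0
        for _ in range(n):
            result = result * x
        return result

    print(grad(power)(2.0))  # 80.0 == 5 * 2**4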

Going beyond AD there is a "differential lambda calculus" [1]. I don't really
understand this work. It requires more background in programming languages,
logic, and analysis than I have time to learn. My understanding is that it
allows you to compute more derivatives (e.g. for higher-order functions) and
to place conditions so that only sensible derivatives are computed. There is
also "differential linear logic" [2], which has some relation to the
differential lambda calculus, but it's unclear to me what that relationship
is.

So in summary I believe:

1. automatic differentiation will compute the correct derivative when such a
thing is available, and some nonsense answer otherwise

2. to increase the space of programs for which derivatives can be computed
and to rule out other programs you need to look at differential lambda
calculus / linear logic.

[1]: [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.471.7213&rep=rep1&type=pdf](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.471.7213&rep=rep1&type=pdf)
[2]: [https://arxiv.org/abs/1606.01642](https://arxiv.org/abs/1606.01642)

~~~
mehrdadn
> automatic differentiation will compute the correct derivative when such a
> thing is available, and some nonsense answer otherwise

The problem is this is false too. Just modify my example a bit:

    
    
      y = x if x > 0 else x if x < 0 else 0
    

Here y is just the identity function written with redundant branches, so
y'(0) = 1; yet autodiff, following the x == 0 branch, would seem to give
y'(0) = 0.
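
A quick check (a sketch with autograd; at 0.0 both comparisons are false,
so the traced computation is just the constant 0):

    from autograd import grad

    def y(x):
        return x if x > 0 else (x if x < 0 else 0.0)

    # Prints 0.0; autograd may also warn that the output
    # looks independent of the input.
    print(grad(y)(0.0))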

~~~
noelwelsh
Good point / example!

------
vutekst
The article is a bit broken for me: the page is served over HTTPS, but the
MathJax file is loaded over plain HTTP. Current Chrome and Firefox block
such mixed content, so the figures and LaTeX don't render. Edit: ah-hah,
HTTPS Everywhere must have fiddled with it.

------
catnaroek
I don't understand in what sense programs can be called “differentiable”. Is
the space of programs modulo observational equivalence a manifold to begin
with? (I don't think it's Hausdorff or even T1, but I could be wrong.)

The examples given in the article are merely derivatives of ordinary
mathematical functions defined by ordinary mathematical expressions; in
particular, there is no sequencing, and there are no conditionals or loops.
So why call them “differentiable programs” when you are actually dealing
with ordinary differentiable functions from good old 19th-century analysis?
We need urgent improvements in the intellectual honesty department.

