
Foundations for Efficient and Expressive Differentiable Programming [pdf] - espeed
http://papers.nips.cc/paper/8221-backpropagation-with-callbacks-foundations-for-efficient-and-expressive-differentiable-programming.pdf
======
mlazos
I saw this paper at NeurIPS and I thought it was an awesome application of
continuation passing style. Typically one of the advantages of CPS is that you
really only need a single stack frame (since the remainder of the program is
encoded in the callback), but in this paper they allow the CPS'd program to
use multiple stack frames and store the intermediate results from backprop in
the call stack. It never would've occurred to me to give up one of the
advantages of CPS in order to store values on the call stack. Cool idea!
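A minimal Python sketch of that idea (not the paper's Scala/Lantern code; the `Num`, `mul`, `add`, and `grad` names are made up for illustration): each operator takes a continuation `k` for the rest of the program, and the backward pass runs after `k` returns, using the intermediates still live in that stack frame instead of a tape.

```python
class Num:
    def __init__(self, x):
        self.x = x      # primal value
        self.d = 0.0    # accumulated gradient

def mul(a, b, k):
    # forward: compute the product, pass it to the continuation
    c = Num(a.x * b.x)
    k(c)                     # the continuation runs the rest of the program
    # backward: after k returns, a, b, c are still on this stack frame,
    # so gradients propagate with no tape data structure
    a.d += b.x * c.d
    b.d += a.x * c.d

def add(a, b, k):
    c = Num(a.x + b.x)
    k(c)
    a.d += c.d
    b.d += c.d

def grad(f, x):
    a = Num(x)
    f(a, lambda r: setattr(r, 'd', 1.0))  # seed the output gradient
    return a.d

# f(x) = x * x + x, so f'(x) = 2x + 1
g = grad(lambda a, k: mul(a, a, lambda t: add(t, a, k)), 3.0)
print(g)  # 7.0
```

The point of the trick is visible in `mul`: the lines after `k(c)` are exactly the reverse pass, and the call stack unwinding sequences them in the right (reverse) order for free.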

------
carterschonwald
Another paper on the same topic that got best paper at icfp this fall is
[http://conal.net/papers/essence-of-ad/](http://conal.net/papers/essence-of-ad/)

It shows how to connect forward- and reverse-mode automatic differentiation in
a very elegant way.

------
tiarkrompf
Co-author here. Happy to answer any questions, as usual ...

~~~
cs702
Very cool.

Like all really good ideas, this one seems "obvious" in hindsight. I mean that
as a compliment: It would have _never_ occurred to me that transforming code
into continuation-passing-style code would allow for automatic differentiation
through all dynamic control-flow structures, by leveraging the function-call
stack, thus eliminating the need for some kind of "tape" data structure, e.g.,
as in PyTorch.
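To make the dynamic-control-flow point concrete, here is a self-contained Python sketch (again, an illustration with made-up names, not the paper's implementation): the recursion depth of `power` depends on a runtime value, yet no tape is needed, because each stack frame holds its own intermediates and the unwinding stack sequences the reverse pass.

```python
class Num:
    def __init__(self, x):
        self.x = x      # primal value
        self.d = 0.0    # accumulated gradient

def mul(a, b, k):
    c = Num(a.x * b.x)
    k(c)                 # rest of the program, including recursive calls
    a.d += b.x * c.d     # reverse pass runs as the stack unwinds
    b.d += a.x * c.d

# power(a, n, k): control flow (recursion depth) depends on runtime n
def power(a, n, k):
    if n == 0:
        k(Num(1.0))
    else:
        power(a, n - 1, lambda t: mul(a, t, k))

x = Num(2.0)
power(x, 3, lambda r: setattr(r, 'd', 1.0))  # seed output gradient
print(x.d)  # d/dx x^3 at x = 2 → 12.0
```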

My question is about the ongoing work to provide a JIT compiler for Python
code. Do you expect it will provide full support for the entire PyTorch and/or
TensorFlow APIs?

~~~
tiarkrompf
Thanks! Yes, it wasn't obvious at all when we started looking at AD either.

Lantern supports a good deal of PyTorch (via Snek, our Python front-end
similar to AutoGraph) and can also read ONNX. Full feature parity is not our
main goal--so far, supported features have been driven mostly by what is
required for certain interesting models.

~~~
p1esk
How does this effort compare to Myia
[https://github.com/mila-udem/myia](https://github.com/mila-udem/myia) ?

