
Fully Automatic Differentiation for Tensor Expressions - hedgehog
http://vertex.ai/blog/fully-automatic-differentiation
======
yaccman
How would you differentiate something like merge/concatenate, where all the
action is in data movement and there are no arithmetic operations (but the
gradient is still required)?

~~~
melvinzzz
Here is how autodiff for a simple 1-D concatenation is computed. First we
write the concat itself:

      function (A[SA], B[SB]) -> (O) {
        O[i : SA + SB] = +(A[i] + B[i - SA]);
      }

Note that SA and SB are the sizes of the two tensors, and due to the way that
edge case handling works, for each index i of O either A or B is out of bounds
and thus equivalent to zero. Now, in the case of +/+
accumulation/combination, the autodiff writes the derivative of each input as
a sum over only the output gradient (rather than multiplied by the other
inputs, as in the +/* case). This gives us:

      dA[i : SA] = +(dO[i]);
      dB[i - SA : SB] = +(dO[i]);

A further reduce/defract pass slightly changes this to:

      dA[i : SA] = +(dO[i]);
      dB[i : SB] = +(dO[i + SA]);

which extracts the proper portion of the output gradient dO into dA and dB,
respectively.
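
If it helps to sanity-check this outside Tile, here is a minimal NumPy sketch
(my own illustration; the function names are made up) of the same forward
contraction with out-of-bounds reads treated as zero, plus the final sliced
gradient:

    import numpy as np
    
    def concat_tile_style(A, B):
        """Forward concat written the way the Tile contraction above reads:
        each index i of O sums A[i] and B[i - SA], with out-of-bounds reads
        treated as zero, so exactly one term contributes per i."""
        SA, SB = len(A), len(B)
        O = np.zeros(SA + SB)
        for i in range(SA + SB):
            a = A[i] if i < SA else 0.0        # A out of bounds for i >= SA
            b = B[i - SA] if i >= SA else 0.0  # B out of bounds for i < SA
            O[i] = a + b
        return O
    
    def concat_grad(dO, SA, SB):
        """Backward pass in the final reduce/defract form:
        dA[i] = dO[i], dB[i] = dO[i + SA] -- pure slicing, no arithmetic."""
        return dO[:SA], dO[SA:SA + SB]
    
    A = np.array([1.0, 2.0, 3.0])
    B = np.array([4.0, 5.0])
    assert np.array_equal(concat_tile_style(A, B), np.concatenate([A, B]))
    
    dO = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # upstream gradient for O
    dA, dB = concat_grad(dO, len(A), len(B))
    print(dA)  # [10. 20. 30.] -> the first SA entries of dO
    print(dB)  # [40. 50.]     -> the remaining SB entries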

------
manjunaths
Like automatic differentiation, is it possible to use it to integrate too?
Kind of like: f = ma; a = f/m; velocity = integral(a) dt; position =
integral(v) dt?

~~~
tzerrell
Unfortunately no, for the same reason that there's no chain rule for
integration (the very same reason, in fact, since autodiff is built on the
chain rule).
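
To spell the contrast out (my own gloss, not part of the original reply): the
chain rule gives a mechanical recipe for composites, and integration has no
counterpart:

    % Chain rule: the derivative of a composite decomposes into the parts'
    % derivatives, which is what lets autodiff propagate gradients
    % operation by operation.
    \frac{d}{dx} f(g(x)) = f'(g(x)) \, g'(x)
    
    % There is no analogous identity expressing \int f(g(x)) \, dx in terms
    % of the antiderivatives of f and g alone, so no mechanical
    % "auto-integration" can propagate through composed operations.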

