
Matrix calculus for deep learning part 2 - UKamath7
https://kirankamath.netlify.app/blog/matrix-calculus-for-deeplearning-part2/
======
bollu
Maybe it's just me, but I found matrix calculus atrocious. The rules aren't
even useful in most situations, since they're essentially "hardcoded" for
certain matrix expressions that crop up often.

I _much_ prefer to reduce a given matrix expression into einstein summation
convention, at which point all of the "regular" calculus rules just work. You
can bash it out from this point on.

For example, consider `y = x^T x`. Matrix calculus tells us that the
derivative is `2x`. To derive this using the summation convention, we first
write the expression in coordinates:

    
      y = xi xi [summation over i implicit]
      dy/dxj
       = d(xi^2)/dxj
       = d(xi^2)/dxi * dxi/dxj [chain rule]
       = 2xi delta(ij) [all xi independent, so dxi/dxj = Kronecker delta]
       = 2xj [summing over i]
      dy/dx = 2x
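The derivation above can be checked numerically. A minimal sketch (not from the original post) using central finite differences on `y = sum_i x_i^2`, which should reproduce the gradient `2x`:

```python
def y(x):
    # y = x^T x = x_i x_i (implicit summation over i)
    return sum(xi * xi for xi in x)

def grad(f, x, h=1e-6):
    # central finite differences: df/dx_j ~ (f(x + h e_j) - f(x - h e_j)) / 2h
    g = []
    for j in range(len(x)):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

x = [1.0, -2.0, 3.0]
print(grad(y, x))  # approximately [2.0, -4.0, 6.0], i.e. 2x
```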

~~~
Yajirobe
How do you go from 2xj to 2x?

Also, for this trick to work on matrices, you need two indices.
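A sketch of the two-index case (my own illustration, not from the thread): for `y = A_ij A_ij` (i.e. `tr(A^T A)`), the same Kronecker-delta argument gives `dy/dA_kl = 2 A_kl`, which a finite-difference check confirms:

```python
def y(A):
    # y = A_ij A_ij = tr(A^T A), summation over both indices
    return sum(a * a for row in A for a in row)

def grad(f, A, h=1e-6):
    # central finite differences entry by entry
    g = [[0.0] * len(A[0]) for _ in A]
    for k in range(len(A)):
        for l in range(len(A[0])):
            Ap = [row[:] for row in A]; Ap[k][l] += h
            Am = [row[:] for row in A]; Am[k][l] -= h
            g[k][l] = (f(Ap) - f(Am)) / (2 * h)
    return g

A = [[1.0, 2.0], [3.0, -1.0]]
print(grad(y, A))  # approximately [[2.0, 4.0], [6.0, -2.0]], i.e. 2A
```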

------
fathead_glacier
In general I find that working with mathematics auxiliary to your field is
hard when you do not get to practice it every day. In such cases I have found
the Matrix Cookbook [1] very helpful as a quick reference.

[1]
[http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/3274/pd...](http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/3274/pdf/imm3274.pdf)

~~~
UKamath7
Thanks, the book is awesome!

