
How to derive convolution from first principles - Anon84
https://medium.com/@michael.bronstein/deriving-convolution-from-first-principles-4ff124888028
======
peter_d_sherman
>"Writing the above formula as a matrix-vector multiplication leads to a very
special matrix that is called _circulant_ :"

[...]

>"A _circulant matrix_ has multi-diagonal structure, with elements on each
diagonal having the same value."
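
To make the structure concrete, here is a minimal sketch of building C(w) in plain Python. The helper name `circulant` and the indexing convention C[i][k] = w[(i - k) mod n] are my own assumptions, chosen so that C(w)x matches the circular convolution x∗w; the article may lay the matrix out differently.

```python
def circulant(w):
    """Return the n x n circulant matrix with entries C[i][k] = w[(i - k) % n]."""
    n = len(w)
    return [[w[(i - k) % n] for k in range(n)] for i in range(n)]


C = circulant([1, 2, 3, 4])
for row in C:
    print(row)
# [1, 4, 3, 2]
# [2, 1, 4, 3]
# [3, 2, 1, 4]
# [4, 3, 2, 1]
```

Each diagonal (including the wrapped-around ones) carries a single value, which is the "multi-diagonal structure" the quote describes.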

[...]

>"One of the first things we are taught in linear algebra is that matrix
multiplication is non-commutative, i.e., in general, AB≠BA. However, circulant
matrices are a very special exception:

 _Circulant matrices commute_ ,

or in other words, C(w)C(u)=C(u)C(w). This is true for any circulant matrix,
or any choice of u and w. Equivalently, we can say that the _convolution is a
commutative operation_ , x∗w=w∗x."
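
A quick numerical sanity check of the commutativity claim, as a sketch only. The helpers `circulant`, `matmul` and `circ_conv` are hypothetical names, and the vectors `u` and `w` are arbitrary integers picked so the comparison is exact.

```python
def circulant(w):
    """Circulant matrix with entries C[i][k] = w[(i - k) % n] (assumed convention)."""
    n = len(w)
    return [[w[(i - k) % n] for k in range(n)] for i in range(n)]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]


def circ_conv(x, w):
    """Circular convolution: (x * w)[i] = sum_k x[k] * w[(i - k) % n]."""
    n = len(x)
    return [sum(x[k] * w[(i - k) % n] for k in range(n)) for i in range(n)]


u = [1, 2, 3, 4]
w = [0, -1, 5, 2]

# C(w)C(u) == C(u)C(w), checked entry by entry with integer arithmetic.
matrices_commute = matmul(circulant(w), circulant(u)) == matmul(circulant(u), circulant(w))
# Equivalently, x * w == w * x for the convolution itself.
conv_commutes = circ_conv(u, w) == circ_conv(w, u)
print(matrices_commute, conv_commutes)
```

This only checks one pair of vectors, so it illustrates the statement rather than proving it.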

[...]

>"A particular choice of w=[0,1,0…,0] yields a special circulant matrix that
shifts vectors to the right by one position. This matrix is called the
_(right) shift operator_ [4] and denoted by _S_. The transpose of the right
shift operator is the _left shift operator_."
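
A small sketch of the shift operator, assuming the same convention C[i][k] = w[(i - k) mod n] as a hypothetical `circulant` helper. Under that convention w=[0,1,0,…,0] gives a cyclic right shift, and the transpose gives a cyclic left shift.

```python
def circulant(w):
    """Circulant matrix with entries C[i][k] = w[(i - k) % n] (assumed convention)."""
    n = len(w)
    return [[w[(i - k) % n] for k in range(n)] for i in range(n)]


def matvec(A, x):
    """Matrix-vector product for a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


S = circulant([0, 1, 0, 0, 0])   # right shift operator for n = 5
x = [1, 2, 3, 4, 5]

right = matvec(S, x)             # entries move one position right, wrapping around
left = matvec(transpose(S), x)   # the transpose undoes that: a left shift
print(right)  # [5, 1, 2, 3, 4]
print(left)   # [2, 3, 4, 5, 1]
```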

>"Circulant matrices can be characterised by their commutativity property. It
appears to be sufficient to show only commutativity with shift (Lemma 3.1 in
[5]): A matrix is circulant if and only if it commutes with shift. The first
direction of this “if and only if” statement leads to a very important
property called _translation or shift equivariance_ [6]: the convolution’s
commutativity with shift implies that

 _it does not matter whether we first shift a vector and then convolve it, or
first convolve and then shift — the result will be the same_."
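
Shift equivariance can be checked directly. This is a sketch with hypothetical helpers (`circulant`, `matvec`, `matmul`) and arbitrary integer data; the matrix `A` is a made-up non-circulant example to show the other side of the "if and only if".

```python
def circulant(w):
    """Circulant matrix with entries C[i][k] = w[(i - k) % n] (assumed convention)."""
    n = len(w)
    return [[w[(i - k) % n] for k in range(n)] for i in range(n)]


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]


S = circulant([0, 1, 0, 0])      # right shift
C = circulant([3, -1, 2, 7])     # an arbitrary convolution
x = [1, 0, 4, -2]

shift_then_conv = matvec(C, matvec(S, x))
conv_then_shift = matvec(S, matvec(C, x))
equivariant = shift_then_conv == conv_then_shift

# A non-circulant matrix (identity plus one extra off-diagonal entry)
# does not commute with the shift.
A = [[1, 2, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
non_circulant_commutes = matmul(A, S) == matmul(S, A)
print(equivariant, non_circulant_commutes)  # True False
```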

[...]

>"Another important fact taught in signal processing courses is the
_connection between the convolution and the Fourier transform_ [8].

Here as well, the Fourier transform lands out of the blue, and then one is
shown that it diagonalises the convolution operation, allowing one to perform
convolution of two vectors in the frequency domain as an element-wise product of
their Fourier transforms."

[...]

>"Since all circulant matrices are jointly diagonalisable, they are also
_diagonalised by the Fourier transform_ [11].

They differ only in their eigenvalues. The last missing bit is the realisation
that

 _The eigenvalues of C(w) are the Fourier transform of w._

We can now put all the pieces of the puzzle into a statement known as the
_Convolution Theorem_ :

the convolution x∗w can be computed either as a circulant matrix C(w) applied
to x in the original system of coordinates (sometimes this is called “spatial
domain” convolution), or in the Fourier basis (“spectral domain”) by first
computing the Fourier transform Φ*x of x, multiplying it by the Fourier
transform of w [12], and then computing the inverse Fourier transform."
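
Both the eigenvalue statement and the Convolution Theorem can be checked numerically. The sketch below uses a naive O(n²) DFT from the standard library rather than an FFT. The function names and the unnormalised forward transform (with the 1/n factor in the inverse) are assumptions of this example; the article's Φ may be normalised differently.

```python
import cmath


def dft(v):
    """Unnormalised DFT: V[k] = sum_j v[j] * exp(-2*pi*i*j*k/n)."""
    n = len(v)
    return [sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def idft(V):
    """Inverse of dft above, including the 1/n factor."""
    n = len(V)
    return [sum(V[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
            for j in range(n)]


def circ_conv(x, w):
    """Spatial-domain circular convolution, i.e. C(w) applied to x."""
    n = len(x)
    return [sum(x[k] * w[(i - k) % n] for k in range(n)) for i in range(n)]


x = [1.0, 2.0, 3.0, 4.0]
w = [0.5, -1.0, 0.0, 2.0]
n = len(w)

# Convolution Theorem: transform, multiply element-wise, transform back.
spatial = circ_conv(x, w)
spectral = idft([a * b for a, b in zip(dft(x), dft(w))])
conv_err = max(abs(s - t) for s, t in zip(spatial, spectral))

# Eigenvalues: C(w) applied to the k-th Fourier mode equals dft(w)[k] times that mode.
w_hat = dft(w)
eig_err = 0.0
for k in range(n):
    phi = [cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)]
    C_phi = circ_conv(phi, w)
    eig_err = max(eig_err, max(abs(C_phi[i] - w_hat[k] * phi[i]) for i in range(n)))

print(conv_err < 1e-9, eig_err < 1e-9)
```

In practice one would use an FFT for the transforms, which is where the spectral route becomes cheaper than the dense matrix product for large n.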

