
Understanding Convolutions (2014) - headalgorithm
http://colah.github.io/posts/2014-07-Understanding-Convolutions/
======
losvedir
Hmm, so I'm reading _The Hundred Page Machine Learning Book_ [0] which gives
an overview of the field, and one of the sections is on convolutional neural
networks, and it was very crisp and clear (like the whole book, really; highly
recommended). I understand since it's a brief summary it's probably eliding
stuff, but it's so much shorter and more understandable than this blog post
that it makes me wonder if it's really eliding a lot of crucial stuff, or if
this post is just... not clear.

In the book, it walks you through the convolution of an input matrix with a
smaller filter matrix. You start with lining up the smaller matrix in the top
left corner of the input matrix, and you multiply each pair of overlapping
entries together and add them all up. That's the value of the top left element
of the
result (the result is a matrix). Then you slide the filter to the right,
repeat the calculation, and that's the value one to the right in the result
matrix. You repeat, panning and scanning across the input matrix, and in this
fashion fill in the whole result matrix.
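That pan-and-scan procedure can be sketched in a few lines of NumPy. This is a minimal illustration, not the book's code; note that ML texts describe the operation without flipping the filter, which signal-processing texts would call cross-correlation rather than convolution:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image`, multiply overlapping entries, sum them.

    This is the 'valid' sliding-window operation described in the book
    (no kernel flip, as is conventional in ML libraries).
    """
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # elementwise product of the overlapping patch, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16).reshape(4, 4)
kernel = np.array([[1, 0], [0, 1]])  # hypothetical 2x2 example filter
result = conv2d_valid(image, kernel)
print(result)
```

Each output entry here is just `image[i, j] + image[i+1, j+1]`, since the example kernel picks out the diagonal of each 2x2 patch.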

I thought I had it, but this blog post just confused things. I do remember
learning about convolutions back in undergrad and being confused then, too, so
that sounds about right. When I read the ML handbook, I had a "wait, is that
it?" thought and was confused why I was confused back in the day. But I guess
that's just one particular way of doing it, or one use case or something.

[0] [https://www.amazon.com/Hundred-Page-Machine-Learning-Book/dp/199957950X](https://www.amazon.com/Hundred-Page-Machine-Learning-Book/dp/199957950X)

~~~
adamnemecek
Convolution is essentially polynomial multiplication. That's how I think about
it.

You remember (a + b)(c + d) = ac + ad + bc + bd?

Or (2x^2 + 1)(4x^3 + x) can be calculated by doing conv([0,2,0,1], [4,0,1,0])
(the numbers are the coefficients). The result is the list of coefficients of
the product polynomial.
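This example can be checked directly with NumPy's `convolve` (a quick sketch, coefficients listed from highest degree down):

```python
import numpy as np

# (2x^2 + 1) * (4x^3 + x), coefficients from highest degree down
p = [0, 2, 0, 1]  # 0x^3 + 2x^2 + 0x + 1
q = [4, 0, 1, 0]  # 4x^3 + 0x^2 + 1x + 0

r = np.convolve(p, q)
print(r)  # [0 8 0 6 0 1 0]  ->  8x^5 + 6x^3 + x
```

Multiplying out by hand gives (2x^2 + 1)(4x^3 + x) = 8x^5 + 2x^3 + 4x^3 + x = 8x^5 + 6x^3 + x, matching the convolution output.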

As the polynomials grow larger, this gets harder and harder computationally.
The convolution theorem lets you use the Fourier transform to convert the
convolution into a pairwise multiplication, so that

fft(f * g) = fft(f) .* fft(g)

where * is convolution and .* is pairwise multiplication.
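The identity is easy to verify numerically. One subtlety (assumed here, not stated above): the FFT gives circular convolution, so both signals must be zero-padded to the full output length to recover the ordinary linear convolution:

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])

# Direct (full) linear convolution
direct = np.convolve(f, g)

# Via the convolution theorem: zero-pad both signals to the full
# output length, multiply their FFTs pointwise, then invert.
n = len(f) + len(g) - 1
via_fft = np.fft.ifft(np.fft.fft(f, n) * np.fft.fft(g, n)).real

print(np.allclose(direct, via_fft))  # True
```

For large inputs this O(n log n) route is why FFT-based multiplication beats the naive O(n^2) convolution.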

------
glial
If anyone's interested in learning more, the clearest explanation I have ever
seen of convolutions is here:

[http://www.dspguide.com/ch6.htm](http://www.dspguide.com/ch6.htm)

Pedagogically speaking, that book is one of the best technical books I have
ever read.

~~~
magoghm
I agree. That book is excellent.

------
vortico
I would think that a tutorial on convolutions this extensive would mention the
convolution theorem. It's definitely the easiest way to visualize convolution
for most applications. Or is Fourier space not assumed to be known by the
reader?

~~~
king_magic
I had a similar thought the first time I read this a few years back. I mean, I
get what the author is trying to say here (e.g. the ball dropping/rolling
probability thing), but it’s just a little more convoluted (ouch) than it
needs to be, IMO.

------
kyberias
I didn't understand this sentence and stopped reading:

"How likely is it that a ball will go a distance c if you drop it and then
drop it again from above the point at which it landed?"

~~~
jsharf
Okay, so I was confused by this too. I think it's especially deceiving because
it specified that the ball only moved in one axis of motion, but it's actually
two (falling, and then rolling over). I also tend to get very frustrated with
articles that do a terrible job of explaining things, because it can feel like
an impassable wall for the reader, where they just stop and don't continue.
But in this case, they made up for the (really, really) terrible wording with
a (really, really) clear image showing exactly what they meant. It did take a
minute of staring at the image to figure out what they meant, but this is a
blog post about math; that's what it's going to take regardless. It's
unfortunate that they couldn't communicate their thoughts more clearly, but
it's kind of brutal to just stop reading because of one sentence, when they
put all this effort into making a diagram to explain that one sentence.

~~~
Sharlin
The example problem just seems ridiculously underspecified at first. It
shouldn't be necessary to read between the lines in order to decipher what the
author's mental model was.

------
hprotagonist
“well, it’s like autocorrelation, but upside down and backwards!”
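That quip can be made concrete: convolving with a kernel is the same as correlating with the kernel reversed. A quick NumPy sketch (example values are made up):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
k = np.array([4.0, 5.0])

# Convolution slides the *flipped* kernel; correlation does not flip.
conv = np.convolve(a, k)
corr_flipped = np.correlate(a, k[::-1], mode='full')

print(np.allclose(conv, corr_flipped))  # True
```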

------
greydius
Convolutions show up in a lot of places. Multiplying polynomials is a
convolution. This provides an interesting technique to create all the partial
sums of a sequence. If you have a generating function for the sequence, divide
it by (1-x). You'll get a new generating function where the nth coefficient is
the sum of the first n coefficients of the original function. Remember
1/(1-x) = 1 + x + x^2 + ... ?
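The 1/(1-x) trick corresponds to convolving the coefficient sequence with an all-ones sequence, which is just a running sum. A quick check (example sequence is made up):

```python
import numpy as np

seq = np.array([3, 1, 4, 1, 5])

# Dividing a generating function by (1 - x) multiplies it by
# 1 + x + x^2 + ..., i.e. convolves the coefficients with all ones.
partial = np.convolve(seq, np.ones(len(seq), dtype=int))[:len(seq)]

print(partial)         # partial sums of seq
print(np.cumsum(seq))  # same thing
```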

------
baruchel
I recently published a new crazy convolution formula at
[https://arxiv.org/abs/1902.08982](https://arxiv.org/abs/1902.08982) (see last
formulæ of the paper). I just submitted it to a journal and am waiting for the
answer. Discussion is welcome!

~~~
enriquto
For easier reading and referencing, you could state the total big-O cost of
your algorithm right at the beginning (or even in the abstract), along with an
indication of the range of input sizes over which it is likely to outperform
Karatsuba multiplication before being outperformed by the FFT.

------
throway88989898
A convoluted explanation, only to cite the epic Wikipedia gif.

Fouri-ous there's no mention of Fourier
[https://en.wikipedia.org/wiki/Convolution_theorem](https://en.wikipedia.org/wiki/Convolution_theorem)

------
platz
So a convolution is simply a method to calculate a total likelihood?

Why do we say "convolution" instead of "total likelihood", which is so much
closer to the goal being achieved?

~~~
nitrogen
Convolution is a tool with many applications, so naming it after just one of
those applications would make it harder to see its usefulness in the others.
