
GIMP Convolution Matrix - icc97
https://docs.gimp.org/en/plug-in-convmatrix.html
======
lealanko
Author here (of the plugin, not the page).

This was my only tangible contribution to the Gimp, some twenty years ago.
IIRC it was modelled after a similar feature in Paint Shop Pro, and it was
easy enough an exercise for an aspiring programmer. Or so I thought: others
have now spent much more time maintaining it than I spent writing it.

In hindsight it seems like a bizarre idea to have a feature whose GUI requires
filling in 25 numbers. But perhaps it has use as an educational tool.

In development version of the Gimp, the entire plugin has effectively been
rewritten. Good riddance!

~~~
OskarS
I had no idea this feature was available in Gimp, it would have helped me out
a lot in debugging and developing shaders which use convolution matrices!
Thanks for your hard work!

~~~
daturkel
It's in photoshop too!

------
jey
Yeah, weirdly a lot of basic image processing tasks can be expressed as
convolutions:
[https://en.wikipedia.org/wiki/Kernel_(image_processing)](https://en.wikipedia.org/wiki/Kernel_\(image_processing\))

~~~
eat_veggies
I don't fully grok the math behind convolutions yet, but the symmetry of the
numbers in the matrices is beautiful.

~~~
mlevental
discrete convolution is just dot product. in 2d it's elementwise matrix
multiplication.

[https://i.stack.imgur.com/YDusp.png](https://i.stack.imgur.com/YDusp.png)

don't worry about continuous convolutions

~~~
llamaz
Discrete convolution definitely isn't the dot product. Given an n-tuple a and
an m-tuple b, form a polynomial p_1 from the coefficients of a and a
polynomial p_2 from the coefficients of p_2. The the convolution of a and b is
the coefficients of a * b.

e.g. a= [1,2,3] and b=[4,5,6].

p_1 = 1 + 2x + 3x^2 + ...

p_2 = 4 + 5x + 6x^2 + ...

p_1 * p_2 = 4 + 13x + 28x^2 + ...

Hence the convolution of a and b is [4, 13, 28, ...].

Here's the full answer:

octave:1> conv([1,2,3], [4,5,6])

ans =

    
    
        4   13   28   27   18
    

The intuition behind convolution to me for an image is uniform blurring. If
you have a square shaped eraser, and you use it to blur a pencil sketch, the
sketch will appear blocky and square shaped. Or perhaps a better analogy is
painting: your wall will appear different depending on whether you use a
paintbrush or a roller.

~~~
mlevental
why in the world are you presenting convolution of polynomials?

i was being glib true but the crux of convolution (what's important for
someone who expresses that they don't understand the mathematics) is
elementwise multiplication i.e. dot product. yes you take a filter and sweep
it across your input (or vice-versa depending on whether you're really talking
about convolution or correlation) and that's important but in my opinion
what's more important (obviously since that's what /i/ expressed) is
elementwise multiplication. once you understand that (the elementwise
multiplication) sweeping and padding and etc are simple to understand.

>The intuition behind convolution to me for an image is uniform blurring. If
you have a square shaped eraser, and you use it to blur a pencil sketch, the
sketch will appear blocky and square shaped. Or perhaps a better analogy is
painting: your wall will appear different depending on whether you use a
paintbrush or a roller.

this is a very bad example since you can easily convolve with filters that
sharpen (e.g. literally any gaussian kernel). so you're really not
illustrating convolution but instead convolution with an averaging filter (or
something like that).

in general you shouldn't jump to criticize someone's explanation to a student
unless you are prepared to give a much better and (this is important)
pedagogically sound explanation because 1 you look foolish 2 you sabotage the
student's understanding.

~~~
llamaz
I don't care about the "student", I'm telling /you/ what convolution is. It's
not the dot product.

"yes you take a filter and sweep it across your input..."

Sweeping across your input is the heart and soul of convolution. I've taken
courses in image processing, control theory, probability theory, and digital
signal processing - and in all these contexts, convolution is always sweeping
across in a very specific way. If you change a single minus sign, you get
cross correlation which is completely different.

I don't appreciate the rude tone you're taking so I will not be replying
further. I didn't take that tone when replying to your initial message.

~~~
mlevental
> don't care about the "student", I'm telling /you/ what convolution is. It's
> not the dot product.

Then don't comment on expositions given to students.

>Sweeping across your input is the heart and soul of convolution.

But that's like just your opinion man. The definition is what the definition
is and a component of it is the translation of the filter and a component of
it is the weighted sum. It's up to each individual to decide what's the most
significant component and each individual is free to adjust their focus
according to the problem at hand. In this instant I chose to focus on the sum
for the sake of pedagogy.

>If you change a single minus sign, you get cross correlation which is
completely different.

No in fact you get literally the same function just flipped about t=0. I would
not call that completely different.

> I didn't take that tone when replying to your initial message.

Read your response again and reflect some more because absolutely your initial
message was antagonistic

------
alexcnwy
This great interactive really helped me understand what a convolution is:

[http://setosa.io/ev/image-kernels/](http://setosa.io/ev/image-kernels/)

Highly recommended!

------
chris_wot
Oh man, I've been trying to find a good explanation of what a convolution
matrix does! I'm literally just now working on bitmap code in LibreOffice and
was deliving into the convolution matrix functionality. Thanks to the author
for this!

------
bitL
Funny that GIMP is actually doing cross-correlation instead of convolution.
That's also a mess in Deep Learning frameworks where "convolution" often means
"cross-correlation".

~~~
rcthompson
I'm interested to know what the difference is. Do you have a link that
explains it?

~~~
LolWolf
It's a pretty minor difference, really. Over functions (which is the same as
over vectors/matrices, except the latter are taken at discrete points), the
autocorrelation and the convolution differ by a sign. The definition of
convolution is

conv(x(t), y(t))(q) = integral (x(t)·y(q-t) dt)

where x(t), y(t) are functions. Or, with sums and x, y are vectors

conv(x, y)(q) = sum(over i + j = q+1) of (x(i)·y(j))

whereas the definition of the autocorrelation is "simpler"

ac(x(t), y(t))(q) = int(x(t)·y(t-q))

In other words, it replaces y(t) with y(-t) (i.e., the autocorrelation of x(t)
and y(t) is the same as the convolution of x(t) and y(-t)). A similar
transformation happens for the discrete case.

Turns out, in the context of learning, these are equivalent since it just
corresponds to a relabelling of the indices (which is an invertible linear
transformation). In fact, even more strongly, independent of the order of the
method (so long as it is >0th order!), this will cause no change in the
descent direction or objective value, unlike arbitrary invertible linear
transformations (for which this is only true of ≥2nd order methods).

~~~
LolWolf
Also, sorry, I was taking terminology from the GP. The real term for this is
_cross-correlation_ —with _autocorrelation_ being the application of the above
operation onto a single function; i.e., taking ac(f(t), g(t)) as defined
above, the autocorrelation of g is really defined to be ac(g(t), g(t)).

------
userbinator
One thing that would help the comprehension of this document greatly is to
mention that it's simply a "weighted average".

~~~
oehpr
I don't think it is a weighted average. I believe you have to divide at some
point for averaging to be going on. Convolutions just add up the neighborhood
based on the kernel, and output their sum.

~~~
userbinator
It still gets across most of the point of how it works.

As for dividing, that's what the Divisor option is for. To quote the article:

"The result of previous calculation will be divided by this divisor. You will
hardly use 1, which lets result unchanged, and 9 or 25 according to matrix
size, which gives the average of pixel values."

------
twic
A fun filter i used a bit back when i was a scientist was the rolling ball
filter:

[https://imagej.net/Rolling_Ball_Background_Subtraction](https://imagej.net/Rolling_Ball_Background_Subtraction)

It took me a while to realise that this is just a cousin of convolution: in a
convolution, as you sweep the kernel over the image, combine pairs of pixels
and kernel elements with multiplication, then summarise the results with
addition, whereas with a rolling ball, you combine the pixels and kernel
elements with subtraction, and summarise the results with the maximum. The
kernel used is the 'ball', but there's no real reason it has to be based on a
sphere in any way.

I wonder what other image processing operators could be built in that
framework?

------
xvilka
Hopefully GIMP will finish GEGL and GTK+3 migration soon to be more tablet-
friendly.

~~~
inferiorhuman
I still haven't gotten over the hatchet job they did on the UI with 2.8 on the
desktop.

~~~
Grue3
They did a great job with 2.8 UI. Or are you one of these people who still
can't get over the Save/Export thing?

~~~
rleigh
As an occasional user, no I can't get over it. I choose the wrong action every
time! It's a non-standard way of saving an image, and it's a usability fail.

~~~
Grue3
Since GIMP 2.8 came out 5 years ago and you still haven't got used to it, you
must have been using it very occasionally, so hopefully it didn't waste too
much of your time. However it's a big win for people who use any non-trivial
features of GIMP which can't be "saved" in an ordinary image format. Pretty
much every professional media software uses this workflow. You can't "Save As"
.mp4 in Sony Vegas, you can only save your project as .veg, a file format
which only Sony Vegas can open. Otherwise all your hard work will be lost.

~~~
inferiorhuman
Or he just hasn't upgraded -- I know I haven't. But, you're right. I've mostly
moved on to other tools. If I'm using GIMP to edit something, the source file
is never going to be XCF. Sorry guys, nobody uses XCF and pros don't use GIMP.
A big win for professional users would be CMYK support, not the inability to
save non-XCF files.

Let's face it. Chasing non-existent professional users at the expense of your
actual user base is pretty darn lousy UX.

~~~
Grue3
There are so many misconceptions in this post, I have to wonder if this isn't
some sort of an elaborate troll.

>If I'm using GIMP to edit something, the source file is never going to be
XCF.

What makes you think the source file can only be XCF? You can easily open any
image format with GIMP.

>Sorry guys, nobody uses XCF and pros don't use GIMP.

Wrong and wrong. XCF is necessary to save your work in GIMP, so most GIMP
users use it.

>A big win for professional users would be CMYK support,

Who the hell needs CMYK support these days? Print is dead (well, it's getting
there anyway).

>not the inability to save non-XCF files.

You can easily export your project to any image format that exists.

>Chasing non-existent professional users at the expense of your actual user
base is pretty darn lousy UX.

The tool should be optimized for people who spend the most time using the
tool. GIMP is for long edits, where saving (actually saving) your intermediate
progress is a part of the workflow, whether you're a hobbyist or pro. For
quick edits there is Pinta and whatever else.

~~~
inferiorhuman
> What makes you think the source file can only be XCF? You can easily open
> any image format with GIMP.

Nothing. But if I'm opening a not-XCF I probably don't want to SAVE it as an
XCF.

> Wrong and wrong. XCF is necessary to save your work in GIMP, so most GIMP
> users use it.

So basically the only reason anyone uses XCF is that GIMP forces them to?
That's some sound design logic there. Used to be you could save your work in
not-XCF.

> Who the hell needs CMYK support these days? Print is dead (well, it's
> getting there anyway).

The professional users GIMP is chasing after.

> You can easily export your project to any image format that exists.

Would be easier to save.

> The tool should be optimized for people who spend the most time using the
> tool.

So chasing after the non-existent professional GIMP users would be a bad
optimization then? Ok.

> GIMP is for long edits, where saving (actually saving) your intermediate
> progress is a part of the workflow, whether you're a hobbyist or pro.

You know, I'd love to find an open source alternative to Lightroom and
Photoshop. Turns out things like darktable and GIMP are far more archaic.
That's the reason I still buy and use the Adobe tools. Dicking about with the
one workflow where GIMP actually works is a poor move and doesn't do anything
beneficial for more involved workflows.

------
tambourine_man
In Photoshop, that's Filter -> Other -> Custom

------
f00_
I thought it was interesting the same technique (convolution) was used in two
different applications: photoshop/gimp for image filters and in convolutional
neural networks (CNNs). What is the purpose of convolution in CNNs?

the setosa interactive says convolution is "a technique for determining the
most important portions of an image"

the hubel and wiesel experiment showed "there is a topographical map in the
visual cortex that represents the visual field, where nearby cells process
information from nearby visual fields. Moreover, their work determined that
neurons in the visual cortex are arranged in a precise architecture. Cells
with similar functions are organized into columns, tiny computational machines
that relay information to a higher region of the brain, where a visual image
is formed"

I'm interested in David Marr's work, about to order Vision: A Computational
Investigation into the Human Representation and Processing of Visual
Information, his level of analysis approach is interesting. I think he did
work on edge detection, which must of been involved some kind of
convolution/filter

This setosa interactive is great, and actually references the gimp
documentation

[http://setosa.io/ev/image-kernels/](http://setosa.io/ev/image-kernels/)

 _An image kernel is a small matrix used to apply effects like the ones you
might find in Photoshop or Gimp, such as blurring, sharpening, outlining or
embossing. They 're also used in machine learning for 'feature extraction', a
technique for determining the most important portions of an image. In this
context the process is referred to more generally as "convolution"_

[https://knowingneurons.com/2014/10/29/hubel-and-wiesel-
the-n...](https://knowingneurons.com/2014/10/29/hubel-and-wiesel-the-neural-
basis-of-visual-perception/)

 _The classic experiments by Hubel and Wiesel are fundamental to our
understanding of how neurons along the visual pathway extract increasingly
complex information from the pattern of light cast on the retina to construct
an image. For one, they showed that there is a topographical map in the visual
cortex that represents the visual field, where nearby cells process
information from nearby visual fields. Moreover, their work determined that
neurons in the visual cortex are arranged in a precise architecture. Cells
with similar functions are organized into columns, tiny computational machines
that relay information to a higher region of the brain, where a visual image
is formed._

[https://ps.is.tuebingen.mpg.de/research_fields/inverse-
graph...](https://ps.is.tuebingen.mpg.de/research_fields/inverse-graphics)

 _Computer vision as analysis by synthesis has a long tradition and remains
central to a wide class of generative methods. In this top-down approach,
vision is formulated as the search for parameters of a model that is rendered
to produce an image (or features of an image), which is then compared with
image pixels (or features). The model can take many forms of varying realism
but, when the model and rendering process are designed to produce realistic
images, this process is often called inverse graphics. In a sense, the
approach tries to reverse-engineer the physical process that produced an image
of the world._

~~~
stevetk
> I thought it was interesting the same technique (convolution) was used in
> two different applications: photoshop/gimp for image filters and in
> convolutional neural networks (CNNs). What is the purpose of convolution in
> CNNs?

IMO, convolution in CNNs would be better denoted as correlation. In CNNs the
feature maps are convolved (correlated), eg swept over the image and multiple
at each spot with the results being accumulated, in order to find where in the
image these filters fit. The output produces a map of how strongly each filter
(there are usually multiple in a convolution layer, that is the third
dimension of the layer), fits with the image. These become the weights passed
onto the next layers.

As an electrical engineer by training, who used convolution all the time, I
didn't understand how the two were related, especially because of the third
dimension.

I read this [https://adeshpande3.github.io/A-Beginner%27s-Guide-To-
Unders...](https://adeshpande3.github.io/A-Beginner%27s-Guide-To-
Understanding-Convolutional-Neural-Networks/) and that lead me to how I
understand it today.

------
jasonmp85
The writing quality in these docs is absolutely terrible.

~~~
jake_the_third
English is likely not the first language of the volunteer who wrote these
docs. If you want better documentation, I encourage you to write it yourself.

The project needs all the help it can get.

~~~
jasonmp85
I assumed as much am not belittling the skills of a non-native speaker. I was
just sort of surprised that a project that has been around so long hasn’t had
better attention paid to its docs (my biggest FOSS experience is with
PostgreSQL which is a bit of an outlier in just how good its docs are).

------
Const-me
I wonder who and why would use the feature?

These matrices are slow. Many kernels used in practice are separable
[https://blogs.mathworks.com/steve/2006/11/28/separable-
convo...](https://blogs.mathworks.com/steve/2006/11/28/separable-convolution-
part-2/) and therefore much faster to compute than a generic convolution
matrix.

Also the UI is hard to use. 25 edit boxes is a lot.

