
Image Processing 101 - abecedarius
https://codewords.recurse.com/issues/six/image-processing-101
======
yan
Slightly related, but I had a small epiphany when taking a class on DSP on
Coursera. The kernel that is used to blur an image and that which is used to
remove the treble/high frequencies from an audio sample are identical, except
one is in 2 dimensions and the other is in 1. And this makes perfect sense! A
low pass filter removes high frequencies, and sharp edges are high frequencies
in the 2D plane.

TFA only mentions Gaussian blur, but a Gaussian blur is just a weighted moving
average, with "closer" pixels weighted more heavily and a smooth falloff with
distance. When you replace each value with an average of its neighborhood, you
"soften" the transitions.
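
To make the 1D/2D connection concrete, here is a rough sketch in plain Python. The function names (gaussian_kernel_1d, smooth_1d, gaussian_kernel_2d) are made up for illustration, and clamping at the borders is just one possible edge policy. The same weights smooth a 1D signal and, taken as an outer product, give a 2D blur kernel.

```python
import math

def gaussian_kernel_1d(radius, sigma):
    """Normalized Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_1d(signal, kernel):
    """Weighted moving average; indices past the ends are clamped."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def gaussian_kernel_2d(radius, sigma):
    """2D blur kernel: outer product of the 1D kernel with itself."""
    k = gaussian_kernel_1d(radius, sigma)
    return [[a * b for b in k] for a in k]

# A sharp step: an edge in one image row, or an abrupt jump in audio.
step = [0.0] * 5 + [1.0] * 5
kernel = gaussian_kernel_1d(2, 1.0)
blurred = smooth_1d(step, kernel)   # the jump becomes a gradual ramp
kernel_2d = gaussian_kernel_2d(2, 1.0)
```

Because the 2D kernel is an outer product, a blur can also be done as one horizontal 1D pass followed by one vertical 1D pass.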

~~~
elcapitan
May I ask which Coursera course that was?

~~~
aerioux
i would presume
[https://www.coursera.org/course/dsp](https://www.coursera.org/course/dsp)
which was pretty good when I did it -
[https://www.coursera.org/course/images](https://www.coursera.org/course/images)
also covered some similar things

~~~
elcapitan
That looks like it, great, thanks.

The reason why I'm asking instead of searching on Coursera is that Coursera
has become increasingly hard to search, at least in my view, so it's easier to
ask directly.

------
leni536
I know that this is an introduction, but I wish there was a warning about the
use of improper color spaces for different tasks (like using sRGB or any
nonlinear colorspace for downscaling images, blurring or Phong shading). A
warning about the existence of different colorspaces and their different use
cases would be enough in an introductory write-up. It's still an issue in
most of today's software [1].

Like open this image in your browser or in your favorite image viewer and
scale it down to 50%:
[http://www.4p8.com/eric.brasseur/gamma-1.0-or-2.2.png](http://www.4p8.com/eric.brasseur/gamma-1.0-or-2.2.png)

[1]
[http://www.4p8.com/eric.brasseur/gamma.html](http://www.4p8.com/eric.brasseur/gamma.html)
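
As a small illustration of the problem, the sketch below averages a black and a white pixel two ways: directly on the stored sRGB values, and in linear light using the standard sRGB transfer function. The helper names are hypothetical and this handles a single channel only; real downscaling code would apply the same idea per channel over whole neighborhoods.

```python
def srgb_to_linear(value):
    """Decode an 8-bit sRGB channel value (0..255) to linear light (0.0..1.0)."""
    c = value / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(linear):
    """Encode linear light (0.0..1.0) back to an 8-bit sRGB channel value."""
    if linear <= 0.0031308:
        c = 12.92 * linear
    else:
        c = 1.055 * linear ** (1.0 / 2.4) - 0.055
    return int(round(c * 255.0))

def average_naive(values):
    """Average the stored sRGB numbers directly (the common mistake)."""
    return int(round(sum(values) / len(values)))

def average_linear(values):
    """Decode to linear light, average there, then re-encode."""
    mean = sum(srgb_to_linear(v) for v in values) / len(values)
    return linear_to_srgb(mean)

pair = [0, 255]                  # one black and one white pixel
naive = average_naive(pair)      # mid-gray by the numbers, too dark on screen
correct = average_linear(pair)   # noticeably brighter gray
```

The gap between the two results is why a fine black-and-white pattern gets darker when it is scaled down without gamma handling.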

~~~
vanderZwan
So I just spent half an hour googling for libraries that could help with this.
Chroma.js seems to be a pretty nice option for dealing with this issue in a
web context:

[http://gka.github.io/chroma.js/](http://gka.github.io/chroma.js/)

The person behind it has some nice blog posts on color generation too:

[https://vis4.net/color/](https://vis4.net/color/)

The only issue that I see is that the library is focused on translating data
to colours. The problems with blurring and downscaling that you mentioned go
the other way: the colours are the data.

------
haffla
So OK you can use cv2.COLOR_RGB2GRAY to get a grayscale image. But what does
that teach you? In my image processing 101 course a couple of years ago we
actually didn't use any libraries except for reading images and showing them
(written by the teacher). Just Java. A picture would be a 2 dimensional array.
The pixel is represented by an integer. So you just need a nested for loop and
you can manipulate every pixel yourself, thus learning what really happens
under the hood, how filters work.
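
A rough Python equivalent of that exercise, assuming each pixel is a packed 0xRRGGBB integer in a 2D list (the layout and the function name to_gray are assumptions for illustration). The weights are the usual luma coefficients 0.299, 0.587 and 0.114, which is also what OpenCV documents for its RGB to gray conversion.

```python
def to_gray(image):
    """image: rows of packed 0xRRGGBB ints; returns rows of 0..255 gray values."""
    out = []
    for row in image:                      # outer loop: one row at a time
        gray_row = []
        for pixel in row:                  # inner loop: one pixel at a time
            r = (pixel >> 16) & 0xFF       # unpack the three channels
            g = (pixel >> 8) & 0xFF
            b = pixel & 0xFF
            luma = 0.299 * r + 0.587 * g + 0.114 * b
            gray_row.append(int(round(luma)))
        out.append(gray_row)
    return out

image = [[0xFF0000, 0x00FF00],
         [0x0000FF, 0xFFFFFF]]
gray = to_gray(image)
```

Pure green comes out much brighter than pure blue, which is the point of weighting the channels instead of averaging them.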

~~~
piratefsh
i totally agree that the fun part lies in understanding how things work under
the hood. i never understood how edge detection works until i worked on an
implementation with JavaScript and canvas, and that was a fantastic learning
experience. the motivation behind this article was to provide a general idea
of what image processing could do, and hopefully encourage someone to delve
deeper into the mechanics of image processing :)

------
joshvm
A good follow on from this is the Learning OpenCV book (O'Reilly), written by
a couple of the lead developers. It goes into detail on the mathematics, but
it's not heavy or verbose at all. I found it far more useful than a lot of
introductory image processing books simply for its theoretical content.

Don't forget scikit-image and scipy.ndimage too.

------
dosshell
Strictly speaking image processing is image in -> image out. And image
analysis is image in -> data out. The author gives the impression that
everything is image processing. Not a big thing but it helps to know the
difference if you want to take the correct course :)

~~~
pedrosorio
The first paragraph in
[https://en.wikipedia.org/wiki/Image_processing](https://en.wikipedia.org/wiki/Image_processing)

"In imaging science, image processing is processing of images using
mathematical operations by using any form of signal processing for which the
input is an image, a series of images, or a video, such as a photograph or
video frame; the output of image processing may be either an image or a set of
characteristics or parameters related to the image.[1]"

[1] Rafael C. Gonzalez; Richard E. Woods (2008). Digital Image Processing.
Prentice Hall. pp. 1–3. ISBN 978-0-13-168728-8.

~~~
dr_zoidberg
I was going to answer with the Gonzalez & Woods definition. Extremely clear
book by the way.

~~~
pedrosorio
Yes, I read this one too. Never applied the concepts in practice though.

------
utkarshsinha
Have you looked at [http://aishack.in/](http://aishack.in/) - it has a bunch
of opencv tutorials and projects.

~~~
piratefsh
ahh this is great! opencv tutorials with detailed explanations on the
algorithms are hard to come by. i've found tutorials really useful to learn
some of the standard processes in image processing (e.g. grayscale and
blurring to remove noise).

------
EvanPlaice
Instead of using built-in method calls that come with a library, why not look
into the algorithms used to generate the different transforms?

I once worked on a UI where the users wanted to capture a screenshot of the
current page.

Because color toner is more expensive they also wanted the option to print
grey scale. I'm pretty terrible at working in 2D space but a quick Google
search let me know that the conversion to grey scale involved averaging the
RGB values for each pixel.

Unfortunately, the coloring of the UI was more dark than light so the
resulting greyscale image was still black toner intensive. So we provided an
additional option to invert black and white.

To make it work a second transform was applied to each pixel that reversed the
pixel value from upper to lower bound (or vice versa depending on how you look
at it).
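
The two transforms can be sketched in a few lines of Python. This is a reconstruction under assumptions, not the original code: pixels are taken to be (r, g, b) tuples of 8-bit values, and the function names are invented for the example.

```python
def gray_average(pixel):
    """Grey value as the plain average of the three channels."""
    r, g, b = pixel
    return (r + g + b) // 3

def invert(value, max_value=255):
    """Flip a value between the lower and upper bound."""
    return max_value - value

def to_print_friendly(image):
    """Greyscale, then invert, so a dark UI prints mostly white."""
    return [[invert(gray_average(p)) for p in row] for row in image]

# A dark background pixel next to a light text pixel.
screenshot = [[(20, 20, 30), (240, 240, 240)]]
printable = to_print_friendly(screenshot)
```

The dark background ends up near white and the light text near black, which is what saves the toner.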

The result was an output that trended toward white instead of black. The
output looked surprisingly good and saved on toner so the users could print
many screen captures without worrying about wasting resources.

For the business, it resulted in a cost and resource savings. For users,
picking the resulting output provided better results that were easier to
understand. From a development perspective, the implementation wasn't
difficult at all to add. So, win-win-win.

What surprised me was how easy these transforms were to apply. It's a bit CPU
intensive on high resolution images but it's not terribly difficult to come up
with good results.

It would be awesome to see some more examples of algorithms used for image
processing. So much material covers generic algorithms and data structures
that come with the typical CS degree.

It would be much more interesting to see algorithms that can be used in
practice. For example, how to scale images, implement blur, color correction,
calculate HSL, etc...

Libraries are great but these concepts are simple enough that they don't
require 'high science'.

The article mentions a curiosity related to how edge detection works. I'd
assume that you select a color and exclude anything that falls outside a pre-
determined or calculated threshold. For instance, take a color and do
frequency analysis of colors above-below that value by a certain amount. Make
multiple passes testing upper and lower bounds.

A full color image @ 24 bit (8R 8G 8B) will take a max of 24 passes and will
likely have logarithmic runtime cost if implemented using a divide-conquer
algorithm.

Things like blur and lossy compression sound a hell of a lot more interesting
because they have to factor in adjacency.

~~~
GFK_of_xmaspast
> I'd assume that you select a color and

This is not at all how edge detection works. See
[https://en.wikipedia.org/wiki/Sobel_operator](https://en.wikipedia.org/wiki/Sobel_operator)
for a key building block.
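
A bare-bones sketch of the Sobel step in plain Python, assuming a grayscale image stored as a list of rows. The function name and the choice to leave border pixels at zero are illustrative; the two 3x3 kernels are the standard ones. The kernels are applied without flipping, which changes the sign of the gradients but not the magnitude.

```python
import math

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [ 0,  0,  0],
           [ 1,  2,  1]]

def sobel_magnitude(gray):
    """Gradient magnitude per pixel; the one-pixel border is left at 0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    v = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * v   # horizontal change
                    gy += SOBEL_Y[dy][dx] * v   # vertical change
            out[y][x] = math.hypot(gx, gy)
    return out

# Left half black, right half white: one vertical edge in the middle.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
```

Only the columns next to the black/white boundary get a large value; flat regions stay at zero. Thresholding the magnitude is one common next step toward an edge map.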

~~~
EvanPlaice
Thanks!

I wasn't aware of this approach. Looks like a reasonable single-pass solution.

------
jlubawy
I'm actually taking an image processing course right now, and at least one
thing I didn't see in this article that I have found very useful is histogram
equalization (OpenCV equalizeHist). It basically takes images with low
contrast and increases the contrast. This is really useful for many
applications but one I've actually been able to use is increasing the
legibility of scanned pencil on paper images.

~~~
piratefsh
yes! i've found it really useful when dealing with images of items in
different lighting conditions. i've struggled with understanding histogram
equalization, would you happen to have good resources that explain how it is
done?

~~~
abecedarius
Does this help any? Say we're looking at a grayscale image, where each pixel
has a value between 0.0 and (just less than) 1.0. Sort all the pixel values,
then scan them from lowest value to highest. The pixels with the lowest value
in the image get reassigned the value 0; and in general, a pixel that's
brighter than k of the other pixels, out of n total, gets changed to intensity
k/n. This spreads the intensities as evenly as we can. (Which might not be so
even, for example if all of the input pixels were the same intensity to start
-- then this just changes them to 0!)

Maybe the image is, say, 8-bit grayscale, so the values can range from 0 to
255 instead -- then it'd be floor(k/n times 256).

This is usually described in terms of a cumulative distribution function; I
tried to say the same with less jargon.
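
In code, the 8-bit version of that description might look like the sketch below. The name equalize is made up, the image is assumed to be a list of rows of integers in 0..255, and library implementations such as OpenCV's equalizeHist may normalize slightly differently.

```python
def equalize(gray):
    """Map each value to floor(256 * k / n), where k counts strictly darker pixels."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    n = sum(hist)

    mapping = [0] * 256
    darker = 0                       # running count of pixels darker than v
    for v in range(256):
        mapping[v] = (256 * darker) // n
        darker += hist[v]
    return [[mapping[v] for v in row] for row in gray]

low_contrast = [[100, 101],
                [102, 103]]
result = equalize(low_contrast)      # values get spread across the range
uniform = equalize([[7, 7], [7, 7]]) # all pixels equal, so all become 0
```

The running count is the cumulative histogram, so this is the same thing as looking values up in the (unnormalized) cumulative distribution.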

------
yompers888
As a sort of abstract question, do readers here think of <Class X> 101 as
meaning fundamentals of X, or basic techniques in X? Having taken image
processing from both sides, I'd say that learning the principles was much more
useful (and would have been more useful still if I'd had a proper background
in linear algebra). This article is the equivalent of naming some tools and
showing us where they fit.

~~~
anjc
I agree. I'm terrible at maths so struggled with aspects of Computer Vision in
my degree, but I don't see how you could use OpenCV without understanding the
principles to at least a basic degree. It seems like you're purposely creating
a black box within which magic is happening. Which would be nice and fine and
abstract in many circumstances, except OpenCV keeps you greatly abstracted
from the concepts, and any non-trivial Vision application needs you to get
close to the theoretical-metal anyway, in my experience.

If anyone's interested in the theory, I'd recommend Sonka

~~~
piratefsh
totally. while OpenCV is really useful and magical (i did learn a lot of
theory from the OpenCV tutorials though), it didn't help me wrap my head
around concepts, but implementing the algorithms (gaussian blur, edge
detection, etc) without having fully grasped the math helped a whole lot with
understanding how things work. i still can't explain the math behind edge
detection, for example, but i can describe how it works. when i wrote this, i
had in mind a person who would like to get a big picture idea of what image
processing is, and will hopefully be inspired to learn what happens under the
hood.

------
Gepsens
My CS lab specialized in image processing, can confirm this is indeed the 101.

------
coin
-1 for explicitly disabling pinchzoom on mobile/tablet devices.

~~~
conceit
It's just a simple css value and not intentional, I think. I don't remember
where I read this or what it was exactly, some other comment about another
site.

~~~
coin
The default is to allow zoom. Why add a value to remove end user
functionality?

