
Making thumbnails fast - ka-engineering
http://engineering.khanacademy.org/posts/making-thumbnails-fast.htm
======
onalark
Great post, William. I really appreciated your exposition on both the
challenges you folks were facing and your solution to the problem.

As a few others have pointed out, sending a lambda function through NumPy is
almost always the last thing you want to do. Unfortunately, you were in a
situation where you were either going to have to do something really painful
like using einsum:
[https://stackoverflow.com/questions/29989059/matrix-multipli...](https://stackoverflow.com/questions/29989059/matrix-multiplication-with-numpy-einsum)
or writing your own ufunc:
[https://docs.scipy.org/doc/numpy-dev/user/c-info.ufunc-tutor...](https://docs.scipy.org/doc/numpy-dev/user/c-info.ufunc-tutorial.html)

I suspect that your primary limitation was the Google App Engine
infrastructure. I'm not familiar with the limitations there, but a quick
search on Google turns up a fairly limited set of libraries indeed.

I thought it would be interesting to adapt your code slightly to use Numba
acceleration. Here's what it looks like:

    
    
      from numba import jit
    
      def avg_transform(image):
          # Replace every pixel's channels with their average, in place.
          m, n, c = image.shape
          for i in range(m):
              xi = image[i]  # one row of pixels
              for j in range(n):
                  avg = xi[j].sum()/3
                  xi[j][:] = avg
          return image
    
      fast_avg_transform = jit(avg_transform, nopython=True)
    

I observed processing times of 25ms per image on my laptop at 1280x720 pixels:
[https://gist.github.com/ahmadia/c1f8be119f3cb2d2b8e5](https://gist.github.com/ahmadia/c1f8be119f3cb2d2b8e5)

Re-reading your post, I suspect that einsum might actually be your cup of tea,
but I really enjoy the simplicity and performance of using Numba for these
sorts of tasks.
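
For comparison, here is a rough sketch of what the einsum route could look like for the same per-pixel averaging. The function name `avg_transform_einsum` is made up for illustration, and the sketch assumes a floating-point image (with uint8 input the channel sum could overflow). Unlike the loop above, it returns a new array instead of modifying the image in place.

```python
import numpy as np

def avg_transform_einsum(image):
    """Return a new array whose channels all hold the per-pixel channel mean."""
    channels = image.shape[2]
    # 'ijc->ij' sums over the channel axis c for every pixel (i, j).
    avg = np.einsum('ijc->ij', image) / channels
    # Repeat the average across the channel axis to keep the input shape.
    return np.repeat(avg[..., np.newaxis], channels, axis=2)

# Assumption: a small random float image stands in for a real thumbnail.
image = np.random.rand(6, 8, 3)
result = avg_transform_einsum(image)
```

Every channel of `result` holds the same value as `image.mean(axis=2)`, which gives an easy way to cross-check any faster implementation.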

~~~
dahart
I haven't used Numba -- looks fast and easy!

But am I missing something? Numpy has everything you need already, natively,
no? Some slicing or a dot product should get you there... no need for ufuncs
or einsum, I think...

    
    
      avg = (rgb[...,0]+rgb[...,1]+rgb[...,2]) * (1.0 / 3.0)
    

or better yet,

    
    
      gray = np.dot(rgb, [0.299, 0.587, 0.114])

~~~
aktiur
More generally, for an image _im_ with shape (width, height, channels) and a
square transformation matrix _M_ of shape (channels, channels), you can do:

    
    
      res = np.dot(im, M.T)
    

It will work with affine transformations as well if you add a 1 component to
every pixel. It will also work with higher-dimensional images if I'm not
mistaken.
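
A minimal sketch of both cases, assuming a float image with three channels. The helper names `apply_color_matrix` and `apply_affine`, and the example matrices, are invented for illustration and are not taken from the blog post.

```python
import numpy as np

def apply_color_matrix(im, M):
    """Apply a (channels, channels) matrix M to every pixel of im."""
    return np.dot(im, M.T)

def apply_affine(im, A):
    """Apply a (channels, channels + 1) affine matrix A to every pixel.

    The last column of A is the constant offset; it is picked up by the
    extra 1 component appended to each pixel.
    """
    ones = np.ones(im.shape[:-1] + (1,), dtype=im.dtype)
    homogeneous = np.concatenate([im, ones], axis=-1)
    return np.dot(homogeneous, A.T)

im = np.random.rand(4, 5, 3)

# Linear case: a matrix that averages the three channels.
M = np.full((3, 3), 1.0 / 3.0)
gray = apply_color_matrix(im, M)

# Affine case (hypothetical values): new_pixel = 0.5 * pixel + 0.5
A = np.hstack([0.5 * np.eye(3), np.full((3, 1), 0.5)])
light = apply_affine(im, A)
```

Appending the 1 component costs an extra copy of the image; adding the offset after a plain `np.dot` would avoid that, at the price of a second pass over the result.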

~~~
onalark
Agreed. For some reason when I was looking at this last night I thought I
couldn't use broadcasting. I've added the example to the gist:
[https://gist.github.com/ahmadia/c1f8be119f3cb2d2b8e5](https://gist.github.com/ahmadia/c1f8be119f3cb2d2b8e5)

Would you believe that Numba is 4 times faster for the sort of simple
transformations described in the blog post?

 _(See aktiur's response below, some performance gains come from avoiding a
copy)_

~~~
aktiur
Numba is indeed pretty impressive, but you're not comparing exactly the same
thing with this code.

In the Numba case, you're basically modifying the image in place: it means no
allocating a new array, no full copying. However, your pure-numpy code
basically creates a new array (the result of np.dot) before copying it back
entirely in image.

If you write the two functions so that they both return a new numpy array and
do not touch the original one, the time difference drops from 4 times faster
to 2.5 times faster. That's still an impressive difference, but at the loss of
a bit of flexibility.

[https://gist.github.com/aktiur/e1cddee8f699ded49824](https://gist.github.com/aktiur/e1cddee8f699ded49824)

N.B.: numpy.dot does not use broadcasting, i.e. it does not allocate a
temporary array to extend the smaller one. The function handles n-dimensional
arrays by summing over the last axis of the first array and the second-to-last
axis of the second array.
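
To make both points concrete, here is a small sketch (array sizes and variable names are arbitrary). It spells out the contraction that `np.dot` performs and shows that the result is a freshly allocated array, so an "in-place" NumPy version still pays for a full copy back into the image.

```python
import numpy as np

im = np.random.rand(4, 5, 3)
M = np.random.rand(3, 3)

# np.dot contracts the last axis of im with the second-to-last axis of M.T.
res = np.dot(im, M.T)

# The same contraction written out explicitly with tensordot:
# axis 2 of im (channels) against axis 0 of M.T.
explicit = np.tensordot(im, M.T, axes=([2], [0]))

# The result lives in new memory; the input image is untouched so far.
is_new_array = not np.shares_memory(res, im)

# "In-place" NumPy version: the new array is copied back over the image.
original = im.copy()
im[...] = res
```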

~~~
onalark
Thanks, I clearly wasn't being careful. I'll update my Gist...

edit: _On reviewing, I think the intent of the original blog post was to
modify images in place (or at least to do it as quickly as possible, with
in-place filtering ok). In that case, I think my comparison is fair, since NumPy
doesn't offer a faster way to do the requested operation. I didn't try out
einsum, but I think Numba would outperform that as well._

------
emilong
"Performing these kinds of image manipulations efficiently is essentially a
solved problem. There are many freely available software toolkits that would
be more than sufficient for our needs. However, our server infrastructure
doesn't allow us to use those libraries, so we ended up implementing these
operations ourselves."

Can you explain what prevented using the existing libraries in your server
infrastructure?

~~~
epidemian
What I can gather from the post is that it's because they are using Google App
Engine, and GAE does not allow the use of arbitrary native libraries. Some of
the most popular Python libraries that use native code are supported[1] (in
the post it's mentioned that they are already using numpy and PIL), but if you
need something else besides that, you're out of luck.

[1]:
[https://cloud.google.com/appengine/docs/python/tools/librari...](https://cloud.google.com/appengine/docs/python/tools/libraries27?hl=en)

~~~
stephenr
I'll never cease to be amazed at the hoops people will jump through so they
can feel cool and say "we use the cloud".

~~~
nulltype
The App Engine sandbox is not a good feature of App Engine and is removed in
the new Managed VM version (naturally in perpetual beta).

I've seen people spend a lot of time trying to avoid "using the cloud" and end
up wasting a lot of money and time when they could have just used some AWS
services and been done with it.

~~~
stephenr
> they could have just used some AWS services and been done with it.

Like the author did with GAE? I think you missed my point completely.

------
nulltype
This is a pretty informative post about optimizing custom Python code to do
some image manipulations, but if you just want to make images like in the
post, a service like Imgix could do a pretty good job and would work with App
Engine, no matter which language you choose.

I spent 5 minutes with the imgix sandbox and made this, which is pretty close
to the samples (I didn't have the correct font or a white colored logo):

[https://sandbox.imgix.com/view?url=https%3A%2F%2Fassets.imgi...](https://sandbox.imgix.com/view?url=https%3A%2F%2Fassets.imgix.net%2Funsplash%2Fpaperlamp.jpg%3Fmarkalign%3Dbottom%252Cleft%26mark%3Dhttp%253A%252F%252Fp6.zdassets.com%252Fhc%252Fsettings_assets%252F526991%252F200061184%252F3t62mqN9s2QjpyHy0CTEcg-smaller_logo.png%26txtalign%3Dcenter%252Cmiddle%26txtfont%3DAvenir-Black%26txtsize%3D125%26txtclr%3Dfff%26txt%3DCALLIGRAPHY%26bm%3Dmultiply%26blend%3D94424f%26bri%3D50%26sat%3D-50%26fit%3Dcrop%26h%3D720%26w%3D1280)

------
dahart
Using PIL's im.convert to do lightening and desaturation is a great trick!
I've used PIL and love it for ease of use, but had to look elsewhere for
performance. It strikes me as funny how the function name and docs saying that
convert() is for color space conversion prevented me from considering other
possibilities.

The numpy example is amusing though; sending a Python lambda into a numpy op,
especially a lambda that does something numpy is designed to do natively... no
wonder it went slow. But if they'd used native numpy ops, it would be
comparable to this PIL solution, with or without pre-multiplying the matrix
ops. I haven't tried it, but it's possible it would even be faster. And you
have a lot more options for non-affine operations with numpy, whereas the 4x3
is PIL's limit. Making numpy fast requires organizing your programs
differently, but it is _well_ worth learning.
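
As a rough illustration of the pre-multiplying idea with native numpy ops: two affine colour transforms can be folded into one matrix and applied in a single pass. The matrices below are made-up example values, not the ones from the blog post, and the helper names are hypothetical; the sketch assumes float pixel values.

```python
import numpy as np

def to_homogeneous(A):
    """Extend a 3x4 affine colour matrix to 4x4 so matrices can be multiplied."""
    return np.vstack([A, [0.0, 0.0, 0.0, 1.0]])

def apply_affine(im, A):
    """Apply a 3x4 affine matrix: linear part via np.dot, then add the offset."""
    return np.dot(im, A[:, :3].T) + A[:, 3]

# Hypothetical transforms: average the channels, then lighten by half.
desaturate = np.hstack([np.full((3, 3), 1.0 / 3.0), np.zeros((3, 1))])
lighten = np.hstack([0.5 * np.eye(3), np.full((3, 1), 0.5)])

# Pre-multiply once: combined(pixel) == lighten(desaturate(pixel)).
combined = (to_homogeneous(lighten) @ to_homogeneous(desaturate))[:3]

im = np.random.rand(4, 5, 3)
two_pass = apply_affine(apply_affine(im, desaturate), lighten)
one_pass = apply_affine(im, combined)
```

The combined matrix only has to be computed once per style, so the per-image cost is a single matrix product regardless of how many affine steps were chained.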

------
stuaxo
Hi,

Cairo's compositing operations provide a good way of doing this and are
available from Python via CairoCFFI.

They provide a multiply operator
[http://cairographics.org/operators/](http://cairographics.org/operators/) and
others.

For an advertising site we needed arbitrary tinting so we lightened the images
with Cairo, then added a color mask over the top in the web page.

S

------
devit
For those not restricted to Python, the ImageMagick convert tool is probably
the quickest way to get something like this working.

