
Nginx image processing server with OpenResty and Lua - fcambus
http://leafo.net/posts/creating_an_image_server.html
======
danielrhodes
I have done something similar at a small scale. Under light load, it is great.
As you grow, real problems start to arise.

The bottleneck with this implementation is ImageMagick. ImageMagick leaks a
lot of memory and is generally very inefficient with resizing operations.
GraphicsMagick is not much better. Under high load, this will crush your CPUs
and max out your Nginx threads long before the approach pays off. You will
almost certainly need to use something like OpenCV on the GPU for this to scale.

Although caching is referenced briefly in this article, it is crucial for this
system to work. A good CDN with fast invalidation and a low eviction rate
would be ideal.

~~~
hierro
I have to disagree with that. We've been using GraphicsMagick via cgo on
memecrunch.com (Alexa 30K, 8K in US) for 9 months and we're serving around
500K images per day on a single server without any issues. In fact, I've just
logged in and the app is currently using 0.3% of the memory (out of 64GB)
after running for a couple of weeks, since the last update I pushed to
production. I don't believe GraphicsMagick leaks memory in any significant
way.

Of course, you need to cache the thumbnails for a reasonable time once they're
generated, otherwise the CPU usage will skyrocket, but that's pretty easy to
achieve with nginx's proxy_cache directive.

------
staunch
I've probably done this ten times in ten different ways over the years. Most
recently with ~150 lines of Go via nginx proxy_pass (or fastcgi_pass). I
needed to be able to control how to resize images while maintaining aspect
ratio on a fixed size canvas. The first request for an image is
disappointingly slow for larger images, but it's not terrible. Writing out the
generated images to a cache directory and using try_files, so subsequent
requests are static, is definitely key.
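
A minimal sketch of the "keep the aspect ratio on a fixed size canvas" arithmetic, in Go. This is not staunch's code; the function names `fitWithin` and `centerOffset` are made up for illustration, and the sketch only computes dimensions (decoding, scaling, and writing to the cache directory are left out).

```go
package main

import "fmt"

// fitWithin returns the largest width and height that fit inside a
// canvasW x canvasH canvas while keeping the srcW:srcH aspect ratio.
// It uses integer arithmetic, rounds down, and never returns less than 1
// pixel for valid input. (Hypothetical helper, for illustration only.)
func fitWithin(srcW, srcH, canvasW, canvasH int) (w, h int) {
	if srcW <= 0 || srcH <= 0 || canvasW <= 0 || canvasH <= 0 {
		return 0, 0
	}
	// Compare srcW/srcH against canvasW/canvasH without floating point.
	if srcW*canvasH >= srcH*canvasW {
		// Source is relatively wider than the canvas: width is the limit.
		w = canvasW
		h = srcH * canvasW / srcW
	} else {
		// Source is relatively taller than the canvas: height is the limit.
		h = canvasH
		w = srcW * canvasH / srcH
	}
	if w < 1 {
		w = 1
	}
	if h < 1 {
		h = 1
	}
	return w, h
}

// centerOffset returns the top-left corner at which a w x h image has to be
// drawn so that it sits centered on the canvas.
func centerOffset(w, h, canvasW, canvasH int) (x, y int) {
	return (canvasW - w) / 2, (canvasH - h) / 2
}

func main() {
	w, h := fitWithin(4000, 3000, 200, 200)
	x, y := centerOffset(w, h, 200, 200)
	fmt.Printf("scaled to %dx%d, drawn at (%d,%d)\n", w, h, x, y)
}
```

For a 4000x3000 source on a 200x200 canvas this prints `scaled to 200x150, drawn at (0,25)`, i.e. the image is letterboxed with 25 pixels of padding above and below.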

~~~
hierro
Go's image module is painfully slow, supports a limited set of formats and
fails on lots of optimized images.

I have also implemented this in Go, but using cgo and GraphicsMagick, which
is way faster and decodes almost any image you throw at it (there are some
issues with very optimized GIFs, but I fall back to gifsoup for "deoptimizing"
them in those cases). In fact, I even added a function to the module itself
for cropping and resizing an image to a given size (since I think it is a
very common need) while keeping the aspect ratio, with the option to either
center the result or grab the part of the image with the highest entropy
(e.g. suppose you have an image with a person on one side and then a lot of
blue sky; you probably don't want to crop the image and end up with just sky
in the thumbnail). This is just one of a few benchmarks I wrote:

BenchmarkResizePngMagick 20 80665091 ns/op 689 B/op 3 allocs/op

BenchmarkResizePngNative 1 9689016519 ns/op 351200 B/op 27 allocs/op

(yup, that's 120x faster)

The bindings are mostly documented, but I haven't gotten around to releasing
the code yet, although I do hope to publish it soon. If you're interested,
send me an email and I'll let you know when I put the code on GitHub.
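
Since the bindings mentioned above are unreleased, here is only a rough guess at what "grab the part of the image with the highest entropy" could look like, using Go's standard `image` package rather than GraphicsMagick. Everything in it is an assumption made for illustration: the names `entropy`, `bestCrop`, and `demoImage`, the use of an 8-bit luminance histogram, and the brute-force sliding window.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"math"
)

// entropy returns the Shannon entropy (in bits) of the 8-bit luminance
// histogram of img, restricted to the part of r that lies inside the image.
func entropy(img image.Image, r image.Rectangle) float64 {
	r = r.Intersect(img.Bounds())
	var hist [256]int
	total := 0
	for y := r.Min.Y; y < r.Max.Y; y++ {
		for x := r.Min.X; x < r.Max.X; x++ {
			g := color.GrayModel.Convert(img.At(x, y)).(color.Gray)
			hist[g.Y]++
			total++
		}
	}
	if total == 0 {
		return 0
	}
	e := 0.0
	for _, n := range hist {
		if n == 0 {
			continue
		}
		p := float64(n) / float64(total)
		e -= p * math.Log2(p)
	}
	return e
}

// bestCrop slides a cropW x cropH window over img in increments of step
// pixels and returns the window with the highest entropy. If no window fits,
// it returns a cropW x cropH rectangle at the top-left corner.
func bestCrop(img image.Image, cropW, cropH, step int) image.Rectangle {
	if step < 1 {
		step = 1
	}
	b := img.Bounds()
	best := image.Rect(b.Min.X, b.Min.Y, b.Min.X+cropW, b.Min.Y+cropH)
	bestE := -1.0
	for y := b.Min.Y; y+cropH <= b.Max.Y; y += step {
		for x := b.Min.X; x+cropW <= b.Max.X; x += step {
			r := image.Rect(x, y, x+cropW, y+cropH)
			if e := entropy(img, r); e > bestE {
				best, bestE = r, e
			}
		}
	}
	return best
}

// demoImage builds a 64x32 test image: a flat "sky" on the left half and a
// varied pattern (standing in for the subject) on the right half.
func demoImage() *image.Gray {
	img := image.NewGray(image.Rect(0, 0, 64, 32))
	for y := 0; y < 32; y++ {
		for x := 0; x < 64; x++ {
			v := uint8(200) // flat sky
			if x >= 32 {
				v = uint8((x*7 + y*13) % 256) // varied detail
			}
			img.SetGray(x, y, color.Gray{Y: v})
		}
	}
	return img
}

func main() {
	img := demoImage()
	fmt.Println("best 32x32 crop:", bestCrop(img, 32, 32, 8))
}
```

A flat region has an entropy of zero, so a window covering only sky always loses to one that contains detail. A production version would likely downscale first and search more cleverly; this sketch scans every window from scratch.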

~~~
staunch
Sent you an email. Thanks. I actually kind of like using the native Go
libraries, because they're so simple I feel like I can actually trust them.
But, you're right, they're really slow and strict in what they parse. I'll
probably need to do something different in the future, so your code would
certainly be interesting to check out.

------
ck2
Did it with the perl module in Nginx with imagemagick.

Much more powerful than their built-in image handling, which uses the ancient
and no longer updated GD library.

~~~
eksith
Would you be willing to let us take a peek at that code? It could be
interesting to compare and contrast to what Lua offers.

~~~
ck2
Ha, my code is ugly as Perl is not my primary coding language, would be
embarrassing.

I could never get sendfile to work under Perl through nginx either for some
reason, and they never answered my question on the nginx forums, so I had to
just inefficiently dump the image directly to nginx in a buffered loop. The
image is cached just like in this Lua code, so the next read goes directly
through nginx and I wasn't too worried about the sendfile problem.

Code is just 50 lines though, if I can figure it out, most any coder should be
able to. Just compile nginx with
[http://wiki.nginx.org/HttpPerlModule](http://wiki.nginx.org/HttpPerlModule)
and then find any perl example code for imagemagick.

~~~
eksith
Haha! Ok, fair enough :)

Thanks for the details, though. That actually gave me a few ideas.

------
tt
I highly recommend [http://www.imgix.com/](http://www.imgix.com/)

------
juri
I can recommend thumbor:
[https://github.com/globocom/thumbor](https://github.com/globocom/thumbor)

------
lamnk
Since disk space is not much of a problem these days, why don't we just
preprocess the images? I know this only applies if you know in advance which
sizes you want, but do you really need like 20 different sizes of an image?

~~~
sstrudeau
Because when your design requirements change, you need to reprocess your
entire corpus, among other reasons.

------
zzzcpan
Since the author talks about security a bit, I'll add: be careful. If there
is a bug somewhere in ImageMagick or it runs out of memory, it could easily
take down an entire worker process and abort every connection there.

~~~
dddd_david
You have to be especially careful with Imagemagick because it calls through to
format specific libraries (libjpeg, libpng, etc) and the implementation of
THOSE libraries can have a huge impact on your application.

For instance, say you are generating thumbnails for JPEG images. Most
operating systems ship with the IJG libjpeg, but a few have switched or are
considering switching to libjpeg-turbo, a forked, binary-compatible library
that has several performance enhancements. One thing libjpeg-turbo doesn't do
though, is implement the DCT scaling functionality of libjpeg, which is a way
of efficiently downscaling jpeg images without fully decoding the image (and
of course has an impact on image quality as well). The most important benefit
of using DCT scaling for generating thumbnails is that it has much lower
memory overhead. Since you don't need to decompress the entire image first, it
can be done block by block, which means full-image sized buffers don't need to
be allocated (which is what Imagemagick will try to do by default). Generating
a small thumbnail of a large (10,000 x 20,000 pixel) image will allocate large
amounts of memory, whereas using the DCT scaling option will allocate only
small working buffers and complete much faster. If you're running an image
processing server, these considerations are vital.

Long story short, if your thinking about your code stops at the level of the
ImageMagick API (or whatever graphics library you choose), you can end up with
more problems than you might realize.
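
To put rough numbers on the memory point, here is a back-of-the-envelope sketch in Go. It assumes 8-bit RGB (3 bytes per pixel) and the power-of-two scale factors 1/2, 1/4 and 1/8; the helper name `decodedBytes` is made up. It compares the size of a full decoded frame with the size of the already scaled output, which is only an upper bound on what a block-by-block decoder needs.

```go
package main

import "fmt"

// decodedBytes estimates the size of a buffer holding an entire decoded
// image, assuming 8-bit samples and `channels` samples per pixel.
// scaleDenom is the downscale denominator applied while decoding
// (1 means full size); scaled dimensions are rounded up.
func decodedBytes(width, height, channels, scaleDenom int64) int64 {
	w := (width + scaleDenom - 1) / scaleDenom
	h := (height + scaleDenom - 1) / scaleDenom
	return w * h * channels
}

func main() {
	// The 10,000 x 20,000 pixel example from the comment above.
	const width, height, channels = 10000, 20000, 3
	for _, d := range []int64{1, 2, 4, 8} {
		b := decodedBytes(width, height, channels, d)
		fmt.Printf("scale 1/%d: %d bytes (%.1f MB)\n", d, b, float64(b)/1e6)
	}
}
```

Under these assumptions the full-size frame is 600,000,000 bytes, while the 1/8 output is 9,375,000 bytes, so a handful of concurrent full-size decodes can use gigabytes of memory where scaled decodes would use tens of megabytes.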

~~~
zzzcpan
Agreed.

------
tmzt
Has anybody done something like this with support for uploading and resizing
images via a REST API with OpenResty? Do you know of any good examples or
tutorials for that, or for upload support in general?

------
the1
for true hipster web scale power, one must use node.js
[https://github.com/saml/nodejs-resize-image](https://github.com/saml/nodejs-resize-image)

~~~
nyan_sandwich
Actually, OpenResty is both faster and more obscure.

------
canterburry
...or you could just use Pixtulate
([http://www.pixtulate.com](http://www.pixtulate.com))

~~~
ddorian43
Looks like it's free. Really?

~~~
canterburry
Yes, we'll be providing a free beta coming up soon. If you are interested, I'd
love to reach out to you when we launch the beta.

------
limmeau
Thanks for posting this. Do those Lua scripts run concurrently in any way?

~~~
leafo
To enable concurrency you'll have to spawn multiple workers using the
worker_processes directive
[http://wiki.nginx.org/CoreModule#worker_processes](http://wiki.nginx.org/CoreModule#worker_processes)

