
I just had to write an image server for work and found it incredibly hard. What does your on-the-fly image processing setup look like?



I'd break it down into two parts. The first is the image transformation. For this we use GraphicsMagick, and based on the params in the querystring, we'd crop, resize, alter the quality, anchor the image to a focus point and so on. It's not copy-and-pastable, but [1] should give you a rough idea. Also, [2] is the code we use to get the file type and size of the image (gm identify can be painfully slow).

The second part was a bit more "fancy". There were two really slow parts to this: (a) fetching the origin (from S3) and (b) applying lossless compression (for a large image, it can take 10+ seconds). Fetching from origin is easily solved by caching the origin to disk. So if you ask for goku.png?w=90001&h=9001 and then goku.png?w=2393&h=43433, it's only going to be 1 origin fetch. For the lossless compression, we just used the filesystem as a queue. We'll serve up the uncompressed image with a short cache header (maybe 10 minutes) and store it in /storage/uncompressed. The filesystem is monitored and when a file is added, we compress it and then move it to /storage/compressed.
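A minimal sketch of the "filesystem as a queue" idea, not the original code. It assumes a polling pass instead of filesystem notifications, and `compress` is a hypothetical stand-in for whatever lossless optimizer you run; the function name `drain_once` is made up for illustration.

```python
from pathlib import Path
from typing import Callable


def drain_once(uncompressed: Path, compressed: Path,
               compress: Callable[[bytes], bytes]) -> int:
    """Compress every file currently queued in `uncompressed`.

    Returns the number of files moved to `compressed`.
    """
    compressed.mkdir(parents=True, exist_ok=True)
    moved = 0
    for src in sorted(uncompressed.iterdir()):
        if not src.is_file():
            continue
        # Write to a temporary name first, then rename. The rename is atomic
        # on the same filesystem, so the server never sees a partial file.
        tmp = compressed / (src.name + ".tmp")
        tmp.write_bytes(compress(src.read_bytes()))
        tmp.replace(compressed / src.name)
        # Only drop the uncompressed copy once the compressed one is in place.
        src.unlink()
        moved += 1
    return moved
```

Calling `drain_once` in a loop with a short sleep gives you the background worker; swapping the loop for a notification-based watcher would not change the function itself.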

So, when you serve an image, the flow is:

- check for the file in /storage/compressed/ and serve that with a long cache header (this is a fully transformed image; the file name is a hash of the querystring parameters)

- check for the file in /storage/uncompressed/ and serve that with a short cache header (also a fully transformed image, keyed by the same hash)

- Check if we at least have the original in /storage/original

  - if not, fetch the original and put it in /storage/original

- Transform the image, store it at /storage/uncompressed and serve it up

- In the background, compress images and move them from /storage/uncompressed to /storage/compressed
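The serving steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: `serve`, `cache_key`, `fetch_origin` and `transform` are hypothetical names, SHA-256 is an arbitrary choice of hash, and the long TTL value is invented (only the "maybe 10 minutes" short TTL comes from the description).

```python
import hashlib
from pathlib import Path
from typing import Callable, Tuple

LONG_TTL = 365 * 24 * 3600  # assumed value for the "long cache header"
SHORT_TTL = 10 * 60         # "maybe 10 minutes"


def cache_key(name: str, query: str) -> str:
    """Hash the image name plus querystring into a file name."""
    return hashlib.sha256(f"{name}?{query}".encode()).hexdigest()


def serve(root: Path, name: str, query: str,
          fetch_origin: Callable[[str], bytes],
          transform: Callable[[bytes, str], bytes]) -> Tuple[bytes, int]:
    """Return (image bytes, cache TTL in seconds) for one request."""
    key = cache_key(name, query)

    # 1. Fully transformed and compressed: serve with a long cache header.
    compressed = root / "compressed" / key
    if compressed.exists():
        return compressed.read_bytes(), LONG_TTL

    # 2. Transformed but not yet compressed: serve with a short cache header.
    uncompressed = root / "uncompressed" / key
    if uncompressed.exists():
        return uncompressed.read_bytes(), SHORT_TTL

    # 3. Make sure the original is on local disk, fetching it at most once.
    original = root / "original" / name
    if not original.exists():
        original.parent.mkdir(parents=True, exist_ok=True)
        original.write_bytes(fetch_origin(name))

    # 4. Transform, queue for background compression, and serve.
    data = transform(original.read_bytes(), query)
    uncompressed.parent.mkdir(parents=True, exist_ok=True)
    uncompressed.write_bytes(data)
    return data, SHORT_TTL
```

Note that hashing the raw querystring treats `w=1&h=2` and `h=2&w=1` as different images; sorting the parameters before hashing would avoid duplicate work.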

It might seem like overkill when you consider that, despite serving thousands of images per second, the CDN handles almost every request. The problem is with the lossless compression. We found it impossible to do it on-the-fly for too many of our images, so you absolutely need that available and ready to go for the 5% CDN miss.

[1] https://gist.github.com/anonymous/8f328359f07f6c5d142e

[2] http://openmymind.net/Getting-An-Images-Type-And-Size/


Are you using S3 with multiple EC2 instances/multiple servers? Do you keep your /storage/ on S3? I'm considering pulling from S3, then resizing on whatever server it is, then storing back on S3 - any issues with that?

Do you handle the malicious case of someone supplying various widths and heights potentially DoSing the server?


Non EC2 servers, one in Europe and one in the US. /storage are local SSDs to the machines (so it's 2 copies of the data (3 for the originals since they're also on S3)).

Whether storing it back on S3 is "good enough" depends on whether you feel the latency to fetch from S3 is acceptable. I don't have any hard numbers (I might have at some point). I imagine you'll see a percentage of requests in the 1-4s range, which is pretty bad considering you still have to serve it to the CDN and then the CDN to the user. If you have users on mobile or in developing countries, you do what you can to make your side as fast as possible.

Never had malicious users, but we worried about it. We took a reactive approach: monitoring disk space usage. It never proved necessary to do more. You're definitely open to a DOS attack. Hard to mitigate too...can't rate limit since the request comes from the CDN. You could whitelist certain dimensions, but we also allowed our content owners to specify the focal point of the image, which we'd center our crop on, which means any value of x and y is reasonable. You could possibly store that data on the image servers, instead of passing it in the querystring, but then you're introducing state and, with multiple servers, synchronisation. shudder.

You can see it in action at:

http://0.viki.io/viki.jpg?s=263x220&q=h

with documentation at:

http://dev.viki.com/v4/images/

(the [q]uality argument isn't documented, weird....unless you specify a quality (I only remember [h]igh) we pick a jpg compression based on the filesize)


> It never proved necessary to do more. You're definitely open to a DOS attack. Hard to mitigate too...

An approach I've used before is to have a hash in the URL, and discard any requests where the width/height don't match the hash value. Not good if users are meant to be able to link at whatever size they want, but in our case we gave a shortcode to users which then generated the actual URL.
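A rough sketch of the signed-URL idea, assuming an HMAC over the path and dimensions. The comment above only says "a hash in the URL", so the use of HMAC-SHA256, the secret, the truncation to 16 hex characters, and the function names are all assumptions for illustration.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical server-side secret, never sent to clients


def sign(path: str, width: int, height: int, secret: bytes = SECRET) -> str:
    """Return a short signature covering the image path and dimensions."""
    msg = f"{path}:{width}x{height}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()[:16]


def is_valid(path: str, width: int, height: int, signature: str,
             secret: bytes = SECRET) -> bool:
    """Reject requests whose signature does not match the requested size."""
    expected = sign(path, width, height, secret)
    # Constant-time comparison; compare bytes so odd input cannot raise.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The server that generates page markup calls `sign` when building the image URL, and the image server calls `is_valid` before doing any work, so arbitrary width/height combinations are discarded cheaply.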


I have heard anecdotally that graphicsmagick and imagemagick both have awful memory leaks. Did you run into this? We serve ~400 reqs/sec from our image server after putting a CDN in front of it, so we couldn't work with memory leaks.

File size was a problem for us too, as was file format. We needed a solution that would work with pngs, jpegs, gifs and tiffs.


I worked for a company that used ImageMagick for compositing images together and resizing them to one of ~60 sizes (different page locations, device types, etc.). While it worked, and it was at the core of the company's technology, it was truly the worst software I have ever had to wrangle into a production service.

You name it and ImageMagick could do it:

  - Leak memory
  - Perform terribly until you find the magic incantation that is 10x faster
  - Output wildly different images after a minor patch release
  - Remove / change options after a minor patch release
  - Enormously degrade performance after a minor patch release
  - Have numerous security vulnerabilities all the time, which require frequent upgrades
  - Dump core more often than you might like

It took us upwards of 3 months to simply move from one ImageMagick release to another (a few minor versions ahead), and we had to do all sorts of workarounds and A/B tests to ensure the images would look right.

I heard that GraphicsMagick was superior in that it maintained some consistency of behavior between versions, but it doesn't have all of the functionality of ImageMagick. So we couldn't switch to it.

Another company that I worked for had a fleet of several thousand servers running constantly just to thumbnail user uploaded images, and it was not unheard of for it to fall behind.

IM / GM are the stock answer to process images, but from my experiences they have no place in a production system. I think this is an area that is pretty poorly served by open source software; there are lots of libraries to handle different image formats, but no good infrastructure exists to tie it all together (that I'm aware of).


Eep. Thanks for the writeup. I'm really glad we didn't go with imagemagick.


No, we didn't have memory leaks, but I do remember we had to build GM from source because the version in the Ubuntu repo was old and it DID leak.

For gifs, we used gifsicle, but we didn't support the full set of functions with it.


I hate to be all self promotion, but this is why imgix exists as a service. We aren't running gm or imagemagick. One of our statements is that "this cannot be built in a weekend" as many engineers are quick to claim how easy an implementation it is when coming across our service.


Did you run across the memory leak issue too? Could you talk about the imgix stack? I'm curious to know what it is like.


I had to write an image server myself and it wasn't really hard.

Nginx to handle existing files. Python + Pillow + cherrypy (or could be any other microframework) to handle image processing on the fly and then caching processed image to the disk.

All in all, around 350 lines of code in Python, and something like 100 lines of nginx config (because of heavy filename processing and internal URL rewriting).

Result - facebook-like image processing:

Simple resize: http://media.example.com/w400x200/id_token_string.jpg

Crop (based on coords): http://media.example.com/20.20.380.300/id_token_string.jpg

Or crop resize from center: http://media.example.com/c200x200/id_token_string.jpg

and so on...

Fun project.
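For illustration, the first path segment of URLs like the ones above could be parsed along these lines. This is a guess at the scheme based only on the three example URLs; `parse_op` and the operation names are hypothetical, and the meaning of the four crop coordinates is not stated, so they are returned uninterpreted.

```python
import re
from typing import Optional, Tuple

# One pattern per URL style shown in the examples.
_PATTERNS = [
    ("resize", re.compile(r"w(\d+)x(\d+)")),
    ("crop_center", re.compile(r"c(\d+)x(\d+)")),
    ("crop_coords", re.compile(r"(\d+)\.(\d+)\.(\d+)\.(\d+)")),
]


def parse_op(segment: str) -> Optional[Tuple[str, Tuple[int, ...]]]:
    """Map a path segment such as 'w400x200' to (operation, numbers).

    Returns None when the segment matches no known pattern, which the
    caller can turn into a 404 before any image work is done.
    """
    for op, pattern in _PATTERNS:
        match = pattern.fullmatch(segment)
        if match:
            return op, tuple(int(g) for g in match.groups())
    return None
```

For example, `parse_op("w400x200")` gives `("resize", (400, 200))` and `parse_op("20.20.380.300")` gives `("crop_coords", (20, 20, 380, 300))`.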


It's about total cost of ownership

A: Imgix = cost of imgix subscription + integration time * hourly rate (it's dead simple)

B: Rolling your own = dev time * hourly rate + maintenance/ops time * hourly rate

For most orgs, B > A. For most individual programmers, since hourly rate is not a factor, B < A.


That's certainly true. I'm not saying that this service doesn't have any value. I just wanted to share my experience with writing a similar solution.

It's good to be dev and have spare time for stuff like that. You learn stuff while making such projects, you save money and it just works.

About money: judging by the pricing on the imgix page (and if I understand the pricing correctly), I'm saving around $300 per month ($50 for the cheapest plan + $250 for traffic).


If you have to pay for hardware to run it on, that would be part of B.


What was the max throughput your server could handle? We ended up using Go because we serve 400 - 600 reqs/sec and Python/Ruby solutions didn't have high throughput.


I've never tested its throughput - there was no reason to do that because all of the processed images are saved and stored as long as those files are accessed at least once a week.

So, the answer is pretty much this: it'll handle as many rps as nginx serving static files can handle.

If we are talking about unique requests, each asking to process a unique image, I'm pretty sure the CPU of the Linode server it's running on will choke.
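The "kept as long as they're accessed at least once a week" policy could be implemented with a periodic cleanup like the sketch below. It is an assumption about how such a cleanup might look, not the actual script: `evict_stale` is a made-up name, and it relies on file access times, which are not updated on filesystems mounted with noatime.

```python
import time
from pathlib import Path
from typing import Optional

WEEK = 7 * 24 * 3600


def evict_stale(cache_dir: Path, max_idle: float = WEEK,
                now: Optional[float] = None) -> int:
    """Delete cached files not accessed within `max_idle` seconds.

    Returns the number of files removed.
    """
    now = time.time() if now is None else now
    removed = 0
    # Materialise the listing first so deletions don't affect iteration.
    for path in list(cache_dir.rglob("*")):
        if path.is_file() and now - path.stat().st_atime > max_idle:
            path.unlink()
            removed += 1
    return removed
```

Run from cron or a timer, this keeps the disk from filling while hot images stay cached and are still served as static files.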



