Show HN: Imgix.js, a JavaScript library for responsive imaging (imgix.com)
102 points by zacman85 on Oct 10, 2014 | 32 comments

I wonder why there are so many services that work with this kind of business model. While I would be willing to pay for code I can own, I would never invest in something whose future I can't predict.

I can't speak specifically about this responsive image feature, but as for imgix's core business, I have mixed feelings.

It's pretty trivial to write on-the-fly image processing using an existing graphics library. And given that the image can be cached and served from disk and a CDN, it can scale incredibly well.

Having said that, the feature set they support is impressive, the API is intuitive, the speed is great, and you can stick your own CDN in front of it (or use theirs, which is actually Fastly, I believe). The founder, Chris, is wicked smart... this is really more than just a wrapper around GM.

I just had to write an image server for work and found it incredibly hard. What does your on-the-fly image processing setup look like?

I'd break it down into two parts. The first is the image transformation. For this we used GraphicsMagick, and based on the params in the querystring, we'd crop, resize, alter the quality, anchor the image to a focus point and so on. Not copy-and-pasteable, but [1] should give you a rough idea. Also, [2] is the code we use to get the file type and size of the image (gm identify can be painfully slow).

The second part was a bit more "fancy". There were two really slow parts to this: (a) fetching the origin (from S3) and (b) applying lossless compression (for a large image, it can take 10+ seconds). Fetching from origin is easily solved by caching the origin to disk. So if you ask for goku.png?w=90001&h=9001 and then goku.png?w=2393&h=43433, it's only going to be 1 origin fetch. For the lossless compression, we just used the filesystem as a queue. We'd serve up the uncompressed image with a short cache header (maybe 10 minutes) and store it in /storage/uncompressed. The filesystem is monitored, and when a file is added, we compress it and then move it to /storage/compressed.
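A rough sketch of that filesystem-as-queue step, under stated assumptions: it does one polling pass rather than reacting to filesystem notifications, and the actual lossless optimizer is passed in as a hypothetical `compress` callable, since the comment doesn't name the tool used. `drain_queue` and its parameters are illustrative names, not the poster's code.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def drain_queue(uncompressed: Path, compressed: Path,
                compress: Callable[[bytes], bytes]) -> int:
    """Compress every file waiting in `uncompressed`, then move it to `compressed`.

    Returns the number of files processed in this pass.
    """
    compressed.mkdir(parents=True, exist_ok=True)
    processed = 0
    for src in sorted(uncompressed.iterdir()):
        if not src.is_file():
            continue
        data = compress(src.read_bytes())
        # Write to a temporary file in the destination directory and rename it
        # into place, so a reader never sees a half-written image.
        fd, tmp_name = tempfile.mkstemp(dir=compressed)
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_name, compressed / src.name)
        # Only remove the queued file once the compressed copy is in place.
        src.unlink()
        processed += 1
    return processed
```

A worker could call `drain_queue` in a loop with a short sleep, or trigger it from a filesystem watcher; either way the directory itself is the only queue state.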

So, when you serve an image, the flow is:

- Check for the file in /storage/compressed/ and serve it with a long cache header (this is a fully transformed image, keyed by a hash of the querystring parameters).

- Check for the file in /storage/uncompressed/ and serve it with a short cache header (also a fully transformed image, keyed the same way).

- Check if we at least have the original in /storage/original.

  - If not, fetch the original and put it in /storage/original.

- Transform the image, store it at /storage/uncompressed and serve it up.

- In the background, compress images and move them from /storage/uncompressed to /storage/compressed.
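The lookup order above can be sketched roughly as follows. This is an illustration, not the poster's code: `fetch_origin` and `transform` are hypothetical callables standing in for the S3 fetch and the GraphicsMagick step, the SHA-256 cache key and the one-year "long" lifetime are assumptions, and input sanitization is left out.

```python
import hashlib
from pathlib import Path
from typing import Callable, Tuple
from urllib.parse import parse_qsl, urlencode

LONG_TTL = 365 * 24 * 3600  # assumed value for the "long" cache header, in seconds
SHORT_TTL = 10 * 60         # "maybe 10 minutes", per the comment


def cache_key(name: str, query: str) -> str:
    """Hash the file name plus the querystring, with parameters sorted."""
    params = urlencode(sorted(parse_qsl(query)))
    return hashlib.sha256(f"{name}?{params}".encode()).hexdigest()


def serve(storage: Path, name: str, query: str,
          fetch_origin: Callable[[str], bytes],
          transform: Callable[[bytes, str], bytes]) -> Tuple[bytes, int]:
    """Return (image bytes, cache lifetime in seconds)."""
    key = cache_key(name, query)
    compressed = storage / "compressed" / key
    if compressed.exists():
        return compressed.read_bytes(), LONG_TTL
    uncompressed = storage / "uncompressed" / key
    if uncompressed.exists():
        return uncompressed.read_bytes(), SHORT_TTL
    original = storage / "original" / name
    if not original.exists():
        # First request for this image at any size: one origin fetch, then reuse.
        original.parent.mkdir(parents=True, exist_ok=True)
        original.write_bytes(fetch_origin(name))
    result = transform(original.read_bytes(), query)
    uncompressed.parent.mkdir(parents=True, exist_ok=True)
    uncompressed.write_bytes(result)  # picked up later by the background compressor
    return result, SHORT_TTL
```

Sorting the parameters before hashing means `w=1&h=2` and `h=2&w=1` share one cache entry, which is a design choice of this sketch rather than something the comment specifies.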

It might seem like overkill when you consider that, despite serving thousands of images per second, the CDN handles almost every request. The problem is with the lossless compression. We found it impossible to do it on-the-fly for too many of our images, so you absolutely need that available and ready to go for the 5% CDN miss.

[1] https://gist.github.com/anonymous/8f328359f07f6c5d142e

[2] http://openmymind.net/Getting-An-Images-Type-And-Size/

Are you using S3 with multiple EC2 instances/multiple servers? Do you keep your /storage/ on S3? I'm considering pulling from S3, then resizing on whatever server it is, then storing back on S3 - any issues with that?

Do you handle the malicious case of someone supplying various widths and heights potentially DoSing the server?

Non-EC2 servers, one in Europe and one in the US. /storage is on SSDs local to the machines (so it's 2 copies of the data, 3 for the originals since they're also on S3).

Whether storing it back on S3 is "good enough" depends on whether you feel the latency to fetch from S3 is acceptable. I don't have any hard numbers (I might have at some point). I imagine you'll see a percentage of requests in the 1-4s range, which is pretty bad considering you still have to serve it to the CDN and then the CDN to the user. If you have users on mobile or in developing countries, you do what you can to make your side as fast as possible.

Never had malicious users, but we worried about it. We took a reactive approach: monitoring disk space usage. It never proved necessary to do more. You're definitely open to a DOS attack. Hard to mitigate too...can't rate limit since the request comes from the CDN. You could whitelist certain dimensions, but we also allowed our content owners to specify the focal point of the image, which we'd center our crop on, which means any value of x and y is reasonable. You could possibly store that data on the image servers, instead of passing it in the querystring, but then you're introducing state and, with multiple servers, synchronisation. shudder.

You can see it in action at:


with documentation at:


(the [q]uality argument isn't documented, weird....unless you specify a quality (I only remember [h]igh) we pick a jpg compression based on the filesize)

> It never proved necessary to do more. You're definitely open to a DOS attack. Hard to mitigate too...

An approach I've used before is to have a hash in the URL, and discard any requests where the width/height don't match the hash value. Not good if users are meant to be able to link at whatever size they want, but in our case we gave a shortcode to users which then generated the actual URL.
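One way to sketch the hash-in-URL idea is with an HMAC over the path and dimensions. The comment only says "a hash", so the HMAC construction, the message layout, and the names `SECRET`, `sign`, and `is_valid` are all assumptions made for illustration.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical server-side secret, never sent to clients


def sign(path: str, width: int, height: int, secret: bytes = SECRET) -> str:
    """Return a hex signature binding the path to one width/height pair."""
    message = f"{path}|{width}|{height}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_valid(path: str, width: int, height: int, signature: str,
             secret: bytes = SECRET) -> bool:
    """Check a signature taken from the URL; reject any mismatch."""
    expected = sign(path, width, height, secret)
    # Constant-time comparison; encode so arbitrary input strings are accepted.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The application would call `sign` when it generates an image URL, and the image server would call `is_valid` before doing any work, so requests with made-up dimensions can be dropped cheaply.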

I have heard anecdotally that graphicsmagick and imagemagick both have awful memory leaks. Did you run into this? We serve ~400 reqs/sec from our image server after putting a CDN in front of it, so we couldn't work with memory leaks.

File size was a problem for us too, as was file format. We needed a solution that would work with pngs, jpegs, gifs and tiffs.

I worked for a company that used ImageMagick for compositing images together and resizing them to one of ~60 sizes (different page locations, device types, etc.). While it worked, and it was at the core of the company's technology, it was truly the worst software I have ever had to wrangle into a production service.

You name it and ImageMagick could do it:

  - Leak memory
  - Perform terribly until you find the magic incantation that is 10x faster
  - Output wildly different images after a minor patch release
  - Remove / change options after a minor patch release
  - Enormously degrade performance after a minor patch release
  - Have numerous security vulnerabilities all the time, which require frequent upgrades
  - Dump core more often than you might like

It took us upwards of 3 months to simply move from one ImageMagick release to another (a few minor versions ahead), and we had to do all sorts of workarounds and A/B tests to ensure the images would look right.

I heard that GraphicsMagick was superior in that it maintained some consistency of behavior between versions, but it doesn't have all of the functionality of ImageMagick. So we couldn't switch to it.

Another company that I worked for had a fleet of several thousand servers running constantly just to thumbnail user uploaded images, and it was not unheard of for it to fall behind.

IM / GM are the stock answer to process images, but from my experiences they have no place in a production system. I think this is an area that is pretty poorly served by open source software; there are lots of libraries to handle different image formats, but no good infrastructure exists to tie it all together (that I'm aware of).

Eep. Thanks for the writeup. I'm really glad we didn't go with imagemagick.

No, we didn't have memory leaks, but I do remember we had to build GM from source because the version in the Ubuntu repo was old and it DID leak.

For gifs, we used gifsicle, but we didn't support the full set of functions with it.

I hate to be all self promotion, but this is why imgix exists as a service. We aren't running gm or imagemagick. One of our statements is that "this cannot be built in a weekend" as many engineers are quick to claim how easy an implementation it is when coming across our service.

Did you run across the memory leak issue too? Could you talk about the imgix stack? I'm curious to know what it is like.

I had to write an image server myself and it wasn't really hard.

Nginx to handle existing files. Python + Pillow + cherrypy (or could be any other microframework) to handle image processing on the fly and then cache the processed image to disk.

All in all, around 350 lines of code in Python, and something like 100 lines in nginx (because of heavy filename processing and internal URL rewriting).

Result - facebook-like image processing:

Simple resize: http://media.example.com/w400x200/id_token_string.jpg

Crop (based on coords): http://media.example.com/20.20.380.300/id_token_string.jpg

Or crop resize from center: http://media.example.com/c200x200/id_token_string.jpg
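A small sketch of how those three URL shapes could be told apart in Python. The regular expressions and the `Operation` type are guesses based only on the example URLs; the poster did this part in nginx, and the meaning of the four crop numbers is not stated, so they are returned as-is.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Operation(NamedTuple):
    kind: str                # "resize", "center_crop", or "crop"
    values: Tuple[int, ...]  # (w, h), or the four crop numbers in URL order
    filename: str


_PATTERNS = [
    ("resize", re.compile(r"/w(\d+)x(\d+)/([^/]+)")),
    ("center_crop", re.compile(r"/c(\d+)x(\d+)/([^/]+)")),
    ("crop", re.compile(r"/(\d+)\.(\d+)\.(\d+)\.(\d+)/([^/]+)")),
]


def parse_path(path: str) -> Optional[Operation]:
    """Map a request path to an operation, or None if it matches no pattern."""
    for kind, pattern in _PATTERNS:
        match = pattern.fullmatch(path)
        if match:
            *numbers, filename = match.groups()
            return Operation(kind, tuple(int(n) for n in numbers), filename)
    return None
```

Anything that returns `None` can be rejected before it reaches the image-processing code.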

and so on...

Fun project.

It's about total cost of ownership

A: Imgix = cost of imgix subscription + integration time * hourly rate (it's dead simple)

B: Rolling your own = dev time * hourly rate + maintenance/ops time * hourly rate

For most orgs B > A. For most individual programmers, since hourly rate is not a factor, B < A.

That's certainly true. I'm not saying that this service doesn't have any value. I just wanted to share my experience with writing a similar solution.

It's good to be a dev and have spare time for stuff like that. You learn things while making such projects, you save money and it just works.

About money: judging by the pricing on the imgix page (and if I understand the pricing correctly), I'm saving around $300 per month ($50 for the cheapest plan + $250 for traffic).

If you have to pay for hardware to run it on, that would be part of B.

What was the max throughput your server could handle? We ended up using Go because we serve 400 - 600 reqs/sec and Python/Ruby solutions didn't have high throughput.

I've never tested its throughput - there was no reason to do that because all of the processed images are saved and stored as long as those files are accessed at least once a week.

So, the answer is pretty much this: it'll handle as many rps as nginx serving static files can handle.

If we are talking about unique requests, each asking to process a unique image, I'm pretty sure the CPU of the Linode server it's running on will choke.

There are at least a couple of open source projects that handle dynamic image generation on the backend quite nicely. It seems photon is dead, but I found it works quite well and is easy to extend. Thumbor seems quite active, although I've never used it myself.



One that I wrote, written in Ruby: http://magickly.afeld.me/ Powers http://mustachify.me :3)

This is nice, but it does not put a valid img tag in the document source without javascript execution. That might be fine for things that don't face the public internet, but if you do any type of public publishing, you should care about having markup that describes your content independent of scripts or css.

This is why standard markup-based responsive images (picture and srcset) are such a big deal.

There is nothing stopping you from setting the src attribute, but you are just adding another request. Our base service works perfectly with srcset and picturefill. This library is for cases where a different result is desired.

Why isn't there an img tag in the default example? Why are you promoting bad practice by not having images in img tags? You solved one problem but created a much bigger one. Bots, screen readers and non-JavaScript clients should see an image, not a div with some attributes.

They're welcome to target whichever type of users they want to - it's 2014, most average internet users have javascript enabled. Besides, you can add a title/aria-* attribute for bots and screenreaders (perhaps not semantically correct but gets the job done).

Yes, it is 2014, and we should be able to follow simple standards that have been around for quite some time.

What would the advantage of this be versus using the picture tag with your own image sources hosted on a CDN? (Let's assume picture tag is widely supported for now)

It is possible to easily target the picture element using imgix's URL API directly. However, with picture you are limited to a set of image sizes and predefined DPRs. This library allows the containing element to receive an image at any size necessary with any DPR multiplier (up to 5, I believe), with the added bonus of conditional image manipulation overrides. In the example, we are baking in textual image information. Other edits like image quality and sharpening or midtone adjustments could be conditionally set as the image crops larger or smaller.
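To make the idea concrete, here is a hedged sketch of building one image URL from a container's measured size and the device pixel ratio. It is not how imgix.js itself is implemented; the `w`, `h`, and `dpr` query parameters, the `example.imgix.net` host in the usage note, and the cap of 5 (taken from the "I believe" above) are assumptions for illustration.

```python
from urllib.parse import urlencode

MAX_DPR = 5.0  # "up to 5, I believe", per the comment; treat as an assumption


def image_url(base: str, css_width: int, css_height: int, dpr: float) -> str:
    """Build a URL requesting an image sized for one container and one DPR."""
    # Cap the multiplier so an unusual device cannot request a huge image.
    capped = min(float(dpr), MAX_DPR)
    params = {"w": css_width, "h": css_height, "dpr": round(capped, 2)}
    return f"{base}?{urlencode(params)}"
```

For instance, `image_url("https://example.imgix.net/photo.jpg", 320, 240, 2)` asks for a 320x240 CSS-pixel slot at twice the density, without needing a predefined list of sizes in the markup.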

Title is "for responsive images", not "imaging".

Imaging is completely different!

What is your definition of imaging? Our use of imaging is in reference to imaging in technology. "The production of graphic images from digitally generated data." Our service processes images on our servers and produces new image data with each request when necessary (if not cached already.) The Javascript library is just a way to interface with our infrastructure and generate these requests.

Where did they get the yeti image in the demo?

I'm not sure what "yeti" image you are referring to. All images were royalty-free images purchased from iStockPhoto.
