How do you communicate with Photon, is it over HTTP? (I know nothing of this "JAX-RS" you speak of). I was thinking of building something like this for my photo warping app Trollaroid -- currently all the processing is done on client devices, but it'd be much more sensible to do it in the cloud. How might hooking up my API endpoint to an array of Java image processors over HTTP compare to, say, using ZeroMQ as the glue?
Our webapp doesn't communicate with Photon directly at all. The webapp puts the photo in a certain bucket on S3, and then includes the path to that photo in the Photon URL that gets passed to the client.
On the read path, HTTP is really nice because browsers, caches, and CDNs all speak HTTP.
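To make the flow above concrete, here is a minimal sketch of how a webapp might embed the S3 path in a resize URL. The host name, the `photon_url` helper, and the `/<width>x<height>/<key>` layout are all assumptions made for illustration; the thread does not say what Photon's actual URL scheme looks like.

```python
from urllib.parse import quote

# Hypothetical host: the real hostname and URL layout are not given in this thread.
PHOTON_HOST = "https://photon.example.com"


def photon_url(s3_key: str, width: int, height: int) -> str:
    """Build a resize URL that embeds the S3 key of the original photo.

    The "/<width>x<height>/<key>" layout is an assumed convention,
    not Photon's documented format.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # quote() leaves "/" unescaped by default, so the key's path structure survives.
    return f"{PHOTON_HOST}/{width}x{height}/{quote(s3_key)}"


if __name__ == "__main__":
    # The webapp would upload the original to S3 first, then hand this URL to the client.
    print(photon_url("uploads/user42/beach photo.jpg", 640, 480))
```

Because the URL fully describes the request (source key plus dimensions), any HTTP cache sitting in front of the resizer can key on it without knowing anything about the webapp.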
How does this compare to something like Dragonfly (https://github.com/markevans/dragonfly/)? 1000memories' issue was with the time it took to generate 12 different versions, not the speed of ImageMagick itself. So I'm not quite sure why they needed to write their own extension. Even CarrierWave (plus caching, of course) can be set up this way (https://gist.github.com/1541912).
Dragonfly is quite nice, but we opted not to use it for a handful of reasons: ImageMagick itself is actually a good bit slower for basic resizing/cropping/rotating than the library we're using in Photon; Photon is ~100 SLOC, Dragonfly is ~3500; photos are the core of our business, so a generic solution is not likely to work 100% how we need it to (we have various legacy concerns that probably would've required more code to monkey patch Dragonfly than to just write it into Photon); Java is much better than Ruby for CPU- and RAM-bound tasks that need to execute in parallel.
All that said, Dragonfly is a great choice for most people.
Thanks for open sourcing this, 1000mem. What is the average request latency for a 2 MB photo? We tried to do something similar with Python, but we found there's a 300 ms latency between S3 and EC2. It takes quite a few seconds to transfer from S3 to EC2, resize it, and transfer to the client. You mentioned using caching to speed things up, but what about the first request?
We do two layers of caching (CDN in front of Varnish in front of Photon). The latency is annoying, but it's no worse than serving the files directly off of S3 (as we were doing previously).
In an ideal world, Photon would be running on boxes that store the files themselves, and some sort of intelligent load balancer would make sure that the requests get directed to the correct box.
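One way such routing could work is rendezvous (highest-random-weight) hashing: every box gets a score for a given photo path, and the request goes to the box with the highest score. This is only a sketch of one possible technique, not something the thread says Photon does; the `pick_box` helper and the box names are hypothetical. It also assumes files were placed on boxes using the same function, so that lookups and storage agree.

```python
import hashlib


def pick_box(photo_path: str, boxes: list[str]) -> str:
    """Choose the box responsible for a photo using rendezvous hashing.

    Each (box, path) pair is hashed to a score and the highest score wins.
    Removing a box only remaps the photos that box was responsible for.
    """
    if not boxes:
        raise ValueError("at least one box is required")

    def score(box: str) -> int:
        digest = hashlib.sha256(f"{box}|{photo_path}".encode("utf-8")).digest()
        # Use the first 8 bytes of the digest as an unsigned integer score.
        return int.from_bytes(digest[:8], "big")

    return max(boxes, key=score)


if __name__ == "__main__":
    # Hypothetical box names, for illustration only.
    boxes = ["photon-1", "photon-2", "photon-3"]
    for path in ["uploads/a.jpg", "uploads/b.jpg", "uploads/c.jpg"]:
        print(path, "->", pick_box(path, boxes))
```

The appeal over a plain `hash(path) % len(boxes)` is that adding or removing a box does not reshuffle every photo, which matters when each box holds the files it serves.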
I tried to write an interface to ImageMagick's MagickWand in Golang as well to solve the same problem. You can find it here: https://github.com/dqminh/mage
Awesome to see something like this open sourced. Definitely going to check this out. For those who don't have the resources available to manage this kind of infrastructure however, you may wish to look at Cloudinary (http://cloudinary.com/).
I'm sure it works great. But it's one more piece of infrastructure to deploy, monitor, and maintain. Depending on what stage you're at and whether or not photos are core to your business, I think (at least temporarily) a hosted solution is better for some.