
Using AWS lambda for cheap S3 content processing - cleverfoo
http://docs.scanii.com/articles/aws-lambda-for-cheap-s3-content-processing.html
======
jayroh
This is an excellent write-up and a great use-case for what S3+Lambda can do.
As AWS offerings go, I believe Lambda has a _tremendous_ upside and will be
growing significantly over the next few years.

I've been throwing a lot of my own personal resources into building some
things on top of both S3 and Lambda, and have found a few tools that help
quite a bit. For one, the lambduh project from Russ Matney has been a great
resource for abstracting out some of the more common S3->Lambda workflows:
[https://github.com/lambduh/lambduh](https://github.com/lambduh/lambduh). On a
different note, there is T.J. Holowaychuk's Apex project:
[https://medium.com/@tjholowaychuk/introducing-apex-800824ffaa70](https://medium.com/@tjholowaychuk/introducing-apex-800824ffaa70)

------
cleverfoo
Hi there, author here, happy to answer any questions.

~~~
StreamBright
Excellent write-up, and kudos for using IAM and roles for this. We are working
on implementing the very same system and might just reuse your code. Thanks
for sharing!

~~~
cleverfoo
Thank you!

------
untog
Off topic, but I'm hoping there are some Lambda-heads in the room. I want to
write a system that basically rebroadcasts a message sent over SNS, to
different HTTP endpoints. (I don't have control over these endpoints so can't
use SNS itself as I can't confirm subscriptions).

How many HTTP requests can Lambda do concurrently? Is my best approach to fire
all these requests inside one worker, or should/could I have it spin up
subsequent Lambdas whose only function is to run the HTTP request and then
exit? I'm imagining that would be a lot more expensive.

~~~
yowmamasita
My guess: all you can do in the 300 sec execution limit

~~~
cleverfoo
The tricky part there is that it wouldn't work if you just sat in a tight loop
dispatching HTTP requests: any one of them timing out would likely hit the
deadline and keep all subsequent HTTP requests from happening.
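
To make that failure mode concrete, here is a minimal sketch (not something from the article) of dispatching the requests concurrently with a short per-request timeout, so one slow endpoint cannot eat the whole execution limit. The names `post_message` and `fan_out`, the 5 second timeout and the pool size of 20 are all assumptions picked for illustration.

```python
import concurrent.futures
import urllib.request


def post_message(url, body, timeout_s=5.0):
    """POST body (bytes) to url and return the HTTP status code."""
    req = urllib.request.Request(url, data=body, method="POST")
    # timeout_s bounds each blocking socket operation, so a dead endpoint
    # fails quickly instead of stalling until the Lambda deadline.
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.status


def fan_out(urls, body, send=post_message, max_workers=20):
    """Send body to every URL concurrently; return {url: status or exception}."""
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(send, url, body): url for url in urls}
        for fut in concurrent.futures.as_completed(futures):
            url = futures[fut]
            try:
                results[url] = fut.result()
            except Exception as exc:  # timeouts, DNS failures, HTTP errors
                results[url] = exc
    return results
```

Failures are collected per URL rather than raised, so a single bad endpoint does not stop the rest of the batch; whether to retry the failed ones is left to the caller.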

So, alternatively, you could do something with DynamoDB event sources, where
you have some sort of pub/sub table that your Lambda functions listen on
(basically a list of all the HTTP requests that have to happen), so that each
HTTP request gets its own Lambda dispatch. The catch is you would need another
system to manage that table (technically that system can be Lambda itself).
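
A rough sketch of what the listening side could look like, assuming the table has a stream enabled with new images included and that each row has string attributes named `url` and `body` (both attribute names are hypothetical). The event source's batch size would need to be set to 1 to get exactly one request per invocation; the handler below just processes whatever batch it is given.

```python
import urllib.request


def send_request(url, body, timeout_s=5.0):
    """POST body (str) to url and return the HTTP status code."""
    req = urllib.request.Request(url, data=body.encode("utf-8"), method="POST")
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.status


def pending_requests(event):
    """Yield (url, body) for each newly inserted row in a DynamoDB stream event."""
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # ignore updates and deletes
        image = record["dynamodb"]["NewImage"]
        # Stream records carry typed attribute values, e.g. {"S": "text"}.
        # "url" and "body" are hypothetical attribute names.
        yield image["url"]["S"], image["body"]["S"]


def handler(event, context, send=send_request):
    """Lambda entry point: perform one HTTP request per inserted row."""
    sent = 0
    for url, body in pending_requests(event):
        send(url, body)
        sent += 1
    return {"sent": sent}
```

The `send` parameter exists only so the handler can be exercised locally without touching the network; Lambda itself calls `handler(event, context)`.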

Two important things: 1) I haven't used the DynamoDB/Lambda integration myself,
so be skeptical of my suggestion, and 2) what I can say from our usage of the
S3/Lambda integration is that concurrency is not a problem, with thousands of
Lambda dispatches/second being surprisingly quick to spin up.

------
piyushco
Excellent post. I wanted to generate thumbnail images for photos uploaded to
an S3 bucket using AWS Lambda, and managed to implement it.

But I found one issue that many here might not be aware of: the S3 bucket and
the Lambda function need to be in the same AWS region.

Unfortunately, my S3 bucket is in southeast-ap, and AWS Lambda is not
available in that region, so I couldn't go live today. I will have to copy the
bucket to another region to use it.
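
For anyone wanting to check this up front, a small sketch of comparing a bucket's region with the region a function runs in. It assumes boto3 is available and that the `AWS_REGION` environment variable is set, as it is inside Lambda; the helper names are made up for illustration.

```python
import os


def bucket_region(location_constraint):
    """Map an S3 LocationConstraint value to a region name."""
    # S3 reports None for us-east-1 and the legacy value "EU" for eu-west-1.
    if location_constraint is None:
        return "us-east-1"
    if location_constraint == "EU":
        return "eu-west-1"
    return location_constraint


def same_region(location_constraint, function_region=None):
    """True if the bucket's region matches the function's region."""
    function_region = function_region or os.environ.get("AWS_REGION")
    return bucket_region(location_constraint) == function_region


def check_bucket(bucket_name):
    """Look up the bucket's region via S3 and compare it with this function's."""
    import boto3  # imported lazily so the helpers above work without it

    s3 = boto3.client("s3")
    constraint = s3.get_bucket_location(Bucket=bucket_name)["LocationConstraint"]
    return same_region(constraint)
```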

Hope this helps. Thanks.

~~~
gravypod
Well, that is strange, but it does make sense. They would be spending a lot of
money on bandwidth if that were not the case.

------
estefan
One lambda application I really want is a pingdom-style service. Use lambda to
ping a web site and send an email if it's offline. Any takers to build this
:-)?

~~~
delluminatus
Actually, Amazon already did -- one of the sample Lambda functions you can use
is just that. It runs on a scheduled timer, and if I remember correctly, will
alert using AWS Simple Notification Service, which can be configured to send
alerts to your devices or emails.
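
This is not Amazon's sample itself, just a minimal sketch of the same idea: a handler meant to run on a schedule that fetches a URL and publishes to an SNS topic when the check fails. It assumes the scheduled rule passes a constant JSON input with hypothetical keys `url` and `topic_arn`, and that boto3 is available.

```python
import urllib.request


def is_up(url, timeout_s=10.0):
    """Return True if the URL answers with a non-error status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 400
    except Exception:  # timeouts, connection errors, HTTP 4xx/5xx
        return False


def publish_alert(topic_arn, message):
    """Publish an alert to SNS; subscribers receive it by email, SMS, etc."""
    import boto3  # imported lazily so the handler can be tested without it

    boto3.client("sns").publish(
        TopicArn=topic_arn, Subject="Site down", Message=message
    )


def handler(event, context, check=is_up, alert=publish_alert):
    """Lambda entry point: check one site and alert if it is down."""
    # "url" and "topic_arn" are hypothetical keys supplied by the schedule.
    url = event["url"]
    if check(url):
        return {"url": url, "up": True}
    alert(event["topic_arn"], f"{url} failed its health check")
    return {"url": url, "up": False}
```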

~~~
vacri
Unfortunately, unless it's changed recently, SNS can only send SMSes to US
numbers :/

~~~
dorfsmay
[https://www.twilio.com/sms](https://www.twilio.com/sms)

~~~
vacri
Good point.

------
aantix
Still waiting on Ruby support...

~~~
semiquaver
It requires a little boilerplate, but JRuby works well and is reasonably
performant if you precompile the Ruby:

      require 'java'

      java_import 'com.amazonaws.services.lambda.runtime.Context'
      java_import 'java.util.Map'

      class Main
        # Expose a static Java method that Lambda can invoke as the handler.
        java_signature 'static String handler(Map<String,String> args, Context context)'
        def self.handler(args, context)
          puts "hello world"
        end
      end

      # Compile to a Java class with: jrubyc --java main.rb

