Ask HN: How was your experience with AWS Lambda in production?
215 points by chetanmelkani on June 21, 2017 | 155 comments
I would like to hear from people who have used AWS Lambda in production: how was their experience with it? It would be great if you have references to project repositories.

I made an image hosting tool on Lambda and S3, for internal corporate use. Staff can upload images to S3 via an SPA. The front end contacts the Lambda service to request a pre-signed S3 upload URL, so the browser can upload directly to S3. It works really well. Observations:

1. Took too long to get something working. The common use case of hooking up a Lambda function to an HTTP endpoint is surprisingly fiddly and manual.

2. Very painful logging/monitoring.

3. The Node.js version of Lambda has a weird and ugly API that feels like it was designed by a committee with little knowledge of Node.js idioms.

4. The Serverless framework produces a huge bundle unless you spend a lot of effort optimising it. It's also very slow to deploy incremental changes. Edit: this is not only due to the large bundle size but also due to having to re-up the whole generated CloudFormation stack for most updates.

5. It was worth it in the end for making a useful little service that will exist forever with ultra-low running costs, but the developer experience could have been miles better, and I wouldn't want to have to work on that codebase again.


Edit: here's the code: https://github.com/Financial-Times/ig-images-backend

To address point 3 above, I wrote a wrapper function (in src/index.js) so I could write each HTTP Lambda endpoint as a straight async function that simply receives a single argument (the request event) and asynchronously returns the complete HTTP response. This wouldn't be good if you were returning a large response though; you'd probably be better off streaming it.
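For readers who want to see the shape of such a wrapper, here is a minimal sketch in Python rather than the Node.js used in the linked repository. The decorator name `http_endpoint`, the `(status, payload)` return convention, and the sample endpoint are all assumptions made for illustration; this is not the code from src/index.js.

```python
import json
from functools import wraps


def http_endpoint(func):
    """Wrap a one-argument function into a Lambda-style handler (hypothetical helper)."""

    @wraps(func)
    def handler(event, context=None):
        try:
            # The wrapped function only sees the request event.
            status, payload = func(event)
        except Exception as exc:
            # Turn unexpected failures into a 500 response instead of crashing.
            status, payload = 500, {"error": str(exc)}
        return {
            "statusCode": status,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(payload),
        }

    return handler


@http_endpoint
def get_upload_url(event):
    # Hypothetical endpoint: echo the requested file name back to the caller.
    params = event.get("queryStringParameters") or {}
    return 200, {"filename": params.get("filename", "unnamed")}
```

Each endpoint stays a plain function of the event, and the response envelope lives in one place. As noted above, building the whole body in memory like this would not suit large responses.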

Although I've just started using AWS Lambda and think it's a cool technology to use, I can understand your points.

2) Logging is indeed painful! You definitely need a separate tool/system for that. I have created a small CLI tool to view logs of multiple Lambdas which have been deployed using a CloudFormation template: https://github.com/seeebiii/lambdalogs This does not replace a good external system, but it can help for small searches in the logs.

4) Yes, this takes a long time. Though I'm not using the Serverless framework, deploying the code using CloudFormation takes me about two and a half minutes (with a project using Java Lambdas), because CF is doing lots of checks in the background. I also wrote a tool for this to decrease the waiting time and just update the JS/Java code instead of the whole stack: https://github.com/seeebiii/lambda-updater

Hope this helps you or someone else a bit!

For those using Serverless, being able to upload just the code (not a full CloudFormation stack) is built in, and will save you tons of time. Also, do what you can to limit/optimize your npm modules (if you're using node) since they have to be uploaded every time.

The problem of logs is actually a problem of CloudWatch Logs being just not a very good service. A great way to solve that is to push all logs from CloudWatch Logs into an ElasticSearch cluster (using a Lambda function). AWS even has the code already done for you if you click the "subscribe" button in CWL. Then with Kibana/ElasticSearch the experience of inspecting and analysing logs is MUCH better.
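A rough sketch of the decoding step such a forwarding Lambda has to perform, in Python. A CloudWatch Logs subscription delivers its payload as base64-encoded, gzip-compressed JSON under `event["awslogs"]["data"]`. The function names and the sample data below are hypothetical, and the step that would actually ship the records to Elasticsearch is left out.

```python
import base64
import gzip
import json


def decode_cloudwatch_logs_event(event):
    """Decode the base64 + gzip payload delivered by a CloudWatch Logs subscription."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    # Flatten into one record per log line, ready to index elsewhere.
    return [
        {
            "log_group": payload["logGroup"],
            "log_stream": payload["logStream"],
            "timestamp": entry["timestamp"],
            "message": entry["message"],
        }
        for entry in payload["logEvents"]
    ]


def make_sample_event(messages):
    """Build a fake subscription event for local testing (hypothetical data)."""
    payload = {
        "logGroup": "/aws/lambda/example",
        "logStream": "example-stream",
        "logEvents": [
            {"id": str(i), "timestamp": 1498000000000 + i, "message": m}
            for i, m in enumerate(messages)
        ],
    }
    data = base64.b64encode(gzip.compress(json.dumps(payload).encode("utf-8")))
    return {"awslogs": {"data": data.decode("ascii")}}
```

Calling `decode_cloudwatch_logs_event(make_sample_event(["START", "boom"]))` gives two flat records that a real handler could then send on in bulk.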

This is what drives me nuts about AWS, but is genius on their part - they've got you to spend more $$ on ElasticSearch and Lambda to overcome the fact that one of their other products just doesn't work very well.

I don't think this is intentional. My experience with AWS is that they aim to constantly improve products and create hosted services that are cheaper than if you frankensteined the same thing yourself. They don't always succeed, but I'm willing to give them the benefit of the doubt for how bad CloudWatch truly is, and just assume either they're blind to the pain because they know what not to do internally, or they're just really hamstrung in trying to modify that feature since so much relies on it.

Hopefully this isn't just Stockholm syndrome speaking though...

We built IOpipe[1] to address these issues by offering our own wrapper[2] that sends telemetry to our service. IOpipe aggregates metrics and errors, and allows the creation of alerts with multiple rules per alert.

[1] - https://iopipe.com [2] - https://github.com/iopipe/iopipe/

+1 to the folks at IOpipe. A really cool product that gives you very interesting visibility into your Lambda function executions!

FYI: except on very wide screens that landing page copy is barely readable over the background image, and with a narrow window it's also obscured by the header and navigation.

Have you tried out Azure Functions? It has pretty good features for continuous integration, a CLI where you can run locally, and a really nice monitoring experience. Obviously it's probably not worth migrating an existing project, but it might be useful for future projects. Plus, we have a Serverless Framework plugin.

Here's a talk that walks through some of our features: https://www.youtube.com/watch?v=TgB-fs1hwlw&t=18s

Disclosure: I'm a Program Manager on Azure Functions.

I used Azure Functions (when I was a MSFT employee, in fact), and found it to be unusable. A list of complaints:

* setup is 100% completely clicky-clicky UI driven, which was a huge pain to scale. instantiation of a Function on behalf of a developer for production use was a huge time sink

* it's clearly a thin veneer on Azure Web Services, and the abstractions leak badly in the portal (deployment credentials, for example)

* the web UI breaks completely and mysteriously if you enable authentication

* management of service princi- uh, I mean, Azure AD Applications was weird, and the (internal to MSFT, I suspect) permissions model to the Graph API was a huge barrier to ease of use

* management of NPM packages required me to start a terminal session in the UI and run commands manually, which was a huge turnoff (and had to be repeated ad nauseam with every new Function created)

* the configuration files for the runtime are utterly undocumented, with the sole exception of the bits used to plug Azure inputs/outputs together. this makes automating things exceedingly difficult. I recall there even being a magic value in the topmost config file

* the edit-commit-push-test cycle was VERY slow, with new commits sometimes taking tens of minutes to "appear" in my function

* I never found a way to run it locally, making the previous point that much worse

* log output is very difficult to find, and can live in a few different places. I spent too much time hunting for errors, especially things like syntax errors that make the runtime itself go kaboom. This was the thing that really killed it for me; if I had an error that resulted in anything but a "clean" return, it was torture trying to figure out where I'd missed the paren.

Thanks for the detailed feedback! I think you probably used Functions when it was much newer, and I think we’ve actually addressed all of your issues.

- You can create a Function App via ARM/CLI/etc., you can write functions without ever touching the portal. See https://docs.microsoft.com/en-us/azure/azure-functions/funct.... You can also now use Visual Studio to author C# functions: https://docs.microsoft.com/en-us/azure/azure-functions/funct...

- It’s true that Functions is built on App Service, but I see that as an advantage. You get all the great features of Continuous Integration, custom domains, automated deployment, etc.

- Indeed, the portal does not do well when auth is enabled and all routes are protected. The problem is that the portal calls admin APIs that are also protected, so it fails. We now have better error messages for this, and we’re tracking this bug: https://github.com/Azure/azure-functions-ux/issues/499

- The Graph API issue is probably not specific to Functions, but it is a bit easier with the Authentication/Authorization feature. Can you provide more detail?

- You can install npm packages at the "root" of your Function and not reinstall them for each Function, just like a normal Node.js app - it walks the directories.

- Our documentation is much better now, and we even have documentation for all bindings in the portal. We also have much better conceptual docs on bindings, see https://docs.microsoft.com/en-us/azure/azure-functions/funct.... We’d welcome any specific feedback on docs that are missing.

- CI should be faster now, it usually takes about 2-3 minutes for commits to show up. It’s fast enough that I’ve demo’d it.

- You can now run locally and debug using the Azure Functions Core Tools (npm i -g azure-functions-core-tools; func init; func host start). See docs: https://docs.microsoft.com/en-us/azure/azure-functions/funct.... This is something that our users always praise us for. We support C# debugging with Visual Studio and JavaScript debugging with VSCode.

- Logs definitely weren't great. Initially, they always went to table storage, but the ones you see streaming in the portal get written to disk to enable the realtime portal stream - they are only written to disk when you're in the portal, so they are "sometimes" there. The good news is that we've tightly integrated Application Insights, which means logs are easy to find. It's easy to alert on failed functions. You can see perf and metric data all in one place without log parsing. For a demo, go to the 6 minute mark of this video: https://www.youtube.com/watch?v=TgB-fs1hwlw&t=6m

Update: the portal does work if Auth is enabled--the bug was fixed a couple of months ago and we didn't close the issue. See https://github.com/Azure/azure-functions-ux/issues/499

Thanks for such a comprehensive follow-up. It's true that it has been a while, and it really does sound like you've addressed nearly all of the issues I had. (btw, the undocumented file was host.json, which it appears may be better documented now)

Chalice [1] makes hooking up lambda to HTTP endpoints easy, provided you don't mind using AWS API Gateway (and using python).

I use it to handle a contact form on a static web site. It works really well.

[1] https://github.com/awslabs/chalice

1. I think a framework / tool will ensure this isn't an issue

2. Dear heavens yes. I ended up building a wrapper (similar to what you did to address 3) that handles logging, etc for any internal events. Everything else is a pass / fail check

3. Also had to build a wrapper. Context / Callback params are confusing

4. I wouldn't use Serverless unless you need it. Try something smaller. Apex is a nice, simple start. Shameless plug: I built a deployment tool because serverless wouldn't work for us and I wanted something in node (no binary like Apex - integrate into our build process) [1]

[1]: https://github.com/Prefinem/lambdify

How do you make sure it will run forever? What about breaking changes and discontinued services? Can you move to another vendor without changes to your codebase?


We use AWS Lambda to process Hearthstone replay files.

My #1 concern with it went away a while back when Amazon finally added support for Python 3 (3.6).

It behaved as advertised: Allowed us to scale without worrying about scaling. After a year of using it however I'm really not a big fan of the technology.

It's opaque. Pulling logs, crashes and metrics out of it is like pulling teeth. There are a lot of bells and whistles that are just missing. And the weirdest thing to me is how people keep using it to create "serverless websites" when that is really not its strength -- its strength is in distributed processing; in other words, long-running CPU-bound apps.

The dev experience is poor. We had to build our own system to deploy our builds to Lambda. Build our own canary/rollback system, etc. With Zappa it's better nowadays although for the longest time it didn't really support non-website-like Lambda apps.

It's expensive. You pay for invocations, you pay for running speed, and all of this is super hard to read on the bill (which function costs me the most and when? Gotta do your own advanced bill graphing for that). And if you want more CPU, you have to also increase memory; so right now our apps are paying for hundreds of MBs of memory we're not using just because it makes sense to pay for the extra CPU. (2x your CPU to 2x your speed is a net-neutral cost, if you're CPU-bound).
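The net-neutral claim in parentheses can be checked with some quick arithmetic. This is a sketch in Python under stated assumptions: both rates are assumed list prices, not figures from this thread, billing-increment rounding is ignored, and the function name `monthly_cost` is made up for the example.

```python
# Assumed list prices (not taken from the thread); check your own bill.
PRICE_PER_GB_SECOND = 0.00001667
PRICE_PER_REQUEST = 0.20 / 1000000


def monthly_cost(invocations, memory_mb, duration_s):
    """Estimate cost as GB-seconds consumed plus a flat per-request charge."""
    gb_seconds = invocations * (memory_mb / 1024) * duration_s
    return gb_seconds * PRICE_PER_GB_SECOND + invocations * PRICE_PER_REQUEST


# A CPU-bound task: doubling memory doubles CPU and (ideally) halves the runtime.
baseline = monthly_cost(1000000, memory_mb=512, duration_s=10.0)
doubled = monthly_cost(1000000, memory_mb=1024, duration_s=5.0)

# If the task is NOT CPU-bound, the runtime stays the same and the bill grows.
no_speedup = monthly_cost(1000000, memory_mb=1024, duration_s=10.0)
```

Here `baseline` and `doubled` come out equal because the GB-seconds are identical, while `no_speedup` is higher. That is why the extra memory is only "free" when the function actually gets faster.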

But the kicker in all this is that the entire system is proprietary and it's really hard to reproduce a test environment for it. The LambCI people have done it, but even so, it's a hell of a system to mock and has a pretty strong lock-in.

We're currently moving some S3-bound queue stuff into SQS and dropping Lambda at the same time could make sense.

I certainly recommend trying Lambda as a tech project, but I would not recommend going out of your way to use it just so you can be "serverless". Consider your use case carefully.

Can you clarify this point:

> its strength is in distributed processing; in other words, long-running CPU-bound apps.

it seems to me that's an explicit non-usecase for Lambda given it limits sessions to < 5min per invocation.

I should have been more specific, yes. I actually meant CPU-or-network-bound tasks with a predictable, sub-5min runtime.

Like I said we use it for game replay processing, so that's 5-15 second tasks that read and parse log files and hit a db and s3 with the results.

Other suitable tasks: image resizing, API chatter, bounded video transcoding, etc. Lambda is pretty good at distributed processing (as long as you can make the bill work in your favour, over hosting your own overprovisioned fleet).

Regarding the problem that you want to read the costs per function: have you considered using resource tags? This could help a bit when evaluating your costs (though not directly on the bill). https://aws.amazon.com/answers/account-management/aws-taggin...

I've read about them yeah. A friend also showed me recently the power of stuffing the advanced billing CSVs into Redshift and querying them with Tableau or something.

Personally, I think it's all fucking ridiculous the amount of effort you have to put into reading your own bill.

Hey all, my name is Chris Munns and I am currently the lead Developer Advocate for Serverless at AWS (I am part of the Lambda PM team). We really appreciate this feedback and are always looking for ways to hear about these pain points. You can email me directly at munns@amazon.com if you ever get stuck.

Thanks, - Chris

I am on a devops team trying to implement lambda by splitting pieces off from a monolithic Java app where applicable.

The biggest pain point I have is because we have multiple "environments" (such as dev, da, staging) in the same amazon account and because lambdas are global I can't limit access to resources via IAM easily without hacks.

Aka, because the same lambda will be used (but different versions and/or aliases) on all environments I can't marry the code and configuration to limit access to say RDS or elasticache or an S3 bucket per lambda.

I feel like I need a higher order primitive (a lambda group that the role + configuration can live in, and that includes the lambdas) to achieve this. I realize API Gateway has the concept of stages, but currently the idea is for some lambdas to be invoked directly by the monolithic app or via SNS/SQS async.

Otherwise I could namespace my lambda functions which is hacky and make DevFooBar, StageFooBar, etc.

Currently we plan to split off our environments into separate AWS accounts.

Hi Runamok,

Today, given that aliases/versions do not have any sort of different permissions, you are probably best off either running completely different stacks of resources or using the multiple account model. With AWS Organizations these days it's not that hard to run multiple environments across accounts.

We did a webinar on some of this a few months back, the slides here might be useful to you: https://www.slideshare.net/AmazonWebServices/building-a-deve...

It heavily leverages AWS's tools, but you could create similar practices using 3rd party frameworks and CI/CD tools as well.



To get around the IAM issue we deployed the lambdas with a naming convention like:

prod_lambda_name, staging_lambda_name, dev_lambda_name

Then the IAM policies are written with resource access to prod_*, staging_*, etc.

This lets us give developers full permissions to create dev ones and modify the other ones, while the prod_ ones are all controlled by a smaller group of people.
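A sketch of what a prefix-scoped policy along these lines could look like, generated in Python. The account ID, the wildcard region, the helper name `lambda_policy_for_prefix`, and the choice of actions are all placeholders for illustration; the commenter's real policies are not shown in this thread.

```python
import json


def lambda_policy_for_prefix(prefix, actions, account_id="123456789012", region="*"):
    """Build an IAM policy document limited to functions named '<prefix>_*'.

    account_id and region are placeholder values, not real ones.
    """
    resource = "arn:aws:lambda:{}:{}:function:{}_*".format(region, account_id, prefix)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": resource}
        ],
    }


# Developers get broad rights on dev_ functions only (assumed action list).
dev_policy = lambda_policy_for_prefix(
    "dev",
    ["lambda:CreateFunction", "lambda:UpdateFunctionCode", "lambda:InvokeFunction"],
)
dev_policy_json = json.dumps(dev_policy, indent=2)
```

A separate policy built with the `prod` prefix would then be attached only to the smaller group that controls production.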

It's a bit hacky but it works well enough.

Would be nicer to grant access by stages.

I know we're discussing AWS here, but I find Azure handles this nicely. It has a 'slots' concept, so you can have a slot for production, another for QA, and any more you need. The really great thing is the ability to swap slots - so once you've finished testing a new QA build, you swap the QA and production slots, so what was running in QA is now running in production.

We use Lambda for 100% of our APIs some of which get over 100,000 calls per day. The system is fantastic for micro services and web apps. One caveat, you must use a framework like Serverless or Zappa. Simply setting up API Gateway right is a hideous task and giving your function the right access level isn’t any fun either. Since the frameworks do all that for you it really makes life easier.

One thing to note. API Gateway is super picky about your response. When you first get started you may have a Lambda that runs your test just fine but fails on deployment. Make sure you troubleshoot your response rather than diving into your code.
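To make "troubleshoot your response" concrete, here is a small sketch of a checker in Python. It assumes the Lambda proxy integration, where API Gateway expects an object with an integer `statusCode`, an optional `headers` object, and a `body` that is a string. The function name `check_proxy_response` is hypothetical and the checks are not exhaustive.

```python
def check_proxy_response(resp):
    """Return a list of problems found in a Lambda proxy-integration response."""
    if not isinstance(resp, dict):
        return ["response must be a JSON object (dict)"]
    problems = []
    status = resp.get("statusCode")
    # bool is a subclass of int in Python, so rule it out explicitly.
    if not isinstance(status, int) or isinstance(status, bool):
        problems.append("statusCode must be an integer")
    if "body" in resp and not isinstance(resp["body"], str):
        problems.append("body must be a string (serialize JSON yourself)")
    if not isinstance(resp.get("headers", {}), dict):
        problems.append("headers must be an object")
    return problems
```

Running this over a handler's return value in a unit test catches the common case of returning a dict as `body`, which works in a local test but fails once deployed behind the gateway.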

I saw some people complaining about using an archaic version of Node. This is no longer true. Lambdas support Node V6 which, while not bang up to date, is an excellent version.

Anyway, I can attest it is production ready and at least in our usage an order of magnitude cheaper.

I worked on a project where the architect wanted to use Lambdas for the entire solution. This was a bad choice.

Lambdas have a lot of benefits - for occasional tasks they are essentially free, the simple programming model makes them easy to understand in teams, you get Amazon's scaling and there's decent integration with caching and logging.

However, especially since I had to use them for the whole solution, I ran into a ton of limitations. Since they are so simple, you have to pull in a lot of dependencies, which negates a lot of the ease of understanding I mentioned before. The dependencies are things like Amazon's API Gateway, AWS Step Functions, and the AWS CLI itself, which is pretty low-level. So the application logic is pretty easy, but now you are dealing with a lot of integration devops. API Gateway is pretty clunky and surprisingly slow. Lambdas shut themselves down, and restarting is slow. Step Functions have a relatively small payload limit that needs to be worked around. Etc. So use them sparingly!

1. Use a framework for deploying your lambda functions. There are a few that will manage the API Gateway for you.

2. Don't put the Lambda inside a VPC if you want lower response times

3. Step Functions don't seem ready for prime time as far as I can tell. (This might have changed in the last couple of months)

4. Lambda Functions should be microservices. Small and lean.

5. There is a limit on resources for CloudFormation, so at about 20-30 functions with API Gateway on the Serverless framework, you will hit a limit and can't add any more (other deployment tools which don't use CloudFormation shouldn't have an issue)

6. Want more CPU, add more RAM

What makes you say Step Functions isn't ready for prime time? We've been using it (and SWF which it is based off) for about a year at decent scale and been generally very happy with it.

When I initially set them up, there was no way to edit or delete them. I haven't messed with them since, although I did just check and there aren't any step functions in my account, so it looks like they deleted the old ones.

It may be time to test them out again. I just got bitten pretty badly the last time I implemented them and lost about a week's worth of work because they were unusable.

> Don't put the Lambda inside a VPC if you want lower response times

Response times are fine inside a VPC... People keep saying gateway is slow but it isn't from my experience...

API Gateway adds around 200ms to each call for us based on our testing plus another 100ms if it’s inside a VPC. What region are you in? Our VPC has a NAT Gateway so that our Lambda functions can talk to the internet as well.

I am by no means an expert and am reporting what watching the time differences between Lambda reporting and our network calls shows.

Edit: this is by no means a deal breaker or anything. Just something that shocked me when I first noticed it.

(crap I just typed stuff out and pressed F5...)

I haven't tried with a NAT Gateway.

This is what I know.

If the response is small, the request duration is small.

Cloudfront > Gateway > Lambda > RDS (PostgreSQL)


Response size | Request duration
1k | 20ms-60ms
10k | 50ms-90ms
100k | 150ms-200ms
350k | 200ms-450ms

That's a rough gauge of what I've experienced.

I think the throughput on the gateway is the bottleneck.

Just a follow up: after the issue last week with AWS Lambda in East-1 (Thursday, I believe), VPC is no longer causing any additional overhead.

I am not sure what happened, but I had to move a couple of functions inside the VPC and our response times have remained the same.

Just hit one of our test endpoints

Test #1 Lambda run time (140 ms) Total waiting time (354ms) Total time (358ms)

Test #2 Lambda run time (300 ms) Total waiting time (490ms) Total time (567ms)

Test #3 Lambda run time (139 ms) Total waiting time (479ms) Total time (485ms)

This is for a 20kb payload single request.

Stack: Custom Domain -> Cloudfront -> API Gateway -> VPC -> Lambda -> NAT Gateway -> ElasticSearch

I'm curious why you have a NAT gateway between your Lambda and Elasticsearch?

Lambda is inside the VPC which means it can’t connect to the outside world without a NAT Gateway.

Are you paying for ssl round trips in your tests? We haven't seen such slow response times with lambda or the gateway.

I don't believe so. I am using Chrome network data and the SSL has already been downloaded / cached.

- Monitoring & debugging is a little hard

- CPU power also scales with Memory, you might need to increase it to get better responses

- Ability to attach many streams (Kinesis, Dynamo) is very helpful, and it scales easily without explicitly managing servers

- There can be an overhead: your function gets paused (if no data is incoming) or can be killed nondeterministically (even if it runs all the time or once per hour), which causes a cold start, and cold starts are very bad for Java

- You need to make your JARs smaller (50MB), you cannot just embed anything you like without careful consideration

@CSDude you can check out this small tool I've written for debugging Lambda and API Gateway integration.

https://github.com/AlexanderC/lambdon (I know the name sucks)

Also @chetanmelkani, as a hint: if you are using the NodeJS runtime, the optimal setting from an execution time and cost efficiency perspective is 512mb of memory ;) It gives about a 2x performance boost over the 128mb configuration.

I deployed a couple AWS lambda endpoints for very low-volume tasks using claudia.js - Claudia greatly reduces the setup overhead for sane REST endpoints. It creates the correct IAM permissions, gateway API and mappings.

Claudia.js also has an API layer that makes it look very similar to express.js versus the weird API that Amazon provides. I would not use lambda + JS without claudia.

For usage scenarios, one endpoint is used for a "contact us" form on a static website, another we use to transform requests to fetch and store artifacts on S3. I can't speak toward latency or high volume but since I've set them up I've been able to pretty much forget about them and they work as intended.

Oh, Claudia sounds interesting. I'm a mobile developer who's used express/node before. The hardest thing for me when I was cobbling a backend together with lambda/dynamodb was understanding the permission system and debugging it when I configured it wrong.

Lots of the examples and articles around this process are out of date and AWS's web front end can be painful to deal with. That said, when everything was set up, it was pretty straightforward to maintain.

Do you have any links to good tutorials on Claudia? I'd love to set up a contact-me form using lambda for a project I'm working on.

Also, do you know how Claudia compares to stuff like serverless.js?

Hey, one of the guys from Claudia here.

You can find tutorials and examples on:

- Claudia.js website - https://claudiajs.com

- Claudia GitHub examples - https://github.com/claudiajs/example-projects

The purpose of Claudia.js is just to make it super easy to develop and deploy your applications on AWS Lambda and API Gateway, and to ease working with DynamoDB, AWS IoT, Alexa and so on.

There are two additional libraries: Claudia API Builder and Claudia Bot Builder, to ease API and chat bot development and deployment.

Regarding the contact form - the best approach is to create a single service that will handle all the contact form requests. At that point, you can either connect it to DynamoDB, or even call some other data storage / service.

Both Serverless and Claudia have their points where they shine. For a better understanding of their comparison, you can read about it in the Claudia FAQ - https://github.com/claudiajs/claudia/blob/master/FAQ.md#how-...

Awesome answer, thanks!

We have a number of different use cases at FundApps, from the obvious, like automated tasks, automatic DNS, cleaning up AMIs etc., to the more focused importing and parsing of data from data sources. This is generally a several-times-a-day operation, so Lambda was the right choice for us. We also use API Gateway with lambdas; it's a small API, about 2 requests per second on average but very peaky during business hours, and its response and uptime have been excellent.

Development can be tricky. There are a lot of all-in-one solutions like the Serverless framework; we use the Apex CLI tool for deploying and Terraform for infra. These tools offer a nice workflow for most developers.

Logging is annoying; it's all CloudWatch, but we use a lambda to send all our CloudWatch logs to sumologic. We use CloudWatch for metrics, however we have a grafana dashboard for actually looking at those metrics. For exceptions we use Sentry.

Resources have bitten us the most: suddenly there is not enough memory because of the payload from a download. I wish Lambda allowed for scaling on a second attempt so that you could bump its resources. This is something to consider carefully.

Encryption of environment variables is still not a solved issue, if everyone has access to the AWS console, everyone can view your env vars, so if you want to store a DB password somewhere, it will have to be KMS, which is not a bad thing, this is usually pretty quick, but does add overhead to the execution time.

I'm running Rust on Lambda at the moment for a PBE board gaming service I run. I can't say it runs at huge scale though, but using Lambda has provided me with some really good architectural benefits:

* Games are developed as command line tools which use JSON for input and output. They're pure so the game state is passed in as part of the request. An example is my implementation of Lost Cities[1]

* Games are automatically bundled up with a NodeJS runner[2] and deployed to Lambda using Travis CI[3]

* I use API Gateway to point to the Lambda function, one endpoint per game, and I version the endpoints if the game data structures ever change.

* I have a central API server[4] which I run on Elastic Beanstalk and RDS. Games are registered inside the database and whenever players make plays, Lambda functions are called to process the play.

I'm also planning to run bots as Lambda functions similar to how games are implemented, but am yet to get it fully operational.

Apart from stumbling a lot setting it up, I'm really happy with how it's all working together. If I ever get more traction it'll be interesting to see how it scales up.

[1]: https://github.com/brdgme/lost-cities

[2]: https://github.com/brdgme/lambda/blob/master/index.js

[3]: https://github.com/brdgme/lost-cities/blob/master/.travis.ym...

[4]: https://github.com/brdgme/api

I'm not sure if you care, but it's possible to use Neon to write native Node modules in Rust. No need for JS glue code and better performance than spawning a process.

Will check that out, cheers!

It gets the job done but the developer experience around it is awful.

Terrible deploy process, especially if your package is over 50mb (then you need to get S3 involved). Debugging and local testing is a nightmare. Cloudwatch Logs aren't that bad (you can easily search for terms).

We have been using Lambdas in production for about a year and a half now, to do 5 or so tasks, ranging from indexing items in Elasticsearch to small cron cleanup jobs.

One big gripe around Lambdas and integration with API Gateway is that they totally changed the way it works. It used to be really simple to hook up a lambda to a public facing URL so you could trigger it with a REST call. Now you have to do this extra dance with configuring API Gateway per HTTP resource, which complicates the Lambda code side of things. Sure, with more customization you have more complexity associated with it, but the barrier to entry was significantly increased.

I've been using AWS Lambda on a side project (octodocs.com) that is powered by Django and uses Zappa to manage deployments.

I was initially attracted to it as a low-cost tool to run a database (RDS) powered service side project.

Some thoughts:

- Zappa is a great tool. They added async task support [1] which replaced the need for celery or rq. Setting up https with let's encrypt takes less than 15 minutes. They added Python 3 support quickly after it was announced. Setting up a test environment is pretty trivial. I set up a separate staging site which helps to debug a bunch of the orchestration settings. I also built a small CLI [2] to help set environment variables (heroku-esque) via S3 which works well. Overall, the tooling feels solid. I can't imagine using raw Lambda without a tool like Zappa.

- While Lambda itself is not too expensive, AWS can sneak in some additional costs. For example, allowing Lambda to reach out to other services in the VPC (RDS) or to the Internet requires a bunch of route tables, subnets and a NAT gateway. For this side project, this currently costs way more than running and invoking Lambda.

- Debugging can be a pain. Things like Sentry [3] make it better for runtime issues, but orchestration issues are still very trial and error.

- There can be overhead if your function goes "cold" (i.e. infrequent usage). Zappa lets you keep sites warm (additional cost), but a cold start adds a couple of seconds to the first-page load for that user. This applies more to low volume traffic sites.
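The cold/warm difference in the last point can be imitated locally with a small Python sketch. Module-level state survives between invocations of a warm container, so expensive setup only runs on the first call. The 50 ms sleep stands in for slow initialisation and, like all the names here, is an assumption for illustration rather than a measurement of Lambda or Zappa.

```python
import time

_expensive_client = None  # module-level, so it is reused while the container stays warm


def get_client():
    """Create the (pretend) expensive resource once and reuse it afterwards."""
    global _expensive_client
    if _expensive_client is None:
        time.sleep(0.05)  # stand-in for slow setup such as opening a DB connection
        _expensive_client = {"created_at": time.time()}
    return _expensive_client


def handler(event, context=None):
    started = time.perf_counter()
    cold = _expensive_client is None  # True only on the first call in this process
    get_client()
    elapsed_ms = (time.perf_counter() - started) * 1000
    return {"cold_start": cold, "elapsed_ms": elapsed_ms}
```

Calling `handler({})` twice in the same process shows the first call paying the setup cost and the second skipping it, which is the effect a keep-warm ping tries to preserve.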

Overall: It's definitely overkill for a side project like this, but I could see the economies of scale kicking in for multiple or high volume apps.

[1]: https://blog.zappa.io/posts/zappa-introduces-seamless-asynch...

[2]: https://github.com/cameronmaske/s3env

[3]: https://getsentry.com/

Zappa author here, thank you for your kind recommendation!

Lots more features in the pipeline, too!

I usually default to using Flask for Python when building APIs and using Zappa to deploy it has been a wonderful experience. Easy to develop locally with a Flask web server and then you just deploy with Zappa.

I haven't used it in a huge production environment, but it's definitely my go to way of handling APIs in side projects and other related things.

Thanks for a great tool! The API Gateway console is inscrutable, so I'm glad Zappa just takes care of all that for you.

Waiting for those features. We love Zappa too.

Nice to see Zappa being used and the async task looks very interesting indeed!

I've been using it for heavy background jobs for http://thefeed.press and overall, I think it's pretty OK (I use NodeJs). That said, here are a few things:

- No straight way to prevent retries. (Retries can crazily increase your bill if something goes wrong)

- API gateway to Lambda can be better. (For one, Multipart form-data support for API gateway is a mess)

- (For NodeJs) I don't see why the node_modules folder should be uploaded. (Google cloud functions downloads the modules from the package.json)
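On the retries point: one way to approximate "no retries" for asynchronous invocations is to make sure the handler never ends in an error, since a retry is only triggered by a failed invocation. A minimal Python sketch of that idea, assuming it is acceptable for a failed job to be logged rather than re-run; `swallow_errors` and `handler` are hypothetical names, not part of any AWS API:

```python
import functools
import logging

logger = logging.getLogger(__name__)


def swallow_errors(func):
    """Wrap a Lambda handler so an exception is logged instead of raised.

    Assumption: a failed asynchronous invocation is retried only when the
    handler ends in an error, so returning normally stops the retry.
    """
    @functools.wraps(func)
    def wrapper(event, context):
        try:
            return func(event, context)
        except Exception:
            logger.exception("handler failed; not re-raising to avoid a retry")
            return {"ok": False}
    return wrapper


@swallow_errors
def handler(event, context):
    # Hypothetical job body: fails when the event has no "job" key.
    return {"ok": True, "job": event["job"]}
```

The trade-off is that failures now only show up in the logs, so this probably needs an alarm on the logged error to be safe.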

> I don't see why the node_modules folder should be uploaded

So you don't end up with a leftpad-like event. Control and ship your dependencies.

But here's the fun part - if you want to just upload your code and make it download+deploy dependencies, you can do it using your own lambda function :-)

> I don't see why the node_modules folder should be uploaded.

Exactly! Especially if you're using modules that include some sort of binary and build your function on macOS it's a pain -- I ended up using a Docker-based workflow to get the correct binaries into the node_modules.

Or you can use CI to deploy.

Agreed regarding downloading of libraries. I tried to get a python torrent library working in lambda the other day, and I had to manually dig through my /usr/lib to find the right shared objects. Should be much easier than that, I should be able to place a language appropriate set of requirements in the zip root and it should do what it needs to do.

> should be able to place a language appropriate set of requirements in the zip root and it should do what it needs to do.

Let's say you depend on libfoo. It can be obtained via system package, built from sources (with 3 different feature switches), or your language's package can simulate the effect without the native libfoo but it will take longer. Who knows what "it needs to do"?

This is not something anyone but you can answer. There could be some nice wrapper that warns you about libraries you use, but you have to make the decision.

What do you mean by retries? Also, why would you upload node_modules? Why don't you build it first and only upload what is required...

> What do you mean by retries

If there is an error, the function will be retried: http://docs.aws.amazon.com/lambda/latest/dg/retries-on-error...

> Why would you upload node_modules

It is required to: http://docs.aws.amazon.com/lambda/latest/dg/nodejs-create-de...

The node_modules one is rubbish IMO, I think their doco is absurd. You can just compile things down using webpack or such, this makes the package smaller and can be optimized for faster execution as well.

The retry one is new to me, need to read more about it.

Thanks for the info.

Does webpack work for node.js code? I was under the impression that webpack was specifically for front end code.

Node.js code works for frontend code.

I've always built my lambdas before upload and never upload my node_modules folder. It also means that when you get an error, debugging can be tricky.

I'd recommend using a framework such as the Serverless Framework[1], Chalice[2], Dawson[3], or Zappa[4]. As with any other (web) development project, using a framework will alleviate a big part of the pain involved with a new technology.

Anyways, I'd recommend starting from learning the tools without using a framework first. You can find two coding sessions I published on Youtube[5][6].

[1]: https://serverless.com/

[2]: https://github.com/awslabs/chalice

[3]: https://dawson.sh/

[4]: https://github.com/Miserlou/Zappa

[5]: https://www.youtube.com/watch?v=NhGEik26324

[6]: https://www.youtube.com/watch?v=NlZjTn9SaWg

At Annsec we are all in on serverless infrastructure and use Lambdas and Step functions in two development teams on a single backlog. Extensibility of a well-written lambda is phenomenal. For instance we have higher abstraction lambdas for moving data. We make them handle several input events and keep them as pure as possible. Composing these lambdas later in Step functions is true developer joy. We unit test them locally and for E2E-tests we have a full clone of our environment. In total we build and manage around 40 lambdas and 10 step functions. Monitoring for failure is conducted using Cloudwatch alarms, Ops Genie and Slack bots. Never been an issue. In our setup we are aiming for an infrastructure that is immutable and cryptographically verifiable. It turned out to be a bit of a challenge. :)

Best to keep your workloads as small as possible, cold starts can be very bad, depending on the type of project. Been using mostly node myself, and it's worked out well.

One thing to be careful of: if you're targeting input into DynamoDB table(s), then it's really easy to flood your writes. Same goes for SQS writes. You might be better off with a data pipeline, and slower progress. It really just depends on your use case and needs. You may also want to look at running tasks on ECS; depending on your needs that may go better.

For some jobs the 5-minute limit is a bottleneck, for others it's the 1.5 GB of memory. It just depends on exactly what you're trying to do. If your jobs fit in Lambda constraints, and your cold start time isn't too bad for your needs, go for it.

> Best to keep your workloads as small as possible, cold starts can be very bad, depending on the type of project.

Here's a recent, interesting article on the topic that quantifies some of this: https://read.acloud.guru/does-coding-language-memory-or-pack...

Thanks for the link... interesting how quick python is to start...

You might want to have a look at Serverless, a framework to build web, mobile and IoT applications with serverless architectures using AWS Lambda and even Azure Functions, Google CloudFunctions & more. Debugging, maintaining & deploying multiple functions gets easier.

Serverless: https://github.com/serverless/serverless


- works as advertised, we haven't had any reliability issues with it

- responding to Cloudwatch Events including cron-like schedules and other resource lifecycle hooks in your AWS account (and also DynamoDB/Kinesis streams, though I haven't used these) is awesome.


- 5 minute timeout. There have been a couple times when I thought this would be fine, but then I hit it and it was a huge pain. If the task is interruptible you can have the lambda function re-trigger itself, which I've done and actually works pretty well once you set up the right IAM policy, but it's extra complexity you really don't want to have to worry about in every script.

- The logging permissions are annoying: it's easy for it to silently fail logging to Cloudwatch Logs if you haven't set up the IAM permissions right. I like that it follows the usual IAM framework but AWS should really expose these errors somewhere.

- haven't found a good development/release flow for it. There's no built-in way to re-use helper scripts or anything. There are a bunch of serverless app frameworks, but they don't feel like they quite fit because I don't have an "app" in Lambda I just have a bunch of miscellaneous triggers and glue tasks that mostly don't have any relation to each other. It's very possible I should be using one of them anyway and it would change how I feel about this point.
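The re-trigger trick for the 5 minute timeout can be sketched roughly like this in Python. It assumes the work is a list of items with a resumable cursor; `process_items`, `handle_item` and the event shape are hypothetical. `get_remaining_time_in_millis()` and `function_name` come from the Lambda context object, and the execution role would need permission to invoke the function itself:

```python
import json


def process_items(event, context, invoke, safety_margin_ms=30_000):
    """Process event["items"] from event["cursor"]; re-invoke when time runs low.

    `invoke` is a hypothetical callable that takes the next payload. `context`
    only needs get_remaining_time_in_millis(), as on the real Lambda context.
    """
    items = event["items"]
    cursor = event.get("cursor", 0)
    while cursor < len(items):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            # Hand the rest of the work to a fresh invocation and stop.
            invoke({"items": items, "cursor": cursor})
            return {"done": False, "cursor": cursor}
        handle_item(items[cursor])
        cursor += 1
    return {"done": True, "cursor": cursor}


def handle_item(item):
    # Placeholder for the real unit of work.
    pass


def lambda_handler(event, context):
    import boto3  # imported here so the logic above stays testable offline

    client = boto3.client("lambda")

    def invoke(payload):
        client.invoke(
            FunctionName=context.function_name,
            InvocationType="Event",  # asynchronous, fire and forget
            Payload=json.dumps(payload),
        )

    return process_items(event, context, invoke)
```

In practice you would probably pass a pointer to the work (an S3 key or a queue name) rather than the items themselves, since invocation payloads are size-limited.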

We use Terraform for most AWS resources, but it's particularly bad for Lambda because there's a compile step of creating a zip archive that terraform doesn't have a great way to do in-band.

Overall Lambda is great as a super-simple shim if you only need to do one simple, predictable thing in response to an event. For example, the kind of things that AWS really could add as a small feature but hasn't like send an SNS notification to a slack channel, or tag an EC2 instance with certain parameters when it launches into an autoscaling group.
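As an illustration of how small such a shim can be, here is a rough Python sketch of the SNS-to-Slack case. It assumes a Slack incoming webhook whose URL is stored in an environment variable called `SLACK_WEBHOOK_URL` (a made-up name), and `build_slack_payload` is a hypothetical helper:

```python
import json
import os
import urllib.request


def build_slack_payload(event):
    """Turn an SNS-triggered Lambda event into a Slack incoming-webhook body."""
    lines = []
    for record in event.get("Records", []):
        sns = record["Sns"]
        subject = sns.get("Subject") or "SNS notification"
        lines.append("*{}*\n{}".format(subject, sns["Message"]))
    return {"text": "\n\n".join(lines)}


def lambda_handler(event, context):
    # SLACK_WEBHOOK_URL is an assumed environment variable name.
    request = urllib.request.Request(
        os.environ["SLACK_WEBHOOK_URL"],
        data=json.dumps(build_slack_payload(event)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Passing data makes urllib send a POST.
    with urllib.request.urlopen(request) as response:
        return {"status": response.status}
```

Only the standard library is used, so there is nothing to bundle besides the one file.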

For many kinds of background processing tasks in your app, or moderately complex glue scripts, it will be the wrong tool for the job.

If you're already using the Node runtime, https://serverless.com/ fixes all of your "cons" perfectly. Terraform isn't a great fit, in my experience.

I started off doing manual build / deploy for a project and it was a total pain in the ass. From packaging the code, to versioning, rollbacks, deploy. Then that doesn't even include setting up API Gateway if you want an endpoint for the function.

Since then I've been using Serverless for all my projects and it's the best thing I've tried thus far. It's not perfect, but now I'm able to abstract everything away as you configure pretty much everything from a .yml file.

With that said, there are still some rough spots with Lambda:

1) Working with env vars. The default is to store them in plain text in the Lambda config. Fine for basic stuff, but I didn't want that for DB creds. You can store them encrypted, but then you have to set up logic to decrypt them in the function. Kind of a pain.

2) Working within a subnet to access private resources incurs an extra delay. There is already a cold start time for Lambda functions, but to access the subnet adds more time... Apparently AWS is aware and is exploring a fix.

3) Monitoring could be better. Cloudwatch is not the most user friendly tool for trying to find something specific.
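For point 1, the decrypt logic does not have to be large. A minimal Python sketch, assuming the variable holds a base64-encoded KMS ciphertext and the execution role is allowed to call `kms:Decrypt`; `decrypted_env` and `DB_PASSWORD` are hypothetical names:

```python
import base64
import os

_cache = {}


def decrypted_env(name, kms_client, environ=os.environ):
    """Return the plaintext of a base64-encoded, KMS-encrypted env var.

    The result is cached at module level, so a warm container only pays
    for the KMS call once.
    """
    if name not in _cache:
        blob = base64.b64decode(environ[name])
        response = kms_client.decrypt(CiphertextBlob=blob)
        _cache[name] = response["Plaintext"].decode("utf-8")
    return _cache[name]


def lambda_handler(event, context):
    import boto3  # imported here so the helper above stays testable offline

    password = decrypted_env("DB_PASSWORD", boto3.client("kms"))  # assumed name
    return {"password_length": len(password)}
```

Passing the client in as an argument keeps the helper easy to test with a fake KMS client.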

With that said, as a whole Lambda is pretty awesome. We don't have to worry about setting up ec2 instances, load balancing, auto scaling, etc for a new api. We can just focus on the logic and we're able to roll out new stuff so much faster. Then our costs are pretty much nothing.

I'm reading here about a lot of people jumping through massive amounts of hoops to deal with a system that locks you down to a single vendor, and makes it hard to read logs or even read your own bill.

A few years back, the mantra was "hardware is cheap, developer time isn't". When did this prevailing wisdom change? Why would people spend hours/days/weeks wrestling with a system to save money which may take weeks, months or even years to see an ROI?

I think it's very dependent on the task. I do agree with you, building a large/high load system on lambda is bad for the reasons you're suggesting. I think a lot of the hoop jumping is due to experimentation (you won't know how much of a pain in the ass a piece of technology is until you need to deal with it).

We've mostly used it for small tasks that will get run once a day. It's been fantastic for that, as putting up a box to handle a sparsely (one or a couple of times a day) run task is a lot of work and is expensive.

> hardware is cheap

yes, 95% of the time this is accurate. Hardware is only a small part of what you are paying for though. You are also paying for: the actual lambda platform and completely transparent hardware support + replacement, security patching, feature updates, reliability guarantees, etc...

If anything that supports your statement. Time and people are expensive.

Most of my experience mirrors that found in other comments, so here's a few unique quirks I've personally had to work around:

- You can't trigger Lambda off SQS. The best you can do is set up a scheduled lambda and check the queue when kicked off.

- Only one Lambda invocation can occur per Kinesis shard. This makes efficiency and performance of that lambda function very important.

- The triggering of Lambda off Kinesis can sometimes lag behind the actual kinesis pipeline. This is just something that happens, and the best you can do is contact Amazon.

- Python - if you use a package that is namespaced, you'll need to do some magic with the 'site' module to get that package imported.

- Short execution timeouts mean you have to go to some ridiculous ends to process long running tasks. Step functions are a hack, not a feature IMO.

- It's already been said, but the API Gateway is shit. Worth repeating.
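The scheduled-poll workaround for SQS from the first bullet could look something like the following Python sketch. `drain_queue` and `process` are hypothetical names; `receive_message` and `delete_message` are the boto3 SQS client calls, and the client is passed in so the loop can be exercised without AWS:

```python
def drain_queue(sqs_client, queue_url, process, max_batches=10):
    """Pull messages from SQS in batches and delete each one after processing.

    `process` is a hypothetical callable for one message body. Intended to be
    called from a Lambda that a scheduled CloudWatch Events rule kicks off.
    """
    handled = 0
    for _ in range(max_batches):
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS maximum per call
            WaitTimeSeconds=1,
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # queue looks empty; stop until the next scheduled run
        for message in messages:
            process(message["Body"])
            # Delete only after successful processing, so a crash leaves the
            # message to reappear after its visibility timeout.
            sqs_client.delete_message(
                QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
            )
            handled += 1
    return handled
```

`max_batches` is there to keep a single run comfortably inside the execution timeout; anything left over waits for the next tick.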

Long story short, my own personal preference is to simply set up a number of processes running in a group of containers (ECS tasks/services, as one example). You get more control and visibility, at the cost of managing your own VMs and the setup complexity associated with that.

> You can't trigger Lambda off SQS. The best you can do is set up a scheduled lambda and check the queue when kicked off.

This kills me. I can't believe they haven't added this.

IIRC, it's a philosophical argument that prevents it - SQS is pull, Lambda triggers off push. They don't want to add a push mechanism to SQS, ergo they can't support Lambda.

Pretty good actually. We started using AWS Lambda as a tool for a cron job.

Then we implemented a RESTful API with API Gateway and Lambda. The Lambdas are straightforward to implement. API Gateway unfortunately does not have a great user experience. It feels very clunky to use and some things are hard to find and understand. (Hint: Request body passthrough and transformations).

Some pitfalls we encountered:

With Java you need to consider the warmup time and memory needed for the JVM. Don't allocate less than 512MB.

Latency can be hard to predict. A cold start can take seconds, but if you call your Lambda often enough (often looks like minutes) things run smooth.

Failure handling is not convenient. For example, if your Lambda is triggered from a Scheduled Event and the Lambda fails for some reason, it gets triggered again and again, up to three times.

So at the moment we have around 30 Lambdas doing their job. Would say it is an 8/10 experience.

For running Java in Lambda, I had to optimize it for Lambda. To decrease processing time (and in the end the bill), I got rid of all reflection, for example, and thought twice about when to initialize what and what to make static. Also, Java cold start is an issue. I fixed this by creating a Cloudwatch trigger that executes the Lambda function every minute to keep it hot. Otherwise, after some minutes of no one calling the function, it takes 10+ seconds to respond. But if you use Python, for example, you don't run into this issue. I built complete backends on top of Lambda/API Gateway/Dynamo, and having "NoOps" that also runs very cheap is a killer argument for me.

If you are triggering it every minute, may I ask how much you are paying for it per month? I have thought about it as well, as I have a Java + Spring + Hibernate app which takes much too long to start up; in fact the task execution time is less than the cold start time in my case.

Execution takes less than 100ms and I never exhausted my monthly free quota on Lambda, so I can't say.

Seems like a lot of hassle to me...

The polling trigger is a fairly common pattern from what I've seen to keep it hot.
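A rough Python sketch of the handler side of that pattern, assuming the ping comes from a CloudWatch Events schedule rule (those events carry `"source": "aws.events"` and `"detail-type": "Scheduled Event"`); `is_keep_warm_ping` and `handle_request` are hypothetical names:

```python
def is_keep_warm_ping(event):
    """True for events sent by a CloudWatch Events schedule rule."""
    return (
        isinstance(event, dict)
        and event.get("source") == "aws.events"
        and event.get("detail-type") == "Scheduled Event"
    )


def lambda_handler(event, context):
    if is_keep_warm_ping(event):
        # Return right away so the ping is billed for the minimum duration.
        return {"warmed": True}
    return handle_request(event)


def handle_request(event):
    # Placeholder for the real work.
    return {"statusCode": 200, "body": "ok"}
```

Note that a single ping generally keeps only one container warm, so concurrent requests can still hit a cold start.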

I feel that they could just charge you to keep a minimum of available instances up.

How much does it cost to receive 60x24x31 calls per month (one per minute)? I'd start from that.
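A back-of-the-envelope version of that calculation in Python, for a once-a-minute ping as described upthread. The prices, the free tier and the 100 ms billing increment are assumptions based on the published list prices at the time of this thread; the 512 MB memory size is also just an assumption, so plug in your own numbers:

```python
import math

# Assumed list prices and free tier (check the current pricing page).
PRICE_PER_REQUEST = 0.20 / 1_000_000
PRICE_PER_GB_SECOND = 0.00001667
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000


def monthly_cost(invocations, duration_ms, memory_mb, free_tier=True):
    """Estimate a month of Lambda charges, billing in 100 ms increments."""
    billed_seconds = math.ceil(duration_ms / 100) * 0.1
    gb_seconds = invocations * billed_seconds * memory_mb / 1024
    if free_tier:
        invocations = max(0, invocations - FREE_REQUESTS)
        gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    return invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND


pings_per_month = 60 * 24 * 31  # one keep-warm ping per minute
print(pings_per_month)
print(monthly_cost(pings_per_month, duration_ms=100, memory_mb=512))
print(monthly_cost(pings_per_month, duration_ms=100, memory_mb=512, free_tier=False))
```

Under these assumptions the pings alone stay inside the free tier, and even without it they come to under five cents a month.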

We use Node.JS lambda functions for real time image thumbnail generation and scraping needs, as well as mirroring our S3 buckets to another blob storage provider and a couple of periodic background jobs. It works beautifully. It's a little hard to debug at first but when it's set up, both pricing and reliability are really good for our use cases.

I think a lot of people try to use the "serverless" stuff for unsuitable workloads and get frustrated. We are running a kubernetes cluster for the main stuff but have been looking for areas suitable for lambda and try to move those.

Can you share your image resizing code on github? We're using thumbor on AWS but it is a huge PITA.

Setting up your own resizer using sharp[1] is pretty simple. Just make sure you install the module in a Lambda-compatible environment, so it can build its copy of libvips (native C library) correctly. I built and deployed my image thumbnailer on a CentOS VM.

[1]: https://github.com/lovell/sharp

After multiple side projects with Lambda (e.g. image processing services), we finally implemented it on a larger scale. Initially we started out without any framework or tool to help, because they were pretty much non-existent at that time. We created our own tool, and used Swagger a lot for working with API Gateway (because it is really bad to work with). Over time everything smoothed out and really worked nicely (except for API Gateway though). Nowadays we have everything in Terraform and Serverless templates, which really makes your life easier if you're going to build your complete infrastructure on top of AWS Lambda and other AWS APIs. There are still a bunch of quirks you have to work with, but at the end of the line: it works and you don't have to worry much about scaling.

I'm not allowed to give you any numbers; here's an old blogpost about Sketch Cloud: https://awkward.co/blog/building-sketch-cloud-without-server... (however, this isn't accurate anymore). For this use-case, concurrent executions for image uploads is a big deal (a regular Sketch document can easily consist of 100 images). But basically the complete API runs on Lambda.

Running other languages on Lambda can be easily done and can be pretty fast, because you simply use node to spawn a process (Serverless has lots of examples of that).

Let me know if you have any specific questions :-)

Hope this helps.

why is API gateway so bad?

Well, using it manually is just cumbersome. API Gateway is not specifically designed for Lambda, so it has lots of settings which you would think are just default for building your API. Using it through Cloudformation or Serverless is way easier.

Can talk only about the node.js runtime with native add-ons. We use it for various automation tasks with fewer than 100 invocations a day, where it is the most convenient solution out there for peanuts. We also use it for parsing Swagger/API Blueprint files; here we are talking 200k+ invocations a day, and it works great once we figured out logging/monitoring/error handling and the limited output (6MB). We do not use any framework, because they mostly are not flexible enough, apart from apex (http://apex.run/), which serves us well. We've hit some limits a couple of times, but as it is one invocation per request, only some calls failed and the whole service was unaffected. I see the isolation as a big benefit you get. One thing which sucks is that if it fails (and it is not your code), often you have no idea why and whether anything can be done. We use it together with AWS API Gateway, and the Gateway part is subpar. The gateway does not implement HTTP correctly (for example, a 204 always returns a body), and god forbid if you want something other than application/json. To sum it up, Lambda is great with some minor warts, and the API Gateway is OK, but I can easily imagine it being much better.

Developing Lambda is an absolutely terrible experience. Tightening the integration between CloudFormation, API-Gateway, and Lambda would really improve the situation. For example, a built-in way to map requests/responses between API-Gateway and Lambda which didn't involve a janky parsing DSL would be pretty nice.
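One possible way around the mapping DSL, assuming the Lambda proxy integration fits your use case, is to let API Gateway pass the raw request through and build the whole HTTP response in the function. A minimal Python sketch; `respond` and the `name` query parameter are hypothetical, while `statusCode`, `headers` and `body` are the keys the proxy integration expects back:

```python
import json


def respond(status, body=None, content_type="application/json"):
    """Build the dict API Gateway expects back from a proxy-integrated Lambda."""
    if body is None:
        payload = ""
    elif content_type == "application/json":
        payload = json.dumps(body)
    else:
        payload = str(body)
    return {
        "statusCode": status,
        "headers": {"Content-Type": content_type},
        "body": payload,
    }


def lambda_handler(event, context):
    # With proxy integration the raw request arrives in the event as-is.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # hypothetical query parameter
    if event.get("httpMethod") != "GET":
        return respond(405, {"error": "method not allowed"})
    return respond(200, {"hello": name})
```

Because the handler is a plain function of a dict, it can also be called locally with a hand-built event, which helps a little with the testing complaint.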

The strategy Lambda seems to suggest you implement for testing/development is pretty laborious. There's no real clear way for you to mock operations on your local system and that's a real bummer.

A lot of things you run into in Python lambda functions are also fairly unclear. Python often will compile C-extensions... I could never figure out if there was really a stable ABI or what I could do to pre-compile things for Lambda.

All of those complaints aside - once you deploy your app, it will probably keep running until the day you die. So that's a huge upside. Once you rake through the muck of terrible developer experience (which I admit, could be unique to me), the service simply works.

So, if you have a relatively trivial application which does not need to be upgraded often and needs very good up-time.. it's a very nice service.

I've only been using it for one project right now. I made an API that I can use to push security-related events to a location that a hacker couldn't access, even if they get root on a local system. I use it in conjunction with sec (Simple Event Correlator). If sec detects something, e.g. a user login, or a package install, it'll send the event to the API in AWS Gateway + Lambda. The event then gets stored in a DynamoDB table, and I use a dashing.io dashboard to display the information. It works super well. I still need to convert my awful NodeJS code to Python, but that shouldn't take long.

I do remember logging being a confusing mess when I was trying to get this started. I feel better about the trouble I had now that I see it wasn't just me. But for a side project that's very simple to use, Lambdas have been a blessing. I get this functionality without having to manage any servers or create my own API with something like Python+Flask. Having IAM and authentication built in for me made the pain from the initial set-up so worth it.

We use it with node for a bunch of things like PDF generation, asynchronous calls to various HTTP services etc. I think it's excellent.

The worst part about it by far is CloudWatch, which is truly useless.

Check out https://github.com/motdotla/node-lambda for running it locally for testing btw - saved us hours!

+1 for node-lambda. It's a lot simpler than Serverless when you just want a little help with testing/deploying.

I've used it in production and we're building our platform entirely in Serverless/AWS Lambda.

Here are my recommendations:

1) Use Serverless Framework to manage Functions, API-Gateway config, and other AWS Resources

2) CloudWatch Logs are terrible. Auto-stream CloudWatch Logs to Elasticsearch Service and use Kibana for log management

3) If using Java or other JVM languages, cold starts can be an issue. Implement a health check that is triggered on schedule to keep functions used in real-time APIs warm

Here's a sample build project I use: https://github.com/bytekast/serverless-demo

For more information, tips & tricks: https://www.rowellbelen.com/microservices-with-aws-lambda-an...

We're doing both stream processing and small query APIs using Lambda.

A few pointers (from relatively short experience):

- The best UC for Lambda seems to be stream processing where latency due to start up times is not an issue

- For user/application-facing logic the major issue seems to be start-up-times (esp. JVM startup times when doing Java or your API gets called very rarely) and API Gateway configuration management using infrastructure as code tools (I'd be interested in good hints about this, especially concerning interface changes)

- The programming model is very simple and nice, but it seems to make most sense to split each API over multiple lambdas to keep them as small as possible, or to use some serverless framework to make managing the whole app easier

- This goes without saying, but be sure to use CI and do not deploy local builds (native binary deps)

I used it for converting images to BPG format and do resizing. I really enjoyed it. Basically with Docker/lambda these days I feel like the future will be 'having code' and then 'running it' (no more ssh, puppet, kuberdummies, bash, vpc, drama). Once lambda runs a docker file it might take over middle earth. These were my issues with lambda:

1. Installing your own linux modifications isn't trivial (we had to install the bpg encoder). They use a strange version of the linux ami.

2. Lambda can listen to events from S3 (creation,deletion,..) but can't seem to listen to SQS events WTF? It seems like amazon could fix this really easily.

3. Deployment is wonky. To add a new lambda zip file you need to delete the current one. This can take up to 40 seconds (during which you would have total downtime).

For a serverless system that uses Lambda together with eg CloudFormation, Dynamo, and S3, Cognito etc - it's pretty low level and you spend a lot of time understanding, refining & debugging basic things. The end-to-end logging and instrumentation throughout the services used by your app weren't great.

Doesn't like big app binaries/JARs, and Amazon's API client libs are bloated: Clojure + Amazonica easily goes over the limit if you don't manually exclude some of Amazon's API SDKs from the package.

On the plus side, you can test all the APIs from your dev box using the cli or boto3 before doing it from the lambda.

Would probably look into third party things like Serverless next time.

I uploaded a 50 KB compiled Java artifact built from Clojure, with multiple handlers in one function. Execution time averages 15 ms.

- Cheap, especially for low usage.

- Runs fast, unless your function was frozen for not enough usage or the like

- Easy to deploy and/or "misuse"

- Debugging doesn't really work

All in all, probably the least painful thing I've used on AWS. But that doesn't necessarily mean much.

A couple of months ago, I started using AWS Lambda for a side project. The actual functions were pretty easy to code using `nodejs` and deploy with `serverless`, but the boilerplate for exposing them via an HTTP API was the real bummer. IAMs, routing and all kinds of other little things standing in the way of actual productive work. Some time after that I tried to set up GCloud Functions and to my surprise that boilerplate was minimal! Write your function and have it accessible with just a couple of commands. IMHO GCloud Functions is way more developer friendly than AWS Lambda.

I work for Gcloud functions. happy to help if you need support.

If you need to store environment variables easily and securely take a look at EC2 Parameter Store - you can fetch the relevant parameters on startup and they are automatically encrypted and decrypted using KMS for you
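A minimal Python sketch of that startup fetch, assuming the parameters are stored as SecureString values and the execution role may read them. `load_parameters`, `get_settings` and the parameter name are hypothetical; `get_parameters` with `WithDecryption=True` is the boto3 SSM client call:

```python
def load_parameters(ssm_client, names):
    """Fetch SecureString parameters, decrypted by the service via KMS.

    Note: get_parameters accepts only a limited number of names per call,
    so a long list would need to be split into batches.
    """
    response = ssm_client.get_parameters(Names=list(names), WithDecryption=True)
    missing = response.get("InvalidParameters", [])
    if missing:
        raise KeyError("missing parameters: {}".format(", ".join(missing)))
    return {p["Name"]: p["Value"] for p in response["Parameters"]}


_settings = None


def get_settings():
    """Load once per container, at first use, and reuse while it stays warm."""
    global _settings
    if _settings is None:
        import boto3  # imported here so load_parameters stays testable offline

        # "/myapp/db_password" is a made-up parameter name.
        _settings = load_parameters(boto3.client("ssm"), ["/myapp/db_password"])
    return _settings
```

Failing loudly on `InvalidParameters` is a design choice here: a missing secret is usually better caught at startup than as a confusing error later.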

A session I remember that might be of interest:

Building reactive systems with AWS Lambda: https://vimeo.com/189519556

It's been great for using as 'glue' to do small tasks like clean ups in our case or other short lived minor tasks. I haven't used it for anything major though, only for minor tasks that are easier or more convenient to do with Lambda rather than a different way. The real value comes from the integration with other AWS services, for example, for developers using DynamoDb Lambdas make a lot of maintenance of records far easier with streams events.

Have used it in production for > 2 years, mainly for ETL/Data processing type jobs which seems to work well.

We also use it to perform scheduled tasks (e.g. every hour) which is good as it means you don't have to have an EC2 instance just to run cron like jobs.

The main downside is Cloudwatch Logs, if you have a Lambda that runs very frequently (i.e. 100,000+ invocations a day) the logs become painful to search through, you have to end up exporting them to S3 or ElasticSearch.

We run many microservices on Lambda and it has been a pleasant experience for us. We use Terraform for creating, managing environment variables, and permissions/log groups/etc. We use CodeShip for testing, and validating and applying Terraform across multiple accounts and environments.

For logging, we pipe all of our logs out of CloudWatch to LogEntries with a custom Lambda, although looking at CloudWatch logs works fine most of the time.
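A forwarding Lambda of that kind mostly has to unpack the subscription payload, which arrives base64-encoded and gzip-compressed under `awslogs.data`. A rough Python sketch; `decode_log_event` and `forward` are hypothetical names, and the actual shipping step is left as a placeholder since it depends on the log service:

```python
import base64
import gzip
import json


def decode_log_event(event):
    """Unpack the payload CloudWatch Logs sends to a subscribed Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed).decode("utf-8"))


def lambda_handler(event, context):
    payload = decode_log_event(event)
    lines = [
        "{} {} {}".format(payload["logGroup"], e["timestamp"], e["message"])
        for e in payload["logEvents"]
    ]
    forward(lines)
    return {"forwarded": len(lines)}


def forward(lines):
    # Placeholder: ship the lines to whatever log service you use.
    for line in lines:
        print(line)
```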

"looking at CloudWatch logs works fine most of the time"...are you kidding? I wanna gouge my eyes out with a rusty spoon every time I have to look at CW logs :)

I've been using it for a year and a half, and I'm more than happy. The cost increases when you have a lot of load, but I'm happy to use it for these small applications that need to be always up.

I should add that you should use gordon <https://github.com/jorgebastida/gordon> to manage it; Gordon makes the process easier.


So far used only for toy/infrequent use cases and it works there well. E.g. Slack command, integration with different systems, cron style job.

Pretty great, we're using it for resizing and serving images for our clients (large media companies, banks, etc): https://hmn.md/2017/04/27/scaling-wordpress-images-tachyon/

API Gateway is a little rougher, but slowly getting there.

- We use Lambda along with Ansible to execute huge, distributed ML workloads which are (completely) serverless. Saves a lot of bucks, as ML needs huge boxes.

- For serverless APIs for querying the S3 which is a result of the above workload

Difficulties faced with Lambda(till now):

1. No way to do CD for Lambda functions. [Not yet using SAM]

2. Lambda launches in its own VPC. Is there a way to make AWS launch my lambda in my own VPC? [Not sure.]

2. You can! Although it's a bit tedious. You need to also ensure the function has EIP allocation permissions.

We have moved our database maintenance cron jobs to Lambda as well as the image resize functionality. General experience is very positive after we figured out how to use Lambda from Clojure and Java. For people worried about JVM startup times: Lambda will keep your JVM up and running for ~7 minutes after the initial request and you can achieve low latency easily.

We use Lambda extensively at https://emailoctopus.com. The develop-debug cycle takes a while, but once you're up and running, the stability is hard to beat. Just wish they'd raise that 5 minute execution limit so we can migrate a few more scripts.

It's good. We're using it for a ton of automation of various developer tasks that normally would get run once in a while (think acceptance environment spinup, staging database load, etc.).

It fails once in a while and the experience is bad, but that's mostly due to our tooling around failure states instead of the platform itself.

Hey everyone, I'm Daniel Langer and I help build lambda monitoring products over at Datadog. I see lots of you are unhappy with the current monitoring solutions available to you. If anyone has thoughts on what they'd like in a Lambda monitoring service feel free to email me at daniel.langer@datadoghq.com

I spent a while talking to the Datadog guys at the AWS Sydney summit in April and while the product was compelling, nobody could give me a straight answer on pricing for lambda - all the pricing is given in terms of per host. So I'd have to say the main thing I'm after is pricing transparency...

Been using it for about 6 months with Serverless for Node API endpoints and it's great so far!

The only negatives are:

- cold start is slow, especially from within a VPC

- debugging/logging can be a pain

- giving a function more memory (~1GB) always seems to be better (I'm guessing because of the extra CPU)

- When using it with API Gateway, the API response time is more than 2-3 seconds for a NodeJS lambda; for Java it will be more.

- Good for use cases such as:

-- cron: can be triggered using Cloudwatch events.

-- Slack command bot (API Gateway + Lambda); the only problem is the timeout.

I'm running around 4 million lambda executions per month, mostly for data processing, and I'm happy overall with the performance and ease of deployment. Debugging is hard, and frameworks are still very immature. I use the AWS SDK and C# and I'm having quite a good experience.

Lots of great comments here. I'd like to add that being limited to 512mb of working disk space at /tmp has been a stumbling block for us.

Would be really great to have this configurable along with CPU/memory.

Additionally, being able to mount an EFS volume would be very useful!

Is there any reason why you can't use S3?

Sure, and that's what I do most of the time.

However, if the S3 keys are larger than ~500mb, then it's not possible to process them in any way with Lambda since there isn't enough "scratch space" available in /tmp.

I was suggesting EFS support merely because it would allow access to arbitrarily large amounts of "local" disk to work with...

didn't know you need scratch space to work with S3 in lambda? What about something like s3fs/goofys?
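Whether scratch space is needed depends on the job. When the processing can be done sequentially, one option is to read the object as a stream in fixed-size chunks instead of downloading it to /tmp first. A minimal Python sketch, using a checksum as a stand-in for the real work; `sha256_of_stream` is a hypothetical helper, and this does not help for tools that need the whole file on disk:

```python
import hashlib
import io
from urllib.parse import unquote_plus


def sha256_of_stream(body, chunk_size=1024 * 1024):
    """Hash a file-like object chunk by chunk, never holding it all at once."""
    digest = hashlib.sha256()
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def lambda_handler(event, context):
    import boto3  # imported here so the helper above stays testable offline

    record = event["Records"][0]["s3"]  # shape of an S3 event notification
    obj = boto3.client("s3").get_object(
        Bucket=record["bucket"]["name"],
        Key=unquote_plus(record["object"]["key"]),  # keys arrive URL-encoded
    )
    # obj["Body"] is a streaming body whose read(n) returns at most n bytes.
    return {"sha256": sha256_of_stream(obj["Body"])}


# Local check with an in-memory stream standing in for the S3 body.
print(sha256_of_stream(io.BytesIO(b"hello"), chunk_size=2))
```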

- There is a surprisingly high amount of API gateway latency

- The CPU power available seems to be really weak. Simple loops in NodeJS run far slower on Lambda than on a 1.1 GHz MacBook, even after scaling the memory up to nearly 512 MB.

- Certain elements, such as DNS lookups, take a very long time.

- The CloudWatch logging is a bit frustrating. If you have a cron job it will lump some time periods into a single log file; other times they're separate. If you run a lot of them it's hard to manage.

- It's impossible to terminate a running script.

- The 5 minute timeout is 'hard', if you process cron jobs or so, there isn't flexibility for say 6 minutes. It feels like 5 minutes is arbitrarily short. For comparison Google Cloud Functions let you work 9 minutes which is more flexible.

- The environment variable encryption/decryption is a bit clunky, they don't manage it for you, you have to actually decrypt it yourself.

- There is a 'cold' start where once in a while your Lambda functions will take a significant amount of time to start up, about 2 seconds or so, which ends up being passed to a user.

- Versions of the environment are updated very slowly. Only last month (May) did AWS add support for Node v6.10, after having a very buggy version of Node v4 (a lot of TLS bugs were in the implementation)

- There is a version of Node that can run on AWS Cloudfront as a CDN tool. I have been waiting quite literally 3 weeks for AWS to get back to me on enabling it for my account. They have kept up to date with me and passed it on to the relevant team in further contact and so forth. It just seems an overly long time to get access to something advertised as working.

- If you don't pass an error result in the callback, the function will run multiple times. It won't just display the error in the logs. But there is no clarity on how many times or when it will re-run.

- There isn't an easy way to manage parallel tasks across Lambda functions, i.e. to see whether two Lambda functions are doing the same thing if they are executed at exactly the same time.

- You can create cron jobs using an AWS CloudWatch rule, which is a bit of an odd implementation: CloudWatch can create timing triggers to run Lambda functions despite being a logging tool. Overall there are many ways to trigger a Lambda function, which is quite appealing.
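On the hard timeout point above: one workaround is to stop early and hand back the unfinished work rather than get killed mid-item. A rough sketch, assuming the Python runtime, whose context object exposes `get_remaining_time_in_millis()`; `process_batch`, `process_item` and the 10 second margin are made-up names and values for illustration:

```python
def process_batch(items, context, process_item, safety_margin_ms=10_000):
    """Process items until the invocation is close to its timeout.

    Returns the items that were NOT processed, so the caller can
    re-queue them or pass them to a follow-up invocation.
    """
    remaining = list(items)
    while remaining:
        # Stop while there is still time to return cleanly.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            break
        process_item(remaining.pop(0))
    return remaining
```

What to do with the leftovers (re-invoke, push to a queue, store a cursor) depends on the job; this only shows the time check.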

The big issue is speed & latency. Basically it feels like Amazon is falling right into what they're incentivised to do: make it slower (since it's charged per 100ms).
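To make the 100ms granularity concrete, here is a back-of-the-envelope sketch. It assumes duration is rounded up to the next 100ms and billed in GB-seconds; the price is passed in as a parameter because the figure in the usage comment is an assumption that should be checked against the current pricing page:

```python
import math


def billed_ms(duration_ms, increment_ms=100):
    """Round a duration up to the next billing increment."""
    return math.ceil(duration_ms / increment_ms) * increment_ms


def cost_usd(duration_ms, memory_mb, price_per_gb_second):
    """Estimated compute cost of one invocation (request fee not included)."""
    gb_seconds = (billed_ms(duration_ms) / 1000) * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second


# Example (the price is an assumed figure, not taken from this thread):
#   cost_usd(101, 512, 0.00001667)
# A 101ms run is billed as 200ms, so small slowdowns near a boundary
# can double the compute charge for short functions.
```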

PS: If anyone has a good model/providers for 'Serverless SQL databases' kindly let me know. The RDS design is quite pricey, to have constantly running DBs (at least in terms of the way to pay for them)

RDS always running was just recently fixed...


I have witnessed all of this too. Using it for event based stuff seems the most sane, or small scripts...

I have heard that Go will be supported in the next three months and there are a lot of improvements coming. Can't wait to ditch those JS and Python wrappers.

- Decouple lambdas with queues and events, SQS, SNS and S3 events are your friends here

- Use environment variables

- Use Step Functions to create state machines

- Deploy using CloudFormation templates and the Serverless framework
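As a small illustration of the first two tips, here is a sketch of a Python handler triggered by SNS that takes its configuration from an environment variable. The `Records[].Sns.Message` layout is what SNS delivers to Lambda; `TARGET_PREFIX` and the `key` field in the message body are made-up names for this example:

```python
import json
import os


def handler(event, context):
    """Handle an SNS-triggered invocation.

    Assumes each SNS message body is a JSON object with a "key" field
    (hypothetical), and that TARGET_PREFIX (hypothetical) is set in the
    function's environment variables.
    """
    prefix = os.environ.get("TARGET_PREFIX", "processed/")
    results = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        results.append(prefix + message["key"])
    return results
```

The publisher only needs to know the topic, not the function, which is the decoupling the first tip is after.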

Since you cannot trigger Lambda from SQS, I don't recommend using Lambda with SQS. You end up having to use some other tool to trigger the Lambda invocation and poll SQS from there.

Better to use a stream that can trigger Lambdas natively - like SNS or Kinesis.

In the end it's only the web server that is serverless, you still need other servers depending on your use case, and hey, web servers aren't that hard to run anyway.

We had a bad experience: we accidentally made an error in one function that got called a lot, which blocked other functions from running. Yay for isolation!

I'd like to use Lambda@Edge to add headers to my CloudFront responses. Does anybody have any idea when this might be released from preview?

Can anyone speak to continuous deployments with Lambda, where downtime is not an option? Is it possible to run blue green deployments?

If downtime is not an option, Lambda is not the solution. Amazon will automatically shut down instances of your Lambda if they haven't been used (~5-15 minutes), and starting a fresh Lambda has noticeable latency. So, some people have resorted to periodically querying their Lambda to prevent this. However, occasionally Amazon will reset all Lambdas, which will force the hard restart.
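For reference, the periodic "keep warm" trick usually looks something like the sketch below: a CloudWatch scheduled rule invokes the function every few minutes, and the handler returns early when it sees that event. Scheduled events arrive with `"source": "aws.events"`; `do_real_work` is a placeholder for the actual logic:

```python
def do_real_work(event):
    # Placeholder for the function's actual logic (hypothetical).
    return {"handled": True, "input": event}


def handler(event, context):
    """Return early for scheduled warm-up pings, otherwise do the real work."""
    if isinstance(event, dict) and event.get("source") == "aws.events":
        # Warm-up ping from a CloudWatch scheduled rule: skip the real work.
        return {"warmed": True}
    return do_real_work(event)
```

This only keeps one container warm per ping and does nothing about the occasional full reset mentioned above.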

I'm aware of the increased startup latency for functions that haven't been recently used, but that's not the same as downtime or dropped connections.

Deploying code doesn't take a lambda function down. It just spins up the next instance with it.

You do have a cold start issue as mentioned above, but if that isn't an issue, then you shouldn't see any downtime.

We run a beta / prod system to do testing, and then for blue / green we deploy a second function and switch the API Gateway over when we are good to go. Pretty straightforward.

I tried using Lambda, but you need to set up API Gateway before using it as well. Painful logging and parameter forwarding.

That makes no sense, API Gateway can proxy requests directly to lambda, and in C# there's Nancy / Web API middleware.
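With the proxy integration, the whole HTTP request arrives as the event and the function returns a dict with `statusCode`, `headers` and `body`. A minimal Python sketch (the `name` query parameter is just an example):

```python
import json


def handler(event, context):
    """Minimal handler for an API Gateway Lambda proxy integration."""
    # queryStringParameters is null (None) when the request has no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": "hello " + name}),
    }
```

The body has to be a string, so anything structured needs to be serialised before returning.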

Lots of people saying the API Gateway is hard.

You don't need to use the API gateway.

Just talk direct to Lambda.
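A sketch of what talking directly to Lambda can look like from Python with boto3's `invoke` call. The helper name and the function name in the usage comment are made up, and the caller needs AWS credentials with permission to invoke the function:

```python
import json


def invoke_direct(lambda_client, function_name, payload):
    """Invoke a Lambda function synchronously and return its decoded JSON result.

    lambda_client is expected to be a boto3 Lambda client,
    e.g. boto3.client("lambda").
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())


# Usage (function name is hypothetical):
#   import boto3
#   result = invoke_direct(boto3.client("lambda"), "my-function", {"id": 42})
```

This suits service-to-service calls; browsers and third parties without AWS credentials still need something like API Gateway in front.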

Any comparisons to GCE and Azure's offerings for those who have used both?

I've only played with it as opposed to deploying to prod, but give Azure Functions a try too: https://azure.microsoft.com/en-us/services/functions/

Works terribly. It's basically a thin wrapper around a shoddy jar framework. All the languages supported are basically shit-farmed from the original Java one. The Java one is the only one that works half decently.

This sounds like you touched it for all of 30 seconds hated it and formed an opinion.

I use both NodeJS and C# Lambdas without issue. The support is really good.

Debugging experience isn't great but aside from that it's fast and easy to use.

C# Lambdas can call RDS and respond back in ~5ms...

(before anyone calls me out on the 5ms...)


Last image, I state 2-3 second startup and then 4ms response on a call to the database.
