
Going Serverless: Migrating an Express App to AWS API Gateway and AWS Lambda - davidjnelson
https://aws.amazon.com/blogs/compute/going-serverless-migrating-an-express-application-to-amazon-api-gateway-and-aws-lambda/
======
falcolas
Serverless via Lambda has been, frankly, a disappointment so far. The benefit
is that I'm supposed to not have to manage servers anymore, yet I find myself,
well, managing servers.

I have to build and assign IAM roles, subnets, set up application
configuration files in fixed external services (since you can't do things like
set environment variables), configure the endpoints in nginx^h^h^h^h^h API
Gateway, suffer through cold starts, fight against arcane packaging issues
with site directories and .pth files (though this is probably just a Python
thing), fight concurrency throttling, external timeouts, manage out-of-band
database connections, logging which requires even more IAM permissions...

This doesn't even consider how many times I've had to tear down and re-create
an API Gateway setup because it got "stuck" and would stop working.

Finally, WTF is up with not being able to test API Gateway -> S3 integrations
if the S3 bucket is in the same region as the gateway instance? It's a two
plus year old bug by now, and a real pain (especially when coupled with other
AWS services which require the bucket to be in the same region as the rest of
a service - such as CodePipeline).

Perhaps it's just because I'm familiar with setting up and provisioning
servers, but for basic web services and periodic tasks, Lambda is much harder
to work with most of the time. Its biggest benefit so far has been the low
cost.

~~~
noobiemcfoob
> Perhaps it's just because I'm familiar with setting up and provisioning
> servers

I'd be inclined to say that may be the case. Our codebase leans heavily on
Lambda, primarily as a wide stage in our data pipeline. And I love it.

Using a configuration file in S3 - or any other means of dropping a text file
on AWS - is barely more complicated than open("file.txt"). The true
frustration is when you need to work with files beyond your memory limits, but
even that can be mitigated by divide and conquer approaches.

That said, API Gateway is annoying in that it makes complex things
impossible and simple things take more configuration than they should. Yet, I'd
still take it over actually standing up a server.

~~~
falcolas
> is barely more complicated than open("file.txt")

Well, that depends. First, you have to set up the IAM roles to access the
bucket which owns file.txt. Then you have to ensure that "file.txt" is
properly shared between accounts if you separate production from staging from
development. Then there's three lines of code to set up a client, pull the
file, and read the contents. If it's an encrypted file, you also get to do a
bit more setup work to ensure that you're using the v4 signature in the
library, since it defaults to v3 signatures. There's also the additional error
handling and points of failure to account for.

The requests are far from instantaneous as well, so you need to set up an out-
of-band cache for the contents of the file if you plan to use it with every
call of the lambda function.

Is it obtusely hard? Of course not. Is it as simple as `open("file.txt")` (or
more appropriately `os.environ["FOO"]`)? Not by a long shot.
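The caching part, at least, is only a few lines. A minimal sketch of the pattern, not anyone's actual setup: the S3 call is left as a comment (with a made-up bucket and key) so it reads without AWS specifics, and the loader is injectable. The point is just that the fetch happens once per warm container, outside the per-request path:

```javascript
// Sketch: fetch config once per warm container, reuse it on later
// invocations. loadFn stands in for the real fetch, e.g. (hypothetical):
//   const s3 = new AWS.S3({ signatureVersion: "v4" }); // v4 for KMS-encrypted objects
//   () => s3.getObject({ Bucket: "my-config-bucket", Key: "file.txt" }).promise()
let cachedConfig = null;

async function getConfig(loadFn) {
  if (cachedConfig === null) {
    cachedConfig = await loadFn(); // only runs on a cold container
  }
  return cachedConfig;
}

module.exports = { getConfig };
```

Still a far cry from `os.environ["FOO"]`, but it keeps the S3 round trip off the hot path.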

I have a feeling that our differences in opinion have more to do with the
environment in which we run our lambda functions than what the lambda
functions are doing. Once you move beyond a single region on a single account,
things get a lot harder.

~~~
noobiemcfoob
I may be overly simplifying working with a file, but I'd still argue you're
over-complicating it :P You're hitting on a good point with Lambda though,
namely the stateless aspect. To get the most out of Lambda, you have to adopt
a more functional programming model.

> Once you move beyond a single region on a single account, things get a lot
> harder.

I could see that.

I'm working in IoT, primarily connecting lots of dumb devices to small
translation layers with minute data massaging along the way. In this type of
low-intensity environment, the flexibility Lambda (AWS in general) gives you
is tremendous.

The second performance(/latency) becomes a primary concern, it's easy to see
that trade-off become untenable.

------
twagner
@hobofan, take a look at our recent API Gateway features and see if the greedy
paths and pass-through settings provide what you're looking for (they're also
supported by CloudFormation). We tried to simplify the "configure every route"
problem, but we're always looking for additional suggestions to make API config
easier.

@jomamaxx (& others): Reducing latency (and latency variability) is a critical
goal for our team. We've improved p99 variability in API Gateway-related
latency over the last few months, and will be addressing some of the Lambda-
related latency that occurs when we refresh containers in the coming months.
Additional latency optimizations coming at all levels of the stack, including
networking.

A clarification on the discussions about "managed hosting": Lambda is not
classic web hosting; in fact, we _block_ Lambda functions from calling
socket.listen. In the Lambda model (whatever you prefer to call the broader
category), the cloud service sees every request in order to perform scaling
and load balancing on the function's behalf.

Happy to chat offline with anyone who has additional questions or feedback: DM
me on twitter @timallenwagner

~~~
scrollaway
Tim, when are we getting Python 3 support on Lambda?

Being stuck on 2.7 has been the greatest source of issues and the #1 problem
we have with Lambda. If Google/Microsoft came out _today_ with Python 3
support in their Lambda competitor, we'd move in an instant.

~~~
noobiemcfoob
I'm in the same boat. We're developing a new product with no real reason to
stay on Python 2, besides Lambda. Python 3 support would make my life so much
easier.

~~~
twagner
Yep, I hear about the lack of Python3 all the time, and I know our Python
users are waiting for it. On the roadmap, and partially complete, but
competing with some other language work at the moment.

------
Everhusk
I'm always surprised at the lack of discussion around database connection
reuse with AWS Lambda.

It's a pretty big deal that every single API call requires a new database
connection. The only solutions I've seen so far are to run a separate app to
interface with the database, or moving the connection outside of the handler
(which still has issues).

Am I missing something or is everyone just really happy to use DynamoDB?

~~~
pmontra
DynamoDB is not immune to that problem. There is no way out: if the
container is terminated, the connection goes down. The key is to keep Lambda
from stopping and removing the container (in Docker terms). If there are enough
requests the container is reused and the connection stays up, but you must
initialize it outside the handler function. An example with DynamoDB:

    
    
        const AWS = require("aws-sdk");
        // Created once per container, reused across warm invocations
        const docClient = new AWS.DynamoDB.DocumentClient();

        module.exports.theFunction = (event, context, callback) => {
          var params = {
            TableName: "table",
            Item: {
                "attribute": "value"
            }
          };
          docClient.put(params, function (err, data) {
            if (err) {
              return callback(err);
            }
            // Respond only after the write completes
            callback(null, { statusCode: 200 });
          });
        };
    

I haven't tried it, but it should work with any other DB.

However, is Lambda cost effective for services with the amount of requests
required to have their containers almost never terminated?

~~~
brianwawok
If there are enough requests to keep your container up all the time, why are
you using lambda??

~~~
pmontra
Exactly what I was asking but I didn't do the math.

Maybe it's OK for burst-shaped traffic: the first request pays a toll, the
rest are quick.

------
markonen
If anyone's interested in doing the math about the costs involved, I have two
quick notes: first, the API Gateway, at $3.50 per million requests, is
typically more expensive than the Lambda functions themselves (and comes with
no permanent free tier, unlike Lambda); second, the 128MB minimum memory for a
Lambda function can be on the tight side for an Express app, so you might want
to base your calculations on more memory.
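To make that concrete, here's a rough sketch of the arithmetic, assuming the $3.50 per million API Gateway price above plus Lambda's published rates at the time ($0.20 per million requests, $0.00001667 per GB-second, billed in 100 ms increments), and ignoring the free tiers. Treat all the rates as assumptions to re-check against the pricing pages:

```javascript
// Back-of-the-envelope monthly cost for API Gateway + Lambda.
// Rates are assumptions taken from public pricing at the time, and the
// free tiers are ignored; this is a sketch, not a quote.
function monthlyCost(requests, memoryMB, durationMs) {
  const apiGateway = (requests / 1e6) * 3.50;       // $3.50 per million calls
  const lambdaRequests = (requests / 1e6) * 0.20;   // $0.20 per million invokes
  // Duration is rounded up to the next 100 ms unit before billing
  const billedSeconds = Math.ceil(durationMs / 100) * 0.1;
  const gbSeconds = (memoryMB / 1024) * billedSeconds * requests;
  const lambdaCompute = gbSeconds * 0.00001667;     // $ per GB-second
  return apiGateway + lambdaRequests + lambdaCompute;
}

// e.g. 1M requests/month at 128 MB, 110 ms each (billed as 200 ms):
// monthlyCost(1e6, 128, 110)
```

Note how, at these volumes, the $3.50 gateway line item dwarfs the compute charge, and how a 110 ms function pays for two full 100 ms units.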

------
erikcw
I've recently deployed a small Flask based micro-service for an internal tool
leveraging Zappa[0] to deploy it to Lambda/API Gateway. Overall it's been a
really smooth experience. I'm looking forward to using it again.

[0] [https://github.com/Miserlou/Zappa](https://github.com/Miserlou/Zappa)

~~~
ma2rten
Is there a big overhead in terms of latency? If I understand correctly you
essentially have to start a server for each request.

~~~
taurath
Not the case - it runs with all dependencies cached if it's called multiple
times. Sporadic calls will incur the delay, though.

------
doublerebel
So, Apigee had this exact service. They did something pretty cool, it seems
they implemented nearly all (!) of the Nodejs APIs in a JVM variant, then
designed them to run on demand [0].

I actually ported an Express app to Apigee's service, to run webhooks and ETL.
It worked flawlessly. The one downside was that they kept their own internal
registry of compatible packages, and there was no way they could keep up with
the Node.js ecosystem, even with few holes in their API.

I still have a ton of hope that Google does something smart with this, I
actually wanted to pay Apigee for their service, but their pricing started at
an Enterprise tier and none of my projects made it that far.... yet.

Something to think about.

[0]: [http://docs.apigee.com/api-services/content/overview-
nodejs-...](http://docs.apigee.com/api-services/content/overview-nodejs-
apigee-edge)

------
cycomachead
Wherein "serverless" basically means more computers in more places just doing
different tasks.

~~~
Sanddancer
Yep. It's good old fashioned webhosting. Drop your files in the amazon
equivalent of ~/public-html/ . If it serves your needs, it's not a bad thing,
but it's hardly revolutionary.

~~~
pmontra
Webhosting usually have a flat monthly fee so maybe Lambda it's more like old
fashioned mainframe timesharing, billed per resource usage. Not that I was
there in its heydays but the two concepts seem very similar.

A consequence is that you really want to terminate quickly and use little RAM.
If your JS function terminates in 110 ms and get billed for two 100 ms units
you might be tempted to switch to a faster language (or ASM what you can) to
save half the bill. Webhosting doesn't have that dynamic.

------
siscia
A bit of a shameless plug, but I am the author of effe
(github.com/siscia/effe), a microservice framework.

If you are worried about latency, or you want Lambda on your own servers, it
may be worth taking a look.

Feedback welcome

------
gtrubetskoy
One significant problem with the API Gateway / Lambda integration is that the
protocol they use to communicate is JSON. JSON is a _text-only_ encoding, i.e.
anything that isn't valid UTF-8 isn't valid JSON, and so if you need to pass
something binary to/from the Lambda (e.g. an image), you need to base64 encode
it, which I'm actually not even sure is at all possible with the API/GW
templates. There are other issues, but this alone is a huge headache if you
ever have to work around it.
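The workaround, roughly, looks like this. The `isBase64` field is made up for the sketch, and this only covers the encode/decode step in your own code, not the API/GW template side:

```javascript
// Sketch: smuggle binary data through a JSON-only protocol by
// base64-encoding it. The "isBase64" marker field is hypothetical;
// it just tells the caller which responses need decoding.
function encodeBinaryResponse(bytes) {
  return { isBase64: true, body: Buffer.from(bytes).toString("base64") };
}

function decodeBinaryResponse(response) {
  return Buffer.from(response.body, "base64");
}
```

The cost is a ~33% size inflation on every binary payload, plus the encode/decode work on both ends.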

------
partycoder
I hate the term serverless. It involves a server. AWS Lambda is a service
bound to a port, and therefore a server. It runs user-defined functions, true.
But it is a server.

~~~
Freak_NL
It would help if people referred to this technological concept as _Function as
a Service_. That is a lot less ambiguous and buzzwordish than 'serverless'.

------
halmor
I am currently using AWS API Gateway + Lambda for the serverless backend of a
social traveling startup. It is absolutely awesome: we are only a few people in
IT, and it is amazingly simple to manage the whole stack. Moreover, imho they
are mature services at this point: flexible enough, with enough configuration
options to give you nearly full control of the development experience.

~~~
lucaspiller
How much does it end up costing you? Any ideas how much a 'traditional' setup
would be in comparison?

~~~
pksunkara
We reduced our service server costs to 1/6th by moving from Heroku to Lambda.
And that's not even considering the main benefit of a big computational
request not blocking other requests (which is why we moved to it in the first
place).

------
borplk
Can you achieve competitive latency for web pages using API Gateway and
Lambda?

~~~
ledgerdev
I was curious about the same thing and set up a JVM lambda to test. The setup
was served through API Gateway, connected to a Postgres RDS instance, selected
a couple of records from a single database table and returned them as JSON,
and was deployed in us-west.

From the west coast, I'm getting between 150ms and 300ms response times for
most requests, but it does jump around quite a bit, with occasional requests
taking 500ms-800ms to complete.

A lot of the variability seems to be coming from execution time of the lambda
itself, which would probably be because of the connection to RDS? I'm using
HikariCP so the connection is pooled but maybe something like dynamo would be
better.

So latency is all over the place, but generally under half a second.

~~~
taurath
Connections would only be pooled on a per-container basis; containers can be
reused, but often are not when many invocations run in parallel (to my
understanding).

------
JohnnyConatus
How does this work, if at all, for local development? Is there a pseudo-Lambda
server that runs locally?

If not, this introduces yet another environment variable, or it requires
changes to be pushed constantly to Lambda for testing.

~~~
breandr
Check out our example on github [https://github.com/awslabs/aws-serverless-
express/tree/maste...](https://github.com/awslabs/aws-serverless-
express/tree/master/example). The lambda wrapper is very thin (4 LOC). So you
have two options: run/test your express app locally as you always would, or
use the provided `npm run local` command which simulates the API
Gateway+Lambda part (you can modify the `api-gateway-event.json` to change the
"request" from API Gateway). This is primarily an example and starting point,
and there is much more you could do to improve this process.

------
hirenj
I have just recently got through a migration from Express to a Lambda-based
stack. In the end, performance is pretty good (once the lambdas are warm);
there were a few tricky parts, but nothing impossible.

Serving backend requests is easy with Lambda, but the code for managing the
backend can get quite large, quickly. If you are loading up data and it's
going to take more than 5 minutes to perform a task, you need to build a
queueing system from AWS primitives.

------
neximo64
I hope they solve the latency issues. Even a basic function that returns text
(a basic callback(null, 'hello world')) behind API Gateway, running 'hot'
under the container model, still has so much latency.

------
vamur
What is the benefit of AWS Lambda vs getting a cheap VPS or a dedicated server
with much better specs and bandwidth, setting it up and adding auto-update?
And how does one prevent going bankrupt if a Lambda app is DDOSed?

~~~
falcolas
The cost is stupidly low. You would really have to work hard to go bankrupt.

That said, there is a concurrency limit, which would help keep your bill from
growing without bound.

~~~
vamur
The article doesn't mention any numbers. An OVH VPS is $3.50 for 2GB RAM and
100Mbps (about 14TB a month assuming 50% utilization). That is about 14
million 1MB pages, vs 1 million requests for API Gateway at $3.50, not
including Lambda or bandwidth costs.

~~~
falcolas
I don't have exact numbers, but we have dozens of lambda functions written in
Python which are parsing and acting on Kinesis log streams (at volumes that
cause problems for the Splunk ingest), and regularly hit the 100 process
concurrency limit... but our bill is well under $10 a year (yes, year, not
month. Hence my use of the adjectives "ridiculously" and "stupidly").

Luck? Hidden discounts? Don't know. Just know that our bill is ridiculously
low for everything we're doing.

------
swang
i hope they upgrade to node 6 soon. i can't use fancy destructuring and some
other es2015 functionality!

~~~
davidjnelson
You can use Babel for server side code too which would get you what you want.

------
gjolund
I love lambda's, but api gateway was far from mature the last time I used it
(6 months ago).

Unless they have made serious updates to the deployment workflow, feature set,
and documentation I would not recommend building your products on it.

~~~
jotto
The "0 to 1" workflow right out of AWS could not be easier; they have an in-
browser code editor with a hello world already written for you.

But that workflow doesn't make sense for real dev teams. So there's
[https://github.com/serverless/serverless](https://github.com/serverless/serverless)
. It's driven by a yaml file and is intuitive.

The only thing I'm bothered by is lack of environment variables (have to hard
code them) and terrible API Gateway latency - see
[https://news.ycombinator.com/item?id=12681926](https://news.ycombinator.com/item?id=12681926)
further down the page

~~~
gjolund
Serverless was nowhere near ready the last time I looked.

Are they still requiring an unrestricted IAM profile? That is a non-starter
for most orgs.

Also, I found working with serverless to be pretty unintuitive. You are
relying entirely on large config files, and I found it very difficult and
time-consuming to test and roll back several different configs.

------
jomamaxx
Listen folks - it's still a fantasy. Latency is still 5s on many requests = no
deal.

This has been going on for years.

Would someone from AMZ please stand up?

I suggest that the latency is 5s for the 'first call', then, subsequent calls
are fast - and if you don't use your lambda for a minute or so, it goes back
to having latency.

I suggest maybe this has something to do with loading it into memory when it's
used?

There must be a way to keep the lambdas 'hot' or else it will never really get
used for anything but background tasks.
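(The trick people usually suggest is a scheduled CloudWatch Events rule that pings the function every few minutes so the container stays resident. A hypothetical sketch of a handler that recognizes such pings, with the `keepWarm` marker field made up for the example:)

```javascript
// Hypothetical keep-warm handler: a scheduled rule invokes the function
// every few minutes with { keepWarm: true } so the container isn't
// reclaimed. The marker field name is made up for this sketch.
function handler(event, context, callback) {
  if (event && event.keepWarm) {
    return callback(null, "warm"); // skip the real work for ping events
  }
  callback(null, "handled: " + JSON.stringify(event));
}

module.exports = { handler };
```

(Caveat: a ping like this only keeps one container warm; concurrent traffic beyond that will still see cold starts.)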

~~~
toomuchtodo
You break the entire model of an oversubscribed multi-tenant container system
if you want every lambda hot all the time.

If that's what you need...you need a server.

~~~
jomamaxx
"If that's what you need...you need a server."

We want micro-services that are available as we need them, at any scale, and
we never want to deal with a 'server' again.

"You break the entire model of an oversubscribed multi-tenant container
system" \- we don't care that much about how it's implemented, or what
underlying model is used, as long as it is secure, robust, and reliable.

~~~
toomuchtodo
> We want micro-services that are available as we need them, at any scale, and
> we never want to deal with a 'server' again.

I'm going to argue that's currently unrealistic, but if you want someone to
build something, tell them it's impossible.

Clearly it's not easy to build a system that'll run code on demand as fast as
bare metal; accept the trade-offs with an understanding of why you're making
them.

~~~
jomamaxx
What I'm asking for is definitely within grasp of Amazon's team :) they just
need to get there.

Moreover, it's a rather fundamental opportunity for them: Lambdas are a form
of 'container', and they are exactly what so many of us want. The deal-breaker
for us was the latency issue.

Amazon is a pretty smart company, and it's a big opportunity, I'll bet they
eventually get this sorted out.

