This isn't an equivalent comparison between Lambda and App Runner. The lowest memory configurations of Lambda get only a small fraction of a vCPU, because Lambda allocates CPU in proportion to configured memory, so they run a lot slower than Lambda functions with more memory. In practice this makes a pretty big difference for Lambda performance.
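To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes CPU scales linearly with memory and reaches one vCPU at 1,769 MB (the figure AWS documents), capped at 6 vCPUs; the function name is made up for illustration and real allocation may not be exactly linear.

```python
# Assumption: CPU share grows linearly with configured memory, hitting
# one full vCPU at 1,769 MB and never exceeding 6 vCPUs. This is an estimate,
# not an official formula.
MB_PER_VCPU = 1769
MAX_VCPUS = 6


def approx_vcpu_share(memory_mb: int) -> float:
    """Rough estimate of the vCPU share for a given Lambda memory setting."""
    if not 128 <= memory_mb <= 10240:
        raise ValueError("Lambda memory must be between 128 and 10240 MB")
    return min(memory_mb / MB_PER_VCPU, MAX_VCPUS)


# Compare the smallest setting against a few larger ones.
for mb in (128, 512, 1769, 10240):
    print(f"{mb:>5} MB -> ~{approx_vcpu_share(mb):.2f} vCPU")
```

Under these assumptions a 128 MB function gets well under a tenth of a vCPU, which is why benchmarking Lambda at minimum memory against a service with a full vCPU skews the result.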
Worth noting that this compares Express Step Functions (meant to run very fast, with a limited lifetime, often synchronously as in this test) rather than Standard Step Functions (long-running, up to a year, typically used for background processing).
I really wish they had said something about what the cost to run each one of these things was. There are times when I'm willing to deal with slower runtime performance if it costs me less money; and, conversely, times when I really want the best available performance, cost (mostly) be damned. I got none of the information I'd need to make such a decision out of this article, which is unfortunate.
This is interesting. The direct integration got slower as concurrency increased. The author didn't list the DynamoDB table settings, but this could be caused by insufficient read capacity. Instead of failing immediately when a read is throttled, I believe the default DynamoDB client behavior is to back off and retry the read. This reduces errors at the expense of latency.
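A minimal sketch of that trade-off, assuming a generic exponential backoff with jitter. This is not the AWS SDK's actual retry code; `ThrottledError` and `read_with_backoff` are hypothetical names used only for illustration.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a DynamoDB throttling error."""


def read_with_backoff(read, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call read(); on throttling, wait a random delay that doubles each attempt.

    Returns (result, total_seconds_waited). Re-raises after max_attempts.
    """
    waited = 0.0
    for attempt in range(max_attempts):
        try:
            return read(), waited
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Jittered delay drawn from [0, base_delay * 2**attempt]
            delay = random.uniform(0, base_delay * 2 ** attempt)
            sleep(delay)
            waited += delay


# Simulated read that is throttled twice before succeeding.
attempts = {"n": 0}


def throttled_twice():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ThrottledError()
    return {"id": "example"}


item, extra_latency = read_with_backoff(throttled_twice)
print(f"succeeded after {attempts['n']} attempts, ~{extra_latency * 1000:.0f} ms spent waiting")
```

The caller never sees an error here, but the time spent sleeping shows up as added latency, which would match the slowdown at higher concurrency if the table was being throttled.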
It also seems that this direct integration would only be suitable for a limited number of (very simple) use cases. Not to mention it'd prevent you from leveraging some form of caching which ought to be factored in.
I don't know about API Gateway specifically, but AWS AppSync (their managed GraphQL service) supports the same VTL-based direct integration. Using a direct integration does not prevent you from using their managed cache solution (which uses ElastiCache under the hood).
Neat test, I'm sure a lot of people will be consulting this benchmark. You've probably saved a lot of dev time. Hopefully the AWS teams review this as well.
What I have been curious about is the speed of Cloudflare's fast edge workers calling into AWS, Google Cloud, or Azure, because CF still doesn't have a database of their own.
Cloudflare seems to have the most optimized cold starts via their TS/JS V8 isolates, but the latency of calling into an external provider that has its own database probably kills that benefit (CF's "durable objects" notwithstanding).
Cloudflare also has D1, a serverless database based on SQLite. Haven't tried it out yet, but it could be useful to reduce processing time even more! It doesn't have transactions though :(
I must call out, though, that most people pick Step Functions for workflow orchestration, since doing your own business logic (no matter how simple) across Lambdas is an exercise in futility.
I am currently moving all my workflows from Lambda to Temporal and it’s been an absolute breeze. I didn’t pick Step Functions because the whole “lego blocks” design philosophy pushes the complexity down to devs and is hard to maintain if you are leading a team of junior devs.