
The Case of the Broken Lambda - josep2
http://veekaybee.github.io/2018/09/24/the-case-of-the-broken-lambda/
======
ak217
If you are frustrated by the complexity of packaging Python-based Lambda
applications, I highly recommend checking out Chalice:
[https://github.com/aws/chalice](https://github.com/aws/chalice). It's very
well engineered and drastically simplifies the process (for example, it
handles pre-built wheels properly, so you can cross-compile your Lambda, e.g.
build and deploy it from a Mac).

Actually, we just open-sourced a template for using Lambda with Chalice and
Terraform that automates this and many other relevant steps:
[https://github.com/chanzuckerberg/chalice-app-
template](https://github.com/chanzuckerberg/chalice-app-template). It's not
100% directly applicable to this use case yet, because SAM/CloudFormation
templates don't have a good way to manage bucket event subscriptions. But
domovoi
([https://github.com/kislyuk/domovoi](https://github.com/kislyuk/domovoi)) can
manage S3 event subscriptions (direct or bridged through SNS, SQS, or SNS-SQS)
in an idempotent declarative (non-IaC) process.

~~~
sciurus
Seconding Chalice! It's very easy to get started with and made whipping up my
first lambda-powered web app ([https://github.com/mozilla-
services/pagerstatus](https://github.com/mozilla-services/pagerstatus)) a
breeze. I do wish their approach to managing environment variables was more
flexible, and that their deployment tooling could do a bit more for me (e.g.
assign custom domains and SSL to the API Gateway).

------
guitarbill
Lambda does have a learning curve, and so does deployment safety. Many Python
deployment strategies are interesting in that they simply re-download the
packages on every deploy. Lambda doesn't allow this.

And they were close to a solution. They have a CI pipeline which can and
should be doing the packaging for them. The Linux image only has to be close
enough to Amazon Linux, not exactly Amazon Linux. Heck, even CodeBuild uses
Ubuntu [0].

It also doesn't help that there's a lack of information or simply
misinformation out there. Sometimes I think frameworks with a very specific
use-case like Zappa do more harm than good. Yes, it's easier to get running,
but it doesn't give you a general purpose solution and makes you think
everyone is just hacking around the mess.

Serverless' serverless-python-requirements is a good solution if you can't be
bothered having the CI do the packaging/artifact creation.

[0]
[https://docs.aws.amazon.com/codebuild/latest/userguide/build...](https://docs.aws.amazon.com/codebuild/latest/userguide/build-
env-ref-available.html)

~~~
traverseda
I'm not going to stop hacking around this mess until this mess is open source.
It's a personal preference, but at that point I'll be willing to pitch in and
start fixing this mess.

~~~
scarface74
What do you need to be open source? With CodeBuild, you just put a
buildspec.yml file in the root of your source directory containing a few
`pip install` commands, and a list of wildcard file specifications to tell
CodeBuild what to include in your zip file artifact.

It's not like there is a lot of "vendor lock in" for a 10 line yml file.

~~~
traverseda
>It also doesn't help that there's a lack of information or simply
misinformation out there. Sometimes I think frameworks with a very specific
use-case like Zappa do more harm than good. Yes, it's easier to get running,
but it doesn't give you a general purpose solution and makes you think
everyone is just hacking around the mess.

I'm generally unwilling to learn skills that only apply to a proprietary
environment. If aws was using openwhisk, I'd probably code directly against
that instead of using a hack like zappa.

~~~
scarface74
I didn’t suggest Zappa. Zappa is not the AWS solution. The AWS Solution is
using CodeBuild.

With CodeBuild you would just specify the bash commands you want to run to
create your zip file with a yml file.

But you’re already talking about using lambda so you’re already using
something proprietary.

On a deeper level, if you work for a company, you’re probably learning a lot
of business specific stuff anyway that’s not transferable.

------
reilly3000
It boggles the mind that JSON-to-Avro support for Athena isn't a native part
of Firehose. Thousands of devs have had to fight, and in my case lose, this
same battle. I've had decent luck with the Serverless framework's
serverless-python-requirements plugin for other C libs and for large modules.
Outside of that it's a hellscape of logs with import errors and a lot of
waiting for feedback.

------
scarface74
The easy way to get around this is to use AWS CodeBuild with the built-in
Python Docker image and have it trigger when you push. You can trigger a build
from either AWS CodeCommit or GitHub.

As part of the buildspec.yml, just install all of your dependencies using

pip install {package} -t .

(The `-t .` at the end forces pip to install into the local directory.)

In your artifacts section make sure you include everything.

CodeBuild will then create a zip file that you can load into the console.
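A minimal buildspec along these lines might look like this (the package name and phase layout are placeholders, not from the article):

```yaml
version: 0.2

phases:
  build:
    commands:
      # vendor each dependency into the source root so it ends up in the zip
      - pip install requests -t .
artifacts:
  files:
    - '**/*'
```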

For bonus points, once you create the lambda manually for the first time using
the AWS console, you can export the CloudFormation yml file and use that as
part of your automation strategy where you have a CF Parameter that specifies
the name of your zip file that was uploaded to S3 by CodeBuild.
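That parameterized template could look roughly like this (resource names, the bucket, and the runtime are placeholders, and the function's IAM role is assumed to be defined elsewhere in the template):

```yaml
Parameters:
  CodeS3Key:
    Type: String   # the zip key that CodeBuild uploaded to S3

Resources:
  MyFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.6
      Handler: handler.lambda_handler
      Role: !GetAtt MyFunctionRole.Arn   # IAM role defined elsewhere
      Code:
        S3Bucket: my-artifact-bucket
        S3Key: !Ref CodeS3Key
```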

I use this strategy all of the time to develop on Windows and deploy to
lambda.

------
teej
I recommend following the author Vicki Boykis on Twitter (@vboykis). I've been
following her for quite a while; she's full of data science jokes.

------
skunkworker
We've been developing some products for Lambda but have been building them
completely in Go. We can compile everything into a single fat binary with the
text files embedded, so a deploy becomes a simple zip file containing just
that binary. Another advantage is that warmup time for Go is quite fast.

------
lostmsu
TL;DR; It's not trivial to add a native binary (from Python package) to a
Lambda function.

~~~
silvexis
TL;DR; it's trivial.

Run `docker run --name lambdapy36 -it -v $(pwd):/mycode
lambci/lambda:build-python3.6 /bin/sh -c "pip install -r
/mycode/requirements.txt -t /mycode/vendored/"`

All your packages are now compiled for the lambda env and have been placed in
the /mycode/vendored folder. Either move them into the root before deploy or
add /var/task/vendored to Lambda's python path by setting PYTHONPATH:
"/var/runtime:/var/task/vendored"

~~~
lostmsu
Lol. That at least requires docker installed! And you have to know that magic
line!

------
jocastro
You can use a free ec2 instance for compiling, then upload to lambda

~~~
thehesiod
Yeah, that's what we do, then upload to CodeCommit and then do a CodeBuild
which pushes a zip to S3. Theoretically it can all be triggered so that most
of the work is done for you.

------
doombolt
That's a rude awakening, but an expected one. It's very easy to write an
elegant piece of code in a scripting language, only to find that some of the
dependencies that work the magic are pretty messy to deploy.

You can usually count on having native libraries for a given activity for
Java, so you can just use a JVM-based language (does Lambda support that? I
bet it does.)

~~~
tmarthal
Further down the article she mentions that she solved the dependency problem
by rewriting the Lambda in Java, noting that her use-case doesn't care about
the known JVM warmup times.

The POM packaging and jar based deployment seemed to make the dependencies
work.

