date 2016-09-29
https://github.com/Miserlou/Zappa
Comparison:
https://blog.zappa.io/posts/comparison-zappa-verus-chalice
It also lets you build fully-fledged event-driven apps with a single line of code:
https://blog.zappa.io/posts/zappa-introduces-seamless-asynch...
Also, Chalice is a first party framework in that it's led by an AWS affiliated developer.
... I know what I'm doing tomorrow.
Edit: I know what I'm doing today.
Follow the progress here: https://github.com/Miserlou/Zappa/issues/793
Zappa has even more features now. Basically, you can't even build real server-less apps with Chalice, plus Zappa has support for tons of other magic features.
Edit: Just noticed in my AWS account that there's 3.6 in Europe. This will get me away from GCP.
I want to use European regions for my testing, not US ones.
Other than that I find the GCP interface easier to work with, when it's working.
I'm eager to try AWS.
I think they have very incapable developers on the CloudFormation project. This could have been a game changer, but it's been a source of pain.
For example, they introduced YAML and !Sub, but you can't nest tags, yet !ImportValue in many cases needs a nested !Sub. Also, you can't have "$", "{", and "}" characters in the exported name, but they didn't add string templates to functions such as !ImportValue. Total nonsense!
Also, you'd logically assume that all stack resources inherit the tags of the owning stack, but no, you have to do tons of copypasta!
Last, but not least - it's all designed so that the templates are stored on S3, while most people use source control. Their other services already support Git - Elastic Beanstalk, CodeBuild, CodePipeline, etc. Why don't they allow Git-hosted templates?!
Anyway, when I see the complexity of my templates to have a basic Magento infrastructure running in VPC, which I've been working on, it's very disgusting. Lots of manual steps if you don't want to have a monolithic template, lots of CLI, and build steps. This is not how things like these should be implemented in 2017!
Lastly, they introduced CloudFormation exports. Okay, decent feature, but not in the real world! If you refactor your infrastructure, it becomes a huge pain, as you cannot delete exports for some reason - they belong to the stack. So, if I decide to rename or split an export, I need an intermediate step that duplicates the old and the new exports, then I update all importing stacks to use the new values, and so on. Most AWS resources have "retain" capability - S3 buckets, ECRs, Route 53 records, etc. - CloudFormation exports don't! Honestly, they need to put some more experienced and brighter developers on the team!
Most of the prototypical examples of Lambda I see are for things like data processing pipelines. I know in theory Lambda should be able to handle just about any kind of request from a browser or mobile app short of a websocket connection, and Amazon does have some sample code and a brief case study on their site. But I'm wondering if it's really ready for this or if people have experiences going near-100% serverless for their apps.
Lambda excels at taking in arbitrary amounts of long-running jobs and feeding you its output. For example: Upload a png image to convert it to jpg. Zip a directory of S3 objects. Etc.
Lambda gets very costly and inconvenient when you're just taking in requests you could handle with a couple of load-balanced web servers. The whole "running a whole website on Lambda" craze does not actually yield any benefit, and it is more complex and harder to play with than a simple EC2 instance (which, with a good setup, needs very little "server" management at all).
Also, API Gateway is just horribly inflexible imho.
But I'm surprised to hear that Lambda gets costly. Is this from real experience or is it just theoretical? My impression was that Lambda saves you money by not having to pay for excess capacity. But I haven't done the math. I'll admit, it's also appealing to not even have to worry about configuring a web server cluster to scale up and down.
On top of this, you can't increase one of CPU or memory allocation without increasing the other. This means if you're very memory-efficient and CPU-bound, you'll be eating extra runtime costs. You also kind of end up using the entire Amazon toolchain, which has costs embedded in every single bit of it: SNS, SQS, API Gateway, S3 requests, S3 network out, etc. They all have costs.
And here's the thing: Lambda has a ton of layering on top of it, which you wouldn't have in an EC2 environment where you have full control. You can't optimize Lambda, you can optimize EC2.
My company is currently paying $4k/mo in Lambda costs, parsing log files in Python into XML and doing an S3 call at a peak of 40 requests / second. Back of the envelope, we can probably get this down to <$300/mo by overprovisioning a few m4.large instances. But now there's the question of having to manage a processing queue, reprocessing, etc so it's hard to tell how much would actually be saved. (On top of that, if a box goes down, that's a significant chunk of processing unable to be taken care of; with Lambda, that doesn't happen).
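To make the back-of-the-envelope math above easier to play with, here is a rough Python sketch of a Lambda cost model. The two prices are assumptions (roughly the published per-request and per-GB-second rates; check the current pricing page), and the function name and inputs are hypothetical. It ignores the free tier and every other AWS service mentioned above, so treat it as a lower bound rather than a reproduction of anyone's actual bill.

```python
# Rough Lambda cost model. Both prices are ASSUMPTIONS for illustration;
# look up the current rates for your region before trusting any number.
PRICE_PER_MILLION_REQUESTS = 0.20   # USD, assumed
PRICE_PER_GB_SECOND = 0.00001667    # USD, assumed

SECONDS_PER_MONTH = 30 * 24 * 3600


def monthly_lambda_cost(requests_per_second, avg_duration_s, memory_mb):
    """Estimate a monthly bill from a sustained request rate.

    Ignores the free tier, billing-duration rounding, and the cost of
    other services (S3, SQS, API Gateway, and so on).
    """
    requests = requests_per_second * SECONDS_PER_MONTH
    # Compute is billed on allocated memory multiplied by run time.
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    request_cost = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    return request_cost + compute_cost
```

Because memory and CPU scale together, raising `memory_mb` just to get more CPU multiplies the compute part of the bill directly, which is the effect described above.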
All in all, Lambda has been excellent for getting us started, but there's a point where we definitely want to investigate a dedicated system we have full control of. Lack of Python 3 support was the #1 reason I wanted to do that, so now there's a bit less motivation - it's a lot of work.
I'll certainly write a blog post about it if we decide to move our main processing off Lambda.
Edit: This looks like it has a lot of interesting numbers. https://www.reddit.com/r/Python/comments/4hebys/cost_analysi...
{ "statusCode": 200, "headers": {}, "body": "<h1>Hello!</h1>" }
And API Gateway knows exactly what to do. Before this, yes there was some extra config and it was not fun.
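For anyone who wants to see it as a function, here is a minimal Python sketch of a handler returning that response shape. It assumes the comment refers to API Gateway's Lambda proxy integration; the handler name and the Content-Type header are illustrative choices, not something from the comment above.

```python
def handler(event, context):
    """Return a response in the shape a proxy integration expects.

    `event` and `context` are supplied by Lambda; this sketch ignores them.
    """
    return {
        "statusCode": 200,
        # The header is an illustrative addition so browsers render HTML.
        "headers": {"Content-Type": "text/html"},
        # The body must be a string, not a nested object.
        "body": "<h1>Hello!</h1>",
    }
```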
PS. Our AWS bill went down when we moved everything to Lambda but that was not the motivating factor for the switch
Because those sites you mentioned (bustle.com and romper.com) look like something that could benefit a great deal from using a CDN, which would then drastically reduce the need for large instances.
I delve into this in detail here:
https://news.ycombinator.com/item?id=14075634
In short, using Amazon's own pricing example, yes, it's extremely expensive for production web app workloads compared to just running an autoscaling group with 10 (!) nodes on ELB/ALB, and the pricing disparity increases as load increases.
With that said, Lambda is a great fit for little tasks that will never need a whole server.
You can read more about our process here: https://blog.cloudsploit.com/we-made-the-whole-company-serve...
Everything else is serverless however.
My application has:
-- AWS Cognito as the user management and auth
-- AWS Cloudfront serving a Reactjs front end as static files
-- AWS EC2 running Postgres as the database
-- AWS Lambda Python functions for the back end
-- AWS SES serverless email inbound and outbound processing
-- AWS SQS for email processing coordination
Oh I do in fact have another server which does spam checking of emails, PDF conversion of emails, text extraction of emails and parsing of emails - this works best on a server rather than serverless.
The most interesting thing I found along the way is that the API gateway (which I liked a whole lot by the way, and found to be powerful and easy to configure) is completely unnecessary. My application simply directly calls the Lambda functions to get and set the data that it needs - dramatic simplification.
Of further note is that I only have one Lambda function for the entire back end - this further reduces the need for layers of APIs and parameters. All my Python code goes into the one function which is structured as a complete application. An additional benefit of this is that AWS Lambda can be slow to first run a function unless it is "warm". If you have lots and lots of small Lambda functions then any given function is less likely to be warm. With everything in the one Lambda function, then all parts of my Lambda function are more likely to be warm.
So no API gateway completely rips out an entire layer of complexity, and then only one single Lambda function rips out another layer of complexity. It's very nice to be able to write front end functions in ReactJS that just call the back end function that they need. Making changes means I just change the functions and don't need to fiddle with all sorts of REST API layers or anything to accommodate the change.
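A rough sketch of what a single-function back end like this could look like in Python. Everything here is hypothetical: the `action`/`params` event shape, the function names, and the in-memory dict standing in for the Postgres database are assumptions for illustration, not the actual application code.

```python
# In-memory stand-in for the database (assumption for this sketch).
USERS = {}


def get_user(params):
    return USERS.get(params["id"])


def set_user(params):
    USERS[params["id"]] = params["data"]
    return params["data"]


# One table maps action names to plain Python functions, so adding a
# back end call means adding a function and one entry here.
ACTIONS = {
    "get_user": get_user,
    "set_user": set_user,
}


def handler(event, context):
    """Single Lambda entry point that dispatches on event['action']."""
    action = event.get("action")
    func = ACTIONS.get(action)
    if func is None:
        return {"ok": False, "error": "unknown action: {!r}".format(action)}
    return {"ok": True, "result": func(event.get("params", {}))}
```

The front end would then invoke the one function with a payload such as `{"action": "get_user", "params": {"id": "42"}}`, and every action shares the same warm container.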
I started with node.js as the Lambda back end but switched to Python because personally I find that Python with its synchronous programming model is much easier to reason about for back end stuff. I'm more than happy using ES2015 at the front end for ReactJS.
As a reference, this is for non-revenue generating hobby sites/projects.
So I don't know what the costs are, but I don't imagine they're much.
Incidentally, AWS Aurora now supports Postgres.
It is most definitely ready. I gave a talk at Node Interactive last year that has some more detail https://www.youtube.com/watch?v=c4rvh_Iq6LE&index=2&list=PLf...
I know cold starts are an issue when services have a lull in usage, and that you can work around it. But I'm curious whether cold starts also happen when there's a surge in demand and AWS needs to spin up more containers or whatever it uses to host Lambdas.
Granted the 30 second timeout is a constraint, but how bad a constraint is it? Ideally long-running requests like that should be rearchitected to return fast and deliver the results asynchronously, right? The bigger problem I see is the lack of websockets support, which makes delivering async results harder. Supposedly AWS IoT does it but that seems like an even more exotic usage than implementing a REST service in Lambda.
Basically I get that it's still early days and there are gaps here and there, but am wondering if it's actually an "antipattern" to use Lambda this way or just a little early.
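To make the "return fast, deliver later" idea concrete, here is a minimal Python sketch of a submit-and-poll pattern. It is only an illustration: the in-memory dict stands in for a real job store, `run_pending` stands in for whatever worker does the slow part, and all names are hypothetical.

```python
import uuid

# In-memory stand-in for a durable job store (assumption for this sketch).
JOBS = {}


def submit(payload):
    """Record the job and return immediately with an id the client can poll."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "pending", "payload": payload, "result": None}
    return job_id


def run_pending(work):
    """Process queued jobs; in practice this would run outside the request."""
    for job in JOBS.values():
        if job["status"] == "pending":
            job["result"] = work(job["payload"])
            job["status"] = "done"


def poll(job_id):
    """Cheap status check that always fits inside a short request timeout."""
    job = JOBS.get(job_id)
    if job is None:
        return {"status": "unknown", "result": None}
    return {"status": job["status"], "result": job["result"]}
```

Without websockets the client has to keep calling `poll`, which is the inconvenience raised above.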
> Ideally long-running requests like that should be rearchitected to return fast and deliver the results asynchronously, right?
Ideally for whom? I can't think of an engineer who wouldn't rather do a simple request/response. It's certainly cheaper than spending the money to keep a connection open longer than 30 seconds, but far from ideal.
Anyone running on Heroku works with a hard-coded 30 second timeout on all web requests. It works fine for everyone there.
You must have a different definition of ideal. I'm sure no one using Heroku would be upset if that limit was removed.
The FaaS used in the article is Microcule, but the same approach should work with Lambda now that API Gateway supports binary.