Going Serverless: AWS and Compelling Science Fiction (compellingsciencefiction.com)
96 points by mojoe on Nov 11, 2016 | hide | past | favorite | 56 comments

The biggest pain for me in managing a system is handling all of the service configurations; making sure the ACLs are in place, that the configuration files are all in the right place and in the correct format.

Serverless isn't saving me any of that pain. I still have to configure ACLs for everything, web folders in S3, ensure that my backend isn't hitting any concurrency or timeout limits, ensure that all of my routes are set up properly in the API gateway, that my DB queries are properly tuned and that someone is watching that DB...

All it's really buying me is not having to write Ansible scripts; instead I have to write CloudFormation templates. Sure, I have to think about maintenance and troubleshooting less, but when I do have to troubleshoot, I'm in for a long, frustrating day.

As easy as it is to create VMs these days, the serverless story is nowhere near as compelling as it should be. It makes the easy things easier, and the hard things really f'ing hard.

Concrete example: I wanted to just store the contents of a GitHub webhook POST in S3, after verifying the hashed secret. Should be a simple case of wiring API Gateway to S3, right?

First bug: You can't test an API Gateway -> S3 connection if they are both in the same region. Known issue from back in 2014.

First hurdle: You can't pass the contents of a POST to an auth hook in API Gateway, just a single header value. That means I can't use the API Gateway authentication hook for this purpose; GitHub creates an HMAC hash of the post contents.

Second hurdle: Finding the proper Velocity Template Language function (and its location in Amazon's custom libraries) to escape the JSON from the webhook body so I could pass it on to a lambda function. It's '$util.escapeJavaScript($input.body)' by the way. You're welcome.
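For reference, a body-mapping template along these lines forwards the escaped body on to Lambda. This is a sketch: the `body` and `signature` key names are my own choice, though `X-Hub-Signature` is the header GitHub actually sends.

```
{
  "body": "$util.escapeJavaScript($input.body)",
  "signature": "$input.params('X-Hub-Signature')"
}
```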

By this point, I was wishing I had just set up a t2.nano server running Flask.
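For comparison, the small-server alternative really is short. Here's a stdlib-only sketch of the same webhook receiver (Flask would be terser still); it assumes the shared secret lives in a hypothetical GITHUB_WEBHOOK_SECRET environment variable, and writes to local disk where the serverless version would put to S3:

```python
import hashlib
import hmac
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed env var; GitHub lets you configure a secret per webhook.
SECRET = os.environ.get("GITHUB_WEBHOOK_SECRET", "change-me").encode()

def signature_valid(body: bytes, header: str) -> bool:
    """Check GitHub's X-Hub-Signature header (HMAC-SHA1 of the raw body)."""
    expected = "sha1=" + hmac.new(SECRET, body, hashlib.sha1).hexdigest()
    return hmac.compare_digest(expected, header or "")

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        if not signature_valid(body, self.headers.get("X-Hub-Signature")):
            self.send_response(403)
            self.end_headers()
            return
        # Persist the payload; in the serverless version this is an S3 put.
        with open("payload.json", "wb") as f:
            f.write(body)
        self.send_response(204)
        self.end_headers()
```

Serve it with `HTTPServer(("", 8000), WebhookHandler).serve_forever()`.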

Let's talk about serverless/lambda/FaaS for a minute:

I'm currently working on a side project (which if successful can easily be a product in itself) to create a service that would be the Digital Ocean of FaaS (lots of talk - I understand).

What would be the core features that devs/users/managers are looking for? (something you downright expect for v1 that would compel you to put the service to use for development/production)

Currently some of the items top of my list (in no particular order) are:

* API development (like AWS's APIG)

* Support for more languages by default

* Ability for users to add a custom-defined language easily

* Creation of an opinionated framework / workflow for FaaS development (you don't need to do this if you don't want to...but FaaS first apps would be done better/easier this way)

* Rewriting common products (Wordpress, Ghost, Jekyll, etc.) to be served via FaaS.

Things I don't want to work on (or would prefer to not jump into):

* Storage service (S3 alternatives)

* Databases (there are so many out there...building a cloud is hard, building a database is even more so)

Who do I feel would pay for this?

* Front-end devs who don't want to complicate a backend

* Mobile devs who don't want to complicate a backend

* Enterprises maybe?

1) Loop detection -- don't let me shoot myself in the foot and make an infinite loop of two functions calling each other

2) Multi-datacenter -- I want to deploy to two places at least so my risk goes down

The lack of a database and/or storage would be a big showstopper. It would mean that if I want to build anything with state, I need to store it in another service, which costs money to move data out of. Even a very basic key store would be better than nothing. You could actually build one pretty easily on top of Postgres or even SQLite (although you'd need one per customer with SQLite).
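A per-customer key store on SQLite really is a small amount of code. A minimal sketch (the class and method names here are illustrative, not any existing library's API):

```python
import sqlite3

class KVStore:
    """A minimal per-customer key/value store on top of SQLite."""

    def __init__(self, path=":memory:"):
        # One SQLite file per customer; ":memory:" is handy for testing.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)"
        )

    def put(self, key, value):
        # INSERT OR REPLACE gives upsert semantics on the primary key.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )

    def get(self, key, default=None):
        row = self.db.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else default
```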

Wouldn't loop detection be equivalent to solving the halting problem? Obviously a naive approach would be to disallow recursion/mutual recursion, but there are valid use cases for those that may not result in infinite execution.

I helped design a large-scale JS execution engine. We used Babel to rewrite code so that loops could be interrupted by our system. E.g. `for(...){mycode();}` would be recompiled to `for(...){if(!running()){halt();}mycode();}`. The technical reason is that tight loops wouldn't respond to SIGINT, if I recall correctly. So while it doesn't stop infinite loops, it can control the program. I wasn't involved in productionizing it.
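The same trick can be sketched in Python with the `ast` module: prepend a cooperative-interruption check to every loop body, assuming the host supplies a `_running()` flag. The names are illustrative, not the engine described above:

```python
import ast

# The guard statement injected at the top of every loop body.
GUARD = ast.parse(
    "if not _running(): raise TimeoutError('interrupted')"
).body[0]

class LoopGuard(ast.NodeTransformer):
    """Rewrite for/while loops so each iteration checks a host flag."""

    def _guard(self, node):
        self.generic_visit(node)       # handle nested loops first
        node.body.insert(0, GUARD)     # then prepend the check
        return node

    visit_For = _guard
    visit_While = _guard

def instrument(source: str) -> str:
    """Return source with every loop made interruptible."""
    tree = LoopGuard().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

As the parent comment notes, this doesn't detect infinite loops; it just guarantees the host can stop them.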

Sorry--couldn't be interrupted by SIGINT? That's arguably the point of signals. Does the runtime just queue signals rather than respond in the signal handler itself?

As I understand it, this is mutual recursion across services, which is a much more specific case. Forbidding it may be viable.

Ha, got the SQLite case covered! Full disclosure: lite-engine.com is my service; the idea here is SQL as a service, literally. The net result is basically an MS Access backend for the web, with HTML/CSS/JS for the frontend.


Edit: To the downvoters: self promotion is allowed on this site, and in fact encouraged!


Thanks for the reply!

1. Loop detection does seem harder to implement, but I'll definitely look into it (if the API service is used, it would be trivial to implement).

2. Multi-datacenter is almost implemented (you can select from different service providers, regions, or a combination of each, and the functions will scale accordingly).

As for a database/key store...would integrating with database solutions from other cloud providers be of some consideration? (from GC / AWS / Azure?).

Keep in mind that all users would have access to environment variables (across projects/functions etc.) by default.

If you're using Lambda you have access to DynamoDB, which is reasonably priced (it runs on a similar "pay for what you use" model with a very generous free tier).

The real value people see in AWS Lambda is the ability to easily glue together the AWS services they run, one of which is an API, and a lot of which are stateful services. Requiring users to write the interfaces between their stateful services and stateless functions via APIs may be more effort than your target audience wants to deal with. There may be clever ways to make this almost as easy as the mouse clicks it takes in AWS Lambda while also allowing the flexibility to use something other than AWS.

I would also like to have something I could point at a Docker API which would manage the end-to-end lifecycle of a function: building it into a container, exposing itself over an API for management, telling it when it should run, and of course running it on my Docker machine(s). I don't imagine this idea is appealing to your audience, however.

I will try my best to integrate AWS/GC/Azure as much as possible to reduce that issue in the first place (obviously I can't create alternatives to all AWS services for example, but I can integrate as much as possible via their SDK to reduce the friction users would face as you mentioned).

> I would also like to have something I could point at a docker api which would manage the end to end lifecycle of a function. Building it into a container, exposing itself over an API for management, telling it when it should run and of course running it on my docker machine(s).

Could you please go into a bit more detail? I don't understand what you are trying to do (or its use case).

The use case is the same: I want to run functions based on events from a variety of sources (API/DB/storage). I just want, in particular, to run them on whatever hardware I have that exposes a Docker API, whether it's a computer in my room or in a datacenter or cloud service. Most people don't care about having that flexibility; if someone is using software as a service, usually they have already decided on using other cloud services as well, and those who like to self-host probably don't like the idea of a service that has control over their machines.

I am actually building this on top of Docker, so technically that's already implemented (I'm actually testing this locally).

However...it does have some current caveats (can only run on a single machine, and definitely not something you'd want to run in production).

But it's something I've considered (with more features, obviously) for v1 as an enterprise/maybe-OSS feature, as most enterprises would love to host it themselves, behind their firewall or on their private cloud.

Doesn't AWS constitute a total vendor lock-in?

I'm not sure why you're being downvoted for a legitimate concern.

But, to answer your question, AWS does constitute lock-in and migrating away from AWS is a major pain. However, companies from big to small see this as a price worth paying for the flexibility of AWS.

That said, I don't think it's the only viable kid on the block any more. The competition, e.g. Google, is getting to the point where choosing AWS is no longer an automatic and should be weighed carefully. This is where the downside of lock-in starts to factor in.

If you just use Lambda, S3, and Dynamo, no not really. It's pretty easy to move to Google. If you use more of the ecosystem, yes, but so what?

Are you worried about vendor lock in with your power company?

As you know, I've had to deal with a lot of these and certainly S3 is the easiest because GCS roughly supports the S3 XML API (we do resumable uploads differently). The sad thing is that even when you say "oh just rewrite this piece to be Datastore or BigTable instead of Dynamo" that can be a pretty tall order. I think "pretty easy" depends on the organization, and how tied to a single provider it was in the first place. That said, I think it's a mistake to "avoid lock-in" religiously, because otherwise you're really getting a lowest common denominator experience. You really should use BigQuery if you're on GCP, and if you need to leave it might be tough. But the alternative is worse (using an inferior service or hand-rolled system from the get go).

P.S. As we're both in the same region of California, we're both with PG&E and can't do anything about it. It's kind of okay because they're so regulated, but how do you know we're getting a good price per kWh?

Disclosure: I work on Google Cloud (and met jedberg in person once).

I would be worried about vendor lock in with my power company if it had a variety of competitors that kept competing with each other on features and price and I engaged in negotiations with them every couple of years for the best rate.

The one situation that makes cloud providers different than other utilities is that in most cases a utility is based out of the same country as its clients - this is a trivial fact most of the time but can become important in wartime. At some point in time if the US goes to war against countries A, B, C it would not be unreasonable to expect that the US government would mandate that AWS/Google etc stop servicing clients in those countries. This is different than an electric utility because in almost all cases the electric utility will end up being isolated on the same side of the fence as its customers.

This isn't a very credible counter; being locked in because it was cheap to get hooked is exactly how we got ourselves into our previous disasters. In fact, until open source we kind of stumbled from one into the other.

As for comparing power utilities to public cloud, I don't think the two are remotely comparable within our timeframes -- utilities are highly regulated and provide an almost entirely undifferentiated service. Public clouds are the opposite, and that's why you want to be mindful of developing your codebase to an API you pay rent for.

How will this combo stop vendor lock-in?

And comparing Amazon to a heavily regulated power company doesn't make sense.

> If you just use Lambda, S3, and Dynamo, no not really. It's pretty easy to move to Google. If you use more of the ecosystem, yes, but so what?

Vendor lock-in is a well understood anti-pattern. I don't think I'm required to defend it.

> Are you worried about vendor lock in with your power company?

... yes? You aren't?

Vendor lock-in is very much a spectrum -- in my case, it's the same amount of work to move my Python lambdas to a self-hosted server now as it would have been when I first decided to use AWS Lambda. The only overhead was figuring out the Amazon documentation, which was a pretty trivial one-time deal.

Yes, but isn't serverless the new buzzword?

When Google App Engine came out it was a big concern, but now it's hip.

Yeah. This is an important problem with Lambda that doesn't get nearly enough attention.

Tangentially related question. What deployment tools are people using to manage systems with many lambda functions?

We use SWF to trigger over 50 separate lambda functions in processing. We've got some very nice internally developed tools to identify which functions are out of date and help with deployment. I'm just curious what else is available to handle DevOps tasks in a Serverless environment (i.e. deploying library updates, etc).

I'm not aware of a publicly available tool that does this (without bringing in a big clumsy framework), but internally we just use a script that idempotently redeploys all the lambdas by calling update_function_code on them. It ingests package dependency changes by running "python setup.py install" and grabbing the resulting site-packages contents. This script can be subscribed to an SNS-SQS bus that listens to GitHub notifications for changes on a branch, etc.

This will flush hot lambdas, so if you have a lot of traffic it can be a bit disruptive. I can imagine comparing deployment and commit timestamps make-style, or calling update_function_configuration(Description=...) to save and later compare version/commit metadata.
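That make-style comparison can be a tiny pure function. A sketch, assuming the commit SHA recorded at deploy time (e.g. stashed in each function's Description field) is compared against the current commit of its source tree -- the function and dict names are illustrative:

```python
def stale_functions(deployed: dict, current_commits: dict) -> list:
    """make-style staleness check for lambdas.

    deployed: {function_name: commit_sha recorded at last deploy}
    current_commits: {function_name: commit_sha at HEAD}

    A function needs redeploying when its recorded SHA is missing
    or differs from the current one.
    """
    return sorted(
        name
        for name, sha in current_commits.items()
        if deployed.get(name) != sha
    )
```

Only the stale subset would then get update_function_code called on it, leaving hot lambdas for unchanged functions untouched.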

AWS recently came out with https://github.com/awslabs/chalice, which is pretty sweet if you're using Python (although it doesn't seem to directly address the concern of deployment orchestration).

I've only looked at chalice briefly; it was really nice for easily setting up an API Gateway -> Lambda connection, but we haven't really looked at it for our non-API-Gateway uses.

I've only heard of Apex: https://github.com/apex/apex

Has anyone used the AWS API Gateway service with lambdas to set up a small backend for a web service? The next thing I'm thinking about setting up is a slush reader with a GUI, and I'm wondering how difficult it will be to pipe binary files from S3 -> Lambda -> API Gateway -> client. It looks like I'll probably need to encode the binaries to base64 on the lambda side, and will need a library for that.

Yes extensively.

API Gateway => Lambda isn't fun to troubleshoot.

That said, once you get it working, it works really well, and it's ridiculously cheap.

Have you written anything about this, or can you point me to any resources that might help me avoid pitfalls? I'd appreciate it!

Sadly, no, I haven't written anything (if my comments are any indication, I kinda suck as a writer).

The biggest pains are:

* Mapping parameters in API Gateway

* Setting the IAM roles for API Gateway & Lambda (the latter needs a role granting API Gateway permission to invoke it)

* Setting up custom domains for your API (it uses CloudFront, but must be done via the command line, and the process kind of sucks)

I've done this work now 3 times. Feel free to email me any questions you have: balasuar@gmail.com and I'll be happy to answer, and provide guidance.

Thanks, I may take you up on that down the road!

Take a look at https://github.com/awslabs/chalice. IMO it's better than zappa if you are Python-centric. As you can tell from the name, it strives to be the Lambda-based equivalent of Flask.

One tip regarding your workflow: don't pipe data from S3 to Lambda to your client unless you really have to. You can use S3 pre-signed URLs to granularly delegate access to S3 objects for a limited time, then refer your client to them. Even if you are generating the binary content on the fly in your lambda, it's easier to save it to S3 and refer the client to that. That said, if your content is truly too dynamic to live on S3, then gzip+base64 encoding it, serving it in json and decoding/displaying it in javascript should be straightforward.
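The gzip+base64 route is only a few lines on the Python side (function names are illustrative; the client would reverse the steps in JavaScript with pako/atob or similar):

```python
import base64
import gzip
import json

def encode_binary_response(data: bytes) -> str:
    """Pack binary content into a JSON-safe body: gzip, then base64."""
    payload = base64.b64encode(gzip.compress(data)).decode("ascii")
    return json.dumps({"payload": payload})

def decode_binary_response(body: str) -> bytes:
    """Reverse of encode_binary_response, shown here for the round trip."""
    return gzip.decompress(base64.b64decode(json.loads(body)["payload"]))
```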

I'll check this out, thanks!

Zappa is a Python tool that can be extremely helpful for people getting into Lambda. It basically handles everything except writing the code (you have to do that). You can make crontab-like jobs or a website and then push the code up to Lambda with a single command. It handles all the temporary S3 buckets, IAM account creation, fetching the Lambda URL, uploading code, and connecting Lambda to API Gateway. Deployments are super smooth.

pip install zappa

It supports all of the languages for Lambda, not just Python.

I'm glad that tools like this exist, I wasn't aware of them, thanks

For binary content I would recommend serving those files directly from S3 via Cloudfront under a different domain. Lambda and API gateway really aren't set up to work with binaries right now. Even serving .csv files requires some workarounds.

FYI, it is also possible to go directly from API Gateway to S3 without going through Lambda.

https://www.zappa.io/ if you value your sanity. The raw tools provided by Amazon are extremely powerful but are basically the cloud equivalent of programming in assembly.

shameless plug: have a look at https://claudiajs.com - I wrote lots of tutorials and example projects for common use cases of API GW+Lambda

This looks cool -- I'm using Python right now, but if I ever use javascript lambdas I'll give this a try.

Any comments from people who are happy with testability of their Lambda code: how much testing were you able to achieve, and how did you achieve that? Indirection, mocks? Frameworks? I am thinking python, but any language would be interesting.

Thanks for sharing. I have used SES previously but hadn't thought to use it to route INCOMING emails ... was only using it to send ... so that concept in and of itself made it worth the read!

It's a pain to set up, as most of these things are, but it works well after that. You can trigger Lambda functions to read email stored in S3, which is quite convenient. And with a library, parsing the email is easy as pie.
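With the stdlib `email` package, for instance, parsing a raw message the way SES stores it in S3 is a handful of lines (a sketch; which fields you pull out is up to you):

```python
from email import policy
from email.parser import BytesParser

def parse_stored_email(raw: bytes) -> dict:
    """Parse a raw RFC 822 message (e.g. fetched from S3 in a Lambda)
    into subject, sender, and plain-text body."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    # Prefer the text/plain part of multipart messages.
    body_part = msg.get_body(preferencelist=("plain",))
    return {
        "subject": msg["Subject"],
        "from": msg["From"],
        "body": body_part.get_content() if body_part else "",
    }
```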

It works really well, and was easy to set up since I use AWS Route 53 for DNS.

I'm curious regarding the motivation behind this. You are receiving and storing emails, looking at them, and optionally forwarding them for processing. Why not use an email client, and only forward your chosen stories into a linear ingest and publish lambda job? I ask because I'm considering building something similar and I'm thinking about the rationale.

If I was the only person in the organization, that might work fine. However, there are two main drivers of this methodology:

1. I need to allow other people to simultaneously evaluate stories, and a database works well to coordinate multiple people accessing the story queue.

2. Having metadata and stories in dynamo/S3 allows for much more flexible querying of ALL stories, to look for patterns, model the data, etc.

Some friends of mine run a science fiction site. A reasonably serious one - they've been nominated for a Hugo several times. They have WordPress and some email accounts. They do grumble about the website, but it's nowhere near their biggest problem. You are definitely overthinking this.

I'd love to talk shop with your friends, if they're interested please direct them to the site or have them email me at joe@compellingsciencefiction.com!

Just because others make do with less does not mean a solution is overthought.

Serverless? Err, yeah, right...

As a community we should be correcting this proliferation of useless, misleading, and blatantly wrong terminology.

Shameless plug: Our developer site is serverless: http://developer.avalara.com/

Content is stored in our public github repo, and we use TravisCI to build Jekyll pages on commit/merge, which are uploaded and served from S3.

Interactive functionality is provided via React in the front end, and API-Gateway/ AWS Lambda in the backend.

API Gateway and Lambda provide proxies to our sandbox API environments, so developers can quickly try our APIs directly on the site.
