Here is an example of a program built in Python that uses Kappa, and here is a video tutorial on how I deploy that program with Kappa.
Obviously I disagree with the premise. It's true that it is more difficult to use than other technologies and you'll certainly pay the pioneer tax, having to develop your own tooling, but it's ready for production traffic.
Error handling is OK but could be better (it takes a while for the CloudWatch logs to show up).
The real big problem is testing. It's really hard to test if you have more than one function because there is no mocking framework (yet). It's fairly easy to deploy and test with a test account, but local testing still needs to be solved.
The really hard part is integration testing. You can Chaos-Monkey your Lambda functions, but you can't Chaos-Monkey DynamoDB. We're looking at ways of building tooling to do that.
For deployment, we wanted to use Serverless, but as they started to move away from CloudFormation, that didn't work for our more enterprise needs, so we've been rolling our own, based on ideas from Kappa.
Have you guys figured out a way to do offline testing of one Lambda calling another? I'd love to see it!
I bet you could achieve this effect by either temporarily decreasing provisioned throughput, or inserting clientside middleware that occasionally refuses to do a DynamoDB op.
But it means you have to write the middleware, which becomes its own source of bugs; that's still better than nothing, but it's another risk.
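The middleware idea above could be as small as a wrapper that makes a configurable fraction of DynamoDB calls fail. A minimal sketch (the class name, failure rate, and injected `rng` hook are assumptions; the real boto3 client would be passed in as `client`):

```python
import random

class ChaosDynamoDB:
    """Wrap a DynamoDB-like client so that a fraction of operations fail,
    letting you test how the rest of the system copes with throttling."""

    def __init__(self, client, failure_rate=0.1, rng=random.random):
        self.client = client
        self.failure_rate = failure_rate
        self.rng = rng  # injectable for deterministic tests

    def __getattr__(self, name):
        # Proxy every client method, occasionally raising instead of calling it.
        op = getattr(self.client, name)

        def chaotic_op(*args, **kwargs):
            if self.rng() < self.failure_rate:
                raise RuntimeError(
                    "chaos: simulated ProvisionedThroughputExceededException"
                )
            return op(*args, **kwargs)

        return chaotic_op
```

Injecting the random source makes the chaos deterministic under test, which partly answers the "middleware as its own source of bugs" worry.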
Anyone else heard the same?
Someone might hate python just because.
During re:Invent 2015 there was talk about supporting a Windows runtime environment.
Arbitrary Executables = Arbitrary Linux Executables.
Since Python 3 runs on Linux, you can invoke Python 3 scripts.
Although actually, I see what you mean now. D'oh. Windows was so far out of my mind I wasn't even thinking of it as a distinction to be made.
But yes in 2016 everyone needs day 1 Python 3 support. Python 2 support is optional.
Of the top 360 libraries, only 332 support Python 3 (but almost all of them still support Python 2).
This analysis was done in March and projected roughly even support for Python 3 and Python 2 around now:
So maybe next year it will be optional, but not yet.
So yes, from the standpoint of a hosting company like Amazon, it seems to me that Python 2 should be the optional one: not guaranteed, and maybe not even priced the same as Python 3 going forward. Python 3 is the present of the language; you are asking them to support its past. Perhaps you should be willing to pay for them to support the past of the language.
(Of course, this particular example is moot because it sounds like Amazon themselves have engineers in the "Python 2 or Die" clade, but it's the example at hand.)
The more people that say "no python 2" the sooner it will die.
I don't know, the whole Python 2 vs 3 thing is weird. I bet someone has a good book on it.
I can't imagine rolling out a new Python hosting product in 2016 and not even covering Python 3?!
I'm not sure how this applies to Lambda, but we've found a workable solution for Service Fabric: the services and actors are almost nothing more than a thin host for the actual implementation (or are DI'd into the implementation) - which is defined in a completely separate package/assembly.
There are very few times when you are offline and want to run a test suite, and it's much easier to handle those than to try to fake the service AWS provides.
When the events eventually came through and fired, the logs reflected the time, hours earlier, when they should have fired.
Sadly, I don't have a support contract so I couldn't get any help. The forums just assumed I was doing something wrong, until the outage which was linked to dynamo IIRC.
We moved on from lambda as well.
To what, might I ask?
I love Lambda, and if I had a budget and a contract, I'm sure Amazon would have solved the issue. However, I think it was really an underlying problem: I believe Lambda uses DynamoDB streams under the hood.
We've been running APIs on it for six months with no issues, and are now in the process of moving the entire backend from heroku to Lambda. So far, no major issues.
Documentation is there, just not in the most sensible places, and the whole pipeline is optimised for Java processing (e.g. Velocity VTL for API Gateway transformations lets people do everything they need, as long as they know the Java collections API executing underneath).
Another thing is that although API Gateway can behave like a web server for most things, some things are more restricted than with a fully programmable server. For example, you need to declare all HTTP response codes upfront so that the pipeline can be configured, and there can be only one success code (so, e.g., returning 204 when there is no content and 200 with content from the same endpoint isn't trivial).
The third thing, which several commenters mentioned on this page, is that if you want to use API Gateway, processing binary data requires S3 or something else for storage. We currently let people convert files by using API Gateway + Lambda to get a signed URL for S3, then POST to S3; another Lambda converts the file into a PDF and saves it back to S3, and the client polls S3 to pick up the result. It's fantastically scalable with that design, much better than posting a file to heroku and getting a synchronous response, but it takes a bit of rearranging the code to work.
To top it off there's no way to report non-UI bugs unless you are paying $$$.
API Gateway seems to have been pressed into the role of proxying Lambda without necessarily being fit for purpose. If I understand correctly, API Gateway is a third-party acquisition (which is why its API behaves quite differently from most AWS APIs) originally intended to translate and proxy between JSON and XML APIs.
The API Gateway has the concept of "custom authorizers" in which another lambda function gets called to authorize the request before it gets passed to the real lambda function (WHY!!!).
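For anyone who hasn't hit this: the authorizer function receives the request's token and has to answer with an IAM policy document in AWS's documented response shape. A rough sketch (the token check and principal are toy assumptions):

```python
# Sketch of an API Gateway custom authorizer Lambda: it runs before the real
# function and must return a policy allowing or denying execute-api:Invoke.
def authorizer_handler(event, context):
    token = event.get("authorizationToken", "")
    # Toy validation; a real authorizer would verify a JWT, look up a key, etc.
    effect = "Allow" if token == "expected-secret-token" else "Deny"
    return {
        "principalId": "user",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": event["methodArn"],
            }],
        },
    }
```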
The API Gateway feels more like "enterprise" software than anything else I've used in AWS (not in a good way).
I hope the AWS team and jeffbarr are listening.
That's not necessarily true. The AWS forums are pretty active, and I was actually able to get some help/feedback from engineers themselves including promises to look at improvements. This is also a good example of how limiting API Gateway is - https://forums.aws.amazon.com/thread.jspa?threadID=228067
I wish that it started with the verbs rather than the resources. Set up basic behavior for GET, POST, PUT, &c. and apply it to all resources.
Lambda is unpredictable, which is probably its biggest downfall. You can get super fast deployment and execution one second; the next, you're getting random execution failures out of your control.
Lambda often feels like it's unsupported by AWS. It took them over a year to support the latest version of Node.
Java perf is terrible and support should either be dropped or fixed. Go really should be supported out of the box.
Responses from lambda are not flexible.
Deployment of Lambda is frustrating, and the inability to trigger a Lambda from SQS is even more frustrating.
Could go on.
But my personal experience says Lambda is ready for prime time. We use it in production: ~15 million API calls per day, mostly user-facing HTTP requests. Even with the rough edges I would prefer to use it for any new web development project. It feels like at least an order of magnitude reduction in the risk/complexity of scaling and deploying code. It's not "zero", but that is huge for me and my team. We spend more time shipping.
We wrote one: https://github.com/bustlelabs/shep but there are plenty of others mentioned in the comments.
Isn't that horrifically expensive?
I have one hard example I can share. We had a node service that was running on ec2 and cost ~$2500/mo. Moved the code directly over to lambda. Now ~$400/mo.
Quantifying other costs is a bit harder but do you have a DevOps person on your team? Or multiple people? How much do they get paid?
$3.50 (cost per million calls) × 15 = $52.50
Lambda is pretty cheap.
For example, a 256 MB function running for 300 ms and called 15,000,000 times would cost about $21.77.
All together that's $74 a day for just lambda and API gateway without any extras (cache, bandwidth pricing etc).
Maybe more expensive than raw infrastructure, but it's a pretty inconsequential amount of money per day for close to no ops.
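The arithmetic behind the ~$74/day figure above works out like this (using 2016-era list prices; treat the rates as assumptions that change over time):

```python
# Daily cost sketch for 15M API calls through API Gateway + Lambda.
calls = 15_000_000

# API Gateway: $3.50 per million requests.
apigw = 3.50 * calls / 1_000_000                 # $52.50

# Lambda compute: 300 ms at 256 MB per call, billed per GB-second.
gb_seconds = calls * 0.300 * (256 / 1024)        # 1,125,000 GB-s
lambda_compute = gb_seconds * 0.00001667         # ~$18.75

# Lambda requests: $0.20 per million invocations.
lambda_requests = 0.20 * calls / 1_000_000       # $3.00

total = apigw + lambda_compute + lambda_requests  # ~$74/day
```

That excludes caching, bandwidth, and any other AWS services, matching the "without any extras" caveat above.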
Practically any smaller instance type (e.g. m3.medium) can handle this small a load all by itself without breaking a sweat, and instead of paying $74 per day, it would cost less than $74 per month.
In fact, ELB + an ASG of three t2.micro's running continuously would cost around $49 per month, not per day, and possibly around the same amount of effort (or less) to create/maintain/manage.
It's somewhat apples and oranges, but there's no doubt that lambda is expensive compared to plain old EC2, and that cost disparity increases linearly with scale.
But you have API management to sort out and versioning to solve, which API Gateway handles fairly easily.
API Gateway is connected to CloudFront for low latency.
You can simply add a cache for your API.
You have analytics already set up and ready to go.
There are also the other things API Gateway provides, like API keys, auth, and Cognito integration.
You can deploy and maintain tens of Lambda functions fairly easily; to get something similar you would have to use a container service like ECS or Kubernetes and figure those out, compared to just deploying your code with one of the frameworks out there for Lambda.
I'm not looking to put down Lambda, although it could maybe be a bit cheaper; we use EC2/ELB/ASG extensively with Userify but we might use Lambda for eventing-based services in the future. Evaluating each on its own merits will probably give you the best picture of what's right for your project and team.
This approach lets you avoid a lot of the Amazon API Gateway hassle mentioned here.
RAML (http://raml.org/) support also seems to be around the corner (https://github.com/awslabs/aws-apigateway-importer).
"API First" is in general quite promising.
But we spent the better part of 4 weeks figuring all of this out and automating it. Once automated, it's pretty brilliant.
It's not just the automation of getting a lambda function and api gateway working together (though that's a royal pain). It's building the tools to develop and test locally (which we've also done).
The service we created to automate everything is called Joule. It's not ready for prime time, but you can kick the tires at https://joule.run (it supports Node; Python would be easy if we ever get around to it). Docs are here: https://joule.run/docs/quickstart
Anyways, the point is that it's possible and pretty amazing once you start deploying microservices using Lambda.
DDNS using Lambda and Route53 and Joule - https://medium.com/@jmathai/create-a-serverless-dynamic-dns-...
Group Text Message Channel using Lambda and Twilio and Joule - https://medium.com/@jmathai/create-a-group-text-channel-in-u...
Sources for the above Joules are on GitHub.
Edit: here's a Joule that takes an area code, looks up the city name by parsing Google search results, and uses that to get a Creative Commons photo from 500px.
Source (ugly but functional) - https://github.com/jmathai/area-code-500px/blob/master/src/i...
1) Documentation is bad, but not insurmountable. There's enough usage of these platforms now that you'll get pretty far searching for and adapting open source code.
2) Error handling is fine once your code is running, but getting execution there (and the response out again) can be painful.
3) Once sufficiently automated, all these woes go away.
This automation could be done with a framework, however I was skeptical of giving something like Apex or serverless access to my AWS account. Instead I've hand-written terraform for all of my deployment. The documentation isn't great, but there are enough examples out there now to make it possible to glue something working together. I started with this project and wrote a bunch of bash and terraform templates to make it extensible.
Put another way, AWS Lambda is really good for coordinating/dispatching tasks based on events happening in S3, SNS, DynamoDB, etc.
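The coordinating/dispatching pattern is usually just a handler that routes on the incoming event. A minimal sketch using the S3 event shape (the routing rules and action names are made up):

```python
# Dispatch on S3 object-created events: route each changed key to a task.
def handler(event, context):
    results = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        if key.endswith(".csv"):
            results.append(("ingest", bucket, key))   # e.g. kick off an import
        else:
            results.append(("archive", bucket, key))  # e.g. copy to cold storage
    return results
```

The business logic lives elsewhere; Lambda just decides what happens next, which is the sweet spot the comment above describes.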
Overall, after working with Lambda for a while, I'm not a big fan of using it to write services that hold business logic.
I've used lambda for supporting a static website's need for storing and e-mailing forms, and collecting analytics events. It is very limited compared to what this framework provides, but I already see how helpful it could be.
The first point revolves around AWS not providing tools; they provide building blocks. To me this is true of basically all of AWS, not just Lambda. If you come into AWS without understanding that the services are building blocks meant to be used together (if you want EC2 to be a virtual hosting service, for example), you will experience pain and suffering. You are expected to use RDS if you want to store data, not put it on an EC2 instance. You are better off using their blocks together than trying to roll your own.
The second point, about the documentation, seemed like a non-starter... I took him up on the challenge of finding out how to pass arguments to a Python Lambda, so I typed "aws lambda python" into Google, which suggested adding "example". The first hit showed "If you are passing in this JSON, your function will look like this".
Obviously this guy had a hard time deploying a Python Lambda, and I haven't tried it myself, but the complaint felt a bit off base to me.
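For reference, the pattern those docs show is about as small as it gets: whatever JSON you pass at invocation time arrives as the `event` dict argument (the field names here are illustrative, not from any particular doc page):

```python
# Invoked with the payload {"first_name": "Ada", "last_name": "Lovelace"},
# the JSON is deserialized into the `event` dict before the handler runs.
def lambda_handler(event, context):
    return "Hello %s %s" % (event["first_name"], event["last_name"])
```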
Heaps of people run their own datastores on AWS, this doesn't feel like a great example.
The "Programming Model (Python)" section addresses the OPs complaints about documentation.
It makes it trivial to tail, filter and range all the logs in a log group.
Add some structured application logging that is Lambda-function and request aware and includes backtraces, and you can make sense of what's going on.
Here's the setup I'm hoping to leverage Lambda for: light workers. I have kue.js/redis for submitting jobs and creating workers. My subscribed worker listeners will simply trigger lambda.invoke with JSON payloads (no need to call or support HTTP endpoints, and no need for API Gateway either).
I'm starting with apex.run as a deployment tool and writing/running all tests locally. I assume local testing is doable by calling the same exported functions with mocked inputs, though this could be off.
See any big hurdles with that usage plan?
As an aside, I've got a backlogged task to explore Serverless, and their moving away from CloudFormation shouldn't be an issue (Terraform, I assume?).
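The "listener triggers lambda.invoke with a JSON payload" step above looks roughly like this in Python with boto3 (a sketch: the function name and job shape are made up, and boto3 is imported lazily so the payload-building part runs without AWS credentials):

```python
import json

def build_invoke_args(job):
    # Separate pure payload construction from the AWS call so it can be
    # unit-tested offline, per the local-testing plan above.
    return {
        "FunctionName": "light-worker",     # hypothetical function name
        "InvocationType": "Event",          # async fire-and-forget, no HTTP
        "Payload": json.dumps(job).encode("utf-8"),
    }

def trigger(job):
    import boto3  # deferred so offline tests never touch AWS
    client = boto3.client("lambda")
    return client.invoke(**build_invoke_args(job))
```

With `InvocationType="Event"` the call returns immediately (HTTP 202) rather than waiting for the function's result, which fits a fire-and-forget worker queue.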
This sounds like a useful metaphor, but I'm unable to fully comprehend it because I suck at thermodynamics and information theory. Could you please elaborate a bit and include some examples of high-entropy names?
(I'm asking because naming is hell, and I'll need to name three products very soon).
tl;dr: if you use names that are similar to other names, you convey less information every time you use them. So pick a name that doesn't require additional bits to disambiguate.
Note that English conveys roughly 1 bit of information per word.
I think that's per character, not word.
You could estimate this by doing a search.
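One crude way to see the distinction: a zero-order (unigram) frequency count gives around 4 bits per character for English text, while Shannon's famous ~1 bit/char estimate accounts for context (letter patterns, word structure) that a naive count ignores. A sketch of the naive estimate:

```python
import math
from collections import Counter

def char_entropy(text):
    """Zero-order entropy in bits per character: an upper bound that
    ignores all context between characters."""
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

So "1 bit per character" and "4 bits per character" can both be right; they just measure with different amounts of context.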
This tool will be open-sourced in the coming weeks, but at the moment it is in closed beta. I'm looking for people interested in giving it a look. If you are interested, drop me an email at me[at]jorgebastida.com and I'll invite you to the repo.
We haven't had any recorded incidents of lost data. We did have an issue after a security upgrade where older accounts did need to be migrated with new access tokens. If you lost any data drop me an email and I'd be glad to find it.
The internal datastore is mostly for development purposes. We generally recommend persisting data to an external database like DynamoDB.
If it would make you more inclined to try the platform, I'm glad to adjust the tone of the copy. I get easily excited!
I wonder why that is...?
I can suggest giving TJ's project apex.run a try; it saved me a lot of time.
Also, errors and debugging are difficult, but remember there is an EC2 instance behind Lambda in the end. I just mocked the Lambda function and debugged on a real server.
Overall, Java is consistently slow under the same configuration, so you might want to use a higher memory config for Java to get stable performance.
You can redirect to a file the Lambda function writes out. But that sucks.
AWS Lambda is the perfect use case for something like dynamic image resizing. Except if you use it for that, you'll force all your users through a redirect when fetching images, and there's no easy way to clean up when you do it that way, either.
I guess this depends on your setup, but I don't see why you would have to do this. Lambda takes the image in, resizes them and uploads the result to S3.
If you use predictable S3 paths, your clients can just look those up. Of course, there are things you'll need to watch out for, but no redirects needed.
Any other webserver simply does: get image off S3 -> resize it -> send raw binary data to the user (and optionally cache the resized image if you think it'll be requested at that size again).
The Lambda flow is: get image off S3 -> resize it -> upload to S3 -> redirect the user to that image via a CloudFront URL.
Those are unnecessary steps caused by Lambda's inability to get binary data out through API Gateway, particularly from the user's point of view: you double the number of requests they have to make to fetch images.
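The redirect hop on the Lambda side ends up looking something like this (a sketch: the key scheme and CloudFront domain are hypothetical, and the actual resize/upload is elided), since the function can't hand the image bytes back through API Gateway:

```python
# After resizing and uploading to S3, the best the function can do is
# hand back a 302 pointing at the CloudFront copy of the result.
def handler(event, context):
    key = "resized/%s/%s" % (event["width"], event["key"])
    # ... resize and s3.put_object(...) would happen here ...
    return {
        "statusCode": 302,
        "headers": {"Location": "https://dxxxx.cloudfront.net/%s" % key},
    }
```

Which is exactly the second round trip being complained about: the client then has to fetch the Location URL itself.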
Lambda is great for running code in response to events (like SNS messages or S3 file uploads), but I really don't think it should be used for something like handling web requests. Just because you _can_ do something with a million steps and configuring 50 AWS services together, doesn't mean you _should_.
You're far better off sending out an SNS or SQS message triggered by an S3 object-creation event.
AWS in general is not well documented. Well-written drivel mostly.
My guess is that there's little to no feedback between documentation writers & developers actually trying to use that documentation to achieve real outcomes. Same could be said about many of the AWS UIs.
Is there any service out there that does this? Basically I'm willing to pay $1 to rent a c4.4xlarge instance for just 1 minute. Keep in mind that the hourly rate of a c4.4xlarge is only $0.621 in US East, so I'm willing to pay a huge premium here.
You could then have your worker lambdas being triggered off the SNS publication and doing the work.
Joyent's Triton is pretty interesting for running one-off Docker containers, though the last time I tried it, starting one took a variable number of seconds.
AWS Lambda is only going to get better now that even Google has come up with its own implementation, Google Cloud Functions: https://cloud.google.com/functions/docs/
It's more than ready for prime time! It's an awesome tool.
However, I suspect they don't do that because it's built to be fast on a streamed event from another of their services, and if you want that feature you're supposed to implement it yourself by dumping each event to S3 or something... but it should really be available as something you can turn on.
My only two complaints: Python 2 and code deployment. From reading the responses, it sounds like there may be options for both.
Lambda seems best suited for data-processing tasks, one-off 'cron'-style jobs, and synchronous request/response tasks that don't execute frequently.
Data processing (e.g. pulling events from Kinesis, responding to S3 events) seems like the perfect use case for Lambda; we have thousands of Lambda invocations a minute doing exactly that and it works fine.