Serverless computing: one step forward, two steps back (acolyer.org)
117 points by godelmachine on Jan 14, 2019 | 12 comments

The lede is kind of buried. The authors were attempting to use FaaS for a long-running, number-crunching, distributed computing algorithm.

A case of selecting the wrong tool for the job. Not what serverless was designed to do.

It's really just that they selected the wrong serverless architecture.

Code repo -> docker build -> AWS Fargate. It's easy to set up a CI/CD pipeline that doesn't require you to manage servers and can support long-running, CPU-intensive, and even GPU applications.

(You may have to manage the fleet a bit for GPU support, i.e., create your own ECS cluster with autoscaling -- but not actively manage the deployment of instances.)

+1 on this. The paper doesn't really address the most typical use cases of serverless, which are API backends, streaming data processing (feeding off Kinesis/Kafka/similar), and reactions to one-off-ish events like object uploads to storage or events from other systems. While folks like those building Pywren have found success with highly parallel workloads, FaaS still isn't the typical way people go after distributed computing workloads, and no one from the vendor side is pushing this. - munns - Lead Dev Advocate - AWS Serverless
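For the "object uploads to storage" case, a minimal sketch of what such a function might look like. It assumes the event follows the S3 notification layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`); the names `uploaded_objects` and `upload_handler` are made up for illustration, and the "processing" is just a placeholder.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Return (bucket, key) pairs from an S3-style notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def upload_handler(event, context):
    """Hypothetical entry point: react once per uploaded object."""
    processed = [f"{bucket}/{key}" for bucket, key in uploaded_objects(event)]
    return {"processed": processed}
```

The function holds no state between events, which is roughly why this kind of workload fits FaaS better than a long-running computation does.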

> Not what serverless was designed to do.

I think the point of the piece was more "what serverless could be" rather than what we have right now. I don't think there's any concrete definition of what serverless is, let alone what it's designed to do!

The thing the article brushes over is the maintenance/operational burden of running stuff on VMs. It says they achieved much better performance training a machine learning model on a single EC2 instance, but fails to mention setting up monitoring, logging, deploying code, making that reliable, etc.

That's the trade-off you make.

But the question is: could it be made good for that domain?

See https://arxiv.org/abs/1702.04024 [cited by this paper] for promising results doing linear algebra using AWS Lambda.

aka "we were close to the precipice, but we made a big step forward"

Personally, I expect bare metal to be "rediscovered" in a few years after being given a fancy name like "cloudless".

"Serverless" just means distributed programs, which implies its own limits, including latency and costs (compared to running on bare metal, where there is no need to adjust to costs) that are often unpredictable and that carry consequences for the linked accounts.

Projects have limited budgets. I once received a bill of several thousand dollars from my hosting company after a mistake of mine used up a lot of bandwidth.

One call and it was cleared. Had that not been possible, it was an important partnership, but not one I couldn't have severed.

With Google, I fear it would have cost me being locked out of Gmail or something like that.

With regard to some of those downsides, there's definitely a theme in some circles of "let's just take this traditional app and write it in lambda".

The 15-minute execution limit is a limitation, but you could equally write "memory per request is measured in gigabytes" for a traditional server setup. Similarly, you are limited by network bandwidth, but you could equally say that you're limited by SSD bandwidth on a traditional server; serverless makes it easier to fan out and capitalize on the aggregate network bandwidth between ten functions, whereas doing the same between ten VMs requires... ten VMs.
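To make the fan-out idea concrete, here is a rough local sketch. `invoke_function` is a hypothetical stand-in for one remote function invocation (here it just sums a chunk), and threads stand in for concurrent invocations; nothing below talks to a real provider.

```python
from concurrent.futures import ThreadPoolExecutor


def invoke_function(chunk):
    """Hypothetical stand-in for one remote function invocation."""
    return sum(chunk)


def fan_out(items, workers=10):
    """Split items into roughly equal chunks and process them concurrently."""
    size = max(1, -(-len(items) // workers))  # ceiling division
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(invoke_function, chunks))
    # Combine the partial results returned by each "function".
    return sum(partials)
```

Each chunk would travel over its own function's network link, which is where the aggregate bandwidth argument comes from.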

What are the real problems we've run into?

- Latency. Bandwidth has never been an issue. Latency is, which includes cold start time but also includes network roundtrips and things like being unable to pool DB connections.

- Intermediate services. The article mentions this one, and they suck. "Just put API Gateway in front of it" when all you want is a dumb proxy with SSL makes a nimble little lambda function feel like a lumbering behemoth. SSL demands CloudFront to function, which has a whole new class of behavior you need to learn, and is SO UNBELIEVABLY SLOW to provision and update. Then adding additional behavior like API Keys or DNS all happens outside the function, so instead of managing a single little service you have to manage a single little function + four-to-ten more AWS resources.

- Tooling. There's no shortage of it, but IMO the ideal tooling setup for a lambda function is one which feels as light as the lambda itself. Serverless (the framework) isn't that. It's crazy complex. Apex is the closest I've seen a toolchain get, and it's pretty great, but might actually be too simple. The CDK shows promise, but also feels too complex. I think the problem here is truly that AWS is just too complex, and there's no way to abstract that complexity in a way that works for the 99% use case.

- Configuration. The big thing I don't get in this category is why I can't map lambda environment variables to values in Secrets Manager. In order to deploy any environment variable to a lambda, the deployment environment needs access to the value of that variable, which is a security NIGHTMARE. We've instead resorted to loading sensitive configuration at function start, which sucks for performance and cost, but security comes first. (We're specifically storing configuration in DynamoDB w/ encryption at rest.)
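One common way to soften the "load configuration at function start" cost is to cache the result in module scope, so only a cold start pays for the lookup and warm invocations in the same container reuse it. A minimal sketch, assuming module-level state survives between warm invocations; `fetch_config` is a hypothetical placeholder for the real lookup (e.g. the DynamoDB read), and the returned values are made up.

```python
import time

_CONFIG = None  # survives between invocations while the container stays warm


def fetch_config():
    """Hypothetical slow lookup, e.g. a read from an encrypted table."""
    time.sleep(0.05)  # simulate a network round trip
    return {"db_password": "example-only"}


def get_config():
    """Load configuration on first use, then reuse the cached copy."""
    global _CONFIG
    if _CONFIG is None:
        _CONFIG = fetch_config()
    return _CONFIG


def config_handler(event, context):
    """Hypothetical entry point that needs the sensitive configuration."""
    config = get_config()
    return {"configured": "db_password" in config}
```

The same pattern is often used to hold a single DB connection per container, though that is not real pooling and does not help cold starts.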

Overall, I think the serverless model is really great. But doing it on AWS sucks so so bad, and I don't see a future where it gets better. There's just so much unnecessary complexity on AWS that I'd rather were abstracted away.

If I were designing a serverless architecture from scratch, I'd probably opt to build it on Kubernetes with something like Knative. But that could have its own host of issues I'm unfamiliar with.

> SSL demands CloudFront to function

I don't think that's the case for apigw/lambda unless you're picky about your API Gateway URL (which seems unnecessary, as that can all be hidden away inside the JavaScript?).

> - Tooling. There's no shortage of it, but IMO the ideal tooling setup for a lambda function is one which feels as light as the lambda itself. Serverless (the framework) isn't that.

I believe the tooling needs to be, and could be, waaaay better, but Zappa has seemed pretty decent to me in my relatively simple use cases.

We used to do the DynamoDB thing, but nowadays we use AWS Parameter Store to store secrets. You can also configure the IAM role of the lambda so it can only read/decrypt the particular parameters it needs.

Definitely agree with you that it's an irritating extra layer of annoyance, though.
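A minimal sketch of the Parameter Store approach, assuming the secret is stored as a SecureString and the function's role is allowed to read and decrypt it. The client is passed in so the helper can be exercised without AWS; `read_secret` and the parameter name are hypothetical.

```python
def read_secret(ssm_client, name):
    """Fetch one SecureString parameter and return its decrypted value."""
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]


# Inside a Lambda this would typically be wired up as:
#   import boto3
#   password = read_secret(boto3.client("ssm"), "/myapp/prod/db_password")
# The parameter name above is hypothetical.
```

The deployment pipeline then only needs to know the parameter's name, never its value.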

What does serverless give you that containers don't?

You can deploy containers in a "serverless" configuration, if you're not actively managing the underlying fleet and scheduling (e.g., Fargate). FaaS is just one kind of "serverless" computing.

Most of the modern "serverless" systems either operate with containers, or are just mounting your FaaS into a provider-based container anyway.

Containers with provider-managed infrastructure are really just "process as a service", which is often a better match for your software than "functions as a service".
