Ask HN: Skeptical about my company going “full serverless”. What am I missing?
109 points by birdstheword5 26 days ago | 147 comments
An Azure Cloud Architect has joined our company. They are recommending that all our web apps that are currently deployed in Azure App Services/ VMs should be (gradually) split up into Azure Functions going forward.

I'm skeptical - I was under the impression that serverless was for small "burstable" apps with relatively low traffic, or background processing.

The two products I work on are both REST APIs that send and receive data from a user interface (react) with roughly 60 API routes each. They have about 100 concurrent users but those users use the apps heavily.

The consensus on the internet seems to be "serverless has its use cases" but it's not clear to me what those use cases are. Are the apps I'm working on good use cases?

Any "architect" joining the company and pushing for drastic changes like that is either an idiot chasing the hype, or a malicious actor trying to boost their importance.

Good news: your gut feeling is correct.

Bad news: you will likely lose this battle, unless you're good at playing company politics.

Here's how it typically goes:

1. A new lead/architect/manager joins the company.

2. They push for a new hyped technology/methodology.

-> you are currently here <-

3. The team is split: folks that love new things embrace it, folks that care hate it, rest are indifferent.

4. Because the team is split, the best politician wins (usually the new hire).

5. Switch happens and internally you know it's a fucking disaster, but you're still forced to celebrate it.

6. When disaster is becoming obvious people start getting thrown under the bus (usually junior engineers or folks that opposed the switch).

> Any "architect" joining the company and pushing for drastic changes like that is either an idiot chasing the hype, or a malicious actor trying to boost their importance.

Or, which seems more likely, but still just as bad - someone chasing the "successfully redesigned the infrastructure on a scale of the entire company" on their promo packet and resume.

Whether it actually improved the infrastructure is of no concern to them. Not needing to even try to understand the existing infra in-depth to make it happen, and the added job security (due to making the infra more complex and confusing for the rest of engineers to understand), are just a cherry on top.

Call it a cynical take, but this is one of those situations where "the simplest explanation is probably what actually happened" feels about right.

> someone chasing the "successfully redesigned the infrastructure on a scale of the entire company" on their promo packet and resume.

Also don't underestimate someone that is smart and well intentioned but dangerously overconfident and completely unaware.

These may be the toughest for manager types to sniff out because they believe their own hype.

5.5 - the new hire adds this project to their resume, uses that to land a new job with a fancier title, and leaves the project to fall apart because there's no longer anyone committed to its success.

Seen that. Currently living it.

Honestly it has pretty much destroyed our business and customer relations looking after the pile of steaming shit that remains.

Don't leak my playbook, spiffytech..

How to derail this dynamic early:

"Does this solve a problem we have?"

"Of the top five things that we are trying to build, does this add any of them?"

Often the problem isn't that the new methods/tech is bad but that all the effort spent transitioning to it could be better spent directly attacking the goal in the first place.

I wish it was that easy.

"Of course it does: engineers will be more productive, we'll have less bugs and better performance. {TECHNOLOGY} has been around for {N} years and getting more and more traction. Do you want us to be ahead of our competitors or not?"

"You need to understand that sometimes you have to sharpen your axe before cutting the trees. Have you heard the phrase 'work smart, not hard'?"

"So essentially you're saying we shouldn't change anything despite having issues (bring up any bug/downtime you had recently). John, sometimes you have to escape the comfort zone and learn new technology."

I once worked with a person who talked like this in front of non-technical founder. I bet he did sound quite convincing, it took me a while to cut through the bullshit.

Those quotes ain't generally bullshit. They are general and generic pieces of wisdom - but of course, whether they are really wise depends on the actual technology and the actual product and the actual problems.

Yeah that doesn't work with things as hugely hyped as the cloud, agile, etc. Obviously it's a cult and it doesn't solve anything in and of itself, but you're not allowed to say that.

Rather, I'd go for:

"Let's solve this problem we've been having with Azure Cloud Functions!"

"Of the top five things that we are trying to build, let's build them with Agile!"

Otherwise, you are not a team player, part of the problem not the solution, etc. It sucks, but that's politics.

Sadly, I think you are completely correct, and this exact sequence of events will go down. I don't think they're an idiot or malicious - I think they're clever and mean well, but perhaps they're blinded by their enthusiasm after getting certified.

Or it's pure RDD (resume driven development.)

That said, you can figure out (just ask!) if this person was brought in as the result of an already-made decision to move to this type of development, or if this person is pushing it. As an EM, I'd have no problem with a report directly asking that question.

If the answer is that the decision was made, and this hiring was a downstream result, then your choice is likely to deal with it or move on. If the decision hasn't been made / this architect is advocating for it, then a cost-benefit analysis (which you'd think would be a standard part of any work) will be illuminating. My understanding is that the debugging and monitoring tooling for serverless is still quite raw, although my bias is towards not wasting innovation tokens on things like serverless: I align strongly with the choose boring technology view advocated by Dan McKinley [1], amongst others.

[1] https://mcfunley.com/choose-boring-technology

just a great explanation ;)

> The consensus on the internet seems to be "serverless has its use cases" but it's not clear to me what those use cases are.

My $0.02, having used serverless before. Those use cases are:

* Very very low traffic apps. POST hooks for Slack bots, etc.. Works well!

* Someone who is an "architect" can now put "experience with Serverless" on their CV and get hired somewhere else that is looking for that keyword in their CV scans.

Those are any and all use cases.

Agreed. The only time I reach for lambda/cloud functions is for proof-of-concepts or throwaway code (low traffic is a valid use-case though). I'm not sure about Azure Functions, but GCP Cloud Functions can have atrocious cold-start times, depending on language.

This is absolutely a huge step backwards in terms of architecture. And methinks this architect doesn't have the broadest understanding of Azure and is reaching for the easiest tool in the toolbox.

Edit: "serverless" is a broad term. There's a difference between shoehorning every microservice into a lambda-like service (bad), and migrating from a collection of VMs to a managed Azure service that does the exact same thing (can be good).

There are tons of low traffic parts of apps. So if you're building in more of a 'microservice' architecture instead of a giant monolith, serverless has its place. Most of your app is serving requests to users, but your signup and login flows might only run hundreds of times per day and they are simple workflows: check if user exists > create new user > add password > send email verification.

Then, the email verification is checked by a second function. You click the link, it updates a field in the DB and redirects to the main page.

login is the same, we validate the creds and give a token.

And then the only infra we maintain is the stuff we have always on and serving mass requests. There's a lot of stuff that doesn't run that often that is more or less a single function, and running a fleet of redundant machines for that doesn't really add value.

It just depends on how your app is designed as to if it makes sense.
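
A rough sketch of what that signup/verify pair could look like as two small functions. Everything here is hypothetical: the in-memory `USERS` dict stands in for a real database, `send_verification_email` is a placeholder for an email service, and the event shape is made up for illustration.

```python
import hashlib
import secrets

# Hypothetical in-memory store standing in for a real database.
USERS = {}
# Hypothetical outbox standing in for a real email service.
SENT_EMAILS = []


def send_verification_email(email, token):
    """Placeholder: a real function would call an email provider here."""
    SENT_EMAILS.append((email, token))


def hash_password(password, salt):
    """Derive a password hash with PBKDF2 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def signup_handler(event):
    """Check if user exists > create user > add password > send verification."""
    email = event["email"]
    if email in USERS:
        return {"status": 409, "body": "user already exists"}

    salt = secrets.token_bytes(16)
    token = secrets.token_urlsafe(16)
    USERS[email] = {
        "salt": salt,
        "password_hash": hash_password(event["password"], salt),
        "verified": False,
        "token": token,
    }
    send_verification_email(email, token)
    return {"status": 201, "body": "verification email sent"}


def verify_handler(event):
    """Second function: the emailed link flips the verified flag."""
    user = USERS.get(event["email"])
    if user is None or not secrets.compare_digest(user["token"], event["token"]):
        return {"status": 400, "body": "invalid verification link"}
    user["verified"] = True
    return {"status": 302, "body": "redirect to main page"}
```

Each handler is self-contained and only touches shared state through the store, which is roughly what makes this kind of flow easy to run as separate functions.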

If the load is so minimal for those functions and you are already running other infra for the rest of your app why would you "outsource" those functions to a serverless runtime? What do you gain besides another system you deploy to?

config drift

What do you mean?

By duplicating and distributing parts of your app into a bunch of distributed targets you will, eventually, accumulate drift in assumptions about your environment/partners/libraries/databases/etc

Right, so that is an argument against outsourcing low-traffic stuff to a serverless runtime, right?

After using AWS Lambda and Kubernetes for different situations, I would go for Lambda every time unless I had a compelling reason to use containers, because 99% of what people have asked for is a CRUD application.

Scales down to 0, so effectively no cost overnight when used infrequently, and if you use a scripting language like Python or JavaScript the startup times are in the ms for a new instance when scaling up.

When I would use K8s would be if there is heavy processing involved, so want a language like Java or C to do the heavy lifting, image processing, encoding or encrypting and would be long running.

Disagree with serverless only being for low traffic. I've used it for a fair amount of high traffic situations and it is great for scaling up quickly.

> unless I had a compelling reason to use containers

AWS Lambda and Google Cloud Functions support containers.

> and would be long running.

Unless it's running 24/7, you're better off with serverless batch processing systems or you'll need scale-to-zero on your Kubernetes cluster.

Yes, but…

While Lambda supports containers, you lose out on one of the largest benefits: having Amazon patch/update the runtime.

Which brings me to my opinion on the value proposition. There’s a concrete cost curve where ECS/EC2 is cheaper than Lambda in what you pay AWS, but there’s a much higher inflection point once you count being responsible for maintenance/operations. Under that line there’s still a place for serverless.

I completely agree with you, and I didn't think that anything I said contradicted you.

Lambda works really well for reactive processing, such as supporting an API. You lose that benefit if you deal with something like Java, because the slow startup times gimp the processes.

Hence, running it in an environment which will be on constantly with consistent throughput that is just a pipeline 24/7. The scenario I've used it with was GIS and satellite image processing, which would have blown through the Lambda limits and with constant images being published meant that it was essentially processing 24/7.

Do you happen to have a list of languages with lower startup times? I was trying to figure out what would work well for serverless development

> * Very very low traffic apps. POST hooks for Slack bots, etc.. Works well!

I would add, that it offers an extension point / hook-in for a variety of AWS products. I can add custom functions to be called when a Cognito user is created, updated. I can add custom functions to be called to authorize routes transiting API Gateway.

At "The Firm" - we're going heavily in on AWS for our architecture - Fargate for k8s on top of the usual RDS, S3, etc.

We've begun roll-out of an external-facing API authenticated with OAuth2. Our backing store here is Cognito and our routing happens via API Gateway. We use the API Gateway's baked-in Cognito token authorizer for incoming requests, which route to pods hosted in Fargate. These pods represent a variety of projects.

As such we have two Lambda functions providing the /oauth/secret and /oauth/token endpoints for exchange of a One-Time-Password for a secret token and a secret token for access tokens. They see very low call rates (dozens of times per day?) and don't fit into any given API we host via Fargate.

We use Lambda to handle lots of scale-out tasks & extensions to AWS services too - it works really well for this.

For example:

- resizing images - we can do a size per lambda all in parallel. This means we can process images quickly (with minimal latency) without having to have loads of slack memory & CPU on our backends

- queue processing - we have an app that needs to copy files from user provided URLs to S3. We do this by dumping them in a SQS queue and having a lambda fire for each queue item. Means we can do lots in parallel without filling up an EC2/fargate instance’s network port

- dynamically processing images using Lambda@Edge & Cloudfront - similar to my first one, but on the fly when requested, instead of ahead of time
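
For the queue-processing case, a minimal sketch of an SQS-triggered handler might look like this. The `{"url": ...}` message body and the injected `fetch`/`store` callables are assumptions for illustration; in a real deployment `fetch` might wrap an HTTP download and `store` might wrap an S3 upload.

```python
import json


def copy_url_to_storage(url, fetch, store):
    """Download one URL and store it; fetch/store are injected so this is testable."""
    data = fetch(url)
    # Use the last path segment as the object key (hypothetical naming scheme).
    key = url.rsplit("/", 1)[-1] or "index"
    store(key, data)
    return key


def handler(event, context=None, fetch=None, store=None):
    """Entry point for an SQS-triggered function.

    SQS delivers a batch under event["Records"], with each message's payload
    as a string in record["body"]. The JSON body shape is an assumption.
    """
    stored = []
    for record in event["Records"]:
        message = json.loads(record["body"])
        stored.append(copy_url_to_storage(message["url"], fetch, store))
    return {"stored": stored}
```

Because each invocation only handles its own batch, the parallelism comes from the platform running many copies of the handler rather than from anything in the code.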

To add a couple more good use cases for Lambda:

- To run something at deploy-time with custom CloudFormation resources. As an example, a Lambda can be invoked during each deployment to run tests. If the test fails, the stack(s) can be automatically rolled back. Tricky to get right and not often the right tool in the bag, but sometimes useful.

- As handlers for CloudWatch events. Useful for forwarding events to third-party log services, or for taking some sort of complex action based on a given event.

AWS is nearly “all in” on Lambda at this point, so it’s a good bet that if you want to hang some kind of task off an AWS service, they’ll give you a hook to fire off a Lambda. From a cost and simplicity perspective this works out well.

* Reduce systems footprint

* Good for "minimally dynamic" apps

* Easier in compliance/regulation-heavy environments (the kind that impose burdens on OS configuration and maintenance)

* Like you said, great for low traffic apps (the kind of things we usually build, toss over the fence to another department, and then never look at again)

* Also like you said, resume driven development

Hard disagree, a couple of years back my team was running mobile game services connecting users in the millions (at peak).

We were hosted on AWS, and pretty much only used Lambda + API Gateway + DynamoDB.

In AWS that post hooks workflow also is usually "firehose" apis such as events from table writes, and they work well for streaming ETL type work.

Lambdas and workflows are like SQS and workflows: if you have a few of them, just use lambdas and workflows; if you have a lot of them, switch to a workflow engine.

You mention POST hooks for Slack bots - what about webhooks in general? Been thinking about moving to API Gateway -> Lambda -> SQS -> Lambda funnel for processing webhooks async & scaling easily. But this could be for higher volumes rather than low traffic.

They really are great for little low-to-medium-traffic web-automation scripts that are small in scope. But then you can do the same damn thing with Nginx and Lua or maybe PHP. Or anything, really. CGI exists and is wonderfully suited to exactly the same use case.

100%. Someone set up serverless at my last freelance company. Just due to the cold start, our network requests were averaging SIX seconds. This was even for requests that were just getting basic config stuff, i.e. no DB transactions required. Insanity.

The project I was working on at the time replaced an Excel sheet in the order of hundreds of MB. The users were used to performance being miserable, so no one cared if an API request took two seconds.

I am an old school kind of engineer, and if you tell me your web requests take 2s instead of 20ms due to an architectural decision you made that doesn't have any other strong upsides, I would agree that that is insanity.

This is from the perspective of a small startup CTO:

We've used AWS Lambda for about 4 years, and it's been so good and so cheap that I'm shifting literally everything (except Redis) to serverless. Also, GCP has a better serverless offering (Cloud Run, Spanner), so we're switching from AWS to GCP to take advantage of that. I bet we're going to see a massive cost reduction, but we'll see.

Things I like about serverless (again, from the perspective of a very small startup, with 5 engineers, and me being the primary architect):

* It's so liberating to not worry about EC2 servers and autoscale and container orchestration myself. All our CloudFormation templates add up to around 3,000 lines, which maybe doesn't sound like a lot, but it's a lot. There are tons of little configuration things to worry about, and it adds up. (Not to mention the sheer amount of time it took to learn.) ECS Fargate takes care of some of this, but it doesn't autoscale based on demand or anything (not without setting things up yourself). (This is a big reason why I want to switch to GCP: Cloud Run is like Fargate in that it runs containers, but unlike Fargate it autoscales from 0 based on load.)

* It's very cheap in practice, at least for loads like ours that respond to events: API services that sometimes see a lot of use and sometimes see very little use; queue consumers that sometimes have a lot to do and sometimes have very little to do. AWS Lambda bills down to the millisecond in terms of resolution, and GCP Cloud Run/Cloud Functions bills rounded up to the next 100 milliseconds. These are very fine resolutions and, for us at least, costs have been small.

* For database serverless products (like DynamoDB for example), it's very liberating to never have to think "Hm, do we have enough CPU provisioned?"
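
To make the billing-resolution point concrete, here is a small sketch that takes the granularities quoted above at face value. The 23 ms request duration is a made-up example.

```python
import math


def billed_ms(duration_ms, granularity_ms):
    """Round a request's duration up to the next billing increment."""
    return math.ceil(duration_ms / granularity_ms) * granularity_ms


# A hypothetical 23 ms request under the two granularities mentioned above.
per_ms = billed_ms(23, 1)        # billed per millisecond
per_100ms = billed_ms(23, 100)   # rounded up to the next 100 ms
```

For very short requests the coarser increment dominates the bill, so the difference matters most for handlers that finish in a few milliseconds.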

Things I don't like about serverless

* Pushing source code sucks. Lambda will just one day decide your version of Python or whatever isn't good enough and force your customers to upgrade all their user-written code to the latest Python version. (But! Cloud Run supports containers, and so this won't be a problem.)

Does your team do local dev?

Every team I've known that adopted Lambda + DynamoDB (or equivalents) gave up on running their app locally, adding a lot of friction to the development process.

I highly recommend using AWS's own Chalice library as it makes local dev _and_ deployment very easy.

If you need more complex cases like deploying docker containers to a Lambda function, take a look at AWS's SAM library. It also supports local dev _and_ makes deployment easy (it's essentially a wrapper around CloudFormation, so it's very powerful).

Within Lambda, yes, sucks. This is why container-based serverless is so much more exciting. (Which Cloud Run offers.)

I'm in the early stages of this rearchitecture but so far I've had no difficulty with local development.

This is one of my concerns as well - apparently azure functions allows you to debug your function from within vscode. I see lots of issues with this in that 1. you are limited to vscode as your editor and 2. you can only interact with Azure resources in a "development environment" within Azure itself, i.e. no local copy of the database etc

I agree with this line of thought. (Also from the perspective of a small startup CTO with ~10 developers mixed across golang, python, and react.)

We used GCP at our previous startup (sold) and ran our own K8S when it was very new (2015). There were lots of pains in those days. So when we started our current startup in 2018, we started with App Engine (flexible, which supports containers). This was fine, but had lots of drawbacks. After a year or two we ended up back on K8S, using GCP's GKE (managed K8S). Our team is pretty good with K8S, so it was fine. But regardless, the little stuff adds up.

Fast forward to about 6 months ago. We had used GCP's Cloud Run off and on for little stuff, and it kept getting better. One day someone asked the question why we shouldn't just use it for everything. Everyone was a bit defensive, but we kind of stared at each other and couldn't think of great reasons (for our use case), so we tried it.

Our setup consists of a primary API service (Golang), and a dozen or so smaller microservices, mainly in Python. We even moved most of our React apps to cloud run.

6 months in, and I can't really say anything bad. We turn off scale to 0 for the services where it matters. It scales up quickly to loads, zero down time over 6 months, no troubleshooting (so. much. time. saved.), super easy to deploy, swap traffic between versions, etc.

I'm not saying it's a silver bullet, nor that it's perfect for everyone... but I couldn't say enough good things about _container-based_ serverless like Cloud Run.

That said, breaking big systems down to the function level (Lambda, GCP Cloud Functions, etc) sounds like a nightmare to me. I'm sure there are ways, but that's a different ballgame. We do use FaaS for some tasks.


Edit: Oh, and our hosting bill went from ~$5k a month to $500 a month (in part to other things, but primarily the lack of need for big node pools.)

>GCP has a better serverless offering

I am starting to evaluate AWS vs. GCP for serverless. What, in your opinion, makes GCP better? Is the comment in the context of containers or functions?

I have limited time before my next meeting so I'll type real quick:

GCP Cloud Run is like the best of both worlds between AWS ECS Fargate and AWS Lambda. (Yes, the comment is in the context of containers. Sort of.)

* Like Fargate, Cloud Run hosts containers and takes care of figuring out where they actually live. Unlike Fargate, you don't have to say exactly how many containers you want running at once; GCP will automatically scale the # of containers up and down based on HTTP load and will scale down to 0. This should make Cloud Run cheaper than Fargate. (If you want to hook up Fargate to a webserver and you don't have autoscale figured out, you'll have to keep a lot of workers alive doing nothing.)

* Like Lambda, Cloud Run bills by the amount of time spent processing at least one request. But unlike Lambda, Cloud Run lets one container handle more than one request at a time (it sucks to have to spin up a lot of Lambda invocations that do a bunch of IO). Web servers that are good at concurrency shine here. This should save money.

* Cloud Run has more generous limits in many respects than Lambda. Cloud Run lets you set up SIGTERM hooks, so you can do some cleanup logic in your container (to e.g. write performance data to a timeseries table or whatever).
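
A back-of-the-envelope sketch of the concurrency point: if one instance can serve several requests at once, far fewer instances are needed for the same number of in-flight requests. The load of 200 requests and the concurrency setting of 80 are assumed numbers, not figures from either provider.

```python
import math


def instances_needed(in_flight_requests, concurrency_per_instance):
    """Instances required to serve a number of simultaneous requests."""
    return math.ceil(in_flight_requests / concurrency_per_instance)


# 200 simultaneous IO-bound requests (hypothetical load).
one_per_instance = instances_needed(200, 1)    # one request per instance
shared_instances = instances_needed(200, 80)   # 80 is an assumed concurrency setting
```

This only helps when requests spend most of their time waiting on IO; CPU-bound work gets little benefit from sharing an instance.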

That's Cloud Run. On the database side: GCP Firestore is very interesting and we're going to build a big feature around it. AWS has nothing like it. On the queue side of things: We're planning to build around GCP Cloud Tasks; We've more or less built Cloud Tasks ourselves using a mix of MySQL and AWS SQS (and it was hard and we haven't done a good job).

I'd love to start a Discord or something to discuss these thoughts more. It's so hard to get good practical information for system architects/CTO types who just need to hammer stuff out.

A couple of points: 1. Cloud Run is more analogous to AWS App Runner than Fargate. 2. Cloud Run isn't a great analog to lambda. Lambda is built to host functions. Cloud Run is built to host applications. Lambda is more analogous to GCP Functions. 3. Cloud Tasks should probably be built with EventBridge + Lambda or EventBridge + StepFunctions or EventBridge + ECS.

I don't profess to be a GCP expert so it's hard for me to make a judgement call on what's better. I can, however, say that most of this post ignores some of the real serverless power provided by AWS. AWS AppSync, AWS API Gateway, DynamoDB, CloudFront Functions, Lambda@Edge. It also makes comparisons that are not very fair.

Huh. I had this long call with our AWS Account Reps (+ Support Engineers) the other day and no one mentioned App Runner! This is the first I've heard of it. Looking at it now.

Ah I see, launched originally in May 2021. That's probably why they weren't aware of it. Yes, this looks cool. Very much what I was looking for.

The differences that I can see are...

* AWS App Runner lacks an advertised free tier. Not a big deal for all but the smallest projects though.

* AWS App Runner bills rounded up to the next second, whereas GCP Cloud Run rounds up to the next 100 millisecond.

* AWS App Runner doesn't charge per request (?!), whereas GCP Cloud Run charges $0.40/1M requests.

* AWS App Runner has fewer CPU/RAM configuration options. The lack of low end options may be a blocker for us.

* It's cheaper than GCP Cloud Run - $51.83 @ vCPU/1GiB, but 2GiB minimum, in Runner vs $69.642 @ 1 vCPU/1GiB (v1) $97.50 @ 1 vCPU/1GiB (v2) in Cloud Run.

* I'm confused by the networking model. In App Runner, you have to make an ENI for your App Runner service to access your VPC? Weird. There's some extra cost there I think.

Things that I can't determine based on the documentation...

* Does App Runner support committed use discounts?

* Does App Runner throw a SIGTERM before shutting the container down? I hope yes but I can't find docs on it.

* Is there a file system accessible on App Runner and is it just in memory or is there actually a disk to write to?

* The quotas & limits page on App Runner feels incomplete and I'm left with a lot of questions about it.

* Is there an SLA?

* In fact the documentation for App Runner just feels a little incomplete.

It looks like AWS definitely wants App Runner to be the answer to Cloud Run, but to me, it feels like it's not quite there yet.

It's also weird, that ECS Fargate lets you run a container without thinking about the server that it runs on, and App Runner does too, just with a few extra things. Why is it a whole separate service? Why didn't they just add it onto Fargate?

Re: Other services. I've only heard of API Gateway, DynamoDB and Lambda@Edge; I'll have to spend time investigating the other ones. Thank you for mentioning them!

I know this is an old thread now, but I just came back to it and thought to dig in a bit. First thing my Googling hit was this, which provides a good comparison of App Runner and Fargate


That lack of WAF support stood out. So Googling that:


"Hello, we are looking at supporting WAF in App Runner and will have more updates on this thread going forward. "

this falls in line with how i feel about cloud run, it really feels like a much better abstraction than ecs/gke or functions. it also is more similar to how most devs currently work, and local dev is the same. for non-api based traffic, like queues, it has some really weird quirks, like the autoscaler is problematic. but our experience with gcp in general has ranged from mediocre to bad, where the tech seems cool but things dont quite fit together

This has been my experience so far too. Serverless is amazing, it does require some shift in thought when it comes to backend architecture but once you get there it really does provide everything it promises. Less cost, less complexity (if done right), and the scalability is amazing. I agree it's not perfect but I highly recommend any backend or fullstack dev take a look at if you haven't already.

Been working on a lambda, dynamodb, typescript react app for about 6 months for work and it's been just mind-melting how much money and complexity we've saved switching. I'm talking like 5x the cost drop and we can more easily onboard devs to the project because it's just simpler, no need for a devops hire honestly.

Can you expand a bit on the complexity part? How exactly does serverless reduce it?

The only thing that comes to mind is better isolation between parts of the app, but this could be achieved with any architecture if done right.

We don’t have to manage a server, that’s really the drop in complexity. We just write a function and it runs on a machine somewhere in the cloud. As we scale, AWS just handles it automatically. If the demand decreases, no problem. We just pay for what we use. Of course there are ways to handle scale with servers, but with serverless you barely have to think about it.

> I'm shifting literally everything (except Redis) to serverless

A new company called Momento just launched that offers a serverless cache. Might be of interest to you.


AWS Lambda supports shipping your Lambdas as containers now. Very nice experience.

Was going to comment this same thing. We've been finding that the cold-start times are somewhat worse, but not disastrously so.

I just started working with Lambda, so my experience may just be teething pains, but I find the developer experience a bit jarring. My lambda is in Python, but every time I want to test my code I have to "build" the lambda with the `sam build` tool, and only then can I exercise it. It takes a non-trivial amount of time to build.

The deployment story looks good though, with the `sam deploy`, but for now I can't get over the developer experience.

My recommendation is to invest in testing now while it’s easy. You can mock the incoming events and use something like moto for service mocks. That catches dumb mistakes sooner.

Also, don’t put any logic in the handler itself unless it’s extremely trivial.
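
A minimal sketch of that advice, using only the standard library: the logic lives in a plain function, the handler stays thin, and the test feeds it a hand-built event. The order-total logic and the `items` payload are invented for illustration; the `body`/`statusCode` fields follow the shape of an API Gateway proxy request and response.

```python
import json


def total_order_cents(items):
    """Pure business logic: no AWS types, trivially unit-testable."""
    return sum(item["price_cents"] * item["quantity"] for item in items)


def handler(event, context=None):
    """Thin handler: parse the event, call the logic, shape the response."""
    body = json.loads(event["body"])
    return {
        "statusCode": 200,
        "body": json.dumps({"total_cents": total_order_cents(body["items"])}),
    }


def test_handler_with_fake_event():
    """A hand-built event dict stands in for a real API Gateway request."""
    event = {"body": json.dumps({"items": [{"price_cents": 250, "quantity": 2}]})}
    response = handler(event)
    assert response["statusCode"] == 200
    assert json.loads(response["body"]) == {"total_cents": 500}
```

Tests like this run without any build or deploy step, and service mocks can be layered on later for the parts that actually talk to AWS.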

Sound advice, but that ship has sailed. This is an existing lambda that I'm refactoring, so I need to be able to "go through the front door", as it were. Once I become more familiar with the code, I'll be able to test individual pieces as you're recommending.

I've done serverless where it made sense: Data feeds for the Nasdaq where end of stock market day means lighting up tons of servers reactively to incoming data from data brokers.

Everywhere else, I went with traditional deployment of a monorepo.

You only have 100 concurrent users. You do not need serverless. You could serve 100x that amount easily with a simple nginx reverse proxy to your webserver.

This is almost comical enough for me to suspect that it is satire, but unfortunately there are too many examples out there of this type of thinking. It's just infra bloat and a waste of money.

> all our web apps that are currently deployed in Azure App Services/ VMs

What problem does migrating to a new architecture solve? Does the current deployment have scaling, maintenance, or other troubles? Going from something that's broken to something that works is one thing, but going from something that works to something else that works is pointless unless there are tangible benefits.

If the only reason is to make the Cloud Architect feel better or pad people's resumes, the correct answer is no.

Yes, they might try a pilot project to see if cost savings are substantial enough to continue.

Can't they just calculate/estimate it?

This is the best question to present to them and the business.

I am not a fan of serverless computing. I am more familiar with AWS after being in a company that went all in on AWS so I will speak to those terms.

I think Serverless is good in some areas, S3 and Dynamo are both good products for example.

I have a few big issues with serverless:

1. It is harder to develop for. Sure, you get to ignore server configuration, but honestly a well-run infra team should be removing that concern for the development team anyway. The problem is that when you are running it locally you so rarely can actually run the code. So setting things up is annoying, especially when you get into the final stage of serverless, which is: some object lands in SQS, which fires a lambda, which puts another object in another queue, which fires another lambda, which loads S3, which writes to a db, etc. This all ends up more complicated than just writing an application for it, and it's harder to develop for. Often the only way to actually run this stuff ends up being setting up the whole infra in the cloud and running it through that way, so that means dealing with deploys, and you lose a lot of debugging ability.

2. It doesn't save money. A single lambda that runs quickly on some event does save money vs a server all of the time. But most companies seem to over-provision servers so that's easier. But once you include the prod environment, dev environment, and the serverless things running all of the time, it does not save money since often 100 lambdas could be a single instance.

3. It doesn't save time. I don't think developers messing around with setting up hundreds of new services and the corresponding rules, configurations, deployments, and CI/CD pipelines saves dev time vs. a normal, well-maintained infra with servers and a good CI/CD pipeline. Often the time savings come from moving from something like manually configured bare-metal servers to serverless, but there are better ways to save time.

I wholeheartedly agree about developing/debugging being harder. People advocating for Serverless secretly ignore this and focus on scalability and pricing. I have worked at two places using Serverless, and in those places setting up a local environment was not possible. Instead I had to deploy to a test environment to try things out. It is such a slow workflow.

Odd, I've never had an issue with SAM. It lets me run an API gateway and Lambda with Dynamo locally, so by itself it covers a lot of the use cases, with LocalStack to fill in any testing gaps.

The bad developing/debugging dev-loop is getting fixed. My team is working on a platform that fixes it, as are others in the space.

You should never need to leave your IDE to complete a change>execute>change inner loop. https://modal.com/docs/guide/ex/hello_world

We use AWS Lambda with serverless.com. I can deploy a single lambda change to our dev environment in about 5 seconds. I can also edit directly through the AWS console if I want to experiment. I've never wanted for a local environment.

5 seconds is way too slow

Economically, it's usually more expensive to use serverless functions for a constant level of load. At least on AWS but I'd be surprised if the math turns out differently on Azure. So your intuition (low traffic or bursts) for serverless is quite correct, IMO. I usually try to move any stuff that doesn't fit with typical web server loads to serverless functions, e.g. cpu-intensive one-time tasks like image resizing or generating argon2id hashes, etc. But even for those loads it might be more economical to put them on separate instances/scaling groups if they can saturate those and the load is predictable enough.
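A rough way to sanity-check this kind of claim is to compare a pay-per-use bill with a flat instance bill at different traffic levels. The sketch below is illustrative only: the function names are invented for this example and every price is a made-up placeholder, not an actual AWS or Azure rate, so substitute figures from your provider's pricing page and your own measurements.

```python
# Illustrative only: all prices below are made-up placeholders, not real cloud rates.

def serverless_monthly_cost(requests_per_month, avg_duration_s, memory_gb,
                            price_per_gb_second, price_per_million_requests):
    """Pay-per-use model: cost grows linearly with traffic."""
    compute = requests_per_month * avg_duration_s * memory_gb * price_per_gb_second
    invocations = requests_per_month / 1_000_000 * price_per_million_requests
    return compute + invocations


def break_even_requests(instance_monthly_cost, avg_duration_s, memory_gb,
                        price_per_gb_second, price_per_million_requests):
    """Monthly request count above which the flat-priced instance is cheaper."""
    cost_per_request = (avg_duration_s * memory_gb * price_per_gb_second
                        + price_per_million_requests / 1_000_000)
    return instance_monthly_cost / cost_per_request


if __name__ == "__main__":
    # Hypothetical inputs; replace with your own numbers.
    params = dict(avg_duration_s=0.2, memory_gb=0.5,
                  price_per_gb_second=0.00002, price_per_million_requests=0.25)
    threshold = break_even_requests(instance_monthly_cost=50.0, **params)
    print(f"Break-even at roughly {threshold:,.0f} requests/month")
    for monthly in (1_000_000, 10_000_000, 100_000_000):
        cost = serverless_monthly_cost(monthly, **params)
        print(f"{monthly:>12,} requests -> {cost:8.2f} per month")
```

The model deliberately ignores free tiers, data transfer, and the cost of idle instance capacity. It is only meant to show where a break-even point comes from, not to predict a real bill.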

Fire the astronaut architect immediately. Seriously. I couldn't fathom why the hell you'd migrate an existing application to this other than because the guy is a one trick pony who drank all the sales kool aid from cloud vendor of the hour. We had one, he's gone. We burned millions of dollars on things with no discernible ROI because someone saw some shiny poo at a conference. It becomes a religious crusade, not a sound technical decision.

Key concerns:

1. You're locking your platform into one supplier permanently. The exit fee is starting again. Literally burn it to the ground.

2. You're going to introduce problems when you migrate it. The ROI is negative if you spend money and achieve more bugs without improving functionality.

3. The cost estimation of every pure serverless platform is entirely non-deterministic. You can't estimate it at all even with calculators galore.

4. The developer story for serverless applications is quite frankly a shit show. Friction is high, administrative cost is high, and the tooling is usually orders of magnitude slower and more frustrating than local dev tools.

5. It's going to take time to migrate it which is time you're not working on business problems and delivering ROI to your customers.

As always ask yourself: is the customer benefiting from this now or in the future? If the answer is no or you don't know, don't do it!!!! Really sit down, find a sound business decision analysis framework and put all the variables in and watch it melt instantly.

All you're going to do here is put a "successful" (pah!) project under the architect's belt before he pisses off and trashes someone else's product.

As a somewhat extreme opposite of this, I would at this point never allow my cloud estate to progress past portable IaaS products and possibly Kubernetes control plane management. Anything else is a business risk.

Depends. We've had a few apps that we migrated from on-premise to AWS and made it "full-serverless." Here are some of the things we've learned.

* Scalability is real. We have some bursty traffic, sometimes with extreme burst, and we've had no problems scaling to meet that need.

* Our traffic is still predominately during business hours in the U.S. That's an extremely important point - because our site is effectively being used for only 12 hours or so per day. The remainder of the day and on weekends it's unused. We looked at the cost of using EC2 instances and Elastic Beanstalk and the full serverless is still cheaper.

What we've discovered in our cost analysis is if you have a site that's hit 24x7, 7 days per week then you'd be better off hosting on EC2. If your traffic is constant and there's not much variability over time then it may make more sense to host on-prem. In our case we have highly variable traffic during standard business hours. Serverless is the way to go for that scenario.

My previous company went "full serverless" on new development, and delivered amazing performance at very low cost. My general opinion has flipped after that experience. I believe there are use cases where serverless isn't an option, but not very many.

I think most people don't realize just how "burstable" their own traffic is. If you're looking at graphs with one-hour resolution, remember that AWS bills for lambda at 1ms increment. Not sure about Azure, though.

Interesting. How would you compare the developer experience between the two approaches? And what was your use case?

The serverless dev experience was delightful! I'd work on a single lambda, or maybe a "stack" of related lambdas, and each was focused and lean. A serverless approach is also probably-necessarily a microservices approach, which removes a lot of complication.

It was in EdTech, so we had students downloading assignments, uploading results (lambda-fronted S3 for blob storage, DynamoDB for data), administrators paging through results, grading things, students uploading images and videos of themselves, administrators reviewing them, many lambdas being triggered by changes to S3 or DynamoDB tables, or SNS messages sent from other lambdas.

I don't think most people would consider it especially bursty, except in the most general sense around midterms and finals, but in truth even a "heavy user" is only hitting APIs every so many seconds at most, and like I said, AWS lambda bills at the millisecond level.

Couldn't agree more on the experience of developing on serverless. And you're quite on point, about the need for a good architect who can solve most of these issues.

I know a lot of EdTech startups who are primarily serverless at massive scale.

Along with a DevOps team, an Observability tool like KloudMate will go a long way in managing serverless stacks.

I will say, if you don't have a great architect at the top level, and support from DevOps, you end up with miserable experiences like the negative stuff I see here.

Well, some of it seems to be assumptions, but I believe the negative experiences, I just think they lacked a great architect.

Lots of different opinions here, but:

My company has moved several "regular" websites to serverless. In fact, we just took the existing websites (which were often Django, sometimes huge ones) and dumped them into Lambda. The exact opposite of every "what serverless architectures are for" article you've ever read. And you know what?

It's awesome.

It's way cheaper than running it on EC2, and I never have to reboot a server or worry about their disks filling up or anything. Then when traffic spiked hard during the covid lockdowns? I did nothing. Lambda just handled it.

The only serious change we made to the setup was preloading a percentage of machines at all times to remove cold starts.

I'm not saying it's trivial (zappa, serverless, CDK), but usually one guy gets it working and the rest of the dev team changes nothing at all.

Did you just dump the entire app into a single Lambda? I've been wracking my brain trying to figure out how to turn Django apps into services, but it never occurred to me that it could just make sense to dump the whole app in there.

I've done similar, works great for Node, Python and PHP apps.

Yes, that is exactly what we did.

There are two types of serverless that most people reference. Serverless functions (e.g. AWS Lambda) and serverless managed platforms (e.g. AWS ECS Fargate)

Serverless functions are great if you have a lot of small services that need to be "on standby at all times".

For example, if you have 5,000 separate services, it doesn't make sense to have them all running all the time if 4,000 of them have very low traffic. So one of the main benefits is that you get the ability to "increase your library of services at a very low cost". Serverless also really shines with quick stateless actions.

However, converting an app to all serverless is a huge task and for most apps it doesn't make sense.

Two major drawbacks:

1) You're bound to only the language versions that are currently supported.

2) You're writing code specifically for the platform so without a heavy lift you're "locked in"

If the goal is to go serverless and get rid of the server management, I'd suggest looking into containerizing your existing apps and deploying on a "serverless" managed service like Fargate (or your favorite cloud provider's equivalent). This approach also lets you go to a different cloud provider if you want... or even move back to your own datacenter with no code changes.

Ride the wave, you'll have nice horror stories to tell later.

Just don't get attached to company and don't work late.

If the architect has taken the reins for this project, I think this will be OP’s best bet. Nearly all of the other comments on this post indicate that this plan will end badly (given what we have been told), so best be prepared to land when it all falls apart. Start the job search now.

For some reason, every SE in HollyWood has some fantasy that scalability is their number one problem to solve.... 100 concurrent users is a nice size problem to have.... likely could be handled by a pair of cheap Digital Ocean servers for a few bucks a month while you work on more important things.

Depending on the use pattern, with 100 concurrent users, the serverless functions might not go "cold" often. In that setup, it's not technically much different from running REST services on a traditional server.

Whether or not it's for you has a lot to do with what's important to you. How much weight you put on runtime cost, versus ease of development / deployment, versus whatever other benefits it might bring (integrated logging, monitoring, multi-region deployment, etc). And, of course, whatever downsides it brings...like cold starts, team ramp up on how it works, etc.

At first, after reading the title, I thought "Well, sometimes 'serverless' isn't actually serverless because they're using EKS, which I don't consider serverless" but then saw they want to do everything in Functions and...ew...

You are right that your use case doesn't make sense for going full serverless. An application with heavy and predictable usage doesn't gain anything by becoming serverless. All you're doing is raising your cloud hosting bill.

Probably makes them more money. My friend that handles a very large MS account gave me the insight. When the account bought Windows it spent $5 million a year with MS. When they went to the cloud they now spend $25 million. Good deal for M$. Of course that may have been offset by other cost reductions but still a huge revenue boost to MS.

If app usage is not steady 24 hours a day (e.g., the 100 concurrent users all live in the same hemisphere and don't use the apps much outside of business hours), you might see some cost savings from not running excess infrastructure. You'd have to run a cost projection to make sure; it's entirely possible the serverless option would still be more expensive (even before you factor in the $$ for the migration effort).

You can scale down app service/VM based infra either on a timer or in response to metrics, so it's not like serverless is your only option if cost is the motivator.

You are right to be skeptical, but I think for the wrong reasons. I'm not very familiar with other serverless options, but it sounds like Azure Functions will be fine performance-wise. After all, if you run into any issues you can just put your functions on a dedicated tier, which is basically an app service under the hood.

The question I'd have is what is the driving force behind the initiative to move to Functions? If the answer is "reduce infrastructure costs", I'd ask serious questions if the "juice is worth the squeeze" for a transition, and then create estimates for the cost of transition versus cost savings. For a 100 user app it is likely not going to have much payoff unless your infrastructure bill is a lot higher than I expected ($10-12k).

However, if the answer is "to create a better integration architecture between our apps and services" then you should engage with that. Azure Functions pushes developers really hard toward creating APIs that are discoverable and reusable, especially in a Microsoft oriented enterprise where you start seeing other tools like logic apps or the power platform start being able to produce and consume for custom functions. Over time, I've watched benefits accrue from common integration points functions drive across the organization.

So, ask questions, but make sure you understand what the organization is trying to achieve with the recommendation.

Disclaimer- I'm a Microsoft employee, but opinions my own.

My concerns about serverless aren't to do with them not scaling - quite the opposite! I'm worried that we'll sacrifice developer experience in the name of "scalability" of Functions without any real benefit.

Our current hosting costs for the projects I work on are about two orders of magnitude below your "worth it" cutoff :)

The integration stuff you mention is indeed very interesting, thank you for mentioning it. I can think of a couple of projects that would really benefit from Functions in this way. Our architect is mainly concerned with scalability here, however.

There are quite a few benefits you can gain from moving to serverless and some trade-offs. Rearchitecting your app is generally a pretty severe way to achieve a goal and means you've had an issue recently that you need to solve or a company wide initiative to mitigate a specific risk. The best way to play this is usually to look at it from an opportunity cost perspective.

Some of the possible reasons why you'd go down this path would include:

1. Cost optimisation: generally not a good driver unless you have very spiky workloads (which you don't).

2. Resilience/availability: this is a pretty good driver, especially if you've had issues recently. Moving to serverless takes away almost all maintenance tasks and solves a lot of potential problems.

Some of the main trade-offs include:

1. Developer velocity: it's generally pretty hard to debug locally, so you will need to spend time working out how to do it, or do your debugging in the cloud.

2. Cold starts: these can be largely solved with solutions such as GraalVM, but you do need to invest time to implement them.

3. More complex internal application architecture: you need to either deploy your entire application as a single function or break it out into multiple functions, and you'd need to analyse how it should work and performance-tune each option.

That being said, I find the best way to look at these situations from a political perspective is to have a quick chat with the architect and look to understand what problem he's trying to solve and for you to mention your costs and take it into a cost/benefit discussion

If he says it's for cost benefits, you could say it will be an x-week migration timeline, which has a developer salary cost of y, delays a new feature that is expected to bring in z revenue, and adds a 10-20% operational delay to pushing out new features. So what would be the total cost saving and ROI?
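That x/y/z argument can be turned into a single payback figure. The sketch below is one hedged way to do it: the function name and all inputs are hypothetical, and it leaves out the ongoing 10-20% slowdown in feature delivery, which would only make the payback period longer.

```python
def migration_payback_months(migration_weeks, weekly_dev_cost,
                             delayed_feature_revenue, monthly_hosting_saving):
    """Months of hosting savings needed to repay the one-off migration cost.

    Returns None when there is no monthly saving, i.e. the migration never pays back.
    """
    upfront_cost = migration_weeks * weekly_dev_cost + delayed_feature_revenue
    if monthly_hosting_saving <= 0:
        return None
    return upfront_cost / monthly_hosting_saving


if __name__ == "__main__":
    # Hypothetical numbers: x = 12 weeks, y = 4000 per week, z = 20000 revenue delayed.
    months = migration_payback_months(
        migration_weeks=12, weekly_dev_cost=4000,
        delayed_feature_revenue=20000, monthly_hosting_saving=100)
    print(f"Payback after {months:.0f} months")
```

With a small hosting bill, the monthly saving is small by definition, so the payback period quickly runs to many years.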

One big question is: do those API implementations keep any state between requests in RAM or is all the state in the front-end and/or database?

If there is no local state, then serverless is a feasible solution, if not the best. If there is, then you need to find some substitute for that local state, and the case for serverless is much worse.

I think there's a lot of nuance to this. You may not keep any explicit state, but you may keep a lot of implicit state (often in frameworks and libraries). This state can include: database connections, cache connections, HTTP connections to downstream services, local LRU caches for computations, compiled regexes, import caches, and much more depending on the language and infra being used.

It's common for servers to take a while to start up, it's common to see the first request to each endpoint take a bit longer, and it's a common optimisation to add keepalive to downstream services or tweak your database connection pooling. These issues are normally straightforward optimisations, and startup time isn't usually too much of an issue. With FaaS platforms these become significant hurdles that take engineering work to overcome, require introducing more services, more cost, etc.

Thank you for this - implicit state is something I never even thought of - that could be a serious problem

Some API routes do caching - right now this uses ASP.NET Core's in-memory cache. He recommended Redis to replace this. I feel that it's a bit overkill.

Now that I think about it, some parts of ASP.NET do a lot of stuff behind your back so the concerns that Dan Palmer brings up in a sibling comment may be worse than you think...

What are they trying to improve? Stability? Time from change request to change appearing in production? Developer velocity over time? Sometimes you'll hear things like "this will cost an extra $1000/month but make hiring and retention easier"

Operationally: You need a few more/different specifics to avoid talking in generalities. How many requests per second? What's the floor? What's the peak? How bad a sudden surge ("thundering herd") do you ever see? What's the heaviest request / worst case response time?

Then you can start comparing the two solutions under various scenarios. How much will our average RPS cost us? Will the service deal well under very low or very high load? What happens when your worst-case thundering herd hits? Does your heaviest request fit comfortably within limits?
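The floor, peak, and average asked about above can usually be pulled from existing access logs. A minimal sketch, assuming you can extract request times as epoch seconds (the function name and input format are assumptions for this example):

```python
from collections import Counter


def rps_profile(timestamps):
    """Summarise requests per second from a list of request times (epoch seconds)."""
    if not timestamps:
        return {"floor": 0, "peak": 0, "average": 0.0}
    per_second = Counter(int(t) for t in timestamps)
    first, last = min(per_second), max(per_second)
    span = last - first + 1  # number of whole seconds covered by the log
    # Seconds with no requests count as zero, so the floor reflects idle time.
    floor = 0 if len(per_second) < span else min(per_second.values())
    return {
        "floor": floor,
        "peak": max(per_second.values()),
        "average": len(timestamps) / span,
    }


if __name__ == "__main__":
    sample = [0.1, 0.5, 1.2, 3.9]  # hypothetical request times
    print(rps_profile(sample))
```

One-second buckets are a choice, not a requirement. For thundering-herd analysis you may want finer buckets, and for cost modelling coarser ones.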

I think more important than the serverless question (and btw I agree with you, I lean towards it not being a good idea) is whether they are wearing "Vendor tinted glasses".

I tried hard to think of advice on how you may remove the glasses if that is the situation here, but in honesty it is a tricky one. It is akin to a little worldview; bubble, and those are tricky to try and actively shift in others (an attempt at a suggestion: I think perhaps it would be best to come into team discussions around this not as being on "one side" but rather as being the reasoned, dispassionate expert on all sides; the whole question).

I say this as someone who up until recently wore the glasses (in my case it was for Kubernetes) - it took me a failed project to take them off, I hope that does not happen to you.

Ask your architect to model your serverless bill for the last 3 months and the next 3 months (estimate). Then you will have one juicy, presentable data point.

Make sure the serverless model includes everything you currently have, such as firewall, WAF, cache, SSL termination, load balancer, current traffic levels, etc.

Several of our portfolio companies have gone completely serverless (admittedly on AWS) and it has been working well for them. If endpoints are stateless and independent being able to scale them all separately and only paying for use (where usage is far less than 100%) it can be a huge win.

> An Azure Cloud Architect has joined our company

The title says it all, they might as well work for Azure.

... It's likely everything is an Azure shaped nail to them, do not trust them.

Makes sense to be skeptical here. Cold boots can make the system slow unless there's a fairly constant flow of requests, and if that's the case, why not keep using app services. I get that the architect wants to rid the system of VM's, but going from monoliths to nano services.. Can't tell if that's a good idea.

But if you compare Azure Functions with stored procedures in a DB, then it's pretty cool to have a kind of hot swapping at the function level.

I'd be cautious, but with a gradual migration there's hopefully time for reflection as well. Going 100% on anything is rarely a good idea, so hopefully your architect isn't religious about this.

One of my main concerns is how easily and quickly can issues be debugged? Can you attach a debugger in production if necessary? I've seen people experiment with serverless and it seems like they were back to print statements for debugging.

I don't have Azure experience but AWS Lambda is overcomplicated for REST APIs IMO. One reason is each function now has a deployment instead of just one deployment for the application. Frameworks can simplify it, but they don't eliminate all the moving parts. Lambda is great glue for filling in gaps in AWS itself.

Serverless can also mean something like EKS with Fargate. You get to use Kubernetes without managing any servers. Azure AKS has something similar with virtual nodes as I understand, though I haven't used them. I do think this model is better for long running services than serverless functions.

You should design it such that everything is in one monorepo and the entire system can always be run on one developer's local laptop without ANY serverless/cloud stuff...and in a single process if possible.

Then if you ever have a bug, you can easily set a breakpoint, step through the entire execution, and fix it.

Agree that this MUST always be possible whatever architecture changes occur.

Then have a flag/config to allow specifying certain things to communicate over network instead of as function calls, and using serverless functions.

I've NEVER seen teams do this though. It's like no one can imagine they will write a bug.
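A minimal sketch of that flag idea, under stated assumptions: `resize_image` is a made-up piece of business logic, `RESIZE_URL` is a hypothetical config setting, and the remote side is assumed to accept and return JSON. By default everything runs in one process. Setting the URL switches the same call to go over the network.

```python
import json
import os
import urllib.request


def resize_image(payload):
    """Hypothetical business logic that could later live in a cloud function."""
    return {"width": payload["width"] // 2, "height": payload["height"] // 2}


def call_in_process(payload):
    # Plain function call: one process, so breakpoints work end to end.
    return resize_image(payload)


def call_over_http(payload, url):
    # Same contract, but sent to a deployed function at a URL supplied by config.
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def make_resizer(env=os.environ):
    """Pick the transport from config; RESIZE_URL is a made-up setting name."""
    url = env.get("RESIZE_URL")
    if url:
        return lambda payload: call_over_http(payload, url)
    return call_in_process


if __name__ == "__main__":
    resizer = make_resizer({})  # no URL configured: stays in-process
    print(resizer({"width": 100, "height": 50}))
```

The callers only ever see `resizer(payload)`, so the local single-process path stays available regardless of how the production deployment is split up.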

Some of the comments below are talking about how to successfully argue against the move. Does your company have some history you can point out? We've had enough of these type guys and their failed space ship projects that any proposal for a language shift, or major framework adoption is viewed with skepticism first.

We've dealt with "new guy wants to overhaul ..." scenario. When I joined this company we were a C++ shop with some Perl and bash. Multiple new recruits successfully lobbied to implement refactors, or new projects in a hot language/framework. Several of the refactors were a huge waste of resources that either didn't come to fruition, or were only partially successful.

Now, we are a Perl shop with active development in 3 other languages(not counting front end), and we're maintaining legacy apps in an additional 4 languages. And we've deprecated apps in at least 3 additional languages.

I guess I should be thankful none of them have lobbied for switching databases. :-O On any given year, we average 3-4 programmers and 2-4 contractors(mostly front end) Two of us have been there 15+ years, but the other full timers seem to move on around the three year mark. Because of that all the hot shots have left. When a major bug is discovered in their code it can take a long time to fix, and any breakage due to upgrades is quite a hassle since those of us left aren't experts at every language we have to maintain.

> What am I missing?

1. The same stocks in $cloud_platform_provider that your architect has bought.

2. A bunch of certifications for $cloud_platform_provider so you also want to lock everyone and their mother down into that platform.

Your take on what serverless is good for sounds right.

What benefits does the "cloud architect" say the migration will bring? It sounds like you have a reasonable backend api setup that works. There needs to be a strong motivation to do a migration like that.

I'm also not convinced you're at a scale where you need a cloud architect, but it's hard to say from your description. I bet their main motivation is delivering a project that justifies their role.

Since there's a lot of good comments here, can someone suggest if what I'm suggesting to do in the company I work for is a good idea?

We run a lot of services in Kubernetes, some of those services also run background jobs (same container serving both HTTP and doing bg processing). I want us to migrate background jobs from our containers to a dedicated platform (e.g. Lambdas), because we can scale to 0 when not needed, we'll offload our Kubernetes cluster (our cluster will serve only HTTP traffic that is easy to scale for us) and if done right, we should have better debuggability/observability. Also right now we orchestrate our jobs with redis which means we need a redis instance for each service with bg jobs, but I want to move orchestration to a separate service that will store the data in postgres so instead of running x redis clusters we'll just have 1 postgres.

The tricky thing is the rewrite, but frankly, we still need to do it and we don't need to rewrite whole services, just the code responsible for bg jobs.

I think the main reason to undertake the serverless route should be OPEX savings. We expect the serverless infrastructure to cost less than the current Azure App Services/ VMs set up.

Here are questions to ask

What is the current monthly spend? What is the estimated monthly spend in the new system?

Perhaps the new serverless system is easier for operations and deployments. Does the new system provide for better uptime/monitoring? How is monitoring done on the current system? If there is a problem, like the service returning 500s, do you have the tooling to diagnose the issue? How does this change in the new system?

What is the developer experience on the new system? Is it easy to deploy to staging and production environments? How long does it take to create a new feature? What does the develop/test/debug loop look like in this system? How does this compare to the current system?

Ask yourself and others these types of questions. Maybe migrating to serverless is better, but it should depend on the answers to the questions/concerns that I listed above.

My initial impression is NOPE, and the fact this cloud architect is pushing you towards azure functions says a lot. They run on windows and take forever to cold start, but do work ok for background processes. Given you mention user facing endpoints, prepare yourself for 15+ second cold start times while they boot a windows VM for your function, and no private subnet for the functions either. The answer to this is a super expensive "Premium Plan", which is basically just a rebranded app service that runs the VMs for you full time at double the price. I have used them and am not an azure functions fan myself.

In the azure world, a more modern option going forward is azure container apps which just run docker containers, but you still will have 8+ second cold starts and will need to run at least a single instance full time, but it's cheaper than functions premium. Also would suggest looking at an evented architecture using dapr which is built into ACA. In the GCP world cloud run is frankly amazing.

Not paying for down time, dead simple scaling, better service composability.

If I was starting from scratch, I'd use serverless. If you're migrating everything, I think that is a giant project that needs justification. I'd ask, "What specific current problem do you have that it would solve?"

Cold boot is only a (minor) issue on the first hit, that's quickly amortized.

Goodbye to local development experience. A new architect hire suggesting big changes does not have anything else to do to justify their existence.

Some async use cases are great, but large-scale apps become a clusterfuck. Experience: we built a whole feature on AWS Lambda. Sucked. Two years later it's a Spring app in a container.

Serverless is good for any kind of app if you accept its pricing. You have perfect scalability, you have perfect availability.

The question is: what alternative do you propose? How does your alternative reduce hardware when load is low? How does your alternative order more hardware when load is high? How much time does it take? What's your plan if your data center is cut off from the Internet because of a bad router configuration?

A proper alternative to serverless is a Kubernetes cluster. It'll likely cost less (for a big application), but it'll require more knowledge to manage properly.

You can use a simplistic setup with a dedicated server or virtual machines and manual operations, but at your load I'd consider that inappropriate.

Anyway, if management decided to hire an Azure Cloud Architect, the decision has already been made, and I suggest you relax and enjoy the new experience.

Serverless has its place

Several folks have written about it (Architect Elevator[0] is a good blog on these types of topics, as he routinely talks about tradeoffs and ROI to the business). High Scalability's[1] "what the internet says" posts frequently highlight serverless projects (both pro and con)


[0] https://architectelevator.com/blog

[1] most recent - http://highscalability.com/blog/2022/7/11/stuff-the-internet...

> Are the apps I'm working on good use cases?

It's OK... it would work.

It could be somewhat more expensive or less expensive to host, and somewhat more or less performant. (You didn't say where the data lives, but if it's in a database and you aren't doing something extra with it, then this layer might not be that important, one way or the other.)

For a 100-user app, I'm guessing the major cost here is the switchover cost. Whether it makes sense or not depends on details of where you are now and what problem(s) this is meant to solve.

No one here knows that (maybe even you don't either?) so we can't really give you an answer, just some general pros and cons of serverless.

Depending on your business's "economics", serverless can be decent for MVPs/projects where you're not sure you'll become popular.

For my own project (uptime monitoring + status pages), I got to about 500 users before serverless costs were eating enough of my profits to make me want to move to VMs. It was nice to be able to validate the idea on a service that costs zero if no one is using it.

With continuously running applications (100 concurrent users), it makes zero sense to use serverless as you're paying a high premium over a continuously running VM. I'd just use a VM and scale the number of instances serving the API.

Having maintained a service that uses Lambda. I think Lambda is probably fine for one-shot tasks, or taking from a queue (which is what it was built for), but I would never use it for API endpoints.

The main issues: 1. Unpredictable performance - latency (with cold start), concurrency limits (how quickly can we scale to X concurrent requests), etc. We spent many hours with AWS support before moving away from Lambda. 2. Short-running processes are terrible in many ways - no DB connection pooling, no in-memory cache.

I'd be much more happy if AWS fixed scale-up speed of ECS tasks so you can scale up your services in a reasonable time, than having these one-shot tasks.

Our app is mostly hit during business hours. So we wrote a pinger to "wake up" the back end lambdas when a user visits the website (not log in, just hit the site anywhere) *and* it's been more than 5 minutes since the last ping. It's not perfect but it helps a lot with the cold start times.
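A minimal sketch of that throttle logic, assuming the 5-minute window described above. The class name, the `ping` callable and the injectable clock are all hypothetical; the real pinger would presumably issue an HTTP request to the backend.

```python
import time


class WarmupPinger:
    """Sends a warm-up ping at most once per interval (default 5 minutes)."""

    def __init__(self, ping, min_interval_s=300.0, clock=time.monotonic):
        self._ping = ping                  # callable that hits the backend
        self._min_interval_s = min_interval_s
        self._clock = clock                # injectable so the logic is testable
        self._last_ping = None             # None means we have never pinged

    def on_site_visit(self):
        """Call on any page hit; returns True if a ping was actually sent."""
        now = self._clock()
        recently_pinged = (
            self._last_ping is not None
            and now - self._last_ping <= self._min_interval_s
        )
        if recently_pinged:
            return False
        self._last_ping = now
        self._ping()
        return True


if __name__ == "__main__":
    pinger = WarmupPinger(ping=lambda: print("pinging backend"))
    pinger.on_site_visit()  # first visit: sends a ping
    pinger.on_site_visit()  # immediately after: skipped
```

This only keeps an already-provisioned instance warm during quiet periods; it does nothing for cold starts caused by scaling out under load.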

Our problem wasn't with zero traffic cold starts (nothing is calling the Lambda), but rather with scaling up cold starts (oh damn, traffic! We need more lambdas!).

What is the reasoning being provided? Will it save money due to your users using the application during business hours? Are there assumptions being made that are not applicable/valid in your case?

Personally, I was excited for serverless, but after using API Gateway and Lambda to serve a simple REST API it seemed like more work compared to using a load balancer to route requests to a container running in ECS. ECS can autoscale too, so you can scale up and down as required.

Serverless is a way to deploy code. It has its tradeoffs in cost and scaling and speed, just like anything else. Most of the advantages are in more rapid iteration and deployment, since you're forced to deploy small chunks of code at a time.

But if you have an API that is getting sustained traffic, Lambdas probably aren't your best bet -- you're going to want a container that is always running.

But to be honest, with 120 routes and 100 users, it sounds like Lambdas are a good way to go.

A lot of good has been said about serverless, and I was a big proponent of going "all in", but now, as CTO of a small startup, I am careful with it.

From my 4 years of experience with serverless in AWS, I have identified the following problems:

- Difficult to debug

- Difficult to collect logs - Lambda@Edge

- Slow cold starts

- Frontend and NodeJS bundling are problematic - size limits, slow and unpredictable problems

- Pricing is difficult to estimate

- Careful planning needed for network and architecture - how lambdas work together

- Workflow orchestration might be needed

What is serverless good for?

- Queue processing

- Event processing

- Internal infrastructure code

What is the daily request volume?

This is most likely a waste of time and the "cloud architect", like most cloud proponents, has no fucking clue what they're talking about.

When we migrated our API to Google Cloud Functions we had a major rise in costs. Reason: idle time is billed, meaning every connection to the outside world, like a GET request or a call to some other API, becomes costly. This is not a problem when you have a light API. We had 30-500 concurrent connections, and after the first week our bill was already USD 500 higher than usual.
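A small sketch of why that happens, assuming billing is per wall-clock second of execution as the comment describes. The durations and the price are hypothetical placeholders, not Google Cloud rates.

```python
# When billing is by wall-clock execution time, waiting on an outbound
# call costs the same as doing work. All numbers here are hypothetical.


def billed_seconds(cpu_work_s, outbound_wait_s):
    """Wall-clock seconds billed for one invocation."""
    return cpu_work_s + outbound_wait_s


def monthly_cost(invocations, cpu_work_s, outbound_wait_s, price_per_second):
    """Total bill for a month of identical invocations."""
    return (invocations
            * billed_seconds(cpu_work_s, outbound_wait_s)
            * price_per_second)


def idle_share(cpu_work_s, outbound_wait_s):
    """Fraction of the bill spent waiting rather than computing."""
    total = billed_seconds(cpu_work_s, outbound_wait_s)
    return outbound_wait_s / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical handler: 10 ms of real work, 490 ms waiting on another API.
    share = idle_share(cpu_work_s=0.01, outbound_wait_s=0.49)
    print(f"{share:.0%} of the billed time is spent waiting")
```

On a long-running server the same wait is nearly free, because one process can serve other requests while the call is in flight.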

You’ve lost this battle. Prepare for the next: figuring out what full serverless even means.

It should mean “keeping state to an absolute minimum, and relying on event-based architecture.”

Are you familiar with event-based architecture? Are you familiar with functional programming?

This is your time to shine.

There’s a strong possibility you’ll end up with Lambdas (or whatever) that are just CRUD endpoints.

That would be bad.

So be prepared to fight out what comes next.

I’ve had good success using Lambda and functions for analytics data pipelines. It allows a team of Python developers to focus on the business logic instead of the infrastructure. On my current project we’ve been struggling with K8s for months (not our decision) and I intend to push for a refactor to functions as soon as we get to MVP.

Feels similar in a way to owning/leasing a car vs getting a taxi/uber everywhere. Depending on how much / how often you need it, one can be better suited than the other, or both depending on certain scenarios.

Nothing. You should be. For any serious workload it will be wayyyy more expensive than an RI running a vm or ecs/eks running a container or two. It will also perform worse most likely or at least be less consistent.

I have a hard time believing serverless will be cost effective as a full replacement for a high traffic webapp. Higher ups do care about costs, so you should encourage some due diligence on pricing.

General benefits of serverless

- Easily scalable/autoscaling

- Drastically reduced operations/maintenance/devops overhead

- CI/CD can be much simpler

- Observability is built in (metrics, logging, alerting is built in)

- Built in connections to other cloud products

Does serverless have protections against denial of service? Couldn’t an attacker use up all the quotas or empty the bank account using something like Apache Bench on a Raspberry Pi or a t2.nano?

Maybe instead of asking the Internet's opinion (without background on your particulars), you should ask the architect why that's the recommendation.

I would almost threaten to leave. The architect seems like a moron chasing a dead hype, which is even worse than a still-alive hype.

It’s a disaster for any long running processes and most Serverless offerings cap the execution time to 15 minutes

Had the same thing happen except with AWS architects.

If you can stomach the vendor lock-in then it might not be so bad.

Serverless means total vendor lock in. Hope you are ready to marry Azure for life.

Hard to say exactly without knowing more but sounds like this architect is right.

How does serverless deal with versioning hell problems?

It makes versioning not seem so bad by introducing you to hotter hells.

I would get my resume out if I was you.

Don’t do it, seems like a terrible idea

Try version-controlling an app full of Lambda functions. Try moving cloud providers with provider-specific Lambda functions.

Don’t do it

I've used serverless for both big and small loads while working internally at AWS and also continue to do so now that I'm running a small startup.

The advantages are:

* Lower costs from much better resource utilization rates. Comparisons against a perfectly sized fleet of servers are inherently flawed. Sure, you can make sure auto-scaling happens, but that costs time and energy to get right. Even then, you're always going to have to leave some buffer room. Instead of saying serverless is good for bursty/low traffic, I'd frame it as serverless is great for any workload that isn't close to a fixed load. Dev and other non-prod environments also basically cost nothing instead of potentially being quite expensive to replicate multi-AZ setups. In practice, serverless is going to be cheaper for a lot more use cases than you may think at first.

* Tight integration with IaC. Your application and infra logic can be grouped based on logical units of purpose rather than being separated by technology limitations. This is especially true if you use things like CDKs.

* Zero need to tune until you get to massive scale. We went from our first user to hundreds of thousands of users with no adjustment needed at all. Even at millions of users, there's little you'd need to change from the infra side beyond maybe adding a cache layer and requesting limit increases. Obviously app/db optimizations might be needed, but for the most part, scaling problems become billing problems.

* A simpler threat model. If you're running servers, keeping them secure is not trivial. There's just a lot less to do to keep serverless apps secure.

* Ability to avoid Kubernetes and other complicated infra management. One could argue that you're just trading Kubernetes complexity for cloud specific complexity. That's true, but it's still a net reduction in complexity.

* Operational overhead is way down. A base level of logging/tracing/metrics comes out of the box (at least on AWS, not sure about Azure). No need to run custom agents for statsd/collectiond/prometheus/opentelemetry/whatever. No need to spend any time looking at available disk space metrics or slow-building memory leaks that creep up over weeks. It just works.

* Easy integration with lots of cloud managed services. Want to deploy an API endpoint? Want to build a resolver for an AppSync GraphQL field? Want to write code that runs in response to some event or alarm going off? Want to process messages from a queue without spinning up a fleet to longpoll from it? Want to write code that applies transforms on a data stream before writing to your data warehouse? The infra definitions for all of these all share a foundation. You have a unified API for everything.

is there a business need to do this? like seriously.

> An Azure Cloud Architect has joined our company. They are recommending that all our web apps that are currently deployed in Azure App Services/ VMs should be (gradually) split up into Azure Functions going forward.

Having your APIs in a bunch of different App Services is sort of a bad idea. You can do it, but you're likely going to have "fun" with how much complexity is involved with setting up the VNETs, Private Endpoints, Custom Domains, DNS stuff and different Subnets that can't be shared across App Service Plans for all those apps and their deployment slots. You're likely also going to be paying a significantly higher price for it than for the alternatives, especially if you use containers, but it's "significantly higher" in a way that's "unimportant" because it's likely peanuts compared to developer salary, total IT expenses and so on.

That being said, an Azure Function App is still an Azure App Service, so unless your Architect means that you should consolidate your different backend App Services into fewer Function Apps, then I don't see the benefit. If you're unsure what I mean by this, it's that you can replace the 60 API routes with 60 functions in an Azure Function App.

> I'm skeptical - I was under the impression that serverless was for small "burstable" apps with relatively low traffic, or background processing.

You're not correct about this. They scale just fine, and they can handle huge workloads, sometimes a lot better than their alternative, though at the cost of locking yourself into your cloud provider.

> The consensus on the internet seems to be "serverless has its use cases" but it's not clear to me what those use cases are.

I can't speak for AWS, but the basic way to view an Azure Function is to use a simple Express NodeJS API as an example. In a standard Azure App Service you're going to write the Express part, you're going to write the routes and you're going to write middleware for them. In a standard Azure Function App you take the Express part out, because that part is handled by the Azure Function.

Azure Functions have the benefit of integrating really well with the rest of Azure, and in many cases can be really good. It's also much easier to work with them because you don't have to care about the "Express" part and can simply work on the business logic. The downside is that you're limited to what Microsoft puts in the Azure Function functionality, and that you lock yourself into Azure.
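To illustrate the split between the "Express part" and the business logic, here is a framework-agnostic sketch. It is written in Python for brevity and is not the Azure Functions or Express API; every name in it (`get_user`, `health`, `ROUTES`, `dispatch`) is hypothetical.

```python
# Framework-agnostic illustration; this is NOT the Azure Functions API.
# All names here are hypothetical.


# Business logic: plain functions from request params to (status, body).
# In a Function App, these are roughly all you would write and deploy.
def get_user(params):
    users = {"1": "alice", "2": "bob"}
    name = users.get(params.get("id"))
    if name is None:
        return 404, {"error": "not found"}
    return 200, {"name": name}


def health(params):
    return 200, {"status": "ok"}


# The "Express part": in an App Service you own this routing layer too.
# In a Function App the platform does the equivalent of dispatch() for you.
ROUTES = {
    ("GET", "/users"): get_user,
    ("GET", "/health"): health,
}


def dispatch(method, path, params):
    """Look up the handler for a route and call it."""
    handler = ROUTES.get((method, path))
    if handler is None:
        return 404, {"error": "no such route"}
    return handler(params)


if __name__ == "__main__":
    print(dispatch("GET", "/users", {"id": "1"}))
    print(dispatch("GET", "/health", {}))
```

Seen this way, moving 60 routes to 60 functions mostly means deleting the routing table and middleware, then accepting whatever the platform offers in their place.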

With C# you further have to consider whether you want to run your Azure Function as an Azure dotnet or a dotnet-isolated function. Again, this comes down to the degree to which you want to lock yourself into Azure.

> So what should you do?

I think your Cloud Architect should look into Azure Container Apps, or AKS if you want less lock-in. Both are kubernetes, but Azure Container Apps sort of handle the heavy lifting for you, again, though with some of the highest lock-in that you'll find in any Azure product.

It depends a little on your actual circumstances, but generally speaking, your backend service will have an easier life in AKS once you're up and running. I wouldn't personally touch Azure Container Apps, but I'm in a sector of EU where we might be forced to leave Azure. If you're not, it's a much easier road to kubernetes greatness than AKS.
