
Guide to Serverless Architecture - md365
https://www.simform.com/serverless-architecture-guide/
======
cle
I've built and maintain quite a large serverless system, with a large team of
people, and most of these issues aren't really a big deal. Cold start is well-
known and also has well-known mitigations (e.g. don't use the JVM, which
wasn't designed for fast startup times).

I use AWS extensively, so I can elaborate on AWS's approach to these problems.
Deployment is straightforward with CloudFormation, VPCs/security
groups/KMS/etc. provide well-documented security features, CloudWatch provides
out-of-the-box logging and monitoring. Integration testing is definitely
important, but becomes a breeze if your whole stack is modeled in
CloudFormation (just deploy a testing stack...). CloudFormation also makes
regionalization much easier.

The most painful part has been scaling up the complexity of the system while
maintaining fault tolerance. At some point you start running into more
transient failures, and recovering from them gracefully can be difficult if
you haven't designed your system for them from the beginning. This means
everything should be idempotent and retryable--which is surprisingly hard to
get right. And there isn't an easy formula to apply here--it requires a clear
understanding of the business logic and what "graceful recovery" means for
your customers.

Lambda executions occasionally fail and need to retry, occasionally you'll get
duplicate SQS messages, eventual consistency can create hard-to-find race
conditions, edge cases in your code path can inadvertently create tight loops
which can spin out of control and cost you serious $$$, whales can create lots
of headaches that affect availability for other customers (hot partitions,
throttling by upstream dependencies, etc.). These are the _real_ time-
consuming problems with serverless architectures. Most of the "problems" in
this article are relatively easy to overcome, and non-coincidentally, are easy
to understand and sell solutions for.
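To make the idempotency point concrete, here is a minimal sketch of a duplicate-tolerant message handler. All names are illustrative; a production version would replace the in-memory set with a conditional write to a durable store (e.g. a DynamoDB put guarded by `attribute_not_exists`) so that dedup survives across Lambda invocations.

```javascript
// Seen idempotency keys. In real use this would be a durable store with
// conditional writes, not process memory.
const processedKeys = new Set();

function handleMessage(message, processFn) {
  // Derive a stable idempotency key from the message itself, so a
  // redelivered duplicate maps to the same key.
  const key = message.id;
  if (processedKeys.has(key)) {
    // Safe to ack without reprocessing.
    return { status: 'duplicate-skipped' };
  }
  // Run the actual side effect first; only record the key after success,
  // so a failure leaves the message retryable.
  const result = processFn(message);
  processedKeys.add(key);
  return { status: 'processed', result };
}
```

The key property is that retrying or redelivering the same message is harmless: the side effect runs at most once per key.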

------
Vinnl
Drawbacks I experience:

\- Vendor lock-in: although I try to minimise this and have a clear picture of
where the lock-in lies, it is very much present.

\- Cold/warm start: this is somewhat annoying. I haven't set up keepalive
requests yet, but they feel like an ugly hack, so I'm not sure if I'm going to
or if I will just suck it up.

\- Security/monitoring: having fewer functions makes this less of a worry,
but it's still difficult, and it's clear that there isn't much of an ecosystem
or established best practices.

\- Debugging: this can be really annoying. I can usually avoid it by having
a proper local testing setup (see below), but when that doesn't cut it, it
gets painful.

Drawbacks I haven't really experienced:

\- Statelessness: this is probably due to not splitting up my code into too
many functions, and thus not having that many complex interactions, but it
hasn't really been a problem for me.

\- Execution limits: haven't run into them yet.

Drawbacks I've managed to contain:

\- Local testing: I'm writing Node Lambdas. Since they're just stateless
functions, it was relatively easy for me to write a really simple Express
server, transform the Express requests into the API Gateway format, and
convert the API Gateway response format back into Express format. This works
fine for local testing, and reduces the effects of vendor lock-in. That said,
I do do a lot of the request parsing in the Lambda function itself, rather
than letting API Gateway do this.
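A minimal sketch of that translation layer. The event shape below follows API Gateway's proxy integration format; the function names and the Express-style `req`/`res` usage are illustrative:

```javascript
// Turn an Express-style request into an API Gateway proxy event.
function toApiGatewayEvent(req) {
  return {
    httpMethod: req.method,
    path: req.path,
    headers: req.headers || {},
    queryStringParameters: req.query || null,
    // API Gateway delivers the body as a string.
    body: req.body ? JSON.stringify(req.body) : null,
  };
}

// Apply an API Gateway proxy response to an Express-style response object.
function sendApiGatewayResponse(res, lambdaResult) {
  res.status(lambdaResult.statusCode);
  for (const [name, value] of Object.entries(lambdaResult.headers || {})) {
    res.set(name, value);
  }
  res.send(lambdaResult.body);
}
```

With these two functions, a local Express route can simply build the event, invoke the Lambda handler directly, and write the result back.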

\- Deployment: this is actually pretty sweet. I'm using Terraform which,
although it's a bit immature and thus cumbersome to set up, has been getting
the job done and allows me to easily deploy new copies of the infrastructure
for every branch of my code.

\- Remote testing: as mentioned above, deploying it for every branch I have
allows me to see what it's going to look like in production.

Hmm, perhaps I should turn this into an article sometime...

~~~
dschep
> \- Vendor lock-in: although I try to minimise this and have a clear picture
> of where the lock-in lies, it is very much present.

The post isn't loading so I don't know exactly what it's in reference to (AWS
lambda, the framework, the more generic idea?). But, the serverless
framework[0] supports all major cloud providers (AWS, Google, & Azure) as well
as self-hostable options like OpenWhisk and Kubeless.

[0]
[https://github.com/serverless/serverless](https://github.com/serverless/serverless)

~~~
debaserab2
I may be naive (or just witless), but don’t you kind of end up with vendor
lock-in to serverless itself?

The first step in their example is “login to your serverless account”.

~~~
dschep
You have to make a choice at some level of the stack, for me, that was
serverless as the deployment framework. You could probably use terraform or
ansible + some custom scripts, but then you could argue you're locked into
terraform or ansible ;)

The login thing isn't actually necessary (or if it is, that's new)

------
dpweb
CGI scripts with a new name. Lambda and the like are interesting, but any
system is a composable set of components.

You can say the same about object-oriented programming or functional
programming: separation of concerns, but on the network. Server functions
aren't a panacea because you still have to manage all the other pieces.

An elegant thing would be having your entire app in that single lambda, but
without state, there goes your db. Even so, people can't resist taking
something and adding more and more to it.

~~~
emj
More like CGI scripts with a load balancer that promises to run your
application on a host with enough memory within at most 600ms. That solves an
actual problem with CGI scripts, but I really would love to see more tiers
with lower latency brackets; sub-10-millisecond should be doable for certain
lambdas and price points.

~~~
wahnfrieden
The other key component of serverless is the pricing scheme: pay for what you
use; don’t pay for provisioned capacity. This has a huge impact on how you
design systems as you no longer have to consider throughput (besides account
limitations that you can raise without cost), only latency. You can even treat
lambda like an async queue which will never accumulate a backlog.

Interestingly lambda seems to make async IO technologies like nodejs less
compelling, because throughput capacity isn’t really relevant anymore.

~~~
rdsubhas
> The other key component of serverless is the pricing scheme: pay for what
> you use; don’t pay for provisioned capacity.

There is another way of looking at it: pay for a percentage of the traffic,
irrespective of how fast your application runs or what your budget is. If you
reach your monthly budget because of one trending Hacker News article, yeah,
then no more traffic for you. It can be insane.

If I have autoscaling with an upper bound, I can stay within a fixed budget
per month. If I reach my budget, yeah, no more new servers, but whatever's
running keeps running. My business doesn't disappear; I get business
continuity. Paying based on % of traffic sounds insane. How will I plan for
business continuity if my monthly budget is reached with this kind of model?
(Without any "minimum xx requests" kind of crap free tier; business continuity
means a fixed upper bound _AND_ continuing service, not _OR_.)

~~~
icebraining
I don't know if platforms already support it, but there's no reason why you
can't have a throttle (calls/minute).
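A calls-per-minute throttle is essentially a token bucket. Here's an illustrative sketch (API Gateway implements this server-side; the function name and the injectable clock are just for demonstration):

```javascript
// Create a throttle allowing roughly `callsPerMinute` calls per minute.
// `now` is injectable so the refill logic can be tested deterministically.
function makeThrottle(callsPerMinute, now = Date.now) {
  let tokens = callsPerMinute; // start with a full bucket
  let last = now();
  return function allow() {
    const t = now();
    // Refill proportionally to elapsed time, capped at the bucket size.
    tokens = Math.min(
      callsPerMinute,
      tokens + ((t - last) / 60000) * callsPerMinute
    );
    last = t;
    if (tokens >= 1) {
      tokens -= 1;
      return true; // let the call through
    }
    return false; // reject (e.g. respond with HTTP 429)
  };
}
```

Brief bursts drain the bucket; sustained traffic is capped at the configured rate.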

~~~
alpha_squared
I can't speak for all serverless platforms, but AWS API Gateway does support
throttling and burst throttling[0].

[0]
[http://docs.aws.amazon.com/apigateway/latest/developerguide/...](http://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-request-throttling.html)

------
maitrik
The phrase that Amazon used when launching lambda was 'deploy code not
servers'. To me this sums up what 'serverless' means. It means the developer
doesn't have to worry about servers in any way. With AWS Lambda/API Gateway
(and arguably with Google App Engine before it) you take away the toil of
having to:

  * Manage/deploy servers
  * Monitor/maintain/upgrade servers
  * Figure out tools to deploy your app to your server
  * Scale an app globally
  * Cope with outages in a data centre/availability zone
  * Worry about load-balancing & scaling infrastructure

~~~
twic
This is exactly what PaaS's like Heroku have been doing for years, right?
Isn't Lambda just an even more locked-in PaaS?

~~~
viraptor
Heroku is a bit higher level. If you're using Heroku, you're using their
building blocks and their methods of passing traffic, logging, etc. With
lambda, it's just a function. You have to connect everything yourself, but you
can do it exactly the way you want.

~~~
gitgud
You can redeploy a Heroku app to many other servers and platforms relatively
easily.

However, I'm not sure you can do the same with AWS Lambda functions. To me,
this means it is a form of lock-in.

~~~
viraptor
It's just code with basic function entry point. You may get your parameters
and configuration in a different way, but not much else changes. There are
only so many ways to call some code.
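That point is easy to see in code. A hedged sketch (the `greet`/`handler` names and the event shape are illustrative): the business logic is a plain function, and only a thin wrapper knows anything about the provider.

```javascript
// Provider-agnostic business logic: just a plain function.
function greet(name) {
  return `Hello, ${name}!`;
}

// Lambda-style entry point: the only part that knows about the event
// shape. Porting elsewhere means rewriting this wrapper, nothing else.
// (In a real module you'd export this as the Lambda handler.)
async function handler(event) {
  const name = (event.queryStringParameters || {}).name || 'world';
  return { statusCode: 200, body: greet(name) };
}
```

Swapping providers then comes down to swapping the wrapper, since `greet` has no idea it runs in a Lambda.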

------
buf
> Drawback of Serverless: Vendor lock-in, statelessness, local testing,
> cold/warm start perf, security, deployment, execution, monitoring, remote
> testing, debugging.

Excuse me? Let's use lambda for what it was meant to do, not replace your
entire stack.

~~~
gervase
But 'serverless' has a much nicer ring to it than 'serverfewer'...

~~~
maxxxxx
"Serverfewer" sounds like the title of the next Neal Stephenson book :-)

~~~
laurentl
I think you meant "Serverevres" :D

------
hopfog
I took on the excruciating task of building a real-time multiplayer game with
server-side authority using only Firebase and Google Cloud Functions.

Basically I handle all the logic in Firebase rules and lambdas. Any
destructive action goes through a GCF that updates the Firebase Realtime
Database. Actions that happen often (like moving) are updated directly by the
client but validated with some overly complicated Firebase rules.

It's very nice not having to worry about servers and scaling, but it can be a
really painful development experience.

Let me know if you want to try it out and I'll post the URL.

~~~
tabdon
That sounds really cool. I'd like to check it out.

Now that you've been through it, would you do it again? Or stick to a more
traditional architecture?

~~~
hopfog
You can play it at [https://tombs.io/](https://tombs.io/)

It's pretty much a collaborative and social experiment where I wanted to
explore the concept of using in-browser cryptocurrency mining as a
monetization and bot fighting method (something I had a problem with in the
original Ludum Dare entry it's based on). That's a different discussion
though. (You need to start the mining manually, and you can explore the map
without doing it; it's only required when digging and chatting.)

I'm actually planning on rewriting the whole backend in a more traditional
Node.js + WebSockets stack so no, I probably wouldn't do it again for this
type of application. However, I will probably use it again for other things.

~~~
test1235
I get this from your site:

Security risk detected: Trojan.Gen.NPE

~~~
hopfog
Thanks! I presume it's the Coinhive script that triggers it. What antivirus
software do you use?

~~~
test1235
Symantec Endpoint Protection...? (I'm at work)

------
neya
For me the BIGGEST gain of using serverless architecture is security. For
example, this year I started two ecommerce platforms. One's a digital delivery
e-shop - selling ebooks, training and stuff and the other is a physical
delivery e-shop. Normally, I'd write my own Phoenix/Rails app, but this time,
I decided to go serverless for my furniture shop and wrote it all in Jekyll.
Yes, the static site builder. I use netlify to manage the production aspect of
it (which IS pretty AWESOME) and simple excel sheets to track inventory (which
is what my vendor provides me, anyway). For payment, I simply use the
checkout/cart functionality provided by Paypal and all this just works! The
site is designed in such a way that you can't even tell anyway what's being
used for the backend. No one can tell it's just a bunch of static HTML pages
on display.

Whereas, for my digital delivery store, I regularly need to check my logs to
see if anyone's doing anything suspicious. For example, a lot of IPs randomly
try to visit wp-login.php or /phpmyadmin. Maintaining a production web
application is a full time job by itself, if you don't have a team.

Having said that, many people would immediately assume static page builders
are generally dumb. That isn't exactly true - you can automate a lot of stuff.
For example, my local machine has a custom Jekyll plugin for my store that
resizes and optimizes product images before pushing to prod to keep the page
load time small. If I had chosen the Rails/Phoenix route, I'd need to worry
about hosting imagemagick or the like somewhere, or write some code to
communicate with a third-party API, which usually isn't free.

End of the day, I make sales and that's all that matters. That's when it hit
me hard that my customers needn't care nor know what's behind their favorite
site.

~~~
icebraining
If static sites are now also Serverless, the word has lost all meaning.

 _If I had chosen the Rails/Phoenix route, I'd need to worry about hosting
imagemagick or the like somewhere._

Static sites have no option but to run that stuff ahead of time, but that
doesn't mean that dynamic sites can't do the same. Asset Pipelines with
precompilation are pretty common - both Rails and Phoenix have one.

~~~
cryptonector
"Serverless" is itself a bit of a misnomer. The point seems to be to
distribute stateless sub-computations. Whoop dee doo. At some point you need a
"server" to modify application state. A static site has no state
modifications, so it actually is serverless in this sense of having stateless
computation.

~~~
icebraining
I get it, but while the name is not very good, there's a reason why it came
about now, despite the fact that we've had static website hosting for decades.
It was coined to describe a particular kind of service, and retrofitting it
makes it less useful and more prone to confusion.

------
Dawny33
> grouping a bundle of functions together behind an API gateway, you’ve
> created a microservice

Always beats me why people don't warn the readers about the almost impossible
nature of logging and monitoring such systems :/

~~~
tehramz
Because most devs aren’t the ones that will need to worry about how it’s going
to be monitored once deployed. Not to mention troubleshooting a billion
microservices when something breaks.

~~~
mirko22
How do you figure that? Who else should worry about it?

------
sidhuko
What I've found from having to use serverless is that the local development
tools are broken. You'll find CloudFormation is the only way to reliably
manage all the proprietary, paid-for services, which will require you to read
the documentation for several tools (Kinesis, DynamoDB streams, FIFO or
standard queues) to work out which one might work - only to find out it
doesn't necessarily work in your applied function (CloudWatch
INSUFFICIENT_DATA alarms, I'm looking at you). So you then have a choice to
use CloudFormation and another deployment tool to push your services. This is
more typical after you've given up trying to manage dynamodb-local across
platforms for your codebase.

------
fuball63
One thing I would like to see in articles like this are more concrete use
cases for using serverless. The article begins by talking about the concept
as though you should write your entire app serverless, and then in its case
study, uses serverless as a background process to convert image types.

The other use case I always come across is image scaling.

I'd be interested if anyone would like to share their use cases as entire apps
or background processes.

~~~
laurentl
Well, I'll add my own use of serverless.

We use Lambda + API GW to manage the glue between our different data/service
providers. So for instance we expose a "services" API (API GW) that takes a
request, does some business logic (lambda code), calls the relevant
provider(s) and returns the aggregate response. That principle can (and
probably will) be extended to hosting our own back-end / business logic.

We're trying to get to the point where a dev only needs to write a Swagger
file, the lambda code and a bit of configuration, and the rest is taken care
of by AWS and our CI framework.
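The aggregation pattern described above can be sketched roughly like this (the provider functions are hypothetical stand-ins for real API calls):

```javascript
// One Lambda behind API Gateway fans out to several providers in
// parallel and merges the individual results into one response.
async function aggregate(request, providers) {
  // Call every relevant provider concurrently.
  const responses = await Promise.all(
    providers.map((call) => call(request))
  );
  // Merge the partial results into a single response body.
  return { statusCode: 200, body: Object.assign({}, ...responses) };
}
```

The business logic lives in how the request is routed and the responses are merged; the providers themselves stay interchangeable.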

~~~
fuball63
Very cool, thanks for sharing. The last part about just writing Swagger,
Lambda, and a little config is very interesting.

Do you worry about vendor lock-in, which is coming up a lot in this comment
section?

~~~
laurentl
We're a small start-up, so I worry about shipping as fluidly as possible, and
being confident that my infra just runs (security, patch management,
scalability, uptime, etc.) All of which serverless gives me much more easily,
and at a lower cost (for now) than running my own EC2 instances. I'll worry
about vendor lock-in (or dumping serverless for that matter) once I start to
get ridiculous bills from AWS, or hit performance issues, or whatnot. But I
try to mitigate by relying on standard formats (Swagger) and keeping my code
as close to the business logic as possible - which is fairly easy with Lambda.
The only thing that is really tied to AWS is the framework we use to build and
deploy the architecture: a large CloudFormation file, basically. We explored
Serverless (the app) to manage this, but it didn't fit our needs (in
particular, Serverless has apparently never heard of Swagger, so that sucks).

------
NKCSS
Page doesn't load... maybe they need a server...

------
songshu
I’m looking forward to Amazon Fargate, which is less extreme than Lambda but
still serverless in some sense. Basically you can still write proper apps
(Spring Boot apps in my case) but, so long as you containerize them, you don’t
have to provision servers, just tell Amazon what resources they need and how
many instances you want.

------
NetOpWibby
This article needs more exclamation points.

------
MPSimmons
Is this supposed to be comedic? It seemed comedic to me in several places. Is
that the joke?

------
polotics
Only Google and Amazon exist in this write-up? This is not a guide, more of
an intro.

~~~
imglorp
Who are some other providers, aside from self-hosted?

~~~
bonesss
Microsoft has Azure Functions, if you're in to that kind of thing.

------
sidcool
It's redirecting to their homepage. Am I missing something?

~~~
md365
You might want to use a VPN, sometimes organizations enable geo filters.

~~~
sidcool
Wow, you were right!

------
Hnrobert42
Be more concise.

~~~
cryptonector
TFA is unintelligible mumbo jumbo describing an architecture that amounts to
(I think!) minimizing statefulness and moving all stateless computation out to
a cloud.

Well, OK, but how about some examples? How about some advice as to where to
draw lines? E.g., if the computations take less time to do on a server than to
distribute, then maybe don't?

