
Is Heavy Use of Async/Await a Fad? - tabtab
The multi-user design of web servers already provides "natural" parallel processing to distribute the processing load. If that's the case, then why the common suggestion to add async/await to all the libraries and even domain code?

I'm not an expert on the "guts" of web servers (such as IIS and Apache), but when I ask in related expert forums, it stirs up controversy.

According to some, the "need" for such exists only for heavily used web sites, such that async/await is wasted on roughly 99% of sites.

Second, it's allegedly only a performance improvement because of the particular implementation of the common web servers and/or OSs. If the implementation of the web servers is reworked in the future, the benefits of async/await might just go away.

Some say "adding async/await still doesn't hurt", but it can clutter up code and make certain debugging harder. Thus, having it scattered about is not a free lunch.
======
nostrademons
That's not the point of async/await, or of any concurrency strategy in
general.

The point is backend fan-out and the associated latency that goes with it.
Most modern webservices - ones that do useful things at least - need to make a
number of network calls to backend services to retrieve results. If you
single-thread the webserver, these get executed in series, and each one of
them blocks the whole webserver until it completes. Even with thread-per-request
or process-per-request, you're still blocking the individual request until all
of its backend requests complete; you're just sitting there waiting on a
database fetch while you have other backends running on other computers that
could be doing useful work. That's how you end up with 2s requests where the
user's browser just hangs while loading.

Worse, each of those threads takes up a few megabytes of RAM that can't be
freed until the request completes, so if you have a lot of users requesting
pages at once, there's a big risk of the server paging to swap (which will
absolutely kill performance) or running out of RAM entirely and crashing.

The only way to avoid this is to have some sort of concurrency within the
request, where you can fetch your session cookie and user record from
Redis _at the same time_ as you check memcached for a cached version of the
page _at the same time_ as you kick off a backend request to a custom server
_at the same time_ as you read some records out of a database. Different
languages have different approaches for this: old-school Java used to use
threadpools and message queues, pre-ES6 Node.js would use callbacks, C++ would
use state machines, Django or Rails would just single-thread the backend
requests and live with the 10 requests/sec throughput. Async/await has gained
popularity lately because it's honestly a lot more intuitive than the
alternatives: you can at least use normal language mechanisms (variables,
arrays, and semicolons) to indicate "these things should happen in series, and
these things should happen in parallel" instead of breaking your server up
into little function-sized or class-sized chunks that are dictated by when
responses come back instead of what data needs to be processed.
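To make that concrete, here's a minimal sketch in Python's asyncio. The backend names and latencies are all made up, and asyncio.sleep() stands in for real network calls; the point is only that gather() overlaps the waits:

```python
import asyncio
import time

# Simulated backend: each "network call" just sleeps for its latency.
# The names (session, user, cache) are invented for illustration; real
# code would be hitting Redis, memcached, or a database here.
async def backend_call(name: str, latency: float) -> str:
    await asyncio.sleep(latency)
    return f"{name}-result"

async def handle_request_serial() -> list:
    # One await at a time: the latencies add up.
    session = await backend_call("session", 0.05)
    user = await backend_call("user", 0.05)
    cached = await backend_call("cache", 0.05)
    return [session, user, cached]

async def handle_request_fanout() -> list:
    # gather() starts all three calls at once; total time is roughly
    # the slowest single call, not the sum.
    return list(await asyncio.gather(
        backend_call("session", 0.05),
        backend_call("user", 0.05),
        backend_call("cache", 0.05),
    ))

start = time.perf_counter()
serial = asyncio.run(handle_request_serial())
serial_time = time.perf_counter() - start

start = time.perf_counter()
fanout = asyncio.run(handle_request_fanout())
fanout_time = time.perf_counter() - start

assert serial == fanout           # same results...
assert fanout_time < serial_time  # ...but the fan-out finishes sooner
```

Same answers either way; the fan-out version just stops paying for the waits one at a time.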

~~~
tabtab
Re: _If you single-thread the webserver, [network calls] get executed in
series_

Why is the webserver single-threaded then? And even if it were, why can't
network calls be fanned out within it? If they made multi-threaded webservers,
would that solve the problem?

Re: _Worse, each of those threads take up a few megabytes of RAM that can't
be freed until the request completes_

Do you mean user page requests, or RAM related to network calls? If the first,
that probably will happen no matter what.

I'm trying to sketch little timing diagrams to see where the bottleneck could
come from, and cannot see any "general" problem, only problems related to
specific architecture choices. But again, I'm not a systems-level software
expert. Is there a common concept behind the bottleneck, lots of architecture-
specific nuances adding up (OS+WebSrvr+Compiler), or something else?

~~~
nostrademons
Are you familiar with the difference between blocking & non-blocking I/O
calls? Also, have you done any distributed systems work, where the algorithm
that services a given request might require computation on different machines?
That's background knowledge that's important; it's hard to know where to start
without knowing your background.

~~~
tabtab
Yes, but when I ask web-server experts, they can't supply a clear explanation
for what happens with a typical web application, such as the famed "shopping
cart application". Or if they do, another expert chimes in and says Expert A's
explanation is technically wrong, and then Expert B argues over Expert A's
knowledge of how web servers/OSs work or could work, and they delve into
arguments over minutiae about cache size and chip architecture, and/or the
bad design of C#'s compiler, etc.

I open Pandora's Box but can't close it.

Thus, I cannot get to the bottom of whether heavy async/await usage is there
to get around quirks in _current_ systems, or to solve a general/universal
problem regardless of how web servers/OSs/compilers/chips are built.

~~~
nostrademons
Most experts on expert programming forums are idiots. That includes this one,
but I've at least worked on an application (Google Search) that I'm pretty
sure you've used.

Most _large_ websites (with hundreds of millions of users and dev teams in the
thousands) don't run all their code in the webserver. They farm out requests
to a bunch of backend services running on different boxes - databases, app
servers, caches, etc. The central problem with this is that _you don't know
when a request is going to come back_. You don't want to have your webserver
simply wait around and do nothing while each request finishes; it'll spend 90%
of its time doing nothing, which means you need 10x as much hardware to
service the same number of users. But if you have lots of requests to backends
in flight, you need some way to synchronize the responses and ensure that the
webserver executes the right piece of code only when all of the data needed
for it is present.

Different languages have different approaches to this. The way we used to do
it in C++ is to hand-write a big state machine: each time a response comes
back, dump it into memory, consult the current state, move to a new state that
might be something like COOKIE_FETCH_COMPLETED or USER_FETCH_COMPLETED or
RESULTS_LOADED, execute the code associated with that state (which may kick
off even more backend requests), and then wait for the next response to come
back, which then triggers the next edge in the state machine, and so on.
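A toy version of that hand-written state machine (in Python rather than C++; the states, response kinds, and their order are invented for illustration):

```python
# Responses arrive one at a time; the current state decides what happens
# next. In a real server each transition would also kick off further
# backend requests.
START, COOKIE_FETCH_COMPLETED, USER_FETCH_COMPLETED, RESULTS_LOADED = range(4)

class RequestStateMachine:
    def __init__(self):
        self.state = START
        self.data = {}  # responses get dumped into memory as they land

    def on_response(self, kind: str, payload: str) -> None:
        self.data[kind] = payload
        if self.state == START and kind == "cookie":
            self.state = COOKIE_FETCH_COMPLETED   # would kick off user fetch
        elif self.state == COOKIE_FETCH_COMPLETED and kind == "user":
            self.state = USER_FETCH_COMPLETED     # would kick off results fetch
        elif self.state == USER_FETCH_COMPLETED and kind == "results":
            self.state = RESULTS_LOADED           # all data present: render

sm = RequestStateMachine()
for kind, payload in [("cookie", "c1"), ("user", "u1"), ("results", "r1")]:
    sm.on_response(kind, payload)

assert sm.state == RESULTS_LOADED
assert sm.data == {"cookie": "c1", "user": "u1", "results": "r1"}
```

Even this toy has the characteristic problem: the code is chopped up by when responses arrive, not by what the application logic is doing.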

Async/await just gives language-level support for that. When you await a
future, the language interprets it as "execute the next statement of this
function only when the data in this future is present." And you can compose
futures together, so you can await "all of the data in this list of promises
is present". It's a way to apply familiar programming-language concepts like
variables and statements to code that's not all running on one box.

Do you _need_ async/await? Not if your entire system runs on one box. If every
time you make a blocking call, it's just going to a database that runs on the
same physical machine as the webserver, you're not really losing anything
because the same computer has to do the work anyway. I'd bet that a number of
"experts" have only worked on systems like this: if they're talking about OSes
and cache sizes and chip architecture, they're not even thinking in terms
where concurrency and non-blocking I/O are useful. Oftentimes it's very
reasonable to build an initial version like this, but the set of profitable
niches that can be served by things like a "shopping cart application" is
rapidly declining.

Unfortunately, the failure mode for a single-box server with blocking I/O is
"your server locks up", and that tends to happen right when it gets most
popular, and then you have to rewrite _everything_ if you want to switch it to
non-blocking async code. That's the other reason async/await is getting
popular: this is one of those areas where scaling is a cliff, not a slope, and
most people want to avoid cliffs in their business.

~~~
tabtab
Re: _You don't want to have your webserver simply wait around and do nothing
while each request [of another server] finishes; it'll spend 90% of its time
doing nothing, which means you need 10x as much hardware to service the same
number of users._

No, it's processing _other_ users' requests. We are talking about high-traffic
sites, right?

Let me see if I can present a scenario to explain:

    
    
        R-I-A-I-B-I-C-I-H  // typical processing steps
    
        R = request received (get or post)
        H = HTML/HTTP output (response) 
        A,B,C = requests for services from/to other servers
        I = Internal processing (regular app logic)
    

Even if there is a lot of waiting on A, B, and C, there's still plenty of "I"
for the web server(s) to work on, because each user's request will typically
be at a different stage.

If say B is slow, it's going to be the bottleneck regardless of whether
async/await is used or not. They'll all be waiting on B (at least those
requests requiring the same resources).

Now maybe you are implying one rewrite the application so that R-I-A-I-B-I-C-
I-H's steps don't have to be done in the order shown. But that's NOT what
async/await does (at least not how it's typically presented/promoted).

~~~
nostrademons
What you lose is the internal concurrency within the request (hoping this
formatting goes through):

    
    
                    W---
         C-H       / X  \
        /          |/ \  \
       R-I--I--I---I--I-I-I-I-H
          \/\ / \ /|\  /   /
           A B   U \ Y    /
                    Z-----
      R = request
      I = internal processing
      C = cache server
      H = HTML output
      X = early exit
      A = Advertising server
      B = Session cookie server
      U = User record
      W,X,Y,Z = application backends
    
      Compare vs sequential processing:
    
      R-C-I--A-H?-I-B--I-U---I--W-----I-X-I-Y--I--Z--------I-H
    

Note the user-perceived latency. For realistic sites, you may be looking at
400ms in parallel vs 5s or worse in series, which is a big difference in how
long your users wait and how likely they are to bounce.

This also has knock-on effects on RAM usage. Say your webserver is running at
100% CPU utilization. If a request takes 10x as long, you will have 10x as
many requests in flight at any given time. They won't be consuming CPU, but
they will be consuming RAM, because the server needs to keep around all the
context it'll need later. RAM is often the bottleneck for servers like this.
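That's just Little's law: requests in flight = arrival rate × per-request latency. With made-up but plausible numbers:

```python
# Little's law: requests in flight = arrival rate * time each request
# spends in the system. All numbers below are illustrative.
arrival_rate = 100    # requests per second
fast_latency = 0.4    # seconds per request with parallel fan-out
slow_latency = 4.0    # seconds per request done in series (10x slower)
mb_per_request = 2    # context held per in-flight request, in MB

fast_in_flight = arrival_rate * fast_latency   # 40 requests in flight
slow_in_flight = arrival_rate * slow_latency   # 400 requests in flight

assert slow_in_flight == 10 * fast_in_flight
# 400 requests * 2 MB each = 800 MB of request context held in RAM,
# versus 80 MB for the fan-out version, at the same CPU load.
assert slow_in_flight * mb_per_request == 800
```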

> Now maybe you are implying one rewrite the application so that R-I-A-I-B-I-
> C-I-H's steps don't have to be done in the order shown. But that's NOT what
> async/await does (at least not how it's typically presented/promoted).

It is if you're doing it right. That's what Promise.all()/Promise.race() does
in ES6, or asyncio.gather()/asyncio.wait() in Python 3, or futures::join! in
Rust. If you're not doing this, you're basically gaining nothing from
async/await.
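For instance, in Python's asyncio the "all" and "race" combinators look like this (A/B/C are stand-in calls, with asyncio.sleep() faking the network):

```python
import asyncio

async def call(name: str, latency: float) -> str:
    await asyncio.sleep(latency)  # stands in for a real network round-trip
    return name

async def main():
    # "all": A, B, C run concurrently; we resume when every one is done.
    # gather() preserves argument order regardless of completion order.
    all_results = list(await asyncio.gather(
        call("A", 0.1), call("B", 0.01), call("C", 0.05)))

    # "race": resume as soon as the first finishes (here B, the fastest).
    done, pending = await asyncio.wait(
        [asyncio.create_task(call("A", 0.1)),
         asyncio.create_task(call("B", 0.01))],
        return_when=asyncio.FIRST_COMPLETED)
    first = done.pop().result()
    for task in pending:
        task.cancel()  # clean up the loser
    return all_results, first

all_results, first = asyncio.run(main())
assert all_results == ["A", "B", "C"]
assert first == "B"
```

This is the rewrite in question: the independent steps of R-I-A-I-B-I-C-I-H no longer have to run in the order they happen to be written.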

~~~
tabtab
Re: _It is if you're doing it right._

Well, okay, but the typical suggestion is to use async/await even if your app
logic is not currently intended to be divided that way. Is it a "keep up good
habits" suggestion? Unless scaling up happens a lot in a shop, in a way that
can be parallelized on the app side, I'd say YAGNI dictates leaving it _out_
of app logic. (I deal mostly with database-heavy, work-oriented, CRUD-ish
apps, not social networking and ad-funded sites, by the way.)

Re: _If a request takes 10x as long, it will have 10x requests in flight at
any given time._

That assumes the app logic is parallelized per the above, doesn't it?
Otherwise, the processing time per request shouldn't be any different, because
async/await-ing database/network calls will not make them run faster by
itself.

By the way, I couldn't find much info on "session cookie servers". Is that
usually a roll-your-own kind of task, perhaps with the aid of an API?

Hey, I think your diagram came out as you intended. Kudos! Ascii U.

~~~
nostrademons
> Well, okay, but the typical suggestion is to use async/await even if your
> app logic is not currently intended to be divided that way. Is it a "keep up
> good habits" suggestion? Unless scaling up happens a lot in a shop in a way
> that can be app-side parallelitized, I'd say YAGNI dictates to leave it out
> of app logic. (I deal mostly with database-heavy work-oriented CRUD-ish
> apps, not social networking and ad-funded sites, by the way.)

For that domain you could take it or leave it. Async/await is most helpful
when you have lots of users hitting a server with lots of backends; if it's a
few users hitting 1 webserver with 1 DB, performance gains will be marginal,
and it's functionally equivalent to blocking calls but with some extra
syntactic overhead. The deciding factor will probably be what libraries you
need to support: many libraries are moving to async/await because they don't
want consumer/HPC applications as customers.

> By the way, I couldn't find much info on "session cookie servers". Is that
> usually a roll-your-own kind of task, perhaps with the aid of an API?

Google has a custom server for this because of security requirements around
validating signed cookies and timeouts, but most sites on the outside just
stick a JSON or protobuf record into memcached or Redis and fetch it. It is
usually a separate fetch before many of the other database calls, though,
because most (consumer) sites have some notion of a session and saved user
state that's independent of the user record, since you frequently get
anonymous users who don't have an account. So it might physically live on the
same Redis or memcached box, but it still requires separate network
round-trips.
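A sketch of that pattern, with a plain dict standing in for the Redis/memcached round-trip (the key names and record fields are invented):

```python
import json
from typing import Optional

# Simulated session store: in real life this lookup is a network call to
# Redis or memcached, which is why it's a separate fetch from the user
# record.
session_store = {
    "sess:abc123": json.dumps({"user_id": 42, "cart": ["sku-1"]}),
}

def load_session(cookie: Optional[str]) -> dict:
    # Anonymous users (no cookie, or an unknown one) still get a session,
    # independent of any user record.
    raw = session_store.get(f"sess:{cookie}") if cookie else None
    return json.loads(raw) if raw else {"user_id": None, "cart": []}

logged_in = load_session("abc123")
anonymous = load_session(None)

assert logged_in["user_id"] == 42
assert anonymous == {"user_id": None, "cart": []}
```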

~~~
tabtab
Re: _if it's a few users hitting 1 webserver with 1 DB, performance gains
will be marginal_

Let's say it's 4 web servers hitting 4 DBs, but otherwise most of the app
logic code will not be parallelized?

