
Async might be a fad - zhong-j-yu
http://cs.oswego.edu/pipermail/concurrency-interest/2014-August/012864.html
======
fleitz
Async computation is not the fad; its poor syntax is.

C#, F# and coffeescript have excellent syntax that remove the line noise
caused by writing async code. Actors and channels also nicely remove line
noise from async programming.

If anything, as computers become more powerful and distributed, you'll see
threads and locks disappear rather than async computation. Async computation
in an imperative style is what most people want; threads and locks are what we
have.

~~~
zhong-j-yu
That is where I disagree completely. I think the old fashioned
sync/blocking/threaded style is much easier than async.

Of course, C# has great async support; but it is still a complicated thing
that programmers must be very cautious about when applying it.

~~~
fleitz
I don't think you've seen truly great async support; it's virtually
indistinguishable from sync code.

Sync:

    
    
      let file = File.Open("foo.txt")
      let data = file.Read(8192)
      // Do some compute stuff with data

ASync:

    
    
      let! file = File.OpenAsync("foo.txt")
      let! data = file.ReadAsync(8192)  
      // Do some compute stuff with data

~~~
zhong-j-yu
while the syntax can be as simple as that, there is still a difference, and
the programmer still needs to be very careful. what if you accidentally forget
the `"!"`?

~~~
fleitz
Umm... it doesn't compile because it's the wrong type...

But yeah multithreading is easy.

------
majke
Some time ago I wrote my thoughts in a blog post:

[https://idea.popcount.org/2013-09-05-it-aint-about-the-callbacks/](https://idea.popcount.org/2013-09-05-it-aint-about-the-callbacks/)

Basically, it's easy to show that callbacks are a much harder paradigm to work
with when it comes to flow control.

That's it. There is a place for callbacks, but if you need anything that is
not trivial and won't blow up at some point, you should use threads: greenlets,
processes, or whatever you call them, things with a stack that take time to
context switch.

I strongly believe threads are better, if for no other reason than the fact
that when you do "spawn", you make an explicit statement, saying: here we
demultiplex; programmer, beware of flow control here!
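To make the flow-control point concrete, here's a sketch in Java (the reads
are simulated and all the names are invented): a running total over a sequence
is an ordinary loop in blocking style, but has to be rebuilt as recursion
through continuations in callback style.

```java
import java.util.List;
import java.util.function.Consumer;

public class FlowControl {
    // Blocking style: ordinary loop, ordinary control flow.
    static int sumBlocking(List<Integer> source) {
        int total = 0;
        for (int v : source) {   // imagine each iteration is a blocking read
            total += v;
        }
        return total;
    }

    // Callback style: the same loop must be rebuilt as recursion,
    // threading the running total through each continuation.
    static void sumWithCallbacks(List<Integer> source, int index, int total,
                                 Consumer<Integer> done) {
        if (index == source.size()) {
            done.accept(total);
            return;
        }
        // imagine this is an async read whose result arrives in a callback
        int v = source.get(index);
        sumWithCallbacks(source, index + 1, total + v, done);
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4);
        System.out.println(sumBlocking(data));                    // 10
        sumWithCallbacks(data, 0, 0, t -> System.out.println(t)); // 10
    }
}
```

Now try writing early exit, error handling, or a nested loop in the second
style and the difference stops being cosmetic.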

------
gamegoblin
I'm currently hacking on a toy webserver in Haskell, and from my point of
view, I spin off a thread for each request, but Haskell's runtime environment
manages all of these as a group of green threads doing async IO behind the
scenes. Best of both worlds as far as I'm concerned.

I just change

    
    
        serve request

to

    
    
        forkIO (serve request)
    

And magically I have multithreaded async IO that seems to be extremely
performant.

~~~
zhong-j-yu
Yes, if you have light-weight threads, there's no question that threaded
programming is better than async programming. But we are talking about
heavyweight Java threads, and whether their cost is so high that we need to
avoid threaded programming. I think the answer is no, generally.

Also, the discussion is in the context of imperative programming.

~~~
gamegoblin
Funnily, Java's original threads (back in 1997) were green threads, but they
got dropped in favor of native threads.

~~~
dragonwriter
IIRC, Java's original threads were N:1 green threads, while the lightweight
threads in Haskell discussed above are M:N green threads. N:1 is easier to
implement, and fine for single core systems, but doesn't give you real
parallelism. M:N, like 1:1 native threading, gives you real parallelism, but
also, like N:1, cheap concurrency (at the expense of being the hardest to
implement well.)

Multicore processors, and the fact that single-threaded performance basically
hit a wall, explain why an N:1 threading model in something with the use
cases of Java fell out of favor. M:N, while still "green threads", has a
somewhat different set of trade-offs versus 1:1 native threads than N:1 does,
though.

------
ninjakeyboard
"There's a dilemma though - if the application code is writing bytes to the
response body with blocking write(), isn't it tying up a thread if the client
is slow?"

There are TCP buffers. If one fills up, yeah, maybe you'll wait a bit, but
that's thread versus rope compared to blocking a thread across a whole big
round-trip data exchange. Nobody really tunes the write buffers because they
don't slow down our apps. That is not the case with calls to remote services.

Note that TCP buffers are pretty big, so the chance of the thread blocking
while writing is very low.
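For reference, the send buffer is per-socket and inspectable/tunable in Java;
the size below is illustrative, and the OS may round or cap whatever you ask
for.

```java
import java.net.Socket;
import java.net.SocketException;

public class SendBuffer {
    public static void main(String[] args) throws SocketException {
        Socket socket = new Socket();   // unconnected; options can still be set
        // Request a 256 KiB send buffer; write() only blocks once this fills
        // faster than the client drains it.
        socket.setSendBufferSize(256 * 1024);
        // The kernel may report a different (often larger) effective size.
        System.out.println(socket.getSendBufferSize());
    }
}
```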

------
pkghost
As others here have mentioned, there is no necessary relationship between the
awful async syntax so common to JavaScript environments and non-blocking IO.
There are plenty of languages/environments that support highly scalable
network operations (that is, do not use a new thread or process to handle each
new connection) without introducing callback hell.

The Greenhouse framework in Python is one of them
([http://teepark.github.io/greenhouse/master/](http://teepark.github.io/greenhouse/master/)).
The docs (linked) have a concise discussion of the various approaches to
parallelizing IO operations with pros and cons of each, and, for most database
applications, an obvious winner.

~~~
zhong-j-yu
We can argue that threads suck because they are expensive. But we must measure
how expensive they actually are in real-world applications before abandoning
them. If a server must maintain a few hundred concurrent threads, that is
really nothing.

~~~
pacala
To be precise, threads suck because thread _scheduling_ is expensive. From a
logical pov, structured programming still rules, and threads are an excellent
way to implement structured programming.

------
jeremyjh
It seems like such a waste to use async libraries in many Java applications
and I wince at the thought of a mainstream application framework adopting CPS.
In the techempower benchmarks Netty beats Servlet by only like 3%. Maybe if
you are serving millions of connections you will get your money's worth but I
think a synchronous API in the application stack is the right answer for
almost everything. Even Play Framework with Scala, which takes most of the
edge off, is still needlessly complex compared to, say, Dropwizard code. You
have to remember not to block, and make sure none of your libraries block
either. Why pick up that burden if it is not totally necessary?

~~~
zhong-j-yu
Yes, even if your language/framework has great abstraction over async, it is
still something that the programmer must be aware of, and must reason about
all the time. It's just easier doing sync instead, at least for C-family
programmers.

------
__david__
The select() loop has been a core part of unix since before many people here
were born. That's the basis of async, evented IO—it's hard to call that a fad.

Perhaps the callback mechanism of Node is a fad—continuation based techniques
can make async code look like non-async code. Perl's Coro and Ruby's new(ish)
fibers are examples of how that could look.

~~~
zhong-j-yu
what I meant is whether it's a fad to spread async everywhere inside
application code, and call that a good thing. the computer is of course async
in nature; but the abstraction on the app layer does not have to be.

~~~
kaoD
> the computer is of course async in nature; but the abstraction on the app
> layer does not have to be.

Aren't computers actually synchronous? The abstraction on the OS layer is what
makes computers async.

At the programming-language layer, async means you can write programs based on
how the user and your program interact with each other. Other applications
(and the OS itself) behave asynchronously too. Why make it harder?

Of course it shouldn't _have_ to be that way (it didn't use to be), but it's
very convenient to think around "what is actually happening" instead of "what
the synchronous execution of the program is doing".

EDIT: but now I've seen you only meant a specific subset of async. I'm leaving
the comment here anyways.

------
vampirechicken
I hope somebody told the poster about putting a proxy server (squid, apache,
nginx, varnish, etc.) between the application server and the client.

Proxy deals with slow client delays. App server serves app requests at speed.
Tune number of proxy connections to keep app servers reasonably busy. Scale
each layer individually.

IMO Whether the proxy is async or not is a matter of taste.

~~~
zhong-j-yu
I would rather buffer the entire response in the app server, instead of in the
central reverse-proxy.

~~~
cpeterso
Why buffer on the app server?

~~~
zhong-j-yu
Because we have multiple app servers, but only one reverse-proxy? I don't want
to centralize a task that could be distributed.

~~~
vampirechicken
So you're basically causing your own problems by refusing to scale your proxy
layer, which should be dirt cheap hardware, while scaling your beefier app
servers. You basically have it backwards.

------
virmundi
Seems like as good a time as any to refresh people's ideas on Java. There is
a user-thread implementation for Java. It does require instrumenting the
bytecode, but it does work.
[http://docs.paralleluniverse.co/quasar/](http://docs.paralleluniverse.co/quasar/).

I'm looking at the port of Quasar to Clojure as my sole reason for looking at
Clojure over Erlang.

~~~
zhong-j-yu
I absolutely love Quasar. Nevertheless, there's an honest question whether it
is actually needed in the majority of applications. I think not.

~~~
pron
Main author of Quasar here. The big question is what "the majority of
applications" tells us. A lot of applications run on virtualized hardware,
which basically runs a few Pentiums on a single i7 box. So "the majority of
applications" don't really need more than a Pentium. But is that really by
choice? Or is that simply because modern hardware is so much harder to fully
exploit, so we just don't bother and lower expectations? I think it is the
latter. If we make it easier to fully take advantage of modern hardware (its
processor and memory architecture, its IO/processing latency ratio etc.) then
all of a sudden you'll see how most applications actually need every inch of
performance they can get their hands on.

------
angersock
Erlang called, and left a message.

The message was "You are all super late to the party lol".

------
nobbyclark
Waaaaiit a second! This "async thing" is not a fad created by node.js but
rather the effective conclusion of the C10K problem -
[http://www.kegel.com/c10k.html](http://www.kegel.com/c10k.html) \- the
realization that we could scale up our application servers to handle more
requests, and that threads alone had failed to get us there.

And sorry, but anyone who wants to claim that Java's green threads are somehow
a better programming model than async IO a la node.js + promises is pretending
to write code. Yes, async IO is not easy, certainly nowhere near as simple to
manage as process/fork, but with a consistent coding style you can still end
up with a system that behaves predictably and, most importantly, can be
reasoned about.

Meanwhile I bitterly regret the days and weeks of my life lost to debugging
threaded code. Never again!

~~~
zhong-j-yu
One-thread-per-connection is very bad. But one-thread-per-request is probably
not that bad.

------
twerquie
If your application is largely fetching a bunch of external resources, say
from something slow like a database or a web API, mashing them up and
returning a response to a client, you need async.

Node has a very beginner-friendly primitive for doing this.

~~~
nmjohn
Yes.

People in general are really bad at grasping the concept that not every
$programming_paradigm is equally suited to a given problem.

Object Oriented, functional, async, strict type systems, etc.

I've noticed an almost evangelical nature in people trying to push their
preferred environment on others. Guess what? Node.js is not the answer to
everything (and I do 90% of my work in node!). But you know what? Neither is
OO, or functional programming. There may be many advantages to your language
of choice, but that doesn't mean it's always better.

Fetching resources from many different areas - especially external or
unpredictable (3rd party) services - that has async written all over it, and
node makes that easy.

~~~
zhong-j-yu
I agree that external IO would most likely benefit from async. But we don't
need to turn the entire request-response code flow into async style just
because of one async call. We could break it up into three parts: sync code,
async code (for the external IO), then sync code again.
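A rough sketch of that three-part split using Java's CompletableFuture (the
fetch targets here are made up): the request handler stays synchronous, and
only the external calls fan out.

```java
import java.util.concurrent.CompletableFuture;

public class ThreePhase {
    // Phase 2: stand-in for external IO, done off-thread.
    static CompletableFuture<String> fetchAsync(String name) {
        return CompletableFuture.supplyAsync(() -> "data-from-" + name);
    }

    static String handleRequest() {
        // Phase 1: plain synchronous code.
        String query = "users";

        // Phase 2: kick off the external calls concurrently...
        CompletableFuture<String> a = fetchAsync(query);
        CompletableFuture<String> b = fetchAsync("orders");

        // Phase 3: ...then rejoin and continue synchronously.
        return a.join() + "|" + b.join();
    }

    public static void main(String[] args) {
        System.out.println(handleRequest()); // data-from-users|data-from-orders
    }
}
```

The caller never sees a callback; the async part is contained inside one
synchronous function.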

~~~
orclev
In some code I wrote for a major corporation who will not be named, that's
pretty much exactly what was done. We would use a thread pool to spin off a
bunch of workers given Future instances to chew on, and then we'd just block
waiting for all the results to come back before returning to the client. The
thread that handled a request was synchronous to the client, but did all its
services requests async and in parallel. Where appropriate we'd also chain
things, so one batch of async requests would go out, the results would be
gathered and then those results would be used to generate a new batch of async
requests, which once again would be gathered and the results used to respond to
the client. For those keeping track, this is basically the lazy IO pattern.
Not async in the sense of nodejs (which I hate by the way), no callbacks,
rather we get a collection of thunks which we can block on evaluating.
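As I understand the pattern described, it looks roughly like this in Java (the
service calls are simulated): fan a batch out to a pool, block on all the
Futures, then build the next batch from the gathered results.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchedFutures {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // Batch 1: fire all the service calls in parallel.
        List<Future<String>> batch1 = pool.invokeAll(
                List.<Callable<String>>of(() -> "profile", () -> "settings"));

        // Block until every result is back (the request thread stays sync).
        StringBuilder gathered = new StringBuilder();
        for (Future<String> f : batch1) gathered.append(f.get()).append(";");

        // Batch 2: derived from batch 1's results.
        Future<String> batch2 = pool.submit(() -> "enriched:" + gathered);
        System.out.println(batch2.get()); // enriched:profile;settings;

        pool.shutdown();
    }
}
```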

------
m0th87
This is a straw man argument. Async doesn't have to look like node.js'
callback hell - that's what go, erlang and several other languages achieve
with M:N green threading.

~~~
zhong-j-yu
That's a matter of terminology. I don't call `go` "async".

~~~
dragonwriter
How do you define "async" that excludes Go?

~~~
zhong-j-yu
I use the word from the programmer's point of view; it's irrelevant how things
are done under the hood.

For example, in Go, when you read a value from a channel, it's just like a
good old blocking call, as far as the programmer is concerned.

On the other hand, an "async" read would involve callback, promise, or some
other constructs.

~~~
kaoD
Promises are also like good old blocking calls. How is a channel different?
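In Java terms, at least, the difference seems to be only in how you consume
the value: the very same CompletableFuture can be read like a blocking channel
or wired up with a callback.

```java
import java.util.concurrent.CompletableFuture;

public class ConsumeStyles {
    public static void main(String[] args) {
        CompletableFuture<Integer> result =
                CompletableFuture.supplyAsync(() -> 42);

        // Channel-style: block until the value arrives, like a sync read.
        int blocking = result.join();
        System.out.println(blocking); // 42

        // Promise-style: register a callback instead of blocking.
        result.thenAccept(v -> System.out.println("callback saw " + v));
    }
}
```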

------
zhong-j-yu
Those are my thoughts; I'd like to hear counter arguments.

~~~
curiousDog
Seems like you're looking at it solely from a web server/application back-end
perspective. Async I/O and the corollary, freeing up threads, is useful in
lots of other places like UI or applications that require very low latencies
(We had a distributed process that had to respond to heart-beat requests from
other machines amongst other I/O bound requests. Tying up threads when doing
I/O would've been a death sentence).

~~~
zhong-j-yu
UI - of course the UI thread should not be blocked handling IO. My point
is, move these IO actions to another thread; the code in that thread is good
old synchronous/threaded code.
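In Swing, for example, that pattern is a background thread doing the blocking
IO and SwingUtilities.invokeLater handing the result back to the UI thread
(the motd fetch below is a stand-in for real network IO):

```java
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

public class MotdLoader {
    static String fetchMotd() {
        // Stand-in for a blocking read of motd.txt over the network.
        return "hello";
    }

    static void load(JLabel label) {
        new Thread(() -> {
            String motd = fetchMotd();                  // blocks, off the UI thread
            SwingUtilities.invokeLater(                 // marshal back to the EDT
                    () -> label.setText(motd));
        }).start();
    }

    public static void main(String[] args) throws Exception {
        JLabel label = new JLabel();
        load(label);
        Thread.sleep(500);   // crude wait, just for the demo
        System.out.println(label.getText());
    }
}
```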

heart-beat - yes we'll need concurrent threads for handling concurrent
requests in the blocking world; the question is whether this will result in
too many threads, which depends on the application.

~~~
MaulingMonkey
"My point is, move these IO actions to another thread; the code in that thread
is good old synchronous/threaded code."

Sure. But how do you handle that? IO results need to be communicated back to
the UI thread somehow.

Throwing together a whole new thread+stack against a named method that
communicates back via explicit messages (or however you want to get the
results of IO back to the UI thread) seems like a bit much if all you want to
do is download a motd.txt and stuff it in a label.

~~~
zhong-j-yu
There is no problem with modifying UI state from any thread; just put in some
synchronization.

The hard part is, if the modifications do not form a single, predictable,
serialized chain, how can the programmer reason about them? This problem is
independent of async/sync. If you use C# async for a UI action, you still need
to worry about it.

~~~
kaoD
> There is no problem with modifying UI state from any thread; just put in
> some synchronization.

It reminded me of the "I know, I'll use regular expressions" joke.

~~~
zhong-j-yu
The single-event-thread design is not without its own problems; programmers
have difficulty understanding and abiding by it too.

Here we have an inherently concurrent problem - user actions and some IO
actions occur concurrently. That problem cannot be reduced by some API or
language trick.

~~~
fleitz
Given the Universal Turing Machine and the Church-Turing thesis, in actuality
all of programming except for assembler is pretty much some API or language
trick.

~~~
dllthomas
Arguably, even machine language is "some API" and assembler "some language
trick".

~~~
fleitz
Good point, I forgot for a second that the CPU turns assembler into microcode
before executing it.

~~~
dllthomas
The CPU often turns machine code into microcode, true, and that's a good point
but it wasn't the point I was trying to make. Even in simpler processors that
actually _do_ directly execute the machine code, I think you can view that
machine code as an API for controlling the processor system.

------
dragonwriter
The async-everywhere trend might be a fad (though I think it was emerging
before Node and that Node is a response to rather than the source of the
trend), but async itself clearly is not.

------
deadgrey19
I don't write web applications, and I don't use much JavaScript, so it's very
possible that I don't properly understand the motivation for this blog post.
However, as others have said there is a difference between some inconvenient
syntax in JS, and the fundamental model of non-blocking I/O.

What I do write is lots of C/C++ client/server applications for HPC/HFT/DC
workloads where speed both in req/sec (throughput) and speed in
min/avg/max(secs/req) (latency) matters. In these environments I almost
exclusively use non-blocking I/O. There are several reasons:

1) Threads are not free. Even if you use a thread pool to avoid spin-up costs,
context-switching overhead matters. Every time you call blocking IO, you
guarantee that the kernel will wake up, schedule another thread, and do
anything else that it decides to do, wasting time that you could have used to
do useful work. Non-blocking IO puts you in charge of your own "thread
scheduler": your "threads" are functions, they are "cooperatively scheduled",
and you can make full use of every cycle that you get.
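In Java terms, the "cooperative scheduler" described here is essentially an
NIO Selector loop: one thread blocks in select() and dispatches each ready
event to a handler function (the handler below is only sketched, and the port
is arbitrary).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class EventLoop {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {                       // the "scheduler": one thread
            selector.select();               // block until something is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    handleRead(key);         // a "cooperatively scheduled" task
                }
            }
        }
    }

    static void handleRead(SelectionKey key) throws IOException {
        // Real code would read whatever bytes are available and return
        // quickly; blocking here would stall every other connection.
        ((SocketChannel) key.channel()).close();
    }
}
```

Each handler must return promptly, which is exactly the cooperative-scheduling
discipline described above.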

2) Programming with threads is hard. Trust me. If you think it's easy, or I'm
soft, you haven't done it enough. At some point you will need shared state
across those threads. And then you'll need locking and unlocking. (Also,
mutexes are slooooowww.) And then you'll need to handle error cases, and
you'll need to
make sure that all the unlocking is done right in all of the right places. And
then you'll need signaling between your threads. And you'll need semaphores or
similar. And 3 months down the line, you're thinking to yourself, when a foo
exception causes a bar signal, will a baz handler deadlock? Will it make
progress? Humans just aren't designed to reason about this sort of thing.

With a single threaded, non-blocking design, it's really easy to reason about
exactly what is happening with all of your state. Debugging is obvious and
straightforward. This is necessary if you're like me and don't write perfect
code first time. There's only ever one function accessing shared state at one
time. The "scheduler" is working for you, not against you. If you write your
code simply, cleanly and efficiently, you'd be amazed how much work a modern
CPU can really do. Honestly, once you've saturated a 10G NIC what more do you
want to do?

3) If you buy into the non-blocking design, then, as long as you only use 1
process/thread per core, almost anything a thread can do, a process can do
better. Threads have no memory protection: anything you touch probably belongs
to some other thread, and you're inviting subtle bugs. Processes have memory
protection by default; if you want to share things, you can do it explicitly
via safe mechanisms (shared-memory rings, pipes, IPC, etc.). Shared memory
rings are
(can be) so fast that data is more or less local so if you want to use shared
state from a TCP connection or whatever, you can always "dispatch" work to
another process to do it for you. You get the benefits of many cores working
for you as well as a clean and obvious programming model.

Ultimately, if the question is one of syntax, then I'd happily believe that JS
has some ugly syntax for doing these things, but if the question is one of
design, then you should think really really hard before deciding that a
threaded model is the correct one for you.

~~~
zhong-j-yu
I apologize for the sensational and generalizing title. What I'm talking about
is focused on web applications, and from empirical data, it seems that most
web servers handle very few concurrent requests, therefore it would be silly
to go all async to avoid threads.

I'm very surprised that many people here argue that async code is much easier
to understand than sync code. Ok, so that part is subjective, and let's file
it under personal preference. For people who love synchronous coding but fear
the cost of threads, I'm trying to make an argument that the fear is probably
not justified.

~~~
deadgrey19
Thanks for the very considered response. I'm pretty interested in this because
some of my recent work has been about designing I/O APIs/abstractions.

My reaction is due to my experience which is that threaded programming is
something that's very hard to get right and especially to maintain. Async
programming cleans up the threading and makes it kind of implicitly
cooperative.

I was involved in a big move of some core infrastructure from a multi-threaded
design over to a pipeline of async-style apps. The result was a huge boost in
productivity and debuggability, which worked out really well for the company.

