
Async is, in many situations, better than traditional threads.

Threads are a resource hog. They take a lot of system resources, and so you usually want to have as few of them as possible. This is a problem for applications that could, in theory, support thousands of concurrent connections, if not more. With a basic thread-based model, you need one thread per connection, and if you have long-lived connections with infrequent traffic, those threads mostly do nothing but consume precious system resources. When you're waiting for data, the thread is blocked and does nothing. With async/await, you can have far fewer threads, maybe even just one, and handle blocking through a system call that wakes a thread up whenever any one of the currently blocked tasks is ready to progress. In languages with much lighter thread alternatives, such as Go's goroutines or Erlang's BEAM processes, this problem basically doesn't exist, and so those languages don't need async/await at all.
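
As a rough sketch of that model, here's what it looks like with Python's asyncio (the port and echo logic are made up, just for illustration). One thread can sit on thousands of mostly-idle connections, and the event loop only wakes up when some socket actually has data:

    import asyncio

    async def handle(reader, writer):
        # Each connection is a cheap task, not an OS thread.
        while data := await reader.readline():   # suspends this task, not the thread
            writer.write(data)
            await writer.drain()
        writer.close()

    async def main():
        server = await asyncio.start_server(handle, "127.0.0.1", 8888)
        async with server:
            await server.serve_forever()

    asyncio.run(main())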




> Threads are a resource hog.

Not really on any decent operating system, but if they are too heavy, there are still fibers aka green threads aka stack-switching (which at least on Windows are an operating system primitive, but can be implemented in user code on any system that gives you direct access to the CPU stack and registers).
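
As a rough illustration of that kind of user-space stack switching, here's a toy round-robin scheduler on top of Python's greenlet library (just a sketch; any facility that saves and restores the stack pointer works the same way):

    from greenlet import greenlet, getcurrent

    def task(name):
        for i in range(3):
            print(name, "step", i)
            # Switch stacks back to the scheduler (the parent greenlet);
            # this function's whole call stack survives the switch.
            getcurrent().parent.switch()

    # The main greenlet acts as a tiny round-robin scheduler.
    ready = [greenlet(lambda n=n: task(n)) for n in ("a", "b")]
    while ready:
        g = ready.pop(0)
        g.switch()            # resume (or start) the fiber where it left off
        if not g.dead:
            ready.append(g)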

I doubt that the async/await state machine code transformation, which 'slices' sequential function bodies into many small parts that are then jumped into and out of frequently, is any better for performance than stack-switching (in async/await you still need to switch a 'context pointer' on slice entry/exit instead of the stack pointer).
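
To make that concrete, the transformation roughly turns an async function into an object with a state field whose poll/resume method jumps into the right slice. A hand-written Python sketch of the idea (CPython itself doesn't compile async functions this way, and readable() below is a made-up helper name):

    class FetchStateMachine:
        # Roughly what a compiler would generate for:
        #     async def fetch(sock):
        #         await readable(sock)     # suspension point
        #         return sock.recv(4096)
        def __init__(self, sock):
            self.sock = sock
            self.state = 0       # the 'context pointer': which slice runs next

        def poll(self, sock_is_readable):
            if self.state == 0:                  # slice before the await
                self.state = 1
                return ("pending", self.sock)
            if self.state == 1:                  # slice after the await
                if not sock_is_readable:
                    return ("pending", self.sock)
                return ("done", self.sock.recv(4096))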

One obvious advantage of the state-machine code transformation is that it also works in very limited single-threaded runtime environments without access to the callstack (like WASM).

In any case, from the user perspective, async/await should just be language syntax sugar; how it is implemented under the hood ideally shouldn't matter (e.g. it should also be possible to implement it on top of a task scheduler that runs on fibers or threads instead of a state-machine code transformation).


The async/await model gives you exactly one guarantee: because the yield continuation is second class, at most one stack frame can be suspended, so the amount of space that needs to be reserved for a task is bounded and can potentially be computed statically. This can be important for very high performance/very high concurrency programs, so I think the upsides can outweigh the downsides in something like rust, C++ [1], and possibly C#. I still do not understand why async was deemed appropriate, for example, in python.

As an aside, there is a lot of confusion in this thread between general async operations and async/await.

[1] but of course C++ screwed it up by requiring hard-to-remove allocations.


> I still do not understand why async was deemed appropriate, for example, in python.

My best guess is that it's because of implementation limitations in CPython and likely other interpreters. Stackless Python is a fork of CPython with real coroutines/fibers/green threads, but apparently they didn't want to merge that patch. Very disappointing, because async/await is a nearly useless substitute for my desired use case (embedded scripting languages with pausable scripts).
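
For what it's worth, plain generators get you part of the way to pausable scripts in stock CPython. A minimal sketch (the catch being exactly the coloring problem: every frame in the call chain has to be a generator and use yield from):

    def script():
        print("walk to door")
        yield "wait_for_click"        # pause point; the host decides when to resume
        print("open door")
        yield ("wait_seconds", 2)
        print("done")

    # Host/engine side: drive the script one pause at a time.
    s = script()
    try:
        req = next(s)                 # run until the first pause
        while True:
            # ...the host would wait here until 'req' is satisfied...
            req = s.send(None)        # then resume the script
    except StopIteration:
        pass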


There is also gevent, which is a library-only coroutine extension that didn't require any changes to the interpreter itself. I'm also sure it would be easier to maintain and evolve if it were part of Python core.
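
A typical gevent usage sketch looks something like this (monkey-patching swaps the blocking stdlib calls for greenlet-aware ones, so ordinary blocking-style code cooperates; the URLs are just placeholders):

    from gevent import monkey
    monkey.patch_all()          # replace blocking stdlib I/O with gevent-aware versions

    import gevent
    import urllib.request

    def fetch(url):
        # Looks like a normal blocking call, but yields to the gevent hub
        # while waiting on the socket, so other greenlets can run.
        return urllib.request.urlopen(url).read()

    jobs = [gevent.spawn(fetch, u) for u in ("http://example.com", "http://example.org")]
    gevent.joinall(jobs)
    print([len(j.value) for j in jobs])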


> I still do not understand why async was deemed appropriate, for example, in python.

Because queues backed by thread/process pools for serving web requests have sharp edges.


Userspace fibers (no clue about Windows fibers) still have the blocking IO problem. If your fiber calls read() but there's no data and read() blocks for a few minutes, until the next message is received, no other fibers can be scheduled on that thread in the meantime. With async, the task just gets suspended, something like epoll gets called with info about all the suspended tasks, and the thread unblocks once any task can move forward, not necessarily the one that requested the read. This problem doesn't exist if your pseudo-threads have first-class language and runtime support; see goroutines, for example.
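
Concretely, the "wake up when any task can move forward" part is a single blocking call over all the parked tasks' file descriptors. A bare-bones sketch with Python's selectors module (epoll/kqueue under the hood; the task objects here are just callables, purely illustrative):

    import selectors

    sel = selectors.DefaultSelector()        # epoll on Linux, kqueue on BSD/macOS

    def suspend_until_readable(sock, task):
        # Park the task; nothing blocks on this particular socket.
        sel.register(sock, selectors.EVENT_READ, task)

    def event_loop():
        while True:
            # The one blocking call: returns as soon as ANY parked task's
            # socket has data, not necessarily the first one that asked.
            for key, _ in sel.select():
                sel.unregister(key.fileobj)
                key.data(key.fileobj)        # resume that task with its ready socket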


If the blocking function were fiber-aware and yielded execution back to the fiber runtime until the (underlying) async operation has completed, it would "just work". One could most likely write their own wrapper functions which use the Windows "overlapped I/O" functions (those just take a completion callback if I remember right - or maybe a completion event?)

Not possible with the C stdlib IO functions though (that's why it would be nice to have optional async IO functions with completion callback in the C stdlib)

PS: just calling a blocking read in async/await code would have the same effect though; you need an "async/await-aware" version of read().
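
E.g. in Python the async-aware wrappers look something like this (same idea in any runtime: register the fd and suspend instead of blocking; for calls with no non-blocking variant, push them onto a thread pool):

    import asyncio

    async def async_read(sock, n=4096):
        loop = asyncio.get_running_loop()
        # Suspends this task until the socket is readable, instead of
        # blocking the whole thread like a raw sock.recv() would.
        # (The socket must be in non-blocking mode.)
        return await loop.sock_recv(sock, n)

    async def async_read_blocking(f, n):
        loop = asyncio.get_running_loop()
        # No non-blocking variant available: run the blocking call on a
        # worker thread so the event loop keeps running.
        return await loop.run_in_executor(None, f.read, n)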


If your async task performs a raw read it will also block. In the coroutine case you of course need to call a read wrapper that allows for user-mode scheduling. That can literally be the same function you use for async. Coroutines also allow library interposition tricks that transparently swap a blocking read for one that returns control to the event loop, so in principle existing blocking code need not change. libpth did something like that, for example. YMMV.


> Threads are a resource hog.

I agree. They are if we spawn OS threads every time we need a thread. The equivalent in async would be spawning an entire event loop, overhead and all, every time we need concurrency.

Obviously, we don't do that.

Worker pools don't need respawning. Greenlets don't need respawning. Virtual threads handled by the runtime don't need respawning.
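
E.g. with a pool the threads are created once and reused. A quick sketch with Python's concurrent.futures (workload made up):

    from concurrent.futures import ThreadPoolExecutor
    import time

    # The 32 worker threads are created once and reused for every task;
    # nothing gets respawned per request.
    pool = ThreadPoolExecutor(max_workers=32)

    def handle(request_id):
        time.sleep(0.1)          # stand-in for real work / blocking I/O
        return request_id * 2

    futures = [pool.submit(handle, i) for i in range(100)]
    print(sum(f.result() for f in futures))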


Isn't that problem generally easily solved with a thread pool? (That's what nginx does, I believe.)


There are use cases where a thread pool doesn't solve your problem. If you're handling a few short-lived connections at a time, it's more than enough, but if you're developing something like a push / messaging / queuing service, with thousands of clients connected for hours at a time and receiving very little data once every few minutes, a thread pool won't help you.


This is a solved problem.

I can run millions of goroutines on a laptop. These get mapped onto a relatively small number of OS threads (the number of available CPU cores, with default settings) by the runtime.


> a thread pool won't help you.

What do you see as the main limitations of spawning 2048 threads in a pool in this scenario?


2048 threads would be fine, but they are talking about tens of thousands of clients.


> if you have long-lived connections with infrequent traffic

This is an interesting case. Is it difficult to recover state in the case of an error in such a connection? If not, then you could just use that ability. If so, that seems fragile.

Also, this doesn't sound like an inherent limitation of the design approach. Couldn't the Linux kernel just improve the performance of that case?

> those threads mostly do nothing but consume precious system resources.

You mean a small amount of virtual memory?



