
Async.h – asynchronous, stackless subroutines in C - signa11
https://higherlogics.blogspot.com/2019/09/asynch-asynchronous-stackless.html
======
saagarjha
Actual header:
[https://github.com/naasking/async.h/blob/master/async/async....](https://github.com/naasking/async.h/blob/master/async/async.h)

~~~
bjackman
I'm always on the lookout for stuff like this because until recently I
maintained some C code that really ought to use threads but can't afford the
extra stacks/context switching. Here's my current view:

\- "Protothreads" are just the old switch statement coroutine trick. Local
variables do not persist across async switches. This is expected and not
surprising, but that doesn't make it any less unnerving: it would be very easy
to accidentally introduce very hard-to-debug problems in code using this
technique. None of the author's examples expose this limitation!

\- async.h seems to be just the addition of a per-protothread structure where
you can store your state (where the old coroutine trick typically has people
just using static variables). This improves things a little but the author's
examples still don't highlight the major issue that local variables don't
persist. I suspect they didn't mention it because it seems so obvious, but
that doesn't mean it isn't important.
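
To make the local-variable issue in the two points above concrete, here is a minimal sketch of the switch-based trick. All names (`counter_state`, `counter_next`, `resume_point`) are made up for illustration; this is not the protothreads or async.h API, just the underlying idea with the state kept in a per-coroutine struct.

```c
/* Hypothetical per-coroutine state; not taken from async.h or protothreads. */
typedef struct {
    int resume_point;  /* which `case` to jump to on the next call */
    int i;             /* loop counter kept here so it survives a yield */
} counter_state;

/* Returns 1 while a value was written to *out, 0 once finished. */
int counter_next(counter_state *s, int *out)
{
    int local = 0;  /* re-initialised on every call: does NOT survive a yield */

    switch (s->resume_point) {
    case 0:
        local = 100;
        for (s->i = 0; s->i < 3; s->i++) {
            *out = local + s->i;
            s->resume_point = 1;
            return 1;   /* "yield": hand control back to the caller */
    case 1:;            /* the next call resumes here, inside the loop */
        }
    }
    s->resume_point = 2;  /* matches no case, so later calls skip the switch */
    return 0;
}
```

Starting from a zeroed `counter_state`, successive calls produce 100, then 1, then 2, and then report completion. The first value uses `local = 100`, but after the first yield `local` is back to 0, while `s->i` keeps counting because it lives in the struct.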

I worked for about 2 years on code that's perfectly suited to these techniques
(i.e. software that manages inherently parallel procedures, but that can't use
actual threads), spent lots of time considering them, but in the end I think
it's just better to manually manage your flow control. You end up with state
coming in and out of functions as struct members (which is exactly the main
idea of async.h IIUC) and the flow of your code doesn't naturally resemble
the flow of the procedures it's actually managing. The solution IMO is
just to code carefully, write comments, and draw diagrams.

My C files like this start with a couple pages of prose outlining the
operation of the HW they're interacting with (if it was not interacting with
HW, it would probably describe the protocol it's implementing, or the user
interaction it's facilitating), and how that relates to the design of the SW
in the file. Once you have that, I think the reader can see how the slightly
unnatural function boundaries map onto elements of the broader system.

Sorry to be negative, these headers are cool and interesting and I'm glad
people invent them. I just don't think we should use them in important
software.

------
aelo
What one should mention is that his version of coroutines is also stateless.
That's why you need either stackful coroutines written in assembly or support
in the language/compiler or operating system. Special case versions where you
can avoid backing up the stack frame exist in other projects too, like in
asio:
[https://github.com/chriskohlhoff/asio/blob/master/asio/inclu...](https://github.com/chriskohlhoff/asio/blob/master/asio/include/asio/coroutine.hpp)

But in my opinion I would not use them for generic use cases and rather wait
for the official support in C++20.

Of course I'm aware that I'm talking about C++ here and the situation for C
might be slightly different.

------
innagadadavida
Does anyone use Apple’s libdispatch (aka GCD) in production code on Linux? It
seems to be pretty fast and has some neat features. It also gives you
blocks/closure support in C (needs Clang).

~~~
Yoric
Oh, libdispatch has been ported to other platforms?

That's great!

~~~
innagadadavida
Was available in FreeBSD back in 2009:
[https://wiki.freebsd.org/GCD](https://wiki.freebsd.org/GCD)

Archived Linux port:
[https://github.com/nickhutchinson/libdispatch](https://github.com/nickhutchinson/libdispatch)

You can use this without blocks, but it is way more convenient with blocks.

~~~
Yoric
Good to know, thanks!

------
kccqzy
And now when you just happen to decide to merge two statements in a single
line, things start to crash and burn :)

Or, you know, when you happen to want to declare a few local variables.
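
For readers wondering why line layout matters: implementations of this trick commonly use `__LINE__` as the `case` label for each resume point. The sketch below assumes that design; the macro names (`CO_BEGIN`, `CO_YIELD`, `CO_END`) and the `co_state` struct are hypothetical and are not the async.h macros.

```c
/* Hypothetical macros illustrating __LINE__-based resume points. */
typedef struct { int line; } co_state;

#define CO_BEGIN(s)    switch ((s)->line) { case 0:
#define CO_YIELD(s, v) do { (s)->line = __LINE__; return (v); \
                            case __LINE__:; } while (0)
#define CO_END(s)      } (s)->line = -1; return -1

/* Yields 10, 20, 30 on successive calls, then -1 forever. */
int three_steps(co_state *s)
{
    CO_BEGIN(s);
    CO_YIELD(s, 10);
    CO_YIELD(s, 20);
    CO_YIELD(s, 30);
    /* Writing two yields on ONE source line, e.g.
     *     CO_YIELD(s, 30); CO_YIELD(s, 40);
     * would expand to two `case` labels with the same value, which a
     * C compiler rejects as a duplicate case.
     *
     * A local declared with an initializer between yields is also risky:
     * on resume the switch jumps past the initializer, so the variable
     * would hold an indeterminate value. */
    CO_END(s);
}
```

With these particular macros the same-line case fails at compile time rather than at run time; other implementations may behave differently.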

------
yvdriess
Ah yes the switch case statement hack :)

Does anyone have experience using libunwind for this kind of thing? You need
to save and swap state for actual parallel execution.

~~~
paulddraper
It is a "hack", but not a bad one.

------
huxingyi
Neat! I checked async.h, and it looks like it doesn't support nested calls? I
did a similar toy many years ago, which could be used with other async
libraries such as libuv:
[https://github.com/huxingyi/c-block](https://github.com/huxingyi/c-block)

~~~
naasking
Nested async calls are supported, there's an example at the end of the
article. Ordinary calls simply have to return before the next await statement,
and obviously you don't want to make blocking calls.

Async/await semantically doesn't support resumption from deep in the stack, so
it's a good choice for this sort of trick.
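
A rough sketch of what nesting looks like with this style of coroutine, written by hand rather than with the async.h macros. Every name here (`child_state`, `parent_state`, `child_step`, `parent_step`, `PENDING`, `DONE`) is an assumption for illustration. The point it tries to show is that the parent never resumes "inside" the child: each level returns to its caller and is re-entered from the top on the next poll.

```c
enum { PENDING = 0, DONE = 1 };

/* Hypothetical state structs; the parent embeds the child's state. */
typedef struct { int line; int ticks; } child_state;
typedef struct { int line; child_state child; int result; } parent_state;

/* Child: reports PENDING twice, then DONE. */
int child_step(child_state *c)
{
    switch (c->line) {
    case 0:
        c->ticks = 0;
        while (c->ticks < 2) {
            c->ticks++;
            c->line = 1;
            return PENDING;   /* not ready yet; caller must poll again */
    case 1:;
        }
    }
    c->line = 2;              /* matches no case: stays DONE on later calls */
    return DONE;
}

/* Parent: "awaits" the child by polling it each time it is resumed. */
int parent_step(parent_state *p)
{
    switch (p->line) {
    case 0:
        p->child.line = 0;    /* initialise the nested coroutine */
        p->line = 1;
        /* fall through */
    case 1:
        if (child_step(&p->child) == PENDING)
            return PENDING;   /* unwind fully; resume at case 1 next time */
        p->result = p->child.ticks * 10;
    }
    p->line = 2;
    return DONE;
}
```

Polling `parent_step` on a zeroed `parent_state` returns `PENDING` twice and then `DONE`, with `result` set to 20. Any ordinary function called between the resume points has to return before the next one, since there is no saved stack to come back to.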

------
Yoric
Nice hack :)

I don't see myself using this in any kind of real code at this stage, although
it might serve as a base for a more comprehensive library of Rust-style
futures.

~~~
naasking
My intent was to play around with different approaches for Arduino
programming. I saw a bunch of blocking functions, like delay(), and thought
that seemed silly and made me wish for async/await. I realised at that point
that I could do it via macros with the same approach used by protothreads.

~~~
bxparks
I found that computed-goto
([https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html](https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html))
is cleaner than Duff's Device for this sort of thing. (It works with the Clang
compiler too.) Take a look at
[https://github.com/bxparks/AceRoutine](https://github.com/bxparks/AceRoutine)
for an Arduino library that uses computed-goto.
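
For comparison with the switch-based version, here is a minimal sketch of the same kind of resumable function using the GCC/Clang labels-as-values extension (`&&label` and `goto *ptr`). The names `cg_state` and `cg_next` are hypothetical and this is not AceRoutine's API; it only illustrates the mechanism.

```c
#include <stddef.h>

/* Hypothetical state: the resume point is a label address, not a line number. */
typedef struct {
    void *resume;  /* NULL means "start from the top" */
    int n;         /* loop counter kept in the struct so it survives a yield */
} cg_state;

/* Yields 1, 4, 9 on successive calls, then -1 forever.
 * Requires the GNU labels-as-values extension (GCC or Clang). */
int cg_next(cg_state *s)
{
    if (s->resume != NULL)
        goto *s->resume;        /* jump straight to the saved label */

    for (s->n = 1; s->n <= 3; s->n++) {
        s->resume = &&resumed;  /* remember where to continue */
        return s->n * s->n;     /* "yield" */
resumed:;
    }
    s->resume = &&done;
done:
    return -1;
}
```

Because the resume point is a label rather than a `case`, the body is free to use its own `switch` statements, which the Duff's Device approach cannot do across a yield.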

~~~
naasking
Agreed, definitely more efficient and probably more flexible, but two minor
downsides:

1\. not portable C

2\. requires more storage for each continuation (4 bytes vs. 2)

------
anfilt
Looks like it's just using Duff's device like you would use to implement a
coroutine.

~~~
kevsim
Yep, first line from the README:

> this is a header-only async/await implementation for C based on Duff's
> device

------
PixelOfDeath
I wonder if there is anything between the complexity of HPX and these
stackless task libraries.

