
Coroutines in C (2000) - jeffreyrogers
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
======
srean
It is a little unfortunate that one of the few (or only) underlying
abstractions of the machine that C chose to leave out was that of coroutines. I
find this rather ironic because I have been told that PDPs, the virtual
machine model that C is based on, had instruction level support for exchanging
control very efficiently. I have also heard that VMs like LLVM aren't that
good at portably exposing efficient exchange of control between stack frames
(please correct me if I am wrong here).

Since C is the implementation medium for many other languages, coroutines did
not get as much visibility as they should have.

I can only speculate why C left it out. Perhaps it was felt that it wouldn't
port too well outside of the PDP family. The other reason could be that
coroutines, as an abstraction, completely leave out how they ought to be
scheduled. Perhaps they did not want to flesh out a schedule that would be
portable everywhere. It is not a very satisfactory explanation because C wasn't
that shy about leaving things undefined. I think the only way to get an answer
to this question is to go to the source: that would be Ken Thompson or Brian
Kernighan.

~~~
jeffreyrogers
It may be because of implementation complexity? C was designed with the
compiler writer in mind (which is why so many things are undefined... to make
it easier for compiler writers), so if co-routines are hard to implement then
that might be an explanation. Anyways that's mostly pure speculation on my
part.

~~~
srean
...but that's the thing, on the PDP the crux of it was essentially one
instruction. Maybe the worry was that porting this to other architectures of
the time wasn't feasible or easy.

~~~
ANTSANTS
My guess is that, in light of the insanely limited memories of the machines
they were working with, they had to leave out any feature that wasn't
absolutely necessary. Maybe the benefits of coroutines weren't as obvious
then, or maybe they didn't seem to mesh well with the rest of UNIX ("why not
just fork")

Then, when UNIX and C exploded into far bigger things than they had probably
ever dreamed of, it was too late. I think C would be a very different language
if you had explicitly given Dennis Ritchie the task of writing the eternal
lingua franca of programming with at least 80's-level computing resources at
his disposal, but that's history for you.

------
rumcajz
I wrote a little preprocessor that translates a slightly augmented version of
C (with Golang-like "go" and "select" commands) into the kind of code
described in the article.

[http://millc.org](http://millc.org)

~~~
vu3rdd
This is really cool.

Have you seen the plan9port implementation? It has the channel abstraction
available as a library. The 'select' is called 'alt' in the code.

[http://man.cat-v.org/plan_9/2/thread](http://man.cat-v.org/plan_9/2/thread)

~~~
rumcajz
Thanks for the link! I'll check it.

------
DenisM
It's also possible to implement coroutines using setjmp[1]. IIRC, first you
malloc() a new call stack, then populate a buffer using setjmp, patch the
buffer to refer to the new call stack, longjmp to your new coroutine. I
implemented that about 20 years ago. There was some minor tweaking involved
for different platforms, except for VAX/VMS - that one turned out to be a real
pain.

What I learned from the whole experience is that coroutines are a lot less
mysterious than they look. Although I never had a chance to use them since
then.

[1]
[http://en.wikipedia.org/wiki/Setjmp.h](http://en.wikipedia.org/wiki/Setjmp.h)

------
halayli
For those interested in coroutines, check out lthread & lthread cpp bindings
(disclaimer: author)

Code:

[https://github.com/halayli/lthread](https://github.com/halayli/lthread)

[https://github.com/halayli/lthread_cpp](https://github.com/halayli/lthread_cpp)

Docs:

[http://lthread.readthedocs.org/en/latest/](http://lthread.readthedocs.org/en/latest/)

[http://lthread-cpp.readthedocs.org/en/latest/](http://lthread-cpp.readthedocs.org/en/latest/)

------
canadev
When I was in school, we used something called μC++, developed by a research
group there. It implemented coroutines, tasks, and monitors.

[http://plg.uwaterloo.ca/usystem/uC++.html](http://plg.uwaterloo.ca/usystem/uC++.html)

~~~
Liru
I have to deal with uC++ for the term. I'm not quite sure how to feel about
it. On the one hand, it does make it a bit easier to write concurrent
programs. On the other hand, I would have been more comfortable with Boost or
almost any other language to learn about concurrent systems.

~~~
canadev
FWIW, I really liked it. I thought it was pretty clear and comprehensible. Of
course, that was like 10+ years ago now.

------
amelius
The problem with coroutines in C and C++ is that you'll have to save the stack
upon a yield, and restore it later. If only C or C++ added some stack barrier
primitives to the language, coroutines could (I suspect) be added more
naturally.

Perhaps Rust could introduce this.

~~~
halayli
This is not C/C++ specific. It's how procedure calls work on x86.

You don't need to save the stack if you have a stack per coroutine. Using
madvise you can always let the kernel know of the memory you no longer need.
So even if you allocate 1MB stack, it doesn't mean you are using 1MB of
physical memory. If at some point you used 512kb then you yielded at 100kb
stack usage, you can let the OS free up the difference.
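
A minimal sketch of the madvise idea above, assuming Linux and a stack that
grows downward. The names co_stack, co_stack_init, co_stack_trim and
co_stack_free are hypothetical helpers for illustration, not part of lthread
or any other library; only mmap, madvise, munmap and sysconf are real calls.

```c
#define _DEFAULT_SOURCE          /* MAP_ANONYMOUS and madvise on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical type: one private stack per coroutine. */
typedef struct {
    char  *base;                 /* lowest address of the mapping */
    size_t size;                 /* total bytes reserved */
} co_stack;

/* Reserve address space. Pages are backed by physical memory only once
   they are touched, so a 1MB reservation is cheap until it is used. */
static int co_stack_init(co_stack *s, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    s->base = p;
    s->size = size;
    return 0;
}

/* Call at yield time. With a downward-growing stack the live data sits in
   the top bytes_in_use bytes; the whole pages below it can be handed back
   to the kernel. Returns the number of bytes released (0 on failure). */
static size_t co_stack_trim(co_stack *s, size_t bytes_in_use)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert((uintptr_t)s->base % page == 0);   /* mmap returns page-aligned memory */

    if (bytes_in_use > s->size)
        return 0;
    size_t unused = s->size - bytes_in_use;
    unused -= unused % page;                  /* release whole pages only */
    if (unused == 0)
        return 0;
    if (madvise(s->base, unused, MADV_DONTNEED) != 0)
        return 0;
    return unused;
}

static void co_stack_free(co_stack *s)
{
    munmap(s->base, s->size);
    s->base = NULL;
    s->size = 0;
}
```

In the scenario described, a coroutine that peaked at 512kb and yields at
100kb would call co_stack_trim(&s, 100 * 1024). The address range stays
reserved, so the coroutine can grow into it again after it resumes.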

~~~
amelius
But when the coroutine continues running again, you need to have stack space.
You'd probably then have to insert special stack-switching directives into the
new stack to make it work.

In any case, I think it is best to just precompute the stack-size beforehand
(assuming there is no recursion), and otherwise check for stack-overflow in
recursive calls (and reallocate space if necessary). Pre-allocating megabytes
of stack for each running coroutine sounds like madness, and a disaster
waiting to happen (also, why should my stack overflow at 1M calls, when I have
64GB of main memory?)

~~~
halayli
Stack space is always precomputed and static. You cannot reallocate after it
overflows.

On the other hand, the default stack size on Linux is 8k, 1M is excessive and
way more than you need in a sane program that doesn't have huge buffers on the
stack.

Stack is not meant to use 64GB of memory, it's meant to be small to pass
arguments and allocate small buffers on it. The rest of the 64GB is used for
heap allocations.

------
justincormack
That's not the sane way of implementing coroutines in C. Use
makecontext/swapcontext if they exist, or implement in assembly.
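
A minimal sketch of the makecontext/swapcontext route, assuming a platform
that still ships <ucontext.h> (glibc does, although the interface was dropped
from POSIX.1-2008). The names gen_start, gen_resume, gen_yield and counter
are hypothetical, and the static globals mean only one coroutine can exist
at a time.

```c
#include <assert.h>
#include <ucontext.h>

enum { GEN_STACK_SIZE = 64 * 1024 };

static ucontext_t caller_ctx;           /* where to go back to on a yield */
static ucontext_t gen_ctx;              /* the coroutine's saved context */
static char gen_stack[GEN_STACK_SIZE];  /* the coroutine's private stack */
static int gen_value;                   /* value passed out by gen_yield */
static int gen_done;

/* Called from inside the coroutine: save its context, resume the caller. */
static void gen_yield(int v)
{
    gen_value = v;
    swapcontext(&gen_ctx, &caller_ctx);
}

/* The coroutine body: an ordinary function with an ordinary loop. */
static void counter(void)
{
    for (int i = 1; i <= 3; i++)
        gen_yield(i);
    gen_done = 1;   /* returning from here resumes uc_link (the caller) */
}

static void gen_start(void)
{
    int rc = getcontext(&gen_ctx);
    assert(rc == 0);
    (void)rc;
    gen_ctx.uc_stack.ss_sp = gen_stack;
    gen_ctx.uc_stack.ss_size = sizeof gen_stack;
    gen_ctx.uc_link = &caller_ctx;
    gen_done = 0;
    makecontext(&gen_ctx, counter, 0);
}

/* Returns 1 if the coroutine yielded a value (in gen_value), 0 when done. */
static int gen_resume(void)
{
    if (gen_done)
        return 0;
    swapcontext(&caller_ctx, &gen_ctx);
    return !gen_done;
}
```

After gen_start(), each gen_resume() runs counter until its next gen_yield;
the loop variable lives on gen_stack, so nothing has to be copied on a switch.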

~~~
psykotic
It's sane enough that Adam Dunkels wrote a functional if minimal TCP/IP stack
using this style of coroutines:
[http://en.wikipedia.org/wiki/UIP_(micro_IP)](http://en.wikipedia.org/wiki/UIP_\(micro_IP\))

Edit: He more recently wrote the Contiki operating system which is entirely
based on protothreads: [http://www.contiki-os.org/](http://www.contiki-os.org/)

~~~
lmb
If you look at uIP you'll realize it's pretty much a gigantic function [1] with
a lot of gotos and ifdefs. Not exactly what I would cite for sane code.

1: [https://github.com/contiki-os/contiki/blob/master/core/net/i...](https://github.com/contiki-os/contiki/blob/master/core/net/ipv4/uip.c#L673)

P.S. UIP doesn't use the coroutine stuff at all it seems.

~~~
psykotic
> Not exactly what I would cite for sane code.

The uses of goto I saw compensate for C's lack of nested breaks and continues.
Reasonable people may differ on this topic, but I'm perfectly fine with it.
The real sin is obfuscating the natural control flow of a program by
introducing dummy variables like try_again and keep_running, or being forced
to split things into multiple functions just so you can use a return in lieu
of a nested break. (There are single-exit zealots for whom an early-out return
like that would be an equally grievous sin as the goto!)

In any case, my point was to demonstrate by example that serious software can
be written this way. The point wasn't to hold up uIP or Contiki or anything
else as impeccable exemplars of coding style.

> P.S. UIP doesn't use the coroutine stuff at all it seems.

It's not used in the core but in the protocol servers:
[https://github.com/adamdunkels/uip/blob/master/apps/webserve...](https://github.com/adamdunkels/uip/blob/master/apps/webserver/httpd.c).
All the PT and PSOCK code is protothreads.
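
For readers who have not seen the style, here is a rough sketch of the
switch/__LINE__ trick that this kind of code builds on. The CO_* macros and
range_gen are hypothetical names made up for illustration; they are not the
actual PT or PSOCK macros.

```c
#include <assert.h>

/* The only state kept between calls is the line to resume at. */
typedef struct { int line; } co_state;

#define CO_BEGIN(s)     switch ((s)->line) { case 0:
#define CO_YIELD(s, v)  do { (s)->line = __LINE__; return (v); \
                             case __LINE__:; } while (0)
#define CO_END(s)       } (s)->line = -1; return -1

/* Locals that must survive a yield live in a struct, because the C stack
   frame is gone as soon as the function returns. */
typedef struct {
    co_state co;
    int i;
} range_gen;

/* Yields 0, 1, 2 on successive calls, then -1 from then on. */
static int range_next(range_gen *g)
{
    assert(g);
    CO_BEGIN(&g->co);
    for (g->i = 0; g->i < 3; g->i++)
        CO_YIELD(&g->co, g->i);   /* the next call jumps back into the loop */
    CO_END(&g->co);
}
```

A zero-initialized range_gen starts from the top. On later calls the switch
jumps straight to the case label hidden inside CO_YIELD, in the middle of the
for loop, which is legal C for the same reason Duff's device is.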

------
zamalek
I've currently got a toy language on the go (learning LLVM) and decided to
have generators/coroutines as a core part of the language.

What I realized is that a function's stack frame is little more than a struct
living on the stack - once you have that it makes coroutines somewhat more
straightforward to encapsulate. There are some oddities (e.g. the callee's
frame is hoisted to the caller's frame, or you sometimes need the frame living
on the heap).
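
To illustrate the "frame is a struct" idea, here is a hand-written guess at
what such a lowering might look like for a Fibonacci generator. The names
fib_frame, fib_init, fib_next and fib_new are hypothetical and are not taken
from the gist.

```c
#include <assert.h>
#include <stdlib.h>

/* What would have been the generator's stack frame: its locals plus a
   marker for where it was suspended. */
typedef struct {
    int      state;   /* 0 = not started, 1 = suspended inside the loop */
    unsigned a, b;    /* the generator's former local variables */
} fib_frame;

static void fib_init(fib_frame *f)
{
    f->state = 0;
    f->a = 0;
    f->b = 0;
}

/* Each call resumes the generator and returns the next Fibonacci number. */
static unsigned fib_next(fib_frame *f)
{
    assert(f);
    if (f->state == 0) {          /* first entry: run the prologue */
        f->a = 0;
        f->b = 1;
        f->state = 1;
        return f->a;
    }
    unsigned next = f->a + f->b;  /* resumed: one more loop iteration */
    f->a = f->b;
    f->b = next;
    return f->a;
}

/* When the generator outlives its creator, the same frame goes on the heap. */
static fib_frame *fib_new(void)
{
    fib_frame *f = malloc(sizeof *f);
    if (f)
        fib_init(f);
    return f;
}
```

A caller that only uses the generator locally can declare a fib_frame as one
of its own locals, which is the hoisting case; fib_new covers the case where
the frame has to escape, and the caller releases it with free.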

I put some pseudo-C together as an end-goal for the toy compiler[1]. I thought
this was an interesting approach; comments are appreciated.

[1]:
[https://gist.github.com/jcdickinson/af7fabf37c808d8e5814](https://gist.github.com/jcdickinson/af7fabf37c808d8e5814)

