It is a little unfortunate that one of the few underlying abstractions of the machine that C chose to leave out (perhaps the only one) was coroutines. I find this rather ironic, because I have been told that the PDP, the machine model that C is based on, had instruction-level support for exchanging control very efficiently. I have also heard that VMs like LLVM aren't that good at portably exposing efficient exchange of control between stack frames (please correct me if I am wrong here).
Since C is the implementation medium for many other languages, coroutines did not get as much visibility as they should have.
I can only speculate about why C left it out. Perhaps it was felt that it wouldn't port well outside the PDP family. The other reason could be that coroutines as an abstraction leave entirely open how they ought to be scheduled, and perhaps nobody wanted to flesh out a scheduler that would be portable everywhere. That is not a very satisfactory explanation, though, because C wasn't shy about leaving things undefined. I think the only way to get an answer to this question is to go to the source, which would be Ken Thompson or Brian Kernighan.
It may be because of implementation complexity? C was designed with the compiler writer in mind (which is why so many things are undefined: to make life easier for compiler writers), so if coroutines are hard to implement, that might be the explanation. Anyway, that's mostly pure speculation on my part.
...but that's the thing: on the PDP the crux of it was essentially one instruction. Maybe the worry was that porting this to other architectures of the time wasn't feasible or easy.
My guess is that, in light of the insanely limited memories of the machines they were working with, they had to leave out any feature that wasn't absolutely necessary. Maybe the benefits of coroutines weren't as obvious then, or maybe they didn't seem to mesh well with the rest of UNIX ("why not just fork")
Then, when UNIX and C exploded into far bigger things than they had probably ever dreamed of, it was too late. I think C would be a very different language if you had explicitly given Dennis Ritchie the task of writing the eternal lingua franca of programming with at least '80s-level computing resources at his disposal, but that's history for you.
I wrote a little preprocessor that translates a slightly augmented version of C (with Golang-like "go" and "select" commands) into the kind of code described in the article.
It's also possible to implement coroutines using setjmp [1]. IIRC, first you malloc() a new call stack, then populate a jmp_buf using setjmp, patch the buffer to refer to the new stack, and longjmp to your new coroutine. I implemented that about 20 years ago. There was some minor tweaking involved for different platforms, except for VAX/VMS; that one turned out to be a real pain.
What I learned from the whole experience is that coroutines are a lot less mysterious than they look, although I haven't had a chance to use them since.
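For the curious, here is a minimal sketch of the same idea on POSIX. Rather than patching the jmp_buf by hand (its layout varies per platform, and glibc mangles the saved pointers), it uses the sigaltstack trick from Engelschall's "Portable Multithreading" paper to get setjmp to capture a context on a fresh stack. All the names are mine, and longjmp'ing back into a signal handler that has returned is outside what the standard guarantees, so treat it as an illustration rather than production code:

    #define _XOPEN_SOURCE 700
    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>

    static jmp_buf main_ctx, coro_ctx;

    static void coro_body(void)
    {
        for (int i = 0; i < 3; i++) {
            printf("coroutine: step %d\n", i);
            if (setjmp(coro_ctx) == 0)   /* yield: save where to resume */
                longjmp(main_ctx, 1);
        }
        longjmp(main_ctx, 2);            /* report completion */
    }

    /* Runs on the alternate stack; capture a jmp_buf there and return,
     * so we are out of signal context before the coroutine ever runs. */
    static void trampoline(int sig)
    {
        (void)sig;
        if (setjmp(coro_ctx) == 0)
            return;
        coro_body();                     /* entered later via longjmp */
    }

    int main(void)
    {
        stack_t ss = { .ss_sp = malloc(SIGSTKSZ), .ss_size = SIGSTKSZ };
        struct sigaction sa = { .sa_flags = SA_ONSTACK };
        sa.sa_handler = trampoline;
        sigemptyset(&sa.sa_mask);
        sigaltstack(&ss, NULL);
        sigaction(SIGUSR1, &sa, NULL);
        raise(SIGUSR1);                  /* trampoline captures coro_ctx */

        for (;;) {
            int r = setjmp(main_ctx);
            if (r == 0)
                longjmp(coro_ctx, 1);    /* resume the coroutine */
            if (r == 2)
                break;                   /* coroutine finished */
            printf("main: got control back\n");
        }
        return 0;
    }

The "minor tweaking" lives in how each platform feels about longjmp'ing across stacks; this is essentially the same gambit GNU Pth used.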
I have to deal with uC++ for the term, and I'm not quite sure how to feel about it. On the one hand, it does make it a bit easier to write concurrent programs. On the other hand, I would have been more comfortable learning about concurrent systems with Boost or almost any other language.
The problem with coroutines in C and C++ is that you have to save the stack on a yield and restore it later. If only C or C++ added some stack-barrier primitives to the language, coroutines could (I suspect) be added more naturally.
This is not C/C++ specific. It's how procedure calls work on x86.
You don't need to save the stack if you have a stack per coroutine. Using madvise you can always let the kernel know about memory you no longer need, so even if you allocate a 1 MB stack, it doesn't mean you are using 1 MB of physical memory. If at some point you used 512 KB and then yielded at 100 KB of stack usage, you can let the OS free up the difference.
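A sketch of that, assuming Linux (where MADV_DONTNEED on an anonymous mapping frees the pages and makes them read back as zeros; other systems differ), with all names mine:

    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define STACK_SIZE (1u << 20)   /* reserve 1 MB of address space */

    /* Reserve a downward-growing coroutine stack. Pages are only backed
     * by physical memory once they are first touched. */
    void *stack_reserve(void)
    {
        void *p = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? NULL : p;
    }

    /* On yield: everything below the current stack pointer is dead, so
     * tell the kernel it may reclaim those pages. The mapping stays
     * valid; the pages simply come back zero-filled if touched again. */
    void stack_trim(void *base, uintptr_t sp)
    {
        uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
        uintptr_t lo = (uintptr_t)base;
        uintptr_t hi = sp & ~(page - 1);    /* round down to a page */
        if (hi > lo)
            madvise((void *)lo, hi - lo, MADV_DONTNEED);
    }

So the 512 KB high-water mark costs nothing once you trim back down to the 100 KB actually live at the yield point.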
But when the coroutine continues running again, you need to have stack space. You'd probably then have to insert special stack-switching directives into the new stack to make it work.
In any case, I think it is best to just precompute the stack size beforehand (assuming there is no recursion), and otherwise check for stack overflow in recursive calls (and reallocate space if necessary). Pre-allocating megabytes of stack for each running coroutine sounds like madness and a disaster waiting to happen (also, why should my stack overflow at 1M calls when I have 64 GB of main memory?)
Stack space is always precomputed and static. You cannot reallocate after it overflows.
On the other hand, the default stack size on Linux is 8 KB; 1 MB is excessive and way more than you need in a sane program that doesn't keep huge buffers on the stack.
The stack is not meant to use 64 GB of memory; it's meant to be small, for passing arguments and allocating small buffers. The rest of the 64 GB is for heap allocations.
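To make the overflow check mentioned above concrete: one common arrangement (a sketch, POSIX-specific, names mine) is to put an inaccessible guard page at the low end of each coroutine stack, so running off the end faults immediately instead of silently trampling adjacent memory:

    #include <sys/mman.h>
    #include <unistd.h>

    /* Allocate `usable` bytes of stack plus one PROT_NONE guard page at
     * the low end (stacks grow down). An overflow hits the guard page
     * and raises SIGSEGV instead of corrupting whatever sits below. */
    void *stack_with_guard(size_t usable)
    {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        char *p = mmap(NULL, usable + page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        mprotect(p, page, PROT_NONE);   /* the trip wire */
        return p + page;                /* usable region starts here */
    }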
The uses of goto I saw compensate for C's lack of nested breaks and continues. Reasonable people may differ on this topic, but I'm perfectly fine with it. The real sin is obfuscating the natural control flow of a program by introducing dummy variables like try_again and keep_running, or being forced to split things into multiple functions just so you can use a return in lieu of a nested break. (There are single-exit zealots for whom an early-out return like that would be just as grievous a sin as the goto!)
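For example (a sketch; names are mine):

    /* Find `target` in a rows-by-cols grid; return its flat index, or
     * -1 if absent. Without a labeled break, goto is the natural way
     * out of both loops; the alternative is a dummy keep_running flag
     * tested in every loop condition, which buries the control flow. */
    int find_in_grid(const int *grid, int rows, int cols, int target)
    {
        int i, j;
        for (i = 0; i < rows; i++)
            for (j = 0; j < cols; j++)
                if (grid[i * cols + j] == target)
                    goto found;
        return -1;
    found:
        return i * cols + j;
    }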
In any case, my point was to demonstrate by example that serious software can be written this way. The point wasn't to hold up uIP or Contiki or anything else as impeccable exemplars of coding style.
> P.S. uIP doesn't use the coroutine stuff at all, it seems.
The context functions (makecontext and friends) aren't portable (in particular they don't exist on Windows, and occasionally they exist but are buggy), and "write a pile of assembly" is even less portable.
Personally I think the correct conclusion to draw is "C is not a language which supports coroutines", and not to try to implement them at all.
I hear they are buggy on OS X too. It is not a very big pile of assembly, though; that is what e.g. Go does (gccgo uses and requires swapcontext).
EDIT: I agree, really; C is not very suitable, and that is the correct conclusion. But if you are, e.g., implementing a language that has them, assembly is the best solution.
The author is aware of that. In one of the references he notes that PuTTY uses this trick and calls it "the worst piece of C hackery ever seen in serious production code."
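For context, that trick boils down to resuming a switch via case labels generated from __LINE__, roughly as the article presents it:

    /* The "coroutine" is an ordinary function that records where it
     * left off and jumps straight back there on the next call. */
    #define crBegin     static int state = 0; switch (state) { case 0:
    #define crReturn(x) do { state = __LINE__; return x; case __LINE__:; } while (0)
    #define crFinish    }

    int ascending(void)
    {
        static int i;   /* locals that must survive a yield go static */
        crBegin;
        for (i = 0; i < 10; i++)
            crReturn(i);
        crFinish;
        return -1;
    }

The statics are the catch (one live instance per function, no reentrancy), which is why the article's fuller version moves the state into a context structure passed in by the caller.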
Note that the context functions were marked obsolescent in POSIX Issue 6 and removed entirely in Issue 7, because the makecontext function prototype itself cannot be expressed in standard C.
I've currently got a toy language on the go (learning LLVM) and decided to have generators/coroutines as a core part of the language.
What I realized is that a function's stack frame is little more than a struct living on the stack; once you see that, coroutines become somewhat more straightforward to encapsulate. There are some oddities (e.g. the callee's frame is hoisted into the caller's frame, or the frame sometimes needs to live on the heap).
I put some pseudo-c together as an end goal for the toy compiler [1]. I thought this was an interesting approach; comments are appreciated.
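To make the frame-as-struct idea concrete, here is roughly the shape of C a compiler might lower a small generator to. All names are illustrative (this is not the pseudo-c from [1]):

    #include <stdio.h>
    #include <stdlib.h>

    /* The generator's "stack frame", reified as a struct: its locals
     * plus a record of which yield point to resume after. Because it
     * is just data, it can live in the caller's frame or on the heap. */
    struct count_frame {
        int state;   /* 0 = not started, 1 = suspended inside the loop */
        int i;       /* the generator's local variable */
        int limit;
    };

    /* Each resume is an ordinary call that re-enters at the saved
     * state. Returns 1 and fills *out while values remain, 0 when done. */
    int count_resume(struct count_frame *f, int *out)
    {
        switch (f->state) {
        case 0:
            for (f->i = 0; f->i < f->limit; f->i++) {
                *out = f->i;
                f->state = 1;
                return 1;           /* yield f->i */
        case 1:;                    /* resume lands back in the loop */
            }
        }
        return 0;
    }

    int main(void)
    {
        struct count_frame *g = malloc(sizeof *g);  /* heap-hoisted frame */
        g->state = 0;
        g->limit = 3;
        for (int v; count_resume(g, &v); )
            printf("%d\n", v);      /* prints 0, 1, 2 */
        free(g);
        return 0;
    }

The free choice of where the frame lives (caller's frame versus heap) is exactly the oddity mentioned above.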