It is a little unfortunate that one of the few underlying abstractions of the machine that C chose to leave out (perhaps the only one) was coroutines. I find this rather ironic, because I have been told that the PDP, the machine model that C is based on, had instruction-level support for exchanging control very efficiently. I have also heard that VMs like LLVM aren't that good at portably exposing efficient exchange of control between stack frames (please correct me if I am wrong here).
Since C is the implementation medium for many other languages, coroutines did not get as much visibility as they should have.
I can only speculate about why C left it out. Perhaps it was felt that it wouldn't port well outside the PDP family. The other reason could be that coroutines as an abstraction leave entirely open how they ought to be scheduled, and perhaps nobody wanted to flesh out a scheduler that would be portable everywhere. That is not a very satisfactory explanation, though, because C wasn't shy about leaving things undefined. I think the only way to get an answer to this question is to go to the source, which would be Ken Thompson or Brian Kernighan.
It may be because of implementation complexity? C was designed with the compiler writer in mind (which is why so many things are undefined: to make life easier for compiler writers), so if coroutines are hard to implement, that might be the explanation. Anyway, that's mostly pure speculation on my part.
...but that's the thing: on the PDP the crux of it was essentially one instruction. Maybe the worry was that porting this to other architectures of the time wasn't feasible or easy.
My guess is that, in light of the insanely limited memories of the machines they were working with, they had to leave out any feature that wasn't absolutely necessary. Maybe the benefits of coroutines weren't as obvious then, or maybe they didn't seem to mesh well with the rest of UNIX ("why not just fork")
Then, when UNIX and C exploded into far bigger things than they had probably ever dreamed of, it was too late. I think C would be a very different language if you had explicitly given Dennis Ritchie the task of writing the eternal lingua franca of programming with at least '80s-level computing resources at his disposal, but that's history for you.
I wrote a little preprocessor that translates a slightly augmented version of C (with Golang-like "go" and "select" commands) into the kind of code described in the article.
It's also possible to implement coroutines using setjmp [1]. IIRC, first you malloc() a new call stack, then populate a jmp_buf using setjmp, patch the buffer to refer to the new stack, and longjmp to your new coroutine. I implemented that about 20 years ago. There was some minor tweaking involved for different platforms, except for VAX/VMS; that one turned out to be a real pain.
What I learned from the whole experience is that coroutines are a lot less mysterious than they look, although I haven't had a chance to use them since.
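For the curious, here is a minimal sketch of the same idea on POSIX. Rather than patching the jmp_buf by hand (its layout varies per platform, and glibc mangles the saved pointers), it uses the sigaltstack trick from Engelschall's "Portable Multithreading" paper to get setjmp to capture a context on a fresh stack. All the names are mine, and longjmp'ing back into a signal handler that has returned is outside what the standard guarantees, so treat it as an illustration rather than production code:

    #define _XOPEN_SOURCE 700
    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>

    static jmp_buf main_ctx, coro_ctx;

    static void coro_body(void)
    {
        for (int i = 0; i < 3; i++) {
            printf("coroutine: step %d\n", i);
            if (setjmp(coro_ctx) == 0)   /* yield: save where to resume */
                longjmp(main_ctx, 1);
        }
        longjmp(main_ctx, 2);            /* report completion */
    }

    /* Runs on the alternate stack; capture a jmp_buf there and return,
     * so we are out of signal context before the coroutine ever runs. */
    static void trampoline(int sig)
    {
        (void)sig;
        if (setjmp(coro_ctx) == 0)
            return;
        coro_body();                     /* entered later via longjmp */
    }

    int main(void)
    {
        stack_t ss = { .ss_sp = malloc(SIGSTKSZ), .ss_size = SIGSTKSZ };
        struct sigaction sa = { .sa_flags = SA_ONSTACK };
        sa.sa_handler = trampoline;
        sigemptyset(&sa.sa_mask);
        sigaltstack(&ss, NULL);
        sigaction(SIGUSR1, &sa, NULL);
        raise(SIGUSR1);                  /* trampoline captures coro_ctx */

        for (;;) {
            int r = setjmp(main_ctx);
            if (r == 0)
                longjmp(coro_ctx, 1);    /* resume the coroutine */
            if (r == 2)
                break;                   /* coroutine finished */
            printf("main: got control back\n");
        }
        return 0;
    }

The "minor tweaking" lives in how each platform feels about longjmp'ing across stacks; this is essentially the same gambit GNU Pth used.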
I have to deal with uC++ for the term, and I'm not quite sure how to feel about it. On the one hand, it does make it a bit easier to write concurrent programs. On the other hand, I would have been more comfortable learning about concurrent systems with Boost or almost any other language.
The problem with coroutines in C and C++ is that you have to save the stack on a yield and restore it later. If only C or C++ added some stack-barrier primitives to the language, coroutines could (I suspect) be added more naturally.
This is not C/C++ specific. It's how procedure calls work on x86.
You don't need to save the stack if you have a stack per coroutine. Using madvise you can always let the kernel know about memory you no longer need, so even if you allocate a 1 MB stack, it doesn't mean you are using 1 MB of physical memory. If at some point you used 512 KB and then yielded at 100 KB of stack usage, you can let the OS free up the difference.
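A sketch of that, assuming Linux (where MADV_DONTNEED on an anonymous mapping frees the pages and makes them read back as zeros; other systems differ), with all names mine:

    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define STACK_SIZE (1u << 20)   /* reserve 1 MB of address space */

    /* Reserve a downward-growing coroutine stack. Pages are only backed
     * by physical memory once they are first touched. */
    void *stack_reserve(void)
    {
        void *p = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? NULL : p;
    }

    /* On yield: everything below the current stack pointer is dead, so
     * tell the kernel it may reclaim those pages. The mapping stays
     * valid; the pages simply come back zero-filled if touched again. */
    void stack_trim(void *base, uintptr_t sp)
    {
        uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
        uintptr_t lo = (uintptr_t)base;
        uintptr_t hi = sp & ~(page - 1);    /* round down to a page */
        if (hi > lo)
            madvise((void *)lo, hi - lo, MADV_DONTNEED);
    }

So the 512 KB high-water mark costs nothing once you trim back down to the 100 KB actually live at the yield point.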
But when the coroutine continues running again, you need to have stack space. You'd probably then have to insert special stack-switching directives into the new stack to make it work.
In any case, I think it is best to just precompute the stack size beforehand (assuming there is no recursion), and otherwise check for stack overflow in recursive calls (and reallocate space if necessary). Pre-allocating megabytes of stack for each running coroutine sounds like madness and a disaster waiting to happen (also, why should my stack overflow at 1M calls when I have 64 GB of main memory?)
Stack space is always precomputed and static. You cannot reallocate after it overflows.
On the other hand, the default stack size on Linux is 8 KB; 1 MB is excessive and way more than you need in a sane program that doesn't keep huge buffers on the stack.
The stack is not meant to use 64 GB of memory; it's meant to be small, for passing arguments and allocating small buffers. The rest of the 64 GB is for heap allocations.
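To make the overflow check mentioned above concrete: one common arrangement (a sketch, POSIX-specific, names mine) is to put an inaccessible guard page at the low end of each coroutine stack, so running off the end faults immediately instead of silently trampling adjacent memory:

    #include <sys/mman.h>
    #include <unistd.h>

    /* Allocate `usable` bytes of stack plus one PROT_NONE guard page at
     * the low end (stacks grow down). An overflow hits the guard page
     * and raises SIGSEGV instead of corrupting whatever sits below. */
    void *stack_with_guard(size_t usable)
    {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        char *p = mmap(NULL, usable + page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        mprotect(p, page, PROT_NONE);   /* the trip wire */
        return p + page;                /* usable region starts here */
    }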
The uses of goto I saw compensate for C's lack of nested breaks and continues. Reasonable people may differ on this topic, but I'm perfectly fine with it. The real sin is obfuscating the natural control flow of a program by introducing dummy variables like try_again and keep_running, or being forced to split things into multiple functions just so you can use a return in lieu of a nested break. (There are single-exit zealots for whom an early-out return like that would be just as grievous a sin as the goto!)
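For example (a sketch; names are mine):

    /* Find `target` in a rows-by-cols grid; return its flat index, or
     * -1 if absent. Without a labeled break, goto is the natural way
     * out of both loops; the alternative is a dummy keep_running flag
     * tested in every loop condition, which buries the control flow. */
    int find_in_grid(const int *grid, int rows, int cols, int target)
    {
        int i, j;
        for (i = 0; i < rows; i++)
            for (j = 0; j < cols; j++)
                if (grid[i * cols + j] == target)
                    goto found;
        return -1;
    found:
        return i * cols + j;
    }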
In any case, my point was to demonstrate by example that serious software can be written this way. The point wasn't to hold up uIP or Contiki or anything else as impeccable exemplars of coding style.
> P.S. uIP doesn't use the coroutine stuff at all, it seems.
The context functions (makecontext and friends) aren't portable (in particular they don't exist on Windows, and occasionally they exist but are buggy), and "write a pile of assembly" is even less portable.
Personally I think the correct conclusion to draw is "C is not a language which supports coroutines", and not to try to implement them at all.
I hear they are buggy on OS X too. It is not a very big pile of assembly, though; that is what e.g. Go does (gccgo uses and requires swapcontext).
EDIT: I agree, really; C is not very suitable, and that is the correct conclusion. But if you are, e.g., implementing a language that has them, assembly is the best solution.
The author is aware of that. In one of the references he notes that PuTTY uses this trick and calls it "the worst piece of C hackery ever seen in serious production code."
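For context, that trick boils down to resuming a switch via case labels generated from __LINE__, roughly as the article presents it:

    /* The "coroutine" is an ordinary function that records where it
     * left off and jumps straight back there on the next call. */
    #define crBegin     static int state = 0; switch (state) { case 0:
    #define crReturn(x) do { state = __LINE__; return x; case __LINE__:; } while (0)
    #define crFinish    }

    int ascending(void)
    {
        static int i;   /* locals that must survive a yield go static */
        crBegin;
        for (i = 0; i < 10; i++)
            crReturn(i);
        crFinish;
        return -1;
    }

The statics are the catch (one live instance per function, no reentrancy), which is why the article's fuller version moves the state into a context structure passed in by the caller.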
Note that the context functions were marked obsolescent in POSIX Issue 6 and removed entirely in Issue 7, because the makecontext function prototype itself cannot be expressed in standard C.
I've currently got a toy language on the go (learning LLVM) and decided to have generators/coroutines as a core part of the language.
What I realized is that a function's stack frame is little more than a struct living on the stack; once you see that, coroutines become somewhat more straightforward to encapsulate. There are some oddities (e.g. the callee's frame is hoisted into the caller's frame, or the frame sometimes needs to live on the heap).
I put some pseudo-c together as an end goal for the toy compiler [1]. I thought this was an interesting approach; comments are appreciated.
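To make the frame-as-struct idea concrete, here is roughly the shape of C a compiler might lower a small generator to. All names are illustrative (this is not the pseudo-c from [1]):

    #include <stdio.h>
    #include <stdlib.h>

    /* The generator's "stack frame", reified as a struct: its locals
     * plus a record of which yield point to resume after. Because it
     * is just data, it can live in the caller's frame or on the heap. */
    struct count_frame {
        int state;   /* 0 = not started, 1 = suspended inside the loop */
        int i;       /* the generator's local variable */
        int limit;
    };

    /* Each resume is an ordinary call that re-enters at the saved
     * state. Returns 1 and fills *out while values remain, 0 when done. */
    int count_resume(struct count_frame *f, int *out)
    {
        switch (f->state) {
        case 0:
            for (f->i = 0; f->i < f->limit; f->i++) {
                *out = f->i;
                f->state = 1;
                return 1;           /* yield f->i */
        case 1:;                    /* resume lands back in the loop */
            }
        }
        return 0;
    }

    int main(void)
    {
        struct count_frame *g = malloc(sizeof *g);  /* heap-hoisted frame */
        g->state = 0;
        g->limit = 3;
        for (int v; count_resume(g, &v); )
            printf("%d\n", v);      /* prints 0, 1, 2 */
        free(g);
        return 0;
    }

The free choice of where the frame lives (caller's frame versus heap) is exactly the oddity mentioned above.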