So you allocate a coroutine stack by allocating a large array on the caller's stack, so that when you longjmp back to the setjmp'd context, you can safely overwrite that reserved space. F* me plenty.
Good stuff. One reason my inline asm serves my specific needs better is that my host OS has stack protection - a thread can only access its own stack; and I send coroutines around threads - yield in thread A, resume in thread B - and so I have to allocate my stacks outside the stacks of either A or B. I'll add a link to your implementation in my article, and perhaps Wikipedia should, too, if it doesn't already (it has a lengthy list of coroutine implementations in C...)
Didn't see your comment til now. So, here is the difference I see:
Full-blown coroutines give us a way to write fully asynchronous code without using callbacks. For example, I could define an ASYNC macro as follows:
#define ASYNC(blocking_function_call) \
    do { \
        auto io_pool = GetIOThreadPool(); \
        auto cpu_pool = GetCurrentThreadPool(); \
        \
        Detach_The_Thread_and_Enqueue_Coroutine_For_IO(io_pool); \
        \
        blocking_function_call; \
        \
        Detach_The_Thread_and_Enqueue_Coroutine_For_CPU(cpu_pool); \
    } while (0)
Now I can make any blocking operation asynchronous. For example, open() and readdir() do not have asynchronous interfaces, but I can make them asynchronous by simply wrapping the call in the ASYNC(...) macro.
It works as follows: when we need to do a blocking operation, we suspend the coroutine and enqueue it into a thread pool that is designated for blocking calls; the cpu thread is now free, so it can start executing other cpu-bound coroutines. When the io thread has finished the job, it enqueues the coroutine back into the cpu thread pool.
What we have now is a fully asynchronous system where cpu threads are kept busy with only cpu-bound work and io threads are kept busy with io-bound work. This gives us full control over the concurrency. We can be confident that cpu threads are always available for compute work.
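To make the pool-hopping flow concrete, here is a rough sketch in plain C++11. It is not the coroutine implementation discussed above: it swaps the suspended coroutine for an explicit continuation, which is exactly the callback a real coroutine would let you avoid. ThreadPool, async_hop, HopResult and run_demo are all made-up names for illustration, and the sleep stands in for a blocking call such as open() or readdir().

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: worker threads pull jobs from a shared queue.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void enqueue(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and queue drained
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can proceed
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Run `blocking` on the io pool, then hand its result to `resume` on the
// cpu pool. `resume` plays the role of "the rest of the coroutine".
template <typename Blocking, typename Resume>
void async_hop(ThreadPool& io, ThreadPool& cpu, Blocking blocking, Resume resume) {
    io.enqueue([&cpu, blocking, resume]() mutable {
        auto result = blocking();  // blocks an io thread only
        cpu.enqueue([resume, result]() mutable { resume(result); });
    });
}

struct HopResult {
    int value;
    std::thread::id io_thread;   // thread that ran the blocking call
    std::thread::id cpu_thread;  // thread that ran the continuation
};

HopResult run_demo() {
    // Declared before the pools so it outlives the worker threads.
    std::promise<HopResult> done;
    std::future<HopResult> fut = done.get_future();
    ThreadPool cpu(1), io(1);

    cpu.enqueue([&] {
        // cpu-bound work would go here; now hop to the io pool.
        async_hop(io, cpu,
            [] {
                // Stand-in for a blocking call.
                std::this_thread::sleep_for(std::chrono::milliseconds(10));
                return std::make_pair(42, std::this_thread::get_id());
            },
            [&done](std::pair<int, std::thread::id> r) {
                HopResult out{r.first, r.second, std::this_thread::get_id()};
                done.set_value(out);
            });
    });
    return fut.get();  // wait until the continuation ran on the cpu pool
}
```

Calling run_demo() should hand back the value produced on the io thread together with two different thread ids, one per pool. With real coroutines the two lambdas collapse into straight-line code, which is the whole appeal of the ASYNC macro.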
You don't need asm to manipulate the stack and frame pointers in standard C - see http://fanf.livejournal.com/105413.html or for some slightly more fleshed-out variants that do not require C99 see http://dotat.at/cgi/git/picoro.git