It worked really well --- on my machine.
Unfortunately on other machines there were weird bugs and instability, and really-hard-to-diagnose crashes; because it all worked fine on my machine, debugging was painful. Eventually I figured out that pthreads, which was being linked in by a library I depended on, when combined with a particular glibc and a particular Linux kernel, would store the TLS pointer at the top of the C stack --- it used alignment tricks to be able to figure out where the TLS pointer was from the current stack frame. (I assume this was to work around a kernel with no native TLS support.) Of course, my coroutine implementation was allocating its own stacks with mmap(), and those stacks had no TLS pointer at the top. This was causing pthreads to pick up either a garbage TLS pointer or, even worse, the wrong TLS pointer.
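For anyone wondering what such an alignment trick looks like, here is a minimal sketch. It is not the actual pthreads code: STACK_SIZE, struct thread_desc and both function names are made up for illustration, and it assumes every library-created stack is a power-of-two size and aligned to that size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical: every thread stack is STACK_SIZE bytes and STACK_SIZE-aligned. */
#define STACK_SIZE ((uintptr_t)1 << 16)

/* Hypothetical per-thread record stored at the very top of each stack. */
struct thread_desc {
    int id;
};

/* Recover the descriptor from any address inside a stack: round the
 * address up to the next STACK_SIZE boundary, then step back one record. */
static struct thread_desc *desc_from_sp(const void *sp)
{
    uintptr_t top = ((uintptr_t)sp | (STACK_SIZE - 1)) + 1;
    return (struct thread_desc *)top - 1;
}

/* Allocate a stack the way the threading library in this sketch would:
 * aligned to its own size, with the descriptor written at the top. */
static void *alloc_thread_stack(int id)
{
    void *base = aligned_alloc(STACK_SIZE, STACK_SIZE);
    if (base == NULL)
        return NULL;
    assert(((uintptr_t)base & (STACK_SIZE - 1)) == 0);

    struct thread_desc *d = (struct thread_desc *)((char *)base + STACK_SIZE) - 1;
    d->id = id;
    return base;
}
```

With a stack from alloc_thread_stack(), desc_from_sp() finds the right record from any address in that stack. Hand it an address inside a stack that came from a plain mmap() and the same arithmetic still produces a pointer, but nothing guarantees a descriptor lives there: it may point at unrelated memory, or at a record belonging to some other thread.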
That was when I gave up on manual coroutines in C. Lovely idea, works really well on paper, so much simpler than threading (if you can live without multicore support), but it doesn't actually work in practice.
They're still worth checking out in languages like Lua, though. I'm still bitter that ES6 doesn't have proper coroutines, opting for the much less useful generator concept instead. Apparently they were too complicated to implement...