> If I understand correctly, this is a framework for non-preemptive cooperative multitasking whereby provided API functions serve as yield points. Pretty much what Windows API was before NT/2000. Not exactly a co-routine library, at least not in a conventional definition of co-routines.
> Yes, that's exactly what it is. But it also matches Wikipedia's definition of coroutine: "Coroutines are computer program components that generalize subroutines for nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations."
All coroutine libraries are non-preemptive and cooperative. E.g. in Haskell, threads will switch only on GC allocations. In Go, they only switch on I/O (and maybe function returns). In theory, the compiler could automatically add `yield` statements into non-allocating, non-I/O loops. But nevertheless, all user-space threading is cooperative.
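To make "switch only at yield points" concrete, here is a minimal sketch using Python generators as stand-ins for lightweight threads. The names `worker` and `round_robin` are made up for illustration; the only place a task can lose the CPU is the explicit `yield`.

```python
from collections import deque


def worker(name, steps, log):
    """A hypothetical task: does one unit of work, then yields."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # the only point where the scheduler can switch away


def round_robin(tasks):
    """Run generator-based tasks in turn until all have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)  # resume the task until its next yield
        except StopIteration:
            continue  # task finished, drop it
        queue.append(task)  # still alive, put it at the back


log = []
round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)  # ['a0', 'b0', 'a1', 'b1', 'b2']
```

A task that loops without ever reaching `yield` would starve the others, which is the case the "compiler inserts yields" idea is meant to cover.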
> But nevertheless, all user-space threading is cooperative.
Not necessarily. You can do user-space pre-emptive threading on any system that supports pre-emptive signal handlers and that lets you mess around with the stack, particularly if you can generate signals from timers and/or I/O operations (both of which you can do with POSIX timers and signals).
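A rough, Unix-only sketch of the timer half of that idea: an interval timer delivers `SIGALRM` and the handler interrupts a loop that contains no yield point at all. This is not a full pre-emptive scheduler. Python runs signal handlers in the main thread between bytecode instructions and cannot swap stacks from a handler, so the "switch" here is just an exception unwinding the loop. `Preempted` and `run_with_time_slice` are hypothetical names.

```python
import signal


class Preempted(Exception):
    """Raised from the signal handler to interrupt the running code."""


def on_tick(signum, frame):
    raise Preempted


def run_with_time_slice(seconds):
    """Spin in a non-yielding loop until the timer signal interrupts it."""
    previous = signal.signal(signal.SIGALRM, on_tick)
    spins = 0
    try:
        # One-shot real-time timer; SIGALRM arrives after `seconds`.
        signal.setitimer(signal.ITIMER_REAL, seconds)
        while True:
            spins += 1  # no yield, no I/O, no cooperation
    except Preempted:
        return spins
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


print(run_with_time_slice(0.05) > 0)
```

In C the handler could save the interrupted context and jump to another task's stack instead of raising an exception, which is the part that makes it real threading.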
Technically it is not the kernel, it is the hardware clock.
At some point any user space scheduler will switch to the kernel in an event demultiplexing syscall and potentially block there if no user space thread is runnable. By your definition any threading system which doesn't handle all IO purely in user-space can't claim to be user-space threads.
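A small sketch of that shape, assuming generator-based tasks that yield either `None` (plain yield) or a socket they want to read from. While something is runnable everything stays in user space; only when the run queue is empty does the loop block in the kernel, here through the standard `selectors` module. The `run`, `reader`, `writer`, and `demo` names are invented for the example.

```python
import selectors
import socket
from collections import deque


def run(tasks):
    """Tasks yield None to reschedule, or a socket to wait until readable."""
    sel = selectors.DefaultSelector()
    ready = deque(tasks)
    waiting = 0
    while ready or waiting:
        if not ready:
            # Nothing runnable: block in the demultiplexing syscall.
            for key, _ in sel.select():
                sel.unregister(key.fileobj)
                waiting -= 1
                ready.append(key.data)
            continue
        task = ready.popleft()
        try:
            wanted = next(task)
        except StopIteration:
            continue
        if wanted is None:
            ready.append(task)
        else:
            sel.register(wanted, selectors.EVENT_READ, task)
            waiting += 1
    sel.close()


def reader(sock, log):
    yield sock  # park until the kernel reports the socket readable
    log.append(sock.recv(16))


def writer(sock, log):
    log.append(b"sending")
    sock.send(b"ping")
    yield None


def demo():
    a, b = socket.socketpair()
    log = []
    try:
        run([reader(a, log), writer(b, log)])
    finally:
        a.close()
        b.close()
    return log


print(demo())  # [b'sending', b'ping']
```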
But that is when userspace must (and wants to) communicate with the kernel. Ultimately, that's the programmer's choice. You could easily write a program in Haskell or Go with two lightweight threads that would keep sending ping/pong messages to each other, and all the waiting (on locks, channels, etc.) and switching would be implemented in userspace.
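The ping/pong case can be sketched without Haskell or Go. Below, two generator-based tasks exchange messages through plain deques standing in for channels, and the waiting is a user-space check followed by a yield. Nothing in the loop asks the kernel to block. `player` and `play` are hypothetical names, and a real runtime would park a waiting task rather than poll it.

```python
from collections import deque


def player(name, inbox, outbox, rounds, log):
    """Receive a message, record it, and send our name back."""
    for _ in range(rounds):
        while not inbox:
            yield  # "block" on the channel purely in user space
        log.append((name, inbox.popleft()))
        outbox.append(name)


def play(rounds):
    to_ping, to_pong = deque(["serve"]), deque()
    log = []
    tasks = deque([
        player("ping", to_ping, to_pong, rounds, log),
        player("pong", to_pong, to_ping, rounds, log),
    ])
    while tasks:
        task = tasks.popleft()
        try:
            next(task)
        except StopIteration:
            continue
        tasks.append(task)
    return log


print(play(2))
# [('ping', 'serve'), ('pong', 'ping'), ('ping', 'pong'), ('pong', 'ping')]
```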
I don't know enough about userspace/kernelspace communication. Maybe I/O (refilling buffers or calling select) and timers are much more lightweight than full thread context switches, but I'm guessing it's still much slower than pure userspace scheduling.
How is a clock signal different? It is just another external event userspace wants to be notified about.
Note that on Unix you can use signals to get I/O readiness notifications, and in fact for some time on Linux (before epoll) realtime signals were the preferred way to get notifications for a large number of fds.
Anyway, as a general guideline, user/kernel transition is cheaper than a (kernel) thread/thread transition which is in turn cheaper than a process/process transition.
I had the exact opposite experience with Intel's "steaming pile of crap", as you put it. We had a computationally intensive app that was sped up by 25% compared to the version produced by the MS compiler. This was a few years ago, though.
Yep, I could totally see it speeding up computationally intensive code, but try downloading it and using it to compile entire code bases with a medium to high level of complexity, like ACE/TAO or Qt or any other OSS project. It might compile at low optimization settings, but then it's not faster, and the moment you try to use optimizations it makes your life hell.