Some may find this directness unempathetic, but I consider it the opposite: the writing seems precisely aware of what the reader wants to know at any given time and what questions they are likely to have in mind.
There's nothing anywhere restricting green threads to a single OS thread. Most modern runtimes will automatically multiplex green threads onto as many OS threads as your computer can run.
1:N green threads (which Java had) aren't intended for parallelism and provide none. They provide concurrency only.
M:N green threads (e.g., Erlang processes) provide parallelism.
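Go's goroutines are a familiar M:N example: many green threads multiplexed onto however many OS threads `GOMAXPROCS` allows. A minimal sketch (the `runParallel` function is made up for illustration; on a multi-core machine the goroutines can genuinely run in parallel):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runParallel spawns n goroutines (green threads). The Go runtime's
// M:N scheduler may spread them across up to GOMAXPROCS OS threads,
// so with more than one core this is real parallelism, not just
// concurrency.
func runParallel(n int) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

func main() {
	fmt.Println("OS threads for Go code:", runtime.GOMAXPROCS(0))
	fmt.Println("goroutines completed:", runParallel(1000))
}
```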
I dug up this fossil the other day, trying to get a fix from Sun for this very issue:
Synchronous programming is typically much more natural. It's too bad no languages have opted to abstract this away. Of course I suppose coroutines are kind of that.
Erlang and other BEAM languages like Elixir and LFE kind of do. They use the actor model for concurrency, and message passing is asynchronous. However, most of the time a program sends a message and then immediately waits for a reply, which blocks the calling process until the reply arrives or a configurable timeout passes, making it effectively a synchronous call. This is fine because spawning new processes is very cheap: a web server, for example, idiomatically spawns a new process for every request, so blocking calls in one request don't interfere with other requests. The result is highly concurrent code that is mostly written synchronously, without the red-function/blue-function divide that most(?) async/await implementations have.
For example, you can send a message to a different process, even one running on a different machine, and wait for the reply right there on the next line. Waiting doesn't hold any OS thread.
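Since the thread drifts between Erlang and Go, here's a rough Go sketch of the same send-then-block-on-reply idiom (the `request`/`server`/`call` names are made up for illustration): the send is asynchronous, but the caller immediately blocks on a private reply channel, so it reads like a synchronous call while parking only a green thread, not an OS thread.

```go
package main

import "fmt"

// request carries a value plus a private reply channel -- roughly the
// Erlang pattern of including the sender's pid in the message.
type request struct {
	n     int
	reply chan int
}

// server is a green thread (goroutine) handling requests in a loop.
func server(in <-chan request) {
	for req := range in {
		req.reply <- req.n * 2 // send the reply back asynchronously
	}
}

// call sends a request and waits right there for the answer. While it
// waits, only this goroutine is parked; no OS thread is held.
func call(in chan<- request, n int) int {
	reply := make(chan int)
	in <- request{n, reply}
	return <-reply
}

func main() {
	in := make(chan request)
	go server(in)
	fmt.Println(call(in, 21)) // prints 42
}
```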
If a program needs the input before continuing, doesn't it have to wait, and therefore hold up the program flow and stop, even in Erlang?
Erlang applications are developed using message-passing actors, which are implemented as very lightweight processes/green threads.
So your process that sent a message can wait for the reply synchronously. It doesn't hold anything up in the overall application.
You can have hundreds of thousands of these processes running in a node.
Do check it out. It is very liberating. Why it is not vastly more popular is a mystery to me.
I'm not sure about the tradeoff. It seems to be equivalent in performance. It might require more memory, but if the competition is Java, Python, and Ruby, then they are easy to beat in terms of memory consumption. I'm not sure how it compares to Go.
I also have versions for 32-bit and 64-bit x86 (and they work under Linux and Mac OS X). The assembly code isn't much, but it took a while to work out just what was needed and no more.
Green threading is just the idea of doing the scheduling of threads in user-space. It can be preemptive or cooperative. Fibers and green threading aren't mutually exclusive, afaik.
All the implementations I've seen are cooperative.
(If you're wondering, this works to achieve soft-realtime guarantees because, in Erlang, as in Prolog, loops are implemented in terms of recursive tail-calls. So any O(N) function is guaranteed to hit a call or ret op after O(1) time.)
If you're writing an Erlang extension in C (a "NIF"), though, and your NIF code will be above O(1), then you have to ensure that you call into the runtime reduction-checker yourself to ensure nonblocking behavior. In that sense, Erlang is "cooperative under the covers"—you explicitly decide where to (offer to) yield. It's just that the Erlang HLL papers over this by having one of its most foundational primitives do such an explicit yield-check.
In practice, memory allocation happens a lot, so it's pretty close to preemptive.
IIRC, work is underway (may be finished) to add a scheduler yield check to loops as well, which would fix the pathological case in that blog entry.
If your code is doing a lot of channel sends and/or calling a lot of non-inlined functions, it may look like preemption, but it's still cooperative. If you're writing latency-sensitive code this distinction should be kept in mind.
Or whenever time.Sleep() is called, or somebody calls runtime.Gosched(). There might be other cases; I'd be happy to hear about them.
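A minimal sketch of those explicit scheduling points (the `yieldDemo` name is made up for illustration; note that newer Go releases have since added asynchronous preemption on top of the cooperative points discussed here):

```go
package main

import (
	"fmt"
	"runtime"
)

// yieldDemo shows two cooperative scheduling points: runtime.Gosched()
// offers to yield to another runnable goroutine, and a channel receive
// parks this goroutine until a value arrives.
func yieldDemo() string {
	done := make(chan string)
	go func() {
		done <- "other goroutine ran"
	}()
	runtime.Gosched() // explicit yield point
	return <-done     // channel receive is itself a scheduling point
}

func main() {
	fmt.Println(yieldDemo())
}
```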
Obviously if you block in the OS for IO, your interpreter won't have a chance to preempt you, which is one of several issues with M:N threading models.
From the "client" side, there are many that are preemptive in that user code does not explicitly yield (Ruby threads pre-1.9, Erlang processes, etc.)
The runtime implementation is, by definition, cooperative scheduling on top of one (1:N) or many (M:N) native threads.
Is it? It seems like one could theoretically implement thread switch on signal, and I'd probably still call that "green threads".
The advantage of being cooperative is that context switches are deterministic. Raw preemptive multi-threading, on the other hand, leaves open the possibility of hard-to-reproduce timing-related bugs.
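A rough Go sketch of the nondeterminism side of that point (the `interleave` name is made up for illustration): the order in which the two preemptively scheduled workers append is up to the scheduler and can differ from run to run, which is exactly what makes timing-related bugs hard to reproduce.

```go
package main

import (
	"fmt"
	"sync"
)

// interleave runs two workers that each append their name three times.
// The total count is deterministic, but the interleaving order is
// decided by the scheduler and may vary between runs.
func interleave() []string {
	var mu sync.Mutex
	var wg sync.WaitGroup
	var order []string
	for _, name := range []string{"A", "B"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			for i := 0; i < 3; i++ {
				mu.Lock()
				order = append(order, name)
				mu.Unlock()
			}
		}(name)
	}
	wg.Wait()
	return order
}

func main() {
	fmt.Println(interleave()) // order varies between runs
}
```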
CPython threads are honest-to-goodness OS threads; the GIL has nothing to do with green-ness (and the GIL can be released).
greenlets are green threads, but they're cooperative.
There's an abstraction mismatch between the terms, though. Green threading means having separate threads of control (separate program counters and contexts like stack and registers) running concurrently, with switching controlled in user space, whereas OS threads are switched in the kernel. It's a description of an architecture. You could implement green threads in a VM, where the program counter and context are switched by a VM loop, yet the VM interpreter itself might be single-threaded. Or you could have multiple VM interpreter loops implemented using fibers, and switch between them that way.
Fibers are a way of switching program counter and context explicitly in userland. Like I said, you could use them to implement coroutines, iterators, or other code that resembles continuation passing style (like async) instead (the context being switched includes the return address, which is the continuation - the RET instruction in imperative code is a jump to the continuation). Fibers let you hang on to the continuation and do something else in the meantime. But they're a low-level primitive, and confusing to work with directly unless you wrap them up in something else.
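Go doesn't expose fibers directly, but a goroutine plus a channel gives the same suspend-mid-function, resume-later shape as a `yield return` iterator; this sketch (the `countTo` name is made up for illustration) shows the generator's stack being parked between values:

```go
package main

import "fmt"

// countTo returns a channel that yields 1..n. The anonymous goroutine
// plays the role of the fiber: each send blocks ("yields") until the
// consumer reads, preserving the generator's stack and program counter
// in between -- the continuation the parent comment describes.
func countTo(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 1; i <= n; i++ {
			out <- i // "yield": suspends here until the next read
		}
	}()
	return out
}

func main() {
	for v := range countTo(3) {
		fmt.Println(v) // prints 1, 2, 3
	}
}
```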
Here's a chap who used Fibers to implement native C++ yield return (iterators) similar to C#: