Hacker News new | comments | show | ask | jobs | submit login
Green Threads Explained (c9x.me)
195 points by terminalcommand on May 29, 2017 | hide | past | web | favorite | 47 comments



I enjoyed the expedient writing style. The author did a great job of expressing the essential knowledge without too much verbosity. It's very clear what the skeleton of the project consists of, and the known unknowns are labeled (i.e. "this is how this works, you can read more later [here]").

Some may find this directness unempathetic, but I consider it as the opposite: the writing seems precisely aware of exactly what the reader wants to know at any given time and what questions he or she would be likely to have on their mind.


Curious about which bit someone might find unempathetic? It all seems pretty straightforward to me.


I just meant it subjectively. I assume some might not like this expedient style as a matter of taste, because most publications/articles/tutorials are not written so concisely.


Related: Early (almost prehistoric by software timeline standards) versions of Java used a green thread implementation, but this was dropped in Java 1.1 in favour of the current native OS thread model. Some background:

https://softwareengineering.stackexchange.com/questions/1203...


Java had a quite lousy implementation of green threads, that hurt performance more than enabled parallelism. (What isn't explainable by it being early, since there were earlier green threads implementation that didn't suck. Looks like it just wasn't a priority for Java developers.)

There's nothing anywhere restricting green threads to a single OS thread. Most modern runtimes will automatically multiplex the green threads into as many OS threads as your computer can run.


> Java had a quite lousy implementation of green threads, that hurt performance more than enabled parallelism

1:N green threads (which Java had) aren't intended for parallelism and provide none. They provide concurrency only.

M:N green threads (e.g., Erlang processes) provide parallelism.


Java had a mixed model where it could use LWPs or kernel threads, and switched over around Solaris 8 because things like stat() were still blocking and had no async counterpart, so were difficult to deal with.

I dug up this fossil the other day, trying to get a fix from Sun for this very issue:

http://d.hd.org/JavaThreadsCommentToSun19980820.html


Is this solved in Go? A while back Go required programmers to use rate limiters, otherwise it would spawn unlimited kernel threads for blocking syscalls.


It's capped at 10000 threads (configurable). Note that Go uses asynchronous I/O where possible.


Rust originally supported only M:N green threads. Native thread support was added, became the default, and green thread support was eventually moved out of Rust core to a library. Here is the proposal and rationale:

https://github.com/rust-lang/rfcs/blob/0806be4f282144cfcd55b...


I think Java only started shipping native Linux thread support somewhat later, maybe in JDK 1.3.


Linux itself didn't have (working) threads until around 2003.


Linux got threads with the clone() system call in 1999. They worked but it was not a cross platform API. Though in that time most operating systems had proprietary thread APIs. Standards compliant POSIX threads came later to Linux than some other Unix systems, that much is true.


It's occurred to me that asynchronous programming is essentially just very lightweight green threads with the added benefit of scheduling tied directly to I/O.

Synchronous programming is typically much more natural. It's too bad no languages have opted to abstract this away. Of course I suppose coroutines are kind of that.


>It's too bad no languages have opted to abstract this away.

Erlang and other BEAM languages like Elixir and LFE kind of do. They use the actor model for concurrency and message passing is asynchronous. However, most of the time directly after sending a message, a program waits for a reply message, which blocks the calling process until the reply arrives, or a configurable timeout passes, making it a synchronous call. This is ok since spawning new processes is very cheap, so a web server for example idiomaticly spawns a new process for every request, making blocking calls in one request not interfere with other requests. The result is highly concurrent code that is mostly written synchronously without creating red function/ blue function kind of divisions most(?) async/await implementations have.


Erlang does that. The logic inside erlang processes look synchronous but outwardly its all async.

For example you can send a message to a different process, even running on a different machine and wait for the reply right there in the next line. It doesn't hold any OS threads.


Intriguing thought, async to what?

If a program needs the input before continuing doesn't it need to wait and therefore hold the program flow and therefore stop, even in erlang?


>If a program needs the input before continuing doesn't it need to wait

Erlang applications are developed using message passing Actors which are implemented as very light weight processes/green-threads.

So your process that sent a message can wait for the reply synchronously. It doesn't hold anything up in the overall application.

You can have hundred of thousands of these processes running in a node.

Do check it out. It is very liberating. Why it is not vastly more popular is a mystery to me.


While it's possible to have thousands of these processes running, it's completely up to the problem domain for it to be useful. If it wouldn't be bound by the problem domain you wouldn't need to write actors either.


Vibe.d [0] the one and only D web framework uses fibers, because of this reason. The tagline is "Asynchronous I/O that doesn’t get in your way"

I'm not sure about the tradeoff. It seems to be equivalent with performance. It might require more memory, but if the competition is Java, Python, and Ruby then they are easy to beat in terms of memory consumption. I'm not sure how it compares to Go.

[0] http://vibed.org/


Python has gevent and eventlet which achieve this pattern.


I recently did a similar thing [1]. The major difference I did was to store the registers on the stack instead of a separate structure. The stack is the thread structure.

I also have a version for x86-32 bit and x86-64 bit (and they work under Linux and Mac OS-X). The assembly code isn't much, but it took a while to work out just what was needed and no more.

[1] http://boston.conman.org/2017/02/27.1


In the code listing on the 4th page [0], why is MaxGThreads and StackSize wrapped in an enum? I would use an enum when I might want some values to be mutually exclusive. Seems like these values aren't even used in the same context? I would use static const ints.

[0] https://c9x.me/articles/gthreads/code0.html


Because in C (as opposed to C++), a static const int is not a compile-time constant (so you can't use it in places like switch case labels).


Interesting. I hadn't heard the "green thread" term before. Does anyone happen to know how this compares to Windows' "fiber" mechanism? I can't recall whether fibers actually run in user land.


A Fiber is a cooperative multitasking strategy. As such, a fiber must yield at some point to allow other fibers to do their work.

Green threading is just the idea of doing the scheduling of threads in user-space. It can be preemptive or cooperative. Fibers and green threading aren't mutually exclusive, afaik.


Are there actual examples of green threads that are not cooperative?

All the implementations I've seen are cooperative.


Haskell has green (N to M) threads that are pre-emptive. As does Go (but it calls them coroutines). Erlang (but it calls them processes) I'm not sure about. Maybe it only re-schedules on message sends and receives?


Erlang does a different thing called "reduction-counting": the active process in a scheduler-thread gets a budget of virtual CPU cycles (reductions/"reds"), tracked in a VM register, and part of the implementation of each op in the VM's ISA is to reduce that reduction-count by an amount corresponding to the estimated time-cost of the op. Then, the implementation of the call and ret ops in the ISA both check the reduction-counter, and sleep the process (scheduling it to resume at the new call-site) if it has expended all its reds.

(If you're wondering, this works to achieve soft-realtime guarantees because, in Erlang, as in Prolog, loops are implemented in terms of recursive tail-calls. So any O(N) function is guaranteed to hit a call or ret op after O(1) time.)

If you're writing an Erlang extension in C (a "NIF"), though, and your NIF code will be above O(1), then you have to ensure that you call into the runtime reduction-checker yourself to ensure nonblocking behavior. In that sense, Erlang is "cooperative under the covers"—you explicitly decide where to (offer to) yield. It's just that the Erlang HLL papers over this by having one of its most foundational primitives do such an explicit yield-check.


If loops are implemented as tail calls, doesn't the call opcode get optimized away thus preventing the reduction checker from being run?


It marks the point just before the call as a yield. That yield remains after optimization.


Haskell's green threads are cooperative on memory allocation. If no memory is allocated, then a green thread will not be preempted.

In practice, memory allocation happens a lot, so it's pretty close to preemptive.


Goroutines are not preemptive. They yield at well-defined points.


I remember hearing that the scheduler will indeed pre-empt occasionally if none of the active goroutines are attempting to perform blocking calls.


No, the scheduler cannot preempt. The scheduler is called by the goroutine and given the opportunity to pause it whenever the goroutine does a channel send/receive or calls a non-inlined function or, of course, makes a blocking I/O call[1]. But the scheduler can't do anything if nobody calls it. To wit:

http://www.sarathlakshman.com/2016/06/15/pitfall-of-golang-s...

IIRC, work is underway (may be finished) to add a scheduler yield check to loops as well, which would fix the pathological case in that blog entry.

If your code is doing a lot of channel sends and/or calling a lot of non-inlined functions, it may look like preemption, but it's still cooperative. If you're writing latency-sensitive code this distinction should be kept in mind.

[1] Or whenever time.Sleep() is called, or somebody calls runtime.Gosched(). There might be other cases, would be happy to hear about it.


If your implementation is interpreted, or at least can act indistinguishably from an interpreter, it's possible to interrupt a green thread every N operations.

Obviously if you block in the OS for IO, your interpreter won't have a chance to preempt you, which is one of several issues with M:N threading models.


> Are there actual examples of green threads that are not cooperative?

From the "client" side, there are many that are preemptive in that user code does not explicitly yield (Ruby threads pre-1.9, Erlang processes, etc.)

The runtime implementation is, by definition, cooperative scheduling on top of one (1:N) or many (M:N) native threads.


> by definition

Is it? It seems like one could theoretically implement thread switch on signal, and I'd probably still call that "green threads".


IIRC the Squeak Smalltalk VM has "green threads" (in the sense that they're threads implemented by the VM scheduler) which are pre-emptive. On architectures the VM runs on that have with OS threading, the VM just translates the requests to create green threads into requests to create native threads, and lets the OS manage them. On architectures that don't, it "brings its own" pre-emptive scheduling logic.


Erlang/Elixir's are preemptive.


Smalltalk-80 had green threads that have multiple priorities. Scheduling within each priority was round-robin, and timeouts (instances of the Delay class) were handled by a single high-priority thread that woke up threads whose deadline has passed. As a result, you could be preempted at any time not just by a higher-priority that, but also by another thread waiting on a Delay.


OCaml uses green threads where you can cooperate (yield()) or be preempted every 50ms.


That might be a feature, actually.

The advantage of being cooperative is that context switches are deterministic. Raw preemptive multi-threading, on the other hand, leaves open the possibility of hard-to-reproduce timing-related bugs.


Ruby, Python are the prime examples of this. They use the multiple OS threads to run them, but actually use locks to force them to be 'green'.


That's not really accurate.

CPython threads are honest to goodness OS threads, the GIL has nothing to do with green-ness (and the GIL can be released).

greenlets are green threads, but they're cooperative.


That's not green threading, it's just 1:1 native threading with a big lock, which has very different behavior than green threading (either 1:N or M:N).


You could use fibers to implement green threads; you could also use fibers to implement coroutines.

There's an abstraction mismatch between the terms, though. Green threads is having separate threads of control (meaning separate program counters and contexts like stack and registers) running concurrently, with switching controlled in user space, whereas OS threads switch in the kernel. It's a description of an architecture. You could implement green threads in a VM, where the program counter and context is switched by a VM loop, yet the VM interpreter itself might be single-threaded. Or you could have multiple VM interpreter loops implemented using fibers, and switch between them that way.

Fibers are a way of switching program counter and context explicitly in userland. Like I said, you could use them to implement coroutines, iterators, or other code that resembles continuation passing style (like async) instead (the context being switched includes the return address, which is the continuation - the RET instruction in imperative code is a jump to the continuation). Fibers let you hang on to the continuation and do something else in the meantime. But they're a low-level primitive, and confusing to work with directly unless you wrap them up in something else.

Here's a chap who used Fibers to implement native C++ yield return (iterators) similar to C#:

https://www.codeproject.com/Articles/20015/Yield-Return-Iter...




Applications are open for YC Winter 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: