> Because protothreads do not save the stack context across a blocking call, local variables are not preserved when the protothread blocks. This means that local variables should be used with utmost care - if in doubt, do not use local variables inside a protothread!
That's literally what "stackless" implies though. I don't know if it's so much of a catch as it is a difference between stackful and stackless routines.
No, not at all. Stackless coroutines only preserve a single call frame, while stackful coroutines [1] preserve the whole stack. For example python, C#, C++ coroutines are stackless, while lua and Go are stackful.
It is hardly a coroutine at all if no state, other than the instruction pointer, is preserved.
[1] See the lua coroutine papers where the nomenclature is introduced.
> That's literally what "stackless" implies though
You could have threads with enough memory for the base levels local variables. It would mean you wouldn't need global variables to keep state, but it would still be stackless as you can't pause execution within a function call.
what does that even mean? Sure you could calculate what stacksize your exact programm would need at maximum, if you unroll everything, but you could not define a fixed "base level" of a stacksize for every single programm out there, could you?
The topmost stack frame. That's what "stackless" generally implies: it doesn't preserve the stack itself, that doesn't mean it preserves nothing.
That's what you see with most async/await systems (as opposed to more integrated "lightweight thread" systems like Erlang or Go), yield points blow the stack (though it may be relinked for debugging purposes), but they don't break local variables.
Yes, that’s a significant catch, but this is some good work nonetheless.
That said, I’d hope that someone using a C library like this wouldn’t just start using it without understanding the potentially non-trivial consequences/side-effects of that decision.
Not really. The compiler sees the resulting C code after macro expansion, so the compiler sees the giant switch and the return statements. It won't try to preserve anything.
> Protothreads can be used with or without an underlying operating system to provide blocking event-handlers. Protothreads provide sequential flow of control without complex state machines or full multi-threading.
So if it's still single-threaded, then this is basically a cumbersome async/await right?
Actually, it is a way to represent state machines as 'code'. It is a way of seperating the data from the functions, such that the functions can be 'stopped' and 'continued' in the middle (turning them into coroutines). Each stop point (often hidden in an await) is a state of the state machine represented by the function.
Interesting, does the async/await pattern always imply stackless then? Might you or someone else have any resources you could share on stackless vs stackful coroutines?
kinda; using a keyword instead of library call for suspend is done primarily to constrain suspension to a single stack frame (i.e. the calling continuation is not first class, same as for return), hence it is used to enforce the stackless-ness.
How so? The decision between a stackful and a stackless coroutine is precisely that, a decision. The tradeoff is that storing all registers/stack variables and processor state before a yield and restoring it after a resume is a non-trivial cost. For some applications, that cost may not matter, but for others, it may be critical.
Also, this is less like a goto, and more like a setjmp/longjmp.
A stack is an over approximation of the closure of the continuation.
You want something to create a union of all the relevant local variables of each state. Doing this efficiently is non compositional and so requires language support.
I realize it's been a while and you won't get any notification, but there are high-level languages that can compile to C code. Like e.g. https://ivorylang.org/
> Nitpicks aside... if you're considering this: Why aren't you using a language that better supports your use case?
What other languages would you suggest that supports the use-case that this is aimed at?
> Protothreads are extremely lightweight stackless threads designed for severely memory constrained systems, such as small embedded systems or wireless sensor network nodes.
So many poorly implemented hacks, why not just use libdispatch instead, you get C blocks and it makes threading much more easy:
dispatch_async(main_q, ^{ printf("Hello %s!\n", argv[i]); });
> Protothreads are extremely lightweight stackless threads designed for severely memory constrained systems, such as small embedded systems or wireless sensor network nodes. Protothreads provide linear code execution for event-driven systems implemented in C. Protothreads can be used with or without an underlying operating system to provide blocking event-handlers. Protothreads provide sequential flow of control without complex state machines or full multi-threading.
libdispatch has none of the features listed above. You're right that libdispatch is a better choice for typical PC software -- but this is designed for devices like an 8-bit microcontroller with a kilobyte of RAM.
Any numbers on task creation/switching for protothreads? GCD tasks are extremely lightweight:
> Tasks in GCD are lightweight to create and queue; Apple states that 15 instructions are required to queue up a work
Linear code execution can be done using a serial queue.
GCD is just a convenience wrapper around thread pool. Probably inspired by Windows thread pool API introduced in Vista few years before GCD, adding API functions like SubmitThreadpoolWork.
Even on PCs with tons of resources, thread pools and coroutines are not always interchangeable.
One reason is performance, accessing a cache line shared across cores is even more expensive than cache miss.
Besides performance, writing same data from multiple threads, or doing in parallel what must be running sequentially, is a common source of bugs.
It is not about serial ordering. The coroutine code looks the same as the synchronous code. While using GCD blocks, you can still end up with callback hells. However, you can, do nicer things with GCD when build on top of https://github.com/google/promises, but that is outside of the linear code execution discussion.
> Apple states that 15 instructions are required to queue up a work
Maybe the queuing part. However the "work" (block, callback, etc) is typically dynamically allocated thingy, which can be quite expensive (and is a no-go for lots of embedded systems).
As explained here [1], it uses a variation of Duff's device. So the claim that Protothreads were invented by Adam Dunkels with support from Oliver Schmidt is slightly misleading IMHO.
But fair play for turning it into a working library.
Duff's device to ProtoThreads is what a paintbrush is to a painting.
One is a construct for conditional loop unrolling, but it takes a stroke of genius to apply it in an absolutely orthogonal way to implement a functional cooperative multithreading with just a handful of macros. Duff's device was nowhere close to be meant for that, so your condescending pat on a back for "turning it into a working library" is completely inappropriate.
Very well. In their own About page, they say they are "heavily inspired" by Simon Tathams "Coroutines in C". In fact, it even more closely resembles that than it does the original Duff's Device.
So even they admit it isn't entirely their own work, and I still see no reason to claim they "invented" it.
edit: my "pat on the back" was absolutely not "condescending". Turning a neat design in theory into a safe, functional implementation is not to be sniffed at.
I'm not very familiar with each but in reading even the Wikipedia article mentions Duff's device can be used to implement coroutines (with references) so I'm not sure the application is as novel as you lead it on to be.
ProtoThreads can use Duff's device to implement its "local continuations", but it's not the only implementation.
There's an alternate one that uses the "labels as values" feature supported by GCC and Clang, which is much more elegant. In particular, it allows you to have a 'switch' statement inside the protothread.
I've tried that, but unfortunately, at least on GCC, taking the address of a label suppresses inlining of the containing function (which is important of you are trying to implement generators, less critical for async functions), so it often generate worst code than the switch statement implementation.
> Because protothreads do not save the stack context across a blocking call, local variables are not preserved when the protothread blocks. This means that local variables should be used with utmost care - if in doubt, do not use local variables inside a protothread!