Hacker News new | past | comments | ask | show | jobs | submit login
ProtoThreads: Lightweight, Stackless Threads in C (dunkels.com)
157 points by retrocryptid on Sept 22, 2019 | hide | past | favorite | 44 comments



the catch:

> Because protothreads do not save the stack context across a blocking call, local variables are not preserved when the protothread blocks. This means that local variables should be used with utmost care - if in doubt, do not use local variables inside a protothread!


That's literally what "stackless" implies though. I don't know if it's so much of a catch as it is a difference between stackful and stackless routines.


No, not at all. Stackless coroutines only preserve a single call frame, while stackful coroutines [1] preserve the whole stack. For example python, C#, C++ coroutines are stackless, while lua and Go are stackful.

It is hardly a coroutine at all if no state, other than the instruction pointer, is preserved.

[1] See the lua coroutine papers where the nomenclature is introduced.


> That's literally what "stackless" implies though

You could have threads with enough memory for the base levels local variables. It would mean you wouldn't need global variables to keep state, but it would still be stackless as you can't pause execution within a function call.


>for the base levels local variables.

what does that even mean? Sure you could calculate what stacksize your exact programm would need at maximum, if you unroll everything, but you could not define a fixed "base level" of a stacksize for every single programm out there, could you?


> what does that even mean?

The topmost stack frame. That's what "stackless" generally implies: it doesn't preserve the stack itself, that doesn't mean it preserves nothing.

That's what you see with most async/await systems (as opposed to more integrated "lightweight thread" systems like Erlang or Go), yield points blow the stack (though it may be relinked for debugging purposes), but they don't break local variables.


Yes, that’s a significant catch, but this is some good work nonetheless.

That said, I’d hope that someone using a C library like this wouldn’t just start using it without understanding the potentially non-trivial consequences/side-effects of that decision.


So if the compiler decides to spill an intermediate value to the stack, you'll have a horribly confusing corruption bug? Fun!


Not really. The compiler sees the resulting C code after macro expansion, so the compiler sees the giant switch and the return statements. It won't try to preserve anything.



> Protothreads can be used with or without an underlying operating system to provide blocking event-handlers. Protothreads provide sequential flow of control without complex state machines or full multi-threading.

So if it's still single-threaded, then this is basically a cumbersome async/await right?


Actually, it is a way to represent state machines as 'code'. It is a way of seperating the data from the functions, such that the functions can be 'stopped' and 'continued' in the middle (turning them into coroutines). Each stop point (often hidden in an await) is a state of the state machine represented by the function.


Yes, and here is an implementation: https://github.com/naasking/async.h

Duff’s device along with the __LINE__ trick enables the creation of automatic DFAs useful to generate coroutines, protothreads and async/await alike.


All I know is the Contiki OS running on a Commodore 64 is(was?) quite capable and used Protothreads a fair bit.

http://contiki.sourceforge.net/docs/2.6/a01802.html


is this like an implementation of Go in the C language?


No. Go uses stackful coroutines. This is more like an implementation of async/await in C.


Interesting, does the async/await pattern always imply stackless then? Might you or someone else have any resources you could share on stackless vs stackful coroutines?


kinda; using a keyword instead of library call for suspend is done primarily to constrain suspension to a single stack frame (i.e. the calling continuation is not first class, same as for return), hence it is used to enforce the stackless-ness.


The catch is brutal. This looks like an implementation of goto in C.


How so? The decision between a stackful and a stackless coroutine is precisely that, a decision. The tradeoff is that storing all registers/stack variables and processor state before a yield and restoring it after a resume is a non-trivial cost. For some applications, that cost may not matter, but for others, it may be critical.

Also, this is less like a goto, and more like a setjmp/longjmp.


A stack is an over approximation of the closure of the continuation.

You want something to create a union of all the relevant local variables of each state. Doing this efficiently is non compositional and so requires language support.


aren't most control structures just sugar for goto?


Honestly I'm not sure how to define "control" when dealing with languages in the abstract and not just compilation to conventional machine code.

Many control operators involve continuations with closures, and cannot be desugared to goto as such.


Nitpicks aside... if you're considering this: Why aren't you using a language that better supports your use case?

(Don't get me wrong, this may be technically interesting, but... choose your tools with care.)


Because if you're using this, you're likely using an embedded device with limited memory and C or assembly are your only real options.


I realize it's been a while and you won't get any notification, but there are high-level languages that can compile to C code. Like e.g. https://ivorylang.org/


> Nitpicks aside... if you're considering this: Why aren't you using a language that better supports your use case?

What other languages would you suggest that supports the use-case that this is aimed at?

> Protothreads are extremely lightweight stackless threads designed for severely memory constrained systems, such as small embedded systems or wireless sensor network nodes.


So many poorly implemented hacks, why not just use libdispatch instead, you get C blocks and it makes threading much more easy: dispatch_async(main_q, ^{ printf("Hello %s!\n", argv[i]); });

https://apple.github.io/swift-corelibs-libdispatch/tutorial/


From the first sentence of the article:

> Protothreads are extremely lightweight stackless threads designed for severely memory constrained systems, such as small embedded systems or wireless sensor network nodes. Protothreads provide linear code execution for event-driven systems implemented in C. Protothreads can be used with or without an underlying operating system to provide blocking event-handlers. Protothreads provide sequential flow of control without complex state machines or full multi-threading.

libdispatch has none of the features listed above. You're right that libdispatch is a better choice for typical PC software -- but this is designed for devices like an 8-bit microcontroller with a kilobyte of RAM.


Any numbers on task creation/switching for protothreads? GCD tasks are extremely lightweight: > Tasks in GCD are lightweight to create and queue; Apple states that 15 instructions are required to queue up a work

Linear code execution can be done using a serial queue.


GCD is just a convenience wrapper around thread pool. Probably inspired by Windows thread pool API introduced in Vista few years before GCD, adding API functions like SubmitThreadpoolWork.

Even on PCs with tons of resources, thread pools and coroutines are not always interchangeable.

One reason is performance, accessing a cache line shared across cores is even more expensive than cache miss.

Besides performance, writing same data from multiple threads, or doing in parallel what must be running sequentially, is a common source of bugs.


It is not about serial ordering. The coroutine code looks the same as the synchronous code. While using GCD blocks, you can still end up with callback hells. However, you can, do nicer things with GCD when build on top of https://github.com/google/promises, but that is outside of the linear code execution discussion.


The runtime cost of protothread switching is basically an indirect function call. Surely less than 10 instructions. The space cost only 2 bytes.


> Apple states that 15 instructions are required to queue up a work

Maybe the queuing part. However the "work" (block, callback, etc) is typically dynamically allocated thingy, which can be quite expensive (and is a no-go for lots of embedded systems).


Libdispatch is a job scheduler. A coroutine is a job to be scheduled.


As explained here [1], it uses a variation of Duff's device. So the claim that Protothreads were invented by Adam Dunkels with support from Oliver Schmidt is slightly misleading IMHO.

But fair play for turning it into a working library.

[1] http://dunkels.com/adam/pt/expansion.html


Don't be mean like that.

Duff's device to ProtoThreads is what a paintbrush is to a painting.

One is a construct for conditional loop unrolling, but it takes a stroke of genius to apply it in an absolutely orthogonal way to implement a functional cooperative multithreading with just a handful of macros. Duff's device was nowhere close to be meant for that, so your condescending pat on a back for "turning it into a working library" is completely inappropriate.


Very well. In their own About page, they say they are "heavily inspired" by Simon Tathams "Coroutines in C". In fact, it even more closely resembles that than it does the original Duff's Device.

So even they admit it isn't entirely their own work, and I still see no reason to claim they "invented" it.

edit: my "pat on the back" was absolutely not "condescending". Turning a neat design in theory into a safe, functional implementation is not to be sniffed at.


I'm not very familiar with each but in reading even the Wikipedia article mentions Duff's device can be used to implement coroutines (with references) so I'm not sure the application is as novel as you lead it on to be.


Well, I've seen this exact stroke of genius in at least a dozen different code bases, so while it's a good idea, it's not exactly rare.


ProtoThreads can use Duff's device to implement its "local continuations", but it's not the only implementation.

There's an alternate one that uses the "labels as values" feature supported by GCC and Clang, which is much more elegant. In particular, it allows you to have a 'switch' statement inside the protothread.


I've tried that, but unfortunately, at least on GCC, taking the address of a label suppresses inlining of the containing function (which is important of you are trying to implement generators, less critical for async functions), so it often generate worst code than the switch statement implementation.


This is basically a beautified version of:

https://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

from 2000.


No one is discovering fire here, no reason to get so upset over a hack.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: