Asynchronous programming and cooperative multitasking (luminousmen.com)
97 points by luminousmen 7 months ago | 25 comments

Good job at highlighting async as being cooperative multi-tasking. Those of us old enough will remember the fanfare when both Mac and Windows moved from cooperative to pre-emptive multi-tasking models. To adapt a well known phrase, every non-trivial cooperative application includes a half-baked, buggy implementation of a pre-emptive scheduler. We know that pre-emptive scheduling of isolated processes is a good abstraction. It's non-trivial to write a scheduler; certainly not something that every app developer should need to deal with.

I've said before and I'll say again: this kind of async is a design flaw. It's a retrograde step that brings all the cognitive overload of cooperative multitasking back. If the OS doesn't offer enough processes for some reason (e.g. too heavyweight) the answer should not be to throw our hands up and regress. The better answer is to work out how to provide more processes. It's absolutely doable: look at Erlang.

Of course things can improve still further; e.g. Dataflow variables [0] provide an effective abstraction for processing pipelines.

[0]: https://en.wikipedia.org/wiki/Oz_%28programming_language%29

> I've said before and I'll say again: this kind of async is a design flaw. It's a retrograde step that brings all the cognitive overload of cooperative multitasking back.

You seem to be assuming that preemption is always needed or wanted. This is not always the case. The lack of preemption actually simplifies a number of factors that simply aren't relevant in async scenarios in a single program.

async/await patterns are actually the perfect solution for capturing the minimal context necessary for resumption after the async operation completes.

Cooperative multi-tasking is terrible if you don't have control over what the others are doing, but in your program you generally do.

It also has the benefit of making concurrent access much easier to reason about without having to resort to message passing or immutability, and locks are rarely necessary.
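To make the "locks are rarely necessary" point concrete, here is a minimal sketch using Python's asyncio. The names `Tally`, `worker` and `run_workers` are made up for illustration. It relies on the fact that a task can only be suspended at an `await`, so an update with no `await` in the middle cannot be interleaved with another task.

```python
import asyncio


class Tally:
    """Shared mutable state, touched by many tasks without a lock."""

    def __init__(self) -> None:
        self.count = 0


async def worker(tally: Tally, iterations: int) -> None:
    for _ in range(iterations):
        # No await between the read and the write, so no other task
        # can run in the middle of this update.
        tally.count += 1
        # Explicit yield point: other tasks get to run here.
        await asyncio.sleep(0)


async def run_workers(n_workers: int = 10, iterations: int = 1000) -> int:
    tally = Tally()
    await asyncio.gather(*(worker(tally, iterations) for _ in range(n_workers)))
    return tally.count
```

`asyncio.run(run_workers())` should come back with exactly `n_workers * iterations`, with no lock in sight. The guarantee only holds as long as nobody adds an `await` between the read and the write.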

As usual, it's a trade off.

I've learned (the hard way) that in any project with more than one programmer, sometimes you don't have control over what the others are doing.

And on a bad day, I can do "the bad thing" to myself.

One way to help mitigate this is good tooling. A warning that a piece of async computation took longer than expected can be very useful if you expect everything to finish in a few tens of milliseconds.
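As one example of such tooling, asyncio's debug mode logs a warning when a single step of a task holds the event loop for longer than `loop.slow_callback_duration`. A rough sketch follows; `hog`, `ListHandler` and `run_demo` are names invented here, and the 0.05 s threshold is an arbitrary choice.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects log messages in a list so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


async def hog() -> None:
    # Deliberately blocks the event loop; stands in for a slow piece of code.
    time.sleep(0.2)


async def main() -> None:
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.05  # seconds; the default is 0.1
    await asyncio.create_task(hog())


def run_demo() -> list:
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        # debug=True turns on the slow-callback check.
        asyncio.run(main(), debug=True)
    finally:
        logger.removeHandler(handler)
    return handler.messages
```

The messages returned by `run_demo()` should include a line saying that executing the task took roughly 0.2 seconds. In a real project the same warning would simply show up in the normal log output.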

I've been doing cooperative multitasking using dataflow for a couple of decades; your terms are fuzzy enough that you're criticizing the problems of particular implementations and not fundamental flaws in the concepts.

As an example of cooperative multitasking with dataflow, consider single-process nginx. Events are delivered using epoll; nginx has an event loop that does work on the next task that has data ready (or a timer event.) This is called the Reactor Pattern in the article.
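For readers who have not seen one, a reactor loop can be sketched in a few lines with Python's `selectors` module, which uses epoll on Linux. This is only an illustration of the pattern, not how nginx is implemented; the socket pair stands in for a client connection and `run_reactor` is a made-up name.

```python
import selectors
import socket


def run_reactor() -> list:
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    received = []

    # A connected socket pair stands in for a client connection.
    client, server_side = socket.socketpair()
    server_side.setblocking(False)

    def on_readable(sock: socket.socket) -> None:
        data = sock.recv(1024)
        if data:
            received.append(data)
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(sock)
            sock.close()

    # Register interest in "readable" and attach the handler as data.
    sel.register(server_side, selectors.EVENT_READ, on_readable)

    client.sendall(b"hello")
    client.close()

    # The event loop: wait for ready sockets, dispatch to their handlers.
    while sel.get_map():
        for key, _events in sel.select(timeout=1.0):
            key.data(key.fileobj)

    sel.close()
    return received
```

Each handler runs to completion before the loop looks at the next ready socket, which is exactly the cooperative part.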

Can't agree more. As someone who has done lots of work on both preemptive and cooperative multitasking codebases, I'm perplexed by how many frameworks dive head-first into the latter.

Interesting observation. Any theories on why? As someone who has consumed but not written much by way of schedulers I’d be keen to hear thoughts.

(I assume you're asking why frameworks delve headfirst into coop, please correct me if I'm wrong)

Cooperative multitasking has much of the semantics (and thus semblance) of preemptive multitasking while also having the allure of not needing the harder (either with respect to understanding or actual usage) semantics of locking, the mental accounting related to thread safe algorithms, etc.

In other words, you get the semantics of concurrency (e.g. tasks and working queues) without the overhead of synchronizing. That sounds very alluring, but in practice I would like to not delve into specialized semantics AT ALL unless I'm expecting some speed/scalability gains, where the bulk of it comes from actual parallel execution, either in the form of different cores or different nodes. And then you need in any case to use the same semantics concurrent (i.e. cooperative) frameworks try to avoid, e.g. in the form of synchronizing between processes but with much less tooling because this is a second-tier scenario in most cooperative multitasking frameworks I've seen.

In other words, you don't get the speedups that are worth the mental effort of using the semantics in the first place. The only exception I've seen is a code base properly utilizing their I/O "breaks" for cooperation and indeed I see cooperative multitasking frameworks often touting I/O intensive applications as an ideal candidate. But, please note that:

1. Programmers usually overestimate how much I/O (or any resource usage) blocks. The codebase I was referring to was an actual filesystem.

2. Beyond toy programs you still need to do a lot of thread-safety-style accounting. You're not susceptible to cache-coherency issues, but that's about it.

Phew, didn't think I had so much to say about it.
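Point 2 can be made concrete with a small asyncio sketch (all names are hypothetical). As soon as a read-modify-write spans an `await`, updates get lost just like in threaded code, and you are back to using a lock.

```python
import asyncio


async def unsafe_increment(state: dict) -> None:
    value = state["count"]      # read
    await asyncio.sleep(0)      # yield point: other tasks run here
    state["count"] = value + 1  # write based on a stale read


async def safe_increment(state: dict, lock: asyncio.Lock) -> None:
    # The lock keeps the whole read-modify-write exclusive.
    async with lock:
        value = state["count"]
        await asyncio.sleep(0)
        state["count"] = value + 1


async def run_unsafe(n: int) -> int:
    state = {"count": 0}
    await asyncio.gather(*(unsafe_increment(state) for _ in range(n)))
    return state["count"]


async def run_safe(n: int) -> int:
    state = {"count": 0}
    lock = asyncio.Lock()
    await asyncio.gather(*(safe_increment(state, lock) for _ in range(n)))
    return state["count"]
```

`run_unsafe(100)` ends up well short of 100 because every task read the counter before any of them wrote it back, while `run_safe(100)` gives the full count.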

I'm currently working on a project that's implementing this whole pattern using AWS. 1.) Requests are written to a DynamoDB table 2.) a Lambda subscribes to the Dynamo event stream 3.) each event from the stream causes the Lambda to make an async call to an external system 4.) A message (with a delay) is sent to SQS from the Lambda that the async call has been made 5.) Another Lambda reads from the SQS queue and updates the original DynamoDB record completing the loop.

The delay in the queue simulates a polling interval to the external system and when a successful response is returned from the external system the Dynamo record is updated with a status telling the Lambda subscribing to the event stream that "the loop" is done.

This setup has to run at fairly significant peak loads and we're relying on Amazon to provision resources reliably enough to pull this off. None of us know if this will work but in terms of abstracting away the complexities of async/concurrent processing it's a novel approach.

The challenge with that kind of system is flow control. Essentially everything is coupled by queues without a feedback loop. And once the system gets under load, those queues might run fuller and fuller until the whole system fails.

It’s actually similar to an in-program model where you just asynchronously queue void functions in thread pools without knowing how many actually run.

The fact that you are only running on top of managed services and that AWS can scale really well might mitigate the problem. And if you have to process all events and cannot fail early or push backpressure onto the initial caller, there might not really be a better solution anyway. But you should be aware of it.
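The in-program version of that feedback loop is a bounded queue: when the consumer falls behind, the producer is suspended instead of the queue growing without limit. A minimal sketch with `asyncio.Queue`, where the item counts and capacity are arbitrary and the names are made up:

```python
import asyncio


async def producer(queue: asyncio.Queue, n_items: int) -> None:
    for i in range(n_items):
        # put() suspends while the queue is full: this is the backpressure.
        await queue.put(i)
    await queue.put(None)  # sentinel: no more work


async def consumer(queue: asyncio.Queue, max_seen: list) -> int:
    processed = 0
    while True:
        # Track the deepest the queue ever got.
        max_seen[0] = max(max_seen[0], queue.qsize())
        item = await queue.get()
        if item is None:
            return processed
        await asyncio.sleep(0)  # stands in for slow downstream work
        processed += 1


async def run_pipeline(n_items: int = 100, capacity: int = 5) -> tuple:
    queue = asyncio.Queue(maxsize=capacity)
    max_seen = [0]
    _, processed = await asyncio.gather(
        producer(queue, n_items), consumer(queue, max_seen)
    )
    return processed, max_seen[0]
```

All items get processed, and the observed queue depth never exceeds `capacity`. With an unbounded queue (`maxsize=0`) the producer would never be slowed down, which is the failure mode described above.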

It's similar to how you use Linda (1986), which wasn't novel when it was invented, either.

The interesting linked article reminds me that techniques (crutches?) popular in the embedded world still seem fresh and new to the rest of the computing community.

A scheduler need not be complex - a superloop / commutator / for (;;) {} construction that polls clients and workers is usually enough to provide the appearance of a preemptive multitasker. The complexities typically arise in communicating data from interrupt contexts back to the foreground thread, and from appropriately designing the elements to deal with exceptions, but these complexities are already familiar to anyone developing non-trivial products whether or not a preemptive switcher is around.
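To show how little is needed, here is a superloop sketched in Python rather than C, with generators standing in for the polled clients and workers. `blinker` and `superloop` are invented names, and interrupt handling is left out entirely.

```python
from collections import deque
from typing import Generator, Iterable


def blinker(name: str, ticks: int, log: list) -> Generator[None, None, None]:
    for i in range(ticks):
        log.append(f"{name}:{i}")  # one small slice of work
        yield                      # hand control back to the superloop


def superloop(tasks: Iterable[Generator[None, None, None]]) -> None:
    ready = deque(tasks)
    while ready:                # on a microcontroller this would be for (;;)
        task = ready.popleft()
        try:
            next(task)          # run the task until its next yield
        except StopIteration:
            continue            # task finished, drop it
        ready.append(task)      # otherwise give it another turn later


def run_demo() -> list:
    log = []
    superloop([blinker("a", 2, log), blinker("b", 3, log)])
    return log
```

`run_demo()` returns `["a:0", "b:0", "a:1", "b:1", "b:2"]`: the two tasks appear to run at the same time, as long as each slice of work stays short.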

Don't futures/promises help alleviate the last two cons under callbacks ("callbacks swallow exceptions" and "callback after callback gets confusing and hard to debug")?

Odd that they weren't mentioned.

They're a step towards it, but they don't really solve the callback-after-callback problem on their own. You need something like a ->get() or await on them to serialize the flow of things and completely solve that.

But that encapsulation of success or failure does wonders against the swallowing of exceptions, by basically wrapping the async operation in a try/catch that carries the outcome over for you.
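A small Python sketch of both halves of that argument, with `fetch` as a made-up operation that fails on negative input. With a done-callback the exception sits inside the future until someone asks for it. With `await` the flow reads sequentially and an ordinary `try`/`except` works.

```python
import asyncio


async def fetch(value: int) -> int:
    """Hypothetical async operation that fails on negative input."""
    await asyncio.sleep(0)
    if value < 0:
        raise ValueError("negative input")
    return value * 2


async def callback_style() -> list:
    results = []
    done = asyncio.Event()
    task = asyncio.ensure_future(fetch(-1))

    def on_done(fut: asyncio.Future) -> None:
        # The exception is not raised here; it has to be asked for.
        results.append(repr(fut.exception()))
        done.set()

    task.add_done_callback(on_done)
    await done.wait()
    return results


async def await_style() -> str:
    try:
        first = await fetch(1)        # 2
        second = await fetch(-first)  # raises ValueError
        return f"ok {second}"
    except ValueError as exc:
        return f"caught: {exc}"
```

In `callback_style` the error is carried by the future, so it is not lost, but chaining a second step would mean another nested callback. `await_style` chains two steps and handles the failure in one place.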

"callbacks swallow exceptions" - I guess you're right, but I'm talking not only about JS implementation here. I'll delete it to generalize

"callback after callback gets confusing and hard to debug" - I think it is true even for mature developers

This is written from a closed (Python) world perspective. Even in Python you have other alternatives (Stackless [0]). In statically typed functional programming languages, the monadic approach (still) is rather popular: f () >>= g can be read as follows: when (eventually) the function f produces a result, hand that result as an argument to the function g. Monads do have a cognitive overhead, but the type system prevents you from attempting to do silly things. ymmv

  [0] https://github.com/stackless-dev/stackless/wiki

Even in the Python async/await world, you can get preemptable multitasking by adding in a few ProcessPoolExecutor worker processes. I've found this to be extremely effective in a web crawler: the main thread is 100% cooperative and does all of the network I/O on a single core, while CPU-hungry activities like webpage parsing are done by ProcessPoolExecutor processes on other cores. The code is pretty and I can fully use an entire server with what is conceptually a single crawler.
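A rough sketch of that split, not the actual crawler: `fetch_page` is a hypothetical stand-in that returns canned HTML instead of doing network I/O, and `parse_page` is a trivial placeholder for the CPU-heavy parsing. The piece that matters is `loop.run_in_executor`, which hands the parse to the pool while the event loop keeps running other tasks.

```python
import asyncio
from concurrent.futures import Executor, ProcessPoolExecutor


def parse_page(html: str) -> int:
    """CPU-bound stand-in for real parsing: count the links in a page."""
    return html.count("<a ")


async def fetch_page(url: str) -> str:
    """Hypothetical network fetch; returns canned HTML instead of doing I/O."""
    await asyncio.sleep(0)
    return f'<a href="{url}/1">1</a><a href="{url}/2">2</a>'


async def crawl_one(url: str, executor: Executor) -> int:
    html = await fetch_page(url)  # cooperative, on the event loop
    loop = asyncio.get_running_loop()
    # The parse runs in a pool worker; the event loop keeps serving other tasks.
    return await loop.run_in_executor(executor, parse_page, html)


async def crawl(urls: list, executor: Executor) -> list:
    return await asyncio.gather(*(crawl_one(u, executor) for u in urls))


if __name__ == "__main__":
    # Worker processes are scheduled preemptively by the OS on other cores.
    with ProcessPoolExecutor(max_workers=2) as pool:
        print(asyncio.run(crawl(["http://example.com/a", "http://example.com/b"], pool)))
```

Passing the executor in as a parameter is a design choice for the sketch: it makes it easy to swap in a `ThreadPoolExecutor` for I/O-bound blocking calls or for tests.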

asyncio.create_subprocess_shell() is another way to get a preemptable thing in the Python async/await world.
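For completeness, a minimal sketch of the subprocess route. The `echo` commands are placeholders, and `run_shell` is a helper name invented here.

```python
import asyncio


async def run_shell(command: str) -> tuple:
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # communicate() waits for the process without blocking the event loop.
    stdout, _stderr = await proc.communicate()
    return proc.returncode, stdout.decode().strip()


async def main() -> list:
    # Both commands run as separate OS processes, scheduled by the kernel.
    return await asyncio.gather(run_shell("echo one"), run_shell("echo two"))
```

`asyncio.run(main())` returns one `(returncode, output)` pair per command, in the order the commands were given.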

What y'all need is Concurrent ML. I don't know why, but it still seems rather unknown. For me it hits the sweet spot of power and flexibility (generalising the actor model and still not having to do the nitty gritty details of reagents). Guile Scheme has what is probably the nicest implementation (utilizing multiple cores) and the best introduction:


Since Java doesn't have first class await support, I'm super excited about Project Loom. Imo, using green threads also has one huge benefit over callbacks: legacy code, small hacks, PoCs written in a simple and imperative manner should be much easier to port to it.

I think one distinction often missed is whether the programming language can share memory between threads natively (without copying) and whether the language has stable NIO and concurrency libraries to do this today on a live service. The only bottleneck we have is CPU, and because of lithography/memory bandwidth it will never go away for transistor processors. I just called this "joint parallel" for my completely async, non-blocking HTTP server/client; meaning it can have many cores work "simultaneously" on the same memory, which is really rare in the async world.

With JavaScript, you can also use `for-await-of` loops to process asynchronous data as an alternative to using callbacks or green threads.
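Python has a close analogue in `async for` over an async generator. A minimal sketch, written in Python rather than JavaScript to match the rest of the thread; `sensor_readings` is a made-up data source.

```python
import asyncio
from typing import AsyncIterator


async def sensor_readings(n: int) -> AsyncIterator[int]:
    """Made-up asynchronous data source that yields n values."""
    for i in range(n):
        await asyncio.sleep(0)  # stands in for waiting on real data
        yield i * i


async def consume(n: int) -> list:
    seen = []
    # Reads like a plain loop, but suspends while waiting for each item.
    async for reading in sensor_readings(n):
        seen.append(reading)
    return seen
```

`asyncio.run(consume(4))` returns `[0, 1, 4, 9]`, with no callback in sight.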

I've been doing a lot of open source work on this in the past few months: https://hackernoon.com/getting-started-with-asyngular-bbe3dd...

Still too complicated.

Go almost got it right [0]; and then blew it with opaque, mandatory preemptive threading.

[0] https://gitlab.com/sifoo/snigl#io

On a related note, I found that asyncio (cooperative multitasking in Python) has made huge strides in usability in Python 3.7. The code feels sequential, and it’s quite easy to understand the flow of events; it’s very explicit yet you never have to run the callbacks yourself, the exceptions do show up, and you don’t even need to manipulate a `loop` object anymore! If you're trying to learn how to use it, my advice would be:

* "Stick to the official documentation[1]!". Many resources found online are outdated, and give convoluted or plain wrong examples, and fail to mention the recent additions.

* Use `asyncio.run`. The alternative way (using `get_event_loop()` and `loop.run_until_complete()`) is cumbersome and hard to get right. Even the documentation wasn't correct[2]. The part on Coroutines and Tasks is well-written, and the best part is the examples. Simply reading all the examples one after another gives good insight into how asyncio should be used.

* If you want to work with sockets, use the high-level "Streams"[3] if you want to stay in the standard library. `asyncio.start_server` is a powerful abstraction. If you’re willing to use a third-party library, I found that `pynng`[4] was a breeze to work with. It is compatible with asyncio and other async frameworks, and I found it more straightforward than pyzmq, which is also compatible with asyncio[5].

* If you want to run async functions in the main event loop and blocking functions in threads (with `loop.run_in_executor`), Janus[7] seems to be a great way to share data. I have not used it yet though.
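The `asyncio.run` and Streams advice can be sketched together as a tiny upper-casing echo server and a client in one script. The handler name, the loopback address and the one-line protocol are choices made for this example only.

```python
import asyncio


async def handle_echo(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    data = await reader.readline()
    writer.write(data.upper())
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def main() -> bytes:
    # Port 0 asks the OS for any free port.
    server = await asyncio.start_server(handle_echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(b"hello\n")
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
    return reply
```

`asyncio.run(main())` returns `b"HELLO\n"`. There is no `loop` object anywhere in the code: `asyncio.run` creates the event loop, runs `main()` and closes the loop afterwards.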

My use case was that I wanted to read sensor data on one computer (server) and broadcast it to other computers (clients) which would in turn graph it live, or write it to disk. The server used `pyserial_asyncio`[6] to asynchronously read serial data and published it via TCP using the pub/sub scheme from pynng (`pynng.Pub0`).

The clients could then either synchronously or asynchronously receive the data by subscribing to the server (pynng.Sub0), and make plots in realtime.

[1]: https://docs.python.org/3/library/asyncio.html

[2]: https://github.com/python/asyncio/pull/465#issue-93620963

[3]: https://docs.python.org/3/library/asyncio-stream.html#asynci...

[4]: https://pypi.org/project/pynng/

[5]: https://pyzmq.readthedocs.io/en/latest/api/zmq.asyncio.html

[6]: https://github.com/pyserial/pyserial-asyncio

[7]: https://github.com/aio-libs/janus

I agree, and the docs got better too. In 3.7, asyncio is usable for people that don't understand it very well.

Before that, you had to learn the whole thing brick by brick before being able to do something serious.

Yet, there is a missing piece I'm hoping we'll see in 3.8: a way to limit the scope of `asyncio.ensure_future()`.

Indeed, right now either you `await` to get sequential execution, or you call `asyncio.ensure_future()` to get concurrent execution. The latter, unfortunately, is the equivalent of a GOTO, and worse, it can contain a GOTO itself (see https://vorpus.org/blog/notes-on-structured-concurrency-or-g...).

So the best practice is to use `asyncio.gather()` to delimit the pyramid of the task life cycle. Unfortunately, few people know this, and hence few do it. Plus it is not fun to do; it's boring boilerplate, something Python usually frees you from.
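A small sketch of the difference, with made-up job names and delays. The task started with `ensure_future()` escapes the function that created it, while the jobs passed to `gather()` are guaranteed to be finished when the `await` returns.

```python
import asyncio


async def job(name: str, delay: float, log: list) -> str:
    await asyncio.sleep(delay)
    log.append(name)
    return name


async def fire_and_forget(log: list) -> None:
    # The task outlives this function; nothing here waits for it.
    asyncio.ensure_future(job("orphan", 10, log))


async def delimited(log: list) -> list:
    # Both jobs run concurrently, but neither outlives this function.
    return await asyncio.gather(job("a", 0.02, log), job("b", 0.01, log))


async def main() -> dict:
    log = []
    await fire_and_forget(log)
    after_forget = list(log)   # the orphan has not finished yet
    results = await delimited(log)
    after_gather = list(log)   # a and b are guaranteed done here
    return {
        "after_forget": after_forget,
        "results": results,
        "after_gather": after_gather,
    }
```

`gather()` returns the results in argument order regardless of which job finishes first. The orphan never completes: when `main()` returns, `asyncio.run` cancels whatever is still pending, silently, which is the kind of surprise the GOTO comparison is about.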

Yuri is thinking about how to implement the trio solution (the infamous nursery) in uvloop, and if he does, we usually get the feature ported to the stdlib a year later.

Meanwhile, I noted that a simple wrapper does meet the Pareto requirement: https://github.com/Tygs/ayo/

You can see it's not really hard to write your own version of it if you need to. It helped me a lot: the code is easier to reason about, and you remove a lot of edge cases.

I'll have to test Janus, it seems super nice.

> So the best practice is to use `asyncio.gather()` to delimit the pyramid of the task life cycle. Unfortunately, few people know this, and hence few do it.

And even then, any async function might run `loop = asyncio.get_event_loop()`, run some background processes, and return before they stopped! I actually had this exact problem with my realtime sensor data server: the background tasks were never properly closed, and the sockets remained open.
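One way to avoid that particular leak without a nursery is to cancel and await the background task in a `finally` block before returning. A hedged sketch, where `poll_sensor` is a hypothetical background job and appending `"closed"` stands in for closing a socket:

```python
import asyncio
import contextlib


async def poll_sensor(readings: list) -> None:
    """Hypothetical background job; runs until cancelled."""
    try:
        while True:
            readings.append(len(readings))
            await asyncio.sleep(0.001)
    finally:
        readings.append("closed")  # stands in for closing a socket


async def serve(readings: list) -> None:
    task = asyncio.ensure_future(poll_sensor(readings))
    try:
        await asyncio.sleep(0.01)  # the "real" work of the function
    finally:
        # Do not return while the background task is still alive.
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task
```

After `asyncio.run(serve(readings))`, the last entry in `readings` is `"closed"`, so the cleanup code ran before `serve` returned. This is more or less the boilerplate that a nursery would write for you.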

The article you linked to (https://vorpus.org/blog/notes-on-structured-concurrency-or-g...) is super interesting. I didn't get the appeal of `trio` before, but now it does seem really useful.
