Naive question: who needs No-GIL when we have the asyncio and multiprocessing packages?
I've never had a problem with the GIL in Python; I've always found a workaround just by spinning up a ThreadPool or ProcessPool, and used async libraries when needed.
Is there any use case for No-GIL that isn't solved by multiprocessing?
I thought single-threaded execution without the overhead of concurrency primitives was the best way to do high-performance computing (as demonstrated by the LMAX Disruptor).
It's only about performance. asyncio is still inherently single-threaded, and hence also single-core. multiprocessing is multi-core and hence better for performance, but each process is relatively heavy and there's additional overhead for shared memory. GIL multi-threading is both single-core and difficult to use correctly.
No-GIL multi-threading is multi-core, though still difficult to use. I don't know the Python implementation, but shared memory should be faster than going through multiprocessing.
That said, when designing a system from scratch, I completely agree with you that for almost all Python use cases, threads should never be touched and asyncio/multiprocessing is the way to go instead. Most Python programs that need fast multi-threading should not have been written in Python in the first place. Still, we're here now and people did write CPU-intensive code in Python for one reason or another, so no-GIL is practical.
In these threads, I also always see a lot of people who simply aren't aware of asyncio/multiprocessing. I assume these are also a significant share of people asking for no-GIL, though probably not the ones pushing the change in the committee.
I would argue that if you have large concurrency and shared complex state, you're better off using Kafka and Redis/memcached as the shared state and designing a proper fan-out.
This design scales much better for systems that will eventually outgrow one big machine. No-GIL Python will be of no use when you need to deploy your app across hundreds of machines.
I understand people want to take advantage of all cores etc., but at large scale you will eventually need to split computation across machines and resort to an in-memory cache/queue anyway, so better to just architect your system that way from day 0.
Stores like that, while scaling well, are orders of magnitude slower than CPU memory. The kind of application I was thinking of is more compute-intensive, e.g. image processing or fancy algorithms.
PostgreSQL shows how far you can get with a single big box and using multiple cores and shared memory. It's incredibly powerful and the vast majority of applications never have data big enough to warrant "100s of machines".
There are a lot of performance-sensitive codebases where something like this would destroy performance. It works well for shared-nothing parallelism, but the moment you have shared state it kind of falls over.
Many systems will never need to run at that kind of scale but could still benefit from better threading performance, so it's good to have an "intermediate" choice, if that's what you consider it.
Sometimes you need all of that data in-process. When you move that state into Redis, you still need to perform I/O to access it. When speed matters, this is troublesome.
Do you include the man-hours in those calculations? Because those pollute a lot too, between the car, the food, the electricity that the devs' computers and internet connections consume, etc.
That's fine for regular Python, but this doesn't convince me for multithreaded Python. Most Python modules that are optimized for performance (numpy, pytorch, pandas, and all the others built on top of them) are already multithreaded and release the GIL, so you can parallelize your workload with the threading module.
If someone really needs several threads of pure Python being interpreted, something is amiss imo.
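A minimal sketch of that pattern (assuming numpy is installed; the matrix size and thread count are arbitrary): the heavy work runs inside numpy/BLAS calls that release the GIL, so a plain ThreadPoolExecutor can already use several cores today.

    from concurrent.futures import ThreadPoolExecutor
    import numpy as np

    def heavy(seed: int) -> float:
        # The matmul and norm run in numpy's C/BLAS code, which releases the GIL,
        # so the worker threads can occupy multiple cores even on a GIL build.
        rng = np.random.default_rng(seed)
        a = rng.random((1500, 1500))
        return float(np.linalg.norm(a @ a))

    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(heavy, range(8)))
    print(results[:2])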
This. Python had a unique stance; it was different for a good reason.
The seemingly endless attempt to become more appealing to yet another subgroup of needs is not a good development. It started with static typing, it continued with async, and now we have free threading as the final straw.
Especially PyCharm with its endless indexing. Half the time its key features are not available because it has decided to go on another indexing spree. Alas, that's for another thread ;)
A free-threaded Python will be harder to make faster for single-threaded cases. So this could be a win for those who want to write multi-threaded code at the expense of everyone else.
> asyncio is still inherently single-threaded, and hence also single core.
IIRC some of the proposals around removing the GIL in the past have actually suggested that the asyncio paradigm could become multithreaded for parallelism.
> Is there any use case for No-GIL that isn't solved by multiprocessing?
Tons:
- A web server that responds to multiple clients at the same time using shared state.
- multiprocessing uses pickle to send and receive data, which is a massive overhead in terms of performance. Let’s say you want to do some parallel computations on a data structure, and said structure is 1 GB in memory. Multiprocessing won’t be able to handle it with good performance (see the sketch below, after this list).
- Another consequence of using pickle is that you can’t share all types of objects. To make matters worse, errors due to unpicklable objects are really hard to debug in any non-trivial data structure. This means that sharing certain objects (especially those created by native libraries) can be impossible.
- Any processing where state needs to be shared during process execution is really hard to do with the multiprocessing module. For example, the Prometheus exporter for Flask, which only outputs some basic stats for response times/paths/etc, needs a weird hack with a temporary directory if you want to collect stats for all processes.
I could go on, honestly. But the GIL is a massive problem when trying to do parallelism in Python.
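To make the pickling point concrete, here's a rough, illustrative sketch (the 20-million-element list is just stand-in data, and timings obviously vary by machine): it measures only the serialization step that multiprocessing performs on every argument and return value, which a thread in the same process would skip entirely.

    import pickle
    import time

    data = list(range(20_000_000))  # stand-in for a large in-memory structure

    t0 = time.perf_counter()
    # multiprocessing pickles arguments in the parent and unpickles them in the
    # worker; a thread in the same process would just use `data` by reference.
    blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
    print(f"pickled {len(blob) / 1e6:.0f} MB in {time.perf_counter() - t0:.2f} s")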
> Manuel Kroiss, software engineer at DeepMind on the reinforcement learning team, describes how the bottlenecks posed by the GIL lead to rewriting Python codebases in C++, making the code less accessible:
> "We frequently battle issues with the Python GIL at DeepMind. In many of our applications, we would like to run on the order of 50-100 threads per process. However, we often see that even with fewer than 10 threads the GIL becomes the bottleneck. To work around this problem, we sometimes use subprocesses, but in many cases the inter-process communication becomes too big of an overhead. To deal with the GIL, we usually end up translating large parts of our Python codebase into C++. This is undesirable because it makes the code less accessible to researchers."
For average usage like web apps, no-GIL can be solved by multiprocessing. But for AI workloads at huge scale like Google and DeepMind, the GIL really does limit their usage of Python (hence the need to translate to C++). This is also why Meta are willing to commit three engineer-years to making this happen: https://news.ycombinator.com/item?id=36643670
I never really understood Meta/Facebook's practice of relying on scripting languages. OK, replacing PHP might not have been an option given the accelerated growth of Facebook, but Python was only used for tooling originally, as I understand it. If they needed threading and performance so badly, why didn't they go for a compiled, statically-typed language?
Sunk cost / laziness - I remember when Facebook wrote their own JIT VM to run PHP on top of (HHVM?) to speed up all that PHP code.
Probably was easier to have one crack team of software developers write something new which could interpret all of the existing codebase, than it was to lead a widespread conversion of all of that code into faster languages.
i.e. not everyone's a senior dev. There are reams more junior devs coming from bootcamps and such, computer science grads, etc., who can grok "scripting" languages like Python and JS and Java much more easily than they can pick up C++ SIGSEGV or Rust algebraic-data-type smart-pointer closure trait-object macros.
Think about how much of the world is boring "business logic" and it makes more sense: focus efforts on making the [on the surface] simple, widespread, generalist scripting languages faster. We've seen it with Python, we've seen it with JS (Node), we've seen it with Java.
Given how big the slow languages are, it makes a lot of sense to save their CPU cycles rather than try to hire from a much smaller pool of devs who are competent at lower-level programming.
I honestly don't understand why people complain about this. One of the best parts of software development is that your tools keep getting better. Your value as a developer keeps growing because the code you have written in the past gets automatic improvements.
Java isn't a scripting language. I don't get your point about junior devs either as FAANG companies such as Facebook can pick and choose from the highest caliber developers.
I conceptually think of Java as being in the same family as the "scripting" languages, since it still (in its default distributions) is a garbage-collected language running on top of a virtual machine instruction set, and it allows you to do things with dynamic typing / duck typing and reflection at runtime that less experienced developers (me, as a CS undergrad) can make use of, compared to stricter statically-typed compiled languages that less experienced developers (me, as a CS postgrad) had an inevitable learning curve with. It used to be slower than it is now; over time, language and runtime improvements have brought progressive increases in performance but have also introduced backwards incompatibilities.
RE "scripting" vs "compiled" - I'm probably using the semantics wrong :p
To me, "scripting" is more like... Rexxx, or Lua, or Bash. Stuff that's turing complete but more restricted in how you can express things in the code, or sandboxed (designed to be embedded). Python may have started off designed to be embedded as a scripting language, but these days it's a very very general purpose _predominantly interpreted_ language, considering where it's used and the libraries it has. It's not just used inside of Blender or OpenResty for example.
I'd argue the same about Perl and PHP. If people (psychopaths) are content with using PHP-Gtk to make _desktop_ apps, does it count as scripting language the same way "Lua embedded as a way to make Source Engine entities interactive" is a scripting language? :p
> FAANG companies such as Facebook can pick and choose from the highest caliber developers.
Sure - they have lots of money. They're also very, very big companies, with offices all over the world. Look at the sheer amount of people they hired which they backtracked on later "oops, we hired too many of you, haha, sorry! layoff time!". It doesn't take 10+ years of experience to come fresh from a coding bootcamp, complete the Google code test and become a Noogler in the "Wear OS performance metrics" team writing boilerplate AsyncTasks that call UrlRequests all day every day. Plus on the positive side they encourage people to join through undergrad / new grad schemes. Like, isn't there a whole thing about people _starting_ their tech careers in FAANG corps?
So Python is being fundamentally changed for everyone because of the needs of a niche subset of Python programmers (AI researchers), because that niche subset refuses to learn a language more suited to their task?
Ugh, don't - we're convincing the legions of data scientists to move _away_ from the specialist languages (Matlab, R), because at least in Python the code they publish with their papers is more repeatable/reproducible/reusable, and it's a free language [that doesn't require a license server and/or paid plugins], and then we can plug their Torch model / numpy-based computer vision algorithm into a Celery worker or a Flask endpoint :-)
> Naive question: who needs No-GIL when we have the asyncio and multiprocessing packages?
1. Because asyncio is completely useless when the problem is CPU-bound, as the event loop still runs on a single core. As the name implies, it is really only helpful when problems are IO-bound.
2. Because sharing data between multiple processes is a giant PITA. Controlling data AND orchestrating processes is an even bigger pain.
3. Processes are expensive, and due to the aforementioned pain of sharing data, greenlets are not really a viable solution.
This probably isn't going to be that groundbreaking for your average web application. But for several of the niches where Python has a large footprint (AI, Data Science), being able to spin up a pile of CPU/GPU-bound threads and let them rip is a huge boon.
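A small sketch of point 1, in case it's not obvious why asyncio doesn't help here: the three CPU-bound coroutines below run strictly one after another, because the event loop only switches at await points and busy() never awaits.

    import asyncio
    import time

    async def busy(n: int) -> int:
        # Pure CPU work with no await: the event loop cannot interleave it,
        # let alone run it on another core.
        return sum(i * i for i in range(n))

    async def main():
        t0 = time.perf_counter()
        await asyncio.gather(busy(10_000_000), busy(10_000_000), busy(10_000_000))
        print(f"{time.perf_counter() - t0:.1f} s, all on one core")

    asyncio.run(main())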
How likely is it that the corresponding code doesn't already release the GIL? Pure Python is ~100x slower than native code, so the number crunching itself happens in C extensions, where the GIL can be released.
> so the number crunching itself happens in C extensions
The number crunching does, but distributing the workload, receiving results, storing and retrieving data, etc. doesn't. And these are huge losses in performance that could be avoided if we could parallelise them.
But it would incur the overhead of concurrency control: mutexes, locks, semaphores.
I don't believe Python will ever have atomic operations, and even if it did, they would still incur significant overhead for concurrency control.
Sharing state between threads is such a narrow niche use case; this pattern is practically solved by memcached/Redis for larger-scale Python-based systems.
Relying on Redis for data sharing between concurrent processes seems like a massive overhead to me. You've got network overhead as well as a single threaded data store.
I am thinking about multithreading every day, trying to make it easier to use. I journal about it in my ideas journals.
even if they get "free multithreading" with no-GIL, their system eventually will overgrow one beefy machine and will need to be deployed across a fleet of 10/100/1000 machines.
at which point you lose benefit of no-GIL, because you now have to introduce redis and kafka into the system
> even if they get “free multithreading” with no-GIL, their system eventually will overgrow one beefy machine
Why?
Yeah, if you are building a system that is, say, serving web requests, and you have an internet-scale potential market, success might mean that.
Not every system works that way. A simulation system with defined parameters doesn’t grow in scale if it becomes more popular, you just have more people running isolated instances that don’t depend on each other. Plenty of other applications scale that way rather than the “SaaS that serves ever more clients” way.
I think this argument presumes that everything is the sort of problem that maps well to Redis and Kafka. Scientific computing doesn't. And while things like numpy might lower contention on the GIL a bit, it's not a cure-all.
Fine-grained locks are useful. Even when you end up scaling across machines, it can be useful to have many threads in one memory space to maximize what you get out of each machine.
We're moving up to hundreds of cores; Python often being stuck using only a couple of them while tightly coupling state has been unfortunate.
Why?
On AWS you can rent a 24 TB, 500-core machine. Almost all problems are smaller than that, so they don't need to scale to more than one machine.
Building applications that run on multiple machines is at least one order of magnitude more complex and thus slower (in development velocity), so needlessly building an application to be distributed is just bad engineering.
Yes, well, if you are distributing to N machines, you probably want to use all M cores on each of those machines. You'll still get a performance advantage from multi-threading.
You might think that you can simply spin up M processes per machine instead, but now you have N*M servers instead of N servers that are M times faster. In many cases this means significantly higher overheads: slower startups, a lot more RAM usage, more network IO, etc.
Outside of a few embarrassingly parallel problems, two-level parallelism is usually the highest-performance approach.
I kept making this point, as well as the other arguments above (and others did too), in the Core Dev discussion group. Unfortunately, to no avail. To be sure, I am not a core dev.
Well, I don't agree that just because one needs more than one server, no-GIL is suddenly useless.
There's still lots of complexity and awkwardness that can be avoided if you can do threading instead of processes. For example, Prometheus scraping from a non-webserver Python app is a PITA, as you need a new process and lots of communication, vs. just plug and play as in other languages.
Or just the insane resource usage. We had a Java app serving multiple orders of magnitude more customers running on a few containers. Our current Python app needs multiple deployments with different entry points, and about 15x the number of containers.
It is not fair to compare CPython (which is on purpose not optimized, only a reference implementation of an interpreted scripting language without any focus on performance) to OpenJDK, an arguably state-of-the-art compiled bytecode VM with JIT and AOT compilers available, with decades and many millions of dollars poured into runtime/JIT/GC/etc. research and optimization.
"on purpose not optimized, only a reference implementation of an interpreted scripting language without any focus on performance"
That policy is over.
As the last few years have shown, no alternative implementation can get off the ground due to C extensions and compatibility concerns, and CPython is now relied on for many large applications. It no longer makes sense to prioritise a simple implementation over performance.
Do you think Meta (Instagram) are pushing GIL removal and Cinder for no reason? They clearly have that scale and still benefit from faster single-machine performance.
Most systems don't grow forever, and can stay on one machine.
And "one beefy machine" has a very high limit, so by the time you actually outgrow it you usually have tons of resources available to help rewrite things.
The problem with relying only on asyncio and multiprocessing is that each covers only part of the problem: asyncio gives you concurrency within a single process, and multiprocessing gives you parallelization only across whole processes.
Threads let you use the same unified abstraction for parallelization and concurrency. They also make it easier to share state when parallelizing (no need to go out of your way to do it), at the cost of requiring you to think about and implement thread safety when you do so.
Also, with no-GIL + threads, the computational cost of creating and maintaining a parallel execution path is much lower than with multiprocessing, and data sharing and synchronization are less expensive.
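A minimal sketch of what "easier to share state" means in practice (the names here are made up for illustration): every worker mutates one in-process dict under a lock, with no pickling, pipes, or external store involved; the trade-off is that thread safety is now your responsibility.

    import threading
    from concurrent.futures import ThreadPoolExecutor

    counts: dict[str, int] = {}
    lock = threading.Lock()

    def record(word: str) -> None:
        # Sharing is free because all threads see the same dict; the lock is
        # the price paid for making concurrent updates safe.
        with lock:
            counts[word] = counts.get(word, 0) + 1

    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(record, ["a", "b", "a", "c", "a", "b"] * 1000))

    print(counts)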
What LMAX is doing is really just an overhyped way to speed up producer-consumer models. It might apply to your use case, but it’s not the only reason you’d use parallelism or concurrency. I don’t even understand why they claim it as an innovation when it’s just a LockFreeQueue implementation within a pre-allocated arena. You also can’t synchronize with their implementation, which sometimes you really need to do. Not a silver bullet.
Multithreading with shared state introduces several costs:
1. random jumps in memory and branch misses
2. L1/L2 cache flushes
3. context-switch costs
4. concurrency-lock costs
My understanding is that LMAX eliminated these costs:
1. the pre-allocated arena ensures cache locality of operations
2. we don't jump from one sector of memory to another; the algorithm more resembles a linear scan of the working set, mostly within L1/L2 cache
3. no context switches, no cache flushes
4. no concurrency-control costs
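A toy sketch of that idea as I understand it (hypothetical, single producer and single consumer only; a real Disruptor relies on CPU-level atomics and memory barriers that pure Python doesn't expose, so treat this as illustration rather than a drop-in pattern):

    import threading

    SIZE = 1024
    ring = [None] * SIZE   # pre-allocated slots: no per-message allocation
    write_seq = 0          # only ever advanced by the producer thread
    read_seq = 0           # only ever advanced by the consumer thread

    def produce(items):
        global write_seq
        for item in items:
            while write_seq - read_seq >= SIZE:
                pass                       # buffer full: spin, no lock taken
            ring[write_seq % SIZE] = item  # write the slot, then publish it
            write_seq += 1

    def consume(n):
        global read_seq
        out = []
        for _ in range(n):
            while read_seq >= write_seq:
                pass                       # buffer empty: spin
            out.append(ring[read_seq % SIZE])
            read_seq += 1
        return out

    t = threading.Thread(target=produce, args=(range(10_000),))
    t.start()
    result = consume(10_000)
    t.join()
    print(result[0], result[-1])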
Yes, but LMAX is a constrained model. With a producer:consumer dichotomy you don’t have to consider synchronization among consumers.
Let’s say you did try to implement that in LMAX. It’s common for consumers/“workers”/what have you to require synchronization amongst themselves, for example if they are operating on a shared k:v store of strings (an in-memory DB, let’s say). You can’t do atomic reads or writes on the thing, so you need a locking mechanism; under LMAX you’d have to introduce another layer of producers to control reads and writes and then another layer of consumers afterwards to handle the rest of your “consumer flow”, or wait in the original consumer thread for the producer to complete, which starts getting very wasteful and is certainly no better than typical locks and context switching.
Again, this is not even a new thing. Lock-free queues and “local atomic concurrent pub-sub” have existed for a long time; we have an implementation where I work. It’s not a perfect model even where that concurrency pattern is wholly sufficient for what you’re doing, either: the performance boost from the cache and context-switching improvements has to be greater than the slack (in cost or throughput) introduced by producers or consumers sitting idle waiting for upstream data.
Also, context switches/cache invalidation/concurrency overhead can be avoided, or at least greatly reduced, by smart userspace scheduling a la fibers. With hand-tuning it can potentially be eliminated completely (you can control which concurrency units to collocate on a thread and resume concurrency units/threads immediately after the locks they are waiting on free up), which is basically the same idea as LMAX. The problem, of course, like with LMAX, is that it doesn’t generalize.
PEP-703 contains a whole Motivation section, long enough to require a summary:
> Python’s global interpreter lock makes it difficult to use modern multi-core CPUs efficiently for many scientific and numeric computing applications. Heinrich Kuttler, Manuel Kroiss, and Paweł Jurgielewicz found that multi-threaded implementations in Python did not scale well for their tasks and that using multiple processes was not a suitable alternative.
> The scaling bottlenecks are not solely in core numeric tasks. Both Zachary DeVito and Paweł Jurgielewicz described challenges with coordination and communication in Python.
> Olivier Grisel, Ralf Gommers, and Zachary DeVito described how current workarounds for the GIL are “complex to maintain” and cause “lower developer productivity.” The GIL makes it more difficult to develop and maintain scientific and numeric computing libraries as well leading to library designs that are more difficult to use.
Yes, in the first line. Only spotted it now, totally my bad. It's a very unkind word to call women, and what makes it worse is that it doesn't outright destroy the meaning of the sentence. I'm sure PEP-703's authors are not that desperate about enacting this change.
..without needing to provide a protocol that covers each possible scenario the client might wish to execute in the process?
I believe the answer is "you don't", but passing functions in messages is a highly convenient way to structure code in a way that local decisions can stay local, instead of being interspersed around the codebase.
> I thought single-threaded execution without the overhead of concurrency primitives was the best way to do high-performance computing
You can have shared-memory parallelism with near-zero synchronization overhead. Rust's rayon is an example. Take a big vector, chunk it into a few blocks, distribute the blocks across threads, let them work on it and then merge the results. Since the chunks are independent you don't need to lock accesses. The only cost you're paying is sending tasks across work queues. But that's still much cheaper than spawning a new process.
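A rough Python analogue of that chunk-and-merge pattern (hedged: with the GIL this pure-Python work won't actually run in parallel; on a free-threaded build it can): each thread owns a disjoint chunk, so no locking is needed, and the partial results are merged at the end.

    from concurrent.futures import ThreadPoolExecutor

    data = list(range(1_000_000))
    n_chunks = 4
    # Disjoint chunks: each thread reads only its own slice, so no locks are needed.
    chunks = [data[i::n_chunks] for i in range(n_chunks)]

    def partial_sum(chunk: list[int]) -> int:
        return sum(x * x for x in chunk)

    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total)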
Agreed. Feeding the GPUs with multiple forked memory-hogging processes is no fun and leads to annoying hacks. And, yes, as per your other post, there could have been other solutions to this problem, some of which might have been better.
Yes, but that's a very particular use case that could have been well served by a per-thread GIL and arena-based memory for explicitly shared objects.
But I have the same question as you do about another concurrency model that is coming: a SubinterpreterThreadPool, which will be possible with the per-interpreter GIL in Python 3.12 and later.
That's another new model that is already confirmed to be coming: interpreters (isolated from each other) in the same process, each running with its own GIL.
Multiprocessing has a lot of issues, among them handling processes that never complete, subprocesses that crash and don’t return, a subprocess that needs to spawn another subprocess, etc.
Multithreading is more efficient but more difficult to work with.
You share the same address space across threads, so you can communicate any amount of data between threads instantly under a lock. The same cannot be said for network traffic, OS pipes, or multiprocessing.
Multiprocessing uses pickle to serialize your data and deserialize it in the other python interpreter.
If you start a Python Thread today, your Python code is still effectively single-threaded due to the GIL.
Not sure why this is downvoted; I never had many issues with the GIL either.
Multiprocessing does parallel computation pretty well as long as the granularity is not too small. When smaller chunks are needed, most of the time that's better done in an extension.
When you create a new process, you can't share things like network connections. Also, IPC tends to be very slow. It is abstracted away nicely in Python, but it's still very slow, making some parallelism opportunities impossible to exploit.
For creating stateless, mostly IO-bound servers, it's great. Try to squeeze out more performance and it all starts to fall apart.