Not a Python dev, but I've worked a lot on Ruby performance over the past few years.
Do you really need async IO? Even at Google scale [1] it's kinda a waste of time for web/job servers and only really required for proxying etc. The overhead of threads is massively overstated [2].
Ruby scales a long way using Puma or Phusion Passenger with one process per core (just like NodeJS) and adding threads until you hit CPU saturation. Python must have something similar?
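For the Python folks, here is a rough sketch of the same idea using only the standard library: plain threads overlapping blocking IO. `handle_request` and `serve` are made-up names for illustration, and the `time.sleep` just stands in for a blocking DB or HTTP call, so treat it as a toy and not a benchmark.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id: int) -> int:
    """Hypothetical handler: the sleep stands in for a blocking DB or HTTP call."""
    time.sleep(0.1)  # blocking waits release the GIL, so other threads can run
    return request_id


def serve(n_requests: int, n_threads: int) -> tuple[list[int], float]:
    """Run n_requests handlers on a pool of n_threads; return results and elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(handle_request, range(n_requests)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = serve(n_requests=8, n_threads=8)
    print(f"{len(results)} requests in {elapsed:.2f}s")
```

With 8 threads the 8 simulated requests wait concurrently, so the wall time should be much closer to one request than to eight. As far as I know, gunicorn's `--workers` and `--threads` options give you the Puma-style layout (one process per core, several threads each) in a real deployment.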
[1] https://www.slideshare.net/e456/tyma-paulmultithreaded1
[2] https://eli.thegreenplace.net/2018/measuring-context-switchi...